[v3,1/2] Provide in-kernel headers for making it easy to extend the kernel
diff mbox series

Message ID 20190227193748.132301-1-joel@joelfernandes.org
State New
Headers show
Series
  • [v3,1/2] Provide in-kernel headers for making it easy to extend the kernel
Related show

Commit Message

Joel Fernandes Feb. 27, 2019, 7:37 p.m. UTC
Introduce in-kernel headers and other artifacts which are made available
as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
it possible to build kernel modules, run eBPF programs, and other
tracing programs that need to extend the kernel for tracing purposes
without any dependency on the file system having headers and build
artifacts.

On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Raw kernel headers
also cannot be copied into the filesystem like they can be on other
distros, due to licensing and other issues. There's no linux-headers
package on Android. Further once a different kernel is booted, any
headers stored on the file system will no longer be useful. By storing
the headers as a compressed archive within the kernel, we can avoid these
issues that have been a hindrance for a long time.

The feature is also buildable as a module just in case the user desires
it not being part of the kernel image. This makes it possible to load
and unload the headers on demand. A tracing program, or a kernel module
builder can load the module, do its operations, and then unload the
module to save kernel memory. The total memory needed is 3.8MB.

The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.

To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders

Additional notes:
(1)
A limitation of module building with this is, since Module.symvers is
not available in the archive due to a cyclic dependency with building of
the archive into the kernel or module binaries, the modules built using
the archive will not contain symbol versioning (modversion). This is
usually not an issue since the idea of this patch is to build a kernel
module on the fly and load it into the same kernel. An appropriate
warning is already printed by the kernel to alert the user of modules
not having modversions when built using the archive. For building with
modversions, the user can use traditional header packages. For our
tracing usecases, we build modules on the fly with this so it is not a
concern.

(2) I have left IKHD_ST and IKHD_ED markers as is to facilitate
future patches that would extract the headers from a kernel or module
image.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---

    Changes since v2:
    (Thanks to Masahiro Yamada for several excellent suggestions)
    - Added support for out of tree builds.
    - Added incremental build support bringing down build time of
      incremental builds from 50 seconds to 5 seconds.
    - Fixed various small nits / cleanups.
    - clean ups to kheaders.c pointed by Alexey Dobriyan.
    - Fixed MODULE_LICENSE in test module and kheaders.c
    - Dropped Module.symvers from archive due to circular dependency.

    Changes since v1:
    - removed IKH_EXTRA variable, not needed (Masahiro Yamada)
    - small fix ups to selftest
       - added target to main Makefile etc
       - added MODULE_LICENSE to test module
       - made selftest more quiet

    Changes since RFC:
    Both changes bring size down to 3.8MB:
    - use xz for compression
    - strip comments except SPDX lines
    - Call out the module name in Kconfig
    - Also added selftests in second patch to ensure headers are always
    working.

Other notes:
By the way I still see this error (without the patch) when doing a clean
build: Makefile:594: include/config/auto.conf: No such file or directory

It appears to be because of commit 0a16d2e8cb7e ("kbuild: use 'include'
directive to load auto.conf from top Makefile")

 Documentation/dontdiff    |  1 +
 init/Kconfig              | 11 ++++++
 kernel/.gitignore         |  3 ++
 kernel/Makefile           | 36 +++++++++++++++++++
 kernel/kheaders.c         | 72 +++++++++++++++++++++++++++++++++++++
 scripts/gen_ikh_data.sh   | 76 +++++++++++++++++++++++++++++++++++++++
 scripts/strip-comments.pl |  8 +++++
 7 files changed, 207 insertions(+)
 create mode 100644 kernel/kheaders.c
 create mode 100755 scripts/gen_ikh_data.sh
 create mode 100755 scripts/strip-comments.pl

Comments

Masahiro Yamada Feb. 28, 2019, 2:17 a.m. UTC | #1
Hi Joel,


On Thu, Feb 28, 2019 at 4:40 AM Joel Fernandes (Google)
<joel@joelfernandes.org> wrote:
>
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
>
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
>
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.
>
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.



Please let me ask a question about the actual use-case.


To build embedded systems including Android,
I use an x86 build machine.

In other words, I cross-compile vmlinux and in-tree modules.
So,

  target-arch: arm64
  host-arch:   x86



> To build a module, the below steps have been tested on an x86 machine:
> modprobe kheaders
> rm -rf $HOME/headers
> mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) modules
> rmmod kheaders

I am guessing the user will run these commands
on the target system.
In other words, external modules are native-compiled.
So,

  target-arch: arm64
  host-arch:   arm64


Is this correct?


If I understood the assumed use-case correctly,
kheaders.tar.xw will contain host-programs compiled for x86,
which will not work on the target system.




Masahiro




> Additional notes:
> (1)
> A limitation of module building with this is, since Module.symvers is
> not available in the archive due to a cyclic dependency with building of
> the archive into the kernel or module binaries, the modules built using
> the archive will not contain symbol versioning (modversion). This is
> usually not an issue since the idea of this patch is to build a kernel
> module on the fly and load it into the same kernel. An appropriate
> warning is already printed by the kernel to alert the user of modules
> not having modversions when built using the archive. For building with
> modversions, the user can use traditional header packages. For our
> tracing usecases, we build modules on the fly with this so it is not a
> concern.
>
> (2) I have left IKHD_ST and IKHD_ED markers as is to facilitate
> future patches that would extract the headers from a kernel or module
> image.
>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---



--
Best Regards
Masahiro Yamada
Masami Hiramatsu Feb. 28, 2019, 8:34 a.m. UTC | #2
Hi Joel,

On Wed, 27 Feb 2019 14:37:47 -0500
"Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:

> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
> 
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.

Hmm, isn't it easier to add kernel-headers package on Android?

> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.

But it also requires to install build environment (tools etc.)
on the target system...

> 
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.
> 
> To build a module, the below steps have been tested on an x86 machine:
> modprobe kheaders
> rm -rf $HOME/headers
> mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) modules
> rmmod kheaders

It seems a bit complex, but no difference from compared with carrying
kheaders.tar.gz. I think we would better have a psudo filesystem
which can mount this compressed header file directly :) Then it becomes
simpler, like

modprobe headerfs
mkdir $HOME/headers
mount -t headerfs $HOME/headers

And this doesn't consume any disk-space.

Thank you,
Qais Yousef Feb. 28, 2019, 1:53 p.m. UTC | #3
Hi Joel

On 02/27/19 14:37, Joel Fernandes (Google) wrote:
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.

Thanks for doing this work. This will be a very helpful feature and simplify
the workflow, especially when dealing with multiple targets each with its own
kernel tree/version.

> 
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any

I use non-android systems quite often for development. And yes while getting
the headers on the target isn't always a problem but when moving across trees
or versions it's very easy to mess this up and having the headers embedded in
the kernel means you're always guaranteed to use the right headers. eBPF is my
main use case here.

> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
> 
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.

[...]

> diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
> new file mode 100755
> index 000000000000..7329262bed2f
> --- /dev/null
> +++ b/scripts/gen_ikh_data.sh
> @@ -0,0 +1,76 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +spath="$(dirname "$(readlink -f "$0")")"
> +kroot="$spath/.."
> +outdir="$(pwd)"
> +tarfile=$1
> +cpio_dir=$outdir/$tarfile.tmp
> +
> +src_file_list=""
> +for f in $file_list; do

$file_list is not assigned here.

I applied the patches and I got an empty tar generated. Setting `file_list=$*`
fixed it for me - though not sure if this is the right fix to use. Last minute
change/cleanup accidently removed the line that assigns $file_list?

Cheers

--
Qais Yousef

> +	src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
> +done
> +
> +obj_file_list=""
> +for f in $file_list; do
> +	f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
> +	obj_file_list="$obj_file_list $f";
> +done
> +
> +# Support incremental builds by skipping archive generation
> +# if timestamps of files being archived are not changed.
> +
> +# This block is useful for debugging the incremental builds.
> +# Uncomment it for debugging.
> +# iter=1
> +# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
> +# else; 	iter=$(($(cat /tmp/iter) + 1)); fi
> +# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
> +# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
> +
> +# modules.order and include/generated/compile.h are ignored because these are
> +# touched even when none of the source files changed. This causes pointless
> +# regeneration, so let us ignore them for md5 calculation.
> +pushd $kroot > /dev/null
> +src_files_md5="$(find $src_file_list -type f ! -name modules.order |
> +		grep -v "include/generated/compile.h"		   |
> +		xargs ls -lR | md5sum | cut -d ' ' -f1)"
> +popd > /dev/null
> +obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
> +		grep -v "include/generated/compile.h"		   |
> +		xargs ls -lR | md5sum | cut -d ' ' -f1)"
> +
> +if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
> +if [ -f kernel/kheaders.md5 ] &&
> +	[ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
> +	[ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
> +	[ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
> +		exit
> +fi
> +
> +rm -rf $cpio_dir
> +mkdir $cpio_dir
> +
> +pushd $kroot > /dev/null
> +for f in $src_file_list;
> +	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
> +done | cpio --quiet -pd $cpio_dir
> +popd > /dev/null
> +
> +# The second CPIO can complain if files already exist which can
> +# happen with out of tree builds. Just silence CPIO for now.
> +for f in $obj_file_list;
> +	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
> +done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
> +
> +find  $cpio_dir -type f -print0 |
> +	xargs -0 -P8 -n1 -I {} sh -c "$spath/strip-comments.pl {}"
> +
> +tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
> +
> +echo "$src_files_md5" > kernel/kheaders.md5
> +echo "$obj_files_md5" >> kernel/kheaders.md5
> +echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
> +
> +rm -rf $cpio_dir
> diff --git a/scripts/strip-comments.pl b/scripts/strip-comments.pl
> new file mode 100755
> index 000000000000..f8ada87c5802
> --- /dev/null
> +++ b/scripts/strip-comments.pl
> @@ -0,0 +1,8 @@
> +#!/usr/bin/perl -pi
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# This script removes /**/ comments from a file, unless such comments
> +# contain "SPDX". It is used when building compressed in-kernel headers.
> +
> +BEGIN {undef $/;}
> +s/\/\*((?!SPDX).)*?\*\///smg;
> -- 
> 2.21.0.rc2.261.ga7da99ff1b-goog
Joel Fernandes Feb. 28, 2019, 2:43 p.m. UTC | #4
On Thu, Feb 28, 2019 at 11:17:51AM +0900, Masahiro Yamada wrote:
> Hi Joel,
> 
> 
> On Thu, Feb 28, 2019 at 4:40 AM Joel Fernandes (Google)
> <joel@joelfernandes.org> wrote:
> >
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> >
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> >
> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> >
> > The code to read the headers is based on /proc/config.gz code and uses
> > the same technique to embed the headers.
> 
> 
> 
> Please let me ask a question about the actual use-case.
> 
> 
> To build embedded systems including Android,
> I use an x86 build machine.
> 
> In other words, I cross-compile vmlinux and in-tree modules.
> So,
> 
>   target-arch: arm64
>   host-arch:   x86
> 
> 
> 
> > To build a module, the below steps have been tested on an x86 machine:
> > modprobe kheaders
> > rm -rf $HOME/headers
> > mkdir -p $HOME/headers
> > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > cd my-kernel-module
> > make -C $HOME/headers M=$(pwd) modules
> > rmmod kheaders
> 
> I am guessing the user will run these commands
> on the target system.
> In other words, external modules are native-compiled.
> So,
> 
>   target-arch: arm64
>   host-arch:   arm64
> 
> 
> Is this correct?
> 
> 
> If I understood the assumed use-case correctly,
> kheaders.tar.xw will contain host-programs compiled for x86,
> which will not work on the target system.
> 

You are right, the above commands in the commit message work only if the
host/target are same arch due to scripts.

However we can build with arm64 device connected to a host, like this (which
I tested):

adb shell modprobe kheaders; adb pull /proc/kheaders.tar.xz
rm -rf $HOME/headers; mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) ARCH=arm64 CROSS_COMPILE=aarch64- modules
adb push test.ko /data/; adb shell rmmod kheaders

The other way we can make this work is using x86 usermode emulation inside a
chroot on the Android device which will make the earlier commands work. One
thing to note is that Android also runs on x86 hardware so the commands in
the commit message will work even for x86 Android targets already.

Also note that this the "module building" part is really only one of the
usecases. eBPF is another which needs the headers - and the headers are vast
majority of the archive. Headers take 3.1MB out of 3.6MB of the archive on
arm64 builds.

How do you want to proceed here, should I mention these points in the commit
message?

thanks,

 - Joel
Joel Fernandes Feb. 28, 2019, 2:47 p.m. UTC | #5
On Thu, Feb 28, 2019 at 01:53:43PM +0000, Qais Yousef wrote:
> Hi Joel
> 
> On 02/27/19 14:37, Joel Fernandes (Google) wrote:
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> 
> Thanks for doing this work. This will be a very helpful feature and simplify
> the workflow, especially when dealing with multiple targets each with its own
> kernel tree/version.

You're welcome and glad you can find it useful as well.

> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> 
> I use non-android systems quite often for development. And yes while getting
> the headers on the target isn't always a problem but when moving across trees
> or versions it's very easy to mess this up and having the headers embedded in
> the kernel means you're always guaranteed to use the right headers. eBPF is my
> main use case here.
> 
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> > 
> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> 
> [...]
> 
> > diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
> > new file mode 100755
> > index 000000000000..7329262bed2f
> > --- /dev/null
> > +++ b/scripts/gen_ikh_data.sh
> > @@ -0,0 +1,76 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +spath="$(dirname "$(readlink -f "$0")")"
> > +kroot="$spath/.."
> > +outdir="$(pwd)"
> > +tarfile=$1
> > +cpio_dir=$outdir/$tarfile.tmp
> > +
> > +src_file_list=""
> > +for f in $file_list; do
> 
> $file_list is not assigned here.
> 
> I applied the patches and I got an empty tar generated. Setting `file_list=$*`
> fixed it for me - though not sure if this is the right fix to use. Last minute
> change/cleanup accidently removed the line that assigns $file_list?

Ah good catch, I made this change for "file_list=${@:2}" in my tree but
forgot to push it. Below is the updated patch. Sorry and I'll refresh the
series with the change after we finish the discussion in the other thread.
Meanwhile the updated patch is as follows...

---8<-----------------------

From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Subject: [PATCH v3.1] Provide in-kernel headers for making it easy to extend the kernel

Introduce in-kernel headers and other artifacts which are made available
as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
it possible to build kernel modules, run eBPF programs, and other
tracing programs that need to extend the kernel for tracing purposes
without any dependency on the file system having headers and build
artifacts.

On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Raw kernel headers
also cannot be copied into the filesystem like they can be on other
distros, due to licensing and other issues. There's no linux-headers
package on Android. Further once a different kernel is booted, any
headers stored on the file system will no longer be useful. By storing
the headers as a compressed archive within the kernel, we can avoid these
issues that have been a hindrance for a long time.

The feature is also buildable as a module just in case the user desires
it not being part of the kernel image. This makes it possible to load
and unload the headers on demand. A tracing program, or a kernel module
builder can load the module, do its operations, and then unload the
module to save kernel memory. The total memory needed is 3.8MB.

The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.

To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders

Additional notes:
(1)
A limitation of module building with this is, since Module.symvers is
not available in the archive due to a cyclic dependency with building of
the archive into the kernel or module binaries, the modules built using
the archive will not contain symbol versioning (modversion). This is
usually not an issue since the idea of this patch is to build a kernel
module on the fly and load it into the same kernel. An appropriate
warning is already printed by the kernel to alert the user of modules
not having modversions when built using the archive. For building with
modversions, the user can use traditional header packages. For our
tracing usecases, we build modules on the fly with this so it is not a
concern.

(2) I have left IKHD_ST and IKHD_ED markers as is to facilitate
future patches that would extract the headers from a kernel or module
image.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
diff-note-start

    Changes since v2:
    (Thanks to Masahiro Yamada for several excellent suggestions)
    - Added support for out of tree builds.
    - Added incremental build support bringing down build time of
      incremental builds from 50 seconds to 5 seconds.
    - Fixed various small nits / cleanups.
    - clean ups to kheaders.c pointed by Alexey Dobriyan.
    - Fixed MODULE_LICENSE in test module and kheaders.c
    - Dropped Module.symvers from archive due to circular dependency.

    Changes since v1:
    - removed IKH_EXTRA variable, not needed (Masahiro Yamada)
    - small fix ups to selftest
       - added target to main Makefile etc
       - added MODULE_LICENSE to test module
       - made selftest more quiet

    Changes since RFC:
    Both changes bring size down to 3.8MB:
    - use xz for compression
    - strip comments except SPDX lines
    - Call out the module name in Kconfig
    - Also added selftests in second patch to ensure headers are always
    working.

Other notes:
By the way I still see this error (without the patch) when doing a clean
build: Makefile:594: include/config/auto.conf: No such file or directory

It appears to be because of commit 0a16d2e8cb7e ("kbuild: use 'include'
directive to load auto.conf from top Makefile")
---
 Documentation/dontdiff    |  1 +
 init/Kconfig              | 11 ++++++
 kernel/.gitignore         |  3 ++
 kernel/Makefile           | 36 ++++++++++++++++++
 kernel/kheaders.c         | 72 ++++++++++++++++++++++++++++++++++++
 scripts/gen_ikh_data.sh   | 78 +++++++++++++++++++++++++++++++++++++++
 scripts/strip-comments.pl |  8 ++++
 7 files changed, 209 insertions(+)
 create mode 100644 kernel/kheaders.c
 create mode 100755 scripts/gen_ikh_data.sh
 create mode 100755 scripts/strip-comments.pl

diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 2228fcc8e29f..05a2319ee2a2 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -151,6 +151,7 @@ int8.c
 kallsyms
 kconfig
 keywords.c
+kheaders_data.h*
 ksym.c*
 ksym.h*
 kxgettext
diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..63ff0990ae55 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,17 @@ config IKCONFIG_PROC
 	  This option enables access to the kernel configuration file
 	  through /proc/config.gz.
 
+config IKHEADERS_PROC
+	tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+	select BUILD_BIN2C
+	depends on PROC_FS
+	help
+	  This option enables access to the kernel header and other artifacts that
+          are generated during the build process. These can be used to build kernel
+          modules, and other in-kernel programs such as those generated by eBPF
+          and systemtap tools. If you build the headers as a module, a module
+          called kheaders.ko is built which can be loaded to get access to them.
+
 config LOG_BUF_SHIFT
 	int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
 	range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index b3097bde4e9c..484018945e93 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -3,5 +3,8 @@
 #
 config_data.h
 config_data.gz
+kheaders.md5
+kheaders_data.h
+kheaders_data.tar.xz
 timeconst.h
 hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6aa7543bcdb2..0bc7cacd7da6 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_USER_NS) += user_namespace.o
 obj-$(CONFIG_PID_NS) += pid_namespace.o
 obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -130,3 +131,38 @@ filechk_ikconfiggz = \
 targets += config_data.h
 $(obj)/config_data.h: $(obj)/config_data.gz FORCE
 	$(call filechk,ikconfiggz)
+
+# Build a list of in-kernel headers for building kernel modules
+ikh_file_list := include/
+ikh_file_list += arch/$(SRCARCH)/Makefile
+ikh_file_list += arch/$(SRCARCH)/include/
+ikh_file_list += scripts/
+ikh_file_list += Makefile
+
+# Things we need from the $objtree. "OBJDIR" is for the gen_ikh_data.sh
+# script to identify that this comes from the $objtree directory
+ikh_file_list += OBJDIR/scripts/
+ikh_file_list += OBJDIR/include/
+ikh_file_list += OBJDIR/arch/$(SRCARCH)/include/
+ifeq ($(CONFIG_STACK_VALIDATION), y)
+ikh_file_list += OBJDIR/tools/objtool/objtool
+endif
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.h
+
+targets += kheaders_data.tar.xz
+
+quiet_cmd_genikh = GEN     $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $(ikh_file_list)
+$(obj)/kheaders_data.tar.xz: FORCE
+	$(call cmd,genikh)
+
+filechk_ikheadersxz = \
+	echo "static const char kernel_headers_data[] __used = KH_MAGIC_START"; \
+	cat $< | scripts/bin2c; \
+	echo "KH_MAGIC_END;"
+
+targets += kheaders_data.h
+targets += kheaders.md5
+$(obj)/kheaders_data.h: $(obj)/kheaders_data.tar.xz FORCE
+	$(call filechk,ikheadersxz)
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..46a6358301e5
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_size, which contains the
+ * compressed kernel headers.  The file is first compressed with xz and then
+ * bounded by two eight byte magic numbers to allow extraction from a binary
+ * kernel image:
+ *
+ *   IKHD_ST
+ *   <image>
+ *   IKHD_ED
+ */
+#define KH_MAGIC_START	"IKHD_ST"
+#define KH_MAGIC_END	"IKHD_ED"
+#include "kheaders_data.h"
+
+
+#define KH_MAGIC_SIZE (sizeof(KH_MAGIC_START) - 1)
+#define kernel_headers_data_size \
+	(sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+		      size_t len, loff_t *offset)
+{
+	return simple_read_from_buffer(buf, len, offset,
+				       kernel_headers_data + KH_MAGIC_SIZE,
+				       kernel_headers_data_size);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+	.read = ikheaders_read_current,
+	.llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+	struct proc_dir_entry *entry;
+
+	/* create the current headers file */
+	entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+			    &ikheaders_file_ops);
+	if (!entry)
+		return -ENOMEM;
+
+	proc_set_size(entry, kernel_headers_data_size);
+
+	return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+	remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
new file mode 100755
index 000000000000..1fa5628fcc30
--- /dev/null
+++ b/scripts/gen_ikh_data.sh
@@ -0,0 +1,78 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+file_list=${@:2}
+
+src_file_list=""
+for f in $file_list; do
+	src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
+done
+
+obj_file_list=""
+for f in $file_list; do
+	f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
+	obj_file_list="$obj_file_list $f";
+done
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; 	iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# modules.order and include/generated/compile.h are ignored because these are
+# touched even when none of the source files changed. This causes pointless
+# regeneration, so let us ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h"		   |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h"		   |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+	[ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+		exit
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+find  $cpio_dir -type f -print0 |
+	xargs -0 -P8 -n1 -I {} sh -c "$spath/strip-comments.pl {}"
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
diff --git a/scripts/strip-comments.pl b/scripts/strip-comments.pl
new file mode 100755
index 000000000000..f8ada87c5802
--- /dev/null
+++ b/scripts/strip-comments.pl
@@ -0,0 +1,8 @@
+#!/usr/bin/perl -pi
+# SPDX-License-Identifier: GPL-2.0
+
+# This script removes /**/ comments from a file, unless such comments
+# contain "SPDX". It is used when building compressed in-kernel headers.
+
+BEGIN {undef $/;}
+s/\/\*((?!SPDX).)*?\*\///smg;
Joel Fernandes Feb. 28, 2019, 3 p.m. UTC | #6
On Thu, Feb 28, 2019 at 05:34:44PM +0900, Masami Hiramatsu wrote:
> Hi Joel,
> 
> On Wed, 27 Feb 2019 14:37:47 -0500
> "Joel Fernandes (Google)" <joel@joelfernandes.org> wrote:
> 
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> > 
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> 
> Hmm, isn't it easier to add kernel-headers package on Android?

I have already been down that road. In the Android ecosystem, the Android
teams only provide a "userspace system image" which goes on the system
partition of the flash (and a couple other images are also provided but
system is the main one). The system image cannot contain GPL source code. It
is also not possible to put kernel headers for every kernel version on the
system images that ship and is not practical. Android boots on 1000s of forked
kernels. It does not make sense to provide headers on the system image for
every kernel version and I already had many discussions on the subject with
the teams, it is something that is just not done. Now for kernel modules,
there's another image called the "vendor image" which is flashed onto the
vendor parition, this is where kernel modules go.  This vendor image is not
provided by Google for non-Pixel devices. So we have no control over what
goes there BUT we do know that kernel modules that are enabled will go there,
and we do have control over enforcing that certain kernel modules should be
built and available as they are mandatory for Android to function properly.
We would also possibly make it a built-in option as well. Anyway my point is
keeping it in the kernel is really the easiest and the smartest choice IMO.

> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> 
> But it also requires to install build environment (tools etc.)
> on the target system...

Yes, that's true. Check the other thread with Masahiro that we are discussing
this point on and let us continue discussing there:
https://lore.kernel.org/patchwork/patch/1046307/#1238223
https://lore.kernel.org/patchwork/patch/1046307/#1238491

> > The code to read the headers is based on /proc/config.gz code and uses
> > the same technique to embed the headers.
> > 
> > To build a module, the below steps have been tested on an x86 machine:
> > modprobe kheaders
> > rm -rf $HOME/headers
> > mkdir -p $HOME/headers
> > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > cd my-kernel-module
> > make -C $HOME/headers M=$(pwd) modules
> > rmmod kheaders
> 
> It seems a bit complex, but no difference from compared with carrying
> kheaders.tar.gz. I think we would better have a psudo filesystem
> which can mount this compressed header file directly :) Then it becomes
> simpler, like
> 
> modprobe headerfs
> mkdir $HOME/headers
> mount -t headerfs $HOME/headers
> 
> And this doesn't consume any disk-space.

I felt using a compressed tar is really the easiest way because of all the
tools are already available. There isn't a compressed in-ram filesystem right
now that I'm aware off that can achieve the kind of high compression ratio
this patchset does.

thanks,

 - Joel
Greg KH Feb. 28, 2019, 3:30 p.m. UTC | #7
On Thu, Feb 28, 2019 at 10:00:54AM -0500, Joel Fernandes wrote:
> > Hmm, isn't it easier to add kernel-headers package on Android?
> 
> I have already been down that road. In the Android ecosystem, the Android
> teams only provide a "userspace system image" which goes on the system
> partition of the flash (and a couple other images are also provided but
> system is the main one). The system image cannot contain GPL source code.

Note, that last sentance is not true, it can contain whatever license
code it wants to, as long as the group doing the distribution abides by
the license of the code contained in it.

The split of system and other images has nothing to do with license
issues, but rather functional requirements.  There _might_ be GPL
licensed code in the image if it meets a requirement for the proper
functionality of the system, you never know :)

thanks,

greg "not a lawyer but I work with a lot of them all the time" k-h
Joel Fernandes Feb. 28, 2019, 3:37 p.m. UTC | #8
On Thu, Feb 28, 2019 at 04:30:26PM +0100, Greg KH wrote:
> On Thu, Feb 28, 2019 at 10:00:54AM -0500, Joel Fernandes wrote:
> > > Hmm, isn't it easier to add kernel-headers package on Android?
> > 
> > I have already been down that road. In the Android ecosystem, the Android
> > teams only provide a "userspace system image" which goes on the system
> > partition of the flash (and a couple other images are also provided but
> > system is the main one). The system image cannot contain GPL source code.
> 
> Note, that last sentance is not true, it can contain whatever license
> code it wants to, as long as the group doing the distribution abides by
> the license of the code contained in it.
> 
> The split of system and other images has nothing to do with license
> issues, but rather functional requirements.  There _might_ be GPL
> licensed code in the image if it meets a requirement for the proper
> functionality of the system, you never know :)

I am not a lawyer either but I know I tried to convince people to allow
headers (GPL sources) on the system image for this same reason and was told
that is not something allowed (allowed can also mean "people are Ok with").

In any case, it is not practical to provide headers for every kernel version on
the system image and maintain them, it will take up too much space and has to
be periodically packaged. Not to mention that there will be kernel versions
that are booting Android that simply don't have anyone maintaining headers
for. So the proposed solution is an easier path.

thanks,

 - Joel

> 
> thanks,
> 
> greg "not a lawyer but I work with a lot of them all the time" k-h
Greg KH Feb. 28, 2019, 3:45 p.m. UTC | #9
On Thu, Feb 28, 2019 at 10:37:41AM -0500, Joel Fernandes wrote:
> In any case, it is not practical to provide headers for every kernel version on
> the system image and maintain them, it will take up too much space and has to
> be periodically packaged. Not to mention that there will be kernel versions
> that are booting Android that simply don't have anyone maintaining headers
> for. So the proposed solution is an easier path.

I totally agree with this, and with your proposed patches, just wanting
to clear up some potential anti-GPL-FUD before it got spread any further :)

thanks,

greg k-h
Joel Fernandes Feb. 28, 2019, 3:59 p.m. UTC | #10
On Thu, Feb 28, 2019 at 7:45 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Feb 28, 2019 at 10:37:41AM -0500, Joel Fernandes wrote:
> > In any case, it is not practical to provide headers for every kernel version on
> > the system image and maintain them, it will take up too much space and has to
> > be periodically packaged. Not to mention that there will be kernel versions
> > that are booting Android that simply don't have anyone maintaining headers
> > for. So the proposed solution is an easier path.
>
> I totally agree with this, and with your proposed patches, just wanting
> to clear up some potential anti-GPL-FUD before it got spread any further :)

Sure, no problem, thanks :)
 - Joel
Dietmar Eggemann Feb. 28, 2019, 4:04 p.m. UTC | #11
Hi Joel,

On 2/28/19 3:47 PM, Joel Fernandes wrote:
> On Thu, Feb 28, 2019 at 01:53:43PM +0000, Qais Yousef wrote:
>> Hi Joel
>>
>> On 02/27/19 14:37, Joel Fernandes (Google) wrote:

[...]

> Ah good catch, I made this change for "file_list=${@:2}" in my tree but
> forgot to push it. Below is the updated patch. Sorry and I'll refresh the
> series with the change after we finish the discussion in the other thread.
> Meanwhile the updated patch is as follows...
> 
> ---8<-----------------------
> 
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Subject: [PATCH v3.1] Provide in-kernel headers for making it easy to extend the kernel
> 
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
> 
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
> 
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.
> 
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.

This version gives me the header files on a v5.0-rc8 kernel on my arm64 
box but does not compile anymore on v4.20:

kernel/kheaders.c:25:22: error: expected identifier or ‘(’ before string 
constant
  #define KH_MAGIC_END "IKHD_ED"
                       ^
kernel/kheaders_data.h:1:1: note: in expansion of macro ‘KH_MAGIC_END’
  KH_MAGIC_END;
  ^~~~~~~~~~~~
kernel/kheaders.c: In function ‘ikheaders_read_current’:
kernel/kheaders.c:38:12: error: ‘kernel_headers_data’ undeclared (first 
use in this function); did you mean ‘kernel_headers_data_size’?
             kernel_headers_data + KH_MAGIC_SIZE,
             ^~~~~~~~~~~~~~~~~~~
             kernel_headers_data_size
kernel/kheaders.c:38:12: note: each undeclared identifier is reported 
only once for each function it appears in
kernel/kheaders.c: In function ‘ikheaders_init’:
kernel/kheaders.c:31:10: error: ‘kernel_headers_data’ undeclared (first 
use in this function); did you mean ‘kernel_headers_data_size’?
   (sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
           ^
kernel/kheaders.c:57:23: note: in expansion of macro 
‘kernel_headers_data_size’
   proc_set_size(entry, kernel_headers_data_size);
                        ^~~~~~~~~~~~~~~~~~~~~~~~
kernel/kheaders.c: In function ‘ikheaders_read_current’:
kernel/kheaders.c:40:1: warning: control reaches end of non-void 
function [-Wreturn-type]
  }


The reason for me to stay on v4.20 is that with v5.0-rc8 I don't have 
ebpf 'raw tracepoint' support any more on my arm64 board. But this issue 
is not related to your patch though.

Another point which supports the functionality your patch provides is 
the fact that maintainers don't want to see new TRACE_EVENTs in their 
code. So here your patch comes handy when using ebpf for tracing in 
embedded environments.
Steven Rostedt Feb. 28, 2019, 4:09 p.m. UTC | #12
On Thu, 28 Feb 2019 16:45:13 +0100
Greg KH <gregkh@linuxfoundation.org> wrote:

> On Thu, Feb 28, 2019 at 10:37:41AM -0500, Joel Fernandes wrote:
> > In any case, it is not practical to provide headers for every kernel version on
> > the system image and maintain them, it will take up too much space and has to
> > be periodically packaged. Not to mention that there will be kernel versions
> > that are booting Android that simply don't have anyone maintaining headers
> > for. So the proposed solution is an easier path.  
> 
> I totally agree with this, and with your proposed patches, just wanting
> to clear up some potential anti-GPL-FUD before it got spread any further :)
>

I would believe that the decision of saying it is "not allowed" is not
about it being illegal, but more of a "we would then need to keep track
of every GPL code and all the specific versions in every package that we
ship such that we will have to provide it if someone were to ask for
it", which would be required by the license. And that can be quite a
burden.

-- Steve
Qais Yousef Feb. 28, 2019, 4:22 p.m. UTC | #13
On 02/28/19 17:04, Dietmar Eggemann wrote:
> Hi Joel,
> 
> On 2/28/19 3:47 PM, Joel Fernandes wrote:
> > On Thu, Feb 28, 2019 at 01:53:43PM +0000, Qais Yousef wrote:
> > > Hi Joel
> > > 
> > > On 02/27/19 14:37, Joel Fernandes (Google) wrote:
> 
> [...]
> 
> > Ah good catch, I made this change for "file_list=${@:2}" in my tree but
> > forgot to push it. Below is the updated patch. Sorry and I'll refresh the
> > series with the change after we finish the discussion in the other thread.
> > Meanwhile the updated patch is as follows...
> > 
> > ---8<-----------------------
> > 
> > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > Subject: [PATCH v3.1] Provide in-kernel headers for making it easy to extend the kernel
> > 
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> > 
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> > 
> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> > 
> > The code to read the headers is based on /proc/config.gz code and uses
> > the same technique to embed the headers.
> 
> This version gives me the header files on a v5.0-rc8 kernel on my arm64 box
> but does not compile anymore on v4.20:
> 
> kernel/kheaders.c:25:22: error: expected identifier or ‘(’ before string
> constant
>  #define KH_MAGIC_END "IKHD_ED"
>                       ^
> kernel/kheaders_data.h:1:1: note: in expansion of macro ‘KH_MAGIC_END’

I believe this is caused by this change ad774086356d (kbuild: change filechk
to surround the given command with { }) which changes how filechk defined - and
since filechk_ikheadersxz is used to generate kheaders_data.h it fails when
backported because the syntax has changed.

HTH

--
Qais Yousef

>  KH_MAGIC_END;
>  ^~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:38:12: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>             kernel_headers_data + KH_MAGIC_SIZE,
>             ^~~~~~~~~~~~~~~~~~~
>             kernel_headers_data_size
> kernel/kheaders.c:38:12: note: each undeclared identifier is reported only
> once for each function it appears in
> kernel/kheaders.c: In function ‘ikheaders_init’:
> kernel/kheaders.c:31:10: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>   (sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
>           ^
> kernel/kheaders.c:57:23: note: in expansion of macro
> ‘kernel_headers_data_size’
>   proc_set_size(entry, kernel_headers_data_size);
>                        ^~~~~~~~~~~~~~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:40:1: warning: control reaches end of non-void function
> [-Wreturn-type]
>  }
> 
> 
> The reason for me to stay on v4.20 is that with v5.0-rc8 I don't have ebpf
> 'raw tracepoint' support any more on my arm64 board. But this issue is not
> related to your patch though.
> 
> Another point which supports the functionality your patch provides is the
> fact that maintainers don't want to see new TRACE_EVENTs in their code. So
> here your patch comes handy when using ebpf for tracing in embedded
> environments.
Qais Yousef Feb. 28, 2019, 4:48 p.m. UTC | #14
On 02/28/19 17:04, Dietmar Eggemann wrote:
> Hi Joel,
> 
> On 2/28/19 3:47 PM, Joel Fernandes wrote:
> > On Thu, Feb 28, 2019 at 01:53:43PM +0000, Qais Yousef wrote:
> > > Hi Joel
> > > 
> > > On 02/27/19 14:37, Joel Fernandes (Google) wrote:
> 
> [...]
> 
> > Ah good catch, I made this change for "file_list=${@:2}" in my tree but
> > forgot to push it. Below is the updated patch. Sorry and I'll refresh the
> > series with the change after we finish the discussion in the other thread.
> > Meanwhile the updated patch is as follows...
> > 
> > ---8<-----------------------
> > 
> > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > Subject: [PATCH v3.1] Provide in-kernel headers for making it easy to extend the kernel
> > 
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> > 
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> > 
> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> > 
> > The code to read the headers is based on /proc/config.gz code and uses
> > the same technique to embed the headers.
> 
> This version gives me the header files on a v5.0-rc8 kernel on my arm64 box
> but does not compile anymore on v4.20:
> 
> kernel/kheaders.c:25:22: error: expected identifier or ‘(’ before string
> constant
>  #define KH_MAGIC_END "IKHD_ED"
>                       ^
> kernel/kheaders_data.h:1:1: note: in expansion of macro ‘KH_MAGIC_END’
>  KH_MAGIC_END;
>  ^~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:38:12: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>             kernel_headers_data + KH_MAGIC_SIZE,
>             ^~~~~~~~~~~~~~~~~~~
>             kernel_headers_data_size
> kernel/kheaders.c:38:12: note: each undeclared identifier is reported only
> once for each function it appears in
> kernel/kheaders.c: In function ‘ikheaders_init’:
> kernel/kheaders.c:31:10: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>   (sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
>           ^
> kernel/kheaders.c:57:23: note: in expansion of macro
> ‘kernel_headers_data_size’
>   proc_set_size(entry, kernel_headers_data_size);
>                        ^~~~~~~~~~~~~~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:40:1: warning: control reaches end of non-void function
> [-Wreturn-type]
>  }
> 
> 
> The reason for me to stay on v4.20 is that with v5.0-rc8 I don't have ebpf
> 'raw tracepoint' support any more on my arm64 board. But this issue is not
> related to your patch though.

And this is caused by a38d1107f937 (bpf: support raw tracepoints in modules)
which renamed bpf_find_raw_tracepoint() which bcc-tools relies on to detect if
raw tracepoints are supported..

https://github.com/iovisor/bcc/blob/master/src/python/bcc/__init__.py#L860

Speaking of fragile depedencies :-)

I guess the check can be extended to check for both symbols - but it'll stay
fragile. Not sure if they can do better.

I filed a bug

https://github.com/iovisor/bcc/issues/2240

--
Qais Yousef

> 
> Another point which supports the functionality your patch provides is the
> fact that maintainers don't want to see new TRACE_EVENTs in their code. So
> here your patch comes handy when using ebpf for tracing in embedded
> environments.
Dietmar Eggemann Feb. 28, 2019, 6:36 p.m. UTC | #15
On 2/28/19 5:48 PM, Qais Yousef wrote:
> On 02/28/19 17:04, Dietmar Eggemann wrote:
>> Hi Joel,
>>
>> On 2/28/19 3:47 PM, Joel Fernandes wrote:
>>> On Thu, Feb 28, 2019 at 01:53:43PM +0000, Qais Yousef wrote:
>>>> Hi Joel
>>>>
>>>> On 02/27/19 14:37, Joel Fernandes (Google) wrote:

[...]

>> The reason for me to stay on v4.20 is that with v5.0-rc8 I don't have ebpf
>> 'raw tracepoint' support any more on my arm64 board. But this issue is not
>> related to your patch though.
> 
> And this is caused by a38d1107f937 (bpf: support raw tracepoints in modules)
> which renamed bpf_find_raw_tracepoint() which bcc-tools relies on to detect if
> raw tracepoints are supported..
> 
> https://github.com/iovisor/bcc/blob/master/src/python/bcc/__init__.py#L860
> 
> Speaking of fragile depedencies :-)
> 
> I guess the check can be extended to check for both symbols - but it'll stay
> fragile. Not sure if they can do better.
> 
> I filed a bug
> 
> https://github.com/iovisor/bcc/issues/2240

This did the trick for me on v5.0-rc8. Thanks, Qais!
Joel Fernandes Feb. 28, 2019, 7:10 p.m. UTC | #16
On Thu, Feb 28, 2019 at 05:04:59PM +0100, Dietmar Eggemann wrote:
[...]                      ^
> kernel/kheaders_data.h:1:1: note: in expansion of macro ‘KH_MAGIC_END’
>  KH_MAGIC_END;
>  ^~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:38:12: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>             kernel_headers_data + KH_MAGIC_SIZE,
>             ^~~~~~~~~~~~~~~~~~~
>             kernel_headers_data_size
> kernel/kheaders.c:38:12: note: each undeclared identifier is reported only
> once for each function it appears in
> kernel/kheaders.c: In function ‘ikheaders_init’:
> kernel/kheaders.c:31:10: error: ‘kernel_headers_data’ undeclared (first use
> in this function); did you mean ‘kernel_headers_data_size’?
>   (sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
>           ^
> kernel/kheaders.c:57:23: note: in expansion of macro
> ‘kernel_headers_data_size’
>   proc_set_size(entry, kernel_headers_data_size);
>                        ^~~~~~~~~~~~~~~~~~~~~~~~
> kernel/kheaders.c: In function ‘ikheaders_read_current’:
> kernel/kheaders.c:40:1: warning: control reaches end of non-void function
> [-Wreturn-type]
>  }
> 
> 
> The reason for me to stay on v4.20 is that with v5.0-rc8 I don't have ebpf
> 'raw tracepoint' support any more on my arm64 board. But this issue is not
> related to your patch though.

I'm glad mainline fixed your issue though as you said in the other thread ;-)
I will refer to your email here later when doing backports if I hit the issue.

> Another point which supports the functionality your patch provides is the
> fact that maintainers don't want to see new TRACE_EVENTs in their code. So
> here your patch comes handy when using ebpf for tracing in embedded
> environments.

Yes, I agree. One can also use the series to build custom kernel modules
which can also contain tracepoints in their own right. Glad you also find it
useful. I will CC you on it for next spin.

thanks,

 - Joel
Joel Fernandes Feb. 28, 2019, 11:27 p.m. UTC | #17
On Thu, Feb 28, 2019 at 09:43:06AM -0500, Joel Fernandes wrote:
> On Thu, Feb 28, 2019 at 11:17:51AM +0900, Masahiro Yamada wrote:
> > Hi Joel,
> > 
> > 
> > On Thu, Feb 28, 2019 at 4:40 AM Joel Fernandes (Google)
> > <joel@joelfernandes.org> wrote:
> > >
> > > Introduce in-kernel headers and other artifacts which are made available
> > > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > > it possible to build kernel modules, run eBPF programs, and other
> > > tracing programs that need to extend the kernel for tracing purposes
> > > without any dependency on the file system having headers and build
> > > artifacts.
> > >
> > > On Android and embedded systems, it is common to switch kernels but not
> > > have kernel headers available on the file system. Raw kernel headers
> > > also cannot be copied into the filesystem like they can be on other
> > > distros, due to licensing and other issues. There's no linux-headers
> > > package on Android. Further once a different kernel is booted, any
> > > headers stored on the file system will no longer be useful. By storing
> > > the headers as a compressed archive within the kernel, we can avoid these
> > > issues that have been a hindrance for a long time.
> > >
> > > The feature is also buildable as a module just in case the user desires
> > > it not being part of the kernel image. This makes it possible to load
> > > and unload the headers on demand. A tracing program, or a kernel module
> > > builder can load the module, do its operations, and then unload the
> > > module to save kernel memory. The total memory needed is 3.8MB.
> > >
> > > The code to read the headers is based on /proc/config.gz code and uses
> > > the same technique to embed the headers.
> > 
> > 
> > 
> > Please let me ask a question about the actual use-case.
> > 
> > 
> > To build embedded systems including Android,
> > I use an x86 build machine.
> > 
> > In other words, I cross-compile vmlinux and in-tree modules.
> > So,
> > 
> >   target-arch: arm64
> >   host-arch:   x86
> > 
> > 
> > 
> The other way we can make this work is using x86 usermode emulation inside a
> chroot on the Android device which will make the earlier commands work.

I verified the steps to build a module on my Pixel 3 (arm64) with Linux
kernel for arm64 compiled on my x86 host:

After building the headers, the steps were something like:

1.Build an x86 debian image with cross-gcc:

sudo qemu-debootstrap --arch amd64
  --include=make,gcc,gcc-aarch64-linux-gnu,perl,libelf1,python
  --variant=minbase $DIST $RUN_DIR http://ftp.us.debian.org/debian

2. Push qemu-x86_64-static (which I downloaded from the web) onto the device.

3. Tell binfmt_misc about qemu:
echo
':qemu-x86_64:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x3e\x00:
\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/qemu-x86_64-static:OC'
> /proc/sys/fs/binfmt_misc/register

4. adb shell and then chroot into the image

5. follow all the steps in the commit message but set ARCH and CROSS_COMPILE
appropriately.

After Make, kernel module is cooked and ready :)

thanks,

 - Joel
Masami Hiramatsu March 1, 2019, 2:28 a.m. UTC | #18
Hi Joel,

On Thu, 28 Feb 2019 10:00:54 -0500
Joel Fernandes <joel@joelfernandes.org> wrote:

> > Hmm, isn't it easier to add kernel-headers package on Android?
> 
> I have already been down that road. In the Android ecosystem, the Android
> teams only provide a "userspace system image" which goes on the system
> partition of the flash (and a couple other images are also provided but
> system is the main one). The system image cannot contain GPL source code. It
> is also not possible to put kernel headers for every kernel version on the
> system images that ship and is not practical. Android boots on 1000s of forked
> kernels. It does not make sense to provide headers on the system image for
> every kernel version and I already had many discussions on the subject with
> the teams, it is something that is just not done. Now for kernel modules,
> there's another image called the "vendor image" which is flashed onto the
> vendor parition, this is where kernel modules go.  This vendor image is not
> provided by Google for non-Pixel devices. So we have no control over what
> goes there BUT we do know that kernel modules that are enabled will go there,
> and we do have control over enforcing that certain kernel modules should be
> built and available as they are mandatory for Android to function properly.
> We would also possibly make it a built-in option as well. Anyway my point is
> keeping it in the kernel is really the easiest and the smartest choice IMO.

Sorry, I'm not convinced yet. This sounds like "because Android decided not
to put the header files on vendor partition, but kernel module is OK"
Why don't google ask vendors to put their kernel headers (or header tarball)
on vendor partition instead?

> 
> > > The feature is also buildable as a module just in case the user desires
> > > it not being part of the kernel image. This makes it possible to load
> > > and unload the headers on demand. A tracing program, or a kernel module
> > > builder can load the module, do its operations, and then unload the
> > > module to save kernel memory. The total memory needed is 3.8MB.
> > 
> > But it also requires to install build environment (tools etc.)
> > on the target system...
> 
> Yes, that's true. Check the other thread with Masahiro that we are discussing
> this point on and let us continue discussing there:
> https://lore.kernel.org/patchwork/patch/1046307/#1238223
> https://lore.kernel.org/patchwork/patch/1046307/#1238491
> 
> > > The code to read the headers is based on /proc/config.gz code and uses
> > > the same technique to embed the headers.
> > > 
> > > To build a module, the below steps have been tested on an x86 machine:
> > > modprobe kheaders
> > > rm -rf $HOME/headers
> > > mkdir -p $HOME/headers
> > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > cd my-kernel-module
> > > make -C $HOME/headers M=$(pwd) modules
> > > rmmod kheaders
> > 
> > It seems a bit complex, but no difference from compared with carrying
> > kheaders.tar.gz. I think we would better have a psudo filesystem
> > which can mount this compressed header file directly :) Then it becomes
> > simpler, like
> > 
> > modprobe headerfs
> > mkdir $HOME/headers
> > mount -t headerfs $HOME/headers
> > 
> > And this doesn't consume any disk-space.
> 
> I felt using a compressed tar is really the easiest way because of all the
> tools are already available.

As I asked above, if the pure tarball is useful, you can simply ask vendors
to put the header tarball on their vendor directory. I feel making it as
a module is not a right way.

> There isn't a compressed in-ram filesystem right
> now that I'm aware off that can achieve the kind of high compression ratio
> this patchset does.

I think if linux can support something like tarfs(or compressed initramfs)
in kernel, it gives linux an improvement not only a hack. :-)

Thank you,
Joel Fernandes March 1, 2019, 3:26 a.m. UTC | #19
On Fri, Mar 01, 2019 at 11:28:26AM +0900, Masami Hiramatsu wrote:
> Hi Joel,

Hi Masami,

> On Thu, 28 Feb 2019 10:00:54 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > > Hmm, isn't it easier to add kernel-headers package on Android?
> > 
> > I have already been down that road. In the Android ecosystem, the Android
> > teams only provide a "userspace system image" which goes on the system
> > partition of the flash (and a couple other images are also provided but
> > system is the main one). The system image cannot contain GPL source code. It
> > is also not possible to put kernel headers for every kernel version on the
> > system images that ship and is not practical. Android boots on 1000s of forked
> > kernels. It does not make sense to provide headers on the system image for
> > every kernel version and I already had many discussions on the subject with
> > the teams, it is something that is just not done. Now for kernel modules,
> > there's another image called the "vendor image" which is flashed onto the
> > vendor parition, this is where kernel modules go.  This vendor image is not
> > provided by Google for non-Pixel devices. So we have no control over what
> > goes there BUT we do know that kernel modules that are enabled will go there,
> > and we do have control over enforcing that certain kernel modules should be
> > built and available as they are mandatory for Android to function properly.
> > We would also possibly make it a built-in option as well. Anyway my point is
> > keeping it in the kernel is really the easiest and the smartest choice IMO.
> 
> Sorry, I'm not convinced yet. This sounds like "because Android decided not
> to put the header files on vendor partition, but kernel module is OK"
> Why don't google ask vendors to put their kernel headers (or header tarball)
> on vendor partition instead?

May be Google can do that, but I think you missed the point of the patches.
Making it a module is not mandatory, people can build it into the kernel as
well (CONFIG_IKHEADERS_PROC=y). In this case, the proc entry will be
available on boot without any dependency on the filesystem. If you go through
the other threads such as folks from ARM who replied, they just boot a kernel
image for debug purpose and want headers on device available without any
additional step of copying headers to the filesystem. And folks from Google
also said that they wanted a built-in option.

There are many usecases for this, I have often run into issues with Linux
over the years not only with Android, but other distros, where I boot custom
kernels with no linux-headers package. This is quite painful. It is
convenient to have it as /proc file since the file is dependent on kernel
being booted up and this will work across all Linux distros and systems. I
feel that if you can keep an open mind about it, you will see that a lot of
people will use this feature if it is accepted and there is a lot of positive
feedback in earlier posts of this set.

> > > > The code to read the headers is based on /proc/config.gz code and uses
> > > > the same technique to embed the headers.
> > > > 
> > > > To build a module, the below steps have been tested on an x86 machine:
> > > > modprobe kheaders
> > > > rm -rf $HOME/headers
> > > > mkdir -p $HOME/headers
> > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > cd my-kernel-module
> > > > make -C $HOME/headers M=$(pwd) modules
> > > > rmmod kheaders
> > > 
> > > It seems a bit complex, but no difference from compared with carrying
> > > kheaders.tar.gz. I think we would better have a psudo filesystem
> > > which can mount this compressed header file directly :) Then it becomes
> > > simpler, like
> > > 
> > > modprobe headerfs
> > > mkdir $HOME/headers
> > > mount -t headerfs $HOME/headers
> > > 
> > > And this doesn't consume any disk-space.
> > 
> > I felt using a compressed tar is really the easiest way because of all the
> > tools are already available.
> 
> As I asked above, if the pure tarball is useful, you can simply ask vendors
> to put the header tarball on their vendor directory. I feel making it as
> a module is not a right way.

I don't see what is the drawback of making it a module, it makes it well
integrated into kernel build and ecosystem. I also didn't see any
justification you're providing about why it cannot be a module. If you go
through this and earlier threads, a lot of people are Ok with having a module
option. And I asked several top kernel maintainers at LPC and many people
suggested having it as a module.

> > There isn't a compressed in-ram filesystem right
> > now that I'm aware off that can achieve the kind of high compression ratio
> > this patchset does.
> 
> I think if linux can support something like tarfs(or compressed initramfs)
> in kernel, it gives linux an improvement not only a hack. :-)

Agreed, that sounds like a good idea. I will consider doing it once the
series in its current form can be accepted. I am saying so since this series
is simple, and I can do that as a next step since that idea will take a lot
of time to implement. But I am keen on doing it.

thanks,

 - Joel
Masahiro Yamada March 1, 2019, 6:25 a.m. UTC | #20
On Fri, Mar 1, 2019 at 12:06 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Thu, Feb 28, 2019 at 11:17:51AM +0900, Masahiro Yamada wrote:
> > Hi Joel,
> >
> >
> > On Thu, Feb 28, 2019 at 4:40 AM Joel Fernandes (Google)
> > <joel@joelfernandes.org> wrote:
> > >
> > > Introduce in-kernel headers and other artifacts which are made available
> > > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > > it possible to build kernel modules, run eBPF programs, and other
> > > tracing programs that need to extend the kernel for tracing purposes
> > > without any dependency on the file system having headers and build
> > > artifacts.
> > >
> > > On Android and embedded systems, it is common to switch kernels but not
> > > have kernel headers available on the file system. Raw kernel headers
> > > also cannot be copied into the filesystem like they can be on other
> > > distros, due to licensing and other issues. There's no linux-headers
> > > package on Android. Further once a different kernel is booted, any
> > > headers stored on the file system will no longer be useful. By storing
> > > the headers as a compressed archive within the kernel, we can avoid these
> > > issues that have been a hindrance for a long time.
> > >
> > > The feature is also buildable as a module just in case the user desires
> > > it not being part of the kernel image. This makes it possible to load
> > > and unload the headers on demand. A tracing program, or a kernel module
> > > builder can load the module, do its operations, and then unload the
> > > module to save kernel memory. The total memory needed is 3.8MB.
> > >
> > > The code to read the headers is based on /proc/config.gz code and uses
> > > the same technique to embed the headers.
> >
> >
> >
> > Please let me ask a question about the actual use-case.
> >
> >
> > To build embedded systems including Android,
> > I use an x86 build machine.
> >
> > In other words, I cross-compile vmlinux and in-tree modules.
> > So,
> >
> >   target-arch: arm64
> >   host-arch:   x86
> >
> >
> >
> > > To build a module, the below steps have been tested on an x86 machine:
> > > modprobe kheaders
> > > rm -rf $HOME/headers
> > > mkdir -p $HOME/headers
> > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > cd my-kernel-module
> > > make -C $HOME/headers M=$(pwd) modules
> > > rmmod kheaders
> >
> > I am guessing the user will run these commands
> > on the target system.
> > In other words, external modules are native-compiled.
> > So,
> >
> >   target-arch: arm64
> >   host-arch:   arm64
> >
> >
> > Is this correct?
> >
> >
> > If I understood the assumed use-case correctly,
> > kheaders.tar.xw will contain host-programs compiled for x86,
> > which will not work on the target system.
> >
>
> You are right, the above commands in the commit message work only if the
> host/target are same arch due to scripts.
>
> However we can build with arm64 device connected to a host, like this (which
> I tested):
>
> adb shell modprobe kheaders; adb pull /proc/kheaders.tar.xz
> rm -rf $HOME/headers; mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) ARCH=arm64 CROSS_COMPILE=aarch64- modules
> adb push test.ko /data/; adb shell rmmod kheaders
>
> The other way we can make this work is using x86 usermode emulation inside a
> chroot on the Android device which will make the earlier commands work. One
> thing to note is that Android also runs on x86 hardware so the commands in
> the commit message will work even for x86 Android targets already.
>
> Also note that this the "module building" part is really only one of the
> usecases. eBPF is another which needs the headers - and the headers are vast
> majority of the archive. Headers take 3.1MB out of 3.6MB of the archive on
> arm64 builds.
>
> How do you want to proceed here, should I mention these points in the commit
> message?



I do not request a re-spin just for a matter of commit log,
but this version produces an empty tarball.
So, you will have a chance to update the patch anyway.

In the next version, it would be nice to note that
"external modules must be built on the same host arch
as built vmlinux".




Let me ask one more question.

I guess this patch is motivated by
how difficult to convey kernel headers
from vendors to users.

In that situation, how will the user find
the right compiler to use for building external modules?




Greg KH said:

We don't ever support the system of loading a module built with anything
other than the _exact_ same compiler than the kernel was.


For the full context, see this:
https://lore.kernel.org/patchwork/patch/836247/#1031547
Masami Hiramatsu March 1, 2019, 7:03 a.m. UTC | #21
Hi Joel,

On Thu, 28 Feb 2019 22:26:11 -0500
Joel Fernandes <joel@joelfernandes.org> wrote:

> On Fri, Mar 01, 2019 at 11:28:26AM +0900, Masami Hiramatsu wrote:
> > Hi Joel,
> 
> Hi Masami,
> 
> > On Thu, 28 Feb 2019 10:00:54 -0500
> > Joel Fernandes <joel@joelfernandes.org> wrote:
> > 
> > > > Hmm, isn't it easier to add kernel-headers package on Android?
> > > 
> > > I have already been down that road. In the Android ecosystem, the Android
> > > teams only provide a "userspace system image" which goes on the system
> > > partition of the flash (and a couple other images are also provided but
> > > system is the main one). The system image cannot contain GPL source code. It
> > > is also not possible to put kernel headers for every kernel version on the
> > > system images that ship and is not practical. Android boots on 1000s of forked
> > > kernels. It does not make sense to provide headers on the system image for
> > > every kernel version and I already had many discussions on the subject with
> > > the teams, it is something that is just not done. Now for kernel modules,
> > > there's another image called the "vendor image" which is flashed onto the
> > > vendor parition, this is where kernel modules go.  This vendor image is not
> > > provided by Google for non-Pixel devices. So we have no control over what
> > > goes there BUT we do know that kernel modules that are enabled will go there,
> > > and we do have control over enforcing that certain kernel modules should be
> > > built and available as they are mandatory for Android to function properly.
> > > We would also possibly make it a built-in option as well. Anyway my point is
> > > keeping it in the kernel is really the easiest and the smartest choice IMO.
> > 
> > Sorry, I'm not convinced yet. This sounds like "because Android decided not
> > to put the header files on vendor partition, but kernel module is OK"
> > Why don't google ask vendors to put their kernel headers (or header tarball)
> > on vendor partition instead?
> 
> May be Google can do that, but I think you missed the point of the patches.
> Making it a module is not mandatory, people can build it into the kernel as
> well (CONFIG_IKHEADERS_PROC=y). In this case, the proc entry will be
> available on boot without any dependency on the filesystem. If you go through
> the other threads such as folks from ARM who replied, they just boot a kernel
> image for debug purpose and want headers on device available without any
> additional step of copying headers to the filesystem. And folks from Google
> also said that they wanted a built-in option.

Agreed. Making it built-in is reasonable, since it will not involves any
other files. :) 

> There are many usecases for this, I have often run into issues with Linux
> over the years not only with Android, but other distros, where I boot custom
> kernels with no linux-headers package. This is quite painful. It is
> convenient to have it as /proc file since the file is dependent on kernel
> being booted up and this will work across all Linux distros and systems. I
> feel that if you can keep an open mind about it, you will see that a lot of
> people will use this feature if it is accepted and there is a lot of positive
> feedback in earlier posts of this set.

I don't complain about having headers for custom boot kernel. I agree with you
that having kernel headers for debugging is always good. :)
So google recommends built-in, it is reasonable.

> > > > > The code to read the headers is based on /proc/config.gz code and uses
> > > > > the same technique to embed the headers.
> > > > > 
> > > > > To build a module, the below steps have been tested on an x86 machine:
> > > > > modprobe kheaders
> > > > > rm -rf $HOME/headers
> > > > > mkdir -p $HOME/headers
> > > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > > cd my-kernel-module
> > > > > make -C $HOME/headers M=$(pwd) modules
> > > > > rmmod kheaders
> > > > 
> > > > It seems a bit complex, but no difference from compared with carrying
> > > > kheaders.tar.gz. I think we would better have a psudo filesystem
> > > > which can mount this compressed header file directly :) Then it becomes
> > > > simpler, like
> > > > 
> > > > modprobe headerfs
> > > > mkdir $HOME/headers
> > > > mount -t headerfs $HOME/headers
> > > > 
> > > > And this doesn't consume any disk-space.
> > > 
> > > I felt using a compressed tar is really the easiest way because of all the
> > > tools are already available.
> > 
> > As I asked above, if the pure tarball is useful, you can simply ask vendors
> > to put the header tarball on their vendor directory. I feel making it as
> > a module is not a right way.
> 
> I don't see what is the drawback of making it a module, it makes it well
> integrated into kernel build and ecosystem. I also didn't see any
> justification you're providing about why it cannot be a module. If you go
> through this and earlier threads, a lot of people are Ok with having a module
> option. And I asked several top kernel maintainers at LPC and many people
> suggested having it as a module.

I meant, if we have a tarball, we don't need any operation of loading/unloading
kmodules. But if we have this as built-in, yes, this would be much easier to
deploy it to device.

Anyway, having that option (make it as a module) is not bad. IMHO, that may
be more complicated than just have a tarball file, but it is a user's choice.

OK, now I understand it.

> > > There isn't a compressed in-ram filesystem right
> > > now that I'm aware off that can achieve the kind of high compression ratio
> > > this patchset does.
> > 
> > I think if linux can support something like tarfs(or compressed initramfs)
> > in kernel, it gives linux an improvement not only a hack. :-)
> 
> Agreed, that sounds like a good idea. I will consider doing it once the
> series in its current form can be accepted. I am saying so since this series
> is simple, and I can do that as a next step since that idea will take a lot
> of time to implement. But I am keen on doing it.

I look forward to it :-)

Thank you!
Qais Yousef March 1, 2019, 11:05 a.m. UTC | #22
On 02/28/19 22:26, Joel Fernandes wrote:
> On Fri, Mar 01, 2019 at 11:28:26AM +0900, Masami Hiramatsu wrote:
> > Hi Joel,
> 
> Hi Masami,
> 
> > On Thu, 28 Feb 2019 10:00:54 -0500
> > Joel Fernandes <joel@joelfernandes.org> wrote:
> > 
> > > > Hmm, isn't it easier to add kernel-headers package on Android?
> > > 
> > > I have already been down that road. In the Android ecosystem, the Android
> > > teams only provide a "userspace system image" which goes on the system
> > > partition of the flash (and a couple other images are also provided but
> > > system is the main one). The system image cannot contain GPL source code. It
> > > is also not possible to put kernel headers for every kernel version on the
> > > system images that ship and is not practical. Android boots on 1000s of forked
> > > kernels. It does not make sense to provide headers on the system image for
> > > every kernel version and I already had many discussions on the subject with
> > > the teams, it is something that is just not done. Now for kernel modules,
> > > there's another image called the "vendor image" which is flashed onto the
> > > vendor parition, this is where kernel modules go.  This vendor image is not
> > > provided by Google for non-Pixel devices. So we have no control over what
> > > goes there BUT we do know that kernel modules that are enabled will go there,
> > > and we do have control over enforcing that certain kernel modules should be
> > > built and available as they are mandatory for Android to function properly.
> > > We would also possibly make it a built-in option as well. Anyway my point is
> > > keeping it in the kernel is really the easiest and the smartest choice IMO.
> > 
> > Sorry, I'm not convinced yet. This sounds like "because Android decided not
> > to put the header files on vendor partition, but kernel module is OK"
> > Why don't google ask vendors to put their kernel headers (or header tarball)
> > on vendor partition instead?
> 
> May be Google can do that, but I think you missed the point of the patches.
> Making it a module is not mandatory, people can build it into the kernel as
> well (CONFIG_IKHEADERS_PROC=y). In this case, the proc entry will be
> available on boot without any dependency on the filesystem. If you go through
> the other threads such as folks from ARM who replied, they just boot a kernel
> image for debug purpose and want headers on device available without any
> additional step of copying headers to the filesystem. And folks from Google
> also said that they wanted a built-in option.

Yes I do see this patch a useful addition. When you need the header and you're
dealing with multiple devices and kernel trees and versions - having the
headers part of the kernel just makes it easier to use them when you need them.
Especially sometimes when you change a config option (like enabling a new
syscall e.g: bpf) it's very easy to forget that the headers has changed as well
and you need to push an updated copy.

FWIW most of the time I run on non-android systems for development/debugging
purposes. I use the built-in version although I can see a module version
helpful too. You can save some space if your kernel image partition is small
and it's easy to install the module along with all other modules when
rebuilding the kernel than remembering to update the headers. It's a (very)
nice convenience.

--
Qais Yousef

> 
> There are many usecases for this, I have often run into issues with Linux
> over the years not only with Android, but other distros, where I boot custom
> kernels with no linux-headers package. This is quite painful. It is
> convenient to have it as /proc file since the file is dependent on kernel
> being booted up and this will work across all Linux distros and systems. I
> feel that if you can keep an open mind about it, you will see that a lot of
> people will use this feature if it is accepted and there is a lot of positive
> feedback in earlier posts of this set.
> 
> > > > > The code to read the headers is based on /proc/config.gz code and uses
> > > > > the same technique to embed the headers.
> > > > > 
> > > > > To build a module, the below steps have been tested on an x86 machine:
> > > > > modprobe kheaders
> > > > > rm -rf $HOME/headers
> > > > > mkdir -p $HOME/headers
> > > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > > cd my-kernel-module
> > > > > make -C $HOME/headers M=$(pwd) modules
> > > > > rmmod kheaders
> > > > 
> > > > It seems a bit complex, but no difference from compared with carrying
> > > > kheaders.tar.gz. I think we would better have a psudo filesystem
> > > > which can mount this compressed header file directly :) Then it becomes
> > > > simpler, like
> > > > 
> > > > modprobe headerfs
> > > > mkdir $HOME/headers
> > > > mount -t headerfs $HOME/headers
> > > > 
> > > > And this doesn't consume any disk-space.
> > > 
> > > I felt using a compressed tar is really the easiest way because of all the
> > > tools are already available.
> > 
> > As I asked above, if the pure tarball is useful, you can simply ask vendors
> > to put the header tarball on their vendor directory. I feel making it as
> > a module is not a right way.
> 
> I don't see what is the drawback of making it a module, it makes it well
> integrated into kernel build and ecosystem. I also didn't see any
> justification you're providing about why it cannot be a module. If you go
> through this and earlier threads, a lot of people are Ok with having a module
> option. And I asked several top kernel maintainers at LPC and many people
> suggested having it as a module.
> 
> > > There isn't a compressed in-ram filesystem right
> > > now that I'm aware off that can achieve the kind of high compression ratio
> > > this patchset does.
> > 
> > I think if linux can support something like tarfs(or compressed initramfs)
> > in kernel, it gives linux an improvement not only a hack. :-)
> 
> Agreed, that sounds like a good idea. I will consider doing it once the
> series in its current form can be accepted. I am saying so since this series
> is simple, and I can do that as a next step since that idea will take a lot
> of time to implement. But I am keen on doing it.
> 
> thanks,
> 
>  - Joel
>
Joel Fernandes March 1, 2019, 5:19 p.m. UTC | #23
On Fri, Mar 01, 2019 at 03:25:05PM +0900, Masahiro Yamada wrote:
[...]
> > > I am guessing the user will run these commands
> > > on the target system.
> > > In other words, external modules are native-compiled.
> > > So,
> > >
> > >   target-arch: arm64
> > >   host-arch:   arm64
> > >
> > >
> > > Is this correct?
> > >
> > >
> > > If I understood the assumed use-case correctly,
> > > kheaders.tar.xw will contain host-programs compiled for x86,
> > > which will not work on the target system.
> > >
> >
> > You are right, the above commands in the commit message work only if the
> > host/target are same arch due to scripts.
> >
> > However we can build with arm64 device connected to a host, like this (which
> > I tested):
> >
> > adb shell modprobe kheaders; adb pull /proc/kheaders.tar.xz
> > rm -rf $HOME/headers; mkdir -p $HOME/headers
> > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > cd my-kernel-module
> > make -C $HOME/headers M=$(pwd) ARCH=arm64 CROSS_COMPILE=aarch64- modules
> > adb push test.ko /data/; adb shell rmmod kheaders
> >
> > The other way we can make this work is using x86 usermode emulation inside a
> > chroot on the Android device which will make the earlier commands work. One
> > thing to note is that Android also runs on x86 hardware so the commands in
> > the commit message will work even for x86 Android targets already.
> >
> > Also note that this the "module building" part is really only one of the
> > usecases. eBPF is another which needs the headers - and the headers are vast
> > majority of the archive. Headers take 3.1MB out of 3.6MB of the archive on
> > arm64 builds.
> >
> > How do you want to proceed here, should I mention these points in the commit
> > message?
> 
> 
> 
> I do not request a re-spin just for a matter of commit log,
> but this version produces an empty tarball.
> So, you will have a chance to update the patch anyway.
> 
> In the next version, it would be nice to note that
> "external modules must be built on the same host arch
> as built vmlinux".

Ok, I respun it with 1 more minor nit for arm64 building. Please take a look.

> Let me ask one more question.
> 
> I guess this patch is motivated by
> how difficult to convey kernel headers
> from vendors to users.
> 
> In that situation, how will the user find
> the right compiler to use for building external modules?
> 
> 
> 
> 
> Greg KH said:
> 
> We don't ever support the system of loading a module built with anything
> other than the _exact_ same compiler than the kernel was.
> 
> 
> For the full context, see this:
> https://lore.kernel.org/patchwork/patch/836247/#1031547

IMO this issue is not related to this patch but is just an issue with
building external modules in general. It is up to the user to use the right
compiler, etc. I will let Greg comment more on that.

thanks,

 - Joel
Joel Fernandes March 1, 2019, 5:21 p.m. UTC | #24
On Fri, Mar 01, 2019 at 04:03:09PM +0900, Masami Hiramatsu wrote:
> Hi Joel,
> 
> On Thu, 28 Feb 2019 22:26:11 -0500
> Joel Fernandes <joel@joelfernandes.org> wrote:
> 
> > On Fri, Mar 01, 2019 at 11:28:26AM +0900, Masami Hiramatsu wrote:
[..]
> > There are many usecases for this, I have often run into issues with Linux
> > over the years not only with Android, but other distros, where I boot custom
> > kernels with no linux-headers package. This is quite painful. It is
> > convenient to have it as /proc file since the file is dependent on kernel
> > being booted up and this will work across all Linux distros and systems. I
> > feel that if you can keep an open mind about it, you will see that a lot of
> > people will use this feature if it is accepted and there is a lot of positive
> > feedback in earlier posts of this set.
> 
> I don't complain about having headers for custom boot kernel. I agree with you
> that having kernel headers for debugging is always good. :)
> So google recommends built-in, it is reasonable.

Ok, thanks :)

> > > > > > The code to read the headers is based on /proc/config.gz code and uses
> > > > > > the same technique to embed the headers.
> > > > > > 
> > > > > > To build a module, the below steps have been tested on an x86 machine:
> > > > > > modprobe kheaders
> > > > > > rm -rf $HOME/headers
> > > > > > mkdir -p $HOME/headers
> > > > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > > > cd my-kernel-module
> > > > > > make -C $HOME/headers M=$(pwd) modules
> > > > > > rmmod kheaders
> > > > > 
> > > > > It seems a bit complex, but no difference from compared with carrying
> > > > > kheaders.tar.gz. I think we would better have a psudo filesystem
> > > > > which can mount this compressed header file directly :) Then it becomes
> > > > > simpler, like
> > > > > 
> > > > > modprobe headerfs
> > > > > mkdir $HOME/headers
> > > > > mount -t headerfs $HOME/headers
> > > > > 
> > > > > And this doesn't consume any disk-space.
> > > > 
> > > > I felt using a compressed tar is really the easiest way because of all the
> > > > tools are already available.
> > > 
> > > As I asked above, if the pure tarball is useful, you can simply ask vendors
> > > to put the header tarball on their vendor directory. I feel making it as
> > > a module is not a right way.
> > 
> > I don't see what is the drawback of making it a module, it makes it well
> > integrated into kernel build and ecosystem. I also didn't see any
> > justification you're providing about why it cannot be a module. If you go
> > through this and earlier threads, a lot of people are Ok with having a module
> > option. And I asked several top kernel maintainers at LPC and many people
> > suggested having it as a module.
> 
> I meant, if we have a tarball, we don't need any operation of loading/unloading
> kmodules. But if we have this as built-in, yes, this would be much easier to
> deploy it to device.
> 
> Anyway, having that option (make it as a module) is not bad. IMHO, that may
> be more complicated than just have a tarball file, but it is a user's choice.
> 
> OK, now I understand it.

Sounds good. :) Just sent out v4.

thanks,

 - Joel
Masahiro Yamada March 2, 2019, 2:13 a.m. UTC | #25
On Sat, Mar 2, 2019 at 3:03 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Fri, Mar 01, 2019 at 03:25:05PM +0900, Masahiro Yamada wrote:
> [...]
> > > > I am guessing the user will run these commands
> > > > on the target system.
> > > > In other words, external modules are native-compiled.
> > > > So,
> > > >
> > > >   target-arch: arm64
> > > >   host-arch:   arm64
> > > >
> > > >
> > > > Is this correct?
> > > >
> > > >
> > > > If I understood the assumed use-case correctly,
> > > > kheaders.tar.xw will contain host-programs compiled for x86,
> > > > which will not work on the target system.
> > > >
> > >
> > > You are right, the above commands in the commit message work only if the
> > > host/target are same arch due to scripts.
> > >
> > > However we can build with arm64 device connected to a host, like this (which
> > > I tested):
> > >
> > > adb shell modprobe kheaders; adb pull /proc/kheaders.tar.xz
> > > rm -rf $HOME/headers; mkdir -p $HOME/headers
> > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > cd my-kernel-module
> > > make -C $HOME/headers M=$(pwd) ARCH=arm64 CROSS_COMPILE=aarch64- modules
> > > adb push test.ko /data/; adb shell rmmod kheaders
> > >
> > > The other way we can make this work is using x86 usermode emulation inside a
> > > chroot on the Android device which will make the earlier commands work. One
> > > thing to note is that Android also runs on x86 hardware so the commands in
> > > the commit message will work even for x86 Android targets already.
> > >
> > > Also note that this the "module building" part is really only one of the
> > > usecases. eBPF is another which needs the headers - and the headers are vast
> > > majority of the archive. Headers take 3.1MB out of 3.6MB of the archive on
> > > arm64 builds.
> > >
> > > How do you want to proceed here, should I mention these points in the commit
> > > message?
> >
> >
> >
> > I do not request a re-spin just for a matter of commit log,
> > but this version produces an empty tarball.
> > So, you will have a chance to update the patch anyway.
> >
> > In the next version, it would be nice to note that
> > "external modules must be built on the same host arch
> > as built vmlinux".
>
> Ok, I respun it with 1 more minor nit for arm64 building. Please take a look.


I have not checked code-diff in v3 yet.

Anyway, I will add comments to v4
if I notice something.


> > Let me ask one more question.
> >
> > I guess this patch is motivated by
> > how difficult to convey kernel headers
> > from vendors to users.
> >
> > In that situation, how will the user find
> > the right compiler to use for building external modules?
> >
> >
> >
> >
> > Greg KH said:
> >
> > We don't ever support the system of loading a module built with anything
> > other than the _exact_ same compiler than the kernel was.
> >
> >
> > For the full context, see this:
> > https://lore.kernel.org/patchwork/patch/836247/#1031547
>
> IMO this issue is not related to this patch but is just an issue with
> building external modules in general.


I do not think it is an issue of the build system, at least.

As far as I understood Greg's comment, it is troublesome
without the assumption that vmlinux and modules are built
by the same compiler.

It is related to this patch since this patch assumes use-cases
where external modules are built in a completely different environment,
where a different compiler is probably installed.



> It is up to the user to use the right
> compiler, etc. I will let Greg comment more on that.







> thanks,
>
>  - Joel
>
Joel Fernandes March 2, 2019, 2:39 a.m. UTC | #26
On Sat, Mar 02, 2019 at 11:13:07AM +0900, Masahiro Yamada wrote:
> On Sat, Mar 2, 2019 at 3:03 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > On Fri, Mar 01, 2019 at 03:25:05PM +0900, Masahiro Yamada wrote:
> > [...]
> > > > > I am guessing the user will run these commands
> > > > > on the target system.
> > > > > In other words, external modules are native-compiled.
> > > > > So,
> > > > >
> > > > >   target-arch: arm64
> > > > >   host-arch:   arm64
> > > > >
> > > > >
> > > > > Is this correct?
> > > > >
> > > > >
> > > > > If I understood the assumed use-case correctly,
> > > > > kheaders.tar.xw will contain host-programs compiled for x86,
> > > > > which will not work on the target system.
> > > > >
> > > >
> > > > You are right, the above commands in the commit message work only if the
> > > > host/target are same arch due to scripts.
> > > >
> > > > However we can build with arm64 device connected to a host, like this (which
> > > > I tested):
> > > >
> > > > adb shell modprobe kheaders; adb pull /proc/kheaders.tar.xz
> > > > rm -rf $HOME/headers; mkdir -p $HOME/headers
> > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > cd my-kernel-module
> > > > make -C $HOME/headers M=$(pwd) ARCH=arm64 CROSS_COMPILE=aarch64- modules
> > > > adb push test.ko /data/; adb shell rmmod kheaders
> > > >
> > > > The other way we can make this work is using x86 usermode emulation inside a
> > > > chroot on the Android device which will make the earlier commands work. One
> > > > thing to note is that Android also runs on x86 hardware so the commands in
> > > > the commit message will work even for x86 Android targets already.
> > > >
> > > > Also note that this the "module building" part is really only one of the
> > > > usecases. eBPF is another which needs the headers - and the headers are vast
> > > > majority of the archive. Headers take 3.1MB out of 3.6MB of the archive on
> > > > arm64 builds.
> > > >
> > > > How do you want to proceed here, should I mention these points in the commit
> > > > message?
> > >
> > >
> > >
> > > I do not request a re-spin just for a matter of commit log,
> > > but this version produces an empty tarball.
> > > So, you will have a chance to update the patch anyway.
> > >
> > > In the next version, it would be nice to note that
> > > "external modules must be built on the same host arch
> > > as built vmlinux".
> >
> > Ok, I respun it with 1 more minor nit for arm64 building. Please take a look.
> 
> 
> I have not checked code-diff in v3 yet.
> 
> Anyway, I will add comments to v4
> if I notice something.

Ok. Since all your comments from previous series were addressed, it would be
nice to get your Acked-by tag for v4 unless you have further comments or
concerns.

> > > Let me ask one more question.
> > >
> > > I guess this patch is motivated by
> > > how difficult to convey kernel headers
> > > from vendors to users.
> > >
> > > In that situation, how will the user find
> > > the right compiler to use for building external modules?
> > >
> > >
> > >
> > >
> > > Greg KH said:
> > >
> > > We don't ever support the system of loading a module built with anything
> > > other than the _exact_ same compiler than the kernel was.
> > >
> > >
> > > For the full context, see this:
> > > https://lore.kernel.org/patchwork/patch/836247/#1031547
> >
> > IMO this issue is not related to this patch but is just an issue with
> > building external modules in general.
> 
> 
> I do not think it is an issue of the build system, at least.
> 
> As far as I understood Greg's comment, it is troublesome
> without the assumption that vmlinux and modules are built
> by the same compiler.
> It is related to this patch since this patch assumes use-cases
> where external modules are built in a completely different environment,
> where a different compiler is probably installed.

Yes, but what I'm trying to say is the same issue exists with all other
solutions today that do this. Such as debian you have linux-headers package.
A user could totally use the build artifacts obtained from somewhere to build
a kernel module with a completely different compiler. That issue has just to
do with the reality, and isn't an issue caused by any one solution such as
this one.  I agree care must be taken whenever user is building external
kernel modules independent of kernel sources.  Did I miss something else?

thanks a lot,

 - Joel
Masahiro Yamada March 4, 2019, 5:40 a.m. UTC | #27
> > > > Let me ask one more question.
> > > >
> > > > I guess this patch is motivated by
> > > > how difficult to convey kernel headers
> > > > from vendors to users.
> > > >
> > > > In that situation, how will the user find
> > > > the right compiler to use for building external modules?
> > > >
> > > >
> > > >
> > > >
> > > > Greg KH said:
> > > >
> > > > We don't ever support the system of loading a module built with anything
> > > > other than the _exact_ same compiler than the kernel was.
> > > >
> > > >
> > > > For the full context, see this:
> > > > https://lore.kernel.org/patchwork/patch/836247/#1031547
> > >
> > > IMO this issue is not related to this patch but is just an issue with
> > > building external modules in general.
> >
> >
> > I do not think it is an issue of the build system, at least.
> >
> > As far as I understood Greg's comment, it is troublesome
> > without the assumption that vmlinux and modules are built
> > by the same compiler.
> > It is related to this patch since this patch assumes use-cases
> > where external modules are built in a completely different environment,
> > where a different compiler is probably installed.
>
> Yes, but what I'm trying to say is the same issue exists with all other
> solutions today that do this. Such as debian you have linux-headers package.


Distributions provide the compiler in the standard path (/usr/bin/gcc),
and users are supposed to use it for building external modules.
That's the difference.



> A user could totally use the build artifacts obtained from somewhere to build
> a kernel module with a completely different compiler. That issue has just to
> do with the reality, and isn't an issue caused by any one solution such as
> this one.  I agree care must be taken whenever user is building external
> kernel modules independent of kernel sources.  Did I miss something else?
>
> thanks a lot,
>
>  - Joel
>

--
Best Regards
Masahiro Yamada

Patch
diff mbox series

diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 2228fcc8e29f..05a2319ee2a2 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -151,6 +151,7 @@  int8.c
 kallsyms
 kconfig
 keywords.c
+kheaders_data.h*
 ksym.c*
 ksym.h*
 kxgettext
diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..63ff0990ae55 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,17 @@  config IKCONFIG_PROC
 	  This option enables access to the kernel configuration file
 	  through /proc/config.gz.
 
+config IKHEADERS_PROC
+	tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+	select BUILD_BIN2C
+	depends on PROC_FS
+	help
+	  This option enables access to the kernel header and other artifacts that
+          are generated during the build process. These can be used to build kernel
+          modules, and other in-kernel programs such as those generated by eBPF
+          and systemtap tools. If you build the headers as a module, a module
+          called kheaders.ko is built which can be loaded to get access to them.
+
 config LOG_BUF_SHIFT
 	int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
 	range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index b3097bde4e9c..484018945e93 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -3,5 +3,8 @@ 
 #
 config_data.h
 config_data.gz
+kheaders.md5
+kheaders_data.h
+kheaders_data.tar.xz
 timeconst.h
 hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6aa7543bcdb2..0bc7cacd7da6 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@  obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_USER_NS) += user_namespace.o
 obj-$(CONFIG_PID_NS) += pid_namespace.o
 obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -130,3 +131,38 @@  filechk_ikconfiggz = \
 targets += config_data.h
 $(obj)/config_data.h: $(obj)/config_data.gz FORCE
 	$(call filechk,ikconfiggz)
+
+# Build a list of in-kernel headers for building kernel modules
+ikh_file_list := include/
+ikh_file_list += arch/$(SRCARCH)/Makefile
+ikh_file_list += arch/$(SRCARCH)/include/
+ikh_file_list += scripts/
+ikh_file_list += Makefile
+
+# Things we need from the $objtree. "OBJDIR" is for the gen_ikh_data.sh
+# script to identify that this comes from the $objtree directory
+ikh_file_list += OBJDIR/scripts/
+ikh_file_list += OBJDIR/include/
+ikh_file_list += OBJDIR/arch/$(SRCARCH)/include/
+ifeq ($(CONFIG_STACK_VALIDATION), y)
+ikh_file_list += OBJDIR/tools/objtool/objtool
+endif
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.h
+
+targets += kheaders_data.tar.xz
+
+quiet_cmd_genikh = GEN     $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $(ikh_file_list)
+$(obj)/kheaders_data.tar.xz: FORCE
+	$(call cmd,genikh)
+
+filechk_ikheadersxz = \
+	echo "static const char kernel_headers_data[] __used = KH_MAGIC_START"; \
+	cat $< | scripts/bin2c; \
+	echo "KH_MAGIC_END;"
+
+targets += kheaders_data.h
+targets += kheaders.md5
+$(obj)/kheaders_data.h: $(obj)/kheaders_data.tar.xz FORCE
+	$(call filechk,ikheadersxz)
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..46a6358301e5
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,72 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_size, which contains the
+ * compressed kernel headers.  The file is first compressed with xz and then
+ * bounded by two eight byte magic numbers to allow extraction from a binary
+ * kernel image:
+ *
+ *   IKHD_ST
+ *   <image>
+ *   IKHD_ED
+ */
+#define KH_MAGIC_START	"IKHD_ST"
+#define KH_MAGIC_END	"IKHD_ED"
+#include "kheaders_data.h"
+
+
+#define KH_MAGIC_SIZE (sizeof(KH_MAGIC_START) - 1)
+#define kernel_headers_data_size \
+	(sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+		      size_t len, loff_t *offset)
+{
+	return simple_read_from_buffer(buf, len, offset,
+				       kernel_headers_data + KH_MAGIC_SIZE,
+				       kernel_headers_data_size);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+	.read = ikheaders_read_current,
+	.llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+	struct proc_dir_entry *entry;
+
+	/* create the current headers file */
+	entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+			    &ikheaders_file_ops);
+	if (!entry)
+		return -ENOMEM;
+
+	proc_set_size(entry, kernel_headers_data_size);
+
+	return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+	remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
new file mode 100755
index 000000000000..7329262bed2f
--- /dev/null
+++ b/scripts/gen_ikh_data.sh
@@ -0,0 +1,76 @@ 
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+src_file_list=""
+for f in $file_list; do
+	src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
+done
+
+obj_file_list=""
+for f in $file_list; do
+	f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
+	obj_file_list="$obj_file_list $f";
+done
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; 	iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# modules.order and include/generated/compile.h are ignored because these are
+# touched even when none of the source files changed. This causes pointless
+# regeneration, so let us ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h"		   |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h"		   |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+	[ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+		exit
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+find  $cpio_dir -type f -print0 |
+	xargs -0 -P8 -n1 -I {} sh -c "$spath/strip-comments.pl {}"
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
diff --git a/scripts/strip-comments.pl b/scripts/strip-comments.pl
new file mode 100755
index 000000000000..f8ada87c5802
--- /dev/null
+++ b/scripts/strip-comments.pl
@@ -0,0 +1,8 @@ 
+#!/usr/bin/perl -pi
+# SPDX-License-Identifier: GPL-2.0
+
+# This script removes /**/ comments from a file, unless such comments
+# contain "SPDX". It is used when building compressed in-kernel headers.
+
+BEGIN {undef $/;}
+s/\/\*((?!SPDX).)*?\*\///smg;