
Documentation: kbuild: Add document about reproducible builds

Message ID 20190911115353.yngbk6hf6gwctock@decadent.org.uk (mailing list archive)
State New, archived

Commit Message

Ben Hutchings Sept. 11, 2019, 11:53 a.m. UTC
In the Distribution Kernels track at Linux Plumbers Conference there
was some discussion around the difficulty of making kernel builds
reproducible.

This is a solved problem, but the solutions don't appear to be
documented in one place.  This document lists the issues I know about
and the settings needed to ensure reproducibility.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 Documentation/kbuild/index.rst               |   1 +
 Documentation/kbuild/reproducible-builds.rst | 115 +++++++++++++++++++
 2 files changed, 116 insertions(+)
 create mode 100644 Documentation/kbuild/reproducible-builds.rst

Comments

Masahiro Yamada Sept. 11, 2019, 12:17 p.m. UTC | #1
Hi Ben,


Thanks for this.
Please let me add some comments.


On Wed, Sep 11, 2019 at 8:54 PM Ben Hutchings <ben@decadent.org.uk> wrote:
>
> In the Distribution Kernels track at Linux Plumbers Conference there
> was some discussion around the difficulty of making kernel builds
> reproducible.
>
> This is a solved problem, but the solutions don't appear to be
> documented in one place.  This document lists the issues I know about
> and the settings needed to ensure reproducibility.
>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> ---
>  Documentation/kbuild/index.rst               |   1 +
>  Documentation/kbuild/reproducible-builds.rst | 115 +++++++++++++++++++
>  2 files changed, 116 insertions(+)
>  create mode 100644 Documentation/kbuild/reproducible-builds.rst
>
> diff --git a/Documentation/kbuild/index.rst b/Documentation/kbuild/index.rst
> index e323a3f2cc81..0f144fad99a6 100644
> --- a/Documentation/kbuild/index.rst
> +++ b/Documentation/kbuild/index.rst
> @@ -18,6 +18,7 @@ Kernel Build System
>      headers_install
>
>      issues
> +    reproducible-builds
>
>  .. only::  subproject and html
>
> diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
> new file mode 100644
> index 000000000000..4d988faf93b8
> --- /dev/null
> +++ b/Documentation/kbuild/reproducible-builds.rst
> @@ -0,0 +1,115 @@
> +===================
> +Reproducible builds
> +===================
> +
> +It is generally desirable that the building the same source code with
> +the same set of tools is reproducible, i.e. the output is always
> +exactly the same.  This makes it possible to verify that the build
> +infrastructure for a binary distribution or embedded system has not
> +been subverted.  This can also make it easier to verify that a source
> +or tool change does not make any difference to the resulting binaries.
> +
> +The `Reproducible Builds project`_ has more information about this
> +general topic.  This document covers the various reasons why building
> +the kernel may be unreproducible, and how to avoid them.
> +
> +Timestamps
> +----------
> +
> +The kernel embeds a timestamp in two places:
> +
> +* The version string exposed by ``uname()`` and included in
> +  ``/proc/version``
> +
> +* File timestamps in the embedded initramfs
> +
> +By default the timestamp is the current time.  This must be overridden
> +using the `KBUILD_BUILD_TIMESTAMP`_ variable.  If you are building
> +from a git commit, you could use its commit date.
> +
> +The kernel does *not* use the ``__DATE__`` and ``__TIME__`` macros,
> +and enables warnings if they are used.  If you incorporate external
> +code that does use these, you must override the timestamp they
> +correspond to by setting the `SOURCE_DATE_EPOCH`_ environment
> +variable.
> +
> +User, host
> +----------
> +
> +The kernel embeds the building user and host names in
> +``/proc/version``.  These must be overridden using the
> +`KBUILD_BUILD_USER and KBUILD_BUILD_HOST`_ variables.  If you are
> +building from a git commit, you could use its committer address.
> +
> +Absolute filenames
> +------------------
> +
> +When the kernel is built out-of-tree, debug information may include
> +absolute filenames for the source files.  The ``__FILE__`` macro may
> +also expand to an absolute filename.  This must be overridden by
> +including `prefix-map options`_ in the `KCFLAGS`_ variable.

Do you mean -fmacro-prefix-map ?

If so, it is already taken care of by the top Makefile.
If you use GCC 8 or newer, it is automatically added to
KBUILD_CFLAGS.




> +
> +Generated files in source packages
> +----------------------------------
> +
> +The build processes for some programs under the ``tools/``
> +subdirectory do not completely support out-of-tree builds.  This may
> +cause source packages built using e.g. ``make rpm-pkg`` to include
> +generated files and so be unreproducible.  It may be necessary to
> +clean the source tree completely (``make mrproper`` or
> +``git clean -d -f -x``) before building a source package.


Currently, the source package building does not support
out-of-tree build anyway.

'make O=foo rpm-pkg' fails with an error message.

Building in a pristine source will solve the issue.





> +Module signing
> +--------------
> +
> +If you enable ``CONFIG_MODULE_SIG_ALL``, the default behaviour is to
> +generate a different temporary key for each build, resulting in the
> +modules being unreproducible.  However, including a signing key with
> +your source would presumably defeat the purpose of signing modules.
> +
> +One approach to this is to divide up the build process so that the
> +unreproducible parts can be treated as sources:
> +
> +1. Generate a persistent signing key.  Add the certificate for the key
> +   to the kernel source.
> +
> +2. Set the ``CONFIG_SYSTEM_TRUSTED_KEYS`` symbol to include the
> +   signing key's certificate, set ``CONFIG_MODULE_SIG_KEY`` to an
> +   empty string, and disable ``CONFIG_MODULE_SIG_ALL``.
> +   Build the kernel and modules.
> +
> +3. Create detached signatures for the modules, and publish them as
> +   sources.
> +
> +4. Perform a second build that attaches the module signatures.  It
> +   can either rebuild the modules or use the output of step 2.
> +
> +Structure randomisation
> +-----------------------
> +
> +If you enable ``CONFIG_GCC_PLUGIN_RANDSTRUCT``, you will need to
> +pre-generate the random seed in
> +``scripts/gcc-plugins/randomize_layout_seed.h`` so the same value
> +is used in rebuilds.
> +
> +Debug info conflicts
> +--------------------
> +
> +This is not a problem of unreproducibility, but of generated files
> +being *too* reproducible.
> +
> +Once you set all the necessary variables for a reproducible build, a
> +vDSO's debug information may be identical even for different kernel
> +versions.  This can result in file conflicts between debug information
> +packages for the different kernel versions.
> +
> +To avoid this, you can make the vDSO different for different
> +kernel versions by including an arbitrary string of "salt" in it.
> +This is specified by the Kconfig symbol ``CONFIG_BUILD_SALT``.
> +
> +.. _KBUILD_BUILD_TIMESTAMP: kbuild.html#kbuild-build-timestamp
> +.. _KBUILD_BUILD_USER and KBUILD_BUILD_HOST: kbuild.html#kbuild-build-user-kbuild-build-host
> +.. _KCFLAGS: kbuild.html#kcflags
> +.. _prefix-map options: https://reproducible-builds.org/docs/build-path/
> +.. _Reproducible Builds project: https://reproducible-builds.org/
> +.. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/
Ben Hutchings Sept. 11, 2019, 1:04 p.m. UTC | #2
On Wed, 2019-09-11 at 21:17 +0900, Masahiro Yamada wrote:
> Hi Ben,
> 
> 
> Thanks for this.
> Please let me add some comments.
> 
> 
> On Wed, Sep 11, 2019 at 8:54 PM Ben Hutchings <ben@decadent.org.uk> wrote:
[...]
> > +Absolute filenames
> > +------------------
> > +
> > +When the kernel is built out-of-tree, debug information may include
> > +absolute filenames for the source files.  The ``__FILE__`` macro may
> > +also expand to an absolute filename.  This must be overridden by
> > +including `prefix-map options`_ in the `KCFLAGS`_ variable.
> 
> Do you mean -fmacro-prefix-map ?

No, I mean -ffile-prefix-map or the older -fdebug-prefix-map.

> If so, it is already taken care of by the top Makefile.
> If you use GCC 8 or newer, it is automatically added to
> KBUILD_CFLAGS.

Ah, that's helpful.  So, I suppose I should just mention
-fdebug-prefix-map here and warn that __FILE__ will still be a problem
if using older compiler versions.

> > +Generated files in source packages
> > +----------------------------------
> > +
> > +The build processes for some programs under the ``tools/``
> > +subdirectory do not completely support out-of-tree builds.  This may
> > +cause source packages built using e.g. ``make rpm-pkg`` to include
> > +generated files and so be unreproducible.  It may be necessary to
> > +clean the source tree completely (``make mrproper`` or
> > +``git clean -d -f -x``) before building a source package.
> 
> Currently, the source package building does not support
> out-of-tree build anyway.

Yes, I realise that.

> 'make O=foo rpm-pkg' fails with an error message.
> 
> Building in a pristine source will solve the issue.
[...]

The issue I'm thinking about is that an out-of-tree build, prior to the
package build, *should* leave the source pristine and sometimes does
not.

For Debian's official kernel packages, we build a binary package of the
upstream source, and at some times this has unexpectedly included some 
generated files.  I believe a similar issue would affect the upstream
package scripts.

Ben.
Ben Hutchings Sept. 11, 2019, 1:14 p.m. UTC | #3
On Wed, 2019-09-11 at 14:04 +0100, Ben Hutchings wrote:
> On Wed, 2019-09-11 at 21:17 +0900, Masahiro Yamada wrote:
> > Hi Ben,
> > 
> > 
> > Thanks for this.
> > Please let me add some comments.
> > 
> > 
> > On Wed, Sep 11, 2019 at 8:54 PM Ben Hutchings <ben@decadent.org.uk> wrote:
> [...]
> > > +Absolute filenames
> > > +------------------
> > > +
> > > +When the kernel is built out-of-tree, debug information may include
> > > +absolute filenames for the source files.  The ``__FILE__`` macro may
> > > +also expand to an absolute filename.  This must be overridden by
> > > +including `prefix-map options`_ in the `KCFLAGS`_ variable.
> > 
> > Do you mean -fmacro-prefix-map ?
> 
> No, I mean -ffile-prefix-map or the older -fdebug-prefix-map.
> 
> > If so, it is already taken care of by the top Makefile.
> > If you use GCC 8 or newer, it is automatically added to
> > KBUILD_CFLAGS.
> 
> Ah, that's helpful.  So, I suppose I should just mention
> -fdebug-prefix-map here and warn that __FILE__ will still be a problem
> if using older compiler versions.

My revised text for this section is:

---
When the kernel is built out-of-tree, debug information may include
absolute filenames for the source files.  This must be overridden by
including the ``-fdebug-prefix-map`` option in the `KCFLAGS`_ variable.

Depending on the compiler used, the ``__FILE__`` macro may also expand
to an absolute filename in an out-of-tree build.  Kbuild automatically
uses the ``-fmacro-prefix-map`` option to prevent this, if it is
supported.

The Reproducible Builds web site has more information about these
`prefix-map options`_.
---

Does that look OK to you?
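
As an illustration only, the debug-information half of the revised text
could be applied along these lines.  The output directory and the choice
of "." as the replacement prefix are assumptions of this sketch, not
something kbuild requires, and whether the output directory needs a
mapping of its own depends on your setup.  The command is printed rather
than run.

```shell
# Hypothetical paths: adjust both to your own layout.
srctree=${srctree:-$PWD}              # absolute path of the kernel source tree
objtree=${objtree:-/tmp/kbuild-out}   # separate output directory (out-of-tree build)

# Ask the compiler to record "." instead of the absolute source path
# in debug information.
KCFLAGS="-fdebug-prefix-map=${srctree}=."

# Print the resulting command line instead of starting a build.
echo make -C "$srctree" O="$objtree" KCFLAGS="$KCFLAGS"
```

Nothing is needed on the command line for ``__FILE__``: as noted above,
kbuild adds ``-fmacro-prefix-map`` itself when the compiler supports it.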

> > > +Generated files in source packages
> > > +----------------------------------
> > > +
> > > +The build processes for some programs under the ``tools/``
> > > +subdirectory do not completely support out-of-tree builds.  This may
> > > +cause source packages built using e.g. ``make rpm-pkg`` to include
> > > +generated files and so be unreproducible.  It may be necessary to
> > > +clean the source tree completely (``make mrproper`` or
> > > +``git clean -d -f -x``) before building a source package.
> > 
> > Currently, the source package building does not support
> > out-of-tree build anyway.
> 
> Yes, I realise that.
> 
> > 'make O=foo rpm-pkg' fails with an error message.
> > 
> > Building in a pristine source will solve the issue.
> [...]
> 
> The issue I'm thinking about is that an out-of-tree build, prior to the
> package build, *should* leave the source pristine and sometimes does
> not.
> 
> For Debian's official kernel packages, we build a binary package of the
> upstream source, and at some times this has unexpectedly included some 
> generated files.  I believe a similar issue would affect the upstream
> package scripts.

My revised text for this section is:

---
The build processes for some programs under the ``tools/``
subdirectory do not completely support out-of-tree builds.  This may
cause a later source package build using e.g. ``make rpm-pkg`` to
include generated files.  You should ensure the source tree is
pristine by running ``make mrproper`` or ``git clean -d -f -x`` before
building a source package.
---
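
A packaging script could check for this before building the source
package.  A minimal sketch, assuming the source is a git checkout;
``tree_is_pristine`` is a hypothetical helper name and not part of the
kernel tree.

```shell
# Succeed only if the git work tree in directory $1 has no modified,
# untracked or ignored files.  --ignored matters here because build
# products such as *.o are normally covered by .gitignore.
tree_is_pristine() {
    out=$(git -C "$1" status --porcelain --ignored) || return 2
    [ -z "$out" ]
}

if tree_is_pristine . 2>/dev/null; then
    echo "source tree is pristine"
else
    echo "run 'make mrproper' or 'git clean -d -f -x' first"
fi
```

The helper returns 2 when ``git status`` itself fails, for example
outside a git checkout, so that case is not mistaken for a clean tree.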

Ben.
Masahiro Yamada Sept. 12, 2019, 1:47 a.m. UTC | #4
On Wed, Sep 11, 2019 at 10:15 PM Ben Hutchings <ben@decadent.org.uk> wrote:
>
> On Wed, 2019-09-11 at 14:04 +0100, Ben Hutchings wrote:
> > On Wed, 2019-09-11 at 21:17 +0900, Masahiro Yamada wrote:
> > > Hi Ben,
> > >
> > >
> > > Thanks for this.
> > > Please let me add some comments.
> > >
> > >
> > > On Wed, Sep 11, 2019 at 8:54 PM Ben Hutchings <ben@decadent.org.uk> wrote:
> > [...]
> > > > +Absolute filenames
> > > > +------------------
> > > > +
> > > > +When the kernel is built out-of-tree, debug information may include
> > > > +absolute filenames for the source files.  The ``__FILE__`` macro may
> > > > +also expand to an absolute filename.  This must be overridden by
> > > > +including `prefix-map options`_ in the `KCFLAGS`_ variable.
> > >
> > > Do you mean -fmacro-prefix-map ?
> >
> > No, I mean -ffile-prefix-map or the older -fdebug-prefix-map.
> >
> > > If so, it is already taken care of by the top Makefile.
> > > If you use GCC 8 or newer, it is automatically added to
> > > KBUILD_CFLAGS.
> >
> > Ah, that's helpful.  So, I suppose I should just mention
> > > -fdebug-prefix-map here and warn that __FILE__ will still be a problem
> > if using older compiler versions.
>
> My revised text for this section is:
>
> ---
> When the kernel is built out-of-tree, debug information may include
> absolute filenames for the source files.  This must be overridden by
> including the ``-fdebug-prefix-map`` option in the `KCFLAGS`_ variable.
>
> Depending on the compiler used, the ``__FILE__`` macro may also expand
> to an absolute filename in an out-of-tree build.  Kbuild automatically
> uses the ``-fmacro-prefix-map`` option to prevent this, if it is
> supported.
>
> The Reproducible Builds web site has more information about these
> `prefix-map options`_.
> ---
>
> Does that look OK to you?


Both hunks sound good.

Thanks.






> > > > +Generated files in source packages
> > > > +----------------------------------
> > > > +
> > > > +The build processes for some programs under the ``tools/``
> > > > +subdirectory do not completely support out-of-tree builds.  This may
> > > > +cause source packages built using e.g. ``make rpm-pkg`` to include
> > > > +generated files and so be unreproducible.  It may be necessary to
> > > > +clean the source tree completely (``make mrproper`` or
> > > > +``git clean -d -f -x``) before building a source package.
> > >
> > > Currently, the source package building does not support
> > > out-of-tree build anyway.
> >
> > Yes, I realise that.
> >
> > > 'make O=foo rpm-pkg' fails with an error message.
> > >
> > > Building in a pristine source will solve the issue.
> > [...]
> >
> > The issue I'm thinking about is that an out-of-tree build, prior to the
> > package build, *should* leave the source pristine and sometimes does
> > not.
> >
> > For Debian's official kernel packages, we build a binary package of the
> > upstream source, and at some times this has unexpectedly included some
> > generated files.  I believe a similar issue would affect the upstream
> > package scripts.
>
> My revised text for this section is:
>
> ---
> The build processes for some programs under the ``tools/``
> subdirectory do not completely support out-of-tree builds.  This may
> cause a later source package build using e.g. ``make rpm-pkg`` to
> include generated files.  You should ensure the source tree is
> pristine by running ``make mrproper`` or ``git clean -d -f -x`` before
> building a source package.
> ---
>
> Ben.
>
> --
> Ben Hutchings
> The obvious mathematical breakthrough [to break modern encryption]
> would be development of an easy way to factor large prime numbers.
>                                                            - Bill Gates
>
>
Nicolas Schier Sept. 13, 2019, 6:28 a.m. UTC | #5
Hi Ben,

thanks for that document, I really enjoyed reading it!

On Wed, Sep 11, 2019 at 12:53:53PM +0100, Ben Hutchings wrote:
[...]
> diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
> new file mode 100644
> index 000000000000..4d988faf93b8
> --- /dev/null
> +++ b/Documentation/kbuild/reproducible-builds.rst
> @@ -0,0 +1,115 @@
> +===================
> +Reproducible builds
> +===================
> +
> +It is generally desirable that the building the same source code with

In this sentence, I think there is either one word too many (the first
'the') or some word is missing (e.g. 'of').

Kind regards,
Nicolas

> +the same set of tools is reproducible, i.e. the output is always
> +exactly the same.  This makes it possible to verify that the build
> +infrastructure for a binary distribution or embedded system has not
> +been subverted.  This can also make it easier to verify that a source
> +or tool change does not make any difference to the resulting binaries.
Ben Hutchings Sept. 14, 2019, 11:15 a.m. UTC | #6
On Fri, 2019-09-13 at 08:28 +0200, Nicolas Schier wrote:
> Hi Ben,
> 
> thanks for that document, I really enjoyed reading it!
> 
> On Wed, Sep 11, 2019 at 12:53:53PM +0100, Ben Hutchings wrote:
> [...]
> > diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
> > new file mode 100644
> > index 000000000000..4d988faf93b8
> > --- /dev/null
> > +++ b/Documentation/kbuild/reproducible-builds.rst
> > @@ -0,0 +1,115 @@
> > +===================
> > +Reproducible builds
> > +===================
> > +
> > +It is generally desirable that the building the same source code with
> 
> In this sentence, I think there is either one word too many (the first
> 'the') or some word is missing (e.g. 'of').

Yes, thanks.

Ben.

Patch

diff --git a/Documentation/kbuild/index.rst b/Documentation/kbuild/index.rst
index e323a3f2cc81..0f144fad99a6 100644
--- a/Documentation/kbuild/index.rst
+++ b/Documentation/kbuild/index.rst
@@ -18,6 +18,7 @@  Kernel Build System
     headers_install
 
     issues
+    reproducible-builds
 
 .. only::  subproject and html
 
diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
new file mode 100644
index 000000000000..4d988faf93b8
--- /dev/null
+++ b/Documentation/kbuild/reproducible-builds.rst
@@ -0,0 +1,115 @@ 
+===================
+Reproducible builds
+===================
+
+It is generally desirable that building the same source code with
+the same set of tools is reproducible, i.e. the output is always
+exactly the same.  This makes it possible to verify that the build
+infrastructure for a binary distribution or embedded system has not
+been subverted.  This can also make it easier to verify that a source
+or tool change does not make any difference to the resulting binaries.
+
+The `Reproducible Builds project`_ has more information about this
+general topic.  This document covers the various reasons why building
+the kernel may be unreproducible, and how to avoid them.
+
+Timestamps
+----------
+
+The kernel embeds a timestamp in two places:
+
+* The version string exposed by ``uname()`` and included in
+  ``/proc/version``
+
+* File timestamps in the embedded initramfs
+
+By default the timestamp is the current time.  This must be overridden
+using the `KBUILD_BUILD_TIMESTAMP`_ variable.  If you are building
+from a git commit, you could use its commit date.
+
+The kernel does *not* use the ``__DATE__`` and ``__TIME__`` macros,
+and enables warnings if they are used.  If you incorporate external
+code that does use these, you must override the timestamp they
+correspond to by setting the `SOURCE_DATE_EPOCH`_ environment
+variable.
+
+User, host
+----------
+
+The kernel embeds the building user and host names in
+``/proc/version``.  These must be overridden using the
+`KBUILD_BUILD_USER and KBUILD_BUILD_HOST`_ variables.  If you are
+building from a git commit, you could use its committer address.
+
+Absolute filenames
+------------------
+
+When the kernel is built out-of-tree, debug information may include
+absolute filenames for the source files.  The ``__FILE__`` macro may
+also expand to an absolute filename.  This must be overridden by
+including `prefix-map options`_ in the `KCFLAGS`_ variable.
+
+Generated files in source packages
+----------------------------------
+
+The build processes for some programs under the ``tools/``
+subdirectory do not completely support out-of-tree builds.  This may
+cause source packages built using e.g. ``make rpm-pkg`` to include
+generated files and so be unreproducible.  It may be necessary to
+clean the source tree completely (``make mrproper`` or
+``git clean -d -f -x``) before building a source package.
+
+Module signing
+--------------
+
+If you enable ``CONFIG_MODULE_SIG_ALL``, the default behaviour is to
+generate a different temporary key for each build, resulting in the
+modules being unreproducible.  However, including a signing key with
+your source would presumably defeat the purpose of signing modules.
+
+One approach to this is to divide up the build process so that the
+unreproducible parts can be treated as sources:
+
+1. Generate a persistent signing key.  Add the certificate for the key
+   to the kernel source.
+
+2. Set the ``CONFIG_SYSTEM_TRUSTED_KEYS`` symbol to include the
+   signing key's certificate, set ``CONFIG_MODULE_SIG_KEY`` to an
+   empty string, and disable ``CONFIG_MODULE_SIG_ALL``.
+   Build the kernel and modules.
+
+3. Create detached signatures for the modules, and publish them as
+   sources.
+
+4. Perform a second build that attaches the module signatures.  It
+   can either rebuild the modules or use the output of step 2.
+
+Structure randomisation
+-----------------------
+
+If you enable ``CONFIG_GCC_PLUGIN_RANDSTRUCT``, you will need to
+pre-generate the random seed in
+``scripts/gcc-plugins/randomize_layout_seed.h`` so the same value
+is used in rebuilds.
+
+Debug info conflicts
+--------------------
+
+This is not a problem of unreproducibility, but of generated files
+being *too* reproducible.
+
+Once you set all the necessary variables for a reproducible build, a
+vDSO's debug information may be identical even for different kernel
+versions.  This can result in file conflicts between debug information
+packages for the different kernel versions.
+
+To avoid this, you can make the vDSO different for different
+kernel versions by including an arbitrary string of "salt" in it.
+This is specified by the Kconfig symbol ``CONFIG_BUILD_SALT``.
+
+.. _KBUILD_BUILD_TIMESTAMP: kbuild.html#kbuild-build-timestamp
+.. _KBUILD_BUILD_USER and KBUILD_BUILD_HOST: kbuild.html#kbuild-build-user-kbuild-build-host
+.. _KCFLAGS: kbuild.html#kcflags
+.. _prefix-map options: https://reproducible-builds.org/docs/build-path/
+.. _Reproducible Builds project: https://reproducible-builds.org/
+.. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/
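
The "Timestamps" and "User, host" sections of the patch suggest taking
the values from a git commit.  One possible way to do that is sketched
below.  The fallback date and the ``builder@localhost`` address are
placeholders assumed here so the snippet also runs outside a git
checkout; they are not defaults used by kbuild.

```shell
# Committer date and e-mail address of HEAD, if this is a git checkout.
commit_date=$(git log -1 --format=%cD 2>/dev/null) || commit_date=
commit_email=$(git log -1 --format=%ce 2>/dev/null) || commit_email=

# Placeholder values for use outside git (assumptions of this sketch).
: "${commit_date:=Thu, 01 Jan 1970 00:00:00 +0000}"
: "${commit_email:=builder@localhost}"

export KBUILD_BUILD_TIMESTAMP="$commit_date"
export KBUILD_BUILD_USER="${commit_email%%@*}"   # part before the "@"
export KBUILD_BUILD_HOST="${commit_email#*@}"    # part after the "@"

echo "timestamp=$KBUILD_BUILD_TIMESTAMP"
echo "user=$KBUILD_BUILD_USER host=$KBUILD_BUILD_HOST"
```

Exporting the variables before invoking ``make`` is one option; passing
them on the ``make`` command line should work equally well.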