diff mbox series

[bpf-next] kbuild, bpf: Enable reproducible BTF generation

Message ID 20241211-pahole-reproducible-v1-1-22feae19bad9@weissschuh.net (mailing list archive)
State New
Headers show
Series [bpf-next] kbuild, bpf: Enable reproducible BTF generation | expand

Commit Message

Thomas Weißschuh Dec. 10, 2024, 11:23 p.m. UTC
Pahole v1.27 added a new BTF generation feature to support
reproducibility in the face of multithreading.
Enable it if supported and reproducible builds are requested.

As unknown --btf_features are ignored, avoid the test for the pahole
version to keep the line readable.

Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
---
 scripts/Makefile.btf | 1 +
 1 file changed, 1 insertion(+)


---
base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
change-id: 20241124-pahole-reproducible-2b879ac8bdab

Best regards,

Comments

Ihor Solodrai Dec. 11, 2024, 12:17 a.m. UTC | #1
On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:

> 
> 
> Pahole v1.27 added a new BTF generation feature to support
> reproducibility in the face of multithreading.
> Enable it if supported and reproducible builds are requested.
> 
> As unknown --btf_features are ignored, avoid the test for the pahole
> version to keep the line readable.
> 
> Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> 
> ---
> scripts/Makefile.btf | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> --- a/scripts/Makefile.btf
> +++ b/scripts/Makefile.btf
> @@ -22,6 +22,7 @@ else
> 
> # Switch to using --btf_features for v1.26 and later.
> pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build

Hi Thomas,

There are a couple of issues with reproducible_build flag which I
think are worth mentioning here. I don't know all the reasons behind
adding this now, and it's optional too, so feel free to discard my
comments.

Currently with this flag, the BTF output is deterministic for a given
order of DWARF compilation units. So the BTF will be the same for the
same vmlinux binary. However, if the vmlinux is rebuilt due to an
incremental change in a source code, my understanding is that there is
no guarantee that DWARF CUs will be in the same order in the binary.

At the same time, reproducible_build slows down BTF generation by
30-50%, maybe more depending on the kernel config.

Hopefully these problems will be solved in upcoming pahole releases.

Question: why KBUILD_BUILD_TIMESTAMP flag? Isn't it more appropriate
to use a separate flag for this particular feature?

Thanks.

> 
> ifneq ($(KBUILD_EXTMOD),)
> module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> 
> ---
> base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
> change-id: 20241124-pahole-reproducible-2b879ac8bdab
> 
> Best regards,
> --
> Thomas Weißschuh linux@weissschuh.net
> 
>
Thomas Weißschuh Dec. 11, 2024, 6:24 a.m. UTC | #2
Hi Ihor,

On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> 
> > 
> > 
> > Pahole v1.27 added a new BTF generation feature to support
> > reproducibility in the face of multithreading.
> > Enable it if supported and reproducible builds are requested.
> > 
> > As unknown --btf_features are ignored, avoid the test for the pahole
> > version to keep the line readable.
> > 
> > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > 
> > ---
> > scripts/Makefile.btf | 1 +
> > 1 file changed, 1 insertion(+)
> > 
> > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > --- a/scripts/Makefile.btf
> > +++ b/scripts/Makefile.btf
> > @@ -22,6 +22,7 @@ else
> > 
> > # Switch to using --btf_features for v1.26 and later.
> > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> 
> Hi Thomas,
> 
> There are a couple of issues with reproducible_build flag which I
> think are worth mentioning here. I don't know all the reasons behind
> adding this now, and it's optional too, so feel free to discard my
> comments.
> 
> Currently with this flag, the BTF output is deterministic for a given
> order of DWARF compilation units. So the BTF will be the same for the
> same vmlinux binary. However, if the vmlinux is rebuilt due to an
> incremental change in a source code, my understanding is that there is
> no guarantee that DWARF CUs will be in the same order in the binary.

The goal behind reproducible builds is to produce bit-by-bit idential
binaries. If the CUs are in a different order then that requirement
would have been broken there already.

For an incremental build a full relink with *all* CUs is done, not only
the changed once, so the order should always be the same.

> At the same time, reproducible_build slows down BTF generation by
> 30-50%, maybe more depending on the kernel config.

If a user explicitly requests reproducibility then they should get it,
even if it is slower.

> Hopefully these problems will be solved in upcoming pahole releases.

I don't see it as big problem. This is used for release builds, not
during development.
 
> Question: why KBUILD_BUILD_TIMESTAMP flag? Isn't it more appropriate
> to use a separate flag for this particular feature?

Adding an additional variable would need to be documented and would
makes the feature harder to use. KBUILD_BUILD_TIMESTAMP already needs to
be set by the user if they are building for reproducibility.

> > ifneq ($(KBUILD_EXTMOD),)
> > module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> > 
> > ---
> > base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
> > change-id: 20241124-pahole-reproducible-2b879ac8bdab
> > 
> > Best regards,
> > --
> > Thomas Weißschuh linux@weissschuh.net
> > 
> > 
>
Andrii Nakryiko Dec. 12, 2024, 7:23 p.m. UTC | #3
On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
>
> Hi Ihor,
>
> On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> >
> > >
> > >
> > > Pahole v1.27 added a new BTF generation feature to support
> > > reproducibility in the face of multithreading.
> > > Enable it if supported and reproducible builds are requested.
> > >
> > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > version to keep the line readable.
> > >
> > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > >
> > > ---
> > > scripts/Makefile.btf | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > --- a/scripts/Makefile.btf
> > > +++ b/scripts/Makefile.btf
> > > @@ -22,6 +22,7 @@ else
> > >
> > > # Switch to using --btf_features for v1.26 and later.
> > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> >
> > Hi Thomas,
> >
> > There are a couple of issues with reproducible_build flag which I
> > think are worth mentioning here. I don't know all the reasons behind
> > adding this now, and it's optional too, so feel free to discard my
> > comments.
> >
> > Currently with this flag, the BTF output is deterministic for a given
> > order of DWARF compilation units. So the BTF will be the same for the
> > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > incremental change in a source code, my understanding is that there is
> > no guarantee that DWARF CUs will be in the same order in the binary.
>
> The goal behind reproducible builds is to produce bit-by-bit idential
> binaries. If the CUs are in a different order then that requirement
> would have been broken there already.

I'm curious, how do we guarantee that we get bit-by-bit identical
DWARF? Do we enforce the order of linking of .o files into the final
vmlinux? Is this described anywhere?

>
> For an incremental build a full relink with *all* CUs is done, not only
> the changed once, so the order should always be the same.

The concern here is whether linker guarantees that CUs' DWARF data
will be appended in exactly the same order in such case?

>
> > At the same time, reproducible_build slows down BTF generation by
> > 30-50%, maybe more depending on the kernel config.
>
> If a user explicitly requests reproducibility then they should get it,
> even if it is slower.
>
> > Hopefully these problems will be solved in upcoming pahole releases.
>
> I don't see it as big problem. This is used for release builds, not
> during development.
>
> > Question: why KBUILD_BUILD_TIMESTAMP flag? Isn't it more appropriate
> > to use a separate flag for this particular feature?
>
> Adding an additional variable would need to be documented and would
> makes the feature harder to use. KBUILD_BUILD_TIMESTAMP already needs to
> be set by the user if they are building for reproducibility.
>
> > > ifneq ($(KBUILD_EXTMOD),)
> > > module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> > >
> > > ---
> > > base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
> > > change-id: 20241124-pahole-reproducible-2b879ac8bdab
> > >
> > > Best regards,
> > > --
> > > Thomas Weißschuh linux@weissschuh.net
> > >
> > >
> >
Thomas Weißschuh Dec. 12, 2024, 9:07 p.m. UTC | #4
Hi Andrii,

On 2024-12-12 11:23:03-0800, Andrii Nakryiko wrote:
> On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> > >
> > > >
> > > >
> > > > Pahole v1.27 added a new BTF generation feature to support
> > > > reproducibility in the face of multithreading.
> > > > Enable it if supported and reproducible builds are requested.
> > > >
> > > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > > version to keep the line readable.
> > > >
> > > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > > >
> > > > ---
> > > > scripts/Makefile.btf | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > > --- a/scripts/Makefile.btf
> > > > +++ b/scripts/Makefile.btf
> > > > @@ -22,6 +22,7 @@ else
> > > >
> > > > # Switch to using --btf_features for v1.26 and later.
> > > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> > >
> > > Hi Thomas,
> > >
> > > There are a couple of issues with reproducible_build flag which I
> > > think are worth mentioning here. I don't know all the reasons behind
> > > adding this now, and it's optional too, so feel free to discard my
> > > comments.
> > >
> > > Currently with this flag, the BTF output is deterministic for a given
> > > order of DWARF compilation units. So the BTF will be the same for the
> > > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > > incremental change in a source code, my understanding is that there is
> > > no guarantee that DWARF CUs will be in the same order in the binary.
> >
> > The goal behind reproducible builds is to produce bit-by-bit idential
> > binaries. If the CUs are in a different order then that requirement
> > would have been broken there already.
> 
> I'm curious, how do we guarantee that we get bit-by-bit identical
> DWARF? Do we enforce the order of linking of .o files into the final
> vmlinux? Is this described anywhere?

The CU order has to be fixed, otherwise the non-debugging parts of the
binary would not be reproducible either.
For docs is Documentation/kbuild/reproducible-builds.rst, the linked 
reproducible-builds.org project has much more information.

Also besides reproducible builds, lots of kernel components rely
(accidentally or intentionally) on a stable initialization order, which
is also defined by linking order.

From Documentation/kbuild/makefiles.rst:

	Link order is significant, because certain functions
	(module_init() / __initcall) will be called during boot in the
	order they appear. So keep in mind that changing the link
	order may e.g. change the order in which your SCSI
	controllers are detected, and thus your disks are renumbered.

> > For an incremental build a full relink with *all* CUs is done, not only
> > the changed once, so the order should always be the same.
> 
> The concern here is whether linker guarantees that CUs' DWARF data
> will be appended in exactly the same order in such case?

Otherwise it wouldn't be reproducible in general.
The pahole developers specifically implemented
--btf_features=reproducible_build for use in the kernel; after I sent
a precursor patch to this one (also linked in the patch):

https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/

In general the kernel already supports reproducible builds.

For my personal kernel builds only two incompatibilities/rough edges remain:

* (parallel) BTF generation, which is fixed with this patch
* CONFIG_MODULE_SIG with non-precreated keys (which I am working on)

> > > At the same time, reproducible_build slows down BTF generation by
> > > 30-50%, maybe more depending on the kernel config.
> >
> > If a user explicitly requests reproducibility then they should get it,
> > even if it is slower.
> >
> > > Hopefully these problems will be solved in upcoming pahole releases.
> >
> > I don't see it as big problem. This is used for release builds, not
> > during development.
> >
> > > Question: why KBUILD_BUILD_TIMESTAMP flag? Isn't it more appropriate
> > > to use a separate flag for this particular feature?
> >
> > Adding an additional variable would need to be documented and would
> > makes the feature harder to use. KBUILD_BUILD_TIMESTAMP already needs to
> > be set by the user if they are building for reproducibility.
> >
> > > > ifneq ($(KBUILD_EXTMOD),)
> > > > module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> > > >
> > > > ---
> > > > base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
> > > > change-id: 20241124-pahole-reproducible-2b879ac8bdab
> > > >
> > > > Best regards,
> > > > --
> > > > Thomas Weißschuh linux@weissschuh.net
Andrii Nakryiko Dec. 12, 2024, 10:54 p.m. UTC | #5
On Thu, Dec 12, 2024 at 1:07 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
>
> Hi Andrii,
>
> On 2024-12-12 11:23:03-0800, Andrii Nakryiko wrote:
> > On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > > > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > >
> > > > >
> > > > >
> > > > > Pahole v1.27 added a new BTF generation feature to support
> > > > > reproducibility in the face of multithreading.
> > > > > Enable it if supported and reproducible builds are requested.
> > > > >
> > > > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > > > version to keep the line readable.
> > > > >
> > > > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > > > >
> > > > > ---
> > > > > scripts/Makefile.btf | 1 +
> > > > > 1 file changed, 1 insertion(+)
> > > > >
> > > > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > > > --- a/scripts/Makefile.btf
> > > > > +++ b/scripts/Makefile.btf
> > > > > @@ -22,6 +22,7 @@ else
> > > > >
> > > > > # Switch to using --btf_features for v1.26 and later.
> > > > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> > > >
> > > > Hi Thomas,
> > > >
> > > > There are a couple of issues with reproducible_build flag which I
> > > > think are worth mentioning here. I don't know all the reasons behind
> > > > adding this now, and it's optional too, so feel free to discard my
> > > > comments.
> > > >
> > > > Currently with this flag, the BTF output is deterministic for a given
> > > > order of DWARF compilation units. So the BTF will be the same for the
> > > > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > > > incremental change in a source code, my understanding is that there is
> > > > no guarantee that DWARF CUs will be in the same order in the binary.
> > >
> > > The goal behind reproducible builds is to produce bit-by-bit idential
> > > binaries. If the CUs are in a different order then that requirement
> > > would have been broken there already.
> >
> > I'm curious, how do we guarantee that we get bit-by-bit identical
> > DWARF? Do we enforce the order of linking of .o files into the final
> > vmlinux? Is this described anywhere?
>
> The CU order has to be fixed, otherwise the non-debugging parts of the
> binary would not be reproducible either.
> For docs is Documentation/kbuild/reproducible-builds.rst, the linked
> reproducible-builds.org project has much more information.
>
> Also besides reproducible builds, lots of kernel components rely
> (accidentally or intentionally) on a stable initialization order, which
> is also defined by linking order.
>
> From Documentation/kbuild/makefiles.rst:
>
>         Link order is significant, because certain functions
>         (module_init() / __initcall) will be called during boot in the
>         order they appear. So keep in mind that changing the link
>         order may e.g. change the order in which your SCSI
>         controllers are detected, and thus your disks are renumbered.
>
> > > For an incremental build a full relink with *all* CUs is done, not only
> > > the changed once, so the order should always be the same.
> >
> > The concern here is whether linker guarantees that CUs' DWARF data
> > will be appended in exactly the same order in such case?
>
> Otherwise it wouldn't be reproducible in general.
> The pahole developers specifically implemented
> --btf_features=reproducible_build for use in the kernel; after I sent
> a precursor patch to this one (also linked in the patch):
>
> https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
>
> In general the kernel already supports reproducible builds.

Great, thanks for all the info!

I do agree with Ihor that KBUILD_BUILD_TIMESTAMP is a non-obvious and
surprising way to enable this behavior, but if that's what's used for
other aspects of kernel build I guess it's fine by me.

Ihor's work on making BTF generation more deterministic w.r.t. CU
order would automatically benefit --btf_features=reproducible_build in
the end and might make it unnecessary, but there is no need to block
on a completion of that work.

>
> For my personal kernel builds only two incompatibilities/rough edges remain:
>
> * (parallel) BTF generation, which is fixed with this patch
> * CONFIG_MODULE_SIG with non-precreated keys (which I am working on)
>
> > > > At the same time, reproducible_build slows down BTF generation by
> > > > 30-50%, maybe more depending on the kernel config.
> > >
> > > If a user explicitly requests reproducibility then they should get it,
> > > even if it is slower.
> > >
> > > > Hopefully these problems will be solved in upcoming pahole releases.
> > >
> > > I don't see it as big problem. This is used for release builds, not
> > > during development.
> > >
> > > > Question: why KBUILD_BUILD_TIMESTAMP flag? Isn't it more appropriate
> > > > to use a separate flag for this particular feature?
> > >
> > > Adding an additional variable would need to be documented and would
> > > makes the feature harder to use. KBUILD_BUILD_TIMESTAMP already needs to
> > > be set by the user if they are building for reproducibility.
> > >
> > > > > ifneq ($(KBUILD_EXTMOD),)
> > > > > module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> > > > >
> > > > > ---
> > > > > base-commit: 7cb1b466315004af98f6ba6c2546bb713ca3c237
> > > > > change-id: 20241124-pahole-reproducible-2b879ac8bdab
> > > > >
> > > > > Best regards,
> > > > > --
> > > > > Thomas Weißschuh linux@weissschuh.net
Thomas Weißschuh Dec. 12, 2024, 11:18 p.m. UTC | #6
On 2024-12-12 14:54:36-0800, Andrii Nakryiko wrote:
> On Thu, Dec 12, 2024 at 1:07 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> >
> > Hi Andrii,
> >
> > On 2024-12-12 11:23:03-0800, Andrii Nakryiko wrote:
> > > On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > > > > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > > Pahole v1.27 added a new BTF generation feature to support
> > > > > > reproducibility in the face of multithreading.
> > > > > > Enable it if supported and reproducible builds are requested.
> > > > > >
> > > > > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > > > > version to keep the line readable.
> > > > > >
> > > > > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > > > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > > > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > > > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > > > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > > > > >
> > > > > > ---
> > > > > > scripts/Makefile.btf | 1 +
> > > > > > 1 file changed, 1 insertion(+)
> > > > > >
> > > > > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > > > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > > > > --- a/scripts/Makefile.btf
> > > > > > +++ b/scripts/Makefile.btf
> > > > > > @@ -22,6 +22,7 @@ else
> > > > > >
> > > > > > # Switch to using --btf_features for v1.26 and later.
> > > > > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > > > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> > > > >
> > > > > Hi Thomas,
> > > > >
> > > > > There are a couple of issues with reproducible_build flag which I
> > > > > think are worth mentioning here. I don't know all the reasons behind
> > > > > adding this now, and it's optional too, so feel free to discard my
> > > > > comments.
> > > > >
> > > > > Currently with this flag, the BTF output is deterministic for a given
> > > > > order of DWARF compilation units. So the BTF will be the same for the
> > > > > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > > > > incremental change in a source code, my understanding is that there is
> > > > > no guarantee that DWARF CUs will be in the same order in the binary.
> > > >
> > > > The goal behind reproducible builds is to produce bit-by-bit idential
> > > > binaries. If the CUs are in a different order then that requirement
> > > > would have been broken there already.
> > >
> > > I'm curious, how do we guarantee that we get bit-by-bit identical
> > > DWARF? Do we enforce the order of linking of .o files into the final
> > > vmlinux? Is this described anywhere?
> >
> > The CU order has to be fixed, otherwise the non-debugging parts of the
> > binary would not be reproducible either.
> > For docs is Documentation/kbuild/reproducible-builds.rst, the linked
> > reproducible-builds.org project has much more information.
> >
> > Also besides reproducible builds, lots of kernel components rely
> > (accidentally or intentionally) on a stable initialization order, which
> > is also defined by linking order.
> >
> > From Documentation/kbuild/makefiles.rst:
> >
> >         Link order is significant, because certain functions
> >         (module_init() / __initcall) will be called during boot in the
> >         order they appear. So keep in mind that changing the link
> >         order may e.g. change the order in which your SCSI
> >         controllers are detected, and thus your disks are renumbered.
> >
> > > > For an incremental build a full relink with *all* CUs is done, not only
> > > > the changed once, so the order should always be the same.
> > >
> > > The concern here is whether linker guarantees that CUs' DWARF data
> > > will be appended in exactly the same order in such case?
> >
> > Otherwise it wouldn't be reproducible in general.
> > The pahole developers specifically implemented
> > --btf_features=reproducible_build for use in the kernel; after I sent
> > a precursor patch to this one (also linked in the patch):
> >
> > https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> >
> > In general the kernel already supports reproducible builds.
> 
> Great, thanks for all the info!
> 
> I do agree with Ihor that KBUILD_BUILD_TIMESTAMP is a non-obvious and
> surprising way to enable this behavior, but if that's what's used for
> other aspects of kernel build I guess it's fine by me.

Agreed. So far KBUILD_BUILD_TIMESTAMP is really only used for timestamp
related configuration. While it's not a perfect fit, adding yet another
switch that needs to be specified can't be the answer, either.

Maybe the Kbuild maintainers have some preference?

> Ihor's work on making BTF generation more deterministic w.r.t. CU
> order would automatically benefit --btf_features=reproducible_build in
> the end and might make it unnecessary, but there is no need to block
> on a completion of that work.

Sounds good.

[..]
Andrii Nakryiko Dec. 12, 2024, 11:21 p.m. UTC | #7
On Thu, Dec 12, 2024 at 3:19 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
>
> On 2024-12-12 14:54:36-0800, Andrii Nakryiko wrote:
> > On Thu, Dec 12, 2024 at 1:07 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > >
> > > Hi Andrii,
> > >
> > > On 2024-12-12 11:23:03-0800, Andrii Nakryiko wrote:
> > > > On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > > On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > > > > > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Pahole v1.27 added a new BTF generation feature to support
> > > > > > > reproducibility in the face of multithreading.
> > > > > > > Enable it if supported and reproducible builds are requested.
> > > > > > >
> > > > > > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > > > > > version to keep the line readable.
> > > > > > >
> > > > > > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > > > > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > > > > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > > > > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > > > > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > > > > > >
> > > > > > > ---
> > > > > > > scripts/Makefile.btf | 1 +
> > > > > > > 1 file changed, 1 insertion(+)
> > > > > > >
> > > > > > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > > > > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > > > > > --- a/scripts/Makefile.btf
> > > > > > > +++ b/scripts/Makefile.btf
> > > > > > > @@ -22,6 +22,7 @@ else
> > > > > > >
> > > > > > > # Switch to using --btf_features for v1.26 and later.
> > > > > > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > > > > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> > > > > >
> > > > > > Hi Thomas,
> > > > > >
> > > > > > There are a couple of issues with reproducible_build flag which I
> > > > > > think are worth mentioning here. I don't know all the reasons behind
> > > > > > adding this now, and it's optional too, so feel free to discard my
> > > > > > comments.
> > > > > >
> > > > > > Currently with this flag, the BTF output is deterministic for a given
> > > > > > order of DWARF compilation units. So the BTF will be the same for the
> > > > > > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > > > > > incremental change in a source code, my understanding is that there is
> > > > > > no guarantee that DWARF CUs will be in the same order in the binary.
> > > > >
> > > > > The goal behind reproducible builds is to produce bit-by-bit idential
> > > > > binaries. If the CUs are in a different order then that requirement
> > > > > would have been broken there already.
> > > >
> > > > I'm curious, how do we guarantee that we get bit-by-bit identical
> > > > DWARF? Do we enforce the order of linking of .o files into the final
> > > > vmlinux? Is this described anywhere?
> > >
> > > The CU order has to be fixed, otherwise the non-debugging parts of the
> > > binary would not be reproducible either.
> > > For docs is Documentation/kbuild/reproducible-builds.rst, the linked
> > > reproducible-builds.org project has much more information.
> > >
> > > Also besides reproducible builds, lots of kernel components rely
> > > (accidentally or intentionally) on a stable initialization order, which
> > > is also defined by linking order.
> > >
> > > From Documentation/kbuild/makefiles.rst:
> > >
> > >         Link order is significant, because certain functions
> > >         (module_init() / __initcall) will be called during boot in the
> > >         order they appear. So keep in mind that changing the link
> > >         order may e.g. change the order in which your SCSI
> > >         controllers are detected, and thus your disks are renumbered.
> > >
> > > > > For an incremental build a full relink with *all* CUs is done, not only
> > > > > the changed once, so the order should always be the same.
> > > >
> > > > The concern here is whether linker guarantees that CUs' DWARF data
> > > > will be appended in exactly the same order in such case?
> > >
> > > Otherwise it wouldn't be reproducible in general.
> > > The pahole developers specifically implemented
> > > --btf_features=reproducible_build for use in the kernel; after I sent
> > > a precursor patch to this one (also linked in the patch):
> > >
> > > https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > >
> > > In general the kernel already supports reproducible builds.
> >
> > Great, thanks for all the info!
> >
> > I do agree with Ihor that KBUILD_BUILD_TIMESTAMP is a non-obvious and
> > surprising way to enable this behavior, but if that's what's used for
> > other aspects of kernel build I guess it's fine by me.
>
> Agreed. So far KBUILD_BUILD_TIMESTAMP is really only used for timestamp
> related configuration. While it's not a perfect fit, adding yet another
> switch that needs to be specified can't be the answer, either.
>
> Maybe the Kbuild maintainers have some preference?

Wouldn't KBUILD_REPRODUCIBLE_BUILD=y be the logical thing to have for
all these things that affect exact reproducibility?

>
> > Ihor's work on making BTF generation more deterministic w.r.t. CU
> > order would automatically benefit --btf_features=reproducible_build in
> > the end and might make it unnecessary, but there is no need to block
> > on a completion of that work.
>
> Sounds good.
>
> [..]
Thomas Weißschuh Dec. 12, 2024, 11:31 p.m. UTC | #8
On 2024-12-12 15:21:14-0800, Andrii Nakryiko wrote:
> On Thu, Dec 12, 2024 at 3:19 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> >
> > On 2024-12-12 14:54:36-0800, Andrii Nakryiko wrote:
> > > On Thu, Dec 12, 2024 at 1:07 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > >
> > > > Hi Andrii,
> > > >
> > > > On 2024-12-12 11:23:03-0800, Andrii Nakryiko wrote:
> > > > > On Tue, Dec 10, 2024 at 10:24 PM Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > > > On 2024-12-11 00:17:02+0000, Ihor Solodrai wrote:
> > > > > > > On Tuesday, December 10th, 2024 at 3:23 PM, Thomas Weißschuh <linux@weissschuh.net> wrote:
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Pahole v1.27 added a new BTF generation feature to support
> > > > > > > > reproducibility in the face of multithreading.
> > > > > > > > Enable it if supported and reproducible builds are requested.
> > > > > > > >
> > > > > > > > As unknown --btf_features are ignored, avoid the test for the pahole
> > > > > > > > version to keep the line readable.
> > > > > > > >
> > > > > > > > Fixes: b4f72786429c ("scripts/pahole-flags.sh: Parse DWARF and generate BTF with multithreading.")
> > > > > > > > Fixes: 72d091846de9 ("kbuild: avoid too many execution of scripts/pahole-flags.sh")
> > > > > > > > Link: https://lore.kernel.org/lkml/4154d202-5c72-493e-bf3f-bce882a296c6@gentoo.org/
> > > > > > > > Link: https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > > > > > > Signed-off-by: Thomas Weißschuh linux@weissschuh.net
> > > > > > > >
> > > > > > > > ---
> > > > > > > > scripts/Makefile.btf | 1 +
> > > > > > > > 1 file changed, 1 insertion(+)
> > > > > > > >
> > > > > > > > diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> > > > > > > > index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
> > > > > > > > --- a/scripts/Makefile.btf
> > > > > > > > +++ b/scripts/Makefile.btf
> > > > > > > > @@ -22,6 +22,7 @@ else
> > > > > > > >
> > > > > > > > # Switch to using --btf_features for v1.26 and later.
> > > > > > > > pahole-flags-$(call test-ge, $(pahole-ver), 126) = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> > > > > > > > +pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
> > > > > > >
> > > > > > > Hi Thomas,
> > > > > > >
> > > > > > > There are a couple of issues with reproducible_build flag which I
> > > > > > > think are worth mentioning here. I don't know all the reasons behind
> > > > > > > adding this now, and it's optional too, so feel free to discard my
> > > > > > > comments.
> > > > > > >
> > > > > > > Currently with this flag, the BTF output is deterministic for a given
> > > > > > > order of DWARF compilation units. So the BTF will be the same for the
> > > > > > > same vmlinux binary. However, if the vmlinux is rebuilt due to an
> > > > > > > incremental change in a source code, my understanding is that there is
> > > > > > > no guarantee that DWARF CUs will be in the same order in the binary.
> > > > > >
> > > > > > The goal behind reproducible builds is to produce bit-by-bit idential
> > > > > > binaries. If the CUs are in a different order then that requirement
> > > > > > would have been broken there already.
> > > > >
> > > > > I'm curious, how do we guarantee that we get bit-by-bit identical
> > > > > DWARF? Do we enforce the order of linking of .o files into the final
> > > > > vmlinux? Is this described anywhere?
> > > >
> > > > The CU order has to be fixed, otherwise the non-debugging parts of the
> > > > binary would not be reproducible either.
> > > > For docs is Documentation/kbuild/reproducible-builds.rst, the linked
> > > > reproducible-builds.org project has much more information.
> > > >
> > > > Also besides reproducible builds, lots of kernel components rely
> > > > (accidentally or intentionally) on a stable initialization order, which
> > > > is also defined by linking order.
> > > >
> > > > From Documentation/kbuild/makefiles.rst:
> > > >
> > > >         Link order is significant, because certain functions
> > > >         (module_init() / __initcall) will be called during boot in the
> > > >         order they appear. So keep in mind that changing the link
> > > >         order may e.g. change the order in which your SCSI
> > > >         controllers are detected, and thus your disks are renumbered.
> > > >
> > > > > > For an incremental build a full relink with *all* CUs is done, not only
> > > > > > the changed once, so the order should always be the same.
> > > > >
> > > > > The concern here is whether linker guarantees that CUs' DWARF data
> > > > > will be appended in exactly the same order in such case?
> > > >
> > > > Otherwise it wouldn't be reproducible in general.
> > > > The pahole developers specifically implemented
> > > > --btf_features=reproducible_build for use in the kernel; after I sent
> > > > a precursor patch to this one (also linked in the patch):
> > > >
> > > > https://lore.kernel.org/lkml/20240322-pahole-reprodicible-v1-1-3eaafb1842da@weissschuh.net/
> > > >
> > > > In general the kernel already supports reproducible builds.
> > >
> > > Great, thanks for all the info!
> > >
> > > I do agree with Ihor that KBUILD_BUILD_TIMESTAMP is a non-obvious and
> > > surprising way to enable this behavior, but if that's what's used for
> > > other aspects of kernel build I guess it's fine by me.
> >
> > Agreed. So far KBUILD_BUILD_TIMESTAMP is really only used for timestamp
> > related configuration. While it's not a perfect fit, adding yet another
> > switch that needs to be specified can't be the answer, either.
> >
> > Maybe the Kbuild maintainers have some preference?
> 
> Wouldn't KBUILD_REPRODUCIBLE_BUILD=y be the logical thing to have for
> all these things that affect exact reproducibility?

The other settings aren't booleans:

KBUILD_BUILD_TIMESTAMP, KBUILD_BUILD_VERSION, KBUILD_BUILD_HOST,
KBUILD_BUILD_USER.

Out of these not all *need* to be used. They can already be implicitly
fixed by the general build environment.
That doesn't work for KBUILD_BUILD_TIMESTAMP, so I used it for the basic
boolean detection.

> > > Ihor's work on making BTF generation more deterministic w.r.t. CU
> > > order would automatically benefit --btf_features=reproducible_build in
> > > the end and might make it unnecessary, but there is no need to block
> > > on a completion of that work.
> >
> > Sounds good.
> >
> > [..]
diff mbox series

Patch

diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
index c3cbeb13de503555adcf00029a0b328e74381f13..da23265bc8b3cf43c0a1c89fbc4f53815a290e13 100644
--- a/scripts/Makefile.btf
+++ b/scripts/Makefile.btf
@@ -22,6 +22,7 @@  else
 
 # Switch to using --btf_features for v1.26 and later.
 pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j$(JOBS) --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
+pahole-flags-$(if $(KBUILD_BUILD_TIMESTAMP),y) += --btf_features=reproducible_build
 
 ifneq ($(KBUILD_EXTMOD),)
 module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base