diff mbox series

[bpf-next,2/2] selftests/bpf: do not update vmlinux.h unnecessarily

Message ID 20240828174608.377204-2-ihor.solodrai@pm.me (mailing list archive)
State Accepted
Commit 2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b
Delegated to: BPF
Headers show
Series [bpf-next,1/2] selftests/bpf: specify libbpf headers required for %.bpf.o progs | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 10 maintainers not CCed: sdf@fomichev.me haoluo@google.com jolsa@kernel.org shuah@kernel.org linux-kselftest@vger.kernel.org song@kernel.org yonghong.song@linux.dev kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 17 this patch: 17
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 22 this patch: 22
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 15 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc

Commit Message

Ihor Solodrai Aug. 28, 2024, 5:46 p.m. UTC
%.bpf.o objects depend on vmlinux.h, which makes them transitively
dependent on unnecessary libbpf headers. However vmlinux.h doesn't
actually change as often.

When generating vmlinux.h, compare it to a previous version and update
it only if there are changes.

Example of build time improvement (after first clean build):
  $ touch ../../../lib/bpf/bpf.h
  $ time make -j8
Before: real  1m37.592s
After:  real  0m27.310s

Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.

Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com

Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
---
 tools/testing/selftests/bpf/Makefile | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Eduard Zingerman Aug. 28, 2024, 10:02 p.m. UTC | #1
On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
> %.bpf.o objects depend on vmlinux.h, which makes them transitively
> dependent on unnecessary libbpf headers. However vmlinux.h doesn't
> actually change as often.
> 
> When generating vmlinux.h, compare it to a previous version and update
> it only if there are changes.
> 
> Example of build time improvement (after first clean build):
>   $ touch ../../../lib/bpf/bpf.h
>   $ time make -j8
> Before: real  1m37.592s
> After:  real  0m27.310s
> 
> Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
> 
> Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
> 
> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
> ---

Unfortunately, I think that this is a half-measure.
E.g. the following command forces tests rebuild for me:

  touch ../../../../kernel/bpf/verifier.c; \
  make -j22 -C ../../../../; \
  time make test_progs

To workaround this we need to enable reproducible_build option:

    diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
    index b75f09f3f424..8cd648f3e32b 100644
    --- a/scripts/Makefile.btf
    +++ b/scripts/Makefile.btf
    @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
     else

     # Switch to using --btf_features for v1.26 and later.
    -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
    +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build

     ifneq ($(KBUILD_EXTMOD),)
     module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base

Question to the mailing list: do we want this?

[...]
Alexei Starovoitov Aug. 28, 2024, 11:25 p.m. UTC | #2
On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
> > %.bpf.o objects depend on vmlinux.h, which makes them transitively
> > dependent on unnecessary libbpf headers. However vmlinux.h doesn't
> > actually change as often.
> >
> > When generating vmlinux.h, compare it to a previous version and update
> > it only if there are changes.
> >
> > Example of build time improvement (after first clean build):
> >   $ touch ../../../lib/bpf/bpf.h
> >   $ time make -j8
> > Before: real  1m37.592s
> > After:  real  0m27.310s
> >
> > Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
> >
> > Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
> >
> > Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
> > ---
>
> Unfortunately, I think that this is a half-measure.
> E.g. the following command forces tests rebuild for me:
>
>   touch ../../../../kernel/bpf/verifier.c; \
>   make -j22 -C ../../../../; \
>   time make test_progs
>
> To workaround this we need to enable reproducible_build option:
>
>     diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
>     index b75f09f3f424..8cd648f3e32b 100644
>     --- a/scripts/Makefile.btf
>     +++ b/scripts/Makefile.btf
>     @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
>      else
>
>      # Switch to using --btf_features for v1.26 and later.
>     -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
>     +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
>
>      ifneq ($(KBUILD_EXTMOD),)
>      module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
>
> Question to the mailing list: do we want this?

I don't think so. Too drastic just to save rebuild time.

imo the patch is good enough as-is.
Andrii Nakryiko Aug. 30, 2024, 8:34 p.m. UTC | #3
On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
> > %.bpf.o objects depend on vmlinux.h, which makes them transitively
> > dependent on unnecessary libbpf headers. However vmlinux.h doesn't
> > actually change as often.
> >
> > When generating vmlinux.h, compare it to a previous version and update
> > it only if there are changes.
> >
> > Example of build time improvement (after first clean build):
> >   $ touch ../../../lib/bpf/bpf.h
> >   $ time make -j8
> > Before: real  1m37.592s
> > After:  real  0m27.310s
> >
> > Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
> >
> > Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
> >
> > Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
> > ---
>
> Unfortunately, I think that this is a half-measure.
> E.g. the following command forces tests rebuild for me:
>
>   touch ../../../../kernel/bpf/verifier.c; \
>   make -j22 -C ../../../../; \
>   time make test_progs
>
> To workaround this we need to enable reproducible_build option:
>
>     diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
>     index b75f09f3f424..8cd648f3e32b 100644
>     --- a/scripts/Makefile.btf
>     +++ b/scripts/Makefile.btf
>     @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
>      else
>
>      # Switch to using --btf_features for v1.26 and later.
>     -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
>     +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
>
>      ifneq ($(KBUILD_EXTMOD),)
>      module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
>
> Question to the mailing list: do we want this?

Alan, can you please give us a summary of what are the consequences of
the reproducible_build pahole option? In terms of performance and
otherwise.

I've applied patches as is, despite them not solving the issue
completely, as they are moving us in the right direction anyways. I do
get slightly different BTF every single time I rebuild my kernel, so
the change in patch #2 doesn't yet help me.

For libbpf headers, Ihor, can you please follow up with adding
bpf_helper_defs.h as a dependency?

I have some ideas on how to make BTF regeneration in vmlinux.h itself
unnecessary, that might help with this issue. Separately (depending on
what are the negatives of the reproducible_build option) we can look
into making pahole have more consistent internal BTF type ordering
without negatively affecting the overall BTF dedup performance in
pahole. Hopefully I can work with Ihor on this as follow ups.

P.S. I also spent more time than I'm willing to admit trying to
improve bpftool's BTF sorting to minimize the chance of vmlinux.h
contents being different, and I think I removed a bunch of cases where
we had unnecessary differences, but still, it's fundamentally
non-deterministic to do everything based on type and field names,
unfortunately.

Anyways, Mykyta (cc'ed), what do you think about the changes below?
Note that I'm also fixing the incorrect handling of enum64 (would be
nice to prepare a proper patch and send it upstream, if you get a
chance).

diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 6789c7a4d5ca..e8a244b09d56 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -50,6 +50,7 @@ struct sort_datum {
        int type_rank;
        const char *sort_name;
        const char *own_name;
+       __u64 disambig_hash;
 };

 static const char *btf_int_enc_str(__u8 encoding)
@@ -552,35 +553,92 @@ static int btf_type_rank(const struct btf *btf,
__u32 index, bool has_name)
        }
 }

-static const char *btf_type_sort_name(const struct btf *btf, __u32
index, bool from_ref)
+static const char *btf_type_sort_name(const struct btf *btf, __u32
index, bool from_ref, const char *typedef_name)
 {
        const struct btf_type *t = btf__type_by_id(btf, index);
+       int name_off;

        switch (btf_kind(t)) {
        case BTF_KIND_ENUM:
-       case BTF_KIND_ENUM64: {
-               int name_off = t->name_off;
-
                /* Use name of the first element for anonymous enums
if allowed */
                if (!from_ref && !t->name_off && btf_vlen(t))
                        name_off = btf_enum(t)->name_off;
+               else
+                       name_off = t->name_off;
+
+               return btf__name_by_offset(btf, name_off);
+       case BTF_KIND_ENUM64:
+               /* Use name of the first element for anonymous enums
if allowed */
+               if (!from_ref && !t->name_off && btf_vlen(t))
+                       name_off = btf_enum64(t)->name_off;
+               else
+                       name_off = t->name_off;

                return btf__name_by_offset(btf, name_off);
-       }
        case BTF_KIND_ARRAY:
-               return btf_type_sort_name(btf, btf_array(t)->type, true);
+               return btf_type_sort_name(btf, btf_array(t)->type,
true, typedef_name);
+       case BTF_KIND_STRUCT:
+       case BTF_KIND_UNION:
+               if (t->name_off == 0)
+                       return typedef_name;
+               return btf__name_by_offset(btf, t->name_off);
+       case BTF_KIND_TYPEDEF:
+               return btf_type_sort_name(btf, t->type, true,
+                                         btf__name_by_offset(btf,
t->name_off));
        case BTF_KIND_TYPE_TAG:
        case BTF_KIND_CONST:
        case BTF_KIND_PTR:
        case BTF_KIND_VOLATILE:
        case BTF_KIND_RESTRICT:
-       case BTF_KIND_TYPEDEF:
        case BTF_KIND_DECL_TAG:
-               return btf_type_sort_name(btf, t->type, true);
+               return btf_type_sort_name(btf, t->type, true, typedef_name);
        default:
                return btf__name_by_offset(btf, t->name_off);
        }
-       return NULL;
+}
+
+static __u64 hasher(__u64 hash, __u64 val)
+{
+       return hash * 31 + val;
+}
+
+static __u64 btf_type_disambig_hash(const struct btf *btf, __u32 index)
+{
+       const struct btf_type *t = btf__type_by_id(btf, index);
+       int i;
+       size_t hash = 0;
+
+       switch (btf_kind(t)) {
+       case BTF_KIND_ENUM:
+               hash = hasher(hash, t->size);
+               for (i = 0; i < btf_vlen(t); i++)
+                       hash = hasher(hash, btf_enum(t)[i].name_off);
+               break;
+       case BTF_KIND_ENUM64:
+               hash = hasher(hash, t->size);
+               for (i = 0; i < btf_vlen(t); i++)
+                       hash = hasher(hash, btf_enum64(t)[i].name_off);
+               break;
+       case BTF_KIND_STRUCT:
+       case BTF_KIND_UNION: {
+               const struct btf_member *m;
+               const char *ftname;
+
+               hash = hasher(hash, t->size);
+               for (i = 0; i < btf_vlen(t); i++) {
+                       m = btf_members(t) + i;
+                       hash = hasher(hash, m->name_off);
+
+                       /* resolve field type's name and hash it as well */
+                       ftname = btf_type_sort_name(btf, m->type, false, "");
+                       hash = hasher(hash, str_hash(ftname));
+               }
+               break;
+       }
+       default:
+               break;
+       }
+       return hash;
 }

 static int btf_type_compare(const void *left, const void *right)
@@ -596,7 +654,14 @@ static int btf_type_compare(const void *left,
const void *right)
        if (r)
                return r;

-       return strcmp(d1->own_name, d2->own_name);
+       r = strcmp(d1->own_name, d2->own_name);
+       if (r)
+               return r;
+
+       if (d1->disambig_hash != d2->disambig_hash)
+               return d1->disambig_hash < d2->disambig_hash ? -1 : 1;
+
+       return d1->index < d2->index ? -1 : 1;
 }

 static struct sort_datum *sort_btf_c(const struct btf *btf)
@@ -615,8 +680,9 @@ static struct sort_datum *sort_btf_c(const struct btf *btf)

                d->index = i;
                d->type_rank = btf_type_rank(btf, i, false);
-               d->sort_name = btf_type_sort_name(btf, i, false);
+               d->sort_name = btf_type_sort_name(btf, i, false, "");
                d->own_name = btf__name_by_offset(btf, t->name_off);
+               d->disambig_hash = btf_type_disambig_hash(btf, i);
        }

        qsort(datums, n, sizeof(struct sort_datum), btf_type_compare);


>
> [...]
>
Alan Maguire Aug. 30, 2024, 9:03 p.m. UTC | #4
On 30/08/2024 21:34, Andrii Nakryiko wrote:
> On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>>
>> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
>>> %.bpf.o objects depend on vmlinux.h, which makes them transitively
>>> dependent on unnecessary libbpf headers. However vmlinux.h doesn't
>>> actually change as often.
>>>
>>> When generating vmlinux.h, compare it to a previous version and update
>>> it only if there are changes.
>>>
>>> Example of build time improvement (after first clean build):
>>>   $ touch ../../../lib/bpf/bpf.h
>>>   $ time make -j8
>>> Before: real  1m37.592s
>>> After:  real  0m27.310s
>>>
>>> Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
>>>
>>> Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
>>>
>>> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
>>> ---
>>
>> Unfortunately, I think that this is a half-measure.
>> E.g. the following command forces tests rebuild for me:
>>
>>   touch ../../../../kernel/bpf/verifier.c; \
>>   make -j22 -C ../../../../; \
>>   time make test_progs
>>
>> To workaround this we need to enable reproducible_build option:
>>
>>     diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
>>     index b75f09f3f424..8cd648f3e32b 100644
>>     --- a/scripts/Makefile.btf
>>     +++ b/scripts/Makefile.btf
>>     @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
>>      else
>>
>>      # Switch to using --btf_features for v1.26 and later.
>>     -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
>>     +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
>>
>>      ifneq ($(KBUILD_EXTMOD),)
>>      module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
>>
>> Question to the mailing list: do we want this?
> 
> Alan, can you please give us a summary of what are the consequences of
> the reproducible_build pahole option? In terms of performance and
> otherwise.
>

Sure. The original context was that the folks trying to do reproducible
builds were being impacted by the fact that BTF generation was
non-deterministic when done in parallel; i.e. same kernel would give
different BTF ids when rebuilding vmlinux BTF; the reason was largely as
I understand it that when pahole partitioned CUs between multiple
threads, that partitioning could vary. If it varied, when BTF was merged
across threads we could end up with differing id assignments. Since BTF
then was baked into the vmlinux binary, unstable BTF ids meant
non-identical vmlinux.

The first approach to solve this was to remove parallel BTF generation
to support reproducibility. Arnaldo however added support that retained
parallelism while supporting determinism through using the DWARF CU
order. He did some great analysis on the overheads for vmlinux
generation too [1]; summary is that the overhead in runtime is approx
33% versus parallel non-reproducible encoding. Those numbers might not
100% translate to the vmlinux build during kernel since it was a
detached pahole generation and the options might differ slightly, but
they give a sense of things. I don't _think_ there should be additional
memory overheads during pahole generation (we really can't afford any
more memory usage), since it's really more about making order of CU
processing consistent.

Would be good to get Arnaldo's perspective too if we're considering
switching this on by default, as he knows this stuff much better than I do.

[1]
https://lore.kernel.org/dwarves/20240412211604.789632-12-acme@kernel.org/
Mykyta Yatsenko Aug. 30, 2024, 9:23 p.m. UTC | #5
On 30/08/2024 21:34, Andrii Nakryiko wrote:
> On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
>>> %.bpf.o objects depend on vmlinux.h, which makes them transitively
>>> dependent on unnecessary libbpf headers. However vmlinux.h doesn't
>>> actually change as often.
>>>
>>> When generating vmlinux.h, compare it to a previous version and update
>>> it only if there are changes.
>>>
>>> Example of build time improvement (after first clean build):
>>>    $ touch ../../../lib/bpf/bpf.h
>>>    $ time make -j8
>>> Before: real  1m37.592s
>>> After:  real  0m27.310s
>>>
>>> Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
>>>
>>> Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
>>>
>>> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
>>> ---
>> Unfortunately, I think that this is a half-measure.
>> E.g. the following command forces tests rebuild for me:
>>
>>    touch ../../../../kernel/bpf/verifier.c; \
>>    make -j22 -C ../../../../; \
>>    time make test_progs
>>
>> To workaround this we need to enable reproducible_build option:
>>
>>      diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
>>      index b75f09f3f424..8cd648f3e32b 100644
>>      --- a/scripts/Makefile.btf
>>      +++ b/scripts/Makefile.btf
>>      @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
>>       else
>>
>>       # Switch to using --btf_features for v1.26 and later.
>>      -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
>>      +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
>>
>>       ifneq ($(KBUILD_EXTMOD),)
>>       module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
>>
>> Question to the mailing list: do we want this?
> Alan, can you please give us a summary of what are the consequences of
> the reproducible_build pahole option? In terms of performance and
> otherwise.
>
> I've applied patches as is, despite them not solving the issue
> completely, as they are moving us in the right direction anyways. I do
> get slightly different BTF every single time I rebuild my kernel, so
> the change in patch #2 doesn't yet help me.
>
> For libbpf headers, Ihor, can you please follow up with adding
> bpf_helper_defs.h as a dependency?
>
> I have some ideas on how to make BTF regeneration in vmlinux.h itself
> unnecessary, that might help with this issue. Separately (depending on
> what are the negatives of the reproducible_build option) we can look
> into making pahole have more consistent internal BTF type ordering
> without negatively affecting the overall BTF dedup performance in
> pahole. Hopefully I can work with Ihor on this as follow ups.
>
> P.S. I also spent more time than I'm willing to admit trying to
> improve bpftool's BTF sorting to minimize the chance of vmlinux.h
> contents being different, and I think I removed a bunch of cases where
> we had unnecessary differences, but still, it's fundamentally
> non-deterministic to do everything based on type and field names,
> unfortunately.
>
> Anyways, Mykyta (cc'ed), what do you think about the changes below?
> Note that I'm also fixing the incorrect handling of enum64 (would be
> nice to prepare a proper patch and send it upstream, if you get a
> chance).
>
> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> index 6789c7a4d5ca..e8a244b09d56 100644
> --- a/tools/bpf/bpftool/btf.c
> +++ b/tools/bpf/bpftool/btf.c
> @@ -50,6 +50,7 @@ struct sort_datum {
>          int type_rank;
>          const char *sort_name;
>          const char *own_name;
> +       __u64 disambig_hash;
>   };
>
>   static const char *btf_int_enc_str(__u8 encoding)
> @@ -552,35 +553,92 @@ static int btf_type_rank(const struct btf *btf,
> __u32 index, bool has_name)
>          }
>   }
>
> -static const char *btf_type_sort_name(const struct btf *btf, __u32
> index, bool from_ref)
> +static const char *btf_type_sort_name(const struct btf *btf, __u32
> index, bool from_ref, const char *typedef_name)
>   {
>          const struct btf_type *t = btf__type_by_id(btf, index);
> +       int name_off;
>
>          switch (btf_kind(t)) {
>          case BTF_KIND_ENUM:
> -       case BTF_KIND_ENUM64: {
> -               int name_off = t->name_off;
> -
>                  /* Use name of the first element for anonymous enums
> if allowed */
>                  if (!from_ref && !t->name_off && btf_vlen(t))
>                          name_off = btf_enum(t)->name_off;
> +               else
> +                       name_off = t->name_off;
> +
> +               return btf__name_by_offset(btf, name_off);
> +       case BTF_KIND_ENUM64:
> +               /* Use name of the first element for anonymous enums
> if allowed */
> +               if (!from_ref && !t->name_off && btf_vlen(t))
> +                       name_off = btf_enum64(t)->name_off;
> +               else
> +                       name_off = t->name_off;
>
>                  return btf__name_by_offset(btf, name_off);
> -       }
>          case BTF_KIND_ARRAY:
> -               return btf_type_sort_name(btf, btf_array(t)->type, true);
> +               return btf_type_sort_name(btf, btf_array(t)->type,
> true, typedef_name);
> +       case BTF_KIND_STRUCT:
> +       case BTF_KIND_UNION:
> +               if (t->name_off == 0)
> +                       return typedef_name;
> +               return btf__name_by_offset(btf, t->name_off);
> +       case BTF_KIND_TYPEDEF:
> +               return btf_type_sort_name(btf, t->type, true,
> +                                         btf__name_by_offset(btf,
> t->name_off));
>          case BTF_KIND_TYPE_TAG:
>          case BTF_KIND_CONST:
>          case BTF_KIND_PTR:
>          case BTF_KIND_VOLATILE:
>          case BTF_KIND_RESTRICT:
> -       case BTF_KIND_TYPEDEF:
>          case BTF_KIND_DECL_TAG:
> -               return btf_type_sort_name(btf, t->type, true);
> +               return btf_type_sort_name(btf, t->type, true, typedef_name);
>          default:
>                  return btf__name_by_offset(btf, t->name_off);
>          }
> -       return NULL;
> +}
> +
> +static __u64 hasher(__u64 hash, __u64 val)
> +{
> +       return hash * 31 + val;
> +}
> +
> +static __u64 btf_type_disambig_hash(const struct btf *btf, __u32 index)
> +{
> +       const struct btf_type *t = btf__type_by_id(btf, index);
> +       int i;
> +       size_t hash = 0;
> +
> +       switch (btf_kind(t)) {
> +       case BTF_KIND_ENUM:
> +               hash = hasher(hash, t->size);
> +               for (i = 0; i < btf_vlen(t); i++)
> +                       hash = hasher(hash, btf_enum(t)[i].name_off);
> +               break;
> +       case BTF_KIND_ENUM64:
> +               hash = hasher(hash, t->size);
> +               for (i = 0; i < btf_vlen(t); i++)
> +                       hash = hasher(hash, btf_enum64(t)[i].name_off);
> +               break;
> +       case BTF_KIND_STRUCT:
> +       case BTF_KIND_UNION: {
> +               const struct btf_member *m;
> +               const char *ftname;
> +
> +               hash = hasher(hash, t->size);
> +               for (i = 0; i < btf_vlen(t); i++) {
> +                       m = btf_members(t) + i;
> +                       hash = hasher(hash, m->name_off);
> +
> +                       /* resolve field type's name and hash it as well */
> +                       ftname = btf_type_sort_name(btf, m->type, false, "");
> +                       hash = hasher(hash, str_hash(ftname));
> +               }
> +               break;
> +       }
> +       default:
> +               break;
> +       }
> +       return hash;
>   }
>
>   static int btf_type_compare(const void *left, const void *right)
> @@ -596,7 +654,14 @@ static int btf_type_compare(const void *left,
> const void *right)
>          if (r)
>                  return r;
>
> -       return strcmp(d1->own_name, d2->own_name);
> +       r = strcmp(d1->own_name, d2->own_name);
> +       if (r)
> +               return r;
> +
> +       if (d1->disambig_hash != d2->disambig_hash)
> +               return d1->disambig_hash < d2->disambig_hash ? -1 : 1;
> +
> +       return d1->index < d2->index ? -1 : 1;
>   }
>
>   static struct sort_datum *sort_btf_c(const struct btf *btf)
> @@ -615,8 +680,9 @@ static struct sort_datum *sort_btf_c(const struct btf *btf)
>
>                  d->index = i;
>                  d->type_rank = btf_type_rank(btf, i, false);
> -               d->sort_name = btf_type_sort_name(btf, i, false);
> +               d->sort_name = btf_type_sort_name(btf, i, false, "");
>                  d->own_name = btf__name_by_offset(btf, t->name_off);
> +               d->disambig_hash = btf_type_disambig_hash(btf, i);
>          }
>
>          qsort(datums, n, sizeof(struct sort_datum), btf_type_compare);
>
Thanks for pointing to the bug of enum64 handling. I'll create a patch.
Reading the rest of the code, hashing struct/union/enum fields is 
introduced:
this is only useful for disambiguating ordering of the anonymous 
structs/unions/enums.

I suspect the biggest source of the issues are structs and unions, though.
Are definitions like this create problems?
typedef struct {...} foo_t;
?
I'll check what other differences this change makes.
>> [...]
>>
Andrii Nakryiko Aug. 30, 2024, 10:18 p.m. UTC | #6
On Fri, Aug 30, 2024 at 2:23 PM Mykyta Yatsenko
<mykyta.yatsenko5@gmail.com> wrote:
>
> On 30/08/2024 21:34, Andrii Nakryiko wrote:
> > On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
> >>> %.bpf.o objects depend on vmlinux.h, which makes them transitively
> >>> dependent on unnecessary libbpf headers. However vmlinux.h doesn't
> >>> actually change as often.
> >>>
> >>> When generating vmlinux.h, compare it to a previous version and update
> >>> it only if there are changes.
> >>>
> >>> Example of build time improvement (after first clean build):
> >>>    $ touch ../../../lib/bpf/bpf.h
> >>>    $ time make -j8
> >>> Before: real  1m37.592s
> >>> After:  real  0m27.310s
> >>>
> >>> Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
> >>>
> >>> Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
> >>>
> >>> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
> >>> ---
> >> Unfortunately, I think that this is a half-measure.
> >> E.g. the following command forces tests rebuild for me:
> >>
> >>    touch ../../../../kernel/bpf/verifier.c; \
> >>    make -j22 -C ../../../../; \
> >>    time make test_progs
> >>
> >> To workaround this we need to enable reproducible_build option:
> >>
> >>      diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> >>      index b75f09f3f424..8cd648f3e32b 100644
> >>      --- a/scripts/Makefile.btf
> >>      +++ b/scripts/Makefile.btf
> >>      @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
> >>       else
> >>
> >>       # Switch to using --btf_features for v1.26 and later.
> >>      -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> >>      +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
> >>
> >>       ifneq ($(KBUILD_EXTMOD),)
> >>       module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> >>
> >> Question to the mailing list: do we want this?
> > Alan, can you please give us a summary of what are the consequences of
> > the reproducible_build pahole option? In terms of performance and
> > otherwise.
> >
> > I've applied patches as is, despite them not solving the issue
> > completely, as they are moving us in the right direction anyways. I do
> > get slightly different BTF every single time I rebuild my kernel, so
> > the change in patch #2 doesn't yet help me.
> >
> > For libbpf headers, Ihor, can you please follow up with adding
> > bpf_helper_defs.h as a dependency?
> >
> > I have some ideas on how to make BTF regeneration in vmlinux.h itself
> > unnecessary, that might help with this issue. Separately (depending on
> > what are the negatives of the reproducible_build option) we can look
> > into making pahole have more consistent internal BTF type ordering
> > without negatively affecting the overall BTF dedup performance in
> > pahole. Hopefully I can work with Ihor on this as follow ups.
> >
> > P.S. I also spent more time than I'm willing to admit trying to
> > improve bpftool's BTF sorting to minimize the chance of vmlinux.h
> > contents being different, and I think I removed a bunch of cases where
> > we had unnecessary differences, but still, it's fundamentally
> > non-deterministic to do everything based on type and field names,
> > unfortunately.
> >
> > Anyways, Mykyta (cc'ed), what do you think about the changes below?
> > Note that I'm also fixing the incorrect handling of enum64 (would be
> > nice to prepare a proper patch and send it upstream, if you get a
> > chance).
> >

[...]

> >   static struct sort_datum *sort_btf_c(const struct btf *btf)
> > @@ -615,8 +680,9 @@ static struct sort_datum *sort_btf_c(const struct btf *btf)
> >
> >                  d->index = i;
> >                  d->type_rank = btf_type_rank(btf, i, false);
> > -               d->sort_name = btf_type_sort_name(btf, i, false);
> > +               d->sort_name = btf_type_sort_name(btf, i, false, "");
> >                  d->own_name = btf__name_by_offset(btf, t->name_off);
> > +               d->disambig_hash = btf_type_disambig_hash(btf, i);
> >          }
> >
> >          qsort(datums, n, sizeof(struct sort_datum), btf_type_compare);
> >
> Thanks for pointing to the bug of enum64 handling. I'll create a patch.

great, thanks

> Reading the rest of the code, hashing struct/union/enum fields is
> introduced:
> this is only useful for disambiguating ordering of the anonymous
> structs/unions/enums.

yes, I was trying to have some sort of fixed order. What I was doing
was basically:

touch kernel/bpf/verifier.c && make -j$(nproc)

and after that generated vmlinux.h and compared with the previous version.

And every single time it differs a bit due to a) there being types
that are named the same, but really are different types and b) as a
consequence, different order of types will cause different ___2, ___3
suffixes, which causes tiny changes to vmlinux.h)

Reproducible build would help with the ordering, but I was hoping to
achieve similar effect with bpftool (and I did remove some of the
unnecessary changes, but not all).

>
> I suspect the biggest source of the issues are structs and unions, though.
> Are definitions like this create problems?
> typedef struct {...} foo_t;
> ?
> I'll check what other differences this change makes.

just try the kernel rebuild and vmlinux.h generation steps above, you'll see :)

> >> [...]
> >>
>
Arnaldo Carvalho de Melo Aug. 30, 2024, 11:42 p.m. UTC | #7
On Fri, Aug 30, 2024 at 10:03:40PM +0100, Alan Maguire wrote:
> On 30/08/2024 21:34, Andrii Nakryiko wrote:
> > On Wed, Aug 28, 2024 at 3:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >>
> >> On Wed, 2024-08-28 at 17:46 +0000, Ihor Solodrai wrote:
> >>> %.bpf.o objects depend on vmlinux.h, which makes them transitively
> >>> dependent on unnecessary libbpf headers. However vmlinux.h doesn't
> >>> actually change as often.
> >>>
> >>> When generating vmlinux.h, compare it to a previous version and update
> >>> it only if there are changes.
> >>>
> >>> Example of build time improvement (after first clean build):
> >>>   $ touch ../../../lib/bpf/bpf.h
> >>>   $ time make -j8
> >>> Before: real  1m37.592s
> >>> After:  real  0m27.310s
> >>>
> >>> Notice that %.bpf.o gen step is skipped if vmlinux.h hasn't changed.
> >>>
> >>> Link: https://lore.kernel.org/bpf/CAEf4BzY1z5cC7BKye8=A8aTVxpsCzD=p1jdTfKC7i0XVuYoHUQ@mail.gmail.com
> >>>
> >>> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
> >>> ---
> >>
> >> Unfortunately, I think that this is a half-measure.
> >> E.g. the following command forces tests rebuild for me:
> >>
> >>   touch ../../../../kernel/bpf/verifier.c; \
> >>   make -j22 -C ../../../../; \
> >>   time make test_progs
> >>
> >> To workaround this we need to enable reproducible_build option:
> >>
> >>     diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> >>     index b75f09f3f424..8cd648f3e32b 100644
> >>     --- a/scripts/Makefile.btf
> >>     +++ b/scripts/Makefile.btf
> >>     @@ -19,7 +19,7 @@ pahole-flags-$(call test-ge, $(pahole-ver), 125)      += --skip_encoding_btf_inconsis
> >>      else
> >>
> >>      # Switch to using --btf_features for v1.26 and later.
> >>     -pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs
> >>     +pahole-flags-$(call test-ge, $(pahole-ver), 126)  = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build
> >>
> >>      ifneq ($(KBUILD_EXTMOD),)
> >>      module-pahole-flags-$(call test-ge, $(pahole-ver), 126) += --btf_features=distilled_base
> >>
> >> Question to the mailing list: do we want this?
> > 
> > Alan, can you please give us a summary of what are the consequences of
> > the reproducible_build pahole option? In terms of performance and
> > otherwise.
> >
> 
> Sure. The original context was that the folks trying to do reproducible
> builds were being impacted by the fact that BTF generation was
> non-deterministic when done in parallel; i.e. same kernel would give
> different BTF ids when rebuilding vmlinux BTF; the reason was largely as
> I understand it that when pahole partitioned CUs between multiple
> threads, that partitioning could vary. If it varied, when BTF was merged
> across threads we could end up with differing id assignments. Since BTF
> then was baked into the vmlinux binary, unstable BTF ids meant
> non-identical vmlinux.
> 
> The first approach to solve this was to remove parallel BTF generation
> to support reproducibility. Arnaldo however added support that retained
> parallelism while supporting determinism through using the DWARF CU
> order. He did some great analysis on the overheads for vmlinux
> generation too [1]; summary is that the overhead in runtime is approx
> 33% versus parallel non-reproducible encoding. Those numbers might not
> 100% translate to the vmlinux build during kernel since it was a
> detached pahole generation and the options might differ slightly, but
> they give a sense of things. I don't _think_ there should be additional
> memory overheads during pahole generation (we really can't afford any
> more memory usage), since it's really more about making order of CU
> processing consistent.
> 
> Would be good to get Arnaldo's perspective too if we're considering
> switching this on by default, as he knows this stuff much better than I do.

You described it nicely! And on top of that there was recent work that
will be available in 1.28 to reduce the memory footprint of pahole,
using it to find things to pack in itself, reducing the number of
allocations, not keeping unreferenced CUs around (you did it), part of
the work to have it working on 32-bit architectures, where we had
reports of it not working.

There is certainly more optimizations to be made to reduce its memory
footprint while allowing it to run in parallel, but at this point it
seems to have addressed the problems that were reported.

More people trying it and measuring the impacts, to confirm the tests
and analysis we did and you alluded too can be only a good thing in
getting us all informed and confortable with using this option by
default.

BTW, we have now a tests/ directory with a regression test for this
feature and another for the --prettify feature in pahole (use DWARF or
BTF to pretty print raw data with several tricks on finding the right
data structure based on enumerations when the conventions used in a
project allow for that, that is the case with tools/perf, also for using
header sizes to traverse variable sized records, etc), please see:

https://git.kernel.org/pub/scm/devel/pahole/pahole.git/tree/tests

And:

https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=fd14dc67cb6aaead553074afb4a1ddad10209892
https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=be638365781ed0c843249c5bcebe90a01e74b2fe

This should help us in detecting problems earlier, brownie point to
whoever gets this hooked up in CI systems, existing or new 8-)

Thanks,

- Arnaldo
 
> [1]
> https://lore.kernel.org/dwarves/20240412211604.789632-12-acme@kernel.org/
Ihor Solodrai Aug. 31, 2024, 6:18 p.m. UTC | #8
Andrii, Eduard,

On Friday, August 30th, 2024 at 1:34 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

[...]

> I've applied patches as is, despite them not solving the issue
> completely, as they are moving us in the right direction anyways. I do
> get slightly different BTF every single time I rebuild my kernel, so
> the change in patch #2 doesn't yet help me.

Thanks for applying the patches.
I didn't realize vmlinux.h generation is non-deterministic. Interesting.

> 
> For libbpf headers, Ihor, can you please follow up with adding
> bpf_helper_defs.h as a dependency?

I've tried tracking down where bpf_helper_defs.h is coming from and
(assuming my analysis is correct) this header is generated by
`scripts/bpf_doc.py`. From the selftests/bpf point of view the
dependency chain is as follows:

  1. vmlinux.h depends on bpftool:

       $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR)

  2. bpftool is installed for selftests via `make -C tools/bpf/bpftool install-bin`:

       BPFTOOL ?= $(DEFAULT_BPFTOOL)
       $(DEFAULT_BPFTOOL): ...
  	  $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) ... install-bin

  3. bpftool install-bin depends on libbpf:

       $(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
         ...
       install-bin: $(OUTPUT)bpftool


  4. $(LIBBPF) recipe runs `make -C tools/lib/bpf install_headers`,
     which depends on $(BPF_GENERATED) which equals to $(BPF_HELPER_DEFS)

       BPF_GENERATED	:= $(BPF_HELPER_DEFS)
         ...
       install_headers: $(BPF_GENERATED) $(INSTALL_SRC_HDRS) $(INSTALL_GEN_HDRS)

  5. Finally $(BPF_HELPER_DEFS) recipe executes the python script (in lib/bpf):

     $(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h
	$(QUIET_GEN)$(srctree)/scripts/bpf_doc.py --header \
		--file $(srctree)/tools/include/uapi/linux/bpf.h > $(BPF_HELPER_DEFS)


I don't see any benefit to adding bpf_helper_defs.h as a direct
dependency of anything in selftests/bpf. %.bpf.o already depend on
vmlinux.h, and unless we somehow get rid of vmlinux.h dependency on
bpftool, bpf_helper_defs.h should always be there at a point when 
%.bpf.o objects are compiled.


> 
> I have some ideas on how to make BTF regeneration in vmlinux.h itself
> unnecessary, that might help with this issue. Separately (depending on
> what are the negatives of the reproducible_build option) we can look
> into making pahole have more consistent internal BTF type ordering
> without negatively affecting the overall BTF dedup performance in
> pahole. Hopefully I can work with Ihor on this as follow ups.

I still know little about how all this machinery works, but I'd be
glad to help.

> 
> P.S. I also spent more time than I'm willing to admit trying to
> improve bpftool's BTF sorting to minimize the chance of vmlinux.h
> contents being different, and I think I removed a bunch of cases where
> we had unnecessary differences, but still, it's fundamentally
> non-deterministic to do everything based on type and field names,
> unfortunately.

[...]
Andrii Nakryiko Sept. 3, 2024, 4:58 p.m. UTC | #9
On Sat, Aug 31, 2024 at 11:18 AM Ihor Solodrai <ihor.solodrai@pm.me> wrote:
>
> Andrii, Eduard,
>
> On Friday, August 30th, 2024 at 1:34 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> [...]
>
> > I've applied patches as is, despite them not solving the issue
> > completely, as they are moving us in the right direction anyways. I do
> > get slightly different BTF every single time I rebuild my kernel, so
> > the change in patch #2 doesn't yet help me.
>
> Thanks for applying the patches.
> I didn't realize vmlinux.h generation is non-deterministic. Interesting.
>
> >
> > For libbpf headers, Ihor, can you please follow up with adding
> > bpf_helper_defs.h as a dependency?
>
> I've tried tracking down where bpf_helper_defs.h is coming from and
> (assuming my analysis is correct) this header is generated by
> `scripts/bpf_doc.py`. From the selftests/bpf point of view the
> dependency chain is as follows:
>
>   1. vmlinux.h depends on bpftool:
>
>        $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR)
>
>   2. bpftool is installed for selftests via `make -C tools/bpf/bpftool install-bin`:
>
>        BPFTOOL ?= $(DEFAULT_BPFTOOL)
>        $(DEFAULT_BPFTOOL): ...
>           $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) ... install-bin
>
>   3. bpftool install-bin depends on libbpf:
>
>        $(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
>          ...
>        install-bin: $(OUTPUT)bpftool
>
>
>   4. $(LIBBPF) recipe runs `make -C tools/lib/bpf install_headers`,
>      which depends on $(BPF_GENERATED) which equals to $(BPF_HELPER_DEFS)
>
>        BPF_GENERATED    := $(BPF_HELPER_DEFS)
>          ...
>        install_headers: $(BPF_GENERATED) $(INSTALL_SRC_HDRS) $(INSTALL_GEN_HDRS)
>
>   5. Finally $(BPF_HELPER_DEFS) recipe executes the python script (in lib/bpf):
>
>      $(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h
>         $(QUIET_GEN)$(srctree)/scripts/bpf_doc.py --header \
>                 --file $(srctree)/tools/include/uapi/linux/bpf.h > $(BPF_HELPER_DEFS)
>
>
> I don't see any benefit to adding bpf_helper_defs.h as a direct
> dependency of anything in selftests/bpf. %.bpf.o already depend on
> vmlinux.h, and unless we somehow get rid of vmlinux.h dependency on
> bpftool, bpf_helper_defs.h should always be there at a point when
> %.bpf.o objects are compiled.
>

Making sure that bpf_helper_defs.h is generated by the time .bpf.o is
being compiled is one thing. Triggering .bpf.o regeneration when
bpf_helper_defs.h changes is another. The second used to be important
when we were adding new helpers, otherwise we can get compilation
error because of missing helper definitions.

This is much less of an issue today, we we might just leave it as is.
Making sure bpf_helper_defs.h is there might be good enough.

>
> >
> > I have some ideas on how to make BTF regeneration in vmlinux.h itself
> > unnecessary, that might help with this issue. Separately (depending on
> > what are the negatives of the reproducible_build option) we can look
> > into making pahole have more consistent internal BTF type ordering
> > without negatively affecting the overall BTF dedup performance in
> > pahole. Hopefully I can work with Ihor on this as follow ups.
>
> I still know little about how all this machinery works, but I'd be
> glad to help.

I'd like to avoid regenerating BTF inside the vmlinux image, if
possible. This will cut down significantly on incremental kernel build
times. We can chat about this separately a bit later, don't worry.

>
> >
> > P.S. I also spent more time than I'm willing to admit trying to
> > improve bpftool's BTF sorting to minimize the chance of vmlinux.h
> > contents being different, and I think I removed a bunch of cases where
> > we had unnecessary differences, but still, it's fundamentally
> > non-deterministic to do everything based on type and field names,
> > unfortunately.
>
> [...]
>
>
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 53cc13b92ee2..7660d19b66c2 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -399,10 +399,14 @@  $(HOST_BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile)		       \
 		    DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers
 endif
 
+# vmlinux.h is first dumped to a temprorary file and then compared to
+# the previous version. This helps to avoid unnecessary re-builds of
+# $(TRUNNER_BPF_OBJS)
 $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR)
 ifeq ($(VMLINUX_H),)
 	$(call msg,GEN,,$@)
-	$(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
+	$(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $(INCLUDE_DIR)/.vmlinux.h.tmp
+	$(Q)cmp -s $(INCLUDE_DIR)/.vmlinux.h.tmp $@ || mv $(INCLUDE_DIR)/.vmlinux.h.tmp $@
 else
 	$(call msg,CP,,$@)
 	$(Q)cp "$(VMLINUX_H)" $@