diff mbox series

[bpf,3/7] bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()

Message ID 20241008091718.3797027-4-houtao@huaweicloud.com (mailing list archive)
State Superseded
Delegated to: BPF
Series Misc fixes for bpf

Checks

Context Check Description
bpf/vmtest-bpf-PR success PR summary
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 14 of 14 maintainers
netdev/build_clang success Errors and warnings before: 7 this patch: 7
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 61 this patch: 61
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 39 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 10 this patch: 10
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-15 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18

Commit Message

Hou Tao Oct. 8, 2024, 9:17 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
bits are dynamically allocated. However, the check is incorrect and may
cause a kmemleak as shown below:

unreferenced object 0xffff88812628c8c0 (size 32):
  comm "swapper/0", pid 1, jiffies 4294727320
  hex dump (first 32 bytes):
    b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
    f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
  backtrace (crc 781e32cc):
    [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
    [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
    [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
    [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
    [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
    [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
    [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
    [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
    [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
    [<00000000b119f72f>] kernel_init+0x20/0x160
    [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
    [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30

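That is because bpf_iter_bits_next() sets nr_bits to zero after all bits
have been iterated, so bpf_iter_bits_destroy() then sees nr_bits <= 64
and returns without freeing the dynamically allocated bits.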

Fix the problem by introducing an extra allocated status in
bpf_iter_bits and using it to indicate whether the bits are
dynamically allocated.

Fixes: 4665415975b0 ("bpf: Add bits iterator")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/helpers.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
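
For illustration, the leak and the flag-based fix described above can be modelled in userspace. This is only a sketch, not the kernel code: malloc()/free() stand in for bpf_mem_alloc()/bpf_mem_free(), uint64_t replaces unsigned long, a plain loop replaces find_next_bit(), and the names bits_iter, iter_*, live_allocs and leaked_after_full_walk() are made up for this example. Argument validation is reduced to a zero-length check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter of outstanding heap buffers in this model. */
static int live_allocs;

struct bits_iter {
	union {
		uint64_t *bits;      /* heap copy when more than 64 bits */
		uint64_t bits_copy;  /* inline copy when exactly 64 bits */
	};
	unsigned int allocated:1;    /* the extra status added by the patch */
	unsigned int nr_bits:31;
	int bit;
};

static int iter_new(struct bits_iter *it, const uint64_t *src, uint32_t nr_words)
{
	it->allocated = 0;
	it->nr_bits = 0;
	it->bits_copy = 0;
	it->bit = -1;
	if (nr_words == 0)
		return -1;
	if (nr_words == 1) {
		it->bits_copy = src[0];
		it->nr_bits = 64;
		return 0;
	}
	it->bits = malloc(nr_words * sizeof(*src));
	if (!it->bits)
		return -1;
	memcpy(it->bits, src, nr_words * sizeof(*src));
	live_allocs++;
	it->allocated = 1;
	it->nr_bits = nr_words * 64;
	return 0;
}

static int *iter_next(struct bits_iter *it)
{
	uint32_t nr_bits = it->nr_bits, i;
	const uint64_t *bits;

	if (nr_bits == 0)
		return NULL;
	bits = it->allocated ? it->bits : &it->bits_copy;
	for (i = (uint32_t)(it->bit + 1); i < nr_bits; i++) {
		if (bits[i / 64] & (1ULL << (i % 64))) {
			it->bit = (int)i;
			return &it->bit;
		}
	}
	it->nr_bits = 0;  /* the clearing that defeats the old destroy check */
	return NULL;
}

/* Old check: trusts nr_bits, which iter_next() may already have zeroed. */
static void iter_destroy_old(struct bits_iter *it)
{
	if (it->nr_bits <= 64)
		return;
	free(it->bits);
	live_allocs--;
}

/* Patched check: relies on the dedicated allocated flag instead. */
static void iter_destroy_new(struct bits_iter *it)
{
	if (!it->allocated)
		return;
	free(it->bits);
	live_allocs--;
}

/* Walk a 128-bit iterator to the end, destroy it, and report leaked buffers. */
static int leaked_after_full_walk(bool use_flag)
{
	const uint64_t src[2] = { 0x5, 0x1 };
	struct bits_iter it;
	int before = live_allocs, leaked;

	if (iter_new(&it, src, 2))
		return -1;
	while (iter_next(&it))
		;
	if (use_flag)
		iter_destroy_new(&it);
	else
		iter_destroy_old(&it);
	leaked = live_allocs - before;
	if (leaked) {  /* clean up so the model itself does not leak */
		free(it.bits);
		live_allocs--;
	}
	return leaked;
}
```

Under these assumptions, leaked_after_full_walk(false) reports one outstanding buffer, because the old check sees nr_bits == 0 and returns early, while leaked_after_full_walk(true) reports none.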

Comments

Andrii Nakryiko Oct. 8, 2024, 6:26 p.m. UTC | #1
On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
> bits are dynamically allocated. However, the check is incorrect and may
> cause a kmemleak as shown below:
>
> unreferenced object 0xffff88812628c8c0 (size 32):
>   comm "swapper/0", pid 1, jiffies 4294727320
>   hex dump (first 32 bytes):
>     b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
>     f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
>   backtrace (crc 781e32cc):
>     [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
>     [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
>     [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
>     [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
>     [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
>     [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
>     [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
>     [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
>     [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
>     [<00000000b119f72f>] kernel_init+0x20/0x160
>     [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
>     [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
>
> That is because nr_bits will be set as zero in bpf_iter_bits_next()
> after all bits have been iterated.
>

so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits`
condition to know when iterator is done?

> Fix the problem by introducing an extra allocated status in
> bpf_iter_bits and using it to indicate whether the bits are
> dynamically allocated.
>
> Fixes: 4665415975b0 ("bpf: Add bits iterator")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  kernel/bpf/helpers.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 1a43d06eab28..9484b5f7c4c0 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2856,7 +2856,8 @@ struct bpf_iter_bits_kern {
>                 unsigned long *bits;
>                 unsigned long bits_copy;
>         };
> -       u32 nr_bits;
> +       u32 allocated:1;
> +       u32 nr_bits:31;
>         int bit;
>  } __aligned(8);
>
> @@ -2886,6 +2887,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
>         BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
>                      __alignof__(struct bpf_iter_bits));
>
> +       kit->allocated = 0;
>         kit->nr_bits = 0;
>         kit->bits_copy = 0;
>         kit->bit = -1;
> @@ -2914,6 +2916,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
>                 return err;
>         }
>
> +       kit->allocated = 1;
>         kit->nr_bits = nr_bits;
>         return 0;
>  }
> @@ -2937,7 +2940,7 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
>         if (nr_bits == 0)
>                 return NULL;
>
> -       bits = nr_bits == 64 ? &kit->bits_copy : kit->bits;
> +       bits = !kit->allocated ? &kit->bits_copy : kit->bits;
>         bit = find_next_bit(bits, nr_bits, kit->bit + 1);
>         if (bit >= nr_bits) {
>                 kit->nr_bits = 0;
> @@ -2958,7 +2961,7 @@ __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
>  {
>         struct bpf_iter_bits_kern *kit = (void *)it;
>
> -       if (kit->nr_bits <= 64)
> +       if (!kit->allocated)
>                 return;
>         bpf_mem_free(&bpf_global_ma, kit->bits);
>  }
> --
> 2.29.2
>
Hou Tao Oct. 9, 2024, 1:09 a.m. UTC | #2
Hi,

On 10/9/2024 2:26 AM, Andrii Nakryiko wrote:
> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
>> bits are dynamically allocated. However, the check is incorrect and may
>> cause a kmemleak as shown below:
>>
>> unreferenced object 0xffff88812628c8c0 (size 32):
>>   comm "swapper/0", pid 1, jiffies 4294727320
>>   hex dump (first 32 bytes):
>>     b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
>>     f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
>>   backtrace (crc 781e32cc):
>>     [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
>>     [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
>>     [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
>>     [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
>>     [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
>>     [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
>>     [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
>>     [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
>>     [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
>>     [<00000000b119f72f>] kernel_init+0x20/0x160
>>     [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
>>     [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
>>
>> That is because nr_bits will be set as zero in bpf_iter_bits_next()
>> after all bits have been iterated.
>>
> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits`
> condition to know when iterator is done?

Good idea. That would be simpler. Will do in v2.
>
>> Fix the problem by introducing an extra allocated status in
>> bpf_iter_bits and using it to indicate whether the bits are
>> dynamically allocated.
Yafang Shao Oct. 9, 2024, 11:37 a.m. UTC | #3
On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote:
> >
> > From: Hou Tao <houtao1@huawei.com>
> >
> > bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
> > bits are dynamically allocated. However, the check is incorrect and may
> > cause a kmemleak as shown below:
> >
> > unreferenced object 0xffff88812628c8c0 (size 32):
> >   comm "swapper/0", pid 1, jiffies 4294727320
> >   hex dump (first 32 bytes):
> >     b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
> >     f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
> >   backtrace (crc 781e32cc):
> >     [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
> >     [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
> >     [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
> >     [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
> >     [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
> >     [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
> >     [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
> >     [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
> >     [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
> >     [<00000000b119f72f>] kernel_init+0x20/0x160
> >     [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
> >     [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
> >
> > That is because nr_bits will be set as zero in bpf_iter_bits_next()
> > after all bits have been iterated.
> >
>
> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits`
> condition to know when iterator is done?

No, we can't do that. The iterator may only process a few bits, which
would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to
simply remove the line `kit->nr_bits = 0;`?

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 1a43d06eab28..7fcd3163cf68 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2939,10 +2939,8 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)

        bits = nr_bits == 64 ? &kit->bits_copy : kit->bits;
        bit = find_next_bit(bits, nr_bits, kit->bit + 1);
-       if (bit >= nr_bits) {
-               kit->nr_bits = 0;
+       if (bit >= nr_bits)
                return NULL;
-       }

        kit->bit = bit;
        return &kit->bit;

--
Regards

Yafang
Hou Tao Oct. 10, 2024, 1:09 a.m. UTC | #4
Hi,

On 10/9/2024 7:37 PM, Yafang Shao wrote:
> On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote:
>>> From: Hou Tao <houtao1@huawei.com>
>>>
>>> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
>>> bits are dynamically allocated. However, the check is incorrect and may
>>> cause a kmemleak as shown below:
>>>
>>> unreferenced object 0xffff88812628c8c0 (size 32):
>>>   comm "swapper/0", pid 1, jiffies 4294727320
>>>   hex dump (first 32 bytes):
>>>     b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
>>>     f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
>>>   backtrace (crc 781e32cc):
>>>     [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
>>>     [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
>>>     [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
>>>     [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
>>>     [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
>>>     [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
>>>     [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
>>>     [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
>>>     [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
>>>     [<00000000b119f72f>] kernel_init+0x20/0x160
>>>     [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
>>>     [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
>>>
>>> That is because nr_bits will be set as zero in bpf_iter_bits_next()
>>> after all bits have been iterated.
>>>
>> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits`
>> condition to know when iterator is done?
> No, we can't do that. The iterator may only process a few bits, which
> would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to
> simply remove the line `kit->nr_bits = 0;`?

I think that is what Andrii wanted to say. Would it also be more reasonable
to change the check at the beginning of bpf_iter_bits_next() to
"bit >= nr_bits"?

@@ -2934,15 +2934,13 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
        const unsigned long *bits;
        int bit;

-       if (nr_bits == 0)
+       if (kit->bit >= nr_bits)
                return NULL;


>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 1a43d06eab28..7fcd3163cf68 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2939,10 +2939,8 @@ __bpf_kfunc int *bpf_iter_bits_next(struct
> bpf_iter_bits *it)
>
>         bits = nr_bits == 64 ? &kit->bits_copy : kit->bits;
>         bit = find_next_bit(bits, nr_bits, kit->bit + 1);
> -       if (bit >= nr_bits) {
> -               kit->nr_bits = 0;
> +       if (bit >= nr_bits)
>                 return NULL;
> -       }
>
>         kit->bit = bit;
>         return &kit->bit;
>
> --
> Regards
>
> Yafang
> .
Yafang Shao Oct. 10, 2024, 2:22 a.m. UTC | #5
On Thu, Oct 10, 2024 at 9:10 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 10/9/2024 7:37 PM, Yafang Shao wrote:
> > On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote:
> >>> From: Hou Tao <houtao1@huawei.com>
> >>>
> >>> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the
> >>> bits are dynamically allocated. However, the check is incorrect and may
> >>> cause a kmemleak as shown below:
> >>>
> >>> unreferenced object 0xffff88812628c8c0 (size 32):
> >>>   comm "swapper/0", pid 1, jiffies 4294727320
> >>>   hex dump (first 32 bytes):
> >>>     b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0  ..U.............
> >>>     f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00  ................
> >>>   backtrace (crc 781e32cc):
> >>>     [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80
> >>>     [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0
> >>>     [<00000000597124d6>] __alloc.isra.0+0x89/0xb0
> >>>     [<000000004ebfffcd>] alloc_bulk+0x2af/0x720
> >>>     [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0
> >>>     [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610
> >>>     [<000000008b616eac>] bpf_global_ma_init+0x19/0x30
> >>>     [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0
> >>>     [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940
> >>>     [<00000000b119f72f>] kernel_init+0x20/0x160
> >>>     [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70
> >>>     [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30
> >>>
> >>> That is because nr_bits will be set as zero in bpf_iter_bits_next()
> >>> after all bits have been iterated.
> >>>
> >> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits`
> >> condition to know when iterator is done?
> > No, we can't do that. The iterator may only process a few bits, which
> > would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to
> > simply remove the line `kit->nr_bits = 0;`?
>
> I think that is Andrii wanted to say. And is it more reasonable to also
> change the check in the begin of bpf_iter_bits_next() to "bit >= nr_bits" ?
>
> @@ -2934,15 +2934,13 @@ __bpf_kfunc int *bpf_iter_bits_next(struct
> bpf_iter_bits *it)
>         const unsigned long *bits;
>         int bit;
>
> -       if (nr_bits == 0)
> +       if (kit->bit >= nr_bits)
>                 return NULL;

Agreed. I misunderstood what Andrii suggested.
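
The alternative discussed in this thread (leave nr_bits untouched and detect the end of iteration from the bit position) can be sketched in userspace as follows. This is a model under stated assumptions, not the v2 patch: malloc()/free() stand in for the BPF memory allocator, uint64_t replaces unsigned long, a plain loop replaces find_next_bit(), and bits_iter, iter_*, live_allocs and walk_and_destroy() are hypothetical names. The model also stores the exhausted position in bit and compares it as a signed int, because bit starts at -1 and a mixed int/u32 comparison would convert -1 to a large unsigned value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter of outstanding heap buffers in this model. */
static int live_allocs;

struct bits_iter {
	union {
		uint64_t *bits;      /* heap copy when more than 64 bits */
		uint64_t bits_copy;  /* inline copy when exactly 64 bits */
	};
	uint32_t nr_bits;            /* never modified after iter_new() */
	int bit;
};

static int iter_new(struct bits_iter *it, const uint64_t *src, uint32_t nr_words)
{
	it->nr_bits = 0;
	it->bits_copy = 0;
	it->bit = -1;
	if (nr_words == 0)
		return -1;
	if (nr_words == 1) {
		it->bits_copy = src[0];
		it->nr_bits = 64;
		return 0;
	}
	it->bits = malloc(nr_words * sizeof(*src));
	if (!it->bits)
		return -1;
	memcpy(it->bits, src, nr_words * sizeof(*src));
	live_allocs++;
	it->nr_bits = nr_words * 64;
	return 0;
}

static int *iter_next(struct bits_iter *it)
{
	int nr_bits = (int)it->nr_bits;  /* signed on purpose, see lead-in */
	int bit = it->bit;
	const uint64_t *bits;

	/* Done when nothing was set up or the position already ran off the end. */
	if (nr_bits == 0 || bit >= nr_bits)
		return NULL;
	bits = nr_bits == 64 ? &it->bits_copy : it->bits;
	for (bit = bit + 1; bit < nr_bits; bit++)
		if (bits[bit / 64] & (1ULL << (bit % 64)))
			break;
	it->bit = bit;  /* equals nr_bits once the iterator is exhausted */
	return bit < nr_bits ? &it->bit : NULL;
}

/* nr_bits is still intact here, so the original size check stays valid. */
static void iter_destroy(struct bits_iter *it)
{
	if (it->nr_bits <= 64)
		return;
	free(it->bits);
	live_allocs--;
}

/* Visit at most max_steps set bits, destroy the iterator, return the count. */
static int walk_and_destroy(const uint64_t *src, uint32_t nr_words, int max_steps)
{
	struct bits_iter it;
	int seen = 0;

	if (iter_new(&it, src, nr_words))
		return -1;
	while (seen < max_steps && iter_next(&it))
		seen++;
	iter_destroy(&it);
	return seen;
}
```

In this model, destroy frees the buffer both after a full walk and after an early exit, since the size recorded at creation time is never overwritten.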

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 1a43d06eab28..9484b5f7c4c0 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2856,7 +2856,8 @@  struct bpf_iter_bits_kern {
 		unsigned long *bits;
 		unsigned long bits_copy;
 	};
-	u32 nr_bits;
+	u32 allocated:1;
+	u32 nr_bits:31;
 	int bit;
 } __aligned(8);
 
@@ -2886,6 +2887,7 @@  bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
 	BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) !=
 		     __alignof__(struct bpf_iter_bits));
 
+	kit->allocated = 0;
 	kit->nr_bits = 0;
 	kit->bits_copy = 0;
 	kit->bit = -1;
@@ -2914,6 +2916,7 @@  bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w
 		return err;
 	}
 
+	kit->allocated = 1;
 	kit->nr_bits = nr_bits;
 	return 0;
 }
@@ -2937,7 +2940,7 @@  __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it)
 	if (nr_bits == 0)
 		return NULL;
 
-	bits = nr_bits == 64 ? &kit->bits_copy : kit->bits;
+	bits = !kit->allocated ? &kit->bits_copy : kit->bits;
 	bit = find_next_bit(bits, nr_bits, kit->bit + 1);
 	if (bit >= nr_bits) {
 		kit->nr_bits = 0;
@@ -2958,7 +2961,7 @@  __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
 {
 	struct bpf_iter_bits_kern *kit = (void *)it;
 
-	if (kit->nr_bits <= 64)
+	if (!kit->allocated)
 		return;
 	bpf_mem_free(&bpf_global_ma, kit->bits);
 }