Message ID | 20241008091718.3797027-4-houtao@huaweicloud.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | Misc fixes for bpf | expand |
On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote: > > From: Hou Tao <houtao1@huawei.com> > > bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the > bits are dynamically allocated. However, the check is incorrect and may > cause a kmemleak as shown below: > > unreferenced object 0xffff88812628c8c0 (size 32): > comm "swapper/0", pid 1, jiffies 4294727320 > hex dump (first 32 bytes): > b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U............. > f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 ................ > backtrace (crc 781e32cc): > [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 > [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 > [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 > [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 > [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 > [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 > [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 > [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 > [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 > [<00000000b119f72f>] kernel_init+0x20/0x160 > [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 > [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 > > That is because nr_bits will be set as zero in bpf_iter_bits_next() > after all bits have been iterated. > so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits` condition to know when iterator is done? > Fix the problem by introducing an extra allocated status in > bpf_iter_bits and using it to indicate whether the bits are > dynamically allocated. > > Fixes: 4665415975b0 ("bpf: Add bits iterator") > Signed-off-by: Hou Tao <houtao1@huawei.com> > --- > kernel/bpf/helpers.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index 1a43d06eab28..9484b5f7c4c0 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -2856,7 +2856,8 @@ struct bpf_iter_bits_kern { > unsigned long *bits; > unsigned long bits_copy; > }; > - u32 nr_bits; > + u32 allocated:1; > + u32 nr_bits:31; > int bit; > } __aligned(8); > > @@ -2886,6 +2887,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w > BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) != > __alignof__(struct bpf_iter_bits)); > > + kit->allocated = 0; > kit->nr_bits = 0; > kit->bits_copy = 0; > kit->bit = -1; > @@ -2914,6 +2916,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w > return err; > } > > + kit->allocated = 1; > kit->nr_bits = nr_bits; > return 0; > } > @@ -2937,7 +2940,7 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it) > if (nr_bits == 0) > return NULL; > > - bits = nr_bits == 64 ? &kit->bits_copy : kit->bits; > + bits = !kit->allocated ? &kit->bits_copy : kit->bits; > bit = find_next_bit(bits, nr_bits, kit->bit + 1); > if (bit >= nr_bits) { > kit->nr_bits = 0; > @@ -2958,7 +2961,7 @@ __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it) > { > struct bpf_iter_bits_kern *kit = (void *)it; > > - if (kit->nr_bits <= 64) > + if (!kit->allocated) > return; > bpf_mem_free(&bpf_global_ma, kit->bits); > } > -- > 2.29.2 >
Hi, On 10/9/2024 2:26 AM, Andrii Nakryiko wrote: > On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote: >> From: Hou Tao <houtao1@huawei.com> >> >> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the >> bits are dynamically allocated. However, the check is incorrect and may >> cause a kmemleak as shown below: >> >> unreferenced object 0xffff88812628c8c0 (size 32): >> comm "swapper/0", pid 1, jiffies 4294727320 >> hex dump (first 32 bytes): >> b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U............. >> f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 ................ >> backtrace (crc 781e32cc): >> [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 >> [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 >> [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 >> [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 >> [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 >> [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 >> [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 >> [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 >> [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 >> [<00000000b119f72f>] kernel_init+0x20/0x160 >> [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 >> [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 >> >> That is because nr_bits will be set as zero in bpf_iter_bits_next() >> after all bits have been iterated. >> > so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits` > condition to know when iterator is done? Good idea. That would be simpler. Will do in v2. > >> Fix the problem by introducing an extra allocated status in >> bpf_iter_bits and using it to indicate whether the bits are >> dynamically allocated.
On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote: > > > > From: Hou Tao <houtao1@huawei.com> > > > > bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the > > bits are dynamically allocated. However, the check is incorrect and may > > cause a kmemleak as shown below: > > > > unreferenced object 0xffff88812628c8c0 (size 32): > > comm "swapper/0", pid 1, jiffies 4294727320 > > hex dump (first 32 bytes): > > b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U............. > > f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 ................ > > backtrace (crc 781e32cc): > > [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 > > [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 > > [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 > > [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 > > [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 > > [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 > > [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 > > [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 > > [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 > > [<00000000b119f72f>] kernel_init+0x20/0x160 > > [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 > > [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 > > > > That is because nr_bits will be set as zero in bpf_iter_bits_next() > > after all bits have been iterated. > > > > so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits` > condition to know when iterator is done? No, we can't do that. The iterator may only process a few bits, which would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to simply remove the line `kit->nr_bits = 0;`? diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 1a43d06eab28..7fcd3163cf68 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2939,10 +2939,8 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it) bits = nr_bits == 64 ? &kit->bits_copy : kit->bits; bit = find_next_bit(bits, nr_bits, kit->bit + 1); - if (bit >= nr_bits) { - kit->nr_bits = 0; + if (bit >= nr_bits) return NULL; - } kit->bit = bit; return &kit->bit; -- Regards Yafang
Hi, On 10/9/2024 7:37 PM, Yafang Shao wrote: > On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote: >>> From: Hou Tao <houtao1@huawei.com> >>> >>> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the >>> bits are dynamically allocated. However, the check is incorrect and may >>> cause a kmemleak as shown below: >>> >>> unreferenced object 0xffff88812628c8c0 (size 32): >>> comm "swapper/0", pid 1, jiffies 4294727320 >>> hex dump (first 32 bytes): >>> b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U............. >>> f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 ................ >>> backtrace (crc 781e32cc): >>> [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 >>> [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 >>> [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 >>> [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 >>> [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 >>> [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 >>> [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 >>> [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 >>> [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 >>> [<00000000b119f72f>] kernel_init+0x20/0x160 >>> [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 >>> [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 >>> >>> That is because nr_bits will be set as zero in bpf_iter_bits_next() >>> after all bits have been iterated. >>> >> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits` >> condition to know when iterator is done? > No, we can't do that. The iterator may only process a few bits, which > would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to > simply remove the line `kit->nr_bits = 0;`? I think that is Andrii wanted to say. And is it more reasonable to also change the check in the begin of bpf_iter_bits_next() to "bit >= nr_bits" ? @@ -2934,15 +2934,13 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it) const unsigned long *bits; int bit; - if (nr_bits == 0) + if (kit->bit >= nr_bits) return NULL; > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index 1a43d06eab28..7fcd3163cf68 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -2939,10 +2939,8 @@ __bpf_kfunc int *bpf_iter_bits_next(struct > bpf_iter_bits *it) > > bits = nr_bits == 64 ? &kit->bits_copy : kit->bits; > bit = find_next_bit(bits, nr_bits, kit->bit + 1); > - if (bit >= nr_bits) { > - kit->nr_bits = 0; > + if (bit >= nr_bits) > return NULL; > - } > > kit->bit = bit; > return &kit->bit; > > -- > Regards > > Yafang > .
On Thu, Oct 10, 2024 at 9:10 AM Hou Tao <houtao@huaweicloud.com> wrote: > > Hi, > > On 10/9/2024 7:37 PM, Yafang Shao wrote: > > On Wed, Oct 9, 2024 at 2:26 AM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > >> On Tue, Oct 8, 2024 at 2:05 AM Hou Tao <houtao@huaweicloud.com> wrote: > >>> From: Hou Tao <houtao1@huawei.com> > >>> > >>> bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the > >>> bits are dynamically allocated. However, the check is incorrect and may > >>> cause a kmemleak as shown below: > >>> > >>> unreferenced object 0xffff88812628c8c0 (size 32): > >>> comm "swapper/0", pid 1, jiffies 4294727320 > >>> hex dump (first 32 bytes): > >>> b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U............. > >>> f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 ................ > >>> backtrace (crc 781e32cc): > >>> [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 > >>> [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 > >>> [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 > >>> [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 > >>> [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 > >>> [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 > >>> [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 > >>> [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 > >>> [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 > >>> [<00000000b119f72f>] kernel_init+0x20/0x160 > >>> [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 > >>> [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 > >>> > >>> That is because nr_bits will be set as zero in bpf_iter_bits_next() > >>> after all bits have been iterated. > >>> > >> so maybe don't touch nr_bits and just use `kit->bit >= kit->nr_bits` > >> condition to know when iterator is done? > > No, we can't do that. The iterator may only process a few bits, which > > would result in `kit->bit < kit->nr_bits`. Wouldn't it be better to > > simply remove the line `kit->nr_bits = 0;`? > > I think that is Andrii wanted to say. And is it more reasonable to also > change the check in the begin of bpf_iter_bits_next() to "bit >= nr_bits" ? > > @@ -2934,15 +2934,13 @@ __bpf_kfunc int *bpf_iter_bits_next(struct > bpf_iter_bits *it) > const unsigned long *bits; > int bit; > > - if (nr_bits == 0) > + if (kit->bit >= nr_bits) > return NULL; Agreed. I misunderstood what Andrii suggested.
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 1a43d06eab28..9484b5f7c4c0 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -2856,7 +2856,8 @@ struct bpf_iter_bits_kern { unsigned long *bits; unsigned long bits_copy; }; - u32 nr_bits; + u32 allocated:1; + u32 nr_bits:31; int bit; } __aligned(8); @@ -2886,6 +2887,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w BUILD_BUG_ON(__alignof__(struct bpf_iter_bits_kern) != __alignof__(struct bpf_iter_bits)); + kit->allocated = 0; kit->nr_bits = 0; kit->bits_copy = 0; kit->bit = -1; @@ -2914,6 +2916,7 @@ bpf_iter_bits_new(struct bpf_iter_bits *it, const u64 *unsafe_ptr__ign, u32 nr_w return err; } + kit->allocated = 1; kit->nr_bits = nr_bits; return 0; } @@ -2937,7 +2940,7 @@ __bpf_kfunc int *bpf_iter_bits_next(struct bpf_iter_bits *it) if (nr_bits == 0) return NULL; - bits = nr_bits == 64 ? &kit->bits_copy : kit->bits; + bits = !kit->allocated ? &kit->bits_copy : kit->bits; bit = find_next_bit(bits, nr_bits, kit->bit + 1); if (bit >= nr_bits) { kit->nr_bits = 0; @@ -2958,7 +2961,7 @@ __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it) { struct bpf_iter_bits_kern *kit = (void *)it; - if (kit->nr_bits <= 64) + if (!kit->allocated) return; bpf_mem_free(&bpf_global_ma, kit->bits); }