Message ID | 20220919144811.3570825-1-houtao@huaweicloud.com (mailing list archive) |
---|---|
State | Accepted |
Commit | c31b38cb948ee7d3317139f005fa1f90de4a06b7 |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: Check whether or not node is NULL before free it in free_bulk | expand |
On Mon, Sep 19, 2022 at 7:30 AM Hou Tao <houtao@huaweicloud.com> wrote: > > From: Hou Tao <houtao1@huawei.com> > > llnode could be NULL if there are new allocations after the checking of > c-free_cnt > c->high_watermark in bpf_mem_refill() and before the > calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel > or allocation in NMI context). And it will incur oops as shown below: > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > #PF: supervisor write access in kernel mode > #PF: error_code(0x0002) - not-present page > PGD 0 P4D 0 > Oops: 0002 [#1] PREEMPT_RT SMP > CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G W 6.0.0-rc6-rt9+ #1 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) > RIP: 0010:bpf_mem_refill+0x66/0x130 > ...... > Call Trace: > <TASK> > irq_work_single+0x24/0x60 > irq_work_run_list+0x24/0x30 > run_irq_workd+0x18/0x20 > smpboot_thread_fn+0x13f/0x2c0 > kthread+0x121/0x140 > ? kthread_complete_and_exit+0x20/0x20 > ret_from_fork+0x1f/0x30 > </TASK> > > Simply fixing it by checking whether or not llnode is NULL in free_bulk(). > > Fixes: 1376b7c57624 ("bpf: Introduce any context BPF specific memory allocator.") There is no such sha. Also that commit isn't buggy as-is. The proper fixes tag: Fixes: 8d5a8011b35d ("bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.") Used that while applying. Thanks for the fix !
Hello: This patch was applied to bpf/bpf-next.git (master) by Alexei Starovoitov <ast@kernel.org>: On Mon, 19 Sep 2022 22:48:11 +0800 you wrote: > From: Hou Tao <houtao1@huawei.com> > > llnode could be NULL if there are new allocations after the checking of > c-free_cnt > c->high_watermark in bpf_mem_refill() and before the > calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel > or allocation in NMI context). And it will incur oops as shown below: > > [...] Here is the summary with links: - [bpf-next] bpf: Check whether or not node is NULL before free it in free_bulk https://git.kernel.org/bpf/bpf-next/c/c31b38cb948e You are awesome, thank you!
Hi, On 9/20/2022 11:09 PM, Alexei Starovoitov wrote: > On Mon, Sep 19, 2022 at 7:30 AM Hou Tao <houtao@huaweicloud.com> wrote: >> From: Hou Tao <houtao1@huawei.com> >> >> llnode could be NULL if there are new allocations after the checking of >> c-free_cnt > c->high_watermark in bpf_mem_refill() and before the >> calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel >> or allocation in NMI context). And it will incur oops as shown below: >> >> BUG: kernel NULL pointer dereference, address: 0000000000000000 >> #PF: supervisor write access in kernel mode >> #PF: error_code(0x0002) - not-present page >> PGD 0 P4D 0 >> Oops: 0002 [#1] PREEMPT_RT SMP >> CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G W 6.0.0-rc6-rt9+ #1 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) >> RIP: 0010:bpf_mem_refill+0x66/0x130 >> ...... >> Call Trace: >> <TASK> >> irq_work_single+0x24/0x60 >> irq_work_run_list+0x24/0x30 >> run_irq_workd+0x18/0x20 >> smpboot_thread_fn+0x13f/0x2c0 >> kthread+0x121/0x140 >> ? kthread_complete_and_exit+0x20/0x20 >> ret_from_fork+0x1f/0x30 >> </TASK> >> >> Simply fixing it by checking whether or not llnode is NULL in free_bulk(). >> >> Fixes: 1376b7c57624 ("bpf: Introduce any context BPF specific memory allocator.") > There is no such sha. > Also that commit isn't buggy as-is. > The proper fixes tag: > Fixes: 8d5a8011b35d ("bpf: Batch call_rcu callbacks instead of > SLAB_TYPESAFE_BY_RCU.") The incorrect git sha-sum is due to rebase on my local branch. You are right. In 7c8199e24fa0 ("bpf: Introduce any context BPF specific memory allocator."), free_bulk() calls kfree() and kmem_cache_free() directly, so there is no such problem. And in commit 8d5a8011b35d ("bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU."), free_one is replaced by enque_to_free() and incurs the problem. > > Used that while applying. Thanks for the update. > Thanks for the fix ! > .
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c index 20621f5407d8..5f83be1d2018 100644 --- a/kernel/bpf/memalloc.c +++ b/kernel/bpf/memalloc.c @@ -277,7 +277,8 @@ static void free_bulk(struct bpf_mem_cache *c) local_dec(&c->active); if (IS_ENABLED(CONFIG_PREEMPT_RT)) local_irq_restore(flags); - enque_to_free(c, llnode); + if (llnode) + enque_to_free(c, llnode); } while (cnt > (c->high_watermark + c->low_watermark) / 2); /* and drain free_llist_extra */