diff mbox series

[bpf-next] bpf: Check whether or not node is NULL before free it in free_bulk

Message ID 20220919144811.3570825-1-houtao@huaweicloud.com (mailing list archive)
State Accepted
Commit c31b38cb948ee7d3317139f005fa1f90de4a06b7
Delegated to: BPF
Headers show
Series [bpf-next] bpf: Check whether or not node is NULL before free it in free_bulk | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5 this patch: 5
netdev/cc_maintainers warning 2 maintainers not CCed: song@kernel.org martin.lau@linux.dev
netdev/build_clang success Errors and warnings before: 5 this patch: 5
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes fail Problems with Fixes tag: 1
netdev/build_allmodconfig_warn success Errors and warnings before: 5 this patch: 5
netdev/checkpatch warning WARNING: Unknown commit id '1376b7c57624', maybe rebased or not pulled?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-4 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on s390x with gcc

Commit Message

Hou Tao Sept. 19, 2022, 2:48 p.m. UTC
From: Hou Tao <houtao1@huawei.com>

llnode could be NULL if there are new allocations after the checking of
c-free_cnt > c->high_watermark in bpf_mem_refill() and before the
calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel
or allocation in NMI context). And it will incur oops as shown below:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT_RT SMP
 CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G        W          6.0.0-rc6-rt9+ #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 RIP: 0010:bpf_mem_refill+0x66/0x130
 ......
 Call Trace:
  <TASK>
  irq_work_single+0x24/0x60
  irq_work_run_list+0x24/0x30
  run_irq_workd+0x18/0x20
  smpboot_thread_fn+0x13f/0x2c0
  kthread+0x121/0x140
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Simply fixing it by checking whether or not llnode is NULL in free_bulk().

Fixes: 1376b7c57624 ("bpf: Introduce any context BPF specific memory allocator.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/memalloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Alexei Starovoitov Sept. 20, 2022, 3:09 p.m. UTC | #1
On Mon, Sep 19, 2022 at 7:30 AM Hou Tao <houtao@huaweicloud.com> wrote:
>
> From: Hou Tao <houtao1@huawei.com>
>
> llnode could be NULL if there are new allocations after the checking of
> c-free_cnt > c->high_watermark in bpf_mem_refill() and before the
> calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel
> or allocation in NMI context). And it will incur oops as shown below:
>
>  BUG: kernel NULL pointer dereference, address: 0000000000000000
>  #PF: supervisor write access in kernel mode
>  #PF: error_code(0x0002) - not-present page
>  PGD 0 P4D 0
>  Oops: 0002 [#1] PREEMPT_RT SMP
>  CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G        W          6.0.0-rc6-rt9+ #1
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>  RIP: 0010:bpf_mem_refill+0x66/0x130
>  ......
>  Call Trace:
>   <TASK>
>   irq_work_single+0x24/0x60
>   irq_work_run_list+0x24/0x30
>   run_irq_workd+0x18/0x20
>   smpboot_thread_fn+0x13f/0x2c0
>   kthread+0x121/0x140
>   ? kthread_complete_and_exit+0x20/0x20
>   ret_from_fork+0x1f/0x30
>   </TASK>
>
> Simply fixing it by checking whether or not llnode is NULL in free_bulk().
>
> Fixes: 1376b7c57624 ("bpf: Introduce any context BPF specific memory allocator.")

There is no such sha.
Also that commit isn't buggy as-is.
The proper fixes tag:
Fixes: 8d5a8011b35d ("bpf: Batch call_rcu callbacks instead of
SLAB_TYPESAFE_BY_RCU.")

Used that while applying.
Thanks for the fix !
patchwork-bot+netdevbpf@kernel.org Sept. 20, 2022, 3:10 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 19 Sep 2022 22:48:11 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> llnode could be NULL if there are new allocations after the checking of
> c-free_cnt > c->high_watermark in bpf_mem_refill() and before the
> calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel
> or allocation in NMI context). And it will incur oops as shown below:
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: Check whether or not node is NULL before free it in free_bulk
    https://git.kernel.org/bpf/bpf-next/c/c31b38cb948e

You are awesome, thank you!
Hou Tao Sept. 21, 2022, 2:55 a.m. UTC | #3
Hi,

On 9/20/2022 11:09 PM, Alexei Starovoitov wrote:
> On Mon, Sep 19, 2022 at 7:30 AM Hou Tao <houtao@huaweicloud.com> wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> llnode could be NULL if there are new allocations after the checking of
>> c-free_cnt > c->high_watermark in bpf_mem_refill() and before the
>> calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel
>> or allocation in NMI context). And it will incur oops as shown below:
>>
>>  BUG: kernel NULL pointer dereference, address: 0000000000000000
>>  #PF: supervisor write access in kernel mode
>>  #PF: error_code(0x0002) - not-present page
>>  PGD 0 P4D 0
>>  Oops: 0002 [#1] PREEMPT_RT SMP
>>  CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G        W          6.0.0-rc6-rt9+ #1
>>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>>  RIP: 0010:bpf_mem_refill+0x66/0x130
>>  ......
>>  Call Trace:
>>   <TASK>
>>   irq_work_single+0x24/0x60
>>   irq_work_run_list+0x24/0x30
>>   run_irq_workd+0x18/0x20
>>   smpboot_thread_fn+0x13f/0x2c0
>>   kthread+0x121/0x140
>>   ? kthread_complete_and_exit+0x20/0x20
>>   ret_from_fork+0x1f/0x30
>>   </TASK>
>>
>> Simply fixing it by checking whether or not llnode is NULL in free_bulk().
>>
>> Fixes: 1376b7c57624 ("bpf: Introduce any context BPF specific memory allocator.")
> There is no such sha.
> Also that commit isn't buggy as-is.
> The proper fixes tag:
> Fixes: 8d5a8011b35d ("bpf: Batch call_rcu callbacks instead of
> SLAB_TYPESAFE_BY_RCU.")
The incorrect git sha-sum is due to rebase on my local branch.
You are right. In 7c8199e24fa0 ("bpf: Introduce any context BPF specific memory
allocator."), free_bulk() calls kfree() and kmem_cache_free() directly, so there
is no such problem.  And in commit 8d5a8011b35d ("bpf: Batch call_rcu callbacks
instead of SLAB_TYPESAFE_BY_RCU."), free_one is replaced by enque_to_free() and
incurs the problem.
>
> Used that while applying.
Thanks for the update.
> Thanks for the fix !
> .
diff mbox series

Patch

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 20621f5407d8..5f83be1d2018 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -277,7 +277,8 @@  static void free_bulk(struct bpf_mem_cache *c)
 		local_dec(&c->active);
 		if (IS_ENABLED(CONFIG_PREEMPT_RT))
 			local_irq_restore(flags);
-		enque_to_free(c, llnode);
+		if (llnode)
+			enque_to_free(c, llnode);
 	} while (cnt > (c->high_watermark + c->low_watermark) / 2);
 
 	/* and drain free_llist_extra */