Message ID | 20231107140702.1891778-2-houtao@huaweicloud.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: Fix the release of inner map | expand |
On 11/7/23 6:06 AM, Hou Tao wrote: > From: Hou Tao <houtao1@huawei.com> > > These three bpf_map_{lookup,update,delete}_elem() helpers are also > available for sleepable bpf program, so add the corresponding lock > assertion for sleepable bpf program, otherwise the following warning > will be reported when a sleepable bpf program manipulates bpf map under > interpreter mode (aka bpf_jit_enable=0): > > WARNING: CPU: 3 PID: 4985 at kernel/bpf/helpers.c:40 ...... > CPU: 3 PID: 4985 Comm: test_progs Not tainted 6.6.0+ #2 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...... > RIP: 0010:bpf_map_lookup_elem+0x54/0x60 > ...... > Call Trace: > <TASK> > ? __warn+0xa5/0x240 > ? bpf_map_lookup_elem+0x54/0x60 > ? report_bug+0x1ba/0x1f0 > ? handle_bug+0x40/0x80 > ? exc_invalid_op+0x18/0x50 > ? asm_exc_invalid_op+0x1b/0x20 > ? __pfx_bpf_map_lookup_elem+0x10/0x10 > ? rcu_lockdep_current_cpu_online+0x65/0xb0 > ? rcu_is_watching+0x23/0x50 > ? bpf_map_lookup_elem+0x54/0x60 > ? __pfx_bpf_map_lookup_elem+0x10/0x10 > ___bpf_prog_run+0x513/0x3b70 > __bpf_prog_run32+0x9d/0xd0 > ? __bpf_prog_enter_sleepable_recur+0xad/0x120 > ? __bpf_prog_enter_sleepable_recur+0x3e/0x120 > bpf_trampoline_6442580665+0x4d/0x1000 > __x64_sys_getpgid+0x5/0x30 > ? do_syscall_64+0x36/0xb0 > entry_SYSCALL_64_after_hwframe+0x6e/0x76 > </TASK> > > Signed-off-by: Hou Tao <houtao1@huawei.com> > --- > kernel/bpf/helpers.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > index 56b0c1f678ee7..f43038931935e 100644 > --- a/kernel/bpf/helpers.c > +++ b/kernel/bpf/helpers.c > @@ -32,12 +32,13 @@ > * > * Different map implementations will rely on rcu in map methods > * lookup/update/delete, therefore eBPF programs must run under rcu lock > - * if program is allowed to access maps, so check rcu_read_lock_held in > - * all three functions. > + * if program is allowed to access maps, so check rcu_read_lock_held() or > + * rcu_read_lock_trace_held() in all three functions. > */ > BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key) > { > - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); > + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && > + !rcu_read_lock_bh_held()); > return (unsigned long) map->ops->map_lookup_elem(map, key); > } > > @@ -53,7 +54,8 @@ const struct bpf_func_proto bpf_map_lookup_elem_proto = { > BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, > void *, value, u64, flags) > { > - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); > + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && > + !rcu_read_lock_bh_held()); > return map->ops->map_update_elem(map, key, value, flags); > } > > @@ -70,7 +72,8 @@ const struct bpf_func_proto bpf_map_update_elem_proto = { > > BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, key) > { > - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); > + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && > + !rcu_read_lock_bh_held()); Should these WARN_ON_ONCE be removed from the helpers instead? For catching error purpose, the ops->map_{lookup,update,delete}_elem are inlined for the jitted case which I believe is the bpf-CI setting also. Meaning the above change won't help to catch error in the common normal case. > return map->ops->map_delete_elem(map, key); > } >
Hi, On 11/9/2023 7:11 AM, Martin KaFai Lau wrote: > On 11/7/23 6:06 AM, Hou Tao wrote: >> From: Hou Tao <houtao1@huawei.com> >> >> These three bpf_map_{lookup,update,delete}_elem() helpers are also >> available for sleepable bpf program, so add the corresponding lock >> assertion for sleepable bpf program, otherwise the following warning >> will be reported when a sleepable bpf program manipulates bpf map under >> interpreter mode (aka bpf_jit_enable=0): >> SNIP >> BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key) >> { >> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >> + WARN_ON_ONCE(!rcu_read_lock_held() && >> !rcu_read_lock_trace_held() && >> + !rcu_read_lock_bh_held()); >> return (unsigned long) map->ops->map_lookup_elem(map, key); >> } >> @@ -53,7 +54,8 @@ const struct bpf_func_proto >> bpf_map_lookup_elem_proto = { >> BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, >> void *, value, u64, flags) >> { >> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >> + WARN_ON_ONCE(!rcu_read_lock_held() && >> !rcu_read_lock_trace_held() && >> + !rcu_read_lock_bh_held()); >> return map->ops->map_update_elem(map, key, value, flags); >> } >> @@ -70,7 +72,8 @@ const struct bpf_func_proto >> bpf_map_update_elem_proto = { >> BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, key) >> { >> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >> + WARN_ON_ONCE(!rcu_read_lock_held() && >> !rcu_read_lock_trace_held() && >> + !rcu_read_lock_bh_held()); > > Should these WARN_ON_ONCE be removed from the helpers instead? > > For catching error purpose, the ops->map_{lookup,update,delete}_elem > are inlined for the jitted case which I believe is the bpf-CI setting > also. Meaning the above change won't help to catch error in the common > normal case. Removing these WARN_ON_ONCE is also an option. Considering JIT is not available for all architectures and there is no KASAN support in JIT, could we enable BPF interpreter mode in BPF CI to find more potential problems ? > >> return map->ops->map_delete_elem(map, key); >> } >> > > > .
On 11/8/23 7:46 PM, Hou Tao wrote: > Hi, > > On 11/9/2023 7:11 AM, Martin KaFai Lau wrote: >> On 11/7/23 6:06 AM, Hou Tao wrote: >>> From: Hou Tao <houtao1@huawei.com> >>> >>> These three bpf_map_{lookup,update,delete}_elem() helpers are also >>> available for sleepable bpf program, so add the corresponding lock >>> assertion for sleepable bpf program, otherwise the following warning >>> will be reported when a sleepable bpf program manipulates bpf map under >>> interpreter mode (aka bpf_jit_enable=0): >>> > SNIP >>> BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key) >>> { >>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>> !rcu_read_lock_trace_held() && >>> + !rcu_read_lock_bh_held()); >>> return (unsigned long) map->ops->map_lookup_elem(map, key); >>> } >>> @@ -53,7 +54,8 @@ const struct bpf_func_proto >>> bpf_map_lookup_elem_proto = { >>> BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, >>> void *, value, u64, flags) >>> { >>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>> !rcu_read_lock_trace_held() && >>> + !rcu_read_lock_bh_held()); >>> return map->ops->map_update_elem(map, key, value, flags); >>> } >>> @@ -70,7 +72,8 @@ const struct bpf_func_proto >>> bpf_map_update_elem_proto = { >>> BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, key) >>> { >>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>> !rcu_read_lock_trace_held() && >>> + !rcu_read_lock_bh_held()); >> >> Should these WARN_ON_ONCE be removed from the helpers instead? >> >> For catching error purpose, the ops->map_{lookup,update,delete}_elem >> are inlined for the jitted case which I believe is the bpf-CI setting >> also. Meaning the above change won't help to catch error in the common >> normal case. > > Removing these WARN_ON_ONCE is also an option. Considering JIT is not > available for all architectures and there is no KASAN support in JIT, > could we enable BPF interpreter mode in BPF CI to find more potential > problems ? ah. The test in patch 11 needs jit to be off because the map_gen_lookup inlined the code? Would it help to use bpf_map_update_elem(inner_map,...) to trigger the issue instead? > >> >>> return map->ops->map_delete_elem(map, key); >>> } >>> >> >> >> . >
Hi, On 11/9/2023 3:02 PM, Martin KaFai Lau wrote: > On 11/8/23 7:46 PM, Hou Tao wrote: >> Hi, >> >> On 11/9/2023 7:11 AM, Martin KaFai Lau wrote: >>> On 11/7/23 6:06 AM, Hou Tao wrote: >>>> From: Hou Tao <houtao1@huawei.com> >>>> >>>> These three bpf_map_{lookup,update,delete}_elem() helpers are also >>>> available for sleepable bpf program, so add the corresponding lock >>>> assertion for sleepable bpf program, otherwise the following warning >>>> will be reported when a sleepable bpf program manipulates bpf map >>>> under >>>> interpreter mode (aka bpf_jit_enable=0): >>>> >> SNIP >>>> BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key) >>>> { >>>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>>> !rcu_read_lock_trace_held() && >>>> + !rcu_read_lock_bh_held()); >>>> return (unsigned long) map->ops->map_lookup_elem(map, key); >>>> } >>>> @@ -53,7 +54,8 @@ const struct bpf_func_proto >>>> bpf_map_lookup_elem_proto = { >>>> BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, >>>> void *, value, u64, flags) >>>> { >>>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>>> !rcu_read_lock_trace_held() && >>>> + !rcu_read_lock_bh_held()); >>>> return map->ops->map_update_elem(map, key, value, flags); >>>> } >>>> @@ -70,7 +72,8 @@ const struct bpf_func_proto >>>> bpf_map_update_elem_proto = { >>>> BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, >>>> key) >>>> { >>>> - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); >>>> + WARN_ON_ONCE(!rcu_read_lock_held() && >>>> !rcu_read_lock_trace_held() && >>>> + !rcu_read_lock_bh_held()); >>> >>> Should these WARN_ON_ONCE be removed from the helpers instead? >>> >>> For catching error purpose, the ops->map_{lookup,update,delete}_elem >>> are inlined for the jitted case which I believe is the bpf-CI setting >>> also. Meaning the above change won't help to catch error in the common >>> normal case. >> >> Removing these WARN_ON_ONCE is also an option. Considering JIT is not >> available for all architectures and there is no KASAN support in JIT, >> could we enable BPF interpreter mode in BPF CI to find more potential >> problems ? > > ah. The test in patch 11 needs jit to be off because the > map_gen_lookup inlined the code? Would it help to use > bpf_map_update_elem(inner_map,...) to trigger the issue instead? Er, I didn't consider that before, but you are right. bpf_map_lookup_elem(inner_map) is inlined by verifier. I think using bpf_map_update_elem() may be able to reproduce the issue under JIT mode. Will try later. > >> >>> >>>> return map->ops->map_delete_elem(map, key); >>>> } >>>> >>> >>> >>> . >> > > .
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 56b0c1f678ee7..f43038931935e 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -32,12 +32,13 @@ * * Different map implementations will rely on rcu in map methods * lookup/update/delete, therefore eBPF programs must run under rcu lock - * if program is allowed to access maps, so check rcu_read_lock_held in - * all three functions. + * if program is allowed to access maps, so check rcu_read_lock_held() or + * rcu_read_lock_trace_held() in all three functions. */ BPF_CALL_2(bpf_map_lookup_elem, struct bpf_map *, map, void *, key) { - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && + !rcu_read_lock_bh_held()); return (unsigned long) map->ops->map_lookup_elem(map, key); } @@ -53,7 +54,8 @@ const struct bpf_func_proto bpf_map_lookup_elem_proto = { BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, void *, value, u64, flags) { - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && + !rcu_read_lock_bh_held()); return map->ops->map_update_elem(map, key, value, flags); } @@ -70,7 +72,8 @@ const struct bpf_func_proto bpf_map_update_elem_proto = { BPF_CALL_2(bpf_map_delete_elem, struct bpf_map *, map, void *, key) { - WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held()); + WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_trace_held() && + !rcu_read_lock_bh_held()); return map->ops->map_delete_elem(map, key); }