Message ID | 20220510211727.575686-3-memxor@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | Follow ups for kptr series | expand |
On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > In an effort to actually test the refcounting logic at runtime, add a > refcount_t member to prog_test_ref_kfunc and use it in selftests to > verify and test the whole logic more exhaustively. > > To ensure reading the count to verify it remains stable, make > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the > count can be read reliably based on number of acquisitions made. Then, > pairing them with releases and reading from the global per-CPU variable > will allow verifying whether release operations put the refcount. The patches look good, but the per-cpu part is a puzzle. The test is not parallel. Everything looks sequential and there are no races. It seems to me if it was static struct prog_test_ref_kfunc prog_test_struct = {..}; and none of [bpf_]this_cpu_ptr() the test would work the same way. What am I missing?
On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote: > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > In an effort to actually test the refcounting logic at runtime, add a > > refcount_t member to prog_test_ref_kfunc and use it in selftests to > > verify and test the whole logic more exhaustively. > > > > To ensure reading the count to verify it remains stable, make > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the > > count can be read reliably based on number of acquisitions made. Then, > > pairing them with releases and reading from the global per-CPU variable > > will allow verifying whether release operations put the refcount. > > The patches look good, but the per-cpu part is a puzzle. > The test is not parallel. Everything looks sequential > and there are no races. > It seems to me if it was > static struct prog_test_ref_kfunc prog_test_struct = {..}; > and none of [bpf_]this_cpu_ptr() > the test would work the same way. > What am I missing? You are not missing anything. It would work the same. I just made it per-CPU for the off chance that someone runs ./test_progs -t map_kptr in parallel on the same machine. Then one or both might fail, since count won't just be inc/dec by us, and reading it would produce something other than what we expect. One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that can also be done by non-ref kfunc returning a pointer to it. If you feel it's not worth it, I will drop it in the next version. -- Kartikeya
On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote: > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > In an effort to actually test the refcounting logic at runtime, add a > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to > > > verify and test the whole logic more exhaustively. > > > > > > To ensure reading the count to verify it remains stable, make > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the > > > count can be read reliably based on number of acquisitions made. Then, > > > pairing them with releases and reading from the global per-CPU variable > > > will allow verifying whether release operations put the refcount. > > > > The patches look good, but the per-cpu part is a puzzle. > > The test is not parallel. Everything looks sequential > > and there are no races. > > It seems to me if it was > > static struct prog_test_ref_kfunc prog_test_struct = {..}; > > and none of [bpf_]this_cpu_ptr() > > the test would work the same way. > > What am I missing? > > You are not missing anything. It would work the same. I just made it per-CPU for > the off chance that someone runs ./test_progs -t map_kptr in parallel on the > same machine. Then one or both might fail, since count won't just be inc/dec by > us, and reading it would produce something other than what we expect. I see. You should have mentioned that in the commit log. But how per-cpu helps in this case? prog_run is executed with cpu=0, so both test_progs -t map_kptr will collide on the same cpu. At the end it's the same. one or both might fail? In general all serial_ tests in test_progs will fail in parallel run. Even non-serial tests might fail. The non-serial tests are ok for test_progs -j. They're parallel between themselves, but there are no guarantees that every individual test can be run parallel with itself. Majority will probably be fine, but not all. > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that > can also be done by non-ref kfunc returning a pointer to it. Not following. non-ref == ptr_untrusted. That doesn't preclude bpf prog from reading refcnt directly, but disallows passing into helpers. So with non-percpu the following hack bpf_kfunc_call_test_release(p); if (p_cpu->cnt.refs.counter ...) wouldn't be necessary. The prog could release(p) and read p->cnt.refs.counter right after. While with per-cpu approach you had to do p_cpu = bpf_this_cpu_ptr(&prog_test_struct); hack and rely on intimate knowledge of the kernel side. > If you feel it's not worth it, I will drop it in the next version. So far I see the downsides. Also check the CI. test_progs-no_alu32 fails: test_map_kptr_fail_prog:PASS:map_kptr__load must fail 0 nsec test_map_kptr_fail_prog:FAIL:expected error message unexpected error: -22 Expected: Unreleased reference id=5 alloc_insn=18 Verifier: 0: R1=ctx(off=0,imm=0) R10=fp0
On Wed, May 11, 2022 at 11:23:59PM IST, Alexei Starovoitov wrote: > On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote: > > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi > > > <memxor@gmail.com> wrote: > > > > > > > > In an effort to actually test the refcounting logic at runtime, add a > > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to > > > > verify and test the whole logic more exhaustively. > > > > > > > > To ensure reading the count to verify it remains stable, make > > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the > > > > count can be read reliably based on number of acquisitions made. Then, > > > > pairing them with releases and reading from the global per-CPU variable > > > > will allow verifying whether release operations put the refcount. > > > > > > The patches look good, but the per-cpu part is a puzzle. > > > The test is not parallel. Everything looks sequential > > > and there are no races. > > > It seems to me if it was > > > static struct prog_test_ref_kfunc prog_test_struct = {..}; > > > and none of [bpf_]this_cpu_ptr() > > > the test would work the same way. > > > What am I missing? > > > > You are not missing anything. It would work the same. I just made it per-CPU for > > the off chance that someone runs ./test_progs -t map_kptr in parallel on the > > same machine. Then one or both might fail, since count won't just be inc/dec by > > us, and reading it would produce something other than what we expect. > > I see. You should have mentioned that in the commit log. > But how per-cpu helps in this case? > prog_run is executed with cpu=0, so both test_progs -t map_kptr > will collide on the same cpu. Right, I was thinking bpf_prog_run disabled preemption, so that would prevent collisions, but it seems my knowledge is now outdated (only migration is disabled). Also, just realising, we rely on observing a specific count across test_run invocations, which won't be protected against for parallel runs anyway. > At the end it's the same. one or both might fail? > > In general all serial_ tests in test_progs will fail in > parallel run. > Even non-serial tests might fail. > The non-serial tests are ok for test_progs -j. > They're parallel between themselves, but there are no guarantees > that every individual test can be run parallel with itself. > Majority will probably be fine, but not all. > I'll drop it and go with a global struct. > > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to > > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that > > can also be done by non-ref kfunc returning a pointer to it. > > Not following. non-ref == ptr_untrusted. That doesn't preclude By non-ref PTR_TO_BTF_ID I meant normal (not untrusted) PTR_TO_BTF_ID with ref_obj_id = 0. bpf_this_cpu_ptr returns a normal PTR_TO_BTF_ID, not an untrusted one. > bpf prog from reading refcnt directly, but disallows passing > into helpers. > So with non-percpu the following hack > bpf_kfunc_call_test_release(p); > if (p_cpu->cnt.refs.counter ...) > wouldn't be necessary. > The prog could release(p) and read p->cnt.refs.counter right after. release(p) will kill p, so that won't work. I have a better idea, since p->next points to itself, just loading that will give me a pointer I can read after release(p). As an aside, do you think we should change the behaviour of killing released registers and skip it for refcounted PTR_TO_BTF_ID (perhaps mark it as untrusted pointer instead, with ref_obj_id reset to zero)? So loads are allowed into it, but passing into the kernel isn't, wdyt? p = acq(); // p.type = PTR_TO_BTF_ID, ref_obj_id=X foo(p); // works bar(p->a + p->b); // works rel(p); // p.type = PTR_TO_BTF_ID | PTR_UNTRUSTED, ref_obj_id=0 // Instead of mark_reg_unknown(p) There is still the case where you can do: p2 = p->next; rel(p); p3 = p->next; Now p2 is trusted PTR_TO_BTF_ID, while p3 is untrusted, but this is a separate problem which requires a more general fix, and needs more discussion. A bit of a digression, but I would like to know what you and other BPF developers think. So far my thinking (culminating towards an RFC) is this: For a refcounted PTR_TO_BTF_ID, it is marked as trusted. When loading from it, by default all loads yield untrusted pointers, except those which are specifically marked with some annotation ("bpf_ptr_trust") which indicates that parent holds reference to member pointer. This is a loose description to mean that for the lifetime of trusted parent pointer, member pointer may also be trusted. If lifetime can end (due to release), trusted member pointers will become untrusted. If it cannot (e.g. function arguments), it remains valid. This will use BTF tags. Known cases in the kernel which are useful and safe can be whitelisted. Such loads yield trusted pointers linked to refcounted PTR_TO_BTF_ID. Linked means the source refcounted PTR_TO_BTF_ID owns it. When releasing PTR_TO_BTF_ID, all registers with same ref_obj_id, and all linked PTR_TO_BTF_ID are marked as untrusted. As an example: struct foo { struct bar __ref *br; struct baz *bz; }; struct foo *f = acq(); // f.type = PTR_TO_BTF_ID, ref_obj_id=X br = f->br; // br.type = PTR_TO_BTF_ID, linked_to=X bz = f->bz; // bz.type = PTR_TO_BTF_ID | PTR_UNTRUSTED rel(f); // f.type = PTR_TO_BTF_ID | PTR_UNTRUSTED // and since br.linked_to == f.ref_obj_id, // br.type = PTR_TO_BTF_ID | PTR_UNTRUSTED For trusted loads from br, linked_to will be same as X, so they will also be marked as untrusted, and so on. For tp_btf/LSM programs, pointer arguments will be non-refcounted trusted PTR_TO_BTF_ID. All rules as above apply, but since it cannot be released, trusted pointers obtained from them remain valid till BPF_EXIT. I have no idea how much backwards compat this will break, or how much of it can be tolerated. > While with per-cpu approach you had to do > p_cpu = bpf_this_cpu_ptr(&prog_test_struct); > hack and rely on intimate knowledge of the kernel side. > > > If you feel it's not worth it, I will drop it in the next version. > > So far I see the downsides. > Also check the CI. test_progs-no_alu32 fails: > test_map_kptr_fail_prog:PASS:map_kptr__load must fail 0 nsec > test_map_kptr_fail_prog:FAIL:expected error message unexpected error: -22 > Expected: Unreleased reference id=5 alloc_insn=18 > Verifier: 0: R1=ctx(off=0,imm=0) R10=fp0 Yes, noticed that. It is because alloc_insn= is different for no_alu32. I'll drop the matching on specific insn idx number. Thanks for your feedback. -- Kartikeya
On Thu, May 12, 2022 at 12:37:24AM +0530, Kumar Kartikeya Dwivedi wrote: > On Wed, May 11, 2022 at 11:23:59PM IST, Alexei Starovoitov wrote: > > On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi > > <memxor@gmail.com> wrote: > > > > > > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote: > > > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi > > > > <memxor@gmail.com> wrote: > > > > > > > > > > In an effort to actually test the refcounting logic at runtime, add a > > > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to > > > > > verify and test the whole logic more exhaustively. > > > > > > > > > > To ensure reading the count to verify it remains stable, make > > > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the > > > > > count can be read reliably based on number of acquisitions made. Then, > > > > > pairing them with releases and reading from the global per-CPU variable > > > > > will allow verifying whether release operations put the refcount. > > > > > > > > The patches look good, but the per-cpu part is a puzzle. > > > > The test is not parallel. Everything looks sequential > > > > and there are no races. > > > > It seems to me if it was > > > > static struct prog_test_ref_kfunc prog_test_struct = {..}; > > > > and none of [bpf_]this_cpu_ptr() > > > > the test would work the same way. > > > > What am I missing? > > > > > > You are not missing anything. It would work the same. I just made it per-CPU for > > > the off chance that someone runs ./test_progs -t map_kptr in parallel on the > > > same machine. Then one or both might fail, since count won't just be inc/dec by > > > us, and reading it would produce something other than what we expect. > > > > I see. You should have mentioned that in the commit log. > > But how per-cpu helps in this case? > > prog_run is executed with cpu=0, so both test_progs -t map_kptr > > will collide on the same cpu. > > Right, I was thinking bpf_prog_run disabled preemption, so that would prevent > collisions, but it seems my knowledge is now outdated (only migration is > disabled). Also, just realising, we rely on observing a specific count across > test_run invocations, which won't be protected against for parallel runs > anyway. > > > At the end it's the same. one or both might fail? > > > > In general all serial_ tests in test_progs will fail in > > parallel run. > > Even non-serial tests might fail. > > The non-serial tests are ok for test_progs -j. > > They're parallel between themselves, but there are no guarantees > > that every individual test can be run parallel with itself. > > Majority will probably be fine, but not all. > > > > I'll drop it and go with a global struct. > > > > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to > > > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that > > > can also be done by non-ref kfunc returning a pointer to it. > > > > Not following. non-ref == ptr_untrusted. That doesn't preclude > > By non-ref PTR_TO_BTF_ID I meant normal (not untrusted) PTR_TO_BTF_ID with > ref_obj_id = 0. > > bpf_this_cpu_ptr returns a normal PTR_TO_BTF_ID, not an untrusted one. > > > bpf prog from reading refcnt directly, but disallows passing > > into helpers. > > So with non-percpu the following hack > > bpf_kfunc_call_test_release(p); > > if (p_cpu->cnt.refs.counter ...) > > wouldn't be necessary. > > The prog could release(p) and read p->cnt.refs.counter right after. > > release(p) will kill p, so that won't work. I have a better idea, since > p->next points to itself, just loading that will give me a pointer I can > read after release(p). > > As an aside, do you think we should change the behaviour of killing released > registers and skip it for refcounted PTR_TO_BTF_ID (perhaps mark it as untrusted > pointer instead, with ref_obj_id reset to zero)? So loads are allowed into it, > but passing into the kernel isn't, wdyt? > > p = acq(); // p.type = PTR_TO_BTF_ID, ref_obj_id=X > foo(p); // works > bar(p->a + p->b); // works > rel(p); // p.type = PTR_TO_BTF_ID | PTR_UNTRUSTED, ref_obj_id=0 > // Instead of mark_reg_unknown(p) > > There is still the case where you can do: > p2 = p->next; > rel(p); > p3 = p->next; It's probably better to keep existing behavior since acquire/release is mainly used in the networking context today. Probe reading a socket after release is technically safe, but right now such usage will be rejected by the verifier and the user will have to fix such bug. If we relax it such bugs might be much harder to spot. > Now p2 is trusted PTR_TO_BTF_ID, while p3 is untrusted, but this is a separate > problem which requires a more general fix, and needs more discussion. It might not be a bug. p3 might still be trusted and valid pointer. rel(p) releases p only. The verifier doesn't know semantics. It's a general link list walking issue. It needs separate discussion. > A bit of a digression, but I would like to know what you and other BPF > developers think. > > So far my thinking (culminating towards an RFC) is this: > > For a refcounted PTR_TO_BTF_ID, it is marked as trusted. > > When loading from it, by default all loads yield untrusted pointers, except > those which are specifically marked with some annotation ("bpf_ptr_trust") which > indicates that parent holds reference to member pointer. This is a loose > description to mean that for the lifetime of trusted parent pointer, member > pointer may also be trusted. If lifetime can end (due to release), trusted > member pointers will become untrusted. If it cannot (e.g. function arguments), > it remains valid. > > This will use BTF tags. > Known cases in the kernel which are useful and safe can be whitelisted. > > Such loads yield trusted pointers linked to refcounted PTR_TO_BTF_ID. Linked > means the source refcounted PTR_TO_BTF_ID owns it. > > When releasing PTR_TO_BTF_ID, all registers with same ref_obj_id, and all linked > PTR_TO_BTF_ID are marked as untrusted. > > As an example: > > struct foo { > struct bar __ref *br; > struct baz *bz; > }; > > struct foo *f = acq(); // f.type = PTR_TO_BTF_ID, ref_obj_id=X > br = f->br; // br.type = PTR_TO_BTF_ID, linked_to=X > bz = f->bz; // bz.type = PTR_TO_BTF_ID | PTR_UNTRUSTED > rel(f); // f.type = PTR_TO_BTF_ID | PTR_UNTRUSTED > // and since br.linked_to == f.ref_obj_id, > // br.type = PTR_TO_BTF_ID | PTR_UNTRUSTED > > For trusted loads from br, linked_to will be same as X, so they will also be > marked as untrusted, and so on. > > For tp_btf/LSM programs, pointer arguments will be non-refcounted trusted > PTR_TO_BTF_ID. All rules as above apply, but since it cannot be released, > trusted pointers obtained from them remain valid till BPF_EXIT. > > I have no idea how much backwards compat this will break, or how much of it can > be tolerated. Exactly. It's not practical to mark all such fields in the kernel with __ref tags. bpf_lsm is already in the wild and is using multi level pointer walks and then passes them into helpers. We cannot break them. In other words if you try really hard you can crash the kernel. bpf tracing is only 99.99% safe. I'll start a separate thread about link lists in bpf.
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 7a1579c91432..16d489b03000 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -564,31 +564,39 @@ struct prog_test_ref_kfunc { int b; struct prog_test_member memb; struct prog_test_ref_kfunc *next; + refcount_t cnt; }; -static struct prog_test_ref_kfunc prog_test_struct = { +DEFINE_PER_CPU(struct prog_test_ref_kfunc, prog_test_struct) = { .a = 42, .b = 108, - .next = &prog_test_struct, + .cnt = REFCOUNT_INIT(1), }; noinline struct prog_test_ref_kfunc * bpf_kfunc_call_test_acquire(unsigned long *scalar_ptr) { - /* randomly return NULL */ - if (get_jiffies_64() % 2) - return NULL; - return &prog_test_struct; + struct prog_test_ref_kfunc *p; + + p = this_cpu_ptr(&prog_test_struct); + p->next = p; + refcount_inc(&p->cnt); + return p; } noinline struct prog_test_member * bpf_kfunc_call_memb_acquire(void) { - return &prog_test_struct.memb; + WARN_ON_ONCE(1); + return NULL; } noinline void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) { + if (!p) + return; + + refcount_dec(&p->cnt); } noinline void bpf_kfunc_call_memb_release(struct prog_test_member *p) @@ -597,12 +605,18 @@ noinline void bpf_kfunc_call_memb_release(struct prog_test_member *p) noinline void bpf_kfunc_call_memb1_release(struct prog_test_member1 *p) { + WARN_ON_ONCE(1); } noinline struct prog_test_ref_kfunc * -bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **p, int a, int b) +bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **pp, int a, int b) { - return &prog_test_struct; + struct prog_test_ref_kfunc *p = READ_ONCE(*pp); + + if (!p) + return NULL; + refcount_inc(&p->cnt); + return p; } struct prog_test_pass1 { diff --git a/tools/testing/selftests/bpf/verifier/map_kptr.c b/tools/testing/selftests/bpf/verifier/map_kptr.c index 9113834640e6..6914904344c0 100644 --- a/tools/testing/selftests/bpf/verifier/map_kptr.c +++ b/tools/testing/selftests/bpf/verifier/map_kptr.c @@ -212,13 +212,13 @@ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), BPF_EXIT_INSN(), - BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 24), + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 32), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SCHED_CLS, .fixup_map_kptr = { 1 }, .result = REJECT, - .errstr = "access beyond struct prog_test_ref_kfunc at off 24 size 8", + .errstr = "access beyond struct prog_test_ref_kfunc at off 32 size 8", }, { "map_kptr: unref: inherit PTR_UNTRUSTED on struct walk",
In an effort to actually test the refcounting logic at runtime, add a refcount_t member to prog_test_ref_kfunc and use it in selftests to verify and test the whole logic more exhaustively. To ensure reading the count to verify it remains stable, make prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the count can be read reliably based on number of acquisitions made. Then, pairing them with releases and reading from the global per-CPU variable will allow verifying whether release operations put the refcount. We don't actually rely on preemption being disabled, but migration must be disabled, as BPF program is the only context where these acquire and release functions are called. As such, when an object is acquired on one CPU, only that CPU should manipulate its refcount. Likewise, for bpf_this_cpu_ptr, the returned pointer and acquired pointer must match, so that refcount can actually be read and verified. The kfunc calls for prog_test_member do not require runtime refcounting, as they are only used for verifier selftests, not during runtime execution. Hence, their implementation now has a WARN_ON_ONCE as it is not meant to be reachable code at runtime. It is strictly used in tests triggering failure cases in the verifier. bpf_kfunc_call_memb_release is called from map free path, since prog_test_member is embedded in map value for some verifier tests, so we skip WARN_ON_ONCE for it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> --- net/bpf/test_run.c | 32 +++++++++++++------ .../testing/selftests/bpf/verifier/map_kptr.c | 4 +-- 2 files changed, 25 insertions(+), 11 deletions(-)