
[bpf-next,v1,2/4] bpf: Prepare prog_test_struct kfuncs for runtime tests

Message ID 20220510211727.575686-3-memxor@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Series Follow ups for kptr series

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 29 this patch: 30
netdev/cc_maintainers warning 11 maintainers not CCed: netdev@vger.kernel.org kafai@fb.com john.fastabend@gmail.com pabeni@redhat.com kuba@kernel.org shuah@kernel.org yhs@fb.com kpsingh@kernel.org davem@davemloft.net linux-kselftest@vger.kernel.org songliubraving@fb.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 29 this patch: 30
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 81 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 fail Logs for Kernel LATEST on ubuntu-latest + selftests
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Kernel LATEST on z15 + selftests

Commit Message

Kumar Kartikeya Dwivedi May 10, 2022, 9:17 p.m. UTC
In an effort to actually test the refcounting logic at runtime, add a
refcount_t member to prog_test_ref_kfunc and use it in selftests to
verify and test the whole logic more exhaustively.

To ensure reading the count to verify it remains stable, make
prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
count can be read reliably based on number of acquisitions made. Then,
pairing them with releases and reading from the global per-CPU variable
will allow verifying whether release operations put the refcount.

We don't actually rely on preemption being disabled, but migration must
be disabled, as a BPF program is the only context where these acquire and
release functions are called. As such, when an object is acquired on one
CPU, only that CPU should manipulate its refcount. Likewise, for
bpf_this_cpu_ptr, the returned pointer and the acquired pointer must match,
so that the refcount can actually be read and verified.

The kfunc calls for prog_test_member do not require runtime refcounting,
as they are only used for verifier selftests, not during runtime
execution. Hence, their implementation now has a WARN_ON_ONCE, as they are
not meant to be reachable code at runtime; they are strictly used in tests
triggering failure cases in the verifier. bpf_kfunc_call_memb_release is
called from the map free path, since prog_test_member is embedded in the
map value for some verifier tests, so we skip the WARN_ON_ONCE for it.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 net/bpf/test_run.c                            | 32 +++++++++++++------
 .../testing/selftests/bpf/verifier/map_kptr.c |  4 +--
 2 files changed, 25 insertions(+), 11 deletions(-)
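
For illustration, a BPF-side check along the lines the commit message
describes might look like the sketch below. This is not the selftest added
later in this series; only the kfunc names, the per-CPU prog_test_struct
symbol and the cnt field come from this patch, everything else (section,
program and variable names) is made up.

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

/* struct prog_test_ref_kfunc is picked up from vmlinux.h (kernel BTF) */
extern struct prog_test_ref_kfunc *
bpf_kfunc_call_test_acquire(unsigned long *sp) __ksym;
extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;
extern const struct prog_test_ref_kfunc prog_test_struct __ksym;

int err;

SEC("tc")
int check_kfunc_refcount(struct __sk_buff *ctx)
{
	struct prog_test_ref_kfunc *p, *p_cpu;
	unsigned long sp = 0;

	p = bpf_kfunc_call_test_acquire(&sp);
	if (!p)
		return 0;

	/* Read the count through the global per-CPU variable, since the
	 * acquired register is killed by the verifier once released.
	 */
	p_cpu = bpf_this_cpu_ptr(&prog_test_struct);

	if (p_cpu->cnt.refs.counter != 2)	/* initial 1 + our acquire */
		err = 1;

	bpf_kfunc_call_test_release(p);

	if (p_cpu->cnt.refs.counter != 1)	/* release must put the ref */
		err = 2;

	return 0;
}

char _license[] SEC("license") = "GPL";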

Comments

Alexei Starovoitov May 11, 2022, 4:37 a.m. UTC | #1
On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> In an effort to actually test the refcounting logic at runtime, add a
> refcount_t member to prog_test_ref_kfunc and use it in selftests to
> verify and test the whole logic more exhaustively.
>
> To ensure reading the count to verify it remains stable, make
> prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> count can be read reliably based on number of acquisitions made. Then,
> pairing them with releases and reading from the global per-CPU variable
> will allow verifying whether release operations put the refcount.

The patches look good, but the per-cpu part is a puzzle.
The test is not parallel. Everything looks sequential
and there are no races.
It seems to me if it was
static struct prog_test_ref_kfunc prog_test_struct = {..};
and none of [bpf_]this_cpu_ptr()
the test would work the same way.
What am I missing?
Kumar Kartikeya Dwivedi May 11, 2022, 6:02 a.m. UTC | #2
On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote:
> On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > In an effort to actually test the refcounting logic at runtime, add a
> > refcount_t member to prog_test_ref_kfunc and use it in selftests to
> > verify and test the whole logic more exhaustively.
> >
> > To ensure reading the count to verify it remains stable, make
> > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> > count can be read reliably based on number of acquisitions made. Then,
> > pairing them with releases and reading from the global per-CPU variable
> > will allow verifying whether release operations put the refcount.
>
> The patches look good, but the per-cpu part is a puzzle.
> The test is not parallel. Everything looks sequential
> and there are no races.
> It seems to me if it was
> static struct prog_test_ref_kfunc prog_test_struct = {..};
> and none of [bpf_]this_cpu_ptr()
> the test would work the same way.
> What am I missing?

You are not missing anything. It would work the same. I just made it per-CPU for
the off chance that someone runs ./test_progs -t map_kptr in parallel on the
same machine. Then one or both might fail, since count won't just be inc/dec by
us, and reading it would produce something other than what we expect.

One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to
inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that
can also be done by non-ref kfunc returning a pointer to it.

If you feel it's not worth it, I will drop it in the next version.

--
Kartikeya
Alexei Starovoitov May 11, 2022, 5:53 p.m. UTC | #3
On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote:
> > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > In an effort to actually test the refcounting logic at runtime, add a
> > > refcount_t member to prog_test_ref_kfunc and use it in selftests to
> > > verify and test the whole logic more exhaustively.
> > >
> > > To ensure reading the count to verify it remains stable, make
> > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> > > count can be read reliably based on number of acquisitions made. Then,
> > > pairing them with releases and reading from the global per-CPU variable
> > > will allow verifying whether release operations put the refcount.
> >
> > The patches look good, but the per-cpu part is a puzzle.
> > The test is not parallel. Everything looks sequential
> > and there are no races.
> > It seems to me if it was
> > static struct prog_test_ref_kfunc prog_test_struct = {..};
> > and none of [bpf_]this_cpu_ptr()
> > the test would work the same way.
> > What am I missing?
>
> You are not missing anything. It would work the same. I just made it per-CPU for
> the off chance that someone runs ./test_progs -t map_kptr in parallel on the
> same machine. Then one or both might fail, since count won't just be inc/dec by
> us, and reading it would produce something other than what we expect.

I see. You should have mentioned that in the commit log.
But how per-cpu helps in this case?
prog_run is executed with cpu=0, so both test_progs -t map_kptr
will collide on the same cpu.
At the end it's the same. one or both might fail?

In general all serial_ tests in test_progs will fail in
parallel run.
Even non-serial tests might fail.
The non-serial tests are ok for test_progs -j.
They're parallel between themselves, but there are no guarantees
that every individual test can be run parallel with itself.
Majority will probably be fine, but not all.

> One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to
> inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that
> can also be done by non-ref kfunc returning a pointer to it.

Not following. non-ref == ptr_untrusted. That doesn't preclude
bpf prog from reading refcnt directly, but disallows passing
into helpers.
So with non-percpu the following hack
 bpf_kfunc_call_test_release(p);
 if (p_cpu->cnt.refs.counter ...)
wouldn't be necessary.
The prog could release(p) and read p->cnt.refs.counter right after.
While with per-cpu approach you had to do
p_cpu = bpf_this_cpu_ptr(&prog_test_struct);
hack and rely on intimate knowledge of the kernel side.

> If you feel it's not worth it, I will drop it in the next version.

So far I see the downsides.
Also check the CI. test_progs-no_alu32 fails:
test_map_kptr_fail_prog:PASS:map_kptr__load must fail 0 nsec
test_map_kptr_fail_prog:FAIL:expected error message unexpected error: -22
Expected: Unreleased reference id=5 alloc_insn=18
Verifier: 0: R1=ctx(off=0,imm=0) R10=fp0
Kumar Kartikeya Dwivedi May 11, 2022, 7:07 p.m. UTC | #4
On Wed, May 11, 2022 at 11:23:59PM IST, Alexei Starovoitov wrote:
> On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote:
> > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > In an effort to actually test the refcounting logic at runtime, add a
> > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to
> > > > verify and test the whole logic more exhaustively.
> > > >
> > > > To ensure reading the count to verify it remains stable, make
> > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> > > > count can be read reliably based on number of acquisitions made. Then,
> > > > pairing them with releases and reading from the global per-CPU variable
> > > > will allow verifying whether release operations put the refcount.
> > >
> > > The patches look good, but the per-cpu part is a puzzle.
> > > The test is not parallel. Everything looks sequential
> > > and there are no races.
> > > It seems to me if it was
> > > static struct prog_test_ref_kfunc prog_test_struct = {..};
> > > and none of [bpf_]this_cpu_ptr()
> > > the test would work the same way.
> > > What am I missing?
> >
> > You are not missing anything. It would work the same. I just made it per-CPU for
> > the off chance that someone runs ./test_progs -t map_kptr in parallel on the
> > same machine. Then one or both might fail, since count won't just be inc/dec by
> > us, and reading it would produce something other than what we expect.
>
> I see. You should have mentioned that in the commit log.
> But how per-cpu helps in this case?
> prog_run is executed with cpu=0, so both test_progs -t map_kptr
> will collide on the same cpu.

Right, I was thinking bpf_prog_run disabled preemption, so that would prevent
collisions, but it seems my knowledge is now outdated (only migration is
disabled). Also, just realising, we rely on observing a specific count across
test_run invocations, which won't be protected against for parallel runs
anyway.

> At the end it's the same. one or both might fail?
>
> In general all serial_ tests in test_progs will fail in
> parallel run.
> Even non-serial tests might fail.
> The non-serial tests are ok for test_progs -j.
> They're parallel between themselves, but there are no guarantees
> that every individual test can be run parallel with itself.
> Majority will probably be fine, but not all.
>

I'll drop it and go with a global struct.

> > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to
> > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that
> > can also be done by non-ref kfunc returning a pointer to it.
>
> Not following. non-ref == ptr_untrusted. That doesn't preclude

By non-ref PTR_TO_BTF_ID I meant normal (not untrusted) PTR_TO_BTF_ID with
ref_obj_id = 0.

bpf_this_cpu_ptr returns a normal PTR_TO_BTF_ID, not an untrusted one.

> bpf prog from reading refcnt directly, but disallows passing
> into helpers.
> So with non-percpu the following hack
>  bpf_kfunc_call_test_release(p);
>  if (p_cpu->cnt.refs.counter ...)
> wouldn't be necessary.
> The prog could release(p) and read p->cnt.refs.counter right after.

release(p) will kill p, so that won't work. I have a better idea, since
p->next points to itself, just loading that will give me a pointer I can
read after release(p).
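
Roughly, for this patch's kfuncs (sketch, 'err' being whatever the selftest
reports back to userspace):

	struct prog_test_ref_kfunc *p, *p_next;
	unsigned long sp = 0;

	p = bpf_kfunc_call_test_acquire(&sp);
	if (!p)
		return 0;
	p_next = p->next;                   /* p->next == p, but ref_obj_id == 0 */
	bpf_kfunc_call_test_release(p);     /* kills p, p_next stays loadable */
	if (p_next->cnt.refs.counter != 1)  /* back to the initial count */
		err = 1;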

As an aside, do you think we should change the behaviour of killing released
registers and skip it for refcounted PTR_TO_BTF_ID (perhaps mark it as untrusted
pointer instead, with ref_obj_id reset to zero)? So loads are allowed into it,
but passing into the kernel isn't, wdyt?

p = acq();	  // p.type = PTR_TO_BTF_ID, ref_obj_id=X
foo(p);		  // works
bar(p->a + p->b); // works
rel(p);		  // p.type = PTR_TO_BTF_ID | PTR_UNTRUSTED, ref_obj_id=0
		  // Instead of mark_reg_unknown(p)

There is still the case where you can do:
p2 = p->next;
rel(p);
p3 = p->next;

Now p2 is trusted PTR_TO_BTF_ID, while p3 is untrusted, but this is a separate
problem which requires a more general fix, and needs more discussion.

A bit of a digression, but I would like to know what you and other BPF
developers think.

So far my thinking (culminating towards an RFC) is this:

For a refcounted PTR_TO_BTF_ID, it is marked as trusted.

When loading from it, by default all loads yield untrusted pointers, except
those which are specifically marked with some annotation ("bpf_ptr_trust") which
indicates that parent holds reference to member pointer. This is a loose
description to mean that for the lifetime of trusted parent pointer, member
pointer may also be trusted. If lifetime can end (due to release), trusted
member pointers will become untrusted. If it cannot (e.g. function arguments),
it remains valid.

This will use BTF tags.
Known cases in the kernel which are useful and safe can be whitelisted.
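
e.g. the __ref marker in the example below could hypothetically be spelled
as a BTF type tag (nothing like this exists today, it is only to make the
idea concrete):

#if __has_attribute(btf_type_tag)
#define __ref __attribute__((btf_type_tag("bpf_ptr_trust")))
#else
#define __ref
#endif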

Such loads yield trusted pointers linked to refcounted PTR_TO_BTF_ID. Linked
means the source refcounted PTR_TO_BTF_ID owns it.

When releasing PTR_TO_BTF_ID, all registers with same ref_obj_id, and all linked
PTR_TO_BTF_ID are marked as untrusted.

As an example:

struct foo {
	struct bar __ref *br;
	struct baz *bz;
};

struct foo *f = acq(); // f.type = PTR_TO_BTF_ID, ref_obj_id=X
br = f->br;	       // br.type = PTR_TO_BTF_ID, linked_to=X
bz = f->bz;	       // bz.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
rel(f);		       // f.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
		       // and since br.linked_to == f.ref_obj_id,
		       // br.type = PTR_TO_BTF_ID | PTR_UNTRUSTED

For trusted loads from br, linked_to will be same as X, so they will also be
marked as untrusted, and so on.

For tp_btf/LSM programs, pointer arguments will be non-refcounted trusted
PTR_TO_BTF_ID. All rules as above apply, but since it cannot be released,
trusted pointers obtained from them remain valid till BPF_EXIT.

I have no idea how much backwards compat this will break, or how much of it can
be tolerated.

> While with per-cpu approach you had to do
> p_cpu = bpf_this_cpu_ptr(&prog_test_struct);
> hack and rely on intimate knowledge of the kernel side.
>
> > If you feel it's not worth it, I will drop it in the next version.
>
> So far I see the downsides.
> Also check the CI. test_progs-no_alu32 fails:
> test_map_kptr_fail_prog:PASS:map_kptr__load must fail 0 nsec
> test_map_kptr_fail_prog:FAIL:expected error message unexpected error: -22
> Expected: Unreleased reference id=5 alloc_insn=18
> Verifier: 0: R1=ctx(off=0,imm=0) R10=fp0

Yes, noticed that. It is because alloc_insn= is different for no_alu32. I'll
drop the matching on specific insn idx number.

Thanks for your feedback.

--
Kartikeya
Alexei Starovoitov May 12, 2022, 12:28 a.m. UTC | #5
On Thu, May 12, 2022 at 12:37:24AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Wed, May 11, 2022 at 11:23:59PM IST, Alexei Starovoitov wrote:
> > On Tue, May 10, 2022 at 11:01 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Wed, May 11, 2022 at 10:07:35AM IST, Alexei Starovoitov wrote:
> > > > On Tue, May 10, 2022 at 2:17 PM Kumar Kartikeya Dwivedi
> > > > <memxor@gmail.com> wrote:
> > > > >
> > > > > In an effort to actually test the refcounting logic at runtime, add a
> > > > > refcount_t member to prog_test_ref_kfunc and use it in selftests to
> > > > > verify and test the whole logic more exhaustively.
> > > > >
> > > > > To ensure reading the count to verify it remains stable, make
> > > > > prog_test_ref_kfunc a per-CPU variable, so that inside a BPF program the
> > > > > count can be read reliably based on number of acquisitions made. Then,
> > > > > pairing them with releases and reading from the global per-CPU variable
> > > > > will allow verifying whether release operations put the refcount.
> > > >
> > > > The patches look good, but the per-cpu part is a puzzle.
> > > > The test is not parallel. Everything looks sequential
> > > > and there are no races.
> > > > It seems to me if it was
> > > > static struct prog_test_ref_kfunc prog_test_struct = {..};
> > > > and none of [bpf_]this_cpu_ptr()
> > > > the test would work the same way.
> > > > What am I missing?
> > >
> > > You are not missing anything. It would work the same. I just made it per-CPU for
> > > the off chance that someone runs ./test_progs -t map_kptr in parallel on the
> > > same machine. Then one or both might fail, since count won't just be inc/dec by
> > > us, and reading it would produce something other than what we expect.
> >
> > I see. You should have mentioned that in the commit log.
> > But how per-cpu helps in this case?
> > prog_run is executed with cpu=0, so both test_progs -t map_kptr
> > will collide on the same cpu.
> 
> Right, I was thinking bpf_prog_run disabled preemption, so that would prevent
> collisions, but it seems my knowledge is now outdated (only migration is
> disabled). Also, just realising, we rely on observing a specific count across
> test_run invocations, which won't be protected against for parallel runs
> anyway.
> 
> > At the end it's the same. one or both might fail?
> >
> > In general all serial_ tests in test_progs will fail in
> > parallel run.
> > Even non-serial tests might fail.
> > The non-serial tests are ok for test_progs -j.
> > They're parallel between themselves, but there are no guarantees
> > that every individual test can be run parallel with itself.
> > Majority will probably be fine, but not all.
> >
> 
> I'll drop it and go with a global struct.
> 
> > > One other benefit is getting non-ref PTR_TO_BTF_ID to prog_test_struct to
> > > inspect cnt after releasing acquired pointer (using bpf_this_cpu_ptr), but that
> > > can also be done by non-ref kfunc returning a pointer to it.
> >
> > Not following. non-ref == ptr_untrusted. That doesn't preclude
> 
> By non-ref PTR_TO_BTF_ID I meant normal (not untrusted) PTR_TO_BTF_ID with
> ref_obj_id = 0.
> 
> bpf_this_cpu_ptr returns a normal PTR_TO_BTF_ID, not an untrusted one.
> 
> > bpf prog from reading refcnt directly, but disallows passing
> > into helpers.
> > So with non-percpu the following hack
> >  bpf_kfunc_call_test_release(p);
> >  if (p_cpu->cnt.refs.counter ...)
> > wouldn't be necessary.
> > The prog could release(p) and read p->cnt.refs.counter right after.
> 
> release(p) will kill p, so that won't work. I have a better idea, since
> p->next points to itself, just loading that will give me a pointer I can
> read after release(p).
> 
> As an aside, do you think we should change the behaviour of killing released
> registers and skip it for refcounted PTR_TO_BTF_ID (perhaps mark it as untrusted
> pointer instead, with ref_obj_id reset to zero)? So loads are allowed into it,
> but passing into the kernel isn't, wdyt?
> 
> p = acq();	  // p.type = PTR_TO_BTF_ID, ref_obj_id=X
> foo(p);		  // works
> bar(p->a + p->b); // works
> rel(p);		  // p.type = PTR_TO_BTF_ID | PTR_UNTRUSTED, ref_obj_id=0
> 		  // Instead of mark_reg_unknown(p)
> 
> There is still the case where you can do:
> p2 = p->next;
> rel(p);
> p3 = p->next;

It's probably better to keep the existing behavior since acquire/release is
mainly used in the networking context today.
Probe-reading a socket after release is technically safe, but right now
such usage will be rejected by the verifier and the user will have to
fix such a bug. If we relax it, such bugs might be much harder to spot.

> Now p2 is trusted PTR_TO_BTF_ID, while p3 is untrusted, but this is a separate
> problem which requires a more general fix, and needs more discussion.

It might not be a bug. p3 might still be a trusted and valid pointer.
rel(p) releases p only.
The verifier doesn't know the semantics.
It's a general linked list walking issue.
It needs separate discussion.

> A bit of a digression, but I would like to know what you and other BPF
> developers think.
> 
> So far my thinking (culminating towards an RFC) is this:
> 
> For a refcounted PTR_TO_BTF_ID, it is marked as trusted.
> 
> When loading from it, by default all loads yield untrusted pointers, except
> those which are specifically marked with some annotation ("bpf_ptr_trust") which
> indicates that parent holds reference to member pointer. This is a loose
> description to mean that for the lifetime of trusted parent pointer, member
> pointer may also be trusted. If lifetime can end (due to release), trusted
> member pointers will become untrusted. If it cannot (e.g. function arguments),
> it remains valid.
> 
> This will use BTF tags.
> Known cases in the kernel which are useful and safe can be whitelisted.
> 
> Such loads yield trusted pointers linked to refcounted PTR_TO_BTF_ID. Linked
> means the source refcounted PTR_TO_BTF_ID owns it.
> 
> When releasing PTR_TO_BTF_ID, all registers with same ref_obj_id, and all linked
> PTR_TO_BTF_ID are marked as untrusted.
> 
> As an example:
> 
> struct foo {
> 	struct bar __ref *br;
> 	struct baz *bz;
> };
> 
> struct foo *f = acq(); // f.type = PTR_TO_BTF_ID, ref_obj_id=X
> br = f->br;	       // br.type = PTR_TO_BTF_ID, linked_to=X
> bz = f->bz;	       // bz.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
> rel(f);		       // f.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
> 		       // and since br.linked_to == f.ref_obj_id,
> 		       // br.type = PTR_TO_BTF_ID | PTR_UNTRUSTED
> 
> For trusted loads from br, linked_to will be same as X, so they will also be
> marked as untrusted, and so on.
> 
> For tp_btf/LSM programs, pointer arguments will be non-refcounted trusted
> PTR_TO_BTF_ID. All rules as above apply, but since it cannot be released,
> trusted pointers obtained from them remain valid till BPF_EXIT.
> 
> I have no idea how much backwards compat this will break, or how much of it can
> be tolerated.

Exactly. It's not practical to mark all such fields in the kernel with __ref tags.
bpf_lsm is already in the wild and uses multi-level pointer walks,
then passes the results into helpers. We cannot break them.
In other words, if you try really hard you can crash the kernel.
bpf tracing is only 99.99% safe.

I'll start a separate thread about linked lists in bpf.

Patch

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 7a1579c91432..16d489b03000 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -564,31 +564,39 @@  struct prog_test_ref_kfunc {
 	int b;
 	struct prog_test_member memb;
 	struct prog_test_ref_kfunc *next;
+	refcount_t cnt;
 };
 
-static struct prog_test_ref_kfunc prog_test_struct = {
+DEFINE_PER_CPU(struct prog_test_ref_kfunc, prog_test_struct) = {
 	.a = 42,
 	.b = 108,
-	.next = &prog_test_struct,
+	.cnt = REFCOUNT_INIT(1),
 };
 
 noinline struct prog_test_ref_kfunc *
 bpf_kfunc_call_test_acquire(unsigned long *scalar_ptr)
 {
-	/* randomly return NULL */
-	if (get_jiffies_64() % 2)
-		return NULL;
-	return &prog_test_struct;
+	struct prog_test_ref_kfunc *p;
+
+	p = this_cpu_ptr(&prog_test_struct);
+	p->next = p;
+	refcount_inc(&p->cnt);
+	return p;
 }
 
 noinline struct prog_test_member *
 bpf_kfunc_call_memb_acquire(void)
 {
-	return &prog_test_struct.memb;
+	WARN_ON_ONCE(1);
+	return NULL;
 }
 
 noinline void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p)
 {
+	if (!p)
+		return;
+
+	refcount_dec(&p->cnt);
 }
 
 noinline void bpf_kfunc_call_memb_release(struct prog_test_member *p)
@@ -597,12 +605,18 @@  noinline void bpf_kfunc_call_memb_release(struct prog_test_member *p)
 
 noinline void bpf_kfunc_call_memb1_release(struct prog_test_member1 *p)
 {
+	WARN_ON_ONCE(1);
 }
 
 noinline struct prog_test_ref_kfunc *
-bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **p, int a, int b)
+bpf_kfunc_call_test_kptr_get(struct prog_test_ref_kfunc **pp, int a, int b)
 {
-	return &prog_test_struct;
+	struct prog_test_ref_kfunc *p = READ_ONCE(*pp);
+
+	if (!p)
+		return NULL;
+	refcount_inc(&p->cnt);
+	return p;
 }
 
 struct prog_test_pass1 {
diff --git a/tools/testing/selftests/bpf/verifier/map_kptr.c b/tools/testing/selftests/bpf/verifier/map_kptr.c
index 9113834640e6..6914904344c0 100644
--- a/tools/testing/selftests/bpf/verifier/map_kptr.c
+++ b/tools/testing/selftests/bpf/verifier/map_kptr.c
@@ -212,13 +212,13 @@ 
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
 	BPF_EXIT_INSN(),
-	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 24),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 32),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.fixup_map_kptr = { 1 },
 	.result = REJECT,
-	.errstr = "access beyond struct prog_test_ref_kfunc at off 24 size 8",
+	.errstr = "access beyond struct prog_test_ref_kfunc at off 32 size 8",
 },
 {
 	"map_kptr: unref: inherit PTR_UNTRUSTED on struct walk",