
[v2,bpf-next,1/5] bpf: Add bpf_dynptr_adjust

Message ID 20230420071414.570108-2-joannelkoong@gmail.com (mailing list archive)
State Accepted
Commit a18ce61a6c1f5dfcd8635f89db75241cfd5ae476
Delegated to: BPF
Series: Dynptr helpers

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 52 this patch: 53
netdev/cc_maintainers warning 8 maintainers not CCed: song@kernel.org sdf@google.com haoluo@google.com yhs@fb.com john.fastabend@gmail.com kpsingh@kernel.org jolsa@kernel.org martin.lau@linux.dev
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 52 this patch: 53
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-36 success Logs for veristat
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-10 fail Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16

Commit Message

Joanne Koong April 20, 2023, 7:14 a.m. UTC
Add a new kfunc

int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);

which adjusts the dynptr to reflect the new [start, end) interval.
In particular, it advances the offset of the dynptr by "start" bytes,
and if end is less than the size of the dynptr, then this will trim the
dynptr accordingly.

Adjusting the dynptr interval may be useful in certain situations.
For example, consider a hashing kfunc that takes in a generic dynptr:
if the dynptr points to a struct but only a certain memory region
inside the struct should be hashed, adjust can be used to narrow in on
that specific region.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
---
 kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
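The [start, end) contract described above can be modeled in plain userspace C. This is an illustrative sketch of the arithmetic only, not the kernel implementation; all names here are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace model of a dynptr's view over its buffer. */
struct dynptr_model {
	uint32_t offset; /* start of the view within the underlying buffer */
	uint32_t size;   /* number of bytes visible through the view */
};

/*
 * Mirror of the kfunc's documented contract: narrow the view to the
 * [start, end) interval of the current view. Returns 0 on success,
 * -EINVAL for an inverted interval, -ERANGE if the interval falls
 * outside the current size.
 */
static int dynptr_adjust_model(struct dynptr_model *p, uint32_t start,
			       uint32_t end)
{
	if (start > end)
		return -EINVAL;
	if (start > p->size || end > p->size)
		return -ERANGE;

	p->offset += start;    /* advance by "start" bytes */
	p->size = end - start; /* trim to the new interval */
	return 0;
}
```

For a 100-byte view, adjusting to (10, 40) advances the offset by 10 and trims the size to 30; (50, 40) is rejected with -EINVAL, and an end beyond the current size with -ERANGE. The real kfunc additionally rejects invalid dynptrs via its !ptr->data check, which this model omits.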

Comments

Alexei Starovoitov April 20, 2023, 6:38 p.m. UTC | #1
On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
>  	return obj;
> @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
>  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_dynptr_adjust)

I've missed this earlier.
Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
Otherwise when people start passing bpf_dynptr-s from kernel code
(like fuse-bpf is planning to do)
the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
It's probably not possible right now, so not a high-pri issue, but still.
Or something in the verifier makes sure that dynptr-s are all trusted?
Joanne Koong April 21, 2023, 3:46 a.m. UTC | #2
On Thu, Apr 20, 2023 at 11:38 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
> >       return obj;
> > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
>
> I've missed this earlier.
> Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
> Otherwise when people start passing bpf_dynptr-s from kernel code
> (like fuse-bpf is planning to do)
> the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
> It's probably not possible right now, so not a high-pri issue, but still.
> Or something in the verifier makes sure that dynptr-s are all trusted?

In my understanding, the checks the verifier enforces for
KF_TRUSTED_ARGS are that the reg->offset is 0 and the reg may not be
null. The verifier logic does this for dynptrs currently, it enforces
that reg->offset is 0 (in stack_slot_obj_get_spi()) and that the
reg->type is PTR_TO_STACK or CONST_PTR_TO_DYNPTR (in
check_kfunc_args() for KF_ARG_PTR_TO_DYNPTR case). But maybe it's a
good idea to add the KF_TRUSTED_ARGS flag anyways in case more safety
checks are added to KF_TRUSTED_ARGS in the future?
Alexei Starovoitov April 22, 2023, 11:44 p.m. UTC | #3
On Thu, Apr 20, 2023 at 8:46 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Thu, Apr 20, 2023 at 11:38 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
> > >       return obj;
> > > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> >
> > I've missed this earlier.
> > Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
> > Otherwise when people start passing bpf_dynptr-s from kernel code
> > (like fuse-bpf is planning to do)
> > the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
> > It's probably not possible right now, so not a high-pri issue, but still.
> > Or something in the verifier makes sure that dynptr-s are all trusted?
>
> In my understanding, the checks the verifier enforces for
> KF_TRUSTED_ARGS are that the reg->offset is 0 and the reg may not be
> null. The verifier logic does this for dynptrs currently, it enforces
> that reg->offset is 0 (in stack_slot_obj_get_spi()) and that the
> reg->type is PTR_TO_STACK or CONST_PTR_TO_DYNPTR (in
> check_kfunc_args() for KF_ARG_PTR_TO_DYNPTR case). But maybe it's a
> good idea to add the KF_TRUSTED_ARGS flag anyways in case more safety
> checks are added to KF_TRUSTED_ARGS in the future?

Yeah. You're right.
The verifier is doing the same checks for dynptr and for trusted ptrs.
So adding KF_TRUSTED_ARGS to bpf_dynptr_adjust is not mandatory.
Maybe an opportunity to generalize the checks between
KF_ARG_PTR_TO_BTF_ID and KF_ARG_PTR_TO_DYNPTR.
But KF_TRUSTED_ARGS is necessary for bpf_dynptr_from_skb
otherwise old style ptr_to_btf_id skb can be passed in.

For example the following passes test_progs:
diff --git a/net/core/filter.c b/net/core/filter.c
index d9ce04ca22ce..abb14036b455 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -11718,6 +11718,7 @@ static int __init bpf_kfunc_init(void)
        ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_XMIT, &bpf_kfunc_set_skb);
        ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_SEG6LOCAL, &bpf_kfunc_set_skb);
        ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_NETFILTER, &bpf_kfunc_set_skb);
+       ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_kfunc_set_skb);
        return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
 }
 late_initcall(bpf_kfunc_init);
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index b2fa6c47ecc0..bd8fbc3e04ea 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -4,6 +4,7 @@
 #include <string.h>
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
 #include "bpf_misc.h"
 #include "bpf_kfuncs.h"
 #include "errno.h"
@@ -187,6 +188,15 @@ int test_skb_readonly(struct __sk_buff *skb)
        return 1;
 }

+SEC("fentry/__kfree_skb")
+int BPF_PROG(test_skb, struct __sk_buff *skb)
+{
+       struct bpf_dynptr ptr;
+
+       bpf_dynptr_from_skb(skb, 0, &ptr);
+       return 0;
+}

but shouldn't. skb in fentry is not trusted.
It's not an issue right now, because bpf_dynptr_from_skb()
is enabled for networking prog types only,
but BPF_PROG_TYPE_NETFILTER is already blending the boundary.
It's more networking than tracing and normal tracing should
be able to examine skb. dynptr allows to do it nicely.
Not a blocker for this set. Just something to follow up.
Eduard Zingerman April 24, 2023, 2:31 p.m. UTC | #4
On Thu, 2023-04-20 at 00:14 -0700, Joanne Koong wrote:
> Add a new kfunc
> 
> int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> 
> which adjusts the dynptr to reflect the new [start, end) interval.
> In particular, it advances the offset of the dynptr by "start" bytes,
> and if end is less than the size of the dynptr, then this will trim the
> dynptr accordingly.
> 
> Adjusting the dynptr interval may be useful in certain situations.
> For example, when hashing which takes in generic dynptrs, if the dynptr
> points to a struct but only a certain memory region inside the struct
> should be hashed, adjust can be used to narrow in on the
> specific region to hash.
> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 00e5fb0682ac..7ddf63ac93ce 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1448,6 +1448,13 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
>  	return ptr->size & DYNPTR_SIZE_MASK;
>  }
>  
> +static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
> +{
> +	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
> +
> +	ptr->size = new_size | metadata;
> +}
> +
>  int bpf_dynptr_check_size(u32 size)
>  {
>  	return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
> @@ -2297,6 +2304,24 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
>  	return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
>  }
>  
> +__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
> +{
> +	u32 size;
> +
> +	if (!ptr->data || start > end)
> +		return -EINVAL;
> +
> +	size = bpf_dynptr_get_size(ptr);
> +
> +	if (start > size || end > size)
> +		return -ERANGE;

If new size is computed as "end - start" should
the check above be "end >= size"?

> +
> +	ptr->offset += start;
> +	bpf_dynptr_set_size(ptr, end - start);
> +
> +	return 0;
> +}
> +
>  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
>  {
>  	return obj;
> @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
>  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
>  BTF_SET8_END(common_btf_ids)
>  
>  static const struct btf_kfunc_id_set common_kfunc_set = {
Eduard Zingerman April 24, 2023, 2:33 p.m. UTC | #5
On Mon, 2023-04-24 at 17:31 +0300, Eduard Zingerman wrote:
> On Thu, 2023-04-20 at 00:14 -0700, Joanne Koong wrote:
> > Add a new kfunc
> > 
> > int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> > 
> > which adjusts the dynptr to reflect the new [start, end) interval.
> > In particular, it advances the offset of the dynptr by "start" bytes,
> > and if end is less than the size of the dynptr, then this will trim the
> > dynptr accordingly.
> > 
> > Adjusting the dynptr interval may be useful in certain situations.
> > For example, when hashing which takes in generic dynptrs, if the dynptr
> > points to a struct but only a certain memory region inside the struct
> > should be hashed, adjust can be used to narrow in on the
> > specific region to hash.
> > 
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> > 
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 00e5fb0682ac..7ddf63ac93ce 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -1448,6 +1448,13 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
> >  	return ptr->size & DYNPTR_SIZE_MASK;
> >  }
> >  
> > +static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
> > +{
> > +	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
> > +
> > +	ptr->size = new_size | metadata;
> > +}
> > +
> >  int bpf_dynptr_check_size(u32 size)
> >  {
> >  	return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
> > @@ -2297,6 +2304,24 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
> >  	return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
> >  }
> >  
> > +__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
> > +{
> > +	u32 size;
> > +
> > +	if (!ptr->data || start > end)
> > +		return -EINVAL;
> > +
> > +	size = bpf_dynptr_get_size(ptr);
> > +
> > +	if (start > size || end > size)
> > +		return -ERANGE;
> 
> If new size is computed as "end - start" should
> the check above be "end >= size"?

Please ignore this comment, I got confused.

> 
> > +
> > +	ptr->offset += start;
> > +	bpf_dynptr_set_size(ptr, end - start);
> > +
> > +	return 0;
> > +}
> > +
> >  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
> >  {
> >  	return obj;
> > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> >  BTF_SET8_END(common_btf_ids)
> >  
> >  static const struct btf_kfunc_id_set common_kfunc_set = {
>
John Fastabend April 24, 2023, 7:46 p.m. UTC | #6
Joanne Koong wrote:
> Add a new kfunc
> 
> int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> 
> which adjusts the dynptr to reflect the new [start, end) interval.
> In particular, it advances the offset of the dynptr by "start" bytes,
> and if end is less than the size of the dynptr, then this will trim the
> dynptr accordingly.
> 
> Adjusting the dynptr interval may be useful in certain situations.
> For example, when hashing which takes in generic dynptrs, if the dynptr
> points to a struct but only a certain memory region inside the struct
> should be hashed, adjust can be used to narrow in on the
> specific region to hash.

Would you want to prohibit creating an empty dynptr with [start, start)?

> 
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>  kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 00e5fb0682ac..7ddf63ac93ce 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1448,6 +1448,13 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
>  	return ptr->size & DYNPTR_SIZE_MASK;
>  }
>  
> +static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
> +{
> +	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
> +
> +	ptr->size = new_size | metadata;
> +}
> +
>  int bpf_dynptr_check_size(u32 size)
>  {
>  	return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
> @@ -2297,6 +2304,24 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
>  	return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
>  }
>  
> +__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
> +{
> +	u32 size;
> +
> +	if (!ptr->data || start > end)
> +		return -EINVAL;
> +
> +	size = bpf_dynptr_get_size(ptr);
> +
> +	if (start > size || end > size)
> +		return -ERANGE;

maybe 'start >= size'? OTOH if the verifier doesn't mind I guess it's OK
to create the thing even if it doesn't make much sense.

> +
> +	ptr->offset += start;
> +	bpf_dynptr_set_size(ptr, end - start);
> +
> +	return 0;
> +}
> +
>  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
>  {
>  	return obj;
> @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
>  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
>  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
>  BTF_SET8_END(common_btf_ids)
>  
>  static const struct btf_kfunc_id_set common_kfunc_set = {
> -- 
> 2.34.1
>
Joanne Koong April 25, 2023, 5:05 a.m. UTC | #7
On Sat, Apr 22, 2023 at 4:44 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Apr 20, 2023 at 8:46 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Thu, Apr 20, 2023 at 11:38 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
> > > >       return obj;
> > > > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> > > >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> > > >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> > > >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > > > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> > >
> > > I've missed this earlier.
> > > Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
> > > Otherwise when people start passing bpf_dynptr-s from kernel code
> > > (like fuse-bpf is planning to do)
> > > the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
> > > It's probably not possible right now, so not a high-pri issue, but still.
> > > Or something in the verifier makes sure that dynptr-s are all trusted?
> >
> > In my understanding, the checks the verifier enforces for
> > KF_TRUSTED_ARGS are that the reg->offset is 0 and the reg may not be
> > null. The verifier logic does this for dynptrs currently, it enforces
> > that reg->offset is 0 (in stack_slot_obj_get_spi()) and that the
> > reg->type is PTR_TO_STACK or CONST_PTR_TO_DYNPTR (in
> > check_kfunc_args() for KF_ARG_PTR_TO_DYNPTR case). But maybe it's a
> > good idea to add the KF_TRUSTED_ARGS flag anyways in case more safety
> > checks are added to KF_TRUSTED_ARGS in the future?
>
> Yeah. You're right.
> The verifier is doing the same checks for dynptr and for trusted ptrs.
> So adding KF_TRUSTED_ARGS to bpf_dynptr_adjust is not mandatory.
> Maybe an opportunity to generalize the checks between
> KF_ARG_PTR_TO_BTF_ID and KF_ARG_PTR_TO_DYNPTR.
> But KF_TRUSTED_ARGS is necessary for bpf_dynptr_from_skb
> otherwise old style ptr_to_btf_id skb can be passed in.
>
> For example the following passes test_progs:
> diff --git a/net/core/filter.c b/net/core/filter.c
> index d9ce04ca22ce..abb14036b455 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -11718,6 +11718,7 @@ static int __init bpf_kfunc_init(void)
>         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_XMIT, &bpf_kfunc_set_skb);
>         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_SEG6LOCAL, &bpf_kfunc_set_skb);
>         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_NETFILTER, &bpf_kfunc_set_skb);
> +       ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_kfunc_set_skb);
>         return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
>  }
>  late_initcall(bpf_kfunc_init);
> diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
> index b2fa6c47ecc0..bd8fbc3e04ea 100644
> --- a/tools/testing/selftests/bpf/progs/dynptr_success.c
> +++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
> @@ -4,6 +4,7 @@
>  #include <string.h>
>  #include <linux/bpf.h>
>  #include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
>  #include "bpf_misc.h"
>  #include "bpf_kfuncs.h"
>  #include "errno.h"
> @@ -187,6 +188,15 @@ int test_skb_readonly(struct __sk_buff *skb)
>         return 1;
>  }
>
> +SEC("fentry/__kfree_skb")
> +int BPF_PROG(test_skb, struct __sk_buff *skb)
> +{
> +       struct bpf_dynptr ptr;
> +
> +       bpf_dynptr_from_skb(skb, 0, &ptr);
> +       return 0;
> +}
>
> but shouldn't. skb in fentry is not trusted.
> It's not an issue right now, because bpf_dynptr_from_skb()
> is enabled for networking prog types only,
> but BPF_PROG_TYPE_NETFILTER is already blending the boundary.
> It's more networking than tracing and normal tracing should
> be able to examine skb. dynptr allows to do it nicely.
> Not a blocker for this set. Just something to follow up.

Ahh I see, thanks for the explanation. I'm trying to find where this
happens in the code - i see the check in the verifier for
is_trusted_reg() (when we call check_kfunc_args() for the
KF_ARG_PTR_TO_BTF_ID case) so it seems like the skb ctx reg is trusted
if it's been marked as either MEM_ALLOC or PTR_TRUSTED, and it's
untrusted if it's not. But where does this get marked as PTR_TRUSTED
for networking prog types?
Joanne Koong April 25, 2023, 5:29 a.m. UTC | #8
On Mon, Apr 24, 2023 at 12:46 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Joanne Koong wrote:
> > Add a new kfunc
> >
> > int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> >
> > which adjusts the dynptr to reflect the new [start, end) interval.
> > In particular, it advances the offset of the dynptr by "start" bytes,
> > and if end is less than the size of the dynptr, then this will trim the
> > dynptr accordingly.
> >
> > Adjusting the dynptr interval may be useful in certain situations.
> > For example, when hashing which takes in generic dynptrs, if the dynptr
> > points to a struct but only a certain memory region inside the struct
> > should be hashed, adjust can be used to narrow in on the
> > specific region to hash.
>
> Would you want to prohibit creating an empty dynptr with [start, start)?

I'm open to either :) I don't really see a use case for creating an
empty dynptr, but I think the concept of an empty dynptr might be
useful in general, so maybe we should let this be okay as well?

>
> >
> > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > ---
> >  kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 00e5fb0682ac..7ddf63ac93ce 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -1448,6 +1448,13 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
> >       return ptr->size & DYNPTR_SIZE_MASK;
> >  }
> >
> > +static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
> > +{
> > +     u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
> > +
> > +     ptr->size = new_size | metadata;
> > +}
> > +
> >  int bpf_dynptr_check_size(u32 size)
> >  {
> >       return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
> > @@ -2297,6 +2304,24 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
> >       return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
> >  }
> >
> > +__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
> > +{
> > +     u32 size;
> > +
> > +     if (!ptr->data || start > end)
> > +             return -EINVAL;
> > +
> > +     size = bpf_dynptr_get_size(ptr);
> > +
> > +     if (start > size || end > size)
> > +             return -ERANGE;
>
> maybe 'start >= size'? OTOH if the verifier doesn't mind I guess it's OK
> to create the thing even if it doesn't make much sense.

I think there might be use cases where this is useful even though the
zero-sized dynptr can't do anything. for example, if there's a helper
function in a program that takes in a dynptr, parses some fixed-size
chunk of its data, and then advances it, it might be useful to have
the concept of a zero-sized dynptr, so that if we're parsing the last
chunk of the data, then the last call to bpf_dynptr_adjust() will
still succeed and the dynptr will be of size 0, which signals
completion.
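That chunked-parsing pattern can be sketched with a plain-C model of the adjust arithmetic (hypothetical names; a real BPF program would call bpf_dynptr_adjust() and bpf_dynptr_size() instead):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define CHUNK_SZ 4 /* fixed-size record the parser consumes */

/* Hypothetical model of a dynptr's view: offset plus remaining size. */
struct view {
	uint32_t off;
	uint32_t size;
};

/* Model of bpf_dynptr_adjust(): narrow to [start, end) of the view. */
static int view_adjust(struct view *v, uint32_t start, uint32_t end)
{
	if (start > end)
		return -EINVAL;
	if (start > v->size || end > v->size)
		return -ERANGE;
	v->off += start;
	v->size = end - start;
	return 0;
}

/*
 * Consume fixed-size chunks; on the last chunk the adjust call still
 * succeeds and leaves a zero-sized view, which signals completion.
 */
static int count_chunks(struct view *v)
{
	int n = 0;

	while (v->size >= CHUNK_SZ) {
		/* ... parse CHUNK_SZ bytes at v->off here ... */
		if (view_adjust(v, CHUNK_SZ, v->size))
			break;
		n++;
	}
	return n;
}
```

With a 12-byte view, the final view_adjust(v, 4, 4) succeeds and leaves size == 0, matching the "zero-sized dynptr signals completion" behavior described above.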

>
> > +
> > +     ptr->offset += start;
> > +     bpf_dynptr_set_size(ptr, end - start);
> > +
> > +     return 0;
> > +}
> > +
> >  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
> >  {
> >       return obj;
> > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> >  BTF_SET8_END(common_btf_ids)
> >
> >  static const struct btf_kfunc_id_set common_kfunc_set = {
> > --
> > 2.34.1
> >
Alexei Starovoitov April 25, 2023, 11:46 p.m. UTC | #9
On Mon, Apr 24, 2023 at 10:05:32PM -0700, Joanne Koong wrote:
> On Sat, Apr 22, 2023 at 4:44 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Apr 20, 2023 at 8:46 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > On Thu, Apr 20, 2023 at 11:38 AM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
> > > > >       return obj;
> > > > > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> > > > >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> > > > >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> > > > >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > > > > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> > > >
> > > > I've missed this earlier.
> > > > Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
> > > > Otherwise when people start passing bpf_dynptr-s from kernel code
> > > > (like fuse-bpf is planning to do)
> > > > the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
> > > > It's probably not possible right now, so not a high-pri issue, but still.
> > > > Or something in the verifier makes sure that dynptr-s are all trusted?
> > >
> > > In my understanding, the checks the verifier enforces for
> > > KF_TRUSTED_ARGS are that the reg->offset is 0 and the reg may not be
> > > null. The verifier logic does this for dynptrs currently, it enforces
> > > that reg->offset is 0 (in stack_slot_obj_get_spi()) and that the
> > > reg->type is PTR_TO_STACK or CONST_PTR_TO_DYNPTR (in
> > > check_kfunc_args() for KF_ARG_PTR_TO_DYNPTR case). But maybe it's a
> > > good idea to add the KF_TRUSTED_ARGS flag anyways in case more safety
> > > checks are added to KF_TRUSTED_ARGS in the future?
> >
> > Yeah. You're right.
> > The verifier is doing the same checks for dynptr and for trusted ptrs.
> > So adding KF_TRUSTED_ARGS to bpf_dynptr_adjust is not mandatory.
> > Maybe an opportunity to generalize the checks between
> > KF_ARG_PTR_TO_BTF_ID and KF_ARG_PTR_TO_DYNPTR.
> > But KF_TRUSTED_ARGS is necessary for bpf_dynptr_from_skb
> > otherwise old style ptr_to_btf_id skb can be passed in.
> >
> > For example the following passes test_progs:
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index d9ce04ca22ce..abb14036b455 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -11718,6 +11718,7 @@ static int __init bpf_kfunc_init(void)
> >         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_XMIT, &bpf_kfunc_set_skb);
> >         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_SEG6LOCAL, &bpf_kfunc_set_skb);
> >         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_NETFILTER, &bpf_kfunc_set_skb);
> > +       ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_kfunc_set_skb);
> >         return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
> >  }
> >  late_initcall(bpf_kfunc_init);
> > diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
> > index b2fa6c47ecc0..bd8fbc3e04ea 100644
> > --- a/tools/testing/selftests/bpf/progs/dynptr_success.c
> > +++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
> > @@ -4,6 +4,7 @@
> >  #include <string.h>
> >  #include <linux/bpf.h>
> >  #include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> >  #include "bpf_misc.h"
> >  #include "bpf_kfuncs.h"
> >  #include "errno.h"
> > @@ -187,6 +188,15 @@ int test_skb_readonly(struct __sk_buff *skb)
> >         return 1;
> >  }
> >
> > +SEC("fentry/__kfree_skb")
> > +int BPF_PROG(test_skb, struct __sk_buff *skb)
> > +{
> > +       struct bpf_dynptr ptr;
> > +
> > +       bpf_dynptr_from_skb(skb, 0, &ptr);
> > +       return 0;
> > +}
> >
> > but shouldn't. skb in fentry is not trusted.
> > It's not an issue right now, because bpf_dynptr_from_skb()
> > is enabled for networking prog types only,
> > but BPF_PROG_TYPE_NETFILTER is already blending the boundary.
> > It's more networking than tracing and normal tracing should
> > be able to examine skb. dynptr allows to do it nicely.
> > Not a blocker for this set. Just something to follow up.
> 
> Ahh I see, thanks for the explanation. I'm trying to find where this
> happens in the code - i see the check in the verifier for
> is_trusted_reg() (when we call check_kfunc_args() for the
> KF_ARG_PTR_TO_BTF_ID case) so it seems like the skb ctx reg is trusted
> if it's been marked as either MEM_ALLOC or PTR_TRUSTED, and it's
> untrusted if it's not. But where does this get marked as PTR_TRUSTED
> for networking prog types?

is_trusted_reg() applies to PTR_TO_BTF_ID pointers.
For networking progs skb comes as PTR_TO_CTX, which is implicitly trusted
and from a safety pov equivalent to PTR_TO_BTF_ID | PTR_TRUSTED.
But tracing progs are different. Arguments of tp_btf progs
are also trusted, but fexit args are not. They're old legacy PTR_TO_BTF_ID
without flags: neither PTR_TRUSTED nor PTR_UNTRUSTED.
The purpose of KF_TRUSTED_ARGS is to filter out such pointers.
Andrii Nakryiko April 26, 2023, 5:46 p.m. UTC | #10
On Mon, Apr 24, 2023 at 10:29 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Mon, Apr 24, 2023 at 12:46 PM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Joanne Koong wrote:
> > > Add a new kfunc
> > >
> > > int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> > >
> > > which adjusts the dynptr to reflect the new [start, end) interval.
> > > In particular, it advances the offset of the dynptr by "start" bytes,
> > > and if end is less than the size of the dynptr, then this will trim the
> > > dynptr accordingly.
> > >
> > > Adjusting the dynptr interval may be useful in certain situations.
> > > For example, when using a hashing function that takes in generic
> > > dynptrs, if the dynptr points to a struct but only a certain memory
> > > region inside the struct should be hashed, adjust can be used to
> > > narrow in on the specific region to hash.
> >
> > Would you want to prohibit creating an empty dynptr with [start, start)?
>
> I'm open to either :) I don't really see a use case for creating an
> empty dynptr, but I think the concept of an empty dynptr might be
> useful in general, so maybe we should let this be okay as well?

Yes, there is no need to artificially enforce a non-empty range. We
already use pointers to zero-sized memory regions in the verifier (e.g.,
Alexei's recent kfunc existence check changes). In general, an empty
range is a valid range and we should strive to have that working
without assumptions about who would use it and how. As long as it's
conceptually safe, we should support it.

>
> >
> > >
> > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> > > ---
> > >  kernel/bpf/helpers.c | 26 ++++++++++++++++++++++++++
> > >  1 file changed, 26 insertions(+)
> > >
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 00e5fb0682ac..7ddf63ac93ce 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -1448,6 +1448,13 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
> > >       return ptr->size & DYNPTR_SIZE_MASK;
> > >  }
> > >
> > > +static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
> > > +{
> > > +     u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
> > > +
> > > +     ptr->size = new_size | metadata;
> > > +}
> > > +
> > >  int bpf_dynptr_check_size(u32 size)
> > >  {
> > >       return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
> > > @@ -2297,6 +2304,24 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
> > >       return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
> > >  }
> > >
> > > +__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
> > > +{
> > > +     u32 size;
> > > +
> > > +     if (!ptr->data || start > end)
> > > +             return -EINVAL;
> > > +
> > > +     size = bpf_dynptr_get_size(ptr);
> > > +
> > > +     if (start > size || end > size)
> > > +             return -ERANGE;
> >
> > maybe 'start >= size'? OTOH if the verifier doesn't mind I guess it's OK
> > to create the thing even if it doesn't make much sense.
>
> I think there might be use cases where this is useful even though the
> zero-sized dynptr can't do anything. For example, if there's a helper
> function in a program that takes in a dynptr, parses some fixed-size
> chunk of its data, and then advances it, it might be useful to have
> the concept of a zero-sized dynptr, so that if we're parsing the last
> chunk of the data, then the last call to bpf_dynptr_adjust() will
> still succeed and the dynptr will be of size 0, which signals
> completion.
>

+1, empty range does happen in practice naturally, and having to
special-case them is a hindrance. Let's keep it possible.

> >
> > > +
> > > +     ptr->offset += start;
> > > +     bpf_dynptr_set_size(ptr, end - start);
> > > +
> > > +     return 0;
> > > +}
> > > +
> > >  __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
> > >  {
> > >       return obj;
> > > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> > >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> > >  BTF_SET8_END(common_btf_ids)
> > >
> > >  static const struct btf_kfunc_id_set common_kfunc_set = {
> > > --
> > > 2.34.1
> > >
Andrii Nakryiko April 26, 2023, 5:50 p.m. UTC | #11
On Tue, Apr 25, 2023 at 4:46 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 24, 2023 at 10:05:32PM -0700, Joanne Koong wrote:
> > On Sat, Apr 22, 2023 at 4:44 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Apr 20, 2023 at 8:46 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 20, 2023 at 11:38 AM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Thu, Apr 20, 2023 at 12:14:10AM -0700, Joanne Koong wrote:
> > > > > >       return obj;
> > > > > > @@ -2369,6 +2394,7 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> > > > > >  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> > > > > >  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> > > > > >  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> > > > > > +BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> > > > >
> > > > > I've missed this earlier.
> > > > > Shouldn't we change all the existing dynptr kfuncs to be KF_TRUSTED_ARGS?
> > > > > Otherwise when people start passing bpf_dynptr-s from kernel code
> > > > > (like fuse-bpf is planning to do)
> > > > > the bpf prog might get vanilla ptr_to_btf_id to bpf_dynptr_kern.
> > > > > It's probably not possible right now, so not a high-pri issue, but still.
> > > > > Or something in the verifier makes sure that dynptr-s are all trusted?
> > > >
> > > > In my understanding, the checks the verifier enforces for
> > > > KF_TRUSTED_ARGS are that the reg->offset is 0 and the reg may not be
> > > > null. The verifier logic does this for dynptrs currently, it enforces
> > > > that reg->offset is 0 (in stack_slot_obj_get_spi()) and that the
> > > > reg->type is PTR_TO_STACK or CONST_PTR_TO_DYNPTR (in
> > > > check_kfunc_args() for KF_ARG_PTR_TO_DYNPTR case). But maybe it's a
> > > > good idea to add the KF_TRUSTED_ARGS flag anyways in case more safety
> > > > checks are added to KF_TRUSTED_ARGS in the future?
> > >
> > > Yeah. You're right.
> > > The verifier is doing the same checks for dynptr and for trusted ptrs.
> > > So adding KF_TRUSTED_ARGS to bpf_dynptr_adjust is not mandatory.
> > > Maybe an opportunity to generalize the checks between
> > > KF_ARG_PTR_TO_BTF_ID and KF_ARG_PTR_TO_DYNPTR.
> > > But KF_TRUSTED_ARGS is necessary for bpf_dynptr_from_skb
> > > otherwise old style ptr_to_btf_id skb can be passed in.
> > >
> > > For example the following passes test_progs:
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index d9ce04ca22ce..abb14036b455 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -11718,6 +11718,7 @@ static int __init bpf_kfunc_init(void)
> > >         ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_XMIT,
> > > &bpf_kfunc_set_skb);
> > >         ret = ret ?:
> > > register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_SEG6LOCAL,
> > > &bpf_kfunc_set_skb);
> > >         ret = ret ?:
> > > register_btf_kfunc_id_set(BPF_PROG_TYPE_NETFILTER,
> > > &bpf_kfunc_set_skb);
> > > +       ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING,
> > > &bpf_kfunc_set_skb);
> > >         return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP,
> > > &bpf_kfunc_set_xdp);
> > >  }
> > >  late_initcall(bpf_kfunc_init);
> > > diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c
> > > b/tools/testing/selftests/bpf/progs/dynptr_success.c
> > > index b2fa6c47ecc0..bd8fbc3e04ea 100644
> > > --- a/tools/testing/selftests/bpf/progs/dynptr_success.c
> > > +++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
> > > @@ -4,6 +4,7 @@
> > >  #include <string.h>
> > >  #include <linux/bpf.h>
> > >  #include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > >  #include "bpf_misc.h"
> > >  #include "bpf_kfuncs.h"
> > >  #include "errno.h"
> > > @@ -187,6 +188,15 @@ int test_skb_readonly(struct __sk_buff *skb)
> > >         return 1;
> > >  }
> > >
> > > +SEC("fentry/__kfree_skb")
> > > +int BPF_PROG(test_skb, struct __sk_buff *skb)
> > > +{
> > > +       struct bpf_dynptr ptr;
> > > +
> > > +       bpf_dynptr_from_skb(skb, 0, &ptr);
> > > +       return 0;
> > > +}
> > >
> > > but shouldn't. skb in fentry is not trusted.
> > > It's not an issue right now, because bpf_dynptr_from_skb()
> > > is enabled for networking prog types only,
> > > but BPF_PROG_TYPE_NETFILTER is already blending the boundary.
> > > It's more networking than tracing, and normal tracing should
> > > be able to examine skb. dynptr allows doing it nicely.
> > > Not a blocker for this set. Just something to follow up.
> >
> > Ahh I see, thanks for the explanation. I'm trying to find where this
> > happens in the code - i see the check in the verifier for
> > is_trusted_reg() (when we call check_kfunc_args() for the
> > KF_ARG_PTR_TO_BTF_ID case) so it seems like the skb ctx reg is trusted
> > if it's been marked as either MEM_ALLOC or PTR_TRUSTED, and it's
> > untrusted if it's not. But where does this get marked as PTR_TRUSTED
> > for networking prog types?
>
> is_trusted_reg() applies to PTR_TO_BTF_ID pointers.
> For networking progs skb comes as PTR_TO_CTX, which is implicitly trusted
> and from a safety pov equivalent to PTR_TO_BTF_ID | PTR_TRUSTED.
> But tracing progs are different. Arguments of tp_btf progs
> are also trusted, but fexit args are not. They're old legacy PTR_TO_BTF_ID
> without flags: neither PTR_TRUSTED nor PTR_UNTRUSTED.
> The purpose of KF_TRUSTED_ARGS is to filter out such pointers.

I need to take a thorough look at how FUSE BPF is using bpf_dynptr, but
I'd rather we start strict and not allow passing bpf_dynptr as
PTR_TO_BTF_ID, at least right now, until we can think this through
thoroughly. One fundamental aspect of the current on-the-stack dynptr, or
a dynptr passed from user_ringbuf's drain helper, is that we have a
guarantee that no one else besides the program (on a single current
thread) can modify its state.

Anyways, I'd expect that if some kfunc accepts `struct bpf_dynptr` we
force that it's a CONST_PTR_TO_DYNPTR or PTR_TO_STACK register, at least
for now. If that's not the case right now, let's make sure it is?
John Fastabend April 28, 2023, 9:08 p.m. UTC | #12
Andrii Nakryiko wrote:
> On Mon, Apr 24, 2023 at 10:29 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Mon, Apr 24, 2023 at 12:46 PM John Fastabend
> > <john.fastabend@gmail.com> wrote:
> > >
> > > Joanne Koong wrote:
> > > > Add a new kfunc
> > > >
> > > > int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end);
> > > >
> > > > which adjusts the dynptr to reflect the new [start, end) interval.
> > > > In particular, it advances the offset of the dynptr by "start" bytes,
> > > > and if end is less than the size of the dynptr, then this will trim the
> > > > dynptr accordingly.
> > > >
> > > > Adjusting the dynptr interval may be useful in certain situations.
> > > > For example, when using a hashing function that takes in generic
> > > > dynptrs, if the dynptr points to a struct but only a certain memory
> > > > region inside the struct should be hashed, adjust can be used to
> > > > narrow in on the specific region to hash.
> > >
> > > Would you want to prohibit creating an empty dynptr with [start, start)?
> >
> > I'm open to either :) I don't really see a use case for creating an
> > empty dynptr, but I think the concept of an empty dynptr might be
> > useful in general, so maybe we should let this be okay as well?
> 
> Yes, there is no need to artificially enforce a non-empty range. We
> already use pointers to zero-sized memory regions in the verifier (e.g.,
> Alexei's recent kfunc existence check changes). In general, an empty
> range is a valid range and we should strive to have that working
> without assumptions about who would use it and how. As long as it's
> conceptually safe, we should support it.

Ack. Agree sounds good to me.

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 00e5fb0682ac..7ddf63ac93ce 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1448,6 +1448,13 @@  u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
 
+static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
+{
+	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
+
+	ptr->size = new_size | metadata;
+}
+
 int bpf_dynptr_check_size(u32 size)
 {
 	return size > DYNPTR_MAX_SIZE ? -E2BIG : 0;
@@ -2297,6 +2304,24 @@  __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
 	return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
 }
 
+__bpf_kfunc int bpf_dynptr_adjust(struct bpf_dynptr_kern *ptr, u32 start, u32 end)
+{
+	u32 size;
+
+	if (!ptr->data || start > end)
+		return -EINVAL;
+
+	size = bpf_dynptr_get_size(ptr);
+
+	if (start > size || end > size)
+		return -ERANGE;
+
+	ptr->offset += start;
+	bpf_dynptr_set_size(ptr, end - start);
+
+	return 0;
+}
+
 __bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
 {
 	return obj;
@@ -2369,6 +2394,7 @@  BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
 BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_dynptr_adjust)
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {