Message ID | 20240104013847.3875810-8-andrii@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 2f38fe689470055440bf80fc644920023a643a82 |
Delegated to: | BPF |
Series | Libbpf-side __arg_ctx fallback support |

On Wed, Jan 3, 2024 at 5:39 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> the actual argument type not important, so that user can just define
> "generic" signature:
>
> __noinline int global_subprog(void *ctx __arg_ctx) { ... }

I still think that this __arg_ctx only makes sense with 'void *'.
Blind rewrite of ctx is a foot gun.

I've tried the following:

diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
index 9a06e5eb1fbe..0e5f5205d4a8 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
@@ -106,9 +106,9 @@ int perf_event_ctx(void *ctx)
 /* this global subprog can be now called from many types of entry progs, each
  * with different context type
  */
-__weak int subprog_ctx_tag(void *ctx __arg_ctx)
+__weak int subprog_ctx_tag(long ctx __arg_ctx)
 {
-       return bpf_get_stack(ctx, stack, sizeof(stack), 0);
+       return bpf_get_stack((void *)ctx, stack, sizeof(stack), 0);
 }

 struct my_struct { int x; };
@@ -131,7 +131,7 @@ int arg_tag_ctx_raw_tp(void *ctx)
 {
        struct my_struct x = { .x = 123 };

-       return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
+       return subprog_ctx_tag((long)ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
 }

and it "works".

Both kernel and libbpf should really limit it to 'void *'.

In the other email I suggested to allow types that match expected
based on prog type, but even that is probably a danger zone as well.
The correct type would already be detected by the verifier,
so extra __arg_ctx is pointless.
It makes sense only for such polymorphic functions and those
better use 'void *' and don't dereference it.

I think this can be a follow up.

On Wed, Jan 3, 2024 at 9:39 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 5:39 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > the actual argument type not important, so that user can just define
> > "generic" signature:
> >
> > __noinline int global_subprog(void *ctx __arg_ctx) { ... }
>
> I still think that this __arg_ctx only makes sense with 'void *'.
> Blind rewrite of ctx is a foot gun.
>
> I've tried the following:
>
> diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> index 9a06e5eb1fbe..0e5f5205d4a8 100644
> --- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> @@ -106,9 +106,9 @@ int perf_event_ctx(void *ctx)
>  /* this global subprog can be now called from many types of entry progs, each
>   * with different context type
>   */
> -__weak int subprog_ctx_tag(void *ctx __arg_ctx)
> +__weak int subprog_ctx_tag(long ctx __arg_ctx)
>  {
> -       return bpf_get_stack(ctx, stack, sizeof(stack), 0);
> +       return bpf_get_stack((void *)ctx, stack, sizeof(stack), 0);
>  }
>
>  struct my_struct { int x; };
> @@ -131,7 +131,7 @@ int arg_tag_ctx_raw_tp(void *ctx)
>  {
>         struct my_struct x = { .x = 123 };
>
> -       return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> +       return subprog_ctx_tag((long)ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
>  }
>
> and it "works".

Yeah, but you had to actively force casting everywhere *and* you still
had to consciously add __arg_ctx, right? If a user wants to subvert
the type system, they will do it. It's C, after all. But if they just
accidentally use sk_buff ctx and call it from the XDP program with
xdp_buff/xdp_md, the compiler will call out type mismatch.

>
> Both kernel and libbpf should really limit it to 'void *'.
>
> In the other email I suggested to allow types that match expected
> based on prog type, but even that is probably a danger zone as well.
> The correct type would already be detected by the verifier,
> so extra __arg_ctx is pointless.
> It makes sense only for such polymorphic functions and those
> better use 'void *' and don't dereference it.
>
> I think this can be a follow up.

Not really just polymorphic functions. Think about subprog
specifically for the fentry program, as one example. You *need*
__arg_ctx just to make context passing work, but you also want
non-`void *` type to access arguments.

int subprog(u64 *args __arg_ctx) { ... }

SEC("fentry")
int BPF_PROG(main_prog, ...)
{
        return subprog(ctx);
}

Similarly, tracepoint programs, you'd have:

int subprog(struct syscall_trace_enter* ctx __arg_ctx) { ... }

SEC("tracepoint/syscalls/sys_enter_kill")
int main_prog(struct syscall_trace_enter* ctx)
{
        return subprog(ctx);
}

So that's one group of cases.

Another special case are networking programs, where both "__sk_buff"
and "sk_buff" are allowed, same for "xdp_buff" and "xdp_md".

Also, kprobes are special, both "struct bpf_user_pt_regs_t" and
*typedef* "bpf_user_pt_regs_t" are supported. But in practice users
will often just use `struct pt_regs *ctx`, actually.

There might be some other edges I don't yet realize. In short, I think
any sort of enforcement will just cause unnecessary pain, while
seemingly fixing some problem that doesn't seem to be a problem in
practice.
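
To make the "compiler will call out type mismatch" point concrete, here is a minimal plain-C11 sketch (not BPF code). The struct names and the `IS_SKB_CTX` macro are hypothetical stand-ins invented for illustration; the only thing it relies on is that C keeps pointers to different struct types distinct unless someone casts explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for two unrelated context types. */
struct demo_skb_ctx { unsigned int len; };
struct demo_xdp_ctx { unsigned int data; };

/* Evaluates to 1 only when the argument's static type is
 * 'struct demo_skb_ctx *'; any other pointer type yields 0.
 */
#define IS_SKB_CTX(p) _Generic((p), struct demo_skb_ctx *: 1, default: 0)

/* Typed helper: the signature only accepts the skb-style pointer. */
static unsigned int skb_len(struct demo_skb_ctx *ctx)
{
        assert(ctx != NULL);
        return ctx->len;
}
```

Passing a `struct demo_xdp_ctx *` straight to `skb_len()` would draw an incompatible-pointer-type diagnostic from GCC or Clang; only an explicit cast (as in the `long` experiment earlier in the thread) silences it.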

On Thu, Jan 4, 2024 at 10:37 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Jan 3, 2024 at 9:39 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Jan 3, 2024 at 5:39 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > > the actual argument type not important, so that user can just define
> > > "generic" signature:
> > >
> > > __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> >
> > I still think that this __arg_ctx only makes sense with 'void *'.
> > Blind rewrite of ctx is a foot gun.
> >
> > I've tried the following:
> >
> > diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > index 9a06e5eb1fbe..0e5f5205d4a8 100644
> > --- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > @@ -106,9 +106,9 @@ int perf_event_ctx(void *ctx)
> >  /* this global subprog can be now called from many types of entry progs, each
> >   * with different context type
> >   */
> > -__weak int subprog_ctx_tag(void *ctx __arg_ctx)
> > +__weak int subprog_ctx_tag(long ctx __arg_ctx)
> >  {
> > -       return bpf_get_stack(ctx, stack, sizeof(stack), 0);
> > +       return bpf_get_stack((void *)ctx, stack, sizeof(stack), 0);
> >  }
> >
> >  struct my_struct { int x; };
> > @@ -131,7 +131,7 @@ int arg_tag_ctx_raw_tp(void *ctx)
> >  {
> >         struct my_struct x = { .x = 123 };
> >
> > -       return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> > +       return subprog_ctx_tag((long)ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> >  }
> >
> > and it "works".
>
> Yeah, but you had to actively force casting everywhere *and* you still
> had to consciously add __arg_ctx, right? If a user wants to subvert
> the type system, they will do it. It's C, after all. But if they just
> accidentally use sk_buff ctx and call it from the XDP program with
> xdp_buff/xdp_md, the compiler will call out type mismatch.

I could have used long everywhere and avoided casts.

> >
> > Both kernel and libbpf should really limit it to 'void *'.
> >
> > In the other email I suggested to allow types that match expected
> > based on prog type, but even that is probably a danger zone as well.
> > The correct type would already be detected by the verifier,
> > so extra __arg_ctx is pointless.
> > It makes sense only for such polymorphic functions and those
> > better use 'void *' and don't dereference it.
> >
> > I think this can be a follow up.
>
> Not really just polymorphic functions. Think about subprog
> specifically for the fentry program, as one example. You *need*
> __arg_ctx just to make context passing work, but you also want
> non-`void *` type to access arguments.
>
> int subprog(u64 *args __arg_ctx) { ... }
>
> SEC("fentry")
> int BPF_PROG(main_prog, ...)
> {
>         return subprog(ctx);
> }
>
> Similarly, tracepoint programs, you'd have:
>
> int subprog(struct syscall_trace_enter* ctx __arg_ctx) { ... }
>
> SEC("tracepoint/syscalls/sys_enter_kill")
> int main_prog(struct syscall_trace_enter* ctx)
> {
>         return subprog(ctx);
> }
>
> So that's one group of cases.

But the above two are not supported by libbpf
since it doesn't handle "tracing" and "tracepoint" prog types
in global_ctx_map.
I suspect the kernel sort-of supports above, but in a dangerous
and broken way.

My point is that users must not use __arg_ctx in these two cases.
fentry (tracing prog type) wants 'void *' in the kernel to
match to ctx.
So the existing mechanism (prior to arg_ctx in the kernel)
should already work.

> Another special case are networking programs, where both "__sk_buff"
> and "sk_buff" are allowed, same for "xdp_buff" and "xdp_md".

what do you mean both?
networking bpf prog must only use __sk_buff and that is one and
only supported ctx.
Using 'struct sk_buff *ctx __arg_ctx' will be a bad bug.
Since offsets will be all wrong while ctx rewrite will apply garbage
and will likely fail.

> Also, kprobes are special, both "struct bpf_user_pt_regs_t" and
> *typedef* "bpf_user_pt_regs_t" are supported. But in practice users
> will often just use `struct pt_regs *ctx`, actually.

Same thing. The global bpf prog has to use bpf_user_pt_regs_t
to be properly recognized as ctx arg type.
Nothing special. Using 'struct pt_regs * ctx __arg_ctx' and blind
rewrite will cause similar hard to debug bugs when
bpf_user_pt_regs_t doesn't match pt_regs that bpf prog sees
at compile time.
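
The "offsets will be all wrong" concern can be sketched in plain C (not BPF code). Both structs below are hypothetical stand-ins, not the real `__sk_buff` or `sk_buff` layouts. The sketch only shows that a field with the same name can sit at different offsets in two types, so code compiled against one layout reads unrelated bytes when handed the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stable, user-visible context layout. */
struct fake_uapi_ctx {
        uint32_t len;
        uint32_t pkt_type;
};

/* Hypothetical stand-in for an internal layout: 'len' lives elsewhere. */
struct fake_kernel_ctx {
        uint64_t next;
        uint64_t prev;
        uint32_t len;
};

/* Reads 'len' at the offset the uapi layout dictates, regardless of
 * what the pointer really refers to.
 */
static uint32_t read_len_as_uapi(const void *ctx)
{
        uint32_t len;

        assert(ctx != NULL);
        memcpy(&len, (const char *)ctx + offsetof(struct fake_uapi_ctx, len),
               sizeof(len));
        return len;
}

/* Returns 1 when both layouts place 'len' at the same offset. */
static int len_offsets_match(void)
{
        return offsetof(struct fake_uapi_ctx, len) ==
               offsetof(struct fake_kernel_ctx, len);
}
```

With these made-up layouts, `read_len_as_uapi()` on a `struct fake_kernel_ctx` returns bytes of `next` rather than `len`, which is the kind of silent mismatch being discussed.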

On Thu, Jan 4, 2024 at 10:52 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jan 4, 2024 at 10:37 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Jan 3, 2024 at 9:39 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Jan 3, 2024 at 5:39 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > > >
> > > > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > > > the actual argument type not important, so that user can just define
> > > > "generic" signature:
> > > >
> > > > __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> > >
> > > I still think that this __arg_ctx only makes sense with 'void *'.
> > > Blind rewrite of ctx is a foot gun.
> > >
> > > I've tried the following:
> > >
> > > diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > index 9a06e5eb1fbe..0e5f5205d4a8 100644
> > > --- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
> > > @@ -106,9 +106,9 @@ int perf_event_ctx(void *ctx)
> > >  /* this global subprog can be now called from many types of entry progs, each
> > >   * with different context type
> > >   */
> > > -__weak int subprog_ctx_tag(void *ctx __arg_ctx)
> > > +__weak int subprog_ctx_tag(long ctx __arg_ctx)
> > >  {
> > > -       return bpf_get_stack(ctx, stack, sizeof(stack), 0);
> > > +       return bpf_get_stack((void *)ctx, stack, sizeof(stack), 0);
> > >  }
> > >
> > >  struct my_struct { int x; };
> > > @@ -131,7 +131,7 @@ int arg_tag_ctx_raw_tp(void *ctx)
> > >  {
> > >         struct my_struct x = { .x = 123 };
> > >
> > > -       return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> > > +       return subprog_ctx_tag((long)ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
> > >  }
> > >
> > > and it "works".
> >
> > Yeah, but you had to actively force casting everywhere *and* you still
> > had to consciously add __arg_ctx, right? If a user wants to subvert
> > the type system, they will do it. It's C, after all. But if they just
> > accidentally use sk_buff ctx and call it from the XDP program with
> > xdp_buff/xdp_md, the compiler will call out type mismatch.
>
> I could have used long everywhere and avoided casts.
>

My point was that it's hard to accidentally forget to "generalize"
type if you were supporting sk_buff, and suddenly started calling it
with xdp_md.

From my POV, if I'm a user, and I declare an argument as long and
annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
some smart-ass library to double-guess me dictating what exact
incantation I should specify to make it happy.

If I'm clueless and just randomly sprinkling __arg_ctx, then I have
bigger problems than type mismatch.

> > >
> > > Both kernel and libbpf should really limit it to 'void *'.
> > >
> > > In the other email I suggested to allow types that match expected
> > > based on prog type, but even that is probably a danger zone as well.
> > > The correct type would already be detected by the verifier,
> > > so extra __arg_ctx is pointless.
> > > It makes sense only for such polymorphic functions and those
> > > better use 'void *' and don't dereference it.
> > >
> > > I think this can be a follow up.
> >
> > Not really just polymorphic functions. Think about subprog
> > specifically for the fentry program, as one example. You *need*
> > __arg_ctx just to make context passing work, but you also want
> > non-`void *` type to access arguments.
> >
> > int subprog(u64 *args __arg_ctx) { ... }
> >
> > SEC("fentry")
> > int BPF_PROG(main_prog, ...)
> > {
> >         return subprog(ctx);
> > }
> >
> > Similarly, tracepoint programs, you'd have:
> >
> > int subprog(struct syscall_trace_enter* ctx __arg_ctx) { ... }
> >
> > SEC("tracepoint/syscalls/sys_enter_kill")
> > int main_prog(struct syscall_trace_enter* ctx)
> > {
> >         return subprog(ctx);
> > }
> >
> > So that's one group of cases.
>
> But the above two are not supported by libbpf
> since it doesn't handle "tracing" and "tracepoint" prog types
> in global_ctx_map.

Ok, so I'm confused now. I thought we were talking about both
kernel-side and libbpf-side extra checks.

Look, I don't want libbpf to be too smart and actually cause
unnecessary problems for users (pt_regs being one such case, see
below), and making users do work arounds just to satisfy libbpf. Like
passing `void * ctx __arg_ctx`, but then casting to `struct pt_regs`,
for example. (see below about pt_regs)

Sure, if someone has no clue what they are doing and specifies a
different type, I think it's acceptable for them to have that bug.
They will debug it, fix it, learn something, and won't do it again.
I'd rather assume users know what they are doing rather than
double-guess what they are doing.

If we are talking about libbpf-only changes just for those types that
libbpf is rewriting, fine (though I'm still not happy about struct
pt_regs case not working), we can add it. If Eduard concurs, I'll add
it, it's not hard. But as I said, I think libbpf would be doing
something that it's not supposed to do here (libbpf is just silently
adding an annotation, effectively, it's not changing how code is
generated or how verifier is interpreting types).

If we are talking about kernel-side extra checks, I propose we do that
on my next patch set adding PTR_TO_BTF_ID, but again, we need to keep
those non-polymorphic valid cases in mind (u64 *ctx for fentry,
tracepoint structs, etc) and not make them unnecessarily painful.

> I suspect the kernel sort-of supports above, but in a dangerous
> and broken way.
>
> My point is that users must not use __arg_ctx in these two cases.
> fentry (tracing prog type) wants 'void *' in the kernel to
> match to ctx.
> So the existing mechanism (prior to arg_ctx in the kernel)
> should already work.

Let's unpack. fentry doesn't "want" `void *`, it just doesn't support
passing context argument to global subprog. So you would have to
specify __arg_ctx, and that will only work on recent enough kernels.
At that point, all of `long ctx __arg_ctx`, `void *ctx __arg_ctx` and
`u64 *ctx __arg_ctx` will work. Yes, `long ctx` out of those 3 are
weird, but verifier will treat it as PTR_TO_CTX regardless of specific
type correctly.

More importantly, I'm saying that both `void *ctx __arg_ctx` and
`u64 *ctx __arg_ctx` should work for fentry, don't you agree?

>
> > Another special case are networking programs, where both "__sk_buff"
> > and "sk_buff" are allowed, same for "xdp_buff" and "xdp_md".
>
> what do you mean both?
> networking bpf prog must only use __sk_buff and that is one and
> only supported ctx.
> Using 'struct sk_buff *ctx __arg_ctx' will be a bad bug.
> Since offsets will be all wrong while ctx rewrite will apply garbage
> and will likely fail.

You are right about wrong offsets, but the kernel does allow it. See
[0]. I actually tried, and indeed, it allows sk_buff to denote
"context". Note that I had to comment out skb->len dereference
(otherwise verifier will correctly complain about wrong offset), but
it is recognized as PTR_TO_CTX and I could technically pass it to
another subprog or helpers/kfuncs (and that would work).

  [0] https://lore.kernel.org/all/20230301154953.641654-2-joannelkoong@gmail.com/

diff --git a/tools/testing/selftests/bpf/progs/test_global_func2.c b/tools/testing/selftests/bpf/progs/test_global_func2.c
index 2beab9c3b68a..29d7f3e78f8e 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func2.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func2.c
@@ -1,16 +1,15 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2020 Facebook */
-#include <stddef.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"

 #define MAX_STACK (512 - 3 * 32)

 static __attribute__ ((noinline))
-int f0(int var, struct __sk_buff *skb)
+int f0(int var, struct sk_buff *skb)
 {
-       return skb->len;
+       return 0;
 }

 __attribute__ ((noinline))
@@ -20,7 +19,7 @@ int f1(struct __sk_buff *skb)

        __sink(buf[MAX_STACK - 1]);

-       return f0(0, skb) + skb->len;
+       return f0(0, (void*)skb) + skb->len;
 }

 int f3(int, struct __sk_buff *skb, int);
@@ -45,5 +44,5 @@ SEC("tc")
 __success
 int global_func2(struct __sk_buff *skb)
 {
-       return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
+       return f0(1, (void *)skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
 }

>
> > Also, kprobes are special, both "struct bpf_user_pt_regs_t" and
> > *typedef* "bpf_user_pt_regs_t" are supported. But in practice users
> > will often just use `struct pt_regs *ctx`, actually.
>
> Same thing. The global bpf prog has to use bpf_user_pt_regs_t
> to be properly recognized as ctx arg type.
> Nothing special. Using 'struct pt_regs * ctx __arg_ctx' and blind
> rewrite will cause similar hard to debug bugs when
> bpf_user_pt_regs_t doesn't match pt_regs that bpf prog sees
> at compile time.

So this is not the same thing as skbuff. If BPF program is meant for a
single architecture, like x86-64, it's completely valid (and that's
what people have been doing with static subprogs for ages now) to just
use `struct pt_regs`. They are the same thing on x86.

I'll say even more, with libbpf's PT_REGS_xxx() macros you don't even
need to know about pt_regs vs user_pt_regs difference, as macros
properly force-cast arguments, depending on architecture. So in your
BPF code you can just pass `struct pt_regs *` around just fine across
multiple architectures as long as you only use PT_REGS_xxx() macros
and then pass that context to helpers (to get stack trace,
bpf_perf_event_output, etc).

No one even knows about bpf_user_pt_regs_t, I had to dig it up from
kernel source code and let users know what exact type name to use for
global subprog.
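
As a rough illustration of the "macros force-cast arguments, depending on architecture" idea, here is a plain-C sketch. It is not libbpf's actual implementation: every `DEMO_`/`demo_` name, both struct layouts, and the `DEMO_ARCH_ARM64` switch are assumptions made up for this example. The caller always passes one pointer type and the accessor macro hides the per-architecture cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layouts; not the real kernel structs. */
struct demo_pt_regs { uint64_t di; uint64_t si; };
struct demo_user_pt_regs { uint64_t regs[31]; };

/* The accessor picks the cast and field for the (assumed) target
 * architecture, so callers never spell out the real register type.
 */
#ifdef DEMO_ARCH_ARM64
#define DEMO_REGS_CAST(x)  ((const struct demo_user_pt_regs *)(x))
#define DEMO_REGS_PARM1(x) (DEMO_REGS_CAST(x)->regs[0])
#else
#define DEMO_REGS_CAST(x)  ((const struct demo_pt_regs *)(x))
#define DEMO_REGS_PARM1(x) (DEMO_REGS_CAST(x)->di)
#endif

/* Caller-facing code passes 'struct demo_pt_regs *' everywhere and only
 * touches registers through the accessor macro.
 */
static uint64_t first_arg(const struct demo_pt_regs *ctx)
{
        assert(ctx != NULL);
        return DEMO_REGS_PARM1(ctx);
}
```

The design point is that the signature stays the same across architectures while the macro absorbs the difference, which is why code written this way rarely needs to name the underlying type.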

On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
>
> My point was that it's hard to accidentally forget to "generalize"
> type if you were supporting sk_buff, and suddenly started calling it
> with xdp_md.
>
> From my POV, if I'm a user, and I declare an argument as long and
> annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
> some smart-ass library to double-guess me dictating what exact
> incantation I should specify to make it happy.

But that's exactly what's happening!
The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx'
and replaces it with whatever is appropriate for prog type.
More below.

>  static __attribute__ ((noinline))
> -int f0(int var, struct __sk_buff *skb)
> +int f0(int var, struct sk_buff *skb)
>  {
> -       return skb->len;
> +       return 0;
>  }
>
>  __attribute__ ((noinline))
> @@ -20,7 +19,7 @@ int f1(struct __sk_buff *skb)
>
>         __sink(buf[MAX_STACK - 1]);
>
> -       return f0(0, skb) + skb->len;
> +       return f0(0, (void*)skb) + skb->len;

This is static f0. Not sure what you're trying to say.
I don't think btf_get_prog_ctx_type() logic applies here.

> I'll say even more, with libbpf's PT_REGS_xxx() macros you don't even
> need to know about pt_regs vs user_pt_regs difference, as macros
> properly force-cast arguments, depending on architecture. So in your
> BPF code you can just pass `struct pt_regs *` around just fine across
> multiple architectures as long as you only use PT_REGS_xxx() macros
> and then pass that context to helpers (to get stack trace,
> bpf_perf_event_output, etc).

Pretty much. For some time the kernel recognized bpf_user_pt_regs_t
as PTR_TO_CTX for kprobe.
And the users who needed global prog verification with ctx
already used that feature.
We even have helper macros to typeof to correct btf type.

From selftests:

__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)
{
        return bpf_get_stack(ctx, &stack, sizeof(stack), 0);
}

SEC("?kprobe")
__success
int kprobe_typedef_ctx(void *ctx)
{
        return kprobe_typedef_ctx_subprog(ctx);
}

#define pt_regs_struct_t typeof(*(__PT_REGS_CAST((struct pt_regs *)NULL)))

__weak int kprobe_struct_ctx_subprog(pt_regs_struct_t *ctx)
{
        return bpf_get_stack((void *)ctx, &stack, sizeof(stack), 0);
}

SEC("?kprobe")
__success
int kprobe_resolved_ctx(void *ctx)
{
        return kprobe_struct_ctx_subprog(ctx);
}

__PT_REGS_CAST is arch dependent and typeof makes it seen with
correct btf_id and the kernel knows it's PTR_TO_CTX.
All that works. No need for __arg_ctx.
I'm sure you know this.
I'm only explaining for everybody else to follow.

> No one even knows about bpf_user_pt_regs_t, I had to dig it up from
> kernel source code and let users know what exact type name to use for
> global subprog.

Few people know that global subprogs are verified differently than static.
That's true, but I bet people that knew also used the right type for ctx.
If you're saying that __arg_ctx is making it easier for users
to use global subprogs I certainly agree, but it's not
something that was mandatory for uniform global progs.
__arg_ctx main value is for polymorphic subprogs.
An add-on value is ease-of-use for existing non polymorphic subprogs.

I'm saying that in the above example working code:

__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)

should _not_ be allowed to be replaced with:

__weak int kprobe_typedef_ctx_subprog(struct pt_regs *ctx __arg_ctx)

Unfortunately in the newest kernel/libbpf patches allowed it and
this way both kernel and libbpf are silently breaking C type
matching rules and general expectations of C language.

Consider these variants:

1.
__weak int kprobe_typedef_ctx_subprog(struct pt_regs *ctx __arg_ctx)
{ PT_REGS_PARM1(ctx); }

2.
__weak int kprobe_typedef_ctx_subprog(void *ctx __arg_ctx)
{ struct pt_regs *regs = ctx; PT_REGS_PARM1(regs); }

3.
__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)
{ PT_REGS_PARM1(ctx); }

In 1 and 3 the caller has to type cast to correct type.
In 2 the caller can pass anything without type cast.

In C when the user writes: void foo(int *p)
it knows that it can access it as pointer to int in the callee
and it's caller's job to pass correct pointer into it.
When caller type casts something else to 'int *' it's caller's fault
if things don't work.
Now when user writes:
void foo(void *p) { int *i = p;

the caller can pass anything into foo() and callee's fault
to assume that 'void *' is 'int *'.
These are the C rules that we're breaking with __arg_ctx.

In 2 it's clear to callee that any ctx argument could have been passed
and type cast to 'struct pt_regs *' it's callee's responsibility.

In 3 the users know that only bpf_user_pt_regs_t will be passed in.

But 1 (the current kernel and libbpf) breaks these C rules.
The C language tells prog writer to expect that only 'struct pt_regs *'
will be passed, but the kernel/libbpf allows any ctx to be passed in.

Hence 1 should be disallowed.

The 'void *' case 2 we extend in the future to truly support polymorphism:

__weak int subprog(void *ctx __arg_ctx)
{
        __u32 ctx_btf_id = bpf_core_typeof(*ctx);

        if (ctx_btf_id == bpf_core_type_id_kernel(struct sk_buff)) {
                struct sk_buff *skb = ctx;
                ..
        } else if (ctx_btf_id == bpf_core_type_id_kernel(struct xdp_buff)) {
                struct xdp_buff *xdp = ctx;

and it will conform to C rules. It's on callee side to do the right thing
with 'void *'.
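
The caller-versus-callee responsibility split described above can be sketched in plain C (not BPF code). The `demo_kind` tag is a hypothetical stand-in for whatever run-time type information a polymorphic callee would consult; it is not an existing BPF or libbpf facility.

```c
#include <assert.h>
#include <stddef.h>

/* Typed parameter: the signature promises the callee an int, so it is
 * the caller's job to pass a pointer that really refers to one.
 */
static int read_typed(const int *p)
{
        assert(p != NULL);
        return *p;
}

/* Hypothetical tag telling the callee what 'p' actually points to. */
enum demo_kind { DEMO_INT, DEMO_LONG };

/* Generic parameter: any pointer is accepted, so interpreting it
 * correctly is the callee's responsibility.
 */
static long read_generic(const void *p, enum demo_kind kind)
{
        assert(p != NULL);
        switch (kind) {
        case DEMO_INT:
                return *(const int *)p;
        case DEMO_LONG:
                return *(const long *)p;
        }
        return -1;
}
```

`read_typed()` mirrors variants 1 and 3 (the type in the signature is a promise), while `read_generic()` mirrors variant 2 and the dispatch-on-type shape of the proposed polymorphic subprog.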

On Thu, Jan 4, 2024 at 5:34 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> >
> > My point was that it's hard to accidentally forget to "generalize"
> > type if you were supporting sk_buff, and suddenly started calling it
> > with xdp_md.
> >
> > From my POV, if I'm a user, and I declare an argument as long and
> > annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
> > some smart-ass library to double-guess me dictating what exact
> > incantation I should specify to make it happy.
>
> But that's exactly what's happening!
> The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx'
> and replaces it with whatever is appropriate for prog type.

The only thing that libbpf does in this case is it honors __arg_ctx
and makes it work *exactly the same* as __arg_ctx natively works on
the newest kernel. Not more, not less. It doesn't change compilation
or verification rules. At all.

Libbpf is not a compiler. And libbpf is not a verifier. It sees
__arg_ctx, it makes sure this argument is communicated to the verifier
as context. That's all. Again, it has no effect on code generation
*or* verification (compared to a newer kernel with native __arg_ctx
support).

If a user has a bug, either the compiler or verifier will complain (or
not), depending on how subtle the bug is.

> More below.
>
> >  static __attribute__ ((noinline))
> > -int f0(int var, struct __sk_buff *skb)
> > +int f0(int var, struct sk_buff *skb)
> >  {
> > -       return skb->len;
> > +       return 0;
> >  }
> >
> >  __attribute__ ((noinline))
> > @@ -20,7 +19,7 @@ int f1(struct __sk_buff *skb)
> >
> >         __sink(buf[MAX_STACK - 1]);
> >
> > -       return f0(0, skb) + skb->len;
> > +       return f0(0, (void*)skb) + skb->len;
>
> This is static f0. Not sure what you're trying to say.

Ok, I brainfarted and converted a static function in the test called
test_global_func2 without even checking if it's global or not, as I
didn't expect that there are any static subprogs at all. My bad. But
the point stands, both sk_buff and __sk_buff are recognized as
PTR_TO_CTX for global subprogs, see below. And that's all I'm trying
to say.

> I don't think btf_get_prog_ctx_type() logic applies here.
>

Did you check the patch I referenced? I'm saying that
`struct sk_buff *ctx` is recognized as a context type by kernel
*for global subprog*. Here we go again, this time on f1(), which is
not static:

diff --git a/tools/testing/selftests/bpf/progs/test_global_func2.c b/tools/testing/selftests/bpf/progs/test_global_func2.c
index 2beab9c3b68a..4a54350f0aa0 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func2.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func2.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2020 Facebook */
-#include <stddef.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"

@@ -14,13 +13,13 @@ int f0(int var, struct __sk_buff *skb)
 }

 __attribute__ ((noinline))
-int f1(struct __sk_buff *skb)
+int f1(struct sk_buff *skb)
 {
        volatile char buf[MAX_STACK] = {};

        __sink(buf[MAX_STACK - 1]);

-       return f0(0, skb) + skb->len;
+       return f0(0, (void *)skb);
 }

 int f3(int, struct __sk_buff *skb, int);
@@ -28,7 +27,7 @@ int f3(int, struct __sk_buff *skb, int);
 __attribute__ ((noinline))
 int f2(int val, struct __sk_buff *skb)
 {
-       return f1(skb) + f3(val, skb, 1);
+       return f1((void *)skb) + f3(val, skb, 1);
 }

 __attribute__ ((noinline))
@@ -45,5 +44,5 @@ SEC("tc")
 __success
 int global_func2(struct __sk_buff *skb)
 {
-       return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
+       return f0(1, skb) + f1((void *)skb) + f2(2, skb) + f3(3, skb, 4);
 }

And here's portion of veristat log output to be 100% sure this time:

Validating f1() func#2...
20: R1=ctx() R10=fp0
; int f1(struct sk_buff *skb)

It's a context.

> > I'll say even more, with libbpf's PT_REGS_xxx() macros you don't even
> > need to know about pt_regs vs user_pt_regs difference, as macros
> > properly force-cast arguments, depending on architecture. So in your
> > BPF code you can just pass `struct pt_regs *` around just fine across
> > multiple architectures as long as you only use PT_REGS_xxx() macros
> > and then pass that context to helpers (to get stack trace,
> > bpf_perf_event_output, etc).
>
> Pretty much. For some time the kernel recognized bpf_user_pt_regs_t
> as PTR_TO_CTX for kprobe.
> And the users who needed global prog verification with ctx
> already used that feature.

Not really, see below.

For a long time *we thought* that kernel recognizes
bpf_user_pt_regs_t, but in reality it wanted
`struct bpf_user_pt_regs_t` which doesn't even exist in kernel and has
nothing common with either `struct pt_regs` or `struct user_pt_regs`.
I fixed that and now the kernel recognizes *both* typedef and struct
bpf_user_pt_regs_t. And there is no point in using typedef, because
`struct bpf_user_pt_regs_t` is backwards compatible and that's what
users actually use in practice.

> We even have helper macros to typeof to correct btf type.
>
> From selftests:
>
> __weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)
> {
>         return bpf_get_stack(ctx, &stack, sizeof(stack), 0);
> }
>
> SEC("?kprobe")
> __success
> int kprobe_typedef_ctx(void *ctx)
> {
>         return kprobe_typedef_ctx_subprog(ctx);
> }
>
> #define pt_regs_struct_t typeof(*(__PT_REGS_CAST((struct pt_regs *)NULL)))
>
> __weak int kprobe_struct_ctx_subprog(pt_regs_struct_t *ctx)
> {
>         return bpf_get_stack((void *)ctx, &stack, sizeof(stack), 0);
> }
>
> SEC("?kprobe")
> __success
> int kprobe_resolved_ctx(void *ctx)
> {
>         return kprobe_struct_ctx_subprog(ctx);
> }
>
> __PT_REGS_CAST is arch dependent and typeof makes it seen with
> correct btf_id and the kernel knows it's PTR_TO_CTX.

TBH, I don't know what btf_id has to do with this, it looks either as
a distraction or subtle point you are making that I'm missing.
__PT_REGS_CAST() just does C language cast, there is no BTF or BTF ID
involved here, so what am I missing?

> All that works. No need for __arg_ctx.
> I'm sure you know this.
> I'm only explaining for everybody else to follow.
>

Ok, though I'm not sure we are actually agreeing that with libbpf's
PT_REGS_xxx() macros it's kind of expected that users will be using
`struct pt_regs *` everywhere (and not user_pt_regs). And so actual
real world code is actually written with explicit `struct pt_regs *`
being passed around. In all static subprogs as well. And only global
subprogs currently force the use of the fake
`struct bpf_user_pt_regs_t` (not typedef, it's so confusing, but I
have to emphasize the big difference, sorry!).

So what I'm saying (and I'm repeating that below) is that it would be
nice to make global subprogs use the same types as static subprogs,
which is just plain `struct pt_regs *ctx __arg_ctx`.

> > No one even knows about bpf_user_pt_regs_t, I had to dig it up from
> > kernel source code and let users know what exact type name to use for
> > global subprog.
>
> Few people know that global subprogs are verified differently than static.
> That's true, but I bet people that knew also used the right type for ctx.
> If you're saying that __arg_ctx is making it easier for users
> to use global subprogs I certainly agree, but it's not
> something that was mandatory for uniform global progs.

What is "uniform global progs"? If you mean those polymorphic global
subprogs, then keep in mind that only one level of global subprogs are
possible without this __arg_ctx approach. It's a big limitation right
now, actually.

> __arg_ctx main value is for polymorphic subprogs.

If you mean this libbpf's type rewriting logic for __arg_ctx, yes, I
agree. If you mean in general, then no, it's not just for polymorphic
subprogs.
It's also to allow passing context in program types that don't support
passing context to global subprog (fentry, tracepoint, etc). But libbpf
cannot do anything about the latter case, if kernel doesn't support
__arg_ctx natively.

> An add-on value is ease-of-use for existing non polymorphic subprogs.
>
> I'm saying that in the above example working code:
>
> __weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)
>
> should _not_ be allowed to be replaced with:
>
> __weak int kprobe_typedef_ctx_subprog(struct pt_regs *ctx __arg_ctx)

Why not? This is what I don't get. Here's a real piece of code to
demonstrate what users do in practice:

struct bpf_user_pt_regs_t {}

__hidden int handle_event_user_pt_regs(struct bpf_user_pt_regs_t* ctx) {
  if (pyperf_prog_cfg.sample_interval > 0) {
    if (__sync_fetch_and_add(&total_events_count, 1) %
        pyperf_prog_cfg.sample_interval) {
      return 0;
    }
  }

  return handle_event_helper((struct pt_regs*)ctx, NULL);
}

See that cast to `struct pt_regs *`? It's because all non-global code is
working with struct pt_regs already, and it's fine. Keep in mind, they
can't use bpf_user_pt_regs_t typedef and avoid the cast, because older
kernels didn't recognize typedef, so they use empty `struct
bpf_user_pt_regs_t`, which has to be cast.

I don't want to get into a debate about whether they should convert all
`struct pt_regs *` to `bpf_user_pt_regs_t *`, that's not the point.
Maybe they could, but their code already is written like that and works.
Using struct pt_regs is not broken for them, both on x86-64 and arm64.

I'm saying that I explicitly do want to be able to declare (in general):

int handle_event_user(struct pt_regs *ctx __arg_ctx) { ...}

And this would work both on old and new kernels, with and without
native __arg_ctx support. And it will be very close to static subprogs
in the existing code base.

Why do you want to disallow this artificially?
> > Unfortunately in the newest kernel/libbpf patches allowed it and > this way both kernel and libbpf are silently breaking C type > matching rules and general expectations of C language. C types are still checked and enforced by the compiler. And only the compiler. Verifier doesn't use BTF for verification of PTR_TO_CTX beyond getting type name for global subprog argument. How is libbpf breaking anything here? > > Consider these variants: > > 1. > __weak int kprobe_typedef_ctx_subprog(struct pt_regs *ctx __arg_ctx) > { PT_REGS_PARM1(ctx); } > > 2. > __weak int kprobe_typedef_ctx_subprog(void *ctx __arg_ctx) > { struct pt_regs *regs = ctx; PT_REGS_PARM1(regs); } > > 3. > __weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx) > { PT_REGS_PARM1(ctx); } > > In 1 and 3 the caller has to type cast to correct type. > In 2 the caller can pass anything without type cast. > > In C when the user writes: void foo(int *p) > it knows that it can access it as pointer to int in the callee > and it's caller's job to pass correct pointer into it. > When caller type casts something else to 'int *' it's caller's fault > if things don't work. > Now when user writes: > void foo(void *p) { int *i = p; > > the caller can pass anything into foo() and callee's fault > to assume that 'void *' is 'int *'. > These are the C rules that we're breaking with __arg_ctx. > > In 2 it's clear to callee that any ctx argument could have been passed > and type cast to 'struct pt_regs *' it's callee's responsibility. > > In 3 the users know that only bpf_user_pt_regs_t will be passed in. > > But 1 (the current kernel and libbpf) breaks these C rules. > The C language tells prog writer to expect that only 'struct pt_regs *' > will be passed, but the kernel/libbpf allows any ctx to be passed in. > > Hence 1 should be disallowed. All the above is already checked and enforced by the compiler. Libbpf doesn't subvert it in any way. 
All that libbpf is doing is saying "ah, user, you want this argument to
be treated as PTR_TO_CTX, right? Too bad host kernel is a bit too old to
understand __arg_ctx natively, but worry you not, I'll just quickly fix
up BTF information that *only kernel* uses *only to check type name*
(nothing else!), and it will look like kernel actually understood
__arg_ctx, that's all, happy BPF'ing!".

If a user is misusing types in his code, that will be caught by the
compiler. If user's code is doing something that the BPF verifier
detects as illegal, regardless of types and whatnot, the verifier will
complain. I don't want libbpf to perform functions of both compiler and
verifier in these narrow and unnecessary cases. Especially that there
are specific situations where the user's code is correct and legal, and
yet libbpf will be complaining because... reasons.

>
> The 'void *' case 2 we extend in the future to truly support polymorphism:
>
> __weak int subprog(void *ctx __arg_ctx)
> {
>   __u32 ctx_btf_id = bpf_core_typeof(*ctx);
>
>   if (ctx_btf_id == bpf_core_type_id_kernel(struct sk_buff)) {
>       struct sk_buff *skb = ctx;
>       ..
>   } else if (ctx_btf_id == bpf_core_type_id_kernel(struct xdp_buff)) {
>       struct xdp_buff *xdp = ctx;
>
> and it will conform to C rules. It's on callee side to do the right
> thing with 'void *'.
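For readers following along, the "fix up BTF so the name check passes" step described above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not libbpf or kernel code: `enum prog_kind`, `expected_ctx_name()`, `arg_is_ctx_by_name()` and `arg_is_ctx_after_fixup()` are hypothetical names, and the table of expected names is assumed from the types mentioned in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical program kinds; not the kernel's enum bpf_prog_type. */
enum prog_kind { KIND_KPROBE, KIND_SOCKET_FILTER };

/* Assumed mapping from program kind to the struct name expected for ctx. */
const char *expected_ctx_name(enum prog_kind kind)
{
	switch (kind) {
	case KIND_KPROBE:
		return "bpf_user_pt_regs_t";
	case KIND_SOCKET_FILTER:
		return "__sk_buff";
	}
	return NULL;
}

/* Name-only check: the declared struct name is compared, fields are not. */
bool arg_is_ctx_by_name(enum prog_kind kind, const char *declared)
{
	const char *want = expected_ctx_name(kind);

	return want && declared && strcmp(want, declared) == 0;
}

/* Model of the fallback: an argument tagged as ctx has its declared
 * name swapped for the expected one before the name check runs. */
bool arg_is_ctx_after_fixup(enum prog_kind kind, const char *declared,
			    bool has_arg_ctx_tag)
{
	if (has_arg_ctx_tag)
		declared = expected_ctx_name(kind);
	return arg_is_ctx_by_name(kind, declared);
}
```

In this model a kprobe argument declared as `struct pt_regs *` fails the plain name check and passes once the tag triggers the fix-up, which is the behaviour the two sides of the thread are arguing about.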
On Thu, Jan 4, 2024 at 7:58 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jan 4, 2024 at 5:34 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > >
> > > My point was that it's hard to accidentally forget to "generalize"
> > > type if you were supporting sk_buff, and suddenly started calling it
> > > with xdp_md.
> > >
> > > From my POV, if I'm a user, and I declare an argument as long and
> > > annotate it as __arg_ctx, then I know what I'm doing and I'd hate for
> > > some smart-ass library to double-guess me dictating what exact
> > > incantation I should specify to make it happy.
> >
> > But that's exactly what's happening!
> > The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx'
> > and replaces it with whatever is appropriate for prog type.
>
> The only thing that libbpf does in this case is it honors __arg_ctx
> and makes it work *exactly the same* as __arg_ctx natively works on
> the newest kernel. Not more, not less. It doesn't change compilation
> or verification rules. At all.

Here in all previous emails I was talking about both kernel and libbpf.
Both shouldn't be breaking C rules.
Not singling out libbpf.

> Validating f1() func#2...
> 20: R1=ctx() R10=fp0
> ; int f1(struct sk_buff *skb)
>
> It's a context.

Ohh. Looks like I screwed it up back then.

        /* only compare that prog's ctx type name is the same as
         * kernel expects. No need to compare field by field.
         * It's ok for bpf prog to do:
         * struct __sk_buff {};
         * int socket_filter_bpf_prog(struct __sk_buff *skb)
         * { // no fields of skb are ever used }
         */
        if (strcmp(ctx_tname, "__sk_buff") == 0 && strcmp(tname, "sk_buff") == 0)
                return ctx_type;

See comment. The intent was to allow __sk_buff in prog to
match with __sk_buff in the kernel.
Brainfart.

> Not really, see below. For a long time *we thought* that kernel
> recognizes bpf_user_pt_regs_t, but in reality it wanted `struct
> bpf_user_pt_regs_t` which doesn't even exist in kernel and has nothing
> common with either `struct pt_regs` or `struct user_pt_regs`. I fixed
> that and now the kernel recognizes *both* typedef and struct
> bpf_user_pt_regs_t. And there is no point in using typedef, because
> `struct bpf_user_pt_regs_t` is backwards compatible and that's what
> users actually use in practice.

Hmm.
The test with
__weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx)

was added back in Feb 2023.
So it was surely working for the last year.

> > __PT_REGS_CAST is arch dependent and typeof makes it seen with
> > correct btf_id and the kernel knows it's PTR_TO_CTX.
>
> TBH, I don't know what btf_id has to do with this, it looks either as
> a distraction or subtle point you are making that I'm missing.
> __PT_REGS_CAST() just does C language cast, there is no BTF or BTF ID
> involved here, so what am I missing?

That was your patch :)
I'm just pointing out the neat trick with typeof to put
the correct type in there,
so it's later seen with proper btf_id and recognized as ctx.
You added it a year ago.

>
> Why not? This is what I don't get. Here's a real piece of code to
> demonstrate what users do in practice:
>
> struct bpf_user_pt_regs_t {}
>
> __hidden int handle_event_user_pt_regs(struct bpf_user_pt_regs_t* ctx) {
>   if (pyperf_prog_cfg.sample_interval > 0) {
>     if (__sync_fetch_and_add(&total_events_count, 1) %
>         pyperf_prog_cfg.sample_interval) {
>       return 0;
>     }
>   }
>
>   return handle_event_helper((struct pt_regs*)ctx, NULL);
> }

I think you're talking about kernel prior to that commit a year ago
that made it possible to drop 'struct'.

> I'm saying that I explicitly do want to be able to declare (in general):
>
> int handle_event_user(struct pt_regs *ctx __arg_ctx) { ...}
>
> And this would work both on old and new kernels, with and without
> native __arg_ctx support. And it will be very close to static subprogs
> in the existing code base.
>
> Why do you want to disallow this artificially?

Not artificially, but only when pt_regs in bpf prog doesn't match
what kernel is passing.
I think allowing only:
  handle_event_user(void *ctx __arg_ctx)
and prog will cast it to pt_regs immediately is less surprising
and proper C code,
but
  handle_event_user(struct pt_regs *ctx __arg_ctx)
is also ok when pt_regs is indeed what is being passed.
Which will be the case for x86.
And will be fine on arm64 too, because
arch/arm64/include/asm/ptrace.h
struct pt_regs {
        union {
                struct user_pt_regs user_regs;

but if arm64 ever changes that layout we should start failing to load.

> All the above is already checked and enforced by the compiler. Libbpf
> doesn't subvert it in any way. All that libbpf is doing is saying "ah,
> user, you want this argument to be treated as PTR_TO_CTX, right? Too
> bad host kernel is a bit too old to understand __arg_ctx natively, but
> worry you not, I'll just quickly fix up BTF information that *only
> kernel* uses *only to check type name* (nothing else!), and it will
> look like kernel actually understood __arg_ctx, that's all, happy
> BPF'ing!".

and this way libbpf _may_ introduce a hard to debug bug.
The same mistake the new kernel _may_ do with __arg_ctx with old libbpf.
Both will do a hidden typecast when the bpf prog is
potentially written with different type.

foo(struct pt_regs *ctx __arg_ctx)
Quick git grep shows that it will probably work on all archs
except 'arc' where
arch/arc/include/uapi/asm/bpf_perf_event.h:typedef struct
user_regs_struct bpf_user_pt_regs_t;
and struct pt_regs seems to have a different layout than user_regs_struct.

But when kernel allows sk_buff to be passed into
foo(struct pt_regs *ctx __arg_ctx)
is just broken.
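The arm64 argument above rests on a layout fact: when the user-visible register block is the first member of the larger struct, a pointer to one is also a valid pointer to the other. A minimal sketch of that idea follows, using toy structs rather than the real arm64 definitions. `toy_user_regs`, `toy_pt_regs`, `toy_read_pc()` and `toy_views_alias()` are made-up names, and the compile-time assertion only stands in for the "fail to load if the layout changes" idea; it is not how the kernel or libbpf checks it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the real arm64 structs have many more fields. */
struct toy_user_regs {
	unsigned long regs[4];
	unsigned long sp;
	unsigned long pc;
};

/* Mirrors the shape quoted above: the user-visible part comes first. */
struct toy_pt_regs {
	union {
		struct toy_user_regs user_regs;
	};
	unsigned long kernel_only;
};

/* Compile-time guard: the build breaks if the prefix ever moves. */
_Static_assert(offsetof(struct toy_pt_regs, user_regs) == 0,
	       "user_regs must stay at offset 0");

/* Callee written against the user-visible type. */
unsigned long toy_read_pc(const struct toy_user_regs *uregs)
{
	return uregs->pc;
}

/* True when both views of the object start at the same address. */
bool toy_views_alias(const struct toy_pt_regs *regs)
{
	return (const void *)regs == (const void *)&regs->user_regs;
}
```

On an architecture like the 'arc' case mentioned above, where the two structs do not share a prefix, the equivalent of this assertion would not hold and the cast would silently read the wrong fields.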
On Thu, Jan 4, 2024 at 9:42 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Jan 4, 2024 at 7:58 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Thu, Jan 4, 2024 at 5:34 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Thu, Jan 4, 2024 at 12:58 PM Andrii Nakryiko > > > <andrii.nakryiko@gmail.com> wrote: > > > > > > > > > > > > My point was that it's hard to accidentally forget to "generalize" > > > > type if you were supporting sk_buff, and suddenly started calling it > > > > with xdp_md. > > > > > > > > From my POV, if I'm a user, and I declare an argument as long and > > > > annotate it as __arg_ctx, then I know what I'm doing and I'd hate for > > > > some smart-ass library to double-guess me dictating what exact > > > > incantation I should specify to make it happy. > > > > > > But that's exactly what's happening! > > > The smart-ass libbpf ignores the type in 'struct sk_buff *skb __arg_ctx' > > > and replaces it with whatever is appropriate for prog type. > > > > The only thing that libbpf does in this case is it honors __arg_ctx > > and makes it work *exactly the same* as __arg_ctx natively works on > > the newest kernel. Not more, not less. It doesn't change compilation > > or verification rules. At all. > > Here in all previous emails I was talking about both kernel and libbpf. > Both shouldn't be breaking C rules. > Not singling out libbpf. > Ok, I kept doubting which side (or both) we were talking about. BTW, just as an aside (and I just realized it), identical checks performed on global subprog by libbpf and kernel are not really identical (anymore), because of the kernel's lazy global subprog validation. Libbpf unfortunately doesn't know which global functions might be dead code, so we need to be careful about being too eager. > > Validating f1() func#2... > > 20: R1=ctx() R10=fp0 > > ; int f1(struct sk_buff *skb) > > > > It's a context. > > Ohh. Looks like I screwed it up back then. 
> /* only compare that prog's ctx type name is the same as > * kernel expects. No need to compare field by field. > * It's ok for bpf prog to do: > * struct __sk_buff {}; > * int socket_filter_bpf_prog(struct __sk_buff *skb) > * { // no fields of skb are ever used } > */ > if (strcmp(ctx_tname, "__sk_buff") == 0 && strcmp(tname, > "sk_buff") == 0) > return ctx_type; > > See comment. The intent was to allow __sk_buff in prog to > match with __sk_buff in the kernel. > Brainfart. Ok, then at least we shouldn't allow sk_buff in new use cases (like __arg_ctx tag). > > > Not really, see below. For a long time *we thought* that kernel > > recognizes bpf_user_pt_regs_t, but in reality it wanted `struct > > bpf_user_pt_regs_t` which doesn't even exist in kernel and has nothing > > common with either `struct pt_regs` or `struct user_pt_regs`. I fixed > > that and now the kernel recognizes *both* typedef and struct > > bpf_user_pt_regs_t. And there is no point in using typedef, because > > `struct bpf_user_pt_regs_t` is backwards compatible and that's what > > users actually use in practice. > > Hmm. > The test with > __weak int kprobe_typedef_ctx_subprog(bpf_user_pt_regs_t *ctx) > > was added back in Feb 2023. > So it was surely working for the last year. right, but I meant even earlier kernels that did support `struct bpf_user_pt_regs_t *`, but didn't support `bpf_user_pt_regs_t *`. > > > > __PT_REGS_CAST is arch dependent and typeof makes it seen with > > > correct btf_id and the kernel knows it's PTR_TO_CTX. > > > > TBH, I don't know what btf_id has to do with this, it looks either as > > a distraction or subtle point you are making that I'm missing. > > __PT_REGS_CAST() just does C language cast, there is no BTF or BTF ID > > involved here, so what am I missing? > > That was your patch :) > I'm just pointing out the neat trick with typeof to put > the correct type in there, > so it's later seen with proper btf_id and recognized as ctx. > You added it a year ago. ah, ok. 
TBH, I don't even know (now) why it works, must be some other quirk in kernel logic? But either way, this might have been appropriate for selftest, but I wouldn't recommend users to do this, it relies on "internal macro" __PT_REGS_CAST, so could technically break (but also is just pure magic). Anyways, see below. > > > > > Why not? This is what I don't get. Here's a real piece of code to > > demonstrate what users do in practice: > > > > struct bpf_user_pt_regs_t {} > > > > __hidden int handle_event_user_pt_regs(struct bpf_user_pt_regs_t* ctx) { > > if (pyperf_prog_cfg.sample_interval > 0) { > > if (__sync_fetch_and_add(&total_events_count, 1) % > > pyperf_prog_cfg.sample_interval) { > > return 0; > > } > > } > > > > return handle_event_helper((struct pt_regs*)ctx, NULL); > > } > > I think you're talking about kernel prior to that commit a year ago > that made it possible to drop 'struct'. yes, exactly. Global funcs were supported way earlier than my fix. > > > I'm saying that I explicitly do want to be able to declare (in general):> > > int handle_event_user(struct pt_regs *ctx __arg_ctx) { ...} > > > > And this would work both on old and new kernels, with and without > > native __arg_ctx support. And it will be very close to static subprogs > > in the existing code base. > > > > Why do you want to disallow this artificially? > > Not artificially, but only when pt_regs in bpf prog doesn't match > what kernel is passing. > I think allowing only: > handle_event_user(void *ctx __arg_ctx) > and prog will cast it to pt_regs immediately is less surprising > and proper C code, > but > handle_event_user(struct pt_regs *ctx __arg_ctx) > is also ok when pt_regs is indeed what is being passed. > Which will be the case for x86. > And will be fine on arm64 too, because > arch/arm64/include/asm/ptrace.h > struct pt_regs { > union { > struct user_pt_regs user_regs; > > but if arm64 ever changes that layout we should start failing to load. 
Ok, I'm glad you agreed to allow `struct pt_regs *`. I also will say that (as it stands right now) passing `struct pt_regs *` is valid on all architectures, because that's what kernel passes around internally as context for uprobe, kprobe, and kprobe-multi. See uprobe_prog_run, kprobe_multi_link_prog_run, and perf_trace_run_bpf_submit, we always pass real `struct pt_regs *`. So, I'll add kprobe/multi-kprobe special handling to allows `struct pt_regs *` then, ok? > > > All the above is already checked and enforced by the compiler. Libbpf > > doesn't subvert it in any way. All that libbpf is doing is saying "ah, > > user, you want this argument to be treated as PTR_TO_CTX, right? Too > > bad host kernel is a bit too old to understand __arg_ctx natively, but > > worry you not, I'll just quickly fix up BTF information that *only > > kernel* uses *only to check type name* (nothing else!), and it will > > look like kernel actually understood __arg_ctx, that's all, happy > > BPF'ing!". > > and this way libbpf _may_ introduce a hard to debug bug. > The same mistake the new kernel _may_ do with __arg_ctx with old libbpf. > Both will do a hidden typecast when the bpf prog is > potentially written with different type. > > foo(struct pt_regs *ctx __arg_ctx) > Quick git grep shows that it will probably work on all archs > except 'arc' where > arch/arc/include/uapi/asm/bpf_perf_event.h:typedef struct > user_regs_struct bpf_user_pt_regs_t; > and struct pt_regs seems to have a different layout than user_regs_struct. > > But when kernel allows sk_buff to be passed into > foo(struct pt_regs *ctx __arg_ctx) > is just broken. Yes, of course, sk_buff instead of pt_regs is definitely broken. But that will be detected even by the compiler. Anyways, I can add special casing for pt_regs and start enforcing types. A bit hesitant to do that on libbpf side, still, due to that eager global func behavior, which deviates from kernel, but if you insist I'll do it. 
(Eduard, I'll add feature detection for the need to rewrite BTF at the
same time, just FYI)

Keep in mind, though, for non-global subprogs kernel doesn't enforce
any types, so you can really pass sk_buff into pt_regs argument, if
you really want to, but kernel will happily still assume PTR_TO_CTX
(and I'm sure you know this as well, so this is mostly for others and
for completeness).
On Mon, Jan 8, 2024 at 3:45 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > > > > Not artificially, but only when pt_regs in bpf prog doesn't match > > what kernel is passing. > > I think allowing only: > > handle_event_user(void *ctx __arg_ctx) > > and prog will cast it to pt_regs immediately is less surprising > > and proper C code, > > but > > handle_event_user(struct pt_regs *ctx __arg_ctx) > > is also ok when pt_regs is indeed what is being passed. > > Which will be the case for x86. > > And will be fine on arm64 too, because > > arch/arm64/include/asm/ptrace.h > > struct pt_regs { > > union { > > struct user_pt_regs user_regs; > > > > but if arm64 ever changes that layout we should start failing to load. > > Ok, I'm glad you agreed to allow `struct pt_regs *`. I also will say > that (as it stands right now) passing `struct pt_regs *` is valid on > all architectures, because that's what kernel passes around internally > as context for uprobe, kprobe, and kprobe-multi. See uprobe_prog_run, > kprobe_multi_link_prog_run, and perf_trace_run_bpf_submit, we always > pass real `struct pt_regs *`. Right, but for perf event progs it's actually bpf_user_pt_regs_t: ctx.regs = perf_arch_bpf_user_pt_regs(regs); bpf_prog_run(prog, &ctx); yet all such progs are written assuming struct pt_regs which is not correct. It's a bit of a mess while strict type checking should make it better. BPF is a strictly typed assembly language and the verifier should not be violating its own promises of type checking when it sees arg_ctx. The reason I was proposing to restrict both kernel and libbpf to 'void *ctx __arg_ctx' is because it's trivial to implement in both. To allow 'struct correct_type *ctx __arg_ctx' generically is much more work. > So, I'll add kprobe/multi-kprobe special handling to allows `struct > pt_regs *` then, ok? 
If you mean to allow 'void *ctx __arg_ctx' in kernel and libbpf and in addition allow 'struct pt_reg *ctx __arg_ctx' for kprobe and other prog types where that's what is being passed then yes. Totally fine with me. These two are easy to enforce in kernel and libbpf. > Yes, of course, sk_buff instead of pt_regs is definitely broken. But > that will be detected even by the compiler. Right. C can do casts, but in bpf asm the verifier promises strict type checking and it goes further and makes safety decisions based on types. > Anyways, I can add special casing for pt_regs and start enforcing > types. A bit hesitant to do that on libbpf side, still, due to that > eager global func behavior, which deviates from kernel, but if you > insist I'll do it. I don't understand this part. Both kernel and libbpf can check if (btf_type_id(ctx) == 'struct pt_regs' && prog_type == kprobe) allow such __arg_ctx. > > (Eduard, I'll add feature detection for the need to rewrite BTF at the > same time, just FYI) > > Keep in mind, though, for non-global subprogs kernel doesn't enforce > any types, so you can really pass sk_buff into pt_regs argument, if > you really want to, but kernel will happily still assume PTR_TO_CTX > (and I'm sure you know this as well, so this is mostly for others and > for completeness). static functions are very different. They're typeless and will stay typeless for some time. Compiler can do whatever it wants with them. Like in katran case the static function of 6 arguments is optimized into 5 args. The types are unknown. The compiler can specialize args with constant, partially inline, etc. Even if it kept types of args after heavy transformations the verifier cannot rely on that for safety or enforce strict types (yet). static foo() is like static inline foo(). kprobe-ing into static func is questionable. Only if case of global __weak the types are dependable and that's why the verifier treats them differently. 
Hopefully the -Overifiable llvm/gcc proposal will keep moving.
Then, one day, we can potentially disable some of the transformations
on static functions that make types useless. Then the verifier
will be able to verify them just as globals and enforce strict types.
On Mon, Jan 8, 2024 at 5:49 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Mon, Jan 8, 2024 at 3:45 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > > > > > Not artificially, but only when pt_regs in bpf prog doesn't match > > > what kernel is passing. > > > I think allowing only: > > > handle_event_user(void *ctx __arg_ctx) > > > and prog will cast it to pt_regs immediately is less surprising > > > and proper C code, > > > but > > > handle_event_user(struct pt_regs *ctx __arg_ctx) > > > is also ok when pt_regs is indeed what is being passed. > > > Which will be the case for x86. > > > And will be fine on arm64 too, because > > > arch/arm64/include/asm/ptrace.h > > > struct pt_regs { > > > union { > > > struct user_pt_regs user_regs; > > > > > > but if arm64 ever changes that layout we should start failing to load. > > > > Ok, I'm glad you agreed to allow `struct pt_regs *`. I also will say > > that (as it stands right now) passing `struct pt_regs *` is valid on > > all architectures, because that's what kernel passes around internally > > as context for uprobe, kprobe, and kprobe-multi. See uprobe_prog_run, > > kprobe_multi_link_prog_run, and perf_trace_run_bpf_submit, we always > > pass real `struct pt_regs *`. > > Right, but for perf event progs it's actually bpf_user_pt_regs_t: > ctx.regs = perf_arch_bpf_user_pt_regs(regs); > bpf_prog_run(prog, &ctx); > yet all such progs are written assuming struct pt_regs > which is not correct. Yes, SEC("perf_event") programs have very different context (bpf_perf_event_data, where *pointer to pt_regs* is the first field, so it's not compatible even memory layout-wise). So I'm not going to allow struct pt_regs there. > It's a bit of a mess while strict type checking should make it better. > > BPF is a strictly typed assembly language and the verifier > should not be violating its own promises of type checking when > it sees arg_ctx. 
> > The reason I was proposing to restrict both kernel and libbpf > to 'void *ctx __arg_ctx' is because it's trivial to implement > in both. > To allow 'struct correct_type *ctx __arg_ctx' generically is much more > work. Yes, it's definitely more complicated (but kernel has full BTF info, so maybe not too bad, I need to try). I'll give it a try, if it's too bad, we can discuss a fallback plan. But we should try at least, forcing users to do unnecessary void * casts to u64[] or tracepoint struct is suboptimal from usability POV. > > > So, I'll add kprobe/multi-kprobe special handling to allows `struct > > pt_regs *` then, ok? > > If you mean to allow 'void *ctx __arg_ctx' in kernel and libbpf and > in addition allow 'struct pt_reg *ctx __arg_ctx' for kprobe and other > prog types where that's what is being passed then yes. > Totally fine with me. > These two are easy to enforce in kernel and libbpf. Ok, great. > > > Yes, of course, sk_buff instead of pt_regs is definitely broken. But > > that will be detected even by the compiler. > > Right. C can do casts, but in bpf asm the verifier promises strict type > checking and it goes further and makes safety decisions based on types. > It feels like you are thinking about PTR_TO_BTF_ID only and extrapolating that behavior to everything else. You know that it's not like that in general. > > Anyways, I can add special casing for pt_regs and start enforcing > > types. A bit hesitant to do that on libbpf side, still, due to that > > eager global func behavior, which deviates from kernel, but if you > > insist I'll do it. > > I don't understand this part. > Both kernel and libbpf can check > if (btf_type_id(ctx) == 'struct pt_regs' > && prog_type == kprobe) allow such __arg_ctx. > I was thinking about the case where we have __arg_ctx and type doesn't match expectation. Should libbpf error out? Or emit a warning and proceed without adjusting types? 
If we do the warning and let verifier reject invalid program, I think it will be better and then these concerns of mine about lazy vs eager global subprog verification behavior are irrelevant. > > > > (Eduard, I'll add feature detection for the need to rewrite BTF at the > > same time, just FYI) > > > > Keep in mind, though, for non-global subprogs kernel doesn't enforce > > any types, so you can really pass sk_buff into pt_regs argument, if > > you really want to, but kernel will happily still assume PTR_TO_CTX > > (and I'm sure you know this as well, so this is mostly for others and > > for completeness). > > static functions are very different. They're typeless and will stay > typeless for some time. Right. And they are different in how we verify and so on. But from the user's perspective they are still just functions and (in my mind at least), the less difference between static and global subprogs there are, the better. But anyways, I'll do what we agreed above. I'll proceed with libbpf emitting warning and not doing anything for __arg_ctx, unless you feel strongly libbpf should error out. > Compiler can do whatever it wants with them. > Like in katran case the static function of 6 arguments is optimized > into 5 args. The types are unknown. The compiler can specialize > args with constant, partially inline, etc. > Even if it kept types of args after heavy transformations > the verifier cannot rely on that for safety or enforce strict types (yet). > static foo() is like static inline foo(). > kprobe-ing into static func is questionable. > Only if case of global __weak the types are dependable and that's why > the verifier treats them differently. > Hopefully the -Overifiable llvm/gcc proposal will keep moving. > Then, one day, we can potentially disable some of the transformations > on static functions that makes types useless. Then the verifier > will be able to verify them just as globals and enforce strict types.
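The rule the thread converges on at this point (accept `void *` for any program type, accept `struct pt_regs *` only where that is really what is passed, otherwise warn and leave the argument alone) can be written down as a small decision function. This is a sketch of the agreement as stated in the emails, not the code that was eventually merged; every name below (`toy_prog`, `toy_action`, `toy_arg`, `toy_arg_ctx_action()`) is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical program kinds used only for this illustration. */
enum toy_prog { TOY_KPROBE, TOY_PERF_EVENT, TOY_XDP };

/* What the loader-side model decides to do with a tagged argument. */
enum toy_action { TOY_REWRITE, TOY_WARN_AND_SKIP };

struct toy_arg {
	bool is_void_ptr;        /* argument declared as 'void *' */
	const char *struct_name; /* pointee struct name, or NULL */
};

enum toy_action toy_arg_ctx_action(enum toy_prog prog,
				   const struct toy_arg *arg)
{
	/* 'void *ctx __arg_ctx' is accepted for every program kind. */
	if (arg->is_void_ptr)
		return TOY_REWRITE;

	/* 'struct pt_regs *' is accepted only for kprobe in this model. */
	if (prog == TOY_KPROBE && arg->struct_name &&
	    strcmp(arg->struct_name, "pt_regs") == 0)
		return TOY_REWRITE;

	/* Anything else: warn, do not touch BTF, let the verifier decide. */
	return TOY_WARN_AND_SKIP;
}
```

Choosing "warn and skip" over a hard error keeps the loader-side check from rejecting a global subprog that the kernel's lazy validation would never have looked at.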
On Tue, Jan 9, 2024 at 9:17 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Mon, Jan 8, 2024 at 5:49 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Mon, Jan 8, 2024 at 3:45 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > > > > > > > > > > > Not artificially, but only when pt_regs in bpf prog doesn't match > > > > what kernel is passing. > > > > I think allowing only: > > > > handle_event_user(void *ctx __arg_ctx) > > > > and prog will cast it to pt_regs immediately is less surprising > > > > and proper C code, > > > > but > > > > handle_event_user(struct pt_regs *ctx __arg_ctx) > > > > is also ok when pt_regs is indeed what is being passed. > > > > Which will be the case for x86. > > > > And will be fine on arm64 too, because > > > > arch/arm64/include/asm/ptrace.h > > > > struct pt_regs { > > > > union { > > > > struct user_pt_regs user_regs; > > > > > > > > but if arm64 ever changes that layout we should start failing to load. > > > > > > Ok, I'm glad you agreed to allow `struct pt_regs *`. I also will say > > > that (as it stands right now) passing `struct pt_regs *` is valid on > > > all architectures, because that's what kernel passes around internally > > > as context for uprobe, kprobe, and kprobe-multi. See uprobe_prog_run, > > > kprobe_multi_link_prog_run, and perf_trace_run_bpf_submit, we always > > > pass real `struct pt_regs *`. > > > > Right, but for perf event progs it's actually bpf_user_pt_regs_t: > > ctx.regs = perf_arch_bpf_user_pt_regs(regs); > > bpf_prog_run(prog, &ctx); > > yet all such progs are written assuming struct pt_regs > > which is not correct. > > Yes, SEC("perf_event") programs have very different context > (bpf_perf_event_data, where *pointer to pt_regs* is the first field, > so it's not compatible even memory layout-wise). So I'm not going to > allow struct pt_regs there. Not quite. I don't think we're on the same page. 
Technically SEC("perf_event") bpf_prog should see a pointer to:
struct bpf_perf_event_data {
        bpf_user_pt_regs_t regs;
        __u64 sample_period;
        __u64 addr;
};

but a lot of them are written as:

SEC("perf_event")
int handle_pe(struct pt_regs *ctx)

and it's working, because of magic (or ugliness, it depends on pov)
that we do in pe_prog_convert_ctx_access() (that inserts extra load).

The part where I'm saying:
"written assuming struct pt_regs which is not correct".

The incorrect part is that the prog has access only to
bpf_user_pt_regs_t and the prog should be written as:
SEC("perf_event")
int pe_prog(bpf_user_pt_regs_t *ctx)
or as
SEC("perf_event")
int pe_prog(struct bpf_perf_event_data *ctx)

but in generic case not as:
SEC("perf_event")
int pe_prog(struct pt_regs *ctx)

because that is valid only on archs where bpf_user_pt_regs_t == pt_regs.

> > It's a bit of a mess while strict type checking should make it better.
> >
> > BPF is a strictly typed assembly language and the verifier
> > should not be violating its own promises of type checking when
> > it sees arg_ctx.
> >
> > The reason I was proposing to restrict both kernel and libbpf
> > to 'void *ctx __arg_ctx' is because it's trivial to implement
> > in both.
> > To allow 'struct correct_type *ctx __arg_ctx' generically is much more
> > work.
>
> Yes, it's definitely more complicated (but kernel has full BTF info,
> so maybe not too bad, I need to try). I'll give it a try, if it's too
> bad, we can discuss a fallback plan.

right. the kernel has btf and everything, but as seen in perf_event example
above it's not trivial to do the type match... even in the kernel.
Matching the accurate type in libbpf is imo too much complexity.

> But we should try at least,
> forcing users to do unnecessary void * casts to u64[] or tracepoint
> struct is suboptimal from usability POV.

That part of usability concerns I still don't understand.
Where do you see "suboptimal usability" in the code:

SEC("perf_event")
int pe_prog(void *ctx __arg_ctx)
{
  struct pt_regs *regs = ctx;

?
It's a pretty clear interface and the program author knows exactly
what it's doing. It's an explicit cast because the user wrote it.
Clear, unambiguous and no surprises.

Also we shouldn't forget interaction with CO-RE (preserve_access_index)
and upcoming preserve_context_offset.

When people do #include <vmlinux.h> they get unnecessary CO-RE-ed
'struct pt_regs *' and hidden type conversion by libbpf/kernel
is another level of debug complexity.

libbpf doesn't even print what it did in bpf_program_fixup_func_info()
no matter the verbosity flag.

> >
> > > So, I'll add kprobe/multi-kprobe special handling to allow `struct
> > > pt_regs *` then, ok?
> >
> > If you mean to allow 'void *ctx __arg_ctx' in kernel and libbpf and
> > in addition allow 'struct pt_regs *ctx __arg_ctx' for kprobe and other
> > prog types where that's what is being passed then yes.
> > Totally fine with me.
> > These two are easy to enforce in kernel and libbpf.
>
> Ok, great.
>
> >
> > > Yes, of course, sk_buff instead of pt_regs is definitely broken. But
> > > that will be detected even by the compiler.
> >
> > Right. C can do casts, but in bpf asm the verifier promises strict type
> > checking and it goes further and makes safety decisions based on types.
> >
>
> It feels like you are thinking about PTR_TO_BTF_ID only and
> extrapolating that behavior to everything else. You know that it's not
> like that in general.

imo trusted ptr_to_btf_id and ptr_to_ctx are equivalent.
They're fully trusted from the verifier pov and fully correct
from bpf prog pov. So neither should have hidden type casts.

Of course, I remember bpf_rdonly_cast. But this one is untrusted.
It's equivalent to C type cast.

> > > Anyways, I can add special casing for pt_regs and start enforcing
> > > types. A bit hesitant to do that on libbpf side, still, due to that
> > > eager global func behavior, which deviates from kernel, but if you
> > > insist I'll do it.
> >
> > I don't understand this part.
> > Both kernel and libbpf can check
> > if (btf_type_id(ctx) == 'struct pt_regs'
> >     && prog_type == kprobe) allow such __arg_ctx.
> >
>
> I was thinking about the case where we have __arg_ctx and type doesn't
> match expectation. Should libbpf error out? Or emit a warning and
> proceed without adjusting types? If we do the warning and let verifier
> reject invalid program, I think it will be better and then these
> concerns of mine about lazy vs eager global subprog verification
> behavior are irrelevant.

I think warn in libbpf is fine.
Old kernel will likely fail verification since types won't match
and libbpf warn will serve its purpose,
but for the new kernel both libbpf and kernel will print
two similar looking warns?

>
> > >
> > > (Eduard, I'll add feature detection for the need to rewrite BTF at the
> > > same time, just FYI)
> > >
> > > Keep in mind, though, for non-global subprogs kernel doesn't enforce
> > > any types, so you can really pass sk_buff into pt_regs argument, if
> > > you really want to, but kernel will happily still assume PTR_TO_CTX
> > > (and I'm sure you know this as well, so this is mostly for others and
> > > for completeness).
> >
> > static functions are very different. They're typeless and will stay
> > typeless for some time.
>
> Right. And they are different in how we verify and so on. But from the
> user's perspective they are still just functions and (in my mind at
> least), the less difference between static and global subprogs there
> are, the better.
>
> But anyways, I'll do what we agreed above. I'll proceed with libbpf
> emitting warning and not doing anything for __arg_ctx, unless you feel
> strongly libbpf should error out.

warn is fine. thanks.
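
The perf_event point made in the message above (the BPF-visible context embeds the registers as its first field, while the kernel-side context holds only a pointer to them, so the verifier has to insert an extra load) can be illustrated with a minimal user-space sketch. Everything named `mock_*` or `demo_*` here is hypothetical and exists only for illustration; real register layouts are per-architecture and the real rewrite happens in the verifier, not in C code like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for bpf_user_pt_regs_t; real layouts are per-arch. */
struct mock_user_regs {
	unsigned long ip;
	unsigned long sp;
};

/* Same shape as the BPF-visible struct quoted above: regs embedded first. */
struct mock_perf_event_data {
	struct mock_user_regs regs;
	unsigned long long sample_period;
	unsigned long long addr;
};

/* Assumed kernel-side shape for illustration: regs is held by pointer. */
struct mock_perf_event_data_kern {
	struct mock_user_regs *regs;
};

/* What a prog declared with a "regs *ctx" argument effectively does. */
static unsigned long ip_via_regs_cast(void *ctx)
{
	struct mock_user_regs *regs = ctx;

	return regs->ip;
}

/* With only a pointer in the context, one extra load is needed first. */
static unsigned long ip_via_kern(const struct mock_perf_event_data_kern *kctx)
{
	const struct mock_user_regs *regs = kctx->regs;	/* the extra load */

	return regs->ip;
}

/* Returns 1 when both access paths read the same register value. */
static int demo_perf_event_layout(void)
{
	struct mock_perf_event_data uctx = { .regs = { .ip = 0x1234, .sp = 0x5678 } };
	struct mock_perf_event_data_kern kctx = { .regs = &uctx.regs };

	/* the cast only works because regs sits at offset 0 */
	assert(offsetof(struct mock_perf_event_data, regs) == 0);
	return ip_via_regs_cast(&uctx) == 0x1234 && ip_via_kern(&kctx) == 0x1234;
}
```

The cast in `ip_via_regs_cast()` is valid only while the mock register type is exactly what sits at offset 0, which mirrors the "valid only on archs where bpf_user_pt_regs_t == pt_regs" caveat.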
On Tue, Jan 9, 2024 at 5:58 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jan 9, 2024 at 9:17 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Jan 8, 2024 at 5:49 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Jan 8, 2024 at 3:45 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > >
> > > > > Not artificially, but only when pt_regs in bpf prog doesn't match
> > > > > what kernel is passing.
> > > > > I think allowing only:
> > > > > handle_event_user(void *ctx __arg_ctx)
> > > > > and prog will cast it to pt_regs immediately is less surprising
> > > > > and proper C code,
> > > > > but
> > > > > handle_event_user(struct pt_regs *ctx __arg_ctx)
> > > > > is also ok when pt_regs is indeed what is being passed.
> > > > > Which will be the case for x86.
> > > > > And will be fine on arm64 too, because
> > > > > arch/arm64/include/asm/ptrace.h
> > > > > struct pt_regs {
> > > > >         union {
> > > > >                 struct user_pt_regs user_regs;
> > > > >
> > > > > but if arm64 ever changes that layout we should start failing to load.
> > > >
> > > > Ok, I'm glad you agreed to allow `struct pt_regs *`. I also will say
> > > > that (as it stands right now) passing `struct pt_regs *` is valid on
> > > > all architectures, because that's what kernel passes around internally
> > > > as context for uprobe, kprobe, and kprobe-multi. See uprobe_prog_run,
> > > > kprobe_multi_link_prog_run, and perf_trace_run_bpf_submit, we always
> > > > pass real `struct pt_regs *`.
> > >
> > > Right, but for perf event progs it's actually bpf_user_pt_regs_t:
> > > ctx.regs = perf_arch_bpf_user_pt_regs(regs);
> > > bpf_prog_run(prog, &ctx);
> > > yet all such progs are written assuming struct pt_regs
> > > which is not correct.
> >
> > Yes, SEC("perf_event") programs have very different context
> > (bpf_perf_event_data, where *pointer to pt_regs* is the first field,
> > so it's not compatible even memory layout-wise). So I'm not going to
> > allow struct pt_regs there.
>
> Not quite. I don't think we're on the same page.
>
> Technically SEC("perf_event") bpf_prog should see a pointer to:
> struct bpf_perf_event_data {
>         bpf_user_pt_regs_t regs;
>         __u64 sample_period;
>         __u64 addr;
> };
>
> but a lot of them are written as:
> SEC("perf_event")
> int handle_pe(struct pt_regs *ctx)
>
> and it's working, because of magic (or ugliness, it depends on pov) that
> we do in pe_prog_convert_ctx_access() (that inserts extra load).

Ah, I didn't know about this part. I checked bpf_perf_event_data_kern
and saw a pointer for regs, but didn't realize it's remapped as an
embedded struct for perf_event programs by verifier.

>
> The part where I'm saying:
> "written assuming struct pt_regs which is not correct".
>
> The incorrect part is that the prog has access only to bpf_user_pt_regs_t
> and the prog should be written as:
> SEC("perf_event")
> int pe_prog(bpf_user_pt_regs_t *ctx)
> or as
> SEC("perf_event")
> int pe_prog(struct bpf_perf_event_data *ctx)
>
> but in generic case not as:
> SEC("perf_event")
> int pe_prog(struct pt_regs *ctx)
>
> because that is valid only on archs where bpf_user_pt_regs_t == pt_regs.

I guess for perf_event I can just add `struct pt_regs * __arg_ctx`
support if (typeof(struct pt_regs) == typeof(bpf_user_pt_regs_t)), or
something along those lines. I'll need to check if that works
correctly.

>
> > > It's a bit of a mess while strict type checking should make it better.
> > >
> > > BPF is a strictly typed assembly language and the verifier
> > > should not be violating its own promises of type checking when
> > > it sees arg_ctx.
> > >
> > > The reason I was proposing to restrict both kernel and libbpf
> > > to 'void *ctx __arg_ctx' is because it's trivial to implement
> > > in both.
> > > To allow 'struct correct_type *ctx __arg_ctx' generically is much more
> > > work.
> >
> > Yes, it's definitely more complicated (but kernel has full BTF info,
> > so maybe not too bad, I need to try). I'll give it a try, if it's too
> > bad, we can discuss a fallback plan.
>
> right. the kernel has btf and everything, but as seen in perf_event example
> above it's not trivial to do the type match... even in the kernel.
> Matching the accurate type in libbpf is imo too much complexity.
>

Heh, we agree on this. But we disagree on solutions :) But my proposed
solution was to rely on users and compilers, just like we do with
static subprogs :) But you want to restrict it to `void *`. Alright.

> > But we should try at least,
> > forcing users to do unnecessary void * casts to u64[] or tracepoint
> > struct is suboptimal from usability POV.
>
> That part of usability concerns I still don't understand.
> Where do you see "suboptimal usability" in the code:
>
> SEC("perf_event")
> int pe_prog(void *ctx __arg_ctx)
> {
>   struct pt_regs *regs = ctx;
>
> ?
> It's a pretty clear interface and the program author knows exactly
> what it's doing. It's an explicit cast because the user wrote it.
> Clear, unambiguous and no surprises.

I'm not saying it's disastrous. But my point is that, say, for
tracepoint, it's natural to want to do this, and that's what users
might do first:

int global_subprog(struct syscall_trace_enter* ctx __arg_ctx)
{
    return ctx->args[0];
}

SEC("tracepoint/syscalls/sys_enter_kill")
int tracepoint__syscalls__sys_enter_kill(struct syscall_trace_enter* ctx)
{
    ...
    global_subprog(ctx);
    ...
}

And it will be rejected.
User will have WTF moment, maybe ask online or maybe will figure out
by themselves that they need to rewrite global_subprog as:

int global_subprog(void *ctx __arg_ctx)
{
    struct syscall_trace_enter* ctx1 = ctx;

    return ctx1->args[0];
}

It works and it's easy to work around. But it's a stumbling point for
global subprog adoption/conversion, I think. While on the other hand
users having to debug issues due to using `struct pt_regs` instead of
`bpf_user_pt_regs_t` is quite theoretical, tbh (because of
PT_REGS_xxx() macros doing the right thing).

>
> Also we shouldn't forget interaction with CO-RE (preserve_access_index)
> and upcoming preserve_context_offset.
>
> When people do #include <vmlinux.h> they get unnecessary CO-RE-ed
> 'struct pt_regs *' and hidden type conversion by libbpf/kernel
> is another level of debug complexity.

This is where I don't get why you keep saying this :) What libbpf is
doing is just rewriting FUNC->FUNC_PROTO declaration just before
loading BTF into kernel. It's a sleight of hand just to make kernel
recognize argument (declaratively) as PTR_TO_CTX. All the CO-RE
relocations, preserve_context_offset, etc, all that is done **before
all this** and are **absolutely irrelevant** to this whole discussion.

Compiler will generate offsets based on original C types that user
specified. Libbpf will do CO-RE relocations based on original BTF
information (with user's original types). So all the stuff libbpf is
doing here for __arg_ctx has no bearing on the above.

>
> libbpf doesn't even print what it did in bpf_program_fixup_func_info()
> no matter the verbosity flag.
>
> > >
> > > > So, I'll add kprobe/multi-kprobe special handling to allow `struct
> > > > pt_regs *` then, ok?
> > >
> > > If you mean to allow 'void *ctx __arg_ctx' in kernel and libbpf and
> > > in addition allow 'struct pt_regs *ctx __arg_ctx' for kprobe and other
> > > prog types where that's what is being passed then yes.
> > > Totally fine with me.
> > > These two are easy to enforce in kernel and libbpf.
> >
> > Ok, great.
> >
> > >
> > > > Yes, of course, sk_buff instead of pt_regs is definitely broken. But
> > > > that will be detected even by the compiler.
> > >
> > > Right. C can do casts, but in bpf asm the verifier promises strict type
> > > checking and it goes further and makes safety decisions based on types.
> > >
> >
> > It feels like you are thinking about PTR_TO_BTF_ID only and
> > extrapolating that behavior to everything else. You know that it's not
> > like that in general.
>
> imo trusted ptr_to_btf_id and ptr_to_ctx are equivalent.
> They're fully trusted from the verifier pov and fully correct
> from bpf prog pov. So neither should have hidden type casts.

But my point is that the BPF verifier itself doesn't treat them the
same today. For context argument, the verifier just assumes valid
PTR_TO_CTX and never looks at BTF information.

But I also think it's ironic that above you suggested this to be
totally fine:

> int pe_prog(void *ctx __arg_ctx)
> {
>   struct pt_regs *regs = ctx;

Which is totally a cast, and has no bearing on verifier's PTR_TO_CTX
treatment. But we are getting a bit philosophical, I propose we put
this to rest.

>
> Of course, I remember bpf_rdonly_cast. But this one is untrusted.
> It's equivalent to C type cast.
>
> > > > Anyways, I can add special casing for pt_regs and start enforcing
> > > > types. A bit hesitant to do that on libbpf side, still, due to that
> > > > eager global func behavior, which deviates from kernel, but if you
> > > > insist I'll do it.
> > >
> > > I don't understand this part.
> > > Both kernel and libbpf can check
> > > if (btf_type_id(ctx) == 'struct pt_regs'
> > >     && prog_type == kprobe) allow such __arg_ctx.
> > >
> >
> > I was thinking about the case where we have __arg_ctx and type doesn't
> > match expectation. Should libbpf error out? Or emit a warning and
> > proceed without adjusting types? If we do the warning and let verifier
> > reject invalid program, I think it will be better and then these
> > concerns of mine about lazy vs eager global subprog verification
> > behavior are irrelevant.
>
> I think warn in libbpf is fine.
> Old kernel will likely fail verification since types won't match
> and libbpf warn will serve its purpose,
> but for the new kernel both libbpf and kernel will print
> two similar looking warns?

I'm going to add feature detection and disable this __arg_ctx
treatment on libbpf side (for new kernels, of course). So yes, on old
kernels verifier will reject argument (because it won't be
PTR_TO_CTX) and we'll have libbpf warning. And on new one any check
will be done directly by verifier and emitted properly in verifier log
(which I think is better than libbpf's warnings, especially taking
into account veristat and tools like that that by default emit only
verifier log on failures).

>
> >
> > > >
> > > > (Eduard, I'll add feature detection for the need to rewrite BTF at the
> > > > same time, just FYI)
> > > >
> > > > Keep in mind, though, for non-global subprogs kernel doesn't enforce
> > > > any types, so you can really pass sk_buff into pt_regs argument, if
> > > > you really want to, but kernel will happily still assume PTR_TO_CTX
> > > > (and I'm sure you know this as well, so this is mostly for others and
> > > > for completeness).
> > >
> > > static functions are very different. They're typeless and will stay
> > > typeless for some time.
> >
> > Right. And they are different in how we verify and so on. But from the
> > user's perspective they are still just functions and (in my mind at
> > least), the less difference between static and global subprogs there
> > are, the better.
> >
> > But anyways, I'll do what we agreed above. I'll proceed with libbpf
> > emitting warning and not doing anything for __arg_ctx, unless you feel
> > strongly libbpf should error out.
>
> warn is fine. thanks.

Alright, cool, will work on this and add it in v2 of [0]. If you get
time, please check that one as well (unless you already did).

[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=814516&state=*
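
One possible reading of where the thread lands, written as a small user-space decision function. This is only a sketch under stated assumptions: `prog_kind`, `arg_ctx_action`, `decide_arg_ctx()` and `demo_arg_ctx_policy()` are hypothetical names, not libbpf or kernel API, and the rule set (leave everything to a kernel with native `__arg_ctx` support; otherwise rewrite for `void *` and for `struct pt_regs *` on kprobe; otherwise warn and leave the argument alone) is inferred from the messages above rather than taken from any final patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical program kinds; not the real enum bpf_prog_type. */
enum prog_kind { PROG_KPROBE, PROG_PERF_EVENT, PROG_TRACEPOINT };

/* Hypothetical outcomes of the library-side __arg_ctx handling. */
enum arg_ctx_action {
	ARG_CTX_LEAVE_TO_KERNEL,	/* kernel checks the type itself */
	ARG_CTX_REWRITE,		/* substitute the expected ctx type */
	ARG_CTX_WARN_AND_SKIP,		/* warn, do not adjust, verifier decides */
};

/* pointee == NULL models `void *`; otherwise it is the pointed-to struct name. */
static enum arg_ctx_action decide_arg_ctx(bool kernel_native,
					  enum prog_kind kind,
					  const char *pointee)
{
	if (kernel_native)
		return ARG_CTX_LEAVE_TO_KERNEL;
	if (!pointee)
		return ARG_CTX_REWRITE;
	if (kind == PROG_KPROBE && strcmp(pointee, "pt_regs") == 0)
		return ARG_CTX_REWRITE;
	return ARG_CTX_WARN_AND_SKIP;
}

/* Returns 1 when the cases discussed in the thread map as described. */
static int demo_arg_ctx_policy(void)
{
	assert(decide_arg_ctx(true, PROG_TRACEPOINT, "sk_buff") == ARG_CTX_LEAVE_TO_KERNEL);
	return decide_arg_ctx(false, PROG_PERF_EVENT, NULL) == ARG_CTX_REWRITE &&
	       decide_arg_ctx(false, PROG_KPROBE, "pt_regs") == ARG_CTX_REWRITE &&
	       decide_arg_ctx(false, PROG_TRACEPOINT, "syscall_trace_enter") == ARG_CTX_WARN_AND_SKIP;
}
```

The tracepoint case returning `ARG_CTX_WARN_AND_SKIP` corresponds to the `struct syscall_trace_enter *` example above, where the workaround is to declare the argument as `void *` and cast inside the subprog.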
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 836986974de3..c5a42ac309fd 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6181,7 +6181,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj,
 	int err;
 
 	/* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't
-	 * supprot func/line info
+	 * support func/line info
 	 */
 	if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC))
 		return 0;
@@ -6663,8 +6663,247 @@ static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *pr
 	return 0;
 }
 
-static int
-bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
+static struct {
+	enum bpf_prog_type prog_type;
+	const char *ctx_name;
+} global_ctx_map[] = {
+	{ BPF_PROG_TYPE_CGROUP_DEVICE, "bpf_cgroup_dev_ctx" },
+	{ BPF_PROG_TYPE_CGROUP_SKB, "__sk_buff" },
+	{ BPF_PROG_TYPE_CGROUP_SOCK, "bpf_sock" },
+	{ BPF_PROG_TYPE_CGROUP_SOCK_ADDR, "bpf_sock_addr" },
+	{ BPF_PROG_TYPE_CGROUP_SOCKOPT, "bpf_sockopt" },
+	{ BPF_PROG_TYPE_CGROUP_SYSCTL, "bpf_sysctl" },
+	{ BPF_PROG_TYPE_FLOW_DISSECTOR, "__sk_buff" },
+	{ BPF_PROG_TYPE_KPROBE, "bpf_user_pt_regs_t" },
+	{ BPF_PROG_TYPE_LWT_IN, "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_OUT, "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_SEG6LOCAL, "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_XMIT, "__sk_buff" },
+	{ BPF_PROG_TYPE_NETFILTER, "bpf_nf_ctx" },
+	{ BPF_PROG_TYPE_PERF_EVENT, "bpf_perf_event_data" },
+	{ BPF_PROG_TYPE_RAW_TRACEPOINT, "bpf_raw_tracepoint_args" },
+	{ BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
+	{ BPF_PROG_TYPE_SCHED_ACT, "__sk_buff" },
+	{ BPF_PROG_TYPE_SCHED_CLS, "__sk_buff" },
+	{ BPF_PROG_TYPE_SK_LOOKUP, "bpf_sk_lookup" },
+	{ BPF_PROG_TYPE_SK_MSG, "sk_msg_md" },
+	{ BPF_PROG_TYPE_SK_REUSEPORT, "sk_reuseport_md" },
+	{ BPF_PROG_TYPE_SK_SKB, "__sk_buff" },
+	{ BPF_PROG_TYPE_SOCK_OPS, "bpf_sock_ops" },
+	{ BPF_PROG_TYPE_SOCKET_FILTER, "__sk_buff" },
+	{ BPF_PROG_TYPE_XDP, "xdp_md" },
+	/* all other program types don't have "named" context structs */
+};
+
+static int clone_func_btf_info(struct btf *btf, int orig_fn_id, struct bpf_program *prog)
+{
+	int fn_id, fn_proto_id, ret_type_id, orig_proto_id;
+	int i, err, arg_cnt, fn_name_off, linkage;
+	struct btf_type *fn_t, *fn_proto_t, *t;
+	struct btf_param *p;
+
+	/* caller already validated FUNC -> FUNC_PROTO validity */
+	fn_t = btf_type_by_id(btf, orig_fn_id);
+	fn_proto_t = btf_type_by_id(btf, fn_t->type);
+
+	/* Note that each btf__add_xxx() operation invalidates
+	 * all btf_type and string pointers, so we need to be
+	 * very careful when cloning BTF types. BTF type
+	 * pointers have to be always refetched. And to avoid
+	 * problems with invalidated string pointers, we
+	 * add empty strings initially, then just fix up
+	 * name_off offsets in place. Offsets are stable for
+	 * existing strings, so that works out.
+	 */
+	fn_name_off = fn_t->name_off; /* we are about to invalidate fn_t */
+	linkage = btf_func_linkage(fn_t);
+	orig_proto_id = fn_t->type; /* original FUNC_PROTO ID */
+	ret_type_id = fn_proto_t->type; /* fn_proto_t will be invalidated */
+	arg_cnt = btf_vlen(fn_proto_t);
+
+	/* clone FUNC_PROTO and its params */
+	fn_proto_id = btf__add_func_proto(btf, ret_type_id);
+	if (fn_proto_id < 0)
+		return -EINVAL;
+
+	for (i = 0; i < arg_cnt; i++) {
+		int name_off;
+
+		/* copy original parameter data */
+		t = btf_type_by_id(btf, orig_proto_id);
+		p = &btf_params(t)[i];
+		name_off = p->name_off;
+
+		err = btf__add_func_param(btf, "", p->type);
+		if (err)
+			return err;
+
+		fn_proto_t = btf_type_by_id(btf, fn_proto_id);
+		p = &btf_params(fn_proto_t)[i];
+		p->name_off = name_off; /* use remembered str offset */
+	}
+
+	/* clone FUNC now, btf__add_func() enforces non-empty name, so use
+	 * entry program's name as a placeholder, which we replace immediately
+	 * with original name_off
+	 */
+	fn_id = btf__add_func(btf, prog->name, linkage, fn_proto_id);
+	if (fn_id < 0)
+		return -EINVAL;
+
+	fn_t = btf_type_by_id(btf, fn_id);
+	fn_t->name_off = fn_name_off; /* reuse original string */
+
+	return fn_id;
+}
+
+/* Check if main program or global subprog's function prototype has `arg:ctx`
+ * argument tags, and, if necessary, substitute correct type to match what BPF
+ * verifier would expect, taking into account specific program type. This
+ * allows to support __arg_ctx tag transparently on old kernels that don't yet
+ * have a native support for it in the verifier, making user's life much
+ * easier.
+ */
+static int bpf_program_fixup_func_info(struct bpf_object *obj, struct bpf_program *prog)
+{
+	const char *ctx_name = NULL, *ctx_tag = "arg:ctx";
+	struct bpf_func_info_min *func_rec;
+	struct btf_type *fn_t, *fn_proto_t;
+	struct btf *btf = obj->btf;
+	const struct btf_type *t;
+	struct btf_param *p;
+	int ptr_id = 0, struct_id, tag_id, orig_fn_id;
+	int i, n, arg_idx, arg_cnt, err, rec_idx;
+	int *orig_ids;
+
+	/* no .BTF.ext, no problem */
+	if (!obj->btf_ext || !prog->func_info)
+		return 0;
+
+	/* some BPF program types just don't have named context structs, so
+	 * this fallback mechanism doesn't work for them
+	 */
+	for (i = 0; i < ARRAY_SIZE(global_ctx_map); i++) {
+		if (global_ctx_map[i].prog_type != prog->type)
+			continue;
+		ctx_name = global_ctx_map[i].ctx_name;
+		break;
+	}
+	if (!ctx_name)
+		return 0;
+
+	/* remember original func BTF IDs to detect if we already cloned them */
+	orig_ids = calloc(prog->func_info_cnt, sizeof(*orig_ids));
+	if (!orig_ids)
+		return -ENOMEM;
+	for (i = 0; i < prog->func_info_cnt; i++) {
+		func_rec = prog->func_info + prog->func_info_rec_size * i;
+		orig_ids[i] = func_rec->type_id;
+	}
+
+	/* go through each DECL_TAG with "arg:ctx" and see if it points to one
+	 * of our subprogs; if yes and subprog is global and needs adjustment,
+	 * clone and adjust FUNC -> FUNC_PROTO combo
+	 */
+	for (i = 1, n = btf__type_cnt(btf); i < n; i++) {
+		/* only DECL_TAG with "arg:ctx" value are interesting */
+		t = btf__type_by_id(btf, i);
+		if (!btf_is_decl_tag(t))
+			continue;
+		if (strcmp(btf__str_by_offset(btf, t->name_off), ctx_tag) != 0)
+			continue;
+
+		/* only global funcs need adjustment, if at all */
+		orig_fn_id = t->type;
+		fn_t = btf_type_by_id(btf, orig_fn_id);
+		if (!btf_is_func(fn_t) || btf_func_linkage(fn_t) != BTF_FUNC_GLOBAL)
+			continue;
+
+		/* sanity check FUNC -> FUNC_PROTO chain, just in case */
+		fn_proto_t = btf_type_by_id(btf, fn_t->type);
+		if (!fn_proto_t || !btf_is_func_proto(fn_proto_t))
+			continue;
+
+		/* find corresponding func_info record */
+		func_rec = NULL;
+		for (rec_idx = 0; rec_idx < prog->func_info_cnt; rec_idx++) {
+			if (orig_ids[rec_idx] == t->type) {
+				func_rec = prog->func_info + prog->func_info_rec_size * rec_idx;
+				break;
+			}
+		}
+		/* current main program doesn't call into this subprog */
+		if (!func_rec)
+			continue;
+
+		/* some more sanity checking of DECL_TAG */
+		arg_cnt = btf_vlen(fn_proto_t);
+		arg_idx = btf_decl_tag(t)->component_idx;
+		if (arg_idx < 0 || arg_idx >= arg_cnt)
+			continue;
+
+		/* check if existing parameter already matches verifier expectations */
+		p = &btf_params(fn_proto_t)[arg_idx];
+		t = skip_mods_and_typedefs(btf, p->type, NULL);
+		if (btf_is_ptr(t) &&
+		    (t = skip_mods_and_typedefs(btf, t->type, NULL)) &&
+		    btf_is_struct(t) &&
+		    strcmp(btf__str_by_offset(btf, t->name_off), ctx_name) == 0) {
+			continue; /* no need for fix up */
+		}
+
+		/* clone fn/fn_proto, unless we already did it for another arg */
+		if (func_rec->type_id == orig_fn_id) {
+			int fn_id;
+
+			fn_id = clone_func_btf_info(btf, orig_fn_id, prog);
+			if (fn_id < 0) {
+				err = fn_id;
+				goto err_out;
+			}
+
+			/* point func_info record to a cloned FUNC type */
+			func_rec->type_id = fn_id;
+		}
+
+		/* create PTR -> STRUCT type chain to mark PTR_TO_CTX argument;
+		 * we do it just once per main BPF program, as all global
+		 * funcs share the same program type, so need only PTR ->
+		 * STRUCT type chain
+		 */
+		if (ptr_id == 0) {
+			struct_id = btf__add_struct(btf, ctx_name, 0);
+			ptr_id = btf__add_ptr(btf, struct_id);
+			if (ptr_id < 0 || struct_id < 0) {
+				err = -EINVAL;
+				goto err_out;
+			}
+		}
+
+		/* for completeness, clone DECL_TAG and point it to cloned param */
+		tag_id = btf__add_decl_tag(btf, ctx_tag, func_rec->type_id, arg_idx);
+		if (tag_id < 0) {
+			err = -EINVAL;
+			goto err_out;
+		}
+
+		/* all the BTF manipulations invalidated pointers, refetch them */
+		fn_t = btf_type_by_id(btf, func_rec->type_id);
+		fn_proto_t = btf_type_by_id(btf, fn_t->type);
+
+		/* fix up type ID pointed to by param */
+		p = &btf_params(fn_proto_t)[arg_idx];
+		p->type = ptr_id;
+	}
+
+	free(orig_ids);
+	return 0;
+err_out:
+	free(orig_ids);
+	return err;
+}
+
+static int bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 {
 	struct bpf_program *prog;
 	size_t i, j;
@@ -6745,19 +6984,28 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			}
 		}
 	}
-	/* Process data relos for main programs */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
 		if (prog_is_subprog(obj, prog))
 			continue;
 		if (!prog->autoload)
 			continue;
+
+		/* Process data relos for main programs */
 		err = bpf_object__relocate_data(obj, prog);
 		if (err) {
 			pr_warn("prog '%s': failed to relocate data references: %d\n",
 				prog->name, err);
 			return err;
 		}
+
+		/* Fix up .BTF.ext information, if necessary */
+		err = bpf_program_fixup_func_info(obj, prog);
+		if (err) {
+			pr_warn("prog '%s': failed to perform .BTF.ext fix ups: %d\n",
+				prog->name, err);
+			return err;
+		}
 	}
 
 	return 0;
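
The comment in clone_func_btf_info() about every btf__add_xxx() call invalidating type pointers describes a general pattern: copy the scalar fields you need out of a record, perform the append, then refetch the record by ID before writing to it. The following is a self-contained user-space sketch of that pattern on a plain growable array. The `table`/`rec` types and the `table_*`/`demo_*` functions are hypothetical stand-ins, not libbpf API; the real BTF container is more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record, loosely modeled on "name offset + type ID". */
struct rec {
	int name_off;
	int type;
};

/* Growable array; any append may move all records in memory. */
struct table {
	struct rec *recs;
	int cnt;
	int cap;
};

/* Append a record and return its stable ID, or -1 on allocation failure. */
static int table_add(struct table *t, int name_off, int type)
{
	if (t->cnt == t->cap) {
		int new_cap = t->cap ? t->cap * 2 : 1;
		struct rec *tmp = realloc(t->recs, new_cap * sizeof(*tmp));

		if (!tmp)
			return -1;
		t->recs = tmp;
		t->cap = new_cap;
	}
	t->recs[t->cnt].name_off = name_off;
	t->recs[t->cnt].type = type;
	return t->cnt++;
}

/* Look a record up by ID; the pointer is valid only until the next add. */
static struct rec *table_get(struct table *t, int id)
{
	return (id >= 0 && id < t->cnt) ? &t->recs[id] : NULL;
}

/* Clone a record: remember scalars, append, refetch, then patch in place. */
static int table_clone(struct table *t, int orig_id)
{
	struct rec *r = table_get(t, orig_id);
	int name_off, type, new_id;

	if (!r)
		return -1;
	name_off = r->name_off;	/* r is about to be invalidated */
	type = r->type;

	new_id = table_add(t, 0, type);	/* placeholder name offset */
	if (new_id < 0)
		return -1;

	r = table_get(t, new_id);	/* refetch after the append */
	assert(r);
	r->name_off = name_off;		/* reuse the remembered offset */
	return new_id;
}

/* Returns 1 if a clone made after many reallocations matches the original. */
static int demo_clone_roundtrip(void)
{
	struct table t = { 0 };
	int orig = table_add(&t, 7, 3);
	int last = orig, i, ok;

	for (i = 0; i < 100 && last >= 0; i++)
		last = table_clone(&t, orig);

	ok = last == 100 && table_get(&t, last)->name_off == 7 &&
	     table_get(&t, last)->type == 3;
	free(t.recs);
	return ok;
}
```

Using IDs and offsets as the stable handles, and treating pointers as short-lived views, is the same discipline the patch applies with `orig_proto_id`, `fn_name_off`, and the repeated `btf_type_by_id()` calls.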