Message ID | 20230315223607.50803-2-alexei.starovoitov@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | bpf: Add detection of kfuncs. | expand |
On Wed, 2023-03-15 at 15:36 -0700, Alexei Starovoitov wrote: > From: Alexei Starovoitov <ast@kernel.org> > > Allow ld_imm64 insn with BPF_PSEUDO_BTF_ID to hold the address of kfunc. > PTR_MEM is already recognized for NULL-ness by is_branch_taken(), > so unresolved kfuncs will be seen as zero. > This allows BPF programs to detect at load time whether kfunc is present > in the kernel with bpf_kfunc_exists() macro. > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > --- > kernel/bpf/verifier.c | 7 +++++-- > tools/lib/bpf/bpf_helpers.h | 3 +++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 60793f793ca6..4e49d34d8cd6 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > goto err_put; > } > > - if (!btf_type_is_var(t)) { > - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); > + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { > + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); > err = -EINVAL; > goto err_put; > } > @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; > aux->btf_var.btf = btf; > aux->btf_var.btf_id = type; > + } else if (!btf_type_is_func(t)) { > + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; > + aux->btf_var.mem_size = 0; This if statement has the following conditions in master: if (percpu) { // ... } else if (!btf_type_is_struct(t)) { // ... } else { // ... } Conditions `!btf_type_is_func()` and `!btf_type_is_struct()` are not mutually exclusive, thus adding `if (!btf_type_is_func())` would match certain conditions that were previously matched by struct case, wouldn't it? E.g. if type is `BTF_KIND_INT`? Although, I was not able to trigger it, as it seems that pahole only encodes per-cpu vars in BTF. > } else if (!btf_type_is_struct(t)) { > const struct btf_type *ret; > const char *tname; > diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h > index 7d12d3e620cc..43abe4c29409 100644 > --- a/tools/lib/bpf/bpf_helpers.h > +++ b/tools/lib/bpf/bpf_helpers.h > @@ -177,6 +177,9 @@ enum libbpf_tristate { > #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) > #define __kptr __attribute__((btf_type_tag("kptr"))) > > +/* pass function pointer through asm otherwise compiler assumes that any function != 0 */ > +#define bpf_kfunc_exists(fn) ({ void *__p = fn; asm ("" : "+r"(__p)); __p; }) > + > #ifndef ___bpf_concat > #define ___bpf_concat(a, b) a ## b > #endif
On Thu, Mar 16, 2023 at 7:14 AM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Wed, 2023-03-15 at 15:36 -0700, Alexei Starovoitov wrote: > > From: Alexei Starovoitov <ast@kernel.org> > > > > Allow ld_imm64 insn with BPF_PSEUDO_BTF_ID to hold the address of kfunc. > > PTR_MEM is already recognized for NULL-ness by is_branch_taken(), > > so unresolved kfuncs will be seen as zero. > > This allows BPF programs to detect at load time whether kfunc is present > > in the kernel with bpf_kfunc_exists() macro. > > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > > --- > > kernel/bpf/verifier.c | 7 +++++-- > > tools/lib/bpf/bpf_helpers.h | 3 +++ > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 60793f793ca6..4e49d34d8cd6 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > goto err_put; > > } > > > > - if (!btf_type_is_var(t)) { > > - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); > > + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { > > + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); > > err = -EINVAL; > > goto err_put; > > } > > @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; > > aux->btf_var.btf = btf; > > aux->btf_var.btf_id = type; > > + } else if (!btf_type_is_func(t)) { > > + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; > > + aux->btf_var.mem_size = 0; > > This if statement has the following conditions in master: > > if (percpu) { > // ... > } else if (!btf_type_is_struct(t)) { > // ... > } else { > // ... > } > > Conditions `!btf_type_is_func()` and `!btf_type_is_struct()` are > not mutually exclusive, thus adding `if (!btf_type_is_func())` > would match certain conditions that were previously matched by struct > case, wouldn't it? E.g. if type is `BTF_KIND_INT`? ohh. good catch! > Although, I was not able to trigger it, as it seems that pahole only > encodes per-cpu vars in BTF. right. that's why we don't have selftests for this code that could have caught my braino. There were patches to add vars to vmlinux btf, but it didn't materialize yet.
On Wed, Mar 15, 2023 at 3:36 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > From: Alexei Starovoitov <ast@kernel.org> > > Allow ld_imm64 insn with BPF_PSEUDO_BTF_ID to hold the address of kfunc. > PTR_MEM is already recognized for NULL-ness by is_branch_taken(), > so unresolved kfuncs will be seen as zero. > This allows BPF programs to detect at load time whether kfunc is present > in the kernel with bpf_kfunc_exists() macro. > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > --- > kernel/bpf/verifier.c | 7 +++++-- > tools/lib/bpf/bpf_helpers.h | 3 +++ > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 60793f793ca6..4e49d34d8cd6 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > goto err_put; > } > > - if (!btf_type_is_var(t)) { > - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); > + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { > + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); > err = -EINVAL; > goto err_put; > } > @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; > aux->btf_var.btf = btf; > aux->btf_var.btf_id = type; > + } else if (!btf_type_is_func(t)) { > + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; > + aux->btf_var.mem_size = 0; > } else if (!btf_type_is_struct(t)) { > const struct btf_type *ret; > const char *tname; > diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h > index 7d12d3e620cc..43abe4c29409 100644 > --- a/tools/lib/bpf/bpf_helpers.h > +++ b/tools/lib/bpf/bpf_helpers.h > @@ -177,6 +177,9 @@ enum libbpf_tristate { > #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) > #define __kptr __attribute__((btf_type_tag("kptr"))) > > +/* pass function pointer through asm otherwise compiler assumes that any function != 0 */ > +#define bpf_kfunc_exists(fn) ({ void *__p = fn; asm ("" : "+r"(__p)); __p; }) > + I think we shouldn't add this helper macro. It just obfuscates a misuse of libbpf features and will be more confusing in practice. If I understand the comment, that asm is there to avoid compiler optimization of *knowing* that kfunc exists (it's extern is resolved to something other than 0), even if kfunc's ksym is not declared with __weak. But that's actually bad and misleading, as even if code is written to use kfunc as optional, libbpf will fail load even before we'll get to kernel, as it won't be able to find ksym's BTF information in kernel BTF. Optional kfunc *has* to be marked __weak. __weak has a consistent semantics to indicate something that's optional. This is documented (e.g., [0] for kconfig variables) We do have tests making sure this works for weak __kconfig and variable __ksyms. Let's add similar ones for kfunc ksyms. [0] https://nakryiko.com/posts/bpf-core-reference-guide/#kconfig-extern-variables Just to demonstrate what I mentioned above, I tried this quick experiment. Commented out block assumes that feature detection is done by user-space and use_kfunc is set to true or false, depending on whether that kfunc is detected. But if bpf_iter_num_new1 is defined as non-weak __ksym, this fails with either use_kfunc=true or use_kfunc=false. Which is correct behavior from libbpf's POV. On the other hand, the second part, which your patch now makes possible, is the proper way to detect if kfunc is defined and that kfunc is defined as __weak. It works, even if kfunc is not present in the kernel. So I think bpf_kfunc_exists() will just hide and obfuscate the actual issue (lack of __weak marking for something that's optional). diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c index 4b8e37f7fd06..92291a0727b7 100644 --- a/tools/testing/selftests/bpf/progs/test_vmlinux.c +++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c @@ -6,15 +6,21 @@ #include <bpf/bpf_helpers.h> #include <bpf/bpf_tracing.h> #include <bpf/bpf_core_read.h> +#include "bpf_misc.h" #define MY_TV_NSEC 1337 +const volatile bool use_kfunc = false; + bool tp_called = false; bool raw_tp_called = false; bool tp_btf_called = false; bool kprobe_called = false; bool fentry_called = false; +extern int bpf_iter_num_new1(struct bpf_iter_num *it, int start, int end) __ksym; +extern int bpf_iter_num_new2(struct bpf_iter_num *it, int start, int end) __ksym __weak; + SEC("tp/syscalls/sys_enter_nanosleep") int handle__tp(struct trace_event_raw_sys_enter *args) { @@ -24,6 +30,19 @@ int handle__tp(struct trace_event_raw_sys_enter *args) if (args->id != __NR_nanosleep) return 0; + /* + if (use_kfunc) { // fails + struct bpf_iter_num it; + bpf_iter_num_new1(&it, 0, 100); + bpf_iter_num_destroy(&it); + } + */ + if (bpf_iter_num_new2) { // works + struct bpf_iter_num it; + bpf_iter_num_new2(&it, 0, 100); + bpf_iter_num_destroy(&it); + } + ts = (void *)args->args[0]; if (bpf_probe_read_user(&tv_nsec, sizeof(ts->tv_nsec), &ts->tv_nsec) || tv_nsec != MY_TV_NSEC) > #ifndef ___bpf_concat > #define ___bpf_concat(a, b) a ## b > #endif > -- > 2.34.1 >
On Thu, Mar 16, 2023 at 1:34 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, Mar 15, 2023 at 3:36 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > From: Alexei Starovoitov <ast@kernel.org> > > > > Allow ld_imm64 insn with BPF_PSEUDO_BTF_ID to hold the address of kfunc. > > PTR_MEM is already recognized for NULL-ness by is_branch_taken(), > > so unresolved kfuncs will be seen as zero. > > This allows BPF programs to detect at load time whether kfunc is present > > in the kernel with bpf_kfunc_exists() macro. > > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > > --- > > kernel/bpf/verifier.c | 7 +++++-- > > tools/lib/bpf/bpf_helpers.h | 3 +++ > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 60793f793ca6..4e49d34d8cd6 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > goto err_put; > > } > > > > - if (!btf_type_is_var(t)) { > > - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); > > + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { > > + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); > > err = -EINVAL; > > goto err_put; > > } > > @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; > > aux->btf_var.btf = btf; > > aux->btf_var.btf_id = type; > > + } else if (!btf_type_is_func(t)) { > > + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; > > + aux->btf_var.mem_size = 0; > > } else if (!btf_type_is_struct(t)) { > > const struct btf_type *ret; > > const char *tname; > > diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h > > index 7d12d3e620cc..43abe4c29409 100644 > > --- a/tools/lib/bpf/bpf_helpers.h > > +++ b/tools/lib/bpf/bpf_helpers.h > > @@ -177,6 +177,9 @@ enum libbpf_tristate { > > #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) > > #define __kptr __attribute__((btf_type_tag("kptr"))) > > > > +/* pass function pointer through asm otherwise compiler assumes that any function != 0 */ > > +#define bpf_kfunc_exists(fn) ({ void *__p = fn; asm ("" : "+r"(__p)); __p; }) > > + > > I think we shouldn't add this helper macro. It just obfuscates a > misuse of libbpf features and will be more confusing in practice. I don't think it obfuscates anything. If __weak is missing in extern declaration of kfunc the libbpf will error anyway, so there is no danger to miss it. > If I understand the comment, that asm is there to avoid compiler > optimization of *knowing* that kfunc exists (it's extern is resolved > to something other than 0), even if kfunc's ksym is not declared with > __weak. That's the current behavior of the combination of llvm and libbpf. Resolution of weak is a linker job. libbpf loading process is equivalent to linking. It's "linking" bpf elf .o into kernel and resolving weak symbols. We can document and guarantee that libbpf will evaluate unresolved into zero, but we might have a hard time doing the same with compilers. It's currently the case for LLVM and I hope GCC will follow. Here is nice writeup about weak: https://maskray.me/blog/2021-04-25-weak-symbol Two things to notice in that writeup: 1. "Unresolved weak symbols have a zero value." This is part of ELF spec for linkers. In our case it applies to libbpf. But it doesn't apply to compilers. 2. "GCC<5 (at least x86_64 and arm) may emit PC-relative relocations for hidden undefined weak symbols. GCC<5 i386 may optimize if (&foo) foo(); to unconditional foo();" In other words if compiler implements weak as PC-relative the optimizer part of the compiler may consider it as always not-null and optimize the check out. We can try to prevent that in LLVM and GCC compilers. Another approach is to have a macro that passes weak addr through asm which prevents such optimization. So we still rely on libbpf resolving to zero while "linking" and macro prevents compilers from messing up the check. I feel it's safer to do it with macro. I guess I'm fine leaving it out for now and just do if (bpf_rcu_read_lock) bpf_rcu_read_lock(); though such code looks ugly and begs for a comment or macro like: if (bpf_kfunc_exists(bpf_rcu_read_lock)) bpf_rcu_read_lock(); or if (bpf_rcu_read_lock) // unknown weak kfunc resolve to zero by libbpf // and compiler won't optimize it out bpf_rcu_read_lock(); but adding such comment everywhere feels odd. macro is a cleaner way to document the code. > __weak has a consistent semantics to indicate something that's > optional. This is documented (e.g., [0] for kconfig variables) We do > have tests making sure this works for weak __kconfig and variable > __ksyms. Let's add similar ones for kfunc ksyms. Right. We should definitely document that libbpf resolves unknown __ksym __weak into zero. > + if (bpf_iter_num_new2) { // works > + struct bpf_iter_num it; > + bpf_iter_num_new2(&it, 0, 100); > + bpf_iter_num_destroy(&it); > + } It works, but how many people know that unknown weak resolves to zero ? I didn't know until yesterday.
On Thu, Mar 16, 2023 at 3:26 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Mar 16, 2023 at 1:34 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Mar 15, 2023 at 3:36 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > From: Alexei Starovoitov <ast@kernel.org> > > > > > > Allow ld_imm64 insn with BPF_PSEUDO_BTF_ID to hold the address of kfunc. > > > PTR_MEM is already recognized for NULL-ness by is_branch_taken(), > > > so unresolved kfuncs will be seen as zero. > > > This allows BPF programs to detect at load time whether kfunc is present > > > in the kernel with bpf_kfunc_exists() macro. > > > > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org> > > > --- > > > kernel/bpf/verifier.c | 7 +++++-- > > > tools/lib/bpf/bpf_helpers.h | 3 +++ > > > 2 files changed, 8 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > > index 60793f793ca6..4e49d34d8cd6 100644 > > > --- a/kernel/bpf/verifier.c > > > +++ b/kernel/bpf/verifier.c > > > @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > > goto err_put; > > > } > > > > > > - if (!btf_type_is_var(t)) { > > > - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); > > > + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { > > > + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); > > > err = -EINVAL; > > > goto err_put; > > > } > > > @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, > > > aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; > > > aux->btf_var.btf = btf; > > > aux->btf_var.btf_id = type; > > > + } else if (!btf_type_is_func(t)) { > > > + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; > > > + aux->btf_var.mem_size = 0; > > > } else if (!btf_type_is_struct(t)) { > > > const struct btf_type *ret; > > > const char *tname; > > > diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h > > > index 7d12d3e620cc..43abe4c29409 100644 > > > --- a/tools/lib/bpf/bpf_helpers.h > > > +++ b/tools/lib/bpf/bpf_helpers.h > > > @@ -177,6 +177,9 @@ enum libbpf_tristate { > > > #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) > > > #define __kptr __attribute__((btf_type_tag("kptr"))) > > > > > > +/* pass function pointer through asm otherwise compiler assumes that any function != 0 */ > > > +#define bpf_kfunc_exists(fn) ({ void *__p = fn; asm ("" : "+r"(__p)); __p; }) > > > + > > > > I think we shouldn't add this helper macro. It just obfuscates a > > misuse of libbpf features and will be more confusing in practice. > > I don't think it obfuscates anything. > If __weak is missing in extern declaration of kfunc the libbpf will > error anyway, so there is no danger to miss it. > > > If I understand the comment, that asm is there to avoid compiler > > optimization of *knowing* that kfunc exists (it's extern is resolved > > to something other than 0), even if kfunc's ksym is not declared with > > __weak. > > That's the current behavior of the combination of llvm and libbpf. > Resolution of weak is a linker job. libbpf loading process is > equivalent to linking. It's "linking" bpf elf .o into kernel and > resolving weak symbols. > We can document and guarantee that libbpf will evaluate > unresolved into zero, but we might have a hard time doing the same with > compilers. It's currently the case for LLVM and I hope GCC will follow. > Here is nice writeup about weak: > https://maskray.me/blog/2021-04-25-weak-symbol > > Two things to notice in that writeup: > 1. "Unresolved weak symbols have a zero value." > This is part of ELF spec for linkers. > In our case it applies to libbpf. > But it doesn't apply to compilers. > > 2. "GCC<5 (at least x86_64 and arm) may emit PC-relative relocations > for hidden undefined weak symbols. GCC<5 i386 may optimize if (&foo) > foo(); to unconditional foo();" Ok, so /* pass function pointer through asm otherwise compiler assumes that any function != 0 */ comment was referring to compiler assuming that function != 0 for __weak symbol? I definitely didn't read it this way. And "compiler assumes that function != 0" seems a bit too strong of a statement, because at least Clang doesn't. > > In other words if compiler implements weak as PC-relative > the optimizer part of the compiler may consider it as always not-null > and optimize the check out. > > We can try to prevent that in LLVM and GCC compilers. I'd definitely do that, yes. Seems like GCC also realized that this is not good, so GCC>5 don't do this "optimization" (or whatever it was). It seems like we are good right now. We have tests for validating this for .kconfig and .ksym, so regardless of the macro let's have tests for .ksym functions as well. I think it's reasonable behavior that Clang has today and having a test we can detect regression and work with compilers to fix this. Just like we did previously with .rodata where GCC and Clang diverged. > Another approach is to have a macro that passes weak addr through asm > which prevents such optimization. > So we still rely on libbpf resolving to zero while "linking" and > macro prevents compilers from messing up the check. > I feel it's safer to do it with macro. > > I guess I'm fine leaving it out for now and just do > if (bpf_rcu_read_lock) > bpf_rcu_read_lock(); > > though such code looks ugly and begs for a comment or macro like: > > if (bpf_kfunc_exists(bpf_rcu_read_lock)) > bpf_rcu_read_lock(); > > or > > if (bpf_rcu_read_lock) // unknown weak kfunc resolve to zero by libbpf > // and compiler won't optimize it out > bpf_rcu_read_lock(); > > but adding such comment everywhere feels odd. > macro is a cleaner way to document the code. It's subjective. To me, checking whether __weak extern is non-NULL seems pretty clean and explicit. From blog that you referred: > The ELF specification says "Unresolved weak symbols have a zero value." This property can be used to check whether a definition is provided. So this is quite intended use cases even outside of BPF world. But for macro, it's not kfunc-specific (and macro itself has no way to check that you are actually passing kfunc ksym), so even if it was a macro, it would be better to call it something more generic like bpf_ksym_exists() (though that will work for .kconfig, yet will be inappropriately named). The asm bit, though, seems to be a premature thing that can hide real compiler issues, so I'm still hesitant to add it. It should work today with modern compilers, so I'd test and validate this. > > > __weak has a consistent semantics to indicate something that's > > optional. This is documented (e.g., [0] for kconfig variables) We do > > have tests making sure this works for weak __kconfig and variable > > __ksyms. Let's add similar ones for kfunc ksyms. > > Right. We should definitely document that libbpf resolves > unknown __ksym __weak into zero. Yep. > > > + if (bpf_iter_num_new2) { // works > > + struct bpf_iter_num it; > > + bpf_iter_num_new2(&it, 0, 100); > > + bpf_iter_num_destroy(&it); > > + } > > It works, but how many people know that unknown weak resolves to zero ? > I didn't know until yesterday. I was explicit about this from the very beginning of BPF CO-RE work. ksyms were added later, but semantics was consistent between .kconfig and .ksym. Documentation can't be ever good enough and discoverable enough (like [0]), of course, but let's do our best to make it as obvious as possible. This __weak behavior is tested and used in multiple selftests as well: $ git grep -E '(__kconfig|__ksym) __weak' progs/bpf_iter_ipv6_route.c:extern bool CONFIG_IPV6_SUBTREES __kconfig __weak; progs/get_func_ip_test.c:extern bool CONFIG_X86_KERNEL_IBT __kconfig __weak; progs/linked_vars1.c:extern bool CONFIG_BPF_SYSCALL __kconfig __weak; progs/linked_vars1.c:extern const void bpf_link_fops __ksym __weak; progs/lsm_cgroup.c:extern bool CONFIG_SECURITY_SELINUX __kconfig __weak; progs/lsm_cgroup.c:extern bool CONFIG_SECURITY_SMACK __kconfig __weak; progs/lsm_cgroup.c:extern bool CONFIG_SECURITY_APPARMOR __kconfig __weak; progs/profiler.inc.h:extern bool CONFIG_CGROUP_PIDS __kconfig __weak; progs/read_bpf_task_storage_busy.c:extern bool CONFIG_PREEMPT __kconfig __weak; progs/task_kfunc_success.c:void invalid_kfunc(void) __ksym __weak; progs/task_storage_nodeadlock.c:extern bool CONFIG_PREEMPT __kconfig __weak; progs/test_core_extern.c:extern int LINUX_UNKNOWN_VIRTUAL_EXTERN __kconfig __weak; progs/test_core_extern.c:extern enum libbpf_tristate CONFIG_TRISTATE __kconfig __weak; progs/test_core_extern.c:extern bool CONFIG_BOOL __kconfig __weak; progs/test_core_extern.c:extern char CONFIG_CHAR __kconfig __weak; progs/test_core_extern.c:extern uint16_t CONFIG_USHORT __kconfig __weak; progs/test_core_extern.c:extern int CONFIG_INT __kconfig __weak; progs/test_core_extern.c:extern uint64_t CONFIG_ULONG __kconfig __weak; progs/test_core_extern.c:extern const char CONFIG_STR[8] __kconfig __weak; progs/test_core_extern.c:extern uint64_t CONFIG_MISSING __kconfig __weak; progs/test_ksyms.c:extern const void bpf_link_fops1 __ksym __weak; progs/test_ksyms_module.c:extern void bpf_testmod_invalid_mod_kfunc(void) __ksym __weak; progs/test_ksyms_weak.c:extern const struct rq runqueues __ksym __weak; /* typed */ progs/test_ksyms_weak.c:extern const void bpf_prog_active __ksym __weak; /* typeless */ progs/test_ksyms_weak.c:extern const void bpf_link_fops1 __ksym __weak; progs/test_ksyms_weak.c:extern const int bpf_link_fops2 __ksym __weak; [0] https://nakryiko.com/posts/bpf-core-reference-guide/#kconfig-extern-variables
On Thu, Mar 16, 2023 at 04:06:02PM -0700, Andrii Nakryiko wrote: > > /* pass function pointer through asm otherwise compiler assumes that > any function != 0 */ > > comment was referring to compiler assuming that function != 0 for > __weak symbol? I definitely didn't read it this way. And "compiler > assumes that function != 0" seems a bit too strong of a statement, > because at least Clang doesn't. Correct. Instead of 'any function' it should be 'any non-weak function'. > > But for macro, it's not kfunc-specific (and macro itself has no way to > check that you are actually passing kfunc ksym), so even if it was a > macro, it would be better to call it something more generic like > bpf_ksym_exists() (though that will work for .kconfig, yet will be > inappropriately named). Rigth. bpf_ksym_exists() is what I proposed couple emails ago in my reply to Ed. > The asm bit, though, seems to be a premature thing that can hide real > compiler issues, so I'm still hesitant to add it. It should work today > with modern compilers, so I'd test and validate this. We're using asm in lots of place to avoid fighting with compiler. This is just one of them, but I found a different way to prevent silent optimizations. I'll go with: #define bpf_ksym_exists(sym) \ ({ _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); !!sym; }) It will address the silent dead code elimination issue and will detect missing weak immediately at build time. Just going with: if (bpf_rcu_read_lock) // comment about weak bpf_rcu_read_lock(); else .. and forgetting to use __weak in bpf_rcu_read_lock() declaration will make it "work" on current kernels. The compiler will optimize 'if' and 'else' out and libbpf will happily find that kfunc in the kernel. While the program will fail to load on kernels without this kfunc much later with: libbpf: extern (func ksym) 'bpf_rcu_read_lock': not found in kernel or module BTFs. Which is the opposite of what that block of bpf code wanted to achieve. > > It works, but how many people know that unknown weak resolves to zero ? > > I didn't know until yesterday. > > I was explicit about this from the very beginning of BPF CO-RE work. > ksyms were added later, but semantics was consistent between .kconfig > and .ksym. Documentation can't be ever good enough and discoverable > enough (like [0]), of course, but let's do our best to make it as ... > [0] https://nakryiko.com/posts/bpf-core-reference-guide/#kconfig-extern-variables I read it long ago, but reading is one thing and remembering small details is another.
On Thu, Mar 16, 2023 at 6:39 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Mar 16, 2023 at 04:06:02PM -0700, Andrii Nakryiko wrote: > > > > /* pass function pointer through asm otherwise compiler assumes that > > any function != 0 */ > > > > comment was referring to compiler assuming that function != 0 for > > __weak symbol? I definitely didn't read it this way. And "compiler > > assumes that function != 0" seems a bit too strong of a statement, > > because at least Clang doesn't. > > Correct. Instead of 'any function' it should be 'any non-weak function'. ok, your new macro implementation seems to prevent usage of non-weak ksyms, which is great > > > > > But for macro, it's not kfunc-specific (and macro itself has no way to > > check that you are actually passing kfunc ksym), so even if it was a > > macro, it would be better to call it something more generic like > > bpf_ksym_exists() (though that will work for .kconfig, yet will be > > inappropriately named). > > Rigth. bpf_ksym_exists() is what I proposed couple emails ago in my reply to Ed. > I don't see it, maybe some email was dropped. But sounds good. > > The asm bit, though, seems to be a premature thing that can hide real > > compiler issues, so I'm still hesitant to add it. It should work today > > with modern compilers, so I'd test and validate this. > > We're using asm in lots of place to avoid fighting with compiler. > This is just one of them, but I found a different way to prevent > silent optimizations. I'll go with: > > #define bpf_ksym_exists(sym) \ > ({ _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); !!sym; }) > > It will address the silent dead code elimination issue and > will detect missing weak immediately at build time. Yep, I like this protection against using non-weak ksym with this check. It's very helpful, thanks! > > Just going with: > > if (bpf_rcu_read_lock) // comment about weak > bpf_rcu_read_lock(); > else > .. > > and forgetting to use __weak in bpf_rcu_read_lock() declaration will make it > "work" on current kernels. The compiler will optimize 'if' and 'else' out and > libbpf will happily find that kfunc in the kernel. > While the program will fail to load on kernels without this kfunc much later with: > libbpf: extern (func ksym) 'bpf_rcu_read_lock': not found in kernel or module BTFs. > Yep. Good testing is still important, of course. > Which is the opposite of what that block of bpf code wanted to achieve. I like the new bpf_ksym_exists() implementation, now I think it adds value, instead of hiding an issue. > > > > It works, but how many people know that unknown weak resolves to zero ? > > > I didn't know until yesterday. > > > > I was explicit about this from the very beginning of BPF CO-RE work. > > ksyms were added later, but semantics was consistent between .kconfig > > and .ksym. Documentation can't be ever good enough and discoverable > > enough (like [0]), of course, but let's do our best to make it as > ... > > [0] https://nakryiko.com/posts/bpf-core-reference-guide/#kconfig-extern-variables > > I read it long ago, but reading is one thing and remembering small details is another. That's understandable, no worries. I'm just saying that this is officially supported semantics, so if compilers somehow break this, I'd like to rely on BPF selftests to detect this early, thanks to BPF CI's use of nightly Clang builds (and eventually hopefully we'll also use GCC-BPF nightlies).
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 60793f793ca6..4e49d34d8cd6 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -15955,8 +15955,8 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, goto err_put; } - if (!btf_type_is_var(t)) { - verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id); + if (!btf_type_is_var(t) && !btf_type_is_func(t)) { + verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR or KIND_FUNC\n", id); err = -EINVAL; goto err_put; } @@ -15990,6 +15990,9 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env, aux->btf_var.reg_type = PTR_TO_BTF_ID | MEM_PERCPU; aux->btf_var.btf = btf; aux->btf_var.btf_id = type; + } else if (!btf_type_is_func(t)) { + aux->btf_var.reg_type = PTR_TO_MEM | MEM_RDONLY; + aux->btf_var.mem_size = 0; } else if (!btf_type_is_struct(t)) { const struct btf_type *ret; const char *tname; diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h index 7d12d3e620cc..43abe4c29409 100644 --- a/tools/lib/bpf/bpf_helpers.h +++ b/tools/lib/bpf/bpf_helpers.h @@ -177,6 +177,9 @@ enum libbpf_tristate { #define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted"))) #define __kptr __attribute__((btf_type_tag("kptr"))) +/* pass function pointer through asm otherwise compiler assumes that any function != 0 */ +#define bpf_kfunc_exists(fn) ({ void *__p = fn; asm ("" : "+r"(__p)); __p; }) + #ifndef ___bpf_concat #define ___bpf_concat(a, b) a ## b #endif