Message ID | 20240401223058.1503400-1-thinker.li@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | [bpf-next] selftests/bpf: Make sure libbpf doesn't enforce the signature of a func pointer. |
Kui-Feng Lee wrote:
> The verifier in the kernel checks the signatures of struct_ops
> operators. Libbpf should not verify it in order to allow flexibility in
> loading different implementations of an operator with different signatures
> to try to comply with the kernel, even if the signature defined in the BPF
> programs does not match with the implementations and the kernel.
>
> This feature enables user space applications to manage the variations
> between different versions of the kernel by attempting various
> implementations of an operator.

What is the utility of this? I'm missing what difference it would be
if libbpf rejected vs kernel rejecting it? For backwards compat the
kernel will fail or libbpf might throw an error and user will have to
fixup signature regardless right? Why not get the error as early as
possible.

>
> This is a follow-up of the commit c911fc61a7ce ("libbpf: Skip zeroed or
> null fields if not found in the kernel type.")
>
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> ---
>  .../bpf/prog_tests/test_struct_ops_module.c   | 24 +++++++++++++++++++
>  .../selftests/bpf/progs/struct_ops_module.c   | 13 ++++++++++
>  2 files changed, 37 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
> index 098776d00ab4..7cf2b9ddd3e1 100644
> --- a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
> +++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
> @@ -138,11 +138,35 @@ static void test_struct_ops_not_zeroed(void)
>  	struct_ops_module__destroy(skel);
>  }
>
> +/* The signature of an implementation might not match the signature of the
> + * function pointer prototype defined in the BPF program. This mismatch
> + * should be allowed as long as the behavior of the operator program
> + * adheres to the signature in the kernel. Libbpf should not enforce the
> + * signature; rather, let the kernel verifier handle the enforcement.
> + */
> +static void test_struct_ops_incompatible(void)
> +{
> +	struct struct_ops_module *skel;
> +	struct bpf_link *link;
> +
> +	skel = struct_ops_module__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "open_and_load"))
> +		return;
> +
> +	link = bpf_map__attach_struct_ops(skel->maps.testmod_incompatible);
> +	if (ASSERT_OK_PTR(link, "attach_struct_ops"))
> +		bpf_link__destroy(link);
> +
> +	struct_ops_module__destroy(skel);
> +}
> +
>  void serial_test_struct_ops_module(void)
>  {
>  	if (test__start_subtest("test_struct_ops_load"))
>  		test_struct_ops_load();
>  	if (test__start_subtest("test_struct_ops_not_zeroed"))
>  		test_struct_ops_not_zeroed();
> +	if (test__start_subtest("test_struct_ops_incompatible"))
> +		test_struct_ops_incompatible();
>  }
>
> diff --git a/tools/testing/selftests/bpf/progs/struct_ops_module.c b/tools/testing/selftests/bpf/progs/struct_ops_module.c
> index 86e1e50c5531..63b065dae002 100644
> --- a/tools/testing/selftests/bpf/progs/struct_ops_module.c
> +++ b/tools/testing/selftests/bpf/progs/struct_ops_module.c
> @@ -68,3 +68,16 @@ struct bpf_testmod_ops___zeroed testmod_zeroed = {
>  	.test_1 = (void *)test_1,
>  	.test_2 = (void *)test_2_v2,
>  };
> +
> +struct bpf_testmod_ops___incompatible {
> +	int (*test_1)(void);
> +	void (*test_2)(int *a);
> +	int data;
> +};
> +
> +SEC(".struct_ops.link")
> +struct bpf_testmod_ops___incompatible testmod_incompatible = {
> +	.test_1 = (void *)test_1,
> +	.test_2 = (void *)test_2,
> +	.data = 3,
> +};
> --
> 2.34.1
>
>
On 4/1/24 18:43, John Fastabend wrote:
> Kui-Feng Lee wrote:
>> The verifier in the kernel checks the signatures of struct_ops
>> operators. Libbpf should not verify it in order to allow flexibility in
>> loading different implementations of an operator with different signatures
>> to try to comply with the kernel, even if the signature defined in the BPF
>> programs does not match with the implementations and the kernel.
>>
>> This feature enables user space applications to manage the variations
>> between different versions of the kernel by attempting various
>> implementations of an operator.
>
> What is the utility of this? I'm missing what difference it would be
> if libbpf rejected vs kernel rejecting it? For backwards compat the
> kernel will fail or libbpf might throw an error and user will have to
> fixup signature regardless right? Why not get the error as early as
> possible.

The check described here is that libbpf compares BTF types of functions
and function pointers in struct_ops types in BPF programs, which may
differ from kernel definitions.

A scenario here is a struct_ops type that includes an operator op_A with
different versions depending on the kernel. All other fields in the
struct_ops type have the same types. The application has only one
definition for this struct_ops type, but the implementation of op_A is
done separately for each version.

The application can try variations by assigning implementations to the
op_A field until one is accepted by the kernel if libbpf doesn’t enforce
signatures. Otherwise, the application has to define this struct_ops
type for each variant if libbpf enforces signatures.

Does that make sense to you?

>
>>
>> This is a follow-up of the commit c911fc61a7ce ("libbpf: Skip zeroed or
>> null fields if not found in the kernel type.")
>>
>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>> ---
>>  .../bpf/prog_tests/test_struct_ops_module.c   | 24 +++++++++++++++++++
>>  .../selftests/bpf/progs/struct_ops_module.c   | 13 ++++++++++
>>  2 files changed, 37 insertions(+)
>>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
>> index 098776d00ab4..7cf2b9ddd3e1 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
>> @@ -138,11 +138,35 @@ static void test_struct_ops_not_zeroed(void)
>>  	struct_ops_module__destroy(skel);
>>  }
>>
>> +/* The signature of an implementation might not match the signature of the
>> + * function pointer prototype defined in the BPF program. This mismatch
>> + * should be allowed as long as the behavior of the operator program
>> + * adheres to the signature in the kernel. Libbpf should not enforce the
>> + * signature; rather, let the kernel verifier handle the enforcement.
>> + */
>> +static void test_struct_ops_incompatible(void)
>> +{
>> +	struct struct_ops_module *skel;
>> +	struct bpf_link *link;
>> +
>> +	skel = struct_ops_module__open_and_load();
>> +	if (!ASSERT_OK_PTR(skel, "open_and_load"))
>> +		return;
>> +
>> +	link = bpf_map__attach_struct_ops(skel->maps.testmod_incompatible);
>> +	if (ASSERT_OK_PTR(link, "attach_struct_ops"))
>> +		bpf_link__destroy(link);
>> +
>> +	struct_ops_module__destroy(skel);
>> +}
>> +
>>  void serial_test_struct_ops_module(void)
>>  {
>>  	if (test__start_subtest("test_struct_ops_load"))
>>  		test_struct_ops_load();
>>  	if (test__start_subtest("test_struct_ops_not_zeroed"))
>>  		test_struct_ops_not_zeroed();
>> +	if (test__start_subtest("test_struct_ops_incompatible"))
>> +		test_struct_ops_incompatible();
>>  }
>>
>> diff --git a/tools/testing/selftests/bpf/progs/struct_ops_module.c b/tools/testing/selftests/bpf/progs/struct_ops_module.c
>> index 86e1e50c5531..63b065dae002 100644
>> --- a/tools/testing/selftests/bpf/progs/struct_ops_module.c
>> +++ b/tools/testing/selftests/bpf/progs/struct_ops_module.c
>> @@ -68,3 +68,16 @@ struct bpf_testmod_ops___zeroed testmod_zeroed = {
>>  	.test_1 = (void *)test_1,
>>  	.test_2 = (void *)test_2_v2,
>>  };
>> +
>> +struct bpf_testmod_ops___incompatible {
>> +	int (*test_1)(void);
>> +	void (*test_2)(int *a);
>> +	int data;
>> +};
>> +
>> +SEC(".struct_ops.link")
>> +struct bpf_testmod_ops___incompatible testmod_incompatible = {
>> +	.test_1 = (void *)test_1,
>> +	.test_2 = (void *)test_2,
>> +	.data = 3,
>> +};
>> --
>> 2.34.1
>>
>>
>
>
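The op_A scenario in the reply above (one struct definition, several implementations with different signatures) can be sketched in plain C. This is only an illustration outside of BPF: `my_ops`, `op_a_v1`, `op_a_v2` and `assign_op_a` are hypothetical names, and the `(void *)` cast mirrors the style the selftest uses. Converting a function pointer through `void *` is a common compiler extension (GCC and Clang accept it) rather than strict ISO C, and the sketch only stores the pointer; it never calls through a mismatched type.

```c
#include <stddef.h>

/* Hypothetical stand-in for a struct_ops type: the program side declares
 * op_a with a single prototype, whatever the running kernel expects.
 */
struct my_ops {
	int (*op_a)(int a);
	int data;
};

/* Hypothetical implementation for kernels whose op_a takes one argument. */
static int op_a_v1(int a)
{
	return a + 1;
}

/* Hypothetical implementation for kernels whose op_a takes two arguments. */
static int op_a_v2(int a, int b)
{
	return a + b;
}

/* Store one implementation in the single struct definition. The cast
 * through void * drops the prototype, so nothing on this side compares
 * the implementation's signature with the field's declared type.
 */
static void assign_op_a(struct my_ops *ops, void *impl)
{
	ops->op_a = impl;
}
```

With this layout the application keeps one `struct my_ops` and swaps only the pointer stored in `op_a`; whether a given implementation is acceptable is left to whoever consumes the struct, which in the thread's setting is the kernel.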
On 4/2/24 10:00 AM, Kui-Feng Lee wrote:
>
>
>
> On 4/1/24 18:43, John Fastabend wrote:
>> Kui-Feng Lee wrote:
>>> The verifier in the kernel checks the signatures of struct_ops
>>> operators. Libbpf should not verify it in order to allow flexibility in

This description probably is not accurate. iirc, the verifier does not check the
function signature either. The verifier rejects only when the struct_ops prog
tries to access something invalid. e.g. reading a function argument that does
not exist in the running kernel.

>>> loading different implementations of an operator with different signatures
>>> to try to comply with the kernel, even if the signature defined in the BPF
>>> programs does not match with the implementations and the kernel.

>>> This feature enables user space applications to manage the variations
>>> between different versions of the kernel by attempting various
>>> implementations of an operator.
>>
>> What is the utility of this? I'm missing what difference it would be
>> if libbpf rejected vs kernel rejecting it? For backwards compat the
>> kernel will fail or libbpf might throw an error and user will have to
>> fixup signature regardless right? Why not get the error as early as
>> possible.
>
> The check described here is that libbpf compares BTF types of functions
> and function pointers in struct_ops types in BPF programs, which may
> differ from kernel definitions.
>
> A scenario here is a struct_ops type that includes an operator op_A with
> different versions depending on the kernel. All other fields in the
> struct_ops type have the same types. The application has only one
> definition for this struct_ops type, but the implementation of op_A is
> done separately for each version.
>
> The application can try variations by assigning implementations to the
> op_A field until one is accepted by the kernel if libbpf doesn’t enforce

It probably would be clearer if the test actually does the retry. e.g. Try to
load a struct_ops prog which reads an extra arg that is not supported by the
running kernel and gets rejected by verifier. Then assigns an older struct_ops
prog to the skel->struct_ops...->fn and loads successfully by the verifier.
On Wed, Apr 3, 2024 at 1:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/2/24 10:00 AM, Kui-Feng Lee wrote:
> >
> >
> >
> > On 4/1/24 18:43, John Fastabend wrote:
> >> Kui-Feng Lee wrote:
> >>> The verifier in the kernel checks the signatures of struct_ops
> >>> operators. Libbpf should not verify it in order to allow flexibility in
>
> This description probably is not accurate. iirc, the verifier does not check the
> function signature either. The verifier rejects only when the struct_ops prog
> tries to access something invalid. e.g. reading a function argument that does
> not exist in the running kernel.
>
> >>> loading different implementations of an operator with different signatures
> >>> to try to comply with the kernel, even if the signature defined in the BPF
> >>> programs does not match with the implementations and the kernel.
>
> >>> This feature enables user space applications to manage the variations
> >>> between different versions of the kernel by attempting various
> >>> implementations of an operator.
> >>
> >> What is the utility of this? I'm missing what difference it would be
> >> if libbpf rejected vs kernel rejecting it? For backwards compat the
> >> kernel will fail or libbpf might throw an error and user will have to
> >> fixup signature regardless right? Why not get the error as early as
> >> possible.
> >
> > The check described here is that libbpf compares BTF types of functions
> > and function pointers in struct_ops types in BPF programs, which may
> > differ from kernel definitions.
> >
> > A scenario here is a struct_ops type that includes an operator op_A with
> > different versions depending on the kernel. All other fields in the
> > struct_ops type have the same types. The application has only one
> > definition for this struct_ops type, but the implementation of op_A is
> > done separately for each version.
> >
> > The application can try variations by assigning implementations to the
> > op_A field until one is accepted by the kernel if libbpf doesn’t enforce
>
> It probably would be clearer if the test actually does the retry. e.g. Try to
> load a struct_ops prog which reads an extra arg that is not supported by the
> running kernel and gets rejected by verifier. Then assigns an older struct_ops
> prog to the skel->struct_ops...->fn and loads successfully by the verifier.
>

This is actually a discouraged practice. In practice in production
user-space logic does feature detection (using BTF or whatever else
necessary) and then decides on specific BPF program implementation. So
I wouldn't overstress this approach (trial-and-error one) in tests,
it's a bad and sloppy practice.
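The feature-detection approach described in this reply (inspect the kernel first, then pick one implementation) can be sketched as follows. This is a self-contained model and not libbpf code: `kern_op_desc` stands in for whatever the loader has learned about the running kernel (for example from kernel BTF), and `op_impl` and `select_impl` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical: what the loader has learned about one operator in the
 * running kernel, e.g. by inspecting kernel BTF before loading anything.
 */
struct kern_op_desc {
	const char *name;
	int nr_args;		/* number of arguments the kernel passes */
};

/* Hypothetical: one candidate implementation shipped by the application. */
struct op_impl {
	const char *prog_name;
	int nr_args_read;	/* number of leading arguments the program reads */
};

/* Pick the candidate that reads the most arguments without reading any
 * argument the kernel does not pass. Returns NULL if no candidate fits.
 */
static const struct op_impl *select_impl(const struct kern_op_desc *kop,
					 const struct op_impl *impls, size_t n)
{
	const struct op_impl *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (impls[i].nr_args_read > kop->nr_args)
			continue;
		if (!best || impls[i].nr_args_read > best->nr_args_read)
			best = &impls[i];
	}
	return best;
}
```

The decision is made once, from information about the kernel, so the application loads a single chosen implementation instead of loading variants until one passes the verifier.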
On 4/3/24 13:52, Martin KaFai Lau wrote:
> On 4/2/24 10:00 AM, Kui-Feng Lee wrote:
>>
>>
>>
>> On 4/1/24 18:43, John Fastabend wrote:
>>> Kui-Feng Lee wrote:
>>>> The verifier in the kernel checks the signatures of struct_ops
>>>> operators. Libbpf should not verify it in order to allow flexibility in
>
> This description probably is not accurate. iirc, the verifier does not
> check the function signature either. The verifier rejects only when the
> struct_ops prog tries to access something invalid. e.g. reading a
> function argument that does not exist in the running kernel.

Yes, kernel checks the behavior of programs.
I will change the description.

>
>>>> loading different implementations of an operator with different
>>>> signatures
>>>> to try to comply with the kernel, even if the signature defined in
>>>> the BPF
>>>> programs does not match with the implementations and the kernel.
>
>>>> This feature enables user space applications to manage the variations
>>>> between different versions of the kernel by attempting various
>>>> implementations of an operator.
>>>
>>> What is the utility of this? I'm missing what difference it would be
>>> if libbpf rejected vs kernel rejecting it? For backwards compat the
>>> kernel will fail or libbpf might throw an error and user will have to
>>> fixup signature regardless right? Why not get the error as early as
>>> possible.
>>
>> The check described here is that libbpf compares BTF types of functions
>> and function pointers in struct_ops types in BPF programs, which may
>> differ from kernel definitions.
>>
>> A scenario here is a struct_ops type that includes an operator op_A with
>> different versions depending on the kernel. All other fields in the
>> struct_ops type have the same types. The application has only one
>> definition for this struct_ops type, but the implementation of op_A is
>> done separately for each version.
>>
>> The application can try variations by assigning implementations to the
>> op_A field until one is accepted by the kernel if libbpf doesn’t enforce
>
> It probably would be clearer if the test actually does the retry. e.g.
> Try to load a struct_ops prog which reads an extra arg that is not
> supported by the running kernel and gets rejected by verifier. Then
> assigns an older struct_ops prog to the skel->struct_ops...->fn and
> loads successfully by the verifier.
>
On 4/3/24 14:15, Andrii Nakryiko wrote:
> On Wed, Apr 3, 2024 at 1:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 4/2/24 10:00 AM, Kui-Feng Lee wrote:
>>>
>>>
>>>
>>> On 4/1/24 18:43, John Fastabend wrote:
>>>> Kui-Feng Lee wrote:
>>>>> The verifier in the kernel checks the signatures of struct_ops
>>>>> operators. Libbpf should not verify it in order to allow flexibility in
>>
>> This description probably is not accurate. iirc, the verifier does not check the
>> function signature either. The verifier rejects only when the struct_ops prog
>> tries to access something invalid. e.g. reading a function argument that does
>> not exist in the running kernel.
>>
>>>>> loading different implementations of an operator with different signatures
>>>>> to try to comply with the kernel, even if the signature defined in the BPF
>>>>> programs does not match with the implementations and the kernel.
>>
>>>>> This feature enables user space applications to manage the variations
>>>>> between different versions of the kernel by attempting various
>>>>> implementations of an operator.
>>>>
>>>> What is the utility of this? I'm missing what difference it would be
>>>> if libbpf rejected vs kernel rejecting it? For backwards compat the
>>>> kernel will fail or libbpf might throw an error and user will have to
>>>> fixup signature regardless right? Why not get the error as early as
>>>> possible.
>>>
>>> The check described here is that libbpf compares BTF types of functions
>>> and function pointers in struct_ops types in BPF programs, which may
>>> differ from kernel definitions.
>>>
>>> A scenario here is a struct_ops type that includes an operator op_A with
>>> different versions depending on the kernel. All other fields in the
>>> struct_ops type have the same types. The application has only one
>>> definition for this struct_ops type, but the implementation of op_A is
>>> done separately for each version.
>>>
>>> The application can try variations by assigning implementations to the
>>> op_A field until one is accepted by the kernel if libbpf doesn’t enforce
>>
>> It probably would be clearer if the test actually does the retry. e.g. Try to
>> load a struct_ops prog which reads an extra arg that is not supported by the
>> running kernel and gets rejected by verifier. Then assigns an older struct_ops
>> prog to the skel->struct_ops...->fn and loads successfully by the verifier.
>>
>
> This is actually a discouraged practice. In practice in production
> user-space logic does feature detection (using BTF or whatever else
> necessary) and then decides on specific BPF program implementation. So
> I wouldn't overstress this approach (trial-and-error one) in tests,
> it's a bad and sloppy practice.

It makes sense for me. I will rephrase this paragraph by using
"feature detection" to replace "Try variations...".
diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
index 098776d00ab4..7cf2b9ddd3e1 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c
@@ -138,11 +138,35 @@ static void test_struct_ops_not_zeroed(void)
 	struct_ops_module__destroy(skel);
 }
 
+/* The signature of an implementation might not match the signature of the
+ * function pointer prototype defined in the BPF program. This mismatch
+ * should be allowed as long as the behavior of the operator program
+ * adheres to the signature in the kernel. Libbpf should not enforce the
+ * signature; rather, let the kernel verifier handle the enforcement.
+ */
+static void test_struct_ops_incompatible(void)
+{
+	struct struct_ops_module *skel;
+	struct bpf_link *link;
+
+	skel = struct_ops_module__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	link = bpf_map__attach_struct_ops(skel->maps.testmod_incompatible);
+	if (ASSERT_OK_PTR(link, "attach_struct_ops"))
+		bpf_link__destroy(link);
+
+	struct_ops_module__destroy(skel);
+}
+
 void serial_test_struct_ops_module(void)
 {
 	if (test__start_subtest("test_struct_ops_load"))
 		test_struct_ops_load();
 	if (test__start_subtest("test_struct_ops_not_zeroed"))
 		test_struct_ops_not_zeroed();
+	if (test__start_subtest("test_struct_ops_incompatible"))
+		test_struct_ops_incompatible();
 }
 
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_module.c b/tools/testing/selftests/bpf/progs/struct_ops_module.c
index 86e1e50c5531..63b065dae002 100644
--- a/tools/testing/selftests/bpf/progs/struct_ops_module.c
+++ b/tools/testing/selftests/bpf/progs/struct_ops_module.c
@@ -68,3 +68,16 @@ struct bpf_testmod_ops___zeroed testmod_zeroed = {
 	.test_1 = (void *)test_1,
 	.test_2 = (void *)test_2_v2,
 };
+
+struct bpf_testmod_ops___incompatible {
+	int (*test_1)(void);
+	void (*test_2)(int *a);
+	int data;
+};
+
+SEC(".struct_ops.link")
+struct bpf_testmod_ops___incompatible testmod_incompatible = {
+	.test_1 = (void *)test_1,
+	.test_2 = (void *)test_2,
+	.data = 3,
+};
The verifier in the kernel checks the signatures of struct_ops
operators. Libbpf should not verify it in order to allow flexibility in
loading different implementations of an operator with different signatures
to try to comply with the kernel, even if the signature defined in the BPF
programs does not match with the implementations and the kernel.

This feature enables user space applications to manage the variations
between different versions of the kernel by attempting various
implementations of an operator.

This is a follow-up of the commit c911fc61a7ce ("libbpf: Skip zeroed or
null fields if not found in the kernel type.")

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
---
 .../bpf/prog_tests/test_struct_ops_module.c   | 24 +++++++++++++++++++
 .../selftests/bpf/progs/struct_ops_module.c   | 13 ++++++++++
 2 files changed, 37 insertions(+)