diff mbox series

[bpf-next,2/8] bpf: Allow trusted args to walk struct when checking BTF IDs

Message ID 20230119235833.2948341-3-void@manifault.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series Enable cpumasks to be used as kptrs | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers warning 1 maintainers not CCed: yhs@fb.com
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

David Vernet Jan. 19, 2023, 11:58 p.m. UTC
When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
currently enforces that the top-level type must match when calling
the kfunc. In other words, the verifier does not allow the BPF program
to pass a bitwise equivalent struct, despite it being functionally safe.
For example, if you have the following type:

struct  nf_conn___init {
	struct nf_conn ct;
};

It would be safe to pass a struct nf_conn___init to a kfunc expecting a
struct nf_conn. Being able to do this will be useful for certain types
of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
series of kfuncs will be added which allow programs to do bitwise
queries on cpumasks that are either allocated by the program (in which
case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
its first element), or a cpumask that was allocated by the main kernel
(in which case it will just be a straight cpumask_t, as in
 task->cpus_ptr).

Having the two types of cpumasks allows us to distinguish between the
two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
cannot be. On the other hand, a struct bpf_cpumask can of course be
queried in the exact same manner as a cpumask_t, with e.g.
bpf_cpumask_test_cpu().

If we were to enforce that top level types match, then a user that's
passing a struct bpf_cpumask to a read-only cpumask_t argument would
have to cast with something like bpf_cast_to_kern_ctx() (which itself
would need to be updated to expect the alias, and currently it only
accommodates a single alias per prog type). Additionally, not specifying
KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
struct bpf_cpumask *, and another as a struct cpumask *
(i.e. cpumask_t).

In order to enable this, this patch relaxes the constraint that a
KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
try and be conservative and match existing behavior / expectations, this
patch also enforces strict type checking for acquire kfuncs. We were
already enforcing it for release kfuncs, so this should also improve the
consistency of the semantics for kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
---
 kernel/bpf/verifier.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

Comments

Kumar Kartikeya Dwivedi Jan. 20, 2023, 4:58 a.m. UTC | #1
On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> currently enforces that the top-level type must match when calling
> the kfunc. In other words, the verifier does not allow the BPF program
> to pass a bitwise equivalent struct, despite it being functionally safe.
> For example, if you have the following type:
>
> struct  nf_conn___init {
> 	struct nf_conn ct;
> };
>
> It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> struct nf_conn.

Just running bpf_nf selftest would have shown this is false.

> Being able to do this will be useful for certain types
> of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> series of kfuncs will be added which allow programs to do bitwise
> queries on cpumasks that are either allocated by the program (in which
> case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> its first element), or a cpumask that was allocated by the main kernel
> (in which case it will just be a straight cpumask_t, as in
>  task->cpus_ptr).
>
> Having the two types of cpumasks allows us to distinguish between the
> two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> cannot be. On the other hand, a struct bpf_cpumask can of course be
> queried in the exact same manner as a cpumask_t, with e.g.
> bpf_cpumask_test_cpu().
>
> If we were to enforce that top level types match, then a user that's
> passing a struct bpf_cpumask to a read-only cpumask_t argument would
> have to cast with something like bpf_cast_to_kern_ctx() (which itself
> would need to be updated to expect the alias, and currently it only
> accommodates a single alias per prog type). Additionally, not specifying
> KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> struct bpf_cpumask *, and another as a struct cpumask *
> (i.e. cpumask_t).
>
> In order to enable this, this patch relaxes the constraint that a
> KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> try and be conservative and match existing behavior / expectations, this
> patch also enforces strict type checking for acquire kfuncs. We were
> already enforcing it for release kfuncs, so this should also improve the
> consistency of the semantics for kfuncs.
>

What you want is to simply follow type at off = 0 (but still enforce the off = 0
requirement). This is something which is currently done for bpf_sk_release (for
struct sk_common) in check_reg_type, but it is not safe in general to just open
this up for all cases. I suggest encoding this particular requirement in the
argument, and simply using triple underscore variant of the type for the special
'read_only' requirement. This will allow you to use same type in your BPF C
program, while allowing verifier to see them as two different types in kfunc
parameters. Then just relax type following for the particular argument so that
one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
it just visits first member after failing match on top level type). off = 0
check is still necessary.

So offset checks still need to be according to OBJ_RELEASE but you can relax
strict_type_match bool for the particular arg when calling btf_struct_ids_match.

All code in your tests will then deal with a cpumask_t type only, including in
kfunc declarations. Same as bpf_nf selftests which don't cast from/to
nf_conn___init and only deal with nf_conn pointers even though semantics differ
depending on how it is used and passed around. Overall more convenient and
simple to use.
David Vernet Jan. 20, 2023, 5:23 a.m. UTC | #2
On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > currently enforces that the top-level type must match when calling
> > the kfunc. In other words, the verifier does not allow the BPF program
> > to pass a bitwise equivalent struct, despite it being functionally safe.
> > For example, if you have the following type:
> >
> > struct  nf_conn___init {
> > 	struct nf_conn ct;
> > };
> >
> > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > struct nf_conn.
> 
> Just running bpf_nf selftest would have shown this is false.

And I feel silly, because I did run them, and could have sworn they
passed...looking now at the change_status_after_alloc testcase I see
you're of course correct. Very poor example, thank you for pointing it
out.

> 
> > Being able to do this will be useful for certain types
> > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > series of kfuncs will be added which allow programs to do bitwise
> > queries on cpumasks that are either allocated by the program (in which
> > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > its first element), or a cpumask that was allocated by the main kernel
> > (in which case it will just be a straight cpumask_t, as in
> >  task->cpus_ptr).
> >
> > Having the two types of cpumasks allows us to distinguish between the
> > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > queried in the exact same manner as a cpumask_t, with e.g.
> > bpf_cpumask_test_cpu().
> >
> > If we were to enforce that top level types match, then a user that's
> > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > would need to be updated to expect the alias, and currently it only
> > accommodates a single alias per prog type). Additionally, not specifying
> > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > struct bpf_cpumask *, and another as a struct cpumask *
> > (i.e. cpumask_t).
> >
> > In order to enable this, this patch relaxes the constraint that a
> > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > try and be conservative and match existing behavior / expectations, this
> > patch also enforces strict type checking for acquire kfuncs. We were
> > already enforcing it for release kfuncs, so this should also improve the
> > consistency of the semantics for kfuncs.
> >
> 
> What you want is to simply follow type at off = 0 (but still enforce the off = 0
> requirement). This is something which is currently done for bpf_sk_release (for
> struct sk_common) in check_reg_type, but it is not safe in general to just open
> this up for all cases. I suggest encoding this particular requirement in the
> argument, and simply using triple underscore variant of the type for the special
> 'read_only' requirement. This will allow you to use same type in your BPF C
> program, while allowing verifier to see them as two different types in kfunc
> parameters. Then just relax type following for the particular argument so that
> one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> it just visits first member after failing match on top level type). off = 0
> check is still necessary.

Sigh, yeah, another ___ workaround but I agree it's probably the best we
can do for now, and in general seems pretty useful. Obviously preferable
to this patch which just doesn't work. Alexei, are you OK with this? If
so, I'll take this approach for v2.

> 
> So offset checks still need to be according to OBJ_RELEASE but you can relax
> strict_type_match bool for the particular arg when calling btf_struct_ids_match.
> 
> All code in your tests will then deal with a cpumask_t type only, including in
> kfunc declarations. Same as bpf_nf selftests which don't cast from/to
> nf_conn___init and only deal with nf_conn pointers even though semantics differ
> depending on how it is used and passed around. Overall more convenient and
> simple to use.
Alexei Starovoitov Jan. 20, 2023, 5:40 a.m. UTC | #3
On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > currently enforces that the top-level type must match when calling
> > > the kfunc. In other words, the verifier does not allow the BPF program
> > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > For example, if you have the following type:
> > >
> > > struct  nf_conn___init {
> > > 	struct nf_conn ct;
> > > };
> > >
> > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > struct nf_conn.
> > 
> > Just running bpf_nf selftest would have shown this is false.
> 
> And I feel silly, because I did run them, and could have sworn they
> passed...looking now at the change_status_after_alloc testcase I see
> you're of course correct. Very poor example, thank you for pointing it
> out.
> 
> > 
> > > Being able to do this will be useful for certain types
> > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > series of kfuncs will be added which allow programs to do bitwise
> > > queries on cpumasks that are either allocated by the program (in which
> > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > its first element), or a cpumask that was allocated by the main kernel
> > > (in which case it will just be a straight cpumask_t, as in
> > >  task->cpus_ptr).
> > >
> > > Having the two types of cpumasks allows us to distinguish between the
> > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > queried in the exact same manner as a cpumask_t, with e.g.
> > > bpf_cpumask_test_cpu().
> > >
> > > If we were to enforce that top level types match, then a user that's
> > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > would need to be updated to expect the alias, and currently it only
> > > accommodates a single alias per prog type). Additionally, not specifying
> > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > struct bpf_cpumask *, and another as a struct cpumask *
> > > (i.e. cpumask_t).
> > >
> > > In order to enable this, this patch relaxes the constraint that a
> > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > try and be conservative and match existing behavior / expectations, this
> > > patch also enforces strict type checking for acquire kfuncs. We were
> > > already enforcing it for release kfuncs, so this should also improve the
> > > consistency of the semantics for kfuncs.
> > >
> > 
> > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > requirement). This is something which is currently done for bpf_sk_release (for
> > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > this up for all cases. I suggest encoding this particular requirement in the
> > argument, and simply using triple underscore variant of the type for the special
> > 'read_only' requirement. This will allow you to use same type in your BPF C
> > program, while allowing verifier to see them as two different types in kfunc
> > parameters. Then just relax type following for the particular argument so that
> > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > it just visits first member after failing match on top level type). off = 0
> > check is still necessary.
> 
> Sigh, yeah, another ___ workaround but I agree it's probably the best we
> can do for now, and in general seems pretty useful. Obviously preferable
> to this patch which just doesn't work. Alexei, are you OK with this? If
> so, I'll take this approach for v2.

We decided to rely on strict type match when we introduced 'struct nf_conn___init',
but with that we twisted the C standard to, what looks to be, a wrong direction.

For definition:
struct nf_conn___init {
   struct nf_conn ct;
};
if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
for both read and write, because in C that's valid and safe type cast.

We can fix this design issue by saying that '___init' suffix is special and
C type casting rules don't apply to it.
In all other cases bpf_cpumask/cpumask would should allow it.

__ro suffix idea will keep moving us into further discrepancies with C.
Kumar Kartikeya Dwivedi Jan. 20, 2023, 5:56 a.m. UTC | #4
On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > currently enforces that the top-level type must match when calling
> > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > For example, if you have the following type:
> > > >
> > > > struct  nf_conn___init {
> > > >   struct nf_conn ct;
> > > > };
> > > >
> > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > struct nf_conn.
> > >
> > > Just running bpf_nf selftest would have shown this is false.
> >
> > And I feel silly, because I did run them, and could have sworn they
> > passed...looking now at the change_status_after_alloc testcase I see
> > you're of course correct. Very poor example, thank you for pointing it
> > out.
> >
> > >
> > > > Being able to do this will be useful for certain types
> > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > series of kfuncs will be added which allow programs to do bitwise
> > > > queries on cpumasks that are either allocated by the program (in which
> > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > its first element), or a cpumask that was allocated by the main kernel
> > > > (in which case it will just be a straight cpumask_t, as in
> > > >  task->cpus_ptr).
> > > >
> > > > Having the two types of cpumasks allows us to distinguish between the
> > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > bpf_cpumask_test_cpu().
> > > >
> > > > If we were to enforce that top level types match, then a user that's
> > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > would need to be updated to expect the alias, and currently it only
> > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > (i.e. cpumask_t).
> > > >
> > > > In order to enable this, this patch relaxes the constraint that a
> > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > try and be conservative and match existing behavior / expectations, this
> > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > already enforcing it for release kfuncs, so this should also improve the
> > > > consistency of the semantics for kfuncs.
> > > >
> > >
> > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > requirement). This is something which is currently done for bpf_sk_release (for
> > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > this up for all cases. I suggest encoding this particular requirement in the
> > > argument, and simply using triple underscore variant of the type for the special
> > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > program, while allowing verifier to see them as two different types in kfunc
> > > parameters. Then just relax type following for the particular argument so that
> > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > it just visits first member after failing match on top level type). off = 0
> > > check is still necessary.
> >
> > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > can do for now, and in general seems pretty useful. Obviously preferable
> > to this patch which just doesn't work. Alexei, are you OK with this? If
> > so, I'll take this approach for v2.
>
> We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> but with that we twisted the C standard to, what looks to be, a wrong direction.
>
> For definition:
> struct nf_conn___init {
>    struct nf_conn ct;
> };
> if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> for both read and write, because in C that's valid and safe type cast.
>

The intention of this nf_conn___init was to be invisible to the user.
In selftests there is no trace of nf_conn___init. It is only for
enforcing semantics by virtue of type safety in the verifier.

Allocated but not inserted nf_conn -> nf_conn___init
Inserted/looked up nf_conn -> nf_conn

We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
The allocated nf_conn may not yet be fully initialized. It is only
after bpf_ct_insert_entry takes the nf_conn___init * and returns
inserted nf_conn * should it be allowed.

But for the user in BPF C it will be the same nf_conn. The verifier
can enforce different semantics on the underlying type's usage in
kfuncs etc, while the user performs normal direct access to the
nf_conn.

It will be the same case here, except you also introduce the case of
kfuncs that are 'polymorphic' and can take both. Relaxing
'strict_type_match' for that arg and placing the type of member you
wish to convert the pointer to gives you such polymorphism. But it's
not correct to do for nf_conn___init to nf_conn, at least not by
default.

In the future we may do:

union bpf_subtype {
  type A;
  type B;
  type C;
};

And using the relaxed rule allows all types at off = 0 to be passed to
kfuncs expecting type A/B/C for bpf_subtype *.
bpf_subtype is a fake type. We're just using the type system to
enforce different API usage for the same underlying kernel type.

> We can fix this design issue by saying that '___init' suffix is special and
> C type casting rules don't apply to it.
> In all other cases bpf_cpumask/cpumask would should allow it.
>

I'm just saying the triple underscore is not visible to the user.
You can declare kfunc that is:
struct foo___x *foo_alloc(void); in the kernel as
struct foo *foo_alloc(void); in BPF program and avoid all the
casting/ugliness and still enforce semantics around use.
Alexei Starovoitov Jan. 20, 2023, 6:14 a.m. UTC | #5
On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > currently enforces that the top-level type must match when calling
> > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > For example, if you have the following type:
> > > > >
> > > > > struct  nf_conn___init {
> > > > >   struct nf_conn ct;
> > > > > };
> > > > >
> > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > struct nf_conn.
> > > >
> > > > Just running bpf_nf selftest would have shown this is false.
> > >
> > > And I feel silly, because I did run them, and could have sworn they
> > > passed...looking now at the change_status_after_alloc testcase I see
> > > you're of course correct. Very poor example, thank you for pointing it
> > > out.
> > >
> > > >
> > > > > Being able to do this will be useful for certain types
> > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > (in which case it will just be a straight cpumask_t, as in
> > > > >  task->cpus_ptr).
> > > > >
> > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > bpf_cpumask_test_cpu().
> > > > >
> > > > > If we were to enforce that top level types match, then a user that's
> > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > would need to be updated to expect the alias, and currently it only
> > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > (i.e. cpumask_t).
> > > > >
> > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > try and be conservative and match existing behavior / expectations, this
> > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > consistency of the semantics for kfuncs.
> > > > >
> > > >
> > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > argument, and simply using triple underscore variant of the type for the special
> > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > program, while allowing verifier to see them as two different types in kfunc
> > > > parameters. Then just relax type following for the particular argument so that
> > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > it just visits first member after failing match on top level type). off = 0
> > > > check is still necessary.
> > >
> > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > can do for now, and in general seems pretty useful. Obviously preferable
> > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > so, I'll take this approach for v2.
> >
> > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > but with that we twisted the C standard to, what looks to be, a wrong direction.
> >
> > For definition:
> > struct nf_conn___init {
> >    struct nf_conn ct;
> > };
> > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > for both read and write, because in C that's valid and safe type cast.
> >
> 
> The intention of this nf_conn___init was to be invisible to the user.
> In selftests there is no trace of nf_conn___init. It is only for
> enforcing semantics by virtue of type safety in the verifier.
> 
> Allocated but not inserted nf_conn -> nf_conn___init
> Inserted/looked up nf_conn -> nf_conn
> 
> We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> The allocated nf_conn may not yet be fully initialized. It is only
> after bpf_ct_insert_entry takes the nf_conn___init * and returns
> inserted nf_conn * should it be allowed.

Yes. I know and agree with all of the above.

> But for the user in BPF C it will be the same nf_conn. The verifier
> can enforce different semantics on the underlying type's usage in
> kfuncs etc, while the user performs normal direct access to the
> nf_conn.
> 
> It will be the same case here, except you also introduce the case of
> kfuncs that are 'polymorphic' and can take both. Relaxing
> 'strict_type_match' for that arg and placing the type of member you
> wish to convert the pointer to gives you such polymorphism. But it's
> not correct to do for nf_conn___init to nf_conn, at least not by
> default.

Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
I meant to say:

 For definition:
 struct nf_conn_init {
    struct nf_conn ct;
 };
 if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
 for both read and write, because in C that's valid and safe type cast.

Meainng that C rules apply.
Our triple underscore is special, because it's the "same type".
In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
I think you're arguing that just "___" part is enough to enforce strict match.
Matching foo___flavor with foo should not be allowed.
While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
is safe.
If so, I'm fine with such approach.
David Vernet Jan. 20, 2023, 2:56 p.m. UTC | #6
On Thu, Jan 19, 2023 at 10:14:41PM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > > currently enforces that the top-level type must match when calling
> > > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > > For example, if you have the following type:
> > > > > >
> > > > > > struct  nf_conn___init {
> > > > > >   struct nf_conn ct;
> > > > > > };
> > > > > >
> > > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > > struct nf_conn.
> > > > >
> > > > > Just running bpf_nf selftest would have shown this is false.
> > > >
> > > > And I feel silly, because I did run them, and could have sworn they
> > > > passed...looking now at the change_status_after_alloc testcase I see
> > > > you're of course correct. Very poor example, thank you for pointing it
> > > > out.
> > > >
> > > > >
> > > > > > Being able to do this will be useful for certain types
> > > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > > (in which case it will just be a straight cpumask_t, as in
> > > > > >  task->cpus_ptr).
> > > > > >
> > > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > > bpf_cpumask_test_cpu().
> > > > > >
> > > > > > If we were to enforce that top level types match, then a user that's
> > > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > > would need to be updated to expect the alias, and currently it only
> > > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > > (i.e. cpumask_t).
> > > > > >
> > > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > > try and be conservative and match existing behavior / expectations, this
> > > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > > consistency of the semantics for kfuncs.
> > > > > >
> > > > >
> > > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > > argument, and simply using triple underscore variant of the type for the special
> > > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > > program, while allowing verifier to see them as two different types in kfunc
> > > > > parameters. Then just relax type following for the particular argument so that
> > > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > > it just visits first member after failing match on top level type). off = 0
> > > > > check is still necessary.
> > > >
> > > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > > can do for now, and in general seems pretty useful. Obviously preferable
> > > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > > so, I'll take this approach for v2.
> > >
> > > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > > but with that we twisted the C standard to, what looks to be, a wrong direction.
> > >
> > > For definition:
> > > struct nf_conn___init {
> > >    struct nf_conn ct;
> > > };
> > > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > > for both read and write, because in C that's valid and safe type cast.
> > >
> > 
> > The intention of this nf_conn___init was to be invisible to the user.
> > In selftests there is no trace of nf_conn___init. It is only for
> > enforcing semantics by virtue of type safety in the verifier.
> > 
> > Allocated but not inserted nf_conn -> nf_conn___init
> > Inserted/looked up nf_conn -> nf_conn
> > 
> > We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> > The allocated nf_conn may not yet be fully initialized. It is only
> > after bpf_ct_insert_entry takes the nf_conn___init * and returns
> > inserted nf_conn * should it be allowed.
> 
> Yes. I know and agree with all of the above.
> 
> > But for the user in BPF C it will be the same nf_conn. The verifier
> > can enforce different semantics on the underlying type's usage in
> > kfuncs etc, while the user performs normal direct access to the
> > nf_conn.
> > 
> > It will be the same case here, except you also introduce the case of
> > kfuncs that are 'polymorphic' and can take both. Relaxing
> > 'strict_type_match' for that arg and placing the type of member you
> > wish to convert the pointer to gives you such polymorphism. But it's
> > not correct to do for nf_conn___init to nf_conn, at least not by
> > default.
> 
> Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> I meant to say:
> 
>  For definition:
>  struct nf_conn_init {
>     struct nf_conn ct;
>  };
>  if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
>  for both read and write, because in C that's valid and safe type cast.
> 
> Meainng that C rules apply.
> Our triple underscore is special, because it's the "same type".
> In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> I think you're arguing that just "___" part is enough to enforce strict match.
> Matching foo___flavor with foo should not be allowed.
> While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> is safe.
> If so, I'm fine with such approach.

Alright, I'll spin v2 to treat any type with name___.* as a disallowed
alias, and update the documentation to mention it. I was originally
going to push back and say that we should just use a single alias like
__nocast to keep things simple, but it doesn't feel generalizable
enough.
David Vernet Jan. 20, 2023, 3:26 p.m. UTC | #7
On Fri, Jan 20, 2023 at 08:56:55AM -0600, David Vernet wrote:
> On Thu, Jan 19, 2023 at 10:14:41PM -0800, Alexei Starovoitov wrote:
> > On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > > > currently enforces that the top-level type must match when calling
> > > > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > > > For example, if you have the following type:
> > > > > > >
> > > > > > > struct  nf_conn___init {
> > > > > > >   struct nf_conn ct;
> > > > > > > };
> > > > > > >
> > > > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > > > struct nf_conn.
> > > > > >
> > > > > > Just running bpf_nf selftest would have shown this is false.
> > > > >
> > > > > And I feel silly, because I did run them, and could have sworn they
> > > > > passed...looking now at the change_status_after_alloc testcase I see
> > > > > you're of course correct. Very poor example, thank you for pointing it
> > > > > out.
> > > > >
> > > > > >
> > > > > > > Being able to do this will be useful for certain types
> > > > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > > > (in which case it will just be a straight cpumask_t, as in
> > > > > > >  task->cpus_ptr).
> > > > > > >
> > > > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > > > bpf_cpumask_test_cpu().
> > > > > > >
> > > > > > > If we were to enforce that top level types match, then a user that's
> > > > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > > > would need to be updated to expect the alias, and currently it only
> > > > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > > > (i.e. cpumask_t).
> > > > > > >
> > > > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > > > try and be conservative and match existing behavior / expectations, this
> > > > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > > > consistency of the semantics for kfuncs.
> > > > > > >
> > > > > >
> > > > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > > > argument, and simply using triple underscore variant of the type for the special
> > > > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > > > program, while allowing verifier to see them as two different types in kfunc
> > > > > > parameters. Then just relax type following for the particular argument so that
> > > > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > > > it just visits first member after failing match on top level type). off = 0
> > > > > > check is still necessary.
> > > > >
> > > > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > > > can do for now, and in general seems pretty useful. Obviously preferable
> > > > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > > > so, I'll take this approach for v2.
> > > >
> > > > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > > > but with that we twisted the C standard to, what looks to be, a wrong direction.
> > > >
> > > > For definition:
> > > > struct nf_conn___init {
> > > >    struct nf_conn ct;
> > > > };
> > > > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > > > for both read and write, because in C that's valid and safe type cast.
> > > >
> > > 
> > > The intention of this nf_conn___init was to be invisible to the user.
> > > In selftests there is no trace of nf_conn___init. It is only for
> > > enforcing semantics by virtue of type safety in the verifier.
> > > 
> > > Allocated but not inserted nf_conn -> nf_conn___init
> > > Inserted/looked up nf_conn -> nf_conn
> > > 
> > > We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> > > The allocated nf_conn may not yet be fully initialized. It is only
> > > after bpf_ct_insert_entry takes the nf_conn___init * and returns
> > > inserted nf_conn * should it be allowed.
> > 
> > Yes. I know and agree with all of the above.
> > 
> > > But for the user in BPF C it will be the same nf_conn. The verifier
> > > can enforce different semantics on the underlying type's usage in
> > > kfuncs etc, while the user performs normal direct access to the
> > > nf_conn.
> > > 
> > > It will be the same case here, except you also introduce the case of
> > > kfuncs that are 'polymorphic' and can take both. Relaxing
> > > 'strict_type_match' for that arg and placing the type of member you
> > > wish to convert the pointer to gives you such polymorphism. But it's
> > > not correct to do for nf_conn___init to nf_conn, at least not by
> > > default.
> > 
> > Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> > I meant to say:
> > 
> >  For definition:
> >  struct nf_conn_init {
> >     struct nf_conn ct;
> >  };
> >  if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
> >  for both read and write, because in C that's valid and safe type cast.
> > 
> > Meainng that C rules apply.
> > Our triple underscore is special, because it's the "same type".
> > In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> > I think you're arguing that just "___" part is enough to enforce strict match.
> > Matching foo___flavor with foo should not be allowed.
> > While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> > is safe.
> > If so, I'm fine with such approach.
> 
> Alright, I'll spin v2 to treat any type with name___.* as a disallowed
> alias, and update the documentation to mention it. I was originally
> going to push back and say that we should just use a single alias like
> __nocast to keep things simple, but it doesn't feel generalizable
> enough.

On second thought, unless you guys feel strongly, I'll just check
___init. The resulting code is going to be a lot of tricky string
manipulation / math otherwise. Not _terrible_, but I'd prefer to avoid
adding it until we have a concrete use-case. And I expect this could be
implemented much simpler using something like tags, once gcc has support
for it.
Alexei Starovoitov Jan. 20, 2023, 4:17 p.m. UTC | #8
On Fri, Jan 20, 2023 at 7:26 AM David Vernet <void@manifault.com> wrote:
> > >
> > > Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> > > I meant to say:
> > >
> > >  For definition:
> > >  struct nf_conn_init {
> > >     struct nf_conn ct;
> > >  };
> > >  if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
> > >  for both read and write, because in C that's valid and safe type cast.
> > >
> > > Meainng that C rules apply.
> > > Our triple underscore is special, because it's the "same type".
> > > In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> > > I think you're arguing that just "___" part is enough to enforce strict match.
> > > Matching foo___flavor with foo should not be allowed.
> > > While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> > > is safe.
> > > If so, I'm fine with such approach.
> >
> > Alright, I'll spin v2 to treat any type with name___.* as a disallowed
> > alias, and update the documentation to mention it. I was originally
> > going to push back and say that we should just use a single alias like
> > __nocast to keep things simple, but it doesn't feel generalizable
> > enough.
>
> On second thought, unless you guys feel strongly, I'll just check
> ___init. The resulting code is going to be a lot of tricky string
> manipulation / math otherwise. Not _terrible_, but I'd prefer to avoid
> adding it until we have a concrete use-case. And I expect this could be
> implemented much simpler using something like tags, once gcc has support
> for it.

There is bpf_core_is_flavor_sep() that makes it easy to check,
but thinking more about it we probably should stick to strict "___init"
suffix for now, since flavors can appear in progs too and they
are equivalent to corresponding types in the kernel.
The nf_conn___init is kernel only type.
The verifier sees that bpf_xdp_ct_alloc kfunc returns it,
but when we export this kfunc to bpf prog it returns nf_conn.
We probably should pick some other suffix without "___" to distinguish
from flavors. bpf prog should not use nf_conn___init.
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7f973847b58e..9fa101420046 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8563,9 +8563,34 @@  static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env,
 		reg_ref_id = *reg2btf_ids[base_type(reg->type)];
 	}
 
-	if (is_kfunc_trusted_args(meta) || (is_kfunc_release(meta) && reg->ref_obj_id))
+	/* Enforce strict type matching for calls to kfuncs that are acquiring
+	 * or releasing a reference. We do _not_ enforce strict matching for
+	 * plain KF_TRUSTED_ARGS kfuncs, as we want to enable BPF programs to
+	 * pass types that are bitwise equivalent without forcing them to
+	 * explicitly cast with something like bpf_cast_to_kern_ctx().
+	 *
+	 * For example, say we had a type like the following:
+	 *
+	 * struct bpf_cpumask {
+	 *	cpumask_t cpumask;
+	 *	refcount_t usage;
+	 * };
+	 *
+	 * Note that as specified in <linux/cpumask.h>, cpumask_t is typedef'ed
+	 * to a struct cpumask, so it would be safe to pass a struct
+	 * bpf_cpumask * to a kfunc expecting a struct cpumask *.
+	 *
+	 * The philosophy here is similar to how we allow scalars of different
+	 * types to be passed to kfuncs as long as the size is the same. The
+	 * only difference here is that we're simply allowing
+	 * btf_struct_ids_match() to walk the struct at the 0th offset, and
+	 * resolve types.
+	 */
+	if (is_kfunc_acquire(meta) || (is_kfunc_release(meta) && reg->ref_obj_id))
 		strict_type_match = true;
 
+	WARN_ON_ONCE(is_kfunc_trusted_args(meta) && reg->off);
+
 	reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, &reg_ref_id);
 	reg_ref_tname = btf_name_by_offset(reg_btf, reg_ref_t->name_off);
 	if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match)) {