diff mbox series

[bpf-next,v1,5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func

Message ID 20221115000130.1967465-6-memxor@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series Dynptr refactorings | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers warning 8 maintainers not CCed: sdf@google.com kpsingh@kernel.org haoluo@google.com yhs@fb.com jolsa@kernel.org martin.lau@linux.dev song@kernel.org john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 5 this patch: 5
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Kumar Kartikeya Dwivedi Nov. 15, 2022, 12:01 a.m. UTC
After previous commit, we are minimizing helper specific assumptions
from check_func_arg_reg_off, making it generic, and offloading checks
for a specific argument type to their respective functions called after
check_func_arg_reg_off has been called.

This allows relying on a consistent set of guarantees after that call
and then relying on them in code that deals with registers for each
argument type later. This is in line with how process_spin_lock,
process_timer_func, process_kptr_func check reg->var_off to be constant.
The same reasoning is used here to move the alignment check into
process_dynptr_func. Note that it also needs to check for constant
var_off, and accumulate the constant var_off when computing the spi in
get_spi, but that fix will come in later changes.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Comments

Joanne Koong Nov. 15, 2022, 6:29 p.m. UTC | #1
On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> After previous commit, we are minimizing helper specific assumptions
> from check_func_arg_reg_off, making it generic, and offloading checks
> for a specific argument type to their respective functions called after
> check_func_arg_reg_off has been called.
>
> This allows relying on a consistent set of guarantees after that call
> and then relying on them in code that deals with registers for each
> argument type later. This is in line with how process_spin_lock,
> process_timer_func, process_kptr_func check reg->var_off to be constant.
> The same reasoning is used here to move the alignment check into
> process_dynptr_func. Note that it also needs to check for constant
> var_off, and accumulate the constant var_off when computing the spi in
> get_spi, but that fix will come in later changes.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  kernel/bpf/verifier.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 34e67d04579b..fd292f762d53 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                 return -EFAULT;
>         }
>
> +       /* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> +        * check_func_arg_reg_off's logic. We only need to check offset
> +        * alignment for PTR_TO_STACK.
> +        */
> +       if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> +               return -EINVAL;
> +       }
>         /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>          * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>          *
> @@ -6125,11 +6133,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         switch (type) {
>         /* Pointer types where both fixed and variable offset is explicitly allowed: */
>         case PTR_TO_STACK:
> -               if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -                       verbose(env, "cannot pass in dynptr at an offset\n");
> -                       return -EINVAL;
> -               }
> -               fallthrough;
>         case PTR_TO_PACKET:
>         case PTR_TO_PACKET_META:
>         case PTR_TO_MAP_KEY:
> --
> 2.38.1
>
David Vernet Nov. 17, 2022, 11:49 p.m. UTC | #2
On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> After previous commit, we are minimizing helper specific assumptions
> from check_func_arg_reg_off, making it generic, and offloading checks
> for a specific argument type to their respective functions called after
> check_func_arg_reg_off has been called.

What's the point of check_func_arg_reg_off() if helpers have to check
offsets after it's been called? Also, in [0], there's now logic in
check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
there's still a precedent for looking at arg types there. IMO it's
actually less confusing to just push as much offset checking as possible
into one place.

[0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/

> This allows relying on a consistent set of guarantees after that call
> and then relying on them in code that deals with registers for each
> argument type later. This is in line with how process_spin_lock,
> process_timer_func, process_kptr_func check reg->var_off to be constant.
> The same reasoning is used here to move the alignment check into
> process_dynptr_func. Note that it also needs to check for constant
> var_off, and accumulate the constant var_off when computing the spi in
> get_spi, but that fix will come in later changes.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 34e67d04579b..fd292f762d53 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  		return -EFAULT;
>  	}
>  
> +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> +	 * check_func_arg_reg_off's logic. We only need to check offset
> +	 * alignment for PTR_TO_STACK.
> +	 */
> +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> +		return -EINVAL;
> +	}

As alluded to above, I personally think it's more confusing to have this
check in process_dynptr_func(). On the one hand you have
check_func_arg_reg_off() which verifies that a register has the correct
offset, but then here we have to check for the register offset for
PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
as much of the offset-checking complexity into one place as possible?

>  	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>  	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>  	 *
> @@ -6125,11 +6133,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  	switch (type) {
>  	/* Pointer types where both fixed and variable offset is explicitly allowed: */
>  	case PTR_TO_STACK:
> -		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -			verbose(env, "cannot pass in dynptr at an offset\n");
> -			return -EINVAL;
> -		}
> -		fallthrough;
>  	case PTR_TO_PACKET:
>  	case PTR_TO_PACKET_META:
>  	case PTR_TO_MAP_KEY:
> -- 
> 2.38.1
>
Kumar Kartikeya Dwivedi Nov. 20, 2022, 7:10 p.m. UTC | #3
On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > After previous commit, we are minimizing helper specific assumptions
> > from check_func_arg_reg_off, making it generic, and offloading checks
> > for a specific argument type to their respective functions called after
> > check_func_arg_reg_off has been called.
>
> What's the point of check_func_arg_reg_off() if helpers have to check
> offsets after it's been called? Also, in [0], there's now logic in
> check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> there's still a precedent for looking at arg types there. IMO it's
> actually less confusing to just push as much offset checking as possible
> into one place.
>

I think you need to define 'as much offset checking'.

Consider process_kptr_func, it requires var_off to be constant. Same for
bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
of that into check_func_arg_reg_off?

Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
exception go in this function as well?

Where do you draw the line here in terms of what that function does?

IMO, there are a certain set of properties that check_func_arg_reg_off provides,
you could say that if each register type was a class, then the checks there
would be what you would do while constructing them on calling:

PtrToStack(off, var_off /* can be const or variable */)
PtrToMapValue(off, var_off /* can be const or variable */)
PtrToBtfId(off /* must be >= 0 */) /* no var_off */

How they get used by each helper and what further checks each helper needs to do
on them based on the arg_type should be done at a later stage when the specific
argument type is processed.

Agreed that, it's not perfect, with the odd case for PTR_TO_STACK having non-0
reg->off for OBJ_RELEASE. But IMO once you realise it makes no sense to release
PTR_TO_STACK and that PTR_TO_STACK actually points to the real pointer being
released, it needs to be handled specially.

> [0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/
>
> > This allows relying on a consistent set of guarantees after that call
> > and then relying on them in code that deals with registers for each
> > argument type later. This is in line with how process_spin_lock,
> > process_timer_func, process_kptr_func check reg->var_off to be constant.
> > The same reasoning is used here to move the alignment check into
> > process_dynptr_func. Note that it also needs to check for constant
> > var_off, and accumulate the constant var_off when computing the spi in
> > get_spi, but that fix will come in later changes.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 34e67d04579b..fd292f762d53 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  		return -EFAULT;
> >  	}
> >
> > +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> > +	 * check_func_arg_reg_off's logic. We only need to check offset
> > +	 * alignment for PTR_TO_STACK.
> > +	 */
> > +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> > +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > +		return -EINVAL;
> > +	}
>
> As alluded to above, I personally think it's more confusing to have this
> check in process_dynptr_func(). On the one hand you have
> check_func_arg_reg_off() which verifies that a register has the correct
> offset, but then here we have to check for the register offset for
> PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
> as much of the offset-checking complexity into one place as possible?
>

I'm trying to make a split between 'offset correct for the reg->type in
general', and 'offset correct for the reg->type when when passed as an argument
for arg_type'. I think the latter is specific and different for each case and
thus belongs inside the case ARG_TYPE_* block of each of those.

Anyhow, all of this is not to reject your point, but to say that if we're
keeping that check in check_func_arg_reg_off for dynptr, let's also examine why
we should/shouldn't move checks for other arg_types inside it as well, and
whether the end result is going to be better than this.

In that case, atleast to me, it doesn't make sense to check reg->off %
BPF_REG_SIZE for ARG_PTR_TO_DYNPTR while leaving out other arg types.
Alexei Starovoitov Nov. 20, 2022, 7:40 p.m. UTC | #4
On Sun, Nov 20, 2022 at 11:10 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> you could say that if each register type was a class, then the checks there
> would be what you would do while constructing them on calling:
>
> PtrToStack(off, var_off /* can be const or variable */)
> PtrToMapValue(off, var_off /* can be const or variable */)
> PtrToBtfId(off /* must be >= 0 */) /* no var_off */

Just to complicate things a bit... ;)
There was a request to allow var_off in ptr_to_btf_id.
Sometimes there are fixed size arrays in structs and
programs need to iterate those arrays in a loop.
Another case is to access flex_array at the end of a struct.
Like cgroup->ancestors[].
Both are currently impossible, but the verifier has to
get this capability.
Kumar Kartikeya Dwivedi Nov. 20, 2022, 9:02 p.m. UTC | #5
On Mon, Nov 21, 2022 at 01:10:21AM IST, Alexei Starovoitov wrote:
> On Sun, Nov 20, 2022 at 11:10 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> > you could say that if each register type was a class, then the checks there
> > would be what you would do while constructing them on calling:
> >
> > PtrToStack(off, var_off /* can be const or variable */)
> > PtrToMapValue(off, var_off /* can be const or variable */)
> > PtrToBtfId(off /* must be >= 0 */) /* no var_off */
>
> Just to complicate things a bit... ;)
> There was a request to allow var_off in ptr_to_btf_id.
> Sometimes there are fixed size arrays in structs and
> programs need to iterate those arrays in a loop.

Honestly, I don't see why this case needs var_off.
Each iteration will bump the reg->off while indexing into the flex/fixed size
array. Even ptr += scalar where scalar is known will accumulate into reg->off.
But maybe I missed some specific case.

> Another case is to access flex_array at the end of a struct.
> Like cgroup->ancestors[].

This might, but still, to mark the dst_reg as pointer you need to know whether
the possibly variable offset going beyond type->size is 8-byte aligned, right?

Otherwise the best that could be done conservatively is mark_reg_unknown.

> Both are currently impossible, but the verifier has to
> get this capability.

I guess it will be a very involved change. All over the verifier there are
implicit assumptions in places (after certain checks) that PTR_TO_BTF_ID never
has var_off.
David Vernet Nov. 21, 2022, 7:27 a.m. UTC | #6
On Mon, Nov 21, 2022 at 12:40:13AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> > On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > After previous commit, we are minimizing helper specific assumptions
> > > from check_func_arg_reg_off, making it generic, and offloading checks
> > > for a specific argument type to their respective functions called after
> > > check_func_arg_reg_off has been called.
> >
> > What's the point of check_func_arg_reg_off() if helpers have to check
> > offsets after it's been called? Also, in [0], there's now logic in
> > check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> > there's still a precedent for looking at arg types there. IMO it's
> > actually less confusing to just push as much offset checking as possible
> > into one place.
> >
> 
> I think you need to define 'as much offset checking'.

We certainly don't want to make check_func_arg_reg_off() a monster
function, but I think we can do better in terms of making the verifier
consistent and easier to reason about by pushing more logic into it.

My view (subject to change upon learning new context I may be missing)
is that the job of check_func_arg_reg_off() should be to map all
(reg_type x arg_type) combinations to checking the correct offset,
likely by calling __check_ptr_off_reg() with the correct arguments. The
signature / implementation of __check_ptr_off_reg() may need to change
for that to happen, and we may need to e.g. also leverage
__check_mem_access() to accommodate the mem register types.

Yes, doing this may cause check_func_arg_reg_off() to have a potentially
very large switch statement (though there are other ways to address
that), but isn't that preferable to having to read through hundreds or
thousands of lines of verifier code to convince yourself that an offset
was correctly determined for every possible (register x arg_type)?
Having all of that contained in one, well-defined spot seems much
simpler. Please let me know if I've grossly misunderstood something, or
am missing important / relevant context.

> Consider process_kptr_func, it requires var_off to be constant. Same for

IMO that check should certainly go in check_func_arg_reg_off().
__check_ptr_off_reg() already checks for tnum_is_const(reg->var_off) in
__check_ptr_off_reg(), and check_func_arg_reg_off() has all the
information it needs to encapsulate the logic for that check.

> bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
> PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
> of that into check_func_arg_reg_off?
> 
> Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
> exception go in this function as well?
> 
> Where do you draw the line here in terms of what that function does?

Personally I think "drawing the line" is the wrong way to think about
it. We need to decide what role the function plays, and generalize it in
a way that's consistent and clear. IMO its role at a high level should
be, "The verify arg / register offsets step in the verifier". If you
break that encapsulation, it becomes much more difficult to build a
consistent mental model of what the verifier is doing. Note that this
applies to other verification steps as well, such as e.g. verifying
types, verifying proper refcounts, etc. Perhaps I'm grossly naive for
thinking this is possible? Please let me know if you think that's the
case.

Anyways, it might not be possible to aggregate all logic for checking
reg->off into the function in the codebase as it exists today, but it
seems like a desirable end-state, and it feels like a step backwards to
start selectively moving reg->off checking out of
check_func_arg_reg_off() and into arg / helper specific functions.

> IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> you could say that if each register type was a class, then the checks there
> would be what you would do while constructing them on calling:
> 
> PtrToStack(off, var_off /* can be const or variable */)
> PtrToMapValue(off, var_off /* can be const or variable */)
> PtrToBtfId(off /* must be >= 0 */) /* no var_off */

Hmmm, but these are all just reg_type, right? Why are we checking
OBJ_RELEASE in check_func_arg_reg_off() if that's the case?

> How they get used by each helper and what further checks each helper needs to do
> on them based on the arg_type should be done at a later stage when the specific
> argument type is processed.

I definitely agree that there may be helper-specific verification that
needs to be done. We're talking about arg_type and reg_type, though.
Those aren't specific to an _individual_ helper (though yes, of course
arg_types are specific to whatever _sets_ of helpers take them, such as
e.g. dynptrs).

If we go with the approach of having individual arg types or sets of
helpers verify offsets, I think that still needs to be generalized so
that it's happening in a single place. This would involve something
like:

1. Having check_func_arg_reg_off() as it exists today be renamed to
check_func_reg_off(), and be solely responsible for checking reg_type.

2. Update check_func_arg_reg_off() to contain all the logic which does
actual arg type checking, possibly calling out to one of many possible
helper functions depending on the arg type.

My main issue is really just the fact that all of this logic is
scattered throughout the verifier.

> Agreed that, it's not perfect, with the odd case for PTR_TO_STACK having non-0
> reg->off for OBJ_RELEASE. But IMO once you realise it makes no sense to release
> PTR_TO_STACK and that PTR_TO_STACK actually points to the real pointer being
> released, it needs to be handled specially.
> 
> > [0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/
> >
> > > This allows relying on a consistent set of guarantees after that call
> > > and then relying on them in code that deals with registers for each
> > > argument type later. This is in line with how process_spin_lock,
> > > process_timer_func, process_kptr_func check reg->var_off to be constant.
> > > The same reasoning is used here to move the alignment check into
> > > process_dynptr_func. Note that it also needs to check for constant
> > > var_off, and accumulate the constant var_off when computing the spi in
> > > get_spi, but that fix will come in later changes.
> > >
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 13 ++++++++-----
> > >  1 file changed, 8 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 34e67d04579b..fd292f762d53 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > >  		return -EFAULT;
> > >  	}
> > >
> > > +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> > > +	 * check_func_arg_reg_off's logic. We only need to check offset
> > > +	 * alignment for PTR_TO_STACK.
> > > +	 */
> > > +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> > > +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > +		return -EINVAL;
> > > +	}
> >
> > As alluded to above, I personally think it's more confusing to have this
> > check in process_dynptr_func(). On the one hand you have
> > check_func_arg_reg_off() which verifies that a register has the correct
> > offset, but then here we have to check for the register offset for
> > PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
> > as much of the offset-checking complexity into one place as possible?
> >
> 
> I'm trying to make a split between 'offset correct for the reg->type in
> general', and 'offset correct for the reg->type when when passed as an argument
> for arg_type'. I think the latter is specific and different for each case and
> thus belongs inside the case ARG_TYPE_* block of each of those.

If we're going to do that, IMO we should just remove the arg_type
parameter from the function altogether. Having specific arg_type logic
in the function feels artificial.

> Anyhow, all of this is not to reject your point, but to say that if we're
> keeping that check in check_func_arg_reg_off for dynptr, let's also examine why
> we should/shouldn't move checks for other arg_types inside it as well, and
> whether the end result is going to be better than this.

Agreed that this is the crux of the issue. I think I've said my piece so
I won't reply directly to this point here.

> In that case, atleast to me, it doesn't make sense to check reg->off %
> BPF_REG_SIZE for ARG_PTR_TO_DYNPTR while leaving out other arg types.
Kumar Kartikeya Dwivedi Dec. 7, 2022, 8:41 p.m. UTC | #7
On Mon, Nov 21, 2022 at 12:57:26PM IST, David Vernet wrote:
> On Mon, Nov 21, 2022 at 12:40:13AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> > > On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > After previous commit, we are minimizing helper specific assumptions
> > > > from check_func_arg_reg_off, making it generic, and offloading checks
> > > > for a specific argument type to their respective functions called after
> > > > check_func_arg_reg_off has been called.
> > >
> > > What's the point of check_func_arg_reg_off() if helpers have to check
> > > offsets after it's been called? Also, in [0], there's now logic in
> > > check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> > > there's still a precedent for looking at arg types there. IMO it's
> > > actually less confusing to just push as much offset checking as possible
> > > into one place.
> > >
> >
> > I think you need to define 'as much offset checking'.
>
> We certainly don't want to make check_func_arg_reg_off() a monster
> function, but I think we can do better in terms of making the verifier
> consistent and easier to reason about by pushing more logic into it.
>
> My view (subject to change upon learning new context I may be missing)
> is that the job of check_func_arg_reg_off() should be to map all
> (reg_type x arg_type) combinations to checking the correct offset,
> likely by calling __check_ptr_off_reg() with the correct arguments. The
> signature / implementation of __check_ptr_off_reg() may need to change
> for that to happen, and we may need to e.g. also leverage
> __check_mem_access() to accommodate the mem register types.
>
> Yes, doing this may cause check_func_arg_reg_off() to have a potentially
> very large switch statement (though there are other ways to address
> that), but isn't that preferable to having to read through hundreds or
> thousands of lines of verifier code to convince yourself that an offset
> was correctly determined for every possible (register x arg_type)?
> Having all of that contained in one, well-defined spot seems much
> simpler. Please let me know if I've grossly misunderstood something, or
> am missing important / relevant context.
>
> > Consider process_kptr_func, it requires var_off to be constant. Same for
>
> IMO that check should certainly go in check_func_arg_reg_off().
> __check_ptr_off_reg() already checks for tnum_is_const(reg->var_off) in
> __check_ptr_off_reg(), and check_func_arg_reg_off() has all the
> information it needs to encapsulate the logic for that check.
>

I think this will be a bigger change that should be probably go in on its own,
since you've expressed in the reply to patch 4 that you intend to discuss this
with Daniel, I'm respinning without this for now.

> > bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
> > PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
> > of that into check_func_arg_reg_off?
> >
> > Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
> > exception go in this function as well?
> >
> > Where do you draw the line here in terms of what that function does?
>
> Personally I think "drawing the line" is the wrong way to think about
> it. We need to decide what role the function plays, and generalize it in
> a way that's consistent and clear. IMO its role at a high level should
> be, "The verify arg / register offsets step in the verifier". If you
> break that encapsulation, it becomes much more difficult to build a
> consistent mental model of what the verifier is doing. Note that this
> applies to other verification steps as well, such as e.g. verifying
> types, verifying proper refcounts, etc. Perhaps I'm grossly naive for
> thinking this is possible? Please let me know if you think that's the
> case.
>
> Anyways, it might not be possible to aggregate all logic for checking
> reg->off into the function in the codebase as it exists today, but it
> seems like a desirable end-state, and it feels like a step backwards to
> start selectively moving reg->off checking out of
> check_func_arg_reg_off() and into arg / helper specific functions.
>
> > IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> > you could say that if each register type was a class, then the checks there
> > would be what you would do while constructing them on calling:
> >
> > PtrToStack(off, var_off /* can be const or variable */)
> > PtrToMapValue(off, var_off /* can be const or variable */)
> > PtrToBtfId(off /* must be >= 0 */) /* no var_off */
>
> Hmmm, but these are all just reg_type, right? Why are we checking
> OBJ_RELEASE in check_func_arg_reg_off() if that's the case?
>

Probably for the same reason we handle meta->release_regno in a unified manner
for all helpers. Earlier OBJ_RELEASE was not an arg_type, but a list of func_ids
matched in a function, so in that sense it had nothing to do with the arg type
until then.

> > How they get used by each helper and what further checks each helper needs to do
> > on them based on the arg_type should be done at a later stage when the specific
> > argument type is processed.
>
> I definitely agree that there may be helper-specific verification that
> needs to be done. We're talking about arg_type and reg_type, though.
> Those aren't specific to an _individual_ helper (though yes, of course
> arg_types are specific to whatever _sets_ of helpers take them, such as
> e.g. dynptrs).
>
> If we go with the approach of having individual arg types or sets of
> helpers verify offsets, I think that still needs to be generalized so
> that it's happening in a single place. This would involve something
> like:
>
> 1. Having check_func_arg_reg_off() as it exists today be renamed to
> check_func_reg_off(), and be solely responsible for checking reg_type.
>
> 2. Update check_func_arg_reg_off() to contain all the logic which does
> actual arg type checking, possibly calling out to one of many possible
> helper functions depending on the arg type.
>
> My main issue is really just the fact that all of this logic is
> scattered throughout the verifier.
>

I think it's a difference of opinion. I guess it's fine to move all offset
related checks to this function, for every arg_type and every single case as
well, but then it should be part of a bigger change that it does for all of the
existing cases, not just limited to ARG_PTR_TO_{DYNPTR, KPTR, SPIN_LOCK, TIMER}
etc. It will be a bigger refactoring that rearranges and puts all those checks
in a single place, which I think should be done later as its own set.

Though you might end up duplicating some checks because the verification path
from insns won't call check_func_arg_reg_off, but those shared functions are
called from both insns and helpers side, so the checks would need to remain
there unless this refactored function is called for each insn (in which case it
would need to check based on insn type as well). So I might be wrong but there
won't be a lot of code reduction without more involved refactorings.
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 34e67d04579b..fd292f762d53 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5774,6 +5774,14 @@  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 		return -EFAULT;
 	}
 
+	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
+	 * check_func_arg_reg_off's logic. We only need to check offset
+	 * alignment for PTR_TO_STACK.
+	 */
+	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
+		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
+		return -EINVAL;
+	}
 	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
 	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
 	 *
@@ -6125,11 +6133,6 @@  int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	switch (type) {
 	/* Pointer types where both fixed and variable offset is explicitly allowed: */
 	case PTR_TO_STACK:
-		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
-			verbose(env, "cannot pass in dynptr at an offset\n");
-			return -EINVAL;
-		}
-		fallthrough;
 	case PTR_TO_PACKET:
 	case PTR_TO_PACKET_META:
 	case PTR_TO_MAP_KEY: