[bpf-next,v1,1/2] bpf: verify scalar ids mapping in regsafe() using check_ids()

Message ID 20230526184126.3104040-2-eddyz87@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series bpf: verify scalar ids mapping in regsafe() | expand


Commit Message

Eduard Zingerman May 26, 2023, 6:41 p.m. UTC
Make sure that the following unsafe example is rejected by verifier:

1: r9 = ... some pointer with range X ...
2: r6 = ... unbound scalar ID=a ...
3: r7 = ... unbound scalar ID=b ...
4: if (r6 > r7) goto +1
5: r6 = r7
6: if (r6 > X) goto ...
--- checkpoint ---
7: r9 += r7
8: *(u64 *)r9 = Y

This example is unsafe because not all execution paths verify r7 range.
Because of the jump at (4) the verifier would arrive at (6) in two states:
I.  r6{.id=b}, r7{.id=b} via path 1-6;
II. r6{.id=a}, r7{.id=b} via path 1-4, 6.

Currently regsafe() does not call check_ids() for scalar registers,
thus from the POV of regsafe() states (I) and (II) are identical. If
path 1-6 is taken by the verifier first and a checkpoint is created at
(6), path [1-4, 6] would be considered safe.

This commit updates regsafe() to call check_ids() for scalar registers.
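The ID-matching logic that makes states (I) and (II) distinguishable can be sketched as follows. This is a minimal user-space model of check_ids() semantics, not the kernel implementation; the struct and constant names are illustrative, and treating a zero ID as matching only another zero is a simplification of this sketch:

```c
#include <assert.h>

/* Model: IDs from the old (checkpoint) state must map consistently
 * onto IDs of the current state. An ID of 0 means "no ID" and, in
 * this simplified sketch, matches only another 0.
 */
#define IDMAP_SIZE 8

struct id_pair {
	unsigned int old_id;
	unsigned int cur_id;
};

static int check_ids(unsigned int old_id, unsigned int cur_id,
		     struct id_pair *idmap)
{
	int i;

	if (!old_id && !cur_id)
		return 1;
	if (!old_id || !cur_id)
		return 0;

	for (i = 0; i < IDMAP_SIZE; i++) {
		if (!idmap[i].old_id) {
			/* first time this old_id is seen: record mapping */
			idmap[i].old_id = old_id;
			idmap[i].cur_id = cur_id;
			return 1;
		}
		if (idmap[i].old_id == old_id)
			/* old_id already mapped: cur_id must agree */
			return idmap[i].cur_id == cur_id;
	}
	return 0; /* out of idmap slots */
}
```

With the example above (b = 2, a = 1): comparing r6 first records b -> a; comparing r7 then finds b already mapped to a, while the current r7 carries b, so the mapping check fails and state (II) is not pruned against checkpoint (I).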

The change in check_alu_op() to avoid assigning a scalar id to constants
is a performance optimization. W/o it the regsafe() change becomes
costly for some programs, e.g. for
tools/testing/selftests/bpf/progs/pyperf600.c the difference is:

File             Program   States (A)  States (B)  States    (DIFF)
---------------  --------  ----------  ----------  ----------------
pyperf600.bpf.o  on_event       22200       37060  +14860 (+66.94%)

Where A -- this patch,
      B -- this patch but w/o check_alu_op() changes.
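The constant-skipping condition can be modeled in user space as below. The tnum struct here mirrors the kernel's tristate-number idea (a value is fully known when its unknown-bits mask is zero), but the helper needs_new_id() is a hypothetical name for illustration, not a kernel function:

```c
#include <assert.h>

/* Sketch of the "skip constants" decision: a tnum tracks partially
 * known values via (value, mask); set mask bits are unknown, so the
 * value is a constant exactly when mask == 0.
 */
struct tnum {
	unsigned long long value; /* value of the known bits */
	unsigned long long mask;  /* set bits are unknown */
};

static int tnum_is_const(struct tnum a)
{
	return a.mask == 0;
}

/* Decide whether "dst = src" should allocate a fresh ID: only
 * non-constant scalars without an ID need one, since
 * find_equal_scalars() has nothing useful to propagate for constants.
 */
static int needs_new_id(int is_scalar, unsigned int id, struct tnum var_off)
{
	return is_scalar && !id && !tnum_is_const(var_off);
}
```

Skipping constants matters for state pruning: every useless ID is one more entry that check_ids() must match between states, which is where the pyperf600 blow-up above comes from.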

Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

Comments

Yonghong Song May 27, 2023, 12:40 a.m. UTC | #1
On 5/26/23 11:41 AM, Eduard Zingerman wrote:
> Make sure that the following unsafe example is rejected by verifier:
> 
> 1: r9 = ... some pointer with range X ...
> 2: r6 = ... unbound scalar ID=a ...
> 3: r7 = ... unbound scalar ID=b ...
> 4: if (r6 > r7) goto +1
> 5: r6 = r7
> 6: if (r6 > X) goto ...
> --- checkpoint ---
> 7: r9 += r7
> 8: *(u64 *)r9 = Y
> 
> This example is unsafe because not all execution paths verify r7 range.
> Because of the jump at (4) the verifier would arrive at (6) in two states:
> I.  r6{.id=b}, r7{.id=b} via path 1-6;
> II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> 
> Currently regsafe() does not call check_ids() for scalar registers,
> thus from POV of regsafe() states (I) and (II) are identical. If the
> path 1-6 is taken by verifier first, and checkpoint is created at (6)
> the path [1-4, 6] would be considered safe.
> 
> This commit updates regsafe() to call check_ids() for scalar registers.
> 
> The change in check_alu_op() to avoid assigning scalar id to constants
> is performance optimization. W/o it the regsafe() change becomes
> costly for some programs, e.g. for
> tools/testing/selftests/bpf/progs/pyperf600.c the difference is:
> 
> File             Program   States (A)  States (B)  States    (DIFF)
> ---------------  --------  ----------  ----------  ----------------
> pyperf600.bpf.o  on_event       22200       37060  +14860 (+66.94%)
> 
> Where A -- this patch,
>        B -- this patch but w/o check_alu_op() changes.
> 
> Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>   kernel/bpf/verifier.c | 31 ++++++++++++++++++++++++++++++-
>   1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index af70dad655ab..624556eda430 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -12806,10 +12806,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
>   				/* case: R1 = R2
>   				 * copy register state to dest reg
>   				 */
> -				if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> +				if (src_reg->type == SCALAR_VALUE && !src_reg->id &&
> +				    !tnum_is_const(src_reg->var_off))
>   					/* Assign src and dst registers the same ID
>   					 * that will be used by find_equal_scalars()
>   					 * to propagate min/max range.
> +					 * Skip constants to avoid allocation of useless ID.
>   					 */

The above is for ALU64.

We also have ALU32 version:
    } else if (src_reg->type == SCALAR_VALUE) {
        bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;

        if (is_src_reg_u32 && !src_reg->id)
              src_reg->id = ++env->id_gen;
        copy_register_state(dst_reg, src_reg);
        ...

Do you think we should do the same thing if src_reg is a constant,
i.e. skip assigning src_reg->id?

If this is added, could you have a test case for 32-bit subregister
as well?

>   					src_reg->id = ++env->id_gen;
>   				copy_register_state(dst_reg, src_reg);
> @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>   
>   	switch (base_type(rold->type)) {
>   	case SCALAR_VALUE:
> +		/* Why check_ids() for precise registers?
> +		 *
> +		 * Consider the following BPF code:
> +		 *   1: r6 = ... unbound scalar, ID=a ...
> +		 *   2: r7 = ... unbound scalar, ID=b ...
> +		 *   3: if (r6 > r7) goto +1
> +		 *   4: r6 = r7
> +		 *   5: if (r6 > X) goto ...
> +		 *   6: ... memory operation using r7 ...
> +		 *
> +		 * First verification path is [1-6]:
> +		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> +		 *   r7 <= X, because r6 and r7 share same id.
> +		 *
> +		 * Next verification path would start from (5), because of the jump at (3).
> +		 * The only state difference between first and second visits of (5) is
> +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> +		 * Thus, use check_ids() to distinguish these states.
> +		 *
> +		 * The `rold->precise` check is a performance optimization. If `rold->id`
> +		 * was ever used to access memory / predict jump, the `rold` or any
> +		 * register used in `rold = r?` / `r? = rold` operations would be marked
> +					 * as precise, otherwise its ID is not really interesting.
> +		 */
> +		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))

Do we need the rold->id check in the above? check_ids() should handle
rold->id == 0 properly. Or is this just an optimization?

regs_exact() has check_ids() as well. Not sure whether it makes sense
to create a function regs_exact_scalar() just for scalars and include
the above code. Otherwise, it is strange that we call check_ids() in
different places.

> +			return false;
>   		if (regs_exact(rold, rcur, idmap))
>   			return true;
>   		if (env->explore_alu_limits)
Eduard Zingerman May 27, 2023, 12:21 p.m. UTC | #2
On Fri, 2023-05-26 at 17:40 -0700, Yonghong Song wrote:
> 
> On 5/26/23 11:41 AM, Eduard Zingerman wrote:
> > Make sure that the following unsafe example is rejected by verifier:
> > 
> > 1: r9 = ... some pointer with range X ...
> > 2: r6 = ... unbound scalar ID=a ...
> > 3: r7 = ... unbound scalar ID=b ...
> > 4: if (r6 > r7) goto +1
> > 5: r6 = r7
> > 6: if (r6 > X) goto ...
> > --- checkpoint ---
> > 7: r9 += r7
> > 8: *(u64 *)r9 = Y
> > 
> > This example is unsafe because not all execution paths verify r7 range.
> > Because of the jump at (4) the verifier would arrive at (6) in two states:
> > I.  r6{.id=b}, r7{.id=b} via path 1-6;
> > II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
> > 
> > Currently regsafe() does not call check_ids() for scalar registers,
> > thus from POV of regsafe() states (I) and (II) are identical. If the
> > path 1-6 is taken by verifier first, and checkpoint is created at (6)
> > the path [1-4, 6] would be considered safe.
> > 
> > This commit updates regsafe() to call check_ids() for scalar registers.
> > 
> > The change in check_alu_op() to avoid assigning scalar id to constants
> > is performance optimization. W/o it the regsafe() change becomes
> > costly for some programs, e.g. for
> > tools/testing/selftests/bpf/progs/pyperf600.c the difference is:
> > 
> > File             Program   States (A)  States (B)  States    (DIFF)
> > ---------------  --------  ----------  ----------  ----------------
> > pyperf600.bpf.o  on_event       22200       37060  +14860 (+66.94%)
> > 
> > Where A -- this patch,
> >        B -- this patch but w/o check_alu_op() changes.
> > 
> > Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> >   kernel/bpf/verifier.c | 31 ++++++++++++++++++++++++++++++-
> >   1 file changed, 30 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index af70dad655ab..624556eda430 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -12806,10 +12806,12 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >   				/* case: R1 = R2
> >   				 * copy register state to dest reg
> >   				 */
> > -				if (src_reg->type == SCALAR_VALUE && !src_reg->id)
> > +				if (src_reg->type == SCALAR_VALUE && !src_reg->id &&
> > +				    !tnum_is_const(src_reg->var_off))
> >   					/* Assign src and dst registers the same ID
> >   					 * that will be used by find_equal_scalars()
> >   					 * to propagate min/max range.
> > +					 * Skip constants to avoid allocation of useless ID.
> >   					 */
> 
> The above is for ALU64.
> 
> We also have ALU32 version:
>     } else if (src_reg->type == SCALAR_VALUE) {
>         bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
> 
>         if (is_src_reg_u32 && !src_reg->id)
>               src_reg->id = ++env->id_gen;
>         copy_register_state(dst_reg, src_reg);
>         ...
> 
> Do you think we should do the same thing if src_reg is a constant,
> not to change src_reg->id?

This is a good point, thank you. Adding the same check for the 32-bit
case actually helps with verifier performance a bit:

$ ./veristat -e file,prog,states -f 'insns_pct>1' -C master-baseline.log current.log
File       Program                         States (A)  States (B)  States (DIFF)
---------  ------------------------------  ----------  ----------  -------------
bpf_xdp.o  tail_handle_nat_fwd_ipv6               648         660   +12 (+1.85%)
bpf_xdp.o  tail_nodeport_nat_ingress_ipv4         375         455  +80 (+21.33%)
bpf_xdp.o  tail_rev_nodeport_lb4                  398         472  +74 (+18.59%)

(all +1% - +3% cases from the cover letter are gone).

> If this is added, could you have a test case for 32-bit subregister
> as well?

I will add the 32-bit test case.

> 
> >   					src_reg->id = ++env->id_gen;
> >   				copy_register_state(dst_reg, src_reg);
> > @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >   
> >   	switch (base_type(rold->type)) {
> >   	case SCALAR_VALUE:
> > +		/* Why check_ids() for precise registers?
> > +		 *
> > +		 * Consider the following BPF code:
> > +		 *   1: r6 = ... unbound scalar, ID=a ...
> > +		 *   2: r7 = ... unbound scalar, ID=b ...
> > +		 *   3: if (r6 > r7) goto +1
> > +		 *   4: r6 = r7
> > +		 *   5: if (r6 > X) goto ...
> > +		 *   6: ... memory operation using r7 ...
> > +		 *
> > +		 * First verification path is [1-6]:
> > +		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > +		 *   r7 <= X, because r6 and r7 share same id.
> > +		 *
> > +		 * Next verification path would start from (5), because of the jump at (3).
> > +		 * The only state difference between first and second visits of (5) is
> > +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > +		 * Thus, use check_ids() to distinguish these states.
> > +		 *
> > +		 * The `rold->precise` check is a performance optimization. If `rold->id`
> > +		 * was ever used to access memory / predict jump, the `rold` or any
> > +		 * register used in `rold = r?` / `r? = rold` operations would be marked
> > +		 * as precise, otherwise it's ID is not really interesting.
> > +		 */
> > +		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> 
> Do we need rold->id checking in the above? check_ids should have 
> rold->id = 0 properly. Or this is just an optimization?

You are correct, check_ids() handles this case and should be inlined,
so there is no need to check rold->id in this 'if' branch.
 
> regs_exact() has check_ids as well. Not sure whether it makes sense to
> create a function regs_exact_scalar() just for scalar and include the
> above code. Otherwise, it is strange we do check_ids in different
> places.

I'm not sure how best to re-organize the code here; regs_exact() is a
nice compartmentalized abstraction. It is possible to merge my
additional check_ids() call with the main 'precise' processing part as below:

@@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
        switch (base_type(rold->type)) {
        case SCALAR_VALUE:
                if (regs_exact(rold, rcur, idmap))
                        return true;
                if (env->explore_alu_limits)
                        return false;
                if (!rold->precise)
                        return true;
                /* new val must satisfy old val knowledge */
                return range_within(rold, rcur) &&
-                      tnum_in(rold->var_off, rcur->var_off);
+                      tnum_in(rold->var_off, rcur->var_off) &&
+                      check_ids(rold->id, rcur->id, idmap);

I'd say that extending the /* new val must satisfy ... */ comment to
explain why check_ids() is needed should be sufficient, but I'm open
to suggestions.

> 
> > +			return false;
> >   		if (regs_exact(rold, rcur, idmap))
> >   			return true;
> >   		if (env->explore_alu_limits)
Eduard Zingerman May 27, 2023, 12:29 p.m. UTC | #3
On Sat, 2023-05-27 at 15:21 +0300, Eduard Zingerman wrote:
[...]
> > > @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > >   
> > >   	switch (base_type(rold->type)) {
> > >   	case SCALAR_VALUE:
> > > +		/* Why check_ids() for precise registers?
> > > +		 *
> > > +		 * Consider the following BPF code:
> > > +		 *   1: r6 = ... unbound scalar, ID=a ...
> > > +		 *   2: r7 = ... unbound scalar, ID=b ...
> > > +		 *   3: if (r6 > r7) goto +1
> > > +		 *   4: r6 = r7
> > > +		 *   5: if (r6 > X) goto ...
> > > +		 *   6: ... memory operation using r7 ...
> > > +		 *
> > > +		 * First verification path is [1-6]:
> > > +		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > > +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > > +		 *   r7 <= X, because r6 and r7 share same id.
> > > +		 *
> > > +		 * Next verification path would start from (5), because of the jump at (3).
> > > +		 * The only state difference between first and second visits of (5) is
> > > +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > > +		 * Thus, use check_ids() to distinguish these states.
> > > +		 *
> > > +		 * The `rold->precise` check is a performance optimization. If `rold->id`
> > > +		 * was ever used to access memory / predict jump, the `rold` or any
> > > +		 * register used in `rold = r?` / `r? = rold` operations would be marked
> > > +		 * as precise, otherwise it's ID is not really interesting.
> > > +		 */
> > > +		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> > 
> > Do we need rold->id checking in the above? check_ids should have 
> > rold->id = 0 properly. Or this is just an optimization?
> 
> You are correct, the check_ids() handles this case and it should be inlined,
> so there is no need to check rold->id in this 'if' branch.
>  
> > regs_exact() has check_ids as well. Not sure whether it makes sense to
> > create a function regs_exact_scalar() just for scalar and include the
> > above code. Otherwise, it is strange we do check_ids in different
> > places.
> 
> I'm not sure how to best re-organize code here, regs_exact() is a nice
> compartmentalized abstraction. It is possible to merge my additional
> check_ids() call with the main 'precise' processing part as below:
> 
> @@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>         switch (base_type(rold->type)) {
>         case SCALAR_VALUE:
>                 if (regs_exact(rold, rcur, idmap))
>                         return true;
>                 if (env->explore_alu_limits)
>                         return false;
>                 if (!rold->precise)
>                         return true;
>                 /* new val must satisfy old val knowledge */
>                 return range_within(rold, rcur) &&
> -                      tnum_in(rold->var_off, rcur->var_off);
> +                      tnum_in(rold->var_off, rcur->var_off) &&
> +                      check_ids(rold->id, rcur->id, idmap);
> 
> I'd say that extending /* new val must satisfy ... */ comment to
> explain why check_ids() is needed should be sufficient, but I'm open
> for suggestions.

On the other hand, I wanted to have a separate 'if' branch like:

  if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
  
Specifically, to explain that the 'rold->precise' part is an optimization.

> 
> > 
> > > +			return false;
> > >   		if (regs_exact(rold, rcur, idmap))
> > >   			return true;
> > >   		if (env->explore_alu_limits)
>
Yonghong Song May 27, 2023, 11:43 p.m. UTC | #4
On 5/27/23 5:29 AM, Eduard Zingerman wrote:
> On Sat, 2023-05-27 at 15:21 +0300, Eduard Zingerman wrote:
> [...]
>>>> @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>>>>    
>>>>    	switch (base_type(rold->type)) {
>>>>    	case SCALAR_VALUE:
>>>> +		/* Why check_ids() for precise registers?
>>>> +		 *
>>>> +		 * Consider the following BPF code:
>>>> +		 *   1: r6 = ... unbound scalar, ID=a ...
>>>> +		 *   2: r7 = ... unbound scalar, ID=b ...
>>>> +		 *   3: if (r6 > r7) goto +1
>>>> +		 *   4: r6 = r7
>>>> +		 *   5: if (r6 > X) goto ...
>>>> +		 *   6: ... memory operation using r7 ...
>>>> +		 *
>>>> +		 * First verification path is [1-6]:
>>>> +		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
>>>> +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
>>>> +		 *   r7 <= X, because r6 and r7 share same id.
>>>> +		 *
>>>> +		 * Next verification path would start from (5), because of the jump at (3).
>>>> +		 * The only state difference between first and second visits of (5) is
>>>> +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
>>>> +		 * Thus, use check_ids() to distinguish these states.
>>>> +		 *
>>>> +		 * The `rold->precise` check is a performance optimization. If `rold->id`
>>>> +		 * was ever used to access memory / predict jump, the `rold` or any
>>>> +		 * register used in `rold = r?` / `r? = rold` operations would be marked
>>>> +		 * as precise, otherwise it's ID is not really interesting.
>>>> +		 */
>>>> +		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
>>>
>>> Do we need rold->id checking in the above? check_ids should have
>>> rold->id = 0 properly. Or this is just an optimization?
>>
>> You are correct, the check_ids() handles this case and it should be inlined,
>> so there is no need to check rold->id in this 'if' branch.
>>   
>>> regs_exact() has check_ids as well. Not sure whether it makes sense to
>>> create a function regs_exact_scalar() just for scalar and include the
>>> above code. Otherwise, it is strange we do check_ids in different
>>> places.
>>
>> I'm not sure how to best re-organize code here, regs_exact() is a nice
>> compartmentalized abstraction. It is possible to merge my additional
>> check_ids() call with the main 'precise' processing part as below:
>>
>> @@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>>          switch (base_type(rold->type)) {
>>          case SCALAR_VALUE:
>>                  if (regs_exact(rold, rcur, idmap))
>>                          return true;
>>                  if (env->explore_alu_limits)
>>                          return false;
>>                  if (!rold->precise)
>>                          return true;
>>                  /* new val must satisfy old val knowledge */
>>                  return range_within(rold, rcur) &&
>> -                      tnum_in(rold->var_off, rcur->var_off);
>> +                      tnum_in(rold->var_off, rcur->var_off) &&
>> +                      check_ids(rold->id, rcur->id, idmap);
>>
>> I'd say that extending /* new val must satisfy ... */ comment to
>> explain why check_ids() is needed should be sufficient, but I'm open
>> for suggestions.
> 
> On the other hand, I wanted to have a separate 'if' branch like:
> 
>    if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
>    
> Specifically to explain that 'rold->precise' part is an optimization.

Okay, I think you could keep your original implementation. I do think
the rold->ref_obj_id check in regs_exact() is not needed for
SCALAR_VALUE, but it may not be that important. The check_ids()
check in regs_exact() (for SCALAR_VALUE) can also be skipped
if !rold->precise, as an optimization. That is why I mentioned
'inlining' regs_exact() and re-arranging the code. You can still
mention that optimization w.r.t. rold->precise. Overall, if the
alternative makes the code more complex, I am okay with your current
change.

> 
>>
>>>
>>>> +			return false;
>>>>    		if (regs_exact(rold, rcur, idmap))
>>>>    			return true;
>>>>    		if (env->explore_alu_limits)
>>
>
Eduard Zingerman May 29, 2023, 12:59 a.m. UTC | #5
On Sat, 2023-05-27 at 16:43 -0700, Yonghong Song wrote:
> 
> On 5/27/23 5:29 AM, Eduard Zingerman wrote:
> > On Sat, 2023-05-27 at 15:21 +0300, Eduard Zingerman wrote:
> > [...]
> > > > > @@ -15151,6 +15153,33 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > > > >    
> > > > >    	switch (base_type(rold->type)) {
> > > > >    	case SCALAR_VALUE:
> > > > > +		/* Why check_ids() for precise registers?
> > > > > +		 *
> > > > > +		 * Consider the following BPF code:
> > > > > +		 *   1: r6 = ... unbound scalar, ID=a ...
> > > > > +		 *   2: r7 = ... unbound scalar, ID=b ...
> > > > > +		 *   3: if (r6 > r7) goto +1
> > > > > +		 *   4: r6 = r7
> > > > > +		 *   5: if (r6 > X) goto ...
> > > > > +		 *   6: ... memory operation using r7 ...
> > > > > +		 *
> > > > > +		 * First verification path is [1-6]:
> > > > > +		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
> > > > > +		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
> > > > > +		 *   r7 <= X, because r6 and r7 share same id.
> > > > > +		 *
> > > > > +		 * Next verification path would start from (5), because of the jump at (3).
> > > > > +		 * The only state difference between first and second visits of (5) is
> > > > > +		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
> > > > > +		 * Thus, use check_ids() to distinguish these states.
> > > > > +		 *
> > > > > +		 * The `rold->precise` check is a performance optimization. If `rold->id`
> > > > > +		 * was ever used to access memory / predict jump, the `rold` or any
> > > > > +		 * register used in `rold = r?` / `r? = rold` operations would be marked
> > > > > +		 * as precise, otherwise it's ID is not really interesting.
> > > > > +		 */
> > > > > +		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
> > > > 
> > > > Do we need rold->id checking in the above? check_ids should have
> > > > rold->id = 0 properly. Or this is just an optimization?
> > > 
> > > You are correct, the check_ids() handles this case and it should be inlined,
> > > so there is no need to check rold->id in this 'if' branch.
> > >   
> > > > regs_exact() has check_ids as well. Not sure whether it makes sense to
> > > > create a function regs_exact_scalar() just for scalar and include the
> > > > above code. Otherwise, it is strange we do check_ids in different
> > > > places.
> > > 
> > > I'm not sure how to best re-organize code here, regs_exact() is a nice
> > > compartmentalized abstraction. It is possible to merge my additional
> > > check_ids() call with the main 'precise' processing part as below:
> > > 
> > > @@ -15152,21 +15154,22 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> > >          switch (base_type(rold->type)) {
> > >          case SCALAR_VALUE:
> > >                  if (regs_exact(rold, rcur, idmap))
> > >                          return true;
> > >                  if (env->explore_alu_limits)
> > >                          return false;
> > >                  if (!rold->precise)
> > >                          return true;
> > >                  /* new val must satisfy old val knowledge */
> > >                  return range_within(rold, rcur) &&
> > > -                      tnum_in(rold->var_off, rcur->var_off);
> > > +                      tnum_in(rold->var_off, rcur->var_off) &&
> > > +                      check_ids(rold->id, rcur->id, idmap);
> > > 
> > > I'd say that extending /* new val must satisfy ... */ comment to
> > > explain why check_ids() is needed should be sufficient, but I'm open
> > > for suggestions.
> > 
> > On the other hand, I wanted to have a separate 'if' branch like:
> > 
> >    if (rold->precise && !check_ids(rold->id, rcur->id, idmap))
> >    
> > Specifically to explain that 'rold->precise' part is an optimization.
> 
> Okay, I think you could keep your original implementation. I do think
> checking rold->ref_obj_id in regs_exact is not needed for
> SCALAR_VALUE but it may not be that important. The check_ids
> checking in reg_exact (for SCALAR_VALUE) can also be skipped
> if !rold->precise as an optimization. That is why I mention
> to 'inline' regs_exact and re-arrange the codes. You can still
> mention that optimization w.r.t. rold->precise. Overall if the code
> is more complex, I am okay with your current change.

I thought a bit more about this and came up with an example that the
'rold->precise' check does not handle:

        /* Bump allocated stack */
 1:     r1 = 0;
        *(u64*)(r10 - 8) = r1;
        /* r9 = pointer to stack */
 2:     r9 = r10;
 3:     r9 += -8;
        /* r8 = ktime_get_ns() */
 4:     call %[bpf_ktime_get_ns];
 5:     r8 = r0;
        /* r7 = ktime_get_ns() */
 6:     call %[bpf_ktime_get_ns];
 7:     r7 = r0;
        /* r6 = ktime_get_ns() */
 8:     call %[bpf_ktime_get_ns];
 9:     r6 = r0;
        /* scratch .id from r0 */
 10:    r0 = 0;
        /* if r6 > r7 is an unpredictable jump */
 11:    if r6 > r7 goto l1;
        /* tie r6 and r7 .id */
 12:    r6 = r7;
    l0:
        /* if r7 > 4 exit(0) */
 13:    if r7 > 4 goto l2;
        /* access memory at r9[r6] */
 14:    r9 += r6;
 15:    r0 = *(u8*)(r9 + 0);
    l2:
 16:    r0 = 0;
 17:    exit;
    l1:
        /* tie r6 and r8 .id */
 18:    r6 = r8;
 19:    goto l0;
    
This example is marked as safe, however it isn't.
What happens:
(a) path 1-17 is verified first, it is marked safe
  - when 14 is processed mark_chain_precision() is called with regno set to r6;
  - moving backwards, mark_chain_precision() does *not* mark r7 as precise at 13,
    because mark_chain_precision() does not track register ids;
  - thus, for the checkpoint at 13 only r6 is marked as precise.
(b) path 1-11, 18-19, 13-17 is verified next:
  - when insn 13 is processed the saved checkpoint is examined,
    the only precise register is r6, so check_ids() is called only
    for r6 and it returns true => the checkpoint is considered safe.

However, in reality register ID assignments differ between (a) and (b) at insn 13:
(a) r6{id=A}, r7{id=A}, r8{id=B}
(b) r6{id=B}, r7{id=A}, r8{id=B}
    
So, the simplest and safest change is as follows:

  @@ -15152,4 +15154,6 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
          switch (base_type(rold->type)) {
          case SCALAR_VALUE:
  +               if (!check_ids(rold->id, rcur->id, idmap))
  +                       return false;
                  if (regs_exact(rold, rcur, idmap))
                          return true;

Here regsafe() does not care about rold->precise marks,
thus the difference between (a) and (b) would be detected by check_ids(),
as all three registers r{6,7,8} would be fed to it.

However, it is also costly (note the veristat filter: only programs with a 40% or larger increase in processed states are shown):

$ ./veristat -e file,prog,states -f 'states_pct>40' -C master-baseline.log current.log 
File         Program                        States (A)  States (B)  States  (DIFF)
-----------  -----------------------------  ----------  ----------  --------------
bpf_host.o   cil_from_host                          37          52   +15 (+40.54%)
bpf_host.o   cil_from_netdev                        28          46   +18 (+64.29%)
bpf_host.o   tail_handle_ipv4_from_host            225         350  +125 (+55.56%)
bpf_host.o   tail_handle_ipv4_from_netdev          109         173   +64 (+58.72%)
bpf_host.o   tail_handle_ipv6_from_host            250         387  +137 (+54.80%)
bpf_host.o   tail_handle_ipv6_from_netdev          132         194   +62 (+46.97%)
bpf_host.o   tail_ipv4_host_policy_ingress         103         167   +64 (+62.14%)
bpf_host.o   tail_ipv6_host_policy_ingress          98         160   +62 (+63.27%)
bpf_xdp.o    __send_drop_notify                      8          14    +6 (+75.00%)
bpf_xdp.o    tail_handle_nat_fwd_ipv6              648         971  +323 (+49.85%)
loop6.bpf.o  trace_virtqueue_add_sgs               226         357  +131 (+57.96%)

I'll modify mark_chain_precision() to mark registers precise taking
into account scalar IDs, when comparisons are processed.
Will report on Monday.


Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index af70dad655ab..624556eda430 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12806,10 +12806,12 @@  static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 				/* case: R1 = R2
 				 * copy register state to dest reg
 				 */
-				if (src_reg->type == SCALAR_VALUE && !src_reg->id)
+				if (src_reg->type == SCALAR_VALUE && !src_reg->id &&
+				    !tnum_is_const(src_reg->var_off))
 					/* Assign src and dst registers the same ID
 					 * that will be used by find_equal_scalars()
 					 * to propagate min/max range.
+					 * Skip constants to avoid allocation of useless ID.
 					 */
 					src_reg->id = ++env->id_gen;
 				copy_register_state(dst_reg, src_reg);
@@ -15151,6 +15153,33 @@  static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 
 	switch (base_type(rold->type)) {
 	case SCALAR_VALUE:
+		/* Why check_ids() for precise registers?
+		 *
+		 * Consider the following BPF code:
+		 *   1: r6 = ... unbound scalar, ID=a ...
+		 *   2: r7 = ... unbound scalar, ID=b ...
+		 *   3: if (r6 > r7) goto +1
+		 *   4: r6 = r7
+		 *   5: if (r6 > X) goto ...
+		 *   6: ... memory operation using r7 ...
+		 *
+		 * First verification path is [1-6]:
+		 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
+		 * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
+		 *   r7 <= X, because r6 and r7 share same id.
+		 *
+		 * Next verification path would start from (5), because of the jump at (3).
+		 * The only state difference between first and second visits of (5) is
+		 * bpf_reg_state::id assignments for r6 and r7: (b, b) vs (a, b).
+		 * Thus, use check_ids() to distinguish these states.
+		 *
+		 * The `rold->precise` check is a performance optimization. If `rold->id`
+		 * was ever used to access memory / predict jump, the `rold` or any
+		 * register used in `rold = r?` / `r? = rold` operations would be marked
+		 * as precise, otherwise its ID is not really interesting.
+		 */
+		if (rold->precise && rold->id && !check_ids(rold->id, rcur->id, idmap))
+			return false;
 		if (regs_exact(rold, rcur, idmap))
 			return true;
 		if (env->explore_alu_limits)