diff mbox series

[v4,bpf-next,09/10] selftests/bpf: validate precision logic in partial_stack_load_preserves_zeros

Message ID 20231205184248.1502704-10-andrii@kernel.org (mailing list archive)
State Accepted
Commit 064e0bea19b356c5d5f48a4549d80a3c03ce898b
Delegated to: BPF
Headers show
Series Complete BPF verifier precision tracking support for register spills | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-3 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-15 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-16 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 11 maintainers not CCed: kpsingh@kernel.org sdf@google.com yonghong.song@linux.dev martin.lau@linux.dev mykolal@fb.com song@kernel.org haoluo@google.com jolsa@kernel.org linux-kselftest@vger.kernel.org shuah@kernel.org john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary

Commit Message

Andrii Nakryiko Dec. 5, 2023, 6:42 p.m. UTC
Enhance partial_stack_load_preserves_zeros subtest with detailed
precision propagation log checks. We know expect fp-16 to be spilled,
initially imprecise, zero const register, which is later marked as
precise even when partial stack slot load is performed, even if it's not
a register fill (!).

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/progs/verifier_spill_fill.c    | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Maxim Mikityanskiy Dec. 14, 2023, 2:21 p.m. UTC | #1
Hi Andrii,

I'm preparing a series for submission [1], and it started failing on
this selftest on big endian after I rebased over your series. Can we
discuss (see below) to figure out whether it's a bug in your patch or
whether I'm missing something?

On Tue, 05 Dec 2023 at 10:42:47 -0800, Andrii Nakryiko wrote:
> Enhance partial_stack_load_preserves_zeros subtest with detailed
> precision propagation log checks. We know expect fp-16 to be spilled,
> initially imprecise, zero const register, which is later marked as
> precise even when partial stack slot load is performed, even if it's not
> a register fill (!).
> 
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  .../selftests/bpf/progs/verifier_spill_fill.c    | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> index 41fd61299eab..df4920da3472 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> @@ -495,6 +495,22 @@ char single_byte_buf[1] SEC(".data.single_byte_buf");
>  SEC("raw_tp")
>  __log_level(2)
>  __success
> +/* make sure fp-8 is all STACK_ZERO */
> +__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
> +/* but fp-16 is spilled IMPRECISE zero const reg */
> +__msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
> +/* and now check that precision propagation works even for such tricky case */
> +__msg("10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0")

Why do we require R2 to be precise at this point? It seems the only
reason it's marked as precise here is because it was marked at line 6,
and the mark was never cleared: when R2 was overwritten at line 10, only
__mark_reg_const_zero was called, and no-one cleared the flag, although
R2 was overwritten.

Moreover, if I replace r2 with r3 in this block, it doesn't get the
precise mark, as I expect.

Preserving the flag looks like a bug to me, but I wanted to double-check
with you.

The context why it's relevant to my series: after patch [3], this fill
goes to the then-branch on big endian (not to the else-branch, as
before), and I copy the register with copy_register_state, which
preserves the precise flag from the stack, not from the old value of r2.

> +__msg("11: (0f) r1 += r2")
> +__msg("mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1")
> +__msg("mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)")
> +__msg("mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6")
> +__msg("mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2")
> +__msg("mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2")
> +__msg("mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)")
> +__msg("mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6")
> +__msg("mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0")
> +__msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
>  __naked void partial_stack_load_preserves_zeros(void)
>  {
>  	asm volatile (
> -- 
> 2.34.1
> 
> 

[1]: https://github.com/kernel-patches/bpf/pull/6132
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c?id=c838fe1282df540ebf6e24e386ac34acb3ef3115#n4806
[3]: https://github.com/kernel-patches/bpf/pull/6132/commits/0e72ee541180812e515b2bf3ebd127b6e670fd59
Andrii Nakryiko Dec. 14, 2023, 11:45 p.m. UTC | #2
On Thu, Dec 14, 2023 at 6:21 AM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>
> Hi Andrii,
>
> I'm preparing a series for submission [1], and it started failing on
> this selftest on big endian after I rebased over your series. Can we
> discuss (see below) to figure out whether it's a bug in your patch or
> whether I'm missing something?
>
> On Tue, 05 Dec 2023 at 10:42:47 -0800, Andrii Nakryiko wrote:
> > Enhance partial_stack_load_preserves_zeros subtest with detailed
> > precision propagation log checks. We know expect fp-16 to be spilled,
> > initially imprecise, zero const register, which is later marked as
> > precise even when partial stack slot load is performed, even if it's not
> > a register fill (!).
> >
> > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  .../selftests/bpf/progs/verifier_spill_fill.c    | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > index 41fd61299eab..df4920da3472 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > @@ -495,6 +495,22 @@ char single_byte_buf[1] SEC(".data.single_byte_buf");
> >  SEC("raw_tp")
> >  __log_level(2)
> >  __success
> > +/* make sure fp-8 is all STACK_ZERO */
> > +__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
> > +/* but fp-16 is spilled IMPRECISE zero const reg */
> > +__msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
> > +/* and now check that precision propagation works even for such tricky case */
> > +__msg("10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0")
>
> Why do we require R2 to be precise at this point? It seems the only
> reason it's marked as precise here is because it was marked at line 6,
> and the mark was never cleared: when R2 was overwritten at line 10, only
> __mark_reg_const_zero was called, and no-one cleared the flag, although
> R2 was overwritten.
>
> Moreover, if I replace r2 with r3 in this block, it doesn't get the
> precise mark, as I expect.
>
> Preserving the flag looks like a bug to me, but I wanted to double-check
> with you.
>


So let's look at the relevant pieces of the code and the log.

First, note that we set fp-16 slot to all zeroes by spilling register
with known value zero (but not yet marked precise)

3: (b7) r0 = 0                        ; R0_w=0
4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0

then eventually we get to insns #11, which is using r2 as an offset
into map_value pointer, so r2's value is important to know precisely,
so we start precision back propagation:

8: (73) *(u8 *)(r1 +0) = r2           ;
R1_w=map_value(map=.data.single_by,ks=4,vs=1) R2_w=P0
9: (bf) r1 = r6                       ;
R1_w=map_value(map=.data.single_by,ks=4,vs=1)
R6_w=map_value(map=.data.single_by,ks=4,vs=1)
10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0
11: (0f) r1 += r2
mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)

^^ here r2 is assigned from fp-16 slot, so now we drop r2, but start
tracking fp-16 to mark it as precise

mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6
mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2
mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2
mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)
mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6

^^ irrelevant instructions which we just skip

mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0

^^ here we notice that fp-16 was set by spilling r0 state, so we drop
fp-16, start tracking r0

mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0

^^ and finally we arrive at r0 which was assigned 0 directly. We are done.


All seems correct. Did you spot any problem in the logic?


> The context why it's relevant to my series: after patch [3], this fill
> goes to the then-branch on big endian (not to the else-branch, as
> before), and I copy the register with copy_register_state, which
> preserves the precise flag from the stack, not from the old value of r2.
>

I haven't looked at your patches, sorry, let's try figuring out if the
test's logic is broken, first.

> > +__msg("11: (0f) r1 += r2")
> > +__msg("mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1")
> > +__msg("mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6")
> > +__msg("mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
> >  __naked void partial_stack_load_preserves_zeros(void)
> >  {
> >       asm volatile (
> > --
> > 2.34.1
> >
> >
>
> [1]: https://github.com/kernel-patches/bpf/pull/6132
> [2]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c?id=c838fe1282df540ebf6e24e386ac34acb3ef3115#n4806
> [3]: https://github.com/kernel-patches/bpf/pull/6132/commits/0e72ee541180812e515b2bf3ebd127b6e670fd59
Maxim Mikityanskiy Dec. 15, 2023, 10:34 p.m. UTC | #3
On Thu, 14 Dec 2023 at 15:45:07 -0800, Andrii Nakryiko wrote:
> On Thu, Dec 14, 2023 at 6:21 AM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> >
> > Hi Andrii,
> >
> > I'm preparing a series for submission [1], and it started failing on
> > this selftest on big endian after I rebased over your series. Can we
> > discuss (see below) to figure out whether it's a bug in your patch or
> > whether I'm missing something?
> >
> > On Tue, 05 Dec 2023 at 10:42:47 -0800, Andrii Nakryiko wrote:
> > > Enhance partial_stack_load_preserves_zeros subtest with detailed
> > > precision propagation log checks. We know expect fp-16 to be spilled,
> > > initially imprecise, zero const register, which is later marked as
> > > precise even when partial stack slot load is performed, even if it's not
> > > a register fill (!).
> > >
> > > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > >  .../selftests/bpf/progs/verifier_spill_fill.c    | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > index 41fd61299eab..df4920da3472 100644
> > > --- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > @@ -495,6 +495,22 @@ char single_byte_buf[1] SEC(".data.single_byte_buf");
> > >  SEC("raw_tp")
> > >  __log_level(2)
> > >  __success
> > > +/* make sure fp-8 is all STACK_ZERO */
> > > +__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
> > > +/* but fp-16 is spilled IMPRECISE zero const reg */
> > > +__msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
> > > +/* and now check that precision propagation works even for such tricky case */
> > > +__msg("10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0")
> >
> > Why do we require R2 to be precise at this point? It seems the only
> > reason it's marked as precise here is because it was marked at line 6,
> > and the mark was never cleared: when R2 was overwritten at line 10, only
> > __mark_reg_const_zero was called, and no-one cleared the flag, although
> > R2 was overwritten.
> >
> > Moreover, if I replace r2 with r3 in this block, it doesn't get the
> > precise mark, as I expect.
> >
> > Preserving the flag looks like a bug to me, but I wanted to double-check
> > with you.
> >
> 
> 
> So let's look at the relevant pieces of the code and the log.
> 
> First, note that we set fp-16 slot to all zeroes by spilling register
> with known value zero (but not yet marked precise)
> 
> 3: (b7) r0 = 0                        ; R0_w=0
> 4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0
> 
> then eventually we get to insns #11, which is using r2 as an offset
> into map_value pointer, so r2's value is important to know precisely,
> so we start precision back propagation:
> 
> 8: (73) *(u8 *)(r1 +0) = r2           ;
> R1_w=map_value(map=.data.single_by,ks=4,vs=1) R2_w=P0
> 9: (bf) r1 = r6                       ;
> R1_w=map_value(map=.data.single_by,ks=4,vs=1)
> R6_w=map_value(map=.data.single_by,ks=4,vs=1)
> 10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0

All that you say below makes sense to me. What looks weird is this "P0"
at line 10, because it's before the backtracking happened. And if I
patch this block in the test as follows (replacing r2 with r3):

"r1 = %[single_byte_buf];"
"r3 = *(u8 *)(r10 -9);"
"r1 += r3;"
"*(u8 *)(r1 + 0) = r3;"

then I no longer see R3_w=P0 before the backtracking:

8: (73) *(u8 *)(r1 +0) = r2           ; R1_w=map_value(map=.data.single_by,ks=4,vs=1) R2_w=P0
9: (bf) r1 = r6                       ; R1_w=map_value(map=.data.single_by,ks=4,vs=1) R6_w=map_value(map=.data.single_by,ks=4,vs=1)
10: (71) r3 = *(u8 *)(r10 -9)         ; R3_w=0 R10=fp0 fp-16_w=0
11: (0f) r1 += r3

although the backtracking that follows looks the same:

mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r3 stack= before 10: (71) r3 = *(u8 *)(r10 -9)
mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6
mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2
mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2
mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)
mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6
mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0
mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0

It seems the reason it shows R2_w=P0, but R3_w=0, is that at [2] you
overwrite the register boundaries with zero, but you don't reset the
precise flag, and r2 had it set higher above (for its previous value).

What do you think? Does what I say make sense?

> 11: (0f) r1 += r2
> mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1
> mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)
> 
> ^^ here r2 is assigned from fp-16 slot, so now we drop r2, but start
> tracking fp-16 to mark it as precise
> 
> mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6
> mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2
> mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2
> mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)
> mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6
> 
> ^^ irrelevant instructions which we just skip
> 
> mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0
> 
> ^^ here we notice that fp-16 was set by spilling r0 state, so we drop
> fp-16, start tracking r0
> 
> mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0
> 
> ^^ and finally we arrive at r0 which was assigned 0 directly. We are done.
> 
> 
> All seems correct. Did you spot any problem in the logic?
> 
> 
> > The context why it's relevant to my series: after patch [3], this fill
> > goes to the then-branch on big endian (not to the else-branch, as
> > before), and I copy the register with copy_register_state, which
> > preserves the precise flag from the stack, not from the old value of r2.
> >
> 
> I haven't looked at your patches, sorry, let's try figuring out if the
> test's logic is broken, first.
> 
> > > +__msg("11: (0f) r1 += r2")
> > > +__msg("mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1")
> > > +__msg("mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6")
> > > +__msg("mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0")
> > > +__msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
> > >  __naked void partial_stack_load_preserves_zeros(void)
> > >  {
> > >       asm volatile (
> > > --
> > > 2.34.1
> > >
> > >
> >
> > [1]: https://github.com/kernel-patches/bpf/pull/6132
> > [2]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c?id=c838fe1282df540ebf6e24e386ac34acb3ef3115#n4806
> > [3]: https://github.com/kernel-patches/bpf/pull/6132/commits/0e72ee541180812e515b2bf3ebd127b6e670fd59
Andrii Nakryiko Dec. 15, 2023, 11:02 p.m. UTC | #4
On Fri, Dec 15, 2023 at 2:34 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
>
> On Thu, 14 Dec 2023 at 15:45:07 -0800, Andrii Nakryiko wrote:
> > On Thu, Dec 14, 2023 at 6:21 AM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> > >
> > > Hi Andrii,
> > >
> > > I'm preparing a series for submission [1], and it started failing on
> > > this selftest on big endian after I rebased over your series. Can we
> > > discuss (see below) to figure out whether it's a bug in your patch or
> > > whether I'm missing something?
> > >
> > > On Tue, 05 Dec 2023 at 10:42:47 -0800, Andrii Nakryiko wrote:
> > > > Enhance partial_stack_load_preserves_zeros subtest with detailed
> > > > precision propagation log checks. We know expect fp-16 to be spilled,
> > > > initially imprecise, zero const register, which is later marked as
> > > > precise even when partial stack slot load is performed, even if it's not
> > > > a register fill (!).
> > > >
> > > > Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > >  .../selftests/bpf/progs/verifier_spill_fill.c    | 16 ++++++++++++++++
> > > >  1 file changed, 16 insertions(+)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > > index 41fd61299eab..df4920da3472 100644
> > > > --- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > > +++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
> > > > @@ -495,6 +495,22 @@ char single_byte_buf[1] SEC(".data.single_byte_buf");
> > > >  SEC("raw_tp")
> > > >  __log_level(2)
> > > >  __success
> > > > +/* make sure fp-8 is all STACK_ZERO */
> > > > +__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
> > > > +/* but fp-16 is spilled IMPRECISE zero const reg */
> > > > +__msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
> > > > +/* and now check that precision propagation works even for such tricky case */
> > > > +__msg("10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0")
> > >
> > > Why do we require R2 to be precise at this point? It seems the only
> > > reason it's marked as precise here is because it was marked at line 6,
> > > and the mark was never cleared: when R2 was overwritten at line 10, only
> > > __mark_reg_const_zero was called, and no-one cleared the flag, although
> > > R2 was overwritten.
> > >
> > > Moreover, if I replace r2 with r3 in this block, it doesn't get the
> > > precise mark, as I expect.
> > >
> > > Preserving the flag looks like a bug to me, but I wanted to double-check
> > > with you.
> > >
> >
> >
> > So let's look at the relevant pieces of the code and the log.
> >
> > First, note that we set fp-16 slot to all zeroes by spilling register
> > with known value zero (but not yet marked precise)
> >
> > 3: (b7) r0 = 0                        ; R0_w=0
> > 4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0
> >
> > then eventually we get to insns #11, which is using r2 as an offset
> > into map_value pointer, so r2's value is important to know precisely,
> > so we start precision back propagation:
> >
> > 8: (73) *(u8 *)(r1 +0) = r2           ;
> > R1_w=map_value(map=.data.single_by,ks=4,vs=1) R2_w=P0
> > 9: (bf) r1 = r6                       ;
> > R1_w=map_value(map=.data.single_by,ks=4,vs=1)
> > R6_w=map_value(map=.data.single_by,ks=4,vs=1)
> > 10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0
>
> All that you say below makes sense to me. What looks weird is this "P0"
> at line 10, because it's before the backtracking happened. And if I
> patch this block in the test as follows (replacing r2 with r3):
>
> "r1 = %[single_byte_buf];"
> "r3 = *(u8 *)(r10 -9);"
> "r1 += r3;"
> "*(u8 *)(r1 + 0) = r3;"
>
> then I no longer see R3_w=P0 before the backtracking:
>
> 8: (73) *(u8 *)(r1 +0) = r2           ; R1_w=map_value(map=.data.single_by,ks=4,vs=1) R2_w=P0
> 9: (bf) r1 = r6                       ; R1_w=map_value(map=.data.single_by,ks=4,vs=1) R6_w=map_value(map=.data.single_by,ks=4,vs=1)
> 10: (71) r3 = *(u8 *)(r10 -9)         ; R3_w=0 R10=fp0 fp-16_w=0
> 11: (0f) r1 += r3
>
> although the backtracking that follows looks the same:
>
> mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1
> mark_precise: frame0: regs=r3 stack= before 10: (71) r3 = *(u8 *)(r10 -9)
> mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6
> mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2
> mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2
> mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)
> mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6
> mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0
> mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0
>
> It seems the reason it shows R2_w=P0, but R3_w=0, is that at [2] you
> overwrite the register boundaries with zero, but you don't reset the
> precise flag, and r2 had it set higher above (for its previous value).
>
> What do you think? Does what I say make sense?

Oh, I yes, now I see that as well. You are right. Turns out
__mark_reg_const_zero() doesn't reset the precision flag. Yeah, in
this case when we restore zero from spilled register we should reset
precision for sure, that's an easy fix. But we need to also audit all
the uses of __mark_reg_const_zero() and confirm that it's intended to
preserve old value of precision flag (I suspect in some cases it's not
and we should probably specify that we either clear or set if always).

>
> > 11: (0f) r1 += r2
> > mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1
> > mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)
> >
> > ^^ here r2 is assigned from fp-16 slot, so now we drop r2, but start
> > tracking fp-16 to mark it as precise
> >
> > mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6
> > mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2
> > mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2
> > mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)
> > mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6
> >
> > ^^ irrelevant instructions which we just skip
> >
> > mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0
> >
> > ^^ here we notice that fp-16 was set by spilling r0 state, so we drop
> > fp-16, start tracking r0
> >
> > mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0
> >
> > ^^ and finally we arrive at r0 which was assigned 0 directly. We are done.
> >
> >
> > All seems correct. Did you spot any problem in the logic?
> >
> >
> > > The context why it's relevant to my series: after patch [3], this fill
> > > goes to the then-branch on big endian (not to the else-branch, as
> > > before), and I copy the register with copy_register_state, which
> > > preserves the precise flag from the stack, not from the old value of r2.
> > >
> >
> > I haven't looked at your patches, sorry, let's try figuring out if the
> > test's logic is broken, first.
> >
> > > > +__msg("11: (0f) r1 += r2")
> > > > +__msg("mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1")
> > > > +__msg("mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6")
> > > > +__msg("mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0")
> > > > +__msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
> > > >  __naked void partial_stack_load_preserves_zeros(void)
> > > >  {
> > > >       asm volatile (
> > > > --
> > > > 2.34.1
> > > >
> > > >
> > >
> > > [1]: https://github.com/kernel-patches/bpf/pull/6132
> > > [2]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/bpf/verifier.c?id=c838fe1282df540ebf6e24e386ac34acb3ef3115#n4806
> > > [3]: https://github.com/kernel-patches/bpf/pull/6132/commits/0e72ee541180812e515b2bf3ebd127b6e670fd59
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
index 41fd61299eab..df4920da3472 100644
--- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
+++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
@@ -495,6 +495,22 @@  char single_byte_buf[1] SEC(".data.single_byte_buf");
 SEC("raw_tp")
 __log_level(2)
 __success
+/* make sure fp-8 is all STACK_ZERO */
+__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
+/* but fp-16 is spilled IMPRECISE zero const reg */
+__msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
+/* and now check that precision propagation works even for such tricky case */
+__msg("10: (71) r2 = *(u8 *)(r10 -9)         ; R2_w=P0 R10=fp0 fp-16_w=0")
+__msg("11: (0f) r1 += r2")
+__msg("mark_precise: frame0: last_idx 11 first_idx 0 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r2 stack= before 10: (71) r2 = *(u8 *)(r10 -9)")
+__msg("mark_precise: frame0: regs= stack=-16 before 9: (bf) r1 = r6")
+__msg("mark_precise: frame0: regs= stack=-16 before 8: (73) *(u8 *)(r1 +0) = r2")
+__msg("mark_precise: frame0: regs= stack=-16 before 7: (0f) r1 += r2")
+__msg("mark_precise: frame0: regs= stack=-16 before 6: (71) r2 = *(u8 *)(r10 -1)")
+__msg("mark_precise: frame0: regs= stack=-16 before 5: (bf) r1 = r6")
+__msg("mark_precise: frame0: regs= stack=-16 before 4: (7b) *(u64 *)(r10 -16) = r0")
+__msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
 __naked void partial_stack_load_preserves_zeros(void)
 {
 	asm volatile (