
[v2,bpf,3/3] selftests/bpf: add edge case backtracking logic test

Message ID 20231110002638.4168352-4-andrii@kernel.org
State Accepted
Commit 62ccdb11d3c63dc697dea1fd92b3496fe43dcc1e
Delegated to: BPF
Series BPF control flow graph and precision backtrack fixes

Checks

Context | Check | Description
netdev/series_format | success | Posting correctly formatted
netdev/tree_selection | success | Clearly marked for bpf, async
netdev/fixes_present | success | Fixes tag present in non-next series
netdev/header_inline | success | No static functions without inline keyword in header files
netdev/build_32bit | success | Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers | warning | 12 maintainers not CCed: shuah@kernel.org jolsa@kernel.org john.fastabend@gmail.com linux-kselftest@vger.kernel.org yonghong.song@linux.dev martin.lau@linux.dev sdf@google.com mykolal@fb.com song@kernel.org haoluo@google.com shung-hsi.yu@suse.com kpsingh@kernel.org
netdev/build_clang | success | Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer
netdev/deprecated_api | success | None detected
netdev/check_selftest | success | No net selftest shell script
netdev/verify_fixes | success | No Fixes tag
netdev/build_allmodconfig_warn | success | Errors and warnings before: 9 this patch: 9
netdev/checkpatch | warning | CHECK: Lines should not end with a '('; WARNING: line length of 83 exceeds 80 columns; WARNING: quoted string split across lines
netdev/build_clang_rust | success | No Rust files in patch. Skipping build
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0
netdev/source_inline | success | Was 0 now: 0
bpf/vmtest-bpf-PR | fail | PR summary
bpf/vmtest-bpf-VM_Test-0 | success | Logs for Lint
bpf/vmtest-bpf-VM_Test-1 | success | Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-2 | success | Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-3 | success | Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-8 | success | Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-11 | success | Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-10 | success | Logs for set-matrix
bpf/vmtest-bpf-VM_Test-4 | success | Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 | success | Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-5 | success | Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 | success | Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-17 | success | Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-14 | success | Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-18 | success | Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 | fail | Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 | success | Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-19 | success | Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-15 | success | Logs for set-matrix
bpf/vmtest-bpf-VM_Test-28 | success | Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-20 | success | Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 | success | Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-22 | success | Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-29 | success | Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-VM_Test-27 | success | Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-16 | success | Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-9 | success | Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-26 | success | Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-25 | success | Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-12 | success | Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-13 | success | Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc

Commit Message

Andrii Nakryiko Nov. 10, 2023, 12:26 a.m. UTC
Add a dedicated selftest that tries to set up conditions for a state whose
first and last instruction indices are the same, but which actually spans
the loop 3->4->1->2->3. This confuses mark_chain_precision() if the verifier
doesn't take jump history into account.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/progs/verifier_precision.c  | 40 +++++++++++++++++++
 1 file changed, 40 insertions(+)

Comments

Alexei Starovoitov Nov. 10, 2023, 1:34 a.m. UTC | #1
On Thu, Nov 9, 2023 at 4:26 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Add a dedicated selftests to try to set up conditions to have a state
> with same first and last instruction index, but it actually is a loop
> 3->4->1->2->3. This confuses mark_chain_precision() if verifier doesn't
> take into account jump history.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  .../selftests/bpf/progs/verifier_precision.c  | 40 +++++++++++++++++++
>  1 file changed, 40 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
> index 193c0f8272d0..6b564d4c0986 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_precision.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
> @@ -91,3 +91,43 @@ __naked int bpf_end_bswap(void)
>  }
>
>  #endif /* v4 instruction */
> +
> +SEC("?raw_tp")
> +__success __log_level(2)
> +/*
> + * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
> + * and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
> + * expected log messages to the one specific mark_chain_precision operation.
> + *
> + * This is quite fragile: if verifier checkpointing heuristic changes, this
> + * might need adjusting.

Hmm, but that's what
__flag(BPF_F_TEST_STATE_FREQ)
is supposed to address.

> + */
> +__msg("2: (07) r0 += 1                       ; R0_w=6")
> +__msg("3: (35) if r0 >= 0xa goto pc+1")
> +__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
> +__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
> +__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
> +__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
> +__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
> +__msg("mark_precise: frame0: parent state regs= stack=:  R0_rw=P4")
> +__msg("3: R0_w=6")
> +__naked int state_loop_first_last_equal(void)
> +{
> +       asm volatile (
> +               "r0 = 0;"
> +       "l0_%=:"
> +               "r0 += 1;"
> +               "r0 += 1;"

Is that why you had two ++?
Add state_freq and remove one of them?
Andrii Nakryiko Nov. 10, 2023, 3:43 a.m. UTC | #2
On Thu, Nov 9, 2023 at 5:34 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Nov 9, 2023 at 4:26 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Add a dedicated selftests to try to set up conditions to have a state
> > with same first and last instruction index, but it actually is a loop
> > 3->4->1->2->3. This confuses mark_chain_precision() if verifier doesn't
> > take into account jump history.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  .../selftests/bpf/progs/verifier_precision.c  | 40 +++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
> > index 193c0f8272d0..6b564d4c0986 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_precision.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
> > @@ -91,3 +91,43 @@ __naked int bpf_end_bswap(void)
> >  }
> >
> >  #endif /* v4 instruction */
> > +
> > +SEC("?raw_tp")
> > +__success __log_level(2)
> > +/*
> > + * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
> > + * and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
> > + * expected log messages to the one specific mark_chain_precision operation.
> > + *
> > + * This is quite fragile: if verifier checkpointing heuristic changes, this
> > + * might need adjusting.
>
> Hmm, but that what
> __flag(BPF_F_TEST_STATE_FREQ)
> supposed to address.

When I was analysing and crafting the test, I for some reason assumed I
needed a jump inside the state that wouldn't trigger a state
checkpoint. But I think that's not necessary; just doing a conditional
jump and jumping back an instruction or two should do. With that, yes,
TEST_STATE_FREQ should be a better way to do this.

>
> > + */
> > +__msg("2: (07) r0 += 1                       ; R0_w=6")
> > +__msg("3: (35) if r0 >= 0xa goto pc+1")
> > +__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
> > +__msg("mark_precise: frame0: parent state regs= stack=:  R0_rw=P4")
> > +__msg("3: R0_w=6")
> > +__naked int state_loop_first_last_equal(void)
> > +{
> > +       asm volatile (
> > +               "r0 = 0;"
> > +       "l0_%=:"
> > +               "r0 += 1;"
> > +               "r0 += 1;"
>
> That's why you had two ++ ?
> Add state_freq and remove one of them?
Andrii Nakryiko Nov. 10, 2023, 4:05 a.m. UTC | #3
On Thu, Nov 9, 2023 at 7:43 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Nov 9, 2023 at 5:34 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Nov 9, 2023 at 4:26 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > Add a dedicated selftests to try to set up conditions to have a state
> > > with same first and last instruction index, but it actually is a loop
> > > 3->4->1->2->3. This confuses mark_chain_precision() if verifier doesn't
> > > take into account jump history.
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > >  .../selftests/bpf/progs/verifier_precision.c  | 40 +++++++++++++++++++
> > >  1 file changed, 40 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
> > > index 193c0f8272d0..6b564d4c0986 100644
> > > --- a/tools/testing/selftests/bpf/progs/verifier_precision.c
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
> > > @@ -91,3 +91,43 @@ __naked int bpf_end_bswap(void)
> > >  }
> > >
> > >  #endif /* v4 instruction */
> > > +
> > > +SEC("?raw_tp")
> > > +__success __log_level(2)
> > > +/*
> > > + * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
> > > + * and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
> > > + * expected log messages to the one specific mark_chain_precision operation.
> > > + *
> > > + * This is quite fragile: if verifier checkpointing heuristic changes, this
> > > + * might need adjusting.
> >
> > Hmm, but that what
> > __flag(BPF_F_TEST_STATE_FREQ)
> > supposed to address.
>
> When I was analysing and crafting the test I for some reason assumed I
> need to have a jump inside the state that won't trigger state
> checkpoint. But I think that's not necessary, just doing conditional
> jump and jumping back an instruction or two should do. With that yes,
> TEST_STATE_FREQ should be a better way to do this.

Ah, ok, TEST_STATE_FREQ won't work. It triggers state checkpointing
both at the conditional jump instruction and at its target, because the
target is a prune point.

So I think this test has to be the way it is.

>
> >
> > > + */
> > > +__msg("2: (07) r0 += 1                       ; R0_w=6")
> > > +__msg("3: (35) if r0 >= 0xa goto pc+1")
> > > +__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
> > > +__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
> > > +__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
> > > +__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
> > > +__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
> > > +__msg("mark_precise: frame0: parent state regs= stack=:  R0_rw=P4")
> > > +__msg("3: R0_w=6")
> > > +__naked int state_loop_first_last_equal(void)
> > > +{
> > > +       asm volatile (
> > > +               "r0 = 0;"
> > > +       "l0_%=:"
> > > +               "r0 += 1;"
> > > +               "r0 += 1;"
> >
> > That's why you had two ++ ?
> > Add state_freq and remove one of them?
Alexei Starovoitov Nov. 10, 2023, 4:14 a.m. UTC | #4
On Thu, Nov 9, 2023 at 8:05 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> >
> > When I was analysing and crafting the test I for some reason assumed I
> > need to have a jump inside the state that won't trigger state
> > checkpoint. But I think that's not necessary, just doing conditional
> > jump and jumping back an instruction or two should do. With that yes,
> > TEST_STATE_FREQ should be a better way to do this.
>
> Ah, ok, TEST_STATE_FREQ won't work. It triggers state checkpointing
> both at conditional jump instruction and on its target, because target
> is prune point.
>
> So I think this test has to be the way it is.

I see.
I was about to apply it, but then noticed:
numamove_bpf-numamove_bpf.o |migrate_misplaced_page |success ->
failure (!!)|-100.00 %

veristat is not known for sporadic failures.
Is this a real issue?
Andrii Nakryiko Nov. 10, 2023, 4:48 a.m. UTC | #5
On Thu, Nov 9, 2023 at 8:14 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Nov 9, 2023 at 8:05 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > >
> > > When I was analysing and crafting the test I for some reason assumed I
> > > need to have a jump inside the state that won't trigger state
> > > checkpoint. But I think that's not necessary, just doing conditional
> > > jump and jumping back an instruction or two should do. With that yes,
> > > TEST_STATE_FREQ should be a better way to do this.
> >
> > Ah, ok, TEST_STATE_FREQ won't work. It triggers state checkpointing
> > both at conditional jump instruction and on its target, because target
> > is prune point.
> >
> > So I think this test has to be the way it is.
>
> I see.
> I was about to apply it, but then noticed:
> numamove_bpf-numamove_bpf.o |migrate_misplaced_page |success ->
> failure (!!)|-100.00 %
>
> veristat is not known for sporadic failures.
> Is this a real issue?

No idea what this is, I don't have it in my local object files, will
need to regenerate them and check.
Andrii Nakryiko Nov. 10, 2023, 5:06 a.m. UTC | #6
On Thu, Nov 9, 2023 at 8:48 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Nov 9, 2023 at 8:14 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu, Nov 9, 2023 at 8:05 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > >
> > > > When I was analysing and crafting the test I for some reason assumed I
> > > > need to have a jump inside the state that won't trigger state
> > > > checkpoint. But I think that's not necessary, just doing conditional
> > > > jump and jumping back an instruction or two should do. With that yes,
> > > > TEST_STATE_FREQ should be a better way to do this.
> > >
> > > Ah, ok, TEST_STATE_FREQ won't work. It triggers state checkpointing
> > > both at conditional jump instruction and on its target, because target
> > > is prune point.
> > >
> > > So I think this test has to be the way it is.
> >
> > I see.
> > I was about to apply it, but then noticed:
> > numamove_bpf-numamove_bpf.o |migrate_misplaced_page |success ->
> > failure (!!)|-100.00 %
> >
> > veristat is not known for sporadic failures.
> > Is this a real issue?
>
> No idea what this is, I don't have it in my local object files, will
> need to regenerate them and check.

libbpf: prog 'migrate_misplaced_page_exit': failed to find kernel BTF
type ID of 'migrate_misplaced_page': -3

It also fails on bpf-next/master.

I think CI compares with the last state before net/net-next merge, and
now this tool (it's from libbpf-tools) fails to find
migrate_misplaced_page kernel function, apparently.

So veristat itself doesn't have sporadic failures, but our CI setup is
not 100% reliable, it seems.

Patch

diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
index 193c0f8272d0..6b564d4c0986 100644
--- a/tools/testing/selftests/bpf/progs/verifier_precision.c
+++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
@@ -91,3 +91,43 @@  __naked int bpf_end_bswap(void)
 }
 
 #endif /* v4 instruction */
+
+SEC("?raw_tp")
+__success __log_level(2)
+/*
+ * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
+ * and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
+ * expected log messages to the one specific mark_chain_precision operation.
+ *
+ * This is quite fragile: if verifier checkpointing heuristic changes, this
+ * might need adjusting.
+ */
+__msg("2: (07) r0 += 1                       ; R0_w=6")
+__msg("3: (35) if r0 >= 0xa goto pc+1")
+__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
+__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
+__msg("mark_precise: frame0: parent state regs= stack=:  R0_rw=P4")
+__msg("3: R0_w=6")
+__naked int state_loop_first_last_equal(void)
+{
+	asm volatile (
+		"r0 = 0;"
+	"l0_%=:"
+		"r0 += 1;"
+		"r0 += 1;"
+		/* every few iterations we'll have a checkpoint here with
+		 * first_idx == last_idx, potentially confusing precision
+		 * backtracking logic
+		 */
+		"if r0 >= 10 goto l1_%=;"	/* checkpoint + mark_precise */
+		"goto l0_%=;"
+	"l1_%=:"
+		"exit;"
+		::: __clobber_common
+	);
+}
+
+char _license[] SEC("license") = "GPL";