[v3,bpf-next,4/4] selftests/bpf: Test may_goto

Message ID	20240301033734.95939-5-alexei.starovoitov@gmail.com (mailing list archive)
State	Superseded
Delegated to:	BPF
Headers	Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB3904EB3A for <bpf@vger.kernel.org>; Fri, 1 Mar 2024 03:38:00 +0000 (UTC)
	From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
	To: bpf@vger.kernel.org
	Cc: daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, memxor@gmail.com, eddyz87@gmail.com, kernel-team@fb.com
	Subject: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
	Date: Thu, 29 Feb 2024 19:37:34 -0800
	Message-Id: <20240301033734.95939-5-alexei.starovoitov@gmail.com>
	In-Reply-To: <20240301033734.95939-1-alexei.starovoitov@gmail.com>
	References: <20240301033734.95939-1-alexei.starovoitov@gmail.com>
	Precedence: bulk
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
Series	bpf: Introduce may_goto and cond_break
	[v3,bpf-next,0/4] bpf: Introduce may_goto and cond_break
	[v3,bpf-next,1/4] bpf: Introduce may_goto instruction
	[v3,bpf-next,2/4] bpf: Recognize that two registers are safe when their ranges match
	[v3,bpf-next,3/4] bpf: Add cond_break macro
	[v3,bpf-next,4/4] selftests/bpf: Test may_goto

Context	Check	Description
bpf/vmtest-bpf-next-VM_Test-0	success	Logs for Lint
bpf/vmtest-bpf-next-VM_Test-3	success	Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-2	success	Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-1	success	Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5	success	Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-26	success	Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11	success	Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-4	success	Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9	success	Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-12	success	Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-20	success	Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-10	success	Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18	success	Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-17	success	Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-19	success	Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28	success	Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33	success	Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34	success	Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35	success	Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41	success	Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42	success	Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-24	success	Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25	success	Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21	success	Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22	success	Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27	success	Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23	fail	Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6	success	Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-16	success	Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-30	success	Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31	success	Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32	fail	Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-37	success	Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38	success	Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39	success	Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40	fail	Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-8	fail	Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13	success	Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-7	success	Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-36	success	Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-29	success	Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-14	fail	Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-PR	fail	PR summary
bpf/vmtest-bpf-next-VM_Test-15	fail	Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
netdev/series_format	success	Posting correctly formatted
netdev/tree_selection	success	Clearly marked for bpf-next, async
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 8 this patch: 8
netdev/build_tools	success	Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers	warning	11 maintainers not CCed: jolsa@kernel.org mykolal@fb.com john.fastabend@gmail.com yonghong.song@linux.dev martin.lau@linux.dev song@kernel.org shuah@kernel.org sdf@google.com linux-kselftest@vger.kernel.org kpsingh@kernel.org haoluo@google.com
netdev/build_clang	success	Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 8 this patch: 8
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 85 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Alexei Starovoitov March 1, 2024, 3:37 a.m. UTC

From: Alexei Starovoitov <ast@kernel.org>

Add tests for may_goto instruction via cond_break macro.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
 .../bpf/progs/verifier_iterating_callbacks.c  | 72 ++++++++++++++++++-
 2 files changed, 70 insertions(+), 3 deletions(-)

John Fastabend March 1, 2024, 7:47 p.m. UTC | #1

Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Add tests for may_goto instruction via cond_break macro.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
>  .../bpf/progs/verifier_iterating_callbacks.c  | 72 ++++++++++++++++++-
>  2 files changed, 70 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> index 1a63996c0304..c6c31b960810 100644
> --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> @@ -3,3 +3,4 @@
>  exceptions				 # JIT does not support calling kfunc bpf_throw				       (exceptions)
>  get_stack_raw_tp                         # user_stack corrupted user stack                                             (no backchain userspace)
>  stacktrace_build_id                      # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2                   (?)
> +verifier_iter/cond_break
> diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> index 5905e036e0ea..8476dc47623f 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> @@ -1,8 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0
> -
> -#include <linux/bpf.h>
> -#include <bpf/bpf_helpers.h>
>  #include "bpf_misc.h"
> +#include "bpf_experimental.h"
>  
>  struct {
>  	__uint(type, BPF_MAP_TYPE_ARRAY);
> @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
>  	return 1000 * a + b + c;
>  }
>  
> +#define ARR_SZ 1000000
> +int zero;
> +char arr[ARR_SZ];
> +
> +SEC("socket")
> +__success __retval(0xd495cdc0)
> +int cond_break1(const void *ctx)
> +{
> +	unsigned int i;
> +	unsigned int sum = 0;
> +
> +	for (i = zero; i < ARR_SZ; cond_break, i++)
> +		sum += i;
> +	for (i = zero; i < ARR_SZ; i++) {
> +		barrier_var(i);
> +		sum += i + arr[i];
> +		cond_break;
> +	}
> +
> +	return sum;
> +}
> +
> +SEC("socket")
> +__success __retval(999000000)
> +int cond_break2(const void *ctx)
> +{
> +	int i, j;
> +	int sum = 0;
> +
> +	for (i = zero; i < 1000; cond_break, i++)
> +		for (j = zero; j < 1000; j++) {
> +			sum += i + j;
> +			cond_break;
> +		}
> +
> +	return sum;
> +}
> +
> +static __noinline int loop(void)
> +{
> +	int i, sum = 0;
> +
> +	for (i = zero; i <= 1000000; i++, cond_break)
> +		sum += i;
> +
> +	return sum;
> +}
> +
> +SEC("socket")
> +__success __retval(0x6a5a2920)
> +int cond_break3(const void *ctx)
> +{
> +	return loop();
> +}
> +
> +SEC("socket")
> +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> +int cond_break4(const void *ctx)
> +{
> +	int cnt = 0;
> +
> +	for (;;) {
> +		cond_break;
> +		cnt++;
> +	}
> +	return cnt;
> +}

I found this test illustrative: cond_break, which to me "feels"
like a global hidden iterator, appears not to be reinitialized
across calls?

 static __noinline int full_loop(void)
 {
         int cnt = 0;
 
         for (;;) {
                 cond_break;
                 cnt++;
         }
 
         for (;;) {
                 cond_break;
                 cnt++;
         }
 
         bpf_printk("cnt==%d\n", cnt);
         return cnt;
 }

 SEC("socket")
 __success __retval(16777216)
 int cond_break5(const void *ctx)
 {
         int cnt = 0;
  
         for (;;) {
                 cond_break;
                 cnt++;
         }
  
         cnt += full_loop();
  
         for (;;) {
                 cond_break;
                 cnt++;
         }
         return cnt;
 }
  
This fails with,

do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
run_subtest:FAIL:654 Unexpected retval: 8388608 != 16777216
#430/15  verifier_iterating_callbacks/cond_break5:FAIL
#430     verifier_iterating_callbacks:FAIL

;       cnt += full_loop();                                         
     118:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     120:       b4 02 00 00 0d 00 00 00 w2 = 13
     121:       bc 73 00 00 00 00 00 00 w3 = w7
     122:       85 00 00 00 06 00 00 00 call 6
;                                                            

I guess this is by design but I sort of expected each
call to have its own context. It does make some sense to
limit main and all calls to a max loop count so not
complaining. Maybe consider adding the test? I at least
thought it helped.

> +
>  char _license[] SEC("license") = "GPL";
> -- 
> 2.34.1
> 
>

Alexei Starovoitov March 1, 2024, 9:16 p.m. UTC | #2

On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add tests for may_goto instruction via cond_break macro.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
> >  .../bpf/progs/verifier_iterating_callbacks.c  | 72 ++++++++++++++++++-
> >  2 files changed, 70 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > index 1a63996c0304..c6c31b960810 100644
> > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > @@ -3,3 +3,4 @@
> >  exceptions                            # JIT does not support calling kfunc bpf_throw                                (exceptions)
> >  get_stack_raw_tp                         # user_stack corrupted user stack                                             (no backchain userspace)
> >  stacktrace_build_id                      # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2                   (?)
> > +verifier_iter/cond_break
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > index 5905e036e0ea..8476dc47623f 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > @@ -1,8 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0
> > -
> > -#include <linux/bpf.h>
> > -#include <bpf/bpf_helpers.h>
> >  #include "bpf_misc.h"
> > +#include "bpf_experimental.h"
> >
> >  struct {
> >       __uint(type, BPF_MAP_TYPE_ARRAY);
> > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> >       return 1000 * a + b + c;
> >  }
> >
> > +#define ARR_SZ 1000000
> > +int zero;
> > +char arr[ARR_SZ];
> > +
> > +SEC("socket")
> > +__success __retval(0xd495cdc0)
> > +int cond_break1(const void *ctx)
> > +{
> > +     unsigned int i;
> > +     unsigned int sum = 0;
> > +
> > +     for (i = zero; i < ARR_SZ; cond_break, i++)
> > +             sum += i;
> > +     for (i = zero; i < ARR_SZ; i++) {
> > +             barrier_var(i);
> > +             sum += i + arr[i];
> > +             cond_break;
> > +     }
> > +
> > +     return sum;
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(999000000)
> > +int cond_break2(const void *ctx)
> > +{
> > +     int i, j;
> > +     int sum = 0;
> > +
> > +     for (i = zero; i < 1000; cond_break, i++)
> > +             for (j = zero; j < 1000; j++) {
> > +                     sum += i + j;
> > +                     cond_break;
> > +             }
> > +
> > +     return sum;
> > +}
> > +
> > +static __noinline int loop(void)
> > +{
> > +     int i, sum = 0;
> > +
> > +     for (i = zero; i <= 1000000; i++, cond_break)
> > +             sum += i;
> > +
> > +     return sum;
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(0x6a5a2920)
> > +int cond_break3(const void *ctx)
> > +{
> > +     return loop();
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > +int cond_break4(const void *ctx)
> > +{
> > +     int cnt = 0;
> > +
> > +     for (;;) {
> > +             cond_break;
> > +             cnt++;
> > +     }
> > +     return cnt;
> > +}
>
> I found this test illustrative to show how the cond_break which

ohh. I shouldn't have exposed this implementation detail
in the test. I'll adjust it in the next revision.

> is to me "feels" like a global hidden iterator appears to not
> be reinitialized across calls?
...
> I guess this is by design but I sort of expected each
> call to have its own context. It does make some sense to
> limit main and all calls to a max loop count so not
> complaining. Maybe consider adding the test? I at least
> thought it helped.

At the moment each subprog has its own hidden counter,
but we might have different limits per program type.
Like sleepable might be allowed to loop longer.
The actual limit of BPF_MAX_LOOPS is an arbitrary number.
The bpf prog shouldn't rely on any particular loop count.
Most likely we'll add a watchdog soon and will start cancelling
bpf progs that were on cpu for more than a second
regardless of number of iterations.
Arena faults will be causing loops to terminate too.
And so on.
In other words "cond_break" is a contract between
the verifier and the program. The verifier allows the
program to loop assuming it's behaving well,
but reserves the right to terminate it.
So a bpf author can assume that cond_break is a nop
if their program is well formed.
The loops with discoverable iteration count like
for (i = 0; i < 1000; i++)
are not really a target use case for cond_break.
It's mainly for loops that may have unbounded looping,
but should terminate quickly when code is correct.
Like walking a linked list or strlen().
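
As a rough user-space illustration of the contract described above, the sketch below models cond_break as a hidden budget that a well-formed loop never exhausts. MODEL_MAX_LOOPS, MODEL_COND_BREAK and model_strlen are hypothetical names made up for this sketch; the real cond_break is a BPF macro whose check is emitted by the verifier, so this only mimics the observable behavior.

```c
#include <stddef.h>

/* Hypothetical stand-in for the verifier's budget; the value mirrors the
 * 0x800000 (BPF_MAX_LOOPS) used by the selftests in this patch. */
#define MODEL_MAX_LOOPS 8388608UL

/* Hypothetical model of cond_break: leave the loop once the budget is
 * spent, otherwise consume one unit of it. */
#define MODEL_COND_BREAK(budget) \
	if ((budget) == 0) break; else (budget)--

/* strlen()-like walk: there is no compile-time bound, but the loop ends
 * quickly on well-formed input, so the budget check acts as a nop. */
static size_t model_strlen(const char *s)
{
	unsigned long budget = MODEL_MAX_LOOPS;
	size_t n = 0;

	for (;;) {
		MODEL_COND_BREAK(budget);
		if (s[n] == '\0')
			break;
		n++;
	}
	return n;
}
```

Only a missing terminator within the first MODEL_MAX_LOOPS bytes would make the budget check fire, which is the "reserves the right to terminate" half of the contract.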

Alexei Starovoitov March 1, 2024, 9:22 p.m. UTC | #3

On Thu, Feb 29, 2024 at 7:37 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> +#define ARR_SZ 1000000
> +int zero;
> +char arr[ARR_SZ];
> +
> +SEC("socket")
> +__success __retval(0xd495cdc0)
> +int cond_break1(const void *ctx)
> +{
> +       unsigned int i;
> +       unsigned int sum = 0;

This is the reason for the CI -no_alu32 failures.
I'll fix it in the next revision with:

 int cond_break1(const void *ctx)
 {
-       unsigned int i;
+       unsigned long i;
        unsigned int sum = 0;
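
For reference, the __retval constants in cond_break1..3 are the loop sums truncated to 32 bits. The sketch below recomputes them in user space, assuming arr[] stays zero-filled and that no cond_break fires (each test stays well under the 0x800000 budget). The expected_* function names are hypothetical and not part of the patch; cond_break3 is modeled with unsigned arithmetic so that the wraparound is well defined in plain C.

```c
#include <stdint.h>

#define ARR_SZ 1000000

/* cond_break1: two passes over 0..ARR_SZ-1. arr[] is assumed to be all
 * zeroes, so the second pass adds i + 0. The 32-bit sum wraps. */
static uint32_t expected_cond_break1(void)
{
	uint32_t sum = 0;
	unsigned long i;

	for (i = 0; i < ARR_SZ; i++)
		sum += (uint32_t)i;
	for (i = 0; i < ARR_SZ; i++)
		sum += (uint32_t)i;
	return sum;
}

/* cond_break2: nested 1000 x 1000 loops summing i + j. */
static uint32_t expected_cond_break2(void)
{
	uint32_t sum = 0;
	uint32_t i, j;

	for (i = 0; i < 1000; i++)
		for (j = 0; j < 1000; j++)
			sum += i + j;
	return sum;
}

/* cond_break3: sum of 0..1000000 inclusive, truncated to 32 bits. */
static uint32_t expected_cond_break3(void)
{
	uint32_t sum = 0;
	uint32_t i;

	for (i = 0; i <= 1000000; i++)
		sum += i;
	return sum;
}
```

The three results match 0xd495cdc0, 999000000 and 0x6a5a2920 from the test annotations.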

John Fastabend March 1, 2024, 9:47 p.m. UTC | #4

Alexei Starovoitov wrote:
> On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Alexei Starovoitov wrote:
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add tests for may_goto instruction via cond_break macro.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
> > >  .../bpf/progs/verifier_iterating_callbacks.c  | 72 ++++++++++++++++++-
> > >  2 files changed, 70 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > index 1a63996c0304..c6c31b960810 100644
> > > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > @@ -3,3 +3,4 @@
> > >  exceptions                            # JIT does not support calling kfunc bpf_throw                                (exceptions)
> > >  get_stack_raw_tp                         # user_stack corrupted user stack                                             (no backchain userspace)
> > >  stacktrace_build_id                      # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2                   (?)
> > > +verifier_iter/cond_break
> > > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > index 5905e036e0ea..8476dc47623f 100644
> > > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > @@ -1,8 +1,6 @@
> > >  // SPDX-License-Identifier: GPL-2.0
> > > -
> > > -#include <linux/bpf.h>
> > > -#include <bpf/bpf_helpers.h>
> > >  #include "bpf_misc.h"
> > > +#include "bpf_experimental.h"
> > >
> > >  struct {
> > >       __uint(type, BPF_MAP_TYPE_ARRAY);
> > > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> > >       return 1000 * a + b + c;
> > >  }
> > >
> > > +#define ARR_SZ 1000000
> > > +int zero;
> > > +char arr[ARR_SZ];
> > > +
> > > +SEC("socket")
> > > +__success __retval(0xd495cdc0)
> > > +int cond_break1(const void *ctx)
> > > +{
> > > +     unsigned int i;
> > > +     unsigned int sum = 0;
> > > +
> > > +     for (i = zero; i < ARR_SZ; cond_break, i++)
> > > +             sum += i;
> > > +     for (i = zero; i < ARR_SZ; i++) {
> > > +             barrier_var(i);
> > > +             sum += i + arr[i];
> > > +             cond_break;
> > > +     }
> > > +
> > > +     return sum;
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(999000000)
> > > +int cond_break2(const void *ctx)
> > > +{
> > > +     int i, j;
> > > +     int sum = 0;
> > > +
> > > +     for (i = zero; i < 1000; cond_break, i++)
> > > +             for (j = zero; j < 1000; j++) {
> > > +                     sum += i + j;
> > > +                     cond_break;
> > > +             }
> > > +
> > > +     return sum;
> > > +}
> > > +
> > > +static __noinline int loop(void)
> > > +{
> > > +     int i, sum = 0;
> > > +
> > > +     for (i = zero; i <= 1000000; i++, cond_break)
> > > +             sum += i;
> > > +
> > > +     return sum;
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(0x6a5a2920)
> > > +int cond_break3(const void *ctx)
> > > +{
> > > +     return loop();
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > > +int cond_break4(const void *ctx)
> > > +{
> > > +     int cnt = 0;
> > > +
> > > +     for (;;) {
> > > +             cond_break;
> > > +             cnt++;
> > > +     }
> > > +     return cnt;
> > > +}
> >
> > I found this test illustrative to show how the cond_break which
> 
> ohh. I shouldn't have exposed this implementation detail
> in the test. I'll adjust it in the next revision.
> 
> > is to me "feels" like a global hidden iterator appears to not
> > be reinitialized across calls?
> ...
> > I guess this is by design but I sort of expected each
> > call to have its own context. It does make some sense to
> > limit main and all calls to a max loop count so not
> > complaining. Maybe consider adding the test? I at least
> > thought it helped.
> 
> At the moment each subprog has its own hidden counter,

Aha, that is how I read patch 1 as well. But I'm trying to follow
why I get two different answers here.

The version below passes: the total in cond_break5 is 2xMAX_LOOPS, which
is what I expect from the above and from reading the patch. If I trace
the code, I have two subprogs and each does the fixup,

   insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
     -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);

This is the good one.

 __noinline int full_loop(void)
 {
	int cnt = 0;

	for (;;) {
		cond_break;
		cnt++;
	}

	for (;;) {
		cond_break;
		cnt++;
	}

	bpf_printk("cnt==%d\n", cnt);
	return cnt;
 }

 SEC("socket")
 __success __retval(16777216)
 int cond_break5(const void *ctx)
 {
	int cnt = 0;

	for (;;) {
		cond_break;
		cnt++;
	}

	cnt += full_loop();

	for (;;) {
		cond_break;
		cnt++;
	}
	return cnt;
 }

But adding static fails :( which I didn't expect. Is it obvious
why this is the case?

static  __noinline int full_loop(void)
 {
	int cnt = 0;

	for (;;) {
		cond_break;
		cnt++;
	}

	for (;;) {
		cond_break;
		cnt++;
	}

	bpf_printk("cnt==%d\n", cnt);
	return cnt;
 }

 SEC("socket")
 __success __retval(16777216)
 int cond_break5(const void *ctx)
 {
	int cnt = 0;

	for (;;) {
		cond_break;
		cnt++;
	}

	cnt += full_loop();

	for (;;) {
		cond_break;
		cnt++;
	}
	return cnt;
 }

From the verifier side the story is slightly different. There are still
two subprogs, but subprog[0] has stack_slots==0? Debugging
now, but maybe it's obvious to you what that static is doing.

> but we might have different limits per program type.
> Like sleepable might be allowed to loop longer.
> The actual limit of BPF_MAX_LOOPS is a random number.
> The bpf prog shouldn't rely on any particular loop count.
> Most likely we'll add a watchdog soon and will start cancelling
> bpf progs that were on cpu for more than a second
> regardless of number of iterations.
> Arena faults will be causing loops to terminate too.
> And so on.
> In other words "cond_break" is a contract between
> the verifier and the program. The verifier allows the
> program to loop assuming it's behaving well,
> but reserves the right to terminate it.
> So bpf author can assume that cond_break is a nop
> if their program is well formed.
> The loops with discoverable iteration count like
> for (i = 0; i < 1000; i++)
> are not really a target use case for cond_break.
> It's mainly for loops that may have unbounded looping,
> but should terminate quickly when code is correct.
> Like walking a link list or strlen().

Yep, we do this a lot and just create some artificial upper
bound, so this is nicer for sure. Lots of Tetragon code reads


   for (i = 0; i < MAX_LOOP; i++) {
     do_stuff
     if (exit_cond)
       break;
   }

.John
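
A compilable version of the artificial-bound pattern sketched above might look as follows. The list node type, the MAX_LOOP value and list_sum_bounded are all assumptions made for illustration; the thread gives neither a concrete bound nor the Tetragon code itself.

```c
#include <stddef.h>

#define MAX_LOOP 1024 /* assumed value; the thread does not give one */

struct node { /* hypothetical list node */
	int val;
	struct node *next;
};

/* Artificial-bound pattern: the counter i only exists to give the
 * verifier a provable limit; the real exit condition is the NULL check. */
static int list_sum_bounded(const struct node *head)
{
	int i, sum = 0;

	for (i = 0; i < MAX_LOOP; i++) {
		if (!head) /* exit_cond */
			break;
		sum += head->val; /* do_stuff */
		head = head->next;
	}
	return sum;
}
```

With cond_break the counter and the MAX_LOOP constant would presumably go away, leaving only the NULL check as the visible exit condition.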

John Fastabend March 1, 2024, 10:06 p.m. UTC | #5

John Fastabend wrote:
> Alexei Starovoitov wrote:
> > On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
> > >
> > > Alexei Starovoitov wrote:
> > > > From: Alexei Starovoitov <ast@kernel.org>
> > > >
> > > > Add tests for may_goto instruction via cond_break macro.
> > > >
> > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > ---
> > > >  tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
> > > >  .../bpf/progs/verifier_iterating_callbacks.c  | 72 ++++++++++++++++++-
> > > >  2 files changed, 70 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > index 1a63996c0304..c6c31b960810 100644
> > > > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > @@ -3,3 +3,4 @@
> > > >  exceptions                            # JIT does not support calling kfunc bpf_throw                                (exceptions)
> > > >  get_stack_raw_tp                         # user_stack corrupted user stack                                             (no backchain userspace)
> > > >  stacktrace_build_id                      # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2                   (?)
> > > > +verifier_iter/cond_break
> > > > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > index 5905e036e0ea..8476dc47623f 100644
> > > > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > @@ -1,8 +1,6 @@
> > > >  // SPDX-License-Identifier: GPL-2.0
> > > > -
> > > > -#include <linux/bpf.h>
> > > > -#include <bpf/bpf_helpers.h>
> > > >  #include "bpf_misc.h"
> > > > +#include "bpf_experimental.h"
> > > >
> > > >  struct {
> > > >       __uint(type, BPF_MAP_TYPE_ARRAY);
> > > > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> > > >       return 1000 * a + b + c;
> > > >  }
> > > >
> > > > +#define ARR_SZ 1000000
> > > > +int zero;
> > > > +char arr[ARR_SZ];
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0xd495cdc0)
> > > > +int cond_break1(const void *ctx)
> > > > +{
> > > > +     unsigned int i;
> > > > +     unsigned int sum = 0;
> > > > +
> > > > +     for (i = zero; i < ARR_SZ; cond_break, i++)
> > > > +             sum += i;
> > > > +     for (i = zero; i < ARR_SZ; i++) {
> > > > +             barrier_var(i);
> > > > +             sum += i + arr[i];
> > > > +             cond_break;
> > > > +     }
> > > > +
> > > > +     return sum;
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(999000000)
> > > > +int cond_break2(const void *ctx)
> > > > +{
> > > > +     int i, j;
> > > > +     int sum = 0;
> > > > +
> > > > +     for (i = zero; i < 1000; cond_break, i++)
> > > > +             for (j = zero; j < 1000; j++) {
> > > > +                     sum += i + j;
> > > > +                     cond_break;
> > > > +             }
> > > > +
> > > > +     return sum;
> > > > +}
> > > > +
> > > > +static __noinline int loop(void)
> > > > +{
> > > > +     int i, sum = 0;
> > > > +
> > > > +     for (i = zero; i <= 1000000; i++, cond_break)
> > > > +             sum += i;
> > > > +
> > > > +     return sum;
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0x6a5a2920)
> > > > +int cond_break3(const void *ctx)
> > > > +{
> > > > +     return loop();
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > > > +int cond_break4(const void *ctx)
> > > > +{
> > > > +     int cnt = 0;
> > > > +
> > > > +     for (;;) {
> > > > +             cond_break;
> > > > +             cnt++;
> > > > +     }
> > > > +     return cnt;
> > > > +}
> > >
> > > I found this test illustrative to show how the cond_break which
> > 
> > ohh. I shouldn't have exposed this implementation detail
> > in the test. I'll adjust it in the next revision.
> > 
> > > is to me "feels" like a global hidden iterator appears to not
> > > be reinitialized across calls?
> > ...
> > > I guess this is by design but I sort of expected each
> > > call to have its own context. It does make some sense to
> > > limit main and all calls to a max loop count so not
> > > complaining. Maybe consider adding the test? I at least
> > > thought it helped.
> > 
> > At the moment each subprog has its own hidden counter,
> 
> aha that is how I read the patch1 as well. But I'm trying to follow
> why I get two different answers here.
> 
> Below passes all good the total there in break5 is 2xMAX_LOOPS which
> is what I expect from above and reading patch. If I trace the code
> I have two subprogs and each does fixup,
> 
>    insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
>      -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
> 
> This is the good one.
> 
>  __noinline int full_loop(void)
>  {
> 	int cnt = 0;
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	bpf_printk("cnt==%d\n", cnt);
> 	return cnt;
>  }
> 
>  SEC("socket")
>  __success __retval(16777216)
>  int cond_break5(const void *ctx)
>  {
> 	int cnt = 0;
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	cnt += full_loop();
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 	return cnt;
>  }
> 
> But adding static fails :( which I didn't expect. Is it obvious
> why this is the case?
> 
> static  __noinline int full_loop(void)
>  {
> 	int cnt = 0;
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	bpf_printk("cnt==%d\n", cnt);
> 	return cnt;
>  }
> 
>  SEC("socket")
>  __success __retval(16777216)
>  int cond_break5(const void *ctx)
>  {
> 	int cnt = 0;
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 
> 	cnt += full_loop();
> 
> 	for (;;) {
> 		cond_break;
> 		cnt++;
> 	}
> 	return cnt;
>  }
> 
> From verifier side story is slightly different. There are still
> two subprogs, but for subprog[0] has stack_slots==0? Debugging
> now but maybe its obvious what that static is doing to you.

That was a typo: it's subprog[1] with stack_slots == 0. Also,
tracing insns, it seems in the non-static case we hit multiple
insn->code (BPF_JMP | BPF_JMA) but in the static case we only
find the first one. The object file seems to have multiple
though. I need to drop off for the rest of the afternoon most
likely, but will try to see what sort of silly thing I did
later today or, worst case, Monday.

>
> > but we might have different limits per program type.
> > Like sleepable might be allowed to loop longer.
> > The actual limit of BPF_MAX_LOOPS is a random number.
> > The bpf prog shouldn't rely on any particular loop count.
> > Most likely we'll add a watchdog soon and will start cancelling
> > bpf progs that were on cpu for more than a second
> > regardless of number of iterations.
> > Arena faults will be causing loops to terminate too.
> > And so on.
> > In other words "cond_break" is a contract between
> > the verifier and the program. The verifier allows the
> > program to loop assuming it's behaving well,
> > but reserves the right to terminate it.
> > So bpf author can assume that cond_break is a nop
> > if their program is well formed.
> > The loops with discoverable iteration count like
> > for (i = 0; i < 1000; i++)
> > are not really a target use case for cond_break.
> > It's mainly for loops that may have unbounded looping,
> > but should terminate quickly when code is correct.
> > Like walking a link list or strlen().
> 
> Yep we do this a lot and just create some artifical upper
> bound so this is nicer for sure. Lots of Tetragon code reads
> 
> 
>    for (i = 0; i < MAX_LOOP; i++) {
>      do_stuff
>      if (exit_cond)
>        break;
>   } 
> 
> .John

Alexei Starovoitov March 1, 2024, 10:12 p.m. UTC | #6

On Fri, Mar 1, 2024 at 2:07 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> >  SEC("socket")
> >  __success __retval(16777216)
> >  int cond_break5(const void *ctx)
> >  {
> >       int cnt = 0;
> >
> >       for (;;) {
> >               cond_break;
> >               cnt++;
> >       }
> >
> >       cnt += full_loop();
> >
> >       for (;;) {
> >               cond_break;
> >               cnt++;
> >       }
> >       return cnt;
> >  }
> >
> > From verifier side story is slightly different. There are still
> > two subprogs, but for subprog[0] has stack_slots==0? Debugging
> > now but maybe its obvious what that static is doing to you.
>
> That was a typo its subprog[1] with stack_slots == 0. Also
> tracing insn it seems in nonstatic case we hit multiple
> insn->code (BPF_JMP| BPF_JMA) but in the static case only
> find the first one. Object file seems to have multiples
> though. I need to drop for the rest of the afternoon most
> likely, but will try to see what sort of silly thing I did
> later today or worse case Monday.

Thanks for the bug report.
For static case:

$ bpftool p dump xlated id 36

int cond_break5(const void * ctx):
; int cond_break5(const void *ctx)
   0: (7a) *(u64 *)(r10 -8) = 8388608
   1: (b4) w6 = 0
; cond_break;
   2: (79) r11 = *(u64 *)(r10 -8)
   3: (15) if r11 == 0x0 goto pc+4
   4: (17) r11 -= 1
   5: (7b) *(u64 *)(r10 -8) = r11
; cnt++;
   6: (04) w6 += 1
   7: (05) goto pc-6
; cnt += full_loop();
   8: (85) call pc+2#bpf_prog_270866f75dae27c8_full_loop
; for (;;) {
   9: (0c) w0 += w6
; return cnt;
  10: (95) exit
int full_loop():
; static __noinline int full_loop(void)
  11: (b4) w6 = 0
; bpf_printk("cnt==%d\n", cnt);
  12: (18) r1 = map[id:35][0]+0
  14: (b4) w2 = 9
  15: (bc) w3 = w6
  16: (85) call bpf_trace_printk#-87376
; return cnt;
  17: (bc) w0 = w6
  18: (95) exit

Looks like I made a mistake in may_goto verification.
Only the first loop remains. Other loops were removed as dead code.
It's certainly a bug in patch 1. Will fix in the next revision.

pw-bot: cr
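
To make the two observed return values concrete, here is a user-space model of the per-subprog counter visible in the xlated dump (insns 0 and 2-5). It is a sketch under assumptions, not the verifier's code: may_goto_model, full_loop_model and cond_break5_model are hypothetical names, and the callee_loops_kept flag simply models whether the callee's loops survive or are dropped as dead code.

```c
#define BPF_MAX_LOOPS_MODEL 8388608ULL /* 0x800000, as stored by insn 0 */

/* One may_goto check, following insns 2-5 of the dump: load the counter,
 * leave the loop if it is zero, otherwise decrement and store it back.
 * Returns 1 when the loop must break. */
static int may_goto_model(unsigned long long *counter)
{
	if (*counter == 0)
		return 1;
	*counter -= 1;
	return 0;
}

/* Callee with its own counter. loops_kept == 0 models the buggy static
 * case, where both loops were removed as dead code. */
static unsigned long long full_loop_model(int loops_kept)
{
	unsigned long long counter = BPF_MAX_LOOPS_MODEL;
	unsigned long long cnt = 0;

	if (!loops_kept)
		return 0;
	for (;;) {
		if (may_goto_model(&counter))
			break;
		cnt++;
	}
	for (;;) { /* counter is already zero, so this adds nothing */
		if (may_goto_model(&counter))
			break;
		cnt++;
	}
	return cnt;
}

/* Main program: both of its loops share one counter. */
static unsigned long long cond_break5_model(int callee_loops_kept)
{
	unsigned long long counter = BPF_MAX_LOOPS_MODEL;
	unsigned long long cnt = 0;

	for (;;) {
		if (may_goto_model(&counter))
			break;
		cnt++;
	}
	cnt += full_loop_model(callee_loops_kept);
	for (;;) {
		if (may_goto_model(&counter))
			break;
		cnt++;
	}
	return cnt;
}
```

With the callee's loops kept, the model yields 16777216 (2 x BPF_MAX_LOOPS), the value the test expects; with them removed it yields 8388608, the value reported in the failure.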
