
[v3,bpf-next,14/19] bpf: relax log_buf NULL conditions when log_level>0 is requested

Message ID 20230404043659.2282536-15-andrii@kernel.org (mailing list archive)
State Superseded
Delegated to: BPF
Series BPF verifier rotating log

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/series_format fail Series longer than 15 patches (and no cover letter)
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 20 this patch: 20
netdev/cc_maintainers warning 8 maintainers not CCed: song@kernel.org sdf@google.com haoluo@google.com yhs@fb.com john.fastabend@gmail.com kpsingh@kernel.org jolsa@kernel.org martin.lau@linux.dev
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 20 this patch: 20
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 88 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain_full }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix

Commit Message

Andrii Nakryiko April 4, 2023, 4:36 a.m. UTC
Drop the log_size>0 and log_buf!=NULL requirement when log_level>0 is
requested. This allows users to request log_size_actual of a full log
without providing an actual (even if small) log buffer. The verifier log
handling code was mostly ready to handle a NULL log->ubuf, so only a few
small changes were needed to prevent a NULL log->ubuf from causing
problems.

Note that the user is then basically guaranteed to receive -ENOSPC when
requesting log_level>0 with log_buf==NULL. We also still enforce that
either (log_buf==NULL && log_size==0) or (log_buf!=NULL && log_size>0)
holds.

Suggested-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/log.c | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)
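
To make the relaxed UAPI concrete, here is a hedged userspace sketch, not
part of the patch: load once with log_level>0 but no buffer to learn the
required size, then retry with an exactly sized buffer. It assumes the
log_size_actual field of union bpf_attr added earlier in this series
(renamed log_true_size in later revisions) and uses a trivial
"r0 = 0; exit" program; error handling is kept minimal.

/* Hypothetical userspace sketch, not part of this patch: probe the log
 * size with a NULL buffer, then retry once with an exactly sized buffer.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static __u64 ptr_to_u64(const void *p)
{
	return (__u64)(unsigned long)p;
}

static int sys_prog_load(union bpf_attr *attr)
{
	return syscall(__NR_bpf, BPF_PROG_LOAD, attr, sizeof(*attr));
}

int main(void)
{
	/* trivial "r0 = 0; exit" program */
	struct bpf_insn insns[] = {
		{ .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
		{ .code = BPF_JMP | BPF_EXIT },
	};
	union bpf_attr attr = {};
	int fd;

	attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
	attr.insns = ptr_to_u64(insns);
	attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
	attr.license = ptr_to_u64("GPL");
	/* log_buf == NULL && log_size == 0 with log_level > 0 is now accepted */
	attr.log_level = 1;

	fd = sys_prog_load(&attr);
	if (fd < 0 && errno == ENOSPC) {
		/* this revision still reports -ENOSPC for the NULL buffer,
		 * but log_size_actual has been filled in by the kernel
		 */
		char *buf = malloc(attr.log_size_actual);

		attr.log_buf = ptr_to_u64(buf);
		attr.log_size = attr.log_size_actual;
		fd = sys_prog_load(&attr);
		if (fd < 0)
			fprintf(stderr, "verifier log:\n%s\n", buf);
		free(buf);
	}
	if (fd >= 0)
		close(fd);
	return fd >= 0 ? 0 : 1;
}

Under this revision's semantics the first load is expected to fail with
-ENOSPC whenever any log output is produced, which is exactly the trade-off
debated in the thread below.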

Comments

Lorenz Bauer April 5, 2023, 5:29 p.m. UTC | #1
On Tue, Apr 4, 2023 at 5:37 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Drop the log_size>0 and log_buf!=NULL condition when log_level>0. This
> allows users to request log_size_actual of a full log without providing
> actual (even if small) log buffer. Verifier log handling code was mostly ready to handle NULL log->ubuf, so only few small changes were necessary to prevent NULL log->ubuf from causing problems.
>
> Note, that user is basically guaranteed to receive -ENOSPC when
> providing log_level>0 and log_buf==NULL. We also still enforce that
> either (log_buf==NULL && log_size==0) or (log_buf!=NULL && log_size>0).

Is it possible to change it so that log_buf == NULL && log_size == 0
&& log_level > 0 only fills in log_size_actual and doesn't return
ENOSPC? Otherwise we can't do oneshot loading.

  if PROG_LOAD(buf=NULL, size=0, level=1) >= 0:
    return fd
  else
    retry PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)

If the first PROG_LOAD returned ENOSPC we'd have to re-run it without
the log enabled to see whether ENOSPC is masking a real verification
error. With the current semantics we can work around this with three
syscalls, but that seems wasteful?

  if PROG_LOAD(buf=NULL, size=0, level=0) >= 0:
    return fd

  PROG_LOAD(buf=NULL, size=0, level=1) == ENOSPC
  PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)
Andrii Nakryiko April 5, 2023, 5:44 p.m. UTC | #2
On Wed, Apr 5, 2023 at 10:29 AM Lorenz Bauer <lmb@isovalent.com> wrote:
>
> On Tue, Apr 4, 2023 at 5:37 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Drop the log_size>0 and log_buf!=NULL condition when log_level>0. This
> > allows users to request log_size_actual of a full log without providing
> > actual (even if small) log buffer. Verifier log handling code was mostly ready to handle NULL log->ubuf, so only few small changes were necessary to prevent NULL log->ubuf from causing problems.
> >
> > Note, that user is basically guaranteed to receive -ENOSPC when
> > providing log_level>0 and log_buf==NULL. We also still enforce that
> > either (log_buf==NULL && log_size==0) or (log_buf!=NULL && log_size>0).
>
> Is it possible to change it so that log_buf == NULL && log_size == 0
> && log_level > 0 only fills in log_size_actual and doesn't return
> ENOSPC? Otherwise we can't do oneshot loading.
>
>   if PROG_LOAD(buf=NULL, size=0, level=1) >= 0:
>     return fd
>   else
>     retry PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)
>
> If the first PROG_LOAD returned ENOSPC we'd have to re-run it without
> the log enabled to see whether ENOSPC is masking a real verification
> error. With the current semantics we can work around this with three
> syscalls, but that seems wasteful?
>
>   if PROG_LOAD(buf=NULL, size=0, level=0) >= 0:
>     return fd
>
>   PROG_LOAD(buf=NULL, size=0, level=1) == ENOSPC
>   PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)

We could and I thought about this, but verifier logging is quite an
expensive operation due to all the extra formatting and reporting, so
it's not advised to have log_level=1 permanently enabled for
production BPF program loading. So I didn't want to encourage this
pattern. Note that even if log_buf==NULL when log_level>0 we'd be
doing printf()-ing everything, which is the expensive part. So do you
really want to add all this extra overhead *constantly* to all
production BPF programs?
Lorenz Bauer April 6, 2023, 4:15 p.m. UTC | #3
On Wed, Apr 5, 2023 at 6:44 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> We could and I thought about this, but verifier logging is quite an
> expensive operation due to all the extra formatting and reporting, so
> it's not advised to have log_level=1 permanently enabled for
> production BPF program loading.

Any idea how expensive this is?

> Note that even if log_buf==NULL when log_level>0 we'd be
> doing printf()-ing everything, which is the expensive part. So do you
> really want to add all this extra overhead *constantly* to all
> production BPF programs?

No, I'm just going off of what UAPI I would like to use. Keeping
semantics as they are is fine if it's too slow. We could always use a
small-ish buffer for the first retry and hope things fit.
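
A sketch of that fallback strategy, purely illustrative: use a fixed
"small-ish" buffer on the first attempt and pay for a second load only when
it overflows. It reuses the hypothetical sys_prog_load()/ptr_to_u64()
helpers and the log_size_actual field assumed in the sketch above.

#define FIRST_TRY_LOG_SZ	(64 * 1024)	/* arbitrary first guess */

static int prog_load_with_log(union bpf_attr *attr, char **log_out)
{
	char *buf = malloc(FIRST_TRY_LOG_SZ);
	int fd;

	attr->log_level = 1;
	attr->log_buf = ptr_to_u64(buf);
	attr->log_size = FIRST_TRY_LOG_SZ;

	fd = sys_prog_load(attr);
	if (fd < 0 && errno == ENOSPC) {
		/* log was truncated; retry once with the exact size */
		buf = realloc(buf, attr->log_size_actual);
		attr->log_buf = ptr_to_u64(buf);
		attr->log_size = attr->log_size_actual;
		fd = sys_prog_load(attr);
	}
	*log_out = buf;	/* caller prints and frees the log */
	return fd;
}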
Andrii Nakryiko April 6, 2023, 6:43 p.m. UTC | #4
On Thu, Apr 6, 2023 at 9:15 AM Lorenz Bauer <lmb@isovalent.com> wrote:
>
> On Wed, Apr 5, 2023 at 6:44 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > We could and I thought about this, but verifier logging is quite an
> > expensive operation due to all the extra formatting and reporting, so
> > it's not advised to have log_level=1 permanently enabled for
> > production BPF program loading.
>
> Any idea how expensive this is?
>

It will depend on the specific program, of course. But just to
estimate, here's pyperf100 selftests with log_level=1 and log_level=4
(just stats, basically free).

[vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o -v >/dev/null

real    0m1.761s
user    0m0.008s
sys     0m1.743s
[vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o >/dev/null

real    0m0.869s
user    0m0.009s
sys     0m0.846s

2x difference. So I'd definitely not recommend running with
log_level=1 by default.

> > Note that even if log_buf==NULL when log_level>0 we'd be
> > doing printf()-ing everything, which is the expensive part. So do you
> > really want to add all this extra overhead *constantly* to all
> > production BPF programs?
>
> No, I'm just going off of what UAPI I would like to use. Keeping
> semantics as they are is fine if it's too slow. We could always use a
> small-ish buffer for the first retry and hope things fit.

It's easy for me to implement it either way, Alexei and Daniel, any
thoughts on this?
Alexei Starovoitov April 6, 2023, 7:46 p.m. UTC | #5
On Thu, Apr 6, 2023 at 11:43 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 9:15 AM Lorenz Bauer <lmb@isovalent.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 6:44 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > We could and I thought about this, but verifier logging is quite an
> > > expensive operation due to all the extra formatting and reporting, so
> > > it's not advised to have log_level=1 permanently enabled for
> > > production BPF program loading.
> >
> > Any idea how expensive this is?
> >
>
> It will depend on the specific program, of course. But just to
> estimate, here's pyperf100 selftests with log_level=1 and log_level=4
> (just stats, basically free).
>
> [vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o -v >/dev/null
>
> real    0m1.761s
> user    0m0.008s
> sys     0m1.743s
> [vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o >/dev/null
>
> real    0m0.869s
> user    0m0.009s
> sys     0m0.846s
>
> 2x difference. So I'd definitely not recommend running with
> log_level=1 by default.
>
> > > Note that even if log_buf==NULL when log_level>0 we'd be
> > > doing printf()-ing everything, which is the expensive part. So do you
> > > really want to add all this extra overhead *constantly* to all
> > > production BPF programs?
> >
> > No, I'm just going off of what UAPI I would like to use. Keeping
> > semantics as they are is fine if it's too slow. We could always use a
> > small-ish buffer for the first retry and hope things fit.
>
> It's easy for me to implement it either way, Alexei and Daniel, any
> thoughts on this?

I like this part of the idea:

> Is it possible to change it so that log_buf == NULL && log_size == 0
> && log_level > 0 only fills in log_size_actual and doesn't return
> ENOSPC? Otherwise we can't do oneshot loading.
>  if PROG_LOAD(buf=NULL, size=0, level=1) >= 0:
>    return fd
>  else
>    retry PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)

libbpf shouldn't be doing it by default, since load time matters,
but it could be useful in other scenarios.
Like central bpf loading daemon can use (buf=NULL, size=0, level=(1 | 4))
every 100th prog_load as part of stats collection.
veristat can do it by default.

log_true_size should more or less align with verified_insns uapi number,
but still interesting to see and track over time.
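
A minimal sketch of that sampling policy, as hypothetical daemon-side code
rather than anything from this series; it reuses the sys_prog_load() helper
assumed earlier and relies on the follow-up semantics agreed below, where
log_buf==NULL no longer turns an otherwise successful load into -ENOSPC.

static unsigned long nr_loads;

static int daemon_prog_load(union bpf_attr *attr)
{
	int sample = (++nr_loads % 100 == 0);
	int fd;

	if (sample) {
		/* no buffer: only log_size_actual is reported back */
		attr->log_buf = 0;
		attr->log_size = 0;
		attr->log_level = 1 | 4;	/* full log + verifier stats */
	}

	fd = sys_prog_load(attr);
	if (sample && fd >= 0) {
		/* export attr->log_size_actual here, alongside the
		 * program's verified_insns from bpf_obj_get_info_by_fd()
		 */
	}
	return fd;
}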
Andrii Nakryiko April 6, 2023, 8:10 p.m. UTC | #6
On Thu, Apr 6, 2023 at 12:46 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 11:43 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Apr 6, 2023 at 9:15 AM Lorenz Bauer <lmb@isovalent.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 6:44 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > We could and I thought about this, but verifier logging is quite an
> > > > expensive operation due to all the extra formatting and reporting, so
> > > > it's not advised to have log_level=1 permanently enabled for
> > > > production BPF program loading.
> > >
> > > Any idea how expensive this is?
> > >
> >
> > It will depend on the specific program, of course. But just to
> > estimate, here's pyperf100 selftests with log_level=1 and log_level=4
> > (just stats, basically free).
> >
> > [vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o -v >/dev/null
> >
> > real    0m1.761s
> > user    0m0.008s
> > sys     0m1.743s
> > [vmuser@archvm bpf]$ time sudo ./veristat pyperf100.bpf.o >/dev/null
> >
> > real    0m0.869s
> > user    0m0.009s
> > sys     0m0.846s
> >
> > 2x difference. So I'd definitely not recommend running with
> > log_level=1 by default.
> >
> > > > Note that even if log_buf==NULL when log_level>0 we'd be
> > > > doing printf()-ing everything, which is the expensive part. So do you
> > > > really want to add all this extra overhead *constantly* to all
> > > > production BPF programs?
> > >
> > > No, I'm just going off of what UAPI I would like to use. Keeping
> > > semantics as they are is fine if it's too slow. We could always use a
> > > small-ish buffer for the first retry and hope things fit.
> >
> > It's easy for me to implement it either way, Alexei and Daniel, any
> > thoughts on this?
>
> I like this part of the idea:
>
> > Is it possible to change it so that log_buf == NULL && log_size == 0
> > && log_level > 0 only fills in log_size_actual and doesn't return
> > ENOSPC? Otherwise we can't do oneshot loading.
> >  if PROG_LOAD(buf=NULL, size=0, level=1) >= 0:
> >    return fd
> >  else
> >    retry PROG_LOAD(buf!=NULL, size=log_size_actual, level=1)
>
> libbpf shouldn't be doing it by default, since load time matters,
> but it could be useful in other scenarios.
> Like central bpf loading daemon can use (buf=NULL, size=0, level=(1 | 4))
> every 100th prog_load as part of stats collection.
> veristat can do it by default.

makes sense, then no -ENOSPC for log_buf==NULL, will do in next version

>
> log_true_size should more or less align with verified_insns uapi number,
> but still interesting to see and track over time.

Patch

diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 1198b6ad235a..ab8149448724 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -12,8 +12,17 @@ 
 
 static bool bpf_verifier_log_attr_valid(const struct bpf_verifier_log *log)
 {
-	return log->len_total > 0 && log->len_total <= UINT_MAX >> 2 &&
-	       log->level && log->ubuf && !(log->level & ~BPF_LOG_MASK);
+	/* ubuf and len_total should both be specified (or not) together */
+	if (!!log->ubuf != !!log->len_total)
+		return false;
+	/* log buf without log_level is meaningless */
+	if (log->ubuf && log->level == 0)
+		return false;
+	if (log->level & ~BPF_LOG_MASK)
+		return false;
+	if (log->len_total > UINT_MAX >> 2)
+		return false;
+	return true;
 }
 
 int bpf_vlog_init(struct bpf_verifier_log *log, u32 log_level,
@@ -64,15 +73,15 @@  void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
 	}
 
 	n += 1; /* include terminating zero */
+	bpf_vlog_update_len_max(log, n);
+
 	if (log->level & BPF_LOG_FIXED) {
 		/* check if we have at least something to put into user buf */
 		new_n = 0;
-		if (log->end_pos < log->len_total - 1) {
+		if (log->end_pos + 1 < log->len_total) {
 			new_n = min_t(u32, log->len_total - log->end_pos, n);
 			log->kbuf[new_n - 1] = '\0';
 		}
-
-		bpf_vlog_update_len_max(log, n);
 		cur_pos = log->end_pos;
 		log->end_pos += n - 1; /* don't count terminating '\0' */
 
@@ -84,16 +93,21 @@  void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
 		u32 buf_start, buf_end, new_n;
 
 		log->kbuf[n - 1] = '\0';
-		bpf_vlog_update_len_max(log, n);
 
 		new_end = log->end_pos + n;
 		if (new_end - log->start_pos >= log->len_total)
 			new_start = new_end - log->len_total;
 		else
 			new_start = log->start_pos;
+
+		log->start_pos = new_start;
+		log->end_pos = new_end - 1; /* don't count terminating '\0' */
+
+		if (!log->ubuf)
+			return;
+
 		new_n = min(n, log->len_total);
 		cur_pos = new_end - new_n;
-
 		div_u64_rem(cur_pos, log->len_total, &buf_start);
 		div_u64_rem(new_end, log->len_total, &buf_end);
 		/* new_end and buf_end are exclusive indices, so if buf_end is
@@ -103,12 +117,6 @@  void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
 		if (buf_end == 0)
 			buf_end = log->len_total;
 
-		log->start_pos = new_start;
-		log->end_pos = new_end - 1; /* don't count terminating '\0' */
-
-		if (!log->ubuf)
-			return;
-
 		/* if buf_start > buf_end, we wrapped around;
 		 * if buf_start == buf_end, then we fill ubuf completely; we
 		 * can't have buf_start == buf_end to mean that there is
@@ -155,12 +163,15 @@  void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
 	if (log->end_pos < log->start_pos)
 		log->start_pos = log->end_pos;
 
+	if (!log->ubuf)
+		return;
+
 	if (log->level & BPF_LOG_FIXED)
 		pos = log->end_pos + 1;
 	else
 		div_u64_rem(new_pos, log->len_total, &pos);
 
-	if (log->ubuf && pos < log->len_total && put_user(zero, log->ubuf + pos))
+	if (pos < log->len_total && put_user(zero, log->ubuf + pos))
 		log->ubuf = NULL;
 }
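
For quick reference, the attribute combinations accepted by the relaxed
check can be summarized as follows (derived from
bpf_verifier_log_attr_valid() above; invalid combinations are rejected by
bpf_vlog_init() as before):

/*
 *   log_buf   log_size   log_level         result
 *   NULL      0          0                 ok, logging disabled
 *   NULL      0          >0                ok, size probing only (this
 *                                          revision still returns -ENOSPC)
 *   NULL      >0         any               rejected (buf/size mismatch)
 *   !NULL     0          any               rejected (buf/size mismatch)
 *   !NULL     >0         0                 rejected (buf without log_level)
 *   !NULL     >0         >0                ok, normal logging
 *
 * In all cases log_size must not exceed UINT_MAX >> 2 and log_level must
 * not set bits outside BPF_LOG_MASK.
 */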