[v3,bpf-next,00/19] BPF verifier rotating log

Message ID: 20230404043659.2282536-1-andrii@kernel.org
Series: BPF verifier rotating log

Message

Andrii Nakryiko April 4, 2023, 4:36 a.m. UTC
This patch set changes the BPF verifier log to behave as a rotating log by
default. If the user-supplied log buffer is big enough to contain the entire
verifier log output, there is no effective difference. But where previously a
user who supplied too small a log buffer would get an -ENOSPC error and the
beginning part of the verifier log, now there will be no error and the user
will get the ending part of the verifier log filling up the user-supplied log
buffer. In the absolute majority of cases this is exactly what's useful and
relevant and what users want and need, as the end of the verifier log contains
the details of the verifier failure and the relevant state that led to it. So
this rotating mode is made the default, but for some niche advanced debugging
scenarios it's possible to request the old behavior by specifying the
additional BPF_LOG_FIXED (8) flag.
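
For illustration, here is a minimal userspace sketch (not part of the series)
of opting back into the old fixed behavior through the new flag. It assumes
only that the flag value is 8, as described above (BPF_LOG_FIXED is defined
locally, and the program/name are just placeholders); everything else is
existing libbpf API:

  #include <unistd.h>
  #include <stdio.h>
  #include <bpf/bpf.h>

  #define BPF_LOG_FIXED 8 /* flag value described above */

  int main(void)
  {
      /* trivial program: r0 = 0; exit */
      struct bpf_insn insns[] = {
          { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0 },
          { .code = BPF_JMP | BPF_EXIT },
      };
      char log[256] = { 0 };
      LIBBPF_OPTS(bpf_prog_load_opts, opts,
          .log_buf = log,
          .log_size = sizeof(log),
          /* level 1 plus BPF_LOG_FIXED: keep the beginning of the log
           * instead of the default rotating behavior
           */
          .log_level = 1 | BPF_LOG_FIXED,
      );
      int fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, "fixed_log_demo",
                             "GPL", insns, sizeof(insns) / sizeof(insns[0]),
                             &opts);

      printf("fd=%d\nlog:\n%s\n", fd, log);
      if (fd >= 0)
          close(fd);
      return 0;
  }

The new veristat --log-fixed option added later in the series exercises the
same flag from the command line.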

This patch set adjusts libbpf to allow specifying flags beyond 1 | 2 | 4. We
also add --log-size and --log-fixed options to veristat, both to be able to
test this functionality manually and to use it in various debugging scenarios.
We also add selftests that try many variants of log buffer size to stress-test
the correctness of the internal verifier log bookkeeping code.

Further, this patch set is merged with the log_size_actual v1 patch set ([0]),
which adds the ability to get the required log buffer size to fit the entire
verifier log output.

This addresses a long-standing limitation that forced users and BPF loader
library writers to guess and pre-size the log buffer, often allocating
unnecessary extra memory or performing extra program verifications just to
size the log better, ultimately wasting resources. This was requested most
recently by the Go BPF library maintainers ([1]).
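
As an illustration of the intended usage, here is a hedged sketch of the
two-pass flow this enables. It assumes a bpf_prog_load_opts output field named
log_size_actual, mirroring the kernel-side field added in this series; the
actual libbpf field name and wiring are defined by the libbpf patches below:

  #include <stdio.h>
  #include <stdlib.h>
  #include <bpf/bpf.h>

  /* Load a program, retrying once with an exactly-sized log buffer on
   * failure. opts.log_size_actual is an assumed output field name, see
   * the note above.
   */
  static int load_with_exact_log(enum bpf_prog_type type, const char *name,
                                 const struct bpf_insn *insns, size_t insn_cnt)
  {
      LIBBPF_OPTS(bpf_prog_load_opts, opts, .log_level = 1);
      char *log;
      int fd;

      /* pass 1: no buffer at all, kernel still reports the needed size */
      fd = bpf_prog_load(type, name, "GPL", insns, insn_cnt, &opts);
      if (fd >= 0 || opts.log_size_actual == 0)
          return fd;

      /* pass 2: exactly-sized buffer, no guessing or over-allocation */
      log = malloc(opts.log_size_actual);
      if (!log)
          return fd;
      opts.log_buf = log;
      opts.log_size = opts.log_size_actual;
      fd = bpf_prog_load(type, name, "GPL", insns, insn_cnt, &opts);
      fprintf(stderr, "%s", log);
      free(log);
      return fd;
  }

With this, a loader can always produce a complete log without over-allocating
or verifying the program more times than necessary.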

See the respective patches for details. A bunch of them are drive-by fixes
for issues detected while working with the code. Some others further refactor
and compartmentalize the verifier log handling code into kernel/bpf/log.c,
which should also make it simpler to integrate such a verbose log into other
complicated bpf() syscall commands, if necessary. The rest implement the
actual logic to calculate the maximum log buffer size needed and return it to
user space. A few patches wire this up on the libbpf side, and the rest add
selftests to test proper log truncation and log_buf==NULL handling.

This turned into a pretty sizable patch set with lots of arithmetic, but
hopefully the features added to the verifier log in this patch set are both
useful for BPF users and self-contained and isolated enough to not cause
trouble going forward.

v2->v3:
  - typos and comment improvement (Lorenz);
  - merged with log_size_actual v1 ([0]) patch set (Alexei);
  - allowed the log_buf==NULL condition (Lorenz);
  - added BPF_BTF_LOAD logs tests (Lorenz);
  - more clean up and refactoring of internal verifier log API;
v1->v2:
  - return -ENOSPC even in rotating log mode for preserving backwards
    compatibility (Lorenz);

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=735213&state=*
  [1] https://lore.kernel.org/bpf/CAN+4W8iNoEbQzQVbB_o1W0MWBDV4xCJAq7K3f6psVE-kkCfMqg@mail.gmail.com/

Andrii Nakryiko (19):
  bpf: split off basic BPF verifier log into separate file
  bpf: remove minimum size restrictions on verifier log buffer
  bpf: switch BPF verifier log to be a rotating log by default
  libbpf: don't enforce unnecessary verifier log restrictions on libbpf
    side
  veristat: add more veristat control over verifier log options
  selftests/bpf: add fixed vs rotating verifier log tests
  bpf: ignore verifier log reset in BPF_LOG_KERNEL mode
  bpf: fix missing -EFAULT return on user log buf error in btf_parse()
  bpf: avoid incorrect -EFAULT error in BPF_LOG_KERNEL mode
  bpf: simplify logging-related error conditions handling
  bpf: keep track of total log content size in both fixed and rolling
    modes
  bpf: add log_size_actual output field to return log contents size
  bpf: simplify internal verifier log interface
  bpf: relax log_buf NULL conditions when log_level>0 is requested
  libbpf: wire through log_size_actual returned from kernel for
    BPF_PROG_LOAD
  libbpf: wire through log_size_actual for bpf_btf_load() API
  selftests/bpf: add tests to validate log_size_actual feature
  selftests/bpf: add testing of log_buf==NULL condition for
    BPF_PROG_LOAD
  selftests/bpf: add verifier log tests for BPF_BTF_LOAD command

 include/linux/bpf.h                           |   2 +-
 include/linux/bpf_verifier.h                  |  41 +-
 include/linux/btf.h                           |   2 +-
 include/uapi/linux/bpf.h                      |  10 +
 kernel/bpf/Makefile                           |   3 +-
 kernel/bpf/btf.c                              |  74 +--
 kernel/bpf/log.c                              | 332 +++++++++++++
 kernel/bpf/syscall.c                          |  16 +-
 kernel/bpf/verifier.c                         | 125 ++---
 tools/include/uapi/linux/bpf.h                |  12 +-
 tools/lib/bpf/bpf.c                           |  17 +-
 tools/lib/bpf/bpf.h                           |  22 +-
 .../selftests/bpf/prog_tests/log_fixup.c      |   1 +
 .../selftests/bpf/prog_tests/verifier_log.c   | 437 ++++++++++++++++++
 tools/testing/selftests/bpf/veristat.c        |  44 +-
 15 files changed, 954 insertions(+), 184 deletions(-)
 create mode 100644 kernel/bpf/log.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/verifier_log.c

Comments

Lorenz Bauer April 5, 2023, 5:28 p.m. UTC | #1
On Tue, Apr 4, 2023 at 5:37 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> This turned into a pretty sizable patch set with lots of arithmetic, but
> hopefully the features added to the verifier log in this patch set are both
> useful for BPF users and self-contained and isolated enough to not cause
> trouble going forward.

Hi Andrii,

Thanks for pushing this forward, this will make all of the log
handling so much nicer. Sorry it took a while to review, but it's
quite a chunky series as you point out yourself :) Maybe it makes
sense to pull out some of the acked bits (moving code around, etc.)
into a separate patch set?

I'll send out individual reviews shortly, but wanted to put my main
proposal here. It's only compile-tested, but it's hopefully clearer
than my words. Note that I didn't fix up whitespace to make the diff
smaller. It should apply on top of your branch.

From 162afa86d109954a4951d052513580849bd5cc54 Mon Sep 17 00:00:00 2001
From: Lorenz Bauer <lmb@isovalent.com>
Date: Wed, 5 Apr 2023 18:24:42 +0100
Subject: [PATCH] bpf: simplify log nul termination and FIXED mode

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
---
 kernel/bpf/log.c | 105 ++++++++++++++++++++++-------------------------
 1 file changed, 50 insertions(+), 55 deletions(-)

diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index ab8149448724..b6b59047a594 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -54,45 +54,13 @@ static void bpf_vlog_update_len_max(struct bpf_verifier_log *log, u32 add_len)
         log->len_max = len;
 }

-void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
-               va_list args)
+static void bpf_vlog_emit(struct bpf_verifier_log *log, const char *kbuf, u32 n)
 {
     u64 cur_pos;
-    u32 new_n, n;
-
-    n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);
-
-    WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1,
-          "verifier log line truncated - local buffer too short\n");
-
-    if (log->level == BPF_LOG_KERNEL) {
-        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
-
-        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
-        return;
-    }
-
-    n += 1; /* include terminating zero */
-    bpf_vlog_update_len_max(log, n);
-
-    if (log->level & BPF_LOG_FIXED) {
-        /* check if we have at least something to put into user buf */
-        new_n = 0;
-        if (log->end_pos + 1 < log->len_total) {
-            new_n = min_t(u32, log->len_total - log->end_pos, n);
-            log->kbuf[new_n - 1] = '\0';
-        }
-        cur_pos = log->end_pos;
-        log->end_pos += n - 1; /* don't count terminating '\0' */
-
-        if (log->ubuf && new_n &&
-            copy_to_user(log->ubuf + cur_pos, log->kbuf, new_n))
-            goto fail;
-    } else {
         u64 new_end, new_start;
         u32 buf_start, buf_end, new_n;

-        log->kbuf[n - 1] = '\0';
+    bpf_vlog_update_len_max(log, n);

         new_end = log->end_pos + n;
         if (new_end - log->start_pos >= log->len_total)
@@ -101,7 +69,7 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
             new_start = log->start_pos;

         log->start_pos = new_start;
-        log->end_pos = new_end - 1; /* don't count terminating '\0' */
+        log->end_pos = new_end;

         if (!log->ubuf)
             return;
@@ -126,35 +94,60 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
         if (buf_start < buf_end) {
             /* message fits within contiguous chunk of ubuf */
             if (copy_to_user(log->ubuf + buf_start,
-                     log->kbuf + n - new_n,
+                     kbuf + n - new_n,
                      buf_end - buf_start))
                 goto fail;
         } else {
             /* message wraps around the end of ubuf, copy in two chunks */
             if (copy_to_user(log->ubuf + buf_start,
-                     log->kbuf + n - new_n,
+                     kbuf + n - new_n,
                      log->len_total - buf_start))
                 goto fail;
             if (copy_to_user(log->ubuf,
-                     log->kbuf + n - buf_end,
+                     kbuf + n - buf_end,
                      buf_end))
                 goto fail;
         }
-    }
-
     return;
 fail:
     log->ubuf = NULL;
 }

-void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
+static u32 bpf_vlog_available(const struct bpf_verifier_log *log)
 {
-    char zero = 0;
-    u32 pos;
+    return log->len_total - (log->end_pos - log->start_pos);
+}
+
+void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
+               va_list args)
+{
+    /* NB: contrary to vsnprintf n can't be larger than sizeof(log->kbuf) */
+    u32 n = vscnprintf(log->kbuf, sizeof(log->kbuf), fmt, args);
+
+    if (log->level == BPF_LOG_KERNEL) {
+        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
+
+        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
+        return;
+    }

+    if (log->level & BPF_LOG_FIXED) {
+        bpf_vlog_update_len_max(log, n);
+        /* avoid rotation by never emitting more than what's unused */
+        n = min_t(u32, n, bpf_vlog_available(log));
+    }
+
+    bpf_vlog_emit(log, log->kbuf, n);
+}
+
+void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
+{
     if (!bpf_verifier_log_needed(log) || log->level == BPF_LOG_KERNEL)
         return;

+    if (WARN_ON_ONCE(new_pos > log->end_pos))
+        return;
+
     /* if position to which we reset is beyond current log window,
      * then we didn't preserve any useful content and should adjust
      * start_pos to end up with an empty log (start_pos == end_pos)
@@ -162,17 +155,6 @@ void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
     log->end_pos = new_pos;
     if (log->end_pos < log->start_pos)
         log->start_pos = log->end_pos;
-
-    if (!log->ubuf)
-        return;
-
-    if (log->level & BPF_LOG_FIXED)
-        pos = log->end_pos + 1;
-    else
-        div_u64_rem(new_pos, log->len_total, &pos);
-
-    if (pos < log->len_total && put_user(zero, log->ubuf + pos))
-        log->ubuf = NULL;
 }

 static void bpf_vlog_reverse_kbuf(char *buf, int len)
@@ -228,6 +210,7 @@ static bool bpf_vlog_truncated(const struct bpf_verifier_log *log)

 int bpf_vlog_finalize(struct bpf_verifier_log *log, u32 *log_size_actual)
 {
+    char zero = 0;
     u32 sublen;
     int err;

@@ -237,8 +220,20 @@ int bpf_vlog_finalize(struct bpf_verifier_log *log, u32 *log_size_actual)

     if (!log->ubuf)
         goto skip_log_rotate;
+
+    if (log->level & BPF_LOG_FIXED) {
+        bpf_vlog_update_len_max(log, 1);
+
+        /* terminate by (potentially) overwriting the last byte */
+        if (put_user(zero, log->ubuf + min_t(u32, log->end_pos, log->len_total-1))
+            return -EFAULT;
+    } else {
+        /* terminate by (potentially) rotating out the first byte */
+        bpf_vlog_emit(log, &zero, 1);
+    }
+
     /* If we never truncated log, there is nothing to move around. */
-    if ((log->level & BPF_LOG_FIXED) || log->start_pos == 0)
+    if (log->start_pos == 0)
         goto skip_log_rotate;

     /* Otherwise we need to rotate log contents to make it start from the
Andrii Nakryiko April 5, 2023, 6:35 p.m. UTC | #2
On Wed, Apr 5, 2023 at 10:28 AM Lorenz Bauer <lmb@isovalent.com> wrote:
>
> On Tue, Apr 4, 2023 at 5:37 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > This turned into a pretty sizable patch set with lots of arithmetic, but
> > hopefully the features added to the verifier log in this patch set are both
> > useful for BPF users and self-contained and isolated enough to not cause
> > trouble going forward.
>
> Hi Andrii,
>
> Thanks for pushing this forward, this will make all of the log
> handling so much nicer. Sorry it took a while to review, but it's
> quite a chunky series as you point out yourself :) Maybe it makes
> sense to pull out some of the acked bits (moving code around, etc.)
> into a separate patch set?
>
> I'll send out individual reviews shortly, but wanted to put my main
> proposal here. It's only compile-tested, but it's hopefully clearer
> than my words. Note that I didn't fix up whitespace to make the diff
> smaller. It should apply on top of your branch.
>
> From 162afa86d109954a4951d052513580849bd5cc54 Mon Sep 17 00:00:00 2001
> From: Lorenz Bauer <lmb@isovalent.com>
> Date: Wed, 5 Apr 2023 18:24:42 +0100
> Subject: [PATCH] bpf: simplify log nul termination and FIXED mode
>
> Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
> ---
>  kernel/bpf/log.c | 105 ++++++++++++++++++++++-------------------------
>  1 file changed, 50 insertions(+), 55 deletions(-)

Thanks for the code, I do understand your idea much better now.
Basically, you are saying to treat FIXED as a partial case of rotating by
adjusting N to never go beyond the end of the buffer. While I do find the
explicit FIXED code path in bpf_verifier_vlog() a bit easier to reason about,
I can see this as a conceptual simplification, yep. The finalization/zero
termination is still special-cased for fixed vs rotating.
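
To make the position bookkeeping we're discussing concrete, here is a tiny
self-contained userspace model (not the kernel code; simplified, no error
handling, and the struct/field names are only illustrative) of the logical
start_pos/end_pos arithmetic, with FIXED mode as the clamped special case:

  #include <stdio.h>
  #include <string.h>

  /* Toy model: positions are logical and only ever grow; the window
   * [start_pos, end_pos) is what would end up in the user buffer,
   * stored modulo len_total.
   */
  struct toy_log {
      char buf[8];                          /* stands in for ubuf */
      unsigned long long start_pos, end_pos;
      unsigned int len_total;
      int fixed;
  };

  static void toy_emit(struct toy_log *log, const char *msg)
  {
      unsigned int n = strlen(msg);
      unsigned int i;

      if (log->fixed) {
          /* FIXED as a special case: never emit more than what is
           * still unused, so the window never needs to rotate
           */
          unsigned int used = (unsigned int)(log->end_pos - log->start_pos);

          if (n > log->len_total - used)
              n = log->len_total - used;
      }

      log->end_pos += n;
      /* rotating mode: drop the oldest bytes once the window is full */
      if (log->end_pos - log->start_pos > log->len_total)
          log->start_pos = log->end_pos - log->len_total;

      for (i = 0; i < n; i++)
          log->buf[(log->end_pos - n + i) % log->len_total] = msg[i];
  }

  int main(void)
  {
      struct toy_log log = { .len_total = sizeof(log.buf) };

      toy_emit(&log, "abcdef");
      toy_emit(&log, "ghijkl");     /* "abcd" rotates out of the window */
      printf("start=%llu end=%llu\n", log.start_pos, log.end_pos);
      return 0;
  }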

I still fail to see how zero-terminating every single time complicates
everything, and even with your implementation we can support this easily
with no more code (just kbuf[n - 1] = 0; before we delegate to
bpf_vlog_emit()).

So all in all, looking at stats, I don't really see a big
simplification. On the other hand, I spent a considerable time
thinking, debugging, and testing my existing implementation
thoroughly. Then there is also interaction with log_buf==NULL &&
log_size==0 case, I'd need to re-analyze everything again.

How strong do you feel the need for me to redo this tricky part to
save a few lines of C code (and lose easy debuggability at least of
kbuf contents)? I'm a bit on the fence. I noted a few things I would
add (or remove) even to existing code and I'll apply that. But unless
someone comes out and says "let's do it this way", I'd rather not
waste half a day on debugging some random off-by-one error again.

>
> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
> index ab8149448724..b6b59047a594 100644
> --- a/kernel/bpf/log.c
> +++ b/kernel/bpf/log.c
> @@ -54,45 +54,13 @@ static void bpf_vlog_update_len_max(struct
> bpf_verifier_log *log, u32 add_len)
>          log->len_max = len;
>  }
>
> -void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
> -               va_list args)
> +static void bpf_vlog_emit(struct bpf_verifier_log *log, const char
> *kbuf, u32 n)
>  {
>      u64 cur_pos;
> -    u32 new_n, n;
> -
> -    n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);
> -
> -    WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1,
> -          "verifier log line truncated - local buffer too short\n");
> -
> -    if (log->level == BPF_LOG_KERNEL) {
> -        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
> -
> -        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
> -        return;
> -    }
> -
> -    n += 1; /* include terminating zero */
> -    bpf_vlog_update_len_max(log, n);
> -
> -    if (log->level & BPF_LOG_FIXED) {
> -        /* check if we have at least something to put into user buf */
> -        new_n = 0;
> -        if (log->end_pos + 1 < log->len_total) {
> -            new_n = min_t(u32, log->len_total - log->end_pos, n);
> -            log->kbuf[new_n - 1] = '\0';

without this part I can't debug what kernel is actually emitting into
user-space with a simple printk()...

> -        }
> -        cur_pos = log->end_pos;
> -        log->end_pos += n - 1; /* don't count terminating '\0' */
> -
> -        if (log->ubuf && new_n &&
> -            copy_to_user(log->ubuf + cur_pos, log->kbuf, new_n))
> -            goto fail;
> -    } else {
>          u64 new_end, new_start;
>          u32 buf_start, buf_end, new_n;
>
> -        log->kbuf[n - 1] = '\0';

realized we don't really need this zero-termination for rotating case,
will drop it

> +    bpf_vlog_update_len_max(log, n);
>
>          new_end = log->end_pos + n;
>          if (new_end - log->start_pos >= log->len_total)
> @@ -101,7 +69,7 @@ void bpf_verifier_vlog(struct bpf_verifier_log
> *log, const char *fmt,
>              new_start = log->start_pos;
>
>          log->start_pos = new_start;
> -        log->end_pos = new_end - 1; /* don't count terminating '\0' */
> +        log->end_pos = new_end;
>
>          if (!log->ubuf)
>              return;
> @@ -126,35 +94,60 @@ void bpf_verifier_vlog(struct bpf_verifier_log
> *log, const char *fmt,
>          if (buf_start < buf_end) {
>              /* message fits within contiguous chunk of ubuf */
>              if (copy_to_user(log->ubuf + buf_start,
> -                     log->kbuf + n - new_n,
> +                     kbuf + n - new_n,
>                       buf_end - buf_start))
>                  goto fail;
>          } else {
>              /* message wraps around the end of ubuf, copy in two chunks */
>              if (copy_to_user(log->ubuf + buf_start,
> -                     log->kbuf + n - new_n,
> +                     kbuf + n - new_n,
>                       log->len_total - buf_start))
>                  goto fail;
>              if (copy_to_user(log->ubuf,
> -                     log->kbuf + n - buf_end,
> +                     kbuf + n - buf_end,
>                       buf_end))
>                  goto fail;
>          }
> -    }
> -
>      return;
>  fail:
>      log->ubuf = NULL;
>  }
>
> -void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
> +static u32 bpf_vlog_available(const struct bpf_verifier_log *log)
>  {
> -    char zero = 0;
> -    u32 pos;
> +    return log->len_total - (log->end_pos - log->start_pos);
> +}
> +
> +void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
> +               va_list args)
> +{
> +    /* NB: contrary to vsnprintf n can't be larger than sizeof(log->kbuf) */

it can't be even equal to sizeof(log->kbuf)

> +    u32 n = vscnprintf(log->kbuf, sizeof(log->kbuf), fmt, args);
> +
> +    if (log->level == BPF_LOG_KERNEL) {
> +        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
> +
> +        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
> +        return;
> +    }
>
> +    if (log->level & BPF_LOG_FIXED) {
> +        bpf_vlog_update_len_max(log, n);

this made me pause for a second to prove we are not double-accounting
something. We don't, but I find the argument of a simplification a bit
weaker due to this :)

> +        /* avoid rotation by never emitting more than what's unused */
> +        n = min_t(u32, n, bpf_vlog_available(log));
> +    }
> +
> +    bpf_vlog_emit(log, log->kbuf, n);
> +}
> +
> +void bpf_vlog_reset(struct bpf_verifier_log *log, u64 new_pos)
> +{
>      if (!bpf_verifier_log_needed(log) || log->level == BPF_LOG_KERNEL)
>          return;
>
> +    if (WARN_ON_ONCE(new_pos > log->end_pos))
> +        return;

will add this warn, good point

> +
>      /* if position to which we reset is beyond current log window,
>       * then we didn't preserve any useful content and should adjust
>       * start_pos to end up with an empty log (start_pos == end_pos)
> @@ -162,17 +155,6 @@ void bpf_vlog_reset(struct bpf_verifier_log *log,
> u64 new_pos)
>      log->end_pos = new_pos;
>      if (log->end_pos < log->start_pos)
>          log->start_pos = log->end_pos;
> -
> -    if (!log->ubuf)
> -        return;
> -
> -    if (log->level & BPF_LOG_FIXED)
> -        pos = log->end_pos + 1;
> -    else
> -        div_u64_rem(new_pos, log->len_total, &pos);
> -
> -    if (pos < log->len_total && put_user(zero, log->ubuf + pos))
> -        log->ubuf = NULL;

equivalent to what you do in vlog_finalize, right?

>  }
>
>  static void bpf_vlog_reverse_kbuf(char *buf, int len)
> @@ -228,6 +210,7 @@ static bool bpf_vlog_truncated(const struct
> bpf_verifier_log *log)
>
>  int bpf_vlog_finalize(struct bpf_verifier_log *log, u32 *log_size_actual)
>  {
> +    char zero = 0;
>      u32 sublen;
>      int err;
>
> @@ -237,8 +220,20 @@ int bpf_vlog_finalize(struct bpf_verifier_log
> *log, u32 *log_size_actual)
>
>      if (!log->ubuf)
>          goto skip_log_rotate;
> +
> +    if (log->level & BPF_LOG_FIXED) {
> +        bpf_vlog_update_len_max(log, 1);
> +
> +        /* terminate by (potentially) overwriting the last byte */
> +        if (put_user(zero, log->ubuf + min_t(u32, log->end_pos,
> log->len_total-1))
> +            return -EFAULT;
> +    } else {
> +        /* terminate by (potentially) rotating out the first byte */
> +        bpf_vlog_emit(log, &zero, 1);
> +    }

not a big fan of this part where we still do two separate handlings
for two modes

> +
>      /* If we never truncated log, there is nothing to move around. */
> -    if ((log->level & BPF_LOG_FIXED) || log->start_pos == 0)
> +    if (log->start_pos == 0)

yep, BPF_LOG_FIXED implied start_pos == 0, unnecessary check

>          goto skip_log_rotate;
>
>      /* Otherwise we need to rotate log contents to make it start from the
> --
> 2.39.2
Lorenz Bauer April 6, 2023, 5:30 p.m. UTC | #3
On Wed, Apr 5, 2023 at 7:35 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:

> So all in all, looking at stats, I don't really see a big
> simplification. On the other hand, I spent a considerable time
> thinking, debugging, and testing my existing implementation
> thoroughly. Then there is also interaction with log_buf==NULL &&
> log_size==0 case, I'd need to re-analyze everything again.
>
> How strong do you feel the need for me to redo this tricky part to
> save a few lines of C code (and lose easy debuggability at least of
> kbuf contents)? I'm a bit on the fence. I noted a few things I would
> add (or remove) even to existing code and I'll apply that. But unless
> someone comes out and says "let's do it this way", I'd rather not
> waste half a day on debugging some random off-by-one error again.

Well, what are you optimising for? Is it getting the code in or is it
ease of understanding for future readers (including you in a year or
so?)

Feel free to submit it as it is, maybe I can trim down my solution
some more to get rid of some more special cases ;)

> > -            new_n = min_t(u32, log->len_total - log->end_pos, n);
> > -            log->kbuf[new_n - 1] = '\0';
>
> without this part I can't debug what kernel is actually emitting into
> user-space with a simple printk()...

As I just learned, vscnprintf always null terminates so log->kbuf can
always be printed?

> > +void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
> > +               va_list args)
> > +{
> > +    /* NB: contrary to vsnprintf n can't be larger than sizeof(log->kbuf) */
>
> it can't be even equal to sizeof(log->kbuf)

Yes... C string functions are the worst.

>
> > +    u32 n = vscnprintf(log->kbuf, sizeof(log->kbuf), fmt, args);
> > +
> > +    if (log->level == BPF_LOG_KERNEL) {
> > +        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
> > +
> > +        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
> > +        return;
> > +    }
> >
> > +    if (log->level & BPF_LOG_FIXED) {
> > +        bpf_vlog_update_len_max(log, n);
>
> this made me pause for a second to prove we are not double-accounting
> something. We don't, but I find the argument of a simplification a bit
> weaker due to this :)

Yeah. I found a way to get rid of this, I might submit that as a follow up.

> > -    if (log->level & BPF_LOG_FIXED)
> > -        pos = log->end_pos + 1;
> > -    else
> > -        div_u64_rem(new_pos, log->len_total, &pos);
> > -
> > -    if (pos < log->len_total && put_user(zero, log->ubuf + pos))
> > -        log->ubuf = NULL;
>
> equivalent to what you do in vlog_finalize, right?

Yep

> > @@ -237,8 +220,20 @@ int bpf_vlog_finalize(struct bpf_verifier_log
> > *log, u32 *log_size_actual)
> >
> >      if (!log->ubuf)
> >          goto skip_log_rotate;
> > +
> > +    if (log->level & BPF_LOG_FIXED) {
> > +        bpf_vlog_update_len_max(log, 1);
> > +
> > +        /* terminate by (potentially) overwriting the last byte */
> > +        if (put_user(zero, log->ubuf + min_t(u32, log->end_pos,
> > log->len_total-1))
> > +            return -EFAULT;
> > +    } else {
> > +        /* terminate by (potentially) rotating out the first byte */
> > +        bpf_vlog_emit(log, &zero, 1);
> > +    }
>
> not a big fan of this part where we still do two separate handlings
> for two modes

Agreed, but I'm not sure it can be avoided at all since we need to
clamp different parts of the buffer depending on which mode we are in.
Andrii Nakryiko April 6, 2023, 6:53 p.m. UTC | #4
On Thu, Apr 6, 2023 at 10:30 AM Lorenz Bauer <lmb@isovalent.com> wrote:
>
> On Wed, Apr 5, 2023 at 7:35 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>
> > So all in all, looking at stats, I don't really see a big
> > simplification. On the other hand, I spent a considerable time
> > thinking, debugging, and testing my existing implementation
> > thoroughly. Then there is also interaction with log_buf==NULL &&
> > log_size==0 case, I'd need to re-analyze everything again.
> >
> > How strong do you feel the need for me to redo this tricky part to
> > save a few lines of C code (and lose easy debuggability at least of
> > kbuf contents)? I'm a bit on the fence. I noted a few things I would
> > add (or remove) even to existing code and I'll apply that. But unless
> > someone comes out and says "let's do it this way", I'd rather not
> > waste half a day on debugging some random off-by-one error again.
>
> Well, what are you optimising for? Is it getting the code in or is it
> ease of understanding for future readers (including you in a year or
> so?)

I'm just saying that ease of understanding is subjective. So given I'm
not convinced that your approach is simpler, I'd avoid unnecessary
extra changes to the code that I've spent a lot of time testing and
debugging and am pretty confident about.

The point of code review is not to satisfy every nit, but to see if
there are problems with the proposed solution and if something can be
done better. Not *different*, but *better*.

One of the things I wanted to hear feedback on was if the overall UAPI
behavior makes sense. You had concerns about enabling rotating log
mode by default, which I believe are addressed by returning -ENOSPC.
Can you please confirm that this approach is acceptable now? Thanks!

>
> Feel free to submit it as it is, maybe I can trim down my solution
> some more to get rid of some more special cases ;)
>
> > > -            new_n = min_t(u32, log->len_total - log->end_pos, n);
> > > -            log->kbuf[new_n - 1] = '\0';
> >
> > without this part I can't debug what kernel is actually emitting into
> > user-space with a simple printk()...
>
> As I just learned, vscnprintf always null terminates so log->kbuf can
> always be printed?

aren't we adjusting n in this branch and would need to zero-terminate
anew? Yes we can safely print it, it just won't be the contents we are
putting into the user buffer.

>
> > > +void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
> > > +               va_list args)
> > > +{
> > > +    /* NB: contrary to vsnprintf n can't be larger than sizeof(log->kbuf) */
> >
> > it can't be even equal to sizeof(log->kbuf)
>
> Yes... C string functions are the worst.
>
> >
> > > +    u32 n = vscnprintf(log->kbuf, sizeof(log->kbuf), fmt, args);
> > > +
> > > +    if (log->level == BPF_LOG_KERNEL) {
> > > +        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
> > > +
> > > +        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
> > > +        return;
> > > +    }
> > >
> > > +    if (log->level & BPF_LOG_FIXED) {
> > > +        bpf_vlog_update_len_max(log, n);
> >
> > this made me pause for a second to prove we are not double-accounting
> > something. We don't, but I find the argument of a simplification a bit
> > weaker due to this :)
>
> Yeah. I found a way to get rid of this, I might submit that as a follow up.
>
> > > -    if (log->level & BPF_LOG_FIXED)
> > > -        pos = log->end_pos + 1;
> > > -    else
> > > -        div_u64_rem(new_pos, log->len_total, &pos);
> > > -
> > > -    if (pos < log->len_total && put_user(zero, log->ubuf + pos))
> > > -        log->ubuf = NULL;
> >
> > equivalent to what you do in vlog_finalize, right?
>
> Yep
>
> > > @@ -237,8 +220,20 @@ int bpf_vlog_finalize(struct bpf_verifier_log
> > > *log, u32 *log_size_actual)
> > >
> > >      if (!log->ubuf)
> > >          goto skip_log_rotate;
> > > +
> > > +    if (log->level & BPF_LOG_FIXED) {
> > > +        bpf_vlog_update_len_max(log, 1);
> > > +
> > > +        /* terminate by (potentially) overwriting the last byte */
> > > +        if (put_user(zero, log->ubuf + min_t(u32, log->end_pos,
> > > log->len_total-1))
> > > +            return -EFAULT;
> > > +    } else {
> > > +        /* terminate by (potentially) rotating out the first byte */
> > > +        bpf_vlog_emit(log, &zero, 1);
> > > +    }
> >
> > not a big fan of this part where we still do two separate handlings
> > for two modes
>
> Agreed, but I'm not sure it can be avoided at all since we need to
> clamp different parts of the buffer depending on which mode we are in.
Andrii Nakryiko April 6, 2023, 11:42 p.m. UTC | #5
On Thu, Apr 6, 2023 at 11:53 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Apr 6, 2023 at 10:30 AM Lorenz Bauer <lmb@isovalent.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 7:35 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >
> > > So all in all, looking at stats, I don't really see a big
> > > simplification. On the other hand, I spent a considerable time
> > > thinking, debugging, and testing my existing implementation
> > > thoroughly. Then there is also interaction with log_buf==NULL &&
> > > log_size==0 case, I'd need to re-analyze everything again.
> > >
> > > How strong do you feel the need for me to redo this tricky part to
> > > save a few lines of C code (and lose easy debuggability at least of
> > > kbuf contents)? I'm a bit on the fence. I noted a few things I would
> > > add (or remove) even to existing code and I'll apply that. But unless
> > > someone comes out and says "let's do it this way", I'd rather not
> > > waste half a day on debugging some random off-by-one error again.
> >
> > Well, what are you optimising for? Is it getting the code in or is it
> > ease of understanding for future readers (including you in a year or
> > so?)
>
> I'm just saying that ease of understanding is subjective. So given I'm
> not convinced that your approach is simpler, I'd avoid unnecessary
> extra changes to the code that I've spent a lot of time testing and
> debugging and am pretty confident about.
>
> The point of code review is not to satisfy every nit, but to see if
> there are problems with the proposed solution and if something can be
> done better. Not *different*, but *better*.
>

So, I made an honest attempt to implement the idea of BPF_LOG_FIXED being
just a partial case of rotating mode by adjusting N. And this breaks down
once we start calculating len_max and using log->end_pos as a logical
position that can go beyond log->len_total. I had to abandon this idea,
sorry. As I mentioned before, the interactions with log_size==0 are not
obvious and straightforward.

I did more testing, though, verifying fixed mode truncation. That actually
caught a corner case when len_total = 1, in which case the buffer wasn't
zero-terminated. So at least that's the payoff. :)


> One of the things I wanted to hear feedback on was if the overall UAPI
> behavior makes sense. You had concerns about enabling rotating log
> mode by default, which I believe are addressed by returning -ENOSPC.
> Can you please confirm that this approach is acceptable now? Thanks!
>
> >
> > Feel free to submit it as it is, maybe I can trim down my solution
> > some more to get rid of some more special cases ;)
> >
> > > > -            new_n = min_t(u32, log->len_total - log->end_pos, n);
> > > > -            log->kbuf[new_n - 1] = '\0';
> > >
> > > without this part I can't debug what kernel is actually emitting into
> > > user-space with a simple printk()...
> >
> > As I just learned, vscnprintf always null terminates so log->kbuf can
> > always be printed?
>
> aren't we adjusting n in this branch and would need to zero-terminate
> anew? Yes we can safely print it, it just won't be the contents we are
> putting into the user buffer.
>
> >
> > > > +void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
> > > > +               va_list args)
> > > > +{
> > > > +    /* NB: contrary to vsnprintf n can't be larger than sizeof(log->kbuf) */
> > >
> > > it can't be even equal to sizeof(log->kbuf)
> >
> > Yes... C string functions are the worst.
> >
> > >
> > > > +    u32 n = vscnprintf(log->kbuf, sizeof(log->kbuf), fmt, args);
> > > > +
> > > > +    if (log->level == BPF_LOG_KERNEL) {
> > > > +        bool newline = n > 0 && log->kbuf[n - 1] == '\n';
> > > > +
> > > > +        pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n");
> > > > +        return;
> > > > +    }
> > > >
> > > > +    if (log->level & BPF_LOG_FIXED) {
> > > > +        bpf_vlog_update_len_max(log, n);
> > >
> > > this made me pause for a second to prove we are not double-accounting
> > > something. We don't, but I find the argument of a simplification a bit
> > > weaker due to this :)
> >
> > Yeah. I found a way to get rid of this, I might submit that as a follow up.
> >
> > > > -    if (log->level & BPF_LOG_FIXED)
> > > > -        pos = log->end_pos + 1;
> > > > -    else
> > > > -        div_u64_rem(new_pos, log->len_total, &pos);
> > > > -
> > > > -    if (pos < log->len_total && put_user(zero, log->ubuf + pos))
> > > > -        log->ubuf = NULL;
> > >
> > > equivalent to what you do in vlog_finalize, right?
> >
> > Yep
> >
> > > > @@ -237,8 +220,20 @@ int bpf_vlog_finalize(struct bpf_verifier_log
> > > > *log, u32 *log_size_actual)
> > > >
> > > >      if (!log->ubuf)
> > > >          goto skip_log_rotate;
> > > > +
> > > > +    if (log->level & BPF_LOG_FIXED) {
> > > > +        bpf_vlog_update_len_max(log, 1);
> > > > +
> > > > +        /* terminate by (potentially) overwriting the last byte */
> > > > +        if (put_user(zero, log->ubuf + min_t(u32, log->end_pos,
> > > > log->len_total-1))
> > > > +            return -EFAULT;
> > > > +    } else {
> > > > +        /* terminate by (potentially) rotating out the first byte */
> > > > +        bpf_vlog_emit(log, &zero, 1);
> > > > +    }
> > >
> > > not a big fan of this part where we still do two separate handlings
> > > for two modes
> >
> > Agreed, but I'm not sure it can be avoided at all since we need to
> > clamp different parts of the buffer depending on which mode we are in.