diff mbox series

[bpf-next,v3,3/6] bpf: Add a bpf_snprintf helper

Message ID 20210412153754.235500-4-revest@chromium.org (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series Add a snprintf eBPF helper | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 8 maintainers not CCed: netdev@vger.kernel.org joe@cilium.io mingo@redhat.com kafai@fb.com rostedt@goodmis.org john.fastabend@gmail.com songliubraving@fb.com quentin@isovalent.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 11958 this patch: 11958
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning CHECK: Alignment should match open parenthesis WARNING: line length of 87 exceeds 80 columns
netdev/build_allmodconfig_warn success Errors and warnings before: 12445 this patch: 12443
netdev/header_inline success Link

Commit Message

Florent Revest April 12, 2021, 3:37 p.m. UTC
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/bpf.h            |  6 ++++
 include/uapi/linux/bpf.h       | 28 +++++++++++++++++++
 kernel/bpf/helpers.c           |  2 ++
 kernel/bpf/verifier.c          | 41 ++++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c       | 50 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++
 6 files changed, 155 insertions(+)

Comments

kernel test robot April 12, 2021, 7:11 p.m. UTC | #1
Hi Florent,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: i386-randconfig-s002-20210412 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-280-g2cd6d34e-dirty
        # https://github.com/0day-ci/linux/commit/168301dda58989eca274d5f5c926675540010e13
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
        git checkout 168301dda58989eca274d5f5c926675540010e13
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)
   kernel/bpf/verifier.c:1674:41: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12051:76: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12489:81: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12493:81: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12497:81: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12501:79: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12505:78: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12509:79: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12513:78: sparse: sparse: subtraction of functions? Share your drugs
   kernel/bpf/verifier.c:12557:38: sparse: sparse: subtraction of functions? Share your drugs
>> kernel/bpf/verifier.c:5944:23: sparse: sparse: non size-preserving integer to pointer cast

vim +5944 kernel/bpf/verifier.c

  5920	
  5921	static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
  5922					   struct bpf_reg_state *regs)
  5923	{
  5924		struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
  5925		struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
  5926		struct bpf_map *fmt_map = fmt_reg->map_ptr;
  5927		int err, fmt_map_off, num_args;
  5928		u64 fmt_addr;
  5929		char *fmt;
  5930	
  5931		/* data must be an array of u64 */
  5932		if (data_len_reg->var_off.value % 8)
  5933			return -EINVAL;
  5934		num_args = data_len_reg->var_off.value / 8;
  5935	
  5936		/* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
  5937		 * and map_direct_value_addr is set.
  5938		 */
  5939		fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
  5940		err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
  5941							  fmt_map_off);
  5942		if (err)
  5943			return err;
> 5944		fmt = (char *)fmt_addr + fmt_map_off;
  5945	
  5946		/* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
  5947		 * can focus on validating the format specifiers.
  5948		 */
  5949		err = bpf_printf_prepare(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
  5950		if (err < 0)
  5951			verbose(env, "Invalid format string\n");
  5952	
  5953		return err;
  5954	}
  5955	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
kernel test robot April 12, 2021, 8:32 p.m. UTC | #2
Hi Florent,

I love your patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/168301dda58989eca274d5f5c926675540010e13
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
        git checkout 168301dda58989eca274d5f5c926675540010e13
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   m68k-linux-ld: kernel/bpf/verifier.o: in function `check_helper_call.isra.0':
>> verifier.c:(.text+0xf79e): undefined reference to `bpf_printf_prepare'
   m68k-linux-ld: kernel/bpf/helpers.o: in function `bpf_base_func_proto':
>> helpers.c:(.text+0xd82): undefined reference to `bpf_snprintf_proto'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
kernel test robot April 12, 2021, 8:40 p.m. UTC | #3
Hi Florent,

I love your patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: mips-decstation_r4k_defconfig (attached as .config)
compiler: mipsel-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/168301dda58989eca274d5f5c926675540010e13
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Florent-Revest/Add-a-snprintf-eBPF-helper/20210412-233921
        git checkout 168301dda58989eca274d5f5c926675540010e13
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   mipsel-linux-ld: kernel/bpf/verifier.o: in function `check_helper_call.isra.0':
   verifier.c:(.text+0x13bac): undefined reference to `bpf_printf_prepare'
   mipsel-linux-ld: kernel/bpf/helpers.o: in function `bpf_base_func_proto':
   helpers.c:(.text+0x9c8): undefined reference to `bpf_snprintf_proto'
>> mipsel-linux-ld: helpers.c:(.text+0x9d0): undefined reference to `bpf_snprintf_proto'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Andrii Nakryiko April 13, 2021, 11:16 p.m. UTC | #4
On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
>
> The implementation takes inspiration from the existing bpf_trace_printk
> helper but there are a few differences:
>
> To allow for a large number of format-specifiers, parameters are
> provided in an array, like in bpf_seq_printf.
>
> Because the output string takes two arguments and the array of
> parameters also takes two arguments, the format string needs to fit in
> one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
> a zero-terminated read-only map so we don't need a format string length
> arg.
>
> Because the format-string is known at verification time, we also do
> a first pass of format string validation in the verifier logic. This
> makes debugging easier.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/bpf.h            |  6 ++++
>  include/uapi/linux/bpf.h       | 28 +++++++++++++++++++
>  kernel/bpf/helpers.c           |  2 ++
>  kernel/bpf/verifier.c          | 41 ++++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c       | 50 ++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++
>  6 files changed, 155 insertions(+)
>

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5f46dd6f3383..d4020e5f91ee 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5918,6 +5918,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
>         return state->acquired_refs ? -EINVAL : 0;
>  }
>
> +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> +                                  struct bpf_reg_state *regs)
> +{
> +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> +       int err, fmt_map_off, num_args;
> +       u64 fmt_addr;
> +       char *fmt;
> +
> +       /* data must be an array of u64 */
> +       if (data_len_reg->var_off.value % 8)
> +               return -EINVAL;
> +       num_args = data_len_reg->var_off.value / 8;
> +
> +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> +        * and map_direct_value_addr is set.
> +        */
> +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> +                                                 fmt_map_off);
> +       if (err)
> +               return err;
> +       fmt = (char *)fmt_addr + fmt_map_off;
> +

bot complained about lack of (long) cast before fmt_addr, please address


[...]

> +       /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> +        * all of them to snprintf().
> +        */
> +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> +               BPF_CAST_FMT_ARG(11, args, mod));
> +
> +       put_fmt_tmp_buf();

reading this for at least 3rd time, this put_fmt_tmp_buf() looks a bit
out of place and kind of random. I think bpf_printf_cleanup() name
pairs with bpf_printf_prepare() better.

> +
> +       return err + 1;

snprintf() already returns string length *including* terminating zero,
so this is wrong


> +}
> +

[...]
Florent Revest April 14, 2021, 9:46 a.m. UTC | #5
On Wed, Apr 14, 2021 at 1:16 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> > +                                  struct bpf_reg_state *regs)
> > +{
> > +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> > +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> > +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> > +       int err, fmt_map_off, num_args;
> > +       u64 fmt_addr;
> > +       char *fmt;
> > +
> > +       /* data must be an array of u64 */
> > +       if (data_len_reg->var_off.value % 8)
> > +               return -EINVAL;
> > +       num_args = data_len_reg->var_off.value / 8;
> > +
> > +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> > +        * and map_direct_value_addr is set.
> > +        */
> > +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> > +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> > +                                                 fmt_map_off);
> > +       if (err)
> > +               return err;
> > +       fmt = (char *)fmt_addr + fmt_map_off;
> > +
>
> bot complained about lack of (long) cast before fmt_addr, please address

Will do.

> > +       /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> > +        * all of them to snprintf().
> > +        */
> > +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> > +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> > +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> > +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> > +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> > +               BPF_CAST_FMT_ARG(11, args, mod));
> > +
> > +       put_fmt_tmp_buf();
>
> reading this for at least 3rd time, this put_fmt_tmp_buf() looks a bit
> out of place and kind of random. I think bpf_printf_cleanup() name
> pairs with bpf_printf_prepare() better.

Yes, I thought it would be clever to name that function
put_fmt_tmp_buf() as a clear parallel to try_get_fmt_tmp_buf() but
because it only puts the buffer if it is used and because they get
called in two different contexts, it's after all maybe not such a
clever name... I'll revert to bpf_printf_cleanup(). Thank you for your
patience with my naming adventures! :)

> > +
> > +       return err + 1;
>
> snprintf() already returns string length *including* terminating zero,
> so this is wrong

lib/vsprintf.c says:
 * The return value is the number of characters which would be
 * generated for the given input, excluding the trailing null,
 * as per ISO C99.

Also if I look at the "no arg" test case in the selftest patch.
"simple case" is asserted to return 12 which seems correct to me
(includes the terminating zero only once). Am I missing something ?

However that makes me wonder whether it would be more appropriate to
return the value excluding the trailing null. On one hand it makes
sense to be coherent with other BPF helpers that include the trailing
zero (as discussed in patch v1), on the other hand the helper is
clearly named after the standard "snprintf" function and it's likely
that users will assume it works the same as the std snprintf.
Florent Revest April 14, 2021, 1:32 p.m. UTC | #6
On Mon, Apr 12, 2021 at 10:32 PM kernel test robot <lkp@intel.com> wrote:
>    m68k-linux-ld: kernel/bpf/verifier.o: in function `check_helper_call.isra.0':
> >> verifier.c:(.text+0xf79e): undefined reference to `bpf_printf_prepare'
>    m68k-linux-ld: kernel/bpf/helpers.o: in function `bpf_base_func_proto':
> >> helpers.c:(.text+0xd82): undefined reference to `bpf_snprintf_proto'

I'll move the implementation of bpf_printf_prepare/bpf_printf_cleanup
and bpf_snprintf to kernel/bpf/helpers.c so that they all get built on
kernels with CONFIG_BPF_SYSCALL but not CONFIG_BPF_EVENTS.
Geert Uytterhoeven April 14, 2021, 6:02 p.m. UTC | #7
Hi Andrii,

On Wed, Apr 14, 2021 at 9:41 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > The implementation takes inspiration from the existing bpf_trace_printk
> > helper but there are a few differences:
> >
> > To allow for a large number of format-specifiers, parameters are
> > provided in an array, like in bpf_seq_printf.
> >
> > Because the output string takes two arguments and the array of
> > parameters also takes two arguments, the format string needs to fit in
> > one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
> > a zero-terminated read-only map so we don't need a format string length
> > arg.
> >
> > Because the format-string is known at verification time, we also do
> > a first pass of format string validation in the verifier logic. This
> > makes debugging easier.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
> >  include/linux/bpf.h            |  6 ++++
> >  include/uapi/linux/bpf.h       | 28 +++++++++++++++++++
> >  kernel/bpf/helpers.c           |  2 ++
> >  kernel/bpf/verifier.c          | 41 ++++++++++++++++++++++++++++
> >  kernel/trace/bpf_trace.c       | 50 ++++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++
> >  6 files changed, 155 insertions(+)
> >
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 5f46dd6f3383..d4020e5f91ee 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5918,6 +5918,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
> >         return state->acquired_refs ? -EINVAL : 0;
> >  }
> >
> > +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> > +                                  struct bpf_reg_state *regs)
> > +{
> > +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> > +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> > +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> > +       int err, fmt_map_off, num_args;
> > +       u64 fmt_addr;
> > +       char *fmt;
> > +
> > +       /* data must be an array of u64 */
> > +       if (data_len_reg->var_off.value % 8)
> > +               return -EINVAL;
> > +       num_args = data_len_reg->var_off.value / 8;
> > +
> > +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> > +        * and map_direct_value_addr is set.
> > +        */
> > +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> > +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> > +                                                 fmt_map_off);
> > +       if (err)
> > +               return err;
> > +       fmt = (char *)fmt_addr + fmt_map_off;
> > +
>
> bot complained about lack of (long) cast before fmt_addr, please address

(uintptr_t), I assume?

Gr{oetje,eeting}s,

                        Geert
Florent Revest April 14, 2021, 6:30 p.m. UTC | #8
Hey Geert! :)

On Wed, Apr 14, 2021 at 8:02 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Wed, Apr 14, 2021 at 9:41 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > > +       fmt = (char *)fmt_addr + fmt_map_off;
> > > +
> >
> > bot complained about lack of (long) cast before fmt_addr, please address
>
> (uintptr_t), I assume?

(uintptr_t) seems more correct to me as well. However, I just had a
look at the rest of verifier.c and (long) casts are already used
pretty much everywhere whereas uintptr_t isn't used yet.
I'll send a v4 with a long cast for the sake of consistency with the
rest of the verifier.
Andrii Nakryiko April 14, 2021, 10:57 p.m. UTC | #9
On Wed, Apr 14, 2021 at 2:46 AM Florent Revest <revest@chromium.org> wrote:
>
> On Wed, Apr 14, 2021 at 1:16 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > > +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> > > +                                  struct bpf_reg_state *regs)
> > > +{
> > > +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> > > +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> > > +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> > > +       int err, fmt_map_off, num_args;
> > > +       u64 fmt_addr;
> > > +       char *fmt;
> > > +
> > > +       /* data must be an array of u64 */
> > > +       if (data_len_reg->var_off.value % 8)
> > > +               return -EINVAL;
> > > +       num_args = data_len_reg->var_off.value / 8;
> > > +
> > > +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> > > +        * and map_direct_value_addr is set.
> > > +        */
> > > +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> > > +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> > > +                                                 fmt_map_off);
> > > +       if (err)
> > > +               return err;
> > > +       fmt = (char *)fmt_addr + fmt_map_off;
> > > +
> >
> > bot complained about lack of (long) cast before fmt_addr, please address
>
> Will do.
>
> > > +       /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> > > +        * all of them to snprintf().
> > > +        */
> > > +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > > +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> > > +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> > > +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> > > +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> > > +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> > > +               BPF_CAST_FMT_ARG(11, args, mod));
> > > +
> > > +       put_fmt_tmp_buf();
> >
> > reading this for at least 3rd time, this put_fmt_tmp_buf() looks a bit
> > out of place and kind of random. I think bpf_printf_cleanup() name
> > pairs with bpf_printf_prepare() better.
>
> Yes, I thought it would be clever to name that function
> put_fmt_tmp_buf() as a clear parallel to try_get_fmt_tmp_buf() but
> because it only puts the buffer if it is used and because they get
> called in two different contexts, it's after all maybe not such a
> clever name... I'll revert to bpf_printf_cleanup(). Thank you for your
> patience with my naming adventures! :)
>
> > > +
> > > +       return err + 1;
> >
> > snprintf() already returns string length *including* terminating zero,
> > so this is wrong
>
> lib/vsprintf.c says:
>  * The return value is the number of characters which would be
>  * generated for the given input, excluding the trailing null,
>  * as per ISO C99.
>
> Also if I look at the "no arg" test case in the selftest patch.
> "simple case" is asserted to return 12 which seems correct to me
> (includes the terminating zero only once). Am I missing something ?
>

no, you are right, but that means that bpf_trace_printk is broken, it
doesn't do + 1 (which threw me off here), shall we fix that?

> However that makes me wonder whether it would be more appropriate to
> return the value excluding the trailing null. On one hand it makes
> sense to be coherent with other BPF helpers that include the trailing
> zero (as discussed in patch v1), on the other hand the helper is
> clearly named after the standard "snprintf" function and it's likely
> that users will assume it works the same as the std snprintf.


Having zero included simplifies BPF code tremendously for cases like
bpf_probe_read_str(). So no, let's stick with including zero
terminator in return size.
Andrii Nakryiko April 14, 2021, 10:58 p.m. UTC | #10
On Wed, Apr 14, 2021 at 11:30 AM Florent Revest <revest@chromium.org> wrote:
>
> Hey Geert! :)
>
> On Wed, Apr 14, 2021 at 8:02 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Wed, Apr 14, 2021 at 9:41 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > > > +       fmt = (char *)fmt_addr + fmt_map_off;
> > > > +
> > >
> > > bot complained about lack of (long) cast before fmt_addr, please address
> >
> > (uintptr_t), I assume?
>
> (uintptr_t) seems more correct to me as well. However, I just had a
> look at the rest of verifier.c and (long) casts are already used
> pretty much everywhere whereas uintptr_t isn't used yet.
> I'll send a v4 with a long cast for the sake of consistency with the
> rest of the verifier.

right, I don't care about long or uintptr_t, both are guaranteed to
work, I just remember seeing a lot of code with (long) cast. I have no
preference.
Geert Uytterhoeven April 15, 2021, 7:18 a.m. UTC | #11
Hi Andrii,

On Thu, Apr 15, 2021 at 12:58 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Wed, Apr 14, 2021 at 11:30 AM Florent Revest <revest@chromium.org> wrote:
> > On Wed, Apr 14, 2021 at 8:02 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Wed, Apr 14, 2021 at 9:41 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > > > > +       fmt = (char *)fmt_addr + fmt_map_off;
> > > > > +
> > > >
> > > > bot complained about lack of (long) cast before fmt_addr, please address
> > >
> > > (uintptr_t), I assume?
> >
> > (uintptr_t) seems more correct to me as well. However, I just had a
> > look at the rest of verifier.c and (long) casts are already used
> > pretty much everywhere whereas uintptr_t isn't used yet.
> > I'll send a v4 with a long cast for the sake of consistency with the
> > rest of the verifier.
>
> right, I don't care about long or uintptr_t, both are guaranteed to
> work, I just remember seeing a lot of code with (long) cast. I have no
> preference.

AFAIR, uintptr_t was introduced only in C99. Early Linux code predates that,
hence uses long, and this behavior was of course copied to new code.

Please use uintptr_t in new code.

Gr{oetje,eeting}s,

                        Geert
Florent Revest April 15, 2021, 9:34 a.m. UTC | #12
On Thu, Apr 15, 2021 at 12:57 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Apr 14, 2021 at 2:46 AM Florent Revest <revest@chromium.org> wrote:
> >
> > On Wed, Apr 14, 2021 at 1:16 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > On Mon, Apr 12, 2021 at 8:38 AM Florent Revest <revest@chromium.org> wrote:
> > > > +
> > > > +       return err + 1;
> > >
> > > snprintf() already returns string length *including* terminating zero,
> > > so this is wrong
> >
> > lib/vsprintf.c says:
> >  * The return value is the number of characters which would be
> >  * generated for the given input, excluding the trailing null,
> >  * as per ISO C99.
> >
> > Also if I look at the "no arg" test case in the selftest patch.
> > "simple case" is asserted to return 12 which seems correct to me
> > (includes the terminating zero only once). Am I missing something ?
> >
>
> no, you are right, but that means that bpf_trace_printk is broken, it
> doesn't do + 1 (which threw me off here), shall we fix that?

Answered in the 1/6 thread

> > However that makes me wonder whether it would be more appropriate to
> > return the value excluding the trailing null. On one hand it makes
> > sense to be coherent with other BPF helpers that include the trailing
> > zero (as discussed in patch v1), on the other hand the helper is
> > clearly named after the standard "snprintf" function and it's likely
> > that users will assume it works the same as the std snprintf.
>
>
> Having zero included simplifies BPF code tremendously for cases like
> bpf_probe_read_str(). So no, let's stick with including zero
> terminator in return size.

Cool :)
diff mbox series

Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7d389eeee0b3..a3650fc93068 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1953,6 +1953,7 @@  extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
@@ -2078,4 +2079,9 @@  int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type;
+int bpf_printf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
+		       u64 *final_args, enum bpf_printf_mod_type *mod,
+		       u32 num_args);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 49371eba98ba..40546d4676f1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4671,6 +4671,33 @@  union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ *	Description
+ *		Outputs a string into the **str** buffer of size **str_size**
+ *		based on a format string stored in a read-only map pointed by
+ *		**fmt**.
+ *
+ *		Each format specifier in **fmt** corresponds to one u64 element
+ *		in the **data** array. For strings and pointers where pointees
+ *		are accessed, only the pointer values are stored in the *data*
+ *		array. The *data_len* is the size of *data* in bytes.
+ *
+ *		Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ *		memory. Reading kernel memory may fail due to either invalid
+ *		address or valid address but requiring a major memory fault. If
+ *		reading kernel memory fails, the string for **%s** will be an
+ *		empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ *		Not returning error to bpf program is consistent with what
+ *		**bpf_trace_printk**\ () does for now.
+ *
+ *	Return
+ *		The strictly positive length of the formatted string, including
+ *		the trailing zero character. If the return value is greater than
+ *		**str_size**, **str** contains a truncated string, guaranteed to
+ *		be zero-terminated except when **str_size** is 0.
+ *
+ *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4838,6 +4865,7 @@  union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(snprintf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index f306611c4ddf..ec45c7526924 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -757,6 +757,8 @@  bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_probe_read_kernel_str_proto;
 	case BPF_FUNC_snprintf_btf:
 		return &bpf_snprintf_btf_proto;
+	case BPF_FUNC_snprintf:
+		return &bpf_snprintf_proto;
 	default:
 		return NULL;
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5f46dd6f3383..d4020e5f91ee 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5918,6 +5918,41 @@  static int check_reference_leak(struct bpf_verifier_env *env)
 	return state->acquired_refs ? -EINVAL : 0;
 }
 
+static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
+				   struct bpf_reg_state *regs)
+{
+	struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
+	struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
+	struct bpf_map *fmt_map = fmt_reg->map_ptr;
+	int err, fmt_map_off, num_args;
+	u64 fmt_addr;
+	char *fmt;
+
+	/* data must be an array of u64 */
+	if (data_len_reg->var_off.value % 8)
+		return -EINVAL;
+	num_args = data_len_reg->var_off.value / 8;
+
+	/* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
+	 * and map_direct_value_addr is set.
+	 */
+	fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
+	err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
+						  fmt_map_off);
+	if (err)
+		return err;
+	fmt = (char *)fmt_addr + fmt_map_off;
+
+	/* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
+	 * can focus on validating the format specifiers.
+	 */
+	err = bpf_printf_prepare(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
+	if (err < 0)
+		verbose(env, "Invalid format string\n");
+
+	return err;
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -6032,6 +6067,12 @@  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 			return -EINVAL;
 	}
 
+	if (func_id == BPF_FUNC_snprintf) {
+		err = check_bpf_snprintf_call(env, regs);
+		if (err < 0)
+			return err;
+	}
+
 	/* reset caller saved regs */
 	for (i = 0; i < CALLER_SAVED_REGS; i++) {
 		mark_reg_not_init(env, regs, caller_saved[i]);
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3ce9aeee6681..990ed98d2842 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1238,6 +1238,54 @@  const struct bpf_func_proto bpf_snprintf_btf_proto = {
 	.arg5_type	= ARG_ANYTHING,
 };
 
+#define MAX_SNPRINTF_VARARGS		12
+
+BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
+	   const void *, data, u32, data_len)
+{
+	enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
+	u64 args[MAX_SNPRINTF_VARARGS];
+	int err, num_args;
+
+	if (data_len % 8 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
+	    (data_len && !data))
+		return -EINVAL;
+	num_args = data_len / 8;
+
+	/* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
+	 * can safely give an unbounded size.
+	 */
+	err = bpf_printf_prepare(fmt, UINT_MAX, data, args, mod, num_args);
+	if (err < 0)
+		return err;
+
+	/* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
+	 * all of them to snprintf().
+	 */
+	err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
+		BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
+		BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
+		BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
+		BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
+		BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
+		BPF_CAST_FMT_ARG(11, args, mod));
+
+	put_fmt_tmp_buf();
+
+	return err + 1;
+}
+
+const struct bpf_func_proto bpf_snprintf_proto = {
+	.func		= bpf_snprintf,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_MEM_OR_NULL,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_PTR_TO_CONST_STR,
+	.arg4_type	= ARG_PTR_TO_MEM_OR_NULL,
+	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
+};
+
 const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1340,6 +1388,8 @@  bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_task_storage_delete_proto;
 	case BPF_FUNC_for_each_map_elem:
 		return &bpf_for_each_map_elem_proto;
+	case BPF_FUNC_snprintf:
+		return &bpf_snprintf_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 69902603012c..ffdd2ae18c69 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4665,6 +4665,33 @@  union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ *	Description
+ *		Outputs a string into the **str** buffer of size **str_size**
+ *		based on a format string stored in a read-only map pointed by
+ *		**fmt**.
+ *
+ *		Each format specifier in **fmt** corresponds to one u64 element
+ *		in the **data** array. For strings and pointers where pointees
+ *		are accessed, only the pointer values are stored in the *data*
+ *		array. The *data_len* is the size of *data* in bytes.
+ *
+ *		Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ *		memory. Reading kernel memory may fail due to either invalid
+ *		address or valid address but requiring a major memory fault. If
+ *		reading kernel memory fails, the string for **%s** will be an
+ *		empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ *		Not returning error to bpf program is consistent with what
+ *		**bpf_trace_printk**\ () does for now.
+ *
+ *	Return
+ *		The strictly positive length of the formatted string, including
+ *		the trailing zero character. If the return value is greater than
+ *		**str_size**, **str** contains a truncated string, guaranteed to
+ *		be zero-terminated except when **str_size** is 0.
+ *
+ *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4832,6 +4859,7 @@  union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(snprintf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper