[bpf-next,v3,1/2] bpf: Add bpf_copy_from_user_str kfunc

Message ID 20240813012528.3566133-1-linux@jordanrome.com (mailing list archive)
State Superseded
Delegated to: BPF
Series: [bpf-next,v3,1/2] bpf: Add bpf_copy_from_user_str kfunc

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 42 this patch: 42
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 9 maintainers not CCed: sdf@fomichev.me eddyz87@gmail.com haoluo@google.com jolsa@kernel.org song@kernel.org yonghong.song@linux.dev kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com
netdev/build_clang success Errors and warnings before: 43 this patch: 43
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 96 this patch: 97
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc fail Errors and warnings before: 9 this patch: 10
netdev/source_inline success Was 0 now: 0

Commit Message

Jordan Rome Aug. 13, 2024, 1:25 a.m. UTC
This adds a kfunc wrapper around strncpy_from_user,
which can be called from sleepable BPF programs.

This matches the non-sleepable 'bpf_probe_read_user_str'
helper.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
---
 kernel/bpf/helpers.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

--
2.43.5

Comments

Alexei Starovoitov Aug. 13, 2024, 2:10 a.m. UTC | #1
On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
>
> This adds a kfunc wrapper around strncpy_from_user,
> which can be called from sleepable BPF programs.
>
> This matches the non-sleepable 'bpf_probe_read_user_str'
> helper.
>
> Signed-off-by: Jordan Rome <linux@jordanrome.com>
> ---
>  kernel/bpf/helpers.c | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index d02ae323996b..e87d5df658cb 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2939,6 +2939,41 @@ __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
>         bpf_mem_free(&bpf_global_ma, kit->bits);
>  }
>
> +/**
> + * bpf_copy_from_user_str() - Copy a string from an unsafe user address
> + * @dst:             Destination address, in kernel space.  This buffer must be at
> + *                   least @dst__szk bytes long.
> + * @dst__szk:        Maximum number of bytes to copy, including the trailing NUL.
> + * @unsafe_ptr__ign: Source address, in user space.
> + *
> + * Copies a NUL-terminated string from userspace to BPF space. If user string is
> + * too long this will still ensure zero termination in the dst buffer unless
> + * buffer size is 0.
> + */
> +__bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__szk, const void __user *unsafe_ptr__ign)
> +{
> +       int ret;
> +       int count;
> +
> +       if (unlikely(!dst__szk))
> +               return 0;
> +
> +       count = dst__szk - 1;
> +       if (unlikely(!count)) {
> +               ((char *)dst)[0] = '\0';
> +               return 1;
> +       }
> +
> +       ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
> +       if (ret >= 0) {
> +               if (ret == count)
> +                       ((char *)dst)[ret] = '\0';
> +               ret++;
> +       }
> +
> +       return ret;
> +}

The above will not pad the buffer and it will create instability
when the target buffer is a part of the map key. Consider:

struct map_key {
   char str[100];
};
struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __type(key, struct map_key);
} hash SEC(".maps");

struct map_key key;
bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);

The verifier will think that all of the 'key' is initialized,
but for short strings the key will have garbage.

bpf_probe_read_kernel_str() has the same issue as above, but
let's fix it here first and update read_kernel_str() later.

pw-bot: cr
Jordan Rome Aug. 13, 2024, 10:27 a.m. UTC | #2
On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> >
> > [...]
>
> The above will not pad the buffer and it will create instability
> when the target buffer is a part of the map key. Consider:
>
> struct map_key {
>    char str[100];
> };
> struct {
>         __uint(type, BPF_MAP_TYPE_HASH);
>         __type(key, struct map_key);
> } hash SEC(".maps");
>
> struct map_key key;
> bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
>
> The verifier will think that all of the 'key' is initialized,
> but for short strings the key will have garbage.
>
> bpf_probe_read_kernel_str() has the same issue as above, but
> let's fix it here first and update read_kernel_str() later.
>
> pw-bot: cr

You're saying we should always do a memset using `dst__szk` on success
of copying the string?
Jordan Rome Aug. 13, 2024, 1:30 p.m. UTC | #3
On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@jordanrome.com> wrote:
>
> On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> > >
> > > [...]
> >
> > The above will not pad the buffer and it will create instability
> > when the target buffer is a part of the map key. Consider:
> >
> > struct map_key {
> >    char str[100];
> > };
> > struct {
> >         __uint(type, BPF_MAP_TYPE_HASH);
> >         __type(key, struct map_key);
> > } hash SEC(".maps");
> >
> > struct map_key key;
> > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
> >
> > The verifier will think that all of the 'key' is initialized,
> > but for short strings the key will have garbage.
> >
> > bpf_probe_read_kernel_str() has the same issue as above, but
> > let's fix it here first and update read_kernel_str() later.
> >
> > pw-bot: cr
>
> You're saying we should always do a memset using `dst__szk` on success
> of copying the string?

Something like this?
```
ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
if (ret >= 0) {
	if (ret <= count)
		memset((char *)dst + ret, 0, dst__szk - ret);
	ret++;
}
```
Alexei Starovoitov Aug. 13, 2024, 4:07 p.m. UTC | #4
On Tue, Aug 13, 2024 at 6:30 AM Jordan Rome <linux@jordanrome.com> wrote:
>
> On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@jordanrome.com> wrote:
> >
> > On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> > > >
> > > > [...]
> > >
> > > The above will not pad the buffer and it will create instability
> > > when the target buffer is a part of the map key. Consider:
> > >
> > > struct map_key {
> > >    char str[100];
> > > };
> > > struct {
> > >         __uint(type, BPF_MAP_TYPE_HASH);
> > >         __type(key, struct map_key);
> > > } hash SEC(".maps");
> > >
> > > struct map_key key;
> > > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
> > >
> > > The verifier will think that all of the 'key' is initialized,
> > > but for short strings the key will have garbage.
> > >
> > > bpf_probe_read_kernel_str() has the same issue as above, but
> > > let's fix it here first and update read_kernel_str() later.
> > >
> > > pw-bot: cr
> >
> > You're saying we should always do a memset using `dst__szk` on success
> > of copying the string?
>
> Something like this?
> ```
> ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
>   if (ret >= 0) {
>     if (ret <= count)
>        memset((char *)dst + ret, 0, dst__szk - ret);
>     ret++;
> }
> ```

yep. something like this. I didn't check the math.
Andrii Nakryiko Aug. 13, 2024, 6:10 p.m. UTC | #5
On Tue, Aug 13, 2024 at 9:08 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Aug 13, 2024 at 6:30 AM Jordan Rome <linux@jordanrome.com> wrote:
> >
> > On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@jordanrome.com> wrote:
> > >
> > > On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> > > > >
> > > > > [...]
> > > >
> > > > The above will not pad the buffer and it will create instability
> > > > when the target buffer is a part of the map key. Consider:
> > > >
> > > > struct map_key {
> > > >    char str[100];
> > > > };
> > > > struct {
> > > >         __uint(type, BPF_MAP_TYPE_HASH);
> > > >         __type(key, struct map_key);
> > > > } hash SEC(".maps");
> > > >
> > > > struct map_key key;
> > > > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
> > > >
> > > > The verifier will think that all of the 'key' is initialized,
> > > > but for short strings the key will have garbage.
> > > >
> > > > bpf_probe_read_kernel_str() has the same issue as above, but
> > > > let's fix it here first and update read_kernel_str() later.
> > > >
> > > > pw-bot: cr
> > >
> > > You're saying we should always do a memset using `dst__szk` on success
> > > of copying the string?
> >
> > Something like this?
> > ```
> > ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
> >   if (ret >= 0) {
> >     if (ret <= count)
> >        memset((char *)dst + ret, 0, dst__szk - ret);
> >     ret++;
> > }
> > ```
>
> yep. something like this. I didn't check the math.

I'm a bit worried about this unconditional memset without having a way
to disable it. In practice, lots of cases won't use the destination
buffer as a map key, but rather just send it over ringbuf. So paying
the price of zeroing out seems unnecessary.

It's quite often (I do that in retsnoop, for instance; and we have
other cases in our production) that we have a pretty big buffer, but
expect that most of the time strings will be much smaller. So we can
have a 1K buffer, but get 20 bytes of string content (and we end up
sending only actual useful size of data over ringbuf/perfbuf, so not
even paying 1K memcpy() overhead). Paying for memset()'ing the entire
1K (and string reading can happen in a loop, so this memsetting will
be happening over and over, unnecessarily), seems excessive.

> Given it's pretty easy to do memset(0) using bpf_probe_read(dst, sz,
NULL), maybe we shouldn't do memsetting unconditionally? We can add a
loud comment stating the danger of using the resulting buffer as map
key without clearing the unfilled part of the buffer and that should
be sufficient?
Alexei Starovoitov Aug. 13, 2024, 6:30 p.m. UTC | #6
On Tue, Aug 13, 2024 at 11:10 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Aug 13, 2024 at 9:08 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Aug 13, 2024 at 6:30 AM Jordan Rome <linux@jordanrome.com> wrote:
> > >
> > > On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@jordanrome.com> wrote:
> > > >
> > > > On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> > > > > >
> > > > > > [...]
> > > > >
> > > > > The above will not pad the buffer and it will create instability
> > > > > when the target buffer is a part of the map key. Consider:
> > > > >
> > > > > struct map_key {
> > > > >    char str[100];
> > > > > };
> > > > > struct {
> > > > >         __uint(type, BPF_MAP_TYPE_HASH);
> > > > >         __type(key, struct map_key);
> > > > > } hash SEC(".maps");
> > > > >
> > > > > struct map_key key;
> > > > > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
> > > > >
> > > > > The verifier will think that all of the 'key' is initialized,
> > > > > but for short strings the key will have garbage.
> > > > >
> > > > > bpf_probe_read_kernel_str() has the same issue as above, but
> > > > > let's fix it here first and update read_kernel_str() later.
> > > > >
> > > > > pw-bot: cr
> > > >
> > > > You're saying we should always do a memset using `dst__szk` on success
> > > > of copying the string?
> > >
> > > Something like this?
> > > ```
> > > ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
> > >   if (ret >= 0) {
> > >     if (ret <= count)
> > >        memset((char *)dst + ret, 0, dst__szk - ret);
> > >     ret++;
> > > }
> > > ```
> >
> > yep. something like this. I didn't check the math.
>
> I'm a bit worried about this unconditional memset without having a way
> to disable it. In practice, lots of cases won't use the destination
> buffer as a map key, but rather just send it over ringbuf. So paying
> the price of zeroing out seems unnecessary.
>
> It's quite often (I do that in retsnoop, for instance; and we have
> other cases in our production) that we have a pretty big buffer, but
> expect that most of the time strings will be much smaller. So we can
> have a 1K buffer, but get 20 bytes of string content (and we end up
> sending only actual useful size of data over ringbuf/perfbuf, so not
> even paying 1K memcpy() overhead). Paying for memset()'ing the entire
> 1K (and string reading can happen in a loop, so this memsetting will
> be happening over and over, unnecessarily), seems excessive.
>
> Given it's pretty easy to do memset(0) using bpf_probe_read(dst, sz,
> NULL), maybe we shouldn't do memsetting unconditionally? We can add a
> loud comment stating the danger of using the resulting buffer as map
> key without clearing the unfilled part of the buffer and that should
> be sufficient?

probe_read as memset is a quirk that folks learned to abuse.
Let's add a flag to this bpf_copy_from_user_str() kfunc instead,
so it behaves either like strscpy_pad or strscpy.
Andrii Nakryiko Aug. 13, 2024, 8:18 p.m. UTC | #7
On Tue, Aug 13, 2024 at 11:30 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Aug 13, 2024 at 11:10 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Aug 13, 2024 at 9:08 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Aug 13, 2024 at 6:30 AM Jordan Rome <linux@jordanrome.com> wrote:
> > > >
> > > > On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@jordanrome.com> wrote:
> > > > >
> > > > > On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@jordanrome.com> wrote:
> > > > > > >
> > > > > > > [...]
> > > > > >
> > > > > > The above will not pad the buffer and it will create instability
> > > > > > when the target buffer is a part of the map key. Consider:
> > > > > >
> > > > > > struct map_key {
> > > > > >    char str[100];
> > > > > > };
> > > > > > struct {
> > > > > >         __uint(type, BPF_MAP_TYPE_HASH);
> > > > > >         __type(key, struct map_key);
> > > > > > } hash SEC(".maps");
> > > > > >
> > > > > > struct map_key key;
> > > > > > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string);
> > > > > >
> > > > > > The verifier will think that all of the 'key' is initialized,
> > > > > > but for short strings the key will have garbage.
> > > > > >
> > > > > > bpf_probe_read_kernel_str() has the same issue as above, but
> > > > > > let's fix it here first and update read_kernel_str() later.
> > > > > >
> > > > > > pw-bot: cr
> > > > >
> > > > > You're saying we should always do a memset using `dst__szk` on success
> > > > > of copying the string?
> > > >
> > > > Something like this?
> > > > ```
> > > > ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
> > > > if (ret >= 0) {
> > > > 	if (ret <= count)
> > > > 		memset((char *)dst + ret, 0, dst__szk - ret);
> > > > 	ret++;
> > > > }
> > > > ```
> > >
> > > yep. something like this. I didn't check the math.
> >
> > I'm a bit worried about this unconditional memset without having a way
> > to disable it. In practice, lots of cases won't use the destination
> > buffer as a map key, but rather just send it over ringbuf. So paying
> > the price of zeroing out seems unnecessary.
> >
> > It happens quite often (I do that in retsnoop, for instance, and we
> > have other cases in our production) that we have a pretty big buffer
> > but expect that most of the time strings will be much smaller. So we
> > can have a 1K buffer but get 20 bytes of string content (and we end up
> > sending only the actually useful portion of the data over
> > ringbuf/perfbuf, so we don't even pay the 1K memcpy() overhead).
> > Paying for memset()'ing the entire 1K (and string reading can happen
> > in a loop, so this memsetting would happen over and over,
> > unnecessarily) seems excessive.
> >
> > Given it's pretty easy to do memset(0) using bpf_probe_read(dst, sz,
> > NULL), maybe we shouldn't do the memset unconditionally? We can add a
> > loud comment stating the danger of using the resulting buffer as a map
> > key without clearing the unfilled part of the buffer, and that should
> > be sufficient?
>
> probe_read as memset is a quirk that folks learned to abuse.
> Let's add a flag to this bpf_copy_from_user_str() kfunc instead,
> so it behaves either like strscpy_pad or strscpy.

agreed, a flag sounds good

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index d02ae323996b..e87d5df658cb 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2939,6 +2939,41 @@  __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it)
 	bpf_mem_free(&bpf_global_ma, kit->bits);
 }

+/**
+ * bpf_copy_from_user_str() - Copy a string from an unsafe user address
+ * @dst:             Destination address, in kernel space.  This buffer must be at
+ *                   least @dst__szk bytes long.
+ * @dst__szk:        Maximum number of bytes to copy, including the trailing NUL.
+ * @unsafe_ptr__ign: Source address, in user space.
+ *
+ * Copies a NUL-terminated string from userspace to BPF space. If the user
+ * string is too long, this will still ensure NUL termination in the dst
+ * buffer unless the buffer size is 0.
+ */
+__bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__szk, const void __user *unsafe_ptr__ign)
+{
+	int ret;
+	int count;
+
+	if (unlikely(!dst__szk))
+		return 0;
+
+	count = dst__szk - 1;
+	if (unlikely(!count)) {
+		((char *)dst)[0] = '\0';
+		return 1;
+	}
+
+	ret = strncpy_from_user(dst, unsafe_ptr__ign, count);
+	if (ret >= 0) {
+		if (ret == count)
+			((char *)dst)[ret] = '\0';
+		ret++;
+	}
+
+	return ret;
+}
+
 __bpf_kfunc_end_defs();

 BTF_KFUNCS_START(generic_btf_ids)
@@ -3024,6 +3059,7 @@  BTF_ID_FLAGS(func, bpf_preempt_enable)
 BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW)
 BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_copy_from_user_str, KF_SLEEPABLE)
 BTF_KFUNCS_END(common_btf_ids)

 static const struct btf_kfunc_id_set common_kfunc_set = {