
[bpf-next,0/2] error checking where helpers call bpf_map_ops

Message ID 20230318011324.203830-1-inwardvessel@gmail.com (mailing list archive)

Message

JP Kobryn March 18, 2023, 1:13 a.m. UTC
From: JP Kobryn <inwardvessel@gmail.com>

Within bpf programs, the bpf helper functions can make inline calls to
kernel functions. In this scenario there can be a disconnect between the
register the kernel function writes a return value to and the register the
bpf program uses to evaluate that return value.

As an example, this bpf code:

long err = bpf_map_update_elem(...);
if (err && err != -EEXIST)
	// got some error other than -EEXIST

...can result in the bpf assembly:

; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST);
  37:	movabs $0xffff976a10730400,%rdi
  41:	mov    $0x1,%ecx
  46:	call   0xffffffffe103291c	; htab_map_update_elem
; if (err && err != -EEXIST) {
  4b:	cmp    $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax
  4f:	je     0x000000000000008e
  51:	test   %rax,%rax
  54:	je     0x000000000000008e

The compare operation here evaluates %rax, while in the preceding call to 
htab_map_update_elem the corresponding assembly returns -EEXIST via %eax:

movl $0xffffffef, %r9d
...
movl %r9d, %eax

...since it's returning int (32-bit). So the resulting comparison becomes:

cmp $0xffffffffffffffef, $0x00000000ffffffef

...making it impossible to check for negative errors or for specific error
codes, since the sign bit is left at bit 31 and the upper 32 bits of %rax
are zero. In the original example, this means the conditional branch is
entered even when the error is -EEXIST, which was not intended.

The selftests added cover these cases for the different bpf_map_ops
functions. When the second patch is applied, changing the return type of
those functions to long, the comparison works as intended and the tests
pass.

JP Kobryn (2):
  bpf/selftests: coverage for bpf_map_ops errors
  bpf: return long from bpf_map_ops funcs

 include/linux/bpf.h                           |  10 +-
 kernel/bpf/arraymap.c                         |   8 +-
 kernel/bpf/bloom_filter.c                     |  12 +-
 kernel/bpf/bpf_cgrp_storage.c                 |   6 +-
 kernel/bpf/bpf_inode_storage.c                |   6 +-
 kernel/bpf/bpf_struct_ops.c                   |   6 +-
 kernel/bpf/bpf_task_storage.c                 |   6 +-
 kernel/bpf/cpumap.c                           |   6 +-
 kernel/bpf/devmap.c                           |  20 +--
 kernel/bpf/hashtab.c                          |  32 ++---
 kernel/bpf/local_storage.c                    |   6 +-
 kernel/bpf/lpm_trie.c                         |   6 +-
 kernel/bpf/queue_stack_maps.c                 |  22 +--
 kernel/bpf/reuseport_array.c                  |   2 +-
 kernel/bpf/ringbuf.c                          |   6 +-
 kernel/bpf/stackmap.c                         |   6 +-
 kernel/bpf/verifier.c                         |  10 +-
 net/core/bpf_sk_storage.c                     |   6 +-
 net/core/sock_map.c                           |   8 +-
 net/xdp/xskmap.c                              |   6 +-
 .../selftests/bpf/prog_tests/map_ops.c        | 130 ++++++++++++++++++
 .../selftests/bpf/progs/test_map_ops.c        |  90 ++++++++++++
 22 files changed, 315 insertions(+), 95 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_ops.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_map_ops.c

Comments

Eduard Zingerman March 20, 2023, 2:18 p.m. UTC | #1
On Fri, 2023-03-17 at 18:13 -0700, inwardvessel wrote:
> From: JP Kobryn <inwardvessel@gmail.com>
> 
> Within bpf programs, the bpf helper functions can make inline calls to
> kernel functions. In this scenario there can be a disconnect between the
> register the kernel function writes a return value to and the register the
> bpf program uses to evaluate that return value.
> 
> As an example, this bpf code:
> 
> long err = bpf_map_update_elem(...);
> if (err && err != -EEXIST)
> 	// got some error other than -EEXIST
> 
> ...can result in the bpf assembly:
> 
> ; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST);
>   37:	movabs $0xffff976a10730400,%rdi
>   41:	mov    $0x1,%ecx
>   46:	call   0xffffffffe103291c	; htab_map_update_elem
> ; if (err && err != -EEXIST) {
>   4b:	cmp    $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax
>   4f:	je     0x000000000000008e
>   51:	test   %rax,%rax
>   54:	je     0x000000000000008e
> 
> The compare operation here evaluates %rax, while in the preceding call to 
> htab_map_update_elem the corresponding assembly returns -EEXIST via %eax:
> 
> movl $0xffffffef, %r9d
> ...
> movl %r9d, %eax
> 
> ...since it's returning int (32-bit). So the resulting comparison becomes:
> 
> cmp $0xffffffffffffffef, $0x00000000ffffffef
> 
> ...making it not possible to check for negative errors or specific errors,
> since the sign value is left at the 32nd bit. It means in the original
> example, the conditional branch will be entered even when the error is
> -EEXIST, which was not intended.
> 
> The selftests added cover these cases for the different bpf_map_ops
> functions. When the second patch is applied, changing the return type of
> those functions to long, the comparison works as intended and the tests
> pass.
>

Looks like this fixes a commit from 2020:
bdb7b79b4ce8 ("bpf: Switch most helper return values from 32-bit int to 64-bit long")

To add to the summary: the issue arises because the test program uses
the map function definitions from `bpf_helper_defs.h`, e.g.:

    static long (*bpf_map_update_elem)(...) = (void *) 2;

These definitions are generated from `include/uapi/linux/bpf.h`,
which specifies the return type of this helper as `long`
(changed from `int` in the commit mentioned above).
That is why clang does not insert sign extension instructions when
the helper is called.

Interesting how this went under the radar for so long, probably
because user code mostly uses `int` to catch the return value of map
functions.

That commit changes the return types of a lot of functions.
I looked through the function definitions and the verifier.c code for those,
but have not found any additional issues, except for two obvious ones:
- bpf_redirect_map / ops->map_redirect
- bpf_for_each_map_elem / ops->map_for_each_callback

These require similar changes.

The documentation is inconsistent as well.
For example, `int bpf_map_update_elem(...)` is mentioned in:
- Documentation/bpf/map_sockmap.rst
- Documentation/bpf/map_xskmap.rst
- Documentation/bpf/map_cpumap.rst
- Documentation/bpf/map_devmap.rst
- Documentation/bpf/map_sk_storage.rst
- Documentation/bpf/map_bloom_filter.rst
- Documentation/bpf/map_queue_stack.rst

And `long bpf_map_update_elem(...)` is mentioned in:
- Documentation/bpf/map_sockmap.rst
- Documentation/bpf/map_array.rst
- Documentation/bpf/map_hash.rst
- Documentation/bpf/map_lpm_trie.rst

Tested-By: Eduard Zingerman <eddyz87@gmail.com>

> JP Kobryn (2):
>   bpf/selftests: coverage for bpf_map_ops errors
>   bpf: return long from bpf_map_ops funcs
> 
>  include/linux/bpf.h                           |  10 +-
>  kernel/bpf/arraymap.c                         |   8 +-
>  kernel/bpf/bloom_filter.c                     |  12 +-
>  kernel/bpf/bpf_cgrp_storage.c                 |   6 +-
>  kernel/bpf/bpf_inode_storage.c                |   6 +-
>  kernel/bpf/bpf_struct_ops.c                   |   6 +-
>  kernel/bpf/bpf_task_storage.c                 |   6 +-
>  kernel/bpf/cpumap.c                           |   6 +-
>  kernel/bpf/devmap.c                           |  20 +--
>  kernel/bpf/hashtab.c                          |  32 ++---
>  kernel/bpf/local_storage.c                    |   6 +-
>  kernel/bpf/lpm_trie.c                         |   6 +-
>  kernel/bpf/queue_stack_maps.c                 |  22 +--
>  kernel/bpf/reuseport_array.c                  |   2 +-
>  kernel/bpf/ringbuf.c                          |   6 +-
>  kernel/bpf/stackmap.c                         |   6 +-
>  kernel/bpf/verifier.c                         |  10 +-
>  net/core/bpf_sk_storage.c                     |   6 +-
>  net/core/sock_map.c                           |   8 +-
>  net/xdp/xskmap.c                              |   6 +-
>  .../selftests/bpf/prog_tests/map_ops.c        | 130 ++++++++++++++++++
>  .../selftests/bpf/progs/test_map_ops.c        |  90 ++++++++++++
>  22 files changed, 315 insertions(+), 95 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/map_ops.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_map_ops.c
>
Alexei Starovoitov March 21, 2023, 9:37 p.m. UTC | #2
On Mon, Mar 20, 2023 at 7:19 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2023-03-17 at 18:13 -0700, inwardvessel wrote:
> > From: JP Kobryn <inwardvessel@gmail.com>
> >
> > Within bpf programs, the bpf helper functions can make inline calls to
> > kernel functions. In this scenario there can be a disconnect between the
> > register the kernel function writes a return value to and the register the
> > bpf program uses to evaluate that return value.
> >
> > As an example, this bpf code:
> >
> > long err = bpf_map_update_elem(...);
> > if (err && err != -EEXIST)
> >       // got some error other than -EEXIST
> >
> > ...can result in the bpf assembly:
> >
> > ; err = bpf_map_update_elem(&mymap, &key, &val, BPF_NOEXIST);
> >   37: movabs $0xffff976a10730400,%rdi
> >   41: mov    $0x1,%ecx
> >   46: call   0xffffffffe103291c       ; htab_map_update_elem
> > ; if (err && err != -EEXIST) {
> >   4b: cmp    $0xffffffffffffffef,%rax ; cmp -EEXIST,%rax
> >   4f: je     0x000000000000008e
> >   51: test   %rax,%rax
> >   54: je     0x000000000000008e
> >
> > The compare operation here evaluates %rax, while in the preceding call to
> > htab_map_update_elem the corresponding assembly returns -EEXIST via %eax:
> >
> > movl $0xffffffef, %r9d
> > ...
> > movl %r9d, %eax
> >
> > ...since it's returning int (32-bit). So the resulting comparison becomes:
> >
> > cmp $0xffffffffffffffef, $0x00000000ffffffef
> >
> > ...making it not possible to check for negative errors or specific errors,
> > since the sign value is left at the 32nd bit. It means in the original
> > example, the conditional branch will be entered even when the error is
> > -EEXIST, which was not intended.
> >
> > The selftests added cover these cases for the different bpf_map_ops
> > functions. When the second patch is applied, changing the return type of
> > those functions to long, the comparison works as intended and the tests
> > pass.
> >
>
> Looks like this fixes commit from 2020:
> bdb7b79b4ce8 ("bpf: Switch most helper return values from 32-bit int to 64-bit long")
>
> To add to the summary: the issue is caused by the fact that test
> program uses map function definitions from `bpf_helper_defs.h`, e.g.:
>
>     static long (*bpf_map_update_elem)(...) 2;
>
> These definitions are generated from `include/uapi/linux/bpf.h`,
> which specifies the return type for this helper to be `long`
> (changed to from `int` in the commit mentioned above).
> That's why clang does not insert sign extension instructions when
> helper is called.

JP,

could you please add Ed's clarification to the commit log,
add a 'Fixes: bdb7b79b4ce8 ...' tag, and respin?

>
> Interesting how this went under the radar for so long, probably
> because user code mostly uses `int` to catch return value of map
> functions.
>
> That commit changes return types for a lot of functions.
> I looked through function definitions and verifier.c code for those,
> but have not found any additional issues, except for two obvious:
> - bpf_redirect_map / ops->map_redirect
> - bpf_for_each_map_elem / ops->map_for_each_callback

Please fix these two as well in the same patch.

> Tested-By: Eduard Zingerman <eddyz87@gmail.com>

and please carry the Tested-by tag in the respin.

Thanks!