mbox series

[RFC,bpf-next,0/2] bpf: allow unprivileged map access to some map types

Message ID 1652275168-18630-1-git-send-email-alan.maguire@oracle.com (mailing list archive)
Headers show
Series bpf: allow unprivileged map access to some map types | expand

Message

Alan Maguire May 11, 2022, 1:19 p.m. UTC
Unprivileged BPF disabled (kernel.unprivileged_bpf_disabled >= 1)
is the default in most cases now; when set, the BPF system call is
blocked for users without CAP_BPF/CAP_SYS_ADMIN.  In some use cases
prior to having unpriviliged_bpf_disabled set to >= 1 however,
it made sense to split activities between capability-requiring
ones - such as program load+attach - and those that might not require
capabilities such as reading perf/ringbuf events, reading or
updating BPF map configuration etc.  One example of this sort of
approach is a service that loads a BPF program, and a user-space
program that, after it has been loaded, interacts with it via
pinned maps.

Such a split model is not possible with unprivileged BPF disabled,
since even those activities such as map interactions, retrieving
map information from pinned paths etc are blocked to
unprivileged users.  However, granting CAP_BPF to such unprivileged
users is quite a big hammer, allowing them to do pretty much
anything with BPF.

This very rough RFC explores the idea - with
CONFIG_BPF_UNPRIV_MAP_ACCESS=y - of allowing unprivileged processes
to retrieve and work with a restricted set of BPF maps.  See
patch 1 for the restrictions on BPF cmd and map type. Note that
permission checks on maps are still enforced, so it's still
possible to prevent unwanted interference by unprivileged users
by pinning a map and setting permissions to prevent access.
CONFIG_BPF_UNPRIV_MAP_ACCESS defaults to n, preserving current
behaviour of blocking all BPF syscall commands.

Discussion on the bpf mailing list already alluded to this idea [1],
though it's possible I misinterpreted it.

There are other ways to achieve this goal I suspect; for example
a BPF LSM program attached to security_bpf() could weed out the
disallowed commands, so setting unprivileged_bpf_disabled=0 + BPF
LSM could support a wider range of policies (not sure if we
could easily determine map types in that case though).
However, as a starting point I just wanted to see if others see
this as an issue, and if so how best to solve it in the general
case. Thanks!

[1] https://lore.kernel.org/bpf/CAADnVQLTBhCTAx1a_nev7CgMZxv1Bb7ecz1AFRin8tHmjPREJA@mail.gmail.com/

Alan Maguire (2):
  bpf: with CONFIG_BPF_UNPRIV_MAP_ACCESS=y, allow unprivileged access to
    BPF maps
  selftests/bpf: add tests verifying unpriv bpf map access

 kernel/bpf/Kconfig                            |  15 ++
 kernel/bpf/syscall.c                          |  57 ++++-
 tools/testing/selftests/bpf/config            |   1 +
 .../bpf/prog_tests/unpriv_map_access.c        | 194 ++++++++++++++++++
 .../bpf/progs/test_unpriv_map_access.c        |  81 ++++++++
 5 files changed, 347 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/unpriv_map_access.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_unpriv_map_access.c

Comments

Alexei Starovoitov May 11, 2022, 4:36 p.m. UTC | #1
On Wed, May 11, 2022 at 02:19:26PM +0100, Alan Maguire wrote:
> Unprivileged BPF disabled (kernel.unprivileged_bpf_disabled >= 1)
> is the default in most cases now; when set, the BPF system call is
> blocked for users without CAP_BPF/CAP_SYS_ADMIN.  In some use cases
> prior to having unpriviliged_bpf_disabled set to >= 1 however,
> it made sense to split activities between capability-requiring
> ones - such as program load+attach - and those that might not require
> capabilities such as reading perf/ringbuf events, reading or
> updating BPF map configuration etc.  One example of this sort of
> approach is a service that loads a BPF program, and a user-space
> program that, after it has been loaded, interacts with it via
> pinned maps.
> 
> Such a split model is not possible with unprivileged BPF disabled,
> since even those activities such as map interactions, retrieving
> map information from pinned paths etc are blocked to
> unprivileged users.  However, granting CAP_BPF to such unprivileged
> users is quite a big hammer, allowing them to do pretty much
> anything with BPF.
> 
> This very rough RFC explores the idea - with
> CONFIG_BPF_UNPRIV_MAP_ACCESS=y - of allowing unprivileged processes
> to retrieve and work with a restricted set of BPF maps.  See
> patch 1 for the restrictions on BPF cmd and map type. Note that
> permission checks on maps are still enforced, so it's still
> possible to prevent unwanted interference by unprivileged users
> by pinning a map and setting permissions to prevent access.
> CONFIG_BPF_UNPRIV_MAP_ACCESS defaults to n, preserving current
> behaviour of blocking all BPF syscall commands.
> 
> Discussion on the bpf mailing list already alluded to this idea [1],
> though it's possible I misinterpreted it.

Thanks for the follow up. We had a long discussion about this during lsfmmbpf.
In short a bunch of folks wants to address this problem.
Your summary of the problem is accurate.
The patch though is overly cautious in fixing the issue.
The bpf ACL model is the same as traditional file's ACL.
The creds and ACLs are checked at open().  Then during file's write/read
additional checks might be performed. BPF has such functionality already. 
Different map_creates have capability checks while map_lookup has:
map_get_sys_perms(map, f) & FMODE_CAN_READ.
In other words it's enough to gate FD-receiving parts of bpf
with unprivileged_bpf_disabled sysctl.
The rest is handled by availability of FD and access to files in bpffs.
Additional kconfig is unnecessary. Also no need to filter
different map types. Only array/hash/ringbuf are ok for unpriv
when that sysctl is off. The rest of maps have their own cap_bpf checks
at creation time which is enough.
The patch 1 should probably be something like:
  if (!capable &&
      (cmd == BPF_MAP_CREATE || cmd == BPF_PROG_LOAD))
    return -EPERM;
For all other commands the user has to have a valid map/btf/prog FD to proceed.
There are few special commands that don't need to be in the above 'if':
. BPF_[PROG|MAP|BTF]_GET_NEXT_ID have explicit cap_sys_admin check.
. BPF_OBJ_GET is using traditional file ACLs to access.
. BPF_BTF_LOAD has its own cap_bpf check.

Of course there could be bugs in any of unpriv code paths, but they were
enabled with sysctl off for long time. When/if new bugs are found they will be fixed.
The unprivileged_bpf_disabled's default was flipped only because of HW bugs.
In all HW spectre exploits BPF_PROG_LOAD was the mandatory command.
Just disabling that command is enough. The BPF_MAP_CREATE alone
cannot be used in spectre-like attack. But map_create without prog_load
is a useless combination for unrpiv. So having sysctl affecting them
both makes sense from symmetry pov. libbpf and other loaders typically
create a map first, so with sysctl off the unpriv users will see those eperms first.
I hope it's mostly accurate summary of the discussion.