diff mbox series

[V2] bpf: security enhancement by limiting the offensive eBPF helpers

Message ID 20230118111854.744810-1-clangllvm@126.com (mailing list archive)
State Handled Elsewhere

Commit Message

Yi He Jan. 18, 2023, 11:18 a.m. UTC
The bpf_send_signal, bpf_send_signal_thread and bpf_override_return
helpers are similar to bpf_probe_write_user in that they can affect
userspace processes. Thus, these three helpers should also be restricted
by security lockdown.

Signed-off-by: Yi He <clangllvm@126.com>
---

Thanks for your feedback.

This patch aims to mitigate the offensive eBPF problem, which has been discussed since 2019 [1]. Recently, we found that enabling eBPF in a container environment can lead to container escape or cross-node attacks (which may compromise multiple VMs) in Kubernetes [2]. Since many eBPF-based tools are used in containers, multiple containers hold the CAP_SYS_ADMIN capability needed by eBPF, which may be abused by untrusted eBPF code.

We are still working on a more fine-grained eBPF permission model, which adds capability filter bits to control which eBPF program types and helper functions a process may use [3].

Security lockdown seems to be a simple way to mitigate this problem. It restricts only the offensive features and leaves enabled the other eBPF features needed by benign eBPF programs such as Cilium (which does not use these offensive features and only needs bpf_probe_read_user).

> I'm not applying this.. i) this means by default you effectively remove these
> helpers from existing users in the wild given integrity mode is default for
> secure boot, but also ii) should we lock-down and remove the ability for other
> privileged entities like processes to send signals, seccomp to ret_kill, ptrace,
> etc given they all "can affect userspace processes"

It does not prevent other privileged processes (e.g., via ptrace) from killing a process. Seccomp uses classic BPF and does not use this eBPF helper [4].

>  check out already existing FUNCTION_ERROR_INJECTION kernel config.
We do not think the FUNCTION_ERROR_INJECTION config option solves this problem, as it is enabled by default in many Linux distributions such as Debian and Ubuntu. All syscalls are on the error-injection allowlist and can be attacked by evil eBPF via the bpf_override_return helper.

We hope you can reconsider this problem.

[1]. J. Dileo. Evil eBPF: Practical Abuses of an In-Kernel Bytecode Runtime. DEFCON 27
[2]. https://rolandorange.zone/report.html
[3]. https://lore.kernel.org/bpf/CAADnVQK4ucv=LugqZ3He9ubwdxDu6ohaBKr2E=TX0UT65+7WpQ@mail.gmail.com/T/ 
[4]. https://elixir.bootlin.com/linux/v6.2-rc4/source/kernel/seccomp.c#L1304


 V1 -> V2: add security lockdown to bpf_send_signal_thread and remove
	the unused LOCKDOWN_OFFENSIVE_BPF_MAX.

 include/linux/security.h | 2 ++
 kernel/trace/bpf_trace.c | 9 ++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)
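
The gating pattern the patch applies can be sketched in userspace as below.
This is a minimal mock, not kernel code: every identifier ending in _mock is
a hypothetical stand-in, and the comparison inside
security_locked_down_mock() is an assumption about how a lockdown level
covers a lockdown reason. It illustrates the first objection quoted above:
once the level reaches integrity mode, the lookup returns NULL for all three
helpers, so existing programs that call them would no longer load.

```c
/*
 * Userspace mock of the helper gating pattern; this is NOT kernel code.
 * Every identifier ending in _mock is a hypothetical stand-in.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Ordered so that a higher active level covers every lower reason. */
enum lockdown_reason_mock {
	LOCKDOWN_NONE_MOCK,
	LOCKDOWN_BPF_WRITE_USER_MOCK,
	LOCKDOWN_BPF_SEND_SIGNAL_MOCK,     /* reason added by the patch */
	LOCKDOWN_BPF_OVERRIDE_RETURN_MOCK, /* reason added by the patch */
	LOCKDOWN_INTEGRITY_MAX_MOCK,
};

struct bpf_func_proto_mock {
	const char *name;
};

enum bpf_func_id_mock {
	FUNC_SEND_SIGNAL_MOCK,
	FUNC_SEND_SIGNAL_THREAD_MOCK,
	FUNC_OVERRIDE_RETURN_MOCK,
};

static const struct bpf_func_proto_mock send_signal_proto_mock = { "send_signal" };
static const struct bpf_func_proto_mock send_signal_thread_proto_mock = { "send_signal_thread" };
static const struct bpf_func_proto_mock override_return_proto_mock = { "override_return" };

static enum lockdown_reason_mock lockdown_level_mock = LOCKDOWN_NONE_MOCK;

void set_lockdown_level_mock(enum lockdown_reason_mock level)
{
	lockdown_level_mock = level;
}

/* Assumed semantics: deny with -EPERM when the active level covers the reason. */
int security_locked_down_mock(enum lockdown_reason_mock what)
{
	return lockdown_level_mock >= what ? -EPERM : 0;
}

/* Mirrors the shape of the patched lookups: NULL means "helper unavailable". */
const struct bpf_func_proto_mock *func_proto_mock(enum bpf_func_id_mock id)
{
	switch (id) {
	case FUNC_SEND_SIGNAL_MOCK:
		return security_locked_down_mock(LOCKDOWN_BPF_SEND_SIGNAL_MOCK) < 0 ?
		       NULL : &send_signal_proto_mock;
	case FUNC_SEND_SIGNAL_THREAD_MOCK:
		return security_locked_down_mock(LOCKDOWN_BPF_SEND_SIGNAL_MOCK) < 0 ?
		       NULL : &send_signal_thread_proto_mock;
	case FUNC_OVERRIDE_RETURN_MOCK:
		return security_locked_down_mock(LOCKDOWN_BPF_OVERRIDE_RETURN_MOCK) < 0 ?
		       NULL : &override_return_proto_mock;
	}
	return NULL;
}

/* Small self-check: helpers resolve without lockdown and vanish under it. */
void lockdown_gate_demo(void)
{
	set_lockdown_level_mock(LOCKDOWN_NONE_MOCK);
	assert(func_proto_mock(FUNC_SEND_SIGNAL_MOCK) != NULL);

	set_lockdown_level_mock(LOCKDOWN_INTEGRITY_MAX_MOCK);
	assert(func_proto_mock(FUNC_SEND_SIGNAL_MOCK) == NULL);
	assert(func_proto_mock(FUNC_OVERRIDE_RETURN_MOCK) == NULL);
}
```

Note that both signal helpers share one reason in the patch, so in this
sketch they always become unavailable together.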

Comments

Djalal Harouni Jan. 18, 2023, 3:36 p.m. UTC | #1
On Wed, Jan 18, 2023 at 1:38 PM Yi He <clangllvm@126.com> wrote:
[...]
> Thanks for your feedback.
>
> This patch aims to mitigate the offensive eBPF problem, which has been discussed since 2019 [1]. Recently, we found that enabling eBPF in a container environment can lead to container escape or cross-node attacks (which may compromise multiple VMs) in Kubernetes [2]. Since many eBPF-based tools are used in containers, multiple containers hold the CAP_SYS_ADMIN capability needed by eBPF, which may be abused by untrusted eBPF code.

Then the solution should be to restrict eBPF in containers; there are already
sysctl, per-process seccomp, and LSM + BPF LSM for that.

...
> > I'm not applying this.. i) this means by default you effectively remove these
> > helpers from existing users in the wild given integrity mode is default for
> > secure boot, but also ii) should we lock-down and remove the ability for other
> > privileged entities like processes to send signals, seccomp to ret_kill, ptrace,
> > etc given they all "can affect userspace processes"
>
> It does not prevent other privileged processes (e.g., via ptrace) from killing a process. Seccomp uses classic BPF and does not use this eBPF helper [4].

Those are more or less the same as BPF sending a signal. Supervisors are using
seccomp to ret kill processes and/or to send signals. Where will you draw the
line? Should we go restrict those too? IMHO this does not relate to lockdown.

This reasoning will kill any effort to improve sandbox mechanisms that are
moving some functionality from seccomp ret kill to a more flexible and
transparent bpf-LSM model where a privileged entity installs the sandbox.
Actually, we are already doing this, and besides eBPF flexibility and
transparency (change policy at runtime without restart), from a
_user perspective_ I don't see that much difference between a seccomp kill
and an eBPF signal.

Thanks!

Patch

diff --git a/include/linux/security.h b/include/linux/security.h
index 5b67f208f..42420e620 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -123,6 +123,8 @@  enum lockdown_reason {
 	LOCKDOWN_DEBUGFS,
 	LOCKDOWN_XMON_WR,
 	LOCKDOWN_BPF_WRITE_USER,
+	LOCKDOWN_BPF_SEND_SIGNAL,
+	LOCKDOWN_BPF_OVERRIDE_RETURN,
 	LOCKDOWN_DBG_WRITE_KERNEL,
 	LOCKDOWN_RTAS_ERROR_INJECTION,
 	LOCKDOWN_INTEGRITY_MAX,
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3bbd3f0c8..fdb94868d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1463,9 +1463,11 @@  bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_cgrp_storage_delete_proto;
 #endif
 	case BPF_FUNC_send_signal:
-		return &bpf_send_signal_proto;
+		return security_locked_down(LOCKDOWN_BPF_SEND_SIGNAL) < 0 ?
+		       NULL : &bpf_send_signal_proto;
 	case BPF_FUNC_send_signal_thread:
-		return &bpf_send_signal_thread_proto;
+		return security_locked_down(LOCKDOWN_BPF_SEND_SIGNAL) < 0 ?
+		       NULL : &bpf_send_signal_thread_proto;
 	case BPF_FUNC_perf_event_read_value:
 		return &bpf_perf_event_read_value_proto;
 	case BPF_FUNC_get_ns_current_pid_tgid:
@@ -1531,7 +1533,8 @@  kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stack_proto;
 #ifdef CONFIG_BPF_KPROBE_OVERRIDE
 	case BPF_FUNC_override_return:
-		return &bpf_override_return_proto;
+		return security_locked_down(LOCKDOWN_BPF_OVERRIDE_RETURN) < 0 ?
+		       NULL : &bpf_override_return_proto;
 #endif
 	case BPF_FUNC_get_func_ip:
 		return prog->expected_attach_type == BPF_TRACE_KPROBE_MULTI ?