Message ID | 20230127001732.4162630-1-kuifeng@meta.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 5416c9aea8323583e8696f0500b6142dfae80821 |
Delegated to: | BPF |
Headers | show |
Series | [bpf,v2] bpf: Fix the kernel crash caused by bpf_setsockopt(). | expand |
On 1/26/23 4:17 PM, Kui-Feng Lee wrote: > The kernel crash was caused by a BPF program attached to the > "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to > `bpf_setsockopt()` in order to set the TCP_NODELAY flag as an > example. Flags like TCP_NODELAY can prompt the kernel to flush a > socket's outgoing queue, and this hook > "lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by > softirqs. The issue was that in certain circumstances, when > `tcp_write_xmit()` was called to flush the queue, it would also allow > BH (bottom-half) to run. This could lead to our program attempting to > flush the same socket recursively, which caused a `skbuf` to be > unlinked twice. > > `security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs > before the sock ownership is checked in `tcp_v4_rcv()`. Consequently, > if a bpf program runs on `security_sock_rcv_skb()` while under softirq > conditions, it may not possess the lock needed for `bpf_setsoppt()`, > thus presenting an issue. Fixed a few minor things like s/bpf_setsoppt/bpf_setsockopt/ and s/skbuf/skbuff/. Applied. Thanks.
Hello: This patch was applied to bpf/bpf.git (master) by Martin KaFai Lau <martin.lau@kernel.org>: On Thu, 26 Jan 2023 16:17:32 -0800 you wrote: > The kernel crash was caused by a BPF program attached to the > "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to > `bpf_setsockopt()` in order to set the TCP_NODELAY flag as an > example. Flags like TCP_NODELAY can prompt the kernel to flush a > socket's outgoing queue, and this hook > "lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by > softirqs. The issue was that in certain circumstances, when > `tcp_write_xmit()` was called to flush the queue, it would also allow > BH (bottom-half) to run. This could lead to our program attempting to > flush the same socket recursively, which caused a `skbuf` to be > unlinked twice. > > [...] Here is the summary with links: - [bpf,v2] bpf: Fix the kernel crash caused by bpf_setsockopt(). https://git.kernel.org/bpf/bpf/c/5416c9aea832 You are awesome, thank you!
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index a4a41ee3e80b..e14c822f8911 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -51,7 +51,6 @@ BTF_SET_END(bpf_lsm_current_hooks) */ BTF_SET_START(bpf_lsm_locked_sockopt_hooks) #ifdef CONFIG_SECURITY_NETWORK -BTF_ID(func, bpf_lsm_socket_sock_rcv_skb) BTF_ID(func, bpf_lsm_sock_graft) BTF_ID(func, bpf_lsm_inet_csk_clone) BTF_ID(func, bpf_lsm_inet_conn_established)
The kernel crash was caused by a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to `bpf_setsockopt()` in order to set the TCP_NODELAY flag as an example. Flags like TCP_NODELAY can prompt the kernel to flush a socket's outgoing queue, and this hook "lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by softirqs. The issue was that in certain circumstances, when `tcp_write_xmit()` was called to flush the queue, it would also allow BH (bottom-half) to run. This could lead to our program attempting to flush the same socket recursively, which caused a `skbuf` to be unlinked twice. `security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs before the sock ownership is checked in `tcp_v4_rcv()`. Consequently, if a bpf program runs on `security_sock_rcv_skb()` while under softirq conditions, it may not possess the lock needed for `bpf_setsoppt()`, thus presenting an issue. The patch fixes this issue by ensuring that a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call `bpf_setsockopt()`. The differences from v1 are - changing commit log to explain holding the lock of the sock, - emphasizing that TCP_NODELAY is not the only flag, and - adding the fixes tag. v1: https://lore.kernel.org/bpf/20230125000244.1109228-1-kuifeng@meta.com/ Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Fixes: 9113d7e48e91 ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup") --- kernel/bpf/bpf_lsm.c | 1 - 1 file changed, 1 deletion(-)