Message ID | 20230125000244.1109228-1-kuifeng@meta.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: Fix the kernel crash caused by bpf_setsockopt(). | expand |
On 1/24/23 4:02 PM, Kui-Feng Lee wrote: > The kernel crash was caused by a BPF program attached to the > "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to > `bpf_setsockopt()` in order to set the TCP_NODELAY flag. This flag Note that this race is not limited to TCP_NODELAY. > causes the kernel to flush the outgoing queue of a socket, and this > hook can be triggered during a softirq. The issue was that in certain > circumstances, when `tcp_write_xmit()` was called to flush the queue, > it would also allow BH (bottom-half) to run. This could lead to our > program attempting to flush the same socket recursively, which caused > a `skbuf` to be unlinked twice. Thanks for the fix. The commit message could use more details about this particular security_sock_rcv_skb() hook. Something like, security_sock_rcv_skb() is called from tcp_filter(). In tcp_v4_rcv(), tcp_filter() is called before the sock_owned_by_user() check. If a bpf prog is run in security_sock_rcv_skb() from the softirq, it may not own the sock lock and break the bpf_setsockopt() assumption. > > The patch fixes this issue by ensuring that a BPF program attached to > the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call > `bpf_setsockopt()`. Please add Fixes tag. Also, this should target for the bpf tree. Fixes: 9113d7e48e91 ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup") > > Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> > --- > kernel/bpf/bpf_lsm.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c > index a4a41ee3e80b..e14c822f8911 100644 > --- a/kernel/bpf/bpf_lsm.c > +++ b/kernel/bpf/bpf_lsm.c > @@ -51,7 +51,6 @@ BTF_SET_END(bpf_lsm_current_hooks) > */ > BTF_SET_START(bpf_lsm_locked_sockopt_hooks) > #ifdef CONFIG_SECURITY_NETWORK > -BTF_ID(func, bpf_lsm_socket_sock_rcv_skb) > BTF_ID(func, bpf_lsm_sock_graft) > BTF_ID(func, bpf_lsm_inet_csk_clone) > BTF_ID(func, bpf_lsm_inet_conn_established)
diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index a4a41ee3e80b..e14c822f8911 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -51,7 +51,6 @@ BTF_SET_END(bpf_lsm_current_hooks) */ BTF_SET_START(bpf_lsm_locked_sockopt_hooks) #ifdef CONFIG_SECURITY_NETWORK -BTF_ID(func, bpf_lsm_socket_sock_rcv_skb) BTF_ID(func, bpf_lsm_sock_graft) BTF_ID(func, bpf_lsm_inet_csk_clone) BTF_ID(func, bpf_lsm_inet_conn_established)
The kernel crash was caused by a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to `bpf_setsockopt()` in order to set the TCP_NODELAY flag. This flag causes the kernel to flush the outgoing queue of a socket, and this hook can be triggered during a softirq. The issue was that in certain circumstances, when `tcp_write_xmit()` was called to flush the queue, it would also allow BH (bottom-half) to run. This could lead to our program attempting to flush the same socket recursively, which caused a `skbuf` to be unlinked twice. The patch fixes this issue by ensuring that a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call `bpf_setsockopt()`. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> --- kernel/bpf/bpf_lsm.c | 1 - 1 file changed, 1 deletion(-)