Message ID | 20240329134037.92124-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] bpf, skmsg: fix NULL pointer dereference in sk_psock_skb_ingress_enqueue | expand |
On Fri, Mar 29, 2024 at 9:40 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > From: Jason Xing <kernelxing@tencent.com> > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > syzbot reported [1]. > > [1] > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > sk_psock_put include/linux/skmsg.h:459 [inline] > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > __sock_release net/socket.c:659 [inline] > sock_close+0x68/0x150 net/socket.c:1421 > __fput+0x2c1/0x660 fs/file_table.c:422 > __fput_sync+0x44/0x60 fs/file_table.c:507 > __do_sys_close fs/open.c:1556 [inline] > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > unix_read_skb net/unix/af_unix.c:2546 [inline] > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x140/0x180 net/socket.c:745 > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > ___sys_sendmsg net/socket.c:2638 [inline] > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > __do_sys_sendmsg net/socket.c:2676 [inline] > __se_sys_sendmsg net/socket.c:2674 [inline] > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > similarly due to no protection of saved_data_ready. Here is another > different caller causing the same issue because of the same reason. So > we should protect it with sk_callback_lock read lock because the writer > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". Should I use 'read_lock(&sk->sk_callback_lock);' in bpf_tcp_ingress() to protect sk_callback_lock field? If it's ok, I will do it in another patch. Thanks, Jason > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > Signed-off-by: Jason Xing <kernelxing@tencent.com> > --- > net/core/skmsg.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > index 4d75ef9d24bf..67c4c01c5235 100644 > --- a/net/core/skmsg.c > +++ b/net/core/skmsg.c > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > msg->skb = skb; > > sk_psock_queue_msg(psock, msg); > + read_lock_bh(&sk->sk_callback_lock); > sk_psock_data_ready(sk, psock); > + read_unlock_bh(&sk->sk_callback_lock); > return copied; > } > > -- > 2.37.3 >
On Fri, Mar 29, 2024 at 9:40 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > From: Jason Xing <kernelxing@tencent.com> > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > syzbot reported [1]. > > [1] > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > sk_psock_put include/linux/skmsg.h:459 [inline] > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > __sock_release net/socket.c:659 [inline] > sock_close+0x68/0x150 net/socket.c:1421 > __fput+0x2c1/0x660 fs/file_table.c:422 > __fput_sync+0x44/0x60 fs/file_table.c:507 > __do_sys_close fs/open.c:1556 [inline] > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > unix_read_skb net/unix/af_unix.c:2546 [inline] > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x140/0x180 net/socket.c:745 > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > ___sys_sendmsg net/socket.c:2638 [inline] > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > __do_sys_sendmsg net/socket.c:2676 [inline] > __se_sys_sendmsg net/socket.c:2674 [inline] > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > similarly due to no protection of saved_data_ready. Here is another > different caller causing the same issue because of the same reason. So > we should protect it with sk_callback_lock read lock because the writer > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". Hello experts, any comments are welcome:) Did I miss something? Thanks, Jason > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > Signed-off-by: Jason Xing <kernelxing@tencent.com> > --- > net/core/skmsg.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > index 4d75ef9d24bf..67c4c01c5235 100644 > --- a/net/core/skmsg.c > +++ b/net/core/skmsg.c > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > msg->skb = skb; > > sk_psock_queue_msg(psock, msg); > + read_lock_bh(&sk->sk_callback_lock); > sk_psock_data_ready(sk, psock); > + read_unlock_bh(&sk->sk_callback_lock); > return copied; > } > > -- > 2.37.3 >
Jason Xing wrote: > From: Jason Xing <kernelxing@tencent.com> > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > syzbot reported [1]. > > [1] > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > sk_psock_put include/linux/skmsg.h:459 [inline] > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > __sock_release net/socket.c:659 [inline] > sock_close+0x68/0x150 net/socket.c:1421 > __fput+0x2c1/0x660 fs/file_table.c:422 > __fput_sync+0x44/0x60 fs/file_table.c:507 > __do_sys_close fs/open.c:1556 [inline] > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > unix_read_skb net/unix/af_unix.c:2546 [inline] > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > sock_sendmsg_nosec net/socket.c:730 [inline] > __sock_sendmsg+0x140/0x180 net/socket.c:745 > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > ___sys_sendmsg net/socket.c:2638 [inline] > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > __do_sys_sendmsg net/socket.c:2676 [inline] > __se_sys_sendmsg net/socket.c:2674 [inline] > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > do_syscall_64+0xd3/0x1d0 > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > similarly due to no protection of saved_data_ready. Here is another > different caller causing the same issue because of the same reason. So > we should protect it with sk_callback_lock read lock because the writer > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > Signed-off-by: Jason Xing <kernelxing@tencent.com> > --- > net/core/skmsg.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > index 4d75ef9d24bf..67c4c01c5235 100644 > --- a/net/core/skmsg.c > +++ b/net/core/skmsg.c > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > msg->skb = skb; > > sk_psock_queue_msg(psock, msg); > + read_lock_bh(&sk->sk_callback_lock); > sk_psock_data_ready(sk, psock); > + read_unlock_bh(&sk->sk_callback_lock); > return copied; > } The problem is the check and then usage presumably it is already set to NULL: static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) { if (psock->saved_data_ready) psock->saved_data_ready(sk); I'm thinking we might be able to get away with just a READ_ONCE here with similar WRITE_ONCE on other side. Something like this, sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) { saved_data_ready = READ_ONCE(psock->saved_data_ready) if (saved_data_ready) saved_data_ready(sk) .... And then in sk_psock_stop_verdict, WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready); WRITE_ONCE(psock->saved_data_ready, NULL); And because we don't actually release the sock until a RCU grace period we should be OK. The TCP stack manages to work correctly without wrapping tcp_data_ready in locks like this. But nice thing there is you don't change this callback on live sockets. I think at least to keep backport simply above patch is ok, but lets move the read_lock_bh()/unlock_bh() into the sk_psock_data_ready() call and then we don't duplicate this error again. Does that make sense? Thanks, John
Hello John, On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > Jason Xing wrote: > > From: Jason Xing <kernelxing@tencent.com> > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > syzbot reported [1]. > > > > [1] > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > sk_psock_put include/linux/skmsg.h:459 [inline] > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > __sock_release net/socket.c:659 [inline] > > sock_close+0x68/0x150 net/socket.c:1421 > > __fput+0x2c1/0x660 fs/file_table.c:422 > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > __do_sys_close fs/open.c:1556 [inline] > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > do_syscall_64+0xd3/0x1d0 > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > sock_sendmsg_nosec net/socket.c:730 [inline] > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > ___sys_sendmsg net/socket.c:2638 [inline] > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > __do_sys_sendmsg net/socket.c:2676 [inline] > > __se_sys_sendmsg net/socket.c:2674 [inline] > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > do_syscall_64+0xd3/0x1d0 > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > Reported by Kernel Concurrency Sanitizer on: > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > similarly due to no protection of saved_data_ready. Here is another > > different caller causing the same issue because of the same reason. So > > we should protect it with sk_callback_lock read lock because the writer > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > --- > > net/core/skmsg.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > index 4d75ef9d24bf..67c4c01c5235 100644 > > --- a/net/core/skmsg.c > > +++ b/net/core/skmsg.c > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > msg->skb = skb; > > > > sk_psock_queue_msg(psock, msg); > > + read_lock_bh(&sk->sk_callback_lock); > > sk_psock_data_ready(sk, psock); > > + read_unlock_bh(&sk->sk_callback_lock); > > return copied; > > } > > The problem is the check and then usage presumably it is already set > to NULL: > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > { > if (psock->saved_data_ready) > psock->saved_data_ready(sk); Yes. > > > I'm thinking we might be able to get away with just a READ_ONCE here with > similar WRITE_ONCE on other side. Something like this, The simple fix that popped into my mind at the beginning is the same as you: adding the READ_ONCE/WRITE_ONCE pair. > > sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > { > saved_data_ready = READ_ONCE(psock->saved_data_ready) > > if (saved_data_ready) > saved_data_ready(sk) > .... > > And then in sk_psock_stop_verdict, > > WRITE_ONCE(sk->sk_data_ready, psock->saved_data_ready); > WRITE_ONCE(psock->saved_data_ready, NULL); > > And because we don't actually release the sock until a RCU grace period we > should be OK. The TCP stack manages to work correctly without wrapping > tcp_data_ready in locks like this. But nice thing there is you don't change > this callback on live sockets. > > I think at least to keep backport simply above patch is ok, but lets move > the read_lock_bh()/unlock_bh() into the sk_psock_data_ready() call and then > we don't duplicate this error again. Does that make sense? Thanks for your quick response and such detailed analysis:) It makes sense. I will move it into the sk_psock_data_ready(), then three callers for now and possible callers in future would never go wrong. > > Thanks, > John >
Jason Xing wrote: > Hello John, > > On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > > > Jason Xing wrote: > > > From: Jason Xing <kernelxing@tencent.com> > > > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > > syzbot reported [1]. > > > > > > [1] > > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > > sk_psock_put include/linux/skmsg.h:459 [inline] > > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > > __sock_release net/socket.c:659 [inline] > > > sock_close+0x68/0x150 net/socket.c:1421 > > > __fput+0x2c1/0x660 fs/file_table.c:422 > > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > > __do_sys_close fs/open.c:1556 [inline] > > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > > do_syscall_64+0xd3/0x1d0 > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > > sock_sendmsg_nosec net/socket.c:730 [inline] > > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > > ___sys_sendmsg net/socket.c:2638 [inline] > > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > > __do_sys_sendmsg net/socket.c:2676 [inline] > > > __se_sys_sendmsg net/socket.c:2674 [inline] > > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > > do_syscall_64+0xd3/0x1d0 > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > > similarly due to no protection of saved_data_ready. Here is another > > > different caller causing the same issue because of the same reason. So > > > we should protect it with sk_callback_lock read lock because the writer > > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > > --- > > > net/core/skmsg.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > index 4d75ef9d24bf..67c4c01c5235 100644 > > > --- a/net/core/skmsg.c > > > +++ b/net/core/skmsg.c > > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > > msg->skb = skb; > > > > > > sk_psock_queue_msg(psock, msg); > > > + read_lock_bh(&sk->sk_callback_lock); > > > sk_psock_data_ready(sk, psock); > > > + read_unlock_bh(&sk->sk_callback_lock); > > > return copied; > > > } > > > > The problem is the check and then usage presumably it is already set > > to NULL: > > > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > > { > > if (psock->saved_data_ready) > > psock->saved_data_ready(sk); > > Yes. > > > > > > > I'm thinking we might be able to get away with just a READ_ONCE here with > > similar WRITE_ONCE on other side. Something like this, > > The simple fix that popped into my mind at the beginning is the same > as you: adding the READ_ONCE/WRITE_ONCE pair. Let me know if you want to try doing a patch with the READ_ONCE/WRITE_ONCE we could push something like that through bpf-next I think. Just needs some extra thought and testing.
On Fri, Apr 5, 2024 at 12:45 PM John Fastabend <john.fastabend@gmail.com> wrote: > > Jason Xing wrote: > > Hello John, > > > > On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > Jason Xing wrote: > > > > From: Jason Xing <kernelxing@tencent.com> > > > > > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > > > syzbot reported [1]. > > > > > > > > [1] > > > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > > > sk_psock_put include/linux/skmsg.h:459 [inline] > > > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > > > __sock_release net/socket.c:659 [inline] > > > > sock_close+0x68/0x150 net/socket.c:1421 > > > > __fput+0x2c1/0x660 fs/file_table.c:422 > > > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > > > __do_sys_close fs/open.c:1556 [inline] > > > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > > > do_syscall_64+0xd3/0x1d0 > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > > > sock_sendmsg_nosec net/socket.c:730 [inline] > > > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > > > ___sys_sendmsg net/socket.c:2638 [inline] > > > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > > > __do_sys_sendmsg net/socket.c:2676 [inline] > > > > __se_sys_sendmsg net/socket.c:2674 [inline] > > > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > > > do_syscall_64+0xd3/0x1d0 > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > > > similarly due to no protection of saved_data_ready. Here is another > > > > different caller causing the same issue because of the same reason. So > > > > we should protect it with sk_callback_lock read lock because the writer > > > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > > > --- > > > > net/core/skmsg.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > > index 4d75ef9d24bf..67c4c01c5235 100644 > > > > --- a/net/core/skmsg.c > > > > +++ b/net/core/skmsg.c > > > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > > > msg->skb = skb; > > > > > > > > sk_psock_queue_msg(psock, msg); > > > > + read_lock_bh(&sk->sk_callback_lock); > > > > sk_psock_data_ready(sk, psock); > > > > + read_unlock_bh(&sk->sk_callback_lock); > > > > return copied; > > > > } > > > > > > The problem is the check and then usage presumably it is already set > > > to NULL: > > > > > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > > > { > > > if (psock->saved_data_ready) > > > psock->saved_data_ready(sk); > > > > Yes. > > > > > > > > > > > I'm thinking we might be able to get away with just a READ_ONCE here with > > > similar WRITE_ONCE on other side. Something like this, > > > > The simple fix that popped into my mind at the beginning is the same > > as you: adding the READ_ONCE/WRITE_ONCE pair. > > Let me know if you want to try doing a patch with the READ_ONCE/WRITE_ONCE > we could push something like that through bpf-next I think. Just needs > some extra thought and testing. Yes, I'm interested in it. Just a little bit worried that I cannot do it well. I will take some time to dig into it. BTW, would this modification conflict with the current patch? The final solution you're thinking of is using the READ_ONCE/WRITE_ONCE pair? Thanks, Jason
Jason Xing wrote: > On Fri, Apr 5, 2024 at 12:45 PM John Fastabend <john.fastabend@gmail.com> wrote: > > > > Jason Xing wrote: > > > Hello John, > > > > > > On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > > > Jason Xing wrote: > > > > > From: Jason Xing <kernelxing@tencent.com> > > > > > > > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > > > > syzbot reported [1]. > > > > > > > > > > [1] > > > > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > > > > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > > > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > > > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > > > > sk_psock_put include/linux/skmsg.h:459 [inline] > > > > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > > > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > > > > __sock_release net/socket.c:659 [inline] > > > > > sock_close+0x68/0x150 net/socket.c:1421 > > > > > __fput+0x2c1/0x660 fs/file_table.c:422 > > > > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > > > > __do_sys_close fs/open.c:1556 [inline] > > > > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > > > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > > > > do_syscall_64+0xd3/0x1d0 > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > > > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > > > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > > > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > > > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > > > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > > > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > > > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > > > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > > > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > > > > sock_sendmsg_nosec net/socket.c:730 [inline] > > > > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > > > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > > > > ___sys_sendmsg net/socket.c:2638 [inline] > > > > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > > > > __do_sys_sendmsg net/socket.c:2676 [inline] > > > > > __se_sys_sendmsg net/socket.c:2674 [inline] > > > > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > > > > do_syscall_64+0xd3/0x1d0 > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > > > > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > > > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > > > > similarly due to no protection of saved_data_ready. Here is another > > > > > different caller causing the same issue because of the same reason. So > > > > > we should protect it with sk_callback_lock read lock because the writer > > > > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > > > > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > > > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > > > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > > > > --- > > > > > net/core/skmsg.c | 2 ++ > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > > > index 4d75ef9d24bf..67c4c01c5235 100644 > > > > > --- a/net/core/skmsg.c > > > > > +++ b/net/core/skmsg.c > > > > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > > > > msg->skb = skb; > > > > > > > > > > sk_psock_queue_msg(psock, msg); > > > > > + read_lock_bh(&sk->sk_callback_lock); > > > > > sk_psock_data_ready(sk, psock); > > > > > + read_unlock_bh(&sk->sk_callback_lock); > > > > > return copied; > > > > > } > > > > > > > > The problem is the check and then usage presumably it is already set > > > > to NULL: > > > > > > > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > > > > { > > > > if (psock->saved_data_ready) > > > > psock->saved_data_ready(sk); > > > > > > Yes. > > > > > > > > > > > > > > > I'm thinking we might be able to get away with just a READ_ONCE here with > > > > similar WRITE_ONCE on other side. Something like this, > > > > > > The simple fix that popped into my mind at the beginning is the same > > > as you: adding the READ_ONCE/WRITE_ONCE pair. > > > > Let me know if you want to try doing a patch with the READ_ONCE/WRITE_ONCE > > we could push something like that through bpf-next I think. Just needs > > some extra thought and testing. > > Yes, I'm interested in it. Just a little bit worried that I cannot do > it well. I will take some time to dig into it. > > BTW, would this modification conflict with the current patch? The > final solution you're thinking of is using the READ_ONCE/WRITE_ONCE > pair? Idea would be you can drop the read_lock/unlock once READ_ONCE/WRITE_ONCE and such are in place. > > Thanks, > Jason
On Fri, Apr 5, 2024 at 10:58 PM John Fastabend <john.fastabend@gmail.com> wrote: > > Jason Xing wrote: > > On Fri, Apr 5, 2024 at 12:45 PM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > Jason Xing wrote: > > > > Hello John, > > > > > > > > On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > > > > > Jason Xing wrote: > > > > > > From: Jason Xing <kernelxing@tencent.com> > > > > > > > > > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > > > > > syzbot reported [1]. > > > > > > > > > > > > [1] > > > > > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > > > > > > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > > > > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > > > > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > > > > > sk_psock_put include/linux/skmsg.h:459 [inline] > > > > > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > > > > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > > > > > __sock_release net/socket.c:659 [inline] > > > > > > sock_close+0x68/0x150 net/socket.c:1421 > > > > > > __fput+0x2c1/0x660 fs/file_table.c:422 > > > > > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > > > > > __do_sys_close fs/open.c:1556 [inline] > > > > > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > > > > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > > > > > do_syscall_64+0xd3/0x1d0 > > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > > > > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > > > > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > > > > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > > > > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > > > > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > > > > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > > > > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > > > > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > > > > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > > > > > sock_sendmsg_nosec net/socket.c:730 [inline] > > > > > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > > > > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > > > > > ___sys_sendmsg net/socket.c:2638 [inline] > > > > > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > > > > > __do_sys_sendmsg net/socket.c:2676 [inline] > > > > > > __se_sys_sendmsg net/socket.c:2674 [inline] > > > > > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > > > > > do_syscall_64+0xd3/0x1d0 > > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > > > > > > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > > > > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > > > > > similarly due to no protection of saved_data_ready. Here is another > > > > > > different caller causing the same issue because of the same reason. So > > > > > > we should protect it with sk_callback_lock read lock because the writer > > > > > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > > > > > > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > > > > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > > > > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > > > > > --- > > > > > > net/core/skmsg.c | 2 ++ > > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > > > > index 4d75ef9d24bf..67c4c01c5235 100644 > > > > > > --- a/net/core/skmsg.c > > > > > > +++ b/net/core/skmsg.c > > > > > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > > > > > msg->skb = skb; > > > > > > > > > > > > sk_psock_queue_msg(psock, msg); > > > > > > + read_lock_bh(&sk->sk_callback_lock); > > > > > > sk_psock_data_ready(sk, psock); > > > > > > + read_unlock_bh(&sk->sk_callback_lock); > > > > > > return copied; > > > > > > } > > > > > > > > > > The problem is the check and then usage presumably it is already set > > > > > to NULL: > > > > > > > > > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > > > > > { > > > > > if (psock->saved_data_ready) > > > > > psock->saved_data_ready(sk); > > > > > > > > Yes. > > > > > > > > > > > > > > > > > > > I'm thinking we might be able to get away with just a READ_ONCE here with > > > > > similar WRITE_ONCE on other side. Something like this, > > > > > > > > The simple fix that popped into my mind at the beginning is the same > > > > as you: adding the READ_ONCE/WRITE_ONCE pair. > > > > > > Let me know if you want to try doing a patch with the READ_ONCE/WRITE_ONCE > > > we could push something like that through bpf-next I think. Just needs > > > some extra thought and testing. > > > > Yes, I'm interested in it. Just a little bit worried that I cannot do > > it well. I will take some time to dig into it. > > > > BTW, would this modification conflict with the current patch? The > > final solution you're thinking of is using the READ_ONCE/WRITE_ONCE > > pair? > > Idea would be you can drop the read_lock/unlock once READ_ONCE/WRITE_ONCE > and such are in place. Got it. Allow me to work on it. Lockless protection is surely better than rw lock. But can we let the current patch go into the tree first? Thanks, Jason > > > > > Thanks, > > Jason > >
Jason Xing wrote: > On Fri, Apr 5, 2024 at 10:58 PM John Fastabend <john.fastabend@gmail.com> wrote: > > > > Jason Xing wrote: > > > On Fri, Apr 5, 2024 at 12:45 PM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > > > Jason Xing wrote: > > > > > Hello John, > > > > > > > > > > On Thu, Apr 4, 2024 at 9:01 AM John Fastabend <john.fastabend@gmail.com> wrote: > > > > > > > > > > > > Jason Xing wrote: > > > > > > > From: Jason Xing <kernelxing@tencent.com> > > > > > > > > > > > > > > Fix NULL pointer data-races in sk_psock_skb_ingress_enqueue() which > > > > > > > syzbot reported [1]. > > > > > > > > > > > > > > [1] > > > > > > > BUG: KCSAN: data-race in sk_psock_drop / sk_psock_skb_ingress_enqueue > > > > > > > > > > > > > > write to 0xffff88814b3278b8 of 8 bytes by task 10724 on cpu 1: > > > > > > > sk_psock_stop_verdict net/core/skmsg.c:1257 [inline] > > > > > > > sk_psock_drop+0x13e/0x1f0 net/core/skmsg.c:843 > > > > > > > sk_psock_put include/linux/skmsg.h:459 [inline] > > > > > > > sock_map_close+0x1a7/0x260 net/core/sock_map.c:1648 > > > > > > > unix_release+0x4b/0x80 net/unix/af_unix.c:1048 > > > > > > > __sock_release net/socket.c:659 [inline] > > > > > > > sock_close+0x68/0x150 net/socket.c:1421 > > > > > > > __fput+0x2c1/0x660 fs/file_table.c:422 > > > > > > > __fput_sync+0x44/0x60 fs/file_table.c:507 > > > > > > > __do_sys_close fs/open.c:1556 [inline] > > > > > > > __se_sys_close+0x101/0x1b0 fs/open.c:1541 > > > > > > > __x64_sys_close+0x1f/0x30 fs/open.c:1541 > > > > > > > do_syscall_64+0xd3/0x1d0 > > > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > > > > > read to 0xffff88814b3278b8 of 8 bytes by task 10713 on cpu 0: > > > > > > > sk_psock_data_ready include/linux/skmsg.h:464 [inline] > > > > > > > sk_psock_skb_ingress_enqueue+0x32d/0x390 net/core/skmsg.c:555 > > > > > > > sk_psock_skb_ingress_self+0x185/0x1e0 net/core/skmsg.c:606 > > > > > > > sk_psock_verdict_apply net/core/skmsg.c:1008 [inline] > > > > > > > sk_psock_verdict_recv+0x3e4/0x4a0 net/core/skmsg.c:1202 > > > > > > > unix_read_skb net/unix/af_unix.c:2546 [inline] > > > > > > > unix_stream_read_skb+0x9e/0xf0 net/unix/af_unix.c:2682 > > > > > > > sk_psock_verdict_data_ready+0x77/0x220 net/core/skmsg.c:1223 > > > > > > > unix_stream_sendmsg+0x527/0x860 net/unix/af_unix.c:2339 > > > > > > > sock_sendmsg_nosec net/socket.c:730 [inline] > > > > > > > __sock_sendmsg+0x140/0x180 net/socket.c:745 > > > > > > > ____sys_sendmsg+0x312/0x410 net/socket.c:2584 > > > > > > > ___sys_sendmsg net/socket.c:2638 [inline] > > > > > > > __sys_sendmsg+0x1e9/0x280 net/socket.c:2667 > > > > > > > __do_sys_sendmsg net/socket.c:2676 [inline] > > > > > > > __se_sys_sendmsg net/socket.c:2674 [inline] > > > > > > > __x64_sys_sendmsg+0x46/0x50 net/socket.c:2674 > > > > > > > do_syscall_64+0xd3/0x1d0 > > > > > > > entry_SYSCALL_64_after_hwframe+0x6d/0x75 > > > > > > > > > > > > > > value changed: 0xffffffff83d7feb0 -> 0x0000000000000000 > > > > > > > > > > > > > > Reported by Kernel Concurrency Sanitizer on: > > > > > > > CPU: 0 PID: 10713 Comm: syz-executor.4 Tainted: G W 6.8.0-syzkaller-08951-gfe46a7dd189e #0 > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024 > > > > > > > > > > > > > > Prior to this, commit 4cd12c6065df ("bpf, sockmap: Fix NULL pointer > > > > > > > dereference in sk_psock_verdict_data_ready()") fixed one NULL pointer > > > > > > > similarly due to no protection of saved_data_ready. Here is another > > > > > > > different caller causing the same issue because of the same reason. So > > > > > > > we should protect it with sk_callback_lock read lock because the writer > > > > > > > side in the sk_psock_drop() uses "write_lock_bh(&sk->sk_callback_lock);". > > > > > > > > > > > > > > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") > > > > > > > Reported-by: syzbot+aa8c8ec2538929f18f2d@syzkaller.appspotmail.com > > > > > > > Closes: https://syzkaller.appspot.com/bug?extid=aa8c8ec2538929f18f2d > > > > > > > Signed-off-by: Jason Xing <kernelxing@tencent.com> > > > > > > > --- > > > > > > > net/core/skmsg.c | 2 ++ > > > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > > > > > diff --git a/net/core/skmsg.c b/net/core/skmsg.c > > > > > > > index 4d75ef9d24bf..67c4c01c5235 100644 > > > > > > > --- a/net/core/skmsg.c > > > > > > > +++ b/net/core/skmsg.c > > > > > > > @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, > > > > > > > msg->skb = skb; > > > > > > > > > > > > > > sk_psock_queue_msg(psock, msg); > > > > > > > + read_lock_bh(&sk->sk_callback_lock); > > > > > > > sk_psock_data_ready(sk, psock); > > > > > > > + read_unlock_bh(&sk->sk_callback_lock); > > > > > > > return copied; > > > > > > > } > > > > > > > > > > > > The problem is the check and then usage presumably it is already set > > > > > > to NULL: > > > > > > > > > > > > static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) > > > > > > { > > > > > > if (psock->saved_data_ready) > > > > > > psock->saved_data_ready(sk); > > > > > > > > > > Yes. > > > > > > > > > > > > > > > > > > > > > > > I'm thinking we might be able to get away with just a READ_ONCE here with > > > > > > similar WRITE_ONCE on other side. Something like this, > > > > > > > > > > The simple fix that popped into my mind at the beginning is the same > > > > > as you: adding the READ_ONCE/WRITE_ONCE pair. > > > > > > > > Let me know if you want to try doing a patch with the READ_ONCE/WRITE_ONCE > > > > we could push something like that through bpf-next I think. Just needs > > > > some extra thought and testing. > > > > > > Yes, I'm interested in it. Just a little bit worried that I cannot do > > > it well. I will take some time to dig into it. > > > > > > BTW, would this modification conflict with the current patch? The > > > final solution you're thinking of is using the READ_ONCE/WRITE_ONCE > > > pair? > > > > Idea would be you can drop the read_lock/unlock once READ_ONCE/WRITE_ONCE > > and such are in place. > > Got it. Allow me to work on it. Lockless protection is surely better > than rw lock. But can we let the current patch go into the tree first? Yes v2 fix has my Reviewed-by and should go in as a fix. The above improvement can go through bpf-next when its ready.
diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 4d75ef9d24bf..67c4c01c5235 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -552,7 +552,9 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, msg->skb = skb; sk_psock_queue_msg(psock, msg); + read_lock_bh(&sk->sk_callback_lock); sk_psock_data_ready(sk, psock); + read_unlock_bh(&sk->sk_callback_lock); return copied; }