Message ID | tencent_0CCE4C90A7C306FCD2EE466AC9882EFFAE06@qq.com (mailing list archive) |
---|---|
State | Superseded |
Series | bluetooth/l2cap: sync sock recv cb and release |
Context | Check | Description |
---|---|---|
tedd_an/pre-ci_am | success | Success |
tedd_an/CheckPatch | warning | WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?) #81: The problem occurs between the system call to close the sock and hci_rx_work, WARNING: Reported-by: should be immediately followed by Closes: with a URL to the report #99: Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> total: 0 errors, 2 warnings, 0 checks, 41 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. /github/workspace/src/src/13693909.patch has style problems, please review. NOTE: Ignored message types: UNKNOWN_COMMIT_ID NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. |
tedd_an/GitLint | fail | WARNING: I3 - ignore-body-lines: gitlint will be switching from using Python regex 'match' (match beginning) to 'search' (match anywhere) semantics. Please review your ignore-body-lines.regex option accordingly. To remove this warning, set general.regex-style-search=True. More details: https://jorisroovers.github.io/gitlint/configuration/#regex-style-search 4: B1 Line exceeds max length (86>80): "where the former releases the sock and the latter accesses it without lock protection." 9: B3 Line contains hard tab characters (\t): " l2cap_sock_release hci_acldata_packet" 10: B3 Line contains hard tab characters (\t): " l2cap_sock_kill l2cap_recv_frame" 11: B3 Line contains hard tab characters (\t): " sk_free l2cap_conless_channel" 12: B3 Line contains hard tab characters (\t): " l2cap_sock_recv_cb" 18: B1 Line exceeds max length (82>80): "Add a chan mutex in the rx callback of the sock to achieve synchronization between" |
tedd_an/SubjectPrefix | fail | "Bluetooth: " prefix is not specified in the subject |
tedd_an/BuildKernel | success | BuildKernel PASS |
tedd_an/CheckAllWarning | success | CheckAllWarning PASS |
tedd_an/CheckSparse | success | CheckSparse PASS |
tedd_an/CheckSmatch | success | CheckSparse PASS |
tedd_an/BuildKernel32 | success | BuildKernel32 PASS |
tedd_an/TestRunnerSetup | success | TestRunnerSetup PASS |
Hi Edward,

On Tue, Jun 11, 2024 at 10:50 AM Edward Adam Davis <eadavis@qq.com> wrote:
>
> The problem occurs between the system call to close the sock and hci_rx_work,
> where the former releases the sock and the latter accesses it without lock protection.
>
>            CPU0                       CPU1
>            ----                       ----
>            sock_close                 hci_rx_work
>            l2cap_sock_release         hci_acldata_packet
>            l2cap_sock_kill            l2cap_recv_frame
>            sk_free                    l2cap_conless_channel
>                                       l2cap_sock_recv_cb
>
> If hci_rx_work processes the data that needs to be received before the sock is
> closed, then everything is normal; Otherwise, the work thread may access the
> released sock when receiving data.
>
> Add a chan mutex in the rx callback of the sock to achieve synchronization between
> the sock release and recv cb.
>
> Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> index 6db60946c627..f3e9236293e1 100644
> --- a/net/bluetooth/l2cap_sock.c
> +++ b/net/bluetooth/l2cap_sock.c
> @@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
>  	l2cap_chan_hold(chan);
>  	l2cap_chan_lock(chan);
>
> +	if (refcount_read(&sk->sk_refcnt) == 1)
> +		chan->data = NULL;

Might be a good idea to add some comment on why checking for refcount
== 1 is the right thing to do here, or perhaps we can always assign
chan->data to NULL, instead of that perhaps we could have it done in
l2cap_sock_kill?

>  	sock_orphan(sk);
>  	l2cap_sock_kill(sk);
>
> @@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
>
>  static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
>  {
> -	struct sock *sk = chan->data;
> -	struct l2cap_pinfo *pi = l2cap_pi(sk);
> +	struct sock *sk;
> +	struct l2cap_pinfo *pi;
>  	int err;
>
> -	lock_sock(sk);
> +	l2cap_chan_hold(chan);
> +	l2cap_chan_lock(chan);
> +	sk = chan->data;
> +
> +	if (!sk) {
> +		l2cap_chan_unlock(chan);
> +		l2cap_chan_put(chan);
> +		return -ENXIO;
> +	}
>
> +	pi = l2cap_pi(sk);
> +	lock_sock(sk);
>  	if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
>  		err = -ENOMEM;
>  		goto done;
> @@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
>
>  done:
>  	release_sock(sk);
> +	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>
>  	return err;
>  }
> --
> 2.43.0
>
Hi Luiz Augusto von Dentz,

On Tue, 11 Jun 2024 15:24:52 -0400, Luiz Augusto von Dentz wrote:
> > The problem occurs between the system call to close the sock and hci_rx_work,
> > where the former releases the sock and the latter accesses it without lock protection.
> >
> >            CPU0                       CPU1
> >            ----                       ----
> >            sock_close                 hci_rx_work
> >            l2cap_sock_release         hci_acldata_packet
> >            l2cap_sock_kill            l2cap_recv_frame
> >            sk_free                    l2cap_conless_channel
> >                                       l2cap_sock_recv_cb
> >
> > If hci_rx_work processes the data that needs to be received before the sock is
> > closed, then everything is normal; Otherwise, the work thread may access the
> > released sock when receiving data.
> >
> > Add a chan mutex in the rx callback of the sock to achieve synchronization between
> > the sock release and recv cb.
> >
> > Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > ---
> >  net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
> >  1 file changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> > index 6db60946c627..f3e9236293e1 100644
> > --- a/net/bluetooth/l2cap_sock.c
> > +++ b/net/bluetooth/l2cap_sock.c
> > @@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
> >  	l2cap_chan_hold(chan);
> >  	l2cap_chan_lock(chan);
> >
> > +	if (refcount_read(&sk->sk_refcnt) == 1)
> > +		chan->data = NULL;
>
> Might be a good idea to add some comment on why checking for refcount
> == 1 is the right thing to do here, or perhaps we can always assign
> chan->data to NULL, instead of that perhaps we could have it done in
> l2cap_sock_kill?

In this case, it is possible to always set chan->data to NULL, but I think
a better approach would be to release the sock in l2cap_sock_kill when the
reference count of the sock is 1. Previously, it was mentioned that setting
chan->data to NULL is more rigorous.

Also, chan->data is bound one-to-one to the sock; if the sock is not released,
I can't confirm whether setting chan->data to NULL will affect the operation
of the sock on other paths.

>
> >  	sock_orphan(sk);
> >  	l2cap_sock_kill(sk);
> >
> > @@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
> >
> >  static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> >  {
> > -	struct sock *sk = chan->data;
> > -	struct l2cap_pinfo *pi = l2cap_pi(sk);
> > +	struct sock *sk;
> > +	struct l2cap_pinfo *pi;
> >  	int err;
> >
> > -	lock_sock(sk);
> > +	l2cap_chan_hold(chan);
> > +	l2cap_chan_lock(chan);
> > +	sk = chan->data;
> > +
> > +	if (!sk) {
> > +		l2cap_chan_unlock(chan);
> > +		l2cap_chan_put(chan);
> > +		return -ENXIO;
> > +	}
> >
> > +	pi = l2cap_pi(sk);
> > +	lock_sock(sk);
> >  	if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
> >  		err = -ENOMEM;
> >  		goto done;
> > @@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
> >
> >  done:
> >  	release_sock(sk);
> > +	l2cap_chan_unlock(chan);
> > +	l2cap_chan_put(chan);
> >
> >  	return err;
> >  }

--
Edward
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 6db60946c627..f3e9236293e1 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1413,6 +1413,8 @@ static int l2cap_sock_release(struct socket *sock)
 	l2cap_chan_hold(chan);
 	l2cap_chan_lock(chan);
 
+	if (refcount_read(&sk->sk_refcnt) == 1)
+		chan->data = NULL;
 	sock_orphan(sk);
 	l2cap_sock_kill(sk);
 
@@ -1481,12 +1483,22 @@ static struct l2cap_chan *l2cap_sock_new_connection_cb(struct l2cap_chan *chan)
 
 static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
 {
-	struct sock *sk = chan->data;
-	struct l2cap_pinfo *pi = l2cap_pi(sk);
+	struct sock *sk;
+	struct l2cap_pinfo *pi;
 	int err;
 
-	lock_sock(sk);
+	l2cap_chan_hold(chan);
+	l2cap_chan_lock(chan);
+	sk = chan->data;
+
+	if (!sk) {
+		l2cap_chan_unlock(chan);
+		l2cap_chan_put(chan);
+		return -ENXIO;
+	}
 
+	pi = l2cap_pi(sk);
+	lock_sock(sk);
 	if (chan->mode == L2CAP_MODE_ERTM && !list_empty(&pi->rx_busy)) {
 		err = -ENOMEM;
 		goto done;
@@ -1535,6 +1547,8 @@ static int l2cap_sock_recv_cb(struct l2cap_chan *chan, struct sk_buff *skb)
 
 done:
 	release_sock(sk);
+	l2cap_chan_unlock(chan);
+	l2cap_chan_put(chan);
 
 	return err;
 }
The problem occurs between the system call to close the sock and hci_rx_work,
where the former releases the sock and the latter accesses it without lock protection.

           CPU0                       CPU1
           ----                       ----
           sock_close                 hci_rx_work
           l2cap_sock_release         hci_acldata_packet
           l2cap_sock_kill            l2cap_recv_frame
           sk_free                    l2cap_conless_channel
                                      l2cap_sock_recv_cb

If hci_rx_work processes the data that needs to be received before the sock is
closed, then everything is normal; Otherwise, the work thread may access the
released sock when receiving data.

Add a chan mutex in the rx callback of the sock to achieve synchronization between
the sock release and recv cb.

Reported-and-tested-by: syzbot+b7f6f8c9303466e16c8a@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 net/bluetooth/l2cap_sock.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
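
For readers who want to see the synchronization idea from the commit message in isolation, here is a minimal userspace sketch. It is not kernel code: `fake_sock`, `fake_chan`, and the three helper functions are hypothetical stand-ins, a pthread mutex plays the role of `l2cap_chan_lock()`, and the sketch always clears the back-pointer on release rather than checking the socket refcount as the patch does. It also omits the `l2cap_chan_hold()`/`l2cap_chan_put()` reference handling.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock. */
struct fake_sock {
	int rx_count;
};

/* Hypothetical stand-in for struct l2cap_chan. */
struct fake_chan {
	pthread_mutex_t lock;   /* plays the role of l2cap_chan_lock() */
	struct fake_sock *data; /* back-pointer, like chan->data */
};

/* Set up a channel with one attached socket. */
static void fake_chan_init(struct fake_chan *chan)
{
	assert(chan != NULL);
	pthread_mutex_init(&chan->lock, NULL);
	chan->data = calloc(1, sizeof(*chan->data));
	assert(chan->data != NULL);
}

/* Release path: detach the socket under the channel lock, then free it. */
static void fake_sock_release(struct fake_chan *chan)
{
	struct fake_sock *sk;

	pthread_mutex_lock(&chan->lock);
	sk = chan->data;
	chan->data = NULL; /* assumption: always cleared, no refcount check */
	pthread_mutex_unlock(&chan->lock);

	free(sk);
}

/* Receive path: read the back-pointer and use it only under the lock. */
static int fake_sock_recv_cb(struct fake_chan *chan)
{
	int err = 0;

	pthread_mutex_lock(&chan->lock);
	if (!chan->data)
		err = -ENXIO; /* socket already released */
	else
		chan->data->rx_count++;
	pthread_mutex_unlock(&chan->lock);

	return err;
}
```

Because both paths take the same lock, the receive callback either runs before the release and sees a live socket, or runs after it and sees `NULL`; it never dereferences a freed pointer. Calling `fake_sock_recv_cb()` after `fake_sock_release()` returns `-ENXIO`, mirroring the early return added to `l2cap_sock_recv_cb()` in the patch.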