bluetooth: fix use-after-delete

Message ID 20230131230105.139035-1-alex.coffin@matician.com (mailing list archive)
State New, archived

Checks

Context                             Check    Description
tedd_an/pre-ci_am                   success  Success
tedd_an/CheckPatch                  success  CheckPatch PASS
tedd_an/GitLint                     success  Gitlint PASS
tedd_an/SubjectPrefix               fail     "Bluetooth: " prefix is not specified in the subject
tedd_an/BuildKernel                 success  BuildKernel PASS
tedd_an/CheckAllWarning             success  CheckAllWarning PASS
tedd_an/CheckSparse                 success  CheckSparse PASS
tedd_an/CheckSmatch                 success  CheckSparse PASS
tedd_an/BuildKernel32               success  BuildKernel32 PASS
tedd_an/TestRunnerSetup             success  TestRunnerSetup PASS
tedd_an/TestRunner_l2cap-tester     success  TestRunner PASS
tedd_an/TestRunner_iso-tester       success  TestRunner PASS
tedd_an/TestRunner_bnep-tester      success  TestRunner PASS
tedd_an/TestRunner_mgmt-tester      success  TestRunner PASS
tedd_an/TestRunner_rfcomm-tester    success  TestRunner PASS
tedd_an/TestRunner_sco-tester       success  TestRunner PASS
tedd_an/TestRunner_ioctl-tester     success  TestRunner PASS
tedd_an/TestRunner_mesh-tester      success  TestRunner PASS
tedd_an/TestRunner_smp-tester       success  TestRunner PASS
tedd_an/TestRunner_userchan-tester  success  TestRunner PASS
tedd_an/IncrementalBuild            success  Incremental Build PASS

Commit Message

Alexander Coffin Jan. 31, 2023, 11:01 p.m. UTC
the use-after-delete occurs when the bluetooth connection closes while
messages are still being sent.

Signed-off-by: Alexander Coffin <alex.coffin@matician.com>
---
 net/bluetooth/l2cap_core.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Alexander Coffin Jan. 31, 2023, 11:11 p.m. UTC | #1
Hello,

The above patch addresses a Bluetooth crash that we have observed on
our robots. We have tested it for a few weeks and have not seen the
issue since applying it, so the patch not only should fix the issue in
theory but also appears to work for us in practice. I have pasted the
crash snippet below.

Jan 07 15:54:53 robot kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
Jan 07 15:54:53 robot kernel: Mem abort info:
Jan 07 15:54:53 robot kernel:   ESR = 0x96000005
Jan 07 15:54:53 robot kernel:   EC = 0x25: DABT (current EL), IL = 32 bits
Jan 07 15:54:53 robot kernel:   SET = 0, FnV = 0
Jan 07 15:54:53 robot kernel:   EA = 0, S1PTW = 0
Jan 07 15:54:53 robot kernel: Data abort info:
Jan 07 15:54:53 robot kernel:   ISV = 0, ISS = 0x00000005
Jan 07 15:54:54 robot kernel:   CM = 0, WnR = 0
Jan 07 15:54:54 robot kernel: user pgtable: 4k pages, 39-bit VAs, pgdp=0000000060557000
Jan 07 15:54:54 robot kernel: [0000000000000010] pgd=0000000000000000, pud=0000000000000000
Jan 07 15:54:54 robot kernel: Internal error: Oops: 96000005 [#1] PREEMPT SMP
Jan 07 15:54:54 robot kernel: Modules linked in: bnep hci_uart usb_f_ncm u_ether usb_f_acm u_serial REDACTED st7701s(O) cdc_acm rtc_bucha(O) libcomposite btsdio bluetooth ecdh_generic ecc brcmfmac cfg80211 REDACTED
Jan 07 15:54:54 robot kernel: CPU: 0 PID: 446 Comm: agent-rt Tainted: P           O      5.4.215-yocto-standard-cv2 #1
Jan 07 15:54:54 robot kernel: Hardware name: REDACTED
Jan 07 15:54:54 robot kernel: pstate: 20000005 (nzCv daif -PAN -UAO)
Jan 07 15:54:54 robot kernel: pc : l2cap_chan_send+0x34c/0xf50 [bluetooth]
Jan 07 15:54:54 robot kernel: lr : l2cap_chan_send+0x314/0xf50 [bluetooth]
Jan 07 15:54:54 robot kernel: sp : ffffffc07ab8baa0
Jan 07 15:54:54 robot kernel: x29: ffffffc07ab8baa0 x28: ffffff806758d500
Jan 07 15:54:54 robot kernel: x27: 00000000000005da x26: ffffffc07ab8bb50
Jan 07 15:54:54 robot kernel: x25: ffffff805368d2c8 x24: 00000000000000f5
Jan 07 15:54:54 robot kernel: x23: ffffffc07ab8bbe0 x22: 00000000000005d8
Jan 07 15:54:54 robot kernel: x21: 00000000000005da x20: 00000000000000f5
Jan 07 15:54:54 robot kernel: x19: ffffff80605e6c00 x18: 0000000000000000
Jan 07 15:54:54 robot kernel: x17: 0000000000000000 x16: 0000000000000000
Jan 07 15:54:54 robot kernel: x15: 0000000000000000 x14: 733276632b373432
Jan 07 15:54:54 robot kernel: x13: 343166322d623730 x12: 3130333230322d65
Jan 07 15:54:54 robot kernel: x11: 7361656c65722722 x10: 3731652d61706f63
Jan 07 15:54:54 robot kernel: x9 : 081a01d101ca6705 x8 : ebde183f0196d44d
Jan 07 15:54:54 robot kernel: x7 : 0140013802304553 x6 : ffffff805368d103
Jan 07 15:54:54 robot kernel: x5 : ffffff805368d2c0 x4 : 00000000000005da
Jan 07 15:54:54 robot kernel: x3 : 000000007fffffff x2 : ffffffc008c88380
Jan 07 15:54:54 robot kernel: x1 : 0000000000000000 x0 : 0000000000000000
Jan 07 15:54:54 robot kernel: Call trace:
Jan 07 15:54:54 robot kernel:  l2cap_chan_send+0x34c/0xf50 [bluetooth]
Jan 07 15:54:54 robot kernel:  l2cap_sock_sendmsg+0x100/0x138 [bluetooth]
Jan 07 15:54:54 robot kernel:  sock_write_iter+0xa0/0x108
Jan 07 15:54:54 robot kernel:  do_iter_readv_writev+0x144/0x1c8
Jan 07 15:54:54 robot kernel:  do_iter_write+0x98/0x1a0
Jan 07 15:54:54 robot kernel:  vfs_writev+0xc0/0x110
Jan 07 15:54:54 robot kernel:  do_writev+0x80/0x108
Jan 07 15:54:54 robot kernel:  __arm64_sys_writev+0x28/0x38
Jan 07 15:54:54 robot kernel:  el0_svc_common.constprop.0+0x78/0x1a0
Jan 07 15:54:54 robot kernel:  el0_svc_handler+0x34/0xa0
Jan 07 15:54:54 robot kernel:  el0_svc+0x8/0x600
Jan 07 15:54:54 robot kernel: Code: f9004fe3 f94047e0 d2800001 f9417a62 (b9401018)
Jan 07 15:54:54 robot kernel: ---[ end trace 35098a74adb57ee8 ]---

Regards,
Alexander Coffin
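
As an aside for readers of the trace above, here is one hedged way to check that it is consistent with a NULL base pointer. On arm64 the word in parentheses on the `Code:` line is the instruction at the faulting pc. The sketch below decodes it as an AArch64 `LDR Wt, [Xn, #imm]` (unsigned-offset form) and computes the address it would touch. The helper names are hypothetical, and the thread does not say which structure field sits at that offset.

```c
#include <stdint.h>

/* Fields of an AArch64 "LDR Wt, [Xn, #imm]" (unsigned-offset form). */
struct ldr_uimm {
	unsigned int rt;     /* destination register number */
	unsigned int rn;     /* base register number */
	unsigned int offset; /* byte offset added to the base */
};

/* Returns 0 and fills *out if insn is a 32-bit LDR (immediate, unsigned
 * offset); returns -1 for any other encoding.
 */
static int decode_ldr_w_uimm(uint32_t insn, struct ldr_uimm *out)
{
	/* Bits 31:22 of this form are fixed at 1011100101. */
	if ((insn & 0xffc00000u) != 0xb9400000u)
		return -1;

	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	/* imm12 is scaled by the access size: 4 bytes for a W register. */
	out->offset = ((insn >> 10) & 0xfff) * 4;
	return 0;
}

/* Address the load would touch, given the value of the base register. */
static uint64_t ldr_fault_address(const struct ldr_uimm *ldr, uint64_t base)
{
	return base + ldr->offset;
}
```

Feeding it 0xb9401018 yields rt = 24, rn = 0, offset = 16, i.e. `ldr w24, [x0, #16]`. The register dump shows x0 = 0, so the load targets address 0x10, which matches the reported fault address.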

bluez.test.bot@gmail.com Jan. 31, 2023, 11:36 p.m. UTC | #2
This is an automated email; please do not reply to it!

Dear submitter,

Thank you for submitting the patches to the Linux Bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=717519

---Test result---

Test Summary:
CheckPatch                    PASS      0.63 seconds
GitLint                       PASS      0.24 seconds
SubjectPrefix                 FAIL      0.41 seconds
BuildKernel                   PASS      37.65 seconds
CheckAllWarning               PASS      41.35 seconds
CheckSparse                   PASS      46.47 seconds
CheckSmatch                   PASS      126.09 seconds
BuildKernel32                 PASS      36.64 seconds
TestRunnerSetup               PASS      526.49 seconds
TestRunner_l2cap-tester       PASS      18.64 seconds
TestRunner_iso-tester         PASS      20.74 seconds
TestRunner_bnep-tester        PASS      6.79 seconds
TestRunner_mgmt-tester        PASS      129.13 seconds
TestRunner_rfcomm-tester      PASS      10.72 seconds
TestRunner_sco-tester         PASS      9.83 seconds
TestRunner_ioctl-tester       PASS      11.65 seconds
TestRunner_mesh-tester        PASS      8.62 seconds
TestRunner_smp-tester         PASS      9.70 seconds
TestRunner_userchan-tester    PASS      7.18 seconds
IncrementalBuild              PASS      34.02 seconds

Details
##############################
Test: SubjectPrefix - FAIL
Desc: Check subject contains "Bluetooth" prefix
Output:
"Bluetooth: " prefix is not specified in the subject


---
Regards,
Linux Bluetooth
Luiz Augusto von Dentz Jan. 31, 2023, 11:56 p.m. UTC | #3
Hi Alexander,

On Tue, Jan 31, 2023 at 3:02 PM Alexander Coffin
<alex.coffin@matician.com> wrote:
>
> the use-after-delete occurs when the bluetooth connection closes while
> messages are still being sent.
>
> Signed-off-by: Alexander Coffin <alex.coffin@matician.com>
> ---
>  net/bluetooth/l2cap_core.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index a3e0dc6a6e73..6cf5ed9a1a7b 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -2350,6 +2350,10 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
>                                          struct msghdr *msg, int len,
>                                          int count, struct sk_buff *skb)
>  {
> +       /* `conn` may be NULL, or dangling as this is called from some contexts
> +        * where `chan->ops->alloc_skb` was just called, and the connection
> +        * status was not checked afterward.
> +        */
>         struct l2cap_conn *conn = chan->conn;
>         struct sk_buff **frag;
>         int sent = 0;
> @@ -2365,6 +2369,13 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
>         while (len) {
>                 struct sk_buff *tmp;
>
> +               /* Channel lock is released before requesting new skb and then
> +                * reacquired thus we need to recheck channel state.
> +                * chan->state == BT_CONNECTED implies that conn is still valid.
> +                */
> +               if (chan->state != BT_CONNECTED)
> +                       return -ENOTCONN;
> +
>                 count = min_t(unsigned int, conn->mtu, len);
>
>                 tmp = chan->ops->alloc_skb(chan, 0, count,
> --
> 2.30.2

How about if we do it at l2cap_sock_alloc_skb_cb:

https://gist.github.com/Vudentz/3a1d8c4c3a80e9490ff98118fb135656

Since that is where we do the unlock to allocate the skb and then lock again.
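
The idea of rechecking at the point where the lock is dropped and retaken can be sketched in userspace as follows. This is a single-threaded model under stated assumptions, not the contents of the linked gist: `struct chan`, `alloc_buf_cb`, `close_chan` and the `while_unlocked` hook are all hypothetical names, and the hook merely stands in for another thread that runs while the lock is released.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

enum chan_state { CHAN_CONNECTED, CHAN_CLOSED };

/* Hypothetical, single-threaded stand-in for a channel. */
struct chan {
	bool locked;
	enum chan_state state;
	/* Demo hook: runs while the lock is dropped, standing in for
	 * another thread that gets scheduled at that point.
	 */
	void (*while_unlocked)(struct chan *c);
};

static void chan_lock(struct chan *c)   { c->locked = true; }
static void chan_unlock(struct chan *c) { c->locked = false; }

/* Simulates the connection going away during the unlocked window. */
static void close_chan(struct chan *c)
{
	chan_lock(c);
	c->state = CHAN_CLOSED;
	chan_unlock(c);
}

/* Called with the channel locked; returns with it locked again.
 * On failure returns NULL and stores a negative errno in *err.
 */
static void *alloc_buf_cb(struct chan *c, size_t len, int *err)
{
	void *buf;

	chan_unlock(c);          /* the allocation may block */
	buf = malloc(len);
	if (c->while_unlocked)
		c->while_unlocked(c);
	chan_lock(c);

	if (!buf) {
		*err = -ENOMEM;
		return NULL;
	}

	/* The lock was dropped, so the state seen before the call is stale. */
	if (c->state != CHAN_CONNECTED) {
		free(buf);
		*err = -ENOTCONN;
		return NULL;
	}

	*err = 0;
	return buf;
}
```

With `while_unlocked` set to `close_chan`, the callback frees the buffer and reports -ENOTCONN, so every caller sees the failure without needing its own state check. With the hook left NULL, the buffer is returned as usual.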
Alexander Coffin Feb. 1, 2023, 2:39 a.m. UTC | #4
Hi Luiz,

I like your proposed patch, and think it is much better (and what
should be used instead of mine) assuming that you have tested, and
verified that no code requires allocating buffers on a closed channel.

Regards,
Alexander Coffin
Luiz Augusto von Dentz Feb. 1, 2023, 6:48 p.m. UTC | #5
Hi Alexander,

On Tue, Jan 31, 2023 at 6:39 PM Alexander Coffin
<alex.coffin@matician.com> wrote:
>
> Hi Luiz,
>
> I like your proposed patch, and think it is much better (and what
> should be used instead of mine) assuming that you have tested, and
> verified that no code requires allocating buffers on a closed channel.

I will prepare a proper patch then, but I don't have access to your
testing bot so it is probably a good idea that you test it yourself as
well.

> Regards,
> Alexander Coffin
Alexander Coffin Feb. 1, 2023, 7:03 p.m. UTC | #6
Hi Luiz,

> I will prepare a proper patch then, but I don't have access to your
> testing bot so it is probably a good idea that you test it yourself as
> well.

Sure, we will switch to it, and let you know if we face any issues
(the issues don't always pop up right away, and we don't always test
the newest version of the kernel on all of our bots so it may be a few
days before we would notice issues).

Regards,
Alexander Coffin
Patch

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index a3e0dc6a6e73..6cf5ed9a1a7b 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2350,6 +2350,10 @@  static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 					 struct msghdr *msg, int len,
 					 int count, struct sk_buff *skb)
 {
+	/* `conn` may be NULL, or dangling as this is called from some contexts
+	 * where `chan->ops->alloc_skb` was just called, and the connection
+	 * status was not checked afterward.
+	 */
 	struct l2cap_conn *conn = chan->conn;
 	struct sk_buff **frag;
 	int sent = 0;
@@ -2365,6 +2369,13 @@  static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 	while (len) {
 		struct sk_buff *tmp;
 
+		/* Channel lock is released before requesting new skb and then
+		 * reacquired thus we need to recheck channel state.
+		 * chan->state == BT_CONNECTED implies that conn is still valid.
+		 */
+		if (chan->state != BT_CONNECTED)
+			return -ENOTCONN;
+
 		count = min_t(unsigned int, conn->mtu, len);
 
 		tmp = chan->ops->alloc_skb(chan, 0, count,
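
For readers who want to see the effect of the per-iteration check in isolation, the following is a minimal userspace model of the loop, assuming a simplified channel and a stand-in allocator. None of these names are kernel APIs: `alloc_fragment`, `send_fragments` and the `close_after` knob are hypothetical and exist only to simulate the connection closing during an allocation.

```c
#include <errno.h>
#include <stddef.h>

enum chan_state { CHAN_CONNECTED, CHAN_CLOSED };

struct conn { unsigned int mtu; };

struct chan {
	enum chan_state state;
	struct conn *conn;
	int allocs;      /* fragments allocated so far */
	int close_after; /* demo knob: close during the Nth allocation, 0 = never */
};

/* Stand-in for the allocation callback. In this model the channel may be
 * closed as a side effect, as if the lock had been dropped and another
 * thread had torn the connection down.
 */
static void alloc_fragment(struct chan *c)
{
	c->allocs++;
	if (c->close_after && c->allocs == c->close_after) {
		c->state = CHAN_CLOSED;
		c->conn = NULL;
	}
}

/* Splits len bytes into MTU-sized fragments. Returns the number of bytes
 * covered, or a negative errno. Assumes conn->mtu is non-zero.
 */
static int send_fragments(struct chan *c, size_t len)
{
	struct conn *conn = c->conn; /* may be NULL or stale after an allocation */
	int sent = 0;

	while (len) {
		size_t count;

		/* Recheck the state before every use of conn. */
		if (c->state != CHAN_CONNECTED)
			return -ENOTCONN;

		count = conn->mtu < len ? conn->mtu : len;
		alloc_fragment(c);
		sent += (int)count;
		len -= count;
	}

	return sent;
}
```

In this model a 250-byte send with an MTU of 100 takes three allocations when the channel stays connected. If the channel closes during the first allocation, the second iteration returns -ENOTCONN before `conn` is touched again.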