Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync

Message ID	20230809235622.1664227-1-luiz.dentz@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <linux-bluetooth-owner@vger.kernel.org> From: Luiz Augusto von Dentz <luiz.dentz@gmail.com> To: linux-bluetooth@vger.kernel.org Subject: [PATCH] Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync Date: Wed, 9 Aug 2023 16:56:22 -0700 Message-ID: <20230809235622.1664227-1-luiz.dentz@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync \| expand Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync

Context	Check	Description
tedd_an/pre-ci_am	success	Success
tedd_an/CheckPatch	success	CheckPatch PASS
tedd_an/GitLint	success	Gitlint PASS
tedd_an/SubjectPrefix	success	Gitlint PASS
tedd_an/BuildKernel	success	BuildKernel PASS
tedd_an/CheckAllWarning	success	CheckAllWarning PASS
tedd_an/CheckSparse	success	CheckSparse PASS
tedd_an/CheckSmatch	success	CheckSparse PASS
tedd_an/BuildKernel32	success	BuildKernel32 PASS
tedd_an/TestRunnerSetup	success	TestRunnerSetup PASS
tedd_an/TestRunner_l2cap-tester	success	TestRunner PASS
tedd_an/TestRunner_iso-tester	success	TestRunner PASS
tedd_an/TestRunner_bnep-tester	success	TestRunner PASS
tedd_an/TestRunner_mgmt-tester	success	TestRunner PASS
tedd_an/TestRunner_rfcomm-tester	success	TestRunner PASS
tedd_an/TestRunner_sco-tester	success	TestRunner PASS
tedd_an/TestRunner_ioctl-tester	success	TestRunner PASS
tedd_an/TestRunner_mesh-tester	success	TestRunner PASS
tedd_an/TestRunner_smp-tester	success	TestRunner PASS
tedd_an/TestRunner_userchan-tester	success	TestRunner PASS
tedd_an/IncrementalBuild	success	Incremental Build PASS

Luiz Augusto von Dentz Aug. 9, 2023, 11:56 p.m. UTC

From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

Use-after-free can occur in hci_disconnect_all_sync if a connection is
deleted by concurrent processing of a controller event.

To prevent this the code now tries to iterate over the list backwards
to ensure the links are cleanup before its parents, also it no longer
relies on a cursor, instead it always uses the last element since
hci_abort_conn_sync is guaranteed to call hci_conn_del.

UAF crash log:
==================================================================
BUG: KASAN: slab-use-after-free in hci_set_powered_sync
(net/bluetooth/hci_sync.c:5424) [bluetooth]
Read of size 8 at addr ffff888009d9c000 by task kworker/u9:0/124

CPU: 0 PID: 124 Comm: kworker/u9:0 Tainted: G        W
6.5.0-rc1+ #10
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.2-1.fc38 04/01/2014
Workqueue: hci0 hci_cmd_sync_work [bluetooth]
Call Trace:
 <TASK>
 dump_stack_lvl+0x5b/0x90
 print_report+0xcf/0x670
 ? __virt_addr_valid+0xdd/0x160
 ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
 kasan_report+0xa6/0xe0
 ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
 ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
 hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
 ? __pfx_hci_set_powered_sync+0x10/0x10 [bluetooth]
 ? __pfx_lock_release+0x10/0x10
 ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
 hci_cmd_sync_work+0x137/0x220 [bluetooth]
 process_one_work+0x526/0x9d0
 ? __pfx_process_one_work+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? mark_held_locks+0x1a/0x90
 worker_thread+0x92/0x630
 ? __pfx_worker_thread+0x10/0x10
 kthread+0x196/0x1e0
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2c/0x50
 </TASK>

Allocated by task 1782:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 __kasan_kmalloc+0x8f/0xa0
 hci_conn_add+0xa5/0xa80 [bluetooth]
 hci_bind_cis+0x881/0x9b0 [bluetooth]
 iso_connect_cis+0x121/0x520 [bluetooth]
 iso_sock_connect+0x3f6/0x790 [bluetooth]
 __sys_connect+0x109/0x130
 __x64_sys_connect+0x40/0x50
 do_syscall_64+0x60/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Freed by task 695:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 kasan_save_free_info+0x2b/0x50
 __kasan_slab_free+0x10a/0x180
 __kmem_cache_free+0x14d/0x2e0
 device_release+0x5d/0xf0
 kobject_put+0xdf/0x270
 hci_disconn_complete_evt+0x274/0x3a0 [bluetooth]
 hci_event_packet+0x579/0x7e0 [bluetooth]
 hci_rx_work+0x287/0xaa0 [bluetooth]
 process_one_work+0x526/0x9d0
 worker_thread+0x92/0x630
 kthread+0x196/0x1e0
 ret_from_fork+0x2c/0x50
==================================================================

Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
---
 net/bluetooth/hci_sync.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

bluez.test.bot@gmail.com Aug. 10, 2023, 1:17 a.m. UTC | #1

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=774728

---Test result---

Test Summary:
CheckPatch                    PASS      0.73 seconds
GitLint                       PASS      0.34 seconds
SubjectPrefix                 PASS      0.13 seconds
BuildKernel                   PASS      34.17 seconds
CheckAllWarning               PASS      36.99 seconds
CheckSparse                   PASS      42.33 seconds
CheckSmatch                   PASS      114.33 seconds
BuildKernel32                 PASS      32.77 seconds
TestRunnerSetup               PASS      497.41 seconds
TestRunner_l2cap-tester       PASS      24.07 seconds
TestRunner_iso-tester         PASS      48.08 seconds
TestRunner_bnep-tester        PASS      10.83 seconds
TestRunner_mgmt-tester        PASS      217.70 seconds
TestRunner_rfcomm-tester      PASS      16.31 seconds
TestRunner_sco-tester         PASS      19.30 seconds
TestRunner_ioctl-tester       PASS      18.36 seconds
TestRunner_mesh-tester        PASS      13.48 seconds
TestRunner_smp-tester         PASS      14.60 seconds
TestRunner_userchan-tester    PASS      11.31 seconds
IncrementalBuild              PASS      30.92 seconds



---
Regards,
Linux Bluetooth

Luiz Augusto von Dentz Aug. 10, 2023, 5:54 p.m. UTC | #2

Hi Pauli,

On Wed, Aug 9, 2023 at 6:17 PM <bluez.test.bot@gmail.com> wrote:
>
> This is automated email and please do not reply to this email!
>
> Dear submitter,
>
> Thank you for submitting the patches to the linux bluetooth mailing list.
> This is a CI test results with your patch series:
> PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=774728
>
> ---Test result---
>
> Test Summary:
> CheckPatch                    PASS      0.73 seconds
> GitLint                       PASS      0.34 seconds
> SubjectPrefix                 PASS      0.13 seconds
> BuildKernel                   PASS      34.17 seconds
> CheckAllWarning               PASS      36.99 seconds
> CheckSparse                   PASS      42.33 seconds
> CheckSmatch                   PASS      114.33 seconds
> BuildKernel32                 PASS      32.77 seconds
> TestRunnerSetup               PASS      497.41 seconds
> TestRunner_l2cap-tester       PASS      24.07 seconds
> TestRunner_iso-tester         PASS      48.08 seconds
> TestRunner_bnep-tester        PASS      10.83 seconds
> TestRunner_mgmt-tester        PASS      217.70 seconds
> TestRunner_rfcomm-tester      PASS      16.31 seconds
> TestRunner_sco-tester         PASS      19.30 seconds
> TestRunner_ioctl-tester       PASS      18.36 seconds
> TestRunner_mesh-tester        PASS      13.48 seconds
> TestRunner_smp-tester         PASS      14.60 seconds
> TestRunner_userchan-tester    PASS      11.31 seconds
> IncrementalBuild              PASS      30.92 seconds

Any comments on this one? Perhaps we should add a test for it as well
since I don't think it was reproducible with the existing tests.

Pauli Virtanen Aug. 10, 2023, 8:55 p.m. UTC | #3

Hi Luiz,

ke, 2023-08-09 kello 16:56 -0700, Luiz Augusto von Dentz kirjoitti:
> From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> 
> Use-after-free can occur in hci_disconnect_all_sync if a connection is
> deleted by concurrent processing of a controller event.
> 
> To prevent this the code now tries to iterate over the list backwards
> to ensure the links are cleanup before its parents, also it no longer
> relies on a cursor, instead it always uses the last element since
> hci_abort_conn_sync is guaranteed to call hci_conn_del.
> 
> UAF crash log:
> ==================================================================
> BUG: KASAN: slab-use-after-free in hci_set_powered_sync
> (net/bluetooth/hci_sync.c:5424) [bluetooth]
> Read of size 8 at addr ffff888009d9c000 by task kworker/u9:0/124
> 
> CPU: 0 PID: 124 Comm: kworker/u9:0 Tainted: G        W
> 6.5.0-rc1+ #10
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.16.2-1.fc38 04/01/2014
> Workqueue: hci0 hci_cmd_sync_work [bluetooth]
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x5b/0x90
>  print_report+0xcf/0x670
>  ? __virt_addr_valid+0xdd/0x160
>  ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
>  kasan_report+0xa6/0xe0
>  ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
>  ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
>  hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
>  ? __pfx_hci_set_powered_sync+0x10/0x10 [bluetooth]
>  ? __pfx_lock_release+0x10/0x10
>  ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
>  hci_cmd_sync_work+0x137/0x220 [bluetooth]
>  process_one_work+0x526/0x9d0
>  ? __pfx_process_one_work+0x10/0x10
>  ? __pfx_do_raw_spin_lock+0x10/0x10
>  ? mark_held_locks+0x1a/0x90
>  worker_thread+0x92/0x630
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0x196/0x1e0
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x2c/0x50
>  </TASK>
> 
> Allocated by task 1782:
>  kasan_save_stack+0x33/0x60
>  kasan_set_track+0x25/0x30
>  __kasan_kmalloc+0x8f/0xa0
>  hci_conn_add+0xa5/0xa80 [bluetooth]
>  hci_bind_cis+0x881/0x9b0 [bluetooth]
>  iso_connect_cis+0x121/0x520 [bluetooth]
>  iso_sock_connect+0x3f6/0x790 [bluetooth]
>  __sys_connect+0x109/0x130
>  __x64_sys_connect+0x40/0x50
>  do_syscall_64+0x60/0x90
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> 
> Freed by task 695:
>  kasan_save_stack+0x33/0x60
>  kasan_set_track+0x25/0x30
>  kasan_save_free_info+0x2b/0x50
>  __kasan_slab_free+0x10a/0x180
>  __kmem_cache_free+0x14d/0x2e0
>  device_release+0x5d/0xf0
>  kobject_put+0xdf/0x270
>  hci_disconn_complete_evt+0x274/0x3a0 [bluetooth]
>  hci_event_packet+0x579/0x7e0 [bluetooth]
>  hci_rx_work+0x287/0xaa0 [bluetooth]
>  process_one_work+0x526/0x9d0
>  worker_thread+0x92/0x630
>  kthread+0x196/0x1e0
>  ret_from_fork+0x2c/0x50
> ==================================================================
> 
> Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
> Signed-off-by: Pauli Virtanen <pav@iki.fi>
> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> ---
>  net/bluetooth/hci_sync.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
> index 32fa9006f381..234da30292a4 100644
> --- a/net/bluetooth/hci_sync.c
> +++ b/net/bluetooth/hci_sync.c
> @@ -5429,7 +5429,11 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
>  		hci_dev_unlock(hdev);
>  		return 0;
>  	default:
> +		hci_dev_lock(hdev);
>  		conn->state = BT_CLOSED;
> +		hci_disconn_cfm(conn, reason);
> +		hci_conn_del(conn);
> +		hci_dev_unlock(hdev);
>  		return 0;
>  	}

When powering off we do not wait for the Disconnection_Complete or
Connection_Complete events from disconnect/cancel, but only the command
status.

In these cases it looks like hci_abort_conn_sync returns while the conn
still exists. Then the command may get sent again. If the controller
rejects the second disconnect/cancel command with error status then
conn probably gets deleted on second round (kprinting errors),
otherwise it loops here waiting until the conn goes away.

LE connection cancel maybe busy-loops here forever until the connection
completion event arrives, because of the HCI_CONN_CANCEL flag test,
negating the point of not waiting for that event.

Maybe hci_conn_abort_conn_sync should do the "if (err) { ..." cleanup
unconditionally also when err = 0? This might get the state out of sync
with controller when powering off, if the power off gets canceled after
this. OTOH, we probably would be deleting the conns anyway when the
connection completion events from the successful commands arrive, so
maybe this doesn't matter and one can think it through.

Tests would indeed be nice here...

>  
> @@ -5458,13 +5462,19 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
>  
>  static int hci_disconnect_all_sync(struct hci_dev *hdev, u8 reason)
>  {
> -	struct hci_conn *conn, *tmp;
> -	int err;
> +	struct list_head *head = &hdev->conn_hash.list;
>  
> -	list_for_each_entry_safe(conn, tmp, &hdev->conn_hash.list, list) {
> -		err = hci_abort_conn_sync(hdev, conn, reason);
> -		if (err)
> -			return err;
> +	/* Use reverse order so links are cleanup before parents */

I suspect the ordering may be either way: you can have a pre-existing
ACL connection when ISO is connected, or use something like ISO
BT_DEFER_SETUP and then hci_connect_cis may create the ACL after ISO.

In iso-tester tests it's probably the latter and with real BAP the
former (but didn't actually check it now, too late for today).

Maybe iterating the struct hci_link is OK here, it's probably still
fairly simple.

A big simplification came from having hci_abort_conn delete always and
not bailing out from disconnect_all_sync loop on errors.

> +	while (!list_empty(&hdev->conn_hash.list)) {
> +		struct hci_conn *conn = list_last_entry(head, struct hci_conn,
> +							list);
> +
> +		/* Disregard possible errors since hci_conn_del shall have been
> +		 * called even in case of errors had occurred since it would
> +		 * then cause hci_conn_failed to be called which calls
> +		 * hci_conn_del internally.
> +		 */
> +		hci_abort_conn_sync(hdev, conn, reason);
>  	}

I think to be UAF-safe the list iteration must either hold
rcu_read_lock and use RCU primitives, or, hold hdev->lock and use usual
list primitives.

In both cases it should take hci_conn_get to prevent concurrent
hci_conn_del freeing the conn before hci_abort_conn_sync is finished
accessing its fields.

rcu_read_lock();
while ((conn = list_first_or_null_rcu(...)) {
	conn = /* one of conn->links if any, otherwise conn */
	hci_conn_get(conn);
	rcu_read_unlock();
	hci_abort_conn_sync(hdev, conn, reason);
	hci_conn_put(conn);  /* must be before rcu_read_lock */
	rcu_read_lock();
}
rcu_read_unlock();

or with *rcu* -> hdev->lock.

Not using RCU primitives or hdev->lock is probably not safe even if
hci_conn_del is prevented from running concurrently by future changes
in hci_sync, because there may be concurrent list_add_rcu and you can't
know if compiler generated concurrency safe code from the non-RCU list
iteration primitives here.

>  
>  	return 0;

Luiz Augusto von Dentz Aug. 11, 2023, 5:56 p.m. UTC | #4

Hi Pauli,

On Thu, Aug 10, 2023 at 1:55 PM Pauli Virtanen <pav@iki.fi> wrote:
>
> Hi Luiz,
>
> ke, 2023-08-09 kello 16:56 -0700, Luiz Augusto von Dentz kirjoitti:
> > From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> >
> > Use-after-free can occur in hci_disconnect_all_sync if a connection is
> > deleted by concurrent processing of a controller event.
> >
> > To prevent this the code now tries to iterate over the list backwards
> > to ensure the links are cleanup before its parents, also it no longer
> > relies on a cursor, instead it always uses the last element since
> > hci_abort_conn_sync is guaranteed to call hci_conn_del.
> >
> > UAF crash log:
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in hci_set_powered_sync
> > (net/bluetooth/hci_sync.c:5424) [bluetooth]
> > Read of size 8 at addr ffff888009d9c000 by task kworker/u9:0/124
> >
> > CPU: 0 PID: 124 Comm: kworker/u9:0 Tainted: G        W
> > 6.5.0-rc1+ #10
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > 1.16.2-1.fc38 04/01/2014
> > Workqueue: hci0 hci_cmd_sync_work [bluetooth]
> > Call Trace:
> >  <TASK>
> >  dump_stack_lvl+0x5b/0x90
> >  print_report+0xcf/0x670
> >  ? __virt_addr_valid+0xdd/0x160
> >  ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
> >  kasan_report+0xa6/0xe0
> >  ? hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
> >  ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
> >  hci_set_powered_sync+0x2c9/0x4a0 [bluetooth]
> >  ? __pfx_hci_set_powered_sync+0x10/0x10 [bluetooth]
> >  ? __pfx_lock_release+0x10/0x10
> >  ? __pfx_set_powered_sync+0x10/0x10 [bluetooth]
> >  hci_cmd_sync_work+0x137/0x220 [bluetooth]
> >  process_one_work+0x526/0x9d0
> >  ? __pfx_process_one_work+0x10/0x10
> >  ? __pfx_do_raw_spin_lock+0x10/0x10
> >  ? mark_held_locks+0x1a/0x90
> >  worker_thread+0x92/0x630
> >  ? __pfx_worker_thread+0x10/0x10
> >  kthread+0x196/0x1e0
> >  ? __pfx_kthread+0x10/0x10
> >  ret_from_fork+0x2c/0x50
> >  </TASK>
> >
> > Allocated by task 1782:
> >  kasan_save_stack+0x33/0x60
> >  kasan_set_track+0x25/0x30
> >  __kasan_kmalloc+0x8f/0xa0
> >  hci_conn_add+0xa5/0xa80 [bluetooth]
> >  hci_bind_cis+0x881/0x9b0 [bluetooth]
> >  iso_connect_cis+0x121/0x520 [bluetooth]
> >  iso_sock_connect+0x3f6/0x790 [bluetooth]
> >  __sys_connect+0x109/0x130
> >  __x64_sys_connect+0x40/0x50
> >  do_syscall_64+0x60/0x90
> >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> >
> > Freed by task 695:
> >  kasan_save_stack+0x33/0x60
> >  kasan_set_track+0x25/0x30
> >  kasan_save_free_info+0x2b/0x50
> >  __kasan_slab_free+0x10a/0x180
> >  __kmem_cache_free+0x14d/0x2e0
> >  device_release+0x5d/0xf0
> >  kobject_put+0xdf/0x270
> >  hci_disconn_complete_evt+0x274/0x3a0 [bluetooth]
> >  hci_event_packet+0x579/0x7e0 [bluetooth]
> >  hci_rx_work+0x287/0xaa0 [bluetooth]
> >  process_one_work+0x526/0x9d0
> >  worker_thread+0x92/0x630
> >  kthread+0x196/0x1e0
> >  ret_from_fork+0x2c/0x50
> > ==================================================================
> >
> > Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
> > Signed-off-by: Pauli Virtanen <pav@iki.fi>
> > Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> > ---
> >  net/bluetooth/hci_sync.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
> > index 32fa9006f381..234da30292a4 100644
> > --- a/net/bluetooth/hci_sync.c
> > +++ b/net/bluetooth/hci_sync.c
> > @@ -5429,7 +5429,11 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
> >               hci_dev_unlock(hdev);
> >               return 0;
> >       default:
> > +             hci_dev_lock(hdev);
> >               conn->state = BT_CLOSED;
> > +             hci_disconn_cfm(conn, reason);
> > +             hci_conn_del(conn);
> > +             hci_dev_unlock(hdev);
> >               return 0;
> >       }
>
> When powering off we do not wait for the Disconnection_Complete or
> Connection_Complete events from disconnect/cancel, but only the command
> status.
>
> In these cases it looks like hci_abort_conn_sync returns while the conn
> still exists. Then the command may get sent again. If the controller
> rejects the second disconnect/cancel command with error status then
> conn probably gets deleted on second round (kprinting errors),
> otherwise it loops here waiting until the conn goes away.
>
> LE connection cancel maybe busy-loops here forever until the connection
> completion event arrives, because of the HCI_CONN_CANCEL flag test,
> negating the point of not waiting for that event.
>
> Maybe hci_conn_abort_conn_sync should do the "if (err) { ..." cleanup
> unconditionally also when err = 0? This might get the state out of sync
> with controller when powering off, if the power off gets canceled after
> this. OTOH, we probably would be deleting the conns anyway when the
> connection completion events from the successful commands arrive, so
> maybe this doesn't matter and one can think it through.

Yep, I tried to address this on v2

> Tests would indeed be nice here...

Perhaps we need to update the suspend test to actually have
connections to cleanup.

> >
> > @@ -5458,13 +5462,19 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
> >
> >  static int hci_disconnect_all_sync(struct hci_dev *hdev, u8 reason)
> >  {
> > -     struct hci_conn *conn, *tmp;
> > -     int err;
> > +     struct list_head *head = &hdev->conn_hash.list;
> >
> > -     list_for_each_entry_safe(conn, tmp, &hdev->conn_hash.list, list) {
> > -             err = hci_abort_conn_sync(hdev, conn, reason);
> > -             if (err)
> > -                     return err;
> > +     /* Use reverse order so links are cleanup before parents */
>
> I suspect the ordering may be either way: you can have a pre-existing
> ACL connection when ISO is connected, or use something like ISO
> BT_DEFER_SETUP and then hci_connect_cis may create the ACL after ISO.
>
> In iso-tester tests it's probably the latter and with real BAP the
> former (but didn't actually check it now, too late for today).
>
> Maybe iterating the struct hci_link is OK here, it's probably still
> fairly simple.
>
> A big simplification came from having hci_abort_conn delete always and
> not bailing out from disconnect_all_sync loop on errors.
>
> > +     while (!list_empty(&hdev->conn_hash.list)) {
> > +             struct hci_conn *conn = list_last_entry(head, struct hci_conn,
> > +                                                     list);
> > +
> > +             /* Disregard possible errors since hci_conn_del shall have been
> > +              * called even in case of errors had occurred since it would
> > +              * then cause hci_conn_failed to be called which calls
> > +              * hci_conn_del internally.
> > +              */
> > +             hci_abort_conn_sync(hdev, conn, reason);
> >       }
>
> I think to be UAF-safe the list iteration must either hold
> rcu_read_lock and use RCU primitives, or, hold hdev->lock and use usual
> list primitives.
>
> In both cases it should take hci_conn_get to prevent concurrent
> hci_conn_del freeing the conn before hci_abort_conn_sync is finished
> accessing its fields.
>
> rcu_read_lock();
> while ((conn = list_first_or_null_rcu(...)) {
>         conn = /* one of conn->links if any, otherwise conn */
>         hci_conn_get(conn);
>         rcu_read_unlock();
>         hci_abort_conn_sync(hdev, conn, reason);
>         hci_conn_put(conn);  /* must be before rcu_read_lock */
>         rcu_read_lock();
> }
> rcu_read_unlock();
>
> or with *rcu* -> hdev->lock.
>
> Not using RCU primitives or hdev->lock is probably not safe even if
> hci_conn_del is prevented from running concurrently by future changes
> in hci_sync, because there may be concurrent list_add_rcu and you can't
> know if compiler generated concurrency safe code from the non-RCU list
> iteration primitives here.

Added something similar to v2.

> >
> >       return 0;
>
> --
> Pauli Virtanen

Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync

Checks

Commit Message

Comments

Patch