diff mbox series

rfkill: Fix use-after-free in rfkill_resume()

Message ID 20201110084908.219088-1-tientzu@chromium.org (mailing list archive)
State Accepted
Delegated to: Johannes Berg
Headers show
Series rfkill: Fix use-after-free in rfkill_resume() | expand

Commit Message

Claire Chang Nov. 10, 2020, 8:49 a.m. UTC
If a device is getting removed or reprobed during resume, use-after-free
might happen. For example, h5_btrtl_resume()[drivers/bluetooth/hci_h5.c]
schedules a work queue for device reprobing. During the reprobing, if
rfkill_set_block() in rfkill_resume() is called after the corresponding
*_unregister() and kfree() are called, there will be an use-after-free
in hci_rfkill_set_block()[net/bluetooth/hci_core.c].

BUG: KASAN: use-after-free in hci_rfkill_set_block+0x58/0xc0 [bluetooth]
...
Call trace:
  dump_backtrace+0x0/0x154
  show_stack+0x20/0x2c
  dump_stack+0xbc/0x12c
  print_address_description+0x88/0x4b0
  __kasan_report+0x144/0x168
  kasan_report+0x10/0x18
  check_memory_region+0x19c/0x1ac
  __kasan_check_write+0x18/0x24
  hci_rfkill_set_block+0x58/0xc0 [bluetooth]
  rfkill_set_block+0x9c/0x120
  rfkill_resume+0x34/0x70
  dpm_run_callback+0xf0/0x1f4
  device_resume+0x210/0x22c

Fix this by checking rfkill->registered in rfkill_resume().
Since device_del() in rfkill_unregister() requires device_lock() and the
whole rfkill_resume() is also protected by the same lock in
device_resume()[drivers/base/power/main.c], we can make sure either the
rfkill->registered is false before rfkill_resume() starts or the rfkill
device won't be unregistered before rfkill_resume() returns.

Fixes: 8589086f4efd ("Bluetooth: hci_h5: Turn off RTL8723BS on suspend, reprobe on resume")
Signed-off-by: Claire Chang <tientzu@chromium.org>
---
 net/rfkill/core.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Johannes Berg Nov. 10, 2020, 5:35 p.m. UTC | #1
On Tue, 2020-11-10 at 16:49 +0800, Claire Chang wrote:
> If a device is getting removed or reprobed during resume, use-after-free
> might happen. For example, h5_btrtl_resume()[drivers/bluetooth/hci_h5.c]
> schedules a work queue for device reprobing. During the reprobing, if
> rfkill_set_block() in rfkill_resume() is called after the corresponding
> *_unregister() and kfree() are called, there will be an use-after-free
> in hci_rfkill_set_block()[net/bluetooth/hci_core.c].


Not sure I understand. So you're saying

 * something (h5_btrtl_resume) schedules a worker
 * said worker run, when it runs, calls rfkill_unregister()
 * somehow rfkill_resume() still gets called after this

But that can't really be right, device_del() removes it from the PM
lists?

johannes
Claire Chang Nov. 11, 2020, 3:23 a.m. UTC | #2
On Wed, Nov 11, 2020 at 1:35 AM Johannes Berg <johannes@sipsolutions.net> wrote:
>
> On Tue, 2020-11-10 at 16:49 +0800, Claire Chang wrote:
> > If a device is getting removed or reprobed during resume, use-after-free
> > might happen. For example, h5_btrtl_resume()[drivers/bluetooth/hci_h5.c]
> > schedules a work queue for device reprobing. During the reprobing, if
> > rfkill_set_block() in rfkill_resume() is called after the corresponding
> > *_unregister() and kfree() are called, there will be an use-after-free
> > in hci_rfkill_set_block()[net/bluetooth/hci_core.c].
>
>
> Not sure I understand. So you're saying
>
>  * something (h5_btrtl_resume) schedules a worker
>  * said worker run, when it runs, calls rfkill_unregister()
>  * somehow rfkill_resume() still gets called after this
>
> But that can't really be right, device_del() removes it from the PM
> lists?

If device_del() is called right before the device_lock() in device_resume()[1],
it's possible the rfkill device is unregistered, but rfkill_resume is
still called.
We actually hit this during the suspend/resume stress test, although it's rare.

I also have a patch with multiple msleep that can 100% reproduce this
use-after-free. Happy to share here if needed.

[1] https://elixir.bootlin.com/linux/v5.10-rc3/source/drivers/base/power/main.c#L919

Thanks,
Claire

>
>
> johannes
>
>
Johannes Berg Nov. 12, 2020, 8:17 a.m. UTC | #3
On Wed, 2020-11-11 at 11:23 +0800, Claire Chang wrote:
> On Wed, Nov 11, 2020 at 1:35 AM Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Tue, 2020-11-10 at 16:49 +0800, Claire Chang wrote:
> > > If a device is getting removed or reprobed during resume, use-after-free
> > > might happen. For example, h5_btrtl_resume()[drivers/bluetooth/hci_h5.c]
> > > schedules a work queue for device reprobing. During the reprobing, if
> > > rfkill_set_block() in rfkill_resume() is called after the corresponding
> > > *_unregister() and kfree() are called, there will be an use-after-free
> > > in hci_rfkill_set_block()[net/bluetooth/hci_core.c].
> > 
> > Not sure I understand. So you're saying
> > 
> >  * something (h5_btrtl_resume) schedules a worker
> >  * said worker run, when it runs, calls rfkill_unregister()
> >  * somehow rfkill_resume() still gets called after this
> > 
> > But that can't really be right, device_del() removes it from the PM
> > lists?
> 
> If device_del() is called right before the device_lock() in device_resume()[1],
> it's possible the rfkill device is unregistered, but rfkill_resume is
> still called.

OK, I see, thanks for the clarification!

I'll try to add that to the commit message.

Thanks,
johannes
diff mbox series

Patch

diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index 971c73c7d34c..97101c55763d 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -876,6 +876,9 @@  static int rfkill_resume(struct device *dev)
 
 	rfkill->suspended = false;
 
+	if (!rfkill->registered)
+		return 0;
+
 	if (!rfkill->persistent) {
 		cur = !!(rfkill->state & RFKILL_BLOCK_SW);
 		rfkill_set_block(rfkill, cur);