
[RFC] deadlock with flush_work() in UAS

Message ID 1560869625.3184.11.camel@suse.com (mailing list archive)
State New, archived
Series [RFC] deadlock with flush_work() in UAS

Commit Message

Oliver Neukum June 18, 2019, 2:53 p.m. UTC
Hi,

looking at those deadlocks it looks to me like UAS can
deadlock on itself. What do you think?

	Regards
		Oliver

From 2d497f662e6c03fe9e0a75e05b64d52514e890b3 Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oneukum@suse.com>
Date: Tue, 18 Jun 2019 15:03:56 +0200
Subject: [PATCH] UAS: fix deadlock in error handling and PM flushing work

A SCSI error handler and block runtime PM must not allocate
memory with GFP_KERNEL. Furthermore they must not wait for
tasks allocating memory with GFP_KERNEL.
That means that they cannot share a workqueue with arbitrary tasks.

Fix this for UAS using a private workqueue.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
---
 drivers/usb/storage/uas.c | 39 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

Comments

Oliver Neukum June 18, 2019, 3:29 p.m. UTC | #1
On Tue, 18 Jun 2019 at 11:29 -0400, Alan Stern wrote:
> On Tue, 18 Jun 2019, Oliver Neukum wrote:
> 
> > Hi,
> > 
> > looking at those deadlocks it looks to me like UAS can
> > deadlock on itself. What do you think?
> > 
> >       Regards
> >               Oliver
> > 
> > From 2d497f662e6c03fe9e0a75e05b64d52514e890b3 Mon Sep 17 00:00:00 2001
> > From: Oliver Neukum <oneukum@suse.com>
> > Date: Tue, 18 Jun 2019 15:03:56 +0200
> > Subject: [PATCH] UAS: fix deadlock in error handling and PM flushing work
> > 
> > A SCSI error handler and block runtime PM must not allocate
> > memory with GFP_KERNEL. Furthermore they must not wait for
> > tasks allocating memory with GFP_KERNEL.
> > That means that they cannot share a workqueue with arbitrary tasks.
> > 
> > Fix this for UAS using a private workqueue.
> 
> I'm not so sure that one long-running task in a workqueue will block 
> other tasks.  Workqueues have variable numbers of threads, added and 
> removed on demand.  (On the other hand, when new threads need to be 
> added the workqueue manager probably uses GFP_KERNEL.)

Do we have a guarantee that it will reschedule already-scheduled work items?
The deadlock would be something like

uas_pre_reset() -> uas_wait_for_pending_cmnds() ->
flush_work(&devinfo->work) -> kmalloc() -> DEADLOCK

You can also make this chain with uas_suspend()
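
As a sketch, with invented names rather than the actual uas code paths,
the dependency looks roughly like this:

	#include <linux/workqueue.h>
	#include <linux/slab.h>

	static struct work_struct demo_work;	/* INIT_WORK()ed at probe time */

	static void demo_work_fn(struct work_struct *work)
	{
		/*
		 * GFP_KERNEL may wait for writeback to the very disk whose
		 * error handler or suspend path is flushing this work item.
		 */
		void *buf = kmalloc(512, GFP_KERNEL);

		/* ... resubmit the pending command, then ... */
		kfree(buf);
	}

	static int demo_pre_reset(void)
	{
		/*
		 * Reset waits for the work item, the work item waits for
		 * memory, and memory reclaim waits for the disk stuck in
		 * reset: a closed loop.
		 */
		flush_work(&demo_work);
		return 0;
	}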

> Even if you disagree, perhaps we should have a global workqueue with a
> permanently set noio flag.  It could be shared among multiple drivers
> such as uas and the hub driver for purposes like this.  (In fact, the 
> hub driver already has its own dedicated workqueue.)

That is a good idea. But does UAS need WQ_MEM_RECLAIM?

	Regards
		Oliver
Alan Stern June 18, 2019, 3:29 p.m. UTC | #2
On Tue, 18 Jun 2019, Oliver Neukum wrote:

> Hi,
> 
> looking at those deadlocks it looks to me like UAS can
> deadlock on itself. What do you think?
> 
> 	Regards
> 		Oliver
> 
> From 2d497f662e6c03fe9e0a75e05b64d52514e890b3 Mon Sep 17 00:00:00 2001
> From: Oliver Neukum <oneukum@suse.com>
> Date: Tue, 18 Jun 2019 15:03:56 +0200
> Subject: [PATCH] UAS: fix deadlock in error handling and PM flushing work
> 
> A SCSI error handler and block runtime PM must not allocate
> memory with GFP_KERNEL. Furthermore they must not wait for
> tasks allocating memory with GFP_KERNEL.
> That means that they cannot share a workqueue with arbitrary tasks.
> 
> Fix this for UAS using a private workqueue.

I'm not so sure that one long-running task in a workqueue will block 
other tasks.  Workqueues have variable numbers of threads, added and 
removed on demand.  (On the other hand, when new threads need to be 
added the workqueue manager probably uses GFP_KERNEL.)

Even if you disagree, perhaps we should have a global workqueue with a
permanently set noio flag.  It could be shared among multiple drivers
such as uas and the hub driver for purposes like this.  (In fact, the 
hub driver already has its own dedicated workqueue.)
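
A rough sketch of such a shared queue (invented names; assuming the noio
behaviour would come from memalloc_noio_save() around the work routines
rather than from a workqueue flag):

	#include <linux/init.h>
	#include <linux/workqueue.h>
	#include <linux/sched/mm.h>

	static struct workqueue_struct *usb_noio_wq;

	/* wrapper forcing GFP_NOIO for whatever work routine it dispatches */
	static void usb_noio_work_fn(struct work_struct *work)
	{
		unsigned int noio_flags = memalloc_noio_save();

		/* ... call the driver's real work routine (uas, hub, ...) ... */

		memalloc_noio_restore(noio_flags);
	}

	static int __init usb_noio_wq_init(void)
	{
		/* WQ_MEM_RECLAIM so one work item can always make progress */
		usb_noio_wq = alloc_workqueue("usb_noio", WQ_MEM_RECLAIM, 0);
		return usb_noio_wq ? 0 : -ENOMEM;
	}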

Alan Stern
Alan Stern June 18, 2019, 3:59 p.m. UTC | #3
Tejun and other workqueue maintainers:

On Tue, 18 Jun 2019, Oliver Neukum wrote:

> On Tue, 18 Jun 2019 at 11:29 -0400, Alan Stern wrote:
> > On Tue, 18 Jun 2019, Oliver Neukum wrote:
> > 
> > > Hi,
> > > 
> > > looking at those deadlocks it looks to me like UAS can
> > > deadlock on itself. What do you think?
> > > 
> > >       Regards
> > >               Oliver
> > > 
> > > From 2d497f662e6c03fe9e0a75e05b64d52514e890b3 Mon Sep 17 00:00:00 2001
> > > From: Oliver Neukum <oneukum@suse.com>
> > > Date: Tue, 18 Jun 2019 15:03:56 +0200
> > > Subject: [PATCH] UAS: fix deadlock in error handling and PM flushing work
> > > 
> > > A SCSI error handler and block runtime PM must not allocate
> > > memory with GFP_KERNEL. Furthermore they must not wait for
> > > tasks allocating memory with GFP_KERNEL.
> > > That means that they cannot share a workqueue with arbitrary tasks.
> > > 
> > > Fix this for UAS using a private workqueue.
> > 
> > I'm not so sure that one long-running task in a workqueue will block 
> > other tasks.  Workqueues have variable numbers of threads, added and 
> > removed on demand.  (On the other hand, when new threads need to be 
> > added the workqueue manager probably uses GFP_KERNEL.)
> 
> Do we have a guarantee that it will reschedule already-scheduled work items?
> The deadlock would be something like
> 
> uas_pre_reset() -> uas_wait_for_pending_cmnds() ->
> flush_work(&devinfo->work) -> kmalloc() -> DEADLOCK
> 
> You can also make this chain with uas_suspend()
> 
> > Even if you disagree, perhaps we should have a global workqueue with a
> > permanently set noio flag.  It could be shared among multiple drivers
> > such as uas and the hub driver for purposes like this.  (In fact, the 
> > hub driver already has its own dedicated workqueue.)
> 
> That is a good idea. But does UAS need WQ_MEM_RECLAIM?

These are good questions, and I don't have the answers.  Perhaps Tejun 
or someone else on LKML can help.

Alan Stern
Tejun Heo June 20, 2019, 2:10 p.m. UTC | #4
Hello,

On Tue, Jun 18, 2019 at 11:59:39AM -0400, Alan Stern wrote:
> > > Even if you disagree, perhaps we should have a global workqueue with a
> > > permanently set noio flag.  It could be shared among multiple drivers
> > > such as uas and the hub driver for purposes like this.  (In fact, the 
> > > hub driver already has its own dedicated workqueue.)
> > 
> > That is a good idea. But does UAS need WQ_MEM_RECLAIM?
> 
> These are good questions, and I don't have the answers.  Perhaps Tejun 
> or someone else on LKML can help.

Any device which may host a filesystem or swap needs to use
WQ_MEM_RECLAIM workqueues on anything which may be used during normal
IOs including e.g. error handling which may be invoked.  One
WQ_MEM_RECLAIM workqueue guarantees one level of concurrency for all
its tasks regardless of memory situation, so as long as there's no
interdependence between work items, the workqueue can be shared.

Thanks.
Oliver Neukum June 24, 2019, 8:56 a.m. UTC | #5
On Thu, 20 Jun 2019 at 07:10 -0700, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jun 18, 2019 at 11:59:39AM -0400, Alan Stern wrote:
> > > > Even if you disagree, perhaps we should have a global workqueue with a
> > > > permanently set noio flag.  It could be shared among multiple drivers
> > > > such as uas and the hub driver for purposes like this.  (In fact, the 
> > > > hub driver already has its own dedicated workqueue.)
> > > 
> > > That is a good idea. But does UAS need WQ_MEM_RECLAIM?
> > 
> > These are good questions, and I don't have the answers.  Perhaps Tejun 
> > or someone else on LKML can help.
> 
> Any device which may host a filesystem or swap needs to use
> WQ_MEM_RECLAIM workqueues on anything which may be used during normal
> IOs including e.g. error handling which may be invoked.  One
> WQ_MEM_RECLAIM workqueue guarantees one level of concurrency for all
> its tasks regardless of memory situation, so as long as there's no
> interdependence between work items, the workqueue can be shared.

Ouch.

Alan, in that case it looks to me like anything doing a reset, suspend
or resume needs to use WQ_MEM_RECLAIM. What do we do?

	Regards
		Oliver
Alan Stern June 24, 2019, 2:22 p.m. UTC | #6
On Mon, 24 Jun 2019, Oliver Neukum wrote:

> On Thu, 20 Jun 2019 at 07:10 -0700, Tejun Heo wrote:
> > Hello,
> > 
> > On Tue, Jun 18, 2019 at 11:59:39AM -0400, Alan Stern wrote:
> > > > > Even if you disagree, perhaps we should have a global workqueue with a
> > > > > permanently set noio flag.  It could be shared among multiple drivers
> > > > > such as uas and the hub driver for purposes like this.  (In fact, the 
> > > > > hub driver already has its own dedicated workqueue.)
> > > > 
> > > > That is a good idea. But does UAS need WQ_MEM_RECLAIM?
> > > 
> > > These are good questions, and I don't have the answers.  Perhaps Tejun 
> > > or someone else on LKML can help.
> > 
> > Any device which may host a filesystem or swap needs to use
> > WQ_MEM_RECLAIM workqueues on anything which may be used during normal
> > IOs including e.g. error handling which may be invoked.  One
> > WQ_MEM_RECLAIM workqueue guarantees one level of concurrency for all
> > its tasks regardless of memory situation, so as long as there's no
> > interdependence between work items, the workqueue can be shared.
> 
> Ouch.
> 
> Alan, in that case it looks to me like anything doing a reset, suspend
> or resume needs to use WQ_MEM_RECLAIM. What do we do?

I'm not sure this is really a problem.

For example, the reset issue arises only when a driver does the 
following:

	Locks the device.

	Queues a work routine to reset the device.

	Waits for the reset to finish.

	Unlocks the device.

But that pattern makes no sense; a driver would never use it.  The 
driver would just do the reset itself.
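
Spelled out as hypothetical code, the pattern being ruled out, next to
what a driver actually does:

	#include <linux/usb.h>
	#include <linux/workqueue.h>

	static struct work_struct reset_work;	/* its routine would call usb_reset_device() */

	static void broken_pattern(struct usb_device *udev)
	{
		usb_lock_device(udev);
		schedule_work(&reset_work);
		flush_work(&reset_work);	/* wait for the queued reset to finish */
		usb_unlock_device(udev);
	}

	static void sensible_pattern(struct usb_device *udev)
	{
		usb_lock_device(udev);		/* real drivers use usb_lock_device_for_reset() */
		usb_reset_device(udev);		/* just do the reset directly */
		usb_unlock_device(udev);
	}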

There's no problem if the locking is done in the work routine; in that
case the usb-storage or uas driver would be able to carry out any
necessary resets if the work routine was unable to start for lack of
memory.

Similarly, while async wakeups might get blocked by lack of memory, the 
normal USB driver paths use synchronous wakeup.

Alan Stern
Oliver Neukum June 26, 2019, 8:10 a.m. UTC | #7
On Mon, 24 Jun 2019 at 10:22 -0400, Alan Stern wrote:
> But that pattern makes no sense; a driver would never use it.  The 
> driver would just do the reset itself.

Correct. But UAS and storage themselves still need to use
WQ_MEM_RECLAIM for their workqueues, don't they?

	Regards
		Oliver
Alan Stern June 26, 2019, 2:38 p.m. UTC | #8
On Wed, 26 Jun 2019, Oliver Neukum wrote:

> On Mon, 24 Jun 2019 at 10:22 -0400, Alan Stern wrote:
> > But that pattern makes no sense; a driver would never use it.  The 
> > driver would just do the reset itself.
> 
> Correct. But UAS and storage themselves still need to use
> WQ_MEM_RECLAIM for their workqueues, don't they?

Perhaps so for uas.  usb-storage uses a work queue only for scanning 
targets, which doesn't interfere with the block I/O pathway.

Alan Stern
Oliver Neukum July 1, 2019, 11:02 a.m. UTC | #9
On Wed, 26 Jun 2019 at 10:38 -0400, Alan Stern wrote:
> On Wed, 26 Jun 2019, Oliver Neukum wrote:
> 
> > On Mon, 24 Jun 2019 at 10:22 -0400, Alan Stern wrote:
> > > But that pattern makes no sense; a driver would never use it.  The 
> > > driver would just do the reset itself.
> > 
> > Correct. But UAS and storage themselves still need to use
> > WQ_MEM_RECLAIM for their workqueues, don't they?
> 
> Perhaps so for uas.  usb-storage uses a work queue only for scanning 
> targets, which doesn't interfere with the block I/O pathway.

Are you sure? What about hub_tt_work? As far as I can tell, hub_quiesce
will flush it, hence it is used in error handling.

	Regards
		Oliver
Alan Stern July 1, 2019, 2:20 p.m. UTC | #10
On Mon, 1 Jul 2019, Oliver Neukum wrote:

> On Wed, 26 Jun 2019 at 10:38 -0400, Alan Stern wrote:
> > On Wed, 26 Jun 2019, Oliver Neukum wrote:
> > 
> > > On Mon, 24 Jun 2019 at 10:22 -0400, Alan Stern wrote:
> > > > But that pattern makes no sense; a driver would never use it.  The 
> > > > driver would just do the reset itself.
> > > 
> > > Correct. But UAS and storage themselves still need to use
> > > WQ_MEM_RECLAIM for their workqueues, don't they?
> > 
> > Perhaps so for uas.  usb-storage uses a work queue only for scanning 
> > targets, which doesn't interfere with the block I/O pathway.
> 
> Are you sure? What about hub_tt_work?

Technically speaking, hub_tt_work is used by the hub driver, not by 
usb-storage.  :-)

> As far as I can tell, hub_quiesce
> will flush it, hence it is used in error handling.

Yes, it needs to use a work queue with WQ_MEM_RECLAIM set.  
Unfortunately, I don't think we can use hub_wq for this purpose (we
could end up with a work item waiting for another work item later on in
the same queue, not good).
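
That is, roughly this shape (invented names, sketch only):

	#include <linux/workqueue.h>

	static struct workqueue_struct *shared_wq;	/* WQ_MEM_RECLAIM, shared */
	static struct work_struct port_work;		/* also queued on shared_wq */

	static void hub_event_fn(struct work_struct *work)
	{
		/*
		 * Under memory pressure shared_wq may be down to the single
		 * execution context its rescuer guarantees; if that context
		 * is running this function, port_work never starts and this
		 * flush never returns.
		 */
		flush_work(&port_work);
	}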

Alan Stern

Patch

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 047c5922618f..68b1cb0f84e5 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -80,6 +80,15 @@  static void uas_free_streams(struct uas_dev_info *devinfo);
 static void uas_log_cmd_state(struct scsi_cmnd *cmnd, const char *prefix,
 				int status);
 
+/*
+ * This driver needs its own workqueue, as we need to control memory
+ * allocation. In the course of error handling and power management,
+ * uas_wait_for_pending_cmnds() needs to flush pending work items. In these
+ * contexts we cannot do block IO and cannot wait for anybody allocating with
+ * GFP_KERNEL, so we must control every work item on the workqueue we flush.
+ */
+static struct workqueue_struct *workqueue;
+
 static void uas_do_work(struct work_struct *work)
 {
 	struct uas_dev_info *devinfo =
@@ -108,7 +117,7 @@  static void uas_do_work(struct work_struct *work)
 		if (!err)
 			cmdinfo->state &= ~IS_IN_WORK_LIST;
 		else
-			schedule_work(&devinfo->work);
+			queue_work(workqueue, &devinfo->work);
 	}
 out:
 	spin_unlock_irqrestore(&devinfo->lock, flags);
@@ -122,7 +131,7 @@  static void uas_add_work(struct uas_cmd_info *cmdinfo)
 
 	lockdep_assert_held(&devinfo->lock);
 	cmdinfo->state |= IS_IN_WORK_LIST;
-	schedule_work(&devinfo->work);
+	queue_work(workqueue, &devinfo->work);
 }
 
 static void uas_zap_pending(struct uas_dev_info *devinfo, int result)
@@ -1216,7 +1225,31 @@  static struct usb_driver uas_driver = {
 	.id_table = uas_usb_ids,
 };
 
-module_usb_driver(uas_driver);
+static int __init uas_init(void)
+{
+	int rv;
+
+	workqueue = alloc_workqueue("uas", WQ_MEM_RECLAIM, 0);
+	if (!workqueue)
+		return -ENOMEM;
+
+	rv = usb_register(&uas_driver);
+	if (rv) {
+		destroy_workqueue(workqueue);
+		return rv;
+	}
+
+	return 0;
+}
+
+static void __exit uas_exit(void)
+{
+	usb_deregister(&uas_driver);
+	destroy_workqueue(workqueue);
+}
+
+module_init(uas_init);
+module_exit(uas_exit);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR(