[4/4] sd: use async_probe cookie to avoid deadlocks

Message ID 1513069072-32514-5-git-send-email-hare@suse.de (mailing list archive)
State Changes Requested
Commit Message

Hannes Reinecke Dec. 12, 2017, 8:57 a.m. UTC
With the current design we're waiting for all async probes to
finish when removing any sd device.
This might lead to a livelock where the 'remove' call is blocking
for any probe calls to finish, and the probe calls are waiting for
a response, which will never be processed as the thread handling
the responses is waiting for the remove call to finish.
Which is completely pointless as we only _really_ care for the
probe on _this_ device to be completed; any other probing can
happily continue for all we care.
So save the async probing cookie in the structure and only wait
if this specific probe is still active.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/sd.c | 6 ++++--
 drivers/scsi/sd.h | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

Comments

Bart Van Assche Dec. 14, 2017, 10:13 p.m. UTC | #1
On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote:
> With the current design we're waiting for all async probes to
> finish when removing any sd device.
> This might lead to a livelock where the 'remove' call is blocking
> for any probe calls to finish, and the probe calls are waiting for
> a response, which will never be processed as the thread handling
> the responses is waiting for the remove call to finish.
> Which is completely pointless as we only _really_ care for the
> probe on _this_ device to be completed; any other probing can
> happily continue for all we care.
> So save the async probing cookie in the structure and only wait
> if this specific probe is still active.

From async_synchronize_cookie_domain():

	wait_event(async_done, lowest_in_progress(domain) >= cookie);

So async_synchronize_cookie_domain() also waits for multiple asynchronous
probes to finish. Does this patch have any advantages over the patch I
posted (https://marc.info/?l=linux-scsi&m=151275368714540)?

Thanks,

Bart.
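
The wait condition quoted above can be explored with a small userspace
model. This is only a sketch under stated assumptions, not the kernel
implementation: cookie_t, COOKIE_MAX, struct model_domain and the
model_*() helpers are hypothetical names invented for illustration. The
model assumes that pending entries are kept in ascending cookie order,
that an empty domain reports the maximum cookie value, and that a full
synchronize behaves like a cookie synchronize with the maximum cookie.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for async_cookie_t and ASYNC_COOKIE_MAX. */
typedef uint64_t cookie_t;
#define COOKIE_MAX UINT64_MAX

/*
 * Hypothetical model of an async domain: the cookies of all entries
 * that have not finished yet, assumed sorted in ascending order.
 */
struct model_domain {
	const cookie_t *pending;
	size_t count;
};

/* Lowest cookie still in progress, or COOKIE_MAX if nothing is pending. */
static cookie_t model_lowest_in_progress(const struct model_domain *d)
{
	return d->count ? d->pending[0] : COOKIE_MAX;
}

/*
 * Models the quoted condition
 *	lowest_in_progress(domain) >= cookie
 * and reports whether a waiter on 'cookie' could proceed right now.
 */
static bool model_cookie_sync_would_return(const struct model_domain *d,
					   cookie_t cookie)
{
	return model_lowest_in_progress(d) >= cookie;
}

/* Models a full-domain synchronize as a wait on the maximum cookie. */
static bool model_full_sync_would_return(const struct model_domain *d)
{
	return model_cookie_sync_would_return(d, COOKIE_MAX);
}
```

With entries 5 and 7 pending, a waiter on cookie 7 is held up by entry 5,
which is the "also waits for multiple probes" behaviour. In this model a
waiter on cookie 5 is not held up by entry 5 itself, since the comparison
is >=; covering the entry that owns the cookie would need cookie + 1.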
Hannes Reinecke Dec. 15, 2017, 2:08 p.m. UTC | #2
On 12/14/2017 11:13 PM, Bart Van Assche wrote:
> On Tue, 2017-12-12 at 09:57 +0100, Hannes Reinecke wrote:
>> With the current design we're waiting for all async probes to
>> finish when removing any sd device.
>> This might lead to a livelock where the 'remove' call is blocking
>> for any probe calls to finish, and the probe calls are waiting for
>> a response, which will never be processed as the thread handling
>> the responses is waiting for the remove call to finish.
>> Which is completely pointless as we only _really_ care for the
>> probe on _this_ device to be completed; any other probing can
>> happily continue for all we care.
>> So save the async probing cookie in the structure and only wait
>> if this specific probe is still active.
> 
> From async_synchronize_cookie_domain():
> 
> 	wait_event(async_done, lowest_in_progress(domain) >= cookie);
> 
> So async_synchronize_cookie_domain() also waits for multiple asynchronous
> probes to finish. Does this patch have any advantages over the patch I
> posted (https://marc.info/?l=linux-scsi&m=151275368714540)?
> 
Correct, it waits for all _previous_ entries to complete.
But this was precisely the point; previous entries (should) have all
necessary information to complete (they probably only have been
scheduled out for some reason), so a

The main advantage is that the change to make it work is relatively simple.
(And doesn't change the interface; something we as poor distribution
developers have to worry about...)

Cheers,

Hannes

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index abbab17..7bf20ca 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3416,7 +3416,8 @@  static int sd_probe(struct device *dev)
 	dev_set_drvdata(dev, sdkp);
 
 	get_device(&sdkp->dev);	/* prevent release before async_schedule */
-	async_schedule_domain(sd_probe_async, sdkp, &scsi_sd_probe_domain);
+	sdkp->async_probe = async_schedule_domain(sd_probe_async, sdkp,
+						  &scsi_sd_probe_domain);
 
 	return 0;
 
@@ -3454,7 +3455,8 @@  static int sd_remove(struct device *dev)
 	scsi_autopm_get_device(sdkp->device);
 
 	async_synchronize_full_domain(&scsi_sd_pm_domain);
-	async_synchronize_full_domain(&scsi_sd_probe_domain);
+	async_synchronize_cookie_domain(sdkp->async_probe,
+					&scsi_sd_probe_domain);
 	device_del(&sdkp->dev);
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 320de75..d8aff29 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -2,6 +2,8 @@ 
 #ifndef _SCSI_DISK_H
 #define _SCSI_DISK_H
 
+#include <linux/async.h>
+
 /*
  * More than enough for everybody ;)  The huge number of majors
  * is a leftover from 16bit dev_t days, we don't really need that
@@ -73,6 +75,7 @@  struct scsi_disk {
 	struct device	dev;
 	struct gendisk	*disk;
 	struct opal_dev *opal_dev;
+	async_cookie_t  async_probe;
 #ifdef CONFIG_BLK_DEV_ZONED
 	unsigned int	nr_zones;
 	unsigned int	zone_blocks;