diff mbox series

[2/2] block: mark the disk dead before taking open_mutx in del_gendisk

Message ID 20241008115756.355936-3-hch@lst.de (mailing list archive)
State New
Headers show
Series [1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead | expand

Commit Message

Christoph Hellwig Oct. 8, 2024, 11:57 a.m. UTC
Now that we stop sd and sr from submitting passthrough commands from
their ->release methods we can and should start the drain before taking
->open_mutex, so that we can entirely prevent this kind of deadlock by
ensuring that the disk is clearly marked dead before open_mutex is
taken in del_gendisk.

This includes a revert of commit 7e04da2dc701 ("block: fix deadlock
between sd_remove & sd_release"), which was a partial fix for a similar
deadlock.

Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/genhd.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

Comments

Sergey Senozhatsky Oct. 9, 2024, 5:04 a.m. UTC | #1
On (24/10/08 13:57), Christoph Hellwig wrote:
> Now that we stop sd and sr from submitting passthrough commands from
> their ->release methods we can and should start the drain before taking
> ->open_mutex, so that we can entirely prevent this kind of deadlock by
> ensuring that the disk is clearly marked dead before open_mutex is
> taken in del_gendisk.
> 
> This includes a revert of commit 7e04da2dc701 ("block: fix deadlock
> between sd_remove & sd_release"), which was a partial fix for a similar
> deadlock.
> 
> Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

FWIW
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
diff mbox series

Patch

diff --git a/block/genhd.c b/block/genhd.c
index ac1c496ad43343..98c8d13171c642 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -655,16 +655,6 @@  void del_gendisk(struct gendisk *disk)
 	if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN)))
 		return;
 
-	disk_del_events(disk);
-
-	/*
-	 * Prevent new openers by unlinked the bdev inode.
-	 */
-	mutex_lock(&disk->open_mutex);
-	xa_for_each(&disk->part_tbl, idx, part)
-		bdev_unhash(part);
-	mutex_unlock(&disk->open_mutex);
-
 	/*
 	 * Tell the file system to write back all dirty data and shut down if
 	 * it hasn't been notified earlier.
@@ -673,10 +663,22 @@  void del_gendisk(struct gendisk *disk)
 		blk_report_disk_dead(disk, false);
 
 	/*
-	 * Drop all partitions now that the disk is marked dead.
+	 * Then mark the disk dead to stop new requests from being served ASAP.
+	 * This needs to happen before taking ->open_mutex to prevent deadlocks
+	 * with SCSI ULPs that send passthrough commands from their ->release
+	 * methods.
 	 */
-	mutex_lock(&disk->open_mutex);
 	__blk_mark_disk_dead(disk);
+
+	disk_del_events(disk);
+
+	/*
+	 * Prevent new openers by unlinking the bdev inode, and drop all
+	 * partitions.
+	 */
+	mutex_lock(&disk->open_mutex);
+	xa_for_each(&disk->part_tbl, idx, part)
+		bdev_unhash(part);
 	xa_for_each_start(&disk->part_tbl, idx, part, 1)
 		drop_partition(part);
 	mutex_unlock(&disk->open_mutex);