[v2] block: fix deadlock between sd_remove & sd_release

Message ID 20240722091633.13128-1-yang.yang@vivo.com (mailing list archive)
State New, archived
Series [v2] block: fix deadlock between sd_remove & sd_release

Commit Message

Yang Yang July 22, 2024, 9:16 a.m. UTC
Our test reported the following hung task:

[ 2538.459400] INFO: task "kworker/0:0":7 blocked for more than 188 seconds.
[ 2538.459427] Call trace:
[ 2538.459430]  __switch_to+0x174/0x338
[ 2538.459436]  __schedule+0x628/0x9c4
[ 2538.459442]  schedule+0x7c/0xe8
[ 2538.459447]  schedule_preempt_disabled+0x24/0x40
[ 2538.459453]  __mutex_lock+0x3ec/0xf04
[ 2538.459456]  __mutex_lock_slowpath+0x14/0x24
[ 2538.459459]  mutex_lock+0x30/0xd8
[ 2538.459462]  del_gendisk+0xdc/0x350
[ 2538.459466]  sd_remove+0x30/0x60
[ 2538.459470]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459474]  device_release_driver+0x18/0x28
[ 2538.459478]  bus_remove_device+0x15c/0x174
[ 2538.459483]  device_del+0x1d0/0x358
[ 2538.459488]  __scsi_remove_device+0xa8/0x198
[ 2538.459493]  scsi_forget_host+0x50/0x70
[ 2538.459497]  scsi_remove_host+0x80/0x180
[ 2538.459502]  usb_stor_disconnect+0x68/0xf4
[ 2538.459506]  usb_unbind_interface+0xd4/0x280
[ 2538.459510]  device_release_driver_internal+0x1c4/0x2c4
[ 2538.459514]  device_release_driver+0x18/0x28
[ 2538.459518]  bus_remove_device+0x15c/0x174
[ 2538.459523]  device_del+0x1d0/0x358
[ 2538.459528]  usb_disable_device+0x84/0x194
[ 2538.459532]  usb_disconnect+0xec/0x300
[ 2538.459537]  hub_event+0xb80/0x1870
[ 2538.459541]  process_scheduled_works+0x248/0x4dc
[ 2538.459545]  worker_thread+0x244/0x334
[ 2538.459549]  kthread+0x114/0x1bc

[ 2538.461001] INFO: task "fsck.":15415 blocked for more than 188 seconds.
[ 2538.461014] Call trace:
[ 2538.461016]  __switch_to+0x174/0x338
[ 2538.461021]  __schedule+0x628/0x9c4
[ 2538.461025]  schedule+0x7c/0xe8
[ 2538.461030]  blk_queue_enter+0xc4/0x160
[ 2538.461034]  blk_mq_alloc_request+0x120/0x1d4
[ 2538.461037]  scsi_execute_cmd+0x7c/0x23c
[ 2538.461040]  ioctl_internal_command+0x5c/0x164
[ 2538.461046]  scsi_set_medium_removal+0x5c/0xb0
[ 2538.461051]  sd_release+0x50/0x94
[ 2538.461054]  blkdev_put+0x190/0x28c
[ 2538.461058]  blkdev_release+0x28/0x40
[ 2538.461063]  __fput+0xf8/0x2a8
[ 2538.461066]  __fput_sync+0x28/0x5c
[ 2538.461070]  __arm64_sys_close+0x84/0xe8
[ 2538.461073]  invoke_syscall+0x58/0x114
[ 2538.461078]  el0_svc_common+0xac/0xe0
[ 2538.461082]  do_el0_svc+0x1c/0x28
[ 2538.461087]  el0_svc+0x38/0x68
[ 2538.461090]  el0t_64_sync_handler+0x68/0xbc
[ 2538.461093]  el0t_64_sync+0x1a8/0x1ac

  T1:                           T2:
  sd_remove
  del_gendisk
  __blk_mark_disk_dead
  blk_freeze_queue_start
  ++q->mq_freeze_depth
                                bdev_release
                                mutex_lock(&disk->open_mutex)
                                sd_release
                                scsi_execute_cmd
                                blk_queue_enter
                                wait_event(!q->mq_freeze_depth)
  mutex_lock(&disk->open_mutex)

SCSI does not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING is not set in
this scenario. This is a classic ABBA deadlock. To fix the deadlock,
make sure we don't try to acquire disk->open_mutex after freezing
the queue.
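
For context, the wait that traps T2 is in blk_queue_enter(). The sketch
below is heavily abbreviated (NOWAIT, runtime-PM and memory-barrier
handling are omitted) and only illustrates why, with QUEUE_FLAG_DYING
never set, the wait can only end once T1 drops the freeze, which T1
cannot do while it is itself blocked on disk->open_mutex:

int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
{
	while (!blk_try_enter_queue(q, false)) {
		/*
		 * Sleep until the queue is unfrozen or marked dying.
		 * SCSI disks do not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING
		 * is never set here and the only way forward is for
		 * q->mq_freeze_depth to drop back to zero.
		 */
		wait_event(q->mq_freeze_wq,
			   !q->mq_freeze_depth || blk_queue_dying(q));
		if (blk_queue_dying(q))
			return -ENODEV;
	}
	return 0;
}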

Signed-off-by: Yang Yang <yang.yang@vivo.com>

---
Changes from v1:
  - Modify the commit message as suggested
---
 block/genhd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ming Lei July 23, 2024, 1:33 a.m. UTC | #1
On Mon, Jul 22, 2024 at 5:17 PM Yang Yang <yang.yang@vivo.com> wrote:
>
> Our test reported the following hung task:
>
> [ 2538.459400] INFO: task "kworker/0:0":7 blocked for more than 188 seconds.
> [ 2538.459427] Call trace:
> [ 2538.459430]  __switch_to+0x174/0x338
> [ 2538.459436]  __schedule+0x628/0x9c4
> [ 2538.459442]  schedule+0x7c/0xe8
> [ 2538.459447]  schedule_preempt_disabled+0x24/0x40
> [ 2538.459453]  __mutex_lock+0x3ec/0xf04
> [ 2538.459456]  __mutex_lock_slowpath+0x14/0x24
> [ 2538.459459]  mutex_lock+0x30/0xd8
> [ 2538.459462]  del_gendisk+0xdc/0x350
> [ 2538.459466]  sd_remove+0x30/0x60
> [ 2538.459470]  device_release_driver_internal+0x1c4/0x2c4
> [ 2538.459474]  device_release_driver+0x18/0x28
> [ 2538.459478]  bus_remove_device+0x15c/0x174
> [ 2538.459483]  device_del+0x1d0/0x358
> [ 2538.459488]  __scsi_remove_device+0xa8/0x198
> [ 2538.459493]  scsi_forget_host+0x50/0x70
> [ 2538.459497]  scsi_remove_host+0x80/0x180
> [ 2538.459502]  usb_stor_disconnect+0x68/0xf4
> [ 2538.459506]  usb_unbind_interface+0xd4/0x280
> [ 2538.459510]  device_release_driver_internal+0x1c4/0x2c4
> [ 2538.459514]  device_release_driver+0x18/0x28
> [ 2538.459518]  bus_remove_device+0x15c/0x174
> [ 2538.459523]  device_del+0x1d0/0x358
> [ 2538.459528]  usb_disable_device+0x84/0x194
> [ 2538.459532]  usb_disconnect+0xec/0x300
> [ 2538.459537]  hub_event+0xb80/0x1870
> [ 2538.459541]  process_scheduled_works+0x248/0x4dc
> [ 2538.459545]  worker_thread+0x244/0x334
> [ 2538.459549]  kthread+0x114/0x1bc
>
> [ 2538.461001] INFO: task "fsck.":15415 blocked for more than 188 seconds.
> [ 2538.461014] Call trace:
> [ 2538.461016]  __switch_to+0x174/0x338
> [ 2538.461021]  __schedule+0x628/0x9c4
> [ 2538.461025]  schedule+0x7c/0xe8
> [ 2538.461030]  blk_queue_enter+0xc4/0x160
> [ 2538.461034]  blk_mq_alloc_request+0x120/0x1d4
> [ 2538.461037]  scsi_execute_cmd+0x7c/0x23c
> [ 2538.461040]  ioctl_internal_command+0x5c/0x164
> [ 2538.461046]  scsi_set_medium_removal+0x5c/0xb0
> [ 2538.461051]  sd_release+0x50/0x94
> [ 2538.461054]  blkdev_put+0x190/0x28c
> [ 2538.461058]  blkdev_release+0x28/0x40
> [ 2538.461063]  __fput+0xf8/0x2a8
> [ 2538.461066]  __fput_sync+0x28/0x5c
> [ 2538.461070]  __arm64_sys_close+0x84/0xe8
> [ 2538.461073]  invoke_syscall+0x58/0x114
> [ 2538.461078]  el0_svc_common+0xac/0xe0
> [ 2538.461082]  do_el0_svc+0x1c/0x28
> [ 2538.461087]  el0_svc+0x38/0x68
> [ 2538.461090]  el0t_64_sync_handler+0x68/0xbc
> [ 2538.461093]  el0t_64_sync+0x1a8/0x1ac
>
>   T1:                           T2:
>   sd_remove
>   del_gendisk
>   __blk_mark_disk_dead
>   blk_freeze_queue_start
>   ++q->mq_freeze_depth
>                                 bdev_release
>                                 mutex_lock(&disk->open_mutex)
>                                 sd_release
>                                 scsi_execute_cmd
>                                 blk_queue_enter
>                                 wait_event(!q->mq_freeze_depth)
>   mutex_lock(&disk->open_mutex)
>
> SCSI does not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING is not set in
> this scenario. This is a classic ABBA deadlock. To fix the deadlock,

del_gendisk() shouldn't fail scsi_execute_cmd(), so QUEUE_FLAG_DYING
can't be set.

> make sure we don't try to acquire disk->open_mutex after freezing
> the queue.
>
> Signed-off-by: Yang Yang <yang.yang@vivo.com>
>
> ---
> Changes from v1:
>   - Modify the commit message as suggested
> ---
>  block/genhd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 8f1f3c6b4d67..c5fca3e893a0 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -663,12 +663,12 @@ void del_gendisk(struct gendisk *disk)
>          */
>         if (!test_bit(GD_DEAD, &disk->state))
>                 blk_report_disk_dead(disk, false);
> -       __blk_mark_disk_dead(disk);
>
>         /*
>          * Drop all partitions now that the disk is marked dead.
>          */
>         mutex_lock(&disk->open_mutex);
> +       __blk_mark_disk_dead(disk);
>         xa_for_each_start(&disk->part_tbl, idx, part, 1)
>                 drop_partition(part);
>         mutex_unlock(&disk->open_mutex);

This fix looks fine, and the added lock dependency is safe since
mq_freeze_lock is a block-layer-internal lock, so:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
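
To make the added lock dependency concrete, here is a simplified
illustration of the nesting in del_gendisk() after the patch,
reconstructed from the call chain in the commit message rather than
copied from the source:

	mutex_lock(&disk->open_mutex);            /* outer lock */
	__blk_mark_disk_dead(disk);
		blk_freeze_queue_start(q);
			mutex_lock(&q->mq_freeze_lock);   /* block-layer-internal */
			++q->mq_freeze_depth;
			mutex_unlock(&q->mq_freeze_lock);
	xa_for_each_start(&disk->part_tbl, idx, part, 1)
		drop_partition(part);
	mutex_unlock(&disk->open_mutex);

q->mq_freeze_lock is only ever taken inside the block layer and is
never held while sleeping on disk->open_mutex, so the new
open_mutex -> mq_freeze_lock ordering cannot close a cycle.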
Bart Van Assche July 23, 2024, 4:28 p.m. UTC | #2
On 7/22/24 2:16 AM, Yang Yang wrote:
> SCSI does not set GD_OWNS_QUEUE, so QUEUE_FLAG_DYING is not set in
> this scenario. This is a classic ABBA deadlock. To fix the deadlock,
> make sure we don't try to acquire disk->open_mutex after freezing
> the queue.
> 
> Signed-off-by: Yang Yang <yang.yang@vivo.com>

Fixes: and Cc: stable tags are missing. Otherwise this patch looks fine
to me, hence:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
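
For reference, the tags Bart asks for go in the commit footer next to
the Signed-off-by. The SHA below is only a placeholder; the actual
Fixes: target is the commit that introduced the problematic ordering
and still needs to be identified:

	Fixes: 123456789abc ("placeholder: commit that introduced the ordering")
	Cc: stable@vger.kernel.org
	Signed-off-by: Yang Yang <yang.yang@vivo.com>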

Patch

diff --git a/block/genhd.c b/block/genhd.c
index 8f1f3c6b4d67..c5fca3e893a0 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -663,12 +663,12 @@ void del_gendisk(struct gendisk *disk)
 	 */
 	if (!test_bit(GD_DEAD, &disk->state))
 		blk_report_disk_dead(disk, false);
-	__blk_mark_disk_dead(disk);
 
 	/*
 	 * Drop all partitions now that the disk is marked dead.
 	 */
 	mutex_lock(&disk->open_mutex);
+	__blk_mark_disk_dead(disk);
 	xa_for_each_start(&disk->part_tbl, idx, part, 1)
 		drop_partition(part);
 	mutex_unlock(&disk->open_mutex);