Message ID | 20211110004440.3389311-8-bvanassche@acm.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | UFS patches for kernel v5.17 | expand |
Hi Bart, as pointed out last time: LLDDS have no business directy allocating tags. Please work with Hannes and John on the proper APIs, as metnioned last time as well.
> The following deadlock has been observed on a test setup: > * All tags allocated. > * The SCSI error handler calls ufshcd_eh_host_reset_handler() > * ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler() > * ufshcd_err_handler() locks up as follows: > > Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: > __switch_to+0x298/0x5d8 > __schedule+0x6cc/0xa94 > schedule+0x12c/0x298 > blk_mq_get_tag+0x210/0x480 > __blk_mq_alloc_request+0x1c8/0x284 > blk_get_request+0x74/0x134 > ufshcd_exec_dev_cmd+0x68/0x640 > ufshcd_verify_dev_init+0x68/0x35c > ufshcd_probe_hba+0x12c/0x1cb8 > ufshcd_host_reset_and_restore+0x88/0x254 > ufshcd_reset_and_restore+0xd0/0x354 > ufshcd_err_handler+0x408/0xc58 > process_one_work+0x24c/0x66c > worker_thread+0x3e8/0xa4c > kthread+0x150/0x1b4 > ret_from_fork+0x10/0x30 > > Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request. > > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/ufs/ufshcd.c | 14 +++++--------- > 1 file changed, 5 insertions(+), 9 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index > 8400d8e9a6f7..8f5640647054 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -2940,12 +2940,7 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba > *hba, > > down_read(&hba->clk_scaling_lock); > > - /* > - * Get free slot, sleep if slots are unavailable. > - * Even though we use wait_event() which sleeps indefinitely, > - * the maximum wait time is bounded by SCSI request timeout. > - */ > - req = blk_get_request(q, REQ_OP_DRV_OUT, 0); > + req = blk_mq_alloc_request(q, REQ_OP_DRV_OUT, > + BLK_MQ_REQ_RESERVED); > if (IS_ERR(req)) { > err = PTR_ERR(req); > goto out_unlock; > @@ -8152,7 +8147,8 @@ static struct scsi_host_template > ufshcd_driver_template = { > .this_id = -1, > .sg_tablesize = SG_ALL, > .cmd_per_lun = UFSHCD_CMD_PER_LUN, UFSHCD_CMD_PER_LUN needs to be < 32 as well? Thanks, Avri > - .can_queue = UFSHCD_CAN_QUEUE, > + .can_queue = UFSHCD_CAN_QUEUE - 1, > + .reserved_tags = 1, > .max_segment_size = PRDT_DATA_BYTE_COUNT_MAX, > .max_host_blocked = 1, > .track_queue_depth = 1, > @@ -9513,8 +9509,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem > *mmio_base, unsigned int irq) > /* Configure LRB */ > ufshcd_host_memory_configure(hba); > > - host->can_queue = hba->nutrs; > - host->cmd_per_lun = hba->nutrs; > + host->can_queue = hba->nutrs - 1; > + host->cmd_per_lun = hba->nutrs - 1; > host->max_id = UFSHCD_MAX_ID; > host->max_lun = UFS_MAX_LUNS; > host->max_channel = UFSHCD_MAX_CHANNEL;
On 11/9/21 10:42 PM, Christoph Hellwig wrote: > as pointed out last time: LLDDS have no business directy allocating > tags. Please work with Hannes and John on the proper APIs, as metnioned > last time as well. Hi Christoph, I will do this. Thanks, Bart.
On 11/10/21 11:33 PM, Avri Altman wrote: >> @@ -8152,7 +8147,8 @@ static struct scsi_host_template >> ufshcd_driver_template = { >> .this_id = -1, >> .sg_tablesize = SG_ALL, >> .cmd_per_lun = UFSHCD_CMD_PER_LUN, > > UFSHCD_CMD_PER_LUN needs to be < 32 as well? I will make that change although that change should not affect the behavior of the UFS driver since the SCSI core limits cmd_per_lun to host->can_queue. Thanks, Bart.
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 8400d8e9a6f7..8f5640647054 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -2940,12 +2940,7 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba, down_read(&hba->clk_scaling_lock); - /* - * Get free slot, sleep if slots are unavailable. - * Even though we use wait_event() which sleeps indefinitely, - * the maximum wait time is bounded by SCSI request timeout. - */ - req = blk_get_request(q, REQ_OP_DRV_OUT, 0); + req = blk_mq_alloc_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED); if (IS_ERR(req)) { err = PTR_ERR(req); goto out_unlock; @@ -8152,7 +8147,8 @@ static struct scsi_host_template ufshcd_driver_template = { .this_id = -1, .sg_tablesize = SG_ALL, .cmd_per_lun = UFSHCD_CMD_PER_LUN, - .can_queue = UFSHCD_CAN_QUEUE, + .can_queue = UFSHCD_CAN_QUEUE - 1, + .reserved_tags = 1, .max_segment_size = PRDT_DATA_BYTE_COUNT_MAX, .max_host_blocked = 1, .track_queue_depth = 1, @@ -9513,8 +9509,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) /* Configure LRB */ ufshcd_host_memory_configure(hba); - host->can_queue = hba->nutrs; - host->cmd_per_lun = hba->nutrs; + host->can_queue = hba->nutrs - 1; + host->cmd_per_lun = hba->nutrs - 1; host->max_id = UFSHCD_MAX_ID; host->max_lun = UFS_MAX_LUNS; host->max_channel = UFSHCD_MAX_CHANNEL;
The following deadlock has been observed on a test setup: * All tags allocated. * The SCSI error handler calls ufshcd_eh_host_reset_handler() * ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler() * ufshcd_err_handler() locks up as follows: Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: __switch_to+0x298/0x5d8 __schedule+0x6cc/0xa94 schedule+0x12c/0x298 blk_mq_get_tag+0x210/0x480 __blk_mq_alloc_request+0x1c8/0x284 blk_get_request+0x74/0x134 ufshcd_exec_dev_cmd+0x68/0x640 ufshcd_verify_dev_init+0x68/0x35c ufshcd_probe_hba+0x12c/0x1cb8 ufshcd_host_reset_and_restore+0x88/0x254 ufshcd_reset_and_restore+0xd0/0x354 ufshcd_err_handler+0x408/0xc58 process_one_work+0x24c/0x66c worker_thread+0x3e8/0xa4c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30 Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request. Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/scsi/ufs/ufshcd.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-)