diff mbox series

[v2] blk-mq: fix a hung issue when set device state to blocked and restore running

Message ID 1553138480-27397-1-git-send-email-zhengbin13@huawei.com (mailing list archive)
State Superseded
Headers show
Series [v2] blk-mq: fix a hung issue when set device state to blocked and restore running | expand

Commit Message

Zheng Bin March 21, 2019, 3:21 a.m. UTC
When I use dd test a SCSI device which use blk-mq in the following steps:
1.echo "blocked" >/sys/block/sda/device/state
2.dd if=/dev/sda of=/mnt/t.log bs=1M count=10
3.echo "running" >/sys/block/sda/device/state
dd should finish this work after step 3, unfortunately, still hung.

After step2, the key code process is like this:
blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq

prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will transter
it to BLK_STS_DEV_RESOURCE, which means that driver can guarantee that
IO dispatch will be triggered in future when the resource is available.
Need to follow the rule if we set the device state to running.

Signed-off-by: zhengbin <zhengbin13@huawei.com>
---
 drivers/scsi/scsi_sysfs.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

--
2.7.4

Comments

Ming Lei March 21, 2019, 3:55 a.m. UTC | #1
On Thu, Mar 21, 2019 at 11:21:20AM +0800, zhengbin wrote:
> When I use dd test a SCSI device which use blk-mq in the following steps:
> 1.echo "blocked" >/sys/block/sda/device/state
> 2.dd if=/dev/sda of=/mnt/t.log bs=1M count=10
> 3.echo "running" >/sys/block/sda/device/state
> dd should finish this work after step 3, unfortunately, still hung.
> 
> After step2, the key code process is like this:
> blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq
> 
> prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will transter
> it to BLK_STS_DEV_RESOURCE, which means that driver can guarantee that
> IO dispatch will be triggered in future when the resource is available.
> Need to follow the rule if we set the device state to running.

IMO, it is the correct direction for fixing the issue.

> 
> Signed-off-by: zhengbin <zhengbin13@huawei.com>
> ---
>  drivers/scsi/scsi_sysfs.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 6a9040f..ce60408 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -770,7 +770,20 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  		return -EINVAL;
> 
>  	mutex_lock(&sdev->state_mutex);
> +	enum scsi_device_state oldstate = sdev->sdev_state;
>  	ret = scsi_device_set_state(sdev, state);
> +	if (ret == 0) {
> +		/* If device use blk-mq, the device state changes from
> +		 * SDEV_BLOCK to SDEV_RUNNING, we need to run hw queue
> +		 * to avoid io hung.
> +		 */
> +		if ((state == SDEV_RUNNING) && (oldstate == SDEV_BLOCK)) {
> +			struct request_queue *q = sdev->request_queue;

It is fine to run queue unconditionally in case of 'state ==
SDEV_RUNNING'.

> +
> +			if (q->mq_ops)

The above check isn't needed.

> +				blk_mq_run_hw_queues(q, true);

I suggest to move blk_mq_run_hw_queues() out of the mutex for
avoiding any potential trouble.

Thanks,
Ming
jianchao.wang March 21, 2019, 4:20 a.m. UTC | #2
Hi Ming

On 3/21/19 11:55 AM, Ming Lei wrote:
> On Thu, Mar 21, 2019 at 11:21:20AM +0800, zhengbin wrote:
>> When I use dd test a SCSI device which use blk-mq in the following steps:
>> 1.echo "blocked" >/sys/block/sda/device/state
>> 2.dd if=/dev/sda of=/mnt/t.log bs=1M count=10
>> 3.echo "running" >/sys/block/sda/device/state
>> dd should finish this work after step 3, unfortunately, still hung.
>>
>> After step2, the key code process is like this:
>> blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq
>>
>> prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will transter
>> it to BLK_STS_DEV_RESOURCE, which means that driver can guarantee that
>> IO dispatch will be triggered in future when the resource is available.
>> Need to follow the rule if we set the device state to running.
> 
> IMO, it is the correct direction for fixing the issue.

Do you think we should provide more guarantee when setting sdev state through sysfs ?

In the current implementation, store_state_field does nothing but just modify state.

For example, when we set state to 'blocked', it cannot ensure the tasks that has escaped
the checking of state in scsi_queue_rq has quit, when we return from the sysfs. 

Thanks
Jianchao
> 
>>
>> Signed-off-by: zhengbin <zhengbin13@huawei.com>
>> ---
>>  drivers/scsi/scsi_sysfs.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>> index 6a9040f..ce60408 100644
>> --- a/drivers/scsi/scsi_sysfs.c
>> +++ b/drivers/scsi/scsi_sysfs.c
>> @@ -770,7 +770,20 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>>  		return -EINVAL;
>>
>>  	mutex_lock(&sdev->state_mutex);
>> +	enum scsi_device_state oldstate = sdev->sdev_state;
>>  	ret = scsi_device_set_state(sdev, state);
>> +	if (ret == 0) {
>> +		/* If device use blk-mq, the device state changes from
>> +		 * SDEV_BLOCK to SDEV_RUNNING, we need to run hw queue
>> +		 * to avoid io hung.
>> +		 */
>> +		if ((state == SDEV_RUNNING) && (oldstate == SDEV_BLOCK)) {
>> +			struct request_queue *q = sdev->request_queue;
> 
> It is fine to run queue unconditionally in case of 'state ==
> SDEV_RUNNING'.
> 
>> +
>> +			if (q->mq_ops)
> 
> The above check isn't needed.
> 
>> +				blk_mq_run_hw_queues(q, true);
> 
> I suggest to move blk_mq_run_hw_queues() out of the mutex for
> avoiding any potential trouble.
> 
> Thanks,
> Ming
>
Bart Van Assche March 21, 2019, 4:22 a.m. UTC | #3
On 3/20/19 8:21 PM, zhengbin wrote:
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 6a9040f..ce60408 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -770,7 +770,20 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>   		return -EINVAL;
> 
>   	mutex_lock(&sdev->state_mutex);
> +	enum scsi_device_state oldstate = sdev->sdev_state;
>   	ret = scsi_device_set_state(sdev, state);
> +	if (ret == 0) {
> +		/* If device use blk-mq, the device state changes from
> +		 * SDEV_BLOCK to SDEV_RUNNING, we need to run hw queue
> +		 * to avoid io hung.
> +		 */
> +		if ((state == SDEV_RUNNING) && (oldstate == SDEV_BLOCK)) {
> +			struct request_queue *q = sdev->request_queue;
> +
> +			if (q->mq_ops)
> +				blk_mq_run_hw_queues(q, true);
> +		}
> +	}
>   	mutex_unlock(&sdev->state_mutex);

Hi Zhengbin,

In addition to what Ming asked, please combine the nested if-statements 
into a single if-statement and leave out the superfluous parentheses.

Thanks,

Bart.
kernel test robot March 21, 2019, 10:40 a.m. UTC | #4
Hi zhengbin,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on mkp-scsi/for-next]
[also build test WARNING on v5.1-rc1 next-20190321]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/zhengbin/blk-mq-fix-a-hung-issue-when-set-device-state-to-blocked-and-restore-running/20190321-180603
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: x86_64-randconfig-x012-201911 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/scsi/scsi_sysfs.c: In function 'store_state_field':
>> drivers/scsi/scsi_sysfs.c:773:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
     enum scsi_device_state oldstate = sdev->sdev_state;
     ^~~~

vim +773 drivers/scsi/scsi_sysfs.c

   752	
   753	static ssize_t
   754	store_state_field(struct device *dev, struct device_attribute *attr,
   755			  const char *buf, size_t count)
   756	{
   757		int i, ret;
   758		struct scsi_device *sdev = to_scsi_device(dev);
   759		enum scsi_device_state state = 0;
   760	
   761		for (i = 0; i < ARRAY_SIZE(sdev_states); i++) {
   762			const int len = strlen(sdev_states[i].name);
   763			if (strncmp(sdev_states[i].name, buf, len) == 0 &&
   764			   buf[len] == '\n') {
   765				state = sdev_states[i].value;
   766				break;
   767			}
   768		}
   769		if (!state)
   770			return -EINVAL;
   771	
   772		mutex_lock(&sdev->state_mutex);
 > 773		enum scsi_device_state oldstate = sdev->sdev_state;
   774		ret = scsi_device_set_state(sdev, state);
   775		if (ret == 0) {
   776			/* If device use blk-mq, the device state changes from
   777			 * SDEV_BLOCK to SDEV_RUNNING, we need to run hw queue
   778			 * to avoid io hung.
   779			 */
   780			if ((state == SDEV_RUNNING) && (oldstate == SDEV_BLOCK)) {
   781				struct request_queue *q = sdev->request_queue;
   782	
   783				if (q->mq_ops)
   784					blk_mq_run_hw_queues(q, true);
   785			}
   786		}
   787		mutex_unlock(&sdev->state_mutex);
   788	
   789		return ret == 0 ? count : -EINVAL;
   790	}
   791	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Bart Van Assche March 21, 2019, 1:51 p.m. UTC | #5
On 3/20/19 8:55 PM, Ming Lei wrote:
> I suggest to move blk_mq_run_hw_queues() out of the mutex for
> avoiding any potential trouble.

Which trouble? I don't think any trouble can arise from holding a mutex 
around a request to run a request queue asynchronously.

Bart.
Bart Van Assche March 21, 2019, 3:08 p.m. UTC | #6
On Thu, 2019-03-21 at 12:20 +0800, jianchao.wang wrote:
> Do you think we should provide more guarantee when setting sdev state through sysfs ?
> 
> In the current implementation, store_state_field does nothing but just modify state.
> 
> For example, when we set state to 'blocked', it cannot ensure the tasks that has escaped
> the checking of state in scsi_queue_rq has quit, when we return from the sysfs. 

I would like to know which real workloads store "blocked" in this sysfs
attribute. If there are no such workloads, I would prefer no longer to allow
"blocked" being written in this sysfs attribute rather than making the
implementation more complex.

Thanks,

Bart.
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 6a9040f..ce60408 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -770,7 +770,20 @@  store_state_field(struct device *dev, struct device_attribute *attr,
 		return -EINVAL;

 	mutex_lock(&sdev->state_mutex);
+	enum scsi_device_state oldstate = sdev->sdev_state;
 	ret = scsi_device_set_state(sdev, state);
+	if (ret == 0) {
+		/* If device use blk-mq, the device state changes from
+		 * SDEV_BLOCK to SDEV_RUNNING, we need to run hw queue
+		 * to avoid io hung.
+		 */
+		if ((state == SDEV_RUNNING) && (oldstate == SDEV_BLOCK)) {
+			struct request_queue *q = sdev->request_queue;
+
+			if (q->mq_ops)
+				blk_mq_run_hw_queues(q, true);
+		}
+	}
 	mutex_unlock(&sdev->state_mutex);

 	return ret == 0 ? count : -EINVAL;