diff mbox

blk-mq: Make sure that the affected zone is unlocked if a request times out

Message ID 20180228002830.3052-1-bart.vanassche@wdc.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bart Van Assche Feb. 28, 2018, 12:28 a.m. UTC
If a request times out the .completed_request() method is not called
and the .finish_request() method is only called if RQF_ELVPRIV has
been set. Hence this patch that sets RQF_ELVPRIV and that adds a
.finish_request() method. Without this patch, if a request times out
the zone that request applies to remains locked forever and no further
writes are accepted for that zone.

Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
---
 block/mq-deadline.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Ming Lei Feb. 28, 2018, 1:35 a.m. UTC | #1
On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote:
> If a request times out the .completed_request() method is not called

If BLK_EH_HANDLED is returned from .timeout(), __blk_mq_complete_request()
should have called .completed_request(). Otherwise, somewhere may be
wrong about timeout handling. Could you investigate why .completed_request
isn't called under this situation?

> and the .finish_request() method is only called if RQF_ELVPRIV has

.finish_request() is counter-pair of .prepare_request(), and both
aren't implemented by mq-deadline, so RQF_ELVPRIV needn't to be set,
and the current rule is that this flag is managed by block core.

> been set. Hence this patch that sets RQF_ELVPRIV and that adds a
> .finish_request() method. Without this patch, if a request times out
> the zone that request applies to remains locked forever and no further
> writes are accepted for that zone.
> 
> Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
>  block/mq-deadline.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index c56f211c8440..55d5b7a02d62 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -367,7 +367,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
>  	 * If the request needs its target zone locked, do it.
>  	 */
>  	blk_req_zone_write_lock(rq);
> -	rq->rq_flags |= RQF_STARTED;
> +	/* Set RQF_ELVPRIV to ensure that .finish_request() gets called */
> +	rq->rq_flags |= RQF_STARTED | RQF_ELVPRIV;
>  	return rq;
>  }
>  
> @@ -539,7 +540,9 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
>   * For zoned block devices, write unlock the target zone of
>   * completed write requests. Do this while holding the zone lock
>   * spinlock so that the zone is never unlocked while deadline_fifo_request()
> - * while deadline_next_request() are executing.
> + * while deadline_next_request() are executing. This function is called twice
> + * for requests that complete in a normal way and once for requests that time
> + * out.
>   */
>  static void dd_completed_request(struct request *rq)
>  {
> @@ -757,6 +760,7 @@ static struct elevator_type mq_deadline = {
>  		.insert_requests	= dd_insert_requests,
>  		.dispatch_request	= dd_dispatch_request,
>  		.completed_request	= dd_completed_request,
> +		.finish_request		= dd_completed_request,
>  		.next_request		= elv_rb_latter_request,
>  		.former_request		= elv_rb_former_request,
>  		.bio_merge		= dd_bio_merge,
> -- 
> 2.16.2
>
Damien Le Moal Feb. 28, 2018, 2:21 a.m. UTC | #2
Ming,

On 2018/02/27 17:35, Ming Lei wrote:
> On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote:

>> If a request times out the .completed_request() method is not called

> 

> If BLK_EH_HANDLED is returned from .timeout(), __blk_mq_complete_request()

> should have called .completed_request(). Otherwise, somewhere may be

> wrong about timeout handling. Could you investigate why .completed_request

> isn't called under this situation?


Actually, the commit message is a little misleading. The problem is not only for
timeout but also for commands completing with a failure. This is very easy to
reproduce by simply doing an unaligned write to a sequential zone on an ATA
zoned block device. In this case, the scheduler .completed_request method is not
called, which result in the target zone of the failed write to remain locked.

Hence the addition of a .finish_request method in mq-deadline pointing to the
same function as .completed_request to ensure that the command target zone is
unlocked. To ensure that the .finish_request method is called, the RQF_ELVPRIV
flag is set when the request is dispatched after the target zone was locked.

>> and the .finish_request() method is only called if RQF_ELVPRIV has

> 

> .finish_request() is counter-pair of .prepare_request(), and both

> aren't implemented by mq-deadline, so RQF_ELVPRIV needn't to be set,

> and the current rule is that this flag is managed by block core.


Indeed. So do you think it would be better to rather force a call to
.completed_request for failed command in ATA case ? Currently, it is not called
after all retries for the command failed.

> 

>> been set. Hence this patch that sets RQF_ELVPRIV and that adds a

>> .finish_request() method. Without this patch, if a request times out

>> the zone that request applies to remains locked forever and no further

>> writes are accepted for that zone.

>>

>> Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")

>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

>> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>

>> Cc: Hannes Reinecke <hare@suse.com>

>> Cc: Ming Lei <ming.lei@redhat.com>

>> ---

>>  block/mq-deadline.c | 8 ++++++--

>>  1 file changed, 6 insertions(+), 2 deletions(-)

>>

>> diff --git a/block/mq-deadline.c b/block/mq-deadline.c

>> index c56f211c8440..55d5b7a02d62 100644

>> --- a/block/mq-deadline.c

>> +++ b/block/mq-deadline.c

>> @@ -367,7 +367,8 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)

>>  	 * If the request needs its target zone locked, do it.

>>  	 */

>>  	blk_req_zone_write_lock(rq);

>> -	rq->rq_flags |= RQF_STARTED;

>> +	/* Set RQF_ELVPRIV to ensure that .finish_request() gets called */

>> +	rq->rq_flags |= RQF_STARTED | RQF_ELVPRIV;

>>  	return rq;

>>  }

>>  

>> @@ -539,7 +540,9 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,

>>   * For zoned block devices, write unlock the target zone of

>>   * completed write requests. Do this while holding the zone lock

>>   * spinlock so that the zone is never unlocked while deadline_fifo_request()

>> - * while deadline_next_request() are executing.

>> + * while deadline_next_request() are executing. This function is called twice

>> + * for requests that complete in a normal way and once for requests that time

>> + * out.

>>   */

>>  static void dd_completed_request(struct request *rq)

>>  {

>> @@ -757,6 +760,7 @@ static struct elevator_type mq_deadline = {

>>  		.insert_requests	= dd_insert_requests,

>>  		.dispatch_request	= dd_dispatch_request,

>>  		.completed_request	= dd_completed_request,

>> +		.finish_request		= dd_completed_request,

>>  		.next_request		= elv_rb_latter_request,

>>  		.former_request		= elv_rb_former_request,

>>  		.bio_merge		= dd_bio_merge,

>> -- 

>> 2.16.2

>>

> 



-- 
Damien Le Moal
Western Digital Research
Ming Lei Feb. 28, 2018, 3:09 a.m. UTC | #3
Hi Damien,

On Wed, Feb 28, 2018 at 02:21:49AM +0000, Damien Le Moal wrote:
> Ming,
> 
> On 2018/02/27 17:35, Ming Lei wrote:
> > On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote:
> >> If a request times out the .completed_request() method is not called
> > 
> > If BLK_EH_HANDLED is returned from .timeout(), __blk_mq_complete_request()
> > should have called .completed_request(). Otherwise, somewhere may be
> > wrong about timeout handling. Could you investigate why .completed_request
> > isn't called under this situation?
> 
> Actually, the commit message is a little misleading. The problem is not only for
> timeout but also for commands completing with a failure. This is very easy to
> reproduce by simply doing an unaligned write to a sequential zone on an ATA
> zoned block device. In this case, the scheduler .completed_request method is not
> called, which result in the target zone of the failed write to remain locked.

Actually request should have been completed in case of timeout,
otherwise the race between timeout and normal completion can't be
avoided.

But for dispatch failure, we deal with that with blk_mq_end_request(IOERR)
directly, please see blk_mq_dispatch_rq_list(), so the failed request is
freed without completion.

> 
> Hence the addition of a .finish_request method in mq-deadline pointing to the
> same function as .completed_request to ensure that the command target zone is
> unlocked. To ensure that the .finish_request method is called, the RQF_ELVPRIV
> flag is set when the request is dispatched after the target zone was locked.

> 
> >> and the .finish_request() method is only called if RQF_ELVPRIV has
> > 
> > .finish_request() is counter-pair of .prepare_request(), and both
> > aren't implemented by mq-deadline, so RQF_ELVPRIV needn't to be set,
> > and the current rule is that this flag is managed by block core.
> 
> Indeed. So do you think it would be better to rather force a call to
> .completed_request for failed command in ATA case ? Currently, it is not called
> after all retries for the command failed.

Now we know the reason, seems fine with either way:

1) handle it before calling blk_mq_end_request(IOERR)

2) introduce both .prepare_request()/.finish_request(), and do req zone
write lock in .dispatch_reqeust, but unlock in .finish_request, just like
what kyber does.


Thanks,
Ming
diff mbox

Patch

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index c56f211c8440..55d5b7a02d62 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -367,7 +367,8 @@  static struct request *__dd_dispatch_request(struct deadline_data *dd)
 	 * If the request needs its target zone locked, do it.
 	 */
 	blk_req_zone_write_lock(rq);
-	rq->rq_flags |= RQF_STARTED;
+	/* Set RQF_ELVPRIV to ensure that .finish_request() gets called */
+	rq->rq_flags |= RQF_STARTED | RQF_ELVPRIV;
 	return rq;
 }
 
@@ -539,7 +540,9 @@  static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
  * For zoned block devices, write unlock the target zone of
  * completed write requests. Do this while holding the zone lock
  * spinlock so that the zone is never unlocked while deadline_fifo_request()
- * while deadline_next_request() are executing.
+ * while deadline_next_request() are executing. This function is called twice
+ * for requests that complete in a normal way and once for requests that time
+ * out.
  */
 static void dd_completed_request(struct request *rq)
 {
@@ -757,6 +760,7 @@  static struct elevator_type mq_deadline = {
 		.insert_requests	= dd_insert_requests,
 		.dispatch_request	= dd_dispatch_request,
 		.completed_request	= dd_completed_request,
+		.finish_request		= dd_completed_request,
 		.next_request		= elv_rb_latter_request,
 		.former_request		= elv_rb_former_request,
 		.bio_merge		= dd_bio_merge,