diff mbox series

[1/2] blk-mq: respect io scheduler

Message ID 20190927072431.23901-2-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series blk-mq: two improvements on slow MQ devices | expand

Commit Message

Ming Lei Sept. 27, 2019, 7:24 a.m. UTC
In the real MQ case, the io scheduler may currently be bypassed. Not
only can this hurt performance on some slow MQ devices, it also breaks
zoned devices, which depend on mq-deadline to preserve write order
within a zone.

So don't bypass the io scheduler if one is set up.

This patch roughly doubles sequential write performance on MQ
scsi_debug when mq-deadline is applied.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
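
To make the behavioral change concrete, here is a minimal userspace C
model of the patched decision order. It is a sketch only: the struct
fields and pick_path() are hypothetical stand-ins, the leading plug
fast path is elided, and the real logic lives in blk_mq_make_request().

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical, simplified model of the submission decision after this
 * patch.  All names are made up for illustration.
 */
struct queue_model {
	bool has_elevator;	/* q->elevator != NULL */
	bool has_plug;		/* caller holds a plug */
	bool nomerges;		/* blk_queue_nomerges(q) */
	int  nr_hw_queues;	/* q->nr_hw_queues */
	bool dispatch_busy;	/* data.hctx->dispatch_busy */
};

static const char *pick_path(const struct queue_model *q, bool is_sync)
{
	if (q->has_elevator)		/* new branch: never bypass a scheduler */
		return "sched_insert";
	if (q->has_plug && !q->nomerges)
		return "limited_plug";
	if ((q->nr_hw_queues > 1 && is_sync) || !q->dispatch_busy)
		return "direct_issue";	/* bypass, schedulerless queues only */
	return "sched_insert";
}

int main(void)
{
	/*
	 * A sync write to a zoned device with mq-deadline and several hw
	 * queues: before the patch it could be issued directly, skipping
	 * the scheduler's zone write locking; now it is inserted through
	 * the scheduler.
	 */
	struct queue_model q = {
		.has_elevator = true,
		.nr_hw_queues = 4,
	};

	printf("%s\n", pick_path(&q, true));	/* prints "sched_insert" */
	return 0;
}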

Comments

Javier González Sept. 27, 2019, 10:45 a.m. UTC | #1
> On 27 Sep 2019, at 09.24, Ming Lei <ming.lei@redhat.com> wrote:
> 
> In the real MQ case, the io scheduler may currently be bypassed. Not
> only can this hurt performance on some slow MQ devices, it also breaks
> zoned devices, which depend on mq-deadline to preserve write order
> within a zone.
> 
> So don't bypass the io scheduler if one is set up.
> 
> This patch roughly doubles sequential write performance on MQ
> scsi_debug when mq-deadline is applied.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> block/blk-mq.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 20a49be536b5..d7aed6518e62 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> 		}
> 
> 		blk_add_rq_to_plug(plug, rq);
> +	} else if (q->elevator) {
> +		blk_mq_sched_insert_request(rq, false, true, true);
> 	} else if (plug && !blk_queue_nomerges(q)) {
> 		/*
> 		 * We do limited plugging. If the bio can be merged, do that.
> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
> 					&cookie);
> 		}
> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
> -			!data.hctx->dispatch_busy)) {
> +	} else if ((q->nr_hw_queues > 1 && is_sync) ||
> +			!data.hctx->dispatch_busy) {
> 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
> 	} else {
> 		blk_mq_sched_insert_request(rq, false, true, true);
> --
> 2.20.1


Looks good to me. It fixes a couple of issues we have seen with zoned devices too.

Reviewed-by: Javier González <javier@javigon.com>
Hans Holmberg Sept. 27, 2019, 1:07 p.m. UTC | #2
On Fri, Sep 27, 2019 at 9:25 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> In the real MQ case, the io scheduler may currently be bypassed. Not
> only can this hurt performance on some slow MQ devices, it also breaks
> zoned devices, which depend on mq-deadline to preserve write order
> within a zone.
>
> So don't bypass the io scheduler if one is set up.
>
> This patch roughly doubles sequential write performance on MQ
> scsi_debug when mq-deadline is applied.
>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 20a49be536b5..d7aed6518e62 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>                 }
>
>                 blk_add_rq_to_plug(plug, rq);
> +       } else if (q->elevator) {
> +               blk_mq_sched_insert_request(rq, false, true, true);
>         } else if (plug && !blk_queue_nomerges(q)) {
>                 /*
>                  * We do limited plugging. If the bio can be merged, do that.
> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>                         blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>                                         &cookie);
>                 }
> -       } else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
> -                       !data.hctx->dispatch_busy)) {
> +       } else if ((q->nr_hw_queues > 1 && is_sync) ||
> +                       !data.hctx->dispatch_busy) {
>                 blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>         } else {
>                 blk_mq_sched_insert_request(rq, false, true, true);
> --
> 2.20.1
>

I've verified that the issue I saw with zone locking being skipped due
to scheduler bypass is resolved with this patch.

Thanks,
Hans
Damien Le Moal Sept. 27, 2019, 5:25 p.m. UTC | #3
On 2019/09/27 0:25, Ming Lei wrote:
> In the real MQ case, the io scheduler may currently be bypassed. Not
> only can this hurt performance on some slow MQ devices, it also breaks
> zoned devices, which depend on mq-deadline to preserve write order
> within a zone.
> 
> So don't bypass the io scheduler if one is set up.
> 
> This patch roughly doubles sequential write performance on MQ
> scsi_debug when mq-deadline is applied.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 20a49be536b5..d7aed6518e62 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		}
>  
>  		blk_add_rq_to_plug(plug, rq);
> +	} else if (q->elevator) {
> +		blk_mq_sched_insert_request(rq, false, true, true);
>  	} else if (plug && !blk_queue_nomerges(q)) {
>  		/*
>  		 * We do limited plugging. If the bio can be merged, do that.
> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>  					&cookie);
>  		}
> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
> -			!data.hctx->dispatch_busy)) {
> +	} else if ((q->nr_hw_queues > 1 && is_sync) ||
> +			!data.hctx->dispatch_busy) {
>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>  	} else {
>  		blk_mq_sched_insert_request(rq, false, true, true);
> 

I think this patch should have a Cc: stable@vger.kernel.org
This fixes a problem existing since we added deadline zone write-locking with
commit 5700f69178e9 ("mq-deadline: Introduce zone locking support").

Otherwise:

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Jens Axboe Sept. 27, 2019, 5:32 p.m. UTC | #4
On 9/27/19 7:25 PM, Damien Le Moal wrote:
> On 2019/09/27 0:25, Ming Lei wrote:
>> In the real MQ case, the io scheduler may currently be bypassed. Not
>> only can this hurt performance on some slow MQ devices, it also breaks
>> zoned devices, which depend on mq-deadline to preserve write order
>> within a zone.
>>
>> So don't bypass the io scheduler if one is set up.
>>
>> This patch roughly doubles sequential write performance on MQ
>> scsi_debug when mq-deadline is applied.
>>
>> Cc: Bart Van Assche <bvanassche@acm.org>
>> Cc: Hannes Reinecke <hare@suse.com>
>> Cc: Damien Le Moal <damien.lemoal@wdc.com>
>> Cc: Dave Chinner <dchinner@redhat.com>
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> ---
>>   block/blk-mq.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 20a49be536b5..d7aed6518e62 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>   		}
>>   
>>   		blk_add_rq_to_plug(plug, rq);
>> +	} else if (q->elevator) {
>> +		blk_mq_sched_insert_request(rq, false, true, true);
>>  	} else if (plug && !blk_queue_nomerges(q)) {
>>   		/*
>>   		 * We do limited plugging. If the bio can be merged, do that.
>> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>   			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>>   					&cookie);
>>   		}
>> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
>> -			!data.hctx->dispatch_busy)) {
>> +	} else if ((q->nr_hw_queues > 1 && is_sync) ||
>> +			!data.hctx->dispatch_busy) {
>>   		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>   	} else {
>>   		blk_mq_sched_insert_request(rq, false, true, true);
>>
> 
> I think this patch should have a Cc: stable@vger.kernel.org
> This fixes a problem existing since we added deadline zone write-locking with
> commit 5700f69178e9 ("mq-deadline: Introduce zone locking support").

I'd rather not mark it for stable until it's been in the kernel for some
weeks at least, since we are potentially dealing with a behavioral change
for everyone. We've been burnt by stuff like this in the past.

That said, this patch could be a candidate. Let's revisit in a few weeks.
Damien Le Moal Sept. 27, 2019, 5:33 p.m. UTC | #5
On 2019/09/27 10:32, Jens Axboe wrote:
> On 9/27/19 7:25 PM, Damien Le Moal wrote:
>> On 2019/09/27 0:25, Ming Lei wrote:
>>> In the real MQ case, the io scheduler may currently be bypassed. Not
>>> only can this hurt performance on some slow MQ devices, it also breaks
>>> zoned devices, which depend on mq-deadline to preserve write order
>>> within a zone.
>>>
>>> So don't bypass the io scheduler if one is set up.
>>>
>>> This patch roughly doubles sequential write performance on MQ
>>> scsi_debug when mq-deadline is applied.
>>>
>>> Cc: Bart Van Assche <bvanassche@acm.org>
>>> Cc: Hannes Reinecke <hare@suse.com>
>>> Cc: Damien Le Moal <damien.lemoal@wdc.com>
>>> Cc: Dave Chinner <dchinner@redhat.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>   block/blk-mq.c | 6 ++++--
>>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 20a49be536b5..d7aed6518e62 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>>   		}
>>>   
>>>   		blk_add_rq_to_plug(plug, rq);
>>> +	} else if (q->elevator) {
>>> +		blk_mq_sched_insert_request(rq, false, true, true);
>>>  	} else if (plug && !blk_queue_nomerges(q)) {
>>>   		/*
>>>   		 * We do limited plugging. If the bio can be merged, do that.
>>> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>>>   			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>>>   					&cookie);
>>>   		}
>>> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
>>> -			!data.hctx->dispatch_busy)) {
>>> +	} else if ((q->nr_hw_queues > 1 && is_sync) ||
>>> +			!data.hctx->dispatch_busy) {
>>>   		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>>>   	} else {
>>>   		blk_mq_sched_insert_request(rq, false, true, true);
>>>
>>
>> I think this patch should have a Cc: stable@vger.kernel.org
>> This fixes a problem existing since we added deadline zone write-locking with
>> commit 5700f69178e9 ("mq-deadline: Introduce zone locking support").
> 
> I'd rather not mark it for stable until it's been in the kernel for some
> weeks at least, since we are potentially dealing with a behavioral change
> for everyone. We've been burnt by stuff like this in the past.
> 
> That said, this patch could be a candidate. Let's revisit in a few weeks.
> 

OK. Thanks!
André Almeida Oct. 24, 2019, 7:57 p.m. UTC | #6
Hello Ming Lei,

On 9/27/19 4:24 AM, Ming Lei wrote:
> In the real MQ case, the io scheduler may currently be bypassed. Not
> only can this hurt performance on some slow MQ devices, it also breaks
> zoned devices, which depend on mq-deadline to preserve write order
> within a zone.
> 
> So don't bypass the io scheduler if one is set up.
> 
> This patch roughly doubles sequential write performance on MQ
> scsi_debug when mq-deadline is applied.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 20a49be536b5..d7aed6518e62 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2003,6 +2003,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		}
>  
>  		blk_add_rq_to_plug(plug, rq);
> +	} else if (q->elevator) {
> +		blk_mq_sched_insert_request(rq, false, true, true);
>  	} else if (plug && !blk_queue_nomerges(q)) {
>  		/*
>  		 * We do limited plugging. If the bio can be merged, do that.
> @@ -2026,8 +2028,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
>  					&cookie);
>  		}
> -	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
> -			!data.hctx->dispatch_busy)) {
> +	} else if ((q->nr_hw_queues > 1 && is_sync) ||
> +			!data.hctx->dispatch_busy) {
>  		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
>  	} else {
>  		blk_mq_sched_insert_request(rq, false, true, true);
> 

This patch introduces identical branches: both "else if (q->elevator)"
and the final "else" do the same thing. Wouldn't it be enough to just
remove the "!q->elevator" condition without adding a new if?

Thanks,
	André
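
As one way to probe the question above, here is a toy userspace
comparison of the patched layout against the suggested consolidation.
It is a sketch with hypothetical names; the plug branches are elided
and the real code is in blk_mq_make_request().

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model, hypothetical names: compare the patched branch order
 * against simply dropping !q->elevator without the new early branch.
 */
static const char *patched(bool elevator, int nr_hw_queues,
			   bool is_sync, bool dispatch_busy)
{
	if (elevator)		/* the new "else if (q->elevator)" branch */
		return "sched_insert";
	if ((nr_hw_queues > 1 && is_sync) || !dispatch_busy)
		return "direct_issue";
	return "sched_insert";
}

static const char *no_new_branch(bool elevator, int nr_hw_queues,
				 bool is_sync, bool dispatch_busy)
{
	(void)elevator;		/* "!q->elevator" removed, nothing added */
	if ((nr_hw_queues > 1 && is_sync) || !dispatch_busy)
		return "direct_issue";
	return "sched_insert";
}

int main(void)
{
	/* sync request, 4 hw queues, a scheduler attached, idle hctx */
	printf("patched:       %s\n", patched(true, 4, true, false));
	printf("no new branch: %s\n", no_new_branch(true, 4, true, false));
	return 0;
}

Running it shows the two layouts picking different paths for a sync
request on a multi-queue device with a scheduler attached, which is
the case the patch targets.
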
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 20a49be536b5..d7aed6518e62 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2003,6 +2003,8 @@  static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 		}
 
 		blk_add_rq_to_plug(plug, rq);
+	} else if (q->elevator) {
+		blk_mq_sched_insert_request(rq, false, true, true);
 	} else if (plug && !blk_queue_nomerges(q)) {
 		/*
 		 * We do limited plugging. If the bio can be merged, do that.
@@ -2026,8 +2028,8 @@  static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 			blk_mq_try_issue_directly(data.hctx, same_queue_rq,
 					&cookie);
 		}
-	} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
-			!data.hctx->dispatch_busy)) {
+	} else if ((q->nr_hw_queues > 1 && is_sync) ||
+			!data.hctx->dispatch_busy) {
 		blk_mq_try_issue_directly(data.hctx, rq, &cookie);
 	} else {
 		blk_mq_sched_insert_request(rq, false, true, true);