
block, bfq: dispatch requests to prevent queue stalling after request completion

Message ID 1499781495-44090-1-git-send-email-houtao1@huawei.com (mailing list archive)
State New, archived

Commit Message

Hou Tao July 11, 2017, 1:58 p.m. UTC
There are mq devices (e.g., virtio-blk, nbd and loopback) which don't
invoke blk_mq_run_hw_queues() after the completion of a request.
If bfq is enabled on these devices and the slice_idle attribute is
set to zero or the strict_guarantees attribute is set to one, it is
possible that after a request completion the remaining requests of a
busy bfq queue will stall in the bfq scheduler until a new request
arrives.

To fix this scheduler latency problem, check on each request
completion whether all issued requests have completed, and dispatch
more requests to the driver if no request is in the driver.

Signed-off-by: Hou Tao <houtao1@huawei.com>

The problem can be reproduced by running the following script
on a virtio-blk device with nr_hw_queues set to 1:

#!/bin/sh

dev=vdb
# mount point for dev
mp=/tmp/mnt
cd $mp

job=strict.job
cat <<EOF > $job
[global]
direct=1
bs=4k
size=256M
rw=write
ioengine=libaio
iodepth=128
runtime=5
time_based

[1]
filename=1.data

[2]
new_group
filename=2.data
EOF

echo bfq > /sys/block/$dev/queue/scheduler
echo 1 > /sys/block/$dev/queue/iosched/strict_guarantees
fio $job
---
 block/bfq-iosched.c | 3 +++
 1 file changed, 3 insertions(+)
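
For reference, bfq_schedule_dispatch(), which the added hunk calls, is
approximately the following in the bfq-iosched.c of this period (a
sketch of the surrounding source, not part of the diff):

static void bfq_schedule_dispatch(struct bfq_data *bfqd)
{
	/* Only kick the hardware queues if bfq still holds queued
	 * requests; otherwise there is nothing to dispatch. */
	if (bfqd->queued != 0) {
		bfq_log(bfqd, "schedule dispatch");
		blk_mq_run_hw_queues(bfqd->queue, true);
	}
}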

Comments

Paolo Valente July 12, 2017, 9:41 a.m. UTC | #1
> Il giorno 11 lug 2017, alle ore 15:58, Hou Tao <houtao1@huawei.com> ha scritto:
> 
> There are mq devices (eg., virtio-blk, nbd and loopback) which don't
> invoke blk_mq_run_hw_queues() after the completion of a request.
> If bfq is enabled on these devices and the slice_idle attribute or
> strict_guarantees attribute is set as zero,

I guess you meant slice_idle > 0 or strict_guarantees equal to 1 here.

> it is possible that
> after a request completion the remaining requests of busy bfq queue
> will stalled in the bfq schedule until a new request arrives.
> 
> To fix the scheduler latency problem, we need to check whether or not
> all issued requests have completed and dispatch more requests to driver
> if there is no request in driver.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> 

Thanks for this fix.  My only (possible) concern is whether there
would be some more coherent and possibly more efficient solution to
this problem, outside BFQ.  For example, would it make sense to call
blk_mq_run_hw_queues(), after a request completion, from the offended
devices too?  This would make the behavior of these devices coherent
with that of the other devices.  Unfortunately I have no sound answer
to such a question.  Apart from this concern:

Reviewed-by: Paolo Valente <paolo.valente@linaro.org>

> The problem can be reproduced by running the following script
> on a virtio-blk device with nr_hw_queues as 1:
> 
> #!/bin/sh
> 
> dev=vdb
> # mount point for dev
> mp=/tmp/mnt
> cd $mp
> 
> job=strict.job
> cat <<EOF > $job
> [global]
> direct=1
> bs=4k
> size=256M
> rw=write
> ioengine=libaio
> iodepth=128
> runtime=5
> time_based
> 
> [1]
> filename=1.data
> 
> [2]
> new_group
> filename=2.data
> EOF
> 
> echo bfq > /sys/block/$dev/queue/scheduler
> echo 1 > /sys/block/$dev/queue/iosched/strict_guarantees
> fio $job
> ---
> block/bfq-iosched.c | 3 +++
> 1 file changed, 3 insertions(+)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 12bbc6b..94cd682 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -4293,6 +4293,9 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
> 			bfq_bfqq_expire(bfqd, bfqq, false,
> 					BFQQE_NO_MORE_REQUESTS);
> 	}
> +
> +	if (!bfqd->rq_in_driver)
> +		bfq_schedule_dispatch(bfqd);
> }
> 
> static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
> -- 
> 2.5.0
>
Hou Tao July 12, 2017, 10:36 a.m. UTC | #2
On 2017/7/12 17:41, Paolo Valente wrote:
> 
>> Il giorno 11 lug 2017, alle ore 15:58, Hou Tao <houtao1@huawei.com> ha scritto:
>>
>> There are mq devices (eg., virtio-blk, nbd and loopback) which don't
>> invoke blk_mq_run_hw_queues() after the completion of a request.
>> If bfq is enabled on these devices and the slice_idle attribute or
>> strict_guarantees attribute is set as zero,
> 
> I guess you meant slice_idle > 0 or strict_guarantees equal to 1 here.
Sorry for the typo, I meant slice_idle = 0 or strict_guarantees = 1.
Setting slice_idle to 0 alone can also trigger the latency problem.
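
For example, reusing $dev and $job from the reproducer script quoted
below, the following variant should also trigger the stall (assuming
bfq exposes the slice_idle attribute under the same iosched directory):

echo bfq > /sys/block/$dev/queue/scheduler
echo 0 > /sys/block/$dev/queue/iosched/slice_idle
fio $job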

>> it is possible that
>> after a request completion the remaining requests of busy bfq queue
>> will stalled in the bfq schedule until a new request arrives.
>>
>> To fix the scheduler latency problem, we need to check whether or not
>> all issued requests have completed and dispatch more requests to driver
>> if there is no request in driver.
>>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>
> 
> Thanks for this fix.  My only (possible) concern is whether there
> would be some more coherent and possibly more efficient solution to
> this problem, outside BFQ.  For example, would it make sense to call
> blk_mq_run_hw_queues(), after a request completion, from the offended
> devices too?  This would make the behavior of these devices coherent
> with that of the other devices.  Unfortunately I have no sound answer
> to such a question.  Apart from this concern:
> 
> Reviewed-by: Paolo Valente <paolo.valente@linaro.org>
Thanks for your review.

The inconsistencies between the different mq drivers also confuse me.
I don't know why scsi-mq dispatches the remaining requests after each
request completion while virtio-blk does not.

I'm not sure whether fixing this in the MQ core is a better idea, but I have a proposal.
If the BLK_MQ_S_SCHED_RESTART flag has been set, the MQ core would invoke blk_mq_run_hw_queues()
after the request completion. The BLK_MQ_S_SCHED_RESTART flag is set by
blk_mq_sched_dispatch_requests() when it finds residual requests in its
dispatch list. We could let .dispatch_request() return a flag that sets the BLK_MQ_S_SCHED_RESTART
flag when the underlying scheduler needs it, so that after a request completion the MQ core
would pull the remaining requests from BFQ. A rough sketch of the idea follows.
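
A hypothetical sketch of the proposal; blk_mq_mark_sched_restart() and
blk_mq_on_request_completion() are invented names standing in for
wherever the MQ core would hook dispatch and completion, while
BLK_MQ_S_SCHED_RESTART, struct blk_mq_hw_ctx and blk_mq_run_hw_queues()
are real names from this era's blk-mq:

/* Dispatch side: after .dispatch_request() returns, mark the hctx
 * for restart when the scheduler reports residual queued requests
 * (signalled here via a returned flag). */
static void blk_mq_mark_sched_restart(struct blk_mq_hw_ctx *hctx,
				      bool sched_has_work)
{
	if (sched_has_work)
		set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
}

/* Completion side: if a restart was requested, re-run the queues so
 * the core pulls the remaining requests from the scheduler. */
static void blk_mq_on_request_completion(struct blk_mq_hw_ctx *hctx)
{
	if (test_and_clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
		blk_mq_run_hw_queues(hctx->queue, true);
}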

Regards,
Tao

> 
>> The problem can be reproduced by running the following script
>> on a virtio-blk device with nr_hw_queues as 1:
>>
>> #!/bin/sh
>>
>> dev=vdb
>> # mount point for dev
>> mp=/tmp/mnt
>> cd $mp
>>
>> job=strict.job
>> cat <<EOF > $job
>> [global]
>> direct=1
>> bs=4k
>> size=256M
>> rw=write
>> ioengine=libaio
>> iodepth=128
>> runtime=5
>> time_based
>>
>> [1]
>> filename=1.data
>>
>> [2]
>> new_group
>> filename=2.data
>> EOF
>>
>> echo bfq > /sys/block/$dev/queue/scheduler
>> echo 1 > /sys/block/$dev/queue/iosched/strict_guarantees
>> fio $job
>> ---
>> block/bfq-iosched.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 12bbc6b..94cd682 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -4293,6 +4293,9 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
>> 			bfq_bfqq_expire(bfqd, bfqq, false,
>> 					BFQQE_NO_MORE_REQUESTS);
>> 	}
>> +
>> +	if (!bfqd->rq_in_driver)
>> +		bfq_schedule_dispatch(bfqd);
>> }
>>
>> static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
>> -- 
>> 2.5.0
>>
> 
Jens Axboe July 12, 2017, 2:22 p.m. UTC | #3
On 07/12/2017 03:41 AM, Paolo Valente wrote:
> 
>> Il giorno 11 lug 2017, alle ore 15:58, Hou Tao <houtao1@huawei.com> ha scritto:
>>
>> There are mq devices (eg., virtio-blk, nbd and loopback) which don't
>> invoke blk_mq_run_hw_queues() after the completion of a request.
>> If bfq is enabled on these devices and the slice_idle attribute or
>> strict_guarantees attribute is set as zero,
> 
> I guess you meant slice_idle > 0 or strict_guarantees equal to 1 here.
> 
>> it is possible that
>> after a request completion the remaining requests of busy bfq queue
>> will stalled in the bfq schedule until a new request arrives.
>>
>> To fix the scheduler latency problem, we need to check whether or not
>> all issued requests have completed and dispatch more requests to driver
>> if there is no request in driver.
>>
>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>
> 
> Thanks for this fix.  My only (possible) concern is whether there
> would be some more coherent and possibly more efficient solution to
> this problem, outside BFQ.  For example, would it make sense to call
> blk_mq_run_hw_queues(), after a request completion, from the offended
> devices too?  This would make the behavior of these devices coherent
> with that of the other devices.  Unfortunately I have no sound answer
> to such a question.  Apart from this concern:

No, that's wrong. If a scheduler decides to stop dispatching
requests before the list has been drained, it is the scheduler's
responsibility to restart dispatch. Some drivers need to restart
the queue for other reasons; don't confuse the two.
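
For contrast, the driver-side queue restarts mentioned here look
roughly like this in the virtio-blk of this period (an approximate
sketch, not the exact source): the driver restarts hardware queues
that it stopped itself when the vring ran out of descriptors, a
driver reason independent of any scheduler dispatch decision.

static void virtblk_done(struct virtqueue *vq)
{
	struct virtio_blk *vblk = vq->vdev->priv;
	bool req_done = false;

	/* ... end each completed request, setting req_done ... */

	/* In case the queue was stopped waiting for more buffers. */
	if (req_done)
		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
}
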
Paolo Valente July 12, 2017, 2:45 p.m. UTC | #4
> Il giorno 12 lug 2017, alle ore 16:22, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 07/12/2017 03:41 AM, Paolo Valente wrote:
>> 
>>> Il giorno 11 lug 2017, alle ore 15:58, Hou Tao <houtao1@huawei.com> ha scritto:
>>> 
>>> There are mq devices (eg., virtio-blk, nbd and loopback) which don't
>>> invoke blk_mq_run_hw_queues() after the completion of a request.
>>> If bfq is enabled on these devices and the slice_idle attribute or
>>> strict_guarantees attribute is set as zero,
>> 
>> I guess you meant slice_idle > 0 or strict_guarantees equal to 1 here.
>> 
>>> it is possible that
>>> after a request completion the remaining requests of busy bfq queue
>>> will stalled in the bfq schedule until a new request arrives.
>>> 
>>> To fix the scheduler latency problem, we need to check whether or not
>>> all issued requests have completed and dispatch more requests to driver
>>> if there is no request in driver.
>>> 
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>> 
>> 
>> Thanks for this fix.  My only (possible) concern is whether there
>> would be some more coherent and possibly more efficient solution to
>> this problem, outside BFQ.  For example, would it make sense to call
>> blk_mq_run_hw_queues(), after a request completion, from the offended
>> devices too?  This would make the behavior of these devices coherent
>> with that of the other devices.  Unfortunately I have no sound answer
>> to such a question.  Apart from this concern:
> 
> No, that's wrong. If a scheduler decides to stop dispatching
> requests before the list has been drained, it is the scheduler's
> responsibility to restart dispatch. Some drivers need to restart
> the queue for other reasons; don't confuse the two.
> 

Thanks for clearing up this point.  As I already wrote, the patch is
then ok for me.

Thanks,
Paolo

> -- 
> Jens Axboe

Patch

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 12bbc6b..94cd682 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4293,6 +4293,9 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 			bfq_bfqq_expire(bfqd, bfqq, false,
 					BFQQE_NO_MORE_REQUESTS);
 	}
+
+	if (!bfqd->rq_in_driver)
+		bfq_schedule_dispatch(bfqd);
 }
 
 static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
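
For context, the hunk above sits at the end of bfq_completed_request(),
which decrements the in-flight counter earlier in the same function.
Approximately (a sketch, not the exact source):

static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
{
	bfqd->rq_in_driver--;	/* one fewer request in the driver */
	bfqq->dispatched--;

	/* ... budget/idling bookkeeping, possible bfq_bfqq_expire() ... */

	/* New hunk: if that was the last in-flight request, kick the
	 * queues so the device does not sit idle while bfq still has
	 * queued requests. */
	if (!bfqd->rq_in_driver)
		bfq_schedule_dispatch(bfqd);
}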