[v5,11/11] block: mq-deadline: Fix handling of at-head zoned writes

Message ID: 20230516223323.1383342-12-bvanassche@acm.org
State: New, archived
Series: mq-deadline: Improve support for zoned block devices

Commit Message

Bart Van Assche May 16, 2023, 10:33 p.m. UTC
Before dispatching a zoned write from the FIFO list, check whether there
are any zoned writes in the RB-tree with a lower LBA for the same zone.
This patch ensures that zoned writes happen in order even if at_head is
set for some writes for a zone and not for others.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
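
The check that the patch adds can be read in isolation as follows. This is a
sketch rather than kernel code: dd_zoned_write_to_dispatch() is a hypothetical
helper name, and the sketch assumes that deadline_from_pos() returns the first
write queued in the RB-tree for the zone containing the given position.

/*
 * Sketch, not kernel code: given FIFO candidate @fifo_rq, pick the zoned
 * write that should be dispatched first.
 */
static struct request *
dd_zoned_write_to_dispatch(struct dd_per_prio *per_prio,
			   enum dd_data_dir data_dir, struct request *fifo_rq)
{
	struct request *rb_rq =
		deadline_from_pos(per_prio, data_dir, blk_rq_pos(fifo_rq));

	/*
	 * If an at-head insertion put @fifo_rq ahead of an older write for
	 * the same zone, dispatch the lower-LBA request first so the zone
	 * still sees its writes in write-pointer order.
	 */
	if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(fifo_rq))
		return rb_rq;
	return fifo_rq;
}

Note the switch to list_for_each_entry_safe() in the patch: because rq may be
reassigned to the RB-tree request inside the loop body, the iterator can no
longer derive the next FIFO entry from rq->queuelist, so the next pointer must
be cached up front.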

Comments

Damien Le Moal May 17, 2023, 1:24 a.m. UTC | #1
On 5/17/23 07:33, Bart Van Assche wrote:
> Before dispatching a zoned write from the FIFO list, check whether there
> are any zoned writes in the RB-tree with a lower LBA for the same zone.
> This patch ensures that zoned writes happen in order even if at_head is
> set for some writes for a zone and not for others.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Hannes Reinecke May 17, 2023, 7:47 a.m. UTC | #2
On 5/17/23 00:33, Bart Van Assche wrote:
> [...]
> ---
>   block/mq-deadline.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 059727fa4b98..67989f8d29a5 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -346,7 +346,7 @@ static struct request *
>   deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>   		      enum dd_data_dir data_dir)
>   {
> -	struct request *rq;
> +	struct request *rq, *rb_rq, *next;
>   	unsigned long flags;
>   
>   	if (list_empty(&per_prio->fifo_list[data_dir]))
> @@ -364,7 +364,12 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>   	 * zones and these zones are unlocked.
>   	 */
>   	spin_lock_irqsave(&dd->zone_lock, flags);
> -	list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
> +	list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
> +				 queuelist) {
> +		/* Check whether a prior request exists for the same zone. */
> +		rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
> +		if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
> +			rq = rb_rq;
>   		if (blk_req_can_dispatch_to_zone(rq) &&
>   		    (blk_queue_nonrot(rq->q) ||
>   		     !deadline_is_seq_write(dd, rq)))

Similar concern here; we'll have to traverse the entire tree here.
But if that's of no concern...

Cheers,

Hannes
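
A note on the cost being discussed: if deadline_from_pos() performs a standard
RB-tree descent, each FIFO entry costs one O(log n) lookup rather than a walk
of the whole tree. A sketch of such a position lookup, under that assumption
(first_rq_at_or_after() is a hypothetical name, not the kernel helper):

/*
 * Sketch: find the first request with blk_rq_pos() >= @pos by a plain
 * RB-tree descent; visits O(log n) nodes per call.
 */
static struct request *first_rq_at_or_after(struct rb_root *root, sector_t pos)
{
	struct rb_node *node = root->rb_node;
	struct request *res = NULL;

	while (node) {
		struct request *rq = rb_entry(node, struct request, rb_node);

		if (blk_rq_pos(rq) >= pos) {
			res = rq;	/* candidate; look for an earlier one */
			node = node->rb_left;
		} else {
			node = node->rb_right;
		}
	}
	return res;
}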
Damien Le Moal May 17, 2023, 7:53 a.m. UTC | #3
On 5/17/23 16:47, Hannes Reinecke wrote:
> On 5/17/23 00:33, Bart Van Assche wrote:
>> [...]
> 
> Similar concern here; we'll have to traverse the entire tree here.
> But if that's of no concern...

Should be fine for HDDs. Not so sure about much faster UFS devices.
And for NVMe ZNS, using a scheduler in itself already halves the max perf you
can get...

> 
> Cheers,
> 
> Hannes
Bart Van Assche May 17, 2023, 5:13 p.m. UTC | #4
On 5/17/23 00:47, Hannes Reinecke wrote:
> Similar concern here; we'll have to traverse the entire tree here.
> But if that's of no concern...

Hi Hannes,

If I measure IOPS for a null_blk device instance, I see the following
results for a single CPU core:
* No I/O scheduler:                       690 K IOPS.
* mq-deadline, without this patch series: 147 K IOPS.
* mq-deadline, with this patch series:    146 K IOPS.

In other words, the performance impact of this patch series on the
mq-deadline scheduler is small: about 0.7% (147 K -> 146 K IOPS).

Thanks,

Bart.

Patch

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 059727fa4b98..67989f8d29a5 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -346,7 +346,7 @@ static struct request *
 deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		      enum dd_data_dir data_dir)
 {
-	struct request *rq;
+	struct request *rq, *rb_rq, *next;
 	unsigned long flags;
 
 	if (list_empty(&per_prio->fifo_list[data_dir]))
@@ -364,7 +364,12 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	 * zones and these zones are unlocked.
 	 */
 	spin_lock_irqsave(&dd->zone_lock, flags);
-	list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
+	list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
+				 queuelist) {
+		/* Check whether a prior request exists for the same zone. */
+		rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
+		if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
+			rq = rb_rq;
 		if (blk_req_can_dispatch_to_zone(rq) &&
 		    (blk_queue_nonrot(rq->q) ||
 		     !deadline_is_seq_write(dd, rq)))