Message ID | 20230516223323.1383342-12-bvanassche@acm.org (mailing list archive)
---|---
State | New, archived
Series | mq-deadline: Improve support for zoned block devices
On 5/17/23 07:33, Bart Van Assche wrote:
> Before dispatching a zoned write from the FIFO list, check whether there
> are any zoned writes in the RB-tree with a lower LBA for the same zone.
> This patch ensures that zoned writes happen in order even if at_head is
> set for some writes for a zone and not for others.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
On 5/17/23 00:33, Bart Van Assche wrote:
> Before dispatching a zoned write from the FIFO list, check whether there
> are any zoned writes in the RB-tree with a lower LBA for the same zone.
> This patch ensures that zoned writes happen in order even if at_head is
> set for some writes for a zone and not for others.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index 059727fa4b98..67989f8d29a5 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -346,7 +346,7 @@ static struct request *
>  deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  		      enum dd_data_dir data_dir)
>  {
> -	struct request *rq;
> +	struct request *rq, *rb_rq, *next;
>  	unsigned long flags;
>
>  	if (list_empty(&per_prio->fifo_list[data_dir]))
> @@ -364,7 +364,12 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
>  	 * zones and these zones are unlocked.
>  	 */
>  	spin_lock_irqsave(&dd->zone_lock, flags);
> -	list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
> +	list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
> +				 queuelist) {
> +		/* Check whether a prior request exists for the same zone. */
> +		rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
> +		if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
> +			rq = rb_rq;
>  		if (blk_req_can_dispatch_to_zone(rq) &&
>  		    (blk_queue_nonrot(rq->q) ||
>  		     !deadline_is_seq_write(dd, rq)))

Similar concern here; we'll have to traverse the entire tree here.
But if that's of no concern...

Cheers,

Hannes
On 5/17/23 16:47, Hannes Reinecke wrote:
> On 5/17/23 00:33, Bart Van Assche wrote:
>> Before dispatching a zoned write from the FIFO list, check whether there
>> are any zoned writes in the RB-tree with a lower LBA for the same zone.
>> This patch ensures that zoned writes happen in order even if at_head is
>> set for some writes for a zone and not for others.
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
>> Cc: Ming Lei <ming.lei@redhat.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> [...]
>
> Similar concern here; we'll have to traverse the entire tree here.
> But if that's of no concern...

Should be fine for HDDs. Not so sure about much faster UFS devices. And
for NVMe ZNS, using a scheduler in itself already halves the max perf
you can get...

>
> Cheers,
>
> Hannes
On 5/17/23 00:47, Hannes Reinecke wrote:
> Similar concern here; we'll have to traverse the entire tree here.
> But if that's of no concern...

Hi Hannes,

If I measure IOPS for a null_blk device instance, then I see the
following performance results for a single CPU core:
* No I/O scheduler: 690 K IOPS.
* mq-deadline, without this patch series: 147 K IOPS.
* mq-deadline, with this patch series: 146 K IOPS.

In other words, the performance impact of this patch series on the
mq-deadline scheduler is small.

Thanks,

Bart.
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 059727fa4b98..67989f8d29a5 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -346,7 +346,7 @@ static struct request *
 deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		      enum dd_data_dir data_dir)
 {
-	struct request *rq;
+	struct request *rq, *rb_rq, *next;
 	unsigned long flags;
 
 	if (list_empty(&per_prio->fifo_list[data_dir]))
@@ -364,7 +364,12 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	 * zones and these zones are unlocked.
 	 */
 	spin_lock_irqsave(&dd->zone_lock, flags);
-	list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
+	list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
+				 queuelist) {
+		/* Check whether a prior request exists for the same zone. */
+		rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
+		if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
+			rq = rb_rq;
 		if (blk_req_can_dispatch_to_zone(rq) &&
 		    (blk_queue_nonrot(rq->q) ||
 		     !deadline_is_seq_write(dd, rq)))