[RFC,V2,00/13] block: support bio based io polling

Message ID 20210318164827.1481133-1-ming.lei@redhat.com (mailing list archive)

Message

Ming Lei March 18, 2021, 4:48 p.m. UTC
Hi,

Add a per-task io poll context that holds the HIPRI blk-mq/underlying bios
queued from a bio-based driver's io submission context, and reuse one bio
padding field to store the 'cookie' returned from submit_bio() for these
bios. Also explicitly end these bios in poll context by adding two
new bio flags.

This way, we no longer need to poll all underlying hw queues, as is
done in Jeffle's patches. Instead, we can poll just the hw queues
in which HIPRI IO is queued.
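
To illustrate the idea, here is a minimal userspace sketch, not code from
this series. It assumes a made-up cookie encoding (hw queue index in the
upper bits, tag in the lower 16 bits) and hypothetical sketch_* names, and
derives the set of hw queues worth polling from the cookies saved in the
poll context:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for this sketch, not the kernel's definitions. */
#define SKETCH_NR_HW_QUEUES 8u
#define SKETCH_QUEUE_SHIFT  16u

typedef uint32_t sketch_cookie_t;

/* Pack a hw queue index and a tag into one cookie (assumed layout). */
static sketch_cookie_t sketch_make_cookie(unsigned int hwq, unsigned int tag)
{
	return ((sketch_cookie_t)hwq << SKETCH_QUEUE_SHIFT) | (tag & 0xffffu);
}

/* Extract the hw queue index from a cookie. */
static unsigned int sketch_cookie_to_hwq(sketch_cookie_t cookie)
{
	return cookie >> SKETCH_QUEUE_SHIFT;
}

/*
 * Return a bitmask with one bit set for each hw queue that has at least
 * one saved cookie. Only those queues need polling; the rest are skipped.
 */
static uint32_t sketch_queues_to_poll(const sketch_cookie_t *cookies,
				      size_t nr)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < nr; i++) {
		unsigned int hwq = sketch_cookie_to_hwq(cookies[i]);

		assert(hwq < SKETCH_NR_HW_QUEUES);
		mask |= UINT32_C(1) << hwq;
	}
	return mask;
}
```

With cookies saved for hw queues 2, 2 and 5, only bits 2 and 5 end up set
in the mask, so the other six queues would not be touched by the poller.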

Usually io submission and io poll share the same context, so the added io
poll context data behaves much like a stack variable, and the cost of
saving bios is low.

Any comments are welcome.

V2:
	- address the queue depth scalability issue reported by Jeffle by
	using a bio group list. Reuse .bi_end_io to link bios which share the
	same .bi_end_io, and support 32 such groups in the submit queue. This
	solves the scalability issue caused by kfifo. Before a bio is really
	ended, .bi_end_io is recovered from the group head.
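
The grouping scheme can be sketched in userspace roughly as follows. This
is only an illustration under assumptions, not the code in the patches:
every sketch_* name is hypothetical, the bio is reduced to a union that
overlays the callback with a link pointer, and the group keeps a separate
pointer to its member list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_GROUPS 32u	/* the cover letter mentions 32 groups */

struct sketch_bio;
typedef void (*sketch_end_io_t)(struct sketch_bio *bio);

struct sketch_bio {
	union {
		sketch_end_io_t end_io;	 /* real callback (head, or restored) */
		struct sketch_bio *next; /* link while queued as a member */
	} u;
	int ended;
};

struct sketch_group {
	struct sketch_bio *head;	/* keeps the shared end_io */
	struct sketch_bio *members;	/* linked through u.next */
};

struct sketch_submit_queue {
	struct sketch_group groups[SKETCH_NR_GROUPS];
	unsigned int nr_groups;
};

/* Two sample callbacks so that two distinct groups can be formed. */
static void sketch_end_io_a(struct sketch_bio *bio) { bio->ended = 1; }
static void sketch_end_io_b(struct sketch_bio *bio) { bio->ended = 2; }

/* Queue a bio; returns false when all groups are in use. */
static bool sketch_sq_add(struct sketch_submit_queue *sq,
			  struct sketch_bio *bio)
{
	for (unsigned int i = 0; i < sq->nr_groups; i++) {
		struct sketch_group *g = &sq->groups[i];

		if (g->head->u.end_io == bio->u.end_io) {
			/* Same callback: overwrite it with the link. */
			bio->u.next = g->members;
			g->members = bio;
			return true;
		}
	}
	if (sq->nr_groups == SKETCH_NR_GROUPS)
		return false;
	sq->groups[sq->nr_groups].head = bio;
	sq->groups[sq->nr_groups].members = NULL;
	sq->nr_groups++;
	return true;
}

/* End every queued bio; returns how many were ended. */
static unsigned int sketch_sq_end_all(struct sketch_submit_queue *sq)
{
	unsigned int ended = 0;

	for (unsigned int i = 0; i < sq->nr_groups; i++) {
		struct sketch_group *g = &sq->groups[i];
		sketch_end_io_t end_io = g->head->u.end_io;
		struct sketch_bio *bio = g->members;

		while (bio) {
			struct sketch_bio *next = bio->u.next;

			bio->u.end_io = end_io;	/* recover from group head */
			bio->u.end_io(bio);
			ended++;
			bio = next;
		}
		g->head->u.end_io(g->head);
		ended++;
	}
	sq->nr_groups = 0;
	return ended;
}
```

The point of the layout is that a queued bio needs no extra field for the
link: the callback slot doubles as the next pointer, and the lookup cost
is bounded by the number of groups rather than by the io depth.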


Jeffle Xu (4):
  block/mq: extract one helper function polling hw queue
  block: add queue_to_disk() to get gendisk from request_queue
  block: add poll_capable method to support bio-based IO polling
  dm: support IO polling for bio-based dm device

Ming Lei (9):
  block: add helper of blk_queue_poll
  block: add one helper to free io_context
  block: add helper of blk_create_io_context
  block: create io poll context for submission and poll task
  block: add req flag of REQ_TAG
  block: add new field into 'struct bvec_iter'
  block: prepare for supporting bio_list via other link
  block: use per-task poll context to implement bio based io poll
  blk-mq: limit hw queues to be polled in each blk_poll()

 block/bio.c                   |   5 +
 block/blk-core.c              | 248 ++++++++++++++++++++++++++++++++--
 block/blk-ioc.c               |  12 +-
 block/blk-mq.c                | 232 ++++++++++++++++++++++++++++++-
 block/blk-sysfs.c             |  14 +-
 block/blk.h                   |  55 ++++++++
 drivers/md/dm-table.c         |  24 ++++
 drivers/md/dm.c               |  14 ++
 drivers/nvme/host/core.c      |   2 +-
 include/linux/bio.h           | 132 +++++++++---------
 include/linux/blk_types.h     |  20 ++-
 include/linux/blkdev.h        |   4 +
 include/linux/bvec.h          |   9 ++
 include/linux/device-mapper.h |   1 +
 include/linux/iocontext.h     |   2 +
 include/trace/events/kyber.h  |   6 +-
 16 files changed, 686 insertions(+), 94 deletions(-)

Comments

Jingbo Xu March 19, 2021, 5:50 a.m. UTC | #1
On 3/19/21 12:48 AM, Ming Lei wrote:
> Hi,
> 
> Add a per-task io poll context that holds the HIPRI blk-mq/underlying bios
> queued from a bio-based driver's io submission context, and reuse one bio
> padding field to store the 'cookie' returned from submit_bio() for these
> bios. Also explicitly end these bios in poll context by adding two
> new bio flags.
> 
> This way, we no longer need to poll all underlying hw queues, as is
> done in Jeffle's patches. Instead, we can poll just the hw queues
> in which HIPRI IO is queued.
> 
> Usually io submission and io poll share the same context, so the added io
> poll context data behaves much like a stack variable, and the cost of
> saving bios is low.
> 
> Any comments are welcome.
> 
> V2:
> 	- address the queue depth scalability issue reported by Jeffle by
> 	using a bio group list. Reuse .bi_end_io to link bios which share the
> 	same .bi_end_io, and support 32 such groups in the submit queue. This
> 	solves the scalability issue caused by kfifo. Before a bio is really
> 	ended, .bi_end_io is recovered from the group head.

I have retested this latest version, and at first glance the scaling
issue seems to be fixed.

Test results with the latest version:
3-threads  dm-stripe-3 targets  (12k randread IOPS, unit K)
317 -> 409 (iodepth=128)

Compared to the test results of v1:
3-threads  dm-stripe-3 targets  (12k randread IOPS, unit K)
313 -> 349 (iodepth=128, kfifo queue depth=128)
313 -> 409 (iodepth=32, kfifo queue depth=128)
314 -> 409 (iodepth=128, kfifo queue depth=512)

Mike Snitzer March 19, 2021, 6:45 p.m. UTC | #2
On Thu, Mar 18 2021 at 12:48pm -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> Hi,
> 
> Add a per-task io poll context that holds the HIPRI blk-mq/underlying bios
> queued from a bio-based driver's io submission context, and reuse one bio
> padding field to store the 'cookie' returned from submit_bio() for these
> bios. Also explicitly end these bios in poll context by adding two
> new bio flags.
> 
> This way, we no longer need to poll all underlying hw queues, as is
> done in Jeffle's patches. Instead, we can poll just the hw queues
> in which HIPRI IO is queued.
> 
> Usually io submission and io poll share the same context, so the added io
> poll context data behaves much like a stack variable, and the cost of
> saving bios is low.
> 
> Any comments are welcome.

I really like your approach and am very encouraged by the early results
Jeffle has shared.

Please review my various nits for your next iteration of this patchset.
But I think you aren't far from these changes being ready to make the
5.13 merge, which is really pretty awesome.

Outstanding job Ming, thanks so much for taking on this line of work!

Mike