[v3,0/7] Submit zoned writes in order

Message ID 20230522183845.354920-1-bvanassche@acm.org (mailing list archive)
Message

Bart Van Assche May 22, 2023, 6:38 p.m. UTC
Hi Jens,

Tests with a zoned UFS prototype have shown that there are plenty of
opportunities for reordering in the block layer for zoned writes (REQ_OP_WRITE).
The UFS driver is more likely to trigger reordering than other SCSI drivers
because it reports BLK_STS_DEV_RESOURCE more often, e.g. during clock scaling.
This patch series makes sure that zoned writes are submitted in order without
affecting other workloads significantly.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v2:
- Changed the approach from one requeue list per hctx into preserving one
  requeue list per request queue.
- Rebased on top of Jens' for-next branch. Left out the mq-deadline patches
  since these are already in the for-next branch.
- Modified patch "block: Requeue requests if a CPU is unplugged" such that it
  always uses the requeue list.
- Added a patch that removes blk_mq_kick_requeue_list() and
  blk_mq_delay_kick_requeue_list().
- Dropped patch "block: mq-deadline: Disable head insertion for zoned writes".
- Dropped patch "block: mq-deadline: Introduce a local variable".

Changes compared to v1:
- Fixed two issues detected by the kernel test robot.

Bart Van Assche (7):
  block: Rename a local variable in blk_mq_requeue_work()
  block: Send requeued requests to the I/O scheduler
  block: Requeue requests if a CPU is unplugged
  block: Make it easier to debug zoned write reordering
  block: Preserve the order of requeued requests
  dm: Inline __dm_mq_kick_requeue_list()
  block: Inline blk_mq_{,delay_}kick_requeue_list()

 block/blk-flush.c            |   4 +-
 block/blk-mq-debugfs.c       |   2 +-
 block/blk-mq.c               | 107 ++++++++++++++++++-----------------
 drivers/block/ublk_drv.c     |   6 +-
 drivers/block/xen-blkfront.c |   1 -
 drivers/md/dm-rq.c           |  11 +---
 drivers/nvme/host/core.c     |   2 +-
 drivers/s390/block/scm_blk.c |   2 +-
 drivers/scsi/scsi_lib.c      |   2 +-
 include/linux/blk-mq.h       |   6 +-
 include/linux/blkdev.h       |   1 -
 11 files changed, 69 insertions(+), 75 deletions(-)
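
The ordering problem described in the cover letter can be illustrated with a toy model. The sketch below is not kernel code and does not reflect how the series is implemented: `struct zone`, `zone_write()` and `submit_writes()` are hypothetical names, and the single "busy" rejection merely stands in for a status such as BLK_STS_DEV_RESOURCE. It assumes a sequential zone that accepts a write only at its write pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy sequential zone: a write is accepted only at the write pointer. */
struct zone {
	unsigned int wp;
};

static bool zone_write(struct zone *z, unsigned int lba)
{
	if (lba != z->wp)
		return false;	/* unaligned write, rejected by the zone */
	z->wp++;
	return true;
}

/*
 * Submit writes for LBAs 0..n-1 to an empty zone. The write for busy_lba
 * is rejected once with a "device busy" status and has to be requeued.
 *
 * preserve_order == false: later writes are dispatched first and the
 * requeued write is retried afterwards (reordering).
 * preserve_order == true: the requeued write is retried before any later
 * write is dispatched.
 *
 * Returns the number of writes that the zone rejected as unaligned.
 */
static unsigned int submit_writes(unsigned int n, unsigned int busy_lba,
				  bool preserve_order)
{
	struct zone z = { .wp = 0 };
	unsigned int failed = 0;

	assert(busy_lba < n);

	for (unsigned int lba = 0; lba < n; lba++) {
		if (lba == busy_lba && !preserve_order)
			continue;	/* requeued, retried after the loop */
		/* In the order-preserving case the retry happens right here. */
		if (!zone_write(&z, lba))
			failed++;
	}

	if (!preserve_order && !zone_write(&z, busy_lba))
		failed++;

	return failed;
}
```

In this model, calling `submit_writes(4, 1, false)` lets the writes for LBAs 2 and 3 reach the zone before LBA 1, so both are rejected, while `submit_writes(4, 1, true)` completes without any rejected write.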

Comments

hch@lst.de May 23, 2023, 7:22 a.m. UTC | #1
On Mon, May 22, 2023 at 11:38:35AM -0700, Bart Van Assche wrote:
> - Changed the approach from one requeue list per hctx into preserving one
>   requeue list per request queue.

Can you explain why?  The resulting code looks rather odd to me as we
now reach out to a global list from the per-hctx run_queue helper,
which seems a bit awkward.

Bart Van Assche May 23, 2023, 8:04 p.m. UTC | #2
On 5/23/23 00:22, Christoph Hellwig wrote:
> On Mon, May 22, 2023 at 11:38:35AM -0700, Bart Van Assche wrote:
>> - Changed the approach from one requeue list per hctx into preserving one
>>    requeue list per request queue.
> 
> Can you explain why?  The resulting code looks rather odd to me as we
> now reach out to a global list from the per-hctx run_queue helper,
> which seems a bit awkward.

Hi Christoph,

This change is based on the assumption that requeuing and flushing are 
relatively rare events. Do you perhaps want me to change the approach 
back to one requeue list and one flush list per hctx?

Thanks,

Bart.

hch@lst.de May 24, 2023, 6:15 a.m. UTC | #3
On Tue, May 23, 2023 at 01:04:44PM -0700, Bart Van Assche wrote:
>> Can you explain why?  The resulting code looks rather odd to me as we
>> now reach out to a global list from the per-hctx run_queue helper,
>> which seems a bit awkward.
>
> Hi Christoph,
>
> This change is based on the assumption that requeuing and flushing are 
> relatively rare events.

The former are, the latter not so much.  But more importantly you now
look into a global list in the per-hctx dispatch, adding cache line
sharing.

> Do you perhaps want me to change the approach back 
> to one requeue list and one flush list per hctx?

Unless we have a very good reason to make them global that would
be my preference.
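
The cache line concern raised in this reply can be sketched in user-space C. This is only an illustration under stated assumptions: the structure and function names (`request_queue_v3`, `hw_ctx_v2`, `requeue_list_shared()`, `requeue_list_per_hctx()`) are hypothetical and are not the kernel's definitions, and a 64-byte cache line is assumed. The first variant models one requeue list per request queue (the v3 approach), the second one requeue list per hctx (the v2 approach).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache line size in bytes */
#define NR_HCTX    4	/* assumed number of hardware queues */

struct list_head {
	struct list_head *next, *prev;
};

/* One requeue list per request queue: every hctx touches the same list. */
struct request_queue_v3 {
	_Alignas(CACHE_LINE) struct list_head requeue_list;
};

/* One requeue list per hctx: each hctx has a list on its own cache line. */
struct hw_ctx_v2 {
	_Alignas(CACHE_LINE) struct list_head requeue_list;
};

static struct request_queue_v3 queue;
static struct hw_ctx_v2 hctxs[NR_HCTX];

/* List that the dispatch path of hctx idx would inspect, shared variant. */
static const struct list_head *requeue_list_shared(unsigned int idx)
{
	assert(idx < NR_HCTX);
	return &queue.requeue_list;
}

/* List that the dispatch path of hctx idx would inspect, per-hctx variant. */
static const struct list_head *requeue_list_per_hctx(unsigned int idx)
{
	assert(idx < NR_HCTX);
	return &hctxs[idx].requeue_list;
}

/* True if both addresses fall within the same cache line. */
static bool same_cache_line(const void *a, const void *b)
{
	return (uintptr_t)a / CACHE_LINE == (uintptr_t)b / CACHE_LINE;
}
```

With the shared variant, `same_cache_line(requeue_list_shared(0), requeue_list_shared(1))` is true, so two hardware queues dispatching on different CPUs would read the same cache line. With the per-hctx variant the corresponding check is false, since each list head sits on a line of its own.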