[v4,0/8] implement async block discards and other ops via io_uring

Message ID: cover.1725621577.git.asml.silence@gmail.com

Message

Pavel Begunkov Sept. 6, 2024, 10:57 p.m. UTC
There is interest in having asynchronous block operations like
discard and write zeroes. The series implements them as io_uring commands,
an io_uring request type that allows implementing custom file-specific
operations.

First 4 are preparation patches. Patch 5 introduces the main chunk of
cmd infrastructure and discard commands. Patches 6-8 implement
write zeroes variants.

Branch with tests and docs:
https://github.com/isilence/liburing.git discard-cmd

The man page specifically (need to shuffle it to some cmd section):
https://github.com/isilence/liburing/commit/a6fa2bc2400bf7fcb80496e322b5db4c8b3191f0

v4: fix failing to pass nowait (unused opf) in patch 7

v3: use GFP_NOWAIT for non-blocking allocation
    fail oversized nowait discards in advance
    drop secure erase and add zero page writes
    renamed function name + other cosmetic changes
    use IOC / ioctl encoding for cmd opcodes

v2: move out of CONFIG_COMPAT
    add write zeroes & secure erase
    drop a note about interaction with page cache
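
For readers unfamiliar with the "IOC / ioctl encoding" mentioned in the v3
notes, here is a minimal sketch of how such an encoding packs a type byte
and a command number into one value. It decodes the existing BLKDISCARD
ioctl from <linux/fs.h>. EXAMPLE_BLOCK_CMD and the two helpers are
hypothetical names for illustration; the real opcode values are defined by
the patches and are not reproduced here.

```c
#include <linux/fs.h>    /* BLKDISCARD */
#include <linux/ioctl.h> /* _IO, _IOC_TYPE, _IOC_NR */

/* Hypothetical opcode for illustration only; NOT a value from the series. */
#define EXAMPLE_BLOCK_CMD _IO(0x12, 200)

/* Extract the "type" (magic) byte from an ioctl-encoded command. */
static unsigned int cmd_type(unsigned int cmd)
{
	return _IOC_TYPE(cmd);
}

/* Extract the sequential command number from an ioctl-encoded command. */
static unsigned int cmd_nr(unsigned int cmd)
{
	return _IOC_NR(cmd);
}
```

With these helpers, cmd_type(BLKDISCARD) is 0x12 and cmd_nr(BLKDISCARD) is
119, matching its definition as _IO(0x12,119).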

Pavel Begunkov (8):
  io_uring/cmd: expose iowq to cmds
  io_uring/cmd: give inline space in request to cmds
  filemap: introduce filemap_invalidate_pages
  block: introduce blk_validate_byte_range()
  block: implement async discard as io_uring cmd
  block: implement async write zeroes command
  block: add nowait flag for __blkdev_issue_zero_pages
  block: implement async write zero pages command

 block/blk-lib.c              |  27 ++++-
 block/blk.h                  |   1 +
 block/fops.c                 |   2 +
 block/ioctl.c                | 228 ++++++++++++++++++++++++++++++++---
 include/linux/bio.h          |   6 +
 include/linux/blkdev.h       |   1 +
 include/linux/io_uring/cmd.h |  15 +++
 include/linux/pagemap.h      |   2 +
 include/uapi/linux/fs.h      |   4 +
 io_uring/io_uring.c          |  11 ++
 io_uring/io_uring.h          |   1 +
 io_uring/uring_cmd.c         |   7 ++
 mm/filemap.c                 |  17 ++-
 13 files changed, 293 insertions(+), 29 deletions(-)

Comments

Jens Axboe Sept. 8, 2024, 10:25 p.m. UTC | #1
On 9/6/24 4:57 PM, Pavel Begunkov wrote:
> There is an interest in having asynchronous block operations like
> discard and write zeroes. The series implements that as io_uring commands,
> which is an io_uring request type allowing to implement custom file
> specific operations.
> 
> First 4 are preparation patches. Patch 5 introduces the main chunk of
> cmd infrastructure and discard commands. Patches 6-8 implement
> write zeroes variants.
> 
> Branch with tests and docs:
> https://github.com/isilence/liburing.git discard-cmd
> 
> The man page specifically (need to shuffle it to some cmd section):
> https://github.com/isilence/liburing/commit/a6fa2bc2400bf7fcb80496e322b5db4c8b3191f0

This looks good to me now. Only minor nit is that I generally don't
like:

while ((bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects, gfp))) {

where assignment and test are in one line as they are harder to read,
prefer doing:

do {
	bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects, gfp);
	if (!bio)
		break;
	[...]
} while (1);

instead. But nothing that should need a respin or anything.
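
To make the two styles concrete outside the kernel, here is a minimal
userspace sketch. struct chunk, next_chunk() and MAX_CHUNK are hypothetical
stand-ins for the bio and for blk_alloc_discard_bio(); they only mimic the
"carve off the next piece, return NULL when done" shape of the loop and are
not the kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bio: one piece of a larger range. */
struct chunk {
	unsigned long long start;
	unsigned long long len;
};

#define MAX_CHUNK 8ULL

/*
 * Hypothetical stand-in for blk_alloc_discard_bio(): carve the next piece
 * off the front of the range, advance the cursor, return NULL when the
 * range is exhausted.
 */
static struct chunk *next_chunk(struct chunk *slot,
				unsigned long long *start,
				unsigned long long *remaining)
{
	if (!*remaining)
		return NULL;
	slot->start = *start;
	slot->len = *remaining < MAX_CHUNK ? *remaining : MAX_CHUNK;
	*start += slot->len;
	*remaining -= slot->len;
	return slot;
}

/* Assignment and test combined in the loop condition. */
static unsigned int count_combined(unsigned long long start,
				   unsigned long long remaining)
{
	struct chunk slot, *c;
	unsigned int n = 0;

	while ((c = next_chunk(&slot, &start, &remaining)))
		n++;
	return n;
}

/* Assignment first, test on its own line. */
static unsigned int count_split(unsigned long long start,
				unsigned long long remaining)
{
	struct chunk slot, *c;
	unsigned int n = 0;

	do {
		c = next_chunk(&slot, &start, &remaining);
		if (!c)
			break;
		n++;
	} while (1);
	return n;
}
```

Both functions visit the same pieces; the difference is purely one of
readability.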

I'll run some testing on this tomorrow!

Thanks,

Jens Axboe Sept. 9, 2024, 2:51 p.m. UTC | #2
On 9/6/24 4:57 PM, Pavel Begunkov wrote:
> There is an interest in having asynchronous block operations like
> discard and write zeroes. The series implements that as io_uring commands,
> which is an io_uring request type allowing to implement custom file
> specific operations.
> 
> First 4 are preparation patches. Patch 5 introduces the main chunk of
> cmd infrastructure and discard commands. Patches 6-8 implement
> write zeroes variants.

Sitting in for-6.12/io_uring-discard for now, as there's a hidden
dependency with the end/len patch in for-6.12/block.

Ran a quick test - have 64 4k discards inflight. Here's the current
performance, with 64 threads with sync discard:

qd64 sync discard: 21K IOPS, lat avg 3 msec (max 21 msec)

and using io_uring with async discard, otherwise same test case:

qd64 async discard: 76K IOPS, lat avg 845 usec (max 2.2 msec)

If we switch to doing 1M discards, then we get:

qd64 sync discard: 14K IOPS, lat avg 5 msec (max 25 msec)

and using io_uring with async discard, otherwise same test case:

qd64 async discard: 56K IOPS, lat avg 1153 usec (max 3.6 msec)

This is on a:

Samsung Electronics Co Ltd NVMe SSD Controller PM174X

nvme device. It doesn't have the fastest discard, but still nicely shows
the improvement over a purely sync discard.

Jens Axboe Sept. 9, 2024, 3:09 p.m. UTC | #3
On Fri, 06 Sep 2024 23:57:17 +0100, Pavel Begunkov wrote:
> There is an interest in having asynchronous block operations like
> discard and write zeroes. The series implements that as io_uring commands,
> which is an io_uring request type allowing to implement custom file
> specific operations.
> 
> First 4 are preparation patches. Patch 5 introduces the main chunk of
> cmd infrastructure and discard commands. Patches 6-8 implement
> write zeroes variants.
> 
> [...]

Applied, thanks!

[1/8] io_uring/cmd: expose iowq to cmds
      commit: c6472f5f9a0806b0598ba513344b5a30cfa53b97
[2/8] io_uring/cmd: give inline space in request to cmds
      commit: 1a7628d034f8328813163d07ce112e1198289aeb
[3/8] filemap: introduce filemap_invalidate_pages
      commit: 1f027ae3136dfb4bfe40d83f3e0f5019e63db883
[4/8] block: introduce blk_validate_byte_range()
      commit: da22f537db72c2520c48445840b7e371c58762a7
[5/8] block: implement async discard as io_uring cmd
      commit: 0d266c981982f0f54165f05dbcdf449bb87f5184
[6/8] block: implement async write zeroes command
      commit: b56d5132a78db21ca3b386056af38802aea0a274
[7/8] block: add nowait flag for __blkdev_issue_zero_pages
      commit: 4f8e422a0744f1294c784109cfbedafd97263c2f
[8/8] block: implement async write zero pages command
      commit: 4811c90cbf179b4c58fdbad54c5b05efc0d59159

Best regards,

Jens Axboe Sept. 9, 2024, 3:33 p.m. UTC | #4
On 9/9/24 8:51 AM, Jens Axboe wrote:
> On 9/6/24 4:57 PM, Pavel Begunkov wrote:
>> There is an interest in having asynchronous block operations like
>> discard and write zeroes. The series implements that as io_uring commands,
>> which is an io_uring request type allowing to implement custom file
>> specific operations.
>>
>> First 4 are preparation patches. Patch 5 introduces the main chunk of
>> cmd infrastructure and discard commands. Patches 6-8 implement
>> write zeroes variants.
> 
> Sitting in for-6.12/io_uring-discard for now, as there's a hidden
> dependency with the end/len patch in for-6.12/block.
> 
> Ran a quick test - have 64 4k discards inflight. Here's the current
> performance, with 64 threads with sync discard:
> 
> qd64 sync discard: 21K IOPS, lat avg 3 msec (max 21 msec)
> 
> and using io_uring with async discard, otherwise same test case:
> 
> qd64 async discard: 76K IOPS, lat avg 845 usec (max 2.2 msec)
> 
> If we switch to doing 1M discards, then we get:
> 
> qd64 sync discard: 14K IOPS, lat avg 5 msec (max 25 msec)
> 
> and using io_uring with async discard, otherwise same test case:
> 
> qd64 async discard: 56K IOPS, lat avg 1153 usec (max 3.6 msec)
> 
> This is on a:
> 
> Samsung Electronics Co Ltd NVMe SSD Controller PM174X
> 
> nvme device. It doesn't have the fastest discard, but still nicely shows
> the improvement over a purely sync discard.

Did some basic testing with null_blk just to get a better idea of what
it'd look like on a faster device. Same test cases as above (qd=64, 4k
and 1M random trims):

Type   Trim size    IOPS   Lat avg (usec)   Lat max (usec)
==========================================================
sync   4k           144K              444            20314
async  4k          1353K               47              595
sync   1M            56K             1136            21031
async  1M            94K              680              760
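
A rough cross-check of the numbers above, not part of the original test:
if the queue really stays at 64 requests in flight (an assumption; the
setup is only described as qd=64), Little's law predicts
IOPS ~= queue depth / average latency. The helper name below is made up
for this sketch.

```c
/*
 * Little's law: throughput = requests in flight / average latency.
 * Latency is given in microseconds, the result is in thousands of IOPS.
 */
static double expected_kiops(unsigned int queue_depth, double lat_avg_usec)
{
	/* requests per usec * 1e6 = requests per second; / 1e3 = K IOPS */
	return (double)queue_depth / lat_avg_usec * 1000.0;
}
```

For the four rows this gives roughly 144K, 1362K, 56K and 94K, close to
the measured 144K, 1353K, 56K and 94K, so the IOPS and latency columns are
consistent with each other.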