[v5,0/3] io_uring/rsrc: coalescing multi-hugepage registered buffers

Message ID 20240628084411.2371-1-cliang01.li@samsung.com

Chenliang Li June 28, 2024, 8:44 a.m. UTC
Registered buffers are stored and processed as a bvec array. Each bvec
element typically points to a PAGE_SIZE page, but it can also describe a
hugepage: a buffer consisting of a single hugepage is coalesced into one
hugepage bvec entry during registration. This coalescing saves both
space and DMA-mapping time.

However, coalescing currently doesn't work for multi-hugepage buffers.
A buffer made of several 2M hugepages is still split into thousands of
4K page bvec entries, when a handful of hugepage bvecs would suffice.
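
To make the arithmetic concrete, here is a minimal user-space sketch, not
kernel code. The helper name nr_bvecs() is hypothetical, and it assumes
the buffer starts on a segment boundary:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of io_uring): number of bvec entries
 * needed to describe `len` bytes when every entry covers 1 << shift
 * bytes. Assumes the buffer is aligned to the segment size and that
 * `shift` is smaller than the bit width of size_t.
 */
static size_t nr_bvecs(size_t len, unsigned int shift)
{
	size_t seg = (size_t)1 << shift;

	/* Round up so that a partial trailing segment still gets an entry. */
	return (len + seg - 1) >> shift;
}
```

With shift = 12 (4K pages) an 8M buffer needs 2048 entries; with
shift = 21 (2M hugepages) the same buffer needs 4.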

This patch series enables coalescing of registered buffers that span
more than one hugepage. It reduces DMA-mapping time and saves memory for
this kind of buffer.

Testing:

Hugepage fixed-buffer I/O can be tested with unmodified fio. The fio
command used in the tests below is given in [1], and a liburing test
case is in [2]. The system must have enough hugepages available before
testing.
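
One way to check hugepage availability before a run is to read the
HugePages_Free field from /proc/meminfo. The sketch below is only an
illustration; both function names are hypothetical, and how many free
hugepages a given fio job needs is not covered here:

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 and stores the value in *count if
 * `line` is the HugePages_Free line of /proc/meminfo, else returns 0
 * and leaves *count untouched.
 */
static int parse_hugepages_free(const char *line, long *count)
{
	return sscanf(line, "HugePages_Free: %ld", count) == 1;
}

/*
 * Hypothetical helper: scans a meminfo-style file and returns the
 * number of free hugepages, or -1 if the file cannot be opened or the
 * field is missing.
 */
static long read_hugepages_free(const char *path)
{
	FILE *f = fopen(path, "r");
	char line[256];
	long count = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (parse_hugepages_free(line, &count))
			break;
	}
	fclose(f);
	return count;
}
```

Calling read_hugepages_free("/proc/meminfo") before starting fio gives a
quick sanity check of the hugetlb pool.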

Perf diff of 8M (4 * 2M hugepages) fio randread test:

Before    After     Symbol
.....................................................
5.88%               [k] __blk_rq_map_sg
3.98%     -3.95%    [k] dma_direct_map_sg
2.47%               [k] dma_pool_alloc
1.37%     -1.36%    [k] sg_next
          +0.28%    [k] dma_map_page_attrs

Perf diff of 8M fio randwrite test:

Before    After     Symbol
.....................................................
2.80%               [k] __blk_rq_map_sg
1.74%               [k] dma_direct_map_sg
1.61%               [k] dma_pool_alloc
0.67%               [k] sg_next
          +0.04%    [k] dma_map_page_attrs

The first two patches prepare for adding multi-hugepage coalescing to
buffer registration; the third patch enables the feature.

-----------------
Changes since v4:

- Use a new compacted array of pages instead of the original one if the
  buffer can be coalesced.
- Remove loops that became unnecessary with the new page array.
- Remove the account and init helpers for the coalesced imu and use the
  original path instead.
- Remove the unnecessary nr_folios field in the io_imu_folio_data struct.
- Rearrange the helper functions.
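
The idea of a compacted page array can be modelled in user space as
follows. This is a simplified sketch, not the code in the patches: pages
are represented by page frame numbers, a folio is assumed to be a
naturally aligned run of 1 << order frames, and compact_by_folio() is a
hypothetical name. It does not model the checks that decide whether a
buffer can be coalesced at all.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: compact `pfns` in place so that each run of
 * consecutive entries belonging to the same folio is replaced by a
 * single entry holding that folio's first frame number. Returns the
 * number of entries left in the compacted array.
 */
static size_t compact_by_folio(uint64_t *pfns, size_t nr, unsigned int order)
{
	size_t out = 0;

	for (size_t i = 0; i < nr; i++) {
		/* First frame of the folio that contains pfns[i]. */
		uint64_t head = (pfns[i] >> order) << order;

		/* Emit one entry whenever a new folio starts. */
		if (out == 0 || pfns[out - 1] != head)
			pfns[out++] = head;
	}
	return out;
}
```

For 1024 consecutive 4K frames starting at frame 1024 and order = 9 (one
2M folio is 512 frames), the array shrinks to two entries, 1024 and 1536.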

v4 : https://lore.kernel.org/io-uring/aaad076c-af5b-46fa-9f74-0c1e8358715b@kernel.dk/T/#t

Changes since v3:

- Delete unnecessary commit message
- Update test command and test results

v3 : https://lore.kernel.org/io-uring/20240514001614.566276-1-cliang01.li@samsung.com/T/#t

Changes since v2:

- Modify the loop iterator increment to make code cleaner
- Minor fix to the return procedure in coalesced buffer accounting
- Correct commit messages
- Add test cases in liburing

v2 : https://lore.kernel.org/io-uring/20240513020149.492727-1-cliang01.li@samsung.com/T/#t

Changes since v1:

- Split into 4 patches
- Fix code style issues
- Rearrange the code changes for a cleaner look
- Add a specialized pinned page accounting procedure for coalesced
  buffers
- Reorder the newly added fields in the imu struct for better compaction

v1 : https://lore.kernel.org/io-uring/20240506075303.25630-1-cliang01.li@samsung.com/T/#u

[1]
fio -iodepth=64 -rw=randread(-rw=randwrite) -direct=1 -ioengine=io_uring \
-bs=8M -numjobs=1 -group_reporting -mem=shmhuge -fixedbufs -hugepage-size=2M \
-filename=/dev/nvme0n1 -runtime=10s -name=test1

[2]
https://lore.kernel.org/io-uring/20240514051343.582556-1-cliang01.li@samsung.com/T/#u

Chenliang Li (3):
  io_uring/rsrc: add hugepage fixed buffer coalesce helpers
  io_uring/rsrc: store folio shift and mask into imu
  io_uring/rsrc: enable multi-hugepage buffer coalescing

 io_uring/rsrc.c | 149 +++++++++++++++++++++++++++++++++++-------------
 io_uring/rsrc.h |  11 ++++
 2 files changed, 120 insertions(+), 40 deletions(-)


base-commit: 50cf5f3842af3135b88b041890e7e12a74425fcb