
[PATCHSET,0/3] passthru block optimizations

Message ID 20220806152004.382170-1-axboe@kernel.dk (mailing list archive)


Jens Axboe Aug. 6, 2022, 3:20 p.m. UTC
Hi,

Currently passthru IO is slower than bdev O_DIRECT. One of the reasons
is that we do two allocations for each IO:

- One alloc+free for the page array for mapping the data
- One alloc+free of the bio

Let passthru IO dip into the bio cache to eliminate the bio alloc+free,
and use UIO_FASTIOV to gate whether we need to alloc+free the page array
for mapping purposes.
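
As a rough userspace illustration of the two ideas above (not the code
from the patches), the sketch below recycles objects through a free list
instead of allocating one per IO, and only heap-allocates the page array
when the IO needs more entries than fit in a small on-stack array. All
names (fake_bio, fake_bio_get, fake_bio_put, fake_map_io, heap_allocs)
are hypothetical, FAST_PAGES stands in for UIO_FASTIOV (8 in the kernel
headers), and the single-threaded free list is only a simplified stand-in
for the kernel's bio cache.

```c
#include <stddef.h>
#include <stdlib.h>

#define FAST_PAGES 8    /* stand-in for UIO_FASTIOV */

/* Hypothetical stand-in for a bio; only the free-list link matters here. */
struct fake_bio {
    struct fake_bio *next;
};

/* Single-threaded free list, a simplified stand-in for the bio cache. */
static struct fake_bio *bio_cache;
static unsigned long heap_allocs;   /* counts trips to malloc() */

static struct fake_bio *fake_bio_get(void)
{
    struct fake_bio *bio = bio_cache;

    if (bio) {                      /* cache hit: no allocation */
        bio_cache = bio->next;
        return bio;
    }
    heap_allocs++;
    return malloc(sizeof(*bio));
}

static void fake_bio_put(struct fake_bio *bio)
{
    bio->next = bio_cache;          /* recycle instead of free() */
    bio_cache = bio;
}

/* Map nr_pages for one IO; returns 0 on success, -1 on allocation failure. */
static int fake_map_io(size_t nr_pages)
{
    void *stack_pages[FAST_PAGES];
    void **pages = stack_pages;
    struct fake_bio *bio;

    if (nr_pages > FAST_PAGES) {    /* large IO: fall back to the heap */
        heap_allocs++;
        pages = malloc(nr_pages * sizeof(*pages));
        if (!pages)
            return -1;
    }

    bio = fake_bio_get();
    if (!bio) {
        if (pages != stack_pages)
            free(pages);
        return -1;
    }

    for (size_t i = 0; i < nr_pages; i++)
        pages[i] = NULL;            /* real code would pin user pages here */

    fake_bio_put(bio);
    if (pages != stack_pages)
        free(pages);
    return 0;
}
```

In this sketch, once the cache is warm, an IO of up to FAST_PAGES pages
makes no trips to the allocator at all, while larger IOs still pay for
the page array.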

This closes about half of the gap between passthru and bdev dio for me.
If we can sanely wire up completion batching for passthru, then that
would almost fully close the gap. Outside of that, the main missing
feature for passthru is the ability to use registered buffers with
io_uring, as the per-IO get_user_pages() is a large cycle consumer as
well.