Message ID | 20241216201901.2670237-2-bvanassche@acm.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Move blk_mq_submit_bio() error handling | expand |
On Mon, Dec 16, 2024 at 12:19:00PM -0800, Bart Van Assche wrote: > Help the CPU branch predictor in case of a cache hit by handling the cache > hit scenario first. Numbers, please.
On 12/16/24 8:16 PM, Christoph Hellwig wrote: > On Mon, Dec 16, 2024 at 12:19:00PM -0800, Bart Van Assche wrote: >> Help the CPU branch predictor in case of a cache hit by handling the cache >> hit scenario first. > > Numbers, please. For a single CPU core and with the brd driver and fio and the io_uring I/O engine, I see the following performance in a VM (three test runs for each test case): Without this patch: 1619K, 1641K, 1638K IOPS or 1633 K +/- 10 K. With this patch applied: 1650K, 1633K, 1635K IOPS or 1639 K +/- 8 K. So there is a small performance improvement but the improvement is smaller than the measurement error. Is this sufficient data to proceed with this patch? Thanks, Bart.
On 12/17/24 12:22 PM, Bart Van Assche wrote: > On 12/16/24 8:16 PM, Christoph Hellwig wrote: >> On Mon, Dec 16, 2024 at 12:19:00PM -0800, Bart Van Assche wrote: >>> Help the CPU branch predictor in case of a cache hit by handling the cache >>> hit scenario first. >> >> Numbers, please. > > For a single CPU core and with the brd driver and fio and the io_uring > I/O engine, I see the following performance in a VM (three test runs for > each test case): > > Without this patch: 1619K, 1641K, 1638K IOPS or 1633 K +/- 10 K. > With this patch applied: 1650K, 1633K, 1635K IOPS or 1639 K +/- 8 K. > > So there is a small performance improvement but the improvement is > smaller than the measurement error. Is this sufficient data to proceed > with this patch? I think it's fine, it's going to be very hard to reliably measure. I tend to prefer the expected case to be the one checked for, eg: rq = get_cached(); if (rq) ... anyway, as then the expected outcome is what you're reading next anyway. Either that or rq = get_cached(); if (unlikely(!rq)) goto some_failure_path; If you really wanted to measure it, you'd probably skip the IOPS part and just look at branch misprediction in perf.
On Tue, Dec 17, 2024 at 11:22:47AM -0800, Bart Van Assche wrote: > For a single CPU core and with the brd driver and fio and the io_uring > I/O engine, I see the following performance in a VM (three test runs for > each test case): > > Without this patch: 1619K, 1641K, 1638K IOPS or 1633 K +/- 10 K. > With this patch applied: 1650K, 1633K, 1635K IOPS or 1639 K +/- 8 K. > > So there is a small performance improvement but the improvement is > smaller than the measurement error. Is this sufficient data to proceed > with this patch? I think it's sufficient data that should not claim that it is an optimizations. The resulting code still looks nicer to me, but arguing with optimizations feels like BS.
diff --git a/block/blk-mq.c b/block/blk-mq.c index 7ee21346a41e..8d2aab4d9ba9 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3102,12 +3102,12 @@ void blk_mq_submit_bio(struct bio *bio) goto queue_exit; new_request: - if (!rq) { + if (rq) { + blk_mq_use_cached_rq(rq, plug, bio); + } else { rq = blk_mq_get_new_requests(q, plug, bio, nr_segs); if (unlikely(!rq)) goto queue_exit; - } else { - blk_mq_use_cached_rq(rq, plug, bio); } trace_block_getrq(bio);
Help the CPU branch predictor in case of a cache hit by handling the cache hit scenario first. Cc: Damien Le Moal <dlemoal@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)