[for-next] block: fix hctx checks for batch allocation

Message ID	80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com (mailing list archive)
State	New, archived
Headers	Return-Path: <linux-block-owner@vger.kernel.org>
	From: Pavel Begunkov <asml.silence@gmail.com>
	To: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org
	Cc: Pavel Begunkov <asml.silence@gmail.com>
	Subject: [PATCH for-next] block: fix hctx checks for batch allocation
	Date: Tue, 17 Jan 2023 11:42:15 +0000
	Message-Id: <80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Precedence: bulk
Series	[for-next] block: fix hctx checks for batch allocation

Commit Message

Pavel Begunkov Jan. 17, 2023, 11:42 a.m. UTC

When there are no read queues, read requests will be assigned a
default queue on allocation. However, blk_mq_get_cached_request() is not
prepared for that and will fail all attempts to grab read requests from
the cache. Worst case it doubles the number of requests allocated,
roughly half of which will be returned by blk_mq_free_plug_rqs().

It only affects batched allocations and so is io_uring specific.
For reference, QD8 t/io_uring benchmark improves by 20-35%.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---

It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
always can accommodate combined write with read reqs.

 block/blk-mq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
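
To illustrate the check described in the commit message, here is a minimal userspace sketch, not the kernel code itself. The enum is a simplified stand-in modeled on the kernel's `enum hctx_type`, and the helper names `cached_rq_usable_old()` and `cached_rq_usable_new()` are hypothetical, introduced only for this example. They compare the type derived from the bio with the type of the hctx the cached request was allocated on.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's enum hctx_type (illustrative only). */
enum hctx_type {
	HCTX_TYPE_DEFAULT,
	HCTX_TYPE_READ,
	HCTX_TYPE_POLL,
};

/*
 * Pre-patch rule (hypothetical helper): the type derived from the bio must
 * equal the type of the hctx the cached request was allocated on.
 */
static bool cached_rq_usable_old(enum hctx_type type, enum hctx_type hctx_type)
{
	return type == hctx_type;
}

/*
 * Post-patch rule (hypothetical helper): additionally accept a read bio
 * when the cached request sits on a default hctx, which is where reads are
 * placed when the device has no dedicated read queues.
 */
static bool cached_rq_usable_new(enum hctx_type type, enum hctx_type hctx_type)
{
	if (type == hctx_type)
		return true;
	return type == HCTX_TYPE_READ && hctx_type == HCTX_TYPE_DEFAULT;
}
```

With a read bio and a request cached on a default hctx, the old predicate rejects the cached request and the new one accepts it. All other mismatches, such as a polled bio against a default hctx, are still rejected.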

Comments

Jens Axboe Jan. 17, 2023, 4:56 p.m. UTC | #1

On 1/17/23 4:42 AM, Pavel Begunkov wrote:
> When there are no read queues, read requests will be assigned a
> default queue on allocation. However, blk_mq_get_cached_request() is not
> prepared for that and will fail all attempts to grab read requests from
> the cache. Worst case it doubles the number of requests allocated,
> roughly half of which will be returned by blk_mq_free_plug_rqs().
> 
> It only affects batched allocations and so is io_uring specific.
> For reference, QD8 t/io_uring benchmark improves by 20-35%.

This does make a big difference for me. Usual peak test (24 drives), and
I get 63-65M IOPS with IRQ based IO. With the patch:

polled=0, fixedbufs=1/0, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=64.79M, BW=31.64GiB/s, IOS/call=32/31
IOPS=73.45M, BW=35.86GiB/s, IOS/call=32/32
IOPS=73.70M, BW=35.99GiB/s, IOS/call=31/31
IOPS=74.57M, BW=36.41GiB/s, IOS/call=31/31
IOPS=75.18M, BW=36.71GiB/s, IOS/call=31/31
IOPS=74.33M, BW=36.29GiB/s, IOS/call=32/32
IOPS=74.53M, BW=36.39GiB/s, IOS/call=32/32
IOPS=74.61M, BW=36.43GiB/s, IOS/call=32/32

which is 15-19% better.

> It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
> always can accommodate combined write with read reqs.

I think it makes sense to do so, particularly now that we have support
for not just polled IO.

Jens Axboe Jan. 17, 2023, 5:22 p.m. UTC | #2

On Tue, 17 Jan 2023 11:42:15 +0000, Pavel Begunkov wrote:
> When there are no read queues, read requests will be assigned a
> default queue on allocation. However, blk_mq_get_cached_request() is not
> prepared for that and will fail all attempts to grab read requests from
> the cache. Worst case it doubles the number of requests allocated,
> roughly half of which will be returned by blk_mq_free_plug_rqs().
> 
> It only affects batched allocations and so is io_uring specific.
> For reference, QD8 t/io_uring benchmark improves by 20-35%.
> 
> [...]

Applied, thanks!

[1/1] block: fix hctx checks for batch allocation
      (no commit info)

Best regards,

Christoph Hellwig Jan. 18, 2023, 5:36 a.m. UTC | #3

On Tue, Jan 17, 2023 at 11:42:15AM +0000, Pavel Begunkov wrote:
> It might be a good idea to always use HCTX_TYPE_DEFAULT, so the cache
> always can accommodate combined write with read reqs.

I suspect we'll just need a separate cache for each HCTX_TYPE.
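
One way to picture a separate cache per HCTX_TYPE is the toy userspace model below. It is only a sketch under stated assumptions: `struct toy_request`, `struct toy_plug`, `toy_cache_put()` and `toy_cache_get()` are hypothetical names invented for this illustration, and nothing here reflects how the kernel's plug actually stores cached requests.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's enum hctx_type (illustrative only). */
enum hctx_type {
	HCTX_TYPE_DEFAULT,
	HCTX_TYPE_READ,
	HCTX_TYPE_POLL,
	HCTX_MAX_TYPES,
};

/* Hypothetical request: records the type of the hctx it was allocated on. */
struct toy_request {
	struct toy_request *next;
	enum hctx_type hctx_type;
};

/* Hypothetical plug: one singly linked cache list per hctx type. */
struct toy_plug {
	struct toy_request *cached[HCTX_MAX_TYPES];
};

/* Push a request onto the list that matches its hctx type. */
static void toy_cache_put(struct toy_plug *plug, struct toy_request *rq)
{
	rq->next = plug->cached[rq->hctx_type];
	plug->cached[rq->hctx_type] = rq;
}

/* Pop a request of the wanted type, or return NULL if that list is empty. */
static struct toy_request *toy_cache_get(struct toy_plug *plug,
					 enum hctx_type type)
{
	struct toy_request *rq = plug->cached[type];

	if (rq) {
		plug->cached[type] = rq->next;
		rq->next = NULL;
	}
	return rq;
}
```

In this model a lookup never has to compare types, because each list only ever holds requests of one type.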

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2c49b4151da1..9d463f7563bc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2890,6 +2890,7 @@  static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 		struct blk_plug *plug, struct bio **bio, unsigned int nsegs)
 {
 	struct request *rq;
+	enum hctx_type type, hctx_type;
 
 	if (!plug)
 		return NULL;
@@ -2902,7 +2903,10 @@  static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 		return NULL;
 	}
 
-	if (blk_mq_get_hctx_type((*bio)->bi_opf) != rq->mq_hctx->type)
+	type = blk_mq_get_hctx_type((*bio)->bi_opf);
+	hctx_type = rq->mq_hctx->type;
+	if (type != hctx_type &&
+	    !(type == HCTX_TYPE_READ && hctx_type == HCTX_TYPE_DEFAULT))
 		return NULL;
 	if (op_is_flush(rq->cmd_flags) != op_is_flush((*bio)->bi_opf))
 		return NULL;
