Message ID | 20220308152105.309618-11-joshi.k@samsung.com (mailing list archive)
State      | New, archived
Series     | io_uring passthru over nvme
On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> From: Jens Axboe <axboe@kernel.dk>
>
> Add support to use plugging if it is enabled, else use default path.

The subject and this comment don't really explain what is done, and
also don't mention at all why it is done.
On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> > From: Jens Axboe <axboe@kernel.dk>
> >
> > Add support to use plugging if it is enabled, else use default path.
>
> The subject and this comment don't really explain what is done, and
> also don't mention at all why it is done.

Missed out, will fix up. But plugging gave a very good hike to IOPS.
Especially while comparing this with io-uring's block-io path that
keeps .plug enabled. Patch 9 (that enables .plug for uring-cmd) and
this goes hand in hand.
On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> > > From: Jens Axboe <axboe@kernel.dk>
> > >
> > > Add support to use plugging if it is enabled, else use default path.
> >
> > The subject and this comment don't really explain what is done, and
> > also don't mention at all why it is done.
>
> Missed out, will fix up. But plugging gave a very good hike to IOPS.

But how does plugging improve IOPS here for passthrough request? Not
see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
which is called by nvme_submit_user_cmd().

Thanks,
Ming
On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
> > On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
> > >
> > > On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> > > > From: Jens Axboe <axboe@kernel.dk>
> > > >
> > > > Add support to use plugging if it is enabled, else use default path.
> > >
> > > The subject and this comment don't really explain what is done, and
> > > also don't mention at all why it is done.
> >
> > Missed out, will fix up. But plugging gave a very good hike to IOPS.
>
> But how does plugging improve IOPS here for passthrough request? Not
> see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
> which is called by nvme_submit_user_cmd().

Yes, one tag at a time for each request, but none of the request gets
dispatched and instead added to the plug. And when io_uring ends the
plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
to be via ->queue_rq, one request at a time).

Only .plug impact looks like this on passthru-randread:

KIOPS(depth_batch)   1_1    8_2    64_16    128_32
Without plug         159    496    784      785
With plug            159    525    991      1044

Hope it does clarify.
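To make the flow described above concrete, here is a rough sketch of the
submission side under a plug. It is illustrative only and not code from this
series: struct pt_sqe and submit_one_passthru_sqe() are hypothetical stand-ins
for io_uring handing a uring-cmd SQE to the nvme passthrough path, which
allocates a request and ends up in blk_mq_request_bypass_insert().

#include <linux/blkdev.h>

struct pt_sqe;                                            /* hypothetical SQE type */
static void submit_one_passthru_sqe(struct pt_sqe *sqe);  /* hypothetical helper */

static void submit_passthru_batch(struct pt_sqe *sqes, int nr)
{
        struct blk_plug plug;
        int i;

        blk_start_plug(&plug);          /* current->plug is now set */
        for (i = 0; i < nr; i++) {
                /*
                 * With this patch, blk_mq_request_bypass_insert() sees
                 * current->plug and parks the request on plug->mq_list
                 * (via blk_add_rq_to_plug()) instead of dispatching it.
                 */
                submit_one_passthru_sqe(&sqes[i]);
        }
        blk_finish_plug(&plug);         /* flush plug->mq_list in one go;
                                         * drivers with ->queue_rqs get the
                                         * whole batch at once */
}

In the io_uring case the plug is started and finished by io_uring itself
around SQE submission, which is why patch 9 (enabling .plug for uring-cmd)
and this patch go together.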
On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
> > On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
> > > On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
> > > >
> > > > On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> > > > > From: Jens Axboe <axboe@kernel.dk>
> > > > >
> > > > > Add support to use plugging if it is enabled, else use default path.
> > > >
> > > > The subject and this comment don't really explain what is done, and
> > > > also don't mention at all why it is done.
> > >
> > > Missed out, will fix up. But plugging gave a very good hike to IOPS.
> >
> > But how does plugging improve IOPS here for passthrough request? Not
> > see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
> > which is called by nvme_submit_user_cmd().
>
> Yes, one tag at a time for each request, but none of the request gets
> dispatched and instead added to the plug. And when io_uring ends the
> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
> to be via ->queue_rq, one request at a time).
>
> Only .plug impact looks like this on passthru-randread:
>
> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
> Without plug         159    496    784      785
> With plug            159    525    991      1044
>
> Hope it does clarify.

OK, thanks for your confirmation, then the improvement should be from
batch submission only.

If cached request is enabled, I guess the number could be better.

Thanks,
Ming
On 3/22/22 7:27 PM, Ming Lei wrote:
> On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
>> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
>>> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
>>>> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
>>>>>
>>>>> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
>>>>>> From: Jens Axboe <axboe@kernel.dk>
>>>>>>
>>>>>> Add support to use plugging if it is enabled, else use default path.
>>>>>
>>>>> The subject and this comment don't really explain what is done, and
>>>>> also don't mention at all why it is done.
>>>>
>>>> Missed out, will fix up. But plugging gave a very good hike to IOPS.
>>>
>>> But how does plugging improve IOPS here for passthrough request? Not
>>> see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
>>> which is called by nvme_submit_user_cmd().
>>
>> Yes, one tag at a time for each request, but none of the request gets
>> dispatched and instead added to the plug. And when io_uring ends the
>> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
>> to be via ->queue_rq, one request at a time).
>>
>> Only .plug impact looks like this on passthru-randread:
>>
>> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
>> Without plug         159    496    784      785
>> With plug            159    525    991      1044
>>
>> Hope it does clarify.
>
> OK, thanks for your confirmation, then the improvement should be from
> batch submission only.
>
> If cached request is enabled, I guess the number could be better.

Yes, my original test patch pre-dates being able to set a submit count,
it would definitely help improve this case too. The current win is
indeed just from being able to use ->queue_rqs() rather than single
submit.
On 3/22/22 7:41 PM, Jens Axboe wrote:
> On 3/22/22 7:27 PM, Ming Lei wrote:
>> On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
>>> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
>>>> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
>>>>> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
>>>>>>
>>>>>> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
>>>>>>> From: Jens Axboe <axboe@kernel.dk>
>>>>>>>
>>>>>>> Add support to use plugging if it is enabled, else use default path.
>>>>>>
>>>>>> The subject and this comment don't really explain what is done, and
>>>>>> also don't mention at all why it is done.
>>>>>
>>>>> Missed out, will fix up. But plugging gave a very good hike to IOPS.
>>>>
>>>> But how does plugging improve IOPS here for passthrough request? Not
>>>> see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
>>>> which is called by nvme_submit_user_cmd().
>>>
>>> Yes, one tag at a time for each request, but none of the request gets
>>> dispatched and instead added to the plug. And when io_uring ends the
>>> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
>>> to be via ->queue_rq, one request at a time).
>>>
>>> Only .plug impact looks like this on passthru-randread:
>>>
>>> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
>>> Without plug         159    496    784      785
>>> With plug            159    525    991      1044
>>>
>>> Hope it does clarify.
>>
>> OK, thanks for your confirmation, then the improvement should be from
>> batch submission only.
>>
>> If cached request is enabled, I guess the number could be better.
>
> Yes, my original test patch pre-dates being able to set a submit count,
> it would definitely help improve this case too. The current win is
> indeed just from being able to use ->queue_rqs() rather than single
> submit.

Actually that is already there through io_uring, nothing extra is
needed.
On Tue, Mar 22, 2022 at 07:58:25PM -0600, Jens Axboe wrote:
> On 3/22/22 7:41 PM, Jens Axboe wrote:
> > On 3/22/22 7:27 PM, Ming Lei wrote:
> >> On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
> >>> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
> >>>> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
> >>>>> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
> >>>>>>
> >>>>>> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
> >>>>>>> From: Jens Axboe <axboe@kernel.dk>
> >>>>>>>
> >>>>>>> Add support to use plugging if it is enabled, else use default path.
> >>>>>>
> >>>>>> The subject and this comment don't really explain what is done, and
> >>>>>> also don't mention at all why it is done.
> >>>>>
> >>>>> Missed out, will fix up. But plugging gave a very good hike to IOPS.
> >>>>
> >>>> But how does plugging improve IOPS here for passthrough request? Not
> >>>> see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
> >>>> which is called by nvme_submit_user_cmd().
> >>>
> >>> Yes, one tag at a time for each request, but none of the request gets
> >>> dispatched and instead added to the plug. And when io_uring ends the
> >>> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
> >>> to be via ->queue_rq, one request at a time).
> >>>
> >>> Only .plug impact looks like this on passthru-randread:
> >>>
> >>> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
> >>> Without plug         159    496    784      785
> >>> With plug            159    525    991      1044
> >>>
> >>> Hope it does clarify.
> >>
> >> OK, thanks for your confirmation, then the improvement should be from
> >> batch submission only.
> >>
> >> If cached request is enabled, I guess the number could be better.
> >
> > Yes, my original test patch pre-dates being able to set a submit count,
> > it would definitely help improve this case too. The current win is
> > indeed just from being able to use ->queue_rqs() rather than single
> > submit.
>
> Actually that is already there through io_uring, nothing extra is
> needed.

I meant in this patchset that plug->nr_ios isn't wired to data.nr_tags in
blk_mq_alloc_request(), which is called by pt request allocation, so
looks cached request isn't enabled for pt commands?

Thanks,
Ming
On 3/22/22 8:10 PM, Ming Lei wrote:
> On Tue, Mar 22, 2022 at 07:58:25PM -0600, Jens Axboe wrote:
>> On 3/22/22 7:41 PM, Jens Axboe wrote:
>>> On 3/22/22 7:27 PM, Ming Lei wrote:
>>>> On Mon, Mar 21, 2022 at 12:32:08PM +0530, Kanchan Joshi wrote:
>>>>> On Mon, Mar 14, 2022 at 10:40:53PM +0800, Ming Lei wrote:
>>>>>> On Thu, Mar 10, 2022 at 06:10:08PM +0530, Kanchan Joshi wrote:
>>>>>>> On Thu, Mar 10, 2022 at 2:04 PM Christoph Hellwig <hch@lst.de> wrote:
>>>>>>>>
>>>>>>>> On Tue, Mar 08, 2022 at 08:50:58PM +0530, Kanchan Joshi wrote:
>>>>>>>>> From: Jens Axboe <axboe@kernel.dk>
>>>>>>>>>
>>>>>>>>> Add support to use plugging if it is enabled, else use default path.
>>>>>>>>
>>>>>>>> The subject and this comment don't really explain what is done, and
>>>>>>>> also don't mention at all why it is done.
>>>>>>>
>>>>>>> Missed out, will fix up. But plugging gave a very good hike to IOPS.
>>>>>>
>>>>>> But how does plugging improve IOPS here for passthrough request? Not
>>>>>> see plug->nr_ios is wired to data.nr_tags in blk_mq_alloc_request(),
>>>>>> which is called by nvme_submit_user_cmd().
>>>>>
>>>>> Yes, one tag at a time for each request, but none of the request gets
>>>>> dispatched and instead added to the plug. And when io_uring ends the
>>>>> plug, the whole batch gets dispatched via ->queue_rqs (otherwise it used
>>>>> to be via ->queue_rq, one request at a time).
>>>>>
>>>>> Only .plug impact looks like this on passthru-randread:
>>>>>
>>>>> KIOPS(depth_batch)   1_1    8_2    64_16    128_32
>>>>> Without plug         159    496    784      785
>>>>> With plug            159    525    991      1044
>>>>>
>>>>> Hope it does clarify.
>>>>
>>>> OK, thanks for your confirmation, then the improvement should be from
>>>> batch submission only.
>>>>
>>>> If cached request is enabled, I guess the number could be better.
>>>
>>> Yes, my original test patch pre-dates being able to set a submit count,
>>> it would definitely help improve this case too. The current win is
>>> indeed just from being able to use ->queue_rqs() rather than single
>>> submit.
>>
>> Actually that is already there through io_uring, nothing extra is
>> needed.
>
> I meant in this patchset that plug->nr_ios isn't wired to data.nr_tags in
> blk_mq_alloc_request(), which is called by pt request allocation, so
> looks cached request isn't enabled for pt commands?

My point is that it is, since submission is through io_uring which does
exactly that.
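For reference, the "cached request" wiring under discussion looks roughly like
the pattern below, which the bio path already uses in blk_mq_get_new_requests():
a plug carrying nr_ios lets the allocator grab a batch of tags and park the
surplus requests in plug->cached_rq. This is only a sketch of what such wiring
could look like, written as if it lived in block/blk-mq.c (blk_mq_alloc_data and
__blk_mq_alloc_requests() are private to that file); pt_alloc_request_cached()
is a hypothetical name, and none of this is part of the posted series.

/* Sketch only; would sit in block/blk-mq.c next to blk_mq_alloc_request(). */
static struct request *pt_alloc_request_cached(struct request_queue *q,
                                               unsigned int op,
                                               blk_mq_req_flags_t flags)
{
        struct blk_mq_alloc_data data = {
                .q              = q,
                .flags          = flags,
                .cmd_flags      = op,
                .nr_tags        = 1,
        };
        struct blk_plug *plug = current->plug;

        if (plug) {
                /*
                 * Ask for a batch of tags; requests beyond the first are
                 * parked on plug->cached_rq so later allocations under the
                 * same plug can skip the tag allocator entirely.
                 */
                data.nr_tags    = plug->nr_ios;
                plug->nr_ios    = 1;
                data.cached_rq  = &plug->cached_rq;
        }

        return __blk_mq_alloc_requests(&data);
}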
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1adfe4824ef5..29f65eaf3e6b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2326,6 +2326,40 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	blk_mq_hctx_mark_pending(hctx, ctx);
 }
 
+/*
+ * Allow 2x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple
+ * queues. This is important for md arrays to benefit from merging
+ * requests.
+ */
+static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
+{
+	if (plug->multiple_queues)
+		return BLK_MAX_REQUEST_COUNT * 2;
+	return BLK_MAX_REQUEST_COUNT;
+}
+
+static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
+{
+	struct request *last = rq_list_peek(&plug->mq_list);
+
+	if (!plug->rq_count) {
+		trace_block_plug(rq->q);
+	} else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
+		   (!blk_queue_nomerges(rq->q) &&
+		    blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
+		blk_mq_flush_plug_list(plug, false);
+		trace_block_plug(rq->q);
+	}
+
+	if (!plug->multiple_queues && last && last->q != rq->q)
+		plug->multiple_queues = true;
+	if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
+		plug->has_elevator = true;
+	rq->rq_next = NULL;
+	rq_list_add(&plug->mq_list, rq);
+	plug->rq_count++;
+}
+
 /**
  * blk_mq_request_bypass_insert - Insert a request at dispatch list.
  * @rq: Pointer to request to be inserted.
@@ -2339,16 +2373,20 @@ void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
 				  bool run_queue)
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+	struct blk_plug *plug = current->plug;
 
-	spin_lock(&hctx->lock);
-	if (at_head)
-		list_add(&rq->queuelist, &hctx->dispatch);
-	else
-		list_add_tail(&rq->queuelist, &hctx->dispatch);
-	spin_unlock(&hctx->lock);
-
-	if (run_queue)
-		blk_mq_run_hw_queue(hctx, false);
+	if (plug) {
+		blk_add_rq_to_plug(plug, rq);
+	} else {
+		spin_lock(&hctx->lock);
+		if (at_head)
+			list_add(&rq->queuelist, &hctx->dispatch);
+		else
+			list_add_tail(&rq->queuelist, &hctx->dispatch);
+		spin_unlock(&hctx->lock);
+		if (run_queue)
+			blk_mq_run_hw_queue(hctx, false);
+	}
 }
 
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
@@ -2666,40 +2704,6 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
 		hctx->queue->mq_ops->commit_rqs(hctx);
 }
 
-/*
- * Allow 2x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple
- * queues. This is important for md arrays to benefit from merging
- * requests.
- */
-static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug)
-{
-	if (plug->multiple_queues)
-		return BLK_MAX_REQUEST_COUNT * 2;
-	return BLK_MAX_REQUEST_COUNT;
-}
-
-static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
-{
-	struct request *last = rq_list_peek(&plug->mq_list);
-
-	if (!plug->rq_count) {
-		trace_block_plug(rq->q);
-	} else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
-		   (!blk_queue_nomerges(rq->q) &&
-		    blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
-		blk_mq_flush_plug_list(plug, false);
-		trace_block_plug(rq->q);
-	}
-
-	if (!plug->multiple_queues && last && last->q != rq->q)
-		plug->multiple_queues = true;
-	if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
-		plug->has_elevator = true;
-	rq->rq_next = NULL;
-	rq_list_add(&plug->mq_list, rq);
-	plug->rq_count++;
-}
-
 static bool blk_mq_attempt_bio_merge(struct request_queue *q, struct bio *bio,
 				     unsigned int nr_segs)
 {