From patchwork Thu Aug 24 14:43:58 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364312
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 1/6] blk-mq: account active requests when get driver tag
Date: Thu, 24 Aug 2023 22:43:58 +0800
Message-ID: <20230824144403.2135739-2-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

There is a limitation that batched queue_rqs() can't work on a shared tags
queue, since the accounting of active requests can't be done there.
Currently we account active requests only in blk_mq_get_driver_tag(),
which (with the none elevator) is not actually where we get the driver tag.

To support batched queue_rqs() on shared tags queues, move the accounting
of active requests to where we actually get the driver tag:

1. none elevator: blk_mq_get_tags() and blk_mq_get_tag()
2. other elevators: __blk_mq_alloc_driver_tag()

This is clearer and matches the unaccounting side, which happens only when
we put the driver tag. Another benefit is that we no longer need the
RQF_MQ_INFLIGHT trick, which was used to avoid double accounting of the
flush request.
Now we only account when we actually get the driver tag, so all is good.
RQF_MQ_INFLIGHT will be removed in the next patch.

Signed-off-by: Chengming Zhou
---
 block/blk-mq.c | 33 +++++++++++------------------
 block/blk-mq.h | 56 ++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 53 insertions(+), 36 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ec922c6bccbe..bcdb750ef575 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -426,6 +426,8 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data)
 		rq_list_add(data->cached_rq, rq);
 		nr++;
 	}
+	if (!(data->rq_flags & RQF_SCHED_TAGS))
+		blk_mq_add_active_requests(data->hctx, nr);
 	/* caller already holds a reference, add for remainder */
 	percpu_ref_get_many(&data->q->q_usage_counter, nr - 1);
 	data->nr_tags -= nr;
@@ -510,6 +512,8 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 		goto retry;
 	}
 
+	if (!(data->rq_flags & RQF_SCHED_TAGS))
+		blk_mq_inc_active_requests(data->hctx);
 	rq = blk_mq_rq_ctx_init(data, blk_mq_tags_from_data(data), tag);
 	blk_mq_rq_time_init(rq, alloc_time_ns);
 	return rq;
@@ -669,6 +673,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 	tag = blk_mq_get_tag(&data);
 	if (tag == BLK_MQ_NO_TAG)
 		goto out_queue_exit;
+	if (!(data.rq_flags & RQF_SCHED_TAGS))
+		blk_mq_inc_active_requests(data.hctx);
 	rq = blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag);
 	blk_mq_rq_time_init(rq, alloc_time_ns);
 	rq->__data_len = 0;
@@ -708,11 +714,10 @@ static void __blk_mq_free_request(struct request *rq)
 		blk_pm_mark_last_busy(rq);
 	rq->mq_hctx = NULL;
 
-	if (rq->rq_flags & RQF_MQ_INFLIGHT)
-		__blk_mq_dec_active_requests(hctx);
-
-	if (rq->tag != BLK_MQ_NO_TAG)
+	if (rq->tag != BLK_MQ_NO_TAG) {
+		blk_mq_dec_active_requests(hctx);
 		blk_mq_put_tag(hctx->tags, ctx, rq->tag);
+	}
 	if (sched_tag != BLK_MQ_NO_TAG)
 		blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag);
 	blk_mq_sched_restart(hctx);
@@ -1065,8 +1070,7 @@ static inline void blk_mq_flush_tag_batch(struct blk_mq_hw_ctx *hctx,
 	 * All requests should have been marked as RQF_MQ_INFLIGHT, so
 	 * update hctx->nr_active in batch
 	 */
-	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
-		__blk_mq_sub_active_requests(hctx, nr_tags);
+	blk_mq_sub_active_requests(hctx, nr_tags);
 
 	blk_mq_put_tags(hctx->tags, tag_array, nr_tags);
 	percpu_ref_put_many(&q->q_usage_counter, nr_tags);
@@ -1748,7 +1752,7 @@ struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
 	return data.rq;
 }
 
-static bool __blk_mq_alloc_driver_tag(struct request *rq)
+bool __blk_mq_alloc_driver_tag(struct request *rq)
 {
 	struct sbitmap_queue *bt = &rq->mq_hctx->tags->bitmap_tags;
 	unsigned int tag_offset = rq->mq_hctx->tags->nr_reserved_tags;
@@ -1769,20 +1773,7 @@ static bool __blk_mq_alloc_driver_tag(struct request *rq)
 		return false;
 
 	rq->tag = tag + tag_offset;
-	return true;
-}
-
-bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq)
-{
-	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
-		return false;
-
-	if ((hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) &&
-	    !(rq->rq_flags & RQF_MQ_INFLIGHT)) {
-		rq->rq_flags |= RQF_MQ_INFLIGHT;
-		__blk_mq_inc_active_requests(hctx);
-	}
-	hctx->tags->rqs[rq->tag] = rq;
+	blk_mq_inc_active_requests(rq->mq_hctx);
 	return true;
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 1743857e0b01..560a76df290a 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -271,12 +271,18 @@ static inline int blk_mq_get_rq_budget_token(struct request *rq)
 	return -1;
 }
 
-static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
+static inline void __blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx,
+						int val)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_inc(&hctx->queue->nr_active_requests_shared_tags);
+		atomic_add(val, &hctx->queue->nr_active_requests_shared_tags);
 	else
-		atomic_inc(&hctx->nr_active);
+		atomic_add(val, &hctx->nr_active);
+}
+
+static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
+{
+	__blk_mq_add_active_requests(hctx, 1);
 }
 
 static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
@@ -293,6 +299,32 @@ static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
 	__blk_mq_sub_active_requests(hctx, 1);
 }
 
+static inline void blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx,
+					      int val)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_add_active_requests(hctx, val);
+}
+
+static inline void blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_inc_active_requests(hctx);
+}
+
+static inline void blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
+					      int val)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_sub_active_requests(hctx, val);
+}
+
+static inline void blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_dec_active_requests(hctx);
+}
+
 static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
@@ -302,13 +334,9 @@ static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
 static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
 					   struct request *rq)
 {
+	blk_mq_dec_active_requests(hctx);
 	blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
 	rq->tag = BLK_MQ_NO_TAG;
-
-	if (rq->rq_flags & RQF_MQ_INFLIGHT) {
-		rq->rq_flags &= ~RQF_MQ_INFLIGHT;
-		__blk_mq_dec_active_requests(hctx);
-	}
 }
 
 static inline void blk_mq_put_driver_tag(struct request *rq)
@@ -319,19 +347,17 @@ static inline void blk_mq_put_driver_tag(struct request *rq)
 	__blk_mq_put_driver_tag(rq->mq_hctx, rq);
 }
 
-bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq);
+bool __blk_mq_alloc_driver_tag(struct request *rq);
 
 static inline bool blk_mq_get_driver_tag(struct request *rq)
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
-	if (rq->tag != BLK_MQ_NO_TAG &&
-	    !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
-		hctx->tags->rqs[rq->tag] = rq;
-		return true;
-	}
+	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
+		return false;
 
-	return __blk_mq_get_driver_tag(hctx, rq);
+	hctx->tags->rqs[rq->tag] = rq;
+	return true;
 }
 
 static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)

From patchwork Thu Aug 24 14:43:59 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364315
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 2/6] blk-mq: remove RQF_MQ_INFLIGHT
Date: Thu, 24 Aug 2023 22:43:59 +0800
Message-ID: <20230824144403.2135739-3-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

Since the previous patch changed to account active requests only when we
really allocate the driver tag, RQF_MQ_INFLIGHT can be removed without
reintroducing the double-accounting problem:

1. none elevator: the flush request reuses the first pending request's
   driver tag, so it won't be accounted twice.
2. other elevators: the flush request is accounted when its driver tag is
   allocated at issue time, and unaccounted when the driver tag is put.
Signed-off-by: Chengming Zhou
---
 block/blk-flush.c      | 11 ++---------
 block/blk-mq-debugfs.c |  1 -
 block/blk-mq.c         |  4 ----
 include/linux/blk-mq.h |  2 --
 4 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index e73dc22d05c1..3f4d41952ef2 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -323,16 +323,9 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	flush_rq->mq_ctx = first_rq->mq_ctx;
 	flush_rq->mq_hctx = first_rq->mq_hctx;
 
-	if (!q->elevator) {
+	if (!q->elevator)
 		flush_rq->tag = first_rq->tag;
-
-		/*
-		 * We borrow data request's driver tag, so have to mark
-		 * this flush request as INFLIGHT for avoiding double
-		 * account of this driver tag
-		 */
-		flush_rq->rq_flags |= RQF_MQ_INFLIGHT;
-	} else
+	else
 		flush_rq->internal_tag = first_rq->internal_tag;
 	flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index c3b5930106b2..5cbeb9344f2f 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -246,7 +246,6 @@ static const char *const rqf_name[] = {
 	RQF_NAME(STARTED),
 	RQF_NAME(FLUSH_SEQ),
 	RQF_NAME(MIXED_MERGE),
-	RQF_NAME(MQ_INFLIGHT),
 	RQF_NAME(DONTPREP),
 	RQF_NAME(SCHED_TAGS),
 	RQF_NAME(USE_SCHED),

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bcdb750ef575..1d0459142f61 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1066,10 +1066,6 @@ static inline void blk_mq_flush_tag_batch(struct blk_mq_hw_ctx *hctx,
 {
 	struct request_queue *q = hctx->queue;
 
-	/*
-	 * All requests should have been marked as RQF_MQ_INFLIGHT, so
-	 * update hctx->nr_active in batch
-	 */
 	blk_mq_sub_active_requests(hctx, nr_tags);
 
 	blk_mq_put_tags(hctx->tags, tag_array, nr_tags);

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 958ed7e89b30..1ab3081c82ed 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -32,8 +32,6 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_FLUSH_SEQ		((__force req_flags_t)(1 << 4))
 /* merge of different types, fail separately */
 #define RQF_MIXED_MERGE		((__force req_flags_t)(1 << 5))
-/* track inflight for MQ */
-#define RQF_MQ_INFLIGHT		((__force req_flags_t)(1 << 6))
 /* don't call prep for this one */
 #define RQF_DONTPREP		((__force req_flags_t)(1 << 7))
 /* use hctx->sched_tags */

From patchwork Thu Aug 24 14:44:00 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364313
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 3/6] blk-mq: support batched queue_rqs() on shared tags queue
Date: Thu, 24 Aug 2023 22:44:00 +0800
Message-ID: <20230824144403.2135739-4-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

Since active requests are now accounted when driver tags are allocated,
this limitation can be removed.

Signed-off-by: Chengming Zhou
---
 block/blk-mq.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1d0459142f61..44595385b34c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2781,13 +2781,8 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	 * If we do, we can dispatch the whole plug list in one go. We
 	 * already know at this point that all requests belong to the
 	 * same queue, caller must ensure that's the case.
-	 *
-	 * Since we pass off the full list to the driver at this point,
-	 * we do not increment the active request count for the queue.
-	 * Bypass shared tags for now because of that.
	 */
-	if (q->mq_ops->queue_rqs &&
-	    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
+	if (q->mq_ops->queue_rqs) {
 		blk_mq_run_dispatch_ops(q,
 				__blk_mq_flush_plug_list(q, plug));
 		if (rq_list_empty(plug->mq_list))

From patchwork Thu Aug 24 14:44:01 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364316
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 4/6] blk-mq: update driver tags request table when start request
Date: Thu, 24 Aug 2023 22:44:01 +0800
Message-ID: <20230824144403.2135739-5-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

Currently we update the driver tags request table in
blk_mq_get_driver_tag(), so drivers that support queue_rqs() have to
update that inflight table themselves. Move the update to
blk_mq_start_request(), which is a better place: it is where we set up
the deadline for the request timeout check, and it is exactly where the
request becomes inflight.
Signed-off-by: Chengming Zhou
---
 block/blk-mq.c             | 1 +
 block/blk-mq.h             | 3 ---
 drivers/block/virtio_blk.c | 2 --
 drivers/nvme/host/pci.c    | 1 -
 4 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 44595385b34c..ff1b0f3ab3a8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1259,6 +1259,7 @@ void blk_mq_start_request(struct request *rq)
 	blk_add_timer(rq);
 	WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
+	rq->mq_hctx->tags->rqs[rq->tag] = rq;
 
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 	if (blk_integrity_rq(rq) && req_op(rq) == REQ_OP_WRITE)

diff --git a/block/blk-mq.h b/block/blk-mq.h
index 560a76df290a..f75a9ecfebde 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -351,12 +351,9 @@ bool __blk_mq_alloc_driver_tag(struct request *rq);
 
 static inline bool blk_mq_get_driver_tag(struct request *rq)
 {
-	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
-
 	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
 		return false;
 
-	hctx->tags->rqs[rq->tag] = rq;
 	return true;
 }

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 1fe011676d07..4689ac2e0c0e 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -470,8 +470,6 @@ static bool virtblk_prep_rq_batch(struct request *req)
 	struct virtio_blk *vblk = req->mq_hctx->queue->queuedata;
 	struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
 
-	req->mq_hctx->tags->rqs[req->tag] = req;
-
 	return virtblk_prep_rq(req->mq_hctx, vblk, req, vbr) == BLK_STS_OK;
 }

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 2f57da12d983..c2e942808eff 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -924,7 +924,6 @@ static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
 	if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
 		return false;
 
-	req->mq_hctx->tags->rqs[req->tag] = req;
 	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
 }

From patchwork Thu Aug 24 14:44:02 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364314
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 5/6] blk-mq: fix potential reorder of request state and deadline
Date: Thu, 24 Aug 2023 22:44:02 +0800
Message-ID: <20230824144403.2135739-6-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

The LOADs in blk_mq_req_expired() can race with the STOREs in
blk_mq_start_request():

	CPU0				CPU1
	blk_mq_start_request()		blk_mq_req_expired()
	  WRITE_ONCE(rq->deadline)
	  WRITE_ONCE(rq->state)
					  if (READ_ONCE(rq->state) != IN_FLIGHT)
						  return
					  deadline = READ_ONCE(rq->deadline)

If CPU1 speculatively reorders the rq->deadline LOAD before the
rq->state LOAD, the deadline it sees may still be the initial value 0.
That is, the effective order on CPU1 can become:

	CPU0				CPU1
	blk_mq_start_request()		blk_mq_req_expired()
					  deadline = READ_ONCE(rq->deadline)
	  WRITE_ONCE(rq->deadline)
	  WRITE_ONCE(rq->state)
					  if (READ_ONCE(rq->state) != IN_FLIGHT)
						  return

Fix it with an smp_wmb()/smp_rmb() pair, so that a request observed as
IN_FLIGHT is also observed with its deadline set.

Signed-off-by: Chengming Zhou
---
 block/blk-mq.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ff1b0f3ab3a8..49cbf826b100 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1258,6 +1258,8 @@ void blk_mq_start_request(struct request *rq)
 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
 	blk_add_timer(rq);
+	/* Pairs with smp_rmb() in blk_mq_req_expired(). */
+	smp_wmb();
 	WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
 	rq->mq_hctx->tags->rqs[rq->tag] = rq;
 
@@ -1568,6 +1570,12 @@ static bool blk_mq_req_expired(struct request *rq, struct blk_expired_data *expired)
 	if (rq->rq_flags & RQF_TIMED_OUT)
 		return false;
 
+	/*
+	 * Order the LOADs of rq->state and rq->deadline; pairs with
+	 * smp_wmb() in blk_mq_start_request().
+	 */
+	smp_rmb();
+
 	deadline = READ_ONCE(rq->deadline);
 	if (time_after_eq(expired->timeout_start, deadline))
 		return true;

From patchwork Thu Aug 24 14:44:03 2023
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13364317
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org,
	kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com,
	kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com
Subject: [PATCH 6/6] block/null_blk: add queue_rqs() support
Date: Thu, 24 Aug 2023 22:44:03 +0800
Message-ID: <20230824144403.2135739-7-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

Add batched mq_ops.queue_rqs() support to null_blk for testing. The
implementation is straightforward since null_blk doesn't have
commit_rqs(): we simply handle each request one by one; if errors are
encountered, leave those requests in the passed-in list and return.

This gives about a 3.6% IOPS improvement with fio/t/io_uring on null_blk
with hw_queue_depth=256 on my test VM, from 1.09M to 1.13M.
Signed-off-by: Chengming Zhou
---
 drivers/block/null_blk/main.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 864013019d6b..1b1b58d36707 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1742,6 +1742,25 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return null_handle_cmd(cmd, sector, nr_sectors, req_op(rq));
 }
 
+static void null_queue_rqs(struct request **rqlist)
+{
+	struct request *requeue_list = NULL;
+	struct request **requeue_lastp = &requeue_list;
+	struct blk_mq_queue_data bd = { };
+	blk_status_t ret;
+
+	do {
+		struct request *rq = rq_list_pop(rqlist);
+
+		bd.rq = rq;
+		ret = null_queue_rq(rq->mq_hctx, &bd);
+		if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
+			rq_list_add_tail(&requeue_lastp, rq);
+	} while (!rq_list_empty(*rqlist));
+
+	*rqlist = requeue_list;
+}
+
 static void cleanup_queue(struct nullb_queue *nq)
 {
 	bitmap_free(nq->tag_map);
@@ -1794,6 +1813,7 @@ static int null_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
 
 static const struct blk_mq_ops null_mq_ops = {
 	.queue_rq	= null_queue_rq,
+	.queue_rqs	= null_queue_rqs,
 	.complete	= null_complete_rq,
 	.timeout	= null_timeout_rq,
 	.poll		= null_poll,