From patchwork Tue Jun 27 12:08:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13294408
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F4202EB64D9 for ; Tue, 27 Jun 2023 12:09:31 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230356AbjF0MJa (ORCPT ); Tue, 27 Jun 2023 08:09:30 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230310AbjF0MJ3 (ORCPT ); Tue, 27 Jun 2023 08:09:29 -0400
X-Greylist: delayed 2198 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Tue, 27 Jun 2023 05:09:27 PDT
Received: from out-25.mta1.migadu.com (out-25.mta1.migadu.com [95.215.58.25]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6C92198D for ; Tue, 27 Jun 2023 05:09:27 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1687867766;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=jXwoOICLOShdX7GdT4p+4NbQyy/5jY0pTAO6DGHq6Ek=;
	b=deQwbwKzKSR21OXpf2PePuZ4sU+KKmFuah4oWQXe3F+ZpELyCJ15iLXDG6blxyzBfiIHs2
	2ywkmlUzuZIIgj1T6XEFXzPV3TvaD9FoHjGU2qNcBejs35N3PYWmVHzpMWJDxgoOgFwTAY
	M1eU9Wx8l5DBTCFFlNalCZjGl91D878=
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, tj@kernel.org, hch@lst.de, ming.lei@redhat.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 1/4] blk-mq: use percpu csd to remote complete instead of per-rq csd
Date: Tue, 27 Jun 2023 20:08:51 +0800
Message-Id: <20230627120854.971475-2-chengming.zhou@linux.dev>
In-Reply-To: <20230627120854.971475-1-chengming.zhou@linux.dev>
References: <20230627120854.971475-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

If a request needs to be completed remotely, we insert it into the
percpu llist and call smp_call_function_single_async() only if the
llist was previously empty. So we don't need a per-rq csd; a percpu csd
is enough. This also decreases the size of struct request by 24 bytes.

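For illustration only (this is not part of the patch), the "kick only
when the list was previously empty" pattern can be sketched in userspace
C. The names done_list, done_list_add(), complete_remote(), drain() and
ipi_count are hypothetical stand-ins for the percpu llist, llist_add(),
blk_mq_complete_send_ipi(), the remote handler and the IPI itself, and
C11 atomics are assumed in place of the kernel primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for llist_node / llist_head. */
struct node {
	struct node *next;
};

struct done_list {
	_Atomic(struct node *) first;
};

/* Counts simulated IPIs; stands in for smp_call_function_single_async(). */
static int ipi_count;

/* Like llist_add(): push a node, return true if the list was empty before. */
static bool done_list_add(struct done_list *l, struct node *n)
{
	struct node *old = atomic_load(&l->first);

	assert(n != NULL);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&l->first, &old, n));

	return old == NULL;
}

/* Like llist_del_all(): detach the whole list in one step. */
static struct node *done_list_del_all(struct done_list *l)
{
	return atomic_exchange(&l->first, NULL);
}

/* Only the caller that makes the list non-empty raises the "IPI". */
static void complete_remote(struct done_list *l, struct node *n)
{
	if (done_list_add(l, n))
		ipi_count++;
}

/* The handler drains everything queued so far; returns the node count. */
static int drain(struct done_list *l)
{
	int n = 0;

	for (struct node *p = done_list_del_all(l); p; p = p->next)
		n++;
	return n;
}
```

In this sketch, several completions queued before the handler runs
share one notification, which is the property the per-CPU csd relies
on: the csd identifies the target CPU's handler, not an individual
request.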
Signed-off-by: Chengming Zhou
---
 block/blk-mq.c         | 12 ++++++++----
 include/linux/blk-mq.h |  5 +----
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index decb6ab2d508..a36822479b94 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -43,6 +43,7 @@
 #include "blk-ioprio.h"
 
 static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
+static DEFINE_PER_CPU(struct __call_single_data, blk_cpu_csd);
 
 static void blk_mq_insert_request(struct request *rq, blk_insert_t flags);
 static void blk_mq_request_bypass_insert(struct request *rq,
@@ -1156,13 +1157,13 @@ static void blk_mq_complete_send_ipi(struct request *rq)
 {
 	struct llist_head *list;
 	unsigned int cpu;
+	struct __call_single_data *csd;
 
 	cpu = rq->mq_ctx->cpu;
 	list = &per_cpu(blk_cpu_done, cpu);
-	if (llist_add(&rq->ipi_list, list)) {
-		INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq);
-		smp_call_function_single_async(cpu, &rq->csd);
-	}
+	csd = &per_cpu(blk_cpu_csd, cpu);
+	if (llist_add(&rq->ipi_list, list))
+		smp_call_function_single_async(cpu, csd);
 }
 
 static void blk_mq_raise_softirq(struct request *rq)
@@ -4796,6 +4797,9 @@ static int __init blk_mq_init(void)
 
 	for_each_possible_cpu(i)
 		init_llist_head(&per_cpu(blk_cpu_done, i));
+	for_each_possible_cpu(i)
+		INIT_CSD(&per_cpu(blk_cpu_csd, i),
+			 __blk_mq_complete_request_remote, NULL);
 	open_softirq(BLOCK_SOFTIRQ, blk_done_softirq);
 
 	cpuhp_setup_state_nocalls(CPUHP_BLOCK_SOFTIRQ_DEAD,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index f401067ac03a..070551197c0e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -182,10 +182,7 @@ struct request {
 		rq_end_io_fn		*saved_end_io;
 	} flush;
 
-	union {
-		struct __call_single_data csd;
-		u64 fifo_time;
-	};
+	u64 fifo_time;
 
 	/*
 	 * completion callback.

From patchwork Tue Jun 27 12:08:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13294409
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA0F3EB64DC for ; Tue, 27 Jun 2023 12:09:36 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231860AbjF0MJe (ORCPT ); Tue, 27 Jun 2023 08:09:34 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35224 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231842AbjF0MJc (ORCPT ); Tue, 27 Jun 2023 08:09:32 -0400
Received: from out-47.mta1.migadu.com (out-47.mta1.migadu.com [IPv6:2001:41d0:203:375::2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E56F819A2 for ; Tue, 27 Jun 2023 05:09:30 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1687867769;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=V2dtlX5xWTLahOFf3Fslyq5xkemB+V+IMRf77iTil6I=;
	b=mkDkI0OESoDGaRtSLk6gglXQ8TbHSu8npIrgxY/bH+9lX3lHz7L06NPIVLmBjq2FznIp2j
	e7/Ly4MiXHQSDMgZxWpui2OyH2v5CmV8MP2fKUo3cEnhkkafy9L56fa00s2rfwN7d491DI
	eOkqpf/CRyHOsxLSehrXVlPGFvIboUE=
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, tj@kernel.org, hch@lst.de, ming.lei@redhat.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 2/4] blk-flush: count inflight flush_data requests
Date: Tue, 27 Jun 2023 20:08:52 +0800
Message-Id: <20230627120854.971475-3-chengming.zhou@linux.dev>
In-Reply-To: <20230627120854.971475-1-chengming.zhou@linux.dev>
References: <20230627120854.971475-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

The flush state machine uses a doubly linked list to link all inflight
flush_data requests, to avoid issuing separate post-flushes for
flush_data requests that share a PREFLUSH.

So we can't reuse rq->queuelist for it, which is why we need
rq->flush.list.

In preparation for the next patch, which reuses rq->queuelist in the
flush state machine, change the doubly linked list to an unsigned long
counter that counts all inflight flush_data requests.

This is OK since we only need to know whether there is any inflight
flush_data request, so a counter is enough. The only problem I can
think of is that the counter may overflow, which should be unlikely to
happen.

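As a rough illustration (not part of the patch), the counter-based
bookkeeping can be sketched as below. The struct and function names are
hypothetical, locking is omitted, and the timeout is passed in as a
plain boolean instead of being computed from jiffies. Since the counter
is decremented on every flush_data completion, it only ever holds the
number of requests currently in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct blk_flush_queue. */
struct flush_queue_sketch {
	unsigned long data_in_flight;
};

/* A flush_data request was handed to the driver. */
static void data_issued(struct flush_queue_sketch *fq)
{
	fq->data_in_flight++;
}

/* A flush_data request completed; the count must never underflow. */
static void data_completed(struct flush_queue_sketch *fq)
{
	assert(fq->data_in_flight > 0);
	fq->data_in_flight--;
}

/*
 * Mirrors the "C2 and C3" check: hold back the flush while data
 * requests are in flight, unless the pending timeout has expired.
 */
static bool should_defer_flush(const struct flush_queue_sketch *fq,
			       bool timeout_expired)
{
	return fq->data_in_flight && !timeout_expired;
}
```

The check only asks "is the count non-zero?", which is the same
question list_empty() answered before, so the list membership itself
carries no extra information for this decision.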
Signed-off-by: Chengming Zhou
---
 block/blk-flush.c | 9 +++++----
 block/blk.h       | 5 ++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index dba392cf22be..bb7adfc2a5da 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -187,7 +187,8 @@ static void blk_flush_complete_seq(struct request *rq,
 		break;
 
 	case REQ_FSEQ_DATA:
-		list_move_tail(&rq->flush.list, &fq->flush_data_in_flight);
+		list_del_init(&rq->flush.list);
+		fq->flush_data_in_flight++;
 		spin_lock(&q->requeue_lock);
 		list_add_tail(&rq->queuelist, &q->flush_list);
 		spin_unlock(&q->requeue_lock);
@@ -299,7 +300,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 		return;
 
 	/* C2 and C3 */
-	if (!list_empty(&fq->flush_data_in_flight) &&
+	if (fq->flush_data_in_flight &&
 	    time_before(jiffies,
 			fq->flush_pending_since + FLUSH_PENDING_TIMEOUT))
 		return;
@@ -374,6 +375,7 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
 	 * the comment in flush_end_io().
 	 */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
+	fq->flush_data_in_flight--;
 	blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error);
 	spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
 
@@ -445,7 +447,7 @@ bool blk_insert_flush(struct request *rq)
 		blk_rq_init_flush(rq);
 		rq->flush.seq |= REQ_FSEQ_POSTFLUSH;
 		spin_lock_irq(&fq->mq_flush_lock);
-		list_move_tail(&rq->flush.list, &fq->flush_data_in_flight);
+		fq->flush_data_in_flight++;
 		spin_unlock_irq(&fq->mq_flush_lock);
 		return false;
 	default:
@@ -496,7 +498,6 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 
 	INIT_LIST_HEAD(&fq->flush_queue[0]);
 	INIT_LIST_HEAD(&fq->flush_queue[1]);
-	INIT_LIST_HEAD(&fq->flush_data_in_flight);
 
 	return fq;
 
diff --git a/block/blk.h b/block/blk.h
index 608c5dcc516b..686712e13835 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -15,15 +15,14 @@ struct elevator_type;
 extern struct dentry *blk_debugfs_root;
 
 struct blk_flush_queue {
+	spinlock_t		mq_flush_lock;
 	unsigned int		flush_pending_idx:1;
 	unsigned int		flush_running_idx:1;
 	blk_status_t		rq_status;
 	unsigned long		flush_pending_since;
 	struct list_head	flush_queue[2];
-	struct list_head	flush_data_in_flight;
+	unsigned long		flush_data_in_flight;
 	struct request		*flush_rq;
-
-	spinlock_t		mq_flush_lock;
 };
 
 bool is_flush_rq(struct request *req);

From patchwork Tue Jun 27 12:08:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13294410
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F50DEB64DC for ; Tue, 27 Jun 2023 12:09:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231843AbjF0MJg (ORCPT ); Tue, 27 Jun 2023 08:09:36 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35236 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231854AbjF0MJe (ORCPT ); Tue, 27 Jun 2023 08:09:34 -0400
Received: from out-49.mta1.migadu.com (out-49.mta1.migadu.com [IPv6:2001:41d0:203:375::31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 754CC1FEE for ; Tue, 27 Jun 2023 05:09:33 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1687867771;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=hdKaqogn/I3BFaAaWb1R9T3dUmGgvZb8Wo54xKMjRbQ=;
	b=tKbN3wBPkiIT2z6p9v+pV9wWBJg62XsRXb4rg80bF4qg7PWSmOn1tZ3NRdGzEVvbmXaWgj
	KnTVANboxGHno6nqoVg7aZ62GhWJna8xNAE8KxQ/rBtIeqCaQJkUT4r8I3VSkGkgxZ81hm
	bGgi45C5DAR53qM4s1Z2kCZ4qnc9+Es=
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, tj@kernel.org, hch@lst.de, ming.lei@redhat.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 3/4] blk-flush: reuse rq queuelist in flush state machine
Date: Tue, 27 Jun 2023 20:08:53 +0800
Message-Id: <20230627120854.971475-4-chengming.zhou@linux.dev>
In-Reply-To: <20230627120854.971475-1-chengming.zhou@linux.dev>
References: <20230627120854.971475-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

Since we no longer need to maintain a list of inflight flush_data
requests, we can reuse rq->queuelist for the flush pending list.

This patch decreases the size of struct request by 16 bytes.

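The 16 bytes correspond to one struct list_head (two pointers) on a
64-bit build. A small sketch with hypothetical mirror types (not the
kernel definitions) shows the arithmetic; flush_before and flush_after
approximate the rq->flush member with and without the list field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of struct list_head: two pointers. */
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Hypothetical stand-in for rq_end_io_fn. */
typedef int end_io_fn_sketch(void *rq);

/* Approximates rq->flush before the patch. */
struct flush_before {
	unsigned int		seq;
	struct list_head_sketch	list;
	end_io_fn_sketch	*saved_end_io;
};

/* Approximates rq->flush after the patch. */
struct flush_after {
	unsigned int		seq;
	end_io_fn_sketch	*saved_end_io;
};

static_assert(sizeof(struct list_head_sketch) == 2 * sizeof(void *),
	      "list head is assumed to be exactly two pointers");

/* Bytes saved by dropping the embedded list head. */
static size_t flush_bytes_saved(void)
{
	return sizeof(struct flush_before) - sizeof(struct flush_after);
}
```

With 8-byte pointers the difference is two pointers, i.e. 16 bytes,
matching the figure above; with 4-byte pointers it would be 8.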
Signed-off-by: Chengming Zhou
---
 block/blk-flush.c      | 12 +++++-------
 include/linux/blk-mq.h |  1 -
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index bb7adfc2a5da..81588edbe8b0 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -183,14 +183,13 @@ static void blk_flush_complete_seq(struct request *rq,
 		/* queue for flush */
 		if (list_empty(pending))
 			fq->flush_pending_since = jiffies;
-		list_move_tail(&rq->flush.list, pending);
+		list_move_tail(&rq->queuelist, pending);
 		break;
 
 	case REQ_FSEQ_DATA:
-		list_del_init(&rq->flush.list);
 		fq->flush_data_in_flight++;
 		spin_lock(&q->requeue_lock);
-		list_add_tail(&rq->queuelist, &q->flush_list);
+		list_move_tail(&rq->queuelist, &q->flush_list);
 		spin_unlock(&q->requeue_lock);
 		blk_mq_kick_requeue_list(q);
 		break;
@@ -202,7 +201,7 @@ static void blk_flush_complete_seq(struct request *rq,
 		 * flush data request completion path. Restore @rq for
 		 * normal completion and end it.
 		 */
-		list_del_init(&rq->flush.list);
+		list_del_init(&rq->queuelist);
 		blk_flush_restore_request(rq);
 		blk_mq_end_request(rq, error);
 		break;
@@ -258,7 +257,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
 	fq->flush_running_idx ^= 1;
 
 	/* and push the waiting requests to the next stage */
-	list_for_each_entry_safe(rq, n, running, flush.list) {
+	list_for_each_entry_safe(rq, n, running, queuelist) {
 		unsigned int seq = blk_flush_cur_seq(rq);
 
 		BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
@@ -292,7 +291,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 {
 	struct list_head *pending = &fq->flush_queue[fq->flush_pending_idx];
 	struct request *first_rq =
-		list_first_entry(pending, struct request, flush.list);
+		list_first_entry(pending, struct request, queuelist);
 	struct request *flush_rq = fq->flush_rq;
 
 	/* C1 described at the top of this file */
@@ -386,7 +385,6 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
 static void blk_rq_init_flush(struct request *rq)
 {
 	rq->flush.seq = 0;
-	INIT_LIST_HEAD(&rq->flush.list);
 	rq->rq_flags |= RQF_FLUSH_SEQ;
 	rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
 	rq->end_io = mq_flush_data_end_io;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 070551197c0e..96644d6f8d18 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -178,7 +178,6 @@ struct request {
 
 	struct {
 		unsigned int		seq;
-		struct list_head	list;
 		rq_end_io_fn		*saved_end_io;
 	} flush;
 

From patchwork Tue Jun 27 12:08:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chengming Zhou
X-Patchwork-Id: 13294411
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75A37EB64DC for ; Tue, 27 Jun 2023 12:09:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230429AbjF0MJz (ORCPT ); Tue, 27 Jun 2023 08:09:55 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231872AbjF0MJh (ORCPT ); Tue, 27 Jun 2023 08:09:37 -0400
Received: from out-46.mta1.migadu.com (out-46.mta1.migadu.com [95.215.58.46]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C9421BE5 for ; Tue, 27 Jun 2023 05:09:35 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1687867774;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=/O8YJWAa7A//A8YJ1hLAuWA8H/d3IRDfJMQYLuFl59Q=;
	b=BKYs2RGGfguSyGgb3IkV0RPbcq3qcyqARGTaYepvA/FA23tWcM1aCKi9+krWCYMIr76uDo
	0rexnwxqzl8F4RZ+cp2FB7HOANASVMlE9UM3j94Z0oNZIF2TrJ5THWKMJPYTCCYgFm75hz
	mfYbjzBSr1YbbergYLsysrTqqxOQqg0=
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, tj@kernel.org, hch@lst.de, ming.lei@redhat.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 4/4] blk-mq: delete unused completion_data in struct request
Date: Tue, 27 Jun 2023 20:08:54 +0800
Message-Id: <20230627120854.971475-5-chengming.zhou@linux.dev>
In-Reply-To: <20230627120854.971475-1-chengming.zhou@linux.dev>
References: <20230627120854.971475-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Chengming Zhou

After a global search, I found that "completion_data" in struct request
is not used anywhere, so clean it up while we are here.

Signed-off-by: Chengming Zhou
---
 include/linux/blk-mq.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 96644d6f8d18..ab790eba5fcf 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -158,13 +158,11 @@ struct request {
 
 	/*
 	 * The rb_node is only used inside the io scheduler, requests
-	 * are pruned when moved to the dispatch queue. So let the
-	 * completion_data share space with the rb_node.
+	 * are pruned when moved to the dispatch queue.
 	 */
 	union {
 		struct rb_node rb_node;	/* sort/lookup */
 		struct bio_vec special_vec;
-		void *completion_data;
 	};
 
 	/*