From patchwork Thu Jun 23 23:25:58 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 1/6] block: Document blk_queue_zone_is_seq() and blk_rq_zone_is_seq()
Date: Thu, 23 Jun 2022 16:25:58 -0700
Message-Id: <20220623232603.3751912-2-bvanassche@acm.org>

Since it is nontrivial to figure out how blk_queue_zone_is_seq() and
blk_rq_zone_is_seq() handle sequential write preferred zones, document
this.
Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
---
 include/linux/blk-mq.h | 7 +++++++
 include/linux/blkdev.h | 9 +++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e2d9daf7e8dd..909d47e34b7c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -1124,6 +1124,13 @@ static inline unsigned int blk_rq_zone_no(struct request *rq)
 	return blk_queue_zone_no(rq->q, blk_rq_pos(rq));
 }
 
+/**
+ * blk_rq_zone_is_seq() - Whether a request is for a sequential zone.
+ * @rq: Request pointer.
+ *
+ * Return: true if and only if blk_rq_pos(@rq) refers either to a sequential
+ * write required or a sequential write preferred zone.
+ */
 static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
 {
 	return blk_queue_zone_is_seq(rq->q, blk_rq_pos(rq));
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 92b3bffad328..2904100d2485 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -688,6 +688,15 @@ static inline unsigned int blk_queue_zone_no(struct request_queue *q,
 	return sector >> ilog2(q->limits.chunk_sectors);
 }
 
+/**
+ * blk_queue_zone_is_seq() - Whether a logical block is in a sequential zone.
+ * @q: Request queue pointer.
+ * @sector: Offset from start of block device in 512 byte units.
+ *
+ * Return: true if and only if @q is associated with a zoned block device and
+ * @sector refers either to a sequential write required or a sequential write
+ * preferred zone.
+ */
 static inline bool blk_queue_zone_is_seq(struct request_queue *q,
 					 sector_t sector)
 {
From patchwork Thu Jun 23 23:25:59 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 2/6] block: Introduce the blk_rq_is_zoned_seq_write() function
Date: Thu, 23 Jun 2022 16:25:59 -0700
Message-Id: <20220623232603.3751912-3-bvanassche@acm.org>

Introduce a function that makes it easy to verify whether a write
request is for a sequential write required or sequential write
preferred zone. Simplify blk_req_needs_zone_write_lock() by using the
new function.

Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
---
 block/blk-zoned.c      | 14 +-------------
 include/linux/blk-mq.h | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 38cd840d8838..cafcbc508dfb 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -57,19 +57,7 @@ EXPORT_SYMBOL_GPL(blk_zone_cond_str);
  */
 bool blk_req_needs_zone_write_lock(struct request *rq)
 {
-	if (!rq->q->seq_zones_wlock)
-		return false;
-
-	if (blk_rq_is_passthrough(rq))
-		return false;
-
-	switch (req_op(rq)) {
-	case REQ_OP_WRITE_ZEROES:
-	case REQ_OP_WRITE:
-		return blk_rq_zone_is_seq(rq);
-	default:
-		return false;
-	}
+	return rq->q->seq_zones_wlock && blk_rq_is_zoned_seq_write(rq);
 }
 EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 909d47e34b7c..d5930797b84d 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -1136,6 +1136,24 @@ static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
 	return blk_queue_zone_is_seq(rq->q, blk_rq_pos(rq));
 }
 
+/**
+ * blk_rq_is_zoned_seq_write() - Whether @rq is a write request for a sequential zone.
+ * @rq: Request to examine.
+ *
+ * In this context sequential zone means either a sequential write required or
+ * a sequential write preferred zone.
+ */
+static inline bool blk_rq_is_zoned_seq_write(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+		return blk_rq_zone_is_seq(rq);
+	default:
+		return false;
+	}
+}
+
 bool blk_req_needs_zone_write_lock(struct request *rq);
 bool blk_req_zone_write_trylock(struct request *rq);
 void __blk_req_zone_write_lock(struct request *rq);
@@ -1166,6 +1184,11 @@ static inline bool blk_req_can_dispatch_to_zone(struct request *rq)
 	return !blk_req_zone_is_write_locked(rq);
 }
 #else /* CONFIG_BLK_DEV_ZONED */
+static inline bool blk_rq_is_zoned_seq_write(struct request *rq)
+{
+	return false;
+}
+
 static inline bool blk_req_needs_zone_write_lock(struct request *rq)
 {
 	return false;
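To make the intended use of the new helper concrete, here is a minimal
sketch of a driver-side caller. example_queue_rq() and its surrounding
driver are hypothetical; only blk_rq_is_zoned_seq_write() (added by this
patch) and the standard blk-mq types and calls come from the kernel.

#include <linux/blk-mq.h>

/* Hypothetical blk-mq .queue_rq() callback; illustrative only. */
static blk_status_t example_queue_rq(struct blk_mq_hw_ctx *hctx,
				     const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;

	blk_mq_start_request(rq);

	/*
	 * Writes to sequential write required or preferred zones may need
	 * zone-aware handling, e.g. validation against a cached write
	 * pointer; all other requests can take the regular path.
	 */
	if (blk_rq_is_zoned_seq_write(rq)) {
		/* ... zone-aware submission path ... */
	}

	/* ... hand the request to the hardware ... */
	return BLK_STS_OK;
}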
From patchwork Thu Jun 23 23:26:00 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 3/6] block: Introduce a request queue flag for pipelining zoned writes
Date: Thu, 23 Jun 2022 16:26:00 -0700
Message-Id: <20220623232603.3751912-4-bvanassche@acm.org>

Writes in sequential write required zones must happen at the write
pointer. Even if the submitter of the write commands (e.g. a
filesystem) submits writes for sequential write required zones in
order, the block layer or the storage controller may reorder these
write commands. The zone locking mechanism in the mq-deadline I/O
scheduler serializes write commands for sequential zones. Some but not
all storage controllers require this serialization. Introduce a new
flag such that block drivers can request pipelining of writes for
sequential write required zones.

An example of a storage controller standard that requires write
serialization is AHCI (Advanced Host Controller Interface). Submitting
commands to AHCI controllers happens by writing a bit pattern into a
register. Each set bit corresponds to an active command. This mechanism
does not preserve command ordering information.

Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
---
 include/linux/blkdev.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2904100d2485..fcaa06b9c65a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -581,6 +581,8 @@ struct request_queue {
 #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
 #define QUEUE_FLAG_NOWAIT	29	/* device supports NOWAIT */
 #define QUEUE_FLAG_SQ_SCHED	30	/* single queue style io dispatch */
+/* Writes for sequential write required zones may be pipelined. */
+#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 31
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP) |		\
@@ -624,6 +626,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_nowait(q)	test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)
 #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 
+static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
+{
+	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &(q)->queue_flags);
+}
+
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
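For a concrete idea of how the flag is meant to be consumed, here is a
minimal sketch of a driver opting in during queue setup, assuming its
controller preserves the order of submitted writes. example_init_queue()
is a hypothetical name; the blk_queue_flag_set() call itself mirrors what
the null_blk patch later in this series does.

#include <linux/blkdev.h>

/* Hypothetical queue setup helper; only set the flag when the controller
 * preserves the order in which zoned writes are submitted. */
static void example_init_queue(struct request_queue *q)
{
	blk_queue_flag_set(QUEUE_FLAG_PIPELINE_ZONED_WRITES, q);
}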
From patchwork Thu Jun 23 23:26:01 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 4/6] block/mq-deadline: Only use zone locking if necessary
Date: Thu, 23 Jun 2022 16:26:01 -0700
Message-Id: <20220623232603.3751912-5-bvanassche@acm.org>

Measurements have shown that limiting the queue depth to one for zoned
writes has a significant negative performance impact on zoned UFS
devices. Hence this patch, which disables zone locking in the
mq-deadline scheduler for storage controllers that support pipelining
zoned writes. This patch is based on the following assumptions:
- Applications submit write requests to sequential write required
  zones in order.
- The I/O priority of all pipelined write requests is the same per
  zone.
- If such write requests get reordered by the software or hardware
  queue mechanism, nr_hw_queues * nr_requests - 1 retries are
  sufficient to reorder the write requests.
- It happens infrequently that zoned write requests are reordered by
  the block layer.
- Either no I/O scheduler is used or an I/O scheduler is used that
  submits write requests per zone in LBA order.

See also commit 5700f69178e9 ("mq-deadline: Introduce zone locking
support").
Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
---
 block/blk-zoned.c   |  3 ++-
 block/mq-deadline.c | 15 +++++++++------
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index cafcbc508dfb..88a0610ba0c3 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -513,7 +513,8 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx,
 		break;
 	case BLK_ZONE_TYPE_SEQWRITE_REQ:
 	case BLK_ZONE_TYPE_SEQWRITE_PREF:
-		if (!args->seq_zones_wlock) {
+		if (!blk_queue_pipeline_zoned_writes(q) &&
+		    !args->seq_zones_wlock) {
 			args->seq_zones_wlock = blk_alloc_zone_bitmap(q->node,
 						args->nr_zones);
 			if (!args->seq_zones_wlock)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 1a9e835e816c..8ab9694c8f3a 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -292,7 +292,7 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		return NULL;
 
 	rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || blk_queue_pipeline_zoned_writes(rq->q))
 		return rq;
 
 	/*
@@ -326,7 +326,7 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	if (!rq)
 		return NULL;
 
-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	if (data_dir == DD_READ || blk_queue_pipeline_zoned_writes(rq->q))
 		return rq;
 
 	/*
@@ -445,8 +445,9 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	}
 
 	/*
-	 * For a zoned block device, if we only have writes queued and none of
-	 * them can be dispatched, rq will be NULL.
+	 * For a zoned block device that requires write serialization, if we
+	 * only have writes queued and none of them can be dispatched, rq will
+	 * be NULL.
 	 */
 	if (!rq)
 		return NULL;
@@ -719,6 +720,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
 	struct dd_per_prio *per_prio;
 	enum dd_prio prio;
+	bool pipelined_seq_write = blk_queue_pipeline_zoned_writes(q) &&
+		blk_rq_is_zoned_seq_write(rq);
 	LIST_HEAD(free);
 
 	lockdep_assert_held(&dd->lock);
@@ -743,7 +746,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	trace_block_rq_insert(rq);
 
-	if (at_head) {
+	if (at_head && !pipelined_seq_write) {
 		list_add(&rq->queuelist, &per_prio->dispatch);
 		rq->fifo_time = jiffies;
 	} else {
@@ -823,7 +826,7 @@ static void dd_finish_request(struct request *rq)
 
 	atomic_inc(&per_prio->stats.completed);
 
-	if (blk_queue_is_zoned(q)) {
+	if (!blk_queue_pipeline_zoned_writes(q)) {
 		unsigned long flags;
 
 		spin_lock_irqsave(&dd->zone_lock, flags);
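Taken together with patches 2 and 3, the change can be restated as
follows: when the pipelining flag is set, blk_revalidate_zone_cb() never
allocates the seq_zones_wlock bitmap, so blk_req_needs_zone_write_lock()
always returns false and mq-deadline stops serializing zoned writes. The
sketch below is explanatory pseudocode only; zoned_write_is_serialized()
does not exist in the kernel.

/* Explanatory restatement of the combined behavior of patches 2-4. */
static bool zoned_write_is_serialized(struct request *rq)
{
	/*
	 * Pipelining requested: seq_zones_wlock is never allocated, so the
	 * zone write lock degenerates to a no-op.
	 */
	if (blk_queue_pipeline_zoned_writes(rq->q))
		return false;

	/* Otherwise the classic zone locking applies to zoned sequential writes. */
	return blk_rq_is_zoned_seq_write(rq);
}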
From patchwork Thu Jun 23 23:26:02 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 5/6] block/null_blk: Refactor null_queue_rq()
Date: Thu, 23 Jun 2022 16:26:02 -0700
Message-Id: <20220623232603.3751912-6-bvanassche@acm.org>

Introduce a local variable for the expression bd->rq since that
expression occurs multiple times. This patch does not change any
functionality.
Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
---
 drivers/block/null_blk/main.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 6b67088f4ea7..fd68e6f4637f 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1609,10 +1609,11 @@ static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
 static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 			 const struct blk_mq_queue_data *bd)
 {
-	struct nullb_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
+	struct request *rq = bd->rq;
+	struct nullb_cmd *cmd = blk_mq_rq_to_pdu(rq);
 	struct nullb_queue *nq = hctx->driver_data;
-	sector_t nr_sectors = blk_rq_sectors(bd->rq);
-	sector_t sector = blk_rq_pos(bd->rq);
+	sector_t nr_sectors = blk_rq_sectors(rq);
+	sector_t sector = blk_rq_pos(rq);
 	const bool is_poll = hctx->type == HCTX_TYPE_POLL;
 
 	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
@@ -1621,14 +1622,14 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 		hrtimer_init(&cmd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 		cmd->timer.function = null_cmd_timer_expired;
 	}
-	cmd->rq = bd->rq;
+	cmd->rq = rq;
 	cmd->error = BLK_STS_OK;
 	cmd->nq = nq;
-	cmd->fake_timeout = should_timeout_request(bd->rq);
+	cmd->fake_timeout = should_timeout_request(rq);
 
-	blk_mq_start_request(bd->rq);
+	blk_mq_start_request(rq);
 
-	if (should_requeue_request(bd->rq)) {
+	if (should_requeue_request(rq)) {
 		/*
 		 * Alternate between hitting the core BUSY path, and the
 		 * driver driven requeue path
@@ -1637,21 +1638,21 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 		if (nq->requeue_selection & 1)
 			return BLK_STS_RESOURCE;
 		else {
-			blk_mq_requeue_request(bd->rq, true);
+			blk_mq_requeue_request(rq, true);
 			return BLK_STS_OK;
 		}
 	}
 
 	if (is_poll) {
 		spin_lock(&nq->poll_lock);
-		list_add_tail(&bd->rq->queuelist, &nq->poll_list);
+		list_add_tail(&rq->queuelist, &nq->poll_list);
 		spin_unlock(&nq->poll_lock);
 		return BLK_STS_OK;
 	}
 	if (cmd->fake_timeout)
 		return BLK_STS_OK;
 
-	return null_handle_cmd(cmd, sector, nr_sectors, req_op(bd->rq));
+	return null_handle_cmd(cmd, sector, nr_sectors, req_op(rq));
 }
 
 static void cleanup_queue(struct nullb_queue *nq)
From patchwork Thu Jun 23 23:26:03 2022
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Bart Van Assche, Damien Le Moal
Subject: [PATCH v2 6/6] block/null_blk: Add support for pipelining zoned writes
Date: Thu, 23 Jun 2022 16:26:03 -0700
Message-Id: <20220623232603.3751912-7-bvanassche@acm.org>

Add a new configfs attribute for enabling pipelining of zoned writes.
If that attribute has been set, retry zoned writes that are not aligned
with the write pointer. The test script below reports 236 K IOPS with
no I/O scheduler, 81 K IOPS with mq-deadline and pipelining disabled,
and 121 K IOPS with mq-deadline and pipelining enabled (+49%).
#!/bin/bash

for d in /sys/kernel/config/nullb/*; do
    [ -d "$d" ] && rmdir "$d"
done
modprobe -r null_blk
set -e
modprobe null_blk nr_devices=0
udevadm settle
(
    cd /sys/kernel/config/nullb
    mkdir nullb0
    cd nullb0
    params=(
        completion_nsec=100000
        hw_queue_depth=64
        irqmode=2
        memory_backed=1
        pipeline_zoned_writes=1
        size=1
        submit_queues=1
        zone_size=1
        zoned=1
        power=1
    )
    for p in "${params[@]}"; do
        echo "${p//*=}" > "${p//=*}"
    done
)
params=(
    --direct=1
    --filename=/dev/nullb0
    --iodepth=64
    --iodepth_batch=16
    --ioengine=io_uring
    --ioscheduler=mq-deadline
    --hipri=0
    --name=nullb0
    --runtime=30
    --rw=write
    --time_based=1
    --zonemode=zbd
)
fio "${params[@]}"

Cc: Damien Le Moal
Signed-off-by: Bart Van Assche
---
 drivers/block/null_blk/main.c     | 9 +++++++++
 drivers/block/null_blk/null_blk.h | 3 +++
 drivers/block/null_blk/zoned.c    | 4 +++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index fd68e6f4637f..d5fc651ffc3d 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -408,6 +408,7 @@ NULLB_DEVICE_ATTR(zone_capacity, ulong, NULL);
 NULLB_DEVICE_ATTR(zone_nr_conv, uint, NULL);
 NULLB_DEVICE_ATTR(zone_max_open, uint, NULL);
 NULLB_DEVICE_ATTR(zone_max_active, uint, NULL);
+NULLB_DEVICE_ATTR(pipeline_zoned_writes, bool, NULL);
 NULLB_DEVICE_ATTR(virt_boundary, bool, NULL);
 
 static ssize_t nullb_device_power_show(struct config_item *item, char *page)
@@ -531,6 +532,7 @@ static struct configfs_attribute *nullb_device_attrs[] = {
 	&nullb_device_attr_zone_nr_conv,
 	&nullb_device_attr_zone_max_open,
 	&nullb_device_attr_zone_max_active,
+	&nullb_device_attr_pipeline_zoned_writes,
 	&nullb_device_attr_virt_boundary,
 	NULL,
 };
@@ -1626,6 +1628,11 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 	cmd->error = BLK_STS_OK;
 	cmd->nq = nq;
 	cmd->fake_timeout = should_timeout_request(rq);
+	if (!(rq->rq_flags & RQF_DONTPREP)) {
+		rq->rq_flags |= RQF_DONTPREP;
+		cmd->retries = 0;
+		cmd->max_attempts = rq->q->nr_hw_queues * rq->q->nr_requests;
+	}
 
 	blk_mq_start_request(rq);
 
@@ -2042,6 +2049,8 @@ static int null_add_dev(struct nullb_device *dev)
 	nullb->q->queuedata = nullb;
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, nullb->q);
 	blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, nullb->q);
+	if (dev->pipeline_zoned_writes)
+		blk_queue_flag_set(QUEUE_FLAG_PIPELINE_ZONED_WRITES, nullb->q);
 
 	mutex_lock(&lock);
 	nullb->index = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
diff --git a/drivers/block/null_blk/null_blk.h b/drivers/block/null_blk/null_blk.h
index 8359b43842f2..bbe2cb17bdbd 100644
--- a/drivers/block/null_blk/null_blk.h
+++ b/drivers/block/null_blk/null_blk.h
@@ -23,6 +23,8 @@ struct nullb_cmd {
 	unsigned int tag;
 	blk_status_t error;
 	bool fake_timeout;
+	u16 retries;
+	u16 max_attempts;
 	struct nullb_queue *nq;
 	struct hrtimer timer;
 };
@@ -112,6 +114,7 @@ struct nullb_device {
 	bool memory_backed; /* if data is stored in memory */
 	bool discard; /* if support discard */
 	bool zoned; /* if device is zoned */
+	bool pipeline_zoned_writes;
 	bool virt_boundary; /* virtual boundary on/off for the device */
 };
 
diff --git a/drivers/block/null_blk/zoned.c b/drivers/block/null_blk/zoned.c
index 2fdd7b20c224..8d0a5e16f4b1 100644
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -403,7 +403,9 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector,
 		else
 			cmd->bio->bi_iter.bi_sector = sector;
 	} else if (sector != zone->wp) {
-		ret = BLK_STS_IOERR;
+		ret = dev->pipeline_zoned_writes &&
+			++cmd->retries < cmd->max_attempts ?
+			BLK_STS_DEV_RESOURCE : BLK_STS_IOERR;
 		goto unlock;
 	}
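The retry decision in the zoned.c hunk above can also be read on its own:
a zoned write that arrives ahead of the write pointer is pushed back to
the block layer as BLK_STS_DEV_RESOURCE so it is requeued and retried,
and it only fails with BLK_STS_IOERR once the nr_hw_queues * nr_requests
budget set up in null_queue_rq() is exhausted, matching the retry bound
assumed in patch 4. The helper below is purely illustrative and does not
exist in the driver; it just restates the ternary expression.

/* Illustrative restatement of the retry policy in null_zone_write(). */
static blk_status_t classify_misaligned_zoned_write(struct nullb_device *dev,
						    struct nullb_cmd *cmd)
{
	if (dev->pipeline_zoned_writes && ++cmd->retries < cmd->max_attempts)
		return BLK_STS_DEV_RESOURCE;	/* requeue and retry later */
	return BLK_STS_IOERR;			/* retry budget exhausted */
}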