From patchwork Fri Apr 7 23:58:11 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 01/12] block: Send zoned writes to the I/O scheduler
Date: Fri, 7 Apr 2023 16:58:11 -0700
Message-Id: <20230407235822.1672286-2-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Send zoned writes inserted by the device mapper to the I/O scheduler.
This prevents zoned writes from being reordered if a device mapper
driver has been stacked on top of a driver for a zoned block device.
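As background for the helper introduced below, here is a minimal
userspace C sketch of the classification it performs. The zone table,
zone size, and function names are illustrative assumptions, not kernel
API:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative zone model: zone 0 is conventional, zones 1..3 are
 * sequential-write-required. A real kernel queries the disk's zone
 * information instead of a fixed table. */
enum zone_type { ZONE_CONV, ZONE_SEQ };
static const enum zone_type zone_table[4] = { ZONE_CONV, ZONE_SEQ, ZONE_SEQ, ZONE_SEQ };
static const unsigned long zone_sectors = 1 << 16; /* 32 MiB zones, 512-byte sectors */

enum req_op { OP_READ, OP_WRITE, OP_WRITE_ZEROES, OP_ZONE_APPEND };

/* Mirrors the decision in blk_rq_is_seq_zoned_write(): only regular
 * writes and write-zeroes aimed at a sequential zone need LBA-order
 * preservation; zone appends pick their own LBA and are excluded. */
static bool is_seq_zoned_write(enum req_op op, unsigned long pos)
{
        switch (op) {
        case OP_WRITE:
        case OP_WRITE_ZEROES:
                return zone_table[pos / zone_sectors] == ZONE_SEQ;
        default:
                return false;
        }
}

int main(void)
{
        printf("write @ zone 0:  %d\n", is_seq_zoned_write(OP_WRITE, 0));
        printf("write @ zone 2:  %d\n", is_seq_zoned_write(OP_WRITE, 2 * zone_sectors));
        printf("append @ zone 2: %d\n", is_seq_zoned_write(OP_ZONE_APPEND, 2 * zone_sectors));
        return 0;
}

Only the second line prints 1: a plain write into a sequential zone is
exactly the kind of request that must not bypass the I/O scheduler.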
Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 16 +++++++++++++---
 block/blk.h    | 19 +++++++++++++++++++
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index db93b1a71157..fefc9a728e0e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3008,9 +3008,19 @@ blk_status_t blk_insert_cloned_request(struct request *rq)
 	blk_account_io_start(rq);
 
 	/*
-	 * Since we have a scheduler attached on the top device,
-	 * bypass a potential scheduler on the bottom device for
-	 * insert.
+	 * Send zoned writes to the I/O scheduler if an I/O scheduler has been
+	 * attached.
+	 */
+	if (q->elevator && blk_rq_is_seq_zoned_write(rq)) {
+		blk_mq_sched_insert_request(rq, /*at_head=*/false,
+					    /*run_queue=*/true,
+					    /*async=*/false);
+		return BLK_STS_OK;
+	}
+
+	/*
+	 * If no I/O scheduler has been attached or if the request is not a
+	 * zoned write bypass the I/O scheduler attached to the bottom device.
 	 */
 	blk_mq_run_dispatch_ops(q,
 			ret = blk_mq_request_issue_directly(rq, true));
diff --git a/block/blk.h b/block/blk.h
index d65d96994a94..4b6f8d7a6b84 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -118,6 +118,25 @@ static inline bool bvec_gap_to_prev(const struct queue_limits *lim,
 	return __bvec_gap_to_prev(lim, bprv, offset);
 }
 
+/**
+ * blk_rq_is_seq_zoned_write() - Whether @rq is a write request for a sequential zone.
+ * @rq: Request to examine.
+ *
+ * In this context sequential zone means either a sequential write required or
+ * a sequential write preferred zone.
+ */
+static inline bool blk_rq_is_seq_zoned_write(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+		return disk_zone_is_seq(rq->q->disk, blk_rq_pos(rq));
+	case REQ_OP_ZONE_APPEND:
+	default:
+		return false;
+	}
+}
+
 static inline bool rq_mergeable(struct request *rq)
 {
 	if (blk_rq_is_passthrough(rq))

From patchwork Fri Apr 7 23:58:12 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 02/12] block: Send flush requests to the I/O scheduler
Date: Fri, 7 Apr 2023 16:58:12 -0700
Message-Id: <20230407235822.1672286-3-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Prevent zoned writes with the FUA flag set from being reordered against
each other or against other zoned writes. Separate the I/O scheduler
members from the flush members in struct request, since with this patch
applied a request may pass through both an I/O scheduler and the flush
machinery.
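To make the layout change below concrete, here is a minimal C sketch of
why the old union layout cannot serve a request that visits both an I/O
scheduler and the flush machinery. The struct members are simplified
stand-ins, not the real struct request fields (compile as C11 for the
anonymous union):

#include <stdio.h>

/* Old layout model: elevator data and flush data share storage, so
 * touching the flush state clobbers the scheduler's private data. */
struct old_request {
        union {
                struct { void *sched_priv; } elv;
                struct { unsigned int seq; } flush;
        };
};

/* New layout model: both member sets coexist, at the cost of a larger
 * struct. */
struct new_request {
        struct { void *sched_priv; } elv;
        struct { unsigned int seq; } flush;
};

int main(void)
{
        struct old_request o = { .elv.sched_priv = &o };
        o.flush.seq = 1;        /* overwrites part of elv.sched_priv */
        printf("old: sched_priv after flush update: %p\n", o.elv.sched_priv);

        struct new_request n = { .elv.sched_priv = &n };
        n.flush.seq = 1;        /* independent storage, elv data survives */
        printf("new: sched_priv after flush update: %p\n", n.elv.sched_priv);
        printf("sizeof(old)=%zu sizeof(new)=%zu\n",
               sizeof(struct old_request), sizeof(struct new_request));
        return 0;
}

The old layout prints a corrupted pointer after the flush update; the
new layout keeps it intact, which is exactly what a flush-with-data
request passing through an elevator requires.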
Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-flush.c      |  3 ++-
 block/blk-mq.c         | 11 ++++-------
 block/mq-deadline.c    |  2 +-
 include/linux/blk-mq.h | 27 +++++++++++----------------
 4 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 53202eff545e..e0cf153388d8 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -432,7 +432,8 @@ void blk_insert_flush(struct request *rq)
 	 */
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
-		blk_mq_request_bypass_insert(rq, false, true);
+		blk_mq_sched_insert_request(rq, /*at_head=*/false,
+					    /*run_queue=*/true, /*async=*/true);
 		return;
 	}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index fefc9a728e0e..250556546bbf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -390,8 +390,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 		INIT_HLIST_NODE(&rq->hash);
 		RB_CLEAR_NODE(&rq->rb_node);
 
-		if (!op_is_flush(data->cmd_flags) &&
-		    e->type->ops.prepare_request) {
+		if (e->type->ops.prepare_request) {
 			e->type->ops.prepare_request(rq);
 			rq->rq_flags |= RQF_ELVPRIV;
 		}
@@ -452,13 +451,11 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 		data->rq_flags |= RQF_ELV;
 
 		/*
-		 * Flush/passthrough requests are special and go directly to the
-		 * dispatch list. Don't include reserved tags in the
-		 * limiting, as it isn't useful.
+		 * Do not limit the depth for passthrough requests nor for
+		 * requests with a reserved tag.
 		 */
-		if (!op_is_flush(data->cmd_flags) &&
+		if (e->type->ops.limit_depth &&
 		    !blk_op_is_passthrough(data->cmd_flags) &&
-		    e->type->ops.limit_depth &&
 		    !(data->flags & BLK_MQ_REQ_RESERVED))
 			e->type->ops.limit_depth(data->cmd_flags, data);
 	}
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f10c2a0d18d4..d885ccf49170 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -789,7 +789,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	prio = ioprio_class_to_prio[ioprio_class];
 	per_prio = &dd->per_prio[prio];
-	if (!rq->elv.priv[0]) {
+	if (!rq->elv.priv[0] && !(rq->rq_flags & RQF_FLUSH_SEQ)) {
 		per_prio->stats.inserted++;
 		rq->elv.priv[0] = (void *)(uintptr_t)1;
 	}
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 06caacd77ed6..5e6c79ad83d2 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -169,25 +169,20 @@ struct request {
 		void *completion_data;
 	};
 
-
 	/*
 	 * Three pointers are available for the IO schedulers, if they need
-	 * more they have to dynamically allocate it. Flush requests are
-	 * never put on the IO scheduler. So let the flush fields share
-	 * space with the elevator data.
+	 * more they have to dynamically allocate it.
 	 */
-	union {
-		struct {
-			struct io_cq		*icq;
-			void			*priv[2];
-		} elv;
-
-		struct {
-			unsigned int		seq;
-			struct list_head	list;
-			rq_end_io_fn		*saved_end_io;
-		} flush;
-	};
+	struct {
+		struct io_cq		*icq;
+		void			*priv[2];
+	} elv;
+
+	struct {
+		unsigned int		seq;
+		struct list_head	list;
+		rq_end_io_fn		*saved_end_io;
+	} flush;
 
 	union {
 		struct __call_single_data csd;

From patchwork Fri Apr 7 23:58:13 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 03/12] block: Send requeued requests to the I/O scheduler
Date: Fri, 7 Apr 2023 16:58:13 -0700
Message-Id: <20230407235822.1672286-4-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Let the I/O scheduler control which requests are dispatched.
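The diff below adds RQF_DONTPREP to the no-merge mask so that requeued
requests carrying driver-private data can safely go back through the
scheduler. A minimal C sketch of how such a flag mask gates merging;
the bit values here are made up, only the masking pattern matches the
kernel:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag values; the kernel's req_flags_t bits differ. */
#define RQF_STARTED             (1u << 0)
#define RQF_FLUSH_SEQ           (1u << 1)
#define RQF_DONTPREP            (1u << 2)
#define RQF_SPECIAL_PAYLOAD     (1u << 3)

/* With RQF_DONTPREP in the no-merge mask, a requeued request that still
 * carries driver-specific data is never merged, which is what allows it
 * to be re-inserted into the I/O scheduler instead of bypassing it. */
#define RQF_NOMERGE_FLAGS \
        (RQF_STARTED | RQF_FLUSH_SEQ | RQF_DONTPREP | RQF_SPECIAL_PAYLOAD)

static bool mergeable(unsigned int rq_flags)
{
        return !(rq_flags & RQF_NOMERGE_FLAGS);
}

int main(void)
{
        printf("plain rq mergeable:    %d\n", mergeable(0));
        printf("DONTPREP rq mergeable: %d\n", mergeable(RQF_DONTPREP));
        return 0;
}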
Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c         | 21 +++++++++------------
 include/linux/blk-mq.h |  5 +++--
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 250556546bbf..57315395434b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1426,15 +1426,7 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		/*
-		 * If RQF_DONTPREP, rq has contained some driver specific
-		 * data, so insert it to hctx dispatch list to avoid any
-		 * merge.
-		 */
-		if (rq->rq_flags & RQF_DONTPREP)
-			blk_mq_request_bypass_insert(rq, false, false);
-		else
-			blk_mq_sched_insert_request(rq, true, false, false);
+		blk_mq_sched_insert_request(rq, /*at_head=*/true, false, false);
 	}
 
 	while (!list_empty(&rq_list)) {
@@ -2065,9 +2057,14 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		if (nr_budgets)
 			blk_mq_release_budgets(q, list);
 
-		spin_lock(&hctx->lock);
-		list_splice_tail_init(list, &hctx->dispatch);
-		spin_unlock(&hctx->lock);
+		if (!q->elevator) {
+			spin_lock(&hctx->lock);
+			list_splice_tail_init(list, &hctx->dispatch);
+			spin_unlock(&hctx->lock);
+		} else {
+			q->elevator->type->ops.insert_requests(hctx, list,
+							       /*at_head=*/true);
+		}
 
 		/*
 		 * Order adding requests to hctx->dispatch and checking
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5e6c79ad83d2..3a3bee9085e3 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -64,8 +64,9 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_RESV		((__force req_flags_t)(1 << 23))
 
 /* flags that prevent us from merging requests: */
-#define RQF_NOMERGE_FLAGS \
-	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
+#define RQF_NOMERGE_FLAGS \
+	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_DONTPREP | \
+	 RQF_SPECIAL_PAYLOAD)
 
 enum mq_rq_state {
 	MQ_RQ_IDLE = 0,

From patchwork Fri Apr 7 23:58:14 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 04/12] block: Requeue requests if a CPU is unplugged
Date: Fri, 7 Apr 2023 16:58:14 -0700
Message-Id: <20230407235822.1672286-5-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

If a CPU is unplugged, requeue requests instead of sending them to the
dispatch list, to prevent reordering of zoned writes.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
---
 block/blk-mq.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 57315395434b..77fdaed4e074 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3495,9 +3495,17 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node)
 	if (list_empty(&tmp))
 		return 0;
 
-	spin_lock(&hctx->lock);
-	list_splice_tail_init(&tmp, &hctx->dispatch);
-	spin_unlock(&hctx->lock);
+	if (hctx->queue->elevator) {
+		struct request *rq, *next;
+
+		list_for_each_entry_safe(rq, next, &tmp, queuelist)
+			blk_mq_requeue_request(rq, false);
+		blk_mq_kick_requeue_list(hctx->queue);
+	} else {
+		spin_lock(&hctx->lock);
+		list_splice_tail_init(&tmp, &hctx->dispatch);
+		spin_unlock(&hctx->lock);
+	}
 
 	blk_mq_run_hw_queue(hctx, true);
 	return 0;
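A toy C sketch of the difference between the two paths in the hunk
above: splicing onto the dispatch list freezes the current order, while
requeueing lets the scheduler re-sort by LBA. The LBA-sorting scheduler
and all names are illustrative assumptions:

#include <stdio.h>
#include <stdlib.h>

static int cmp_lba(const void *a, const void *b)
{
        return (*(const long *)a > *(const long *)b) -
               (*(const long *)a < *(const long *)b);
}

int main(void)
{
        long pending[] = { 300, 100, 200 };     /* LBAs in arrival order */
        size_t n = sizeof(pending) / sizeof(pending[0]);

        /* Old path: bypass to the dispatch list, order frozen as-is. */
        printf("bypass to dispatch list: ");
        for (size_t i = 0; i < n; i++)
                printf("%ld ", pending[i]);
        printf("\n");

        /* New path: requeue, so the scheduler can re-order zoned writes. */
        qsort(pending, n, sizeof(pending[0]), cmp_lba);
        printf("requeued via scheduler:  ");
        for (size_t i = 0; i < n; i++)
                printf("%ld ", pending[i]);
        printf("\n");
        return 0;
}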
From patchwork Fri Apr 7 23:58:15 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 05/12] block: One requeue list per hctx
Date: Fri, 7 Apr 2023 16:58:15 -0700
Message-Id: <20230407235822.1672286-6-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Prepare for processing the requeue list from inside
__blk_mq_run_hw_queue().

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq-debugfs.c | 66 +++++++++++++++++++++---------------------
 block/blk-mq.c         | 58 +++++++++++++++++++++++--------------
 include/linux/blk-mq.h |  4 +++
 include/linux/blkdev.h |  4 ---
 4 files changed, 73 insertions(+), 59 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 212a7f301e73..5eb930754347 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -20,37 +20,6 @@ static int queue_poll_stat_show(void *data, struct seq_file *m)
 	return 0;
 }
 
-static void *queue_requeue_list_start(struct seq_file *m, loff_t *pos)
-	__acquires(&q->requeue_lock)
-{
-	struct request_queue *q = m->private;
-
-	spin_lock_irq(&q->requeue_lock);
-	return seq_list_start(&q->requeue_list, *pos);
-}
-
-static void *queue_requeue_list_next(struct seq_file *m, void *v, loff_t *pos)
-{
-	struct request_queue *q = m->private;
-
-	return seq_list_next(v, &q->requeue_list, pos);
-}
-
-static void queue_requeue_list_stop(struct seq_file *m, void *v)
-	__releases(&q->requeue_lock)
-{
-	struct request_queue *q = m->private;
-
-	spin_unlock_irq(&q->requeue_lock);
-}
-
-static const struct seq_operations queue_requeue_list_seq_ops = {
-	.start	= queue_requeue_list_start,
-	.next	= queue_requeue_list_next,
-	.stop	= queue_requeue_list_stop,
-	.show	= blk_mq_debugfs_rq_show,
-};
-
 static int blk_flags_show(struct seq_file *m, const unsigned long flags,
 			  const char *const *flag_name, int flag_name_count)
 {
@@ -156,11 +125,10 @@ static ssize_t queue_state_write(void *data, const char __user *buf,
 
 static const struct blk_mq_debugfs_attr blk_mq_debugfs_queue_attrs[] = {
 	{ "poll_stat", 0400, queue_poll_stat_show },
-	{ "requeue_list", 0400, .seq_ops = &queue_requeue_list_seq_ops },
 	{ "pm_only", 0600, queue_pm_only_show, NULL },
 	{ "state", 0600, queue_state_show, queue_state_write },
 	{ "zone_wlock", 0400, queue_zone_wlock_show, NULL },
-	{ },
+	{},
 };
 
 #define HCTX_STATE_NAME(name) [BLK_MQ_S_##name] = #name
@@ -513,6 +481,37 @@ static int hctx_dispatch_busy_show(void *data, struct seq_file *m)
 	return 0;
 }
 
+static void *hctx_requeue_list_start(struct seq_file *m, loff_t *pos)
+	__acquires(&hctx->requeue_lock)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	spin_lock_irq(&hctx->requeue_lock);
+	return seq_list_start(&hctx->requeue_list, *pos);
+}
+
+static void *hctx_requeue_list_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	return seq_list_next(v, &hctx->requeue_list, pos);
+}
+
+static void hctx_requeue_list_stop(struct seq_file *m, void *v)
+	__releases(&hctx->requeue_lock)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	spin_unlock_irq(&hctx->requeue_lock);
+}
+
+static const struct seq_operations hctx_requeue_list_seq_ops = {
+	.start	= hctx_requeue_list_start,
+	.next	= hctx_requeue_list_next,
+	.stop	= hctx_requeue_list_stop,
+	.show	= blk_mq_debugfs_rq_show,
+};
+
 #define CTX_RQ_SEQ_OPS(name, type)					\
 static void *ctx_##name##_rq_list_start(struct seq_file *m, loff_t *pos) \
 	__acquires(&ctx->lock)						\
@@ -628,6 +627,7 @@ static const struct blk_mq_debugfs_attr blk_mq_debugfs_hctx_attrs[] = {
 	{"run", 0600, hctx_run_show, hctx_run_write},
 	{"active", 0400, hctx_active_show},
 	{"dispatch_busy", 0400, hctx_dispatch_busy_show},
+	{"requeue_list", 0400, .seq_ops = &hctx_requeue_list_seq_ops},
 	{"type", 0400, hctx_type_show},
 	{},
 };
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 77fdaed4e074..deb3d08a6b26 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1411,14 +1411,17 @@ EXPORT_SYMBOL(blk_mq_requeue_request);
 
 static void blk_mq_requeue_work(struct work_struct *work)
 {
-	struct request_queue *q =
-		container_of(work, struct request_queue, requeue_work.work);
+	struct blk_mq_hw_ctx *hctx =
+		container_of(work, struct blk_mq_hw_ctx, requeue_work.work);
 	LIST_HEAD(rq_list);
 	struct request *rq, *next;
 
-	spin_lock_irq(&q->requeue_lock);
-	list_splice_init(&q->requeue_list, &rq_list);
-	spin_unlock_irq(&q->requeue_lock);
+	if (list_empty_careful(&hctx->requeue_list))
+		return;
+
+	spin_lock_irq(&hctx->requeue_lock);
+	list_splice_init(&hctx->requeue_list, &rq_list);
+	spin_unlock_irq(&hctx->requeue_lock);
 
 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
 		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
@@ -1435,13 +1438,13 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		blk_mq_sched_insert_request(rq, false, false, false);
 	}
 
-	blk_mq_run_hw_queues(q, false);
+	blk_mq_run_hw_queue(hctx, false);
 }
 
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 				bool kick_requeue_list)
 {
-	struct request_queue *q = rq->q;
+	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 	unsigned long flags;
 
 	/*
@@ -1449,31 +1452,42 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 	 * request head insertion from the workqueue.
 	 */
 	BUG_ON(rq->rq_flags & RQF_SOFTBARRIER);
+	WARN_ON_ONCE(!rq->mq_hctx);
 
-	spin_lock_irqsave(&q->requeue_lock, flags);
+	spin_lock_irqsave(&hctx->requeue_lock, flags);
 	if (at_head) {
 		rq->rq_flags |= RQF_SOFTBARRIER;
-		list_add(&rq->queuelist, &q->requeue_list);
+		list_add(&rq->queuelist, &hctx->requeue_list);
 	} else {
-		list_add_tail(&rq->queuelist, &q->requeue_list);
+		list_add_tail(&rq->queuelist, &hctx->requeue_list);
 	}
-	spin_unlock_irqrestore(&q->requeue_lock, flags);
+	spin_unlock_irqrestore(&hctx->requeue_lock, flags);
 
 	if (kick_requeue_list)
-		blk_mq_kick_requeue_list(q);
+		blk_mq_kick_requeue_list(rq->q);
 }
 
 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, 0);
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
+					    &hctx->requeue_work, 0);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
-				    msecs_to_jiffies(msecs));
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
+					    &hctx->requeue_work,
+					    msecs_to_jiffies(msecs));
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
@@ -3594,6 +3608,10 @@ static int blk_mq_init_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned hctx_idx)
 {
+	INIT_DELAYED_WORK(&hctx->requeue_work, blk_mq_requeue_work);
+	INIT_LIST_HEAD(&hctx->requeue_list);
+	spin_lock_init(&hctx->requeue_lock);
+
 	hctx->queue_num = hctx_idx;
 
 	if (!(hctx->flags & BLK_MQ_F_STACKING))
@@ -4209,10 +4227,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
 	blk_mq_update_poll_flag(q);
 
-	INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
-	INIT_LIST_HEAD(&q->requeue_list);
-	spin_lock_init(&q->requeue_lock);
-
 	q->nr_requests = set->queue_depth;
 
 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
@@ -4757,10 +4771,10 @@ void blk_mq_cancel_work_sync(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
 
-	cancel_delayed_work_sync(&q->requeue_work);
-
-	queue_for_each_hw_ctx(q, hctx, i)
+	queue_for_each_hw_ctx(q, hctx, i) {
+		cancel_delayed_work_sync(&hctx->requeue_work);
 		cancel_delayed_work_sync(&hctx->run_work);
+	}
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 3a3bee9085e3..0157f1569980 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -311,6 +311,10 @@ struct blk_mq_hw_ctx {
 		unsigned long		state;
 	} ____cacheline_aligned_in_smp;
 
+	struct list_head	requeue_list;
+	spinlock_t		requeue_lock;
+	struct delayed_work	requeue_work;
+
 	/**
 	 * @run_work: Used for scheduling a hardware queue run at a later time.
 	 */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e3242e67a8e3..f5fa53cd13bd 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -491,10 +491,6 @@ struct request_queue {
 	 */
 	struct blk_flush_queue	*fq;
 
-	struct list_head	requeue_list;
-	spinlock_t		requeue_lock;
-	struct delayed_work	requeue_work;
-
 	struct mutex		sysfs_lock;
 	struct mutex		sysfs_dir_lock;

From patchwork Fri Apr 7 23:58:16 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 06/12] block: Preserve the order of requeued requests
Date: Fri, 7 Apr 2023 16:58:16 -0700
Message-Id: <20230407235822.1672286-7-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

If a queue is run before all requeued requests have been sent to the
I/O scheduler, the I/O scheduler may dispatch the wrong request. Fix
this by making __blk_mq_run_hw_queue() process the requeue_list instead
of blk_mq_requeue_work().
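A toy C sketch of the invariant this patch establishes: every hardware
queue run drains the per-hctx requeue list into the scheduler before
dispatching, so the scheduler always sees the full set of pending
writes. The names and the LBA-sorting scheduler are illustrative
assumptions, not the kernel implementation:

#include <stdio.h>
#include <stdlib.h>

#define MAX 8

struct hctx_model {
        long requeue[MAX]; int nrequeue;  /* per-hctx requeue list */
        long sched[MAX];   int nsched;    /* requests the scheduler owns */
};

static int cmp_lba(const void *a, const void *b)
{
        return (*(const long *)a > *(const long *)b) -
               (*(const long *)a < *(const long *)b);
}

static void process_requeue_list(struct hctx_model *h)
{
        for (int i = 0; i < h->nrequeue; i++)
                h->sched[h->nsched++] = h->requeue[i];
        h->nrequeue = 0;
}

static void run_hw_queue(struct hctx_model *h)
{
        process_requeue_list(h);        /* drain the requeue list first ... */
        qsort(h->sched, h->nsched, sizeof(h->sched[0]), cmp_lba);
        for (int i = 0; i < h->nsched; i++)     /* ... then dispatch in order */
                printf("dispatch LBA %ld\n", h->sched[i]);
        h->nsched = 0;
}

int main(void)
{
        /* LBAs 100 and 108 sit on the requeue list; 116 is already queued.
         * Draining before dispatching yields 100, 108, 116; running the
         * queue without draining would have sent 116 out first. */
        struct hctx_model h = { .requeue = { 100, 108 }, .nrequeue = 2,
                                .sched = { 116 }, .nsched = 1 };
        run_hw_queue(&h);
        return 0;
}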
Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c         | 35 ++++++++++-------------------------
 include/linux/blk-mq.h |  1 -
 2 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index deb3d08a6b26..562868dff43f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -64,6 +64,7 @@ static inline blk_qc_t blk_rq_to_qc(struct request *rq)
 static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
 {
 	return !list_empty_careful(&hctx->dispatch) ||
+		!list_empty_careful(&hctx->requeue_list) ||
 		sbitmap_any_bit_set(&hctx->ctx_map) ||
 			blk_mq_sched_has_work(hctx);
 }
@@ -1409,10 +1410,8 @@ void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
 }
 EXPORT_SYMBOL(blk_mq_requeue_request);
 
-static void blk_mq_requeue_work(struct work_struct *work)
+static void blk_mq_process_requeue_list(struct blk_mq_hw_ctx *hctx)
 {
-	struct blk_mq_hw_ctx *hctx =
-		container_of(work, struct blk_mq_hw_ctx, requeue_work.work);
 	LIST_HEAD(rq_list);
 	struct request *rq, *next;
 
@@ -1437,8 +1436,6 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		list_del_init(&rq->queuelist);
 		blk_mq_sched_insert_request(rq, false, false, false);
 	}
-
-	blk_mq_run_hw_queue(hctx, false);
 }
 
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
@@ -1464,30 +1461,19 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 	spin_unlock_irqrestore(&hctx->requeue_lock, flags);
 
 	if (kick_requeue_list)
-		blk_mq_kick_requeue_list(rq->q);
+		blk_mq_run_hw_queue(hctx, /*async=*/true);
 }
 
 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	struct blk_mq_hw_ctx *hctx;
-	unsigned long i;
-
-	queue_for_each_hw_ctx(q, hctx, i)
-		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
-					    &hctx->requeue_work, 0);
+	blk_mq_run_hw_queues(q, true);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 
 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	struct blk_mq_hw_ctx *hctx;
-	unsigned long i;
-
-	queue_for_each_hw_ctx(q, hctx, i)
-		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
-					    &hctx->requeue_work,
-					    msecs_to_jiffies(msecs));
+	blk_mq_delay_run_hw_queues(q, msecs);
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
@@ -2146,6 +2132,8 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 	 */
 	WARN_ON_ONCE(in_interrupt());
 
+	blk_mq_process_requeue_list(hctx);
+
 	blk_mq_run_dispatch_ops(hctx->queue,
 			blk_mq_sched_dispatch_requests(hctx));
 }
@@ -2317,7 +2305,7 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 		 * scheduler.
 		 */
 		if (!sq_hctx || sq_hctx == hctx ||
-		    !list_empty_careful(&hctx->dispatch))
+		    blk_mq_hctx_has_pending(hctx))
 			blk_mq_run_hw_queue(hctx, async);
 	}
 }
@@ -2353,7 +2341,7 @@ void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs)
 		 * scheduler.
 		 */
 		if (!sq_hctx || sq_hctx == hctx ||
-		    !list_empty_careful(&hctx->dispatch))
+		    blk_mq_hctx_has_pending(hctx))
 			blk_mq_delay_run_hw_queue(hctx, msecs);
 	}
 }
@@ -3608,7 +3596,6 @@ static int blk_mq_init_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned hctx_idx)
 {
-	INIT_DELAYED_WORK(&hctx->requeue_work, blk_mq_requeue_work);
 	INIT_LIST_HEAD(&hctx->requeue_list);
 	spin_lock_init(&hctx->requeue_lock);
 
@@ -4771,10 +4758,8 @@ void blk_mq_cancel_work_sync(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
 
-	queue_for_each_hw_ctx(q, hctx, i) {
-		cancel_delayed_work_sync(&hctx->requeue_work);
+	queue_for_each_hw_ctx(q, hctx, i)
 		cancel_delayed_work_sync(&hctx->run_work);
-	}
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 0157f1569980..e62feb17af96 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -313,7 +313,6 @@ struct blk_mq_hw_ctx {
 
 	struct list_head	requeue_list;
 	spinlock_t		requeue_lock;
-	struct delayed_work	requeue_work;
 
 	/**
 	 * @run_work: Used for scheduling a hardware queue run at a later time.

From patchwork Fri Apr 7 23:58:17 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 07/12] block: Make it easier to debug zoned write reordering
Date: Fri, 7 Apr 2023 16:58:17 -0700
Message-Id: <20230407235822.1672286-8-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Issue a kernel warning if reordering could happen.
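For readers unfamiliar with the warn-once pattern used in the diff
below, here is a userspace approximation of the kernel's WARN_ON_ONCE()
semantics. It relies on GNU C statement expressions (supported by gcc
and clang, as in kernel code); the macro body is a sketch, not the
kernel's definition:

#include <stdbool.h>
#include <stdio.h>

/* Report a suspect condition the first time it is seen, without
 * terminating the program, and yield the condition's value. */
#define WARN_ON_ONCE(cond) ({                                   \
        static bool __warned;                                   \
        bool __c = (cond);                                      \
        if (__c && !__warned) {                                 \
                __warned = true;                                \
                fprintf(stderr, "warning: %s (%s:%d)\n",        \
                        #cond, __FILE__, __LINE__);             \
        }                                                       \
        __c;                                                    \
})

int main(void)
{
        for (int i = 0; i < 3; i++)
                WARN_ON_ONCE(i > 0);    /* fires once, on i == 1 */
        return 0;
}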
Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 562868dff43f..d89a0e6cf37d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2478,6 +2478,8 @@ void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
+	WARN_ON_ONCE(rq->q->elevator && blk_rq_is_seq_zoned_write(rq));
+
 	spin_lock(&hctx->lock);
 	if (at_head)
 		list_add(&rq->queuelist, &hctx->dispatch);
@@ -2570,6 +2572,8 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	bool run_queue = true;
 	int budget_token;
 
+	WARN_ON_ONCE(q->elevator && blk_rq_is_seq_zoned_write(rq));
+
 	/*
 	 * RCU or SRCU read lock is needed before checking quiesced flag.
 	 *

From patchwork Fri Apr 7 23:58:18 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 08/12] block: mq-deadline: Simplify deadline_skip_seq_writes()
Date: Fri, 7 Apr 2023 16:58:18 -0700
Message-Id: <20230407235822.1672286-9-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Make deadline_skip_seq_writes() shorter without changing its
functionality.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
---
 block/mq-deadline.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d885ccf49170..50a9d3b0a291 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -312,12 +312,9 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 						struct request *rq)
 {
 	sector_t pos = blk_rq_pos(rq);
-	sector_t skipped_sectors = 0;
 
-	while (rq) {
-		if (blk_rq_pos(rq) != pos + skipped_sectors)
-			break;
-		skipped_sectors += blk_rq_sectors(rq);
+	while (rq && blk_rq_pos(rq) == pos) {
+		pos += blk_rq_sectors(rq);
 		rq = deadline_latter_request(rq);
 	}
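A self-contained C sketch of the simplified loop shape, using a toy
request node instead of struct request; 'latter' stands in for
deadline_latter_request() and all names are illustrative:

#include <stdio.h>

/* Toy request: pos is the first sector, sectors the length, 'latter'
 * the next request in sort order. */
struct rq {
        long pos, sectors;
        struct rq *latter;
};

/* Same shape as the simplified deadline_skip_seq_writes(): advance an
 * expected position instead of accumulating a skipped-sectors count. */
static struct rq *skip_seq_writes(struct rq *rq)
{
        long pos = rq->pos;

        while (rq && rq->pos == pos) {
                pos += rq->sectors;
                rq = rq->latter;
        }
        return rq;
}

int main(void)
{
        struct rq c = { 300, 8, NULL };         /* gap before this one */
        struct rq b = { 108, 8, &c };           /* contiguous with a */
        struct rq a = { 100, 8, &b };

        struct rq *first_gap = skip_seq_writes(&a);
        printf("first non-contiguous request at %ld\n",
               first_gap ? first_gap->pos : -1L);      /* prints 300 */
        return 0;
}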
From patchwork Fri Apr 7 23:58:19 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 09/12] block: mq-deadline: Disable head insertion for zoned writes
Date: Fri, 7 Apr 2023 16:58:19 -0700
Message-Id: <20230407235822.1672286-10-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Make sure that zoned writes are submitted in LBA order.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 50a9d3b0a291..891ee0da73ac 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -798,7 +798,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	trace_block_rq_insert(rq);
 
-	if (at_head) {
+	if (at_head && !blk_rq_is_seq_zoned_write(rq)) {
 		list_add(&rq->queuelist, &per_prio->dispatch);
 		rq->fifo_time = jiffies;
 	} else {
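A toy C sketch of the effect of the one-line change above: a head
insertion jumps the queue, which would put a zoned write ahead of
earlier writes for the same zone, so zoned writes are demoted to tail
insertion. The queue model and names are illustrative assumptions:

#include <stdio.h>

#define MAX 8

static long fifo[MAX];
static int nfifo;

/* Head insertion is honored only for requests that are not sequential
 * zoned writes; zoned writes always keep their LBA arrival order. */
static void insert(long lba, int at_head, int is_seq_zoned_write)
{
        if (at_head && !is_seq_zoned_write) {
                for (int i = nfifo; i > 0; i--)
                        fifo[i] = fifo[i - 1];
                fifo[0] = lba;
        } else {
                fifo[nfifo] = lba;
        }
        nfifo++;
}

int main(void)
{
        insert(100, 0, 1);
        insert(108, 0, 1);
        insert(116, 1, 1);      /* at_head requested, but demoted to tail */
        for (int i = 0; i < nfifo; i++)
                printf("%ld ", fifo[i]);
        printf("\n");           /* 100 108 116: LBA order preserved */
        return 0;
}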
From patchwork Fri Apr 7 23:58:20 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 10/12] block: mq-deadline: Introduce a local variable
Date: Fri, 7 Apr 2023 16:58:20 -0700
Message-Id: <20230407235822.1672286-11-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Prepare for adding more code that uses the request queue pointer.
Cc: Damien Le Moal Cc: Christoph Hellwig Cc: Ming Lei Cc: Mike Snitzer Signed-off-by: Bart Van Assche Reviewed-by: Damien Le Moal --- block/mq-deadline.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block/mq-deadline.c b/block/mq-deadline.c index 891ee0da73ac..8c2bc9fdcf8c 100644 --- a/block/mq-deadline.c +++ b/block/mq-deadline.c @@ -368,6 +368,7 @@ static struct request * deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio, enum dd_data_dir data_dir) { + struct request_queue *q; struct request *rq; unsigned long flags; @@ -375,7 +376,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio, if (!rq) return NULL; - if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q)) + q = rq->q; + if (data_dir == DD_READ || !blk_queue_is_zoned(q)) return rq; /* @@ -389,7 +391,7 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio, while (rq) { if (blk_req_can_dispatch_to_zone(rq)) break; - if (blk_queue_nonrot(rq->q)) + if (blk_queue_nonrot(q)) rq = deadline_latter_request(rq); else rq = deadline_skip_seq_writes(dd, rq); From patchwork Fri Apr 7 23:58:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bart Van Assche X-Patchwork-Id: 13205480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BA05C76196 for ; Fri, 7 Apr 2023 23:59:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230115AbjDGX7C (ORCPT ); Fri, 7 Apr 2023 19:59:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230130AbjDGX67 (ORCPT ); Fri, 7 Apr 2023 19:58:59 -0400 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54FB2E19A for ; Fri, 7 Apr 2023 16:58:55 -0700 (PDT) Received: by mail-pj1-f47.google.com with SMTP id b3so66195pjq.3 for ; Fri, 07 Apr 2023 16:58:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680911935; x=1683503935; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TbIDxmTUcHQIKbZyqyah2h625CSb86OBr3A1gMI1quA=; b=i/M8RD25xZODAXwDyDleyMmSeQImiAePwMD2QHpqq5FL8eFjGxDYMwU/twcMeOxDaH wLFlr9OtMdTr0aKCLkGN9+qZby6THue8Q2ObNCZvYvn7Eu6hiC/ruWIKhlCOqxDoaKOQ TR3gb8F9otoZRlQ0eyMF7F4m1J6YllboSPrPC8pJ5q2Ynj/fYaHKep2Yz7PhjTMjiP4i CWcOsYrOf80wnHfFJnVMavNpWWPf60nQMFmi0goeIyo1G3QV+x0pX+IJYvBu2UF35X37 SVMmMLR52aF1G/RMYWXQTrNZExFr/VLPDQHLzh5EJz63NYgGCvVNkj48k1t5ILraRmsi zgtw== X-Gm-Message-State: AAQBX9dIl7FiNW0clF2KIG8FgqGg4ojhSGlSzxy3O4EboYQDMHLR7KHT mZUpB7ppukpb/AuEh11bKFs= X-Google-Smtp-Source: AKy350aPK5lySNkYzWg7KSJmNnEx58eN8NiUIcnr8W11aFo+gZYnVzVKjYEvHfAp6TJSz6Rh/WhUpA== X-Received: by 2002:a05:6a20:a922:b0:d9:5db:7345 with SMTP id cd34-20020a056a20a92200b000d905db7345mr3136566pzb.26.1680911934603; Fri, 07 Apr 2023 16:58:54 -0700 (PDT) Received: from bvanassche-linux.mtv.corp.google.com ([2620:15c:211:201:f2c:4ac2:6000:5900]) by smtp.gmail.com with ESMTPSA id j16-20020a62e910000000b006258dd63a3fsm3556003pfh.56.2023.04.07.16.58.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig,
 Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 11/12] block: mq-deadline: Fix a race condition related to
 zoned writes
Date: Fri, 7 Apr 2023 16:58:21 -0700
Message-Id: <20230407235822.1672286-12-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

Let deadline_next_request() only consider the first zoned write per zone.
This fixes a race condition between deadline_next_request() and the
completion of zoned writes.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c    | 24 +++++++++++++++++++++---
 include/linux/blk-mq.h |  5 +++++
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 8c2bc9fdcf8c..d49e20d3011d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -389,12 +389,30 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
         */
        spin_lock_irqsave(&dd->zone_lock, flags);
        while (rq) {
+               unsigned int zno = blk_rq_zone_no(rq);
+
                if (blk_req_can_dispatch_to_zone(rq))
                        break;
-               if (blk_queue_nonrot(q))
-                       rq = deadline_latter_request(rq);
-               else
+
+               WARN_ON_ONCE(!blk_queue_is_zoned(q));
+
+               if (!blk_queue_nonrot(q)) {
                        rq = deadline_skip_seq_writes(dd, rq);
+                       if (!rq)
+                               break;
+                       rq = deadline_earlier_request(rq);
+                       if (WARN_ON_ONCE(!rq))
+                               break;
+               }
+
+               /*
+                * Skip all other write requests for the zone with zone number
+                * 'zno'. This prevents this function from selecting a zoned
+                * write that is not the first write for a given zone.
+                */
+               while ((rq = deadline_latter_request(rq)) &&
+                      blk_rq_zone_no(rq) == zno)
+                       ;
        }
        spin_unlock_irqrestore(&dd->zone_lock, flags);

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e62feb17af96..515dfd04d736 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -1193,6 +1193,11 @@ static inline bool blk_req_can_dispatch_to_zone(struct request *rq)
        return !blk_req_zone_is_write_locked(rq);
 }
 #else /* CONFIG_BLK_DEV_ZONED */
+static inline unsigned int blk_rq_zone_no(struct request *rq)
+{
+       return 0;
+}
+
 static inline bool blk_req_needs_zone_write_lock(struct request *rq)
 {
        return false;
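The skip loop above is subtle, so here is a minimal, self-contained userspace
C sketch of the selection rule it enforces. This is not kernel code: struct
fake_rq, the zone_locked flag, and pick_next_write() are invented stand-ins
for a request, the zone write lock state, and deadline_next_request(). The
point is that once the head write of a zone is passed over, every later write
for the same zone is passed over too, even if the zone is unlocked in the
middle of the scan, so the scheduler can never pick a zoned write that is not
the lowest-LBA pending write for its zone.

#include <stdbool.h>
#include <stdio.h>

/* Invented stand-in for a pending write and its zone state. */
struct fake_rq {
        unsigned int zone_no;   /* zone that the write targets */
        unsigned long pos;      /* start LBA of the write */
        bool zone_locked;       /* head-of-zone dispatch check failed */
};

/*
 * Return the index of the first dispatchable write in a list sorted by
 * LBA. If the first write of a zone cannot be dispatched, skip *all*
 * pending writes for that zone. Returns -1 if nothing is dispatchable.
 */
static int pick_next_write(const struct fake_rq *rq, int n)
{
        int i = 0;

        while (i < n) {
                if (!rq[i].zone_locked)
                        return i;
                /* Skip every other pending write for this zone. */
                unsigned int zno = rq[i].zone_no;
                while (i < n && rq[i].zone_no == zno)
                        i++;
        }
        return -1;
}

int main(void)
{
        /* Zone 0 is locked by an in-flight write; zone 1 is idle. */
        struct fake_rq pending[] = {
                { .zone_no = 0, .pos = 100, .zone_locked = true  },
                { .zone_no = 0, .pos = 108, .zone_locked = false },
                { .zone_no = 1, .pos = 900, .zone_locked = false },
        };
        int i = pick_next_write(pending, 3);

        /*
         * Prints "dispatch zone 1 @ 900". The second zone 0 write is
         * skipped even though its own check would succeed, which is
         * exactly the reordering window the patch closes.
         */
        if (i >= 0)
                printf("dispatch zone %u @ %lu\n",
                       pending[i].zone_no, pending[i].pos);
        return 0;
}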
From patchwork Fri Apr 7 23:58:22 2023
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 13205481
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Christoph Hellwig,
 Bart Van Assche, Damien Le Moal, Ming Lei, Mike Snitzer
Subject: [PATCH v2 12/12] block: mq-deadline: Handle requeued requests
 correctly
Date: Fri, 7 Apr 2023 16:58:22 -0700
Message-Id: <20230407235822.1672286-13-bvanassche@acm.org>
In-Reply-To: <20230407235822.1672286-1-bvanassche@acm.org>
References: <20230407235822.1672286-1-bvanassche@acm.org>

If a zoned write is requeued with an LBA that is lower than the LBA of
zoned writes that have already been inserted, make sure that it is
submitted first.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d49e20d3011d..c536b499a60f 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -162,8 +162,19 @@ static void
 deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
 {
        struct rb_root *root = deadline_rb_root(per_prio, rq);
+       struct request **next_rq = &per_prio->next_rq[rq_data_dir(rq)];
 
        elv_rb_add(root, rq);
+       if (*next_rq == NULL || !blk_queue_is_zoned(rq->q))
+               return;
+       /*
+        * If a request got requeued or requests have been submitted out of
+        * order, make sure that, per zone, the request with the lowest LBA
+        * is submitted first.
+        */
+       if (blk_rq_pos(rq) < blk_rq_pos(*next_rq) &&
+           blk_rq_zone_no(rq) == blk_rq_zone_no(*next_rq))
+               *next_rq = rq;
 }
 
 static inline void
@@ -822,6 +833,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
                list_add(&rq->queuelist, &per_prio->dispatch);
                rq->fifo_time = jiffies;
        } else {
+               struct list_head *insert_before;
+
                deadline_add_rq_rb(per_prio, rq);
 
                if (rq_mergeable(rq)) {
@@ -834,7 +847,18 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
                 * set expire time and add to fifo list
                 */
                rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
-               list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
+               insert_before = &per_prio->fifo_list[data_dir];
+               if (blk_rq_is_seq_zoned_write(rq)) {
+                       const unsigned int zno = blk_rq_zone_no(rq);
+                       struct request *rq2 = rq;
+
+                       while ((rq2 = deadline_earlier_request(rq2)) &&
+                              blk_rq_zone_no(rq2) == zno &&
+                              blk_rq_pos(rq2) > blk_rq_pos(rq)) {
+                               insert_before = &rq2->queuelist;
+                       }
+               }
+               list_add_tail(&rq->queuelist, insert_before);
        }
 }
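To see what the FIFO insertion change does, here is a small self-contained C
sketch of the same rule. It is illustrative only: struct zwrite and
fifo_insert() are invented names, and a fixed-size array stands in for the
kernel's linked FIFO list. A new sequential zoned write is normally appended
at the tail, but it slides in front of any queued write for the same zone
with a higher start LBA, so that per zone the lowest LBA is always dispatched
first.

#include <stdio.h>

/* Invented stand-in for a sequential zoned write in the FIFO list. */
struct zwrite {
        unsigned int zone_no;
        unsigned long pos;      /* start LBA */
};

#define FIFO_CAP 16

static struct zwrite fifo[FIFO_CAP];
static int fifo_len;

/*
 * Append 'w' at the tail, then move it in front of every queued write
 * that targets the same zone and starts at a higher LBA. This mirrors
 * the patch: walk earlier requests while they belong to the same zone
 * and have a larger start LBA, and insert before them.
 */
static void fifo_insert(struct zwrite w)
{
        int i = fifo_len;       /* assumes fifo_len < FIFO_CAP */

        while (i > 0 && fifo[i - 1].zone_no == w.zone_no &&
               fifo[i - 1].pos > w.pos) {
                fifo[i] = fifo[i - 1];  /* shift the later write back */
                i--;
        }
        fifo[i] = w;
        fifo_len++;
}

int main(void)
{
        /* Writes for zone 3 arrive out of order after a requeue. */
        fifo_insert((struct zwrite){ .zone_no = 3, .pos = 208 });
        fifo_insert((struct zwrite){ .zone_no = 3, .pos = 216 });
        fifo_insert((struct zwrite){ .zone_no = 3, .pos = 200 }); /* requeued */

        /*
         * Prints "200 208 216": the requeued write moved to the front
         * of its zone's run, restoring per-zone LBA order.
         */
        for (int i = 0; i < fifo_len; i++)
                printf("%lu ", fifo[i].pos);
        printf("\n");
        return 0;
}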