From patchwork Fri Apr 7 00:16:59 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 01/12] block: Send zoned writes to the I/O scheduler
Date: Thu, 6 Apr 2023 17:16:59 -0700
Message-Id: <20230407001710.104169-2-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Send zoned writes inserted by the device mapper to the I/O scheduler. This
prevents zoned writes from being reordered when a device mapper driver has
been stacked on top of a driver for a zoned block device.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 16 +++++++++++++---
 block/blk.h    | 19 +++++++++++++++++++
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index db93b1a71157..fefc9a728e0e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3008,9 +3008,19 @@ blk_status_t blk_insert_cloned_request(struct request *rq)
 	blk_account_io_start(rq);

 	/*
-	 * Since we have a scheduler attached on the top device,
-	 * bypass a potential scheduler on the bottom device for
-	 * insert.
+	 * Send zoned writes to the I/O scheduler if an I/O scheduler has been
+	 * attached.
+	 */
+	if (q->elevator && blk_rq_is_seq_zoned_write(rq)) {
+		blk_mq_sched_insert_request(rq, /*at_head=*/false,
+					    /*run_queue=*/true,
+					    /*async=*/false);
+		return BLK_STS_OK;
+	}
+
+	/*
+	 * If no I/O scheduler has been attached or if the request is not a
+	 * zoned write, bypass the I/O scheduler attached to the bottom device.
 	 */
 	blk_mq_run_dispatch_ops(q,
 			ret = blk_mq_request_issue_directly(rq, true));

diff --git a/block/blk.h b/block/blk.h
index d65d96994a94..4b6f8d7a6b84 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -118,6 +118,25 @@ static inline bool bvec_gap_to_prev(const struct queue_limits *lim,
 	return __bvec_gap_to_prev(lim, bprv, offset);
 }

+/**
+ * blk_rq_is_seq_zoned_write() - Whether @rq is a write request for a
+ *	sequential zone.
+ * @rq: Request to examine.
+ *
+ * In this context "sequential zone" means either a sequential write required
+ * zone or a sequential write preferred zone.
+ */
+static inline bool blk_rq_is_seq_zoned_write(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+		return disk_zone_is_seq(rq->q->disk, blk_rq_pos(rq));
+	case REQ_OP_ZONE_APPEND:
+	default:
+		return false;
+	}
+}
+
 static inline bool rq_mergeable(struct request *rq)
 {
 	if (blk_rq_is_passthrough(rq))
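
[ For context: a minimal, standalone sketch of the insertion policy this
  patch introduces. Everything below is illustrative stand-in code, not
  kernel API; it only mirrors the q->elevator && blk_rq_is_seq_zoned_write()
  test added above. ]

    #include <stdbool.h>
    #include <stdio.h>

    enum insert_path { VIA_SCHEDULER, DIRECT_ISSUE };

    /* Zoned writes must pass through an attached I/O scheduler so that it
     * can serialize them per zone; every other cloned request may still be
     * issued directly to the bottom device. */
    static enum insert_path choose_insert_path(bool has_elevator,
                                               bool is_seq_zoned_write)
    {
        if (has_elevator && is_seq_zoned_write)
            return VIA_SCHEDULER;
        return DIRECT_ISSUE;
    }

    int main(void)
    {
        printf("%d\n", choose_insert_path(true, true));  /* 0: via scheduler */
        printf("%d\n", choose_insert_path(true, false)); /* 1: direct issue */
        return 0;
    }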

From patchwork Fri Apr 7 00:17:00 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 02/12] block: Send flush requests to the I/O scheduler
Date: Thu, 6 Apr 2023 17:17:00 -0700
Message-Id: <20230407001710.104169-3-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Prevent zoned writes with the FUA flag set from being reordered against each
other or against other zoned writes. Separate the I/O scheduler members from
the flush members in struct request, since with this patch applied a request
may pass through both an I/O scheduler and the flush machinery.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-flush.c      |  3 ++-
 block/blk-mq.c         | 11 ++++-------
 block/mq-deadline.c    |  2 +-
 include/linux/blk-mq.h | 27 +++++++++++----------------
 4 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 53202eff545e..e0cf153388d8 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -432,7 +432,8 @@ void blk_insert_flush(struct request *rq)
 	 */
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
-		blk_mq_request_bypass_insert(rq, false, true);
+		blk_mq_sched_insert_request(rq, /*at_head=*/false,
+					    /*run_queue=*/true, /*async=*/true);
 		return;
 	}

diff --git a/block/blk-mq.c b/block/blk-mq.c
index fefc9a728e0e..250556546bbf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -390,8 +390,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 		INIT_HLIST_NODE(&rq->hash);
 		RB_CLEAR_NODE(&rq->rb_node);

-		if (!op_is_flush(data->cmd_flags) &&
-		    e->type->ops.prepare_request) {
+		if (e->type->ops.prepare_request) {
 			e->type->ops.prepare_request(rq);
 			rq->rq_flags |= RQF_ELVPRIV;
 		}
@@ -452,13 +451,11 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 		data->rq_flags |= RQF_ELV;

 		/*
-		 * Flush/passthrough requests are special and go directly to the
-		 * dispatch list. Don't include reserved tags in the
-		 * limiting, as it isn't useful.
+		 * Do not limit the depth for passthrough requests nor for
+		 * requests with a reserved tag.
 		 */
-		if (!op_is_flush(data->cmd_flags) &&
+		if (e->type->ops.limit_depth &&
 		    !blk_op_is_passthrough(data->cmd_flags) &&
-		    e->type->ops.limit_depth &&
 		    !(data->flags & BLK_MQ_REQ_RESERVED))
 			e->type->ops.limit_depth(data->cmd_flags, data);
 	}

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f10c2a0d18d4..d885ccf49170 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -789,7 +789,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	prio = ioprio_class_to_prio[ioprio_class];
 	per_prio = &dd->per_prio[prio];
-	if (!rq->elv.priv[0]) {
+	if (!rq->elv.priv[0] && !(rq->rq_flags & RQF_FLUSH_SEQ)) {
 		per_prio->stats.inserted++;
 		rq->elv.priv[0] = (void *)(uintptr_t)1;
 	}

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 06caacd77ed6..5e6c79ad83d2 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -169,25 +169,20 @@ struct request {
 		void *completion_data;
 	};

-
 	/*
 	 * Three pointers are available for the IO schedulers, if they need
-	 * more they have to dynamically allocate it. Flush requests are
-	 * never put on the IO scheduler. So let the flush fields share
-	 * space with the elevator data.
+	 * more they have to dynamically allocate it.
 	 */
-	union {
-		struct {
-			struct io_cq		*icq;
-			void			*priv[2];
-		} elv;
-
-		struct {
-			unsigned int		seq;
-			struct list_head	list;
-			rq_end_io_fn		*saved_end_io;
-		} flush;
-	};
+	struct {
+		struct io_cq		*icq;
+		void			*priv[2];
+	} elv;
+
+	struct {
+		unsigned int		seq;
+		struct list_head	list;
+		rq_end_io_fn		*saved_end_io;
+	} flush;

 	union {
 		struct __call_single_data csd;
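
[ The struct request change above is the crux of this patch: once a request
  can visit both the I/O scheduler and the flush machinery, the old union
  would let one path overwrite the other's state. A standalone illustration
  of that hazard, using stand-in types rather than the real struct request: ]

    #include <stdio.h>

    /* Stand-in for the old layout: elevator and flush state share storage. */
    union old_rq_private {
        struct { void *icq; void *priv[2]; } elv;
        struct { unsigned int seq; void *saved_end_io; } flush;
    };

    int main(void)
    {
        union old_rq_private d = { .elv = { .icq = (void *)0xdeadbeef } };

        /* The flush machinery recording its sequence state... */
        d.flush.seq = 42;
        /* ...silently clobbers what the elevator stored in the union. */
        printf("elv.icq is now %p (was 0xdeadbeef)\n", d.elv.icq);
        return 0;
    }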

From patchwork Fri Apr 7 00:17:01 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 03/12] block: Send requeued requests to the I/O scheduler
Date: Thu, 6 Apr 2023 17:17:01 -0700
Message-Id: <20230407001710.104169-4-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Let the I/O scheduler control which requests are dispatched.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c         | 22 ++++++++++------------
 include/linux/blk-mq.h |  5 +++--
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 250556546bbf..f6ffa76bc159 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1426,15 +1426,7 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		/*
-		 * If RQF_DONTPREP, rq has contained some driver specific
-		 * data, so insert it to hctx dispatch list to avoid any
-		 * merge.
-		 */
-		if (rq->rq_flags & RQF_DONTPREP)
-			blk_mq_request_bypass_insert(rq, false, false);
-		else
-			blk_mq_sched_insert_request(rq, true, false, false);
+		blk_mq_sched_insert_request(rq, /*at_head=*/true, false, false);
 	}

 	while (!list_empty(&rq_list)) {
@@ -2065,9 +2057,15 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		if (nr_budgets)
 			blk_mq_release_budgets(q, list);

-		spin_lock(&hctx->lock);
-		list_splice_tail_init(list, &hctx->dispatch);
-		spin_unlock(&hctx->lock);
+		if (!q->elevator) {
+			spin_lock(&hctx->lock);
+			list_splice_tail_init(list, &hctx->dispatch);
+			spin_unlock(&hctx->lock);
+		} else {
+			q->elevator->type->ops.insert_requests(
+				hctx, list,
+				/*at_head=*/true);
+		}

 		/*
 		 * Order adding requests to hctx->dispatch and checking

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5e6c79ad83d2..3a3bee9085e3 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -64,8 +64,9 @@ typedef __u32 __bitwise req_flags_t;
 #define RQF_RESV		((__force req_flags_t)(1 << 23))

 /* flags that prevent us from merging requests: */
-#define RQF_NOMERGE_FLAGS \
-	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
+#define RQF_NOMERGE_FLAGS						\
+	(RQF_STARTED | RQF_SOFTBARRIER | RQF_FLUSH_SEQ | RQF_DONTPREP |	\
+	 RQF_SPECIAL_PAYLOAD)

 enum mq_rq_state {
 	MQ_RQ_IDLE = 0,
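
[ A small self-contained sketch of why RQF_DONTPREP joins the no-merge mask:
  now that requeued RQF_DONTPREP requests reach the scheduler again, merging
  must be vetoed by the flag instead of by bypassing the scheduler. The flag
  values below are illustrative, not the real ones from blk-mq.h. ]

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative bit positions only. */
    #define RQF_STARTED_DEMO   (1u << 0)
    #define RQF_DONTPREP_DEMO  (1u << 1)
    #define RQF_NOMERGE_DEMO   (RQF_STARTED_DEMO | RQF_DONTPREP_DEMO)

    /* A request whose driver-private data has already been prepared must
     * not grow or shrink through a merge, hence RQF_DONTPREP in the mask. */
    static bool may_merge(uint32_t rq_flags)
    {
        return (rq_flags & RQF_NOMERGE_DEMO) == 0;
    }

    int main(void)
    {
        printf("%d\n", may_merge(0));                 /* 1: mergeable */
        printf("%d\n", may_merge(RQF_DONTPREP_DEMO)); /* 0: not mergeable */
        return 0;
    }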

From patchwork Fri Apr 7 00:17:02 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 04/12] block: Requeue requests if a CPU is unplugged
Date: Thu, 6 Apr 2023 17:17:02 -0700
Message-Id: <20230407001710.104169-5-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Requeue requests instead of sending them to the dispatch list if a CPU is
unplugged, to prevent reordering of zoned writes.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f6ffa76bc159..8bb35deff5ec 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3496,9 +3496,17 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node)
 	if (list_empty(&tmp))
 		return 0;

-	spin_lock(&hctx->lock);
-	list_splice_tail_init(&tmp, &hctx->dispatch);
-	spin_unlock(&hctx->lock);
+	if (hctx->queue->elevator) {
+		struct request *rq, *next;
+
+		list_for_each_entry_safe(rq, next, &tmp, queuelist)
+			blk_mq_requeue_request(rq, false);
+		blk_mq_kick_requeue_list(hctx->queue);
+	} else {
+		spin_lock(&hctx->lock);
+		list_splice_tail_init(&tmp, &hctx->dispatch);
+		spin_unlock(&hctx->lock);
+	}

 	blk_mq_run_hw_queue(hctx, true);
 	return 0;
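
[ A sketch of the control flow this patch changes, with stub helpers standing
  in for the real blk-mq functions; it is not kernel code. ]

    #include <stdbool.h>
    #include <stdio.h>

    struct request { int id; };

    static void requeue_request(struct request *rq)
    {
        /* Stand-in for blk_mq_requeue_request(): the request re-enters the
         * scheduler, which restores per-zone ordering before dispatch. */
        printf("requeue rq %d via the scheduler\n", rq->id);
    }

    static void splice_to_dispatch_list(struct request *rq)
    {
        /* Stand-in for the old path: straight to hctx->dispatch, where a
         * zoned write could overtake an earlier write for the same zone. */
        printf("splice rq %d onto the dispatch list\n", rq->id);
    }

    static void handle_rq_from_dead_cpu(struct request *rq, bool has_elevator)
    {
        if (has_elevator)
            requeue_request(rq);
        else
            splice_to_dispatch_list(rq);
    }

    int main(void)
    {
        struct request rq = { 7 };

        handle_rq_from_dead_cpu(&rq, true);
        return 0;
    }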

From patchwork Fri Apr 7 00:17:03 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 05/12] block: One requeue list per hctx
Date: Thu, 6 Apr 2023 17:17:03 -0700
Message-Id: <20230407001710.104169-6-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Prepare for processing the requeue list from inside __blk_mq_run_hw_queue().

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq-debugfs.c | 66 +++++++++++++++++++++---------------------
 block/blk-mq.c         | 55 ++++++++++++++++++++++-------------
 include/linux/blk-mq.h |  4 +++
 include/linux/blkdev.h |  4 ---
 4 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 212a7f301e73..5eb930754347 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -20,37 +20,6 @@ static int queue_poll_stat_show(void *data, struct seq_file *m)
 	return 0;
 }

-static void *queue_requeue_list_start(struct seq_file *m, loff_t *pos)
-	__acquires(&q->requeue_lock)
-{
-	struct request_queue *q = m->private;
-
-	spin_lock_irq(&q->requeue_lock);
-	return seq_list_start(&q->requeue_list, *pos);
-}
-
-static void *queue_requeue_list_next(struct seq_file *m, void *v, loff_t *pos)
-{
-	struct request_queue *q = m->private;
-
-	return seq_list_next(v, &q->requeue_list, pos);
-}
-
-static void queue_requeue_list_stop(struct seq_file *m, void *v)
-	__releases(&q->requeue_lock)
-{
-	struct request_queue *q = m->private;
-
-	spin_unlock_irq(&q->requeue_lock);
-}
-
-static const struct seq_operations queue_requeue_list_seq_ops = {
-	.start	= queue_requeue_list_start,
-	.next	= queue_requeue_list_next,
-	.stop	= queue_requeue_list_stop,
-	.show	= blk_mq_debugfs_rq_show,
-};
-
 static int blk_flags_show(struct seq_file *m, const unsigned long flags,
 			  const char *const *flag_name, int flag_name_count)
 {
@@ -156,11 +125,10 @@ static ssize_t queue_state_write(void *data, const char __user *buf,

 static const struct blk_mq_debugfs_attr blk_mq_debugfs_queue_attrs[] = {
 	{ "poll_stat", 0400, queue_poll_stat_show },
-	{ "requeue_list", 0400, .seq_ops = &queue_requeue_list_seq_ops },
 	{ "pm_only", 0600, queue_pm_only_show, NULL },
 	{ "state", 0600, queue_state_show, queue_state_write },
 	{ "zone_wlock", 0400, queue_zone_wlock_show, NULL },
-	{ },
+	{},
 };

 #define HCTX_STATE_NAME(name) [BLK_MQ_S_##name] = #name
@@ -513,6 +481,37 @@ static int hctx_dispatch_busy_show(void *data, struct seq_file *m)
 	return 0;
 }

+static void *hctx_requeue_list_start(struct seq_file *m, loff_t *pos)
+	__acquires(&hctx->requeue_lock)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	spin_lock_irq(&hctx->requeue_lock);
+	return seq_list_start(&hctx->requeue_list, *pos);
+}
+
+static void *hctx_requeue_list_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	return seq_list_next(v, &hctx->requeue_list, pos);
+}
+
+static void hctx_requeue_list_stop(struct seq_file *m, void *v)
+	__releases(&hctx->requeue_lock)
+{
+	struct blk_mq_hw_ctx *hctx = m->private;
+
+	spin_unlock_irq(&hctx->requeue_lock);
+}
+
+static const struct seq_operations hctx_requeue_list_seq_ops = {
+	.start	= hctx_requeue_list_start,
+	.next	= hctx_requeue_list_next,
+	.stop	= hctx_requeue_list_stop,
+	.show	= blk_mq_debugfs_rq_show,
+};
+
 #define CTX_RQ_SEQ_OPS(name, type)					\
 static void *ctx_##name##_rq_list_start(struct seq_file *m, loff_t *pos) \
 	__acquires(&ctx->lock)						\
@@ -628,6 +627,7 @@ static const struct blk_mq_debugfs_attr blk_mq_debugfs_hctx_attrs[] = {
 	{"run", 0600, hctx_run_show, hctx_run_write},
 	{"active", 0400, hctx_active_show},
 	{"dispatch_busy", 0400, hctx_dispatch_busy_show},
+	{"requeue_list", 0400, .seq_ops = &hctx_requeue_list_seq_ops},
 	{"type", 0400, hctx_type_show},
 	{},
 };

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8bb35deff5ec..1e285b0cfba3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1411,14 +1411,17 @@ EXPORT_SYMBOL(blk_mq_requeue_request);

 static void blk_mq_requeue_work(struct work_struct *work)
 {
-	struct request_queue *q =
-		container_of(work, struct request_queue, requeue_work.work);
+	struct blk_mq_hw_ctx *hctx =
+		container_of(work, struct blk_mq_hw_ctx, requeue_work.work);
 	LIST_HEAD(rq_list);
 	struct request *rq, *next;

-	spin_lock_irq(&q->requeue_lock);
-	list_splice_init(&q->requeue_list, &rq_list);
-	spin_unlock_irq(&q->requeue_lock);
+	if (list_empty_careful(&hctx->requeue_list))
+		return;
+
+	spin_lock_irq(&hctx->requeue_lock);
+	list_splice_init(&hctx->requeue_list, &rq_list);
+	spin_unlock_irq(&hctx->requeue_lock);

 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
 		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
@@ -1435,13 +1438,15 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		blk_mq_sched_insert_request(rq, false, false, false);
 	}

-	blk_mq_run_hw_queues(q, false);
+	blk_mq_run_hw_queue(hctx, false);
 }

 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 				bool kick_requeue_list)
 {
 	struct request_queue *q = rq->q;
+	struct blk_mq_hw_ctx *hctx =
+		rq->mq_hctx ?: q->queue_ctx[0].hctxs[HCTX_TYPE_DEFAULT];
 	unsigned long flags;

 	/*
@@ -1450,14 +1455,14 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 	 */
 	BUG_ON(rq->rq_flags & RQF_SOFTBARRIER);

-	spin_lock_irqsave(&q->requeue_lock, flags);
+	spin_lock_irqsave(&hctx->requeue_lock, flags);
 	if (at_head) {
 		rq->rq_flags |= RQF_SOFTBARRIER;
-		list_add(&rq->queuelist, &q->requeue_list);
+		list_add(&rq->queuelist, &hctx->requeue_list);
 	} else {
-		list_add_tail(&rq->queuelist, &q->requeue_list);
+		list_add_tail(&rq->queuelist, &hctx->requeue_list);
 	}
-	spin_unlock_irqrestore(&q->requeue_lock, flags);
+	spin_unlock_irqrestore(&hctx->requeue_lock, flags);

 	if (kick_requeue_list)
 		blk_mq_kick_requeue_list(q);
@@ -1465,15 +1470,25 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,

 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, 0);
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
+					    &hctx->requeue_work, 0);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);

 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work,
-				    msecs_to_jiffies(msecs));
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+
+	queue_for_each_hw_ctx(q, hctx, i)
+		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
+					    &hctx->requeue_work,
+					    msecs_to_jiffies(msecs));
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);

@@ -3595,6 +3610,10 @@ static int blk_mq_init_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned hctx_idx)
 {
+	INIT_DELAYED_WORK(&hctx->requeue_work, blk_mq_requeue_work);
+	INIT_LIST_HEAD(&hctx->requeue_list);
+	spin_lock_init(&hctx->requeue_lock);
+
 	hctx->queue_num = hctx_idx;

 	if (!(hctx->flags & BLK_MQ_F_STACKING))
@@ -4210,10 +4229,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
 	blk_mq_update_poll_flag(q);

-	INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
-	INIT_LIST_HEAD(&q->requeue_list);
-	spin_lock_init(&q->requeue_lock);
-
 	q->nr_requests = set->queue_depth;

 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
@@ -4758,10 +4773,10 @@ void blk_mq_cancel_work_sync(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;

-	cancel_delayed_work_sync(&q->requeue_work);
-
-	queue_for_each_hw_ctx(q, hctx, i)
+	queue_for_each_hw_ctx(q, hctx, i) {
+		cancel_delayed_work_sync(&hctx->requeue_work);
 		cancel_delayed_work_sync(&hctx->run_work);
+	}
 }

 static int __init blk_mq_init(void)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 3a3bee9085e3..0157f1569980 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -311,6 +311,10 @@ struct blk_mq_hw_ctx {
 		unsigned long		state;
 	} ____cacheline_aligned_in_smp;

+	struct list_head	requeue_list;
+	spinlock_t		requeue_lock;
+	struct delayed_work	requeue_work;
+
 	/**
 	 * @run_work: Used for scheduling a hardware queue run at a later time.
 	 */

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e3242e67a8e3..f5fa53cd13bd 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -491,10 +491,6 @@ struct request_queue {
 	 */
 	struct blk_flush_queue	*fq;

-	struct list_head	requeue_list;
-	spinlock_t		requeue_lock;
-	struct delayed_work	requeue_work;
-
 	struct mutex		sysfs_lock;
 	struct mutex		sysfs_dir_lock;
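
[ A compilable sketch of the data-structure move in this patch, with stand-in
  types: the requeue list, its lock, and the requeue work item now live in
  each hardware queue context instead of in the shared request_queue. ]

    #include <pthread.h>

    struct list_node { struct list_node *next, *prev; };

    /* Stand-in for struct blk_mq_hw_ctx after this patch. */
    struct hw_queue_ctx {
        struct list_node   requeue_list; /* was request_queue::requeue_list */
        pthread_spinlock_t requeue_lock; /* protects requeue_list */
        /* the delayed requeue work item also moves here */
    };

    int main(void)
    {
        struct hw_queue_ctx hctx;

        pthread_spin_init(&hctx.requeue_lock, PTHREAD_PROCESS_PRIVATE);
        hctx.requeue_list.next = hctx.requeue_list.prev = &hctx.requeue_list;
        return 0;
    }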

From patchwork Fri Apr 7 00:17:04 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 06/12] block: Preserve the order of requeued requests
Date: Thu, 6 Apr 2023 17:17:04 -0700
Message-Id: <20230407001710.104169-7-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

If a queue is run before all requeued requests have been sent to the I/O
scheduler, the I/O scheduler may dispatch the wrong request. Fix this by
making __blk_mq_run_hw_queue() process the requeue_list instead of
blk_mq_requeue_work().

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c         | 35 ++++++++++-------------------------
 include/linux/blk-mq.h |  1 -
 2 files changed, 10 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1e285b0cfba3..2cf317d49f56 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -64,6 +64,7 @@ static inline blk_qc_t blk_rq_to_qc(struct request *rq)
 static bool blk_mq_hctx_has_pending(struct blk_mq_hw_ctx *hctx)
 {
 	return !list_empty_careful(&hctx->dispatch) ||
+		!list_empty_careful(&hctx->requeue_list) ||
 		sbitmap_any_bit_set(&hctx->ctx_map) ||
 			blk_mq_sched_has_work(hctx);
 }
@@ -1409,10 +1410,8 @@ void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
 }
 EXPORT_SYMBOL(blk_mq_requeue_request);

-static void blk_mq_requeue_work(struct work_struct *work)
+static void blk_mq_process_requeue_list(struct blk_mq_hw_ctx *hctx)
 {
-	struct blk_mq_hw_ctx *hctx =
-		container_of(work, struct blk_mq_hw_ctx, requeue_work.work);
 	LIST_HEAD(rq_list);
 	struct request *rq, *next;

@@ -1437,8 +1436,6 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		list_del_init(&rq->queuelist);
 		blk_mq_sched_insert_request(rq, false, false, false);
 	}
-
-	blk_mq_run_hw_queue(hctx, false);
 }

 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
@@ -1465,30 +1462,19 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 	spin_unlock_irqrestore(&hctx->requeue_lock, flags);

 	if (kick_requeue_list)
-		blk_mq_kick_requeue_list(q);
+		blk_mq_run_hw_queue(hctx, /*async=*/true);
 }

 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	struct blk_mq_hw_ctx *hctx;
-	unsigned long i;
-
-	queue_for_each_hw_ctx(q, hctx, i)
-		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
-					    &hctx->requeue_work, 0);
+	blk_mq_run_hw_queues(q, true);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);

 void blk_mq_delay_kick_requeue_list(struct request_queue *q,
 				    unsigned long msecs)
 {
-	struct blk_mq_hw_ctx *hctx;
-	unsigned long i;
-
-	queue_for_each_hw_ctx(q, hctx, i)
-		kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND,
-					    &hctx->requeue_work,
-					    msecs_to_jiffies(msecs));
+	blk_mq_delay_run_hw_queues(q, msecs);
 }
 EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);

@@ -2148,6 +2134,8 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 	 */
 	WARN_ON_ONCE(in_interrupt());

+	blk_mq_process_requeue_list(hctx);
+
 	blk_mq_run_dispatch_ops(hctx->queue,
 			blk_mq_sched_dispatch_requests(hctx));
 }
@@ -2319,7 +2307,7 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
 		 * scheduler.
 		 */
 		if (!sq_hctx || sq_hctx == hctx ||
-		    !list_empty_careful(&hctx->dispatch))
+		    blk_mq_hctx_has_pending(hctx))
 			blk_mq_run_hw_queue(hctx, async);
 	}
 }
@@ -2355,7 +2343,7 @@ void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs)
 		 * scheduler.
 		 */
 		if (!sq_hctx || sq_hctx == hctx ||
-		    !list_empty_careful(&hctx->dispatch))
+		    blk_mq_hctx_has_pending(hctx))
 			blk_mq_delay_run_hw_queue(hctx, msecs);
 	}
 }
@@ -3610,7 +3598,6 @@ static int blk_mq_init_hctx(struct request_queue *q,
 		struct blk_mq_tag_set *set,
 		struct blk_mq_hw_ctx *hctx, unsigned hctx_idx)
 {
-	INIT_DELAYED_WORK(&hctx->requeue_work, blk_mq_requeue_work);
 	INIT_LIST_HEAD(&hctx->requeue_list);
 	spin_lock_init(&hctx->requeue_lock);

@@ -4773,10 +4760,8 @@ void blk_mq_cancel_work_sync(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;

-	queue_for_each_hw_ctx(q, hctx, i) {
-		cancel_delayed_work_sync(&hctx->requeue_work);
+	queue_for_each_hw_ctx(q, hctx, i)
 		cancel_delayed_work_sync(&hctx->run_work);
-	}
 }

 static int __init blk_mq_init(void)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 0157f1569980..e62feb17af96 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -313,7 +313,6 @@ struct blk_mq_hw_ctx {

 	struct list_head	requeue_list;
 	spinlock_t		requeue_lock;
-	struct delayed_work	requeue_work;

 	/**
 	 * @run_work: Used for scheduling a hardware queue run at a later time.
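
[ A toy model of the ordering problem this patch fixes; all names and data
  are stand-ins. Draining the requeue list into the scheduler before asking
  it to dispatch guarantees that an earlier requeued zoned write cannot be
  overtaken by a later one that is already in the scheduler. ]

    #include <stdio.h>

    static int sched[8], nsched;       /* LBAs pending in the elevator */
    static int requeued[8], nrequeued; /* LBAs bounced back by the driver */

    static void process_requeue_list(void)
    {
        for (int i = 0; i < nrequeued; i++)
            sched[nsched++] = requeued[i];
        nrequeued = 0;
    }

    static int dispatch_lowest_lba(void)
    {
        int best = 0;

        for (int i = 1; i < nsched; i++)
            if (sched[i] < sched[best])
                best = i;
        int lba = sched[best];
        sched[best] = sched[--nsched];
        return lba;
    }

    int main(void)
    {
        sched[nsched++] = 200;       /* later write already queued */
        requeued[nrequeued++] = 100; /* earlier write was requeued */

        process_requeue_list();      /* without this step, 200 wins */
        printf("dispatching LBA %d first\n", dispatch_lowest_lba());
        return 0;
    }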

From patchwork Fri Apr 7 00:17:05 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 07/12] block: Make it easier to debug zoned write reordering
Date: Thu, 6 Apr 2023 17:17:05 -0700
Message-Id: <20230407001710.104169-8-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Issue a kernel warning if reordering could happen.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2cf317d49f56..07426dbbe720 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2480,6 +2480,8 @@ void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;

+	WARN_ON_ONCE(rq->q->elevator && blk_rq_is_seq_zoned_write(rq));
+
 	spin_lock(&hctx->lock);
 	if (at_head)
 		list_add(&rq->queuelist, &hctx->dispatch);
@@ -2572,6 +2574,8 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	bool run_queue = true;
 	int budget_token;

+	WARN_ON_ONCE(q->elevator && blk_rq_is_seq_zoned_write(rq));
+
 	/*
 	 * RCU or SRCU read lock is needed before checking quiesced flag.
 	 *

From patchwork Fri Apr 7 00:17:06 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 08/12] block: mq-deadline: Simplify deadline_skip_seq_writes()
Date: Thu, 6 Apr 2023 17:17:06 -0700
Message-Id: <20230407001710.104169-9-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Make deadline_skip_seq_writes() shorter without changing its functionality.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d885ccf49170..50a9d3b0a291 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -312,12 +312,9 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 					struct request *rq)
 {
 	sector_t pos = blk_rq_pos(rq);
-	sector_t skipped_sectors = 0;

-	while (rq) {
-		if (blk_rq_pos(rq) != pos + skipped_sectors)
-			break;
-		skipped_sectors += blk_rq_sectors(rq);
+	while (rq && blk_rq_pos(rq) == pos) {
+		pos += blk_rq_sectors(rq);
 		rq = deadline_latter_request(rq);
 	}

From patchwork Fri Apr 7 00:17:07 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 09/12] block: mq-deadline: Disable head insertion for zoned writes
Date: Thu, 6 Apr 2023 17:17:07 -0700
Message-Id: <20230407001710.104169-10-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Make sure that zoned writes are submitted in LBA order.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 50a9d3b0a291..891ee0da73ac 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -798,7 +798,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,

 	trace_block_rq_insert(rq);

-	if (at_head) {
+	if (at_head && !blk_rq_is_seq_zoned_write(rq)) {
 		list_add(&rq->queuelist, &per_prio->dispatch);
 		rq->fifo_time = jiffies;
 	} else {

From patchwork Fri Apr 7 00:17:08 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 10/12] block: mq-deadline: Introduce a local variable
Date: Thu, 6 Apr 2023 17:17:08 -0700
Message-Id: <20230407001710.104169-11-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Prepare for adding more code that uses the request queue pointer.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 891ee0da73ac..8c2bc9fdcf8c 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -368,6 +368,7 @@ static struct request *
 deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		      enum dd_data_dir data_dir)
 {
+	struct request_queue *q;
 	struct request *rq;
 	unsigned long flags;

@@ -375,7 +376,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	if (!rq)
 		return NULL;

-	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+	q = rq->q;
+	if (data_dir == DD_READ || !blk_queue_is_zoned(q))
 		return rq;

 	/*
@@ -389,7 +391,7 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	while (rq) {
 		if (blk_req_can_dispatch_to_zone(rq))
 			break;
-		if (blk_queue_nonrot(rq->q))
+		if (blk_queue_nonrot(q))
 			rq = deadline_latter_request(rq);
 		else
 			rq = deadline_skip_seq_writes(dd, rq);

From patchwork Fri Apr 7 00:17:09 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal,
    Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 11/12] block: mq-deadline: Fix a race condition related to zoned writes
Date: Thu, 6 Apr 2023 17:17:09 -0700
Message-Id: <20230407001710.104169-12-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>

Let deadline_next_request() only consider the first zoned write per zone.
This patch fixes a race condition between deadline_next_request() and the
completion of zoned writes.

Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 8c2bc9fdcf8c..d49e20d3011d 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -389,12 +389,30 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	 */
 	spin_lock_irqsave(&dd->zone_lock, flags);
 	while (rq) {
+		unsigned int zno = blk_rq_zone_no(rq);
+
 		if (blk_req_can_dispatch_to_zone(rq))
 			break;
-		if (blk_queue_nonrot(q))
-			rq = deadline_latter_request(rq);
-		else
+
+		WARN_ON_ONCE(!blk_queue_is_zoned(q));
+
+		if (!blk_queue_nonrot(q)) {
 			rq = deadline_skip_seq_writes(dd, rq);
+			if (!rq)
+				break;
+			rq = deadline_earlier_request(rq);
+			if (WARN_ON_ONCE(!rq))
+				break;
+		}
+
+		/*
+		 * Skip all other write requests for the zone with zone number
+		 * 'zno'. This prevents this function from selecting a zoned
+		 * write that is not the first write for a given zone.
From patchwork Fri Apr 7 00:17:10 2023
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Christoph Hellwig, Damien Le Moal, Ming Lei, Mike Snitzer, Jaegeuk Kim, Bart Van Assche
Subject: [PATCH 12/12] block: mq-deadline: Handle requeued requests correctly
Date: Thu, 6 Apr 2023 17:17:10 -0700
Message-Id: <20230407001710.104169-13-bvanassche@acm.org>
In-Reply-To: <20230407001710.104169-1-bvanassche@acm.org>
References: <20230407001710.104169-1-bvanassche@acm.org>
X-Mailing-List: linux-block@vger.kernel.org

If a zoned write is requeued with an LBA lower than that of zoned writes
that have already been inserted, make sure it is submitted first.
Cc: Damien Le Moal
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Mike Snitzer
Signed-off-by: Bart Van Assche
---
 block/mq-deadline.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index d49e20d3011d..2e046ad8ca2c 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -162,8 +162,19 @@ static void
 deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
 {
 	struct rb_root *root = deadline_rb_root(per_prio, rq);
+	struct request **next_rq = &per_prio->next_rq[rq_data_dir(rq)];

 	elv_rb_add(root, rq);
+	if (*next_rq == NULL || !blk_queue_is_zoned(rq->q))
+		return;
+	/*
+	 * If a request got requeued or requests have been submitted out of
+	 * order, make sure that, per zone, the request with the lowest LBA
+	 * is submitted first.
+	 */
+	if (blk_rq_pos(rq) < blk_rq_pos(*next_rq) &&
+	    blk_rq_zone_no(rq) == blk_rq_zone_no(*next_rq))
+		*next_rq = rq;
 }

 static inline void
@@ -822,6 +833,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		list_add(&rq->queuelist, &per_prio->dispatch);
 		rq->fifo_time = jiffies;
 	} else {
+		struct list_head *insert_before;
+
 		deadline_add_rq_rb(per_prio, rq);

 		if (rq_mergeable(rq)) {
@@ -834,7 +847,20 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		 * set expire time and add to fifo list
 		 */
 		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
-		list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
+		insert_before = &per_prio->fifo_list[data_dir];
+		if (blk_rq_is_seq_zoned_write(rq)) {
+			const unsigned int zno = blk_rq_zone_no(rq);
+			struct request *prev = rq;
+
+			while ((prev = deadline_earlier_request(prev))) {
+				if (blk_rq_zone_no(prev) != zno)
+					continue;
+				if (blk_rq_pos(rq) >= blk_rq_pos(prev))
+					break;
+				insert_before = &prev->queuelist;
+			}
+		}
+		list_add_tail(&rq->queuelist, insert_before);
 	}
 }
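The fifo insertion above scans backwards from the new request: writes
for other zones are ignored, the scan stops at a same-zone write with a
lower or equal LBA, and the insertion point moves in front of every
same-zone write with a higher LBA. A minimal userspace model of that
scan, using an array as the fifo and illustrative names (the scheduler
itself walks a linked list via deadline_earlier_request()):

#include <stdio.h>
#include <string.h>

/* Illustrative model of a queued zoned write (not a kernel structure). */
struct zwrite {
	unsigned int zone;
	unsigned long long pos; /* start LBA */
};

/*
 * Scan backwards from the fifo tail: remember the earliest same-zone
 * write with a higher LBA, stop at a same-zone write that is already in
 * order, and ignore writes for other zones.
 */
static size_t insertion_index(const struct zwrite *fifo, size_t n,
			      struct zwrite rq)
{
	size_t idx = n; /* default: append at the tail */
	size_t i = n;

	while (i-- > 0) {
		if (fifo[i].zone != rq.zone)
			continue;	/* other zones do not constrain rq */
		if (rq.pos >= fifo[i].pos)
			break;		/* rq is in order after this write */
		idx = i;		/* rq must go before this write */
	}
	return idx;
}

int main(void)
{
	struct zwrite fifo[8] = {
		{ .zone = 5, .pos = 100 },
		{ .zone = 7, .pos = 300 },
		{ .zone = 5, .pos = 110 },
	};
	size_t n = 3;
	struct zwrite requeued = { .zone = 5, .pos = 90 };
	size_t idx = insertion_index(fifo, n, requeued);

	/* The requeued write lands in front of both zone 5 writes. */
	memmove(&fifo[idx + 1], &fifo[idx], (n - idx) * sizeof(fifo[0]));
	fifo[idx] = requeued;
	n++;

	for (size_t i = 0; i < n; i++)
		printf("zone %u LBA %llu\n", fifo[i].zone, fifo[i].pos);
	return 0;
}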