From patchwork Mon Mar 20 23:49:03 2023
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 13182048
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Bart Van Assche, Christoph Hellwig, Ming Lei, Damien Le Moal, Johannes Thumshirn
Subject: [PATCH v2 1/3] block: Split blk_recalc_rq_segments()
Date: Mon, 20 Mar 2023 16:49:03 -0700
Message-Id: <20230320234905.3832131-2-bvanassche@acm.org>
In-Reply-To: <20230320234905.3832131-1-bvanassche@acm.org>
References: <20230320234905.3832131-1-bvanassche@acm.org>

Prepare for adding an additional call to bio_chain_nr_segments().
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Damien Le Moal
Cc: Johannes Thumshirn
Signed-off-by: Bart Van Assche
---
 block/blk-merge.c | 26 ++++++++++++++++----------
 block/blk.h       |  2 ++
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 65e75efa9bd3..d6f8552ef209 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -409,22 +409,22 @@ struct bio *bio_split_to_limits(struct bio *bio)
 }
 EXPORT_SYMBOL(bio_split_to_limits);
 
-unsigned int blk_recalc_rq_segments(struct request *rq)
+/* Calculate the number of DMA segments in a bio chain. */
+unsigned int bio_chain_nr_segments(struct bio *bio,
+				   const struct queue_limits *lim)
 {
 	unsigned int nr_phys_segs = 0;
 	unsigned int bytes = 0;
-	struct req_iterator iter;
+	struct bvec_iter iter;
 	struct bio_vec bv;
 
-	if (!rq->bio)
+	if (!bio)
 		return 0;
 
-	switch (bio_op(rq->bio)) {
+	switch (bio_op(bio)) {
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
-		if (queue_max_discard_segments(rq->q) > 1) {
-			struct bio *bio = rq->bio;
-
+		if (lim->max_discard_segments > 1) {
 			for_each_bio(bio)
 				nr_phys_segs++;
 			return nr_phys_segs;
@@ -436,12 +436,18 @@ unsigned int blk_recalc_rq_segments(struct request *rq)
 		break;
 	}
 
-	rq_for_each_bvec(bv, rq, iter)
-		bvec_split_segs(&rq->q->limits, &bv, &nr_phys_segs, &bytes,
-				UINT_MAX, UINT_MAX);
+	for_each_bio(bio)
+		bio_for_each_bvec(bv, bio, iter)
+			bvec_split_segs(lim, &bv, &nr_phys_segs, &bytes,
+					UINT_MAX, UINT_MAX);
 	return nr_phys_segs;
 }
 
+unsigned int blk_recalc_rq_segments(struct request *rq)
+{
+	return bio_chain_nr_segments(rq->bio, &rq->q->limits);
+}
+
 static inline struct scatterlist *blk_next_sg(struct scatterlist **sg,
 		struct scatterlist *sglist)
 {
diff --git a/block/blk.h b/block/blk.h
index d65d96994a94..ea15b1a4c2b7 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -330,6 +330,8 @@ int ll_back_merge_fn(struct request *req, struct bio *bio,
 		unsigned int nr_segs);
 bool blk_attempt_req_merge(struct request_queue *q, struct request *rq,
 			   struct request *next);
+unsigned int bio_chain_nr_segments(struct bio *bio,
+				   const struct queue_limits *lim);
 unsigned int blk_recalc_rq_segments(struct request *rq);
 void blk_rq_set_mixed_merge(struct request *rq);
 bool blk_rq_merge_ok(struct request *rq, struct bio *bio);
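
For context, a sketch (not part of the patch) of how the new helper could
be called by code that owns a bio chain rather than a struct request;
my_count_chain_segments() is a hypothetical function invented for this
illustration:

/*
 * Hypothetical caller, for illustration only: count the DMA segments of
 * a bio chain against the limits of the queue it will be submitted to.
 */
static unsigned int my_count_chain_segments(struct request_queue *q,
					    struct bio *bio)
{
	/* bio_chain_nr_segments() walks every bio in the chain. */
	return bio_chain_nr_segments(bio, &q->limits);
}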
From patchwork Mon Mar 20 23:49:04 2023
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 13182047
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Bart Van Assche, Christoph Hellwig, Ming Lei, Damien Le Moal, Johannes Thumshirn
Subject: [PATCH v2 2/3] block: Split and submit bios in LBA order
Date: Mon, 20 Mar 2023 16:49:04 -0700
Message-Id: <20230320234905.3832131-3-bvanassche@acm.org>
In-Reply-To: <20230320234905.3832131-1-bvanassche@acm.org>
References: <20230320234905.3832131-1-bvanassche@acm.org>

Submit the bio fragment with the lowest LBA first. This approach prevents
write errors when submitting large bios to host-managed zoned block
devices. This patch only modifies the behavior of drivers that call
bio_split_to_limits() directly. This includes DRBD, pktcdvd, dm, md and
the NVMe multipath code.

Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Damien Le Moal
Cc: Johannes Thumshirn
Signed-off-by: Bart Van Assche
---
 block/blk-merge.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d6f8552ef209..7281f2d91b2f 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -345,8 +345,8 @@ EXPORT_SYMBOL_GPL(bio_split_rw);
  * @nr_segs: returns the number of segments in the returned bio
  *
  * Check if @bio needs splitting based on the queue limits, and if so split off
- * a bio fitting the limits from the beginning of @bio and return it. @bio is
- * shortened to the remainder and re-submitted.
+ * a bio fitting the limits from the beginning of @bio. @bio is shortened to
+ * the remainder.
  *
  * The split bio is allocated from @q->bio_split, which is provided by the
  * block layer.
@@ -379,10 +379,23 @@ struct bio *__bio_split_to_limits(struct bio *bio,
 		split->bi_opf |= REQ_NOMERGE;
 
 		blkcg_bio_issue_init(split);
-		bio_chain(split, bio);
 		trace_block_split(split, bio->bi_iter.bi_sector);
-		submit_bio_noacct(bio);
-		return split;
+		if (current->bio_list) {
+			/*
+			 * The caller will submit the first half ('split')
+			 * before the second half ('bio').
+			 */
+			bio_chain(split, bio);
+			submit_bio_noacct(bio);
+			return split;
+		}
+		/*
+		 * Submit the first half ('split') and let the caller submit
+		 * the second half ('bio').
+		 */
+		*nr_segs = bio_chain_nr_segments(bio, lim);
+		bio_chain(split, bio);
+		submit_bio_noacct(split);
 	}
 	return bio;
 }
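
As I read the patch, the calling convention now looks roughly as follows
from a stacking driver's perspective (a sketch, not from the patch;
my_handle_bio() is a hypothetical handler):

/*
 * Hypothetical ->submit_bio handler, for illustration only. With this
 * patch, bio_split_to_limits() still returns the first fragment when it
 * runs inside submit_bio_noacct() (current->bio_list is set) and queues
 * the remainder; otherwise it submits the first fragment (lowest LBA)
 * itself and returns the remainder to the caller. Either way fragments
 * are issued in LBA order.
 */
static void my_handle_bio(struct bio *bio)
{
	bio = bio_split_to_limits(bio);
	if (!bio)
		return;		/* the split failed; 'bio' has been ended */

	/* ... remap and issue the returned bio ... */
}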
From patchwork Mon Mar 20 23:49:05 2023
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 13182046
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, Jaegeuk Kim, Bart Van Assche, Christoph Hellwig, Ming Lei, Damien Le Moal, Johannes Thumshirn
Subject: [PATCH v2 3/3] block: Preserve LBA order when requeuing
Date: Mon, 20 Mar 2023 16:49:05 -0700
Message-Id: <20230320234905.3832131-4-bvanassche@acm.org>
In-Reply-To: <20230320234905.3832131-1-bvanassche@acm.org>
References: <20230320234905.3832131-1-bvanassche@acm.org>

When requeuing a request to a zoned block device, preserve the LBA order
per zone.
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Damien Le Moal
Cc: Johannes Thumshirn
Signed-off-by: Bart Van Assche
---
 block/blk-mq.c | 45 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index cc32ad0cd548..2ec7d6140114 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1495,6 +1495,44 @@ static void blk_mq_requeue_work(struct work_struct *work)
 	blk_mq_run_hw_queues(q, false);
 }
 
+static void blk_mq_insert_rq(struct request *rq, struct list_head *list,
+			     bool at_head)
+{
+	bool zone_in_list = false;
+	struct request *rq2;
+
+	/*
+	 * For request queues associated with a zoned block device, check
+	 * whether another request for the same zone has already been queued.
+	 */
+	if (blk_queue_is_zoned(rq->q)) {
+		const unsigned int zno = blk_rq_zone_no(rq);
+
+		list_for_each_entry(rq2, list, queuelist) {
+			if (blk_rq_zone_no(rq2) == zno) {
+				zone_in_list = true;
+				if (blk_rq_pos(rq) < blk_rq_pos(rq2))
+					break;
+			}
+		}
+	}
+	if (!zone_in_list) {
+		if (at_head) {
+			rq->rq_flags |= RQF_SOFTBARRIER;
+			list_add(&rq->queuelist, list);
+		} else {
+			list_add_tail(&rq->queuelist, list);
+		}
+	} else {
+		/*
+		 * Insert the request in the list before another request for
+		 * the same zone and with a higher LBA. If there is no such
+		 * request, insert the request at the end of the list.
+		 */
+		list_add_tail(&rq->queuelist, &rq2->queuelist);
+	}
+}
+
 void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 				bool kick_requeue_list)
 {
@@ -1508,12 +1546,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 	BUG_ON(rq->rq_flags & RQF_SOFTBARRIER);
 
 	spin_lock_irqsave(&q->requeue_lock, flags);
-	if (at_head) {
-		rq->rq_flags |= RQF_SOFTBARRIER;
-		list_add(&rq->queuelist, &q->requeue_list);
-	} else {
-		list_add_tail(&rq->queuelist, &q->requeue_list);
-	}
+	blk_mq_insert_rq(rq, &q->requeue_list, at_head);
 	spin_unlock_irqrestore(&q->requeue_lock, flags);
 
 	if (kick_requeue_list)
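
To make the intended ordering concrete, here is a small user-space model
of the per-zone ordered insert (illustration only; it uses a simplified
singly linked list and hypothetical names like 'struct rq' and
insert_rq() rather than the kernel's struct list_head API):

/* User-space model of the ordering blk_mq_insert_rq() aims for. */
#include <stdio.h>

struct rq {
	unsigned int zone;
	unsigned long sector;
	struct rq *next;
};

/*
 * Insert 'rq' before the first same-zone request with a higher sector.
 * If the zone is already in the list but no higher sector follows, or
 * if at_head is false, append at the tail; at_head only applies when no
 * request for rq->zone is listed, mirroring the patch's logic.
 */
static void insert_rq(struct rq **head, struct rq *rq, int at_head)
{
	struct rq **pp;
	int zone_in_list = 0;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if ((*pp)->zone == rq->zone) {
			zone_in_list = 1;
			if (rq->sector < (*pp)->sector)
				break;	/* insert before *pp */
		}
	}
	if (!zone_in_list && at_head) {
		rq->next = *head;	/* like list_add() */
		*head = rq;
		return;
	}
	/* If the loop ran to completion, *pp is the tail: append. */
	rq->next = *pp;
	*pp = rq;
}

int main(void)
{
	struct rq a = {1, 100}, b = {2, 160}, c = {2, 176}, d = {2, 168};
	struct rq *head = NULL, *p;

	insert_rq(&head, &a, 0);
	insert_rq(&head, &b, 0);
	insert_rq(&head, &c, 0);
	insert_rq(&head, &d, 0);	/* lands between sectors 160 and 176 */

	for (p = head; p; p = p->next)
		printf("(z%u, %lu) ", p->zone, p->sector);
	printf("\n");	/* (z1, 100) (z2, 160) (z2, 168) (z2, 176) */
	return 0;
}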