From patchwork Mon Aug 26 17:37:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13778300 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48705184525; Mon, 26 Aug 2024 17:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693913; cv=none; b=cHvQPizOuqtGbpjrydlZIbPRDjch8HfyLsiBYPo67fi04tRsq0i2Y5hBkukqCjSsHjRqEF9azX0izDoL7ap+/iEQkyb9ufx6DD9mCZncQoJSOmISY64KAWXMw8IahPy4LmbAiCwC2cBOuNJ1HR/luvy7HBxRuRpLvfqEOE3hv38= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693913; c=relaxed/simple; bh=F+H5FTYeSNVfv0zKzC3k0UrT+M7zEM7OQ+VwpJ7oNqw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=okqqjktNW5kTNFr9VNnlrxpwnwbbeldktF9DKCytflhhRn5tpMit/BXor8SXo8H86EsUoNFb2jIFNQiNAGSCcBKm/xM5QidwwexO5VOSbH+GaPPGuAn5wDLA1cIvAYAo8XVhY3+/5brQP7UEhyzbLVW1JTgtOD2sJgE8E2BFDlw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=VByE0FTy; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="VByE0FTy" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=c0a9yXoYMZRPSdb82OW+51SC4gEQOgkG7w9Uj6dBOaQ=; b=VByE0FTyTDYd6ZMCrzNorC7eN1 beg4KuMqy+xLs2+34NpQ1gdMiCh89BpNAXGeIMqIKcMpr6mVc0uMq+4bfUAt+nzTZJPQrYfkrlqRa l5r6VHRoapJ4LNtMl7Nr8QjK4Hx7sKpnlH8BBpf1ENy3MdU9L5VBBHkHBNdCrQHwP5GaLue3aiBHS mJs7qYDEs12ukikKTyFNn7peCEUO3kPlOAxRguKY85mC+CYC+9YVOpzqTwZJSdC+4eSb4xe+p3JUv yNhOc8ljwXJNQAdGwivEfTs5Mbhc6NejhQXdS0C/bi3UyX5zc/XcRxzP0XqSbg9XC9GX0grTIw+kX +bMY143A==; Received: from 2a02-8389-2341-5b80-a8a7-5c16-efec-d7f9.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:a8a7:5c16:efec:d7f9] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sidfg-00000008Fw9-3oFG; Mon, 26 Aug 2024 17:38:29 +0000 From: Christoph Hellwig To: Jens Axboe Cc: Chris Mason , Josef Bacik , David Sterba , Hans Holmberg , Damien Le Moal , Shin'ichiro Kawasaki , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [PATCH 1/4] block: rework bio splitting Date: Mon, 26 Aug 2024 19:37:54 +0200 Message-ID: <20240826173820.1690925-2-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240826173820.1690925-1-hch@lst.de> References: <20240826173820.1690925-1-hch@lst.de> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html The current setup with bio_may_exceed_limit and __bio_split_to_limits is a bit of a mess. Change it so that __bio_split_to_limits does all the work and is just a variant of bio_split_to_limits that returns nr_segs. This is done by inlining it and instead have the various bio_split_* helpers directly submit the potentially split bios. To support btrfs, the rw version has a lower level helper split out that just returns the offset to split. This turns out to nicely clean up the btrfs flow as well. Signed-off-by: Christoph Hellwig Acked-by: David Sterba Reviewed-by: Damien Le Moal --- block/blk-merge.c | 146 +++++++++++++++++--------------------------- block/blk-mq.c | 11 ++-- block/blk.h | 63 +++++++++++++------ fs/btrfs/bio.c | 30 +++++---- include/linux/bio.h | 4 +- 5 files changed, 125 insertions(+), 129 deletions(-) diff --git a/block/blk-merge.c b/block/blk-merge.c index de5281bcadc538..c7222c4685e060 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -105,9 +105,33 @@ static unsigned int bio_allowed_max_sectors(const struct queue_limits *lim) return round_down(UINT_MAX, lim->logical_block_size) >> SECTOR_SHIFT; } -static struct bio *bio_split_discard(struct bio *bio, - const struct queue_limits *lim, - unsigned *nsegs, struct bio_set *bs) +static struct bio *bio_submit_split(struct bio *bio, int split_sectors) +{ + if (unlikely(split_sectors < 0)) { + bio->bi_status = errno_to_blk_status(split_sectors); + bio_endio(bio); + return NULL; + } + + if (split_sectors) { + struct bio *split; + + split = bio_split(bio, split_sectors, GFP_NOIO, + &bio->bi_bdev->bd_disk->bio_split); + split->bi_opf |= REQ_NOMERGE; + blkcg_bio_issue_init(split); + bio_chain(split, bio); + trace_block_split(split, bio->bi_iter.bi_sector); + WARN_ON_ONCE(bio_zone_write_plugging(bio)); + submit_bio_noacct(bio); + return split; + } + + return bio; +} + +struct bio *bio_split_discard(struct bio *bio, const struct queue_limits *lim, + unsigned *nsegs) { unsigned int max_discard_sectors, granularity; sector_t tmp; @@ -121,10 +145,10 @@ static struct bio *bio_split_discard(struct bio *bio, min(lim->max_discard_sectors, bio_allowed_max_sectors(lim)); max_discard_sectors -= max_discard_sectors % granularity; if (unlikely(!max_discard_sectors)) - return NULL; + return bio; if (bio_sectors(bio) <= max_discard_sectors) - return NULL; + return bio; split_sectors = max_discard_sectors; @@ -139,19 +163,18 @@ static struct bio *bio_split_discard(struct bio *bio, if (split_sectors > tmp) split_sectors -= tmp; - return bio_split(bio, split_sectors, GFP_NOIO, bs); + return bio_submit_split(bio, split_sectors); } -static struct bio *bio_split_write_zeroes(struct bio *bio, - const struct queue_limits *lim, - unsigned *nsegs, struct bio_set *bs) +struct bio *bio_split_write_zeroes(struct bio *bio, + const struct queue_limits *lim, unsigned *nsegs) { *nsegs = 0; if (!lim->max_write_zeroes_sectors) - return NULL; + return bio; if (bio_sectors(bio) <= lim->max_write_zeroes_sectors) - return NULL; - return bio_split(bio, lim->max_write_zeroes_sectors, GFP_NOIO, bs); + return bio; + return bio_submit_split(bio, lim->max_write_zeroes_sectors); } static inline unsigned int blk_boundary_sectors(const struct queue_limits *lim, @@ -274,27 +297,19 @@ static bool bvec_split_segs(const struct queue_limits *lim, } /** - * bio_split_rw - split a bio in two bios + * bio_split_rw_at - check if and where to split a read/write bio * @bio: [in] bio to be split * @lim: [in] queue limits to split based on * @segs: [out] number of segments in the bio with the first half of the sectors - * @bs: [in] bio set to allocate the clone from * @max_bytes: [in] maximum number of bytes per bio * - * Clone @bio, update the bi_iter of the clone to represent the first sectors - * of @bio and update @bio->bi_iter to represent the remaining sectors. The - * following is guaranteed for the cloned bio: - * - That it has at most @max_bytes worth of data - * - That it has at most queue_max_segments(@q) segments. - * - * Except for discard requests the cloned bio will point at the bi_io_vec of - * the original bio. It is the responsibility of the caller to ensure that the - * original bio is not freed before the cloned bio. The caller is also - * responsible for ensuring that @bs is only destroyed after processing of the - * split bio has finished. + * Find out if @bio needs to be split to fit the queue limits in @lim and a + * maximum size of @max_bytes. Returns a negative error number if @bio can't be + * split, 0 if the bio doesn't have to be split, or a positive sector offset if + * @bio needs to be split. */ -struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, - unsigned *segs, struct bio_set *bs, unsigned max_bytes) +int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim, + unsigned *segs, unsigned max_bytes) { struct bio_vec bv, bvprv, *bvprvp = NULL; struct bvec_iter iter; @@ -324,22 +339,17 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, } *segs = nsegs; - return NULL; + return 0; split: - if (bio->bi_opf & REQ_ATOMIC) { - bio->bi_status = BLK_STS_INVAL; - bio_endio(bio); - return ERR_PTR(-EINVAL); - } + if (bio->bi_opf & REQ_ATOMIC) + return -EINVAL; + /* * We can't sanely support splitting for a REQ_NOWAIT bio. End it * with EAGAIN if splitting is required and return an error pointer. */ - if (bio->bi_opf & REQ_NOWAIT) { - bio->bi_status = BLK_STS_AGAIN; - bio_endio(bio); - return ERR_PTR(-EAGAIN); - } + if (bio->bi_opf & REQ_NOWAIT) + return -EAGAIN; *segs = nsegs; @@ -356,58 +366,16 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, * big IO can be trival, disable iopoll when split needed. */ bio_clear_polled(bio); - return bio_split(bio, bytes >> SECTOR_SHIFT, GFP_NOIO, bs); + return bytes >> SECTOR_SHIFT; } -EXPORT_SYMBOL_GPL(bio_split_rw); +EXPORT_SYMBOL_GPL(bio_split_rw_at); -/** - * __bio_split_to_limits - split a bio to fit the queue limits - * @bio: bio to be split - * @lim: queue limits to split based on - * @nr_segs: returns the number of segments in the returned bio - * - * Check if @bio needs splitting based on the queue limits, and if so split off - * a bio fitting the limits from the beginning of @bio and return it. @bio is - * shortened to the remainder and re-submitted. - * - * The split bio is allocated from @q->bio_split, which is provided by the - * block layer. - */ -struct bio *__bio_split_to_limits(struct bio *bio, - const struct queue_limits *lim, - unsigned int *nr_segs) +struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, + unsigned *nr_segs) { - struct bio_set *bs = &bio->bi_bdev->bd_disk->bio_split; - struct bio *split; - - switch (bio_op(bio)) { - case REQ_OP_DISCARD: - case REQ_OP_SECURE_ERASE: - split = bio_split_discard(bio, lim, nr_segs, bs); - break; - case REQ_OP_WRITE_ZEROES: - split = bio_split_write_zeroes(bio, lim, nr_segs, bs); - break; - default: - split = bio_split_rw(bio, lim, nr_segs, bs, - get_max_io_size(bio, lim) << SECTOR_SHIFT); - if (IS_ERR(split)) - return NULL; - break; - } - - if (split) { - /* there isn't chance to merge the split bio */ - split->bi_opf |= REQ_NOMERGE; - - blkcg_bio_issue_init(split); - bio_chain(split, bio); - trace_block_split(split, bio->bi_iter.bi_sector); - WARN_ON_ONCE(bio_zone_write_plugging(bio)); - submit_bio_noacct(bio); - return split; - } - return bio; + return bio_submit_split(bio, + bio_split_rw_at(bio, lim, nr_segs, + get_max_io_size(bio, lim) << SECTOR_SHIFT)); } /** @@ -426,9 +394,7 @@ struct bio *bio_split_to_limits(struct bio *bio) const struct queue_limits *lim = &bdev_get_queue(bio->bi_bdev)->limits; unsigned int nr_segs; - if (bio_may_exceed_limits(bio, lim)) - return __bio_split_to_limits(bio, lim, &nr_segs); - return bio; + return __bio_split_to_limits(bio, lim, &nr_segs); } EXPORT_SYMBOL(bio_split_to_limits); diff --git a/block/blk-mq.c b/block/blk-mq.c index e3c3c0c21b5536..36abbaefe38749 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2939,7 +2939,7 @@ void blk_mq_submit_bio(struct bio *bio) struct blk_plug *plug = current->plug; const int is_sync = op_is_sync(bio->bi_opf); struct blk_mq_hw_ctx *hctx; - unsigned int nr_segs = 1; + unsigned int nr_segs; struct request *rq; blk_status_t ret; @@ -2981,11 +2981,10 @@ void blk_mq_submit_bio(struct bio *bio) goto queue_exit; } - if (unlikely(bio_may_exceed_limits(bio, &q->limits))) { - bio = __bio_split_to_limits(bio, &q->limits, &nr_segs); - if (!bio) - goto queue_exit; - } + bio = __bio_split_to_limits(bio, &q->limits, &nr_segs); + if (!bio) + goto queue_exit; + if (!bio_integrity_prep(bio)) goto queue_exit; diff --git a/block/blk.h b/block/blk.h index e180863f918b15..0d8cd64c126064 100644 --- a/block/blk.h +++ b/block/blk.h @@ -331,33 +331,58 @@ ssize_t part_timeout_show(struct device *, struct device_attribute *, char *); ssize_t part_timeout_store(struct device *, struct device_attribute *, const char *, size_t); -static inline bool bio_may_exceed_limits(struct bio *bio, - const struct queue_limits *lim) +struct bio *bio_split_discard(struct bio *bio, const struct queue_limits *lim, + unsigned *nsegs); +struct bio *bio_split_write_zeroes(struct bio *bio, + const struct queue_limits *lim, unsigned *nsegs); +struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, + unsigned *nr_segs); + +/* + * All drivers must accept single-segments bios that are smaller than PAGE_SIZE. + * + * This is a quick and dirty check that relies on the fact that bi_io_vec[0] is + * always valid if a bio has data. The check might lead to occasional false + * positives when bios are cloned, but compared to the performance impact of + * cloned bios themselves the loop below doesn't matter anyway. + */ +static inline bool bio_may_need_split(struct bio *bio, + const struct queue_limits *lim) +{ + return lim->chunk_sectors || bio->bi_vcnt != 1 || + bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE; +} + +/** + * __bio_split_to_limits - split a bio to fit the queue limits + * @bio: bio to be split + * @lim: queue limits to split based on + * @nr_segs: returns the number of segments in the returned bio + * + * Check if @bio needs splitting based on the queue limits, and if so split off + * a bio fitting the limits from the beginning of @bio and return it. @bio is + * shortened to the remainder and re-submitted. + * + * The split bio is allocated from @q->bio_split, which is provided by the + * block layer. + */ +static inline struct bio *__bio_split_to_limits(struct bio *bio, + const struct queue_limits *lim, unsigned int *nr_segs) { switch (bio_op(bio)) { + default: + if (bio_may_need_split(bio, lim)) + return bio_split_rw(bio, lim, nr_segs); + *nr_segs = 1; + return bio; case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: + return bio_split_discard(bio, lim, nr_segs); case REQ_OP_WRITE_ZEROES: - return true; /* non-trivial splitting decisions */ - default: - break; + return bio_split_write_zeroes(bio, lim, nr_segs); } - - /* - * All drivers must accept single-segments bios that are <= PAGE_SIZE. - * This is a quick and dirty check that relies on the fact that - * bi_io_vec[0] is always valid if a bio has data. The check might - * lead to occasional false negatives when bios are cloned, but compared - * to the performance impact of cloned bios themselves the loop below - * doesn't matter anyway. - */ - return lim->chunk_sectors || bio->bi_vcnt != 1 || - bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset > PAGE_SIZE; } -struct bio *__bio_split_to_limits(struct bio *bio, - const struct queue_limits *lim, - unsigned int *nr_segs); int ll_back_merge_fn(struct request *req, struct bio *bio, unsigned int nr_segs); bool blk_attempt_req_merge(struct request_queue *q, struct request *rq, diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index f04d931099601a..5ae7691849072b 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -73,20 +73,13 @@ struct btrfs_bio *btrfs_bio_alloc(unsigned int nr_vecs, blk_opf_t opf, static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info, struct btrfs_bio *orig_bbio, - u64 map_length, bool use_append) + u64 map_length) { struct btrfs_bio *bbio; struct bio *bio; - if (use_append) { - unsigned int nr_segs; - - bio = bio_split_rw(&orig_bbio->bio, &fs_info->limits, &nr_segs, - &btrfs_clone_bioset, map_length); - } else { - bio = bio_split(&orig_bbio->bio, map_length >> SECTOR_SHIFT, - GFP_NOFS, &btrfs_clone_bioset); - } + bio = bio_split(&orig_bbio->bio, map_length >> SECTOR_SHIFT, GFP_NOFS, + &btrfs_clone_bioset); bbio = btrfs_bio(bio); btrfs_bio_init(bbio, fs_info, NULL, orig_bbio); bbio->inode = orig_bbio->inode; @@ -664,6 +657,19 @@ static bool btrfs_wq_submit_bio(struct btrfs_bio *bbio, return true; } +static u64 btrfs_append_map_length(struct btrfs_bio *bbio, u64 map_length) +{ + unsigned int nr_segs; + int sector_offset; + + map_length = min(map_length, bbio->fs_info->max_zone_append_size); + sector_offset = bio_split_rw_at(&bbio->bio, &bbio->fs_info->limits, + &nr_segs, map_length); + if (sector_offset) + return sector_offset << SECTOR_SHIFT; + return map_length; +} + static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num) { struct btrfs_inode *inode = bbio->inode; @@ -691,10 +697,10 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num) map_length = min(map_length, length); if (use_append) - map_length = min(map_length, fs_info->max_zone_append_size); + map_length = btrfs_append_map_length(bbio, map_length); if (map_length < length) { - bbio = btrfs_split_bio(fs_info, bbio, map_length, use_append); + bbio = btrfs_split_bio(fs_info, bbio, map_length); bio = &bbio->bio; } diff --git a/include/linux/bio.h b/include/linux/bio.h index a46e2047bea4d2..faceadb040f9ac 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -324,8 +324,8 @@ static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio) void bio_trim(struct bio *bio, sector_t offset, sector_t size); extern struct bio *bio_split(struct bio *bio, int sectors, gfp_t gfp, struct bio_set *bs); -struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, - unsigned *segs, struct bio_set *bs, unsigned max_bytes); +int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim, + unsigned *segs, unsigned max_bytes); /** * bio_next_split - get next @sectors from a bio, splitting if necessary From patchwork Mon Aug 26 17:37:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13778301 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D4CA1197A9B; Mon, 26 Aug 2024 17:38:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693915; cv=none; b=pmY4zjR1XLleh12h6kYO2rhBuqBJKr5mN5fHLLu+69q5nvWYnEPQslZp42ZbUItz7l25YpvkvgPfefi4DXuYJAMPjhXd8MQvUAB8JJ5HrqxoCg7HjAB1yZaNXA+i+hwzZH0WdIMeLuobcKCTphZW57/hjkaygpJuiXDEC6F7dRs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693915; c=relaxed/simple; bh=CA2gephBjvJn0/GkW9AEA2AU2FIwaaErndnSem8buXY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YBG2B+5Y5hVXJJ4ulXQzRBZBoLQT5toVqSOiP4tZhLxxgQqp5ZuXENBn4/IBHQ176KvPLex5c0IH+kfVswSC9eRVmnF56QPA8P5nAVqXxrtUErYpC7MpQdnjJWy3aYvEFoq/GD9tAEeUcmkVIfkoc5JcqM+TRhV24MddZ3yLrWM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=tdpaDM12; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="tdpaDM12" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=JFuGo/wtYw2xFTlIaF+fNehaa9Zme8icFr0kBsP7amI=; b=tdpaDM12LWtNEeYXZZAqtubroa gCWbGqturrnMqyX1sCx9+w/MLupNUI9DakZygtfbRqVRjDjabgHXQMnTESktAPaZyfwTP4OnJtgkn vcm1qx398ptvKrmKxiVYb4TabPKG522ejCb5UfknSoWMIDJBenxchqb3O5yp6vYPcPvmTEj9p/92r MOCsg0rTYEiolXhuURTWXUPGtIzaQyiwLsjT4WHYsZlWnGAzR4sVfN+O8hTSAa8ljkCRD7u8voenH XVd/T3iOR4ciR9QY9iF+Z86xZuExZocaYJcXGeBumHPPMoapbxjH6Di8ChmU3X1K/k6ekJaA7oGHg 37fVsIjw==; Received: from 2a02-8389-2341-5b80-a8a7-5c16-efec-d7f9.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:a8a7:5c16:efec:d7f9] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sidfk-00000008FxP-0s31; Mon, 26 Aug 2024 17:38:32 +0000 From: Christoph Hellwig To: Jens Axboe Cc: Chris Mason , Josef Bacik , David Sterba , Hans Holmberg , Damien Le Moal , Shin'ichiro Kawasaki , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [PATCH 2/4] block: constify the lim argument to queue_limits_max_zone_append_sectors Date: Mon, 26 Aug 2024 19:37:55 +0200 Message-ID: <20240826173820.1690925-3-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240826173820.1690925-1-hch@lst.de> References: <20240826173820.1690925-1-hch@lst.de> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html queue_limits_max_zone_append_sectors doesn't change the lim argument, so mark it as const. Signed-off-by: Christoph Hellwig Reviewed-by: Damien Le Moal --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index e85ec73a07d575..ec3ea5d1f99dfe 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1187,7 +1187,8 @@ static inline unsigned int queue_max_segment_size(const struct request_queue *q) return q->limits.max_segment_size; } -static inline unsigned int queue_limits_max_zone_append_sectors(struct queue_limits *l) +static inline unsigned int +queue_limits_max_zone_append_sectors(const struct queue_limits *l) { unsigned int max_sectors = min(l->chunk_sectors, l->max_hw_sectors); From patchwork Mon Aug 26 17:37:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13778302 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D73E198A34; Mon, 26 Aug 2024 17:38:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693918; cv=none; b=p67cYBjLybv6PjQc1OBdVcMuC/7WYi0/ScZdnbiTSd9URD8BWI/BkaI1uPRSkZVei1NgHpZGGnuBxpmK0OUMNkffYu+Vsss8iluzA0+LDRX3u0Sc6aBpuGwewxVi9PiqrE3exrC/8nBHQoXyVFrzgP1bP0GANFoqE9QA6TsgpK0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693918; c=relaxed/simple; bh=iwo8hl8xY1BSbt4bXpNnabk5ptyd2+uDGnTGCImjgWc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TG5Z0qCnAPDAOdhHjFgS/gdnWIN1tW7uXNt1qU2m+6UCco1zpkyjY+7Kr4AbSO6w5BovhlgzA1L8euXO5v9ifJKWibuOyap42nBDyYYJU777cKfKNcwaV9jdzZnxtpU+R1dxKbggUwMS+8nMJ4Ygro3VRcHqGkeL4rkE10oZucM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=TzgzAbEI; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="TzgzAbEI" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=omZkynrzvTm4SPvm8mn56uhqb82LMW6lpAEs2QFZJ4E=; b=TzgzAbEIaDrMehZxu58Z2+ioTq t40AZMU/M3vEOdcAaV9QFGgMmq0EyLJ2iK5yWEfZp5F0OlzDKQlCmLWtV/Pr6/5uOWLLjyvRDvO7J Q6rFLj8euhGp1oKoJoY1Qos65CtdtQida4y6JRARd2/aDnDGERqJsRSEkjq20sLq8gZH8SLyjdbDH loacHQ2hXp5Ld/rT5fuP/PEuVnktJMsc1eY2rl61PMmfyLAhz/VXMdEP0QONXDOP3BlOy0MpnU8L2 58jjEVFyy9iXZWXXCI4Efk/6zzJLQ7gEey+RrbvO585OdXDlRGnm+3FMRfmO61d4B+sU/IhUrcUyU VFKr/Usw==; Received: from 2a02-8389-2341-5b80-a8a7-5c16-efec-d7f9.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:a8a7:5c16:efec:d7f9] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sidfn-00000008Fy3-0ikr; Mon, 26 Aug 2024 17:38:35 +0000 From: Christoph Hellwig To: Jens Axboe Cc: Chris Mason , Josef Bacik , David Sterba , Hans Holmberg , Damien Le Moal , Shin'ichiro Kawasaki , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [PATCH 3/4] block: properly handle REQ_OP_ZONE_APPEND in __bio_split_to_limits Date: Mon, 26 Aug 2024 19:37:56 +0200 Message-ID: <20240826173820.1690925-4-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240826173820.1690925-1-hch@lst.de> References: <20240826173820.1690925-1-hch@lst.de> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html Currently REQ_OP_ZONE_APPEND is handled by the bio_split_rw case in __bio_split_to_limits. This is harmful because REQ_OP_ZONE_APPEND bios do not adhere to the soft max_limits value but instead use their own capped version of max_hw_sectors, leading to incorrect splits that later blow up in bio_split. We still need the bio_split_rw logic to count nr_segs for blk-mq code, so add a new wrapper that passes in the right limit, and turns any bio that would need a split into an error as an additional debugging aid. Signed-off-by: Christoph Hellwig Reviewed-by: Damien Le Moal --- block/blk-merge.c | 20 ++++++++++++++++++++ block/blk.h | 4 ++++ 2 files changed, 24 insertions(+) diff --git a/block/blk-merge.c b/block/blk-merge.c index c7222c4685e060..56769c4bcd799b 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -378,6 +378,26 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, get_max_io_size(bio, lim) << SECTOR_SHIFT)); } +/* + * REQ_OP_ZONE_APPEND bios must never be split by the block layer. + * + * But we want the nr_segs calculation provided by bio_split_rw_at, and having + * a good sanity check that the submitter built the bio correctly is nice to + * have as well. + */ +struct bio *bio_split_zone_append(struct bio *bio, + const struct queue_limits *lim, unsigned *nr_segs) +{ + unsigned int max_sectors = queue_limits_max_zone_append_sectors(lim); + int split_sectors; + + split_sectors = bio_split_rw_at(bio, lim, nr_segs, + max_sectors << SECTOR_SHIFT); + if (WARN_ON_ONCE(split_sectors > 0)) + split_sectors = -EINVAL; + return bio_submit_split(bio, split_sectors); +} + /** * bio_split_to_limits - split a bio to fit the queue limits * @bio: bio to be split diff --git a/block/blk.h b/block/blk.h index 0d8cd64c126064..61c2afa67daabb 100644 --- a/block/blk.h +++ b/block/blk.h @@ -337,6 +337,8 @@ struct bio *bio_split_write_zeroes(struct bio *bio, const struct queue_limits *lim, unsigned *nsegs); struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, unsigned *nr_segs); +struct bio *bio_split_zone_append(struct bio *bio, + const struct queue_limits *lim, unsigned *nr_segs); /* * All drivers must accept single-segments bios that are smaller than PAGE_SIZE. @@ -375,6 +377,8 @@ static inline struct bio *__bio_split_to_limits(struct bio *bio, return bio_split_rw(bio, lim, nr_segs); *nr_segs = 1; return bio; + case REQ_OP_ZONE_APPEND: + return bio_split_zone_append(bio, lim, nr_segs); case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: return bio_split_discard(bio, lim, nr_segs); From patchwork Mon Aug 26 17:37:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13778303 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1E63B1990A7; Mon, 26 Aug 2024 17:38:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693922; cv=none; b=MJPxcyVMXXZD396c21yXneLQX1uFqTQdF/WP0qDTAeMHqQjU1mut98DMjk4i5aKGVl6yapWDjc0gywZPMbS2h8mi3DTO8lxiERd+S0YLTkByEJUgv2WjsBkm6Ec9DrgxxrU1hKg7Wyi5CLaAADCgMkbtX0M5brjwA+5k3QY+nmI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724693922; c=relaxed/simple; bh=gjNxz+WDCrVaxGkqjYBwgIQ4LI1H7AXowSfzH9I5l+w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Y4JweveQiUBhQWVp5yuBIWypQM1M5oR46PZR8vChzxEA7ZdIuATcOnS0no6YnU4p53CrillS4zFEvH0cohnxWXTluXcalO69WryeK8nvs67XCm03KoX0w/Obr0RrT5WObBrdnLJ86jDgizeLg1jXjul0vXy+dw9/jSwuRYzov6g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=qwNKv9mC; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="qwNKv9mC" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=RNzlZ0CBEGyo8XBfqzn6UuJuFRRrpKYSdfqJczIxzVs=; b=qwNKv9mCFefSmzqRiDImlO/pz5 H1hzErGuOE/HDfc5jrSnYigUurd2yoKDePfKzEgd2hTVCvOfj/6h7P+SsUdw0JmO23LQUv2+ReHr0 Ttlie3vNTni1BlD19zpthVIexQ/2Y0ZYgxrAk95u5ZXIB1X5OFRJG/jPDNuQfVFKLvcEnOd/bjOP7 sNJ+tONcwKrkRJSJn6vMcuVZXQoSFM5KSPqW3uGDqzPa3zEuGE2qoLLSqI1iKVdmuik6bB8bfRWq5 6w2S8keiCTmYJkU8m1v1E3LeYqD4k+mun8MrikcW9xkYt8cT+Ks25WruJgYUej3k8ZQ3TKkH8W1az rshnD5iQ==; Received: from 2a02-8389-2341-5b80-a8a7-5c16-efec-d7f9.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:a8a7:5c16:efec:d7f9] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sidfq-00000008Fyz-44Cw; Mon, 26 Aug 2024 17:38:39 +0000 From: Christoph Hellwig To: Jens Axboe Cc: Chris Mason , Josef Bacik , David Sterba , Hans Holmberg , Damien Le Moal , Shin'ichiro Kawasaki , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org Subject: [PATCH 4/4] block: don't use bio_split_rw on misc operations Date: Mon, 26 Aug 2024 19:37:57 +0200 Message-ID: <20240826173820.1690925-5-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240826173820.1690925-1-hch@lst.de> References: <20240826173820.1690925-1-hch@lst.de> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html bio_split_rw is designed to split read and write bios with a payload. Currently it is called by __bio_split_to_limits for all operations not explicitly list, which works because bio_may_need_split explicitly checks for bi_vcnt == 1 and thus skips the bypass if there is no payload and bio_for_each_bvec loop will never execute it's body if bi_size is 0. But all this is hard to understand, fragile and wasted pointless cycles. Switch __bio_split_to_limits to only call bio_split_rw for READ and WRITE command and don't attempt any kind split for operation that do not require splitting. Signed-off-by: Christoph Hellwig Reviewed-by: Damien Le Moal --- block/blk.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/block/blk.h b/block/blk.h index 61c2afa67daabb..32f4e9f630a3ac 100644 --- a/block/blk.h +++ b/block/blk.h @@ -372,7 +372,8 @@ static inline struct bio *__bio_split_to_limits(struct bio *bio, const struct queue_limits *lim, unsigned int *nr_segs) { switch (bio_op(bio)) { - default: + case REQ_OP_READ: + case REQ_OP_WRITE: if (bio_may_need_split(bio, lim)) return bio_split_rw(bio, lim, nr_segs); *nr_segs = 1; @@ -384,6 +385,10 @@ static inline struct bio *__bio_split_to_limits(struct bio *bio, return bio_split_discard(bio, lim, nr_segs); case REQ_OP_WRITE_ZEROES: return bio_split_write_zeroes(bio, lim, nr_segs); + default: + /* other operations can't be split */ + *nr_segs = 0; + return bio; } }