From patchwork Mon Dec 6 02:29:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12657587
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 01/17] btrfs: update an stale comment on btrfs_submit_bio_hook()
Date: Mon, 6 Dec 2021 10:29:21 +0800
Message-Id: <20211206022937.26465-2-wqu@suse.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20211206022937.26465-1-wqu@suse.com>
References: <20211206022937.26465-1-wqu@suse.com>
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_submit_bio_hook() has been renamed to btrfs_submit_data_bio().
Update the comment, and add the reason why btrfs_submit_dio_bio() doesn't
completely follow the same rules as btrfs_submit_data_bio().

Signed-off-by: Qu Wenruo
---
 fs/btrfs/inode.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 504cf090fc88..6079d30f83e8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8202,7 +8202,13 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio,
 	bool write = btrfs_op(bio) == BTRFS_MAP_WRITE;
 	blk_status_t ret;
 
-	/* Check btrfs_submit_bio_hook() for rules about async submit. */
+	/*
+	 * Check btrfs_submit_data_bio() for rules about async submit.
+	 *
+	 * The only exception is for RAID56, when there are more than one bios
+	 * to submit, async submit seems to make it harder to collect csums
+	 * for the full stripe.
+	 */
 	if (async_submit)
 		async_submit = !atomic_read(&BTRFS_I(inode)->sync_writers);
 

From patchwork Mon Dec 6 02:29:22 2021
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12657589
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 02/17] btrfs: save bio::bi_iter into btrfs_bio::iter before any endio
Date: Mon, 6 Dec 2021 10:29:22 +0800
Message-Id: <20211206022937.26465-3-wqu@suse.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20211206022937.26465-1-wqu@suse.com>
References: <20211206022937.26465-1-wqu@suse.com>

Currently btrfs_bio::iter is only utilized by direct IO.

But later we will utilize btrfs_bio::iter to record the original bi_iter,
so that all endio functions can iterate the original range.

Thus this patch introduces a new helper, btrfs_bio_save_iter(), to save
bi_iter into btrfs_bio::iter.

Every path that can lead to a bio_endio() call needs such a
btrfs_bio_save_iter() call.

In the most common case, submitted bios are handled by btrfs_map_bio(),
which now saves the iter. For the other error paths, we need to call
btrfs_bio_save_iter() manually, or later endio functions will ASSERT() on
an empty btrfs_bio::iter.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/compression.c |  3 +++
 fs/btrfs/disk-io.c     |  2 ++
 fs/btrfs/extent_io.c   |  7 +++++++
 fs/btrfs/raid56.c      |  2 ++
 fs/btrfs/volumes.c     |  1 +
 fs/btrfs/volumes.h     | 17 +++++++++++++++++
 6 files changed, 32 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e776956d5bc9..cc8d13369f53 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -870,6 +870,9 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	/* include any pages we added in add_ra-bio_pages */
 	cb->len = bio->bi_iter.bi_size;
 
+	/* Save bi_iter so that end_bio_extent_readpage() won't freak out. */
+	btrfs_bio_save_iter(btrfs_bio(bio));
+
 	while (cur_disk_byte < disk_bytenr + compressed_len) {
 		u64 offset = cur_disk_byte - disk_bytenr;
 		unsigned int index = offset >> PAGE_SHIFT;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5c598e124c25..76b3fbcb91eb 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -817,6 +817,7 @@ static void run_one_async_done(struct btrfs_work *work)
 	/* If an error occurred we just want to clean up the bio and move on */
 	if (async->status) {
 		async->bio->bi_status = async->status;
+		btrfs_bio_save_iter(btrfs_bio(async->bio));
 		bio_endio(async->bio);
 		return;
 	}
@@ -949,6 +950,7 @@ blk_status_t btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio,
 
 out_w_error:
 	bio->bi_status = ret;
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	bio_endio(bio);
 	return ret;
 }
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1a67f4b3986b..efd109caf95b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -175,6 +175,11 @@ int __must_check submit_one_bio(struct bio *bio, int mirror_num,
 	/* Caller should ensure the bio has at least some range added */
 	ASSERT(bio->bi_iter.bi_size);
 
+	/*
+	 * This is for later endio on errors, as later endio functions will rely
+	 * on btrfs_bio::iter.
+	 */
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	if (is_data_inode(tree->private_data))
 		ret = btrfs_submit_data_bio(tree->private_data, bio, mirror_num,
 					    bio_flags);
@@ -192,6 +197,7 @@ static void end_write_bio(struct extent_page_data *epd, int ret)
 
 	if (bio) {
 		bio->bi_status = errno_to_blk_status(ret);
+		btrfs_bio_save_iter(btrfs_bio(bio));
 		bio_endio(bio);
 		epd->bio_ctrl.bio = NULL;
 	}
@@ -3355,6 +3361,7 @@ static int alloc_new_bio(struct btrfs_inode *inode,
 error:
 	bio_ctrl->bio = NULL;
 	bio->bi_status = errno_to_blk_status(ret);
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	bio_endio(bio);
 	return ret;
 }
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 0e239a4c3b26..13e726c88a81 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1731,6 +1731,7 @@ int raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc,
 		return PTR_ERR(rbio);
 	}
 	bio_list_add(&rbio->bio_list, bio);
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	rbio->bio_list_bytes = bio->bi_iter.bi_size;
 	rbio->operation = BTRFS_RBIO_WRITE;
 
@@ -2135,6 +2136,7 @@ int raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc,
 
 	rbio->operation = BTRFS_RBIO_READ_REBUILD;
 	bio_list_add(&rbio->bio_list, bio);
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	rbio->bio_list_bytes = bio->bi_iter.bi_size;
 
 	rbio->faila = find_logical_bio_stripe(rbio, bio);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f38c230111be..cdf5725f1f32 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6794,6 +6794,7 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 	map_length = length;
 
 	btrfs_bio_counter_inc_blocked(fs_info);
+	btrfs_bio_save_iter(btrfs_bio(bio));
 	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical,
 				&map_length, &bioc, mirror_num, 1);
 	if (ret) {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 3b8130680749..c038fb1e36d5 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -334,6 +334,12 @@ struct btrfs_bio {
 	struct btrfs_device *device;
 	u8 *csum;
 	u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE];
+	/*
+	 * Saved bio::bi_iter before submission.
+	 *
+	 * This allows us to iterate the cloned/split bio properly, as at
+	 * endio time bio::bi_iter is no longer reliable.
+	 */
 	struct bvec_iter iter;
 
 	/*
@@ -356,6 +362,17 @@ static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio)
 	}
 }
 
+/*
+ * Save bbio::bio->bi_iter into bbio::iter so that callers who need the
+ * original bi_iter can access the original part of the bio.
+ * This is especially important for the incoming split btrfs_bio, which needs
+ * to call its endio for and only for the split range.
+ */
+static inline void btrfs_bio_save_iter(struct btrfs_bio *bbio)
+{
+	bbio->iter = bbio->bio.bi_iter;
+}
+
 struct btrfs_io_stripe {
 	struct btrfs_device *dev;
 	u64 physical;

From patchwork Mon Dec 6 02:29:23 2021
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12657591
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 03/17] btrfs: use correct bio size for error message in btrfs_end_dio_bio()
Date: Mon, 6 Dec 2021 10:29:23 +0800
Message-Id: <20211206022937.26465-4-wqu@suse.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20211206022937.26465-1-wqu@suse.com>
References: <20211206022937.26465-1-wqu@suse.com>

At endio time, bio->bi_iter is no longer valid (in some cases it still
is, but that is never guaranteed).

Thus if we really want the full size of the bio, we have to iterate its
bvecs.

In btrfs_end_dio_bio(), when we hit an error, we grab the bio size from
bi_iter, which can be wrong.

Fix it by iterating the bvecs and calculating the bio size.
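
A minimal userspace sketch of why the two sizes can differ, assuming a simplified model rather than the real block layer: `struct mock_vec`, `struct mock_iter`, `struct mock_bio`, `mock_advance()` and `mock_full_size()` are hypothetical names standing in for bio_vec, bvec_iter, bio, the iterator being consumed during IO, and the bio_for_each_segment_all() loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bio_vec: one contiguous segment. */
struct mock_vec {
	uint32_t len;
};

/* Hypothetical stand-in for struct bvec_iter: only the remaining size. */
struct mock_iter {
	uint32_t size;	/* bytes not yet consumed */
};

/* Hypothetical stand-in for struct bio. */
struct mock_bio {
	const struct mock_vec *vecs;
	size_t nr_vecs;
	struct mock_iter iter;
};

/* Consume @bytes from the iterator, as happens while IO is processed. */
static void mock_advance(struct mock_bio *bio, uint32_t bytes)
{
	assert(bytes <= bio->iter.size);
	bio->iter.size -= bytes;
}

/* Full size computed from the segments, independent of the iterator. */
static uint32_t mock_full_size(const struct mock_bio *bio)
{
	uint32_t total = 0;
	size_t i;

	for (i = 0; i < bio->nr_vecs; i++)
		total += bio->vecs[i].len;
	return total;
}
```

In this model, once mock_advance() has consumed the whole range, iter.size reads 0 while mock_full_size() still reports the original length, which is the situation the error message has to cope with.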
Signed-off-by: Qu Wenruo
---
 fs/btrfs/inode.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6079d30f83e8..126d2117954c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8175,12 +8175,19 @@ static void btrfs_end_dio_bio(struct bio *bio)
 	struct btrfs_dio_private *dip = bio->bi_private;
 	blk_status_t err = bio->bi_status;
 
-	if (err)
+	if (err) {
+		struct bvec_iter_all iter_all;
+		struct bio_vec *bvec;
+		u32 bi_size = 0;
+
+		bio_for_each_segment_all(bvec, bio, iter_all)
+			bi_size += bvec->bv_len;
+
 		btrfs_warn(BTRFS_I(dip->inode)->root->fs_info,
 			   "direct IO failed ino %llu rw %d,%u sector %#Lx len %u err no %d",
 			   btrfs_ino(BTRFS_I(dip->inode)), bio_op(bio),
-			   bio->bi_opf, bio->bi_iter.bi_sector,
-			   bio->bi_iter.bi_size, err);
+			   bio->bi_opf, bio->bi_iter.bi_sector, bi_size, err);
+	}
 
 	if (bio_op(bio) == REQ_OP_READ)
 		err = btrfs_check_read_dio_bio(dip, btrfs_bio(bio), !err);

From patchwork Mon Dec 6 02:29:24 2021
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12657593
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 04/17] btrfs: refactor btrfs_map_bio()
Date: Mon, 6 Dec 2021 10:29:24 +0800
Message-Id: <20211206022937.26465-5-wqu@suse.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20211206022937.26465-1-wqu@suse.com>
References: <20211206022937.26465-1-wqu@suse.com>

Currently in btrfs_map_bio() we call __btrfs_map_block(), then use the
returned bioc to submit the real stripes.

This is fine if we're only going to handle one bio at a time.

For the incoming bio split at btrfs_map_bio() time, we want to handle
several different bios. Thus introduce a new helper,
submit_one_mapped_range(), to handle the submission part, making it much
easier to call in a loop.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/volumes.c | 67 ++++++++++++++++++++++++++++------------------
 1 file changed, 41 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index cdf5725f1f32..1630a4d22122 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6777,30 +6777,15 @@ static void bioc_error(struct btrfs_io_context *bioc, struct bio *bio, u64 logic
 	}
 }
 
-blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
-			   int mirror_num)
+static int submit_one_mapped_range(struct btrfs_fs_info *fs_info, struct bio *bio,
+				   struct btrfs_io_context *bioc, u64 map_length,
+				   int mirror_num)
 {
-	struct btrfs_device *dev;
 	struct bio *first_bio = bio;
-	u64 logical = bio->bi_iter.bi_sector << 9;
-	u64 length = 0;
-	u64 map_length;
-	int ret;
-	int dev_nr;
+	u64 logical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
 	int total_devs;
-	struct btrfs_io_context *bioc = NULL;
-
-	length = bio->bi_iter.bi_size;
-	map_length = length;
-
-	btrfs_bio_counter_inc_blocked(fs_info);
-	btrfs_bio_save_iter(btrfs_bio(bio));
-	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical,
-				&map_length, &bioc, mirror_num, 1);
-	if (ret) {
-		btrfs_bio_counter_dec(fs_info);
-		return errno_to_blk_status(ret);
-	}
+	int dev_nr;
+	int ret;
 
 	total_devs = bioc->num_stripes;
 	bioc->orig_bio = first_bio;
@@ -6819,18 +6804,19 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 						    mirror_num, 1);
 		}
 
-		btrfs_bio_counter_dec(fs_info);
-		return errno_to_blk_status(ret);
+		return ret;
 	}
 
-	if (map_length < length) {
+	if (map_length < bio->bi_iter.bi_size) {
 		btrfs_crit(fs_info,
-			   "mapping failed logical %llu bio len %llu len %llu",
-			   logical, length, map_length);
+			   "mapping failed logical %llu bio len %u len %llu",
+			   logical, bio->bi_iter.bi_size, map_length);
 		BUG();
 	}
 
 	for (dev_nr = 0; dev_nr < total_devs; dev_nr++) {
+		struct btrfs_device *dev;
+
 		dev = bioc->stripes[dev_nr].dev;
 		if (!dev || !dev->bdev || test_bit(BTRFS_DEV_STATE_MISSING,
 						   &dev->dev_state) ||
@@ -6847,6 +6833,35 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 
 		submit_stripe_bio(bioc, bio, bioc->stripes[dev_nr].physical, dev);
 	}
+	return 0;
+}
+
+blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
+			   int mirror_num)
+{
+	u64 logical = bio->bi_iter.bi_sector << 9;
+	u64 length = 0;
+	u64 map_length;
+	int ret;
+	struct btrfs_io_context *bioc = NULL;
+
+	length = bio->bi_iter.bi_size;
+	map_length = length;
+
+	btrfs_bio_counter_inc_blocked(fs_info);
+	btrfs_bio_save_iter(btrfs_bio(bio));
+	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical,
+				&map_length, &bioc, mirror_num, 1);
+	if (ret) {
+		btrfs_bio_counter_dec(fs_info);
+		return errno_to_blk_status(ret);
+	}
+
+	ret = submit_one_mapped_range(fs_info, bio, bioc, map_length, mirror_num);
+	if (ret < 0) {
+		btrfs_bio_counter_dec(fs_info);
+		return errno_to_blk_status(ret);
+	}
 	btrfs_bio_counter_dec(fs_info);
 	return BLK_STS_OK;
 }

From patchwork Mon Dec 6 02:29:25 2021
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12657595
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 05/17] btrfs: move btrfs_bio_wq_end_io() calls into submit_stripe_bio()
Date: Mon, 6 Dec 2021 10:29:25 +0800
Message-Id: <20211206022937.26465-6-wqu@suse.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20211206022937.26465-1-wqu@suse.com>
References: <20211206022937.26465-1-wqu@suse.com>

This is a preparation patch for the incoming bio split at the chunk
mapping layer.

btrfs_bio_wq_end_io() remaps bio::bi_private and bio::bi_end_io so that
the real endio function is executed in a workqueue.

The problem is that the remapped bio::bi_private points to newly
allocated memory, which is freed after the original endio has been
executed.
This will not work well with split bio. So this patch will move all btrfs_bio_wq_end_io() call into one helper function, btrfs_bio_final_endio_remap(), and call that helper in submit_stripe_bio(). This refactor also unified all data bio behaviors. Before this patch, compressed bio no matter if read or write, will always be delayed using workqueue. However all data write operations are already delayed using ordered extent, and all metadata write doesn't need any delayed execution. Thus this patch will make compressed bios follow the same data read/write behavior. Signed-off-by: Qu Wenruo --- fs/btrfs/compression.c | 4 +--- fs/btrfs/disk-io.c | 9 +-------- fs/btrfs/inode.c | 20 +++++--------------- fs/btrfs/volumes.c | 41 +++++++++++++++++++++++++++++++++++++---- fs/btrfs/volumes.h | 9 ++++++++- 5 files changed, 52 insertions(+), 31 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index cc8d13369f53..8668c5190805 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -429,10 +429,8 @@ static blk_status_t submit_compressed_bio(struct btrfs_fs_info *fs_info, { blk_status_t ret; + btrfs_bio(bio)->endio_type = BTRFS_WQ_ENDIO_DATA; ASSERT(bio->bi_iter.bi_size); - ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA); - if (ret) - return ret; ret = btrfs_map_bio(fs_info, bio, mirror_num); return ret; } diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 76b3fbcb91eb..d6e89822191b 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -921,14 +921,7 @@ blk_status_t btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio, blk_status_t ret; if (btrfs_op(bio) != BTRFS_MAP_WRITE) { - /* - * called for a read, do the setup so that checksum validation - * can happen in the async kernel threads - */ - ret = btrfs_bio_wq_end_io(fs_info, bio, - BTRFS_WQ_ENDIO_METADATA); - if (ret) - goto out_w_error; + btrfs_bio(bio)->endio_type = BTRFS_WQ_ENDIO_METADATA; ret = btrfs_map_bio(fs_info, bio, mirror_num); } else if 
(!should_async_write(fs_info, BTRFS_I(inode))) { ret = btree_csum_one_bio(bio); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 126d2117954c..007a20a9b076 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2511,7 +2511,7 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio, { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_root *root = BTRFS_I(inode)->root; - enum btrfs_wq_endio_type metadata = BTRFS_WQ_ENDIO_DATA; + enum btrfs_wq_endio_type endio_type = BTRFS_WQ_ENDIO_DATA; blk_status_t ret = 0; int skip_sum; int async = !atomic_read(&BTRFS_I(inode)->sync_writers); @@ -2520,7 +2520,7 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio, test_bit(BTRFS_FS_STATE_NO_CSUMS, &fs_info->fs_state); if (btrfs_is_free_space_inode(BTRFS_I(inode))) - metadata = BTRFS_WQ_ENDIO_FREE_SPACE; + endio_type = BTRFS_WQ_ENDIO_FREE_SPACE; if (bio_op(bio) == REQ_OP_ZONE_APPEND) { struct page *page = bio_first_bvec_all(bio)->bv_page; @@ -2532,10 +2532,7 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio, } if (btrfs_op(bio) != BTRFS_MAP_WRITE) { - ret = btrfs_bio_wq_end_io(fs_info, bio, metadata); - if (ret) - goto out; - + btrfs_bio(bio)->endio_type = endio_type; if (bio_flags & EXTENT_BIO_COMPRESSED) { ret = btrfs_submit_compressed_read(inode, bio, mirror_num, @@ -8090,10 +8087,6 @@ static blk_status_t submit_dio_repair_bio(struct inode *inode, struct bio *bio, BUG_ON(bio_op(bio) == REQ_OP_WRITE); - ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA); - if (ret) - return ret; - refcount_inc(&dip->refs); ret = btrfs_map_bio(fs_info, bio, mirror_num); if (ret) @@ -8219,11 +8212,8 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, if (async_submit) async_submit = !atomic_read(&BTRFS_I(inode)->sync_writers); - if (!write) { - ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA); - if (ret) - goto err; - } + if (!write) + btrfs_bio(bio)->endio_type = 
BTRFS_WQ_ENDIO_DATA; if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM) goto map; diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 1630a4d22122..fba08cfcbd4e 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6725,10 +6725,31 @@ static void btrfs_end_bio(struct bio *bio) } } -static void submit_stripe_bio(struct btrfs_io_context *bioc, struct bio *bio, - u64 physical, struct btrfs_device *dev) +/* + * Endio remaps which can't handle cloned bio needs to go here. + * + * Currently it's only btrfs_bio_wq_end_io(). + */ +static int btrfs_bio_final_endio_remap(struct btrfs_fs_info *fs_info, + struct bio *bio) +{ + blk_status_t sts; + + /* For write bio, we don't to put their endio into wq */ + if (btrfs_op(bio) == BTRFS_MAP_WRITE) + return 0; + + sts = btrfs_bio_wq_end_io(fs_info, bio, btrfs_bio(bio)->endio_type); + if (sts != BLK_STS_OK) + return blk_status_to_errno(sts); + return 0; +} + +static int submit_stripe_bio(struct btrfs_io_context *bioc, struct bio *bio, + u64 physical, struct btrfs_device *dev) { struct btrfs_fs_info *fs_info = bioc->fs_info; + int ret; bio->bi_private = bioc; btrfs_bio(bio)->device = dev; @@ -6755,9 +6776,14 @@ static void submit_stripe_bio(struct btrfs_io_context *bioc, struct bio *bio, dev->devid, bio->bi_iter.bi_size); bio_set_dev(bio, dev->bdev); - btrfs_bio_counter_inc_noblocked(fs_info); + /* Do the final endio remap if needed */ + ret = btrfs_bio_final_endio_remap(fs_info, bio); + if (ret < 0) + return ret; + btrfs_bio_counter_inc_noblocked(fs_info); btrfsic_submit_bio(bio); + return ret; } static void bioc_error(struct btrfs_io_context *bioc, struct bio *bio, u64 logical) @@ -6831,9 +6857,16 @@ static int submit_one_mapped_range(struct btrfs_fs_info *fs_info, struct bio *bi else bio = first_bio; - submit_stripe_bio(bioc, bio, bioc->stripes[dev_nr].physical, dev); + ret = submit_stripe_bio(bioc, bio, + bioc->stripes[dev_nr].physical, dev); + if (ret < 0) + goto error; } return 0; +error: + for (; dev_nr < 
total_devs; dev_nr++) + bioc_error(bioc, first_bio, logical); + return ret; } blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index c038fb1e36d5..b2081b03990a 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -328,7 +328,14 @@ struct btrfs_fs_devices { * Mostly for btrfs specific features like csum and mirror_num. */ struct btrfs_bio { - unsigned int mirror_num; + u16 mirror_num; + + /* + * To tell which workqueue the bio's endio should be exeucted in. + * + * Only for read bios. + */ + u16 endio_type; /* @device is for stripe IO submission. */ struct btrfs_device *device; From patchwork Mon Dec 6 02:29:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657597 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AA72C433EF for ; Mon, 6 Dec 2021 02:30:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234318AbhLFCdg (ORCPT ); Sun, 5 Dec 2021 21:33:36 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51852 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234270AbhLFCdd (ORCPT ); Sun, 5 Dec 2021 21:33:33 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id B1C811FD54 for ; Mon, 6 Dec 2021 02:30:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757804; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: 
mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=G6WnAQQM2T4EmfEGGj7c0STQC6vS6kyrJh8rvUD16WE=; b=LBz7D13LFjdnvzQt9WuoDOpVzuEld28Z5FJzawGH0gn5J6DMfXj62P+LWKTEQknQXu8Q7R rspRhB7C9TCAxTrFrrAYKxDwl729sLL/Q8kOp234gw7fTzcf9YJ8xf+ZwSx29UUbMTvH/M c0qG0Ceb3d5Qok9XAZTgkg5LchkLCqs= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 0EC2513451 for ; Mon, 6 Dec 2021 02:30:03 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id mE+GMqt1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:03 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 06/17] btrfs: replace btrfs_dio_private::refs with btrfs_dio_private::pending_bytes Date: Mon, 6 Dec 2021 10:29:26 +0800 Message-Id: <20211206022937.26465-7-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This mostly follows the behavior of compressed_bio::pending_sectors. The point here is, dip::refs is not split bio friendly, as if a bio with its bi_private = dip, and the bio get split, we can easily underflow dip::refs. By using the same sector based solution as compressed_bio, dio can handle both unsplit and split bios. 
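The byte-based accounting described above can be modelled outside the kernel. The following is only a minimal user-space sketch, not the patch's code: `struct dio_tracker`, `dio_tracker_init()` and `dio_tracker_complete()` are hypothetical names, and C11 atomics stand in for the kernel's atomic_t helpers. It shows why the scheme does not care how many pieces a bio is split into: each completion subtracts its own size, and only the completion that brings the counter to zero finishes the request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for btrfs_dio_private. */
struct dio_tracker {
	uint32_t bytes;            /* total size of the DIO request */
	atomic_uint pending_bytes; /* bytes still under IO or not yet submitted */
	atomic_bool errors;        /* set once any piece reports an error */
};

static void dio_tracker_init(struct dio_tracker *t, uint32_t length)
{
	t->bytes = length;
	atomic_init(&t->pending_bytes, length);
	atomic_init(&t->errors, false);
}

/*
 * Account for @bytes having finished (or having failed to be submitted).
 * Returns true only for the call that completes the last pending bytes,
 * so exactly one caller runs the final cleanup, however the request was
 * split.
 */
static bool dio_tracker_complete(struct dio_tracker *t, bool error,
				 uint32_t bytes)
{
	unsigned int old;

	if (error)
		atomic_store(&t->errors, true);

	old = atomic_fetch_sub(&t->pending_bytes, bytes);
	assert(bytes <= old); /* completing more than is pending is a bug */
	return old == bytes;
}
```

With a plain reference count, every split would need a matching increment somewhere; here a 16K request completed as 4K + 8K + 4K, or as a single 16K piece, reaches zero the same way.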
Signed-off-by: Qu Wenruo --- fs/btrfs/btrfs_inode.h | 10 +++---- fs/btrfs/inode.c | 67 +++++++++++++++++++++--------------------- 2 files changed, 38 insertions(+), 39 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index b3e46aabc3d8..196f74ee102e 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -358,11 +358,11 @@ struct btrfs_dio_private { /* Used for bio::bi_size */ u32 bytes; - /* - * References to this structure. There is one reference per in-flight - * bio plus one while we're still setting up. - */ - refcount_t refs; + /* Hit any error for the whole DIO bio */ + bool errors; + + /* How many bytes are still under IO or not submitted */ + atomic_t pending_bytes; /* dio_bio came from fs/direct-io.c */ struct bio *dio_bio; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 007a20a9b076..1aa060de917c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8053,20 +8053,28 @@ static int btrfs_dio_iomap_end(struct inode *inode, loff_t pos, loff_t length, return ret; } -static void btrfs_dio_private_put(struct btrfs_dio_private *dip) +static bool dec_and_test_dio_private(struct btrfs_dio_private *dip, bool error, + u32 bytes) { - /* - * This implies a barrier so that stores to dio_bio->bi_status before - * this and loads of dio_bio->bi_status after this are fully ordered. 
- */ - if (!refcount_dec_and_test(&dip->refs)) + ASSERT(bytes <= dip->bytes); + ASSERT(bytes <= atomic_read(&dip->pending_bytes)); + + if (error) + dip->errors = true; + return atomic_sub_and_test(bytes, &dip->pending_bytes); +} + +static void dio_private_finish(struct btrfs_dio_private *dip, bool error, + u32 bytes) +{ + if (!dec_and_test_dio_private(dip, error, bytes)) return; if (btrfs_op(dip->dio_bio) == BTRFS_MAP_WRITE) { __endio_write_update_ordered(BTRFS_I(dip->inode), dip->file_offset, dip->bytes, - !dip->dio_bio->bi_status); + !dip->errors); } else { unlock_extent(&BTRFS_I(dip->inode)->io_tree, dip->file_offset, @@ -8087,10 +8095,10 @@ static blk_status_t submit_dio_repair_bio(struct inode *inode, struct bio *bio, BUG_ON(bio_op(bio) == REQ_OP_WRITE); - refcount_inc(&dip->refs); + atomic_add(bio->bi_iter.bi_size, &dip->pending_bytes); ret = btrfs_map_bio(fs_info, bio, mirror_num); if (ret) - refcount_dec(&dip->refs); + atomic_sub(bio->bi_iter.bi_size, &dip->pending_bytes); return ret; } @@ -8166,20 +8174,20 @@ static blk_status_t btrfs_submit_bio_start_direct_io(struct inode *inode, static void btrfs_end_dio_bio(struct bio *bio) { struct btrfs_dio_private *dip = bio->bi_private; + struct bvec_iter iter; + struct bio_vec bvec; + u32 bi_size = 0; blk_status_t err = bio->bi_status; - if (err) { - struct bvec_iter_all iter_all; - struct bio_vec *bvec; - u32 bi_size = 0; - - bio_for_each_segment_all(bvec, bio, iter_all) - bi_size += bvec->bv_len; + __bio_for_each_segment(bvec, bio, iter, btrfs_bio(bio)->iter) + bi_size += bvec.bv_len; + if (err) { btrfs_warn(BTRFS_I(dip->inode)->root->fs_info, "direct IO failed ino %llu rw %d,%u sector %#Lx len %u err no %d", btrfs_ino(BTRFS_I(dip->inode)), bio_op(bio), bio->bi_opf, bio->bi_iter.bi_sector, bi_size, err); + dip->errors = true; } if (bio_op(bio) == REQ_OP_READ) @@ -8191,7 +8199,7 @@ static void btrfs_end_dio_bio(struct bio *bio) btrfs_record_physical_zoned(dip->inode, dip->file_offset, bio); bio_put(bio); - 
btrfs_dio_private_put(dip); + dio_private_finish(dip, err, bi_size); } static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, @@ -8250,7 +8258,8 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, */ static struct btrfs_dio_private *btrfs_create_dio_private(struct bio *dio_bio, struct inode *inode, - loff_t file_offset) + loff_t file_offset, + u32 length) { const bool write = (btrfs_op(dio_bio) == BTRFS_MAP_WRITE); const bool csum = !(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM); @@ -8270,12 +8279,12 @@ static struct btrfs_dio_private *btrfs_create_dio_private(struct bio *dio_bio, if (!dip) return NULL; + atomic_set(&dip->pending_bytes, length); dip->inode = inode; dip->file_offset = file_offset; dip->bytes = dio_bio->bi_iter.bi_size; dip->disk_bytenr = dio_bio->bi_iter.bi_sector << 9; dip->dio_bio = dio_bio; - refcount_set(&dip->refs, 1); return dip; } @@ -8289,6 +8298,8 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, BTRFS_BLOCK_GROUP_RAID56_MASK); struct btrfs_dio_private *dip; struct bio *bio; + const u32 length = dio_bio->bi_iter.bi_size; + u32 submitted_bytes = 0; u64 start_sector; int async_submit = 0; u64 submit_len; @@ -8301,7 +8312,7 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, struct btrfs_dio_data *dio_data = iter->iomap.private; struct extent_map *em = NULL; - dip = btrfs_create_dio_private(dio_bio, inode, file_offset); + dip = btrfs_create_dio_private(dio_bio, inode, file_offset, length); if (!dip) { if (!write) { unlock_extent(&BTRFS_I(inode)->io_tree, file_offset, @@ -8311,7 +8322,6 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, bio_endio(dio_bio); return; } - if (!write) { /* * Load the csums up front to reduce csum tree searches and @@ -8365,17 +8375,7 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, ASSERT(submit_len >= clone_len); submit_len -= clone_len; - /* - * Increase the count before we submit the bio so we know - * the end IO handler 
won't happen before we increase the - * count. Otherwise, the dip might get freed before we're - * done setting it up. - * - * We transfer the initial reference to the last bio, so we - * don't need to increment the reference count for the last one. - */ if (submit_len > 0) { - refcount_inc(&dip->refs); /* * If we are submitting more than one bio, submit them * all asynchronously. The exception is RAID 5 or 6, as @@ -8390,11 +8390,10 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, async_submit); if (status) { bio_put(bio); - if (submit_len > 0) - refcount_dec(&dip->refs); goto out_err_em; } + submitted_bytes += clone_len; dio_data->submitted += clone_len; clone_offset += clone_len; start_sector += clone_len >> 9; @@ -8408,7 +8407,7 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, free_extent_map(em); out_err: dip->dio_bio->bi_status = status; - btrfs_dio_private_put(dip); + dio_private_finish(dip, status, length - submitted_bytes); } const struct iomap_ops btrfs_dio_iomap_ops = { From patchwork Mon Dec 6 02:29:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657599 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EB4CC4332F for ; Mon, 6 Dec 2021 02:30:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234323AbhLFCdg (ORCPT ); Sun, 5 Dec 2021 21:33:36 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51858 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234246AbhLFCde (ORCPT ); Sun, 5 Dec 2021 21:33:34 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 
server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 3702A1FD54 for ; Mon, 6 Dec 2021 02:30:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757806; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=W03vLez1vK1Z0jetUBLMkg7PipCmQTkg93wk/qsMgYg=; b=KQky6JZxiWT5pL6zPsUiMRIaEkQEFbJDByla1M215z47Fa5OZ4rXcLYquinoZPCmT0N4ig e1u1mny1zgorjM5XKNBDVohvGZz3wvyIVvSNb3v5JG5il8qwBNsjRz0Ipswu3fsdgBSo2M LiY/tZWYoje6ixpt0UtQYNxeGt3IgOE= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 8A08B13451 for ; Mon, 6 Dec 2021 02:30:04 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id ELeYNqx1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:04 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 07/17] btrfs: introduce btrfs_bio_split() helper Date: Mon, 6 Dec 2021 10:29:27 +0800 Message-Id: <20211206022937.26465-8-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This new function will handle the split of a btrfs bio, to co-operate with the incoming chunk mapping time bio split. 
This patch introduces the following new members and functions:

- btrfs_bio::offset_to_original
  btrfs_bio::csum still stores the checksums for the original logical
  bytenr, so we need to know the offset between the current (advanced)
  bio and the original logical bytenr. This new member records that
  offset.

  The new member fits into the existing hole between
  btrfs_bio::mirror_num and btrfs_bio::device, so it should not
  increase the memory usage of btrfs_bio.

- btrfs_bio::parent and btrfs_bio::orig_endio
  To record where the parent bio is and what the original endio
  function was.

- btrfs_bio::is_split_bio
  To distinguish bios created by btrfs_bio_split() from those created
  by btrfs_bio_clone*().
  A cloned bio still has its csum pointing to the correct memory, while
  a split bio must rely on its parent bbio to grab the csum pointer.

- split_bio_endio()
  Calls the original endio function, then calls bio_endio() on the
  parent bio.
  This ensures the original bio is freed only after all bios split
  from it.

- btrfs_bio_split()
  Splits the original bio into two. The behavior is pretty much the
  same as bio_split(), just with extra btrfs specific setup.

There are no callers of these new members and functions yet.
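To illustrate the offset_to_original bookkeeping, here is a small user-space model. It is a sketch under stated assumptions, not kernel code: `struct piece`, `split_front()` and `csum_offset()` are hypothetical names, and the sector size and checksum size are passed in as plain parameters. The front piece keeps the source's current offset, the source advances by the split size, and the offset is later turned into a position in the checksum array shared with the parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the fields a split needs to keep consistent. */
struct piece {
	uint32_t offset_to_original; /* bytes from the start of the original range */
	uint32_t size;               /* bytes covered by this piece */
};

/*
 * Split @bytes off the front of @src. The returned piece inherits the
 * source's current offset; the source then starts @bytes further in.
 */
static struct piece split_front(struct piece *src, uint32_t bytes)
{
	struct piece front;

	assert(bytes < src->size); /* a split must leave something behind */

	front.offset_to_original = src->offset_to_original;
	front.size = bytes;

	src->offset_to_original += bytes;
	src->size -= bytes;
	return front;
}

/*
 * Byte offset of this piece's first checksum inside the checksum array
 * that was loaded for the original, unsplit range.
 */
static size_t csum_offset(const struct piece *p, unsigned int sectorsize_bits,
			  size_t csum_size)
{
	return ((size_t)p->offset_to_original >> sectorsize_bits) * csum_size;
}
```

For example, assuming 4K sectors (sectorsize_bits = 12) and 4-byte checksums, splitting 4K and then 8K off a 16K range leaves the remainder at offset 12K, whose first checksum sits 12 bytes into the parent's array.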
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 82 +++++++++++++++++++++++++++++++++++++++++++- fs/btrfs/extent_io.h | 2 ++ fs/btrfs/volumes.h | 43 +++++++++++++++++++++-- 3 files changed, 123 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index efd109caf95b..095bdc4775e7 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3011,7 +3011,6 @@ static void end_bio_extent_readpage(struct bio *bio) int ret; struct bvec_iter_all iter_all; - ASSERT(!bio_flagged(bio, BIO_CLONED)); bio_for_each_segment_all(bvec, bio, iter_all) { bool uptodate = !bio->bi_status; struct page *page = bvec->bv_page; @@ -3190,6 +3189,87 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size) return bio; } +/* + * A very simple wrapper to call original endio function and then + * call bio_endio() on the parent bio to decrease its bi_remaining count. + */ +static void split_bio_endio(struct bio *bio) +{ + struct btrfs_bio *bbio = btrfs_bio(bio); + /* After endio bbio could be freed, thus grab the info before endio */ + struct bio *parent = bbio->parent; + + /* + * BIO_CLONED can even be set for our parent bio (DIO use clones + * the initial bio, then uses the cloned one for IO). + * So here we don't check BIO_CLONED for parent. + */ + ASSERT(bio_flagged(bio, BIO_CLONED) && bbio->is_split_bio); + ASSERT(parent && !btrfs_bio(parent)->is_split_bio); + + bio->bi_end_io = bbio->orig_endio; + bio_endio(bio); + bio_endio(parent); +} + +/* + * Pretty much like bio_split(), caller needs to ensure @src is not freed + * before the newly allocated bio, as the new bio is relying on @src for + * its bvecs. 
+ */ +struct bio *btrfs_bio_split(struct btrfs_fs_info *fs_info, + struct bio *src, unsigned int bytes) +{ + struct bio *new; + struct btrfs_bio *src_bbio = btrfs_bio(src); + struct btrfs_bio *new_bbio; + const unsigned int old_offset = src_bbio->offset_to_original; + + /* Src should not be split */ + ASSERT(!src_bbio->is_split_bio); + ASSERT(IS_ALIGNED(bytes, fs_info->sectorsize)); + ASSERT(bytes < src->bi_iter.bi_size); + + /* + * We're in fact chaining the new bio to the parent, but we still want + * to have independent bi_private/bi_endio, thus we need to manually + * increase the remaining for the source, just like bio_chain(). + */ + bio_inc_remaining(src); + + /* Bioset backed split should not fail */ + new = bio_split(src, bytes >> SECTOR_SHIFT, GFP_NOFS, &btrfs_bioset); + new_bbio = btrfs_bio(new); + new_bbio->offset_to_original = old_offset; + new_bbio->iter = new->bi_iter; + new_bbio->orig_endio = src->bi_end_io; + new_bbio->parent = src; + new_bbio->endio_type = src_bbio->endio_type; + new_bbio->is_split_bio = 1; + new->bi_end_io = split_bio_endio; + + /* + * This is very tricky, as if any endio has extra refcount on + * bi_private, we will be screwed up. + * + * We workaround this hacky behavior by reviewing all the involved + * endio stacks. Making sure only split-safe endio remap are called. + * + * Split-unsafe endio remap like btrfs_bio_wq_end_io() will be called + * after btrfs_bio_split(). + */ + new->bi_private = src->bi_private; + + src_bbio->offset_to_original += bytes; + + /* + * For direct IO, @src is a cloned bio thus bbio::iter still points to + * the full bio. Need to update it too. 
+ */ + src_bbio->iter = src->bi_iter; + return new; +} + /** * Attempt to add a page to bio * diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 0399cf8e3c32..cb727b77ecda 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -280,6 +280,8 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct bio *btrfs_bio_alloc(unsigned int nr_iovecs); struct bio *btrfs_bio_clone(struct bio *bio); struct bio *btrfs_bio_clone_partial(struct bio *orig, u64 offset, u64 size); +struct bio *btrfs_bio_split(struct btrfs_fs_info *fs_info, + struct bio *src, unsigned int bytes); void end_extent_writepage(struct page *page, int err, u64 start, u64 end); int btrfs_repair_eb_io_failure(const struct extent_buffer *eb, int mirror_num); diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index b2081b03990a..462b32c89abc 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -332,15 +332,52 @@ struct btrfs_bio { /* * To tell which workqueue the bio's endio should be exeucted in. + * This member is to make sure btrfs_bio_wq_end_io() is the last + * endio remap in the stack. * * Only for read bios. */ - u16 endio_type; + u8 endio_type; + + /* + * To tell if this btrfs bio is split or just cloned. + * Both btrfs_bio_clone*() and btrfs_bio_split() will make bbio->bio + * to have BIO_CLONED flag. + * But cloned bio still has its bbio::csum pointed to correct memory, + * unlike split bio relies on its parent bbio to grab csum. + * + * Thus we needs this extra flag to distinguish those cloned bio. + */ + u8 is_split_bio; + + /* + * Records the offset we're from the original bio. + * + * Since btrfs_bio can be split, but our csum is alwasy for the + * original logical bytenr, we need a way to know the bytes offset + * from the original logical bytenr to do proper csum verification. + */ + unsigned int offset_to_original; /* @device is for stripe IO submission. 
*/ struct btrfs_device *device; - u8 *csum; - u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE]; + + union { + /* + * For the parent bio recording the csum for the original + * logical bytenr + */ + struct { + u8 *csum; + u8 csum_inline[BTRFS_BIO_INLINE_CSUM_SIZE]; + }; + + /* For child (split) bio to record where its parent is */ + struct { + struct bio *parent; + bio_end_io_t *orig_endio; + }; + }; /* * Saved bio::bi_iter before submission. * From patchwork Mon Dec 6 02:29:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEAC9C43217 for ; Mon, 6 Dec 2021 02:30:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234324AbhLFCdh (ORCPT ); Sun, 5 Dec 2021 21:33:37 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51864 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234308AbhLFCdg (ORCPT ); Sun, 5 Dec 2021 21:33:36 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 47AE31FD5F for ; Mon, 6 Dec 2021 02:30:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757807; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uxZffvBdyjIFPeOcUWMRAE7bp37nC+YpuZuqXuCQmWE=; b=d/kh/IYUyHDV2NyDt139lkE8t4a19gp62CGaTGywfQM6bpAB8DNkhFrUl5kA7hV4F8RqmK 
Ki1aB5+HVsZLDWvs507ie6SU20n38nRyXiVAbjSQMQsMNwjGIQkGniI3OJqIHhzedKmopf sSmmjdCy0nlBPwvHxZ8mn9UIvKhFeR8= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 9BD9513451 for ; Mon, 6 Dec 2021 02:30:06 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id eN0nGa51rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:06 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 08/17] btrfs: make data buffered read path to handle split bio properly Date: Mon, 6 Dec 2021 10:29:28 +0800 Message-Id: <20211206022937.26465-9-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This involves the following modifications: - Use bio_for_each_segment() instead of bio_for_each_segment_all() bio_for_each_segment_all() will iterate all bvecs, even if they are not referred by current bi_iter. *_all() variant can only be used if the bio is never split. Change it to __bio_for_each_segment() call so we won't have endio called on the same range by both split and parent bios, and it can handle both split and unsplit bios. - Make check_data_csum() to take bbio->offset_to_original into consideration Since btrfs bio can be split now, split/original bio can all start with some offset to the original logical bytenr. Take btrfs_bio::offset_to_original into consideration to get correct checksum offset. 
- Remove the BIO_CLONED ASSERT() in submit_read_repair() Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 34 +++++++++++++++++++--------------- fs/btrfs/inode.c | 23 +++++++++++++++++++++-- fs/btrfs/volumes.h | 3 ++- 3 files changed, 42 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 095bdc4775e7..049da3811bae 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2741,10 +2741,9 @@ static blk_status_t submit_read_repair(struct inode *inode, ASSERT(error_bitmap); /* - * We only get called on buffered IO, thus page must be mapped and bio - * must not be cloned. - */ - ASSERT(page->mapping && !bio_flagged(failed_bio, BIO_CLONED)); + * We only get called on buffered IO, thus page must be mapped + */ + ASSERT(page->mapping); /* Iterate through all the sectors in the range */ for (i = 0; i < nr_bits; i++) { @@ -2998,7 +2997,8 @@ static struct extent_buffer *find_extent_buffer_readpage( */ static void end_bio_extent_readpage(struct bio *bio) { - struct bio_vec *bvec; + struct bio_vec bvec; + struct bvec_iter iter; struct btrfs_bio *bbio = btrfs_bio(bio); struct extent_io_tree *tree, *failure_tree; struct processed_extent processed = { 0 }; @@ -3009,11 +3009,15 @@ static void end_bio_extent_readpage(struct bio *bio) u32 bio_offset = 0; int mirror; int ret; - struct bvec_iter_all iter_all; - bio_for_each_segment_all(bvec, bio, iter_all) { + /* + * We should have saved the orignal bi_iter, and then start iterating + * using that saved iter, as at endio time bi_iter is not reliable. 
+ */ + ASSERT(bbio->iter.bi_size); + __bio_for_each_segment(bvec, bio, iter, bbio->iter) { bool uptodate = !bio->bi_status; - struct page *page = bvec->bv_page; + struct page *page = bvec.bv_page; struct inode *inode = page->mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); const u32 sectorsize = fs_info->sectorsize; @@ -3036,19 +3040,19 @@ static void end_bio_extent_readpage(struct bio *bio) * for unaligned offsets, and an error if they don't add up to * a full sector. */ - if (!IS_ALIGNED(bvec->bv_offset, sectorsize)) + if (!IS_ALIGNED(bvec.bv_offset, sectorsize)) btrfs_err(fs_info, "partial page read in btrfs with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); - else if (!IS_ALIGNED(bvec->bv_offset + bvec->bv_len, + bvec.bv_offset, bvec.bv_len); + else if (!IS_ALIGNED(bvec.bv_offset + bvec.bv_len, sectorsize)) btrfs_info(fs_info, "incomplete page read with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); + bvec.bv_offset, bvec.bv_len); - start = page_offset(page) + bvec->bv_offset; - end = start + bvec->bv_len - 1; - len = bvec->bv_len; + start = page_offset(page) + bvec.bv_offset; + end = start + bvec.bv_len - 1; + len = bvec.bv_len; mirror = bbio->mirror_num; if (likely(uptodate)) { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1aa060de917c..186304c69900 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3225,6 +3225,24 @@ void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, finish_ordered_fn, uptodate); } +static u8 *bbio_get_real_csum(struct btrfs_fs_info *fs_info, + struct btrfs_bio *bbio) +{ + u8 *ret; + + /* Split bbio needs to grab csum from its parent */ + if (bbio->is_split_bio) + ret = btrfs_bio(bbio->parent)->csum; + else + ret = bbio->csum; + + if (ret == NULL) + return ret; + + return ret + (bbio->offset_to_original >> fs_info->sectorsize_bits) * + fs_info->csum_size; +} + /* * check_data_csum - verify checksum of one sector of uncompressed data * @inode: inode @@ -3252,7 
+3270,8 @@ static int check_data_csum(struct inode *inode, struct btrfs_bio *bbio, ASSERT(pgoff + len <= PAGE_SIZE); offset_sectors = bio_offset >> fs_info->sectorsize_bits; - csum_expected = ((u8 *)bbio->csum) + offset_sectors * csum_size; + csum_expected = bbio_get_real_csum(fs_info, bbio) + + offset_sectors * csum_size; kaddr = kmap_atomic(page); shash->tfm = fs_info->csum_shash; @@ -3310,7 +3329,7 @@ unsigned int btrfs_verify_data_csum(struct btrfs_bio *bbio, * Normally this should be covered by above check for compressed read * or the next check for NODATASUM. Just do a quicker exit here. */ - if (bbio->csum == NULL) + if (bbio_get_real_csum(fs_info, bbio) == NULL) return 0; if (BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM) diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 462b32c89abc..a7f3fd4b4226 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -400,7 +400,8 @@ static inline struct btrfs_bio *btrfs_bio(struct bio *bio) static inline void btrfs_bio_free_csum(struct btrfs_bio *bbio) { - if (bbio->csum != bbio->csum_inline) { + /* Only free the csum if we're not a split bio */ + if (!bbio->is_split_bio && bbio->csum != bbio->csum_inline) { kfree(bbio->csum); bbio->csum = NULL; } From patchwork Mon Dec 6 02:29:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FED3C433F5 for ; Mon, 6 Dec 2021 02:30:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234338AbhLFCdi (ORCPT ); Sun, 5 Dec 2021 21:33:38 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51870 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234319AbhLFCdh (ORCPT ); Sun, 
5 Dec 2021 21:33:37 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 58DFA1FD54 for ; Mon, 6 Dec 2021 02:30:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757808; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mVTYQGlKnXknG7jUoqzw7oV1bzUlbta6f3bJYyfXwEk=; b=rNKajSpd/8mTK1Sdn7cFeA1HYeJVMZtc0FmC1hoJU0b7fRd2g993YNXBPNaUFuXCNXo5td jsKrB0vYeHwdR04jAbW+/daNc6aYQrsacdj00EmAzD4B5yPA1ZCRXpOWqlP1Cg+rq4deHa vvLgcyb8OBf/kJEDbhStOIPk2mfp1mc= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id AB52213451 for ; Mon, 6 Dec 2021 02:30:07 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id uE4sHa91rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:07 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 09/17] btrfs: make data buffered write endio function to be split bio compatible Date: Mon, 6 Dec 2021 10:29:29 +0800 Message-Id: <20211206022937.26465-10-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Only need to change the bio_for_each_segment_all() call to __bio_for_each_segment() call, and using btrfs_bio::iter as 
the initial bi_iter. Now the endio function can handle both split and unsplit bios well. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 049da3811bae..952789ed650d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2833,31 +2833,31 @@ void end_extent_writepage(struct page *page, int err, u64 start, u64 end) static void end_bio_extent_writepage(struct bio *bio) { int error = blk_status_to_errno(bio->bi_status); - struct bio_vec *bvec; + struct bio_vec bvec; + struct bvec_iter iter; u64 start; u64 end; - struct bvec_iter_all iter_all; bool first_bvec = true; - ASSERT(!bio_flagged(bio, BIO_CLONED)); - bio_for_each_segment_all(bvec, bio, iter_all) { - struct page *page = bvec->bv_page; + ASSERT(btrfs_bio(bio)->iter.bi_size); + __bio_for_each_segment(bvec, bio, iter, btrfs_bio(bio)->iter) { + struct page *page = bvec.bv_page; struct inode *inode = page->mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); const u32 sectorsize = fs_info->sectorsize; /* Our read/write should always be sector aligned. 
*/ - if (!IS_ALIGNED(bvec->bv_offset, sectorsize)) + if (!IS_ALIGNED(bvec.bv_offset, sectorsize)) btrfs_err(fs_info, "partial page write in btrfs with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); - else if (!IS_ALIGNED(bvec->bv_len, sectorsize)) + bvec.bv_offset, bvec.bv_len); + else if (!IS_ALIGNED(bvec.bv_len, sectorsize)) btrfs_info(fs_info, "incomplete page write with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); + bvec.bv_offset, bvec.bv_len); - start = page_offset(page) + bvec->bv_offset; - end = start + bvec->bv_len - 1; + start = page_offset(page) + bvec.bv_offset; + end = start + bvec.bv_len - 1; if (first_bvec) { btrfs_record_physical_zoned(inode, start, bio); @@ -2866,7 +2866,7 @@ static void end_bio_extent_writepage(struct bio *bio) end_extent_writepage(page, error, start, end); - btrfs_page_clear_writeback(fs_info, page, start, bvec->bv_len); + btrfs_page_clear_writeback(fs_info, page, start, bvec.bv_len); } bio_put(bio); From patchwork Mon Dec 6 02:29:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657605 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 561D8C433FE for ; Mon, 6 Dec 2021 02:30:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234357AbhLFCdj (ORCPT ); Sun, 5 Dec 2021 21:33:39 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51876 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234308AbhLFCdi (ORCPT ); Sun, 5 Dec 2021 21:33:38 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) 
(No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 6CCE71FD5F for ; Mon, 6 Dec 2021 02:30:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757809; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uPkQDmRvVGx6Gup1EH6vU/6lNpT3PLLRIeEViifWQqc=; b=pfIkBhqbiCrBNJTqsadPaPayBunWsxe6V7ggdabXcSAvojVpIVn1QP04HqqlSwSWxkhwD3 5oRp4A1T/0go5X17HivGdleV/N0PK/CbEm+t2Yk1hbShju88+Wbo30idyWLFRPgD8hZo4Q BmWpyMo5KjVG4JCcIaWqCdbO7cdRKMc= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id BCDD213451 for ; Mon, 6 Dec 2021 02:30:08 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id IAEsIbB1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:08 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 10/17] btrfs: make metadata write endio functions to be split bio compatible Date: Mon, 6 Dec 2021 10:29:30 +0800 Message-Id: <20211206022937.26465-11-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Only need to convert the bio_for_each_segment_all() call into __bio_for_each_segment() call and using btrfs_bio::iter as the initial iterator. 
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 952789ed650d..cb99b55ccf87 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4502,20 +4502,20 @@ static struct extent_buffer *find_extent_buffer_nolock( static void end_bio_subpage_eb_writepage(struct bio *bio) { struct btrfs_fs_info *fs_info; - struct bio_vec *bvec; - struct bvec_iter_all iter_all; + struct bvec_iter iter; + struct bio_vec bvec; fs_info = btrfs_sb(bio_first_page_all(bio)->mapping->host->i_sb); ASSERT(fs_info->sectorsize < PAGE_SIZE); - ASSERT(!bio_flagged(bio, BIO_CLONED)); - bio_for_each_segment_all(bvec, bio, iter_all) { - struct page *page = bvec->bv_page; - u64 bvec_start = page_offset(page) + bvec->bv_offset; - u64 bvec_end = bvec_start + bvec->bv_len - 1; + ASSERT(btrfs_bio(bio)->iter.bi_size); + __bio_for_each_segment(bvec, bio, iter, btrfs_bio(bio)->iter) { + struct page *page = bvec.bv_page; + u64 bvec_start = page_offset(page) + bvec.bv_offset; + u64 bvec_end = bvec_start + bvec.bv_len - 1; u64 cur_bytenr = bvec_start; - ASSERT(IS_ALIGNED(bvec->bv_len, fs_info->nodesize)); + ASSERT(IS_ALIGNED(bvec.bv_len, fs_info->nodesize)); /* Iterate through all extent buffers in the range */ while (cur_bytenr <= bvec_end) { @@ -4558,14 +4558,14 @@ static void end_bio_subpage_eb_writepage(struct bio *bio) static void end_bio_extent_buffer_writepage(struct bio *bio) { - struct bio_vec *bvec; struct extent_buffer *eb; + struct bvec_iter iter; + struct bio_vec bvec; int done; - struct bvec_iter_all iter_all; - ASSERT(!bio_flagged(bio, BIO_CLONED)); - bio_for_each_segment_all(bvec, bio, iter_all) { - struct page *page = bvec->bv_page; + ASSERT(btrfs_bio(bio)->iter.bi_size); + __bio_for_each_segment(bvec, bio, iter, btrfs_bio(bio)->iter) { + struct page *page = bvec.bv_page; eb = (struct extent_buffer *)page->private; BUG_ON(!eb); From patchwork Mon 
Dec 6 02:29:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BEB5C433F5 for ; Mon, 6 Dec 2021 02:30:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234363AbhLFCdj (ORCPT ); Sun, 5 Dec 2021 21:33:39 -0500 Received: from smtp-out1.suse.de ([195.135.220.28]:54764 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234319AbhLFCdj (ORCPT ); Sun, 5 Dec 2021 21:33:39 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 83D272190C for ; Mon, 6 Dec 2021 02:30:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757810; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dWbllxqA0p40YAuwzUYnjHSoOkwp7My1ysAjiCYQxWs=; b=H2amgclrzfZmFYgy+dIWvNx8+Mixsm84s+1Qbmv9ckoMsJJzXNgV1J1c27YiDgJtx5o/id 5nMzrTHqyN2mVdvN4vjVIrpE/2Eyp9l1FfAArvLzJKXxmnRlIZdZ94L4C2KVyrL0ua3C3G 7FJ5iMZ6M7W80OltinIN94PulOZ4WNg= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id D1A8913451 for ; Mon, 6 Dec 2021 02:30:09 +0000 (UTC) 
Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 0AGUJrF1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:09 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 11/17] btrfs: make dec_and_test_compressed_bio() to be split bio compatible Date: Mon, 6 Dec 2021 10:29:31 +0800 Message-Id: <20211206022937.26465-12-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For compression read write endio functions, they all rely on dec_and_test_compressed_bio() to determine if they are the last bio. So here we only need to convert the bio_for_each_segment_all() call into __bio_for_each_segment() so that compression read/write endio functions will handle both split and unsplit bios well. Signed-off-by: Qu Wenruo --- fs/btrfs/compression.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 8668c5190805..8b4b84b59b0c 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -205,18 +205,14 @@ static int check_compressed_csum(struct btrfs_inode *inode, struct bio *bio, static bool dec_and_test_compressed_bio(struct compressed_bio *cb, struct bio *bio) { struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb); + struct bio_vec bvec; + struct bvec_iter iter; unsigned int bi_size = 0; bool last_io = false; - struct bio_vec *bvec; - struct bvec_iter_all iter_all; - /* - * At endio time, bi_iter.bi_size doesn't represent the real bio size. - * Thus here we have to iterate through all segments to grab correct - * bio size. 
- */ - bio_for_each_segment_all(bvec, bio, iter_all) - bi_size += bvec->bv_len; + ASSERT(btrfs_bio(bio)->iter.bi_size); + __bio_for_each_segment(bvec, bio, iter, btrfs_bio(bio)->iter) + bi_size += bvec.bv_len; if (bio->bi_status) cb->errors = 1; From patchwork Mon Dec 6 02:29:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657609 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 283E0C433EF for ; Mon, 6 Dec 2021 02:30:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234368AbhLFCdl (ORCPT ); Sun, 5 Dec 2021 21:33:41 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51882 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234319AbhLFCdk (ORCPT ); Sun, 5 Dec 2021 21:33:40 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 12B891FD54 for ; Mon, 6 Dec 2021 02:30:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757812; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=edE4ZBdzotffr5UGM/HN/gCyvVLtg4ZCBKgNYPt03mk=; b=VGw4Z3qdTLhIq7DBFUAa3Obh+k9923WEeFd3zIokLwLRVp5Lc8ZgibbokmvNlTXNth316e ON4BG1vXd9HjxXf3M2SDYTb1eitA/V6a0e/symzh7IJ67TtmLi/3dRxHHvjsx0ly8R0+8f 4PloSq5k1xh8LARIhSJj8f1q+Ctzd/w= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using 
TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id E890613451 for ; Mon, 6 Dec 2021 02:30:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id uDDXK7J1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:10 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 12/17] btrfs: return proper mapped length for RAID56 profiles in __btrfs_map_block() Date: Mon, 6 Dec 2021 10:29:32 +0800 Message-Id: <20211206022937.26465-13-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For profiles other than RAID56, __btrfs_map_block() returns @map_length as min(stripe_end, logical + *length), which is also the same result from btrfs_get_io_geometry(). But for RAID56, __btrfs_map_block() returns @map_length as stripe_len. This strange behavior is going to hurt incoming bio split at btrfs_map_bio() time, as we will use @map_length as bio split size. 
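As a rough illustration of the length that the other profiles already return, and that RAID56 should return as well, here is a minimal userspace sketch (not kernel code). The helper name raid56_map_length(), the STRIPE_LEN macro and its 64K value are assumptions made for this example only; the only part taken from the patch is the min(full stripe end, logical + length) - logical calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a fixed 64K stripe length. */
#define STRIPE_LEN (64 * 1024ULL)

/*
 * Hypothetical helper: return how many bytes starting at @logical can be
 * mapped before hitting the end of the RAID56 full stripe, capped by the
 * requested @length.
 */
static uint64_t raid56_map_length(uint64_t chunk_start, uint64_t logical,
                                  uint64_t length, unsigned int data_stripes)
{
    uint64_t full_stripe_len = (uint64_t)data_stripes * STRIPE_LEN;
    uint64_t offset;
    uint64_t full_stripe_start;
    uint64_t full_stripe_end;
    uint64_t io_end = logical + length;

    assert(data_stripes > 0);
    assert(logical >= chunk_start);

    /* Offset inside the chunk, rounded down to a full stripe boundary. */
    offset = logical - chunk_start;
    full_stripe_start = offset - offset % full_stripe_len;
    full_stripe_end = chunk_start + full_stripe_start + full_stripe_len;

    /* min(full stripe end, logical + length) - logical */
    return (io_end < full_stripe_end ? io_end : full_stripe_end) - logical;
}
```

With two data stripes, a 64K request starting 96K into the chunk would be mapped for only 32K, so a caller using this value as its split size would cut the bio at the full stripe boundary.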
Fix this behavior by: - Return @map_length by the same calculation as other profiles - Save stripe_len into btrfs_io_context - Pass btrfs_io_context::stripe_len to raid56_*() functions - Update raid56_*() functions to make their stripe_len parameter more explicit - Update scrub_stripe_index_and_offset() to properly name its parameters - Add extra ASSERT()s to make sure the passed stripe_len is correct Signed-off-by: Qu Wenruo --- fs/btrfs/raid56.c | 12 ++++++++++-- fs/btrfs/raid56.h | 2 +- fs/btrfs/scrub.c | 14 ++++++++------ fs/btrfs/volumes.c | 13 ++++++++++--- fs/btrfs/volumes.h | 1 + 5 files changed, 30 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c index 13e726c88a81..d35cfd750b76 100644 --- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -969,6 +969,8 @@ static struct btrfs_raid_bio *alloc_rbio(struct btrfs_fs_info *fs_info, int stripe_npages = DIV_ROUND_UP(stripe_len, PAGE_SIZE); void *p; + ASSERT(stripe_len == BTRFS_STRIPE_LEN); + rbio = kzalloc(sizeof(*rbio) + sizeof(*rbio->stripe_pages) * num_pages + sizeof(*rbio->bio_pages) * num_pages + @@ -1725,6 +1727,9 @@ int raid56_parity_write(struct bio *bio, struct btrfs_io_context *bioc, struct blk_plug_cb *cb; int ret; + /* Currently we only support fixed stripe len */ + ASSERT(stripe_len == BTRFS_STRIPE_LEN); + rbio = alloc_rbio(fs_info, bioc, stripe_len); if (IS_ERR(rbio)) { btrfs_put_bioc(bioc); @@ -2122,6 +2127,9 @@ int raid56_parity_recover(struct bio *bio, struct btrfs_io_context *bioc, struct btrfs_raid_bio *rbio; int ret; + /* Currently we only support fixed stripe len */ + ASSERT(stripe_len == BTRFS_STRIPE_LEN); + if (generic_io) { ASSERT(bioc->mirror_num == mirror_num); btrfs_bio(bio)->mirror_num = mirror_num; @@ -2671,12 +2679,12 @@ void raid56_parity_submit_scrub_rbio(struct btrfs_raid_bio *rbio) struct btrfs_raid_bio * raid56_alloc_missing_rbio(struct bio *bio, struct btrfs_io_context *bioc, - u64 length) + u64 stripe_len) { struct btrfs_fs_info *fs_info =
bioc->fs_info; struct btrfs_raid_bio *rbio; - rbio = alloc_rbio(fs_info, bioc, length); + rbio = alloc_rbio(fs_info, bioc, stripe_len); if (IS_ERR(rbio)) return NULL; diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h index 72c00fc284b5..7322dcae4498 100644 --- a/fs/btrfs/raid56.h +++ b/fs/btrfs/raid56.h @@ -46,7 +46,7 @@ void raid56_parity_submit_scrub_rbio(struct btrfs_raid_bio *rbio); struct btrfs_raid_bio * raid56_alloc_missing_rbio(struct bio *bio, struct btrfs_io_context *bioc, - u64 length); + u64 stripe_len); void raid56_submit_missing_rbio(struct btrfs_raid_bio *rbio); int btrfs_alloc_stripe_hash_table(struct btrfs_fs_info *info); diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 15a123e67108..59bb2d08e697 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -1229,13 +1229,15 @@ static inline int scrub_nr_raid_mirrors(struct btrfs_io_context *bioc) static inline void scrub_stripe_index_and_offset(u64 logical, u64 map_type, u64 *raid_map, - u64 mapped_length, + u64 stripe_len, int nstripes, int mirror, int *stripe_index, u64 *stripe_offset) { int i; + ASSERT(stripe_len == BTRFS_STRIPE_LEN); + if (map_type & BTRFS_BLOCK_GROUP_RAID56_MASK) { /* RAID5/6 */ for (i = 0; i < nstripes; i++) { @@ -1244,7 +1246,7 @@ static inline void scrub_stripe_index_and_offset(u64 logical, u64 map_type, continue; if (logical >= raid_map[i] && - logical < raid_map[i] + mapped_length) + logical < raid_map[i] + stripe_len) break; } @@ -1349,7 +1351,7 @@ static int scrub_setup_recheck_block(struct scrub_block *original_sblock, scrub_stripe_index_and_offset(logical, bioc->map_type, bioc->raid_map, - mapped_length, + bioc->stripe_len, bioc->num_stripes - bioc->num_tgtdevs, mirror_index, @@ -1401,7 +1403,7 @@ static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info, mirror_num = spage->sblock->pagev[0]->mirror_num; ret = raid56_parity_recover(bio, spage->recover->bioc, - spage->recover->map_length, + spage->recover->bioc->stripe_len, mirror_num, 0); if (ret) return 
ret; @@ -2230,7 +2232,7 @@ static void scrub_missing_raid56_pages(struct scrub_block *sblock) bio->bi_private = sblock; bio->bi_end_io = scrub_missing_raid56_end_io; - rbio = raid56_alloc_missing_rbio(bio, bioc, length); + rbio = raid56_alloc_missing_rbio(bio, bioc, bioc->stripe_len); if (!rbio) goto rbio_out; @@ -2846,7 +2848,7 @@ static void scrub_parity_check_and_repair(struct scrub_parity *sparity) bio->bi_private = sparity; bio->bi_end_io = scrub_parity_bio_endio; - rbio = raid56_parity_alloc_scrub_rbio(bio, bioc, length, + rbio = raid56_parity_alloc_scrub_rbio(bio, bioc, bioc->stripe_len, sparity->scrub_dev, sparity->dbitmap, sparity->nsectors); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index fba08cfcbd4e..6d962450e355 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6051,6 +6051,7 @@ static int __btrfs_map_block_for_discard(struct btrfs_fs_info *fs_info, ret = -ENOMEM; goto out; } + bioc->stripe_len = map->stripe_len; for (i = 0; i < num_stripes; i++) { bioc->stripes[i].physical = @@ -6406,6 +6407,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, { struct extent_map *em; struct map_lookup *map; + const u64 orig_length = *length; u64 stripe_offset; u64 stripe_nr; u64 stripe_len; @@ -6427,6 +6429,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, ASSERT(bioc_ret); ASSERT(op != BTRFS_MAP_DISCARD); + ASSERT(orig_length); em = btrfs_get_chunk_map(fs_info, logical, *length); ASSERT(!IS_ERR(em)); @@ -6522,7 +6525,10 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, num_stripes = map->num_stripes; max_errors = nr_parity_stripes(map); - *length = map->stripe_len; + /* Return the length to the full stripe end */ + *length = min(raid56_full_stripe_start + em->start + + data_stripes * stripe_len, + logical + orig_length) - logical; stripe_index = 0; stripe_offset = 0; } else { @@ -6574,6 +6580,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, ret = -ENOMEM; goto out; } + bioc->stripe_len = 
map->stripe_len; for (i = 0; i < num_stripes; i++) { bioc->stripes[i].physical = map->stripes[stripe_index].physical + @@ -6824,9 +6831,9 @@ static int submit_one_mapped_range(struct btrfs_fs_info *fs_info, struct bio *bi /* In this case, map_length has been set to the length of a single stripe; not the whole write */ if (btrfs_op(bio) == BTRFS_MAP_WRITE) { - ret = raid56_parity_write(bio, bioc, map_length); + ret = raid56_parity_write(bio, bioc, bioc->stripe_len); } else { - ret = raid56_parity_recover(bio, bioc, map_length, + ret = raid56_parity_recover(bio, bioc, bioc->stripe_len, mirror_num, 1); } diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index a7f3fd4b4226..04c016a844f8 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -449,6 +449,7 @@ struct btrfs_io_context { struct bio *orig_bio; void *private; atomic_t error; + u32 stripe_len; int max_errors; int num_stripes; int mirror_num; From patchwork Mon Dec 6 02:29:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B86F1C433FE for ; Mon, 6 Dec 2021 02:30:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234387AbhLFCdm (ORCPT ); Sun, 5 Dec 2021 21:33:42 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51888 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234319AbhLFCdl (ORCPT ); Sun, 5 Dec 2021 21:33:41 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de 
(Postfix) with ESMTPS id 239CE1FD5F for ; Mon, 6 Dec 2021 02:30:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757813; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5QbHCJEQ0zg9iOEkjlT8eDJEhk/3WEisOO9BqRfuoQw=; b=sZZvZYtDxKgzgjO0L26knakbDdiNzMEvehiUnTb2YqWZVozpgCLgKznqvj4IeOADB28zw1 twHlRDDBPjbKVklsGJBWn3ghCIvlDINj5KXDReuyiCPuhoJa56DftiopRufHqCosA1Ctmv ixrDTApuEB6CqZlCxinowod8cUVBPeU= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 7690513451 for ; Mon, 6 Dec 2021 02:30:12 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 8DVMELR1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:12 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 13/17] btrfs: allow btrfs_map_bio() to split bio according to chunk stripe boundaries Date: Mon, 6 Dec 2021 10:29:33 +0800 Message-Id: <20211206022937.26465-14-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org With the new btrfs_bio_split() helper, we are able to split a bio according to chunk stripe boundaries at btrfs_map_bio() time. However, because bios are currently still split at buffered/compressed/direct IO time, this ability is not yet utilized.
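The splitting idea can be sketched in userspace roughly as follows. This is only an illustration under simplifying assumptions: map_length_to_boundary() is a hypothetical stand-in for __btrfs_map_block() that clamps to a fixed boundary, and split_range() is a hypothetical stand-in for the loop in btrfs_map_bio() that records piece lengths instead of splitting and submitting bios.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for __btrfs_map_block(): clamp @length so that the
 * range starting at @logical does not cross the next multiple of @boundary.
 */
static uint64_t map_length_to_boundary(uint64_t logical, uint64_t length,
                                       uint64_t boundary)
{
    uint64_t to_boundary = boundary - logical % boundary;

    return length < to_boundary ? length : to_boundary;
}

/*
 * Walk [logical, logical + length) and store the length of each piece in
 * @pieces. Returns the number of pieces stored (at most @max_pieces).
 */
static size_t split_range(uint64_t logical, uint64_t length, uint64_t boundary,
                          uint64_t *pieces, size_t max_pieces)
{
    const uint64_t end = logical + length;
    uint64_t cur = logical;
    size_t nr = 0;

    assert(boundary > 0);

    while (cur < end && nr < max_pieces) {
        /* Ask for everything that is left, get back what fits. */
        uint64_t map_length = map_length_to_boundary(cur, end - cur, boundary);

        /* The real code would split the bio here and submit the piece. */
        pieces[nr++] = map_length;
        cur += map_length;
    }
    return nr;
}
```

For a 100K range starting at offset 48K with a 64K boundary, this walk yields three pieces of 16K, 64K and 20K.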
Signed-off-by: Qu Wenruo --- fs/btrfs/volumes.c | 50 +++++++++++++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 6d962450e355..301fc34320ed 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6879,30 +6879,44 @@ static int submit_one_mapped_range(struct btrfs_fs_info *fs_info, struct bio *bi blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, int mirror_num) { - u64 logical = bio->bi_iter.bi_sector << 9; - u64 length = 0; - u64 map_length; + const u64 orig_logical = bio->bi_iter.bi_sector << SECTOR_SHIFT; + const unsigned int orig_length = bio->bi_iter.bi_size; + const enum btrfs_map_op op = btrfs_op(bio); + u64 cur_logical = orig_logical; int ret; - struct btrfs_io_context *bioc = NULL; - length = bio->bi_iter.bi_size; - map_length = length; + while (cur_logical < orig_logical + orig_length) { + u64 map_length = orig_logical + orig_length - cur_logical; + struct btrfs_io_context *bioc = NULL; + struct bio *cur_bio; - btrfs_bio_counter_inc_blocked(fs_info); - btrfs_bio_save_iter(btrfs_bio(bio)); - ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical, - &map_length, &bioc, mirror_num, 1); - if (ret) { - btrfs_bio_counter_dec(fs_info); - return errno_to_blk_status(ret); - } + btrfs_bio_save_iter(btrfs_bio(bio)); + ret = __btrfs_map_block(fs_info, op, cur_logical, &map_length, + &bioc, mirror_num, 1); + if (ret) + return errno_to_blk_status(ret); - ret = submit_one_mapped_range(fs_info, bio, bioc, map_length, mirror_num); - if (ret < 0) { + if (cur_logical + map_length < orig_logical + orig_length) { + /* + * For now zoned write should never cross stripe + * boundary + */ + ASSERT(bio_op(bio) != REQ_OP_ZONE_APPEND); + + /* Split the bio */ + cur_bio = btrfs_bio_split(fs_info, bio, map_length); + } else { + /* Use the existing bio directly */ + cur_bio = bio; + } + btrfs_bio_counter_inc_blocked(fs_info); + ret = 
submit_one_mapped_range(fs_info, cur_bio, bioc, + map_length, mirror_num); btrfs_bio_counter_dec(fs_info); - return errno_to_blk_status(ret); + if (ret < 0) + return errno_to_blk_status(ret); + cur_logical += map_length; } - btrfs_bio_counter_dec(fs_info); return BLK_STS_OK; } From patchwork Mon Dec 6 02:29:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4A81C433EF for ; Mon, 6 Dec 2021 02:30:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234390AbhLFCdn (ORCPT ); Sun, 5 Dec 2021 21:33:43 -0500 Received: from smtp-out1.suse.de ([195.135.220.28]:54770 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234382AbhLFCdn (ORCPT ); Sun, 5 Dec 2021 21:33:43 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 36D862190C for ; Mon, 6 Dec 2021 02:30:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757814; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7V/LMd3XDBR7c0xuWvWTrFo5XoeE6zth8zcTilDCxt8=; b=U8lzYyDIz14kYhu0c06C4fL6VCI+37BE7EpcwEL3X6iCDnN26vt7kHRc0nDNN+O/mnFT58 k2UtY+OyzVOjBF/QLesDZCSEKCy9W+otph2V/uvgd8qAaQxIgVevFsfqGow22YV5Az4ljU KZW+FdoU9gbsrXBU4QjPK1mc7gNkrBU= Received: from imap2.suse-dmz.suse.de 
(imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 8A92013451 for ; Mon, 6 Dec 2021 02:30:13 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id OBmfFLV1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:13 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 14/17] btrfs: remove buffered IO stripe boundary calculation Date: Mon, 6 Dec 2021 10:29:34 +0800 Message-Id: <20211206022937.26465-15-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Remove btrfs_bio_ctrl::len_to_stripe_boundary, so that buffered IO no longer limits its bio size according to stripe length. This moves the bio split to btrfs_map_bio() for all buffered IO.
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 23 ++--------------------- 1 file changed, 2 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index cb99b55ccf87..97045927b763 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3307,7 +3307,7 @@ static int btrfs_bio_add_page(struct btrfs_bio_ctrl *bio_ctrl, ASSERT(bio); /* The limit should be calculated when bio_ctrl->bio is allocated */ - ASSERT(bio_ctrl->len_to_oe_boundary && bio_ctrl->len_to_stripe_boundary); + ASSERT(bio_ctrl->len_to_oe_boundary); if (bio_ctrl->bio_flags != bio_flags) return 0; @@ -3318,9 +3318,7 @@ static int btrfs_bio_add_page(struct btrfs_bio_ctrl *bio_ctrl, if (!contig) return 0; - real_size = min(bio_ctrl->len_to_oe_boundary, - bio_ctrl->len_to_stripe_boundary) - bio_size; - real_size = min(real_size, size); + real_size = min(bio_ctrl->len_to_oe_boundary - bio_size, size); /* * If real_size is 0, never call bio_add_*_page(), as even size is 0, @@ -3341,11 +3339,8 @@ static int calc_bio_boundaries(struct btrfs_bio_ctrl *bio_ctrl, struct btrfs_inode *inode, u64 file_offset) { struct btrfs_fs_info *fs_info = inode->root->fs_info; - struct btrfs_io_geometry geom; struct btrfs_ordered_extent *ordered; - struct extent_map *em; u64 logical = (bio_ctrl->bio->bi_iter.bi_sector << SECTOR_SHIFT); - int ret; /* * Pages for compressed extent are never submitted to disk directly, @@ -3356,22 +3351,8 @@ static int calc_bio_boundaries(struct btrfs_bio_ctrl *bio_ctrl, */ if (bio_ctrl->bio_flags & EXTENT_BIO_COMPRESSED) { bio_ctrl->len_to_oe_boundary = U32_MAX; - bio_ctrl->len_to_stripe_boundary = U32_MAX; return 0; } - em = btrfs_get_chunk_map(fs_info, logical, fs_info->sectorsize); - if (IS_ERR(em)) - return PTR_ERR(em); - ret = btrfs_get_io_geometry(fs_info, em, btrfs_op(bio_ctrl->bio), - logical, &geom); - free_extent_map(em); - if (ret < 0) { - return ret; - } - if (geom.len > U32_MAX) - bio_ctrl->len_to_stripe_boundary = U32_MAX; - else - 
bio_ctrl->len_to_stripe_boundary = (u32)geom.len; if (!btrfs_is_zoned(fs_info) || bio_op(bio_ctrl->bio) != REQ_OP_ZONE_APPEND) { From patchwork Mon Dec 6 02:29:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9182C433FE for ; Mon, 6 Dec 2021 02:30:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234396AbhLFCdo (ORCPT ); Sun, 5 Dec 2021 21:33:44 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51894 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234319AbhLFCdn (ORCPT ); Sun, 5 Dec 2021 21:33:43 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 456B91FD54 for ; Mon, 6 Dec 2021 02:30:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757815; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FfzzaplRdzADfh5C67JmZThB8OL3fW3foCobHjbNNyA=; b=p55ZIx/javsmBQTo24yn54ZzAiwmfEVxRJX/fQ9I2P0YM2kffudUQF5GIeddRSKUqBdzcj PAo+AF6BBZ+wcj6194qYEVQlnigUFbA3Kjb4CftsE3wgZJ8zrnkwWTGYugqkJwYlJ+Bzyi 3iOZ9lgdBa5AKtK2Jxh7kBFpmo3bWMM= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest 
SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 9AA9A13451 for ; Mon, 6 Dec 2021 02:30:14 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id GMQZGbZ1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:14 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 15/17] btrfs: remove stripe boundary calculation for compressed IO Date: Mon, 6 Dec 2021 10:29:35 +0800 Message-Id: <20211206022937.26465-16-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For compressed IO, we calculate the next stripe start inside alloc_compressed_bio(). Since now btrfs_map_bio() can handle bio split, we no longer need to calculate the boundary any more. Signed-off-by: Qu Wenruo --- fs/btrfs/compression.c | 49 +++++------------------------------------- 1 file changed, 5 insertions(+), 44 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 8b4b84b59b0c..70af7d3973b7 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -442,21 +442,15 @@ static blk_status_t submit_compressed_bio(struct btrfs_fs_info *fs_info, * from or written to. * @endio_func: The endio function to call after the IO for compressed data * is finished. - * @next_stripe_start: Return value of logical bytenr of where next stripe starts. - * Let the caller know to only fill the bio up to the stripe - * boundary. 
*/ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_bytenr, - unsigned int opf, bio_end_io_t endio_func, - u64 *next_stripe_start) + unsigned int opf, bio_end_io_t endio_func) { struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb); - struct btrfs_io_geometry geom; struct extent_map *em; struct bio *bio; - int ret; bio = btrfs_bio_alloc(BIO_MAX_VECS); @@ -473,14 +467,7 @@ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_byte if (bio_op(bio) == REQ_OP_ZONE_APPEND) bio_set_dev(bio, em->map_lookup->stripes[0].dev->bdev); - - ret = btrfs_get_io_geometry(fs_info, em, btrfs_op(bio), disk_bytenr, &geom); free_extent_map(em); - if (ret < 0) { - bio_put(bio); - return ERR_PTR(ret); - } - *next_stripe_start = disk_bytenr + geom.len; return bio; } @@ -506,7 +493,6 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, struct bio *bio = NULL; struct compressed_bio *cb; u64 cur_disk_bytenr = disk_start; - u64 next_stripe_start; blk_status_t ret; int skip_sum = inode->flags & BTRFS_INODE_NODATASUM; const bool use_append = btrfs_use_zone_append(inode, disk_start); @@ -539,28 +525,19 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, /* Allocate new bio if submitted or not yet allocated */ if (!bio) { bio = alloc_compressed_bio(cb, cur_disk_bytenr, - bio_op | write_flags, end_compressed_bio_write, - &next_stripe_start); + bio_op | write_flags, end_compressed_bio_write); if (IS_ERR(bio)) { ret = errno_to_blk_status(PTR_ERR(bio)); bio = NULL; goto finish_cb; } } - /* - * We should never reach next_stripe_start start as we will - * submit comp_bio when reach the boundary immediately. 
- */ - ASSERT(cur_disk_bytenr != next_stripe_start); - /* * We have various limits on the real read size: - * - stripe boundary * - page boundary * - compressed length boundary */ - real_size = min_t(u64, U32_MAX, next_stripe_start - cur_disk_bytenr); - real_size = min_t(u64, real_size, PAGE_SIZE - offset_in_page(offset)); + real_size = min_t(u64, U32_MAX, PAGE_SIZE - offset_in_page(offset)); real_size = min_t(u64, real_size, compressed_len - offset); ASSERT(IS_ALIGNED(real_size, fs_info->sectorsize)); @@ -575,9 +552,6 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, submit = true; cur_disk_bytenr += added; - /* Reached stripe boundary */ - if (cur_disk_bytenr == next_stripe_start) - submit = true; /* Finished the range */ if (cur_disk_bytenr == disk_start + compressed_len) @@ -797,7 +771,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, struct bio *comp_bio = NULL; const u64 disk_bytenr = bio->bi_iter.bi_sector << SECTOR_SHIFT; u64 cur_disk_byte = disk_bytenr; - u64 next_stripe_start; u64 file_offset; u64 em_len; u64 em_start; @@ -878,27 +851,19 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, /* Allocate new bio if submitted or not yet allocated */ if (!comp_bio) { comp_bio = alloc_compressed_bio(cb, cur_disk_byte, - REQ_OP_READ, end_compressed_bio_read, - &next_stripe_start); + REQ_OP_READ, end_compressed_bio_read); if (IS_ERR(comp_bio)) { ret = errno_to_blk_status(PTR_ERR(comp_bio)); comp_bio = NULL; goto finish_cb; } } - /* - * We should never reach next_stripe_start start as we will - * submit comp_bio when reach the boundary immediately. 
- */ - ASSERT(cur_disk_byte != next_stripe_start); /* * We have various limit on the real read size: - * - stripe boundary * - page boundary * - compressed length boundary */ - real_size = min_t(u64, U32_MAX, next_stripe_start - cur_disk_byte); - real_size = min_t(u64, real_size, PAGE_SIZE - offset_in_page(offset)); + real_size = min_t(u64, U32_MAX, PAGE_SIZE - offset_in_page(offset)); real_size = min_t(u64, real_size, compressed_len - offset); ASSERT(IS_ALIGNED(real_size, fs_info->sectorsize)); @@ -910,10 +875,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, ASSERT(added == real_size); cur_disk_byte += added; - /* Reached stripe boundary, need to submit */ - if (cur_disk_byte == next_stripe_start) - submit = true; - /* Has finished the range, need to submit */ if (cur_disk_byte == disk_bytenr + compressed_len) submit = true; From patchwork Mon Dec 6 02:29:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17760C433EF for ; Mon, 6 Dec 2021 02:30:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234406AbhLFCdp (ORCPT ); Sun, 5 Dec 2021 21:33:45 -0500 Received: from smtp-out2.suse.de ([195.135.220.29]:51900 "EHLO smtp-out2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234394AbhLFCdp (ORCPT ); Sun, 5 Dec 2021 21:33:45 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 54FDB1FD5F for ; Mon, 6 
Dec 2021 02:30:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757816; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OtXkiQjVq07s6cIpFoM/BWXTJJ8eZBowQdaODkesbCA=; b=tqku+oL9slTxuR+IEmACkyfK6N5gw/ynoyEhnDg+v8L/bc/XzY5asKyGnv7848n866pzTN sFIjUqFj0+O1VA7Efx1jCD9pOweuSxT4b8/OExhZa0maQ+fUw7pKolAnOYdhs/KO7ib8nT p7yJ+6I6SFWBF7dwl9rATpy0rJCpKZ4= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id A8DCB13451 for ; Mon, 6 Dec 2021 02:30:15 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id IAqHHLd1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:15 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 16/17] btrfs: remove the stripe boundary calculation for direct IO Date: Mon, 6 Dec 2021 10:29:36 +0800 Message-Id: <20211206022937.26465-17-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In btrfs_submit_direct() we have a do {} while () loop to split the bio at stripe boundaries. Since btrfs_map_bio() now handles the split for us, there is no need to do it manually anymore. Also, since we no longer split the bio, the special check for RAID56 is no longer needed, so make btrfs_submit_dio_bio() follow the same rule as btrfs_submit_data_bio() for async submit.
Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 113 ++++++++++------------------------------------- 1 file changed, 24 insertions(+), 89 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 186304c69900..8ffec0fe6c4e 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8222,22 +8222,16 @@ static void btrfs_end_dio_bio(struct bio *bio) } static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, - struct inode *inode, u64 file_offset, int async_submit) + struct inode *inode, u64 file_offset) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_dio_private *dip = bio->bi_private; bool write = btrfs_op(bio) == BTRFS_MAP_WRITE; + bool async_submit; blk_status_t ret; - /* - * Check btrfs_submit_data_bio() for rules about async submit. - * - * The only exception is for RAID56, when there are more than one bios - * to submit, async submit seems to make it harder to collect csums - * for the full stripe. - */ - if (async_submit) - async_submit = !atomic_read(&BTRFS_I(inode)->sync_writers); + /* Check btrfs_submit_data_bio() for rules about async submit. 
*/ + async_submit = !atomic_read(&BTRFS_I(inode)->sync_writers); if (!write) btrfs_bio(bio)->endio_type = BTRFS_WQ_ENDIO_DATA; @@ -8311,25 +8305,12 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, struct bio *dio_bio, loff_t file_offset) { struct inode *inode = iter->inode; + struct btrfs_dio_data *dio_data = iter->iomap.private; const bool write = (btrfs_op(dio_bio) == BTRFS_MAP_WRITE); - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); - const bool raid56 = (btrfs_data_alloc_profile(fs_info) & - BTRFS_BLOCK_GROUP_RAID56_MASK); struct btrfs_dio_private *dip; struct bio *bio; const u32 length = dio_bio->bi_iter.bi_size; - u32 submitted_bytes = 0; - u64 start_sector; - int async_submit = 0; - u64 submit_len; - u64 clone_offset = 0; - u64 clone_len; - u64 logical; - int ret; blk_status_t status; - struct btrfs_io_geometry geom; - struct btrfs_dio_data *dio_data = iter->iomap.private; - struct extent_map *em = NULL; dip = btrfs_create_dio_private(dio_bio, inode, file_offset, length); if (!dip) { @@ -8353,80 +8334,34 @@ static void btrfs_submit_direct(const struct iomap_iter *iter, goto out_err; } - start_sector = dio_bio->bi_iter.bi_sector; - submit_len = dio_bio->bi_iter.bi_size; - - do { - logical = start_sector << 9; - em = btrfs_get_chunk_map(fs_info, logical, submit_len); - if (IS_ERR(em)) { - status = errno_to_blk_status(PTR_ERR(em)); - em = NULL; - goto out_err_em; - } - ret = btrfs_get_io_geometry(fs_info, em, btrfs_op(dio_bio), - logical, &geom); - if (ret) { - status = errno_to_blk_status(ret); - goto out_err_em; - } - - clone_len = min(submit_len, geom.len); - ASSERT(clone_len <= UINT_MAX); - - /* - * This will never fail as it's passing GPF_NOFS and - * the allocation is backed by btrfs_bioset. 
- */ - bio = btrfs_bio_clone_partial(dio_bio, clone_offset, clone_len); - bio->bi_private = dip; - bio->bi_end_io = btrfs_end_dio_bio; - - if (bio_op(bio) == REQ_OP_ZONE_APPEND) { - status = extract_ordered_extent(BTRFS_I(inode), bio, - file_offset); - if (status) { - bio_put(bio); - goto out_err; - } - } - - ASSERT(submit_len >= clone_len); - submit_len -= clone_len; + /* + * This will never fail as it's passing GFP_NOFS and + * the allocation is backed by btrfs_bioset. + */ + bio = btrfs_bio_clone(dio_bio); + bio->bi_private = dip; + bio->bi_end_io = btrfs_end_dio_bio; - if (submit_len > 0) { - /* - * If we are submitting more than one bio, submit them - * all asynchronously. The exception is RAID 5 or 6, as - * asynchronous checksums make it difficult to collect - * full stripe writes. - */ - if (!raid56) - async_submit = 1; - } - status = btrfs_submit_dio_bio(bio, inode, file_offset, - async_submit); + if (bio_op(bio) == REQ_OP_ZONE_APPEND) { + status = extract_ordered_extent(BTRFS_I(inode), bio, + file_offset); if (status) { bio_put(bio); - goto out_err_em; + goto out_err; } - - submitted_bytes += clone_len; - dio_data->submitted += clone_len; - clone_offset += clone_len; - start_sector += clone_len >> 9; - file_offset += clone_len; - - free_extent_map(em); - } while (submit_len > 0); + } + status = btrfs_submit_dio_bio(bio, inode, file_offset); + if (status) { + bio_put(bio); + goto out_err; + } + dio_data->submitted += length; return; -out_err_em: - free_extent_map(em); out_err: dip->dio_bio->bi_status = status; - dio_private_finish(dip, status, length - submitted_bytes); + dio_private_finish(dip, status, length); } const struct iomap_ops btrfs_dio_iomap_ops = { From patchwork Mon Dec 6 02:29:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12657619 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFEDCC433F5 for ; Mon, 6 Dec 2021 02:30:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234432AbhLFCdq (ORCPT ); Sun, 5 Dec 2021 21:33:46 -0500 Received: from smtp-out1.suse.de ([195.135.220.28]:54776 "EHLO smtp-out1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234398AbhLFCdq (ORCPT ); Sun, 5 Dec 2021 21:33:46 -0500 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 6601B2190C for ; Mon, 6 Dec 2021 02:30:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1638757817; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9qoTsttFOqm5Ra1hDC5glYCVygwcikdxENxP35HwHc8=; b=PBxeN03kGvmlju3Syxykh51pVG6Brlomgs4kj1SIxNVshHmi7TysICON4a4ba6TCbD+RoB GlRfUDl4ohqxtw3Cyl+uKj5CRSoU/lnxCgaakTIt8LWbKKW1wZPiYkfqz2hJ/EpR80Y0Ch PjqoYjz+YxI77clwCmxG+w3q8xwK/5c= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id B82AB13451 for ; Mon, 6 Dec 2021 02:30:16 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id +L1gILh1rWFEMgAAMHmgww (envelope-from ) for ; Mon, 06 Dec 2021 02:30:16 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: 
[PATCH v2 17/17] btrfs: unexport btrfs_get_io_geometry() Date: Mon, 6 Dec 2021 10:29:37 +0800 Message-Id: <20211206022937.26465-18-wqu@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206022937.26465-1-wqu@suse.com> References: <20211206022937.26465-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This function is a lighter-weight version of btrfs_map_block(), providing just enough info without filling in everything that btrfs_map_block() does. But it is only used for stripe boundary calculation, and now that stripe boundary calculation is handled entirely inside btrfs_map_bio(), there is no need to export it anymore. Signed-off-by: Qu Wenruo --- fs/btrfs/volumes.c | 8 ++++---- fs/btrfs/volumes.h | 3 --- 2 files changed, 4 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 301fc34320ed..61d281892449 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -6320,9 +6320,9 @@ static bool need_full_stripe(enum btrfs_map_op op) * Returns < 0 in case a chunk for the given logical address cannot be found, * usually shouldn't happen unless @logical is corrupted, 0 otherwise.
*/ -int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em, - enum btrfs_map_op op, u64 logical, - struct btrfs_io_geometry *io_geom) +static int get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em, + enum btrfs_map_op op, u64 logical, + struct btrfs_io_geometry *io_geom) { struct map_lookup *map; u64 len; @@ -6434,7 +6434,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, em = btrfs_get_chunk_map(fs_info, logical, *length); ASSERT(!IS_ERR(em)); - ret = btrfs_get_io_geometry(fs_info, em, op, logical, &geom); + ret = get_io_geometry(fs_info, em, op, logical, &geom); if (ret < 0) return ret; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 04c016a844f8..d5dbe7f946e0 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -561,9 +561,6 @@ int btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, int btrfs_map_sblock(struct btrfs_fs_info *fs_info, enum btrfs_map_op op, u64 logical, u64 *length, struct btrfs_io_context **bioc_ret); -int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *map, - enum btrfs_map_op op, u64 logical, - struct btrfs_io_geometry *io_geom); int btrfs_read_sys_array(struct btrfs_fs_info *fs_info); int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info); struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,