From patchwork Wed Apr 10 11:28:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624114 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6211F15E1E5 for ; Wed, 10 Apr 2024 11:28:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748529; cv=none; b=JEWGqy9qPRsmt8EWVcMWtMINeIzmZTWGmKFcW0ozIU2obYBMiMFeVZvq7ew/lhgKiEfBK0NSFGGgFnoBXpzJBDMX24IVHObJcNw3Zy9Kp3pGPPyDPaKsgOE1MLgc1Ulk9DA7g9ifn58l61PcgKxWKysxa/lzVWgM9Cq5Psvvq+c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748529; c=relaxed/simple; bh=m8Fty1BCbSqigGihUAVl5sa3UZzEfTB8HUbiUvwNZ74=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=c6ICbK9exAqJvuZ5JoQZD3Jvxo58mRdRrWHkOSPZ/J0uPrUY9M5vIrumgIwJlWA8F0t7QKfuhHitq2FxZ+2BPVoD8NJsgJU7kLcyor+MMpGiS2BW34M3xUutTirLsUnzkpTsE/2SHrPgT6X2B5HuRFIQcNK8fIyrE6z3qtm5zWA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=oT8eLDad; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="oT8eLDad" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4AE93C433F1 for ; Wed, 10 Apr 2024 11:28:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748528; bh=m8Fty1BCbSqigGihUAVl5sa3UZzEfTB8HUbiUvwNZ74=; h=From:To:Subject:Date:In-Reply-To:References:From; b=oT8eLDad7E7Tjq6XanLHcUfrl9yIqkvUkbCZ85wPKCXFYLsPdISY18khi/vCUHAVp 
IuL0l+0x3jnXEJxgnxVLjpyrtofwP/A3dR/FUpxVG12aKOeizYtAV7im7lvpXFJDn+ MoZ9N9hkpPzUbj79+sS5BLsaiyFbCH3Ya7nxXM/kQUMFwLRVaJtoOjHzq57eD5OHsN VfhefZ+uOuDQYgug0gAfocfgxT4rN/QnmUjZnfDFqkGLTJccLw1//Cejp7iXuysWh4 +nT98JClYPp0nGgTYQg1VN34Nz+ygBnE3lyegJuK24AYss9MNsmHWCeW+vMtegBZ+V blXfV6DuxOjUg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 01/11] btrfs: pass an inode to btrfs_add_extent_mapping() Date: Wed, 10 Apr 2024 12:28:33 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Instead of passing fs_info and extent map tree arguments to btrfs_add_extent_mapping(), we can pass an inode instead, as extent maps are always inserted in the extent map tree of an inode, and the fs_info can be extracted from the inode (inode->root->fs_info). The only exception is in the self tests where we allocate an extent map tree and then use it to insert/update/remove extent maps. However the tests can be changed to use a test inode and then use the inode's extent map tree. So change btrfs_add_extent_mapping() to have an inode as an argument instead of a fs_info and an extent map tree. This reduces the number of parameters and will also be needed for an upcoming change. Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 14 +-- fs/btrfs/extent_map.h | 3 +- fs/btrfs/inode.c | 2 +- fs/btrfs/tests/extent-map-tests.c | 174 +++++++++++++++--------------- 4 files changed, 95 insertions(+), 98 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 471654cb65b0..840be23d2c0a 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -546,10 +546,9 @@ static noinline int merge_extent_mapping(struct extent_map_tree *em_tree, } /* - * Add extent mapping into em_tree. + * Add extent mapping into an inode's extent map tree. 
* - * @fs_info: the filesystem - * @em_tree: extent tree into which we want to insert the extent mapping + * @inode: target inode * @em_in: extent we are inserting * @start: start of the logical range btrfs_get_extent() is requesting * @len: length of the logical range btrfs_get_extent() is requesting @@ -557,8 +556,8 @@ static noinline int merge_extent_mapping(struct extent_map_tree *em_tree, * Note that @em_in's range may be different from [start, start+len), * but they must be overlapped. * - * Insert @em_in into @em_tree. In case there is an overlapping range, handle - * the -EEXIST by either: + * Insert @em_in into the inode's extent map tree. In case there is an + * overlapping range, handle the -EEXIST by either: * a) Returning the existing extent in @em_in if @start is within the * existing em. * b) Merge the existing extent with @em_in passed in. @@ -566,12 +565,13 @@ static noinline int merge_extent_mapping(struct extent_map_tree *em_tree, * Return 0 on success, otherwise -EEXIST. 
* */ -int btrfs_add_extent_mapping(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree, +int btrfs_add_extent_mapping(struct btrfs_inode *inode, struct extent_map **em_in, u64 start, u64 len) { int ret; struct extent_map *em = *em_in; + struct extent_map_tree *em_tree = &inode->extent_tree; + struct btrfs_fs_info *fs_info = inode->root->fs_info; /* * Tree-checker should have rejected any inline extent with non-zero diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index 10e9491865c9..f287ab46e368 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -132,8 +132,7 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen); void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em); struct extent_map *search_extent_mapping(struct extent_map_tree *tree, u64 start, u64 len); -int btrfs_add_extent_mapping(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree, +int btrfs_add_extent_mapping(struct btrfs_inode *inode, struct extent_map **em_in, u64 start, u64 len); void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 94ac20e62e13..27888810e6ac 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6992,7 +6992,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode, } write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, start, len); + ret = btrfs_add_extent_mapping(inode, &em, start, len); write_unlock(&em_tree->lock); out: btrfs_free_path(path); diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c index 253cce7ffecf..96089c4c38a5 100644 --- a/fs/btrfs/tests/extent-map-tests.c +++ b/fs/btrfs/tests/extent-map-tests.c @@ -53,9 +53,9 @@ static void free_extent_map_tree(struct extent_map_tree *em_tree) * ->add_extent_mapping(0, 16K) * -> #handle -EEXIST */ -static int test_case_1(struct btrfs_fs_info *fs_info, - struct extent_map_tree 
*em_tree) +static int test_case_1(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; u64 start = 0; u64 len = SZ_8K; @@ -73,7 +73,7 @@ static int test_case_1(struct btrfs_fs_info *fs_info, em->block_start = 0; em->block_len = SZ_16K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [0, 16K)"); @@ -94,7 +94,7 @@ static int test_case_1(struct btrfs_fs_info *fs_info, em->block_start = SZ_32K; /* avoid merging */ em->block_len = SZ_4K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [16K, 20K)"); @@ -115,7 +115,7 @@ static int test_case_1(struct btrfs_fs_info *fs_info, em->block_start = start; em->block_len = len; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret) { test_err("case1 [%llu %llu]: ret %d", start, start + len, ret); @@ -148,9 +148,9 @@ static int test_case_1(struct btrfs_fs_info *fs_info, * Reading the inline ending up with EEXIST, ie. read an inline * extent and discard page cache and read it again. 
*/ -static int test_case_2(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree) +static int test_case_2(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; int ret; @@ -166,7 +166,7 @@ static int test_case_2(struct btrfs_fs_info *fs_info, em->block_start = EXTENT_MAP_INLINE; em->block_len = (u64)-1; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [0, 1K)"); @@ -187,7 +187,7 @@ static int test_case_2(struct btrfs_fs_info *fs_info, em->block_start = SZ_4K; em->block_len = SZ_4K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [4K, 8K)"); @@ -208,7 +208,7 @@ static int test_case_2(struct btrfs_fs_info *fs_info, em->block_start = EXTENT_MAP_INLINE; em->block_len = (u64)-1; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret) { test_err("case2 [0 1K]: ret %d", ret); @@ -235,8 +235,9 @@ static int test_case_2(struct btrfs_fs_info *fs_info, } static int __test_case_3(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree, u64 start) + struct btrfs_inode *inode, u64 start) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; u64 len = SZ_4K; int ret; @@ -253,7 +254,7 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, em->block_start = SZ_4K; em->block_len = SZ_4K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = 
btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [4K, 8K)"); @@ -274,7 +275,7 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, em->block_start = 0; em->block_len = SZ_16K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, start, len); + ret = btrfs_add_extent_mapping(inode, &em, start, len); write_unlock(&em_tree->lock); if (ret) { test_err("case3 [%llu %llu): ret %d", @@ -322,25 +323,25 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, * -> add_extent_mapping() * -> add_extent_mapping() */ -static int test_case_3(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree) +static int test_case_3(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { int ret; - ret = __test_case_3(fs_info, em_tree, 0); + ret = __test_case_3(fs_info, inode, 0); if (ret) return ret; - ret = __test_case_3(fs_info, em_tree, SZ_8K); + ret = __test_case_3(fs_info, inode, SZ_8K); if (ret) return ret; - ret = __test_case_3(fs_info, em_tree, (12 * SZ_1K)); + ret = __test_case_3(fs_info, inode, (12 * SZ_1K)); return ret; } static int __test_case_4(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree, u64 start) + struct btrfs_inode *inode, u64 start) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; u64 len = SZ_4K; int ret; @@ -357,7 +358,7 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, em->block_start = 0; em->block_len = SZ_8K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [0, 8K)"); @@ -378,7 +379,7 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, em->block_start = SZ_16K; /* avoid merging */ em->block_len = 24 * SZ_1K; write_lock(&em_tree->lock); - ret = 
btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("cannot add extent range [8K, 32K)"); @@ -398,7 +399,7 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, em->block_start = 0; em->block_len = SZ_32K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, start, len); + ret = btrfs_add_extent_mapping(inode, &em, start, len); write_unlock(&em_tree->lock); if (ret) { test_err("case4 [%llu %llu): ret %d", @@ -450,23 +451,22 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, * # handle -EEXIST when adding * # [0, 32K) */ -static int test_case_4(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree) +static int test_case_4(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { int ret; - ret = __test_case_4(fs_info, em_tree, 0); + ret = __test_case_4(fs_info, inode, 0); if (ret) return ret; - ret = __test_case_4(fs_info, em_tree, SZ_4K); + ret = __test_case_4(fs_info, inode, SZ_4K); return ret; } -static int add_compressed_extent(struct btrfs_fs_info *fs_info, - struct extent_map_tree *em_tree, +static int add_compressed_extent(struct btrfs_inode *inode, u64 start, u64 len, u64 block_start) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; int ret; @@ -482,7 +482,7 @@ static int add_compressed_extent(struct btrfs_fs_info *fs_info, em->block_len = SZ_4K; em->flags |= EXTENT_FLAG_COMPRESS_ZLIB; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); free_extent_map(em); if (ret < 0) { @@ -588,53 +588,43 @@ static int validate_range(struct extent_map_tree *em_tree, int index) * They'll have the EXTENT_FLAG_COMPRESSED flag set to keep the em tree from * merging the em's. 
*/ -static int test_case_5(struct btrfs_fs_info *fs_info) +static int test_case_5(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { - struct extent_map_tree *em_tree; - struct inode *inode; u64 start, end; int ret; test_msg("Running btrfs_drop_extent_map_range tests"); - inode = btrfs_new_test_inode(); - if (!inode) { - test_std_err(TEST_ALLOC_INODE); - return -ENOMEM; - } - - em_tree = &BTRFS_I(inode)->extent_tree; - /* [0, 12k) */ - ret = add_compressed_extent(fs_info, em_tree, 0, SZ_4K * 3, 0); + ret = add_compressed_extent(inode, 0, SZ_4K * 3, 0); if (ret) { test_err("cannot add extent range [0, 12K)"); goto out; } /* [12k, 24k) */ - ret = add_compressed_extent(fs_info, em_tree, SZ_4K * 3, SZ_4K * 3, SZ_4K); + ret = add_compressed_extent(inode, SZ_4K * 3, SZ_4K * 3, SZ_4K); if (ret) { test_err("cannot add extent range [12k, 24k)"); goto out; } /* [24k, 36k) */ - ret = add_compressed_extent(fs_info, em_tree, SZ_4K * 6, SZ_4K * 3, SZ_8K); + ret = add_compressed_extent(inode, SZ_4K * 6, SZ_4K * 3, SZ_8K); if (ret) { test_err("cannot add extent range [12k, 24k)"); goto out; } /* [36k, 40k) */ - ret = add_compressed_extent(fs_info, em_tree, SZ_32K + SZ_4K, SZ_4K, SZ_4K * 3); + ret = add_compressed_extent(inode, SZ_32K + SZ_4K, SZ_4K, SZ_4K * 3); if (ret) { test_err("cannot add extent range [12k, 24k)"); goto out; } /* [40k, 64k) */ - ret = add_compressed_extent(fs_info, em_tree, SZ_4K * 10, SZ_4K * 6, SZ_16K); + ret = add_compressed_extent(inode, SZ_4K * 10, SZ_4K * 6, SZ_16K); if (ret) { test_err("cannot add extent range [12k, 24k)"); goto out; @@ -643,36 +633,36 @@ static int test_case_5(struct btrfs_fs_info *fs_info) /* Drop [8k, 12k) */ start = SZ_8K; end = (3 * SZ_4K) - 1; - btrfs_drop_extent_map_range(BTRFS_I(inode), start, end, false); - ret = validate_range(&BTRFS_I(inode)->extent_tree, 0); + btrfs_drop_extent_map_range(inode, start, end, false); + ret = validate_range(&inode->extent_tree, 0); if (ret) goto out; /* Drop [12k, 20k) */ start = SZ_4K 
* 3; end = SZ_16K + SZ_4K - 1; - btrfs_drop_extent_map_range(BTRFS_I(inode), start, end, false); - ret = validate_range(&BTRFS_I(inode)->extent_tree, 1); + btrfs_drop_extent_map_range(inode, start, end, false); + ret = validate_range(&inode->extent_tree, 1); if (ret) goto out; /* Drop [28k, 32k) */ start = SZ_32K - SZ_4K; end = SZ_32K - 1; - btrfs_drop_extent_map_range(BTRFS_I(inode), start, end, false); - ret = validate_range(&BTRFS_I(inode)->extent_tree, 2); + btrfs_drop_extent_map_range(inode, start, end, false); + ret = validate_range(&inode->extent_tree, 2); if (ret) goto out; /* Drop [32k, 64k) */ start = SZ_32K; end = SZ_64K - 1; - btrfs_drop_extent_map_range(BTRFS_I(inode), start, end, false); - ret = validate_range(&BTRFS_I(inode)->extent_tree, 3); + btrfs_drop_extent_map_range(inode, start, end, false); + ret = validate_range(&inode->extent_tree, 3); if (ret) goto out; out: - iput(inode); + free_extent_map_tree(&inode->extent_tree); return ret; } @@ -681,23 +671,25 @@ static int test_case_5(struct btrfs_fs_info *fs_info) * for areas between two existing ems. Validate it doesn't do this when there * are two unmerged em's side by side. 
*/ -static int test_case_6(struct btrfs_fs_info *fs_info, struct extent_map_tree *em_tree) +static int test_case_6(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em = NULL; int ret; - ret = add_compressed_extent(fs_info, em_tree, 0, SZ_4K, 0); + ret = add_compressed_extent(inode, 0, SZ_4K, 0); if (ret) goto out; - ret = add_compressed_extent(fs_info, em_tree, SZ_4K, SZ_4K, 0); + ret = add_compressed_extent(inode, SZ_4K, SZ_4K, 0); if (ret) goto out; em = alloc_extent_map(); if (!em) { test_std_err(TEST_ALLOC_EXTENT_MAP); - return -ENOMEM; + ret = -ENOMEM; + goto out; } em->start = SZ_4K; @@ -705,7 +697,7 @@ static int test_case_6(struct btrfs_fs_info *fs_info, struct extent_map_tree *em em->block_start = SZ_16K; em->block_len = SZ_16K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, 0, SZ_8K); + ret = btrfs_add_extent_mapping(inode, &em, 0, SZ_8K); write_unlock(&em_tree->lock); if (ret != 0) { @@ -734,28 +726,19 @@ static int test_case_6(struct btrfs_fs_info *fs_info, struct extent_map_tree *em * true would mess up the start/end calculations and subsequent splits would be * incorrect. 
*/ -static int test_case_7(struct btrfs_fs_info *fs_info) +static int test_case_7(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { - struct extent_map_tree *em_tree; + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; - struct inode *inode; int ret; + int ret2; test_msg("Running btrfs_drop_extent_cache with pinned"); - inode = btrfs_new_test_inode(); - if (!inode) { - test_std_err(TEST_ALLOC_INODE); - return -ENOMEM; - } - - em_tree = &BTRFS_I(inode)->extent_tree; - em = alloc_extent_map(); if (!em) { test_std_err(TEST_ALLOC_EXTENT_MAP); - ret = -ENOMEM; - goto out; + return -ENOMEM; } /* [0, 16K), pinned */ @@ -765,7 +748,7 @@ static int test_case_7(struct btrfs_fs_info *fs_info) em->block_len = SZ_4K; em->flags |= EXTENT_FLAG_PINNED; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("couldn't add extent map"); @@ -786,7 +769,7 @@ static int test_case_7(struct btrfs_fs_info *fs_info) em->block_start = SZ_32K; em->block_len = SZ_16K; write_lock(&em_tree->lock); - ret = btrfs_add_extent_mapping(fs_info, em_tree, &em, em->start, em->len); + ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len); write_unlock(&em_tree->lock); if (ret < 0) { test_err("couldn't add extent map"); @@ -798,7 +781,7 @@ static int test_case_7(struct btrfs_fs_info *fs_info) * Drop [0, 36K) This should skip the [0, 4K) extent and then split the * [32K, 48K) extent. */ - btrfs_drop_extent_map_range(BTRFS_I(inode), 0, (36 * SZ_1K) - 1, true); + btrfs_drop_extent_map_range(inode, 0, (36 * SZ_1K) - 1, true); /* Make sure our extent maps look sane. */ ret = -EINVAL; @@ -860,7 +843,11 @@ static int test_case_7(struct btrfs_fs_info *fs_info) ret = 0; out: free_extent_map(em); - iput(inode); + /* Unpin our extent to prevent warning when removing it below. 
*/ + ret2 = unpin_extent_cache(inode, 0, SZ_16K, 0); + if (ret == 0) + ret = ret2; + free_extent_map_tree(em_tree); return ret; } @@ -954,7 +941,8 @@ static int test_rmap_block(struct btrfs_fs_info *fs_info, int btrfs_test_extent_map(void) { struct btrfs_fs_info *fs_info = NULL; - struct extent_map_tree *em_tree; + struct inode *inode; + struct btrfs_root *root = NULL; int ret = 0, i; struct rmap_test_vector rmap_tests[] = { { @@ -1003,33 +991,42 @@ int btrfs_test_extent_map(void) return -ENOMEM; } - em_tree = kzalloc(sizeof(*em_tree), GFP_KERNEL); - if (!em_tree) { + inode = btrfs_new_test_inode(); + if (!inode) { + test_std_err(TEST_ALLOC_INODE); ret = -ENOMEM; goto out; } - extent_map_tree_init(em_tree); + root = btrfs_alloc_dummy_root(fs_info); + if (IS_ERR(root)) { + test_std_err(TEST_ALLOC_ROOT); + ret = PTR_ERR(root); + root = NULL; + goto out; + } - ret = test_case_1(fs_info, em_tree); + BTRFS_I(inode)->root = root; + + ret = test_case_1(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_2(fs_info, em_tree); + ret = test_case_2(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_3(fs_info, em_tree); + ret = test_case_3(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_4(fs_info, em_tree); + ret = test_case_4(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_5(fs_info); + ret = test_case_5(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_6(fs_info, em_tree); + ret = test_case_6(fs_info, BTRFS_I(inode)); if (ret) goto out; - ret = test_case_7(fs_info); + ret = test_case_7(fs_info, BTRFS_I(inode)); if (ret) goto out; @@ -1041,7 +1038,8 @@ int btrfs_test_extent_map(void) } out: - kfree(em_tree); + iput(inode); + btrfs_free_dummy_root(root); btrfs_free_dummy_fs_info(fs_info); return ret; From patchwork Wed Apr 10 11:28:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624115 Received: from 
smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E3DA115E203 for ; Wed, 10 Apr 2024 11:28:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748530; cv=none; b=PVOIAl0VJKmPAC3qf/LGiCTxDP/CBArucJFGOMQI8KbRNRQEqEAcCCbejFLJumVJGJREqygr65pHb6fs4XAx4/2qpk5N0TNCf/3c72j/obnWKADh+CFmN6zvj2GPfLNFs4Tt3ayCL2QRCRisLB+MF4o6OSjUvDVxweBwypdokNU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748530; c=relaxed/simple; bh=nRQWBNjqlntEN4Ge1wXMhKAWmUduD4H/gnZQSDpABSg=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=d46579PDNj9oespv/IAGnvwRS/9kBjQkp2WmirBYKJPVADk5Fyhod0mVSfe89S6QC+kj41pbuJiLELwmZojAEEy1IA6GS6WkUH/1/CC2COI6bZcPstxf3qYLK/6vXybvZ46ilZY8EbaFM2BBvHVeiwz7ukdf+RXltk7Y/sjdBp4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=FXJ5zUz4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="FXJ5zUz4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40F4CC43390 for ; Wed, 10 Apr 2024 11:28:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748529; bh=nRQWBNjqlntEN4Ge1wXMhKAWmUduD4H/gnZQSDpABSg=; h=From:To:Subject:Date:In-Reply-To:References:From; b=FXJ5zUz4HFECX4St05jZQlqgFprxjvFQZ+GWT2/SH8bBSNXbUOsmHvVeubXsNlP3O Yw6USs/8d3ewR8wo8iY27DOU7dyhlS1Sj9KPF9akCaZqPPYdfKBk6ZOze57P3Hda64 /1prXrT19/Ykr5pje+79EmO52W2bJmXaamdOQRZOZz6qWLQNQyN4RryKXDx0xbM9tY hMOTD5qSHOtVrdcgrbkQXyRDmFsnSzzTu4gfkmOuBCBetrukZXZDGNpufG8T+4CN63 
JWCE2bRLvqUyv5lIzGbNRcvpsMntiXSZ8FygXa9g8px6Cmn4owvS27GxK+XNcqRvQt l7hbEy4N3HIjA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 02/11] btrfs: tests: error out on unexpected extent map reference count Date: Wed, 10 Apr 2024 12:28:34 +0100 Message-Id: <4aec89aabd2b7f920a913f59cea155257c887ef9.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana In the extent map self tests, when freeing all extent maps from a test extent map tree we are not expecting to find any extent map with a reference count different from 1 (the tree reference). If we find any, we just log a message but we don't fail the test, which makes it very easy to miss any bug/regression - no one reads the test messages unless a test fails. So change the behaviour to make a test fail if we find an extent map in the tree with a reference count different from 1. Make the failure happen only after removing all extent maps, so that we don't leak memory. 
Signed-off-by: Filipe Manana --- fs/btrfs/tests/extent-map-tests.c | 43 +++++++++++++++++++++++++------ 1 file changed, 35 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c index 96089c4c38a5..9e9cb591c0f1 100644 --- a/fs/btrfs/tests/extent-map-tests.c +++ b/fs/btrfs/tests/extent-map-tests.c @@ -11,10 +11,11 @@ #include "../disk-io.h" #include "../block-group.h" -static void free_extent_map_tree(struct extent_map_tree *em_tree) +static int free_extent_map_tree(struct extent_map_tree *em_tree) { struct extent_map *em; struct rb_node *node; + int ret = 0; write_lock(&em_tree->lock); while (!RB_EMPTY_ROOT(&em_tree->map.rb_root)) { @@ -24,6 +25,7 @@ static void free_extent_map_tree(struct extent_map_tree *em_tree) #ifdef CONFIG_BTRFS_DEBUG if (refcount_read(&em->refs) != 1) { + ret = -EINVAL; test_err( "em leak: em (start %llu len %llu block_start %llu block_len %llu) refs %d", em->start, em->len, em->block_start, @@ -35,6 +37,8 @@ static void free_extent_map_tree(struct extent_map_tree *em_tree) free_extent_map(em); } write_unlock(&em_tree->lock); + + return ret; } /* @@ -60,6 +64,7 @@ static int test_case_1(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) u64 start = 0; u64 len = SZ_8K; int ret; + int ret2; em = alloc_extent_map(); if (!em) { @@ -137,7 +142,9 @@ static int test_case_1(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) } free_extent_map(em); out: - free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; return ret; } @@ -153,6 +160,7 @@ static int test_case_2(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; int ret; + int ret2; em = alloc_extent_map(); if (!em) { @@ -229,7 +237,9 @@ static int test_case_2(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) } free_extent_map(em); out: - free_extent_map_tree(em_tree); + ret2 = 
free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; return ret; } @@ -241,6 +251,7 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, struct extent_map *em; u64 len = SZ_4K; int ret; + int ret2; em = alloc_extent_map(); if (!em) { @@ -302,7 +313,9 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, } free_extent_map(em); out: - free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; return ret; } @@ -345,6 +358,7 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, struct extent_map *em; u64 len = SZ_4K; int ret; + int ret2; em = alloc_extent_map(); if (!em) { @@ -421,7 +435,9 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, } free_extent_map(em); out: - free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; return ret; } @@ -592,6 +608,7 @@ static int test_case_5(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) { u64 start, end; int ret; + int ret2; test_msg("Running btrfs_drop_extent_map_range tests"); @@ -662,7 +679,10 @@ static int test_case_5(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) if (ret) goto out; out: - free_extent_map_tree(&inode->extent_tree); + ret2 = free_extent_map_tree(&inode->extent_tree); + if (ret == 0) + ret = ret2; + return ret; } @@ -676,6 +696,7 @@ static int test_case_6(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em = NULL; int ret; + int ret2; ret = add_compressed_extent(inode, 0, SZ_4K, 0); if (ret) @@ -717,7 +738,10 @@ static int test_case_6(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) ret = 0; out: free_extent_map(em); - free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; + return ret; } @@ -847,7 +871,10 @@ static int test_case_7(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) ret2 = unpin_extent_cache(inode, 0, SZ_16K, 0); 
if (ret == 0) ret = ret2; - free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(em_tree); + if (ret == 0) + ret = ret2; + return ret; } From patchwork Wed Apr 10 11:28:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624116 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 284A315E203 for ; Wed, 10 Apr 2024 11:28:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748531; cv=none; b=Uq9Y77nDXwQRWAILU3/nxR9zpOmvSWcpf0caj4jG6agQbMYWZMA5D6r00dOPydOwZ9AGrWJrM9uzHWDfq2eIwi4RM9Q4AzG/t7tOwbP41eOFj/zVnwTUe4INapiKCRdSataTE01jHHPCMn5aqsHwehiw0h/2JU5a8Zr3VsZhIAE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748531; c=relaxed/simple; bh=i36mBH7l60yk68YJs2uXhf0sDC820kJcHiCqGgLb03E=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=K/HJhG0xOK7jAXlogFq5jzWmZJT3T4Kfn/WZ0v5uJO41A2lx3eIPLqmtT0kAYdpY26jIQdSM9emFo+vKyyJQqZ8HazebMi7g9OanuPb7ri0efoBPNe/mOs/TsvjA94Vp/jmdsFV2K/QnyfYEbSPCHPOUr3yKSAP6NPza0vnRtTc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RW+NtJqJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RW+NtJqJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 346E8C433C7 for ; Wed, 10 Apr 2024 11:28:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748530; bh=i36mBH7l60yk68YJs2uXhf0sDC820kJcHiCqGgLb03E=; 
h=From:To:Subject:Date:In-Reply-To:References:From; b=RW+NtJqJyzCGtRV2UrShfy96jIMPYRaibFbn1AEpABwUveLr6ZHO4zQ0uMB+IL8DP S55GJITlD+FMZ23RfthYmJfLjsUGLOa6xvRrbZS1YfQn28hOPriryll9xaM3OZ/8Nm wQlVP12gvvfGn5QO1BIgwPMkRut+iWj0awzP+XamjdShxDPc6wkecpF4ciKDI2ByYL BkSVZmEyRXY+vYd5U7vqW18z/Uq7Af+Riv+89irIjztCAVNpeKGESVQ29/Vr0YDvlb Isr94LpTpz6EX2w0YR66SZNlYS9TdLg9wRlSP5P0rto6dBqqgH9qtuWL0SErhck6qf vgcvAJS57Qnwg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 03/11] btrfs: simplify add_extent_mapping() by removing pointless label Date: Wed, 10 Apr 2024 12:28:35 +0100 Message-Id: <2c0c07d9cf4b7743eaf41c6072e63cd3439696eb.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana The add_extent_mapping() function is short and trivial, so there's no need to have a label for a quick exit in case of an error, especially since no error handling is needed: we just need to return the error. So remove that label and return directly. While at it, also remove the redundant initialization of 'ret', as that may help avoid warnings from clang tools, such as the one reported/fixed by commit 966de47ff0c9 ("btrfs: remove redundant initialization of variables in log_new_ancestors"). 
Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 840be23d2c0a..d125d5ab9b1d 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -370,17 +370,17 @@ static inline void setup_extent_mapping(struct extent_map_tree *tree, static int add_extent_mapping(struct extent_map_tree *tree, struct extent_map *em, int modified) { - int ret = 0; + int ret; lockdep_assert_held_write(&tree->lock); ret = tree_insert(&tree->map, em); if (ret) - goto out; + return ret; setup_extent_mapping(tree, em, modified); -out: - return ret; + + return 0; } static struct extent_map * From patchwork Wed Apr 10 11:28:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624117 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DB26715E7E3 for ; Wed, 10 Apr 2024 11:28:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748531; cv=none; b=PTWyKQZcPvFqE/ySofTw4FBaGimHinq6J5i27osZ1Kt7v9U3ayoJceqsIk4GBroLZLyCN5zBkEJkGEma5b5nkjUsoQrm414diQZDUxn3L/VBwcwCVV1VgWK0t7lWJAghGaFM41vIkqRZvSTd5A+BReAohEci2h5DTO+kvjGAUQ8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748531; c=relaxed/simple; bh=tVKbSnVlp+ohQ93ZUlxwb7rViyg8CHOrTu1JqhUmt2g=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Myn6oYjzJ0os2/mjh5/uriGaIAG5ke7CAq/QBb8yBbn8fWTrPDCOkeM9g9oWd3zYdTJdnYLc+z620KAwpM4zEPcsom9lAe3OgklJ0Y63KPybmCZUJeJTk9lma7SQsh0pjZLSwN4HtjU5Th1GqGZ1caWifTLgFvelxdvSTotTm2E= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=U5SVKY+w; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="U5SVKY+w" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 280B1C433F1 for ; Wed, 10 Apr 2024 11:28:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748531; bh=tVKbSnVlp+ohQ93ZUlxwb7rViyg8CHOrTu1JqhUmt2g=; h=From:To:Subject:Date:In-Reply-To:References:From; b=U5SVKY+wTI4NX5Ku3SYOdCFPMLwT9/p7RPu1lOngJpF9l9Gr21kHlobbZbezSUd88 aL7UJY8EoX4++GRoo81Mj6WNg1TJNoLeVJnmYXclxnDb8Yc1eaCO9O69Is56O6UXcr wuv4ZQ2tRWQRSU6mxZQT3Jm5xLK2JeEakLViWvd/X3F13Cjj0Ml2EFI6eY6uoFYGHF hEv+RigNQEdo1b3pznSO2GZItzUTylk7tOQGNd1JoVVMpMA0EYWF3NsWJ0VOy9NRzW 5nHS+ekDzRwiRnxiIM4NjDMA2itDJWpXETbLREixC7MySU4+D0hLUeKTR1AsPPR/4D yRUmIZUXjjRBw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 04/11] btrfs: pass the extent map tree's inode to add_extent_mapping() Date: Wed, 10 Apr 2024 12:28:36 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Extent maps are always added to an inode's extent map tree, so there's no need to pass the extent map tree explicitly to add_extent_mapping(). In order to facilitate an upcoming change that adds a shrinker for extent maps, change add_extent_mapping() to receive the inode instead of its extent map tree. 
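The shape of this refactoring can be sketched in plain user-space C. All toy_* names below are hypothetical and only model the idea: the callee receives the owning inode and derives the tree from it, so owner-level context stays reachable without adding parameters.

```c
#include <stddef.h>

/* Hypothetical, simplified models of the kernel structures. */
struct toy_fs_info {
	const char *label;
};

struct toy_extent_map_tree {
	size_t nr_maps;
};

struct toy_inode {
	struct toy_fs_info *fs_info;
	struct toy_extent_map_tree extent_tree;	/* embedded, one per inode */
};

/*
 * Before the change, the equivalent helper would take only
 * 'struct toy_extent_map_tree *tree' and have no route back to its owner.
 * Taking the inode keeps the tree reachable through a local variable.
 */
int toy_add_mapping(struct toy_inode *inode)
{
	struct toy_extent_map_tree *tree = &inode->extent_tree;

	tree->nr_maps++;
	return 0;
}

/* Owner-level state is now available to any helper that gets the inode. */
const struct toy_fs_info *toy_mapping_owner(const struct toy_inode *inode)
{
	return inode->fs_info;
}
```

Callers that already hold the inode lose nothing, since the tree is one member access away.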
Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 34 ++++++++++++++++------------------ 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index d125d5ab9b1d..d0e0c4e5415e 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -355,21 +355,22 @@ static inline void setup_extent_mapping(struct extent_map_tree *tree, } /* - * Add new extent map to the extent tree + * Add a new extent map to an inode's extent map tree. * - * @tree: tree to insert new map in + * @inode: the target inode * @em: map to insert * @modified: indicate whether the given @em should be added to the * modified list, which indicates the extent needs to be logged * - * Insert @em into @tree or perform a simple forward/backward merge with - * existing mappings. The extent_map struct passed in will be inserted - * into the tree directly, with an additional reference taken, or a - * reference dropped if the merge attempt was successful. + * Insert @em into the @inode's extent map tree or perform a simple + * forward/backward merge with existing mappings. The extent_map struct passed + * in will be inserted into the tree directly, with an additional reference + * taken, or a reference dropped if the merge attempt was successful. */ -static int add_extent_mapping(struct extent_map_tree *tree, +static int add_extent_mapping(struct btrfs_inode *inode, struct extent_map *em, int modified) { + struct extent_map_tree *tree = &inode->extent_tree; int ret; lockdep_assert_held_write(&tree->lock); @@ -508,7 +509,7 @@ static struct extent_map *prev_extent_map(struct extent_map *em) * and an extent that you want to insert, deal with overlap and insert * the best fitted new extent into the tree. 
*/ -static noinline int merge_extent_mapping(struct extent_map_tree *em_tree, +static noinline int merge_extent_mapping(struct btrfs_inode *inode, struct extent_map *existing, struct extent_map *em, u64 map_start) @@ -542,7 +543,7 @@ static noinline int merge_extent_mapping(struct extent_map_tree *em_tree, em->block_start += start_diff; em->block_len = em->len; } - return add_extent_mapping(em_tree, em, 0); + return add_extent_mapping(inode, em, 0); } /* @@ -570,7 +571,6 @@ int btrfs_add_extent_mapping(struct btrfs_inode *inode, { int ret; struct extent_map *em = *em_in; - struct extent_map_tree *em_tree = &inode->extent_tree; struct btrfs_fs_info *fs_info = inode->root->fs_info; /* @@ -580,7 +580,7 @@ int btrfs_add_extent_mapping(struct btrfs_inode *inode, if (em->block_start == EXTENT_MAP_INLINE) ASSERT(em->start == 0); - ret = add_extent_mapping(em_tree, em, 0); + ret = add_extent_mapping(inode, em, 0); /* it is possible that someone inserted the extent into the tree * while we had the lock dropped. 
It is also possible that * an overlapping map exists in the tree @@ -588,7 +588,7 @@ int btrfs_add_extent_mapping(struct btrfs_inode *inode, if (ret == -EEXIST) { struct extent_map *existing; - existing = search_extent_mapping(em_tree, start, len); + existing = search_extent_mapping(&inode->extent_tree, start, len); trace_btrfs_handle_em_exist(fs_info, existing, em, start, len); @@ -609,8 +609,7 @@ int btrfs_add_extent_mapping(struct btrfs_inode *inode, * The existing extent map is the one nearest to * the [start, start + len) range which overlaps */ - ret = merge_extent_mapping(em_tree, existing, - em, start); + ret = merge_extent_mapping(inode, existing, em, start); if (WARN_ON(ret)) { free_extent_map(em); *em_in = NULL; @@ -818,8 +817,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, } else { int ret; - ret = add_extent_mapping(em_tree, split, - modified); + ret = add_extent_mapping(inode, split, modified); /* Logic error, shouldn't happen. */ ASSERT(ret == 0); if (WARN_ON(ret != 0) && modified) @@ -909,7 +907,7 @@ int btrfs_replace_extent_map_range(struct btrfs_inode *inode, do { btrfs_drop_extent_map_range(inode, new_em->start, end, false); write_lock(&tree->lock); - ret = add_extent_mapping(tree, new_em, modified); + ret = add_extent_mapping(inode, new_em, modified); write_unlock(&tree->lock); } while (ret == -EEXIST); @@ -990,7 +988,7 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_mid->ram_bytes = split_mid->len; split_mid->flags = flags; split_mid->generation = em->generation; - add_extent_mapping(em_tree, split_mid, 1); + add_extent_mapping(inode, split_mid, 1); /* Once for us */ free_extent_map(em); From patchwork Wed Apr 10 11:28:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624118 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 199D815E80D for ; Wed, 10 Apr 2024 11:28:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748533; cv=none; b=VnJ4cY/+UmuL0/pxvVOqrt8BmaeEW/jNseJheazJlBr+pvRl0leNR97A7RMiN2iX+tcwSkkyOTDqLWZySMOj/iFTfb3cSqUN4D4woBdY5GxODlP1KG/f6vWuKNAXlxKldq+cUMGxBVpOgShKeJvWY1mMex0MzKsJ/0KD+eM102c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748533; c=relaxed/simple; bh=g2W3101eV13B/UgB2JaIdmMCpGCnuK0rfSn8Sw6uT2k=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=WHWd54oAJGfWipR/xn7nxLT0OixtC2VclQ2zi1ehH0GvzotnQrj5Id1kePACo6qoLUBS1NqdLfhskljZ71pCzxnmsXuC07X7kM6dfAxogIJ8JQiIRYXHUNSX2Y65InfGe+hCfIDl2XgSHZZKL9aH9vFHSXikWtDvUPV4knc7QD8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=u5E5YX3O; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="u5E5YX3O" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1C626C433C7 for ; Wed, 10 Apr 2024 11:28:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748532; bh=g2W3101eV13B/UgB2JaIdmMCpGCnuK0rfSn8Sw6uT2k=; h=From:To:Subject:Date:In-Reply-To:References:From; b=u5E5YX3Oji5ZUCGQObfQwK75gvtTdkpJ7rGzRz1q0bU9TybiM0uT9PaFyd3hHNcph DS0O+rbLiDKlHAF+70y6QuKixMXF/gzs7k1vufxKzIuhHP79UINv418W2UT8cxSMJC 6HuvxJOEAgWlkf8YOIhV7+ASTDTHezHjuHVuAFnAZXHRCOQpy3AK2LlsjFooPmQhzw 6WRQfFFDDOjr29HITvY5tLdt8S2rJ1/IWSDErDiz/E5+yyV+QVpfI28gRjCV1Yg7J3 K8fLhJEtUd3MOpU2+mx492VbZT0wc5a737Q6pWt7rXXK18EQcmG3Geake2zjofkbsv Y0VU9AGqTGCzQ== From: fdmanana@kernel.org To: 
linux-btrfs@vger.kernel.org Subject: [PATCH 05/11] btrfs: pass the extent map tree's inode to clear_em_logging() Date: Wed, 10 Apr 2024 12:28:37 +0100 Message-Id: <70c2fbda7aee6c8161fc79c63d155df576a57dc3.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Extent maps are always associated with an inode's extent map tree, so there's no need to pass the extent map tree explicitly to clear_em_logging(). In order to facilitate an upcoming change that adds a shrinker for extent maps, change clear_em_logging() to receive the inode instead of its extent map tree. Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 4 +++- fs/btrfs/extent_map.h | 2 +- fs/btrfs/tree-log.c | 4 ++-- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index d0e0c4e5415e..7cda78d11d75 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -331,8 +331,10 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen) } -void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em) +void clear_em_logging(struct btrfs_inode *inode, struct extent_map *em) { + struct extent_map_tree *tree = &inode->extent_tree; + lockdep_assert_held_write(&tree->lock); em->flags &= ~EXTENT_FLAG_LOGGING; diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index f287ab46e368..732fc8d7e534 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -129,7 +129,7 @@ void free_extent_map(struct extent_map *em); int __init extent_map_init(void); void __cold extent_map_exit(void); int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen); -void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em); +void clear_em_logging(struct btrfs_inode *inode, struct extent_map *em); struct extent_map
*search_extent_mapping(struct extent_map_tree *tree, u64 start, u64 len); int btrfs_add_extent_mapping(struct btrfs_inode *inode, diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index d9777649e170..4a4fca841510 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -4945,7 +4945,7 @@ static int btrfs_log_changed_extents(struct btrfs_trans_handle *trans, * private list. */ if (ret) { - clear_em_logging(tree, em); + clear_em_logging(inode, em); free_extent_map(em); continue; } @@ -4954,7 +4954,7 @@ static int btrfs_log_changed_extents(struct btrfs_trans_handle *trans, ret = log_one_extent(trans, inode, em, path, ctx); write_lock(&tree->lock); - clear_em_logging(tree, em); + clear_em_logging(inode, em); free_extent_map(em); } WARN_ON(!list_empty(&extents)); From patchwork Wed Apr 10 11:28:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624119 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B480A15E811 for ; Wed, 10 Apr 2024 11:28:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748533; cv=none; b=W+tWeRy0wHmIfMNBk2ooYbKjpqg0oeEp6xTNMWYrDkcpuwkoLnXibJyzAbiHvww46MTEuWCYnlbxuaLvXiYBczCsV81vVyWovfAivY867tQg0IKnle61D4pTC4VXQaTeMQf9bWDbX9lGxEVsLrOjCzIlpDoaCZsYpvVyhgjbdqc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748533; c=relaxed/simple; bh=UVWYw3W3rDJm0/srJaDKzeRQOjrXVoWlqIdhZKi3XHg=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=hWfbUMUyI0kz/YZV/227bhcVuHaEuEEFDzgZ/8Dv7JPHQcISHBgfH/MEud3wbQ80O+hwZ/aqGY1tlyUpSso7dYUgpqVlqJvBaPGvzWmCo91SbGx02DUop+3EqvdENPBd+OXtY82hc1lHcI3VNO6MX6AqET9oDNjYZWsVHBy8QV8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=snyU7jui; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="snyU7jui" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 10AE7C433F1 for ; Wed, 10 Apr 2024 11:28:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748533; bh=UVWYw3W3rDJm0/srJaDKzeRQOjrXVoWlqIdhZKi3XHg=; h=From:To:Subject:Date:In-Reply-To:References:From; b=snyU7juiLXbGPNdAM5uNZfvqZDcBmLuldmB0l40JUuQBRUVvY08xXGxgTUMzcdacB plBQDBSUppdqXS6B4WK1tZZBLf0NAtvIUayfRXksGPClRDjPu9e5SHKX4ZiMw0Xlpm qs6qKhkTbuV6mBs8v+fQ74iPrDuBdEmWgD23bB3LGBgIq85EIvYOjZEykgfT+M8ETA hK1hRQd+07C1lqrr1UhgTaA8zIP4qYsJoD0PQZXIJ3mpBY5AeQ6ycpalP9Cz+RBuCm zuZBmu+r3sfI2I9MVvKfBxvRGdP+S4XYyXUMyLBL5qx3/emFEfHtLzugVd0w9538+p ZXWOi/9oxjVpA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 06/11] btrfs: pass the extent map tree's inode to remove_extent_mapping() Date: Wed, 10 Apr 2024 12:28:38 +0100 Message-Id: <43e4e4a75ef530348f4bb1fa65614f6d2df9c757.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Extent maps are always associated with an inode's extent map tree, so there's no need to pass the extent map tree explicitly to remove_extent_mapping(). In order to facilitate an upcoming change that adds a shrinker for extent maps, change remove_extent_mapping() to receive the inode instead of its extent map tree.
Signed-off-by: Filipe Manana --- fs/btrfs/extent_io.c | 2 +- fs/btrfs/extent_map.c | 22 +++++++++++++--------- fs/btrfs/extent_map.h | 2 +- fs/btrfs/tests/extent-map-tests.c | 19 ++++++++++--------- 4 files changed, 25 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d90330f26827..1b236fc3f411 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2457,7 +2457,7 @@ int try_release_extent_mapping(struct page *page, gfp_t mask) * hurts the fsync performance for workloads with a data * size that exceeds or is close to the system's memory). */ - remove_extent_mapping(map, em); + remove_extent_mapping(btrfs_inode, em); /* once for the rb tree */ free_extent_map(em); next: diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 7cda78d11d75..289669763965 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -449,16 +449,18 @@ struct extent_map *search_extent_mapping(struct extent_map_tree *tree, } /* - * Remove an extent_map from the extent tree. + * Remove an extent_map from its inode's extent tree. * - * @tree: extent tree to remove from + * @inode: the inode the extent map belongs to * @em: extent map being removed * - * Remove @em from @tree. No reference counts are dropped, and no checks - * are done to see if the range is in use. + * Remove @em from the extent tree of @inode. No reference counts are dropped, + * and no checks are done to see if the range is in use. */ -void remove_extent_mapping(struct extent_map_tree *tree, struct extent_map *em) +void remove_extent_mapping(struct btrfs_inode *inode, struct extent_map *em) { + struct extent_map_tree *tree = &inode->extent_tree; + lockdep_assert_held_write(&tree->lock); WARN_ON(em->flags & EXTENT_FLAG_PINNED); @@ -633,8 +635,10 @@ int btrfs_add_extent_mapping(struct btrfs_inode *inode, * if needed. This avoids searching the tree, from the root down to the first * extent map, before each deletion. 
*/ -static void drop_all_extent_maps_fast(struct extent_map_tree *tree) +static void drop_all_extent_maps_fast(struct btrfs_inode *inode) { + struct extent_map_tree *tree = &inode->extent_tree; + write_lock(&tree->lock); while (!RB_EMPTY_ROOT(&tree->map.rb_root)) { struct extent_map *em; @@ -643,7 +647,7 @@ static void drop_all_extent_maps_fast(struct extent_map_tree *tree) node = rb_first_cached(&tree->map); em = rb_entry(node, struct extent_map, rb_node); em->flags &= ~(EXTENT_FLAG_PINNED | EXTENT_FLAG_LOGGING); - remove_extent_mapping(tree, em); + remove_extent_mapping(inode, em); free_extent_map(em); cond_resched_rwlock_write(&tree->lock); } @@ -676,7 +680,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, WARN_ON(end < start); if (end == (u64)-1) { if (start == 0 && !skip_pinned) { - drop_all_extent_maps_fast(em_tree); + drop_all_extent_maps_fast(inode); return; } len = (u64)-1; @@ -854,7 +858,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, ASSERT(!split); btrfs_set_inode_full_sync(inode); } - remove_extent_mapping(em_tree, em); + remove_extent_mapping(inode, em); } /* diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index 732fc8d7e534..c3707461ff62 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -120,7 +120,7 @@ static inline u64 extent_map_end(const struct extent_map *em) void extent_map_tree_init(struct extent_map_tree *tree); struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree, u64 start, u64 len); -void remove_extent_mapping(struct extent_map_tree *tree, struct extent_map *em); +void remove_extent_mapping(struct btrfs_inode *inode, struct extent_map *em); int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, u64 new_logical); diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c index 9e9cb591c0f1..db6fb1a2c78f 100644 --- a/fs/btrfs/tests/extent-map-tests.c +++ 
b/fs/btrfs/tests/extent-map-tests.c @@ -11,8 +11,9 @@ #include "../disk-io.h" #include "../block-group.h" -static int free_extent_map_tree(struct extent_map_tree *em_tree) +static int free_extent_map_tree(struct btrfs_inode *inode) { + struct extent_map_tree *em_tree = &inode->extent_tree; struct extent_map *em; struct rb_node *node; int ret = 0; @@ -21,7 +22,7 @@ static int free_extent_map_tree(struct extent_map_tree *em_tree) while (!RB_EMPTY_ROOT(&em_tree->map.rb_root)) { node = rb_first_cached(&em_tree->map); em = rb_entry(node, struct extent_map, rb_node); - remove_extent_mapping(em_tree, em); + remove_extent_mapping(inode, em); #ifdef CONFIG_BTRFS_DEBUG if (refcount_read(&em->refs) != 1) { @@ -142,7 +143,7 @@ static int test_case_1(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) } free_extent_map(em); out: - ret2 = free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -237,7 +238,7 @@ static int test_case_2(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) } free_extent_map(em); out: - ret2 = free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -313,7 +314,7 @@ static int __test_case_3(struct btrfs_fs_info *fs_info, } free_extent_map(em); out: - ret2 = free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -435,7 +436,7 @@ static int __test_case_4(struct btrfs_fs_info *fs_info, } free_extent_map(em); out: - ret2 = free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -679,7 +680,7 @@ static int test_case_5(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) if (ret) goto out; out: - ret2 = free_extent_map_tree(&inode->extent_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -738,7 +739,7 @@ static int test_case_6(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) ret = 0; out: free_extent_map(em); - ret2 = 
free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; @@ -871,7 +872,7 @@ static int test_case_7(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode) ret2 = unpin_extent_cache(inode, 0, SZ_16K, 0); if (ret == 0) ret = ret2; - ret2 = free_extent_map_tree(em_tree); + ret2 = free_extent_map_tree(inode); if (ret == 0) ret = ret2; From patchwork Wed Apr 10 11:28:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624120 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB8D415ECCB for ; Wed, 10 Apr 2024 11:28:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748535; cv=none; b=fNz+tB0sSqdfZCWE+icmSkUJFIo8yCcQQwcN749PQ+TN13EidrK8R5UqpBR9GlzyT1jbjrY0L5JWWUtIy1q3ZFisCf7wyRYQuRr3e6MCgl2mGCUJR3XPxXH5WZ277aJIug3BXUe8zf4J02kl3AFqzJa75uTpz0+8vHe37xA65aQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748535; c=relaxed/simple; bh=0t3zjz760py2aauv7Z725DD4gYKRxEhOiE2A6E/q/2s=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cI8KAdsecv1/LvUZsxRrXmAUxfoImIDe6/EU3HS0ocRfsFEV+9mvfqhy0HBPLkZxDNxWQV3CZZm+AlMkaam8PM/NHK9e3lx5aQb/u4k/IzlzHy0nA0kDFLqm1hRh57ckPcJdbmqUuma5Y3ukjT03yH44CkilSglGSGjyUoghVVg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=m4CDnulA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="m4CDnulA" Received: by smtp.kernel.org (Postfix) 
with ESMTPSA id 03D7CC433C7 for ; Wed, 10 Apr 2024 11:28:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748534; bh=0t3zjz760py2aauv7Z725DD4gYKRxEhOiE2A6E/q/2s=; h=From:To:Subject:Date:In-Reply-To:References:From; b=m4CDnulABPOpDsqe0czpDukAmvgfv7M9Gdo9X/qIEMk7sEvMjqPxOJKHfxDzMr2tb 05b1XTl2XNkd/ER1dBPfCeeKZDlIFu4GnIVhDVLxIgKyM4+0vg1i+CusFZH3DFIEMQ re1J3RmvkdZiAUWbrVXKwJM4tbbBvp9xMBlOHyOOxdMnCHfb33LxHYOOjFx+38t8Px /nB5gtrN9KU9Q+XHovd9p48vG8E5V12deQJLXJ5uSY3iIap04CgMSiiPZM8cHpwJ5g L8lJSelVpIZAXDHlackpybvD3VgPz3B9e+4lBc+MAYkXnYseL2G6D2hDFXfwHhiqCk vkqEkTE7bQ8OA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 07/11] btrfs: pass the extent map tree's inode to replace_extent_mapping() Date: Wed, 10 Apr 2024 12:28:39 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Extent maps are always associated with an inode's extent map tree, so there's no need to pass the extent map tree explicitly to replace_extent_mapping(). In order to facilitate an upcoming change that adds a shrinker for extent maps, change replace_extent_mapping() to receive the inode instead of its extent map tree.
Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 289669763965..15817b842c24 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -470,11 +470,13 @@ void remove_extent_mapping(struct btrfs_inode *inode, struct extent_map *em) RB_CLEAR_NODE(&em->rb_node); } -static void replace_extent_mapping(struct extent_map_tree *tree, +static void replace_extent_mapping(struct btrfs_inode *inode, struct extent_map *cur, struct extent_map *new, int modified) { + struct extent_map_tree *tree = &inode->extent_tree; + lockdep_assert_held_write(&tree->lock); WARN_ON(cur->flags & EXTENT_FLAG_PINNED); @@ -777,7 +779,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, split->generation = gen; split->flags = flags; - replace_extent_mapping(em_tree, em, split, modified); + replace_extent_mapping(inode, em, split, modified); free_extent_map(split); split = split2; split2 = NULL; @@ -818,8 +820,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end, } if (extent_map_in_tree(em)) { - replace_extent_mapping(em_tree, em, split, - modified); + replace_extent_mapping(inode, em, split, modified); } else { int ret; @@ -977,7 +978,7 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, split_pre->flags = flags; split_pre->generation = em->generation; - replace_extent_mapping(em_tree, em, split_pre, 1); + replace_extent_mapping(inode, em, split_pre, 1); /* * Now we only have an extent_map at: From patchwork Wed Apr 10 11:28:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624121 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 90A2815ECD1 for ; Wed, 10 Apr 2024 11:28:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748535; cv=none; b=G1q9ZeTdPU0mjN6uaj+t4P63I+opAC32gRj9Q5vH3Usb2dH3pkZpP7qdDrxuLEaVefQ6yGAjkZaT9NLYKQn2lffXEN9qnVWrj4nysf/zgWCcGUrPG66WVPMQNNaRxErRZGqErKQhDl+PiCYWW4Dulj/V5VgEQG9VephoEwhEMiE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748535; c=relaxed/simple; bh=yi1U240R5KsRHltzNh0UfAUh7geoKvnxiKt1FKNDmoU=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Fg4889VmSeE8TrMdQsBKpZhX4Xg7GkNt63J6arYpQtqCy22bdAzut7/plMfGE6trL9c7ZNkpXKY19jsfw6DZM1jsIPlAQngHDl0gwueK2RcodZxHbnc7QyW7xvFFB4ewIchjIBIZMm1iHOLBCnhUoIwE/lzYO/S5Exm5VoVKPoI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JDX8dB4w; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JDX8dB4w" Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECCD8C43390 for ; Wed, 10 Apr 2024 11:28:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748535; bh=yi1U240R5KsRHltzNh0UfAUh7geoKvnxiKt1FKNDmoU=; h=From:To:Subject:Date:In-Reply-To:References:From; b=JDX8dB4w7qV4l5FgRFhnPFVemk/2jPq6QSxpfrQoBFLjksHcfhTdPB/Qv1nniqiJA VxBwy/dQ0NC5chWkmDV9+3dBVP/lcRynNEzgIoa9bS+MwYjtQl7K4hcftZuOY6L86u mILFbdWOtUo9m09ldycxbJpA5PFyPratq5H0dAK2SWtR9CupTS8tHwcKZhgXWFcnUL cBNIyb5M66i/LYo2gwUDb84GLxtQeeWSbTdHglYekouNomGes5tsaU/NtpuMTnoXOM IyNHbuqj0z3Q22+fc78M8IRqi/7w39QcFqo7GPApOcRZRrKtjEessZvZnmtSfzCD2y 3ffxByZhYzMZg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 08/11] btrfs: add a global per cpu counter to track number of used 
extent maps Date: Wed, 10 Apr 2024 12:28:40 +0100 Message-Id: <0f1a834bcb67f4c57885706b54e19d22e64b9ce7.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Add a per-CPU counter that tracks the total number of extent maps that are in extent trees of inodes that belong to fs trees. This is going to be used in an upcoming change that adds a shrinker for extent maps. Only extent maps for fs trees are considered. For special trees such as the data relocation tree, we don't want to evict their extent maps, which are critical for the relocation to work, and since those are limited in number, it's not a concern to keep them in memory during the relocation of a block group. Another case is extent maps for free space cache inodes, which must always remain in memory, but those are also limited (there's only one per free space cache inode, which means one per block group).
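The accounting rule described above can be modelled with a small user-space sketch. Everything named toy_* is hypothetical: a plain long stands in for the kernel's percpu_counter, and a boolean field stands in for the check on the root being an fs tree. It is only meant to show which insertions and removals touch the counter.

```c
#include <stdbool.h>

/* Hypothetical model: a plain counter instead of a percpu_counter. */
struct toy_fs_info {
	long evictable_extent_maps;
	bool testing;		/* self tests do not take part in accounting */
};

struct toy_inode {
	struct toy_fs_info *fs_info;
	bool in_fs_tree;	/* false for e.g. relocation or free space inodes */
	long nr_maps;
};

/* Only extent maps of fs tree inodes are candidates for eviction. */
static bool toy_counts_as_evictable(const struct toy_inode *inode)
{
	return !inode->fs_info->testing && inode->in_fs_tree;
}

void toy_add_map(struct toy_inode *inode)
{
	inode->nr_maps++;
	if (toy_counts_as_evictable(inode))
		inode->fs_info->evictable_extent_maps++;
}

void toy_remove_map(struct toy_inode *inode)
{
	inode->nr_maps--;
	if (toy_counts_as_evictable(inode))
		inode->fs_info->evictable_extent_maps--;
}
```

Keeping the increment and the decrement behind the same predicate is what lets the counter return to zero once every tracked extent map has been removed.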
Signed-off-by: Filipe Manana --- fs/btrfs/disk-io.c | 6 ++++++ fs/btrfs/extent_map.c | 38 +++++++++++++++++++++++++++----------- fs/btrfs/fs.h | 2 ++ 3 files changed, 35 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 0474e9b6d302..3c2d35b2062e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1269,6 +1269,8 @@ void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) percpu_counter_destroy(&fs_info->dirty_metadata_bytes); percpu_counter_destroy(&fs_info->delalloc_bytes); percpu_counter_destroy(&fs_info->ordered_bytes); + ASSERT(percpu_counter_sum_positive(&fs_info->evictable_extent_maps) == 0); + percpu_counter_destroy(&fs_info->evictable_extent_maps); percpu_counter_destroy(&fs_info->dev_replace.bio_counter); btrfs_free_csum_hash(fs_info); btrfs_free_stripe_hash_table(fs_info); @@ -2848,6 +2850,10 @@ static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block if (ret) return ret; + ret = percpu_counter_init(&fs_info->evictable_extent_maps, 0, GFP_KERNEL); + if (ret) + return ret; + ret = percpu_counter_init(&fs_info->dirty_metadata_bytes, 0, GFP_KERNEL); if (ret) return ret; diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 15817b842c24..2fcf28148a81 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -76,6 +76,14 @@ static u64 range_end(u64 start, u64 len) return start + len; } +static void dec_evictable_extent_maps(struct btrfs_inode *inode) +{ + struct btrfs_fs_info *fs_info = inode->root->fs_info; + + if (!btrfs_is_testing(fs_info) && is_fstree(btrfs_root_id(inode->root))) + percpu_counter_dec(&fs_info->evictable_extent_maps); +} + static int tree_insert(struct rb_root_cached *root, struct extent_map *em) { struct rb_node **p = &root->rb_root.rb_node; @@ -223,8 +231,9 @@ static bool mergeable_maps(const struct extent_map *prev, const struct extent_ma return next->block_start == prev->block_start; } -static void try_merge_map(struct extent_map_tree *tree, struct 
extent_map *em) +static void try_merge_map(struct btrfs_inode *inode, struct extent_map *em) { + struct extent_map_tree *tree = &inode->extent_tree; struct extent_map *merge = NULL; struct rb_node *rb; @@ -258,6 +267,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em) rb_erase_cached(&merge->rb_node, &tree->map); RB_CLEAR_NODE(&merge->rb_node); free_extent_map(merge); + dec_evictable_extent_maps(inode); } } @@ -272,6 +282,7 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em) em->generation = max(em->generation, merge->generation); em->flags |= EXTENT_FLAG_MERGED; free_extent_map(merge); + dec_evictable_extent_maps(inode); } } @@ -322,7 +333,7 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen) em->generation = gen; em->flags &= ~EXTENT_FLAG_PINNED; - try_merge_map(tree, em); + try_merge_map(inode, em); out: write_unlock(&tree->lock); @@ -333,16 +344,14 @@ int unpin_extent_cache(struct btrfs_inode *inode, u64 start, u64 len, u64 gen) void clear_em_logging(struct btrfs_inode *inode, struct extent_map *em) { - struct extent_map_tree *tree = &inode->extent_tree; - - lockdep_assert_held_write(&tree->lock); + lockdep_assert_held_write(&inode->extent_tree.lock); em->flags &= ~EXTENT_FLAG_LOGGING; if (extent_map_in_tree(em)) - try_merge_map(tree, em); + try_merge_map(inode, em); } -static inline void setup_extent_mapping(struct extent_map_tree *tree, +static inline void setup_extent_mapping(struct btrfs_inode *inode, struct extent_map *em, int modified) { @@ -351,9 +360,9 @@ static inline void setup_extent_mapping(struct extent_map_tree *tree, ASSERT(list_empty(&em->list)); if (modified) - list_add(&em->list, &tree->modified_extents); + list_add(&em->list, &inode->extent_tree.modified_extents); else - try_merge_map(tree, em); + try_merge_map(inode, em); } /* @@ -373,6 +382,8 @@ static int add_extent_mapping(struct btrfs_inode *inode, struct extent_map *em, int modified) { struct 
extent_map_tree *tree = &inode->extent_tree; + struct btrfs_root *root = inode->root; + struct btrfs_fs_info *fs_info = root->fs_info; int ret; lockdep_assert_held_write(&tree->lock); @@ -381,7 +392,10 @@ static int add_extent_mapping(struct btrfs_inode *inode, if (ret) return ret; - setup_extent_mapping(tree, em, modified); + setup_extent_mapping(inode, em, modified); + + if (!btrfs_is_testing(fs_info) && is_fstree(btrfs_root_id(root))) + percpu_counter_inc(&fs_info->evictable_extent_maps); return 0; } @@ -468,6 +482,8 @@ void remove_extent_mapping(struct btrfs_inode *inode, struct extent_map *em) if (!(em->flags & EXTENT_FLAG_LOGGING)) list_del_init(&em->list); RB_CLEAR_NODE(&em->rb_node); + + dec_evictable_extent_maps(inode); } static void replace_extent_mapping(struct btrfs_inode *inode, @@ -486,7 +502,7 @@ static void replace_extent_mapping(struct btrfs_inode *inode, rb_replace_node_cached(&cur->rb_node, &new->rb_node, &tree->map); RB_CLEAR_NODE(&cur->rb_node); - setup_extent_mapping(tree, new, modified); + setup_extent_mapping(inode, new, modified); } static struct extent_map *next_extent_map(const struct extent_map *em) diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 93f5c57ea4e3..534d30dafe32 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -630,6 +630,8 @@ struct btrfs_fs_info { s32 dirty_metadata_batch; s32 delalloc_batch; + struct percpu_counter evictable_extent_maps; + /* Protected by 'trans_lock'. 
*/ struct list_head dirty_cowonly_roots; From patchwork Wed Apr 10 11:28:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624122 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5C2315ECD1 for ; Wed, 10 Apr 2024 11:28:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748536; cv=none; b=niLpfbRnOuWcaVTFUblQ4OB1yzvLawjsh/+2Lxq5bEN8kK1066FnKUJUzJn2KAmBZLhBXnMtFjA6pSINCHQrZ5Dvt5yRIpTpgAOvGf8L9yLNRbhi+eL/kowm/gxe6vJB8ODR5fXjoQ1+TiStkgNAGyKfmkdL3GZtjZ11Je4hFuU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748536; c=relaxed/simple; bh=GnsezX0q3pbcvFChtfbZjix/vlWVoskhd1gmjB42sqM=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Jh/K4qsoYmY3c12WS6vIRA7wvNDvrEIKdtf/lYZjawpaWZaD2xXgDDt8UUKzBiLEqHAeUHNZGHP6rYN4i9CAc2g3PeJMCw7y4vwtJ/7LysOELxQ0w/MrKWtjz0C6n8MuopgIZrs14OXmnKCjcTSbmDqeMoJsGNcn6ciputkZF2Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=f8uel31r; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="f8uel31r" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E0DA0C433F1 for ; Wed, 10 Apr 2024 11:28:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748536; bh=GnsezX0q3pbcvFChtfbZjix/vlWVoskhd1gmjB42sqM=; h=From:To:Subject:Date:In-Reply-To:References:From; b=f8uel31raDMSmt0yKKhQx7wrmopOzVnWvDY4O2r+Tx3mU6e9bcK/pFSnz63b9cimn 
iD6qLTm5TuT4VAziBepXANzOf6Ub26fWkfq/VLkTA9+ZFP+diqNA7SvyGEmw6+7shA c1Gvczz6kpecfptWjQJEMDbJBEIxm25TNGo7ECVKFoLcasvkQWBhIsxD4nGckZIs+r cmbu+FGWSMCoFIeJ39cpmPrHpNID1fUIeiIy5AOfLm7A/FIiA02J2aIZqlOMIgYfo8 IQg89QE9kzGD5SNHrtgGzf6jEAEe3mcquXxbEpgC+fvQoBz/VX1c5jlu2JWI/CDzfb JWMN8h0q8MtQg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 09/11] btrfs: add a shrinker for extent maps Date: Wed, 10 Apr 2024 12:28:41 +0100 Message-Id: <5d1743b20f84e0262a2c229cd5e877ed0f0596a0.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana

Extent maps are used either to represent existing file extent items, or
to represent new extents that are going to be written, for which the
respective file extent items are created when the ordered extent
completes.

We currently don't have any limit for how many extent maps we can have,
neither per inode nor globally. Most of the time this is not too
noticeable because extent maps are removed in the following situations:

1) When evicting an inode;

2) When releasing folios (pages) through the btrfs_release_folio()
   address space operation callback.

   However we won't release extent maps in the folio range if the folio
   is either dirty or under writeback or if the inode's i_size is less
   than or equal to 16M (see try_release_extent_mapping()). This 16M
   i_size constraint was added back in 2008 with commit 70dec8079d78
   ("Btrfs: extent_io and extent_state optimizations"), but there's no
   explanation about why we have it or why the 16M value.

This means that for buffered IO we can reach an OOM situation due to too
many extent maps if either of the following happens:

1) There's a set of tasks constantly doing IO on many files with a size
   not larger than 16M, especially if they keep the files open for very
   long periods, therefore preventing inode eviction.
   This requires a really high number of such files, and having many
   non mergeable extent maps (due to random 4K writes for example) and
   a machine with very little memory;

2) There's a set of tasks constantly doing random write IO (therefore
   creating many non mergeable extent maps) on files and keeping them
   open for long periods of time, so inode eviction doesn't happen and
   there's always a lot of dirty pages or pages under writeback,
   preventing btrfs_release_folio() from releasing the respective extent
   maps.

This second case was actually reported in the thread pointed to by the
Link tag below, and it requires a very large file under heavy IO and a
machine with very little RAM, which is probably unlikely to happen in
practice in a real world use case.

However when using direct IO this is much easier to hit, because the
page cache is not used, and therefore btrfs_release_folio() is never
called. This means extent maps are dropped only when evicting the inode,
so if we have tasks that keep a file descriptor open and keep doing IO
on a very large file (or files), we can exhaust memory due to an
unbounded number of extent maps. This is especially likely if we have a
huge file with millions of small extents and their extent maps are not
mergeable (non contiguous offsets and disk locations).
This was reported in that thread with the following fio test:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj
   MOUNT_OPTIONS="-o ssd"
   MKFS_OPTIONS=""

   cat <<EOF > /tmp/fio-job.ini
   [global]
   name=fio-rand-write
   filename=$MNT/fio-rand-write
   rw=randwrite
   bs=4K
   direct=1
   numjobs=16
   fallocate=none
   time_based
   runtime=90000

   [file1]
   size=300G
   ioengine=libaio
   iodepth=16

   EOF

   umount $MNT &> /dev/null
   mkfs.btrfs -f $MKFS_OPTIONS $DEV
   mount $MOUNT_OPTIONS $DEV $MNT

   fio /tmp/fio-job.ini
   umount $MNT

Monitoring the btrfs_extent_map slab while running the test with:

   $ watch -d -n 1 'cat /sys/kernel/slab/btrfs_extent_map/objects \
                        /sys/kernel/slab/btrfs_extent_map/total_objects'

shows the number of active and total extent maps skyrocketing to tens of
millions, and on systems with a small amount of memory it's easy and
quick to get into an OOM situation, as reported in that thread.

So to avoid this issue add a shrinker that will remove extent maps, as
long as they are not pinned, and that takes proper care with any
concurrent fsync to avoid missing extents (setting the full sync flag
while in the middle of a fast fsync). This shrinker is similar to the
one ext4 uses for its extent_status structure, which is analogous to
btrfs' extent_map structure.
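
As a rough illustration of the per-inode policy described above (skip
pinned extent maps, drop the rest, and flag the inode for a full sync
when a dropped map was still in the list of modified extents), here is a
minimal userspace sketch. It is not part of the patch: the em_model and
inode_model types and scan_inode_model() are hypothetical stand-ins for
illustration only, and the sketch ignores locking, rescheduling and
reference counting entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an extent map, not the kernel struct. */
struct em_model {
	bool pinned;	/* models EXTENT_FLAG_PINNED being set */
	bool modified;	/* models being in the list of modified extents */
	bool in_tree;	/* false once the model shrinker has dropped it */
};

/* Hypothetical stand-in for an inode and its extent map tree. */
struct inode_model {
	struct em_model *maps;
	size_t nr_maps;
	bool full_sync;	/* models the inode's full sync flag */
};

/*
 * Visit extent maps until *scanned reaches nr_to_scan. Pinned maps are
 * counted as scanned but kept. Every other map is dropped, and if it was
 * modified the inode is flagged for a full sync. Returns the number of
 * maps dropped.
 */
static unsigned long scan_inode_model(struct inode_model *inode,
				      unsigned long *scanned,
				      unsigned long nr_to_scan)
{
	unsigned long nr_dropped = 0;

	for (size_t i = 0; i < inode->nr_maps && *scanned < nr_to_scan; i++) {
		struct em_model *em = &inode->maps[i];

		if (!em->in_tree)
			continue;
		(*scanned)++;

		if (em->pinned)
			continue;

		if (em->modified)
			inode->full_sync = true;

		em->in_tree = false;
		nr_dropped++;
	}

	return nr_dropped;
}
```

The scanned counter is passed by pointer so that, as in the patch, one
scan budget can be shared across several inodes and roots.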
Link: https://lore.kernel.org/linux-btrfs/13f94633dcf04d29aaf1f0a43d42c55e@amazon.com/ Signed-off-by: Filipe Manana --- fs/btrfs/disk-io.c | 7 +- fs/btrfs/extent_map.c | 200 ++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/extent_map.h | 2 + fs/btrfs/fs.h | 2 + 4 files changed, 207 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 3c2d35b2062e..8bb295eaf3d7 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1266,11 +1266,10 @@ static void free_global_roots(struct btrfs_fs_info *fs_info) void btrfs_free_fs_info(struct btrfs_fs_info *fs_info) { + btrfs_unregister_extent_map_shrinker(fs_info); percpu_counter_destroy(&fs_info->dirty_metadata_bytes); percpu_counter_destroy(&fs_info->delalloc_bytes); percpu_counter_destroy(&fs_info->ordered_bytes); - ASSERT(percpu_counter_sum_positive(&fs_info->evictable_extent_maps) == 0); - percpu_counter_destroy(&fs_info->evictable_extent_maps); percpu_counter_destroy(&fs_info->dev_replace.bio_counter); btrfs_free_csum_hash(fs_info); btrfs_free_stripe_hash_table(fs_info); @@ -2846,11 +2845,11 @@ static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block sb->s_blocksize = BTRFS_BDEV_BLOCKSIZE; sb->s_blocksize_bits = blksize_bits(BTRFS_BDEV_BLOCKSIZE); - ret = percpu_counter_init(&fs_info->ordered_bytes, 0, GFP_KERNEL); + ret = btrfs_register_extent_map_shrinker(fs_info); if (ret) return ret; - ret = percpu_counter_init(&fs_info->evictable_extent_maps, 0, GFP_KERNEL); + ret = percpu_counter_init(&fs_info->ordered_bytes, 0, GFP_KERNEL); if (ret) return ret; diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index 2fcf28148a81..fa755921442d 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -8,6 +8,7 @@ #include "extent_map.h" #include "compression.h" #include "btrfs_inode.h" +#include "disk-io.h" static struct kmem_cache *extent_map_cache; @@ -1026,3 +1027,202 @@ int split_extent_map(struct btrfs_inode *inode, u64 start, u64 len, u64 pre, 
free_extent_map(split_pre); return ret; } + +static unsigned long btrfs_scan_inode(struct btrfs_inode *inode, + unsigned long *scanned, + unsigned long nr_to_scan) +{ + struct extent_map_tree *tree = &inode->extent_tree; + unsigned long nr_dropped = 0; + struct rb_node *node; + + /* + * Take the mmap lock so that we serialize with the inode logging phase + * of fsync because we may need to set the full sync flag on the inode, + * in case we have to remove extent maps in the tree's list of modified + * extents. If we set the full sync flag in the inode while an fsync is + * in progress, we may risk missing new extents because before the flag + * is set, fsync decides to only wait for writeback to complete and then + * during inode logging it sees the flag set and uses the subvolume tree + * to find new extents, which may not be there yet because ordered + * extents haven't completed yet. + */ + down_read(&inode->i_mmap_lock); + write_lock(&tree->lock); + node = rb_first_cached(&tree->map); + while (node) { + struct extent_map *em; + + em = rb_entry(node, struct extent_map, rb_node); + node = rb_next(node); + (*scanned)++; + + if (em->flags & EXTENT_FLAG_PINNED) + goto next; + + if (!list_empty(&em->list)) + btrfs_set_inode_full_sync(inode); + + remove_extent_mapping(inode, em); + /* Drop the reference for the tree. */ + free_extent_map(em); + nr_dropped++; +next: + if (*scanned >= nr_to_scan) + break; + + /* + * Restart if we had to resched, and any extent maps that were + * pinned before may have become unpinned after we released the + * lock and took it again. 
+ */ + if (cond_resched_rwlock_write(&tree->lock)) + node = rb_first_cached(&tree->map); + } + write_unlock(&tree->lock); + up_read(&inode->i_mmap_lock); + + return nr_dropped; +} + +static unsigned long btrfs_scan_root(struct btrfs_root *root, + unsigned long *scanned, + unsigned long nr_to_scan) +{ + unsigned long nr_dropped = 0; + u64 ino = 0; + + while (*scanned < nr_to_scan) { + struct rb_node *node; + struct rb_node *prev = NULL; + struct btrfs_inode *inode; + bool stop_search = true; + + spin_lock(&root->inode_lock); + node = root->inode_tree.rb_node; + + while (node) { + prev = node; + inode = rb_entry(node, struct btrfs_inode, rb_node); + if (ino < btrfs_ino(inode)) + node = node->rb_left; + else if (ino > btrfs_ino(inode)) + node = node->rb_right; + else + break; + } + + if (!node) { + while (prev) { + inode = rb_entry(prev, struct btrfs_inode, rb_node); + if (ino <= btrfs_ino(inode)) { + node = prev; + break; + } + prev = rb_next(prev); + } + } + + while (node) { + inode = rb_entry(node, struct btrfs_inode, rb_node); + ino = btrfs_ino(inode) + 1; + if (igrab(&inode->vfs_inode)) { + spin_unlock(&root->inode_lock); + stop_search = false; + + nr_dropped += btrfs_scan_inode(inode, scanned, + nr_to_scan); + iput(&inode->vfs_inode); + cond_resched(); + break; + } + node = rb_next(node); + } + + if (stop_search) { + spin_unlock(&root->inode_lock); + break; + } + } + + return nr_dropped; +} + +static unsigned long btrfs_extent_maps_scan(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct btrfs_fs_info *fs_info = shrinker->private_data; + unsigned long nr_dropped = 0; + unsigned long scanned = 0; + u64 next_root_id = 0; + + while (scanned < sc->nr_to_scan) { + struct btrfs_root *root; + unsigned long count; + + spin_lock(&fs_info->fs_roots_radix_lock); + count = radix_tree_gang_lookup(&fs_info->fs_roots_radix, + (void **)&root, next_root_id, 1); + if (count == 0) { + spin_unlock(&fs_info->fs_roots_radix_lock); + break; + } + next_root_id = 
btrfs_root_id(root) + 1; + root = btrfs_grab_root(root); + spin_unlock(&fs_info->fs_roots_radix_lock); + + if (!root) + continue; + + if (is_fstree(btrfs_root_id(root))) + nr_dropped += btrfs_scan_root(root, &scanned, sc->nr_to_scan); + + btrfs_put_root(root); + } + + return nr_dropped; +} + +static unsigned long btrfs_extent_maps_count(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct btrfs_fs_info *fs_info = shrinker->private_data; + const s64 total = percpu_counter_sum_positive(&fs_info->evictable_extent_maps); + + /* The unsigned long type is 32 bits on 32 bits platforms. */ +#if BITS_PER_LONG == 32 + if (total > ULONG_MAX) + return ULONG_MAX; +#endif + return total; +} + +int btrfs_register_extent_map_shrinker(struct btrfs_fs_info *fs_info) +{ + int ret; + + ret = percpu_counter_init(&fs_info->evictable_extent_maps, 0, GFP_KERNEL); + if (ret) + return ret; + + fs_info->extent_map_shrinker = shrinker_alloc(0, "em-btrfs:%s", fs_info->sb->s_id); + if (!fs_info->extent_map_shrinker) { + percpu_counter_destroy(&fs_info->evictable_extent_maps); + return -ENOMEM; + } + + fs_info->extent_map_shrinker->scan_objects = btrfs_extent_maps_scan; + fs_info->extent_map_shrinker->count_objects = btrfs_extent_maps_count; + fs_info->extent_map_shrinker->private_data = fs_info; + + shrinker_register(fs_info->extent_map_shrinker); + + return 0; +} + +void btrfs_unregister_extent_map_shrinker(struct btrfs_fs_info *fs_info) +{ + shrinker_free(fs_info->extent_map_shrinker); + ASSERT(percpu_counter_sum_positive(&fs_info->evictable_extent_maps) == 0); + percpu_counter_destroy(&fs_info->evictable_extent_maps); +} diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index c3707461ff62..8a6be2f7a0e2 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -140,5 +140,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, int btrfs_replace_extent_map_range(struct btrfs_inode *inode, struct extent_map *new_em, bool modified); +int 
btrfs_register_extent_map_shrinker(struct btrfs_fs_info *fs_info); +void btrfs_unregister_extent_map_shrinker(struct btrfs_fs_info *fs_info); #endif diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 534d30dafe32..f1414814bd69 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -857,6 +857,8 @@ struct btrfs_fs_info { struct lockdep_map btrfs_trans_pending_ordered_map; struct lockdep_map btrfs_ordered_extent_map; + struct shrinker *extent_map_shrinker; + #ifdef CONFIG_BTRFS_FS_REF_VERIFY spinlock_t ref_verify_lock; struct rb_root block_tree; From patchwork Wed Apr 10 11:28:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624123 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81A1715ECE5 for ; Wed, 10 Apr 2024 11:28:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748537; cv=none; b=fkdxSzXg9nYkv6lwOfL555S72nJyueL/m2ltY6M/R2qdHdIuFlksnsp4lmlpnxLVksUAVJfjVvYAIVV04ETyLRH68wTCOiUEjH20503ADHnYhF66+t71DX1PGJpC6r7uB8eU103RuTq0H1Eyg6bNRgqs4yRsVPj2e6dljc+TQJk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748537; c=relaxed/simple; bh=67jWQQ9sTDxMQWCgMTV9sLUnQFeO9k20o3t6eEgqmUw=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hgMTCvdSETa5RzU9NzrRtw2Fz7mH5GLyDtXLamx3LptccFnP/D2g9K/HVNzIWKYGa5xGx0M/W4XZKTCGOiYgqT9GEXMHRm1745UxeEPoKkkfO1+SDVAT1dV+ZbSTSrH/ln852axqEUfXHjdmHnIsE2jrcArgSTCgd5FmF76jR7k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aPHEql/b; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aPHEql/b" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D4B2DC433C7 for ; Wed, 10 Apr 2024 11:28:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748537; bh=67jWQQ9sTDxMQWCgMTV9sLUnQFeO9k20o3t6eEgqmUw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=aPHEql/b2ZWbpjcBc+gaXvdJ5ec37eu4MUzGPKwud4D41awDArrupmXyYVehVThed TqOWAB5lXZ08bKKQFnVam6jreNZz4/qEXbcXj7P55YeREvbNUegw4gdKp3rjiWh24Y ibU7N6l4g+ldaPci6/1Wh2SItbWPPn48nrojnS74OZzczu95H80B2wpB5Or8vJZjAS 62E07RsmZiS7FLsLTpmVePiIQxg8tkk/3+SRrcKDXoccuOO8nc/HLHbLod/R60YFXo a2BfvReT6ovbeGjNEC0YqQ3GOepG9UiLpY83Y3p+v+Bdv4ko87tWhPA+PeVc0d8B2E kUPO3c9UgHVuw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 10/11] btrfs: update comment for btrfs_set_inode_full_sync() about locking Date: Wed, 10 Apr 2024 12:28:42 +0100 Message-Id: <5011cf788a2c295a13040bc3b1af9e56c33553a0.1712748143.git.fdmanana@suse.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Nowadays we have a lock used to synchronize mmap writes with reflink and fsync operations (struct btrfs_inode::i_mmap_lock), so update the comment for btrfs_set_inode_full_sync() to mention that it can also be called while holding that mmap lock. Besides being a valid alternative to the inode's VFS lock, we already have the extent map shrinker using that mmap lock instead. 
Signed-off-by: Filipe Manana --- fs/btrfs/btrfs_inode.h | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index 100020ca4658..ce01709e372f 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -381,9 +381,11 @@ static inline void btrfs_set_inode_last_sub_trans(struct btrfs_inode *inode) } /* - * Should be called while holding the inode's VFS lock in exclusive mode or in a - * context where no one else can access the inode concurrently (during inode - * creation or when loading an inode from disk). + * Should be called while holding the inode's VFS lock in exclusive mode, or + * while holding the inode's mmap lock (struct btrfs_inode::i_mmap_lock) in + * either shared or exclusive mode, or in a context where no one else can access + * the inode concurrently (during inode creation or when loading an inode from + * disk). */ static inline void btrfs_set_inode_full_sync(struct btrfs_inode *inode) { From patchwork Wed Apr 10 11:28:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13624124 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D717615ECFA for ; Wed, 10 Apr 2024 11:28:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748538; cv=none; b=T4ujeBAFQqLiQ7Ck2ZSY+nve2cye/N/op/i8eyCTao3V4bhfx2hwEPEkyoXvPMHvHvSJB0pHkUOmX9uSKy+vQ7/2hqOTVGHvv9oBm5cGIThNMXwFozr6PqkdvhmChhFZGj+uV0ueMAFQSGMfv5s3X89bEnGyzX3MCki/lf7Pv1A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712748538; c=relaxed/simple; 
bh=Zb2rQqiZU6tC91zUVqi9UHGP3O/3KGkbYmmC6n450kU=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gs0ifemHx6XN1MdaRxwEhrAr7PkzFxdxxXhQmmk7YOeOQz3KqjcBRoge6az7ZdunoKGqEx0Y8xhBfsbsfvWnMiAav8WXzk/6340ZN+ayMunJ/+16veltUazzPTWqFBB0KAFbA/uMiAS7qnajMkZ147gDoFHWaIjpfKXywre81Hg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=r9OKaymP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="r9OKaymP" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C901CC433C7 for ; Wed, 10 Apr 2024 11:28:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712748538; bh=Zb2rQqiZU6tC91zUVqi9UHGP3O/3KGkbYmmC6n450kU=; h=From:To:Subject:Date:In-Reply-To:References:From; b=r9OKaymP7bXn//othh57ID9bVoirObrKs7yyJFhyyUFmBeqvhNKiOYXaMQQHYlWy3 OS2WaQoi81AyV1EmT/f8ZyMwisdzLjM29QDOlwe2T4O/YDUUoqFxRLvy96zT/K938o V/Mk4u5fuj9fttlLlCxzo8G64kxgbCfsLivktdHWTbW8HESvlZUSk7p0NvighZyffI 9uH8i8cU6jduwv7pLTGRyW4Fcn8IFr/1HSDcfuWyUogyr6owHrjoJbI3aN9+lwOylN pxw/W1VFz3peNFsPql7rGhdzH0Eq7zsjsOM2Lhp6fefXt9acNW+FX63lrC8y86y8i1 AdxMvU75P2Zpw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 11/11] btrfs: add tracepoints for extent map shrinker events Date: Wed, 10 Apr 2024 12:28:43 +0100 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Filipe Manana Add some tracepoints for the extent map shrinker to help debug and analyse main events. These have proved useful during development of the shrinker. 
Signed-off-by: Filipe Manana --- fs/btrfs/extent_map.c | 15 ++++++ include/trace/events/btrfs.h | 92 ++++++++++++++++++++++++++++++++++++ 2 files changed, 107 insertions(+) diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index fa755921442d..2be5324085fe 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -1064,6 +1064,7 @@ static unsigned long btrfs_scan_inode(struct btrfs_inode *inode, btrfs_set_inode_full_sync(inode); remove_extent_mapping(inode, em); + trace_btrfs_extent_map_shrinker_remove_em(inode, em); /* Drop the reference for the tree. */ free_extent_map(em); nr_dropped++; @@ -1156,6 +1157,12 @@ static unsigned long btrfs_extent_maps_scan(struct shrinker *shrinker, unsigned long scanned = 0; u64 next_root_id = 0; + if (trace_btrfs_extent_map_shrinker_scan_enter_enabled()) { + s64 nr = percpu_counter_sum_positive(&fs_info->evictable_extent_maps); + + trace_btrfs_extent_map_shrinker_scan_enter(fs_info, sc->nr_to_scan, nr); + } + while (scanned < sc->nr_to_scan) { struct btrfs_root *root; unsigned long count; @@ -1180,6 +1187,12 @@ static unsigned long btrfs_extent_maps_scan(struct shrinker *shrinker, btrfs_put_root(root); } + if (trace_btrfs_extent_map_shrinker_scan_exit_enabled()) { + s64 nr = percpu_counter_sum_positive(&fs_info->evictable_extent_maps); + + trace_btrfs_extent_map_shrinker_scan_exit(fs_info, nr_dropped, nr); + } + return nr_dropped; } @@ -1189,6 +1202,8 @@ static unsigned long btrfs_extent_maps_count(struct shrinker *shrinker, struct btrfs_fs_info *fs_info = shrinker->private_data; const s64 total = percpu_counter_sum_positive(&fs_info->evictable_extent_maps); + trace_btrfs_extent_map_shrinker_count(fs_info, sc->nr_to_scan, total); + /* The unsigned long type is 32 bits on 32 bits platforms. 
*/ #if BITS_PER_LONG == 32 if (total > ULONG_MAX) diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 766cfd48386c..ba49efa2bc74 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -2551,6 +2551,98 @@ TRACE_EVENT(btrfs_get_raid_extent_offset, __entry->devid) ); +TRACE_EVENT(btrfs_extent_map_shrinker_count, + + TP_PROTO(const struct btrfs_fs_info *fs_info, u64 nr_to_scan, u64 nr), + + TP_ARGS(fs_info, nr_to_scan, nr), + + TP_STRUCT__entry_btrfs( + __field( u64, nr_to_scan ) + __field( u64, nr ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->nr_to_scan = nr_to_scan; + __entry->nr = nr; + ), + + TP_printk_btrfs("nr_to_scan=%llu nr=%llu", + __entry->nr_to_scan, __entry->nr) +); + +TRACE_EVENT(btrfs_extent_map_shrinker_scan_enter, + + TP_PROTO(const struct btrfs_fs_info *fs_info, u64 nr_to_scan, u64 nr), + + TP_ARGS(fs_info, nr_to_scan, nr), + + TP_STRUCT__entry_btrfs( + __field( u64, nr_to_scan ) + __field( u64, nr ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->nr_to_scan = nr_to_scan; + __entry->nr = nr; + ), + + TP_printk_btrfs("nr_to_scan=%llu nr=%llu", + __entry->nr_to_scan, __entry->nr) +); + +TRACE_EVENT(btrfs_extent_map_shrinker_scan_exit, + + TP_PROTO(const struct btrfs_fs_info *fs_info, u64 nr_dropped, u64 nr), + + TP_ARGS(fs_info, nr_dropped, nr), + + TP_STRUCT__entry_btrfs( + __field( u64, nr_dropped ) + __field( u64, nr ) + ), + + TP_fast_assign_btrfs(fs_info, + __entry->nr_dropped = nr_dropped; + __entry->nr = nr; + ), + + TP_printk_btrfs("nr_dropped=%llu nr=%llu", + __entry->nr_dropped, __entry->nr) +); + +TRACE_EVENT(btrfs_extent_map_shrinker_remove_em, + + TP_PROTO(const struct btrfs_inode *inode, const struct extent_map *em), + + TP_ARGS(inode, em), + + TP_STRUCT__entry_btrfs( + __field( u64, ino ) + __field( u64, root_id ) + __field( u64, start ) + __field( u64, len ) + __field( u64, block_start ) + __field( u32, flags ) + ), + + TP_fast_assign_btrfs(inode->root->fs_info, + __entry->ino = 
btrfs_ino(inode); + __entry->root_id = inode->root->root_key.objectid; + __entry->start = em->start; + __entry->len = em->len; + __entry->block_start = em->block_start; + __entry->flags = em->flags; + ), + + TP_printk_btrfs( +"ino=%llu root=%llu(%s) start=%llu len=%llu block_start=%llu(%s) flags=%s", + __entry->ino, show_root_type(__entry->root_id), + __entry->start, __entry->len, + show_map_type(__entry->block_start), + show_map_flags(__entry->flags)) +); + #endif /* _TRACE_BTRFS_H */ /* This part must be outside protection */