Message ID | e58fb16de072382873edcc02c74366cbcb202499.1683632777.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] btrfs: fix backref walking not returning all inode refs | expand |
On Tue, May 09, 2023 at 12:50:02PM +0100, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > When using the logical to ino ioctl v2, if the flag to ignore offsets of > file extent items (BTRFS_LOGICAL_INO_ARGS_IGNORE_OFFSET) is given, the > backref walking code ends up not returning references for all file offsets > of an inode that point to the given logical bytenr. This happens since > kernel 6.2, commit 6ce6ba534418 ("btrfs: use a single argument for extent > offset in backref walking functions") because: > > 1) It mistakenly skipped the search for file extent items in a leaf that > point to the target extent if that flag is given. Instead it should > only skip the filtering done by check_extent_in_eb() - that is, it > should not avoid the calls to that function (or find_extent_in_eb(), > which uses it). > > 2) It was also not building a list of inode extent elements (struct > extent_inode_elem) if we have multiple inode references for an extent > when the ignore offset flag is given to the logical to ino ioctl - it > would leave a single element, only the last one that was found. > > These stem from the confusing old interface for backref walking functions > where we had an extent item offset argument that was a pointer to a u64 > and another boolean argument that indicated if the offset should be > ignored, but the pointer could be NULL. That NULL case is used by > relocation, qgroup extent accounting and fiemap, simply to avoid building > the inode extent list for each reference, as it's not necessary for those > use cases and therefore avoids memory allocations and some computations. > > Fix this by adding a boolean argument to the backref walk context > structure to indicate that the inode extent list should not be built, > make relocation set that argument to true and fix the backref walking > logic to skip the calls to check_extent_in_eb() and find_extent_in_eb() > only if this new argument is true, instead of 'ignore_extent_item_pos' > being true. > > A test case for fstests will be added soon, to provide cover not only > for these cases but to the logical to ino ioctl in general as well, as > currently we do not have a test case for it. > > Fixes: 6ce6ba534418 ("btrfs: use a single argument for extent offset in backref walking functions") > Reported-by: Vladimir Panteleev <git@vladimir.panteleev.md> > Link: https://lore.kernel.org/linux-btrfs/CAHhfkvwo=nmzrJSqZ2qMfF-rZB-ab6ahHnCD_sq9h4o8v+M7QQ@mail.gmail.com/ > Tested-by: Vladimir Panteleev <git@vladimir.panteleev.md> > CC: stable@vger.kernel.org # 6.2+ > Signed-off-by: Filipe Manana <fdmanana@suse.com> > --- > > V2: Remove wrong check for a non-zero extent item offset. > Apply the same logic at find_parent_nodes(), that is, search for file > extent items on a leaf if the ignore flag is given - the filtering > will be done later at check_extent_in_eb(). Spotted by Vladimir Panteleev > in the thread mentioned above. > > V3: Also fix the condition to decide if we should add an inode element to the > list of inode elements built before. Also rework the fix based on that > missing part. Added to misc-next. As this is a user visible bug I'll submit it in another batch for rc2, but there will be a few days before that in case you'd need another update. Thanks.
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c index e54f0884802a..79336fa853db 100644 --- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -45,7 +45,8 @@ static int check_extent_in_eb(struct btrfs_backref_walk_ctx *ctx, int root_count; bool cached; - if (!btrfs_file_extent_compression(eb, fi) && + if (!ctx->ignore_extent_item_pos && + !btrfs_file_extent_compression(eb, fi) && !btrfs_file_extent_encryption(eb, fi) && !btrfs_file_extent_other_encoding(eb, fi)) { u64 data_offset; @@ -552,7 +553,7 @@ static int add_all_parents(struct btrfs_backref_walk_ctx *ctx, count++; else goto next; - if (!ctx->ignore_extent_item_pos) { + if (!ctx->skip_inode_ref_list) { ret = check_extent_in_eb(ctx, &key, eb, fi, &eie); if (ret == BTRFS_ITERATE_EXTENT_INODES_STOP || ret < 0) @@ -564,7 +565,7 @@ static int add_all_parents(struct btrfs_backref_walk_ctx *ctx, eie, (void **)&old, GFP_NOFS); if (ret < 0) break; - if (!ret && !ctx->ignore_extent_item_pos) { + if (!ret && !ctx->skip_inode_ref_list) { while (old->next) old = old->next; old->next = eie; @@ -1606,7 +1607,7 @@ static int find_parent_nodes(struct btrfs_backref_walk_ctx *ctx, goto out; } if (ref->count && ref->parent) { - if (!ctx->ignore_extent_item_pos && !ref->inode_list && + if (!ctx->skip_inode_ref_list && !ref->inode_list && ref->level == 0) { struct btrfs_tree_parent_check check = { 0 }; struct extent_buffer *eb; @@ -1647,7 +1648,7 @@ static int find_parent_nodes(struct btrfs_backref_walk_ctx *ctx, (void **)&eie, GFP_NOFS); if (ret < 0) goto out; - if (!ret && !ctx->ignore_extent_item_pos) { + if (!ret && !ctx->skip_inode_ref_list) { /* * We've recorded that parent, so we must extend * its inode list here. @@ -1743,7 +1744,7 @@ int btrfs_find_all_leafs(struct btrfs_backref_walk_ctx *ctx) static int btrfs_find_all_roots_safe(struct btrfs_backref_walk_ctx *ctx) { const u64 orig_bytenr = ctx->bytenr; - const bool orig_ignore_extent_item_pos = ctx->ignore_extent_item_pos; + const bool orig_skip_inode_ref_list = ctx->skip_inode_ref_list; bool roots_ulist_allocated = false; struct ulist_iterator uiter; int ret = 0; @@ -1764,7 +1765,7 @@ static int btrfs_find_all_roots_safe(struct btrfs_backref_walk_ctx *ctx) roots_ulist_allocated = true; } - ctx->ignore_extent_item_pos = true; + ctx->skip_inode_ref_list = true; ULIST_ITER_INIT(&uiter); while (1) { @@ -1789,7 +1790,7 @@ static int btrfs_find_all_roots_safe(struct btrfs_backref_walk_ctx *ctx) ulist_free(ctx->refs); ctx->refs = NULL; ctx->bytenr = orig_bytenr; - ctx->ignore_extent_item_pos = orig_ignore_extent_item_pos; + ctx->skip_inode_ref_list = orig_skip_inode_ref_list; return ret; } @@ -1912,7 +1913,7 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr, goto out_trans; } - walk_ctx.ignore_extent_item_pos = true; + walk_ctx.skip_inode_ref_list = true; walk_ctx.trans = trans; walk_ctx.fs_info = fs_info; walk_ctx.refs = &ctx->refs; diff --git a/fs/btrfs/backref.h b/fs/btrfs/backref.h index ef6bbea3f456..1616e3e3f1e4 100644 --- a/fs/btrfs/backref.h +++ b/fs/btrfs/backref.h @@ -60,6 +60,12 @@ struct btrfs_backref_walk_ctx { * @extent_item_pos is ignored. */ bool ignore_extent_item_pos; + /* + * If true and bytenr corresponds to a data extent, then the inode list + * (each member describing inode number, file offset and root) is not + * added to each reference added to the @refs ulist. + */ + bool skip_inode_ref_list; /* A valid transaction handle or NULL. */ struct btrfs_trans_handle *trans; /* diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 09b1988d1791..59a06499c647 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3422,7 +3422,7 @@ int add_data_references(struct reloc_control *rc, btrfs_release_path(path); ctx.bytenr = extent_key->objectid; - ctx.ignore_extent_item_pos = true; + ctx.skip_inode_ref_list = true; ctx.fs_info = rc->extent_root->fs_info; ret = btrfs_find_all_leafs(&ctx);