Message ID | 20191121120331.29070-2-nborisov@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | 3 misc patches | expand |
On Thu, Nov 21, 2019 at 02:03:29PM +0200, Nikolay Borisov wrote: > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -4169,8 +4169,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_fs_info *fs_info, > if (ret) > goto out; > } else { > - if (btrfs_test_opt(fs_info, DISCARD)) > - ret = btrfs_discard_extent(fs_info, start, len, NULL); > btrfs_add_free_space(cache, start, len); > btrfs_free_reserved_bytes(cache, len, delalloc); > trace_btrfs_reserved_extent_free(fs_info, start, len); > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 0ac0f5b33003..5d80fe030e79 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -3250,10 +3250,15 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) > if ((ret || !logical_len) && > clear_reserved_extent && > !test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && > - !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) > + !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) { > btrfs_free_reserved_extent(fs_info, > ordered_extent->start, > ordered_extent->disk_len, 1); > + if (ret && btrfs_test_opt(fs_info, DISCARD)) > + btrfs_discard_extent(fs_info, > + ordered_extent->start, ordered_extent->disk_len, > + NULL); This brings back vague memories of misplaced discard (in finish_ordered_io), that was quite hard to catch. I can't find the fix though. Filipe, is it the same issue?
On Wed, Nov 27, 2019 at 05:06:00PM +0100, David Sterba wrote: > On Thu, Nov 21, 2019 at 02:03:29PM +0200, Nikolay Borisov wrote: > > --- a/fs/btrfs/extent-tree.c > > +++ b/fs/btrfs/extent-tree.c > > @@ -4169,8 +4169,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_fs_info *fs_info, > > if (ret) > > goto out; > > } else { > > - if (btrfs_test_opt(fs_info, DISCARD)) > > - ret = btrfs_discard_extent(fs_info, start, len, NULL); > > btrfs_add_free_space(cache, start, len); > > btrfs_free_reserved_bytes(cache, len, delalloc); > > trace_btrfs_reserved_extent_free(fs_info, start, len); > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > > index 0ac0f5b33003..5d80fe030e79 100644 > > --- a/fs/btrfs/inode.c > > +++ b/fs/btrfs/inode.c > > @@ -3250,10 +3250,15 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) > > if ((ret || !logical_len) && > > clear_reserved_extent && > > !test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && > > - !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) > > + !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) { > > btrfs_free_reserved_extent(fs_info, > > ordered_extent->start, > > ordered_extent->disk_len, 1); > > + if (ret && btrfs_test_opt(fs_info, DISCARD)) > > + btrfs_discard_extent(fs_info, > > + ordered_extent->start, ordered_extent->disk_len, > > + NULL); > > This brings back vague memories of misplaced discard (in > finish_ordered_io), that was quite hard to catch. I can't find the fix > though. Filipe, is it the same issue? Closest to what I thought are dcc82f4783ad9 "Btrfs: fix log tree corruption when fs mounted with -o discard" or 678886bdc6378 "Btrfs: fix fs corruption on transaction abort if device supports discard".
On 27.11.19 г. 18:15 ч., David Sterba wrote: > On Wed, Nov 27, 2019 at 05:06:00PM +0100, David Sterba wrote: >> On Thu, Nov 21, 2019 at 02:03:29PM +0200, Nikolay Borisov wrote: >>> --- a/fs/btrfs/extent-tree.c >>> +++ b/fs/btrfs/extent-tree.c >>> @@ -4169,8 +4169,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_fs_info *fs_info, >>> if (ret) >>> goto out; >>> } else { >>> - if (btrfs_test_opt(fs_info, DISCARD)) >>> - ret = btrfs_discard_extent(fs_info, start, len, NULL); >>> btrfs_add_free_space(cache, start, len); >>> btrfs_free_reserved_bytes(cache, len, delalloc); >>> trace_btrfs_reserved_extent_free(fs_info, start, len); >>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c >>> index 0ac0f5b33003..5d80fe030e79 100644 >>> --- a/fs/btrfs/inode.c >>> +++ b/fs/btrfs/inode.c >>> @@ -3250,10 +3250,15 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) >>> if ((ret || !logical_len) && >>> clear_reserved_extent && >>> !test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && >>> - !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) >>> + !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) { >>> btrfs_free_reserved_extent(fs_info, >>> ordered_extent->start, >>> ordered_extent->disk_len, 1); >>> + if (ret && btrfs_test_opt(fs_info, DISCARD)) >>> + btrfs_discard_extent(fs_info, >>> + ordered_extent->start, ordered_extent->disk_len, >>> + NULL); >> >> This brings back vague memories of misplaced discard (in >> finish_ordered_io), that was quite hard to catch. I can't find the fix >> though. Filipe, is it the same issue? > > Closest to what I thought are dcc82f4783ad9 "Btrfs: fix log tree > corruption when fs mounted with -o discard" or 678886bdc6378 "Btrfs: fix > fs corruption on transaction abort if device supports discard". > Pinned extents are discarded when btrfs_finish_extent_commit is called from transaction commit. If you look closer you'd see btrfs_Free_Reserved_extent here means we call into __btrfs_Free_Reserved_extent with pin equal to false, so the original code would have gone to the else {} clause where the discard was. By predicating on (ret) I only handle the case where an ordered extent (which only applies to data) is potentially partially written.
On Wed, Nov 27, 2019 at 4:08 PM David Sterba <dsterba@suse.cz> wrote: > > On Thu, Nov 21, 2019 at 02:03:29PM +0200, Nikolay Borisov wrote: > > --- a/fs/btrfs/extent-tree.c > > +++ b/fs/btrfs/extent-tree.c > > @@ -4169,8 +4169,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_fs_info *fs_info, > > if (ret) > > goto out; > > } else { > > - if (btrfs_test_opt(fs_info, DISCARD)) > > - ret = btrfs_discard_extent(fs_info, start, len, NULL); > > btrfs_add_free_space(cache, start, len); > > btrfs_free_reserved_bytes(cache, len, delalloc); > > trace_btrfs_reserved_extent_free(fs_info, start, len); > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > > index 0ac0f5b33003..5d80fe030e79 100644 > > --- a/fs/btrfs/inode.c > > +++ b/fs/btrfs/inode.c > > @@ -3250,10 +3250,15 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) > > if ((ret || !logical_len) && > > clear_reserved_extent && > > !test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && > > - !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) > > + !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) { > > btrfs_free_reserved_extent(fs_info, > > ordered_extent->start, > > ordered_extent->disk_len, 1); > > + if (ret && btrfs_test_opt(fs_info, DISCARD)) > > + btrfs_discard_extent(fs_info, > > + ordered_extent->start, ordered_extent->disk_len, > > + NULL); > > This brings back vague memories of misplaced discard (in > finish_ordered_io), that was quite hard to catch. I can't find the fix > though. Filipe, is it the same issue? It's different (as you identified already in another reply). Even though this is rarely hit on a healthy system, it looks fine to me. Reviewed-by: Filipe Manana <fdmanana@suse.com> thanks
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index fe0f33dd344d..613c7bbf5cbd 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4169,8 +4169,6 @@ static int __btrfs_free_reserved_extent(struct btrfs_fs_info *fs_info, if (ret) goto out; } else { - if (btrfs_test_opt(fs_info, DISCARD)) - ret = btrfs_discard_extent(fs_info, start, len, NULL); btrfs_add_free_space(cache, start, len); btrfs_free_reserved_bytes(cache, len, delalloc); trace_btrfs_reserved_extent_free(fs_info, start, len); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0ac0f5b33003..5d80fe030e79 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3250,10 +3250,15 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) if ((ret || !logical_len) && clear_reserved_extent && !test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && - !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) + !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) { btrfs_free_reserved_extent(fs_info, ordered_extent->start, ordered_extent->disk_len, 1); + if (ret && btrfs_test_opt(fs_info, DISCARD)) + btrfs_discard_extent(fs_info, + ordered_extent->start, ordered_extent->disk_len, + NULL); + } }
All callers of btrfs_free_reserved_extent (respectively __btrfs_free_reserved_extent with in set to 0) pass in extents which have only been reserved but not yet written to. Namely, * In cow_file_range that function is called only if create_io_em fails or btrfs_add_ordered_extent fail, both of which happen _before_ any IO is submitted to the newly reserved range. * In submit_compressed_extents the code flow is similar - out_free_reserve can be called only before btrfs_submit_compressed_write which is where any writes to the range could occur. * btrfs_new_extent_direct also calls btrfs_free_reserved_extent only if extent_map fails, before any IO is issued. * __btrfs_prealloc_file_range also calls btrfs_free_reserved_extent in case insertion of the metadata fails. * btrfs_alloc_tree_block again can only be called in case in-memory operations fail, before any IO is submitted. * btrfs_finish_ordered_io - this is the only caller where discarding the extent could have a material effect, since it can be called for an extent which was partially written. With this change the submission of discards is optimised since discards are now not being created for extents which are known to not have been touched on disk. Signed-off-by: Nikolay Borisov <nborisov@suse.com> --- fs/btrfs/extent-tree.c | 2 -- fs/btrfs/inode.c | 7 ++++++- 2 files changed, 6 insertions(+), 3 deletions(-)