diff mbox series

[1/5] btrfs: free qgroup rsv on ioerr ordered_extent

Message ID 301bc827ef330a961a95791e6c4d3dbe3e2a6108.1701464169.git.boris@bur.io (mailing list archive)
State New, archived
Headers show
Series btrfs: qgroups rsv fixes | expand

Commit Message

Boris Burkov Dec. 1, 2023, 9 p.m. UTC
An ordered extent completing is a critical moment in qgroup rsv
handling, as the ownership of the reservation is handed off from the
ordered extent to the delayed ref. In the happy path we release (unlock)
but do not free (decrement counter) the reservation, and the delayed ref
drives the free. However, on an error, we don't create a delayed ref,
since there is no ref to add. Therefore, free on the error path.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/ordered-data.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Qu Wenruo Dec. 4, 2023, 9:04 p.m. UTC | #1
On 2023/12/2 07:30, Boris Burkov wrote:
> An ordered extent completing is a critical moment in qgroup rsv
> handling, as the ownership of the reservation is handed off from the
> ordered extent to the delayed ref. In the happy path we release (unlock)
> but do not free (decrement counter) the reservation, and the delayed ref
> drives the free. However, on an error, we don't create a delayed ref,
> since there is no ref to add. Therefore, free on the error path.

And I believe this would cause btrfs to be noisy at the unmount time,
due to unreleased qgroup rsv.

Have you hit any one during your tests? If so, mind to add some dmesg
output for it?

Or if no hit so far, would it be possible to add a new test case for it?

>
> Signed-off-by: Boris Burkov <boris@bur.io>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>   fs/btrfs/ordered-data.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index 574e8a55e24a..8d4ab5ecfa5d 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -599,7 +599,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
>   			release = entry->disk_num_bytes;
>   		else
>   			release = entry->num_bytes;
> -		btrfs_delalloc_release_metadata(btrfs_inode, release, false);
> +		btrfs_delalloc_release_metadata(btrfs_inode, release,
> +						test_bit(BTRFS_ORDERED_IOERR, &entry->flags));
>   	}
>
>   	percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,
Boris Burkov Dec. 5, 2023, 7:42 p.m. UTC | #2
On Tue, Dec 05, 2023 at 07:34:10AM +1030, Qu Wenruo wrote:
> 
> 
> On 2023/12/2 07:30, Boris Burkov wrote:
> > An ordered extent completing is a critical moment in qgroup rsv
> > handling, as the ownership of the reservation is handed off from the
> > ordered extent to the delayed ref. In the happy path we release (unlock)
> > but do not free (decrement counter) the reservation, and the delayed ref
> > drives the free. However, on an error, we don't create a delayed ref,
> > since there is no ref to add. Therefore, free on the error path.
> 
> And I believe this would cause btrfs to be noisy at the unmount time,
> due to unreleased qgroup rsv.
> 
> Have you hit any one during your tests? If so, mind to add some dmesg
> output for it?
> 
> Or if no hit so far, would it be possible to add a new test case for it?

I hit the conditions for all of these fixes running xfstests with
MKFS_OPTIONS including -O squota or -O quota. IIRC the failures were
almost all in the umount path in btrfs_check_quota_leak, though some of
the issues manifested as reservation freeing underflows in
qgroup_rsv_release. generic/475 triggered most of the bugs but a handful
of other tests hit some others. Unfortunately, I did not take perfect
notes on which test was fixed by which patch. I can try to recover that
information by removing the patches and running the full suite while
iteratively adding them back in. That is obviously fairly time consuming,
so I would only do it if people really want that information in the commit
messages or something.

Thanks for the review!
Boris

> 
> > 
> > Signed-off-by: Boris Burkov <boris@bur.io>
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> Thanks,
> Qu
> 
> > ---
> >   fs/btrfs/ordered-data.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> > index 574e8a55e24a..8d4ab5ecfa5d 100644
> > --- a/fs/btrfs/ordered-data.c
> > +++ b/fs/btrfs/ordered-data.c
> > @@ -599,7 +599,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
> >   			release = entry->disk_num_bytes;
> >   		else
> >   			release = entry->num_bytes;
> > -		btrfs_delalloc_release_metadata(btrfs_inode, release, false);
> > +		btrfs_delalloc_release_metadata(btrfs_inode, release,
> > +						test_bit(BTRFS_ORDERED_IOERR, &entry->flags));
> >   	}
> > 
> >   	percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,
Qu Wenruo Dec. 5, 2023, 8:16 p.m. UTC | #3
On 2023/12/6 06:12, Boris Burkov wrote:
> On Tue, Dec 05, 2023 at 07:34:10AM +1030, Qu Wenruo wrote:
>>
>>
>> On 2023/12/2 07:30, Boris Burkov wrote:
>>> An ordered extent completing is a critical moment in qgroup rsv
>>> handling, as the ownership of the reservation is handed off from the
>>> ordered extent to the delayed ref. In the happy path we release (unlock)
>>> but do not free (decrement counter) the reservation, and the delayed ref
>>> drives the free. However, on an error, we don't create a delayed ref,
>>> since there is no ref to add. Therefore, free on the error path.
>>
>> And I believe this would cause btrfs to be noisy at the unmount time,
>> due to unreleased qgroup rsv.
>>
>> Have you hit any one during your tests? If so, mind to add some dmesg
>> output for it?
>>
>> Or if no hit so far, would it be possible to add a new test case for it?
>
> I hit the conditions for all of these fixes running xfstests with
> MKFS_OPTIONS including -O squota or -O quota. IIRC the failures were
> almost all in the umount path in btrfs_check_quota_leak, though some of
> the issues manifested as reservation freeing underflows in
> qgroup_rsv_release. generic/475 triggered most of the bugs but a handful
> of other tests hit some others. Unfortunately, I did not take perfect
> notes on which test was fixed by which patch. I can try to recover that
> information by removing the patches and running the full suite while
> iteratively adding them back in. That is obviously fairly time consuming,
> so I would only do it if people really want that information in the commit
> messages or something.

Oh no worries, in that case I'm totally fine with the current commit
message.

Although a CC to stable would be nicer for patches 1~4.

Thanks,
Qu
>
> Thanks for the review!
> Boris
>
>>
>>>
>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>
>> Reviewed-by: Qu Wenruo <wqu@suse.com>
>>
>> Thanks,
>> Qu
>>
>>> ---
>>>    fs/btrfs/ordered-data.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>>> index 574e8a55e24a..8d4ab5ecfa5d 100644
>>> --- a/fs/btrfs/ordered-data.c
>>> +++ b/fs/btrfs/ordered-data.c
>>> @@ -599,7 +599,8 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
>>>    			release = entry->disk_num_bytes;
>>>    		else
>>>    			release = entry->num_bytes;
>>> -		btrfs_delalloc_release_metadata(btrfs_inode, release, false);
>>> +		btrfs_delalloc_release_metadata(btrfs_inode, release,
>>> +						test_bit(BTRFS_ORDERED_IOERR, &entry->flags));
>>>    	}
>>>
>>>    	percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,
diff mbox series

Patch

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 574e8a55e24a..8d4ab5ecfa5d 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -599,7 +599,8 @@  void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode,
 			release = entry->disk_num_bytes;
 		else
 			release = entry->num_bytes;
-		btrfs_delalloc_release_metadata(btrfs_inode, release, false);
+		btrfs_delalloc_release_metadata(btrfs_inode, release,
+						test_bit(BTRFS_ORDERED_IOERR, &entry->flags));
 	}
 
 	percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes,