[v5,2/2] btrfs: Don't block system suspend during fstrim

Message ID	20240916125707.127118-3-luca.stefani.ge1@gmail.com (mailing list archive)
State	New
Series	btrfs: Don't block system suspend during fstrim
	[v5,0/2] btrfs: Don't block system suspend during fstrim
	[v5,1/2] btrfs: Split remaining space to discard in chunks
	[v5,2/2] btrfs: Don't block system suspend during fstrim

Headers

From: Luca Stefani <luca.stefani.ge1@gmail.com>
To: 
Cc: Luca Stefani <luca.stefani.ge1@gmail.com>,
	Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v5 2/2] btrfs: Don't block system suspend during fstrim
Date: Mon, 16 Sep 2024 14:56:15 +0200
Message-ID: <20240916125707.127118-3-luca.stefani.ge1@gmail.com>
In-Reply-To: <20240916125707.127118-1-luca.stefani.ge1@gmail.com>
References: <20240916125707.127118-1-luca.stefani.ge1@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Commit Message

Luca Stefani Sept. 16, 2024, 12:56 p.m. UTC

Sometimes the system isn't able to suspend because the task
responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase,
which can trim a lot of unallocated space, and there are no
limits on the trim size (unlike the block group part).

Since discard isn't a critical call, it can be interrupted
at any time. In such cases we stop the trim, report the amount
of discarded bytes and return failure.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
---
 fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

Comments

David Sterba Sept. 17, 2024, 4:24 p.m. UTC | #1

On Mon, Sep 16, 2024 at 02:56:15PM +0200, Luca Stefani wrote:
> Sometimes the system isn't able to suspend because the task
> responsible for trimming the device isn't able to finish in
> time, especially since we have a free extent discarding phase,
> which can trim a lot of unallocated space, and there are no
> limits on the trim size (unlike the block group part).
> 
> Since discard isn't a critical call, it can be interrupted
> at any time. In such cases we stop the trim, report the amount
> of discarded bytes and return failure.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>

I went through the cancellation points; some of them don't seem to be
necessary, e.g. in a big loop where some function is called to do the trim
(extents, bitmaps) and then does the signal and freezing check again.

Next, some of the functions are called from async discard and errors are
not checked: btrfs_trim_block_group_bitmaps() called from
btrfs_discard_workfn().

There's also a check for pending signals in trim_bitmaps() in
free-space-cache.c. Given that the space cache code is on the way out we
don't necessarily need to fix it but if the patch gets backported to
older kernels it still makes sense.
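
The point about redundant cancellation points can be illustrated with a small user-space model. This is a sketch under assumptions, not btrfs code: all `sketch_` names and the check countdown are hypothetical, and the argument only holds when the outer loop stops on the first error returned by the callee, as it does here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's ERESTARTSYS value. */
#define SKETCH_ERESTARTSYS 512

/*
 * Hypothetical hook: number of checks that still report "not interrupted".
 * A negative value means "never interrupt".
 */
static int sketch_checks_left = -1;

/* Stand-in for the combined signal and freezing check. */
static bool sketch_interrupted(void)
{
	if (sketch_checks_left < 0)
		return false;
	if (sketch_checks_left == 0)
		return true;
	sketch_checks_left--;
	return false;
}

/* Inner function: does the work in steps and polls after every step. */
static int sketch_trim_group(int steps, int *steps_done)
{
	for (int i = 0; i < steps; i++) {
		(*steps_done)++;
		if (sketch_interrupted())
			return -SKETCH_ERESTARTSYS;
	}
	return 0;
}

/*
 * Outer loop: has no check of its own. It relies on the callee's check and
 * stops as soon as the callee reports an error, so an interruption is still
 * noticed within one step.
 */
static int sketch_trim_all(const int *groups, size_t count, int *steps_done)
{
	for (size_t i = 0; i < count; i++) {
		int ret = sketch_trim_group(groups[i], steps_done);

		if (ret)
			return ret;
	}
	return 0;
}
```

If the outer loop instead recorded the error and moved on to the next item, a check in the outer loop would no longer be redundant, so each cancellation point has to be judged against how its caller handles errors.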

> ---
>  fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 79b9243c9cd6..cef368a30731 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -16,6 +16,7 @@
>  #include <linux/percpu_counter.h>
>  #include <linux/lockdep.h>
>  #include <linux/crc32c.h>
> +#include <linux/freezer.h>
>  #include "ctree.h"
>  #include "extent-tree.h"
>  #include "transaction.h"
> @@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
>  	return ret;
>  }
>  
> +static bool btrfs_trim_interrupted(void)
> +{
> +	return fatal_signal_pending(current) || freezing(current);
> +}
> +
>  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  			       u64 *discarded_bytes)
>  {
> @@ -1316,6 +1322,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  		start += bytes_to_discard;
>  		bytes_left -= bytes_to_discard;
>  		*discarded_bytes += bytes_to_discard;
> +
> +		if (btrfs_trim_interrupted()) {
> +			ret = -ERESTARTSYS;
> +			break;
> +		}
>  	}
>  
>  	return ret;
> @@ -6470,7 +6481,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
>  		start += len;
>  		*trimmed += bytes;
>  
> -		if (fatal_signal_pending(current)) {
> +		if (btrfs_trim_interrupted()) {
>  			ret = -ERESTARTSYS;
>  			break;
>  		}
> @@ -6519,6 +6530,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>  
>  	cache = btrfs_lookup_first_block_group(fs_info, range->start);
>  	for (; cache; cache = btrfs_next_block_group(cache)) {
> +		if (btrfs_trim_interrupted()) {
> +			bg_ret = -ERESTARTSYS;
> +			break;
> +		}
> +
>  		if (cache->start >= range_end) {
>  			btrfs_put_block_group(cache);
>  			break;
> @@ -6558,6 +6574,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>  
>  	mutex_lock(&fs_devices->device_list_mutex);
>  	list_for_each_entry(device, &fs_devices->devices, dev_list) {
> +		if (btrfs_trim_interrupted()) {
> +			dev_ret = -ERESTARTSYS;

This one seems redundant.

> +			break;
> +		}
> +
>  		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>  			continue;
>  
> -- 
> 2.46.0
>

Luca Stefani Sept. 17, 2024, 5:38 p.m. UTC | #2

On 17/09/24 18:24, David Sterba wrote:
> On Mon, Sep 16, 2024 at 02:56:15PM +0200, Luca Stefani wrote:
>> Sometimes the system isn't able to suspend because the task
>> responsible for trimming the device isn't able to finish in
>> time, especially since we have a free extent discarding phase,
>> which can trim a lot of unallocated space, and there are no
>> limits on the trim size (unlike the block group part).
>>
>> Since discard isn't a critical call, it can be interrupted
>> at any time. In such cases we stop the trim, report the amount
>> of discarded bytes and return failure.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
>> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
> 
> I went through the cancellation points; some of them don't seem to be
> necessary, e.g. in a big loop where some function is called to do the trim
> (extents, bitmaps) and then does the signal and freezing check again.
> 
> Next, some of the functions are called from async discard and errors are
> not checked: btrfs_trim_block_group_bitmaps() called from
> btrfs_discard_workfn().
Indeed, the return codes of both btrfs_trim_block_group_bitmaps and
btrfs_trim_block_group_extents are never checked in btrfs_discard_workfn.
I'll fix that up in another CL.
> 
> There's also a check for pending signals in trim_bitmaps() in
> free-space-cache.c. Given that the space cache code is on the way out we
> don't necessarily need to fix it but if the patch gets backported to
> older kernels it still makes sense.
Ah, I missed this one, will fix it.
There are a few more instances of fatal_signal_pending, but I don't know
whether they should be converted or not. I'll focus on the one you mentioned
and on trim_no_bitmap, which seems to do similar checks for fatal signals.
> 
>> ---
>>   fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++-
>>   1 file changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 79b9243c9cd6..cef368a30731 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -16,6 +16,7 @@
>>   #include <linux/percpu_counter.h>
>>   #include <linux/lockdep.h>
>>   #include <linux/crc32c.h>
>> +#include <linux/freezer.h>
>>   #include "ctree.h"
>>   #include "extent-tree.h"
>>   #include "transaction.h"
>> @@ -1235,6 +1236,11 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
>>   	return ret;
>>   }
>>   
>> +static bool btrfs_trim_interrupted(void)
>> +{
>> +	return fatal_signal_pending(current) || freezing(current);
>> +}
>> +
>>   static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>>   			       u64 *discarded_bytes)
>>   {
>> @@ -1316,6 +1322,11 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>>   		start += bytes_to_discard;
>>   		bytes_left -= bytes_to_discard;
>>   		*discarded_bytes += bytes_to_discard;
>> +
>> +		if (btrfs_trim_interrupted()) {
>> +			ret = -ERESTARTSYS;
>> +			break;
>> +		}
>>   	}
>>   
>>   	return ret;
>> @@ -6470,7 +6481,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
>>   		start += len;
>>   		*trimmed += bytes;
>>   
>> -		if (fatal_signal_pending(current)) {
>> +		if (btrfs_trim_interrupted()) {
>>   			ret = -ERESTARTSYS;
>>   			break;
>>   		}
>> @@ -6519,6 +6530,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>   
>>   	cache = btrfs_lookup_first_block_group(fs_info, range->start);
>>   	for (; cache; cache = btrfs_next_block_group(cache)) {
>> +		if (btrfs_trim_interrupted()) {
>> +			bg_ret = -ERESTARTSYS;
>> +			break;
>> +		}
>> +
>>   		if (cache->start >= range_end) {
>>   			btrfs_put_block_group(cache);
>>   			break;
>> @@ -6558,6 +6574,11 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
>>   
>>   	mutex_lock(&fs_devices->device_list_mutex);
>>   	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +		if (btrfs_trim_interrupted()) {
>> +			dev_ret = -ERESTARTSYS;
> 
> This one seems redundant.
> 
>> +			break;
>> +		}
>> +
>>   		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
>>   			continue;
>>   
>> -- 
>> 2.46.0
>>

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 79b9243c9cd6..cef368a30731 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -16,6 +16,7 @@ 
 #include <linux/percpu_counter.h>
 #include <linux/lockdep.h>
 #include <linux/crc32c.h>
+#include <linux/freezer.h>
 #include "ctree.h"
 #include "extent-tree.h"
 #include "transaction.h"
@@ -1235,6 +1236,11 @@  static int remove_extent_backref(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool btrfs_trim_interrupted(void)
+{
+	return fatal_signal_pending(current) || freezing(current);
+}
+
 static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 			       u64 *discarded_bytes)
 {
@@ -1316,6 +1322,11 @@  static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
 		start += bytes_to_discard;
 		bytes_left -= bytes_to_discard;
 		*discarded_bytes += bytes_to_discard;
+
+		if (btrfs_trim_interrupted()) {
+			ret = -ERESTARTSYS;
+			break;
+		}
 	}
 
 	return ret;
@@ -6470,7 +6481,7 @@  static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 		start += len;
 		*trimmed += bytes;
 
-		if (fatal_signal_pending(current)) {
+		if (btrfs_trim_interrupted()) {
 			ret = -ERESTARTSYS;
 			break;
 		}
@@ -6519,6 +6530,11 @@  int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	cache = btrfs_lookup_first_block_group(fs_info, range->start);
 	for (; cache; cache = btrfs_next_block_group(cache)) {
+		if (btrfs_trim_interrupted()) {
+			bg_ret = -ERESTARTSYS;
+			break;
+		}
+
 		if (cache->start >= range_end) {
 			btrfs_put_block_group(cache);
 			break;
@@ -6558,6 +6574,11 @@  int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+		if (btrfs_trim_interrupted()) {
+			dev_ret = -ERESTARTSYS;
+			break;
+		}
+
 		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
 			continue;
