[RFC,1/2] btrfs: relocation: Allow signal to cancel balance
diff mbox series

Message ID 20200706074435.52356-2-wqu@suse.com
State New
Headers show
Series
  • btrfs: make ticket wait uninterruptible to address unexpected RO during balance
Related show

Commit Message

Qu Wenruo July 6, 2020, 7:44 a.m. UTC
Although btrfs balance can be canceled with "btrfs balance cancel"
command, it's still almost muscle memory to press Ctrl-C to cancel a
long running btrfs balance.

So allow btrfs balance to check signal to determine if it should exit.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/relocation.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Josef Bacik July 6, 2020, 1:45 p.m. UTC | #1
On 7/6/20 3:44 AM, Qu Wenruo wrote:
> Although btrfs balance can be canceled with "btrfs balance cancel"
> command, it's still almost muscle memory to press Ctrl-C to cancel a
> long running btrfs balance.
> 
> So allow btrfs balance to check signal to determine if it should exit.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
Hans van Kranenburg July 6, 2020, 6:19 p.m. UTC | #2
Hi,

On 7/6/20 9:44 AM, Qu Wenruo wrote:
> Although btrfs balance can be canceled with "btrfs balance cancel"
> command, it's still almost muscle memory to press Ctrl-C to cancel a
> long running btrfs balance.

Thanks for investigating all of this.

Has it actually been unsafe (read: undefined behaviour) forever, or only
since some recent change?

Or did it just by accident not cause real damage earlier on in the past?

[ 1041.810963] BTRFS info (device xvdb): relocating block group
91423244288 flags metadata

<- ^C made it stop here

[ 1076.189766] BTRFS info (device xvdb): relocating block group
91423244288 flags metadata
[ 1081.300131] BTRFS info (device xvdb): found 6689 extents
[ 1081.311138] BTRFS info (device xvdb): relocating block group
90349502464 flags data
[ 1089.776066] BTRFS info (device xvdb): found 215 extents

The above is with 4.19.118. Now I'm trying this again and looking just a
little better at the logging, I see that what I thought (it always
stopped after doing 1 block group) is not true. With ^C I can make it
stop halfway and then next time it again starts at 91423244288.

Related question: should, additionally, the btrfs balance in progs also
be changed to catch the SIGINT while it's doing the balance ioctl, to
prevent the signal from leaking to the kernel space? I mean, kernels
with the bug are already 'out there' now...

Thanks

> So allow btrfs balance to check signal to determine if it should exit.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/relocation.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 523d2e5fab8f..2b869fb2e62c 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2656,7 +2656,8 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
>   */
>  int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>  {
> -	return atomic_read(&fs_info->balance_cancel_req);
> +	return atomic_read(&fs_info->balance_cancel_req) ||
> +		fatal_signal_pending(current);
>  }
>  ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE);
>  
>
Qu Wenruo July 6, 2020, 10:43 p.m. UTC | #3
On 2020/7/7 上午2:19, Hans van Kranenburg wrote:
> Hi,
> 
> On 7/6/20 9:44 AM, Qu Wenruo wrote:
>> Although btrfs balance can be canceled with "btrfs balance cancel"
>> command, it's still almost muscle memory to press Ctrl-C to cancel a
>> long running btrfs balance.
> 
> Thanks for investigating all of this.
> 
> Has it actually been unsafe (read: undefined behaviour) forever, or only
> since some recent change?

Forever.

That -EINTR from metadata reservation path is there for a long long time.

> 
> Or did it just by accident not cause real damage earlier on in the past?
> 
> [ 1041.810963] BTRFS info (device xvdb): relocating block group
> 91423244288 flags metadata
> 
> <- ^C made it stop here
> 
> [ 1076.189766] BTRFS info (device xvdb): relocating block group
> 91423244288 flags metadata
> [ 1081.300131] BTRFS info (device xvdb): found 6689 extents
> [ 1081.311138] BTRFS info (device xvdb): relocating block group
> 90349502464 flags data
> [ 1089.776066] BTRFS info (device xvdb): found 215 extents
> 
> The above is with 4.19.118. Now I'm trying this again and looking just a
> little better at the logging, I see that what I thought (it always
> stopped after doing 1 block group) is not true. With ^C I can make it
> stop halfway and then next time it again starts at 91423244288.
> 
> Related question: should, additionally, the btrfs balance in progs also
> be changed to catch the SIGINT while it's doing the balance ioctl, to
> prevent the signal from leaking to the kernel space? I mean, kernels
> with the bug are already 'out there' now...

It has no way to catch signal while trapped into kernel space.

My previous assumption of the whole ioctl thing is still right, when
we're in kernel space, we can't catch any signal.

It's just the metadata reservation code manually checking the pending
signal and return -EINTR.

So even if we make btrfs-progs to catch that signal, it won't work.
And even if it seems to work, it's because in btrfs module we're
checking signal explicitly.

Thanks,
Qu

> 
> Thanks
> 
>> So allow btrfs balance to check signal to determine if it should exit.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>  fs/btrfs/relocation.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 523d2e5fab8f..2b869fb2e62c 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -2656,7 +2656,8 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
>>   */
>>  int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
>>  {
>> -	return atomic_read(&fs_info->balance_cancel_req);
>> +	return atomic_read(&fs_info->balance_cancel_req) ||
>> +		fatal_signal_pending(current);
>>  }
>>  ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE);
>>  
>>
>

Patch
diff mbox series

diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 523d2e5fab8f..2b869fb2e62c 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2656,7 +2656,8 @@  int setup_extent_mapping(struct inode *inode, u64 start, u64 end,
  */
 int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info)
 {
-	return atomic_read(&fs_info->balance_cancel_req);
+	return atomic_read(&fs_info->balance_cancel_req) ||
+		fatal_signal_pending(current);
 }
 ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE);