Message ID | 20200706074435.52356-2-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: make ticket wait uninterruptible to address unexpected RO during balance | expand |
On 7/6/20 3:44 AM, Qu Wenruo wrote: > Although btrfs balance can be canceled with "btrfs balance cancel" > command, it's still almost muscle memory to press Ctrl-C to cancel a > long running btrfs balance. > > So allow btrfs balance to check signal to determine if it should exit. > > Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Thanks, Josef
Hi, On 7/6/20 9:44 AM, Qu Wenruo wrote: > Although btrfs balance can be canceled with "btrfs balance cancel" > command, it's still almost muscle memory to press Ctrl-C to cancel a > long running btrfs balance. Thanks for investigating all of this. Has it actually been unsafe (read: undefined behaviour) forever, or only since some recent change? Or did it just by accident not cause real damage earlier on in the past? [ 1041.810963] BTRFS info (device xvdb): relocating block group 91423244288 flags metadata <- ^C made it stop here [ 1076.189766] BTRFS info (device xvdb): relocating block group 91423244288 flags metadata [ 1081.300131] BTRFS info (device xvdb): found 6689 extents [ 1081.311138] BTRFS info (device xvdb): relocating block group 90349502464 flags data [ 1089.776066] BTRFS info (device xvdb): found 215 extents The above is with 4.19.118. Now I'm trying this again and looking just a little better at the logging, I see that what I thought (it always stopped after doing 1 block group) is not true. With ^C I can make it stop halfway and then next time it again starts at 91423244288. Related question: should, additionally, the btrfs balance in progs also be changed to catch the SIGINT while it's doing the balance ioctl, to prevent the signal from leaking to the kernel space? I mean, kernels with the bug are already 'out there' now... Thanks > So allow btrfs balance to check signal to determine if it should exit. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/relocation.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c > index 523d2e5fab8f..2b869fb2e62c 100644 > --- a/fs/btrfs/relocation.c > +++ b/fs/btrfs/relocation.c > @@ -2656,7 +2656,8 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end, > */ > int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info) > { > - return atomic_read(&fs_info->balance_cancel_req); > + return atomic_read(&fs_info->balance_cancel_req) || > + fatal_signal_pending(current); > } > ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE); > >
On 2020/7/7 上午2:19, Hans van Kranenburg wrote: > Hi, > > On 7/6/20 9:44 AM, Qu Wenruo wrote: >> Although btrfs balance can be canceled with "btrfs balance cancel" >> command, it's still almost muscle memory to press Ctrl-C to cancel a >> long running btrfs balance. > > Thanks for investigating all of this. > > Has it actually been unsafe (read: undefined behaviour) forever, or only > since some recent change? Forever. That -EINTR from metadata reservation path is there for a long long time. > > Or did it just by accident not cause real damage earlier on in the past? > > [ 1041.810963] BTRFS info (device xvdb): relocating block group > 91423244288 flags metadata > > <- ^C made it stop here > > [ 1076.189766] BTRFS info (device xvdb): relocating block group > 91423244288 flags metadata > [ 1081.300131] BTRFS info (device xvdb): found 6689 extents > [ 1081.311138] BTRFS info (device xvdb): relocating block group > 90349502464 flags data > [ 1089.776066] BTRFS info (device xvdb): found 215 extents > > The above is with 4.19.118. Now I'm trying this again and looking just a > little better at the logging, I see that what I thought (it always > stopped after doing 1 block group) is not true. With ^C I can make it > stop halfway and then next time it again starts at 91423244288. > > Related question: should, additionally, the btrfs balance in progs also > be changed to catch the SIGINT while it's doing the balance ioctl, to > prevent the signal from leaking to the kernel space? I mean, kernels > with the bug are already 'out there' now... It has no way to catch signal while trapped into kernel space. My previous assumption of the whole ioctl thing is still right, when we're in kernel space, we can't catch any signal. It's just the metadata reservation code manually checking the pending signal and return -EINTR. So even if we make btrfs-progs to catch that signal, it won't work. And even if it seems to work, it's because in btrfs module we're checking signal explicitly. Thanks, Qu > > Thanks > >> So allow btrfs balance to check signal to determine if it should exit. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/relocation.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c >> index 523d2e5fab8f..2b869fb2e62c 100644 >> --- a/fs/btrfs/relocation.c >> +++ b/fs/btrfs/relocation.c >> @@ -2656,7 +2656,8 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end, >> */ >> int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info) >> { >> - return atomic_read(&fs_info->balance_cancel_req); >> + return atomic_read(&fs_info->balance_cancel_req) || >> + fatal_signal_pending(current); >> } >> ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE); >> >> >
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 523d2e5fab8f..2b869fb2e62c 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2656,7 +2656,8 @@ int setup_extent_mapping(struct inode *inode, u64 start, u64 end, */ int btrfs_should_cancel_balance(struct btrfs_fs_info *fs_info) { - return atomic_read(&fs_info->balance_cancel_req); + return atomic_read(&fs_info->balance_cancel_req) || + fatal_signal_pending(current); } ALLOW_ERROR_INJECTION(btrfs_should_cancel_balance, TRUE);
Although btrfs balance can be canceled with "btrfs balance cancel" command, it's still almost muscle memory to press Ctrl-C to cancel a long running btrfs balance. So allow btrfs balance to check signal to determine if it should exit. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/relocation.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)