[v3] btrfs: interrupt fstrim if the current process is freezing

Message ID eeffae0b8beecb3406f43ff48e788fd9d88fb2e2.1724971143.git.wqu@suse.com (mailing list archive)
State New, archived

Commit Message

Qu Wenruo Aug. 29, 2024, 10:39 p.m. UTC
[BUG]
There is a bug report that running fstrim prevents the system from
hibernating, resulting in the following dmesg:

 PM: suspend entry (deep)
 Filesystems sync: 0.060 seconds
 Freezing user space processes
 Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
 task:fstrim          state:D stack:0     pid:15564 tgid:15564 ppid:1      flags:0x00004006
 Call Trace:
  <TASK>
  __schedule+0x381/0x1540
  schedule+0x24/0xb0
  schedule_timeout+0x1ea/0x2a0
  io_schedule_timeout+0x19/0x50
  wait_for_completion_io+0x78/0x140
  submit_bio_wait+0xaa/0xc0
  blkdev_issue_discard+0x65/0xb0
  btrfs_issue_discard+0xcf/0x160 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  btrfs_discard_extent+0x120/0x2a0 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  do_trimming+0xd4/0x220 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  trim_bitmaps+0x418/0x520 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  btrfs_trim_block_group+0xcb/0x130 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  btrfs_trim_fs+0x119/0x460 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  btrfs_ioctl_fitrim+0xfb/0x160 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  btrfs_ioctl+0x11cc/0x29f0 [btrfs 7ab35b9b86062a46f6ff578bb32d55ecf8e6bf82]
  __x64_sys_ioctl+0x92/0xd0
  do_syscall_64+0x5b/0x80
  entry_SYSCALL_64_after_hwframe+0x7c/0xe6
 RIP: 0033:0x7f5f3b529f9b
 RSP: 002b:00007fff279ebc20 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00007fff279ebd60 RCX: 00007f5f3b529f9b
 RDX: 00007fff279ebc90 RSI: 00000000c0185879 RDI: 0000000000000003
 RBP: 000055748718b2d0 R08: 00005574871899e8 R09: 00007fff279eb010
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
 R13: 000055748718ac40 R14: 000055748718b290 R15: 000055748718b290
  </TASK>
 OOM killer enabled.
 Restarting tasks ... done.
 random: crng reseeded on system resumption
 PM: suspend exit
 PM: suspend entry (s2idle)
 Filesystems sync: 0.047 seconds

[CAUSE]
The PM code freezes all user space processes before entering
hibernation/suspend, but a user space process that has trapped into the
kernel for a long-running operation cannot be frozen while it is still
inside the kernel.

Normally those long-running operations check for fatal signals and exit
early, but freezing user space processes is not done through signals; it
uses a different infrastructure.

Unfortunately, for fstrim btrfs only checks for fatal signals, not
whether the current task is being frozen.

[FIX]
For now, just do the extra freezing() check on a per-block-group basis.

Reported-by: Rolf Wentland <R.Wentland@gmx.de>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v3:
- Only check the freezing status for fstrim
  David still has concerns about all the other long-running operations.

v2:
- Rename the helper to btrfs_task_interrupted()
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

David Sterba Aug. 30, 2024, 6:51 p.m. UTC | #1
On Fri, Aug 30, 2024 at 08:09:11AM +0930, Qu Wenruo wrote:
> [FIX]
> For now just do the extra freezing() check at a per-block-group basis.
> 
> Reported-by: Rolf Wentland <R.Wentland@gmx.de>
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Qu Wenruo <wqu@suse.com>

As a quick fix it's OK. I hope there's some way to support freezing of
the ioctls; try_to_freeze() or schedule() at the right time could work.

Reviewed-by: David Sterba <dsterba@suse.com>

Qu Wenruo Sept. 2, 2024, 9:20 a.m. UTC | #2
On 2024/8/31 04:21, David Sterba wrote:
> On Fri, Aug 30, 2024 at 08:09:11AM +0930, Qu Wenruo wrote:
>> [FIX]
>> For now just do the extra freezing() check at a per-block-group basis.
>>
>> Reported-by: Rolf Wentland <R.Wentland@gmx.de>
>> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> As a quick fix it's ok, I hope there's some way to support freezing of
> the ioctls, try_to_freeze() or schedule() at the right time could work.
>
> Reviewed-by: David Sterba <dsterba@suse.com>
>

Please drop this one.

The change itself is not enough. Furthermore, Luca Stefani sent a better
version with extra handling inside the free extent discarding code,
which looks more like the root cause of the problem (free extent
discarding has no size limit).

Thanks,
Qu
Patch

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index feec49e6f9c8..1768628d68da 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5,6 +5,7 @@ 
 
 #include <linux/sched.h>
 #include <linux/sched/signal.h>
+#include <linux/freezer.h>
 #include <linux/pagemap.h>
 #include <linux/writeback.h>
 #include <linux/blkdev.h>
@@ -6459,7 +6460,7 @@  static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed)
 		start += len;
 		*trimmed += bytes;
 
-		if (fatal_signal_pending(current)) {
+		if (fatal_signal_pending(current) || freezing(current)) {
 			ret = -ERESTARTSYS;
 			break;
 		}