diff mbox series

[RFC,RESEND] fs: fix a hungtask problem when freeze/unfreeze fs

Message ID 20201226095641.17290-1-luoshijie1@huawei.com (mailing list archive)
State New, archived
Headers show
Series [RFC,RESEND] fs: fix a hungtask problem when freeze/unfreeze fs | expand

Commit Message

Shijie Luo Dec. 26, 2020, 9:56 a.m. UTC
We found a hungtask problem as described following:
Running xfstests generic/390 with ext4 filesystem, and simutaneously 
offline/onlines the disk we tested. It will cause a hungtask problem 
whose call trace is like this,

[369.857104] INFO: task fsstress:11672 blocked for more than 120 seconds.
[  369.875724] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.885168] fsstress        D    0 11672  11625 0x00000080
[  369.885169] Call Trace:
[  369.885171]  ? __schedule+0x2fc/0x930
[  369.885173]  ? filename_parentat+0x10b/0x1a0
[  369.885175]  schedule+0x28/0x70
[  369.885176]  rwsem_down_read_failed+0x102/0x1c0
[  369.885178]  ? __percpu_down_read+0x93/0xb0
[  369.885179]  __percpu_down_read+0x93/0xb0
[  369.885182]  __sb_start_write+0x5f/0x70
[  369.885183]  mnt_want_write+0x20/0x50
[  369.885184]  do_renameat2+0x1f3/0x550
[  369.885186]  __x64_sys_rename+0x1c/0x20
[  369.885187]  do_syscall_64+0x5b/0x1b0
[  369.885188]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  369.885189] RIP: 0033:0x7f5e6e34ccb7
[  369.885190] Code: Bad RIP value.
[  369.885191] RSP: 002b:00007ffef4a83788 EFLAGS: 00000206 ORIG_RAX: 0000000000000052
[  369.885191] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5e6e34ccb7
[  369.885192] RDX: 0000000000000000 RSI: 0000000001b09500 RDI: 0000000001b09f90
[  369.885192] RBP: 0000000000000000 R08: 0000000000000021 R09: 0000000000000000
[  369.885193] R10: 0000000000000692 R11: 0000000000000206 R12: 00007ffef4a83a30
[  369.885193] R13: 00007ffef4a83a40 R14: 00007ffef4a83a40 R15: 0000000000000001

The root cause is that when offline/onlines disks, the filesystem can easily get into 
a error state and this makes it change to be read-only. Function freeze_super() will hold 
all sb_writers rwsems including rwsem of SB_FREEZE_WRITE when filesystem not read-only, 
but thaw_super_locked() cannot release these while the filesystem suddenly become read-only, 
because the logic will go to out.

freeze_super
    hold sb_writers rwsems
        sb->s_writers.frozen = SB_FREEZE_COMPLETE
                                                 thaw_super_locked
                                                     sb_rdonly
                                                        sb->s_writers.frozen = SB_UNFROZEN;
                                                            goto out // not release rwsems

And at this time, if we call mnt_want_write(), the process will be blocked.

This patch fixes this problem, when filesystem is read-only, just not to set sb_writers.frozen 
be SB_FREEZE_COMPLETE in freeze_super() and then release all rwsems in thaw_super_locked.

Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: yangerkun <yangerkun@huawei.com>

Fix some descriptions errors and resend the patch.
---
 fs/super.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

Comments

Al Viro Dec. 26, 2020, 3:55 p.m. UTC | #1
On Sat, Dec 26, 2020 at 04:56:41AM -0500, Shijie Luo wrote:

> The root cause is that when offline/onlines disks, the filesystem can easily get into 
> a error state and this makes it change to be read-only. Function freeze_super() will hold 
> all sb_writers rwsems including rwsem of SB_FREEZE_WRITE when filesystem not read-only, 
> but thaw_super_locked() cannot release these while the filesystem suddenly become read-only, 
> because the logic will go to out.
> 
> freeze_super
>     hold sb_writers rwsems
>         sb->s_writers.frozen = SB_FREEZE_COMPLETE
>                                                  thaw_super_locked
>                                                      sb_rdonly
>                                                         sb->s_writers.frozen = SB_UNFROZEN;
>                                                             goto out // not release rwsems
> 
> And at this time, if we call mnt_want_write(), the process will be blocked.
> 
> This patch fixes this problem, when filesystem is read-only, just not to set sb_writers.frozen 
> be SB_FREEZE_COMPLETE in freeze_super() and then release all rwsems in thaw_super_locked.

I really don't like that - you end up with a case when freeze_super() returns 0 *and*
consumes the reference it had been give.

>  	if (sb_rdonly(sb)) {
> -		/* Nothing to do really... */
> -		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> -		up_write(&sb->s_umount);
> +		deactivate_locked_super(sb);
>  		return 0;
>  	}
Shijie Luo Dec. 28, 2020, 2:15 a.m. UTC | #2
On 2020/12/26 23:55, Al Viro wrote:
> On Sat, Dec 26, 2020 at 04:56:41AM -0500, Shijie Luo wrote:
>
>> The root cause is that when offline/onlines disks, the filesystem can easily get into
>> a error state and this makes it change to be read-only. Function freeze_super() will hold
>> all sb_writers rwsems including rwsem of SB_FREEZE_WRITE when filesystem not read-only,
>> but thaw_super_locked() cannot release these while the filesystem suddenly become read-only,
>> because the logic will go to out.
>>
>> freeze_super
>>      hold sb_writers rwsems
>>          sb->s_writers.frozen = SB_FREEZE_COMPLETE
>>                                                   thaw_super_locked
>>                                                       sb_rdonly
>>                                                          sb->s_writers.frozen = SB_UNFROZEN;
>>                                                              goto out // not release rwsems
>>
>> And at this time, if we call mnt_want_write(), the process will be blocked.
>>
>> This patch fixes this problem, when filesystem is read-only, just not to set sb_writers.frozen
>> be SB_FREEZE_COMPLETE in freeze_super() and then release all rwsems in thaw_super_locked.
> I really don't like that - you end up with a case when freeze_super() returns 0 *and*
> consumes the reference it had been give.

Consuming the reference here because we won't "set frozen = 
SB_FREEZE_COMPLETE" in thaw_super_locked() now.

If don't do that, we never have a chance to consume it, 
thaw_super_locked() will directly return "-EINVAL". But I do

agree that it's not a good idea to return 0. How about returning 
"-EINVAL or -ENOTSUPP" ?

And, If we keep "frozen = SB_FREEZE_COMPLETE" in freeze_super() here, it 
will cause another problem, thaw_super_locked()

will always release rwsems in my logic, but freeze_super() won't acquire 
the rwsems when filesystem is read-only.

Thanks.

>>   	if (sb_rdonly(sb)) {
>> -		/* Nothing to do really... */
>> -		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
>> -		up_write(&sb->s_umount);
>> +		deactivate_locked_super(sb);
>>   		return 0;
>>   	}
> .
>
Jan Kara Jan. 4, 2021, 4:04 p.m. UTC | #3
On Mon 28-12-20 10:15:16, Shijie Luo wrote:
> 
> On 2020/12/26 23:55, Al Viro wrote:
> > On Sat, Dec 26, 2020 at 04:56:41AM -0500, Shijie Luo wrote:
> > 
> > > The root cause is that when offline/onlines disks, the filesystem can easily get into
> > > a error state and this makes it change to be read-only. Function freeze_super() will hold
> > > all sb_writers rwsems including rwsem of SB_FREEZE_WRITE when filesystem not read-only,
> > > but thaw_super_locked() cannot release these while the filesystem suddenly become read-only,
> > > because the logic will go to out.
> > > 
> > > freeze_super
> > >      hold sb_writers rwsems
> > >          sb->s_writers.frozen = SB_FREEZE_COMPLETE
> > >                                                   thaw_super_locked
> > >                                                       sb_rdonly
> > >                                                          sb->s_writers.frozen = SB_UNFROZEN;
> > >                                                              goto out // not release rwsems
> > > 
> > > And at this time, if we call mnt_want_write(), the process will be blocked.
> > > 
> > > This patch fixes this problem, when filesystem is read-only, just not to set sb_writers.frozen
> > > be SB_FREEZE_COMPLETE in freeze_super() and then release all rwsems in thaw_super_locked.
> > I really don't like that - you end up with a case when freeze_super() returns 0 *and*
> > consumes the reference it had been give.
> 
> Consuming the reference here because we won't "set frozen =
> SB_FREEZE_COMPLETE" in thaw_super_locked() now.
> 
> If don't do that, we never have a chance to consume it, thaw_super_locked()
> will directly return "-EINVAL". But I do
> 
> agree that it's not a good idea to return 0. How about returning "-EINVAL or
> -ENOTSUPP" ?
> 
> And, If we keep "frozen = SB_FREEZE_COMPLETE" in freeze_super() here, it
> will cause another problem, thaw_super_locked()
> 
> will always release rwsems in my logic, but freeze_super() won't acquire the
> rwsems when filesystem is read-only.

I was thinking about this for a while. I think the best solution would be
to track whether the fs was read only (and thus frozen without locking
percpu semaphores) at the time of freezing. I'm attaching that patch. Does
it fix your problem?

									Honza
Shijie Luo Jan. 5, 2021, 2:48 a.m. UTC | #4
Hi!

On 2021/1/5 0:04, Jan Kara wrote:
>> Consuming the reference here because we won't "set frozen =
>> SB_FREEZE_COMPLETE" in thaw_super_locked() now.
>>
>> If don't do that, we never have a chance to consume it, thaw_super_locked()
>> will directly return "-EINVAL". But I do
>>
>> agree that it's not a good idea to return 0. How about returning "-EINVAL or
>> -ENOTSUPP" ?
>>
>> And, If we keep "frozen = SB_FREEZE_COMPLETE" in freeze_super() here, it
>> will cause another problem, thaw_super_locked()
>>
>> will always release rwsems in my logic, but freeze_super() won't acquire the
>> rwsems when filesystem is read-only.
> I was thinking about this for a while. I think the best solution would be
> to track whether the fs was read only (and thus frozen without locking
> percpu semaphores) at the time of freezing. I'm attaching that patch. Does
> it fix your problem?
>
> 									Honza

I tested your patch, and it can indeed fix this deadlock problem.

Thanks.
Jan Kara Jan. 5, 2021, 1:46 p.m. UTC | #5
On Tue 05-01-21 10:48:48, Shijie Luo wrote:
> Hi!
> 
> On 2021/1/5 0:04, Jan Kara wrote:
> > > Consuming the reference here because we won't "set frozen =
> > > SB_FREEZE_COMPLETE" in thaw_super_locked() now.
> > > 
> > > If don't do that, we never have a chance to consume it, thaw_super_locked()
> > > will directly return "-EINVAL". But I do
> > > 
> > > agree that it's not a good idea to return 0. How about returning "-EINVAL or
> > > -ENOTSUPP" ?
> > > 
> > > And, If we keep "frozen = SB_FREEZE_COMPLETE" in freeze_super() here, it
> > > will cause another problem, thaw_super_locked()
> > > 
> > > will always release rwsems in my logic, but freeze_super() won't acquire the
> > > rwsems when filesystem is read-only.
> > I was thinking about this for a while. I think the best solution would be
> > to track whether the fs was read only (and thus frozen without locking
> > percpu semaphores) at the time of freezing. I'm attaching that patch. Does
> > it fix your problem?
> > 
> > 									Honza
> 
> I tested your patch, and it can indeed fix this deadlock problem.

Thanks for testing! I've sent the patch to Al for inclusion.

								Honza
diff mbox series

Patch

diff --git a/fs/super.c b/fs/super.c
index 2c6cdea2ab2d..50d79213f678 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1672,9 +1672,7 @@  int freeze_super(struct super_block *sb)
 	}
 
 	if (sb_rdonly(sb)) {
-		/* Nothing to do really... */
-		sb->s_writers.frozen = SB_FREEZE_COMPLETE;
-		up_write(&sb->s_umount);
+		deactivate_locked_super(sb);
 		return 0;
 	}
 
@@ -1733,13 +1731,11 @@  static int thaw_super_locked(struct super_block *sb)
 		return -EINVAL;
 	}
 
-	if (sb_rdonly(sb)) {
-		sb->s_writers.frozen = SB_UNFROZEN;
-		goto out;
-	}
-
 	lockdep_sb_freeze_acquire(sb);
 
+	if (sb_rdonly(sb))
+		goto out;
+
 	if (sb->s_op->unfreeze_fs) {
 		error = sb->s_op->unfreeze_fs(sb);
 		if (error) {
@@ -1751,9 +1747,9 @@  static int thaw_super_locked(struct super_block *sb)
 		}
 	}
 
+out:
 	sb->s_writers.frozen = SB_UNFROZEN;
 	sb_freeze_unlock(sb);
-out:
 	wake_up(&sb->s_writers.wait_unfrozen);
 	deactivate_locked_super(sb);
 	return 0;