diff mbox series

[1/4] btrfs: mark resumed async balance as writing

Message ID 65730df62341500bfcbde7d86eeaa3e9b15f1bcb.1646983176.git.naohiro.aota@wdc.com (mailing list archive)
State New, archived
Headers show
Series protect relocation with sb_start_write | expand

Commit Message

Naohiro Aota March 11, 2022, 7:38 a.m. UTC
When btrfs balance is interrupted with umount, the background balance
resumes on the next mount. There is a potential deadlock with FS freezing
here like as described in commit 26559780b953 ("btrfs: zoned: mark
relocation as writing").

Mark the process as sb_writing. To preserve the order of sb_start_write()
(or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
at btrfs_resume_balance_async() before taking fs_info->super_lock.

Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
Cc: stable@vger.kernel.org
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Filipe Manana March 11, 2022, 2:08 p.m. UTC | #1
On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> When btrfs balance is interrupted with umount, the background balance
> resumes on the next mount. There is a potential deadlock with FS freezing
> here like as described in commit 26559780b953 ("btrfs: zoned: mark
> relocation as writing").
> 
> Mark the process as sb_writing. To preserve the order of sb_start_write()
> (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> at btrfs_resume_balance_async() before taking fs_info->super_lock.
> 
> Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")

This seems odd to me. I read the note you left on the cover letter about
this, but honestly I don't think it's fair to blame that commit. I see
it more as btrfs specific problem.

Plus it's a 10 years old commit, so instead of the Fixes tag, adding a
minimal kernel version to the CC stable tag below makes more sense.

> Cc: stable@vger.kernel.org
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  fs/btrfs/volumes.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1be7cb2f955f..0d27d8d35c7a 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
>  	if (fs_info->balance_ctl)
>  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
>  	mutex_unlock(&fs_info->balance_mutex);
> +	sb_end_write(fs_info->sb);
>  
>  	return ret;
>  }
> @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
>  		return 0;
>  	}
>  
> +	sb_start_write(fs_info->sb);

I don't understand this.

We are doing the sb_start_write() here, in the task doing the mount, and then
we do the sb_end_write() at the kthread that runs balance_kthread().

Why not do the sb_start_write() in the kthread?

This is also buggy in the case the call below to kthread_run() fails, as
we end up never calling sb_end_write().

Thanks.

>  	spin_lock(&fs_info->super_lock);
>  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
>  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> -- 
> 2.35.1
>
Naohiro Aota March 14, 2022, 2:29 a.m. UTC | #2
On Fri, Mar 11, 2022 at 02:08:37PM +0000, Filipe Manana wrote:
> On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> > When btrfs balance is interrupted with umount, the background balance
> > resumes on the next mount. There is a potential deadlock with FS freezing
> > here like as described in commit 26559780b953 ("btrfs: zoned: mark
> > relocation as writing").
> > 
> > Mark the process as sb_writing. To preserve the order of sb_start_write()
> > (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> > at btrfs_resume_balance_async() before taking fs_info->super_lock.
> > 
> > Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
> 
> This seems odd to me. I read the note you left on the cover letter about
> this, but honestly I don't think it's fair to blame that commit. I see
> it more as btrfs specific problem.

Yeah, I was really not sure how I should write the tag. The issue is
we missed to add sb_start_write() after this commit.

> Plus it's a 10 years old commit, so instead of the Fixes tag, adding a
> minimal kernel version to the CC stable tag below makes more sense.

So, only with "Cc: stable@vger.kernel.org # 3.6+" ?

> > Cc: stable@vger.kernel.org
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> >  fs/btrfs/volumes.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 1be7cb2f955f..0d27d8d35c7a 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
> >  	if (fs_info->balance_ctl)
> >  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
> >  	mutex_unlock(&fs_info->balance_mutex);
> > +	sb_end_write(fs_info->sb);
> >  
> >  	return ret;
> >  }
> > @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
> >  		return 0;
> >  	}
> >  
> > +	sb_start_write(fs_info->sb);
> 
> I don't understand this.
> 
> We are doing the sb_start_write() here, in the task doing the mount, and then
> we do the sb_end_write() at the kthread that runs balance_kthread().

Oops, I made a mistake here. It actually printed the lockdep warning
"lock held when returning to user space!".

> Why not do the sb_start_write() in the kthread?
> 
> This is also buggy in the case the call below to kthread_run() fails, as
> we end up never calling sb_end_write().

I was trying to preserve the lock taking order: sb_start_write() ->
spin_lock(fs_info->super_lock). But, it might not be a big deal as
long as we don't call sb_start_write() in the super_lock.

> Thanks.
> 
> >  	spin_lock(&fs_info->super_lock);
> >  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
> >  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> > -- 
> > 2.35.1
> >
Filipe Manana March 14, 2022, 11:25 a.m. UTC | #3
On Mon, Mar 14, 2022 at 02:29:22AM +0000, Naohiro Aota wrote:
> On Fri, Mar 11, 2022 at 02:08:37PM +0000, Filipe Manana wrote:
> > On Fri, Mar 11, 2022 at 04:38:02PM +0900, Naohiro Aota wrote:
> > > When btrfs balance is interrupted with umount, the background balance
> > > resumes on the next mount. There is a potential deadlock with FS freezing
> > > here like as described in commit 26559780b953 ("btrfs: zoned: mark
> > > relocation as writing").
> > > 
> > > Mark the process as sb_writing. To preserve the order of sb_start_write()
> > > (or mnt_want_write_file()) and btrfs_exclop_start(), call sb_start_write()
> > > at btrfs_resume_balance_async() before taking fs_info->super_lock.
> > > 
> > > Fixes: 5accdf82ba25 ("fs: Improve filesystem freezing handling")
> > 
> > This seems odd to me. I read the note you left on the cover letter about
> > this, but honestly I don't think it's fair to blame that commit. I see
> > it more as btrfs specific problem.
> 
> Yeah, I was really not sure how I should write the tag. The issue is
> we missed to add sb_start_write() after this commit.
> 
> > Plus it's a 10 years old commit, so instead of the Fixes tag, adding a
> > minimal kernel version to the CC stable tag below makes more sense.
> 
> So, only with "Cc: stable@vger.kernel.org # 3.6+" ?

Looking at kernel.org the oldest stable kernel is 4.9, so anything older
than that is pointless.

> 
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > > ---
> > >  fs/btrfs/volumes.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > index 1be7cb2f955f..0d27d8d35c7a 100644
> > > --- a/fs/btrfs/volumes.c
> > > +++ b/fs/btrfs/volumes.c
> > > @@ -4443,6 +4443,7 @@ static int balance_kthread(void *data)
> > >  	if (fs_info->balance_ctl)
> > >  		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
> > >  	mutex_unlock(&fs_info->balance_mutex);
> > > +	sb_end_write(fs_info->sb);
> > >  
> > >  	return ret;
> > >  }
> > > @@ -4463,6 +4464,7 @@ int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
> > >  		return 0;
> > >  	}
> > >  
> > > +	sb_start_write(fs_info->sb);
> > 
> > I don't understand this.
> > 
> > We are doing the sb_start_write() here, in the task doing the mount, and then
> > we do the sb_end_write() at the kthread that runs balance_kthread().
> 
> Oops, I made a mistake here. It actually printed the lockdep warning
> "lock held when returning to user space!".
> 
> > Why not do the sb_start_write() in the kthread?
> > 
> > This is also buggy in the case the call below to kthread_run() fails, as
> > we end up never calling sb_end_write().
> 
> I was trying to preserve the lock taking order: sb_start_write() ->
> spin_lock(fs_info->super_lock). But, it might not be a big deal as
> long as we don't call sb_start_write() in the super_lock.
> 
> > Thanks.
> > 
> > >  	spin_lock(&fs_info->super_lock);
> > >  	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
> > >  	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;
> > > -- 
> > > 2.35.1
> > >
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1be7cb2f955f..0d27d8d35c7a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4443,6 +4443,7 @@  static int balance_kthread(void *data)
 	if (fs_info->balance_ctl)
 		ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL);
 	mutex_unlock(&fs_info->balance_mutex);
+	sb_end_write(fs_info->sb);
 
 	return ret;
 }
@@ -4463,6 +4464,7 @@  int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info)
 		return 0;
 	}
 
+	sb_start_write(fs_info->sb);
 	spin_lock(&fs_info->super_lock);
 	ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE_PAUSED);
 	fs_info->exclusive_operation = BTRFS_EXCLOP_BALANCE;