diff mbox series

[f2fs-dev] f2fs: avoid the deadlock case when stopping discard thread

Message ID 20240320001442.497813-1-jaegeuk@kernel.org (mailing list archive)
State New

Commit Message

Jaegeuk Kim March 20, 2024, 12:14 a.m. UTC
f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
 - mnt_want_write_file()
   - sb_start_write(SB_FREEZE_WRITE)
                                             - sb_start_intwrite(SB_FREEZE_FS);
 - f2fs_stop_checkpoint(sbi, false,            : waiting
    STOP_CP_REASON_SHUTDOWN);
 - f2fs_stop_discard_thread(sbi);
   - kthread_stop()
     : waiting

 - mnt_drop_write_file(filp);

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/segment.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Chao Yu March 20, 2024, 3:32 p.m. UTC | #1
On 2024/3/20 8:14, Jaegeuk Kim wrote:
> f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
>   - mnt_want_write_file()
>     - sb_start_write(SB_FREEZE_WRITE)
>                                               - sb_start_intwrite(SB_FREEZE_FS);
>   - f2fs_stop_checkpoint(sbi, false,            : waiting
>      STOP_CP_REASON_SHUTDOWN);
>   - f2fs_stop_discard_thread(sbi);
>     - kthread_stop()
>       : waiting
> 
>   - mnt_drop_write_file(filp);
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,
Hillf Danton March 21, 2024, 10:42 p.m. UTC | #2
On Tue, 19 Mar 2024 17:14:42 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
> f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
>  - mnt_want_write_file()
>    - sb_start_write(SB_FREEZE_WRITE)
	 __sb_start_write()
	   percpu_down_read()
>                                              - sb_start_intwrite(SB_FREEZE_FS);
						   __sb_start_write()
						     percpu_down_read()

Given lock acquirers for read on both sides, wtf deadlock are you fixing?

>  - f2fs_stop_checkpoint(sbi, false,            : waiting
>     STOP_CP_REASON_SHUTDOWN);
>  - f2fs_stop_discard_thread(sbi);
>    - kthread_stop()
>      : waiting
> 
>  - mnt_drop_write_file(filp);

More important, feel free to add in spin.

	Reported-by: "Light Hsieh (謝明燈)" <Light.Hsieh@mediatek.com>
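
Hillf's objection above can be illustrated with a small userspace model. This is a sketch, not kernel code: toy_rwsem, toy_sb and every toy_* helper are hypothetical names, and the single-threaded counters only approximate the per-level semaphores that sb_start_write() and sb_start_intwrite() take for read. It shows that read-side holders do not exclude each other; only a write-side holder (as taken on the freeze path via sb_wait_write()) does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for one freeze-level rw semaphore.
 * Hypothetical names; this is not the kernel's percpu_rw_semaphore. */
struct toy_rwsem {
	int readers;	/* current read-side holders */
	bool writer;	/* true while held for write */
};

/* Freeze levels, in the same order as the SB_FREEZE_* levels in the thread. */
enum { TOY_FREEZE_WRITE, TOY_FREEZE_PAGEFAULT, TOY_FREEZE_FS, TOY_LEVELS };

struct toy_sb {
	struct toy_rwsem rw_sem[TOY_LEVELS];
};

/* Read side: succeeds unless a writer currently holds this level. */
static bool toy_try_down_read(struct toy_sb *sb, int level)
{
	struct toy_rwsem *s = &sb->rw_sem[level];

	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_sb *sb, int level)
{
	sb->rw_sem[level].readers--;
}

/* Write side: succeeds only if nobody holds this level at all. */
static bool toy_try_down_write(struct toy_sb *sb, int level)
{
	struct toy_rwsem *s = &sb->rw_sem[level];

	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

/* Returns true if the ioctl path and the discard thread can both hold
 * their read-side references at the same time. */
static bool toy_readers_coexist(void)
{
	struct toy_sb sb = { 0 };
	bool ioctl_ok = toy_try_down_read(&sb, TOY_FREEZE_WRITE);  /* models sb_start_write() */
	bool discard_ok = toy_try_down_read(&sb, TOY_FREEZE_FS);   /* models sb_start_intwrite() */

	return ioctl_ok && discard_ok;
}
```

In this model toy_readers_coexist() returns true, which matches the point that the scenario in the commit message cannot block on its own; a stuck write-side holder is needed before the discard thread can wait forever.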
Jaegeuk Kim March 22, 2024, 12:29 a.m. UTC | #3
On 03/22, Hillf Danton wrote:
> On Tue, 19 Mar 2024 17:14:42 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
> > f2fs_ioc_shutdown(F2FS_GOING_DOWN_NOSYNC)  issue_discard_thread
> >  - mnt_want_write_file()
> >    - sb_start_write(SB_FREEZE_WRITE)
> 	 __sb_start_write()
> 	   percpu_down_read()
> >                                              - sb_start_intwrite(SB_FREEZE_FS);
> 						   __sb_start_write()
> 						     percpu_down_read()
> 
> Given lock acquirers for read on both sides, wtf deadlock are you fixing?

Damn. I didn't realize that the _write helpers take the semaphore for read.

> 
> >  - f2fs_stop_checkpoint(sbi, false,            : waiting
> >     STOP_CP_REASON_SHUTDOWN);
> >  - f2fs_stop_discard_thread(sbi);
> >    - kthread_stop()
> >      : waiting
> > 
> >  - mnt_drop_write_file(filp);
> 
> More important, feel free to add in spin.

I posted this patch before Light reported.

And, in the report, I didn't understand this call chain:

f2fs_ioc_shutdown() --> freeze_bdev() --> freeze_super() --> sb_wait_write(sb, SB_FREEZE_FS) --> ... --> percpu_down_write().

because f2fs_ioc_shutdown() calls f2fs_stop_discard_thread() only after thaw_bdev(),
in this order:

 -> freeze_bdev()
 -> thaw_bdev()
 -> f2fs_stop_discard_thread()

Am I missing something?

> 
> 	Reported-by: "Light Hsieh (謝明燈)" <Light.Hsieh@mediatek.com>
Hillf Danton March 22, 2024, 11:33 a.m. UTC | #4
On Thu, 21 Mar 2024 17:29:03 -0700 Jaegeuk Kim <jaegeuk@kernel.org>
> 
> I posted this patch before Light reported.

Yeah, his report's timestamp is 2024-03-20  6:59, nearly 7 hours later,
which shows that you constructed the deadlock independently of
his report.
> 
> And, in the report, I didn't get this:
> 
> f2fs_ioc_shutdown() --> freeze_bdev() --> freeze_super() --> sb_wait_write(sb, SB_FREEZE_FS) --> ... ->percpu_down_write().
> 
> because f2fs_ioc_shutdown() calls f2fs_stop_discard_thread() after thaw_bdev()
> like this order.
> 
>  -> freeze_bdev()
>  -> thaw_bdev()
>  -> f2fs_stop_discard_thread()
> 
> Am I missing something?

Light, could you give more detail to help Jaegeuk understand the deadlock you reported?
Jaegeuk Kim March 22, 2024, 10:24 p.m. UTC | #5
On 03/22, Light Hsieh (謝明燈) wrote:
> I don't see my added log in sb_freeze_unlock(), which will invoke percpu_up_write() to release the write semaphore.

May I ask for more details on whether thaw_super() was called or not?

Jaegeuk Kim March 26, 2024, 4:52 p.m. UTC | #6
On 03/22, Jaegeuk Kim wrote:
> On 03/22, Light Hsieh (謝明燈) wrote:
> > I don't see my added log in sb_freeze_unlock(), which will invoke percpu_up_write() to release the write semaphore.
> 
> May I ask for more details on whether thaw_super() was called or not?

Ping?

Jaegeuk Kim April 4, 2024, 7:55 p.m. UTC | #7
On 04/03, Light Hsieh (謝明燈) wrote:
> Our log shows that thaw_super_locked() finds that the sb is read-only, so sb_freeze_unlock() is not invoked.
> 
> static int thaw_super_locked(struct super_block *sb, enum freeze_holder who)
> {
> 	...
> 	if (sb_rdonly(sb)) {
> 		sb->s_writers.freeze_holders &= ~who;
> 		sb->s_writers.frozen = SB_UNFROZEN;
> 		wake_up_var(&sb->s_writers.frozen);
> 		goto out;
> 	}
> 	...
> 	sb_freeze_unlock(sb, SB_FREEZE_FS);
> out:
> 	deactivate_locked_super(sb);
> 	return 0;
> }

Thank you. Could you please take a look at this patch?

https://lore.kernel.org/linux-f2fs-devel/20240404195254.556896-1-jaegeuk@kernel.org/T/#u
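
The early exit Light describes can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: toy_sb and the toy_* functions are hypothetical, the write side of the SB_FREEZE_FS semaphore is reduced to a single flag, and the model assumes the superblock became read-only between freeze and thaw (the thread does not say how). It shows that a thaw which takes the read-only branch leaves the write side held, so a later read-side attempt cannot succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for a superblock's freeze state. */
struct toy_sb {
	bool rdonly;		/* models sb_rdonly(sb) */
	bool frozen;		/* models s_writers.frozen != SB_UNFROZEN */
	bool fs_write_held;	/* models the SB_FREEZE_FS semaphore held for write */
};

/* Freeze: take the write side and mark the sb frozen. */
static void toy_freeze(struct toy_sb *sb)
{
	sb->fs_write_held = true;
	sb->frozen = true;
}

/* Thaw, following the shape of the quoted thaw_super_locked():
 * the read-only branch clears the frozen state and returns early,
 * without releasing the write side. */
static void toy_thaw(struct toy_sb *sb)
{
	if (sb->rdonly) {
		sb->frozen = false;
		return;			/* the unlock step below is skipped */
	}
	sb->fs_write_held = false;	/* models sb_freeze_unlock() */
	sb->frozen = false;
}

/* Read-side trylock: fails while the write side is still held. */
static bool toy_start_intwrite_trylock(const struct toy_sb *sb)
{
	return !sb->fs_write_held;
}

/* Run one freeze/thaw cycle and report whether a reader can get in after it.
 * ASSUMPTION: the sb may turn read-only while it is frozen. */
static bool toy_reader_can_enter_after_cycle(bool turns_rdonly)
{
	struct toy_sb sb = { 0 };

	toy_freeze(&sb);
	sb.rdonly = turns_rdonly;
	toy_thaw(&sb);
	return toy_start_intwrite_trylock(&sb);
}
```

In the model, toy_reader_can_enter_after_cycle(false) is true and toy_reader_can_enter_after_cycle(true) is false. A blocking read-side acquire in the second case would wait indefinitely, which is consistent with a discard thread that never reaches its stop check.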

Jaegeuk Kim April 12, 2024, 8:50 p.m. UTC | #8
On 04/12, Light Hsieh (謝明燈) wrote:
> I think 'readon' in this line may be a typo for 'reason'.

Was fixed as well. Thanks.

> 
> +		f2fs_warn(sbi, "Stopped filesystem due to readon: %d", reason);
> 
> 

Patch

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 4fd76e867e0a..088b8c48cffa 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1923,7 +1923,9 @@  static int issue_discard_thread(void *data)
 			continue;
 		}
 
-		sb_start_intwrite(sbi->sb);
+		/* Avoid the deadlock from F2FS_GOING_DOWN_NOSYNC. */
+		if (!sb_start_intwrite_trylock(sbi->sb))
+			continue;
 
 		issued = __issue_discard_cmd(sbi, &dpolicy);
 		if (issued > 0) {
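
The effect of replacing sb_start_intwrite() with sb_start_intwrite_trylock() plus continue can be sketched with a userspace model of the loop. Everything here is hypothetical (toy_worker, toy_run and their fields are not f2fs names), and the model assumes the thread loop re-checks a stop request on every iteration, as kthread loops commonly do. It contrasts a blocking acquire, which never returns while the write side is stuck, with a trylock that skips the round and lets the loop notice the stop request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a periodic worker such as a discard thread. */
struct toy_worker {
	bool write_side_held;	/* freeze-level semaphore stuck for write */
	int stop_after;		/* iteration at which a stop request arrives */
	int issued;		/* rounds of real work performed */
};

/* Runs the loop for at most max_iters rounds.
 * Returns the iteration at which the stop request was observed,
 * -1 if a blocking acquire would have waited forever,
 * or max_iters if no stop request arrived in time. */
static int toy_run(struct toy_worker *w, bool use_trylock, int max_iters)
{
	for (int i = 0; i < max_iters; i++) {
		/* ASSUMPTION: the stop check runs once per iteration. */
		if (i >= w->stop_after)
			return i;

		if (w->write_side_held) {
			if (!use_trylock)
				return -1;	/* blocking acquire never returns */
			continue;		/* trylock failed: skip this round */
		}

		w->issued++;			/* stands in for issuing discards */
	}
	return max_iters;
}
```

With write_side_held set and stop_after = 3, the blocking variant reports -1 while the trylock variant returns 3 with issued still 0: no work is done while the semaphore is unavailable, but the worker remains stoppable.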