diff mbox series

xfs: acquire superblock freeze protection on eofblocks scans

Message ID 20200408122119.33869-1-bfoster@redhat.com (mailing list archive)
State Accepted
Headers show
Series xfs: acquire superblock freeze protection on eofblocks scans | expand

Commit Message

Brian Foster April 8, 2020, 12:21 p.m. UTC
The filesystem freeze sequence in XFS waits on any background
eofblocks or cowblocks scans to complete before the filesystem is
quiesced. At this point, the freezer has already stopped the
transaction subsystem, however, which means a truncate or cowblock
cancellation in progress is likely blocked in transaction
allocation. This results in a deadlock between freeze and the
associated scanner.

Fix this problem by holding superblock write protection across calls
into the block reapers. Since protection for background scans is
acquired from the workqueue task context, trylock to avoid a similar
deadlock between freeze and blocking on the write lock.

Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
Reported-by: Paul Furtado <paulfurtado91@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Note that this has the opposite tradeoff as the approach I originally
posited [1], specifically that the eofblocks ioctl() now always blocks
on a frozen fs rather than return -EAGAIN. It's worth pointing out that
the eofb control structure has a sync flag (that is not used for
background scans), so yet another approach could be to tie the trylock
to that.

Brian

[1] https://lore.kernel.org/linux-xfs/20200407163739.GG28936@bfoster/

 fs/xfs/xfs_icache.c | 10 ++++++++++
 fs/xfs/xfs_ioctl.c  |  5 ++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Chandan Rajendra April 8, 2020, 2:54 p.m. UTC | #1
On Wednesday, April 8, 2020 5:51 PM Brian Foster wrote: 
> The filesystem freeze sequence in XFS waits on any background
> eofblocks or cowblocks scans to complete before the filesystem is
> quiesced. At this point, the freezer has already stopped the
> transaction subsystem, however, which means a truncate or cowblock
> cancellation in progress is likely blocked in transaction
> allocation. This results in a deadlock between freeze and the
> associated scanner.
> 
> Fix this problem by holding superblock write protection across calls
> into the block reapers. Since protection for background scans is
> acquired from the workqueue task context, trylock to avoid a similar
> deadlock between freeze and blocking on the write lock.

|-------------------------------------+---------------------------------|
| fsfreeze                            | eof blocks reaper               |
|-------------------------------------+---------------------------------|
| Set sb frozen state to SB_FREEZE_FS |                                 |
|                                     | Start periodic execution        |
|                                     | xfs_trans_alloc()               |
|                                     | - sb_start_intwrite()           |
|                                     |   Wait for frozen state to      |
|                                     |   return to < SB_UNFROZEN state |
| xfs_stop_block_reaping()            |                                 |
| - Wait for eof worker to finish     |                                 |
|-------------------------------------+---------------------------------|

If we add a blocking lock invocation at the beginning of eof blocks reaper,
then fsfreeze would get blocked at cancel_delayed_work_sync().

However using a trylock, "eof blocks reaper" would return back due to failure
in obtaining the lock and hence it is guaranteed that fsfreeze will make progress.

Hence the changes are logically correct.

Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com>

> 
> Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
> Reported-by: Paul Furtado <paulfurtado91@gmail.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Note that this has the opposite tradeoff as the approach I originally
> posited [1], specifically that the eofblocks ioctl() now always blocks
> on a frozen fs rather than return -EAGAIN. It's worth pointing out that
> the eofb control structure has a sync flag (that is not used for
> background scans), so yet another approach could be to tie the trylock
> to that.
> 
> Brian
> 
> [1] https://lore.kernel.org/linux-xfs/20200407163739.GG28936@bfoster/
> 
>  fs/xfs/xfs_icache.c | 10 ++++++++++
>  fs/xfs/xfs_ioctl.c  |  5 ++++-
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index a7be7a9e5c1a..8bf1d15be3f6 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -911,7 +911,12 @@ xfs_eofblocks_worker(
>  {
>  	struct xfs_mount *mp = container_of(to_delayed_work(work),
>  				struct xfs_mount, m_eofblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>  	xfs_icache_free_eofblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>  	xfs_queue_eofblocks(mp);
>  }
>  
> @@ -938,7 +943,12 @@ xfs_cowblocks_worker(
>  {
>  	struct xfs_mount *mp = container_of(to_delayed_work(work),
>  				struct xfs_mount, m_cowblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>  	xfs_icache_free_cowblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>  	xfs_queue_cowblocks(mp);
>  }
>  
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index cdfb3cd9a25b..309958186d33 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -2363,7 +2363,10 @@ xfs_file_ioctl(
>  		if (error)
>  			return error;
>  
> -		return xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_start_write(mp->m_super);
> +		error = xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_end_write(mp->m_super);
> +		return error;
>  	}
>  
>  	default:
>
Christoph Hellwig April 9, 2020, 7:40 a.m. UTC | #2
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Allison Henderson April 9, 2020, 6:19 p.m. UTC | #3
On 4/8/20 5:21 AM, Brian Foster wrote:
> The filesystem freeze sequence in XFS waits on any background
> eofblocks or cowblocks scans to complete before the filesystem is
> quiesced. At this point, the freezer has already stopped the
> transaction subsystem, however, which means a truncate or cowblock
> cancellation in progress is likely blocked in transaction
> allocation. This results in a deadlock between freeze and the
> associated scanner.
> 
> Fix this problem by holding superblock write protection across calls
> into the block reapers. Since protection for background scans is
> acquired from the workqueue task context, trylock to avoid a similar
> deadlock between freeze and blocking on the write lock.
> 
> Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
> Reported-by: Paul Furtado <paulfurtado91@gmail.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
OK, looks good:
Reviewed-by: Allison Collins <allison.henderson@oracle.com>

> ---
> 
> Note that this has the opposite tradeoff as the approach I originally
> posited [1], specifically that the eofblocks ioctl() now always blocks
> on a frozen fs rather than return -EAGAIN. It's worth pointing out that
> the eofb control structure has a sync flag (that is not used for
> background scans), so yet another approach could be to tie the trylock
> to that.
> 
> Brian
> 
> [1] https://urldefense.com/v3/__https://lore.kernel.org/linux-xfs/20200407163739.GG28936@bfoster/__;!!GqivPVa7Brio!NNOb1nQFma-Q2kltH-cEBh_IdUSxLRairJB0HGGs9YaY9qh9sdcPm4SUCnMXoxe1mkGk$
> 
>   fs/xfs/xfs_icache.c | 10 ++++++++++
>   fs/xfs/xfs_ioctl.c  |  5 ++++-
>   2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index a7be7a9e5c1a..8bf1d15be3f6 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -911,7 +911,12 @@ xfs_eofblocks_worker(
>   {
>   	struct xfs_mount *mp = container_of(to_delayed_work(work),
>   				struct xfs_mount, m_eofblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>   	xfs_icache_free_eofblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>   	xfs_queue_eofblocks(mp);
>   }
>   
> @@ -938,7 +943,12 @@ xfs_cowblocks_worker(
>   {
>   	struct xfs_mount *mp = container_of(to_delayed_work(work),
>   				struct xfs_mount, m_cowblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>   	xfs_icache_free_cowblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>   	xfs_queue_cowblocks(mp);
>   }
>   
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index cdfb3cd9a25b..309958186d33 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -2363,7 +2363,10 @@ xfs_file_ioctl(
>   		if (error)
>   			return error;
>   
> -		return xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_start_write(mp->m_super);
> +		error = xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_end_write(mp->m_super);
> +		return error;
>   	}
>   
>   	default:
>
Darrick J. Wong April 9, 2020, 7:11 p.m. UTC | #4
On Wed, Apr 08, 2020 at 08:21:19AM -0400, Brian Foster wrote:
> The filesystem freeze sequence in XFS waits on any background
> eofblocks or cowblocks scans to complete before the filesystem is
> quiesced. At this point, the freezer has already stopped the
> transaction subsystem, however, which means a truncate or cowblock
> cancellation in progress is likely blocked in transaction
> allocation. This results in a deadlock between freeze and the
> associated scanner.
> 
> Fix this problem by holding superblock write protection across calls
> into the block reapers. Since protection for background scans is
> acquired from the workqueue task context, trylock to avoid a similar
> deadlock between freeze and blocking on the write lock.
> 
> Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
> Reported-by: Paul Furtado <paulfurtado91@gmail.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks ok, will test and probably queue for 5.7-rc2...
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
> 
> Note that this has the opposite tradeoff as the approach I originally
> posited [1], specifically that the eofblocks ioctl() now always blocks
> on a frozen fs rather than return -EAGAIN. It's worth pointing out that
> the eofb control structure has a sync flag (that is not used for
> background scans), so yet another approach could be to tie the trylock
> to that.
> 
> Brian
> 
> [1] https://lore.kernel.org/linux-xfs/20200407163739.GG28936@bfoster/
> 
>  fs/xfs/xfs_icache.c | 10 ++++++++++
>  fs/xfs/xfs_ioctl.c  |  5 ++++-
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index a7be7a9e5c1a..8bf1d15be3f6 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -911,7 +911,12 @@ xfs_eofblocks_worker(
>  {
>  	struct xfs_mount *mp = container_of(to_delayed_work(work),
>  				struct xfs_mount, m_eofblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>  	xfs_icache_free_eofblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>  	xfs_queue_eofblocks(mp);
>  }
>  
> @@ -938,7 +943,12 @@ xfs_cowblocks_worker(
>  {
>  	struct xfs_mount *mp = container_of(to_delayed_work(work),
>  				struct xfs_mount, m_cowblocks_work);
> +
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return;
>  	xfs_icache_free_cowblocks(mp, NULL);
> +	sb_end_write(mp->m_super);
> +
>  	xfs_queue_cowblocks(mp);
>  }
>  
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index cdfb3cd9a25b..309958186d33 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -2363,7 +2363,10 @@ xfs_file_ioctl(
>  		if (error)
>  			return error;
>  
> -		return xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_start_write(mp->m_super);
> +		error = xfs_icache_free_eofblocks(mp, &keofb);
> +		sb_end_write(mp->m_super);
> +		return error;
>  	}
>  
>  	default:
> -- 
> 2.21.1
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index a7be7a9e5c1a..8bf1d15be3f6 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -911,7 +911,12 @@  xfs_eofblocks_worker(
 {
 	struct xfs_mount *mp = container_of(to_delayed_work(work),
 				struct xfs_mount, m_eofblocks_work);
+
+	if (!sb_start_write_trylock(mp->m_super))
+		return;
 	xfs_icache_free_eofblocks(mp, NULL);
+	sb_end_write(mp->m_super);
+
 	xfs_queue_eofblocks(mp);
 }
 
@@ -938,7 +943,12 @@  xfs_cowblocks_worker(
 {
 	struct xfs_mount *mp = container_of(to_delayed_work(work),
 				struct xfs_mount, m_cowblocks_work);
+
+	if (!sb_start_write_trylock(mp->m_super))
+		return;
 	xfs_icache_free_cowblocks(mp, NULL);
+	sb_end_write(mp->m_super);
+
 	xfs_queue_cowblocks(mp);
 }
 
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index cdfb3cd9a25b..309958186d33 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -2363,7 +2363,10 @@  xfs_file_ioctl(
 		if (error)
 			return error;
 
-		return xfs_icache_free_eofblocks(mp, &keofb);
+		sb_start_write(mp->m_super);
+		error = xfs_icache_free_eofblocks(mp, &keofb);
+		sb_end_write(mp->m_super);
+		return error;
 	}
 
 	default: