diff mbox series

fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag()

Message ID 20200504161530.14059-1-baijiaju1990@gmail.com (mailing list archive)
State New, archived
Headers show
Series fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag() | expand

Commit Message

Jia-Ju Bai May 4, 2020, 4:15 p.m. UTC
We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
concurrently executed at runtime in the following call contexts:

Thread1:
  xfs_fs_put_super()
    xfs_unmountfs()
      xfs_rtunmount_inodes()
        xfs_irele()
          xfs_fs_destroy_inode()
            xfs_inode_set_reclaim_tag()

Thread2:
  xfs_reclaim_worker()
    xfs_reclaim_inodes()
      xfs_reclaim_inodes_ag()
        xfs_reclaim_inode()

In xfs_inode_set_reclaim_tag():
  pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
  ...
  spin_lock(&ip->i_flags_lock);

In xfs_reclaim_inode():
  spin_lock(&ip->i_flags_lock);
  ...
  ip->i_ino = 0;
  spin_unlock(&ip->i_flags_lock);

Thus, a data race can occur for ip->i_ino.

To fix this data race, the spinlock ip->i_flags_lock is used to protect
the access to ip->i_ino in xfs_inode_set_reclaim_tag().

This data race is found by our concurrency fuzzer.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
 fs/xfs/xfs_icache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dave Chinner May 4, 2020, 9:25 p.m. UTC | #1
On Tue, May 05, 2020 at 12:15:30AM +0800, Jia-Ju Bai wrote:
> We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
> concurrently executed at runtime in the following call contexts:
> 
> Thread1:
>   xfs_fs_put_super()
>     xfs_unmountfs()
>       xfs_rtunmount_inodes()
>         xfs_irele()
>           xfs_fs_destroy_inode()
>             xfs_inode_set_reclaim_tag()
> 
> Thread2:
>   xfs_reclaim_worker()
>     xfs_reclaim_inodes()
>       xfs_reclaim_inodes_ag()
>         xfs_reclaim_inode()
> 
> In xfs_inode_set_reclaim_tag():
>   pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
>   ...
>   spin_lock(&ip->i_flags_lock);
> 
> In xfs_reclaim_inode():
>   spin_lock(&ip->i_flags_lock);
>   ...
>   ip->i_ino = 0;
>   spin_unlock(&ip->i_flags_lock);
> 
> Thus, a data race can occur for ip->i_ino.
> 
> To fix this data race, the spinlock ip->i_flags_lock is used to protect
> the access to ip->i_ino in xfs_inode_set_reclaim_tag().
> 
> This data race is found by our concurrency fuzzer.

This data race cannot happen.

xfs_reclaim_inode() will not be called on this inode until -after-
the XFS_ICI_RECLAIM_TAG is set in the radix tree for this inode, and
setting that is protected by the i_flags_lock.

So while the xfs_perag_get() call doesn't lock the ip->i_ino access,
there is are -multiple_ iflags_lock lock/unlock cycles before
ip->i_ino is cleared in the reclaim worker. Hence there is a full
unlock->lock memory barrier for the ip->i_ino reset inside the
critical section vs xfs_inode_set_reclaim_tag().

Hence even if the reclaim worker could access the inode before the
XFS_ICI_RECLAIM_TAG is set, no data race exists here.

> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
> ---
>  fs/xfs/xfs_icache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 8bf1d15be3f6..a2de08222ff5 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_perag	*pag;
>  
> +	spin_lock(&ip->i_flags_lock);
>  	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
>  	spin_lock(&pag->pag_ici_lock);
> -	spin_lock(&ip->i_flags_lock);

Also, this creates a lock inversion deadlock here with
xfs_iget_cache_hit() clearing the XFS_IRECLAIMABLE flag.

Cheers,

Dave.
diff mbox series

Patch

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 8bf1d15be3f6..a2de08222ff5 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -229,9 +229,9 @@  xfs_inode_set_reclaim_tag(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;
 
+	spin_lock(&ip->i_flags_lock);
 	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
 	spin_lock(&pag->pag_ici_lock);
-	spin_lock(&ip->i_flags_lock);
 
 	radix_tree_tag_set(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino),
 			   XFS_ICI_RECLAIM_TAG);