Message ID | 20200504161530.14059-1-baijiaju1990@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag() | expand |
On Tue, May 05, 2020 at 12:15:30AM +0800, Jia-Ju Bai wrote: > We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are > concurrently executed at runtime in the following call contexts: > > Thread1: > xfs_fs_put_super() > xfs_unmountfs() > xfs_rtunmount_inodes() > xfs_irele() > xfs_fs_destroy_inode() > xfs_inode_set_reclaim_tag() > > Thread2: > xfs_reclaim_worker() > xfs_reclaim_inodes() > xfs_reclaim_inodes_ag() > xfs_reclaim_inode() > > In xfs_inode_set_reclaim_tag(): > pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); > ... > spin_lock(&ip->i_flags_lock); > > In xfs_reclaim_inode(): > spin_lock(&ip->i_flags_lock); > ... > ip->i_ino = 0; > spin_unlock(&ip->i_flags_lock); > > Thus, a data race can occur for ip->i_ino. > > To fix this data race, the spinlock ip->i_flags_lock is used to protect > the access to ip->i_ino in xfs_inode_set_reclaim_tag(). > > This data race is found by our concurrency fuzzer. This data race cannot happen. xfs_reclaim_inode() will not be called on this inode until -after- the XFS_ICI_RECLAIM_TAG is set in the radix tree for this inode, and setting that is protected by the i_flags_lock. So while the xfs_perag_get() call doesn't lock the ip->i_ino access, there is are -multiple_ iflags_lock lock/unlock cycles before ip->i_ino is cleared in the reclaim worker. Hence there is a full unlock->lock memory barrier for the ip->i_ino reset inside the critical section vs xfs_inode_set_reclaim_tag(). Hence even if the reclaim worker could access the inode before the XFS_ICI_RECLAIM_TAG is set, no data race exists here. > Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> > --- > fs/xfs/xfs_icache.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c > index 8bf1d15be3f6..a2de08222ff5 100644 > --- a/fs/xfs/xfs_icache.c > +++ b/fs/xfs/xfs_icache.c > @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag( > struct xfs_mount *mp = ip->i_mount; > struct xfs_perag *pag; > > + spin_lock(&ip->i_flags_lock); > pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); > spin_lock(&pag->pag_ici_lock); > - spin_lock(&ip->i_flags_lock); Also, this creates a lock inversion deadlock here with xfs_iget_cache_hit() clearing the XFS_IRECLAIMABLE flag. Cheers, Dave.
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index 8bf1d15be3f6..a2de08222ff5 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag( struct xfs_mount *mp = ip->i_mount; struct xfs_perag *pag; + spin_lock(&ip->i_flags_lock); pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); spin_lock(&pag->pag_ici_lock); - spin_lock(&ip->i_flags_lock); radix_tree_tag_set(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are concurrently executed at runtime in the following call contexts: Thread1: xfs_fs_put_super() xfs_unmountfs() xfs_rtunmount_inodes() xfs_irele() xfs_fs_destroy_inode() xfs_inode_set_reclaim_tag() Thread2: xfs_reclaim_worker() xfs_reclaim_inodes() xfs_reclaim_inodes_ag() xfs_reclaim_inode() In xfs_inode_set_reclaim_tag(): pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); ... spin_lock(&ip->i_flags_lock); In xfs_reclaim_inode(): spin_lock(&ip->i_flags_lock); ... ip->i_ino = 0; spin_unlock(&ip->i_flags_lock); Thus, a data race can occur for ip->i_ino. To fix this data race, the spinlock ip->i_flags_lock is used to protect the access to ip->i_ino in xfs_inode_set_reclaim_tag(). This data race is found by our concurrency fuzzer. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> --- fs/xfs/xfs_icache.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)