diff mbox series

xfs: fix reflink source file racing with directio writes

Message ID 20190815165043.GB15186@magnolia (mailing list archive)
State Accepted
Headers show
Series xfs: fix reflink source file racing with directio writes | expand

Commit Message

Darrick J. Wong Aug. 15, 2019, 4:50 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

While trawling through the dedupe file comparison code trying to fix
page deadlocking problems, Dave Chinner noticed that the reflink code
only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
page_mkwrite and directio writes do not take the EXCL versions of those
locks, this means that reflink can race with writer processes.

For pure remapping this can lead to undefined behavior and file
corruption; for dedupe this means that we cannot be sure that the
contents are identical when we decide to go ahead with the remapping.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_reflink.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Dave Chinner Aug. 15, 2019, 9:37 p.m. UTC | #1
On Thu, Aug 15, 2019 at 09:50:43AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> While trawling through the dedupe file comparison code trying to fix
> page deadlocking problems, Dave Chinner noticed that the reflink code
> only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
> page_mkwrite and directio writes do not take the EXCL versions of those
> locks, this means that reflink can race with writer processes.
> 
> For pure remapping this can lead to undefined behavior and file
> corruption; for dedupe this means that we cannot be sure that the
> contents are identical when we decide to go ahead with the remapping.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

I don't think this is quite right yet. The source inode locking
change is good, but it doesn't break the layout on the source inode
and so there is still they possibility that something has physical
access and is directly modifying the source file.

And with that, I suspect the locking algorithm changes
substantially:

	order inodes
restart:
	lock first inode
	break layout on first inode
	lock second inode
	break layout on second inode
	fail then unlock both inodes, goto restart

Cheers,

Dave.
Christoph Hellwig Aug. 16, 2019, 6:42 a.m. UTC | #2
On Fri, Aug 16, 2019 at 07:37:17AM +1000, Dave Chinner wrote:
> I don't think this is quite right yet. The source inode locking
> change is good, but it doesn't break the layout on the source inode
> and so there is still they possibility that something has physical
> access and is directly modifying the source file.
> 
> And with that, I suspect the locking algorithm changes
> substantially:
> 
> 	order inodes
> restart:
> 	lock first inode
> 	break layout on first inode
> 	lock second inode
> 	break layout on second inode
> 	fail then unlock both inodes, goto restart

Agreed, that is the locking scheme I'd expect.
diff mbox series

Patch

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index c4ec7afd1170..9fd417de15c9 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1205,7 +1205,7 @@  xfs_iolock_two_inodes_and_break_layout(
 
 retry:
 	if (src < dest) {
-		inode_lock_shared(src);
+		inode_lock(src);
 		inode_lock_nested(dest, I_MUTEX_NONDIR2);
 	} else {
 		/* src >= dest */
@@ -1216,7 +1216,7 @@  xfs_iolock_two_inodes_and_break_layout(
 	if (error == -EWOULDBLOCK) {
 		inode_unlock(dest);
 		if (src < dest)
-			inode_unlock_shared(src);
+			inode_unlock(src);
 		error = break_layout(dest, true);
 		if (error)
 			return error;
@@ -1225,11 +1225,11 @@  xfs_iolock_two_inodes_and_break_layout(
 	if (error) {
 		inode_unlock(dest);
 		if (src < dest)
-			inode_unlock_shared(src);
+			inode_unlock(src);
 		return error;
 	}
 	if (src > dest)
-		inode_lock_shared_nested(src, I_MUTEX_NONDIR2);
+		inode_lock_nested(src, I_MUTEX_NONDIR2);
 	return 0;
 }
 
@@ -1247,10 +1247,10 @@  xfs_reflink_remap_unlock(
 
 	xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
 	if (!same_inode)
-		xfs_iunlock(src, XFS_MMAPLOCK_SHARED);
+		xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
 	inode_unlock(inode_out);
 	if (!same_inode)
-		inode_unlock_shared(inode_in);
+		inode_unlock(inode_in);
 }
 
 /*
@@ -1325,7 +1325,7 @@  xfs_reflink_remap_prep(
 	if (same_inode)
 		xfs_ilock(src, XFS_MMAPLOCK_EXCL);
 	else
-		xfs_lock_two_inodes(src, XFS_MMAPLOCK_SHARED, dest,
+		xfs_lock_two_inodes(src, XFS_MMAPLOCK_EXCL, dest,
 				XFS_MMAPLOCK_EXCL);
 
 	/* Check file eligibility and prepare for block sharing. */