[RFC] xfs: fix cow_seq locking behavior

Message ID 20180730055539.GT30972@magnolia (mailing list archive)
State New, archived

Commit Message

Darrick J. Wong July 30, 2018, 5:55 a.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

In Christoph Hellwig's patch "xfs: avoid COW fork extent lookups in
writeback if the fork didn't change" (which has not yet graduated to
for-next), we sample the COW fork sequence number without taking the
ilock.  This is a little strange, since in general we always take it
before accessing anything in a block mapping.  I think we get lucky in
that the unlocking during actual cow fork changes will erect the
necessary memory barriers (on x86 anyway) but let's not play fast and
loose with breaking everyone else's model of how locking works.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_aops.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Christoph Hellwig July 30, 2018, 8:14 a.m. UTC | #1
On Sun, Jul 29, 2018 at 10:55:39PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In Christoph Hellwig's patch "xfs: avoid COW fork extent lookups in
> writeback if the fork didn't change" (which has not yet graduated to
> for-next), we sample the COW fork sequence number without taking the
> ilock.  This is a little strange, since in general we always take it
> before accessing anything in a block mapping.  I think we get lucky in
> that the unlocking during actual cow fork changes will erect the
> necessary memory barriers (on x86 anyway) but let's not play fast and
> loose with breaking everyone else's model of how locking works.

What exactly do you want to protect here?  It's not like we have
multiple fields we need to synchronize access to.

And it's not like this is superficial - in addition to not providing
any actual synchronization this also means we have to take the ilock
for every page, which reduces a large part of the improvements in the
series.
Darrick J. Wong July 30, 2018, 3:52 p.m. UTC | #2
On Mon, Jul 30, 2018 at 01:14:56AM -0700, Christoph Hellwig wrote:
> On Sun, Jul 29, 2018 at 10:55:39PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In Christoph Hellwig's patch "xfs: avoid COW fork extent lookups in
> > writeback if the fork didn't change" (which has not yet graduated to
> > for-next), we sample the COW fork sequence number without taking the
> > ilock.  This is a little strange, since in general we always take it
> > before accessing anything in a block mapping.  I think we get lucky in
> > that the unlocking during actual cow fork changes will erect the
> > necessary memory barriers (on x86 anyway) but let's not play fast and
> > loose with breaking everyone else's model of how locking works.
> 
> What exactly do you want to protect here?  It's not like we have
> multiple fields we need to synchronize access to.

Protecting against us screwing up the locking here some day due to a
subtlety that nobody will remember in 6 months. :)

> And it's not like this is superficial - in addition to not providing
> any actual synchronization this also means we have to take the ilock
> for every page, which reduces a large part of the improvements in the
> series.

Agreed!  I nearly tagged this RFCRAP instead.  Sometimes I send patches
to try to provoke a response... <cough>D :0

Now that I've had a night to think it over (and it's no longer 100F but
the sky is still blood red) I think we ought to have a comment
explaining how the synchronization works such that we don't need to take
the ILOCK before testing cow_seq....

/*
 * COW fork blocks can overlap data fork blocks even if the blocks
 * aren't shared.  COW I/O always takes precedence, so we must always
 * check for overlap on reflink inodes unless the mapping is already a
 * COW one.
 *
 * It's safe to check the COW fork if_seq here without the ILOCK because
 * we've indirectly protected against concurrent updates: writeback has
 * the page locked, which prevents concurrent invalidations by reflink
 * and directio and prevents concurrent buffered writes to the same
 * page.  Concurrent changes to other parts of the COW fork will drop
 * the i_lock on their way out, which provides the necessary memory
 * barrier to ensure that we see the updated if_seq.
 */

I'm not actually sure about the last sentence anymore -- that's what I
was thinking the first time I looked at this patch, before Dave spoke
up.

--D

Christoph Hellwig July 30, 2018, 4:22 p.m. UTC | #3
On Mon, Jul 30, 2018 at 08:52:07AM -0700, Darrick J. Wong wrote:
> /*
>  * COW fork blocks can overlap data fork blocks even if the blocks
>  * aren't shared.  COW I/O always takes precedence, so we must always
>  * check for overlap on reflink inodes unless the mapping is already a
>  * COW one.
>  *
>  * It's safe to check the COW fork if_seq here without the ILOCK because
>  * we've indirectly protected against concurrent updates: writeback has
>  * the page locked, which prevents concurrent invalidations by reflink
>  * and directio and prevents concurrent buffered writes to the same
>  * page.  Concurrent changes to other parts of the COW fork will drop
>  * the i_lock on their way out, which provides the necessary memory
>  * barrier to ensure that we see the updated if_seq.
>  */
> 
> I'm not actually sure about the last sentence anymore -- that's what I
> was thinking the first time I looked at this patch, before Dave spoke
> up.

Yeah, that last sentence looks odd.  I'd replace it with:

Changes to if_seq always happen under i_lock, which protects against
concurrent updates and provides a memory barrier on the way out that
ensures that we always see the current value.
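
For readers less familiar with the pattern, here is a userspace sketch of
the argument above.  It is an analogy, not the XFS code: struct fork_model,
its spinlock, and every function name are hypothetical, and C11 atomics
stand in for the kernel's locking primitives.  Writers bump the sequence
counter only while holding the lock; a reader samples the counter without
the lock and compares it against a cached copy.  Plain C gives no implicit
barrier for an unlocked read, so the sketch spells out the release/acquire
pairing instead of relying on the unlock.

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fork_model {
	atomic_flag	lock;		/* stands in for the ILOCK */
	atomic_uint	seq;		/* stands in for if_seq */
	int		nextents;	/* state the lock really protects */
};

static void fork_model_init(struct fork_model *f)
{
	atomic_flag_clear(&f->lock);
	atomic_init(&f->seq, 0);
	f->nextents = 0;
}

static void fork_model_lock(struct fork_model *f)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&f->lock,
						 memory_order_acquire))
		;
}

static void fork_model_unlock(struct fork_model *f)
{
	atomic_flag_clear_explicit(&f->lock, memory_order_release);
}

/* Writer: every change happens under the lock and bumps the sequence. */
static void fork_model_modify(struct fork_model *f)
{
	unsigned int cur;

	fork_model_lock(f);
	f->nextents++;
	cur = atomic_load_explicit(&f->seq, memory_order_relaxed);
	/* The release store pairs with the acquire load in the reader. */
	atomic_store_explicit(&f->seq, cur + 1, memory_order_release);
	fork_model_unlock(f);
}

/* Reader: sample the sequence number without taking the lock. */
static unsigned int fork_model_sample_seq(struct fork_model *f)
{
	return atomic_load_explicit(&f->seq, memory_order_acquire);
}

/* A cached mapping is reusable only if the fork has not changed since. */
static bool fork_model_cache_is_current(struct fork_model *f,
					unsigned int cached_seq)
{
	return fork_model_sample_seq(f) == cached_seq;
}

/* Usage: a cached sample stays valid until the next modification. */
static void fork_model_demo(void)
{
	struct fork_model f;
	unsigned int cached;

	fork_model_init(&f);
	cached = fork_model_sample_seq(&f);
	assert(fork_model_cache_is_current(&f, cached));
	fork_model_modify(&f);
	assert(!fork_model_cache_is_current(&f, cached));
}
```

The reader never touches nextents, only the counter, which is why a
single-word sample is enough in this model; anything that needs the
protected state itself would still have to take the lock.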
Patch

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index aff9d44fa338..2e178ef89a15 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -338,16 +338,21 @@  xfs_map_blocks(
 	 * COW one, or the COW fork hasn't changed from the last time we looked
 	 * at it.
 	 */
+	xfs_ilock(ip, XFS_ILOCK_SHARED);
 	imap_valid = offset_fsb >= wpc->imap.br_startoff &&
 		     offset_fsb < wpc->imap.br_startoff + wpc->imap.br_blockcount;
 	if (imap_valid &&
 	    (!xfs_inode_has_cow_data(ip) ||
 	     wpc->io_type == XFS_IO_COW ||
-	     wpc->cow_seq == ip->i_cowfp->if_seq))
+	     wpc->cow_seq == ip->i_cowfp->if_seq)) {
+		xfs_iunlock(ip, XFS_ILOCK_SHARED);
 		return 0;
+	}
 
-	if (XFS_FORCED_SHUTDOWN(mp))
+	if (XFS_FORCED_SHUTDOWN(mp)) {
+		xfs_iunlock(ip, XFS_ILOCK_SHARED);
 		return -EIO;
+	}
 
 	/*
 	 * If we don't have a valid map, now it's time to get a new one for this
@@ -355,7 +360,6 @@  xfs_map_blocks(
 	 * into real extents.  If we return without a valid map, it means we
 	 * landed in a hole and we skip the block.
 	 */
-	xfs_ilock(ip, XFS_ILOCK_SHARED);
 	ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE ||
 	       (ip->i_df.if_flags & XFS_IFEXTENTS));
 	ASSERT(offset <= mp->m_super->s_maxbytes);
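
The early-return test in the first hunk can be read as a small predicate.
The following is a userspace model of that condition only, a sketch under
assumptions rather than the kernel code: struct cached_map and
cached_map_usable() are hypothetical names, and the boolean and integer
parameters stand in for xfs_inode_has_cow_data(ip), the
wpc->io_type == XFS_IO_COW test, and ip->i_cowfp->if_seq.

```c
/* Userspace model of the reuse check; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct cached_map {
	uint64_t	startoff;	/* models wpc->imap.br_startoff */
	uint64_t	blockcount;	/* models wpc->imap.br_blockcount */
	unsigned int	cow_seq;	/* models wpc->cow_seq */
	bool		is_cow;		/* models wpc->io_type == XFS_IO_COW */
};

/*
 * Return true if the cached mapping can be reused for offset_fsb:
 * the offset must fall inside the cached extent, and either the inode
 * has no COW data, the mapping is already a COW one, or the COW fork
 * sequence number has not changed since the mapping was cached.
 */
static bool cached_map_usable(const struct cached_map *m,
			      uint64_t offset_fsb,
			      bool inode_has_cow_data,
			      unsigned int cur_cow_seq)
{
	bool in_range = offset_fsb >= m->startoff &&
			offset_fsb < m->startoff + m->blockcount;

	if (!in_range)
		return false;
	return !inode_has_cow_data || m->is_cow ||
	       m->cow_seq == cur_cow_seq;
}
```

Only the last operand of the condition reads if_seq, so that comparison is
the one the locking discussion in this thread is about.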