Message ID | 20200208193445.27421-7-ira.weiny@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | Enable per-file/directory DAX operations V3 |
On Sat, Feb 08, 2020 at 11:34:39AM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
>
> One of the checks for an inode supporting DAX is if the inode is
> reflinked. During a non-DAX to DAX state change we could race with
> the file being reflinked and end up with a reflinked file being in DAX
> state.
>
> Prevent this race by checking for DAX support under the MMAP_LOCK.

The on disk inode flags are protected by the XFS_ILOCK, not the
MMAP_LOCK. i.e. the MMAPLOCK provides data access serialisation, not
metadata modification serialisation.

>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  fs/xfs/xfs_ioctl.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index da1eb2bdb386..4ff402fd6636 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1194,10 +1194,6 @@ xfs_ioctl_setattr_dax_invalidate(
>
>  	*join_flags = 0;
>
> -	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> -	    !xfs_inode_supports_dax(ip))
> -		return -EINVAL;
> -
>  	/* If the DAX state is not changing, we have nothing to do here. */
>  	if ((fa->fsx_xflags & FS_XFLAG_DAX) &&
>  	    (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
> @@ -1211,6 +1207,13 @@ xfs_ioctl_setattr_dax_invalidate(
>
>  	/* lock, flush and invalidate mapping in preparation for flag change */
>  	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> +
> +	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> +	    !xfs_inode_supports_dax(ip)) {
> +		error = -EINVAL;
> +		goto out_unlock;
> +	}

Yes, you might be able to get away with reflink vs dax flag
serialisation on the inode flag modification, but it is not correct and
leaves a landmine for future inode flag modifications that are done
without holding either the MMAP or IOLOCK.

e.g. concurrent calls to xfs_ioctl_setattr() setting/clearing flags
other than the on disk DAX flag are all serialised by the ILOCK_EXCL
and will not be serialised against this DAX check. Hence if there are
other flags that we add in future that affect the result of
xfs_inode_supports_dax(), this code will not be correctly
serialised.

This raciness in checking the DAX flags is the reason that
xfs_ioctl_setattr_xflags() redoes all the reflink vs dax checks once
it's called under the XFS_ILOCK_EXCL during the actual change
transaction....

Cheers,

Dave.
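To illustrate the locking point above, here is a minimal userspace sketch, not XFS code. It assumes a toy inode with two pthread mutexes standing in for the MMAPLOCK (data access) and the ILOCK (metadata); `struct toy_inode`, `TOY_INODE_INIT`, `toy_supports_dax()`, `toy_set_reflink()` and `toy_set_dax()` are all hypothetical names invented for this example. The idea it tries to show is that a check whose result depends on inode flags is only authoritative when it is made under the lock that serialises changes to those flags.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; not the real XFS structure. */
struct toy_inode {
	pthread_mutex_t mmap_lock;	/* serialises data access */
	pthread_mutex_t ilock;		/* serialises flag (metadata) changes */
	bool reflinked;
	bool dax;
};

#define TOY_INODE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, false, false }

/* The result depends only on flags that ilock protects. */
static bool toy_supports_dax(const struct toy_inode *ip)
{
	return !ip->reflinked;
}

/* Other flag changes take only ilock; they never touch mmap_lock. */
static int toy_set_reflink(struct toy_inode *ip)
{
	int error = 0;

	pthread_mutex_lock(&ip->ilock);
	if (ip->dax)
		error = -EINVAL;
	else
		ip->reflinked = true;
	pthread_mutex_unlock(&ip->ilock);
	return error;
}

static int toy_set_dax(struct toy_inode *ip)
{
	int error = 0;

	/* Data-side preparation (flush, invalidate) would happen here. */
	pthread_mutex_lock(&ip->mmap_lock);

	/*
	 * A check made while holding only mmap_lock could race with
	 * toy_set_reflink(), so the authoritative check is done under ilock.
	 */
	pthread_mutex_lock(&ip->ilock);
	if (!toy_supports_dax(ip))
		error = -EINVAL;
	else
		ip->dax = true;
	pthread_mutex_unlock(&ip->ilock);

	pthread_mutex_unlock(&ip->mmap_lock);
	return error;
}
```

In this sketch the check and the flag update sit in the same `ilock` critical section, so a concurrent `toy_set_reflink()` either completes before the check or is refused after it.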
On Tue, Feb 11, 2020 at 05:16:39PM +1100, Dave Chinner wrote:
> On Sat, Feb 08, 2020 at 11:34:39AM -0800, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> >
> > One of the checks for an inode supporting DAX is if the inode is
> > reflinked. During a non-DAX to DAX state change we could race with
> > the file being reflinked and end up with a reflinked file being in DAX
> > state.
> >
> > Prevent this race by checking for DAX support under the MMAP_LOCK.
>
> The on disk inode flags are protected by the XFS_ILOCK, not the
> MMAP_LOCK. i.e. the MMAPLOCK provides data access serialisation, not
> metadata modification serialisation.

Ah...

>
> >
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  fs/xfs/xfs_ioctl.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index da1eb2bdb386..4ff402fd6636 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1194,10 +1194,6 @@ xfs_ioctl_setattr_dax_invalidate(
> >
> >  	*join_flags = 0;
> >
> > -	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> > -	    !xfs_inode_supports_dax(ip))
> > -		return -EINVAL;
> > -
> >  	/* If the DAX state is not changing, we have nothing to do here. */
> >  	if ((fa->fsx_xflags & FS_XFLAG_DAX) &&
> >  	    (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
> > @@ -1211,6 +1207,13 @@ xfs_ioctl_setattr_dax_invalidate(
> >
> >  	/* lock, flush and invalidate mapping in preparation for flag change */
> >  	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> > +
> > +	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> > +	    !xfs_inode_supports_dax(ip)) {
> > +		error = -EINVAL;
> > +		goto out_unlock;
> > +	}
>
> Yes, you might be able to get away with reflink vs dax flag
> serialisation on the inode flag modification, but it is not correct and
> leaves a landmine for future inode flag modifications that are done
> without holding either the MMAP or IOLOCK.
>
> e.g. concurrent calls to xfs_ioctl_setattr() setting/clearing flags
> other than the on disk DAX flag are all serialised by the ILOCK_EXCL
> and will not be serialised against this DAX check. Hence if there are
> other flags that we add in future that affect the result of
> xfs_inode_supports_dax(), this code will not be correctly
> serialised.
>
> This raciness in checking the DAX flags is the reason that
> xfs_ioctl_setattr_xflags() redoes all the reflink vs dax checks once
> it's called under the XFS_ILOCK_EXCL during the actual change
> transaction....

Ok I found this by trying to make sure that the xfs_inode_supports_dax() call
was always returning valid data. So I don't have a specific test which was
failing.

Looking at the code again, it sounds like I was wrong about which locks protect
what and with your explanation above it looks like there is nothing to be done
here and I can drop the patch.

Would you agree?

Thanks for the review!
Ira

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
On Tue, Feb 11, 2020 at 09:55:09AM -0800, Ira Weiny wrote:
> On Tue, Feb 11, 2020 at 05:16:39PM +1100, Dave Chinner wrote:
> > On Sat, Feb 08, 2020 at 11:34:39AM -0800, ira.weiny@intel.com wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > >
> > > One of the checks for an inode supporting DAX is if the inode is
> > > reflinked. During a non-DAX to DAX state change we could race with
> > > the file being reflinked and end up with a reflinked file being in DAX
> > > state.
> > >
> > > Prevent this race by checking for DAX support under the MMAP_LOCK.
> >
> > The on disk inode flags are protected by the XFS_ILOCK, not the
> > MMAP_LOCK. i.e. the MMAPLOCK provides data access serialisation, not
> > metadata modification serialisation.
>
> Ah...
>
> >
> > >
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  fs/xfs/xfs_ioctl.c | 11 +++++++----
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > index da1eb2bdb386..4ff402fd6636 100644
> > > --- a/fs/xfs/xfs_ioctl.c
> > > +++ b/fs/xfs/xfs_ioctl.c
> > > @@ -1194,10 +1194,6 @@ xfs_ioctl_setattr_dax_invalidate(
> > >
> > >  	*join_flags = 0;
> > >
> > > -	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> > > -	    !xfs_inode_supports_dax(ip))
> > > -		return -EINVAL;
> > > -
> > >  	/* If the DAX state is not changing, we have nothing to do here. */
> > >  	if ((fa->fsx_xflags & FS_XFLAG_DAX) &&
> > >  	    (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
> > > @@ -1211,6 +1207,13 @@ xfs_ioctl_setattr_dax_invalidate(
> > >
> > >  	/* lock, flush and invalidate mapping in preparation for flag change */
> > >  	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> > > +
> > > +	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
> > > +	    !xfs_inode_supports_dax(ip)) {
> > > +		error = -EINVAL;
> > > +		goto out_unlock;
> > > +	}
> >
> > Yes, you might be able to get away with reflink vs dax flag
> > serialisation on the inode flag modification, but it is not correct and
> > leaves a landmine for future inode flag modifications that are done
> > without holding either the MMAP or IOLOCK.
> >
> > e.g. concurrent calls to xfs_ioctl_setattr() setting/clearing flags
> > other than the on disk DAX flag are all serialised by the ILOCK_EXCL
> > and will not be serialised against this DAX check. Hence if there are
> > other flags that we add in future that affect the result of
> > xfs_inode_supports_dax(), this code will not be correctly
> > serialised.
> >
> > This raciness in checking the DAX flags is the reason that
> > xfs_ioctl_setattr_xflags() redoes all the reflink vs dax checks once
> > it's called under the XFS_ILOCK_EXCL during the actual change
> > transaction....
>
> Ok I found this by trying to make sure that the xfs_inode_supports_dax() call
> was always returning valid data. So I don't have a specific test which was
> failing.
>
> Looking at the code again, it sounds like I was wrong about which locks protect
> what and with your explanation above it looks like there is nothing to be done
> here and I can drop the patch.
>
> Would you agree?

*nod*

Cheers,

Dave.
On Wed, Feb 12, 2020 at 07:42:20AM +1100, Dave Chinner wrote:
> On Tue, Feb 11, 2020 at 09:55:09AM -0800, Ira Weiny wrote:
> > On Tue, Feb 11, 2020 at 05:16:39PM +1100, Dave Chinner wrote:
> > > On Sat, Feb 08, 2020 at 11:34:39AM -0800, ira.weiny@intel.com wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > [snip]
> > >
> > > This raciness in checking the DAX flags is the reason that
> > > xfs_ioctl_setattr_xflags() redoes all the reflink vs dax checks once
> > > it's called under the XFS_ILOCK_EXCL during the actual change
> > > transaction....
> >
> > Ok I found this by trying to make sure that the xfs_inode_supports_dax() call
> > was always returning valid data. So I don't have a specific test which was
> > failing.
> >
> > Looking at the code again, it sounds like I was wrong about which locks protect
> > what and with your explanation above it looks like there is nothing to be done
> > here and I can drop the patch.
> >
> > Would you agree?
>
> *nod*

Thanks! done.
Ira

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index da1eb2bdb386..4ff402fd6636 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1194,10 +1194,6 @@ xfs_ioctl_setattr_dax_invalidate(
 
 	*join_flags = 0;
 
-	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
-	    !xfs_inode_supports_dax(ip))
-		return -EINVAL;
-
 	/* If the DAX state is not changing, we have nothing to do here. */
 	if ((fa->fsx_xflags & FS_XFLAG_DAX) &&
 	    (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX))
@@ -1211,6 +1207,13 @@ xfs_ioctl_setattr_dax_invalidate(
 
 	/* lock, flush and invalidate mapping in preparation for flag change */
 	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
+
+	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
+	    !xfs_inode_supports_dax(ip)) {
+		error = -EINVAL;
+		goto out_unlock;
+	}
+
 	error = filemap_write_and_wait(inode->i_mapping);
 	if (error)
 		goto out_unlock;
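For context on where `fa->fsx_xflags` comes from, the following is a rough userspace sketch of requesting a DAX flag change on an open file. It assumes the generic `FS_IOC_FSGETXATTR` / `FS_IOC_FSSETXATTR` ioctls and `struct fsxattr` from `<linux/fs.h>`, which, as far as I can tell, is the interface that ends up in `xfs_ioctl_setattr()` on XFS. The helper name `request_dax_flag()` is hypothetical, and whether the request succeeds depends on the filesystem and kernel in use.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Hypothetical helper: ask the filesystem to set or clear FS_XFLAG_DAX
 * on the file behind fd. Returns 0 on success or a negative errno value.
 */
static int request_dax_flag(int fd, bool enable)
{
	struct fsxattr fa;

	/* Read the current attributes first so other flags are preserved. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fa) < 0)
		return -errno;

	if (enable)
		fa.fsx_xflags |= FS_XFLAG_DAX;
	else
		fa.fsx_xflags &= ~FS_XFLAG_DAX;

	/*
	 * The kernel validates the request itself and may refuse it,
	 * for example with EINVAL when the inode cannot support DAX.
	 */
	if (ioctl(fd, FS_IOC_FSSETXATTR, &fa) < 0)
		return -errno;

	return 0;
}
```

A caller would open the file, call `request_dax_flag(fd, true)`, and treat a negative return as the filesystem declining the change; userspace cannot rely on its own earlier inspection of the file, since the validity checks are made by the kernel at the time of the change.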