diff mbox series

[21/43] xfs: don't call xfs_can_free_eofblocks from ->release for zoned inodes

Message ID 20241211085636.1380516-22-hch@lst.de (mailing list archive)
State New
Headers show
Series [01/43] xfs: constify feature checks | expand

Commit Message

Christoph Hellwig Dec. 11, 2024, 8:54 a.m. UTC
Zoned file systems require out of place writes and thus can't support
post-EOF speculative preallocations.  Avoid the pointless ilock critical
section to find out that none can be freed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Darrick J. Wong Dec. 12, 2024, 10:15 p.m. UTC | #1
On Wed, Dec 11, 2024 at 09:54:46AM +0100, Christoph Hellwig wrote:
> Zoned file systems require out of place writes and thus can't support
> post-EOF speculative preallocations.  Avoid the pointless ilock critical
> section to find out that none can be freed.

I wonder if this is true of alwayscow inodes in general, not just zoned
inodes?

> Signed-off-by: Christoph Hellwig <hch@lst.de>

Anyway that makes sense to me, so
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_file.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 27301229011b..827f7819df6a 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1356,15 +1356,22 @@ xfs_file_release(
>  	 * blocks.  This avoids open/read/close workloads from removing EOF
>  	 * blocks that other writers depend upon to reduce fragmentation.
>  	 *
> +	 * Inodes on the zoned RT device never have preallocations, so skip
> +	 * taking the locks below.
> +	 */
> +	if (!inode->i_nlink ||
> +	    !(file->f_mode & FMODE_WRITE) ||
> +	    (ip->i_diflags & XFS_DIFLAG_APPEND) ||
> +	    xfs_is_zoned_inode(ip))
> +		return 0;
> +
> +	/*
>  	 * If we can't get the iolock just skip truncating the blocks past EOF
>  	 * because we could deadlock with the mmap_lock otherwise. We'll get
>  	 * another chance to drop them once the last reference to the inode is
>  	 * dropped, so we'll never leak blocks permanently.
>  	 */
> -	if (inode->i_nlink &&
> -	    (file->f_mode & FMODE_WRITE) &&
> -	    !(ip->i_diflags & XFS_DIFLAG_APPEND) &&
> -	    !xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
> +	if (!xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
>  	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
>  		if (xfs_can_free_eofblocks(ip) &&
>  		    !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))
> -- 
> 2.45.2
> 
>
Christoph Hellwig Dec. 13, 2024, 5:28 a.m. UTC | #2
On Thu, Dec 12, 2024 at 02:15:23PM -0800, Darrick J. Wong wrote:
> On Wed, Dec 11, 2024 at 09:54:46AM +0100, Christoph Hellwig wrote:
> > Zoned file systems require out of place writes and thus can't support
> > post-EOF speculative preallocations.  Avoid the pointless ilock critical
> > section to find out that none can be freed.
> 
> I wonder if this is true of alwayscow inodes in general, not just zoned
> inodes?

Maybe I'm missing something, but AFAICS always_cow still generates
preallocations in xfs_buffered_write_iomap_begin.  It probably shouldn't.

Btw, the always_cow code as intended as the common support code for
zoned and atomic msync style atomic writes, which always require hard
out of place writes.  It turns out it doesn't actually do that right
now (see the bounce buffering patch reviewed earlier), which makes it
a bit of an oddball.  I'd personally love to kill it once the zoned
code lands, as just running the zoned mode on a regular device actually
gives you a good way to test always out of place write semantics,
which ended up diverging a bit from the earlier version after it hit
the hard reality of hardware actually enforcing out of place writes.
Darrick J. Wong Dec. 13, 2024, 5:13 p.m. UTC | #3
On Fri, Dec 13, 2024 at 06:28:41AM +0100, Christoph Hellwig wrote:
> On Thu, Dec 12, 2024 at 02:15:23PM -0800, Darrick J. Wong wrote:
> > On Wed, Dec 11, 2024 at 09:54:46AM +0100, Christoph Hellwig wrote:
> > > Zoned file systems require out of place writes and thus can't support
> > > post-EOF speculative preallocations.  Avoid the pointless ilock critical
> > > section to find out that none can be freed.
> > 
> > I wonder if this is true of alwayscow inodes in general, not just zoned
> > inodes?
> 
> Maybe I'm missing something, but AFAICS always_cow still generates
> preallocations in xfs_buffered_write_iomap_begin.  It probably shouldn't.

For non-zoned alwayscow I think it's trying to generate preallocations
in the cow fork to reduce fragmentation of the bmbt since we don't have
to write in the linear order.

Unless... you're talking about preallocations in the data fork?

> Btw, the always_cow code as intended as the common support code for
> zoned and atomic msync style atomic writes, which always require hard
> out of place writes.  It turns out it doesn't actually do that right
> now (see the bounce buffering patch reviewed earlier), which makes it
> a bit of an oddball.  I'd personally love to kill it once the zoned
> code lands, as just running the zoned mode on a regular device actually
> gives you a good way to test always out of place write semantics,
> which ended up diverging a bit from the earlier version after it hit
> the hard reality of hardware actually enforcing out of place writes.

Which patch is the bounce buffering patch?

/me hands himself another cup of coffee

--D
Christoph Hellwig Dec. 13, 2024, 5:18 p.m. UTC | #4
On Fri, Dec 13, 2024 at 09:13:53AM -0800, Darrick J. Wong wrote:
> > > I wonder if this is true of alwayscow inodes in general, not just zoned
> > > inodes?
> > 
> > Maybe I'm missing something, but AFAICS always_cow still generates
> > preallocations in xfs_buffered_write_iomap_begin.  It probably shouldn't.
> 
> For non-zoned alwayscow I think it's trying to generate preallocations
> in the cow fork to reduce fragmentation of the bmbt since we don't have
> to write in the linear order.

Ah yes, and xfs_can_free_eofblocks only deals with the data fork.

> 
> > Btw, the always_cow code as intended as the common support code for
> > zoned and atomic msync style atomic writes, which always require hard
> > out of place writes.  It turns out it doesn't actually do that right
> > now (see the bounce buffering patch reviewed earlier), which makes it
> > a bit of an oddball.  I'd personally love to kill it once the zoned
> > code lands, as just running the zoned mode on a regular device actually
> > gives you a good way to test always out of place write semantics,
> > which ended up diverging a bit from the earlier version after it hit
> > the hard reality of hardware actually enforcing out of place writes.
> 
> Which patch is the bounce buffering patch?

[PATCH 12/43] xfs: refine the unaligned check for always COW inodes in xfs_file_dio_write
diff mbox series

Patch

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 27301229011b..827f7819df6a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1356,15 +1356,22 @@  xfs_file_release(
 	 * blocks.  This avoids open/read/close workloads from removing EOF
 	 * blocks that other writers depend upon to reduce fragmentation.
 	 *
+	 * Inodes on the zoned RT device never have preallocations, so skip
+	 * taking the locks below.
+	 */
+	if (!inode->i_nlink ||
+	    !(file->f_mode & FMODE_WRITE) ||
+	    (ip->i_diflags & XFS_DIFLAG_APPEND) ||
+	    xfs_is_zoned_inode(ip))
+		return 0;
+
+	/*
 	 * If we can't get the iolock just skip truncating the blocks past EOF
 	 * because we could deadlock with the mmap_lock otherwise. We'll get
 	 * another chance to drop them once the last reference to the inode is
 	 * dropped, so we'll never leak blocks permanently.
 	 */
-	if (inode->i_nlink &&
-	    (file->f_mode & FMODE_WRITE) &&
-	    !(ip->i_diflags & XFS_DIFLAG_APPEND) &&
-	    !xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
+	if (!xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
 	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
 		if (xfs_can_free_eofblocks(ip) &&
 		    !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))