Message ID | 20241211085636.1380516-9-hch@lst.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [01/43] xfs: constify feature checks | expand |
On Wed, Dec 11, 2024 at 09:54:33AM +0100, Christoph Hellwig wrote: > For I/O to reflinked blocks we always need to write an entire new > file system block, and the code enforces the file system block alignment > for the entire file if it has any reflinked blocks. > > Unfortunately the reported dio alignment can only report a single value > for reads and writes, so unless we want to trigger these read-modify > write cycles all the time, we need to increase both limits. > > Without this zoned xfs triggers the warnings about failed page cache > invalidation in kiocb_invalidate_post_direct_write all the time when > running generic/551 when running on a 512 byte sector device, and > eventually fails the test due to miscompares. > > Hopefully we can add a separate read alignment to statx eventually. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_ioctl.c | 6 +++++- > fs/xfs/xfs_iops.c | 15 ++++++++++++++- > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c > index 726282e74d54..de8ba5345e17 100644 > --- a/fs/xfs/xfs_ioctl.c > +++ b/fs/xfs/xfs_ioctl.c > @@ -1213,7 +1213,11 @@ xfs_file_ioctl( > struct xfs_buftarg *target = xfs_inode_buftarg(ip); > struct dioattr da; > > - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize; > + da.d_mem = target->bt_logical_sectorsize; > + if (xfs_is_cow_inode(ip)) > + da.d_miniosz = mp->m_sb.sb_blocksize; > + else > + da.d_miniosz = target->bt_logical_sectorsize; > da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); > > if (copy_to_user(arg, &da, sizeof(da))) > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > index 6b0228a21617..990df072ba35 100644 > --- a/fs/xfs/xfs_iops.c > +++ b/fs/xfs/xfs_iops.c > @@ -582,7 +582,20 @@ xfs_report_dioalign( > > stat->result_mask |= STATX_DIOALIGN; > stat->dio_mem_align = bdev_dma_alignment(bdev) + 1; > - stat->dio_offset_align = bdev_logical_block_size(bdev); > + > + /* > + * On COW inodes we are forced to always rewrite an entire file system > + * block. That's not quite accurate -- we're always forced to write an entire file allocation unit so that the rest of the bmap code doesn't have to deal with a file range that's mapped to multiple different space extents. For all the existing reflink scenarios the allocation unit is always an fsblock so this is a trifling difference. However, once we start adding reflink to the rt device then there comes the question of needing to handle allocation unit > fsblock, and all these bits would have to change. IOWs, I'm saying that this should be: if (xfs_is_cow_inode(ip)) stat->dio_offset_align = xfs_inode_alloc_unitsize(ip); else ... Though ATM this is a distinction that doesn't make a difference. --D > + * > + * Because applications assume they can do sector sized direct writes > + * on XFS we provide an emulation by doing a read-modify-write cycle > + * through the cache, but that is highly inefficient. Thus report the > + * natively supported size here. > + */ > + if (xfs_is_cow_inode(ip)) > + stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize; > + else > + stat->dio_offset_align = bdev_logical_block_size(bdev); > } > > static void > -- > 2.45.2 > >
On Thu, Dec 12, 2024 at 01:29:53PM -0800, Darrick J. Wong wrote: > > + /* > > + * On COW inodes we are forced to always rewrite an entire file system > > + * block. > > That's not quite accurate -- we're always forced to write an entire file > allocation unit so that the rest of the bmap code doesn't have to deal > with a file range that's mapped to multiple different space extents. > For all the existing reflink scenarios the allocation unit is always an > fsblock so this is a trifling difference. Which right now is a block until we support horrors such as reflink on larget rtextsize inodes or the forcealign stuff. But yes, it could use the helper.
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 726282e74d54..de8ba5345e17 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1213,7 +1213,11 @@ xfs_file_ioctl( struct xfs_buftarg *target = xfs_inode_buftarg(ip); struct dioattr da; - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize; + da.d_mem = target->bt_logical_sectorsize; + if (xfs_is_cow_inode(ip)) + da.d_miniosz = mp->m_sb.sb_blocksize; + else + da.d_miniosz = target->bt_logical_sectorsize; da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); if (copy_to_user(arg, &da, sizeof(da))) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 6b0228a21617..990df072ba35 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -582,7 +582,20 @@ xfs_report_dioalign( stat->result_mask |= STATX_DIOALIGN; stat->dio_mem_align = bdev_dma_alignment(bdev) + 1; - stat->dio_offset_align = bdev_logical_block_size(bdev); + + /* + * On COW inodes we are forced to always rewrite an entire file system + * block. + * + * Because applications assume they can do sector sized direct writes + * on XFS we provide an emulation by doing a read-modify-write cycle + * through the cache, but that is highly inefficient. Thus report the + * natively supported size here. + */ + if (xfs_is_cow_inode(ip)) + stat->dio_offset_align = ip->i_mount->m_sb.sb_blocksize; + else + stat->dio_offset_align = bdev_logical_block_size(bdev); } static void
For I/O to reflinked blocks we always need to write an entire new file system block, and the code enforces the file system block alignment for the entire file if it has any reflinked blocks. Unfortunately the reported dio alignment can only report a single value for reads and writes, so unless we want to trigger these read-modify write cycles all the time, we need to increase both limits. Without this zoned xfs triggers the warnings about failed page cache invalidation in kiocb_invalidate_post_direct_write all the time when running generic/551 when running on a 512 byte sector device, and eventually fails the test due to miscompares. Hopefully we can add a separate read alignment to statx eventually. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_ioctl.c | 6 +++++- fs/xfs/xfs_iops.c | 15 ++++++++++++++- 2 files changed, 19 insertions(+), 2 deletions(-)