Message ID | 20210226002030.653855-10-ruansy.fnst@fujitsu.com
---|---
State | New, archived
Series | fsdax,xfs: Add reflink&dedupe support for fsdax
On Fri, Feb 26, 2021 at 08:20:29AM +0800, Shiyang Ruan wrote:
>  	error = iomap_zero_range(VFS_I(ip), offset, len, NULL,
> -			&xfs_buffered_write_iomap_ops);
> +			IS_DAX(VFS_I(ip)) ?
> +			&xfs_dax_write_iomap_ops : &xfs_buffered_write_iomap_ops);

Please add a xfs_zero_range helper that picks the right iomap_ops
instead of open coding this in a few places.

> +static int
> +xfs_dax_write_iomap_end(
> +	struct inode		*inode,
> +	loff_t			pos,
> +	loff_t			length,
> +	ssize_t			written,
> +	unsigned int		flags,
> +	struct iomap		*iomap)
> +{
> +	int			error = 0;
> +	xfs_inode_t		*ip = XFS_I(inode);
> +
> +	if (pos + written > i_size_read(inode)) {
> +		i_size_write(inode, pos + written);
> +		error = xfs_setfilesize(ip, pos, written);
> +	}
> +	if (xfs_is_cow_inode(ip))
> +		error = xfs_reflink_end_cow(ip, pos, written);
> +
> +	return error;

What is the advantage of the iomap_end handler here?  It adds another
indirect function call to the fast path, so if we can avoid it, I'd
rather do that.

Also, shouldn't we cancel the COW rather than finishing it when setting
the file size fails?
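For reference, a minimal sketch of the helper being asked for, assuming the
name xfs_zero_range and that it sits next to the other iomap wrappers
(neither detail comes from the posted series), could look like this:

static int
xfs_zero_range(
	struct xfs_inode	*ip,
	loff_t			pos,
	loff_t			len,
	bool			*did_zero)
{
	struct inode		*inode = VFS_I(ip);

	/* pick the DAX or buffered zeroing ops in one place */
	return iomap_zero_range(inode, pos, len, did_zero,
			IS_DAX(inode) ?
				&xfs_dax_write_iomap_ops :
				&xfs_buffered_write_iomap_ops);
}

Callers such as xfs_free_file_space() and xfs_reflink_zero_posteof() would
then do "error = xfs_zero_range(ip, offset, len, NULL);" instead of open
coding the ops selection.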
> 
> On Fri, Feb 26, 2021 at 08:20:29AM +0800, Shiyang Ruan wrote:
> >  	error = iomap_zero_range(VFS_I(ip), offset, len, NULL,
> > -			&xfs_buffered_write_iomap_ops);
> > +			IS_DAX(VFS_I(ip)) ?
> > +			&xfs_dax_write_iomap_ops : &xfs_buffered_write_iomap_ops);
> 
> Please add a xfs_zero_range helper that picks the right iomap_ops
> instead of open coding this in a few places.

OK.  I'll add it.

> 
> > +static int
> > +xfs_dax_write_iomap_end(
> > +	struct inode		*inode,
> > +	loff_t			pos,
> > +	loff_t			length,
> > +	ssize_t			written,
> > +	unsigned int		flags,
> > +	struct iomap		*iomap)
> > +{
> > +	int			error = 0;
> > +	xfs_inode_t		*ip = XFS_I(inode);
> > +
> > +	if (pos + written > i_size_read(inode)) {
> > +		i_size_write(inode, pos + written);
> > +		error = xfs_setfilesize(ip, pos, written);
> > +	}
> > +	if (xfs_is_cow_inode(ip))
> > +		error = xfs_reflink_end_cow(ip, pos, written);
> > +
> > +	return error;
> 
> What is the advantage of the iomap_end handler here?  It adds another
> indirect function call to the fast path, so if we can avoid it, I'd
> rather do that.

This code was in xfs_file_dax_write().  I moved it into the iomap_end
because the mmapped CoW path needs it.

I know this is not ideal, but I could not find a better way.  Do you
have any ideas?

> Also, shouldn't we cancel the COW rather than finishing it when setting
> the file size fails?
> 

I did forget about this part.  Thanks for pointing out.


--
Thanks,
Ruan Shiyang.
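On the last point, one possible shape for the fix in
xfs_dax_write_iomap_end(), assuming xfs_reflink_cancel_cow_range() is the
right way to back out the reservation (a sketch, not code from a later
revision of this series):

	if (pos + written > i_size_read(inode)) {
		i_size_write(inode, pos + written);
		error = xfs_setfilesize(ip, pos, written);
		if (error) {
			/* drop the COW fork blocks instead of remapping them */
			if (xfs_is_cow_inode(ip))
				xfs_reflink_cancel_cow_range(ip, pos, written,
						true);
			return error;
		}
	}
	if (xfs_is_cow_inode(ip))
		error = xfs_reflink_end_cow(ip, pos, written);

	return error;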
On Wed, Mar 03, 2021 at 09:57:48AM +0000, ruansy.fnst@fujitsu.com wrote:
> > What is the advantage of the iomap_end handler here?  It adds another
> > indirect function call to the fast path, so if we can avoid it, I'd
> > rather do that.
> 
> This code was in xfs_file_dax_write().  I moved it into the iomap_end
> because the mmapped CoW path needs it.
> 
> I know this is not ideal, but I could not find a better way.  Do you
> have any ideas?

Is the mmapped copy the copy_edge case?  Maybe just use different
iomap_ops for that case vs the plain write?
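Read literally, that suggestion might look like the sketch below: let
xfs_file_dax_write() keep doing the size update and xfs_reflink_end_cow()
itself after dax_iomap_rw(), and give only the page fault path an ops table
with an ->iomap_end.  The name xfs_dax_fault_iomap_ops is invented here for
illustration, not taken from the series.

const struct iomap_ops xfs_dax_fault_iomap_ops = {
	/* same mapping logic as the write path ... */
	.iomap_begin	= xfs_direct_write_iomap_begin,
	/* ... but only the fault path pays for the iomap_end remap */
	.iomap_end	= xfs_dax_write_iomap_end,
};

That would keep the extra indirect call off the plain DAX write fast path.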
> On Wed, Mar 03, 2021 at 09:57:48AM +0000, ruansy.fnst@fujitsu.com wrote:
> > > What is the advantage of the iomap_end handler here?  It adds another
> > > indirect function call to the fast path, so if we can avoid it, I'd
> > > rather do that.
> > 
> > This code was in xfs_file_dax_write().  I moved it into the iomap_end
> > because the mmapped CoW path needs it.
> > 
> > I know this is not ideal, but I could not find a better way.  Do you
> > have any ideas?
> 
> Is the mmapped copy the copy_edge case?  Maybe just use different
> iomap_ops for that case vs the plain write?

No, I mean the mmapped CoW needs an xfs_reflink_end_cow() to make sure
the new extent is correctly remapped to the file.  Otherwise, the file
will still refer to the extent that the srcmap points to.

We are able to call this in xfs_file_dax_write(), but in the mmap path
there is nowhere to call it except iomap_end.


--
Thanks,
Ruan Shiyang.
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 7371a7f7c652..65a8782b6378 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -977,7 +977,8 @@ xfs_free_file_space(
 	if (offset + len > XFS_ISIZE(ip))
 		len = XFS_ISIZE(ip) - offset;
 	error = iomap_zero_range(VFS_I(ip), offset, len, NULL,
-			&xfs_buffered_write_iomap_ops);
+			IS_DAX(VFS_I(ip)) ?
+			&xfs_dax_write_iomap_ops : &xfs_buffered_write_iomap_ops);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..1987d15eab61 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -623,11 +623,8 @@ xfs_file_dax_write(
 	count = iov_iter_count(from);
 
 	trace_xfs_file_dax_write(ip, count, pos);
-	ret = dax_iomap_rw(iocb, from, &xfs_direct_write_iomap_ops);
-	if (ret > 0 && iocb->ki_pos > i_size_read(inode)) {
-		i_size_write(inode, iocb->ki_pos);
-		error = xfs_setfilesize(ip, pos, ret);
-	}
+	ret = dax_iomap_rw(iocb, from, &xfs_dax_write_iomap_ops);
+
 out:
 	xfs_iunlock(ip, iolock);
 	if (error)
@@ -1250,7 +1247,7 @@ __xfs_filemap_fault(
 
 		ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL,
 				(write_fault && !vmf->cow_page) ?
-				&xfs_direct_write_iomap_ops :
+				&xfs_dax_write_iomap_ops :
 				&xfs_read_iomap_ops);
 		if (ret & VM_FAULT_NEEDDSYNC)
 			ret = dax_finish_sync_fault(vmf, pe_size, pfn);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7b9ff824e82d..23c6f8c97047 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -771,7 +771,8 @@ xfs_direct_write_iomap_begin(
 
 	/* may drop and re-acquire the ilock */
 	error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
-			&lockmode, flags & IOMAP_DIRECT);
+			&lockmode,
+			flags & IOMAP_DIRECT || IS_DAX(inode));
 	if (error)
 		goto out_unlock;
 	if (shared)
@@ -850,6 +851,33 @@ const struct iomap_ops xfs_direct_write_iomap_ops = {
 	.iomap_begin		= xfs_direct_write_iomap_begin,
 };
 
+static int
+xfs_dax_write_iomap_end(
+	struct inode		*inode,
+	loff_t			pos,
+	loff_t			length,
+	ssize_t			written,
+	unsigned int		flags,
+	struct iomap		*iomap)
+{
+	int			error = 0;
+	xfs_inode_t		*ip = XFS_I(inode);
+
+	if (pos + written > i_size_read(inode)) {
+		i_size_write(inode, pos + written);
+		error = xfs_setfilesize(ip, pos, written);
+	}
+	if (xfs_is_cow_inode(ip))
+		error = xfs_reflink_end_cow(ip, pos, written);
+
+	return error;
+}
+
+const struct iomap_ops xfs_dax_write_iomap_ops = {
+	.iomap_begin		= xfs_direct_write_iomap_begin,
+	.iomap_end		= xfs_dax_write_iomap_end,
+};
+
 static int
 xfs_buffered_write_iomap_begin(
 	struct inode		*inode,
diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
index 7d3703556d0e..a361c2f27cf3 100644
--- a/fs/xfs/xfs_iomap.h
+++ b/fs/xfs/xfs_iomap.h
@@ -42,6 +42,7 @@ xfs_aligned_fsb_count(
 
 extern const struct iomap_ops xfs_buffered_write_iomap_ops;
 extern const struct iomap_ops xfs_direct_write_iomap_ops;
+extern const struct iomap_ops xfs_dax_write_iomap_ops;
 extern const struct iomap_ops xfs_read_iomap_ops;
 extern const struct iomap_ops xfs_seek_iomap_ops;
 extern const struct iomap_ops xfs_xattr_iomap_ops;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 67c8dc9de8aa..adf4467ab862 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -841,6 +841,7 @@ xfs_setattr_size(
 	int			error;
 	uint			lock_flags = 0;
 	bool			did_zeroing = false;
+	const struct iomap_ops	*ops;
 
 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
 	ASSERT(xfs_isilocked(ip, XFS_MMAPLOCK_EXCL));
@@ -887,10 +888,15 @@ xfs_setattr_size(
 	 * extension, or zeroing out the rest of the block on a downward
 	 * truncate.
 	 */
+	if (IS_DAX(inode))
+		ops = &xfs_direct_write_iomap_ops;
+	else
+		ops = &xfs_buffered_write_iomap_ops;
+
 	if (newsize > oldsize) {
 		trace_xfs_zero_eof(ip, oldsize, newsize - oldsize);
 		error = iomap_zero_range(inode, oldsize, newsize - oldsize,
-				&did_zeroing, &xfs_buffered_write_iomap_ops);
+				&did_zeroing, ops);
 	} else {
 		/*
 		 * iomap won't detect a dirty page over an unwritten block (or a
@@ -902,8 +908,7 @@ xfs_setattr_size(
 				newsize);
 		if (error)
 			return error;
-		error = iomap_truncate_page(inode, newsize, &did_zeroing,
-				&xfs_buffered_write_iomap_ops);
+		error = iomap_truncate_page(inode, newsize, &did_zeroing, ops);
 	}
 
 	if (error)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index f5b3a3da36b7..dfe4e1912ff9 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1246,6 +1246,8 @@ xfs_reflink_zero_posteof(
 
 	trace_xfs_zero_eof(ip, isize, pos - isize);
 	return iomap_zero_range(VFS_I(ip), isize, pos - isize, NULL,
+			IS_DAX(VFS_I(ip)) ?
+			&xfs_dax_write_iomap_ops :
 			&xfs_buffered_write_iomap_ops);
 }
 
In fsdax mode, WRITE and ZERO on a shared extent need CoW to be
performed.  After CoW, the newly allocated extents need to be remapped
to the file.  So, add an iomap_end for the dax write ops to do the
remapping work.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 fs/xfs/xfs_bmap_util.c |  3 ++-
 fs/xfs/xfs_file.c      |  9 +++------
 fs/xfs/xfs_iomap.c     | 30 +++++++++++++++++++++++++++++-
 fs/xfs/xfs_iomap.h     |  1 +
 fs/xfs/xfs_iops.c      | 11 ++++++++---
 fs/xfs/xfs_reflink.c   |  2 ++
 6 files changed, 45 insertions(+), 11 deletions(-)