Message ID | 20180917205354.15401-8-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/10] xfs: fix transaction leak in xfs_reflink_allocate_cow() | expand |
On Mon, Sep 17, 2018 at 10:53:51PM +0200, Christoph Hellwig wrote: > For files with extent size hints we always allocate unwritten extents > first and then convert them to real extents later to avoid exposing > stale data outside the blocks actually written. But there is no > reason we can't do unwritten extent allocations from delalloc blocks, > in fact we already do that for COW operations. > > Note that this does not handle files on the RT subvolume (yet), as > we removed code handling RT allocations from the delalloc path this > would be a slightly bigger project, and it doesn't really help with > simplifying the COW path either. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_iomap.c | 17 ++++------------- > 1 file changed, 4 insertions(+), 13 deletions(-) > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index e12ff5e9a8ec..2d08ace09e25 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -533,7 +533,6 @@ xfs_file_iomap_begin_delay( > xfs_fsblock_t prealloc_blocks = 0; > > ASSERT(!XFS_IS_REALTIME_INODE(ip)); > - ASSERT(!xfs_get_extsz_hint(ip)); > > xfs_ilock(ip, XFS_ILOCK_EXCL); > > @@ -1035,7 +1034,7 @@ xfs_file_iomap_begin( > return -EIO; > > if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && !(flags & IOMAP_DIRECT) && > - !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) { > + !IS_DAX(inode) && !XFS_IS_REALTIME_INODE(ip)) { AFAICT this fails to honor the extent size hint for buffered writes: # xfs_io -fc "truncate 1m" -c "extsize 32k" -c "pwrite 0 4k" -c "fiemap -v" /mnt/file wrote 4096/4096 bytes at offset 0 4 KiB, 1 ops; 0.0001 sec (35.511 MiB/sec and 9090.9091 ops/sec) /mnt/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..7]: 80..87 8 0x1 1: [8..2047]: hole 2040 ... and with an unpatched kernel: # xfs_io -fc "truncate 1m" -c "extsize 32k" -c "pwrite 0 4k" -c "fiemap -v" /mnt/file wrote 4096/4096 bytes at offset 0 4 KiB, 1 ops; 0.0017 sec (2.283 MiB/sec and 584.4535 ops/sec) /mnt/file: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..7]: 192..199 8 0x0 1: [8..63]: 200..255 56 0x801 2: [64..2047]: hole 1984 I'd guess this is due to the extent overlap check in xfs_bmap_extsize_align(), but I haven't verified. Brian > /* Reserve delalloc blocks for regular writeback. */ > return xfs_file_iomap_begin_delay(inode, offset, length, flags, > iomap); > @@ -1088,17 +1087,9 @@ xfs_file_iomap_begin( > * been done up front, so we don't need to do them here. > */ > if (xfs_is_reflink_inode(ip)) { > - if (flags & IOMAP_DIRECT) { > - /* may drop and re-acquire the ilock */ > - error = xfs_reflink_allocate_cow(ip, &imap, &shared, > - &lockmode); > - if (error) > - goto out_unlock; > - } else { > - error = xfs_reflink_reserve_cow(ip, &imap, &shared); > - if (error) > - goto out_unlock; > - } > + error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode); > + if (error) > + goto out_unlock; > > end_fsb = imap.br_startoff + imap.br_blockcount; > length = XFS_FSB_TO_B(mp, end_fsb) - offset; > -- > 2.18.0 >
On Wed, Sep 26, 2018 at 11:17:24AM -0400, Brian Foster wrote: > I'd guess this is due to the extent overlap check in > xfs_bmap_extsize_align(), but I haven't verified. The problem is that we don't honor extent size hints at all when converting delalloc to real extents. This will be a much bigger change, so I'll defer it for now.
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index e12ff5e9a8ec..2d08ace09e25 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -533,7 +533,6 @@ xfs_file_iomap_begin_delay( xfs_fsblock_t prealloc_blocks = 0; ASSERT(!XFS_IS_REALTIME_INODE(ip)); - ASSERT(!xfs_get_extsz_hint(ip)); xfs_ilock(ip, XFS_ILOCK_EXCL); @@ -1035,7 +1034,7 @@ xfs_file_iomap_begin( return -EIO; if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && !(flags & IOMAP_DIRECT) && - !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) { + !IS_DAX(inode) && !XFS_IS_REALTIME_INODE(ip)) { /* Reserve delalloc blocks for regular writeback. */ return xfs_file_iomap_begin_delay(inode, offset, length, flags, iomap); @@ -1088,17 +1087,9 @@ xfs_file_iomap_begin( * been done up front, so we don't need to do them here. */ if (xfs_is_reflink_inode(ip)) { - if (flags & IOMAP_DIRECT) { - /* may drop and re-acquire the ilock */ - error = xfs_reflink_allocate_cow(ip, &imap, &shared, - &lockmode); - if (error) - goto out_unlock; - } else { - error = xfs_reflink_reserve_cow(ip, &imap, &shared); - if (error) - goto out_unlock; - } + error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode); + if (error) + goto out_unlock; end_fsb = imap.br_startoff + imap.br_blockcount; length = XFS_FSB_TO_B(mp, end_fsb) - offset;
For files with extent size hints we always allocate unwritten extents first and then convert them to real extents later to avoid exposing stale data outside the blocks actually written. But there is no reason we can't do unwritten extent allocations from delalloc blocks, in fact we already do that for COW operations. Note that this does not handle files on the RT subvolume (yet), as we removed code handling RT allocations from the delalloc path this would be a slightly bigger project, and it doesn't really help with simplifying the COW path either. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_iomap.c | 17 ++++------------- 1 file changed, 4 insertions(+), 13 deletions(-)