Message ID | 20190520085700.29424-1-fdmanana@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | Btrfs: incremental send, fix emission of invalid clone operations |

On Mon, May 20, 2019 at 09:57:00AM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When doing an incremental send we can now issue clone operations with a
> source range that ends at the source's file eof and with a destination
> range that ends at an offset smaller than the destination's file eof.
> If the eof of the source file is not aligned to the sector size of the
> filesystem, the receiver will get a -EINVAL error when trying to do the
> operation or, on older kernels, silently corrupt the destination file.
> The corruption happens on kernels without commit ac765f83f1397646
> ("Btrfs: fix data corruption due to cloning of eof block"), while the
> failure to clone happens on kernels with that commit.
>
> Example reproducer:
>
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt/sdb
>
>   $ xfs_io -f -c "pwrite -S 0xb1 0 2M" /mnt/sdb/foo
>   $ xfs_io -f -c "pwrite -S 0xc7 0 2M" /mnt/sdb/bar
>   $ xfs_io -f -c "pwrite -S 0x4d 0 2M" /mnt/sdb/baz
>   $ xfs_io -f -c "pwrite -S 0xe2 0 2M" /mnt/sdb/zoo
>
>   $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/base
>
>   $ btrfs send -f /tmp/base.send /mnt/sdb/base
>
>   $ xfs_io -c "reflink /mnt/sdb/bar 1560K 500K 100K" /mnt/sdb/bar
>   $ xfs_io -c "reflink /mnt/sdb/bar 1560K 0 100K" /mnt/sdb/zoo
>   $ xfs_io -c "truncate 550K" /mnt/sdb/bar
>
>   $ btrfs subvolume snapshot -r /mnt/sdb /mnt/sdb/incr
>
>   $ btrfs send -f /tmp/incr.send -p /mnt/sdb/base /mnt/sdb/incr
>
>   $ mkfs.btrfs -f /dev/sdc
>   $ mount /dev/sdc /mnt/sdc
>
>   $ btrfs receive -f /tmp/base.send /mnt/sdc
>   $ btrfs receive -vv -f /tmp/incr.send /mnt/sdc
>   (...)
>   truncate bar size=563200
>   utimes bar
>   clone zoo - source=bar source offset=512000 offset=0 length=51200
>   ERROR: failed to clone extents to zoo
>   Invalid argument
>
> The failure happens because the clone source range ends at the eof of file
> bar, 563200, which is not aligned to the filesystem's sector size (4Kb in
> this case), and the destination range ends at offset 0 + 51200, which is
> less than the size of the file zoo (2Mb).
>
> So fix this by detecting such a case and, instead of issuing a clone
> operation for the whole range, do a clone operation for a smaller range
> that is sector size aligned followed by a write operation for the block
> containing the eof. Here we will always be pessimistic and assume the
> destination filesystem of the send stream has the largest possible sector
> size (64Kb), since we have no way of determining it.
>
> This fixes a recent regression introduced in kernel 5.2-rc1.
>
> Fixes: 040ee6120cb6706 ("Btrfs: send, improve clone range")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Added to 5.2-rc queue, thanks.
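
For readers who want to check the numbers from the reproducer, here is a minimal user-space sketch of the condition the quoted changelog describes. It is not the kernel's actual check: `is_aligned()` and `clone_rejected()` are hypothetical helpers written for illustration, and the three-part condition is taken only from the explanation above (source range ends at an unaligned source eof, destination range ends before the destination eof).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when x is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/*
 * Hypothetical model of the failing case described in the changelog:
 * the clone source range ends at the source file's eof, that eof is not
 * sector size aligned, and the destination range ends before the
 * destination file's eof.
 */
static bool clone_rejected(uint64_t src_off, uint64_t len, uint64_t src_size,
			   uint64_t dst_off, uint64_t dst_size,
			   uint64_t sectorsize)
{
	const uint64_t src_end = src_off + len;

	return src_end == src_size &&
	       !is_aligned(src_end, sectorsize) &&
	       dst_off + len < dst_size;
}
```

With the values from the receive log (source offset 512000, length 51200, bar size 563200, destination offset 0, zoo size 2M, 4K sectors), `clone_rejected()` returns true: 563200 is 137.5 sectors, and 0 + 51200 is well below 2097152.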

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 1549d0639b57..66db1271a3cb 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5224,10 +5224,50 @@ static int clone_range(struct send_ctx *sctx,
 		clone_len = min_t(u64, ext_len, len);
 
 		if (btrfs_file_extent_disk_bytenr(leaf, ei) == disk_byte &&
-		    clone_data_offset == data_offset)
-			ret = send_clone(sctx, offset, clone_len, clone_root);
-		else
+		    clone_data_offset == data_offset) {
+			const u64 src_end = clone_root->offset + clone_len;
+			const u64 sectorsize = SZ_64K;
+
+			/*
+			 * We can't clone the last block, when its size is not
+			 * sector size aligned, into the middle of a file. If we
+			 * do so, the receiver will get a failure (-EINVAL) when
+			 * trying to clone or will silently corrupt the data in
+			 * the destination file if it's on a kernel without the
+			 * fix introduced by commit ac765f83f1397646
+			 * ("Btrfs: fix data corruption due to cloning of eof
+			 * block).
+			 *
+			 * So issue a clone of the aligned down range plus a
+			 * regular write for the eof block, if we hit that case.
+			 *
+			 * Also, we use the maximum possible sector size, 64K,
+			 * because we don't know what's the sector size of the
+			 * filesystem that receives the stream, so we have to
+			 * assume the largest possible sector size.
+			 */
+			if (src_end == clone_src_i_size &&
+			    !IS_ALIGNED(src_end, sectorsize) &&
+			    offset + clone_len < sctx->cur_inode_size) {
+				u64 slen;
+
+				slen = ALIGN_DOWN(src_end - clone_root->offset,
+						  sectorsize);
+				if (slen > 0) {
+					ret = send_clone(sctx, offset, slen,
+							 clone_root);
+					if (ret < 0)
+						goto out;
+				}
+				ret = send_extent_data(sctx, offset + slen,
+						       clone_len - slen);
+			} else {
+				ret = send_clone(sctx, offset, clone_len,
+						 clone_root);
+			}
+		} else {
 			ret = send_extent_data(sctx, offset, clone_len);
+		}
 
 		if (ret < 0)
 			goto out;
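
The split arithmetic in the hunk above can be tried outside the kernel. The following is only a sketch under stated assumptions: `struct clone_plan`, `align_down()` and `plan_clone()` are hypothetical names invented for this illustration, `MAX_SECTORSIZE` stands in for `SZ_64K`, and the function reports how many bytes would go out as a clone and how many as a regular write instead of calling `send_clone()` and `send_extent_data()`.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SECTORSIZE (64 * 1024ULL)	/* stands in for SZ_64K in the patch */

/* Hypothetical result type: how one range is split by the sender. */
struct clone_plan {
	uint64_t clone_len;	/* bytes sent as a clone operation */
	uint64_t write_len;	/* bytes sent as a regular write */
};

/* Round x down to a multiple of align; align must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Mirrors the condition and the slen computation from the hunk:
 * src_end - src_off equals len, so the length is what gets aligned down.
 */
static struct clone_plan plan_clone(uint64_t src_off, uint64_t len,
				    uint64_t src_size, uint64_t dst_off,
				    uint64_t dst_size)
{
	const uint64_t src_end = src_off + len;
	struct clone_plan plan = { .clone_len = len, .write_len = 0 };

	if (src_end == src_size &&
	    (src_end & (MAX_SECTORSIZE - 1)) != 0 &&
	    dst_off + len < dst_size) {
		plan.clone_len = align_down(src_end - src_off, MAX_SECTORSIZE);
		plan.write_len = len - plan.clone_len;
	}
	return plan;
}
```

For the range from the reproducer (source offset 512000, length 51200, bar size 563200, destination offset 0, zoo size 2M) the aligned-down length is 0, so under this model no clone is issued and all 51200 bytes are sent as a write. A longer range, for example source offset 0 with length 563200, would split into a 524288 byte clone and a 38912 byte write.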