Btrfs: send, fix missing truncate for inode with prealloc extent past eof
diff mbox

Message ID 20180430180507.21751-1-fdmanana@kernel.org
State New
Headers show

Commit Message

Filipe Manana April 30, 2018, 6:05 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

An incremental send operation can miss a truncate operation when an inode
has an increased size in the send snapshot and a prealloc extent beyond
its size.

Consider the following scenario where a necessary truncate operation is
missing in the incremental send stream:

1) In the parent snapshot an inode has a size of 1282957 bytes and it has
   no prealloc extents beyond its size;

2) In the the send snapshot it has a size of 5738496 bytes and has a new
   extent at offsets 1884160 (length of 106496 bytes) and a prealloc
   extent beyond eof at offset 6729728 (and a length of 339968 bytes);

3) When processing the prealloc extent, at offset 6729728, we end up at
   send.c:send_write_or_clone() and set the @len variable to a value of
   18446744073708560384 because @offset plus the original @len value is
   larger then the inode's size (6729728 + 339968 > 5738496). We then
   call send_extent_data(), with that @offset and @len, which in turn
   calls send_write(), and then the later calls fill_read_buf(). Because
   the offset passed to fill_read_buf() is greater then inode's i_size,
   this function returns 0 immediately, which makes send_write() and
   send_extent_data() do nothing and return immediately as well. When
   we get back to send.c:send_write_or_clone() we adjust the value
   of sctx->cur_inode_next_write_offset to @offset plus @len, which
   corresponds to 6729728 + 18446744073708560384 = 5738496, which is
   precisely the the size of the inode in the send snapshot;

4) Later when at send.c:finish_inode_if_needed() we determine that
   we don't need to issue a truncate operation because the value of
   sctx->cur_inode_next_write_offset corresponds to the inode's new
   size, 5738496 bytes. This is wrong because the last write operation
   that was issued started at offset 1884160 with a length of 106496
   bytes, so the correct value for sctx->cur_inode_next_write_offset
   should be 1990656 (1884160 + 106496), so that a truncate operation
   with a value of 5738496 bytes would have been sent to insert a
   trailing hole at the destination.

So fix the issue by making send.c:send_write_or_clone() not attempt
to send write or clone operations for extents that start beyond the
inode's size, since such attempts do nothing but waste time by
calling helper functions and allocating path structures, and send
currently has no fallocate command in order to create prealloc extents
at the destination (either beyond a file's eof or not).

The issue was found running the test btrfs/007 from fstests using a seed
value of 1524346151 for fsstress.

Reported-by: Gu, Jinxiang <gujx@cn.fujitsu.com>
Fixes: ffa7c4296e93 ("Btrfs: send, do not issue unnecessary truncate operations")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/send.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

David Sterba May 1, 2018, 2:55 p.m. UTC | #1
On Mon, Apr 30, 2018 at 07:05:07PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> An incremental send operation can miss a truncate operation when an inode
> has an increased size in the send snapshot and a prealloc extent beyond
> its size.
> 
> Consider the following scenario where a necessary truncate operation is
> missing in the incremental send stream:
> 
> 1) In the parent snapshot an inode has a size of 1282957 bytes and it has
>    no prealloc extents beyond its size;
> 
> 2) In the the send snapshot it has a size of 5738496 bytes and has a new
>    extent at offsets 1884160 (length of 106496 bytes) and a prealloc
>    extent beyond eof at offset 6729728 (and a length of 339968 bytes);
> 
> 3) When processing the prealloc extent, at offset 6729728, we end up at
>    send.c:send_write_or_clone() and set the @len variable to a value of
>    18446744073708560384 because @offset plus the original @len value is
>    larger then the inode's size (6729728 + 339968 > 5738496). We then
>    call send_extent_data(), with that @offset and @len, which in turn
>    calls send_write(), and then the later calls fill_read_buf(). Because
>    the offset passed to fill_read_buf() is greater then inode's i_size,
>    this function returns 0 immediately, which makes send_write() and
>    send_extent_data() do nothing and return immediately as well. When
>    we get back to send.c:send_write_or_clone() we adjust the value
>    of sctx->cur_inode_next_write_offset to @offset plus @len, which
>    corresponds to 6729728 + 18446744073708560384 = 5738496, which is
>    precisely the the size of the inode in the send snapshot;
> 
> 4) Later when at send.c:finish_inode_if_needed() we determine that
>    we don't need to issue a truncate operation because the value of
>    sctx->cur_inode_next_write_offset corresponds to the inode's new
>    size, 5738496 bytes. This is wrong because the last write operation
>    that was issued started at offset 1884160 with a length of 106496
>    bytes, so the correct value for sctx->cur_inode_next_write_offset
>    should be 1990656 (1884160 + 106496), so that a truncate operation
>    with a value of 5738496 bytes would have been sent to insert a
>    trailing hole at the destination.
> 
> So fix the issue by making send.c:send_write_or_clone() not attempt
> to send write or clone operations for extents that start beyond the
> inode's size, since such attempts do nothing but waste time by
> calling helper functions and allocating path structures, and send
> currently has no fallocate command in order to create prealloc extents
> at the destination (either beyond a file's eof or not).
> 
> The issue was found running the test btrfs/007 from fstests using a seed
> value of 1524346151 for fsstress.
> 
> Reported-by: Gu, Jinxiang <gujx@cn.fujitsu.com>
> Fixes: ffa7c4296e93 ("Btrfs: send, do not issue unnecessary truncate operations")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Added to next, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 1f5748c7d1c7..fff44ed15927 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5249,6 +5249,10 @@  static int send_write_or_clone(struct send_ctx *sctx,
 		len = btrfs_file_extent_num_bytes(path->nodes[0], ei);
 	}
 
+	if (offset >= sctx->cur_inode_size) {
+		ret = 0;
+		goto out;
+	}
 	if (offset + len > sctx->cur_inode_size)
 		len = sctx->cur_inode_size - offset;
 	if (len == 0) {