diff mbox series

btrfs: test a compressed send stream scenario that triggered a read corruption

Message ID 330b0c61a77f01bd2aa57e9b500145178a2d751a.1721124764.git.fdmanana@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: test a compressed send stream scenario that triggered a read corruption | expand

Commit Message

Filipe Manana July 16, 2024, 10:14 a.m. UTC
From: Filipe Manana <fdmanana@suse.com>

Test a scenario of a compressed send stream that triggered a bug in the
extent map merging code introduced in the merge window for 6.11.

The commit that introduced the bug is on its way to Linus' tree and its
subject is:

   "btrfs: introduce new members for extent_map"

The corresponding fix was submitted to the btrfs mailing list, with the
subject:

  "btrfs: fix corrupt read due to bad offset of a compressed extent map"

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 tests/btrfs/312     | 115 ++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/312.out |   7 +++
 2 files changed, 122 insertions(+)
 create mode 100755 tests/btrfs/312
 create mode 100644 tests/btrfs/312.out

Comments

Qu Wenruo July 16, 2024, 11:11 a.m. UTC | #1
在 2024/7/16 19:44, fdmanana@kernel.org 写道:
> From: Filipe Manana <fdmanana@suse.com>
>
> Test a scenario of a compressed send stream that triggered a bug in the
> extent map merging code introduced in the merge window for 6.11.
>
> The commit that introduced the bug is on its way to Linus' tree and its
> subject is:
>
>     "btrfs: introduce new members for extent_map"
>
> The corresponding fix was submitted to the btrfs mailing list, with the
> subject:
>
>    "btrfs: fix corrupt read due to bad offset of a compressed extent map"
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>   tests/btrfs/312     | 115 ++++++++++++++++++++++++++++++++++++++++++++
>   tests/btrfs/312.out |   7 +++
>   2 files changed, 122 insertions(+)
>   create mode 100755 tests/btrfs/312
>   create mode 100644 tests/btrfs/312.out
>
> diff --git a/tests/btrfs/312 b/tests/btrfs/312
> new file mode 100755
> index 00000000..ebecadc6
> --- /dev/null
> +++ b/tests/btrfs/312
> @@ -0,0 +1,115 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 312
> +#
> +# Test a scenario of a compressed send stream that triggered a bug in the extent
> +# map merging code introduced in the merge window for 6.11.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick send compress
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -fr $tmp.*
> +	rm -fr $send_files_dir
> +}
> +
> +. ./common/filter
> +
> +_require_btrfs_send_version 2
> +_require_test
> +_require_scratch
> +
> +_fixed_by_kernel_commit XXXXXXXXXXXX \
> +	"btrfs: fix corrupt read due to bad offset of a compressed extent map"
> +
> +send_files_dir=$TEST_DIR/btrfs-test-$seq
> +
> +rm -fr $send_files_dir
> +mkdir $send_files_dir
> +first_stream="$send_files_dir/1.send"
> +second_stream="$send_files_dir/2.send"
> +
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "first mkfs failed"
> +_scratch_mount -o compress
> +
> +# Create a compressed extent for the range [108K, 144K[. Since it's a
> +# non-aligned start offset, the first 3K of the extent are filled with zeroes.
> +# The i_size is set to 141K.
> +$XFS_IO_PROG -f -c "pwrite -S 0xab 111K 30K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap1 >> $seqres.full
> +
> +# Overwrite a part of the extent we created before.
> +# This will make the send stream include an encoded write (compressed) for the
> +# file range [120K, 128K[.
> +$XFS_IO_PROG -c "pwrite -S 0xcd 120K 8K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +# Now do write after those extents and leaving a hole in between.
> +# This results in expanding the last block of the extent we first created, that
> +# is, in filling with zeroes the file range [141K, 144K[ (3072 bytes), which
> +# belongs to the block in the range [140K, 144K[.
> +#
> +# When the destination filesystem receives from the send stream a write for that
> +# range ([140K, 144K[) it does a btrfs_get_extent() call to find the extent map
> +# containing the offset 140K. There's no loaded extent map covering that range
> +# so it will lookg at the subvolume tree to find a file extent item covering the
> +# range and then finds the file extent item covering the range [108K, 144K[ which
> +# corresponds to the first extent written to the file, before snapshoting.
> +#
> +# Note that at this point in time the destination filesystem processed an encoded
> +# write for the range [120K, 128K[, which created a compressed extent map for
> +# that range and a corresponding ordered extent, which has not yet completed when
> +# it received the write command for the [140K, 144K[ range, so the corresponding
> +# file extent item is not yet in the subvolume tree - that only happens when the
> +# ordered extent completes, at btrfs_finish_one_ordered().
> +#
> +# So having found a file extent item for the range [108K, 144K[ where 140K falls
> +# into, it tries to add a compressed extent map for that range to the inode's
> +# extent map tree with a call to btrfs_add_extent_mapping() done at
> +# btrfs_get_extent(). That finds there's a loaded overlapping extent map for the
> +# range [120K, 128K[ (the extent from the previous encoded write) and then calls
> +# into merge_extent_mapping().
> +#
> +# The merging ended adjusting the extent map we attempted to insert, covering
> +# the range [108K, 144K[, to cover instead the range [128K, 144K[ (length 16K)
> +# instead, since there's an existing extent map for the range [120K, 128K[ and
> +# we are looking for a range starting at 140K (and ending at 144K). However it
> +# didn't adjust the extent map's offset from 0 to 20K, resulting in future reads
> +# reading the incorrect range from the underlying extent (108K to 124K, 16K of
> +# length, instead of the 128K to 144K range).
> +#
> +# Note that for the incorrect extent map, and therefore read corruption, to
> +# happen, we depend on specific timings - the ordered extent for the encoded
> +# write for the range [120K, 128K[ must not complete before the destination
> +# of the send stream receives the write command for the range [140K, 144K[.
> +#

Considering it's timing related, do we need multiple runs to make sure
we can hit it?

And I'm not sure if fstests is the best place to put all those detailed
info.
The _fixed_by_kernel_commit line looks enough to me.

Thanks,
Qu

> +$XFS_IO_PROG -c "pwrite -S 0xef 160K 4K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap2 >> $seqres.full
> +
> +echo "Checksums in the original filesystem:"
> +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> +
> +$BTRFS_UTIL_PROG send --compressed-data -q -f $first_stream $SCRATCH_MNT/snap1
> +$BTRFS_UTIL_PROG send --compressed-data -q -f $second_stream \
> +		 -p $SCRATCH_MNT/snap1 $SCRATCH_MNT/snap2
> +
> +_scratch_unmount
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "second mkfs failed"
> +_scratch_mount
> +
> +$BTRFS_UTIL_PROG receive -q -f $first_stream $SCRATCH_MNT
> +$BTRFS_UTIL_PROG receive -q -f $second_stream $SCRATCH_MNT
> +
> +echo "Checksums in the new filesystem:"
> +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out
> new file mode 100644
> index 00000000..75f1f4cc
> --- /dev/null
> +++ b/tests/btrfs/312.out
> @@ -0,0 +1,7 @@
> +QA output created by 312
> +Checksums in the original filesystem:
> +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
> +Checksums in the new filesystem:
> +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
Filipe Manana July 16, 2024, 11:16 a.m. UTC | #2
On Tue, Jul 16, 2024 at 12:11 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> 在 2024/7/16 19:44, fdmanana@kernel.org 写道:
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > Test a scenario of a compressed send stream that triggered a bug in the
> > extent map merging code introduced in the merge window for 6.11.
> >
> > The commit that introduced the bug is on its way to Linus' tree and its
> > subject is:
> >
> >     "btrfs: introduce new members for extent_map"
> >
> > The corresponding fix was submitted to the btrfs mailing list, with the
> > subject:
> >
> >    "btrfs: fix corrupt read due to bad offset of a compressed extent map"
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >   tests/btrfs/312     | 115 ++++++++++++++++++++++++++++++++++++++++++++
> >   tests/btrfs/312.out |   7 +++
> >   2 files changed, 122 insertions(+)
> >   create mode 100755 tests/btrfs/312
> >   create mode 100644 tests/btrfs/312.out
> >
> > diff --git a/tests/btrfs/312 b/tests/btrfs/312
> > new file mode 100755
> > index 00000000..ebecadc6
> > --- /dev/null
> > +++ b/tests/btrfs/312
> > @@ -0,0 +1,115 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
> > +#
> > +# FS QA Test 312
> > +#
> > +# Test a scenario of a compressed send stream that triggered a bug in the extent
> > +# map merging code introduced in the merge window for 6.11.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto quick send compress
> > +
> > +_cleanup()
> > +{
> > +     cd /
> > +     rm -fr $tmp.*
> > +     rm -fr $send_files_dir
> > +}
> > +
> > +. ./common/filter
> > +
> > +_require_btrfs_send_version 2
> > +_require_test
> > +_require_scratch
> > +
> > +_fixed_by_kernel_commit XXXXXXXXXXXX \
> > +     "btrfs: fix corrupt read due to bad offset of a compressed extent map"
> > +
> > +send_files_dir=$TEST_DIR/btrfs-test-$seq
> > +
> > +rm -fr $send_files_dir
> > +mkdir $send_files_dir
> > +first_stream="$send_files_dir/1.send"
> > +second_stream="$send_files_dir/2.send"
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "first mkfs failed"
> > +_scratch_mount -o compress
> > +
> > +# Create a compressed extent for the range [108K, 144K[. Since it's a
> > +# non-aligned start offset, the first 3K of the extent are filled with zeroes.
> > +# The i_size is set to 141K.
> > +$XFS_IO_PROG -f -c "pwrite -S 0xab 111K 30K" $SCRATCH_MNT/foo >> $seqres.full
> > +
> > +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap1 >> $seqres.full
> > +
> > +# Overwrite a part of the extent we created before.
> > +# This will make the send stream include an encoded write (compressed) for the
> > +# file range [120K, 128K[.
> > +$XFS_IO_PROG -c "pwrite -S 0xcd 120K 8K" $SCRATCH_MNT/foo >> $seqres.full
> > +
> > +# Now do write after those extents and leaving a hole in between.
> > +# This results in expanding the last block of the extent we first created, that
> > +# is, in filling with zeroes the file range [141K, 144K[ (3072 bytes), which
> > +# belongs to the block in the range [140K, 144K[.
> > +#
> > +# When the destination filesystem receives from the send stream a write for that
> > +# range ([140K, 144K[) it does a btrfs_get_extent() call to find the extent map
> > +# containing the offset 140K. There's no loaded extent map covering that range
> > +# so it will lookg at the subvolume tree to find a file extent item covering the
> > +# range and then finds the file extent item covering the range [108K, 144K[ which
> > +# corresponds to the first extent written to the file, before snapshoting.
> > +#
> > +# Note that at this point in time the destination filesystem processed an encoded
> > +# write for the range [120K, 128K[, which created a compressed extent map for
> > +# that range and a corresponding ordered extent, which has not yet completed when
> > +# it received the write command for the [140K, 144K[ range, so the corresponding
> > +# file extent item is not yet in the subvolume tree - that only happens when the
> > +# ordered extent completes, at btrfs_finish_one_ordered().
> > +#
> > +# So having found a file extent item for the range [108K, 144K[ where 140K falls
> > +# into, it tries to add a compressed extent map for that range to the inode's
> > +# extent map tree with a call to btrfs_add_extent_mapping() done at
> > +# btrfs_get_extent(). That finds there's a loaded overlapping extent map for the
> > +# range [120K, 128K[ (the extent from the previous encoded write) and then calls
> > +# into merge_extent_mapping().
> > +#
> > +# The merging ended adjusting the extent map we attempted to insert, covering
> > +# the range [108K, 144K[, to cover instead the range [128K, 144K[ (length 16K)
> > +# instead, since there's an existing extent map for the range [120K, 128K[ and
> > +# we are looking for a range starting at 140K (and ending at 144K). However it
> > +# didn't adjust the extent map's offset from 0 to 20K, resulting in future reads
> > +# reading the incorrect range from the underlying extent (108K to 124K, 16K of
> > +# length, instead of the 128K to 144K range).
> > +#
> > +# Note that for the incorrect extent map, and therefore read corruption, to
> > +# happen, we depend on specific timings - the ordered extent for the encoded
> > +# write for the range [120K, 128K[ must not complete before the destination
> > +# of the send stream receives the write command for the range [140K, 144K[.
> > +#
>
> Considering it's timing related, do we need multiple runs to make sure
> we can hit it?

We run fstests continuously, so there's not much point in that.

>
> And I'm not sure if fstests is the best place to put all those detailed
> info.

Why not?
I like it there, it explains in detail the problem and without any of
that, anyone reading the test will be puzzled about those specific
write ranges, ordering, etc.

> The _fixed_by_kernel_commit line looks enough to me.

Looking at the changelog of the kernel commit doesn't explain
everything, because this here is a specific context.
That's why I left all those comments explaining everything.


>
> Thanks,
> Qu
>
> > +$XFS_IO_PROG -c "pwrite -S 0xef 160K 4K" $SCRATCH_MNT/foo >> $seqres.full
> > +
> > +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap2 >> $seqres.full
> > +
> > +echo "Checksums in the original filesystem:"
> > +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> > +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> > +
> > +$BTRFS_UTIL_PROG send --compressed-data -q -f $first_stream $SCRATCH_MNT/snap1
> > +$BTRFS_UTIL_PROG send --compressed-data -q -f $second_stream \
> > +              -p $SCRATCH_MNT/snap1 $SCRATCH_MNT/snap2
> > +
> > +_scratch_unmount
> > +_scratch_mkfs >> $seqres.full 2>&1 || _fail "second mkfs failed"
> > +_scratch_mount
> > +
> > +$BTRFS_UTIL_PROG receive -q -f $first_stream $SCRATCH_MNT
> > +$BTRFS_UTIL_PROG receive -q -f $second_stream $SCRATCH_MNT
> > +
> > +echo "Checksums in the new filesystem:"
> > +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> > +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out
> > new file mode 100644
> > index 00000000..75f1f4cc
> > --- /dev/null
> > +++ b/tests/btrfs/312.out
> > @@ -0,0 +1,7 @@
> > +QA output created by 312
> > +Checksums in the original filesystem:
> > +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> > +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
> > +Checksums in the new filesystem:
> > +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> > +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
Qu Wenruo July 18, 2024, 10:05 p.m. UTC | #3
在 2024/7/16 19:44, fdmanana@kernel.org 写道:
> From: Filipe Manana <fdmanana@suse.com>
>
> Test a scenario of a compressed send stream that triggered a bug in the
> extent map merging code introduced in the merge window for 6.11.
>
> The commit that introduced the bug is on its way to Linus' tree and its
> subject is:
>
>     "btrfs: introduce new members for extent_map"
>
> The corresponding fix was submitted to the btrfs mailing list, with the
> subject:
>
>    "btrfs: fix corrupt read due to bad offset of a compressed extent map"
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu
> ---
>   tests/btrfs/312     | 115 ++++++++++++++++++++++++++++++++++++++++++++
>   tests/btrfs/312.out |   7 +++
>   2 files changed, 122 insertions(+)
>   create mode 100755 tests/btrfs/312
>   create mode 100644 tests/btrfs/312.out
>
> diff --git a/tests/btrfs/312 b/tests/btrfs/312
> new file mode 100755
> index 00000000..ebecadc6
> --- /dev/null
> +++ b/tests/btrfs/312
> @@ -0,0 +1,115 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 312
> +#
> +# Test a scenario of a compressed send stream that triggered a bug in the extent
> +# map merging code introduced in the merge window for 6.11.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick send compress
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -fr $tmp.*
> +	rm -fr $send_files_dir
> +}
> +
> +. ./common/filter
> +
> +_require_btrfs_send_version 2
> +_require_test
> +_require_scratch
> +
> +_fixed_by_kernel_commit XXXXXXXXXXXX \
> +	"btrfs: fix corrupt read due to bad offset of a compressed extent map"
> +
> +send_files_dir=$TEST_DIR/btrfs-test-$seq
> +
> +rm -fr $send_files_dir
> +mkdir $send_files_dir
> +first_stream="$send_files_dir/1.send"
> +second_stream="$send_files_dir/2.send"
> +
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "first mkfs failed"
> +_scratch_mount -o compress
> +
> +# Create a compressed extent for the range [108K, 144K[. Since it's a
> +# non-aligned start offset, the first 3K of the extent are filled with zeroes.
> +# The i_size is set to 141K.
> +$XFS_IO_PROG -f -c "pwrite -S 0xab 111K 30K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap1 >> $seqres.full
> +
> +# Overwrite a part of the extent we created before.
> +# This will make the send stream include an encoded write (compressed) for the
> +# file range [120K, 128K[.
> +$XFS_IO_PROG -c "pwrite -S 0xcd 120K 8K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +# Now do write after those extents and leaving a hole in between.
> +# This results in expanding the last block of the extent we first created, that
> +# is, in filling with zeroes the file range [141K, 144K[ (3072 bytes), which
> +# belongs to the block in the range [140K, 144K[.
> +#
> +# When the destination filesystem receives from the send stream a write for that
> +# range ([140K, 144K[) it does a btrfs_get_extent() call to find the extent map
> +# containing the offset 140K. There's no loaded extent map covering that range
> +# so it will lookg at the subvolume tree to find a file extent item covering the
> +# range and then finds the file extent item covering the range [108K, 144K[ which
> +# corresponds to the first extent written to the file, before snapshoting.
> +#
> +# Note that at this point in time the destination filesystem processed an encoded
> +# write for the range [120K, 128K[, which created a compressed extent map for
> +# that range and a corresponding ordered extent, which has not yet completed when
> +# it received the write command for the [140K, 144K[ range, so the corresponding
> +# file extent item is not yet in the subvolume tree - that only happens when the
> +# ordered extent completes, at btrfs_finish_one_ordered().
> +#
> +# So having found a file extent item for the range [108K, 144K[ where 140K falls
> +# into, it tries to add a compressed extent map for that range to the inode's
> +# extent map tree with a call to btrfs_add_extent_mapping() done at
> +# btrfs_get_extent(). That finds there's a loaded overlapping extent map for the
> +# range [120K, 128K[ (the extent from the previous encoded write) and then calls
> +# into merge_extent_mapping().
> +#
> +# The merging ended adjusting the extent map we attempted to insert, covering
> +# the range [108K, 144K[, to cover instead the range [128K, 144K[ (length 16K)
> +# instead, since there's an existing extent map for the range [120K, 128K[ and
> +# we are looking for a range starting at 140K (and ending at 144K). However it
> +# didn't adjust the extent map's offset from 0 to 20K, resulting in future reads
> +# reading the incorrect range from the underlying extent (108K to 124K, 16K of
> +# length, instead of the 128K to 144K range).
> +#
> +# Note that for the incorrect extent map, and therefore read corruption, to
> +# happen, we depend on specific timings - the ordered extent for the encoded
> +# write for the range [120K, 128K[ must not complete before the destination
> +# of the send stream receives the write command for the range [140K, 144K[.
> +#
> +$XFS_IO_PROG -c "pwrite -S 0xef 160K 4K" $SCRATCH_MNT/foo >> $seqres.full
> +
> +$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap2 >> $seqres.full
> +
> +echo "Checksums in the original filesystem:"
> +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> +
> +$BTRFS_UTIL_PROG send --compressed-data -q -f $first_stream $SCRATCH_MNT/snap1
> +$BTRFS_UTIL_PROG send --compressed-data -q -f $second_stream \
> +		 -p $SCRATCH_MNT/snap1 $SCRATCH_MNT/snap2
> +
> +_scratch_unmount
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "second mkfs failed"
> +_scratch_mount
> +
> +$BTRFS_UTIL_PROG receive -q -f $first_stream $SCRATCH_MNT
> +$BTRFS_UTIL_PROG receive -q -f $second_stream $SCRATCH_MNT
> +
> +echo "Checksums in the new filesystem:"
> +echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
> +echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out
> new file mode 100644
> index 00000000..75f1f4cc
> --- /dev/null
> +++ b/tests/btrfs/312.out
> @@ -0,0 +1,7 @@
> +QA output created by 312
> +Checksums in the original filesystem:
> +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
> +Checksums in the new filesystem:
> +e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
> +4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
diff mbox series

Patch

diff --git a/tests/btrfs/312 b/tests/btrfs/312
new file mode 100755
index 00000000..ebecadc6
--- /dev/null
+++ b/tests/btrfs/312
@@ -0,0 +1,115 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 312
+#
+# Test a scenario of a compressed send stream that triggered a bug in the extent
+# map merging code introduced in the merge window for 6.11.
+#
+. ./common/preamble
+_begin_fstest auto quick send compress
+
+_cleanup()
+{
+	cd /
+	rm -fr $tmp.*
+	rm -fr $send_files_dir
+}
+
+. ./common/filter
+
+_require_btrfs_send_version 2
+_require_test
+_require_scratch
+
+_fixed_by_kernel_commit XXXXXXXXXXXX \
+	"btrfs: fix corrupt read due to bad offset of a compressed extent map"
+
+send_files_dir=$TEST_DIR/btrfs-test-$seq
+
+rm -fr $send_files_dir
+mkdir $send_files_dir
+first_stream="$send_files_dir/1.send"
+second_stream="$send_files_dir/2.send"
+
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "first mkfs failed"
+_scratch_mount -o compress
+
+# Create a compressed extent for the range [108K, 144K[. Since it's a
+# non-aligned start offset, the first 3K of the extent are filled with zeroes.
+# The i_size is set to 141K.
+$XFS_IO_PROG -f -c "pwrite -S 0xab 111K 30K" $SCRATCH_MNT/foo >> $seqres.full
+
+$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap1 >> $seqres.full
+
+# Overwrite a part of the extent we created before.
+# This will make the send stream include an encoded write (compressed) for the
+# file range [120K, 128K[.
+$XFS_IO_PROG -c "pwrite -S 0xcd 120K 8K" $SCRATCH_MNT/foo >> $seqres.full
+
+# Now do write after those extents and leaving a hole in between.
+# This results in expanding the last block of the extent we first created, that
+# is, in filling with zeroes the file range [141K, 144K[ (3072 bytes), which
+# belongs to the block in the range [140K, 144K[.
+#
+# When the destination filesystem receives from the send stream a write for that
+# range ([140K, 144K[) it does a btrfs_get_extent() call to find the extent map
+# containing the offset 140K. There's no loaded extent map covering that range
+# so it will lookg at the subvolume tree to find a file extent item covering the
+# range and then finds the file extent item covering the range [108K, 144K[ which
+# corresponds to the first extent written to the file, before snapshoting.
+#
+# Note that at this point in time the destination filesystem processed an encoded
+# write for the range [120K, 128K[, which created a compressed extent map for
+# that range and a corresponding ordered extent, which has not yet completed when
+# it received the write command for the [140K, 144K[ range, so the corresponding
+# file extent item is not yet in the subvolume tree - that only happens when the
+# ordered extent completes, at btrfs_finish_one_ordered().
+#
+# So having found a file extent item for the range [108K, 144K[ where 140K falls
+# into, it tries to add a compressed extent map for that range to the inode's
+# extent map tree with a call to btrfs_add_extent_mapping() done at
+# btrfs_get_extent(). That finds there's a loaded overlapping extent map for the
+# range [120K, 128K[ (the extent from the previous encoded write) and then calls
+# into merge_extent_mapping().
+#
+# The merging ended adjusting the extent map we attempted to insert, covering
+# the range [108K, 144K[, to cover instead the range [128K, 144K[ (length 16K)
+# instead, since there's an existing extent map for the range [120K, 128K[ and
+# we are looking for a range starting at 140K (and ending at 144K). However it
+# didn't adjust the extent map's offset from 0 to 20K, resulting in future reads
+# reading the incorrect range from the underlying extent (108K to 124K, 16K of
+# length, instead of the 128K to 144K range).
+#
+# Note that for the incorrect extent map, and therefore read corruption, to
+# happen, we depend on specific timings - the ordered extent for the encoded
+# write for the range [120K, 128K[ must not complete before the destination
+# of the send stream receives the write command for the range [140K, 144K[.
+#
+$XFS_IO_PROG -c "pwrite -S 0xef 160K 4K" $SCRATCH_MNT/foo >> $seqres.full
+
+$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/snap2 >> $seqres.full
+
+echo "Checksums in the original filesystem:"
+echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
+echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
+
+$BTRFS_UTIL_PROG send --compressed-data -q -f $first_stream $SCRATCH_MNT/snap1
+$BTRFS_UTIL_PROG send --compressed-data -q -f $second_stream \
+		 -p $SCRATCH_MNT/snap1 $SCRATCH_MNT/snap2
+
+_scratch_unmount
+_scratch_mkfs >> $seqres.full 2>&1 || _fail "second mkfs failed"
+_scratch_mount
+
+$BTRFS_UTIL_PROG receive -q -f $first_stream $SCRATCH_MNT
+$BTRFS_UTIL_PROG receive -q -f $second_stream $SCRATCH_MNT
+
+echo "Checksums in the new filesystem:"
+echo "$(md5sum $SCRATCH_MNT/snap1/foo | _filter_scratch)"
+echo "$(md5sum $SCRATCH_MNT/snap2/foo | _filter_scratch)"
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/312.out b/tests/btrfs/312.out
new file mode 100644
index 00000000..75f1f4cc
--- /dev/null
+++ b/tests/btrfs/312.out
@@ -0,0 +1,7 @@ 
+QA output created by 312
+Checksums in the original filesystem:
+e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
+4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo
+Checksums in the new filesystem:
+e3ba4a9cbb38d921adc30d7e795d6626  SCRATCH_MNT/snap1/foo
+4de09f7184f63aa64b481f3031138920  SCRATCH_MNT/snap2/foo