[v2] btrfs: fix wrong block_start calculation for btrfs_drop_extent_map_range()

Message ID f6e36de0cc45247c30c645764f3ffe4f6a487007.1712621026.git.wqu@suse.com (mailing list archive)
State New

Commit Message

Qu Wenruo April 9, 2024, 12:06 a.m. UTC
[BUG]
During my extent_map cleanup/refactor, with extra sanity checks,
extent-map-tests::test_case_7() would not pass the checks.

The problem is that, after btrfs_drop_extent_map_range(), the resulting
extent_map has a @block_start that is far too large.
Meanwhile my btrfs_file_extent_item based members are returning a
correct @disk_bytenr/@offset combination.

The extent map layout looks like this:

     0        16K    32K       48K
     | PINNED |      | Regular |

The regular em at [32K, 48K) also has a @block_start of 32K.

Then we drop range [0, 36K), which should shrink the regular one to
[36K, 48K).
However, the resulting @block_start is incorrect: we expect
32K + 4K = 36K, but get 52K.

[CAUSE]
Inside btrfs_drop_extent_map_range(), if we hit an extent_map that
overlaps the drop range but also extends beyond its end, we need to
split that extent map in two:

	|<-- drop range -->|
		 |<----- existing extent_map --->|

And if the extent map is not compressed, we need to advance
extent_map::block_start by the difference between the end of the drop
range and the extent map start.

However in that particular case, the difference is calculated using
(start + len - em->start).

The problem is @start can be modified if the drop range covers any
pinned extent.

This leads to a wrong calculation, which is caught by my later
extent_map sanity checks, which check em::block_start against
btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.

This is a regression caused by commit c962098ca4af ("btrfs: fix
incorrect splitting in btrfs_drop_extent_map_range"), which removed the
@len update for pinned extents.
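
To make the arithmetic above concrete, here is a minimal sketch of the
pre-fix calculation. The helper name split_block_start_buggy() and the
SZ_1K macro are hypothetical and exist only for illustration. It also
assumes that, after skipping the pinned extent at [0, 16K), @start has
been advanced to 16K while @len is still 36K, which is consistent with
the 52K result reported above.

```c
#include <stdint.h>

#define SZ_1K 1024ULL

/*
 * Hypothetical helper mirroring the pre-fix arithmetic for a
 * non-compressed extent map: the split's block_start is advanced by
 * (start + len - em->start).
 */
static uint64_t split_block_start_buggy(uint64_t start, uint64_t len,
					uint64_t em_start,
					uint64_t em_block_start)
{
	/* Wrong once @start has been moved past a pinned extent. */
	const uint64_t diff = start + len - em_start;

	return em_block_start + diff;
}
```

With start = 16K, len = 36K, em_start = 32K and em_block_start = 32K,
diff is 16K + 36K - 32K = 20K, so the helper returns 32K + 20K = 52K
instead of the expected 36K.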

[FIX]
Fix it by avoiding @start completely and using @end - em->start
instead, where @end is the exclusive end bytenr of the drop range.
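
As a rough illustration of the fixed calculation, the sketch below uses
a hypothetical helper, split_block_start_fixed(), and a hypothetical
SZ_1K macro; neither is part of the patch. It takes only the exclusive
end of the drop range, so any earlier adjustment of @start cannot
influence the result.

```c
#include <stdint.h>

#define SZ_1K 1024ULL

/*
 * Hypothetical helper mirroring the fixed arithmetic: the split's
 * block_start is advanced by (end - em->start), where @end is the
 * exclusive end of the drop range.
 */
static uint64_t split_block_start_fixed(uint64_t end, uint64_t em_start,
					uint64_t em_block_start)
{
	const uint64_t diff = end - em_start;

	return em_block_start + diff;
}
```

For the layout in test_case_7(), end = 36K, em_start = 32K and
em_block_start = 32K give diff = 4K and a block_start of 36K.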

Also update the test case to verify @block_start, to prevent such
problems from recurring.

Thankfully this is not going to lead to any data corruption, as the IO
path does not call btrfs_drop_extent_map_range() with @skip_pinned set.

So this fix is only here for the sake of consistency/correctness.

CC: stable@vger.kernel.org # 6.5+
Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Remove the mention of possible corruption
  Thankfully this bug does not affect IO path thus it's fine.

- Explain why c962098ca4af is the cause
---
 fs/btrfs/extent_map.c             | 2 +-
 fs/btrfs/tests/extent-map-tests.c | 6 +++++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Filipe Manana April 9, 2024, 10:29 a.m. UTC | #1
On Tue, Apr 9, 2024 at 1:06 AM Qu Wenruo <wqu@suse.com> wrote:
>
> [BUG]
> During my extent_map cleanup/refactor, with extra sanity checks,
> extent-map-tests::test_case_7() would not pass the checks.
>
> The problem is, after btrfs_drop_extent_map_range(), the resulted
> extent_map has a @block_start way too large.
> Meanwhile my btrfs_file_extent_item based members are returning a
> correct @disk_bytenr/@offset combination.
>
> The extent map layout looks like this:
>
>      0        16K    32K       48K
>      | PINNED |      | Regular |
>
> The regular em at [32K, 48K) also has 32K @block_start.
>
> Then drop range [0, 36K), which should shrink the regular one to be
> [36K, 48K).
> However the @block_start is incorrect, we expect 32K + 4K, but got 52K.
>
> [CAUSE]
> Inside btrfs_drop_extent_map_range() function, if we hit an extent_map
> that covers the target range but is still beyond it, we need to split
> that extent map into half:
>
>         |<-- drop range -->|
>                  |<----- existing extent_map --->|
>
> And if the extent map is not compressed, we need to forward
> extent_map::block_start by the difference between the end of drop range
> and the extent map start.
>
> However in that particular case, the difference is calculated using
> (start + len - em->start).
>
> The problem is @start can be modified if the drop range covers any
> pinned extent.
>
> This leads to wrong calculation, and would be caught by my later
> extent_map sanity checks, which checks the em::block_start against
> btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset.
>
> This is a regression caused by commit c962098ca4af ("btrfs: fix
> incorrect splitting in btrfs_drop_extent_map_range"), which removed the
> @len update for pinned extents.
>
> [FIX]
> Fix it by avoiding using @start completely, and use @end - em->start
> instead, which @end is exclusive bytenr number.
>
> And update the test case to verify the @block_start to prevent such
> problem from happening.
>
> Thankfully this is not going to lead to any data corruption, as IO path
> does not utilize btrfs_drop_extent_map_range() with @skip_pinned set.
>
> So this fix is only here for the sake of consistency/correctness.
>
> CC: stable@vger.kernel.org # 6.5+
> Fixes: c962098ca4af ("btrfs: fix incorrect splitting in btrfs_drop_extent_map_range")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Remove the mention of possible corruption
>   Thankfully this bug does not affect IO path thus it's fine.
>
> - Explain why c962098ca4af is the cause
> ---
>  fs/btrfs/extent_map.c             | 2 +-
>  fs/btrfs/tests/extent-map-tests.c | 6 +++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
> index 471654cb65b0..955ce300e5a1 100644
> --- a/fs/btrfs/extent_map.c
> +++ b/fs/btrfs/extent_map.c
> @@ -799,7 +799,7 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
>                                         split->block_len = em->block_len;
>                                         split->orig_start = em->orig_start;
>                                 } else {
> -                                       const u64 diff = start + len - em->start;
> +                                       const u64 diff = end - em->start;
>
>                                         split->block_len = split->len;
>                                         split->block_start += diff;
> diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
> index 253cce7ffecf..80e71c5cb7ab 100644
> --- a/fs/btrfs/tests/extent-map-tests.c
> +++ b/fs/btrfs/tests/extent-map-tests.c
> @@ -818,7 +818,6 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
>                 test_err("em->len is %llu, expected 16K", em->len);
>                 goto out;
>         }
> -
>         free_extent_map(em);

As pointed out before, please avoid such accidental and unrelated
changes like this.

With that fixed:

Reviewed-by: Filipe Manana <fdmanana@suse.com>


>
>         read_lock(&em_tree->lock);
> @@ -847,6 +846,11 @@ static int test_case_7(struct btrfs_fs_info *fs_info)
>                 goto out;
>         }
>
> +       if (em->block_start != SZ_32K + SZ_4K) {
> +               test_err("em->block_start is %llu, expected 36K", em->block_start);
> +               goto out;
> +       }
> +
>         free_extent_map(em);
>
>         read_lock(&em_tree->lock);
> --
> 2.44.0
>
>
Patch

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 471654cb65b0..955ce300e5a1 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -799,7 +799,7 @@  void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
 					split->block_len = em->block_len;
 					split->orig_start = em->orig_start;
 				} else {
-					const u64 diff = start + len - em->start;
+					const u64 diff = end - em->start;
 
 					split->block_len = split->len;
 					split->block_start += diff;
diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
index 253cce7ffecf..80e71c5cb7ab 100644
--- a/fs/btrfs/tests/extent-map-tests.c
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -818,7 +818,6 @@  static int test_case_7(struct btrfs_fs_info *fs_info)
 		test_err("em->len is %llu, expected 16K", em->len);
 		goto out;
 	}
-
 	free_extent_map(em);
 
 	read_lock(&em_tree->lock);
@@ -847,6 +846,11 @@  static int test_case_7(struct btrfs_fs_info *fs_info)
 		goto out;
 	}
 
+	if (em->block_start != SZ_32K + SZ_4K) {
+		test_err("em->block_start is %llu, expected 36K", em->block_start);
+		goto out;
+	}
+
 	free_extent_map(em);
 
 	read_lock(&em_tree->lock);