[v2] btrfs: scrub: fix false alerts on zoned device scrubbing

Message ID cf93c10bb94755f1bee7e70b333db72ba9f0896b.1709629215.git.wqu@suse.com
State New, archived

Commit Message

Qu Wenruo March 5, 2024, 9:01 a.m. UTC
[BUG]
When using zoned devices (zbc), scrub would always report super block
errors like the following:

  # btrfs scrub start -fB /mnt/btrfs/
  Starting scrub on devid 1
  scrub done for b7b5c759-1baa-4561-a0ca-b8d0babcde56
  Scrub started:    Tue Mar  5 12:49:14 2024
  Status:           finished
  Duration:         0:00:00
  Total to scrub:   288.00KiB
  Rate:             288.00KiB/s
  Error summary:    super=2
    Corrected:      0
    Uncorrectable:  0
    Unverified:     0

[CAUSE]
Since the very beginning of scrub, we have always used btrfs_sb_offset()
to locate the super blocks.
This is fine for regular btrfs filesystems, but on zoned btrfs the super
blocks are stored in dedicated zones with a ring-buffer-like structure.

This means btrfs_sb_offset() cannot give us the correct bytenr for the
super blocks. Except for the primary super block, whatever we read back
is garbage, which causes the above false alerts.
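
For illustration only (not part of the patch), here is a user-space
sketch of the fixed-offset scheme that regular btrfs uses. The constants
are restated from the btrfs on-disk format as I understand it, and
sb_offset() is a stand-in name rather than the kernel function itself.
It shows that each mirror maps to one constant byte offset, which is
exactly what does not hold for a ring buffer inside a zone.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the btrfs on-disk format constants. */
#define SUPER_INFO_OFFSET  (64ULL * 1024)  /* primary copy at 64KiB */
#define SUPER_MIRROR_MAX   3
#define SUPER_MIRROR_SHIFT 12

/* Fixed byte offset of super block copy `mirror` on a regular device. */
static uint64_t sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	assert(mirror >= 0 && mirror < SUPER_MIRROR_MAX);
	if (mirror)
		return start << (SUPER_MIRROR_SHIFT * mirror);
	return SUPER_INFO_OFFSET;
}
```

Under these assumptions the three copies sit at 64KiB, 64MiB and 256GiB,
independent of any device state.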

[FIX]
Instead of btrfs_sb_offset(), use btrfs_sb_log_location(), which is
zoned friendly, to get the correct super block location.

This introduces new error patterns, as btrfs_sb_log_location() can
fail.

For -ENOENT we just stop scrubbing super blocks, as there are no more
of them.
For other errors, we record a super block error and continue with the
next super block.
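
For illustration only (not part of the patch), the control flow of the
v2 loop can be modelled in user space as below. locate_fn, scan_result,
scan_supers() and demo_locate() are hypothetical names; the callback
stands in for btrfs_sb_log_location(), and the commit_total_bytes check
and the actual read and verification are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SUPER_MIRROR_MAX 3

/* Hypothetical callback standing in for btrfs_sb_log_location(). */
typedef int (*locate_fn)(int mirror, uint64_t *bytenr);

struct scan_result {
	int checked;       /* copies whose location was resolved */
	int super_errors;  /* lookups that failed with a non-ENOENT error */
};

static struct scan_result scan_supers(locate_fn locate)
{
	struct scan_result res = { 0, 0 };

	assert(locate != NULL);
	for (int i = 0; i < SUPER_MIRROR_MAX; i++) {
		uint64_t bytenr;
		int ret = locate(i, &bytenr);

		if (ret == -ENOENT)
			break;               /* no more super blocks */
		if (ret < 0) {
			res.super_errors++;  /* count it, try the next copy */
			continue;
		}
		res.checked++;  /* the real code would read and verify here */
	}
	return res;
}

/* Example stub: mirror 1 fails with -EIO, the others resolve. */
static int demo_locate(int mirror, uint64_t *bytenr)
{
	if (mirror == 1)
		return -EIO;
	*bytenr = (uint64_t)mirror * 4096;
	return 0;
}
```

Calling scan_supers(demo_locate) counts one super block error for
mirror 1 and still goes on to check mirror 2.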

Reported-by: WA AM <waautomata@gmail.com>
Link: https://lore.kernel.org/all/CANU2Z0EvUzfYxczLgGUiREoMndE9WdQnbaawV5Fv5gNXptPUKw@mail.gmail.com/
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/scrub.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
---
Changelog:
v2:
- Use READ to replace the number 0
- Continue checking the next super block if we hit a non-ENOENT error

Comments

Qu Wenruo March 5, 2024, 9:05 a.m. UTC | #1
Please ignore this patch.

Johannes sent out his version before me, and it's my fault for not
checking the ML before sending out mine.

Sorry for the noise.

Thanks,
Qu


Patch

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index c4bd0e60db59..201b547aac4c 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2788,7 +2788,6 @@  static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
 					   struct btrfs_device *scrub_dev)
 {
 	int	i;
-	u64	bytenr;
 	u64	gen;
 	int ret = 0;
 	struct page *page;
@@ -2812,7 +2811,17 @@  static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
 		gen = btrfs_get_last_trans_committed(fs_info);
 
 	for (i = 0; i < BTRFS_SUPER_MIRROR_MAX; i++) {
-		bytenr = btrfs_sb_offset(i);
+		u64 bytenr;
+
+		ret = btrfs_sb_log_location(scrub_dev, i, READ, &bytenr);
+		if (ret == -ENOENT)
+			break;
+		if (ret < 0) {
+			spin_lock(&sctx->stat_lock);
+			sctx->stat.super_errors++;
+			spin_unlock(&sctx->stat_lock);
+			continue;
+		}
 		if (bytenr + BTRFS_SUPER_INFO_SIZE >
 		    scrub_dev->commit_total_bytes)
 			break;