btrfs: scrub: fix failed to detect checksum error

Message ID	1667745304-11470-1-git-send-email-zhanglikernel@gmail.com (mailing list archive)
State	New, archived
Headers	Return-Path: <linux-btrfs-owner@kernel.org>
	From: Li Zhang <zhanglikernel@gmail.com>
	To: linux-btrfs@vger.kernel.org
	Cc: Li Zhang <zhanglikernel@gmail.com>
	Subject: [PATCH] btrfs: scrub: fix failed to detect checksum error
	Date: Sun, 6 Nov 2022 22:35:04 +0800
	Message-Id: <1667745304-11470-1-git-send-email-zhanglikernel@gmail.com>
	Precedence: bulk
Series	btrfs: scrub: fix failed to detect checksum error

Commit Message

li zhang Nov. 6, 2022, 2:35 p.m. UTC

[bug]
Scrub fails to detect checksum errors.
To reproduce the problem:

$ truncate -s 250M loop_dev1
$ truncate -s 250M loop_dev2
$ losetup /dev/loop1 loop_dev1
$ losetup /dev/loop2 loop_dev2
$ mkfs.btrfs -mraid1 -draid1 /dev/loop1 /dev/loop2 -f
$ mount /dev/loop1 /mnt/
$ cp ~/btrfs/btrfs-progs/mkfs/main.c /mnt/

$ vim -b loop_dev1

....
         free(label);
         free(source_dir);
         exit(1);
success:
         exit(0);
}zhangli

....

$ sudo btrfs scrub start /mnt/
fsid:b66aa912-ae76-4b4b-881b-6a6840f06870 sock_path:/var/lib/btrfs/scrub.progress.b66aa912-ae76-4b4b-881b-6a6840f06870.
scrub started on /mnt/, fsid b66aa912-ae76-4b4b-881b-6a6840f06870 (pid=9894)

$ sudo btrfs scrub status /mnt/
UUID:             b66aa912-ae76-4b4b-881b-6a6840f06870
Scrub started:    Sun Nov  6 21:51:50 2022
Status:           finished
Duration:         0:00:00
Total to scrub:   392.00KiB
Rate:             0.00B/s
Error summary:    no errors found

[reason]
Scrub verifies the checksum of only the first sector (scrub_sector)
of each sblock (scrub_block). The remaining sectors are never checked,
so corruption in any of them goes undetected.

[implementation]
1. Move the code that sets the sector (scrub_sector) csum from
scrub_extent to scrub_sectors, since each sector has its own checksum.
2. In the scrub_checksum_data function, check all
sectors in the sblock.
3. In the scrub_setup_recheck_block function,
use sector_index to pick the csum of the matching sector.
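
For illustration only, the difference between the old and the patched
check can be modelled in user space. The sketch below is not kernel
code: the toy_* types, the FNV-1a checksum, the 4 KiB sector size and
the 16-sector block limit are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 4096u	/* assumed sector size */
#define MAX_SECTORS 16u		/* assumed: 64 KiB block / 4 KiB sectors */

/* Toy stand-in for struct scrub_sector; not the kernel layout. */
struct toy_sector {
	uint8_t data[SECTOR_SIZE];
	uint32_t csum;		/* expected checksum */
	int have_csum;
};

/* Toy stand-in for struct scrub_block. */
struct toy_block {
	struct toy_sector sectors[MAX_SECTORS];
	unsigned int sector_count;
};

/* Toy checksum (FNV-1a), used here only to keep the sketch small. */
static uint32_t toy_csum(const uint8_t *buf, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

/* Old behaviour: only sectors[0] is verified. */
static int check_first_sector_only(const struct toy_block *b)
{
	const struct toy_sector *s = &b->sectors[0];

	assert(b->sector_count >= 1);
	if (!s->have_csum)
		return 0;
	return toy_csum(s->data, SECTOR_SIZE) != s->csum;
}

/* Patched behaviour: every sector that has a csum is verified. */
static int check_all_sectors(const struct toy_block *b)
{
	int error = 0;

	assert(b->sector_count >= 1);
	for (unsigned int i = 0; i < b->sector_count; i++) {
		const struct toy_sector *s = &b->sectors[i];

		if (!s->have_csum)
			continue;
		if (toy_csum(s->data, SECTOR_SIZE) != s->csum)
			error = 1;
	}
	return error;
}

/* Fill `count` sectors with data and matching checksums. */
static void toy_block_init(struct toy_block *b, unsigned int count)
{
	assert(count >= 1 && count <= MAX_SECTORS);
	memset(b, 0, sizeof(*b));
	b->sector_count = count;
	for (unsigned int i = 0; i < count; i++) {
		memset(b->sectors[i].data, (int)(i + 1), SECTOR_SIZE);
		b->sectors[i].csum = toy_csum(b->sectors[i].data, SECTOR_SIZE);
		b->sectors[i].have_csum = 1;
	}
}
```

With a four-sector block and one flipped byte in the third sector,
check_first_sector_only() still returns 0 while check_all_sectors()
returns 1, which mirrors the "no errors found" result above.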

[test]
The test follows the reproduction steps above.
With this patch, scrub detects the checksum error and repairs it.
The scrub output is shown below.

$ sudo btrfs scrub start /mnt/
fsid:b66aa912-ae76-4b4b-881b-6a6840f06870 sock_path:/var/lib/btrfs/scrub.progress.b66aa912-ae76-4b4b-881b-6a6840f06870.
scrub started on /mnt/, fsid b66aa912-ae76-4b4b-881b-6a6840f06870 (pid=11089)
$ sudo btrfs scrub status /mnt/
WARNING: errors detected during scrubbing, corrected

UUID:             b66aa912-ae76-4b4b-881b-6a6840f06870
Scrub started:    Sun Nov  6 22:15:02 2022
Status:           finished
Duration:         0:00:00
Total to scrub:   392.00KiB
Rate:             0.00B/s
Error summary:    csum=1
  Corrected:      1
  Uncorrectable:  0
  Unverified:     0

Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
---
Issue: 537

 fs/btrfs/scrub.c | 58 ++++++++++++++++++++++++++++----------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

Comments

Qu Wenruo Nov. 6, 2022, 10:57 p.m. UTC | #1

On 2022/11/6 22:35, Li Zhang wrote:
> [bug]
> Scrub fails to detect checksum errors
> To reproduce the problem:
> 
> $ truncate -s 250M loop_dev1
> $ truncate -s 250M loop_dev2
> $ losetup /dev/loop1 loop_dev1
> $ losetup /dev/loop2 loop_dev2
> $ mkfs.btrfs -mraid1 -draid1 /dev/loop1 /dev/loop2 -f
> $ mount /dev/loop1 /mnt/
> $ cp ~/btrfs/btrfs-progs/mkfs/main.c /mnt/
> 
> $ vim -b loop_dev1
> 
> ....
>           free(label);
>           free(source_dir);
>           exit(1);
> success:
>           exit(0);
> }zhangli
> 
> ....
> 
> $ sudo btrfs scrub start /mnt/
> fsid:b66aa912-ae76-4b4b-881b-6a6840f06870 sock_path:/var/lib/btrfs/scrub.progress.b66aa912-ae76-4b4b-881b-6a6840f06870.
> scrub started on /mnt/, fsid b66aa912-ae76-4b4b-881b-6a6840f06870 (pid=9894)
> 
> $ sudo btrfs scrub status /mnt/
> UUID:             b66aa912-ae76-4b4b-881b-6a6840f06870
> Scrub started:    Sun Nov  6 21:51:50 2022
> Status:           finished
> Duration:         0:00:00
> Total to scrub:   392.00KiB
> Rate:             0.00B/s
> Error summary:    no errors found
> 
> [reason]
> Because scrub only checks the first sector (scrub_sector) of
> the sblock (scrub_block), it does not check other sectors.

That's caused by commit 786672e9e1a3 ("btrfs: scrub: use larger block
size for data extent scrub"), which enlarged the block size used for
data extent scrub. Before that commit the size was always sectorsize,
so each scrub_block held exactly one sector and checking only the
first sector always worked.

I'd prefer a revert until we have a better fix.

> 
> [implementation]
> 1. Move the set sector (scrub_sector) csum from scrub_extent
> to scrub_sectors, since each sector has an independent checksum.
> 2. In the scrub_checksum_data function, check all
> sectors in the sblock.
> 3. In the scrub_setup_recheck_block function,
> use sector_index to set the sector csum.
> 
> [test]
> The test method is the same as the problem reproduction.
> Can detect checksum errors and fix checksum errors
> Below is the scrub output.
> 
> $ sudo btrfs scrub start /mnt/
> fsid:b66aa912-ae76-4b4b-881b-6a6840f06870 sock_path:/var/lib/btrfs/scrub.progress.b66aa912-ae76-4b4b-881b-6a6840f06870.
> scrub started on /mnt/, fsid b66aa912-ae76-4b4b-881b-6a6840f06870 (pid=11089)
> $ sudo btrfs scrub status /mnt/
> WARNING: errors detected during scrubbing, corrected
> 
> UUID:             b66aa912-ae76-4b4b-881b-6a6840f06870
> Scrub started:    Sun Nov  6 22:15:02 2022
> Status:           finished
> Duration:         0:00:00
> Total to scrub:   392.00KiB
> Rate:             0.00B/s
> Error summary:    csum=1
>    Corrected:      1
>    Uncorrectable:  0
>    Unverified:     0
> 
> Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
> ---
> Issue: 537
> 
>   fs/btrfs/scrub.c | 58 ++++++++++++++++++++++++++++----------------------------
>   1 file changed, 29 insertions(+), 29 deletions(-)
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index f260c53..56ee600 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -404,7 +404,7 @@ static int scrub_write_sector_to_dev_replace(struct scrub_block *sblock,
>   static void scrub_parity_put(struct scrub_parity *sparity);
>   static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
>   			 u64 physical, struct btrfs_device *dev, u64 flags,
> -			 u64 gen, int mirror_num, u8 *csum,
> +			 u64 gen, int mirror_num,
>   			 u64 physical_for_dev_replace);
>   static void scrub_bio_end_io(struct bio *bio);
>   static void scrub_bio_end_io_worker(struct work_struct *work);
> @@ -420,6 +420,8 @@ static int scrub_add_sector_to_wr_bio(struct scrub_ctx *sctx,
>   static void scrub_wr_bio_end_io(struct bio *bio);
>   static void scrub_wr_bio_end_io_worker(struct work_struct *work);
>   static void scrub_put_ctx(struct scrub_ctx *sctx);
> +static int scrub_find_csum(struct scrub_ctx *sctx, u64 logical, u8 *csum);
> +

I don't think this is the way to go. An extent can be up to STRIPE_LEN
long, so repeating the csum search for every sector is not the right
approach.

We need a function that grabs a batch of csums in one go.
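
A rough user-space sketch of the batching idea, assuming the csums are
kept in an array sorted by logical address. All names here
(toy_csum_store, toy_find_csum, toy_find_csums) are hypothetical and do
not correspond to kernel functions; the searches counter exists only to
make the cost visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 4096u	/* assumed sector size */

/* One checksum for one sector-aligned logical address (toy model). */
struct toy_csum_item {
	uint64_t logical;
	uint32_t csum;
};

/* Toy csum store: items sorted by logical address. */
struct toy_csum_store {
	const struct toy_csum_item *items;
	size_t nr_items;
	unsigned long searches;	/* number of searches performed so far */
};

/* Binary search: index of the first item with logical >= key. */
static size_t toy_lower_bound(struct toy_csum_store *s, uint64_t key)
{
	size_t lo = 0, hi = s->nr_items;

	s->searches++;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (s->items[mid].logical < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Per-sector lookup: one search per call. Returns 1 if a csum exists. */
static int toy_find_csum(struct toy_csum_store *s, uint64_t logical,
			 uint32_t *csum)
{
	size_t i = toy_lower_bound(s, logical);

	if (i == s->nr_items || s->items[i].logical != logical)
		return 0;
	*csum = s->items[i].csum;
	return 1;
}

/*
 * Batch lookup: one search, then a forward walk over [logical, logical + len).
 * csums[n] is filled for every sector n that has a csum, and bit n of the
 * returned bitmap is set for it. At most 32 sectors per call.
 */
static uint32_t toy_find_csums(struct toy_csum_store *s, uint64_t logical,
			       uint32_t len, uint32_t *csums)
{
	uint32_t have = 0;
	size_t i = toy_lower_bound(s, logical);

	assert(len <= 32u * SECTOR_SIZE);
	for (; i < s->nr_items && s->items[i].logical < logical + len; i++) {
		uint32_t n = (uint32_t)((s->items[i].logical - logical) /
					SECTOR_SIZE);

		csums[n] = s->items[i].csum;
		have |= 1u << n;
	}
	return have;
}
```

For a 64 KiB range of 4 KiB sectors, calling toy_find_csum() once per
sector performs 16 searches, while a single toy_find_csums() call
performs one.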

>   
>   static inline int scrub_is_page_on_raid56(struct scrub_sector *sector)
>   {
> @@ -1516,7 +1518,7 @@ static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
>   			sector->have_csum = have_csum;
>   			if (have_csum)
>   				memcpy(sector->csum,
> -				       original_sblock->sectors[0]->csum,
> +				       original_sblock->sectors[sector_index]->csum,
>   				       sctx->fs_info->csum_size);
>   
>   			scrub_stripe_index_and_offset(logical,
> @@ -1984,21 +1986,22 @@ static int scrub_checksum_data(struct scrub_block *sblock)
>   	u8 csum[BTRFS_CSUM_SIZE];
>   	struct scrub_sector *sector;
>   	char *kaddr;
> +	int i;
>   
>   	BUG_ON(sblock->sector_count < 1);
> -	sector = sblock->sectors[0];
> -	if (!sector->have_csum)
> -		return 0;
> -
> -	kaddr = scrub_sector_get_kaddr(sector);
>   
>   	shash->tfm = fs_info->csum_shash;
>   	crypto_shash_init(shash);
> +	for (i = 0; i < sblock->sector_count; i++) {
> +		sector = sblock->sectors[i];
> +		if (!sector->have_csum)
> +			continue;
>   
> -	crypto_shash_digest(shash, kaddr, fs_info->sectorsize, csum);
> -
> -	if (memcmp(csum, sector->csum, fs_info->csum_size))
> -		sblock->checksum_error = 1;
> +		kaddr = scrub_sector_get_kaddr(sector);
> +		crypto_shash_digest(shash, kaddr, fs_info->sectorsize, csum);
> +		if (memcmp(csum, sector->csum, fs_info->csum_size))
> +			sblock->checksum_error = 1;

That would only increase the checksum error count by 1, but there may
be multiple corruptions in that extent.

This hotfix is going to screw up scrub error reporting.

Thanks,
Qu
> +	}
>   	return sblock->checksum_error;
>   }
>   
> @@ -2400,12 +2403,14 @@ static void scrub_missing_raid56_pages(struct scrub_block *sblock)
>   
>   static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
>   		       u64 physical, struct btrfs_device *dev, u64 flags,
> -		       u64 gen, int mirror_num, u8 *csum,
> +		       u64 gen, int mirror_num,
>   		       u64 physical_for_dev_replace)
>   {
>   	struct scrub_block *sblock;
>   	const u32 sectorsize = sctx->fs_info->sectorsize;
>   	int index;
> +	u8 csum[BTRFS_CSUM_SIZE];
> +	int have_csum;
>   
>   	sblock = alloc_scrub_block(sctx, dev, logical, physical,
>   				   physical_for_dev_replace, mirror_num);
> @@ -2424,7 +2429,6 @@ static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
>   		 * more memory for PAGE_SIZE > sectorsize case.
>   		 */
>   		u32 l = min(sectorsize, len);
> -
>   		sector = alloc_scrub_sector(sblock, logical, GFP_KERNEL);
>   		if (!sector) {
>   			spin_lock(&sctx->stat_lock);
> @@ -2435,11 +2439,16 @@ static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
>   		}
>   		sector->flags = flags;
>   		sector->generation = gen;
> -		if (csum) {
> -			sector->have_csum = 1;
> -			memcpy(sector->csum, csum, sctx->fs_info->csum_size);
> -		} else {
> -			sector->have_csum = 0;
> +		if (flags & BTRFS_EXTENT_FLAG_DATA) {
> +			/* push csums to sbio */
> +			have_csum = scrub_find_csum(sctx, logical, csum);
> +			if (have_csum == 0) {
> +				++sctx->stat.no_csum;
> +				sector->have_csum = 0;
> +			} else {
> +				sector->have_csum = 1;
> +				memcpy(sector->csum, csum, sctx->fs_info->csum_size);
> +			}
>   		}
>   		len -= l;
>   		logical += l;
> @@ -2669,7 +2678,6 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map,
>   	u64 src_physical = physical;
>   	int src_mirror = mirror_num;
>   	int ret;
> -	u8 csum[BTRFS_CSUM_SIZE];
>   	u32 blocksize;
>   
>   	/*
> @@ -2715,17 +2723,9 @@ static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map,
>   				     &src_dev, &src_mirror);
>   	while (len) {
>   		u32 l = min(len, blocksize);
> -		int have_csum = 0;
> -
> -		if (flags & BTRFS_EXTENT_FLAG_DATA) {
> -			/* push csums to sbio */
> -			have_csum = scrub_find_csum(sctx, logical, csum);
> -			if (have_csum == 0)
> -				++sctx->stat.no_csum;
> -		}
>   		ret = scrub_sectors(sctx, logical, l, src_physical, src_dev,
>   				    flags, gen, src_mirror,
> -				    have_csum ? csum : NULL, physical);
> +				    physical);
>   		if (ret)
>   			return ret;
>   		len -= l;
> @@ -4155,7 +4155,7 @@ static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
>   
>   		ret = scrub_sectors(sctx, bytenr, BTRFS_SUPER_INFO_SIZE, bytenr,
>   				    scrub_dev, BTRFS_EXTENT_FLAG_SUPER, gen, i,
> -				    NULL, bytenr);
> +				    bytenr);
>   		if (ret)
>   			return ret;
>   	}
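
On the error-reporting concern raised in the review above: one way to
keep the count accurate with multi-sector blocks is to count
mismatching sectors instead of setting a single flag. The sketch below
assumes the per-sector checksums have already been computed;
toy_verify_result and toy_count_csum_errors are hypothetical names and
not part of scrub.c.

```c
#include <assert.h>
#include <stdint.h>

/* Result of verifying one multi-sector block (hypothetical type). */
struct toy_verify_result {
	unsigned int csum_errors;	/* sectors whose csum mismatched */
	uint32_t error_bitmap;		/* bit n set => sector n mismatched */
};

/*
 * Compare computed and expected checksums sector by sector.
 * Sectors without a stored csum (have_csum[i] == 0) are skipped.
 * At most 32 sectors per call, so the bitmap fits in 32 bits.
 */
static struct toy_verify_result
toy_count_csum_errors(const uint32_t *computed, const uint32_t *expected,
		      const uint8_t *have_csum, unsigned int nr_sectors)
{
	struct toy_verify_result res = { 0, 0 };

	assert(nr_sectors <= 32);
	for (unsigned int i = 0; i < nr_sectors; i++) {
		if (!have_csum[i])
			continue;
		if (computed[i] != expected[i]) {
			res.csum_errors++;
			res.error_bitmap |= 1u << i;
		}
	}
	return res;
}
```

A caller could then add csum_errors to its statistics and use
error_bitmap to decide which sectors need repair, rather than treating
the whole block as one error.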

li zhang Nov. 7, 2022, 3:54 p.m. UTC | #2

Yes, I agree.
I will start working on it.

Thanks,
Li

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index f260c53..56ee600 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -404,7 +404,7 @@  static int scrub_write_sector_to_dev_replace(struct scrub_block *sblock,
 static void scrub_parity_put(struct scrub_parity *sparity);
 static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
 			 u64 physical, struct btrfs_device *dev, u64 flags,
-			 u64 gen, int mirror_num, u8 *csum,
+			 u64 gen, int mirror_num,
 			 u64 physical_for_dev_replace);
 static void scrub_bio_end_io(struct bio *bio);
 static void scrub_bio_end_io_worker(struct work_struct *work);
@@ -420,6 +420,8 @@  static int scrub_add_sector_to_wr_bio(struct scrub_ctx *sctx,
 static void scrub_wr_bio_end_io(struct bio *bio);
 static void scrub_wr_bio_end_io_worker(struct work_struct *work);
 static void scrub_put_ctx(struct scrub_ctx *sctx);
+static int scrub_find_csum(struct scrub_ctx *sctx, u64 logical, u8 *csum);
+
 
 static inline int scrub_is_page_on_raid56(struct scrub_sector *sector)
 {
@@ -1516,7 +1518,7 @@  static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
 			sector->have_csum = have_csum;
 			if (have_csum)
 				memcpy(sector->csum,
-				       original_sblock->sectors[0]->csum,
+				       original_sblock->sectors[sector_index]->csum,
 				       sctx->fs_info->csum_size);
 
 			scrub_stripe_index_and_offset(logical,
@@ -1984,21 +1986,22 @@  static int scrub_checksum_data(struct scrub_block *sblock)
 	u8 csum[BTRFS_CSUM_SIZE];
 	struct scrub_sector *sector;
 	char *kaddr;
+	int i;
 
 	BUG_ON(sblock->sector_count < 1);
-	sector = sblock->sectors[0];
-	if (!sector->have_csum)
-		return 0;
-
-	kaddr = scrub_sector_get_kaddr(sector);
 
 	shash->tfm = fs_info->csum_shash;
 	crypto_shash_init(shash);
+	for (i = 0; i < sblock->sector_count; i++) {
+		sector = sblock->sectors[i];
+		if (!sector->have_csum)
+			continue;
 
-	crypto_shash_digest(shash, kaddr, fs_info->sectorsize, csum);
-
-	if (memcmp(csum, sector->csum, fs_info->csum_size))
-		sblock->checksum_error = 1;
+		kaddr = scrub_sector_get_kaddr(sector);
+		crypto_shash_digest(shash, kaddr, fs_info->sectorsize, csum);
+		if (memcmp(csum, sector->csum, fs_info->csum_size))
+			sblock->checksum_error = 1;
+	}
 	return sblock->checksum_error;
 }
 
@@ -2400,12 +2403,14 @@  static void scrub_missing_raid56_pages(struct scrub_block *sblock)
 
 static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
 		       u64 physical, struct btrfs_device *dev, u64 flags,
-		       u64 gen, int mirror_num, u8 *csum,
+		       u64 gen, int mirror_num,
 		       u64 physical_for_dev_replace)
 {
 	struct scrub_block *sblock;
 	const u32 sectorsize = sctx->fs_info->sectorsize;
 	int index;
+	u8 csum[BTRFS_CSUM_SIZE];
+	int have_csum;
 
 	sblock = alloc_scrub_block(sctx, dev, logical, physical,
 				   physical_for_dev_replace, mirror_num);
@@ -2424,7 +2429,6 @@  static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
 		 * more memory for PAGE_SIZE > sectorsize case.
 		 */
 		u32 l = min(sectorsize, len);
-
 		sector = alloc_scrub_sector(sblock, logical, GFP_KERNEL);
 		if (!sector) {
 			spin_lock(&sctx->stat_lock);
@@ -2435,11 +2439,16 @@  static int scrub_sectors(struct scrub_ctx *sctx, u64 logical, u32 len,
 		}
 		sector->flags = flags;
 		sector->generation = gen;
-		if (csum) {
-			sector->have_csum = 1;
-			memcpy(sector->csum, csum, sctx->fs_info->csum_size);
-		} else {
-			sector->have_csum = 0;
+		if (flags & BTRFS_EXTENT_FLAG_DATA) {
+			/* push csums to sbio */
+			have_csum = scrub_find_csum(sctx, logical, csum);
+			if (have_csum == 0) {
+				++sctx->stat.no_csum;
+				sector->have_csum = 0;
+			} else {
+				sector->have_csum = 1;
+				memcpy(sector->csum, csum, sctx->fs_info->csum_size);
+			}
 		}
 		len -= l;
 		logical += l;
@@ -2669,7 +2678,6 @@  static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map,
 	u64 src_physical = physical;
 	int src_mirror = mirror_num;
 	int ret;
-	u8 csum[BTRFS_CSUM_SIZE];
 	u32 blocksize;
 
 	/*
@@ -2715,17 +2723,9 @@  static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map,
 				     &src_dev, &src_mirror);
 	while (len) {
 		u32 l = min(len, blocksize);
-		int have_csum = 0;
-
-		if (flags & BTRFS_EXTENT_FLAG_DATA) {
-			/* push csums to sbio */
-			have_csum = scrub_find_csum(sctx, logical, csum);
-			if (have_csum == 0)
-				++sctx->stat.no_csum;
-		}
 		ret = scrub_sectors(sctx, logical, l, src_physical, src_dev,
 				    flags, gen, src_mirror,
-				    have_csum ? csum : NULL, physical);
+				    physical);
 		if (ret)
 			return ret;
 		len -= l;
@@ -4155,7 +4155,7 @@  static noinline_for_stack int scrub_supers(struct scrub_ctx *sctx,
 
 		ret = scrub_sectors(sctx, bytenr, BTRFS_SUPER_INFO_SIZE, bytenr,
 				    scrub_dev, BTRFS_EXTENT_FLAG_SUPER, gen, i,
-				    NULL, bytenr);
+				    bytenr);
 		if (ret)
 			return ret;
 	}
