diff mbox series

[5/8] btrfs: Increment device corruption error in case of checksum error

Message ID 20200702122335.9117-6-nborisov@suse.com (mailing list archive)
State New, archived
Headers show
Series Corrupt counter improvement | expand

Commit Message

Nikolay Borisov July 2, 2020, 12:23 p.m. UTC
Now that btrfs_io_bio have access to btrfs_device we can safely
increment the device corruption counter on error. There is one notable
exception - repair bios for raid. Since those don't go through the
normal submit_stripe_bio callpath but through raid56_parity_recover thus
repair bios won't have their device set.

Link: https://lore.kernel.org/linux-btrfs/4857863.FCrPRfMyHP@liv/
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/inode.c | 3 +++
 1 file changed, 3 insertions(+)

--
2.17.1

Comments

Josef Bacik July 2, 2020, 1:18 p.m. UTC | #1
On 7/2/20 8:23 AM, Nikolay Borisov wrote:
> Now that btrfs_io_bio have access to btrfs_device we can safely
> increment the device corruption counter on error. There is one notable
> exception - repair bios for raid. Since those don't go through the
> normal submit_stripe_bio callpath but through raid56_parity_recover thus
> repair bios won't have their device set.
> 
> Link: https://lore.kernel.org/linux-btrfs/4857863.FCrPRfMyHP@liv/
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
>   fs/btrfs/inode.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index e7600b0fd9b5..c6824d0ce59d 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2822,6 +2822,9 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
>   zeroit:
>   	btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected,
>   				    io_bio->mirror_num);
> +	if (io_bio->dev)
> +		btrfs_dev_stat_inc_and_print(io_bio->dev,
> +					     BTRFS_DEV_STAT_CORRUPTION_ERRS);
>   	memset(kaddr + pgoff, 1, len);
>   	flush_dcache_page(page);
>   	kunmap_atomic(kaddr);
> --
> 2.17.1
> 

I had to go look this up to see if we were double counting, but no, we only do 
BTRFS_DEV_STAT_CORRUPTION_ERRS for data with scrub, which goes through a 
different IO path than the normal reads.  Just in case anybody else is as 
confused as I was

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef
Johannes Thumshirn July 2, 2020, 1:21 p.m. UTC | #2
On 02/07/2020 14:41, Nikolay Borisov wrote:
> Now that btrfs_io_bio have access to btrfs_device we can safely
> increment the device corruption counter on error. There is one notable
> exception - repair bios for raid. Since those don't go through the
> normal submit_stripe_bio callpath but through raid56_parity_recover thus
> repair bios won't have their device set.
> 
> Link: https://lore.kernel.org/linux-btrfs/4857863.FCrPRfMyHP@liv/
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/inode.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index e7600b0fd9b5..c6824d0ce59d 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2822,6 +2822,9 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
>  zeroit:
>  	btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected,
>  				    io_bio->mirror_num);
> +	if (io_bio->dev)
> +		btrfs_dev_stat_inc_and_print(io_bio->dev,
> +					     BTRFS_DEV_STAT_CORRUPTION_ERRS);
>  	memset(kaddr + pgoff, 1, len);
>  	flush_dcache_page(page);
>  	kunmap_atomic(kaddr);

Any chance you could do a follow up merging that weird zeroit label 
into the memset() block?

It kind of disturbs the reading flow of that function and in fact it 
doesn't even zero the data
Nikolay Borisov July 2, 2020, 2:44 p.m. UTC | #3
On 2.07.20 г. 16:21 ч., Johannes Thumshirn wrote:
> On 02/07/2020 14:41, Nikolay Borisov wrote:
>> Now that btrfs_io_bio have access to btrfs_device we can safely
>> increment the device corruption counter on error. There is one notable
>> exception - repair bios for raid. Since those don't go through the
>> normal submit_stripe_bio callpath but through raid56_parity_recover thus
>> repair bios won't have their device set.
>>
>> Link: https://lore.kernel.org/linux-btrfs/4857863.FCrPRfMyHP@liv/
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>> ---
>>  fs/btrfs/inode.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index e7600b0fd9b5..c6824d0ce59d 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -2822,6 +2822,9 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
>>  zeroit:
>>  	btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected,
>>  				    io_bio->mirror_num);
>> +	if (io_bio->dev)
>> +		btrfs_dev_stat_inc_and_print(io_bio->dev,
>> +					     BTRFS_DEV_STAT_CORRUPTION_ERRS);
>>  	memset(kaddr + pgoff, 1, len);
>>  	flush_dcache_page(page);
>>  	kunmap_atomic(kaddr);
> 
> Any chance you could do a follow up merging that weird zeroit label 
> into the memset() block?
> 
> It kind of disturbs the reading flow of that function and in fact it 
> doesn't even zero the data
> 

Makes sense, it doesn't even look that bad at all in terms of line length.
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e7600b0fd9b5..c6824d0ce59d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2822,6 +2822,9 @@  static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
 zeroit:
 	btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected,
 				    io_bio->mirror_num);
+	if (io_bio->dev)
+		btrfs_dev_stat_inc_and_print(io_bio->dev,
+					     BTRFS_DEV_STAT_CORRUPTION_ERRS);
 	memset(kaddr + pgoff, 1, len);
 	flush_dcache_page(page);
 	kunmap_atomic(kaddr);