diff mbox

[10/16] Btrfs: Avoid trustless page-level-repair in dev-replace

Message ID 1421666465-3892-11-git-send-email-zhaolei@cn.fujitsu.com (mailing list archive)
State Superseded
Headers show

Commit Message

Zhaolei Jan. 19, 2015, 11:20 a.m. UTC
From: Zhao Lei <zhaolei@cn.fujitsu.com>

Current code of page level repair for dev-replace can only support
io-error, we can't use it in checksum-fail case.

We can skip this kind of repair in dev-replace just as we in scrub.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/scrub.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Stefan Behrens Jan. 19, 2015, 12:25 p.m. UTC | #1
On Mon, 19 Jan 2015 19:20:59 +0800, Zhaolei wrote:
> From: Zhao Lei <zhaolei@cn.fujitsu.com>
> 
> Current code of page level repair for dev-replace can only support
> io-error, we can't use it in checksum-fail case.
> 
> We can skip this kind of repair in dev-replace just as we in scrub.

Why? I see dev-replace as a disk copy operation with integrated support
to pick the best mirror source device. Therefore the strategy was to
always write to the target device, even in case of uncorrectable
checksum errors. Maybe the checksum was wrong, not the data itself, and
you can repair files later by recalculating the checksum. If you write
nothing at all, some random, old data remains on the target disk,
instead of potentially fixable data.

Therefore this is handled differently for the dev-replace and for the
scrub case. For scrub, data is only overwritten when the source data is
verified, since the goal is to repair something. Dev-replace's goal is
to copy something.


> 
> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/scrub.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> index 9afc6dd..6cf0dc7 100644
> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -1099,6 +1099,10 @@ nodatasum_case:
>  		}
>  	}
>  
> +	/* can only fix I/O errors from here on */
> +	if (sblock_bad->no_io_error_seen)
> +		goto did_not_correct_error;
> +
>  	/*
>  	 * for dev_replace, pick good pages and write to the target device.
>  	 */
> @@ -1181,10 +1185,6 @@ nodatasum_case:
>  	 * area are unreadable.
>  	 */
>  
> -	/* can only fix I/O errors from here on */
> -	if (sblock_bad->no_io_error_seen)
> -		goto did_not_correct_error;
> -
>  	success = 1;
>  	for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
>  		struct scrub_page *page_bad = sblock_bad->pagev[page_num];
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhaolei Jan. 20, 2015, 3:07 a.m. UTC | #2
Hi, Stefan Behrens

Thanks for review these patch.

* From: Stefan Behrens [mailto:sbehrens@giantdisaster.de]
> Sent: Monday, January 19, 2015 8:26 PM
> On Mon, 19 Jan 2015 19:20:59 +0800, Zhaolei wrote:
> > From: Zhao Lei <zhaolei@cn.fujitsu.com>
> >
> > Current code of page level repair for dev-replace can only support
> > io-error, we can't use it in checksum-fail case.
> >
> > We can skip this kind of repair in dev-replace just as we in scrub.
> 
> Why? I see dev-replace as a disk copy operation with integrated support
> to pick the best mirror source device. Therefore the strategy was to
> always write to the target device, even in case of uncorrectable
> checksum errors. Maybe the checksum was wrong, not the data itself, and
> you can repair files later by recalculating the checksum. If you write
> nothing at all, some random, old data remains on the target disk,
> instead of potentially fixable data.
> 
> Therefore this is handled differently for the dev-replace and for the
> scrub case. For scrub, data is only overwritten when the source data is
> verified, since the goal is to repair something. Dev-replace's goal is
> to copy something.
> 
I accept your suggestion, copying from another mirror have more
opportunity to recovery than other way:
1: When checksum data is wrong, can be fixed
2: When other mirror have right data on this block
  (checksum fail may be caused by other block ), can be fixed.

Thanks
Zhaolei

> > Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
> > Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> > ---
> >  fs/btrfs/scrub.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
> > index 9afc6dd..6cf0dc7 100644
> > --- a/fs/btrfs/scrub.c
> > +++ b/fs/btrfs/scrub.c
> > @@ -1099,6 +1099,10 @@ nodatasum_case:
> >  		}
> >  	}
> >
> > +	/* can only fix I/O errors from here on */
> > +	if (sblock_bad->no_io_error_seen)
> > +		goto did_not_correct_error;
> > +
> >  	/*
> >  	 * for dev_replace, pick good pages and write to the target device.
> >  	 */
> > @@ -1181,10 +1185,6 @@ nodatasum_case:
> >  	 * area are unreadable.
> >  	 */
> >
> > -	/* can only fix I/O errors from here on */
> > -	if (sblock_bad->no_io_error_seen)
> > -		goto did_not_correct_error;
> > -
> >  	success = 1;
> >  	for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
> >  		struct scrub_page *page_bad = sblock_bad->pagev[page_num];
> >



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 9afc6dd..6cf0dc7 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1099,6 +1099,10 @@  nodatasum_case:
 		}
 	}
 
+	/* can only fix I/O errors from here on */
+	if (sblock_bad->no_io_error_seen)
+		goto did_not_correct_error;
+
 	/*
 	 * for dev_replace, pick good pages and write to the target device.
 	 */
@@ -1181,10 +1185,6 @@  nodatasum_case:
 	 * area are unreadable.
 	 */
 
-	/* can only fix I/O errors from here on */
-	if (sblock_bad->no_io_error_seen)
-		goto did_not_correct_error;
-
 	success = 1;
 	for (page_num = 0; page_num < sblock_bad->page_count; page_num++) {
 		struct scrub_page *page_bad = sblock_bad->pagev[page_num];