BTRFS: Runs the xor function if a Block has failed
diff mbox

Message ID 1451456916-16520-1-git-send-email-jpage.lkml@gmail.com
State New
Headers show

Commit Message

Sanidhya Solanki Dec. 30, 2015, 6:28 a.m. UTC
The patch adds the xor function after the P stripe
has failed, without bad data or the Q stripe.

Signed-off-by: Sanidhya Solanki <jpage.lkml@gmail.com>
---
 fs/btrfs/raid56.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Sterba Dec. 30, 2015, 5:18 p.m. UTC | #1
On Wed, Dec 30, 2015 at 01:28:36AM -0500, Sanidhya Solanki wrote:
> The patch adds the xor function after the P stripe
> has failed, without bad data or the Q stripe.

That's just the comment copied, the changelog does not explain why it's
ok to do just the run_xor there. It does not seem trivial to me. Please
describe that the end result after the code change is expected.

> @@ -1864,8 +1864,8 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
>  					/*
>  					 * Just the P stripe has failed, without
>  					 * a bad data or Q stripe.
> -					 * TODO, we should redo the xor here.
>  					 */
> +					run_xor(pointers, rbio->nr_data - 1, PAGE_CACHE_SIZE);
>  					err = -EIO;
>  					goto cleanup;
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sanidhya Solanki Dec. 31, 2015, 2:15 a.m. UTC | #2
On Wed, 30 Dec 2015 18:18:26 +0100
David Sterba <dsterba@suse.cz> wrote:

> That's just the comment copied, the changelog does not explain why
> it's ok to do just the run_xor there. It does not seem trivial to me.
> Please describe that the end result after the code change is expected.

In the RAID 6 case after a failure, we discover that the failure
affected the entire P stripe, without any bad data occurring. Hence, we
xor the previously stored parity data to return the data that was lost
in the P stripe failure.

The xor-red data is from the parity blocks. Hence, we are left with 
recovered data belonging to the P stripe.

If there is an error during the completion of the xor (provided by the
patch ), we got to the cleanup function.

Hope that is satisfactory.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Jan. 5, 2016, 9:22 a.m. UTC | #3
On Wed, Dec 30, 2015 at 09:15:44PM -0500, Sanidhya Solanki wrote:
> On Wed, 30 Dec 2015 18:18:26 +0100
> David Sterba <dsterba@suse.cz> wrote:
> 
> > That's just the comment copied, the changelog does not explain why
> > it's ok to do just the run_xor there. It does not seem trivial to me.
> > Please describe that the end result after the code change is expected.
> 
> In the RAID 6 case after a failure, we discover that the failure
> affected the entire P stripe, without any bad data occurring. Hence, we
> xor the previously stored parity data to return the data that was lost
> in the P stripe failure.
> 
> The xor-red data is from the parity blocks. Hence, we are left with 
> recovered data belonging to the P stripe.

If the data a rerecovered, why is -EIO still returned? Also, I see some
post-recovery steps eg. for the damaged P stripes (at label pstripes)
and I'd expect something similar for the case you're fixing.

I'm not familiar with the raid56 implementation but the fix looks
suspiciously trivial and I doubt that the xor was omitted out of
laziness.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sanidhya Solanki Jan. 5, 2016, 10:03 a.m. UTC | #4
On Tue, 5 Jan 2016 10:22:36 +0100
David Sterba <dsterba@suse.cz> wrote:

> If the data a rerecovered, why is -EIO still returned? 

In the other places in the file where the code appears, the submitted
patch is all that is required to do the xor. I think we also need to
include the following line:
memcpy(pointers[nr_data], pointers[0], PAGE_SIZE);
and delete the "return EIO" statement which I believe appears to be a
placeholder for the xor function.
In the end the final patch should look something like this:
> -					 * TODO, we should redo the xor here.
>  					 */
> +					memcpy(pointers[nr_data], pointers[0], PAGE_SIZE);
> +					run_xor(pointers, rbio->nr_data - 1, PAGE_CACHE_SIZE); 
> - err = -EIO;

So, I was just going to send you the mail as it is written above, but I
decided to investigate. The commit in question that added the todo was
53b381b3abeb86f12787a6c40fee9b2f71edc23b. Unfortunately, it was not
submitted  by the original author, nor was it by any means a small
dedicated patch. It adds the entire file, without much comment or
explanation and has not been touched since. 

However, we can get some idea of what is expected by looking at line
2398 in raid56.c, where a similar case of raid 5 recovery is handled.

So what the patch described above does is deal with a scenario where
no q stripe or bad data block exists, and, we can only rebuild from the
p-stripe, in effect like a raid 5 recovery.

So, if you are satisfied with the above retouched change, I can modify
my original patch with your suggestions and my changes, and, I can
forward them to the list again.

> Also, I see some post-recovery steps eg. for the damaged P stripes
> (at label pstripes) and I'd expect something similar for the case
> you're fixing.

I believe that is the case because the other cases still have the q-
stripe available tor rebuild from, which requires cleanup afterwards,
but the raid5-like scenario above does not. Let me know if anything
else is needed.

> I'm not familiar with the raid56 implementation but the fix looks
> suspiciously trivial and I doubt that the xor was omitted out of
> laziness.

I guess we will never know.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1a33d3e..d33734a 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1864,8 +1864,8 @@  static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
 					/*
 					 * Just the P stripe has failed, without
 					 * a bad data or Q stripe.
-					 * TODO, we should redo the xor here.
 					 */
+					run_xor(pointers, rbio->nr_data - 1, PAGE_CACHE_SIZE);
 					err = -EIO;
 					goto cleanup;
 				}