[STABLE,5.19,0/2] btrfs: raid56: backports to reduce corruption

Message ID cover.1660891713.git.wqu@suse.com (mailing list archive)

Message

Qu Wenruo Aug. 19, 2022, 7:01 a.m. UTC
This backport is going to address the following possible corruption for
btrfs RAID56:

- RMW always writes the full P/Q stripe
  This happens even if only some vertical stripes are dirty.

  If some data sectors are corrupted in the clean vertical stripes,
  then such unnecessary P/Q updates will wipe out the chance of recovery,
  as the P/Q will be calculated using the corrupted data.

  This will be addressed by the first backport patch.

- Don't trust any cached sector when doing recovery
  Although the above patch will avoid unnecessary P/Q writes, the full P/Q
  stripe will still be updated in the in-memory-only RAID56 cache.

  To properly recover data stripes, we should not trust any cached
  sector, and should always read data from disk, which will avoid using
  corrupted P/Q sectors.

  This will be addressed by the second backport patch.
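
As a rough illustration of the first point (this is not code from the
patches), here is a toy model in C of a RAID5-style full stripe with two
data stripes and XOR parity. Every name in it (toy_stripe, toy_rmw_write,
toy_scenario, NR_SECTORS) is made up for this sketch, and each "sector" is
a single byte. It only shows why rewriting P for clean vertical stripes
destroys the chance of recovery; it does not model Q or the RAID56 cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_SECTORS 4	/* vertical stripes per full stripe (toy value) */

/* Toy RAID5 full stripe: two data stripes plus XOR parity, 1-byte "sectors". */
struct toy_stripe {
	uint8_t d0[NR_SECTORS];
	uint8_t d1[NR_SECTORS];
	uint8_t p[NR_SECTORS];
};

/* Build a consistent "on-disk" stripe from two data stripes. */
static void toy_init(struct toy_stripe *s, const uint8_t *d0, const uint8_t *d1)
{
	memcpy(s->d0, d0, NR_SECTORS);
	memcpy(s->d1, d1, NR_SECTORS);
	for (int i = 0; i < NR_SECTORS; i++)
		s->p[i] = s->d0[i] ^ s->d1[i];
}

/*
 * Sub-stripe write (RMW) of one sector in d0.
 * full_parity == true:  recompute and write P for every vertical stripe.
 * full_parity == false: only write P in the vertical stripe that is dirty.
 */
static void toy_rmw_write(struct toy_stripe *s, int sector, uint8_t val,
			  bool full_parity)
{
	assert(sector >= 0 && sector < NR_SECTORS);
	s->d0[sector] = val;
	for (int i = 0; i < NR_SECTORS; i++) {
		if (!full_parity && i != sector)
			continue;	/* clean vertical stripe: leave P alone */
		s->p[i] = s->d0[i] ^ s->d1[i];
	}
}

/* Rebuild one sector of d1 from d0 and P, as a recovery would. */
static uint8_t toy_recover_d1(const struct toy_stripe *s, int sector)
{
	return s->d0[sector] ^ s->p[sector];
}

/* Returns true if d1[2] is still recoverable after corruption plus an RMW. */
static bool toy_scenario(bool full_parity)
{
	const uint8_t d0[NR_SECTORS] = { 0x11, 0x22, 0x33, 0x44 };
	const uint8_t d1[NR_SECTORS] = { 0xa1, 0xb2, 0xc3, 0xd4 };
	struct toy_stripe s;

	toy_init(&s, d0, d1);
	s.d1[2] ^= 0xff;				/* silent corruption on disk */
	toy_rmw_write(&s, 0, 0x99, full_parity);	/* dirty vertical stripe 0 only */
	return toy_recover_d1(&s, 2) == d1[2];
}
```

In this model, toy_scenario(true) stands in for the old behavior and
reports the corrupted sector as unrecoverable, because P for vertical
stripe 2 was recalculated from the bad data. toy_scenario(false) stands in
for the patched behavior, where P for the clean vertical stripe is left
untouched and the sector can still be rebuilt.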

This would make some previously always-failing test cases, like btrfs/125,
pass, and end users would have a lower chance of corrupting their RAID56 data.

Since this is a data corruption related fix, these backport patches are
needed for all stable branches.

Unfortunately, due to the new cleanups in v6.0-rc, these backport patches
have quite a few conflicts (even for 5.19), which have to be resolved manually.
Almost every stable branch will need its own backports.

Acked-by: David Sterba <dsterba@suse.com>

Qu Wenruo (2):
  btrfs: only write the sectors in the vertical stripe which has data
    stripes
  btrfs: raid56: don't trust any cached sector in
    __raid56_parity_recover()

 fs/btrfs/raid56.c | 68 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 55 insertions(+), 13 deletions(-)

Comments

Qu Wenruo Aug. 19, 2022, 7:08 a.m. UTC | #1
BTW, since I have to re-run all the tests for every stable branch, the
backports for the other stable branches and 5.18.x may be delayed a little.

In that case, I guess I should submit backports in the most recent 
stable branches first?

Thanks,
Qu

Greg KH Aug. 19, 2022, 7:18 a.m. UTC | #2
On Fri, Aug 19, 2022 at 03:08:07PM +0800, Qu Wenruo wrote:
> BTW, since I have to re-run all the tests for every stable branch, the
> backports for the other stable branches and 5.18.x may be delayed a little.
> 
> In that case, I guess I should submit backports in the most recent stable
> branches first?

Yes please. We cannot take patches into older trees unless the same fix
is already in newer trees, as you do not want someone to upgrade and hit
a regression.

thanks,

greg k-h