Btrfs: do not cache rbio pages if using raid6 recover

Message ID 20180113010702.25612-2-bo.li.liu@oracle.com
State New

Commit Message

Liu Bo Jan. 13, 2018, 1:07 a.m. UTC
Since raid6 recover tries all possible combinations of failed stripes,
whether the rbio can be cached depends on the rebuild algorithm used:

- when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and
  raid6_2data_recov(), it may change the in-memory content of failed
  stripes. If such a raid bio is cached, a later raid write rmw or
  recover can steal @stripe_pages from it instead of reading from disks,
  so it carries the wrong content into the write rmw or recovery and
  ends up with corruption or recovery failures.

- when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached
  because only the failed stripe, which contains @rbio->bio_pages, gets
  modified. The others remain the same, so their in-memory content is
  consistent with their on-disk content.

This adds a check to skip caching rbio if using raid6 recover.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/raid56.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
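
The raid5 case in the commit message can be illustrated with a small
userspace sketch. It is not kernel code: STRIPE_LEN, NR_DATA and both
function names are hypothetical and chosen only for this example. It
rebuilds one lost data stripe by xor and checks that the surviving
stripes are byte-for-byte unchanged afterwards, which is the property
that makes caching safe in the single-failure case.

```c
#include <stdint.h>
#include <string.h>

#define STRIPE_LEN 8	/* hypothetical tiny stripe, for illustration only */
#define NR_DATA    3	/* hypothetical number of data stripes */

/* Rebuild one missing data stripe as the xor of parity and the survivors. */
void xor_rebuild(uint8_t data[NR_DATA][STRIPE_LEN],
		 const uint8_t parity[STRIPE_LEN], int failed)
{
	memcpy(data[failed], parity, STRIPE_LEN);
	for (int i = 0; i < NR_DATA; i++) {
		if (i == failed)
			continue;
		for (int j = 0; j < STRIPE_LEN; j++)
			data[failed][j] ^= data[i][j];
	}
}

/*
 * Returns 1 if rebuilding stripe `failed` restores its content and
 * leaves every surviving stripe identical to its original content.
 */
int xor_rebuild_preserves_survivors(int failed)
{
	uint8_t data[NR_DATA][STRIPE_LEN], orig[NR_DATA][STRIPE_LEN];
	uint8_t parity[STRIPE_LEN] = {0};

	/* Fill the data stripes with a fixed pattern and compute parity. */
	for (int i = 0; i < NR_DATA; i++)
		for (int j = 0; j < STRIPE_LEN; j++) {
			data[i][j] = (uint8_t)(i * 31 + j * 7 + 1);
			parity[j] ^= data[i][j];
		}
	memcpy(orig, data, sizeof(data));

	memset(data[failed], 0xff, STRIPE_LEN);	/* simulate the lost stripe */
	xor_rebuild(data, parity, failed);

	return memcmp(orig, data, sizeof(data)) == 0;
}
```

The sketch only writes into data[failed]; the other stripes are read
but never stored to, so their in-memory copy stays consistent with
what a disk read would return.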

Comments

David Sterba Jan. 18, 2018, 3:17 p.m. UTC | #1
On Fri, Jan 12, 2018 at 06:07:02PM -0700, Liu Bo wrote:
> Since raid6 recover tries all possible combinations of failed stripes,
> 
> - when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and
>   raid6_2data_recov(), it may change the in-memory content of failed
>   stripes, if such a raid bio is cached, a later raid write rmw or recover
>   can steal @stripe_pages from it instead of reading from disks, such that
>   it carries the wrong content to do write rmw or recovery and ends up
>   with corruption or recovery failures.
> 
> - when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached
>   because the only failed stripe which contains @rbio->bio_pages gets
>   modified, others remain the same so that their in-memory content is
>   consistent with their on-disk content.
> 
> This adds a check to skip caching rbio if using raid6 recover.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Added to 4.16 queue, thanks.

Patch

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 56ae5bd..4d56f24 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1975,7 +1975,22 @@  static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
 
 cleanup_io:
 	if (rbio->operation == BTRFS_RBIO_READ_REBUILD) {
-		if (err == BLK_STS_OK)
+		/*
+		 * - In case of two failures, where rbio->failb != -1:
+		 *
+		 *   Do not cache this rbio since the above read reconstruction
+		 *   (raid6_datap_recov() or raid6_2data_recov()) may have
+		 *   changed the content of some stripes, which are then no
+		 *   longer identical to the on-disk content. Otherwise, a
+		 *   later write/recover may steal stripe_pages from this rbio
+		 *   and end up with corruption or rebuild failures.
+		 *
+		 * - In case of single failure, where rbio->failb == -1:
+		 *
+		 *   Cache this rbio iff the above read reconstruction is
+		 *   executed without problems.
+		 */
+		if (err == BLK_STS_OK && rbio->failb < 0)
 			cache_rbio_pages(rbio);
 		else
 			clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
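
The condition added by the hunk above can be read as a small predicate.
The following is a userspace sketch under stated assumptions, not the
kernel implementation: enum sketch_status, struct sketch_rbio and
sketch_should_cache() are hypothetical stand-ins for blk_status_t,
struct btrfs_raid_bio and the inline check, and only model the failb
field that the patch reads.

```c
#include <stdbool.h>

/* Hypothetical stand-in for blk_status_t; only "OK" vs "not OK" matters. */
enum sketch_status { SKETCH_STS_OK = 0, SKETCH_STS_IOERR = 1 };

/* Hypothetical stand-in for struct btrfs_raid_bio. */
struct sketch_rbio {
	int failb;	/* second failed stripe, -1 if there is none */
};

/*
 * Cache the rebuilt pages only when reconstruction succeeded and there
 * was a single failure (failb < 0). With two failures the raid6 recovery
 * helpers may have modified stripes in memory, so caching is skipped.
 */
bool sketch_should_cache(enum sketch_status err,
			 const struct sketch_rbio *rbio)
{
	return err == SKETCH_STS_OK && rbio->failb < 0;
}
```

When the predicate is false, the patched code takes the existing else
branch and clears RBIO_CACHE_READY_BIT, so later writes or recoveries
have to read the stripes from disk instead of stealing pages.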