From patchwork Fri Oct 28 02:31:53 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 9400927
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org, dsterba@suse.cz
Subject: [PATCH 17/19] btrfs-progs: check/scrub: Introduce a function to
 scrub one full stripe
Date: Fri, 28 Oct 2016 10:31:53 +0800
Message-Id: <20161028023155.27336-18-quwenruo@cn.fujitsu.com>
X-Mailer: git-send-email 2.10.1
In-Reply-To: <20161028023155.27336-1-quwenruo@cn.fujitsu.com>
References: <20161028023155.27336-1-quwenruo@cn.fujitsu.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

Introduce a new function, scrub_one_full_stripe(), to check one full
stripe.

It scrubs the full stripe in the following steps:

0) Check whether the full stripe needs to be checked at all
   If the full stripe contains no extent, why waste our CPU and IO?

1) Read out the full stripe
   Then we know how many devices are missing or have read errors.
   If the damage is beyond repair, exit.
   If a device is missing or has a read error, try to recover here.

2) Check data stripes against csum
   A data stripe with a csum error is recorded as a corrupted stripe,
   just like a missing device or a read error.
   Then recheck whether the number of corrupted stripes is still
   within tolerance.

Finally, we judge the full stripe using 2 factors only:
A) Whether the full stripe ever went through recovery
B) Whether the full stripe has a csum error

Combining factors A and B, we get:
1) A && B: Recovered, csum mismatch
   Screwed up totally.
2) A && !B: Recovered, csum match
   Recoverable: data was corrupted, but P/Q is good enough to recover
   it.
3) !A && B: Not recovered, csum mismatch
   Try to recover the corrupted data stripes.
   If the recovered data matches csum, then recoverable.
   Else, screwed up.
4) !A && !B: Not recovered, no csum mismatch
   Best case, just check whether P/Q matches.
   If P/Q matches, everything is good.
   Else, only P/Q is screwed up, which is still recoverable.

Signed-off-by: Qu Wenruo
---
 check/scrub.c | 256 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 256 insertions(+)

diff --git a/check/scrub.c b/check/scrub.c
index 7a8b783..24c0800 100644
--- a/check/scrub.c
+++ b/check/scrub.c
@@ -544,3 +544,259 @@ static int recover_from_parities(struct btrfs_fs_info *fs_info,
 	free(ptrs);
 	return ret;
 }
+
+/*
+ * Return 0 if we still have a chance to recover
+ * Return <0 if we have no more chance
+ */
+static int report_recoverablity(struct scrub_full_stripe *fstripe)
+{
+	int max_tolerance;
+	u64 start = fstripe->logical_start;
+
+	if (fstripe->bg_type & BTRFS_BLOCK_GROUP_RAID5)
+		max_tolerance = 1;
+	else
+		max_tolerance = 2;
+
+	if (fstripe->nr_corrupted_stripes > max_tolerance) {
+		error(
+	"full stripe %llu CORRUPTED: too many read errors or corrupted devices",
+			start);
+		error(
+	"full stripe %llu: tolerance: %d, missing: %d, read error: %d, csum error: %d",
+			start, max_tolerance, fstripe->err_missing_devs,
+			fstripe->err_read_stripes, fstripe->err_csum_dstripes);
+		return -EIO;
+	}
+	return 0;
+}
+
+static void clear_corrupted_stripe_record(struct scrub_full_stripe *fstripe)
+{
+	fstripe->corrupted_index[0] = -1;
+	fstripe->corrupted_index[1] = -1;
+	fstripe->nr_corrupted_stripes = 0;
+}
+
+static void record_corrupted_stripe(struct scrub_full_stripe *fstripe,
+				    int index)
+{
+	int i = 0;
+
+	for (i = 0; i < 2; i++) {
+		if (fstripe->corrupted_index[i] == -1) {
+			fstripe->corrupted_index[i] = index;
+			break;
+		}
+	}
+	fstripe->nr_corrupted_stripes++;
+}
+
+static int scrub_one_full_stripe(struct btrfs_fs_info *fs_info,
+				 struct btrfs_scrub_progress *scrub_ctx,
+				 u64 start, u64 *next_ret)
+{
+	struct scrub_full_stripe *fstripe;
+	struct btrfs_map_block *map_block = NULL;
+	u32 stripe_len = BTRFS_STRIPE_LEN;
+	u64 bg_type;
+	u64 len;
+	int i;
+	int ret;
+
+	if (!next_ret) {
+		error("invalid argument for %s", __func__);
+		return -EINVAL;
+	}
+
+	ret = __btrfs_map_block_v2(fs_info, WRITE, start, stripe_len,
+				   &map_block);
+	if (ret < 0) {
+		/* Let the caller skip the whole block group */
+		*next_ret = (u64)-1;
+		return ret;
+	}
+	start = map_block->start;
+	len = map_block->length;
+	*next_ret = start + len;
+
+	/*
+	 * Step 0: Check if we need to scrub the full stripe
+	 *
+	 * If no extent lies in the full stripe, no need to check
+	 */
+	ret = btrfs_check_extent_exists(fs_info, start, len);
+	if (ret < 0) {
+		free(map_block);
+		return ret;
+	}
+	/* No extents in range, no need to check */
+	if (ret == 0) {
+		free(map_block);
+		return 0;
+	}
+
+	bg_type = map_block->type & BTRFS_BLOCK_GROUP_PROFILE_MASK;
+	if (bg_type != BTRFS_BLOCK_GROUP_RAID5 &&
+	    bg_type != BTRFS_BLOCK_GROUP_RAID6) {
+		free(map_block);
+		return -EINVAL;
+	}
+
+	fstripe = alloc_full_stripe(map_block->num_stripes,
+				    map_block->stripe_len);
+	if (!fstripe) {
+		free(map_block);
+		return -ENOMEM;
+	}
+
+	fstripe->logical_start = map_block->start;
+	fstripe->nr_stripes = map_block->num_stripes;
+	fstripe->stripe_len = stripe_len;
+	fstripe->bg_type = bg_type;
+
+	/*
+	 * Step 1: Read out the whole full stripe
+	 *
+	 * Then we have the chance to exit early if too many devices are
+	 * missing.
+	 */
+	for (i = 0; i < map_block->num_stripes; i++) {
+		struct scrub_stripe *s_stripe = &fstripe->stripes[i];
+		struct btrfs_map_stripe *m_stripe = &map_block->stripes[i];
+
+		s_stripe->logical = m_stripe->logical;
+
+		if (m_stripe->dev->fd == -1) {
+			s_stripe->dev_missing = 1;
+			record_corrupted_stripe(fstripe, i);
+			fstripe->err_missing_devs++;
+			continue;
+		}
+
+		ret = pread(m_stripe->dev->fd, s_stripe->data, stripe_len,
+			    m_stripe->physical);
+		if (ret < 0 || (u32)ret < stripe_len) {
+			record_corrupted_stripe(fstripe, i);
+			fstripe->err_read_stripes++;
+			continue;
+		}
+	}
+
+	ret = report_recoverablity(fstripe);
+	if (ret < 0)
+		goto out;
+
+	ret = recover_from_parities(fs_info, scrub_ctx, fstripe);
+	if (ret < 0) {
+		error("full stripe %llu CORRUPTED: failed to recover: %s",
+		      fstripe->logical_start, strerror(-ret));
+		goto out;
+	}
+
+	/*
+	 * Clear the corrupted stripes record, since they are recovered,
+	 * and the later checker needs to record csum mismatch stripes
+	 * reusing these members
+	 */
+	clear_corrupted_stripe_record(fstripe);
+
+	/*
+	 * Step 2: Check each data stripe against csum
+	 */
+	for (i = 0; i < map_block->num_stripes; i++) {
+		struct scrub_stripe *stripe = &fstripe->stripes[i];
+
+		if (!is_data_stripe(stripe))
+			continue;
+		ret = scrub_one_data_stripe(fs_info, scrub_ctx, stripe,
+					    stripe_len);
+		if (ret < 0) {
+			fstripe->err_csum_dstripes++;
+			record_corrupted_stripe(fstripe, i);
+		}
+	}
+
+	ret = report_recoverablity(fstripe);
+	if (ret < 0)
+		goto out;
+
+	/*
+	 * Recovered before, but no csum error
+	 */
+	if (fstripe->err_csum_dstripes == 0 && fstripe->recovered) {
+		error(
+	"full stripe %llu RECOVERABLE: P/Q is good for recovery",
+			start);
+		ret = 0;
+		goto out;
+	}
+	/*
+	 * No csum error, not recovered before.
+	 *
+	 * Only need to check if P/Q matches.
+	 */
+	if (fstripe->err_csum_dstripes == 0 && !fstripe->recovered) {
+		ret = verify_parities(fs_info, scrub_ctx, fstripe);
+		if (ret < 0)
+			error(
+	"full stripe %llu CORRUPTED: failed to check P/Q: %s",
+				start, strerror(-ret));
+		if (ret > 0) {
+			error(
+	"full stripe %llu RECOVERABLE: only P/Q is corrupted",
+				start);
+			ret = 0;
+		}
+		goto out;
+	}
+
+	/*
+	 * Still csum error after recovery
+	 *
+	 * No way to fix further, screwed up already.
+	 */
+	if (fstripe->err_csum_dstripes && fstripe->recovered) {
+		error(
+	"full stripe %llu CORRUPTED: csum still mismatches after recovery",
+			start);
+		ret = -EIO;
+		goto out;
+	}
+
+	/* Csum mismatch, but we still have a chance to recover. */
+	ret = recover_from_parities(fs_info, scrub_ctx, fstripe);
+	if (ret < 0) {
+		error(
+	"full stripe %llu CORRUPTED: failed to recover: %s",
+			fstripe->logical_start, strerror(-ret));
+		goto out;
+	}
+
+	/* After recovery, recheck data stripe csum */
+	for (i = 0; i < 2; i++) {
+		int index = fstripe->corrupted_index[i];
+		struct scrub_stripe *stripe;
+
+		if (index == -1)
+			continue;
+		stripe = &fstripe->stripes[index];
+		ret = scrub_one_data_stripe(fs_info, scrub_ctx, stripe,
+					    stripe_len);
+		if (ret < 0) {
+			error(
+	"full stripe %llu CORRUPTED: csum still mismatches after recovery",
+				start);
+			goto out;
+		}
+	}
+	error(
+	"full stripe %llu RECOVERABLE: Data stripes corrupted, but P/Q is good",
+		start);
+
+out:
+	free_full_stripe(fstripe);
+	free(map_block);
+	return ret;
+}
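
For reviewers, the A/B decision matrix from the commit message can be
restated as a small standalone truth table. This is only an illustrative
sketch, not part of the patch: the enum, the function name
classify_full_stripe() and its boolean parameters are all hypothetical,
and the real function works on struct scrub_full_stripe and errno-style
return values instead.

```c
#include <stdbool.h>

/* Hypothetical verdicts; these names do not exist in btrfs-progs. */
enum stripe_verdict {
	STRIPE_GOOD,		/* data stripes and P/Q are all consistent */
	STRIPE_RECOVERABLE,	/* damage found, but it can be repaired */
	STRIPE_CORRUPTED,	/* damage is beyond what P/Q can repair */
};

/*
 * recovered:		factor A, the full stripe went through recovery
 *			after the read-out step
 * csum_error:		factor B, at least one data stripe failed csum
 * parity_matches:	only consulted for !A && !B
 * rebuilt_csum_ok:	only consulted for !A && B, whether the rebuilt
 *			data stripes pass csum
 */
static enum stripe_verdict classify_full_stripe(bool recovered,
						bool csum_error,
						bool parity_matches,
						bool rebuilt_csum_ok)
{
	/* 1) A && B: already rebuilt once and csum still mismatches */
	if (recovered && csum_error)
		return STRIPE_CORRUPTED;
	/* 2) A && !B: the rebuilt data matches csum */
	if (recovered)
		return STRIPE_RECOVERABLE;
	/* 3) !A && B: rebuild the bad data stripes, then recheck csum */
	if (csum_error)
		return rebuilt_csum_ok ? STRIPE_RECOVERABLE : STRIPE_CORRUPTED;
	/* 4) !A && !B: data is fine, only P/Q can be wrong */
	return parity_matches ? STRIPE_GOOD : STRIPE_RECOVERABLE;
}
```

The sketch leaves out the tolerance check (1 corrupted stripe for RAID5,
2 for RAID6), which in the patch runs before any of these four cases is
considered.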