From patchwork Thu Jul 20 06:55:58 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gu Jinxiang
X-Patchwork-Id: 9853933
From: Gu Jinxiang
CC: Qu Wenruo
Subject: [PATCH v6 05/15] btrfs-progs: scrub: Introduce functions to scrub mirror based tree block
Date: Thu, 20 Jul 2017 14:55:58 +0800
Message-ID: <20170720065608.27563-6-gujx@cn.fujitsu.com>
X-Mailer: git-send-email 2.9.4
In-Reply-To: <20170720065608.27563-1-gujx@cn.fujitsu.com>
References: <20170720065608.27563-1-gujx@cn.fujitsu.com>
Sender: linux-btrfs-owner@vger.kernel.org
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Qu Wenruo

Introduce new functions, check_tree_mirror() and recover_tree_mirror(),
to check and recover mirror-based tree blocks (Single/DUP/RAID0/1/10).

check_tree_mirror() can also be used on in-memory tree blocks via the
@data parameter. This is handy for the RAID5/6 case: the caller can
either check the data stripe tree block using @bytenr with 0 as @mirror,
or pass recovered in-memory data through @data.

recover_tree_mirror(), by contrast, is only used for mirror-based
profiles, as RAID56 recovery is done per stripe unit, not per mirror
unit.

Signed-off-by: Qu Wenruo
Signed-off-by: Gu Jinxiang
---
 disk-io.c |   4 +-
 disk-io.h |   2 +
 scrub.c   | 145 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/disk-io.c b/disk-io.c
index 8cf800e..fb5fe40 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -51,8 +51,8 @@ static u32 max_nritems(u8 level, u32 nodesize)
 		sizeof(struct btrfs_key_ptr));
 }
 
-static int check_tree_block(struct btrfs_fs_info *fs_info,
-			    struct extent_buffer *buf)
+int check_tree_block(struct btrfs_fs_info *fs_info,
+		     struct extent_buffer *buf)
 {
 
 	struct btrfs_fs_devices *fs_devices;
diff --git a/disk-io.h b/disk-io.h
index dfe4cf0..0f65e67 100644
--- a/disk-io.h
+++ b/disk-io.h
@@ -119,6 +119,8 @@
 struct extent_buffer* read_tree_block(
 		struct btrfs_fs_info *fs_info, u64 bytenr, u32 blocksize,
 		u64 parent_transid);
+int check_tree_block(struct btrfs_fs_info *fs_info,
+		     struct extent_buffer *buf);
 int read_extent_data(struct btrfs_fs_info *fs_info, char *data, u64 logical,
 		     u64 *len, int mirror);
 void readahead_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
diff --git a/scrub.c b/scrub.c
index 41c4010..7e10ac1 100644
--- a/scrub.c
+++ b/scrub.c
@@ -117,3 +117,148 @@ static struct scrub_full_stripe *alloc_full_stripe(int nr_stripes,
 	}
 	return ret;
 }
+
+static inline int is_data_stripe(struct scrub_stripe *stripe)
+{
+	u64 bytenr = stripe->logical;
+
+	if (bytenr == BTRFS_RAID5_P_STRIPE || bytenr == BTRFS_RAID6_Q_STRIPE)
+		return 0;
+	return 1;
+}
+
+/*
+ * Check one tree mirror given by @bytenr and @mirror, or @data.
+ * If @data is not given (NULL), the function will try to read out tree block
+ * using @bytenr and @mirror.
+ * If @data is given, use data directly, won't try to read from disk.
+ *
+ * The extra @data parameter is handy for RAID5/6 recovery code to verify
+ * the recovered data.
+ *
+ * Return 0 if everything is OK.
+ * Return <0 if something goes wrong, and @scrub_ctx accounting will be updated
+ * if it's a data corruption.
+ */
+static int check_tree_mirror(struct btrfs_fs_info *fs_info,
+			     struct btrfs_scrub_progress *scrub_ctx,
+			     char *data, u64 bytenr, int mirror)
+{
+	struct extent_buffer *eb;
+	u32 nodesize = fs_info->nodesize;
+	int ret;
+
+	if (!IS_ALIGNED(bytenr, fs_info->sectorsize)) {
+		/* Such error will be reported by check_tree_block() */
+		scrub_ctx->verify_errors++;
+		return -EIO;
+	}
+
+	eb = btrfs_find_create_tree_block(fs_info, bytenr, nodesize);
+	if (!eb)
+		return -ENOMEM;
+	if (data) {
+		memcpy(eb->data, data, nodesize);
+	} else {
+		ret = read_whole_eb(fs_info, eb, mirror);
+		if (ret) {
+			scrub_ctx->read_errors++;
+			error("failed to read tree block %llu mirror %d",
+			      bytenr, mirror);
+			goto out;
+		}
+	}
+
+	scrub_ctx->tree_bytes_scrubbed += nodesize;
+	if (csum_tree_block(fs_info, eb, 1)) {
+		error("tree block %llu mirror %d checksum mismatch", bytenr,
+		      mirror);
+		scrub_ctx->csum_errors++;
+		ret = -EIO;
+		goto out;
+	}
+	ret = check_tree_block(fs_info, eb);
+	if (ret < 0) {
+		error("tree block %llu mirror %d is invalid", bytenr, mirror);
+		scrub_ctx->verify_errors++;
+		goto out;
+	}
+
+	scrub_ctx->tree_extents_scrubbed++;
+out:
+	free_extent_buffer(eb);
+	return ret;
+}
+
+/*
+ * read_extent_data() helper
+ *
+ * This function will handle short read and update @scrub_ctx when read
+ * error happens.
+ */
+static int read_extent_data_loop(struct btrfs_fs_info *fs_info,
+				 struct btrfs_scrub_progress *scrub_ctx,
+				 char *buf, u64 start, u64 len, int mirror)
+{
+	int ret = 0;
+	u64 cur = 0;
+
+	while (cur < len) {
+		u64 read_len = len - cur;
+
+		ret = read_extent_data(fs_info, buf + cur,
+				       start + cur, &read_len, mirror);
+		if (ret < 0) {
+			error("failed to read out data at bytenr %llu mirror %d",
+			      start + cur, mirror);
+			scrub_ctx->read_errors++;
+			break;
+		}
+		cur += read_len;
+	}
+	return ret;
+}
+
+/*
+ * Recover all other (corrupted) mirrors for tree block.
+ *
+ * The method is quite simple, just read out the correct mirror specified by
+ * @good_mirror and write back correct data to all other blocks
+ */
+static int recover_tree_mirror(struct btrfs_fs_info *fs_info,
+			       struct btrfs_scrub_progress *scrub_ctx,
+			       u64 start, int good_mirror)
+{
+	char *buf;
+	u32 nodesize = fs_info->nodesize;
+	int i;
+	int num_copies;
+	int ret;
+
+	buf = malloc(nodesize);
+	if (!buf)
+		return -ENOMEM;
+	ret = read_extent_data_loop(fs_info, scrub_ctx, buf, start, nodesize,
+				    good_mirror);
+	if (ret < 0) {
+		error("failed to read tree block at bytenr %llu mirror %d",
+		      start, good_mirror);
+		goto out;
+	}
+
+	num_copies = btrfs_num_copies(fs_info, start, nodesize);
+	for (i = 0; i <= num_copies; i++) {
+		if (i == good_mirror)
+			continue;
+		ret = write_data_to_disk(fs_info, buf, start, nodesize, i);
+		if (ret < 0) {
+			error("failed to write tree block at bytenr %llu mirror %d",
+			      start, i);
+			goto out;
+		}
+	}
+	ret = 0;
+out:
+	free(buf);
+	return ret;
+}
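
The short-read loop and the "read the good mirror, write it over the others" flow in the patch can be tried outside btrfs-progs with a small mock. The sketch below is only an illustration under stated assumptions: `mirrors`, `mock_read()`, `read_loop()`, `recover_mirrors()`, `NODESIZE`, `MAX_CHUNK` and `NUM_COPIES` are all hypothetical names that do not exist in btrfs-progs, the "disk" is an in-memory array, and mirrors are numbered from 1 purely as a convention of this mock (it says nothing about how the patch numbers them).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODESIZE   16	/* hypothetical tiny "tree block" size */
#define MAX_CHUNK  5	/* mock device returns at most this many bytes per call */
#define NUM_COPIES 3	/* hypothetical number of mirrors */

/* In-memory stand-in for the copies of one tree block; index 0 is mirror 1. */
char mirrors[NUM_COPIES][NODESIZE];

/*
 * Mock of a read that may come back short (not a btrfs-progs function).
 * On success *len is updated to the number of bytes actually read.
 */
int mock_read(char *dst, size_t offset, size_t *len, int mirror)
{
	if (mirror < 1 || mirror > NUM_COPIES || offset >= NODESIZE)
		return -1;
	if (*len > MAX_CHUNK)
		*len = MAX_CHUNK;
	if (*len > NODESIZE - offset)
		*len = NODESIZE - offset;
	memcpy(dst, mirrors[mirror - 1] + offset, *len);
	return 0;
}

/* Keep reading until @len bytes have arrived or a read fails. */
int read_loop(char *buf, size_t len, int mirror)
{
	size_t cur = 0;

	while (cur < len) {
		size_t read_len = len - cur;
		int ret = mock_read(buf + cur, cur, &read_len, mirror);

		if (ret < 0)
			return ret;
		cur += read_len;	/* advance by what was really read */
	}
	return 0;
}

/* Copy the known-good mirror over every other mirror. Returns 0 or <0. */
int recover_mirrors(int good_mirror)
{
	char buf[NODESIZE];
	int i;
	int ret;

	assert(good_mirror >= 1 && good_mirror <= NUM_COPIES);
	ret = read_loop(buf, NODESIZE, good_mirror);
	if (ret < 0)
		return ret;
	for (i = 1; i <= NUM_COPIES; i++) {
		if (i == good_mirror)
			continue;	/* never overwrite the source copy */
		memcpy(mirrors[i - 1], buf, NODESIZE);
	}
	return 0;
}
```

With `MAX_CHUNK` smaller than `NODESIZE`, a single call to `mock_read()` never returns the whole block, so the loop has to advance by the returned length rather than the requested one, which is the same reason the patch adds `cur += read_len` instead of assuming a full read.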