From patchwork Mon Dec 4 06:56:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 13477808 Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2a07:de40:b251:101:10:150:64:2]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B393F2 for ; Sun, 3 Dec 2023 22:56:52 -0800 (PST) Received: from imap2.dmz-prg2.suse.org (imap2.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:98]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id B1C661FE4F for ; Mon, 4 Dec 2023 06:56:50 +0000 (UTC) Received: from imap2.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap2.dmz-prg2.suse.org (Postfix) with ESMTPS id B07E213588 for ; Mon, 4 Dec 2023 06:56:49 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap2.dmz-prg2.suse.org with ESMTPSA id kIW+FzF4bWXABwAAn2gu4w (envelope-from ) for ; Mon, 04 Dec 2023 06:56:49 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 1/3] btrfs-progs: check: remove inode cache clearing functionality Date: Mon, 4 Dec 2023 17:26:27 +1030 Message-ID: <9ded4d71f4ab77ee4ed8d3ff31df839e85756def.1701672971.git.wqu@suse.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spamd-Bar: +++++++++++++++++++++ X-Spam-Score: 21.50 X-Rspamd-Server: rspamd1 Authentication-Results: smtp-out2.suse.de; dkim=none; spf=fail (smtp-out2.suse.de: domain of wqu@suse.com does not designate 2a07:de40:b281:104:10:150:64:98 as permitted sender) smtp.mailfrom=wqu@suse.com; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=suse.com (policy=quarantine) X-Rspamd-Queue-Id: B1C661FE4F X-Spamd-Result: default: False [21.50 / 50.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_FAIL(1.00)[-all]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; DMARC_POLICY_QUARANTINE(1.50)[suse.com : No valid SPF, No valid DKIM,quarantine]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[linux-btrfs@vger.kernel.org]; BROKEN_CONTENT_TYPE(1.50)[]; RCPT_COUNT_ONE(0.00)[1]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:98:from]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_NONE(0.00)[]; NEURAL_SPAM_SHORT(3.00)[0.999]; MX_GOOD(-0.01)[]; NEURAL_SPAM_LONG(3.50)[1.000]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.com:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(2.20)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] Since we're already directing the end user to use "btrfs rescue clear-ino-cache" command, there is not much need to support it in btrfs-check. Signed-off-by: Qu Wenruo --- check/main.c | 12 ++---------- .../060-ino-cache-clean}/ino-cache-enabled.raw.xz | Bin .../060-ino-cache-clean}/test.sh | 2 +- 3 files changed, 3 insertions(+), 11 deletions(-) rename tests/{fsck-tests/046-ino-cache-clean => misc-tests/060-ino-cache-clean}/ino-cache-enabled.raw.xz (100%) rename tests/{fsck-tests/046-ino-cache-clean => misc-tests/060-ino-cache-clean}/test.sh (97%) diff --git a/check/main.c b/check/main.c index 901a7ef5ebcb..30967fd426ca 100644 --- a/check/main.c +++ b/check/main.c @@ -9994,7 +9994,6 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) int init_csum_tree = 0; int readonly = 0; int clear_space_cache = 0; - int clear_ino_cache = 0; int qgroup_report = 0; int qgroups_repaired = 0; int qgroup_verify_ret; @@ -10118,8 +10117,8 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) ctree_flags |= OPEN_CTREE_WRITES; break; case GETOPT_VAL_CLEAR_INO_CACHE: - clear_ino_cache = 1; - ctree_flags |= OPEN_CTREE_WRITES; + error("--clear-ino-cache option is deprecated, please use \"btrfs rescue clear-ino-cache\" instead"); + exit(1); break; case GETOPT_VAL_FORCE: force = 1; @@ -10240,13 +10239,6 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) goto close_out; } - if (clear_ino_cache) { - warning("--clear-ino-cache option is deprecated, please use \"btrfs rescue clear-ino-cache\" instead"); - ret = clear_ino_cache_items(gfs_info); - err = ret; - goto close_out; - } - /* * repair mode will force us to commit transaction which * will make us fail to load log tree when mounting. diff --git a/tests/fsck-tests/046-ino-cache-clean/ino-cache-enabled.raw.xz b/tests/misc-tests/060-ino-cache-clean/ino-cache-enabled.raw.xz similarity index 100% rename from tests/fsck-tests/046-ino-cache-clean/ino-cache-enabled.raw.xz rename to tests/misc-tests/060-ino-cache-clean/ino-cache-enabled.raw.xz diff --git a/tests/fsck-tests/046-ino-cache-clean/test.sh b/tests/misc-tests/060-ino-cache-clean/test.sh similarity index 97% rename from tests/fsck-tests/046-ino-cache-clean/test.sh rename to tests/misc-tests/060-ino-cache-clean/test.sh index cfbadf251d30..10b75cc5f5b0 100755 --- a/tests/fsck-tests/046-ino-cache-clean/test.sh +++ b/tests/misc-tests/060-ino-cache-clean/test.sh @@ -9,7 +9,7 @@ setup_root_helper image=$(extract_image "./ino-cache-enabled.raw.xz") -run_check "$TOP/btrfs" check --clear-ino-cache "$image" +run_check "$TOP/btrfs" rescue clear-ino-cache "$image" run_check "$TOP/btrfs" check "$image" # Check for FREE_INO items for toplevel subvol From patchwork Mon Dec 4 06:56:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 13477810 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="vFrxPQ9E" Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28C48FA for ; Sun, 3 Dec 2023 22:56:54 -0800 (PST) Received: from imap2.dmz-prg2.suse.org (imap2.dmz-prg2.suse.org [10.150.64.98]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id AD90921F75 for ; Mon, 4 Dec 2023 06:56:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1701673012; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qG/DyZBEgDWEEh5A0AUVMDUw0BtBnTTWSG0/Z0E68Tg=; b=vFrxPQ9EmUCrb7CVaNfINjWdZAJW23e3mt7u2mRtgJqlwY4LASEQ34p4Ad+aZWjM9Nc1A3 vHEU4anKqJGw+8st9yztc8QsM2xvSg9H3q/LARANrFU6CiqVX0v479HNwUg6/KBwH5mKRR 7rI93b6ra5hPSRXjuN+V1C4eu4UX6pU= Received: from imap2.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap2.dmz-prg2.suse.org (Postfix) with ESMTPS id 5B4D413588 for ; Mon, 4 Dec 2023 06:56:51 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap2.dmz-prg2.suse.org with ESMTPSA id kIYtAjN4bWXABwAAn2gu4w (envelope-from ) for ; Mon, 04 Dec 2023 06:56:51 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 2/3] btrfs-progs: check: export a dedicated btrfs_check() function Date: Mon, 4 Dec 2023 17:26:28 +1030 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Authentication-Results: smtp-out1.suse.de; none X-Spam-Score: 3.87 X-Spamd-Result: default: False [3.87 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[linux-btrfs@vger.kernel.org]; BROKEN_CONTENT_TYPE(1.50)[]; RCPT_COUNT_ONE(0.00)[1]; NEURAL_HAM_LONG(-1.00)[-1.000]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_NONE(0.00)[]; DKIM_SIGNED(0.00)[suse.com:s=susede1]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.com:email]; NEURAL_SPAM_SHORT(2.97)[0.992]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] There are use cases (mostly bgt conversion) where we want to verify the healthy of a filesystem before crossing the rubicon. For features like block group tree conversion, even if every operation is protected by transaction and *should* be able to revert to previous status, if we had some critical corruption in extent tree/free space tree, the COW mechanism can be broken and no transaction protection at all. Thus for those should-be safe operations, they are only safe if the target filesystem is healthy in the first place. This patch would export function btrfs_check(), with extra btrfs_check_options structure to fine-tune the behaviors. For most external callers, they would only pass with btrfs_check_default_options, but since "btrfs check" would go through the same function, all options are exported through btrfs_check_options structure. Signed-off-by: Qu Wenruo --- check/main.c | 987 +++++++++++++++++++++++++------------------------ common/utils.c | 18 + common/utils.h | 41 ++ 3 files changed, 554 insertions(+), 492 deletions(-) diff --git a/check/main.c b/check/main.c index 30967fd426ca..af7156fdae7a 100644 --- a/check/main.c +++ b/check/main.c @@ -87,14 +87,7 @@ int init_extent_tree = 0; int check_data_csum = 0; struct cache_tree *roots_info_cache = NULL; -enum btrfs_check_mode { - CHECK_MODE_ORIGINAL, - CHECK_MODE_LOWMEM, - CHECK_MODE_UNKNOWN, - CHECK_MODE_DEFAULT = CHECK_MODE_ORIGINAL -}; - -static enum btrfs_check_mode check_mode = CHECK_MODE_DEFAULT; +static enum btrfs_check_mode check_mode = BTRFS_CHECK_MODE_DEFAULT; struct device_record { struct rb_node node; @@ -266,13 +259,13 @@ static int print_status_return(void *p) static enum btrfs_check_mode parse_check_mode(const char *str) { if (strcmp(str, "lowmem") == 0) - return CHECK_MODE_LOWMEM; + return BTRFS_CHECK_MODE_LOWMEM; if (strcmp(str, "orig") == 0) - return CHECK_MODE_ORIGINAL; + return BTRFS_CHECK_MODE_ORIGINAL; if (strcmp(str, "original") == 0) - return CHECK_MODE_ORIGINAL; + return BTRFS_CHECK_MODE_ORIGINAL; - return CHECK_MODE_UNKNOWN; + return BTRFS_CHECK_MODE_UNKNOWN; } /* Compatible function to allow reuse of old codes */ @@ -3961,7 +3954,7 @@ static int do_check_fs_roots(struct cache_tree *root_cache) { int ret; - if (check_mode == CHECK_MODE_LOWMEM) + if (check_mode == BTRFS_CHECK_MODE_LOWMEM) ret = check_fs_roots_lowmem(); else ret = check_fs_roots(root_cache); @@ -9061,7 +9054,7 @@ static int do_check_chunks_and_extents(void) { int ret; - if (check_mode == CHECK_MODE_LOWMEM) + if (check_mode == BTRFS_CHECK_MODE_LOWMEM) ret = check_chunks_and_extents_lowmem(); else ret = check_chunks_and_extents(); @@ -9978,30 +9971,486 @@ static const char * const cmd_check_usage[] = { NULL }; -static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) +bool btrfs_check(const char *device, const struct btrfs_check_options *options) { struct cache_tree root_cache; struct btrfs_root *root; struct open_ctree_args oca = { 0 }; - u64 bytenr = 0; - u64 subvolid = 0; - u64 tree_root_bytenr = 0; - u64 chunk_root_bytenr = 0; char uuidbuf[BTRFS_UUID_UNPARSED_SIZE]; + u64 super_bytenr = 0; + /* If we found non-critical error and need continue checking. */ + bool found_error = false; int ret = 0; - int err = 0; - u64 num; - int init_csum_tree = 0; - int readonly = 0; - int clear_space_cache = 0; - int qgroup_report = 0; int qgroups_repaired = 0; - int qgroup_verify_ret; unsigned ctree_flags = OPEN_CTREE_EXCLUSIVE | OPEN_CTREE_ALLOW_TRANSID_MISMATCH | OPEN_CTREE_SKIP_LEAF_ITEM_CHECKS; - int force = 0; + if (options->use_backup_roots) + ctree_flags |= OPEN_CTREE_BACKUP_ROOT; + if (options->use_nth_super) { + if (options->use_nth_super >= BTRFS_SUPER_MIRROR_MAX) { + error( + "super mirror should be less than %d", + BTRFS_SUPER_MIRROR_MAX); + exit(1); + } + super_bytenr = btrfs_sb_offset(options->use_nth_super); + printf("using SB copy %u, bytenr %llu\n", + options->use_nth_super, super_bytenr); + } + if (options->repair) { + printf("enabling repair mode\n"); + opt_check_repair = 1; + ctree_flags |= OPEN_CTREE_WRITES; + } + if (options->init_csum_tree) { + printf("Creating a new CRC tree\n"); + opt_check_repair = 1; + ctree_flags |= OPEN_CTREE_WRITES; + } + if (options->init_extent_tree) { + ctree_flags |= (OPEN_CTREE_WRITES | OPEN_CTREE_NO_BLOCK_GROUPS); + opt_check_repair = 1; + init_extent_tree = 1; + } + if (options->clear_space_cache) { + if (options->clear_space_cache == 2) + ctree_flags |= OPEN_CTREE_INVALIDATE_FST; + ctree_flags |= OPEN_CTREE_WRITES; + } + g_task_ctx.progress_enabled = options->show_progress; + check_data_csum = options->check_data_csum; + check_mode = options->check_mode; + + if (g_task_ctx.progress_enabled) { + g_task_ctx.tp = TASK_NOTHING; + g_task_ctx.info = task_init(print_status_check, print_status_return, &g_task_ctx); + } + + /* + * Make sure we didn't trigger some write operation with --readonly + * specified. + */ + if (options->force_readonly && opt_check_repair) { + error("repair options are not compatible with --readonly"); + exit(1); + } + + if (opt_check_repair && !options->force) { + int delay = 10; + + printf("WARNING:\n\n"); + printf("\tDo not use --repair unless you are advised to do so by a developer\n"); + printf("\tor an experienced user, and then only after having accepted that no\n"); + printf("\tfsck can successfully repair all types of filesystem corruption. E.g.\n"); + printf("\tsome software or hardware bugs can fatally damage a volume.\n"); + printf("\tThe operation will start in %d seconds.\n", delay); + printf("\tUse Ctrl-C to stop it.\n"); + while (delay) { + printf("%2d", delay--); + fflush(stdout); + sleep(1); + } + printf("\nStarting repair.\n"); + } + + /* Experimental */ + if (opt_check_repair && check_mode == BTRFS_CHECK_MODE_LOWMEM) + warning("low-memory mode repair support is only partial"); + + printf("Opening filesystem to check...\n"); + + cache_tree_init(&root_cache); + qgroup_set_item_count_ptr(&g_task_ctx.item_count); + + ret = check_mounted(device); + if (!options->force) { + if (ret < 0) { + errno = -ret; + error("could not check mount status: %m"); + goto err_out; + } else if (ret) { + error( +"%s is currently mounted, use --force if you really intend to check the filesystem", + device); + ret = -EBUSY; + goto err_out; + } + } else { + if (ret < 0) { + warning( +"cannot check mount status of %s, the filesystem could be mounted, continuing because of --force", + device); + } else if (ret) { + warning( + "filesystem mounted, continuing because of --force"); + } + /* A block device is mounted in exclusive mode by kernel */ + ctree_flags &= ~OPEN_CTREE_EXCLUSIVE; + } + + /* only allow partial opening under repair mode */ + if (opt_check_repair) + ctree_flags |= OPEN_CTREE_PARTIAL; + + oca.filename = device; + oca.sb_bytenr = super_bytenr; + oca.root_tree_bytenr = options->tree_root; + oca.chunk_tree_bytenr = options->chunk_root; + oca.flags = ctree_flags; + gfs_info = open_ctree_fs_info(&oca); + if (!gfs_info) { + error("cannot open file system"); + ret = -EIO; + goto err_out; + } + + root = gfs_info->fs_root; + uuid_unparse(gfs_info->super_copy->fsid, uuidbuf); + + printf("Checking filesystem on %s\nUUID: %s\n", device, uuidbuf); + + /* + * Check the bare minimum before starting anything else that could rely + * on it, namely the tree roots, any local consistency checks + */ + if (!extent_buffer_uptodate(gfs_info->tree_root->node) || + !extent_buffer_uptodate(gfs_info->chunk_root->node)) { + error("critical roots corrupted, unable to check the filesystem"); + ret = -EIO; + goto close_out; + } + + if (options->clear_space_cache) { + warning("--clear-space-cache option is deprecated, please use \"btrfs rescue clear-space-cache\" instead"); + ret = do_clear_free_space_cache(gfs_info, + options->clear_space_cache); + goto close_out; + } + + /* + * repair mode will force us to commit transaction which + * will make us fail to load log tree when mounting. + */ + if (opt_check_repair && btrfs_super_log_root(gfs_info->super_copy)) { + ret = ask_user("repair mode will force to clear out log tree, are you sure?"); + if (!ret) { + ret = 1; + goto close_out; + } + ret = zero_log_tree(root); + if (ret) { + error("failed to zero log tree: %d", ret); + goto close_out; + } + } + + if (options->qgroup_report) { + printf("Print quota groups for %s\nUUID: %s\n", device, + uuidbuf); + ret = qgroup_verify_all(gfs_info); + if (ret) + found_error = true; + if (ret >= 0) + report_qgroups(1); + goto close_out; + } + if (options->subvolid) { + printf("Print extent state for subvolume %llu on %s\nUUID: %s\n", + options->subvolid, device, uuidbuf); + ret = print_extent_state(gfs_info, options->subvolid); + goto close_out; + } + + if (options->init_extent_tree || options->init_csum_tree) { + struct btrfs_trans_handle *trans; + + /* + * If we're rebuilding extent tree, we must keep the flag set + * for the whole duration of btrfs check. As we rely on later + * extent tree check code to rebuild block group items, thus we + * can no longer trust the free space for metadata. + */ + if (init_extent_tree) + gfs_info->rebuilding_extent_tree = 1; + trans = btrfs_start_transaction(gfs_info->tree_root, 0); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + errno = -ret; + error_msg(ERROR_MSG_START_TRANS, "%m"); + goto close_out; + } + + trans->reinit_extent_tree = true; + if (init_extent_tree) { + printf("Creating a new extent tree\n"); + ret = reinit_extent_tree(trans, + check_mode == BTRFS_CHECK_MODE_ORIGINAL); + if (ret) + goto close_out; + } + + if (options->init_csum_tree) { + printf("Reinitialize checksum tree\n"); + ret = reinit_global_roots(trans, + BTRFS_CSUM_TREE_OBJECTID); + if (ret) { + error("checksum tree initialization failed: %d", + ret); + ret = -EIO; + goto close_out; + } + + ret = fill_csum_tree(trans, init_extent_tree); + if (ret) { + error("checksum tree refilling failed: %d", ret); + goto close_out; + } + } + /* + * Ok now we commit and run the normal fsck, which will add + * extent entries for all of the items it finds. + */ + ret = btrfs_commit_transaction(trans, gfs_info->tree_root); + if (ret) + goto close_out; + } + + ret = check_global_roots_uptodate(); + if (ret) + goto close_out; + + if (!init_extent_tree) { + if (!g_task_ctx.progress_enabled) { + fprintf(stderr, "[1/7] checking root items\n"); + } else { + g_task_ctx.tp = TASK_ROOT_ITEMS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, + &g_task_ctx.item_count); + } + ret = repair_root_items(); + task_stop(g_task_ctx.info); + if (ret < 0) { + errno = -ret; + error("failed to repair root items: %m"); + found_error = true; + /* + * For repair, if we can't repair root items, it's + * fatal. But for non-repair, it's pretty rare to hit + * such v3.17 era bug, we want to continue check. + */ + if (opt_check_repair) + goto close_out; + } else { + if (opt_check_repair) { + fprintf(stderr, "Fixed %d roots.\n", ret); + ret = 0; + } else if (ret > 0) { + fprintf(stderr, + "Found %d roots with an outdated root item.\n", + ret); + fprintf(stderr, + "Please run a filesystem check with the option --repair to fix them.\n"); + ret = 1; + found_error = true; + } + } + } else { + fprintf(stderr, "[1/7] checking root items... skipped\n"); + } + + if (!g_task_ctx.progress_enabled) { + fprintf(stderr, "[2/7] checking extents\n"); + } else { + g_task_ctx.tp = TASK_EXTENTS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + ret = do_check_chunks_and_extents(); + task_stop(g_task_ctx.info); + if (ret) { + error( + "errors found in extent allocation tree or chunk allocation"); + found_error = true; + } + + /* Only re-check super size after we checked and repaired the fs */ + if (!is_super_size_valid()) + found_error = true; + + is_free_space_tree = btrfs_fs_compat_ro(gfs_info, FREE_SPACE_TREE); + + if (!g_task_ctx.progress_enabled) { + if (is_free_space_tree) + fprintf(stderr, "[3/7] checking free space tree\n"); + else + fprintf(stderr, "[3/7] checking free space cache\n"); + } else { + g_task_ctx.tp = TASK_FREE_SPACE; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + + ret = validate_free_space_cache(root, &g_task_ctx); + if (ret) + found_error = true; + task_stop(g_task_ctx.info); + + /* + * We used to have to have these hole extents in between our real + * extents so if we don't have this flag set we need to make sure there + * are no gaps in the file extents for inodes, otherwise we can just + * ignore it when this happens. + */ + no_holes = btrfs_fs_incompat(gfs_info, NO_HOLES); + if (!g_task_ctx.progress_enabled) { + fprintf(stderr, "[4/7] checking fs roots\n"); + } else { + g_task_ctx.tp = TASK_FS_ROOTS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + + ret = do_check_fs_roots(&root_cache); + task_stop(g_task_ctx.info); + if (ret) { + error("errors found in fs roots"); + goto out; + } + + if (!g_task_ctx.progress_enabled) { + if (check_data_csum) + fprintf(stderr, "[5/7] checking csums against data\n"); + else + fprintf(stderr, + "[5/7] checking only csums items (without verifying data)\n"); + } else { + g_task_ctx.tp = TASK_CSUMS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + + ret = check_csums(); + task_stop(g_task_ctx.info); + /* + * Data csum error is not fatal, and it may indicate more serious + * corruption, continue checking. + */ + if (ret) { + error("errors found in csum tree"); + found_error = true; + } + + /* For low memory mode, check_fs_roots_v2 handles root refs */ + if (check_mode != BTRFS_CHECK_MODE_LOWMEM) { + if (!g_task_ctx.progress_enabled) { + fprintf(stderr, "[6/7] checking root refs\n"); + } else { + g_task_ctx.tp = TASK_ROOT_REFS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + + ret = check_root_refs(root, &root_cache); + task_stop(g_task_ctx.info); + if (ret) { + error("errors found in root refs"); + goto out; + } + } else { + fprintf(stderr, + "[6/7] checking root refs done with fs roots in lowmem mode, skipping\n"); + } + + while (opt_check_repair && !list_empty(&gfs_info->recow_ebs)) { + struct extent_buffer *eb; + + eb = list_first_entry(&gfs_info->recow_ebs, + struct extent_buffer, recow); + list_del_init(&eb->recow); + ret = recow_extent_buffer(root, eb); + free_extent_buffer(eb); + if (ret) { + error("fails to fix transid errors"); + found_error = true;; + break; + } + } + + while (!list_empty(&delete_items)) { + struct bad_item *bad; + + bad = list_first_entry(&delete_items, struct bad_item, list); + list_del_init(&bad->list); + if (opt_check_repair) { + ret = delete_bad_item(root, bad); + if (ret) + found_error = true; + } + free(bad); + } + + if (gfs_info->quota_enabled) { + int qgroup_verify_ret; + + if (!g_task_ctx.progress_enabled) { + fprintf(stderr, "[7/7] checking quota groups\n"); + } else { + g_task_ctx.tp = TASK_QGROUPS; + task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); + } + qgroup_verify_ret = qgroup_verify_all(gfs_info); + task_stop(g_task_ctx.info); + if (qgroup_verify_ret < 0) { + error("failed to check quota groups"); + goto out; + } + report_qgroups(0); + ret = repair_qgroups(gfs_info, &qgroups_repaired, false); + if (ret) { + error("failed to repair quota groups"); + goto out; + } + if (qgroup_verify_ret && (!qgroups_repaired || ret)) + found_error = true; + ret = 0; + } else { + fprintf(stderr, + "[7/7] checking quota groups skipped (not enabled on this FS)\n"); + } + + if (!list_empty(&gfs_info->recow_ebs)) { + error("transid errors in file system"); + ret = 1; + found_error = true; + } +out: + printf("found %llu bytes used, ", bytes_used); + if (found_error || ret) + printf("error(s) found\n"); + else + printf("no error found\n"); + printf("total csum bytes: %llu\n", total_csum_bytes); + printf("total tree bytes: %llu\n", total_btree_bytes); + printf("total fs tree bytes: %llu\n", total_fs_tree_bytes); + printf("total extent tree bytes: %llu\n", total_extent_tree_bytes); + printf("btree space waste bytes: %llu\n", btree_space_waste); + printf("file data blocks allocated: %llu\n referenced %llu\n", + data_bytes_allocated, data_bytes_referenced); + + free_qgroup_counts(); + free_root_recs_tree(&root_cache); +close_out: + close_ctree(root); +err_out: + if (g_task_ctx.progress_enabled) + task_deinit(g_task_ctx.info); + + return ret || found_error; +} + +static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) +{ + struct btrfs_check_options options; + int ret = 0; + + memcpy(&options, &btrfs_default_check_options, sizeof(options)); while(1) { int c; enum { GETOPT_VAL_REPAIR = GETOPT_VAL_FIRST, GETOPT_VAL_INIT_CSUM, @@ -10042,86 +10491,68 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) switch(c) { case 'a': /* ignored */ break; case 'b': - ctree_flags |= OPEN_CTREE_BACKUP_ROOT; + options.use_backup_roots = true; break; case 's': - num = arg_strtou64(optarg); - if (num >= BTRFS_SUPER_MIRROR_MAX) { - error( - "super mirror should be less than %d", - BTRFS_SUPER_MIRROR_MAX); - exit(1); - } - bytenr = btrfs_sb_offset(((int)num)); - printf("using SB copy %llu, bytenr %llu\n", num, bytenr); + options.use_nth_super = arg_strtou64(optarg); break; case 'Q': - qgroup_report = 1; + options.qgroup_report = true; break; case 'E': - subvolid = arg_strtou64(optarg); + options.subvolid = arg_strtou64(optarg); break; case 'r': - tree_root_bytenr = arg_strtou64(optarg); + options.tree_root = arg_strtou64(optarg); break; case GETOPT_VAL_CHUNK_TREE: - chunk_root_bytenr = arg_strtou64(optarg); + options.chunk_root = arg_strtou64(optarg); break; case 'p': - g_task_ctx.progress_enabled = true; + options.show_progress = true; break; case '?': case 'h': usage(cmd, 0); case GETOPT_VAL_REPAIR: - printf("enabling repair mode\n"); - opt_check_repair = 1; - ctree_flags |= OPEN_CTREE_WRITES; + options.repair = true; break; case GETOPT_VAL_READONLY: - readonly = 1; + options.force_readonly = true; break; case GETOPT_VAL_INIT_CSUM: - printf("Creating a new CRC tree\n"); - init_csum_tree = 1; - opt_check_repair = 1; - ctree_flags |= OPEN_CTREE_WRITES; + options.init_csum_tree = true; break; case GETOPT_VAL_INIT_EXTENT: - init_extent_tree = 1; - ctree_flags |= (OPEN_CTREE_WRITES | - OPEN_CTREE_NO_BLOCK_GROUPS); - opt_check_repair = 1; + options.init_extent_tree = 1; break; case GETOPT_VAL_CHECK_CSUM: - check_data_csum = 1; + options.check_data_csum = 1; break; case GETOPT_VAL_MODE: - check_mode = parse_check_mode(optarg); - if (check_mode == CHECK_MODE_UNKNOWN) { + options.check_mode = parse_check_mode(optarg); + if (options.check_mode == BTRFS_CHECK_MODE_UNKNOWN) { error("unknown mode: %s", optarg); exit(1); } break; case GETOPT_VAL_CLEAR_SPACE_CACHE: if (strcmp(optarg, "v1") == 0) { - clear_space_cache = 1; + options.clear_space_cache = 1; } else if (strcmp(optarg, "v2") == 0) { - clear_space_cache = 2; - ctree_flags |= OPEN_CTREE_INVALIDATE_FST; + options.clear_space_cache = 2; } else { error( "invalid argument to --clear-space-cache, must be v1 or v2"); exit(1); } - ctree_flags |= OPEN_CTREE_WRITES; break; case GETOPT_VAL_CLEAR_INO_CACHE: error("--clear-ino-cache option is deprecated, please use \"btrfs rescue clear-ino-cache\" instead"); exit(1); break; case GETOPT_VAL_FORCE: - force = 1; + options.force = 1; break; } } @@ -10129,435 +10560,7 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv) if (check_argc_exact(argc - optind, 1)) usage(cmd, 1); - if (g_task_ctx.progress_enabled) { - g_task_ctx.tp = TASK_NOTHING; - g_task_ctx.info = task_init(print_status_check, print_status_return, &g_task_ctx); - } - - /* This check is the only reason for --readonly to exist */ - if (readonly && opt_check_repair) { - error("repair options are not compatible with --readonly"); - exit(1); - } - - if (opt_check_repair && !force) { - int delay = 10; - - printf("WARNING:\n\n"); - printf("\tDo not use --repair unless you are advised to do so by a developer\n"); - printf("\tor an experienced user, and then only after having accepted that no\n"); - printf("\tfsck can successfully repair all types of filesystem corruption. E.g.\n"); - printf("\tsome software or hardware bugs can fatally damage a volume.\n"); - printf("\tThe operation will start in %d seconds.\n", delay); - printf("\tUse Ctrl-C to stop it.\n"); - while (delay) { - printf("%2d", delay--); - fflush(stdout); - sleep(1); - } - printf("\nStarting repair.\n"); - } - - /* - * experimental and dangerous - */ - if (opt_check_repair && check_mode == CHECK_MODE_LOWMEM) - warning("low-memory mode repair support is only partial"); - - printf("Opening filesystem to check...\n"); - - cache_tree_init(&root_cache); - qgroup_set_item_count_ptr(&g_task_ctx.item_count); - - ret = check_mounted(argv[optind]); - if (!force) { - if (ret < 0) { - errno = -ret; - error("could not check mount status: %m"); - err |= !!ret; - goto err_out; - } else if (ret) { - error( -"%s is currently mounted, use --force if you really intend to check the filesystem", - argv[optind]); - ret = -EBUSY; - err |= !!ret; - goto err_out; - } - } else { - if (ret < 0) { - warning( -"cannot check mount status of %s, the filesystem could be mounted, continuing because of --force", - argv[optind]); - } else if (ret) { - warning( - "filesystem mounted, continuing because of --force"); - } - /* A block device is mounted in exclusive mode by kernel */ - ctree_flags &= ~OPEN_CTREE_EXCLUSIVE; - } - - /* only allow partial opening under repair mode */ - if (opt_check_repair) - ctree_flags |= OPEN_CTREE_PARTIAL; - - oca.filename = argv[optind]; - oca.sb_bytenr = bytenr; - oca.root_tree_bytenr = tree_root_bytenr; - oca.chunk_tree_bytenr = chunk_root_bytenr; - oca.flags = ctree_flags; - gfs_info = open_ctree_fs_info(&oca); - if (!gfs_info) { - error("cannot open file system"); - ret = -EIO; - err |= !!ret; - goto err_out; - } - - root = gfs_info->fs_root; - uuid_unparse(gfs_info->super_copy->fsid, uuidbuf); - - printf("Checking filesystem on %s\nUUID: %s\n", argv[optind], uuidbuf); - - /* - * Check the bare minimum before starting anything else that could rely - * on it, namely the tree roots, any local consistency checks - */ - if (!extent_buffer_uptodate(gfs_info->tree_root->node) || - !extent_buffer_uptodate(gfs_info->dev_root->node) || - !extent_buffer_uptodate(gfs_info->chunk_root->node)) { - error("critical roots corrupted, unable to check the filesystem"); - err |= !!ret; - ret = -EIO; - goto close_out; - } - - if (clear_space_cache) { - warning("--clear-space-cache option is deprecated, please use \"btrfs rescue clear-space-cache\" instead"); - ret = do_clear_free_space_cache(gfs_info, clear_space_cache); - err |= !!ret; - goto close_out; - } - - /* - * repair mode will force us to commit transaction which - * will make us fail to load log tree when mounting. - */ - if (opt_check_repair && btrfs_super_log_root(gfs_info->super_copy)) { - ret = ask_user("repair mode will force to clear out log tree, are you sure?"); - if (!ret) { - ret = 1; - err |= !!ret; - goto close_out; - } - ret = zero_log_tree(root); - err |= !!ret; - if (ret) { - error("failed to zero log tree: %d", ret); - goto close_out; - } - } - - if (qgroup_report) { - printf("Print quota groups for %s\nUUID: %s\n", argv[optind], - uuidbuf); - ret = qgroup_verify_all(gfs_info); - err |= !!ret; - if (ret >= 0) - report_qgroups(1); - goto close_out; - } - if (subvolid) { - printf("Print extent state for subvolume %llu on %s\nUUID: %s\n", - subvolid, argv[optind], uuidbuf); - ret = print_extent_state(gfs_info, subvolid); - err |= !!ret; - goto close_out; - } - - if (init_extent_tree || init_csum_tree) { - struct btrfs_trans_handle *trans; - - /* - * If we're rebuilding extent tree, we must keep the flag set - * for the whole duration of btrfs check. As we rely on later - * extent tree check code to rebuild block group items, thus we - * can no longer trust the free space for metadata. - */ - if (init_extent_tree) - gfs_info->rebuilding_extent_tree = 1; - trans = btrfs_start_transaction(gfs_info->tree_root, 0); - if (IS_ERR(trans)) { - ret = PTR_ERR(trans); - errno = -ret; - error_msg(ERROR_MSG_START_TRANS, "%m"); - err |= !!ret; - goto close_out; - } - - trans->reinit_extent_tree = true; - if (init_extent_tree) { - printf("Creating a new extent tree\n"); - ret = reinit_extent_tree(trans, - check_mode == CHECK_MODE_ORIGINAL); - err |= !!ret; - if (ret) - goto close_out; - } - - if (init_csum_tree) { - printf("Reinitialize checksum tree\n"); - ret = reinit_global_roots(trans, - BTRFS_CSUM_TREE_OBJECTID); - if (ret) { - error("checksum tree initialization failed: %d", - ret); - ret = -EIO; - err |= !!ret; - goto close_out; - } - - ret = fill_csum_tree(trans, init_extent_tree); - err |= !!ret; - if (ret) { - error("checksum tree refilling failed: %d", ret); - return -EIO; - } - } - /* - * Ok now we commit and run the normal fsck, which will add - * extent entries for all of the items it finds. - */ - ret = btrfs_commit_transaction(trans, gfs_info->tree_root); - err |= !!ret; - if (ret) - goto close_out; - } - - ret = check_global_roots_uptodate(); - if (ret) { - err |= !!ret; - goto close_out; - } - - if (!init_extent_tree) { - if (!g_task_ctx.progress_enabled) { - fprintf(stderr, "[1/7] checking root items\n"); - } else { - g_task_ctx.tp = TASK_ROOT_ITEMS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, - &g_task_ctx.item_count); - } - ret = repair_root_items(); - task_stop(g_task_ctx.info); - if (ret < 0) { - err = !!ret; - errno = -ret; - error("failed to repair root items: %m"); - /* - * For repair, if we can't repair root items, it's - * fatal. But for non-repair, it's pretty rare to hit - * such v3.17 era bug, we want to continue check. - */ - if (opt_check_repair) - goto close_out; - err |= 1; - } else { - if (opt_check_repair) { - fprintf(stderr, "Fixed %d roots.\n", ret); - ret = 0; - } else if (ret > 0) { - fprintf(stderr, - "Found %d roots with an outdated root item.\n", - ret); - fprintf(stderr, - "Please run a filesystem check with the option --repair to fix them.\n"); - ret = 1; - err |= ret; - } - } - } else { - fprintf(stderr, "[1/7] checking root items... skipped\n"); - } - - if (!g_task_ctx.progress_enabled) { - fprintf(stderr, "[2/7] checking extents\n"); - } else { - g_task_ctx.tp = TASK_EXTENTS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - ret = do_check_chunks_and_extents(); - task_stop(g_task_ctx.info); - err |= !!ret; - if (ret) - error( - "errors found in extent allocation tree or chunk allocation"); - - /* Only re-check super size after we checked and repaired the fs */ - err |= !is_super_size_valid(); - - is_free_space_tree = btrfs_fs_compat_ro(gfs_info, FREE_SPACE_TREE); - - if (!g_task_ctx.progress_enabled) { - if (is_free_space_tree) - fprintf(stderr, "[3/7] checking free space tree\n"); - else - fprintf(stderr, "[3/7] checking free space cache\n"); - } else { - g_task_ctx.tp = TASK_FREE_SPACE; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - - ret = validate_free_space_cache(root, &g_task_ctx); - task_stop(g_task_ctx.info); - err |= !!ret; - - /* - * We used to have to have these hole extents in between our real - * extents so if we don't have this flag set we need to make sure there - * are no gaps in the file extents for inodes, otherwise we can just - * ignore it when this happens. - */ - no_holes = btrfs_fs_incompat(gfs_info, NO_HOLES); - if (!g_task_ctx.progress_enabled) { - fprintf(stderr, "[4/7] checking fs roots\n"); - } else { - g_task_ctx.tp = TASK_FS_ROOTS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - - ret = do_check_fs_roots(&root_cache); - task_stop(g_task_ctx.info); - err |= !!ret; - if (ret) { - error("errors found in fs roots"); - goto out; - } - - if (!g_task_ctx.progress_enabled) { - if (check_data_csum) - fprintf(stderr, "[5/7] checking csums against data\n"); - else - fprintf(stderr, - "[5/7] checking only csums items (without verifying data)\n"); - } else { - g_task_ctx.tp = TASK_CSUMS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - - ret = check_csums(); - task_stop(g_task_ctx.info); - /* - * Data csum error is not fatal, and it may indicate more serious - * corruption, continue checking. - */ - if (ret) - error("errors found in csum tree"); - err |= !!ret; - - /* For low memory mode, check_fs_roots_v2 handles root refs */ - if (check_mode != CHECK_MODE_LOWMEM) { - if (!g_task_ctx.progress_enabled) { - fprintf(stderr, "[6/7] checking root refs\n"); - } else { - g_task_ctx.tp = TASK_ROOT_REFS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - - ret = check_root_refs(root, &root_cache); - task_stop(g_task_ctx.info); - err |= !!ret; - if (ret) { - error("errors found in root refs"); - goto out; - } - } else { - fprintf(stderr, - "[6/7] checking root refs done with fs roots in lowmem mode, skipping\n"); - } - - while (opt_check_repair && !list_empty(&gfs_info->recow_ebs)) { - struct extent_buffer *eb; - - eb = list_first_entry(&gfs_info->recow_ebs, - struct extent_buffer, recow); - list_del_init(&eb->recow); - ret = recow_extent_buffer(root, eb); - free_extent_buffer(eb); - err |= !!ret; - if (ret) { - error("fails to fix transid errors"); - break; - } - } - - while (!list_empty(&delete_items)) { - struct bad_item *bad; - - bad = list_first_entry(&delete_items, struct bad_item, list); - list_del_init(&bad->list); - if (opt_check_repair) { - ret = delete_bad_item(root, bad); - err |= !!ret; - } - free(bad); - } - - if (gfs_info->quota_enabled) { - if (!g_task_ctx.progress_enabled) { - fprintf(stderr, "[7/7] checking quota groups\n"); - } else { - g_task_ctx.tp = TASK_QGROUPS; - task_start(g_task_ctx.info, &g_task_ctx.start_time, &g_task_ctx.item_count); - } - qgroup_verify_ret = qgroup_verify_all(gfs_info); - task_stop(g_task_ctx.info); - if (qgroup_verify_ret < 0) { - error("failed to check quota groups"); - err |= !!qgroup_verify_ret; - goto out; - } - report_qgroups(0); - ret = repair_qgroups(gfs_info, &qgroups_repaired, false); - if (ret) { - error("failed to repair quota groups"); - goto out; - } - if (qgroup_verify_ret && (!qgroups_repaired || ret)) - err |= !!qgroup_verify_ret; - ret = 0; - } else { - fprintf(stderr, - "[7/7] checking quota groups skipped (not enabled on this FS)\n"); - } - - if (!list_empty(&gfs_info->recow_ebs)) { - error("transid errors in file system"); - ret = 1; - err |= !!ret; - } -out: - printf("found %llu bytes used, ", bytes_used); - if (err) - printf("error(s) found\n"); - else - printf("no error found\n"); - printf("total csum bytes: %llu\n", total_csum_bytes); - printf("total tree bytes: %llu\n", total_btree_bytes); - printf("total fs tree bytes: %llu\n", total_fs_tree_bytes); - printf("total extent tree bytes: %llu\n", total_extent_tree_bytes); - printf("btree space waste bytes: %llu\n", btree_space_waste); - printf("file data blocks allocated: %llu\n referenced %llu\n", - data_bytes_allocated, data_bytes_referenced); - - free_qgroup_counts(); - free_root_recs_tree(&root_cache); -close_out: - close_ctree(root); -err_out: - if (g_task_ctx.progress_enabled) - task_deinit(g_task_ctx.info); - - return err; + ret = btrfs_check(argv[optind], &options); + return ret; } DEFINE_SIMPLE_COMMAND(check, "check"); diff --git a/common/utils.c b/common/utils.c index 035f27052f24..0a00991af10c 100644 --- a/common/utils.c +++ b/common/utils.c @@ -58,6 +58,24 @@ struct pending_dir { char name[PATH_MAX]; }; +const struct btrfs_check_options btrfs_default_check_options = { + .check_mode = BTRFS_CHECK_MODE_DEFAULT, + .force_readonly = false, + .use_backup_roots = false, + .check_data_csum = false, + .show_progress = false, + .qgroup_report = false, + .chunk_root = 0, + .tree_root = 0, + .subvolid = 0, + .use_nth_super = 0, + .clear_space_cache = 0, + .repair = false, + .init_csum_tree = false, + .init_extent_tree = false, + .force = false, +}; + void btrfs_format_csum(u16 csum_type, const u8 *data, char *output) { int i; diff --git a/common/utils.h b/common/utils.h index e267814b08a8..5f93a79a38bc 100644 --- a/common/utils.h +++ b/common/utils.h @@ -91,8 +91,49 @@ struct btrfs_config { int dry_run; struct list_head params; }; + +enum btrfs_check_mode { + BTRFS_CHECK_MODE_ORIGINAL, + BTRFS_CHECK_MODE_LOWMEM, + BTRFS_CHECK_MODE_UNKNOWN, + BTRFS_CHECK_MODE_DEFAULT = BTRFS_CHECK_MODE_ORIGINAL +}; + +/* For external tools to call "btrfs check". */ +struct btrfs_check_options { + enum btrfs_check_mode check_mode; + + /* Whether we rejects any writes. */ + bool force_readonly; + bool use_backup_roots; + bool check_data_csum; + bool show_progress; + bool qgroup_report; + u64 chunk_root; + u64 tree_root; + u64 subvolid; + int use_nth_super; + + /* + * 0 means do no clear space cache. 1 means v1 free space cache + * while 2 means v2. + */ + int clear_space_cache; + + /* The remaining options are all dangerous. */ + bool repair; + bool init_csum_tree; + bool init_extent_tree; + bool force; +}; + +extern const struct btrfs_check_options btrfs_default_check_options; + extern struct btrfs_config bconf; +extern bool btrfs_check(const char *device, + const struct btrfs_check_options *options); + struct config_param { struct list_head list; const char *key; From patchwork Mon Dec 4 06:56:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 13477809 Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2a07:de40:b251:101:10:150:64:1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E284ED3 for ; Sun, 3 Dec 2023 22:56:55 -0800 (PST) Received: from imap2.dmz-prg2.suse.org (imap2.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:98]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 5377D21F76 for ; Mon, 4 Dec 2023 06:56:54 +0000 (UTC) Received: from imap2.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap2.dmz-prg2.suse.org (Postfix) with ESMTPS id 5325C13588 for ; Mon, 4 Dec 2023 06:56:53 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap2.dmz-prg2.suse.org with ESMTPSA id IB/tADV4bWXABwAAn2gu4w (envelope-from ) for ; Mon, 04 Dec 2023 06:56:53 +0000 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 3/3] btrfs-progs: tune: add fsck runs before and after a full conversion Date: Mon, 4 Dec 2023 17:26:29 +1030 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-btrfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spamd-Bar: +++++++++++++++++++++ X-Spam-Score: 21.50 X-Rspamd-Server: rspamd1 Authentication-Results: smtp-out1.suse.de; dkim=none; spf=fail (smtp-out1.suse.de: domain of wqu@suse.com does not designate 2a07:de40:b281:104:10:150:64:98 as permitted sender) smtp.mailfrom=wqu@suse.com; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=suse.com (policy=quarantine) X-Rspamd-Queue-Id: 5377D21F76 X-Spamd-Result: default: False [21.50 / 50.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_FAIL(1.00)[-all]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; DMARC_POLICY_QUARANTINE(1.50)[suse.com : No valid SPF, No valid DKIM,quarantine]; R_MISSING_CHARSET(2.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[linux-btrfs@vger.kernel.org]; BROKEN_CONTENT_TYPE(1.50)[]; RCPT_COUNT_ONE(0.00)[1]; SPAMHAUS_XBL(0.00)[2a07:de40:b281:104:10:150:64:98:from]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_NONE(0.00)[]; NEURAL_SPAM_SHORT(3.00)[0.999]; MX_GOOD(-0.01)[]; NEURAL_SPAM_LONG(3.50)[1.000]; MID_CONTAINS_FROM(1.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.com:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(2.20)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] We have two bug reports in the mailing list around block group tree conversion. Although currently there is still no strong evidence showing btrfstune is the culprit yet, I still want to be extra cautious. This patch would add an extra safenet for the end users, that a full readonly btrfs check would be executed on the filesystem before doing any full fs conversion (including bg/extent tree and csum conversion). This can catch any existing health problem of the filesystem. Furthermore, after the conversion is done, there would also be a "btrfs check" run, to make sure the conversion itself is fine, to make sure we have done everything to make sure the problem is not from our side. One thing to note is, the initial check would only happen on a source filesystem which is not under any existing conversion. As a fs already under conversion can easily trigger btrfs check false alerts. Signed-off-by: Qu Wenruo --- Makefile | 3 ++- tune/main.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 57 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 374f59b99150..47c6421781f3 100644 --- a/Makefile +++ b/Makefile @@ -267,7 +267,8 @@ mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o image_objects = image/main.o image/sanitize.o image/image-create.o image/common.o \ image/image-restore.o tune_objects = tune/main.o tune/seeding.o tune/change-uuid.o tune/change-metadata-uuid.o \ - tune/convert-bgt.o tune/change-csum.o common/clear-cache.o tune/quota.o + tune/convert-bgt.o tune/change-csum.o common/clear-cache.o tune/quota.o \ + check/main.o check/mode-common.o check/mode-lowmem.o mkfs/common.o all_objects = $(objects) $(cmds_objects) $(libbtrfs_objects) $(convert_objects) \ $(mkfs_objects) $(image_objects) $(tune_objects) $(libbtrfsutil_objects) diff --git a/tune/main.c b/tune/main.c index 9dcb55952b59..f05ab7c3b36e 100644 --- a/tune/main.c +++ b/tune/main.c @@ -29,6 +29,7 @@ #include "kernel-shared/transaction.h" #include "kernel-shared/volumes.h" #include "kernel-shared/free-space-tree.h" +#include "kernel-shared/zoned.h" #include "common/utils.h" #include "common/open-utils.h" #include "common/device-scan.h" @@ -367,6 +368,47 @@ int BOX_MAIN(btrfstune)(int argc, char *argv[]) if (change_metadata_uuid || random_fsid || new_fsid_str) ctree_flags |= OPEN_CTREE_USE_LATEST_BDEV; + /* + * For fs rewrites operations, we need to verify the initial + * filesystem to make sure they are healthy. + * Otherwise the transaction protection is not going to save us. + */ + if (to_bg_tree || to_extent_tree || csum_type != -1) { + struct btrfs_super_block sb = { 0 }; + u64 sb_flags; + + /* + * Here we just read out the superblock without any extra check, + * as later btrfs_check() would do more comprehensive check on + * it. + */ + ret = sbread(fd, &sb, 0); + if (ret < 0) { + errno = -ret; + error("failed to read super block from \"%s\"", device); + goto free_out; + } + sb_flags = btrfs_super_flags(&sb); + /* + * If we're not resuming the conversion, we should check the fs + * first. + */ + if (!(sb_flags & (BTRFS_SUPER_FLAG_CHANGING_BG_TREE | + BTRFS_SUPER_FLAG_CHANGING_DATA_CSUM | + BTRFS_SUPER_FLAG_CHANGING_META_CSUM))) { + bool check_ret; + struct btrfs_check_options options = + btrfs_default_check_options; + + check_ret = btrfs_check(device, &options); + if (check_ret) { + error( + "\"btrfs check\" found some errors, will not touch the filesystem"); + ret = -EUCLEAN; + goto free_out; + } + } + } root = open_ctree_fd(fd, device, 0, ctree_flags); if (!root) { @@ -535,6 +577,19 @@ out: } close_ctree(root); btrfs_close_all_devices(); + /* A sucessful run finished, let's verify the fs again. */ + if (!ret && (to_bg_tree || to_extent_tree || csum_type)) { + bool check_ret; + struct btrfs_check_options options = btrfs_default_check_options; + + check_ret = btrfs_check(device, &options); + if (check_ret) { + error( + "\"btrfs check\" found some errors after the conversion, please contact the developers"); + ret = -EUCLEAN; + } + + } free_out: return ret;