From patchwork Thu Jul 20 22:57:20 2023
X-Patchwork-Submitter: Boris Burkov
X-Patchwork-Id: 13321173
From: Boris Burkov
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v2 4/8] btrfs-progs: simple quotas fsck
Date: Thu, 20 Jul 2023 15:57:20 -0700
X-Mailer: git-send-email 2.41.0
X-Mailing-List: linux-btrfs@vger.kernel.org

Add simple quotas checks to btrfs check. Like the kernel feature, these
checks bypass most of the backref walking in the qgroups check.
Instead, they enforce the invariant behind the design of simple quotas
by scanning the extent tree and determining the owner of each extent:

  Data: reading the owner ref inline item
  Metadata: reading the tree block and reading its btrfs_header's owner

This gives us the expected count from squotas which we check against the
on-disk state of the qgroup items.

Signed-off-by: Boris Burkov
---
 check/main.c          |  2 ++
 check/qgroup-verify.c | 79 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/check/main.c b/check/main.c
index 77bb50a0e..07f31fbe0 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5667,6 +5667,8 @@ static int process_extent_item(struct btrfs_root *root,
 					btrfs_shared_data_ref_count(eb, sref),
 					gen, 0, num_bytes);
 			break;
+		case BTRFS_EXTENT_OWNER_REF_KEY:
+			break;
 		default:
 			fprintf(stderr,
 				"corrupt extent record: key [%llu,%u,%llu]\n",
diff --git a/check/qgroup-verify.c b/check/qgroup-verify.c
index 1a62009b8..c95e6f806 100644
--- a/check/qgroup-verify.c
+++ b/check/qgroup-verify.c
@@ -85,6 +85,8 @@ static struct counts_tree {
 	unsigned int num_groups;
 	unsigned int rescan_running:1;
 	unsigned int qgroup_inconsist:1;
+	unsigned int simple:1;
+	u64 enable_gen;
 	u64 scan_progress;
 } counts = { .root = RB_ROOT };
 
@@ -915,14 +917,18 @@ static int add_qgroup_relation(u64 memberid, u64 parentid)
 	return 0;
 }
 
-static void read_qgroup_status(struct extent_buffer *eb, int slot,
-			       struct counts_tree *counts)
+static void read_qgroup_status(struct btrfs_fs_info *info,
+			       struct extent_buffer *eb,
+			       int slot, struct counts_tree *counts)
 {
 	struct btrfs_qgroup_status_item *status_item;
 	u64 flags;
 
 	status_item = btrfs_item_ptr(eb, slot, struct btrfs_qgroup_status_item);
 	flags = btrfs_qgroup_status_flags(eb, status_item);
+
+	if (counts->simple == 1)
+		counts->enable_gen = btrfs_qgroup_status_enable_gen(eb, status_item);
 	/*
 	 * Since qgroup_inconsist/rescan_running is just one bit,
 	 * assign value directly won't work.
@@ -948,6 +954,8 @@ static int load_quota_info(struct btrfs_fs_info *info)
 	int i, nr;
 	int search_relations = 0;
 
+	if (btrfs_fs_incompat(info, SIMPLE_QUOTA))
+		counts.simple = 1;
 loop:
 	/*
 	 * Do 2 passes, the first allocates group counts and reads status
@@ -990,7 +998,7 @@ loop:
 			}
 
 			if (key.type == BTRFS_QGROUP_STATUS_KEY) {
-				read_qgroup_status(leaf, i, &counts);
+				read_qgroup_status(info, leaf, i, &counts);
 				continue;
 			}
 
@@ -1038,6 +1046,51 @@ out:
 	return ret;
 }
 
+static int simple_quota_account_extent(struct btrfs_fs_info *info,
+				       struct extent_buffer *leaf,
+				       struct btrfs_key *key,
+				       struct btrfs_extent_item *ei,
+				       struct btrfs_extent_inline_ref *iref,
+				       u64 bytenr, u64 num_bytes, int meta_item)
+{
+	u64 generation;
+	int type;
+	u64 root;
+	struct ulist *roots = ulist_alloc(0);
+	int ret;
+	struct extent_buffer *node_eb;
+	u64 extent_root;
+
+	generation = btrfs_extent_generation(leaf, ei);
+	if (generation < counts.enable_gen)
+		return 0;
+
+	type = btrfs_extent_inline_ref_type(leaf, iref);
+	if (!meta_item) {
+		if (type == BTRFS_EXTENT_OWNER_REF_KEY) {
+			struct btrfs_extent_owner_ref *oref = (struct btrfs_extent_owner_ref *)(&iref->offset);
+			root = btrfs_extent_owner_ref_root_id(leaf, oref);
+		} else {
+			return 0;
+		}
+	} else {
+		extent_root = btrfs_root_id(btrfs_extent_root(info, key->objectid));
+		node_eb = read_tree_block(info, key->objectid, extent_root, 0, 0, NULL);
+		if (!extent_buffer_uptodate(node_eb))
+			return -EIO;
+		root = btrfs_header_owner(node_eb);
+		free_extent_buffer(node_eb);
+	}
+
+	if (!is_fstree(root))
+		return 0;
+
+	ulist_add(roots, root, 0, 0);
+	ret = account_one_extent(roots, bytenr, num_bytes);
+	ulist_free(roots);
+	return ret;
+}
+
 static int add_inline_refs(struct btrfs_fs_info *info,
 			   struct extent_buffer *ei_leaf, int slot,
 			   u64 bytenr, u64 num_bytes, int meta_item)
@@ -1045,6 +1098,7 @@ static int add_inline_refs(struct btrfs_fs_info *info,
 	struct btrfs_extent_item *ei;
 	struct btrfs_extent_inline_ref *iref;
 	struct btrfs_extent_data_ref *dref;
+	struct btrfs_key key;
 	u64 flags, root_obj, offset, parent;
 	u32 item_size = btrfs_item_size(ei_leaf, slot);
 	int type;
@@ -1052,6 +1106,7 @@ static int add_inline_refs(struct btrfs_fs_info *info,
 	unsigned long ptr;
 
 	ei = btrfs_item_ptr(ei_leaf, slot, struct btrfs_extent_item);
+	btrfs_item_key_to_cpu(ei_leaf, &key, slot);
 	flags = btrfs_extent_flags(ei_leaf, ei);
 
 	if (flags & BTRFS_EXTENT_FLAG_TREE_BLOCK && !meta_item) {
@@ -1062,6 +1117,15 @@ static int add_inline_refs(struct btrfs_fs_info *info,
 		iref = (struct btrfs_extent_inline_ref *)(ei + 1);
 	}
 
+	if (counts.simple) {
+		int ret = simple_quota_account_extent(info, ei_leaf, &key, ei, iref,
+						      bytenr, num_bytes, meta_item);
+
+		if (ret)
+			error("simple quota account extent error: %d", ret);
+		return ret;
+	}
+
 	ptr = (unsigned long)iref;
 	end = (unsigned long)ei + item_size;
 	while (ptr < end) {
@@ -1083,6 +1147,7 @@ static int add_inline_refs(struct btrfs_fs_info *info,
 			parent = offset;
 			break;
 		default:
+			error("unexpected iref type %d", type);
 			return 1;
 		}
 
@@ -1445,6 +1510,13 @@ int qgroup_verify_all(struct btrfs_fs_info *info)
 			goto out;
 		}
 	}
+	/*
+	 * As in the kernel, simple qgroup accounting is done locally per extent,
+	 * so we don't need to resolve backrefs to find which subvol an extent
+	 * is accounted to.
+	 */
+	if (counts.simple)
+		goto check;
 
 	ret = map_implied_refs(info);
 	if (ret) {
@@ -1454,6 +1526,7 @@ int qgroup_verify_all(struct btrfs_fs_info *info)
 
 	ret = account_all_refs(1, 0);
 
+check:
 	/*
 	 * Do the correctness check here, so for callers who don't want
 	 * verbose report can skip calling report_qgroups()
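
Illustration only, not part of the patch: the check described in the commit message (find the owner of each extent, sum the bytes per owner, compare with the qgroup items) can be sketched as a small standalone C model. Everything prefixed with sketch_ is hypothetical and is not a btrfs-progs type or function. The fs-tree test is a simplified assumption (root 5, or any id >= 256), and a single byte counter stands in for the real qgroup counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one extent item. */
struct sketch_extent {
	uint64_t num_bytes;
	uint64_t generation;
	uint64_t owner_root;	/* data: owner ref item; metadata: header owner */
	bool has_owner;		/* false models a data extent without an owner ref */
};

/* Hypothetical stand-in for one qgroup item plus the recomputed count. */
struct sketch_qgroup {
	uint64_t root_id;
	uint64_t on_disk_bytes;		/* what the qgroup item claims */
	uint64_t expected_bytes;	/* what the extent scan computes */
};

/* Assumption: simplified fs-tree test (FS tree id 5, subvolumes from 256). */
static bool sketch_is_fstree(uint64_t root)
{
	return root == 5 || root >= 256;
}

static struct sketch_qgroup *sketch_find(struct sketch_qgroup *qgroups,
					 size_t n, uint64_t root)
{
	for (size_t i = 0; i < n; i++)
		if (qgroups[i].root_id == root)
			return &qgroups[i];
	return NULL;
}

/* Charge each extent to its single owner; no backref walking is needed. */
static void sketch_account(const struct sketch_extent *extents, size_t n_extents,
			   struct sketch_qgroup *qgroups, size_t n_qgroups,
			   uint64_t enable_gen)
{
	for (size_t i = 0; i < n_extents; i++) {
		const struct sketch_extent *e = &extents[i];
		struct sketch_qgroup *qg;

		if (e->generation < enable_gen)	/* predates quota enablement */
			continue;
		if (!e->has_owner || !sketch_is_fstree(e->owner_root))
			continue;
		qg = sketch_find(qgroups, n_qgroups, e->owner_root);
		if (qg)
			qg->expected_bytes += e->num_bytes;
	}
}

/* Count qgroups whose on-disk value disagrees with the recomputed one. */
static size_t sketch_count_mismatches(const struct sketch_qgroup *qgroups, size_t n)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++)
		if (qgroups[i].on_disk_bytes != qgroups[i].expected_bytes)
			bad++;
	return bad;
}

/* Made-up sample data: qgroup 256 is consistent, qgroup 257 is stale. */
size_t sketch_demo(void)
{
	const struct sketch_extent extents[] = {
		{ 4096, 12, 256, true },	/* counted for root 256 */
		{ 8192, 9, 256, true },		/* skipped: generation < enable_gen */
		{ 16384, 15, 257, true },	/* counted for root 257 */
		{ 16384, 20, 2, true },		/* skipped: owner is not an fs tree */
		{ 4096, 11, 0, false },		/* skipped: no owner ref */
	};
	struct sketch_qgroup qgroups[] = {
		{ 256, 4096, 0 },
		{ 257, 0, 0 },
	};
	size_t n_extents = sizeof(extents) / sizeof(extents[0]);
	size_t n_qgroups = sizeof(qgroups) / sizeof(qgroups[0]);

	assert(n_qgroups > 0);
	sketch_account(extents, n_extents, qgroups, n_qgroups, 10);
	return sketch_count_mismatches(qgroups, n_qgroups);
}
```

Calling sketch_demo() from a main() runs the scan over the sample data and returns the number of inconsistent qgroups. Because each extent is charged to exactly one root, the model needs only a single linear pass, which is the property that lets the patch skip map_implied_refs() and account_all_refs() when simple quotas are enabled.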