From patchwork Thu Mar 11 14:31:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D757BC433DB for ; Thu, 11 Mar 2021 14:32:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8174964FD6 for ; Thu, 11 Mar 2021 14:32:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233186AbhCKObz (ORCPT ); Thu, 11 Mar 2021 09:31:55 -0500 Received: from mail.kernel.org ([198.145.29.99]:56976 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233913AbhCKObY (ORCPT ); Thu, 11 Mar 2021 09:31:24 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1A98064FBA; Thu, 11 Mar 2021 14:31:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473083; bh=oOxzupoUsR6GeC5maNGG71BWQQOs+ZeWCTpE/a/qcFs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dyjFjQOfE8VMNNYqMFiVJi1vSMfGzc1y/iGuOPbJoEwEPvzLJIwRhXzf32fcbpXKM ObKDptVGygrmDOAX8N5NlbFtne1O0C/98NRacnAwqNriRSk3GI0vDhix4GBCNBw86N X/O28Rah9aF23aYgQZrG4KWatgk/NGVFpQdsvdJRtoE9it5hgpSVQvfUazBZ2u14p2 jl41cGkcXsUnvA9/XcDkZwyrFTJXh6wJeIve/KrBF++tMKvjquJoeDutoR7bbAlJZp LAEFhNnRuU4YxRTsmm6tYck1SBbyvEza2m2C4MXcMZqAIXtMh8r+48yuRPVIWvlU/B dirjX6qHkR5XQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 1/9] btrfs: fix race when cloning extent buffer during rewind of an old root Date: Thu, 11 Mar 2021 14:31:05 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana While resolving backreferences, as part of a logical ino ioctl call or fiemap, we can end up hitting a BUG_ON() when replaying tree mod log operations of a root, triggering a stack trace like the following: ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:1210! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 19054 Comm: crawl_335 Tainted: G W 5.11.0-2d11c0084b02-misc-next+ #89 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:__tree_mod_log_rewind+0x3b1/0x3c0 Code: 05 48 8d 74 10 (...) RSP: 0018:ffffc90001eb70b8 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff88812344e400 RCX: ffffffffb28933b6 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88812344e42c RBP: ffffc90001eb7108 R08: 1ffff11020b60a20 R09: ffffed1020b60a20 R10: ffff888105b050f9 R11: ffffed1020b60a1f R12: 00000000000000ee R13: ffff8880195520c0 R14: ffff8881bc958500 R15: ffff88812344e42c FS: 00007fd1955e8700(0000) GS:ffff8881f5600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007efdb7928718 CR3: 000000010103a006 CR4: 0000000000170ee0 Call Trace: btrfs_search_old_slot+0x265/0x10d0 ? lock_acquired+0xbb/0x600 ? btrfs_search_slot+0x1090/0x1090 ? free_extent_buffer.part.61+0xd7/0x140 ? free_extent_buffer+0x13/0x20 resolve_indirect_refs+0x3e9/0xfc0 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? add_prelim_ref.part.11+0x150/0x150 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x600 ? __kasan_check_write+0x14/0x20 ? do_raw_spin_unlock+0xa8/0x140 ? rb_insert_color+0x30/0x360 ? prelim_ref_insert+0x12d/0x430 find_parent_nodes+0x5c3/0x1830 ? resolve_indirect_refs+0xfc0/0xfc0 ? lock_release+0xc8/0x620 ? fs_reclaim_acquire+0x67/0xf0 ? lock_acquire+0xc7/0x510 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x160/0x210 ? lock_release+0xc8/0x620 ? fs_reclaim_acquire+0x67/0xf0 ? lock_acquire+0xc7/0x510 ? poison_range+0x38/0x40 ? unpoison_range+0x14/0x40 ? trace_hardirqs_on+0x55/0x120 btrfs_find_all_roots_safe+0x142/0x1e0 ? find_parent_nodes+0x1830/0x1830 ? btrfs_inode_flags_to_xflags+0x50/0x50 iterate_extent_inodes+0x20e/0x580 ? tree_backref_for_extent+0x230/0x230 ? lock_downgrade+0x3d0/0x3d0 ? read_extent_buffer+0xdd/0x110 ? lock_downgrade+0x3d0/0x3d0 ? __kasan_check_read+0x11/0x20 ? lock_acquired+0xbb/0x600 ? __kasan_check_write+0x14/0x20 ? _raw_spin_unlock+0x22/0x30 ? __kasan_check_write+0x14/0x20 iterate_inodes_from_logical+0x129/0x170 ? iterate_inodes_from_logical+0x129/0x170 ? btrfs_inode_flags_to_xflags+0x50/0x50 ? iterate_extent_inodes+0x580/0x580 ? __vmalloc_node+0x92/0xb0 ? init_data_container+0x34/0xb0 ? init_data_container+0x34/0xb0 ? kvmalloc_node+0x60/0x80 btrfs_ioctl_logical_to_ino+0x158/0x230 btrfs_ioctl+0x205e/0x4040 ? __might_sleep+0x71/0xe0 ? btrfs_ioctl_get_supported_features+0x30/0x30 ? getrusage+0x4b6/0x9c0 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x620 ? __might_fault+0x64/0xd0 ? lock_acquire+0xc7/0x510 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? __kasan_check_read+0x11/0x20 ? do_vfs_ioctl+0xfc/0x9d0 ? ioctl_file_clone+0xe0/0xe0 ? lock_downgrade+0x3d0/0x3d0 ? lockdep_hardirqs_on_prepare+0x210/0x210 ? __kasan_check_read+0x11/0x20 ? lock_release+0xc8/0x620 ? __task_pid_nr_ns+0xd3/0x250 ? lock_acquire+0xc7/0x510 ? __fget_files+0x160/0x230 ? __fget_light+0xf2/0x110 __x64_sys_ioctl+0xc3/0x100 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fd1976e2427 Code: 00 00 90 48 8b 05 (...) RSP: 002b:00007fd1955e5cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fd1955e5f40 RCX: 00007fd1976e2427 RDX: 00007fd1955e5f48 RSI: 00000000c038943b RDI: 0000000000000004 RBP: 0000000001000000 R08: 0000000000000000 R09: 00007fd1955e6120 R10: 0000557835366b00 R11: 0000000000000246 R12: 0000000000000004 R13: 00007fd1955e5f48 R14: 00007fd1955e5f40 R15: 00007fd1955e5ef8 Modules linked in: ---[ end trace ec8931a1c36e57be ]--- (gdb) l *(__tree_mod_log_rewind+0x3b1) 0xffffffff81893521 is in __tree_mod_log_rewind (fs/btrfs/ctree.c:1210). 1205 * the modification. as we're going backwards, we do the 1206 * opposite of each operation here. 1207 */ 1208 switch (tm->op) { 1209 case MOD_LOG_KEY_REMOVE_WHILE_FREEING: 1210 BUG_ON(tm->slot < n); 1211 fallthrough; 1212 case MOD_LOG_KEY_REMOVE_WHILE_MOVING: 1213 case MOD_LOG_KEY_REMOVE: 1214 btrfs_set_node_key(eb, &tm->key, tm->slot); Here's what happens to hit that BUG_ON(): 1) We have one tree mod log user (through fiemap or the logical ino ioctl), with a sequence number of 1, so we have fs_info->tree_mod_seq == 1; 2) Another task is at ctree.c:balance_level() and we have eb X currently as the root of the tree, and we promote its single child, eb Y, as the new root. Then, at ctree.c:balance_level(), we call: tree_mod_log_insert_root(eb X, eb Y, 1); 3) At tree_mod_log_insert_root() we create tree mod log elements for each slot of eb X, of operation type MOD_LOG_KEY_REMOVE_WHILE_FREEING each with a ->logical pointing to ebX->start. These are placed in an array named tm_list. Lets assume there are N elements (N pointers in eb X); 4) Then, still at tree_mod_log_insert_root(), we create a tree mod log element of operation type MOD_LOG_ROOT_REPLACE, ->logical set to ebY->start, ->old_root.logical set to ebX->start, ->old_root.level set to the level of eb X and ->generation set to the generation of eb X; 5) Then tree_mod_log_insert_root() calls tree_mod_log_free_eb() with tm_list as argument. After that, tree_mod_log_free_eb() calls __tree_mod_log_insert() for each member of tm_list in reverse order, from highest slot in eb X, slot N - 1, to slot 0 of eb X; 6) __tree_mod_log_insert() sets the sequence number of each given tree mod log operation - it increments fs_info->tree_mod_seq and sets fs_info->tree_mod_seq as the sequence number of the given tree mod log operation. This means that for the tm_list created at tree_mod_log_insert_root(), the element corresponding to slot 0 of eb X has the highest sequence number (1 + N), and the element corresponding to the last slot has the lowest sequence number (2); 7) Then, after inserting tm_list's elements into the tree mod log rbtree, the MOD_LOG_ROOT_REPLACE element is inserted, which gets the highest sequence number, which is N + 2; 8) Back to ctree.c:balance_level(), we free eb X by calling btrfs_free_tree_block() on it. Because eb X was created in the current transaction, has no other references and writeback did not happen for it, we add it back to the free space cache/tree; 9) Later some other task T allocates the metadata extent from eb X, since it is marked as free space in the space cache/tree, and uses it as a node for some other btree; 10) The tree mod log user task calls btrfs_search_old_slot(), which calls get_old_root(), and finally that calls __tree_mod_log_oldest_root() with time_seq == 1 and eb_root == eb Y; 11) First iteration of the while loop finds the tree mod log element with sequence number N + 2, for the logical address of eb Y and of type MOD_LOG_ROOT_REPLACE; 12) Because the operation type is MOD_LOG_ROOT_REPLACE, we don't break out of the loop, and set root_logical to point to tm->old_root.logical which corresponds to the logical address of eb X; 13) On the next iteration of the while loop, the call to tree_mod_log_search_oldest() returns the smallest tree mod log element for the logical address of eb X, which has a sequence number of 2, an operation type of MOD_LOG_KEY_REMOVE_WHILE_FREEING and corresponds to the old slot N - 1 of eb X (eb X had N items in it before being freed); 14) We then break out of the while loop and return the tree mod log operation of type MOD_LOG_ROOT_REPLACE (eb Y), and not the one for slot N - 1 of eb X, to get_old_root(); 15) At get_old_root(), we process the MOD_LOG_ROOT_REPLACE operation and set "logical" to the logical address of eb X, which was the old root. We then call tree_mod_log_search() passing it the logical address of eb X and time_seq == 1; 16) Then before calling tree_mod_log_search(), task T adds a key to eb X, which results in adding a tree mod log operation of type MOD_LOG_KEY_ADD to the tree mod log - this is done at ctree.c:insert_ptr() - but after adding the tree mod log operation and before updating the number of items in eb X from 0 to 1... 17) The task at get_old_root() calls tree_mod_log_search() and gets the tree mod log operation of type MOD_LOG_KEY_ADD just added by task T. Then it enters the following if branch: if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { (...) } (...) Calls read_tree_block() for eb X, which gets a reference on eb X but does not lock it - task T has it locked. Then it clones eb X while it has nritems set to 0 in its header, before task T sets nritems to 1 in eb X's header. From hereupon we use the clone of eb X which no other task has access to; 18) Then we call __tree_mod_log_rewind(), passing it the MOD_LOG_KEY_ADD mod log operation we just got from tree_mod_log_search() in the previous step and the cloned version of eb X; 19) At __tree_mod_log_rewind(), we set the local variable "n" to the number of items set in eb X's clone, which is 0. Then we enter the while loop, and in its first iteration we process the MOD_LOG_KEY_ADD operation, which just decrements "n" from 0 to (u32)-1, since "n" is declared with a type of u32. At the end of this iteration we call rb_next() to find the next tree mod log operation for eb X, that gives us the mod log operation of type MOD_LOG_KEY_REMOVE_WHILE_FREEING, for slot 0, with a sequence number of N + 1 (steps 3 to 6); 20) Then we go back to the top of the while loop and trigger the following BUG_ON(): (...) switch (tm->op) { case MOD_LOG_KEY_REMOVE_WHILE_FREEING: BUG_ON(tm->slot < n); fallthrough; (...) Because "n" has a value of (u32)-1 (4294967295) and tm->slot is 0. Fix this by taking a read lock on the extent buffer before cloning it at ctree.c:get_old_root(). This should be done regardless of the extent buffer having been freed and reused, as a concurrent task might be modifying it (while holding a write lock on it). Fixes: 834328a8493079 ("Btrfs: tree mod log's old roots could still be part of the tree") Reported-by: Zygo Blaxell Link: https://lore.kernel.org/linux-btrfs/20210227155037.GN28049@hungrycats.org/ Signed-off-by: Filipe Manana --- fs/btrfs/ctree.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index f33e69f02230..19d7b81cee55 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1365,7 +1365,9 @@ get_old_root(struct btrfs_root *root, u64 time_seq) "failed to read tree block %llu from get_old_root", logical); } else { + btrfs_tree_read_lock(old); eb = btrfs_clone_extent_buffer(old); + btrfs_tree_read_unlock(old); free_extent_buffer(old); } } else if (old_root) { From patchwork Thu Mar 11 14:31:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CFC8C433E6 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CAFE464FE2 for ; Thu, 11 Mar 2021 14:32:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233910AbhCKOb5 (ORCPT ); Thu, 11 Mar 2021 09:31:57 -0500 Received: from mail.kernel.org ([198.145.29.99]:56994 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233671AbhCKObZ (ORCPT ); Thu, 11 Mar 2021 09:31:25 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 50D7764FE8; Thu, 11 Mar 2021 14:31:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473085; bh=y1RZTQszhrd9706ZBPIDfFMuO+2g/xMo9Agaer091Vs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Yn/ZoC3SkQUsFlyoDKWK3Kmpy6ZQIpfHoaGIfp5L2nty3yks++ON9tSMFhytGdLxv 40/+c3Paaod4lrRg/TdKKs+HV5iiW6Oj1jyb2DSmegQt6BiNZD5rFlyXSXkLdA+LLO hYhP7BiD4oFhaMfFFKMw6d8Thhn0hMoaz/LSdpQEXsi/uj+8zlf4VgkTIlVGZlyb7Y 5fZSxuZ9PDtcmr5bmtIjKoz/ObFwMdcodj6LG83n6HkfOj4jGpZk4LsKcOvMvn+V+a jJDFevBAXr/dqXOB0KEEH8K3GCxJvbxc41VeEephJ+ljsO/rjxnLucdYiFLBuZtL83 Z4dhr62Rg+mGg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 2/9] btrfs: always pin deleted leaves when there are active tree mod log users Date: Thu, 11 Mar 2021 14:31:06 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana When freeing a tree block we may end up adding its extent back to the free space cache/tree, as long as there are no more references for it, it was created in the current transaction and writeback for it never happened. This is generally fine, however when we have tree mod log operations it can result in inconsistent versions of a btree after unwinding extent buffers with the recorded tree mod log operations. This is because: * We only log operations for nodes (adding and removing key/pointers), for leaves we don't do anything; * This means that we can log a MOD_LOG_KEY_REMOVE_WHILE_FREEING operation for a node that points to a leaf that was deleted; * Before we apply the logged operation to unwind a node, we can have that leaf's extent allocated again, either as a node or as a leaf, and possibly for another btree. This is possible if the leaf was created in the current transaction and writeback for it never started, in which case btrfs_free_tree_block() returns its extent back to the free space cache/tree; * Then, before applying the tree mod log operation, some task allocates the metadata extent just freed before, and uses it either as a leaf or as a node for some btree (can be the same or another one, it does not matter); * After applying the MOD_LOG_KEY_REMOVE_WHILE_FREEING operation we now get the target node with an item pointing to the metadata extent that now has content different from what it had before the leaf was deleted. It might now belong to a different btree and be a node and not a leaf anymore. As a consequence, the results of searches after the unwinding can be unpredictable and produce unexpected results. So make sure we pin extent buffers corresponding to leaves when there are tree mod log users. Signed-off-by: Filipe Manana --- fs/btrfs/extent-tree.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 5e228d6ad63f..2482b26b1971 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3310,6 +3310,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, if (last_ref && btrfs_header_generation(buf) == trans->transid) { struct btrfs_block_group *cache; + bool must_pin = false; if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) { ret = check_ref_cleanup(trans, buf->start); @@ -3327,7 +3328,27 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, goto out; } - if (btrfs_is_zoned(fs_info)) { + /* + * If this is a leaf and there are tree mod log users, we may + * have recorded mod log operations that point to this leaf. + * So we must make sure no one reuses this leaf's extent before + * mod log operations are applied to a node, otherwise after + * rewinding a node using the mod log operations we get an + * inconsistent btree, as the leaf's extent may now be used as + * a node or leaf for another different btree. + * We are safe from races here because at this point no other + * node or root points to this extent buffer, so if after this + * check a new tree mod log user joins, it will not be able to + * find a node pointing to this leaf and record operations that + * point to this leaf. + */ + if (btrfs_header_level(buf) == 0) { + read_lock(&fs_info->tree_mod_log_lock); + must_pin = !list_empty(&fs_info->tree_mod_seq_list); + read_unlock(&fs_info->tree_mod_log_lock); + } + + if (must_pin || btrfs_is_zoned(fs_info)) { btrfs_redirty_list_add(trans->transaction, buf); pin_down_extent(trans, cache, buf->start, buf->len, 1); btrfs_put_block_group(cache); From patchwork Thu Mar 11 14:31:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5331DC433E9 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F013264FEB for ; Thu, 11 Mar 2021 14:32:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233671AbhCKOb6 (ORCPT ); Thu, 11 Mar 2021 09:31:58 -0500 Received: from mail.kernel.org ([198.145.29.99]:57000 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233232AbhCKOb1 (ORCPT ); Thu, 11 Mar 2021 09:31:27 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 89D6A64FEB; Thu, 11 Mar 2021 14:31:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473086; bh=fs7XbDjKmlMeVcP88GEblEANnQLT7XItSi6PZHAw7Xk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Xxv1r2ZGZ1MaDgiMQG8/Y69j9G2wRATtoo2ezHZLBaCy/onIwjycGNp2dyelBqdtR 9calCaaVOqCE3XiknwZ+PICsljD3+kThnXb9fwc7YUtzW2UXjjPDZ5pI4W8lRe9tLn Qz/FlGsLRO/DgLQHNUNNwuSg5bDYcVioIp4zvp4TgA3V5vSVJQWHZ4RCe1sMPNDYzu IsC0ZIy/B5AwdoOMFf5Ijd1Co+wIvJjikXHAxyG+u+fyUpVvmcDAQyVO/60yhMxdYB nO62lTShPo3SS0XCg0T6WG16ENrEuvSKGEP4Y4ICtAIfAKePh9cYj7rE7Bj40qYj8Y iU3m2DE98YOiA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 3/9] btrfs: move the tree mod log code into its own file Date: Thu, 11 Mar 2021 14:31:07 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The tree modification log, which records modifications done to btrees, is quite large and currently spread all over ctree.c, which is a huge file already. To make things better organized, move all that code into its own separate source and header files. Functions and definitions that are used outside of the module (mostly by ctree.c) are renamed so that they start with a "btrfs_" prefix. Everything else remains unchanged. This makes it easier to go over the tree modification log code everytime I need to go read it to fix a bug. Signed-off-by: Filipe Manana Reported-by: kernel test robot Reviewed-by: Anand Jain --- fs/btrfs/Makefile | 2 +- fs/btrfs/backref.c | 33 +- fs/btrfs/ctree.c | 956 ++-------------------------------------- fs/btrfs/ctree.h | 17 - fs/btrfs/delayed-ref.c | 9 +- fs/btrfs/qgroup.c | 9 +- fs/btrfs/tree-mod-log.c | 889 +++++++++++++++++++++++++++++++++++++ fs/btrfs/tree-mod-log.h | 52 +++ 8 files changed, 1006 insertions(+), 961 deletions(-) create mode 100644 fs/btrfs/tree-mod-log.c create mode 100644 fs/btrfs/tree-mod-log.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index b634c42115ea..fdba34b10a98 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -28,7 +28,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \ - subpage.o + subpage.o tree-mod-log.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c index f47c1528eb9a..117d423fdb93 100644 --- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -14,6 +14,7 @@ #include "delayed-ref.h" #include "locking.h" #include "misc.h" +#include "tree-mod-log.h" /* Just an arbitrary number so we can be sure this happened */ #define BACKREF_FOUND_SHARED 6 @@ -452,7 +453,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path, if (path->slots[0] >= btrfs_header_nritems(eb) || is_shared_data_backref(preftrees, eb->start) || ref->root_id != btrfs_header_owner(eb)) { - if (time_seq == SEQ_LAST) + if (time_seq == BTRFS_SEQ_LAST) ret = btrfs_next_leaf(root, path); else ret = btrfs_next_old_leaf(root, path, time_seq); @@ -476,7 +477,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path, if (slot == 0 && (is_shared_data_backref(preftrees, eb->start) || ref->root_id != btrfs_header_owner(eb))) { - if (time_seq == SEQ_LAST) + if (time_seq == BTRFS_SEQ_LAST) ret = btrfs_next_leaf(root, path); else ret = btrfs_next_old_leaf(root, path, time_seq); @@ -514,7 +515,7 @@ static int add_all_parents(struct btrfs_root *root, struct btrfs_path *path, eie = NULL; } next: - if (time_seq == SEQ_LAST) + if (time_seq == BTRFS_SEQ_LAST) ret = btrfs_next_item(root, path); else ret = btrfs_next_old_item(root, path, time_seq); @@ -574,7 +575,7 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info, if (path->search_commit_root) root_level = btrfs_header_level(root->commit_root); - else if (time_seq == SEQ_LAST) + else if (time_seq == BTRFS_SEQ_LAST) root_level = btrfs_header_level(root->node); else root_level = btrfs_old_root_level(root, time_seq); @@ -605,7 +606,7 @@ static int resolve_indirect_ref(struct btrfs_fs_info *fs_info, search_key.offset >= LLONG_MAX) search_key.offset = 0; path->lowest_level = level; - if (time_seq == SEQ_LAST) + if (time_seq == BTRFS_SEQ_LAST) ret = btrfs_search_slot(NULL, root, &search_key, path, 0, 0); else ret = btrfs_search_old_slot(root, &search_key, path, time_seq); @@ -1147,8 +1148,8 @@ static int add_keyed_refs(struct btrfs_fs_info *fs_info, * indirect refs to their parent bytenr. * When roots are found, they're added to the roots list * - * If time_seq is set to SEQ_LAST, it will not search delayed_refs, and behave - * much like trans == NULL case, the difference only lies in it will not + * If time_seq is set to BTRFS_SEQ_LAST, it will not search delayed_refs, and + * behave much like trans == NULL case, the difference only lies in it will not * commit root. * The special case is for qgroup to search roots in commit_transaction(). * @@ -1199,7 +1200,7 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans, path->skip_locking = 1; } - if (time_seq == SEQ_LAST) + if (time_seq == BTRFS_SEQ_LAST) path->skip_locking = 1; /* @@ -1217,9 +1218,9 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans, #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS if (trans && likely(trans->type != __TRANS_DUMMY) && - time_seq != SEQ_LAST) { + time_seq != BTRFS_SEQ_LAST) { #else - if (trans && time_seq != SEQ_LAST) { + if (trans && time_seq != BTRFS_SEQ_LAST) { #endif /* * look if there are updates for this ref queued and lock the @@ -1527,7 +1528,7 @@ int btrfs_check_shared(struct btrfs_root *root, u64 inum, u64 bytenr, struct btrfs_trans_handle *trans; struct ulist_iterator uiter; struct ulist_node *node; - struct seq_list elem = SEQ_LIST_INIT(elem); + struct btrfs_seq_list elem = BTRFS_SEQ_LIST_INIT(elem); int ret = 0; struct share_check shared = { .root_objectid = root->root_key.objectid, @@ -1953,7 +1954,7 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info, struct ulist *roots = NULL; struct ulist_node *ref_node = NULL; struct ulist_node *root_node = NULL; - struct seq_list tree_mod_seq_elem = SEQ_LIST_INIT(tree_mod_seq_elem); + struct btrfs_seq_list seq_elem = BTRFS_SEQ_LIST_INIT(seq_elem); struct ulist_iterator ref_uiter; struct ulist_iterator root_uiter; @@ -1971,12 +1972,12 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info, } if (trans) - btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem); + btrfs_get_tree_mod_seq(fs_info, &seq_elem); else down_read(&fs_info->commit_root_sem); ret = btrfs_find_all_leafs(trans, fs_info, extent_item_objectid, - tree_mod_seq_elem.seq, &refs, + seq_elem.seq, &refs, &extent_item_pos, ignore_offset); if (ret) goto out; @@ -1984,7 +1985,7 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info, ULIST_ITER_INIT(&ref_uiter); while (!ret && (ref_node = ulist_next(refs, &ref_uiter))) { ret = btrfs_find_all_roots_safe(trans, fs_info, ref_node->val, - tree_mod_seq_elem.seq, &roots, + seq_elem.seq, &roots, ignore_offset); if (ret) break; @@ -2007,7 +2008,7 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info, free_leaf_list(refs); out: if (trans) { - btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem); + btrfs_put_tree_mod_seq(fs_info, &seq_elem); btrfs_end_transaction(trans); } else { up_read(&fs_info->commit_root_sem); diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 19d7b81cee55..2c6e525b3458 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -14,6 +14,7 @@ #include "locking.h" #include "volumes.h" #include "qgroup.h" +#include "tree-mod-log.h" static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_path *path, int level); @@ -233,597 +234,6 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans, return 0; } -enum mod_log_op { - MOD_LOG_KEY_REPLACE, - MOD_LOG_KEY_ADD, - MOD_LOG_KEY_REMOVE, - MOD_LOG_KEY_REMOVE_WHILE_FREEING, - MOD_LOG_KEY_REMOVE_WHILE_MOVING, - MOD_LOG_MOVE_KEYS, - MOD_LOG_ROOT_REPLACE, -}; - -struct tree_mod_root { - u64 logical; - u8 level; -}; - -struct tree_mod_elem { - struct rb_node node; - u64 logical; - u64 seq; - enum mod_log_op op; - - /* this is used for MOD_LOG_KEY_* and MOD_LOG_MOVE_KEYS operations */ - int slot; - - /* this is used for MOD_LOG_KEY* and MOD_LOG_ROOT_REPLACE */ - u64 generation; - - /* those are used for op == MOD_LOG_KEY_{REPLACE,REMOVE} */ - struct btrfs_disk_key key; - u64 blockptr; - - /* this is used for op == MOD_LOG_MOVE_KEYS */ - struct { - int dst_slot; - int nr_items; - } move; - - /* this is used for op == MOD_LOG_ROOT_REPLACE */ - struct tree_mod_root old_root; -}; - -/* - * Pull a new tree mod seq number for our operation. - */ -static inline u64 btrfs_inc_tree_mod_seq(struct btrfs_fs_info *fs_info) -{ - return atomic64_inc_return(&fs_info->tree_mod_seq); -} - -/* - * This adds a new blocker to the tree mod log's blocker list if the @elem - * passed does not already have a sequence number set. So when a caller expects - * to record tree modifications, it should ensure to set elem->seq to zero - * before calling btrfs_get_tree_mod_seq. - * Returns a fresh, unused tree log modification sequence number, even if no new - * blocker was added. - */ -u64 btrfs_get_tree_mod_seq(struct btrfs_fs_info *fs_info, - struct seq_list *elem) -{ - write_lock(&fs_info->tree_mod_log_lock); - if (!elem->seq) { - elem->seq = btrfs_inc_tree_mod_seq(fs_info); - list_add_tail(&elem->list, &fs_info->tree_mod_seq_list); - } - write_unlock(&fs_info->tree_mod_log_lock); - - return elem->seq; -} - -void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, - struct seq_list *elem) -{ - struct rb_root *tm_root; - struct rb_node *node; - struct rb_node *next; - struct tree_mod_elem *tm; - u64 min_seq = (u64)-1; - u64 seq_putting = elem->seq; - - if (!seq_putting) - return; - - write_lock(&fs_info->tree_mod_log_lock); - list_del(&elem->list); - elem->seq = 0; - - if (!list_empty(&fs_info->tree_mod_seq_list)) { - struct seq_list *first; - - first = list_first_entry(&fs_info->tree_mod_seq_list, - struct seq_list, list); - if (seq_putting > first->seq) { - /* - * Blocker with lower sequence number exists, we - * cannot remove anything from the log. - */ - write_unlock(&fs_info->tree_mod_log_lock); - return; - } - min_seq = first->seq; - } - - /* - * anything that's lower than the lowest existing (read: blocked) - * sequence number can be removed from the tree. - */ - tm_root = &fs_info->tree_mod_log; - for (node = rb_first(tm_root); node; node = next) { - next = rb_next(node); - tm = rb_entry(node, struct tree_mod_elem, node); - if (tm->seq >= min_seq) - continue; - rb_erase(node, tm_root); - kfree(tm); - } - write_unlock(&fs_info->tree_mod_log_lock); -} - -/* - * key order of the log: - * node/leaf start address -> sequence - * - * The 'start address' is the logical address of the *new* root node - * for root replace operations, or the logical address of the affected - * block for all other operations. - */ -static noinline int -__tree_mod_log_insert(struct btrfs_fs_info *fs_info, struct tree_mod_elem *tm) -{ - struct rb_root *tm_root; - struct rb_node **new; - struct rb_node *parent = NULL; - struct tree_mod_elem *cur; - - lockdep_assert_held_write(&fs_info->tree_mod_log_lock); - - tm->seq = btrfs_inc_tree_mod_seq(fs_info); - - tm_root = &fs_info->tree_mod_log; - new = &tm_root->rb_node; - while (*new) { - cur = rb_entry(*new, struct tree_mod_elem, node); - parent = *new; - if (cur->logical < tm->logical) - new = &((*new)->rb_left); - else if (cur->logical > tm->logical) - new = &((*new)->rb_right); - else if (cur->seq < tm->seq) - new = &((*new)->rb_left); - else if (cur->seq > tm->seq) - new = &((*new)->rb_right); - else - return -EEXIST; - } - - rb_link_node(&tm->node, parent, new); - rb_insert_color(&tm->node, tm_root); - return 0; -} - -/* - * Determines if logging can be omitted. Returns 1 if it can. Otherwise, it - * returns zero with the tree_mod_log_lock acquired. The caller must hold - * this until all tree mod log insertions are recorded in the rb tree and then - * write unlock fs_info::tree_mod_log_lock. - */ -static inline int tree_mod_dont_log(struct btrfs_fs_info *fs_info, - struct extent_buffer *eb) { - smp_mb(); - if (list_empty(&(fs_info)->tree_mod_seq_list)) - return 1; - if (eb && btrfs_header_level(eb) == 0) - return 1; - - write_lock(&fs_info->tree_mod_log_lock); - if (list_empty(&(fs_info)->tree_mod_seq_list)) { - write_unlock(&fs_info->tree_mod_log_lock); - return 1; - } - - return 0; -} - -/* Similar to tree_mod_dont_log, but doesn't acquire any locks. */ -static inline int tree_mod_need_log(const struct btrfs_fs_info *fs_info, - struct extent_buffer *eb) -{ - smp_mb(); - if (list_empty(&(fs_info)->tree_mod_seq_list)) - return 0; - if (eb && btrfs_header_level(eb) == 0) - return 0; - - return 1; -} - -static struct tree_mod_elem * -alloc_tree_mod_elem(struct extent_buffer *eb, int slot, - enum mod_log_op op, gfp_t flags) -{ - struct tree_mod_elem *tm; - - tm = kzalloc(sizeof(*tm), flags); - if (!tm) - return NULL; - - tm->logical = eb->start; - if (op != MOD_LOG_KEY_ADD) { - btrfs_node_key(eb, &tm->key, slot); - tm->blockptr = btrfs_node_blockptr(eb, slot); - } - tm->op = op; - tm->slot = slot; - tm->generation = btrfs_node_ptr_generation(eb, slot); - RB_CLEAR_NODE(&tm->node); - - return tm; -} - -static noinline int tree_mod_log_insert_key(struct extent_buffer *eb, int slot, - enum mod_log_op op, gfp_t flags) -{ - struct tree_mod_elem *tm; - int ret; - - if (!tree_mod_need_log(eb->fs_info, eb)) - return 0; - - tm = alloc_tree_mod_elem(eb, slot, op, flags); - if (!tm) - return -ENOMEM; - - if (tree_mod_dont_log(eb->fs_info, eb)) { - kfree(tm); - return 0; - } - - ret = __tree_mod_log_insert(eb->fs_info, tm); - write_unlock(&eb->fs_info->tree_mod_log_lock); - if (ret) - kfree(tm); - - return ret; -} - -static noinline int tree_mod_log_insert_move(struct extent_buffer *eb, - int dst_slot, int src_slot, int nr_items) -{ - struct tree_mod_elem *tm = NULL; - struct tree_mod_elem **tm_list = NULL; - int ret = 0; - int i; - int locked = 0; - - if (!tree_mod_need_log(eb->fs_info, eb)) - return 0; - - tm_list = kcalloc(nr_items, sizeof(struct tree_mod_elem *), GFP_NOFS); - if (!tm_list) - return -ENOMEM; - - tm = kzalloc(sizeof(*tm), GFP_NOFS); - if (!tm) { - ret = -ENOMEM; - goto free_tms; - } - - tm->logical = eb->start; - tm->slot = src_slot; - tm->move.dst_slot = dst_slot; - tm->move.nr_items = nr_items; - tm->op = MOD_LOG_MOVE_KEYS; - - for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { - tm_list[i] = alloc_tree_mod_elem(eb, i + dst_slot, - MOD_LOG_KEY_REMOVE_WHILE_MOVING, GFP_NOFS); - if (!tm_list[i]) { - ret = -ENOMEM; - goto free_tms; - } - } - - if (tree_mod_dont_log(eb->fs_info, eb)) - goto free_tms; - locked = 1; - - /* - * When we override something during the move, we log these removals. - * This can only happen when we move towards the beginning of the - * buffer, i.e. dst_slot < src_slot. - */ - for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { - ret = __tree_mod_log_insert(eb->fs_info, tm_list[i]); - if (ret) - goto free_tms; - } - - ret = __tree_mod_log_insert(eb->fs_info, tm); - if (ret) - goto free_tms; - write_unlock(&eb->fs_info->tree_mod_log_lock); - kfree(tm_list); - - return 0; -free_tms: - for (i = 0; i < nr_items; i++) { - if (tm_list[i] && !RB_EMPTY_NODE(&tm_list[i]->node)) - rb_erase(&tm_list[i]->node, &eb->fs_info->tree_mod_log); - kfree(tm_list[i]); - } - if (locked) - write_unlock(&eb->fs_info->tree_mod_log_lock); - kfree(tm_list); - kfree(tm); - - return ret; -} - -static inline int -__tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, - struct tree_mod_elem **tm_list, - int nritems) -{ - int i, j; - int ret; - - for (i = nritems - 1; i >= 0; i--) { - ret = __tree_mod_log_insert(fs_info, tm_list[i]); - if (ret) { - for (j = nritems - 1; j > i; j--) - rb_erase(&tm_list[j]->node, - &fs_info->tree_mod_log); - return ret; - } - } - - return 0; -} - -static noinline int tree_mod_log_insert_root(struct extent_buffer *old_root, - struct extent_buffer *new_root, int log_removal) -{ - struct btrfs_fs_info *fs_info = old_root->fs_info; - struct tree_mod_elem *tm = NULL; - struct tree_mod_elem **tm_list = NULL; - int nritems = 0; - int ret = 0; - int i; - - if (!tree_mod_need_log(fs_info, NULL)) - return 0; - - if (log_removal && btrfs_header_level(old_root) > 0) { - nritems = btrfs_header_nritems(old_root); - tm_list = kcalloc(nritems, sizeof(struct tree_mod_elem *), - GFP_NOFS); - if (!tm_list) { - ret = -ENOMEM; - goto free_tms; - } - for (i = 0; i < nritems; i++) { - tm_list[i] = alloc_tree_mod_elem(old_root, i, - MOD_LOG_KEY_REMOVE_WHILE_FREEING, GFP_NOFS); - if (!tm_list[i]) { - ret = -ENOMEM; - goto free_tms; - } - } - } - - tm = kzalloc(sizeof(*tm), GFP_NOFS); - if (!tm) { - ret = -ENOMEM; - goto free_tms; - } - - tm->logical = new_root->start; - tm->old_root.logical = old_root->start; - tm->old_root.level = btrfs_header_level(old_root); - tm->generation = btrfs_header_generation(old_root); - tm->op = MOD_LOG_ROOT_REPLACE; - - if (tree_mod_dont_log(fs_info, NULL)) - goto free_tms; - - if (tm_list) - ret = __tree_mod_log_free_eb(fs_info, tm_list, nritems); - if (!ret) - ret = __tree_mod_log_insert(fs_info, tm); - - write_unlock(&fs_info->tree_mod_log_lock); - if (ret) - goto free_tms; - kfree(tm_list); - - return ret; - -free_tms: - if (tm_list) { - for (i = 0; i < nritems; i++) - kfree(tm_list[i]); - kfree(tm_list); - } - kfree(tm); - - return ret; -} - -static struct tree_mod_elem * -__tree_mod_log_search(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq, - int smallest) -{ - struct rb_root *tm_root; - struct rb_node *node; - struct tree_mod_elem *cur = NULL; - struct tree_mod_elem *found = NULL; - - read_lock(&fs_info->tree_mod_log_lock); - tm_root = &fs_info->tree_mod_log; - node = tm_root->rb_node; - while (node) { - cur = rb_entry(node, struct tree_mod_elem, node); - if (cur->logical < start) { - node = node->rb_left; - } else if (cur->logical > start) { - node = node->rb_right; - } else if (cur->seq < min_seq) { - node = node->rb_left; - } else if (!smallest) { - /* we want the node with the highest seq */ - if (found) - BUG_ON(found->seq > cur->seq); - found = cur; - node = node->rb_left; - } else if (cur->seq > min_seq) { - /* we want the node with the smallest seq */ - if (found) - BUG_ON(found->seq < cur->seq); - found = cur; - node = node->rb_right; - } else { - found = cur; - break; - } - } - read_unlock(&fs_info->tree_mod_log_lock); - - return found; -} - -/* - * this returns the element from the log with the smallest time sequence - * value that's in the log (the oldest log item). any element with a time - * sequence lower than min_seq will be ignored. - */ -static struct tree_mod_elem * -tree_mod_log_search_oldest(struct btrfs_fs_info *fs_info, u64 start, - u64 min_seq) -{ - return __tree_mod_log_search(fs_info, start, min_seq, 1); -} - -/* - * this returns the element from the log with the largest time sequence - * value that's in the log (the most recent log item). any element with - * a time sequence lower than min_seq will be ignored. - */ -static struct tree_mod_elem * -tree_mod_log_search(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq) -{ - return __tree_mod_log_search(fs_info, start, min_seq, 0); -} - -static noinline int tree_mod_log_eb_copy(struct extent_buffer *dst, - struct extent_buffer *src, unsigned long dst_offset, - unsigned long src_offset, int nr_items) -{ - struct btrfs_fs_info *fs_info = dst->fs_info; - int ret = 0; - struct tree_mod_elem **tm_list = NULL; - struct tree_mod_elem **tm_list_add, **tm_list_rem; - int i; - int locked = 0; - - if (!tree_mod_need_log(fs_info, NULL)) - return 0; - - if (btrfs_header_level(dst) == 0 && btrfs_header_level(src) == 0) - return 0; - - tm_list = kcalloc(nr_items * 2, sizeof(struct tree_mod_elem *), - GFP_NOFS); - if (!tm_list) - return -ENOMEM; - - tm_list_add = tm_list; - tm_list_rem = tm_list + nr_items; - for (i = 0; i < nr_items; i++) { - tm_list_rem[i] = alloc_tree_mod_elem(src, i + src_offset, - MOD_LOG_KEY_REMOVE, GFP_NOFS); - if (!tm_list_rem[i]) { - ret = -ENOMEM; - goto free_tms; - } - - tm_list_add[i] = alloc_tree_mod_elem(dst, i + dst_offset, - MOD_LOG_KEY_ADD, GFP_NOFS); - if (!tm_list_add[i]) { - ret = -ENOMEM; - goto free_tms; - } - } - - if (tree_mod_dont_log(fs_info, NULL)) - goto free_tms; - locked = 1; - - for (i = 0; i < nr_items; i++) { - ret = __tree_mod_log_insert(fs_info, tm_list_rem[i]); - if (ret) - goto free_tms; - ret = __tree_mod_log_insert(fs_info, tm_list_add[i]); - if (ret) - goto free_tms; - } - - write_unlock(&fs_info->tree_mod_log_lock); - kfree(tm_list); - - return 0; - -free_tms: - for (i = 0; i < nr_items * 2; i++) { - if (tm_list[i] && !RB_EMPTY_NODE(&tm_list[i]->node)) - rb_erase(&tm_list[i]->node, &fs_info->tree_mod_log); - kfree(tm_list[i]); - } - if (locked) - write_unlock(&fs_info->tree_mod_log_lock); - kfree(tm_list); - - return ret; -} - -static noinline int tree_mod_log_free_eb(struct extent_buffer *eb) -{ - struct tree_mod_elem **tm_list = NULL; - int nritems = 0; - int i; - int ret = 0; - - if (btrfs_header_level(eb) == 0) - return 0; - - if (!tree_mod_need_log(eb->fs_info, NULL)) - return 0; - - nritems = btrfs_header_nritems(eb); - tm_list = kcalloc(nritems, sizeof(struct tree_mod_elem *), GFP_NOFS); - if (!tm_list) - return -ENOMEM; - - for (i = 0; i < nritems; i++) { - tm_list[i] = alloc_tree_mod_elem(eb, i, - MOD_LOG_KEY_REMOVE_WHILE_FREEING, GFP_NOFS); - if (!tm_list[i]) { - ret = -ENOMEM; - goto free_tms; - } - } - - if (tree_mod_dont_log(eb->fs_info, eb)) - goto free_tms; - - ret = __tree_mod_log_free_eb(eb->fs_info, tm_list, nritems); - write_unlock(&eb->fs_info->tree_mod_log_lock); - if (ret) - goto free_tms; - kfree(tm_list); - - return 0; - -free_tms: - for (i = 0; i < nritems; i++) - kfree(tm_list[i]); - kfree(tm_list); - - return ret; -} - /* * check if the tree block can be shared by multiple trees */ @@ -1090,7 +500,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, parent_start = buf->start; atomic_inc(&cow->refs); - ret = tree_mod_log_insert_root(root->node, cow, 1); + ret = btrfs_tree_mod_log_insert_root(root->node, cow, 1); BUG_ON(ret < 0); rcu_assign_pointer(root->node, cow); @@ -1100,15 +510,15 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, add_root_to_dirty_list(root); } else { WARN_ON(trans->transid != btrfs_header_generation(parent)); - tree_mod_log_insert_key(parent, parent_slot, - MOD_LOG_KEY_REPLACE, GFP_NOFS); + btrfs_tree_mod_log_insert_key(parent, parent_slot, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_NOFS); btrfs_set_node_blockptr(parent, parent_slot, cow->start); btrfs_set_node_ptr_generation(parent, parent_slot, trans->transid); btrfs_mark_buffer_dirty(parent); if (last_ref) { - ret = tree_mod_log_free_eb(buf); + ret = btrfs_tree_mod_log_free_eb(buf); if (ret) { btrfs_tree_unlock(cow); free_extent_buffer(cow); @@ -1127,298 +537,6 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, return 0; } -/* - * returns the logical address of the oldest predecessor of the given root. - * entries older than time_seq are ignored. - */ -static struct tree_mod_elem *__tree_mod_log_oldest_root( - struct extent_buffer *eb_root, u64 time_seq) -{ - struct tree_mod_elem *tm; - struct tree_mod_elem *found = NULL; - u64 root_logical = eb_root->start; - int looped = 0; - - if (!time_seq) - return NULL; - - /* - * the very last operation that's logged for a root is the - * replacement operation (if it is replaced at all). this has - * the logical address of the *new* root, making it the very - * first operation that's logged for this root. - */ - while (1) { - tm = tree_mod_log_search_oldest(eb_root->fs_info, root_logical, - time_seq); - if (!looped && !tm) - return NULL; - /* - * if there are no tree operation for the oldest root, we simply - * return it. this should only happen if that (old) root is at - * level 0. - */ - if (!tm) - break; - - /* - * if there's an operation that's not a root replacement, we - * found the oldest version of our root. normally, we'll find a - * MOD_LOG_KEY_REMOVE_WHILE_FREEING operation here. - */ - if (tm->op != MOD_LOG_ROOT_REPLACE) - break; - - found = tm; - root_logical = tm->old_root.logical; - looped = 1; - } - - /* if there's no old root to return, return what we found instead */ - if (!found) - found = tm; - - return found; -} - -/* - * tm is a pointer to the first operation to rewind within eb. then, all - * previous operations will be rewound (until we reach something older than - * time_seq). - */ -static void -__tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct extent_buffer *eb, - u64 time_seq, struct tree_mod_elem *first_tm) -{ - u32 n; - struct rb_node *next; - struct tree_mod_elem *tm = first_tm; - unsigned long o_dst; - unsigned long o_src; - unsigned long p_size = sizeof(struct btrfs_key_ptr); - - n = btrfs_header_nritems(eb); - read_lock(&fs_info->tree_mod_log_lock); - while (tm && tm->seq >= time_seq) { - /* - * all the operations are recorded with the operator used for - * the modification. as we're going backwards, we do the - * opposite of each operation here. - */ - switch (tm->op) { - case MOD_LOG_KEY_REMOVE_WHILE_FREEING: - BUG_ON(tm->slot < n); - fallthrough; - case MOD_LOG_KEY_REMOVE_WHILE_MOVING: - case MOD_LOG_KEY_REMOVE: - btrfs_set_node_key(eb, &tm->key, tm->slot); - btrfs_set_node_blockptr(eb, tm->slot, tm->blockptr); - btrfs_set_node_ptr_generation(eb, tm->slot, - tm->generation); - n++; - break; - case MOD_LOG_KEY_REPLACE: - BUG_ON(tm->slot >= n); - btrfs_set_node_key(eb, &tm->key, tm->slot); - btrfs_set_node_blockptr(eb, tm->slot, tm->blockptr); - btrfs_set_node_ptr_generation(eb, tm->slot, - tm->generation); - break; - case MOD_LOG_KEY_ADD: - /* if a move operation is needed it's in the log */ - n--; - break; - case MOD_LOG_MOVE_KEYS: - o_dst = btrfs_node_key_ptr_offset(tm->slot); - o_src = btrfs_node_key_ptr_offset(tm->move.dst_slot); - memmove_extent_buffer(eb, o_dst, o_src, - tm->move.nr_items * p_size); - break; - case MOD_LOG_ROOT_REPLACE: - /* - * this operation is special. for roots, this must be - * handled explicitly before rewinding. - * for non-roots, this operation may exist if the node - * was a root: root A -> child B; then A gets empty and - * B is promoted to the new root. in the mod log, we'll - * have a root-replace operation for B, a tree block - * that is no root. we simply ignore that operation. - */ - break; - } - next = rb_next(&tm->node); - if (!next) - break; - tm = rb_entry(next, struct tree_mod_elem, node); - if (tm->logical != first_tm->logical) - break; - } - read_unlock(&fs_info->tree_mod_log_lock); - btrfs_set_header_nritems(eb, n); -} - -/* - * Called with eb read locked. If the buffer cannot be rewound, the same buffer - * is returned. If rewind operations happen, a fresh buffer is returned. The - * returned buffer is always read-locked. If the returned buffer is not the - * input buffer, the lock on the input buffer is released and the input buffer - * is freed (its refcount is decremented). - */ -static struct extent_buffer * -tree_mod_log_rewind(struct btrfs_fs_info *fs_info, struct btrfs_path *path, - struct extent_buffer *eb, u64 time_seq) -{ - struct extent_buffer *eb_rewin; - struct tree_mod_elem *tm; - - if (!time_seq) - return eb; - - if (btrfs_header_level(eb) == 0) - return eb; - - tm = tree_mod_log_search(fs_info, eb->start, time_seq); - if (!tm) - return eb; - - if (tm->op == MOD_LOG_KEY_REMOVE_WHILE_FREEING) { - BUG_ON(tm->slot != 0); - eb_rewin = alloc_dummy_extent_buffer(fs_info, eb->start); - if (!eb_rewin) { - btrfs_tree_read_unlock(eb); - free_extent_buffer(eb); - return NULL; - } - btrfs_set_header_bytenr(eb_rewin, eb->start); - btrfs_set_header_backref_rev(eb_rewin, - btrfs_header_backref_rev(eb)); - btrfs_set_header_owner(eb_rewin, btrfs_header_owner(eb)); - btrfs_set_header_level(eb_rewin, btrfs_header_level(eb)); - } else { - eb_rewin = btrfs_clone_extent_buffer(eb); - if (!eb_rewin) { - btrfs_tree_read_unlock(eb); - free_extent_buffer(eb); - return NULL; - } - } - - btrfs_tree_read_unlock(eb); - free_extent_buffer(eb); - - btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb_rewin), - eb_rewin, btrfs_header_level(eb_rewin)); - btrfs_tree_read_lock(eb_rewin); - __tree_mod_log_rewind(fs_info, eb_rewin, time_seq, tm); - WARN_ON(btrfs_header_nritems(eb_rewin) > - BTRFS_NODEPTRS_PER_BLOCK(fs_info)); - - return eb_rewin; -} - -/* - * get_old_root() rewinds the state of @root's root node to the given @time_seq - * value. If there are no changes, the current root->root_node is returned. If - * anything changed in between, there's a fresh buffer allocated on which the - * rewind operations are done. In any case, the returned buffer is read locked. - * Returns NULL on error (with no locks held). - */ -static inline struct extent_buffer * -get_old_root(struct btrfs_root *root, u64 time_seq) -{ - struct btrfs_fs_info *fs_info = root->fs_info; - struct tree_mod_elem *tm; - struct extent_buffer *eb = NULL; - struct extent_buffer *eb_root; - u64 eb_root_owner = 0; - struct extent_buffer *old; - struct tree_mod_root *old_root = NULL; - u64 old_generation = 0; - u64 logical; - int level; - - eb_root = btrfs_read_lock_root_node(root); - tm = __tree_mod_log_oldest_root(eb_root, time_seq); - if (!tm) - return eb_root; - - if (tm->op == MOD_LOG_ROOT_REPLACE) { - old_root = &tm->old_root; - old_generation = tm->generation; - logical = old_root->logical; - level = old_root->level; - } else { - logical = eb_root->start; - level = btrfs_header_level(eb_root); - } - - tm = tree_mod_log_search(fs_info, logical, time_seq); - if (old_root && tm && tm->op != MOD_LOG_KEY_REMOVE_WHILE_FREEING) { - btrfs_tree_read_unlock(eb_root); - free_extent_buffer(eb_root); - old = read_tree_block(fs_info, logical, root->root_key.objectid, - 0, level, NULL); - if (WARN_ON(IS_ERR(old) || !extent_buffer_uptodate(old))) { - if (!IS_ERR(old)) - free_extent_buffer(old); - btrfs_warn(fs_info, - "failed to read tree block %llu from get_old_root", - logical); - } else { - btrfs_tree_read_lock(old); - eb = btrfs_clone_extent_buffer(old); - btrfs_tree_read_unlock(old); - free_extent_buffer(old); - } - } else if (old_root) { - eb_root_owner = btrfs_header_owner(eb_root); - btrfs_tree_read_unlock(eb_root); - free_extent_buffer(eb_root); - eb = alloc_dummy_extent_buffer(fs_info, logical); - } else { - eb = btrfs_clone_extent_buffer(eb_root); - btrfs_tree_read_unlock(eb_root); - free_extent_buffer(eb_root); - } - - if (!eb) - return NULL; - if (old_root) { - btrfs_set_header_bytenr(eb, eb->start); - btrfs_set_header_backref_rev(eb, BTRFS_MIXED_BACKREF_REV); - btrfs_set_header_owner(eb, eb_root_owner); - btrfs_set_header_level(eb, old_root->level); - btrfs_set_header_generation(eb, old_generation); - } - btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb), eb, - btrfs_header_level(eb)); - btrfs_tree_read_lock(eb); - if (tm) - __tree_mod_log_rewind(fs_info, eb, time_seq, tm); - else - WARN_ON(btrfs_header_level(eb) != 0); - WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(fs_info)); - - return eb; -} - -int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq) -{ - struct tree_mod_elem *tm; - int level; - struct extent_buffer *eb_root = btrfs_root_node(root); - - tm = __tree_mod_log_oldest_root(eb_root, time_seq); - if (tm && tm->op == MOD_LOG_ROOT_REPLACE) { - level = tm->old_root.level; - } else { - level = btrfs_header_level(eb_root); - } - free_extent_buffer(eb_root); - - return level; -} - static inline int should_cow_block(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct extent_buffer *buf) @@ -1840,7 +958,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto enospc; } - ret = tree_mod_log_insert_root(root->node, child, 1); + ret = btrfs_tree_mod_log_insert_root(root->node, child, 1); BUG_ON(ret < 0); rcu_assign_pointer(root->node, child); @@ -1920,8 +1038,8 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, } else { struct btrfs_disk_key right_key; btrfs_node_key(right, &right_key, 0); - ret = tree_mod_log_insert_key(parent, pslot + 1, - MOD_LOG_KEY_REPLACE, GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(parent, pslot + 1, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_NOFS); BUG_ON(ret < 0); btrfs_set_node_key(parent, &right_key, pslot + 1); btrfs_mark_buffer_dirty(parent); @@ -1966,8 +1084,8 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, /* update the parent key to reflect our changes */ struct btrfs_disk_key mid_key; btrfs_node_key(mid, &mid_key, 0); - ret = tree_mod_log_insert_key(parent, pslot, - MOD_LOG_KEY_REPLACE, GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(parent, pslot, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_NOFS); BUG_ON(ret < 0); btrfs_set_node_key(parent, &mid_key, pslot); btrfs_mark_buffer_dirty(parent); @@ -2068,8 +1186,8 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, struct btrfs_disk_key disk_key; orig_slot += left_nr; btrfs_node_key(mid, &disk_key, 0); - ret = tree_mod_log_insert_key(parent, pslot, - MOD_LOG_KEY_REPLACE, GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(parent, pslot, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_NOFS); BUG_ON(ret < 0); btrfs_set_node_key(parent, &disk_key, pslot); btrfs_mark_buffer_dirty(parent); @@ -2122,8 +1240,8 @@ static noinline int push_nodes_for_insert(struct btrfs_trans_handle *trans, struct btrfs_disk_key disk_key; btrfs_node_key(right, &disk_key, 0); - ret = tree_mod_log_insert_key(parent, pslot + 1, - MOD_LOG_KEY_REPLACE, GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(parent, pslot + 1, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_NOFS); BUG_ON(ret < 0); btrfs_set_node_key(parent, &disk_key, pslot + 1); btrfs_mark_buffer_dirty(parent); @@ -2881,7 +1999,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, } again: - b = get_old_root(root, time_seq); + b = btrfs_get_old_root(root, time_seq); if (!b) { ret = -EIO; goto done; @@ -2936,7 +2054,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, level = btrfs_header_level(b); btrfs_tree_read_lock(b); - b = tree_mod_log_rewind(fs_info, p, b, time_seq); + b = btrfs_tree_mod_log_rewind(fs_info, p, b, time_seq); if (!b) { ret = -ENOMEM; goto done; @@ -3050,8 +2168,8 @@ static void fixup_low_keys(struct btrfs_path *path, if (!path->nodes[i]) break; t = path->nodes[i]; - ret = tree_mod_log_insert_key(t, tslot, MOD_LOG_KEY_REPLACE, - GFP_ATOMIC); + ret = btrfs_tree_mod_log_insert_key(t, tslot, + BTRFS_MOD_LOG_KEY_REPLACE, GFP_ATOMIC); BUG_ON(ret < 0); btrfs_set_node_key(t, key, tslot); btrfs_mark_buffer_dirty(path->nodes[i]); @@ -3214,7 +2332,7 @@ static int push_node_left(struct btrfs_trans_handle *trans, btrfs_abort_transaction(trans, ret); return ret; } - ret = tree_mod_log_eb_copy(dst, src, dst_nritems, 0, push_items); + ret = btrfs_tree_mod_log_eb_copy(dst, src, dst_nritems, 0, push_items); if (ret) { btrfs_abort_transaction(trans, ret); return ret; @@ -3226,8 +2344,8 @@ static int push_node_left(struct btrfs_trans_handle *trans, if (push_items < src_nritems) { /* - * Don't call tree_mod_log_insert_move here, key removal was - * already fully logged by tree_mod_log_eb_copy above. + * Don't call btrfs_tree_mod_log_insert_move() here, key removal + * was already fully logged by btrfs_tree_mod_log_eb_copy() above. */ memmove_extent_buffer(src, btrfs_node_key_ptr_offset(0), btrfs_node_key_ptr_offset(push_items), @@ -3288,15 +2406,15 @@ static int balance_node_right(struct btrfs_trans_handle *trans, btrfs_abort_transaction(trans, ret); return ret; } - ret = tree_mod_log_insert_move(dst, push_items, 0, dst_nritems); + ret = btrfs_tree_mod_log_insert_move(dst, push_items, 0, dst_nritems); BUG_ON(ret < 0); memmove_extent_buffer(dst, btrfs_node_key_ptr_offset(push_items), btrfs_node_key_ptr_offset(0), (dst_nritems) * sizeof(struct btrfs_key_ptr)); - ret = tree_mod_log_eb_copy(dst, src, 0, src_nritems - push_items, - push_items); + ret = btrfs_tree_mod_log_eb_copy(dst, src, 0, src_nritems - push_items, + push_items); if (ret) { btrfs_abort_transaction(trans, ret); return ret; @@ -3362,7 +2480,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans, btrfs_mark_buffer_dirty(c); old = root->node; - ret = tree_mod_log_insert_root(root->node, c, 0); + ret = btrfs_tree_mod_log_insert_root(root->node, c, 0); BUG_ON(ret < 0); rcu_assign_pointer(root->node, c); @@ -3401,8 +2519,8 @@ static void insert_ptr(struct btrfs_trans_handle *trans, BUG_ON(nritems == BTRFS_NODEPTRS_PER_BLOCK(trans->fs_info)); if (slot != nritems) { if (level) { - ret = tree_mod_log_insert_move(lower, slot + 1, slot, - nritems - slot); + ret = btrfs_tree_mod_log_insert_move(lower, slot + 1, + slot, nritems - slot); BUG_ON(ret < 0); } memmove_extent_buffer(lower, @@ -3411,8 +2529,8 @@ static void insert_ptr(struct btrfs_trans_handle *trans, (nritems - slot) * sizeof(struct btrfs_key_ptr)); } if (level) { - ret = tree_mod_log_insert_key(lower, slot, MOD_LOG_KEY_ADD, - GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(lower, slot, + BTRFS_MOD_LOG_KEY_ADD, GFP_NOFS); BUG_ON(ret < 0); } btrfs_set_node_key(lower, key, slot); @@ -3453,9 +2571,9 @@ static noinline int split_node(struct btrfs_trans_handle *trans, * tree mod log: We don't log_removal old root in * insert_new_root, because that root buffer will be kept as a * normal node. We are going to log removal of half of the - * elements below with tree_mod_log_eb_copy. We're holding a - * tree lock on the buffer, which is why we cannot race with - * other tree_mod_log users. + * elements below with btrfs_tree_mod_log_eb_copy(). We're + * holding a tree lock on the buffer, which is why we cannot + * race with other tree_mod_log users. */ ret = insert_new_root(trans, root, path, level + 1); if (ret) @@ -3482,7 +2600,7 @@ static noinline int split_node(struct btrfs_trans_handle *trans, root_add_used(root, fs_info->nodesize); ASSERT(btrfs_header_level(c) == level); - ret = tree_mod_log_eb_copy(split, c, 0, mid, c_nritems - mid); + ret = btrfs_tree_mod_log_eb_copy(split, c, 0, mid, c_nritems - mid); if (ret) { btrfs_abort_transaction(trans, ret); return ret; @@ -4864,8 +3982,8 @@ static void del_ptr(struct btrfs_root *root, struct btrfs_path *path, nritems = btrfs_header_nritems(parent); if (slot != nritems - 1) { if (level) { - ret = tree_mod_log_insert_move(parent, slot, slot + 1, - nritems - slot - 1); + ret = btrfs_tree_mod_log_insert_move(parent, slot, + slot + 1, nritems - slot - 1); BUG_ON(ret < 0); } memmove_extent_buffer(parent, @@ -4874,8 +3992,8 @@ static void del_ptr(struct btrfs_root *root, struct btrfs_path *path, sizeof(struct btrfs_key_ptr) * (nritems - slot - 1)); } else if (level) { - ret = tree_mod_log_insert_key(parent, slot, MOD_LOG_KEY_REMOVE, - GFP_NOFS); + ret = btrfs_tree_mod_log_insert_key(parent, slot, + BTRFS_MOD_LOG_KEY_REMOVE, GFP_NOFS); BUG_ON(ret < 0); } diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index ae8901aec4f1..50208673bd55 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -500,16 +500,6 @@ struct btrfs_discard_ctl { atomic64_t discard_bytes_saved; }; -/* delayed seq elem */ -struct seq_list { - struct list_head list; - u64 seq; -}; - -#define SEQ_LIST_INIT(name) { .list = LIST_HEAD_INIT((name).list), .seq = 0 } - -#define SEQ_LAST ((u64)-1) - enum btrfs_orphan_cleanup_state { ORPHAN_CLEANUP_STARTED = 1, ORPHAN_CLEANUP_DONE = 2, @@ -2946,13 +2936,6 @@ static inline void btrfs_clear_sb_rdonly(struct super_block *sb) clear_bit(BTRFS_FS_STATE_RO, &btrfs_sb(sb)->fs_state); } -/* tree mod log functions from ctree.c */ -u64 btrfs_get_tree_mod_seq(struct btrfs_fs_info *fs_info, - struct seq_list *elem); -void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, - struct seq_list *elem); -int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq); - /* root-item.c */ int btrfs_add_root_ref(struct btrfs_trans_handle *trans, u64 root_id, u64 ref_id, u64 dirid, u64 sequence, const char *name, diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 63be7d01a9a3..d87472a68bf4 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -11,6 +11,7 @@ #include "transaction.h" #include "qgroup.h" #include "space-info.h" +#include "tree-mod-log.h" struct kmem_cache *btrfs_delayed_ref_head_cachep; struct kmem_cache *btrfs_delayed_tree_ref_cachep; @@ -496,10 +497,10 @@ void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, read_lock(&fs_info->tree_mod_log_lock); if (!list_empty(&fs_info->tree_mod_seq_list)) { - struct seq_list *elem; + struct btrfs_seq_list *elem; elem = list_first_entry(&fs_info->tree_mod_seq_list, - struct seq_list, list); + struct btrfs_seq_list, list); seq = elem->seq; } read_unlock(&fs_info->tree_mod_log_lock); @@ -517,13 +518,13 @@ void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info, u64 seq) { - struct seq_list *elem; + struct btrfs_seq_list *elem; int ret = 0; read_lock(&fs_info->tree_mod_log_lock); if (!list_empty(&fs_info->tree_mod_seq_list)) { elem = list_first_entry(&fs_info->tree_mod_seq_list, - struct seq_list, list); + struct btrfs_seq_list, list); if (seq >= elem->seq) { btrfs_debug(fs_info, "holding back delayed_ref %#x.%x, lowest is %#x.%x", diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 1b7d3cadc39e..9d8d1ac86962 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -23,6 +23,7 @@ #include "qgroup.h" #include "block-group.h" #include "sysfs.h" +#include "tree-mod-log.h" /* TODO XXX FIXME * - subvol delete -> delete when ref goes to 0? delete limits also? @@ -2631,12 +2632,12 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans) record->data_rsv, BTRFS_QGROUP_RSV_DATA); /* - * Use SEQ_LAST as time_seq to do special search, which - * doesn't lock tree or delayed_refs and search current - * root. It's safe inside commit_transaction(). + * Use BTRFS_SEQ_LAST as time_seq to do special search, + * which doesn't lock tree or delayed_refs and search + * current root. It's safe inside commit_transaction(). */ ret = btrfs_find_all_roots(trans, fs_info, - record->bytenr, SEQ_LAST, &new_roots, false); + record->bytenr, BTRFS_SEQ_LAST, &new_roots, false); if (ret < 0) goto cleanup; if (qgroup_to_skip) { diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c new file mode 100644 index 000000000000..fc7eb33ec383 --- /dev/null +++ b/fs/btrfs/tree-mod-log.c @@ -0,0 +1,889 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "tree-mod-log.h" +#include "disk-io.h" + +struct tree_mod_root { + u64 logical; + u8 level; +}; + +struct tree_mod_elem { + struct rb_node node; + u64 logical; + u64 seq; + enum btrfs_mod_log_op op; + + /* + * This is used for BTRFS_MOD_LOG_KEY_* and BTRFS_MOD_LOG_MOVE_KEYS + * operations. + */ + int slot; + + /* This is used for BTRFS_MOD_LOG_KEY* and BTRFS_MOD_LOG_ROOT_REPLACE. */ + u64 generation; + + /* Those are used for op == BTRFS_MOD_LOG_KEY_{REPLACE,REMOVE}. */ + struct btrfs_disk_key key; + u64 blockptr; + + /* This is used for op == BTRFS_MOD_LOG_MOVE_KEYS. */ + struct { + int dst_slot; + int nr_items; + } move; + + /* This is used for op == BTRFS_MOD_LOG_ROOT_REPLACE. */ + struct tree_mod_root old_root; +}; + +/* + * Pull a new tree mod seq number for our operation. + */ +static inline u64 btrfs_inc_tree_mod_seq(struct btrfs_fs_info *fs_info) +{ + return atomic64_inc_return(&fs_info->tree_mod_seq); +} + +/* + * This adds a new blocker to the tree mod log's blocker list if the @elem + * passed does not already have a sequence number set. So when a caller expects + * to record tree modifications, it should ensure to set elem->seq to zero + * before calling btrfs_get_tree_mod_seq. + * Returns a fresh, unused tree log modification sequence number, even if no new + * blocker was added. + */ +u64 btrfs_get_tree_mod_seq(struct btrfs_fs_info *fs_info, + struct btrfs_seq_list *elem) +{ + write_lock(&fs_info->tree_mod_log_lock); + if (!elem->seq) { + elem->seq = btrfs_inc_tree_mod_seq(fs_info); + list_add_tail(&elem->list, &fs_info->tree_mod_seq_list); + } + write_unlock(&fs_info->tree_mod_log_lock); + + return elem->seq; +} + +void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, + struct btrfs_seq_list *elem) +{ + struct rb_root *tm_root; + struct rb_node *node; + struct rb_node *next; + struct tree_mod_elem *tm; + u64 min_seq = BTRFS_SEQ_LAST; + u64 seq_putting = elem->seq; + + if (!seq_putting) + return; + + write_lock(&fs_info->tree_mod_log_lock); + list_del(&elem->list); + elem->seq = 0; + + if (!list_empty(&fs_info->tree_mod_seq_list)) { + struct btrfs_seq_list *first; + + first = list_first_entry(&fs_info->tree_mod_seq_list, + struct btrfs_seq_list, list); + if (seq_putting > first->seq) { + /* + * Blocker with lower sequence number exists, we + * cannot remove anything from the log. + */ + write_unlock(&fs_info->tree_mod_log_lock); + return; + } + min_seq = first->seq; + } + + /* + * anything that's lower than the lowest existing (read: blocked) + * sequence number can be removed from the tree. + */ + tm_root = &fs_info->tree_mod_log; + for (node = rb_first(tm_root); node; node = next) { + next = rb_next(node); + tm = rb_entry(node, struct tree_mod_elem, node); + if (tm->seq >= min_seq) + continue; + rb_erase(node, tm_root); + kfree(tm); + } + write_unlock(&fs_info->tree_mod_log_lock); +} + +/* + * key order of the log: + * node/leaf start address -> sequence + * + * The 'start address' is the logical address of the *new* root node + * for root replace operations, or the logical address of the affected + * block for all other operations. + */ +static noinline int tree_mod_log_insert(struct btrfs_fs_info *fs_info, + struct tree_mod_elem *tm) +{ + struct rb_root *tm_root; + struct rb_node **new; + struct rb_node *parent = NULL; + struct tree_mod_elem *cur; + + lockdep_assert_held_write(&fs_info->tree_mod_log_lock); + + tm->seq = btrfs_inc_tree_mod_seq(fs_info); + + tm_root = &fs_info->tree_mod_log; + new = &tm_root->rb_node; + while (*new) { + cur = rb_entry(*new, struct tree_mod_elem, node); + parent = *new; + if (cur->logical < tm->logical) + new = &((*new)->rb_left); + else if (cur->logical > tm->logical) + new = &((*new)->rb_right); + else if (cur->seq < tm->seq) + new = &((*new)->rb_left); + else if (cur->seq > tm->seq) + new = &((*new)->rb_right); + else + return -EEXIST; + } + + rb_link_node(&tm->node, parent, new); + rb_insert_color(&tm->node, tm_root); + return 0; +} + +/* + * Determines if logging can be omitted. Returns 1 if it can. Otherwise, it + * returns zero with the tree_mod_log_lock acquired. The caller must hold + * this until all tree mod log insertions are recorded in the rb tree and then + * write unlock fs_info::tree_mod_log_lock. + */ +static inline int tree_mod_dont_log(struct btrfs_fs_info *fs_info, + struct extent_buffer *eb) +{ + smp_mb(); + if (list_empty(&(fs_info)->tree_mod_seq_list)) + return 1; + if (eb && btrfs_header_level(eb) == 0) + return 1; + + write_lock(&fs_info->tree_mod_log_lock); + if (list_empty(&(fs_info)->tree_mod_seq_list)) { + write_unlock(&fs_info->tree_mod_log_lock); + return 1; + } + + return 0; +} + +/* Similar to tree_mod_dont_log, but doesn't acquire any locks. */ +static inline int tree_mod_need_log(const struct btrfs_fs_info *fs_info, + struct extent_buffer *eb) +{ + smp_mb(); + if (list_empty(&(fs_info)->tree_mod_seq_list)) + return 0; + if (eb && btrfs_header_level(eb) == 0) + return 0; + + return 1; +} + +static struct tree_mod_elem *alloc_tree_mod_elem(struct extent_buffer *eb, + int slot, + enum btrfs_mod_log_op op, + gfp_t flags) +{ + struct tree_mod_elem *tm; + + tm = kzalloc(sizeof(*tm), flags); + if (!tm) + return NULL; + + tm->logical = eb->start; + if (op != BTRFS_MOD_LOG_KEY_ADD) { + btrfs_node_key(eb, &tm->key, slot); + tm->blockptr = btrfs_node_blockptr(eb, slot); + } + tm->op = op; + tm->slot = slot; + tm->generation = btrfs_node_ptr_generation(eb, slot); + RB_CLEAR_NODE(&tm->node); + + return tm; +} + +int btrfs_tree_mod_log_insert_key(struct extent_buffer *eb, int slot, + enum btrfs_mod_log_op op, gfp_t flags) +{ + struct tree_mod_elem *tm; + int ret; + + if (!tree_mod_need_log(eb->fs_info, eb)) + return 0; + + tm = alloc_tree_mod_elem(eb, slot, op, flags); + if (!tm) + return -ENOMEM; + + if (tree_mod_dont_log(eb->fs_info, eb)) { + kfree(tm); + return 0; + } + + ret = tree_mod_log_insert(eb->fs_info, tm); + write_unlock(&eb->fs_info->tree_mod_log_lock); + if (ret) + kfree(tm); + + return ret; +} + +int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb, + int dst_slot, int src_slot, + int nr_items) +{ + struct tree_mod_elem *tm = NULL; + struct tree_mod_elem **tm_list = NULL; + int ret = 0; + int i; + int locked = 0; + + if (!tree_mod_need_log(eb->fs_info, eb)) + return 0; + + tm_list = kcalloc(nr_items, sizeof(struct tree_mod_elem *), GFP_NOFS); + if (!tm_list) + return -ENOMEM; + + tm = kzalloc(sizeof(*tm), GFP_NOFS); + if (!tm) { + ret = -ENOMEM; + goto free_tms; + } + + tm->logical = eb->start; + tm->slot = src_slot; + tm->move.dst_slot = dst_slot; + tm->move.nr_items = nr_items; + tm->op = BTRFS_MOD_LOG_MOVE_KEYS; + + for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { + tm_list[i] = alloc_tree_mod_elem(eb, i + dst_slot, + BTRFS_MOD_LOG_KEY_REMOVE_WHILE_MOVING, GFP_NOFS); + if (!tm_list[i]) { + ret = -ENOMEM; + goto free_tms; + } + } + + if (tree_mod_dont_log(eb->fs_info, eb)) + goto free_tms; + locked = 1; + + /* + * When we override something during the move, we log these removals. + * This can only happen when we move towards the beginning of the + * buffer, i.e. dst_slot < src_slot. + */ + for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) { + ret = tree_mod_log_insert(eb->fs_info, tm_list[i]); + if (ret) + goto free_tms; + } + + ret = tree_mod_log_insert(eb->fs_info, tm); + if (ret) + goto free_tms; + write_unlock(&eb->fs_info->tree_mod_log_lock); + kfree(tm_list); + + return 0; +free_tms: + for (i = 0; i < nr_items; i++) { + if (tm_list[i] && !RB_EMPTY_NODE(&tm_list[i]->node)) + rb_erase(&tm_list[i]->node, &eb->fs_info->tree_mod_log); + kfree(tm_list[i]); + } + if (locked) + write_unlock(&eb->fs_info->tree_mod_log_lock); + kfree(tm_list); + kfree(tm); + + return ret; +} + +static inline int tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, + struct tree_mod_elem **tm_list, + int nritems) +{ + int i, j; + int ret; + + for (i = nritems - 1; i >= 0; i--) { + ret = tree_mod_log_insert(fs_info, tm_list[i]); + if (ret) { + for (j = nritems - 1; j > i; j--) + rb_erase(&tm_list[j]->node, + &fs_info->tree_mod_log); + return ret; + } + } + + return 0; +} + +int btrfs_tree_mod_log_insert_root(struct extent_buffer *old_root, + struct extent_buffer *new_root, + int log_removal) +{ + struct btrfs_fs_info *fs_info = old_root->fs_info; + struct tree_mod_elem *tm = NULL; + struct tree_mod_elem **tm_list = NULL; + int nritems = 0; + int ret = 0; + int i; + + if (!tree_mod_need_log(fs_info, NULL)) + return 0; + + if (log_removal && btrfs_header_level(old_root) > 0) { + nritems = btrfs_header_nritems(old_root); + tm_list = kcalloc(nritems, sizeof(struct tree_mod_elem *), + GFP_NOFS); + if (!tm_list) { + ret = -ENOMEM; + goto free_tms; + } + for (i = 0; i < nritems; i++) { + tm_list[i] = alloc_tree_mod_elem(old_root, i, + BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING, GFP_NOFS); + if (!tm_list[i]) { + ret = -ENOMEM; + goto free_tms; + } + } + } + + tm = kzalloc(sizeof(*tm), GFP_NOFS); + if (!tm) { + ret = -ENOMEM; + goto free_tms; + } + + tm->logical = new_root->start; + tm->old_root.logical = old_root->start; + tm->old_root.level = btrfs_header_level(old_root); + tm->generation = btrfs_header_generation(old_root); + tm->op = BTRFS_MOD_LOG_ROOT_REPLACE; + + if (tree_mod_dont_log(fs_info, NULL)) + goto free_tms; + + if (tm_list) + ret = tree_mod_log_free_eb(fs_info, tm_list, nritems); + if (!ret) + ret = tree_mod_log_insert(fs_info, tm); + + write_unlock(&fs_info->tree_mod_log_lock); + if (ret) + goto free_tms; + kfree(tm_list); + + return ret; + +free_tms: + if (tm_list) { + for (i = 0; i < nritems; i++) + kfree(tm_list[i]); + kfree(tm_list); + } + kfree(tm); + + return ret; +} + +static struct tree_mod_elem *__tree_mod_log_search(struct btrfs_fs_info *fs_info, + u64 start, u64 min_seq, + int smallest) +{ + struct rb_root *tm_root; + struct rb_node *node; + struct tree_mod_elem *cur = NULL; + struct tree_mod_elem *found = NULL; + + read_lock(&fs_info->tree_mod_log_lock); + tm_root = &fs_info->tree_mod_log; + node = tm_root->rb_node; + while (node) { + cur = rb_entry(node, struct tree_mod_elem, node); + if (cur->logical < start) { + node = node->rb_left; + } else if (cur->logical > start) { + node = node->rb_right; + } else if (cur->seq < min_seq) { + node = node->rb_left; + } else if (!smallest) { + /* we want the node with the highest seq */ + if (found) + BUG_ON(found->seq > cur->seq); + found = cur; + node = node->rb_left; + } else if (cur->seq > min_seq) { + /* we want the node with the smallest seq */ + if (found) + BUG_ON(found->seq < cur->seq); + found = cur; + node = node->rb_right; + } else { + found = cur; + break; + } + } + read_unlock(&fs_info->tree_mod_log_lock); + + return found; +} + +/* + * This returns the element from the log with the smallest time sequence + * value that's in the log (the oldest log item). any element with a time + * sequence lower than min_seq will be ignored. + */ +static struct tree_mod_elem *tree_mod_log_search_oldest(struct btrfs_fs_info *fs_info, + u64 start, u64 min_seq) +{ + return __tree_mod_log_search(fs_info, start, min_seq, 1); +} + +/* + * This returns the element from the log with the largest time sequence + * value that's in the log (the most recent log item). any element with + * a time sequence lower than min_seq will be ignored. + */ +static struct tree_mod_elem *tree_mod_log_search(struct btrfs_fs_info *fs_info, + u64 start, u64 min_seq) +{ + return __tree_mod_log_search(fs_info, start, min_seq, 0); +} + + +int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst, + struct extent_buffer *src, + unsigned long dst_offset, + unsigned long src_offset, + int nr_items) +{ + struct btrfs_fs_info *fs_info = dst->fs_info; + int ret = 0; + struct tree_mod_elem **tm_list = NULL; + struct tree_mod_elem **tm_list_add, **tm_list_rem; + int i; + int locked = 0; + + if (!tree_mod_need_log(fs_info, NULL)) + return 0; + + if (btrfs_header_level(dst) == 0 && btrfs_header_level(src) == 0) + return 0; + + tm_list = kcalloc(nr_items * 2, sizeof(struct tree_mod_elem *), + GFP_NOFS); + if (!tm_list) + return -ENOMEM; + + tm_list_add = tm_list; + tm_list_rem = tm_list + nr_items; + for (i = 0; i < nr_items; i++) { + tm_list_rem[i] = alloc_tree_mod_elem(src, i + src_offset, + BTRFS_MOD_LOG_KEY_REMOVE, GFP_NOFS); + if (!tm_list_rem[i]) { + ret = -ENOMEM; + goto free_tms; + } + + tm_list_add[i] = alloc_tree_mod_elem(dst, i + dst_offset, + BTRFS_MOD_LOG_KEY_ADD, GFP_NOFS); + if (!tm_list_add[i]) { + ret = -ENOMEM; + goto free_tms; + } + } + + if (tree_mod_dont_log(fs_info, NULL)) + goto free_tms; + locked = 1; + + for (i = 0; i < nr_items; i++) { + ret = tree_mod_log_insert(fs_info, tm_list_rem[i]); + if (ret) + goto free_tms; + ret = tree_mod_log_insert(fs_info, tm_list_add[i]); + if (ret) + goto free_tms; + } + + write_unlock(&fs_info->tree_mod_log_lock); + kfree(tm_list); + + return 0; + +free_tms: + for (i = 0; i < nr_items * 2; i++) { + if (tm_list[i] && !RB_EMPTY_NODE(&tm_list[i]->node)) + rb_erase(&tm_list[i]->node, &fs_info->tree_mod_log); + kfree(tm_list[i]); + } + if (locked) + write_unlock(&fs_info->tree_mod_log_lock); + kfree(tm_list); + + return ret; +} + +int btrfs_tree_mod_log_free_eb(struct extent_buffer *eb) +{ + struct tree_mod_elem **tm_list = NULL; + int nritems = 0; + int i; + int ret = 0; + + if (btrfs_header_level(eb) == 0) + return 0; + + if (!tree_mod_need_log(eb->fs_info, NULL)) + return 0; + + nritems = btrfs_header_nritems(eb); + tm_list = kcalloc(nritems, sizeof(struct tree_mod_elem *), GFP_NOFS); + if (!tm_list) + return -ENOMEM; + + for (i = 0; i < nritems; i++) { + tm_list[i] = alloc_tree_mod_elem(eb, i, + BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING, GFP_NOFS); + if (!tm_list[i]) { + ret = -ENOMEM; + goto free_tms; + } + } + + if (tree_mod_dont_log(eb->fs_info, eb)) + goto free_tms; + + ret = tree_mod_log_free_eb(eb->fs_info, tm_list, nritems); + write_unlock(&eb->fs_info->tree_mod_log_lock); + if (ret) + goto free_tms; + kfree(tm_list); + + return 0; + +free_tms: + for (i = 0; i < nritems; i++) + kfree(tm_list[i]); + kfree(tm_list); + + return ret; +} + +/* + * Returns the logical address of the oldest predecessor of the given root. + * entries older than time_seq are ignored. + */ +static struct tree_mod_elem *tree_mod_log_oldest_root(struct extent_buffer *eb_root, + u64 time_seq) +{ + struct tree_mod_elem *tm; + struct tree_mod_elem *found = NULL; + u64 root_logical = eb_root->start; + int looped = 0; + + if (!time_seq) + return NULL; + + /* + * the very last operation that's logged for a root is the + * replacement operation (if it is replaced at all). this has + * the logical address of the *new* root, making it the very + * first operation that's logged for this root. + */ + while (1) { + tm = tree_mod_log_search_oldest(eb_root->fs_info, root_logical, + time_seq); + if (!looped && !tm) + return NULL; + /* + * if there are no tree operation for the oldest root, we simply + * return it. this should only happen if that (old) root is at + * level 0. + */ + if (!tm) + break; + + /* + * if there's an operation that's not a root replacement, we + * found the oldest version of our root. normally, we'll find a + * BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING operation here. + */ + if (tm->op != BTRFS_MOD_LOG_ROOT_REPLACE) + break; + + found = tm; + root_logical = tm->old_root.logical; + looped = 1; + } + + /* if there's no old root to return, return what we found instead */ + if (!found) + found = tm; + + return found; +} + + +/* + * tm is a pointer to the first operation to rewind within eb. then, all + * previous operations will be rewound (until we reach something older than + * time_seq). + */ +static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info, + struct extent_buffer *eb, + u64 time_seq, + struct tree_mod_elem *first_tm) +{ + u32 n; + struct rb_node *next; + struct tree_mod_elem *tm = first_tm; + unsigned long o_dst; + unsigned long o_src; + unsigned long p_size = sizeof(struct btrfs_key_ptr); + + n = btrfs_header_nritems(eb); + read_lock(&fs_info->tree_mod_log_lock); + while (tm && tm->seq >= time_seq) { + /* + * all the operations are recorded with the operator used for + * the modification. as we're going backwards, we do the + * opposite of each operation here. + */ + switch (tm->op) { + case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING: + BUG_ON(tm->slot < n); + fallthrough; + case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_MOVING: + case BTRFS_MOD_LOG_KEY_REMOVE: + btrfs_set_node_key(eb, &tm->key, tm->slot); + btrfs_set_node_blockptr(eb, tm->slot, tm->blockptr); + btrfs_set_node_ptr_generation(eb, tm->slot, + tm->generation); + n++; + break; + case BTRFS_MOD_LOG_KEY_REPLACE: + BUG_ON(tm->slot >= n); + btrfs_set_node_key(eb, &tm->key, tm->slot); + btrfs_set_node_blockptr(eb, tm->slot, tm->blockptr); + btrfs_set_node_ptr_generation(eb, tm->slot, + tm->generation); + break; + case BTRFS_MOD_LOG_KEY_ADD: + /* if a move operation is needed it's in the log */ + n--; + break; + case BTRFS_MOD_LOG_MOVE_KEYS: + o_dst = btrfs_node_key_ptr_offset(tm->slot); + o_src = btrfs_node_key_ptr_offset(tm->move.dst_slot); + memmove_extent_buffer(eb, o_dst, o_src, + tm->move.nr_items * p_size); + break; + case BTRFS_MOD_LOG_ROOT_REPLACE: + /* + * this operation is special. for roots, this must be + * handled explicitly before rewinding. + * for non-roots, this operation may exist if the node + * was a root: root A -> child B; then A gets empty and + * B is promoted to the new root. in the mod log, we'll + * have a root-replace operation for B, a tree block + * that is no root. we simply ignore that operation. + */ + break; + } + next = rb_next(&tm->node); + if (!next) + break; + tm = rb_entry(next, struct tree_mod_elem, node); + if (tm->logical != first_tm->logical) + break; + } + read_unlock(&fs_info->tree_mod_log_lock); + btrfs_set_header_nritems(eb, n); +} + + +/* + * Called with eb read locked. If the buffer cannot be rewound, the same buffer + * is returned. If rewind operations happen, a fresh buffer is returned. The + * returned buffer is always read-locked. If the returned buffer is not the + * input buffer, the lock on the input buffer is released and the input buffer + * is freed (its refcount is decremented). + */ +struct extent_buffer *btrfs_tree_mod_log_rewind(struct btrfs_fs_info *fs_info, + struct btrfs_path *path, + struct extent_buffer *eb, + u64 time_seq) +{ + struct extent_buffer *eb_rewin; + struct tree_mod_elem *tm; + + if (!time_seq) + return eb; + + if (btrfs_header_level(eb) == 0) + return eb; + + tm = tree_mod_log_search(fs_info, eb->start, time_seq); + if (!tm) + return eb; + + if (tm->op == BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING) { + BUG_ON(tm->slot != 0); + eb_rewin = alloc_dummy_extent_buffer(fs_info, eb->start); + if (!eb_rewin) { + btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + return NULL; + } + btrfs_set_header_bytenr(eb_rewin, eb->start); + btrfs_set_header_backref_rev(eb_rewin, + btrfs_header_backref_rev(eb)); + btrfs_set_header_owner(eb_rewin, btrfs_header_owner(eb)); + btrfs_set_header_level(eb_rewin, btrfs_header_level(eb)); + } else { + eb_rewin = btrfs_clone_extent_buffer(eb); + if (!eb_rewin) { + btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + return NULL; + } + } + + btrfs_tree_read_unlock(eb); + free_extent_buffer(eb); + + btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb_rewin), + eb_rewin, btrfs_header_level(eb_rewin)); + btrfs_tree_read_lock(eb_rewin); + tree_mod_log_rewind(fs_info, eb_rewin, time_seq, tm); + WARN_ON(btrfs_header_nritems(eb_rewin) > + BTRFS_NODEPTRS_PER_BLOCK(fs_info)); + + return eb_rewin; +} + +/* + * Rewinds the state of @root's root node to the given @time_seq value. + * If there are no changes, the current root->root_node is returned. If anything + * changed in between, there's a fresh buffer allocated on which the rewind + * operations are done. In any case, the returned buffer is read locked. + * Returns NULL on error (with no locks held). + */ +struct extent_buffer *btrfs_get_old_root(struct btrfs_root *root, u64 time_seq) +{ + struct btrfs_fs_info *fs_info = root->fs_info; + struct tree_mod_elem *tm; + struct extent_buffer *eb = NULL; + struct extent_buffer *eb_root; + u64 eb_root_owner = 0; + struct extent_buffer *old; + struct tree_mod_root *old_root = NULL; + u64 old_generation = 0; + u64 logical; + int level; + + eb_root = btrfs_read_lock_root_node(root); + tm = tree_mod_log_oldest_root(eb_root, time_seq); + if (!tm) + return eb_root; + + if (tm->op == BTRFS_MOD_LOG_ROOT_REPLACE) { + old_root = &tm->old_root; + old_generation = tm->generation; + logical = old_root->logical; + level = old_root->level; + } else { + logical = eb_root->start; + level = btrfs_header_level(eb_root); + } + + tm = tree_mod_log_search(fs_info, logical, time_seq); + if (old_root && tm && tm->op != BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING) { + btrfs_tree_read_unlock(eb_root); + free_extent_buffer(eb_root); + old = read_tree_block(fs_info, logical, root->root_key.objectid, + 0, level, NULL); + if (WARN_ON(IS_ERR(old) || !extent_buffer_uptodate(old))) { + if (!IS_ERR(old)) + free_extent_buffer(old); + btrfs_warn(fs_info, + "failed to read tree block %llu from get_old_root", + logical); + } else { + btrfs_tree_read_lock(old); + eb = btrfs_clone_extent_buffer(old); + btrfs_tree_read_unlock(old); + free_extent_buffer(old); + } + } else if (old_root) { + eb_root_owner = btrfs_header_owner(eb_root); + btrfs_tree_read_unlock(eb_root); + free_extent_buffer(eb_root); + eb = alloc_dummy_extent_buffer(fs_info, logical); + } else { + eb = btrfs_clone_extent_buffer(eb_root); + btrfs_tree_read_unlock(eb_root); + free_extent_buffer(eb_root); + } + + if (!eb) + return NULL; + if (old_root) { + btrfs_set_header_bytenr(eb, eb->start); + btrfs_set_header_backref_rev(eb, BTRFS_MIXED_BACKREF_REV); + btrfs_set_header_owner(eb, eb_root_owner); + btrfs_set_header_level(eb, old_root->level); + btrfs_set_header_generation(eb, old_generation); + } + btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb), eb, + btrfs_header_level(eb)); + btrfs_tree_read_lock(eb); + if (tm) + tree_mod_log_rewind(fs_info, eb, time_seq, tm); + else + WARN_ON(btrfs_header_level(eb) != 0); + WARN_ON(btrfs_header_nritems(eb) > BTRFS_NODEPTRS_PER_BLOCK(fs_info)); + + return eb; +} + +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq) +{ + struct tree_mod_elem *tm; + int level; + struct extent_buffer *eb_root = btrfs_root_node(root); + + tm = tree_mod_log_oldest_root(eb_root, time_seq); + if (tm && tm->op == BTRFS_MOD_LOG_ROOT_REPLACE) + level = tm->old_root.level; + else + level = btrfs_header_level(eb_root); + + free_extent_buffer(eb_root); + + return level; +} + diff --git a/fs/btrfs/tree-mod-log.h b/fs/btrfs/tree-mod-log.h new file mode 100644 index 000000000000..15e7a4c4bb14 --- /dev/null +++ b/fs/btrfs/tree-mod-log.h @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 + +#ifndef BTRFS_TREE_MOD_LOG_H +#define BTRFS_TREE_MOD_LOG_H + +#include "ctree.h" + +/* Represents a tree mod log user. */ +struct btrfs_seq_list { + struct list_head list; + u64 seq; +}; + +#define BTRFS_SEQ_LIST_INIT(name) { .list = LIST_HEAD_INIT((name).list), .seq = 0 } +#define BTRFS_SEQ_LAST ((u64)-1) + +enum btrfs_mod_log_op { + BTRFS_MOD_LOG_KEY_REPLACE, + BTRFS_MOD_LOG_KEY_ADD, + BTRFS_MOD_LOG_KEY_REMOVE, + BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING, + BTRFS_MOD_LOG_KEY_REMOVE_WHILE_MOVING, + BTRFS_MOD_LOG_MOVE_KEYS, + BTRFS_MOD_LOG_ROOT_REPLACE, +}; + +u64 btrfs_get_tree_mod_seq(struct btrfs_fs_info *fs_info, + struct btrfs_seq_list *elem); +void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, + struct btrfs_seq_list *elem); +int btrfs_tree_mod_log_insert_root(struct extent_buffer *old_root, + struct extent_buffer *new_root, + int log_removal); +int btrfs_tree_mod_log_insert_key(struct extent_buffer *eb, int slot, + enum btrfs_mod_log_op op, gfp_t flags); +int btrfs_tree_mod_log_free_eb(struct extent_buffer *eb); +struct extent_buffer *btrfs_tree_mod_log_rewind(struct btrfs_fs_info *fs_info, + struct btrfs_path *path, + struct extent_buffer *eb, + u64 time_seq); +struct extent_buffer *btrfs_get_old_root(struct btrfs_root *root, u64 time_seq); +int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq); +int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst, + struct extent_buffer *src, + unsigned long dst_offset, + unsigned long src_offset, + int nr_items); +int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb, + int dst_slot, int src_slot, + int nr_items); + +#endif /* BTRFS_TREE_MOD_LOG_H */ From patchwork Thu Mar 11 14:31:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131553 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1F87C4332E for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7B13364FE9 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233913AbhCKOb6 (ORCPT ); Thu, 11 Mar 2021 09:31:58 -0500 Received: from mail.kernel.org ([198.145.29.99]:57006 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233679AbhCKOb2 (ORCPT ); Thu, 11 Mar 2021 09:31:28 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id EC94A64FE5; Thu, 11 Mar 2021 14:31:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473087; bh=VstjVMeExBqD1S35M9kDaLcww9nD/ofjVwDjEkusemE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uVpwAWxOR6N/buaP9r6RX5Y2Bu8e4q51Gj043hkiwxYEiVwvo0Tq4t2tj1jiXNjGq 24RskuwHHbhzzGQQ2RdE5Taok6ZTgfefpYNwGDjtxY+lsqDww8l1xauFROYIRu8ZLT +GqflRcqCdKIE0ZK4k2h9K3df3fXi48kiXMpRfrrs/JGmY5MeCEYAny60j2Arhsbi/ YH3nOtUDdZ30Kr+lQ8pSkFebNVqg8YgfR5wY2u/fK3c9ZMAh9fOJ9Ai6JVdZ3JaRPl He+lIrSM/rvr8TKZ9LeYYpoG+ifyonQDSJuq+X1bC63w3IEymm1EGpMOEpkhJegmnm CKGjPG0wCzMTA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 4/9] btrfs: use booleans where appropriate for the tree mod log functions Date: Thu, 11 Mar 2021 14:31:08 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Several functions of the tree modification log use integers as booleans, so change them to use booleans instead, making their use more clear. Signed-off-by: Filipe Manana --- fs/btrfs/ctree.c | 6 +++--- fs/btrfs/tree-mod-log.c | 42 ++++++++++++++++++++--------------------- fs/btrfs/tree-mod-log.h | 2 +- 3 files changed, 25 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 2c6e525b3458..1b0f28f69cb3 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -500,7 +500,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, parent_start = buf->start; atomic_inc(&cow->refs); - ret = btrfs_tree_mod_log_insert_root(root->node, cow, 1); + ret = btrfs_tree_mod_log_insert_root(root->node, cow, true); BUG_ON(ret < 0); rcu_assign_pointer(root->node, cow); @@ -958,7 +958,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, goto enospc; } - ret = btrfs_tree_mod_log_insert_root(root->node, child, 1); + ret = btrfs_tree_mod_log_insert_root(root->node, child, true); BUG_ON(ret < 0); rcu_assign_pointer(root->node, child); @@ -2480,7 +2480,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans, btrfs_mark_buffer_dirty(c); old = root->node; - ret = btrfs_tree_mod_log_insert_root(root->node, c, 0); + ret = btrfs_tree_mod_log_insert_root(root->node, c, false); BUG_ON(ret < 0); rcu_assign_pointer(root->node, c); diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c index fc7eb33ec383..f2a6da1b2903 100644 --- a/fs/btrfs/tree-mod-log.c +++ b/fs/btrfs/tree-mod-log.c @@ -158,40 +158,40 @@ static noinline int tree_mod_log_insert(struct btrfs_fs_info *fs_info, } /* - * Determines if logging can be omitted. Returns 1 if it can. Otherwise, it - * returns zero with the tree_mod_log_lock acquired. The caller must hold + * Determines if logging can be omitted. Returns true if it can. Otherwise, it + * returns false with the tree_mod_log_lock acquired. The caller must hold * this until all tree mod log insertions are recorded in the rb tree and then * write unlock fs_info::tree_mod_log_lock. */ -static inline int tree_mod_dont_log(struct btrfs_fs_info *fs_info, +static inline bool tree_mod_dont_log(struct btrfs_fs_info *fs_info, struct extent_buffer *eb) { smp_mb(); if (list_empty(&(fs_info)->tree_mod_seq_list)) - return 1; + return true; if (eb && btrfs_header_level(eb) == 0) - return 1; + return true; write_lock(&fs_info->tree_mod_log_lock); if (list_empty(&(fs_info)->tree_mod_seq_list)) { write_unlock(&fs_info->tree_mod_log_lock); - return 1; + return true; } - return 0; + return false; } /* Similar to tree_mod_dont_log, but doesn't acquire any locks. */ -static inline int tree_mod_need_log(const struct btrfs_fs_info *fs_info, +static inline bool tree_mod_need_log(const struct btrfs_fs_info *fs_info, struct extent_buffer *eb) { smp_mb(); if (list_empty(&(fs_info)->tree_mod_seq_list)) - return 0; + return false; if (eb && btrfs_header_level(eb) == 0) - return 0; + return false; - return 1; + return true; } static struct tree_mod_elem *alloc_tree_mod_elem(struct extent_buffer *eb, @@ -252,7 +252,7 @@ int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb, struct tree_mod_elem **tm_list = NULL; int ret = 0; int i; - int locked = 0; + bool locked = false; if (!tree_mod_need_log(eb->fs_info, eb)) return 0; @@ -284,7 +284,7 @@ int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb, if (tree_mod_dont_log(eb->fs_info, eb)) goto free_tms; - locked = 1; + locked = true; /* * When we override something during the move, we log these removals. @@ -340,7 +340,7 @@ static inline int tree_mod_log_free_eb(struct btrfs_fs_info *fs_info, int btrfs_tree_mod_log_insert_root(struct extent_buffer *old_root, struct extent_buffer *new_root, - int log_removal) + bool log_removal) { struct btrfs_fs_info *fs_info = old_root->fs_info; struct tree_mod_elem *tm = NULL; @@ -410,7 +410,7 @@ int btrfs_tree_mod_log_insert_root(struct extent_buffer *old_root, static struct tree_mod_elem *__tree_mod_log_search(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq, - int smallest) + bool smallest) { struct rb_root *tm_root; struct rb_node *node; @@ -458,7 +458,7 @@ static struct tree_mod_elem *__tree_mod_log_search(struct btrfs_fs_info *fs_info static struct tree_mod_elem *tree_mod_log_search_oldest(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq) { - return __tree_mod_log_search(fs_info, start, min_seq, 1); + return __tree_mod_log_search(fs_info, start, min_seq, true); } /* @@ -469,7 +469,7 @@ static struct tree_mod_elem *tree_mod_log_search_oldest(struct btrfs_fs_info *fs static struct tree_mod_elem *tree_mod_log_search(struct btrfs_fs_info *fs_info, u64 start, u64 min_seq) { - return __tree_mod_log_search(fs_info, start, min_seq, 0); + return __tree_mod_log_search(fs_info, start, min_seq, false); } @@ -484,7 +484,7 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst, struct tree_mod_elem **tm_list = NULL; struct tree_mod_elem **tm_list_add, **tm_list_rem; int i; - int locked = 0; + bool locked = false; if (!tree_mod_need_log(fs_info, NULL)) return 0; @@ -517,7 +517,7 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst, if (tree_mod_dont_log(fs_info, NULL)) goto free_tms; - locked = 1; + locked = true; for (i = 0; i < nr_items; i++) { ret = tree_mod_log_insert(fs_info, tm_list_rem[i]); @@ -602,7 +602,7 @@ static struct tree_mod_elem *tree_mod_log_oldest_root(struct extent_buffer *eb_r struct tree_mod_elem *tm; struct tree_mod_elem *found = NULL; u64 root_logical = eb_root->start; - int looped = 0; + bool looped = false; if (!time_seq) return NULL; @@ -636,7 +636,7 @@ static struct tree_mod_elem *tree_mod_log_oldest_root(struct extent_buffer *eb_r found = tm; root_logical = tm->old_root.logical; - looped = 1; + looped = true; } /* if there's no old root to return, return what we found instead */ diff --git a/fs/btrfs/tree-mod-log.h b/fs/btrfs/tree-mod-log.h index 15e7a4c4bb14..2f7396ea57ae 100644 --- a/fs/btrfs/tree-mod-log.h +++ b/fs/btrfs/tree-mod-log.h @@ -30,7 +30,7 @@ void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, struct btrfs_seq_list *elem); int btrfs_tree_mod_log_insert_root(struct extent_buffer *old_root, struct extent_buffer *new_root, - int log_removal); + bool log_removal); int btrfs_tree_mod_log_insert_key(struct extent_buffer *eb, int slot, enum btrfs_mod_log_op op, gfp_t flags); int btrfs_tree_mod_log_free_eb(struct extent_buffer *eb); From patchwork Thu Mar 11 14:31:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131551 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A30DFC4332D for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5904A64FEE for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233930AbhCKOb7 (ORCPT ); Thu, 11 Mar 2021 09:31:59 -0500 Received: from mail.kernel.org ([198.145.29.99]:57012 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233906AbhCKOb3 (ORCPT ); Thu, 11 Mar 2021 09:31:29 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 2CE1E64FD6; Thu, 11 Mar 2021 14:31:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473089; bh=qz1ZVBbatiRI2o/5VInRcVgXpTEjFJ8/El+flUY9fTM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LUSu8Mbo9x8+1mceh3fl22OMjKxhs6lAFYwhhVxqPjenrVM7CHgba7y+QIMQ+YIun NCLPAJcCtfNcFjRigs0TuH2FnP8sJNEozYOl3qRqsqi9CWeCuY6u8LcmdwmNNYcIxe SdwjrlAecKJQkARxcEqjL2kneaicLW9Ar9U5PNg6BVmkP1p10m36ZuAdVCooG2Jyuy 6tkmfF4UtG92CmlvSpV7PmTl53SX1cB2GbUc8ApDfQ1HlbZ2AVSwhOS7PV3dKDYMcn 1HcAhqtxFThYRKSXT9Yb+NwhQ6+YryLP7hmqEDPRoAuXNh2aFdsiO/WTVf/raDzGU/ fX3GZPrW8DnDw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 5/9] btrfs: use a bit to track the existence of tree mod log users Date: Thu, 11 Mar 2021 14:31:09 +0000 Message-Id: <08ccdf842ceb0e05bccbdd087b9ef48efc4144db.1615472583.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The tree modification log functions are called very frequently, basically they are called everytime a btree is modified (a pointer added or removed to a node, a new root for a btree is set, etc). Because of that, to avoid heavy lock contention on the lock that protects the list of tree mod log users, we have checks that test the emptiness of the list with a full memory barrier before the checks, so that when there are no tree mod log users we avoid taking the lock. Replace the memory barrier and list emptiness check with a test for a new bit set at fs_info->flags. This bit is used to indicate when there are tree mod log users, set whenever a user is added to the list and cleared when the last user is removed from the list. This makes the intention a bit more obvious and possibly more efficient (assuming test_bit() may be cheaper than a full memory barrier on some architectures). Signed-off-by: Filipe Manana --- fs/btrfs/ctree.h | 3 +++ fs/btrfs/tree-mod-log.c | 11 ++++++----- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 50208673bd55..90184ee2f832 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -580,6 +580,9 @@ enum { /* Indicate that we can't trust the free space tree for caching yet */ BTRFS_FS_FREE_SPACE_TREE_UNTRUSTED, + + /* Indicate whether there are any tree modification log users. */ + BTRFS_FS_TREE_MOD_LOG_USERS, }; /* diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c index f2a6da1b2903..b912b82c36c9 100644 --- a/fs/btrfs/tree-mod-log.c +++ b/fs/btrfs/tree-mod-log.c @@ -60,6 +60,7 @@ u64 btrfs_get_tree_mod_seq(struct btrfs_fs_info *fs_info, if (!elem->seq) { elem->seq = btrfs_inc_tree_mod_seq(fs_info); list_add_tail(&elem->list, &fs_info->tree_mod_seq_list); + set_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags); } write_unlock(&fs_info->tree_mod_log_lock); @@ -83,7 +84,9 @@ void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info, list_del(&elem->list); elem->seq = 0; - if (!list_empty(&fs_info->tree_mod_seq_list)) { + if (list_empty(&fs_info->tree_mod_seq_list)) { + clear_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags); + } else { struct btrfs_seq_list *first; first = list_first_entry(&fs_info->tree_mod_seq_list, @@ -166,8 +169,7 @@ static noinline int tree_mod_log_insert(struct btrfs_fs_info *fs_info, static inline bool tree_mod_dont_log(struct btrfs_fs_info *fs_info, struct extent_buffer *eb) { - smp_mb(); - if (list_empty(&(fs_info)->tree_mod_seq_list)) + if (!test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)) return true; if (eb && btrfs_header_level(eb) == 0) return true; @@ -185,8 +187,7 @@ static inline bool tree_mod_dont_log(struct btrfs_fs_info *fs_info, static inline bool tree_mod_need_log(const struct btrfs_fs_info *fs_info, struct extent_buffer *eb) { - smp_mb(); - if (list_empty(&(fs_info)->tree_mod_seq_list)) + if (!test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)) return false; if (eb && btrfs_header_level(eb) == 0) return false; From patchwork Thu Mar 11 14:31:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A8E9C43381 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2D78C64FE9 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233906AbhCKOcA (ORCPT ); Thu, 11 Mar 2021 09:32:00 -0500 Received: from mail.kernel.org ([198.145.29.99]:57018 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233825AbhCKOba (ORCPT ); Thu, 11 Mar 2021 09:31:30 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 6709264FEC; Thu, 11 Mar 2021 14:31:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473090; bh=8CcoXdh4WpINBazOzmK5xDypaQ3NQpOm14zerdAGALg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eDXG/K29yXVY0u+JZd7CzXE+ncXTAWJcdSb6upaWMV8/qwwQ3NO9lo6HM3gdK3t0L N7qHOHv9QyIuT2aTNBUtWqv0asNgOlEV0brGt7Bmg4PUgF0t6AKHkww8pRCnSm49K/ HEKosQk31TdYeqjJtTZpMzPSnZzwpY9oIABOcouyNkxLVb8iomu7deUH4ZGEFkQgm4 9tpwxMV+DRWnXpoM8T946pfiB48i25AjBLO+qtiJgD8C76I70Twr2oXmOg/MFRFRTN gtDSV9OgiilcnYV1hsvo9DGon4SVlgCCKdVh4XLwLmEkk9JJSREJdJ9sHID2I9Ijnd 3WF700EIs6nkA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 6/9] btrfs: use the new bit BTRFS_FS_TREE_MOD_LOG_USERS at btrfs_free_tree_block() Date: Thu, 11 Mar 2021 14:31:10 +0000 Message-Id: <8b318da95fb03892feeaeebc94b6efa11515bb56.1615472583.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Instead of exposing implementation details of the tree mod log to check if there are active tree mod log users at btrfs_free_tree_block(), use the new bit BTRFS_FS_TREE_MOD_LOG_USERS for fs_info->flags instead. This way extent-tree.c does not need to known about any of the internals of the tree mod log and avoids taking a lock unnecessarily as well. Signed-off-by: Filipe Manana --- fs/btrfs/extent-tree.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 2482b26b1971..7a28314189b4 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3342,11 +3342,9 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, * find a node pointing to this leaf and record operations that * point to this leaf. */ - if (btrfs_header_level(buf) == 0) { - read_lock(&fs_info->tree_mod_log_lock); - must_pin = !list_empty(&fs_info->tree_mod_seq_list); - read_unlock(&fs_info->tree_mod_log_lock); - } + if (btrfs_header_level(buf) == 0 && + test_bit(BTRFS_FS_TREE_MOD_LOG_USERS, &fs_info->flags)) + must_pin = true; if (must_pin || btrfs_is_zoned(fs_info)) { btrfs_redirty_list_add(trans->transaction, buf); From patchwork Thu Mar 11 14:31:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131555 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD570C4332B for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9F0F564FEB for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233934AbhCKOcB (ORCPT ); Thu, 11 Mar 2021 09:32:01 -0500 Received: from mail.kernel.org ([198.145.29.99]:57024 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233925AbhCKObb (ORCPT ); Thu, 11 Mar 2021 09:31:31 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9B9F264FE8; Thu, 11 Mar 2021 14:31:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473091; bh=xU6JgFTKD2WIWP3hEJz56Cszs6XlV1Zs6kx8WnSyN20=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hG7kGRqDR7HYQSkH16tpsXAouKD5jZXWD1y6ojcjlxfTRY+l2C7v6+HtjJsGt9Woi OGLhEg4m9WC9JcIuJeVIY5dLMUn0JeTNJJzWaUHAYlxjBOQSCDaGwE5iMy4iLg36UJ kNDGHHrc+3BFarRirxtd9fOieys24ZvDX9Q+ZEcNermE9oxsRHach4gx0MCRnIfx0O cIdt9e6SG5+ygRiYUcJb9dtJ62QUGvGg6Xp8aTl0SijHgBtrfGapR07xVBDAr5dPKo q/nohbsjM9kpSMP7c7ncFamhrv0zOFC9p9P5AKcACmHiGmNDGDS0uC4KBh81xKnd7i E4QIw+yP+IZIg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 7/9] btrfs: remove unnecessary leaf check at btrfs_tree_mod_log_free_eb() Date: Thu, 11 Mar 2021 14:31:11 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At btrfs_tree_mod_log_free_eb() we check if we are dealing with a leaf, and if so, return immediately and do nothing. However this check can be removed, because after it we call tree_mod_need_log(), which returns false when given an extent buffer that corresponds to a leaf. So just remove the leaf check and pass the extent buffer to tree_mod_need_log(). Signed-off-by: Filipe Manana --- fs/btrfs/tree-mod-log.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c index b912b82c36c9..c854cebf27f6 100644 --- a/fs/btrfs/tree-mod-log.c +++ b/fs/btrfs/tree-mod-log.c @@ -554,10 +554,7 @@ int btrfs_tree_mod_log_free_eb(struct extent_buffer *eb) int i; int ret = 0; - if (btrfs_header_level(eb) == 0) - return 0; - - if (!tree_mod_need_log(eb->fs_info, NULL)) + if (!tree_mod_need_log(eb->fs_info, eb)) return 0; nritems = btrfs_header_nritems(eb); From patchwork Thu Mar 11 14:31:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0D31C43332 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C2D7E64FEE for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233825AbhCKOcA (ORCPT ); Thu, 11 Mar 2021 09:32:00 -0500 Received: from mail.kernel.org ([198.145.29.99]:57030 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233934AbhCKObd (ORCPT ); Thu, 11 Mar 2021 09:31:33 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id D145C64FE2; Thu, 11 Mar 2021 14:31:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473092; bh=1sLxWq+aZf5wLlrnsk/Uo8KzAsj6NAr40yldOC3ZIiA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HYSrgSwtAF73DhvFDefTD8Wxb0ZQl/XxdvFx7vyRzTfkRWJT+VtyCnUN4mVWcE+9x KbL2B9ZWLLf17D1JH59PMWv4YLXEu+66WTskJl/feXpmrcjA4Y74XTF7Q+7mxaA9Q8 2qeoyUwtesho7ZR6P3CgoIpf+RGDHfR95gB6bTfDvjOsK82fRzH+bp/k9v3oD6/G5C ozgsyAM6+Qv8+vePCzS58S8EH2h7YLI15novQdze0rpgKUocfJZRdU4P6kClO1gjGO L73fGxBV9469wgdai/gmPsmNM0k7FfvA6vF37dYICDkupoIxIQadAcopjsGyJVuJ/r KOJGnhWyTzbuw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 8/9] btrfs: add and use helper to get lowest sequence number for the tree mod log Date: Thu, 11 Mar 2021 14:31:12 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana There are two places outside the tree mod log module that extract the lowest sequence number of the tree mod log. These places end up duplicating code and open coding the logic and internal implementation details of the tree mod log. So add a helper to the tree mod log module and header that returns the lowest sequence number or 0 if there aren't any tree mod log users at the moment. Signed-off-by: Filipe Manana --- fs/btrfs/delayed-ref.c | 31 ++++++++----------------------- fs/btrfs/tree-mod-log.c | 25 +++++++++++++++++++++++++ fs/btrfs/tree-mod-log.h | 1 + 3 files changed, 34 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index d87472a68bf4..7a3131cbf1fb 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -495,16 +495,7 @@ void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, if (head->is_data) return; - read_lock(&fs_info->tree_mod_log_lock); - if (!list_empty(&fs_info->tree_mod_seq_list)) { - struct btrfs_seq_list *elem; - - elem = list_first_entry(&fs_info->tree_mod_seq_list, - struct btrfs_seq_list, list); - seq = elem->seq; - } - read_unlock(&fs_info->tree_mod_log_lock); - + seq = btrfs_tree_mod_log_lowest_seq(fs_info); again: for (node = rb_first_cached(&head->ref_tree); node; node = rb_next(node)) { @@ -518,23 +509,17 @@ void btrfs_merge_delayed_refs(struct btrfs_trans_handle *trans, int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info, u64 seq) { - struct btrfs_seq_list *elem; + u64 min_seq = btrfs_tree_mod_log_lowest_seq(fs_info); int ret = 0; - read_lock(&fs_info->tree_mod_log_lock); - if (!list_empty(&fs_info->tree_mod_seq_list)) { - elem = list_first_entry(&fs_info->tree_mod_seq_list, - struct btrfs_seq_list, list); - if (seq >= elem->seq) { - btrfs_debug(fs_info, - "holding back delayed_ref %#x.%x, lowest is %#x.%x", - (u32)(seq >> 32), (u32)seq, - (u32)(elem->seq >> 32), (u32)elem->seq); - ret = 1; - } + if (min_seq != 0 && seq >= min_seq) { + btrfs_debug(fs_info, + "holding back delayed_ref %#x.%x, lowest is %#x.%x", + (u32)(seq >> 32), (u32)seq, + (u32)(min_seq >> 32), (u32)min_seq); + ret = 1; } - read_unlock(&fs_info->tree_mod_log_lock); return ret; } diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c index c854cebf27f6..b2083b575e21 100644 --- a/fs/btrfs/tree-mod-log.c +++ b/fs/btrfs/tree-mod-log.c @@ -885,3 +885,28 @@ int btrfs_old_root_level(struct btrfs_root *root, u64 time_seq) return level; } +/* + * Return the lowest sequence number in the tree modification log. + * + * @fs_info: the filesystem context + * + * Returns the sequence number of the oldest tree modification log user, which + * corresponds to the lowest sequence number of all existing users. If there are + * no users it returns 0. + */ +u64 btrfs_tree_mod_log_lowest_seq(struct btrfs_fs_info *fs_info) +{ + u64 ret = 0; + + read_lock(&fs_info->tree_mod_log_lock); + if (!list_empty(&fs_info->tree_mod_seq_list)) { + struct btrfs_seq_list *elem; + + elem = list_first_entry(&fs_info->tree_mod_seq_list, + struct btrfs_seq_list, list); + ret = elem->seq; + } + read_unlock(&fs_info->tree_mod_log_lock); + + return ret; +} diff --git a/fs/btrfs/tree-mod-log.h b/fs/btrfs/tree-mod-log.h index 2f7396ea57ae..d218f8a661bc 100644 --- a/fs/btrfs/tree-mod-log.h +++ b/fs/btrfs/tree-mod-log.h @@ -48,5 +48,6 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst, int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb, int dst_slot, int src_slot, int nr_items); +u64 btrfs_tree_mod_log_lowest_seq(struct btrfs_fs_info *fs_info); #endif /* BTRFS_TREE_MOD_LOG_H */ From patchwork Thu Mar 11 14:31:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 12131557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2889BC43331 for ; Thu, 11 Mar 2021 14:32:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F268964FF5 for ; Thu, 11 Mar 2021 14:32:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233925AbhCKOcB (ORCPT ); Thu, 11 Mar 2021 09:32:01 -0500 Received: from mail.kernel.org ([198.145.29.99]:57036 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233936AbhCKObe (ORCPT ); Thu, 11 Mar 2021 09:31:34 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1333F64FEB; Thu, 11 Mar 2021 14:31:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1615473093; bh=ziekMdAbJzsNFDCi2ZxRIQBbdARKcREVy4JnNc68k5U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WGLHH71Wq7S2T8v14GJXxAXhuPa7du+E4aX/2yxxPWLUq4CphIHyZo2Mq//QtT2fA Z+9eudWgnYXRPAYSAfwSKSHvRpnc662Zy/L1Xg/Sl6hMyvZ8UjF6flmsNDsWtECazd gR3re/T1Gd4EmY9ExLy/OlfO5kNupgiK8bs+9ZPoEZvggja/2M6SneA4HRte7kwMER c9csJNRutJcjR4wdqwmjjd6ZgRZ3hUmU//jMSrvUjwqQi8flwmZb1XFVljfrwkfkna s9F9j9q2JuR8jtba0dLu27G5MekDFSjTXv6uxDd5RVWe0Gzm90tVwTJBFXKloUZDoQ qblV9WXN3zPoA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Cc: ce3g8jdj@umail.furryterror.org, Filipe Manana Subject: [PATCH 9/9] btrfs: update debug message when checking seq number of a delayed ref Date: Thu, 11 Mar 2021 14:31:13 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana We used to encode two different numbers in the tree mod log counter used for sequence numbers, one in the upper 32 bits and the other one in the lower 32 bits. However that is no longer the case, we stopped doing that since commit fcebe4562dec83 ("Btrfs: rework qgroup accounting"). So update the debug message at btrfs_check_delayed_seq to stop extracting the two 32 bits counters and print instead the 64 bits sequence numbers. Signed-off-by: Filipe Manana --- fs/btrfs/delayed-ref.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index 7a3131cbf1fb..8c1d78befb5b 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -514,9 +514,8 @@ int btrfs_check_delayed_seq(struct btrfs_fs_info *fs_info, u64 seq) if (min_seq != 0 && seq >= min_seq) { btrfs_debug(fs_info, - "holding back delayed_ref %#x.%x, lowest is %#x.%x", - (u32)(seq >> 32), (u32)seq, - (u32)(min_seq >> 32), (u32)min_seq); + "holding back delayed_ref %llu, lowest is %llu", + seq, min_seq); ret = 1; }