From patchwork Wed Jan 11 11:36:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E3B4C46467 for ; Wed, 11 Jan 2023 11:39:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236083AbjAKLjY (ORCPT ); Wed, 11 Jan 2023 06:39:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237682AbjAKLjB (ORCPT ); Wed, 11 Jan 2023 06:39:01 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 766AFA465 for ; Wed, 11 Jan 2023 03:36:25 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0D12361C06 for ; Wed, 11 Jan 2023 11:36:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EFFE3C433D2 for ; Wed, 11 Jan 2023 11:36:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436984; bh=Xyzp+CKxgPWNP8tmtSZ7a9/ddn1gFU0DIwb1Qp8I6pM=; h=From:To:Subject:Date:In-Reply-To:References:From; b=ZhqQKjH8DOS23JS5+Yd2+Ntsdl6Kp9DJ0SyUGgvOrmY7ZThkqbydrf2rjbiKZRl8Z ViNX/WSFolxlDPY2ErfSUqtY02YEF7dBilUQiN0aqzKJl+fTKJYmyJnVMILF0WA/9l a2gpjV2WNuJ2Bin3bVH/ioCyVyr/PRS1TPmNzWpEIqVwZFLHHoqsE0y+xZCP2jtUFb wFvOLFYTwpRgs+RUIyDKsnxsioywtMKmbsqfrb4pe8W2MMXhvSq7/OCDTV9Lg31xyL zKBIbObFrBOdoCKjWLZdA2Kdzs8Ad1936po0ta9j23WIrCmrlgZXYfZUBJwYhDBd/7 tZdY0I9qYwucw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 01/19] btrfs: send: directly return from did_overwrite_ref() and simplify it Date: Wed, 11 Jan 2023 11:36:02 +0000 Message-Id: <06477426bff54ecc38f75e3a6845731bd9d4abba.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana There are no resources to release before did_overwrite_ref() returns, so we don't really need the 'out' label and jumping to it when conditions are met - we can directly return and get rid of the label and jumps. Also we can deal with -ENOENT and other errors in a single if-else logic, as it's more straightforward. This helps the next patch in the series to be more simple as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 43 ++++++++++++++++++------------------------- 1 file changed, 18 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index b90ad6f7219c..59106eb8b114 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2192,48 +2192,44 @@ static int did_overwrite_ref(struct send_ctx *sctx, u64 ino, u64 ino_gen, const char *name, int name_len) { - int ret = 0; + int ret; u64 gen; u64 ow_inode; if (!sctx->parent_root) - goto out; + return 0; ret = is_inode_existent(sctx, dir, dir_gen); if (ret <= 0) - goto out; + return ret; if (dir != BTRFS_FIRST_FREE_OBJECTID) { ret = get_inode_gen(sctx->send_root, dir, &gen); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; + if (gen != dir_gen) - goto out; + return 0; } /* check if the ref was overwritten by another ref */ ret = lookup_dir_item_inode(sctx->send_root, dir, name, name_len, &ow_inode); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { + if (ret == -ENOENT) { /* was never and will never be overwritten */ - ret = 0; - goto out; + return 0; + } else if (ret < 0) { + return ret; } ret = get_inode_gen(sctx->send_root, ow_inode, &gen); if (ret < 0) - goto out; + return ret; - if (ow_inode == ino && gen == ino_gen) { - ret = 0; - goto out; - } + if (ow_inode == ino && gen == ino_gen) + return 0; /* * We know that it is or will be overwritten. Check this now. @@ -2244,12 +2240,9 @@ static int did_overwrite_ref(struct send_ctx *sctx, if ((ow_inode < sctx->send_progress) || (ino != sctx->cur_ino && ow_inode == sctx->cur_ino && gen == sctx->cur_inode_gen)) - ret = 1; - else - ret = 0; + return 1; -out: - return ret; + return 0; } /* From patchwork Wed Jan 11 11:36:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096478 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B70A0C46467 for ; Wed, 11 Jan 2023 11:39:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235452AbjAKLjV (ORCPT ); Wed, 11 Jan 2023 06:39:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38074 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238068AbjAKLjC (ORCPT ); Wed, 11 Jan 2023 06:39:02 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D41D12BD3 for ; Wed, 11 Jan 2023 03:36:27 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9E620B81B8D for ; Wed, 11 Jan 2023 11:36:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D58DBC433F1 for ; Wed, 11 Jan 2023 11:36:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436985; bh=bz2hjETIMiO1Y/a5EWyqQkJemp8px3E+gwWGVekGXa4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=NbfqwwmjH6n9W7gqrHeqN+ZHHFU2X/OJvgAqJJorBKXQ1/csWj7vRzPXfEclll4ks v3PRhow5rceaUIRXYiCkFM5Xr//5korjBE6vl/KXV34ytF7FD1hVL8lR8CpisMBi9z bVcA2V6GIruJsOl0DXpwvQQ6s4UtnBLKacOcvSpxALnRlNYUsBEypue1234krqp7Vn JQv9qoe9MIufJ/Lbno3lzkUAyjmx4ru4tw+GjJcC2ReY62g3o/MRPMup0iXcdcqypq uRDbpTiqMV1N2cp8lwSH9XGAjkoOmT/XW4LGw65PlrQx6PNUmob/u3h/gJnfp1VSs1 u7NKQpFkV4t5g== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 02/19] btrfs: send: avoid unnecessary generation search at did_overwrite_ref() Date: Wed, 11 Jan 2023 11:36:03 +0000 Message-Id: <22bedd046623def51d4bfcf1bd313da15f2e27ba.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At did_overwrite_ref() we always call get_inode_gen() to find out the generation of the inode 'ow_inode'. However we don't always need to use that generation, and in fact it's very common to not use it, so we end up doing a b+tree search on the send root, allocating a path, etc, for nothing. So improve on this by getting the generation only if we need to use it. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 59106eb8b114..23060eec30de 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2193,8 +2193,8 @@ static int did_overwrite_ref(struct send_ctx *sctx, const char *name, int name_len) { int ret; - u64 gen; u64 ow_inode; + u64 ow_gen = 0; if (!sctx->parent_root) return 0; @@ -2204,6 +2204,8 @@ static int did_overwrite_ref(struct send_ctx *sctx, return ret; if (dir != BTRFS_FIRST_FREE_OBJECTID) { + u64 gen; + ret = get_inode_gen(sctx->send_root, dir, &gen); if (ret == -ENOENT) return 0; @@ -2224,12 +2226,15 @@ static int did_overwrite_ref(struct send_ctx *sctx, return ret; } - ret = get_inode_gen(sctx->send_root, ow_inode, &gen); - if (ret < 0) - return ret; + if (ow_inode == ino) { + ret = get_inode_gen(sctx->send_root, ow_inode, &ow_gen); + if (ret < 0) + return ret; - if (ow_inode == ino && gen == ino_gen) - return 0; + /* It's the same inode, so no overwrite happened. */ + if (ow_gen == ino_gen) + return 0; + } /* * We know that it is or will be overwritten. Check this now. @@ -2237,11 +2242,19 @@ static int did_overwrite_ref(struct send_ctx *sctx, * inode 'ino' to be orphanized, therefore check if ow_inode matches * the current inode being processed. */ - if ((ow_inode < sctx->send_progress) || - (ino != sctx->cur_ino && ow_inode == sctx->cur_ino && - gen == sctx->cur_inode_gen)) + if (ow_inode < sctx->send_progress) return 1; + if (ino != sctx->cur_ino && ow_inode == sctx->cur_ino) { + if (ow_gen == 0) { + ret = get_inode_gen(sctx->send_root, ow_inode, &ow_gen); + if (ret < 0) + return ret; + } + if (ow_gen == sctx->cur_inode_gen) + return 1; + } + return 0; } From patchwork Wed Jan 11 11:36:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7780DC5479D for ; Wed, 11 Jan 2023 11:39:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235989AbjAKLjX (ORCPT ); Wed, 11 Jan 2023 06:39:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37582 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236094AbjAKLjC (ORCPT ); Wed, 11 Jan 2023 06:39:02 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30191B841 for ; Wed, 11 Jan 2023 03:36:27 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C800D61C06 for ; Wed, 11 Jan 2023 11:36:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC34DC433EF for ; Wed, 11 Jan 2023 11:36:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436986; bh=vinQGOMDnnH7VfZ7H2uXIqymKseqacpOolHePdPoE30=; h=From:To:Subject:Date:In-Reply-To:References:From; b=J1PrBtceoiWCU4/TuIXgwYCdPvpKBbp+QxyLoMo1xQ0xS26laiaVw4v5I2UXzy4Af XG8K0gGhrd9+T7RiGntX4Z3B2Un6iL39OUrkvqKs5UX0ruZcrJ1AspHH17/nuJ/K/1 QUthxAsufkiTIPMe11feNJ0Yg6HVmYmxPO4qxqmB2H+RSYEUVAUE2IN01o6nFQ8S2l YjpM4WGTQ9DJNXNLFDlfcZfVuQgrbzK0Rj7WOwurbfwEPuJGEm/J7Zu8PaYrdHWO5J ZjrnLjcTxQkWbCgY8tkelhl+pHAGYr6AHJ7AY7H2IctMKxLUrkDBfd8v3EFMkc82k7 e3zhsEFMAApjw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 03/19] btrfs: send: directly return from will_overwrite_ref() and simplify it Date: Wed, 11 Jan 2023 11:36:04 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana There are no resources to release before will_overwrite_ref() returns, so we don't really need the 'out' label and jumping to it when conditions are met - we can directly return and get rid of the label and jumps. Also we can deal with -ENOENT and other errors in a single if-else logic, as it's more straightforward. This helps the next patch in the series to be more simple as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 38 ++++++++++++++++---------------------- 1 file changed, 16 insertions(+), 22 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 23060eec30de..9be3d7db85d6 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2119,17 +2119,17 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, const char *name, int name_len, u64 *who_ino, u64 *who_gen, u64 *who_mode) { - int ret = 0; + int ret; u64 gen; u64 other_inode = 0; struct btrfs_inode_info info; if (!sctx->parent_root) - goto out; + return 0; ret = is_inode_existent(sctx, dir, dir_gen); if (ret <= 0) - goto out; + return 0; /* * If we have a parent root we need to verify that the parent dir was @@ -2138,24 +2138,21 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, */ if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID) { ret = get_inode_gen(sctx->parent_root, dir, &gen); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; + if (gen != dir_gen) - goto out; + return 0; } ret = lookup_dir_item_inode(sctx->parent_root, dir, name, name_len, &other_inode); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; /* * Check if the overwritten ref was already processed. If yes, the ref @@ -2166,18 +2163,15 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, is_waiting_for_move(sctx, other_inode)) { ret = get_inode_info(sctx->parent_root, other_inode, &info); if (ret < 0) - goto out; + return ret; - ret = 1; *who_ino = other_inode; *who_gen = info.gen; *who_mode = info.mode; - } else { - ret = 0; + return 1; } -out: - return ret; + return 0; } /* From patchwork Wed Jan 11 11:36:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096482 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82D9FC54EBC for ; Wed, 11 Jan 2023 11:39:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236548AbjAKLj0 (ORCPT ); Wed, 11 Jan 2023 06:39:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238176AbjAKLjD (ORCPT ); Wed, 11 Jan 2023 06:39:03 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B17442DD6 for ; Wed, 11 Jan 2023 03:36:29 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 70067B81BAD for ; Wed, 11 Jan 2023 11:36:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A1D5DC433F0 for ; Wed, 11 Jan 2023 11:36:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436987; bh=Uo09xMCyOJh/2doMSBpr/i4lt/vMQMvo0rkwF5RDtS8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=P5a5j/Bb5zCJew5IrlLIPsR3FV+cuVvODrqGfsvZlQvwOy7/1YHOZjOTVKKsioe97 1mA4XKhDl9u/ZylnbTcBC+I5gC6uISr8GJyAaEOmA4E+4ilH+6OXqEgH4lVUDSBmGT IdQMRFwp4YSQk00cTtroxBkDguLVANY3VeVEsyrovFvNIXFFimBrJvCcY+ENXD1W8n QvWXGRnTGi/p4S5VAghjIJGqszy/rUBmh7zO3ZaoqQ6PmmmITHivBh6LhPwlhEDhxR fwxfLrJnEQqlr09mx7Z00o2nd+2sdAcKGucpTg1M9QIL8GTPMEYCA4ioBLB67jda3i KhzfxSdykgyAA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 04/19] btrfs: send: avoid extra b+tree searches when checking reference overrides Date: Wed, 11 Jan 2023 11:36:05 +0000 Message-Id: <9c5cf707a0db0c0de55809fe56446cf7613ab042.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, when processing the new references of an inode (either it's a new inode or an existing one renamed/moved), he will search the b+tree of the send or parent roots in order to find out the inode item of the parent directory and extract its generation. However we are doing that search twice, once with is_inode_existent() -> get_cur_inode_state() and then again at did_overwrite_ref() or will_overwrite_ref(). So avoid that and get the generation at get_cur_inode_state() and then propagate it up to did_overwrite_ref() and will_overwrite_ref(). This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 61 +++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 32 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 9be3d7db85d6..6332add4865c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -1884,7 +1884,8 @@ enum inode_state { inode_state_did_delete, }; -static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) +static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen, + u64 *send_gen, u64 *parent_gen) { int ret; int left_ret; @@ -1898,6 +1899,8 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) goto out; left_ret = (info.nlink == 0) ? -ENOENT : ret; left_gen = info.gen; + if (send_gen) + *send_gen = (left_ret == -ENOENT) ? 0 : info.gen; if (!sctx->parent_root) { right_ret = -ENOENT; @@ -1907,6 +1910,8 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) goto out; right_ret = (info.nlink == 0) ? -ENOENT : ret; right_gen = info.gen; + if (parent_gen) + *parent_gen = (right_ret == -ENOENT) ? 0 : info.gen; } if (!left_ret && !right_ret) { @@ -1951,14 +1956,15 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } -static int is_inode_existent(struct send_ctx *sctx, u64 ino, u64 gen) +static int is_inode_existent(struct send_ctx *sctx, u64 ino, u64 gen, + u64 *send_gen, u64 *parent_gen) { int ret; if (ino == BTRFS_FIRST_FREE_OBJECTID) return 1; - ret = get_cur_inode_state(sctx, ino, gen); + ret = get_cur_inode_state(sctx, ino, gen, send_gen, parent_gen); if (ret < 0) goto out; @@ -2120,14 +2126,14 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, u64 *who_ino, u64 *who_gen, u64 *who_mode) { int ret; - u64 gen; + u64 parent_root_dir_gen; u64 other_inode = 0; struct btrfs_inode_info info; if (!sctx->parent_root) return 0; - ret = is_inode_existent(sctx, dir, dir_gen); + ret = is_inode_existent(sctx, dir, dir_gen, NULL, &parent_root_dir_gen); if (ret <= 0) return 0; @@ -2135,17 +2141,13 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, * If we have a parent root we need to verify that the parent dir was * not deleted and then re-created, if it was then we have no overwrite * and we can just unlink this entry. + * + * @parent_root_dir_gen was set to 0 if the inode does not exists in the + * parent root. */ - if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID) { - ret = get_inode_gen(sctx->parent_root, dir, &gen); - if (ret == -ENOENT) - return 0; - else if (ret < 0) - return ret; - - if (gen != dir_gen) - return 0; - } + if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID && + parent_root_dir_gen != dir_gen) + return 0; ret = lookup_dir_item_inode(sctx->parent_root, dir, name, name_len, &other_inode); @@ -2189,26 +2191,21 @@ static int did_overwrite_ref(struct send_ctx *sctx, int ret; u64 ow_inode; u64 ow_gen = 0; + u64 send_root_dir_gen; if (!sctx->parent_root) return 0; - ret = is_inode_existent(sctx, dir, dir_gen); + ret = is_inode_existent(sctx, dir, dir_gen, &send_root_dir_gen, NULL); if (ret <= 0) return ret; - if (dir != BTRFS_FIRST_FREE_OBJECTID) { - u64 gen; - - ret = get_inode_gen(sctx->send_root, dir, &gen); - if (ret == -ENOENT) - return 0; - else if (ret < 0) - return ret; - - if (gen != dir_gen) - return 0; - } + /* + * @send_root_dir_gen was set to 0 if the inode does not exists in the + * send root. + */ + if (dir != BTRFS_FIRST_FREE_OBJECTID && send_root_dir_gen != dir_gen) + return 0; /* check if the ref was overwritten by another ref */ ret = lookup_dir_item_inode(sctx->send_root, dir, name, name_len, @@ -2444,7 +2441,7 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, * This should only happen for the parent dir that we determine in * record_new_ref_if_needed(). */ - ret = is_inode_existent(sctx, ino, gen); + ret = is_inode_existent(sctx, ino, gen, NULL, NULL); if (ret < 0) goto out; @@ -4240,7 +4237,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * "testdir_2". */ list_for_each_entry(cur, &sctx->new_refs, list) { - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; if (ret == inode_state_will_create) @@ -4356,7 +4353,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * parent directory out of order. But we need to check if this * did already happen before due to other refs in the same dir. */ - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; if (ret == inode_state_will_create) { @@ -4562,7 +4559,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) if (cur->dir > sctx->cur_ino) continue; - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; From patchwork Wed Jan 11 11:36:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 079F4C46467 for ; Wed, 11 Jan 2023 11:39:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237863AbjAKLj2 (ORCPT ); Wed, 11 Jan 2023 06:39:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238196AbjAKLjD (ORCPT ); Wed, 11 Jan 2023 06:39:03 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3226559A for ; Wed, 11 Jan 2023 03:36:30 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 588ADB819EF for ; Wed, 11 Jan 2023 11:36:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 88014C433EF for ; Wed, 11 Jan 2023 11:36:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436988; bh=INn+0dxY7soFHsIzbujrxySQrcWAhUh6ibVvasXce9Q=; h=From:To:Subject:Date:In-Reply-To:References:From; b=N3RGGKbZ9zPpflREefX1szUnX0+jyhIz6a1qxW7JAVJ1ki9yDuG1if/J3pdIdM89T x6ISYXsxTOehDHMNjNQxF70Me4/JxHQJa7DLIsg0RkBeQFqhyh9+NZlxcSG7wDzJ/Z Pb923GxtaAutDf+Cs+HXDwVFQt8HKaK3sQwRL9awudbYqTag2IDRtqBFuBBj0UhNFu EKW6RzeMysGHRcgRhWbwJHYbbTWkpAYSf37Qilp0kK13gIDPzwlkFpd5Ti3xupSnoC K+0kuuRtpOTwZ5SYICGjJvBnVDeZJXf7oMo1OdgjLjqL4wH+any8ycfsaGWmOTWIfx v7mI5Qw/7GfWQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 05/19] btrfs: send: remove send_progress argument from can_rmdir() Date: Wed, 11 Jan 2023 11:36:06 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana All callers of can_rmdir() pass sctx->cur_ino as the value for the send_progress argument, so remove the argument and directly use sctx->cur_ino. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 6332add4865c..32dd88ed629a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3210,8 +3210,7 @@ static void free_orphan_dir_info(struct send_ctx *sctx, * We check this by iterating all dir items and checking if the inode behind * the dir item was already processed. */ -static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen, - u64 send_progress) +static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) { int ret = 0; int iter_ret = 0; @@ -3267,7 +3266,7 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen, goto out; } - if (loc.objectid > send_progress) { + if (loc.objectid > sctx->cur_ino) { odi = add_orphan_dir_info(sctx, dir, dir_gen); if (IS_ERR(odi)) { ret = PTR_ERR(odi); @@ -3574,7 +3573,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) } gen = odi->gen; - ret = can_rmdir(sctx, rmdir_ino, gen, sctx->cur_ino); + ret = can_rmdir(sctx, rmdir_ino, gen); if (ret < 0) goto out; if (!ret) @@ -4465,8 +4464,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * later, we do this check again and rmdir it then if possible. * See the use of check_dirs for more details. */ - ret = can_rmdir(sctx, sctx->cur_ino, sctx->cur_inode_gen, - sctx->cur_ino); + ret = can_rmdir(sctx, sctx->cur_ino, sctx->cur_inode_gen); if (ret < 0) goto out; if (ret) { @@ -4571,8 +4569,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) goto out; } else if (ret == inode_state_did_delete && cur->dir != last_dir_ino_rm) { - ret = can_rmdir(sctx, cur->dir, cur->dir_gen, - sctx->cur_ino); + ret = can_rmdir(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; if (ret) { From patchwork Wed Jan 11 11:36:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77798C678D6 for ; Wed, 11 Jan 2023 11:39:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238493AbjAKLjb (ORCPT ); Wed, 11 Jan 2023 06:39:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238228AbjAKLjE (ORCPT ); Wed, 11 Jan 2023 06:39:04 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 871CB65B7 for ; Wed, 11 Jan 2023 03:36:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 38850B81BAE for ; Wed, 11 Jan 2023 11:36:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6D4C5C433D2 for ; Wed, 11 Jan 2023 11:36:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436988; bh=IZn6Ia3g/yg2V15zZbqJ5NKqtnKzpAhV4fNWgDDJTp0=; h=From:To:Subject:Date:In-Reply-To:References:From; b=aa6lnvZErdCQd9aLBzY1T0j1N/gVhvn3uI4XJLTDIoX4jA69t8MM0rOlhm/s0dtHv e1KiiDal0UvLERziZ0zpVZbeuO0K8yjqao7Rdf4fI8ouEMOdOZTfntZOUC3QrNuhwT N5AVXsTkNbBpH6m9cXlLuMhSEXCiPpvs7/hikuFJ/psm1OUbea9IML8f/duTESebux EA05hLkQ1ehJpy1HonoEOlg+Cbu+LOUdzrQNAWR5lQijWh3/kRegLe3NqvKKQmjr5L sgZ+tuwj45uBYcH7t8QBdTcY77sn5Sduk3lZlrfiWAScfbzWP1rN855v4mBkY2Zrpb XcWyw/GJFJFvA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 06/19] btrfs: send: avoid duplicated orphan dir allocation and initialization Date: Wed, 11 Jan 2023 11:36:07 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At can_rmdir() we are allocating and initializing an orphan dir object twice. This can be deduplicated outside of the loop that iterates over the dir index keys. So deduplicate that code, even because other patch in the series will need to add more initializion code and another one will add one more condition. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 32dd88ed629a..f7d533c364b1 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3253,13 +3253,6 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) dm = get_waiting_dir_move(sctx, loc.objectid); if (dm) { - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) { - ret = PTR_ERR(odi); - goto out; - } - odi->gen = dir_gen; - odi->last_dir_index_offset = found_key.offset; dm->rmdir_ino = dir; dm->rmdir_gen = dir_gen; ret = 0; @@ -3267,13 +3260,6 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) } if (loc.objectid > sctx->cur_ino) { - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) { - ret = PTR_ERR(odi); - goto out; - } - odi->gen = dir_gen; - odi->last_dir_index_offset = found_key.offset; ret = 0; goto out; } @@ -3288,7 +3274,18 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) out: btrfs_free_path(path); - return ret; + + if (ret) + return ret; + + odi = add_orphan_dir_info(sctx, dir, dir_gen); + if (IS_ERR(odi)) + return PTR_ERR(odi); + + odi->gen = dir_gen; + odi->last_dir_index_offset = found_key.offset; + + return 0; } static int is_waiting_for_move(struct send_ctx *sctx, u64 ino) From patchwork Wed Jan 11 11:36:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096484 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64B91C677F1 for ; Wed, 11 Jan 2023 11:39:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238196AbjAKLj2 (ORCPT ); Wed, 11 Jan 2023 06:39:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238193AbjAKLjD (ORCPT ); Wed, 11 Jan 2023 06:39:03 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9B1F3884 for ; Wed, 11 Jan 2023 03:36:30 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 64B9561B8E for ; Wed, 11 Jan 2023 11:36:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5366EC433F1 for ; Wed, 11 Jan 2023 11:36:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436989; bh=YmKLNs8Sf/Umkr43m/ieZAWQg3jFPSTPhM7eHLzFT6I=; h=From:To:Subject:Date:In-Reply-To:References:From; b=JaSLUFXOtgb+SdDE4BMtUoUb/f/1CZSSUH35xN+trGj3ziK9Ko5+JLXyoIQNZG7Wo SZtmRP8OoT82EZNsKk+2kQHE5z7jLpFgYjrrdvam4O7SAW/NOAHL+z3wPzLd5H63eK NlXfJ6kWPG4E3UPEgzvGQw6t8r5xD4cS9S5ukCp60uqaKkdx4/iAEO1nHT7deHD6e6 DCAuPfj3R5rNDV5CxnbbzkGci9ev8mCO0Fk+lyjnXQzxUSLkZqSRHEfbjuc79B45I0 IDV29ZvHkxK0w4lfO8/huCruMW0Evvv5kk7s8x/0Pl/REWlHmse9L+iG1Rq8ywB7jI fFP0z4LibxiNQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 07/19] btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() Date: Wed, 11 Jan 2023 11:36:08 +0000 Message-Id: <08ad72730636a2f40abba7f0be31e0a40c6049c6.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At can_rmdir() we start by searching the orphan dirs rbtree for an orphan dir object for the target directory. Later when iterating over the dir index keys, if we find that any dir entry points to inode for which there is a pending dir move or the inode was not yet processed, we exit because we can't remove the directory yet. However we end up always calling add_orphan_dir_info(), which will iterate again the rbtree and if there is already an orphan dir object (created by the first call to can_rmdir()), it returns the existing object. This is unnecessary work because in case there is already an existing orphan dir object, we got a reference to it at the start of can_rmdir(). So skip the call to add_orphan_dir_info() if we already have a reference for an orphan dir object. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index f7d533c364b1..bc57fa8a6bde 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3278,11 +3278,14 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) if (ret) return ret; - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) - return PTR_ERR(odi); + if (!odi) { + odi = add_orphan_dir_info(sctx, dir, dir_gen); + if (IS_ERR(odi)) + return PTR_ERR(odi); + + odi->gen = dir_gen; + } - odi->gen = dir_gen; odi->last_dir_index_offset = found_key.offset; return 0; From patchwork Wed Jan 11 11:36:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AF0DC54EBC for ; Wed, 11 Jan 2023 11:39:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238325AbjAKLja (ORCPT ); Wed, 11 Jan 2023 06:39:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238265AbjAKLjE (ORCPT ); Wed, 11 Jan 2023 06:39:04 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 565C7B499 for ; Wed, 11 Jan 2023 03:36:32 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id F0605B81AD3 for ; Wed, 11 Jan 2023 11:36:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3997FC433EF for ; Wed, 11 Jan 2023 11:36:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436990; bh=mHo2nKykopTDW30Yps48wkFh0uy4OHLuGV6DkaDKO+8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=e7g5GEi87u+2sMvT2be1mZgQFjXWsDiqLQVGk5Y5wSfhozDxatOdnNoE+lh2Ouedd x8NOHZKkgK3o4RNO4DyIlSbHhiyr0HC2XYeZ31mfUgHPVtebz1t6ZfwsuEsveMMQpN tdAbQiOkQNOIP/mnUIccYz3O9jsoprR8cKTqe/j3bqyrXn8q7sPCDk8mdX41SphGaF P89nhMRJN9C+dYm51oH29ncqEmYmy5ukqX8qF0Ue3dLRPDkuem0cto/Yz1Mblx7vqe QI5P3avM80Gq9ZatLva2UqFwOg76Tc00ZSPndBihI6mWLFiJjxN1WU9pePUEv5qTdO th4bOFxQ3JFJA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 08/19] btrfs: send: reduce searches on parent root when checking if dir can be removed Date: Wed, 11 Jan 2023 11:36:09 +0000 Message-Id: <506db9ca6dafe05827d4e466b36558b563faa933.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, every time we remove a reference (dentry) for an inode and the parent directory does not exists anymore in the send root, we go check if we can remove the directory by making a call to can_rmdir(). This helper can only return true (value 1) if all dentries were already removed, and for that it always does a search on the parent root for dir index keys - if it finds any dentry referring to an inode with a number higher then the inode currently being processed, then the directory can not be removed and it must return false (value 0). However that means if a directory that was deleted had 1000 dentries, and each one pointed to an inode with a number higher then the number of the directory's inode, we end up doing 1000 searches on the parent root. Typically files are created in a directory after the directory was created and therefore they get an higher inode number than the directory. It's also common to have the each dentry pointing to an inode with a higher number then the inodes the previous dentries point to, for example when creating a series of files inside a directory, a very common pattern. So improve on that by having the first call to can_rmdir() for a directory to check the number of the inode that the last dentry points to and cache that inode number in the orphan dir structure. Then every subsequent call to can_rmdir() can avoid doing a search on the parent root if the number of the inode currently being processed is smaller than cached inode number at the directory's orphan dir structure. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 65 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 59 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index bc57fa8a6bde..cd4aa0eae66c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -321,6 +321,7 @@ struct orphan_dir_info { u64 ino; u64 gen; u64 last_dir_index_offset; + u64 dir_high_seq_ino; }; struct name_cache_entry { @@ -3161,6 +3162,7 @@ static struct orphan_dir_info *add_orphan_dir_info(struct send_ctx *sctx, odi->ino = dir_ino; odi->gen = dir_gen; odi->last_dir_index_offset = 0; + odi->dir_high_seq_ino = 0; rb_link_node(&odi->node, parent, p); rb_insert_color(&odi->node, &sctx->orphan_dirs); @@ -3221,6 +3223,8 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) struct btrfs_key loc; struct btrfs_dir_item *di; struct orphan_dir_info *odi = NULL; + u64 dir_high_seq_ino = 0; + u64 last_dir_index_offset = 0; /* * Don't try to rmdir the top/root subvolume dir. @@ -3228,17 +3232,62 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) if (dir == BTRFS_FIRST_FREE_OBJECTID) return 0; + odi = get_orphan_dir_info(sctx, dir, dir_gen); + if (odi && sctx->cur_ino < odi->dir_high_seq_ino) + return 0; + path = alloc_path_for_send(); if (!path) return -ENOMEM; + if (!odi) { + /* + * Find the inode number associated with the last dir index + * entry. This is very likely the inode with the highest number + * of all inodes that have an entry in the directory. We can + * then use it to avoid future calls to can_rmdir(), when + * processing inodes with a lower number, from having to search + * the parent root b+tree for dir index keys. + */ + key.objectid = dir; + key.type = BTRFS_DIR_INDEX_KEY; + key.offset = (u64)-1; + + ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); + if (ret < 0) { + goto out; + } else if (ret > 0) { + /* Can't happen, the root is never empty. */ + ASSERT(path->slots[0] > 0); + if (WARN_ON(path->slots[0] == 0)) { + ret = -EUCLEAN; + goto out; + } + path->slots[0]--; + } + + btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]); + if (key.objectid != dir || key.type != BTRFS_DIR_INDEX_KEY) { + /* No index keys, dir can be removed. */ + ret = 1; + goto out; + } + + di = btrfs_item_ptr(path->nodes[0], path->slots[0], + struct btrfs_dir_item); + btrfs_dir_item_key_to_cpu(path->nodes[0], di, &loc); + dir_high_seq_ino = loc.objectid; + if (sctx->cur_ino < dir_high_seq_ino) { + ret = 0; + goto out; + } + + btrfs_release_path(path); + } + key.objectid = dir; key.type = BTRFS_DIR_INDEX_KEY; - key.offset = 0; - - odi = get_orphan_dir_info(sctx, dir, dir_gen); - if (odi) - key.offset = odi->last_dir_index_offset; + key.offset = odi ? odi->last_dir_index_offset : 0; btrfs_for_each_slot(root, &key, &found_key, path, iter_ret) { struct waiting_dir_move *dm; @@ -3251,6 +3300,9 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) struct btrfs_dir_item); btrfs_dir_item_key_to_cpu(path->nodes[0], di, &loc); + dir_high_seq_ino = max(dir_high_seq_ino, loc.objectid); + last_dir_index_offset = found_key.offset; + dm = get_waiting_dir_move(sctx, loc.objectid); if (dm) { dm->rmdir_ino = dir; @@ -3286,7 +3338,8 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) odi->gen = dir_gen; } - odi->last_dir_index_offset = found_key.offset; + odi->last_dir_index_offset = last_dir_index_offset; + odi->dir_high_seq_ino = max(odi->dir_high_seq_ino, dir_high_seq_ino); return 0; } From patchwork Wed Jan 11 11:36:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1511C46467 for ; Wed, 11 Jan 2023 11:39:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238228AbjAKLjc (ORCPT ); Wed, 11 Jan 2023 06:39:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37358 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236455AbjAKLjE (ORCPT ); Wed, 11 Jan 2023 06:39:04 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58C4BB7FF for ; Wed, 11 Jan 2023 03:36:33 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DDAAFB81B8D for ; Wed, 11 Jan 2023 11:36:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1F2C1C433D2 for ; Wed, 11 Jan 2023 11:36:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436991; bh=d/4tneVXKHXgIx1Vi1b7OwN5+doGiLmXydrL8Jv8BL4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=uYsDRm6EzD48cgIKmPdzsVlTBrpPUklnOLbINDaXDYA+ZnjWiYzxOgoJqUpVvWy9V pzb6CX+cA4RldHPzUkFtXOj/c7hwRdZ4uevEhH8AzxfmKU2dygYg1eepEVZ0DpfzZa dF28VgZmv9+BmmHCAITcb39SFJydqxd0dDXS1Pv5/dZAigceJpJOKoCkTGCxkpWQyf /w2gdy+OpszTiqBgdTS3PXmCegneaSUdpL5jUsOJ/eLq9qpdwAnm8GsjB47MrTQPz5 lsL1tHx7c7jb1qvN5iQ/zNaOx1LSzC15IjFzEYJLAbkGVIPNYkLET0J4JuyGM7hunR xgLTwEcwthgqg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 09/19] btrfs: send: iterate waiting dir move rbtree only once when processing refs Date: Wed, 11 Jan 2023 11:36:10 +0000 Message-Id: <5486d61909a314d37d64512fc8a9acb8f4e2025a.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana When processing the new references for an inode, we unnecessarily iterate twice the waiting dir moves rbtree, once with is_waiting_for_move() and if we found an entry in the rbtree, we iterate it again with a call to get_waiting_dir_move(). This is pointless, we can make this simpler and more efficient by calling only get_waiting_dir_move(), so just do that. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index cd4aa0eae66c..20fcf1c0832a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -4335,12 +4335,9 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * the source path when performing its rename * operation. */ - if (is_waiting_for_move(sctx, ow_inode)) { - wdm = get_waiting_dir_move(sctx, - ow_inode); - ASSERT(wdm); + wdm = get_waiting_dir_move(sctx, ow_inode); + if (wdm) wdm->orphanized = true; - } /* * Make sure we clear our orphanized inode's From patchwork Wed Jan 11 11:36:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096488 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81479C677F1 for ; Wed, 11 Jan 2023 11:39:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231598AbjAKLjd (ORCPT ); Wed, 11 Jan 2023 06:39:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238134AbjAKLjF (ORCPT ); Wed, 11 Jan 2023 06:39:05 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A84A5BE26 for ; Wed, 11 Jan 2023 03:36:33 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 15F2061C4B for ; Wed, 11 Jan 2023 11:36:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 04E6BC433F0 for ; Wed, 11 Jan 2023 11:36:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436992; bh=pMXbqdmklh4DA+hN5wPA1CB9cVGFnwZX4nlpjxQmsMs=; h=From:To:Subject:Date:In-Reply-To:References:From; b=t3OVieC3uKmYMk6Doy9MCgyPUrRY4E4NmaDPlqlh2vR2wgafCTUZwF3jnLuVR5ALk L6bRr7S8Wi3/EwloeY2zZmcrKrd5/rAkaIhjZHNVgRcRFLqVzTGWKNuDp3xW09FxHl My8BkCrSs2C2DF4sBeSh1Dpb9dT5NRmHYTlnbViTKbUVih9biNfoul2MUlgV2hBaO/ 2rsWxnb8rKvna5sX9ISuowpvxqGV3jpCUfx/OTR90JVIAh4bfvE7oLV6IHMRtThvp4 7vANwbK70MmkvcH+dHj5jdjFIbwyIQ6NKck6/295nh3E4oCAPfLU2YKARr+4a2ZMxN 3oAppCacyRSvw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 10/19] btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree Date: Wed, 11 Jan 2023 11:36:11 +0000 Message-Id: <013ed0f5ae2bfb6a340ba7fa2642a81ec327b724.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana As send is single threaded and there's no concurrency, we don't need to protect the accesses to the backref cache's mapple tree, so initialize the maple tree with the flag MT_FLAGS_LOCK_EXTERN. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 20fcf1c0832a..5ac3cff7bd68 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -8140,7 +8140,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->name_cache_list); INIT_LIST_HEAD(&sctx->backref_cache.lru_list); - mt_init(&sctx->backref_cache.entries); + mt_init_flags(&sctx->backref_cache.entries, MT_FLAGS_LOCK_EXTERN); sctx->flags = arg->flags; From patchwork Wed Jan 11 11:36:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 329FDC46467 for ; Wed, 11 Jan 2023 11:39:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231810AbjAKLjg (ORCPT ); Wed, 11 Jan 2023 06:39:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238303AbjAKLjG (ORCPT ); Wed, 11 Jan 2023 06:39:06 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5B4A9D137 for ; Wed, 11 Jan 2023 03:36:35 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A89EAB81AD3 for ; Wed, 11 Jan 2023 11:36:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E0139C433EF for ; Wed, 11 Jan 2023 11:36:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436993; bh=HXV/Gb+mxi4s1k4nOv3b39QjeDGD/VgnxwTZaEwq//4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=XiGBTFXaSi4wvacXA2w3z+57B5JAsE4q8s/VbeuDaTgGW3U53FIRfNPHdKq1VLWFp X9TK8l4sdX2jXMd3XYyFyKd/TaG+ph7Ng7sFVb+mF/vmWSYYOUvq+EpMD8/TftGvEg t2oS0FO5TxaZ40FCwnbsN+CevmwIyklCQNFn80bFP7EvPpzkhcCVJsJ6aAcm9lzB8v KAMyG7+3ITG6d/VYCsbfcFgdtRhBK7WkCX+bHkUdlnUABjQHfmR28pjAbZMowg5gpk xGbvl9nXyfn3Wb0gZN+SbeBf0ug7VNHzmlLDcmOh4ZSRX9z3HpXt5XgntOGZFNq6Sz ToZRya/1WzIjg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 11/19] btrfs: send: initialize all the red black trees earlier Date: Wed, 11 Jan 2023 11:36:12 +0000 Message-Id: <9e69f33b0dcce18f81e2829cf629110baacd578a.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana After we allocate the send context object and before we initialize all the red black trees, we can jump to the 'out' label if some errors happen, and then under the 'out' label we use RB_EMPTY_ROOT() against some of the those trees, which we have not yet initialized. This happens to work out ok because the send context object was initialized to zeroes with kzalloc and the RB_ROOT initializer just happens to have the following definition: #define RB_ROOT (struct rb_root) { NULL, } But it's really neither clean nor a good practice as RB_ROOT is supposed to be opaque and in case it changes or we change those red black trees to some other data structure, it leaves us in a precarious situation. So initialize all the red black trees immediately after allocating the send context and before any jump into the 'out' label. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 5ac3cff7bd68..e60aa0cb0b5c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -8142,6 +8142,12 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->backref_cache.lru_list); mt_init_flags(&sctx->backref_cache.entries, MT_FLAGS_LOCK_EXTERN); + sctx->pending_dir_moves = RB_ROOT; + sctx->waiting_dir_moves = RB_ROOT; + sctx->orphan_dirs = RB_ROOT; + sctx->rbtree_new_refs = RB_ROOT; + sctx->rbtree_deleted_refs = RB_ROOT; + sctx->flags = arg->flags; if (arg->flags & BTRFS_SEND_FLAG_VERSION) { @@ -8207,12 +8213,6 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) goto out; } - sctx->pending_dir_moves = RB_ROOT; - sctx->waiting_dir_moves = RB_ROOT; - sctx->orphan_dirs = RB_ROOT; - sctx->rbtree_new_refs = RB_ROOT; - sctx->rbtree_deleted_refs = RB_ROOT; - sctx->clone_roots = kvcalloc(sizeof(*sctx->clone_roots), arg->clone_sources_count + 1, GFP_KERNEL); From patchwork Wed Jan 11 11:36:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AACDC54EBC for ; Wed, 11 Jan 2023 11:39:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238134AbjAKLjd (ORCPT ); Wed, 11 Jan 2023 06:39:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238209AbjAKLjG (ORCPT ); Wed, 11 Jan 2023 06:39:06 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5AFB1BF5D for ; Wed, 11 Jan 2023 03:36:35 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D343561B7D for ; Wed, 11 Jan 2023 11:36:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C67E2C433F0 for ; Wed, 11 Jan 2023 11:36:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436994; bh=5y43LMj/08WtXLs2vcn6ZhRxM1ErWWFBYruoxesvxEo=; h=From:To:Subject:Date:In-Reply-To:References:From; b=IpnaEGl06BxzwY2oJbnFwiEwKC307nqIwBMdqFnWmdbIpkCo7FS4+bld0b9RLzlgM fAk2FHiPMTvvM7dntonl2csP+GnyEbuk2bGT4+gI06eOenIu7zanFGB6nRqI8jvVGs l3WFDxaZj263i1Ni/rclKzJsAfKT5HdX1jKOkjY+EAyr6GrRV3ZcO8SE3nINOEK/b0 A+J5xbl0KERl2BciMWauIlYy8Njl+384YNHTdt4+UHUf/rHH5R6Dg0pg8DC0aavkvr 4K1d22jfGFOol4jYmlFalwXXSKRXzjs1P5gEYChOWYkBK5eDUHEcrr7mU9oUV4bvI6 xYhQ8FYmUDfyQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 12/19] btrfs: send: genericize the backref cache to allow it to be reused Date: Wed, 11 Jan 2023 11:36:13 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The backref cache is a cache backed by a maple tree and a linked list to keep track of temporal access to cached entries (the LRU entry always at the head of the list). This type of caching method is going to be useful in other scenarios, so make the cache implementation more generic and move it into its own header and source files. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/Makefile | 3 +- fs/btrfs/lru_cache.c | 97 ++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/lru_cache.h | 44 ++++++++++++++++++++ fs/btrfs/send.c | 80 +++++++++++------------------------- 4 files changed, 167 insertions(+), 57 deletions(-) create mode 100644 fs/btrfs/lru_cache.c create mode 100644 fs/btrfs/lru_cache.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 460eced3f5bd..90d53209755b 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -32,7 +32,8 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \ - subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o + subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \ + lru_cache.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c new file mode 100644 index 000000000000..706987890793 --- /dev/null +++ b/fs/btrfs/lru_cache.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "lru_cache.h" +#include "messages.h" + +/* + * Initialize a cache object. + * + * @cache: The cache. + * @max_size: Maximum size (number of entries) for the cache. + */ +void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) +{ + INIT_LIST_HEAD(&cache->lru_list); + mt_init_flags(&cache->entries, MT_FLAGS_LOCK_EXTERN); + cache->size = 0; + cache->max_size = max_size; +} + +/* + * Lookup for an entry in the cache. + * + * @cache: The cache. + * @key: The key of the entry we are looking for. + * + * Returns the entry associated with the key or NULL if none found. + */ +struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, + u64 key) +{ + struct btrfs_lru_cache_entry *entry; + + entry = mtree_load(&cache->entries, key); + if (entry) + list_move_tail(&entry->lru_list, &cache->lru_list); + + return entry; +} + +/* + * Store an entry in the cache. + * + * @cache: The cache. + * @entry: The entry to store. + * + * Returns 0 on success and < 0 on error. + */ +int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *new_entry, + gfp_t gfp) +{ + int ret; + + if (cache->size == cache->max_size) { + struct btrfs_lru_cache_entry *lru_entry; + struct btrfs_lru_cache_entry *mt_entry; + + lru_entry = list_first_entry(&cache->lru_list, + struct btrfs_lru_cache_entry, + lru_list); + mt_entry = mtree_erase(&cache->entries, lru_entry->key); + ASSERT(mt_entry == lru_entry); + list_del(&mt_entry->lru_list); + kfree(mt_entry); + cache->size--; + } + + ret = mtree_insert(&cache->entries, new_entry->key, new_entry, gfp); + if (ret < 0) + return ret; + + list_add_tail(&new_entry->lru_list, &cache->lru_list); + cache->size++; + + return 0; +} + +/* + * Empty a cache. + * + * @cache: The cache to empty. + * + * Removes all entries from the cache. + */ +void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) +{ + struct btrfs_lru_cache_entry *entry; + struct btrfs_lru_cache_entry *tmp; + + list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) + kfree(entry); + + INIT_LIST_HEAD(&cache->lru_list); + mtree_destroy(&cache->entries); + cache->size = 0; +} diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h new file mode 100644 index 000000000000..189be5be0a8d --- /dev/null +++ b/fs/btrfs/lru_cache.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef BTRFS_LRU_CACHE_H +#define BTRFS_LRU_CACHE_H + +#include +#include + +/* + * A cache entry. This is meant to be embedded in a structure of a user of + * this module. Similar to how struct list_head and struct rb_node are used. + * + * Note: it should be embedded as the first element in a struct (offset 0), and + * this module assumes it was allocated with kmalloc(), so it calls kfree() when + * it needs to free an entry. + */ +struct btrfs_lru_cache_entry { + struct list_head lru_list; + u64 key; +}; + +struct btrfs_lru_cache { + struct list_head lru_list; + struct maple_tree entries; + /* Number of entries stored in the cache. */ + unsigned int size; + /* Maximum number of entries the cache can have. */ + unsigned int max_size; +}; + +static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *cache) +{ + return cache->size; +} + +void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); +struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, + u64 key); +int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *new_entry, + gfp_t gfp); +void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache); + +#endif /* BTRFS_LRU_CACHE_H */ diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index e60aa0cb0b5c..b31af939bea8 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -32,6 +32,7 @@ #include "file-item.h" #include "ioctl.h" #include "verity.h" +#include "lru_cache.h" /* * Maximum number of references an extent can have in order for us to attempt to @@ -107,15 +108,15 @@ struct clone_root { * x86_64). */ struct backref_cache_entry { - /* List to link to the cache's lru list. */ - struct list_head list; - /* The key for this entry in the cache. */ - u64 key; + struct btrfs_lru_cache_entry entry; u64 root_ids[SEND_MAX_BACKREF_CACHE_ROOTS]; /* Number of valid elements in the root_ids array. */ int num_roots; }; +/* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ +static_assert(offsetof(struct backref_cache_entry, entry) == 0); + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -285,13 +286,8 @@ struct send_ctx { struct rb_root rbtree_new_refs; struct rb_root rbtree_deleted_refs; - struct { - u64 last_reloc_trans; - struct list_head lru_list; - struct maple_tree entries; - /* Number of entries stored in the cache. */ - int size; - } backref_cache; + struct btrfs_lru_cache backref_cache; + u64 backref_cache_last_reloc_trans; }; struct pending_dir_move { @@ -1387,19 +1383,6 @@ static int iterate_backrefs(u64 ino, u64 offset, u64 num_bytes, u64 root_id, return 0; } -static void empty_backref_cache(struct send_ctx *sctx) -{ - struct backref_cache_entry *entry; - struct backref_cache_entry *tmp; - - list_for_each_entry_safe(entry, tmp, &sctx->backref_cache.lru_list, list) - kfree(entry); - - INIT_LIST_HEAD(&sctx->backref_cache.lru_list); - mtree_destroy(&sctx->backref_cache.entries); - sctx->backref_cache.size = 0; -} - static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, const u64 **root_ids_ret, int *root_count_ret) { @@ -1407,9 +1390,10 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, struct send_ctx *sctx = bctx->sctx; struct btrfs_fs_info *fs_info = sctx->send_root->fs_info; const u64 key = leaf_bytenr >> fs_info->sectorsize_bits; + struct btrfs_lru_cache_entry *raw_entry; struct backref_cache_entry *entry; - if (sctx->backref_cache.size == 0) + if (btrfs_lru_cache_size(&sctx->backref_cache) == 0) return false; /* @@ -1423,18 +1407,18 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, * transaction handle or holding fs_info->commit_root_sem, so no need * to take any lock here. */ - if (fs_info->last_reloc_trans > sctx->backref_cache.last_reloc_trans) { - empty_backref_cache(sctx); + if (fs_info->last_reloc_trans > sctx->backref_cache_last_reloc_trans) { + btrfs_lru_cache_clear(&sctx->backref_cache); return false; } - entry = mtree_load(&sctx->backref_cache.entries, key); - if (!entry) + raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key); + if (!raw_entry) return false; + entry = container_of(raw_entry, struct backref_cache_entry, entry); *root_ids_ret = entry->root_ids; *root_count_ret = entry->num_roots; - list_move_tail(&entry->list, &sctx->backref_cache.lru_list); return true; } @@ -1460,7 +1444,7 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, if (!new_entry) return; - new_entry->key = leaf_bytenr >> fs_info->sectorsize_bits; + new_entry->entry.key = leaf_bytenr >> fs_info->sectorsize_bits; new_entry->num_roots = 0; ULIST_ITER_INIT(&uiter); while ((node = ulist_next(root_ids, &uiter)) != NULL) { @@ -1488,23 +1472,12 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, * none of the roots is part of the list of roots from which we are * allowed to clone. Cache the new entry as it's still useful to avoid * backref walking to determine which roots have a path to the leaf. + * + * Also use GFP_NOFS because we're called while holding a transaction + * handle or while holding fs_info->commit_root_sem. */ - - if (sctx->backref_cache.size >= SEND_MAX_BACKREF_CACHE_SIZE) { - struct backref_cache_entry *lru_entry; - struct backref_cache_entry *mt_entry; - - lru_entry = list_first_entry(&sctx->backref_cache.lru_list, - struct backref_cache_entry, list); - mt_entry = mtree_erase(&sctx->backref_cache.entries, lru_entry->key); - ASSERT(mt_entry == lru_entry); - list_del(&mt_entry->list); - kfree(mt_entry); - sctx->backref_cache.size--; - } - - ret = mtree_insert(&sctx->backref_cache.entries, new_entry->key, - new_entry, GFP_NOFS); + ret = btrfs_lru_cache_store(&sctx->backref_cache, &new_entry->entry, + GFP_NOFS); ASSERT(ret == 0 || ret == -ENOMEM); if (ret) { /* Caching is optional, no worries. */ @@ -1512,17 +1485,13 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, return; } - list_add_tail(&new_entry->list, &sctx->backref_cache.lru_list); - /* * We are called from iterate_extent_inodes() while either holding a * transaction handle or holding fs_info->commit_root_sem, so no need * to take any lock here. */ - if (sctx->backref_cache.size == 0) - sctx->backref_cache.last_reloc_trans = fs_info->last_reloc_trans; - - sctx->backref_cache.size++; + if (btrfs_lru_cache_size(&sctx->backref_cache) == 1) + sctx->backref_cache_last_reloc_trans = fs_info->last_reloc_trans; } static int check_extent_item(u64 bytenr, const struct btrfs_extent_item *ei, @@ -8139,8 +8108,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_RADIX_TREE(&sctx->name_cache, GFP_KERNEL); INIT_LIST_HEAD(&sctx->name_cache_list); - INIT_LIST_HEAD(&sctx->backref_cache.lru_list); - mt_init_flags(&sctx->backref_cache.entries, MT_FLAGS_LOCK_EXTERN); + btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8404,7 +8372,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) close_current_inode(sctx); - empty_backref_cache(sctx); + btrfs_lru_cache_clear(&sctx->backref_cache); kfree(sctx); } From patchwork Wed Jan 11 11:36:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096490 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2CE2C5479D for ; Wed, 11 Jan 2023 11:39:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238374AbjAKLjf (ORCPT ); Wed, 11 Jan 2023 06:39:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232354AbjAKLjG (ORCPT ); Wed, 11 Jan 2023 06:39:06 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66249DF50 for ; Wed, 11 Jan 2023 03:36:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B983361C50 for ; Wed, 11 Jan 2023 11:36:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ADD86C433D2 for ; Wed, 11 Jan 2023 11:36:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436995; bh=lnphF+YNUghnGRa8BuwwMe8nSOmirqn7lQ19JmMjX+0=; h=From:To:Subject:Date:In-Reply-To:References:From; b=JnBqJ8Kw6vev89qBuBslrP3bV4kqzM3M5zrQ6uGRT9vV9UqWLkjA1IKuy1J15Fdda 14wdPmsn2zc/YHrmBY7r7CwqN+P2CjE1umNDJrOMNBljZhjLFnYmgYS6dw4VW7jXCW ZbfiLsYU7ZOqW1mDLdaBnMGrt1fbTF15Wkrq2OfE+4zzcB05Eb1F0PaB5Qe5XI9oIW k/7WAJELMOxOlH2G4Y0XZsqpQUoLgv98gjBpO4UC42LLedmBPz2B8nktt2WETCEgKN vWjLd1XiUvMw5he4VHjIiMANp3CQ0MW98MyVH3KB0n0/S8yHaJ4uHKmblrrIFsA+AU p9vwXcz//6bWA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 13/19] btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems Date: Wed, 11 Jan 2023 11:36:14 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The lru cache is backed by a maple tree, which uses the unsigned long type for keys, and that type has a width of 32 bits on 32 bits systems and a width of 64 bits on 64 bits systems. Currently there is only one user of the lru cache, the send backref cache, which uses a sector number as a key, a logical address right shifted by fs_info->sectorsize_bits, so a 32 bits width is not yet a problem (the same happens with the radix tree we use to track extent buffers, fs_info->buffer_radix). However the next patches in the series will start using the lru cache for cases where inode numbers are the keys, and the inode numbers are always 64 bits, even if we are running on a 32 bits system. So adapt the lru cache to allow multiple values under the same key, by having the maple tree store a head entry that points to a list of entries instead of pointing to a single entry. This is a similar approach to what we currently do for the name cache in send (which uses a radix tree that has indexes with an unsigned long type as well), and will allow later to use the lru cache for the send name cache as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 86 ++++++++++++++++++++++++++++++++++++-------- fs/btrfs/lru_cache.h | 12 +++++++ 2 files changed, 83 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index 706987890793..10c1f9ceb706 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -18,6 +18,17 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) cache->max_size = max_size; } +static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key) +{ + struct btrfs_lru_cache_entry *entry; + + list_for_each_entry(entry, head, list) + if (entry->key == key) + return entry; + + return NULL; +} + /* * Lookup for an entry in the cache. * @@ -29,15 +40,48 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, u64 key) { + struct list_head *head; struct btrfs_lru_cache_entry *entry; - entry = mtree_load(&cache->entries, key); + head = mtree_load(&cache->entries, key); + if (!head) + return NULL; + + entry = match_entry(head, key); if (entry) list_move_tail(&entry->lru_list, &cache->lru_list); return entry; } +static void delete_entry(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry) +{ + struct list_head *prev = entry->list.prev; + + ASSERT(cache->size > 0); + ASSERT(!mtree_empty(&cache->entries)); + + list_del(&entry->list); + list_del(&entry->lru_list); + + if (list_empty(prev)) { + struct list_head *head; + + /* + * If previous element in the list entry->list is now empty, it + * means it's a head entry not pointing to any cached entries, + * so remove it from the maple tree and free it. + */ + head = mtree_erase(&cache->entries, entry->key); + ASSERT(head == prev); + kfree(head); + } + + kfree(entry); + cache->size--; +} + /* * Store an entry in the cache. * @@ -50,26 +94,39 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp) { + const u64 key = new_entry->key; + struct list_head *head; int ret; + head = kmalloc(sizeof(*head), gfp); + if (!head) + return -ENOMEM; + + ret = mtree_insert(&cache->entries, key, head, gfp); + if (ret == 0) { + INIT_LIST_HEAD(head); + list_add_tail(&new_entry->list, head); + } else if (ret == -EEXIST) { + kfree(head); + head = mtree_load(&cache->entries, key); + ASSERT(head != NULL); + if (match_entry(head, key) != NULL) + return -EEXIST; + list_add_tail(&new_entry->list, head); + } else if (ret < 0) { + kfree(head); + return ret; + } + if (cache->size == cache->max_size) { struct btrfs_lru_cache_entry *lru_entry; - struct btrfs_lru_cache_entry *mt_entry; lru_entry = list_first_entry(&cache->lru_list, struct btrfs_lru_cache_entry, lru_list); - mt_entry = mtree_erase(&cache->entries, lru_entry->key); - ASSERT(mt_entry == lru_entry); - list_del(&mt_entry->lru_list); - kfree(mt_entry); - cache->size--; + delete_entry(cache, lru_entry); } - ret = mtree_insert(&cache->entries, new_entry->key, new_entry, gfp); - if (ret < 0) - return ret; - list_add_tail(&new_entry->lru_list, &cache->lru_list); cache->size++; @@ -89,9 +146,8 @@ void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) struct btrfs_lru_cache_entry *tmp; list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) - kfree(entry); + delete_entry(cache, entry); - INIT_LIST_HEAD(&cache->lru_list); - mtree_destroy(&cache->entries); - cache->size = 0; + ASSERT(cache->size == 0); + ASSERT(mtree_empty(&cache->entries)); } diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 189be5be0a8d..368248be42a2 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -17,6 +17,18 @@ struct btrfs_lru_cache_entry { struct list_head lru_list; u64 key; + /* + * The maple tree uses unsigned long type for the keys, which is 32 bits + * on 32 bits systems, and 64 bits on 64 bits systems. So if we want to + * use something like inode numbers as keys, which are always a u64, we + * have to deal with this in a special way - we store the key in the + * entry itself, as a u64, and the values inserted into the maple tree + * are linked lists of entries - so in case we are on a 64 bits system, + * that list always has a single entry, while on 32 bits systems it + * may have more than one, with each entry having the same value for + * their lower 32 bits of the u64 key. + */ + struct list_head list; }; struct btrfs_lru_cache { From patchwork Wed Jan 11 11:36:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096492 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1851AC677F1 for ; Wed, 11 Jan 2023 11:39:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238303AbjAKLjh (ORCPT ); Wed, 11 Jan 2023 06:39:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238357AbjAKLjI (ORCPT ); Wed, 11 Jan 2023 06:39:08 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE849BF65 for ; Wed, 11 Jan 2023 03:36:38 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 648A8B81AD3 for ; Wed, 11 Jan 2023 11:36:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 935CEC43392 for ; Wed, 11 Jan 2023 11:36:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436996; bh=7jlLxKTxSDiDIFYJNaf/OYQGQzgs57SVysvvZ7QLgzI=; h=From:To:Subject:Date:In-Reply-To:References:From; b=kNodAHo54aY6rzuirG2+NqsIn05oDSnREtWs4h6iEhGJtFKUTh1Grmlu+hhY95Zi/ P/Ee5zq6iA1lQ4vET4ErUweXiti3xgHagUbnI3tFXDOTZS1cmVtHV9ipT0DApduyIg D+USH1ssBy6Z06Xxorn//DHfpraeM6sz/g6ymGwzr+uXgw8hb+DQhK3MeXMtuVXMqZ DUE7gOdSmwad6kTZDS4VOKE1NpHqQ51HI0XvNR54RHZZcOAyNWg7Mng3jM5t/r/9JI coojCEE14hnc9nnJy21RpYKCuOoQI54BgG6Jgb5g4yiw8GTiSij79p8C3KySZMsKV7 B5pYEr5wbLOzQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 14/19] btrfs: send: cache information about created directories Date: Wed, 11 Jan 2023 11:36:15 +0000 Message-Id: <668a05b55417e689a2ad9725c48f19b1fb1fdaa3.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, when processing the a reference for an inode we need to check if the directory where the new reference is located was already created before creating the new reference. This check, which is done by the helper did_create_dir(), can be expensive if the directory has many entries, since it consists in seaching the send root's b+tree and visiting every single dir index key until we either find one which points to an inode with a number smaller than the current inode's number or until we visited all index keys. So it doesn't scale well for very large directories. So improve on this by caching created directories using a lru cache, and limiting its size to 64 entries, which results in using at most 4096 bytes of memory. The caching is optinal, if we fail to allocate memory, we just proceed as before and use the existing slower path. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 41 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index b31af939bea8..bc232eb60e68 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -117,6 +117,14 @@ struct backref_cache_entry { /* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ static_assert(offsetof(struct backref_cache_entry, entry) == 0); +/* + * Max number of entries in the cache that stores directories that were already + * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 40 bytes, but + * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). + */ +#define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -288,6 +296,8 @@ struct send_ctx { struct btrfs_lru_cache backref_cache; u64 backref_cache_last_reloc_trans; + + struct btrfs_lru_cache dir_created_cache; }; struct pending_dir_move { @@ -2936,6 +2946,22 @@ static int send_create_inode(struct send_ctx *sctx, u64 ino) return ret; } +static void cache_dir_created(struct send_ctx *sctx, u64 dir) +{ + struct btrfs_lru_cache_entry *entry; + int ret; + + /* Caching is optional, ignore any failures. */ + entry = kmalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return; + + entry->key = dir; + ret = btrfs_lru_cache_store(&sctx->dir_created_cache, entry, GFP_KERNEL); + if (ret < 0) + kfree(entry); +} + /* * We need some special handling for inodes that get processed before the parent * directory got created. See process_recorded_refs for details. @@ -2951,6 +2977,9 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) struct btrfs_key di_key; struct btrfs_dir_item *di; + if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir)) + return 1; + path = alloc_path_for_send(); if (!path) return -ENOMEM; @@ -2974,6 +3003,7 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) if (di_key.type != BTRFS_ROOT_ITEM_KEY && di_key.objectid < sctx->send_progress) { ret = 1; + cache_dir_created(sctx, dir); break; } } @@ -3003,7 +3033,12 @@ static int send_create_inode_if_needed(struct send_ctx *sctx) return 0; } - return send_create_inode(sctx, sctx->cur_ino); + ret = send_create_inode(sctx, sctx->cur_ino); + + if (ret == 0 && S_ISDIR(sctx->cur_inode_mode)) + cache_dir_created(sctx, sctx->cur_ino); + + return ret; } struct recorded_ref { @@ -4401,6 +4436,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) ret = send_create_inode(sctx, cur->dir); if (ret < 0) goto out; + cache_dir_created(sctx, cur->dir); } } @@ -8109,6 +8145,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->name_cache_list); btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); + btrfs_lru_cache_init(&sctx->dir_created_cache, + SEND_MAX_DIR_CREATED_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8373,6 +8411,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) close_current_inode(sctx); btrfs_lru_cache_clear(&sctx->backref_cache); + btrfs_lru_cache_clear(&sctx->dir_created_cache); kfree(sctx); } From patchwork Wed Jan 11 11:36:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50A3AC5479D for ; Wed, 11 Jan 2023 11:39:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238397AbjAKLjk (ORCPT ); Wed, 11 Jan 2023 06:39:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238380AbjAKLjJ (ORCPT ); Wed, 11 Jan 2023 06:39:09 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5905BE3C for ; Wed, 11 Jan 2023 03:36:39 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 41FBCB81BAD for ; Wed, 11 Jan 2023 11:36:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7A017C433D2 for ; Wed, 11 Jan 2023 11:36:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436997; bh=9j5W0lVj6TIU1yHV7NhWjifyNHGaw9QD8f4IlOIMQeE=; h=From:To:Subject:Date:In-Reply-To:References:From; b=M6sfizGH1RdJH9uTUYtDnGeywS3ETEDob4S5njDP9A8FPM2C67jGlOtHvcksCcG1n p+LB5eGzkvM7eaT6SiHytwti4IhLbwmdRvJCyQq9TjYKr+aRHeB565KnVSawm5CKTB NRUn5hi+GSK5qrzQG7ocP0abS5rdSaat/vUdqqGxoW+OqVHX1so6GSuvPp9jzZqkig aDqM/JLcEUybA2PvBXBfX2xbwFouUeTWhNjkypg0d7RsWw8sqGXmrSfJA1hwNy6Q+5 rvLrrIn3udUHnkQ0qP2v7+M83du70/JpOMZR8li8eLqx4uRP+XyGATk7VSciX6dpNH jyG+LlNOKmhJQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 15/19] btrfs: allow a generation number to be associated with lru cache entries Date: Wed, 11 Jan 2023 11:36:16 +0000 Message-Id: <226a658a3260bb4b25b43bc08345a79c8578fe9f.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana This allows an optional generation number to be associated to each entry of the lru cache. Entries with the same key but different generations, are stored in the linked list to which the maple tree points to. This is meant to be used when there's a small number of different generations, so the impact of searching a linked list is negligible. The goal is to get rid of the open coded name cache in the send code (which uses a radix tree and a a similar linked list of values/entries) and use instead the lru cache module. For that particular use case we have at most 2 generations that are associated to each key (inode number): one generation for the send root and another generation for the parent root. The actual migration of the send name cache is done in the next patch in the series. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 12 +++++++----- fs/btrfs/lru_cache.h | 9 ++++++++- fs/btrfs/send.c | 8 +++++--- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index 10c1f9ceb706..d85341946213 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -18,12 +18,13 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) cache->max_size = max_size; } -static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key) +static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key, + u64 gen) { struct btrfs_lru_cache_entry *entry; list_for_each_entry(entry, head, list) - if (entry->key == key) + if (entry->key == key && entry->gen == gen) return entry; return NULL; @@ -34,11 +35,12 @@ static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key * * @cache: The cache. * @key: The key of the entry we are looking for. + * @gen: Generation associated to the key. * * Returns the entry associated with the key or NULL if none found. */ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, - u64 key) + u64 key, u64 gen) { struct list_head *head; struct btrfs_lru_cache_entry *entry; @@ -47,7 +49,7 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac if (!head) return NULL; - entry = match_entry(head, key); + entry = match_entry(head, key, gen); if (entry) list_move_tail(&entry->lru_list, &cache->lru_list); @@ -110,7 +112,7 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, kfree(head); head = mtree_load(&cache->entries, key); ASSERT(head != NULL); - if (match_entry(head, key) != NULL) + if (match_entry(head, key, new_entry->gen) != NULL) return -EEXIST; list_add_tail(&new_entry->list, head); } else if (ret < 0) { diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 368248be42a2..de887d438cfb 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -17,6 +17,13 @@ struct btrfs_lru_cache_entry { struct list_head lru_list; u64 key; + /* + * Optional generation associated to a key. Use 0 if not needed/used. + * Entries with the same key and different generations are stored in a + * linked list, so use this only for cases where there's a small number + * of different generations. + */ + u64 gen; /* * The maple tree uses unsigned long type for the keys, which is 32 bits * on 32 bits systems, and 64 bits on 64 bits systems. So if we want to @@ -47,7 +54,7 @@ static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *ca void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, - u64 key); + u64 key, u64 gen); int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index bc232eb60e68..3966f8ce7e49 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -120,7 +120,7 @@ static_assert(offsetof(struct backref_cache_entry, entry) == 0); /* * Max number of entries in the cache that stores directories that were already * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses - * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 40 bytes, but + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 48 bytes, but * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). */ #define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 @@ -1422,7 +1422,7 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, return false; } - raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key); + raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key, 0); if (!raw_entry) return false; @@ -1455,6 +1455,7 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, return; new_entry->entry.key = leaf_bytenr >> fs_info->sectorsize_bits; + new_entry->entry.gen = 0; new_entry->num_roots = 0; ULIST_ITER_INIT(&uiter); while ((node = ulist_next(root_ids, &uiter)) != NULL) { @@ -2957,6 +2958,7 @@ static void cache_dir_created(struct send_ctx *sctx, u64 dir) return; entry->key = dir; + entry->gen = 0; ret = btrfs_lru_cache_store(&sctx->dir_created_cache, entry, GFP_KERNEL); if (ret < 0) kfree(entry); @@ -2977,7 +2979,7 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) struct btrfs_key di_key; struct btrfs_dir_item *di; - if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir)) + if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir, 0)) return 1; path = alloc_path_for_send(); From patchwork Wed Jan 11 11:36:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2861BC54EBC for ; Wed, 11 Jan 2023 11:39:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238522AbjAKLjj (ORCPT ); Wed, 11 Jan 2023 06:39:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238368AbjAKLjI (ORCPT ); Wed, 11 Jan 2023 06:39:08 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7688BF52 for ; Wed, 11 Jan 2023 03:36:38 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7110A61C4B for ; Wed, 11 Jan 2023 11:36:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5F8D9C433F1 for ; Wed, 11 Jan 2023 11:36:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436997; bh=Au6clPOEmouJABkMQxT/fuYcNhHEI8NAvw2yr78ElsI=; h=From:To:Subject:Date:In-Reply-To:References:From; b=EznKf77DcNv+yJGBcWAPLohLDPWieK4yQT4inrmH17mvriPFg0qnW3UCLt+hbPCDC j48Cdz/iDYeW6FA42/QCX4rds9sTqv4B4B6+ExWzLFbZIb4Bgi7RTGoXpfAmJEVc6a 80tJk/gG9zvlVFfPxYZ6lqVJa7ebexKcDJ13nWOmL9yZJv4jKqJ1JCUfxbfEJP5unx 4hdrl1aEpm2v1nR/NssXmmONDJPyB6EJ96WPvB26ODiOd7xodMaUwNbUOPhCOxNURy 2DKkj2B6idDjrHDA1vW0bhrH4ZvZRury8RgUkzvRoMwADrRWjuWr3m1txXNJ+mQmMi 2gjST/VsgzGHA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 16/19] btrfs: add an api to delete a specific entry from the lru cache Date: Wed, 11 Jan 2023 11:36:17 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana In order to replace the open coded name cache in send with the lru cache, we need an API for the lru cache to delete a specific entry for which we did a previous lookup. This adds the API for it, and a next patch in the series will use it. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 16 ++++++++++++---- fs/btrfs/lru_cache.h | 2 ++ 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index d85341946213..602801235921 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -56,8 +56,16 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac return entry; } -static void delete_entry(struct btrfs_lru_cache *cache, - struct btrfs_lru_cache_entry *entry) +/* + * Remove an entry from the cache. + * + * @cache: The cache to remove from. + * @entry: The entry to remove from the cache. + * + * Note: this also frees the memory used by the entry. + */ +void btrfs_lru_cache_remove(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry) { struct list_head *prev = entry->list.prev; @@ -126,7 +134,7 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, lru_entry = list_first_entry(&cache->lru_list, struct btrfs_lru_cache_entry, lru_list); - delete_entry(cache, lru_entry); + btrfs_lru_cache_remove(cache, lru_entry); } list_add_tail(&new_entry->lru_list, &cache->lru_list); @@ -148,7 +156,7 @@ void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) struct btrfs_lru_cache_entry *tmp; list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) - delete_entry(cache, entry); + btrfs_lru_cache_remove(cache, entry); ASSERT(cache->size == 0); ASSERT(mtree_empty(&cache->entries)); diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index de887d438cfb..57e5bdc57e6d 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -58,6 +58,8 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp); +void btrfs_lru_cache_remove(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry); void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache); #endif /* BTRFS_LRU_CACHE_H */ From patchwork Wed Jan 11 11:36:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096496 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E09A9C677F1 for ; Wed, 11 Jan 2023 11:39:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233197AbjAKLjm (ORCPT ); Wed, 11 Jan 2023 06:39:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232117AbjAKLjK (ORCPT ); Wed, 11 Jan 2023 06:39:10 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A1EADE87 for ; Wed, 11 Jan 2023 03:36:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0C8F4B81B8D for ; Wed, 11 Jan 2023 11:36:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 46217C433EF for ; Wed, 11 Jan 2023 11:36:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436998; bh=L6gWvZUH5RKDx/VFO/phIeW+7F1B8XGRIHaDaiA+Dx4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=C6mEh6nCe1ErnZmUZbGSG2aVktttk5fEBzg0entPhUmDO7eeRofxZLLrp4xG2299F datcrIXPSJwebucAFNyPK8cMiLAQ4+44UnYECjGTyTFBNUrK0KsameLSOzJBXkKota OUCjq1P7mjvLOYnFQVo1zNv2za9QCAzB8D5E4WbOuC/hXz3XwO9Rv+yxIpFUC1Lzzx AFDWJVxB3gVWVytBqBuxmAkxTe1jEaAtDlANbBjJV8MQmgtPqsKtbUGHTH/HqAJegx +HS4igP0bDyD7UVmtOnkuFLeetFRSnB07OC5xlPgsRnHb0483NZq8RhunZ6STFHRDB j7gr22fYjSX9Q== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 17/19] btrfs: send: use the lru cache to implement the name cache Date: Wed, 11 Jan 2023 11:36:18 +0000 Message-Id: <652b384afc372a2a0b1b3338084dd8106ac69db9.1673436276.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The name cache in send is basically a lru cache implemented with a radix tree and linked lists, very similar to the lru cache module which is used for the send backref cache and the cache of previously created directories during a send operation. So remove all the custom caching code for the name cache and make it use the lru cache instead. One particular detail to note is that the current cache behaves a bit differently when it comes to eviction of entries. Namely when after inserting a new name in the cache, if the cache now has 256 entries, we evict the last 128 LRU entries. The lru_cache.{c,h} module behaves a bit differently in that once we reach the cache limit, we evict a single LRU entry. In practice this doesn't make much difference, but it's actually better to evict just one entry instead of half of the entries, as there's always a chance we will need a name stored in one of that last 128 removed entries. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 164 +++++++----------------------------------------- 1 file changed, 24 insertions(+), 140 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 3966f8ce7e49..7d327d977fa0 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -81,8 +81,7 @@ struct clone_root { bool found_ref; }; -#define SEND_CTX_MAX_NAME_CACHE_SIZE 128 -#define SEND_CTX_NAME_CACHE_CLEAN_SIZE (SEND_CTX_MAX_NAME_CACHE_SIZE * 2) +#define SEND_MAX_NAME_CACHE_SIZE 256 /* * Limit the root_ids array of struct backref_cache_entry to 12 elements. @@ -183,9 +182,7 @@ struct send_ctx { struct list_head new_refs; struct list_head deleted_refs; - struct radix_tree_root name_cache; - struct list_head name_cache_list; - int name_cache_size; + struct btrfs_lru_cache name_cache; /* * The inode we are currently processing. It's not NULL only when we @@ -331,18 +328,11 @@ struct orphan_dir_info { }; struct name_cache_entry { - struct list_head list; /* - * radix_tree has only 32bit entries but we need to handle 64bit inums. - * We use the lower 32bit of the 64bit inum to store it in the tree. If - * more then one inum would fall into the same entry, we use radix_list - * to store the additional entries. radix_list is also used to store - * entries where two entries have the same inum but different - * generations. + * The key in the entry is an inode number, and the generation matches + * the inode's generation. */ - struct list_head radix_list; - u64 ino; - u64 gen; + struct btrfs_lru_cache_entry entry; u64 parent_ino; u64 parent_gen; int ret; @@ -351,6 +341,9 @@ struct name_cache_entry { char name[]; }; +/* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ +static_assert(offsetof(struct name_cache_entry, entry) == 0); + #define ADVANCE 1 #define ADVANCE_ONLY_NEXT -1 @@ -2261,113 +2254,16 @@ static int did_overwrite_first_ref(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } -/* - * Insert a name cache entry. On 32bit kernels the radix tree index is 32bit, - * so we need to do some special handling in case we have clashes. This function - * takes care of this with the help of name_cache_entry::radix_list. - * In case of error, nce is kfreed. - */ -static int name_cache_insert(struct send_ctx *sctx, - struct name_cache_entry *nce) -{ - int ret = 0; - struct list_head *nce_head; - - nce_head = radix_tree_lookup(&sctx->name_cache, - (unsigned long)nce->ino); - if (!nce_head) { - nce_head = kmalloc(sizeof(*nce_head), GFP_KERNEL); - if (!nce_head) { - kfree(nce); - return -ENOMEM; - } - INIT_LIST_HEAD(nce_head); - - ret = radix_tree_insert(&sctx->name_cache, nce->ino, nce_head); - if (ret < 0) { - kfree(nce_head); - kfree(nce); - return ret; - } - } - list_add_tail(&nce->radix_list, nce_head); - list_add_tail(&nce->list, &sctx->name_cache_list); - sctx->name_cache_size++; - - return ret; -} - -static void name_cache_delete(struct send_ctx *sctx, - struct name_cache_entry *nce) -{ - struct list_head *nce_head; - - nce_head = radix_tree_lookup(&sctx->name_cache, - (unsigned long)nce->ino); - if (!nce_head) { - btrfs_err(sctx->send_root->fs_info, - "name_cache_delete lookup failed ino %llu cache size %d, leaking memory", - nce->ino, sctx->name_cache_size); - } - - list_del(&nce->radix_list); - list_del(&nce->list); - sctx->name_cache_size--; - - /* - * We may not get to the final release of nce_head if the lookup fails - */ - if (nce_head && list_empty(nce_head)) { - radix_tree_delete(&sctx->name_cache, (unsigned long)nce->ino); - kfree(nce_head); - } -} - -static struct name_cache_entry *name_cache_search(struct send_ctx *sctx, - u64 ino, u64 gen) +static inline struct name_cache_entry *name_cache_search(struct send_ctx *sctx, + u64 ino, u64 gen) { - struct list_head *nce_head; - struct name_cache_entry *cur; + struct btrfs_lru_cache_entry *entry; - nce_head = radix_tree_lookup(&sctx->name_cache, (unsigned long)ino); - if (!nce_head) + entry = btrfs_lru_cache_lookup(&sctx->name_cache, ino, gen); + if (!entry) return NULL; - list_for_each_entry(cur, nce_head, radix_list) { - if (cur->ino == ino && cur->gen == gen) - return cur; - } - return NULL; -} - -/* - * Remove some entries from the beginning of name_cache_list. - */ -static void name_cache_clean_unused(struct send_ctx *sctx) -{ - struct name_cache_entry *nce; - - if (sctx->name_cache_size < SEND_CTX_NAME_CACHE_CLEAN_SIZE) - return; - - while (sctx->name_cache_size > SEND_CTX_MAX_NAME_CACHE_SIZE) { - nce = list_entry(sctx->name_cache_list.next, - struct name_cache_entry, list); - name_cache_delete(sctx, nce); - kfree(nce); - } -} - -static void name_cache_free(struct send_ctx *sctx) -{ - struct name_cache_entry *nce; - - while (!list_empty(&sctx->name_cache_list)) { - nce = list_entry(sctx->name_cache_list.next, - struct name_cache_entry, list); - name_cache_delete(sctx, nce); - kfree(nce); - } + return container_of(entry, struct name_cache_entry, entry); } /* @@ -2386,7 +2282,7 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, { int ret; int nce_ret; - struct name_cache_entry *nce = NULL; + struct name_cache_entry *nce; /* * First check if we already did a call to this function with the same @@ -2396,17 +2292,9 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, nce = name_cache_search(sctx, ino, gen); if (nce) { if (ino < sctx->send_progress && nce->need_later_update) { - name_cache_delete(sctx, nce); - kfree(nce); + btrfs_lru_cache_remove(&sctx->name_cache, &nce->entry); nce = NULL; } else { - /* - * Removes the entry from the list and adds it back to - * the end. This marks the entry as recently used so - * that name_cache_clean_unused does not remove it. - */ - list_move_tail(&nce->list, &sctx->name_cache_list); - *parent_ino = nce->parent_ino; *parent_gen = nce->parent_gen; ret = fs_path_add(dest, nce->name, nce->name_len); @@ -2473,8 +2361,8 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, goto out; } - nce->ino = ino; - nce->gen = gen; + nce->entry.key = ino; + nce->entry.gen = gen; nce->parent_ino = *parent_ino; nce->parent_gen = *parent_gen; nce->name_len = fs_path_len(dest); @@ -2486,10 +2374,9 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, else nce->need_later_update = 1; - nce_ret = name_cache_insert(sctx, nce); + nce_ret = btrfs_lru_cache_store(&sctx->name_cache, &nce->entry, GFP_KERNEL); if (nce_ret < 0) ret = nce_ret; - name_cache_clean_unused(sctx); out: return ret; @@ -4356,10 +4243,9 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * and get instead the orphan name. */ nce = name_cache_search(sctx, ow_inode, ow_gen); - if (nce) { - name_cache_delete(sctx, nce); - kfree(nce); - } + if (nce) + btrfs_lru_cache_remove(&sctx->name_cache, + &nce->entry); /* * ow_inode might currently be an ancestor of @@ -8143,9 +8029,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->new_refs); INIT_LIST_HEAD(&sctx->deleted_refs); - INIT_RADIX_TREE(&sctx->name_cache, GFP_KERNEL); - INIT_LIST_HEAD(&sctx->name_cache_list); + btrfs_lru_cache_init(&sctx->name_cache, SEND_MAX_NAME_CACHE_SIZE); btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); btrfs_lru_cache_init(&sctx->dir_created_cache, SEND_MAX_DIR_CREATED_CACHE_SIZE); @@ -8408,10 +8293,9 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) kvfree(sctx->send_buf); kvfree(sctx->verity_descriptor); - name_cache_free(sctx); - close_current_inode(sctx); + btrfs_lru_cache_clear(&sctx->name_cache); btrfs_lru_cache_clear(&sctx->backref_cache); btrfs_lru_cache_clear(&sctx->dir_created_cache); From patchwork Wed Jan 11 11:36:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096495 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39783C46467 for ; Wed, 11 Jan 2023 11:39:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232760AbjAKLjl (ORCPT ); Wed, 11 Jan 2023 06:39:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231806AbjAKLjK (ORCPT ); Wed, 11 Jan 2023 06:39:10 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38A428FD8 for ; Wed, 11 Jan 2023 03:36:42 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E080AB81BB6 for ; Wed, 11 Jan 2023 11:36:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2BD07C433D2 for ; Wed, 11 Jan 2023 11:36:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673436999; bh=1fX028zxFCx5/VXZF94TGwUkwBcK5Mkvrq51Ihk4WeM=; h=From:To:Subject:Date:In-Reply-To:References:From; b=W+IiJMDpustMHilj/1L4E+GOLTYJPEr0t7ZCMgtpU0cOKCqrEhvYoUGQHTlktDI/U 25rlzAgiU9/0euCdxv5nsTGhFPPIrTZkKWE5hXkJ7b2mkfhpNx2o0+oQ5VCMgBnkni tDa5Hr2Toa1R13mW/6ibSIg/aWF8nLOHGPDTtk8VCKLc3+2C6mRUiti5tFPgtrXvZy hxg7Tr/pVgwrpWnVMSdG11gVdHXwaAsDZo/iYUdXhgH8i/BBw0Gx8ChqJbnAIvJ6fy h+ur0R/Fmhy1t2bRa+Sq5IpEuaacrP/YT9nWRLC7OqVUdFLJXgDy7IETjbsSYx5Mnw SQpEJt/IhG8xQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 18/19] btrfs: send: update size of roots array for backref cache entries Date: Wed, 11 Jan 2023 11:36:19 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Currently we limit the size of the roots array, for backref cache entries, to 12 elements. This is because that number is enough for most cases and to make the backref cache entry size to be exactly 128 bytes, so that memory is allocated from the kmalloc-128 slab and no space is wasted. However recent changes in the series refactored the backref cache to be more generic and allow it to be reused for other purposes, which resulted in increasing the size of the embedded structure btrfs_lru_cache_entry in order to allow for supporting inode numbers as keys on 32 bits system and allow multiple generations per key. This resulted in increasing the size of struct backref_cache_entry from 128 bytes to 152 bytes. Since the cache entries are allocated with kmalloc(), it means we end up using the slab kmalloc-192, so we end up wasting 40 bytes of memory. So bump the size of the roots array from 12 elements to 17 elements, so we end up using 192 bytes for each backref cache entry. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 7d327d977fa0..ae2d462f647e 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -84,19 +84,20 @@ struct clone_root { #define SEND_MAX_NAME_CACHE_SIZE 256 /* - * Limit the root_ids array of struct backref_cache_entry to 12 elements. - * This makes the size of a cache entry to be exactly 128 bytes on x86_64. + * Limit the root_ids array of struct backref_cache_entry to 17 elements. + * This makes the size of a cache entry to be exactly 192 bytes on x86_64, which + * can be satisfied from the kmalloc-192 slab, without wasting any space. * The most common case is to have a single root for cloning, which corresponds - * to the send root. Having the user specify more than 11 clone roots is not + * to the send root. Having the user specify more than 16 clone roots is not * common, and in such rare cases we simply don't use caching if the number of - * cloning roots that lead down to a leaf is more than 12. + * cloning roots that lead down to a leaf is more than 17. */ -#define SEND_MAX_BACKREF_CACHE_ROOTS 12 +#define SEND_MAX_BACKREF_CACHE_ROOTS 17 /* * Max number of entries in the cache. - * With SEND_MAX_BACKREF_CACHE_ROOTS as 12, the size in bytes, excluding - * maple tree's internal nodes, is 16K. + * With SEND_MAX_BACKREF_CACHE_ROOTS as 17, the size in bytes, excluding + * maple tree's internal nodes, is 24K. */ #define SEND_MAX_BACKREF_CACHE_SIZE 128 From patchwork Wed Jan 11 11:36:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13096497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A13BC46467 for ; Wed, 11 Jan 2023 11:39:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234048AbjAKLjn (ORCPT ); Wed, 11 Jan 2023 06:39:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37888 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238312AbjAKLjK (ORCPT ); Wed, 11 Jan 2023 06:39:10 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7357FDF7C for ; Wed, 11 Jan 2023 03:36:43 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CC42EB81AD3 for ; Wed, 11 Jan 2023 11:36:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 12216C433F0 for ; Wed, 11 Jan 2023 11:36:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673437000; bh=m+hZcsGEn31EVG6bDh9ikmFHCQV7nmQL3FotuOQu/d4=; h=From:To:Subject:Date:In-Reply-To:References:From; b=tSE3V5+LBOP1iZkKmJxNF/i2/R0PvvIjxRfBBpjxYHJ/L7Xt/gvhaPEfhEvmNKQPp NrppSG6gBv+K5l7M5QohuiHNcICiQt5TH1Mmh5D9GQXvVMgaIgpB15rhznNpGHBGr5 D/uBmFqjNCdphBIf2djJyCG4Rz0UywVVaWVAZ4zItZAtKhPSUUEpsPSaguMhtR3fyB IyBYyG7wULkFteuDzMIdGXUp38eS286FQ60sduLs3k9fW0IPlPmnERedHjkDmP5oat fz8M4MwA3D7lEWqZvPYBuiInQNe1LSIMayd4Ufyf2ASCLJUe63G2DsrCZDyanbXOyI MlGjCqSzQvznQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH 19/19] btrfs: send: cache utimes operations for directories if possible Date: Wed, 11 Jan 2023 11:36:20 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Whenever we add or remove an entry to a directory, we issue an utimes command for the directory. If we add 1000 entries to a directory (create 1000 files under it or move 1000 files to it), then we issue the same utimes command 1000 times, which increases the send stream size, results in more pipe IO, one search in the send b+tree, allocating one path for the search, etc, as well as making the receiver do a system call for each duplicated utimes command. We also issue an utimes command when we create a new directory, but later we might add entries to it corresponding to inodes with an higher inode number, so it's pointless to issue the utimes command before we create the last inode under the directory. So use a lru cache to track directories for which we must send a utimes command. When we need to remove an entry from the cache, we issue the utimes command for the respective directory. When finishing the send operation, we go over each cache element and issue the respective utimes command. Finally the caching is entirely optional, just a performance optimization, meaning that if we fail to cache (due to memory allocation failure), we issue the utimes command right away, that is, we fallback to the previous, unoptimized, behaviour. This patch belongs to a patchset comprised of the following patches: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: use MT_FLAGS_LOCK_EXTERN for the backref cache maple tree btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible The following test was run before and after applying the whole patchset, and on a non-debug kernel (Debian's default kernel config): #!/bin/bash MNT=/mnt/sdi DEV=/dev/sdi mkfs.btrfs -f $DEV > /dev/null mount $DEV $MNT mkdir $MNT/A for ((i = 1; i <= 20000; i++)); do echo -n > $MNT/A/file_$i done btrfs subvolume snapshot -r $MNT $MNT/snap1 mkdir $MNT/B for ((i = 20000; i <= 40000; i++)); do echo -n > $MNT/B/file_$i done mv $MNT/A/file_* $MNT/B/ btrfs subvolume snapshot -r $MNT $MNT/snap2 start=$(date +%s%N) btrfs send -p $MNT/snap1 $MNT/snap2 > /dev/null end=$(date +%s%N) dur=$(( (end - start) / 1000000 )) echo "Incremental send took $dur milliseconds" umount $MNT Before the whole patchset: 18408 milliseconds After the whole patchset: 1942 milliseconds (9.5x speedup) Using 60000 files instead of 40000: Before the whole patchset: 39764 milliseconds After the whole patchset: 3076 milliseconds (12.9x speedup) Using 20000 files instead of 40000: Before the whole patchset: 5072 milliseconds After the whole patchset: 916 milliseconds (5.5x speedup) Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.h | 15 ++++++++ fs/btrfs/send.c | 81 +++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 91 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 57e5bdc57e6d..9f3101cc89d5 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -47,11 +47,26 @@ struct btrfs_lru_cache { unsigned int max_size; }; +#define btrfs_lru_cache_for_each_entry_safe(cache, entry, tmp) \ + list_for_each_entry_safe_reverse((entry), (tmp), &(cache)->lru_list, lru_list) + static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *cache) { return cache->size; } +static inline bool btrfs_lru_cache_is_full(const struct btrfs_lru_cache *cache) +{ + return cache->size >= cache->max_size; +} + +static inline struct btrfs_lru_cache_entry *btrfs_lru_cache_lru_entry( + struct btrfs_lru_cache *cache) +{ + return list_first_entry_or_null(&cache->lru_list, + struct btrfs_lru_cache_entry, lru_list); +} + void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, u64 key, u64 gen); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index ae2d462f647e..a271b39c1445 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -125,6 +125,14 @@ static_assert(offsetof(struct backref_cache_entry, entry) == 0); */ #define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 +/* + * Max number of entries in the cache that stores directories that were already + * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 48 bytes, but + * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). + */ +#define SEND_MAX_DIR_UTIMES_CACHE_SIZE 64 + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -296,6 +304,7 @@ struct send_ctx { u64 backref_cache_last_reloc_trans; struct btrfs_lru_cache dir_created_cache; + struct btrfs_lru_cache dir_utimes_cache; }; struct pending_dir_move { @@ -2747,6 +2756,46 @@ static int send_utimes(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } +static int cache_dir_utimes(struct send_ctx *sctx, u64 dir, u64 gen) +{ + struct btrfs_lru_cache_entry *entry; + int ret; + + entry = btrfs_lru_cache_lookup(&sctx->dir_utimes_cache, dir, gen); + if (entry != NULL) + return 0; + + if (btrfs_lru_cache_is_full(&sctx->dir_utimes_cache)) { + struct btrfs_lru_cache_entry *lru; + + lru = btrfs_lru_cache_lru_entry(&sctx->dir_utimes_cache); + ASSERT(lru != NULL); + + ret = send_utimes(sctx, lru->key, lru->gen); + if (ret) + return ret; + + btrfs_lru_cache_remove(&sctx->dir_utimes_cache, lru); + } + + /* Caching is optional, don't fail if we can't allocate memory. */ + entry = kmalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return send_utimes(sctx, dir, gen); + + entry->key = dir; + entry->gen = gen; + + ret = btrfs_lru_cache_store(&sctx->dir_utimes_cache, entry, GFP_KERNEL); + ASSERT(ret != -EEXIST); + if (ret) { + kfree(entry); + return send_utimes(sctx, dir, gen); + } + + return 0; +} + /* * Sends a BTRFS_SEND_C_MKXXX or SYMLINK command to user space. We don't have * a valid path yet because we did not process the refs yet. So, the inode @@ -3540,7 +3589,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) } finish: - ret = send_utimes(sctx, pm->ino, pm->gen); + ret = cache_dir_utimes(sctx, pm->ino, pm->gen); if (ret < 0) goto out; @@ -3560,7 +3609,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) if (ret < 0) goto out; - ret = send_utimes(sctx, cur->dir, cur->dir_gen); + ret = cache_dir_utimes(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; } @@ -4507,8 +4556,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) if (ret == inode_state_did_create || ret == inode_state_no_change) { - /* TODO delayed utimes */ - ret = send_utimes(sctx, cur->dir, cur->dir_gen); + ret = cache_dir_utimes(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; } else if (ret == inode_state_did_delete && @@ -6690,7 +6738,18 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end) * it's moved/renamed, therefore we don't need to do it here. */ sctx->send_progress = sctx->cur_ino + 1; - ret = send_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + + /* + * If the current inode is a non-empty directory, delay issuing + * the utimes command for it, as it's very likely we have inodes + * with an higher number inside it. We want to issue the utimes + * command only after adding all dentries to it. + */ + if (S_ISDIR(sctx->cur_inode_mode) && sctx->cur_inode_size > 0) + ret = cache_dir_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + else + ret = send_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + if (ret < 0) goto out; } @@ -7980,6 +8039,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) int clone_sources_to_rollback = 0; size_t alloc_size; int sort_clone_roots = 0; + struct btrfs_lru_cache_entry *entry; + struct btrfs_lru_cache_entry *tmp; if (!capable(CAP_SYS_ADMIN)) return -EPERM; @@ -8035,6 +8096,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); btrfs_lru_cache_init(&sctx->dir_created_cache, SEND_MAX_DIR_CREATED_CACHE_SIZE); + btrfs_lru_cache_init(&sctx->dir_utimes_cache, + SEND_MAX_DIR_UTIMES_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8215,6 +8278,13 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) if (ret < 0) goto out; + btrfs_lru_cache_for_each_entry_safe(&sctx->dir_utimes_cache, entry, tmp) { + ret = send_utimes(sctx, entry->key, entry->gen); + if (ret < 0) + goto out; + btrfs_lru_cache_remove(&sctx->dir_utimes_cache, entry); + } + if (!(sctx->flags & BTRFS_SEND_FLAG_OMIT_END_CMD)) { ret = begin_cmd(sctx, BTRFS_SEND_C_END); if (ret < 0) @@ -8299,6 +8369,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) btrfs_lru_cache_clear(&sctx->name_cache); btrfs_lru_cache_clear(&sctx->backref_cache); btrfs_lru_cache_clear(&sctx->dir_created_cache); + btrfs_lru_cache_clear(&sctx->dir_utimes_cache); kfree(sctx); }