From patchwork Thu Jan 19 19:39:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108715 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3056C004D4 for ; Thu, 19 Jan 2023 19:39:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230448AbjASTji (ORCPT ); Thu, 19 Jan 2023 14:39:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229544AbjASTjh (ORCPT ); Thu, 19 Jan 2023 14:39:37 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 636E2A5F6 for ; Thu, 19 Jan 2023 11:39:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EF8A861D1B for ; Thu, 19 Jan 2023 19:39:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF54CC433F0 for ; Thu, 19 Jan 2023 19:39:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157175; bh=ufmG8lo6S5LSnMD9B+hszsoBBsJoo88ed/0T4d6Nze8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=lbHYX8AtH4dSF1vVfjsWo6sHro1soIBiZou9PKyqTqbRVt9zRBqj3L2blu5L+x40J 1yNaAtELzWi2WC0NXfLm7UR5Z4iSBIsi9W03pJ8FcHAdkF2caOT14tpRsxl8R2zGOM oCb6TC0ESlpYUAXlps5BouezMJoMOiWiqKXOdlET6JHPitZZZIhwPTqlouqobrr1IT 6oWW5l4S7BQ6aKm14HA+F3CAsMlYfJA92updreyHocPhLTA3eGCLD4WFUp9LA55JB/ FAEY+U6uLFeG4y2LfvxMaNmh0bcWipORAg4tGq4UUCVSafYJce3B9cnzhDf+Y1Vt8Q C0MVtnrGqj4Gw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 01/18] btrfs: send: directly return from did_overwrite_ref() and simplify it Date: Thu, 19 Jan 2023 19:39:13 +0000 Message-Id: <7083b98c7dafbccb344a518c50a594947393e8fe.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana There are no resources to release before did_overwrite_ref() returns, so we don't really need the 'out' label and jumping to it when conditions are met - we can directly return and get rid of the label and jumps. Also we can deal with -ENOENT and other errors in a single if-else logic, as it's more straightforward. This helps the next patch in the series to be more simple as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 43 ++++++++++++++++++------------------------- 1 file changed, 18 insertions(+), 25 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index b90ad6f7219c..59106eb8b114 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2192,48 +2192,44 @@ static int did_overwrite_ref(struct send_ctx *sctx, u64 ino, u64 ino_gen, const char *name, int name_len) { - int ret = 0; + int ret; u64 gen; u64 ow_inode; if (!sctx->parent_root) - goto out; + return 0; ret = is_inode_existent(sctx, dir, dir_gen); if (ret <= 0) - goto out; + return ret; if (dir != BTRFS_FIRST_FREE_OBJECTID) { ret = get_inode_gen(sctx->send_root, dir, &gen); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; + if (gen != dir_gen) - goto out; + return 0; } /* check if the ref was overwritten by another ref */ ret = lookup_dir_item_inode(sctx->send_root, dir, name, name_len, &ow_inode); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { + if (ret == -ENOENT) { /* was never and will never be overwritten */ - ret = 0; - goto out; + return 0; + } else if (ret < 0) { + return ret; } ret = get_inode_gen(sctx->send_root, ow_inode, &gen); if (ret < 0) - goto out; + return ret; - if (ow_inode == ino && gen == ino_gen) { - ret = 0; - goto out; - } + if (ow_inode == ino && gen == ino_gen) + return 0; /* * We know that it is or will be overwritten. Check this now. @@ -2244,12 +2240,9 @@ static int did_overwrite_ref(struct send_ctx *sctx, if ((ow_inode < sctx->send_progress) || (ino != sctx->cur_ino && ow_inode == sctx->cur_ino && gen == sctx->cur_inode_gen)) - ret = 1; - else - ret = 0; + return 1; -out: - return ret; + return 0; } /* From patchwork Thu Jan 19 19:39:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109175 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A474AC678DF for ; Fri, 20 Jan 2023 05:36:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231396AbjATFgx (ORCPT ); Fri, 20 Jan 2023 00:36:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230508AbjATFgh (ORCPT ); Fri, 20 Jan 2023 00:36:37 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE28F6A314 for ; Thu, 19 Jan 2023 21:33:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 869B2B82716 for ; Thu, 19 Jan 2023 19:39:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BFEB9C433D2 for ; Thu, 19 Jan 2023 19:39:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157176; bh=5VnDHugdVresDPNG53FGJP+zhSjtI3mOUxYWQINnPiQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=FWR7CkTxY7hx1MZqRr5RNZrV9AxrAdLpnYFkt0D1WX3t3ircgix97Tmmu+g+6Grzv k8G5p8UHnbsFDTaZYUtj00C8R0eB892Kpl87AU656gJSWxk26kuFsj3mBq7fbyIgNQ 3nklAb85/yee+rsrsV4ZU3Db4EPI8CETkDfmXqCLMTsIGb7FtIHSd9WJ6hGjdUPihm tw8mygcRLJ7G8p0rNYScpjjxHzBfFiGk184KiUXwVk1Bv2SvdkQKaz7+UB8d6fcJBA UudxqVuP7xR4ngeGJxpF+yV4x3zmszp0ub1+Jn2Xt00gpgI4pQRQONOttmPDgpxPRv KEetsf8IJSDng== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 02/18] btrfs: send: avoid unnecessary generation search at did_overwrite_ref() Date: Thu, 19 Jan 2023 19:39:14 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At did_overwrite_ref() we always call get_inode_gen() to find out the generation of the inode 'ow_inode'. However we don't always need to use that generation, and in fact it's very common to not use it, so we end up doing a b+tree search on the send root, allocating a path, etc, for nothing. So improve on this by getting the generation only if we need to use it. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 59106eb8b114..23060eec30de 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2193,8 +2193,8 @@ static int did_overwrite_ref(struct send_ctx *sctx, const char *name, int name_len) { int ret; - u64 gen; u64 ow_inode; + u64 ow_gen = 0; if (!sctx->parent_root) return 0; @@ -2204,6 +2204,8 @@ static int did_overwrite_ref(struct send_ctx *sctx, return ret; if (dir != BTRFS_FIRST_FREE_OBJECTID) { + u64 gen; + ret = get_inode_gen(sctx->send_root, dir, &gen); if (ret == -ENOENT) return 0; @@ -2224,12 +2226,15 @@ static int did_overwrite_ref(struct send_ctx *sctx, return ret; } - ret = get_inode_gen(sctx->send_root, ow_inode, &gen); - if (ret < 0) - return ret; + if (ow_inode == ino) { + ret = get_inode_gen(sctx->send_root, ow_inode, &ow_gen); + if (ret < 0) + return ret; - if (ow_inode == ino && gen == ino_gen) - return 0; + /* It's the same inode, so no overwrite happened. */ + if (ow_gen == ino_gen) + return 0; + } /* * We know that it is or will be overwritten. Check this now. @@ -2237,11 +2242,19 @@ static int did_overwrite_ref(struct send_ctx *sctx, * inode 'ino' to be orphanized, therefore check if ow_inode matches * the current inode being processed. */ - if ((ow_inode < sctx->send_progress) || - (ino != sctx->cur_ino && ow_inode == sctx->cur_ino && - gen == sctx->cur_inode_gen)) + if (ow_inode < sctx->send_progress) return 1; + if (ino != sctx->cur_ino && ow_inode == sctx->cur_ino) { + if (ow_gen == 0) { + ret = get_inode_gen(sctx->send_root, ow_inode, &ow_gen); + if (ret < 0) + return ret; + } + if (ow_gen == sctx->cur_inode_gen) + return 1; + } + return 0; } From patchwork Thu Jan 19 19:39:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109183 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4128DC7112B for ; Fri, 20 Jan 2023 05:37:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231411AbjATFhe (ORCPT ); Fri, 20 Jan 2023 00:37:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230255AbjATFgt (ORCPT ); Fri, 20 Jan 2023 00:36:49 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 696B560C99 for ; Thu, 19 Jan 2023 21:33:25 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 663EAB82715 for ; Thu, 19 Jan 2023 19:39:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A3980C433F0 for ; Thu, 19 Jan 2023 19:39:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157177; bh=JZDTPPOKQWO3LkI1oPetOFxqhj5/jCkznUSbsY2PgEQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=CR51n3+zczHPTRY+co0J+8CsPxgDFq0nUXN/fs8rFq+4uWn6Rqzc1Du+o0hzMiLo4 pCN49Wf3QFMp9PmZaYDNKEi4db2y2frC5DduZuB//HMiW8PGf8Ryt9A6tGpeFvVjs8 TQfs0omPgasbyU5WUBosGvOIb/KwcBbULjdlKMejQ9BPOnj8gINoKs9aQmUSF/Honr Nz6Mgg62kqVGVx5Tjkmwgmbn6Tlx6OwN/v0BG90mh2WYCHImtWD0NNckVW+LT/IEFa sv4sbGxEVaImEdH/m/EE1ec8ZmlAmrn08e7KG80taYXe0LO8DvmujtacYDRPHwtAKE dww1/3GpZDRxA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 03/18] btrfs: send: directly return from will_overwrite_ref() and simplify it Date: Thu, 19 Jan 2023 19:39:15 +0000 Message-Id: <1747b050e9ac81e0ddc845cffdb05c27c733aa18.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana There are no resources to release before will_overwrite_ref() returns, so we don't really need the 'out' label and jumping to it when conditions are met - we can directly return and get rid of the label and jumps. Also we can deal with -ENOENT and other errors in a single if-else logic, as it's more straightforward. This helps the next patch in the series to be more simple as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 38 ++++++++++++++++---------------------- 1 file changed, 16 insertions(+), 22 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 23060eec30de..9be3d7db85d6 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -2119,17 +2119,17 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, const char *name, int name_len, u64 *who_ino, u64 *who_gen, u64 *who_mode) { - int ret = 0; + int ret; u64 gen; u64 other_inode = 0; struct btrfs_inode_info info; if (!sctx->parent_root) - goto out; + return 0; ret = is_inode_existent(sctx, dir, dir_gen); if (ret <= 0) - goto out; + return 0; /* * If we have a parent root we need to verify that the parent dir was @@ -2138,24 +2138,21 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, */ if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID) { ret = get_inode_gen(sctx->parent_root, dir, &gen); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; + if (gen != dir_gen) - goto out; + return 0; } ret = lookup_dir_item_inode(sctx->parent_root, dir, name, name_len, &other_inode); - if (ret < 0 && ret != -ENOENT) - goto out; - if (ret) { - ret = 0; - goto out; - } + if (ret == -ENOENT) + return 0; + else if (ret < 0) + return ret; /* * Check if the overwritten ref was already processed. If yes, the ref @@ -2166,18 +2163,15 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, is_waiting_for_move(sctx, other_inode)) { ret = get_inode_info(sctx->parent_root, other_inode, &info); if (ret < 0) - goto out; + return ret; - ret = 1; *who_ino = other_inode; *who_gen = info.gen; *who_mode = info.mode; - } else { - ret = 0; + return 1; } -out: - return ret; + return 0; } /* From patchwork Thu Jan 19 19:39:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108716 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A52D1C004D4 for ; Thu, 19 Jan 2023 19:39:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230456AbjASTjm (ORCPT ); Thu, 19 Jan 2023 14:39:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229544AbjASTjk (ORCPT ); Thu, 19 Jan 2023 14:39:40 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10AC3A5F6 for ; Thu, 19 Jan 2023 11:39:39 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9192D61D23 for ; Thu, 19 Jan 2023 19:39:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85097C433F1 for ; Thu, 19 Jan 2023 19:39:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157178; bh=T+lc0KA81etOhdE7cco7HW8j+jNYpTp/L2EyRwYi+Eg=; h=From:To:Subject:Date:In-Reply-To:References:From; b=CObBgkzTOcIO2/i9xjHynncdRLZIzDIG8g+WHLaUa77AtyHxzPvLYHMAQrVe62a7O oyE1mbFyeF12hPEsyKFhw3aNxOkmB5w5ARH+ERqx188uOpvzwq247MQ8kd2Ur1OghY 0L3ymSpWOjFNPEkn4BpEA5Odo+V07ges7iDA+reGmFUZO6NbxiLcbNfVPp+2paNz4g Y5F7x2lJJf52+jewRw4pXQLiL8RkpXfMSoyFbGeg6DaLoV0O71oikToEoIDgScLp8/ S/EJPZkXQA/+BOStNg2/Y3t1Skl4LXNFycWfGiD0NoLF62soT8qNuSRQp+u23bZ262 lKCUt2oueIPIQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 04/18] btrfs: send: avoid extra b+tree searches when checking reference overrides Date: Thu, 19 Jan 2023 19:39:16 +0000 Message-Id: <05e8d5c0c18cedf2845166cf08dd4b0a7a09e289.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, when processing the new references of an inode (either it's a new inode or an existing one renamed/moved), he will search the b+tree of the send or parent roots in order to find out the inode item of the parent directory and extract its generation. However we are doing that search twice, once with is_inode_existent() -> get_cur_inode_state() and then again at did_overwrite_ref() or will_overwrite_ref(). So avoid that and get the generation at get_cur_inode_state() and then propagate it up to did_overwrite_ref() and will_overwrite_ref(). This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 61 +++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 32 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 9be3d7db85d6..6332add4865c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -1884,7 +1884,8 @@ enum inode_state { inode_state_did_delete, }; -static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) +static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen, + u64 *send_gen, u64 *parent_gen) { int ret; int left_ret; @@ -1898,6 +1899,8 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) goto out; left_ret = (info.nlink == 0) ? -ENOENT : ret; left_gen = info.gen; + if (send_gen) + *send_gen = (left_ret == -ENOENT) ? 0 : info.gen; if (!sctx->parent_root) { right_ret = -ENOENT; @@ -1907,6 +1910,8 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) goto out; right_ret = (info.nlink == 0) ? -ENOENT : ret; right_gen = info.gen; + if (parent_gen) + *parent_gen = (right_ret == -ENOENT) ? 0 : info.gen; } if (!left_ret && !right_ret) { @@ -1951,14 +1956,15 @@ static int get_cur_inode_state(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } -static int is_inode_existent(struct send_ctx *sctx, u64 ino, u64 gen) +static int is_inode_existent(struct send_ctx *sctx, u64 ino, u64 gen, + u64 *send_gen, u64 *parent_gen) { int ret; if (ino == BTRFS_FIRST_FREE_OBJECTID) return 1; - ret = get_cur_inode_state(sctx, ino, gen); + ret = get_cur_inode_state(sctx, ino, gen, send_gen, parent_gen); if (ret < 0) goto out; @@ -2120,14 +2126,14 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, u64 *who_ino, u64 *who_gen, u64 *who_mode) { int ret; - u64 gen; + u64 parent_root_dir_gen; u64 other_inode = 0; struct btrfs_inode_info info; if (!sctx->parent_root) return 0; - ret = is_inode_existent(sctx, dir, dir_gen); + ret = is_inode_existent(sctx, dir, dir_gen, NULL, &parent_root_dir_gen); if (ret <= 0) return 0; @@ -2135,17 +2141,13 @@ static int will_overwrite_ref(struct send_ctx *sctx, u64 dir, u64 dir_gen, * If we have a parent root we need to verify that the parent dir was * not deleted and then re-created, if it was then we have no overwrite * and we can just unlink this entry. + * + * @parent_root_dir_gen was set to 0 if the inode does not exists in the + * parent root. */ - if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID) { - ret = get_inode_gen(sctx->parent_root, dir, &gen); - if (ret == -ENOENT) - return 0; - else if (ret < 0) - return ret; - - if (gen != dir_gen) - return 0; - } + if (sctx->parent_root && dir != BTRFS_FIRST_FREE_OBJECTID && + parent_root_dir_gen != dir_gen) + return 0; ret = lookup_dir_item_inode(sctx->parent_root, dir, name, name_len, &other_inode); @@ -2189,26 +2191,21 @@ static int did_overwrite_ref(struct send_ctx *sctx, int ret; u64 ow_inode; u64 ow_gen = 0; + u64 send_root_dir_gen; if (!sctx->parent_root) return 0; - ret = is_inode_existent(sctx, dir, dir_gen); + ret = is_inode_existent(sctx, dir, dir_gen, &send_root_dir_gen, NULL); if (ret <= 0) return ret; - if (dir != BTRFS_FIRST_FREE_OBJECTID) { - u64 gen; - - ret = get_inode_gen(sctx->send_root, dir, &gen); - if (ret == -ENOENT) - return 0; - else if (ret < 0) - return ret; - - if (gen != dir_gen) - return 0; - } + /* + * @send_root_dir_gen was set to 0 if the inode does not exists in the + * send root. + */ + if (dir != BTRFS_FIRST_FREE_OBJECTID && send_root_dir_gen != dir_gen) + return 0; /* check if the ref was overwritten by another ref */ ret = lookup_dir_item_inode(sctx->send_root, dir, name, name_len, @@ -2444,7 +2441,7 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, * This should only happen for the parent dir that we determine in * record_new_ref_if_needed(). */ - ret = is_inode_existent(sctx, ino, gen); + ret = is_inode_existent(sctx, ino, gen, NULL, NULL); if (ret < 0) goto out; @@ -4240,7 +4237,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * "testdir_2". */ list_for_each_entry(cur, &sctx->new_refs, list) { - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; if (ret == inode_state_will_create) @@ -4356,7 +4353,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * parent directory out of order. But we need to check if this * did already happen before due to other refs in the same dir. */ - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; if (ret == inode_state_will_create) { @@ -4562,7 +4559,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) if (cur->dir > sctx->cur_ino) continue; - ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen); + ret = get_cur_inode_state(sctx, cur->dir, cur->dir_gen, NULL, NULL); if (ret < 0) goto out; From patchwork Thu Jan 19 19:39:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C412BC46467 for ; Thu, 19 Jan 2023 19:39:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230457AbjASTjn (ORCPT ); Thu, 19 Jan 2023 14:39:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230103AbjASTjl (ORCPT ); Thu, 19 Jan 2023 14:39:41 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D8CA3A5FD for ; Thu, 19 Jan 2023 11:39:39 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7551E61D1B for ; Thu, 19 Jan 2023 19:39:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 65C89C433D2 for ; Thu, 19 Jan 2023 19:39:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157178; bh=JZ0IvlubeXdciN/LPPq5UdjDR7dEa2SVmoHgTfZ/8xU=; h=From:To:Subject:Date:In-Reply-To:References:From; b=L/41wRPDvak/dZzbkFy/kMYkM0PbnuiPoU4HqM9dRPge48xiozqMbMBIEHpIzTq3N EIReMc+11xgmGLYSdhrXTWtmN/IkduJzx4PsczquHlSJjExgu7w28lwK7n9JtH3Xpp SshpIB0ULm5a5njb2dfbUyWui58HUzFanzWGFEyo9LqupYV3soY9hEOkoBosid81vJ 7h9ieAdBecdWIzzZrrracTvOpjak27ci2S8BWuvlrmA5gcz7x75TblFaQuS1R7TR/b N0eqA7UjnGqlaYzOBEcaPWuO8jyA6i6aj9LOSiRvCV1sZzOW1XkH476OIkocAaTOSh Mcbwx3UrrZFVA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 05/18] btrfs: send: remove send_progress argument from can_rmdir() Date: Thu, 19 Jan 2023 19:39:17 +0000 Message-Id: <09271a55f41a848b2fec3e167beb1905c6f3de4e.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana All callers of can_rmdir() pass sctx->cur_ino as the value for the send_progress argument, so remove the argument and directly use sctx->cur_ino. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 6332add4865c..32dd88ed629a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3210,8 +3210,7 @@ static void free_orphan_dir_info(struct send_ctx *sctx, * We check this by iterating all dir items and checking if the inode behind * the dir item was already processed. */ -static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen, - u64 send_progress) +static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) { int ret = 0; int iter_ret = 0; @@ -3267,7 +3266,7 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen, goto out; } - if (loc.objectid > send_progress) { + if (loc.objectid > sctx->cur_ino) { odi = add_orphan_dir_info(sctx, dir, dir_gen); if (IS_ERR(odi)) { ret = PTR_ERR(odi); @@ -3574,7 +3573,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) } gen = odi->gen; - ret = can_rmdir(sctx, rmdir_ino, gen, sctx->cur_ino); + ret = can_rmdir(sctx, rmdir_ino, gen); if (ret < 0) goto out; if (!ret) @@ -4465,8 +4464,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * later, we do this check again and rmdir it then if possible. * See the use of check_dirs for more details. */ - ret = can_rmdir(sctx, sctx->cur_ino, sctx->cur_inode_gen, - sctx->cur_ino); + ret = can_rmdir(sctx, sctx->cur_ino, sctx->cur_inode_gen); if (ret < 0) goto out; if (ret) { @@ -4571,8 +4569,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) goto out; } else if (ret == inode_state_did_delete && cur->dir != last_dir_ino_rm) { - ret = can_rmdir(sctx, cur->dir, cur->dir_gen, - sctx->cur_ino); + ret = can_rmdir(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; if (ret) { From patchwork Thu Jan 19 19:39:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108718 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE1DFC6379F for ; Thu, 19 Jan 2023 19:39:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230463AbjASTjo (ORCPT ); Thu, 19 Jan 2023 14:39:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50932 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230451AbjASTjl (ORCPT ); Thu, 19 Jan 2023 14:39:41 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE89F951B4 for ; Thu, 19 Jan 2023 11:39:40 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4FE8461D57 for ; Thu, 19 Jan 2023 19:39:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 460F7C433F0 for ; Thu, 19 Jan 2023 19:39:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157179; bh=Jwof/HOXT1lKIQJf1gyF4f3kf2bv29gwLrIBHzj9FfU=; h=From:To:Subject:Date:In-Reply-To:References:From; b=pYWQJvNxyCETlYOpteVgqz7ikVOjatiSTEV42ya8QD7HfFPWqi9Ibs6S/eazwdcoK odNLvtVw3uogX5gfTTDaxViXrXPc/+ssfFyZ4EGNgKHTRHLQKRlkkzeW4YLAMEo/cK EmTuQjB6yC8eoAnWS1k6LHShs9zF37jt2/pBrY6vL3p7Mm5N5ekico5GEY7QCROOWc 35Ipw57InVxSMcQ5o540c/I+HW6lCwBwtnZRwzXz12o9yz8pG9vZHDKfBrVG6LyhRD cbSER/Ega7Dik9zhTN35RfdikaNLw7mG5iN5kyXWrWQXZ8ibWR30d7HsnHmmxcnKNs MsFEY8T7fJiDw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 06/18] btrfs: send: avoid duplicated orphan dir allocation and initialization Date: Thu, 19 Jan 2023 19:39:18 +0000 Message-Id: <32617d6f2caa097cdd66227a5701de21c51c8c2d.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At can_rmdir() we are allocating and initializing an orphan dir object twice. This can be deduplicated outside of the loop that iterates over the dir index keys. So deduplicate that code, even because other patch in the series will need to add more initializion code and another one will add one more condition. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 32dd88ed629a..f7d533c364b1 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3253,13 +3253,6 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) dm = get_waiting_dir_move(sctx, loc.objectid); if (dm) { - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) { - ret = PTR_ERR(odi); - goto out; - } - odi->gen = dir_gen; - odi->last_dir_index_offset = found_key.offset; dm->rmdir_ino = dir; dm->rmdir_gen = dir_gen; ret = 0; @@ -3267,13 +3260,6 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) } if (loc.objectid > sctx->cur_ino) { - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) { - ret = PTR_ERR(odi); - goto out; - } - odi->gen = dir_gen; - odi->last_dir_index_offset = found_key.offset; ret = 0; goto out; } @@ -3288,7 +3274,18 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) out: btrfs_free_path(path); - return ret; + + if (ret) + return ret; + + odi = add_orphan_dir_info(sctx, dir, dir_gen); + if (IS_ERR(odi)) + return PTR_ERR(odi); + + odi->gen = dir_gen; + odi->last_dir_index_offset = found_key.offset; + + return 0; } static int is_waiting_for_move(struct send_ctx *sctx, u64 ino) From patchwork Thu Jan 19 19:39:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13F11C004D4 for ; Thu, 19 Jan 2023 19:39:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230472AbjASTjp (ORCPT ); Thu, 19 Jan 2023 14:39:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50936 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230455AbjASTjm (ORCPT ); Thu, 19 Jan 2023 14:39:42 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95AEE94300 for ; Thu, 19 Jan 2023 11:39:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 325C861D5C for ; Thu, 19 Jan 2023 19:39:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 268EBC433D2 for ; Thu, 19 Jan 2023 19:39:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157180; bh=B0BGda+QDkxmYfExAf7G5qYUNM//hTWB1tjAXwQFjBY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=ZdnhmC3uYg7dB8rURDvhLD3067l8KWGHskFsnc4RkjP0SuC9sQ2BFLrintEQlxxM3 auTkfctdc9dSU8PwsPVbllzVUGPjeq4iPd+EOeMvLj9jAlE6wCmZztMlaRALxp73xM zSEdzMMzibmFW1uRayIv1I0qP1lgONBjzcO32cd8qpb9k3WzY+8W+0jevxRvp3Hd+A z8qkXjUFAF0nlQUo3eiW+4nb/KVUHTBXPXiNnNCBBTw7br/lSNVKv2QMIeGF9XdSvy eopWgnzzywkOpphbMgS0QnnYt5ihwxNRxgfnz99wvKLsjWJaLcDgMYLkjvQ60TyZXB g5NZIiXqmdNgA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 07/18] btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() Date: Thu, 19 Jan 2023 19:39:19 +0000 Message-Id: <4ee5e1fd893a126e9ae6f43959a8abc03be8188c.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana At can_rmdir() we start by searching the orphan dirs rbtree for an orphan dir object for the target directory. Later when iterating over the dir index keys, if we find that any dir entry points to inode for which there is a pending dir move or the inode was not yet processed, we exit because we can't remove the directory yet. However we end up always calling add_orphan_dir_info(), which will iterate again the rbtree and if there is already an orphan dir object (created by the first call to can_rmdir()), it returns the existing object. This is unnecessary work because in case there is already an existing orphan dir object, we got a reference to it at the start of can_rmdir(). So skip the call to add_orphan_dir_info() if we already have a reference for an orphan dir object. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index f7d533c364b1..bc57fa8a6bde 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -3278,11 +3278,14 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) if (ret) return ret; - odi = add_orphan_dir_info(sctx, dir, dir_gen); - if (IS_ERR(odi)) - return PTR_ERR(odi); + if (!odi) { + odi = add_orphan_dir_info(sctx, dir, dir_gen); + if (IS_ERR(odi)) + return PTR_ERR(odi); + + odi->gen = dir_gen; + } - odi->gen = dir_gen; odi->last_dir_index_offset = found_key.offset; return 0; From patchwork Thu Jan 19 19:39:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109177 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ABB9BC678DC for ; Fri, 20 Jan 2023 05:37:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231172AbjATFhB (ORCPT ); Fri, 20 Jan 2023 00:37:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231156AbjATFgn (ORCPT ); Fri, 20 Jan 2023 00:36:43 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0DBD853561 for ; Thu, 19 Jan 2023 21:33:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DB99AB82718 for ; Thu, 19 Jan 2023 19:39:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 07900C433EF for ; Thu, 19 Jan 2023 19:39:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157181; bh=snGufMPm/cb5kRgMA0SO15pGrpkBwyPz1qeFOBbX5o8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=RytAhhSTWqUIw0YVpAqsc1IrcNIDrNm1yzsbplH98DUv1n5l0GowjfkM8Y2dyYJrR cTctrr5gly1RQTi/oKddDzYVvIw2ztqbPoTQEnVA/lGZxmOeCueCjzrXFjPOwzxbsY aWo/oLPC4UkF4FNBSKiFP9mAWkMNqXVS2T3do4n9pokElL0HI1qqBXvOtAVg8iQOgH 6w1uHx2YI8S5vNS+E3tji2aiLb5svZ1NkLUPqZM4lkClywWLkOex28TX7hkM3sUWJ+ qVBgzd0rd2E4zV0bL0xAzim0qgKysr0nq0yqA5MRvbC/tCWGzesrsdMcEW9oMz4++y Dvbcpokr1cSpg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 08/18] btrfs: send: reduce searches on parent root when checking if dir can be removed Date: Thu, 19 Jan 2023 19:39:20 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, every time we remove a reference (dentry) for an inode and the parent directory does not exists anymore in the send root, we go check if we can remove the directory by making a call to can_rmdir(). This helper can only return true (value 1) if all dentries were already removed, and for that it always does a search on the parent root for dir index keys - if it finds any dentry referring to an inode with a number higher then the inode currently being processed, then the directory can not be removed and it must return false (value 0). However that means if a directory that was deleted had 1000 dentries, and each one pointed to an inode with a number higher then the number of the directory's inode, we end up doing 1000 searches on the parent root. Typically files are created in a directory after the directory was created and therefore they get an higher inode number than the directory. It's also common to have the each dentry pointing to an inode with a higher number then the inodes the previous dentries point to, for example when creating a series of files inside a directory, a very common pattern. So improve on that by having the first call to can_rmdir() for a directory to check the number of the inode that the last dentry points to and cache that inode number in the orphan dir structure. Then every subsequent call to can_rmdir() can avoid doing a search on the parent root if the number of the inode currently being processed is smaller than cached inode number at the directory's orphan dir structure. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 65 ++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 59 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index bc57fa8a6bde..cd4aa0eae66c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -321,6 +321,7 @@ struct orphan_dir_info { u64 ino; u64 gen; u64 last_dir_index_offset; + u64 dir_high_seq_ino; }; struct name_cache_entry { @@ -3161,6 +3162,7 @@ static struct orphan_dir_info *add_orphan_dir_info(struct send_ctx *sctx, odi->ino = dir_ino; odi->gen = dir_gen; odi->last_dir_index_offset = 0; + odi->dir_high_seq_ino = 0; rb_link_node(&odi->node, parent, p); rb_insert_color(&odi->node, &sctx->orphan_dirs); @@ -3221,6 +3223,8 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) struct btrfs_key loc; struct btrfs_dir_item *di; struct orphan_dir_info *odi = NULL; + u64 dir_high_seq_ino = 0; + u64 last_dir_index_offset = 0; /* * Don't try to rmdir the top/root subvolume dir. @@ -3228,17 +3232,62 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) if (dir == BTRFS_FIRST_FREE_OBJECTID) return 0; + odi = get_orphan_dir_info(sctx, dir, dir_gen); + if (odi && sctx->cur_ino < odi->dir_high_seq_ino) + return 0; + path = alloc_path_for_send(); if (!path) return -ENOMEM; + if (!odi) { + /* + * Find the inode number associated with the last dir index + * entry. This is very likely the inode with the highest number + * of all inodes that have an entry in the directory. We can + * then use it to avoid future calls to can_rmdir(), when + * processing inodes with a lower number, from having to search + * the parent root b+tree for dir index keys. + */ + key.objectid = dir; + key.type = BTRFS_DIR_INDEX_KEY; + key.offset = (u64)-1; + + ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); + if (ret < 0) { + goto out; + } else if (ret > 0) { + /* Can't happen, the root is never empty. */ + ASSERT(path->slots[0] > 0); + if (WARN_ON(path->slots[0] == 0)) { + ret = -EUCLEAN; + goto out; + } + path->slots[0]--; + } + + btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]); + if (key.objectid != dir || key.type != BTRFS_DIR_INDEX_KEY) { + /* No index keys, dir can be removed. */ + ret = 1; + goto out; + } + + di = btrfs_item_ptr(path->nodes[0], path->slots[0], + struct btrfs_dir_item); + btrfs_dir_item_key_to_cpu(path->nodes[0], di, &loc); + dir_high_seq_ino = loc.objectid; + if (sctx->cur_ino < dir_high_seq_ino) { + ret = 0; + goto out; + } + + btrfs_release_path(path); + } + key.objectid = dir; key.type = BTRFS_DIR_INDEX_KEY; - key.offset = 0; - - odi = get_orphan_dir_info(sctx, dir, dir_gen); - if (odi) - key.offset = odi->last_dir_index_offset; + key.offset = odi ? odi->last_dir_index_offset : 0; btrfs_for_each_slot(root, &key, &found_key, path, iter_ret) { struct waiting_dir_move *dm; @@ -3251,6 +3300,9 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) struct btrfs_dir_item); btrfs_dir_item_key_to_cpu(path->nodes[0], di, &loc); + dir_high_seq_ino = max(dir_high_seq_ino, loc.objectid); + last_dir_index_offset = found_key.offset; + dm = get_waiting_dir_move(sctx, loc.objectid); if (dm) { dm->rmdir_ino = dir; @@ -3286,7 +3338,8 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 dir_gen) odi->gen = dir_gen; } - odi->last_dir_index_offset = found_key.offset; + odi->last_dir_index_offset = last_dir_index_offset; + odi->dir_high_seq_ino = max(odi->dir_high_seq_ino, dir_high_seq_ino); return 0; } From patchwork Thu Jan 19 19:39:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108720 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07D8DC004D4 for ; Thu, 19 Jan 2023 19:39:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230481AbjASTjr (ORCPT ); Thu, 19 Jan 2023 14:39:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230465AbjASTjo (ORCPT ); Thu, 19 Jan 2023 14:39:44 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C4A1A5FD for ; Thu, 19 Jan 2023 11:39:43 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E407D61D1B for ; Thu, 19 Jan 2023 19:39:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DC41BC433F0 for ; Thu, 19 Jan 2023 19:39:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157182; bh=3JVlcVMDynD013hZOe1M/K44O4llr6B7BUBJgoSzHXg=; h=From:To:Subject:Date:In-Reply-To:References:From; b=Wzb/JtAbo+CF7AXMnAPx6Txw3DU6ZtKBX37blCaleH9zN3+0+z+K7WnGfggxV42QZ gJ3qq0nN5xoSsAdgUAKjyeV5+lyxQI36Mu0jRaQZT1FZtbUxSiMbV3fwfW4Rc18n6b Ug3IUlLUs5iFdsOTJynZ2VohEFkRy+NFhYDBLZLWp3t5mVRYVFl3cQTD0X4/Bd+SLO p7V/Mu6dbFaoEOsbqaCLpsZyGWyHQyLVkx+rsio33f7mbXCy10eWEoqvH2bEMn4y/S K8NoD137afTcBmKoaiGGY7VvRtwvKx2XuKnse6QjOZrsjHjeDPjr8eCMGFiu5qOXnL LQz/VF7JmJVvQ== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 09/18] btrfs: send: iterate waiting dir move rbtree only once when processing refs Date: Thu, 19 Jan 2023 19:39:21 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana When processing the new references for an inode, we unnecessarily iterate twice the waiting dir moves rbtree, once with is_waiting_for_move() and if we found an entry in the rbtree, we iterate it again with a call to get_waiting_dir_move(). This is pointless, we can make this simpler and more efficient by calling only get_waiting_dir_move(), so just do that. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index cd4aa0eae66c..20fcf1c0832a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -4335,12 +4335,9 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * the source path when performing its rename * operation. */ - if (is_waiting_for_move(sctx, ow_inode)) { - wdm = get_waiting_dir_move(sctx, - ow_inode); - ASSERT(wdm); + wdm = get_waiting_dir_move(sctx, ow_inode); + if (wdm) wdm->orphanized = true; - } /* * Make sure we clear our orphanized inode's From patchwork Thu Jan 19 19:39:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108722 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E48AC004D4 for ; Thu, 19 Jan 2023 19:40:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230497AbjASTju (ORCPT ); Thu, 19 Jan 2023 14:39:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230477AbjASTjp (ORCPT ); Thu, 19 Jan 2023 14:39:45 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C9C8A5F6 for ; Thu, 19 Jan 2023 11:39:44 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C789E61D23 for ; Thu, 19 Jan 2023 19:39:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BCC3BC433D2 for ; Thu, 19 Jan 2023 19:39:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157183; bh=RN1Gicf4pu/HnXjVjlEdkVEpsQNDVzDuFtgt3WDWocw=; h=From:To:Subject:Date:In-Reply-To:References:From; b=eXFva80h0uNGAFBuF3yCCvp2wkXs2lK2BfJ28D2yHfBoAh3VXhuhuoh3h78pH2Dyt 8uyBCV7xY7VaxPSvavIs3FKrIvvyr9E//CG/scHHEJMSkKXUpeSPPhmvwf0wbWXElI haYdwtco/GAzh5fCeRwGdEScdgr8To3W7OcRWFIMxdsw2DB1u5M3uZM1qlRuyDk+Wi 8hWxqAFusYLmmgmLZf1xGRVenMAwUXupuR9ksxHiDlCg/WHLGxGpbXDiJvolvbzqwx kntzZIPtWFzZcCXS+xE9ck/NtHvJTVMQvHXjHrKi5VtdBnoY1UG4/jbgr23E+m037V XKIYoVndlF7mw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 10/18] btrfs: send: initialize all the red black trees earlier Date: Thu, 19 Jan 2023 19:39:22 +0000 Message-Id: <368a25b99e38f6fe0923d66bfc6a939a91d41da9.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana After we allocate the send context object and before we initialize all the red black trees, we can jump to the 'out' label if some errors happen, and then under the 'out' label we use RB_EMPTY_ROOT() against some of the those trees, which we have not yet initialized. This happens to work out ok because the send context object was initialized to zeroes with kzalloc and the RB_ROOT initializer just happens to have the following definition: #define RB_ROOT (struct rb_root) { NULL, } But it's really neither clean nor a good practice as RB_ROOT is supposed to be opaque and in case it changes or we change those red black trees to some other data structure, it leaves us in a precarious situation. So initialize all the red black trees immediately after allocating the send context and before any jump into the 'out' label. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 20fcf1c0832a..38319d0354d4 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -8142,6 +8142,12 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->backref_cache.lru_list); mt_init(&sctx->backref_cache.entries); + sctx->pending_dir_moves = RB_ROOT; + sctx->waiting_dir_moves = RB_ROOT; + sctx->orphan_dirs = RB_ROOT; + sctx->rbtree_new_refs = RB_ROOT; + sctx->rbtree_deleted_refs = RB_ROOT; + sctx->flags = arg->flags; if (arg->flags & BTRFS_SEND_FLAG_VERSION) { @@ -8207,12 +8213,6 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) goto out; } - sctx->pending_dir_moves = RB_ROOT; - sctx->waiting_dir_moves = RB_ROOT; - sctx->orphan_dirs = RB_ROOT; - sctx->rbtree_new_refs = RB_ROOT; - sctx->rbtree_deleted_refs = RB_ROOT; - sctx->clone_roots = kvcalloc(sizeof(*sctx->clone_roots), arg->clone_sources_count + 1, GFP_KERNEL); From patchwork Thu Jan 19 19:39:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DAD70C46467 for ; Thu, 19 Jan 2023 19:40:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230502AbjASTjv (ORCPT ); Thu, 19 Jan 2023 14:39:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51042 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230491AbjASTjr (ORCPT ); Thu, 19 Jan 2023 14:39:47 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28328A5FD for ; Thu, 19 Jan 2023 11:39:45 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A38AA61D59 for ; Thu, 19 Jan 2023 19:39:44 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9CFE0C433F0 for ; Thu, 19 Jan 2023 19:39:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157184; bh=fPoTMWyYA1ILhMLB+0rjRlkVG8J4spXI8RdtvJ7EAug=; h=From:To:Subject:Date:In-Reply-To:References:From; b=YGnpy05K3/DcCgnbahdUI183+xmYponZB7wcKJWp9wbfWMZaDLlrZhWn1N7L9E7Si rwKreRhZ9QDius2WOmE8nTMrjxYYPnOd8pHfz0fa6kSX1byiHKx5UD8N8XVR/Fk68w Rl7p9ZATyUzpSPOYZkaxTp3ESPifmj+khk/p8M7iurqb+Co94duWCOp9aeyDwgWfha tEPOQorXqCFka+IbTs75AFKecc2fiyy5zVVqWq2C+56ZPYmFzd692ugO4Y1mjozuWq qydZRnBMwi2cZ0pRP8TH5dI+8toux9uoKHvn97vGUmeapkgzDpS+amW9JF8yXjiviO W3Wq7UHcovWfw== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 11/18] btrfs: send: genericize the backref cache to allow it to be reused Date: Thu, 19 Jan 2023 19:39:23 +0000 Message-Id: <7985d2b1ee134dab7f55c1a12f964327ec898ae6.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The backref cache is a cache backed by a maple tree and a linked list to keep track of temporal access to cached entries (the LRU entry always at the head of the list). This type of caching method is going to be useful in other scenarios, so make the cache implementation more generic and move it into its own header and source files. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/Makefile | 3 +- fs/btrfs/lru_cache.c | 97 ++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/lru_cache.h | 44 ++++++++++++++++++++ fs/btrfs/send.c | 80 +++++++++++------------------------- 4 files changed, 167 insertions(+), 57 deletions(-) create mode 100644 fs/btrfs/lru_cache.c create mode 100644 fs/btrfs/lru_cache.h diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile index 460eced3f5bd..90d53209755b 100644 --- a/fs/btrfs/Makefile +++ b/fs/btrfs/Makefile @@ -32,7 +32,8 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \ backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \ uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \ block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \ - subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o + subpage.o tree-mod-log.o extent-io-tree.o fs.o messages.o bio.o \ + lru_cache.o btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c new file mode 100644 index 000000000000..177e7e705363 --- /dev/null +++ b/fs/btrfs/lru_cache.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "lru_cache.h" +#include "messages.h" + +/* + * Initialize a cache object. + * + * @cache: The cache. + * @max_size: Maximum size (number of entries) for the cache. + */ +void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) +{ + INIT_LIST_HEAD(&cache->lru_list); + mt_init(&cache->entries); + cache->size = 0; + cache->max_size = max_size; +} + +/* + * Lookup for an entry in the cache. + * + * @cache: The cache. + * @key: The key of the entry we are looking for. + * + * Returns the entry associated with the key or NULL if none found. + */ +struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, + u64 key) +{ + struct btrfs_lru_cache_entry *entry; + + entry = mtree_load(&cache->entries, key); + if (entry) + list_move_tail(&entry->lru_list, &cache->lru_list); + + return entry; +} + +/* + * Store an entry in the cache. + * + * @cache: The cache. + * @entry: The entry to store. + * + * Returns 0 on success and < 0 on error. + */ +int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *new_entry, + gfp_t gfp) +{ + int ret; + + if (cache->size == cache->max_size) { + struct btrfs_lru_cache_entry *lru_entry; + struct btrfs_lru_cache_entry *mt_entry; + + lru_entry = list_first_entry(&cache->lru_list, + struct btrfs_lru_cache_entry, + lru_list); + mt_entry = mtree_erase(&cache->entries, lru_entry->key); + ASSERT(mt_entry == lru_entry); + list_del(&mt_entry->lru_list); + kfree(mt_entry); + cache->size--; + } + + ret = mtree_insert(&cache->entries, new_entry->key, new_entry, gfp); + if (ret < 0) + return ret; + + list_add_tail(&new_entry->lru_list, &cache->lru_list); + cache->size++; + + return 0; +} + +/* + * Empty a cache. + * + * @cache: The cache to empty. + * + * Removes all entries from the cache. + */ +void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) +{ + struct btrfs_lru_cache_entry *entry; + struct btrfs_lru_cache_entry *tmp; + + list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) + kfree(entry); + + INIT_LIST_HEAD(&cache->lru_list); + mtree_destroy(&cache->entries); + cache->size = 0; +} diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h new file mode 100644 index 000000000000..189be5be0a8d --- /dev/null +++ b/fs/btrfs/lru_cache.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef BTRFS_LRU_CACHE_H +#define BTRFS_LRU_CACHE_H + +#include +#include + +/* + * A cache entry. This is meant to be embedded in a structure of a user of + * this module. Similar to how struct list_head and struct rb_node are used. + * + * Note: it should be embedded as the first element in a struct (offset 0), and + * this module assumes it was allocated with kmalloc(), so it calls kfree() when + * it needs to free an entry. + */ +struct btrfs_lru_cache_entry { + struct list_head lru_list; + u64 key; +}; + +struct btrfs_lru_cache { + struct list_head lru_list; + struct maple_tree entries; + /* Number of entries stored in the cache. */ + unsigned int size; + /* Maximum number of entries the cache can have. */ + unsigned int max_size; +}; + +static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *cache) +{ + return cache->size; +} + +void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); +struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, + u64 key); +int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *new_entry, + gfp_t gfp); +void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache); + +#endif /* BTRFS_LRU_CACHE_H */ diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 38319d0354d4..b31af939bea8 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -32,6 +32,7 @@ #include "file-item.h" #include "ioctl.h" #include "verity.h" +#include "lru_cache.h" /* * Maximum number of references an extent can have in order for us to attempt to @@ -107,15 +108,15 @@ struct clone_root { * x86_64). */ struct backref_cache_entry { - /* List to link to the cache's lru list. */ - struct list_head list; - /* The key for this entry in the cache. */ - u64 key; + struct btrfs_lru_cache_entry entry; u64 root_ids[SEND_MAX_BACKREF_CACHE_ROOTS]; /* Number of valid elements in the root_ids array. */ int num_roots; }; +/* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ +static_assert(offsetof(struct backref_cache_entry, entry) == 0); + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -285,13 +286,8 @@ struct send_ctx { struct rb_root rbtree_new_refs; struct rb_root rbtree_deleted_refs; - struct { - u64 last_reloc_trans; - struct list_head lru_list; - struct maple_tree entries; - /* Number of entries stored in the cache. */ - int size; - } backref_cache; + struct btrfs_lru_cache backref_cache; + u64 backref_cache_last_reloc_trans; }; struct pending_dir_move { @@ -1387,19 +1383,6 @@ static int iterate_backrefs(u64 ino, u64 offset, u64 num_bytes, u64 root_id, return 0; } -static void empty_backref_cache(struct send_ctx *sctx) -{ - struct backref_cache_entry *entry; - struct backref_cache_entry *tmp; - - list_for_each_entry_safe(entry, tmp, &sctx->backref_cache.lru_list, list) - kfree(entry); - - INIT_LIST_HEAD(&sctx->backref_cache.lru_list); - mtree_destroy(&sctx->backref_cache.entries); - sctx->backref_cache.size = 0; -} - static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, const u64 **root_ids_ret, int *root_count_ret) { @@ -1407,9 +1390,10 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, struct send_ctx *sctx = bctx->sctx; struct btrfs_fs_info *fs_info = sctx->send_root->fs_info; const u64 key = leaf_bytenr >> fs_info->sectorsize_bits; + struct btrfs_lru_cache_entry *raw_entry; struct backref_cache_entry *entry; - if (sctx->backref_cache.size == 0) + if (btrfs_lru_cache_size(&sctx->backref_cache) == 0) return false; /* @@ -1423,18 +1407,18 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, * transaction handle or holding fs_info->commit_root_sem, so no need * to take any lock here. */ - if (fs_info->last_reloc_trans > sctx->backref_cache.last_reloc_trans) { - empty_backref_cache(sctx); + if (fs_info->last_reloc_trans > sctx->backref_cache_last_reloc_trans) { + btrfs_lru_cache_clear(&sctx->backref_cache); return false; } - entry = mtree_load(&sctx->backref_cache.entries, key); - if (!entry) + raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key); + if (!raw_entry) return false; + entry = container_of(raw_entry, struct backref_cache_entry, entry); *root_ids_ret = entry->root_ids; *root_count_ret = entry->num_roots; - list_move_tail(&entry->list, &sctx->backref_cache.lru_list); return true; } @@ -1460,7 +1444,7 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, if (!new_entry) return; - new_entry->key = leaf_bytenr >> fs_info->sectorsize_bits; + new_entry->entry.key = leaf_bytenr >> fs_info->sectorsize_bits; new_entry->num_roots = 0; ULIST_ITER_INIT(&uiter); while ((node = ulist_next(root_ids, &uiter)) != NULL) { @@ -1488,23 +1472,12 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, * none of the roots is part of the list of roots from which we are * allowed to clone. Cache the new entry as it's still useful to avoid * backref walking to determine which roots have a path to the leaf. + * + * Also use GFP_NOFS because we're called while holding a transaction + * handle or while holding fs_info->commit_root_sem. */ - - if (sctx->backref_cache.size >= SEND_MAX_BACKREF_CACHE_SIZE) { - struct backref_cache_entry *lru_entry; - struct backref_cache_entry *mt_entry; - - lru_entry = list_first_entry(&sctx->backref_cache.lru_list, - struct backref_cache_entry, list); - mt_entry = mtree_erase(&sctx->backref_cache.entries, lru_entry->key); - ASSERT(mt_entry == lru_entry); - list_del(&mt_entry->list); - kfree(mt_entry); - sctx->backref_cache.size--; - } - - ret = mtree_insert(&sctx->backref_cache.entries, new_entry->key, - new_entry, GFP_NOFS); + ret = btrfs_lru_cache_store(&sctx->backref_cache, &new_entry->entry, + GFP_NOFS); ASSERT(ret == 0 || ret == -ENOMEM); if (ret) { /* Caching is optional, no worries. */ @@ -1512,17 +1485,13 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, return; } - list_add_tail(&new_entry->list, &sctx->backref_cache.lru_list); - /* * We are called from iterate_extent_inodes() while either holding a * transaction handle or holding fs_info->commit_root_sem, so no need * to take any lock here. */ - if (sctx->backref_cache.size == 0) - sctx->backref_cache.last_reloc_trans = fs_info->last_reloc_trans; - - sctx->backref_cache.size++; + if (btrfs_lru_cache_size(&sctx->backref_cache) == 1) + sctx->backref_cache_last_reloc_trans = fs_info->last_reloc_trans; } static int check_extent_item(u64 bytenr, const struct btrfs_extent_item *ei, @@ -8139,8 +8108,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_RADIX_TREE(&sctx->name_cache, GFP_KERNEL); INIT_LIST_HEAD(&sctx->name_cache_list); - INIT_LIST_HEAD(&sctx->backref_cache.lru_list); - mt_init(&sctx->backref_cache.entries); + btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8404,7 +8372,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) close_current_inode(sctx); - empty_backref_cache(sctx); + btrfs_lru_cache_clear(&sctx->backref_cache); kfree(sctx); } From patchwork Thu Jan 19 19:39:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109181 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAE7DC678DD for ; Fri, 20 Jan 2023 05:37:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231342AbjATFhd (ORCPT ); Fri, 20 Jan 2023 00:37:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41020 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230168AbjATFgs (ORCPT ); Fri, 20 Jan 2023 00:36:48 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4C4159B4A for ; Thu, 19 Jan 2023 21:33:24 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3AB03B8271B for ; Thu, 19 Jan 2023 19:39:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7DB7EC433D2 for ; Thu, 19 Jan 2023 19:39:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157185; bh=ecDDAqw5cfInxQMnoLZCSI1nzupUv25AKmgpGtVv4f8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=mlv0+xxflceryloTX3k8x76Bx6WdABryLFom9ohCDk37DKoGAKzPuNFiGulq04aYx v+9ahxkUSgCeR5RxZh5nucLB7RhAqzO8Sxt6G7zU/0FzqxLScSLhWPr0q824JUjcJn nDpsDyhMf8PE7JI9gH9X46Lz9LzI+eEgPpK2QQi7km+6YTcRIuJjqGB/U1/DV3KeTT uHsH4baJm4Urqmxg4Ag/+5kchzPCL+GuIK923GmORQAMyWoMjZ4fX/1S9EaqTu5mbi HiMpXMrlMMzhszNrlIL0ckdDiGhROx8fyStvIVl7vYMsz7KIu3WkqrjPaxDdQceM8l e7v5arf0v+uAA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 12/18] btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems Date: Thu, 19 Jan 2023 19:39:24 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The lru cache is backed by a maple tree, which uses the unsigned long type for keys, and that type has a width of 32 bits on 32 bits systems and a width of 64 bits on 64 bits systems. Currently there is only one user of the lru cache, the send backref cache, which uses a sector number as a key, a logical address right shifted by fs_info->sectorsize_bits, so a 32 bits width is not yet a problem (the same happens with the radix tree we use to track extent buffers, fs_info->buffer_radix). However the next patches in the series will start using the lru cache for cases where inode numbers are the keys, and the inode numbers are always 64 bits, even if we are running on a 32 bits system. So adapt the lru cache to allow multiple values under the same key, by having the maple tree store a head entry that points to a list of entries instead of pointing to a single entry. This is a similar approach to what we currently do for the name cache in send (which uses a radix tree that has indexes with an unsigned long type as well), and will allow later to use the lru cache for the send name cache as well. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 86 ++++++++++++++++++++++++++++++++++++-------- fs/btrfs/lru_cache.h | 12 +++++++ 2 files changed, 83 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index 177e7e705363..96a71bb6a374 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -18,6 +18,17 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) cache->max_size = max_size; } +static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key) +{ + struct btrfs_lru_cache_entry *entry; + + list_for_each_entry(entry, head, list) + if (entry->key == key) + return entry; + + return NULL; +} + /* * Lookup for an entry in the cache. * @@ -29,15 +40,48 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, u64 key) { + struct list_head *head; struct btrfs_lru_cache_entry *entry; - entry = mtree_load(&cache->entries, key); + head = mtree_load(&cache->entries, key); + if (!head) + return NULL; + + entry = match_entry(head, key); if (entry) list_move_tail(&entry->lru_list, &cache->lru_list); return entry; } +static void delete_entry(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry) +{ + struct list_head *prev = entry->list.prev; + + ASSERT(cache->size > 0); + ASSERT(!mtree_empty(&cache->entries)); + + list_del(&entry->list); + list_del(&entry->lru_list); + + if (list_empty(prev)) { + struct list_head *head; + + /* + * If previous element in the list entry->list is now empty, it + * means it's a head entry not pointing to any cached entries, + * so remove it from the maple tree and free it. + */ + head = mtree_erase(&cache->entries, entry->key); + ASSERT(head == prev); + kfree(head); + } + + kfree(entry); + cache->size--; +} + /* * Store an entry in the cache. * @@ -50,26 +94,39 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp) { + const u64 key = new_entry->key; + struct list_head *head; int ret; + head = kmalloc(sizeof(*head), gfp); + if (!head) + return -ENOMEM; + + ret = mtree_insert(&cache->entries, key, head, gfp); + if (ret == 0) { + INIT_LIST_HEAD(head); + list_add_tail(&new_entry->list, head); + } else if (ret == -EEXIST) { + kfree(head); + head = mtree_load(&cache->entries, key); + ASSERT(head != NULL); + if (match_entry(head, key) != NULL) + return -EEXIST; + list_add_tail(&new_entry->list, head); + } else if (ret < 0) { + kfree(head); + return ret; + } + if (cache->size == cache->max_size) { struct btrfs_lru_cache_entry *lru_entry; - struct btrfs_lru_cache_entry *mt_entry; lru_entry = list_first_entry(&cache->lru_list, struct btrfs_lru_cache_entry, lru_list); - mt_entry = mtree_erase(&cache->entries, lru_entry->key); - ASSERT(mt_entry == lru_entry); - list_del(&mt_entry->lru_list); - kfree(mt_entry); - cache->size--; + delete_entry(cache, lru_entry); } - ret = mtree_insert(&cache->entries, new_entry->key, new_entry, gfp); - if (ret < 0) - return ret; - list_add_tail(&new_entry->lru_list, &cache->lru_list); cache->size++; @@ -89,9 +146,8 @@ void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) struct btrfs_lru_cache_entry *tmp; list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) - kfree(entry); + delete_entry(cache, entry); - INIT_LIST_HEAD(&cache->lru_list); - mtree_destroy(&cache->entries); - cache->size = 0; + ASSERT(cache->size == 0); + ASSERT(mtree_empty(&cache->entries)); } diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 189be5be0a8d..368248be42a2 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -17,6 +17,18 @@ struct btrfs_lru_cache_entry { struct list_head lru_list; u64 key; + /* + * The maple tree uses unsigned long type for the keys, which is 32 bits + * on 32 bits systems, and 64 bits on 64 bits systems. So if we want to + * use something like inode numbers as keys, which are always a u64, we + * have to deal with this in a special way - we store the key in the + * entry itself, as a u64, and the values inserted into the maple tree + * are linked lists of entries - so in case we are on a 64 bits system, + * that list always has a single entry, while on 32 bits systems it + * may have more than one, with each entry having the same value for + * their lower 32 bits of the u64 key. + */ + struct list_head list; }; struct btrfs_lru_cache { From patchwork Thu Jan 19 19:39:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57F62C004D4 for ; Thu, 19 Jan 2023 19:40:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230223AbjASTkL (ORCPT ); Thu, 19 Jan 2023 14:40:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230496AbjASTjr (ORCPT ); Thu, 19 Jan 2023 14:39:47 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA3949B139 for ; Thu, 19 Jan 2023 11:39:46 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6490061D1B for ; Thu, 19 Jan 2023 19:39:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5EF6BC433F0 for ; Thu, 19 Jan 2023 19:39:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157185; bh=k3hDlSBp4vxCJoByeGhmpZ01i26TFdzLmWBekXsIG/Q=; h=From:To:Subject:Date:In-Reply-To:References:From; b=PFhk9EVAo61Es6CvObaKEowH11sGyKgJ8UwxpJMvTzOze80OYs8cakdi7hNr93HQ7 1SahVwn4/lhq9bmJRb7vDQO0YtpWZ17XaJHq7QMetNBtOCuKc/bykq+7x8CRRRqyj3 mGKyza+QY+MC1UvQUk1Q+bN4O+drMcveisgDp6M+Eiwbjtzf+se4+icmEYAVw4+tnx Pg5IhqpAtgYItku6plnkgD/SyR2vRFO4oYU3LTu/JFbfq8bO9JhZF419cRuEp09tcM HImDwPDB84BSHNideF1dC5tYdAuZrQ9B0RRRvUwUDGjRlwL+LbHAO8tjh6q6m3VMgA 0veY23Qj52mjA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 13/18] btrfs: send: cache information about created directories Date: Thu, 19 Jan 2023 19:39:25 +0000 Message-Id: <546be4877df604b6e64b18357ea694a491126621.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana During an incremental send, when processing the a reference for an inode we need to check if the directory where the new reference is located was already created before creating the new reference. This check, which is done by the helper did_create_dir(), can be expensive if the directory has many entries, since it consists in seaching the send root's b+tree and visiting every single dir index key until we either find one which points to an inode with a number smaller than the current inode's number or until we visited all index keys. So it doesn't scale well for very large directories. So improve on this by caching created directories using a lru cache, and limiting its size to 64 entries, which results in using at most 4096 bytes of memory. The caching is optinal, if we fail to allocate memory, we just proceed as before and use the existing slower path. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 41 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index b31af939bea8..bc232eb60e68 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -117,6 +117,14 @@ struct backref_cache_entry { /* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ static_assert(offsetof(struct backref_cache_entry, entry) == 0); +/* + * Max number of entries in the cache that stores directories that were already + * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 40 bytes, but + * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). + */ +#define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -288,6 +296,8 @@ struct send_ctx { struct btrfs_lru_cache backref_cache; u64 backref_cache_last_reloc_trans; + + struct btrfs_lru_cache dir_created_cache; }; struct pending_dir_move { @@ -2936,6 +2946,22 @@ static int send_create_inode(struct send_ctx *sctx, u64 ino) return ret; } +static void cache_dir_created(struct send_ctx *sctx, u64 dir) +{ + struct btrfs_lru_cache_entry *entry; + int ret; + + /* Caching is optional, ignore any failures. */ + entry = kmalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return; + + entry->key = dir; + ret = btrfs_lru_cache_store(&sctx->dir_created_cache, entry, GFP_KERNEL); + if (ret < 0) + kfree(entry); +} + /* * We need some special handling for inodes that get processed before the parent * directory got created. See process_recorded_refs for details. @@ -2951,6 +2977,9 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) struct btrfs_key di_key; struct btrfs_dir_item *di; + if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir)) + return 1; + path = alloc_path_for_send(); if (!path) return -ENOMEM; @@ -2974,6 +3003,7 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) if (di_key.type != BTRFS_ROOT_ITEM_KEY && di_key.objectid < sctx->send_progress) { ret = 1; + cache_dir_created(sctx, dir); break; } } @@ -3003,7 +3033,12 @@ static int send_create_inode_if_needed(struct send_ctx *sctx) return 0; } - return send_create_inode(sctx, sctx->cur_ino); + ret = send_create_inode(sctx, sctx->cur_ino); + + if (ret == 0 && S_ISDIR(sctx->cur_inode_mode)) + cache_dir_created(sctx, sctx->cur_ino); + + return ret; } struct recorded_ref { @@ -4401,6 +4436,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) ret = send_create_inode(sctx, cur->dir); if (ret < 0) goto out; + cache_dir_created(sctx, cur->dir); } } @@ -8109,6 +8145,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->name_cache_list); btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); + btrfs_lru_cache_init(&sctx->dir_created_cache, + SEND_MAX_DIR_CREATED_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8373,6 +8411,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) close_current_inode(sctx); btrfs_lru_cache_clear(&sctx->backref_cache); + btrfs_lru_cache_clear(&sctx->dir_created_cache); kfree(sctx); } From patchwork Thu Jan 19 19:39:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109182 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23445C677F1 for ; Fri, 20 Jan 2023 05:37:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230107AbjATFhc (ORCPT ); Fri, 20 Jan 2023 00:37:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230373AbjATFgu (ORCPT ); Fri, 20 Jan 2023 00:36:50 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E25B5CFC4 for ; Thu, 19 Jan 2023 21:33:26 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 049C9B82717 for ; Thu, 19 Jan 2023 19:39:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3F0D5C43392 for ; Thu, 19 Jan 2023 19:39:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157186; bh=MLp8eSIA7vxX3MohZiy2tuWKah+fAx1D7bJC0u/wCMk=; h=From:To:Subject:Date:In-Reply-To:References:From; b=iuKwVH8UNyYNCb4tRV9K52y0znhJbGRqyNAd3Wqyzd/DvHzfgst7V2IyWBBZ0F9NX 5qRuMs+fqyKDM9UJhPMFf6rfrxzytex4ie4e1IaFouDIKYZNdWaoSzdXRTlX7N7VK7 RadqEL+RKbGe5N4MkZC0myLbZwsYfziPc4VplTYAaGKLrxn6oF3ZnERIHD6G7/5h/O N1VHz0VIVEbpTRW3QR2mZFU8Fa+CsNKKaZ5/wdtFu+KJ4YYnKSSEIAAGOJXG+134bo f0ik3zwhs8e3Cw5Gty6Cx2CzRQHasSSRVPTm19kl5xqXPAOIpSN2TrQPMuTYjHIqYy z6IQL7/QyLYeg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 14/18] btrfs: allow a generation number to be associated with lru cache entries Date: Thu, 19 Jan 2023 19:39:26 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana This allows an optional generation number to be associated to each entry of the lru cache. Entries with the same key but different generations, are stored in the linked list to which the maple tree points to. This is meant to be used when there's a small number of different generations, so the impact of searching a linked list is negligible. The goal is to get rid of the open coded name cache in the send code (which uses a radix tree and a a similar linked list of values/entries) and use instead the lru cache module. For that particular use case we have at most 2 generations that are associated to each key (inode number): one generation for the send root and another generation for the parent root. The actual migration of the send name cache is done in the next patch in the series. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 12 +++++++----- fs/btrfs/lru_cache.h | 9 ++++++++- fs/btrfs/send.c | 8 +++++--- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index 96a71bb6a374..23b061b69f65 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -18,12 +18,13 @@ void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size) cache->max_size = max_size; } -static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key) +static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key, + u64 gen) { struct btrfs_lru_cache_entry *entry; list_for_each_entry(entry, head, list) - if (entry->key == key) + if (entry->key == key && entry->gen == gen) return entry; return NULL; @@ -34,11 +35,12 @@ static struct btrfs_lru_cache_entry *match_entry(struct list_head *head, u64 key * * @cache: The cache. * @key: The key of the entry we are looking for. + * @gen: Generation associated to the key. * * Returns the entry associated with the key or NULL if none found. */ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, - u64 key) + u64 key, u64 gen) { struct list_head *head; struct btrfs_lru_cache_entry *entry; @@ -47,7 +49,7 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac if (!head) return NULL; - entry = match_entry(head, key); + entry = match_entry(head, key, gen); if (entry) list_move_tail(&entry->lru_list, &cache->lru_list); @@ -110,7 +112,7 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, kfree(head); head = mtree_load(&cache->entries, key); ASSERT(head != NULL); - if (match_entry(head, key) != NULL) + if (match_entry(head, key, new_entry->gen) != NULL) return -EEXIST; list_add_tail(&new_entry->list, head); } else if (ret < 0) { diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 368248be42a2..de887d438cfb 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -17,6 +17,13 @@ struct btrfs_lru_cache_entry { struct list_head lru_list; u64 key; + /* + * Optional generation associated to a key. Use 0 if not needed/used. + * Entries with the same key and different generations are stored in a + * linked list, so use this only for cases where there's a small number + * of different generations. + */ + u64 gen; /* * The maple tree uses unsigned long type for the keys, which is 32 bits * on 32 bits systems, and 64 bits on 64 bits systems. So if we want to @@ -47,7 +54,7 @@ static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *ca void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, - u64 key); + u64 key, u64 gen); int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index bc232eb60e68..3966f8ce7e49 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -120,7 +120,7 @@ static_assert(offsetof(struct backref_cache_entry, entry) == 0); /* * Max number of entries in the cache that stores directories that were already * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses - * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 40 bytes, but + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 48 bytes, but * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). */ #define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 @@ -1422,7 +1422,7 @@ static bool lookup_backref_cache(u64 leaf_bytenr, void *ctx, return false; } - raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key); + raw_entry = btrfs_lru_cache_lookup(&sctx->backref_cache, key, 0); if (!raw_entry) return false; @@ -1455,6 +1455,7 @@ static void store_backref_cache(u64 leaf_bytenr, const struct ulist *root_ids, return; new_entry->entry.key = leaf_bytenr >> fs_info->sectorsize_bits; + new_entry->entry.gen = 0; new_entry->num_roots = 0; ULIST_ITER_INIT(&uiter); while ((node = ulist_next(root_ids, &uiter)) != NULL) { @@ -2957,6 +2958,7 @@ static void cache_dir_created(struct send_ctx *sctx, u64 dir) return; entry->key = dir; + entry->gen = 0; ret = btrfs_lru_cache_store(&sctx->dir_created_cache, entry, GFP_KERNEL); if (ret < 0) kfree(entry); @@ -2977,7 +2979,7 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir) struct btrfs_key di_key; struct btrfs_dir_item *di; - if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir)) + if (btrfs_lru_cache_lookup(&sctx->dir_created_cache, dir, 0)) return 1; path = alloc_path_for_send(); From patchwork Thu Jan 19 19:39:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109178 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6609C38159 for ; Fri, 20 Jan 2023 05:37:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231270AbjATFhD (ORCPT ); Fri, 20 Jan 2023 00:37:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231146AbjATFgl (ORCPT ); Fri, 20 Jan 2023 00:36:41 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE5FB6A31B for ; Thu, 19 Jan 2023 21:33:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DE47CB82719 for ; Thu, 19 Jan 2023 19:39:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1F3DCC433D2 for ; Thu, 19 Jan 2023 19:39:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157187; bh=gTL2A9flNzkxUhEVw4BmoY/h7oH7FnjGI79Re274aMY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=L0T+fWiINVZt6OZLWTyFJlGoQqoUBsoaLptviQXj4CajRwCXUjYcB8iQkWr7UwOsj 6pmCBkngR8ae0p/72EXM1RA54goPf7okyrJvArV0uuRioKfVVpZou74EAHjk1UgacZ 0j+7blQfpX+eBKNaJ2wLgUfoDQTGd/siM7bB7XkFC5GIM/LGfGbNRXzJTg0Cdi3BM7 f/IvUoknOfvdPC2klA2koXROIJP44YjdjePQ8PoqH3HdAs9TzD0fTDuLVk0nTBkHDw WGTNDjbKjN4btmvhjrBl1qPnNqrMB/l5Lh8y8sH9hPloFqDUs/atyMK83US4Qr90OI mNIMqRphKqnOg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 15/18] btrfs: add an api to delete a specific entry from the lru cache Date: Thu, 19 Jan 2023 19:39:27 +0000 Message-Id: <0bd767c7525178993f30cb2dc71a82c383b9518e.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana In order to replace the open coded name cache in send with the lru cache, we need an API for the lru cache to delete a specific entry for which we did a previous lookup. This adds the API for it, and a next patch in the series will use it. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.c | 16 ++++++++++++---- fs/btrfs/lru_cache.h | 2 ++ 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/lru_cache.c b/fs/btrfs/lru_cache.c index 23b061b69f65..260b4e5213a2 100644 --- a/fs/btrfs/lru_cache.c +++ b/fs/btrfs/lru_cache.c @@ -56,8 +56,16 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac return entry; } -static void delete_entry(struct btrfs_lru_cache *cache, - struct btrfs_lru_cache_entry *entry) +/* + * Remove an entry from the cache. + * + * @cache: The cache to remove from. + * @entry: The entry to remove from the cache. + * + * Note: this also frees the memory used by the entry. + */ +void btrfs_lru_cache_remove(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry) { struct list_head *prev = entry->list.prev; @@ -126,7 +134,7 @@ int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, lru_entry = list_first_entry(&cache->lru_list, struct btrfs_lru_cache_entry, lru_list); - delete_entry(cache, lru_entry); + btrfs_lru_cache_remove(cache, lru_entry); } list_add_tail(&new_entry->lru_list, &cache->lru_list); @@ -148,7 +156,7 @@ void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache) struct btrfs_lru_cache_entry *tmp; list_for_each_entry_safe(entry, tmp, &cache->lru_list, lru_list) - delete_entry(cache, entry); + btrfs_lru_cache_remove(cache, entry); ASSERT(cache->size == 0); ASSERT(mtree_empty(&cache->entries)); diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index de887d438cfb..57e5bdc57e6d 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -58,6 +58,8 @@ struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cac int btrfs_lru_cache_store(struct btrfs_lru_cache *cache, struct btrfs_lru_cache_entry *new_entry, gfp_t gfp); +void btrfs_lru_cache_remove(struct btrfs_lru_cache *cache, + struct btrfs_lru_cache_entry *entry); void btrfs_lru_cache_clear(struct btrfs_lru_cache *cache); #endif /* BTRFS_LRU_CACHE_H */ From patchwork Thu Jan 19 19:39:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13108724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2048FC46467 for ; Thu, 19 Jan 2023 19:40:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230491AbjASTkP (ORCPT ); Thu, 19 Jan 2023 14:40:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51132 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230499AbjASTjv (ORCPT ); Thu, 19 Jan 2023 14:39:51 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73BF19373D for ; Thu, 19 Jan 2023 11:39:49 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0AE5261D23 for ; Thu, 19 Jan 2023 19:39:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 01B62C433F0 for ; Thu, 19 Jan 2023 19:39:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157188; bh=edcfNdFSwDx9LPVNFT3b8hURuIK97kxk61HDwOveid8=; h=From:To:Subject:Date:In-Reply-To:References:From; b=Qw8ZAhTgtfO1aRENKDKc46OfplmvRYMD0+aJl/jfpP3knqm5fzI+MaNdbNKwNIpiU H13jc1JejXGeIzKPvIEpkfgzZBQSp9nywxUrQuqteDMyfmaD1BVbe27/iCA23OFpP6 nlXqHVUpo0tJHMPjB5ahM3PnJL45oudZQw9ChenC8sbgbu/BOebvV2FJEGa34zz4tL hPKa0STp6zcJtMd4Zsycs9D300jriBiau9jkR8y9+uLilOB98dA4CmOIEm+M9iUjE2 XQA5TZGSiNqtZc/4sujsB4CNHWLLtpb2oX7lxNK/I6m/nsBnUcvv4C9cHMUZew7wB8 3/ZtOUJs4apsg== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 16/18] btrfs: send: use the lru cache to implement the name cache Date: Thu, 19 Jan 2023 19:39:28 +0000 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana The name cache in send is basically a lru cache implemented with a radix tree and linked lists, very similar to the lru cache module which is used for the send backref cache and the cache of previously created directories during a send operation. So remove all the custom caching code for the name cache and make it use the lru cache instead. One particular detail to note is that the current cache behaves a bit differently when it comes to eviction of entries. Namely when after inserting a new name in the cache, if the cache now has 256 entries, we evict the last 128 LRU entries. The lru_cache.{c,h} module behaves a bit differently in that once we reach the cache limit, we evict a single LRU entry. In practice this doesn't make much difference, but it's actually better to evict just one entry instead of half of the entries, as there's always a chance we will need a name stored in one of that last 128 removed entries. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 164 +++++++----------------------------------------- 1 file changed, 24 insertions(+), 140 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 3966f8ce7e49..7d327d977fa0 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -81,8 +81,7 @@ struct clone_root { bool found_ref; }; -#define SEND_CTX_MAX_NAME_CACHE_SIZE 128 -#define SEND_CTX_NAME_CACHE_CLEAN_SIZE (SEND_CTX_MAX_NAME_CACHE_SIZE * 2) +#define SEND_MAX_NAME_CACHE_SIZE 256 /* * Limit the root_ids array of struct backref_cache_entry to 12 elements. @@ -183,9 +182,7 @@ struct send_ctx { struct list_head new_refs; struct list_head deleted_refs; - struct radix_tree_root name_cache; - struct list_head name_cache_list; - int name_cache_size; + struct btrfs_lru_cache name_cache; /* * The inode we are currently processing. It's not NULL only when we @@ -331,18 +328,11 @@ struct orphan_dir_info { }; struct name_cache_entry { - struct list_head list; /* - * radix_tree has only 32bit entries but we need to handle 64bit inums. - * We use the lower 32bit of the 64bit inum to store it in the tree. If - * more then one inum would fall into the same entry, we use radix_list - * to store the additional entries. radix_list is also used to store - * entries where two entries have the same inum but different - * generations. + * The key in the entry is an inode number, and the generation matches + * the inode's generation. */ - struct list_head radix_list; - u64 ino; - u64 gen; + struct btrfs_lru_cache_entry entry; u64 parent_ino; u64 parent_gen; int ret; @@ -351,6 +341,9 @@ struct name_cache_entry { char name[]; }; +/* See the comment at lru_cache.h about struct btrfs_lru_cache_entry. */ +static_assert(offsetof(struct name_cache_entry, entry) == 0); + #define ADVANCE 1 #define ADVANCE_ONLY_NEXT -1 @@ -2261,113 +2254,16 @@ static int did_overwrite_first_ref(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } -/* - * Insert a name cache entry. On 32bit kernels the radix tree index is 32bit, - * so we need to do some special handling in case we have clashes. This function - * takes care of this with the help of name_cache_entry::radix_list. - * In case of error, nce is kfreed. - */ -static int name_cache_insert(struct send_ctx *sctx, - struct name_cache_entry *nce) -{ - int ret = 0; - struct list_head *nce_head; - - nce_head = radix_tree_lookup(&sctx->name_cache, - (unsigned long)nce->ino); - if (!nce_head) { - nce_head = kmalloc(sizeof(*nce_head), GFP_KERNEL); - if (!nce_head) { - kfree(nce); - return -ENOMEM; - } - INIT_LIST_HEAD(nce_head); - - ret = radix_tree_insert(&sctx->name_cache, nce->ino, nce_head); - if (ret < 0) { - kfree(nce_head); - kfree(nce); - return ret; - } - } - list_add_tail(&nce->radix_list, nce_head); - list_add_tail(&nce->list, &sctx->name_cache_list); - sctx->name_cache_size++; - - return ret; -} - -static void name_cache_delete(struct send_ctx *sctx, - struct name_cache_entry *nce) -{ - struct list_head *nce_head; - - nce_head = radix_tree_lookup(&sctx->name_cache, - (unsigned long)nce->ino); - if (!nce_head) { - btrfs_err(sctx->send_root->fs_info, - "name_cache_delete lookup failed ino %llu cache size %d, leaking memory", - nce->ino, sctx->name_cache_size); - } - - list_del(&nce->radix_list); - list_del(&nce->list); - sctx->name_cache_size--; - - /* - * We may not get to the final release of nce_head if the lookup fails - */ - if (nce_head && list_empty(nce_head)) { - radix_tree_delete(&sctx->name_cache, (unsigned long)nce->ino); - kfree(nce_head); - } -} - -static struct name_cache_entry *name_cache_search(struct send_ctx *sctx, - u64 ino, u64 gen) +static inline struct name_cache_entry *name_cache_search(struct send_ctx *sctx, + u64 ino, u64 gen) { - struct list_head *nce_head; - struct name_cache_entry *cur; + struct btrfs_lru_cache_entry *entry; - nce_head = radix_tree_lookup(&sctx->name_cache, (unsigned long)ino); - if (!nce_head) + entry = btrfs_lru_cache_lookup(&sctx->name_cache, ino, gen); + if (!entry) return NULL; - list_for_each_entry(cur, nce_head, radix_list) { - if (cur->ino == ino && cur->gen == gen) - return cur; - } - return NULL; -} - -/* - * Remove some entries from the beginning of name_cache_list. - */ -static void name_cache_clean_unused(struct send_ctx *sctx) -{ - struct name_cache_entry *nce; - - if (sctx->name_cache_size < SEND_CTX_NAME_CACHE_CLEAN_SIZE) - return; - - while (sctx->name_cache_size > SEND_CTX_MAX_NAME_CACHE_SIZE) { - nce = list_entry(sctx->name_cache_list.next, - struct name_cache_entry, list); - name_cache_delete(sctx, nce); - kfree(nce); - } -} - -static void name_cache_free(struct send_ctx *sctx) -{ - struct name_cache_entry *nce; - - while (!list_empty(&sctx->name_cache_list)) { - nce = list_entry(sctx->name_cache_list.next, - struct name_cache_entry, list); - name_cache_delete(sctx, nce); - kfree(nce); - } + return container_of(entry, struct name_cache_entry, entry); } /* @@ -2386,7 +2282,7 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, { int ret; int nce_ret; - struct name_cache_entry *nce = NULL; + struct name_cache_entry *nce; /* * First check if we already did a call to this function with the same @@ -2396,17 +2292,9 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, nce = name_cache_search(sctx, ino, gen); if (nce) { if (ino < sctx->send_progress && nce->need_later_update) { - name_cache_delete(sctx, nce); - kfree(nce); + btrfs_lru_cache_remove(&sctx->name_cache, &nce->entry); nce = NULL; } else { - /* - * Removes the entry from the list and adds it back to - * the end. This marks the entry as recently used so - * that name_cache_clean_unused does not remove it. - */ - list_move_tail(&nce->list, &sctx->name_cache_list); - *parent_ino = nce->parent_ino; *parent_gen = nce->parent_gen; ret = fs_path_add(dest, nce->name, nce->name_len); @@ -2473,8 +2361,8 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, goto out; } - nce->ino = ino; - nce->gen = gen; + nce->entry.key = ino; + nce->entry.gen = gen; nce->parent_ino = *parent_ino; nce->parent_gen = *parent_gen; nce->name_len = fs_path_len(dest); @@ -2486,10 +2374,9 @@ static int __get_cur_name_and_parent(struct send_ctx *sctx, else nce->need_later_update = 1; - nce_ret = name_cache_insert(sctx, nce); + nce_ret = btrfs_lru_cache_store(&sctx->name_cache, &nce->entry, GFP_KERNEL); if (nce_ret < 0) ret = nce_ret; - name_cache_clean_unused(sctx); out: return ret; @@ -4356,10 +4243,9 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) * and get instead the orphan name. */ nce = name_cache_search(sctx, ow_inode, ow_gen); - if (nce) { - name_cache_delete(sctx, nce); - kfree(nce); - } + if (nce) + btrfs_lru_cache_remove(&sctx->name_cache, + &nce->entry); /* * ow_inode might currently be an ancestor of @@ -8143,9 +8029,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) INIT_LIST_HEAD(&sctx->new_refs); INIT_LIST_HEAD(&sctx->deleted_refs); - INIT_RADIX_TREE(&sctx->name_cache, GFP_KERNEL); - INIT_LIST_HEAD(&sctx->name_cache_list); + btrfs_lru_cache_init(&sctx->name_cache, SEND_MAX_NAME_CACHE_SIZE); btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); btrfs_lru_cache_init(&sctx->dir_created_cache, SEND_MAX_DIR_CREATED_CACHE_SIZE); @@ -8408,10 +8293,9 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) kvfree(sctx->send_buf); kvfree(sctx->verity_descriptor); - name_cache_free(sctx); - close_current_inode(sctx); + btrfs_lru_cache_clear(&sctx->name_cache); btrfs_lru_cache_clear(&sctx->backref_cache); btrfs_lru_cache_clear(&sctx->dir_created_cache); From patchwork Thu Jan 19 19:39:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109180 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E852AC46467 for ; Fri, 20 Jan 2023 05:37:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230145AbjATFhX (ORCPT ); Fri, 20 Jan 2023 00:37:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230113AbjATFgr (ORCPT ); Fri, 20 Jan 2023 00:36:47 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C615460BA for ; Thu, 19 Jan 2023 21:33:24 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A1726B8271A for ; Thu, 19 Jan 2023 19:39:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D5E06C433F1 for ; Thu, 19 Jan 2023 19:39:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157189; bh=xKwEwzofOtFC7cTBu+gDDQtM4LGWw6uR8b+ckvS+/kQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=jA/KfwWT5+LHyFv4SVdQaUY66CrIdEf6frB+34/VRsb0f47PF99NYejjEvobWnzHJ AfOWVsTIWe6b40Lwk6L8OLYIt3JYYdBq4pJOCNAdN3JtCtGgE1NCR9d51oLdjVZy4Y xBGqtKwnFUPqcCOnwuw9yFG1tlMo09p8m3P9YgYW2zns5RzizQ6/UOF29v2LN1UD8z fN/1gKApdxgNpf0VYMUll1dg28HKtxfrpY9F55a23G9d65OOYzhpIKEJ2VvWU9XAt5 ANmNrW8nq++rVZffNLkjJkvIt3HL91brxpiRLNr34H+rqOICzsPkAicD6aKJWKLD/e YAidsCuWJMXqA== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 17/18] btrfs: send: update size of roots array for backref cache entries Date: Thu, 19 Jan 2023 19:39:29 +0000 Message-Id: <5c3179983276d6fe5096023e5b30e7b6564c53a6.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Currently we limit the size of the roots array, for backref cache entries, to 12 elements. This is because that number is enough for most cases and to make the backref cache entry size to be exactly 128 bytes, so that memory is allocated from the kmalloc-128 slab and no space is wasted. However recent changes in the series refactored the backref cache to be more generic and allow it to be reused for other purposes, which resulted in increasing the size of the embedded structure btrfs_lru_cache_entry in order to allow for supporting inode numbers as keys on 32 bits system and allow multiple generations per key. This resulted in increasing the size of struct backref_cache_entry from 128 bytes to 152 bytes. Since the cache entries are allocated with kmalloc(), it means we end up using the slab kmalloc-192, so we end up wasting 40 bytes of memory. So bump the size of the roots array from 12 elements to 17 elements, so we end up using 192 bytes for each backref cache entry. This patch is part of a larger patchset and the changelog of the last patch in the series contains a sample performance test and results. The patches that comprise the patchset are the following: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible Signed-off-by: Filipe Manana --- fs/btrfs/send.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 7d327d977fa0..ae2d462f647e 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -84,19 +84,20 @@ struct clone_root { #define SEND_MAX_NAME_CACHE_SIZE 256 /* - * Limit the root_ids array of struct backref_cache_entry to 12 elements. - * This makes the size of a cache entry to be exactly 128 bytes on x86_64. + * Limit the root_ids array of struct backref_cache_entry to 17 elements. + * This makes the size of a cache entry to be exactly 192 bytes on x86_64, which + * can be satisfied from the kmalloc-192 slab, without wasting any space. * The most common case is to have a single root for cloning, which corresponds - * to the send root. Having the user specify more than 11 clone roots is not + * to the send root. Having the user specify more than 16 clone roots is not * common, and in such rare cases we simply don't use caching if the number of - * cloning roots that lead down to a leaf is more than 12. + * cloning roots that lead down to a leaf is more than 17. */ -#define SEND_MAX_BACKREF_CACHE_ROOTS 12 +#define SEND_MAX_BACKREF_CACHE_ROOTS 17 /* * Max number of entries in the cache. - * With SEND_MAX_BACKREF_CACHE_ROOTS as 12, the size in bytes, excluding - * maple tree's internal nodes, is 16K. + * With SEND_MAX_BACKREF_CACHE_ROOTS as 17, the size in bytes, excluding + * maple tree's internal nodes, is 24K. */ #define SEND_MAX_BACKREF_CACHE_SIZE 128 From patchwork Thu Jan 19 19:39:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 13109176 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5411CC7112F for ; Fri, 20 Jan 2023 05:36:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231431AbjATFgz (ORCPT ); Fri, 20 Jan 2023 00:36:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40942 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230496AbjATFgh (ORCPT ); Fri, 20 Jan 2023 00:36:37 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE6CB6A31F for ; Thu, 19 Jan 2023 21:33:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7494AB8271D for ; Thu, 19 Jan 2023 19:39:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6A40C433D2 for ; Thu, 19 Jan 2023 19:39:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674157190; bh=pBgcnaEKimQGquxSVWRD6yR4nP+3nu0sU5mfw3OsVMQ=; h=From:To:Subject:Date:In-Reply-To:References:From; b=u0c3ut2JHxXEZZkYW4oGZievyiHFVxoV2OpOJxlk+hZJi6lhv48FLfMl+IP4Nmi24 Jh5A9leMkRaDYikHBhJyiLmVUU9p/UZe/Bt0oHoKU5oUtcYHR90AJgf1ZOCOOpEzUF qED+1Npun1fXCmDcvXv9TgRUB2z+SyFC44DfQ24ejupVZhefnf7/QBREhhCD2GF631 BZoq2ifamh+XNn8i3ilje+WrmNMz+qIbKMly1uSViJ6sA9n/hRrXdM7vWku3akM/U+ EjKFTn5Ja2suaaLO+b3E9g3WkWILuYnynm08fT8DI6mB4sa1EEj9yIUCI0pn7ayc4g NoNALOg201q5Q== From: fdmanana@kernel.org To: linux-btrfs@vger.kernel.org Subject: [PATCH v2 18/18] btrfs: send: cache utimes operations for directories if possible Date: Thu, 19 Jan 2023 19:39:30 +0000 Message-Id: <4483b872a3d205aa9db83be74ae7f7730b32380d.1674157020.git.fdmanana@suse.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Filipe Manana Whenever we add or remove an entry to a directory, we issue an utimes command for the directory. If we add 1000 entries to a directory (create 1000 files under it or move 1000 files to it), then we issue the same utimes command 1000 times, which increases the send stream size, results in more pipe IO, one search in the send b+tree, allocating one path for the search, etc, as well as making the receiver do a system call for each duplicated utimes command. We also issue an utimes command when we create a new directory, but later we might add entries to it corresponding to inodes with an higher inode number, so it's pointless to issue the utimes command before we create the last inode under the directory. So use a lru cache to track directories for which we must send a utimes command. When we need to remove an entry from the cache, we issue the utimes command for the respective directory. When finishing the send operation, we go over each cache element and issue the respective utimes command. Finally the caching is entirely optional, just a performance optimization, meaning that if we fail to cache (due to memory allocation failure), we issue the utimes command right away, that is, we fallback to the previous, unoptimized, behaviour. This patch belongs to a patchset comprised of the following patches: btrfs: send: directly return from did_overwrite_ref() and simplify it btrfs: send: avoid unnecessary generation search at did_overwrite_ref() btrfs: send: directly return from will_overwrite_ref() and simplify it btrfs: send: avoid extra b+tree searches when checking reference overrides btrfs: send: remove send_progress argument from can_rmdir() btrfs: send: avoid duplicated orphan dir allocation and initialization btrfs: send: avoid unnecessary orphan dir rbtree search at can_rmdir() btrfs: send: reduce searches on parent root when checking if dir can be removed btrfs: send: iterate waiting dir move rbtree only once when processing refs btrfs: send: initialize all the red black trees earlier btrfs: send: genericize the backref cache to allow it to be reused btrfs: adapt lru cache to allow for 64 bits keys on 32 bits systems btrfs: send: cache information about created directories btrfs: allow a generation number to be associated with lru cache entries btrfs: add an api to delete a specific entry from the lru cache btrfs: send: use the lru cache to implement the name cache btrfs: send: update size of roots array for backref cache entries btrfs: send: cache utimes operations for directories if possible The following test was run before and after applying the whole patchset, and on a non-debug kernel (Debian's default kernel config): #!/bin/bash MNT=/mnt/sdi DEV=/dev/sdi mkfs.btrfs -f $DEV > /dev/null mount $DEV $MNT mkdir $MNT/A for ((i = 1; i <= 20000; i++)); do echo -n > $MNT/A/file_$i done btrfs subvolume snapshot -r $MNT $MNT/snap1 mkdir $MNT/B for ((i = 20000; i <= 40000; i++)); do echo -n > $MNT/B/file_$i done mv $MNT/A/file_* $MNT/B/ btrfs subvolume snapshot -r $MNT $MNT/snap2 start=$(date +%s%N) btrfs send -p $MNT/snap1 $MNT/snap2 > /dev/null end=$(date +%s%N) dur=$(( (end - start) / 1000000 )) echo "Incremental send took $dur milliseconds" umount $MNT Before the whole patchset: 18408 milliseconds After the whole patchset: 1942 milliseconds (9.5x speedup) Using 60000 files instead of 40000: Before the whole patchset: 39764 milliseconds After the whole patchset: 3076 milliseconds (12.9x speedup) Using 20000 files instead of 40000: Before the whole patchset: 5072 milliseconds After the whole patchset: 916 milliseconds (5.5x speedup) Signed-off-by: Filipe Manana --- fs/btrfs/lru_cache.h | 15 ++++++++ fs/btrfs/send.c | 81 +++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 91 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/lru_cache.h b/fs/btrfs/lru_cache.h index 57e5bdc57e6d..9f3101cc89d5 100644 --- a/fs/btrfs/lru_cache.h +++ b/fs/btrfs/lru_cache.h @@ -47,11 +47,26 @@ struct btrfs_lru_cache { unsigned int max_size; }; +#define btrfs_lru_cache_for_each_entry_safe(cache, entry, tmp) \ + list_for_each_entry_safe_reverse((entry), (tmp), &(cache)->lru_list, lru_list) + static inline unsigned int btrfs_lru_cache_size(const struct btrfs_lru_cache *cache) { return cache->size; } +static inline bool btrfs_lru_cache_is_full(const struct btrfs_lru_cache *cache) +{ + return cache->size >= cache->max_size; +} + +static inline struct btrfs_lru_cache_entry *btrfs_lru_cache_lru_entry( + struct btrfs_lru_cache *cache) +{ + return list_first_entry_or_null(&cache->lru_list, + struct btrfs_lru_cache_entry, lru_list); +} + void btrfs_lru_cache_init(struct btrfs_lru_cache *cache, unsigned int max_size); struct btrfs_lru_cache_entry *btrfs_lru_cache_lookup(struct btrfs_lru_cache *cache, u64 key, u64 gen); diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index ae2d462f647e..a271b39c1445 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -125,6 +125,14 @@ static_assert(offsetof(struct backref_cache_entry, entry) == 0); */ #define SEND_MAX_DIR_CREATED_CACHE_SIZE 64 +/* + * Max number of entries in the cache that stores directories that were already + * created. The cache uses raw struct btrfs_lru_cache_entry entries, so it uses + * at most 4096 bytes - sizeof(struct btrfs_lru_cache_entry) is 48 bytes, but + * the kmalloc-64 slab is used, so we get 4096 bytes (64 bytes * 64). + */ +#define SEND_MAX_DIR_UTIMES_CACHE_SIZE 64 + struct send_ctx { struct file *send_filp; loff_t send_off; @@ -296,6 +304,7 @@ struct send_ctx { u64 backref_cache_last_reloc_trans; struct btrfs_lru_cache dir_created_cache; + struct btrfs_lru_cache dir_utimes_cache; }; struct pending_dir_move { @@ -2747,6 +2756,46 @@ static int send_utimes(struct send_ctx *sctx, u64 ino, u64 gen) return ret; } +static int cache_dir_utimes(struct send_ctx *sctx, u64 dir, u64 gen) +{ + struct btrfs_lru_cache_entry *entry; + int ret; + + entry = btrfs_lru_cache_lookup(&sctx->dir_utimes_cache, dir, gen); + if (entry != NULL) + return 0; + + if (btrfs_lru_cache_is_full(&sctx->dir_utimes_cache)) { + struct btrfs_lru_cache_entry *lru; + + lru = btrfs_lru_cache_lru_entry(&sctx->dir_utimes_cache); + ASSERT(lru != NULL); + + ret = send_utimes(sctx, lru->key, lru->gen); + if (ret) + return ret; + + btrfs_lru_cache_remove(&sctx->dir_utimes_cache, lru); + } + + /* Caching is optional, don't fail if we can't allocate memory. */ + entry = kmalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return send_utimes(sctx, dir, gen); + + entry->key = dir; + entry->gen = gen; + + ret = btrfs_lru_cache_store(&sctx->dir_utimes_cache, entry, GFP_KERNEL); + ASSERT(ret != -EEXIST); + if (ret) { + kfree(entry); + return send_utimes(sctx, dir, gen); + } + + return 0; +} + /* * Sends a BTRFS_SEND_C_MKXXX or SYMLINK command to user space. We don't have * a valid path yet because we did not process the refs yet. So, the inode @@ -3540,7 +3589,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) } finish: - ret = send_utimes(sctx, pm->ino, pm->gen); + ret = cache_dir_utimes(sctx, pm->ino, pm->gen); if (ret < 0) goto out; @@ -3560,7 +3609,7 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm) if (ret < 0) goto out; - ret = send_utimes(sctx, cur->dir, cur->dir_gen); + ret = cache_dir_utimes(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; } @@ -4507,8 +4556,7 @@ static int process_recorded_refs(struct send_ctx *sctx, int *pending_move) if (ret == inode_state_did_create || ret == inode_state_no_change) { - /* TODO delayed utimes */ - ret = send_utimes(sctx, cur->dir, cur->dir_gen); + ret = cache_dir_utimes(sctx, cur->dir, cur->dir_gen); if (ret < 0) goto out; } else if (ret == inode_state_did_delete && @@ -6690,7 +6738,18 @@ static int finish_inode_if_needed(struct send_ctx *sctx, int at_end) * it's moved/renamed, therefore we don't need to do it here. */ sctx->send_progress = sctx->cur_ino + 1; - ret = send_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + + /* + * If the current inode is a non-empty directory, delay issuing + * the utimes command for it, as it's very likely we have inodes + * with an higher number inside it. We want to issue the utimes + * command only after adding all dentries to it. + */ + if (S_ISDIR(sctx->cur_inode_mode) && sctx->cur_inode_size > 0) + ret = cache_dir_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + else + ret = send_utimes(sctx, sctx->cur_ino, sctx->cur_inode_gen); + if (ret < 0) goto out; } @@ -7980,6 +8039,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) int clone_sources_to_rollback = 0; size_t alloc_size; int sort_clone_roots = 0; + struct btrfs_lru_cache_entry *entry; + struct btrfs_lru_cache_entry *tmp; if (!capable(CAP_SYS_ADMIN)) return -EPERM; @@ -8035,6 +8096,8 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) btrfs_lru_cache_init(&sctx->backref_cache, SEND_MAX_BACKREF_CACHE_SIZE); btrfs_lru_cache_init(&sctx->dir_created_cache, SEND_MAX_DIR_CREATED_CACHE_SIZE); + btrfs_lru_cache_init(&sctx->dir_utimes_cache, + SEND_MAX_DIR_UTIMES_CACHE_SIZE); sctx->pending_dir_moves = RB_ROOT; sctx->waiting_dir_moves = RB_ROOT; @@ -8215,6 +8278,13 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) if (ret < 0) goto out; + btrfs_lru_cache_for_each_entry_safe(&sctx->dir_utimes_cache, entry, tmp) { + ret = send_utimes(sctx, entry->key, entry->gen); + if (ret < 0) + goto out; + btrfs_lru_cache_remove(&sctx->dir_utimes_cache, entry); + } + if (!(sctx->flags & BTRFS_SEND_FLAG_OMIT_END_CMD)) { ret = begin_cmd(sctx, BTRFS_SEND_C_END); if (ret < 0) @@ -8299,6 +8369,7 @@ long btrfs_ioctl_send(struct inode *inode, struct btrfs_ioctl_send_args *arg) btrfs_lru_cache_clear(&sctx->name_cache); btrfs_lru_cache_clear(&sctx->backref_cache); btrfs_lru_cache_clear(&sctx->dir_created_cache); + btrfs_lru_cache_clear(&sctx->dir_utimes_cache); kfree(sctx); }