From patchwork Mon Jan 4 03:09:10 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 11996107
Message-Id: <0e500c86f397987c4d03beac52b1e91f683e4d11.1609729758.git.gitgitgadget@gmail.com>
Date: Mon, 04 Jan 2021 03:09:10 +0000
Subject: [PATCH v2 1/9] tree-walk: report recursion counts
To: git@vger.kernel.org
Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee, =?utf-8?b?UmVuw6k=?= Scharfe, Derrick Stolee, Derrick Stolee
From: Derrick Stolee

From: Derrick Stolee

The traverse_trees() method recursively walks through trees, but also
prunes the tree-walk based on a callback. Some callers, such as
unpack_trees(), are quite complicated and can have wildly different
performance between two different commands.

Create static counters for the number of traverse_trees() calls and the
maximum recursion depth, then report the results at the end of the
process. These counts are cumulative across multiple "root" instances of
traverse_trees(), but they provide reproducible values for demonstrating
improvements to the pruning algorithm when possible.

This change is modeled after the similar statistics reporting in 42e50e78
(revision.c: add trace2 stats around Bloom filter usage, 2020-04-06).

Signed-off-by: Derrick Stolee
---
 tree-walk.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/tree-walk.c b/tree-walk.c
index 0160294712b..2d6226d5f18 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -4,6 +4,7 @@
 #include "object-store.h"
 #include "tree.h"
 #include "pathspec.h"
+#include "json-writer.h"
 
 static const char *get_mode(const char *str, unsigned int *modep)
 {
@@ -167,6 +168,25 @@ int tree_entry_gently(struct tree_desc *desc, struct name_entry *entry)
 	return 1;
 }
 
+static int traverse_trees_atexit_registered;
+static int traverse_trees_count;
+static int traverse_trees_cur_depth;
+static int traverse_trees_max_depth;
+
+static void trace2_traverse_trees_statistics_atexit(void)
+{
+	struct json_writer jw = JSON_WRITER_INIT;
+
+	jw_object_begin(&jw, 0);
+	jw_object_intmax(&jw, "traverse_trees_count", traverse_trees_count);
+	jw_object_intmax(&jw, "traverse_trees_max_depth", traverse_trees_max_depth);
+	jw_end(&jw);
+
+	trace2_data_json("traverse_trees", the_repository, "statistics", &jw);
+
+	jw_release(&jw);
+}
+
 void setup_traverse_info(struct traverse_info *info, const char *base)
 {
 	size_t pathlen = strlen(base);
@@ -180,6 +200,11 @@ void setup_traverse_info(struct traverse_info *info, const char *base)
 	info->namelen = pathlen;
 	if (pathlen)
 		info->prev = &dummy;
+
+	if (trace2_is_enabled() && !traverse_trees_atexit_registered) {
+		atexit(trace2_traverse_trees_statistics_atexit);
+		traverse_trees_atexit_registered = 1;
+	}
 }
 
 char *make_traverse_path(char *path, size_t pathlen,
@@ -416,6 +441,12 @@ int traverse_trees(struct index_state *istate,
 	int interesting = 1;
 	char *traverse_path;
 
+	traverse_trees_count++;
+	traverse_trees_cur_depth++;
+
+	if (traverse_trees_cur_depth > traverse_trees_max_depth)
+		traverse_trees_max_depth = traverse_trees_cur_depth;
+
 	if (n >= ARRAY_SIZE(entry))
 		BUG("traverse_trees() called with too many trees (%d)", n);
 
@@ -515,6 +546,8 @@ int traverse_trees(struct index_state *istate,
 	free(traverse_path);
 	info->traverse_path = NULL;
 	strbuf_release(&base);
+
+	traverse_trees_cur_depth--;
 	return error;
 }
 

From patchwork Mon Jan 4 03:09:11 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Derrick Stolee
X-Patchwork-Id: 11996101
Received: from mail-wr1-x431.google.com (mail-wr1-x431.google.com [IPv6:2a00:1450:4864:20::431]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33EC8C061793 for ; Sun, 3 Jan 2021 19:09:24 -0800 (PST) Received: by mail-wr1-x431.google.com with SMTP id c5so30767339wrp.6 for ; Sun, 03 Jan 2021 19:09:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=TlE4Wa/PmXo18jG1onlTnDHaohvuR3hpK+GpRAAFDa4=; b=gaGm+TqJ4gFFMrgrYupd47wmJfjb/BjsNoYqQQAtJgkcPTti82zBHT7JEYNPCSAhzr QN6ukPxnbfBs7whyKq/ewtiSi7KffuCeEVKX8e4V0pZ/SCcTdUtLt1bqV6uD7BpdNAd4 MhB+RzbMJD0FQwXGmflRlfJUbU735GzUyMVjgoN9RZkRoRe8PHDbAXtq3A0kXMKLLnPt Buo+FxTbCjmJuzjqqE4g+UGqCUub6lQCBBZ+KE/fFhAgrOGDVUJcxhW5/NTGomcOAJxV qiXRDj8J8BNH5Omx5VixlQacfFs7e38cgD88e0su2pnFYqHibdNBRubGamXW4xBgCU/q tbHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=TlE4Wa/PmXo18jG1onlTnDHaohvuR3hpK+GpRAAFDa4=; b=QQYJxh6naGZeRLi6/cKMiblhGkuf8Go3ZDiaoGRZhTJ6NjTIfLUoacro3gbLF2HDaP gs5KzMX+Jb1/CrkfCwtszKnrYr7CvfgRMTpLb0Ts+xfL7vlR9AB+GYasAwyxzB9CmwtU poe8EquGDaJAJTJ/28rUrjXaks7CWAEhKd1GWjbRpQiYgVz+jCPvEg35zaWzfo3NuDVy GAhO8idWFjuTGa5bXejCRlwLjx5r8V2canGEEEo390giYx8ffrvfqjm4zZBmtfzl3iDX znNxLtk2GPwodnocfYeGBE0xplJGy3yNpiJnIUrDObxiQ2zL6LvOI8Iv5fqB1AUYWlQV ZqyA== X-Gm-Message-State: AOAM531WCcFPGkU41BpdyybHdq17dYUmS984mjC6YLdB+WOh/cqqDMVX KLiTztU/7MM6EQiY1njLMU2Qk02CrU0= X-Google-Smtp-Source: ABdhPJyMRScxRCSwflRZqD91oO/mArW3iZaPKMJWBm4Z6wRksLCr1BpzN5uVrdG/inRbrW5kaPkrYQ== X-Received: by 2002:a5d:4491:: with SMTP id j17mr76330819wrq.78.1609729762056; Sun, 03 Jan 2021 19:09:22 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c4sm30240557wmf.19.2021.01.03.19.09.21 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:21 -0800 (PST) Message-Id: <4157b91acf8009ef2136c0856b6b61833d82873e.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:11 +0000 Subject: [PATCH v2 2/9] unpack-trees: add trace2 regions Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The unpack_trees() method is quite complicated and its performance can change dramatically depending on how it is used. We already have some performance tracing regions, but they have not been updated to the trace2 API. Do so now. We already have trace2 regions in unpack_trees.c:clear_ce_flags(), which uses a linear scan through the index without recursing into trees. Signed-off-by: Derrick Stolee --- unpack-trees.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index 323280dd48b..af6e9b9c2fd 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1580,6 +1580,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES); trace_performance_enter(); + trace2_region_enter("unpack_trees", "unpack_trees", the_repository); + if (!core_apply_sparse_checkout || !o->update) o->skip_sparse_checkout = 1; if (!o->skip_sparse_checkout && !o->pl) { @@ -1653,7 +1655,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options } trace_performance_enter(); + trace2_region_enter("unpack_trees", "traverse_trees", the_repository); ret = traverse_trees(o->src_index, len, t, &info); + trace2_region_leave("unpack_trees", "traverse_trees", the_repository); trace_performance_leave("traverse_trees"); if (ret < 0) goto return_failed; @@ -1741,6 +1745,7 @@ int unpack_trees(unsigned len, 
struct tree_desc *t, struct unpack_trees_options done: if (free_pattern_list) clear_pattern_list(&pl); + trace2_region_leave("unpack_trees", "unpack_trees", the_repository); trace_performance_leave("unpack_trees"); return ret; From patchwork Mon Jan 4 03:09:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 723D5C43381 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5B82C21D93 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728022AbhADDKG (ORCPT ); Sun, 3 Jan 2021 22:10:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728009AbhADDKE (ORCPT ); Sun, 3 Jan 2021 22:10:04 -0500 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 467E2C061795 for ; Sun, 3 Jan 2021 19:09:24 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id 3so17991585wmg.4 for ; Sun, 03 Jan 2021 19:09:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=Beof6wirizOKpfsWwTbU2ekekdWsETRnVpYiQCmgFkc=; b=RjH250PQWH+JZWOOt5HoDYfpGjC9dZAEEt5f/F7JtGwCetTCBlikRZdJ4sl+cIrbYZ XWCLGHl1dSJ5Vj+tvyM6ITjq/UR+OPpA02KCABAGOwyZisLpu2jTpIkFlBQ/fJDytmDc xnyCeTeM8U+bNkBZWL2epyYchTasgLx3ZAO7MpmAC3HPleWxzH/2XCjNMfQfB/zzxTtt wC0xhiYjDb7Gy8VZbNdBPCRA1DhUXyO72s4qbemwUeIv4iFUl9U0YclHwEGivCDJc/YA tl0YKE1YmfFq8EfU1I9u0RP1JJBr/dNa4ME5S/H03XVLT0c6D7yW86fFT4zuZ1vtGG1f KPIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Beof6wirizOKpfsWwTbU2ekekdWsETRnVpYiQCmgFkc=; b=dICndZ9uMsGIP8M42kfjsRgQEO9iebjyf300oW8Md4OPPSo0FA6UrFSEpTsINd0/2H j5tCj6p5MMw95LKgVffMroPv8U1Sq0ra6vbDRF81nM812feaB7zkbddXbss6HKYEAW6z ybf6DTBwHm9M2OS9KAq8k++i0zcddnUySdRiAQ+TxEBvdt3SHq5Bvfyf3tFvj8uBU+wP eXnGM3jD0yuaXUhemCxsB5a0srRRLiQ+hMRVe6U5vf1C/E6ScvbOvV/StgTEnL6IIhd2 7ISyIj3RNnE/Os32tVBMSXw3ZruiH9axbzvUSO9tg5WawOehna40OOuTKqfh6bOdsw+0 wj0Q== X-Gm-Message-State: AOAM532QEcZXGS2ioyX3HguHBl9FfZaHtnYXRxLf36OL3Lt0pU6ttMZo g0cxfglsyDrD4iQkXh3jHUy8to6wDSA= X-Google-Smtp-Source: ABdhPJzmcjLYzd9FbTlh7KcpniDZXAltch9w0LHREaIsVC0hycxHqACCbD+oNVBxCDXDa43TiBS19g== X-Received: by 2002:a1c:24c4:: with SMTP id k187mr25503995wmk.14.1609729762828; Sun, 03 Jan 2021 19:09:22 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b7sm79826887wru.33.2021.01.03.19.09.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:22 -0800 (PST) Message-Id: <8959d57abddd620f4b597e4c43c5d2545c666e97.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:12 +0000 Subject: [PATCH v2 3/9] cache-tree: use trace2 in cache_tree_update() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick 
Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee This matches a trace_performance_enter()/trace_performance_leave() pair added by 0d1ed59 (unpack-trees: add performance tracing, 2018-08-18). Signed-off-by: Derrick Stolee --- cache-tree.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/cache-tree.c b/cache-tree.c index a537a806c16..9efb6748662 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -442,7 +442,9 @@ int cache_tree_update(struct index_state *istate, int flags) if (i) return i; trace_performance_enter(); + trace2_region_enter("cache_tree", "update", the_repository); i = update_one(it, cache, entries, "", 0, &skip, flags); + trace2_region_leave("cache_tree", "update", the_repository); trace_performance_leave("cache_tree_update"); if (i < 0) return i; From patchwork Mon Jan 4 03:09:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68E48C433E9 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4538220DD4 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728026AbhADDKG (ORCPT ); Sun, 3 Jan 2021 22:10:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39078 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728010AbhADDKF (ORCPT ); Sun, 3 Jan 2021 22:10:05 -0500 Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A388C061796 for ; Sun, 3 Jan 2021 19:09:25 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id k10so17228951wmi.3 for ; Sun, 03 Jan 2021 19:09:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=F5mr/fFvRdXYRRCHVaEtFFWbeGwbu/OPMWpaJOKG0T0=; b=osDl8CwNuTo3u2SVbVa+6BDdhz8Mgz+ATIaqyahALjt8UTcuOEinKN4KbDZMqi/Kc8 T42okljt8b6w6nF//BpsxIheQU6RajXHGQvhKPlPo8PbbI1eZ07xVAL0456thgUJZfnX ChfM5ln4YmGz2P7MpypUdxBH/NbdSBjEOgyp0SRvUEE4nOSMhUM2U8wJfFDShbLApFB/ IW5fDkVqidZPSAABgpPoSRyuspnpS2JP8L4V9ZT0s1UMMFNTFctQQvuK3tkDt0juP5fz I0kbwji0YwERCO+AjidUwo5sOxDo7FWo4VG3tPaVjqhhbUsrvbgjnhbdi/gOz4QRPl9v rEiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=F5mr/fFvRdXYRRCHVaEtFFWbeGwbu/OPMWpaJOKG0T0=; b=D6ahSR8u+ORw/2FvQXK3VXAoxlqycfythbYA24t3x2g+jTYe5hIoOQJLXTiY1RGqMT kE+qorXBYR7MR9oAJl/3pXudYuXvUVi6+HEQVySjt+1TrlRKdFQJ3phhh3AFqicS0VBy 3xIV++npmgBbjUuwTCQmMy9pgYt2Vk3YK10Hgr3dDlwJ1fI5NEItH19l6Vs7BJU50Vq8 ryYV3X6+VIDBVO0UTUySRQLUP2iS9gnGm+JOAWkhMooCsfUftmfbPsOSJUrVRHcqBaOq 4030eUgFYzI+OrMetE+ClGAJdwhIgeq0AlYtTSWs2W8IrTI0kL2fA6I+AQyuku3phWR/ Ll4Q== X-Gm-Message-State: AOAM5325j6IHswAN4x1tLWF4paNIj9wOiuSzCwBP9/9MxkFKdqHdqeeG zBn/EWFQIaq+OyTWryd6ltS3tJwzVLk= X-Google-Smtp-Source: ABdhPJymsQRs6PFxGTwS3LjkH8NjstVV6PUEV979mHnuOODyFnAjzku0Bk2AEyjdH8Cm11WqKw1z6Q== X-Received: by 2002:a1c:2394:: with SMTP id j142mr25224843wmj.42.1609729763653; Sun, 03 Jan 2021 19:09:23 -0800 (PST) 
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z21sm30511451wmk.20.2021.01.03.19.09.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:23 -0800 (PST) Message-Id: <1d8a797ee2650e8c815281b0c672301c7f24a724.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:13 +0000 Subject: [PATCH v2 4/9] cache-tree: trace regions for I/O Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee As we write or read the cached tree index extension, it can be good to isolate how much of the file I/O time is spent constructing this in-memory tree from the existing index or writing it out again to the new index file. Use trace2 regions to indicate that we are spending time on this operation. 
Signed-off-by: Derrick Stolee --- cache-tree.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/cache-tree.c b/cache-tree.c index 9efb6748662..45fb57b17f3 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -494,7 +494,9 @@ static void write_one(struct strbuf *buffer, struct cache_tree *it, void cache_tree_write(struct strbuf *sb, struct cache_tree *root) { + trace2_region_enter("cache_tree", "write", the_repository); write_one(sb, root, "", 0); + trace2_region_leave("cache_tree", "write", the_repository); } static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) @@ -583,9 +585,16 @@ static struct cache_tree *read_one(const char **buffer, unsigned long *size_p) struct cache_tree *cache_tree_read(const char *buffer, unsigned long size) { + struct cache_tree *result; + if (buffer[0]) return NULL; /* not the whole tree */ - return read_one(&buffer, &size); + + trace2_region_enter("cache_tree", "read", the_repository); + result = read_one(&buffer, &size); + trace2_region_leave("cache_tree", "read", the_repository); + + return result; } static struct cache_tree *cache_tree_find(struct cache_tree *it, const char *path) From patchwork Mon Jan 4 03:09:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996111 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA912C4332E for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8B0A420E65 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728032AbhADDKn (ORCPT ); Sun, 3 Jan 2021 22:10:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39172 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728010AbhADDKm (ORCPT ); Sun, 3 Jan 2021 22:10:42 -0500 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0BE4C061798 for ; Sun, 3 Jan 2021 19:09:25 -0800 (PST) Received: by mail-wr1-x42a.google.com with SMTP id w5so30744155wrm.11 for ; Sun, 03 Jan 2021 19:09:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=1cXbZJPHh6PbIf9iSugyF98nzI72FppH2Gn56ddappw=; b=C4xTA0joJFj15cLJ5D+OsYN30FXloWJvLM+tchf3e9U+gXyGZwQ7z5Fx6RKWM/rp3d hMXN8SnQtJDzvDxHf+RTnyzTSaUQ4Uz6JHQIdDoNKjSykoCXysqVg/wyMn/vIb4d8/6C ox74pxeGrHKjTrciYzzBUDI00VbGks726TKJYYYMeLKIBs4G9w3lWqhlKtWH+l543KeD O1jt7zsUC3PLYo69ONiDmujeRe1uYZHaIZ33GVGLKymhKvYacjxBUO/F8TVQORIpuMwi jFXVijcvVRJdkBg4KaGziGBBIg0j/fTNxgc9gwrBBL/MKxL88lgp0PGEWBba39y3lSE0 89OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=1cXbZJPHh6PbIf9iSugyF98nzI72FppH2Gn56ddappw=; b=qfRs73nix6+LrbszjaQCnW9W96rryZU+IUMIOfD5R31Gg2YiCTHJ58BoQ9pYpJd3My yv5sMoWJDFwK4od+3QZCkFiG0CvVqKJbwfsw69HZbCEQkNaFEQ/DJOwsbomxfam9UWMO R+uaSsYWFUFOWRpvCiPC4GmhM0dqq5LKPmDTUjy8SM49aPFZ81DqZ/DaBImYXKcxY+tt 69EQ3B55HxnB/TVDxXqK4k/eVyC2j0runHTs5J4sAGnZrNVfOdOrrvp7dHV4gJ2fznhW S3IP4pBRaCmIpDftJwQdw8xuSajcilbcvllEB2+oN0Ymki/0igoESkG+tQ2A7zFsEwwY ilbQ== 
X-Gm-Message-State: AOAM532UYnA+vqIxcOuttXikfm/OHUuCw144uKMx5vN3/+pyeiydWUvX CkDoqcX1Ber2SORMLOXEdTTSjJpDm1o= X-Google-Smtp-Source: ABdhPJyD0/mUdKddUh+mCnIUT8/NxLJzaKO+pnRuJ6yKtBkyRTJoFtahrdvzJtp1CWlQpgJaEkpgpQ== X-Received: by 2002:adf:dd90:: with SMTP id x16mr76333508wrl.85.1609729764426; Sun, 03 Jan 2021 19:09:24 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c190sm30716078wme.19.2021.01.03.19.09.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:23 -0800 (PST) Message-Id: <2b2e70bb77c8dafbf2cfedd9e68f834f02deb4a2.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:14 +0000 Subject: [PATCH v2 5/9] cache-tree: trace regions for prime_cache_tree Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee Commands such as "git reset --hard" rebuild the in-memory representation of the cached tree index extension by parsing tree objects starting at a known root tree. The performance of this operation can vary widely depending on the width and depth of the repository's working directory structure. Measure the time in this operation using trace2 regions in prime_cache_tree(). 
Signed-off-by: Derrick Stolee --- cache-tree.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/cache-tree.c b/cache-tree.c index 45fb57b17f3..7da59b2aa07 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -744,10 +744,13 @@ void prime_cache_tree(struct repository *r, struct index_state *istate, struct tree *tree) { + trace2_region_enter("cache-tree", "prime_cache_tree", the_repository); cache_tree_free(&istate->cache_tree); istate->cache_tree = cache_tree(); + prime_cache_tree_rec(r, istate->cache_tree, tree); istate->cache_changed |= CACHE_TREE_CHANGED; + trace2_region_leave("cache-tree", "prime_cache_tree", the_repository); } /* From patchwork Mon Jan 4 03:09:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996113 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A55CC4332B for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 70E3820DD4 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728035AbhADDKn (ORCPT ); Sun, 3 Jan 2021 22:10:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39174 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728027AbhADDKm (ORCPT ); Sun, 3 Jan 2021 22:10:42 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com 
[IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5E5BC06179A for ; Sun, 3 Jan 2021 19:09:26 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id c5so30767407wrp.6 for ; Sun, 03 Jan 2021 19:09:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+bBj/3spC8b6bi5zcCITehdRVRVjOAOFrCLkIBojeb0=; b=qsoGPaOCOyQr2Fky3NNR9WJ2lz9EPjSI670/w+gxzGD1sA6otcfM9xHiB3NU45yLcD SYR0LjtlOJRh8b3ujBjmahNcF7V3Sl3JTduK9ShLYIZSm9t/AW2Ll+ftSwyaATyyPxq+ VFWCJz0IODi/TVYeVUXjFx3o3l2FkcqVFtkNq4GwbUmOWyrJUaJvEgRXxQL3tBPWcePe 3GQrSE5PodI9qZ87svz2dEnkTH/tOhyv0SxahjuYXq1WRSFS8HSSUwn5O26HfK8jujJ3 qzYTPly9zTCg/FdcInUJABWL/InK1tC7FO+rWTHX8sOGhAWg6556yFtCrQX4y2lin9UW qBuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+bBj/3spC8b6bi5zcCITehdRVRVjOAOFrCLkIBojeb0=; b=pjwbKkaVJC5gbLaFDA+M0X5G5MpvZeXUi5tpfOq6LTXH8YgaxGUrtXHOpEigr30ef1 qfkS4kadhT2cxbSfLoqcdkO2TXoErfY5v/UhYIgBb6Q83hnIBmc7NDx20ScH8UNtPPvM Na5qg50Fttw+TNrdd2lslI1GEew8Mkjz+4mPfDYcn1DsSkefDaWWuM0d3WSK06y2AZS0 akKLDTee/poIlK+3m0FE/HGGmQu09VfP+mBx/ygFiCUhoNTPPHGjCieOO7Hsgo+g2/5B JxV6mDtkDGW/lrsX4UzmppuXrheDOWAUGcIe2DBSo4ERd4kjHed6F5o9otucB0nXCLEH tYVQ== X-Gm-Message-State: AOAM532tjV5wvGTrVvzhAqHF0RPhreD16q1Xovk+6Kb/v/TAIxZEJm6C hstFesh/jmFOqzuiRSPsWfv7d1Ng5jg= X-Google-Smtp-Source: ABdhPJxbSv7d9x13pJLkhFFVhCgJ/8eYOG56wDw36EYWEVXEueeWhnYOcHa8Xwt/27o9UEqBDCqUNQ== X-Received: by 2002:a05:6000:14b:: with SMTP id r11mr79698688wrx.53.1609729765371; Sun, 03 Jan 2021 19:09:25 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r1sm91114250wrl.95.2021.01.03.19.09.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:24 -0800 
(PST) Message-Id: <75b51483d3c7088d0cfae36544966672374c50f9.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:15 +0000 Subject: [PATCH v2 6/9] index-format: update preamble to cached tree extension Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee I had difficulty in my efforts to learn about the cached tree extension based on the documentation and code because I had an incorrect assumption about how it behaved. This might be due to some ambiguity in the documentation, so this change modifies the beginning of the cached tree format by expanding the description of the feature. My hope is that this documentation clarifies a few things: 1. There is an in-memory recursive tree structure that is constructed from the extension data. This structure has a few differences, such as where the name is stored. 2. What does it mean for an entry to be invalid? 3. When exactly are "new" trees created? Signed-off-by: Derrick Stolee --- Documentation/technical/index-format.txt | 36 ++++++++++++++++++++---- 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 69edf46c031..c614e136e24 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -138,12 +138,36 @@ Git index format === Cached tree - Cached tree extension contains pre-computed hashes for trees that can - be derived from the index. It helps speed up tree object generation - from index for a new commit. - - When a path is updated in index, the path must be invalidated and - removed from tree cache. 
+ Since the index does not record entries for directories, the cache + entries cannot describe tree objects that already exist in the object + database for regions of the index that are unchanged from an existing + commit. The cached tree extension stores a recursive tree structure that + describes the trees that already exist and completely match sections of + the cache entries. This speeds up tree object generation from the index + for a new commit by only computing the trees that are "new" to that + commit. + + The recursive tree structure uses nodes that store a number of cache + entries, a list of subnodes, and an object ID (OID). The OID references + the existing tree for that node, if it is known to exist. The subnodes + correspond to subdirectories that themselves have cached tree nodes. The + number of cache entries corresponds to the number of cache entries in + the index that describe paths within that tree's directory. + + Note that the path for a given tree is part of the parent node in-memory + but is part of the child in the file format. The root tree has an empty + string for its name and its name does not exist in-memory. + + When a path is updated in index, Git invalidates all nodes of the + recursive cached tree corresponding to the parent directories of that + path. We store these tree nodes as being "invalid" by using "-1" as the + number of cache entries. To create trees corresponding to the current + index, Git only walks the invalid tree nodes and uses the cached OIDs + for the valid trees to construct new trees. In this way, Git only + constructs trees on the order of the number of changed paths (and their + depth in the working directory). This comes at a cost of tracking the + full directory structure in the cached tree extension, but this is + generally smaller than the full cache entry list in the index. The signature for this extension is { 'T', 'R', 'E', 'E' }.
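The invalidation rule described in the new preamble can be sketched in C as follows. This is a minimal illustration, not the code in cache-tree.c: `struct ct_node`, `MAX_SUBTREES`, `find_child()`, `invalidate_path()` and `count_trees_to_write()` are hypothetical names, the node omits the OID, and the directory name is stored on the node itself purely to keep the sketch short. It assumes the invariant implied by the text: an invalid node always has invalid ancestors, so a valid node never needs to be descended into.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBTREES 8 /* hypothetical fixed bound, for brevity only */

/* Hypothetical, simplified node; Git's real struct cache_tree differs. */
struct ct_node {
	const char *name;  /* directory name relative to parent; "" for root */
	int entry_count;   /* index entries under this tree, or -1 if invalid */
	int subtree_nr;    /* number of used slots in subtrees[] */
	struct ct_node *subtrees[MAX_SUBTREES];
};

/* Find the direct child whose name equals the first `len` bytes of `name`. */
static struct ct_node *find_child(struct ct_node *node,
				  const char *name, size_t len)
{
	for (int i = 0; i < node->subtree_nr; i++) {
		struct ct_node *child = node->subtrees[i];
		if (strlen(child->name) == len && !memcmp(child->name, name, len))
			return child;
	}
	return NULL;
}

/* Mark the root and every existing parent directory of `path` invalid. */
static void invalidate_path(struct ct_node *root, const char *path)
{
	struct ct_node *node = root;

	assert(root && path);
	while (node) {
		const char *slash = strchr(path, '/');

		node->entry_count = -1;
		if (!slash)
			break; /* the remaining component is the file itself */
		node = find_child(node, path, (size_t)(slash - path));
		path = slash + 1;
	}
}

/* Count the trees that must be recomputed; valid subtrees are skipped. */
static int count_trees_to_write(const struct ct_node *node)
{
	int n = 1;

	if (node->entry_count >= 0)
		return 0; /* valid: the cached OID can be reused as-is */
	for (int i = 0; i < node->subtree_nr; i++)
		n += count_trees_to_write(node->subtrees[i]);
	return n;
}
```

With a root containing `a/` (which contains `a/b/`) and `c/`, updating `a/b/file.txt` invalidates the root, `a` and `a/b` but leaves `c` untouched, so only three trees need to be rebuilt regardless of how many entries live under `c/`.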
From patchwork Mon Jan 4 03:09:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D677EC4332D for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A757B2151B for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728053AbhADDKq (ORCPT ); Sun, 3 Jan 2021 22:10:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728027AbhADDKp (ORCPT ); Sun, 3 Jan 2021 22:10:45 -0500 Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 70D26C06179E for ; Sun, 3 Jan 2021 19:09:27 -0800 (PST) Received: by mail-wm1-x32a.google.com with SMTP id c124so17192075wma.5 for ; Sun, 03 Jan 2021 19:09:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=JVX1YwEI15ACXoHGSLfoBMSCsKP/lX3ZPnVK+ClcBPI=; b=Mgcf6rmRBJuFuxs0lPmcv99DBS8vTzYSLkwlP26j5tCu31OE0xDrOL9F1qQSIkOWwY 8WwD0y4KPxBScKam6d2lmsLQlRv6yaDkqZE/YxFki5G49lMqznW10cNhcmwL1pGIruyH 
TFLG4mk6eP1yDL6jTi8oJlVYVg1FZu33XOCL/R0zW1m1JA0uG+5CO2htv9pvF0S1SABO 3As8Y4dIDxA5jeQ4Y7RZBoa3hECA02bfSaD4AAtwLCzpFGKCQ36bQu0veMAktvodYoKB frIIljYNopFyRe4Z15QI1VyatxUcQlOzqtWMC1wFhE5vvl7Ve+3MNUPTuO/ENBT8u4b3 Ch4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=JVX1YwEI15ACXoHGSLfoBMSCsKP/lX3ZPnVK+ClcBPI=; b=hscPngnpJBxzBQGkT2/rSyEYdegtsjftRGr4k5k7XX0pBPj5Y/G8+r5GffJRY5hwbp WZ6ZfO+sMPgcSId4m/jStvNtgwcp01GkDLlLHgBU2iFvVZuJTwLyTSg7pEHGkWhN6sGA 97wmubCzF4zdP9/cQZbNJwaVZmRE1c3U1AgMXwqvedDYqvC+iuyhNNoqeb+5ADQezCLV 1gJj1JyDdftBXOgwwTcGojnQG9/2U3E9rrb5k9QybWMMX431/Efh9dIf2eb1IJmqQ1A0 9IVBHoByw78w4D8qTDhhg0JkzfcprSSIRzNw9B2CCUQfV1JH81brqx9ohBHBvKXmf82g otLQ== X-Gm-Message-State: AOAM531/GiU5HO+WPUB0UezGNP2cf8A/UVz2doQZ+SvHXNSbgTeFEgCd wb2OI3oTdZ/gPjs+lNpA6XW7jp7S18M= X-Google-Smtp-Source: ABdhPJwDwliPZYLn3LcC0yNhaczwCldizjfef++Z0dOBtQA/6FNL0dHbhG7AMeKegU4LW17iTKzhEw== X-Received: by 2002:a1c:6484:: with SMTP id y126mr24795483wmb.76.1609729766106; Sun, 03 Jan 2021 19:09:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 67sm33108030wmb.47.2021.01.03.19.09.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:25 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:16 +0000 Subject: [PATCH v2 7/9] index-format: discuss recursion of cached-tree better Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The end of the cached tree index extension format trails off with ellipses ever since 23fcc98 (doc: technical details about the index file format, 2011-03-01). 
While an intuitive reader could gather what this means, it could be better to use "and so on" instead. Really, this is only justified because I also wanted to point out that the number of subtrees in the index format is used to determine when the recursive depth-first-search stack should be "popped." This should help to add clarity to the format. Signed-off-by: Derrick Stolee --- Documentation/technical/index-format.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index c614e136e24..2ebe88b9d46 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -198,7 +198,8 @@ Git index format first entry represents the root level of the repository, followed by the first subtree--let's call this A--of the root level (with its name relative to the root level), followed by the first subtree of A (with - its name relative to A), ... + its name relative to A), and so on. The specified number of subtrees + indicates when the current level of the recursive stack is complete. 
=== Resolve undo From patchwork Mon Jan 4 03:09:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ren=C3=A9_Scharfe?= X-Patchwork-Id: 11996119 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2764EC43331 for ; Mon, 4 Jan 2021 03:11:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EC08420DD4 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728045AbhADDKq (ORCPT ); Sun, 3 Jan 2021 22:10:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39186 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728036AbhADDKp (ORCPT ); Sun, 3 Jan 2021 22:10:45 -0500 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3CFFCC06179F for ; Sun, 3 Jan 2021 19:09:28 -0800 (PST) Received: by mail-wr1-x42e.google.com with SMTP id q18so30822918wrn.1 for ; Sun, 03 Jan 2021 19:09:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=2sQ7U1FWgBOX7rWhIg5mwDJ4gGLWmHnYWn6oA+HXRLA=; b=ZVIZ7/7gyVBX5Lg2BaogjXv9v2PJXXjcXRRDVPqYD6C9Yg5mox2sMoMT/hrgG2L8Lb 
tz5aC7yrFmWqZvxgO+sa7hSYPOViD/JhCWyGBooDyoAs6Qm44qMJPFXqsK7FgvwdE+j2 oNFhFpftIctgC/rmsKypDv/yU7+dsTSwYmuCv260xrwpwrZSPEehQQ3Mgb42EWIuOM6K qtL1AhC0Qoiv2caETHeUBvXutJbU1Lm32nW5BMIVgR7no3py+Kg4tDJr+jIVcrvtJr6u lZ2dSUXsnlpnR2s/fh0elxIt385lFobThLcfFtL8v94tHBzeOdKtivbG2AmrPGL5jkmo +6Bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=2sQ7U1FWgBOX7rWhIg5mwDJ4gGLWmHnYWn6oA+HXRLA=; b=nK4plGqDBJI0dgtYKV23ghb7zKNMo/FHVosAogW5E337nmmt7OjD8009AP7eQ8olOT 8F6ojRR8TjBzuj+Q3neBMWJonuNX2iB01m32WJqbhmfkQDSrBr5FDMzSVGEkk/8wd0SF 7Iz9gcy5XGQ/JEkeVcnvgL45azEf2RAGxVCflFQzXaDjevU/WlmV+KPwfYH9iqEvA+25 ReSREvLCLUJ1UGaPDhtfFcg2k2SQp3tWRpZwOAEPIqqcYDLK+ZS8fvNqKEA2b3hjgouX MEOl2IjPuh+b3qLeWau5CMPRm39K5FAlHCLM7aGA0DC2l43PF7m8JVY4RKEOeFaUBMO1 tHXA== X-Gm-Message-State: AOAM531H5l/KkqeJnz9D+JLHcga7lUrQaAdfh7UaY5K4r8rDjRJBmEru zLyOOO/VYO/ljiPnV89sl8TrdNuZ13Y= X-Google-Smtp-Source: ABdhPJyenWI9erim8yBzAMJSLNtBsdllaDgoPmyn5rwQwRKSzvsIqp60GWvTvhYJ0nfPJ5hULKZsBQ== X-Received: by 2002:adf:8145:: with SMTP id 63mr75537495wrm.8.1609729766892; Sun, 03 Jan 2021 19:09:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h16sm32049910wmb.41.2021.01.03.19.09.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:26 -0800 (PST) Message-Id: <5298694786e2c76fb08a0b37890e3183d788ff10.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:17 +0000 Subject: [PATCH v2 8/9] cache-tree: use ce_namelen() instead of strlen() MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , =?utf-8?q?Ren=C3=A9_Scharfe?= Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: =?utf-8?q?Ren=C3=A9_Scharfe?= From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Use 
the name length field of cache entries instead of calculating its value anew. Signed-off-by: René Scharfe Signed-off-by: Derrick Stolee --- cache-tree.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 7da59b2aa07..4274de75bac 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -185,10 +185,12 @@ static int verify_cache(struct cache_entry **cache, * the cache is sorted. Also path can appear only once, * which means conflicting one would immediately follow. */ - const char *this_name = cache[i]->name; - const char *next_name = cache[i+1]->name; - int this_len = strlen(this_name); - if (this_len < strlen(next_name) && + const struct cache_entry *this_ce = cache[i]; + const struct cache_entry *next_ce = cache[i + 1]; + const char *this_name = this_ce->name; + const char *next_name = next_ce->name; + int this_len = ce_namelen(this_ce); + if (this_len < ce_namelen(next_ce) && strncmp(this_name, next_name, this_len) == 0 && next_name[this_len] == '/') { if (10 < ++funny) { From patchwork Mon Jan 4 03:09:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 11996117 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB670C43332 for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CFDB22184D for ; Mon, 4 Jan 2021 03:11:02 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728050AbhADDKq (ORCPT ); Sun, 3 Jan 2021 22:10:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728037AbhADDKp (ORCPT ); Sun, 3 Jan 2021 22:10:45 -0500 Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com [IPv6:2a00:1450:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BF27C0617A0 for ; Sun, 3 Jan 2021 19:09:29 -0800 (PST) Received: by mail-wr1-x432.google.com with SMTP id a12so30752869wrv.8 for ; Sun, 03 Jan 2021 19:09:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=RWzNeO0RUBCbiAHqidiz31YNJN86Cus3A4MuiXII+/s=; b=eKuHLEIRFCc5SragLNPXI62mZUc6Z2tQcCyH5C95C9KlnFXYJZqZkF3gb4Ds738Tv6 Sf85CK4DegXfMAtS+KwCnhDfKKUzyFZMoeEkrBr27DVqOUP0HSESaM7NUKApr8j+Tr9v zt+syXKhTfv38hqJEjsQxW1XmgEjREj6MHnGggDw14wX0/ENeK34qERu+9UwMkMUEgWk C0CkRbL4sRBbNSXhZxMsris6gRzQf9IeqtbRw1RiS/lhz2y77U/0mXUVGANWCfwnroPv 3cDx9Hcj8z4ia4+QHE040ryTNWDjAAxh4Z38D2Uo9C4DNe1arAcW9mxW/43D8uTEMrBJ xaRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=RWzNeO0RUBCbiAHqidiz31YNJN86Cus3A4MuiXII+/s=; b=huOb2sdlqtpMx6R/ZeGsaDDmkKcROGaBq1g1KNeayHAje0rdh2BMxv/V+jDzZ50FVi sCR8bL//1pVhg90aLTQzmz/fReC5qYyqsPrfIkR8H/SVTPrMbtZWVELrUzCmW8/n4Iga ouj/BXF9CU7v9EVUw1yu9Wf7Z4cbCZYmlM9RLfEOxUScrU3GNXBi4F1X00xyiOICQd0q wyifdTg/ERfMKZn4weqeN9GXrg7eW4qKkJexfjHsI/h7rO9St0kLLz9gvb38PkmK/NmX SOrVZmjXk1U8iKFbs56pnm5aNG6Or4CcoEOeLx0suhLQAvw5ucHmRdd5aXkDKbKCXwqS BgXA== X-Gm-Message-State: AOAM5326Q5rqVtRq7NrEyM1oxlG06DNqL0qze0xm5rxpzOQTpwKXYVHy bcPJg1GRBfVF8nOvFBFNEGz2PKdmwy0= X-Google-Smtp-Source: 
ABdhPJyxrBIPuB7HGoMSgR3Md0tiKYNTSEQTZ6Ljs0tNf3UOUiMX98gzMXwgnw+4Tp56/pNwkypxKA== X-Received: by 2002:adf:e348:: with SMTP id n8mr79153975wrj.148.1609729767649; Sun, 03 Jan 2021 19:09:27 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r15sm86588934wrq.1.2021.01.03.19.09.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 03 Jan 2021 19:09:27 -0800 (PST) Message-Id: <72edd7bb4278a8ce6823752a3142b14a05bbea58.1609729758.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 04 Jan 2021 03:09:18 +0000 Subject: [PATCH v2 9/9] cache-tree: speed up consecutive path comparisons MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: gitster@pobox.com, newren@gmail.com, Derrick Stolee , =?utf-8?b?UmVuw6k=?= Scharfe , Derrick Stolee , Derrick Stolee Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee From: Derrick Stolee The previous change reduced time spent in strlen() while comparing consecutive paths in verify_cache(), but we can do better. The conditional checks the existence of a directory separator at the correct location, but only after doing a string comparison. Swap the order to be logically equivalent but perform fewer string comparisons. To test the effect on performance, I used a repository with over three million paths in the index. I then ran the following command on repeat: git -c index.threads=1 commit --amend --allow-empty --no-edit Here are the measurements over 10 runs after a 5-run warmup: Benchmark #1: v2.30.0 Time (mean ± σ): 854.5 ms ± 18.2 ms Range (min … max): 825.0 ms … 892.8 ms Benchmark #2: Previous change Time (mean ± σ): 833.2 ms ± 10.3 ms Range (min … max): 815.8 ms … 849.7 ms Benchmark #3: This change Time (mean ± σ): 815.5 ms ± 18.1 ms Range (min … max): 795.4 ms … 849.5 ms This change is 2% faster than the previous change and 5% faster than v2.30.0. 
Signed-off-by: Derrick Stolee --- cache-tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cache-tree.c b/cache-tree.c index 4274de75bac..3f1a8d4f1b7 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -191,8 +191,8 @@ static int verify_cache(struct cache_entry **cache, const char *next_name = next_ce->name; int this_len = ce_namelen(this_ce); if (this_len < ce_namelen(next_ce) && - strncmp(this_name, next_name, this_len) == 0 && - next_name[this_len] == '/') { + next_name[this_len] == '/' && + strncmp(this_name, next_name, this_len) == 0) { if (10 < ++funny) { fprintf(stderr, "...\n"); break;
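The reordered condition can be tried in isolation with a small standalone sketch. `is_dir_file_conflict()` is a hypothetical helper written for illustration and does not exist in cache-tree.c; it takes the name lengths as parameters where the patch uses ce_namelen(). It only shows why the two orderings are logically equivalent while the new one usually skips the prefix comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of cache-tree.c. Returns 1 when the
 * next path lies inside a directory named exactly like this path,
 * i.e. the sorted list holds both a file "foo" and a path "foo/...".
 */
static int is_dir_file_conflict(const char *this_name, size_t this_len,
				const char *next_name, size_t next_len)
{
	assert(this_name && next_name);
	return this_len < next_len &&
	       /* cheap single-byte test first: most pairs fail here */
	       next_name[this_len] == '/' &&
	       /* full prefix comparison only when a '/' is in place */
	       strncmp(this_name, next_name, this_len) == 0;
}
```

Because `&&` short-circuits, the length check still guards the `next_name[this_len]` access, and `strncmp()` only runs for the pairs that already have a directory separator at the right offset.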