From patchwork Fri Mar 20 12:38:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Linus Arver via GitGitGadget
X-Patchwork-Id: 11449055
Message-Id: <60b5cc6f337011a7f2d5a229a83df7b82638d421.1584707890.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Date: Fri, 20 Mar 2020 12:38:09 +0000
Subject: [PATCH v3 1/2] config: set pack.useSparse=true by default
To: git@vger.kernel.org
Cc: me@ttaylorr.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

The pack.useSparse config option was introduced by 3d036eb0
(pack-objects: create pack.useSparse setting, 2019-01-19) and was
first available in v2.21.0. When enabled, the pack-objects process
during 'git push' will use a sparse tree walk when deciding which
trees and blobs to send to the remote.

The algorithm was introduced by d5d2e93 (revision: implement sparse
algorithm, 2019-01-16) and has been in production use by VFS for Git
since around that time.
The feature.experimental config option also enabled pack.useSparse,
so hopefully that has also increased exposure.

It is worth noting that pack.useSparse has a possibility of sending
more objects across a push, but requires a special arrangement of
exact _copies_ across directories. There is a test in
t5322-pack-objects-sparse.sh that demonstrates this possibility. This
test uses the --sparse option to "git pack-objects" but we can make it
implied by the config value to demonstrate that the default value has
changed.

While updating that test, I noticed that the documentation did not
include an option for --no-sparse, which is now more important than it
was before.

Since the downside is unlikely but the upside is significant, set the
default value of pack.useSparse to true. Remove it from the set of
options implied by feature.experimental.

Signed-off-by: Derrick Stolee
---
 Documentation/config/feature.txt   |  3 ---
 Documentation/config/pack.txt      |  4 ++--
 Documentation/git-pack-objects.txt | 10 ++++++----
 repo-settings.c                    |  3 ++-
 t/t5322-pack-objects-sparse.sh     |  3 ++-
 5 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
index 875f8c8a66f..4e3a5c0cebc 100644
--- a/Documentation/config/feature.txt
+++ b/Documentation/config/feature.txt
@@ -12,9 +12,6 @@ feature.experimental::
 	setting if you are interested in providing feedback on experimental
 	features. The new default values are:
 +
-* `pack.useSparse=true` uses a new algorithm when constructing a pack-file
-which can improve `git push` performance in repos with many files.
-+
 * `fetch.negotiationAlgorithm=skipping` may improve fetch negotiation times by
 skipping more commits at a time, reducing the number of round trips.
 +
diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt
index 0dac5805816..837f1b16792 100644
--- a/Documentation/config/pack.txt
+++ b/Documentation/config/pack.txt
@@ -119,8 +119,8 @@ pack.useSparse::
 	objects. This can have significant performance benefits when
 	computing a pack to send a small change. However, it is possible
 	that extra objects are added to the pack-file if the included
-	commits contain certain types of direct renames. Default is `false`
-	unless `feature.experimental` is enabled.
+	commits contain certain types of direct renames. Default is
+	`true`.

 pack.writeBitmaps (deprecated)::
 	This is a deprecated synonym for `repack.writeBitmaps`.
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index fecdf2600cc..eaa2f2a4041 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -14,7 +14,7 @@ SYNOPSIS
 	[--local] [--incremental] [--window=] [--depth=]
 	[--revs [--unpacked | --all]] [--keep-pack=]
 	[--stdout [--filter=] | base-name]
-	[--shallow] [--keep-true-parents] [--sparse] < object-list
+	[--shallow] [--keep-true-parents] [--[no-]sparse] < object-list


 DESCRIPTION
@@ -196,14 +196,16 @@ depth is 4095.
 	Add --no-reuse-object if you want to force a uniform compression
 	level on all data no matter the source.

---sparse::
-	Use the "sparse" algorithm to determine which objects to include in
+--[no-]sparse::
+	Toggle the "sparse" algorithm to determine which objects to include in
 	the pack, when combined with the "--revs" option. This algorithm
 	only walks trees that appear in paths that introduce new objects.
 	This can have significant performance benefits when computing
 	a pack to send a small change. However, it is possible that extra
 	objects are added to the pack-file if the included commits contain
-	certain types of direct renames.
+	certain types of direct renames. If this option is not included,
+	it defaults to the value of `pack.useSparse`, which is true unless
+	otherwise specified.

 --thin::
 	Create a "thin" pack by omitting the common objects between a
diff --git a/repo-settings.c b/repo-settings.c
index a703e407a3f..dc6817daa95 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -45,6 +45,8 @@ void prepare_repo_settings(struct repository *r)

 	if (!repo_config_get_bool(r, "pack.usesparse", &value))
 		r->settings.pack_use_sparse = value;
+	UPDATE_DEFAULT_BOOL(r->settings.pack_use_sparse, 1);
+
 	if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) {
 		UPDATE_DEFAULT_BOOL(r->settings.index_version, 4);
 		UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE);
@@ -52,7 +54,6 @@ void prepare_repo_settings(struct repository *r)
 	if (!repo_config_get_bool(r, "fetch.writecommitgraph", &value))
 		r->settings.fetch_write_commit_graph = value;
 	if (!repo_config_get_bool(r, "feature.experimental", &value) && value) {
-		UPDATE_DEFAULT_BOOL(r->settings.pack_use_sparse, 1);
 		UPDATE_DEFAULT_BOOL(r->settings.fetch_negotiation_algorithm, FETCH_NEGOTIATION_SKIPPING);
 		UPDATE_DEFAULT_BOOL(r->settings.fetch_write_commit_graph, 1);
 	}
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 7124b5581a0..6e5d6bdb0a7 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -105,6 +105,7 @@ test_expect_success 'non-sparse pack-objects' '
 	test_cmp required_objects.txt nonsparse_required_objects.txt
 '

+# --sparse is enabled by default by pack.useSparse
 test_expect_success 'sparse pack-objects' '
 	git rev-parse \
 		topic1 \
@@ -112,7 +113,7 @@ test_expect_success 'sparse pack-objects' '
 		topic1:f3 \
 		topic1:f3/f4 \
 		topic1:f3/f4/data.txt | sort >expect_sparse_objects.txt &&
-	git pack-objects --stdout --revs --sparse sparse.pack &&
+	git pack-objects --stdout --revs sparse.pack &&
 	git index-pack -o sparse.idx sparse.pack &&
 	git show-index sparse_objects.txt &&
 	test_cmp expect_sparse_objects.txt sparse_objects.txt

From patchwork Fri Mar 20 12:38:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Linus Arver via GitGitGadget
X-Patchwork-Id: 11449057
Message-Id: <908d5c77c96feeeda74144447586ccdc2be4665e.1584707890.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Date: Fri, 20 Mar 2020 12:38:10 +0000
Subject: [PATCH v3 2/2] pack-objects: flip the use of GIT_TEST_PACK_SPARSE
To: git@vger.kernel.org
Cc: me@ttaylorr.com, jrnieder@gmail.com, Derrick Stolee, Derrick Stolee
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

The environment variable GIT_TEST_PACK_SPARSE was previously used to
allow testing the --sparse option for "git pack-objects" in the test
suite. This allowed interesting cases of "git push" to also test this
algorithm.

Since pack.useSparse is now true by default, we do not need this
variable to _enable_ the --sparse option, but instead to _disable_ it.
This flips how we work with the variable a bit. When checking for the
variable, default to a value of -1 for "unset".
If unset, then take the default from the repo settings, which is
currently 1. Then, the --[no-]sparse command-line option will override
either of these settings.

Signed-off-by: Derrick Stolee
---
 builtin/pack-objects.c         | 4 ++--
 t/README                       | 6 +++---
 t/t5322-pack-objects-sparse.sh | 1 +
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 02aa6ee4808..eff9542f09f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3469,9 +3469,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)

 	read_replace_refs = 0;

-	sparse = git_env_bool("GIT_TEST_PACK_SPARSE", 0);
+	sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1);
 	prepare_repo_settings(the_repository);
-	if (!sparse && the_repository->settings.pack_use_sparse != -1)
+	if (sparse < 0)
 		sparse = the_repository->settings.pack_use_sparse;

 	reset_pack_idx_option(&pack_idx_opts);
diff --git a/t/README b/t/README
index 9afd61e3ca0..99ebb18829f 100644
--- a/t/README
+++ b/t/README
@@ -386,9 +386,9 @@ GIT_TEST_INDEX_VERSION= exercises the index read/write code path
 for the index version specified. Can be set to any valid version
 (currently 2, 3, or 4).

-GIT_TEST_PACK_SPARSE= if enabled will default the pack-objects
-builtin to use the sparse object walk. This can still be overridden by
-the --no-sparse command-line argument.
+GIT_TEST_PACK_SPARSE= if disabled will default the pack-objects
+builtin to use the non-sparse object walk. This can still be overridden by
+the --sparse command-line argument.

 GIT_TEST_PRELOAD_INDEX= exercises the preload-index code path
 by overriding the minimum number of cache entries required per thread.
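
The precedence described in the commit message (test environment
variable first, then the repository default, then the command-line
flag) can be modelled as a small standalone C function. This is only an
illustrative sketch, not the code in builtin/pack-objects.c: the names
`env_bool_or` and `resolve_sparse` are hypothetical, the boolean parsing
is deliberately simplified, and representing the command-line flag as a
tri-state value is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an environment boolean lookup: returns
 * `def` when the variable is unset. Parsing is simplified for this
 * sketch and only recognises a few spellings of "false".
 */
static int env_bool_or(const char *name, int def)
{
	const char *v = getenv(name);

	if (!v)
		return def;
	if (!*v || !strcmp(v, "0") || !strcmp(v, "false") ||
	    !strcmp(v, "no") || !strcmp(v, "off"))
		return 0;
	return 1;
}

/*
 * Resolve the effective "sparse" setting (hypothetical helper).
 *   env:          -1 if the test variable is unset, else 0 or 1
 *   repo_default: value taken from the repository settings (0 or 1)
 *   cli:          -1 if neither --sparse nor --no-sparse was given,
 *                 else 0 or 1 (an assumption of this sketch)
 */
static int resolve_sparse(int env, int repo_default, int cli)
{
	int sparse = env;

	assert(repo_default == 0 || repo_default == 1);

	if (sparse < 0)		/* unset: fall back to the repo setting */
		sparse = repo_default;
	if (cli >= 0)		/* an explicit flag overrides both */
		sparse = cli;
	return sparse;
}
```

A caller would typically combine the two helpers, for example
`resolve_sparse(env_bool_or("GIT_TEST_PACK_SPARSE", -1), 1, -1)`, which
yields 1 when the variable is unset and no flag is passed.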
diff --git a/t/t5322-pack-objects-sparse.sh b/t/t5322-pack-objects-sparse.sh
index 6e5d6bdb0a7..a581eaf5293 100755
--- a/t/t5322-pack-objects-sparse.sh
+++ b/t/t5322-pack-objects-sparse.sh
@@ -107,6 +107,7 @@ test_expect_success 'non-sparse pack-objects' '

 # --sparse is enabled by default by pack.useSparse
 test_expect_success 'sparse pack-objects' '
+	GIT_TEST_PACK_SPARSE=-1 &&
 	git rev-parse \
 		topic1 \
 		topic1^{tree} \