From patchwork Thu Jul 30 22:24:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693705 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AC15A14B7 for ; Thu, 30 Jul 2020 22:24:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8FACC20838 for ; Thu, 30 Jul 2020 22:24:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="KTwG23S3" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730269AbgG3WYl (ORCPT ); Thu, 30 Jul 2020 18:24:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730024AbgG3WYk (ORCPT ); Thu, 30 Jul 2020 18:24:40 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C83BFC061574 for ; Thu, 30 Jul 2020 15:24:39 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id p14so7023473wmg.1 for ; Thu, 30 Jul 2020 15:24:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=VkMfmEWQ19rREUTd8nLWnILvhKOacPDW1DoUVs4XDpE=; b=KTwG23S3XhLut07OdeFvL5sagSiTgjwr5Lm1uQJfcsOSQaV8MKRZ500WekgkitQzeb d6PJV6J2mI2je+n4LZIbLRAdXqcox8NLVqqMDEI4zK+hNilFQEjbMwK0v3hfL/WZoFWJ 7QG/L3VYYvRC6M6QTe+g+fPeVJMw/HJMofUGOIXReyk2MmjbMykkqZ5WWAhF8PQCjquH bc0v06gjcFa6gQ+9xpCEo1SYJWUicUkENIiSmXsn5/JsUjd7QwYNqQixB8slfvj6pdE4 RNw0zBWc43+AbpNhnIFPvJdRfussZJf+waXssUotzU/nQy1j/s0KnIrdKrNifSlwL5np 42lQ== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=VkMfmEWQ19rREUTd8nLWnILvhKOacPDW1DoUVs4XDpE=; b=lxqSEZ6dRvS765y5JizsFtf2VwrMCPN5qZ78g3WsHGTZ0T8XoZNvAMBBFVlM0Lzgdr nYi7x24Fa7ME6w7NEBbWAqg4yAGfHuz+7VFH8+PlVZNGPQ+rw+6cCBz86blGPO48fSAV DCQISqZVfSW8CTkmJ+xXy3EsMyW9n+cw2jHIv8uqjbPomjPQJ5SNvU/7MiwBUUwvB7A7 vW1M4rDrudJuFkSiMXRRtXT9d2relMOxAk6y/d3n9ylGcyUHMsumRmLKZabpOfuM6GB9 WVK8GG6UHUucCuxZ43IRfUWtt0Ao05VYFFxizjIKkI2JnlPMrMF5v89zVT8EPhUAJXjC SYUw== X-Gm-Message-State: AOAM533ikB8c44OdQEbhAcIt3FpPdlVn417XHNiXqU1Lr3db3Nrrm31E c14q7goH253s4w2x4Zi7/378u5gR X-Google-Smtp-Source: ABdhPJyPXrc3Cae/e0+4HFgt58XZ66n97RKfe8If4NkRjHzF54Iofgw8373Tiq3TXzbkVECJ0krdIg== X-Received: by 2002:a05:600c:c7:: with SMTP id u7mr1138891wmm.135.1596147878200; Thu, 30 Jul 2020 15:24:38 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 111sm11795858wrc.53.2020.07.30.15.24.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:33 -0700 (PDT) Message-Id: <12fe73bb72bf2193967979acf68f7645b339eaa2.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:07 +0000 Subject: [PATCH v3 01/20] maintenance: create basic maintenance runner Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The 'gc' builtin is our current entrypoint for automatically maintaining a repository. 
This one tool does many operations, such as repacking the repository, packing refs, and rewriting the commit-graph file. The name implies it performs "garbage collection" which means several different things, and some users may not want to use this operation that rewrites the entire object database. Create a new 'maintenance' builtin that will become a more general-purpose command. To start, it will only support the 'run' subcommand, but will later expand to add subcommands for scheduling maintenance in the background. For now, the 'maintenance' builtin is a thin shim over the 'gc' builtin. In fact, the only option is the '--auto' toggle, which is handed directly to the 'gc' builtin. The current change is isolated to this simple operation to prevent more interesting logic from being lost in all of the boilerplate of adding a new builtin. Use the existing builtin/gc.c file because we want to share code between the two builtins. It is possible that we will have 'maintenance' replace the 'gc' builtin entirely at some point, leaving 'git gc' as an alias for some specific arguments to 'git maintenance run'. 
Signed-off-by: Derrick Stolee --- .gitignore | 1 + Documentation/git-maintenance.txt | 57 +++++++++++++++++++++++++++++++ builtin.h | 1 + builtin/gc.c | 56 ++++++++++++++++++++++++++++++ git.c | 1 + t/t7900-maintenance.sh | 19 +++++++++++ 6 files changed, 135 insertions(+) create mode 100644 Documentation/git-maintenance.txt create mode 100755 t/t7900-maintenance.sh diff --git a/.gitignore b/.gitignore index ee509a2ad2..a5808fa30d 100644 --- a/.gitignore +++ b/.gitignore @@ -90,6 +90,7 @@ /git-ls-tree /git-mailinfo /git-mailsplit +/git-maintenance /git-merge /git-merge-base /git-merge-index diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt new file mode 100644 index 0000000000..34cd2b4417 --- /dev/null +++ b/Documentation/git-maintenance.txt @@ -0,0 +1,57 @@ +git-maintenance(1) +================== + +NAME +---- +git-maintenance - Run tasks to optimize Git repository data + + +SYNOPSIS +-------- +[verse] +'git maintenance' run [] + + +DESCRIPTION +----------- +Run tasks to optimize Git repository data, speeding up other Git commands +and reducing storage requirements for the repository. ++ +Git commands that add repository data, such as `git add` or `git fetch`, +are optimized for a responsive user experience. These commands do not take +time to optimize the Git data, since such optimizations scale with the full +size of the repository while these user commands each perform a relatively +small action. ++ +The `git maintenance` command provides flexibility for how to optimize the +Git repository. + +SUBCOMMANDS +----------- + +run:: + Run one or more maintenance tasks. + +TASKS +----- + +gc:: + Cleanup unnecessary files and optimize the local repository. "GC" + stands for "garbage collection," but this task performs many + smaller tasks. This task can be rather expensive for large + repositories, as it repacks all Git objects into a single pack-file. + It can also be disruptive in some situations, as it deletes stale + data. 
+ +OPTIONS +------- +--auto:: + When combined with the `run` subcommand, run maintenance tasks + only if certain thresholds are met. For example, the `gc` task + runs when the number of loose objects exceeds the number stored + in the `gc.auto` config setting, or when the number of pack-files + exceeds the `gc.autoPackLimit` config setting. + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/builtin.h b/builtin.h index a5ae15bfe5..17c1c0ce49 100644 --- a/builtin.h +++ b/builtin.h @@ -167,6 +167,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix); int cmd_ls_remote(int argc, const char **argv, const char *prefix); int cmd_mailinfo(int argc, const char **argv, const char *prefix); int cmd_mailsplit(int argc, const char **argv, const char *prefix); +int cmd_maintenance(int argc, const char **argv, const char *prefix); int cmd_merge(int argc, const char **argv, const char *prefix); int cmd_merge_base(int argc, const char **argv, const char *prefix); int cmd_merge_index(int argc, const char **argv, const char *prefix); diff --git a/builtin/gc.c b/builtin/gc.c index 10346e0465..be9557452e 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -699,3 +699,59 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } + +static const char * const builtin_maintenance_usage[] = { + N_("git maintenance run []"), + NULL +}; + +static struct maintenance_opts { + int auto_flag; +} opts; + +static int maintenance_task_gc(void) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_push(&child.args, "gc"); + + if (opts.auto_flag) + strvec_push(&child.args, "--auto"); + + close_object_store(the_repository->objects); + return run_command(&child); +} + +static int maintenance_run(void) +{ + return maintenance_task_gc(); +} + +int cmd_maintenance(int argc, const char **argv, const char *prefix) +{ + static struct option builtin_maintenance_options[] = { + OPT_BOOL(0, "auto", &opts.auto_flag, + N_("run tasks based on the 
state of the repository")), + OPT_END() + }; + + memset(&opts, 0, sizeof(opts)); + + if (argc == 2 && !strcmp(argv[1], "-h")) + usage_with_options(builtin_maintenance_usage, + builtin_maintenance_options); + + argc = parse_options(argc, argv, prefix, + builtin_maintenance_options, + builtin_maintenance_usage, + PARSE_OPT_KEEP_UNKNOWN); + + if (argc == 1) { + if (!strcmp(argv[0], "run")) + return maintenance_run(); + } + + usage_with_options(builtin_maintenance_usage, + builtin_maintenance_options); +} diff --git a/git.c b/git.c index 832688ca23..5bb9645403 100644 --- a/git.c +++ b/git.c @@ -529,6 +529,7 @@ static struct cmd_struct commands[] = { { "ls-tree", cmd_ls_tree, RUN_SETUP }, { "mailinfo", cmd_mailinfo, RUN_SETUP_GENTLY | NO_PARSEOPT }, { "mailsplit", cmd_mailsplit, NO_PARSEOPT }, + { "maintenance", cmd_maintenance, RUN_SETUP_GENTLY | NO_PARSEOPT }, { "merge", cmd_merge, RUN_SETUP | NEED_WORK_TREE }, { "merge-base", cmd_merge_base, RUN_SETUP }, { "merge-file", cmd_merge_file, RUN_SETUP_GENTLY }, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh new file mode 100755 index 0000000000..0e864eaaed --- /dev/null +++ b/t/t7900-maintenance.sh @@ -0,0 +1,19 @@ +#!/bin/sh + +test_description='git maintenance builtin' + +. 
./test-lib.sh + +test_expect_success 'help text' ' + test_expect_code 129 git maintenance -h 2>err && + test_i18ngrep "usage: git maintenance run" err +' + +test_expect_success 'run [--auto]' ' + GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run && + GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && + grep ",\"gc\"]" run-no-auto.txt && + grep ",\"gc\",\"--auto\"]" run-auto.txt +' + +test_done From patchwork Thu Jul 30 22:24:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693707 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E1A114B7 for ; Thu, 30 Jul 2020 22:24:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5788C2083B for ; Thu, 30 Jul 2020 22:24:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="R0Xgcqyb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730357AbgG3WYn (ORCPT ); Thu, 30 Jul 2020 18:24:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730024AbgG3WYm (ORCPT ); Thu, 30 Jul 2020 18:24:42 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 43958C061574 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id a5so16313977wrm.6 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=BxgGmFppPriwJ2GMFVHvVQxCmv9jc2jpc3eg6BVfn/0=; b=R0Xgcqybf5TIQx1ReuUy1RPq7qMw7QmAZ8WxAxKP6ICsnUzCRMqScUYf0hBhDHNK/j Qb7p7u8Ndb2nEXVap1+/qvVzNbzDD2RMMBLcBU7RkTNNRkV9410OW5RR01ozu1LHP/mU Gz+o7gsPu+r2B9B1L04mbQt0FtRgLoodr6rHt6TAICgkGsM3yHLRBNwMoJ+qc/cBNzGY 6riBnEFfHOu6naDULJ5ITrqGLIaT58Cnlibt/GZC7yrEF/qrejiYqEVTVth5wzlR0up7 31skruM8LOrK3dRyq62RtxSCg+QeYlU2BMspNHH+hSMVm3PhiidEuTBnIJ9eHLEH6lUj 1kSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=BxgGmFppPriwJ2GMFVHvVQxCmv9jc2jpc3eg6BVfn/0=; b=p/i7ytu8Cr4RhVuGoVzm0BuG4XqfYelPVUN0rb3ShgQze9sliRNrx9Gez3HRmKt3Ez x7gBx9Iq23BZRYrE9YwHNXT/vuoGGzRjM9FbhqFyN+mtXj33lisfEq8pucRTvBudLpI/ pGCIOtKkb9TCJgKsuD95FBUIqhxx7rfj7NExYnKKVil/O86XVNPCJrj37giAqAGtIhXk hkKOHgQxmKv5tRWZ9pE32c7YEqSa75fqZTtMNYSJ/vavQPKBrouOYoQeKTMDN75cTvnn ZTd50ROpY24m13cVbxefUpu6V3flTi+lQMjdc9zKmUKvGrwIVHTWWzMH6eOMr2pc9YX7 /rQw== X-Gm-Message-State: AOAM531jw2s9FCoG8Z1caEWdS7r8FNnfCuelo2QSzRZETBhaC74pgoRO ABMRU9edY8BmRP+4xAfR2SNAOoVP X-Google-Smtp-Source: ABdhPJzs1Ptyg4VtTWY7/YicqIp/94BxIBQ2q+fTn5KIYY2LIdzyTAnzGpUlSsFHCrHu5gv+urJMew== X-Received: by 2002:a5d:4a8a:: with SMTP id o10mr643730wrq.327.1596147879974; Thu, 30 Jul 2020 15:24:39 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i9sm10422848wmb.11.2020.07.30.15.24.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:38 -0700 (PDT) Message-Id: <6e533e43d71580d3cd81f9b0ae2c5884a7d3ac2a.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:08 +0000 Subject: [PATCH v3 02/20] maintenance: add --quiet option Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, 
congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Maintenance activities are commonly used as steps in larger scripts. Providing a '--quiet' option allows those scripts to be less noisy when run on a terminal window. Turn this mode on by default when stderr is not a terminal. Pipe the option to the 'git gc' child process. Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 3 +++ builtin/gc.c | 7 +++++++ t/t7900-maintenance.sh | 8 +++++--- 3 files changed, 15 insertions(+), 3 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 34cd2b4417..089fa4cedc 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -52,6 +52,9 @@ OPTIONS in the `gc.auto` config setting, or when the number of pack-files exceeds the `gc.autoPackLimit` config setting. +--quiet:: + Do not report progress or other information over `stderr`. 
+ GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/gc.c b/builtin/gc.c index be9557452e..3c277f9f9c 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -707,6 +707,7 @@ static const char * const builtin_maintenance_usage[] = { static struct maintenance_opts { int auto_flag; + int quiet; } opts; static int maintenance_task_gc(void) @@ -718,6 +719,8 @@ static int maintenance_task_gc(void) if (opts.auto_flag) strvec_push(&child.args, "--auto"); + if (opts.quiet) + strvec_push(&child.args, "--quiet"); close_object_store(the_repository->objects); return run_command(&child); @@ -733,6 +736,8 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) static struct option builtin_maintenance_options[] = { OPT_BOOL(0, "auto", &opts.auto_flag, N_("run tasks based on the state of the repository")), + OPT_BOOL(0, "quiet", &opts.quiet, + N_("do not report progress or other information over stderr")), OPT_END() }; @@ -742,6 +747,8 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) usage_with_options(builtin_maintenance_usage, builtin_maintenance_options); + opts.quiet = !isatty(2); + argc = parse_options(argc, argv, prefix, builtin_maintenance_options, builtin_maintenance_usage, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 0e864eaaed..f08eee0977 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -9,11 +9,13 @@ test_expect_success 'help text' ' test_i18ngrep "usage: git maintenance run" err ' -test_expect_success 'run [--auto]' ' - GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run && +test_expect_success 'run [--auto|--quiet]' ' + GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run --no-quiet && GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && + GIT_TRACE2_EVENT="$(pwd)/run-quiet.txt" git maintenance run --quiet && grep ",\"gc\"]" run-no-auto.txt && - grep ",\"gc\",\"--auto\"]" run-auto.txt + grep ",\"gc\",\"--auto\"" run-auto.txt && + grep 
",\"gc\",\"--quiet\"" run-quiet.txt ' test_done From patchwork Thu Jul 30 22:24:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693713 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4FBAB14DD for ; Thu, 30 Jul 2020 22:24:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 37CEC2083B for ; Thu, 30 Jul 2020 22:24:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="L0KWrxOX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730397AbgG3WYq (ORCPT ); Thu, 30 Jul 2020 18:24:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730293AbgG3WYn (ORCPT ); Thu, 30 Jul 2020 18:24:43 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2118C061575 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id p14so7023560wmg.1 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=OWrG73JUxOZ0/0m7jgFoeinUrGnyoOyxkJTUEiq2Q3I=; b=L0KWrxOXdsWDJn6HVEHjNN/m7Lma1MujnZXlXzamIHrHlx23m8v20kiCzQuXUgTEm/ GI1JduBKsahGVBaLg8TFXBRMLRpaipmLWwJ24Oyq2MCqotS6rfIqsVf2ju+qHfHpCMiZ vnKcJjnQwu9kdchm3EOET2/u/120Ri+e3vEr3lOO/ayywRAqNqqTWzeuXEkRvo0ocRNQ 0Sdv9a/d1+iIAZ3/QtQyj+pBZ1ioSNc86e2HYzRRBGXbCZZ8RGGymfHdTO/yjy9I4kVe 
olUhfQ5OnTUtnU2T+7srA/vT/JVq378HsxcbzrsclRTb4orV36Um6tiHn5evA8JCgcJB rH/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=OWrG73JUxOZ0/0m7jgFoeinUrGnyoOyxkJTUEiq2Q3I=; b=t+Cyrx+EqHqTPqs0/VpMGVcBfDZeZLuaPwT68F46joeheZEI3Vk1+D8Isk6GD4gDKK mGiBHJP0Bs/fWgbwMJeAjGczFEtZzXnFWy904XFA26DBIDT+XIfjkOHgc/HPP8bBjeEF TppvIggjzmNMnUR9rJq0IaSwUZOq0BjL2D3pQBmKEB1ScJf8qKeJuyEFwzcQ33dTfmIx R6uBmeyJbPrhRmDvlF3na0WFDTWJ+YnpBV8F3TLM4tgL9pPIwuKnSrmNCJv6tNMQSkG3 DLAgn+KsIdygn+eMC9oTguTp6zteVKSvBiwxqZn9Ikq0NdxHRXtTKFGNYIaniTTDHdTh s8zw== X-Gm-Message-State: AOAM533IZNauBerYuz8gtQb2jnvVRWtot7so0bu93k7q1Qv9YjLBDLl1 mNQAYWrXBK7VYzWCQ+YbTi3hae8k X-Google-Smtp-Source: ABdhPJx7C/B2WAz9ea+LMXayiGhqkH6lyZG4kGmZ5oT09aWMfKEyZYBTNihRHpHqolHM+gOYb6tJbA== X-Received: by 2002:a7b:cf22:: with SMTP id m2mr1183421wmg.46.1596147880861; Thu, 30 Jul 2020 15:24:40 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c7sm11284681wrq.58.2020.07.30.15.24.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:40 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:09 +0000 Subject: [PATCH v3 03/20] maintenance: replace run_auto_gc() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The run_auto_gc() method is used in several places to trigger a check for repo maintenance after some Git commands, such as 'git commit' or 'git fetch'. 
To allow for extra customization of this maintenance activity, replace the 'git gc --auto [--quiet]' call with one to 'git maintenance run --auto [--quiet]'. As we extend the maintenance builtin with other steps, users will be able to select different maintenance activities. Rename run_auto_gc() to run_auto_maintenance() to make clearer what is happening on this call, and to expose all callers in the current diff. Rewrite the method to use a struct child_process to simplify the calls slightly. Since 'git fetch' already allows disabling the 'git gc --auto' subprocess, add an equivalent option with a different name to be more descriptive of the new behavior: '--[no-]auto-maintenance'. Update the documentation to include these options at the same time. Signed-off-by: Derrick Stolee --- Documentation/fetch-options.txt | 6 ++++-- Documentation/git-clone.txt | 6 +++--- builtin/am.c | 2 +- builtin/commit.c | 2 +- builtin/fetch.c | 6 ++++-- builtin/merge.c | 2 +- builtin/rebase.c | 4 ++-- run-command.c | 16 +++++++--------- run-command.h | 2 +- t/t5510-fetch.sh | 2 +- 10 files changed, 25 insertions(+), 23 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index 6e2a160a47..495bc8ab5a 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -86,9 +86,11 @@ ifndef::git-pull[] Allow several and arguments to be specified. No s may be specified. +--[no-]auto-maintenance:: --[no-]auto-gc:: - Run `git gc --auto` at the end to perform garbage collection - if needed. This is enabled by default. + Run `git maintenance run --auto` at the end to perform automatic + repository maintenance if needed. (`--[no-]auto-gc` is a synonym.) + This is enabled by default. --[no-]write-commit-graph:: Write a commit-graph after fetching. 
This overrides the config diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt index c898310099..097e6a86c5 100644 --- a/Documentation/git-clone.txt +++ b/Documentation/git-clone.txt @@ -78,9 +78,9 @@ repository using this option and then delete branches (or use any other Git command that makes any existing commit unreferenced) in the source repository, some objects may become unreferenced (or dangling). These objects may be removed by normal Git operations (such as `git commit`) -which automatically call `git gc --auto`. (See linkgit:git-gc[1].) -If these objects are removed and were referenced by the cloned repository, -then the cloned repository will become corrupt. +which automatically call `git maintenance run --auto`. (See +linkgit:git-maintenance[1].) If these objects are removed and were referenced +by the cloned repository, then the cloned repository will become corrupt. + Note that running `git repack` without the `--local` option in a repository cloned with `--shared` will copy objects from the source repository into a pack diff --git a/builtin/am.c b/builtin/am.c index 3f2adb3822..2ca363f72e 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -1795,7 +1795,7 @@ static void am_run(struct am_state *state, int resume) if (!state->rebasing) { am_destroy(state); close_object_store(the_repository->objects); - run_auto_gc(state->quiet); + run_auto_maintenance(state->quiet); } } diff --git a/builtin/commit.c b/builtin/commit.c index 01105ce8b0..9705bfb0cf 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1702,7 +1702,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix) git_test_write_commit_graph_or_die(); repo_rerere(the_repository, 0); - run_auto_gc(quiet); + run_auto_maintenance(quiet); run_commit_hook(use_editor, get_index_file(), "post-commit", NULL); if (amend && !no_post_rewrite) { commit_post_rewrite(the_repository, current_head, &oid); diff --git a/builtin/fetch.c b/builtin/fetch.c index 7953a1a25b..c7c8ac0861 
100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -196,8 +196,10 @@ static struct option builtin_fetch_options[] = { OPT_STRING_LIST(0, "negotiation-tip", &negotiation_tip, N_("revision"), N_("report that we have only objects reachable from this object")), OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options), + OPT_BOOL(0, "auto-maintenance", &enable_auto_gc, + N_("run 'maintenance --auto' after fetching")), OPT_BOOL(0, "auto-gc", &enable_auto_gc, - N_("run 'gc --auto' after fetching")), + N_("run 'maintenance --auto' after fetching")), OPT_BOOL(0, "show-forced-updates", &fetch_show_forced_updates, N_("check for forced-updates on all updated branches")), OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph, @@ -1882,7 +1884,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) close_object_store(the_repository->objects); if (enable_auto_gc) - run_auto_gc(verbosity < 0); + run_auto_maintenance(verbosity < 0); return result; } diff --git a/builtin/merge.c b/builtin/merge.c index 7da707bf55..c068e73037 100644 --- a/builtin/merge.c +++ b/builtin/merge.c @@ -457,7 +457,7 @@ static void finish(struct commit *head_commit, * user should see them. */ close_object_store(the_repository->objects); - run_auto_gc(verbosity < 0); + run_auto_maintenance(verbosity < 0); } } if (new_head && show_diffstat) { diff --git a/builtin/rebase.c b/builtin/rebase.c index 494107a648..d14d18191b 100644 --- a/builtin/rebase.c +++ b/builtin/rebase.c @@ -728,10 +728,10 @@ static int finish_rebase(struct rebase_options *opts) apply_autostash(state_dir_path("autostash", opts)); close_object_store(the_repository->objects); /* - * We ignore errors in 'gc --auto', since the + * We ignore errors in 'git maintenance run --auto', since the * user should see them. 
*/ - run_auto_gc(!(opts->flags & (REBASE_NO_QUIET|REBASE_VERBOSE))); + run_auto_maintenance(!(opts->flags & (REBASE_NO_QUIET|REBASE_VERBOSE))); if (opts->type == REBASE_MERGE) { struct replay_opts replay = REPLAY_OPTS_INIT; diff --git a/run-command.c b/run-command.c index 30104a4ee1..b7e1f1dd5a 100644 --- a/run-command.c +++ b/run-command.c @@ -1866,15 +1866,13 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task, return result; } -int run_auto_gc(int quiet) +int run_auto_maintenance(int quiet) { - struct strvec argv_gc_auto = STRVEC_INIT; - int status; + struct child_process maint = CHILD_PROCESS_INIT; - strvec_pushl(&argv_gc_auto, "gc", "--auto", NULL); - if (quiet) - strvec_push(&argv_gc_auto, "--quiet"); - status = run_command_v_opt(argv_gc_auto.items, RUN_GIT_CMD); - strvec_clear(&argv_gc_auto); - return status; + maint.git_cmd = 1; + strvec_pushl(&maint.args, "maintenance", "run", "--auto", NULL); + strvec_push(&maint.args, quiet ? "--quiet" : "--no-quiet"); + + return run_command(&maint); } diff --git a/run-command.h b/run-command.h index 8b9bfaef16..6472b38bde 100644 --- a/run-command.h +++ b/run-command.h @@ -221,7 +221,7 @@ int run_hook_ve(const char *const *env, const char *name, va_list args); /* * Trigger an auto-gc */ -int run_auto_gc(int quiet); +int run_auto_maintenance(int quiet); #define RUN_COMMAND_NO_STDIN 1 #define RUN_GIT_CMD 2 /*If this is to be git sub-command */ diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index a66dbe0bde..9850ecde5d 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -919,7 +919,7 @@ test_expect_success 'fetching with auto-gc does not lock up' ' git config fetch.unpackLimit 1 && git config gc.autoPackLimit 1 && git config gc.autoDetach false && - GIT_ASK_YESNO="$D/askyesno" git fetch >fetch.out 2>&1 && + GIT_ASK_YESNO="$D/askyesno" git fetch --verbose >fetch.out 2>&1 && test_i18ngrep "Auto packing the repository" fetch.out && ! 
grep "Should I try again" fetch.out ) From patchwork Thu Jul 30 22:24:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693711 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B199E912 for ; Thu, 30 Jul 2020 22:24:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 98A562083B for ; Thu, 30 Jul 2020 22:24:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="pi1N8ue2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730374AbgG3WYo (ORCPT ); Thu, 30 Jul 2020 18:24:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730367AbgG3WYn (ORCPT ); Thu, 30 Jul 2020 18:24:43 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2FB6C061574 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id a14so26304377wra.5 for ; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=arQb+rX+4t5Nf6yBzkvbk7cmF+LQ1a/qca9Qa+LsjV0=; b=pi1N8ue2a3Z1OTlqkSIKYehU3epbxPDPYigCUSb4n1wBWpaAvSkXwByCoF+YeDKtdk fpUGLNH8BUfVaYhOA/l4B6D4sXoqmHVGQeuEFjjxuty4VzMIeyh3n4MqlSGbKHyaftk4 1+Klf8Hi0RBORxSQuUCuJ25IH+0Vq7vwIhYPO8dre3DhO3d/XKAUku7UA+S600Ge6LEi mAHmXZPgwInXr7h9p4mUTvri8sGrjMFi5rZY6rLHdmLHKw/cfseYtl5d29Jrj0LFIKE8 KGTHqi8Zj9ODcoQKn3c+2va2dq6JzmjvMVaBTctJlV1hAhbCdqpF44Y7UGO09025GLRA 
PX0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=arQb+rX+4t5Nf6yBzkvbk7cmF+LQ1a/qca9Qa+LsjV0=; b=lpaRD/lGuVTLYCO+Lpt1mgXMw73j56uwQCjSFLFnUgouXy5hHsGCEyih/4ByIYfiPi 7dhBL8EhImZDiEZUeY3Ywfbrh/msiCQkgMzzBhvAAvyo8amJ5UOOoMKfzs4c6yVq37vN Y///JI0jbIRno7MmjCriDV6gNMX6g5b2Zb5N0fOfNsx1rqLLzoX8lWsAXv61LW02lNUY KIL1Uztr+hQm3/gKBuINfcLKo0zbMWahQHbCUO4kirmkEczfrCcfHJM0rDRPXR9XPCf0 9nK8RWOQAQV8XbNC+ZUFLC9MpJRYYdctrGt2rNJikqCM9jFXjGy+gnSKBr9C1nX/4W0t qE9Q== X-Gm-Message-State: AOAM530ieulfreTPs7QAwLKG3lqQ7gRji8jh7I5Byf1PIZjfexOp35U/ hp852J/J3kjfREHXP3ko+IoWCICZ X-Google-Smtp-Source: ABdhPJyKRlP5nJSQUPDyRLq0KiuF1nb7oXWCI114mEXzwy4SijbHFYhmsfZYWUfaC/rQjZKNKuGdlQ== X-Received: by 2002:adf:cf10:: with SMTP id o16mr649753wrj.380.1596147881553; Thu, 30 Jul 2020 15:24:41 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l7sm1619729wrf.32.2020.07.30.15.24.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:41 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:10 +0000 Subject: [PATCH v3 04/20] maintenance: initialize task array Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee In anticipation of implementing multiple maintenance tasks inside the 'maintenance' builtin, use a list of structs to describe the work to be done. 
The struct maintenance_task stores the name of the task (as given by a
future command-line argument) along with a function pointer to its
implementation and a boolean for whether the step is enabled.

A list of these structs is initialized with the full list of implemented
tasks along with a default order. For now, this list only contains the
"gc" task. This task is also the only task enabled by default.

The run subcommand will return a nonzero exit code if any task fails.
However, it will attempt all tasks in its loop before returning with the
failure. Each failed task also reports an error message.

Helped-by: Taylor Blau
Helped-by: Junio C Hamano
Signed-off-by: Derrick Stolee
---
 builtin/gc.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 3c277f9f9c..0f15162825 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -726,9 +726,45 @@ static int maintenance_task_gc(void)
 	return run_command(&child);
 }
 
+typedef int maintenance_task_fn(void);
+
+struct maintenance_task {
+	const char *name;
+	maintenance_task_fn *fn;
+	unsigned enabled:1;
+};
+
+enum maintenance_task_label {
+	TASK_GC,
+
+	/* Leave as final value */
+	TASK__COUNT
+};
+
+static struct maintenance_task tasks[] = {
+	[TASK_GC] = {
+		"gc",
+		maintenance_task_gc,
+		1,
+	},
+};
+
 static int maintenance_run(void)
 {
-	return maintenance_task_gc();
+	int i;
+	int result = 0;
+
+	for (i = 0; i < TASK__COUNT; i++) {
+		if (!tasks[i].enabled)
+			continue;
+
+		if (tasks[i].fn()) {
+			error(_("task '%s' failed"), tasks[i].name);
+			result = 1;
+		}
+	}
+
+	return result;
 }
 
 int cmd_maintenance(int argc, const char **argv, const char *prefix)

From patchwork Thu Jul 30 22:24:11 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Linus Arver via GitGitGadget
X-Patchwork-Id: 11693715
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org
[172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A1FCC14B7 for ; Thu, 30 Jul 2020 22:24:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8410A2083B for ; Thu, 30 Jul 2020 22:24:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="CX/cBJTW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730406AbgG3WYr (ORCPT ); Thu, 30 Jul 2020 18:24:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43868 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730377AbgG3WYo (ORCPT ); Thu, 30 Jul 2020 18:24:44 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CC67C061574 for ; Thu, 30 Jul 2020 15:24:44 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id g10so5204967wmc.1 for ; Thu, 30 Jul 2020 15:24:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=FO6eJ897hgJYdcvr8miOENffZqXcaswZeSgSb+9MS5w=; b=CX/cBJTWG1jtRk54auIH0BdPoqN/AXTOQL7FkLv42E5MeT55OrERSt6LDiNc42Ae5w 4B0ZiNtYtGSH3l2ltwI9jQWSssomnMCRiEvT/abvKt+Kh0ARhVF2P1SUeLic/DhVB6x5 JBI1NN+4ygM8okxVNYz3X2uujC2OwRtJO1Mb61MtrhlwxAtzh/Hyb+ViDHqC+qdflF7M yJTOM1F+GUqBEuWk3KAt+43QmowjNdJGws1DDz/Fir9VjhUguaXox4arX7oXbKmgWJEx vrGsXlYbuJS7yTOnD8VsGQBsRP/vYPWpQa06O22Isthfee7M0R35SUL/XYXuUW120ENR PbIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=FO6eJ897hgJYdcvr8miOENffZqXcaswZeSgSb+9MS5w=; 
b=FhD6Rf2rOHb+08L6KzGfwKg523pX9OfG1pItd0VV0etiZ80P4pJo2p1ThikG0GKHPp Z2Hawlw4GJD7pnZCiuHCvlCUFG35bVdCDMQr9dXA3ZvtRU/Dj/ivwASdqECdVKWjT6Et YiLAQy5mMHxQrWF0CcdHRgld3Yy8dO8DIQMZIeRbO5jiAQmuByuaKi41jT98bUFIepbB Kn3aVMKQrDmNU0zs1fxRUHcjy6WG6q5GmG7ma0ifglcTPmnEaEVqwOU8XQPTgQ0LML2u bzUHC5eemb/8frJGc4gBEDJeNIVboAfpuGQI70r1lN+t11xX0PksaPQFtstKe2SGn/aY GQkw== X-Gm-Message-State: AOAM531zQU4ay9hsF5DtbgTCRJpofaC+MAMLl/+UbIBQ8ehqN9blnVV6 9P83sX6fyGnkgCs9BP9CysZ9ZJA3 X-Google-Smtp-Source: ABdhPJz1Cjay5KLPRwQ1W52pUiiX7d3Yr/4Z11xyE78hBVosz++1QcnQ1RMLNqPtpYwSRcFPrmxm3w== X-Received: by 2002:a1c:2dc6:: with SMTP id t189mr1203518wmt.26.1596147882456; Thu, 30 Jul 2020 15:24:42 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a3sm10948361wme.34.2020.07.30.15.24.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:41 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:11 +0000 Subject: [PATCH v3 05/20] maintenance: add commit-graph task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The first new task in the 'git maintenance' builtin is the 'commit-graph' job. It is based on the sequence of events in the 'commit-graph' job in Scalar [1]. This sequence is as follows: 1. git commit-graph write --reachable --split 2. git commit-graph verify --shallow 3. If the verify succeeds, stop. 4. Delete the commit-graph-chain file. 5. 
git commit-graph write --reachable --split

By writing an incremental commit-graph file using the "--split" option
we minimize the disruption from this operation. The default behavior is
to merge layers until the new "top" layer is less than half the size of
the layer below. This provides quick writes most of the time, with the
longer writes following a power law distribution.

Most importantly, concurrent Git processes only look at the
commit-graph-chain file for a very short amount of time, so they will
very likely not be holding a handle to the file when we try to replace
it. (This only matters on Windows.)

If a concurrent process reads the old commit-graph-chain file, but our
job expires some of the .graph files before they can be read, then those
processes will see a warning message (but not fail). This could be
avoided by a future update to use the --expire-time argument when
writing the commit-graph.

By using 'git commit-graph verify --shallow' we can ensure that the file
we just wrote is valid. This is an extra safety precaution that is
faster than our 'write' subcommand. In the rare situation that the
newest layer of the commit-graph is corrupt, we can "fix" the corruption
by deleting the commit-graph-chain file and rewriting the full
commit-graph as a new one-layer commit graph. This does not completely
prevent _that_ file from being corrupt, but it does recompute the
commit-graph by parsing commits from the object database. In our use of
this step in Scalar and VFS for Git, we have only seen this issue arise
because our microsoft/git fork reverted 43d3561 ("commit-graph write:
don't die if the existing graph is corrupt" 2019-03-25) for a while to
keep commit-graph writes very fast. We dropped the revert when updating
to v2.23.0.
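To make the layer-merging rule from this message more concrete, here is a small model of it. This is only a sketch under stated assumptions: layers are represented by nothing but their commit counts, the caller provides enough array capacity, and a tie (the top layer being exactly half the layer below) still triggers a merge. The `add_layer` helper is hypothetical and is not a Git function.

```c
#include <stddef.h>

/*
 * Hypothetical model of a split commit-graph chain: layers[0] is the
 * base layer and layers[n - 1] is the top layer. Each entry holds only
 * the number of commits in that layer.
 *
 * Push a new top layer, then merge it downward until the top layer is
 * less than half the size of the layer below it. Returns the new number
 * of layers. The caller must ensure layers[] has room for n + 1 entries.
 */
size_t add_layer(size_t *layers, size_t n, size_t new_commits)
{
	layers[n++] = new_commits;

	while (n > 1 && 2 * layers[n - 1] >= layers[n - 2]) {
		/* Fold the top layer into the one below it. */
		layers[n - 2] += layers[n - 1];
		n--;
	}

	return n;
}
```

Starting from a single layer of 1000 commits, adding layers of 100 and then 10 commits leaves three layers. Adding another 10 merges only the top two (giving 1000, 100, 20), while adding 40 after that cascades into two merges and leaves 1000 and 160. Most writes in this model touch only a small top layer, and the occasional write rewrites a larger one.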
The verify still has potential for catching corrupt data across the
layer boundary: if the new file has commit X with parent Y in an old
file but the commit ID for Y in the old file had a bitswap, then we will
notice that in the 'verify' command.

[1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/CommitGraphStep.cs

Signed-off-by: Derrick Stolee
---
 Documentation/git-maintenance.txt | 18 +++++++++
 builtin/gc.c                      | 63 +++++++++++++++++++++++++++++++
 commit-graph.c                    |  8 ++--
 commit-graph.h                    |  1 +
 t/t7900-maintenance.sh            |  2 +
 5 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 089fa4cedc..35b0be7d40 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -35,6 +35,24 @@ run::
 TASKS
 -----
 
+commit-graph::
+	The `commit-graph` job updates the `commit-graph` files incrementally,
+	then verifies that the written data is correct. If the new layer has an
+	issue, then the chain file is removed and the `commit-graph` is
+	rewritten from scratch.
++
+The verification only checks the top layer of the `commit-graph` chain.
+If the incremental write merged the new commits with at least one
+existing layer, then there is potential for on-disk corruption being
+carried forward into the new file. This will be noticed and the new
+commit-graph file will be clean as Git reparses the commit data from
+the object database.
++
+The incremental write is safe to run alongside concurrent Git processes
+since it will not expire `.graph` files that were in the previous
+`commit-graph-chain` file. They will be deleted by a later run based on
+the expiration delay.
+
 gc::
 	Cleanup unnecessary files and optimize the local repository.
 	"GC" stands for "garbage collection," but this task performs many
diff --git a/builtin/gc.c b/builtin/gc.c
index 0f15162825..ec1bbc3f9e 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -710,6 +710,64 @@ static struct maintenance_opts {
 	int quiet;
 } opts;
 
+static int run_write_commit_graph(void)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "commit-graph", "write",
+		     "--split", "--reachable", NULL);
+
+	if (opts.quiet)
+		strvec_push(&child.args, "--no-progress");
+
+	return !!run_command(&child);
+}
+
+static int run_verify_commit_graph(void)
+{
+	struct child_process child = CHILD_PROCESS_INIT;
+
+	child.git_cmd = 1;
+	strvec_pushl(&child.args, "commit-graph", "verify",
+		     "--shallow", NULL);
+
+	if (opts.quiet)
+		strvec_push(&child.args, "--no-progress");
+
+	return !!run_command(&child);
+}
+
+static int maintenance_task_commit_graph(void)
+{
+	struct repository *r = the_repository;
+	char *chain_path;
+
+	close_object_store(r->objects);
+	if (run_write_commit_graph()) {
+		error(_("failed to write commit-graph"));
+		return 1;
+	}
+
+	if (!run_verify_commit_graph())
+		return 0;
+
+	warning(_("commit-graph verify caught error, rewriting"));
+
+	chain_path = get_commit_graph_chain_filename(r->objects->odb);
+	if (unlink(chain_path)) {
+		UNLEAK(chain_path);
+		die(_("failed to remove commit-graph at %s"), chain_path);
+	}
+	free(chain_path);
+
+	if (!run_write_commit_graph())
+		return 0;
+
+	error(_("failed to rewrite commit-graph"));
+	return 1;
+}
+
 static int maintenance_task_gc(void)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
@@ -736,6 +794,7 @@ struct maintenance_task {
 
 enum maintenance_task_label {
 	TASK_GC,
+	TASK_COMMIT_GRAPH,
 
 	/* Leave as final value */
 	TASK__COUNT
@@ -747,6 +806,10 @@ static struct maintenance_task tasks[] = {
 		maintenance_task_gc,
 		1,
 	},
+	[TASK_COMMIT_GRAPH] = {
+		"commit-graph",
+		maintenance_task_commit_graph,
+	},
 };
 
 static int maintenance_run(void)
diff --git a/commit-graph.c b/commit-graph.c
index 1af68c297d..9705d237e4 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -172,7 +172,7 @@ static char *get_split_graph_filename(struct object_directory *odb,
 		       oid_hex);
 }
 
-static char *get_chain_filename(struct object_directory *odb)
+char *get_commit_graph_chain_filename(struct object_directory *odb)
 {
 	return xstrfmt("%s/info/commit-graphs/commit-graph-chain", odb->path);
 }
@@ -521,7 +521,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r,
 	struct stat st;
 	struct object_id *oids;
 	int i = 0, valid = 1, count;
-	char *chain_name = get_chain_filename(odb);
+	char *chain_name = get_commit_graph_chain_filename(odb);
 	FILE *fp;
 	int stat_res;
 
@@ -1619,7 +1619,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 	}
 
 	if (ctx->split) {
-		char *lock_name = get_chain_filename(ctx->odb);
+		char *lock_name = get_commit_graph_chain_filename(ctx->odb);
 
 		hold_lock_file_for_update_mode(&lk, lock_name,
 					       LOCK_DIE_ON_ERROR, 0444);
@@ -1996,7 +1996,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx)
 	if (ctx->split_opts && ctx->split_opts->expire_time)
 		expire_time = ctx->split_opts->expire_time;
 	if (!ctx->split) {
-		char *chain_file_name = get_chain_filename(ctx->odb);
+		char *chain_file_name = get_commit_graph_chain_filename(ctx->odb);
 		unlink(chain_file_name);
 		free(chain_file_name);
 		ctx->num_commit_graphs_after = 0;
diff --git a/commit-graph.h b/commit-graph.h
index 28f89cdf3e..3c202748c3 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -25,6 +25,7 @@ struct commit;
 struct bloom_filter_settings;
 
 char *get_commit_graph_filename(struct object_directory *odb);
+char *get_commit_graph_chain_filename(struct object_directory *odb);
 int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
 
 /*
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index f08eee0977..ff646abf7c 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -4,6 +4,8 @@
test_description='git maintenance builtin' . ./test-lib.sh +GIT_TEST_COMMIT_GRAPH=0 + test_expect_success 'help text' ' test_expect_code 129 git maintenance -h 2>err && test_i18ngrep "usage: git maintenance run" err From patchwork Thu Jul 30 22:24:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693717 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8947914B7 for ; Thu, 30 Jul 2020 22:24:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BFF620838 for ; Thu, 30 Jul 2020 22:24:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OVt8RD9g" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730419AbgG3WYt (ORCPT ); Thu, 30 Jul 2020 18:24:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730378AbgG3WYp (ORCPT ); Thu, 30 Jul 2020 18:24:45 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DDB66C061575 for ; Thu, 30 Jul 2020 15:24:44 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id r2so21200128wrs.8 for ; Thu, 30 Jul 2020 15:24:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=sWYlfdDhdnxAyiX9S3sbgwmIEMoCljL5jXPkmla5wNg=; b=OVt8RD9gLm0ZVZnfUWngrGG8PeU2BdbpDZ7ytbUqD2768N9nzw48+GjXQqlbtruIfu u/NMuOoq/og5MvTDsASIezVBrUnmw9boNfX7t8w0YwXpHK9wBGZ7tkPY8aaVm5Zstpz3 
m+C2aJccJo5kxNh89475oH/Yu0sn0iZF/N3fojUtax6c/ipGakMZfuVW99lQqhCoFbLr MqVOq7y2exUVmkCiUEHZWhoWgm5g6bzzsSN5VuRhakt9a7Eai1Q/B7sDGyy7o6P52P3c hiCk50ViPeuBtTfoWL1JaaN37HpVxxKe3PqYUShpDOPZnqEPisEWw1nVm1mgsNe5kCGW qA4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=sWYlfdDhdnxAyiX9S3sbgwmIEMoCljL5jXPkmla5wNg=; b=Sc/WdrlDK5qmFpWnZJfBxO0B/FWdG3lCG5+F/lGXH9nubuZ647MCLbuWr0DiEaN5z0 d7aFjJNP5Ooaq8bCdojzAhQpz9wSK7xKpK4zFYz7BGc4ra4MZXopphawcmyiB8NOi+dG fIBb5+3NQxy54D/EywnsDrDNxEnn+kRJf2J7ueTOQMTeHECmqv8XL/+3NPahyEy4SyJE AWIpP51yo1jxdJOMdySh4Uo6h3Xkyl2Uz/AtHtBqslN3y4NYLg4mpfqdH2CGK/yGCReK 8BliaMNBM1WKRO/QPxuXqYPEZZb6IS3ftV/bMIRRbpIwcOJhQ0vdmuaM1GIUFTpz0zMz 2hjg== X-Gm-Message-State: AOAM530dCvQnnALeUXCi/amjtFgWvNSxpIzJEJKMiRZHkofQZxTYEKII qh50gl5UUF3sBTW4OaAs4Boqf+w8 X-Google-Smtp-Source: ABdhPJywGpFU6ItLLGB47YmCdvkcvfj1lOtP5ey9C0/DK5DqSZo9AsrAbUIalHwHW/3w4gq6IS2DTg== X-Received: by 2002:adf:df06:: with SMTP id y6mr661955wrl.89.1596147883404; Thu, 30 Jul 2020 15:24:43 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r22sm3349750wmh.45.2020.07.30.15.24.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:42 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:12 +0000 Subject: [PATCH v3 06/20] maintenance: add --task option Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee A user may want to 
only run certain maintenance tasks in a certain order. Add the
--task=<task> option, which allows a user to specify an ordered list of
tasks to run. A task cannot be selected more than once, however.

Here is where our array of maintenance_task pointers becomes critical.
We can sort the array of pointers based on the task order, but we do not
want to move the struct data itself in order to preserve the hashmap
references. We use the hashmap to match the --task=<task> arguments into
the task struct data.

Keep in mind that the 'enabled' member of the maintenance_task struct is
a placeholder for a future 'maintenance.<task>.enabled' config option.
Thus, we use the 'enabled' member to specify which tasks are run when
the user does not specify any --task=<task> arguments. The 'enabled'
member should be ignored if --task=<task> appears.

Signed-off-by: Derrick Stolee
---
 Documentation/git-maintenance.txt |  4 +++
 builtin/gc.c                      | 59 +++++++++++++++++++++++++++++--
 t/t7900-maintenance.sh            | 23 ++++++++++++
 3 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 35b0be7d40..9204762e21 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -73,6 +73,10 @@ OPTIONS
 --quiet::
 	Do not report progress or other information over `stderr`.
 
+--task=<task>::
+	If this option is specified one or more times, then only run the
+	specified tasks in the specified order.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/gc.c b/builtin/gc.c
index ec1bbc3f9e..b7f64891cd 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -708,6 +708,7 @@ static const char * const builtin_maintenance_usage[] = {
 static struct maintenance_opts {
 	int auto_flag;
 	int quiet;
+	int tasks_selected;
 } opts;
 
 static int run_write_commit_graph(void)
@@ -789,7 +790,9 @@ typedef int maintenance_task_fn(void);
 struct maintenance_task {
 	const char *name;
 	maintenance_task_fn *fn;
-	unsigned enabled:1;
+	unsigned enabled:1,
+		 selected:1;
+	int selected_order;
 };
 
 enum maintenance_task_label {
@@ -812,13 +815,29 @@ static struct maintenance_task tasks[] = {
 	},
 };
 
+static int compare_tasks_by_selection(const void *a_, const void *b_)
+{
+	const struct maintenance_task *a, *b;
+
+	a = (const struct maintenance_task *)a_;
+	b = (const struct maintenance_task *)b_;
+
+	return a->selected_order - b->selected_order;
+}
+
 static int maintenance_run(void)
 {
 	int i;
 	int result = 0;
 
+	if (opts.tasks_selected)
+		QSORT(tasks, TASK__COUNT, compare_tasks_by_selection);
+
 	for (i = 0; i < TASK__COUNT; i++) {
-		if (!tasks[i].enabled)
+		if (opts.tasks_selected && !tasks[i].selected)
+			continue;
+
+		if (!opts.tasks_selected && !tasks[i].enabled)
 			continue;
 
 		if (tasks[i].fn()) {
@@ -830,6 +849,39 @@ static int maintenance_run(void)
 	return result;
 }
 
+static int task_option_parse(const struct option *opt,
+			     const char *arg, int unset)
+{
+	int i;
+	struct maintenance_task *task = NULL;
+
+	BUG_ON_OPT_NEG(unset);
+
+	opts.tasks_selected++;
+
+	for (i = 0; i < TASK__COUNT; i++) {
+		if (!strcasecmp(tasks[i].name, arg)) {
+			task = &tasks[i];
+			break;
+		}
+	}
+
+	if (!task) {
+		error(_("'%s' is not a valid task"), arg);
+		return 1;
+	}
+
+	if (task->selected) {
+		error(_("task '%s' cannot be selected multiple times"), arg);
+		return 1;
+	}
+
+	task->selected = 1;
+	task->selected_order = opts.tasks_selected;
+
+	return 0;
+}
+
 int cmd_maintenance(int argc, const char **argv, const char *prefix)
 {
 	static struct option builtin_maintenance_options[] = {
@@ -837,6 +889,9 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 			 N_("run tasks based on the state of the repository")),
 		OPT_BOOL(0, "quiet", &opts.quiet,
 			 N_("do not report progress or other information over stderr")),
+		OPT_CALLBACK_F(0, "task", NULL, N_("task"),
+			N_("run a specific task"),
+			PARSE_OPT_NONEG, task_option_parse),
 		OPT_END()
 	};
 
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index ff646abf7c..3cdccb24df 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -20,4 +20,27 @@ test_expect_success 'run [--auto|--quiet]' '
 	grep ",\"gc\",\"--quiet\"" run-quiet.txt
 '
 
+test_expect_success 'run --task=<task>' '
+	GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph &&
+	GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" git maintenance run --task=gc &&
+	GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph &&
+	GIT_TRACE2_EVENT="$(pwd)/run-both.txt" git maintenance run --task=commit-graph --task=gc &&
+	! grep ",\"gc\"" run-commit-graph.txt &&
+	grep ",\"gc\"" run-gc.txt &&
+	grep ",\"gc\"" run-both.txt &&
+	grep ",\"commit-graph\",\"write\"" run-commit-graph.txt &&
+	!
grep ",\"commit-graph\",\"write\"" run-gc.txt && + grep ",\"commit-graph\",\"write\"" run-both.txt +' + +test_expect_success 'run --task=bogus' ' + test_must_fail git maintenance run --task=bogus 2>err && + test_i18ngrep "is not a valid task" err +' + +test_expect_success 'run --task duplicate' ' + test_must_fail git maintenance run --task=gc --task=gc 2>err && + test_i18ngrep "cannot be selected multiple times" err +' + test_done From patchwork Thu Jul 30 22:24:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693721 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E0AE912 for ; Thu, 30 Jul 2020 22:24:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 54A8220838 for ; Thu, 30 Jul 2020 22:24:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="d41dX1SX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730438AbgG3WYv (ORCPT ); Thu, 30 Jul 2020 18:24:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43878 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730388AbgG3WYq (ORCPT ); Thu, 30 Jul 2020 18:24:46 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4A45C061574 for ; Thu, 30 Jul 2020 15:24:45 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id r4so23321415wrx.9 for ; Thu, 30 Jul 2020 15:24:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=nkDpWha0F8QROGymjtiq9TmTXWR4raEqKPpr+Lp1vOU=; b=d41dX1SXBhp5NPv88y1gZcO0VWXsjSFrxDni3vlL6IbSYgp+RYE7xzJvzD/HXdwgFq ac5fY37NxsorTvc0yQywSyRg3iA2CWjoKOQ1lzHD2ut6/doBcAgdrTrN00MMt8+lgRnH l1uAkmQeGB8vqO04l5WYTSqm5LpyNEEqGnpCRQbXg7L6U9THLHSnJDduVz60C4MiYDEL 8NXH7LKnPciTG+VqEDt/SwDXBJghfKzS5JNwcGq9TrIEws/Id608YaSfM5qz31fyE3Oj IFx9MqtjJK3tQ+IToGC/plDKR0mUqRVFoekfgWpuWc3pvIY5k9+TLt0P1RTSpmTg9Zhl 4l1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=nkDpWha0F8QROGymjtiq9TmTXWR4raEqKPpr+Lp1vOU=; b=QzKb+pNNwcIBejdUsC1NmbcntisvwCMC6BNxwCgP4rz2MbmRr0EyEKby/TPNQ39USn +CbXdOi3B7Vev2w2h/VG+uMSTtichIJpA/yYTLzCvDOS+fftYt/cQJ790ykC5ix0Ok3K iJGdR+0CdZG2WnY5A03OVWtlAn8R5hSgQ3MvokJmPo7rgze/2bwg575nqQO09ILUWJU+ SHC6bhQxhYMXnh1aRAdW/jr0uUNAR051yJrVgdZ1hYTXTtLZjvAD1Qf3GdEzaBZSxJ59 38k8lOUA1jecyhxCcV1xQ993NUCBMokcTY12m46PUuzlz/l/A2/Ydm9CIaV0cxm6MVMZ f0fQ== X-Gm-Message-State: AOAM531CLtzqkYHACm6aTAVv6QCHyzK8vA601pYskk6vQhuYqzdGbee7 IJiaiEwtePwBK6tV9RnhBtjv+v2D X-Google-Smtp-Source: ABdhPJxHNybtoLuEuFSXXsQY78vmJgbc+mJf2TV/1xqPYoJSyLT24pP9OvbmiMOM2SUpTnM3b7fouw== X-Received: by 2002:a5d:420b:: with SMTP id n11mr678537wrq.11.1596147884412; Thu, 30 Jul 2020 15:24:44 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f9sm10619229wru.47.2020.07.30.15.24.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:43 -0700 (PDT) Message-Id: <1b00524da3c199e833acc8ce400e8ebd332908b0.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:13 +0000 Subject: [PATCH v3 07/20] maintenance: take a lock on the objects directory Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, 
 steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee, Derrick Stolee
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Derrick Stolee

Performing maintenance on a Git repository involves writing data to the
.git directory, which is not safe to do with multiple writers attempting
the same operation. Ensure that only one 'git maintenance' process is
running at a time by holding a file-based lock. The mere presence of the
maintenance.lock file in the objects directory will prevent future
maintenance. This lock is never committed, since it does not represent
meaningful data. Instead, it is only a placeholder.

If the lock file already exists, then skip all maintenance tasks and
exit successfully, staying silent when --auto or --quiet is given. This
will become very important later when we implement the 'fetch' task, as
this is our stop-gap from creating a recursive process loop between
'git fetch' and 'git maintenance run'.

Signed-off-by: Derrick Stolee
---
 builtin/gc.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/builtin/gc.c b/builtin/gc.c
index b7f64891cd..b57bc7b0ff 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -829,6 +829,25 @@ static int maintenance_run(void)
 {
 	int i;
 	int result = 0;
+	struct lock_file lk;
+	struct repository *r = the_repository;
+	char *lock_path = xstrfmt("%s/maintenance", r->objects->odb->path);
+
+	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0) {
+		/*
+		 * Another maintenance command is running.
+		 *
+		 * If --auto was provided, then it is likely due to a
+		 * recursive process stack. Do not report an error in
+		 * that case.
+ */ + if (!opts.auto_flag && !opts.quiet) + error(_("lock file '%s' exists, skipping maintenance"), + lock_path); + free(lock_path); + return 0; + } + free(lock_path); if (opts.tasks_selected) QSORT(tasks, TASK__COUNT, compare_tasks_by_selection); @@ -846,6 +865,7 @@ static int maintenance_run(void) } } + rollback_lock_file(&lk); return result; } From patchwork Thu Jul 30 22:24:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693719 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 36AC2912 for ; Thu, 30 Jul 2020 22:24:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 19B872083B for ; Thu, 30 Jul 2020 22:24:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="n+2q7LW0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730433AbgG3WYu (ORCPT ); Thu, 30 Jul 2020 18:24:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730399AbgG3WYr (ORCPT ); Thu, 30 Jul 2020 18:24:47 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED32CC06174A for ; Thu, 30 Jul 2020 15:24:46 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id c80so7006730wme.0 for ; Thu, 30 Jul 2020 15:24:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=rsKhZ1EbEn0JR7Lln1R/ydFNotYRL48TW1QJwpyVS6I=; 
b=n+2q7LW08BVLGYlpCEdZWeAgF9W19gsxOnAo77qsG/x6bZsPoyXZQLzNeho9SawjtT c/x0cW7TN5kX66byvopI69zdI2+Yeyy/vuHTHmKQzYgOn6Va9WzUN87RpxB4R9EoOsqh EOO8qAYr1l89ACwGBeg8gt3iNuu4vuMFlQueuLmLEiKduLLlsbrXzo52rMs6RIgqPROI pRL03JeuQROOElQww2iLEQoYc7kuxGHxD/qon2BD8x68dtC0eBu3u+mya8/BX1LMff1Z OegYCgxCyGpHW0Zw1btO1UXkCRBGXaDVDkf3X+T3DIIATUks1dlkib2FiCrvd2WDF1iR ZPhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=rsKhZ1EbEn0JR7Lln1R/ydFNotYRL48TW1QJwpyVS6I=; b=H2FknySd1rJx2M1jXgNTmvCaRrsIM8x2zwI30uj7AR0qKpCgqmiNDO7vAIHD38ezQP BbmJQtN62X4ivoJjMplVJNHhWJylGb2wxJxazpFcNc2EHeRWSBii2sbWlojg5l0BvuAp zBA2JwZQgJrYlgMh7xtgSjDaiaix3c28L+3VJMbKhJH/UemYP1x1EtIl5gWK/QfMZEIT JvukZk6AxLxnuvhrcNRFKZdgdQ+p9eJ0hb78E3hhAImFNAZqimCHtV7MN7kN5RxXjUku 645vMq9DNkeO/iUIQr3pzywz8miJ/VmY5ZQ5EFDzHXt/HHmj7BKM3UQUZPHGUsGW/L6w 5I2w== X-Gm-Message-State: AOAM533N3FeuEu/gidBLXjhSi5lLhTkVf43B8BYGzHyif7OSenM0vREu d5tExjhKiN+ekAKOxi0puMHIbrPg X-Google-Smtp-Source: ABdhPJwt7WJiYR8VvB22sTw5yGOb4NGmrgcwWs2LXZIKG4xvdSreW4hZ4j1CZAZnoQiwGLagbU+Fyg== X-Received: by 2002:a7b:c257:: with SMTP id b23mr1072178wmj.164.1596147885332; Thu, 30 Jul 2020 15:24:45 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a3sm10948457wme.34.2020.07.30.15.24.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:44 -0700 (PDT) Message-Id: <0e94e04dcd4e6434a70cf7b676c12cd27f859fd9.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Junio C Hamano via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:14 +0000 Subject: [PATCH v3 08/20] fetch: optionally allow disabling FETCH_HEAD update Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, 
 emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee, Junio C Hamano
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Junio C Hamano

If you run fetch but record the result in remote-tracking branches, and
you either do nothing with the fetched refs (e.g. you are merely
mirroring) or always work from the remote-tracking refs (e.g. you fetch
and then merge origin/branchname separately), you can get away with
having no FETCH_HEAD at all.

Teach "git fetch" a command line option "--[no-]write-fetch-head" and a
"fetch.writeFetchHEAD" configuration variable. Without either, the
default is to write FETCH_HEAD, and the usual rule that the command line
option defeats the configured default applies.

Note that under "--dry-run" mode, FETCH_HEAD is never written; otherwise
you'd see a list of objects in the file that you do not actually have.
Passing `--write-fetch-head` under "--dry-run" does not force
`git fetch` to write the file.

Also note that this option is explicitly passed when "git pull"
internally invokes "git fetch", so that those who configured their
"git fetch" not to write FETCH_HEAD would not be able to break the
cooperation between these two commands. "git pull" must see what
"git fetch" got recorded in FETCH_HEAD to work correctly.

Signed-off-by: Junio C Hamano
Signed-off-by: Derrick Stolee
---
 Documentation/config/fetch.txt  |  7 ++++++
 Documentation/fetch-options.txt | 10 +++++++++
 builtin/fetch.c                 | 19 +++++++++++++---
 builtin/pull.c                  |  3 ++-
 t/t5510-fetch.sh                | 39 +++++++++++++++++++++++++++++++--
 t/t5521-pull-options.sh         | 16 ++++++++++++++
 6 files changed, 88 insertions(+), 6 deletions(-)

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index b20394038d..0aaa05e8c0 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -91,3 +91,10 @@ fetch.writeCommitGraph::
 	merge and the write may take longer. Having an updated commit-graph
 	file helps performance of many Git commands, including `git merge-base`,
 	`git push -f`, and `git log --graph`. Defaults to false.
+
+fetch.writeFetchHEAD::
+	Setting it to false tells `git fetch` not to write the list
+	of remote refs fetched in the `FETCH_HEAD` file directly
+	under `$GIT_DIR`. Can be countermanded from the command
+	line with the `--[no-]write-fetch-head` option. Defaults to
+	true.
diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 495bc8ab5a..6972ad2522 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -64,6 +64,16 @@ documented in linkgit:git-config[1].
 --dry-run::
 	Show what would be done, without making any changes.
 
+ifndef::git-pull[]
+--[no-]write-fetch-head::
+	Write the list of remote refs fetched in the `FETCH_HEAD`
+	file directly under `$GIT_DIR`. This is the default unless
+	the configuration variable `fetch.writeFetchHEAD` is set to
+	false. Passing `--no-write-fetch-head` from the command
+	line tells Git not to write the file. Under `--dry-run`
+	option, the file is never written.
+endif::git-pull[]
+
 -f::
 --force::
 	When 'git fetch' is used with `:` refspec it may
diff --git a/builtin/fetch.c b/builtin/fetch.c
index c7c8ac0861..30ac57dcf6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -56,6 +56,7 @@ static int prune_tags = -1; /* unspecified */
 #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
 
 static int all, append, dry_run, force, keep, multiple, update_head_ok;
+static int write_fetch_head = 1;
 static int verbosity, deepen_relative, set_upstream;
 static int progress = -1;
 static int enable_auto_gc = 1;
@@ -118,6 +119,10 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(k, "fetch.writefetchhead")) {
+		write_fetch_head = git_config_bool(k, v);
+		return 0;
+	}
 	return git_default_config(k, v, cb);
 }
 
@@ -162,6 +167,8 @@ static struct option builtin_fetch_options[] = {
 		    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 	OPT_BOOL(0, "dry-run", &dry_run,
 		 N_("dry run")),
+	OPT_BOOL(0, "write-fetch-head", &write_fetch_head,
+		 N_("write fetched references to the FETCH_HEAD file")),
 	OPT_BOOL('k', "keep", &keep, N_("keep downloaded pack")),
 	OPT_BOOL('u', "update-head-ok", &update_head_ok,
 		    N_("allow updating of HEAD ref")),
@@ -895,7 +902,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	const char *what, *kind;
 	struct ref *rm;
 	char *url;
-	const char *filename = dry_run ? "/dev/null" : git_path_fetch_head(the_repository);
+	const char *filename = (!write_fetch_head
+				?
"/dev/null" + : git_path_fetch_head(the_repository)); int want_status; int summary_width = transport_summary_width(ref_map); @@ -1329,7 +1338,7 @@ static int do_fetch(struct transport *transport, } /* if not appending, truncate FETCH_HEAD */ - if (!append && !dry_run) { + if (!append && write_fetch_head) { retcode = truncate_fetch_head(); if (retcode) goto cleanup; @@ -1596,7 +1605,7 @@ static int fetch_multiple(struct string_list *list, int max_children) int i, result = 0; struct strvec argv = STRVEC_INIT; - if (!append && !dry_run) { + if (!append && write_fetch_head) { int errcode = truncate_fetch_head(); if (errcode) return errcode; @@ -1797,6 +1806,10 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) if (depth || deepen_since || deepen_not.nr) deepen = 1; + /* FETCH_HEAD never gets updated in --dry-run mode */ + if (dry_run) + write_fetch_head = 0; + if (all) { if (argc == 1) die(_("fetch --all does not take a repository argument")); diff --git a/builtin/pull.c b/builtin/pull.c index 858b492af3..4c66db1468 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -527,7 +527,8 @@ static int run_fetch(const char *repo, const char **refspecs) struct strvec args = STRVEC_INIT; int ret; - strvec_pushl(&args, "fetch", "--update-head-ok", NULL); + strvec_pushl(&args, "fetch", "--update-head-ok", + "--write-fetch-head", NULL); /* Shared options */ argv_push_verbosity(&args); diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 9850ecde5d..31c91d0ed2 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -539,13 +539,48 @@ test_expect_success 'fetch into the current branch with --update-head-ok' ' ' -test_expect_success 'fetch --dry-run' ' - +test_expect_success 'fetch --dry-run does not touch FETCH_HEAD' ' rm -f .git/FETCH_HEAD && git fetch --dry-run . && ! test -f .git/FETCH_HEAD ' +test_expect_success '--no-write-fetch-head does not touch FETCH_HEAD' ' + rm -f .git/FETCH_HEAD && + git fetch --no-write-fetch-head . && + ! 
test -f .git/FETCH_HEAD +' + +test_expect_success '--write-fetch-head gets defeated by --dry-run' ' + rm -f .git/FETCH_HEAD && + git fetch --dry-run --write-fetch-head . && + ! test -f .git/FETCH_HEAD +' + +test_expect_success 'fetch.writeFetchHEAD and FETCH_HEAD' ' + rm -f .git/FETCH_HEAD && + git -c fetch.writeFetchHEAD=no fetch . && + ! test -f .git/FETCH_HEAD +' + +test_expect_success 'fetch.writeFetchHEAD gets defeated by --dry-run' ' + rm -f .git/FETCH_HEAD && + git -c fetch.writeFetchHEAD=yes fetch --dry-run . && + ! test -f .git/FETCH_HEAD +' + +test_expect_success 'fetch.writeFetchHEAD and --no-write-fetch-head' ' + rm -f .git/FETCH_HEAD && + git -c fetch.writeFetchHEAD=yes fetch --no-write-fetch-head . && + ! test -f .git/FETCH_HEAD +' + +test_expect_success 'fetch.writeFetchHEAD and --write-fetch-head' ' + rm -f .git/FETCH_HEAD && + git -c fetch.writeFetchHEAD=no fetch --write-fetch-head . && + test -f .git/FETCH_HEAD +' + test_expect_success "should be able to fetch with duplicate refspecs" ' mkdir dups && ( diff --git a/t/t5521-pull-options.sh b/t/t5521-pull-options.sh index 159afa7ac8..1acae3b9a4 100755 --- a/t/t5521-pull-options.sh +++ b/t/t5521-pull-options.sh @@ -77,6 +77,7 @@ test_expect_success 'git pull -q -v --no-rebase' ' test_must_be_empty out && test -s err) ' + test_expect_success 'git pull --cleanup errors early on invalid argument' ' mkdir clonedcleanup && (cd clonedcleanup && git init && @@ -85,6 +86,21 @@ test_expect_success 'git pull --cleanup errors early on invalid argument' ' test -s err) ' +test_expect_success 'git pull --no-write-fetch-head fails' ' + mkdir clonedwfh && + (cd clonedwfh && git init && + test_must_fail git pull --no-write-fetch-head "../parent" >out 2>err && + test_must_be_empty out && + test_i18ngrep "no-write-fetch-head" err) +' + +test_expect_success 'git pull succeeds with fetch.writeFetchHEAD=false' ' + mkdir clonedwfhconfig && + (cd clonedwfhconfig && git init && + git config fetch.writeFetchHEAD false && + 
git pull "../parent" >out 2>err && + grep FETCH_HEAD err) +' test_expect_success 'git pull --force' ' mkdir clonedoldstyle && From patchwork Thu Jul 30 22:24:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693725 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8BD1014B7 for ; Thu, 30 Jul 2020 22:24:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6D0FC20838 for ; Thu, 30 Jul 2020 22:24:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VfanJcO2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730478AbgG3WYy (ORCPT ); Thu, 30 Jul 2020 18:24:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730412AbgG3WYs (ORCPT ); Thu, 30 Jul 2020 18:24:48 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDDD9C061756 for ; Thu, 30 Jul 2020 15:24:47 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id a14so26304515wra.5 for ; Thu, 30 Jul 2020 15:24:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=PinCAdMsqIP+MXB5Nud7V9IOva8JaEQYH5qk7nJu2nA=; b=VfanJcO2rCMWBUEKfTyCvfKlmmaZeexD0aX9KlDq8lCjecp3q6U6rcSftqD9byfCqx Dcf7Z3AIsw5edxfLN1txY8FKPwnqd4urMsLUXMKpanHGlVWIC9yCrN5Mi/jWKE03aGNl iILJYwS1q9z51Zl4xMaTh3CD2gt3CDmxR2q7/2Bx77pek/YrfUQj/6QJtN5RhoYaEdXb 
KguNYdYowEpQppyYbI3Oml5R8cx6Ih3xtoh5Fzg8iqcacvWejLWzlssxXwky3erOzTed pwaMvz62+YzayeJaVx2ehJAjZjs6SO+yo2IMxuie7m6RDuYORatA404AFeSh1GTD62xK /gpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=PinCAdMsqIP+MXB5Nud7V9IOva8JaEQYH5qk7nJu2nA=; b=ugYSELVmnXxabu0zzjS9CLLoVU1b0eTECJ2FmwT2QjWVDLApD+Wcr7UrxwuOwci/Nz UlTgFrJzK+yXrvZjVcWBTq96qiwrIkNQiZgcjfKBudFxm5TpyKzcFrUMRnvxdd7Trcr8 ISDIQP42PtT/f9C2gLVBj3mFNd7PeSeMTu/1ayAR2sR47w6iiUoXloJGQerTFc+T8jjH k5SAl9lO6oOI+jprL1OL1TDgVTvQ1J7BD5Rks3ev7DdB71hf9pMcJqWEkzlm5bWBYvej fJlmfyON+rofioDCNKpF1ft9M0xgK0orn/aThVUG/kZagLmovXnhxlUUiyUKD2hTY0ot hfOA== X-Gm-Message-State: AOAM5300e2kLuzB0dzbhTDe7Yh4k2cJMpPu8abN/q6SEPO6Jv9UPiauF Jx2phH5kLitwK/XBHo0aqsGfXZBA X-Google-Smtp-Source: ABdhPJyMty2Q/8OrPJXjRcJtDsgUCEfCUyUBKEEZZ7d3/GKYpzh9wunss50khUCPsFkRmdFp5BpzrQ== X-Received: by 2002:adf:f44b:: with SMTP id f11mr739700wrp.114.1596147886324; Thu, 30 Jul 2020 15:24:46 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d21sm1740137wmd.41.2020.07.30.15.24.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:45 -0700 (PDT) Message-Id: <9e38ade15c3a3eeec58dc262f1835d344c4899c0.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:15 +0000 Subject: [PATCH v3 09/20] maintenance: add prefetch task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee When 
working with very large repositories, an incremental 'git fetch' command can download a large amount of data. If there are many other users pushing to a common repo, then this data can rival the initial pack-file size of a 'git clone' of a medium-size repo. Users may want to keep the data on their local repos as close as possible to the data on the remote repos by fetching periodically in the background. This can break up a large daily fetch into several smaller hourly fetches. The task is called "prefetch" because it is work done in advance of a foreground fetch to make that 'git fetch' command much faster. However, if we simply ran 'git fetch ' in the background, then the user running a foreground 'git fetch ' would lose some important feedback when a new branch appears or an existing branch updates. This is especially true if a remote branch is force-updated and this isn't noticed by the user because it occurred in the background. Further, the functionality of 'git push --force-with-lease' becomes suspect. When running 'git fetch ' in the background, use the following options for careful updating: 1. --no-tags prevents getting a new tag when a user wants to see the new tags appear in their foreground fetches. 2. --refmap= removes the configured refspec which usually updates refs/remotes//* with the refs advertised by the remote. While this looks confusing, this was documented and tested by b40a50264ac (fetch: document and test --refmap="", 2020-01-21), including this sentence in the documentation: Providing an empty `` to the `--refmap` option causes Git to ignore the configured refspecs and rely entirely on the refspecs supplied as command-line arguments. 3. By adding a new refspec "+refs/heads/*:refs/prefetch//*" we can ensure that we actually load the new values somewhere in our refspace while not updating refs/heads or refs/remotes. By storing these refs here, the commit-graph job will update the commit-graph with the commits from these hidden refs. 4.
--prune will delete the refs/prefetch/ refs that no longer appear on the remote. 5. --no-write-fetch-head prevents updating FETCH_HEAD. We've been using this step as a critical background job in Scalar [1] (and VFS for Git). This solved a pain point that was showing up in user reports: fetching was a pain! Users do not like waiting to download the data that was created while they were away from their machines. After implementing background fetch, the foreground fetch commands sped up significantly because they mostly just update refs and download a small amount of new data. The effect is especially dramatic when paired with --no-show-forced-updates (through fetch.showForcedUpdates=false). [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 15 +++++++++ builtin/gc.c | 52 +++++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 24 ++++++++++++++ 3 files changed, 91 insertions(+) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 9204762e21..d134192fa8 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -53,6 +53,21 @@ since it will not expire `.graph` files that were in the previous `commit-graph-chain` file. They will be deleted by a later run based on the expiration delay. +prefetch:: + The `prefetch` task updates the object directory with the latest + objects from all registered remotes. For each remote, a `git fetch` + command is run. The refmap is custom to avoid updating local or remote + branches (those in `refs/heads` or `refs/remotes`). Instead, the + remote refs are stored in `refs/prefetch//`. Also, tags are + not updated. ++ +This is done to avoid disrupting the remote-tracking branches. The end users +expect these refs to stay unmoved unless they initiate a fetch.
With the prefetch +task, however, the objects necessary to complete a later real fetch would +already be obtained, so the real fetch would go faster. In the ideal case, +it will just become an update to a bunch of remote-tracking branches without +any object transfer. + gc:: Cleanup unnecessary files and optimize the local repository. "GC" stands for "garbage collection," but this task performs many diff --git a/builtin/gc.c b/builtin/gc.c index b57bc7b0ff..1f20428286 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -28,6 +28,7 @@ #include "blob.h" #include "tree.h" #include "promisor-remote.h" +#include "remote.h" #define FAILED_RUN "failed to run %s" @@ -769,6 +770,52 @@ static int maintenance_task_commit_graph(void) return 1; } +static int fetch_remote(const char *remote) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_pushl(&child.args, "fetch", remote, "--prune", "--no-tags", + "--no-write-fetch-head", "--refmap=", NULL); + + strvec_pushf(&child.args, "+refs/heads/*:refs/prefetch/%s/*", remote); + + if (opts.quiet) + strvec_push(&child.args, "--quiet"); + + return !!run_command(&child); +} + +static int fill_each_remote(struct remote *remote, void *cbdata) +{ + struct string_list *remotes = (struct string_list *)cbdata; + + string_list_append(remotes, remote->name); + return 0; +} + +static int maintenance_task_prefetch(void) +{ + int result = 0; + struct string_list_item *item; + struct string_list remotes = STRING_LIST_INIT_DUP; + + if (for_each_remote(fill_each_remote, &remotes)) { + error(_("failed to fill remotes")); + result = 1; + goto cleanup; + } + + for (item = remotes.items; + item && item < remotes.items + remotes.nr; + item++) + result |= fetch_remote(item->string); + +cleanup: + string_list_clear(&remotes, 0); + return result; +} + static int maintenance_task_gc(void) { struct child_process child = CHILD_PROCESS_INIT; @@ -796,6 +843,7 @@ struct maintenance_task { }; enum maintenance_task_label { + TASK_PREFETCH,
TASK_GC, TASK_COMMIT_GRAPH, @@ -804,6 +852,10 @@ enum maintenance_task_label { }; static struct maintenance_task tasks[] = { + [TASK_PREFETCH] = { + "prefetch", + maintenance_task_prefetch, + }, [TASK_GC] = { "gc", maintenance_task_gc, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 3cdccb24df..5294396a24 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -43,4 +43,28 @@ test_expect_success 'run --task duplicate' ' test_i18ngrep "cannot be selected multiple times" err ' +test_expect_success 'run --task=prefetch with no remotes' ' + git maintenance run --task=prefetch 2>err && + test_must_be_empty err +' + +test_expect_success 'prefetch multiple remotes' ' + git clone . clone1 && + git clone . clone2 && + git remote add remote1 "file://$(pwd)/clone1" && + git remote add remote2 "file://$(pwd)/clone2" && + git -C clone1 switch -c one && + git -C clone2 switch -c two && + test_commit -C clone1 one && + test_commit -C clone2 two && + GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch && + grep ",\"fetch\",\"remote1\"" run-prefetch.txt && + grep ",\"fetch\",\"remote2\"" run-prefetch.txt && + test_path_is_missing .git/refs/remotes && + test_cmp clone1/.git/refs/heads/one .git/refs/prefetch/remote1/one && + test_cmp clone2/.git/refs/heads/two .git/refs/prefetch/remote2/two && + git log prefetch/remote1/one && + git log prefetch/remote2/two +' + test_done From patchwork Thu Jul 30 22:24:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693723 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A13A014B7 for ; Thu, 30 Jul 2020 22:24:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 83D2C20838 for ; Thu, 30 Jul 
2020 22:24:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dit74pY4" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730454AbgG3WYw (ORCPT ); Thu, 30 Jul 2020 18:24:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730424AbgG3WYt (ORCPT ); Thu, 30 Jul 2020 18:24:49 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02F35C061575 for ; Thu, 30 Jul 2020 15:24:49 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id p14so7023731wmg.1 for ; Thu, 30 Jul 2020 15:24:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=/R5LWWXKMmJRdlNwqXfPKs3HG+n52TuWAZYKO2N8KvA=; b=dit74pY4t+qxUdyx6M/ZLbReeNW8q1UjjHRHM5+06QMmiEYNiprxmQElBJXR6AXJ8Z yOvTL+/pQ9RDFx0R5aB6kqQOomteGY+oYCVLN9GE8Z0Cs1IRbFfBlyqz5r4YA1UEs/mU jRKZApj43pFHkztAU8Xu4PtvlYJXPmPYuTGBb/qEXnVJhAjptmk3R2hV/a++SyFYSqLp FFRxPn1WlfjBjPmlH1+FRGdw6qUKsEcAKvtIoRfiE0oTMmx9DYA2EvvIfSq1s9fvIe5v xKBjzFHt8b7MrYOOLFx1k3wVKEP+d9q0eX6GRW5xqCg70rhxd7d8RcS9bLEYVR8AL6R8 24XQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=/R5LWWXKMmJRdlNwqXfPKs3HG+n52TuWAZYKO2N8KvA=; b=G03D4szn1VSXCtOQ4QRdkeWf+o8HhMzcXxglmd8gjCAEN7TY5IaiXDBW7jkjU6cZ8r 6Pi1lBZo2BUyxWUxplYkgKmidFGQZbCfTBtHHPWOyJu6+xp9Fb1gQb46IspReN5nFAb7 Xn5mAP0AOBKIzX2cGLJeUUZIt+3XGIIbT7zk4XszNMmQXL3uGHlTd3lXSDYu54Aor3GS ISEHrSii65PNVpS4k9yK2Daw9X1J30li9hjaPTrkd8g0v68A34NweiqG9UL2jfVNgxKp 
1R4x6UNpCf9bapMc5zlkkvnUScl/5hlfy6XE6FpW4WKxY0UozX+sfkPoXRm40/FIw//2 ynwQ== X-Gm-Message-State: AOAM532aLG4bxAB65MrSMbbBMTqydhZI2f5iVWIj33g/6R2OXbPT5OWt f5M1Bz7CG2sgUQnWSAgPaPnyun9v X-Google-Smtp-Source: ABdhPJxu7//6q3j6cv4sSMdh/UdYZSMtpCs574sZN2bFxZv7qn9DcO0+onOJj4MU1bNGFDnMD9OzRA== X-Received: by 2002:a1c:b7c2:: with SMTP id h185mr1221009wmf.168.1596147887319; Thu, 30 Jul 2020 15:24:47 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l18sm11328819wrm.52.2020.07.30.15.24.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:46 -0700 (PDT) Message-Id: <0128fdfd1ab286ea27437bfe93a9dbb532444277.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:16 +0000 Subject: [PATCH v3 10/20] maintenance: add loose-objects task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee One goal of background maintenance jobs is to allow a user to disable auto-gc (gc.auto=0) but keep their repository in a clean state. Without any cleanup, loose objects will clutter the object database and slow operations. In addition, the loose objects will take up extra space because they are not stored with deltas against similar objects. Create a 'loose-objects' task for the 'git maintenance run' command. This helps clean up loose objects without disrupting concurrent Git commands using the following sequence of events: 1. Run 'git prune-packed' to delete any loose objects that exist in a pack-file. 
Concurrent commands will prefer the packed version of the object to the loose version. (Of course, there are exceptions for commands that specifically care about the location of an object. These are rare for a user to run on purpose, and we hope a user that has selected background maintenance will not be trying to do foreground maintenance.) 2. Run 'git pack-objects' on a batch of loose objects. These objects are grouped by scanning the loose object directories in lexicographic order until listing all loose objects -or- reaching 50,000 objects. This is more than enough if the loose objects are created only by a user doing normal development. We noticed users with _millions_ of loose objects because VFS for Git downloads blobs on-demand when a file read operation requires populating a virtual file. This has potential of happening in partial clones if someone runs 'git grep' or otherwise evades the batch-download feature for requesting promisor objects. This step is based on a similar step in Scalar [1] and VFS for Git. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 11 ++++ builtin/gc.c | 97 +++++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 39 +++++++++++++ 3 files changed, 147 insertions(+) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index d134192fa8..077929b691 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -76,6 +76,17 @@ gc:: It can also be disruptive in some situations, as it deletes stale data. +loose-objects:: + The `loose-objects` job cleans up loose objects and places them into + pack-files. In order to prevent race conditions with concurrent Git + commands, it follows a two-step process. 
First, it deletes any loose + objects that already exist in a pack-file; concurrent Git processes + will examine the pack-file for the object data instead of the loose + object. Second, it creates a new pack-file (starting with "loose-") + containing a batch of loose objects. The batch size is limited to 50 + thousand objects to prevent the job from taking too long on a + repository with many loose objects. + OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index 1f20428286..96ded73b2f 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -832,6 +832,98 @@ static int maintenance_task_gc(void) return run_command(&child); } +static int prune_packed(void) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_push(&child.args, "prune-packed"); + + if (opts.quiet) + strvec_push(&child.args, "--quiet"); + + return !!run_command(&child); +} + +struct write_loose_object_data { + FILE *in; + int count; + int batch_size; +}; + +static int bail_on_loose(const struct object_id *oid, + const char *path, + void *data) +{ + return 1; +} + +static int write_loose_object_to_stdin(const struct object_id *oid, + const char *path, + void *data) +{ + struct write_loose_object_data *d = (struct write_loose_object_data *)data; + + fprintf(d->in, "%s\n", oid_to_hex(oid)); + + return ++(d->count) > d->batch_size; +} + +static int pack_loose(void) +{ + struct repository *r = the_repository; + int result = 0; + struct write_loose_object_data data; + struct child_process pack_proc = CHILD_PROCESS_INIT; + + /* + * Do not start pack-objects process + * if there are no loose objects. 
+ */ + if (!for_each_loose_file_in_objdir(r->objects->odb->path, + bail_on_loose, + NULL, NULL, NULL)) + return 0; + + pack_proc.git_cmd = 1; + + strvec_push(&pack_proc.args, "pack-objects"); + if (opts.quiet) + strvec_push(&pack_proc.args, "--quiet"); + strvec_pushf(&pack_proc.args, "%s/pack/loose", r->objects->odb->path); + + pack_proc.in = -1; + + if (start_command(&pack_proc)) { + error(_("failed to start 'git pack-objects' process")); + return 1; + } + + data.in = xfdopen(pack_proc.in, "w"); + data.count = 0; + data.batch_size = 50000; + + for_each_loose_file_in_objdir(r->objects->odb->path, + write_loose_object_to_stdin, + NULL, + NULL, + &data); + + fclose(data.in); + + if (finish_command(&pack_proc)) { + error(_("failed to finish 'git pack-objects' process")); + result = 1; + } + + return result; +} + +static int maintenance_task_loose_objects(void) +{ + return prune_packed() || pack_loose(); +} + typedef int maintenance_task_fn(void); struct maintenance_task { @@ -844,6 +936,7 @@ struct maintenance_task { enum maintenance_task_label { TASK_PREFETCH, + TASK_LOOSE_OBJECTS, TASK_GC, TASK_COMMIT_GRAPH, @@ -856,6 +949,10 @@ static struct maintenance_task tasks[] = { "prefetch", maintenance_task_prefetch, }, + [TASK_LOOSE_OBJECTS] = { + "loose-objects", + maintenance_task_loose_objects, + }, [TASK_GC] = { "gc", maintenance_task_gc, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 5294396a24..27a423a4f2 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -67,4 +67,43 @@ test_expect_success 'prefetch multiple remotes' ' git log prefetch/remote2/two ' +test_expect_success 'loose-objects task' ' + # Repack everything so we know the state of the object dir + git repack -adk && + + # Hack to stop maintenance from running during "git commit" + echo in use >.git/objects/maintenance.lock && + + # Assuming that "git commit" creates at least one loose object + test_commit create-loose-object && + rm .git/objects/maintenance.lock && + + ls 
.git/objects >obj-dir-before && + test_file_not_empty obj-dir-before && + ls .git/objects/pack/*.pack >packs-before && + test_line_count = 1 packs-before && + + # The first run creates a pack-file + # but does not delete loose objects. + git maintenance run --task=loose-objects && + ls .git/objects >obj-dir-between && + test_cmp obj-dir-before obj-dir-between && + ls .git/objects/pack/*.pack >packs-between && + test_line_count = 2 packs-between && + ls .git/objects/pack/loose-*.pack >loose-packs && + test_line_count = 1 loose-packs && + + # The second run deletes loose objects + # but does not create a pack-file. + git maintenance run --task=loose-objects && + ls .git/objects >obj-dir-after && + cat >expect <<-\EOF && + info + pack + EOF + test_cmp expect obj-dir-after && + ls .git/objects/pack/*.pack >packs-after && + test_cmp packs-between packs-after +' + test_done From patchwork Thu Jul 30 22:24:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693727 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0C167912 for ; Thu, 30 Jul 2020 22:24:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8D402083B for ; Thu, 30 Jul 2020 22:24:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="eC7eUDPY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730466AbgG3WYx (ORCPT ); Thu, 30 Jul 2020 18:24:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730432AbgG3WYu (ORCPT ); Thu, 30 Jul 2020 18:24:50 -0400 Received: from mail-wr1-x444.google.com 
(mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF2F8C061574 for ; Thu, 30 Jul 2020 15:24:49 -0700 (PDT) Received: by mail-wr1-x444.google.com with SMTP id b6so26270311wrs.11 for ; Thu, 30 Jul 2020 15:24:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Ry0hpw+x89x8hU9nQ5fnnsze4ID6eAXp+p+08m26Kb4=; b=eC7eUDPYUJayHzjhMDv9lmZiogBJpvxrFxXFAOKOExrn2TlpjHP4EOKr5oavxVWoRm ZiGzd8X+2Ml53Ow7pXsu/5GHtt4rC+NSgud0ujiUjLA0ZeuR56KWxOhqwDFY4ObDdD+O 21C3EHYukQFGuI/CIAZ5lAp2KJTgZMiU6BAbEtiwqIDGXls5wjnnj6dUn/IiA0/vSQ96 QABMdKxTJ8jXNt/mVS/opyLJHRWDk9EWn3QHu20jdjg2YUCVVp8eT/O9pVOCVz1wHh6B Ep8MMn08EbnO4GFF/SYL11Ez9lfjtrDSGvPISQ56dnS2ftdLNtI1s6jdzdknmZvUgFG/ bN+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Ry0hpw+x89x8hU9nQ5fnnsze4ID6eAXp+p+08m26Kb4=; b=PTpX2iDkAScAntbv7K5JG/QSlvPPrTGdOuAouLeHSSWFEs1PKNijKkK3TfHNLRFFow DAsXuOr0eZ9vGGyuJnRmk7D/TMgeN83OzIdA6zfm+t6pOaxG1T7cvm5tuCKv4lyawrs7 M+HCkBQ2X+EVmHSbvJlOgi+31C0rdWAfSeGF9Jm+LoK4ngpJZie1OX/j+38WAwSrUsgB jttq4YzNdONvpWZ/auzZ4oW5gOMMqoBZ6X4FSl08BPfi3XTsGhe8rj6XfrAx2U0bH2V9 KL2enudmXSJuKjrGB74HAY05ZLy8/SYmPMsRlq/uEAQnVKACHyYUdeweBwPRAON4GRqX wOng== X-Gm-Message-State: AOAM5320dAFKAfYUYq5i5adXy5y44LuB4l/bcpGfh6nNstEEYuSPpeAM G7XtivYuWjfKtKFZ0LFeFVEv8sTt X-Google-Smtp-Source: ABdhPJy2tifHmm/ZoTYXV1Ti3B73fpOgkBqU2Z2L0NCF/AYEg1dHJ8BUx/HF7HufOhWnHpbU+6xQmQ== X-Received: by 2002:a5d:4241:: with SMTP id s1mr659046wrr.411.1596147888259; Thu, 30 Jul 2020 15:24:48 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u186sm10572007wmu.10.2020.07.30.15.24.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 
2020 15:24:47 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:17 +0000 Subject: [PATCH v3 11/20] midx: enable core.multiPackIndex by default Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The core.multiPackIndex setting has been around since c4d25228ebb (config: create core.multiPackIndex setting, 2018-07-12), but has been disabled by default. If a user wishes to use the multi-pack-index feature, then they must enable this config and run 'git multi-pack-index write'. The multi-pack-index feature is relatively stable now, so make the config option true by default. For users that do not use a multi-pack-index, the only extra cost will be a file lookup to see if a multi-pack-index file exists (once per process, per object directory). Also, this config option will be referenced by an upcoming "incremental-repack" task in the maintenance builtin, so move the config option into the repository settings struct. Note that if GIT_TEST_MULTI_PACK_INDEX=1, then we want to ignore the config option and treat core.multiPackIndex as enabled. 
Signed-off-by: Derrick Stolee --- Documentation/config/core.txt | 4 ++-- midx.c | 11 +++-------- repo-settings.c | 6 ++++++ repository.h | 2 ++ 4 files changed, 13 insertions(+), 10 deletions(-) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 74619a9c03..86c91d5381 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -606,8 +606,8 @@ core.useReplaceRefs:: core.multiPackIndex:: Use the multi-pack-index file to track multiple packfiles using a - single index. See link:technical/multi-pack-index.html[the - multi-pack-index design document]. + single index. See linkgit:git-multi-pack-index[1] for more + information. Defaults to true. core.sparseCheckout:: Enable "sparse checkout" feature. See linkgit:git-sparse-checkout[1] diff --git a/midx.c b/midx.c index a5fb797ede..ef499cf504 100644 --- a/midx.c +++ b/midx.c @@ -10,6 +10,7 @@ #include "progress.h" #include "trace2.h" #include "run-command.h" +#include "repository.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -384,15 +385,9 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i { struct multi_pack_index *m; struct multi_pack_index *m_search; - int config_value; - static int env_value = -1; - if (env_value < 0) - env_value = git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0); - - if (!env_value && - (repo_config_get_bool(r, "core.multipackindex", &config_value) || - !config_value)) + prepare_repo_settings(r); + if (!r->settings.core_multi_pack_index) return 0; for (m_search = r->objects->multi_pack_index; m_search; m_search = m_search->next) diff --git a/repo-settings.c b/repo-settings.c index 0918408b34..5bd2c22726 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -1,6 +1,7 @@ #include "cache.h" #include "config.h" #include "repository.h" +#include "midx.h" #define UPDATE_DEFAULT_BOOL(s,v) do { if (s == -1) { s = v; } } while(0) @@ -47,6 +48,11 @@ void prepare_repo_settings(struct repository *r) 
r->settings.pack_use_sparse = value; UPDATE_DEFAULT_BOOL(r->settings.pack_use_sparse, 1); + value = git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0); + if (value || !repo_config_get_bool(r, "core.multipackindex", &value)) + r->settings.core_multi_pack_index = value; + UPDATE_DEFAULT_BOOL(r->settings.core_multi_pack_index, 1); + if (!repo_config_get_bool(r, "feature.manyfiles", &value) && value) { UPDATE_DEFAULT_BOOL(r->settings.index_version, 4); UPDATE_DEFAULT_BOOL(r->settings.core_untracked_cache, UNTRACKED_CACHE_WRITE); diff --git a/repository.h b/repository.h index 3c1f7d54bd..3901ce0b65 100644 --- a/repository.h +++ b/repository.h @@ -37,6 +37,8 @@ struct repo_settings { int pack_use_sparse; enum fetch_negotiation_setting fetch_negotiation_algorithm; + + int core_multi_pack_index; }; struct repository { From patchwork Thu Jul 30 22:24:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693735 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EF25914DD for ; Thu, 30 Jul 2020 22:25:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D26B02083B for ; Thu, 30 Jul 2020 22:25:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ExoMiR+z" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730491AbgG3WY6 (ORCPT ); Thu, 30 Jul 2020 18:24:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730448AbgG3WYw (ORCPT ); Thu, 30 Jul 2020 18:24:52 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id D491BC061575 for ; Thu, 30 Jul 2020 15:24:51 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id c80so7006838wme.0 for ; Thu, 30 Jul 2020 15:24:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=PasK1usoUHuqqBJ/4S4gEHtQklNT/S528NCSF7FdP14=; b=ExoMiR+z9Asz+riCJ+lGykLmjhR7RUPd1T6ZBHGduC028624Ue4q/STtvWyyOBZ9IE Tjt+Stfju9bmfUjgwSAqj2MPk4VV8ibEofDdWz0Ev3g5+4YffKkzwrHJAUAoSuyqUFuZ UEjbl5lC+ilPJ/B/yQMPQPO8RS6aJipzm8DeKw5n4mZ3kTaaBofUGERDbyaobpiqQGR6 WYjNaJPzz6j7eM/1C07QGSduJlSlfvyIBT/9MbRMSh9xBtP1eu/nOkPpO9zdmUp2XjEN bGmoBCft/V8N1EYgm9hCuJogqn4erWFWU7BqH5wE48M2IkeSaXyxGm13h1xLN2xaMym7 513Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=PasK1usoUHuqqBJ/4S4gEHtQklNT/S528NCSF7FdP14=; b=WsNKFsdmDQzYO/f45F7OeGe9fGJ+5ArLleG0gLXgR5txJYqJbR0S4FbipTbqfD4UFr qKNaA1+XAcCCyXCX2XxT0X+LwSakVORJfo9knThoGC3DjvId2rW5v2TgWEsg0PlU+wLK a140umV+KXcVfdG6jQxX6s9OVOcMuRL2o+SaH7yrJd7Et5JsJhIFIhYrMkQ/PBj3UgDa FzB0uzlJjJzazxe7bHHOdYc30t/F1x6IrBjC5/knOmpqXGk8XoB+c/Z1HhBLuKv+aAxr VARQnTpzaQzQljGVUd86Rg6SqPr0P7r9vMX/lg9MIxOJ06g/Rop3Csh80r559EQ91lfQ JDlQ== X-Gm-Message-State: AOAM530i9Foetsvtzn7t0w5ZQQicBm1HssmYcKjclSbpnbWm9qbD1iYE iguNHsKt4g/NsSia/0Zyy4MX9xl7 X-Google-Smtp-Source: ABdhPJwlxOb35hDnWiJRMyYb0Ztp/rXMY1faYxhNHaX2wraXgCj21IFJr9dW+3C1fTOqiMmB89svAg== X-Received: by 2002:a1c:cc12:: with SMTP id h18mr1217712wmb.56.1596147889326; Thu, 30 Jul 2020 15:24:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o126sm12047706wma.20.2020.07.30.15.24.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:48 -0700 (PDT) Message-Id: 
<00f47c48484c16987592d8b39fbd70e2744e3d4a.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:18 +0000 Subject: [PATCH v3 12/20] maintenance: add incremental-repack task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee

The previous change cleaned up loose objects using the 'loose-objects' task, which can be run safely in the background. Add a similar job that performs similar cleanups for pack-files.

One issue with running 'git repack' is that it is designed to repack all pack-files into a single pack-file. While this is the most space-efficient way to store object data, it is not time or memory efficient. This becomes extremely important if the repo is so large that a user struggles to store two copies of the pack on their disk.

Instead, perform an "incremental" repack by collecting a few small pack-files into a new pack-file. The multi-pack-index facilitates this process ever since 'git multi-pack-index expire' was added in 19575c7 (multi-pack-index: implement 'expire' subcommand, 2019-06-10) and 'git multi-pack-index repack' was added in ce1e4a1 (midx: implement midx_repack(), 2019-06-10).

The 'incremental-repack' task runs the following steps:

1. 'git multi-pack-index write' creates a multi-pack-index file if one did not exist, and otherwise will update the multi-pack-index with any new pack-files that appeared since the last write. This is particularly relevant with the background fetch job. When the multi-pack-index sees two copies of the same object, it stores the offset data into the newer pack-file.
This means that some old pack-files could become "unreferenced", which I will use to mean "a pack-file that is in the pack-file list of the multi-pack-index but none of the objects in the multi-pack-index reference a location inside that pack-file."

2. 'git multi-pack-index expire' deletes any unreferenced pack-files and updates the multi-pack-index to drop those pack-files from the list. This is safe to do as concurrent Git processes will see the multi-pack-index and not open those packs when looking for object contents. (Similar to the 'loose-objects' job, there are some Git commands that open pack-files regardless of the multi-pack-index, but they are rarely used. Further, a user that self-selects to use background operations would likely refrain from using those commands.)

3. 'git multi-pack-index repack --batch-size=<size>' collects a set of pack-files that are listed in the multi-pack-index and creates a new pack-file containing the objects whose offsets are listed by the multi-pack-index to be in those pack-files. The set of pack-files is selected greedily by sorting the pack-files by modified time and adding a pack-file to the set if its "expected size" is smaller than the batch size until the total expected size of the selected pack-files is at least the batch size. The "expected size" is calculated by taking the size of the pack-file divided by the number of objects in the pack-file and multiplied by the number of objects from the multi-pack-index with offset in that pack-file. The expected size approximates how much data from that pack-file will contribute to the resulting pack-file size. The intention is that the resulting pack-file will be close in size to the provided batch size. The next run of the incremental-repack task will delete these repacked pack-files during the 'expire' step.

In this version, the batch size is set to "0", which ignores the size restrictions when selecting the pack-files.
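The "expected size" computation and the greedy batch selection described in step 3 can be sketched as below. This is a simplified illustration, not the code in midx.c: `struct pack_info`, `expected_size()` and `select_batch()` are hypothetical names, the packs are assumed to be already sorted from oldest to newest, the multiplication is done before the division to limit rounding loss, overflow is not handled, and the special case of a batch size of 0 is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pack summary, for illustration only. */
struct pack_info {
	uint64_t pack_size;          /* size of the pack-file in bytes */
	uint32_t num_objects;        /* objects stored in the pack-file */
	uint32_t referenced_objects; /* objects the multi-pack-index locates here */
};

/* Approximate how many bytes this pack would contribute to a repack. */
static uint64_t expected_size(const struct pack_info *p)
{
	if (!p->num_objects)
		return 0;
	return p->pack_size * p->referenced_objects / p->num_objects;
}

/*
 * Greedy selection over packs sorted oldest first. A pack is taken if its
 * expected size is below batch_size; selection stops once the running
 * total reaches batch_size. include[i] is set to 1 for each selected pack.
 * Returns the number of selected packs.
 */
static size_t select_batch(const struct pack_info *packs, size_t nr,
			   uint64_t batch_size, unsigned char *include)
{
	uint64_t total = 0;
	size_t selected = 0;

	for (size_t i = 0; i < nr; i++)
		include[i] = 0;

	for (size_t i = 0; i < nr && total < batch_size; i++) {
		uint64_t size = expected_size(&packs[i]);

		if (size >= batch_size)
			continue; /* too large for this batch */

		include[i] = 1;
		total += size;
		selected++;
	}

	return selected;
}
```

For example, a 400-byte pack holding 4 objects, of which the multi-pack-index references 2, has an expected size of 200 bytes in this sketch.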
With a batch size of 0, the task selects all pack-files and repacks all packed objects into a single pack-file. This will be updated in the next change, but it requires doing some calculations that are better isolated to a separate change.

Each of the above steps updates the multi-pack-index file. After each step, we verify the new multi-pack-index. If the new multi-pack-index is corrupt, then delete the multi-pack-index, rewrite it from scratch, and stop doing the later steps of the job. This is intended to be an extra-safe check without leaving a repo with many pack-files without a multi-pack-index.

These steps are based on a similar background maintenance step in Scalar (and VFS for Git) [1]. This was incredibly effective for users of the Windows OS repository. After using the same VFS for Git repository for over a year, some users had _thousands_ of pack-files that added up to as much as 250 GB of data. We noticed a few users were running into the open file descriptor limits (due in part to a bug in the multi-pack-index fixed by af96fe3 (midx: add packs to packed_git linked list, 2019-04-29)).

These pack-files were mostly small since they contained the commits and trees that were pushed to the origin in a given hour. The GVFS protocol includes a "prefetch" step that asks for pre-computed pack-files containing commits and trees by timestamp. These pack-files were grouped into "daily" pack-files once a day for up to 30 days. If a user did not request prefetch packs for over 30 days, then they would get the entire history of commits and trees in a new, large pack-file. This led to a large number of pack-files that had poor delta compression.

By running this pack-file maintenance step once per day, these repos with thousands of packs spanning 200+ GB dropped to dozens of pack-files spanning 30-50 GB. All of this was done without removing objects from the system, using a constant batch size of two gigabytes.
Once the work was done to reduce the pack-files to small sizes, the batch size of two gigabytes means that not every run triggers a repack operation, so the following run will not expire a pack-file. This has kept these repos in a "clean" state. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/PackfileMaintenanceStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 15 ++++ builtin/gc.c | 112 ++++++++++++++++++++++++++++++ midx.c | 2 +- midx.h | 1 + t/t5319-multi-pack-index.sh | 1 + t/t7900-maintenance.sh | 38 ++++++++++ 6 files changed, 168 insertions(+), 1 deletion(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 077929b691..a598582986 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -87,6 +87,21 @@ loose-objects:: thousand objects to prevent the job from taking too long on a repository with many loose objects. +incremental-repack:: + The `incremental-repack` job repacks the object directory + using the `multi-pack-index` feature. In order to prevent race + conditions with concurrent Git commands, it follows a two-step + process. First, it deletes any pack-files included in the + `multi-pack-index` where none of the objects in the + `multi-pack-index` reference those pack-files; this only happens + if all objects in the pack-file are also stored in a newer + pack-file. Second, it selects a group of pack-files whose "expected + size" is below the batch size until the group has total expected + size at least the batch size; see the `--batch-size` option for + the `repack` subcommand in linkgit:git-multi-pack-index[1]. The + default batch-size is zero, which is a special case that attempts + to repack all pack-files into a single pack-file. 
+ OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index 96ded73b2f..99ab1f5e9d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -29,6 +29,7 @@ #include "tree.h" #include "promisor-remote.h" #include "remote.h" +#include "midx.h" #define FAILED_RUN "failed to run %s" @@ -924,6 +925,112 @@ static int maintenance_task_loose_objects(void) return prune_packed() || pack_loose(); } +static int multi_pack_index_write(void) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_pushl(&child.args, "multi-pack-index", "write", NULL); + + if (opts.quiet) + strvec_push(&child.args, "--no-progress"); + + if (run_command(&child)) + return error(_("failed to write multi-pack-index")); + + return 0; +} + +static int rewrite_multi_pack_index(void) +{ + struct repository *r = the_repository; + char *midx_name = get_midx_filename(r->objects->odb->path); + + unlink(midx_name); + free(midx_name); + + return multi_pack_index_write(); +} + +static int multi_pack_index_verify(const char *message) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_pushl(&child.args, "multi-pack-index", "verify", NULL); + + if (opts.quiet) + strvec_push(&child.args, "--no-progress"); + + if (run_command(&child)) { + warning(_("'git multi-pack-index verify' failed %s"), message); + return 1; + } + + return 0; +} + +static int multi_pack_index_expire(void) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_pushl(&child.args, "multi-pack-index", "expire", NULL); + + if (opts.quiet) + strvec_push(&child.args, "--no-progress"); + + close_object_store(the_repository->objects); + + if (run_command(&child)) + return error(_("'git multi-pack-index expire' failed")); + + return 0; +} + +static int multi_pack_index_repack(void) +{ + struct child_process child = CHILD_PROCESS_INIT; + + child.git_cmd = 1; + strvec_pushl(&child.args, "multi-pack-index", "repack", NULL); + + if (opts.quiet) + 
strvec_push(&child.args, "--no-progress"); + + strvec_push(&child.args, "--batch-size=0"); + + close_object_store(the_repository->objects); + + if (run_command(&child)) + return error(_("'git multi-pack-index repack' failed")); + + return 0; +} + +static int maintenance_task_incremental_repack(void) +{ + prepare_repo_settings(the_repository); + if (!the_repository->settings.core_multi_pack_index) { + warning(_("skipping incremental-repack task because core.multiPackIndex is disabled")); + return 0; + } + + if (multi_pack_index_write()) + return 1; + if (multi_pack_index_verify("after initial write")) + return rewrite_multi_pack_index(); + if (multi_pack_index_expire()) + return 1; + if (multi_pack_index_verify("after expire step")) + return !!rewrite_multi_pack_index(); + if (multi_pack_index_repack()) + return 1; + if (multi_pack_index_verify("after repack step")) + return !!rewrite_multi_pack_index(); + return 0; +} + typedef int maintenance_task_fn(void); struct maintenance_task { @@ -937,6 +1044,7 @@ struct maintenance_task { enum maintenance_task_label { TASK_PREFETCH, TASK_LOOSE_OBJECTS, + TASK_INCREMENTAL_REPACK, TASK_GC, TASK_COMMIT_GRAPH, @@ -953,6 +1061,10 @@ static struct maintenance_task tasks[] = { "loose-objects", maintenance_task_loose_objects, }, + [TASK_INCREMENTAL_REPACK] = { + "incremental-repack", + maintenance_task_incremental_repack, + }, [TASK_GC] = { "gc", maintenance_task_gc, diff --git a/midx.c b/midx.c index ef499cf504..f0ebee34c8 100644 --- a/midx.c +++ b/midx.c @@ -37,7 +37,7 @@ #define PACK_EXPIRED UINT_MAX -static char *get_midx_filename(const char *object_dir) +char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } diff --git a/midx.h b/midx.h index b18cf53bc4..baeecc70c9 100644 --- a/midx.h +++ b/midx.h @@ -37,6 +37,7 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) +char *get_midx_filename(const char *object_dir); struct multi_pack_index *load_multi_pack_index(const 
char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 7214cab36c..2abd29a007 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -3,6 +3,7 @@ test_description='multi-pack-indexes' . ./test-lib.sh +GIT_TEST_MULTI_PACK_INDEX=0 objdir=.git/objects midx_read_expect () { diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 27a423a4f2..0cc59adb21 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -5,6 +5,7 @@ test_description='git maintenance builtin' . ./test-lib.sh GIT_TEST_COMMIT_GRAPH=0 +GIT_TEST_MULTI_PACK_INDEX=0 test_expect_success 'help text' ' test_expect_code 129 git maintenance -h 2>err && @@ -106,4 +107,41 @@ test_expect_success 'loose-objects task' ' test_cmp packs-between packs-after ' +test_expect_success 'incremental-repack task' ' + packDir=.git/objects/pack && + for i in $(test_seq 1 5) + do + test_commit $i || return 1 + done && + + # Create three disjoint pack-files with size BIG, small, small. + echo HEAD~2 | git pack-objects --revs $packDir/test-1 && + test_tick && + git pack-objects --revs $packDir/test-2 <<-\EOF && + HEAD~1 + ^HEAD~2 + EOF + test_tick && + git pack-objects --revs $packDir/test-3 <<-\EOF && + HEAD + ^HEAD~1 + EOF + rm -f $packDir/pack-* && + rm -f $packDir/loose-* && + ls $packDir/*.pack >packs-before && + test_line_count = 3 packs-before && + + # the job repacks the two into a new pack, but does not + # delete the old ones. + git maintenance run --task=incremental-repack && + ls $packDir/*.pack >packs-between && + test_line_count = 4 packs-between && + + # the job deletes the two old packs, and does not write + # a new one because only one pack remains. 
+ git maintenance run --task=incremental-repack && + ls .git/objects/pack/*.pack >packs-after && + test_line_count = 1 packs-after +' + test_done From patchwork Thu Jul 30 22:24:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693733 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD168912 for ; Thu, 30 Jul 2020 22:25:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 92B742083B for ; Thu, 30 Jul 2020 22:25:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VYfzTiiy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730520AbgG3WZA (ORCPT ); Thu, 30 Jul 2020 18:25:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730445AbgG3WYw (ORCPT ); Thu, 30 Jul 2020 18:24:52 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B61EBC061574 for ; Thu, 30 Jul 2020 15:24:51 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id f18so5210485wmc.0 for ; Thu, 30 Jul 2020 15:24:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=iFZtDi+XmH2tLB2MN3s2TsMV6TqojLwZeUex+F1/1v4=; b=VYfzTiiy+1NZQXbnTLrY/bUpCylzaucihuSXusjZy1IMuWNbVmy39uKZHFMt6JuYSw qf+xJSqSb/FuvEZ7QToVCK1RDMonfs224MnOvWdBzXbY4/I+803/daDZXXx1d27ovfZc 42kvWzD2aFfdXT9t0DbxDhYSFtqM9NKwAkct88axvDHsuu5tb4Ksjo9EfaQX4ehNfDwY 
qgDmjJARTSeuFbwQJQY6sW6xufqGDuCFZSmQ6/xtYD92PaYs65m4GMOXxDt5Gp9RfJM7 ih381nZN8WWzOXVAkthfN8ltm2ASXdN0KcM/gycubgW1efiAekchQWtl+asJ9LNdUKc3 2gGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=iFZtDi+XmH2tLB2MN3s2TsMV6TqojLwZeUex+F1/1v4=; b=BZbZGLNHMtIcxCq/c7awh8vDFJs9pQzgsPBkmZwmYTLOd6tbOnhQOOxYRBPVLqIEna aTB8es1G2qSwdmH96jkMjW4PBOAYNdodpaZp0tym4C7U0sW0WkCRA0u+05yXXbED/HQ6 EuMJVrSsdqHeXbCXFgV9tqAPEeL/M+dFBs63+JcLzc7sJJqXtWnhmEBj6l/lk08Wha7b 7/wtFfQ6QSRUFICWFyYdZLOc69G6AtP/Dqf6ZPgjEtCpz6c3NUh4JJz4xs6BK3XEMoHm euXfMCyg4rFVnhlFjwaBzLbA4BRBNnRz7g5m//LDhX8JHtquqVoKg4qhnuVTAUDrRYdD tAxQ== X-Gm-Message-State: AOAM53152XcWXyC0Vgy5nFhZiQTwyrCj88pKGGns374ll5X6oITeuOUy VqDvjJJvDyskg9Okklssau2Ul/os X-Google-Smtp-Source: ABdhPJxt4HKv4Dxu7F0kj4dEs+0cjKqM9XLmmBAOtQlBUG4SO4nL5TaNXjBhbcCwRa3Qv/SzgaJEMA== X-Received: by 2002:a1c:740e:: with SMTP id p14mr1044712wmc.179.1596147890289; Thu, 30 Jul 2020 15:24:50 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z11sm10641091wrw.93.2020.07.30.15.24.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:49 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:19 +0000 Subject: [PATCH v3 13/20] maintenance: auto-size incremental-repack batch Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee When repacking during the 'incremental-repack' task, we use the 
--batch-size option in 'git multi-pack-index repack'. The initial setting used --batch-size=0 to repack everything into a single pack-file. This is not sustainable for a large repository. The amount of work required is also likely to use too many system resources for a background job. Update the 'incremental-repack' task by dynamically computing a --batch-size option based on the current pack-file structure. The dynamic default size is computed with this idea in mind for a client repository that was cloned from a very large remote: there is likely one "big" pack-file that was created at clone time. Thus, do not try repacking it as it is likely packed efficiently by the server. Instead, we select the second-largest pack-file, and create a batch size that is one larger than that pack-file. If there are three or more pack-files, then this guarantees that at least two will be combined into a new pack-file. Of course, this means that the second-largest pack-file size is likely to grow over time and may eventually surpass the initially-cloned pack-file. Recall that the pack-file batch is selected in a greedy manner: the packs are considered from oldest to newest and are selected if they have size smaller than the batch size until the total selected size is larger than the batch size. Thus, that oldest "clone" pack will be first to repack after the new data creates a pack larger than that. We also want to place some limits on how large these pack-files become, in order to bound the amount of time spent repacking. A maximum batch-size of two gigabytes means that large repositories will never be packed into a single pack-file using this job, but also that repack is rather expensive. This is a trade-off that is valuable to have if the maintenance is being run automatically or in the background. Users who truly want to optimize for space and performance (and are willing to pay the upfront cost of a full repack) can use the 'gc' task to do so. 
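The sizing rule described above (one more than the second-largest pack-file, capped at two gigabytes) can be sketched on its own as follows. This is an illustration under assumptions: `auto_batch_size()` and `BATCH_SIZE_CAP` are hypothetical names, the pack sizes are passed in as a plain array rather than read from the object store, and the cap is taken to be 2^31 - 1 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cap for this sketch: just under two gigabytes (2^31 - 1 bytes). */
#define BATCH_SIZE_CAP INT64_C(0x7FFFFFFF)

/*
 * Hypothetical helper: pick a batch size one byte larger than the
 * second-largest pack, so that the largest pack (likely the one created
 * at clone time) is left alone while smaller packs get combined.
 */
static int64_t auto_batch_size(const int64_t *pack_sizes, size_t nr)
{
	int64_t largest = 0;
	int64_t second_largest = 0;
	int64_t result;

	for (size_t i = 0; i < nr; i++) {
		if (pack_sizes[i] > largest) {
			second_largest = largest;
			largest = pack_sizes[i];
		} else if (pack_sizes[i] > second_largest) {
			second_largest = pack_sizes[i];
		}
	}

	result = second_largest + 1;

	/* Bound the amount of work done by a single background run. */
	if (result > BATCH_SIZE_CAP)
		result = BATCH_SIZE_CAP;

	return result;
}
```

With pack sizes of 5000, 100 and 300 bytes, this sketch yields a batch size of 301, so the two small packs qualify for repacking and the 5000-byte pack does not.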
Reported-by: Son Luong Ngoc Signed-off-by: Derrick Stolee --- builtin/gc.c | 43 +++++++++++++++++++++++++++++++++++++++++- t/t7900-maintenance.sh | 5 +++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index 99ab1f5e9d..d94eb3e6ad 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -988,6 +988,46 @@ static int multi_pack_index_expire(void) return 0; } +#define TWO_GIGABYTES (0x7FFFFFFF) + +static off_t get_auto_pack_size(void) +{ + /* + * The "auto" value is special: we optimize for + * one large pack-file (i.e. from a clone) and + * expect the rest to be small and they can be + * repacked quickly. + * + * The strategy we select here is to select a + * size that is one more than the second largest + * pack-file. This ensures that we will repack + * at least two packs if there are three or more + * packs. + */ + off_t max_size = 0; + off_t second_largest_size = 0; + off_t result_size; + struct packed_git *p; + struct repository *r = the_repository; + + reprepare_packed_git(r); + for (p = get_all_packs(r); p; p = p->next) { + if (p->pack_size > max_size) { + second_largest_size = max_size; + max_size = p->pack_size; + } else if (p->pack_size > second_largest_size) + second_largest_size = p->pack_size; + } + + result_size = second_largest_size + 1; + + /* But limit ourselves to a batch size of 2g */ + if (result_size > TWO_GIGABYTES) + result_size = TWO_GIGABYTES; + + return result_size; +} + static int multi_pack_index_repack(void) { struct child_process child = CHILD_PROCESS_INIT; @@ -998,7 +1038,8 @@ static int multi_pack_index_repack(void) if (opts.quiet) strvec_push(&child.args, "--no-progress"); - strvec_push(&child.args, "--batch-size=0"); + strvec_pushf(&child.args, "--batch-size=%"PRIuMAX, + (uintmax_t)get_auto_pack_size()); close_object_store(the_repository->objects); diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 0cc59adb21..4e9c1dfa0f 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh
@@ -138,10 +138,11 @@ test_expect_success 'incremental-repack task' ' test_line_count = 4 packs-between && # the job deletes the two old packs, and does not write - # a new one because only one pack remains. + # a new one because the batch size is not high enough to + # pack the largest pack-file. git maintenance run --task=incremental-repack && ls .git/objects/pack/*.pack >packs-after && - test_line_count = 1 packs-after + test_line_count = 2 packs-after ' test_done From patchwork Thu Jul 30 22:24:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693729 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 14F5C14B7 for ; Thu, 30 Jul 2020 22:25:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE9B92083B for ; Thu, 30 Jul 2020 22:24:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kZyDjsx+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730509AbgG3WY6 (ORCPT ); Thu, 30 Jul 2020 18:24:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730457AbgG3WYx (ORCPT ); Thu, 30 Jul 2020 18:24:53 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ADDDEC06174A for ; Thu, 30 Jul 2020 15:24:52 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id p14so7023810wmg.1 for ; Thu, 30 Jul 2020 15:24:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=DxOZ1IOjcDH2+iLvAAYVmpQWrqgBbjvBYKjkQTk6LiE=; b=kZyDjsx+hjomOcV8vR1Eqw25JcPL+Z/KbtmNtqDjmb0l1fStPXM3z7YpdJ8va++WAK sTdKepgEvuSJee0EzM8KJ/DXoPKTBXPRC6EocKPwUC1PC4xjRRySfYFJY33RxfbhwGg4 svHri0ZTDzmxZE1Pbp4ndyt1i7cyY3lWGmHdj+yRkqppE8T/TCzts+RUJKppDu9DmwAB aHTj8ZgXi7qUuT1sKgK76c13wEBWL56O9jiokrAASEzV74nMrMER5KGG56fw5m+I9wOr 6ZBZZtuDMxdMFXO/p8mUiIFiMTBYGUndOofGQIZxCfggY8OpQyre8HswqWXuG8Typ7nX k31g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=DxOZ1IOjcDH2+iLvAAYVmpQWrqgBbjvBYKjkQTk6LiE=; b=m8tayMS/s8V7/qdTP4pqDyeCF0VBaI9057z/KfDjcpkQ+iy19+eI+GvBmD1Glw/RBA 6HbWdJ1hF07+j56GICqH4f2DrCzm2XgAhBaU9Mu3E/BTHqIVg99RANlZ3hGgKyBQYCwP UkCpIhclnhsefDyLAHLCho6/iduE9PJiLKF8uwhjbHHcCwx4CoGMQj6W7JXfYscSoRJp F43Z+S/hONShC/7kLkA1nnVNH/cjbfggEu9gjq2XvYBCuLGfF96bmFg7OJ7dqqIIdwbE GDt7UDqBumJptHKw1XRhPdEK9OnzAKMMsY+kKyRfpu+/aA6/axzB0936304ORgSqrxeQ jQbQ== X-Gm-Message-State: AOAM531WLNAcElndzXO6Pj/Y2hgNLCF6sXpn+yoXYp2DwD5FQUDNNQl+ hMpES1Vm0v4gTwaDs8o/bbloqrRz X-Google-Smtp-Source: ABdhPJwkGUbdeGbWmy+Caxllf70xCf8TTsHAg39uApnVrn34Z9iE/LEIiSAx+HlwkXCBYV/0iB06NQ== X-Received: by 2002:a1c:1d92:: with SMTP id d140mr1105900wmd.143.1596147891290; Thu, 30 Jul 2020 15:24:51 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x2sm8196961wrg.73.2020.07.30.15.24.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:50 -0700 (PDT) Message-Id: <99840c4b8f28b33b1c7ec2e861ee78bf81fe0277.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:20 +0000 Subject: [PATCH v3 14/20] maintenance: create maintenance..enabled config Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, 
steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Currently, a normal run of "git maintenance run" will only run the 'gc' task, as it is the only one enabled. This is mostly for backwards-compatibility reasons since "git maintenance run --auto" commands replaced previous "git gc --auto" commands after some Git processes. Users could manually run specific maintenance tasks by calling "git maintenance run --task=<task>" directly. Allow users to customize which steps are run automatically using config. The 'maintenance.<task>.enabled' option then can turn on these other tasks (or turn off the 'gc' task). Signed-off-by: Derrick Stolee --- Documentation/config.txt | 2 ++ Documentation/config/maintenance.txt | 4 ++++ Documentation/git-maintenance.txt | 6 +++++- builtin/gc.c | 19 +++++++++++++++++++ t/t7900-maintenance.sh | 12 ++++++++++++ 5 files changed, 42 insertions(+), 1 deletion(-) create mode 100644 Documentation/config/maintenance.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index ef0768b91a..2783b825f9 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -396,6 +396,8 @@ include::config/mailinfo.txt[] include::config/mailmap.txt[] +include::config/maintenance.txt[] + include::config/man.txt[] include::config/merge.txt[] diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt new file mode 100644 index 0000000000..370cbfb42f --- /dev/null +++ b/Documentation/config/maintenance.txt @@ -0,0 +1,4 @@ +maintenance.<task>.enabled:: + This boolean config option controls whether the maintenance task + with name `<task>` is run when no `--task` option is specified. + By default, only `maintenance.gc.enabled` is true.
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index a598582986..1bd105918f 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -30,7 +30,11 @@ SUBCOMMANDS ----------- run:: - Run one or more maintenance tasks. + Run one or more maintenance tasks. If one or more `--task` options + are specified, then those tasks are run in that order. Otherwise, + the tasks are determined by which `maintenance.<task>.enabled` + config options are true. By default, only `maintenance.gc.enabled` + is true. TASKS ----- diff --git a/builtin/gc.c b/builtin/gc.c index d94eb3e6ad..c599690591 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1171,6 +1171,24 @@ static int maintenance_run(void) return result; } +static void initialize_task_config(void) +{ + int i; + struct strbuf config_name = STRBUF_INIT; + for (i = 0; i < TASK__COUNT; i++) { + int config_value; + + strbuf_setlen(&config_name, 0); + strbuf_addf(&config_name, "maintenance.%s.enabled", + tasks[i].name); + + if (!git_config_get_bool(config_name.buf, &config_value)) + tasks[i].enabled = config_value; + } + + strbuf_release(&config_name); +} + static int task_option_parse(const struct option *opt, const char *arg, int unset) { @@ -1224,6 +1242,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) builtin_maintenance_options); opts.quiet = !isatty(2); + initialize_task_config(); argc = parse_options(argc, argv, prefix, builtin_maintenance_options, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 4e9c1dfa0f..affa268dec 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -21,6 +21,18 @@ test_expect_success 'run [--auto|--quiet]' ' grep ",\"gc\",\"--quiet\"" run-quiet.txt ' +test_expect_success 'maintenance.<task>.enabled' ' + git config maintenance.gc.enabled false && + git config maintenance.commit-graph.enabled true && + git config maintenance.loose-objects.enabled true && + GIT_TRACE2_EVENT="$(pwd)/run-config.txt"
git maintenance run && + ! grep ",\"fetch\"" run-config.txt && + ! grep ",\"gc\"" run-config.txt && + ! grep ",\"multi-pack-index\"" run-config.txt && + grep ",\"commit-graph\"" run-config.txt && + grep ",\"prune-packed\"" run-config.txt +' + test_expect_success 'run --task=' ' GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph && GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" git maintenance run --task=gc && From patchwork Thu Jul 30 22:24:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693731 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6568E14DD for ; Thu, 30 Jul 2020 22:25:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4AB882083B for ; Thu, 30 Jul 2020 22:25:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="doqO+rqf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730518AbgG3WY7 (ORCPT ); Thu, 30 Jul 2020 18:24:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730468AbgG3WYx (ORCPT ); Thu, 30 Jul 2020 18:24:53 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95711C061756 for ; Thu, 30 Jul 2020 15:24:53 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id t14so2390719wmi.3 for ; Thu, 30 Jul 2020 15:24:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=y4rH2VOeRAt+L1lOlsRtWcRpEG/8wcm/V9DpTFxHn7c=; b=doqO+rqf4BqQfM08tiuCYkmAMS9jAzO3jAtJOu01UBuNTyZV3Opjgc36FV7Jrt+urI 9ObgcVHK3WKP3L+/58QI6h072c52o1+aEEbmJE6XIR+HjSE/3vJoz9LlFUJ51RYu/t2w wFNGnq66Mz9XPl5OgsKVC75ZSwifvXsjZoTHwpYm2bZLQby1QCCxX7/hPqjX0xhQJsCz b49IAHBzGclMKVGsXaIMW7OYxMowjUGEvNBvwry+l14Lqeo81qFtdXIqbaNI0xpWwo3+ bvtL2hFF0iqMCeLoR+94HzjdGmEU2pAuG+XHQSVmIBW7X0ysaPdYBghoHZy9G4sD5D5p 6E1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=y4rH2VOeRAt+L1lOlsRtWcRpEG/8wcm/V9DpTFxHn7c=; b=Dk/8hVgoYAJdUzomgARe0fGMNOkhRVoySDUJTpK0iLAadUtmjYrIWVBIRzlUOrGjV7 D+GCLfw/bDEqT0/TvVcEqo/UEtDwrtKKOgUgNcdPqrwvznjzSn/yYF9+aTJx/1dDp49p vCUkD2nJZclTLWIhOtHbA4koELHQUfcyTy39Lsc3kghDj8PtrGFeMgjADRK5rSRkHmO2 WwmLorF4LhcSCVZjUL6ppKhogJ/g+LWaHN3Ki2L7v0jbGPzeJK0FWIgoXhw0zbVhBeN0 n0mVQ6djS4aSJ+XlB/mjZ4+c6ILek/FhRpu1/gg9QoXWJfCjlXpKnlt7wssq1UrRghgB fRKA== X-Gm-Message-State: AOAM530hy63RtaxS8B7gXR3+PfGXhqbwa1mMi1J3Y3aI51GhmLWqn7xt ELNPADv2Pc718FVynYE78yZzXOJB X-Google-Smtp-Source: ABdhPJxx/uIRokZXKaPj5Hcg7ObSwm323YmBFE80HLA5gp4XQH1+iXSD53+dhQjqdX0hLTnVSys0qg== X-Received: by 2002:a1c:e908:: with SMTP id q8mr1165397wmc.59.1596147892124; Thu, 30 Jul 2020 15:24:52 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t14sm10418459wrv.14.2020.07.30.15.24.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:51 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:21 +0000 Subject: [PATCH v3 15/20] maintenance: use pointers to check --auto Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, 
phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The 'git maintenance run' command has an '--auto' option. This is used by other Git commands such as 'git commit' or 'git fetch' to check if maintenance should be run after adding data to the repository. Previously, this --auto option was only used to add the argument to the 'git gc' command as part of the 'gc' task. We will be expanding the other tasks to perform a check to see if they should do work as part of the --auto flag, when they are enabled by config. First, update the 'gc' task to perform the auto check inside the maintenance process. This prevents running an extra 'git gc --auto' command when not needed. It also shows a model for other tasks. Second, use the 'auto_condition' function pointer as a signal for whether we enable the maintenance task under '--auto'. For instance, we do not want to enable the 'fetch' task in '--auto' mode, so that function pointer will remain NULL. Now that we are not automatically calling 'git gc', a test in t5514-fetch-multiple.sh must be changed to watch for 'git maintenance' instead. We continue to pass the '--auto' option to the 'git gc' command when necessary, because the gc.autoDetach config option changes its behavior. Likely, we will want to absorb the daemonizing behavior implied by gc.autoDetach as a maintenance.autoDetach config option.
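For readers following along, the '--auto' gating described above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions: the names demo_task, demo_tasks, run_demo_tasks, cond_always and cond_never are hypothetical and do not exist in builtin/gc.c. Only the rule itself (a NULL or zero-returning auto condition skips the task under '--auto') is taken from the description.

```c
#include <stddef.h>

typedef int demo_task_fn(void);
typedef int demo_auto_fn(void);

/* Hypothetical stand-in for struct maintenance_task. */
struct demo_task {
	const char *name;
	demo_task_fn *fn;
	demo_auto_fn *auto_condition; /* NULL: never run under --auto */
	unsigned enabled:1;
};

static int do_work(void) { return 0; }     /* 0 means success */
static int cond_always(void) { return 1; } /* task should run */
static int cond_never(void) { return 0; }  /* task should NOT run */

static struct demo_task demo_tasks[] = {
	{ "fetch", do_work, NULL, 1 },
	{ "gc", do_work, cond_always, 1 },
	{ "commit-graph", do_work, cond_never, 1 },
	{ "loose-objects", do_work, cond_always, 0 },
};

/*
 * Returns the number of tasks that ran, or -1 if one failed.
 * Disabled tasks are always skipped; with auto_flag set, a task
 * also needs a non-NULL condition that returns nonzero.
 */
int run_demo_tasks(int auto_flag)
{
	size_t i;
	int ran = 0;

	for (i = 0; i < sizeof(demo_tasks) / sizeof(demo_tasks[0]); i++) {
		struct demo_task *t = &demo_tasks[i];

		if (!t->enabled)
			continue;
		if (auto_flag &&
		    (!t->auto_condition || !t->auto_condition()))
			continue;
		if (t->fn())
			return -1;
		ran++;
	}
	return ran;
}
```

With this table, run_demo_tasks(1) runs only the "gc" entry, while run_demo_tasks(0) runs every enabled entry regardless of its condition. Keeping the condition as a pointer lets a task opt out of '--auto' entirely without a separate flag.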
Signed-off-by: Derrick Stolee --- builtin/gc.c | 16 ++++++++++++++++ t/t5514-fetch-multiple.sh | 2 +- t/t7900-maintenance.sh | 2 +- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index c599690591..1c449b3776 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1074,9 +1074,17 @@ static int maintenance_task_incremental_repack(void) typedef int maintenance_task_fn(void); +/* + * An auto condition function returns 1 if the task should run + * and 0 if the task should NOT run. See needs_to_gc() for an + * example. + */ +typedef int maintenance_auto_fn(void); + struct maintenance_task { const char *name; maintenance_task_fn *fn; + maintenance_auto_fn *auto_condition; unsigned enabled:1, selected:1; int selected_order; @@ -1109,6 +1117,7 @@ static struct maintenance_task tasks[] = { [TASK_GC] = { "gc", maintenance_task_gc, + need_to_gc, 1, }, [TASK_COMMIT_GRAPH] = { @@ -1161,6 +1170,11 @@ static int maintenance_run(void) if (!opts.tasks_selected && !tasks[i].enabled) continue; + if (opts.auto_flag && + (!tasks[i].auto_condition || + !tasks[i].auto_condition())) + continue; + if (tasks[i].fn()) { error(_("task '%s' failed"), tasks[i].name); result = 1; @@ -1175,6 +1189,8 @@ static void initialize_task_config(void) { int i; struct strbuf config_name = STRBUF_INIT; + gc_config(); + for (i = 0; i < TASK__COUNT; i++) { int config_value; diff --git a/t/t5514-fetch-multiple.sh b/t/t5514-fetch-multiple.sh index de8e2f1531..bd202ec6f3 100755 --- a/t/t5514-fetch-multiple.sh +++ b/t/t5514-fetch-multiple.sh @@ -108,7 +108,7 @@ test_expect_success 'git fetch --multiple (two remotes)' ' GIT_TRACE=1 git fetch --multiple one two 2>trace && git branch -r > output && test_cmp ../expect output && - grep "built-in: git gc" trace >gc && + grep "built-in: git maintenance" trace >gc && test_line_count = 1 gc ) ' diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index affa268dec..19494e6c43 100755 --- a/t/t7900-maintenance.sh +++ 
b/t/t7900-maintenance.sh @@ -17,7 +17,7 @@ test_expect_success 'run [--auto|--quiet]' ' GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && GIT_TRACE2_EVENT="$(pwd)/run-quiet.txt" git maintenance run --quiet && grep ",\"gc\"]" run-no-auto.txt && - grep ",\"gc\",\"--auto\"" run-auto.txt && + ! grep ",\"gc\",\"--auto\"" run-auto.txt && grep ",\"gc\",\"--quiet\"" run-quiet.txt ' From patchwork Thu Jul 30 22:24:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693745 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E96C14DD for ; Thu, 30 Jul 2020 22:25:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 372972083B for ; Thu, 30 Jul 2020 22:25:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="KxEinOld" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730547AbgG3WZI (ORCPT ); Thu, 30 Jul 2020 18:25:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730412AbgG3WYy (ORCPT ); Thu, 30 Jul 2020 18:24:54 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74AF0C061757 for ; Thu, 30 Jul 2020 15:24:54 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id a14so26304695wra.5 for ; Thu, 30 Jul 2020 15:24:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=k5H7vskpbKfRG/gk39MiiO8RG5LEjWacblN8SAQ+xvg=; b=KxEinOldZFr9KAvsSTO11J+vpoS0X9t4MCFcQzIGV94Xlo/vEiM12h4OTXYyk3A+Om 6kG5ZW6Lz7LIDjilREOFh6coff5m6Y2VuGLMmquUsR/gOV7GxlzIGucnsxG5Hlfu7mCQ RyoiDWfEabQvBAfSc6GtIhhqGf5oPAApVmA606G6BKH/QE600eg18hh6f8UtbdPYsmeC ieGatDiLgI4IrqXqBvHwBCzejcQgJ3VNi+1lvJO9JknFWkM7QiSguH+l+HEOJ5FZ3HXa UrJjA5h2kVFfpGtwNdlc2tXp+5Ke3nes+XiWygZsNE/dW68xpzNHeAmvvBR9gBSwrxV6 xzOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=k5H7vskpbKfRG/gk39MiiO8RG5LEjWacblN8SAQ+xvg=; b=iwW5A8wj6EXLGeXkyNGpW+Vle11k/+AfceQ449uYmqrnJBTQzqKUIsnme50wRNvN+f x+40cujnzsLUH93+ETho+GEhZ2M2DvA09lFH1YnKwdGAXd4liMRnVy/5aCnzUphzO1HA 6MxxgCrSDUdL3rtpjgDYs8afFyevun0aAqjNtJZ/lFudqaeCjBvRJtetZPv/mF6LZkb3 mQmLLqDuyPpaJv1dpxs4J2sNEVFO+TcBM1iYe03g07hug4qMOmJ8P0+UI8Io7eHoKOdz nfYHdrjQaBqoQVGEAIfb9Zc3bENzgH30eZTihW1PXIuA/u3iRBrkU5MeFl/hElNnfct0 LKyA== X-Gm-Message-State: AOAM532cDN+J0mDXYwD8Csfmzbtkt20xQB90j2M2a1g7mJcWZyq3yw5K 9JIGrzELM0evI6C4sJ9qTnGXXeQx X-Google-Smtp-Source: ABdhPJwyUVzxiIZ9ceOiLhowQYYDE2dAzZlY+9hkhDAgFMstpGUrl3zQsusHZBDlNg3nKDEIqIAppg== X-Received: by 2002:a05:6000:12c1:: with SMTP id l1mr672267wrx.270.1596147893016; Thu, 30 Jul 2020 15:24:53 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j5sm10926978wmb.12.2020.07.30.15.24.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:52 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:22 +0000 Subject: [PATCH v3 16/20] maintenance: add auto condition for commit-graph task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, 
sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Instead of writing a new commit-graph in every 'git maintenance run --auto' process (when maintenance.commit-graph.enabled is configured to be true), only write when there are "enough" commits not in a commit-graph file. This count is controlled by the maintenance.commit-graph.auto config option. To compute the count, use a depth-first search starting at each ref, leaving markers using the PARENT1 flag. If this count reaches the limit, then terminate early and start the task. Otherwise, this operation will peel every ref and parse the commit it points to. If these are all in the commit-graph, then this is typically a very fast operation. Users with many refs might feel a slow-down, and hence could consider updating their limit to be very small. A negative value will force the step to run every time. Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 10 ++++ builtin/gc.c | 76 ++++++++++++++++++++++++++++ object.h | 1 + 3 files changed, 87 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index 370cbfb42f..9bd69b9df3 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -2,3 +2,13 @@ maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task with name `<task>` is run when no `--task` option is specified. By default, only `maintenance.gc.enabled` is true. + +maintenance.commit-graph.auto:: + This integer config option controls how often the `commit-graph` task + should be run as part of `git maintenance run --auto`. If zero, then + the `commit-graph` task will not run with the `--auto` option. A + negative value will force the task to run every time.
Otherwise, a + positive value implies the command should run when the number of + reachable commits that are not in the commit-graph file is at least + the value of `maintenance.commit-graph.auto`. The default value is + 100. diff --git a/builtin/gc.c b/builtin/gc.c index 1c449b3776..c85813fffe 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -30,6 +30,7 @@ #include "promisor-remote.h" #include "remote.h" #include "midx.h" +#include "refs.h" #define FAILED_RUN "failed to run %s" @@ -713,6 +714,80 @@ static struct maintenance_opts { int tasks_selected; } opts; +/* Remember to update object flag allocation in object.h */ +#define PARENT1 (1u<<16) + +static int num_commits_not_in_graph = 0; +static int limit_commits_not_in_graph = 100; + +static int dfs_on_ref(const char *refname, + const struct object_id *oid, int flags, + void *cb_data) +{ + int result = 0; + struct object_id peeled; + struct commit_list *stack = NULL; + struct commit *commit; + + if (!peel_ref(refname, &peeled)) + oid = &peeled; + if (oid_object_info(the_repository, oid, NULL) != OBJ_COMMIT) + return 0; + + commit = lookup_commit(the_repository, oid); + if (!commit) + return 0; + if (parse_commit(commit)) + return 0; + + commit_list_append(commit, &stack); + + while (!result && stack) { + struct commit_list *parent; + + commit = pop_commit(&stack); + + for (parent = commit->parents; parent; parent = parent->next) { + if (parse_commit(parent->item) || + commit_graph_position(parent->item) != COMMIT_NOT_FROM_GRAPH || + parent->item->object.flags & PARENT1) + continue; + + parent->item->object.flags |= PARENT1; + num_commits_not_in_graph++; + + if (num_commits_not_in_graph >= limit_commits_not_in_graph) { + result = 1; + break; + } + + commit_list_append(parent->item, &stack); + } + } + + free_commit_list(stack); + return result; +} + +static int should_write_commit_graph(void) +{ + int result; + + git_config_get_int("maintenance.commit-graph.auto", + &limit_commits_not_in_graph); + + if 
(!limit_commits_not_in_graph) + return 0; + if (limit_commits_not_in_graph < 0) + return 1; + + result = for_each_ref(dfs_on_ref, NULL); + + clear_commit_marks_all(PARENT1); + + return result; +} + static int run_write_commit_graph(void) { struct child_process child = CHILD_PROCESS_INIT; @@ -1123,6 +1198,7 @@ static struct maintenance_task tasks[] = { [TASK_COMMIT_GRAPH] = { "commit-graph", maintenance_task_commit_graph, + should_write_commit_graph, }, }; diff --git a/object.h b/object.h index 96a2105859..30d4e0f6a0 100644 --- a/object.h +++ b/object.h @@ -74,6 +74,7 @@ struct object_array { * list-objects-filter.c: 21 * builtin/fsck.c: 0--3 * builtin/index-pack.c: 2021 + * builtin/maintenance.c: 16 * builtin/pack-objects.c: 20 * builtin/reflog.c: 10--12 * builtin/show-branch.c: 0-------------------------------------------26 From patchwork Thu Jul 30 22:24:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693741 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3B036912 for ; Thu, 30 Jul 2020 22:25:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2349120838 for ; Thu, 30 Jul 2020 22:25:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mIW4BSx6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730561AbgG3WZF (ORCPT ); Thu, 30 Jul 2020 18:25:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730479AbgG3WY5 (ORCPT ); Thu, 30 Jul 2020 18:24:57 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57F24C06179E for ; Thu, 30 Jul 2020 15:24:55 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id r4so23321702wrx.9 for ; Thu, 30 Jul 2020 15:24:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=aRS2hQwnkKs6hP12JwyT9uf97Wjde3hMJnedVnVLB04=; b=mIW4BSx67g6xL3rMAlf1cVM62z1o4Bf6qolC46V24ivV7aP0/0iOSXSxZhLthuvgvh OysN9Oz1VzY3tvcswTKtlXIHVoEmwYAujvTQD7NimoIK93zUMlWZ9A4XG0QNzvxmOF4G WP+dvRcTfEev/szn/LCyHK+BoRcPrTxVpYgwf+7IuBl4AXRejF1zj4LD6v9FqtjrL7Nc TdL3FMkp/iT7NzSHRzFktguHPleRzybZfEZ5hnlmEdMNuge1W/QRWabQ78iqjiJn15Jh ickg07GBsO1lM5NKt0fqDMwO8sKjL09guJn6t85b/6y225IHWm2FbE69tmfj3dVtyRjj vPvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=aRS2hQwnkKs6hP12JwyT9uf97Wjde3hMJnedVnVLB04=; b=djAmvjjdx4goWrAJoW/Ezejn4B/qw45u2jjBNNMIGtPWL49LdQW0g8lBnyWL5GDyfX T7Vl7eMSYGXsD5EROzcDB07tnRsllfIVlPmtWbnt4G30K4JT8j9LMy0xzS7T7dU+2ssc 5AImhU6volxXULyqK04T3lYWSB9jpXM+6R4hD1Dg3e/zekTOVSCtJbNwYdfUQLjc3FCd 5poeJxXPHx9VFLyM+u4dAg+KbMGJIB/MVsWmisR4r96q+IuR8FGeDNHsqcz8Ar2l99cG 7X/OkK8BM1b7l6/KGE+wU11nIpzEXwHUCJHb1v0Hw82SWELm1MjecJjAXIJ7fbS6yAeh X0Pw== X-Gm-Message-State: AOAM532VwmEK4UE8Vs2Bmpg94n4l7A+ACwFWrZAMUAioHePM4NEbK/QF Opo2BHZGxnTwFHnEE1m7eqFDqn0f X-Google-Smtp-Source: ABdhPJx7acHK0MDXzKIDfuw8w5HJIb1SXjDb6g30/gQ8YXg3juPle0CVc0qK8Y83aCIIn6mLXKLtfA== X-Received: by 2002:adf:f8d0:: with SMTP id f16mr777422wrq.66.1596147893905; Thu, 30 Jul 2020 15:24:53 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id m8sm10624941wro.75.2020.07.30.15.24.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:53 -0700 (PDT) Message-Id: 
<6ac3a58f2fa1b7456cb80867fd62c3c462c5c858.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:23 +0000 Subject: [PATCH v3 17/20] maintenance: create auto condition for loose-objects Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The loose-objects task deletes loose objects that already exist in a pack-file, then places the remaining loose objects into a new pack-file. If this step runs all the time, then we risk creating pack-files with very few objects with every 'git commit' process. To prevent overwhelming the packs directory with small pack-files, require a minimum number of loose objects before running the task. The 'maintenance.loose-objects.auto' config option specifies the minimum number of loose objects needed for the task to run under the '--auto' option. This defaults to 100 loose objects. Setting the value to zero will prevent the step from running under '--auto' while a negative value will force it to run every time. Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 9 +++++++++ builtin/gc.c | 30 ++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 25 +++++++++++++++++++++++ 3 files changed, 64 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index 9bd69b9df3..a9442dd260 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -12,3 +12,12 @@ maintenance.commit-graph.auto:: reachable commits that are not in the commit-graph file is at least the value of `maintenance.commit-graph.auto`.
The default value is 100. + +maintenance.loose-objects.auto:: + This integer config option controls how often the `loose-objects` task + should be run as part of `git maintenance run --auto`. If zero, then + the `loose-objects` task will not run with the `--auto` option. A + negative value will force the task to run every time. Otherwise, a + positive value implies the command should run when the number of + loose objects is at least the value of `maintenance.loose-objects.auto`. + The default value is 100. diff --git a/builtin/gc.c b/builtin/gc.c index c85813fffe..3b27dc7e7f 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -927,6 +927,35 @@ struct write_loose_object_data { int batch_size; }; +static int loose_object_auto_limit = 100; + +static int loose_object_count(const struct object_id *oid, + const char *path, + void *data) +{ + int *count = (int*)data; + if (++(*count) >= loose_object_auto_limit) + return 1; + return 0; +} + +static int loose_object_auto_condition(void) +{ + int count = 0; + + git_config_get_int("maintenance.loose-objects.auto", + &loose_object_auto_limit); + + if (!loose_object_auto_limit) + return 0; + if (loose_object_auto_limit < 0) + return 1; + + return for_each_loose_file_in_objdir(the_repository->objects->odb->path, + loose_object_count, + NULL, NULL, &count); +} + static int bail_on_loose(const struct object_id *oid, const char *path, void *data) @@ -1184,6 +1213,7 @@ static struct maintenance_task tasks[] = { [TASK_LOOSE_OBJECTS] = { "loose-objects", maintenance_task_loose_objects, + loose_object_auto_condition, }, [TASK_INCREMENTAL_REPACK] = { "incremental-repack", diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 19494e6c43..32ac5c9fb7 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -119,6 +119,31 @@ test_expect_success 'loose-objects task' ' test_cmp packs-between packs-after ' +test_expect_success 'maintenance.loose-objects.auto' ' + git repack -adk && + GIT_TRACE2_EVENT="$(pwd)/trace-lo1.txt" 
\ + git -c maintenance.loose-objects.auto=1 maintenance \ + run --auto --task=loose-objects && + ! grep "\"prune-packed\"" trace-lo1.txt && + for i in 1 2 + do + printf data-A-$i | git hash-object -t blob --stdin -w && + GIT_TRACE2_EVENT="$(pwd)/trace-loA-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + ! grep "\"prune-packed\"" trace-loA-$i && + printf data-B-$i | git hash-object -t blob --stdin -w && + GIT_TRACE2_EVENT="$(pwd)/trace-loB-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + grep "\"prune-packed\"" trace-loB-$i && + GIT_TRACE2_EVENT="$(pwd)/trace-loC-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + grep "\"prune-packed\"" trace-loC-$i || return 1 + done +' + test_expect_success 'incremental-repack task' ' packDir=.git/objects/pack && for i in $(test_seq 1 5) From patchwork Thu Jul 30 22:24:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693739 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BAAD614B7 for ; Thu, 30 Jul 2020 22:25:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C72F2083B for ; Thu, 30 Jul 2020 22:25:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OHN+jY7p" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730549AbgG3WZE (ORCPT ); Thu, 30 Jul 2020 18:25:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730481AbgG3WY5 (ORCPT ); Thu, 30 Jul 2020 18:24:57 -0400 
Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21B42C06179F for ; Thu, 30 Jul 2020 15:24:56 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id a15so26265421wrh.10 for ; Thu, 30 Jul 2020 15:24:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=IUfoxyu4y2Mnm4a5S+lYZa6c+1Hk1DSww+pfud9VER4=; b=OHN+jY7pE99y61aIm/GPk0SEgdTZ9F0mbDle8BRf162hOX9tU3MhhQeATppw30NIS3 Q4znncnVzGIhhbpIrgVSnSAh/PZOB/g3s67saCbN82JbWTRW4miRYFiYhZAnJn1ybJat UhA3DlnbaSR/EEE0DGcDCzgVoAHhvO76w3JW0tnWfRUXc1d8uhZaoedezryNl3gaKTmj mZWixv0Y1o0tKmd7vED2G5o9mZIi0AsEHC2cF6qRVfpsu2BTZHbYoT3LlLDFfKShnYFd W37LiAZrEKcu7HzNoywsmVHAi0nXC/JpRlMiK8SHF1IyNKys/Lq4M9vcukiISzREKF7S J0lw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=IUfoxyu4y2Mnm4a5S+lYZa6c+1Hk1DSww+pfud9VER4=; b=g0rKd6fZpS3X8mtSS0fuI5keqWxKQbAYj/y9+yzAHpvLzQyeWODGy2pkeh3WPJt73g pfyZDS1yY5Kx6PVTMq/tz8FuY97Jw/8XzHfs8ecG4Qd4fS5W73RcHO3jMGBUTrriuTR+ fZKdh0L0ypQdgCT45i/9qiUg9XLlrK4lOSDfrmAoaJQWTnJEdexSR8Y2+2jwypEyD8eT de3DsaAkI4pAvq888knhLXywsZrsclW594aEvGL5wR5hkrk4lX7DdtsNW7bN5euPl3bI Hh6X5m9KMLYsUvPlZ8ewgLJ8cqESZaIJz6URx01qcFRIgM0NogbTVOst/Dx9Pi2PcfWO Irwg== X-Gm-Message-State: AOAM530IgxLnAxLXDXxlys7oFeHfq1D7WFFDpFSSNaIatbioCQXRrmUi /JurNFyPSkXZGm+1VIn0TLn57Lm4 X-Google-Smtp-Source: ABdhPJyDRp7iO9rdXgUXGLvt5cPAVTWL+YpVAfvNr6XXp2U9HiCDYjKiY1k+YxSQ5Yqr/zsD5eOsTg== X-Received: by 2002:adf:8024:: with SMTP id 33mr744325wrk.117.1596147894689; Thu, 30 Jul 2020 15:24:54 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k62sm11535399wmb.16.2020.07.30.15.24.54 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Jul 2020 15:24:54 -0700 (PDT) Message-Id: <801b262d1c26625d3f174fd80d7ddd336328bdd6.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:24 +0000 Subject: [PATCH v3 18/20] maintenance: add incremental-repack auto condition Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The incremental-repack task updates the multi-pack-index by deleting pack-files that have been replaced with new packs, then repacking a batch of small pack-files into a larger pack-file. This incremental repack is faster than rewriting all object data, but is slower than some other maintenance activities. The 'maintenance.incremental-repack.auto' config option specifies how many pack-files should exist outside of the multi-pack-index before running the step. These pack-files could be created by 'git fetch' commands or by the loose-objects task. The default value is 10. Setting the option to zero disables the task with the '--auto' option, and a negative value makes the task run every time.
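The threshold semantics above (zero disables, negative always runs, positive compares against a count) can be sketched in isolation. This is a rough illustration, not the code in this patch: struct demo_pack and should_repack are hypothetical names, with demo_pack standing in for Git's packed_git list and in_midx for "this pack is covered by the multi-pack-index".

```c
#include <stddef.h>

/* Hypothetical stand-in for an entry in the list of pack-files. */
struct demo_pack {
	int in_midx; /* nonzero if covered by the multi-pack-index */
	struct demo_pack *next;
};

/*
 * Returns 1 if the task should run under --auto, 0 otherwise.
 *   limit == 0: never run under --auto
 *   limit <  0: always run
 *   limit >  0: run once at least 'limit' packs are outside the index
 */
int should_repack(const struct demo_pack *packs, int limit)
{
	const struct demo_pack *p;
	int count = 0;

	if (!limit)
		return 0;
	if (limit < 0)
		return 1;

	/* Stop walking as soon as the limit is reached. */
	for (p = packs; p && count < limit; p = p->next) {
		if (!p->in_midx)
			count++;
	}

	return count >= limit;
}
```

Stopping the walk once the limit is reached keeps the check cheap in repositories with many packs, since the exact count beyond the limit does not change the answer.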
Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 9 ++++++++ builtin/gc.c | 31 ++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 30 +++++++++++++++++++++++++++ 3 files changed, 70 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index a9442dd260..22229e7174 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -21,3 +21,12 @@ maintenance.loose-objects.auto:: positive value implies the command should run when the number of loose objects is at least the value of `maintenance.loose-objects.auto`. The default value is 100. + +maintenance.incremental-repack.auto:: + This integer config option controls how often the `incremental-repack` + task should be run as part of `git maintenance run --auto`. If zero, + then the `incremental-repack` task will not run with the `--auto` + option. A negative value will force the task to run every time. + Otherwise, a positive value implies the command should run when the + number of pack-files not in the multi-pack-index is at least the value + of `maintenance.incremental-repack.auto`. The default value is 10. 
diff --git a/builtin/gc.c b/builtin/gc.c index 3b27dc7e7f..6ccc0cca19 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -31,6 +31,7 @@ #include "remote.h" #include "midx.h" #include "refs.h" +#include "object-store.h" #define FAILED_RUN "failed to run %s" @@ -1029,6 +1030,35 @@ static int maintenance_task_loose_objects(void) return prune_packed() || pack_loose(); } +static int incremental_repack_auto_condition(void) +{ + struct packed_git *p; + int enabled; + int incremental_repack_auto_limit = 10; + int count = 0; + + if (git_config_get_bool("core.multiPackIndex", &enabled) || + !enabled) + return 0; + + git_config_get_int("maintenance.incremental-repack.auto", + &incremental_repack_auto_limit); + + if (!incremental_repack_auto_limit) + return 0; + if (incremental_repack_auto_limit < 0) + return 1; + + for (p = get_packed_git(the_repository); + count < incremental_repack_auto_limit && p; + p = p->next) { + if (!p->multi_pack_index) + count++; + } + + return count >= incremental_repack_auto_limit; +} + static int multi_pack_index_write(void) { struct child_process child = CHILD_PROCESS_INIT; @@ -1218,6 +1248,7 @@ static struct maintenance_task tasks[] = { [TASK_INCREMENTAL_REPACK] = { "incremental-repack", maintenance_task_incremental_repack, + incremental_repack_auto_condition, }, [TASK_GC] = { "gc", diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 32ac5c9fb7..c8ccaa0a3d 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -182,4 +182,34 @@ test_expect_success 'incremental-repack task' ' test_line_count = 2 packs-after ' +test_expect_success 'maintenance.incremental-repack.auto' ' + git repack -adk && + git config core.multiPackIndex true && + git multi-pack-index write && + GIT_TRACE2_EVENT=1 git -c maintenance.incremental-repack.auto=1 \ + maintenance run --auto --task=incremental-repack >out && + ! 
grep "\"multi-pack-index\"" out && + for i in 1 2 + do + test_commit A-$i && + git pack-objects --revs .git/objects/pack/pack <<-\EOF && + HEAD + ^HEAD~1 + EOF + GIT_TRACE2_EVENT=$(pwd)/trace-A-$i git \ + -c maintenance.incremental-repack.auto=2 \ + maintenance run --auto --task=incremental-repack && + ! grep "\"multi-pack-index\"" trace-A-$i && + test_commit B-$i && + git pack-objects --revs .git/objects/pack/pack <<-\EOF && + HEAD + ^HEAD~1 + EOF + GIT_TRACE2_EVENT=$(pwd)/trace-B-$i git \ + -c maintenance.incremental-repack.auto=2 \ + maintenance run --auto --task=incremental-repack >out && + grep "\"multi-pack-index\"" trace-B-$i >/dev/null || return 1 + done +' + test_done
From patchwork Thu Jul 30 22:24:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693743 Message-Id: <9b4cef7635ede3e9e597d5f8e8da24e352f34c12.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020
22:24:25 +0000 Subject: [PATCH v3 19/20] midx: use start_delayed_progress() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Now that the multi-pack-index may be written as part of auto maintenance at the end of a command, reduce the progress output when the operations are quick. Use start_delayed_progress() instead of start_progress(). Update t5319-multi-pack-index.sh to use GIT_PROGRESS_DELAY=0 now that the progress indicators are conditional. Signed-off-by: Derrick Stolee --- midx.c | 10 +++++----- t/t5319-multi-pack-index.sh | 14 +++++++------- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/midx.c b/midx.c index f0ebee34c8..66d7053d83 100644 --- a/midx.c +++ b/midx.c @@ -832,7 +832,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * packs.pack_paths_checked = 0; if (flags & MIDX_PROGRESS) - packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0); + packs.progress = start_delayed_progress(_("Adding packfiles to multi-pack-index"), 0); else packs.progress = NULL; @@ -969,7 +969,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * } if (flags & MIDX_PROGRESS) - progress = start_progress(_("Writing chunks to multi-pack-index"), + progress = start_delayed_progress(_("Writing chunks to multi-pack-index"), num_chunks); for (i = 0; i < num_chunks; i++) { if (written != chunk_offsets[i]) @@ -1104,7 +1104,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag return 0; if (flags & MIDX_PROGRESS) - progress = start_progress(_("Looking for referenced 
packfiles"), + progress = start_delayed_progress(_("Looking for referenced packfiles"), m->num_packs); for (i = 0; i < m->num_packs; i++) { if (prepare_midx_pack(r, m, i)) @@ -1225,7 +1225,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla count = xcalloc(m->num_packs, sizeof(uint32_t)); if (flags & MIDX_PROGRESS) - progress = start_progress(_("Counting referenced objects"), + progress = start_delayed_progress(_("Counting referenced objects"), m->num_objects); for (i = 0; i < m->num_objects; i++) { int pack_int_id = nth_midxed_pack_int_id(m, i); @@ -1235,7 +1235,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla stop_progress(&progress); if (flags & MIDX_PROGRESS) - progress = start_progress(_("Finding and deleting unreferenced packfiles"), + progress = start_delayed_progress(_("Finding and deleting unreferenced packfiles"), m->num_packs); for (i = 0; i < m->num_packs; i++) { char *pack_name; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 2abd29a007..26f224b0e3 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -173,12 +173,12 @@ test_expect_success 'write progress off for redirected stderr' ' ' test_expect_success 'write force progress on for stderr' ' - git multi-pack-index --object-dir=$objdir --progress write 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --progress write 2>err && test_file_not_empty err ' test_expect_success 'write with the --no-progress option' ' - git multi-pack-index --object-dir=$objdir --no-progress write 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --no-progress write 2>err && test_line_count = 0 err ' @@ -335,17 +335,17 @@ test_expect_success 'git-fsck incorrect offset' ' ' test_expect_success 'repack progress off for redirected stderr' ' - git multi-pack-index --object-dir=$objdir repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index 
--object-dir=$objdir repack 2>err && test_line_count = 0 err ' test_expect_success 'repack force progress on for stderr' ' - git multi-pack-index --object-dir=$objdir --progress repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --progress repack 2>err && test_file_not_empty err ' test_expect_success 'repack with the --no-progress option' ' - git multi-pack-index --object-dir=$objdir --no-progress repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --no-progress repack 2>err && test_line_count = 0 err ' @@ -489,7 +489,7 @@ test_expect_success 'expire progress off for redirected stderr' ' test_expect_success 'expire force progress on for stderr' ' ( cd dup && - git multi-pack-index --progress expire 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --progress expire 2>err && test_file_not_empty err ) ' @@ -497,7 +497,7 @@ test_expect_success 'expire force progress on for stderr' ' test_expect_success 'expire with the --no-progress option' ' ( cd dup && - git multi-pack-index --no-progress expire 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --no-progress expire 2>err && test_line_count = 0 err ) '
From patchwork Thu Jul 30 22:24:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11693737 Message-Id: <39eb83ad1efe4dc521f5fc3838f47c7cdb507fc7.1596147867.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 30 Jul 2020 22:24:26 +0000 Subject: [PATCH v3 20/20] maintenance: add trace2 regions for task execution Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Signed-off-by: Derrick Stolee --- builtin/gc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/gc.c b/builtin/gc.c index 6ccc0cca19..33e889bc71 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1312,10 +1312,12 @@ static int maintenance_run(void) !tasks[i].auto_condition())) continue; + trace2_region_enter("maintenance", tasks[i].name, r); if (tasks[i].fn()) { error(_("task '%s' failed"), tasks[i].name); result = 1; } + trace2_region_leave("maintenance", tasks[i].name, r); } rollback_lock_file(&lk);