From patchwork Thu Jul 23 17:56:23 2020
X-Patchwork-Submitter: John Passaro via GitGitGadget
X-Patchwork-Id: 11681437
Message-Id: <63ec602a07756a41f8ccddd745562c567a4b3ed7.1595527000.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Date: Thu, 23 Jul 2020 17:56:23 +0000
Subject: [PATCH v2 01/18] maintenance: create basic maintenance runner
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net,
 steadmon@google.com, jrnieder@gmail.com, peff@peff.net,
 congdanhqx@gmail.com, phillip.wood123@gmail.com,
 emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com,
 Derrick Stolee, Derrick Stolee

From: Derrick Stolee

The 'gc' builtin is our current entrypoint for automatically maintaining
a repository.
This one tool does many operations, such as repacking the
repository, packing refs, and rewriting the commit-graph file. The name
implies it performs "garbage collection" which means several different
things, and some users may not want to use this operation that rewrites
the entire object database.

Create a new 'maintenance' builtin that will become a more
general-purpose command. To start, it will only support the 'run'
subcommand, but will later expand to add subcommands for scheduling
maintenance in the background.

For now, the 'maintenance' builtin is a thin shim over the 'gc' builtin.
In fact, the only option is the '--auto' toggle, which is handed
directly to the 'gc' builtin. The current change is isolated to this
simple operation to prevent more interesting logic from being lost in
all of the boilerplate of adding a new builtin.

Use existing builtin/gc.c file because we want to share code between the
two builtins. It is possible that we will have 'maintenance' replace the
'gc' builtin entirely at some point, leaving 'git gc' as an alias for
some specific arguments to 'git maintenance run'.

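Editorial illustration (not part of the patch): the "thin shim" idea
described above can be sketched without any Git internals. The sketch
only shows how a single '--auto' toggle decides the argument list handed
to the 'gc' child. The names shim_opts and build_gc_args are
hypothetical and do not appear in the patch, which uses argv_array and
run_command_v_opt() instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's 'struct maintenance_opts'. */
struct shim_opts {
	int auto_flag;
};

/*
 * Fill 'out' with the NULL-terminated argument list that a shim would
 * pass to the 'gc' child process. Returns the number of arguments
 * written, not counting the NULL terminator. 'cap' is the capacity of
 * 'out' and must leave room for "gc", "--auto" and the terminator.
 */
static size_t build_gc_args(const struct shim_opts *o,
			    const char **out, size_t cap)
{
	size_t n = 0;

	assert(o != NULL && out != NULL && cap >= 3);

	out[n++] = "gc";
	if (o->auto_flag)
		out[n++] = "--auto";
	out[n] = NULL;

	return n;
}
```

With auto_flag unset the list is just "gc"; with it set the list is
"gc", "--auto", which is the pair of cases the new test script checks
for in the trace output.
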
Signed-off-by: Derrick Stolee
---
 .gitignore                        |  1 +
 Documentation/git-maintenance.txt | 57 +++++++++++++++++++++++++++++
 builtin.h                         |  1 +
 builtin/gc.c                      | 59 +++++++++++++++++++++++++++++++
 git.c                             |  1 +
 t/t7900-maintenance.sh            | 22 ++++++++++++
 6 files changed, 141 insertions(+)
 create mode 100644 Documentation/git-maintenance.txt
 create mode 100755 t/t7900-maintenance.sh

diff --git a/.gitignore b/.gitignore
index ee509a2ad2..a5808fa30d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -90,6 +90,7 @@
 /git-ls-tree
 /git-mailinfo
 /git-mailsplit
+/git-maintenance
 /git-merge
 /git-merge-base
 /git-merge-index
diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
new file mode 100644
index 0000000000..34cd2b4417
--- /dev/null
+++ b/Documentation/git-maintenance.txt
@@ -0,0 +1,57 @@
+git-maintenance(1)
+==================
+
+NAME
+----
+git-maintenance - Run tasks to optimize Git repository data
+
+
+SYNOPSIS
+--------
+[verse]
+'git maintenance' run []
+
+
+DESCRIPTION
+-----------
+Run tasks to optimize Git repository data, speeding up other Git commands
+and reducing storage requirements for the repository.
++
+Git commands that add repository data, such as `git add` or `git fetch`,
+are optimized for a responsive user experience. These commands do not take
+time to optimize the Git data, since such optimizations scale with the full
+size of the repository while these user commands each perform a relatively
+small action.
++
+The `git maintenance` command provides flexibility for how to optimize the
+Git repository.
+
+SUBCOMMANDS
+-----------
+
+run::
+	Run one or more maintenance tasks.
+
+TASKS
+-----
+
+gc::
+	Cleanup unnecessary files and optimize the local repository. "GC"
+	stands for "garbage collection," but this task performs many
+	smaller tasks. This task can be rather expensive for large
+	repositories, as it repacks all Git objects into a single pack-file.
+	It can also be disruptive in some situations, as it deletes stale
+	data.
+
+OPTIONS
+-------
+--auto::
+	When combined with the `run` subcommand, run maintenance tasks
+	only if certain thresholds are met. For example, the `gc` task
+	runs when the number of loose objects exceeds the number stored
+	in the `gc.auto` config setting, or when the number of pack-files
+	exceeds the `gc.autoPackLimit` config setting.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..17c1c0ce49 100644
--- a/builtin.h
+++ b/builtin.h
@@ -167,6 +167,7 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix);
 int cmd_ls_remote(int argc, const char **argv, const char *prefix);
 int cmd_mailinfo(int argc, const char **argv, const char *prefix);
 int cmd_mailsplit(int argc, const char **argv, const char *prefix);
+int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
diff --git a/builtin/gc.c b/builtin/gc.c
index 8e0b9cf41b..8d73c77f3a 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -699,3 +699,62 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 
 	return 0;
 }
+
+static const char * const builtin_maintenance_usage[] = {
+	N_("git maintenance run []"),
+	NULL
+};
+
+static struct maintenance_opts {
+	int auto_flag;
+} opts;
+
+static int maintenance_task_gc(void)
+{
+	int result;
+	struct argv_array cmd = ARGV_ARRAY_INIT;
+
+	argv_array_pushl(&cmd, "gc", NULL);
+
+	if (opts.auto_flag)
+		argv_array_pushl(&cmd, "--auto", NULL);
+
+	close_object_store(the_repository->objects);
+	result = run_command_v_opt(cmd.argv, RUN_GIT_CMD);
+	argv_array_clear(&cmd);
+
+	return result;
+}
+
+static int maintenance_run(void)
+{
+	return maintenance_task_gc();
+}
+
+int cmd_maintenance(int argc, const char **argv, const char *prefix)
+{
+	static struct option builtin_maintenance_options[] = {
+		OPT_BOOL(0, "auto", &opts.auto_flag,
+			 N_("run tasks based on the state of the repository")),
+		OPT_END()
+	};
+
+	memset(&opts, 0, sizeof(opts));
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_maintenance_usage,
+				   builtin_maintenance_options);
+
+	argc = parse_options(argc, argv, prefix,
+			     builtin_maintenance_options,
+			     builtin_maintenance_usage,
+			     PARSE_OPT_KEEP_UNKNOWN);
+
+	if (argc == 1) {
+		if (!strcmp(argv[0], "run"))
+			return maintenance_run();
+	}
+
+	usage_with_options(builtin_maintenance_usage,
+			   builtin_maintenance_options);
+}
diff --git a/git.c b/git.c
index 2f021b97f3..ff56d1df24 100644
--- a/git.c
+++ b/git.c
@@ -527,6 +527,7 @@ static struct cmd_struct commands[] = {
 	{ "ls-tree", cmd_ls_tree, RUN_SETUP },
 	{ "mailinfo", cmd_mailinfo, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "mailsplit", cmd_mailsplit, NO_PARSEOPT },
+	{ "maintenance", cmd_maintenance, RUN_SETUP_GENTLY | NO_PARSEOPT },
 	{ "merge", cmd_merge, RUN_SETUP | NEED_WORK_TREE },
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
new file mode 100755
index 0000000000..d00641c4dd
--- /dev/null
+++ b/t/t7900-maintenance.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+test_description='git maintenance builtin'
+
+GIT_TEST_COMMIT_GRAPH=0
+GIT_TEST_MULTI_PACK_INDEX=0
+
+. ./test-lib.sh
+
+test_expect_success 'help text' '
+	test_must_fail git maintenance -h 2>err &&
+	test_i18ngrep "usage: git maintenance run" err
+'
+
+test_expect_success 'gc [--auto]' '
+	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run &&
+	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto &&
+	grep ",\"gc\"]" run-no-auto.txt &&
+	grep ",\"gc\",\"--auto\"]" run-auto.txt
+'
+
+test_done

From patchwork Thu Jul 23 17:56:24 2020
X-Patchwork-Submitter: John Passaro via GitGitGadget
X-Patchwork-Id: 11681439
Message-Id: <1d37e55cb714ea579744d28d1aeb332a63815342.1595527000.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Date: Thu, 23 Jul 2020 17:56:24 +0000
Subject: [PATCH v2 02/18] maintenance: add --quiet option
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net,
 steadmon@google.com, jrnieder@gmail.com, peff@peff.net,
 congdanhqx@gmail.com, phillip.wood123@gmail.com,
 emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com,
 Derrick Stolee, Derrick Stolee

From: Derrick Stolee

Maintenance activities are commonly used as steps in larger scripts.
Providing a '--quiet' option allows those scripts to be less noisy when
run on a terminal window. Turn this mode on by default when stderr is
not a terminal.

Pipe the option to the 'git gc' child process.

Signed-off-by: Derrick Stolee
---
 Documentation/git-maintenance.txt | 3 +++
 builtin/gc.c                      | 7 +++++++
 t/t7900-maintenance.sh            | 8 +++++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt
index 34cd2b4417..089fa4cedc 100644
--- a/Documentation/git-maintenance.txt
+++ b/Documentation/git-maintenance.txt
@@ -52,6 +52,9 @@ OPTIONS
 	in the `gc.auto` config setting, or when the number of pack-files
 	exceeds the `gc.autoPackLimit` config setting.
 
+--quiet::
+	Do not report progress or other information over `stderr`.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/gc.c b/builtin/gc.c
index 8d73c77f3a..c8cde28436 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -707,6 +707,7 @@ static const char * const builtin_maintenance_usage[] = {
 
 static struct maintenance_opts {
 	int auto_flag;
+	int quiet;
 } opts;
 
 static int maintenance_task_gc(void)
@@ -718,6 +719,8 @@ static int maintenance_task_gc(void)
 
 	if (opts.auto_flag)
 		argv_array_pushl(&cmd, "--auto", NULL);
+	if (opts.quiet)
+		argv_array_pushl(&cmd, "--quiet", NULL);
 
 	close_object_store(the_repository->objects);
 	result = run_command_v_opt(cmd.argv, RUN_GIT_CMD);
@@ -736,6 +739,8 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 	static struct option builtin_maintenance_options[] = {
 		OPT_BOOL(0, "auto", &opts.auto_flag,
 			 N_("run tasks based on the state of the repository")),
+		OPT_BOOL(0, "quiet", &opts.quiet,
+			 N_("do not report progress or other information over stderr")),
 		OPT_END()
 	};
 
@@ -745,6 +750,8 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix)
 		usage_with_options(builtin_maintenance_usage,
 				   builtin_maintenance_options);
 
+	opts.quiet = !isatty(2);
+
 	argc = parse_options(argc, argv, prefix,
 			     builtin_maintenance_options,
 			     builtin_maintenance_usage,
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index d00641c4dd..e4e4036e50 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -12,11 +12,13 @@ test_expect_success 'help text' '
 	test_i18ngrep "usage: git maintenance run" err
 '
 
-test_expect_success 'gc [--auto]' '
-	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run &&
+test_expect_success 'gc [--auto|--quiet]' '
+	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run --no-quiet &&
 	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto &&
+	GIT_TRACE2_EVENT="$(pwd)/run-quiet.txt" git maintenance run --quiet &&
 	grep ",\"gc\"]" run-no-auto.txt &&
-	grep ",\"gc\",\"--auto\"]" run-auto.txt
+	grep ",\"gc\",\"--auto\"" run-auto.txt &&
+	grep ",\"gc\",\"--quiet\"" run-quiet.txt
 '
 
 test_done

From patchwork Thu Jul 23 17:56:25 2020
X-Patchwork-Submitter: John Passaro via GitGitGadget
X-Patchwork-Id: 11681441
From: "Derrick Stolee via GitGitGadget"
Date: Thu, 23 Jul 2020 17:56:25 +0000
Subject: [PATCH v2 03/18] maintenance: replace run_auto_gc()
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net,
 steadmon@google.com, jrnieder@gmail.com, peff@peff.net,
 congdanhqx@gmail.com, phillip.wood123@gmail.com,
 emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com,
 Derrick Stolee, Derrick Stolee

From: Derrick Stolee

The run_auto_gc() method is used in several places to trigger a check
for repo maintenance after some Git commands, such as 'git commit' or
'git fetch'.

To allow for extra customization of this maintenance activity, replace
the 'git gc --auto [--quiet]' call with one to 'git maintenance run
--auto [--quiet]'. As we extend the maintenance builtin with other
steps, users will be able to select different maintenance activities.

Rename run_auto_gc() to run_auto_maintenance() to be clearer what is
happening on this call, and to expose all callers in the current diff.

Since 'git fetch' already allows disabling the 'git gc --auto'
subprocess, add an equivalent option with a different name to be more
descriptive of the new behavior: '--[no-]maintenance'. Update the
documentation to include these options at the same time.

Signed-off-by: Derrick Stolee
---
 Documentation/fetch-options.txt | 5 +++--
 Documentation/git-clone.txt     | 7 ++++---
 builtin/am.c                    | 2 +-
 builtin/commit.c                | 2 +-
 builtin/fetch.c                 | 6 ++++--
 builtin/merge.c                 | 2 +-
 builtin/rebase.c                | 4 ++--
 run-command.c                   | 7 +++++--
 run-command.h                   | 2 +-
 t/t5510-fetch.sh                | 2 +-
 10 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 6e2a160a47..d73224844e 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -86,9 +86,10 @@ ifndef::git-pull[]
 	Allow several and arguments to be
 	specified. No s may be specified.
 
+--[no-]maintenance::
 --[no-]auto-gc::
-	Run `git gc --auto` at the end to perform garbage collection
-	if needed. This is enabled by default.
+	Run `git maintenance run --auto` at the end to perform garbage
+	collection if needed. This is enabled by default.
 
 --[no-]write-commit-graph::
 	Write a commit-graph after fetching. This overrides the config
diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index c898310099..aa25aba7d9 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -78,9 +78,10 @@ repository using this option and then delete branches (or use any other
 Git command that makes any existing commit unreferenced) in the source
 repository, some objects may become unreferenced (or dangling).
 These objects may be removed by normal Git operations (such as `git commit`)
-which automatically call `git gc --auto`. (See linkgit:git-gc[1].)
-If these objects are removed and were referenced by the cloned repository,
-then the cloned repository will become corrupt.
+which automatically call `git maintenance run --auto` and `git gc --auto`.
+(See linkgit:git-maintenance[1] and linkgit:git-gc[1].) If these objects
+are removed and were referenced by the cloned repository, then the cloned
+repository will become corrupt.
 +
 Note that running `git repack` without the `--local` option in a repository
 cloned with `--shared` will copy objects from the source repository into a pack
diff --git a/builtin/am.c b/builtin/am.c
index 69e50de018..ff895125f6 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1795,7 +1795,7 @@ static void am_run(struct am_state *state, int resume)
 	if (!state->rebasing) {
 		am_destroy(state);
 		close_object_store(the_repository->objects);
-		run_auto_gc(state->quiet);
+		run_auto_maintenance(state->quiet);
 	}
 }
 
diff --git a/builtin/commit.c b/builtin/commit.c
index d1b7396052..658b158659 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1702,7 +1702,7 @@ int cmd_commit(int argc, const char **argv, const char *prefix)
 	git_test_write_commit_graph_or_die();
 
 	repo_rerere(the_repository, 0);
-	run_auto_gc(quiet);
+	run_auto_maintenance(quiet);
 	run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
 	if (amend && !no_post_rewrite) {
 		commit_post_rewrite(the_repository, current_head, &oid);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 82ac4be8a5..49a4d727d4 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -196,8 +196,10 @@ static struct option builtin_fetch_options[] = {
 	OPT_STRING_LIST(0, "negotiation-tip", &negotiation_tip, N_("revision"),
 			N_("report that we have only objects reachable from this object")),
 	OPT_PARSE_LIST_OBJECTS_FILTER(&filter_options),
+	OPT_BOOL(0, "maintenance", &enable_auto_gc,
+		 N_("run 'maintenance --auto' after fetching")),
 	OPT_BOOL(0, "auto-gc", &enable_auto_gc,
-		 N_("run 'gc --auto' after fetching")),
+		 N_("run 'maintenance --auto' after fetching")),
 	OPT_BOOL(0, "show-forced-updates", &fetch_show_forced_updates,
 		 N_("check for forced-updates on all updated branches")),
 	OPT_BOOL(0, "write-commit-graph", &fetch_write_commit_graph,
@@ -1882,7 +1884,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	close_object_store(the_repository->objects);
 
 	if (enable_auto_gc)
-		run_auto_gc(verbosity < 0);
+		run_auto_maintenance(verbosity < 0);
 
 	return result;
 }
diff --git a/builtin/merge.c b/builtin/merge.c
index 7da707bf55..c068e73037 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -457,7 +457,7 @@ static void finish(struct commit *head_commit,
 			 * user should see them.
 			 */
 			close_object_store(the_repository->objects);
-			run_auto_gc(verbosity < 0);
+			run_auto_maintenance(verbosity < 0);
 		}
 	}
 	if (new_head && show_diffstat) {
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 37ba76ac3d..0c4ee98f08 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -728,10 +728,10 @@ static int finish_rebase(struct rebase_options *opts)
 	apply_autostash(state_dir_path("autostash", opts));
 	close_object_store(the_repository->objects);
 	/*
-	 * We ignore errors in 'gc --auto', since the
+	 * We ignore errors in 'git maintenance run --auto', since the
 	 * user should see them.
 	 */
-	run_auto_gc(!(opts->flags & (REBASE_NO_QUIET|REBASE_VERBOSE)));
+	run_auto_maintenance(!(opts->flags & (REBASE_NO_QUIET|REBASE_VERBOSE)));
 	if (opts->type == REBASE_MERGE) {
 		struct replay_opts replay = REPLAY_OPTS_INIT;
 
diff --git a/run-command.c b/run-command.c
index 9b3a57d1e3..82ad241638 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1865,14 +1865,17 @@ int run_processes_parallel_tr2(int n, get_next_task_fn get_next_task,
 	return result;
 }
 
-int run_auto_gc(int quiet)
+int run_auto_maintenance(int quiet)
 {
 	struct argv_array argv_gc_auto = ARGV_ARRAY_INIT;
 	int status;
 
-	argv_array_pushl(&argv_gc_auto, "gc", "--auto", NULL);
+	argv_array_pushl(&argv_gc_auto, "maintenance", "run", "--auto", NULL);
 	if (quiet)
 		argv_array_push(&argv_gc_auto, "--quiet");
+	else
+		argv_array_push(&argv_gc_auto, "--no-quiet");
+
 	status = run_command_v_opt(argv_gc_auto.argv, RUN_GIT_CMD);
 	argv_array_clear(&argv_gc_auto);
 	return status;
diff --git a/run-command.h b/run-command.h
index 191dfcdafe..d9a800e700 100644
--- a/run-command.h
+++ b/run-command.h
@@ -221,7 +221,7 @@ int run_hook_ve(const char *const *env, const char *name, va_list args);
 /*
  * Trigger an auto-gc
  */
-int run_auto_gc(int quiet);
+int run_auto_maintenance(int quiet);
 
 #define RUN_COMMAND_NO_STDIN 1
 #define RUN_GIT_CMD 2 /*If this is to be git sub-command */
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index a66dbe0bde..9850ecde5d 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -919,7 +919,7 @@ test_expect_success 'fetching with auto-gc does not lock up' '
 		git config fetch.unpackLimit 1 &&
 		git config gc.autoPackLimit 1 &&
 		git config gc.autoDetach false &&
-		GIT_ASK_YESNO="$D/askyesno" git fetch >fetch.out 2>&1 &&
+		GIT_ASK_YESNO="$D/askyesno" git fetch --verbose >fetch.out 2>&1 &&
 		test_i18ngrep "Auto packing the repository" fetch.out &&
 		! grep "Should I try again" fetch.out
 	)

From patchwork Thu Jul 23 17:56:26 2020
X-Patchwork-Submitter: John Passaro via GitGitGadget
X-Patchwork-Id: 11681443
Message-Id: <8e260bccf1a0b6cd799a6bc78798b31ebed8ad7e.1595527000.git.gitgitgadget@gmail.com>
From: "Derrick Stolee via GitGitGadget"
Date: Thu, 23 Jul 2020 17:56:26 +0000
Subject: [PATCH v2 04/18] maintenance: initialize task array
To: git@vger.kernel.org
Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net,
 steadmon@google.com, jrnieder@gmail.com, peff@peff.net,
 congdanhqx@gmail.com, phillip.wood123@gmail.com,
 emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com,
 Derrick Stolee, Derrick Stolee

From: Derrick Stolee

In anticipation of implementing multiple maintenance tasks inside the
'maintenance' builtin, use a list of structs to describe the work to be
done. The struct maintenance_task stores the name of the task (as given by a future command-line argument) along with a function pointer to its implementation and a boolean for whether the step is enabled. A list of pointers to these structs is initialized with the full list of implemented tasks along with a default order. For now, this list only contains the "gc" task. This task is also the only task enabled by default. Signed-off-by: Derrick Stolee --- builtin/gc.c | 39 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/builtin/gc.c b/builtin/gc.c index c8cde28436..c28fb0b16d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -700,6 +700,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } +#define MAX_NUM_TASKS 1 + static const char * const builtin_maintenance_usage[] = { N_("git maintenance run []"), NULL @@ -729,9 +731,43 @@ static int maintenance_task_gc(void) return result; } +typedef int maintenance_task_fn(void); + +struct maintenance_task { + const char *name; + maintenance_task_fn *fn; + unsigned enabled:1; +}; + +static struct maintenance_task *tasks[MAX_NUM_TASKS]; +static int num_tasks; + static int maintenance_run(void) { - return maintenance_task_gc(); + int i; + int result = 0; + + for (i = 0; !result && i < num_tasks; i++) { + if (!tasks[i]->enabled) + continue; + result = tasks[i]->fn(); + } + + return result; +} + +static void initialize_tasks(void) +{ + int i; + num_tasks = 0; + + for (i = 0; i < MAX_NUM_TASKS; i++) + tasks[i] = xcalloc(1, sizeof(struct maintenance_task)); + + tasks[num_tasks]->name = "gc"; + tasks[num_tasks]->fn = maintenance_task_gc; + tasks[num_tasks]->enabled = 1; + num_tasks++; } int cmd_maintenance(int argc, const char **argv, const char *prefix) @@ -751,6 +787,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) builtin_maintenance_options); opts.quiet = !isatty(2); + initialize_tasks(); argc = parse_options(argc, argv, prefix,
builtin_maintenance_options, From patchwork Thu Jul 23 17:56:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681445 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CBBED138A for ; Thu, 23 Jul 2020 17:56:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AE77A20792 for ; Thu, 23 Jul 2020 17:56:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kGFygM0y" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730234AbgGWR4v (ORCPT ); Thu, 23 Jul 2020 13:56:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730122AbgGWR4s (ORCPT ); Thu, 23 Jul 2020 13:56:48 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7068CC0619DC for ; Thu, 23 Jul 2020 10:56:48 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id w3so6003276wmi.4 for ; Thu, 23 Jul 2020 10:56:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+StNxTftResFg6PGYXwExKE98V/MPeVmwc/+SCNg7NE=; b=kGFygM0yilH2YmscjlAt99yOxmfYbdVfbzBo2aOeT5b92BWuGEkpj6ZioMH5Zm46R1 7nidIUgi5uhGGdt5xG2FM5bJNCwl0lwkx91e3pseEncfAO29uq1Y4KRVjDRJ/OYSggYE XGYOvcjAudS4HV4f+VqJL7O282sPBKzAvQufLjphlUKz7KL/faF462d+OCNF2mlxKUbB tB3sRYYsSyNJZ390awdsuYTpkSW70AJI2OoJ7OIdLzbDg+z93IeeyDpqJSqAaZnYrwD5 2SgOBFbjgAkfHXagRMkcUa6EdOuAwA+sp9WZyrEzXp0esbaqZlvh/4vIxCjlgOZ5YTV/ ENdA== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+StNxTftResFg6PGYXwExKE98V/MPeVmwc/+SCNg7NE=; b=OCCM0//OTjjZ1DEbyfWfIS/r37Nmcz7bxlAySr+Xg0g82TAvoCCIaTSBGpNZjDSf0z P0ia7+jaXs/XEX2+P0UTGG22U0Aay9SkeeNt5MXmlRzGLBvZpfJWnwuhlf9VU99gcwgJ h779XkHi0oeHdB2UeKzmnaDim8yb4Vt6H4Wkwvam3wNZTqy4kon4i1qlTCRk3NQAX72k Xo8krt7mxErEfc5N0srOlDuiai4vKewCMsVqXr4H6qGeYLxqMkndyCtHnSaaVgjAR6Pr C0jRIjlJdJCRHRPFAdMaNgdnLAUc2O4D2yTsSkWGew13y8iKXnpA9SqP//oKo2EK/H8q FNfA== X-Gm-Message-State: AOAM5321BYdM1hXlMfdXQoZhC5QY9m0fFfYg4W716IBAJ9VBLkwlU3rX AFT80iUO9iubC5TAx1XOZWCx63dP X-Google-Smtp-Source: ABdhPJwBBTRWWn4lTAKP6LOYUj2ELUw33ALcOmcHG8N8izvG7CzVmFqfBKYVgarjvOtST9fpkzDG0Q== X-Received: by 2002:a1c:a756:: with SMTP id q83mr4990507wme.168.1595527006788; Thu, 23 Jul 2020 10:56:46 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j145sm4734594wmj.7.2020.07.23.10.56.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:46 -0700 (PDT) Message-Id: <04552b1d2ed751a11eb7c50f6898cbc078b552b4.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:27 +0000 Subject: [PATCH v2 05/18] maintenance: add commit-graph task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The first new task in the 'git maintenance' builtin is the 'commit-graph' job. 
It is based on the sequence of events in the 'commit-graph' job in Scalar [1]. This sequence is as follows:

1. git commit-graph write --reachable --split
2. git commit-graph verify --shallow
3. If the verify succeeds, stop.
4. Delete the commit-graph-chain file.
5. git commit-graph write --reachable --split

By writing an incremental commit-graph file using the "--split" option we minimize the disruption from this operation. The default behavior is to merge layers until the new "top" layer is less than half the size of the layer below. This provides quick writes most of the time, with the longer writes following a power law distribution.

Most importantly, concurrent Git processes only look at the commit-graph-chain file for a very short amount of time, so they will very likely not be holding a handle to the file when we try to replace it. (This only matters on Windows.)

If a concurrent process reads the old commit-graph-chain file, but our job expires some of the .graph files before they can be read, then those processes will see a warning message (but not fail). This could be avoided by a future update to use the --expire-time argument when writing the commit-graph.

By using 'git commit-graph verify --shallow' we can ensure that the file we just wrote is valid. This is an extra safety precaution that is faster than our 'write' subcommand. In the rare situation that the newest layer of the commit-graph is corrupt, we can "fix" the corruption by deleting the commit-graph-chain file and rewriting the full commit-graph as a new one-layer commit graph. This does not completely prevent _that_ file from being corrupt, but it does recompute the commit-graph by parsing commits from the object database.

In our use of this step in Scalar and VFS for Git, we have only seen this issue arise because our microsoft/git fork reverted 43d3561 ("commit-graph write: don't die if the existing graph is corrupt" 2019-03-25) for a while to keep commit-graph writes very fast.
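The five-step sequence can be sketched as plain control flow. This is a minimal illustration, not the code from the patch: `struct fake_repo` and the three helper functions are hypothetical stand-ins for the real 'git commit-graph' invocations and the chain-file removal, so that the logic runs on its own.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a repository. The counters replace the real
 * 'git commit-graph' subprocesses so the control flow can be exercised.
 */
struct fake_repo {
	int writes;             /* number of "write --reachable --split" runs */
	int verify_failures;    /* how many upcoming verifies should fail */
	int chain_file_present; /* 1 while commit-graph-chain exists */
};

/* Steps 1 and 5: models 'git commit-graph write --reachable --split'. */
static int write_split(struct fake_repo *r)
{
	r->writes++;
	r->chain_file_present = 1;
	return 0;
}

/* Step 2: models 'git commit-graph verify --shallow'; nonzero means failure. */
static int verify_shallow(struct fake_repo *r)
{
	if (r->verify_failures > 0) {
		r->verify_failures--;
		return 1;
	}
	return 0;
}

/* Step 4: models deleting the commit-graph-chain file. */
static int delete_chain(struct fake_repo *r)
{
	r->chain_file_present = 0;
	return 0;
}

/* Returns 0 on success and 1 on failure, like the task functions in the series. */
static int commit_graph_task(struct fake_repo *r)
{
	assert(r != NULL);

	if (write_split(r))
		return 1;
	if (!verify_shallow(r))
		return 0; /* step 3: the new layer verified, so stop */

	fprintf(stderr, "verify failed, rewriting from scratch\n");
	if (delete_chain(r))
		return 1;
	return write_split(r) ? 1 : 0;
}
```

In the common case only one write happens; the second write is reached only after a failed verify, which is why the recovery path can afford to be a full rewrite.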
We dropped the revert when updating to v2.23.0. The verify still has potential for catching corrupt data across the layer boundary: if the new file has commit X with parent Y in an old file but the commit ID for Y in the old file had a bitswap, then we will notice that in the 'verify' command. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/CommitGraphStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 18 ++++++++ builtin/gc.c | 74 ++++++++++++++++++++++++++++++- commit-graph.c | 8 ++-- commit-graph.h | 1 + t/t7900-maintenance.sh | 2 +- 5 files changed, 97 insertions(+), 6 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 089fa4cedc..35b0be7d40 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -35,6 +35,24 @@ run:: TASKS ----- +commit-graph:: + The `commit-graph` job updates the `commit-graph` files incrementally, + then verifies that the written data is correct. If the new layer has an + issue, then the chain file is removed and the `commit-graph` is + rewritten from scratch. ++ +The verification only checks the top layer of the `commit-graph` chain. +If the incremental write merged the new commits with at least one +existing layer, then there is potential for on-disk corruption being +carried forward into the new file. This will be noticed and the new +commit-graph file will be clean as Git reparses the commit data from +the object database. ++ +The incremental write is safe to run alongside concurrent Git processes +since it will not expire `.graph` files that were in the previous +`commit-graph-chain` file. They will be deleted by a later run based on +the expiration delay. + gc:: Cleanup unnecessary files and optimize the local repository. 
"GC" stands for "garbage collection," but this task performs many diff --git a/builtin/gc.c b/builtin/gc.c index c28fb0b16d..2cd17398ec 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -700,7 +700,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } -#define MAX_NUM_TASKS 1 +#define MAX_NUM_TASKS 2 static const char * const builtin_maintenance_usage[] = { N_("git maintenance run []"), @@ -712,6 +712,74 @@ static struct maintenance_opts { int quiet; } opts; +static int run_write_commit_graph(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + + argv_array_pushl(&cmd, "commit-graph", "write", + "--split", "--reachable", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + argv_array_clear(&cmd); + + return result; +} + +static int run_verify_commit_graph(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + + argv_array_pushl(&cmd, "commit-graph", "verify", + "--shallow", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + argv_array_clear(&cmd); + + return result; +} + +static int maintenance_task_commit_graph(void) +{ + struct repository *r = the_repository; + char *chain_path; + + /* Skip commit-graph when --auto is specified. 
*/ + if (opts.auto_flag) + return 0; + + close_object_store(r->objects); + if (run_write_commit_graph()) { + error(_("failed to write commit-graph")); + return 1; + } + + if (!run_verify_commit_graph()) + return 0; + + warning(_("commit-graph verify caught error, rewriting")); + + chain_path = get_commit_graph_chain_filename(r->objects->odb); + if (unlink(chain_path)) { + UNLEAK(chain_path); + die(_("failed to remove commit-graph at %s"), chain_path); + } + free(chain_path); + + if (!run_write_commit_graph()) + return 0; + + error(_("failed to rewrite commit-graph")); + return 1; +} + static int maintenance_task_gc(void) { int result; @@ -768,6 +836,10 @@ static void initialize_tasks(void) tasks[num_tasks]->fn = maintenance_task_gc; tasks[num_tasks]->enabled = 1; num_tasks++; + + tasks[num_tasks]->name = "commit-graph"; + tasks[num_tasks]->fn = maintenance_task_commit_graph; + num_tasks++; } int cmd_maintenance(int argc, const char **argv, const char *prefix) diff --git a/commit-graph.c b/commit-graph.c index fdd1c4fa7c..57278a9ab5 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -172,7 +172,7 @@ static char *get_split_graph_filename(struct object_directory *odb, oid_hex); } -static char *get_chain_filename(struct object_directory *odb) +char *get_commit_graph_chain_filename(struct object_directory *odb) { return xstrfmt("%s/info/commit-graphs/commit-graph-chain", odb->path); } @@ -520,7 +520,7 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r, struct stat st; struct object_id *oids; int i = 0, valid = 1, count; - char *chain_name = get_chain_filename(odb); + char *chain_name = get_commit_graph_chain_filename(odb); FILE *fp; int stat_res; @@ -1635,7 +1635,7 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) } if (ctx->split) { - char *lock_name = get_chain_filename(ctx->odb); + char *lock_name = get_commit_graph_chain_filename(ctx->odb); hold_lock_file_for_update_mode(&lk, lock_name, LOCK_DIE_ON_ERROR, 0444); 
@@ -2012,7 +2012,7 @@ static void expire_commit_graphs(struct write_commit_graph_context *ctx) if (ctx->split_opts && ctx->split_opts->expire_time) expire_time = ctx->split_opts->expire_time; if (!ctx->split) { - char *chain_file_name = get_chain_filename(ctx->odb); + char *chain_file_name = get_commit_graph_chain_filename(ctx->odb); unlink(chain_file_name); free(chain_file_name); ctx->num_commit_graphs_after = 0; diff --git a/commit-graph.h b/commit-graph.h index 28f89cdf3e..3c202748c3 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -25,6 +25,7 @@ struct commit; struct bloom_filter_settings; char *get_commit_graph_filename(struct object_directory *odb); +char *get_commit_graph_chain_filename(struct object_directory *odb); int open_commit_graph(const char *graph_file, int *fd, struct stat *st); /* diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index e4e4036e50..216ac0b19e 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -12,7 +12,7 @@ test_expect_success 'help text' ' test_i18ngrep "usage: git maintenance run" err ' -test_expect_success 'gc [--auto|--quiet]' ' +test_expect_success 'run [--auto|--quiet]' ' GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run --no-quiet && GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && GIT_TRACE2_EVENT="$(pwd)/run-quiet.txt" git maintenance run --quiet && From patchwork Thu Jul 23 17:56:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681449 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4784A722 for ; Thu, 23 Jul 2020 17:56:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2F91E20792 for ; Thu, 23 Jul 2020 17:56:56 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="GIyrU8RY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730246AbgGWR4z (ORCPT ); Thu, 23 Jul 2020 13:56:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730176AbgGWR4t (ORCPT ); Thu, 23 Jul 2020 13:56:49 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E126C0619DC for ; Thu, 23 Jul 2020 10:56:49 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id j18so5740980wmi.3 for ; Thu, 23 Jul 2020 10:56:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=CUbjUytCK5g/a7uEnKFWBVCBMHu0YhuF13HTppvxZ8I=; b=GIyrU8RYYZsLDBYgOFdioa2krejG1mjK7L0sCC5OHLkxV0yfD4SPqlczINXs6HG7Je TzYyfYW85pBCP21gqhx9SmAYXoMUd2dmY4e4lINkELNFuzUywInl8zCx2Io20F6d9K1W VJCFUg43/LaR+ImsaU/KJStz/xYoE6z9b9E/Hy+d4wQhAUULXWANyG40kaUHcMEHy55U j+USialNehHXBD7F/pEqHg8vo1wFX7ANDX6mevJ6Ipa8kfBNq0u1Zo7FtCZLyEXbi0jL xBqnTRPuPKiCKTA74vJXlPYw7tK9ZP7ZqhoJl9mH00v+MpTYGw7buhahkvo2yO6vWDb1 GKfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=CUbjUytCK5g/a7uEnKFWBVCBMHu0YhuF13HTppvxZ8I=; b=plW8WzdRaPGhsCvUnHzFzKRRLvR9Rl97eYMvuZpg2IvmwfmeVeVTxJX83uDcz6Ryf2 2c86TPDdS2E7clJHbcdgqA59VyqsujwLsdvNGq51PUhhPQyycIEJ8ezxunO+gQHuqRgM 2kqv29RJbeic3C20HhM3U/j9/FKY74F9kP1bnSPrP+KerQG91hERI3Cacf95yRNrvox6 C94aNklhZf3nk8rII221OrL+eTMN9NrLbWq9uStRYlWcDgIX5H20jZZUOG2vasK/Dys7 Qd3kveP4FM1Z1hhCGKsBresfnSWlCGlSAhoysGJLUL3HI+/6f9Y2R0Wyxok1GJgM9bY7 z1Sg== X-Gm-Message-State: 
AOAM533UhhmKsytJBD1va7rpsHWLRDG1PRGWG0g5MA2imipp0l4zgPCF tASp9ie636tiLGaBoU0uNlhilPZb X-Google-Smtp-Source: ABdhPJz6wMRk+ghH8j5fxcNMXEx/PuCUpCnVBLnllMKRD5B5MenuGTgbiZMzHvvtNav6bCgOXkBb5g== X-Received: by 2002:a1c:7315:: with SMTP id d21mr4961242wmb.108.1595527007903; Thu, 23 Jul 2020 10:56:47 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w2sm4681564wrs.77.2020.07.23.10.56.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:47 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:28 +0000 Subject: [PATCH v2 06/18] maintenance: add --task option Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee A user may want to only run certain maintenance tasks in a certain order. Add the --task= option, which allows a user to specify an ordered list of tasks to run. These cannot be run multiple times, however. Here is where our array of maintenance_task pointers becomes critical. We can sort the array of pointers based on the task order, but we do not want to move the struct data itself in order to preserve the hashmap references. We use the hashmap to match the --task= arguments into the task struct data. 
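The selection logic described above can be sketched in isolation. This is an illustration under stated assumptions, not the patch's code: the names `struct task`, `select_task`, `compare_by_order` and `sort_tasks` are hypothetical, task names are matched with plain `strcmp`, and tasks are sorted in ascending order of selection. Because the sorted array holds pointers, `qsort` hands the comparator pointers to those pointers, so the comparator dereferences once before reading the struct.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct task {
	const char *name;
	int order;          /* 1-based position among the --task arguments */
	unsigned selected:1;
};

#define NUM_TASKS 2

/* The struct data stays put; only the pointer array below is reordered. */
static struct task storage[NUM_TASKS] = {
	{ "gc", 0, 0 },
	{ "commit-graph", 0, 0 },
};
static struct task *tasks[NUM_TASKS] = { &storage[0], &storage[1] };
static int tasks_selected;

/* Returns 0 on success, 1 for a missing, unknown, or repeated task name. */
static int select_task(const char *arg)
{
	struct task *found = NULL;
	int i;

	if (!arg || !*arg) {
		fprintf(stderr, "--task requires a value\n");
		return 1;
	}
	for (i = 0; i < NUM_TASKS; i++) {
		if (!strcmp(tasks[i]->name, arg)) {
			found = tasks[i];
			break;
		}
	}
	if (!found) {
		fprintf(stderr, "'%s' is not a valid task\n", arg);
		return 1;
	}
	if (found->selected) {
		fprintf(stderr, "task '%s' cannot be selected multiple times\n", arg);
		return 1;
	}

	/* Duplicates are rejected above, so the count cannot exceed NUM_TASKS. */
	assert(tasks_selected < NUM_TASKS);
	found->selected = 1;
	found->order = ++tasks_selected;
	return 0;
}

/* qsort passes pointers to array elements, i.e. pointers to task pointers. */
static int compare_by_order(const void *a_, const void *b_)
{
	const struct task *a = *(const struct task *const *)a_;
	const struct task *b = *(const struct task *const *)b_;

	return a->order - b->order;
}

static void sort_tasks(void)
{
	qsort(tasks, NUM_TASKS, sizeof(tasks[0]), compare_by_order);
}
```

Unselected tasks keep order 0 and sort to the front in this sketch; a run loop that skips entries without the `selected` bit would pass over them.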
Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 4 ++ builtin/gc.c | 64 ++++++++++++++++++++++++++++++- t/t7900-maintenance.sh | 23 +++++++++++ 3 files changed, 89 insertions(+), 2 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 35b0be7d40..9204762e21 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -73,6 +73,10 @@ OPTIONS --quiet:: Do not report progress or other information over `stderr`. +--task=:: + If this option is specified one or more times, then only run the + specified tasks in the specified order. + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/gc.c b/builtin/gc.c index 2cd17398ec..c58dea6fa5 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -710,6 +710,7 @@ static const char * const builtin_maintenance_usage[] = { static struct maintenance_opts { int auto_flag; int quiet; + int tasks_selected; } opts; static int run_write_commit_graph(void) @@ -804,20 +805,38 @@ typedef int maintenance_task_fn(void); struct maintenance_task { const char *name; maintenance_task_fn *fn; - unsigned enabled:1; + int task_order; + unsigned enabled:1, + selected:1; }; static struct maintenance_task *tasks[MAX_NUM_TASKS]; static int num_tasks; +static int compare_tasks_by_selection(const void *a_, const void *b_) +{ + const struct maintenance_task *a, *b; + a = (const struct maintenance_task *)a_; + b = (const struct maintenance_task *)b_; + + return b->task_order - a->task_order; +} + static int maintenance_run(void) { int i; int result = 0; + if (opts.tasks_selected) + QSORT(tasks, num_tasks, compare_tasks_by_selection); + for (i = 0; !result && i < num_tasks; i++) { - if (!tasks[i]->enabled) + if (opts.tasks_selected && !tasks[i]->selected) + continue; + + if (!opts.tasks_selected && !tasks[i]->enabled) continue; + result = tasks[i]->fn(); } @@ -842,6 +861,44 @@ static void initialize_tasks(void) num_tasks++; } +static int task_option_parse(const 
struct option *opt, + const char *arg, int unset) +{ + int i; + struct maintenance_task *task = NULL; + + BUG_ON_OPT_NEG(unset); + + if (!arg || !strlen(arg)) { + error(_("--task requires a value")); + return 1; + } + + opts.tasks_selected++; + + for (i = 0; i < MAX_NUM_TASKS; i++) { + if (tasks[i] && !strcasecmp(tasks[i]->name, arg)) { + task = tasks[i]; + break; + } + } + + if (!task) { + error(_("'%s' is not a valid task"), arg); + return 1; + } + + if (task->selected) { + error(_("task '%s' cannot be selected multiple times"), arg); + return 1; + } + + task->selected = 1; + task->task_order = opts.tasks_selected; + + return 0; +} + int cmd_maintenance(int argc, const char **argv, const char *prefix) { static struct option builtin_maintenance_options[] = { @@ -849,6 +906,9 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) N_("run tasks based on the state of the repository")), OPT_BOOL(0, "quiet", &opts.quiet, N_("do not report progress or other information over stderr")), + OPT_CALLBACK_F(0, "task", NULL, N_("task"), + N_("run a specific task"), + PARSE_OPT_NONEG, task_option_parse), OPT_END() }; diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 216ac0b19e..c09a9eb90b 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -21,4 +21,27 @@ test_expect_success 'run [--auto|--quiet]' ' grep ",\"gc\",\"--quiet\"" run-quiet.txt ' +test_expect_success 'run --task=' ' + GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph && + GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" git maintenance run --task=gc && + GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph && + GIT_TRACE2_EVENT="$(pwd)/run-both.txt" git maintenance run --task=commit-graph --task=gc && + ! grep ",\"gc\"" run-commit-graph.txt && + grep ",\"gc\"" run-gc.txt && + grep ",\"gc\"" run-both.txt && + grep ",\"commit-graph\",\"write\"" run-commit-graph.txt && + ! 
grep ",\"commit-graph\",\"write\"" run-gc.txt && + grep ",\"commit-graph\",\"write\"" run-both.txt +' + +test_expect_success 'run --task=bogus' ' + test_must_fail git maintenance run --task=bogus 2>err && + test_i18ngrep "is not a valid task" err +' + +test_expect_success 'run --task duplicate' ' + test_must_fail git maintenance run --task=gc --task=gc 2>err && + test_i18ngrep "cannot be selected multiple times" err +' + test_done From patchwork Thu Jul 23 17:56:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681447 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6BD34138A for ; Thu, 23 Jul 2020 17:56:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 50DB52086A for ; Thu, 23 Jul 2020 17:56:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mZ5jeTrW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730240AbgGWR4y (ORCPT ); Thu, 23 Jul 2020 13:56:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730222AbgGWR4u (ORCPT ); Thu, 23 Jul 2020 13:56:50 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B70DC0619E2 for ; Thu, 23 Jul 2020 10:56:50 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id r2so918474wrs.8 for ; Thu, 23 Jul 2020 10:56:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc 
:content-transfer-encoding:mime-version:to:cc; bh=Yvp+ThwhNNGXj4tRo41TmDa3Hns/sGK8I20VbFp0Ap8=; b=mZ5jeTrW9rZR6kEqeOTXpr9fFslESJ1yPBXi82K3THK9FR7B0PySJY06X+vsl3+JS4 zSry8UVsGumhi3Ks4Q1Uz5lDJ7jl9onLBMN3y3WHPKoSqcJm2aRdJz3DEieKh+GomnsR PZWQ5kOVxTPlf55hXTmG5oaDB2Mf99ooILVAoVUI7cWWeo5uB1WolPsUJ25m+vSw55+h q/Rsp2mpSqNsdLTSZVYJ6XKjGV50iv64+p4tMhcHCKn4PcLAvct2ejJQ89j0tFMILYp9 NuEpR5IVcoz5lsXqzGRtYf8Y7OFGvldlE5K70WfEk8Gy0KkOAWgIPPGH3DkWdXEjy34Y O5Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Yvp+ThwhNNGXj4tRo41TmDa3Hns/sGK8I20VbFp0Ap8=; b=LkYT2D20wHgzTXEtJBCjxqa/VGSEHIIlSU5ibxO802s9Hw0zDlDcLo/T89zfVZoLWg pq+UyPbDvYpKrX+u0fQuzNL0AytPcbnHQOPoHEkqRDr5PoKsnZxpUT1QN0LB8OY3TSlY QJUNNLOqp2SF8ZTvTlW5dZe6pAU6lafMpYAhdlkclKqwdXc+81IW9YrGrlIzI4YSCOhq Znp/+KIsHbRKRJ8ZRQ0SBPkkljZ6akRI1ULuLWN4je/VjjsQ1xmbUWo9A21dzCHX12zY oZLY33ds/Jozaj6JDOHgXghWN5tYJRqBGQSPcU7MPTHIuXP1LaaeJOmYnXYwGy04MMQB HBYA== X-Gm-Message-State: AOAM532Ih5aTRy1I3s+qjHF+i6HdwUJ2uVvY1xSaHEuvp6Nm537AhDrK MTAP5vW3XTuHanrXpt9hJfLTvcne X-Google-Smtp-Source: ABdhPJyJns398vV2iRuGBhpHEiA7yHtUUX/SuiVHR0qabLomOR6xbPKw+Bwu8hYhQMkKDuvaBboRzQ== X-Received: by 2002:adf:db86:: with SMTP id u6mr5248512wri.27.1595527008942; Thu, 23 Jul 2020 10:56:48 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s8sm4476947wru.38.2020.07.23.10.56.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:48 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:29 +0000 Subject: [PATCH v2 07/18] maintenance: take a lock on the objects directory Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, 
phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Performing maintenance on a Git repository involves writing data to the .git directory, which is not safe to do with multiple writers attempting the same operation. Ensure that only one 'git maintenance' process is running at a time by holding a file-based lock. The mere presence of the maintenance.lock file in the objects directory (typically .git/objects/maintenance.lock) prevents another maintenance process from starting. This lock is never committed, since it does not represent meaningful data. Instead, it is only a placeholder. If the lock file already exists, then skip maintenance and exit successfully, reporting the existing lock only when neither --auto nor --quiet is given. This will become very important later when we implement the 'fetch' task, as this is our safeguard against creating a recursive process loop between 'git fetch' and 'git maintenance run'. Signed-off-by: Derrick Stolee --- builtin/gc.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/builtin/gc.c b/builtin/gc.c index c58dea6fa5..5d99b4b805 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -826,6 +826,25 @@ static int maintenance_run(void) { int i; int result = 0; + struct lock_file lk; + struct repository *r = the_repository; + char *lock_path = xstrfmt("%s/maintenance", r->objects->odb->path); + + if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0) { + /* + * Another maintenance command is running. + * + * If --auto was provided, then it is likely due to a + * recursive process stack. Do not report an error in + * that case.
+ */ + if (!opts.auto_flag && !opts.quiet) + error(_("lock file '%s' exists, skipping maintenance"), + lock_path); + free(lock_path); + return 0; + } + free(lock_path); if (opts.tasks_selected) QSORT(tasks, num_tasks, compare_tasks_by_selection); @@ -840,6 +859,7 @@ static int maintenance_run(void) result = tasks[i]->fn(); } + rollback_lock_file(&lk); return result; } From patchwork Thu Jul 23 17:56:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681451 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03331722 for ; Thu, 23 Jul 2020 17:56:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D93562086A for ; Thu, 23 Jul 2020 17:56:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YpnEjeDn" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730251AbgGWR44 (ORCPT ); Thu, 23 Jul 2020 13:56:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730122AbgGWR4w (ORCPT ); Thu, 23 Jul 2020 13:56:52 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9823FC0619E3 for ; Thu, 23 Jul 2020 10:56:51 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id j18so5741070wmi.3 for ; Thu, 23 Jul 2020 10:56:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=J5JAIL788qaocK3RQu2EiuORUvNYhdMI8XoJUFNK7zk=; 
b=YpnEjeDnXjteJlsuMNVoCxWc8aQppcTn+8q4QuQh9L7iV765wLQqgfBLmTrREpE75L T3mgZuPtu852+kMy+t9IwiAxnGeH4VSEz2bc+8Gw1fBr1m9MYc7w0z+vjqS5zGjpFeQk HUPeOTChIfrzwYjfQsRYl7XwjKV5TxSWxka2LMeIy5sTg7xU+NXyakdBWVxgiCInbFNX 0C8WUw0YLpXMyVsd91j0ZwyI0K1GzwPUesdc7Kwn0vYC3jBW13V3YAp0Rs2EOb1lNkMa Mizkqh8SzlTbbuiGZnPyQf6/b9+xeROKZUg6C6szxRjrbQseJc34brl0OKrDBzD6AxVt 6VpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=J5JAIL788qaocK3RQu2EiuORUvNYhdMI8XoJUFNK7zk=; b=V1S5Y7bs5vzdfc8rLnz5n55r2y7UKECqLX/Yhh0Oe0rdo+jLNgRPXGUgJindRz/vLb KqMHD0A54KNQXCe3C2FAfdwxm/MMxnug6V87ePjZWSltHcLV383iXb65Q46GtFPR3AdE k5nOJ7pfq9IEWPJpr8Ftu4zChSr7NVa4kJqlmJyPE67jWV6WQr+KQXL1RgfvoF7A7017 HT9MvpuULHng/RYpLca+60LgAyDJiFAABjZ2iYCkyqtlbd2z/xt6i2S2anhro0OGWKIu mmkX9C06ZKeZjgKB5AaH2pM6S1OrE8RoIa1NyazOR2AlUvnlmjYbmSL2pyDwAs5l4yRF vU+Q== X-Gm-Message-State: AOAM531Tpcip2PcrPFJcb/SUQjAgUMvLdO5p3kfGYXJ7TsZn31PhmTb/ pM2GJPBq822jZ9AiOOnuzEAlBUXC X-Google-Smtp-Source: ABdhPJykQfwHA+ziEtB0gm1u1Sk8u4XAsEyQljX7Y2tQc1NEMk3agvi6hlvMiafA73HnJHqwCWw80w== X-Received: by 2002:a05:600c:2058:: with SMTP id p24mr5148107wmg.74.1595527009809; Thu, 23 Jul 2020 10:56:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i66sm990336wma.35.2020.07.23.10.56.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:49 -0700 (PDT) Message-Id: <3165b8916d2d80bf72dac6596a42c871ccd4cbe6.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:30 +0000 Subject: [PATCH v2 08/18] maintenance: add prefetch task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, 
emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee When working with very large repositories, an incremental 'git fetch' command can download a large amount of data. If there are many other users pushing to a common repo, then this data can rival the initial pack-file size of a 'git clone' of a medium-size repo. Users may want to keep the data on their local repos as close as possible to the data on the remote repos by fetching periodically in the background. This can break up a large daily fetch into several smaller hourly fetches. The task is called "prefetch" because it is work done in advance of a foreground fetch to make that 'git fetch' command much faster. However, if we simply ran 'git fetch ' in the background, then the user running a foreground 'git fetch ' would lose some important feedback when a new branch appears or an existing branch updates. This is especially true if a remote branch is force-updated and this isn't noticed by the user because it occurred in the background. Further, the functionality of 'git push --force-with-lease' becomes suspect. When running 'git fetch ' in the background, use the following options for careful updating: 1. --no-tags prevents getting a new tag when a user wants to see the new tags appear in their foreground fetches. 2. --refmap= removes the configured refspec which usually updates refs/remotes//* with the refs advertised by the remote. 3. By adding a new refspec "+refs/heads/*:refs/prefetch//*" we can ensure that we actually load the new values somewhere in our refspace while not updating refs/heads or refs/remotes. By storing these refs here, the commit-graph job will update the commit-graph with the commits from these hidden refs. 4. --prune will delete the refs/prefetch/ refs that no longer appear on the remote.
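The four options above all end up on one 'git fetch' command line. The following is a minimal sketch of how the refspec from item 3 could be formatted for a given remote; it mirrors the format string used by fetch_remote() in the diff below, but build_prefetch_refspec() is a hypothetical helper written for illustration and is not part of Git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): format the prefetch refspec
 * for one remote so that fetched branches are stored under
 * refs/prefetch/<remote>/ instead of refs/remotes/<remote>/.
 *
 * Returns 0 on success, or -1 if the buffer is too small.
 */
static int build_prefetch_refspec(char *buf, size_t len, const char *remote)
{
	int n;

	assert(buf != NULL && remote != NULL);

	/* The leading '+' allows non-fast-forward (forced) updates. */
	n = snprintf(buf, len, "+refs/heads/*:refs/prefetch/%s/*", remote);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a remote named "origin" (an assumed name), the resulting string would be passed after the other options, roughly as: git fetch origin --prune --no-tags --refmap= "+refs/heads/*:refs/prefetch/origin/*".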
We've been using this step as a critical background job in Scalar [1] (and VFS for Git). This solved a pain point that was showing up in user reports: fetching was a pain! Users do not like waiting to download the data that was created while they were away from their machines. After implementing background fetch, the foreground fetch commands sped up significantly because they mostly just update refs and download a small amount of new data. The effect is especially dramatic when paired with --no-show-forced-updates (through fetch.showForcedUpdates=false). [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 12 ++++++ builtin/gc.c | 64 ++++++++++++++++++++++++++++++- t/t7900-maintenance.sh | 24 ++++++++++++ 3 files changed, 99 insertions(+), 1 deletion(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 9204762e21..0927643247 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -53,6 +53,18 @@ since it will not expire `.graph` files that were in the previous `commit-graph-chain` file. They will be deleted by a later run based on the expiration delay. +prefetch:: + The `prefetch` task updates the object directory with the latest objects + from all registered remotes. For each remote, a `git fetch` command + is run. The refmap is custom to avoid updating local or remote + branches (those in `refs/heads` or `refs/remotes`). Instead, the + remote refs are stored in `refs/prefetch//`. Also, tags are + not updated. ++ +This means that foreground fetches are still required to update the +remote refs, but the user is notified when the branches and tags are +updated on the remote. + gc:: Cleanup unnecessary files and optimize the local repository.
"GC" stands for "garbage collection," but this task performs many diff --git a/builtin/gc.c b/builtin/gc.c index 5d99b4b805..969c127877 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -28,6 +28,7 @@ #include "blob.h" #include "tree.h" #include "promisor-remote.h" +#include "remote.h" #define FAILED_RUN "failed to run %s" @@ -700,7 +701,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } -#define MAX_NUM_TASKS 2 +#define MAX_NUM_TASKS 3 static const char * const builtin_maintenance_usage[] = { N_("git maintenance run []"), @@ -781,6 +782,63 @@ static int maintenance_task_commit_graph(void) return 1; } +static int fetch_remote(const char *remote) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + struct strbuf refmap = STRBUF_INIT; + + argv_array_pushl(&cmd, "fetch", remote, "--prune", + "--no-tags", "--refmap=", NULL); + + strbuf_addf(&refmap, "+refs/heads/*:refs/prefetch/%s/*", remote); + argv_array_push(&cmd, refmap.buf); + + if (opts.quiet) + argv_array_push(&cmd, "--quiet"); + + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + + strbuf_release(&refmap); + return result; +} + +static int fill_each_remote(struct remote *remote, void *cbdata) +{ + struct string_list *remotes = (struct string_list *)cbdata; + + string_list_append(remotes, remote->name); + return 0; +} + +static int maintenance_task_prefetch(void) +{ + int result = 0; + struct string_list_item *item; + struct string_list remotes = STRING_LIST_INIT_DUP; + + if (for_each_remote(fill_each_remote, &remotes)) { + error(_("failed to fill remotes")); + result = 1; + goto cleanup; + } + + /* + * Do not modify the result based on the success of the 'fetch' + * operation, as a loss of network could cause 'fetch' to fail + * quickly. We do not want that to stop the rest of our + * background operations. 
+ */ + for (item = remotes.items; + item && item < remotes.items + remotes.nr; + item++) + fetch_remote(item->string); + +cleanup: + string_list_clear(&remotes, 0); + return result; +} + static int maintenance_task_gc(void) { int result; @@ -871,6 +929,10 @@ static void initialize_tasks(void) for (i = 0; i < MAX_NUM_TASKS; i++) tasks[i] = xcalloc(1, sizeof(struct maintenance_task)); + tasks[num_tasks]->name = "prefetch"; + tasks[num_tasks]->fn = maintenance_task_prefetch; + num_tasks++; + tasks[num_tasks]->name = "gc"; tasks[num_tasks]->fn = maintenance_task_gc; tasks[num_tasks]->enabled = 1; diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index c09a9eb90b..8b04a04c79 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -44,4 +44,28 @@ test_expect_success 'run --task duplicate' ' test_i18ngrep "cannot be selected multiple times" err ' +test_expect_success 'run --task=prefetch with no remotes' ' + git maintenance run --task=prefetch 2>err && + test_must_be_empty err +' + +test_expect_success 'prefetch multiple remotes' ' + git clone . clone1 && + git clone . 
clone2 && + git remote add remote1 "file://$(pwd)/clone1" && + git remote add remote2 "file://$(pwd)/clone2" && + git -C clone1 switch -c one && + git -C clone2 switch -c two && + test_commit -C clone1 one && + test_commit -C clone2 two && + GIT_TRACE2_EVENT="$(pwd)/run-prefetch.txt" git maintenance run --task=prefetch && + grep ",\"fetch\",\"remote1\"" run-prefetch.txt && + grep ",\"fetch\",\"remote2\"" run-prefetch.txt && + test_path_is_missing .git/refs/remotes && + test_cmp clone1/.git/refs/heads/one .git/refs/prefetch/remote1/one && + test_cmp clone2/.git/refs/heads/two .git/refs/prefetch/remote2/two && + git log prefetch/remote1/one && + git log prefetch/remote2/two +' + test_done From patchwork Thu Jul 23 17:56:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681453 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C4CBC138A for ; Thu, 23 Jul 2020 17:56:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A74392086A for ; Thu, 23 Jul 2020 17:56:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ETvN8o+B" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730254AbgGWR44 (ORCPT ); Thu, 23 Jul 2020 13:56:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53300 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730235AbgGWR4w (ORCPT ); Thu, 23 Jul 2020 13:56:52 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A085C0619E4 for ; Thu, 23 Jul 2020 10:56:52 -0700 (PDT) Received: by 
mail-wm1-x32c.google.com with SMTP id j18so5741098wmi.3 for ; Thu, 23 Jul 2020 10:56:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=X2/Mp4NmeVTSZlWokHiXCu/uyB8hsODRAgoDS5oc5vg=; b=ETvN8o+BhSKTvo+kcPmh2bfkTi482MRu1FLYjRzN066WlbgRzMhjcq2q2lfUfLrf9W zQTTJWDHUKjxi55x8GPaOZ2s5O/el7a8E5L1UOJBoWMP288xzmmaMx11aWi/dtL1M8BC a6NrbRdmMsal2YlBS02xm+Vi9D9k5IpSHqom/p5aXu7RrAFXhQIKS9hTA4hOsqznBbTi hBPXfEIEiDLNzFlgYhPaPtOl3KRUNox3mABVrc1nMnaRj1Z0eIK/7V9tqhu0fHMNJh0q kWdJzSeTj4SnQ+XqCyHz65zaoplyhe+OhuPCT9US9BVD6IXimndnWPwinu/KauyM7tSM 85Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=X2/Mp4NmeVTSZlWokHiXCu/uyB8hsODRAgoDS5oc5vg=; b=OvdMOWJOvayr5Ikrf4GusO4bzH8R5y5EO+CfLCXRbATkCk1/F+ZUD4w2bF7UFMXcuD RR20A2IhNJ2EApC484/HkY8MYkZRMSmuaxrQvsxi5B9Ks4tJtSLiofSSG5UkmYJczdnZ cKQeAjU3HIlZZQOo+9cNP3oxEiXIllz4W/GZrBdYTnBs5edfECundAvFB5gwuW7g+FZ1 Um4y6bYVH/GEHB9LkDveFJKQKi+4qSNI4pW1WyZ4Y7rhTmo8Xy26kQjoag0Pk71z8H8O 7k936uvhvOyaXeCimKDFvLTDyBHkf5TVkf143ca9eJ/0WkjvkNnfhO1SPb97OrNWL7d9 BuOQ== X-Gm-Message-State: AOAM533ODvyhSHDjJNI5al7b5AjfDPdpaU3soduIZrFtytL0o0sWVJHb ZhXmVRBarplhUx52sY/Vc03gTuUb X-Google-Smtp-Source: ABdhPJxhB50AixM/MqAiM+ZwQeE6VUV3EggHHRnqkm8rhHyKaGrpJ2XpgWVoEALJPwsPHbQY9yTTmA== X-Received: by 2002:a1c:4c0e:: with SMTP id z14mr5127905wmf.54.1595527010820; Thu, 23 Jul 2020 10:56:50 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h5sm5117331wrc.97.2020.07.23.10.56.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:50 -0700 (PDT) Message-Id: <83648f48655ba68126110018d81c1d2e2bcc7a6f.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" 
Date: Thu, 23 Jul 2020 17:56:31 +0000 Subject: [PATCH v2 09/18] maintenance: add loose-objects task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee One goal of background maintenance jobs is to allow a user to disable auto-gc (gc.auto=0) but keep their repository in a clean state. Without any cleanup, loose objects will clutter the object database and slow operations. In addition, the loose objects will take up extra space because they are not stored with deltas against similar objects. Create a 'loose-objects' task for the 'git maintenance run' command. This helps clean up loose objects without disrupting concurrent Git commands using the following sequence of events: 1. Run 'git prune-packed' to delete any loose objects that exist in a pack-file. Concurrent commands will prefer the packed version of the object to the loose version. (Of course, there are exceptions for commands that specifically care about the location of an object. These are rare for a user to run on purpose, and we hope a user that has selected background maintenance will not be trying to do foreground maintenance.) 2. Run 'git pack-objects' on a batch of loose objects. These objects are grouped by scanning the loose object directories in lexicographic order until listing all loose objects -or- reaching 50,000 objects. This is more than enough if the loose objects are created only by a user doing normal development. We noticed users with _millions_ of loose objects because VFS for Git downloads blobs on-demand when a file read operation requires populating a virtual file. 
This has potential of happening in partial clones if someone runs 'git grep' or otherwise evades the batch-download feature for requesting promisor objects. This step is based on a similar step in Scalar [1] and VFS for Git. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 11 ++++ builtin/gc.c | 106 +++++++++++++++++++++++++++++- t/t7900-maintenance.sh | 35 ++++++++++ 3 files changed, 151 insertions(+), 1 deletion(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 0927643247..557915a653 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -73,6 +73,17 @@ gc:: It can also be disruptive in some situations, as it deletes stale data. +loose-objects:: + The `loose-objects` job cleans up loose objects and places them into + pack-files. In order to prevent race conditions with concurrent Git + commands, it follows a two-step process. First, it deletes any loose + objects that already exist in a pack-file; concurrent Git processes + will examine the pack-file for the object data instead of the loose + object. Second, it creates a new pack-file (starting with "loose-") + containing a batch of loose objects. The batch size is limited to 50 + thousand objects to prevent the job from taking too long on a + repository with many loose objects. 
+ OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index 969c127877..fa65304580 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -701,7 +701,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } -#define MAX_NUM_TASKS 3 +#define MAX_NUM_TASKS 4 static const char * const builtin_maintenance_usage[] = { N_("git maintenance run []"), @@ -858,6 +858,106 @@ static int maintenance_task_gc(void) return result; } +static int prune_packed(void) +{ + struct argv_array cmd = ARGV_ARRAY_INIT; + argv_array_pushl(&cmd, "prune-packed", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--quiet"); + + return run_command_v_opt(cmd.argv, RUN_GIT_CMD); +} + +struct write_loose_object_data { + FILE *in; + int count; + int batch_size; +}; + +static int loose_object_exists(const struct object_id *oid, + const char *path, + void *data) +{ + return 1; +} + +static int write_loose_object_to_stdin(const struct object_id *oid, + const char *path, + void *data) +{ + struct write_loose_object_data *d = (struct write_loose_object_data *)data; + + fprintf(d->in, "%s\n", oid_to_hex(oid)); + + return ++(d->count) > d->batch_size; +} + +static int pack_loose(void) +{ + struct repository *r = the_repository; + int result = 0; + struct write_loose_object_data data; + struct strbuf prefix = STRBUF_INIT; + struct child_process *pack_proc; + + /* + * Do not start pack-objects process + * if there are no loose objects. 
+ */ + if (!for_each_loose_file_in_objdir(r->objects->odb->path, + loose_object_exists, + NULL, NULL, NULL)) + return 0; + + pack_proc = xmalloc(sizeof(*pack_proc)); + + child_process_init(pack_proc); + + strbuf_addstr(&prefix, r->objects->odb->path); + strbuf_addstr(&prefix, "/pack/loose"); + + argv_array_pushl(&pack_proc->args, "git", "pack-objects", NULL); + if (opts.quiet) + argv_array_push(&pack_proc->args, "--quiet"); + argv_array_push(&pack_proc->args, prefix.buf); + + pack_proc->in = -1; + + if (start_command(pack_proc)) { + error(_("failed to start 'git pack-objects' process")); + result = 1; + goto cleanup; + } + + data.in = xfdopen(pack_proc->in, "w"); + data.count = 0; + data.batch_size = 50000; + + for_each_loose_file_in_objdir(r->objects->odb->path, + write_loose_object_to_stdin, + NULL, + NULL, + &data); + + fclose(data.in); + + if (finish_command(pack_proc)) { + error(_("failed to finish 'git pack-objects' process")); + result = 1; + } + +cleanup: + strbuf_release(&prefix); + free(pack_proc); + return result; +} + +static int maintenance_task_loose_objects(void) +{ + return prune_packed() || pack_loose(); +} + typedef int maintenance_task_fn(void); struct maintenance_task { @@ -933,6 +1033,10 @@ static void initialize_tasks(void) tasks[num_tasks]->fn = maintenance_task_prefetch; num_tasks++; + tasks[num_tasks]->name = "loose-objects"; + tasks[num_tasks]->fn = maintenance_task_loose_objects; + num_tasks++; + tasks[num_tasks]->name = "gc"; tasks[num_tasks]->fn = maintenance_task_gc; tasks[num_tasks]->enabled = 1; diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 8b04a04c79..94bb493733 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -68,4 +68,39 @@ test_expect_success 'prefetch multiple remotes' ' git log prefetch/remote2/two ' +test_expect_success 'loose-objects task' ' + # Repack everything so we know the state of the object dir + git repack -adk && + + # Hack to stop maintenance from running during "git commit" + 
echo in use >.git/objects/maintenance.lock && + test_commit create-loose-object && + rm .git/objects/maintenance.lock && + + ls .git/objects >obj-dir-before && + test_file_not_empty obj-dir-before && + ls .git/objects/pack/*.pack >packs-before && + test_line_count = 1 packs-before && + + # The first run creates a pack-file + # but does not delete loose objects. + git maintenance run --task=loose-objects && + ls .git/objects >obj-dir-between && + test_cmp obj-dir-before obj-dir-between && + ls .git/objects/pack/*.pack >packs-between && + test_line_count = 2 packs-between && + + # The second run deletes loose objects + # but does not create a pack-file. + git maintenance run --task=loose-objects && + ls .git/objects >obj-dir-after && + cat >expect <<-\EOF && + info + pack + EOF + test_cmp expect obj-dir-after && + ls .git/objects/pack/*.pack >packs-after && + test_cmp packs-between packs-after +' + test_done From patchwork Thu Jul 23 17:56:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681457 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA1BE722 for ; Thu, 23 Jul 2020 17:56:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C6EC3206E3 for ; Thu, 23 Jul 2020 17:56:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZO6bNkCz" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730258AbgGWR46 (ORCPT ); Thu, 23 Jul 2020 13:56:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730236AbgGWR4x (ORCPT ); Thu, 23 Jul 2020 13:56:53 -0400 Received: from 
mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79477C0619E5 for ; Thu, 23 Jul 2020 10:56:53 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id r4so3048061wrx.9 for ; Thu, 23 Jul 2020 10:56:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=4V4l1GzGZeC37+TbBBvEXaX6tBlf7EXxL2Zke8V3noE=; b=ZO6bNkCzkLJL8DPreSk9AMeVZSLMpV+y0nywSAZHFGkDH7VWg+JE6vMq9dffNA0Oke 5e71KOUKEwXkwRo6GCzhGHR6zD56hFQLhp6hFREpGpLVy2pV9lq5Hr+7pK6x9Rir6TVt FQtaWkFVg9UF4T23enPkRhZc3SGmuso2+QQ2pjWosD411h956cvinMBNZgbQlCGkQbzp 0xhXbg2QZ+RMY7oAm7/iYcinkYW83xl+6WnQoRUOfkc6C/9Nt51+oDzsi31FWyR8kM0Z TcuhVjzFj9hJxiSn8rk8LZPAZd+nmeFm0ZqzCB5BNVdB6EL0CwLsPm3vSkSbIvdNOEkC MNXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=4V4l1GzGZeC37+TbBBvEXaX6tBlf7EXxL2Zke8V3noE=; b=NAh8s46WfHXm80f4/Ys4msZZVOsofeUIUNDdrd8eRTJKZHOOXvwg/awYNBZXHWSO/9 JP8C+anlrUrpHRood53A7dQMc4yVf3LjzyvtoYQ5Cl/pT1wguqWduR9KGyhUzn/RRVUq 5JnKy9/ONK/ApWi3QuogLUgTcXBSMmlzunPIQ5N7VjV4N38jm6vRXM3AdzfoVmSKTra/ PqZsjtfe3dbPuzGATui+xEvf/GhzsZlqSgEznknyCNFDsfUFcCOMAtuo3iHVl9C0ZD1b oViC3ozLcWl6KKkfFRQkB2JAYuJ8yVcppWTHNQVe8I5xFGRUzYwCvutKznkn08jlVjuQ 8S3w== X-Gm-Message-State: AOAM531zogdeJnKbVzpmQEnOqIRuID60RxvB2gw4ke6lOqgIqihMqG0W 3rFNt72rc4oZl3gZg8e1Vja6O9zJ X-Google-Smtp-Source: ABdhPJzeuJTtXK/wpiWmB43tDHuQHIkJX+YEBO+FuIl8s2acb1wR5BWW0nZA1jdbu1GAM0HHCSztjQ== X-Received: by 2002:adf:8024:: with SMTP id 33mr5472585wrk.117.1595527011708; Thu, 23 Jul 2020 10:56:51 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id e5sm5044835wrc.37.2020.07.23.10.56.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 
bits=256/256); Thu, 23 Jul 2020 10:56:51 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:32 +0000 Subject: [PATCH v2 10/18] maintenance: add incremental-repack task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The previous change cleaned up loose objects using the 'loose-objects' task that can be run safely in the background. Add a similar job that performs similar cleanups for pack-files. One issue with running 'git repack' is that it is designed to repack all pack-files into a single pack-file. While this is the most space-efficient way to store object data, it is not time or memory efficient. This becomes extremely important if the repo is so large that a user struggles to store two copies of the pack on their disk. Instead, perform an "incremental" repack by collecting a few small pack-files into a new pack-file. The multi-pack-index has facilitated this process ever since 'git multi-pack-index expire' was added in 19575c7 (multi-pack-index: implement 'expire' subcommand, 2019-06-10) and 'git multi-pack-index repack' was added in ce1e4a1 (midx: implement midx_repack(), 2019-06-10). The 'incremental-repack' task runs the following steps: 1. 'git multi-pack-index write' creates a multi-pack-index file if one did not exist, and otherwise will update the multi-pack-index with any new pack-files that appeared since the last write. This is particularly relevant with the background fetch job. When the multi-pack-index sees two copies of the same object, it stores the offset data into the newer pack-file.
This means that some old pack-files could become "unreferenced," which I will use to mean "a pack-file that is in the pack-file list of the multi-pack-index but none of the objects in the multi-pack-index reference a location inside that pack-file." 2. 'git multi-pack-index expire' deletes any unreferenced pack-files and updates the multi-pack-index to drop those pack-files from the list. This is safe to do as concurrent Git processes will see the multi-pack-index and not open those packs when looking for object contents. (Similar to the 'loose-objects' job, there are some Git commands that open pack-files regardless of the multi-pack-index, but they are rarely used. Further, a user that self-selects to use background operations would likely refrain from using those commands.) 3. 'git multi-pack-index repack --batch-size=' collects a set of pack-files that are listed in the multi-pack-index and creates a new pack-file containing the objects whose offsets are listed by the multi-pack-index to be in those pack-files. The set of pack-files is selected greedily by sorting the pack-files by modified time and adding a pack-file to the set if its "expected size" is smaller than the batch size until the total expected size of the selected pack-files is at least the batch size. The "expected size" is calculated by taking the size of the pack-file divided by the number of objects in the pack-file and multiplied by the number of objects from the multi-pack-index with offset in that pack-file. The expected size approximates how much data from that pack-file will contribute to the resulting pack-file size. The intention is that the resulting pack-file will be close in size to the provided batch size. The next run of the incremental-repack task will delete these repacked pack-files during the 'expire' step. In this version, the batch size is set to "0" which ignores the size restrictions when selecting the pack-files.
It instead selects all pack-files and repacks all packed objects into a single pack-file. This will be updated in the next change, but it requires doing some calculations that are better isolated to a separate change. Each of the above steps updates the multi-pack-index file. After each step, we verify the new multi-pack-index. If the new multi-pack-index is corrupt, then delete the multi-pack-index, rewrite it from scratch, and stop doing the later steps of the job. This is intended to be an extra-safe check without leaving a repo with many pack-files without a multi-pack-index. These steps are based on a similar background maintenance step in Scalar (and VFS for Git) [1]. This was incredibly effective for users of the Windows OS repository. After using the same VFS for Git repository for over a year, some users had _thousands_ of pack-files that combined to up to 250 GB of data. We noticed a few users were running into the open file descriptor limits (due in part to a bug in the multi-pack-index fixed by af96fe3 (midx: add packs to packed_git linked list, 2019-04-29)). These pack-files were mostly small since they contained the commits and trees that were pushed to the origin in a given hour. The GVFS protocol includes a "prefetch" step that asks for pre-computed pack-files containing commits and trees by timestamp. These pack-files were grouped into "daily" pack-files once a day for up to 30 days. If a user did not request prefetch packs for over 30 days, then they would get the entire history of commits and trees in a new, large pack-file. This led to a large number of pack-files that had poor delta compression. By running this pack-file maintenance step once per day, these repos with thousands of packs spanning 200+ GB dropped to dozens of pack-files spanning 30-50 GB. This was all done without removing objects from the system and using a constant batch size of two gigabytes.
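To make the role of the batch size concrete, the "expected size" selection described in step 3 can be sketched as follows. This is an illustration written from the prose description only, not the code in midx.c: struct pack_info, expected_size() and select_batch() are hypothetical names, the array is assumed to be already sorted by modified time, the special batch size of 0 is not modeled, and integer overflow is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one pack-file listed in the multi-pack-index. */
struct pack_info {
	uint64_t pack_size;          /* size of the pack-file in bytes */
	uint32_t num_objects;        /* objects stored in the pack-file */
	uint32_t referenced_objects; /* objects whose midx offset points here */
	int selected;                /* set to 1 if chosen for the batch */
};

/* pack size / objects in pack * objects referenced by the midx */
static uint64_t expected_size(const struct pack_info *p)
{
	if (!p->num_objects)
		return 0;
	/* Multiply first to limit rounding loss from integer division. */
	return p->pack_size * p->referenced_objects / p->num_objects;
}

/*
 * Greedy selection over packs[] (assumed sorted by modified time):
 * take each pack whose expected size is below batch_size until the
 * running total reaches batch_size. Returns the total expected size.
 */
static uint64_t select_batch(struct pack_info *packs, size_t nr,
			     uint64_t batch_size)
{
	uint64_t total = 0;
	size_t i;

	assert(packs != NULL || nr == 0);

	for (i = 0; i < nr; i++)
		packs[i].selected = 0;

	for (i = 0; i < nr && total < batch_size; i++) {
		uint64_t size = expected_size(&packs[i]);

		if (size >= batch_size)
			continue; /* too large to join this batch */
		packs[i].selected = 1;
		total += size;
	}
	return total;
}
```

In this sketch a caller can compare the returned total with the batch size: when the small pack-files together do not add up to the batch size, there is little to gain from repacking them yet.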
Once the work was done to reduce the pack-files to small sizes, the batch size of two gigabytes means that not every run triggers a repack operation, so the following run will not expire a pack-file. This has kept these repos in a "clean" state. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/PackfileMaintenanceStep.cs Signed-off-by: Derrick Stolee --- Documentation/git-maintenance.txt | 15 ++++ builtin/gc.c | 121 +++++++++++++++++++++++++++++- midx.c | 2 +- midx.h | 1 + t/t7900-maintenance.sh | 37 +++++++++ 5 files changed, 174 insertions(+), 2 deletions(-) diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 557915a653..bda8df4aaa 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -84,6 +84,21 @@ loose-objects:: thousand objects to prevent the job from taking too long on a repository with many loose objects. +incremental-repack:: + The `incremental-repack` job repacks the object directory + using the `multi-pack-index` feature. In order to prevent race + conditions with concurrent Git commands, it follows a two-step + process. First, it deletes any pack-files included in the + `multi-pack-index` where none of the objects in the + `multi-pack-index` reference those pack-files; this only happens + if all objects in the pack-file are also stored in a newer + pack-file. Second, it selects a group of pack-files whose "expected + size" is below the batch size until the group has total expected + size at least the batch size; see the `--batch-size` option for + the `repack` subcommand in linkgit:git-multi-pack-index[1]. The + default batch-size is zero, which is a special case that attempts + to repack all pack-files into a single pack-file. 
+ OPTIONS ------- --auto:: diff --git a/builtin/gc.c b/builtin/gc.c index fa65304580..eb4b01c104 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -29,6 +29,7 @@ #include "tree.h" #include "promisor-remote.h" #include "remote.h" +#include "midx.h" #define FAILED_RUN "failed to run %s" @@ -701,7 +702,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) return 0; } -#define MAX_NUM_TASKS 4 +#define MAX_NUM_TASKS 5 static const char * const builtin_maintenance_usage[] = { N_("git maintenance run []"), @@ -958,6 +959,120 @@ static int maintenance_task_loose_objects(void) return prune_packed() || pack_loose(); } +static int multi_pack_index_write(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + argv_array_pushl(&cmd, "multi-pack-index", "write", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + argv_array_clear(&cmd); + + return result; +} + +static int rewrite_multi_pack_index(void) +{ + struct repository *r = the_repository; + char *midx_name = get_midx_filename(r->objects->odb->path); + + unlink(midx_name); + free(midx_name); + + if (multi_pack_index_write()) { + error(_("failed to rewrite multi-pack-index")); + return 1; + } + + return 0; +} + +static int multi_pack_index_verify(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + argv_array_pushl(&cmd, "multi-pack-index", "verify", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + argv_array_clear(&cmd); + + return result; +} + +static int multi_pack_index_expire(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + argv_array_pushl(&cmd, "multi-pack-index", "expire", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + close_object_store(the_repository->objects); + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + argv_array_clear(&cmd); + + return result; +} + +static int 
multi_pack_index_repack(void) +{ + int result; + struct argv_array cmd = ARGV_ARRAY_INIT; + argv_array_pushl(&cmd, "multi-pack-index", "repack", NULL); + + if (opts.quiet) + argv_array_push(&cmd, "--no-progress"); + + argv_array_push(&cmd, "--batch-size=0"); + + close_object_store(the_repository->objects); + result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + + if (result && multi_pack_index_verify()) { + warning(_("multi-pack-index verify failed after repack")); + result = rewrite_multi_pack_index(); + } + + return result; +} + +static int maintenance_task_incremental_repack(void) +{ + if (multi_pack_index_write()) { + error(_("failed to write multi-pack-index")); + return 1; + } + + if (multi_pack_index_verify()) { + warning(_("multi-pack-index verify failed after initial write")); + return rewrite_multi_pack_index(); + } + + if (multi_pack_index_expire()) { + error(_("multi-pack-index expire failed")); + return 1; + } + + if (multi_pack_index_verify()) { + warning(_("multi-pack-index verify failed after expire")); + return rewrite_multi_pack_index(); + } + + if (multi_pack_index_repack()) { + error(_("multi-pack-index repack failed")); + return 1; + } + + return 0; +} + typedef int maintenance_task_fn(void); struct maintenance_task { @@ -1037,6 +1152,10 @@ static void initialize_tasks(void) tasks[num_tasks]->fn = maintenance_task_loose_objects; num_tasks++; + tasks[num_tasks]->name = "incremental-repack"; + tasks[num_tasks]->fn = maintenance_task_incremental_repack; + num_tasks++; + tasks[num_tasks]->name = "gc"; tasks[num_tasks]->fn = maintenance_task_gc; tasks[num_tasks]->enabled = 1; diff --git a/midx.c b/midx.c index 6d1584ca51..57a8a00082 100644 --- a/midx.c +++ b/midx.c @@ -36,7 +36,7 @@ #define PACK_EXPIRED UINT_MAX -static char *get_midx_filename(const char *object_dir) +char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } diff --git a/midx.h b/midx.h index b18cf53bc4..baeecc70c9 100644 --- a/midx.h 
+++ b/midx.h @@ -37,6 +37,7 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) +char *get_midx_filename(const char *object_dir); struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 94bb493733..3ec813979a 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -103,4 +103,41 @@ test_expect_success 'loose-objects task' ' test_cmp packs-between packs-after ' +test_expect_success 'incremental-repack task' ' + packDir=.git/objects/pack && + for i in $(test_seq 1 5) + do + test_commit $i || return 1 + done && + + # Create three disjoint pack-files with size BIG, small, small. + echo HEAD~2 | git pack-objects --revs $packDir/test-1 && + test_tick && + git pack-objects --revs $packDir/test-2 <<-\EOF && + HEAD~1 + ^HEAD~2 + EOF + test_tick && + git pack-objects --revs $packDir/test-3 <<-\EOF && + HEAD + ^HEAD~1 + EOF + rm -f $packDir/pack-* && + rm -f $packDir/loose-* && + ls $packDir/*.pack >packs-before && + test_line_count = 3 packs-before && + + # the job repacks the two into a new pack, but does not + # delete the old ones. + git maintenance run --task=incremental-repack && + ls $packDir/*.pack >packs-between && + test_line_count = 4 packs-between && + + # the job deletes the two old packs, and does not write + # a new one because only one pack remains. 
+ git maintenance run --task=incremental-repack && + ls .git/objects/pack/*.pack >packs-after && + test_line_count = 1 packs-after +' + test_done From patchwork Thu Jul 23 17:56:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681459 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3FAD413B4 for ; Thu, 23 Jul 2020 17:57:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2196620792 for ; Thu, 23 Jul 2020 17:57:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="C7f0Zl/8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730266AbgGWR47 (ORCPT ); Thu, 23 Jul 2020 13:56:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53308 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730237AbgGWR4y (ORCPT ); Thu, 23 Jul 2020 13:56:54 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE0E7C0619E6 for ; Thu, 23 Jul 2020 10:56:53 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id c80so5760297wme.0 for ; Thu, 23 Jul 2020 10:56:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=sNVSGvF3mvcx9l7i3Lj3EAi8J/egmMgbQkQ+jRiIAWA=; b=C7f0Zl/8kr5IcuCFIZ0ltvVNad0JqF1oKR1yEFb/ZB7YVmHt2UFAP0m5FOk3jG5Y5P zBdH32DQlFTyhVgLSOPK9AAfAx/MCHr1+vT3TR1FHw9uyc8Ze+wDe4YANdYl0zRNtXmD U36EPx+eM7T/Dh7wC7eRrKFkf/fPKaZGxeQ2Ihwst6h6c9osEq1fDxWzesKv0WGlUg7s 
425FZa9PMJf3tad/H8ak+PhNC0x9xo1y6TeXBvAgFZDBUtnr15BemCO4lp0r70qEHy/X Gh9ciwKHe3juNNpglVA7p8pxHsGLW/Eo49gABhOhKFSzOgdD3iNIFG6GJWnfZlWXTF3Q B/nQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=sNVSGvF3mvcx9l7i3Lj3EAi8J/egmMgbQkQ+jRiIAWA=; b=HcKlz1++IbdnnlRCnWEmEzkP5tcvGeOHLb/PtbtJWrVwC59+CW8OiOMXbzOnPLsKu/ NStRXMD2Nk84P/idYBlaj8PrqT3phfF6fpH7+p3/UAM3usnSHg7egYP6goYyCgWJoiIa GZP53oKOFG3IkPYEo9O37C12dIE6FwHiMY6z5GuA6fFq1qCf19d1us2aWMVmbw6tcjtl Q3m8ofI/N+qOVkruV0Mt8ZyPPFs83ubTmeMhcE8x1yB4d12PCcYvywg04tXeciRNQB0J G7HL15+1B2cdh2bPifP+eJvsPAiJvxQSId37I03omSZTqMpx4d38fqJFjgb9Dv2+Ou8e On6w== X-Gm-Message-State: AOAM531vBRCMIBtJyRK+2DfZpQijkOQF0nyVsFPRYXqYck+9vDzkth1M Cml/tS46g+GwDvrNSRV1qP8x04qs X-Google-Smtp-Source: ABdhPJwwGq1a+BRBu7rZFpGeMn9WwGmfPO6xy6I+e4W8NtGwaiVqUkVaAaEC4E2zRTcPJIT6kKt/WQ== X-Received: by 2002:a1c:770f:: with SMTP id t15mr5069384wmi.22.1595527012391; Thu, 23 Jul 2020 10:56:52 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x1sm4677979wrp.10.2020.07.23.10.56.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:52 -0700 (PDT) Message-Id: <478c7f1d0b858755c2c4b98605405214910b6f4c.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:33 +0000 Subject: [PATCH v2 11/18] maintenance: auto-size incremental-repack batch Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick 
Stolee

When repacking during the 'incremental-repack' task, we use the --batch-size option in 'git multi-pack-index repack'. The initial setting used --batch-size=0 to repack everything into a single pack-file. This is not sustainable for a large repository. The amount of work required is also likely to use too many system resources for a background job.

Update the 'incremental-repack' task by dynamically computing a --batch-size option based on the current pack-file structure.

The dynamic default size is computed with this idea in mind for a client repository that was cloned from a very large remote: there is likely one "big" pack-file that was created at clone time. Thus, do not try repacking it as it is likely packed efficiently by the server.

Instead, we select the second-largest pack-file, and create a batch size that is one larger than that pack-file. If there are three or more pack-files, then this guarantees that at least two will be combined into a new pack-file.

Of course, this means that the second-largest pack-file size is likely to grow over time and may eventually surpass the initially-cloned pack-file. Recall that the pack-file batch is selected in a greedy manner: the packs are considered from oldest to newest and are selected if they have size smaller than the batch size until the total selected size is larger than the batch size. Thus, that oldest "clone" pack will be first to repack after the new data creates a pack larger than that.

We also want to place some limits on how large these pack-files become, in order to bound the amount of time spent repacking. A maximum batch-size of two gigabytes means that large repositories will never be packed into a single pack-file using this job, but also that repack is rather expensive. This is a trade-off that is valuable to have if the maintenance is being run automatically or in the background.
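As a rough illustration only (not part of this patch), the sizing rule and the greedy selection described above can be modeled on plain arrays of pack sizes. The sketch_* names and BATCH_CAP are hypothetical, and the selection loop is a simplified reading of the description in this message, not the code in midx.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: same value as TWO_GIGABYTES in the patch. */
#define BATCH_CAP INT64_C(2147483647)

/* Hypothetical helper: second-largest size plus one, capped. */
static int64_t sketch_auto_batch_size(const int64_t *sizes, size_t n)
{
	int64_t max = 0, second = 0, result;
	size_t i;

	assert(sizes || !n);
	for (i = 0; i < n; i++) {
		if (sizes[i] > max) {
			second = max;
			max = sizes[i];
		} else if (sizes[i] > second) {
			second = sizes[i];
		}
	}

	result = second + 1;
	return result > BATCH_CAP ? BATCH_CAP : result;
}

/*
 * Hypothetical model of the greedy selection: walk the packs from
 * oldest to newest, take each pack smaller than the batch size, and
 * stop once the running total reaches the batch size. Returns the
 * number of packs selected.
 */
static size_t sketch_count_selected(const int64_t *sizes_oldest_first,
				    size_t n, int64_t batch_size)
{
	int64_t total = 0;
	size_t i, selected = 0;

	for (i = 0; i < n && total < batch_size; i++) {
		if (sizes_oldest_first[i] >= batch_size)
			continue; /* e.g. the big "clone" pack */
		total += sizes_oldest_first[i];
		selected++;
	}
	return selected;
}
```

With one big pack and two small ones, the computed batch size sits just above the larger small pack, so the big pack is skipped and both small packs are selected.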
Users who truly want to optimize for space and performance (and are willing to pay the upfront cost of a full repack) can use the 'gc' task to do so. Reported-by: Son Luong Ngoc Signed-off-by: Derrick Stolee --- builtin/gc.c | 48 +++++++++++++++++++++++++++++++++++++++++- t/t7900-maintenance.sh | 5 +++-- 2 files changed, 50 insertions(+), 3 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index eb4b01c104..889d97afe7 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1021,19 +1021,65 @@ static int multi_pack_index_expire(void) return result; } +#define TWO_GIGABYTES (2147483647) +#define UNSET_BATCH_SIZE ((unsigned long)-1) + +static off_t get_auto_pack_size(void) +{ + /* + * The "auto" value is special: we optimize for + * one large pack-file (i.e. from a clone) and + * expect the rest to be small and they can be + * repacked quickly. + * + * The strategy we select here is to select a + * size that is one more than the second largest + * pack-file. This ensures that we will repack + * at least two packs if there are three or more + * packs. 
+ */ + off_t max_size = 0; + off_t second_largest_size = 0; + off_t result_size; + struct packed_git *p; + struct repository *r = the_repository; + + reprepare_packed_git(r); + for (p = get_all_packs(r); p; p = p->next) { + if (p->pack_size > max_size) { + second_largest_size = max_size; + max_size = p->pack_size; + } else if (p->pack_size > second_largest_size) + second_largest_size = p->pack_size; + } + + result_size = second_largest_size + 1; + + /* But limit ourselves to a batch size of 2g */ + if (result_size > TWO_GIGABYTES) + result_size = TWO_GIGABYTES; + + return result_size; +} + static int multi_pack_index_repack(void) { int result; struct argv_array cmd = ARGV_ARRAY_INIT; + struct strbuf batch_arg = STRBUF_INIT; + argv_array_pushl(&cmd, "multi-pack-index", "repack", NULL); if (opts.quiet) argv_array_push(&cmd, "--no-progress"); - argv_array_push(&cmd, "--batch-size=0"); + strbuf_addf(&batch_arg, "--batch-size=%"PRIuMAX, + (uintmax_t)get_auto_pack_size()); + argv_array_push(&cmd, batch_arg.buf); close_object_store(the_repository->objects); result = run_command_v_opt(cmd.argv, RUN_GIT_CMD); + strbuf_release(&batch_arg); if (result && multi_pack_index_verify()) { warning(_("multi-pack-index verify failed after repack")); diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 3ec813979a..ab5c961eb9 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -134,10 +134,11 @@ test_expect_success 'incremental-repack task' ' test_line_count = 4 packs-between && # the job deletes the two old packs, and does not write - # a new one because only one pack remains. + # a new one because the batch size is not high enough to + # pack the largest pack-file. 
git maintenance run --task=incremental-repack && ls .git/objects/pack/*.pack >packs-after && - test_line_count = 1 packs-after + test_line_count = 2 packs-after ' test_done From patchwork Thu Jul 23 17:56:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681465 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9E5D2722 for ; Thu, 23 Jul 2020 17:57:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 82EC22086A for ; Thu, 23 Jul 2020 17:57:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Gi0lZOrF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730272AbgGWR5D (ORCPT ); Thu, 23 Jul 2020 13:57:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730222AbgGWR4z (ORCPT ); Thu, 23 Jul 2020 13:56:55 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B57CBC0619DC for ; Thu, 23 Jul 2020 10:56:54 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id 184so6019236wmb.0 for ; Thu, 23 Jul 2020 10:56:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=geCGmK1+y9ASuRDRnml2TDhrv4AHYRy13S19WWdthj4=; b=Gi0lZOrFfCbHWptN3Yu/2cbFmEwSoOTrUeBFgL5J7kQJFqjxppJExCY1Rz4pmIQbnG xqUvvirpAW82FPeI209kbEoGVgmvkLhKxGxw+ZLt5vbmEdwlq3RBOrDLjNlBEEdnYfIl dtxnzcthcFfuXRVTGy+kQso6hAm4GmOban1X3GaX7JKGm2VYlItGmG9wUDfcMeYgiEJe 
/h0rt0Y9hXM2mAl8VtDcRNgOMlm502PiZPzSb4POtYvnlhTbWakG7HNxAlt1EGgvJsSI pO9dvabWPYv/gvp5vhWOoh07XIbadT0nZHZS1e0P68P3DBHb4yMhn33sFH0hWddbwViA ov9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=geCGmK1+y9ASuRDRnml2TDhrv4AHYRy13S19WWdthj4=; b=DnOMZQvHqnvr5bn+Gn/eqsqSdwPgSdInJFGfry5VFZKEtFfHQtDYr1byi0PU/O9xJp 3cB2Vms5heKnYH344LCpfOBCLHytciZzdSK8EoNTVVRZxSTnRA1EMnMNwy9BP9ARQwWS 3RfP4PBs/98aKR9PRSeGIxlBX9uJp4M0dmTvlraq35VxQKKTfWk/Km0+dea8qPCupzKa dISwRdHHaw3qbgEUst9mu+oQRlKdrXWxApAtiSBxbmmwFSmx6thzcHY92JHcLzq4DzT3 HENZbc+Qg9PXNwLsbWkUpetnSYrohQlgci4jwX86ApXfye+0evVGsqvBVDf1ORAAuVik JnZw== X-Gm-Message-State: AOAM530olWHiEzpERbbZaJKjp4qNcVVspvqIcOoEvkpyx7A8/QD/hyxp 0cFi2uD71UGMpoNIqg3vuIGNhM5k X-Google-Smtp-Source: ABdhPJyoyJRi8GaUt8Qo7dGWxNlmRHDnVneXzfRBTr+a5OpiwRiIhp/41bwlFdvy38NDi+goD3xnGA== X-Received: by 2002:a1c:ac81:: with SMTP id v123mr4939711wme.159.1595527013224; Thu, 23 Jul 2020 10:56:53 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u10sm4114790wml.29.2020.07.23.10.56.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:52 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:34 +0000 Subject: [PATCH v2 12/18] maintenance: create maintenance..enabled config Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Currently, a normal run of "git maintenance run" will only run the 
'gc' task, as it is the only one enabled. This is mostly for backwards-compatibility reasons since "git maintenance run --auto" commands replaced previous "git gc --auto" commands after some Git processes. Users could manually run specific maintenance tasks by calling "git maintenance run --task=<task>" directly. Allow users to customize which steps are run automatically using config. The 'maintenance.<task>.enabled' option then can turn on these other tasks (or turn off the 'gc' task). Signed-off-by: Derrick Stolee --- Documentation/config.txt | 2 ++ Documentation/config/maintenance.txt | 4 ++++ Documentation/git-maintenance.txt | 6 +++++- builtin/gc.c | 13 +++++++++++++ t/t7900-maintenance.sh | 12 ++++++++++++ 5 files changed, 36 insertions(+), 1 deletion(-) create mode 100644 Documentation/config/maintenance.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index ef0768b91a..2783b825f9 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -396,6 +396,8 @@ include::config/mailinfo.txt[] include::config/mailmap.txt[] +include::config/maintenance.txt[] + include::config/man.txt[] include::config/merge.txt[] diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt new file mode 100644 index 0000000000..370cbfb42f --- /dev/null +++ b/Documentation/config/maintenance.txt @@ -0,0 +1,4 @@ +maintenance.<task>.enabled:: + This boolean config option controls whether the maintenance task + with name `<task>` is run when no `--task` option is specified. + By default, only `maintenance.gc.enabled` is true. diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index bda8df4aaa..4a61441bbc 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -30,7 +30,11 @@ SUBCOMMANDS ----------- run:: - Run one or more maintenance tasks. + Run one or more maintenance tasks. If one or more `--task` options + are specified, then those tasks are run in that order.
Otherwise, + the tasks are determined by which `maintenance.<task>.enabled` + config options are true. By default, only `maintenance.gc.enabled` + is true. TASKS ----- diff --git a/builtin/gc.c b/builtin/gc.c index 889d97afe7..b6dc4b1832 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1185,6 +1185,7 @@ static int maintenance_run(void) static void initialize_tasks(void) { int i; + struct strbuf config_name = STRBUF_INIT; num_tasks = 0; for (i = 0; i < MAX_NUM_TASKS; i++) @@ -1210,6 +1211,18 @@ static void initialize_tasks(void) tasks[num_tasks]->name = "commit-graph"; tasks[num_tasks]->fn = maintenance_task_commit_graph; num_tasks++; + + for (i = 0; i < num_tasks; i++) { + int config_value; + + strbuf_setlen(&config_name, 0); + strbuf_addf(&config_name, "maintenance.%s.enabled", tasks[i]->name); + + if (!git_config_get_bool(config_name.buf, &config_value)) + tasks[i]->enabled = config_value; + } + + strbuf_release(&config_name); } static int task_option_parse(const struct option *opt, diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index ab5c961eb9..3ee51723e0 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -21,6 +21,18 @@ test_expect_success 'run [--auto|--quiet]' ' grep ",\"gc\",\"--quiet\"" run-quiet.txt ' +test_expect_success 'maintenance.<task>.enabled' ' + git config maintenance.gc.enabled false && + git config maintenance.commit-graph.enabled true && + git config maintenance.loose-objects.enabled true && + GIT_TRACE2_EVENT="$(pwd)/run-config.txt" git maintenance run && + ! grep ",\"fetch\"" run-config.txt && + ! grep ",\"gc\"" run-config.txt && + !
grep ",\"multi-pack-index\"" run-config.txt && + grep ",\"commit-graph\"" run-config.txt && + grep ",\"prune-packed\"" run-config.txt +' + test_expect_success 'run --task=' ' GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" git maintenance run --task=commit-graph && GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" git maintenance run --task=gc && From patchwork Thu Jul 23 17:56:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681461 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 64415138A for ; Thu, 23 Jul 2020 17:57:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4ACBE206E3 for ; Thu, 23 Jul 2020 17:57:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kVUcjt6i" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730268AbgGWR5A (ORCPT ); Thu, 23 Jul 2020 13:57:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730248AbgGWR44 (ORCPT ); Thu, 23 Jul 2020 13:56:56 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0E09C0619DC for ; Thu, 23 Jul 2020 10:56:55 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id q5so6008342wru.6 for ; Thu, 23 Jul 2020 10:56:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Z6kTm1PJVREFGVitIv2AsZneQRuyQdm7G67H8YL0WrY=; 
b=kVUcjt6ib5DCljaPd71d2S9LBHLkOtznQGuXs9PO3fzc/fjFKLyi2FBcKO+F+zssGW LUGb7tnDbJFHsWkmEnXN3xE2UjKyVaRfCtW/ycE0d7ellI/CwpdivUO4N8NVcU5KOsKg qDkA5D6YMuNNkDWK3dt7YErSfDiHzzYDc3QuCj7ldrYqs52SjxTg9V2cNL2mQWHiNYDJ mowLUV0vFjbaiCpdPvim/3i/rPiJTIR1W7pri+dDFOwbaNFCJoFXuLfc0tzyrI2+Lew8 eurEvyW4+h/wBK89M+KHdLjYrCGnUPCtWZS9nZXIPPNuWiTTF2Dsk9nlcSRtNMElxvFx iE2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Z6kTm1PJVREFGVitIv2AsZneQRuyQdm7G67H8YL0WrY=; b=g68NOggOprg7+PZc6kjf8VEg7Rj3UqQ4sKjK9Wo4PW5ZQCFITZ5/2jRmHDZG88GKM8 YPEud8aqIB8/Oj4vgj004Kxk9TRi6+vbAWp0U8sFsDeabV+uP5MYhy5LKoDHtqoCsgah nBdp79ec3Xggk98g+wcOHusp3DOlBzd2x/NYrqqgqCW0fk/bV5vIh73mxSXxdNTIiYu1 YDn5LNfHOgUeSIPUL3eO4NFQu2vd2JD24tFESeVSu7FpVHEJk4vgJByO4CFsSAO1pPB9 w+b1U1A4Nk2uYzmezE3neEazdKJmGolvCDXd/EI6kITHTfftOXBRTewwCvMC3viCSInj eYOw== X-Gm-Message-State: AOAM532xGqOM/L9A/zhvCp77wlzrhxPksPAiHb00qOLQvQEJDd4vlyUa XGxFzOpS5cvLmr1joCNcYxRf+CXJ X-Google-Smtp-Source: ABdhPJwDVLldbRmcd75v8UlbojYUJGNp6v47KPnz44TZYAlkzWCUdFpKPJyUzqXaraPjzphSMFlD4A== X-Received: by 2002:a5d:6cd0:: with SMTP id c16mr5159187wrc.121.1595527014183; Thu, 23 Jul 2020 10:56:54 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s203sm4465678wms.32.2020.07.23.10.56.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:53 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:35 +0000 Subject: [PATCH v2 13/18] maintenance: use pointers to check --auto Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , 
Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The 'git maintenance run' command has an '--auto' option. This is used by other Git commands such as 'git commit' or 'git fetch' to check if maintenance should be run after adding data to the repository. Previously, this --auto option was only used to add the argument to the 'git gc' command as part of the 'gc' task. We will be expanding the other tasks to perform a check to see if they should do work as part of the --auto flag, when they are enabled by config. First, update the 'gc' task to perform the auto check inside the maintenance process. This prevents running an extra 'git gc --auto' command when not needed. It also shows a model for other tasks. Second, use the 'auto_condition' function pointer as a signal for whether we enable the maintenance task under '--auto'. For instance, we do not want to enable the 'fetch' task in '--auto' mode, so that function pointer will remain NULL. Now that we are not automatically calling 'git gc', a test in t5514-fetch-multiple.sh must be changed to watch for 'git maintenance' instead. We continue to pass the '--auto' option to the 'git gc' command when necessary, because the gc.autoDetach config option changes behavior. Likely, we will want to absorb the daemonizing behavior implied by gc.autoDetach as a maintenance.autoDetach config option. Signed-off-by: Derrick Stolee --- builtin/gc.c | 15 +++++++++++++++ t/t5514-fetch-multiple.sh | 2 +- t/t7900-maintenance.sh | 2 +- 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/builtin/gc.c b/builtin/gc.c index b6dc4b1832..31696a2595 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1121,9 +1121,17 @@ static int maintenance_task_incremental_repack(void) typedef int maintenance_task_fn(void); +/* + * An auto condition function returns 1 if the task should run + * and 0 if the task should NOT run. See need_to_gc() for an + * example.
+ */ +typedef int maintenance_auto_fn(void); + struct maintenance_task { const char *name; maintenance_task_fn *fn; + maintenance_auto_fn *auto_condition; int task_order; unsigned enabled:1, selected:1; @@ -1175,6 +1183,11 @@ static int maintenance_run(void) if (!opts.tasks_selected && !tasks[i]->enabled) continue; + if (opts.auto_flag && + (!tasks[i]->auto_condition || + !tasks[i]->auto_condition())) + continue; + result = tasks[i]->fn(); } @@ -1205,6 +1218,7 @@ static void initialize_tasks(void) tasks[num_tasks]->name = "gc"; tasks[num_tasks]->fn = maintenance_task_gc; + tasks[num_tasks]->auto_condition = need_to_gc; tasks[num_tasks]->enabled = 1; num_tasks++; @@ -1283,6 +1297,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix) builtin_maintenance_options); opts.quiet = !isatty(2); + gc_config(); initialize_tasks(); argc = parse_options(argc, argv, prefix, diff --git a/t/t5514-fetch-multiple.sh b/t/t5514-fetch-multiple.sh index de8e2f1531..bd202ec6f3 100755 --- a/t/t5514-fetch-multiple.sh +++ b/t/t5514-fetch-multiple.sh @@ -108,7 +108,7 @@ test_expect_success 'git fetch --multiple (two remotes)' ' GIT_TRACE=1 git fetch --multiple one two 2>trace && git branch -r > output && test_cmp ../expect output && - grep "built-in: git gc" trace >gc && + grep "built-in: git maintenance" trace >gc && test_line_count = 1 gc ) ' diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 3ee51723e0..373b8dbe04 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -17,7 +17,7 @@ test_expect_success 'run [--auto|--quiet]' ' GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && GIT_TRACE2_EVENT="$(pwd)/run-quiet.txt" git maintenance run --quiet && grep ",\"gc\"]" run-no-auto.txt && - grep ",\"gc\",\"--auto\"" run-auto.txt && + ! 
grep ",\"gc\",\"--auto\"" run-auto.txt && grep ",\"gc\",\"--quiet\"" run-quiet.txt ' From patchwork Thu Jul 23 17:56:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681463 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8A4C8722 for ; Thu, 23 Jul 2020 17:57:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7199C206E3 for ; Thu, 23 Jul 2020 17:57:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ixe/m4dB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730270AbgGWR5C (ORCPT ); Thu, 23 Jul 2020 13:57:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730255AbgGWR45 (ORCPT ); Thu, 23 Jul 2020 13:56:57 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 96EC8C0619DC for ; Thu, 23 Jul 2020 10:56:56 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id x5so5190870wmi.2 for ; Thu, 23 Jul 2020 10:56:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=tIb63pX0YeW2srRqGdL3BdyiWo8uWN8WA7i9PyUQcko=; b=ixe/m4dBeL+P7I1ufPyItsAIKzqaBl+/7knRwzxAANpYmNbEGuu05YIZ8FqAbUuS1V ig0B3riPIL3xSKpw4Fpjkfcbx0WgtjRwsPsXQ0l2t0B0ZZLpH5ht0mZvH7YyVDM66zbC Hvwpw3m3kP2ehVWi9cbQbojmhNaLz4/xec1i5/IkQsYduZ6H1jysZkBvrIcg0SnmQa6K /nwMLqAaVbx1JdDy+sk8ibMOhriDE2L67twvY6dOmurcVY1NNpdm9/bFOgAeEHu4+fTR 
p8vmXHOohDmYzRRWenPFDgsb9Sd6/q6bb3q/YfTYrRU96nx0f4PNrDsA8RJqjYow9ZY6 byjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=tIb63pX0YeW2srRqGdL3BdyiWo8uWN8WA7i9PyUQcko=; b=Prx+OddV6Mh2V98vIc2JE2QAbD0wg0cU1lu5kCan44A6x/PnchiAxWZnOyP8VczxrE 7+7oDFmswexCPvxm50F3qbu8kQQPhRx6kh1aTpznm4pOxzvWqTnoet9cozrn6CJg2QIc 10JeSS9+6Ua25CcqP/RRqNP5lMycSCpEUs7ubQi+9mMP9bV0eZX725ZFBrlLWcToPE27 +THDt/Wy4qPkb67V7foZo015TaDrWdUMW53we1FOq5IXk4jdShEq52lEjELGLn8h5Y3b vSGrqw7bLj68to3yPsK2Tg7P0h6Y6HDU5hTE6viF3rrFNcIbESssp+b0Kla8XW9EnjRK VcIQ== X-Gm-Message-State: AOAM533tiZbSmaepo+M48CTrMkqrDTMAEm46XuIDE/WXWtP3srAMr7C6 G94mfwsJgTgvjhGs+94Q3VNktNBF X-Google-Smtp-Source: ABdhPJyX3mAAY6Jy2ddnmXVeoxUOdQ/WYGc9qwGTQgxPHxQSEeQeYxSPNKC9A2tyRVPmKvIb1prGVA== X-Received: by 2002:a1c:9650:: with SMTP id y77mr5032849wmd.101.1595527015179; Thu, 23 Jul 2020 10:56:55 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g18sm4868495wru.27.2020.07.23.10.56.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:54 -0700 (PDT) Message-Id: <9af2309f0804cc3a2b26c3cc5c5526d51e86aac2.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:36 +0000 Subject: [PATCH v2 14/18] maintenance: add auto condition for commit-graph task Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Instead of writing a new commit-graph in every 'git 
maintenance run --auto' process (when maintenance.commit-graph.enabled is configured to be true), only write when there are "enough" commits not in a commit-graph file. This count is controlled by the maintenance.commit-graph.auto config option. To compute the count, use a depth-first search starting at each ref, leaving markers using the PARENT1 flag. If this count reaches the limit, then terminate early and start the task. Otherwise, this operation will peel every ref and parse the commit it points to. If these are all in the commit-graph, then this is typically a very fast operation. Users with many refs might feel a slow-down, and hence could consider updating their limit to be very small. A negative value will force the step to run every time. Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 10 ++++ builtin/gc.c | 76 ++++++++++++++++++++++++++++ object.h | 1 + 3 files changed, 87 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index 370cbfb42f..9bd69b9df3 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -2,3 +2,13 @@ maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task with name `<task>` is run when no `--task` option is specified. By default, only `maintenance.gc.enabled` is true. + +maintenance.commit-graph.auto:: + This integer config option controls how often the `commit-graph` task + should be run as part of `git maintenance run --auto`. If zero, then + the `commit-graph` task will not run with the `--auto` option. A + negative value will force the task to run every time. Otherwise, a + positive value implies the command should run when the number of + reachable commits that are not in the commit-graph file is at least + the value of `maintenance.commit-graph.auto`. The default value is + 100.
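As a reading aid (not part of this patch), the documented semantics of maintenance.commit-graph.auto can be sketched on a toy commit array: zero never runs, a negative value always runs, and a positive value runs once the walk has found that many commits outside the commit-graph. Everything named sketch_* or SKETCH_* is hypothetical. One assumed difference from the patch: this sketch also counts the starting commit of each walk, which the loop in the patch does not appear to do.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PARENTS 2
#define SKETCH_MAX_COMMITS 64

/* Hypothetical toy commit: parents are array indexes, -1 for none. */
struct sketch_commit {
	int parents[SKETCH_MAX_PARENTS];
	int in_graph; /* nonzero if already in the commit-graph file */
	int seen;     /* plays the role of the PARENT1 marker */
};

/*
 * Walk from each tip with an explicit stack, counting unseen commits
 * that are not in the graph. Returns 1 as soon as the count reaches
 * `limit`, 0 otherwise. The caller must clear `seen` between calls.
 */
static int sketch_should_write(struct sketch_commit *c, size_t nr,
			       const int *tips, size_t nr_tips, int limit)
{
	int stack[SKETCH_MAX_COMMITS];
	size_t top = 0, t;
	int count = 0, p;

	assert(nr <= SKETCH_MAX_COMMITS);
	if (!limit)
		return 0; /* zero: never run under --auto */
	if (limit < 0)
		return 1; /* negative: always run */

	for (t = 0; t < nr_tips; t++) {
		if (c[tips[t]].in_graph || c[tips[t]].seen)
			continue;
		c[tips[t]].seen = 1;
		if (++count >= limit)
			return 1;
		stack[top++] = tips[t];

		while (top) {
			int cur = stack[--top];

			for (p = 0; p < SKETCH_MAX_PARENTS; p++) {
				int parent = c[cur].parents[p];

				if (parent < 0 || c[parent].in_graph ||
				    c[parent].seen)
					continue;
				c[parent].seen = 1;
				if (++count >= limit)
					return 1; /* terminate early */
				stack[top++] = parent;
			}
		}
	}
	return 0;
}
```

Each commit is marked before it is pushed, so it enters the stack at most once and the fixed-size stack cannot overflow for nr up to SKETCH_MAX_COMMITS.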
diff --git a/builtin/gc.c b/builtin/gc.c index 31696a2595..84ad360d17 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -30,6 +30,7 @@ #include "promisor-remote.h" #include "remote.h" #include "midx.h" +#include "refs.h" #define FAILED_RUN "failed to run %s" @@ -715,6 +716,80 @@ static struct maintenance_opts { int tasks_selected; } opts; +/* Remember to update object flag allocation in object.h */ +#define PARENT1 (1u<<16) + +static int num_commits_not_in_graph = 0; +static int limit_commits_not_in_graph = 100; + +static int dfs_on_ref(const char *refname, + const struct object_id *oid, int flags, + void *cb_data) +{ + int result = 0; + struct object_id peeled; + struct commit_list *stack = NULL; + struct commit *commit; + + if (!peel_ref(refname, &peeled)) + oid = &peeled; + if (oid_object_info(the_repository, oid, NULL) != OBJ_COMMIT) + return 0; + + commit = lookup_commit(the_repository, oid); + if (!commit) + return 0; + if (parse_commit(commit)) + return 0; + + commit_list_append(commit, &stack); + + while (!result && stack) { + struct commit_list *parent; + + commit = pop_commit(&stack); + + for (parent = commit->parents; parent; parent = parent->next) { + if (parse_commit(parent->item) || + commit_graph_position(parent->item) != COMMIT_NOT_FROM_GRAPH || + parent->item->object.flags & PARENT1) + continue; + + parent->item->object.flags |= PARENT1; + num_commits_not_in_graph++; + + if (num_commits_not_in_graph >= limit_commits_not_in_graph) { + result = 1; + break; + } + + commit_list_append(parent->item, &stack); + } + } + + free_commit_list(stack); + return result; +} + +static int should_write_commit_graph(void) +{ + int result; + + git_config_get_int("maintenance.commit-graph.auto", + &limit_commits_not_in_graph); + + if (!limit_commits_not_in_graph) + return 0; + if (limit_commits_not_in_graph < 0) + return 1; + + result = for_each_ref(dfs_on_ref, NULL); + + clear_commit_marks_all(PARENT1); + + return result; +} + static int run_write_commit_graph(void) { 
int result; @@ -1224,6 +1299,7 @@ static void initialize_tasks(void) tasks[num_tasks]->name = "commit-graph"; tasks[num_tasks]->fn = maintenance_task_commit_graph; + tasks[num_tasks]->auto_condition = should_write_commit_graph; num_tasks++; for (i = 0; i < num_tasks; i++) { diff --git a/object.h b/object.h index 38dc2d5a6c..4f886495d7 100644 --- a/object.h +++ b/object.h @@ -73,6 +73,7 @@ struct object_array { * list-objects-filter.c: 21 * builtin/fsck.c: 0--3 * builtin/index-pack.c: 2021 + * builtin/maintenance.c: 16 * builtin/pack-objects.c: 20 * builtin/reflog.c: 10--12 * builtin/show-branch.c: 0-------------------------------------------26 From patchwork Thu Jul 23 17:56:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681473 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9E318722 for ; Thu, 23 Jul 2020 17:57:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 86A7C2086A for ; Thu, 23 Jul 2020 17:57:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qmuJ+aH1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730286AbgGWR5I (ORCPT ); Thu, 23 Jul 2020 13:57:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730257AbgGWR45 (ORCPT ); Thu, 23 Jul 2020 13:56:57 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71628C0619E2 for ; Thu, 23 Jul 2020 10:56:57 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id a14so6012157wra.5 for ; 
Thu, 23 Jul 2020 10:56:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=tN02UvhulprvTzUPsFR91Vc5X1kfR4B2eBxD5sAVaKo=; b=qmuJ+aH1QYlDT0L6vz66tnYUUhKoB23Xwh+qmimc2meg+Crg1EwfaxJMsR702wsCbx +Vlue2xhLgLAu376dIx/4Ciuuss3tH774WL3T36ZVqEl359/kjQHgg9zTw1Y0YyIfu3y 475Mj9652MkOzmpVl+zIOBTeOnar0z2+Gyv4HMF2Lx/Sut6dMUlzZuB3Jz/584NwF1v3 HjeQORH1VubnEi9VWFIs9GZT/MRZaMY12PyGlv5I21boYX7Xy3qgNzgg1kXHlFdBiSOV Q7W3Cem8Y7TVdlplqb3SpkPOzK8YW2HeI2Kyg9F951JEtVaQP4i8i72QFjkSnGayfd04 wCeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=tN02UvhulprvTzUPsFR91Vc5X1kfR4B2eBxD5sAVaKo=; b=PzeqGzwAIsuBui1z4YWiWMSbx26sgAH0dWIQWSbi4FfQ00++hK4TZ8eR8psTE0qWs1 Rn2de+u1Apsr3mvGVgH97hTvCFolNiwZUARr1g5FH1sJ/mhoV335j8LGNFtoLtRuHTtT W0JS582RbHjnYaCPeeaRMlkIzPV/rf92Dtv94qg8eVSbq/8Zd2nOohsVO2Bxqq5aYkya k8FbpoNZav6lL9mw429WlR9i8SZrBDPHNtggX9pM35u4/dFNHoKob7iJDBehiDWLB8Zp /L8q+qGL91RobjpobhHR9O6ljwzS2wVPIfbzyh32UNpE0TWFGkeZ783mf3v56Ukj5VFD SOkg== X-Gm-Message-State: AOAM533m8J5i0q7DHchhur82yLzL1xQhK2gTtO1rEui2+IoD1H7HZJyp sLMszfIGhoX8DD7pZEqSO286/gy3 X-Google-Smtp-Source: ABdhPJyu4MXJxrW+WPoYhIL+iXR9KySOEwgtBVW+s4itHQnrU+OpjGe2xhD3jVLsZfRqDlosoXwHhw== X-Received: by 2002:adf:fac8:: with SMTP id a8mr5084234wrs.368.1595527015962; Thu, 23 Jul 2020 10:56:55 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f15sm4355957wrx.91.2020.07.23.10.56.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:55 -0700 (PDT) Message-Id: <42e316ca5851992f29fa2658e38a08ebb7dd3e31.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:37 +0000 Subject: [PATCH v2 15/18] 
maintenance: create auto condition for loose-objects Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The loose-objects task deletes loose objects that already exist in a pack-file, then places the remaining loose objects into a new pack-file. If this step runs all the time, then we risk creating pack-files with very few objects with every 'git commit' process. To prevent overwhelming the packs directory with small pack-files, require a minimum number of loose objects before the task runs. The 'maintenance.loose-objects.auto' config option specifies the minimum number of loose objects required for the task to run under the '--auto' option. This defaults to 100 loose objects. Setting the value to zero will prevent the step from running under '--auto' while a negative value will force it to run every time. Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 9 +++++++++ builtin/gc.c | 30 ++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 25 +++++++++++++++++++++++ 3 files changed, 64 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index 9bd69b9df3..a9442dd260 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -12,3 +12,12 @@ maintenance.commit-graph.auto:: reachable commits that are not in the commit-graph file is at least the value of `maintenance.commit-graph.auto`. The default value is 100. + +maintenance.loose-objects.auto:: + This integer config option controls how often the `loose-objects` task + should be run as part of `git maintenance run --auto`. 
If zero, then + the `loose-objects` task will not run with the `--auto` option. A + negative value will force the task to run every time. Otherwise, a + positive value implies the command should run when the number of + loose objects is at least the value of `maintenance.loose-objects.auto`. + The default value is 100. diff --git a/builtin/gc.c b/builtin/gc.c index 84ad360d17..ae59a28203 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -951,6 +951,35 @@ struct write_loose_object_data { int batch_size; }; +static int loose_object_auto_limit = 100; + +static int loose_object_count(const struct object_id *oid, + const char *path, + void *data) +{ + int *count = (int*)data; + if (++(*count) >= loose_object_auto_limit) + return 1; + return 0; +} + +static int loose_object_auto_condition(void) +{ + int count = 0; + + git_config_get_int("maintenance.loose-objects.auto", + &loose_object_auto_limit); + + if (!loose_object_auto_limit) + return 0; + if (loose_object_auto_limit < 0) + return 1; + + return for_each_loose_file_in_objdir(the_repository->objects->odb->path, + loose_object_count, + NULL, NULL, &count); +} + static int loose_object_exists(const struct object_id *oid, const char *path, void *data) @@ -1285,6 +1314,7 @@ static void initialize_tasks(void) tasks[num_tasks]->name = "loose-objects"; tasks[num_tasks]->fn = maintenance_task_loose_objects; + tasks[num_tasks]->auto_condition = loose_object_auto_condition; num_tasks++; tasks[num_tasks]->name = "incremental-repack"; diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 373b8dbe04..e4244d7c3c 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -115,6 +115,31 @@ test_expect_success 'loose-objects task' ' test_cmp packs-between packs-after ' +test_expect_success 'maintenance.loose-objects.auto' ' + git repack -adk && + GIT_TRACE2_EVENT="$(pwd)/trace-lo1.txt" \ + git -c maintenance.loose-objects.auto=1 maintenance \ + run --auto --task=loose-objects && + ! 
grep "\"prune-packed\"" trace-lo1.txt && + for i in 1 2 + do + printf data-A-$i | git hash-object -t blob --stdin -w && + GIT_TRACE2_EVENT="$(pwd)/trace-loA-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + ! grep "\"prune-packed\"" trace-loA-$i && + printf data-B-$i | git hash-object -t blob --stdin -w && + GIT_TRACE2_EVENT="$(pwd)/trace-loB-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + grep "\"prune-packed\"" trace-loB-$i && + GIT_TRACE2_EVENT="$(pwd)/trace-loC-$i" \ + git -c maintenance.loose-objects.auto=2 \ + maintenance run --auto --task=loose-objects && + grep "\"prune-packed\"" trace-loC-$i || return 1 + done +' + test_expect_success 'incremental-repack task' ' packDir=.git/objects/pack && for i in $(test_seq 1 5) From patchwork Thu Jul 23 17:56:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681471 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B1B4138A for ; Thu, 23 Jul 2020 17:57:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 51A752086A for ; Thu, 23 Jul 2020 17:57:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Qsr59m+J" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730281AbgGWR5H (ORCPT ); Thu, 23 Jul 2020 13:57:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53328 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730236AbgGWR46 (ORCPT ); Thu, 23 Jul 2020 13:56:58 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) 
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A198C0619E3 for ; Thu, 23 Jul 2020 10:56:58 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id z18so2443480wrm.12 for ; Thu, 23 Jul 2020 10:56:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ZuuyvrOEbYHMb02DAbtMFB0i88wpfAnbSXGTqzJrkUQ=; b=Qsr59m+JpTvgS/SCWJS3GSTc8kRioRRTJhkTeSJNGPFRKydIR/1bpMU4645tXkt8RX +0KjAZ1nN01iNTvp6TpaTDcsYX0mPlSQKox/MQ/fw3ToBiIxe8lib27Cm+f4yv6YA5I4 1CGSe2ulDDP6V5WHrR0MGtT2bQhOpYgGmUqj3NCU6NpsTsEC1CdTy2D5fbWz6g7GN2eR upXE6jHXVC8DBrTIoGTAmfhSAYR8W5vtOo/4c5BZjOx3bkUYQ78hRbLZ5P/QAeC5G0vV 81G4Rl570MAfqdmC9FDu3FeI7Y3bRCKVefm76HoX1Qr/PyuERXBzy/YK7FNiUWE21D7q eBiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ZuuyvrOEbYHMb02DAbtMFB0i88wpfAnbSXGTqzJrkUQ=; b=eZ003gNwU5VxOlngKIEaUDe8HBTnU7m74DFSUKIZ+NJYbOa/mH2nPQlGcEqe/nDbd6 pU89yHyTClbpfTKLs8I/iuiGluaEGHLcw6FQ/5Tw1fgYLFFsSTOhtB8htd65TTHS3c2i ply5kQvggzzNkljBJ7aQquUrWbgTC/h1PxrZQlrfu8kQMvdaZf21bzUcqvBVKOpgvQqG 4m/SgQrNdOLzmH1/+blL/O4PF+qPAJ3sI0U9hb8HCAcyWW8UnyTrpHe8hqiGWsei9tHa MpOtsef6kUtNRWT8Bi4enmNJQ26gVJ+T3MD6XTVDJW02sZ5zCf5x+rZ2X3If1YK2ibxr AciQ== X-Gm-Message-State: AOAM5305t1if3qZ4gLpmleft8CcQLLIH87t4V9h1G2aoRNxvVmqF6qsv 7Ca32mf8D+fCBGs/mxB3qdUyPDT4 X-Google-Smtp-Source: ABdhPJzSqa2n8fjJx6lAXLzKOu/c0iKsvjNv31Ebx4spGDP/WrBEhGcXR87BnmQ99Gas0eA/cf+xrA== X-Received: by 2002:a5d:6681:: with SMTP id l1mr4809629wru.47.1595527016704; Thu, 23 Jul 2020 10:56:56 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t2sm4611325wmb.25.2020.07.23.10.56.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:56 -0700 (PDT) Message-Id: 
<3d527cb0dda20a5b89a9a213fbdd4a28586a4e4f.1595527000.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:38 +0000 Subject: [PATCH v2 16/18] maintenance: add incremental-repack auto condition Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The incremental-repack task updates the multi-pack-index by deleting pack-files that have been replaced with new packs, then repacking a batch of small pack-files into a larger pack-file. This incremental repack is faster than rewriting all object data, but is slower than some other maintenance activities. The 'maintenance.incremental-repack.auto' config option specifies how many pack-files should exist outside of the multi-pack-index before running the step. These pack-files could be created by 'git fetch' commands or by the loose-objects task. The default value is 10. Setting the option to zero disables the task with the '--auto' option, and a negative value makes the task run every time. 
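For readers who want the rule above in isolation, here is a minimal sketch of the three-way threshold check (zero disables, negative always runs, positive compares against a count). It is not the code from this patch: `struct pack`, its `in_midx` field, and `should_repack()` are hypothetical stand-ins chosen for illustration, and the real check walks git's own pack list.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for git's pack list; only the two fields
 * this sketch needs.
 */
struct pack {
	int in_midx;              /* non-zero if covered by the multi-pack-index */
	const struct pack *next;
};

/*
 * Returns 1 when the task should run under --auto, 0 otherwise:
 *   limit == 0  -> never run
 *   limit  < 0  -> always run
 *   limit  > 0  -> run once at least `limit` packs are outside the index
 */
static int should_repack(const struct pack *head, int limit)
{
	int count = 0;
	const struct pack *p;

	if (!limit)
		return 0;
	if (limit < 0)
		return 1;

	/* Stop walking as soon as the threshold is reached. */
	for (p = head; p && count < limit; p = p->next)
		if (!p->in_midx)
			count++;

	return count >= limit;
}
```

The early exit in the loop keeps the check cheap when a repository has many packs, which matters because the condition is evaluated on every `--auto` run.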
Signed-off-by: Derrick Stolee --- Documentation/config/maintenance.txt | 9 ++++++++ builtin/gc.c | 31 ++++++++++++++++++++++++++++ t/t7900-maintenance.sh | 30 +++++++++++++++++++++++++++ 3 files changed, 70 insertions(+) diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index a9442dd260..22229e7174 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -21,3 +21,12 @@ maintenance.loose-objects.auto:: positive value implies the command should run when the number of loose objects is at least the value of `maintenance.loose-objects.auto`. The default value is 100. + +maintenance.incremental-repack.auto:: + This integer config option controls how often the `incremental-repack` + task should be run as part of `git maintenance run --auto`. If zero, + then the `incremental-repack` task will not run with the `--auto` + option. A negative value will force the task to run every time. + Otherwise, a positive value implies the command should run when the + number of pack-files not in the multi-pack-index is at least the value + of `maintenance.incremental-repack.auto`. The default value is 10. 
diff --git a/builtin/gc.c b/builtin/gc.c index ae59a28203..b040c7d31d 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -31,6 +31,7 @@ #include "remote.h" #include "midx.h" #include "refs.h" +#include "object-store.h" #define FAILED_RUN "failed to run %s" @@ -1063,6 +1064,35 @@ static int maintenance_task_loose_objects(void) return prune_packed() || pack_loose(); } +static int incremental_repack_auto_condition(void) +{ + struct packed_git *p; + int enabled; + int incremental_repack_auto_limit = 10; + int count = 0; + + if (git_config_get_bool("core.multiPackIndex", &enabled) || + !enabled) + return 0; + + git_config_get_int("maintenance.incremental-repack.auto", + &incremental_repack_auto_limit); + + if (!incremental_repack_auto_limit) + return 0; + if (incremental_repack_auto_limit < 0) + return 1; + + for (p = get_packed_git(the_repository); + count < incremental_repack_auto_limit && p; + p = p->next) { + if (!p->multi_pack_index) + count++; + } + + return count >= incremental_repack_auto_limit; +} + static int multi_pack_index_write(void) { int result; @@ -1319,6 +1349,7 @@ static void initialize_tasks(void) tasks[num_tasks]->name = "incremental-repack"; tasks[num_tasks]->fn = maintenance_task_incremental_repack; + tasks[num_tasks]->auto_condition = incremental_repack_auto_condition; num_tasks++; tasks[num_tasks]->name = "gc"; diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index e4244d7c3c..0b29674805 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -178,4 +178,34 @@ test_expect_success 'incremental-repack task' ' test_line_count = 2 packs-after ' +test_expect_success 'maintenance.incremental-repack.auto' ' + git repack -adk && + git config core.multiPackIndex true && + git multi-pack-index write && + GIT_TRACE2_EVENT=1 git -c maintenance.incremental-repack.auto=1 \ + maintenance run --auto --task=incremental-repack >out && + ! 
grep "\"multi-pack-index\"" out && + for i in 1 2 + do + test_commit A-$i && + git pack-objects --revs .git/objects/pack/pack <<-\EOF && + HEAD + ^HEAD~1 + EOF + GIT_TRACE2_EVENT=$(pwd)/trace-A-$i git \ + -c maintenance.incremental-repack.auto=2 \ + maintenance run --auto --task=incremental-repack && + ! grep "\"multi-pack-index\"" trace-A-$i && + test_commit B-$i && + git pack-objects --revs .git/objects/pack/pack <<-\EOF && + HEAD + ^HEAD~1 + EOF + GIT_TRACE2_EVENT=$(pwd)/trace-B-$i git \ + -c maintenance.incremental-repack.auto=2 \ + maintenance run --auto --task=incremental-repack >out && + grep "\"multi-pack-index\"" trace-B-$i >/dev/null || return 1 + done +' + test_done From patchwork Thu Jul 23 17:56:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681469 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC02C722 for ; Thu, 23 Jul 2020 17:57:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9F6F720792 for ; Thu, 23 Jul 2020 17:57:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="u07JaHYa" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730280AbgGWR5F (ORCPT ); Thu, 23 Jul 2020 13:57:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730261AbgGWR47 (ORCPT ); Thu, 23 Jul 2020 13:56:59 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08C81C0619E4 for ; Thu, 23 Jul 2020 10:56:59 -0700 (PDT) Received: by mail-wr1-x443.google.com with 
SMTP id z18so2443523wrm.12 for ; Thu, 23 Jul 2020 10:56:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=LU+CNrX9yNQ3ZBh5dcZ/D4gj9kwibseehJxjrOFtaPA=; b=u07JaHYa1c+hKO5qwoNNrZJdn5YqB+8u/YTDtZO2/VN302nijAvklwuIO+tjU4csRn fgMkwgLyKrelXdXJ3RjMAYT5AcdueuaMmS+WEg78RHoYdqT8jDwWWBZTbhTz9Drfr3XQ 7UVOmrwhh5DJ4A+84tNo7hkftOrXQWeQGwZ8eBkFV7+Y3gnNDvf7pnN4YBNsHnqRCWsN QlWtwx+XfoZzH4NKSxV/XlMc6PuYEHzu6WhHCfr0bq/LaZ9j8jBI/9D/7DhCphYhD7US uCMDitU0xVuhd5YE5lisjtEahipXWPFoHF7QZEQOzpNQSvERyvh4tuieJWrVQArvm0ol WUSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=LU+CNrX9yNQ3ZBh5dcZ/D4gj9kwibseehJxjrOFtaPA=; b=hr2iEHfeDDxMZEMN6ON4iJRNA7A29q4gm1H9R6Cli9iTpjPmZUh0V04JjQyfjH5E43 rSYevoyk62iQH7cyvifs7ms5owJdNqAmuFIEzyfe1jTlmR9/9NBrweGTf2p3y+jO9B0/ QNEMMOD5/zYM1idSt+GWJdN/SYVxE7Qw/g9KgV5CpseneaY0qt9S4MG+KRe72vTAseDb mi15cJAS/hJ7+xeaZRoW/7PnrGqaYsNnmo/qcml+lDRhKEOSxTH5lFQrP2l6/RVaYTT1 KlVN8detdkl5Xjc4nFzFgI4ZrcoXpYY7RWtSK0IHPjAEoe6xDiaO5C2bhwXGi2qMYzLy YFAQ== X-Gm-Message-State: AOAM532E2675aSQGHrEdctbn0vBcTQrf8V0/ziESNPj/Y2n8qiSRXDfb 6LZt/7vQOme+MKlejJl6fH19DTdy X-Google-Smtp-Source: ABdhPJxvkKr5CE56IfSBL6D+KnZ4FVnqvfSX+ZxAO9Qo14pQe19h2/OAOH6XDqsuA4f9SkS4yEjAHQ== X-Received: by 2002:adf:e486:: with SMTP id i6mr4986644wrm.258.1595527017569; Thu, 23 Jul 2020 10:56:57 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 1sm4236090wmf.21.2020.07.23.10.56.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:57 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:39 +0000 Subject: [PATCH v2 17/18] midx: use start_delayed_progress() Fcc: Sent 
MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Now that the multi-pack-index may be written as part of auto maintenance at the end of a command, reduce the progress output when the operations are quick. Use start_delayed_progress() instead of start_progress(). Update t5319-multi-pack-index.sh to use GIT_PROGRESS_DELAY=0 now that the progress indicators are conditional. Signed-off-by: Derrick Stolee --- midx.c | 10 +++++----- t/t5319-multi-pack-index.sh | 14 +++++++------- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/midx.c b/midx.c index 57a8a00082..d4022e4aef 100644 --- a/midx.c +++ b/midx.c @@ -837,7 +837,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * packs.pack_paths_checked = 0; if (flags & MIDX_PROGRESS) - packs.progress = start_progress(_("Adding packfiles to multi-pack-index"), 0); + packs.progress = start_delayed_progress(_("Adding packfiles to multi-pack-index"), 0); else packs.progress = NULL; @@ -974,7 +974,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * } if (flags & MIDX_PROGRESS) - progress = start_progress(_("Writing chunks to multi-pack-index"), + progress = start_delayed_progress(_("Writing chunks to multi-pack-index"), num_chunks); for (i = 0; i < num_chunks; i++) { if (written != chunk_offsets[i]) @@ -1109,7 +1109,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag return 0; if (flags & MIDX_PROGRESS) - progress = start_progress(_("Looking for referenced packfiles"), + progress = start_delayed_progress(_("Looking for referenced packfiles"), 
m->num_packs); for (i = 0; i < m->num_packs; i++) { if (prepare_midx_pack(r, m, i)) @@ -1230,7 +1230,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla count = xcalloc(m->num_packs, sizeof(uint32_t)); if (flags & MIDX_PROGRESS) - progress = start_progress(_("Counting referenced objects"), + progress = start_delayed_progress(_("Counting referenced objects"), m->num_objects); for (i = 0; i < m->num_objects; i++) { int pack_int_id = nth_midxed_pack_int_id(m, i); @@ -1240,7 +1240,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla stop_progress(&progress); if (flags & MIDX_PROGRESS) - progress = start_progress(_("Finding and deleting unreferenced packfiles"), + progress = start_delayed_progress(_("Finding and deleting unreferenced packfiles"), m->num_packs); for (i = 0; i < m->num_packs; i++) { char *pack_name; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 7214cab36c..12f41dfc18 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -172,12 +172,12 @@ test_expect_success 'write progress off for redirected stderr' ' ' test_expect_success 'write force progress on for stderr' ' - git multi-pack-index --object-dir=$objdir --progress write 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --progress write 2>err && test_file_not_empty err ' test_expect_success 'write with the --no-progress option' ' - git multi-pack-index --object-dir=$objdir --no-progress write 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --no-progress write 2>err && test_line_count = 0 err ' @@ -334,17 +334,17 @@ test_expect_success 'git-fsck incorrect offset' ' ' test_expect_success 'repack progress off for redirected stderr' ' - git multi-pack-index --object-dir=$objdir repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err && test_line_count = 0 err ' test_expect_success 'repack force 
progress on for stderr' ' - git multi-pack-index --object-dir=$objdir --progress repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --progress repack 2>err && test_file_not_empty err ' test_expect_success 'repack with the --no-progress option' ' - git multi-pack-index --object-dir=$objdir --no-progress repack 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir --no-progress repack 2>err && test_line_count = 0 err ' @@ -488,7 +488,7 @@ test_expect_success 'expire progress off for redirected stderr' ' test_expect_success 'expire force progress on for stderr' ' ( cd dup && - git multi-pack-index --progress expire 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --progress expire 2>err && test_file_not_empty err ) ' @@ -496,7 +496,7 @@ test_expect_success 'expire force progress on for stderr' ' test_expect_success 'expire with the --no-progress option' ' ( cd dup && - git multi-pack-index --no-progress expire 2>err && + GIT_PROGRESS_DELAY=0 git multi-pack-index --no-progress expire 2>err && test_line_count = 0 err ) ' From patchwork Thu Jul 23 17:56:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Passaro via GitGitGadget X-Patchwork-Id: 11681467 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C910B722 for ; Thu, 23 Jul 2020 17:57:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B14AC20792 for ; Thu, 23 Jul 2020 17:57:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Y05oITk0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730278AbgGWR5E (ORCPT ); Thu, 23 Jul 2020 13:57:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53334 
"EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730269AbgGWR5A (ORCPT ); Thu, 23 Jul 2020 13:57:00 -0400 Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7E25C0619E5 for ; Thu, 23 Jul 2020 10:56:59 -0700 (PDT) Received: by mail-wm1-x32b.google.com with SMTP id j18so5741412wmi.3 for ; Thu, 23 Jul 2020 10:56:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Zi54yqBdOqFyF97dKl9M0ENaSIb+QlkuRDDGBj/36rg=; b=Y05oITk0C3k+yR/SrOPvLbjo6XodmKhRt6K2a07edZ6aLkbnwIqQzJoQ3mADw2E3g0 WcaDJGwm+nCW4EjJH3rKP+VwZWZZmG1dV2QoyImZZmHhEkwa/8QLXHpbHHqL1Ml3dlor qZJbCm8jWeRUQVtkU4ux/mgNo5ED5dDoHQ28+/7u8aDm6qkowYANfoPcW5ScZVoTiVhA J5wJm3FRrNnzZT6ESf2PssSzMt2o8R33rXw83hzj57L0E588LxqXbGtaYRslfCtziVsu 6TYNNzKbYSHYhqw9inxthnNb1bprNv7VaF+uUbdO5JjDaLxcqwCgn6pIquQwSmSHthfQ Ka5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Zi54yqBdOqFyF97dKl9M0ENaSIb+QlkuRDDGBj/36rg=; b=J5YFLMevc/x0diiP9+tMiURcjYfzNDtxVzzyblug+Tf6cCod5lxhPogl13gJC1QuZI abpLUVHxS+D6NIExpx3cWUoeWqYE2ctoEA5SrjWUQy4ZGbkDf6EHlFccU2YULLLVO5PY SDFF3xCkVtmZbJb5EGp6ql0e7+BYfUOSXceqMbI17Z/aO81TfVAHoJuO946+WVbAswVA ub4hQkZtuwEjpPckSETppk65GGK/OFRnTwmt8zZy2AA7fSYz6tx9OEtxeEbeYKbgylq8 S5mPQxgW4fAGiUAIxIxHB8UcJ7pLO7Iz+NQf7ILfr732va5st9zUNsqzzVUuVHkeD914 CTZA== X-Gm-Message-State: AOAM533m6C7QhIpzxU4bBHGKzq/T/WvipLGnJSOuOLEH3d6SH3TRACwY UPK8RG+9If9XRC3x2HpsFXaFpSJo X-Google-Smtp-Source: ABdhPJx33QWHO95EMLmtkmx+sS5/CbNtVKHdUXEp2kcixTNZdq4UkgwZcxsfq+NsOmpzozIf3ui1dQ== X-Received: by 2002:a1c:27c1:: with SMTP id n184mr5393934wmn.6.1595527018444; Thu, 23 Jul 2020 10:56:58 -0700 (PDT) 
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g3sm5178452wrb.59.2020.07.23.10.56.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 10:56:58 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Thu, 23 Jul 2020 17:56:40 +0000 Subject: [PATCH v2 18/18] maintenance: add trace2 regions for task execution Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Johannes.Schindelin@gmx.de, sandals@crustytoothpaste.net, steadmon@google.com, jrnieder@gmail.com, peff@peff.net, congdanhqx@gmail.com, phillip.wood123@gmail.com, emilyshaffer@google.com, sluongng@gmail.com, jonathantanmy@google.com, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Signed-off-by: Derrick Stolee --- builtin/gc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/gc.c b/builtin/gc.c index b040c7d31d..7d9e6c34b7 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -1322,7 +1322,9 @@ static int maintenance_run(void) !tasks[i]->auto_condition())) continue; + trace2_region_enter("maintenance", tasks[i]->name, r); result = tasks[i]->fn(); + trace2_region_leave("maintenance", tasks[i]->name, r); } rollback_lock_file(&lk);
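
Since this last patch carries no description, the following self-contained sketch may help show how the pieces of the series fit together: a task is skipped under `--auto` when its condition says so, and each task that does run is bracketed by an enter/leave pair. This is an illustration under stated assumptions, not the code in builtin/gc.c. `struct task`, `run_tasks()`, `region_enter()` and `region_leave()` are hypothetical; the two region helpers only print and stand in for `trace2_region_enter()` and `trace2_region_leave()`. Treating a missing condition as "skip under auto" is also an assumption.

```c
#include <stdio.h>

/* Hypothetical task record; field names mirror those visible in the diff. */
struct task {
	const char *name;
	int (*fn)(void);
	int (*auto_condition)(void);   /* may be NULL */
};

/* Stand-ins for the trace2 region calls: they only print a marker. */
static void region_enter(const char *category, const char *label)
{
	printf("region_enter %s/%s\n", category, label);
}

static void region_leave(const char *category, const char *label)
{
	printf("region_leave %s/%s\n", category, label);
}

/* Tiny demo tasks and conditions so the sketch can be exercised. */
static int calls;                          /* how many task bodies ran */
static int task_ok(void)  { calls++; return 0; }
static int cond_yes(void) { return 1; }
static int cond_no(void)  { return 0; }

/*
 * Run tasks in order and stop at the first failure. In auto mode a task
 * runs only if it has a condition and that condition returns non-zero
 * (assumption of this sketch).
 */
static int run_tasks(const struct task *tasks, size_t n, int auto_mode)
{
	int result = 0;
	size_t i;

	for (i = 0; !result && i < n; i++) {
		if (auto_mode &&
		    (!tasks[i].auto_condition || !tasks[i].auto_condition()))
			continue;

		region_enter("maintenance", tasks[i].name);
		result = tasks[i].fn();
		region_leave("maintenance", tasks[i].name);
	}
	return result;
}
```

Because the markers sit inside the loop and after the skip check, a skipped task emits no region at all. The tests in t7900 depend on that kind of absence when they grep the trace output for a task's name.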