From patchwork Wed Jun 29 18:45:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12900569 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EC8CC433EF for ; Wed, 29 Jun 2022 18:46:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231963AbiF2SqV (ORCPT ); Wed, 29 Jun 2022 14:46:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232025AbiF2SqG (ORCPT ); Wed, 29 Jun 2022 14:46:06 -0400 Received: from mail-qk1-x732.google.com (mail-qk1-x732.google.com [IPv6:2607:f8b0:4864:20::732]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F9DA3EF29 for ; Wed, 29 Jun 2022 11:45:57 -0700 (PDT) Received: by mail-qk1-x732.google.com with SMTP id v6so12765220qkh.2 for ; Wed, 29 Jun 2022 11:45:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=45PPgvs47uQVLRW8vx0jQD35V+xYebNEUK4soB/ZLlg=; b=3YwW9cAgLIjY0yw+BEzY/YDrNzMNxA7EdvipqH3WX2axHXTbM43gvp6xwfgNLKacLO AeqdZXC867Jg237AwOSLy3J6PZALLEZBXhTC+I6SobjC7b7q37jAmI4WFh5VtAitJQZu rsfc5v5isXs/oCUgGTcxm/8glRNXP1p3kaoTL+v2Jt9gmvR5+Maghi/oBDETdBGJLYlu C+ZueI6ApkODeFoSZ/1PXAkNwpe2Mk32pMnUqvg6g4dzgsFuSMvv5xQBMmM8E8QKluLi i7AMGctqsEyqMFum3UFyTOxJ4P12CNiF0RVu0sXm9kQzc51DFgzb4/mpvSav09HNb395 XI5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=45PPgvs47uQVLRW8vx0jQD35V+xYebNEUK4soB/ZLlg=; 
b=J8AdYyIElk6Hn9aZ1MiVsw556JlsZ9dR70SWWvpnLu/PTMuroXg2hgaEsLN2XLnUHr YO6XFB+zGC/rHTQxrT3hyqBjbsozluzyedli2cBg7NBHmLvdrn5ifHPT6LVE9W2HBLqC 8OVBSzFZKl+Tk+rFPn2ZkV3qCrAL1gv5jgynpqHh5oC5PmDY8MOjQd+ULeocfZPtR7i2 KNox4uIztVLH6X8XJpN6d4JOaA35CRUsgzrqlv1NJKglyeHyuPugD7o1HtKMSEEP40xE rBKUgbim9PohUcwLEa3x5KBRZuzeIZuWXDMj54ikL5wFeCq21JYZuNCJIV/qxEKkpCaM QIOw== X-Gm-Message-State: AJIora8UKoidoKHse5sUtQQxBDd6S+dt3ucHQSUix9zPcPqylA4UgHTY 8vSHGPSo2cGXCVtJ+PuuPAamntyFYwzarg== X-Google-Smtp-Source: AGRyM1vyihSNJeQOqPHifjbbZqlARKqopLF6KEeAAV1+LMHTw95CF2Rt9zcBpJ0RdVGFegd70jyM6w== X-Received: by 2002:a05:620a:294a:b0:6ae:fb7f:831 with SMTP id n10-20020a05620a294a00b006aefb7f0831mr3291442qkp.130.1656528356341; Wed, 29 Jun 2022 11:45:56 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s12-20020a05620a29cc00b006a36b0d7f27sm14653953qkp.76.2022.06.29.11.45.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jun 2022 11:45:56 -0700 (PDT) Date: Wed, 29 Jun 2022 14:45:54 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: derrickstolee@github.com, jonathantanmy@google.com, gitster@pobox.com Subject: [RFC PATCH 1/4] builtin/repack.c: pass "out" to `prepare_pack_objects` Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org `builtin/repack.c`'s `prepare_pack_objects()` is used to prepare a set of arguments to a `pack-objects` process which will generate a desired pack. A future patch will add an `--expire-to` option which allows `git repack` to write a cruft pack containing the pruned objects out to a separate repository. Prepare for this by teaching that function to write packs to an arbitrary location specified by the caller. All existing callers of `prepare_pack_objects()` will pass `packtmp` for `out`, retaining the existing behavior. 
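For illustration only, here is a minimal sketch of the shape of this change outside of Git's codebase. `struct arg_list`, `arg_push()` and `prepare_args()` are hypothetical stand-ins (for `strvec` and `prepare_pack_objects()` respectively), not the real API. The point is only that the output prefix arrives as a parameter instead of being read from a file-scope variable:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 8

/* Hypothetical stand-in for git's strvec: a fixed-size argument list. */
struct arg_list {
	const char *v[MAX_ARGS];
	size_t nr;
};

static void arg_push(struct arg_list *args, const char *s)
{
	assert(args->nr < MAX_ARGS);
	args->v[args->nr++] = s;
}

/*
 * The output prefix is a parameter rather than a file-scope global, so
 * each caller decides where the resulting pack should be written.
 */
static void prepare_args(struct arg_list *args, int quiet, const char *out)
{
	args->nr = 0;
	arg_push(args, "pack-objects");
	if (quiet)
		arg_push(args, "--quiet");
	arg_push(args, out); /* the destination is always the last argument */
}
```

A caller that wants the old behavior passes the same temporary prefix it used before; a caller that wants a different destination passes its own string.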
Signed-off-by: Taylor Blau --- builtin/repack.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 4a7ae4cf48..025882a075 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -188,7 +188,8 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name) } static void prepare_pack_objects(struct child_process *cmd, - const struct pack_objects_args *args) + const struct pack_objects_args *args, + const char *out) { strvec_push(&cmd->args, "pack-objects"); if (args->window) @@ -211,7 +212,7 @@ static void prepare_pack_objects(struct child_process *cmd, strvec_push(&cmd->args, "--quiet"); if (delta_base_offset) strvec_push(&cmd->args, "--delta-base-offset"); - strvec_push(&cmd->args, packtmp); + strvec_push(&cmd->args, out); cmd->git_cmd = 1; cmd->out = -1; } @@ -275,7 +276,7 @@ static void repack_promisor_objects(const struct pack_objects_args *args, FILE *out; struct strbuf line = STRBUF_INIT; - prepare_pack_objects(&cmd, args); + prepare_pack_objects(&cmd, args, packtmp); cmd.in = -1; /* @@ -673,7 +674,7 @@ static int write_cruft_pack(const struct pack_objects_args *args, FILE *in, *out; int ret; - prepare_pack_objects(&cmd, args); + prepare_pack_objects(&cmd, args, packtmp); strvec_push(&cmd.args, "--cruft"); if (cruft_expiration) @@ -862,7 +863,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) sigchain_push_common(remove_pack_on_signal); - prepare_pack_objects(&cmd, &po_args); + prepare_pack_objects(&cmd, &po_args, packtmp); show_progress = !po_args.quiet && isatty(2); From patchwork Wed Jun 29 18:47:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12900570 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org 
(Postfix) with ESMTP id 65046C433EF for ; Wed, 29 Jun 2022 18:47:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230254AbiF2SrR (ORCPT ); Wed, 29 Jun 2022 14:47:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229838AbiF2SrP (ORCPT ); Wed, 29 Jun 2022 14:47:15 -0400 Received: from mail-qv1-xf2b.google.com (mail-qv1-xf2b.google.com [IPv6:2607:f8b0:4864:20::f2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 623373B2BC for ; Wed, 29 Jun 2022 11:47:14 -0700 (PDT) Received: by mail-qv1-xf2b.google.com with SMTP id p31so26200493qvp.5 for ; Wed, 29 Jun 2022 11:47:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=ZpRxP2pxbt62zotQ3LBu7s+VpuUAnNiRCUQjgppAgFI=; b=uEiHBVNEjaHkkHl9GkHHEkw+YNbNXIU6rgT3F9FWlbLrWprQrja912cD+nn3WdIqVe ADFsK5Y60MQME0k/w3LrisQ9NhUUcD2RWwYSOvm4tb0V1MA21mMjm2AxeO0hEod0+4RU saIFAtfyE2t/GOdd4VCjWz1XM7Zei9GV11/rAMajftu0zzCjQGtkrooNOapu+VXvMJe6 pNSz+njWWo8PaoaPh0/5fTsbVN2OyHxZwusQo53C86qDvnzBgZiqltsOq7gFIw5RtHRk KKYZJVd4xJhusfMxGbADD/ju3TQeGY4qHHjLvPXNWLIuwIhyRCYSCChVdI60WjEecx8A YWTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=ZpRxP2pxbt62zotQ3LBu7s+VpuUAnNiRCUQjgppAgFI=; b=6HT5g9hAPM9OcpoP6WpmV9BbHMr27KYijA49q58opEG3PCQZGlI2FVewYBTGJE2zjZ G4ZroOZpHYKHgFiT0vYr0rUbqhOqXwLxQdgfGBWJp1g6O5vxWIdV7wcZi+R6SQaKr6oi hYB2y604Znt405Zm0RNHfu353hYeRifS7dLGNRwN5A4ntnTXTEAxW4AHIfqYmyT/DWVs WTyPUQvWWPtPjRDYtWDmJB4+tbpKp0Tlky9agCkfZyY4ASu2xaeSuJvYH81QMXYqhC6v sAhlLO8w8BiGe3/urNR8JqZVPfTAJbI22Nz7inUOKqYhthDOIuzntQoaQmZUO4xMoGOi YE4w== X-Gm-Message-State: 
AJIora+rM9dC0GmDnJ7Ye4lxghx1F5Z37SPe0CMJy2wMq1yQm7PFHez6 oea+BCwoVZ0dLzGD8ooXu3U+vLne3vwSOA== X-Google-Smtp-Source: AGRyM1sQCvzFUmzca6DzzePxYzt0gK5K+yy8HmgOc/UyrXiFtOkx/2Ie/5chqZ7SMuvgRUKgHSIhgA== X-Received: by 2002:a05:6214:202d:b0:470:3e7a:d183 with SMTP id 13-20020a056214202d00b004703e7ad183mr9169928qvf.4.1656528433321; Wed, 29 Jun 2022 11:47:13 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id bz19-20020a05622a1e9300b0031bba2e05aesm4628597qtb.58.2022.06.29.11.47.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jun 2022 11:47:13 -0700 (PDT) Date: Wed, 29 Jun 2022 14:47:11 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: derrickstolee@github.com, jonathantanmy@google.com, gitster@pobox.com Subject: [RFC PATCH 2/4] builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack` Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org `builtin/repack.c`'s `write_cruft_pack()` is used to generate the cruft pack when `--cruft` is supplied. It uses a static variable "cruft_expiration" which is filled in by option parsing. A future patch will add an `--expire-to` option which allows `git repack` to write a cruft pack containing the pruned objects out to a separate repository. In order to implement this functionality, some callers will have to pass a value for `cruft_expiration` different than the one filled out by option parsing. Prepare for this by teaching `write_cruft_pack` to take a "cruft_expiration" parameter, instead of reading a single static variable. The (sole) existing caller of `write_cruft_pack()` will pass the value for "cruft_expiration" filled in by option parsing, retaining existing behavior. This means that we can make the variable local to `cmd_repack()`, and eliminate the static declaration. 
Signed-off-by: Taylor Blau --- builtin/repack.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 025882a075..19e789d745 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -32,7 +32,6 @@ static int write_bitmaps = -1; static int use_delta_islands; static int run_update_server_info = 1; static char *packdir, *packtmp_name, *packtmp; -static char *cruft_expiration; static const char *const git_repack_usage[] = { N_("git repack []"), @@ -664,6 +663,7 @@ static int write_midx_included_packs(struct string_list *include, static int write_cruft_pack(const struct pack_objects_args *args, const char *pack_prefix, + const char *cruft_expiration, struct string_list *names, struct string_list *existing_packs, struct string_list *existing_kept_packs) @@ -747,6 +747,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct pack_objects_args cruft_po_args = {NULL}; int geometric_factor = 0; int write_midx = 0; + const char *cruft_expiration = NULL; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, &pack_everything, @@ -986,7 +987,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) cruft_po_args.local = po_args.local; cruft_po_args.quiet = po_args.quiet; - ret = write_cruft_pack(&cruft_po_args, pack_prefix, &names, + ret = write_cruft_pack(&cruft_po_args, pack_prefix, + cruft_expiration, &names, &existing_nonkept_packs, &existing_kept_packs); if (ret) From patchwork Wed Jun 29 18:47:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12900571 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A1FBC43334 for ; Wed, 29 Jun 2022 18:47:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S229982AbiF2SrW (ORCPT ); Wed, 29 Jun 2022 14:47:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230452AbiF2SrT (ORCPT ); Wed, 29 Jun 2022 14:47:19 -0400 Received: from mail-qv1-xf35.google.com (mail-qv1-xf35.google.com [IPv6:2607:f8b0:4864:20::f35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C77C8275FF for ; Wed, 29 Jun 2022 11:47:17 -0700 (PDT) Received: by mail-qv1-xf35.google.com with SMTP id t16so26263256qvh.1 for ; Wed, 29 Jun 2022 11:47:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=54HJQC/LI5ra5plJoOqbEYLvaFE3OeMEQ89PfY0Caro=; b=jKL9usEzYiyUQ3NzxE2r/T+2zRd1zvDLaoWHs6+PSXYoRHRCeNirmj3TrW+3zDRi7O DE5bFHk/pBNzUjgZL8BA2trbSMgQVy+2NI20+sE5x+aTojo/svekRwST8pV/4MIazieJ UY1ix8CZmUO/4pSqmqQxAJWhjYyuE1orQ741CITt34cvi11PpJsdyGUiajsHEQDS+VWF qQV1qG1tCddbGcoVN3swf/CwMBGMpjR4hCh6ZXQCzKhuWVuxMIS/1TFHJqP+JqMiHIVt nJXnAq/o/XzfG6VJobvOT36VSibuFkqhH5JtYbVv4MltbgGFD3Xi2vx92Hjpefr1kmUo AYog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=54HJQC/LI5ra5plJoOqbEYLvaFE3OeMEQ89PfY0Caro=; b=00Cd35V2NXsOy8zMFmXJdrN74Jziq/+xkaKXMkHw5MC4OWH/ycPi+ibwv3gdNbfnEi du4InhTpbF1KLO5JcU5/aJ5SPeLWRcJOTUQJx4qjaM5mSc6WDjJ2diKQmwRhXI0P6V34 ofvABOYh05rWNN74bPK+xeeWo7t/wxaPleo6owKRmxYtHs8Yy1lKdaFXb2LTUzw0wPM4 h9Cp4SS53xon7gKL/w2FjB/+fhvIlBBl5x/O9+VhfsJCu+Fo+pMGgvuXtIN/Ftm8fqr+ 16E05mQXS+X8ebEgFgIDdqA65D+phVtqL4cljLVpZPmGGlWCKEku0kDzdXuQWd4KB/1h 05Ng== X-Gm-Message-State: AJIora/prwc2yoXZlx7I+yHB78zDsu4CJhgJS1VM9qm+0PNwhnDGv997 lr8JQYQC//uZlG12FSHsvXKsIppBskhCFA== X-Google-Smtp-Source: 
AGRyM1sJswwXM6RwOZztHgW9Fk2ErnI9ScPACoI7mgbnYS+zGjfHYqwZwI00N2n/j9vxHl3UoTxZqA== X-Received: by 2002:ac8:5a89:0:b0:31b:f68d:5cee with SMTP id c9-20020ac85a89000000b0031bf68d5ceemr4005668qtc.468.1656528436722; Wed, 29 Jun 2022 11:47:16 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id m5-20020ae9e005000000b006af1f0af045sm8744639qkk.107.2022.06.29.11.47.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jun 2022 11:47:16 -0700 (PDT) Date: Wed, 29 Jun 2022 14:47:14 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: derrickstolee@github.com, jonathantanmy@google.com, gitster@pobox.com Subject: [RFC PATCH 3/4] builtin/repack.c: write cruft packs to arbitrary locations Message-ID: <86bfb40904427f68d5094e00d568eaa1ad341cad.1656528415.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In the following commit, a new write_cruft_pack() caller will be added which wants to write a cruft pack to an arbitrary location. Prepare for this by adding a parameter which controls the destination of the cruft pack. For now, provide "packtmp" so that this commit does not change any behavior. 
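As a rough sketch of the "only record local packs" rule this patch adds (the patch itself uses Git's `skip_prefix()`; the `has_prefix()` helper, `struct name_list` and `record_pack()` below are hypothetical stand-ins written so the example is self-contained):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 8

/* Hypothetical stand-in for git's string_list. */
struct name_list {
	const char *items[MAX_NAMES];
	size_t nr;
};

/* Return 1 if `str` begins with `prefix`, 0 otherwise. */
static int has_prefix(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/*
 * Record `pack_hash` in `names` only when `destination` lies under
 * `packdir`, i.e. when the pack was written into this repository.
 */
static void record_pack(struct name_list *names, const char *packdir,
			const char *destination, const char *pack_hash)
{
	if (!has_prefix(destination, packdir))
		return; /* written elsewhere: keep it out of `names` */
	assert(names->nr < MAX_NAMES);
	names->items[names->nr++] = pack_hash;
}
```

Note that a plain string-prefix comparison is a coarse notion of "local": it compares the strings as given and does not normalize paths.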
Signed-off-by: Taylor Blau --- builtin/repack.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 19e789d745..ab976007e1 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -662,6 +662,7 @@ static int write_midx_included_packs(struct string_list *include, } static int write_cruft_pack(const struct pack_objects_args *args, + const char *destination, const char *pack_prefix, const char *cruft_expiration, struct string_list *names, @@ -673,8 +674,10 @@ static int write_cruft_pack(const struct pack_objects_args *args, struct string_list_item *item; FILE *in, *out; int ret; + const char *scratch; + int local = skip_prefix(destination, packdir, &scratch); - prepare_pack_objects(&cmd, args, packtmp); + prepare_pack_objects(&cmd, args, destination); strvec_push(&cmd.args, "--cruft"); if (cruft_expiration) @@ -714,7 +717,12 @@ static int write_cruft_pack(const struct pack_objects_args *args, if (line.len != the_hash_algo->hexsz) die(_("repack: Expecting full hex object ID lines only " "from pack-objects.")); - string_list_append(names, line.buf); + /* + * avoid putting packs written outside of the repository in the + * list of names + */ + if (local) + string_list_append(names, line.buf); } fclose(out); @@ -987,7 +995,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) cruft_po_args.local = po_args.local; cruft_po_args.quiet = po_args.quiet; - ret = write_cruft_pack(&cruft_po_args, pack_prefix, + ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix, cruft_expiration, &names, &existing_nonkept_packs, &existing_kept_packs); From patchwork Wed Jun 29 18:47:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12900572 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org 
[23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ABB33C43334 for ; Wed, 29 Jun 2022 18:47:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231825AbiF2Sr3 (ORCPT ); Wed, 29 Jun 2022 14:47:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230452AbiF2SrY (ORCPT ); Wed, 29 Jun 2022 14:47:24 -0400 Received: from mail-qk1-x732.google.com (mail-qk1-x732.google.com [IPv6:2607:f8b0:4864:20::732]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EC8DB24BE7 for ; Wed, 29 Jun 2022 11:47:20 -0700 (PDT) Received: by mail-qk1-x732.google.com with SMTP id b133so12734561qkc.6 for ; Wed, 29 Jun 2022 11:47:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20210112.gappssmtp.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=YHotxnglMry9bUjnWUlMqUeL1td4DaS/VON4a5zDDdc=; b=QkTjIoblVjSFIVxvCK0i0g9BvnJyd+Q8UgxFGlYIVkiLXs3vUy5r/jVkwPWYJ8Siyw PAIbRGCeJEsm4f/GH2FsBoUabuOXDdL7wmZcMVLSScD0R5EuEON7ziQyofVf1CiU9fh4 bwuWSiUGlceHECGAPBxeh4RmXMbyJu6jtuNmUCVkoccUrywJSZiKom6enukXN9K1rzDf 9AqgOxVlW19IzCd8fhWxAerzOhhJHP4fw7UFJE5i+ttueGMB7ttsUGfsrxdg2UZbz0Vi NLC6P5uv3bYMRiFiZ50IIR//JJ7Z8zPjzNlAKV/MtC8/39189pHmCO+6ERt3ejlfvj2X QKHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=YHotxnglMry9bUjnWUlMqUeL1td4DaS/VON4a5zDDdc=; b=LeJtHx4x1Ag5xxCzOV0RjtGzUdzAGo/kw5ZZZfJ/R1AH8sV23J4WFq34XcJgBrLZdf tkXpTmFh2stBwBnMuNvTDtr/PvnlD8fMMyCKi9XG0z33UCqMHRB0SZWCw2B1Bwb3Yx86 jrs5Q+TXq2Ubrr6IcfMK5pWRbem+rlt9Kx1GK6KChQzCTZBfG6O1Sxi2nbmdhjMtVC7Z pmfXKu+mHf2qauISKnve504jP/lLgPdfr8733aBw1F6zJ0nB8aAPQzzIKJmh6RZ4i84u cXqn5zet7EAOsBGjAmn5W8OodL6wg5tNNnR5aUvYf+t/dc/9JrqtA8N5mbxQWu9dVp7q L31A== 
X-Gm-Message-State: AJIora9fLVZHq7LakbVRJk9hHMbwtdKzZC66ulc8tjdHHmFCHZnqJ3PF 17Ocv4GyAPgifqCGB7iobR9n5vQL0F4qVg== X-Google-Smtp-Source: AGRyM1vKl4oOFbGW0JZPWk6o2KwS4+JVwqP1cHvCI/xwXbPRD84/1x2DWAIBdajpaQ/EliI9Cq2LZA== X-Received: by 2002:a37:4655:0:b0:6af:3d7d:f827 with SMTP id t82-20020a374655000000b006af3d7df827mr3351277qka.776.1656528439640; Wed, 29 Jun 2022 11:47:19 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id t12-20020a05620a450c00b006a746826feesm14891359qkp.120.2022.06.29.11.47.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jun 2022 11:47:19 -0700 (PDT) Date: Wed, 29 Jun 2022 14:47:18 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: derrickstolee@github.com, jonathantanmy@google.com, gitster@pobox.com Subject: [RFC PATCH 4/4] builtin/repack.c: implement `--expire-to` for storing pruned objects Message-ID: <9baf44b3174d82c9fae858c77955cacb5131aa91.1656528415.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org

When pruning objects with `--cruft`, `git repack` offers some flexibility in selecting the set of objects which are pruned via the `--cruft-expiration` option.

This is useful for expiring only objects which are older than the grace period. It makes substantially less likely the races where to-be-pruned objects become reachable, and then ancestors of freshly pushed objects, so that pruning leaves the repository in a corrupt state. But in practice, such races are impossible to avoid entirely, no matter how long the grace period is.

To prevent this race, it is often advisable to temporarily put a repository into a read-only state. But this is not always practical, and so some middle ground would be nice.

This patch introduces a new option, `--expire-to`, which teaches `git repack` to write an additional cruft pack containing just the objects which were pruned from the repository. The caller can specify a directory outside of the current repository as the destination for this second cruft pack.

This makes it possible to prune objects from a repository, while still holding onto a supplemental copy of them outside of the original repository. Having this copy on-disk makes it substantially easier to recover objects when the aforementioned race is encountered.

`--expire-to` is implemented in a somewhat convoluted manner: it takes advantage of the fact that the first call to `write_cruft_pack()` adds the name of the cruft pack to the `names` string list. That means the second time we call `write_cruft_pack()`, objects in the previously-written cruft pack will be excluded.

As long as the caller ensures that no objects are expired during the second pass, this is sufficient to generate a cruft pack containing all objects which don't appear in any of the new packs written by `git repack`, including the cruft pack. In other words, all of the objects which are about to be pruned from the repository.

It is important to note that the destination in `--expire-to` does not necessarily need to be a Git repository (though it can be). Notably, the expired packs do not contain all ancestors of expired objects. So if the source repository contains something like:

          (unreachable)
         /
    C1 --- C2
      \
       refs/heads/master

where C2 is unreachable, but has a parent (C1) which is reachable, and C2 would be pruned, then the expiry pack will contain only C2, not C1.

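The two-pass idea can be modeled with a toy sketch. Everything here is an assumption made for illustration: `struct toy_object`, `cruft_pass()` and `NO_EXPIRATION` are hypothetical names, and the `packed` flag merely stands in for "contained in some pack listed in `names`". It is not how `pack-objects` actually selects objects, only a model of the exclusion logic described above:

```c
#include <assert.h>
#include <stddef.h>

#define NO_EXPIRATION (-1L)

/* Toy object: `packed` is set once any pack listed in `names` holds it. */
struct toy_object {
	const char *name;
	long mtime;
	int packed;
};

/*
 * Toy model of one cruft-pack pass: select every object that no pack
 * written so far contains and that is not older than `cutoff`
 * (NO_EXPIRATION disables the age check). Selected objects are marked
 * as packed and stored in `out`. Returns how many were selected.
 */
static size_t cruft_pass(struct toy_object *objs, size_t nr, long cutoff,
			 const struct toy_object **out)
{
	size_t i, n = 0;

	assert(objs && out);
	for (i = 0; i < nr; i++) {
		if (objs[i].packed)
			continue; /* already in a pack named in `names` */
		if (cutoff != NO_EXPIRATION && objs[i].mtime < cutoff)
			continue; /* too old: left out of this pack */
		objs[i].packed = 1;
		out[n++] = &objs[i];
	}
	return n;
}
```

Running the first pass with a real cutoff selects only the recent unreachable objects. Running it a second time with `NO_EXPIRATION` selects whatever is still unpacked, which in this model is exactly the set of objects that would otherwise be pruned.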
Signed-off-by: Taylor Blau --- Documentation/git-repack.txt | 6 ++ builtin/repack.c | 40 ++++++++++++ t/t7700-repack.sh | 121 +++++++++++++++++++++++++++++++++++ 3 files changed, 167 insertions(+) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 0bf13893d8..4017157949 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -74,6 +74,12 @@ to the new separate pack will be written. immediately instead of waiting for the next `git gc` invocation. Only useful with `--cruft -d`. +--expire-to=:: + Write a cruft pack containing pruned objects (if any) to the + directory ``. This option is useful for keeping a copy of + any pruned objects in a separate directory as a backup. Only + useful with `--cruft -d`. + -l:: Pass the `--local` option to 'git pack-objects'. See linkgit:git-pack-objects[1]. diff --git a/builtin/repack.c b/builtin/repack.c index ab976007e1..d789150a2e 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -702,6 +702,10 @@ static int write_cruft_pack(const struct pack_objects_args *args, * By the time it is read here, it contains only the pack(s) * that were just written, which is exactly the set of packs we * want to consider kept. + * + * If `--expire-to` is given, the double-use served by `names` + * ensures that the pack written to `--expire-to` excludes any + * objects contained in the cruft pack. 
*/ in = xfdopen(cmd.in, "w"); for_each_string_list_item(item, names) @@ -756,6 +760,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) int geometric_factor = 0; int write_midx = 0; const char *cruft_expiration = NULL; + const char *expire_to = NULL; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, &pack_everything, @@ -805,6 +810,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("find a geometric progression with factor ")), OPT_BOOL('m', "write-midx", &write_midx, N_("write a multi-pack index of the resulting packs")), + OPT_STRING(0, "expire-to", &expire_to, N_("dir"), + N_("pack prefix to store a pack containing pruned objects")), OPT_END() }; @@ -1001,6 +1008,39 @@ int cmd_repack(int argc, const char **argv, const char *prefix) &existing_kept_packs); if (ret) return ret; + + if (delete_redundant && expire_to) { + /* + * If `--expire-to` is given with `-d`, it's possible + * that we're about to prune some objects. With cruft + * packs, pruning is implicit: any objects from existing + * packs that weren't picked up by new packs are removed + * when their packs are deleted. + * + * Generate an additional cruft pack, with one twist: + * `names` now includes the name of the cruft pack + * written in the previous step. So the contents of + * _this_ cruft pack exclude everything contained in the + * existing cruft pack (that is, all of the unreachable + * objects which are no older than + * `--cruft-expiration`). + * + * To make this work, cruft_expiration must become NULL + * so that this cruft pack doesn't actually prune any + * objects. If it were non-NULL, this call would always + * generate an empty pack (since every object not in the + * cruft pack generated above will have an mtime older + * than the expiration). 
+ */ + ret = write_cruft_pack(&cruft_po_args, expire_to, + pack_prefix, + NULL, + &names, + &existing_nonkept_packs, + &existing_kept_packs); + if (ret) + return ret; + } } string_list_sort(&names); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index ca45c4cd2c..7ffd3c7b54 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -482,4 +482,125 @@ test_expect_success '-n overrides repack.updateServerInfo=true' ' test_server_info_missing ' +test_expect_success '--expire-to stores pruned objects (now)' ' + git init expire-to-now && + ( + cd expire-to-now && + + git branch -M main && + + test_commit base && + + git checkout -b cruft && + test_commit --no-tag cruft && + + git rev-list --objects --no-object-names main..cruft >moved.raw && + sort moved.raw >moved.want && + + git rev-list --all --objects --no-object-names >expect.raw && + sort expect.raw >expect && + + git checkout main && + git branch -D cruft && + git reflog expire --all --expire=all && + + git init --bare expired.git && + git repack -d \ + --cruft --cruft-expiration="now" \ + --expire-to="expired.git/objects/pack/pack" && + + expired="$(ls expired.git/objects/pack/pack-*.idx)" && + test_path_is_file "${expired%.idx}.mtimes" && + + # Since the `--cruft-expiration` is "now", the effective + # behavior is to move _all_ unreachable objects out to + # the location in `--expire-to`. + git show-index <$expired >expired.raw && + cut -d" " -f2 expired.raw | sort >expired.objects && + git rev-list --all --objects --no-object-names \ + >remaining.objects && + + # ...in other words, the combined contents of this + # repository and expired.git should be the same as the + # set of objects we started with. + cat expired.objects remaining.objects | sort >actual && + test_cmp expect actual && + + # The "moved" objects (i.e., those in expired.git) + # should be the same as the cruft objects which were + # expired in the previous step.
+ test_cmp moved.want expired.objects + ) +' + +test_expect_success '--expire-to stores pruned objects (5.minutes.ago)' ' + git init expire-to-5.minutes.ago && + ( + cd expire-to-5.minutes.ago && + + git branch -M main && + + test_commit base && + + # Create two classes of unreachable objects, one which + # is older than 5 minutes (stale), and another which is + # newer (recent). + for kind in stale recent + do + git checkout -b $kind main && + test_commit --no-tag $kind + done && + + git rev-list --objects --no-object-names main..stale >in && + stale="$(git pack-objects $objdir/pack/pack expect.raw && + sort expect.raw >expect && + + # moved.want holds the set of objects we expect to find + # in expired.git + git rev-list --objects --no-object-names main..stale >out && + sort out >moved.want && + + git checkout main && + git branch -D stale recent && + git reflog expire --all --expire=all && + git prune-packed && + + git init --bare expired.git && + git repack -d \ + --cruft --cruft-expiration=5.minutes.ago \ + --expire-to="expired.git/objects/pack/pack" && + + # Some of the remaining objects in this repository are + # unreachable, so use `cat-file --batch-all-objects` + # instead of `rev-list` to get their names + git cat-file --batch-all-objects --batch-check="%(objectname)" \ + >remaining.objects && + sort remaining.objects >actual && + test_cmp expect actual && + + ( + cd expired.git && + + expired="$(ls objects/pack/pack-*.mtimes)" && + test-tool pack-mtimes $(basename $expired) >out && + cut -d" " -f1 out | sort >../moved.got && + + # Ensure that there are as many objects with the + # expected mtime as were moved to expired.git. + # + # In other words, ensure that the recorded + # mtimes of any moved objects were written + # correctly. + grep " $mtime$" out >matching && + test_line_count = $(wc -l <../moved.want) matching + ) && + test_cmp moved.want moved.got + ) +' + test_done