From patchwork Wed Jun 14 19:25:33 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280404
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder,
 Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder,
 Christian Couder
Subject: [PATCH 1/9] pack-objects: allow `--filter` without `--stdout`
Date: Wed, 14 Jun 2023 21:25:33 +0200
Message-ID: <20230614192541.1599256-2-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
taught `git pack-objects` to use `--filter`, but required the use of
`--stdout` since a partial clone mechanism was not yet in place to
handle missing objects. Since then, changes like 9e27beaa23
(promisor-remote: implement promisor_remote_get_direct(), 2019-06-25)
and others added support to dynamically fetch objects that were missing.

Even without a promisor remote, filtering out objects can also be useful
if we can put the filtered out objects in a separate pack, and in this
case it also makes sense for pack-objects to write the packfile directly
to an actual file rather than on stdout.

Remove the `--stdout` requirement when using `--filter`, so that in a
follow-up commit, repack can pass `--filter` to pack-objects to omit
certain objects from the resulting packfile.

Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-pack-objects.txt | 4 ++--
 builtin/pack-objects.c             | 8 ++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index a9995a932c..583270a85f 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -298,8 +298,8 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
 	nevertheless.
 
 --filter=<filter-spec>::
-	Requires `--stdout`. Omits certain objects (usually blobs) from
-	the resulting packfile. See linkgit:git-rev-list[1] for valid
+	Omits certain objects (usually blobs) from the resulting
+	packfile. See linkgit:git-rev-list[1] for valid
 	`<filter-spec>` forms.
 
 --no-filter::
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 9cfc8801f9..af007868c1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4388,12 +4388,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
 		unpack_unreachable_expiration = 0;
 
-	if (filter_options.choice) {
-		if (!pack_to_stdout)
-			die(_("cannot use --filter without --stdout"));
-		if (stdin_packs)
-			die(_("cannot use --filter with --stdin-packs"));
-	}
+	if (stdin_packs && filter_options.choice)
+		die(_("cannot use --filter with --stdin-packs"));
 
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));

From patchwork Wed Jun 14 19:25:34 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280406
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder,
 Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder,
 Christian Couder
Subject: [PATCH 2/9] pack-objects: add `--print-filtered` to print omitted objects
Date: Wed, 14 Jun 2023 21:25:34 +0200
Message-ID: <20230614192541.1599256-3-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

When using the `--filter=` option, `git pack-objects` will omit some
objects from the resulting packfile(s) it produces. It could be useful
to know about these omitted objects though.

For example, we might want to write these objects into a separate
packfile by piping them into another `git pack-objects` process. Or we
might want to check if these objects are available from a promisor
remote.

Anyway, this patch implements a simple way to let us know about these
objects by simply printing their oid, one per line, on stdout when the
new `--print-filtered` flag is passed.

As `--print-filtered` doesn't make sense without `--filter`, it is
disallowed to use the former without the latter.

Using `--stdout` is likely to make the `--print-filtered` output
difficult to find or parse, so we also disallow using these two options
together.

Signed-off-by: Christian Couder
---
 Documentation/git-pack-objects.txt     | 10 ++++++
 builtin/pack-objects.c                 | 47 ++++++++++++++++++++++++--
 t/t5317-pack-objects-filter-objects.sh | 27 +++++++++++++++
 3 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 583270a85f..6469080029 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -305,6 +305,16 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
 --no-filter::
 	Turns off any previous `--filter=` argument.
 
+--print-filtered::
+	Requires `--filter=`. Prints on stdout, one per line, the
+	object IDs of the objects that are filtered out from the
+	resulting packfile by the filter. This is incompatible with
+	`--stdout`. As hashes are already written to stdout
+	based on the resulting pack contents (see the `base-name`
+	argument), a line containing only six `-` characters is
+	written after those hashes, before the filtered object
+	IDs.
+
 --missing=<missing-action>::
 	A debug option to help with future "partial clone" development.
 	This option specifies how missing objects are handled.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index af007868c1..c8e2b6b859 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -266,6 +266,12 @@ static struct oidmap configured_exclusions;
 
 static struct oidset excluded_by_config;
 
+/*
+ * Objects omitted by filter
+ */
+static int print_filtered_out;
+static struct oidset *omitted_by_filter;
+
 /*
  * stats
  */
@@ -4065,11 +4071,18 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
 		die(_("revision walk setup failed"));
 	mark_edges_uninteresting(revs, show_edge, sparse);
 
+	if (print_filtered_out) {
+		omitted_by_filter = xmalloc(sizeof(*omitted_by_filter));
+		oidset_init(omitted_by_filter, 0);
+	}
+
 	if (!fn_show_object)
 		fn_show_object = show_object;
-	traverse_commit_list(revs,
-			     show_commit, fn_show_object,
-			     NULL);
+	traverse_commit_list_filtered(revs,
+				      show_commit,
+				      fn_show_object,
+				      NULL,
+				      omitted_by_filter);
 
 	if (unpack_unreachable_expiration) {
 		revs->ignore_missing_links = 1;
@@ -4165,6 +4178,23 @@ static int option_parse_cruft_expiration(const struct option *opt,
 	return 0;
 }
 
+static void print_omitted_by_filter(void)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+
+	fprintf_ln(stdout, "%s", "------");
+	fprintf_ln(stderr, "%s", _("Printing objects omitted by filter"));
+
+	oidset_iter_init(omitted_by_filter, &iter);
+
+	while ((oid = oidset_iter_next(&iter)))
+		fprintf_ln(stdout, "%s", oid_to_hex(oid));
+
+	oidset_clear(omitted_by_filter);
+	FREE_AND_NULL(omitted_by_filter);
+}
+
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
 	int use_internal_rev_list = 0;
@@ -4278,6 +4308,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		OPT_STRING_LIST(0, "uri-protocol", &uri_protocols,
 				N_("protocol"),
 				N_("exclude any configured uploadpack.blobpackfileuri with this protocol")),
+		OPT_BOOL(0, "print-filtered", &print_filtered_out,
+			 N_("print filtered out objects to stdout")),
 		OPT_END(),
 	};
 
@@ -4394,6 +4426,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));
 
+	if (print_filtered_out && !filter_options.choice)
+		die(_("cannot use --print-filtered without --filter"));
+
+	if (print_filtered_out && pack_to_stdout)
+		die(_("cannot use --print-filtered with --stdout"));
+
 	if (cruft) {
 		if (use_internal_rev_list)
 			die(_("cannot use internal rev list with --cruft"));
@@ -4509,6 +4547,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			   written, written_delta, reused, reused_delta,
 			   reuse_packfile_objects);
 
+	if (omitted_by_filter)
+		print_omitted_by_filter();
+
 cleanup:
 	list_objects_filter_release(&filter_options);
 	strvec_clear(&rp);
diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index b26d476c64..ec3a03d90a 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -438,6 +438,33 @@ test_expect_success 'verify sparse:oid=oid-ish' '
 	test_cmp expected observed
 '
 
+# Test pack-objects with --print-filtered option
+
+test_expect_success 'pack-objects fails w/ both --print-filtered and --stdout' '
+	test_must_fail git -C r1 pack-objects --revs --stdout \
+		--filter=blob:none --print-filtered >filter.out <<-EOF
+	HEAD
+	EOF
+'
+
+test_expect_success 'pack-objects w/ --print-filtered and a pack name' '
+	git -C r1 pack-objects --revs --filter=blob:none \
+		--print-filtered filtered-pack >filter.out <<-EOF &&
+	HEAD
+	EOF
+
+	# Check that the second line contains "------"
+	head -n 2 filter.out | tail -n 1 >actual &&
+	echo "------" >expected &&
+	test_cmp expected actual &&
+
+	# Remove the first two lines and check there are all the blobs
+	tail -n +3 filter.out | sort >actual &&
+	git -C r1 cat-file --batch-check --batch-all-objects | grep blob |
+	sed -e "s/ blob.*//" | sort >expected &&
+	test_cmp expected actual
+'
+
 # Delete some loose objects and use pack-objects, but WITHOUT any filtering.
 # This models previously omitted objects that we did not receive.
 

From patchwork Wed Jun 14 19:25:35 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280407
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder,
 Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder,
 Christian Couder
Subject: [PATCH 3/9] t/helper: add 'find-pack' test-tool
Date: Wed, 14 Jun 2023 21:25:35 +0200
Message-ID: <20230614192541.1599256-4-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.

To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.

Signed-off-by: Christian Couder
---
 Makefile                  |  1 +
 t/helper/test-find-pack.c | 35 +++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c      |  1 +
 t/helper/test-tool.h      |  1 +
 4 files changed, 38 insertions(+)
 create mode 100644 t/helper/test-find-pack.c

diff --git a/Makefile b/Makefile
index e440728c24..c1cd735b31 100644
--- a/Makefile
+++ b/Makefile
@@ -800,6 +800,7 @@ TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-env-helper.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-fast-rebase.o
+TEST_BUILTINS_OBJS += test-find-pack.o
 TEST_BUILTINS_OBJS += test-fsmonitor-client.o
 TEST_BUILTINS_OBJS += test-genrandom.o
 TEST_BUILTINS_OBJS += test-genzeros.o
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
new file mode 100644
index 0000000000..1928fe7329
--- /dev/null
+++ b/t/helper/test-find-pack.c
@@ -0,0 +1,35 @@
+#include "test-tool.h"
+#include "object-name.h"
+#include "object-store.h"
+#include "packfile.h"
+#include "setup.h"
+
+/*
+ * Display the path(s), one per line, of the packfile(s) containing
+ * the given object.
+ */
+
+static const char *find_pack_usage = "\n"
+" test-tool find-pack ";
+
+
+int cmd__find_pack(int argc, const char **argv)
+{
+	struct object_id oid;
+	struct packed_git *p;
+
+	setup_git_directory();
+
+	if (argc != 2)
+		usage(find_pack_usage);
+
+	if (repo_get_oid(the_repository, argv[1], &oid))
+		die("cannot parse %s as an object name", argv[1]);
+
+	for (p = get_all_packs(the_repository); p; p = p->next) {
+		if (find_pack_entry_one(oid.hash, p))
+			printf("%s\n", p->pack_name);
+	}
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index abe8a785eb..41da40c296 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -31,6 +31,7 @@ static struct test_cmd cmds[] = {
 	{ "env-helper", cmd__env_helper },
 	{ "example-decorate", cmd__example_decorate },
 	{ "fast-rebase", cmd__fast_rebase },
+	{ "find-pack", cmd__find_pack },
 	{ "fsmonitor-client", cmd__fsmonitor_client },
 	{ "genrandom", cmd__genrandom },
 	{ "genzeros", cmd__genzeros },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index ea2672436c..411dbf2db4 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -25,6 +25,7 @@ int cmd__dump_reftable(int argc, const char **argv);
 int cmd__env_helper(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
+int cmd__find_pack(int argc, const char **argv);
 int cmd__fsmonitor_client(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);

From patchwork Wed Jun 14 19:25:36 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280409
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder,
 Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder,
 Christian Couder
Subject: [PATCH 4/9] repack: refactor piping an oid to a command
Date: Wed, 14 Jun 2023 21:25:36 +0200
Message-ID: <20230614192541.1599256-5-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

Create a new write_oid_hex_cmd() function to send an oid to the standard
input of a running command. This new function will be used in a
following commit.

Signed-off-by: Christian Couder
---
 builtin/repack.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 0541c3ce15..e591c295cf 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -182,6 +182,17 @@ static void prepare_pack_objects(struct child_process *cmd,
 	cmd->out = -1;
 }
 
+static void write_oid_hex_cmd(const char *oid_hex,
+			      struct child_process *cmd,
+			      const char *err_msg)
+{
+	if (cmd->in == -1 && start_command(cmd))
+		die("%s", err_msg);
+
+	xwrite(cmd->in, oid_hex, the_hash_algo->hexsz);
+	xwrite(cmd->in, "\n", 1);
+}
+
 /*
  * Write oid to the given struct child_process's stdin, starting it first if
  * necessary.
@@ -192,13 +203,8 @@ static int write_oid(const struct object_id *oid,
 {
 	struct child_process *cmd = data;
 
-	if (cmd->in == -1) {
-		if (start_command(cmd))
-			die(_("could not start pack-objects to repack promisor objects"));
-	}
-
-	xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
-	xwrite(cmd->in, "\n", 1);
+	write_oid_hex_cmd(oid_to_hex(oid), cmd,
+			  _("could not start pack-objects to repack promisor objects"));
 	return 0;
 }
 

From patchwork Wed Jun 14 19:25:37 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280408
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder,
 Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder
Subject: [PATCH 5/9] repack: refactor finishing pack-objects command
Date: Wed, 14 Jun 2023 21:25:37 +0200
Message-ID: <20230614192541.1599256-6-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

Create a new finish_pack_objects_cmd() to refactor duplicated code that
handles reading the packfile names from the output of a
`git pack-objects` command and putting it into a string_list, as well as
calling finish_command().

While at it, beautify a code comment a bit in the new function.

Signed-off-by: Christian Couder
out, "r");
+	while (strbuf_getline_lf(&line, out) != EOF) {
+		struct string_list_item *item;
+
+		if (line.len != the_hash_algo->hexsz)
+			die(_("repack: Expecting full hex object ID lines only "
+			      "from pack-objects."));
+		/*
+		 * Avoid putting packs written outside of the repository in the
+		 * list of names.
+ */ + if (local) { + item = string_list_append(names, line.buf); + item->util = populate_pack_exts(line.buf); + } + } + fclose(out); + + strbuf_release(&line); + + return finish_command(cmd); +} + static int write_cruft_pack(const struct pack_objects_args *args, const char *destination, const char *pack_prefix, @@ -712,12 +748,9 @@ static int write_cruft_pack(const struct pack_objects_args *args, struct string_list *existing_kept_packs) { struct child_process cmd = CHILD_PROCESS_INIT; - struct strbuf line = STRBUF_INIT; struct string_list_item *item; - FILE *in, *out; + FILE *in; int ret; - const char *scratch; - int local = skip_prefix(destination, packdir, &scratch); prepare_pack_objects(&cmd, args, destination); @@ -758,27 +791,7 @@ static int write_cruft_pack(const struct pack_objects_args *args, fprintf(in, "%s.pack\n", item->string); fclose(in); - out = xfdopen(cmd.out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only " - "from pack-objects.")); - /* - * avoid putting packs written outside of the repository in the - * list of names - */ - if (local) { - item = string_list_append(names, line.buf); - item->util = populate_pack_exts(line.buf); - } - } - fclose(out); - - strbuf_release(&line); - - return finish_command(&cmd); + return finish_pack_objects_cmd(&cmd, names, destination); } int cmd_repack(int argc, const char **argv, const char *prefix) @@ -789,10 +802,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct string_list existing_nonkept_packs = STRING_LIST_INIT_DUP; struct string_list existing_kept_packs = STRING_LIST_INIT_DUP; struct pack_geometry *geometry = NULL; - struct strbuf line = STRBUF_INIT; struct tempfile *refs_snapshot = NULL; int i, ext, ret; - FILE *out; int show_progress; /* variables to be filled by option parsing */ @@ -1023,18 +1034,7 @@ int cmd_repack(int argc, const char 
**argv, const char *prefix) fclose(in); } - out = xfdopen(cmd.out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only from pack-objects.")); - item = string_list_append(&names, line.buf); - item->util = populate_pack_exts(item->string); - } - strbuf_release(&line); - fclose(out); - ret = finish_command(&cmd); + ret = finish_pack_objects_cmd(&cmd, &names, NULL); if (ret) goto cleanup; From patchwork Wed Jun 14 19:25:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13280410 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18270EB64DB for ; Wed, 14 Jun 2023 19:26:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236632AbjFNT0b (ORCPT ); Wed, 14 Jun 2023 15:26:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236685AbjFNT00 (ORCPT ); Wed, 14 Jun 2023 15:26:26 -0400 Received: from mail-lf1-x134.google.com (mail-lf1-x134.google.com [IPv6:2a00:1450:4864:20::134]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20739268B for ; Wed, 14 Jun 2023 12:26:11 -0700 (PDT) Received: by mail-lf1-x134.google.com with SMTP id 2adb3069b0e04-4f004cc54f4so9249885e87.3 for ; Wed, 14 Jun 2023 12:26:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1686770769; x=1689362769; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MHEbVw2TSqNSDCP7yXujcu/a0AJGha6JRAsdW72fgFc=; 
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH 6/9] repack: add `--filter=` option
Date: Wed, 14 Jun 2023 21:25:38 +0200
Message-ID: <20230614192541.1599256-7-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

After cloning with --filter=, for example to avoid getting unneeded
large files on a user machine, it's possible that some of these large
files still get fetched over time for some reason (like checking out
old branches).

In this case the repo size could grow too much for no good reason, and
a way to filter out some objects would be useful to remove the
unneeded large files.

Deleting objects right away could corrupt a repo though, so it might be
better to put those objects into a separate packfile instead of
deleting them. The separate pack could then be removed after checking
that all the objects in it are still available on a promisor remote
that the repo can access.

Splitting a packfile into 2 packs depending on a filter could also be
useful in other use cases. For example, some large blobs might take a
lot of precious space on fast storage while they are rarely accessed,
and it could make sense to move them to separate storage that is
cheaper, though slower.

This commit implements a new `--filter=` option in `git repack` that
moves filtered out objects into a separate pack.

This is done by reading filtered out objects from `git pack-objects`'s
output and piping them into a separate `git pack-objects` process that
will put them into a separate packfile.
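As a rough illustration of the parsing step described in the message above, here is a minimal, self-contained C sketch. It splits pack-objects-style output at a `------` separator line, which is the marker the diff below checks for. Everything else in it is an assumption made for illustration: the names `split_pack_objects_output`, `store_id`, `struct split_result`, and `MAX_IDS` are hypothetical, and the fixed 40-character `HEXSZ` stands in for `the_hash_algo->hexsz`. Git itself appends names to a `string_list` and writes filtered object IDs to a child process rather than storing them in arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEXSZ 40            /* assumed SHA-1 hex length, for this sketch only */
#define SEPARATOR "------"  /* separator line checked for in the patch */
#define MAX_IDS 16          /* arbitrary capacity for the sketch */

struct split_result {
	char names[MAX_IDS][HEXSZ + 1];    /* IDs seen before the separator */
	size_t nr_names;
	char filtered[MAX_IDS][HEXSZ + 1]; /* IDs seen after the separator */
	size_t nr_filtered;
};

/* Copy one HEXSZ-long ID into the next free slot; -1 if the array is full. */
static int store_id(char ids[][HEXSZ + 1], size_t *nr, const char *id)
{
	if (*nr == MAX_IDS)
		return -1;
	memcpy(ids[*nr], id, HEXSZ);
	ids[*nr][HEXSZ] = '\0';
	(*nr)++;
	return 0;
}

/*
 * Walk newline-separated `text`. Lines before the first separator are
 * recorded as pack names, lines after it as filtered out object IDs.
 * Returns 0 on success, -1 on a line that is not exactly HEXSZ long
 * (including a second separator) or when an array overflows.
 */
static int split_pack_objects_output(const char *text, struct split_result *res)
{
	int filtered_start = 0;

	assert(text && res);
	res->nr_names = 0;
	res->nr_filtered = 0;

	while (*text) {
		size_t len = strcspn(text, "\n");

		if (!filtered_start && len == strlen(SEPARATOR) &&
		    !strncmp(text, SEPARATOR, len)) {
			filtered_start = 1;
		} else if (len != HEXSZ) {
			return -1;
		} else if (filtered_start) {
			if (store_id(res->filtered, &res->nr_filtered, text))
				return -1;
		} else {
			if (store_id(res->names, &res->nr_names, text))
				return -1;
		}

		text += len;
		if (*text == '\n')
			text++;
	}
	return 0;
}
```

Unlike this sketch, the patch only treats `------` as a separator when a command for packing filtered objects was passed in, and it dies on a malformed line instead of returning an error code.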
Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-repack.txt |  5 +++
 builtin/repack.c             | 75 ++++++++++++++++++++++++++++++++++--
 t/t7700-repack.sh            | 16 ++++++++
 3 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 4017157949..aa29c7e648 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -143,6 +143,11 @@ depth is 4095.
 	a larger and slower repository; see the discussion in
 	`pack.packSizeLimit`.
 
+--filter=::
+	Remove objects matching the filter specification from the
+	resulting packfile and put them into a separate packfile. See
+	linkgit:git-rev-list[1] for valid `` forms.
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index f1adacf1d0..b13d7196de 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -53,6 +53,7 @@ struct pack_objects_args {
 	const char *depth;
 	const char *threads;
 	const char *max_pack_size;
+	const char *filter;
 	int no_reuse_delta;
 	int no_reuse_object;
 	int quiet;
@@ -167,6 +168,10 @@ static void prepare_pack_objects(struct child_process *cmd,
 		strvec_pushf(&cmd->args, "--threads=%s", args->threads);
 	if (args->max_pack_size)
 		strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size);
+	if (args->filter) {
+		strvec_pushf(&cmd->args, "--filter=%s", args->filter);
+		strvec_pushf(&cmd->args, "--print-filtered");
+	}
 	if (args->no_reuse_delta)
 		strvec_pushf(&cmd->args, "--no-reuse-delta");
 	if (args->no_reuse_object)
@@ -703,13 +708,21 @@ static void remove_redundant_bitmaps(struct string_list *include,
 	strbuf_release(&path);
 }
 
+static void pack_filtered(const char *oid_hex, struct child_process *cmd)
+{
+	write_oid_hex_cmd(oid_hex, cmd,
+			  _("could not start pack-objects to pack filtered objects"));
+}
+
 static int finish_pack_objects_cmd(struct child_process *cmd,
 				   struct string_list *names,
-				   const char *destination)
+				   const char *destination,
+				   struct child_process *pack_filtered_cmd)
 {
 	int local = 1;
 	FILE *out;
 	struct strbuf line = STRBUF_INIT;
+	int filtered_start = 0;
 
 	if (destination) {
 		const char *scratch;
@@ -720,9 +733,20 @@ static int finish_pack_objects_cmd(struct child_process *cmd,
 	while (strbuf_getline_lf(&line, out) != EOF) {
 		struct string_list_item *item;
 
+		if (!filtered_start && pack_filtered_cmd && !strcmp(line.buf, "------")) {
+			filtered_start = 1;
+			continue;
+		}
+
 		if (line.len != the_hash_algo->hexsz)
 			die(_("repack: Expecting full hex object ID lines only "
 			      "from pack-objects."));
+
+		if (pack_filtered_cmd && filtered_start) {
+			pack_filtered(line.buf, pack_filtered_cmd);
+			continue;
+		}
+
 		/*
 		 * Avoid putting packs written outside of the repository in the
 		 * list of names.
@@ -791,9 +815,44 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 		fprintf(in, "%s.pack\n", item->string);
 	fclose(in);
 
-	return finish_pack_objects_cmd(&cmd, names, destination);
+	return finish_pack_objects_cmd(&cmd, names, destination, NULL);
 }
 
+/*
+ * Prepare the command that will pack objects that have been filtered
+ * out from the original pack, so that they will end up in a separate
+ * pack.
+ */
+static void prepare_pack_filtered_cmd(struct child_process *cmd,
+				      const struct pack_objects_args *args,
+				      const char *destination)
+{
+	/* We need to copy args to modify it */
+	struct pack_objects_args new_args = *args;
+
+	/* No need to filter again */
+	new_args.filter = NULL;
+
+	prepare_pack_objects(cmd, &new_args, destination);
+	cmd->in = -1;
+}
+
+static void finish_pack_filtered_cmd(struct child_process *cmd,
+				     struct string_list *names)
+{
+	if (cmd->in == -1) {
+		/* No packed objects; cmd was never started */
+		child_process_clear(cmd);
+		return;
+	}
+
+	close(cmd->in);
+
+	if (finish_pack_objects_cmd(cmd, names, NULL, NULL))
+		die(_("could not finish pack-objects to pack filtered objects"));
+}
+
+
 int cmd_repack(int argc, const char **argv, const char *prefix)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
@@ -817,6 +876,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	int write_midx = 0;
 	const char *cruft_expiration = NULL;
 	const char *expire_to = NULL;
+	struct child_process pack_filtered_cmd = CHILD_PROCESS_INIT;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -858,6 +918,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("limits the maximum number of threads")),
 		OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"),
 				N_("maximum size of each packfile")),
+		OPT_STRING(0, "filter", &po_args.filter, N_("args"),
+				N_("object filtering")),
 		OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects,
 				N_("repack objects in packs marked with .keep")),
 		OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"),
@@ -1011,6 +1073,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		strvec_push(&cmd.args, "--incremental");
 	}
 
+	if (po_args.filter)
+		prepare_pack_filtered_cmd(&pack_filtered_cmd, &po_args, packtmp);
+
 	if (geometry)
 		cmd.in = -1;
 	else
@@ -1034,7 +1099,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		fclose(in);
 	}
 
-	ret = finish_pack_objects_cmd(&cmd, &names, NULL);
+	ret = finish_pack_objects_cmd(&cmd, &names, NULL,
+				      po_args.filter ? &pack_filtered_cmd : NULL);
 	if (ret)
 		goto cleanup;
 
@@ -1102,6 +1168,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	if (po_args.filter)
+		finish_pack_filtered_cmd(&pack_filtered_cmd, &names);
+
 	string_list_sort(&names);
 
 	close_object_store(the_repository->objects);
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index faa739eeb9..9e7654090f 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -270,6 +270,22 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'repacking with a filter works' '
+	git -C bare.git repack -a -d &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+	git -C bare.git -c repack.writebitmaps=false repack -a -d --filter=blob:none &&
+	test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack &&
+	commit_pack=$(test-tool -C bare.git find-pack HEAD) &&
+	test -n "$commit_pack" &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -n "$blob_pack" &&
+	test "$commit_pack" != "$blob_pack" &&
+	tree_pack=$(test-tool -C bare.git find-pack HEAD^{tree}) &&
+	test "$tree_pack" = "$commit_pack" &&
+	blob_pack2=$(test-tool -C bare.git find-pack HEAD:file2) &&
+	test "$blob_pack2" = "$blob_pack"
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 

From patchwork Wed Jun 14 19:25:39 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280412
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH 7/9] gc: add `gc.repackFilter` config option
Date: Wed, 14 Jun 2023 21:25:39 +0200
Message-ID: <20230614192541.1599256-8-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

A previous commit implemented `git repack --filter=` to allow users to
filter out some objects from the main pack and move them into a new,
separate pack.

Users might want to perform such a cleanup regularly, at the same time
as they perform other repacks and cleanups, that is, as part of
`git gc`.

Let's allow them to configure a filter for that purpose using a new
`gc.repackFilter` config option.

Now when `git gc` performs a repack and a non-empty filter is
configured through this option, the repack process is passed a
corresponding `--filter=` argument.
Signed-off-by: Christian Couder
---
 Documentation/config/gc.txt |  5 +++++
 builtin/gc.c                |  6 ++++++
 t/t6500-gc.sh               | 12 ++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 7f95c866e1..055c4e0db6 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -130,6 +130,11 @@ or rebase occurring. Since these changes are not part of the current
 project most users will want to expire them sooner, which is why the
 default is more aggressive than `gc.reflogExpire`.
 
+gc.repackFilter::
+	When repacking, use the specified filter to move certain
+	objects into a separate packfile. See the
+	`--filter=` option of linkgit:git-repack[1].
+
 gc.rerereResolved::
 	Records of conflicted merge you resolved earlier are
 	kept for this many days when 'git rerere gc' is run.
diff --git a/builtin/gc.c b/builtin/gc.c
index f3942188a6..1c57913214 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -61,6 +61,7 @@ static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
+static char *repack_filter;
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
@@ -170,6 +171,8 @@ static void gc_config(void)
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
 
+	git_config_get_string("gc.repackfilter", &repack_filter);
+
 	git_config(git_default_config, NULL);
 }
 
@@ -355,6 +358,9 @@ static void add_repack_all_option(struct string_list *keep_pack)
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
+
+	if (repack_filter && *repack_filter)
+		strvec_pushf(&repack, "--filter=%s", repack_filter);
 }
 
 static void add_repack_incremental_option(void)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 69509d0c11..5b89faf505 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -202,6 +202,18 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e
 	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
 '
 
+test_expect_success 'gc.repackFilter launches repack with a filter' '
+	test_when_finished "rm -rf bare.git" &&
+	git clone --no-local --bare . bare.git &&
+
+	git -C bare.git -c gc.cruftPacks=false gc &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+
+	GIT_TRACE=$(pwd)/trace.out git -C bare.git -c gc.repackFilter=blob:none -c repack.writeBitmaps=false -c gc.cruftPacks=false gc &&
+	test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack &&
+	grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out
+'
+
 prepare_cruft_history () {
 	test_commit base &&
 

From patchwork Wed Jun 14 19:25:40 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280411
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH 8/9] repack: implement `--filter-to` for storing filtered out objects
Date: Wed, 14 Jun 2023 21:25:40 +0200
Message-ID: <20230614192541.1599256-9-christian.couder@gmail.com>
In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com>
References: <20230614192541.1599256-1-christian.couder@gmail.com>

A previous commit implemented `git repack --filter=` to allow users to
filter out some objects from the main pack and move them into a new,
separate pack.

It would be nice if this separate pack could be created in a different
directory than the regular pack. This would make it possible to move
large blobs into a pack on a different kind of storage, for example
cheaper storage. Even in a different directory, this pack can remain
accessible if, for example, the Git alternates mechanism is used to
point to it.

If users want to remove a pack that contains filtered out objects after
checking that they are all already on a promisor remote, creating the
pack in a different directory makes it easier to do so.

Signed-off-by: Christian Couder
---
 Documentation/git-repack.txt |  6 ++++++
 builtin/repack.c             | 17 ++++++++++++-----
 t/t7700-repack.sh            | 27 +++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index aa29c7e648..070dd22610 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -148,6 +148,12 @@ depth is 4095.
 	resulting packfile and put them into a separate packfile. See
 	linkgit:git-rev-list[1] for valid `` forms.
 
+--filter-to=::
+	Write the pack containing filtered out objects to the
+	directory ``. This can be used for putting the pack on a
+	separate object directory that is accessed through the Git
+	alternates mechanism. Only useful with `--filter`.
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index b13d7196de..8c71e8fd51 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -838,7 +838,8 @@ static void prepare_pack_filtered_cmd(struct child_process *cmd,
 }
 
 static void finish_pack_filtered_cmd(struct child_process *cmd,
-				     struct string_list *names)
+				     struct string_list *names,
+				     const char *destination)
 {
 	if (cmd->in == -1) {
 		/* No packed objects; cmd was never started */
@@ -848,7 +849,7 @@ static void finish_pack_filtered_cmd(struct child_process *cmd,
 
 	close(cmd->in);
 
-	if (finish_pack_objects_cmd(cmd, names, NULL, NULL))
+	if (finish_pack_objects_cmd(cmd, names, destination, NULL))
 		die(_("could not finish pack-objects to pack filtered objects"));
 }
 
@@ -877,6 +878,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	const char *cruft_expiration = NULL;
 	const char *expire_to = NULL;
 	struct child_process pack_filtered_cmd = CHILD_PROCESS_INIT;
+	const char *filter_to = NULL;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -930,6 +932,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("write a multi-pack index of the resulting packs")),
 		OPT_STRING(0, "expire-to", &expire_to, N_("dir"),
 				N_("pack prefix to store a pack containing pruned objects")),
+		OPT_STRING(0, "filter-to", &filter_to, N_("dir"),
+				N_("pack prefix to store a pack containing filtered out objects")),
 		OPT_END()
 	};
 
@@ -1073,8 +1077,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		strvec_push(&cmd.args, "--incremental");
 	}
 
-	if (po_args.filter)
-		prepare_pack_filtered_cmd(&pack_filtered_cmd, &po_args, packtmp);
+	if (po_args.filter) {
+		if (!filter_to)
+			filter_to = packtmp;
+		prepare_pack_filtered_cmd(&pack_filtered_cmd, &po_args, filter_to);
+	}
 
 	if (geometry)
 		cmd.in = -1;
@@ -1169,7 +1176,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	}
 
 	if (po_args.filter)
-		finish_pack_filtered_cmd(&pack_filtered_cmd, &names);
+		finish_pack_filtered_cmd(&pack_filtered_cmd, &names, filter_to);
 
 	string_list_sort(&names);
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 9e7654090f..898f8a01b4 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -286,6 +286,33 @@ test_expect_success 'repacking with a filter works' '
 	test "$blob_pack2" = "$blob_pack"
 '
 
+test_expect_success '--filter-to stores filtered out objects' '
+	git -C bare.git repack -a -d &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+
+	git init --bare filtered.git &&
+	git -C bare.git -c repack.writebitmaps=false repack -a -d \
+		--filter=blob:none \
+		--filter-to=../filtered.git/objects/pack/pack &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/pack-*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/pack-*.pack &&
+
+	commit_pack=$(test-tool -C bare.git find-pack HEAD) &&
+	test -n "$commit_pack" &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -z "$blob_pack" &&
+	blob_hash=$(git -C bare.git rev-parse HEAD:file1) &&
+	test -n "$blob_hash" &&
+	blob_pack=$(test-tool -C filtered.git find-pack $blob_hash) &&
+	test -n "$blob_pack" &&
+
+	echo $(pwd)/filtered.git/objects >bare.git/objects/info/alternates &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -n "$blob_pack" &&
+	blob_content=$(git -C bare.git show $blob_hash) &&
+	test "$blob_content" = "content1"
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 

From patchwork Wed Jun 14 19:25:41 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13280413
Kj8FYAcSM/eq5Slo9BaclzI1ufyizbfb2I/f7vzILPG1N+XYRMoMzLZsPmsRU5UnuIxs Vtnp5McMdeo7KfzT0wdHOPpkolhGVmYmKeTrkdyKLJxa/S2se/+Xfc+JLHA45RpC/NKe XgRxtj5FKyWOQbyKjYvku6zOwI9VCpPcOl4qlkZfXtaSDXgbpCqR/elc/6O/AOk7c851 aX7t1YMl6MidmU7kEjPOGvPtYtB+WzI0WGdNyRVj9/4b2T7GddUs8XhaZrYEcNdbtVfx ZMlA== X-Gm-Message-State: AC+VfDxL3LwSKFxuAsw7pwIeOucp2RFvudm+vQ84PSQ4znzcjPyC+Ivv FH3jobnOBFdMJGd5BJ/AcRGpv+xhsb8= X-Google-Smtp-Source: ACHHUZ5ggnoo2+FHeZJHR8o/D+jcJDHHR7+GQKxZ0vnszDJQ4BilEoqt3IJZvl7BC1GuYventFW8fw== X-Received: by 2002:a05:600c:204e:b0:3f8:1dda:209e with SMTP id p14-20020a05600c204e00b003f81dda209emr2274722wmg.10.1686770772844; Wed, 14 Jun 2023 12:26:12 -0700 (PDT) Received: from localhost.localdomain ([2001:861:3f04:7ca0:e164:efe0:8fdb:6ba]) by smtp.gmail.com with ESMTPSA id u26-20020a05600c00da00b003eddc6aa5fasm18370365wmm.39.2023.06.14.12.26.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 14 Jun 2023 12:26:11 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH 9/9] gc: add `gc.repackFilterTo` config option Date: Wed, 14 Jun 2023 21:25:41 +0200 Message-ID: <20230614192541.1599256-10-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.37.gae45d9845e In-Reply-To: <20230614192541.1599256-1-christian.couder@gmail.com> References: <20230614192541.1599256-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A previous commit implemented the `gc.repackFilter` config option to specify a filter that should be used by `git gc` when performing repacks. Another previous commit has implemented `git repack --filter-to=` to specify the location of the packfile containing filtered out objects when using a filter. 
Let's implement the `gc.repackFilterTo` config option to specify that location in the config when `gc.repackFilter` is used. Now, when `git gc` performs a repack and the location configured through this option is not empty, the repack process is passed a corresponding `--filter-to=` argument. Signed-off-by: Christian Couder --- Documentation/config/gc.txt | 6 ++++++ builtin/gc.c | 4 ++++ t/t6500-gc.sh | 13 ++++++++++++- 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt index 055c4e0db6..699ad887b3 100644 --- a/Documentation/config/gc.txt +++ b/Documentation/config/gc.txt @@ -135,6 +135,12 @@ gc.repackFilter:: objects into a separate packfile. See the `--filter=` option of linkgit:git-repack[1]. +gc.repackFilterTo:: + When repacking and using a filter, see `gc.repackFilter`, the + specified location will be used to create the packfile + containing the filtered out objects. See the + `--filter-to=` option of linkgit:git-repack[1]. + gc.rerereResolved:: Records of conflicted merge you resolved earlier are kept for this many days when 'git rerere gc' is run. 
diff --git a/builtin/gc.c b/builtin/gc.c index 1c57913214..87f5fc6946 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -62,6 +62,7 @@ static const char *gc_log_expire = "1.day.ago"; static const char *prune_expire = "2.weeks.ago"; static const char *prune_worktrees_expire = "3.months.ago"; static char *repack_filter; +static char *repack_filter_to; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; @@ -172,6 +173,7 @@ static void gc_config(void) git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size); git_config_get_string("gc.repackfilter", &repack_filter); + git_config_get_string("gc.repackfilterto", &repack_filter_to); git_config(git_default_config, NULL); } @@ -361,6 +363,8 @@ static void add_repack_all_option(struct string_list *keep_pack) if (repack_filter && *repack_filter) strvec_pushf(&repack, "--filter=%s", repack_filter); + if (repack_filter_to && *repack_filter_to) + strvec_pushf(&repack, "--filter-to=%s", repack_filter_to); } static void add_repack_incremental_option(void) diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 5b89faf505..37056a824b 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -203,7 +203,6 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e ' test_expect_success 'gc.repackFilter launches repack with a filter' ' - test_when_finished "rm -rf bare.git" && git clone --no-local --bare . 
bare.git && git -C bare.git -c gc.cruftPacks=false gc && @@ -214,6 +213,18 @@ test_expect_success 'gc.repackFilter launches repack with a filter' ' grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out ' +test_expect_success 'gc.repackFilterTo stores filtered out objects' ' + test_when_finished "rm -rf bare.git filtered.git" && + + git init --bare filtered.git && + git -C bare.git -c gc.repackFilter=blob:none \ + -c gc.repackFilterTo=../filtered.git/objects/pack/pack \ + -c repack.writeBitmaps=false -c gc.cruftPacks=false gc && + + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + test_stdout_line_count = 1 ls filtered.git/objects/pack/*.pack +' + prepare_cruft_history () { test_commit base &&