From patchwork Mon Sep 25 15:25:09 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13397987
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v7 1/9] pack-objects: allow `--filter` without `--stdout`
Date: Mon, 25 Sep 2023 17:25:09 +0200
Message-ID: <20230925152517.803579-2-christian.couder@gmail.com>
In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com>
References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com>

9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
taught `git pack-objects` to use `--filter`, but required the use of
`--stdout` since a partial clone mechanism was not yet in place to
handle missing objects. Since then, changes like 9e27beaa23
(promisor-remote: implement promisor_remote_get_direct(), 2019-06-25)
and others added support to dynamically fetch objects that were missing.

Even without a promisor remote, filtering out objects can also be useful
if we can put the filtered out objects in a separate pack, and in this
case it also makes sense for pack-objects to write the packfile directly
to an actual file rather than to stdout.

Remove the `--stdout` requirement when using `--filter`, so that in a
follow-up commit, repack can pass `--filter` to pack-objects to omit
certain objects from the resulting packfile.

Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-pack-objects.txt     | 4 ++--
 builtin/pack-objects.c                 | 8 ++------
 t/t5317-pack-objects-filter-objects.sh | 8 ++++++++
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index dea7eacb0f..e32404c6aa 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -296,8 +296,8 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
 	nevertheless.

 --filter=::
-	Requires `--stdout`. Omits certain objects (usually blobs) from
-	the resulting packfile. See linkgit:git-rev-list[1] for valid
+	Omits certain objects (usually blobs) from the resulting
+	packfile. See linkgit:git-rev-list[1] for valid
 	`` forms.

 --no-filter::
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6eb9756836..89a8b5a976 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4402,12 +4402,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
 		unpack_unreachable_expiration = 0;

-	if (filter_options.choice) {
-		if (!pack_to_stdout)
-			die(_("cannot use --filter without --stdout"));
-		if (stdin_packs)
-			die(_("cannot use --filter with --stdin-packs"));
-	}
+	if (stdin_packs && filter_options.choice)
+		die(_("cannot use --filter with --stdin-packs"));

 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));
diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index b26d476c64..2ff3eef9a3 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -53,6 +53,14 @@ test_expect_success 'verify blob:none packfile has no blobs' '
 	! grep blob verify_result
 '

+test_expect_success 'verify blob:none packfile without --stdout' '
+	git -C r1 pack-objects --revs --filter=blob:none mypackname >packhash <<-EOF &&
+	HEAD
+	EOF
+	git -C r1 verify-pack -v "mypackname-$(cat packhash).pack" >verify_result &&
+	! grep blob verify_result
+'
+
 test_expect_success 'verify normal and blob:none packfiles have same commits/trees' '
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |

From patchwork Mon Sep 25 15:25:10 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13397988
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v7 2/9] t/helper: add 'find-pack' test-tool
Date: Mon, 25 Sep 2023 17:25:10 +0200
Message-ID: <20230925152517.803579-3-christian.couder@gmail.com>
In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com>
References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com>

In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.

To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.

Let's also make it possible to check if a given object is in the
expected number of packfiles with a `--check-count ` option.
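
To illustrate the `--check-count` semantics described above, here is a
small standalone C sketch. It does not use Git's internals:
`struct toy_pack`, `count_packs_containing()` and `check_count_ok()` are
hypothetical names made up for this note, and object IDs are plain
strings. The one behaviour it assumes from the helper is that a negative
expected count means no check was requested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a packfile: a name and the object IDs it holds. */
struct toy_pack {
	const char *name;
	const char *const *oids;
	size_t nr;
};

/* Return 1 if the toy pack lists the given object ID, 0 otherwise. */
static int toy_pack_has(const struct toy_pack *pack, const char *oid)
{
	size_t i;

	for (i = 0; i < pack->nr; i++)
		if (!strcmp(pack->oids[i], oid))
			return 1;
	return 0;
}

/* Count how many of the given packs contain the object. */
static int count_packs_containing(const struct toy_pack *packs, size_t nr,
				  const char *oid)
{
	size_t i;
	int count = 0;

	assert(packs && oid);
	for (i = 0; i < nr; i++)
		count += toy_pack_has(&packs[i], oid);
	return count;
}

/* A negative expected count disables the check (assumed default: -1). */
static int check_count_ok(int expected, int actual)
{
	return expected < 0 || expected == actual;
}
```

With one pack holding "aaa" and "bbb" and another holding only "bbb",
"bbb" is counted twice and "aaa" once, so an expected count of 1 is
accepted for "aaa" and rejected for "bbb".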

Signed-off-by: Christian Couder
---
 Makefile                  |  1 +
 t/helper/test-find-pack.c | 50 ++++++++++++++++++++++++
 t/helper/test-tool.c      |  1 +
 t/helper/test-tool.h      |  1 +
 t/t0080-find-pack.sh      | 82 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+)
 create mode 100644 t/helper/test-find-pack.c
 create mode 100755 t/t0080-find-pack.sh

diff --git a/Makefile b/Makefile
index 003e63b792..f267034d23 100644
--- a/Makefile
+++ b/Makefile
@@ -800,6 +800,7 @@ TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-env-helper.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-fast-rebase.o
+TEST_BUILTINS_OBJS += test-find-pack.o
 TEST_BUILTINS_OBJS += test-fsmonitor-client.o
 TEST_BUILTINS_OBJS += test-genrandom.o
 TEST_BUILTINS_OBJS += test-genzeros.o
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
new file mode 100644
index 0000000000..e8bd793e58
--- /dev/null
+++ b/t/helper/test-find-pack.c
@@ -0,0 +1,50 @@
+#include "test-tool.h"
+#include "object-name.h"
+#include "object-store.h"
+#include "packfile.h"
+#include "parse-options.h"
+#include "setup.h"
+
+/*
+ * Display the path(s), one per line, of the packfile(s) containing
+ * the given object.
+ *
+ * If '--check-count ' is passed, then error out if the number of
+ * packfiles containing the object is not .
+ */
+
+static const char *find_pack_usage[] = {
+	"test-tool find-pack [--check-count ] ",
+	NULL
+};
+
+int cmd__find_pack(int argc, const char **argv)
+{
+	struct object_id oid;
+	struct packed_git *p;
+	int count = -1, actual_count = 0;
+	const char *prefix = setup_git_directory();
+
+	struct option options[] = {
+		OPT_INTEGER('c', "check-count", &count, "expected number of packs"),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, options, find_pack_usage, 0);
+	if (argc != 1)
+		usage(find_pack_usage[0]);
+
+	if (repo_get_oid(the_repository, argv[0], &oid))
+		die("cannot parse %s as an object name", argv[0]);
+
+	for (p = get_all_packs(the_repository); p; p = p->next)
+		if (find_pack_entry_one(oid.hash, p)) {
+			printf("%s\n", p->pack_name);
+			actual_count++;
+		}
+
+	if (count > -1 && count != actual_count)
+		die("bad packfile count %d instead of %d", actual_count, count);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 621ac3dd10..9010ac6de7 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -31,6 +31,7 @@ static struct test_cmd cmds[] = {
 	{ "env-helper", cmd__env_helper },
 	{ "example-decorate", cmd__example_decorate },
 	{ "fast-rebase", cmd__fast_rebase },
+	{ "find-pack", cmd__find_pack },
 	{ "fsmonitor-client", cmd__fsmonitor_client },
 	{ "genrandom", cmd__genrandom },
 	{ "genzeros", cmd__genzeros },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index a641c3a81d..f134f96b97 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -25,6 +25,7 @@ int cmd__dump_reftable(int argc, const char **argv);
 int cmd__env_helper(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
+int cmd__find_pack(int argc, const char **argv);
 int cmd__fsmonitor_client(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
diff --git a/t/t0080-find-pack.sh b/t/t0080-find-pack.sh
new file mode 100755
index 0000000000..67b11216a3
--- /dev/null
+++ b/t/t0080-find-pack.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+test_description='test `test-tool find-pack`'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit one &&
+	test_commit two &&
+	test_commit three &&
+	test_commit four &&
+	test_commit five
+'
+
+test_expect_success 'repack everything into a single packfile' '
+	git repack -a -d --no-write-bitmap-index &&
+
+	head_commit_pack=$(test-tool find-pack HEAD) &&
+	head_tree_pack=$(test-tool find-pack HEAD^{tree}) &&
+	one_pack=$(test-tool find-pack HEAD:one.t) &&
+	three_pack=$(test-tool find-pack HEAD:three.t) &&
+	old_commit_pack=$(test-tool find-pack HEAD~4) &&
+
+	test-tool find-pack --check-count 1 HEAD &&
+	test-tool find-pack --check-count=1 HEAD^{tree} &&
+	! test-tool find-pack --check-count=0 HEAD:one.t &&
+	! test-tool find-pack -c 2 HEAD:one.t &&
+	test-tool find-pack -c 1 HEAD:three.t &&
+
+	# Packfile exists at the right path
+	case "$head_commit_pack" in
+		".git/objects/pack/pack-"*".pack") true ;;
+		*) false ;;
+	esac &&
+	test -f "$head_commit_pack" &&
+
+	# Everything is in the same pack
+	test "$head_commit_pack" = "$head_tree_pack" &&
+	test "$head_commit_pack" = "$one_pack" &&
+	test "$head_commit_pack" = "$three_pack" &&
+	test "$head_commit_pack" = "$old_commit_pack"
+'
+
+test_expect_success 'add more packfiles' '
+	git rev-parse HEAD^{tree} HEAD:two.t HEAD:four.t >objects &&
+	git pack-objects .git/objects/pack/mypackname1 >packhash1 objects &&
+	git pack-objects .git/objects/pack/mypackname2 >packhash2 head_tree_packs &&
+	grep "$head_commit_pack" head_tree_packs &&
+	grep mypackname1 head_tree_packs &&
+	! grep mypackname2 head_tree_packs &&
+	test-tool find-pack --check-count 2 HEAD^{tree} &&
+	! test-tool find-pack --check-count 1 HEAD^{tree} &&
+
+	# HEAD:five.t is also in 2 packfiles
+	test-tool find-pack HEAD:five.t >five_packs &&
+	grep "$head_commit_pack" five_packs &&
+	! grep mypackname1 five_packs &&
+	grep mypackname2 five_packs &&
+	test-tool find-pack -c 2 HEAD:five.t &&
+	! test-tool find-pack --check-count=0 HEAD:five.t
+'
+
+test_expect_success 'add more commits (as loose objects)' '
+	test_commit six &&
+	test_commit seven &&
+
+	test -z "$(test-tool find-pack HEAD)" &&
+	test -z "$(test-tool find-pack HEAD:six.t)" &&
+	test-tool find-pack --check-count 0 HEAD &&
+	test-tool find-pack -c 0 HEAD:six.t &&
+	! test-tool find-pack -c 1 HEAD:seven.t
+'
+
+test_done

From patchwork Mon Sep 25 15:25:11 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13397989
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder
Subject: [PATCH v7 3/9] repack: refactor finishing pack-objects command
Date: Mon, 25 Sep 2023 17:25:11 +0200
Message-ID: <20230925152517.803579-4-christian.couder@gmail.com>
In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com>
References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com>

Create a new finish_pack_objects_cmd() to refactor duplicated code
that handles reading the packfile names from the output of a
`git pack-objects` command and putting it into a string_list, as well as
calling finish_command().

While at it, beautify a code comment a bit in the new function.

Signed-off-by: Christian Couder
out, "r");
+	while (strbuf_getline_lf(&line, out) != EOF) {
+		struct string_list_item *item;
+
+		if (line.len != the_hash_algo->hexsz)
+			die(_("repack: Expecting full hex object ID lines only "
+			      "from pack-objects."));
+		/*
+		 * Avoid putting packs written outside of the repository in the
+		 * list of names.
+		 */
+		if (local) {
+			item = string_list_append(names, line.buf);
+			item->util = populate_pack_exts(line.buf);
+		}
+	}
+	fclose(out);
+
+	strbuf_release(&line);
+
+	return finish_command(cmd);
+}
+
 static int write_cruft_pack(const struct pack_objects_args *args,
 			    const char *destination,
 			    const char *pack_prefix,
@@ -814,9 +844,8 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 			    struct existing_packs *existing)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
-	struct strbuf line = STRBUF_INIT;
 	struct string_list_item *item;
-	FILE *in, *out;
+	FILE *in;
 	int ret;
 	const char *scratch;
 	int local = skip_prefix(destination, packdir, &scratch);
@@ -861,27 +890,7 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 		fprintf(in, "%s.pack\n", item->string);
 	fclose(in);

-	out = xfdopen(cmd.out, "r");
-	while (strbuf_getline_lf(&line, out) != EOF) {
-		struct string_list_item *item;
-
-		if (line.len != the_hash_algo->hexsz)
-			die(_("repack: Expecting full hex object ID lines only "
-			      "from pack-objects."));
-		/*
-		 * avoid putting packs written outside of the repository in the
-		 * list of names
-		 */
-		if (local) {
-			item = string_list_append(names, line.buf);
-			item->util = populate_pack_exts(line.buf);
-		}
-	}
-	fclose(out);
-
-	strbuf_release(&line);
-
-	return finish_command(&cmd);
+	return finish_pack_objects_cmd(&cmd, names, local);
 }

 int cmd_repack(int argc, const char **argv, const char *prefix)
@@ -891,10 +900,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	struct string_list names = STRING_LIST_INIT_DUP;
 	struct existing_packs existing = EXISTING_PACKS_INIT;
 	struct pack_geometry geometry = { 0 };
-	struct strbuf line = STRBUF_INIT;
 	struct tempfile *refs_snapshot = NULL;
 	int i, ext, ret;
-	FILE *out;
 	int show_progress;

 	/* variables to be filled by option parsing */
@@ -1124,18 +1131,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		fclose(in);
 	}

-	out = xfdopen(cmd.out, "r");
-	while (strbuf_getline_lf(&line, out) != EOF) {
-		struct string_list_item *item;
-
-		if (line.len != the_hash_algo->hexsz)
-			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
-		item = string_list_append(&names, line.buf);
-		item->util = populate_pack_exts(item->string);
-	}
-	strbuf_release(&line);
-	fclose(out);
-	ret = finish_command(&cmd);
+	ret = finish_pack_objects_cmd(&cmd, &names, 1);
 	if (ret)
 		goto cleanup;

From patchwork Mon Sep 25 15:25:12 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13397990
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder
Subject: [PATCH v7 4/9] repack: refactor finding pack prefix
Date: Mon, 25 Sep 2023 17:25:12 +0200
Message-ID: <20230925152517.803579-5-christian.couder@gmail.com>
In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com>
References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com>

Create a new find_pack_prefix() to refactor code that handles finding
the pack prefix from the packtmp and packdir global variables, as we are
going to need this feature again in a following commit.
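
The diff for this patch is not part of this excerpt, so the following is
only a standalone sketch of what such a helper could look like, not the
code from the patch. It assumes that packtmp is expected to begin with
packdir and that the prefix is whatever follows, minus a leading slash.
`skip_prefix_sketch()` is a hypothetical stand-in for Git's
skip_prefix(), and the sketch returns NULL where the real code would
presumably die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for Git's skip_prefix(): if str starts with
 * prefix, store a pointer to the remainder in *out and return 1;
 * otherwise leave *out untouched and return 0.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Return the part of packtmp that follows packdir, without a leading
 * '/', or NULL if packtmp does not begin with packdir.
 */
static const char *find_pack_prefix_sketch(const char *packdir,
					   const char *packtmp)
{
	const char *pack_prefix;

	assert(packdir && packtmp);
	if (!skip_prefix_sketch(packtmp, packdir, &pack_prefix))
		return NULL;
	if (*pack_prefix == '/')
		pack_prefix++;
	return pack_prefix;
}
```

For example, with packdir ".git/objects/pack" and packtmp
".git/objects/pack/.tmp-123-pack" (both made-up values), the sketch
yields ".tmp-123-pack".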

Signed-off-by: Christian Couder

X-Patchwork-Id: 13397991
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v7 5/9] pack-bitmap-write: rebuild using new bitmap when remapping
Date: Mon, 25 Sep 2023 17:25:13 +0200
Message-ID: <20230925152517.803579-6-christian.couder@gmail.com>
In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com>
References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com>

`git repack` is about to learn a new `--filter=` option and we will
want to check that this option is incompatible with
`--write-bitmap-index`.
Unfortunately it appears that a test like: test_expect_success '--filter fails with --write-bitmap-index' ' test_must_fail \ env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none ' sometimes fails because, when rebuilding bitmaps, it appears that we are reusing existing bitmap information. So instead of detecting that some objects are missing and erroring out as it should, the `git repack --write-bitmap-index --filter=...` command succeeds. Let's fix that by making sure we rebuild bitmaps using new bitmaps instead of existing ones. Helped-by: Taylor Blau Signed-off-by: Christian Couder --- pack-bitmap-write.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index f6757c3cbf..f4ecdf8b0e 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -413,15 +413,19 @@ static int fill_bitmap_commit(struct bb_commit *ent, if (old_bitmap && mapping) { struct ewah_bitmap *old = bitmap_for_commit(old_bitmap, c); + struct bitmap *remapped = bitmap_new(); /* * If this commit has an old bitmap, then translate that * bitmap and add its bits to this one. No need to walk * parents or the tree for this commit.
*/ - if (old && !rebuild_bitmap(mapping, old, ent->bitmap)) { + if (old && !rebuild_bitmap(mapping, old, remapped)) { + bitmap_or(ent->bitmap, remapped); + bitmap_free(remapped); reused_bitmaps_nr++; continue; } + bitmap_free(remapped); } /* From patchwork Mon Sep 25 15:25:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13397993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 006EFCE7A81 for ; Mon, 25 Sep 2023 15:26:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232989AbjIYP0K (ORCPT ); Mon, 25 Sep 2023 11:26:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232836AbjIYPZ6 (ORCPT ); Mon, 25 Sep 2023 11:25:58 -0400 Received: from mail-ej1-x633.google.com (mail-ej1-x633.google.com [IPv6:2a00:1450:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D31E6FF for ; Mon, 25 Sep 2023 08:25:50 -0700 (PDT) Received: by mail-ej1-x633.google.com with SMTP id a640c23a62f3a-9ad8a822508so821734366b.0 for ; Mon, 25 Sep 2023 08:25:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695655549; x=1696260349; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=A4LDbAAJmEWA1Pues+oBTWBdkT+yU0f8T0tv07oodBE=; b=jhRgmaMA8aKT1KzoBmnj68HBlZeqAOlbGA1+9ByUFZXkKoZ/J55oYHb/oCVCc39RaT 3aCymjmg5oAESNRAxdyxFW0bIPnNCRQmHUbdwySWcimggyl4P8D5yWMnr5vHETGam8Ee tGltm0AkHs49s9dPL2vQBS9+1tjCz9q8kGX0stxVusO/yOb4gone7ZZDHOBZVnAzZU87 
rYolWAy/lTnq0kvpXs76aXfjzcAXWuEWmLzyQeRz8KhtQthvgqJieAU5nWHqX6ogPxPe btw/QbpldQaKMjpWCHEjwGL2bfF+JAAPINlXG37NzXd3YgyH6GkGF3o+8URVIQKjt46f e/tA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695655549; x=1696260349; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A4LDbAAJmEWA1Pues+oBTWBdkT+yU0f8T0tv07oodBE=; b=YZ9bIDXlQl3Fgm3nk9NCtFUZvG0W4LdF/K0Ny6egZFyZe78/f2M+QDE1a1/57AJ/6W pB15VFRVYD2ZRbXn58mTknde5Y3ud+xNuIOa07jnya9YGxkQnaordFvDAfpSO5MUWuT5 Gc++tYroW6eLkikJrxcsmIpHmRhSPV00QaJ164twV7xMA2h7ndUsCjgRC+RkjQHLX/Xn gGtZbuXcYnjgbIbK7cMJPkaLQllIvTcFaFsGVqgPOF8aRBXWlvpNU3c0B2gWlGiiDzAB nldvZiIYNdRtq9+GjwQmTqlV1+ePs69wX0T2ijZuXtkDsuLAUubPHi1i2ZJtWadYQDxC Zx+g== X-Gm-Message-State: AOJu0YzYgby/5GsEdjWZYwH+d2XMWN29+ze6+Y/3pItCQGnZlA7P3AKz NelHrsHhkzPWt6UFaXQTcXAnWqaX4xXi+g== X-Google-Smtp-Source: AGHT+IFR6LrIg+p2eRFOaQswDfiYvXabc8aQGxqEO6IyTsIe2Pztir+KPEtmpQ1Q6UCqSQ4iWgInjg== X-Received: by 2002:a17:906:1da1:b0:9ae:5367:fe90 with SMTP id u1-20020a1709061da100b009ae5367fe90mr6179584ejh.32.1695655548810; Mon, 25 Sep 2023 08:25:48 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:105a:e25e:7421:a01e:ee4a:ba03]) by smtp.gmail.com with ESMTPSA id f1-20020a17090624c100b009ae3e6c342asm6432045ejb.111.2023.09.25.08.25.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 08:25:48 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v7 6/9] repack: add `--filter=<filter-spec>` option Date: Mon, 25 Sep 2023 17:25:14 +0200 Message-ID: <20230925152517.803579-7-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.279.g57b2ba444c In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com> References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This new option puts the objects specified by `<filter-spec>` into a separate packfile. This could be useful if, for example, some blobs take up a lot of precious space on fast storage while they are rarely accessed. It could make sense to move them into separate storage that is cheaper, though slower. It's possible to find which new packfile contains the filtered out objects using one of the following: - `git verify-pack -v ...`, - `test-tool find-pack ...`, which a previous commit added, - `--filter-to=<dir>`, which a following commit will add to specify where the pack containing the filtered out objects will be. This feature is implemented by running `git pack-objects` twice in a row. The first command is run with `--filter=<filter-spec>`, using the specified filter. It packs objects while omitting the objects specified by the filter. Then another `git pack-objects` command is launched using `--stdin-packs`. We pass all the previously existing packs to its stdin, so that it will pack all the objects in the previously existing packs.
But we also pass to its stdin the pack created by the previous `git pack-objects --filter=<filter-spec>` command as well as the kept packs, all prefixed with '^', so that the objects in these packs will be omitted from the resulting pack. The result is that only the objects filtered out by the first `git pack-objects` command are in the pack resulting from the second `git pack-objects` command. As the interactions with kept packs are a bit tricky, a few related tests are added. Helped-by: Taylor Blau Signed-off-by: John Cai Signed-off-by: Christian Couder --- Documentation/git-repack.txt | 12 ++++ builtin/repack.c | 70 ++++++++++++++++++ t/t7700-repack.sh | 135 +++++++++++++++++++++++++++++++++++ 3 files changed, 217 insertions(+) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 4017157949..6d5bec7716 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -143,6 +143,18 @@ depth is 4095. a larger and slower repository; see the discussion in `pack.packSizeLimit`. +--filter=<filter-spec>:: + Remove objects matching the filter specification from the + resulting packfile and put them into a separate packfile. Note + that objects used in the working directory are not filtered + out. So for the split to fully work, it's best to perform it + in a bare repo and to use the `-a` and `-d` options along with + this option. Also `--no-write-bitmap-index` (or the + `repack.writebitmaps` config option set to `false`) should be + used otherwise writing bitmap index will fail, as it supposes + a single packfile containing all the objects. See + linkgit:git-rev-list[1] for valid `<filter-spec>` forms. + -b:: --write-bitmap-index:: Write a reachability bitmap index as part of the repack.
This diff --git a/builtin/repack.c b/builtin/repack.c index 9ef0044384..c7b564192f 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -21,6 +21,7 @@ #include "pack.h" #include "pack-bitmap.h" #include "refs.h" +#include "list-objects-filter-options.h" #define ALL_INTO_ONE 1 #define LOOSEN_UNREACHABLE 2 @@ -56,6 +57,7 @@ struct pack_objects_args { int no_reuse_object; int quiet; int local; + struct list_objects_filter_options filter_options; }; static int repack_config(const char *var, const char *value, @@ -836,6 +838,56 @@ static int finish_pack_objects_cmd(struct child_process *cmd, return finish_command(cmd); } +static int write_filtered_pack(const struct pack_objects_args *args, + const char *destination, + const char *pack_prefix, + struct existing_packs *existing, + struct string_list *names) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list_item *item; + FILE *in; + int ret; + const char *caret; + const char *scratch; + int local = skip_prefix(destination, packdir, &scratch); + + prepare_pack_objects(&cmd, args, destination); + + strvec_push(&cmd.args, "--stdin-packs"); + + if (!pack_kept_objects) + strvec_push(&cmd.args, "--honor-pack-keep"); + for_each_string_list_item(item, &existing->kept_packs) + strvec_pushf(&cmd.args, "--keep-pack=%s", item->string); + + cmd.in = -1; + + ret = start_command(&cmd); + if (ret) + return ret; + + /* + * Here 'names' contains only the pack(s) that were just + * written, which is exactly the packs we want to keep. Also + * 'existing_kept_packs' already contains the packs in + * 'keep_pack_list'. + */ + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, names) + fprintf(in, "^%s-%s.pack\n", pack_prefix, item->string); + for_each_string_list_item(item, &existing->non_kept_packs) + fprintf(in, "%s.pack\n", item->string); + for_each_string_list_item(item, &existing->cruft_packs) + fprintf(in, "%s.pack\n", item->string); + caret = pack_kept_objects ? 
"" : "^"; + for_each_string_list_item(item, &existing->kept_packs) + fprintf(in, "%s%s.pack\n", caret, item->string); + fclose(in); + + return finish_pack_objects_cmd(&cmd, names, local); +} + static int write_cruft_pack(const struct pack_objects_args *args, const char *destination, const char *pack_prefix, @@ -966,6 +1018,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("limits the maximum number of threads")), OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"), N_("maximum size of each packfile")), + OPT_PARSE_LIST_OBJECTS_FILTER(&po_args.filter_options), OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects, N_("repack objects in packs marked with .keep")), OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"), @@ -979,6 +1032,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) OPT_END() }; + list_objects_filter_init(&po_args.filter_options); + git_config(repack_config, &cruft_po_args); argc = parse_options(argc, argv, prefix, builtin_repack_options, @@ -1119,6 +1174,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) strvec_push(&cmd.args, "--incremental"); } + if (po_args.filter_options.choice) + strvec_pushf(&cmd.args, "--filter=%s", + expand_list_objects_filter_spec(&po_args.filter_options)); + if (geometry.split_factor) cmd.in = -1; else @@ -1205,6 +1264,16 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } } + if (po_args.filter_options.choice) { + ret = write_filtered_pack(&po_args, + packtmp, + find_pack_prefix(packdir, packtmp), + &existing, + &names); + if (ret) + goto cleanup; + } + string_list_sort(&names); close_object_store(the_repository->objects); @@ -1297,6 +1366,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) string_list_clear(&names, 1); existing_packs_release(&existing); free_pack_geometry(&geometry); + list_objects_filter_release(&po_args.filter_options); return ret; } diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 
27b66807cd..39e89445fd 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -327,6 +327,141 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' test_must_be_empty actual ' +test_expect_success 'repacking with a filter works' ' + git -C bare.git repack -a -d && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + git -C bare.git -c repack.writebitmaps=false repack -a -d --filter=blob:none && + test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack && + commit_pack=$(test-tool -C bare.git find-pack -c 1 HEAD) && + blob_pack=$(test-tool -C bare.git find-pack -c 1 HEAD:file1) && + test "$commit_pack" != "$blob_pack" && + tree_pack=$(test-tool -C bare.git find-pack -c 1 HEAD^{tree}) && + test "$tree_pack" = "$commit_pack" && + blob_pack2=$(test-tool -C bare.git find-pack -c 1 HEAD:file2) && + test "$blob_pack2" = "$blob_pack" +' + +test_expect_success '--filter fails with --write-bitmap-index' ' + test_must_fail \ + env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none +' + +test_expect_success 'repacking with two filters works' ' + git init two-filters && + ( + cd two-filters && + mkdir subdir && + test_commit foo && + test_commit subdir_bar subdir/bar && + test_commit subdir_baz subdir/baz + ) && + git clone --no-local --bare two-filters two-filters.git && + ( + cd two-filters.git && + test_stdout_line_count = 1 ls objects/pack/*.pack && + git -c repack.writebitmaps=false repack -a -d \ + --filter=blob:none --filter=tree:1 && + test_stdout_line_count = 2 ls objects/pack/*.pack && + commit_pack=$(test-tool find-pack -c 1 HEAD) && + blob_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + root_tree_pack=$(test-tool find-pack -c 1 HEAD^{tree}) && + subdir_tree_hash=$(git ls-tree --object-only HEAD -- subdir) && + subdir_tree_pack=$(test-tool find-pack -c 1 "$subdir_tree_hash") && + + # Root tree and subdir tree are not in the same packfiles + test "$commit_pack" != 
"$blob_pack" && + test "$commit_pack" = "$root_tree_pack" && + test "$blob_pack" = "$subdir_tree_pack" + ) +' + +prepare_for_keep_packs () { + git init keep-packs && + ( + cd keep-packs && + test_commit foo && + test_commit bar + ) && + git clone --no-local --bare keep-packs keep-packs.git && + ( + cd keep-packs.git && + + # Create two packs + # The first pack will contain all of the objects except one blob + git rev-list --objects --all >objs && + grep -v "bar.t" objs | git pack-objects pack && + # The second pack will contain the excluded object and be kept + packid=$(grep "bar.t" objs | git pack-objects pack) && + >pack-$packid.keep && + + # Replace the existing pack with the 2 new ones + rm -f objects/pack/pack* && + mv pack-* objects/pack/ + ) +} + +test_expect_success '--filter works with .keep packs' ' + prepare_for_keep_packs && + ( + cd keep-packs.git && + + foo_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack" != "$bar_pack" && + test "$foo_pack" = "$head_pack" && + + git -c repack.writebitmaps=false repack -a -d --filter=blob:none && + + foo_pack_1=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack_1=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack_1=$(test-tool find-pack -c 1 HEAD) && + + # Object bar is still only in the old .keep pack + test "$foo_pack_1" != "$foo_pack" && + test "$bar_pack_1" = "$bar_pack" && + test "$head_pack_1" != "$head_pack" && + + test "$foo_pack_1" != "$bar_pack_1" && + test "$foo_pack_1" != "$head_pack_1" && + test "$bar_pack_1" != "$head_pack_1" + ) +' + +test_expect_success '--filter works with --pack-kept-objects and .keep packs' ' + rm -rf keep-packs keep-packs.git && + prepare_for_keep_packs && + ( + cd keep-packs.git && + + foo_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack" != 
"$bar_pack" && + test "$foo_pack" = "$head_pack" && + + git -c repack.writebitmaps=false repack -a -d --filter=blob:none \ + --pack-kept-objects && + + foo_pack_1=$(test-tool find-pack -c 1 HEAD:foo.t) && + test-tool find-pack -c 2 HEAD:bar.t >bar_pack_1 && + head_pack_1=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack_1" != "$foo_pack" && + test "$foo_pack_1" != "$bar_pack" && + test "$head_pack_1" != "$head_pack" && + + # Object bar is in both the old .keep pack and the new + # pack that contained the filtered out objects + grep "$bar_pack" bar_pack_1 && + grep "$foo_pack_1" bar_pack_1 && + test "$foo_pack_1" != "$head_pack_1" + ) +' + objdir=.git/objects midx=$objdir/pack/multi-pack-index From patchwork Mon Sep 25 15:25:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13397992 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCB95CE7A8C for ; Mon, 25 Sep 2023 15:26:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232943AbjIYP0J (ORCPT ); Mon, 25 Sep 2023 11:26:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232889AbjIYPZ7 (ORCPT ); Mon, 25 Sep 2023 11:25:59 -0400 Received: from mail-ej1-x62f.google.com (mail-ej1-x62f.google.com [IPv6:2a00:1450:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2ECCE10A for ; Mon, 25 Sep 2023 08:25:52 -0700 (PDT) Received: by mail-ej1-x62f.google.com with SMTP id a640c23a62f3a-99bdeae1d0aso820027666b.1 for ; Mon, 25 Sep 2023 08:25:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695655550; x=1696260350; 
darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2e4Aqk/meUSmscAN+bYvQhR1eh5JYPT3g6SksH8pUko=; b=bcgyoVTY6yeayMEsSWMeurb7B1dzIeyRqtB6MZ+ePWx6IqI/cp7I3ys/P+TnYAM0LE RzhuRWZb3yp8Dp9BdAxYQk7f3todHDY7xTrl0Me8xmrpM13Dz+Gr04qlnarb7kJ4eXc2 DgglcNTx2r0xReMsjwmhCXci0FJBwPIJDB12y62mzssPQ53qpZCzbl++XTy3hm2WBo6B LMWf7iWDfAXPmRhRifgzq7Zli7Y3t6JBI9Migt6ZzwIjM+BWalcXx+wxVsw73TPhlgf0 TB/57y39GAxgOu8xOQG3vTC+UqtA/7k1oBYqUTvNQccBwLQqUuJCcoaqfM+8oNudnhzG zt6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695655550; x=1696260350; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2e4Aqk/meUSmscAN+bYvQhR1eh5JYPT3g6SksH8pUko=; b=CryauboAm0N9iYOncGrN6m3FE+N8fxZidZzMK1A4RlbKcEBmK+eE8NkjYJn0t+UBIu icnOZn4QpSU102ATe2Z4BlR5+v7F0p+YfhSH/X2A7bciSp3bR4YryXMeYpsIIG35kgLp WjpYJpED8CXk+/pbZx4Hos3pZQgQcRmE98GU68cyjcp3IuLcDKKych13XKP8aiPT+hOj 8KFEDwgS8MwWB84ZFZpeYC/Yq2kyWS58i5hnM3qc+mibWbilTZfJka8kaJJWaMDDcCuq rJNDosAkak6APb3ygAccfx1KBEfYLVwpW/U0XsNJbf25qeB+xIKWPxFGS7omePOs7KfO UgwA== X-Gm-Message-State: AOJu0YyOUgvoM3MqGRBFGT1CHs37xYNF6QB4KPU5tlNMhOe+wgP7Xy5H mlXRN7/8NJO6Ig4gnacLqMOLz00jaA7QPg== X-Google-Smtp-Source: AGHT+IE/vvcn+xj/bHCvxfPNa8w2Y07RjzM2dmor77HMHkOwG3uyPgw53aOzev8eULHPtD6rqh84sw== X-Received: by 2002:a17:906:6d14:b0:9a9:f17e:412f with SMTP id m20-20020a1709066d1400b009a9f17e412fmr6451973ejr.50.1695655550280; Mon, 25 Sep 2023 08:25:50 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:105a:e25e:7421:a01e:ee4a:ba03]) by smtp.gmail.com with ESMTPSA id f1-20020a17090624c100b009ae3e6c342asm6432045ejb.111.2023.09.25.08.25.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 08:25:49 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v7 7/9] gc: add `gc.repackFilter` config option Date: Mon, 25 Sep 2023 17:25:15 +0200 Message-ID: <20230925152517.803579-8-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.279.g57b2ba444c In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com> References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A previous commit implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new, different pack. Users might want to perform such a cleanup regularly, at the same time as they perform other repacks and cleanups, that is, as part of `git gc`. Let's allow them to configure a `<filter-spec>` for that purpose using a new `gc.repackFilter` config option. Now, when `git gc` performs a repack and a non-empty `<filter-spec>` is configured through this option, the repack process is passed a corresponding `--filter=<filter-spec>` argument.
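As an illustration of the above, here is a minimal shell sketch that is not part of this patch. The helper name `configure_repack_filter` and the repository path in the usage note are hypothetical; `blob:none` is the filter used by the test in this patch, and `repack.writeBitmaps` is turned off because `--filter` is documented as incompatible with writing a bitmap index.

```shell
#!/bin/sh
# Hypothetical helper (not part of this series): store a filter spec in
# gc.repackFilter so that a later `git gc` hands it to `git repack`.
configure_repack_filter () {
	repo=$1
	spec=$2
	git -C "$repo" config gc.repackFilter "$spec" &&
	# Bitmap writing is disabled, as --filter cannot be combined with
	# --write-bitmap-index.
	git -C "$repo" config repack.writeBitmaps false
}
```

With a Git that includes this series, running `configure_repack_filter bare.git blob:none` followed by `git -C bare.git gc` should cause `git repack` to be invoked with `--filter=blob:none`, which is what the test added by this patch checks through `GIT_TRACE`.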
Signed-off-by: Christian Couder --- Documentation/config/gc.txt | 5 +++++ builtin/gc.c | 6 ++++++ t/t6500-gc.sh | 13 +++++++++++++ 3 files changed, 24 insertions(+) diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt index ca47eb2008..2153bde7ac 100644 --- a/Documentation/config/gc.txt +++ b/Documentation/config/gc.txt @@ -145,6 +145,11 @@ Multiple hooks are supported, but all must exit successfully, else the operation (either generating a cruft pack or unpacking unreachable objects) will be halted. +gc.repackFilter:: + When repacking, use the specified filter to move certain + objects into a separate packfile. See the + `--filter=` option of linkgit:git-repack[1]. + gc.rerereResolved:: Records of conflicted merge you resolved earlier are kept for this many days when 'git rerere gc' is run. diff --git a/builtin/gc.c b/builtin/gc.c index 00192ae5d3..98148e98fe 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -61,6 +61,7 @@ static timestamp_t gc_log_expire_time; static const char *gc_log_expire = "1.day.ago"; static const char *prune_expire = "2.weeks.ago"; static const char *prune_worktrees_expire = "3.months.ago"; +static char *repack_filter; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; @@ -170,6 +171,8 @@ static void gc_config(void) git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold); git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size); + git_config_get_string("gc.repackfilter", &repack_filter); + git_config(git_default_config, NULL); } @@ -355,6 +358,9 @@ static void add_repack_all_option(struct string_list *keep_pack) if (keep_pack) for_each_string_list(keep_pack, keep_one_pack, NULL); + + if (repack_filter && *repack_filter) + strvec_pushf(&repack, "--filter=%s", repack_filter); } static void add_repack_incremental_option(void) diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 69509d0c11..232e403b66 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ 
-202,6 +202,19 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out ' +test_expect_success 'gc.repackFilter launches repack with a filter' ' + test_when_finished "rm -rf bare.git" && + git clone --no-local --bare . bare.git && + + git -C bare.git -c gc.cruftPacks=false gc && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + + GIT_TRACE=$(pwd)/trace.out git -C bare.git -c gc.repackFilter=blob:none \ + -c repack.writeBitmaps=false -c gc.cruftPacks=false gc && + test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack && + grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out +' + prepare_cruft_history () { test_commit base && From patchwork Mon Sep 25 15:25:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13397994 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A885CE7A81 for ; Mon, 25 Sep 2023 15:26:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233029AbjIYP0W (ORCPT ); Mon, 25 Sep 2023 11:26:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232882AbjIYP0C (ORCPT ); Mon, 25 Sep 2023 11:26:02 -0400 Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9655B10E for ; Mon, 25 Sep 2023 08:25:54 -0700 (PDT) Received: by mail-ed1-x536.google.com with SMTP id 4fb4d7f45d1cf-5230a22cfd1so7798244a12.1 for ; Mon, 25 Sep 2023 08:25:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695655552; x=1696260352; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cFaQb+QCw1pF7cO2nQRCLrae1O2I+zc6XjI4hP6cI3g=; b=LmUcHoL7XXS63zl3hV0RrZT02KaeUp34OKupe2+On92pAq3HZcGTMc/49511GpHLvO fa7SSsGDqtcykAGi6qqKA1Qod143GCGVhnArr1kHjv+P4TJPxwVZjOd905hSgSelpZMB 0GKFAu9kluI+QYb1EPKB6ns0v98Wp400vTW6NeEWpBmQmscBoeHn3MIcHvusVyvmJ16o TZmvdbnNnF2HpfWP7Exyof6aaQqFxxPitsmHIVmdF1Yd+YDT7RxsAtsjkWox5PY0Fb++ nR0yZbP1fm5wJSZTjVbfAnuLCIbd8kfBWpU0k8MJrrsYfhCkaeKLE0vSeCCWp2XVqOSU 4dhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695655552; x=1696260352; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cFaQb+QCw1pF7cO2nQRCLrae1O2I+zc6XjI4hP6cI3g=; b=Q/ixPid42xM+vJfPPzy5uumcA/+qtabCUXYP5rQMIC+K67gHlRp3/dKvwWcUaD7ZKq uNGKh49ZtvYonss7kf/4CglChbB2AjLOlP+NQA/ediJj0Isgw2M502ppIFYXSf23dafQ bOIeCm77mYw9kjoMlBDQCYvHBezn10UQnMfwmNXleHFvF6HS/lRo4BaZD3J1GgYlVaPX /ifXo5GY+PATH+o+acjR00s4SLc+N3QUbswGoSROqUmz5ZOIQcufKycUvd+1JT0ANYcO JOaP6ion035T7ulRiyipBTVtWYtgrXtw2Wo/J/PaV2gTPU6pTI3UWFXdIE0OEYzgiOxJ 5BnQ== X-Gm-Message-State: AOJu0YwxL32uAH0wZx8unCxqApX8j3VQVCjFT6YoJeoNIfovXLkT2cYf yQan7x3SVGlF7J+YNGLsGdp5s6oDRpysXA== X-Google-Smtp-Source: AGHT+IGz1Q6TO6RBD4oR9Dmo9w+SpEESJfJJTEy+QidrHUhfkWmTxq6zThegdXjgD/d9s31ZiGN8Vg== X-Received: by 2002:a17:907:7759:b0:9ae:74d1:4b44 with SMTP id kx25-20020a170907775900b009ae74d14b44mr5839361ejc.72.1695655552206; Mon, 25 Sep 2023 08:25:52 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:105a:e25e:7421:a01e:ee4a:ba03]) by smtp.gmail.com with ESMTPSA id f1-20020a17090624c100b009ae3e6c342asm6432045ejb.111.2023.09.25.08.25.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 08:25:51 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v7 8/9] repack: implement `--filter-to` for storing filtered out objects Date: Mon, 25 Sep 2023 17:25:16 +0200 Message-ID: <20230925152517.803579-9-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.279.g57b2ba444c In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com> References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A previous commit implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new, different pack. It would be nice if this new pack could be created in a different directory than the regular pack. This would make it possible to move large blobs into a pack on a different kind of storage, for example cheaper storage. Even in a different directory, this pack can be accessible if, for example, the Git alternates mechanism is used to point to it. In fact, not using the Git alternates mechanism can corrupt a repo, as the generated pack containing the filtered objects might not be accessible from the repo any more. So setting up the Git alternates mechanism should be done before using this feature if the user wants the repo to be fully usable while this feature is used. In some cases, like when a repo has just been cloned or when there is no other activity in the repo, it's OK to set up the Git alternates mechanism afterwards though.
It's also OK to just inspect the generated packfile containing the filtered objects and then move it into the '.git/objects/pack/' directory manually. That's why it's not necessary for this command to check that the Git alternates mechanism has already been set up. While at it, as an example to show that `--filter` and `--filter-to` work well with other options, let's also add a test to check that these options work well with `--max-pack-size`. Signed-off-by: Christian Couder --- Documentation/git-repack.txt | 11 +++++++ builtin/repack.c | 10 +++++- t/t7700-repack.sh | 62 ++++++++++++++++++++++++++++++++++++ 3 files changed, 82 insertions(+), 1 deletion(-) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 6d5bec7716..8545a32667 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -155,6 +155,17 @@ depth is 4095. a single packfile containing all the objects. See linkgit:git-rev-list[1] for valid `<filter-spec>` forms. +--filter-to=<dir>:: + Write the pack containing filtered out objects to the + directory `<dir>`. Only useful with `--filter`. This can be + used for putting the pack on a separate object directory that + is accessed through the Git alternates mechanism. **WARNING:** + If the packfile containing the filtered out objects is not + accessible, the repo can become corrupt as it might not be + possible to access the objects in that packfile. See the + `objects` and `objects/info/alternates` sections of + linkgit:gitrepository-layout[5]. + -b:: --write-bitmap-index:: Write a reachability bitmap index as part of the repack.
This diff --git a/builtin/repack.c b/builtin/repack.c index c7b564192f..db9277081d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -977,6 +977,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) int write_midx = 0; const char *cruft_expiration = NULL; const char *expire_to = NULL; + const char *filter_to = NULL; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, &pack_everything, @@ -1029,6 +1030,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("write a multi-pack index of the resulting packs")), OPT_STRING(0, "expire-to", &expire_to, N_("dir"), N_("pack prefix to store a pack containing pruned objects")), + OPT_STRING(0, "filter-to", &filter_to, N_("dir"), + N_("pack prefix to store a pack containing filtered out objects")), OPT_END() }; @@ -1177,6 +1180,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (po_args.filter_options.choice) strvec_pushf(&cmd.args, "--filter=%s", expand_list_objects_filter_spec(&po_args.filter_options)); + else if (filter_to) + die(_("option '%s' can only be used along with '%s'"), "--filter-to", "--filter"); if (geometry.split_factor) cmd.in = -1; @@ -1265,8 +1270,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } if (po_args.filter_options.choice) { + if (!filter_to) + filter_to = packtmp; + ret = write_filtered_pack(&po_args, - packtmp, + filter_to, find_pack_prefix(packdir, packtmp), &existing, &names); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 39e89445fd..48e92aa6f7 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -462,6 +462,68 @@ test_expect_success '--filter works with --pack-kept-objects and .keep packs' ' ) ' +test_expect_success '--filter-to stores filtered out objects' ' + git -C bare.git repack -a -d && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + + git init --bare filtered.git && + git -C bare.git -c repack.writebitmaps=false repack -a -d \ + --filter=blob:none \ + 
--filter-to=../filtered.git/objects/pack/pack && + test_stdout_line_count = 1 ls bare.git/objects/pack/pack-*.pack && + test_stdout_line_count = 1 ls filtered.git/objects/pack/pack-*.pack && + + commit_pack=$(test-tool -C bare.git find-pack -c 1 HEAD) && + blob_pack=$(test-tool -C bare.git find-pack -c 0 HEAD:file1) && + blob_hash=$(git -C bare.git rev-parse HEAD:file1) && + test -n "$blob_hash" && + blob_pack=$(test-tool -C filtered.git find-pack -c 1 $blob_hash) && + + echo $(pwd)/filtered.git/objects >bare.git/objects/info/alternates && + blob_pack=$(test-tool -C bare.git find-pack -c 1 HEAD:file1) && + blob_content=$(git -C bare.git show $blob_hash) && + test "$blob_content" = "content1" +' + +test_expect_success '--filter works with --max-pack-size' ' + rm -rf filtered.git && + git init --bare filtered.git && + git init max-pack-size && + ( + cd max-pack-size && + test_commit base && + # two blobs which exceed the maximum pack size + test-tool genrandom foo 1048576 >foo && + git hash-object -w foo && + test-tool genrandom bar 1048576 >bar && + git hash-object -w bar && + git add foo bar && + git commit -m "adding foo and bar" + ) && + git clone --no-local --bare max-pack-size max-pack-size.git && + ( + cd max-pack-size.git && + git -c repack.writebitmaps=false repack -a -d --filter=blob:none \ + --max-pack-size=1M \ + --filter-to=../filtered.git/objects/pack/pack && + echo $(cd .. 
&& pwd)/filtered.git/objects >objects/info/alternates && + + # Check that the 3 blobs are in different packfiles in filtered.git + test_stdout_line_count = 3 ls ../filtered.git/objects/pack/pack-*.pack && + test_stdout_line_count = 1 ls objects/pack/pack-*.pack && + foo_pack=$(test-tool find-pack -c 1 HEAD:foo) && + bar_pack=$(test-tool find-pack -c 1 HEAD:bar) && + base_pack=$(test-tool find-pack -c 1 HEAD:base.t) && + test "$foo_pack" != "$bar_pack" && + test "$foo_pack" != "$base_pack" && + test "$bar_pack" != "$base_pack" && + for pack in "$foo_pack" "$bar_pack" "$base_pack" + do + case "$pack" in */filtered.git/objects/pack/*) true ;; *) return 1 ;; esac + done + ) +' + objdir=.git/objects midx=$objdir/pack/multi-pack-index
From patchwork Mon Sep 25 15:25:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13397995 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v7 9/9] gc: add `gc.repackFilterTo` config option Date: Mon, 25 Sep 2023 17:25:17 +0200 Message-ID: <20230925152517.803579-10-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.279.g57b2ba444c In-Reply-To: <20230925152517.803579-1-christian.couder@gmail.com> References: <20230911150618.129737-1-christian.couder@gmail.com> <20230925152517.803579-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org
A previous commit implemented the `gc.repackFilter` config option to specify a filter that should be used by `git gc` when performing repacks. Another previous commit implemented `git repack --filter-to=<dir>` to specify the location of the packfile containing filtered out objects when using a filter. Let's implement the `gc.repackFilterTo` config option to specify that location in the config when `gc.repackFilter` is used. Now when `git gc` performs a repack and a non-empty <dir> has been configured through this option, the repack process is passed a corresponding `--filter-to=<dir>` argument. Signed-off-by: Christian Couder --- Documentation/config/gc.txt | 11 +++++++++++ builtin/gc.c | 4 ++++ t/t6500-gc.sh | 13 ++++++++++++- 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt index 2153bde7ac..466466d6cc 100644 --- a/Documentation/config/gc.txt +++ b/Documentation/config/gc.txt @@ -150,6 +150,17 @@ gc.repackFilter:: objects into a separate packfile.
See the `--filter=<filter-spec>` option of linkgit:git-repack[1]. +gc.repackFilterTo:: + When repacking and using a filter, see `gc.repackFilter`, the + specified location will be used to create the packfile + containing the filtered out objects. **WARNING:** The + specified location should be accessible, using for example the + Git alternates mechanism, otherwise the repo could be + considered corrupt by Git as it might not be able to access the + objects in that packfile. See the `--filter-to=<dir>` option + of linkgit:git-repack[1] and the `objects/info/alternates` + section of linkgit:gitrepository-layout[5]. + gc.rerereResolved:: Records of conflicted merge you resolved earlier are kept for this many days when 'git rerere gc' is run. diff --git a/builtin/gc.c b/builtin/gc.c index 98148e98fe..68ca8d45bf 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -62,6 +62,7 @@ static const char *gc_log_expire = "1.day.ago"; static const char *prune_expire = "2.weeks.ago"; static const char *prune_worktrees_expire = "3.months.ago"; static char *repack_filter; +static char *repack_filter_to; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; @@ -172,6 +173,7 @@ static void gc_config(void) git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size); git_config_get_string("gc.repackfilter", &repack_filter); + git_config_get_string("gc.repackfilterto", &repack_filter_to); git_config(git_default_config, NULL); } @@ -361,6 +363,8 @@ static void add_repack_all_option(struct string_list *keep_pack) if (repack_filter && *repack_filter) strvec_pushf(&repack, "--filter=%s", repack_filter); + if (repack_filter_to && *repack_filter_to) + strvec_pushf(&repack, "--filter-to=%s", repack_filter_to); } static void add_repack_incremental_option(void) diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 232e403b66..e412cf8daf 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -203,7 +203,6 @@ test_expect_success 'one of
gc.reflogExpire{Unreachable,}=never does not skip "e ' test_expect_success 'gc.repackFilter launches repack with a filter' ' - test_when_finished "rm -rf bare.git" && git clone --no-local --bare . bare.git && git -C bare.git -c gc.cruftPacks=false gc && @@ -215,6 +214,18 @@ test_expect_success 'gc.repackFilter launches repack with a filter' ' grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out ' +test_expect_success 'gc.repackFilterTo stores filtered out objects' ' + test_when_finished "rm -rf bare.git filtered.git" && + + git init --bare filtered.git && + git -C bare.git -c gc.repackFilter=blob:none \ + -c gc.repackFilterTo=../filtered.git/objects/pack/pack \ + -c repack.writeBitmaps=false -c gc.cruftPacks=false gc && + + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + test_stdout_line_count = 1 ls filtered.git/objects/pack/*.pack +' + prepare_cruft_history () { test_commit base &&