From patchwork Mon Oct 2 16:54:56 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13406470
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v8 1/9] pack-objects: allow `--filter` without `--stdout`
Date: Mon, 2 Oct 2023 18:54:56 +0200
Message-ID: <20231002165504.1325153-2-christian.couder@gmail.com>
X-Mailer: git-send-email 2.42.0.305.g5bfd918c90
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>
X-Mailing-List: git@vger.kernel.org

9535ce7337 (pack-objects: add list-objects filtering,
2017-11-21) taught `git pack-objects` to use `--filter`, but required
the use of `--stdout` since a partial clone mechanism was not yet in
place to handle missing objects. Since then, changes like 9e27beaa23
(promisor-remote: implement promisor_remote_get_direct(), 2019-06-25)
and others added support to dynamically fetch objects that were missing.

Even without a promisor remote, filtering out objects can also be useful
if we can put the filtered out objects in a separate pack, and in this
case it also makes sense for pack-objects to write the packfile directly
to an actual file rather than on stdout.

Remove the `--stdout` requirement when using `--filter`, so that in a
follow-up commit, repack can pass `--filter` to pack-objects to omit
certain objects from the resulting packfile.

Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-pack-objects.txt     | 4 ++--
 builtin/pack-objects.c                 | 8 ++------
 t/t5317-pack-objects-filter-objects.sh | 8 ++++++++
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index dea7eacb0f..e32404c6aa 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -296,8 +296,8 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle.
 	nevertheless.
 
 --filter=<filter-spec>::
-	Requires `--stdout`. Omits certain objects (usually blobs) from
-	the resulting packfile. See linkgit:git-rev-list[1] for valid
+	Omits certain objects (usually blobs) from the resulting
+	packfile. See linkgit:git-rev-list[1] for valid
 	`<filter-spec>` forms.
 
 --no-filter::
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6eb9756836..89a8b5a976 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4402,12 +4402,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
 		unpack_unreachable_expiration = 0;
 
-	if (filter_options.choice) {
-		if (!pack_to_stdout)
-			die(_("cannot use --filter without --stdout"));
-		if (stdin_packs)
-			die(_("cannot use --filter with --stdin-packs"));
-	}
+	if (stdin_packs && filter_options.choice)
+		die(_("cannot use --filter with --stdin-packs"));
 
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));
diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh
index b26d476c64..2ff3eef9a3 100755
--- a/t/t5317-pack-objects-filter-objects.sh
+++ b/t/t5317-pack-objects-filter-objects.sh
@@ -53,6 +53,14 @@ test_expect_success 'verify blob:none packfile has no blobs' '
 	! grep blob verify_result
 '
 
+test_expect_success 'verify blob:none packfile without --stdout' '
+	git -C r1 pack-objects --revs --filter=blob:none mypackname >packhash <<-EOF &&
+	HEAD
+	EOF
+	git -C r1 verify-pack -v "mypackname-$(cat packhash).pack" >verify_result &&
+	! grep blob verify_result
+'
+
 test_expect_success 'verify normal and blob:none packfiles have same commits/trees' '
 	git -C r1 verify-pack -v ../all.pack >verify_result &&
 	grep -E "commit|tree" verify_result |

From patchwork Mon Oct 2 16:54:57 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13406471
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v8 2/9] t/helper: add 'find-pack' test-tool
Date: Mon, 2 Oct 2023 18:54:57 +0200
Message-ID: <20231002165504.1325153-3-christian.couder@gmail.com>
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>

In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.

To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.

Let's also make it possible to check if a given object is in the
expected number of packfiles with a `--check-count <n>` option.
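
The counting behaviour described above can be illustrated with a small
standalone sketch. This is not the helper added by this patch:
`toy_pack`, `toy_pack_has()` and `toy_find_pack()` are hypothetical
names, packs are modelled as plain arrays of object ID strings, and a
count mismatch is reported through the return value where the real tool
errors out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical toy model: a "pack" is a name plus a list of object IDs. */
struct toy_pack {
	const char *name;
	const char *const *oids;
	size_t nr;
};

/* Return 1 if `oid` is listed in `pack`, 0 otherwise. */
static int toy_pack_has(const struct toy_pack *pack, const char *oid)
{
	size_t i;

	for (i = 0; i < pack->nr; i++)
		if (!strcmp(pack->oids[i], oid))
			return 1;
	return 0;
}

/*
 * Print the name of every pack containing `oid`, one per line.
 * If `expected` is non-negative and differs from the number of
 * matching packs, return -1; otherwise return 0.
 */
static int toy_find_pack(const struct toy_pack *packs, size_t nr_packs,
			 const char *oid, int expected)
{
	int actual = 0;
	size_t i;

	for (i = 0; i < nr_packs; i++)
		if (toy_pack_has(&packs[i], oid)) {
			printf("%s\n", packs[i].name);
			actual++;
		}

	if (expected > -1 && expected != actual)
		return -1;
	return 0;
}
```

With two toy packs that both list an object, asking for an expected
count of 2 would succeed and asking for 1 would fail, which mirrors how
`--check-count` is meant to be used in tests.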

Signed-off-by: Christian Couder
---
 Makefile                  |  1 +
 t/helper/test-find-pack.c | 50 ++++++++++++++++++++++++
 t/helper/test-tool.c      |  1 +
 t/helper/test-tool.h      |  1 +
 t/t0081-find-pack.sh      | 82 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 135 insertions(+)
 create mode 100644 t/helper/test-find-pack.c
 create mode 100755 t/t0081-find-pack.sh

diff --git a/Makefile b/Makefile
index 003e63b792..f267034d23 100644
--- a/Makefile
+++ b/Makefile
@@ -800,6 +800,7 @@ TEST_BUILTINS_OBJS += test-dump-untracked-cache.o
 TEST_BUILTINS_OBJS += test-env-helper.o
 TEST_BUILTINS_OBJS += test-example-decorate.o
 TEST_BUILTINS_OBJS += test-fast-rebase.o
+TEST_BUILTINS_OBJS += test-find-pack.o
 TEST_BUILTINS_OBJS += test-fsmonitor-client.o
 TEST_BUILTINS_OBJS += test-genrandom.o
 TEST_BUILTINS_OBJS += test-genzeros.o
diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c
new file mode 100644
index 0000000000..e8bd793e58
--- /dev/null
+++ b/t/helper/test-find-pack.c
@@ -0,0 +1,50 @@
+#include "test-tool.h"
+#include "object-name.h"
+#include "object-store.h"
+#include "packfile.h"
+#include "parse-options.h"
+#include "setup.h"
+
+/*
+ * Display the path(s), one per line, of the packfile(s) containing
+ * the given object.
+ *
+ * If '--check-count <n>' is passed, then error out if the number of
+ * packfiles containing the object is not <n>.
+ */
+
+static const char *find_pack_usage[] = {
+	"test-tool find-pack [--check-count <n>] <object>",
+	NULL
+};
+
+int cmd__find_pack(int argc, const char **argv)
+{
+	struct object_id oid;
+	struct packed_git *p;
+	int count = -1, actual_count = 0;
+	const char *prefix = setup_git_directory();
+
+	struct option options[] = {
+		OPT_INTEGER('c', "check-count", &count, "expected number of packs"),
+		OPT_END(),
+	};
+
+	argc = parse_options(argc, argv, prefix, options, find_pack_usage, 0);
+	if (argc != 1)
+		usage(find_pack_usage[0]);
+
+	if (repo_get_oid(the_repository, argv[0], &oid))
+		die("cannot parse %s as an object name", argv[0]);
+
+	for (p = get_all_packs(the_repository); p; p = p->next)
+		if (find_pack_entry_one(oid.hash, p)) {
+			printf("%s\n", p->pack_name);
+			actual_count++;
+		}
+
+	if (count > -1 && count != actual_count)
+		die("bad packfile count %d instead of %d", actual_count, count);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 621ac3dd10..9010ac6de7 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -31,6 +31,7 @@ static struct test_cmd cmds[] = {
 	{ "env-helper", cmd__env_helper },
 	{ "example-decorate", cmd__example_decorate },
 	{ "fast-rebase", cmd__fast_rebase },
+	{ "find-pack", cmd__find_pack },
 	{ "fsmonitor-client", cmd__fsmonitor_client },
 	{ "genrandom", cmd__genrandom },
 	{ "genzeros", cmd__genzeros },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index a641c3a81d..f134f96b97 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -25,6 +25,7 @@ int cmd__dump_reftable(int argc, const char **argv);
 int cmd__env_helper(int argc, const char **argv);
 int cmd__example_decorate(int argc, const char **argv);
 int cmd__fast_rebase(int argc, const char **argv);
+int cmd__find_pack(int argc, const char **argv);
 int cmd__fsmonitor_client(int argc, const char **argv);
 int cmd__genrandom(int argc, const char **argv);
 int cmd__genzeros(int argc, const char **argv);
diff --git a/t/t0081-find-pack.sh b/t/t0081-find-pack.sh
new file mode 100755
index 0000000000..67b11216a3
--- /dev/null
+++ b/t/t0081-find-pack.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+test_description='test `test-tool find-pack`'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	test_commit one &&
+	test_commit two &&
+	test_commit three &&
+	test_commit four &&
+	test_commit five
+'
+
+test_expect_success 'repack everything into a single packfile' '
+	git repack -a -d --no-write-bitmap-index &&
+
+	head_commit_pack=$(test-tool find-pack HEAD) &&
+	head_tree_pack=$(test-tool find-pack HEAD^{tree}) &&
+	one_pack=$(test-tool find-pack HEAD:one.t) &&
+	three_pack=$(test-tool find-pack HEAD:three.t) &&
+	old_commit_pack=$(test-tool find-pack HEAD~4) &&
+
+	test-tool find-pack --check-count 1 HEAD &&
+	test-tool find-pack --check-count=1 HEAD^{tree} &&
+	! test-tool find-pack --check-count=0 HEAD:one.t &&
+	! test-tool find-pack -c 2 HEAD:one.t &&
+	test-tool find-pack -c 1 HEAD:three.t &&
+
+	# Packfile exists at the right path
+	case "$head_commit_pack" in
+		".git/objects/pack/pack-"*".pack") true ;;
+		*) false ;;
+	esac &&
+	test -f "$head_commit_pack" &&
+
+	# Everything is in the same pack
+	test "$head_commit_pack" = "$head_tree_pack" &&
+	test "$head_commit_pack" = "$one_pack" &&
+	test "$head_commit_pack" = "$three_pack" &&
+	test "$head_commit_pack" = "$old_commit_pack"
+'
+
+test_expect_success 'add more packfiles' '
+	git rev-parse HEAD^{tree} HEAD:two.t HEAD:four.t >objects &&
+	git pack-objects .git/objects/pack/mypackname1 >packhash1 objects &&
+	git pack-objects .git/objects/pack/mypackname2 >packhash2 head_tree_packs &&
+	grep "$head_commit_pack" head_tree_packs &&
+	grep mypackname1 head_tree_packs &&
+	! grep mypackname2 head_tree_packs &&
+	test-tool find-pack --check-count 2 HEAD^{tree} &&
+	! test-tool find-pack --check-count 1 HEAD^{tree} &&
+
+	# HEAD:five.t is also in 2 packfiles
+	test-tool find-pack HEAD:five.t >five_packs &&
+	grep "$head_commit_pack" five_packs &&
+	! grep mypackname1 five_packs &&
+	grep mypackname2 five_packs &&
+	test-tool find-pack -c 2 HEAD:five.t &&
+	! test-tool find-pack --check-count=0 HEAD:five.t
+'
+
+test_expect_success 'add more commits (as loose objects)' '
+	test_commit six &&
+	test_commit seven &&
+
+	test -z "$(test-tool find-pack HEAD)" &&
+	test -z "$(test-tool find-pack HEAD:six.t)" &&
+	test-tool find-pack --check-count 0 HEAD &&
+	test-tool find-pack -c 0 HEAD:six.t &&
+	! test-tool find-pack -c 1 HEAD:seven.t
+'
+
+test_done

From patchwork Mon Oct 2 16:54:58 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13406472
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder
Subject: [PATCH v8 3/9] repack: refactor finishing pack-objects command
Date: Mon, 2 Oct 2023 18:54:58 +0200
Message-ID: <20231002165504.1325153-4-christian.couder@gmail.com>
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>

Create a new finish_pack_objects_cmd() to refactor duplicated code
that handles reading the packfile names from the output of a
`git pack-objects` command and putting them into a string_list, as well
as calling finish_command().

While at it, beautify a code comment a bit in the new function.

Signed-off-by: Christian Couder
out, "r");
+	while (strbuf_getline_lf(&line, out) != EOF) {
+		struct string_list_item *item;
+
+		if (line.len != the_hash_algo->hexsz)
+			die(_("repack: Expecting full hex object ID lines only "
+			      "from pack-objects."));
+		/*
+		 * Avoid putting packs written outside of the repository in the
+		 * list of names.
+		 */
+		if (local) {
+			item = string_list_append(names, line.buf);
+			item->util = populate_pack_exts(line.buf);
+		}
+	}
+	fclose(out);
+
+	strbuf_release(&line);
+
+	return finish_command(cmd);
+}
+
 static int write_cruft_pack(const struct pack_objects_args *args,
 			    const char *destination,
 			    const char *pack_prefix,
@@ -814,9 +844,8 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 			    struct existing_packs *existing)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
-	struct strbuf line = STRBUF_INIT;
 	struct string_list_item *item;
-	FILE *in, *out;
+	FILE *in;
 	int ret;
 	const char *scratch;
 	int local = skip_prefix(destination, packdir, &scratch);
@@ -861,27 +890,7 @@ static int write_cruft_pack(const struct pack_objects_args *args,
 		fprintf(in, "%s.pack\n", item->string);
 	fclose(in);
 
-	out = xfdopen(cmd.out, "r");
-	while (strbuf_getline_lf(&line, out) != EOF) {
-		struct string_list_item *item;
-
-		if (line.len != the_hash_algo->hexsz)
-			die(_("repack: Expecting full hex object ID lines only "
-			      "from pack-objects."));
-		/*
-		 * avoid putting packs written outside of the repository in the
-		 * list of names
-		 */
-		if (local) {
-			item = string_list_append(names, line.buf);
-			item->util = populate_pack_exts(line.buf);
-		}
-	}
-	fclose(out);
-
-	strbuf_release(&line);
-
-	return finish_command(&cmd);
+	return finish_pack_objects_cmd(&cmd, names, local);
 }
 
 int cmd_repack(int argc, const char **argv, const char *prefix)
@@ -891,10 +900,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	struct string_list names = STRING_LIST_INIT_DUP;
 	struct existing_packs existing = EXISTING_PACKS_INIT;
 	struct pack_geometry geometry = { 0 };
-	struct strbuf line = STRBUF_INIT;
 	struct tempfile *refs_snapshot = NULL;
 	int i, ext, ret;
-	FILE *out;
 	int show_progress;
 
 	/* variables to be filled by option parsing */
@@ -1124,18 +1131,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		fclose(in);
 	}
 
-	out = xfdopen(cmd.out, "r");
-	while (strbuf_getline_lf(&line, out) != EOF) {
-		struct string_list_item *item;
-
-		if (line.len != the_hash_algo->hexsz)
-			die(_("repack: Expecting full hex object ID lines only from pack-objects."));
-		item = string_list_append(&names, line.buf);
-		item->util = populate_pack_exts(item->string);
-	}
-	strbuf_release(&line);
-	fclose(out);
-	ret = finish_command(&cmd);
+	ret = finish_pack_objects_cmd(&cmd, &names, 1);
 	if (ret)
 		goto cleanup;
 

From patchwork Mon Oct 2 16:54:59 2023
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13406475
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder
Subject: [PATCH v8 4/9] repack: refactor finding pack prefix
Date: Mon, 2 Oct 2023 18:54:59 +0200
Message-ID: <20231002165504.1325153-5-christian.couder@gmail.com>
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>

Create a new find_pack_prefix() to refactor code that handles finding
the pack prefix from the packtmp and packdir global variables, as we are
going to need this feature again in a following commit.
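
As an illustration only, a prefix-finding helper of this kind might look
roughly like the standalone sketch below. It is not the code from this
patch: `strip_prefix()` and `sketch_find_pack_prefix()` are hypothetical
names, the example paths are made up, and the sketch returns NULL on a
mismatch where in-tree code would more likely use Git's own
`skip_prefix()` and `die()`.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a prefix-stripping helper: if `str` starts
 * with `prefix`, return a pointer just past it, otherwise NULL.
 */
static const char *strip_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return strncmp(str, prefix, len) ? NULL : str + len;
}

/*
 * Given the pack directory and a temporary pack path inside it, return
 * the part of `packtmp` that follows `packdir`, without a leading '/'.
 * Return NULL if `packtmp` does not begin with `packdir`.
 */
static const char *sketch_find_pack_prefix(const char *packdir,
					   const char *packtmp)
{
	const char *pack_prefix = strip_prefix(packtmp, packdir);

	if (!pack_prefix)
		return NULL;
	if (*pack_prefix == '/')
		pack_prefix++;
	return pack_prefix;
}
```

Returning a pointer into `packtmp` avoids any allocation, which seems a
reasonable fit for values that live in global variables for the whole
run of the command.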

Signed-off-by: Christian Couder

X-Patchwork-Id: 13406473
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v8 5/9] pack-bitmap-write: rebuild using new bitmap when remapping
Date: Mon, 2 Oct 2023 18:55:00 +0200
Message-ID: <20231002165504.1325153-6-christian.couder@gmail.com>
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>

`git repack` is about to learn a new `--filter=<filter-spec>` option and
we will want to check that this option is incompatible with
`--write-bitmap-index`.
Unfortunately it appears that a test like: test_expect_success '--filter fails with --write-bitmap-index' ' test_must_fail \ env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none ' sometimes fails because when rebuilding bitmaps, it appears that we are reusing existing bitmap information. So instead of detecting that some objects are missing and erroring out as it should, the `git repack --write-bitmap-index --filter=...` command succeeds. Let's fix that by making sure we rebuild bitmaps using new bitmaps instead of existing ones. Helped-by: Taylor Blau Signed-off-by: Christian Couder --- pack-bitmap-write.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index f6757c3cbf..f4ecdf8b0e 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -413,15 +413,19 @@ static int fill_bitmap_commit(struct bb_commit *ent, if (old_bitmap && mapping) { struct ewah_bitmap *old = bitmap_for_commit(old_bitmap, c); + struct bitmap *remapped = bitmap_new(); /* * If this commit has an old bitmap, then translate that * bitmap and add its bits to this one. No need to walk * parents or the tree for this commit.
*/ - if (old && !rebuild_bitmap(mapping, old, ent->bitmap)) { + if (old && !rebuild_bitmap(mapping, old, remapped)) { + bitmap_or(ent->bitmap, remapped); + bitmap_free(remapped); reused_bitmaps_nr++; continue; } + bitmap_free(remapped); } /* From patchwork Mon Oct 2 16:55:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13406478 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 911D6E748FB for ; Mon, 2 Oct 2023 16:55:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238561AbjJBQzz (ORCPT ); Mon, 2 Oct 2023 12:55:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238557AbjJBQzm (ORCPT ); Mon, 2 Oct 2023 12:55:42 -0400 Received: from mail-ed1-x532.google.com (mail-ed1-x532.google.com [IPv6:2a00:1450:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED322E8 for ; Mon, 2 Oct 2023 09:55:36 -0700 (PDT) Received: by mail-ed1-x532.google.com with SMTP id 4fb4d7f45d1cf-534659061afso13452042a12.3 for ; Mon, 02 Oct 2023 09:55:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696265735; x=1696870535; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0Exkm0g9UlfEBlVv4FQte8xMmGskn5N+nolz1YqLcKQ=; b=mVQsUz1p8/+ck4zwVE2OBlVwkK1K/axOSOdUFFfCstaNU4QzepuJQkYJw2GIaMcKri Tw77EyfmWSXjFyHGhQ3vNfX28bgEZEC7OLo/bplrZOAKUlK3hK3OYhP3zSs10crAA5XC 4nU2HCobm8lS9EYq/N01kZQ8QIW3ErMvcRzzRUPfzK/Q/E4TTih1wpUmTKts01SeP1pl 
bmvl6ElsVjKkBwP8/cXTiAOUttr0vWLdYedhVyonlwur3JiqEUYgrnkIMIIFKd3z8RkR vPSaPz5hCm8eihm4c6x++RnLKh97kQt5ajWNlfY0LLPH6tOBwimaWjxvpMnaZFTM0eSP xc7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696265735; x=1696870535; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0Exkm0g9UlfEBlVv4FQte8xMmGskn5N+nolz1YqLcKQ=; b=CSPpqPWfjfXT9kAQ1gQ1iREZmd4S8TI+XbGw36GonGEFpzZkUHuWnPWmMRAtYYefOw Z5Q4Gu7B7l+ydIQ9ZEgIH70Pmp6pMmpkuAoFgN/BEzPI6UiNKVmYHx6wLsoXJO5owIuW PNyDpI/cv3Lc491j/X6uDl5enQUzzOi8QtMFr8SWpMxtFsp/KhmSxT+G0fg4BeN07bSQ j0dhGrmmGU3o19dal6PVN7NmSQRCx2LI7qtYk3xo8dbfgcXb+iaWTTBab6VlHkF7S9Dg PG2X0/2OentEYn44ENLLOPwZh95ZgJLWsyb2PBRaRjaZGT7xgKptASX0maatAVd/s+lj 0zpg== X-Gm-Message-State: AOJu0YzMIUeSh4ta0TX/kEhkBNZbZXNk7mhCg3fi4r6dEk0IEL02T+/O 0wXh9du1eXkirahG4EfVzLgj5Cn/DiUuWA== X-Google-Smtp-Source: AGHT+IEdVKUP7nw/n1UJAOOpMLv3ztwtHynks4nGRJpRiaCtS5rMuXRQL1F4L9M3LCdiZlv27ksl0w== X-Received: by 2002:a05:6402:785:b0:533:1832:f2b4 with SMTP id d5-20020a056402078500b005331832f2b4mr10877816edy.13.1696265734838; Mon, 02 Oct 2023 09:55:34 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:c027:f1d4:d825:fbf4:9197:5c9f]) by smtp.gmail.com with ESMTPSA id er15-20020a056402448f00b00533c844e337sm12762364edb.85.2023.10.02.09.55.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 09:55:33 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v8 6/9] repack: add `--filter=<filter-spec>` option Date: Mon, 2 Oct 2023 18:55:01 +0200 Message-ID: <20231002165504.1325153-7-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.305.g5bfd918c90 In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com> References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This new option puts the objects specified by `<filter-spec>` into a separate packfile. This could be useful if, for example, some blobs take up a lot of precious space on fast storage while they are rarely accessed. It could make sense to move them into separate, cheaper though slower, storage. It's possible to find which new packfile contains the filtered out objects using one of the following: - `git verify-pack -v ...`, - `test-tool find-pack ...`, which a previous commit added, - `--filter-to=<dir>`, which a following commit will add to specify where the pack containing the filtered out objects will be. This feature is implemented by running `git pack-objects` twice in a row. The first command is run with `--filter=<filter-spec>`, using the specified filter. It packs objects while omitting the objects specified by the filter. Then another `git pack-objects` command is launched using `--stdin-packs`. We pass all the previously existing packs to its stdin, so that it will pack all the objects in the previously existing packs.
But we also pass to its stdin the pack created by the previous `git pack-objects --filter=<filter-spec>` command as well as the kept packs, all prefixed with '^', so that the objects in these packs will be omitted from the resulting pack. The result is that only the objects filtered out by the first `git pack-objects` command are in the pack resulting from the second `git pack-objects` command. As the interactions with kept packs are a bit tricky, a few related tests are added. Helped-by: Taylor Blau Signed-off-by: John Cai Signed-off-by: Christian Couder --- Documentation/git-repack.txt | 12 ++++ builtin/repack.c | 70 ++++++++++++++++++ t/t7700-repack.sh | 135 +++++++++++++++++++++++++++++++++++ 3 files changed, 217 insertions(+) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 4017157949..6d5bec7716 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -143,6 +143,18 @@ depth is 4095. a larger and slower repository; see the discussion in `pack.packSizeLimit`. +--filter=<filter-spec>:: + Remove objects matching the filter specification from the + resulting packfile and put them into a separate packfile. Note + that objects used in the working directory are not filtered + out. So for the split to fully work, it's best to perform it + in a bare repo and to use the `-a` and `-d` options along with + this option. Also `--no-write-bitmap-index` (or the + `repack.writebitmaps` config option set to `false`) should be + used otherwise writing bitmap index will fail, as it supposes + a single packfile containing all the objects. See + linkgit:git-rev-list[1] for valid `<filter-spec>` forms. + -b:: --write-bitmap-index:: Write a reachability bitmap index as part of the repack.
This diff --git a/builtin/repack.c b/builtin/repack.c index 9ef0044384..c7b564192f 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -21,6 +21,7 @@ #include "pack.h" #include "pack-bitmap.h" #include "refs.h" +#include "list-objects-filter-options.h" #define ALL_INTO_ONE 1 #define LOOSEN_UNREACHABLE 2 @@ -56,6 +57,7 @@ struct pack_objects_args { int no_reuse_object; int quiet; int local; + struct list_objects_filter_options filter_options; }; static int repack_config(const char *var, const char *value, @@ -836,6 +838,56 @@ static int finish_pack_objects_cmd(struct child_process *cmd, return finish_command(cmd); } +static int write_filtered_pack(const struct pack_objects_args *args, + const char *destination, + const char *pack_prefix, + struct existing_packs *existing, + struct string_list *names) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list_item *item; + FILE *in; + int ret; + const char *caret; + const char *scratch; + int local = skip_prefix(destination, packdir, &scratch); + + prepare_pack_objects(&cmd, args, destination); + + strvec_push(&cmd.args, "--stdin-packs"); + + if (!pack_kept_objects) + strvec_push(&cmd.args, "--honor-pack-keep"); + for_each_string_list_item(item, &existing->kept_packs) + strvec_pushf(&cmd.args, "--keep-pack=%s", item->string); + + cmd.in = -1; + + ret = start_command(&cmd); + if (ret) + return ret; + + /* + * Here 'names' contains only the pack(s) that were just + * written, which is exactly the packs we want to keep. Also + * 'existing_kept_packs' already contains the packs in + * 'keep_pack_list'. + */ + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, names) + fprintf(in, "^%s-%s.pack\n", pack_prefix, item->string); + for_each_string_list_item(item, &existing->non_kept_packs) + fprintf(in, "%s.pack\n", item->string); + for_each_string_list_item(item, &existing->cruft_packs) + fprintf(in, "%s.pack\n", item->string); + caret = pack_kept_objects ? 
"" : "^"; + for_each_string_list_item(item, &existing->kept_packs) + fprintf(in, "%s%s.pack\n", caret, item->string); + fclose(in); + + return finish_pack_objects_cmd(&cmd, names, local); +} + static int write_cruft_pack(const struct pack_objects_args *args, const char *destination, const char *pack_prefix, @@ -966,6 +1018,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("limits the maximum number of threads")), OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"), N_("maximum size of each packfile")), + OPT_PARSE_LIST_OBJECTS_FILTER(&po_args.filter_options), OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects, N_("repack objects in packs marked with .keep")), OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"), @@ -979,6 +1032,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) OPT_END() }; + list_objects_filter_init(&po_args.filter_options); + git_config(repack_config, &cruft_po_args); argc = parse_options(argc, argv, prefix, builtin_repack_options, @@ -1119,6 +1174,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) strvec_push(&cmd.args, "--incremental"); } + if (po_args.filter_options.choice) + strvec_pushf(&cmd.args, "--filter=%s", + expand_list_objects_filter_spec(&po_args.filter_options)); + if (geometry.split_factor) cmd.in = -1; else @@ -1205,6 +1264,16 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } } + if (po_args.filter_options.choice) { + ret = write_filtered_pack(&po_args, + packtmp, + find_pack_prefix(packdir, packtmp), + &existing, + &names); + if (ret) + goto cleanup; + } + string_list_sort(&names); close_object_store(the_repository->objects); @@ -1297,6 +1366,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) string_list_clear(&names, 1); existing_packs_release(&existing); free_pack_geometry(&geometry); + list_objects_filter_release(&po_args.filter_options); return ret; } diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 
27b66807cd..39e89445fd 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -327,6 +327,141 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' test_must_be_empty actual ' +test_expect_success 'repacking with a filter works' ' + git -C bare.git repack -a -d && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + git -C bare.git -c repack.writebitmaps=false repack -a -d --filter=blob:none && + test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack && + commit_pack=$(test-tool -C bare.git find-pack -c 1 HEAD) && + blob_pack=$(test-tool -C bare.git find-pack -c 1 HEAD:file1) && + test "$commit_pack" != "$blob_pack" && + tree_pack=$(test-tool -C bare.git find-pack -c 1 HEAD^{tree}) && + test "$tree_pack" = "$commit_pack" && + blob_pack2=$(test-tool -C bare.git find-pack -c 1 HEAD:file2) && + test "$blob_pack2" = "$blob_pack" +' + +test_expect_success '--filter fails with --write-bitmap-index' ' + test_must_fail \ + env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none +' + +test_expect_success 'repacking with two filters works' ' + git init two-filters && + ( + cd two-filters && + mkdir subdir && + test_commit foo && + test_commit subdir_bar subdir/bar && + test_commit subdir_baz subdir/baz + ) && + git clone --no-local --bare two-filters two-filters.git && + ( + cd two-filters.git && + test_stdout_line_count = 1 ls objects/pack/*.pack && + git -c repack.writebitmaps=false repack -a -d \ + --filter=blob:none --filter=tree:1 && + test_stdout_line_count = 2 ls objects/pack/*.pack && + commit_pack=$(test-tool find-pack -c 1 HEAD) && + blob_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + root_tree_pack=$(test-tool find-pack -c 1 HEAD^{tree}) && + subdir_tree_hash=$(git ls-tree --object-only HEAD -- subdir) && + subdir_tree_pack=$(test-tool find-pack -c 1 "$subdir_tree_hash") && + + # Root tree and subdir tree are not in the same packfiles + test "$commit_pack" != 
"$blob_pack" && + test "$commit_pack" = "$root_tree_pack" && + test "$blob_pack" = "$subdir_tree_pack" + ) +' + +prepare_for_keep_packs () { + git init keep-packs && + ( + cd keep-packs && + test_commit foo && + test_commit bar + ) && + git clone --no-local --bare keep-packs keep-packs.git && + ( + cd keep-packs.git && + + # Create two packs + # The first pack will contain all of the objects except one blob + git rev-list --objects --all >objs && + grep -v "bar.t" objs | git pack-objects pack && + # The second pack will contain the excluded object and be kept + packid=$(grep "bar.t" objs | git pack-objects pack) && + >pack-$packid.keep && + + # Replace the existing pack with the 2 new ones + rm -f objects/pack/pack* && + mv pack-* objects/pack/ + ) +} + +test_expect_success '--filter works with .keep packs' ' + prepare_for_keep_packs && + ( + cd keep-packs.git && + + foo_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack" != "$bar_pack" && + test "$foo_pack" = "$head_pack" && + + git -c repack.writebitmaps=false repack -a -d --filter=blob:none && + + foo_pack_1=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack_1=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack_1=$(test-tool find-pack -c 1 HEAD) && + + # Object bar is still only in the old .keep pack + test "$foo_pack_1" != "$foo_pack" && + test "$bar_pack_1" = "$bar_pack" && + test "$head_pack_1" != "$head_pack" && + + test "$foo_pack_1" != "$bar_pack_1" && + test "$foo_pack_1" != "$head_pack_1" && + test "$bar_pack_1" != "$head_pack_1" + ) +' + +test_expect_success '--filter works with --pack-kept-objects and .keep packs' ' + rm -rf keep-packs keep-packs.git && + prepare_for_keep_packs && + ( + cd keep-packs.git && + + foo_pack=$(test-tool find-pack -c 1 HEAD:foo.t) && + bar_pack=$(test-tool find-pack -c 1 HEAD:bar.t) && + head_pack=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack" != 
"$bar_pack" && + test "$foo_pack" = "$head_pack" && + + git -c repack.writebitmaps=false repack -a -d --filter=blob:none \ + --pack-kept-objects && + + foo_pack_1=$(test-tool find-pack -c 1 HEAD:foo.t) && + test-tool find-pack -c 2 HEAD:bar.t >bar_pack_1 && + head_pack_1=$(test-tool find-pack -c 1 HEAD) && + + test "$foo_pack_1" != "$foo_pack" && + test "$foo_pack_1" != "$bar_pack" && + test "$head_pack_1" != "$head_pack" && + + # Object bar is in both the old .keep pack and the new + # pack that contained the filtered out objects + grep "$bar_pack" bar_pack_1 && + grep "$foo_pack_1" bar_pack_1 && + test "$foo_pack_1" != "$head_pack_1" + ) +' + objdir=.git/objects midx=$objdir/pack/multi-pack-index From patchwork Mon Oct 2 16:55:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13406474 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C8B6E748FB for ; Mon, 2 Oct 2023 16:55:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238551AbjJBQzt (ORCPT ); Mon, 2 Oct 2023 12:55:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48434 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238556AbjJBQzm (ORCPT ); Mon, 2 Oct 2023 12:55:42 -0400 Received: from mail-ed1-x535.google.com (mail-ed1-x535.google.com [IPv6:2a00:1450:4864:20::535]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7EA4BF0 for ; Mon, 2 Oct 2023 09:55:38 -0700 (PDT) Received: by mail-ed1-x535.google.com with SMTP id 4fb4d7f45d1cf-523100882f2so21998573a12.2 for ; Mon, 02 Oct 2023 09:55:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696265736; x=1696870536; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fgndZkQPDQIo1GHzSCM/s3yh+5pQEtkV39B8KuAzcxs=; b=b6SoNVrbw137bc4bb/Lah0xz7FcNbSVdVnviouRhKJ8BM6DSeifAfkpxHhkApw+8lE Sx1xaB/DYV8w1+GuD067AjxsFeAvqOSgMiv1a73hTNU+9FL7QvJFp4GR6crHhJKSKYop Aequ8nrrGAIKjAEi63kk2nCTgP28Izkf00MR4LJavZlJ3A9pXug/FMls5XHePCKyKBIv 1RqGJe694m/D4FtjORUnlGYswyqXUorZ0vPLh7hZEy31ZNNJbZZS3KrX7TCliFQ1Y8L/ JikmTxWdnHiLHEGggmbdaWaX0iWkkH9OPR07gLaFFY/P3r5HQouasMc03jZgiXfgILVH /YGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696265736; x=1696870536; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fgndZkQPDQIo1GHzSCM/s3yh+5pQEtkV39B8KuAzcxs=; b=p5xOpUKq66FEjGdb1gq/zaEQ09En+VovAWqjQNQkWQPbmrhWOUgaMVa5QzHfL308eF GZfup9nzQ3CnsYXvfFV6SW6PMIubtaVvvBfWISSLJpR5LP2Hx/3PjIYaQ/rsy5rFMNsz rrSQ0RYX4gQCNOwvaCjGGbpgKzx4nZG1YHK0MQrWFBhhtQ2v/OxcSuF9DMElBBCqrr9Y dWlMA2rPaj7RL8QiKf9A5Ee6Rqii1c9dp7zF3a8WXK7707HAQCr5k4To4JDH1Y2fsk/2 dWtWHtDzSW79AB2PnxMJrkyPkwl5Ink521as0AVJlKH5nXXpf4C3hC/BPBFEiRul7ELx RoKw== X-Gm-Message-State: AOJu0YyqaEGnBUBtd3ZA1+CmWS+q8VDFd2Mzh51MOlXkYY7Bmjz4Thd7 jcszewmS8e457jpbfa/DlP0r8MjYaqnLEw== X-Google-Smtp-Source: AGHT+IFyRCduWPl7+MLHerOA94xNNZmCdDmWLW5PFOYkAb5D+1d4QGZBmfvSnWEOOI8Z2HOT2V5QVQ== X-Received: by 2002:aa7:d98e:0:b0:527:251e:1be8 with SMTP id u14-20020aa7d98e000000b00527251e1be8mr10080317eds.13.1696265736342; Mon, 02 Oct 2023 09:55:36 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:c027:f1d4:d825:fbf4:9197:5c9f]) by smtp.gmail.com with ESMTPSA id er15-20020a056402448f00b00533c844e337sm12762364edb.85.2023.10.02.09.55.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 09:55:35 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v8 7/9] gc: add `gc.repackFilter` config option Date: Mon, 2 Oct 2023 18:55:02 +0200 Message-ID: <20231002165504.1325153-8-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.305.g5bfd918c90 In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com> References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new, separate pack. Users might want to perform such a cleanup regularly at the same time as they perform other repacks and cleanups, that is, as part of `git gc`. Let's allow them to configure a <filter-spec> for that purpose using a new `gc.repackFilter` config option. Now, when `git gc` performs a repack and a non-empty <filter-spec> has been configured through this option, the repack process is passed a corresponding `--filter=<filter-spec>` argument.
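For illustration only (this is not part of the patch), the new option could be exercised roughly as sketched below. The scratch repository path is a made-up placeholder, and `repack.writeBitmaps` is turned off on the assumption, taken from the `--filter` documentation, that writing a bitmap index fails once objects are split across packs.

```shell
# Hypothetical scratch repository, created only for this illustration.
repo=$(mktemp -d)/example.git
git init --quiet --bare "$repo"

# Ask `git gc` to hand `--filter=blob:none` to the repack it launches.
git -C "$repo" config gc.repackFilter blob:none

# A bitmap index supposes a single pack with all the objects, so disable it.
git -C "$repo" config repack.writeBitmaps false

# On a Git that knows `gc.repackFilter`, this repack is run with the filter;
# an older Git is expected to simply ignore the unknown config option.
git -C "$repo" gc --quiet
```

In a repository that actually contains blobs, the filtered out objects would be expected to end up in a second packfile next to the main one.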
Signed-off-by: Christian Couder --- Documentation/config/gc.txt | 5 +++++ builtin/gc.c | 6 ++++++ t/t6500-gc.sh | 13 +++++++++++++ 3 files changed, 24 insertions(+) diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt index ca47eb2008..2153bde7ac 100644 --- a/Documentation/config/gc.txt +++ b/Documentation/config/gc.txt @@ -145,6 +145,11 @@ Multiple hooks are supported, but all must exit successfully, else the operation (either generating a cruft pack or unpacking unreachable objects) will be halted. +gc.repackFilter:: + When repacking, use the specified filter to move certain + objects into a separate packfile. See the + `--filter=` option of linkgit:git-repack[1]. + gc.rerereResolved:: Records of conflicted merge you resolved earlier are kept for this many days when 'git rerere gc' is run. diff --git a/builtin/gc.c b/builtin/gc.c index 00192ae5d3..98148e98fe 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -61,6 +61,7 @@ static timestamp_t gc_log_expire_time; static const char *gc_log_expire = "1.day.ago"; static const char *prune_expire = "2.weeks.ago"; static const char *prune_worktrees_expire = "3.months.ago"; +static char *repack_filter; static unsigned long big_pack_threshold; static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE; @@ -170,6 +171,8 @@ static void gc_config(void) git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold); git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size); + git_config_get_string("gc.repackfilter", &repack_filter); + git_config(git_default_config, NULL); } @@ -355,6 +358,9 @@ static void add_repack_all_option(struct string_list *keep_pack) if (keep_pack) for_each_string_list(keep_pack, keep_one_pack, NULL); + + if (repack_filter && *repack_filter) + strvec_pushf(&repack, "--filter=%s", repack_filter); } static void add_repack_incremental_option(void) diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 69509d0c11..232e403b66 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ 
-202,6 +202,19 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out ' +test_expect_success 'gc.repackFilter launches repack with a filter' ' + test_when_finished "rm -rf bare.git" && + git clone --no-local --bare . bare.git && + + git -C bare.git -c gc.cruftPacks=false gc && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + + GIT_TRACE=$(pwd)/trace.out git -C bare.git -c gc.repackFilter=blob:none \ + -c repack.writeBitmaps=false -c gc.cruftPacks=false gc && + test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack && + grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out +' + prepare_cruft_history () { test_commit base && From patchwork Mon Oct 2 16:55:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13406477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5842EE748FE for ; Mon, 2 Oct 2023 16:55:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238558AbjJBQzx (ORCPT ); Mon, 2 Oct 2023 12:55:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48480 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238562AbjJBQzm (ORCPT ); Mon, 2 Oct 2023 12:55:42 -0400 Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com [IPv6:2a00:1450:4864:20::52a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76AC3FD for ; Mon, 2 Oct 2023 09:55:39 -0700 (PDT) Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-53829312d12so4069880a12.0 for ; Mon, 02 Oct 2023 09:55:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696265737; x=1696870537; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bcDCGtCuL69vamozo+DX4dhtNjlpqdJbSwNEKqSuzJg=; b=X3ktopdr44OKhrtQ9pJYdriM05qaPOG3dXWqHeUhDxwir9syJDcd7IGZa2HLCkqaDz UpzV7KHmA1LyY++b+xpGIKpQ+9/Sd3HXEH1vTPn3huU9eEaGG7dbsZFS7DEOa3GkLOs2 C7JhB5a61x+bL8nwEL8fTrlBKY7Yh9GAfiKoh3RG6HqS7O+6LPMp7RvGlLn9SoPmggPN H+OWk35BFQCVh32nvtQi0CjgwSyuyU1/ARc21kMnynis/f0E1++0v6UIsoDQHG2kmVMW PcHj9ZZZYXC60QEFz3xLnxY5DZ9wb5vqMSuw2nByghkstU3+j4sWPmSxdWgoXAc/B/2C O1DQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696265737; x=1696870537; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bcDCGtCuL69vamozo+DX4dhtNjlpqdJbSwNEKqSuzJg=; b=MZ9HtjFriMRkZg7+bcs6E4VGzv3t0woWIrc2d93cJc1WpjLbPx5J++QaVCeel7QwE5 Y4sKZq8LYCRI8roRD72T+EhfYw20yqE0hLrGYaTh668ALFghA/3yalVpmgT4xGzKVhoN 2NCwRFoFrwGkjZqqZzy7ylRSIUHIwkHIhh0UhW6MyTrzgi5rJ6WIjoCo2nK0GS/WVqAc d0JuQOaMO9QtpSWXiFa80zr+SWD/2/FoWbDkjr9P/DIeRLbCUt8FIno3C7kpWerA8wQt h6D2u+6SQ0aKQ43Gp6R0O/zmImEPFeVkX7U+inPMs4L6XtaI3q7/Rpd4+dMT+AirTwbq 9JFw== X-Gm-Message-State: AOJu0YyhmviniP/cxvKlSCbdJXbmiYPUXmunyY7pPMNiy8pQza94MosI 7XO4N1kqUCQ+S5nZxkUhoqhXlJ9Rk2J8Kg== X-Google-Smtp-Source: AGHT+IEiw5BZpT03NapE3zmOVeGzEmwv9foY0kVB3zDZT0ux9h5udlerrgyDYpcF9hxUbIPOf6zTPA== X-Received: by 2002:aa7:d0c7:0:b0:533:ccec:552 with SMTP id u7-20020aa7d0c7000000b00533ccec0552mr240279edo.9.1696265737543; Mon, 02 Oct 2023 09:55:37 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:c027:f1d4:d825:fbf4:9197:5c9f]) by smtp.gmail.com with ESMTPSA id er15-20020a056402448f00b00533c844e337sm12762364edb.85.2023.10.02.09.55.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 09:55:37 -0700 (PDT) From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v8 8/9] repack: implement `--filter-to` for storing filtered out objects Date: Mon, 2 Oct 2023 18:55:03 +0200 Message-ID: <20231002165504.1325153-9-christian.couder@gmail.com> X-Mailer: git-send-email 2.42.0.305.g5bfd918c90 In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com> References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A previous commit has implemented `git repack --filter=<filter-spec>` to allow users to filter out some objects from the main pack and move them into a new, separate pack. It would be nice if this new pack could be created in a different directory than the regular pack. This would make it possible to move large blobs into a pack on a different kind of storage, for example cheaper storage. Even in a different directory, this pack can be accessible if, for example, the Git alternates mechanism is used to point to it. In fact, not using the Git alternates mechanism can corrupt a repo, as the generated pack containing the filtered objects might not be accessible from the repo any more. So setting up the Git alternates mechanism should be done before using this feature if the user wants the repo to be fully usable while this feature is used. In some cases, like when a repo has just been cloned or when there is no other activity in the repo, it's OK to set up the Git alternates mechanism afterwards though.
It's also OK to just inspect the generated packfile containing the filtered objects and then just move it into the '.git/objects/pack/' directory manually. That's why it's not necessary for this command to check that the Git alternates mechanism has already been set up. While at it, as an example to show that `--filter` and `--filter-to` work well with other options, let's also add a test to check that these options work well with `--max-pack-size`. Signed-off-by: Christian Couder --- Documentation/git-repack.txt | 11 +++++++ builtin/repack.c | 10 +++++- t/t7700-repack.sh | 62 ++++++++++++++++++++++++++++++++++++ 3 files changed, 82 insertions(+), 1 deletion(-) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 6d5bec7716..8545a32667 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -155,6 +155,17 @@ depth is 4095. a single packfile containing all the objects. See linkgit:git-rev-list[1] for valid `<filter-spec>` forms. +--filter-to=<dir>:: + Write the pack containing filtered out objects to the + directory `<dir>`. Only useful with `--filter`. This can be + used for putting the pack on a separate object directory that + is accessed through the Git alternates mechanism. **WARNING:** + If the packfile containing the filtered out objects is not + accessible, the repo can become corrupt as it might not be + possible to access the objects in that packfile. See the + `objects` and `objects/info/alternates` sections of + linkgit:gitrepository-layout[5]. + -b:: --write-bitmap-index:: Write a reachability bitmap index as part of the repack.
This diff --git a/builtin/repack.c b/builtin/repack.c index c7b564192f..db9277081d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -977,6 +977,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) int write_midx = 0; const char *cruft_expiration = NULL; const char *expire_to = NULL; + const char *filter_to = NULL; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, &pack_everything, @@ -1029,6 +1030,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("write a multi-pack index of the resulting packs")), OPT_STRING(0, "expire-to", &expire_to, N_("dir"), N_("pack prefix to store a pack containing pruned objects")), + OPT_STRING(0, "filter-to", &filter_to, N_("dir"), + N_("pack prefix to store a pack containing filtered out objects")), OPT_END() }; @@ -1177,6 +1180,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (po_args.filter_options.choice) strvec_pushf(&cmd.args, "--filter=%s", expand_list_objects_filter_spec(&po_args.filter_options)); + else if (filter_to) + die(_("option '%s' can only be used along with '%s'"), "--filter-to", "--filter"); if (geometry.split_factor) cmd.in = -1; @@ -1265,8 +1270,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } if (po_args.filter_options.choice) { + if (!filter_to) + filter_to = packtmp; + ret = write_filtered_pack(&po_args, - packtmp, + filter_to, find_pack_prefix(packdir, packtmp), &existing, &names); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 39e89445fd..48e92aa6f7 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -462,6 +462,68 @@ test_expect_success '--filter works with --pack-kept-objects and .keep packs' ' ) ' +test_expect_success '--filter-to stores filtered out objects' ' + git -C bare.git repack -a -d && + test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack && + + git init --bare filtered.git && + git -C bare.git -c repack.writebitmaps=false repack -a -d \ + --filter=blob:none \ + 
+		--filter-to=../filtered.git/objects/pack/pack &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/pack-*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/pack-*.pack &&
+
+	commit_pack=$(test-tool -C bare.git find-pack -c 1 HEAD) &&
+	blob_pack=$(test-tool -C bare.git find-pack -c 0 HEAD:file1) &&
+	blob_hash=$(git -C bare.git rev-parse HEAD:file1) &&
+	test -n "$blob_hash" &&
+	blob_pack=$(test-tool -C filtered.git find-pack -c 1 $blob_hash) &&
+
+	echo $(pwd)/filtered.git/objects >bare.git/objects/info/alternates &&
+	blob_pack=$(test-tool -C bare.git find-pack -c 1 HEAD:file1) &&
+	blob_content=$(git -C bare.git show $blob_hash) &&
+	test "$blob_content" = "content1"
+'
+
+test_expect_success '--filter works with --max-pack-size' '
+	rm -rf filtered.git &&
+	git init --bare filtered.git &&
+	git init max-pack-size &&
+	(
+		cd max-pack-size &&
+		test_commit base &&
+		# two blobs which exceed the maximum pack size
+		test-tool genrandom foo 1048576 >foo &&
+		git hash-object -w foo &&
+		test-tool genrandom bar 1048576 >bar &&
+		git hash-object -w bar &&
+		git add foo bar &&
+		git commit -m "adding foo and bar"
+	) &&
+	git clone --no-local --bare max-pack-size max-pack-size.git &&
+	(
+		cd max-pack-size.git &&
+		git -c repack.writebitmaps=false repack -a -d --filter=blob:none \
+			--max-pack-size=1M \
+			--filter-to=../filtered.git/objects/pack/pack &&
+		echo $(cd .. && pwd)/filtered.git/objects >objects/info/alternates &&
+
+		# Check that the 3 blobs are in different packfiles in filtered.git
+		test_stdout_line_count = 3 ls ../filtered.git/objects/pack/pack-*.pack &&
+		test_stdout_line_count = 1 ls objects/pack/pack-*.pack &&
+		foo_pack=$(test-tool find-pack -c 1 HEAD:foo) &&
+		bar_pack=$(test-tool find-pack -c 1 HEAD:bar) &&
+		base_pack=$(test-tool find-pack -c 1 HEAD:base.t) &&
+		test "$foo_pack" != "$bar_pack" &&
+		test "$foo_pack" != "$base_pack" &&
+		test "$bar_pack" != "$base_pack" &&
+		for pack in "$foo_pack" "$bar_pack" "$base_pack"
+		do
+			case "$pack" in */filtered.git/objects/pack/*) true ;; *) return 1 ;; esac
+		done
+	)
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 

From patchwork Mon Oct 2 16:55:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13406476
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2844DE784AB for ; Mon, 2 Oct 2023 16:55:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238554AbjJBQzv (ORCPT ); Mon, 2 Oct 2023 12:55:51 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238538AbjJBQzo (ORCPT ); Mon, 2 Oct 2023 12:55:44 -0400
Received: from mail-ed1-x532.google.com (mail-ed1-x532.google.com [IPv6:2a00:1450:4864:20::532]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E12DEA9 for ; Mon, 2 Oct 2023 09:55:40 -0700 (PDT)
Received: by mail-ed1-x532.google.com with SMTP id 4fb4d7f45d1cf-5333fb34be3so22354668a12.1 for ; Mon, 02 Oct 2023 09:55:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1696265739;
x=1696870539; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FNKP/Vfnf5kRJ5Yesm2cFwh3jAzoSFJ9JwdLJhVoEbg=; b=ZyYBxDjt5eb3Xc4oBRg1cj1SWoHedwwjFtAyjQqyvdW7eVBJmlgOB4BwcCdBMzg89k 5IRda11AmT42WCALCwf/SLfPvb5iSx8fbLsliZgdnTlkzo0tPbchxweQk4qNKuZ5R7Wu RKlDMWBewoQz/9eUAY/XqyCPtdVzUDvuIxHeRRYrYoYp6onXUn/GcDf5Y52mI4rwR7Qw Vk/uq7bhgkO0K7d13alF8H+gcGspYHgqvCPN4mKs3mQOAimkfwiSBvhqQ+QXF2PF3eAc ZtNZ3O3777ORbO0tb/Qs+M8ZyVTU1s4ZurobTUtzAZO4lFtufiFU8/uUfNWcLGWH7LKX vFhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696265739; x=1696870539; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FNKP/Vfnf5kRJ5Yesm2cFwh3jAzoSFJ9JwdLJhVoEbg=; b=s+/YjAxVJgukeqmRpPpfg30Se112MIkg4d4YDfB7LoU8PPJZ4jJ0LJjAGReUlMKh2T QlGM6ihEe9eO2kd7bbt0J4W0ryExT+N2zmMn9XbpS60KoO99Y3PS+8ZA5VsEAKR02R0X igcEKSnIw4f5tuh3B0Laa3gm3QnmynSxs1QsV3xnFjJ83xE+3XSYd1SR9MY/4EdNqZMj q/Nn0eEX3sjrttrSIjiuk56jHxPoQgRJo/tSlRbgZxFHFTtQGS6vVZ4b8Q397znVDFst 6UXrafD2JAZYuxDzy1qh9Ra+iMJ0Yk/jncg+q0JbHwXcBnDTr8gX8SoLhRnJc5/SzSpf 3UfQ== X-Gm-Message-State: AOJu0Yysm3Njw5llywHw4dbUrkK/ET01Iy48o8joHK1Fs8H6QVHYlRup rQ187KXuasV7f/QZ9aS0j4hvYFfSxIjQCw== X-Google-Smtp-Source: AGHT+IHrx/WROprDScBs0iFTwI4/O6JPS9ZoEbPpeGGuhnZi3oBF0mgOdJ2TviHQ+cODlSsC8DGkCg== X-Received: by 2002:a05:6402:785:b0:523:ae0a:a446 with SMTP id d5-20020a056402078500b00523ae0aa446mr12299324edy.24.1696265738830; Mon, 02 Oct 2023 09:55:38 -0700 (PDT) Received: from christian-Precision-5550.. 
([2a04:cec0:c027:f1d4:d825:fbf4:9197:5c9f]) by smtp.gmail.com with ESMTPSA id er15-20020a056402448f00b00533c844e337sm12762364edb.85.2023.10.02.09.55.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 09:55:38 -0700 (PDT)
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v8 9/9] gc: add `gc.repackFilterTo` config option
Date: Mon, 2 Oct 2023 18:55:04 +0200
Message-ID: <20231002165504.1325153-10-christian.couder@gmail.com>
X-Mailer: git-send-email 2.42.0.305.g5bfd918c90
In-Reply-To: <20231002165504.1325153-1-christian.couder@gmail.com>
References: <20230925152517.803579-1-christian.couder@gmail.com> <20231002165504.1325153-1-christian.couder@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

A previous commit implemented the `gc.repackFilter` config option
to specify a filter that `git gc` should use when performing repacks.

Another previous commit implemented `git repack --filter-to=<dir>` to
specify the location of the packfile containing filtered out objects
when using a filter.

Let's implement the `gc.repackFilterTo` config option to specify that
location in the config when `gc.repackFilter` is used.

Now, when `git gc` performs a repack and a non-empty `<dir>` is
configured through this option, the repack process is passed a
corresponding `--filter-to=<dir>` argument.

Signed-off-by: Christian Couder
---
 Documentation/config/gc.txt | 11 +++++++++++
 builtin/gc.c                |  4 ++++
 t/t6500-gc.sh               | 13 ++++++++++++-
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 2153bde7ac..466466d6cc 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -150,6 +150,17 @@ gc.repackFilter::
 	objects into a separate packfile. See the
 	`--filter=<filter-spec>` option of linkgit:git-repack[1].
 
+gc.repackFilterTo::
+	When repacking and using a filter, see `gc.repackFilter`, the
+	specified location will be used to create the packfile
+	containing the filtered out objects. **WARNING:** The
+	specified location should be accessible, using for example the
+	Git alternates mechanism, otherwise the repo could be
+	considered corrupt by Git as it might not be able to access the
+	objects in that packfile. See the `--filter-to=<dir>` option
+	of linkgit:git-repack[1] and the `objects/info/alternates`
+	section of linkgit:gitrepository-layout[5].
+
 gc.rerereResolved::
 	Records of conflicted merge you resolved earlier are
 	kept for this many days when 'git rerere gc' is run.
diff --git a/builtin/gc.c b/builtin/gc.c
index 98148e98fe..68ca8d45bf 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,6 +62,7 @@ static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
 static char *repack_filter;
+static char *repack_filter_to;
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
@@ -172,6 +173,7 @@ static void gc_config(void)
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
 
 	git_config_get_string("gc.repackfilter", &repack_filter);
+	git_config_get_string("gc.repackfilterto", &repack_filter_to);
 
 	git_config(git_default_config, NULL);
 }
@@ -361,6 +363,8 @@ static void add_repack_all_option(struct string_list *keep_pack)
 
 	if (repack_filter && *repack_filter)
 		strvec_pushf(&repack, "--filter=%s", repack_filter);
+	if (repack_filter_to && *repack_filter_to)
+		strvec_pushf(&repack, "--filter-to=%s", repack_filter_to);
 }
 
 static void add_repack_incremental_option(void)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 232e403b66..e412cf8daf 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -203,7 +203,6 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e
 '
 
 test_expect_success 'gc.repackFilter launches repack with a filter' '
-	test_when_finished "rm -rf bare.git" &&
 	git clone --no-local --bare . bare.git &&
 
 	git -C bare.git -c gc.cruftPacks=false gc &&
@@ -215,6 +214,18 @@ test_expect_success 'gc.repackFilter launches repack with a filter' '
 	grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out
 '
 
+test_expect_success 'gc.repackFilterTo stores filtered out objects' '
+	test_when_finished "rm -rf bare.git filtered.git" &&
+
+	git init --bare filtered.git &&
+	git -C bare.git -c gc.repackFilter=blob:none \
+		-c gc.repackFilterTo=../filtered.git/objects/pack/pack \
+		-c repack.writeBitmaps=false -c gc.cruftPacks=false gc &&
+
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/*.pack
+'
+
 prepare_cruft_history () {
 	test_commit base &&