From patchwork Mon Jul 24 08:59:02 2023 X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13323604 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v3 1/8] pack-objects: allow `--filter` without `--stdout` Date: Mon, 24 Jul 2023 10:59:02 +0200 Message-ID: <20230724085909.3831831-2-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.384.ged66511823 In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com> References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21) taught `git pack-objects` to use `--filter`, but required the use 
of `--stdout` since a partial clone mechanism was not yet in place to handle missing objects. Since then, changes like 9e27beaa23 (promisor-remote: implement promisor_remote_get_direct(), 2019-06-25) and others added support to dynamically fetch objects that were missing. Even without a promisor remote, filtering out objects can also be useful if we can put the filtered out objects in a separate pack, and in this case it also makes sense for pack-objects to write the packfile directly to an actual file rather than on stdout. Remove the `--stdout` requirement when using `--filter`, so that in a follow-up commit, repack can pass `--filter` to pack-objects to omit certain objects from the resulting packfile. Signed-off-by: John Cai Signed-off-by: Christian Couder --- Documentation/git-pack-objects.txt | 4 ++-- builtin/pack-objects.c | 8 ++------ t/t5317-pack-objects-filter-objects.sh | 8 ++++++++ 3 files changed, 12 insertions(+), 8 deletions(-) diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index a9995a932c..583270a85f 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -298,8 +298,8 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle. nevertheless. --filter=:: - Requires `--stdout`. Omits certain objects (usually blobs) from - the resulting packfile. See linkgit:git-rev-list[1] for valid + Omits certain objects (usually blobs) from the resulting + packfile. See linkgit:git-rev-list[1] for valid `` forms. 
--no-filter:: diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 06b33d49e9..7fca27ffbe 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4387,12 +4387,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) if (!rev_list_all || !rev_list_reflog || !rev_list_index) unpack_unreachable_expiration = 0; - if (filter_options.choice) { - if (!pack_to_stdout) - die(_("cannot use --filter without --stdout")); - if (stdin_packs) - die(_("cannot use --filter with --stdin-packs")); - } + if (stdin_packs && filter_options.choice) + die(_("cannot use --filter with --stdin-packs")); if (stdin_packs && use_internal_rev_list) die(_("cannot use internal rev list with --stdin-packs")); diff --git a/t/t5317-pack-objects-filter-objects.sh b/t/t5317-pack-objects-filter-objects.sh index b26d476c64..2ff3eef9a3 100755 --- a/t/t5317-pack-objects-filter-objects.sh +++ b/t/t5317-pack-objects-filter-objects.sh @@ -53,6 +53,14 @@ test_expect_success 'verify blob:none packfile has no blobs' ' ! grep blob verify_result ' +test_expect_success 'verify blob:none packfile without --stdout' ' + git -C r1 pack-objects --revs --filter=blob:none mypackname >packhash <<-EOF && + HEAD + EOF + git -C r1 verify-pack -v "mypackname-$(cat packhash).pack" >verify_result && + ! 
grep blob verify_result +' + test_expect_success 'verify normal and blob:none packfiles have same commits/trees' ' git -C r1 verify-pack -v ../all.pack >verify_result && grep -E "commit|tree" verify_result |
From patchwork Mon Jul 24 08:59:03 2023 X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13323607 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v3 2/8] t/helper: add 'find-pack' test-tool Date: Mon, 24 Jul 2023 10:59:03 +0200 Message-ID: <20230724085909.3831831-3-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.384.ged66511823 In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com> References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
In a following commit, we will make it possible to separate objects in different packfiles depending on a filter. To make sure that the right objects are in the right packs, let's add a new test-tool that can display which packfile(s) a given object is in. 
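For comparison, a rough equivalent of this lookup can be sketched with stock commands only. It assumes the usual `git verify-pack -v` output, where each object line starts with the object ID. The function name `find_packs_for_object` is made up for this illustration and is not part of this series, and the sketch only looks at packs in the repository's own objects/pack directory:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not part of git).
# Print the path of every local packfile that contains the object
# named by $1 (any revision syntax that "git rev-parse" accepts).
find_packs_for_object () {
	oid=$(git rev-parse --verify "$1") || return 1
	packdir=$(git rev-parse --git-path objects/pack) || return 1
	for idx in "$packdir"/*.idx
	do
		# The glob stays literal when there are no packs at all.
		test -e "$idx" || continue
		# Object lines of "verify-pack -v" start with the object ID
		# followed by a space.
		if git verify-pack -v "$idx" | grep "^$oid " >/dev/null
		then
			echo "${idx%.idx}.pack"
		fi
	done
}
```

Running `find_packs_for_object HEAD:file1` prints one path per pack holding that blob. It lists every object of every pack, so it is far slower than a dedicated helper and is only meant to show what is being looked up.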
Signed-off-by: Christian Couder --- Makefile | 1 + t/helper/test-find-pack.c | 35 +++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 38 insertions(+) create mode 100644 t/helper/test-find-pack.c diff --git a/Makefile b/Makefile index fb541dedc9..14ee0c45d4 100644 --- a/Makefile +++ b/Makefile @@ -800,6 +800,7 @@ TEST_BUILTINS_OBJS += test-dump-untracked-cache.o TEST_BUILTINS_OBJS += test-env-helper.o TEST_BUILTINS_OBJS += test-example-decorate.o TEST_BUILTINS_OBJS += test-fast-rebase.o +TEST_BUILTINS_OBJS += test-find-pack.o TEST_BUILTINS_OBJS += test-fsmonitor-client.o TEST_BUILTINS_OBJS += test-genrandom.o TEST_BUILTINS_OBJS += test-genzeros.o diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c new file mode 100644 index 0000000000..1928fe7329 --- /dev/null +++ b/t/helper/test-find-pack.c @@ -0,0 +1,35 @@ +#include "test-tool.h" +#include "object-name.h" +#include "object-store.h" +#include "packfile.h" +#include "setup.h" + +/* + * Display the path(s), one per line, of the packfile(s) containing + * the given object. 
+ */ + +static const char *find_pack_usage = "\n" +" test-tool find-pack "; + + +int cmd__find_pack(int argc, const char **argv) +{ + struct object_id oid; + struct packed_git *p; + + setup_git_directory(); + + if (argc != 2) + usage(find_pack_usage); + + if (repo_get_oid(the_repository, argv[1], &oid)) + die("cannot parse %s as an object name", argv[1]); + + for (p = get_all_packs(the_repository); p; p = p->next) { + if (find_pack_entry_one(oid.hash, p)) + printf("%s\n", p->pack_name); + } + + return 0; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index abe8a785eb..41da40c296 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -31,6 +31,7 @@ static struct test_cmd cmds[] = { { "env-helper", cmd__env_helper }, { "example-decorate", cmd__example_decorate }, { "fast-rebase", cmd__fast_rebase }, + { "find-pack", cmd__find_pack }, { "fsmonitor-client", cmd__fsmonitor_client }, { "genrandom", cmd__genrandom }, { "genzeros", cmd__genzeros }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index ea2672436c..411dbf2db4 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -25,6 +25,7 @@ int cmd__dump_reftable(int argc, const char **argv); int cmd__env_helper(int argc, const char **argv); int cmd__example_decorate(int argc, const char **argv); int cmd__fast_rebase(int argc, const char **argv); +int cmd__find_pack(int argc, const char **argv); int cmd__fsmonitor_client(int argc, const char **argv); int cmd__genrandom(int argc, const char **argv); int cmd__genzeros(int argc, const char **argv);
From patchwork Mon Jul 24 08:59:04 2023 X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13323605 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder Subject: [PATCH v3 3/8] repack: refactor finishing pack-objects command Date: Mon, 24 Jul 2023 10:59:04 +0200 Message-ID: <20230724085909.3831831-4-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.384.ged66511823 In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com> References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
Create a new finish_pack_objects_cmd() to refactor duplicated code that handles reading the packfile names from the output of a `git pack-objects` command and putting it into a string_list, as well as calling finish_command(). While at it, beautify a code comment a bit in the new function. 
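The per-line check applied to what pack-objects prints can be illustrated in shell as well. This is only a sketch: `read_pack_names` is a hypothetical helper, not something that exists in git, and it is stricter than the C code in this patch, which compares the line length against the hash size and nothing else:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not part of git).
# Read names as printed by "git pack-objects" on stdin, one per line,
# and echo them back; fail on the first line that is not a full hex
# object ID. $1 is the expected number of hex digits (40 for SHA-1,
# 64 for SHA-256). Assumes lowercase hex.
read_pack_names () {
	hexsz=${1:-40}
	while IFS= read -r line
	do
		case "$line" in
		''|*[!0-9a-f]*)
			# Extra strictness compared with the C code: reject
			# anything that is not made of hex digits only.
			echo "expecting full hex object ID lines only" >&2
			return 1
			;;
		esac
		if test "${#line}" -ne "$hexsz"
		then
			echo "expecting full hex object ID lines only" >&2
			return 1
		fi
		echo "$line"
	done
}
```

The output of a `git pack-objects` run that writes to a file could be piped through `read_pack_names 40` to collect the names of the packs it wrote.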
Signed-off-by: Christian Couder out, "r"); + while (strbuf_getline_lf(&line, out) != EOF) { + struct string_list_item *item; + + if (line.len != the_hash_algo->hexsz) + die(_("repack: Expecting full hex object ID lines only " + "from pack-objects.")); + /* + * Avoid putting packs written outside of the repository in the + * list of names. + */ + if (local) { + item = string_list_append(names, line.buf); + item->util = populate_pack_exts(line.buf); + } + } + fclose(out); + + strbuf_release(&line); + + return finish_command(cmd); +} + static int write_cruft_pack(const struct pack_objects_args *args, const char *destination, const char *pack_prefix, @@ -705,9 +735,8 @@ static int write_cruft_pack(const struct pack_objects_args *args, struct string_list *existing_kept_packs) { struct child_process cmd = CHILD_PROCESS_INIT; - struct strbuf line = STRBUF_INIT; struct string_list_item *item; - FILE *in, *out; + FILE *in; int ret; const char *scratch; int local = skip_prefix(destination, packdir, &scratch); @@ -751,27 +780,7 @@ static int write_cruft_pack(const struct pack_objects_args *args, fprintf(in, "%s.pack\n", item->string); fclose(in); - out = xfdopen(cmd.out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only " - "from pack-objects.")); - /* - * avoid putting packs written outside of the repository in the - * list of names - */ - if (local) { - item = string_list_append(names, line.buf); - item->util = populate_pack_exts(line.buf); - } - } - fclose(out); - - strbuf_release(&line); - - return finish_command(&cmd); + return finish_pack_objects_cmd(&cmd, names, local); } int cmd_repack(int argc, const char **argv, const char *prefix) @@ -782,10 +791,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct string_list existing_nonkept_packs = STRING_LIST_INIT_DUP; struct string_list existing_kept_packs = 
STRING_LIST_INIT_DUP; struct pack_geometry *geometry = NULL; - struct strbuf line = STRBUF_INIT; struct tempfile *refs_snapshot = NULL; int i, ext, ret; - FILE *out; int show_progress; /* variables to be filled by option parsing */ @@ -1016,18 +1023,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) fclose(in); } - out = xfdopen(cmd.out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only from pack-objects.")); - item = string_list_append(&names, line.buf); - item->util = populate_pack_exts(item->string); - } - strbuf_release(&line); - fclose(out); - ret = finish_command(&cmd); + ret = finish_pack_objects_cmd(&cmd, &names, 1); if (ret) goto cleanup;
From patchwork Mon Jul 24 08:59:05 2023 X-Patchwork-Submitter: Christian Couder X-Patchwork-Id: 13323606 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder Subject: [PATCH v3 4/8] repack: refactor finding pack prefix Date: Mon, 24 Jul 2023 10:59:05 +0200 Message-ID: <20230724085909.3831831-5-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.384.ged66511823 In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com> References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
Create a new find_pack_prefix() to refactor code that handles finding the pack prefix from the packtmp and packdir global variables, as we are going to need this feature again in a following commit. 
Signed-off-by: Christian Couder
X-Patchwork-Id: 13323608 From: Christian Couder To: git@vger.kernel.org Cc: Junio C Hamano , John Cai , Jonathan Tan , Jonathan Nieder , Taylor Blau , Derrick Stolee , Patrick Steinhardt , Christian Couder , Christian Couder Subject: [PATCH v3 5/8] repack: add `--filter=` option Date: Mon, 24 Jul 2023 10:59:06 +0200 Message-ID: <20230724085909.3831831-6-christian.couder@gmail.com> X-Mailer: git-send-email 2.41.0.384.ged66511823 In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com> References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
This new option puts the objects specified by `` into a separate packfile. This could be useful if, for example, some large blobs take up a lot of precious space on fast storage while they are rarely accessed. 
It could make sense to move them into separate, cheaper (though slower)
storage.

In other use cases it might make sense to put all the blobs into
separate storage.

It's possible to find which new packfile contains the filtered out
objects using one of the following:

  - `git verify-pack -v ...`,
  - `test-tool find-pack ...`, which a previous commit added,
  - `--filter-to=<dir>`, which a following commit will add to specify
    where the pack containing the filtered out objects will be.

This feature is implemented by running `git pack-objects` twice in a
row. The first command is run with `--filter=<filter-spec>`, using the
specified filter. It packs objects while omitting the objects specified
by the filter. Then another `git pack-objects` command is launched using
`--stdin-packs`. We pass all the previously existing packs into its
stdin, so that it will pack all the objects in those packs. But we also
pass into its stdin the pack created by the previous
`git pack-objects --filter=<filter-spec>` command, as well as the kept
packs, all prefixed with '^', so that the objects in these packs will be
omitted from the resulting pack. The result is that only the objects
filtered out by the first `git pack-objects` command are in the pack
resulting from the second `git pack-objects` command.

Signed-off-by: John Cai
Signed-off-by: Christian Couder
---
 Documentation/git-repack.txt | 12 +++++++
 builtin/repack.c             | 67 ++++++++++++++++++++++++++++++++++++
 t/t7700-repack.sh            | 24 +++++++++++++
 3 files changed, 103 insertions(+)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 4017157949..6d5bec7716 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -143,6 +143,18 @@ depth is 4095.
 	a larger and slower repository; see the discussion in
 	`pack.packSizeLimit`.
 
+--filter=<filter-spec>::
+	Remove objects matching the filter specification from the
+	resulting packfile and put them into a separate packfile. Note
+	that objects used in the working directory are not filtered
+	out. So for the split to fully work, it's best to perform it
+	in a bare repo and to use the `-a` and `-d` options along with
+	this option. Also `--no-write-bitmap-index` (or the
+	`repack.writebitmaps` config option set to `false`) should be
+	used, otherwise writing the bitmap index will fail, as it
+	supposes a single packfile containing all the objects. See
+	linkgit:git-rev-list[1] for valid `<filter-spec>` forms.
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index 21e3b89f27..2c81b7738e 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -53,6 +53,7 @@ struct pack_objects_args {
 	const char *depth;
 	const char *threads;
 	const char *max_pack_size;
+	const char *filter;
 	int no_reuse_delta;
 	int no_reuse_object;
 	int quiet;
@@ -166,6 +167,8 @@ static void prepare_pack_objects(struct child_process *cmd,
 		strvec_pushf(&cmd->args, "--threads=%s", args->threads);
 	if (args->max_pack_size)
 		strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size);
+	if (args->filter)
+		strvec_pushf(&cmd->args, "--filter=%s", args->filter);
 	if (args->no_reuse_delta)
 		strvec_pushf(&cmd->args, "--no-reuse-delta");
 	if (args->no_reuse_object)
@@ -726,6 +729,57 @@ static int finish_pack_objects_cmd(struct child_process *cmd,
 	return finish_command(cmd);
 }
 
+static int write_filtered_pack(const struct pack_objects_args *args,
+			       const char *destination,
+			       const char *pack_prefix,
+			       struct string_list *names,
+			       struct string_list *existing_packs,
+			       struct string_list *existing_kept_packs)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	struct string_list_item *item;
+	FILE *in;
+	int ret;
+	const char *scratch;
+	int local = skip_prefix(destination, packdir, &scratch);
+
+	/* We need to copy 'args' to modify it */
+	struct pack_objects_args new_args = *args;
+
+	/* No need to filter again */
+	new_args.filter = NULL;
+
+	prepare_pack_objects(&cmd, &new_args, destination);
+
+	strvec_push(&cmd.args, "--stdin-packs");
+
+	cmd.in = -1;
+
+	ret = start_command(&cmd);
+	if (ret)
+		return ret;
+
+	/*
+	 * names has a confusing double use: it both provides the list
+	 * of just-written new packs, and accepts the name of the
+	 * filtered pack we are writing.
+	 *
+	 * By the time it is read here, it contains only the pack(s)
+	 * that were just written, which is exactly the set of packs we
+	 * want to consider kept.
+	 */
+	in = xfdopen(cmd.in, "w");
+	for_each_string_list_item(item, names)
+		fprintf(in, "^%s-%s.pack\n", pack_prefix, item->string);
+	for_each_string_list_item(item, existing_packs)
+		fprintf(in, "%s.pack\n", item->string);
+	for_each_string_list_item(item, existing_kept_packs)
+		fprintf(in, "^%s.pack\n", item->string);
+	fclose(in);
+
+	return finish_pack_objects_cmd(&cmd, names, local);
+}
+
 static int write_cruft_pack(const struct pack_objects_args *args,
 			    const char *destination,
 			    const char *pack_prefix,
@@ -858,6 +912,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("limits the maximum number of threads")),
 		OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"),
 				N_("maximum size of each packfile")),
+		OPT_STRING(0, "filter", &po_args.filter, N_("args"),
+				N_("object filtering")),
 		OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects,
 				N_("repack objects in packs marked with .keep")),
 		OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"),
@@ -1097,6 +1153,17 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	if (po_args.filter) {
+		ret = write_filtered_pack(&po_args,
+					  packtmp,
+					  find_pack_prefix(),
+					  &names,
+					  &existing_nonkept_packs,
+					  &existing_kept_packs);
+		if (ret)
+			goto cleanup;
+	}
+
 	string_list_sort(&names);
 
 	close_object_store(the_repository->objects);
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 27b66807cd..0a2c73bca7 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -327,6 +327,30 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'repacking with a filter works' '
+	git -C bare.git repack -a -d &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+	git -C bare.git -c repack.writebitmaps=false repack -a -d --filter=blob:none &&
+	test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack &&
+	commit_pack=$(test-tool -C bare.git find-pack HEAD) &&
+	test -n "$commit_pack" &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -n "$blob_pack" &&
+	test "$commit_pack" != "$blob_pack" &&
+	tree_pack=$(test-tool -C bare.git find-pack HEAD^{tree}) &&
+	test "$tree_pack" = "$commit_pack" &&
+	blob_pack2=$(test-tool -C bare.git find-pack HEAD:file2) &&
+	test "$blob_pack2" = "$blob_pack"
+'
+
+test_expect_success '--filter fails with --write-bitmap-index' '
+	test_must_fail git -C bare.git repack -a -d --write-bitmap-index \
+		--filter=blob:none &&
+
+	git -C bare.git repack -a -d --no-write-bitmap-index \
+		--filter=blob:none
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 

From patchwork Mon Jul 24 08:59:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13323609
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v3 6/8] gc: add `gc.repackFilter` config option
Date: Mon, 24 Jul 2023 10:59:07 +0200
Message-ID: <20230724085909.3831831-7-christian.couder@gmail.com>
X-Mailer: git-send-email 2.41.0.384.ged66511823
In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com>
References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new, separate pack.

Users might want to perform such a cleanup regularly at the same time as
they perform other repacks and cleanups, that is, as part of `git gc`.

Let's allow them to configure a <filter-spec> for that purpose using a
new gc.repackFilter config option.

Now when `git gc` performs a repack with a non-empty <filter-spec>
configured through this option, the repack process is passed a
corresponding `--filter=<filter-spec>` argument.
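
For illustration only, a configuration enabling this might look like the
sketch below. The `blob:none` value is just an example filter (it is the
one the test in this patch uses), and `repack.writeBitmaps = false` is
included on the assumption that writing a bitmap index would otherwise
fail, as documented for `git repack --filter`.

```ini
[gc]
	repackFilter = blob:none
[repack]
	writeBitmaps = false
```

With such a config, a plain `git gc` in the repository would be expected
to pass `--filter=blob:none` to `git repack`.
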
Signed-off-by: Christian Couder
---
 Documentation/config/gc.txt |  5 +++++
 builtin/gc.c                |  6 ++++++
 t/t6500-gc.sh               | 12 ++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index ca47eb2008..2153bde7ac 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -145,6 +145,11 @@ Multiple hooks are supported, but all must exit successfully, else the
 operation (either generating a cruft pack or unpacking unreachable
 objects) will be halted.
 
+gc.repackFilter::
+	When repacking, use the specified filter to move certain
+	objects into a separate packfile. See the
+	`--filter=<filter-spec>` option of linkgit:git-repack[1].
+
 gc.rerereResolved::
 	Records of conflicted merge you resolved earlier are
 	kept for this many days when 'git rerere gc' is run.
diff --git a/builtin/gc.c b/builtin/gc.c
index 19d73067aa..9b0984f301 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -61,6 +61,7 @@ static timestamp_t gc_log_expire_time;
 static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
+static char *repack_filter;
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
@@ -170,6 +171,8 @@ static void gc_config(void)
 	git_config_get_ulong("gc.bigpackthreshold", &big_pack_threshold);
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
 
+	git_config_get_string("gc.repackfilter", &repack_filter);
+
 	git_config(git_default_config, NULL);
 }
 
@@ -355,6 +358,9 @@ static void add_repack_all_option(struct string_list *keep_pack)
 
 	if (keep_pack)
 		for_each_string_list(keep_pack, keep_one_pack, NULL);
+
+	if (repack_filter && *repack_filter)
+		strvec_pushf(&repack, "--filter=%s", repack_filter);
 }
 
 static void add_repack_incremental_option(void)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 69509d0c11..5b89faf505 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -202,6 +202,18 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e
 	grep -E "^trace: (built-in|exec|run_command): git reflog expire --" trace.out
 '
 
+test_expect_success 'gc.repackFilter launches repack with a filter' '
+	test_when_finished "rm -rf bare.git" &&
+	git clone --no-local --bare . bare.git &&
+
+	git -C bare.git -c gc.cruftPacks=false gc &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+
+	GIT_TRACE=$(pwd)/trace.out git -C bare.git -c gc.repackFilter=blob:none -c repack.writeBitmaps=false -c gc.cruftPacks=false gc &&
+	test_stdout_line_count = 2 ls bare.git/objects/pack/*.pack &&
+	grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out
+'
+
 prepare_cruft_history () {
 	test_commit base &&
 

From patchwork Mon Jul 24 08:59:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13323611
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v3 7/8] repack: implement `--filter-to` for storing filtered out objects
Date: Mon, 24 Jul 2023 10:59:08 +0200
Message-ID: <20230724085909.3831831-8-christian.couder@gmail.com>
X-Mailer: git-send-email 2.41.0.384.ged66511823
In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com>
References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

A previous commit has implemented `git repack --filter=<filter-spec>` to
allow users to filter out some objects from the main pack and move them
into a new, separate pack.

It would be nice if this new pack could be created in a different
directory than the regular pack. This would make it possible to move
large blobs into a pack on a different kind of storage, for example
cheaper storage.

Even in a different directory, this pack can be accessible if, for
example, the Git alternates mechanism is used to point to it. In fact,
not using the Git alternates mechanism can corrupt a repo, as the
generated pack containing the filtered objects might not be accessible
from the repo any more. So the Git alternates mechanism should be set up
before using this feature if the user wants the repo to be fully usable
while this feature is used.

In some cases, like when a repo has just been cloned or when there is no
other activity in the repo, it's OK to set up the Git alternates
mechanism afterwards though.
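
As a rough sketch of what setting up the alternates mechanism amounts to,
the shell function below appends an object directory to a repository's
`objects/info/alternates` file. The helper name `add_alternate` and its
arguments are hypothetical and not part of this series; only the
`objects/info/alternates` file itself is Git's.

```shell
# Hypothetical helper (not part of this series): make the objects under
# "$2" reachable from the repository whose object directory is "$1" by
# appending "$2" to its objects/info/alternates file.
add_alternate () {
	repo_objects=$1 &&
	alt_objects=$2 &&
	mkdir -p "$repo_objects/info" &&
	# Record an absolute path, which avoids any question about what
	# a relative entry would be resolved against.
	alt_abs=$(cd "$alt_objects" && pwd) &&
	printf '%s\n' "$alt_abs" >>"$repo_objects/info/alternates"
}
```

Assuming sibling `bare.git` and `filtered.git` repositories as in the
tests of this patch, `add_alternate bare.git/objects filtered.git/objects`
could be run before
`git -C bare.git repack -a -d --filter=blob:none --filter-to=../filtered.git/objects/pack/pack`.
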
It's also OK to just inspect the generated packfile containing the
filtered objects and then move it into the '.git/objects/pack/'
directory manually. That's why it's not necessary for this command to
check that the Git alternates mechanism has already been set up.

While at it, as an example to show that `--filter` and `--filter-to`
work well with other options, let's also add a test to check that these
options work well with `--max-pack-size`.

Signed-off-by: Christian Couder

repack: add test with --max-pack-size
---
 Documentation/git-repack.txt | 11 ++++++
 builtin/repack.c             | 11 +++++-
 t/t7700-repack.sh            | 66 ++++++++++++++++++++++++++++++++++++
 3 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index 6d5bec7716..c0fbb0ed0c 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -155,6 +155,17 @@ depth is 4095.
 	a single packfile containing all the objects. See
 	linkgit:git-rev-list[1] for valid `<filter-spec>` forms.
 
+--filter-to=<dir>::
+	Write the pack containing filtered out objects to the
+	directory `<dir>`. Only useful with `--filter`. This can be
+	used for putting the pack on a separate object directory that
+	is accessed through the Git alternates mechanism. **WARNING:**
+	If the packfile containing the filtered out objects is not
+	accessible, the repo could be considered corrupt by Git as it
+	might not be able to access the objects in that packfile. See
+	the `objects` and `objects/info/alternates` sections of
+	linkgit:gitrepository-layout[5].
+
 -b::
 --write-bitmap-index::
 	Write a reachability bitmap index as part of the repack. This
diff --git a/builtin/repack.c b/builtin/repack.c
index 2c81b7738e..626284191b 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -871,6 +871,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	int write_midx = 0;
 	const char *cruft_expiration = NULL;
 	const char *expire_to = NULL;
+	const char *filter_to = NULL;
 
 	struct option builtin_repack_options[] = {
 		OPT_BIT('a', NULL, &pack_everything,
@@ -924,6 +925,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("write a multi-pack index of the resulting packs")),
 		OPT_STRING(0, "expire-to", &expire_to, N_("dir"),
 			   N_("pack prefix to store a pack containing pruned objects")),
+		OPT_STRING(0, "filter-to", &filter_to, N_("dir"),
+			   N_("pack prefix to store a pack containing filtered out objects")),
 		OPT_END()
 	};
 
@@ -1067,6 +1070,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		strvec_push(&cmd.args, "--incremental");
 	}
 
+	if (filter_to && !po_args.filter)
+		die(_("option '%s' can only be used along with '%s'"), "--filter-to", "--filter");
+
 	if (geometry)
 		cmd.in = -1;
 	else
@@ -1154,8 +1160,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	}
 
 	if (po_args.filter) {
+		if (!filter_to)
+			filter_to = packtmp;
+
 		ret = write_filtered_pack(&po_args,
-					  packtmp,
+					  filter_to,
 					  find_pack_prefix(),
 					  &names,
 					  &existing_nonkept_packs,
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 0a2c73bca7..2bf237ba3a 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -351,6 +351,72 @@ test_expect_success '--filter fails with --write-bitmap-index' '
 		--filter=blob:none
 '
 
+test_expect_success '--filter-to stores filtered out objects' '
+	git -C bare.git repack -a -d &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+
+	git init --bare filtered.git &&
+	git -C bare.git -c repack.writebitmaps=false repack -a -d \
+		--filter=blob:none \
+		--filter-to=../filtered.git/objects/pack/pack &&
+	test_stdout_line_count = 1 ls bare.git/objects/pack/pack-*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/pack-*.pack &&
+
+	commit_pack=$(test-tool -C bare.git find-pack HEAD) &&
+	test -n "$commit_pack" &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -z "$blob_pack" &&
+	blob_hash=$(git -C bare.git rev-parse HEAD:file1) &&
+	test -n "$blob_hash" &&
+	blob_pack=$(test-tool -C filtered.git find-pack $blob_hash) &&
+	test -n "$blob_pack" &&
+
+	echo $(pwd)/filtered.git/objects >bare.git/objects/info/alternates &&
+	blob_pack=$(test-tool -C bare.git find-pack HEAD:file1) &&
+	test -n "$blob_pack" &&
+	blob_content=$(git -C bare.git show $blob_hash) &&
+	test "$blob_content" = "content1"
+'
+
+test_expect_success '--filter works with --max-pack-size' '
+	rm -rf filtered.git &&
+	git init --bare filtered.git &&
+	git init max-pack-size &&
+	(
+		cd max-pack-size &&
+		test_commit base &&
+		# two blobs which exceed the maximum pack size
+		test-tool genrandom foo 1048576 >foo &&
+		git hash-object -w foo &&
+		test-tool genrandom bar 1048576 >bar &&
+		git hash-object -w bar &&
+		git add foo bar &&
+		git commit -m "adding foo and bar"
+	) &&
+	git clone --no-local --bare max-pack-size max-pack-size.git &&
+	(
+		cd max-pack-size.git &&
+		git -c repack.writebitmaps=false repack -a -d --filter=blob:none \
+			--max-pack-size=1M \
+			--filter-to=../filtered.git/objects/pack/pack &&
+		echo $(cd .. && pwd)/filtered.git/objects >objects/info/alternates &&
+
+		# Check that the 3 blobs are in different packfiles in filtered.git
+		test_stdout_line_count = 3 ls ../filtered.git/objects/pack/pack-*.pack &&
+		test_stdout_line_count = 1 ls objects/pack/pack-*.pack &&
+		foo_pack=$(test-tool find-pack HEAD:foo) &&
+		bar_pack=$(test-tool find-pack HEAD:bar) &&
+		base_pack=$(test-tool find-pack HEAD:base.t) &&
+		test "$foo_pack" != "$bar_pack" &&
+		test "$foo_pack" != "$base_pack" &&
+		test "$bar_pack" != "$base_pack" &&
+		for pack in "$foo_pack" "$bar_pack" "$base_pack"
+		do
+			case "$pack" in */filtered.git/objects/pack/*) true ;; *) return 1 ;; esac
+		done
+	)
+'
+
 objdir=.git/objects
 midx=$objdir/pack/multi-pack-index
 

From patchwork Mon Jul 24 08:59:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Couder
X-Patchwork-Id: 13323610
From: Christian Couder
To: git@vger.kernel.org
Cc: Junio C Hamano, John Cai, Jonathan Tan, Jonathan Nieder, Taylor Blau, Derrick Stolee, Patrick Steinhardt, Christian Couder, Christian Couder
Subject: [PATCH v3 8/8] gc: add `gc.repackFilterTo` config option
Date: Mon, 24 Jul 2023 10:59:09 +0200
Message-ID: <20230724085909.3831831-9-christian.couder@gmail.com>
X-Mailer: git-send-email 2.41.0.384.ged66511823
In-Reply-To: <20230724085909.3831831-1-christian.couder@gmail.com>
References: <20230705060812.2865188-1-christian.couder@gmail.com> <20230724085909.3831831-1-christian.couder@gmail.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

A previous commit implemented the `gc.repackFilter` config option to
specify a filter that should be used by `git gc` when performing
repacks.

Another previous commit has implemented `git repack --filter-to=<dir>`
to specify the location of the packfile containing filtered out objects
when using a filter.

Let's implement the `gc.repackFilterTo` config option to specify that
location in the config when `gc.repackFilter` is used.

Now when `git gc` performs a repack with a non-empty <dir> configured
through this option, the repack process is passed a corresponding
`--filter-to=<dir>` argument.

Signed-off-by: Christian Couder
---
 Documentation/config/gc.txt | 11 +++++++++++
 builtin/gc.c                |  4 ++++
 t/t6500-gc.sh               | 13 ++++++++++++-
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 2153bde7ac..466466d6cc 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -150,6 +150,17 @@ gc.repackFilter::
 	objects into a separate packfile. See the
 	`--filter=<filter-spec>` option of linkgit:git-repack[1].
 
+gc.repackFilterTo::
+	When repacking and using a filter, see `gc.repackFilter`, the
+	specified location will be used to create the packfile
+	containing the filtered out objects. **WARNING:** The
+	specified location should be accessible, using for example the
+	Git alternates mechanism, otherwise the repo could be
+	considered corrupt by Git as it might not be able to access the
+	objects in that packfile. See the `--filter-to=<dir>` option
+	of linkgit:git-repack[1] and the `objects/info/alternates`
+	section of linkgit:gitrepository-layout[5].
+
 gc.rerereResolved::
 	Records of conflicted merge you resolved earlier are
 	kept for this many days when 'git rerere gc' is run.
diff --git a/builtin/gc.c b/builtin/gc.c
index 9b0984f301..1b7c775d94 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -62,6 +62,7 @@ static const char *gc_log_expire = "1.day.ago";
 static const char *prune_expire = "2.weeks.ago";
 static const char *prune_worktrees_expire = "3.months.ago";
 static char *repack_filter;
+static char *repack_filter_to;
 static unsigned long big_pack_threshold;
 static unsigned long max_delta_cache_size = DEFAULT_DELTA_CACHE_SIZE;
 
@@ -172,6 +173,7 @@ static void gc_config(void)
 	git_config_get_ulong("pack.deltacachesize", &max_delta_cache_size);
 
 	git_config_get_string("gc.repackfilter", &repack_filter);
+	git_config_get_string("gc.repackfilterto", &repack_filter_to);
 
 	git_config(git_default_config, NULL);
 }
@@ -361,6 +363,8 @@ static void add_repack_all_option(struct string_list *keep_pack)
 
 	if (repack_filter && *repack_filter)
 		strvec_pushf(&repack, "--filter=%s", repack_filter);
+	if (repack_filter_to && *repack_filter_to)
+		strvec_pushf(&repack, "--filter-to=%s", repack_filter_to);
 }
 
 static void add_repack_incremental_option(void)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 5b89faf505..37056a824b 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -203,7 +203,6 @@ test_expect_success 'one of gc.reflogExpire{Unreachable,}=never does not skip "e
 '
 
 test_expect_success 'gc.repackFilter launches repack with a filter' '
-	test_when_finished "rm -rf bare.git" &&
 	git clone --no-local --bare . bare.git &&
 
 	git -C bare.git -c gc.cruftPacks=false gc &&
@@ -214,6 +213,18 @@ test_expect_success 'gc.repackFilter launches repack with a filter' '
 	grep -E "^trace: (built-in|exec|run_command): git repack .* --filter=blob:none ?.*" trace.out
 '
 
+test_expect_success 'gc.repackFilterTo stores filtered out objects' '
+	test_when_finished "rm -rf bare.git filtered.git" &&
+
+	git init --bare filtered.git &&
+	git -C bare.git -c gc.repackFilter=blob:none \
+		-c gc.repackFilterTo=../filtered.git/objects/pack/pack \
+		-c repack.writeBitmaps=false -c gc.cruftPacks=false gc &&
+
+	test_stdout_line_count = 1 ls bare.git/objects/pack/*.pack &&
+	test_stdout_line_count = 1 ls filtered.git/objects/pack/*.pack
+'
+
 prepare_cruft_history () {
 	test_commit base &&