From patchwork Sat Sep 11 03:32:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486133
Date: Fri, 10 Sep 2021 23:32:29 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 1/8] midx: expose 'write_midx_file_only()' publicly
Message-ID: <4afa03b972a1885c60fbf3716f22a7ab58056383.1631331139.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

Expose a variant of the write_midx_file() function which ignores packs that aren't included in an explicit "allow" list.
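As a rough illustration of the allow-list idea (not the code in this patch), the sketch below filters pack index names against a sorted list using binary search. The `allow_list` struct and the helper names are hypothetical stand-ins for git's `string_list` API; the only behavior borrowed from the patch is that a NULL list means "no filtering".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's string_list: an array of names. */
struct allow_list {
	const char **names;
	size_t nr;
};

/* Compare two array slots that each hold a C string pointer. */
static int cmp_names(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort once up front so that lookups can use binary search. */
static void allow_list_sort(struct allow_list *list)
{
	qsort(list->names, list->nr, sizeof(*list->names), cmp_names);
}

/* Returns 1 if file_name is on the (already sorted) list, 0 otherwise. */
static int allow_list_has(const struct allow_list *list, const char *file_name)
{
	return bsearch(&file_name, list->names, list->nr,
		       sizeof(*list->names), cmp_names) != NULL;
}

/*
 * Decide whether a pack index should be considered at all.
 * A NULL list means "no filtering": every pack index is kept.
 */
static int should_include_pack(const struct allow_list *list,
			       const char *file_name)
{
	assert(file_name != NULL);
	if (!list)
		return 1;
	return allow_list_has(list, file_name);
}
```

A caller would build the list, sort it once with `allow_list_sort()`, and then ask `should_include_pack()` for each `.idx` file it encounters in the pack directory.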
This will be used in an upcoming patch to power a new `--stdin-packs` mode of `git multi-pack-index write` for callers that only want to include certain packs in a MIDX (and ignore any packs which may have happened to enter the repository independently, e.g., from pushes). Those patches will provide test coverage for this new function. Signed-off-by: Taylor Blau --- midx.c | 50 +++++++++++++++++++++++++++++++++++++++----------- midx.h | 2 ++ 2 files changed, 41 insertions(+), 11 deletions(-) diff --git a/midx.c b/midx.c index 864034a6ad..29d1d107b3 100644 --- a/midx.c +++ b/midx.c @@ -475,6 +475,8 @@ struct write_midx_context { uint32_t num_large_offsets; int preferred_pack_idx; + + struct string_list *to_include; }; static void add_pack_to_midx(const char *full_path, size_t full_path_len, @@ -484,8 +486,13 @@ static void add_pack_to_midx(const char *full_path, size_t full_path_len, if (ends_with(file_name, ".idx")) { display_progress(ctx->progress, ++ctx->pack_paths_checked); - if (ctx->m && midx_contains_pack(ctx->m, file_name)) - return; + if (ctx->m) { + if (midx_contains_pack(ctx->m, file_name)) + return; + } else if (ctx->to_include) { + if (!string_list_has_string(ctx->to_include, file_name)) + return; + } ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); @@ -1058,6 +1065,7 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, } static int write_midx_internal(const char *object_dir, + struct string_list *packs_to_include, struct string_list *packs_to_drop, const char *preferred_pack_name, unsigned flags) @@ -1082,10 +1090,17 @@ static int write_midx_internal(const char *object_dir, die_errno(_("unable to create leading directories of %s"), midx_name); - for (cur = get_multi_pack_index(the_repository); cur; cur = cur->next) { - if (!strcmp(object_dir, cur->object_dir)) { - ctx.m = cur; - break; + if (!packs_to_include) { + /* + * Only reference an existing MIDX when not filtering which + * packs to include, since all packs and objects are 
copied + * blindly from an existing MIDX if one is present. + */ + for (cur = get_multi_pack_index(the_repository); cur; cur = cur->next) { + if (!strcmp(object_dir, cur->object_dir)) { + ctx.m = cur; + break; + } } } @@ -1136,10 +1151,13 @@ static int write_midx_internal(const char *object_dir, else ctx.progress = NULL; + ctx.to_include = packs_to_include; + for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); stop_progress(&ctx.progress); - if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) { + if ((ctx.m && ctx.nr == ctx.m->num_packs) && + !(packs_to_include || packs_to_drop)) { struct bitmap_index *bitmap_git; int bitmap_exists; int want_bitmap = flags & MIDX_WRITE_BITMAP; @@ -1237,7 +1255,7 @@ static int write_midx_internal(const char *object_dir, QSORT(ctx.info, ctx.nr, pack_info_compare); - if (packs_to_drop && packs_to_drop->nr) { + if (ctx.m && packs_to_drop && packs_to_drop->nr) { int drop_index = 0; int missing_drops = 0; @@ -1380,7 +1398,17 @@ int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags) { - return write_midx_internal(object_dir, NULL, preferred_pack_name, flags); + return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, + flags); +} + +int write_midx_file_only(const char *object_dir, + struct string_list *packs_to_include, + const char *preferred_pack_name, + unsigned flags) +{ + return write_midx_internal(object_dir, packs_to_include, NULL, + preferred_pack_name, flags); } struct clear_midx_data { @@ -1660,7 +1688,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); if (packs_to_drop.nr) { - result = write_midx_internal(object_dir, &packs_to_drop, NULL, flags); + result = write_midx_internal(object_dir, NULL, &packs_to_drop, NULL, flags); m = NULL; } @@ -1851,7 +1879,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, NULL, NULL, 
flags); + result = write_midx_internal(object_dir, NULL, NULL, NULL, flags); m = NULL; cleanup: diff --git a/midx.h b/midx.h index aa3da557bb..aefa371c90 100644 --- a/midx.h +++ b/midx.h @@ -2,6 +2,7 @@ #define MIDX_H #include "repository.h" +#include "string-list.h" struct object_id; struct pack_entry; @@ -62,6 +63,7 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local); int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags); +int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags); void clear_midx_file(struct repository *r); int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);

From patchwork Sat Sep 11 03:32:31 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486135
Date: Fri, 10 Sep 2021 23:32:31 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 2/8] builtin/multi-pack-index.c: support --stdin-packs mode
Message-ID: <2a16f11790b79ab452233b6f28acac607c0abd28.1631331139.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

To power a new `--write-midx` mode, `git repack` will want to write a multi-pack index containing a certain set of packs in the repository.

This new option will be used by `git repack` to write a MIDX which contains only the packs which will survive after the repack (that is, it will exclude any packs which are about to be deleted).

This patch effectively exposes the function implemented in the previous commit via the `git multi-pack-index` builtin. An alternative approach would have been to call that function from the `git repack` builtin directly, but this introduces awkward problems around closing and reopening the object store, so the MIDX will be written out-of-process.

Signed-off-by: Taylor Blau
---
Documentation/git-multi-pack-index.txt | 4 ++++
builtin/multi-pack-index.c | 26 ++++++++++++++++++++++++++
t/t5319-multi-pack-index.sh | 15 +++++++++++++++
3 files changed, 45 insertions(+)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index a9df3dbd32..009c989ef8 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -45,6 +45,10 @@ write:: --[no-]bitmap:: Control whether or not a multi-pack bitmap is written. 
+ + --stdin-packs:: + Write a multi-pack index containing only the set of + line-delimited pack index basenames provided over stdin. -- verify:: diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 73c0113b48..77488b6b7b 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -47,6 +47,7 @@ static struct opts_multi_pack_index { const char *preferred_pack; unsigned long batch_size; unsigned flags; + int stdin_packs; } opts; static struct option common_opts[] = { @@ -61,6 +62,15 @@ static struct option *add_common_options(struct option *prev) return parse_options_concat(common_opts, prev); } +static void read_packs_from_stdin(struct string_list *to) +{ + struct strbuf buf = STRBUF_INIT; + while (strbuf_getline(&buf, stdin) != EOF) { + string_list_append(to, strbuf_detach(&buf, NULL)); + } + string_list_sort(to); +} + static int cmd_multi_pack_index_write(int argc, const char **argv) { struct option *options; @@ -70,6 +80,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) N_("pack for reuse when computing a multi-pack bitmap")), OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"), MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), + OPT_BOOL(0, "stdin-packs", &opts.stdin_packs, + N_("write multi-pack index containing only given indexes")), OPT_END(), }; @@ -86,6 +98,20 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) FREE_AND_NULL(options); + if (opts.stdin_packs) { + struct string_list packs = STRING_LIST_INIT_NODUP; + int ret; + + read_packs_from_stdin(&packs); + + ret = write_midx_file_only(opts.object_dir, &packs, + opts.preferred_pack, opts.flags); + + string_list_clear(&packs, 0); + + return ret; + + } return write_midx_file(opts.object_dir, opts.preferred_pack, opts.flags); } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index bb04f0f23b..385f0a3efd 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -168,6 +168,21 @@ 
test_expect_success 'write midx with two packs' ' compare_results_with_midx "two packs" +test_expect_success 'write midx with --stdin-packs' ' + rm -fr $objdir/pack/multi-pack-index && + + idx="$(find $objdir/pack -name "test-2-*.idx")" && + basename "$idx" >in && + + git multi-pack-index write --stdin-packs packs && + + test_cmp packs in +' + +compare_results_with_midx "mixed mode (one pack + extra)" + test_expect_success 'write progress off for redirected stderr' ' git multi-pack-index --object-dir=$objdir write 2>err && test_line_count = 0 err

From patchwork Sat Sep 11 03:32:34 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486139
Date: Fri, 10 Sep 2021 23:32:34 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 3/8] midx: preliminary support for `--refs-snapshot`
Message-ID: <137759fe6c7c095b980d55dc4ba37f663044a6dd.1631331139.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

To figure out which commits we can write a bitmap for, the multi-pack index/bitmap code does a reachability traversal, marking any commit which can be found in the MIDX as eligible to receive a bitmap.

This approach will cause a problem when multi-pack bitmaps are able to be generated from `git repack`, since the reference tips can change during the repack. Even though we ignore commits that don't exist in the MIDX (when doing a scan of the ref tips), it's possible that a commit in the MIDX reaches something that isn't. This can happen when a multi-pack index contains some pack which refers to loose objects (which by definition aren't included in the multi-pack index).

By taking a snapshot of the references before we start repacking, we can close that race window. In the above scenario (where we have a packed object pointing at a loose one), we'll either (a) take a snapshot of the references before seeing the packed one, or (b) take it after, at which point we can guarantee that the loose object will be packed and included in the MIDX.

This patch does just that. It writes a temporary "reference snapshot", which is a list of OIDs that are at the ref tips before writing a multi-pack bitmap. References that are "preferred" (i.e., are a suffix of at least one value of the 'pack.preferBitmapTips' configuration) are marked with a special '+'.
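To make the snapshot format concrete, here is a minimal standalone sketch of parsing a single snapshot line: an object ID in hex, optionally prefixed with '+' to mark it as preferred. It is not the `read_refs_snapshot()` function from this patch. The `snapshot_entry` struct and `parse_snapshot_line()` are hypothetical names, and the sketch assumes 40-character SHA-1 hex IDs rather than using git's `parse_oid_hex()`.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define HEX_LEN 40 /* assumption: SHA-1 object IDs, 40 hex characters */

/* Hypothetical result of parsing one snapshot line. */
struct snapshot_entry {
	char hex[HEX_LEN + 1]; /* NUL-terminated hex object ID */
	int preferred;         /* 1 if the line started with '+' */
};

/*
 * Parse one line of a refs snapshot (without its trailing newline).
 * Returns 0 on success, -1 if the line is malformed, i.e. it is not
 * exactly HEX_LEN hex digits after the optional '+' marker.
 */
static int parse_snapshot_line(const char *line, struct snapshot_entry *out)
{
	size_t i;

	assert(line && out);

	out->preferred = 0;
	if (*line == '+') {
		out->preferred = 1;
		line++;
	}

	/* Reject short lines and lines with trailing garbage. */
	if (strlen(line) != HEX_LEN)
		return -1;
	for (i = 0; i < HEX_LEN; i++)
		if (!isxdigit((unsigned char)line[i]))
			return -1;

	memcpy(out->hex, line, HEX_LEN + 1); /* copies the NUL as well */
	return 0;
}
```

A reader of the snapshot file would call this once per line and, since the file may contain duplicates, treat an OID as preferred if any of its lines carried the '+' marker.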
The format is simple: one line per commit at each tip, with an optional '+' at the beginning (for preferred references, as described above). When provided, the reference snapshot is used to drive bitmap selection instead of the MIDX code doing its own traversal. When it isn't provided, the usual traversal takes place instead. Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 15 +++++ builtin/multi-pack-index.c | 11 +++- builtin/repack.c | 2 +- midx.c | 60 ++++++++++++++++--- midx.h | 4 +- t/t5326-multi-pack-bitmaps.sh | 82 ++++++++++++++++++++++++++ 6 files changed, 160 insertions(+), 14 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 009c989ef8..27f83932e4 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -49,6 +49,21 @@ write:: --stdin-packs:: Write a multi-pack index containing only the set of line-delimited pack index basenames provided over stdin. + + --refs-snapshot=:: + With `--bitmap`, optionally specify a file which + contains a "refs snapshot" taken prior to repacking. ++ +A reference snapshot is composed of line-delimited OIDs corresponding to +the reference tips, usually taken by `git repack` prior to generating a +new pack. A line may optionally start with a `+` character to indicate +that the reference which corresponds to that OID is "preferred" (see +linkgit:git-config[1]'s `pack.preferBitmapTips`.) ++ +The file given at `` is expected to be readable, and can contain +duplicates. (If a given OID is given more than once, it is marked as +preferred if at least one instance of it begins with the special `+` +marker). 
-- verify:: diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 77488b6b7b..65a242f5cf 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -7,7 +7,8 @@ #include "object-store.h" #define BUILTIN_MIDX_WRITE_USAGE \ - N_("git multi-pack-index [] write [--preferred-pack=]") + N_("git multi-pack-index [] write [--preferred-pack=]" \ + "[--refs-snapshot=]") #define BUILTIN_MIDX_VERIFY_USAGE \ N_("git multi-pack-index [] verify") @@ -45,6 +46,7 @@ static char const * const builtin_multi_pack_index_usage[] = { static struct opts_multi_pack_index { const char *object_dir; const char *preferred_pack; + const char *refs_snapshot; unsigned long batch_size; unsigned flags; int stdin_packs; @@ -82,6 +84,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), OPT_BOOL(0, "stdin-packs", &opts.stdin_packs, N_("write multi-pack index containing only given indexes")), + OPT_FILENAME(0, "refs-snapshot", &opts.refs_snapshot, + N_("refs snapshot for selecting bitmap commits")), OPT_END(), }; @@ -105,7 +109,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) read_packs_from_stdin(&packs); ret = write_midx_file_only(opts.object_dir, &packs, - opts.preferred_pack, opts.flags); + opts.preferred_pack, + opts.refs_snapshot, opts.flags); string_list_clear(&packs, 0); @@ -113,7 +118,7 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) } return write_midx_file(opts.object_dir, opts.preferred_pack, - opts.flags); + opts.refs_snapshot, opts.flags); } static int cmd_multi_pack_index_verify(int argc, const char **argv) diff --git a/builtin/repack.c b/builtin/repack.c index 82ab668272..27158a897b 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -733,7 +733,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) unsigned flags = 0; if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) flags |= MIDX_WRITE_BITMAP | 
MIDX_WRITE_REV_INDEX; - write_midx_file(get_object_directory(), NULL, flags); + write_midx_file(get_object_directory(), NULL, NULL, flags); } string_list_clear(&names, 0); diff --git a/midx.c b/midx.c index 29d1d107b3..46ded7a4cf 100644 --- a/midx.c +++ b/midx.c @@ -970,7 +970,42 @@ static void bitmap_show_commit(struct commit *commit, void *_data) data->commits[data->commits_nr++] = commit; } +static int read_refs_snapshot(const char *refs_snapshot, + struct rev_info *revs) +{ + struct strbuf buf = STRBUF_INIT; + struct object_id oid; + FILE *f = xfopen(refs_snapshot, "r"); + + while (strbuf_getline(&buf, f) != EOF) { + struct object *object; + int preferred = 0; + char *hex = buf.buf; + const char *end = NULL; + + if (buf.len && *buf.buf == '+') { + preferred = 1; + hex = &buf.buf[1]; + } + + if (parse_oid_hex(hex, &oid, &end) < 0) + die(_("could not parse line: %s"), buf.buf); + if (*end) + die(_("malformed line: %s"), buf.buf); + + object = parse_object_or_die(&oid, NULL); + if (preferred) + object->flags |= NEEDS_BITMAP; + + add_pending_object(revs, object, ""); + } + + fclose(f); + return 0; +} + static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p, + const char *refs_snapshot, struct write_midx_context *ctx) { struct rev_info revs; @@ -979,8 +1014,12 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr cb.ctx = ctx; repo_init_revisions(the_repository, &revs, NULL); - setup_revisions(0, NULL, &revs, NULL); - for_each_ref(add_ref_to_pending, &revs); + if (refs_snapshot) { + read_refs_snapshot(refs_snapshot, &revs); + } else { + setup_revisions(0, NULL, &revs, NULL); + for_each_ref(add_ref_to_pending, &revs); + } /* * Skipping promisor objects here is intentional, since it only excludes @@ -1009,6 +1048,7 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, struct write_midx_context *ctx, + const 
char *refs_snapshot, unsigned flags) { struct packing_data pdata; @@ -1020,7 +1060,7 @@ static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, prepare_midx_packing_data(&pdata, ctx); - commits = find_commits_for_midx_bitmap(&commits_nr, ctx); + commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, ctx); /* * Build the MIDX-order index based on pdata.objects (which is already @@ -1068,6 +1108,7 @@ static int write_midx_internal(const char *object_dir, struct string_list *packs_to_include, struct string_list *packs_to_drop, const char *preferred_pack_name, + const char *refs_snapshot, unsigned flags) { char *midx_name; @@ -1361,7 +1402,8 @@ static int write_midx_internal(const char *object_dir, if (flags & MIDX_WRITE_REV_INDEX) write_midx_reverse_index(midx_name, midx_hash, &ctx); if (flags & MIDX_WRITE_BITMAP) { - if (write_midx_bitmap(midx_name, midx_hash, &ctx, flags) < 0) { + if (write_midx_bitmap(midx_name, midx_hash, &ctx, + refs_snapshot, flags) < 0) { error(_("could not write multi-pack bitmap")); result = 1; goto cleanup; @@ -1396,19 +1438,21 @@ static int write_midx_internal(const char *object_dir, int write_midx_file(const char *object_dir, const char *preferred_pack_name, + const char *refs_snapshot, unsigned flags) { return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, - flags); + refs_snapshot, flags); } int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, + const char *refs_snapshot, unsigned flags) { return write_midx_internal(object_dir, packs_to_include, NULL, - preferred_pack_name, flags); + preferred_pack_name, refs_snapshot, flags); } struct clear_midx_data { @@ -1688,7 +1732,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); if (packs_to_drop.nr) { - result = write_midx_internal(object_dir, NULL, &packs_to_drop, NULL, flags); + result = write_midx_internal(object_dir, NULL, 
&packs_to_drop, NULL, NULL, flags); m = NULL; } @@ -1879,7 +1923,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, NULL, NULL, NULL, flags); + result = write_midx_internal(object_dir, NULL, NULL, NULL, NULL, flags); m = NULL; cleanup: diff --git a/midx.h b/midx.h index aefa371c90..dc37b94ea1 100644 --- a/midx.h +++ b/midx.h @@ -62,8 +62,8 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local); -int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags); -int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags); +int write_midx_file(const char *object_dir, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags); +int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags); void clear_midx_file(struct repository *r); int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags); diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh index 4ad7c2c969..069dab3e17 100755 --- a/t/t5326-multi-pack-bitmaps.sh +++ b/t/t5326-multi-pack-bitmaps.sh @@ -283,4 +283,86 @@ test_expect_success 'pack.preferBitmapTips' ' ) ' +test_expect_success 'writing a bitmap with --refs-snapshot' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit one && + test_commit two && + + git rev-parse one >snapshot && + + git repack -ad && + + # First, write a MIDX which see both refs/tags/one and + # refs/tags/two 
(causing both of those commits to receive + # bitmaps). + git multi-pack-index write --bitmap && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + + test-tool bitmap list-commits | sort >bitmaps && + grep "$(git rev-parse one)" bitmaps && + grep "$(git rev-parse two)" bitmaps && + + rm -fr $midx-$(midx_checksum $objdir).bitmap && + rm -fr $midx-$(midx_checksum $objdir).rev && + rm -fr $midx && + + # Then again, but with a refs snapshot which only sees + # refs/tags/one. + git multi-pack-index write --bitmap --refs-snapshot=snapshot && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + + test-tool bitmap list-commits | sort >bitmaps && + grep "$(git rev-parse one)" bitmaps && + ! grep "$(git rev-parse two)" bitmaps + ) +' + +test_expect_success 'write a bitmap with --refs-snapshot (preferred tips)' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit_bulk --message="%s" 103 && + + git log --format="%H" >commits.raw && + sort commits && + + git log --format="create refs/tags/%s %H" HEAD >refs && + git update-ref --stdin bitmaps && + comm -13 bitmaps commits >before && + test_line_count = 1 before && + + ( + grep -vf before commits.raw && + # mark missing commits as preferred + sed "s/^/+/" before + ) >snapshot && + + rm -fr $midx-$(midx_checksum $objdir).bitmap && + rm -fr $midx-$(midx_checksum $objdir).rev && + rm -fr $midx && + + git multi-pack-index write --bitmap --refs-snapshot=snapshot && + test-tool bitmap list-commits | sort >bitmaps && + comm -13 bitmaps commits >after && + + ! 
test_cmp before after + ) +' + test_done
From patchwork Sat Sep 11 03:32:36 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486137
Date: Fri, 10 Sep 2021 23:32:36 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 4/8] builtin/repack.c: keep track of existing packs unconditionally
Message-ID: <243cf5f82cabfb930395ff3cb97d1354f4b16064.1631331139.git.me@ttaylorr.com>

In order to write a multi-pack index during repacking, `git repack` must keep track of which packs it wants to write into the MIDX. This set is the union of existing packs which will not be deleted, new pack(s) generated as a result of the repack, and .keep packs.

Prior to this patch, `git repack` populated the list of existing packs only when repacking all-into-one (i.e., with `-A` or `-a`), but we will soon need this list when writing a MIDX from a repack that is not all-into-one (a-i-o).

Populate the list of existing packs unconditionally, and only delete the packs on that list when repacking a-i-o.

Additionally, keep track of the filenames of kept packs separately, since this, too, will be used in an upcoming patch.

Signed-off-by: Taylor Blau
--- builtin/repack.c | 49 ++++++++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 27158a897b..e55a650de5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -98,8 +98,9 @@ static void remove_pack_on_signal(int signo) * have a corresponding .keep file. These packs are not to * be kept if we are going to pack everything into one file.
*/ -static void get_non_kept_pack_filenames(struct string_list *fname_list, - const struct string_list *extra_keep) +static void collect_pack_filenames(struct string_list *fname_list, + struct string_list *fname_kept_list, + const struct string_list *extra_keep) { DIR *dir; struct dirent *e; @@ -112,21 +113,20 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list, size_t len; int i; + if (!strip_suffix(e->d_name, ".pack", &len)) + continue; + for (i = 0; i < extra_keep->nr; i++) if (!fspathcmp(e->d_name, extra_keep->items[i].string)) break; - if (extra_keep->nr > 0 && i < extra_keep->nr) - continue; - - if (!strip_suffix(e->d_name, ".pack", &len)) - continue; fname = xmemdupz(e->d_name, len); - if (!file_exists(mkpath("%s/%s.keep", packdir, fname))) - string_list_append_nodup(fname_list, fname); + if ((extra_keep->nr > 0 && i < extra_keep->nr) || + (file_exists(mkpath("%s/%s.keep", packdir, fname)))) + string_list_append_nodup(fname_kept_list, fname); else - free(fname); + string_list_append_nodup(fname_list, fname); } closedir(dir); } @@ -440,6 +440,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct string_list names = STRING_LIST_INIT_DUP; struct string_list rollback = STRING_LIST_INIT_NODUP; struct string_list existing_packs = STRING_LIST_INIT_DUP; + struct string_list existing_kept_packs = STRING_LIST_INIT_DUP; struct pack_geometry *geometry = NULL; struct strbuf line = STRBUF_INIT; int i, ext, ret; @@ -572,9 +573,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (use_delta_islands) strvec_push(&cmd.args, "--delta-islands"); + collect_pack_filenames(&existing_packs, &existing_kept_packs, + &keep_pack_list); + if (pack_everything & ALL_INTO_ONE) { - get_non_kept_pack_filenames(&existing_packs, &keep_pack_list); - repack_promisor_objects(&po_args, &names); if (existing_packs.nr && delete_redundant) { @@ -683,17 +685,19 @@ int cmd_repack(int argc, const char **argv, const char *prefix) 
reprepare_packed_git(the_repository); if (delete_redundant) { - const int hexsz = the_hash_algo->hexsz; int opts = 0; - string_list_sort(&names); - for_each_string_list_item(item, &existing_packs) { - char *sha1; - size_t len = strlen(item->string); - if (len < hexsz) - continue; - sha1 = item->string + len - hexsz; - if (!string_list_has_string(&names, sha1)) - remove_redundant_pack(packdir, item->string); + if (pack_everything & ALL_INTO_ONE) { + const int hexsz = the_hash_algo->hexsz; + string_list_sort(&names); + for_each_string_list_item(item, &existing_packs) { + char *sha1; + size_t len = strlen(item->string); + if (len < hexsz) + continue; + sha1 = item->string + len - hexsz; + if (!string_list_has_string(&names, sha1)) + remove_redundant_pack(packdir, item->string); + } } if (geometry) { @@ -739,6 +743,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) string_list_clear(&names, 0); string_list_clear(&rollback, 0); string_list_clear(&existing_packs, 0); + string_list_clear(&existing_kept_packs, 0); clear_pack_geometry(geometry); strbuf_release(&line);
From patchwork Sat Sep 11 03:32:42 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486141
Date: Fri, 10 Sep 2021 23:32:42 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 5/8] builtin/repack.c: extract showing progress to a variable

We only ask whether stderr is a tty before calling 'prune_packed_objects()', but the subsequent patch will add another use. Extract this check into a variable so that both can use it without having to call 'isatty()' twice.

Signed-off-by: Taylor Blau
--- builtin/repack.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/repack.c b/builtin/repack.c index e55a650de5..be470546e6 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -445,6 +445,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct strbuf line = STRBUF_INIT; int i, ext, ret; FILE *out; + int show_progress = isatty(2); /* variables to be filled by option parsing */ int pack_everything = 0; @@ -718,7 +719,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } strbuf_release(&buf); } - if (!po_args.quiet && isatty(2)) + if (!po_args.quiet && show_progress) opts |= PRUNE_PACKED_VERBOSE; prune_packed_objects(opts);
From patchwork Sat Sep 11 03:32:44 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486143
Date: Fri, 10 Sep 2021 23:32:44 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 6/8] builtin/repack.c: support writing a MIDX while repacking
Message-ID: <06b99a0ab5ef52165538f1fe028515097276ed67.1631331139.git.me@ttaylorr.com>

Teach `git repack` a new `--write-midx` option for callers that wish to persist a multi-pack index in their repository while repacking.

There are two existing alternatives to this new flag, but they don't cover our particular use-case. These alternatives are:

  - Call 'git multi-pack-index write' after running 'git repack', or

  - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running 'git repack'.

The former works, but introduces a gap in bitmap coverage between repacking and writing a new MIDX (since the repack may have deleted a pack included in the existing MIDX, invalidating it altogether).

Setting the 'GIT_TEST_' environment variable is obviously unsupported. In fact, even if it were supported officially, it still wouldn't work, because it generates the MIDX *after* redundant packs have been dropped, leading to the same issue as above.

Introduce a new option which eliminates this race by teaching `git repack` to generate the MIDX at the critical point: after the new packs have been written and moved into place, but before the redundant packs have been removed.

This option is compatible with `git repack`'s '--write-bitmap-index' ('-b') option (it changes the interpretation to be: "write a bitmap corresponding to the MIDX after one has been generated").

There is a little bit of additional noise in the patch below to avoid repeating ourselves when selecting which packs to delete.
Instead of a single loop as before (where we iterate over 'existing_packs', decide if a pack is worth deleting, and if so, delete it), we have two loops (the first where we decide which ones are worth deleting, and the second where we actually do the deleting). This makes it so we have a single check we can make consistently when (1) telling the MIDX which packs we want to exclude, and (2) actually unlinking the redundant packs. Signed-off-by: Taylor Blau --- Documentation/git-repack.txt | 14 ++-- builtin/repack.c | 132 ++++++++++++++++++++++++++++++----- t/lib-midx.sh | 8 +++ t/t7700-repack.sh | 96 +++++++++++++++++++++++++ 4 files changed, 227 insertions(+), 23 deletions(-) create mode 100644 t/lib-midx.sh diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 24c00c9384..0f2d235ca5 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -9,7 +9,7 @@ git-repack - Pack unpacked objects in a repository SYNOPSIS -------- [verse] -'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [--window=] [--depth=] [--threads=] [--keep-pack=] +'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m] [--window=] [--depth=] [--threads=] [--keep-pack=] [--write-midx] DESCRIPTION ----------- @@ -128,10 +128,11 @@ depth is 4095. -b:: --write-bitmap-index:: Write a reachability bitmap index as part of the repack. This - only makes sense when used with `-a` or `-A`, as the bitmaps + only makes sense when used with `-a`, `-A` or `-m`, as the bitmaps must be able to refer to all reachable objects. This option - overrides the setting of `repack.writeBitmaps`. This option - has no effect if multiple packfiles are created. + overrides the setting of `repack.writeBitmaps`. This option + has no effect if multiple packfiles are created, unless writing a + MIDX (in which case a multi-pack bitmap is created). --pack-kept-objects:: Include objects in `.keep` files when repacking. 
Note that we @@ -190,6 +191,11 @@ to change in the future. This option (implying a drastically different repack mode) is not guaranteed to work with all other combinations of option to `git repack`. +-m:: +--write-midx:: + Write a multi-pack index (see linkgit:git-multi-pack-index[1]) + containing the non-redundant packs. + CONFIGURATION ------------- diff --git a/builtin/repack.c b/builtin/repack.c index be470546e6..cb4292ab37 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -433,6 +433,76 @@ static void clear_pack_geometry(struct pack_geometry *geometry) geometry->split = 0; } +static void midx_included_packs(struct string_list *include, + struct string_list *existing_packs, + struct string_list *existing_kept_packs, + struct string_list *names, + struct pack_geometry *geometry) +{ + struct string_list_item *item; + + for_each_string_list_item(item, existing_kept_packs) { + string_list_insert(include, xstrfmt("%s.idx", item->string)); + } + for_each_string_list_item(item, names) { + string_list_insert(include, xstrfmt("pack-%s.idx", item->string)); + } + if (geometry) { + struct strbuf buf = STRBUF_INIT; + uint32_t i; + for (i = geometry->split; i < geometry->pack_nr; i++) { + struct packed_git *p = geometry->pack[i]; + + strbuf_addstr(&buf, pack_basename(p)); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".idx"); + + string_list_insert(include, strbuf_detach(&buf, NULL)); + } + } else { + for_each_string_list_item(item, existing_packs) { + if (item->util) + continue; + string_list_insert(include, xstrfmt("%s.idx", item->string)); + } + } +} + +static int write_midx_included_packs(struct string_list *include, + int show_progress, int write_bitmaps) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list_item *item; + FILE *in; + int ret; + + cmd.in = -1; + cmd.git_cmd = 1; + + strvec_push(&cmd.args, "multi-pack-index"); + strvec_pushl(&cmd.args, "write", "--stdin-packs", NULL); + + if (show_progress) + strvec_push(&cmd.args, 
"--progress"); + else + strvec_push(&cmd.args, "--no-progress"); + + if (write_bitmaps) + strvec_push(&cmd.args, "--bitmap"); + + ret = start_command(&cmd); + if (ret) + return ret; + + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, include) { + fprintf(in, "%s\n", item->string); + } + fclose(in); + + return finish_command(&cmd); +} + int cmd_repack(int argc, const char **argv, const char *prefix) { struct child_process cmd = CHILD_PROCESS_INIT; @@ -456,6 +526,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) int no_update_server_info = 0; struct pack_objects_args po_args = {NULL}; int geometric_factor = 0; + int write_midx = 0; struct option builtin_repack_options[] = { OPT_BIT('a', NULL, &pack_everything, @@ -498,6 +569,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("do not repack this pack")), OPT_INTEGER('g', "geometric", &geometric_factor, N_("find a geometric progression with factor ")), + OPT_BOOL('m', "write-midx", &write_midx, + N_("write a multi-pack index of the resulting packs")), OPT_END() }; @@ -514,8 +587,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix) die(_("--keep-unreachable and -A are incompatible")); if (write_bitmaps < 0) { - if (!(pack_everything & ALL_INTO_ONE) || - !is_bare_repository()) + if (!write_midx && + (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository())) write_bitmaps = 0; } else if (write_bitmaps && git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) && @@ -525,7 +598,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (pack_kept_objects < 0) pack_kept_objects = write_bitmaps > 0; - if (write_bitmaps && !(pack_everything & ALL_INTO_ONE)) + if (write_bitmaps && !(pack_everything & ALL_INTO_ONE) && !write_midx) die(_(incremental_bitmap_conflict_error)); if (geometric_factor) { @@ -567,10 +640,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } if (has_promisor_remote()) strvec_push(&cmd.args, 
"--exclude-promisor-objects"); - if (write_bitmaps > 0) - strvec_push(&cmd.args, "--write-bitmap-index"); - else if (write_bitmaps < 0) - strvec_push(&cmd.args, "--write-bitmap-index-quiet"); + if (!write_midx) { + if (write_bitmaps > 0) + strvec_push(&cmd.args, "--write-bitmap-index"); + else if (write_bitmaps < 0) + strvec_push(&cmd.args, "--write-bitmap-index-quiet"); + } if (use_delta_islands) strvec_push(&cmd.args, "--delta-islands"); @@ -683,22 +758,41 @@ int cmd_repack(int argc, const char **argv, const char *prefix) } /* End of pack replacement. */ + if (delete_redundant && pack_everything & ALL_INTO_ONE) { + const int hexsz = the_hash_algo->hexsz; + string_list_sort(&names); + for_each_string_list_item(item, &existing_packs) { + char *sha1; + size_t len = strlen(item->string); + if (len < hexsz) + continue; + sha1 = item->string + len - hexsz; + item->util = (void*)(intptr_t)!string_list_has_string(&names, sha1); + } + } + + if (write_midx) { + struct string_list include = STRING_LIST_INIT_NODUP; + midx_included_packs(&include, &existing_packs, + &existing_kept_packs, &names, geometry); + + ret = write_midx_included_packs(&include, + show_progress, write_bitmaps > 0); + + string_list_clear(&include, 0); + + if (ret) + return ret; + } + reprepare_packed_git(the_repository); if (delete_redundant) { int opts = 0; - if (pack_everything & ALL_INTO_ONE) { - const int hexsz = the_hash_algo->hexsz; - string_list_sort(&names); - for_each_string_list_item(item, &existing_packs) { - char *sha1; - size_t len = strlen(item->string); - if (len < hexsz) - continue; - sha1 = item->string + len - hexsz; - if (!string_list_has_string(&names, sha1)) - remove_redundant_pack(packdir, item->string); - } + for_each_string_list_item(item, &existing_packs) { + if (!item->util) + continue; + remove_redundant_pack(packdir, item->string); } if (geometry) { diff --git a/t/lib-midx.sh b/t/lib-midx.sh new file mode 100644 index 0000000000..1261994744 --- /dev/null +++ b/t/lib-midx.sh @@ 
-0,0 +1,8 @@ +# test_midx_consistent +test_midx_consistent () { + ls $1/pack/pack-*.idx | xargs -n 1 basename | sort >expect && + test-tool read-midx $1 | grep ^pack-.*\.idx$ | sort >actual && + + test_cmp expect actual && + git multi-pack-index --object-dir=$1 verify +} diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 98eda3bfeb..6792531dfd 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -3,6 +3,8 @@ test_description='git repack works correctly' . ./test-lib.sh +. "${TEST_DIRECTORY}/lib-bitmap.sh" +. "${TEST_DIRECTORY}/lib-midx.sh" commit_and_pack () { test_commit "$@" 1>&2 && @@ -234,4 +236,98 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' test_must_be_empty actual ' +objdir=.git/objects +midx=$objdir/pack/multi-pack-index + +test_expect_success 'setup for --write-midx tests' ' + git init midx && + ( + cd midx && + git config core.multiPackIndex true && + + test_commit base + ) +' + +test_expect_success '--write-midx unchanged' ' + ( + cd midx && + GIT_TEST_MULTI_PACK_INDEX=0 git repack && + test_path_is_missing $midx && + test_path_is_missing $midx-*.bitmap && + + GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx && + + test_path_is_file $midx && + test_path_is_missing $midx-*.bitmap && + test_midx_consistent $objdir + ) +' + +test_expect_success '--write-midx with a new pack' ' + ( + cd midx && + test_commit loose && + + GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx && + + test_path_is_file $midx && + test_path_is_missing $midx-*.bitmap && + test_midx_consistent $objdir + ) +' + +test_expect_success '--write-midx with -b' ' + ( + cd midx && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -mb && + + test_path_is_file $midx && + test_path_is_file $midx-*.bitmap && + test_midx_consistent $objdir + ) +' + +test_expect_success '--write-midx with -d' ' + ( + cd midx && + test_commit repack && + + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Ad --write-midx && + + test_path_is_file $midx && + test_path_is_missing $midx-*.bitmap && 
+ test_midx_consistent $objdir + ) +' + +test_expect_success 'cleans up MIDX when appropriate' ' + ( + cd midx && + + test_commit repack-2 && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx && + + checksum=$(midx_checksum $objdir) && + test_path_is_file $midx && + test_path_is_file $midx-$checksum.bitmap && + test_path_is_file $midx-$checksum.rev && + + test_commit repack-3 && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx && + + test_path_is_file $midx && + test_path_is_missing $midx-$checksum.bitmap && + test_path_is_missing $midx-$checksum.rev && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev && + + test_commit repack-4 && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb && + + find $objdir/pack -type f -name "multi-pack-index*" >files && + test_must_be_empty files + ) +' + test_done
From patchwork Sat Sep 11 03:32:47 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12486145
Date: Fri, 10 Sep 2021 23:32:47 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net
Subject: [PATCH 7/8] builtin/repack.c: make largest pack preferred

When repacking into a geometric series and writing a multi-pack bitmap, it is beneficial to have the largest resulting pack be the preferred object source in the bitmap's MIDX, since selecting the large packs can lead to fewer broken delta chains and better compression.

Teach 'git repack' to identify this pack and pass it to the MIDX write machinery in order to mark it as preferred.

Signed-off-by: Taylor Blau
--- Documentation/git-repack.txt | 4 ++++ builtin/repack.c | 23 ++++++++++++++++++++++- pack-bitmap.c | 2 +- pack-bitmap.h | 1 + t/helper/test-read-midx.c | 25 ++++++++++++++++++++++++- t/t7703-repack-geometric.sh | 23 +++++++++++++++++++++++ 6 files changed, 75 insertions(+), 3 deletions(-) diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 0f2d235ca5..7183fb498f 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -190,6 +190,10 @@ this "roll-up", without respect to their reachability. This is subject to change in the future. This option (implying a drastically different repack mode) is not guaranteed to work with all other combinations of option to `git repack`. ++ +When writing a multi-pack bitmap, `git repack` selects the largest resulting +pack as the preferred pack for object selection by the MIDX (see +linkgit:git-multi-pack-index[1]).
-m:: --write-midx:: diff --git a/builtin/repack.c b/builtin/repack.c index cb4292ab37..e958caff8b 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -422,6 +422,13 @@ static void split_pack_geometry(struct pack_geometry *geometry, int factor) geometry->split = split; } +static struct packed_git *get_largest_active_pack(struct pack_geometry *geometry) +{ + if (geometry->split == geometry->pack_nr) + return NULL; + return geometry->pack[geometry->pack_nr - 1]; +} + static void clear_pack_geometry(struct pack_geometry *geometry) { if (!geometry) @@ -469,6 +476,7 @@ static void midx_included_packs(struct string_list *include, } static int write_midx_included_packs(struct string_list *include, + struct pack_geometry *geometry, int show_progress, int write_bitmaps) { struct child_process cmd = CHILD_PROCESS_INIT; @@ -490,6 +498,19 @@ static int write_midx_included_packs(struct string_list *include, if (write_bitmaps) strvec_push(&cmd.args, "--bitmap"); + if (geometry) { + struct packed_git *largest = get_largest_active_pack(geometry); + if (largest) + strvec_pushf(&cmd.args, "--preferred-pack=%s", + pack_basename(largest)); + else + /* + * The largest pack was repacked, meaning only one pack + * exists (and tautologically, it is the largest). 
+ */ + ; + } + ret = start_command(&cmd); if (ret) return ret; @@ -776,7 +797,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) midx_included_packs(&include, &existing_packs, &existing_kept_packs, &names, geometry); - ret = write_midx_included_packs(&include, + ret = write_midx_included_packs(&include, geometry, show_progress, write_bitmaps > 0); string_list_clear(&include, 0); diff --git a/pack-bitmap.c b/pack-bitmap.c index 8504110a4d..67be9be9a6 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1418,7 +1418,7 @@ static int try_partial_reuse(struct packed_git *pack, return 0; } -static uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git) +uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git) { struct multi_pack_index *m = bitmap_git->midx; if (!m) diff --git a/pack-bitmap.h b/pack-bitmap.h index 469090bad2..7d407c5a4c 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -55,6 +55,7 @@ int test_bitmap_commits(struct repository *r); struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, struct list_objects_filter_options *filter, int filter_provided_objects); +uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git); int reuse_partial_packfile_from_bitmap(struct bitmap_index *, struct packed_git **packfile, uint32_t *entries, diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index cb0d27049a..0038559129 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -3,6 +3,7 @@ #include "midx.h" #include "repository.h" #include "object-store.h" +#include "pack-bitmap.h" static int read_midx_file(const char *object_dir, int show_objects) { @@ -72,14 +73,36 @@ static int read_midx_checksum(const char *object_dir) return 0; } +static int read_midx_preferred_pack(const char *object_dir) +{ + struct multi_pack_index *midx = NULL; + struct bitmap_index *bitmap = NULL; + + setup_git_directory(); + + midx = load_multi_pack_index(object_dir, 1); + if (!midx) + return 1; + + bitmap = 
prepare_bitmap_git(the_repository); + if (!(bitmap && bitmap_is_midx(bitmap))) + return 1; + + + printf("%s\n", midx->pack_names[midx_preferred_pack(bitmap)]); + return 0; +} + int cmd__read_midx(int argc, const char **argv) { if (!(argc == 2 || argc == 3)) - usage("read-midx [--show-objects|--checksum] "); + usage("read-midx [--show-objects|--checksum|--preferred-pack] "); if (!strcmp(argv[1], "--show-objects")) return read_midx_file(argv[2], 1); else if (!strcmp(argv[1], "--checksum")) return read_midx_checksum(argv[2]); + else if (!strcmp(argv[1], "--preferred-pack")) + return read_midx_preferred_pack(argv[2]); return read_midx_file(argv[1], 0); } diff --git a/t/t7703-repack-geometric.sh b/t/t7703-repack-geometric.sh index 5ccaa440e0..7bdeffa111 100755 --- a/t/t7703-repack-geometric.sh +++ b/t/t7703-repack-geometric.sh @@ -180,4 +180,27 @@ test_expect_success '--geometric ignores kept packs' ' ) ' +test_expect_success '--geometric chooses largest MIDX preferred pack' ' + git init geometric && + test_when_finished "rm -fr geometric" && + ( + cd geometric && + git config core.multiPackIndex true && + + # These packs already form a geometric progression. 
+ test_commit_bulk --start=1 1 && # 3 objects + test_commit_bulk --start=2 2 && # 6 objects + ls $objdir/pack/pack-*.idx >before && + test_commit_bulk --start=4 4 && # 12 objects + ls $objdir/pack/pack-*.idx >after && + + git repack --geometric 2 -dbm && + + comm -3 before after | xargs -n 1 basename >expect && + test-tool read-midx --preferred-pack $objdir >actual && + + test_cmp expect actual + ) +' + test_done From patchwork Sat Sep 11 03:32:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12486147 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D8D6C433F5 for ; Sat, 11 Sep 2021 03:32:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1DD0860F46 for ; Sat, 11 Sep 2021 03:32:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235263AbhIKDeI (ORCPT ); Fri, 10 Sep 2021 23:34:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34350 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235285AbhIKDeD (ORCPT ); Fri, 10 Sep 2021 23:34:03 -0400 Received: from mail-io1-xd33.google.com (mail-io1-xd33.google.com [IPv6:2607:f8b0:4864:20::d33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85C9AC061757 for ; Fri, 10 Sep 2021 20:32:51 -0700 (PDT) Received: by mail-io1-xd33.google.com with SMTP id a15so4897375iot.2 for ; Fri, 10 Sep 2021 20:32:51 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Uu0wawx66RaM8zpuLmxh0zsTeM4voy7ay4UAsO8S7Qc=; b=YbcJOOriB3AmsIGja9CMPfXP/H7AIkxkyT+qGUC7slIQUG4X2PJT24WyQJDvoEPXaE iGUxCzqujHyXLI8ZQBjSuukefXl2NoG2t9utKfl9ks/MCGorWfNFP8rwY4Qkv3/5d1X0 DiablMT2wVVxrHc12ob5qGhHJ3S8wzALhF7t0TDaaAOGFyZSQkO642i7Sj/naxSdpwJg WEQB699Y+LXt7PA2rMvn6sM20bdg9n+KL0gYHYIDdGNz0MC5oAszcmUiMFHEBTNorWZz 6rZ4pq31QTPuVVX69JwHRLLO4Rwzy8b/Sedunmhat7FmeP90ALXUKYJ0g66NqIFnSUVP dv4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Uu0wawx66RaM8zpuLmxh0zsTeM4voy7ay4UAsO8S7Qc=; b=Q/2fAK2W4zkYgDB+KYzCwOxXx4evhhnLly1X0D7NoSmV8EcDNufIvMUgLu0b80hxVW 2zHmwuzjlTYZYLGSjbgLIaTmKB0n7gdBt8CxIstNbjTTXe3RjlHC9hrIw/QXgoon+HaO lP7xiEMbNBC5uYh5aaKCFHngewVhGtgz673Jy6CVnIGKT2I46vLNzDbwSW5Z4uk5Bbns yiksy5TaZ0vcxKiRx33Rv6UB4pDIGvS1GhggvAEXDKOxe7D5Zlq/jnTuqrtNGGCrQq+4 nrTi5Ek05ZP2y8jUDEnTqiARF0nAWROVvIQXL6vzOtqMOzYAkveAa6z3uovI4A/2/KfF 414A== X-Gm-Message-State: AOAM530xbIw3TkbE9KZa1Uo8WAlaPBi/6R28gk7otZNUIw+gD/ZGpNzm lTsfPIJJT8u0mIJvIFjuMOYI01Odmw2omSKW X-Google-Smtp-Source: ABdhPJw0XZPZDZpp58SNZCuZrTw7/G9cZWVGqYX5+n2OfNDgKLaTaI3jbj8bI+MFfP28+OQT8ZQL5A== X-Received: by 2002:a5d:9ada:: with SMTP id x26mr754095ion.50.1631331170809; Fri, 10 Sep 2021 20:32:50 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id f3sm332051iow.3.2021.09.10.20.32.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 10 Sep 2021 20:32:50 -0700 (PDT) Date: Fri, 10 Sep 2021 23:32:49 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net Subject: [PATCH 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Message-ID: <6a1c52181e8c8c9fe2f0e2d7fbeb1057f68c1f3d.1631331139.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org To prevent the race described in an earlier patch, generate and pass a reference snapshot to the multi-pack bitmap code, if we are writing one from `git repack`. This patch is mostly limited to creating a temporary file, and then calling for_each_ref(), except that we try to minimize duplicates, since doing so can drastically reduce the size in network-of-forks style repositories. In the kernel's fork network (the repository containing all objects from the kernel and all its forks), deduplicating the references drops the snapshot size from 934 MB to just 12 MB. But since we're handling duplicates in this way, we have to make sure that we visit preferred references (those listed in pack.preferBitmapTips) before non-preferred ones (to avoid recording an object which is pointed at by a preferred tip as non-preferred). We accomplish this by doing separate passes over the references: first visiting each prefix in pack.preferBitmapTips, and then visiting the rest of the references. 
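
To make the ordering argument above concrete, here is a minimal, standalone sketch of the two-pass, deduplicating walk. It is not the code from this patch: every name in it (`toy_ref`, `toy_snapshot`, `toy_snapshot_refs()`, and so on) is hypothetical, a linear scan stands in for git's `oidset`, plain strings stand in for object IDs, and tag peeling and the commit-type check are left out. Only the "+" prefix on preferred entries is taken from the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a ref: its name and the ID it points at. */
struct toy_ref {
	const char *name;
	const char *oid;
};

#define TOY_MAX_SEEN 64

/* Hypothetical snapshot state: the set of IDs seen plus the output text. */
struct toy_snapshot {
	const char *seen[TOY_MAX_SEEN]; /* IDs already written */
	size_t nr;
	char out[1024];                 /* snapshot text, one ID per line */
	size_t len;
};

/* Return 1 if oid was already recorded; otherwise record it, return 0. */
static int toy_seen_insert(struct toy_snapshot *s, const char *oid)
{
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (!strcmp(s->seen[i], oid))
			return 1;
	assert(s->nr < TOY_MAX_SEEN);
	s->seen[s->nr++] = oid;
	return 0;
}

/* Write one line for ref unless its ID was written by an earlier visit. */
static void toy_visit(struct toy_snapshot *s, const struct toy_ref *ref,
		      int preferred)
{
	size_t room = sizeof(s->out) - s->len;
	int n;

	if (toy_seen_insert(s, ref->oid))
		return; /* already seen */

	n = snprintf(s->out + s->len, room, "%s%s\n",
		     preferred ? "+" : "", ref->oid);
	assert(n > 0 && (size_t)n < room);
	s->len += (size_t)n;
}

/*
 * Pass 1 visits refs under each preferred prefix and marks them "+".
 * Pass 2 visits every ref; anything pass 1 wrote is skipped, so an ID
 * reachable from a preferred tip is never recorded as non-preferred.
 */
static void toy_snapshot_refs(struct toy_snapshot *s,
			      const struct toy_ref *refs, size_t nr_refs,
			      const char **prefixes, size_t nr_prefixes)
{
	size_t i, j;

	memset(s, 0, sizeof(*s));

	for (i = 0; i < nr_prefixes; i++)
		for (j = 0; j < nr_refs; j++)
			if (!strncmp(refs[j].name, prefixes[i],
				     strlen(prefixes[i])))
				toy_visit(s, &refs[j], 1);

	for (j = 0; j < nr_refs; j++)
		toy_visit(s, &refs[j], 0);
}
```

As a usage illustration, suppose `refs/heads/topic` and `refs/tags/v1` both point at the same ID and `refs/tags/` is the only preferred prefix. The sketch then writes that ID once, with the "+" marker, even though the branch sorts first in the ref list. Walking all refs in a single pass would instead have recorded it unmarked on behalf of the branch and then dropped the tag as a duplicate.
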
Signed-off-by: Taylor Blau --- builtin/repack.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/builtin/repack.c b/builtin/repack.c index e958caff8b..3fda837cb5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -15,6 +15,8 @@ #include "promisor-remote.h" #include "shallow.h" #include "pack.h" +#include "pack-bitmap.h" +#include "refs.h" static int delta_base_offset = 1; static int pack_kept_objects = -1; @@ -440,6 +442,61 @@ static void clear_pack_geometry(struct pack_geometry *geometry) geometry->split = 0; } +struct midx_snapshot_ref_data { + struct tempfile *f; + struct oidset seen; + int preferred; +}; + +static int midx_snapshot_ref_one(const char *refname, + const struct object_id *oid, + int flag, void *_data) +{ + struct midx_snapshot_ref_data *data = _data; + struct object_id peeled; + + if (!peel_iterated_oid(oid, &peeled)) + oid = &peeled; + + if (oidset_insert(&data->seen, oid)) + return 0; /* already seen */ + + if (oid_object_info(the_repository, oid, NULL) != OBJ_COMMIT) + return 0; + + fprintf(data->f->fp, "%s%s\n", data->preferred ? 
"+" : "", + oid_to_hex(oid)); + + return 0; +} + +static int midx_snapshot_refs(struct tempfile *f) +{ + struct midx_snapshot_ref_data data; + const struct string_list *preferred = bitmap_preferred_tips(the_repository); + + data.f = f; + oidset_init(&data.seen, 0); + + if (!fdopen_tempfile(f, "w")) + die(_("could not open tempfile %s for writing"), + get_tempfile_path(f)); + + if (preferred) { + struct string_list_item *item; + + data.preferred = 1; + for_each_string_list_item(item, preferred) { + for_each_ref_in(item->string, midx_snapshot_ref_one, &data); + } + data.preferred = 0; + } + + for_each_ref(midx_snapshot_ref_one, &data); + + return close_tempfile_gently(f); +} + static void midx_included_packs(struct string_list *include, struct string_list *existing_packs, struct string_list *existing_kept_packs, @@ -477,6 +534,7 @@ static void midx_included_packs(struct string_list *include, static int write_midx_included_packs(struct string_list *include, struct pack_geometry *geometry, + const char *refs_snapshot, int show_progress, int write_bitmaps) { struct child_process cmd = CHILD_PROCESS_INIT; @@ -511,6 +569,9 @@ static int write_midx_included_packs(struct string_list *include, ; } + if (refs_snapshot) + strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot); + ret = start_command(&cmd); if (ret) return ret; @@ -534,6 +595,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) struct string_list existing_kept_packs = STRING_LIST_INIT_DUP; struct pack_geometry *geometry = NULL; struct strbuf line = STRBUF_INIT; + struct tempfile *refs_snapshot = NULL; int i, ext, ret; FILE *out; int show_progress = isatty(2); @@ -622,6 +684,19 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (write_bitmaps && !(pack_everything & ALL_INTO_ONE) && !write_midx) die(_(incremental_bitmap_conflict_error)); + if (write_midx && write_bitmaps) { + struct strbuf path = STRBUF_INIT; + + strbuf_addf(&path, "%s/%s_XXXXXX", get_object_directory(), + 
"bitmap-ref-tips"); + + refs_snapshot = xmks_tempfile(path.buf); + if (midx_snapshot_refs(refs_snapshot) < 0) + die(_("could not take a snapshot of references")); + + strbuf_release(&path); + } + if (geometric_factor) { if (pack_everything) die(_("--geometric is incompatible with -A, -a")); @@ -798,6 +873,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) &existing_kept_packs, &names, geometry); ret = write_midx_included_packs(&include, geometry, + refs_snapshot ? get_tempfile_path(refs_snapshot) : NULL, show_progress, write_bitmaps > 0); string_list_clear(&include, 0);