From patchwork Tue Aug 24 16:15:51 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455499
Date: Tue, 24 Aug 2021 12:15:51 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 01/25] pack-bitmap.c: harden 'test_bitmap_walk()' to check type bitmaps
Message-ID: <92dc0bbc0d0e3297f9eb6f51e8d6f40b367a77ca.1629821743.git.me@ttaylorr.com>

The special `--test-bitmap` mode of `git rev-list` is used to compare the
result of an object traversal with a bitmap to check its integrity. This
mode does not, however, assert that the types of reachable objects are
stored correctly.

Harden this mode by teaching it to also check that each time an object's
bit is marked, the corresponding bit should be set in exactly one of the
type bitmaps (whose type matches the object's true type).

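The "exactly one type bitmap" rule can be illustrated with a minimal,
standalone sketch. This is not code from the patch: the `sketch_*` names,
the `SK_*` enum, and the plain 64-bit word arrays are hypothetical stand-ins
for Git's `struct bitmap` and `enum object_type`, chosen so the example
compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Git's object types (not Git's real enum). */
enum sketch_type { SK_NONE = 0, SK_COMMIT, SK_TREE, SK_BLOB, SK_TAG };

/* The four type bitmaps as plain word arrays: commits, trees, blobs, tags. */
struct sketch_type_bitmaps {
	const uint64_t *words[4];
};

/* Return 1 if bit `pos` is set in `words`, 0 otherwise. */
static int sketch_bit_get(const uint64_t *words, size_t pos)
{
	return (int)((words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * Return the single type whose bitmap has bit `pos` set, or SK_NONE when
 * the bit is set in no type bitmap or in more than one of them.
 */
static enum sketch_type sketch_unique_type(const struct sketch_type_bitmaps *tb,
					   size_t pos)
{
	static const enum sketch_type order[4] = {
		SK_COMMIT, SK_TREE, SK_BLOB, SK_TAG
	};
	enum sketch_type found = SK_NONE;
	int nr = 0;
	size_t i;

	for (i = 0; i < 4; i++) {
		if (sketch_bit_get(tb->words[i], pos)) {
			found = order[i];
			nr++;
		}
	}
	return nr == 1 ? found : SK_NONE;
}

/* Return 1 if the type bitmaps agree with the object's real type. */
static int sketch_type_ok(const struct sketch_type_bitmaps *tb,
			  size_t pos, enum sketch_type real)
{
	return real != SK_NONE && sketch_unique_type(tb, pos) == real;
}
```

Unlike the patch, which calls die() with a distinct message for each
failure, the sketch folds "no type" and "more than one type" into a single
SK_NONE result to stay short.
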
Co-authored-by: Jeff King
Signed-off-by: Jeff King
Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index d999616c9e..9b11af87aa 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1325,10 +1325,52 @@ void count_bitmap_commit_list(struct bitmap_index *bitmap_git,
 struct bitmap_test_data {
 	struct bitmap_index *bitmap_git;
 	struct bitmap *base;
+	struct bitmap *commits;
+	struct bitmap *trees;
+	struct bitmap *blobs;
+	struct bitmap *tags;
 	struct progress *prg;
 	size_t seen;
 };
 
+static void test_bitmap_type(struct bitmap_test_data *tdata,
+			     struct object *obj, int pos)
+{
+	enum object_type bitmap_type = OBJ_NONE;
+	int bitmaps_nr = 0;
+
+	if (bitmap_get(tdata->commits, pos)) {
+		bitmap_type = OBJ_COMMIT;
+		bitmaps_nr++;
+	}
+	if (bitmap_get(tdata->trees, pos)) {
+		bitmap_type = OBJ_TREE;
+		bitmaps_nr++;
+	}
+	if (bitmap_get(tdata->blobs, pos)) {
+		bitmap_type = OBJ_BLOB;
+		bitmaps_nr++;
+	}
+	if (bitmap_get(tdata->tags, pos)) {
+		bitmap_type = OBJ_TAG;
+		bitmaps_nr++;
+	}
+
+	if (bitmap_type == OBJ_NONE)
+		die("object %s not found in type bitmaps",
+		    oid_to_hex(&obj->oid));
+
+	if (bitmaps_nr > 1)
+		die("object %s does not have a unique type",
+		    oid_to_hex(&obj->oid));
+
+	if (bitmap_type != obj->type)
+		die("object %s: real type %s, expected: %s",
+		    oid_to_hex(&obj->oid),
+		    type_name(obj->type),
+		    type_name(bitmap_type));
+}
+
 static void test_show_object(struct object *object, const char *name,
 			     void *data)
 {
@@ -1338,6 +1380,7 @@ static void test_show_object(struct object *object, const char *name,
 	bitmap_pos = bitmap_position(tdata->bitmap_git, &object->oid);
 	if (bitmap_pos < 0)
 		die("Object not in bitmap: %s\n", oid_to_hex(&object->oid));
+	test_bitmap_type(tdata, object, bitmap_pos);
 
 	bitmap_set(tdata->base, bitmap_pos);
 	display_progress(tdata->prg, ++tdata->seen);
@@ -1352,6 +1395,7 @@ static void test_show_commit(struct commit *commit, void *data)
 				     &commit->object.oid);
 	if (bitmap_pos < 0)
 		die("Object not in bitmap: %s\n", oid_to_hex(&commit->object.oid));
+	test_bitmap_type(tdata, &commit->object, bitmap_pos);
 
 	bitmap_set(tdata->base, bitmap_pos);
 	display_progress(tdata->prg, ++tdata->seen);
@@ -1399,6 +1443,10 @@ void test_bitmap_walk(struct rev_info *revs)
 
 	tdata.bitmap_git = bitmap_git;
 	tdata.base = bitmap_new();
+	tdata.commits = ewah_to_bitmap(bitmap_git->commits);
+	tdata.trees = ewah_to_bitmap(bitmap_git->trees);
+	tdata.blobs = ewah_to_bitmap(bitmap_git->blobs);
+	tdata.tags = ewah_to_bitmap(bitmap_git->tags);
 	tdata.prg = start_progress("Verifying bitmap entries", result_popcnt);
 	tdata.seen = 0;
 

From patchwork Tue Aug 24 16:15:54 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455501
Date: Tue, 24 Aug 2021 12:15:54 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 02/25] pack-bitmap-write.c: gracefully fail to write non-closed bitmaps
Message-ID: <979276bc74a9e703da844d461503f6720663ae31.1629821743.git.me@ttaylorr.com>

The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).

Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).

But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.

As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.

A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.

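The idea of reporting a non-closed set through a return value, rather than
dying, can be sketched on a toy object graph. This is an illustration under
stated assumptions, not the patch's implementation: `sketch_graph`,
`sketch_check_closure()`, and the adjacency-matrix representation are
hypothetical, whereas the patch itself walks commits and trees and looks
objects up with packlist_find().

```c
#include <stddef.h>

#define SKETCH_MAX_OBJECTS 16

/* Toy object graph: edge[a][b] != 0 means object a references object b. */
struct sketch_graph {
	size_t nr; /* number of objects, at most SKETCH_MAX_OBJECTS */
	unsigned char edge[SKETCH_MAX_OBJECTS][SKETCH_MAX_OBJECTS];
};

/*
 * Walk everything reachable from `start`. Return 0 if every object reached
 * is marked in `in_set`, or -1 as soon as one is missing, mirroring how the
 * writer now propagates an error upward instead of calling die().
 */
static int sketch_check_closure(const struct sketch_graph *g,
				const unsigned char *in_set, size_t start)
{
	unsigned char seen[SKETCH_MAX_OBJECTS] = { 0 };
	size_t stack[SKETCH_MAX_OBJECTS];
	size_t top = 0;
	size_t next;

	if (g->nr > SKETCH_MAX_OBJECTS || start >= g->nr || !in_set[start])
		return -1;

	seen[start] = 1;
	stack[top++] = start;

	while (top) {
		size_t cur = stack[--top];

		for (next = 0; next < g->nr; next++) {
			if (!g->edge[cur][next] || seen[next])
				continue;
			if (!in_set[next])
				return -1; /* reachable but missing: not closed */
			/* Each object is pushed at most once, so the stack cannot overflow. */
			seen[next] = 1;
			stack[top++] = next;
		}
	}
	return 0;
}
```

A caller can then decide what to do with the failure, in the same way that
write_pack_file() checks the result of bitmap_writer_build() in the diff
below.
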
Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c   |  3 +-
 pack-bitmap-write.c      | 76 ++++++++++++++++++++++++++++------------
 pack-bitmap.h            |  2 +-
 t/t0410-partial-clone.sh |  9 ++++-
 4 files changed, 64 insertions(+), 26 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index de00adbb9e..8a523624a1 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1256,7 +1256,8 @@ static void write_pack_file(void)
 
 				bitmap_writer_show_progress(progress);
 				bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1);
-				bitmap_writer_build(&to_pack);
+				if (bitmap_writer_build(&to_pack) < 0)
+					die(_("failed to write bitmap index"));
 				bitmap_writer_finish(written_list, nr_written,
 						     tmpname.buf, write_bitmap_options);
 				write_bitmap_index = 0;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 88d9e696a5..d374f7884b 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -125,15 +125,20 @@ static inline void push_bitmapped_commit(struct commit *commit)
 	writer.selected_nr++;
 }
 
-static uint32_t find_object_pos(const struct object_id *oid)
+static uint32_t find_object_pos(const struct object_id *oid, int *found)
 {
 	struct object_entry *entry = packlist_find(writer.to_pack, oid);
 
 	if (!entry) {
-		die("Failed to write bitmap index. Packfile doesn't have full closure "
+		if (found)
+			*found = 0;
+		warning("Failed to write bitmap index. Packfile doesn't have full closure "
 			"(object %s is missing)", oid_to_hex(oid));
+		return 0;
 	}
 
+	if (found)
+		*found = 1;
 	return oe_in_pack_pos(writer.to_pack, entry);
 }
 
@@ -331,9 +336,10 @@ static void bitmap_builder_clear(struct bitmap_builder *bb)
 	bb->commits_nr = bb->commits_alloc = 0;
 }
 
-static void fill_bitmap_tree(struct bitmap *bitmap,
-			     struct tree *tree)
+static int fill_bitmap_tree(struct bitmap *bitmap,
+			    struct tree *tree)
 {
+	int found;
 	uint32_t pos;
 	struct tree_desc desc;
 	struct name_entry entry;
@@ -342,9 +348,11 @@ static void fill_bitmap_tree(struct bitmap *bitmap,
 	 * If our bit is already set, then there is nothing to do. Both this
 	 * tree and all of its children will be set.
 	 */
-	pos = find_object_pos(&tree->object.oid);
+	pos = find_object_pos(&tree->object.oid, &found);
+	if (!found)
+		return -1;
 	if (bitmap_get(bitmap, pos))
-		return;
+		return 0;
 	bitmap_set(bitmap, pos);
 
 	if (parse_tree(tree) < 0)
@@ -355,11 +363,15 @@ static void fill_bitmap_tree(struct bitmap *bitmap,
 	while (tree_entry(&desc, &entry)) {
 		switch (object_type(entry.mode)) {
 		case OBJ_TREE:
-			fill_bitmap_tree(bitmap,
-					 lookup_tree(the_repository, &entry.oid));
+			if (fill_bitmap_tree(bitmap,
+					     lookup_tree(the_repository, &entry.oid)) < 0)
+				return -1;
 			break;
 		case OBJ_BLOB:
-			bitmap_set(bitmap, find_object_pos(&entry.oid));
+			pos = find_object_pos(&entry.oid, &found);
+			if (!found)
+				return -1;
+			bitmap_set(bitmap, pos);
 			break;
 		default:
 			/* Gitlink, etc; not reachable */
@@ -368,15 +380,18 @@ static void fill_bitmap_tree(struct bitmap *bitmap,
 	}
 
 	free_tree_buffer(tree);
+	return 0;
 }
 
-static void fill_bitmap_commit(struct bb_commit *ent,
-			       struct commit *commit,
-			       struct prio_queue *queue,
-			       struct prio_queue *tree_queue,
-			       struct bitmap_index *old_bitmap,
-			       const uint32_t *mapping)
+static int fill_bitmap_commit(struct bb_commit *ent,
+			      struct commit *commit,
+			      struct prio_queue *queue,
+			      struct prio_queue *tree_queue,
+			      struct bitmap_index *old_bitmap,
+			      const uint32_t *mapping)
 {
+	int found;
+	uint32_t pos;
 	if (!ent->bitmap)
 		ent->bitmap = bitmap_new();
 
@@ -401,11 +416,16 @@ static void fill_bitmap_commit(struct bb_commit *ent,
 		 * Mark ourselves and queue our tree. The commit
 		 * walk ensures we cover all parents.
 		 */
-		bitmap_set(ent->bitmap, find_object_pos(&c->object.oid));
+		pos = find_object_pos(&c->object.oid, &found);
+		if (!found)
+			return -1;
+		bitmap_set(ent->bitmap, pos);
 		prio_queue_put(tree_queue, get_commit_tree(c));
 
 		for (p = c->parents; p; p = p->next) {
-			int pos = find_object_pos(&p->item->object.oid);
+			pos = find_object_pos(&p->item->object.oid, &found);
+			if (!found)
+				return -1;
 			if (!bitmap_get(ent->bitmap, pos)) {
 				bitmap_set(ent->bitmap, pos);
 				prio_queue_put(queue, p->item);
@@ -413,8 +433,12 @@ static void fill_bitmap_commit(struct bb_commit *ent,
 		}
 	}
 
-	while (tree_queue->nr)
-		fill_bitmap_tree(ent->bitmap, prio_queue_get(tree_queue));
+	while (tree_queue->nr) {
+		if (fill_bitmap_tree(ent->bitmap,
+				     prio_queue_get(tree_queue)) < 0)
+			return -1;
+	}
+	return 0;
 }
 
 static void store_selected(struct bb_commit *ent, struct commit *commit)
@@ -432,7 +456,7 @@ static void store_selected(struct bb_commit *ent, struct commit *commit)
 	kh_value(writer.bitmaps, hash_pos) = stored;
 }
 
-void bitmap_writer_build(struct packing_data *to_pack)
+int bitmap_writer_build(struct packing_data *to_pack)
 {
 	struct bitmap_builder bb;
 	size_t i;
@@ -441,6 +465,7 @@ void bitmap_writer_build(struct packing_data *to_pack)
 	struct prio_queue tree_queue = { NULL };
 	struct bitmap_index *old_bitmap;
 	uint32_t *mapping;
+	int closed = 1; /* until proven otherwise */
 
 	writer.bitmaps = kh_init_oid_map();
 	writer.to_pack = to_pack;
@@ -463,8 +488,11 @@ void bitmap_writer_build(struct packing_data *to_pack)
 		struct commit *child;
 		int reused = 0;
 
-		fill_bitmap_commit(ent, commit, &queue, &tree_queue,
-				   old_bitmap, mapping);
+		if (fill_bitmap_commit(ent, commit, &queue, &tree_queue,
+				       old_bitmap, mapping) < 0) {
+			closed = 0;
+			break;
+		}
 
 		if (ent->selected) {
 			store_selected(ent, commit);
@@ -499,7 +527,9 @@ void bitmap_writer_build(struct packing_data *to_pack)
 
 	stop_progress(&writer.progress);
 
-	compute_xor_offsets();
+	if (closed)
+		compute_xor_offsets();
+	return closed ? 0 : -1;
 }
 
 /**
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 99d733eb26..020cd8d868 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -87,7 +87,7 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit);
 void bitmap_writer_select_commits(struct commit **indexed_commits,
 		unsigned int indexed_commits_nr, int max_bitmaps);
-void bitmap_writer_build(struct packing_data *to_pack);
+int bitmap_writer_build(struct packing_data *to_pack);
 void bitmap_writer_finish(struct pack_idx_entry **index,
 			  uint32_t index_nr,
 			  const char *filename,
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index a211a66c67..bbcc51ee8e 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -536,7 +536,13 @@ test_expect_success 'gc does not repack promisor objects if there are none' '
 repack_and_check () {
 	rm -rf repo2 &&
 	cp -r repo repo2 &&
-	git -C repo2 repack $1 -d &&
+	if test x"$1" = "x--must-fail"
+	then
+		shift
+		test_must_fail git -C repo2 repack $1 -d
+	else
+		git -C repo2 repack $1 -d
+	fi &&
 	git -C repo2 fsck &&
 
 	git -C repo2 cat-file -e $2 &&
@@ -561,6 +567,7 @@ test_expect_success 'repack -d does not irreversibly delete promisor objects' '
 	printf "$THREE\n" | pack_as_from_promisor &&
 
 	delete_object repo "$ONE" &&
+	repack_and_check --must-fail -ab "$TWO" "$THREE" &&
 	repack_and_check -a "$TWO" "$THREE" &&
 	repack_and_check -A "$TWO" "$THREE" &&
 	repack_and_check -l "$TWO" "$THREE"

From patchwork Tue Aug 24 16:15:56 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455505
Date: Tue, 24 Aug 2021 12:15:56 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 03/25] pack-bitmap-write.c: free existing bitmaps
Message-ID: <8f00493955662a5c7ec8d50a1b9c9ffe2ef7f343.1629821743.git.me@ttaylorr.com>

When writing a new bitmap, the bitmap writer code attempts to read the
existing bitmap (if one is present). This is done in order to quickly
permute the bits of any bitmaps for commits which appear in the existing
bitmap, and were also selected for the new bitmap.

But since this code was added in 341fa34887 (pack-bitmap-write: use
existing bitmaps, 2020-12-08), the resources associated with opening an
existing bitmap were never released.

It's fine to ignore this, but it's bad hygiene. It will also cause a
problem for the multi-pack-index builtin, which will be responsible not
only for writing bitmaps, but also for expiring any old multi-pack
bitmaps.

If an existing bitmap was reused here, it will also be expired. That
will cause a problem on platforms which require file resources to be
closed before unlinking them, like Windows. Avoid this by ensuring we
close reused bitmaps with free_bitmap_index() before removing them.

Signed-off-by: Taylor Blau
---
 pack-bitmap-write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index d374f7884b..142fd0adb8 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -520,6 +520,7 @@ int bitmap_writer_build(struct packing_data *to_pack)
 	clear_prio_queue(&queue);
 	clear_prio_queue(&tree_queue);
 	bitmap_builder_clear(&bb);
+	free_bitmap_index(old_bitmap);
 	free(mapping);
 
 	trace2_region_leave("pack-bitmap-write", "building_bitmaps_total",

From patchwork Tue Aug 24 16:15:59 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455507
Date: Tue, 24 Aug 2021 12:15:59 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 04/25] Documentation: describe MIDX-based bitmaps

Update the technical documentation to describe the multi-pack bitmap
format. This patch merely introduces the new format, and describes its
high-level ideas. Git does not yet know how to read nor write these
multi-pack variants, and so the subsequent patches will:

  - Introduce code to interpret multi-pack bitmaps, according to this
    document.

  - Then, introduce code to write multi-pack bitmaps from the 'git
    multi-pack-index write' sub-command.

Finally, the implementation will gain tests in subsequent patches (as
opposed to inline with the patch teaching Git how to write multi-pack
bitmaps) to avoid a cyclic dependency.

Signed-off-by: Taylor Blau
---
 Documentation/technical/bitmap-format.txt    | 71 ++++++++++++++++----
 Documentation/technical/multi-pack-index.txt | 10 +--
 2 files changed, 60 insertions(+), 21 deletions(-)

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index f8c18a0f7a..04b3ec2178 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -1,6 +1,44 @@
 GIT bitmap v1 format
 ====================
 
+== Pack and multi-pack bitmaps
+
+Bitmaps store reachability information about the set of objects in a packfile,
+or a multi-pack index (MIDX). The former is defined obviously, and the latter is
+defined as the union of objects in packs contained in the MIDX.
+
+A bitmap may belong to either one pack, or the repository's multi-pack index (if
+it exists). A repository may have at most one bitmap.
+
+An object is uniquely described by its bit position within a bitmap:
+
+	- If the bitmap belongs to a packfile, the __n__th bit corresponds to
+	the __n__th object in pack order. For a function `offset` which maps
+	objects to their byte offset within a pack, pack order is defined as
+	follows:
+
+		o1 <= o2 <==> offset(o1) <= offset(o2)
+
+	- If the bitmap belongs to a MIDX, the __n__th bit corresponds to the
+	__n__th object in MIDX order. With an additional function `pack` which
+	maps objects to the pack they were selected from by the MIDX, MIDX order
+	is defined as follows:
+
+		o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2)
+
+	The ordering between packs is done according to the MIDX's .rev file.
+	Notably, the preferred pack sorts ahead of all other packs.
+
+The on-disk representation (described below) of a bitmap is the same regardless
+of whether or not that bitmap belongs to a packfile or a MIDX. The only
+difference is the interpretation of the bits, which is described above.
+
+Certain bitmap extensions are supported (see: Appendix B). No extensions are
+required for bitmaps corresponding to packfiles. For bitmaps that correspond to
+MIDXs, both the bit-cache and rev-cache extensions are required.
+
+== On-disk format
+
 	- A header appears at the beginning:
 
 		4-byte signature: {'B', 'I', 'T', 'M'}
@@ -14,17 +52,19 @@ GIT bitmap v1 format
 		The following flags are supported:
 
 			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
-			This flag must always be present. It implies that the bitmap
-			index has been generated for a packfile with full closure
-			(i.e. where every single object in the packfile can find
-			its parent links inside the same packfile). This is a
-			requirement for the bitmap index format, also present in JGit,
-			that greatly reduces the complexity of the implementation.
+			This flag must always be present. It implies that the
+			bitmap index has been generated for a packfile or
+			multi-pack index (MIDX) with full closure (i.e. where
+			every single object in the packfile/MIDX can find its
+			parent links inside the same packfile/MIDX). This is a
+			requirement for the bitmap index format, also present in
+			JGit, that greatly reduces the complexity of the
+			implementation.
 
 			- BITMAP_OPT_HASH_CACHE (0x4)
 			If present, the end of the bitmap file contains
 			`N` 32-bit name-hash values, one per object in the
-			pack. The format and meaning of the name-hash is
+			pack/MIDX. The format and meaning of the name-hash is
 			described below.
 
 		4-byte entry count (network byte order)
@@ -33,7 +73,8 @@ GIT bitmap v1 format
 
 		20-byte checksum
 
-			The SHA1 checksum of the pack this bitmap index belongs to.
+			The SHA1 checksum of the pack/MIDX this bitmap index
+			belongs to.
 
 	- 4 EWAH bitmaps that act as type indexes
 
@@ -50,7 +91,7 @@ GIT bitmap v1 format
 			- Tags
 
 		In each bitmap, the `n`th bit is set to true if the `n`th object
-		in the packfile is of that type.
+		in the packfile or multi-pack index is of that type.
 
 		The obvious consequence is that the OR of all 4 bitmaps will result
 		in a full set (all bits set), and the AND of all 4 bitmaps will
@@ -62,8 +103,9 @@ GIT bitmap v1 format
 		Each entry contains the following:
 
 		- 4-byte object position (network byte order)
-			The position **in the index for the packfile** where the
-			bitmap for this commit is found.
+			The position **in the index for the packfile or
+			multi-pack index** where the bitmap for this commit is
+			found.
 
 		- 1-byte XOR-offset
 			The xor offset used to compress this bitmap. For an entry
@@ -146,10 +188,11 @@ Name-hash cache
 ---------------
 
 If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains
-a cache of 32-bit values, one per object in the pack. The value at
+a cache of 32-bit values, one per object in the pack/MIDX. The value at
 position `i` is the hash of the pathname at which the `i`th object
-(counting in index order) in the pack can be found. This can be fed
-into the delta heuristics to compare objects with similar pathnames.
+(counting in index or multi-pack index order) in the pack/MIDX can be found.
+This can be fed into the delta heuristics to compare objects with similar
+pathnames.
 
 The hash algorithm used is:
 
diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
index fb688976c4..1a73c3ee20 100644
--- a/Documentation/technical/multi-pack-index.txt
+++ b/Documentation/technical/multi-pack-index.txt
@@ -71,14 +71,10 @@ Future Work
   still reducing the number of binary searches required for object
   lookups.
 
-- The reachability bitmap is currently paired directly with a single
-  packfile, using the pack-order as the object order to hopefully
-  compress the bitmaps well using run-length encoding. This could be
-  extended to pair a reachability bitmap with a multi-pack-index. If
-  the multi-pack-index is extended to store a "stable object order"
+- If the multi-pack-index is extended to store a "stable object order"
   (a function Order(hash) = integer that is constant for a given hash,
-  even as the multi-pack-index is updated) then a reachability bitmap
-  could point to a multi-pack-index and be updated independently.
+  even as the multi-pack-index is updated) then MIDX bitmaps could be
+  updated independently of the MIDX.
 
 - Packfiles can be marked as "special" using empty files that share
   the initial name but replace ".pack" with ".keep" or ".promisor".

From patchwork Tue Aug 24 16:16:02 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455509
Received: from mail-il1-x12a.google.com (mail-il1-x12a.google.com [IPv6:2607:f8b0:4864:20::12a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
Date: Tue, 24 Aug 2021 12:16:02 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 05/25] midx: clear auxiliary .rev after replacing the MIDX
Message-ID: <771741844be3570395abfda813ed5ef2fa78332e.1629821743.git.me@ttaylorr.com>

When writing a new multi-pack index, write_midx_internal() attempts to
clean up any auxiliary files (currently just the MIDX's `.rev` file, but
soon to include a `.bitmap`, too) corresponding to the MIDX it's
replacing.

This step should happen after the new MIDX is written into place, since
doing so beforehand means that the old MIDX could be read without its
corresponding .rev file.
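
To make the ordering argument concrete, here is a minimal sketch of the two
orderings against a toy model of the pack directory. Everything in it
(`struct pack_dir`, `reader_ok()`, the `OLD`/`NEW` hashes) is hypothetical and
only stands in for the real files; it also assumes that the cleanup step
removes every `.rev` file except the one matching the new MIDX's hash.

```c
#include <assert.h>

/* Hypothetical hashes identifying the old and the new MIDX. */
enum { OLD = 1, NEW = 2 };

/* Toy model of the pack directory: which MIDX is in place, and which
 * .rev files exist (indexed by MIDX hash). */
struct pack_dir {
	int midx;
	int has_rev[3];
};

/* A reader is happy only if the MIDX in place has its matching .rev. */
static int reader_ok(const struct pack_dir *d)
{
	return d->has_rev[d->midx];
}

static void write_rev(struct pack_dir *d, int hash)
{
	d->has_rev[hash] = 1;
}

static void commit_midx(struct pack_dir *d, int hash)
{
	d->midx = hash;
}

/* Assumed behavior: drop every .rev except the one for `keep`. */
static void clear_rev_except(struct pack_dir *d, int keep)
{
	for (int h = OLD; h <= NEW; h++)
		if (h != keep)
			d->has_rev[h] = 0;
}

/* Old ordering: clear before commit. Counts the inconsistent states
 * a concurrent reader could observe. */
static int bad_states_clear_first(void)
{
	struct pack_dir d = { OLD, { 0, 1, 0 } };
	int bad = 0;

	write_rev(&d, NEW);        bad += !reader_ok(&d);
	clear_rev_except(&d, NEW); bad += !reader_ok(&d);
	commit_midx(&d, NEW);      bad += !reader_ok(&d);
	return bad;
}

/* New ordering: commit first, then clear. */
static int bad_states_commit_first(void)
{
	struct pack_dir d = { OLD, { 0, 1, 0 } };
	int bad = 0;

	write_rev(&d, NEW);        bad += !reader_ok(&d);
	commit_midx(&d, NEW);      bad += !reader_ok(&d);
	clear_rev_except(&d, NEW); bad += !reader_ok(&d);
	return bad;
}
```

In this model the clear-first ordering exposes one window in which the old
MIDX is still in place but its `.rev` is gone, while the commit-first
ordering exposes none.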
Signed-off-by: Taylor Blau
---
 midx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 321c6fdd2f..73b199ca49 100644
--- a/midx.c
+++ b/midx.c
@@ -1086,10 +1086,11 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 
 	if (flags & MIDX_WRITE_REV_INDEX)
 		write_midx_reverse_index(midx_name, midx_hash, &ctx);
-	clear_midx_files_ext(the_repository, ".rev", midx_hash);
 
 	commit_lock_file(&lk);
 
+	clear_midx_files_ext(the_repository, ".rev", midx_hash);
+
 cleanup:
 	for (i = 0; i < ctx.nr; i++) {
 		if (ctx.info[i].p) {

From patchwork Tue Aug 24 16:16:04 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455511
Date: Tue, 24 Aug 2021 12:16:04 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 06/25] midx: reject empty `--preferred-pack`'s

The soon-to-be-implemented multi-pack bitmap treats the object in the
first bit position specially by assuming that all objects in the pack it
was selected from are also represented from that pack in the MIDX. In
other words, the pack from which the first object was selected must also
have all of its other objects selected from that same pack in the MIDX
in case of any duplicates.

But this assumption relies on the fact that there is at least one object
in that pack to begin with; otherwise the object in the first bit
position isn't from a preferred pack, in which case we can no longer
assume that all objects in that pack were also selected from the same
pack.

Guard this assumption by checking the number of objects in the given
preferred pack, and failing if the given pack is empty.

To make sure we can safely perform this check, open any packs which are
contained in an existing MIDX via prepare_midx_pack(). The same is done
for new packs via the add_pack_to_midx() callback, but packs picked up
from a previous MIDX will not yet have been opened.
Signed-off-by: Taylor Blau
---
 Documentation/git-multi-pack-index.txt |  6 +++---
 midx.c                                 | 29 ++++++++++++++++++++++++++
 t/t5319-multi-pack-index.sh            | 17 +++++++++++++++
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index ffd601bc17..c9b063d31e 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -37,9 +37,9 @@ write::
 --
 	--preferred-pack=::
 		Optionally specify the tie-breaking pack used when
-		multiple packs contain the same object. If not given,
-		ties are broken in favor of the pack with the lowest
-		mtime.
+		multiple packs contain the same object. `` must
+		contain at least one object. If not given, ties are
+		broken in favor of the pack with the lowest mtime.
 --
 
 verify::
diff --git a/midx.c b/midx.c
index 73b199ca49..551e5c2ee5 100644
--- a/midx.c
+++ b/midx.c
@@ -934,6 +934,25 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 			ctx.info[ctx.nr].pack_name = xstrdup(ctx.m->pack_names[i]);
 			ctx.info[ctx.nr].p = NULL;
 			ctx.info[ctx.nr].expired = 0;
+
+			if (flags & MIDX_WRITE_REV_INDEX) {
+				/*
+				 * If generating a reverse index, need to have
+				 * packed_git's loaded to compare their
+				 * mtimes and object count.
+				 */
+				if (prepare_midx_pack(the_repository, ctx.m, i)) {
+					error(_("could not load pack"));
+					result = 1;
+					goto cleanup;
+				}
+
+				if (open_pack_index(ctx.m->packs[i]))
+					die(_("could not open index for %s"),
+					    ctx.m->packs[i]->pack_name);
+				ctx.info[ctx.nr].p = ctx.m->packs[i];
+			}
+
 			ctx.nr++;
 		}
 	}
@@ -961,6 +980,16 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 		}
 	}
 
+	if (ctx.preferred_pack_idx > -1) {
+		struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p;
+		if (!preferred->num_objects) {
+			error(_("cannot select preferred pack %s with no objects"),
+			      preferred->pack_name);
+			result = 1;
+			goto cleanup;
+		}
+	}
+
 	ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr,
 					 ctx.preferred_pack_idx);
 
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 3d4d9f10c3..9b184bd45e 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -277,6 +277,23 @@ test_expect_success 'midx picks objects from preferred pack' '
 	)
 '
 
+test_expect_success 'preferred packs must be non-empty' '
+	test_when_finished rm -rf preferred.git &&
+	git init preferred.git &&
+	(
+		cd preferred.git &&
+
+		test_commit base &&
+		git repack -ad &&
+
+		empty="$(git pack-objects $objdir/pack/pack err &&
+		grep "with no objects" err
+	)
+'
+
 test_expect_success 'verify multi-pack-index success' '
 	git multi-pack-index verify --object-dir=$objdir
 '

From patchwork Tue Aug 24 16:16:07 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455513
Date: Tue, 24 Aug 2021 12:16:07 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 07/25] midx: infer preferred pack when not given one
Message-ID: <31f4517de0bc188a7b0d53e6092dde82c4b2855a.1629821743.git.me@ttaylorr.com>

In 9218c6a40c (midx: allow marking a pack as preferred, 2021-03-30), the
multi-pack index code learned how to select a pack which all duplicate
objects are selected from. That is, if an object appears in multiple
packs, select the copy in the preferred pack before breaking ties
according to the other rules like pack mtime and readdir() order.

Not specifying a preferred pack can cause serious problems with
multi-pack reachability bitmaps, because these bitmaps rely on having at
least one pack from which all duplicates are selected. Not having such a
pack causes problems with the code in pack-objects to reuse packs
verbatim (e.g., that code assumes that a delta object in a chunk of pack
sent verbatim will have its base object sent from the same pack).

So why does not marking a pack preferred cause problems here? The reason
is roughly as follows:

  - Ties are broken (when handling duplicate objects) by sorting
    according to midx_oid_compare(), which sorts objects by OID,
    preferred-ness, pack mtime, and finally pack ID (more on that
    later).

  - The pseudo pack-order (described in
    Documentation/technical/pack-format.txt under the section
    "multi-pack-index reverse indexes") is computed by
    midx_pack_order(), and sorts by pack ID and pack offset, with
    preferred packs sorting first.

  - But! Pack IDs come from incrementing the pack count in
    add_pack_to_midx(), which is a callback to
    for_each_file_in_pack_dir(), meaning that pack IDs are assigned in
    readdir() order.

When specifying a preferred pack, all of that works fine, because
duplicate objects are correctly resolved in favor of the copy in the
preferred pack, and the preferred pack sorts first in the object order.

"Sorting first" is critical, because the bitmap code relies on finding
out which pack holds the first object in the MIDX's pseudo pack-order to
determine which pack is preferred.

But if we didn't specify a preferred pack, and the pack which comes
first in readdir() order does not also have the lowest timestamp, then
it's possible that that pack (the one that sorts first in pseudo-pack
order, which the bitmap code will treat as the preferred one) did *not*
have all duplicate objects resolved in its favor, resulting in breakage.

The fix is simple: pick a (semi-arbitrary, non-empty) preferred pack
when none was specified. This forces that pack to have duplicates
resolved in its favor, and (critically) to sort first in pseudo-pack
order.
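
The interaction between the two sort orders can be sketched with a small,
self-contained model. This is only an illustration under stated assumptions:
`struct entry`, `dup_cmp()`, `pseudo_pack_cmp()` and `demo()` are hypothetical
names rather than the functions in midx.c, object IDs are single bytes, and
mtime ties are broken in favor of the lowest mtime, as the documentation for
`--preferred-pack` words it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one copy of an object in one pack. */
struct entry {
	unsigned char oid;	/* toy one-byte object ID */
	uint32_t pack_id;	/* assigned in readdir() order */
	long pack_mtime;
	uint64_t offset;
	int preferred;		/* 1 if this copy is in the preferred pack */
};

#define CMP(a, b) (((a) > (b)) - ((a) < (b)))

/* Duplicate resolution: OID, preferred-ness, pack mtime, pack ID. */
static int dup_cmp(const void *va, const void *vb)
{
	const struct entry *a = va, *b = vb;
	int c;

	if ((c = CMP(a->oid, b->oid)))
		return c;
	if ((c = CMP(b->preferred, a->preferred)))	/* preferred first */
		return c;
	if ((c = CMP(a->pack_mtime, b->pack_mtime)))	/* assumed: lowest wins */
		return c;
	return CMP(a->pack_id, b->pack_id);
}

/* Pseudo pack-order: preferred pack first, then pack ID, then offset. */
static int pseudo_pack_cmp(const void *va, const void *vb)
{
	const struct entry *a = va, *b = vb;
	int c;

	if ((c = CMP(b->preferred, a->preferred)))
		return c;
	if ((c = CMP(a->pack_id, b->pack_id)))
		return c;
	return CMP(a->offset, b->offset);
}

/* Keep one copy per OID, then lay the survivors out in pseudo pack-order. */
static size_t select_objects(struct entry *e, size_t n)
{
	size_t i, out = 0;

	qsort(e, n, sizeof(*e), dup_cmp);
	for (i = 0; i < n; i++)
		if (!out || e[out - 1].oid != e[i].oid)
			e[out++] = e[i];
	qsort(e, out, sizeof(*e), pseudo_pack_cmp);
	return out;
}

struct outcome {
	uint32_t first_pack;	/* pack holding the first object in pseudo order */
	uint32_t pack_of_A;	/* pack that object 'A' was selected from */
};

/*
 * Pack 0 comes first in readdir() order but is newer (mtime 200) than
 * pack 1 (mtime 100). Object 'A' exists in both; 'B' only in pack 0.
 * Pass -1 for "no preferred pack".
 */
static struct outcome demo(int preferred_pack)
{
	struct entry e[] = {
		{ 'A', 0, 200, 12, 0 },
		{ 'B', 0, 200, 40, 0 },
		{ 'A', 1, 100, 12, 0 },
	};
	size_t i, n = sizeof(e) / sizeof(e[0]);
	struct outcome o = { 0, 0 };

	for (i = 0; i < n; i++)
		e[i].preferred = preferred_pack >= 0 &&
				 e[i].pack_id == (uint32_t)preferred_pack;

	n = select_objects(e, n);
	o.first_pack = e[0].pack_id;
	for (i = 0; i < n; i++)
		if (e[i].oid == 'A')
			o.pack_of_A = e[i].pack_id;
	return o;
}
```

With `demo(-1)`, pack 0 holds the first object in pseudo pack-order, yet
object 'A' (which pack 0 also contains) was selected from pack 1, which is
exactly the mismatch described above. With `demo(0)`, both values agree.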
Unfortunately, testing this behavior portably isn't possible, since it
depends on readdir() order which isn't guaranteed by POSIX.

(Note that multi-pack reachability bitmaps have yet to be implemented;
so in that sense this patch is fixing a bug which does not yet exist.
But by having this patch beforehand, we can prevent the bug from ever
materializing.)

Signed-off-by: Taylor Blau
---
 midx.c | 50 ++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/midx.c b/midx.c
index 551e5c2ee5..e5b17483af 100644
--- a/midx.c
+++ b/midx.c
@@ -969,15 +969,57 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop)
 		goto cleanup;
 
-	ctx.preferred_pack_idx = -1;
 	if (preferred_pack_name) {
+		int found = 0;
 		for (i = 0; i < ctx.nr; i++) {
 			if (!cmp_idx_or_pack_name(preferred_pack_name,
 						  ctx.info[i].pack_name)) {
 				ctx.preferred_pack_idx = i;
+				found = 1;
 				break;
 			}
 		}
+
+		if (!found)
+			warning(_("unknown preferred pack: '%s'"),
+				preferred_pack_name);
+	} else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) {
+		struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p;
+		ctx.preferred_pack_idx = 0;
+
+		if (packs_to_drop && packs_to_drop->nr)
+			BUG("cannot write a MIDX bitmap during expiration");
+
+		/*
+		 * set a preferred pack when writing a bitmap to ensure that
+		 * the pack from which the first object is selected in pseudo
+		 * pack-order has all of its objects selected from that pack
+		 * (and not another pack containing a duplicate)
+		 */
+		for (i = 1; i < ctx.nr; i++) {
+			struct packed_git *p = ctx.info[i].p;
+
+			if (!oldest->num_objects || p->mtime < oldest->mtime) {
+				oldest = p;
+				ctx.preferred_pack_idx = i;
+			}
+		}
+
+		if (!oldest->num_objects) {
+			/*
+			 * If all packs are empty; unset the preferred index.
+			 * This is acceptable since there will be no duplicate
+			 * objects to resolve, so the preferred value doesn't
+			 * matter.
+			 */
+			ctx.preferred_pack_idx = -1;
+		}
+	} else {
+		/*
+		 * otherwise don't mark any pack as preferred to avoid
+		 * interfering with expiration logic below
+		 */
+		ctx.preferred_pack_idx = -1;
 	}
 
 	if (ctx.preferred_pack_idx > -1) {
@@ -1058,11 +1100,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 			   ctx.info, ctx.nr,
 			   sizeof(*ctx.info),
 			   idx_or_pack_name_cmp);
-
-		if (!preferred)
-			warning(_("unknown preferred pack: '%s'"),
-				preferred_pack_name);
-		else {
+		if (preferred) {
 			uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id];
 			if (perm == PACK_EXPIRED)
 				warning(_("preferred pack '%s' is expired"),

From patchwork Tue Aug 24 16:16:09 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455515
Date: Tue, 24 Aug 2021 12:16:09 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 08/25] midx: close linked MIDXs, avoid leaking memory

When a repository has at least one alternate, the MIDX belonging to each
alternate is accessed through the `next` pointer on the main object
store's copy of the MIDX. close_midx() didn't bother to close any
of the linked MIDXs. It likewise didn't free the memory pointed to by
`m`, leaving uninitialized bytes with live pointers to them left around
in the heap.

Clean this up by closing linked MIDXs, and freeing up the memory pointed
to by each of them. When callers call close_midx(), then they can
discard the entire linked list of MIDXs and set their pointer to the
head of that list to NULL.

This isn't strictly required for the upcoming patches, but it makes it
much more difficult (though still possible, e.g., by calling
`close_midx(m->next)`, which leaves `m->next` pointing at uninitialized
bytes) to have pointers to uninitialized memory.
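
The cleanup pattern described above can be sketched in isolation as follows.
This is a simplified model rather than the code in midx.c: `struct midx_node`,
`close_node()`, `close_and_clear()` and the `nodes_freed` counter are
hypothetical names that exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, pared-down stand-in for struct multi_pack_index. */
struct midx_node {
	struct midx_node *next;	/* MIDX of the next alternate, or NULL */
	char **pack_names;
};

/* Instrumentation for the demo only: how many nodes were released. */
static int nodes_freed;

/* Close the rest of the chain first, then release this node. */
static void close_node(struct midx_node *m)
{
	if (!m)
		return;

	close_node(m->next);

	free(m->pack_names);
	free(m);
	nodes_freed++;
}

/* What a caller is expected to do: close the list, drop the pointer. */
static void close_and_clear(struct midx_node **head)
{
	close_node(*head);
	*head = NULL;
}

/* Build a chain of `len` nodes linked through `next`. */
static struct midx_node *make_chain(int len)
{
	struct midx_node *head = NULL;

	while (len-- > 0) {
		struct midx_node *m = calloc(1, sizeof(*m));
		if (!m)
			abort();
		m->pack_names = calloc(4, sizeof(*m->pack_names));
		m->next = head;
		head = m;
	}
	return head;
}

/* Returns the number of nodes released when closing a chain of `len`. */
static int close_chain_and_count(int len)
{
	struct midx_node *head = make_chain(len);

	nodes_freed = 0;
	close_and_clear(&head);
	assert(head == NULL);
	return nodes_freed;
}
```

Closing only the tail of a list (the `close_midx(m->next)` case mentioned
above) would still leave the predecessor's `next` dangling, which is why the
sketch always closes from the head and clears the caller's pointer.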
Signed-off-by: Taylor Blau
---
 midx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/midx.c b/midx.c
index e5b17483af..0a515d8711 100644
--- a/midx.c
+++ b/midx.c
@@ -195,6 +195,8 @@ void close_midx(struct multi_pack_index *m)
 	if (!m)
 		return;
 
+	close_midx(m->next);
+
 	munmap((unsigned char *)m->data, m->data_len);
 
 	for (i = 0; i < m->num_packs; i++) {
@@ -203,6 +205,7 @@ void close_midx(struct multi_pack_index *m)
 	}
 	FREE_AND_NULL(m->packs);
 	FREE_AND_NULL(m->pack_names);
+	free(m);
 }
 
 int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id)

From patchwork Tue Aug 24 16:16:12 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455519
Date: Tue, 24 Aug 2021 12:16:12 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 09/25] midx: avoid opening multiple MIDXs when writing
X-Mailing-List: git@vger.kernel.org

Opening multiple instances of the same MIDX can lead to problems like two
separate packed_git structures which represent the same pack being added
to the repository's object store.

The above scenario can happen because prepare_midx_pack() checks if
`m->packs[pack_int_id]` is NULL in order to determine if a pack has been
opened and installed in the repository before. But a caller can construct
two copies of the same MIDX by calling get_multi_pack_index() and
load_multi_pack_index() since the former manipulates the object store
directly but the latter is a lower-level routine which allocates a new
MIDX for each call.

So if prepare_midx_pack() is called on multiple MIDXs with the same
pack_int_id, then that pack will be installed twice in the object
store's packed_git list.

This can lead to problems in, e.g., the pack-bitmap code, which does
something like the following (in pack-bitmap.c:open_pack_bitmap()):

    struct bitmap_index *bitmap_git = ...;
    for (p = get_all_packs(r); p; p = p->next) {
        if (open_pack_bitmap_1(bitmap_git, p) == 0)
            ret = 0;
    }

which is a problem if two copies of the same pack exist in the
packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a
conditional like the following:

    if (bitmap_git->pack || bitmap_git->midx) {
        /* ignore extra bitmap file; we can only handle one */
        warning("ignoring extra bitmap file: %s", packfile->pack_name);
        close(fd);
        return -1;
    }

Avoid this scenario by not letting write_midx_internal() open a MIDX
that isn't also pointed at by the object store. So long as this is the
case, other routines should prefer to open MIDXs with
get_multi_pack_index() or reprepare_packed_git() instead of creating
instances on their own. Because get_multi_pack_index() returns
`r->objects->multi_pack_index` if it is non-NULL, we'll only have one
instance of a MIDX open at one time, avoiding these problems.

To encourage this, drop the `struct multi_pack_index *` parameter from
`write_midx_internal()`, and rely instead on the `object_dir` to find
(or initialize) the correct MIDX instance.

Likewise, replace the call to `close_midx()` with
`close_object_store()`, since we're about to replace the MIDX with a new
one and should invalidate the object store's memory of any MIDX that
might have existed beforehand.

Note that this now forbids passing object directories that don't belong
to alternate repositories over `--object-dir`, since before we would
have happily opened a MIDX in any directory, but now restrict ourselves
to only those reachable by `r->objects->multi_pack_index` (and alternate
MIDXs that we can see by walking the `next` pointer).
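
For illustration only, the restriction described above amounts to a
lookup over the list of MIDXs that are already open. The sketch below
uses a simplified stand-in struct (`midx_sketch`) and a hypothetical
helper name (`find_open_midx`); neither exists in git, and the real
code walks `struct multi_pack_index` starting from
get_multi_pack_index().

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for git's struct multi_pack_index (assumption). */
struct midx_sketch {
	const char *object_dir;
	struct midx_sketch *next;	/* MIDX of the next alternate, if any */
};

/*
 * Walk the list of already-open MIDXs and return the one that belongs
 * to object_dir, or NULL when that directory is not among them. No new
 * instance is ever allocated here, so at most one copy per directory
 * can be handed out.
 */
static struct midx_sketch *find_open_midx(struct midx_sketch *head,
					  const char *object_dir)
{
	struct midx_sketch *cur;

	for (cur = head; cur; cur = cur->next) {
		if (!strcmp(object_dir, cur->object_dir))
			return cur;
	}
	return NULL;
}
```

A directory that is neither the local object directory nor an alternate
yields NULL in this sketch, which corresponds to the "won't reuse an
existing MIDX" behavior for non-alternate directories.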

As far as I can tell, supporting arbitrary directories with
`--object-dir` was a historical accident, since even the documentation
says `<alt>` when referring to the value passed to this option.

A future patch could clean this up and provide a warning() when a
non-alternate directory was given. We'll still write a new MIDX there;
we just won't reuse any MIDX that might happen to already exist in that
directory.

Signed-off-by: Taylor Blau
---
 midx.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/midx.c b/midx.c index 0a515d8711..3dacb31f9d 100644 --- a/midx.c +++ b/midx.c @@ -893,7 +893,7 @@ static int midx_checksum_valid(struct multi_pack_index *m) return hashfile_checksum_valid(m->data, m->data_len); } -static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, +static int write_midx_internal(const char *object_dir, struct string_list *packs_to_drop, const char *preferred_pack_name, unsigned flags) @@ -904,6 +904,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * struct hashfile *f = NULL; struct lock_file lk; struct write_midx_context ctx = { 0 }; + struct multi_pack_index *cur; int pack_name_concat_len = 0; int dropped_packs = 0; int result = 0; @@ -914,10 +915,12 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * die_errno(_("unable to create leading directories of %s"), midx_name); - if (m) - ctx.m = m; - else - ctx.m = load_multi_pack_index(object_dir, 1); + for (cur = get_multi_pack_index(the_repository); cur; cur = cur->next) { + if (!strcmp(object_dir, cur->object_dir)) { + ctx.m = cur; + break; + } + } if (ctx.m && !midx_checksum_valid(ctx.m)) { warning(_("ignoring existing multi-pack-index; checksum mismatch")); @@ -1119,7 +1122,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); if (ctx.m) - close_midx(ctx.m); +
close_object_store(the_repository->objects); if (ctx.nr - dropped_packs == 0) { error(_("no pack files to index.")); @@ -1182,8 +1185,7 @@ int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags) { - return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, - flags); + return write_midx_internal(object_dir, NULL, preferred_pack_name, flags); } struct clear_midx_data { @@ -1461,8 +1463,10 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); - if (packs_to_drop.nr) - result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags); + if (packs_to_drop.nr) { + result = write_midx_internal(object_dir, &packs_to_drop, NULL, flags); + m = NULL; + } string_list_clear(&packs_to_drop, 0); return result; @@ -1651,7 +1655,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, m, NULL, NULL, flags); + result = write_midx_internal(object_dir, NULL, NULL, flags); m = NULL; cleanup: From patchwork Tue Aug 24 16:16:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D24FAC432BE for ; Tue, 24 Aug 2021 16:16:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B5BCA6125F for ; Tue, 24 Aug 2021 16:16:31 +0000 (UTC) Received: 
Date: Tue, 24 Aug 2021 12:16:15 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 10/25] pack-bitmap.c: introduce 'bitmap_num_objects()'
X-Mailing-List: git@vger.kernel.org

A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to return how many objects are
contained in a bitmap.
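
As a rough sketch of why a single accessor helps: once every caller
asks the accessor for the object count, only the accessor has to learn
about a second source of objects. The structs below are simplified
stand-ins, and the `midx` branch is an assumption about where the
series is heading; this patch itself only returns the pack's count.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for git's packed_git / multi_pack_index (assumption). */
struct pack_counts { uint32_t num_objects; };
struct midx_counts { uint32_t num_objects; };

struct bitmap_index_sketch {
	struct pack_counts *pack;	/* set for a single-pack bitmap */
	struct midx_counts *midx;	/* hypothetical: set for a MIDX bitmap */
};

/*
 * Single place that knows where the object count comes from; callers
 * never reach into index->pack directly.
 */
static uint32_t bitmap_num_objects_sketch(const struct bitmap_index_sketch *index)
{
	if (index->midx)
		return index->midx->num_objects;
	return index->pack->num_objects;
}
```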
Signed-off-by: Taylor Blau --- pack-bitmap.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 9b11af87aa..65356f9657 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -136,6 +136,11 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) return b; } +static uint32_t bitmap_num_objects(struct bitmap_index *index) +{ + return index->pack->num_objects; +} + static int load_bitmap_header(struct bitmap_index *index) { struct bitmap_disk_header *header = (void *)index->map; @@ -154,7 +159,7 @@ static int load_bitmap_header(struct bitmap_index *index) /* Parse known bitmap format options */ { uint32_t flags = ntohs(header->options); - size_t cache_size = st_mult(index->pack->num_objects, sizeof(uint32_t)); + size_t cache_size = st_mult(bitmap_num_objects(index), sizeof(uint32_t)); unsigned char *index_end = index->map + index->map_size - the_hash_algo->rawsz; if ((flags & BITMAP_OPT_FULL_DAG) == 0) @@ -404,7 +409,7 @@ static inline int bitmap_position_extended(struct bitmap_index *bitmap_git, if (pos < kh_end(positions)) { int bitmap_pos = kh_value(positions, pos); - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } return -1; @@ -456,7 +461,7 @@ static int ext_index_add_object(struct bitmap_index *bitmap_git, bitmap_pos = kh_value(eindex->positions, hash_pos); } - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } struct bitmap_show_data { @@ -673,7 +678,7 @@ static void show_extended_objects(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { struct object *obj; - if (!bitmap_get(objects, bitmap_git->pack->num_objects + i)) + if (!bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) continue; obj = eindex->objects[i]; @@ -832,7 +837,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, * them individually. 
*/ for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == type && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos)) @@ -859,7 +864,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, oi.sizep = &size; - if (pos < pack->num_objects) { + if (pos < bitmap_num_objects(bitmap_git)) { off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; @@ -869,7 +874,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, } } else { struct eindex *eindex = &bitmap_git->ext_index; - struct object *obj = eindex->objects[pos - pack->num_objects]; + struct object *obj = eindex->objects[pos - bitmap_num_objects(bitmap_git)]; if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0) die(_("unable to get size of %s"), oid_to_hex(&obj->oid)); } @@ -911,7 +916,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, } for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == OBJ_BLOB && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos) && @@ -1137,8 +1142,8 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, enum object_type type; unsigned long size; - if (pos >= bitmap_git->pack->num_objects) - return; /* not actually in the pack */ + if (pos >= bitmap_num_objects(bitmap_git)) + return; /* not actually in the pack or MIDX */ offset = header = pack_pos_to_offset(bitmap_git->pack, pos); type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); @@ -1204,6 +1209,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; + uint32_t objects_nr = bitmap_num_objects(bitmap_git); assert(result); @@ -1211,8 
+1217,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, i++; /* Don't mark objects not in the packfile */ - if (i > bitmap_git->pack->num_objects / BITS_IN_EWORD) - i = bitmap_git->pack->num_objects / BITS_IN_EWORD; + if (i > objects_nr / BITS_IN_EWORD) + i = objects_nr / BITS_IN_EWORD; reuse = bitmap_word_alloc(i); memset(reuse->words, 0xFF, i * sizeof(eword_t)); @@ -1296,7 +1302,7 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { if (eindex->objects[i]->type == type && - bitmap_get(objects, bitmap_git->pack->num_objects + i)) + bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) count++; } @@ -1517,7 +1523,7 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; - num_objects = bitmap_git->pack->num_objects; + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); for (i = 0; i < num_objects; ++i) { @@ -1600,7 +1606,6 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; struct eindex *eindex = &bitmap_git->ext_index; off_t total = 0; struct object_info oi = OBJECT_INFO_INIT; @@ -1612,7 +1617,7 @@ static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) for (i = 0; i < eindex->count; i++) { struct object *obj = eindex->objects[i]; - if (!bitmap_get(result, pack->num_objects + i)) + if (!bitmap_get(result, bitmap_num_objects(bitmap_git) + i)) continue; if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0) From patchwork Tue Aug 24 16:16:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
Date: Tue, 24 Aug 2021 12:16:17 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 11/25] pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
X-Mailing-List: git@vger.kernel.org

A subsequent patch to support reading MIDX bitmaps will be less noisy
after extracting a generic function to fetch the nth OID contained in
the bitmap.
Signed-off-by: Taylor Blau --- pack-bitmap.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 65356f9657..612f62da97 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -223,6 +223,13 @@ static inline uint8_t read_u8(const unsigned char *buffer, size_t *pos) #define MAX_XOR_OFFSET 160 +static int nth_bitmap_object_oid(struct bitmap_index *index, + struct object_id *oid, + uint32_t n) +{ + return nth_packed_object_id(oid, index->pack, n); +} + static int load_bitmap_entries_v1(struct bitmap_index *index) { uint32_t i; @@ -242,7 +249,7 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) xor_offset = read_u8(index->map, &index->map_pos); flags = read_u8(index->map, &index->map_pos); - if (nth_packed_object_id(&oid, index->pack, commit_idx_pos) < 0) + if (nth_bitmap_object_oid(index, &oid, commit_idx_pos) < 0) return error("corrupt ewah bitmap: commit index %u out of range", (unsigned)commit_idx_pos); @@ -868,8 +875,8 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; - nth_packed_object_id(&oid, pack, - pack_pos_to_index(pack, pos)); + nth_bitmap_object_oid(bitmap_git, &oid, + pack_pos_to_index(pack, pos)); die(_("unable to get size of %s"), oid_to_hex(&oid)); } } else { From patchwork Tue Aug 24 16:16:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: 
Date: Tue, 24 Aug 2021 12:16:20 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 12/25] pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
X-Mailing-List: git@vger.kernel.org

In a recent commit, pack-objects learned support for the
'pack.preferBitmapTips' configuration. This patch prepares the
multi-pack bitmap code to respect this configuration, too.

The yet-to-be-implemented code will find that it is more efficient to
check whether each reference starts with a prefix found in the
configured set of values rather than doing an additional traversal.

Implement a function 'bitmap_is_preferred_refname()' which will perform
that check. Its caller will be added in a subsequent patch.
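
The prefix check described above can be sketched in isolation as
follows. This is a minimal illustration, not the code in the patch: it
takes the configured values as a plain array instead of a string_list,
uses strncmp() where git uses its own starts_with() helper, and the
function name `is_preferred_refname` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 when refname begins with any of the n configured prefixes,
 * 0 otherwise (including when no prefixes are configured).
 */
static int is_preferred_refname(const char *refname,
				const char *const *prefixes, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(prefixes[i]);

		/* Compare only the first len bytes: a pure prefix match. */
		if (!strncmp(refname, prefixes[i], len))
			return 1;
	}
	return 0;
}
```

With the prefixes "refs/heads/" and "refs/tags/v", the ref
"refs/heads/main" is preferred while "refs/remotes/origin/main" is not.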
Signed-off-by: Taylor Blau --- pack-bitmap.c | 16 ++++++++++++++++ pack-bitmap.h | 1 + 2 files changed, 17 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 612f62da97..d5296750eb 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1658,3 +1658,19 @@ const struct string_list *bitmap_preferred_tips(struct repository *r) { return repo_config_get_value_multi(r, "pack.preferbitmaptips"); } + +int bitmap_is_preferred_refname(struct repository *r, const char *refname) +{ + const struct string_list *preferred_tips = bitmap_preferred_tips(r); + struct string_list_item *item; + + if (!preferred_tips) + return 0; + + for_each_string_list_item(item, preferred_tips) { + if (starts_with(refname, item->string)) + return 1; + } + + return 0; +} diff --git a/pack-bitmap.h b/pack-bitmap.h index 020cd8d868..52ea10de51 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -94,5 +94,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint16_t options); const struct string_list *bitmap_preferred_tips(struct repository *r); +int bitmap_is_preferred_refname(struct repository *r, const char *refname); #endif From patchwork Tue Aug 24 16:16:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F027FC432BE for ; Tue, 24 Aug 2021 16:16:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DD08A6125F for ; Tue, 24 Aug 
Date: Tue, 24 Aug 2021 12:16:22 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 13/25] pack-bitmap.c: avoid redundant calls to try_partial_reuse
Message-ID: <4e06f051a7d56a906d1db97513c88d4b3029742e.1629821743.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

try_partial_reuse() is used to mark any bits in the beginning of a
bitmap whose objects can be reused verbatim from the pack they came
from.

Currently this function returns void, and signals nothing to the caller
when bits could not be reused. But multi-pack bitmaps would benefit from
having such a signal, because they may try to pass objects which are in
bounds, but from a pack other than the preferred one.

Any extra calls are no-ops because of a conditional in
reuse_partial_packfile_from_bitmap(), but those loop iterations can be
avoided by letting try_partial_reuse() indicate when it can't accept any
more bits for reuse, and then listening to that signal.
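
The control flow described above can be sketched on its own, under the
assumption of a plain array of 64-bit words instead of git's bitmap
type. The names `for_each_set_bit_until_stop` and `accept_below_limit`
are hypothetical; the callback convention (negative means "stop", zero
means "keep feeding bits") mirrors the one this patch gives
try_partial_reuse().

```c
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64

/* Callback: return < 0 to stop the whole walk, 0 to keep going. */
typedef int (*try_bit_fn)(size_t pos, void *data);

/*
 * Call fn for every set bit in words[0..nr), lowest position first.
 * As soon as fn reports "stop", leave both loops at once. Returns the
 * number of times fn was called.
 */
static size_t for_each_set_bit_until_stop(const uint64_t *words, size_t nr,
					  try_bit_fn fn, void *data)
{
	size_t calls = 0;
	size_t i;
	unsigned offset;

	for (i = 0; i < nr; i++) {
		for (offset = 0; offset < BITS_PER_WORD; offset++) {
			if (!(words[i] >> offset))
				break;	/* no set bits left in this word */
			if (!((words[i] >> offset) & 1))
				continue;
			calls++;
			if (fn(i * BITS_PER_WORD + offset, data) < 0)
				goto done;	/* skip remaining bits and words */
		}
	}
done:
	return calls;
}

/* Example callback: accept positions below a limit, then signal "stop". */
static int accept_below_limit(size_t pos, void *data)
{
	size_t limit = *(size_t *)data;

	return pos < limit ? 0 : -1;
}
```

Without the early exit, the callback would still be invoked for every
remaining set bit even though each of those calls could only fail.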
Signed-off-by: Jeff King
Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 40 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index d5296750eb..4e37f5d574 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1140,22 +1140,26 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 	return NULL;
 }
 
-static void try_partial_reuse(struct bitmap_index *bitmap_git,
-			      size_t pos,
-			      struct bitmap *reuse,
-			      struct pack_window **w_curs)
+/*
+ * -1 means "stop trying further objects"; 0 means we may or may not have
+ * reused, but you can keep feeding bits.
+ */
+static int try_partial_reuse(struct bitmap_index *bitmap_git,
+			     size_t pos,
+			     struct bitmap *reuse,
+			     struct pack_window **w_curs)
 {
 	off_t offset, header;
 	enum object_type type;
 	unsigned long size;
 
 	if (pos >= bitmap_num_objects(bitmap_git))
-		return; /* not actually in the pack or MIDX */
+		return -1; /* not actually in the pack or MIDX */
 
 	offset = header = pack_pos_to_offset(bitmap_git->pack, pos);
 	type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size);
 	if (type < 0)
-		return; /* broken packfile, punt */
+		return -1; /* broken packfile, punt */
 
 	if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) {
 		off_t base_offset;
@@ -1172,9 +1176,9 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git,
 		base_offset = get_delta_base(bitmap_git->pack, w_curs,
 					     &offset, type, header);
 		if (!base_offset)
-			return;
+			return 0;
 		if (offset_to_pack_pos(bitmap_git->pack, base_offset, &base_pos) < 0)
-			return;
+			return 0;
 
 		/*
 		 * We assume delta dependencies always point backwards. This
@@ -1186,7 +1190,7 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git,
 		 * odd parameters.
 		 */
 		if (base_pos >= pos)
-			return;
+			return 0;
 
 		/*
 		 * And finally, if we're not sending the base as part of our
@@ -1197,13 +1201,14 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git,
 		 * object_entry code path handle it.
 		 */
 		if (!bitmap_get(reuse, base_pos))
-			return;
+			return 0;
 	}
 
 	/*
 	 * If we got here, then the object is OK to reuse. Mark it.
 	 */
 	bitmap_set(reuse, pos);
+	return 0;
 }
 
 int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
@@ -1239,10 +1244,23 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
 				break;
 
 			offset += ewah_bit_ctz64(word >> offset);
-			try_partial_reuse(bitmap_git, pos + offset, reuse, &w_curs);
+			if (try_partial_reuse(bitmap_git, pos + offset, reuse,
+					      &w_curs) < 0) {
+				/*
+				 * try_partial_reuse indicated we couldn't reuse
+				 * any bits, so there is no point in trying more
+				 * bits in the current word, or any other words
+				 * in result.
+				 *
+				 * Jump out of both loops to avoid future
+				 * unnecessary calls to try_partial_reuse.
+				 */
+				goto done;
+			}
 		}
 	}
 
+done:
 	unuse_pack(&w_curs);
 
 	*entries = bitmap_popcount(reuse);

From patchwork Tue Aug 24 16:16:25 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455527
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FF46C4338F for ; Tue, 24 Aug 2021 16:16:51 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7303461212 for ; Tue, 24 Aug 2021 16:16:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231283AbhHXQRd (ORCPT ); Tue, 24 Aug 2021 12:17:33 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46678 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229876AbhHXQRW (ORCPT ); Tue, 24 Aug 2021 12:17:22 -0400 Received: from mail-io1-xd33.google.com (mail-io1-xd33.google.com [IPv6:2607:f8b0:4864:20::d33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32E21C06121D for ; Tue, 24 Aug 2021 09:16:27 -0700 (PDT) Received: by mail-io1-xd33.google.com with SMTP id q3so9644952iot.3 for ; Tue, 24 Aug 2021 09:16:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Sr7MoKPn621v4iJmti0eB1RaoacEwPzTpOT6+LytdO8=; b=eiNyOOGu9Or9vvQmvinshbd8/ArXIQd/B08ZJ4kK/yWrYkQqZxp6OJiDbobQAYtzEj hq9rAdK69liJgbfK/b5/4bfNgxkVWkp4RlDWRtdtGMel7UHUSTvkg7MSYMqVzNgWE5Sa bLIfyL67Suz7V2WB5/PR9GcQl0PAF5gjZASQLdM2DTF7jtqILC+bKcoIr96VHagACQyS U7ey6BOpERGX4brTJSeow0cJnjlJK8hh5HbuXLlKlCNf6ChChq+8DFItYu45E6DjKCV2 hbrB67KZySTiCgXPtr9boZfwiaSxBKae77uxqFM1EVpM5rP+McuFzF0sUUjaRWS+IHzO 2mJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Sr7MoKPn621v4iJmti0eB1RaoacEwPzTpOT6+LytdO8=; b=Kf5TG0PJP/sgyMttmd6sMRoPhk8yHD0kLc39XzD9dbe2gyum/wb5Vgyc9TSGd4AWBf IPYD+Wdxrlxlicw/ALrg+1WOztzHJBGh0Fv6o6TJPtdk/7vM5AFLU8SjvXgB+ywQi3/m fT1QA1NUM0D36bKjl4uoxaQ2Qc9QBcLn+WwA5aq/H5j7Lh0sb1gKVwGFBpHpq93s/k9t zwve3inoh53CXJKCMPURsfTMKCnuIAkdLUyXFFeFjdNC91VAK1Yq0nwTI6UP/zrcAsoB wLnyAAaqyaLTX5uVL8v8zN3EqC1WREyTfHMEDW1/j3TtfBdsLWFYVzMkZ4ZrIKdnhC8o txmA== X-Gm-Message-State: AOAM53238L5uL95ZPXcdSiOGHvTFAcz9O06jLNb95vmvagQJFTmGtCIm 2LiWvK7aXL5kAU8z/SMFsAzJDxVCHCPqnyYt X-Google-Smtp-Source: ABdhPJyltge7ulHrGBoYH1GrAXjWkloapAg+7r8IxZtKGl18Ncq+OeOlyS3KdmOrfz5cORAmoiuLFA== X-Received: by 2002:a05:6638:aca:: with SMTP id m10mr14055865jab.22.1629821786340; Tue, 24 Aug 2021 09:16:26 -0700 
(PDT)
Date: Tue, 24 Aug 2021 12:16:25 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 14/25] pack-bitmap: read multi-pack bitmaps
X-Mailing-List: git@vger.kernel.org

This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.

Note that nothing writes multi-pack bitmaps yet; a writer is implemented
in the subsequent commit.

Note also that get_midx_checksum() and get_midx_filename() are made
non-static so they can be called from pack-bitmap.c.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c |   5 +
 midx.c                 |   4 +-
 midx.h                 |   2 +
 pack-bitmap-write.c    |   2 +-
 pack-bitmap.c          | 357 ++++++++++++++++++++++++++++++++++++-----
 pack-bitmap.h          |   6 +
 packfile.c             |   2 +-
 7 files changed, 336 insertions(+), 42 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 8a523624a1..e11d3ac2e5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1124,6 +1124,11 @@ static void write_reused_pack(struct hashfile *f)
 				break;
 
 			offset += ewah_bit_ctz64(word >> offset);
+			/*
+			 * Can use bit positions directly, even for MIDX
+			 * bitmaps. See comment in try_partial_reuse()
+			 * for why.
+ */ write_reused_pack_one(pos + offset, f, &w_curs); display_progress(progress_state, ++written); } diff --git a/midx.c b/midx.c index 3dacb31f9d..2dceaf9565 100644 --- a/midx.c +++ b/midx.c @@ -48,12 +48,12 @@ static uint8_t oid_version(void) } } -static const unsigned char *get_midx_checksum(struct multi_pack_index *m) +const unsigned char *get_midx_checksum(struct multi_pack_index *m) { return m->data + m->data_len - the_hash_algo->rawsz; } -static char *get_midx_filename(const char *object_dir) +char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } diff --git a/midx.h b/midx.h index 8684cf0fef..1172df1a71 100644 --- a/midx.h +++ b/midx.h @@ -42,6 +42,8 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) #define MIDX_WRITE_REV_INDEX (1 << 1) +const unsigned char *get_midx_checksum(struct multi_pack_index *m); +char *get_midx_filename(const char *object_dir); char *get_midx_rev_filename(struct multi_pack_index *m); struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 142fd0adb8..9c55c1531e 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -48,7 +48,7 @@ void bitmap_writer_show_progress(int show) } /** - * Build the initial type index for the packfile + * Build the initial type index for the packfile or multi-pack-index */ void bitmap_writer_build_type_index(struct packing_data *to_pack, struct pack_idx_entry **index, diff --git a/pack-bitmap.c b/pack-bitmap.c index 4e37f5d574..fa69ed7a6d 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -13,6 +13,7 @@ #include "repository.h" #include "object-store.h" #include "list-objects-filter-options.h" +#include "midx.h" #include "config.h" /* @@ -35,8 +36,15 @@ struct stored_bitmap { * the active bitmap index is the largest one. 
*/ struct bitmap_index { - /* Packfile to which this bitmap index belongs to */ + /* + * The pack or multi-pack index (MIDX) that this bitmap index belongs + * to. + * + * Exactly one of these must be non-NULL; this specifies the object + * order used to interpret this bitmap. + */ struct packed_git *pack; + struct multi_pack_index *midx; /* * Mark the first `reuse_objects` in the packfile as reused: @@ -71,6 +79,9 @@ struct bitmap_index { /* If not NULL, this is a name-hash cache pointing into map. */ uint32_t *hashes; + /* The checksum of the packfile or MIDX; points into map. */ + const unsigned char *checksum; + /* * Extended index. * @@ -138,6 +149,8 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) static uint32_t bitmap_num_objects(struct bitmap_index *index) { + if (index->midx) + return index->midx->num_objects; return index->pack->num_objects; } @@ -175,6 +188,7 @@ static int load_bitmap_header(struct bitmap_index *index) } index->entry_count = ntohl(header->entry_count); + index->checksum = header->checksum; index->map_pos += header_size; return 0; } @@ -227,6 +241,8 @@ static int nth_bitmap_object_oid(struct bitmap_index *index, struct object_id *oid, uint32_t n) { + if (index->midx) + return nth_midxed_object_oid(oid, index->midx, n) ? 
0 : -1; return nth_packed_object_id(oid, index->pack, n); } @@ -274,7 +290,14 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) return 0; } -static char *pack_bitmap_filename(struct packed_git *p) +char *midx_bitmap_filename(struct multi_pack_index *midx) +{ + return xstrfmt("%s-%s.bitmap", + get_midx_filename(midx->object_dir), + hash_to_hex(get_midx_checksum(midx))); +} + +char *pack_bitmap_filename(struct packed_git *p) { size_t len; @@ -283,6 +306,57 @@ static char *pack_bitmap_filename(struct packed_git *p) return xstrfmt("%.*s.bitmap", (int)len, p->pack_name); } +static int open_midx_bitmap_1(struct bitmap_index *bitmap_git, + struct multi_pack_index *midx) +{ + struct stat st; + char *idx_name = midx_bitmap_filename(midx); + int fd = git_open(idx_name); + + free(idx_name); + + if (fd < 0) + return -1; + + if (fstat(fd, &st)) { + close(fd); + return -1; + } + + if (bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ + warning("ignoring extra bitmap file: %s", + get_midx_filename(midx->object_dir)); + close(fd); + return -1; + } + + bitmap_git->midx = midx; + bitmap_git->map_size = xsize_t(st.st_size); + bitmap_git->map_pos = 0; + bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, + MAP_PRIVATE, fd, 0); + close(fd); + + if (load_bitmap_header(bitmap_git) < 0) + goto cleanup; + + if (!hasheq(get_midx_checksum(bitmap_git->midx), bitmap_git->checksum)) + goto cleanup; + + if (load_midx_revindex(bitmap_git->midx) < 0) { + warning(_("multi-pack bitmap is missing required reverse index")); + goto cleanup; + } + return 0; + +cleanup: + munmap(bitmap_git->map, bitmap_git->map_size); + bitmap_git->map_size = 0; + bitmap_git->map = NULL; + return -1; +} + static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git *packfile) { int fd; @@ -304,7 +378,8 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return -1; } - if (bitmap_git->pack) { + if 
(bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ warning("ignoring extra bitmap file: %s", packfile->pack_name); close(fd); return -1; @@ -331,13 +406,39 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return 0; } -static int load_pack_bitmap(struct bitmap_index *bitmap_git) +static int load_reverse_index(struct bitmap_index *bitmap_git) +{ + if (bitmap_is_midx(bitmap_git)) { + uint32_t i; + int ret; + + /* + * The multi-pack-index's .rev file is already loaded via + * open_pack_bitmap_1(). + * + * But we still need to open the individual pack .rev files, + * since we will need to make use of them in pack-objects. + */ + for (i = 0; i < bitmap_git->midx->num_packs; i++) { + if (prepare_midx_pack(the_repository, bitmap_git->midx, i)) + die(_("load_reverse_index: could not open pack")); + ret = load_pack_revindex(bitmap_git->midx->packs[i]); + if (ret) + return ret; + } + return 0; + } + return load_pack_revindex(bitmap_git->pack); +} + +static int load_bitmap(struct bitmap_index *bitmap_git) { assert(bitmap_git->map); bitmap_git->bitmaps = kh_init_oid_map(); bitmap_git->ext_index.positions = kh_init_oid_pos(); - if (load_pack_revindex(bitmap_git->pack)) + + if (load_reverse_index(bitmap_git)) goto failed; if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) || @@ -381,11 +482,47 @@ static int open_pack_bitmap(struct repository *r, return ret; } +static int open_midx_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *midx; + + assert(!bitmap_git->map); + + for (midx = get_multi_pack_index(r); midx; midx = midx->next) { + if (!open_midx_bitmap_1(bitmap_git, midx)) + return 0; + } + return -1; +} + +static int open_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + assert(!bitmap_git->map); + + if (!open_midx_bitmap(r, bitmap_git)) + return 0; + return open_pack_bitmap(r, bitmap_git); +} + struct bitmap_index 
*prepare_bitmap_git(struct repository *r) { struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); - if (!open_pack_bitmap(r, bitmap_git) && !load_pack_bitmap(bitmap_git)) + if (!open_bitmap(r, bitmap_git) && !load_bitmap(bitmap_git)) + return bitmap_git; + + free_bitmap_index(bitmap_git); + return NULL; +} + +struct bitmap_index *prepare_midx_bitmap_git(struct repository *r, + struct multi_pack_index *midx) +{ + struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); + + if (!open_midx_bitmap_1(bitmap_git, midx) && !load_bitmap(bitmap_git)) return bitmap_git; free_bitmap_index(bitmap_git); @@ -435,10 +572,26 @@ static inline int bitmap_position_packfile(struct bitmap_index *bitmap_git, return pos; } +static int bitmap_position_midx(struct bitmap_index *bitmap_git, + const struct object_id *oid) +{ + uint32_t want, got; + if (!bsearch_midx(oid, bitmap_git->midx, &want)) + return -1; + + if (midx_to_pack_pos(bitmap_git->midx, want, &got) < 0) + return -1; + return got; +} + static int bitmap_position(struct bitmap_index *bitmap_git, const struct object_id *oid) { - int pos = bitmap_position_packfile(bitmap_git, oid); + int pos; + if (bitmap_is_midx(bitmap_git)) + pos = bitmap_position_midx(bitmap_git, oid); + else + pos = bitmap_position_packfile(bitmap_git, oid); return (pos >= 0) ? 
pos : bitmap_position_extended(bitmap_git, oid); } @@ -749,6 +902,7 @@ static void show_objects_for_type( continue; for (offset = 0; offset < BITS_IN_EWORD; ++offset) { + struct packed_git *pack; struct object_id oid; uint32_t hash = 0, index_pos; off_t ofs; @@ -758,14 +912,28 @@ static void show_objects_for_type( offset += ewah_bit_ctz64(word >> offset); - index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); - ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); - nth_packed_object_id(&oid, bitmap_git->pack, index_pos); + if (bitmap_is_midx(bitmap_git)) { + struct multi_pack_index *m = bitmap_git->midx; + uint32_t pack_id; + + index_pos = pack_pos_to_midx(m, pos + offset); + ofs = nth_midxed_offset(m, index_pos); + nth_midxed_object_oid(&oid, m, index_pos); + + pack_id = nth_midxed_pack_int_id(m, index_pos); + pack = bitmap_git->midx->packs[pack_id]; + } else { + index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); + ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); + nth_bitmap_object_oid(bitmap_git, &oid, index_pos); + + pack = bitmap_git->pack; + } if (bitmap_git->hashes) hash = get_be32(bitmap_git->hashes + index_pos); - show_reach(&oid, object_type, 0, hash, bitmap_git->pack, ofs); + show_reach(&oid, object_type, 0, hash, pack, ofs); } } } @@ -777,8 +945,13 @@ static int in_bitmapped_pack(struct bitmap_index *bitmap_git, struct object *object = roots->item; roots = roots->next; - if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) - return 1; + if (bitmap_is_midx(bitmap_git)) { + if (bsearch_midx(&object->oid, bitmap_git->midx, NULL)) + return 1; + } else { + if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) + return 1; + } } return 0; @@ -865,14 +1038,26 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git, static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, uint32_t pos) { - struct packed_git *pack = bitmap_git->pack; unsigned long size; struct object_info oi = 
OBJECT_INFO_INIT; oi.sizep = &size; if (pos < bitmap_num_objects(bitmap_git)) { - off_t ofs = pack_pos_to_offset(pack, pos); + struct packed_git *pack; + off_t ofs; + + if (bitmap_is_midx(bitmap_git)) { + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, pos); + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + + pack = bitmap_git->midx->packs[pack_id]; + ofs = nth_midxed_offset(bitmap_git->midx, midx_pos); + } else { + pack = bitmap_git->pack; + ofs = pack_pos_to_offset(pack, pos); + } + if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; nth_bitmap_object_oid(bitmap_git, &oid, @@ -1053,7 +1238,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, /* try to open a bitmapped pack, but don't parse it yet * because we may not need to use it */ CALLOC_ARRAY(bitmap_git, 1); - if (open_pack_bitmap(revs->repo, bitmap_git) < 0) + if (open_bitmap(revs->repo, bitmap_git) < 0) goto cleanup; for (i = 0; i < revs->pending.nr; ++i) { @@ -1097,7 +1282,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * from disk. this is the point of no return; after this the rev_list * becomes invalidated and we must perform the revwalk through bitmaps */ - if (load_pack_bitmap(bitmap_git) < 0) + if (load_bitmap(bitmap_git) < 0) goto cleanup; object_array_clear(&revs->pending); @@ -1145,19 +1330,43 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * reused, but you can keep feeding bits. 
*/ static int try_partial_reuse(struct bitmap_index *bitmap_git, + struct packed_git *pack, size_t pos, struct bitmap *reuse, struct pack_window **w_curs) { - off_t offset, header; + off_t offset, delta_obj_offset; enum object_type type; unsigned long size; - if (pos >= bitmap_num_objects(bitmap_git)) - return -1; /* not actually in the pack or MIDX */ + /* + * try_partial_reuse() is called either on (a) objects in the + * bitmapped pack (in the case of a single-pack bitmap) or (b) + * objects in the preferred pack of a multi-pack bitmap. + * Importantly, the latter can pretend as if only a single pack + * exists because: + * + * - The first pack->num_objects bits of a MIDX bitmap are + * reserved for the preferred pack, and + * + * - Ties due to duplicate objects are always resolved in + * favor of the preferred pack. + * + * Therefore we do not need to ever ask the MIDX for its copy of + * an object by OID, since it will always select it from the + * preferred pack. Likewise, the selected copy of the base + * object for any deltas will reside in the same pack. + * + * This means that we can reuse pos when looking up the bit in + * the reuse bitmap, too, since bits corresponding to the + * preferred pack precede all bits from other packs. + */ - offset = header = pack_pos_to_offset(bitmap_git->pack, pos); - type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); + if (pos >= pack->num_objects) + return -1; /* not actually in the pack or MIDX preferred pack */ + + offset = delta_obj_offset = pack_pos_to_offset(pack, pos); + type = unpack_object_header(pack, w_curs, &offset, &size); if (type < 0) return -1; /* broken packfile, punt */ @@ -1173,11 +1382,11 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, * and the normal slow path will complain about it in * more detail. 
*/ - base_offset = get_delta_base(bitmap_git->pack, w_curs, - &offset, type, header); + base_offset = get_delta_base(pack, w_curs, &offset, type, + delta_obj_offset); if (!base_offset) return 0; - if (offset_to_pack_pos(bitmap_git->pack, base_offset, &base_pos) < 0) + if (offset_to_pack_pos(pack, base_offset, &base_pos) < 0) return 0; /* @@ -1211,24 +1420,48 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, return 0; } +static uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *m = bitmap_git->midx; + if (!m) + BUG("midx_preferred_pack: requires non-empty MIDX"); + return nth_midxed_pack_int_id(m, pack_pos_to_midx(bitmap_git->midx, 0)); +} + int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct packed_git **packfile_out, uint32_t *entries, struct bitmap **reuse_out) { + struct packed_git *pack; struct bitmap *result = bitmap_git->result; struct bitmap *reuse; struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; - uint32_t objects_nr = bitmap_num_objects(bitmap_git); + uint32_t objects_nr; assert(result); + load_reverse_index(bitmap_git); + + if (bitmap_is_midx(bitmap_git)) + pack = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)]; + else + pack = bitmap_git->pack; + objects_nr = pack->num_objects; + while (i < result->word_alloc && result->words[i] == (eword_t)~0) i++; - /* Don't mark objects not in the packfile */ + /* + * Don't mark objects not in the packfile or preferred pack. This bitmap + * marks objects eligible for reuse, but the pack-reuse code only + * understands how to reuse a single pack. Since the preferred pack is + * guaranteed to have all bases for its deltas (in a multi-pack bitmap), + * we use it instead of another pack. In single-pack bitmaps, the choice + * is made for us. 
+ */ if (i > objects_nr / BITS_IN_EWORD) i = objects_nr / BITS_IN_EWORD; @@ -1244,8 +1477,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, break; offset += ewah_bit_ctz64(word >> offset); - if (try_partial_reuse(bitmap_git, pos + offset, reuse, - &w_curs) < 0) { + if (try_partial_reuse(bitmap_git, pack, pos + offset, + reuse, &w_curs) < 0) { /* * try_partial_reuse indicated we couldn't reuse * any bits, so there is no point in trying more @@ -1274,7 +1507,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, * need to be handled separately. */ bitmap_and_not(result, reuse); - *packfile_out = bitmap_git->pack; + *packfile_out = pack; *reuse_out = reuse; return 0; } @@ -1548,6 +1781,12 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; + if (!bitmap_is_midx(bitmap_git)) + load_reverse_index(bitmap_git); + else if (load_midx_revindex(bitmap_git->midx) < 0) + BUG("rebuild_existing_bitmaps: missing required rev-cache " + "extension"); + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); @@ -1555,8 +1794,13 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct object_id oid; struct object_entry *oe; - nth_packed_object_id(&oid, bitmap_git->pack, - pack_pos_to_index(bitmap_git->pack, i)); + if (bitmap_is_midx(bitmap_git)) + nth_midxed_object_oid(&oid, + bitmap_git->midx, + pack_pos_to_midx(bitmap_git->midx, i)); + else + nth_packed_object_id(&oid, bitmap_git->pack, + pack_pos_to_index(bitmap_git->pack, i)); oe = packlist_find(mapping, &oid); if (oe) @@ -1582,6 +1826,19 @@ void free_bitmap_index(struct bitmap_index *b) free(b->ext_index.hashes); bitmap_free(b->result); bitmap_free(b->haves); + if (bitmap_is_midx(b)) { + /* + * Multi-pack bitmaps need to have resources associated with + * their on-disk reverse indexes unmapped so that stale .rev and + * .bitmap files can be removed. 
+ * + * Unlike pack-based bitmaps, multi-pack bitmaps can be read and + * written in the same 'git multi-pack-index write --bitmap' + * process. Close resources so they can be removed safely on + * platforms like Windows. + */ + close_midx_revindex(b->midx); + } free(b); } @@ -1596,7 +1853,6 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, enum object_type object_type) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; off_t total = 0; struct ewah_iterator it; eword_t filter; @@ -1613,15 +1869,35 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, continue; for (offset = 0; offset < BITS_IN_EWORD; offset++) { - size_t pos; - if ((word >> offset) == 0) break; offset += ewah_bit_ctz64(word >> offset); - pos = base + offset; - total += pack_pos_to_offset(pack, pos + 1) - - pack_pos_to_offset(pack, pos); + + if (bitmap_is_midx(bitmap_git)) { + uint32_t pack_pos; + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, base + offset); + off_t offset = nth_midxed_offset(bitmap_git->midx, midx_pos); + + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + struct packed_git *pack = bitmap_git->midx->packs[pack_id]; + + if (offset_to_pack_pos(pack, offset, &pack_pos) < 0) { + struct object_id oid; + nth_midxed_object_oid(&oid, bitmap_git->midx, midx_pos); + + die(_("could not find %s in pack %s at offset %"PRIuMAX), + oid_to_hex(&oid), + pack->pack_name, + (uintmax_t)offset); + } + + total += pack_pos_to_offset(pack, pack_pos + 1) - offset; + } else { + size_t pos = base + offset; + total += pack_pos_to_offset(bitmap_git->pack, pos + 1) - + pack_pos_to_offset(bitmap_git->pack, pos); + } } } @@ -1672,6 +1948,11 @@ off_t get_disk_usage_from_bitmap(struct bitmap_index *bitmap_git, return total; } +int bitmap_is_midx(struct bitmap_index *bitmap_git) +{ + return !!bitmap_git->midx; +} + const struct string_list *bitmap_preferred_tips(struct repository *r) { return 
repo_config_get_value_multi(r, "pack.preferbitmaptips"); diff --git a/pack-bitmap.h b/pack-bitmap.h index 52ea10de51..81664f933f 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -44,6 +44,8 @@ typedef int (*show_reachable_fn)( struct bitmap_index; struct bitmap_index *prepare_bitmap_git(struct repository *r); +struct bitmap_index *prepare_midx_bitmap_git(struct repository *r, + struct multi_pack_index *midx); void count_bitmap_commit_list(struct bitmap_index *, uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags); void traverse_bitmap_commit_list(struct bitmap_index *, @@ -92,6 +94,10 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, const char *filename, uint16_t options); +char *midx_bitmap_filename(struct multi_pack_index *midx); +char *pack_bitmap_filename(struct packed_git *p); + +int bitmap_is_midx(struct bitmap_index *bitmap_git); const struct string_list *bitmap_preferred_tips(struct repository *r); int bitmap_is_preferred_refname(struct repository *r, const char *refname); diff --git a/packfile.c b/packfile.c index 9ef6d98292..371f5488cf 100644 --- a/packfile.c +++ b/packfile.c @@ -860,7 +860,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (starts_with(file_name, "multi-pack-index") && - ends_with(file_name, ".rev")) + (ends_with(file_name, ".bitmap") || ends_with(file_name, ".rev"))) return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") || From patchwork Tue Aug 24 16:16:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455529 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0
Date: Tue, 24 Aug 2021 12:16:27 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 15/25] pack-bitmap: write multi-pack bitmaps
Message-ID: <9d83ad77abff76b0c9e43603295461009cefa415.1629821743.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

Write multi-pack bitmaps in the format described by
Documentation/technical/bitmap-format.txt. A bitmap is written only when
'--bitmap' is given.

To write a multi-pack bitmap, this patch attempts to reuse as much of the
existing machinery from pack-objects as possible. Specifically, the MIDX
code prepares a packing_data struct that pretends that a single packfile
has been generated containing all of the objects in the MIDX.
Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 12 +- builtin/multi-pack-index.c | 2 + midx.c | 208 ++++++++++++++++++++++++- midx.h | 1 + 4 files changed, 214 insertions(+), 9 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index c9b063d31e..ed52459a9d 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -10,7 +10,7 @@ SYNOPSIS -------- [verse] 'git multi-pack-index' [--object-dir=] [--[no-]progress] - [--preferred-pack=] + [--preferred-pack=] [--[no-]bitmap] DESCRIPTION ----------- @@ -40,6 +40,9 @@ write:: multiple packs contain the same object. `` must contain at least one object. If not given, ties are broken in favor of the pack with the lowest mtime. + + --[no-]bitmap:: + Control whether or not a multi-pack bitmap is written. -- verify:: @@ -81,6 +84,13 @@ EXAMPLES $ git multi-pack-index write ----------------------------------------------- +* Write a MIDX file for the packfiles in the current .git folder with a +corresponding bitmap. ++ +------------------------------------------------------------- +$ git multi-pack-index write --preferred-pack= --bitmap +------------------------------------------------------------- + * Write a MIDX file for the packfiles in an alternate object store. 
 +
 -----------------------------------------------
diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index 8ff0dee2ec..73c0113b48 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -68,6 +68,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 		OPT_STRING(0, "preferred-pack", &opts.preferred_pack,
 			   N_("preferred-pack"),
 			   N_("pack for reuse when computing a multi-pack bitmap")),
+		OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"),
+			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
 		OPT_END(),
 	};
 
diff --git a/midx.c b/midx.c
index 2dceaf9565..4574e6d411 100644
--- a/midx.c
+++ b/midx.c
@@ -13,6 +13,10 @@
 #include "repository.h"
 #include "chunk-format.h"
 #include "pack.h"
+#include "pack-bitmap.h"
+#include "refs.h"
+#include "revision.h"
+#include "list-objects.h"
 
 #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
 #define MIDX_VERSION 1
@@ -893,6 +897,166 @@ static int midx_checksum_valid(struct multi_pack_index *m)
 	return hashfile_checksum_valid(m->data, m->data_len);
 }
 
+static void prepare_midx_packing_data(struct packing_data *pdata,
+				      struct write_midx_context *ctx)
+{
+	uint32_t i;
+
+	memset(pdata, 0, sizeof(struct packing_data));
+	prepare_packing_data(the_repository, pdata);
+
+	for (i = 0; i < ctx->entries_nr; i++) {
+		struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]];
+		struct object_entry *to = packlist_alloc(pdata, &from->oid);
+
+		oe_set_in_pack(pdata, to,
+			       ctx->info[ctx->pack_perm[from->pack_int_id]].p);
+	}
+}
+
+static int add_ref_to_pending(const char *refname,
+			      const struct object_id *oid,
+			      int flag, void *cb_data)
+{
+	struct rev_info *revs = (struct rev_info*)cb_data;
+	struct object *object;
+
+	if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) {
+		warning("symbolic ref is dangling: %s", refname);
+		return 0;
+	}
+
+	object = parse_object_or_die(oid, refname);
+	if (object->type != OBJ_COMMIT)
+		return 0;
+
+	add_pending_object(revs, object, "");
+	if (bitmap_is_preferred_refname(revs->repo, refname))
+		object->flags |= NEEDS_BITMAP;
+	return 0;
+}
+
+struct bitmap_commit_cb {
+	struct commit **commits;
+	size_t commits_nr, commits_alloc;
+
+	struct write_midx_context *ctx;
+};
+
+static const struct object_id *bitmap_oid_access(size_t index,
+						 const void *_entries)
+{
+	const struct pack_midx_entry *entries = _entries;
+	return &entries[index].oid;
+}
+
+static void bitmap_show_commit(struct commit *commit, void *_data)
+{
+	struct bitmap_commit_cb *data = _data;
+	int pos = oid_pos(&commit->object.oid, data->ctx->entries,
+			  data->ctx->entries_nr,
+			  bitmap_oid_access);
+	if (pos < 0)
+		return;
+
+	ALLOC_GROW(data->commits, data->commits_nr + 1, data->commits_alloc);
+	data->commits[data->commits_nr++] = commit;
+}
+
+static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p,
+						    struct write_midx_context *ctx)
+{
+	struct rev_info revs;
+	struct bitmap_commit_cb cb = {0};
+
+	cb.ctx = ctx;
+
+	repo_init_revisions(the_repository, &revs, NULL);
+	setup_revisions(0, NULL, &revs, NULL);
+	for_each_ref(add_ref_to_pending, &revs);
+
+	/*
+	 * Skipping promisor objects here is intentional, since it only excludes
+	 * them from the list of reachable commits that we want to select from
+	 * when computing the selection of MIDX'd commits to receive bitmaps.
+	 *
+	 * Reachability bitmaps do require that their objects be closed under
+	 * reachability, but fetching any objects missing from promisors at this
+	 * point is too late. But, if one of those objects can be reached from
+	 * an another object that is included in the bitmap, then we will
+	 * complain later that we don't have reachability closure (and fail
+	 * appropriately).
+	 */
+	fetch_if_missing = 0;
+	revs.exclude_promisor_objects = 1;
+
+	if (prepare_revision_walk(&revs))
+		die(_("revision walk setup failed"));
+
+	traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb);
+	if (indexed_commits_nr_p)
+		*indexed_commits_nr_p = cb.commits_nr;
+
+	return cb.commits;
+}
+
+static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash,
+			     struct write_midx_context *ctx,
+			     unsigned flags)
+{
+	struct packing_data pdata;
+	struct pack_idx_entry **index;
+	struct commit **commits = NULL;
+	uint32_t i, commits_nr;
+	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash));
+	int ret;
+
+	prepare_midx_packing_data(&pdata, ctx);
+
+	commits = find_commits_for_midx_bitmap(&commits_nr, ctx);
+
+	/*
+	 * Build the MIDX-order index based on pdata.objects (which is already
+	 * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of
+	 * this order).
+	 */
+	ALLOC_ARRAY(index, pdata.nr_objects);
+	for (i = 0; i < pdata.nr_objects; i++)
+		index[i] = &pdata.objects[i].idx;
+
+	bitmap_writer_show_progress(flags & MIDX_PROGRESS);
+	bitmap_writer_build_type_index(&pdata, index, pdata.nr_objects);
+
+	/*
+	 * bitmap_writer_finish expects objects in lex order, but pack_order
+	 * gives us exactly that. use it directly instead of re-sorting the
+	 * array.
+	 *
+	 * This changes the order of objects in 'index' between
+	 * bitmap_writer_build_type_index and bitmap_writer_finish.
+	 *
+	 * The same re-ordering takes place in the single-pack bitmap code via
+	 * write_idx_file(), which is called by finish_tmp_packfile(), which
+	 * happens between bitmap_writer_build_type_index() and
+	 * bitmap_writer_finish().
+	 */
+	for (i = 0; i < pdata.nr_objects; i++)
+		index[ctx->pack_order[i]] = &pdata.objects[i].idx;
+
+	bitmap_writer_select_commits(commits, commits_nr, -1);
+	ret = bitmap_writer_build(&pdata);
+	if (ret < 0)
+		goto cleanup;
+
+	bitmap_writer_set_checksum(midx_hash);
+	bitmap_writer_finish(index, pdata.nr_objects, bitmap_name, 0);
+
+cleanup:
+	free(index);
+	free(bitmap_name);
+	return ret;
+}
+
 static int write_midx_internal(const char *object_dir,
 			       struct string_list *packs_to_drop,
 			       const char *preferred_pack_name,
@@ -938,7 +1102,7 @@ static int write_midx_internal(const char *object_dir,
 
 			ctx.info[ctx.nr].orig_pack_int_id = i;
 			ctx.info[ctx.nr].pack_name = xstrdup(ctx.m->pack_names[i]);
-			ctx.info[ctx.nr].p = NULL;
+			ctx.info[ctx.nr].p = ctx.m->packs[i];
 			ctx.info[ctx.nr].expired = 0;
 
 			if (flags & MIDX_WRITE_REV_INDEX) {
@@ -972,8 +1136,26 @@ static int write_midx_internal(const char *object_dir,
 	for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx);
 	stop_progress(&ctx.progress);
 
-	if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop)
-		goto cleanup;
+	if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) {
+		struct bitmap_index *bitmap_git;
+		int bitmap_exists;
+		int want_bitmap = flags & MIDX_WRITE_BITMAP;
+
+		bitmap_git = prepare_midx_bitmap_git(the_repository, ctx.m);
+		bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git);
+		free_bitmap_index(bitmap_git);
+
+		if (bitmap_exists || !want_bitmap) {
+			/*
+			 * The correct MIDX already exists, and so does a
+			 * corresponding bitmap (or one wasn't requested).
+			 */
+			if (!want_bitmap)
+				clear_midx_files_ext(the_repository, ".bitmap",
+						     NULL);
+			goto cleanup;
+		}
+	}
 
 	if (preferred_pack_name) {
 		int found = 0;
@@ -989,7 +1171,8 @@ static int write_midx_internal(const char *object_dir,
 		if (!found)
 			warning(_("unknown preferred pack: '%s'"),
 				preferred_pack_name);
-	} else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) {
+	} else if (ctx.nr &&
+		   (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))) {
 		struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p;
 		ctx.preferred_pack_idx = 0;
 
@@ -1121,9 +1304,6 @@ static int write_midx_internal(const char *object_dir,
 	hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
 	f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
 
-	if (ctx.m)
-		close_object_store(the_repository->objects);
-
 	if (ctx.nr - dropped_packs == 0) {
 		error(_("no pack files to index."));
 		result = 1;
@@ -1154,14 +1334,24 @@ static int write_midx_internal(const char *object_dir,
 	finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM);
 	free_chunkfile(cf);
 
-	if (flags & MIDX_WRITE_REV_INDEX)
+	if (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))
 		ctx.pack_order = midx_pack_order(&ctx);
 
 	if (flags & MIDX_WRITE_REV_INDEX)
 		write_midx_reverse_index(midx_name, midx_hash, &ctx);
+	if (flags & MIDX_WRITE_BITMAP) {
+		if (write_midx_bitmap(midx_name, midx_hash, &ctx, flags) < 0) {
+			error(_("could not write multi-pack bitmap"));
+			result = 1;
+			goto cleanup;
+		}
+	}
+
+	close_object_store(the_repository->objects);
 
 	commit_lock_file(&lk);
 
+	clear_midx_files_ext(the_repository, ".bitmap", midx_hash);
 	clear_midx_files_ext(the_repository, ".rev", midx_hash);
 
 cleanup:
@@ -1178,6 +1368,7 @@ static int write_midx_internal(const char *object_dir,
 	free(ctx.pack_perm);
 	free(ctx.pack_order);
 	free(midx_name);
+
 	return result;
 }
 
@@ -1238,6 +1429,7 @@ void clear_midx_file(struct repository *r)
 	if (remove_path(midx))
 		die(_("failed to clear multi-pack-index at %s"), midx);
 
+	clear_midx_files_ext(r, ".bitmap", NULL);
 	clear_midx_files_ext(r, ".rev", NULL);
 
 	free(midx);
diff --git a/midx.h b/midx.h
index 1172df1a71..350f4d0a7b 100644
--- a/midx.h
+++ b/midx.h
@@ -41,6 +41,7 @@ struct multi_pack_index {
 
 #define MIDX_PROGRESS (1 << 0)
 #define MIDX_WRITE_REV_INDEX (1 << 1)
+#define MIDX_WRITE_BITMAP (1 << 2)
 
 const unsigned char *get_midx_checksum(struct multi_pack_index *m);
 char *get_midx_filename(const char *object_dir);

From patchwork Tue Aug 24 16:16:31 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455531
Date: Tue, 24 Aug 2021 12:16:31 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 16/25] t5310: move some tests to lib-bitmap.sh

We'll soon be adding a test script that will cover many of the same
bitmap concepts as t5310, but for MIDX bitmaps. Let's pull out as many
of the applicable tests as we can so we don't have to rewrite them.

There should be no functional change to t5310; we still run the same
operations in the same order.

Signed-off-by: Taylor Blau
---
 t/lib-bitmap.sh         | 236 ++++++++++++++++++++++++++++++++++++++++
 t/t5310-pack-bitmaps.sh | 227 +-------------------------------------
 2 files changed, 240 insertions(+), 223 deletions(-)

diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh
index fe3f98be24..77464da6fd 100644
--- a/t/lib-bitmap.sh
+++ b/t/lib-bitmap.sh
@@ -1,3 +1,6 @@
+# Helpers for scripts testing bitmap functionality; see t5310 for
+# example usage.
+
 # Compare a file containing rev-list bitmap traversal output to its non-bitmap
 # counterpart. You can't just use test_cmp for this, because the two produce
 # subtly different output:
@@ -24,3 +27,236 @@ test_bitmap_traversal () {
 	test_cmp "$1.normalized" "$2.normalized" &&
 	rm -f "$1.normalized" "$2.normalized"
 }
+
+# To ensure the logic for "maximal commits" is exercised, make
+# the repository a bit more complicated.
+#
+#    other                         second
+#      *                             *
+# (99 commits)                  (99 commits)
+#      *                             *
+#      |\                           /|
+#      | * octo-other  octo-second * |
+#      |/|\_________  ____________/|\|
+#      | \          \/  __________/  |
+#      |  | ________/\ /             |
+#      *  |/          * merge-right  *
+#      | _|__________/ \____________ |
+#      |/ |                         \|
+# (l1) *  * merge-left               * (r1)
+#      | / \________________________ |
+#      |/                           \|
+# (l2) *                             * (r2)
+#       \___________________________ |
+#                                   \|
+#                                    * (base)
+#
+# We only push bits down the first-parent history, which
+# makes some of these commits unimportant!
+#
+# The important part for the maximal commit algorithm is how
+# the bitmasks are extended. Assuming starting bit positions
+# for second (bit 0) and other (bit 1), the bitmasks at the
+# end should be:
+#
+#      second: 1       (maximal, selected)
+#       other: 01      (maximal, selected)
+#      (base): 11 (maximal)
+#
+# This complicated history was important for a previous
+# version of the walk that guarantees never walking a
+# commit multiple times. That goal might be important
+# again, so preserve this complicated case. For now, this
+# test will guarantee that the bitmaps are computed
+# correctly, even with the repeat calculations.
+setup_bitmap_history() {
+	test_expect_success 'setup repo with moderate-sized history' '
+		test_commit_bulk --id=file 10 &&
+		git branch -M second &&
+		git checkout -b other HEAD~5 &&
+		test_commit_bulk --id=side 10 &&
+
+		# add complicated history setup, including merges and
+		# ambiguous merge-bases
+
+		git checkout -b merge-left other~2 &&
+		git merge second~2 -m "merge-left" &&
+
+		git checkout -b merge-right second~1 &&
+		git merge other~1 -m "merge-right" &&
+
+		git checkout -b octo-second second &&
+		git merge merge-left merge-right -m "octopus-second" &&
+
+		git checkout -b octo-other other &&
+		git merge merge-left merge-right -m "octopus-other" &&
+
+		git checkout other &&
+		git merge octo-other -m "pull octopus" &&
+
+		git checkout second &&
+		git merge octo-second -m "pull octopus" &&
+
+		# Remove these branches so they are not selected
+		# as bitmap tips
+		git branch -D merge-left &&
+		git branch -D merge-right &&
+		git branch -D octo-other &&
+		git branch -D octo-second &&
+
+		# add padding to make these merges less interesting
+		# and avoid having them selected for bitmaps
+		test_commit_bulk --id=file 100 &&
+		git checkout other &&
+		test_commit_bulk --id=side 100 &&
+		git checkout second &&
+
+		bitmaptip=$(git rev-parse second) &&
+		blob=$(echo tagged-blob | git hash-object -w --stdin) &&
+		git tag tagged-blob $blob
+	'
+}
+
+rev_list_tests_head () {
+	test_expect_success "counting commits via bitmap ($state, $branch)" '
+		git rev-list --count $branch >expect &&
+		git rev-list --use-bitmap-index --count $branch >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting partial commits via bitmap ($state, $branch)" '
+		git rev-list --count $branch~5..$branch >expect &&
+		git rev-list --use-bitmap-index --count $branch~5..$branch >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting commits with limit ($state, $branch)" '
+		git rev-list --count -n 1 $branch >expect &&
+		git rev-list --use-bitmap-index --count -n 1 $branch >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting non-linear history ($state, $branch)" '
+		git rev-list --count other...second >expect &&
+		git rev-list --use-bitmap-index --count other...second >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting commits with limiting ($state, $branch)" '
+		git rev-list --count $branch -- 1.t >expect &&
+		git rev-list --use-bitmap-index --count $branch -- 1.t >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting objects via bitmap ($state, $branch)" '
+		git rev-list --count --objects $branch >expect &&
+		git rev-list --use-bitmap-index --count --objects $branch >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "enumerate commits ($state, $branch)" '
+		git rev-list --use-bitmap-index $branch >actual &&
+		git rev-list $branch >expect &&
+		test_bitmap_traversal --no-confirm-bitmaps expect actual
+	'
+
+	test_expect_success "enumerate --objects ($state, $branch)" '
+		git rev-list --objects --use-bitmap-index $branch >actual &&
+		git rev-list --objects $branch >expect &&
+		test_bitmap_traversal expect actual
+	'
+
+	test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" '
+		git rev-list --objects --use-bitmap-index $branch tagged-blob >actual &&
+		grep $blob actual
+	'
+}
+
+rev_list_tests () {
+	state=$1
+
+	for branch in "second" "other"
+	do
+		rev_list_tests_head
+	done
+}
+
+basic_bitmap_tests () {
+	tip="$1"
+	test_expect_success 'rev-list --test-bitmap verifies bitmaps' "
+		git rev-list --test-bitmap "${tip:-HEAD}"
+	"
+
+	rev_list_tests 'full bitmap'
+
+	test_expect_success 'clone from bitmapped repository' '
+		rm -fr clone.git &&
+		git clone --no-local --bare . clone.git &&
+		git rev-parse HEAD >expect &&
+		git --git-dir=clone.git rev-parse HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success 'partial clone from bitmapped repository' '
+		test_config uploadpack.allowfilter true &&
+		rm -fr partial-clone.git &&
+		git clone --no-local --bare --filter=blob:none . partial-clone.git &&
+		(
+			cd partial-clone.git &&
+			pack=$(echo objects/pack/*.pack) &&
+			git verify-pack -v "$pack" >have &&
+			awk "/blob/ { print \$1 }" <have >blobs &&
+			# we expect this single blob because of the direct ref
+			git rev-parse refs/tags/tagged-blob >expect &&
+			test_cmp expect blobs
+		)
+	'
+
+	test_expect_success 'setup further non-bitmapped commits' '
+		test_commit_bulk --id=further 10
+	'
+
+	rev_list_tests 'partial bitmap'
+
+	test_expect_success 'fetch (partial bitmap)' '
+		git --git-dir=clone.git fetch origin second:second &&
+		git rev-parse HEAD >expect &&
+		git --git-dir=clone.git rev-parse HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success 'enumerating progress counts pack-reused objects' '
+		count=$(git rev-list --objects --all --count) &&
+		git repack -adb &&
+
+		# check first with only reused objects; confirm that our
+		# progress showed the right number, and also that we did
+		# pack-reuse as expected. Check only the final "done"
+		# line of the meter (there may be an arbitrary number of
+		# intermediate lines ending with CR).
+ GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + /dev/null 2>stderr && + grep "Enumerating objects: $count, done" stderr && + grep "pack-reused $count" stderr && + + # now the same but with one non-reused object + git commit --allow-empty -m "an extra commit object" && + GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + /dev/null 2>stderr && + grep "Enumerating objects: $((count+1)), done" stderr && + grep "pack-reused $count" stderr + ' +} + +# have_delta +# +# Note that because this relies on cat-file, it might find _any_ copy of an +# object in the repository. The caller is responsible for making sure +# there's only one (e.g., via "repack -ad", or having just fetched a copy). +have_delta () { + echo $2 >expect && + echo $1 | git cat-file --batch-check="%(deltabase)" >actual && + test_cmp expect actual +} diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index b02838750e..4318f84d53 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -25,93 +25,10 @@ has_any () { grep -Ff "$1" "$2" } -# To ensure the logic for "maximal commits" is exercised, make -# the repository a bit more complicated. -# -# other second -# * * -# (99 commits) (99 commits) -# * * -# |\ /| -# | * octo-other octo-second * | -# |/|\_________ ____________/|\| -# | \ \/ __________/ | -# | | ________/\ / | -# * |/ * merge-right * -# | _|__________/ \____________ | -# |/ | \| -# (l1) * * merge-left * (r1) -# | / \________________________ | -# |/ \| -# (l2) * * (r2) -# \___________________________ | -# \| -# * (base) -# -# We only push bits down the first-parent history, which -# makes some of these commits unimportant! -# -# The important part for the maximal commit algorithm is how -# the bitmasks are extended. 
Assuming starting bit positions -# for second (bit 0) and other (bit 1), the bitmasks at the -# end should be: -# -# second: 1 (maximal, selected) -# other: 01 (maximal, selected) -# (base): 11 (maximal) -# -# This complicated history was important for a previous -# version of the walk that guarantees never walking a -# commit multiple times. That goal might be important -# again, so preserve this complicated case. For now, this -# test will guarantee that the bitmaps are computed -# correctly, even with the repeat calculations. +setup_bitmap_history -test_expect_success 'setup repo with moderate-sized history' ' - test_commit_bulk --id=file 10 && - git branch -M second && - git checkout -b other HEAD~5 && - test_commit_bulk --id=side 10 && - - # add complicated history setup, including merges and - # ambiguous merge-bases - - git checkout -b merge-left other~2 && - git merge second~2 -m "merge-left" && - - git checkout -b merge-right second~1 && - git merge other~1 -m "merge-right" && - - git checkout -b octo-second second && - git merge merge-left merge-right -m "octopus-second" && - - git checkout -b octo-other other && - git merge merge-left merge-right -m "octopus-other" && - - git checkout other && - git merge octo-other -m "pull octopus" && - - git checkout second && - git merge octo-second -m "pull octopus" && - - # Remove these branches so they are not selected - # as bitmap tips - git branch -D merge-left && - git branch -D merge-right && - git branch -D octo-other && - git branch -D octo-second && - - # add padding to make these merges less interesting - # and avoid having them selected for bitmaps - test_commit_bulk --id=file 100 && - git checkout other && - test_commit_bulk --id=side 100 && - git checkout second && - - bitmaptip=$(git rev-parse second) && - blob=$(echo tagged-blob | git hash-object -w --stdin) && - git tag tagged-blob $blob && - git config repack.writebitmaps true +test_expect_success 'setup writing bitmaps during repack' ' + git 
config repack.writeBitmaps true ' test_expect_success 'full repack creates bitmaps' ' @@ -123,109 +40,7 @@ test_expect_success 'full repack creates bitmaps' ' grep "\"key\":\"num_maximal_commits\",\"value\":\"107\"" trace ' -test_expect_success 'rev-list --test-bitmap verifies bitmaps' ' - git rev-list --test-bitmap HEAD -' - -rev_list_tests_head () { - test_expect_success "counting commits via bitmap ($state, $branch)" ' - git rev-list --count $branch >expect && - git rev-list --use-bitmap-index --count $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting partial commits via bitmap ($state, $branch)" ' - git rev-list --count $branch~5..$branch >expect && - git rev-list --use-bitmap-index --count $branch~5..$branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limit ($state, $branch)" ' - git rev-list --count -n 1 $branch >expect && - git rev-list --use-bitmap-index --count -n 1 $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting non-linear history ($state, $branch)" ' - git rev-list --count other...second >expect && - git rev-list --use-bitmap-index --count other...second >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limiting ($state, $branch)" ' - git rev-list --count $branch -- 1.t >expect && - git rev-list --use-bitmap-index --count $branch -- 1.t >actual && - test_cmp expect actual - ' - - test_expect_success "counting objects via bitmap ($state, $branch)" ' - git rev-list --count --objects $branch >expect && - git rev-list --use-bitmap-index --count --objects $branch >actual && - test_cmp expect actual - ' - - test_expect_success "enumerate commits ($state, $branch)" ' - git rev-list --use-bitmap-index $branch >actual && - git rev-list $branch >expect && - test_bitmap_traversal --no-confirm-bitmaps expect actual - ' - - test_expect_success "enumerate --objects ($state, $branch)" ' - git rev-list --objects 
--use-bitmap-index $branch >actual && - git rev-list --objects $branch >expect && - test_bitmap_traversal expect actual - ' - - test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" ' - git rev-list --objects --use-bitmap-index $branch tagged-blob >actual && - grep $blob actual - ' -} - -rev_list_tests () { - state=$1 - - for branch in "second" "other" - do - rev_list_tests_head - done -} - -rev_list_tests 'full bitmap' - -test_expect_success 'clone from bitmapped repository' ' - git clone --no-local --bare . clone.git && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' - -test_expect_success 'partial clone from bitmapped repository' ' - test_config uploadpack.allowfilter true && - git clone --no-local --bare --filter=blob:none . partial-clone.git && - ( - cd partial-clone.git && - pack=$(echo objects/pack/*.pack) && - git verify-pack -v "$pack" >have && - awk "/blob/ { print \$1 }" blobs && - # we expect this single blob because of the direct ref - git rev-parse refs/tags/tagged-blob >expect && - test_cmp expect blobs - ) -' - -test_expect_success 'setup further non-bitmapped commits' ' - test_commit_bulk --id=further 10 -' - -rev_list_tests 'partial bitmap' - -test_expect_success 'fetch (partial bitmap)' ' - git --git-dir=clone.git fetch origin second:second && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' +basic_bitmap_tests test_expect_success 'incremental repack fails when bitmaps are requested' ' test_commit more-1 && @@ -461,40 +276,6 @@ test_expect_success 'truncated bitmap fails gracefully (cache)' ' test_i18ngrep corrupted.bitmap.index stderr ' -test_expect_success 'enumerating progress counts pack-reused objects' ' - count=$(git rev-list --objects --all --count) && - git repack -adb && - - # check first with only reused objects; confirm that our progress - # showed the right number, and also that 
we did pack-reuse as expected. - # Check only the final "done" line of the meter (there may be an - # arbitrary number of intermediate lines ending with CR). - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - /dev/null 2>stderr && - grep "Enumerating objects: $count, done" stderr && - grep "pack-reused $count" stderr && - - # now the same but with one non-reused object - git commit --allow-empty -m "an extra commit object" && - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - /dev/null 2>stderr && - grep "Enumerating objects: $((count+1)), done" stderr && - grep "pack-reused $count" stderr -' - -# have_delta -# -# Note that because this relies on cat-file, it might find _any_ copy of an -# object in the repository. The caller is responsible for making sure -# there's only one (e.g., via "repack -ad", or having just fetched a copy). -have_delta () { - echo $2 >expect && - echo $1 | git cat-file --batch-check="%(deltabase)" >actual && - test_cmp expect actual -} - # Create a state of history with these properties: # # - refs that allow a client to fetch some new history, while sharing some old From patchwork Tue Aug 24 16:16:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B22C9C4320A for ; Tue, 24 Aug 2021 16:17:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with 
ESMTP id 9C82161212 for ; Tue, 24 Aug 2021 16:17:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229666AbhHXQRo (ORCPT ); Tue, 24 Aug 2021 12:17:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46732 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230192AbhHXQRY (ORCPT ); Tue, 24 Aug 2021 12:17:24 -0400 Received: from mail-io1-xd31.google.com (mail-io1-xd31.google.com [IPv6:2607:f8b0:4864:20::d31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C411C0617AE for ; Tue, 24 Aug 2021 09:16:35 -0700 (PDT) Received: by mail-io1-xd31.google.com with SMTP id a21so27066042ioq.6 for ; Tue, 24 Aug 2021 09:16:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=EiDGZ/YjEwrBF8MU75wdKIksLOVBYJ6qMZ9IydcqelI=; b=y5lTMAjiF3xqZBm8NHRSbbzvlOhNUT9fW+5vkx939qeYbhNdy3cYLqnktKs7RX62vq QWbkjwNvL7UYtDT76f/9GyIZrKCSlQDxCR1C4RnIlmoCTKFcBrABrAjTeTK7BwRpMlkd yhNtP2PGsd/9q9j5r20TTDdtGZuAZtC19exlBJaKDiGAhM7qC52DQDKzpNyI2zuKE/uu NYyoWA8Cf3V3QYxzCCdVz6eMCXi2HiPyhHGYYc3uEVTjE1oXYv2B5xv2PL29rWG7TucA njggKbrGnioGBPwUyFHeqJ0kDbH0tOEHrl8Id5jX3AhQXjv0zTAogY2GMZIx4vWLAcbi eHOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=EiDGZ/YjEwrBF8MU75wdKIksLOVBYJ6qMZ9IydcqelI=; b=n2qEkAF5/uF8a+aIBuYM28DDL31Ls2lSzxzf+TAXQynXDuXMj9cIXb4G19mJAmJXjA xWeTbOMi5O+wA2TJhPk76DApN8QCiS/kjRySrM+5NhNTNwO6j5Y1pb02EWRD+4M89h1X 2hHzYkBE7JR63PSSCIb/a6in6T4NYCz351APpbBG/YEiYO9/ccrk5fn2Iaa4bmsBGGLz JBLM5Rifwx3hBhUnX68HflNKRZmBnBvCDBjHkhq5jwxnujHEVHCwssysyFimVVu8rND8 BmObQz7GVTNj/+MOby3Ni3A9NEU88+1/a3IFWatPv1pGkPfiwLYDvwWEy5RDu0X8lz5E Fg9g== X-Gm-Message-State: 
Date: Tue, 24 Aug 2021 12:16:33 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 17/25] t/helper/test-read-midx.c: add --checksum mode

Subsequent tests will want to check for the existence of a multi-pack
bitmap which matches the multi-pack-index stored in the pack directory.

The multi-pack bitmap includes the hex checksum of the MIDX it
corresponds to in its filename (for example,
'$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests
want a way to learn what '<checksum>' is.

This helper addresses that need by printing the checksum of the
repository's multi-pack-index.
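As a rough illustration of where that value comes from, the sketch below
reads the hash trailer straight off the end of a multi-pack-index file and
builds the matching bitmap path. It assumes the checksum is the trailing
hash of the file and that the repository uses SHA-1 (20 bytes); the
function names 'trailer_hex' and 'midx_bitmap_path' are hypothetical and
are not part of the patch.

```shell
#!/bin/sh
# Sketch only: print the last N bytes of a file as lowercase hex.
# Assumption: the MIDX checksum is the trailing hash of the file, 20 bytes
# for SHA-1 repositories (pass 32 as the second argument for SHA-256).
trailer_hex () {
	file=$1
	len=${2:-20}
	tail -c "$len" "$file" | od -An -v -tx1 | tr -d ' \n'
	echo
}

# Hypothetical helper: build the bitmap path a test would then look for.
midx_bitmap_path () {
	packdir=$1
	printf '%s/multi-pack-index-%s.bitmap\n' \
		"$packdir" "$(trailer_hex "$packdir/multi-pack-index" "${2:-20}")"
}
```

Inside the test suite the 'midx_checksum' wrapper added by this patch is
the natural choice; a sketch like this is only useful for inspecting a
repository by hand, where test-tool is not available.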

Signed-off-by: Taylor Blau
---
 t/helper/test-read-midx.c | 16 +++++++++++++++-
 t/lib-bitmap.sh           |  4 ++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c
index 7c2eb11a8e..cb0d27049a 100644
--- a/t/helper/test-read-midx.c
+++ b/t/helper/test-read-midx.c
@@ -60,12 +60,26 @@ static int read_midx_file(const char *object_dir, int show_objects)
 	return 0;
 }
 
+static int read_midx_checksum(const char *object_dir)
+{
+	struct multi_pack_index *m;
+
+	setup_git_directory();
+	m = load_multi_pack_index(object_dir, 1);
+	if (!m)
+		return 1;
+	printf("%s\n", hash_to_hex(get_midx_checksum(m)));
+	return 0;
+}
+
 int cmd__read_midx(int argc, const char **argv)
 {
 	if (!(argc == 2 || argc == 3))
-		usage("read-midx [--show-objects] <object-dir>");
+		usage("read-midx [--show-objects|--checksum] <object-dir>");
 
 	if (!strcmp(argv[1], "--show-objects"))
 		return read_midx_file(argv[2], 1);
+	else if (!strcmp(argv[1], "--checksum"))
+		return read_midx_checksum(argv[2]);
 	return read_midx_file(argv[1], 0);
 }
diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh
index 77464da6fd..21d0392dda 100644
--- a/t/lib-bitmap.sh
+++ b/t/lib-bitmap.sh
@@ -260,3 +260,7 @@ have_delta () {
 	echo $1 | git cat-file --batch-check="%(deltabase)" >actual &&
 	test_cmp expect actual
 }
+
+midx_checksum () {
+	test-tool read-midx --checksum "$1"
+}

From patchwork Tue Aug 24 16:16:36 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455539
Date: Tue, 24 Aug 2021 12:16:36 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 18/25] t5326: test multi-pack bitmap behavior
Message-ID: <9d9d9f28a6703a49aa9c62892985dea32543d880.1629821743.git.me@ttaylorr.com>

This patch introduces a new test, t5326, which tests the basic
functionality of multi-pack bitmaps.

Some trivial behavior is tested, such as:

  - Whether bitmaps can be generated with more than one pack.
  - Whether clones can be served with all objects in the bitmap.
  - Whether follow-up fetches can be served with some objects outside of
    the server's bitmap.

These use lib-bitmap's tests (which in turn were pulled from t5310), and
we cover cases where the MIDX represents both a single pack and multiple
packs.

In addition, some non-trivial and MIDX-specific behavior is tested, too,
including:

  - Whether multi-pack bitmaps behave correctly with respect to the
    pack-reuse machinery when the base for some object is selected from
    a different pack than the delta.
- Whether multi-pack bitmaps correctly respect the pack.preferBitmapTips configuration. Signed-off-by: Taylor Blau --- t/t5326-multi-pack-bitmaps.sh | 286 ++++++++++++++++++++++++++++++++++ 1 file changed, 286 insertions(+) create mode 100755 t/t5326-multi-pack-bitmaps.sh diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh new file mode 100755 index 0000000000..4ad7c2c969 --- /dev/null +++ b/t/t5326-multi-pack-bitmaps.sh @@ -0,0 +1,286 @@ +#!/bin/sh + +test_description='exercise basic multi-pack bitmap functionality' +. ./test-lib.sh +. "${TEST_DIRECTORY}/lib-bitmap.sh" + +# We'll be writing our own midx and bitmaps, so avoid getting confused by the +# automatic ones. +GIT_TEST_MULTI_PACK_INDEX=0 +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + +objdir=.git/objects +midx=$objdir/pack/multi-pack-index + +# midx_pack_source +midx_pack_source () { + test-tool read-midx --show-objects .git/objects | grep "^$1 " | cut -f2 +} + +setup_bitmap_history + +test_expect_success 'enable core.multiPackIndex' ' + git config core.multiPackIndex true +' + +test_expect_success 'create single-pack midx with bitmaps' ' + git repack -ad && + git multi-pack-index write --bitmap && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev +' + +basic_bitmap_tests + +test_expect_success 'create new additional packs' ' + for i in $(test_seq 1 16) + do + test_commit "$i" && + git repack -d || return 1 + done && + + git checkout -b other2 HEAD~8 && + for i in $(test_seq 1 8) + do + test_commit "side-$i" && + git repack -d || return 1 + done && + git checkout second +' + +test_expect_success 'create multi-pack midx with bitmaps' ' + git multi-pack-index write --bitmap && + + ls $objdir/pack/pack-*.pack >packs && + test_line_count = 25 packs && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev 
+' + +basic_bitmap_tests + +test_expect_success '--no-bitmap is respected when bitmaps exist' ' + git multi-pack-index write --bitmap && + + test_commit respect--no-bitmap && + git repack -d && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev && + + git multi-pack-index write --no-bitmap && + + test_path_is_file $midx && + test_path_is_missing $midx-$(midx_checksum $objdir).bitmap && + test_path_is_missing $midx-$(midx_checksum $objdir).rev +' + +test_expect_success 'setup midx with base from later pack' ' + # Write a and b so that "a" is a delta on top of base "b", since Git + # prefers to delete contents out of a base rather than add to a shorter + # object. + test_seq 1 128 >a && + test_seq 1 130 >b && + + git add a b && + git commit -m "initial commit" && + + a=$(git rev-parse HEAD:a) && + b=$(git rev-parse HEAD:b) && + + # In the first pack, "a" is stored as a delta to "b". + p1=$(git pack-objects .git/objects/pack/pack <<-EOF + $a + $b + EOF + ) && + + # In the second pack, "a" is missing, and "b" is not a delta nor base to + # any other object. + p2=$(git pack-objects .git/objects/pack/pack <<-EOF + $b + $(git rev-parse HEAD) + $(git rev-parse HEAD^{tree}) + EOF + ) && + + git prune-packed && + # Use the second pack as the preferred source, so that "b" occurs + # earlier in the MIDX object order, rendering "a" unusable for pack + # reuse. + git multi-pack-index write --bitmap --preferred-pack=pack-$p2.idx && + + have_delta $a $b && + test $(midx_pack_source $a) != $(midx_pack_source $b) +' + +rev_list_tests 'full bitmap with backwards delta' + +test_expect_success 'clone with bitmaps enabled' ' + git clone --no-local --bare . 
clone-reverse-delta.git && + test_when_finished "rm -fr clone-reverse-delta.git" && + + git rev-parse HEAD >expect && + git --git-dir=clone-reverse-delta.git rev-parse HEAD >actual && + test_cmp expect actual +' + +bitmap_reuse_tests() { + from=$1 + to=$2 + + test_expect_success "setup pack reuse tests ($from -> $to)" ' + rm -fr repo && + git init repo && + ( + cd repo && + test_commit_bulk 16 && + git tag old-tip && + + git config core.multiPackIndex true && + if test "MIDX" = "$from" + then + git repack -Ad && + git multi-pack-index write --bitmap + else + git repack -Adb + fi + ) + ' + + test_expect_success "build bitmap from existing ($from -> $to)" ' + ( + cd repo && + test_commit_bulk --id=further 16 && + git tag new-tip && + + if test "MIDX" = "$to" + then + git repack -d && + git multi-pack-index write --bitmap + else + git repack -Adb + fi + ) + ' + + test_expect_success "verify resulting bitmaps ($from -> $to)" ' + ( + cd repo && + git for-each-ref && + git rev-list --test-bitmap refs/tags/old-tip && + git rev-list --test-bitmap refs/tags/new-tip + ) + ' +} + +bitmap_reuse_tests 'pack' 'MIDX' +bitmap_reuse_tests 'MIDX' 'pack' +bitmap_reuse_tests 'MIDX' 'MIDX' + +test_expect_success 'missing object closure fails gracefully' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit loose && + test_commit packed && + + # Do not pass "--revs"; we want a pack without the "loose" + # commit. 
+ git pack-objects $objdir/pack/pack <<-EOF && + $(git rev-parse packed) + EOF + + test_must_fail git multi-pack-index write --bitmap 2>err && + grep "doesn.t have full closure" err && + test_path_is_missing $midx + ) +' + +test_expect_success 'setup partial bitmaps' ' + test_commit packed && + git repack && + test_commit loose && + git multi-pack-index write --bitmap 2>err && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev +' + +basic_bitmap_tests HEAD~ + +test_expect_success 'removing a MIDX clears stale bitmaps' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + test_commit base && + git repack && + git multi-pack-index write --bitmap && + + # Write a MIDX and bitmap; remove the MIDX but leave the bitmap. + stale_bitmap=$midx-$(midx_checksum $objdir).bitmap && + stale_rev=$midx-$(midx_checksum $objdir).rev && + rm $midx && + + # Then write a new MIDX. + test_commit new && + git repack && + git multi-pack-index write --bitmap && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev && + test_path_is_missing $stale_bitmap && + test_path_is_missing $stale_rev + ) +' + +test_expect_success 'pack.preferBitmapTips' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit_bulk --message="%s" 103 && + + git log --format="%H" >commits.raw && + sort commits && + + git log --format="create refs/tags/%s %H" HEAD >refs && + git update-ref --stdin bitmaps && + comm -13 bitmaps commits >before && + test_line_count = 1 before && + + perl -ne "printf(\"create refs/tags/include/%d \", $.); print" \ + bitmaps && + comm -13 bitmaps commits >after && + + ! 
test_cmp before after
+	)
+'
+
+test_done

From patchwork Tue Aug 24 16:16:38 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455535
Date: Tue, 24 Aug 2021 12:16:38 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 19/25] t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
Message-ID: <3e0da7e5ed0da144a5305c0bb211ecebd71e798b.1629821743.git.me@ttaylorr.com>

From: Jeff King

Generating a MIDX bitmap causes tests which repack in a partial clone to
fail because they are missing objects. Missing objects are an expected
component of tests in t0410, so disable this knob altogether. Graceful
degradation when writing a bitmap with missing objects is tested in
t5326.

Signed-off-by: Taylor Blau
---
 t/t0410-partial-clone.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index bbcc51ee8e..bba679685f 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -4,6 +4,9 @@ test_description='partial clone'
 
 . ./test-lib.sh
 
+# missing promisor objects cause repacks which write bitmaps to fail
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+
 delete_object () {
 	rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|')
 }

From patchwork Tue Aug 24 16:16:41 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455537
Date: Tue, 24 Aug 2021 12:16:41 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 20/25] t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
Message-ID: <4e0d49a2dd6856763486bf7931199494a521d29c.1629821743.git.me@ttaylorr.com>

From: Jeff King

Generating a MIDX bitmap confuses many of the tests in t5310, which
expect to control whether and how bitmaps are written. Since the
relevant MIDX-bitmap tests here are covered already in t5326, let's just
disable the flag for the whole t5310 script.

Signed-off-by: Jeff King
Signed-off-by: Taylor Blau
---
 t/t5310-pack-bitmaps.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 4318f84d53..673baa5c3c 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -8,6 +8,10 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . "$TEST_DIRECTORY"/lib-bundle.sh
 . "$TEST_DIRECTORY"/lib-bitmap.sh
 
+# t5310 deals only with single-pack bitmaps, so don't write MIDX bitmaps in
+# their place.
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+
 objpath () {
 	echo ".git/objects/$(echo "$1" | sed -e 's|\(..\)|\1/|')"
 }

From patchwork Tue Aug 24 16:16:44 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455541
Date: Tue, 24 Aug 2021 12:16:44 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 21/25] t5319: don't write MIDX bitmaps in t5319
Message-ID: <47eba8ecf93e6e220ffbb98ebdb7e958db2c2d53.1629821743.git.me@ttaylorr.com>

This test is specifically about generating a midx still respecting a
pack-based bitmap file. Generating a MIDX bitmap would confuse the test.
Let's override the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' variable to
make sure we don't do so.

Signed-off-by: Taylor Blau
---
 t/t5319-multi-pack-index.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 9b184bd45e..a81375d920 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -504,7 +504,8 @@ test_expect_success 'repack preserves multi-pack-index when creating packs' '
 compare_results_with_midx "after repack"
 
 test_expect_success 'multi-pack-index and pack-bitmap' '
-	git -c repack.writeBitmaps=true repack -ad &&
+	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
+		git -c repack.writeBitmaps=true repack -ad &&
 	git multi-pack-index write &&
 	git rev-list --test-bitmap HEAD
 '

From patchwork Tue Aug 24 16:16:46 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12455547
Date: Tue, 24 Aug 2021 12:16:46 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v4 22/25] t7700: update to work with MIDX bitmap test knob
Message-ID: <3d78afa2ad35b96886af78a295f1fecc3d7e6170.1629821743.git.me@ttaylorr.com>

A number of these tests are focused only on pack-based bitmaps and need
to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where
necessary.
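For readers less familiar with the idiom this patch relies on: a variable
assignment placed directly before a command applies only to that command's
environment, so the knob can be switched off for one invocation while
staying on for the rest of the script. A minimal sketch, in which 'KNOB'
is a hypothetical stand-in for the real variable and 'sh -c' stands in for
the git command:

```shell
#!/bin/sh
# KNOB is a hypothetical stand-in for GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP.
KNOB=1
export KNOB

# The prefix assignment is visible only to this one child process.
scoped=$(KNOB=0 sh -c 'echo "$KNOB"')

# Later commands still see the exported, script-wide value.
later=$(sh -c 'echo "$KNOB"')

echo "scoped=$scoped later=$later"
```

Scoping the override per command keeps the knob active for any other
commands in the same script that do not depend on pack-based bitmaps.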
Signed-off-by: Taylor Blau --- t/t7700-repack.sh | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 25b235c063..98eda3bfeb 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -63,13 +63,14 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' test_expect_success 'writing bitmaps via command-line can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git repack -Adbl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 git repack -Adbl && test_has_duplicate_object true ' test_expect_success 'writing bitmaps via config can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git -c repack.writebitmaps=true repack -Adl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writebitmaps=true repack -Adl && test_has_duplicate_object true ' @@ -189,7 +190,9 @@ test_expect_success 'repack --keep-pack' ' test_expect_success 'bitmaps are created by default in bare repos' ' git clone --bare .git bare.git && - git -C bare.git repack -ad && + rm -f bare.git/objects/pack/*.bitmap && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap) && test_path_is_file "$bitmap" ' @@ -200,7 +203,8 @@ test_expect_success 'incremental repack does not complain' ' ' test_expect_success 'bitmaps can be disabled on bare repos' ' - git -c repack.writeBitmaps=false -C bare.git repack -ad && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writeBitmaps=false -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap || :) && test -z "$bitmap" ' @@ -211,7 +215,8 @@ test_expect_success 'no bitmaps created if .keep files present' ' keep=${pack%.pack}.keep && test_when_finished "rm -f \"\$keep\"" && >"$keep" && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && 
test_must_be_empty stderr && find bare.git/objects/pack/ -type f -name "*.bitmap" >actual && test_must_be_empty actual @@ -222,7 +227,8 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' blob=$(test-tool genrandom big $((1024*1024)) | git -C bare.git hash-object -w --stdin) && git -C bare.git update-ref refs/tags/big $blob && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && test_must_be_empty stderr && find bare.git/objects/pack -type f -name "*.bitmap" >actual && test_must_be_empty actual From patchwork Tue Aug 24 16:16:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF11FC4338F for ; Tue, 24 Aug 2021 16:17:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BE3B6610A3 for ; Tue, 24 Aug 2021 16:17:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232283AbhHXQSC (ORCPT ); Tue, 24 Aug 2021 12:18:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231496AbhHXQRh (ORCPT ); Tue, 24 Aug 2021 12:17:37 -0400 Received: from mail-il1-x12b.google.com (mail-il1-x12b.google.com [IPv6:2607:f8b0:4864:20::12b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
8F703C0612A7 for ; Tue, 24 Aug 2021 09:16:50 -0700 (PDT) Received: by mail-il1-x12b.google.com with SMTP id l10so7315459ilh.8 for ; Tue, 24 Aug 2021 09:16:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=SoorQ/GWxuzmOTfjtez+99in0f4knnTQwgEu6e/ih1M=; b=JlFbGCfGna9JxUB9J1mXgncnXGOwuHfYH1rIxHit9XpEJ2o5j0jeinQaajYrJJFA+r LLYHcD1wpgoj4kQ7YOMYKqFGTWl5tzCvX1HD+DRai9Ap6Z1z/HWO5epv973BhhZosMtR E/X7XkaBLWh/xDDupl+Xv5h26XEi1dmb8MzCA28hmba4M7pUnh5Sfy6RwZE7w8iAxldC OBxkji4kpmcvyxOPgpw108fzHEdgmQr2Ne+3AP2oK+ID7+hyngkVm56w/60y5t45EPx6 l6ICjRKoHZ+7b8Lpc8uFBnpbTNrbjFiMl0TpF57InPj1vCpSsghN7JjwXh4XCI07BL0w mYOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=SoorQ/GWxuzmOTfjtez+99in0f4knnTQwgEu6e/ih1M=; b=iRl+rCJcqZ0cJqgaxnQNA4SdA8oEuJO/B7MUEa0e5zpSjYJxJb08E5XrpPhwh2GSyK Hsy7LVyMKFqJMKrAYnXF60LhnLZMCS149If39rM/3atu0w2/evOsQBdOY4a9m0/iKqI9 VZQG6Ub8/llTdlNG09q0tAE3sV8hayIebX7RrYY+kqBKDFsMbqoifzTwOQyfuX95/fgU dOZOz+j2bwvnSjLhoJrWcqDDagvnbrqTAOp3ZOpK7RHRVbSUZf8IHYPAnWXftnHhYmb5 vm/bJyEtVHTe2LSwjtlXILbWRLkCSt2SW/RrZIAqsc72+KFPcv+Q8ITMKIWcMGe2gAGF 9+3g== X-Gm-Message-State: AOAM532M226nWU3Msi2mz8eH+WHUeeh6KxBkGpUyVxKFpyB59oFt0mjk t/evEMP2dpzqbqFzk6kYeihDWKcTlI+3AjK/ X-Google-Smtp-Source: ABdhPJwXnl9xXhmQwJFma8C88J+oZK5+R+jZRkA8hMgJhh6Y4mkXmLOOX26trFJZJGS+r3PEk0gHWA== X-Received: by 2002:a92:c5cf:: with SMTP id s15mr26623320ilt.62.1629821809923; Tue, 24 Aug 2021 09:16:49 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id x5sm5766231ioa.35.2021.08.24.09.16.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Aug 2021 09:16:49 -0700 (PDT) Date: Tue, 24 Aug 2021 12:16:49 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v4 23/25] midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable to also write a multi-pack bitmap when 'GIT_TEST_MULTI_PACK_INDEX' is set. Signed-off-by: Taylor Blau --- builtin/repack.c | 12 ++++++++++-- ci/run-build-and-tests.sh | 1 + midx.h | 2 ++ t/README | 4 ++++ 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..82ab668272 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -515,6 +515,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()) write_bitmaps = 0; + } else if (write_bitmaps && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) { + write_bitmaps = 0; } if (pack_kept_objects < 0) pack_kept_objects = write_bitmaps > 0; @@ -725,8 +729,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) update_server_info(0); remove_temporary_files(); - if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), NULL, 0); + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) { + unsigned flags = 0; + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) + flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX; + write_midx_file(get_object_directory(), NULL, flags); + } string_list_clear(&names, 0); string_list_clear(&rollback, 0); diff 
--git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 3ce81ffee9..7ee9ba9325 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -23,6 +23,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 + export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master export GIT_TEST_WRITE_REV_INDEX=1 diff --git a/midx.h b/midx.h index 350f4d0a7b..aa3da557bb 100644 --- a/midx.h +++ b/midx.h @@ -8,6 +8,8 @@ struct pack_entry; struct repository; #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" +#define GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP \ + "GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP" struct multi_pack_index { struct multi_pack_index *next; diff --git a/t/README b/t/README index 9e70122302..12014aa988 100644 --- a/t/README +++ b/t/README @@ -425,6 +425,10 @@ GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack- index to be written after every 'git repack' command, and overrides the 'core.multiPackIndex' setting to true. +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=, when true, sets the +'--bitmap' option on all invocations of 'git multi-pack-index write', +and ignores pack-objects' '--write-bitmap-index'. 
+ GIT_TEST_SIDEBAND_ALL=, when true, overrides the 'uploadpack.allowSidebandAll' setting to true, and when false, forces fetch-pack to not request sideband-all (even if the server advertises From patchwork Tue Aug 24 16:16:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D48FC4338F for ; Tue, 24 Aug 2021 16:17:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 01F5961212 for ; Tue, 24 Aug 2021 16:17:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232422AbhHXQSH (ORCPT ); Tue, 24 Aug 2021 12:18:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231791AbhHXQRh (ORCPT ); Tue, 24 Aug 2021 12:17:37 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE1C3C06179A for ; Tue, 24 Aug 2021 09:16:52 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id b200so27052859iof.13 for ; Tue, 24 Aug 2021 09:16:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; 
bh=sAWq7ApHqrUs88UHszHQDp+raK9BhywjCqAJGMbjwdk=; b=XhZy7F+Ina8SPksXzHtA108G/5S9Icqcbu61mGbQEdxgGQAptvKCqJKhwUR5RjqyKa zPBTvkFGv/74bKknSgG2j5NNe69rS+tj6O3oXQB0BFWZLFbozmz1l9vytGdiEN3KB5IE B2tlfBFQ97iSZU1blM5BkV5vSf+amqIwD9KicoYbQDsafzHl10eEARfH1+y1YB6XtW+C 9C3J3XCKiv/OMCTBjWAOBd4aOMNoxcSDsB/IIOjdJyEhvzzBr2ZFRRH2X9OlrenorxXv VJ92ISqRibf6vZHzlnS1iKIZXrO/chDv0DyEaRZ3SZZlIomJF1/gUah8eBkOd8hmUQq1 6rnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=sAWq7ApHqrUs88UHszHQDp+raK9BhywjCqAJGMbjwdk=; b=f0U4NEcIY+kGrFqohhVa3FMi0eQcI5LqqWjd2Y9ZtjH7ffL/3L4XV5DKgUO8t8REkr zhjBFIxhvnWxnuP9HKCCVj5cK1n+Eo0AohB2UPlnKOaMbI6m0KOwHv8wgf4Z2Q1gtfbz JA62fS5bKLsgnUtQr/PewKMnw2WVUFrJ5oZokKLNuOe3OJ2CZbDm7K3OuUNa2NcQ2NfC +LjAf+vEIfhAdL+1CqMMC8hqXujLecalETRUAfgY5CnQ25Rz9DZhsHdp1Wn3JFnbfD25 zYr+MbVGRJMSKEVfoBHPQ3qCfS+Ie94Btxg3ZaSuXLfQtWTYrmmdK0auhQzxw3H9jtxB p3IQ== X-Gm-Message-State: AOAM533Ym+bh+Y+xjYted0k0rcrwkHeyKvTBYpgjSA9Mgy7b6GLga6yO 3Ly4g9Bhm224/E9VJ6cI914TMb19b30m7Kgp X-Google-Smtp-Source: ABdhPJyeUe6e2Hq2cyze7fzUQxdfeVPpeTt79QXoLr/lzVkGjcpQ2zaF1C9lrJrz+TuKtfseR0lPyw== X-Received: by 2002:a02:7348:: with SMTP id a8mr34837476jae.116.1629821812246; Tue, 24 Aug 2021 09:16:52 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id d14sm8693459iod.18.2021.08.24.09.16.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Aug 2021 09:16:52 -0700 (PDT) Date: Tue, 24 Aug 2021 12:16:51 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v4 24/25] p5310: extract full and partial bitmap tests Message-ID: <6b03016c9937218071f1819dbbca988615b3b6a0.1629821743.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new p5326 introduced by the next patch will want these same tests, interjecting its own setup in between. Move them out so that both perf tests can reuse them. Signed-off-by: Taylor Blau --- t/perf/lib-bitmap.sh | 69 ++++++++++++++++++++++++++++++++++++ t/perf/p5310-pack-bitmaps.sh | 65 ++------------------------------- 2 files changed, 72 insertions(+), 62 deletions(-) create mode 100644 t/perf/lib-bitmap.sh diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh new file mode 100644 index 0000000000..63d3bc7cec --- /dev/null +++ b/t/perf/lib-bitmap.sh @@ -0,0 +1,69 @@ +# Helper functions for testing bitmap performance; see p5310. 
+ +test_full_bitmap () { + test_perf 'simulated clone' ' + git pack-objects --stdout --all </dev/null >/dev/null + ' + + test_perf 'simulated fetch' ' + have=$(git rev-list HEAD~100 -1) && + { + echo HEAD && + echo ^$have + } | git pack-objects --revs --stdout >/dev/null + ' + + test_perf 'pack to file (bitmap)' ' + git pack-objects --use-bitmap-index --all pack1b </dev/null >/dev/null + ' + + test_perf 'rev-list (commits)' ' + git rev-list --all --use-bitmap-index >/dev/null + ' + + test_perf 'rev-list (objects)' ' + git rev-list --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with tag negated via --not --all (objects)' ' + git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with negative tag (objects)' ' + git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list count with blob:none' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:none >/dev/null + ' + + test_perf 'rev-list count with blob:limit=1k' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:limit=1k >/dev/null + ' + + test_perf 'rev-list count with tree:0' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' + + test_perf 'simulated partial clone' ' + git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null + ' +} + +test_partial_bitmap () { + test_perf 'clone (partial bitmap)' ' + git pack-objects --stdout --all </dev/null >/dev/null + ' + + test_perf 'pack to file (partial bitmap)' ' + git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null + ' + + test_perf 'rev-list with tree filter (partial bitmap)' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' +} diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 452be01056..7ad4f237bc 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -2,6 +2,7 @@ test_description='Tests 
pack performance using bitmaps' . ./perf-lib.sh +. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" test_perf_large_repo @@ -25,56 +26,7 @@ test_perf 'repack to disk' ' git repack -ad ' -test_perf 'simulated clone' ' - git pack-objects --stdout --all </dev/null >/dev/null -' - -test_perf 'simulated fetch' ' - have=$(git rev-list HEAD~100 -1) && - { - echo HEAD && - echo ^$have - } | git pack-objects --revs --stdout >/dev/null -' - -test_perf 'pack to file (bitmap)' ' - git pack-objects --use-bitmap-index --all pack1b </dev/null >/dev/null -' - -test_perf 'rev-list (commits)' ' - git rev-list --all --use-bitmap-index >/dev/null -' - -test_perf 'rev-list (objects)' ' - git rev-list --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with tag negated via --not --all (objects)' ' - git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with negative tag (objects)' ' - git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list count with blob:none' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:none >/dev/null -' - -test_perf 'rev-list count with blob:limit=1k' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:limit=1k >/dev/null -' - -test_perf 'rev-list count with tree:0' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' - -test_perf 'simulated partial clone' ' - git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null -' +test_full_bitmap test_expect_success 'create partial bitmap state' ' # pick a commit to represent the repo tip in the past @@ -97,17 +49,6 @@ test_expect_success 'create partial bitmap state' ' git update-ref HEAD $orig_tip ' -test_perf 'clone (partial bitmap)' ' - git pack-objects --stdout --all </dev/null >/dev/null -' - -test_perf 'pack to file (partial bitmap)' ' - git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null -' - -test_perf 'rev-list with tree filter (partial bitmap)' 
' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' +test_partial_bitmap test_done From patchwork Tue Aug 24 16:16:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12455543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 110F8C432BE for ; Tue, 24 Aug 2021 16:17:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8001610A3 for ; Tue, 24 Aug 2021 16:17:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230330AbhHXQSE (ORCPT ); Tue, 24 Aug 2021 12:18:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231961AbhHXQRj (ORCPT ); Tue, 24 Aug 2021 12:17:39 -0400 Received: from mail-il1-x135.google.com (mail-il1-x135.google.com [IPv6:2607:f8b0:4864:20::135]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5344CC0617AD for ; Tue, 24 Aug 2021 09:16:55 -0700 (PDT) Received: by mail-il1-x135.google.com with SMTP id y3so21092241ilm.6 for ; Tue, 24 Aug 2021 09:16:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=4z4LT0lv5k9umGSR/ooyeHTuJXpCfSVmHmMOmakBiFE=; 
b=d3lTNk1c6PoOWhN9PgMnlZkU6pEJvSsGW+ZDfd7u98dAWkjBpz7IgUoMwmXnDY3XIN dSTI223jkxvxBbwSRnZzhDIYyEa5Ay5lQO1sQCUaPwrWdcbo8638sEMTAwKl/zbXQkKh vDGLy9Q1JHUgsVYWOpwnf9I8I6yTfqPCiW2+aNSN7fB1lXIA2ZMRepEsP3r1T3F9xGNb 6xTLoosTlYmtTUd0Hg8gKmJhOH7IaKcH+yk7/6XPJ3xVMW8XKj3cKODfC0LNOte5saOt Hjk9+US6aV1mRvIj1hlHMnZgTTLbhN0ql86CrN79wS+2sBgzFfP9LBxdPkhm0S2PhkCS Vxzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=4z4LT0lv5k9umGSR/ooyeHTuJXpCfSVmHmMOmakBiFE=; b=WDFNl34mJHquIRdbQPKsUunpY8CggDkScyraOloz1BYaK+v/1+dWUalQ4gXhNbye5E Jae/DRkbAtEQSOo6Iotw/v3wgh5AiBzdjP0mFSniWLpZz4nE9Irmrx1WwUvgvS7ZGrC7 YZJLJXf/iUjrWpa08J4NsmrlzHmAzbK1HVpBcFY6j1NAEdb2a3PADJwXfNhiTdiMynaa s1ikqrBb/XhHOX3yVgVjgHfl1+uWErG7IAUw0PZk7ZQ/a2Ly3o+B0rKL4Xx7jMpu6klB b25LVPRHOpWOJTSKqz+OhNRa4Ql7a5HUramkrS53lEE2PRMcOF5DaaB57QW9BCZvhPxC B5ZA== X-Gm-Message-State: AOAM530GE9YJQB4Lv0EBM0XmHAJdLfPV2JkDTr/gfM1Yi8JqDh+3PyvL RRYxZY77dXRC4B7nmqRpzEhoHlrHFuJg6WTj X-Google-Smtp-Source: ABdhPJx4wbylCrMz0yvEh0yaiNoksXVEuBl+6Cbv58tdRiF9fcDoRsEhu0o4M1U5+OGBqL2yjRU05Q== X-Received: by 2002:a92:c609:: with SMTP id p9mr10393489ilm.135.1629821814604; Tue, 24 Aug 2021 09:16:54 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id r18sm10171723ilo.38.2021.08.24.09.16.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Aug 2021 09:16:54 -0700 (PDT) Date: Tue, 24 Aug 2021 12:16:53 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v4 25/25] p5326: perf tests for MIDX bitmaps Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org These new performance tests demonstrate effectively the same behavior as p5310, but use a multi-pack bitmap instead of a single-pack one. Notably, p5326 does not create a MIDX bitmap with multiple packs. This is so we can measure a direct comparison between it and p5310. Any difference between the two is measuring just the overhead of using MIDX bitmaps. Here are the results of p5310 and p5326 together, measured at the same time and on the same machine (using a Xeon W-2255 CPU): Test HEAD ------------------------------------------------------------------------ 5310.2: repack to disk 96.78(93.39+11.33) 5310.3: simulated clone 9.98(9.79+0.19) 5310.4: simulated fetch 1.75(4.26+0.19) 5310.5: pack to file (bitmap) 28.20(27.87+8.70) 5310.6: rev-list (commits) 0.41(0.36+0.05) 5310.7: rev-list (objects) 1.61(1.54+0.07) 5310.8: rev-list count with blob:none 0.25(0.21+0.04) 5310.9: rev-list count with blob:limit=1k 2.65(2.54+0.10) 5310.10: rev-list count with tree:0 0.23(0.19+0.04) 5310.11: simulated partial clone 4.34(4.21+0.12) 5310.13: clone (partial bitmap) 11.05(12.21+0.48) 5310.14: pack to file (partial bitmap) 31.25(34.22+3.70) 5310.15: rev-list with tree filter (partial bitmap) 0.26(0.22+0.04) versus the same tests (this time using a multi-pack index): Test HEAD ------------------------------------------------------------------------ 5326.2: setup multi-pack index 78.99(75.29+11.58) 5326.3: simulated clone 
11.78(11.56+0.22) 5326.4: simulated fetch 1.70(4.49+0.13) 5326.5: pack to file (bitmap) 28.02(27.72+8.76) 5326.6: rev-list (commits) 0.42(0.36+0.06) 5326.7: rev-list (objects) 1.65(1.58+0.06) 5326.8: rev-list count with blob:none 0.26(0.21+0.05) 5326.9: rev-list count with blob:limit=1k 2.97(2.86+0.10) 5326.10: rev-list count with tree:0 0.25(0.20+0.04) 5326.11: simulated partial clone 5.65(5.49+0.16) 5326.13: clone (partial bitmap) 12.22(13.43+0.38) 5326.14: pack to file (partial bitmap) 30.05(31.57+7.25) 5326.15: rev-list with tree filter (partial bitmap) 0.24(0.20+0.04) There is slight overhead in "simulated clone", "simulated partial clone", and "clone (partial bitmap)". Unsurprisingly, that overhead is due to using the MIDX's reverse index to map between bit positions and MIDX positions. This can be reproduced by running "git repack -adb" along with "git multi-pack-index write --bitmap" in a large-ish repository. Then run: $ perf record -o pack.perf git -c core.multiPackIndex=false \ pack-objects --all --stdout >/dev/null /dev/null --- t/perf/p5326-multi-pack-bitmaps.sh | 43 ++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100755 t/perf/p5326-multi-pack-bitmaps.sh diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh new file mode 100755 index 0000000000..5845109ac7 --- /dev/null +++ b/t/perf/p5326-multi-pack-bitmaps.sh @@ -0,0 +1,43 @@ +#!/bin/sh + +test_description='Tests performance using midx bitmaps' +. ./perf-lib.sh +. 
"${TEST_DIRECTORY}/perf/lib-bitmap.sh" + +test_perf_large_repo + +test_expect_success 'enable multi-pack index' ' + git config core.multiPackIndex true +' + +test_perf 'setup multi-pack index' ' + git repack -ad && + git multi-pack-index write --bitmap +' + +test_full_bitmap + +test_expect_success 'create partial bitmap state' ' + # pick a commit to represent the repo tip in the past + cutoff=$(git rev-list HEAD~100 -1) && + orig_tip=$(git rev-parse HEAD) && + + # now pretend we have just one tip + rm -rf .git/logs .git/refs/* .git/packed-refs && + git update-ref HEAD $cutoff && + + # and then repack, which will leave us with a nice + # big bitmap pack of the "old" history, and all of + # the new history will be loose, as if it had been pushed + # up incrementally and exploded via unpack-objects + git repack -Ad && + git multi-pack-index write --bitmap && + + # and now restore our original tip, as if the pushes + # had happened + git update-ref HEAD $orig_tip +' + +test_partial_bitmap + +test_done