From patchwork Mon Jun 21 22:24:59 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335791
Date: Mon, 21 Jun 2021 18:24:59 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 01/24] pack-bitmap.c: harden 'test_bitmap_walk()' to check type bitmaps

The special `--test-bitmap` mode of `git rev-list` is used to compare the result of an object traversal with a bitmap to check its integrity.
This mode does not, however, assert that the types of reachable objects are stored correctly. Harden this mode by teaching it to also check that each time an object's bit is marked, the corresponding bit should be set in exactly one of the type bitmaps (whose type matches the object's true type). Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- pack-bitmap.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index d90e1d9d8c..368fa59a42 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1301,10 +1301,52 @@ void count_bitmap_commit_list(struct bitmap_index *bitmap_git, struct bitmap_test_data { struct bitmap_index *bitmap_git; struct bitmap *base; + struct bitmap *commits; + struct bitmap *trees; + struct bitmap *blobs; + struct bitmap *tags; struct progress *prg; size_t seen; }; +static void test_bitmap_type(struct bitmap_test_data *tdata, + struct object *obj, int pos) +{ + enum object_type bitmap_type = OBJ_NONE; + int bitmaps_nr = 0; + + if (bitmap_get(tdata->commits, pos)) { + bitmap_type = OBJ_COMMIT; + bitmaps_nr++; + } + if (bitmap_get(tdata->trees, pos)) { + bitmap_type = OBJ_TREE; + bitmaps_nr++; + } + if (bitmap_get(tdata->blobs, pos)) { + bitmap_type = OBJ_BLOB; + bitmaps_nr++; + } + if (bitmap_get(tdata->tags, pos)) { + bitmap_type = OBJ_TAG; + bitmaps_nr++; + } + + if (!bitmap_type) + die("object %s not found in type bitmaps", + oid_to_hex(&obj->oid)); + + if (bitmaps_nr > 1) + die("object %s does not have a unique type", + oid_to_hex(&obj->oid)); + + if (bitmap_type != obj->type) + die("object %s: real type %s, expected: %s", + oid_to_hex(&obj->oid), + type_name(obj->type), + type_name(bitmap_type)); +} + static void test_show_object(struct object *object, const char *name, void *data) { @@ -1314,6 +1356,7 @@ static void test_show_object(struct object *object, const char *name, bitmap_pos = bitmap_position(tdata->bitmap_git, &object->oid); 
if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&object->oid)); + test_bitmap_type(tdata, object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1328,6 +1371,7 @@ static void test_show_commit(struct commit *commit, void *data) &commit->object.oid); if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&commit->object.oid)); + test_bitmap_type(tdata, &commit->object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1375,6 +1419,10 @@ void test_bitmap_walk(struct rev_info *revs) tdata.bitmap_git = bitmap_git; tdata.base = bitmap_new(); + tdata.commits = ewah_to_bitmap(bitmap_git->commits); + tdata.trees = ewah_to_bitmap(bitmap_git->trees); + tdata.blobs = ewah_to_bitmap(bitmap_git->blobs); + tdata.tags = ewah_to_bitmap(bitmap_git->tags); tdata.prg = start_progress("Verifying bitmap entries", result_popcnt); tdata.seen = 0;

From patchwork Mon Jun 21 22:25:01 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335793
Date: Mon, 21 Jun 2021 18:25:01 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 02/24] pack-bitmap-write.c: gracefully fail to write non-closed bitmaps
Message-ID: <3e637d9ec83435540ad32b8325b0dce87f61bae0.1624314293.git.me@ttaylorr.com>

The set of objects covered by a bitmap must be closed under reachability, since it must be the case that there is a valid bit position assigned for every possible reachable object (otherwise the bitmaps would be incomplete). Pack bitmaps are never written from 'git repack' unless repacking all-into-one, and so we never write non-closed bitmaps (except in the case of partial clones where we aren't guaranteed to have all objects). But multi-pack bitmaps change this, since it isn't known whether the set of objects in the MIDX is closed under reachability until walking them. Plumb through a bit that is set when a reachable object isn't found. As soon as a reachable object isn't found in the set of objects to include in the bitmap, bitmap_writer_build() knows that the set is not closed, and so it now fails gracefully. A test is added in t0410 to trigger a bitmap write without full reachability closure by removing local copies of some reachable objects from a promisor remote.
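As an illustration only (this is not code from the patch), the "plumb through a bit" idea can be sketched in isolation. The sketch assumes a toy pack of at most 64 integer object ids; `struct toy_pack`, `toy_find_pos()` and `toy_fill_bitmap()` are hypothetical names that stand in for the packing data, find_object_pos() and the fill_bitmap_*() helpers. Since position 0 is a valid result, a miss is reported through an out-parameter, and callers turn it into a -1 return instead of dying.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the set of objects being packed (at most 64). */
struct toy_pack {
	const int *ids;	/* object identifiers, in pack order */
	size_t nr;
};

/*
 * Return the position of `id` in the pack. Because 0 is a valid
 * position, a miss is reported through `found`, not the return value.
 */
static uint32_t toy_find_pos(const struct toy_pack *pack, int id, int *found)
{
	size_t i;

	for (i = 0; i < pack->nr; i++) {
		if (pack->ids[i] == id) {
			if (found)
				*found = 1;
			return (uint32_t)i;
		}
	}
	if (found)
		*found = 0;
	fprintf(stderr, "warning: object %d is missing\n", id);
	return 0;
}

/*
 * Set one bit per reachable object. Returns 0 if every object was
 * found (the set is closed), or -1 as soon as one is missing.
 */
static int toy_fill_bitmap(const struct toy_pack *pack,
			   const int *reachable, size_t reachable_nr,
			   uint64_t *bitmap)
{
	size_t i;

	for (i = 0; i < reachable_nr; i++) {
		int found;
		uint32_t pos = toy_find_pos(pack, reachable[i], &found);

		if (!found)
			return -1;	/* not closed: let the caller decide */
		*bitmap |= UINT64_C(1) << pos;
	}
	return 0;
}
```

A caller would treat a negative return the way write_pack_file() treats a failed bitmap_writer_build(): report the failure and skip writing the bitmap.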
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 3 +- pack-bitmap-write.c | 76 ++++++++++++++++++++++++++++------------ pack-bitmap.h | 2 +- t/t0410-partial-clone.sh | 9 ++++- 4 files changed, 64 insertions(+), 26 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index de00adbb9e..8a523624a1 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1256,7 +1256,8 @@ static void write_pack_file(void) bitmap_writer_show_progress(progress); bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); - bitmap_writer_build(&to_pack); + if (bitmap_writer_build(&to_pack) < 0) + die(_("failed to write bitmap index")); bitmap_writer_finish(written_list, nr_written, tmpname.buf, write_bitmap_options); write_bitmap_index = 0; diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 88d9e696a5..d374f7884b 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -125,15 +125,20 @@ static inline void push_bitmapped_commit(struct commit *commit) writer.selected_nr++; } -static uint32_t find_object_pos(const struct object_id *oid) +static uint32_t find_object_pos(const struct object_id *oid, int *found) { struct object_entry *entry = packlist_find(writer.to_pack, oid); if (!entry) { - die("Failed to write bitmap index. Packfile doesn't have full closure " + if (found) + *found = 0; + warning("Failed to write bitmap index. 
Packfile doesn't have full closure " "(object %s is missing)", oid_to_hex(oid)); + return 0; } + if (found) + *found = 1; return oe_in_pack_pos(writer.to_pack, entry); } @@ -331,9 +336,10 @@ static void bitmap_builder_clear(struct bitmap_builder *bb) bb->commits_nr = bb->commits_alloc = 0; } -static void fill_bitmap_tree(struct bitmap *bitmap, - struct tree *tree) +static int fill_bitmap_tree(struct bitmap *bitmap, + struct tree *tree) { + int found; uint32_t pos; struct tree_desc desc; struct name_entry entry; @@ -342,9 +348,11 @@ static void fill_bitmap_tree(struct bitmap *bitmap, * If our bit is already set, then there is nothing to do. Both this * tree and all of its children will be set. */ - pos = find_object_pos(&tree->object.oid); + pos = find_object_pos(&tree->object.oid, &found); + if (!found) + return -1; if (bitmap_get(bitmap, pos)) - return; + return 0; bitmap_set(bitmap, pos); if (parse_tree(tree) < 0) @@ -355,11 +363,15 @@ static void fill_bitmap_tree(struct bitmap *bitmap, while (tree_entry(&desc, &entry)) { switch (object_type(entry.mode)) { case OBJ_TREE: - fill_bitmap_tree(bitmap, - lookup_tree(the_repository, &entry.oid)); + if (fill_bitmap_tree(bitmap, + lookup_tree(the_repository, &entry.oid)) < 0) + return -1; break; case OBJ_BLOB: - bitmap_set(bitmap, find_object_pos(&entry.oid)); + pos = find_object_pos(&entry.oid, &found); + if (!found) + return -1; + bitmap_set(bitmap, pos); break; default: /* Gitlink, etc; not reachable */ @@ -368,15 +380,18 @@ static void fill_bitmap_tree(struct bitmap *bitmap, } free_tree_buffer(tree); + return 0; } -static void fill_bitmap_commit(struct bb_commit *ent, - struct commit *commit, - struct prio_queue *queue, - struct prio_queue *tree_queue, - struct bitmap_index *old_bitmap, - const uint32_t *mapping) +static int fill_bitmap_commit(struct bb_commit *ent, + struct commit *commit, + struct prio_queue *queue, + struct prio_queue *tree_queue, + struct bitmap_index *old_bitmap, + const uint32_t *mapping) { + 
int found; + uint32_t pos; if (!ent->bitmap) ent->bitmap = bitmap_new(); @@ -401,11 +416,16 @@ static void fill_bitmap_commit(struct bb_commit *ent, * Mark ourselves and queue our tree. The commit * walk ensures we cover all parents. */ - bitmap_set(ent->bitmap, find_object_pos(&c->object.oid)); + pos = find_object_pos(&c->object.oid, &found); + if (!found) + return -1; + bitmap_set(ent->bitmap, pos); prio_queue_put(tree_queue, get_commit_tree(c)); for (p = c->parents; p; p = p->next) { - int pos = find_object_pos(&p->item->object.oid); + pos = find_object_pos(&p->item->object.oid, &found); + if (!found) + return -1; if (!bitmap_get(ent->bitmap, pos)) { bitmap_set(ent->bitmap, pos); prio_queue_put(queue, p->item); @@ -413,8 +433,12 @@ static void fill_bitmap_commit(struct bb_commit *ent, } } - while (tree_queue->nr) - fill_bitmap_tree(ent->bitmap, prio_queue_get(tree_queue)); + while (tree_queue->nr) { + if (fill_bitmap_tree(ent->bitmap, + prio_queue_get(tree_queue)) < 0) + return -1; + } + return 0; } static void store_selected(struct bb_commit *ent, struct commit *commit) @@ -432,7 +456,7 @@ static void store_selected(struct bb_commit *ent, struct commit *commit) kh_value(writer.bitmaps, hash_pos) = stored; } -void bitmap_writer_build(struct packing_data *to_pack) +int bitmap_writer_build(struct packing_data *to_pack) { struct bitmap_builder bb; size_t i; @@ -441,6 +465,7 @@ void bitmap_writer_build(struct packing_data *to_pack) struct prio_queue tree_queue = { NULL }; struct bitmap_index *old_bitmap; uint32_t *mapping; + int closed = 1; /* until proven otherwise */ writer.bitmaps = kh_init_oid_map(); writer.to_pack = to_pack; @@ -463,8 +488,11 @@ void bitmap_writer_build(struct packing_data *to_pack) struct commit *child; int reused = 0; - fill_bitmap_commit(ent, commit, &queue, &tree_queue, - old_bitmap, mapping); + if (fill_bitmap_commit(ent, commit, &queue, &tree_queue, + old_bitmap, mapping) < 0) { + closed = 0; + break; + } if (ent->selected) { 
store_selected(ent, commit); @@ -499,7 +527,9 @@ void bitmap_writer_build(struct packing_data *to_pack) stop_progress(&writer.progress); - compute_xor_offsets(); + if (closed) + compute_xor_offsets(); + return closed ? 0 : -1; } /** diff --git a/pack-bitmap.h b/pack-bitmap.h index 99d733eb26..020cd8d868 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -87,7 +87,7 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git, struct commit *commit); void bitmap_writer_select_commits(struct commit **indexed_commits, unsigned int indexed_commits_nr, int max_bitmaps); -void bitmap_writer_build(struct packing_data *to_pack); +int bitmap_writer_build(struct packing_data *to_pack); void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, const char *filename, diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 584a039b85..1667450917 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -536,7 +536,13 @@ test_expect_success 'gc does not repack promisor objects if there are none' ' repack_and_check () { rm -rf repo2 && cp -r repo repo2 && - git -C repo2 repack $1 -d && + if test x"$1" = "x--must-fail" + then + shift + test_must_fail git -C repo2 repack $1 -d + else + git -C repo2 repack $1 -d + fi && git -C repo2 fsck && git -C repo2 cat-file -e $2 && @@ -561,6 +567,7 @@ test_expect_success 'repack -d does not irreversibly delete promisor objects' ' printf "$THREE\n" | pack_as_from_promisor && delete_object repo "$ONE" && + repack_and_check --must-fail -ab "$TWO" "$THREE" && repack_and_check -a "$TWO" "$THREE" && repack_and_check -A "$TWO" "$THREE" && repack_and_check -l "$TWO" "$THREE"

From patchwork Mon Jun 21 22:25:04 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335795
Date: Mon, 21 Jun 2021 18:25:04 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 03/24] pack-bitmap-write.c: free existing bitmaps
Message-ID: <490d733d121e206ccdc335812c03f31c380bcd86.1624314293.git.me@ttaylorr.com>

When writing a new bitmap, the bitmap writer code attempts to read the existing bitmap (if one is present). This is done in order to quickly permute the bits of any bitmaps for commits which appear in the existing bitmap, and were also selected for the new bitmap. But since this code was added in 341fa34887 (pack-bitmap-write: use existing bitmaps, 2020-12-08), the resources associated with opening an existing bitmap were never released. It's fine to ignore this, but it's bad hygiene.
It will also cause a problem for the multi-pack-index builtin, which will be responsible not only for writing bitmaps, but also for expiring any old multi-pack bitmaps. If an existing bitmap was reused here, it will also be expired. That will cause a problem on platforms which require file resources to be closed before unlinking them, like Windows. Avoid this by ensuring we close reused bitmaps with free_bitmap_index() before removing them. Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 1 + 1 file changed, 1 insertion(+) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index d374f7884b..142fd0adb8 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -520,6 +520,7 @@ int bitmap_writer_build(struct packing_data *to_pack) clear_prio_queue(&queue); clear_prio_queue(&tree_queue); bitmap_builder_clear(&bb); + free_bitmap_index(old_bitmap); free(mapping); trace2_region_leave("pack-bitmap-write", "building_bitmaps_total",

From patchwork Mon Jun 21 22:25:07 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335797
Date: Mon, 21 Jun 2021 18:25:07 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 04/24] Documentation: build 'technical/bitmap-format' by default

Even though the 'TECH_DOCS' variable was introduced all the way back in 5e00439f0a (Documentation: build html for all files in technical and howto, 2012-10-23), the 'bitmap-format' document was never added to that list when it was created. Prepare for changes to this file by including it in the list of technical documentation that 'make doc' will build by default.
Signed-off-by: Taylor Blau --- Documentation/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/Makefile b/Documentation/Makefile index f5605b7767..7d7b778b28 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -90,6 +90,7 @@ SP_ARTICLES += $(API_DOCS) TECH_DOCS += MyFirstContribution TECH_DOCS += MyFirstObjectWalk TECH_DOCS += SubmittingPatches +TECH_DOCS += technical/bitmap-format TECH_DOCS += technical/hash-function-transition TECH_DOCS += technical/http-protocol TECH_DOCS += technical/index-format

From patchwork Mon Jun 21 22:25:10 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335799
Date: Mon, 21 Jun 2021 18:25:10 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 05/24] Documentation: describe MIDX-based bitmaps
Message-ID: <64a260e0c6a116b7c6fa6fea2b9fd96bf416cb18.1624314293.git.me@ttaylorr.com>

Update the technical documentation to describe the multi-pack bitmap format. This patch merely introduces the new format, and describes its high-level ideas. Git does not yet know how to read nor write these multi-pack variants, and so the subsequent patches will: - Introduce code to interpret multi-pack bitmaps, according to this document. - Then, introduce code to write multi-pack bitmaps from the 'git multi-pack-index write' sub-command. Finally, the implementation will gain tests in subsequent patches (as opposed to inline with the patch teaching Git how to write multi-pack bitmaps) to avoid a cyclic dependency. Signed-off-by: Taylor Blau --- Documentation/technical/bitmap-format.txt | 72 ++++++++++++++++---- Documentation/technical/multi-pack-index.txt | 10 +-- 2 files changed, 61 insertions(+), 21 deletions(-) diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt index f8c18a0f7a..25221c7ec8 100644 --- a/Documentation/technical/bitmap-format.txt +++ b/Documentation/technical/bitmap-format.txt @@ -1,6 +1,45 @@ GIT bitmap v1 format ==================== +== Pack and multi-pack bitmaps + +Bitmaps store reachability information about the set of objects in a packfile, +or a multi-pack index (MIDX). The former is defined obviously, and the latter is +defined as the union of objects in packs contained in the MIDX. + +A bitmap may belong to either one pack, or the repository's multi-pack index (if +it exists). A repository may have at most one bitmap.
+ +An object is uniquely described by its bit position within a bitmap: + + - If the bitmap belongs to a packfile, the __n__th bit corresponds to + the __n__th object in pack order. For a function `offset` which maps + objects to their byte offset within a pack, pack order is defined as + follows: + + o1 <= o2 <==> offset(o1) <= offset(o2) + + - If the bitmap belongs to a MIDX, the __n__th bit corresponds to the + __n__th object in MIDX order. With an additional function `pack` which + maps objects to the pack they were selected from by the MIDX, MIDX order + is defined as follows: + + o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2) + + The ordering between packs is done lexicographically by the pack name, + with the exception of the preferred pack, which sorts ahead of all other + packs. + +The on-disk representation (described below) of a bitmap is the same regardless +of whether or not that bitmap belongs to a packfile or a MIDX. The only +difference is the interpretation of the bits, which is described above. + +Certain bitmap extensions are supported (see: Appendix B). No extensions are +required for bitmaps corresponding to packfiles. For bitmaps that correspond to +MIDXs, both the bit-cache and rev-cache extensions are required. + +== On-disk format + - A header appears at the beginning: 4-byte signature: {'B', 'I', 'T', 'M'} @@ -14,17 +53,19 @@ GIT bitmap v1 format The following flags are supported: - BITMAP_OPT_FULL_DAG (0x1) REQUIRED - This flag must always be present. It implies that the bitmap - index has been generated for a packfile with full closure - (i.e. where every single object in the packfile can find - its parent links inside the same packfile). This is a - requirement for the bitmap index format, also present in JGit, - that greatly reduces the complexity of the implementation. + This flag must always be present. 
It implies that the + bitmap index has been generated for a packfile or + multi-pack index (MIDX) with full closure (i.e. where + every single object in the packfile/MIDX can find its + parent links inside the same packfile/MIDX). This is a + requirement for the bitmap index format, also present in + JGit, that greatly reduces the complexity of the + implementation. - BITMAP_OPT_HASH_CACHE (0x4) If present, the end of the bitmap file contains `N` 32-bit name-hash values, one per object in the - pack. The format and meaning of the name-hash is + pack/MIDX. The format and meaning of the name-hash is described below. 4-byte entry count (network byte order) @@ -33,7 +74,8 @@ GIT bitmap v1 format 20-byte checksum - The SHA1 checksum of the pack this bitmap index belongs to. + The SHA1 checksum of the pack/MIDX this bitmap index + belongs to. - 4 EWAH bitmaps that act as type indexes @@ -50,7 +92,7 @@ GIT bitmap v1 format - Tags In each bitmap, the `n`th bit is set to true if the `n`th object - in the packfile is of that type. + in the packfile or multi-pack index is of that type. The obvious consequence is that the OR of all 4 bitmaps will result in a full set (all bits set), and the AND of all 4 bitmaps will @@ -62,8 +104,9 @@ GIT bitmap v1 format Each entry contains the following: - 4-byte object position (network byte order) - The position **in the index for the packfile** where the - bitmap for this commit is found. + The position **in the index for the packfile or + multi-pack index** where the bitmap for this commit is + found. - 1-byte XOR-offset The xor offset used to compress this bitmap. For an entry @@ -146,10 +189,11 @@ Name-hash cache --------------- If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains -a cache of 32-bit values, one per object in the pack. The value at +a cache of 32-bit values, one per object in the pack/MIDX. 
The value at position `i` is the hash of the pathname at which the `i`th object -(counting in index order) in the pack can be found. This can be fed -into the delta heuristics to compare objects with similar pathnames. +(counting in index or multi-pack index order) in the pack/MIDX can be found. +This can be fed into the delta heuristics to compare objects with similar +pathnames. The hash algorithm used is: diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index fb688976c4..1a73c3ee20 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -71,14 +71,10 @@ Future Work still reducing the number of binary searches required for object lookups. -- The reachability bitmap is currently paired directly with a single - packfile, using the pack-order as the object order to hopefully - compress the bitmaps well using run-length encoding. This could be - extended to pair a reachability bitmap with a multi-pack-index. If - the multi-pack-index is extended to store a "stable object order" +- If the multi-pack-index is extended to store a "stable object order" (a function Order(hash) = integer that is constant for a given hash, - even as the multi-pack-index is updated) then a reachability bitmap - could point to a multi-pack-index and be updated independently. + even as the multi-pack-index is updated) then MIDX bitmaps could be + updated independently of the MIDX. - Packfiles can be marked as "special" using empty files that share the initial name but replace ".pack" with ".keep" or ".promisor". 

From patchwork Mon Jun 21 22:25:12 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335801
Date: Mon, 21 Jun 2021 18:25:12 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 06/24] midx: make a number of functions non-static

These functions will be called from outside of midx.c in a subsequent
patch.

Signed-off-by: Taylor Blau
---
 midx.c | 4 ++--
 midx.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index 21d6a05e88..fa23d57a24 100644
--- a/midx.c
+++ b/midx.c
@@ -48,12 +48,12 @@ static uint8_t oid_version(void)
 	}
 }

-static const unsigned char *get_midx_checksum(struct multi_pack_index *m)
+const unsigned char *get_midx_checksum(struct multi_pack_index *m)
 {
 	return m->data + m->data_len - the_hash_algo->rawsz;
 }

-static char *get_midx_filename(const char *object_dir)
+char *get_midx_filename(const char *object_dir)
 {
 	return xstrfmt("%s/pack/multi-pack-index", object_dir);
 }
diff --git a/midx.h b/midx.h
index 8684cf0fef..1172df1a71 100644
--- a/midx.h
+++ b/midx.h
@@ -42,6 +42,8 @@ struct multi_pack_index {
 #define MIDX_PROGRESS (1 << 0)
 #define MIDX_WRITE_REV_INDEX (1 << 1)

+const unsigned char *get_midx_checksum(struct multi_pack_index *m);
+char *get_midx_filename(const char *object_dir);
 char *get_midx_rev_filename(struct multi_pack_index *m);
 struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);


From patchwork Mon Jun 21 22:25:15 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335803
Date: Mon, 21 Jun 2021 18:25:15 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 07/24] midx: clear auxiliary .rev after replacing the MIDX
Message-ID: <1448ca0d2ba265db2dce414a7f7d6b1f4bcb5a08.1624314293.git.me@ttaylorr.com>

When writing a new multi-pack index, write_midx_internal() attempts to
clean up any auxiliary files (currently just the MIDX's `.rev` file, but
soon to include a `.bitmap`, too) corresponding to the MIDX it's
replacing.

This step should happen after the new MIDX is written into place, since
doing so beforehand means that the old MIDX could be read without its
corresponding .rev file.
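
The ordering argument above can be sketched in isolation. The following is a
simplified illustration, not git's implementation: `replace_then_cleanup()`
and its parameters are hypothetical, and it uses only the standard C
`rename()` and `remove()` calls in place of git's lockfile machinery and
clear_midx_files_ext(). The point it shows is that the stale auxiliary file
is deleted only after the replacement has been moved into place, so a reader
never sees the old file without its companion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of git): publish `tmp_path` as
 * `final_path`, and only then delete `stale_aux`, the auxiliary file
 * that belonged to the file being replaced.
 */
static int replace_then_cleanup(const char *tmp_path, const char *final_path,
				const char *stale_aux)
{
	assert(tmp_path && final_path);

	if (rename(tmp_path, final_path) != 0)
		return -1; /* old file and its auxiliary data are both untouched */

	if (stale_aux)
		remove(stale_aux); /* best effort: nothing refers to it any more */
	return 0;
}
```

If the two steps were swapped and the rename failed (or a reader raced with
the writer), the old file would remain in place with its auxiliary data
already gone, which is the situation the patch is avoiding.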

Signed-off-by: Taylor Blau
---
 midx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index fa23d57a24..40eb7974ba 100644
--- a/midx.c
+++ b/midx.c
@@ -1076,10 +1076,11 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *

 	if (flags & MIDX_WRITE_REV_INDEX)
 		write_midx_reverse_index(midx_name, midx_hash, &ctx);
-	clear_midx_files_ext(the_repository, ".rev", midx_hash);

 	commit_lock_file(&lk);

+	clear_midx_files_ext(the_repository, ".rev", midx_hash);
+
 cleanup:
 	for (i = 0; i < ctx.nr; i++) {
 		if (ctx.info[i].p) {

From patchwork Mon Jun 21 22:25:18 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335805
Date: Mon, 21 Jun 2021 18:25:18 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 08/24] midx: respect 'core.multiPackIndex' when writing

When writing a new multi-pack index, write_midx_internal() attempts to
load any existing one to fill in some pieces of information. But it uses
load_multi_pack_index(), which ignores the configuration
"core.multiPackIndex", which indicates whether or not Git is allowed to
read an existing multi-pack-index.

Replace this with a routine that does respect that setting, to avoid
reading multi-pack-index files when told not to.

This avoids a problem that would arise in subsequent patches due to the
combination of 'git repack' reopening the object store in-process and
the multi-pack index code not checking whether a pack already exists in
the object store when calling add_pack_to_midx().

This would ultimately lead to a cycle being created along the
'packed_git' struct's '->next' pointer. That is obviously bad, but it
has hard-to-debug downstream effects like saying a bitmap can't be
loaded for a pack because one already exists (for the same pack).

Signed-off-by: Taylor Blau
---
 midx.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/midx.c b/midx.c
index 40eb7974ba..759007d5a8 100644
--- a/midx.c
+++ b/midx.c
@@ -908,8 +908,18 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *

 	if (m)
 		ctx.m = m;
-	else
-		ctx.m = load_multi_pack_index(object_dir, 1);
+	else {
+		struct multi_pack_index *cur;
+
+		prepare_multi_pack_index_one(the_repository, object_dir, 1);
+
+		ctx.m = NULL;
+		for (cur = the_repository->objects->multi_pack_index; cur;
+		     cur = cur->next) {
+			if (!strcmp(object_dir, cur->object_dir))
+				ctx.m = cur;
+		}
+	}

 	ctx.nr = 0;
 	ctx.alloc = ctx.m ? ctx.m->num_packs : 16;

From patchwork Mon Jun 21 22:25:21 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335807
Date: Mon, 21 Jun 2021 18:25:21 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 09/24] midx: infer preferred pack when not given one
Message-ID: <9495f6869d792264c4366c9914fcf93d544caa6a.1624314293.git.me@ttaylorr.com>

In 9218c6a40c (midx: allow marking a pack as preferred, 2021-03-30), the
multi-pack index code learned how to select a pack which all duplicate
objects are selected from. That is, if an object appears in multiple
packs, select the copy in the preferred pack before breaking ties
according to the other rules like pack mtime and readdir() order.

Not specifying a preferred pack can cause serious problems with
multi-pack reachability bitmaps, because these bitmaps rely on having at
least one pack from which all duplicates are selected. Not having such a
pack causes problems with the pack reuse code (e.g., assuming that a
base object was sent from that pack via reuse when in fact the base was
selected from a different pack).

So why does not marking a pack preferred cause problems here? The reason
is roughly as follows:

  - Ties are broken (when handling duplicate objects) by sorting
    according to midx_oid_compare(), which sorts objects by OID,
    preferred-ness, pack mtime, and finally pack ID (more on that
    later).

  - The pseudo pack-order (described in
    Documentation/technical/bitmap-format.txt) is computed by
    midx_pack_order(), and sorts by pack ID and pack offset, with
    preferred packs sorting first.

  - But! Pack IDs come from incrementing the pack count in
    add_pack_to_midx(), which is a callback passed to
    for_each_file_in_pack_dir(), meaning that pack IDs are assigned in
    readdir() order.

When specifying a preferred pack, all of that works fine, because
duplicate objects are correctly resolved in favor of the copy in the
preferred pack, and the preferred pack sorts first in the object order.

"Sorting first" is critical, because the bitmap code relies on finding
out which pack holds the first object in the MIDX's pseudo pack-order to
determine which pack is preferred.

But if we didn't specify a preferred pack, and the pack which comes
first in readdir() order does not also have the lowest timestamp, then
it's possible that that pack (the one that sorts first in pseudo-pack
order, which the bitmap code will treat as the preferred one) did *not*
have all duplicate objects resolved in its favor, resulting in breakage.
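
To make the tie-breaking rules above concrete, here is a minimal sketch in C
of resolving duplicate objects by sorting on (OID, preferred-ness, pack
mtime, pack ID) and keeping the first copy of each OID. It is an
illustration under stated assumptions, not the code in midx.c:
`struct dup_entry`, `dup_entry_cmp()`, `resolve_duplicates()` and `OID_LEN`
are hypothetical, and the direction of the mtime comparison (newer pack
first) is an assumption of the sketch, since the message above only lists
the sort keys.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define OID_LEN 20 /* assumed hash size for this sketch */

/* Hypothetical record: one copy of an object in one pack. */
struct dup_entry {
	unsigned char oid[OID_LEN];
	int preferred;     /* 1 if this copy lives in the preferred pack */
	time_t pack_mtime;
	uint32_t pack_id;  /* assigned in readdir() order */
};

/* Sort keys in order: OID, preferred-ness, pack mtime, pack ID. */
static int dup_entry_cmp(const void *va, const void *vb)
{
	const struct dup_entry *a = va, *b = vb;
	int cmp = memcmp(a->oid, b->oid, OID_LEN);

	if (cmp)
		return cmp;
	if (a->preferred != b->preferred)
		return a->preferred ? -1 : 1;
	if (a->pack_mtime != b->pack_mtime)
		return a->pack_mtime > b->pack_mtime ? -1 : 1; /* assumption: newer first */
	if (a->pack_id != b->pack_id)
		return a->pack_id < b->pack_id ? -1 : 1;
	return 0;
}

/* Sort, then keep only the first copy of each OID; returns the new count. */
static size_t resolve_duplicates(struct dup_entry *e, size_t nr)
{
	size_t i, out = 0;

	assert(e || !nr);
	qsort(e, nr, sizeof(*e), dup_entry_cmp);
	for (i = 0; i < nr; i++)
		if (!out || memcmp(e[out - 1].oid, e[i].oid, OID_LEN))
			e[out++] = e[i];
	return out;
}
```

When no entry has `preferred` set, the winner of each tie depends only on
mtime and pack ID, so nothing guarantees that the pack sorting first in
pseudo pack-order also won every duplicate, which is the mismatch described
above.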
The fix is simple: pick a (semi-arbitrary) preferred pack when none was specified. This forces that pack to have duplicates resolved in its favor, and (critically) to sort first in pseudo-pack order. Unfortunately, testing this behavior portably isn't possible, since it depends on readdir() order which isn't guaranteed by POSIX. Signed-off-by: Taylor Blau --- midx.c | 39 +++++++++++++++++++++++++++++++++------ 1 file changed, 33 insertions(+), 6 deletions(-) diff --git a/midx.c b/midx.c index 759007d5a8..752d36c57f 100644 --- a/midx.c +++ b/midx.c @@ -950,15 +950,46 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) goto cleanup; - ctx.preferred_pack_idx = -1; if (preferred_pack_name) { + int found = 0; for (i = 0; i < ctx.nr; i++) { if (!cmp_idx_or_pack_name(preferred_pack_name, ctx.info[i].pack_name)) { ctx.preferred_pack_idx = i; + found = 1; break; } } + + if (!found) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + } else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) { + time_t oldest = ctx.info[0].p->mtime; + ctx.preferred_pack_idx = 0; + + if (packs_to_drop && packs_to_drop->nr) + BUG("cannot write a MIDX bitmap during expiration"); + + /* + * set a preferred pack when writing a bitmap to ensure that + * the pack from which the first object is selected in pseudo + * pack-order has all of its objects selected from that pack + * (and not another pack containing a duplicate) + */ + for (i = 1; i < ctx.nr; i++) { + time_t mtime = ctx.info[i].p->mtime; + if (mtime < oldest) { + oldest = mtime; + ctx.preferred_pack_idx = i; + } + } + } else { + /* + * otherwise don't mark any pack as preferred to avoid + * interfering with expiration logic below + */ + ctx.preferred_pack_idx = -1; } ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, @@ -1029,11 +1060,7 @@ static int write_midx_internal(const char *object_dir, struct 
multi_pack_index * ctx.info, ctx.nr, sizeof(*ctx.info), idx_or_pack_name_cmp); - - if (!preferred) - warning(_("unknown preferred pack: '%s'"), - preferred_pack_name); - else { + if (preferred) { uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id]; if (perm == PACK_EXPIRED) warning(_("preferred pack '%s' is expired"),
From patchwork Mon Jun 21 22:25:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335809 Date: Mon, 21 Jun 2021 18:25:23 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 10/24] pack-bitmap.c: introduce 'bitmap_num_objects()' Message-ID: <373aa47528ce8bd4bb044a82a7e80312476f1b5f.1624314293.git.me@ttaylorr.com>
References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A subsequent patch to support reading MIDX bitmaps will be less noisy after extracting a generic function to return how many objects are contained in a bitmap. Signed-off-by: Taylor Blau --- pack-bitmap.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 368fa59a42..2dc135d34a 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -136,6 +136,11 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) return b; } +static uint32_t bitmap_num_objects(struct bitmap_index *index) +{ + return index->pack->num_objects; +} + static int load_bitmap_header(struct bitmap_index *index) { struct bitmap_disk_header *header = (void *)index->map; @@ -154,7 +159,7 @@ static int load_bitmap_header(struct bitmap_index *index) /* Parse known bitmap format options */ { uint32_t flags = ntohs(header->options); - size_t cache_size = st_mult(index->pack->num_objects, sizeof(uint32_t)); + size_t cache_size = st_mult(bitmap_num_objects(index), sizeof(uint32_t)); unsigned char *index_end = index->map + index->map_size - the_hash_algo->rawsz; if ((flags & BITMAP_OPT_FULL_DAG) == 0) @@ -399,7 +404,7 @@ static inline int bitmap_position_extended(struct bitmap_index *bitmap_git, if (pos < kh_end(positions)) { int bitmap_pos = kh_value(positions, pos); - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } return -1; @@ -451,7 +456,7 @@ static int ext_index_add_object(struct bitmap_index *bitmap_git, bitmap_pos = kh_value(eindex->positions, hash_pos); } - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } struct bitmap_show_data { @@ -650,7 +655,7 @@ static void show_extended_objects(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { 
struct object *obj; - if (!bitmap_get(objects, bitmap_git->pack->num_objects + i)) + if (!bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) continue; obj = eindex->objects[i]; @@ -808,7 +813,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, * individually. */ for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == type && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos)) @@ -835,7 +840,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, oi.sizep = &size; - if (pos < pack->num_objects) { + if (pos < bitmap_num_objects(bitmap_git)) { off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; @@ -845,7 +850,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, } } else { struct eindex *eindex = &bitmap_git->ext_index; - struct object *obj = eindex->objects[pos - pack->num_objects]; + struct object *obj = eindex->objects[pos - bitmap_num_objects(bitmap_git)]; if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0) die(_("unable to get size of %s"), oid_to_hex(&obj->oid)); } @@ -887,7 +892,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, } for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == OBJ_BLOB && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos) && @@ -1113,8 +1118,8 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, enum object_type type; unsigned long size; - if (pos >= bitmap_git->pack->num_objects) - return; /* not actually in the pack */ + if (pos >= bitmap_num_objects(bitmap_git)) + return; /* not actually in the pack or MIDX */ offset = header = pack_pos_to_offset(bitmap_git->pack, pos); type = 
unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); @@ -1180,6 +1185,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; + uint32_t objects_nr = bitmap_num_objects(bitmap_git); assert(result); @@ -1187,8 +1193,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, i++; /* Don't mark objects not in the packfile */ - if (i > bitmap_git->pack->num_objects / BITS_IN_EWORD) - i = bitmap_git->pack->num_objects / BITS_IN_EWORD; + if (i > objects_nr / BITS_IN_EWORD) + i = objects_nr / BITS_IN_EWORD; reuse = bitmap_word_alloc(i); memset(reuse->words, 0xFF, i * sizeof(eword_t)); @@ -1272,7 +1278,7 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { if (eindex->objects[i]->type == type && - bitmap_get(objects, bitmap_git->pack->num_objects + i)) + bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) count++; } @@ -1493,7 +1499,7 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; - num_objects = bitmap_git->pack->num_objects; + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); for (i = 0; i < num_objects; ++i) { @@ -1576,7 +1582,6 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; struct eindex *eindex = &bitmap_git->ext_index; off_t total = 0; struct object_info oi = OBJECT_INFO_INIT; @@ -1588,7 +1593,7 @@ static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) for (i = 0; i < eindex->count; i++) { struct object *obj = eindex->objects[i]; - if (!bitmap_get(result, pack->num_objects + i)) + if (!bitmap_get(result, bitmap_num_objects(bitmap_git) + i)) continue; if 
(oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0)
From patchwork Mon Jun 21 22:25:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335811 Date: Mon, 21 Jun 2021 18:25:26 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 11/24] pack-bitmap.c: introduce 'nth_bitmap_object_oid()' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org
A subsequent patch to support reading MIDX bitmaps will be less noisy after extracting a generic function to fetch the nth OID contained in the bitmap.
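To illustrate why such a wrapper keeps later changes small, here is a minimal, self-contained sketch. The toy_* types and the string "OIDs" are hypothetical stand-ins, not git's packed_git, multi_pack_index, or object_id; the only point is that callers go through a single accessor, so a second object source can later be added in one place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a single packfile. */
struct toy_pack {
	uint32_t num_objects;
	const char *const *oids;
};

/* Hypothetical, simplified stand-in for a multi-pack index. */
struct toy_midx {
	uint32_t num_objects;
	const char *const *oids;
};

/* Exactly one of pack/midx is expected to be non-NULL. */
struct toy_bitmap_index {
	const struct toy_pack *pack;
	const struct toy_midx *midx;
};

/*
 * Single accessor used by all callers: returns the nth OID from
 * whichever object source backs the bitmap, or NULL when n is out
 * of range. Supporting another source only touches this function.
 */
static const char *toy_nth_bitmap_object_oid(const struct toy_bitmap_index *index,
					     uint32_t n)
{
	assert(index->pack || index->midx);

	if (index->midx)
		return n < index->midx->num_objects ? index->midx->oids[n] : NULL;
	return n < index->pack->num_objects ? index->pack->oids[n] : NULL;
}
```

Callers written against toy_nth_bitmap_object_oid() would not need to change when the second branch is introduced, which is the kind of noise reduction this patch is after.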
Signed-off-by: Taylor Blau --- pack-bitmap.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 2dc135d34a..9757cd0fbb 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -223,6 +223,13 @@ static inline uint8_t read_u8(const unsigned char *buffer, size_t *pos) #define MAX_XOR_OFFSET 160 +static void nth_bitmap_object_oid(struct bitmap_index *index, + struct object_id *oid, + uint32_t n) +{ + nth_packed_object_id(oid, index->pack, n); +} + static int load_bitmap_entries_v1(struct bitmap_index *index) { uint32_t i; @@ -242,9 +249,7 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) xor_offset = read_u8(index->map, &index->map_pos); flags = read_u8(index->map, &index->map_pos); - if (nth_packed_object_id(&oid, index->pack, commit_idx_pos) < 0) - return error("corrupt ewah bitmap: commit index %u out of range", - (unsigned)commit_idx_pos); + nth_bitmap_object_oid(index, &oid, commit_idx_pos); bitmap = read_bitmap_1(index); if (!bitmap) @@ -844,8 +849,8 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; - nth_packed_object_id(&oid, pack, - pack_pos_to_index(pack, pos)); + nth_bitmap_object_oid(bitmap_git, &oid, + pack_pos_to_index(pack, pos)); die(_("unable to get size of %s"), oid_to_hex(&oid)); } } else {
From patchwork Mon Jun 21 22:25:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335813 Date: Mon, 21 Jun 2021 18:25:29 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 12/24] pack-bitmap.c: introduce 'bitmap_is_preferred_refname()' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org
In a recent commit, pack-objects learned support for the 'pack.preferBitmapTips' configuration. This patch prepares the multi-pack bitmap code to respect this configuration, too. Since the multi-pack bitmap code already does a traversal of all references (in order to discover the set of reachable commits in the multi-pack index), it is more efficient to check, during that same traversal, whether each reference starts with any value of 'pack.preferBitmapTips' than to do an additional traversal. Implement a function 'bitmap_is_preferred_refname()' which does just that. The caller will be added in a subsequent patch.
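As a rough illustration of the check, the sketch below treats each configured value as a prefix of the refname, mirroring the starts_with() call in the patch. It is a standalone approximation: has_prefix() and is_preferred_refname() are hypothetical helpers, and the configured values are passed in as a plain array rather than read from 'pack.preferBitmapTips'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's starts_with(): does str begin with prefix? */
static int has_prefix(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical stand-in for bitmap_is_preferred_refname(). Returns 1 if
 * any of the nr configured values in tips is a prefix of refname, and 0
 * otherwise (including when no values are configured).
 */
static int is_preferred_refname(const char *refname,
				const char *const *tips, size_t nr)
{
	size_t i;

	assert(refname);

	for (i = 0; i < nr; i++) {
		if (has_prefix(refname, tips[i]))
			return 1;
	}
	return 0;
}
```

With the values "refs/heads/release/" and "refs/tags/", a ref such as "refs/heads/release/v2" would be preferred while "refs/heads/topic" would not. Because the test is a single pass over the configured values, it can be run per reference inside an existing ref traversal.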
Signed-off-by: Taylor Blau --- pack-bitmap.c | 16 ++++++++++++++++ pack-bitmap.h | 1 + 2 files changed, 17 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 9757cd0fbb..d882bf7ce1 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1632,3 +1632,19 @@ const struct string_list *bitmap_preferred_tips(struct repository *r) { return repo_config_get_value_multi(r, "pack.preferbitmaptips"); } + +int bitmap_is_preferred_refname(struct repository *r, const char *refname) +{ + const struct string_list *preferred_tips = bitmap_preferred_tips(r); + struct string_list_item *item; + + if (!preferred_tips) + return 0; + + for_each_string_list_item(item, preferred_tips) { + if (starts_with(refname, item->string)) + return 1; + } + + return 0; +} diff --git a/pack-bitmap.h b/pack-bitmap.h index 020cd8d868..52ea10de51 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -94,5 +94,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint16_t options); const struct string_list *bitmap_preferred_tips(struct repository *r); +int bitmap_is_preferred_refname(struct repository *r, const char *refname); #endif
From patchwork Mon Jun 21 22:25:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335815 Date: Mon, 21 Jun 2021 18:25:31 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 13/24] pack-bitmap: read multi-pack bitmaps Message-ID: <7d44ba6299c06c956d5ac8ba01a0288d109c3cae.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org
This prepares the code in pack-bitmap to interpret the new multi-pack bitmaps described in Documentation/technical/bitmap-format.txt, which mostly involves converting bit positions to accommodate looking them up in a MIDX. Note that nothing writes multi-pack bitmaps yet; a writer will be implemented in the subsequent commit.
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 5 + pack-bitmap-write.c | 2 +- pack-bitmap.c | 362 +++++++++++++++++++++++++++++++++++++---- pack-bitmap.h | 5 + packfile.c | 2 +- 5 files changed, 340 insertions(+), 36 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 8a523624a1..e11d3ac2e5 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1124,6 +1124,11 @@ static void write_reused_pack(struct hashfile *f) break; offset += ewah_bit_ctz64(word >> offset); + /* + * Can use bit positions directly, even for MIDX + * bitmaps. See comment in try_partial_reuse() + * for why.
+ */ write_reused_pack_one(pos + offset, f, &w_curs); display_progress(progress_state, ++written); } diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 142fd0adb8..9c55c1531e 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -48,7 +48,7 @@ void bitmap_writer_show_progress(int show) } /** - * Build the initial type index for the packfile + * Build the initial type index for the packfile or multi-pack-index */ void bitmap_writer_build_type_index(struct packing_data *to_pack, struct pack_idx_entry **index, diff --git a/pack-bitmap.c b/pack-bitmap.c index d882bf7ce1..4110d23ca1 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -13,6 +13,7 @@ #include "repository.h" #include "object-store.h" #include "list-objects-filter-options.h" +#include "midx.h" #include "config.h" /* @@ -35,8 +36,15 @@ struct stored_bitmap { * the active bitmap index is the largest one. */ struct bitmap_index { - /* Packfile to which this bitmap index belongs to */ + /* + * The pack or multi-pack index (MIDX) that this bitmap index belongs + * to. + * + * Exactly one of these must be non-NULL; this specifies the object + * order used to interpret this bitmap. + */ struct packed_git *pack; + struct multi_pack_index *midx; /* * Mark the first `reuse_objects` in the packfile as reused: @@ -71,6 +79,9 @@ struct bitmap_index { /* If not NULL, this is a name-hash cache pointing into map. */ uint32_t *hashes; + /* The checksum of the packfile or MIDX; points into map. */ + const unsigned char *checksum; + /* * Extended index. 
* @@ -138,6 +149,8 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) static uint32_t bitmap_num_objects(struct bitmap_index *index) { + if (index->midx) + return index->midx->num_objects; return index->pack->num_objects; } @@ -175,6 +188,7 @@ static int load_bitmap_header(struct bitmap_index *index) } index->entry_count = ntohl(header->entry_count); + index->checksum = header->checksum; index->map_pos += header_size; return 0; } @@ -227,7 +241,10 @@ static void nth_bitmap_object_oid(struct bitmap_index *index, struct object_id *oid, uint32_t n) { - nth_packed_object_id(oid, index->pack, n); + if (index->midx) + nth_midxed_object_oid(oid, index->midx, n); + else + nth_packed_object_id(oid, index->pack, n); } static int load_bitmap_entries_v1(struct bitmap_index *index) @@ -272,7 +289,14 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) return 0; } -static char *pack_bitmap_filename(struct packed_git *p) +char *midx_bitmap_filename(struct multi_pack_index *midx) +{ + return xstrfmt("%s-%s.bitmap", + get_midx_filename(midx->object_dir), + hash_to_hex(get_midx_checksum(midx))); +} + +char *pack_bitmap_filename(struct packed_git *p) { size_t len; @@ -281,6 +305,57 @@ static char *pack_bitmap_filename(struct packed_git *p) return xstrfmt("%.*s.bitmap", (int)len, p->pack_name); } +static int open_midx_bitmap_1(struct bitmap_index *bitmap_git, + struct multi_pack_index *midx) +{ + struct stat st; + char *idx_name = midx_bitmap_filename(midx); + int fd = git_open(idx_name); + + free(idx_name); + + if (fd < 0) + return -1; + + if (fstat(fd, &st)) { + close(fd); + return -1; + } + + if (bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ + warning("ignoring extra bitmap file: %s", + get_midx_filename(midx->object_dir)); + close(fd); + return -1; + } + + bitmap_git->midx = midx; + bitmap_git->map_size = xsize_t(st.st_size); + bitmap_git->map_pos = 0; + bitmap_git->map = xmmap(NULL, 
bitmap_git->map_size, PROT_READ, + MAP_PRIVATE, fd, 0); + close(fd); + + if (load_bitmap_header(bitmap_git) < 0) + goto cleanup; + + if (!hasheq(get_midx_checksum(bitmap_git->midx), bitmap_git->checksum)) + goto cleanup; + + if (load_midx_revindex(bitmap_git->midx) < 0) { + warning(_("multi-pack bitmap is missing required reverse index")); + goto cleanup; + } + return 0; + +cleanup: + munmap(bitmap_git->map, bitmap_git->map_size); + bitmap_git->map_size = 0; + bitmap_git->map = NULL; + return -1; +} + static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git *packfile) { int fd; @@ -302,12 +377,18 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return -1; } - if (bitmap_git->pack) { + if (bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ warning("ignoring extra bitmap file: %s", packfile->pack_name); close(fd); return -1; } + if (!is_pack_valid(packfile)) { + close(fd); + return -1; + } + bitmap_git->pack = packfile; bitmap_git->map_size = xsize_t(st.st_size); bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, MAP_PRIVATE, fd, 0); @@ -324,13 +405,36 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return 0; } -static int load_pack_bitmap(struct bitmap_index *bitmap_git) +static int load_reverse_index(struct bitmap_index *bitmap_git) +{ + if (bitmap_is_midx(bitmap_git)) { + uint32_t i; + int ret; + + ret = load_midx_revindex(bitmap_git->midx); + if (ret) + return ret; + + for (i = 0; i < bitmap_git->midx->num_packs; i++) { + if (prepare_midx_pack(the_repository, bitmap_git->midx, i)) + die(_("load_reverse_index: could not open pack")); + ret = load_pack_revindex(bitmap_git->midx->packs[i]); + if (ret) + return ret; + } + return 0; + } + return load_pack_revindex(bitmap_git->pack); +} + +static int load_bitmap(struct bitmap_index *bitmap_git) { assert(bitmap_git->map); bitmap_git->bitmaps = kh_init_oid_map(); 
bitmap_git->ext_index.positions = kh_init_oid_pos(); - if (load_pack_revindex(bitmap_git->pack)) + + if (load_reverse_index(bitmap_git)) goto failed; if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) || @@ -374,11 +478,35 @@ static int open_pack_bitmap(struct repository *r, return ret; } +static int open_midx_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *midx; + + assert(!bitmap_git->map); + + for (midx = get_multi_pack_index(r); midx; midx = midx->next) { + if (!open_midx_bitmap_1(bitmap_git, midx)) + return 0; + } + return -1; +} + +static int open_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + assert(!bitmap_git->map); + + if (!open_midx_bitmap(r, bitmap_git)) + return 0; + return open_pack_bitmap(r, bitmap_git); +} + struct bitmap_index *prepare_bitmap_git(struct repository *r) { struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); - if (!open_pack_bitmap(r, bitmap_git) && !load_pack_bitmap(bitmap_git)) + if (!open_bitmap(r, bitmap_git) && !load_bitmap(bitmap_git)) return bitmap_git; free_bitmap_index(bitmap_git); @@ -428,10 +556,26 @@ static inline int bitmap_position_packfile(struct bitmap_index *bitmap_git, return pos; } +static int bitmap_position_midx(struct bitmap_index *bitmap_git, + const struct object_id *oid) +{ + uint32_t want, got; + if (!bsearch_midx(oid, bitmap_git->midx, &want)) + return -1; + + if (midx_to_pack_pos(bitmap_git->midx, want, &got) < 0) + return -1; + return got; +} + static int bitmap_position(struct bitmap_index *bitmap_git, const struct object_id *oid) { - int pos = bitmap_position_packfile(bitmap_git, oid); + int pos; + if (bitmap_is_midx(bitmap_git)) + pos = bitmap_position_midx(bitmap_git, oid); + else + pos = bitmap_position_packfile(bitmap_git, oid); return (pos >= 0) ? 
pos : bitmap_position_extended(bitmap_git, oid); } @@ -724,6 +868,7 @@ static void show_objects_for_type( continue; for (offset = 0; offset < BITS_IN_EWORD; ++offset) { + struct packed_git *pack; struct object_id oid; uint32_t hash = 0, index_pos; off_t ofs; @@ -733,14 +878,28 @@ static void show_objects_for_type( offset += ewah_bit_ctz64(word >> offset); - index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); - ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); - nth_packed_object_id(&oid, bitmap_git->pack, index_pos); + if (bitmap_is_midx(bitmap_git)) { + struct multi_pack_index *m = bitmap_git->midx; + uint32_t pack_id; + + index_pos = pack_pos_to_midx(m, pos + offset); + ofs = nth_midxed_offset(m, index_pos); + nth_midxed_object_oid(&oid, m, index_pos); + + pack_id = nth_midxed_pack_int_id(m, index_pos); + pack = bitmap_git->midx->packs[pack_id]; + } else { + index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); + ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); + nth_bitmap_object_oid(bitmap_git, &oid, index_pos); + + pack = bitmap_git->pack; + } if (bitmap_git->hashes) hash = get_be32(bitmap_git->hashes + index_pos); - show_reach(&oid, object_type, 0, hash, bitmap_git->pack, ofs); + show_reach(&oid, object_type, 0, hash, pack, ofs); } } } @@ -752,8 +911,13 @@ static int in_bitmapped_pack(struct bitmap_index *bitmap_git, struct object *object = roots->item; roots = roots->next; - if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) - return 1; + if (bitmap_is_midx(bitmap_git)) { + if (bsearch_midx(&object->oid, bitmap_git->midx, NULL)) + return 1; + } else { + if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) + return 1; + } } return 0; @@ -839,14 +1003,26 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git, static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, uint32_t pos) { - struct packed_git *pack = bitmap_git->pack; unsigned long size; struct object_info oi = 
OBJECT_INFO_INIT; oi.sizep = &size; if (pos < bitmap_num_objects(bitmap_git)) { - off_t ofs = pack_pos_to_offset(pack, pos); + struct packed_git *pack; + off_t ofs; + + if (bitmap_is_midx(bitmap_git)) { + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, pos); + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + + pack = bitmap_git->midx->packs[pack_id]; + ofs = nth_midxed_offset(bitmap_git->midx, midx_pos); + } else { + pack = bitmap_git->pack; + ofs = pack_pos_to_offset(pack, pos); + } + if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; nth_bitmap_object_oid(bitmap_git, &oid, @@ -1027,7 +1203,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, /* try to open a bitmapped pack, but don't parse it yet * because we may not need to use it */ CALLOC_ARRAY(bitmap_git, 1); - if (open_pack_bitmap(revs->repo, bitmap_git) < 0) + if (open_bitmap(revs->repo, bitmap_git) < 0) goto cleanup; for (i = 0; i < revs->pending.nr; ++i) { @@ -1071,7 +1247,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * from disk. this is the point of no return; after this the rev_list * becomes invalidated and we must perform the revwalk through bitmaps */ - if (load_pack_bitmap(bitmap_git) < 0) + if (load_bitmap(bitmap_git) < 0) goto cleanup; object_array_clear(&revs->pending); @@ -1115,19 +1291,43 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, } static void try_partial_reuse(struct bitmap_index *bitmap_git, + struct packed_git *pack, size_t pos, struct bitmap *reuse, struct pack_window **w_curs) { - off_t offset, header; + off_t offset, delta_obj_offset; enum object_type type; unsigned long size; - if (pos >= bitmap_num_objects(bitmap_git)) - return; /* not actually in the pack or MIDX */ + /* + * try_partial_reuse() is called either on (a) objects in the + * bitmapped pack (in the case of a single-pack bitmap) or (b) + * objects in the preferred pack of a multi-pack bitmap. 
+ * Importantly, the latter can pretend as if only a single pack + * exists because: + * + * - The first pack->num_objects bits of a MIDX bitmap are + * reserved for the preferred pack, and + * + * - Ties due to duplicate objects are always resolved in + * favor of the preferred pack. + * + * Therefore we do not need to ever ask the MIDX for its copy of + * an object by OID, since it will always select it from the + * preferred pack. Likewise, the selected copy of the base + * object for any deltas will reside in the same pack. + * + * This means that we can reuse pos when looking up the bit in + * the reuse bitmap, too, since bits corresponding to the + * preferred pack precede all bits from other packs. + */ - offset = header = pack_pos_to_offset(bitmap_git->pack, pos); - type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); + if (pos >= pack->num_objects) + return; /* not actually in the pack or MIDX preferred pack */ + + offset = delta_obj_offset = pack_pos_to_offset(pack, pos); + type = unpack_object_header(pack, w_curs, &offset, &size); if (type < 0) return; /* broken packfile, punt */ @@ -1143,11 +1343,11 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, * and the normal slow path will complain about it in * more detail. 
*/ - base_offset = get_delta_base(bitmap_git->pack, w_curs, - &offset, type, header); + base_offset = get_delta_base(pack, w_curs, &offset, type, + delta_obj_offset); if (!base_offset) return; - if (offset_to_pack_pos(bitmap_git->pack, base_offset, &base_pos) < 0) + if (offset_to_pack_pos(pack, base_offset, &base_pos) < 0) return; /* @@ -1180,24 +1380,48 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, bitmap_set(reuse, pos); } +static uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *m = bitmap_git->midx; + if (!m) + BUG("midx_preferred_pack: requires non-empty MIDX"); + return nth_midxed_pack_int_id(m, pack_pos_to_midx(bitmap_git->midx, 0)); +} + int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct packed_git **packfile_out, uint32_t *entries, struct bitmap **reuse_out) { + struct packed_git *pack; struct bitmap *result = bitmap_git->result; struct bitmap *reuse; struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; - uint32_t objects_nr = bitmap_num_objects(bitmap_git); + uint32_t objects_nr; assert(result); + load_reverse_index(bitmap_git); + + if (bitmap_is_midx(bitmap_git)) + pack = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)]; + else + pack = bitmap_git->pack; + objects_nr = pack->num_objects; + while (i < result->word_alloc && result->words[i] == (eword_t)~0) i++; - /* Don't mark objects not in the packfile */ + /* + * Don't mark objects not in the packfile or preferred pack. This bitmap + * marks objects eligible for reuse, but the pack-reuse code only + * understands how to reuse a single pack. Since the preferred pack is + * guaranteed to have all bases for its deltas (in a multi-pack bitmap), + * we use it instead of another pack. In single-pack bitmaps, the choice + * is made for us. 
+ */ if (i > objects_nr / BITS_IN_EWORD) i = objects_nr / BITS_IN_EWORD; @@ -1213,7 +1437,15 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, break; offset += ewah_bit_ctz64(word >> offset); - try_partial_reuse(bitmap_git, pos + offset, reuse, &w_curs); + if (bitmap_is_midx(bitmap_git)) { + /* + * Can't reuse from a non-preferred pack (see + * above). + */ + if (pos + offset >= objects_nr) + continue; + } + try_partial_reuse(bitmap_git, pack, pos + offset, reuse, &w_curs); } } @@ -1230,7 +1462,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, * need to be handled separately. */ bitmap_and_not(result, reuse); - *packfile_out = bitmap_git->pack; + *packfile_out = pack; *reuse_out = reuse; return 0; } @@ -1504,6 +1736,12 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; + if (!bitmap_is_midx(bitmap_git)) + load_reverse_index(bitmap_git); + else if (load_midx_revindex(bitmap_git->midx) < 0) + BUG("rebuild_existing_bitmaps: missing required rev-cache " + "extension"); + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); @@ -1511,8 +1749,13 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct object_id oid; struct object_entry *oe; - nth_packed_object_id(&oid, bitmap_git->pack, - pack_pos_to_index(bitmap_git->pack, i)); + if (bitmap_is_midx(bitmap_git)) + nth_midxed_object_oid(&oid, + bitmap_git->midx, + pack_pos_to_midx(bitmap_git->midx, i)); + else + nth_packed_object_id(&oid, bitmap_git->pack, + pack_pos_to_index(bitmap_git->pack, i)); oe = packlist_find(mapping, &oid); if (oe) @@ -1538,6 +1781,19 @@ void free_bitmap_index(struct bitmap_index *b) free(b->ext_index.hashes); bitmap_free(b->result); bitmap_free(b->haves); + if (bitmap_is_midx(b)) { + /* + * Multi-pack bitmaps need to have resources associated with + * their on-disk reverse indexes unmapped so that stale .rev and + * .bitmap files can 
be removed. + * + * Unlike pack-based bitmaps, multi-pack bitmaps can be read and + * written in the same 'git multi-pack-index write --bitmap' + * process. Close resources so they can be removed safely on + * platforms like Windows. + */ + close_midx_revindex(b->midx); + } free(b); } @@ -1552,7 +1808,7 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, enum object_type object_type) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; + struct packed_git *pack; off_t total = 0; struct ewah_iterator it; eword_t filter; @@ -1575,7 +1831,31 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, break; offset += ewah_bit_ctz64(word >> offset); - pos = base + offset; + + if (bitmap_is_midx(bitmap_git)) { + uint32_t pack_pos; + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, base + offset); + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + off_t offset = nth_midxed_offset(bitmap_git->midx, midx_pos); + + pack = bitmap_git->midx->packs[pack_id]; + + if (offset_to_pack_pos(pack, offset, &pack_pos) < 0) { + struct object_id oid; + nth_midxed_object_oid(&oid, bitmap_git->midx, midx_pos); + + die(_("could not find %s in pack #%"PRIu32" at offset %"PRIuMAX), + oid_to_hex(&oid), + pack_id, + (uintmax_t)offset); + } + + pos = pack_pos; + } else { + pack = bitmap_git->pack; + pos = base + offset; + } + total += pack_pos_to_offset(pack, pos + 1) - pack_pos_to_offset(pack, pos); } @@ -1628,6 +1908,20 @@ off_t get_disk_usage_from_bitmap(struct bitmap_index *bitmap_git, return total; } +int bitmap_is_midx(struct bitmap_index *bitmap_git) +{ + return !!bitmap_git->midx; +} + +off_t bitmap_pack_offset(struct bitmap_index *bitmap_git, uint32_t pos) +{ + if (bitmap_is_midx(bitmap_git)) + return nth_midxed_offset(bitmap_git->midx, + pack_pos_to_midx(bitmap_git->midx, pos)); + return nth_packed_object_offset(bitmap_git->pack, + pack_pos_to_index(bitmap_git->pack, pos)); +} + const 
struct string_list *bitmap_preferred_tips(struct repository *r) { return repo_config_get_value_multi(r, "pack.preferbitmaptips"); diff --git a/pack-bitmap.h b/pack-bitmap.h index 52ea10de51..30396a7a4a 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -92,6 +92,11 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, const char *filename, uint16_t options); +char *midx_bitmap_filename(struct multi_pack_index *midx); +char *pack_bitmap_filename(struct packed_git *p); + +int bitmap_is_midx(struct bitmap_index *bitmap_git); +off_t bitmap_pack_offset(struct bitmap_index *bitmap_git, uint32_t pos); const struct string_list *bitmap_preferred_tips(struct repository *r); int bitmap_is_preferred_refname(struct repository *r, const char *refname); diff --git a/packfile.c b/packfile.c index 755aa7aec5..e855b93208 100644 --- a/packfile.c +++ b/packfile.c @@ -860,7 +860,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (starts_with(file_name, "multi-pack-index") && - ends_with(file_name, ".rev")) + (ends_with(file_name, ".bitmap") || ends_with(file_name, ".rev"))) return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") ||
From patchwork Mon Jun 21 22:25:34 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335817
Date: Mon, 21 Jun 2021 18:25:34 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 14/24] pack-bitmap: write multi-pack bitmaps

Write multi-pack bitmaps in the format described by Documentation/technical/bitmap-format.txt, inferring their presence with the absence of '--bitmap'.

To write a multi-pack bitmap, this patch attempts to reuse as much of the existing machinery from pack-objects as possible. Specifically, the MIDX code prepares a packing_data struct that pretends as if a single packfile has been generated containing all of the objects contained within the MIDX.
Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 12 +- builtin/multi-pack-index.c | 2 + midx.c | 230 ++++++++++++++++++++++++- midx.h | 1 + 4 files changed, 236 insertions(+), 9 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index ffd601bc17..ada14deb2c 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -10,7 +10,7 @@ SYNOPSIS -------- [verse] 'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress] - [--preferred-pack=<pack>] + [--preferred-pack=<pack>] [--[no-]bitmap] DESCRIPTION ----------- @@ -40,6 +40,9 @@ write:: multiple packs contain the same object. If not given, ties are broken in favor of the pack with the lowest mtime. + + --[no-]bitmap:: + Control whether or not a multi-pack bitmap is written. -- verify:: @@ -81,6 +84,13 @@ EXAMPLES $ git multi-pack-index write ----------------------------------------------- +* Write a MIDX file for the packfiles in the current .git folder with a +corresponding bitmap. ++ +------------------------------------------------------------- +$ git multi-pack-index write --preferred-pack=<pack> --bitmap +------------------------------------------------------------- + * Write a MIDX file for the packfiles in an alternate object store.
+ ----------------------------------------------- diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 5d3ea445fd..bf6fa982e3 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -68,6 +68,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) OPT_STRING(0, "preferred-pack", &opts.preferred_pack, N_("preferred-pack"), N_("pack for reuse when computing a multi-pack bitmap")), + OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"), + MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), OPT_END(), }; diff --git a/midx.c b/midx.c index 752d36c57f..a58cca707b 100644 --- a/midx.c +++ b/midx.c @@ -13,6 +13,10 @@ #include "repository.h" #include "chunk-format.h" #include "pack.h" +#include "pack-bitmap.h" +#include "refs.h" +#include "revision.h" +#include "list-objects.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -885,6 +889,172 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, static void clear_midx_files_ext(struct repository *r, const char *ext, unsigned char *keep_hash); +static void prepare_midx_packing_data(struct packing_data *pdata, + struct write_midx_context *ctx) +{ + uint32_t i; + + memset(pdata, 0, sizeof(struct packing_data)); + prepare_packing_data(the_repository, pdata); + + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]]; + struct object_entry *to = packlist_alloc(pdata, &from->oid); + + oe_set_in_pack(pdata, to, + ctx->info[ctx->pack_perm[from->pack_int_id]].p); + } +} + +static int add_ref_to_pending(const char *refname, + const struct object_id *oid, + int flag, void *cb_data) +{ + struct rev_info *revs = (struct rev_info*)cb_data; + struct object *object; + + if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) { + warning("symbolic ref is dangling: %s", refname); + return 0; + } + + object = parse_object_or_die(oid, refname); + if (object->type != OBJ_COMMIT) + return 
0; + + add_pending_object(revs, object, ""); + if (bitmap_is_preferred_refname(revs->repo, refname)) + object->flags |= NEEDS_BITMAP; + return 0; +} + +struct bitmap_commit_cb { + struct commit **commits; + size_t commits_nr, commits_alloc; + + struct write_midx_context *ctx; +}; + +static const struct object_id *bitmap_oid_access(size_t index, + const void *_entries) +{ + const struct pack_midx_entry *entries = _entries; + return &entries[index].oid; +} + +static void bitmap_show_commit(struct commit *commit, void *_data) +{ + struct bitmap_commit_cb *data = _data; + if (oid_pos(&commit->object.oid, data->ctx->entries, + data->ctx->entries_nr, + bitmap_oid_access) > -1) { + ALLOC_GROW(data->commits, data->commits_nr + 1, + data->commits_alloc); + data->commits[data->commits_nr++] = commit; + } +} + +static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p, + struct write_midx_context *ctx) +{ + struct rev_info revs; + struct bitmap_commit_cb cb; + + memset(&cb, 0, sizeof(struct bitmap_commit_cb)); + cb.ctx = ctx; + + repo_init_revisions(the_repository, &revs, NULL); + for_each_ref(add_ref_to_pending, &revs); + + /* + * Skipping promisor objects here is intentional, since it only excludes + * them from the list of reachable commits that we want to select from + * when computing the selection of MIDX'd commits to receive bitmaps. + * + * Reachability bitmaps do require that their objects be closed under + * reachability, but fetching any objects missing from promisors at this + * point is too late. But, if one of those objects can be reached from + * another object that is included in the bitmap, then we will + * complain later that we don't have reachability closure (and fail + * appropriately). + */ + fetch_if_missing = 0; + revs.exclude_promisor_objects = 1; + + /* + * Pass selected commits in topo order to match the behavior of + * pack-bitmaps when configured with delta islands.
+ */ + revs.topo_order = 1; + revs.sort_order = REV_SORT_IN_GRAPH_ORDER; + + if (prepare_revision_walk(&revs)) + die(_("revision walk setup failed")); + + traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb); + if (indexed_commits_nr_p) + *indexed_commits_nr_p = cb.commits_nr; + + return cb.commits; +} + +static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, + struct write_midx_context *ctx, + unsigned flags) +{ + struct packing_data pdata; + struct pack_idx_entry **index; + struct commit **commits = NULL; + uint32_t i, commits_nr; + char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash)); + int ret; + + prepare_midx_packing_data(&pdata, ctx); + + commits = find_commits_for_midx_bitmap(&commits_nr, ctx); + + /* + * Build the MIDX-order index based on pdata.objects (which is already + * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of + * this order). + */ + ALLOC_ARRAY(index, pdata.nr_objects); + for (i = 0; i < pdata.nr_objects; i++) + index[i] = (struct pack_idx_entry *)&pdata.objects[i]; + + bitmap_writer_show_progress(flags & MIDX_PROGRESS); + bitmap_writer_build_type_index(&pdata, index, pdata.nr_objects); + + /* + * bitmap_writer_finish expects objects in lex order, but pack_order + * gives us exactly that. use it directly instead of re-sorting the + * array. + * + * This changes the order of objects in 'index' between + * bitmap_writer_build_type_index and bitmap_writer_finish. + * + * The same re-ordering takes place in the single-pack bitmap code via + * write_idx_file(), which is called by finish_tmp_packfile(), which + * happens between bitmap_writer_build_type_index() and + * bitmap_writer_finish(). 
+ */ + for (i = 0; i < pdata.nr_objects; i++) + index[ctx->pack_order[i]] = (struct pack_idx_entry *)&pdata.objects[i]; + + bitmap_writer_select_commits(commits, commits_nr, -1); + ret = bitmap_writer_build(&pdata); + if (ret < 0) + goto cleanup; + + bitmap_writer_set_checksum(midx_hash); + bitmap_writer_finish(index, pdata.nr_objects, bitmap_name, 0); + +cleanup: + free(index); + free(bitmap_name); + return ret; +} + static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, struct string_list *packs_to_drop, const char *preferred_pack_name, @@ -930,9 +1100,16 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * for (i = 0; i < ctx.m->num_packs; i++) { ALLOC_GROW(ctx.info, ctx.nr + 1, ctx.alloc); + if (prepare_midx_pack(the_repository, ctx.m, i)) { + error(_("could not load pack %s"), + ctx.m->pack_names[i]); + result = 1; + goto cleanup; + } + ctx.info[ctx.nr].orig_pack_int_id = i; ctx.info[ctx.nr].pack_name = xstrdup(ctx.m->pack_names[i]); - ctx.info[ctx.nr].p = NULL; + ctx.info[ctx.nr].p = ctx.m->packs[i]; ctx.info[ctx.nr].expired = 0; ctx.nr++; } @@ -947,8 +1124,26 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); stop_progress(&ctx.progress); - if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) - goto cleanup; + if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) { + struct bitmap_index *bitmap_git; + int bitmap_exists; + int want_bitmap = flags & MIDX_WRITE_BITMAP; + + bitmap_git = prepare_bitmap_git(the_repository); + bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git); + free_bitmap_index(bitmap_git); + + if (bitmap_exists || !want_bitmap) { + /* + * The correct MIDX already exists, and so does a + * corresponding bitmap (or one wasn't requested). 
+ */ + if (!want_bitmap) + clear_midx_files_ext(the_repository, ".bitmap", + NULL); + goto cleanup; + } + } if (preferred_pack_name) { int found = 0; @@ -964,7 +1159,8 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (!found) warning(_("unknown preferred pack: '%s'"), preferred_pack_name); - } else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) { + } else if (ctx.nr && + (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))) { time_t oldest = ctx.info[0].p->mtime; ctx.preferred_pack_idx = 0; @@ -1075,9 +1271,6 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR); f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); - if (ctx.m) - close_midx(ctx.m); - if (ctx.nr - dropped_packs == 0) { error(_("no pack files to index.")); result = 1; @@ -1108,14 +1301,22 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); - if (flags & MIDX_WRITE_REV_INDEX) + if (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP)) ctx.pack_order = midx_pack_order(&ctx); if (flags & MIDX_WRITE_REV_INDEX) write_midx_reverse_index(midx_name, midx_hash, &ctx); + if (flags & MIDX_WRITE_BITMAP) { + if (write_midx_bitmap(midx_name, midx_hash, &ctx, flags) < 0) { + error(_("could not write multi-pack bitmap")); + result = 1; + goto cleanup; + } + } commit_lock_file(&lk); + clear_midx_files_ext(the_repository, ".bitmap", midx_hash); clear_midx_files_ext(the_repository, ".rev", midx_hash); cleanup: @@ -1123,6 +1324,15 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.info[i].p) { close_pack(ctx.info[i].p); free(ctx.info[i].p); + if (ctx.m) { + /* + * Destroy a stale reference to the pack in + * 'ctx.m'. 
+ */ + uint32_t orig = ctx.info[i].orig_pack_int_id; + if (orig < ctx.m->num_packs) + ctx.m->packs[orig] = NULL; + } } free(ctx.info[i].pack_name); } @@ -1132,6 +1342,9 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * free(ctx.pack_perm); free(ctx.pack_order); free(midx_name); + if (ctx.m) + close_midx(ctx.m); + return result; } @@ -1193,6 +1406,7 @@ void clear_midx_file(struct repository *r) if (remove_path(midx)) die(_("failed to clear multi-pack-index at %s"), midx); + clear_midx_files_ext(r, ".bitmap", NULL); clear_midx_files_ext(r, ".rev", NULL); free(midx); diff --git a/midx.h b/midx.h index 1172df1a71..350f4d0a7b 100644 --- a/midx.h +++ b/midx.h @@ -41,6 +41,7 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) #define MIDX_WRITE_REV_INDEX (1 << 1) +#define MIDX_WRITE_BITMAP (1 << 2) const unsigned char *get_midx_checksum(struct multi_pack_index *m); char *get_midx_filename(const char *object_dir);
From patchwork Mon Jun 21 22:25:37 2021
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 12335819
Date: Mon, 21 Jun 2021 18:25:37 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com
Subject: [PATCH v2 15/24] t5310: move some tests to lib-bitmap.sh

We'll soon be adding a test script that will cover many of the same bitmap concepts as t5310, but for MIDX bitmaps. Let's pull out as many of the applicable tests as we can so we don't have to rewrite them.

There should be no functional change to t5310; we still run the same operations in the same order.

Signed-off-by: Taylor Blau --- t/lib-bitmap.sh | 236 ++++++++++++++++++++++++++++++++++++++++ t/t5310-pack-bitmaps.sh | 227 +------------------------------------- 2 files changed, 240 insertions(+), 223 deletions(-) diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh index fe3f98be24..ecb5d0e05d 100644 --- a/t/lib-bitmap.sh +++ b/t/lib-bitmap.sh @@ -1,3 +1,6 @@ +# Helpers for scripts testing bitmap functionality; see t5310 for +# example usage. + # Compare a file containing rev-list bitmap traversal output to its non-bitmap # counterpart. You can't just use test_cmp for this, because the two produce # subtly different output: @@ -24,3 +27,236 @@ test_bitmap_traversal () { test_cmp "$1.normalized" "$2.normalized" && rm -f "$1.normalized" "$2.normalized" } + +# To ensure the logic for "maximal commits" is exercised, make +# the repository a bit more complicated.
+#
+#    other                         second
+#      *                             *
+# (99 commits)                  (99 commits)
+#      *                             *
+#      |\                           /|
+#      | * octo-other  octo-second * |
+#      |/|\_________  ____________/|\|
+#      | \          \/  __________/  |
+#      |  | ________/\ /             |
+#      *  |/          * merge-right  *
+#      | _|__________/ \____________ |
+#      |/ |                         \|
+# (l1) *  * merge-left               * (r1)
+#      | / \________________________ |
+#      |/                           \|
+# (l2) *                             * (r2)
+#       \___________________________ |
+#                                   \|
+#                                    * (base)
+#
+# We only push bits down the first-parent history, which
+# makes some of these commits unimportant!
+#
+# The important part for the maximal commit algorithm is how
+# the bitmasks are extended. Assuming starting bit positions
+# for second (bit 0) and other (bit 1), the bitmasks at the
+# end should be:
+#
+#      second: 1       (maximal, selected)
+#       other: 01      (maximal, selected)
+#      (base): 11      (maximal)
+#
+# This complicated history was important for a previous
+# version of the walk that guarantees never walking a
+# commit multiple times. That goal might be important
+# again, so preserve this complicated case. For now, this
+# test will guarantee that the bitmaps are computed
+# correctly, even with the repeat calculations.
+setup_bitmap_history() { + test_expect_success 'setup repo with moderate-sized history' ' + test_commit_bulk --id=file 10 && + git branch -M second && + git checkout -b other HEAD~5 && + test_commit_bulk --id=side 10 && + + # add complicated history setup, including merges and + # ambiguous merge-bases + + git checkout -b merge-left other~2 && + git merge second~2 -m "merge-left" && + + git checkout -b merge-right second~1 && + git merge other~1 -m "merge-right" && + + git checkout -b octo-second second && + git merge merge-left merge-right -m "octopus-second" && + + git checkout -b octo-other other && + git merge merge-left merge-right -m "octopus-other" && + + git checkout other && + git merge octo-other -m "pull octopus" && + + git checkout second && + git merge octo-second -m "pull octopus" && + + # Remove these branches so they are not selected + # as bitmap tips + git branch -D merge-left && + git branch -D merge-right && + git branch -D octo-other && + git branch -D octo-second && + + # add padding to make these merges less interesting + # and avoid having them selected for bitmaps + test_commit_bulk --id=file 100 && + git checkout other && + test_commit_bulk --id=side 100 && + git checkout second && + + bitmaptip=$(git rev-parse second) && + blob=$(echo tagged-blob | git hash-object -w --stdin) && + git tag tagged-blob $blob + ' +} + +rev_list_tests_head () { + test_expect_success "counting commits via bitmap ($state, $branch)" ' + git rev-list --count $branch >expect && + git rev-list --use-bitmap-index --count $branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting partial commits via bitmap ($state, $branch)" ' + git rev-list --count $branch~5..$branch >expect && + git rev-list --use-bitmap-index --count $branch~5..$branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting commits with limit ($state, $branch)" ' + git rev-list --count -n 1 $branch >expect && + git rev-list --use-bitmap-index --count -n 1 
$branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting non-linear history ($state, $branch)" ' + git rev-list --count other...second >expect && + git rev-list --use-bitmap-index --count other...second >actual && + test_cmp expect actual + ' + + test_expect_success "counting commits with limiting ($state, $branch)" ' + git rev-list --count $branch -- 1.t >expect && + git rev-list --use-bitmap-index --count $branch -- 1.t >actual && + test_cmp expect actual + ' + + test_expect_success "counting objects via bitmap ($state, $branch)" ' + git rev-list --count --objects $branch >expect && + git rev-list --use-bitmap-index --count --objects $branch >actual && + test_cmp expect actual + ' + + test_expect_success "enumerate commits ($state, $branch)" ' + git rev-list --use-bitmap-index $branch >actual && + git rev-list $branch >expect && + test_bitmap_traversal --no-confirm-bitmaps expect actual + ' + + test_expect_success "enumerate --objects ($state, $branch)" ' + git rev-list --objects --use-bitmap-index $branch >actual && + git rev-list --objects $branch >expect && + test_bitmap_traversal expect actual + ' + + test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" ' + git rev-list --objects --use-bitmap-index $branch tagged-blob >actual && + grep $blob actual + ' +} + +rev_list_tests () { + state=$1 + + for branch in "second" "other" + do + rev_list_tests_head + done +} + +basic_bitmap_tests () { + tip="$1" + test_expect_success 'rev-list --test-bitmap verifies bitmaps' " + git rev-list --test-bitmap "${tip:-HEAD}" + " + + rev_list_tests 'full bitmap' + + test_expect_success 'clone from bitmapped repository' ' + rm -fr clone.git && + git clone --no-local --bare . 
clone.git && + git rev-parse HEAD >expect && + git --git-dir=clone.git rev-parse HEAD >actual && + test_cmp expect actual + ' + + test_expect_success 'partial clone from bitmapped repository' ' + test_config uploadpack.allowfilter true && + rm -fr partial-clone.git && + git clone --no-local --bare --filter=blob:none . partial-clone.git && + ( + cd partial-clone.git && + pack=$(echo objects/pack/*.pack) && + git verify-pack -v "$pack" >have && + awk "/blob/ { print \$1 }" <have >blobs && + # we expect this single blob because of the direct ref + git rev-parse refs/tags/tagged-blob >expect && + test_cmp expect blobs + ) + ' + + test_expect_success 'setup further non-bitmapped commits' ' + test_commit_bulk --id=further 10 + ' + + rev_list_tests 'partial bitmap' + + test_expect_success 'fetch (partial bitmap)' ' + git --git-dir=clone.git fetch origin second:second && + git rev-parse HEAD >expect && + git --git-dir=clone.git rev-parse HEAD >actual && + test_cmp expect actual + ' + + test_expect_success 'enumerating progress counts pack-reused objects' ' + count=$(git rev-list --objects --all --count) && + git repack -adb && + + # check first with only reused objects; confirm that our + # progress showed the right number, and also that we did + # pack-reuse as expected. Check only the final "done" + # line of the meter (there may be an arbitrary number of + # intermediate lines ending with CR).
+ GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + </dev/null >/dev/null 2>stderr && + grep "Enumerating objects: $count, done" stderr && + grep "pack-reused $count" stderr && + + # now the same but with one non-reused object + git commit --allow-empty -m "an extra commit object" && + GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + </dev/null >/dev/null 2>stderr && + grep "Enumerating objects: $((count+1)), done" stderr && + grep "pack-reused $count" stderr + ' +} + +# have_delta +# +# Note that because this relies on cat-file, it might find _any_ copy of an +# object in the repository. The caller is responsible for making sure +# there's only one (e.g., via "repack -ad", or having just fetched a copy). +have_delta () { + echo $2 >expect && + echo $1 | git cat-file --batch-check="%(deltabase)" >actual && + test_cmp expect actual +} diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index b02838750e..4318f84d53 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -25,93 +25,10 @@ has_any () { grep -Ff "$1" "$2" } -# To ensure the logic for "maximal commits" is exercised, make -# the repository a bit more complicated. -# -# other second -# * * -# (99 commits) (99 commits) -# * * -# |\ /| -# | * octo-other octo-second * | -# |/|\_________ ____________/|\| -# | \ \/ __________/ | -# | | ________/\ / | -# * |/ * merge-right * -# | _|__________/ \____________ | -# |/ | \| -# (l1) * * merge-left * (r1) -# | / \________________________ | -# |/ \| -# (l2) * * (r2) -# \___________________________ | -# \| -# * (base) -# -# We only push bits down the first-parent history, which -# makes some of these commits unimportant! -# -# The important part for the maximal commit algorithm is how -# the bitmasks are extended.
Assuming starting bit positions -# for second (bit 0) and other (bit 1), the bitmasks at the -# end should be: -# -# second: 1 (maximal, selected) -# other: 01 (maximal, selected) -# (base): 11 (maximal) -# -# This complicated history was important for a previous -# version of the walk that guarantees never walking a -# commit multiple times. That goal might be important -# again, so preserve this complicated case. For now, this -# test will guarantee that the bitmaps are computed -# correctly, even with the repeat calculations. +setup_bitmap_history -test_expect_success 'setup repo with moderate-sized history' ' - test_commit_bulk --id=file 10 && - git branch -M second && - git checkout -b other HEAD~5 && - test_commit_bulk --id=side 10 && - - # add complicated history setup, including merges and - # ambiguous merge-bases - - git checkout -b merge-left other~2 && - git merge second~2 -m "merge-left" && - - git checkout -b merge-right second~1 && - git merge other~1 -m "merge-right" && - - git checkout -b octo-second second && - git merge merge-left merge-right -m "octopus-second" && - - git checkout -b octo-other other && - git merge merge-left merge-right -m "octopus-other" && - - git checkout other && - git merge octo-other -m "pull octopus" && - - git checkout second && - git merge octo-second -m "pull octopus" && - - # Remove these branches so they are not selected - # as bitmap tips - git branch -D merge-left && - git branch -D merge-right && - git branch -D octo-other && - git branch -D octo-second && - - # add padding to make these merges less interesting - # and avoid having them selected for bitmaps - test_commit_bulk --id=file 100 && - git checkout other && - test_commit_bulk --id=side 100 && - git checkout second && - - bitmaptip=$(git rev-parse second) && - blob=$(echo tagged-blob | git hash-object -w --stdin) && - git tag tagged-blob $blob && - git config repack.writebitmaps true +test_expect_success 'setup writing bitmaps during repack' ' + git 
config repack.writeBitmaps true ' test_expect_success 'full repack creates bitmaps' ' @@ -123,109 +40,7 @@ test_expect_success 'full repack creates bitmaps' ' grep "\"key\":\"num_maximal_commits\",\"value\":\"107\"" trace ' -test_expect_success 'rev-list --test-bitmap verifies bitmaps' ' - git rev-list --test-bitmap HEAD -' - -rev_list_tests_head () { - test_expect_success "counting commits via bitmap ($state, $branch)" ' - git rev-list --count $branch >expect && - git rev-list --use-bitmap-index --count $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting partial commits via bitmap ($state, $branch)" ' - git rev-list --count $branch~5..$branch >expect && - git rev-list --use-bitmap-index --count $branch~5..$branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limit ($state, $branch)" ' - git rev-list --count -n 1 $branch >expect && - git rev-list --use-bitmap-index --count -n 1 $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting non-linear history ($state, $branch)" ' - git rev-list --count other...second >expect && - git rev-list --use-bitmap-index --count other...second >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limiting ($state, $branch)" ' - git rev-list --count $branch -- 1.t >expect && - git rev-list --use-bitmap-index --count $branch -- 1.t >actual && - test_cmp expect actual - ' - - test_expect_success "counting objects via bitmap ($state, $branch)" ' - git rev-list --count --objects $branch >expect && - git rev-list --use-bitmap-index --count --objects $branch >actual && - test_cmp expect actual - ' - - test_expect_success "enumerate commits ($state, $branch)" ' - git rev-list --use-bitmap-index $branch >actual && - git rev-list $branch >expect && - test_bitmap_traversal --no-confirm-bitmaps expect actual - ' - - test_expect_success "enumerate --objects ($state, $branch)" ' - git rev-list --objects 
--use-bitmap-index $branch >actual && - git rev-list --objects $branch >expect && - test_bitmap_traversal expect actual - ' - - test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" ' - git rev-list --objects --use-bitmap-index $branch tagged-blob >actual && - grep $blob actual - ' -} - -rev_list_tests () { - state=$1 - - for branch in "second" "other" - do - rev_list_tests_head - done -} - -rev_list_tests 'full bitmap' - -test_expect_success 'clone from bitmapped repository' ' - git clone --no-local --bare . clone.git && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' - -test_expect_success 'partial clone from bitmapped repository' ' - test_config uploadpack.allowfilter true && - git clone --no-local --bare --filter=blob:none . partial-clone.git && - ( - cd partial-clone.git && - pack=$(echo objects/pack/*.pack) && - git verify-pack -v "$pack" >have && - awk "/blob/ { print \$1 }" <have >blobs && - # we expect this single blob because of the direct ref - git rev-parse refs/tags/tagged-blob >expect && - test_cmp expect blobs - ) -' - -test_expect_success 'setup further non-bitmapped commits' ' - test_commit_bulk --id=further 10 -' - -rev_list_tests 'partial bitmap' - -test_expect_success 'fetch (partial bitmap)' ' - git --git-dir=clone.git fetch origin second:second && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' +basic_bitmap_tests test_expect_success 'incremental repack fails when bitmaps are requested' ' test_commit more-1 && @@ -461,40 +276,6 @@ test_expect_success 'truncated bitmap fails gracefully (cache)' ' test_i18ngrep corrupted.bitmap.index stderr ' -test_expect_success 'enumerating progress counts pack-reused objects' ' - count=$(git rev-list --objects --all --count) && - git repack -adb && - - # check first with only reused objects; confirm that our progress - # showed the right number, and also that
we did pack-reuse as expected. - # Check only the final "done" line of the meter (there may be an - # arbitrary number of intermediate lines ending with CR). - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - </dev/null >/dev/null 2>stderr && - grep "Enumerating objects: $count, done" stderr && - grep "pack-reused $count" stderr && - - # now the same but with one non-reused object - git commit --allow-empty -m "an extra commit object" && - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - </dev/null >/dev/null 2>stderr && - grep "Enumerating objects: $((count+1)), done" stderr && - grep "pack-reused $count" stderr -' - -# have_delta -# -# Note that because this relies on cat-file, it might find _any_ copy of an -# object in the repository. The caller is responsible for making sure -# there's only one (e.g., via "repack -ad", or having just fetched a copy). -have_delta () { - echo $2 >expect && - echo $1 | git cat-file --batch-check="%(deltabase)" >actual && - test_cmp expect actual -} - # Create a state of history with these properties: # # - refs that allow a client to fetch some new history, while sharing some old From patchwork Mon Jun 21 22:25:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 478A5C48BC2 for ; Mon, 21 Jun 2021 22:25:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id
2D0D760FE9 for ; Mon, 21 Jun 2021 22:25:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232267AbhFUW15 (ORCPT ); Mon, 21 Jun 2021 18:27:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53762 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232181AbhFUW14 (ORCPT ); Mon, 21 Jun 2021 18:27:56 -0400 Received: from mail-il1-x132.google.com (mail-il1-x132.google.com [IPv6:2607:f8b0:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7499C061574 for ; Mon, 21 Jun 2021 15:25:41 -0700 (PDT) Received: by mail-il1-x132.google.com with SMTP id k5so3126277ilv.8 for ; Mon, 21 Jun 2021 15:25:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=ZPrEzB0xIK/wOTqk4WBQW79bWc96glFVTSqPVLHOsMY=; b=rMBDFyWftTGzHaFzsr5c1NMiZ2lja67Spp2qGm9DGOZX9p2wLHc9vkhnDBTpmgi1bF vG7UCcmU/sWmQQbi9a+emoZqFaY7QT0Nv+FDqAmr6d3xzgVrU58t7qxjd0f1AsRYx5gG WeND9a3PV0lLbOypJmj44nKbEZU9wrCPRjLC63kES/ntqT9epZgb32q1fWlVNxZqL1Ls mWR1Fx8FvUzJhgqNthXE+arpoc/hRc/uczf9vKHPPgJtZp5QEX04Tt/RpB80zuX7eG7H dJTSg1ugbUOuIhtKRFFgDGsnDacbHHrX3F7RPaI9sZRDAINXXkQHIhoIcG5AhRfUgrSU RpXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=ZPrEzB0xIK/wOTqk4WBQW79bWc96glFVTSqPVLHOsMY=; b=EQZbg2sFLPFn/t/6F+eZ4dCssW9h04mCyNTfzpKMoV3QiBvmfEUvEYZJdzC9d06gfV ScKYVus8eCZEynbS5jH1l/8cZF/Q4ExhyAQsrKHt5rWm3P5ZaiHpfQCqUxoGql08VmOh 86UmJqkyDxoq3kByRUeLdxQdFAJ15ATy+wSgeM0dpDdbDre0Vlxa92r6dSmOEsn4VzVq 5GEBrImEICemZt3+TtguUBt8JUutLvjweM1CCfdfn1Nr4JlufT4ulL658imEO+/SVLCE 0GaNijU8XgZi8ETra5HbA6+dlxRFdB93ZtSb7KYROnp3icnNjLm1XoAz57cwvMSG/wRI Wp0Q== X-Gm-Message-State: 
AOAM532JrtDvqjvhMm7+fcckDOJCfPABOAeJZ5/LDzFghcAADgOQCQKy gwbFataPc1mtKoROvtOPuJQpDGV6jpQHue+7 X-Google-Smtp-Source: ABdhPJzs0QFxJpArIRjyRCqikWuSsxEoq5oITs6cNqHt8BPS23E6kk4XmWrebfPsKR3kEgUFvtidng== X-Received: by 2002:a05:6e02:10c2:: with SMTP id s2mr364698ilj.24.1624314341144; Mon, 21 Jun 2021 15:25:41 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id y205sm11119697iof.1.2021.06.21.15.25.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:40 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:40 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 16/24] t/helper/test-read-midx.c: add --checksum mode Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Subsequent tests will want to check for the existence of a multi-pack bitmap which matches the multi-pack-index stored in the pack directory. The multi-pack bitmap includes the hex checksum of the MIDX it corresponds to in its filename (for example, '$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests want a way to learn what '<checksum>' is. This helper addresses that need by printing the checksum of the repository's multi-pack-index.
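As a rough illustration of how a test might consume that checksum, here is a minimal sketch. The helper name midx_bitmap_path is hypothetical and not part of this series; only the filename layout and the 'test-tool read-midx --checksum' invocation come from the patch itself.

```shell
# Hypothetical helper (not part of this patch): print the path of the
# multi-pack bitmap that belongs to a MIDX with the given hex checksum.
# $1 is the object directory (defaults to .git/objects), $2 the checksum.
midx_bitmap_path () {
	printf '%s\n' "${1:-.git/objects}/pack/multi-pack-index-$2.bitmap"
}

# Inside a Git test script, where test-tool is available, the checksum
# would come from the new mode, for example:
#
#   checksum=$(test-tool read-midx --checksum .git/objects) &&
#   test_path_is_file "$(midx_bitmap_path .git/objects "$checksum")"
```

Keeping the path construction separate from the test-tool call is only a convenience for this sketch; the series itself builds the path inline.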
Signed-off-by: Taylor Blau --- t/helper/test-read-midx.c | 16 +++++++++++++++- t/lib-bitmap.sh | 4 ++++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 7c2eb11a8e..cb0d27049a 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -60,12 +60,26 @@ static int read_midx_file(const char *object_dir, int show_objects) return 0; } +static int read_midx_checksum(const char *object_dir) +{ + struct multi_pack_index *m; + + setup_git_directory(); + m = load_multi_pack_index(object_dir, 1); + if (!m) + return 1; + printf("%s\n", hash_to_hex(get_midx_checksum(m))); + return 0; +} + int cmd__read_midx(int argc, const char **argv) { if (!(argc == 2 || argc == 3)) - usage("read-midx [--show-objects] <object-dir>"); + usage("read-midx [--show-objects|--checksum] <object-dir>"); if (!strcmp(argv[1], "--show-objects")) return read_midx_file(argv[2], 1); + else if (!strcmp(argv[1], "--checksum")) + return read_midx_checksum(argv[2]); return read_midx_file(argv[1], 0); } diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh index ecb5d0e05d..09cd036f4d 100644 --- a/t/lib-bitmap.sh +++ b/t/lib-bitmap.sh @@ -260,3 +260,7 @@ have_delta () { echo $1 | git cat-file --batch-check="%(deltabase)" >actual && test_cmp expect actual } + +midx_checksum () { + test-tool read-midx --checksum "${1:-.git/objects}" +} From patchwork Mon Jun 21 22:25:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by
smtp.lore.kernel.org (Postfix) with ESMTP id 50C05C48BC2 for ; Mon, 21 Jun 2021 22:25:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3A20F61289 for ; Mon, 21 Jun 2021 22:25:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232281AbhFUW2E (ORCPT ); Mon, 21 Jun 2021 18:28:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232258AbhFUW2D (ORCPT ); Mon, 21 Jun 2021 18:28:03 -0400 Received: from mail-io1-xd31.google.com (mail-io1-xd31.google.com [IPv6:2607:f8b0:4864:20::d31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD5AFC061760 for ; Mon, 21 Jun 2021 15:25:44 -0700 (PDT) Received: by mail-io1-xd31.google.com with SMTP id o5so15883293iob.4 for ; Mon, 21 Jun 2021 15:25:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=TyurQToSdl5Ms1MbaDFjltHEPjfsW7FXSTdEXeOoLsg=; b=XjKezKIrMrYT1YJFq6lUYczGeLB2u4/JKF1OwJtEizMmPqv/ySFuBn24Y/xDN+q7dD 5RibQq7YzV0ulnbQRMsgMQxwXEr9iM4yjA/cE17uXwMwOSIEUcS9xXTsdGi+zoaLLE22 PrriCu1UdEWbMzinlYvkXJr1KMxurIn76cFpIRSw3XgJkZ8mLLaM+/xicB2BmhfeqFJD EWHb9du2s4GTSqun6MkIjl89FHWnmJW+kXvYwwrScJsR/vy+YyySKDcYuXdkV1fArvWM WdoxV/WZ0g2PQP1/gRBMID5uqZ254wAhB9DNg5IKJ7OezROedB2V4tmM2E5gTYJa6Fgp rX+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=TyurQToSdl5Ms1MbaDFjltHEPjfsW7FXSTdEXeOoLsg=; b=ToskAKrq4ag/vAwQsSdfyyvPCG/xy2OUgoPCXtfptaUCk/Hoefb+NmlbfS/3BkHBNW 9qkailw2oiiyPKaLEjcb2vKfXnHS/taglatcYV2VISQiWLIyK6OQFIPgWPFEhvp3FoUb ByMOFfkvuo+4EKmk9HA43GwucYdWm2mCQTMVL8za8gtZEO4hbguGQ3eWR3OwWDbrtyM2 
KOgEuUIaJiYQtq0uQAR87Gotc0biGpgpEZ9czxSIGwAbUBV00cXS0Z+0M/tZVKAuH5iQ f2fHN/9IpMKOWC+YCOhBy40WHgQsuy2hO3H3+zVgj+D7P5WdxY7oVEig4AMnjIV5QhcQ ochg== X-Gm-Message-State: AOAM530gASadeVquHGjwaYjRqPzndcUcC+GOcll+oob+8+9PZeh5Rukg 7uvC+FC+pflg9pgaUeFsto2HLGVdP1GSoXm3 X-Google-Smtp-Source: ABdhPJzlUSWdbmYIh0KeRdKXQkhjc5IP1q+fBG2aiJSJ0XydsaSRwuBiwYzK6w9XHSW4AMlvACaRGQ== X-Received: by 2002:a02:5d0a:: with SMTP id w10mr663221jaa.82.1624314343916; Mon, 21 Jun 2021 15:25:43 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id e7sm6235186ili.10.2021.06.21.15.25.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:43 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:42 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 17/24] t5326: test multi-pack bitmap behavior Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This patch introduces a new test, t5326, which tests the basic functionality of multi-pack bitmaps. Some trivial behavior is tested, such as: - Whether bitmaps can be generated with more than one pack. - Whether clones can be served with all objects in the bitmap. - Whether follow-up fetches can be served with some objects outside of the server's bitmap These use lib-bitmap's tests (which in turn were pulled from t5310), and we cover cases where the MIDX represents both a single pack and multiple packs. In addition, some non-trivial and MIDX-specific behavior is tested, too, including: - Whether multi-pack bitmaps behave correctly with respect to the pack-reuse machinery when the base for some object is selected from a different pack than the delta. - Whether multi-pack bitmaps correctly respect the pack.preferBitmapTips configuration. 
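To make the pack-reuse case a little more concrete, the check that a delta and its base were selected from different packs can be sketched as below. This is only an illustration: pack_source_from_listing is a hypothetical stand-in for the midx_pack_source helper, it reads a saved listing instead of running test-tool, and the line layout it assumes (object ID, a space, an offset, a TAB, the pack name) is inferred from the grep/cut pipeline in the test rather than taken from the tool's documentation. The object IDs and pack names are made up.

```shell
# Hypothetical stand-in for midx_pack_source: look up the pack that an
# object was selected from in a saved listing ($2) rather than invoking
# test-tool. Assumed line layout: "<oid> <offset><TAB><pack>".
pack_source_from_listing () {
	grep "^$1 " "$2" | cut -f2
}

# Made-up object IDs, offsets, and pack names, for illustration only.
listing=$(mktemp)
printf '%s %s\t%s\n' \
	aaaa 12 pack-1111.pack \
	bbbb 34 pack-2222.pack >"$listing"

# A delta ("aaaa") whose base ("bbbb") lives in another pack is the
# situation the pack-reuse test sets up.
if test "$(pack_source_from_listing aaaa "$listing")" != \
	"$(pack_source_from_listing bbbb "$listing")"
then
	echo "delta and base come from different packs"
fi
```

In the real test the listing would be produced by 'test-tool read-midx --show-objects', and the comparison is combined with have_delta to confirm that the object is still stored as a delta against the expected base.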
Signed-off-by: Taylor Blau --- t/t5326-multi-pack-bitmaps.sh | 277 ++++++++++++++++++++++++++++++++++ 1 file changed, 277 insertions(+) create mode 100755 t/t5326-multi-pack-bitmaps.sh diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh new file mode 100755 index 0000000000..c1b7d633e2 --- /dev/null +++ b/t/t5326-multi-pack-bitmaps.sh @@ -0,0 +1,277 @@ +#!/bin/sh + +test_description='exercise basic multi-pack bitmap functionality' +. ./test-lib.sh +. "${TEST_DIRECTORY}/lib-bitmap.sh" + +# We'll be writing our own midx and bitmaps, so avoid getting confused by the +# automatic ones. +GIT_TEST_MULTI_PACK_INDEX=0 +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + +objdir=.git/objects +midx=$objdir/pack/multi-pack-index + +# midx_pack_source +midx_pack_source () { + test-tool read-midx --show-objects .git/objects | grep "^$1 " | cut -f2 +} + +setup_bitmap_history + +test_expect_success 'enable core.multiPackIndex' ' + git config core.multiPackIndex true +' + +test_expect_success 'create single-pack midx with bitmaps' ' + git repack -ad && + git multi-pack-index write --bitmap && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap +' + +basic_bitmap_tests + +test_expect_success 'create new additional packs' ' + for i in $(test_seq 1 16) + do + test_commit "$i" && + git repack -d + done && + + git checkout -b other2 HEAD~8 && + for i in $(test_seq 1 8) + do + test_commit "side-$i" && + git repack -d + done && + git checkout second +' + +test_expect_success 'create multi-pack midx with bitmaps' ' + git multi-pack-index write --bitmap && + + ls $objdir/pack/pack-*.pack >packs && + test_line_count = 25 packs && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap +' + +basic_bitmap_tests + +test_expect_success '--no-bitmap is respected when bitmaps exist' ' + git multi-pack-index write --bitmap && + + test_commit respect--no-bitmap && + GIT_TEST_MULTI_PACK_INDEX=0 git repack -d && + + 
test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + + git multi-pack-index write --no-bitmap && + + test_path_is_file $midx && + test_path_is_missing $midx-$(midx_checksum $objdir).bitmap +' + +test_expect_success 'setup midx with base from later pack' ' + # Write a and b so that "a" is a delta on top of base "b", since Git + # prefers to delete contents out of a base rather than add to a shorter + # object. + test_seq 1 128 >a && + test_seq 1 130 >b && + + git add a b && + git commit -m "initial commit" && + + a=$(git rev-parse HEAD:a) && + b=$(git rev-parse HEAD:b) && + + # In the first pack, "a" is stored as a delta to "b". + p1=$(git pack-objects .git/objects/pack/pack <<-EOF + $a + $b + EOF + ) && + + # In the second pack, "a" is missing, and "b" is not a delta nor base to + # any other object. + p2=$(git pack-objects .git/objects/pack/pack <<-EOF + $b + $(git rev-parse HEAD) + $(git rev-parse HEAD^{tree}) + EOF + ) && + + git prune-packed && + # Use the second pack as the preferred source, so that "b" occurs + # earlier in the MIDX object order, rendering "a" unusable for pack + # reuse. + git multi-pack-index write --bitmap --preferred-pack=pack-$p2.idx && + + have_delta $a $b && + test $(midx_pack_source $a) != $(midx_pack_source $b) +' + +rev_list_tests 'full bitmap with backwards delta' + +test_expect_success 'clone with bitmaps enabled' ' + git clone --no-local --bare . 
clone-reverse-delta.git && + test_when_finished "rm -fr clone-reverse-delta.git" && + + git rev-parse HEAD >expect && + git --git-dir=clone-reverse-delta.git rev-parse HEAD >actual && + test_cmp expect actual +' + +bitmap_reuse_tests() { + from=$1 + to=$2 + + test_expect_success "setup pack reuse tests ($from -> $to)" ' + rm -fr repo && + git init repo && + ( + cd repo && + test_commit_bulk 16 && + git tag old-tip && + + git config core.multiPackIndex true && + if test "MIDX" = "$from" + then + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Ad && + git multi-pack-index write --bitmap + else + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb + fi + ) + ' + + test_expect_success "build bitmap from existing ($from -> $to)" ' + ( + cd repo && + test_commit_bulk --id=further 16 && + git tag new-tip && + + if test "MIDX" = "$to" + then + GIT_TEST_MULTI_PACK_INDEX=0 git repack -d && + git multi-pack-index write --bitmap + else + GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb + fi + ) + ' + + test_expect_success "verify resulting bitmaps ($from -> $to)" ' + ( + cd repo && + git for-each-ref && + git rev-list --test-bitmap refs/tags/old-tip && + git rev-list --test-bitmap refs/tags/new-tip + ) + ' +} + +bitmap_reuse_tests 'pack' 'MIDX' +bitmap_reuse_tests 'MIDX' 'pack' +bitmap_reuse_tests 'MIDX' 'MIDX' + +test_expect_success 'missing object closure fails gracefully' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit loose && + test_commit packed && + + # Do not pass "--revs"; we want a pack without the "loose" + # commit. 
+ git pack-objects $objdir/pack/pack <<-EOF && + $(git rev-parse packed) + EOF + + test_must_fail git multi-pack-index write --bitmap 2>err && + grep "doesn.t have full closure" err && + test_path_is_missing $midx + ) +' + +test_expect_success 'setup partial bitmaps' ' + test_commit packed && + git repack && + test_commit loose && + git multi-pack-index write --bitmap 2>err && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap +' + +basic_bitmap_tests HEAD~ + +test_expect_success 'removing a MIDX clears stale bitmaps' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + test_commit base && + git repack && + git multi-pack-index write --bitmap && + + # Write a MIDX and bitmap; remove the MIDX but leave the bitmap. + stale_bitmap=$midx-$(midx_checksum $objdir).bitmap && + rm $midx && + + # Then write a new MIDX. + test_commit new && + git repack && + git multi-pack-index write --bitmap && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_missing $stale_bitmap + ) +' + +test_expect_success 'pack.preferBitmapTips' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit_bulk --message="%s" 103 && + + git log --format="%H" >commits.raw && + sort commits && + + git log --format="create refs/tags/%s %H" HEAD >refs && + git update-ref --stdin bitmaps && + comm -13 bitmaps commits >before && + test_line_count = 1 before && + + perl -ne "printf(\"create refs/tags/include/%d \", $.); print" \ + bitmaps && + comm -13 bitmaps commits >after && + + ! 
test_cmp before after + ) +' + +test_done From patchwork Mon Jun 21 22:25:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF366C4743C for ; Mon, 21 Jun 2021 22:25:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 962F460FE9 for ; Mon, 21 Jun 2021 22:25:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232268AbhFUW2D (ORCPT ); Mon, 21 Jun 2021 18:28:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53772 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231744AbhFUW2C (ORCPT ); Mon, 21 Jun 2021 18:28:02 -0400 Received: from mail-il1-x132.google.com (mail-il1-x132.google.com [IPv6:2607:f8b0:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 578A6C061574 for ; Mon, 21 Jun 2021 15:25:47 -0700 (PDT) Received: by mail-il1-x132.google.com with SMTP id s19so10010506ilj.1 for ; Mon, 21 Jun 2021 15:25:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=E7g7KePDkycaatnR6SQPYLn26QnL2jc3tVnJa5mP2eQ=; b=IkzVP9pLxczsa3HG6jKZmhijdT98keokJaEiwBInLtIbsMisWHXutYgeXibdNCFv5f d9ljwcyH4A163gOjkg1VqKHKFS41PEPJHSz8kY+Ljn4/cbvOry9AexDnryratsSPgg4Y 
jwJ7hMdS7e5z2wQ4w3IW5pebYGPyonuiAVBEAgTDzXuas5ddqBAFkKjkFfsweDeac4Uv WMq9NTFZwE6mQu2ZAz/xCXtKs17JcxWfaDgv+4hk6Y6Q0NalJiI2DnsyHseFFY9GadrQ z2KM8KoCrzbCmZ8FJokxVZcLD7tJMi4Fh1uDdsNfr4dHucZg/ctRirsC7PNmC6S9sx6f aLIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=E7g7KePDkycaatnR6SQPYLn26QnL2jc3tVnJa5mP2eQ=; b=INnwq7wR+NVURh4BaUDEOsJbgeUm3NLKq8byy2M2UW4TMaFj3ChZullpFb5lLmQv20 oTDrafhSkrUDnA5qYjdAjZkKVHfOhH+ELSbThxtKE3hI0kOnobvbWlSdCgawy/tjWFR3 9XsO2Ds1yMfnvoA3jEGgD4DMK3wBfH+IiFpGWeP2yaoz7U0DhSH6J+LW2m9hE6Kf1TyJ QxdG1+ViXBdrQ6y7FzgWumDgnY+yIHzkVwImWc9ar53lh+qlF6Cz/WAauucAHDHG1uoD ryjY76dwWgNMV51MClqSpfzSLwY5kJuS9mwgF8xY6nAOUuNE4YJriElboBzFEJRVN5hx 9KcA== X-Gm-Message-State: AOAM533zUVWM/4OKPImh9hkWaqkqaESittGxLVc3caKhBTf/68sgVmLJ iVRWDzw/SbRJpAFH+RAAl9UUWKou/NK3xdxi X-Google-Smtp-Source: ABdhPJzTHLJZkyM3jjklc48dOXMYW2qin2BihBqlfwGcU/mK2yPev7b0SfaxVH5fw7Sh2EIl8g7o3Q== X-Received: by 2002:a92:7b07:: with SMTP id w7mr345380ilc.308.1624314346649; Mon, 21 Jun 2021 15:25:46 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id g3sm11297966iob.13.2021.06.21.15.25.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:46 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:45 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 18/24] t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP Message-ID: <2a5df1832a340bfedba80bbb1b223b82d14ce3f9.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Generating a MIDX bitmap causes tests which repack in a partial clone to fail because they are missing objects. 
Missing objects is an expected component of tests in t0410, so disable this knob altogether. Graceful degradation when writing a bitmap with missing objects is tested in t5326. Signed-off-by: Taylor Blau --- t/t0410-partial-clone.sh | 3 +++ 1 file changed, 3 insertions(+) diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index 1667450917..4fd8e83da1 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -4,6 +4,9 @@ test_description='partial clone' . ./test-lib.sh +# missing promisor objects cause repacks which write bitmaps to fail +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + delete_object () { rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|') } From patchwork Mon Jun 21 22:25:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335827 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EB0AC4743C for ; Mon, 21 Jun 2021 22:25:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 83B9F6124B for ; Mon, 21 Jun 2021 22:25:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232272AbhFUW2I (ORCPT ); Mon, 21 Jun 2021 18:28:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232258AbhFUW2E (ORCPT ); Mon, 21 Jun 2021 18:28:04 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com 
[IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17706C061574 for ; Mon, 21 Jun 2021 15:25:50 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id h2so5931373iob.11 for ; Mon, 21 Jun 2021 15:25:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=C5Yl6mIwi9zLmDdBydnpT2kQELkZPYxenRWJSBxoMY8=; b=RrdGSL6UJPy1QayCFJwMh2eEYSAAMgR99Wi+zpZxR/wlEU2dizuTY+/1+2uoLlQLXn JWOxCwK7FzmdlAPH0myd7QwnN0abM6fFukva9nrFZZ46NedH1QzRL8RP31m/WRZuvJje oexk3jRLNHy1mGtg1fxOQuuwIv+G7rjWQSxMHBNyJ0DE8MUlGHQXSbqpq1OJ9DFeo++X RCNeQBsmJ0g7Rt4PCvXaCbe2iEWXTUD7TNbvTMKOL3/wg1yB8b/mHTyuhpG0aFcOWxHj XJjdQo1hDz3/TPQsjul2387fxlUohyY/6wYWssyp5pZn1Qh3F4He4RSXOj2jIE9Tfhjz 7OHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=C5Yl6mIwi9zLmDdBydnpT2kQELkZPYxenRWJSBxoMY8=; b=Uh60+PjcZEeQLmdoY+B5lcoOdUypiP4JEVBnEUsoaesvMZmifxwgLxac74dA3G6uw1 A9QHR5oP12jHucmzWHK9C0Z7oT01goy9lkbg1bbdeQ6N+WssWt75BOVmEaezljfv/QK4 1nYWQuetkcPXEBIYY1KCXyNWOZmmkVzPQvP85KksW+EApaE9pWKMETU4xMqRnxaO51Hu NnrLjyqELeksUi0WtHTO0qGGrMdNf6JeP10V1csg9/bAQpe1NIsnT6tQmiTjMeimFjEa 1AOZn34+PXYvaqLfF1Xd06PZNQRIF06eC+X/3utNIL7jtM0FErgGYWa2eK0c9Qw44BHv 82Ig== X-Gm-Message-State: AOAM532k0spsRkeo1T/sXZbBsJJ+Nsn5GpuAo3tfb8f2PGFCkfa0jRK+ l03CvYvlfg0kFEs8skF0WvZOlJH4RKNAzWdd X-Google-Smtp-Source: ABdhPJwceOsun8EpY6cD5VjABUveic4B/g/dJwLzxLTvnJWq07yz1BHrs2kuMLINGkDJCC5K8CMG6A== X-Received: by 2002:a02:8241:: with SMTP id q1mr660572jag.134.1624314349346; Mon, 21 Jun 2021 15:25:49 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id b23sm10682044ior.4.2021.06.21.15.25.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 
Jun 2021 15:25:49 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:48 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 19/24] t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP Message-ID: <2d24c5b7ad0835f433428c16dfd137449688d93b.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Generating a MIDX bitmap confuses many of the tests in t5310, which expect to control whether and how bitmaps are written. Since the relevant MIDX-bitmap tests here are covered already in t5326, let's just disable the flag for the whole t5310 script. Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- t/t5310-pack-bitmaps.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 4318f84d53..673baa5c3c 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -8,6 +8,10 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . "$TEST_DIRECTORY"/lib-bundle.sh . "$TEST_DIRECTORY"/lib-bitmap.sh +# t5310 deals only with single-pack bitmaps, so don't write MIDX bitmaps in +# their place. 
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + objpath () { echo ".git/objects/$(echo "$1" | sed -e 's|\(..\)|\1/|')" } From patchwork Mon Jun 21 22:25:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335829 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3735C4743C for ; Mon, 21 Jun 2021 22:25:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AE49D6124B for ; Mon, 21 Jun 2021 22:25:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232290AbhFUW2K (ORCPT ); Mon, 21 Jun 2021 18:28:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232318AbhFUW2H (ORCPT ); Mon, 21 Jun 2021 18:28:07 -0400 Received: from mail-il1-x133.google.com (mail-il1-x133.google.com [IPv6:2607:f8b0:4864:20::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6109C061768 for ; Mon, 21 Jun 2021 15:25:52 -0700 (PDT) Received: by mail-il1-x133.google.com with SMTP id q18so16721717ile.10 for ; Mon, 21 Jun 2021 15:25:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=urkpKRCVUcnkeA5xm5AkC500A1RReChHW2zksdFNMEU=; b=tLtiapBP4Ey4ScnpcHyWu5rod+ArZR8voHCLYNsaVukPYDVzBcs/LbXZeIdBefrjKU 
uArbiSilApwqwUOt4ctZjWiBQv+9yKJ7JxlogjgSkmExBCsPRfGCz1dj/o0nz+3EuJxJ YF5rTiXJUDCUJoa6Gk9PiOD4k+H0xHECV6jeosLxylWS7kBZGJTI2htWWID5JbP+bz+p PW1ChEyQkI7ARVZfLiSEtlNMhUIZwT1DB3skL0+oUNdLhDOV3FkSIjnFowR5/GN9kbQB Q0mUpnemly3CyVnrTgvmHdL8fBK+hgVq8NI/6/OkHY7LWeyVCfGqHYn7MyxQo5xegZNt cepg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=urkpKRCVUcnkeA5xm5AkC500A1RReChHW2zksdFNMEU=; b=UWORKOvFX22OhLtKCDo2Oyn8rsVZo6psd44rij2o+Ed6HjUw1v8X9Ipv3BhRzXmIkw 5Trqm6rL5iw5IKsYA785A7n8Mwt+ZiJJHIDf3XPqPCMMRBN9xZtkSqjqI5lEYnSMfxg6 ORge1LyJoBedqaIusgi9oahdrOtgsbafCimQxjLX4p7I/9EW9zj3YzY+1RxrYqMHt0Bf mDHzBdPa6S9xgKsYE8sPZKRhQ2/JH+FE8BF7WKg9mCJrxDPMBxdbc+fUsSk3BusaIjbW clEcJaPxwsyoLCs4A4v0BBBIQVqu02k4ooaBcX38LVhe8tFlMfetpKVgkRmL9YxO5JLf JpDg== X-Gm-Message-State: AOAM5339VvqSNtFRKumv3wBoxQT+ic2UiFqG3j2N/5Z1mb/Jqqxw4zDE bwotRFb4ZRjm3O/PBM9N8KYebzgayRGlHK3o X-Google-Smtp-Source: ABdhPJwKTjAuKnlLgWucTczKibgdUwO6r+fleqwScxpHp5FB32BDBp6jXIu1PqHBf0NTkXOAHuFNew== X-Received: by 2002:a05:6e02:10c3:: with SMTP id s3mr342730ilj.37.1624314351989; Mon, 21 Jun 2021 15:25:51 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id r4sm1058708ilq.27.2021.06.21.15.25.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:51 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:50 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 20/24] t5319: don't write MIDX bitmaps in t5319 Message-ID: <4cbfaa0e978fd68bd5acd6ce5f96b34e9cd43656.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This test is specifically about generating a midx still respecting a pack-based bitmap 
file. Generating a MIDX bitmap would confuse the test. Let's override the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' variable to make sure we don't do so. Signed-off-by: Taylor Blau --- t/t5319-multi-pack-index.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 5641d158df..69f1c815aa 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -474,7 +474,8 @@ test_expect_success 'repack preserves multi-pack-index when creating packs' ' compare_results_with_midx "after repack" test_expect_success 'multi-pack-index and pack-bitmap' ' - git -c repack.writeBitmaps=true repack -ad && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writeBitmaps=true repack -ad && git multi-pack-index write && git rev-list --test-bitmap HEAD ' From patchwork Mon Jun 21 22:25:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67F47C4743C for ; Mon, 21 Jun 2021 22:25:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 528646124B for ; Mon, 21 Jun 2021 22:25:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232296AbhFUW2L (ORCPT ); Mon, 21 Jun 2021 18:28:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S232303AbhFUW2K (ORCPT ); Mon, 21 Jun 2021 18:28:10 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6188FC061760 for ; Mon, 21 Jun 2021 15:25:55 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id b14so17664751iow.13 for ; Mon, 21 Jun 2021 15:25:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=P9XhA576oXjIKXlNHfQ5aXRwRaVis24OihjlRxxBAZs=; b=cKxZU7CbG0f7L0G7KM+nx9II/+dZ7S/RI6FYaGvCdeDUxDT6gEH3R3m+G44RTR21mV xm0Kjeu4R/sGKGE0ACTiDBSFYOxJhzVh+GdQxrVRHSJKvYwuqTdI2S9fbQ6clitY4pYJ 5O6jibnCyWg+WoWhlyWQfolockCyE+IKT0mASOzOrePEV3yH+Kd/8wyZXDlItV+/Zu2n sB9mX6P2DNpaR8UPVOPZ3LSOmrFMUjecI/bw49mlqd/8TpwsfoL2b5aNXNxKCIXWLpz4 Gjn4tggPnTYYjG5mvsaH04agjwrUmguKXQmuVMKSE8NTfIq+jJalG6Z9Qr3wW3+D0eGr aK3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=P9XhA576oXjIKXlNHfQ5aXRwRaVis24OihjlRxxBAZs=; b=EaI/CmCLyw9dd0DH9xnrLKqRC3+D3gEiUN9a5zXYbefuk5+icxV5efH/WGl0Vqr4xC HEEJy3UTgxJVBCDDD0LszXppssZ6G1Yw2+cMlHniMV3ZKqRjdAw3T60vEmtHcZ1e93rM UGwal1pkXGwgy/1uc+tXQLf2XjCb+EbmzBBSil2ASJW3/fyIG7kY1l4ckMZna5V22T9m yBXX6FSoW5QajBEbmxhDdmq3SNgb8IFxR4PqqLVWL6oOng8AL3tGYirUNOwAUYFHOyoZ RQprL0mAJjkuKE6EH7oCp0TWgJ2lHP7fAH+yvqTsEtKHAOOhUOocRCUsVYS0uCGxX4o3 4PJw== X-Gm-Message-State: AOAM533v1U11ksdsov+6BKcLgsv5PCVMdjdBu3bTqj4VTuIRECgcX/+z r+P9/1opNclqpEZdKUT39ym2qiiBceLpYWL6 X-Google-Smtp-Source: ABdhPJyhs6Ag7MPYwQARXfE4J7u3T+raYnL2/jhGdHHBwyNcxb2ju00+YW1aytqL44FpADpjQaemgw== X-Received: by 2002:a6b:7009:: with SMTP id l9mr251987ioc.82.1624314354588; Mon, 21 Jun 2021 15:25:54 -0700 (PDT) Received: from localhost 
([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id j18sm10636340ioo.3.2021.06.21.15.25.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:54 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:53 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 21/24] t7700: update to work with MIDX bitmap test knob Message-ID: <839a7a79eb1fedd25948213a5d64c97e34d38a31.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A number of these tests are focused only on pack-based bitmaps and need to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where necessary. Signed-off-by: Taylor Blau --- t/t7700-repack.sh | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 25b235c063..98eda3bfeb 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -63,13 +63,14 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' test_expect_success 'writing bitmaps via command-line can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git repack -Adbl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 git repack -Adbl && test_has_duplicate_object true ' test_expect_success 'writing bitmaps via config can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git -c repack.writebitmaps=true repack -Adl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writebitmaps=true repack -Adl && test_has_duplicate_object true ' @@ -189,7 +190,9 @@ test_expect_success 'repack --keep-pack' ' test_expect_success 'bitmaps are created by default in bare repos' ' git clone --bare .git bare.git && - git -C bare.git repack -ad && + rm -f bare.git/objects/pack/*.bitmap && + 
GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap) && test_path_is_file "$bitmap" ' @@ -200,7 +203,8 @@ test_expect_success 'incremental repack does not complain' ' ' test_expect_success 'bitmaps can be disabled on bare repos' ' - git -c repack.writeBitmaps=false -C bare.git repack -ad && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writeBitmaps=false -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap || :) && test -z "$bitmap" ' @@ -211,7 +215,8 @@ test_expect_success 'no bitmaps created if .keep files present' ' keep=${pack%.pack}.keep && test_when_finished "rm -f \"\$keep\"" && >"$keep" && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && test_must_be_empty stderr && find bare.git/objects/pack/ -type f -name "*.bitmap" >actual && test_must_be_empty actual @@ -222,7 +227,8 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' blob=$(test-tool genrandom big $((1024*1024)) | git -C bare.git hash-object -w --stdin) && git -C bare.git update-ref refs/tags/big $blob && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && test_must_be_empty stderr && find bare.git/objects/pack -type f -name "*.bitmap" >actual && test_must_be_empty actual From patchwork Mon Jun 21 22:25:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from 
mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10851C48BC2 for ; Mon, 21 Jun 2021 22:26:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E82396124B for ; Mon, 21 Jun 2021 22:26:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232320AbhFUW2O (ORCPT ); Mon, 21 Jun 2021 18:28:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232307AbhFUW2N (ORCPT ); Mon, 21 Jun 2021 18:28:13 -0400 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 09E8EC061756 for ; Mon, 21 Jun 2021 15:25:58 -0700 (PDT) Received: by mail-io1-xd2b.google.com with SMTP id k11so4449375ioa.5 for ; Mon, 21 Jun 2021 15:25:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=fzLH9EWtzty/BdnnpE5D2On3mYvZUnuNWIFUpH7BI3g=; b=bjWhGIqZc30BkPXh5tPnn8dkn7GIM02n/upLXc1EadHTaO6tTq/RIOPfc5lTXW6P1c 9Poeey6NpQxeVcGyFp1MVy1S1KuMIla8VPnFsH/qURiB/wixYWYpJDt5s06KAzGWEmJC 7e8oJHaWlV8CNCFNCXdP7dj39o10b+62i77Zf2LKtqsMgcCYc8/8v9TN0Mh9syspVnQS YG6oyOuqoxjK1S1liNT7uQxqQAOXRvTY+Abp4s6O0Ssqj90iYA2fjMY0onxBJ8eYatIo 6hYOvvU4K3AVZn0z1ZTURDAP0b2nXsXdiilSxmp/Reu7WcXp9WociMAdbPNb7d7hmwa/ +YWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=fzLH9EWtzty/BdnnpE5D2On3mYvZUnuNWIFUpH7BI3g=; b=R3548qxDz/9yEx8yr7IJy3ILH6TKkStQ1i4Eum3oyjkNZEwN2RyGZXsplWmaYoVWHP ycOHLzFvnOtCniCZQ4lIETWolwQRrKgz6B/oZpkqBX8U6wg0bX8JDfPmc3k1fxX5PXzR 
9rN23wlLQpEO0gvmas2Mzfp0p8FpCL+q7PUSsmu3+vx7RQtRcl34SuK/3ULuO5vAPode djSr8wg2yYbL1QJS+IbZfF9IZMmaV+yfx/aDpIDPPyigA9keH4hm0gJSbtqNkD2Aci6m 6RTCQmGkDNJ04BKPBuBVLN+E/CaVTSb8RjAwqSy1TpFh4tyZJlmkKsSPAEKJeZFRDPgT ZQZg== X-Gm-Message-State: AOAM5302X6oep2oUngckUmqeiAGgjMPK2SoybAieV30VPaDs+jPU7pe9 gmtFC0swwOSOj+CbH3pGoN6zo3JF+3P5GpyP X-Google-Smtp-Source: ABdhPJyXJSupetHQ2xC6cyFe6vHoGaw4b7r6cqNXXw3JJLP3VDnEEy49WT7IGuSvt3ZLwsiHuCH1gg== X-Received: by 2002:a5d:930b:: with SMTP id l11mr224473ion.177.1624314357266; Mon, 21 Jun 2021 15:25:57 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id v18sm10541941iom.5.2021.06.21.15.25.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:56 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:56 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 22/24] midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' Message-ID: <00418d5b096c049ddfc738b5d51ef65eae019607.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable to also write a multi-pack bitmap when 'GIT_TEST_MULTI_PACK_INDEX' is set. 
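As a rough illustration of how the two knobs combine after a repack, here is a minimal shell sketch of the decision made in cmd_repack(). It is not part of the patch: env_bool and midx_action are hypothetical names, and env_bool accepts only a few spellings of "true", which is narrower than what git_env_bool() parses.

```shell
#!/bin/sh
# Sketch only: models the flag selection this patch adds to cmd_repack().
# env_bool is a hypothetical stand-in for git_env_bool(); it treats only
# "1", "true", "yes" and "on" as true.
env_bool () {
	case "$1" in
	1|true|yes|on) return 0 ;;
	*) return 1 ;;
	esac
}

# Print what the post-repack MIDX step would do for the current environment.
midx_action () {
	if ! env_bool "${GIT_TEST_MULTI_PACK_INDEX:-}"
	then
		# Without the base knob, repack does not force a MIDX at all.
		echo "no MIDX written"
	elif env_bool "${GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP:-}"
	then
		# Both knobs set: the MIDX is written with a bitmap and rev-index.
		echo "MIDX with MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX"
	else
		# Base knob only: plain MIDX, flags stay 0.
		echo "MIDX without bitmap"
	fi
}
```

The second knob has no effect on its own in this sketch, which matches the patch: the bitmap flags are only consulted inside the 'GIT_TEST_MULTI_PACK_INDEX' branch.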
Signed-off-by: Taylor Blau --- builtin/repack.c | 13 ++++++++++--- ci/run-build-and-tests.sh | 1 + midx.h | 2 ++ t/README | 4 ++++ 4 files changed, 17 insertions(+), 3 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..77f6f03057 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -515,7 +515,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()) write_bitmaps = 0; - } + } else if (write_bitmaps && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) + write_bitmaps = 0; if (pack_kept_objects < 0) pack_kept_objects = write_bitmaps > 0; @@ -725,8 +728,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) update_server_info(0); remove_temporary_files(); - if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), NULL, 0); + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) { + unsigned flags = 0; + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) + flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX; + write_midx_file(get_object_directory(), NULL, flags); + } string_list_clear(&names, 0); string_list_clear(&rollback, 0); diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 3ce81ffee9..7ee9ba9325 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -23,6 +23,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 + export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master export GIT_TEST_WRITE_REV_INDEX=1 diff --git a/midx.h b/midx.h index 350f4d0a7b..aa3da557bb 100644 --- a/midx.h +++ b/midx.h @@ -8,6 +8,8 @@ struct pack_entry; struct repository; #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" +#define GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP \ + 
"GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP" struct multi_pack_index { struct multi_pack_index *next; diff --git a/t/README b/t/README index 1a2072b2c8..1311b8e17a 100644 --- a/t/README +++ b/t/README @@ -425,6 +425,10 @@ GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack- index to be written after every 'git repack' command, and overrides the 'core.multiPackIndex' setting to true. +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=, when true, sets the +'--bitmap' option on all invocations of 'git multi-pack-index write', +and ignores pack-objects' '--write-bitmap-index'. + GIT_TEST_SIDEBAND_ALL=, when true, overrides the 'uploadpack.allowSidebandAll' setting to true, and when false, forces fetch-pack to not request sideband-all (even if the server advertises From patchwork Mon Jun 21 22:25:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4937CC4743C for ; Mon, 21 Jun 2021 22:26:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 33DE560FE9 for ; Mon, 21 Jun 2021 22:26:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232308AbhFUW2Q (ORCPT ); Mon, 21 Jun 2021 18:28:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232307AbhFUW2P (ORCPT ); Mon, 21 Jun 2021 18:28:15 
-0400 Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4EBEC061760 for ; Mon, 21 Jun 2021 15:26:00 -0700 (PDT) Received: by mail-il1-x12c.google.com with SMTP id k5so3126754ilv.8 for ; Mon, 21 Jun 2021 15:26:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=sAWq7ApHqrUs88UHszHQDp+raK9BhywjCqAJGMbjwdk=; b=LV8CVzdMsxFcEAyugEJ4iLA0Tc46JmKFj7d5uMFsuDBOB1Y5bCpabKvjRRoln8JT6M aEAz8DY0srglUflJ5szVWw/obDXoj5wRUPfd4ENuRFg4Ys6NA/e2rLAhGtZVNrDS0WeQ uuLpEratBabdg2SXrgtgtCueTuRXkJjvMLIXITeK7hGbxkzaT4riqf4yci5zL16UH7k1 sf4a4XYLWwwiEFsCA6O3rRVNorJ3iQ4fD0niDMQ7Gbkvw1nUF0DLEipP44bcZvqr653j kzmv6TqHylZ0iHDBEuTHuxiTfj8QJen1tk4TC3fVrvLiU7mKIclGm6kyKHSrze0TCkFb W6zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=sAWq7ApHqrUs88UHszHQDp+raK9BhywjCqAJGMbjwdk=; b=KE91R4LZWXwVHIbgkun5mbVbkeSd2WF9oeO+vdiLOb2jCsonRAGaNC8groS+FtQK82 zxlVH20Gq4kuvCEQg0EbT3Njzoor9H6BXu0R/q+fmfN9kQyz++mCCUXz8/v71ZLWtFM1 X183ePvWS8O/nsgiUKly5BnX4to5diqt4jLUpopmTBtk/f0BXfLVnkaaW6fO6m3r9rz3 RFtyQKsprdu9s3IuGocrF5HkBcJMFyJ7YZoGqKyrc1pIDPm0MdPWrcRlkJgdlovNV/uq k0T6xl97+tNSsx1KymQgporTxTXX2eMT2RGYAjkdtzO+OgT1lDrho/PcDcHJTDeDQH6+ 84KQ== X-Gm-Message-State: AOAM531hlbqBBYG7Bsy0iBzBqvkhxr+oebjKbrCB53fmhq+DGE0clQ9T 22vXgu9V80RVqpvQkA8PIO6ufOvngr6AGBGS X-Google-Smtp-Source: ABdhPJxOOqdf75FUYkEEve7MMxxBu9yHAFb1NqWPCGQ8oJGgqrTfphJxK15RituwMVfw1DKchVd32w== X-Received: by 2002:a05:6e02:1947:: with SMTP id x7mr356143ilu.300.1624314359945; Mon, 21 Jun 2021 15:25:59 -0700 (PDT) Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3]) by smtp.gmail.com with ESMTPSA id 
14sm6960640ilx.61.2021.06.21.15.25.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Jun 2021 15:25:59 -0700 (PDT) Date: Mon, 21 Jun 2021 18:25:58 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v2 23/24] p5310: extract full and partial bitmap tests Message-ID: <98fa73a76a981ce328dc9ad1dc816d6a62bc0659.1624314293.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new p5326 introduced by the next patch will want these same tests, interjecting its own setup in between. Move them out so that both perf tests can reuse them. Signed-off-by: Taylor Blau --- t/perf/lib-bitmap.sh | 69 ++++++++++++++++++++++++++++++++++++ t/perf/p5310-pack-bitmaps.sh | 65 ++------------------------------- 2 files changed, 72 insertions(+), 62 deletions(-) create mode 100644 t/perf/lib-bitmap.sh diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh new file mode 100644 index 0000000000..63d3bc7cec --- /dev/null +++ b/t/perf/lib-bitmap.sh @@ -0,0 +1,69 @@ +# Helper functions for testing bitmap performance; see p5310. 
+ +test_full_bitmap () { + test_perf 'simulated clone' ' + git pack-objects --stdout --all </dev/null >/dev/null + ' + + test_perf 'simulated fetch' ' + have=$(git rev-list HEAD~100 -1) && + { + echo HEAD && + echo ^$have + } | git pack-objects --revs --stdout >/dev/null + ' + + test_perf 'pack to file (bitmap)' ' + git pack-objects --use-bitmap-index --all pack1b </dev/null >/dev/null + ' + + test_perf 'rev-list (commits)' ' + git rev-list --all --use-bitmap-index >/dev/null + ' + + test_perf 'rev-list (objects)' ' + git rev-list --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with tag negated via --not --all (objects)' ' + git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with negative tag (objects)' ' + git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list count with blob:none' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:none >/dev/null + ' + + test_perf 'rev-list count with blob:limit=1k' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:limit=1k >/dev/null + ' + + test_perf 'rev-list count with tree:0' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' + + test_perf 'simulated partial clone' ' + git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null + ' +} + +test_partial_bitmap () { + test_perf 'clone (partial bitmap)' ' + git pack-objects --stdout --all </dev/null >/dev/null + ' + + test_perf 'pack to file (partial bitmap)' ' + git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null + ' + + test_perf 'rev-list with tree filter (partial bitmap)' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' +} diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 452be01056..7ad4f237bc 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -2,6 +2,7 @@ test_description='Tests
pack performance using bitmaps' . ./perf-lib.sh +. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" test_perf_large_repo @@ -25,56 +26,7 @@ test_perf 'repack to disk' ' git repack -ad ' -test_perf 'simulated clone' ' - git pack-objects --stdout --all </dev/null >/dev/null -' - -test_perf 'simulated fetch' ' - have=$(git rev-list HEAD~100 -1) && - { - echo HEAD && - echo ^$have - } | git pack-objects --revs --stdout >/dev/null -' - -test_perf 'pack to file (bitmap)' ' - git pack-objects --use-bitmap-index --all pack1b </dev/null >/dev/null -' - -test_perf 'rev-list (commits)' ' - git rev-list --all --use-bitmap-index >/dev/null -' - -test_perf 'rev-list (objects)' ' - git rev-list --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with tag negated via --not --all (objects)' ' - git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with negative tag (objects)' ' - git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list count with blob:none' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:none >/dev/null -' - -test_perf 'rev-list count with blob:limit=1k' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:limit=1k >/dev/null -' - -test_perf 'rev-list count with tree:0' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' - -test_perf 'simulated partial clone' ' - git pack-objects --stdout --all --filter=blob:none </dev/null >/dev/null -' +test_full_bitmap test_expect_success 'create partial bitmap state' ' # pick a commit to represent the repo tip in the past @@ -97,17 +49,6 @@ test_expect_success 'create partial bitmap state' ' git update-ref HEAD $orig_tip ' -test_perf 'clone (partial bitmap)' ' - git pack-objects --stdout --all </dev/null >/dev/null -' - -test_perf 'pack to file (partial bitmap)' ' - git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null -' - -test_perf 'rev-list with tree filter (partial bitmap)'
' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' +test_partial_bitmap test_done From patchwork Mon Jun 21 22:26:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12335837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA717C4743C for ; Mon, 21 Jun 2021 22:26:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A9AAB611CE for ; Mon, 21 Jun 2021 22:26:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232343AbhFUW2Z (ORCPT ); Mon, 21 Jun 2021 18:28:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232322AbhFUW2S (ORCPT ); Mon, 21 Jun 2021 18:28:18 -0400 Received: from mail-il1-x132.google.com (mail-il1-x132.google.com [IPv6:2607:f8b0:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D494C06175F for ; Mon, 21 Jun 2021 15:26:03 -0700 (PDT) Received: by mail-il1-x132.google.com with SMTP id k5so3126828ilv.8 for ; Mon, 21 Jun 2021 15:26:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=4z4LT0lv5k9umGSR/ooyeHTuJXpCfSVmHmMOmakBiFE=; b=N+ZIxn8LZJ8RSoIBdwdGp/5m91mRIeLyK8ZO4thUBPqsxSz28Bonj7JzV7VAWB4q1p 
        fGrwTNlFwX5C34b49Hvxqm+1OI7RrEEjVOtr3mRw93jlFEiiOoycgsi9YdbD7H6C9Oa5
        QCYZJu0e+XNqY71yegJ6eL4+3DzcT92BeSLgpyelSznZJtfjNtnTwcdV61Ed+ef15hEh
        4s+TCxz/SKfEPpkTACkFx3YEa/dMXtKbMLeiZo4Sm3+kQrXXONm53HsgmxhGPpToPMn4
        F5U92sVrqhFPpF5nilsQ/nOHe3siwbzBO0Va5YhGSETw46dtukhx6CP/KO99EcqoDzoQ
        g8uQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:from:to:cc:subject:message-id:references
        :mime-version:content-disposition:in-reply-to;
        bh=4z4LT0lv5k9umGSR/ooyeHTuJXpCfSVmHmMOmakBiFE=;
        b=e6p7McdT3kEuWy2lif68qXe+wvOdxXvhdnZqwkPTya8H90Qyfx5P+5k5fIkEPoGmOt
        aHRez5sgkyn911CLrRRZ0lsqYrJ4443ep1FE815TPp1y7lwfVycnFG+UhTvb0p6ppycr
        lqMiK4E/8dl6r3Zt4DIxADrwLY3AImNjYNWLVYppg+FgErNyDMzH4BTGnnPeqFGuWjCE
        AnQBuDdL3mLiBI7k4E8tw+pTC06LPVOVM8Nj0uNj3qc+tp/Zin5FDwseC3EePu9CYWeU
        o7Vj8X3jQoXWwCEtHxij6lwRSwh/LdhARQI4tbmCvVxgDbTdW9iIgYjnWSfG3GGKKaIx
        J/Sw==
X-Gm-Message-State: AOAM532fS/3UjPHFS4ALfgg7FCDTS1yFAL1S5J4VDcsbplSoYggbvdPy
        5dtjE1sFglzQOEqBcZ13ph10KjZsgRIQt2wH
X-Google-Smtp-Source: 
 ABdhPJzk/phn8Rdb+zqX0oyZsuV25S1BqMaf6fL9iYcm9YSyWt9WJQdxh7G5fZuhW0W+DZ7hhwJ0nQ==
X-Received: by 2002:a92:8e45:: with SMTP id k5mr377355ilh.116.1624314362619;
        Mon, 21 Jun 2021 15:26:02 -0700 (PDT)
Received: from localhost ([2600:1700:d843:8f:f6bb:34fc:22a7:6a3])
        by smtp.gmail.com with ESMTPSA id d7sm6725229ilg.60.2021.06.21.15.26.02
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 21 Jun 2021 15:26:02 -0700 (PDT)
Date: Mon, 21 Jun 2021 18:26:01 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com,
        jonathantanmy@google.com
Subject: [PATCH v2 24/24] p5326: perf tests for MIDX bitmaps
Message-ID:
References:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

These new performance tests demonstrate effectively the same behavior
as p5310, but use a multi-pack bitmap instead of a single-pack one.
Notably, p5326 does not create a MIDX bitmap with multiple packs. This
allows a direct comparison between it and p5310: any difference between
the two measures just the overhead of using MIDX bitmaps.

Here are the results of p5310 and p5326 together, measured at the same
time and on the same machine (using a Xeon W-2255 CPU):

    Test                                                  HEAD
    ------------------------------------------------------------------------
    5310.2: repack to disk                                96.78(93.39+11.33)
    5310.3: simulated clone                               9.98(9.79+0.19)
    5310.4: simulated fetch                               1.75(4.26+0.19)
    5310.5: pack to file (bitmap)                         28.20(27.87+8.70)
    5310.6: rev-list (commits)                            0.41(0.36+0.05)
    5310.7: rev-list (objects)                            1.61(1.54+0.07)
    5310.8: rev-list count with blob:none                 0.25(0.21+0.04)
    5310.9: rev-list count with blob:limit=1k             2.65(2.54+0.10)
    5310.10: rev-list count with tree:0                   0.23(0.19+0.04)
    5310.11: simulated partial clone                      4.34(4.21+0.12)
    5310.13: clone (partial bitmap)                       11.05(12.21+0.48)
    5310.14: pack to file (partial bitmap)                31.25(34.22+3.70)
    5310.15: rev-list with tree filter (partial bitmap)   0.26(0.22+0.04)

versus the same tests (this time using a multi-pack index):

    Test                                                  HEAD
    ------------------------------------------------------------------------
    5326.2: setup multi-pack index                        78.99(75.29+11.58)
    5326.3: simulated clone                               11.78(11.56+0.22)
    5326.4: simulated fetch                               1.70(4.49+0.13)
    5326.5: pack to file (bitmap)                         28.02(27.72+8.76)
    5326.6: rev-list (commits)                            0.42(0.36+0.06)
    5326.7: rev-list (objects)                            1.65(1.58+0.06)
    5326.8: rev-list count with blob:none                 0.26(0.21+0.05)
    5326.9: rev-list count with blob:limit=1k             2.97(2.86+0.10)
    5326.10: rev-list count with tree:0                   0.25(0.20+0.04)
    5326.11: simulated partial clone                      5.65(5.49+0.16)
    5326.13: clone (partial bitmap)                       12.22(13.43+0.38)
    5326.14: pack to file (partial bitmap)                30.05(31.57+7.25)
    5326.15: rev-list with tree filter (partial bitmap)   0.24(0.20+0.04)

There is slight overhead in "simulated clone", "simulated partial
clone", and "clone (partial bitmap)".
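To put rough numbers on that overhead, the sketch below computes the relative slowdown of p5326 over p5310 for the three tests named above. It is only an illustration: the `overhead` helper is hypothetical (it is not part of this series or of t/perf), and it assumes the first number in each table cell is the elapsed time in seconds.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: print the relative
# overhead of the MIDX-bitmap run (p5326) over the single-pack
# bitmap run (p5310) for one test.
overhead () {
	# $1 = test name, $2 = p5310 elapsed seconds, $3 = p5326 elapsed seconds
	awk -v name="$1" -v base="$2" -v midx="$3" 'BEGIN {
		printf "%-26s %+.1f%%\n", name, (midx - base) / base * 100
	}'
}

# Elapsed times copied from the two result tables above.
overhead "simulated clone"         9.98 11.78
overhead "simulated partial clone" 4.34  5.65
overhead "clone (partial bitmap)" 11.05 12.22
```

With the figures from the tables this reports +18.0%, +30.2%, and +10.6% respectively. These come from a single run on one machine, so they are indicative rather than precise.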
Unsurprisingly, that overhead is due to using the MIDX's reverse index
to map between bit positions and MIDX positions.

This can be reproduced by running "git repack -adb" along with "git
multi-pack-index write --bitmap" in a large-ish repository. Then run:

    $ perf record -o pack.perf git -c core.multiPackIndex=false \
      pack-objects --all --stdout >/dev/null /dev/null
---
 t/perf/p5326-multi-pack-bitmaps.sh | 43 ++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100755 t/perf/p5326-multi-pack-bitmaps.sh

diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh
new file mode 100755
index 0000000000..5845109ac7
--- /dev/null
+++ b/t/perf/p5326-multi-pack-bitmaps.sh
@@ -0,0 +1,43 @@
+#!/bin/sh
+
+test_description='Tests performance using midx bitmaps'
+. ./perf-lib.sh
+. "${TEST_DIRECTORY}/perf/lib-bitmap.sh"
+
+test_perf_large_repo
+
+test_expect_success 'enable multi-pack index' '
+	git config core.multiPackIndex true
+'
+
+test_perf 'setup multi-pack index' '
+	git repack -ad &&
+	git multi-pack-index write --bitmap
+'
+
+test_full_bitmap
+
+test_expect_success 'create partial bitmap state' '
+	# pick a commit to represent the repo tip in the past
+	cutoff=$(git rev-list HEAD~100 -1) &&
+	orig_tip=$(git rev-parse HEAD) &&
+
+	# now pretend we have just one tip
+	rm -rf .git/logs .git/refs/* .git/packed-refs &&
+	git update-ref HEAD $cutoff &&
+
+	# and then repack, which will leave us with a nice
+	# big bitmap pack of the "old" history, and all of
+	# the new history will be loose, as if it had been pushed
+	# up incrementally and exploded via unpack-objects
+	git repack -Ad &&
+	git multi-pack-index write --bitmap &&
+
+	# and now restore our original tip, as if the pushes
+	# had happened
+	git update-ref HEAD $orig_tip
+'
+
+test_partial_bitmap
+
+test_done