From patchwork Tue Aug 31 20:51:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467793 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA9CDC432BE for ; Tue, 31 Aug 2021 20:51:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A4CC60ED4 for ; Tue, 31 Aug 2021 20:51:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241159AbhHaUwn (ORCPT ); Tue, 31 Aug 2021 16:52:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241217AbhHaUwj (ORCPT ); Tue, 31 Aug 2021 16:52:39 -0400 Received: from mail-io1-xd33.google.com (mail-io1-xd33.google.com [IPv6:2607:f8b0:4864:20::d33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 04DAEC061796 for ; Tue, 31 Aug 2021 13:51:44 -0700 (PDT) Received: by mail-io1-xd33.google.com with SMTP id b10so760904ioq.9 for ; Tue, 31 Aug 2021 13:51:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=+P3lStOCr1PeSqvaQjnYzB0LEBCnLp1CL09lxHtH7a0=; b=B751DgvX/R3rNY9COXWG1C9LPpWw0K+HjZUi3w2E6ueD2tohgLOtM6Qv5SE9mTISsm YqjKD5oVH+0f3mVJWwAEg/rtkm85mEptiSKCffUijND7aMgdEfrlCvCxYbXknBv9vc2v 4UgJ1wHi5dhb5QPvJaD/d9XXh5RzgpRxD6XQeipE+mVQDi8vSeiNkafMoCIKAIlbMDBw xdVtNciB1/qMDhpKMAaL2zz4sZmNgnLlV1gS64Ripsu2BSzIHZVG+P3tJJA1HE7bs3JT x4CYZJ04EwoOz6d5PHz9LGke9cyFM+6MH2cwsmKb8tkrv8toE417b/HHbIwrEg6Ri2n4 MRKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=+P3lStOCr1PeSqvaQjnYzB0LEBCnLp1CL09lxHtH7a0=; b=my60ooouvr7+Z+TARZOfzujVSUHu3/AlDfRVw176yG18+fpU4mXCToUGtxp52p4CEC m5zvfwKzBUyT2ekPGyCgqeHJg9hH25XEs3DeiZIOp6byU4P+4eRIPtf0erCW43kL6llN EqMEQWiUhmSNlKRqbjgAbd5hdLcSuxLGcLVn3kBPIpiS9qNV+OBDdX/ldAu1f0qshLE0 fC1y/dbyF/VXm/+758MQXb4ePaUcH6JmWGnSCNJUbCGqily3uprQoQdBwxFoG0pyANa+ NNe14DrGden2Fo6OT9/ZyQqYWt66RlegKqoub5dWWYy1WQugr4W4bF3HPkA0YlSezVNw 8POg== X-Gm-Message-State: AOAM531SZvImf9UPeI8xdnl3IJ2KE5OF0xav9NGNtXhLLvF+I5x8RO9L 7LvGEuM3elHiH/Wg2JeCT1mMh93Ndbyl2mgR X-Google-Smtp-Source: ABdhPJzS95jrCmJJPL9qZcttQW/MCLoAJkxy8LvVfUJ4CnJJWGDo/NkPaM3UGJS8Gz4X2G1C4d7kGw== X-Received: by 2002:a02:850a:: with SMTP id g10mr4597136jai.134.1630443103316; Tue, 31 Aug 2021 13:51:43 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id l18sm1900817ilj.82.2021.08.31.13.51.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:42 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:42 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 01/27] pack-bitmap.c: harden 'test_bitmap_walk()' to check type bitmaps Message-ID: <7815d9929d7b34f566b9748e03818c37d22a6808.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The special `--test-bitmap` mode of `git rev-list` is used to compare the result of an object traversal with a bitmap to check its integrity. This mode does not, however, assert that the types of reachable objects are stored correctly. Harden this mode by teaching it to also check that each time an object's bit is marked, the corresponding bit should be set in exactly one of the type bitmaps (whose type matches the object's true type). Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- pack-bitmap.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index d999616c9e..9b11af87aa 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1325,10 +1325,52 @@ void count_bitmap_commit_list(struct bitmap_index *bitmap_git, struct bitmap_test_data { struct bitmap_index *bitmap_git; struct bitmap *base; + struct bitmap *commits; + struct bitmap *trees; + struct bitmap *blobs; + struct bitmap *tags; struct progress *prg; size_t seen; }; +static void test_bitmap_type(struct bitmap_test_data *tdata, + struct object *obj, int pos) +{ + enum object_type bitmap_type = OBJ_NONE; + int bitmaps_nr = 0; + + if (bitmap_get(tdata->commits, pos)) { + bitmap_type = OBJ_COMMIT; + bitmaps_nr++; + } + if (bitmap_get(tdata->trees, pos)) { + bitmap_type = OBJ_TREE; + bitmaps_nr++; + } + if (bitmap_get(tdata->blobs, pos)) { + bitmap_type = OBJ_BLOB; + bitmaps_nr++; + } + if (bitmap_get(tdata->tags, pos)) { + bitmap_type = OBJ_TAG; + bitmaps_nr++; + } + + if (bitmap_type == OBJ_NONE) + die("object %s not found in type bitmaps", + oid_to_hex(&obj->oid)); + + if (bitmaps_nr > 1) + die("object %s does not have a unique type", + oid_to_hex(&obj->oid)); + + if (bitmap_type != obj->type) + die("object %s: real type %s, expected: %s", + oid_to_hex(&obj->oid), + type_name(obj->type), + type_name(bitmap_type)); +} + static void test_show_object(struct object *object, const char *name, void *data) { @@ -1338,6 +1380,7 @@ static void test_show_object(struct object *object, const char *name, bitmap_pos = bitmap_position(tdata->bitmap_git, &object->oid); if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&object->oid)); + test_bitmap_type(tdata, object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1352,6 +1395,7 @@ static void test_show_commit(struct commit *commit, void *data) &commit->object.oid); if (bitmap_pos < 0) die("Object not in bitmap: %s\n", oid_to_hex(&commit->object.oid)); + test_bitmap_type(tdata, &commit->object, bitmap_pos); bitmap_set(tdata->base, bitmap_pos); display_progress(tdata->prg, ++tdata->seen); @@ -1399,6 +1443,10 @@ void test_bitmap_walk(struct rev_info *revs) tdata.bitmap_git = bitmap_git; tdata.base = bitmap_new(); + tdata.commits = ewah_to_bitmap(bitmap_git->commits); + tdata.trees = ewah_to_bitmap(bitmap_git->trees); + tdata.blobs = ewah_to_bitmap(bitmap_git->blobs); + tdata.tags = ewah_to_bitmap(bitmap_git->tags); tdata.prg = start_progress("Verifying bitmap entries", result_popcnt); tdata.seen = 0; From patchwork Tue Aug 31 20:51:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75518C4320E for ; Tue, 31 Aug 2021 20:51:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5A94A60ED4 for ; Tue, 31 Aug 2021 20:51:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241217AbhHaUwn (ORCPT ); Tue, 31 Aug 2021 16:52:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229916AbhHaUwm (ORCPT ); Tue, 31 Aug 2021 16:52:42 -0400 Received: from mail-io1-xd2c.google.com (mail-io1-xd2c.google.com [IPv6:2607:f8b0:4864:20::d2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BEC6BC061575 for ; Tue, 31 Aug 2021 13:51:46 -0700 (PDT) Received: by mail-io1-xd2c.google.com with SMTP id e186so729307iof.12 for ; Tue, 31 Aug 2021 13:51:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=JFgfNOP69iY73x+/56MeHf3OUVtz0bNgDEwDbybDP5s=; b=tBoxX8sdsvCIVSgh8CIiQF0XbY8+EhTrl1p5nBl1fff50BzRoXq6cOzU7xQun9/5KR RpOCG86g3NgpLkh6SvTUFX0hrKPVbprcWGbN3/wxi6YW65oCG4/zAwGLk+MHhOXKLcv8 HnqYa+OZs0qJ0ck8awPh/wys970RJ6gSJ4R+4rNJJpQV9pC5Fv4bSEl9NA27IyXiGAy3 3tGCReh5ks3H3667u5ZeevGI7q9BMok+OSGNRCJRnZtXCsp9m+WrJ3BK0j4OC2XpDEGr kqneWrFrqJODAFdpdecusmFdzxsOMUDM6bjMMurFRHlTFKjmiYp2Fh1KxEXop2d/zPJf SVbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=JFgfNOP69iY73x+/56MeHf3OUVtz0bNgDEwDbybDP5s=; b=bQxtRo3UiAra6IErIYbENRu3vbQGQQAlkmas8jsexzh7AUaXH2HGeRHCYXP0TuBYu6 JuXRtuGFsMmzac8cd4eAzhDVRMFaaqZusUDZeAIvEF1kaEqbqvExnkxLpBe1JkIBgj0U K7MNf7BJpODWf8MSW8UR1grm9dyx68r6oiU/uvH3rdqEETWUWRzjI/k3EpeX67tUI02b LmIL/j42D7nrUn2zBC6bc/bHrm9brxaV6drDER7ppgzO0Ert9oZDwGJlHSS5MA1AQ17y gIawabAHkZOIUpqtfm9TIt3RH/fT/TPtsafjFOBt6QblSjFNfpe8MAFDyIL2Gb4eSBsN rH5Q== X-Gm-Message-State: AOAM530SqF3Py2mJrVkb1iMCvLzGH0ECUXa1dGDyLGOPahy1HyPe6Uqk XjfjXkkrTsK+SLaRceXBWzQpmS8Ciy7pBC2B X-Google-Smtp-Source: ABdhPJxRJaUXRMR9kMtnlzy7ZqeZ0fIfzig2cN8H8/US5noKoXwXVfnM5e5NN+tQpfjUaZL0BsvSzA== X-Received: by 2002:a6b:e90c:: with SMTP id u12mr23533962iof.95.1630443105931; Tue, 31 Aug 2021 13:51:45 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id j5sm10770357ilu.11.2021.08.31.13.51.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:45 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:44 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 02/27] pack-bitmap-write.c: gracefully fail to write non-closed bitmaps Message-ID: <629171115a36071969ea56569e07c8b179affa29.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The set of objects covered by a bitmap must be closed under reachability, since it must be the case that there is a valid bit position assigned for every possible reachable object (otherwise the bitmaps would be incomplete). Pack bitmaps are never written from 'git repack' unless repacking all-into-one, and so we never write non-closed bitmaps (except in the case of partial clones where we aren't guaranteed to have all objects). But multi-pack bitmaps change this, since it isn't known whether the set of objects in the MIDX is closed under reachability until walking them. Plumb through a bit that is set when a reachable object isn't found. As soon as a reachable object isn't found in the set of objects to include in the bitmap, bitmap_writer_build() knows that the set is not closed, and so it now fails gracefully. A test is added in t0410 to trigger a bitmap write without full reachability closure by removing local copies of some reachable objects from a promisor remote. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 3 +- pack-bitmap-write.c | 76 ++++++++++++++++++++++++++++------------ pack-bitmap.h | 2 +- t/t0410-partial-clone.sh | 9 ++++- 4 files changed, 64 insertions(+), 26 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index df49f656b9..b63e06e46c 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1256,7 +1256,8 @@ static void write_pack_file(void) bitmap_writer_show_progress(progress); bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1); - bitmap_writer_build(&to_pack); + if (bitmap_writer_build(&to_pack) < 0) + die(_("failed to write bitmap index")); bitmap_writer_finish(written_list, nr_written, tmpname.buf, write_bitmap_options); write_bitmap_index = 0; diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 88d9e696a5..d374f7884b 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -125,15 +125,20 @@ static inline void push_bitmapped_commit(struct commit *commit) writer.selected_nr++; } -static uint32_t find_object_pos(const struct object_id *oid) +static uint32_t find_object_pos(const struct object_id *oid, int *found) { struct object_entry *entry = packlist_find(writer.to_pack, oid); if (!entry) { - die("Failed to write bitmap index. Packfile doesn't have full closure " + if (found) + *found = 0; + warning("Failed to write bitmap index. Packfile doesn't have full closure " "(object %s is missing)", oid_to_hex(oid)); + return 0; } + if (found) + *found = 1; return oe_in_pack_pos(writer.to_pack, entry); } @@ -331,9 +336,10 @@ static void bitmap_builder_clear(struct bitmap_builder *bb) bb->commits_nr = bb->commits_alloc = 0; } -static void fill_bitmap_tree(struct bitmap *bitmap, - struct tree *tree) +static int fill_bitmap_tree(struct bitmap *bitmap, + struct tree *tree) { + int found; uint32_t pos; struct tree_desc desc; struct name_entry entry; @@ -342,9 +348,11 @@ static void fill_bitmap_tree(struct bitmap *bitmap, * If our bit is already set, then there is nothing to do. Both this * tree and all of its children will be set. */ - pos = find_object_pos(&tree->object.oid); + pos = find_object_pos(&tree->object.oid, &found); + if (!found) + return -1; if (bitmap_get(bitmap, pos)) - return; + return 0; bitmap_set(bitmap, pos); if (parse_tree(tree) < 0) @@ -355,11 +363,15 @@ static void fill_bitmap_tree(struct bitmap *bitmap, while (tree_entry(&desc, &entry)) { switch (object_type(entry.mode)) { case OBJ_TREE: - fill_bitmap_tree(bitmap, - lookup_tree(the_repository, &entry.oid)); + if (fill_bitmap_tree(bitmap, + lookup_tree(the_repository, &entry.oid)) < 0) + return -1; break; case OBJ_BLOB: - bitmap_set(bitmap, find_object_pos(&entry.oid)); + pos = find_object_pos(&entry.oid, &found); + if (!found) + return -1; + bitmap_set(bitmap, pos); break; default: /* Gitlink, etc; not reachable */ @@ -368,15 +380,18 @@ static void fill_bitmap_tree(struct bitmap *bitmap, } free_tree_buffer(tree); + return 0; } -static void fill_bitmap_commit(struct bb_commit *ent, - struct commit *commit, - struct prio_queue *queue, - struct prio_queue *tree_queue, - struct bitmap_index *old_bitmap, - const uint32_t *mapping) +static int fill_bitmap_commit(struct bb_commit *ent, + struct commit *commit, + struct prio_queue *queue, + struct prio_queue *tree_queue, + struct bitmap_index *old_bitmap, + const uint32_t *mapping) { + int found; + uint32_t pos; if (!ent->bitmap) ent->bitmap = bitmap_new(); @@ -401,11 +416,16 @@ static void fill_bitmap_commit(struct bb_commit *ent, * Mark ourselves and queue our tree. The commit * walk ensures we cover all parents. */ - bitmap_set(ent->bitmap, find_object_pos(&c->object.oid)); + pos = find_object_pos(&c->object.oid, &found); + if (!found) + return -1; + bitmap_set(ent->bitmap, pos); prio_queue_put(tree_queue, get_commit_tree(c)); for (p = c->parents; p; p = p->next) { - int pos = find_object_pos(&p->item->object.oid); + pos = find_object_pos(&p->item->object.oid, &found); + if (!found) + return -1; if (!bitmap_get(ent->bitmap, pos)) { bitmap_set(ent->bitmap, pos); prio_queue_put(queue, p->item); @@ -413,8 +433,12 @@ static void fill_bitmap_commit(struct bb_commit *ent, } } - while (tree_queue->nr) - fill_bitmap_tree(ent->bitmap, prio_queue_get(tree_queue)); + while (tree_queue->nr) { + if (fill_bitmap_tree(ent->bitmap, + prio_queue_get(tree_queue)) < 0) + return -1; + } + return 0; } static void store_selected(struct bb_commit *ent, struct commit *commit) @@ -432,7 +456,7 @@ static void store_selected(struct bb_commit *ent, struct commit *commit) kh_value(writer.bitmaps, hash_pos) = stored; } -void bitmap_writer_build(struct packing_data *to_pack) +int bitmap_writer_build(struct packing_data *to_pack) { struct bitmap_builder bb; size_t i; @@ -441,6 +465,7 @@ void bitmap_writer_build(struct packing_data *to_pack) struct prio_queue tree_queue = { NULL }; struct bitmap_index *old_bitmap; uint32_t *mapping; + int closed = 1; /* until proven otherwise */ writer.bitmaps = kh_init_oid_map(); writer.to_pack = to_pack; @@ -463,8 +488,11 @@ void bitmap_writer_build(struct packing_data *to_pack) struct commit *child; int reused = 0; - fill_bitmap_commit(ent, commit, &queue, &tree_queue, - old_bitmap, mapping); + if (fill_bitmap_commit(ent, commit, &queue, &tree_queue, + old_bitmap, mapping) < 0) { + closed = 0; + break; + } if (ent->selected) { store_selected(ent, commit); @@ -499,7 +527,9 @@ void bitmap_writer_build(struct packing_data *to_pack) stop_progress(&writer.progress); - compute_xor_offsets(); + if (closed) + compute_xor_offsets(); + return closed ? 0 : -1; } /** diff --git a/pack-bitmap.h b/pack-bitmap.h index 99d733eb26..020cd8d868 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -87,7 +87,7 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git, struct commit *commit); void bitmap_writer_select_commits(struct commit **indexed_commits, unsigned int indexed_commits_nr, int max_bitmaps); -void bitmap_writer_build(struct packing_data *to_pack); +int bitmap_writer_build(struct packing_data *to_pack); void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, const char *filename, diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index a211a66c67..bbcc51ee8e 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -536,7 +536,13 @@ test_expect_success 'gc does not repack promisor objects if there are none' ' repack_and_check () { rm -rf repo2 && cp -r repo repo2 && - git -C repo2 repack $1 -d && + if test x"$1" = "x--must-fail" + then + shift + test_must_fail git -C repo2 repack $1 -d + else + git -C repo2 repack $1 -d + fi && git -C repo2 fsck && git -C repo2 cat-file -e $2 && @@ -561,6 +567,7 @@ test_expect_success 'repack -d does not irreversibly delete promisor objects' ' printf "$THREE\n" | pack_as_from_promisor && delete_object repo "$ONE" && + repack_and_check --must-fail -ab "$TWO" "$THREE" && repack_and_check -a "$TWO" "$THREE" && repack_and_check -A "$TWO" "$THREE" && repack_and_check -l "$TWO" "$THREE" From patchwork Tue Aug 31 20:51:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467795 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A7F8C432BE for ; Tue, 31 Aug 2021 20:51:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1D1B760ED4 for ; Tue, 31 Aug 2021 20:51:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241218AbhHaUwr (ORCPT ); Tue, 31 Aug 2021 16:52:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229916AbhHaUwo (ORCPT ); Tue, 31 Aug 2021 16:52:44 -0400 Received: from mail-io1-xd31.google.com (mail-io1-xd31.google.com [IPv6:2607:f8b0:4864:20::d31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2FCC1C061575 for ; Tue, 31 Aug 2021 13:51:49 -0700 (PDT) Received: by mail-io1-xd31.google.com with SMTP id g9so740090ioq.11 for ; Tue, 31 Aug 2021 13:51:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=+DRaLVOHAM1ShuPqp6HHxzma3WXLZmd+bPW51wq5EsU=; b=bqGI1MkN/yglEu/bxZ5gJD8TdIKVQiNemHDC0EsSQ4HI94ZXf0Ors+/7JMiCIC9gMI QXA7olHsQd5lKfH31A6IICXjo0Y/tqmDqoZruw8WAO5Y4U6ZXroNo4WlhP5ju+NoLEfn xwzhsmz4KQtlxCZfi0kZinDg1+l5bIB1EXUucPaIGNepieDt9H5KeGqOjXyJI+oylOG8 INB6v5B9LN19L1615nI1yUv+SeP8e+yUR3iBWKymK7hb/g4oZCzglHd9sB8jqHX9wAjK KJz+uxMES3DZ14fJq0WPIbMrLSBGw3l0vQmz0r4D6XE6fEt1aEeVL02gmNHR3zdreL6c 67dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=+DRaLVOHAM1ShuPqp6HHxzma3WXLZmd+bPW51wq5EsU=; b=SrQSPwSpGHo6WzlWMIPOSUv0W6c2DphWtWZZG49SHi5HNNpsPiSu8utD9OrAsu34z2 LXKXXBPLEHVgF51IVpIwwvgf92MZ1dwZva0MW+XIwf85pX5TuaJqm0GnULtzHTmphbP6 PvO2pD+J2vd0ja7nWKFKJxiPY1/5slpywC0IA+CNi7J7CZ8W70de+AppSJtXdAAlCtg/ q/Wnw4wFRIzST1LULQoH+2OeknsR3u7E/cPu/HfQOgI6gAqJfwzmx1c5YhIl1+WGjm76 mXZvEHxEiWrTEQPwZ8ZWFDB5t4pbRFM60yM3bkX/SvYgCtojhPoM0peOacFqzwfGMzFq S89A== X-Gm-Message-State: AOAM531ZpDSreTc1BssGfr0ziKcbcbrZeic2TdoUN21B252i6lzBd9Ts KS9oDdezv8fMRBpL2EI7kagTnrbmuX0TUsTA X-Google-Smtp-Source: ABdhPJzArW0PGFl9atkVEDpUeuH88fB6n5WaPrzGdD+nooZcfPuSdS/eMTmTIEJs1+EMfp5VbaCk1Q== X-Received: by 2002:a5d:9284:: with SMTP id s4mr24270427iom.131.1630443108545; Tue, 31 Aug 2021 13:51:48 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id q12sm10119559ioh.20.2021.08.31.13.51.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:48 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:47 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 03/27] pack-bitmap-write.c: free existing bitmaps Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When writing a new bitmap, the bitmap writer code attempts to read the existing bitmap (if one is present). This is done in order to quickly permute the bits of any bitmaps for commits which appear in the existing bitmap, and were also selected for the new bitmap. But since this code was added in 341fa34887 (pack-bitmap-write: use existing bitmaps, 2020-12-08), the resources associated with opening an existing bitmap were never released. It's fine to ignore this, but it's bad hygiene. It will also cause a problem for the multi-pack-index builtin, which will be responsible not only for writing bitmaps, but also for expiring any old multi-pack bitmaps. If an existing bitmap was reused here, it will also be expired. That will cause a problem on platforms which require file resources to be closed before unlinking them, like Windows. Avoid this by ensuring we close reused bitmaps with free_bitmap_index() before removing them. Signed-off-by: Taylor Blau --- pack-bitmap-write.c | 1 + 1 file changed, 1 insertion(+) diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index d374f7884b..142fd0adb8 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -520,6 +520,7 @@ int bitmap_writer_build(struct packing_data *to_pack) clear_prio_queue(&queue); clear_prio_queue(&tree_queue); bitmap_builder_clear(&bb); + free_bitmap_index(old_bitmap); free(mapping); trace2_region_leave("pack-bitmap-write", "building_bitmaps_total", From patchwork Tue Aug 31 20:51:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467799 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0183C432BE for ; Tue, 31 Aug 2021 20:51:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CCF1260F9E for ; Tue, 31 Aug 2021 20:51:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241223AbhHaUwt (ORCPT ); Tue, 31 Aug 2021 16:52:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41882 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229916AbhHaUwr (ORCPT ); Tue, 31 Aug 2021 16:52:47 -0400 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5C12C061575 for ; Tue, 31 Aug 2021 13:51:51 -0700 (PDT) Received: by mail-io1-xd2b.google.com with SMTP id y18so830190ioc.1 for ; Tue, 31 Aug 2021 13:51:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=/9RoyJUt233rLn5Vi/qShXKFiOHudX4ykIEDOxpKOus=; b=Sx6iZHiSCQT6d0CLtW3NmNuhB8gxDS/1I0rWhXn/ykRwcH3pPQCCUKeOrVhXUHrEr2 yELVmgtC+vgJPlbFYuJR5nnMjBYZbcdlBsQxuyUr0ZaoW7pJ1gUAugr+oHj/zFxRBekd QDN3+A510dKc6YhmzAVd6E0+KLC0QkyRLocjCuhWA4DC3S94//anpd1bsRHLH85okxsp waDdrb9zus1yT4gSUw7NhBnBjcaMcv8jUs9c2LtwCdrMq3F3gsKMF7sNMzxHMmJL+WTJ o4vzOqfBT39S51JVZJEIRjgogeNgnHdxnvMesTauIxFd9HCpypPyCUxT1K/6T8GQDf1s CV0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=/9RoyJUt233rLn5Vi/qShXKFiOHudX4ykIEDOxpKOus=; b=FMIw12Zfp1mOfuA9YFqMFYR211n00k45vGhb/4SuIXUtCOv8hYK2BT4D3zQcd6uVWZ 0GyGv5vG3OLz4hlBkJPdDCWHkKTBLhLF/MUsbgRvpPRLG2Ld1gFGZrRsL5jvwR261C/4 C8RbLu5DZDm7uzlz/7FR+BmvFSDcKfwSL4y8PZfPff9rSj82gik88Gq+OBbD8oEB9i+h rhk0emZKAHmbIif/W+JP1oQSqdd2d9BTfrwlcqM9+WNbuBZlzOQ5oBVyb/W9KZs7t2Lu A8auv9JWNOkDcz9j1gprXyzGmgzHyDKcJZ/W8DoNNy/13M4gookIE3oVcgT7/rZ3FspN 8/MA== X-Gm-Message-State: AOAM530a79JQ0rLCy9rNRlYu+CNBkCwyCN1yclPqgz6uRuXLbFZ1t59m dGIA7EM0qCNvNx3IxjC35o/1GsbdhifnTVdk X-Google-Smtp-Source: ABdhPJw1wJFElimAHp2wUftwJ0V8Xk0eVKtd45MCbu/OEbQXrMQjWM/bOuuboJY0RTBDNdTF0NnANg== X-Received: by 2002:a5d:9409:: with SMTP id v9mr9878198ion.170.1630443110950; Tue, 31 Aug 2021 13:51:50 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id k2sm10576411ior.40.2021.08.31.13.51.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:50 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:50 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 04/27] Documentation: describe MIDX-based bitmaps Message-ID: <158ff797c4cf79ee50efe31c85c032f4d79af609.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Update the technical documentation to describe the multi-pack bitmap format. This patch merely introduces the new format, and describes its high-level ideas. Git does not yet know how to read nor write these multi-pack variants, and so the subsequent patches will: - Introduce code to interpret multi-pack bitmaps, according to this document. - Then, introduce code to write multi-pack bitmaps from the 'git multi-pack-index write' sub-command. Finally, the implementation will gain tests in subsequent patches (as opposed to inline with the patch teaching Git how to write multi-pack bitmaps) to avoid a cyclic dependency. Signed-off-by: Taylor Blau --- Documentation/technical/bitmap-format.txt | 71 ++++++++++++++++---- Documentation/technical/multi-pack-index.txt | 10 +-- 2 files changed, 60 insertions(+), 21 deletions(-) diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt index f8c18a0f7a..04b3ec2178 100644 --- a/Documentation/technical/bitmap-format.txt +++ b/Documentation/technical/bitmap-format.txt @@ -1,6 +1,44 @@ GIT bitmap v1 format ==================== +== Pack and multi-pack bitmaps + +Bitmaps store reachability information about the set of objects in a packfile, +or a multi-pack index (MIDX). The former is defined obviously, and the latter is +defined as the union of objects in packs contained in the MIDX. + +A bitmap may belong to either one pack, or the repository's multi-pack index (if +it exists). A repository may have at most one bitmap. + +An object is uniquely described by its bit position within a bitmap: + + - If the bitmap belongs to a packfile, the __n__th bit corresponds to + the __n__th object in pack order. For a function `offset` which maps + objects to their byte offset within a pack, pack order is defined as + follows: + + o1 <= o2 <==> offset(o1) <= offset(o2) + + - If the bitmap belongs to a MIDX, the __n__th bit corresponds to the + __n__th object in MIDX order. With an additional function `pack` which + maps objects to the pack they were selected from by the MIDX, MIDX order + is defined as follows: + + o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2) + + The ordering between packs is done according to the MIDX's .rev file. + Notably, the preferred pack sorts ahead of all other packs. + +The on-disk representation (described below) of a bitmap is the same regardless +of whether or not that bitmap belongs to a packfile or a MIDX. The only +difference is the interpretation of the bits, which is described above. + +Certain bitmap extensions are supported (see: Appendix B). No extensions are +required for bitmaps corresponding to packfiles. For bitmaps that correspond to +MIDXs, both the bit-cache and rev-cache extensions are required. + +== On-disk format + - A header appears at the beginning: 4-byte signature: {'B', 'I', 'T', 'M'} @@ -14,17 +52,19 @@ GIT bitmap v1 format The following flags are supported: - BITMAP_OPT_FULL_DAG (0x1) REQUIRED - This flag must always be present. It implies that the bitmap - index has been generated for a packfile with full closure - (i.e. where every single object in the packfile can find - its parent links inside the same packfile). This is a - requirement for the bitmap index format, also present in JGit, - that greatly reduces the complexity of the implementation. + This flag must always be present. It implies that the + bitmap index has been generated for a packfile or + multi-pack index (MIDX) with full closure (i.e. where + every single object in the packfile/MIDX can find its + parent links inside the same packfile/MIDX). This is a + requirement for the bitmap index format, also present in + JGit, that greatly reduces the complexity of the + implementation. - BITMAP_OPT_HASH_CACHE (0x4) If present, the end of the bitmap file contains `N` 32-bit name-hash values, one per object in the - pack. The format and meaning of the name-hash is + pack/MIDX. The format and meaning of the name-hash is described below. 4-byte entry count (network byte order) @@ -33,7 +73,8 @@ GIT bitmap v1 format 20-byte checksum - The SHA1 checksum of the pack this bitmap index belongs to. + The SHA1 checksum of the pack/MIDX this bitmap index + belongs to. - 4 EWAH bitmaps that act as type indexes @@ -50,7 +91,7 @@ GIT bitmap v1 format - Tags In each bitmap, the `n`th bit is set to true if the `n`th object - in the packfile is of that type. + in the packfile or multi-pack index is of that type. The obvious consequence is that the OR of all 4 bitmaps will result in a full set (all bits set), and the AND of all 4 bitmaps will @@ -62,8 +103,9 @@ GIT bitmap v1 format Each entry contains the following: - 4-byte object position (network byte order) - The position **in the index for the packfile** where the - bitmap for this commit is found. + The position **in the index for the packfile or + multi-pack index** where the bitmap for this commit is + found. - 1-byte XOR-offset The xor offset used to compress this bitmap. For an entry @@ -146,10 +188,11 @@ Name-hash cache --------------- If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains -a cache of 32-bit values, one per object in the pack. The value at +a cache of 32-bit values, one per object in the pack/MIDX. The value at position `i` is the hash of the pathname at which the `i`th object -(counting in index order) in the pack can be found. This can be fed -into the delta heuristics to compare objects with similar pathnames. +(counting in index or multi-pack index order) in the pack/MIDX can be found. +This can be fed into the delta heuristics to compare objects with similar +pathnames. The hash algorithm used is: diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index fb688976c4..1a73c3ee20 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -71,14 +71,10 @@ Future Work still reducing the number of binary searches required for object lookups. -- The reachability bitmap is currently paired directly with a single - packfile, using the pack-order as the object order to hopefully - compress the bitmaps well using run-length encoding. This could be - extended to pair a reachability bitmap with a multi-pack-index. If - the multi-pack-index is extended to store a "stable object order" +- If the multi-pack-index is extended to store a "stable object order" (a function Order(hash) = integer that is constant for a given hash, - even as the multi-pack-index is updated) then a reachability bitmap - could point to a multi-pack-index and be updated independently. + even as the multi-pack-index is updated) then MIDX bitmaps could be + updated independently of the MIDX. - Packfiles can be marked as "special" using empty files that share the initial name but replace ".pack" with ".keep" or ".promisor". From patchwork Tue Aug 31 20:51:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467801 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CEB2C432BE for ; Tue, 31 Aug 2021 20:51:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5429F60ED4 for ; Tue, 31 Aug 2021 20:51:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241241AbhHaUwx (ORCPT ); Tue, 31 Aug 2021 16:52:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241238AbhHaUwu (ORCPT ); Tue, 31 Aug 2021 16:52:50 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FC28C0613D9 for ; Tue, 31 Aug 2021 13:51:54 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id n24so750807ion.10 for ; Tue, 31 Aug 2021 13:51:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=XHkP4oKdVEyWsc6Sk+cCaLnOaDuXSmZ/L+SbW56QNaw=; b=MXGCB9/8DI0QVZEJD/agiUPFX6Np9kGcaND+kI+xN3pfvV4Zsck6ygnk4HgajJsP95 wYxM+jun/ZgvI+2evDHuy4D0ca4UMrDf390peYlOIn5/8VZHNMEml/nR3ROTIjHFe+rk pqqwDizO5EPRX63H9TpP1l+hsB4WheR6P38Fyk2n4Q3cu4gBVHH1rf8+O6UYtQszMQRa 3V8xNKPmEgdhbEGLlMeCgy0AXdVQhzT2VyZLogHrv2UPWajiSakxas5w2vjgJassSRG6 kVMxgTZvbv/5erAt1TDIwTILE0r67mqlPYjwYcwRYFoRxHvtvUshm/JWFBoiHt7w9pe3 z+xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=XHkP4oKdVEyWsc6Sk+cCaLnOaDuXSmZ/L+SbW56QNaw=; b=m8fMBVEmBOefDCRMY5qKzxRSM03vxhCcphCAwoaWErF5omjbPIFuqO5EWq15r7+2Qe Rc9zd0vpe0/YCwbFgiXAmzoQICX2NvFzTAsLNxDMrVppsI/1jbbxc1S12qxiEeQHK+pf dn4+2TRblEwnI42kYlXdsUZsVOEENRwneUZJ7KMIckKbmCWR/0hPFpHkjZS4dGf8nmGX r87kYG8Z8IepUvkdAvNUSpHu8F7EIEwYdWLEd6McVbjLs9sEGB4VSe9LHAufbvLrJknA ea7fXGJNbvQ9bCH1/ST+3sj1DN6Ylf8F0pqj3yRxZ+tc0iuXxxLMv0Fafva2sfnQSTcP K+uA== X-Gm-Message-State: AOAM530BG4KDmMmRVTvEIJB2B7y9rysT54ZN5bg6gl10knAG6+Jxuxqq X22LnVPz7n5EKZpo862Rto2HZ04eyfYuh7Cq X-Google-Smtp-Source: ABdhPJzY4ygIc0NqyoEWZUFoYIeV7ch39oCgz0qgT+JZqgrXkYhq6gecmktR2SQmCLhqvOxeMh316A== X-Received: by 2002:a02:9082:: with SMTP id x2mr4608609jaf.44.1630443113975; Tue, 31 Aug 2021 13:51:53 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id o2sm10964277ilg.47.2021.08.31.13.51.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:53 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:53 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 05/27] midx: disallow running outside of a repository Message-ID: <5f24be8985fe0f20b3ea3001dca6c72b6fac17e1.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The multi-pack-index command supports working with arbitrary object directories via the `--object-dir` flag. Though this has historically worked in arbitrary repositories (including when the command itself was run outside of a Git repository), this has been somewhat of an accident. For example, running: git multi-pack-index write --object-dir=/path/to/repo/objects outside of a Git repository causes a BUG(). This is because the top-level `cmd_multi_pack_index()` function stops parsing when it sees "write", and then fills in the default object directory (the result of calling `get_object_directory()`) before handing off to `cmd_multi_pack_index_write()`. But there is no repository to initialize, and so calling `get_object_directory()` results in a BUG() (indicating that the current repository is not initialized). Another case where this doesn't quite work as expected is when operating in a SHA-256 repository. To see the failure, try this in your shell: git init --object-format=sha256 repo git -C repo commit --allow-empty base git -C repo repack -d git multi-pack-index --object-dir=$(pwd)/repo/.git/objects write and observe that we cannot open the `.idx` file in "repo", because the outermost process assumes that any repository that it works in also uses the default value of `the_hash_algo` (at the time of writing, SHA-1). There may be compelling reasons for trying to work around these bugs, but working in arbitrary `--object-dir`'s is non-standard enough (and likewise, these bugs prevalent enough) that I don't think any workflows would be broken by abandoning this behavior. Accordingly, restrict the `multi-pack-index` builtin to only work when inside of a Git repository (i.e., its main utility becomes selecting which alternate to operate in), which avoids both of the bugs above. (Note that you can still trigger a bug when writing a MIDX in an alternate which does not use the same object format as the repository which it is an alternate of, but that is an unrelated bug to this one). Signed-off-by: Taylor Blau --- git.c | 2 +- t/t5319-multi-pack-index.sh | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/git.c b/git.c index 18bed9a996..60c2784be4 100644 --- a/git.c +++ b/git.c @@ -561,7 +561,7 @@ static struct cmd_struct commands[] = { { "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT }, { "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT }, { "mktree", cmd_mktree, RUN_SETUP }, - { "multi-pack-index", cmd_multi_pack_index, RUN_SETUP_GENTLY }, + { "multi-pack-index", cmd_multi_pack_index, RUN_SETUP }, { "mv", cmd_mv, RUN_SETUP | NEED_WORK_TREE }, { "name-rev", cmd_name_rev, RUN_SETUP }, { "notes", cmd_notes, RUN_SETUP }, diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 3d4d9f10c3..9034e94c0a 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -842,4 +842,9 @@ test_expect_success 'usage shown without sub-command' ' ! test_i18ngrep "unrecognized subcommand" err ' +test_expect_success 'complains when run outside of a repository' ' + nongit test_must_fail git multi-pack-index write 2>err && + grep "not a git repository" err +' + test_done From patchwork Tue Aug 31 20:51:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467803 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B62F1C4320E for ; Tue, 31 Aug 2021 20:52:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A12E460F9E for ; Tue, 31 Aug 2021 20:52:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241259AbhHaUwz (ORCPT ); Tue, 31 Aug 2021 16:52:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241257AbhHaUww (ORCPT ); Tue, 31 Aug 2021 16:52:52 -0400 Received: from mail-il1-x12d.google.com (mail-il1-x12d.google.com [IPv6:2607:f8b0:4864:20::12d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6239CC061575 for ; Tue, 31 Aug 2021 13:51:57 -0700 (PDT) Received: by mail-il1-x12d.google.com with SMTP id s16so674988ilo.9 for ; Tue, 31 Aug 2021 13:51:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=HzCKDLjGKEnx9sMzqhiF5WyixhOqv2ShfpkUJ5lh0/k=; b=xwjhvY+26UykS46tanmllkFPUvcc/n++UO0Rsm+3ZXFwISX2r7dSKwqWQ3FrdyBY0w QEpFUKZIliiOr3Dy5eEwb2N84XzrkoVPM2wQtVtVsCU8ZaHy+BBUgw4MqNUyEdpDV7PP UHSdRFepQ9KEhKAX2aGxYbynzCLSSIUiwrxH3tWEzDUB0cK5odKujOOe0l3GPwQrqh9p H/JSq7ZZ4YtnZb1B/xUrvhitjkv7gPHVsi8ALXlJoEu+CpMbOz3dG11+hM+Cg9xq/E8R Ix739InUsqCDS7nBPQv+ZeEvCCvax+oVIda7s1hbNFaV15PbVK9ht2yFrVxNHdjmoz0h r7/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=HzCKDLjGKEnx9sMzqhiF5WyixhOqv2ShfpkUJ5lh0/k=; b=RP8Mpn7KuJeptp2L+BrXUD3/6lacaNIT6jhE2W5wI4ZdSSXMftPL4L5oWwH452MAO4 8W7SnV9hYVl5XAFnKy+4zn4iiMue/KqaCJ792yQQG+Ii3qICbK4cS6TKR8q/tR87CdnS wYsA4fxcOFwKYpFIXnsM3oNhE6fdnx+mfhm3MzT7OY0AAf8ysVLOnq6BTFGN3bQlCFlG CjKeDjYdpzNL71uKnmIeqcdEN1fqso43T7gUHyfHkxcU53fvcAqnxJCuoH0nDxfiR0oY xEHnSCMfKPRFRhppXICq2iauk0+FQj/3jNjgE0Ftgs+CwrSTy4qeMt/Pde2TKTavooe5 Ix4Q== X-Gm-Message-State: AOAM533sl3Nr+WdMC+zQi3m9PH7a36xYasMwFlooiueZUfuJu3aklJOv YbhJ8jIREe1uwZrYm7tv2cWg5QMV8JzVzUrp X-Google-Smtp-Source: ABdhPJxIxI8fN6oHO8vhM+MnqfRUn8QFShCMhSxzkut6mCfwbi5W+VR47rBSjeiRTLNLBQqUBgD57Q== X-Received: by 2002:a05:6e02:1a4f:: with SMTP id u15mr21665781ilv.251.1630443116639; Tue, 31 Aug 2021 13:51:56 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s7sm9667459ioc.42.2021.08.31.13.51.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:51:56 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:55 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 06/27] midx: fix `*.rev` cleanups with `--object-dir` Message-ID: <0aacaa928395bdc041cc366a25047442ff7a33f1.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org If using --object-dir to point into an object directory which belongs to a different repository than the one in the current working directory, such as: git init repo git -C repo ... # add some objects cd alternate git multi-pack-index --object-dir ../repo/.git/objects write the binary will segfault trying to access the object-dir via the repo it found, but that's not fully initialized. Worse, if we later call clear_midx_files_ext(), we will use `the_repository` and remove files out of the wrong object directory. Fix this by using the given object_dir (or the object directory of `the_repository` if `--object-dir` wasn't given) to properly to clean up the *.rev files, avoiding the crash. Original-patch-by: Johannes Berg Signed-off-by: Taylor Blau --- midx.c | 10 +++++----- t/t5319-multi-pack-index.sh | 28 ++++++++++++++++++++++++++++ 2 files changed, 33 insertions(+), 5 deletions(-) diff --git a/midx.c b/midx.c index 321c6fdd2f..902e1a7a7d 100644 --- a/midx.c +++ b/midx.c @@ -882,7 +882,7 @@ static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash, strbuf_release(&buf); } -static void clear_midx_files_ext(struct repository *r, const char *ext, +static void clear_midx_files_ext(const char *object_dir, const char *ext, unsigned char *keep_hash); static int midx_checksum_valid(struct multi_pack_index *m) @@ -1086,7 +1086,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (flags & MIDX_WRITE_REV_INDEX) write_midx_reverse_index(midx_name, midx_hash, &ctx); - clear_midx_files_ext(the_repository, ".rev", midx_hash); + clear_midx_files_ext(object_dir, ".rev", midx_hash); commit_lock_file(&lk); @@ -1135,7 +1135,7 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len, die_errno(_("failed to remove %s"), full_path); } -static void clear_midx_files_ext(struct repository *r, const char *ext, +static void clear_midx_files_ext(const char *object_dir, const char *ext, unsigned char *keep_hash) { struct clear_midx_data data; @@ -1146,7 +1146,7 @@ static void clear_midx_files_ext(struct repository *r, const char *ext, hash_to_hex(keep_hash), ext); data.ext = ext; - for_each_file_in_pack_dir(r->objects->odb->path, + for_each_file_in_pack_dir(object_dir, clear_midx_file_ext, &data); @@ -1165,7 +1165,7 @@ void clear_midx_file(struct repository *r) if (remove_path(midx)) die(_("failed to clear multi-pack-index at %s"), midx); - clear_midx_files_ext(r, ".rev", NULL); + clear_midx_files_ext(r->objects->odb->path, ".rev", NULL); free(midx); } diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 9034e94c0a..e953cdd6d1 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -201,6 +201,34 @@ test_expect_success 'write midx with twelve packs' ' compare_results_with_midx "twelve packs" +test_expect_success 'multi-pack-index *.rev cleanup with --object-dir' ' + git init repo && + git clone -s repo alternate && + + test_when_finished "rm -rf repo alternate" && + + ( + cd repo && + test_commit base && + git repack -d + ) && + + ours="alternate/.git/objects/pack/multi-pack-index-123.rev" && + theirs="repo/.git/objects/pack/multi-pack-index-abc.rev" && + touch "$ours" "$theirs" && + + ( + cd alternate && + git multi-pack-index --object-dir ../repo/.git/objects write + ) && + + # writing a midx in "repo" should not remove the .rev file in the + # alternate + test_path_is_file repo/.git/objects/pack/multi-pack-index && + test_path_is_file $ours && + test_path_is_missing $theirs +' + test_expect_success 'warn on improper hash version' ' git init --object-format=sha1 sha1 && ( From patchwork Tue Aug 31 20:51:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467805 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFE39C432BE for ; Tue, 31 Aug 2021 20:52:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B5C7E60F9E for ; Tue, 31 Aug 2021 20:52:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241245AbhHaUxE (ORCPT ); Tue, 31 Aug 2021 16:53:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241270AbhHaUw4 (ORCPT ); Tue, 31 Aug 2021 16:52:56 -0400 Received: from mail-il1-x132.google.com (mail-il1-x132.google.com [IPv6:2607:f8b0:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27FC3C0613D9 for ; Tue, 31 Aug 2021 13:52:01 -0700 (PDT) Received: by mail-il1-x132.google.com with SMTP id i13so715386ilm.4 for ; Tue, 31 Aug 2021 13:52:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=ID+XA7Pw++CvN1OMKw6Hl46Oz1Wxc2qICRPm34E5Ti0=; b=Hxv+ekBsIwws+DNBFVcKQh9PG5FR/XXzH+CHsXI5OkDUS0xXXj6or5XKFDflBjmxtk wZyr5j/YaN4PX4ZFhdIor2pj/N/ON2VzUJlSP9Kz80rbFMIF2xWTXL/71cBAty+0pwM4 eCtXYGJhiKEKf5uremDVWLx4LqEoglTK1SjSaZZp24khkmkMA602m/sNh0NUdnnN5MGY B7JVrem7z38fv+DopmJlxHLPcEjW9KwKfqQu7ETrJQfMkapqb0B3urNc8ABB8ds3M3aT LbIyiC8W3rH88PF5jsDwDi0FJotmxfTtJQqEQIn1eb4RORKr3yB9ts5mX9jic8mkpjZD 7Y2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=ID+XA7Pw++CvN1OMKw6Hl46Oz1Wxc2qICRPm34E5Ti0=; b=CDPGu51a56DEen1FJCWCsku7gvwA844AaLPIgs+aOjHKcYtEnp6F19t6pyT5zmU5Va f3vYzcitXPzKOcS2IFXFNw/0rnUF+m56DkojIFrAhJG567f/koyNKpHzwi+3XqLH4N9x dT3IwGBc/0RltvBkbeQ9Y//kPurtah7RuT+1bxYyrcUXAfs/OvwSLO3SXP8aQuAbYiX3 YTcWFeJ2PFe5ywLkXmo9gN5aXCeaSMtLe89s79TT0aDlpfiBaR4AO4Coj7DQnLtemZ17 tiakWXP+f+L4IC1Gk4qAkAOgg1l6Jluvy9729DmESo8PPubsQaTCflaAb4dIfvGLZyS4 Y+JA== X-Gm-Message-State: AOAM532bb3wspJ1gLxCKMq2iruvxJW5+beqncDcvqJ2OUHFABhx1oEDj mKfdayACKr55qMDOWe8+/1/WMtiMONmdlxAZ X-Google-Smtp-Source: ABdhPJyQEnBCkJEeaG4+e+vll/QphzI0gnIVTdDGTS4TcQMnohS5heWOUKRzsNKXSOs2LYijvRuBEA== X-Received: by 2002:a92:4b01:: with SMTP id m1mr20083134ilg.50.1630443120430; Tue, 31 Aug 2021 13:52:00 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s18sm10683330ilp.83.2021.08.31.13.52.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:00 -0700 (PDT) Date: Tue, 31 Aug 2021 16:51:59 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 07/27] midx: clear auxiliary .rev after replacing the MIDX Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When writing a new multi-pack index, write_midx_internal() attempts to clean up any auxiliary files (currently just the MIDX's `.rev` file, but soon to include a `.bitmap`, too) corresponding to the MIDX it's replacing. This step should happen after the new MIDX is written into place, since doing so beforehand means that the old MIDX could be read without its corresponding .rev file. Signed-off-by: Taylor Blau --- midx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 902e1a7a7d..0bcb403bae 100644 --- a/midx.c +++ b/midx.c @@ -1086,10 +1086,11 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (flags & MIDX_WRITE_REV_INDEX) write_midx_reverse_index(midx_name, midx_hash, &ctx); - clear_midx_files_ext(object_dir, ".rev", midx_hash); commit_lock_file(&lk); + clear_midx_files_ext(object_dir, ".rev", midx_hash); + cleanup: for (i = 0; i < ctx.nr; i++) { if (ctx.info[i].p) { From patchwork Tue Aug 31 20:52:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B531C432BE for ; Tue, 31 Aug 2021 20:52:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 590F96103A for ; Tue, 31 Aug 2021 20:52:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241274AbhHaUxH (ORCPT ); Tue, 31 Aug 2021 16:53:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241280AbhHaUxC (ORCPT ); Tue, 31 Aug 2021 16:53:02 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A693AC0613A3 for ; Tue, 31 Aug 2021 13:52:03 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id m11so785975ioo.6 for ; Tue, 31 Aug 2021 13:52:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Vb7bbxKGSFE1XPjJzo+wKcFArRDl5UEw8MU9KNu1hqo=; b=h8YyDSfxulxrzgdvqPTxtZxEfn3hQzLQYYC4T7LTbGgs7QC4/qKQ9AjGRRNw5ynWA/ qOhCPqJCVk8tmX7m5NTOrqeAVxvc6O3hYk7lrm+V+Bx3tYvSjXV3RPu5zpfkiLd6S1u1 9GDPmfWJAqYjie4rnCEgAFDvGoCW6hi2W4dqjVGplkWhqrSD50dvZyrvgukWJ9W8nIsv WB7g2RLW3UG5qYboCQgZBNeXGoN3dJeShkF7Zox0Q5S76B4rq87QC8VUiJPrU98r23SG BPdX0D1PHNeocginyljRIPeUMEa963iBVXuadNLzv2ZGKC7tSS7VP/Nzf7l1bBDqpH6Z 4QGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Vb7bbxKGSFE1XPjJzo+wKcFArRDl5UEw8MU9KNu1hqo=; b=Is4HF0gkNiwG4mQ258/mEnl5QOnHbFW0UFfqQiAQIt1BW5dCD957vfLzfv5f+hEKqs SnFLyZs5igymZTa14lRBTZ8e7thzH4lT81Vg/Op4ei/3bqjxdop8rwqKESUgHBNdBjKP QO40qJNkRybpayLD9U7bSvUim9vW+gq5b1xT0mfvj1CZH2qutXFeGSzqyOu2xkz152X7 Ov/N2aH/TXMHmTbRnWL1pkieTQISAzQh9jpYNFN3DyWwdAfGfcdOS4PLcGe+1L3MtacR ZRn1W3eG9BcHLttSQU2B1vK/0Mmko1WNDDQlOxaP5QHO/enxVFVWtNfhGmORQpa7dG6D Vi7A== X-Gm-Message-State: AOAM532jrh1Nnm257wsRuSh2igq7ZFy6mOtNVXWi/hqQxYBze8JBXH+v t5PCh8kMgPICbCNKT+TJl9SCWJdp3QjEC6J6 X-Google-Smtp-Source: ABdhPJyzOMQ8sDj+YRJ7tUMVBSIpfZClycISXrqNwMnI0Ufnpu6e9hASaT1ZAYbZivu3uc2ODmgCiA== X-Received: by 2002:a05:6638:408b:: with SMTP id m11mr4742169jam.136.1630443122977; Tue, 31 Aug 2021 13:52:02 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id b12sm10938655ios.0.2021.08.31.13.52.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:02 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:02 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 08/27] midx: reject empty `--preferred-pack`'s Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The soon-to-be-implemented multi-pack bitmap treats object in the first bit position specially by assuming that all objects in the pack it was selected from are also represented from that pack in the MIDX. In other words, the pack from which the first object was selected must also have all of its other objects selected from that same pack in the MIDX in case of any duplicates. But this assumption relies on the fact that there is at least one object in that pack to begin with; otherwise the object in the first bit position isn't from a preferred pack, in which case we can no longer assume that all objects in that pack were also selected from the same pack. Guard this assumption by checking the number of objects in the given preferred pack, and failing if the given pack is empty. To make sure we can safely perform this check, open any packs which are contained in an existing MIDX via prepare_midx_pack(). The same is done for new packs via the add_pack_to_midx() callback, but packs picked up from a previous MIDX will not yet have these opened. Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 6 +++--- midx.c | 29 ++++++++++++++++++++++++++ t/t5319-multi-pack-index.sh | 17 +++++++++++++++ 3 files changed, 49 insertions(+), 3 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index ffd601bc17..c9b063d31e 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -37,9 +37,9 @@ write:: -- --preferred-pack=:: Optionally specify the tie-breaking pack used when - multiple packs contain the same object. If not given, - ties are broken in favor of the pack with the lowest - mtime. + multiple packs contain the same object. `` must + contain at least one object. If not given, ties are + broken in favor of the pack with the lowest mtime. -- verify:: diff --git a/midx.c b/midx.c index 0bcb403bae..26089ec9c7 100644 --- a/midx.c +++ b/midx.c @@ -934,6 +934,25 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * ctx.info[ctx.nr].pack_name = xstrdup(ctx.m->pack_names[i]); ctx.info[ctx.nr].p = NULL; ctx.info[ctx.nr].expired = 0; + + if (flags & MIDX_WRITE_REV_INDEX) { + /* + * If generating a reverse index, need to have + * packed_git's loaded to compare their + * mtimes and object count. + */ + if (prepare_midx_pack(the_repository, ctx.m, i)) { + error(_("could not load pack")); + result = 1; + goto cleanup; + } + + if (open_pack_index(ctx.m->packs[i])) + die(_("could not open index for %s"), + ctx.m->packs[i]->pack_name); + ctx.info[ctx.nr].p = ctx.m->packs[i]; + } + ctx.nr++; } } @@ -961,6 +980,16 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * } } + if (ctx.preferred_pack_idx > -1) { + struct packed_git *preferred = ctx.info[ctx.preferred_pack_idx].p; + if (!preferred->num_objects) { + error(_("cannot select preferred pack %s with no objects"), + preferred->pack_name); + result = 1; + goto cleanup; + } + } + ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr, ctx.preferred_pack_idx); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index e953cdd6d1..d7e4988f2b 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -305,6 +305,23 @@ test_expect_success 'midx picks objects from preferred pack' ' ) ' +test_expect_success 'preferred packs must be non-empty' ' + test_when_finished rm -rf preferred.git && + git init preferred.git && + ( + cd preferred.git && + + test_commit base && + git repack -ad && + + empty="$(git pack-objects $objdir/pack/pack err && + grep "with no objects" err + ) +' + test_expect_success 'verify multi-pack-index success' ' git multi-pack-index verify --object-dir=$objdir ' From patchwork Tue Aug 31 20:52:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B2DBC432BE for ; Tue, 31 Aug 2021 20:52:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E86C560ED4 for ; Tue, 31 Aug 2021 20:52:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241237AbhHaUxF (ORCPT ); Tue, 31 Aug 2021 16:53:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41992 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241282AbhHaUxC (ORCPT ); Tue, 31 Aug 2021 16:53:02 -0400 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E2F3C061575 for ; Tue, 31 Aug 2021 13:52:06 -0700 (PDT) Received: by mail-io1-xd2b.google.com with SMTP id n24so751725ion.10 for ; Tue, 31 Aug 2021 13:52:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=d/KmTaMPxW79wkpfwwyDcnccakFT2KRF8SQyD8tz05M=; b=q6czIeQ9g7s7hJnBdXHPN6lc+T/AUqdNIaJHwB2cz5dNQCzGMaqELYihUI6dFcTp29 hUq6o7qBRPLBnQ9/Nw+wIRYb0D96iz7+2dVoidSZM86xHT8jimDehgZkwJ7pidO0FXYh +65zg9OEwXE7aONctRXeutPi9LpGrNBqSUAbkvS/yJ+QUG6QmD5KHRpn5PEVZj88RM8j k5i9FJXtfMZfw/31jU+SGWyNgSdVY3wZfRqodOKujKiECZjfb57Ji6wcJyTIu54zDomO cDPzq+yyMZhxVpU19UdGoDJzwEUAOi0bPEqRdxUF7DlrGiGmOcKwstXpPmrqtPoOEUxm W/UQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=d/KmTaMPxW79wkpfwwyDcnccakFT2KRF8SQyD8tz05M=; b=PbCO+J054rRA2/GbcBWjhrxcwIQHCqdj1LxxZ+IqrX1czHg+ueWxgZLatIpADcgNnq i3xdtxJUArjOVvXHyrmSuzTBnBcpJILtmL+CT+6EX7g94AZnKs7kwQP8Mmd1tAWw8nEu xt3Jgvv0J1mxTsXNXRCw+W2l/T75n0gnQAR91swK0t4zubiXclcKlm9yyvuvV2fNIDtX 1xt9nBp5+q7qtU1x1n98rZc6z+mKLe0W+fkEW4r3dFWxXfk7B++8tFMNRLn3OZgJpe78 eCyh21JuTEtaNItUflvAqQRRaEkAN4NdBSMKFxFvAmsZxB1PDgTvRbB7W3hYxNmg9vZu LfLg== X-Gm-Message-State: AOAM533xYZZ3o7LWYl2nEgp1nLtt+NQJUmS/r40Nq008EQNdb5sZQpkC 5OdjN2fmPWIfc5C8ub+IjFmLAlR8xImkCZUY X-Google-Smtp-Source: ABdhPJxlYAJDqegUpYP/iG8lqJtYfAgsEHew6EXR0PZbGFXzFlPyM49E40AjyFQKdRPGBvGlobDvRQ== X-Received: by 2002:a6b:7b4b:: with SMTP id m11mr24385515iop.165.1630443125554; Tue, 31 Aug 2021 13:52:05 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id a6sm10628284ilb.59.2021.08.31.13.52.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:05 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:04 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 09/27] midx: infer preferred pack when not given one Message-ID: <059c583e34f4fd90e7cc2a9aad20fc21ea2bbc8b.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In 9218c6a40c (midx: allow marking a pack as preferred, 2021-03-30), the multi-pack index code learned how to select a pack which all duplicate objects are selected from. That is, if an object appears in multiple packs, select the copy in the preferred pack before breaking ties according to the other rules like pack mtime and readdir() order. Not specifying a preferred pack can cause serious problems with multi-pack reachability bitmaps, because these bitmaps rely on having at least one pack from which all duplicates are selected. Not having such a pack causes problems with the code in pack-objects to reuse packs verbatim (e.g., that code assumes that a delta object in a chunk of pack sent verbatim will have its base object sent from the same pack). So why does not marking a pack preferred cause problems here? The reason is roughly as follows: - Ties are broken (when handling duplicate objects) by sorting according to midx_oid_compare(), which sorts objects by OID, preferred-ness, pack mtime, and finally pack ID (more on that later). - The psuedo pack-order (described in Documentation/technical/pack-format.txt under the section "multi-pack-index reverse indexes") is computed by midx_pack_order(), and sorts by pack ID and pack offset, with preferred packs sorting first. - But! Pack IDs come from incrementing the pack count in add_pack_to_midx(), which is a callback to for_each_file_in_pack_dir(), meaning that pack IDs are assigned in readdir() order. When specifying a preferred pack, all of that works fine, because duplicate objects are correctly resolved in favor of the copy in the preferred pack, and the preferred pack sorts first in the object order. "Sorting first" is critical, because the bitmap code relies on finding out which pack holds the first object in the MIDX's pseudo pack-order to determine which pack is preferred. But if we didn't specify a preferred pack, and the pack which comes first in readdir() order does not also have the lowest timestamp, then it's possible that that pack (the one that sorts first in pseudo-pack order, which the bitmap code will treat as the preferred one) did *not* have all duplicate objects resolved in its favor, resulting in breakage. The fix is simple: pick a (semi-arbitrary, non-empty) preferred pack when none was specified. This forces that pack to have duplicates resolved in its favor, and (critically) to sort first in pseudo-pack order. Unfortunately, testing this behavior portably isn't possible, since it depends on readdir() order which isn't guaranteed by POSIX. (Note that multi-pack reachability bitmaps have yet to be implemented; so in that sense this patch is fixing a bug which does not yet exist. But by having this patch beforehand, we can prevent the bug from ever materializing.) Signed-off-by: Taylor Blau --- midx.c | 50 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 6 deletions(-) diff --git a/midx.c b/midx.c index 26089ec9c7..67de1dbaeb 100644 --- a/midx.c +++ b/midx.c @@ -969,15 +969,57 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) goto cleanup; - ctx.preferred_pack_idx = -1; if (preferred_pack_name) { + int found = 0; for (i = 0; i < ctx.nr; i++) { if (!cmp_idx_or_pack_name(preferred_pack_name, ctx.info[i].pack_name)) { ctx.preferred_pack_idx = i; + found = 1; break; } } + + if (!found) + warning(_("unknown preferred pack: '%s'"), + preferred_pack_name); + } else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) { + struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p; + ctx.preferred_pack_idx = 0; + + if (packs_to_drop && packs_to_drop->nr) + BUG("cannot write a MIDX bitmap during expiration"); + + /* + * set a preferred pack when writing a bitmap to ensure that + * the pack from which the first object is selected in pseudo + * pack-order has all of its objects selected from that pack + * (and not another pack containing a duplicate) + */ + for (i = 1; i < ctx.nr; i++) { + struct packed_git *p = ctx.info[i].p; + + if (!oldest->num_objects || p->mtime < oldest->mtime) { + oldest = p; + ctx.preferred_pack_idx = i; + } + } + + if (!oldest->num_objects) { + /* + * If all packs are empty; unset the preferred index. + * This is acceptable since there will be no duplicate + * objects to resolve, so the preferred value doesn't + * matter. + */ + ctx.preferred_pack_idx = -1; + } + } else { + /* + * otherwise don't mark any pack as preferred to avoid + * interfering with expiration logic below + */ + ctx.preferred_pack_idx = -1; } if (ctx.preferred_pack_idx > -1) { @@ -1058,11 +1100,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * ctx.info, ctx.nr, sizeof(*ctx.info), idx_or_pack_name_cmp); - - if (!preferred) - warning(_("unknown preferred pack: '%s'"), - preferred_pack_name); - else { + if (preferred) { uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id]; if (perm == PACK_EXPIRED) warning(_("preferred pack '%s' is expired"), From patchwork Tue Aug 31 20:52:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 427E8C4320A for ; Tue, 31 Aug 2021 20:52:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 226A260ED4 for ; Tue, 31 Aug 2021 20:52:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241277AbhHaUxK (ORCPT ); Tue, 31 Aug 2021 16:53:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241285AbhHaUxE (ORCPT ); Tue, 31 Aug 2021 16:53:04 -0400 Received: from mail-io1-xd32.google.com (mail-io1-xd32.google.com [IPv6:2607:f8b0:4864:20::d32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98DABC061575 for ; Tue, 31 Aug 2021 13:52:08 -0700 (PDT) Received: by mail-io1-xd32.google.com with SMTP id a15so818685iot.2 for ; Tue, 31 Aug 2021 13:52:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=R3g4w2GDtR7QylGU3LlZvxoKxtS5Vnhfs+azpWnhjiI=; b=Q2kuLAqVi9Khx9h6MF2BAxHBtPUp9uG8aYMnuq7pOLf6TYGyVXu1QmeHt1BWDtHd/N 3mtefM0GIDVN1OR9GzPob9usZA4zl0OO4Pa1DNWz9ZXJJPKYnkcDriGUsb1FXMzo0dG+ ecqbaQFGt/ryYiJDOslz6OSwDlffT4HPtDTP+PPahngCGL+kwr+D11aJQ+eSfGp7QicG qZ7Qdfvlazo8vbPp8Cwot2eHfFUtGhkOGDyFSOPbqV0xbiKU9ce/QNIJfuatsSaVIjzN xGkZhZnhGWO3MtUDYR8WtMvEHlW7Fga/aFiMYxRB0x+IWrC7bYzTL5SLX2G8P6vgNEg4 qvBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=R3g4w2GDtR7QylGU3LlZvxoKxtS5Vnhfs+azpWnhjiI=; b=e2Q6+9sDvw1rxHV0HLLwsiwrKE/2b+VF+AHPkFjWEuGzmMeVBmLBvWjPuBMurh0Wqb VwCXzCq6IJ8m5KrqS3/kLo3uOJHEmhBeYKB0lPu/ukQFlVwO0cuIl8oowT7D+4OAXk8o LqTTbe4Lntje8bSlP1l6eM/Hk+oXkv0ZEF0Xtp9ADgx/dflr4Mybvjkvf98ZXmXsrIp4 Xoh8rR4UIc0cOXQZ+gnFQHZ+xv6fGHy8mb0PgOBo7bIURIwmNgXYWRBVPtaMGDzwQjUY EKPbxUiKY2q7h0PCn2DbP59z/yR9KynZhgcWDQo75p83WVh84prDSwZEsB3u2tYQgit+ 2+Jw== X-Gm-Message-State: AOAM531zKBr+F/ZKSmfxCrdmLX/cOx3Ur/maSpOOff8SB7lJ8UfIqfRX hGsL3qkvKyJFIs/obfbsU+1kiPeKleIuHm3R X-Google-Smtp-Source: ABdhPJysqFPMG5vrjXJWJ7RUQ6xrOuEonDu+2SJ/UWmVX3a8Lag14wX9IqG1/yjepqf4juXiX1YhDg== X-Received: by 2002:a05:6638:304a:: with SMTP id u10mr4737832jak.62.1630443127972; Tue, 31 Aug 2021 13:52:07 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id y10sm10090309ilv.35.2021.08.31.13.52.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:07 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:07 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 10/27] midx: close linked MIDXs, avoid leaking memory Message-ID: <6f5ca446f3820ae433c9fcbd7439f92a44d0eb2c.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When a repository has at least one alternate, the MIDX belonging to each alternate is accessed through the `next` pointer on the main object store's copy of the MIDX. close_midx() didn't bother to close any of the linked MIDXs. It likewise didn't free the memory pointed to by `m`, leaving uninitialized bytes with live pointers to them left around in the heap. Clean this up by closing linked MIDXs, and freeing up the memory pointed to by each of them. When callers call close_midx(), then they can discard the entire linked list of MIDXs and set their pointer to the head of that list to NULL. This isn't strictly required for the upcoming patches, but it makes it much more difficult (though still possible, for e.g., by calling `close_midx(m->next)` which leaves `m->next` pointing at uninitialized bytes) to have pointers to uninitialized memory. Signed-off-by: Taylor Blau --- midx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/midx.c b/midx.c index 67de1dbaeb..e83f22b5ee 100644 --- a/midx.c +++ b/midx.c @@ -195,6 +195,8 @@ void close_midx(struct multi_pack_index *m) if (!m) return; + close_midx(m->next); + munmap((unsigned char *)m->data, m->data_len); for (i = 0; i < m->num_packs; i++) { @@ -203,6 +205,7 @@ void close_midx(struct multi_pack_index *m) } FREE_AND_NULL(m->packs); FREE_AND_NULL(m->pack_names); + free(m); } int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id) From patchwork Tue Aug 31 20:52:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDF83C432BE for ; Tue, 31 Aug 2021 20:52:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A5C5A61026 for ; Tue, 31 Aug 2021 20:52:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241301AbhHaUxN (ORCPT ); Tue, 31 Aug 2021 16:53:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42018 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241271AbhHaUxH (ORCPT ); Tue, 31 Aug 2021 16:53:07 -0400 Received: from mail-io1-xd2f.google.com (mail-io1-xd2f.google.com [IPv6:2607:f8b0:4864:20::d2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 266B0C061764 for ; Tue, 31 Aug 2021 13:52:11 -0700 (PDT) Received: by mail-io1-xd2f.google.com with SMTP id g9so741994ioq.11 for ; Tue, 31 Aug 2021 13:52:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Xml83dDhotBELRc4X2XFpr0Il4GpSkXu+BGRGXaC2ao=; b=o23Nxaf1rQuwQoOE4ljzNL4R1PENhvSRgfq8v6uxRfaD642GahnddzlKZs+sqrGioJ SYNqcmZku9DK0SdVCHDXSDDKa5FUuW67pnLqZQVpQTN1NqTMUi6lyuon1PsJSbcwzt4K heX9dcYtl8Bc9wUXZmxTjX3jp0h1bG0hYkAFVJqTxFoXncytPW2IWvSmx+7kym5hOeqq og/s/v5Vzz9F+FT1L8LJKPmxMJQVhynGa71HCkVzAdpAZQX4XPCNAgeYZun0e7/rIdmx 7V4Z8yyhHauKYg18Nu7WbEOO05pjlNN3vMCMAbQ+R+s82o69R3GB2Z2iU4LEhtFzph+5 WlQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Xml83dDhotBELRc4X2XFpr0Il4GpSkXu+BGRGXaC2ao=; b=oH1ahuETbJ3gzo1Zcj7/LoK4dt0LG/OYKHZKAEnjlU0h36fJSbT8h0fHXYkQX2mKfj 3ajEWEZUOccvehWoiTXCY6N9kAIE+NDXA7JXOW1DQNKZiWHFPXW0eNz5qUTHBLLn+NTA dyWwWPOSYvWOEne4v/dLDDQIzwwcSWbn2PfLH2q4+GGq55Y+I7MhDZRsIh9HTKWkWC4a 7XBKpx/c8yll7WRyIaXQ7a54IHIIR3IsaOXmMoag62jfkfQ4bQ6m9k1dq1rU/i9CD5QR SnlGjhKGql1grWwBemYjlSLnAkUK0m7fSLrVZ9u4czGSKnHpTknh0OQLQ7RWGlpotO3G JzMQ== X-Gm-Message-State: AOAM530wP8Mh8yR5WwmXwUBCVhiT37Nw8RxJub0qrxSUtwS7etPsaCjT YBBtyQT6ATYU3jWad6N0Rf5whIcZOEr6CbVR X-Google-Smtp-Source: ABdhPJwh4tIQZkunH2hxD/s60jrxSjeHj9hzoA1LqwXD2N0lOE13eWixb1NFT9oX7AOh8M+PEPOCcA== X-Received: by 2002:a02:c7d2:: with SMTP id s18mr4699811jao.22.1630443130431; Tue, 31 Aug 2021 13:52:10 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id f6sm10893105ilr.77.2021.08.31.13.52.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:10 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:09 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 11/27] midx: avoid opening multiple MIDXs when writing Message-ID: <4656608f7350a059748c75c5ed8731cb66a52d9b.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Opening multiple instance of the same MIDX can lead to problems like two separate packed_git structures which represent the same pack being added to the repository's object store. The above scenario can happen because prepare_midx_pack() checks if `m->packs[pack_int_id]` is NULL in order to determine if a pack has been opened and installed in the repository before. But a caller can construct two copies of the same MIDX by calling get_multi_pack_index() and load_multi_pack_index() since the former manipulates the object store directly but the latter is a lower-level routine which allocates a new MIDX for each call. So if prepare_midx_pack() is called on multiple MIDXs with the same pack_int_id, then that pack will be installed twice in the object store's packed_git pointer. This can lead to problems in, for e.g., the pack-bitmap code, which does something like the following (in pack-bitmap.c:open_pack_bitmap()): struct bitmap_index *bitmap_git = ...; for (p = get_all_packs(r); p; p = p->next) { if (open_pack_bitmap_1(bitmap_git, p) == 0) ret = 0; } which is a problem if two copies of the same pack exist in the packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a conditional like the following: if (bitmap_git->pack || bitmap_git->midx) { /* ignore extra bitmap file; we can only handle one */ warning("ignoring extra bitmap file: %s", packfile->pack_name); close(fd); return -1; } Avoid this scenario by not letting write_midx_internal() open a MIDX that isn't also pointed at by the object store. So long as this is the case, other routines should prefer to open MIDXs with get_multi_pack_index() or reprepare_packed_git() instead of creating instances on their own. Because get_multi_pack_index() returns `r->object_store->multi_pack_index` if it is non-NULL, we'll only have one instance of a MIDX open at one time, avoiding these problems. To encourage this, drop the `struct multi_pack_index *` parameter from `write_midx_internal()`, and rely instead on the `object_dir` to find (or initialize) the correct MIDX instance. Likewise, replace the call to `close_midx()` with `close_object_store()`, since we're about to replace the MIDX with a new one and should invalidate the object store's memory of any MIDX that might have existed beforehand. Note that this now forbids passing object directories that don't belong to alternate repositories over `--object-dir`, since before we would have happily opened a MIDX in any directory, but now restrict ourselves to only those reachable by `r->objects->multi_pack_index` (and alternate MIDXs that we can see by walking the `next` pointer). As far as I can tell, supporting arbitrary directories with `--object-dir` was a historical accident, since even the documentation says `` when referring to the value passed to this option. A future patch could clean this up and provide a warning() when a non-alternate directory was given, since we'll still write a new MIDX there, we just won't reuse any MIDX that might happen to already exist in that directory. Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 2 ++ midx.c | 26 +++++++++++++++----------- 2 files changed, 17 insertions(+), 11 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index c9b063d31e..0af6beb2dd 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -23,6 +23,8 @@ OPTIONS Use given directory for the location of Git objects. We check `/packs/multi-pack-index` for the current MIDX file, and `/packs` for the pack-files to index. ++ +`` must be an alternate of the current repository. --[no-]progress:: Turn progress on/off explicitly. If neither is specified, progress is diff --git a/midx.c b/midx.c index e83f22b5ee..43510290dc 100644 --- a/midx.c +++ b/midx.c @@ -893,7 +893,7 @@ static int midx_checksum_valid(struct multi_pack_index *m) return hashfile_checksum_valid(m->data, m->data_len); } -static int write_midx_internal(const char *object_dir, struct multi_pack_index *m, +static int write_midx_internal(const char *object_dir, struct string_list *packs_to_drop, const char *preferred_pack_name, unsigned flags) @@ -904,6 +904,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * struct hashfile *f = NULL; struct lock_file lk; struct write_midx_context ctx = { 0 }; + struct multi_pack_index *cur; int pack_name_concat_len = 0; int dropped_packs = 0; int result = 0; @@ -914,10 +915,12 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * die_errno(_("unable to create leading directories of %s"), midx_name); - if (m) - ctx.m = m; - else - ctx.m = load_multi_pack_index(object_dir, 1); + for (cur = get_multi_pack_index(the_repository); cur; cur = cur->next) { + if (!strcmp(object_dir, cur->object_dir)) { + ctx.m = cur; + break; + } + } if (ctx.m && !midx_checksum_valid(ctx.m)) { warning(_("ignoring existing multi-pack-index; checksum mismatch")); @@ -1119,7 +1122,7 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index * f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); if (ctx.m) - close_midx(ctx.m); + close_object_store(the_repository->objects); if (ctx.nr - dropped_packs == 0) { error(_("no pack files to index.")); @@ -1182,8 +1185,7 @@ int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags) { - return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name, - flags); + return write_midx_internal(object_dir, NULL, preferred_pack_name, flags); } struct clear_midx_data { @@ -1461,8 +1463,10 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla free(count); - if (packs_to_drop.nr) - result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags); + if (packs_to_drop.nr) { + result = write_midx_internal(object_dir, &packs_to_drop, NULL, flags); + m = NULL; + } string_list_clear(&packs_to_drop, 0); return result; @@ -1651,7 +1655,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, goto cleanup; } - result = write_midx_internal(object_dir, m, NULL, NULL, flags); + result = write_midx_internal(object_dir, NULL, NULL, flags); m = NULL; cleanup: From patchwork Tue Aug 31 20:52:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95776C4320E for ; Tue, 31 Aug 2021 20:52:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 77A0860ED4 for ; Tue, 31 Aug 2021 20:52:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241315AbhHaUxV (ORCPT ); Tue, 31 Aug 2021 16:53:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241262AbhHaUxJ (ORCPT ); Tue, 31 Aug 2021 16:53:09 -0400 Received: from mail-il1-x12d.google.com (mail-il1-x12d.google.com [IPv6:2607:f8b0:4864:20::12d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93887C061575 for ; Tue, 31 Aug 2021 13:52:13 -0700 (PDT) Received: by mail-il1-x12d.google.com with SMTP id i13so716206ilm.4 for ; Tue, 31 Aug 2021 13:52:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=R/GeToTgrlnRlpxAA4v1x0rBIZOyzKeR/oh9triRL5M=; b=USuudZBS+OJtDF4MSyVG6xKbcU0IKYENkkLtnxD9T945vCFJn/i2MSbQ1Up9uNa9oW ZoUDeRhL13o6TJNPv7fXAXm0UWROww+lBNCMoYV0Hj+ontrXu0cjaGkLttNtqBkJicuA ocRJ4YoiinMT/wCcuOWwLa6s/Bsf1z3+dmWGqKsBz5ruPVOpwx/AdNUjv7qiwbwTxlhI Rb3tc6SNg0MpfpKVrES9D8tu2L+c1ifxFLV91DTYVHKraETP/VmKAWpSrZSIzuPeqD51 InAITx5UFg7nMFQ0r7ijBD3Puo5r8EdlUi7kywVT/oc6vAIjYIzjLWdlsxcq3BtykA6F 68PQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=R/GeToTgrlnRlpxAA4v1x0rBIZOyzKeR/oh9triRL5M=; b=cOKFXUJEnNwO2afmL+FxgReACf6wnVAtCoqmXu2fsrYV6i6nLxdpra2qJhJDGCD9oi WJ/Ez/BbZSK/p4nyZVsVa4bH3HGGneiY8Eoq+OpwBJMoewDIzQJ6XXxX3ufuHQwyh6qm 6/9QDRfD5dHE0p0QfwCz5F16Xu5Okrc+lXnTTry9+vpB2FlTHSg4G5vOSsidVIdO0F7/ jgC2LT5n3eB8CoeHivRZXDyDgXOkNrBESCqe9fKgee+FVm3S5EpIRJLjNgjbAE3G6E6b 2Yk2XQBYUkQLLigTmgOLdLwKJThfXErnMMF3uNEKjYs9s6+qHlo8rFevNZnrh+9JVh8v jThA== X-Gm-Message-State: AOAM5320CR64aZU+4EX8qr1199xBbD+K6OHkK1oKhJdTqUc61qeTPS+D rcRIeuhzLoO/JqWtrxKjEVqzHIQhhOD9oQA7 X-Google-Smtp-Source: ABdhPJzGjfdGNSOOnVZzr2+8z8vyaWMiBJPZjwn1nXCpHbgae04djf/Xftqz4gzRrd86sBBIR2hxlQ== X-Received: by 2002:a92:d20d:: with SMTP id y13mr21431449ily.294.1630443132889; Tue, 31 Aug 2021 13:52:12 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id u16sm10140240iob.41.2021.08.31.13.52.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:12 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:12 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 12/27] pack-bitmap.c: introduce 'bitmap_num_objects()' Message-ID: <4c793df9d13bfec35ca5dc10323a0a91c1ae7269.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A subsequent patch to support reading MIDX bitmaps will be less noisy after extracting a generic function to return how many objects are contained in a bitmap. Signed-off-by: Taylor Blau --- pack-bitmap.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 9b11af87aa..65356f9657 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -136,6 +136,11 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) return b; } +static uint32_t bitmap_num_objects(struct bitmap_index *index) +{ + return index->pack->num_objects; +} + static int load_bitmap_header(struct bitmap_index *index) { struct bitmap_disk_header *header = (void *)index->map; @@ -154,7 +159,7 @@ static int load_bitmap_header(struct bitmap_index *index) /* Parse known bitmap format options */ { uint32_t flags = ntohs(header->options); - size_t cache_size = st_mult(index->pack->num_objects, sizeof(uint32_t)); + size_t cache_size = st_mult(bitmap_num_objects(index), sizeof(uint32_t)); unsigned char *index_end = index->map + index->map_size - the_hash_algo->rawsz; if ((flags & BITMAP_OPT_FULL_DAG) == 0) @@ -404,7 +409,7 @@ static inline int bitmap_position_extended(struct bitmap_index *bitmap_git, if (pos < kh_end(positions)) { int bitmap_pos = kh_value(positions, pos); - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } return -1; @@ -456,7 +461,7 @@ static int ext_index_add_object(struct bitmap_index *bitmap_git, bitmap_pos = kh_value(eindex->positions, hash_pos); } - return bitmap_pos + bitmap_git->pack->num_objects; + return bitmap_pos + bitmap_num_objects(bitmap_git); } struct bitmap_show_data { @@ -673,7 +678,7 @@ static void show_extended_objects(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { struct object *obj; - if (!bitmap_get(objects, bitmap_git->pack->num_objects + i)) + if (!bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) continue; obj = eindex->objects[i]; @@ -832,7 +837,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, * them individually. */ for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == type && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos)) @@ -859,7 +864,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, oi.sizep = &size; - if (pos < pack->num_objects) { + if (pos < bitmap_num_objects(bitmap_git)) { off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; @@ -869,7 +874,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, } } else { struct eindex *eindex = &bitmap_git->ext_index; - struct object *obj = eindex->objects[pos - pack->num_objects]; + struct object *obj = eindex->objects[pos - bitmap_num_objects(bitmap_git)]; if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0) die(_("unable to get size of %s"), oid_to_hex(&obj->oid)); } @@ -911,7 +916,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, } for (i = 0; i < eindex->count; i++) { - uint32_t pos = i + bitmap_git->pack->num_objects; + uint32_t pos = i + bitmap_num_objects(bitmap_git); if (eindex->objects[i]->type == OBJ_BLOB && bitmap_get(to_filter, pos) && !bitmap_get(tips, pos) && @@ -1137,8 +1142,8 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, enum object_type type; unsigned long size; - if (pos >= bitmap_git->pack->num_objects) - return; /* not actually in the pack */ + if (pos >= bitmap_num_objects(bitmap_git)) + return; /* not actually in the pack or MIDX */ offset = header = pack_pos_to_offset(bitmap_git->pack, pos); type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); @@ -1204,6 +1209,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; + uint32_t objects_nr = bitmap_num_objects(bitmap_git); assert(result); @@ -1211,8 +1217,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, i++; /* Don't mark objects not in the packfile */ - if (i > bitmap_git->pack->num_objects / BITS_IN_EWORD) - i = bitmap_git->pack->num_objects / BITS_IN_EWORD; + if (i > objects_nr / BITS_IN_EWORD) + i = objects_nr / BITS_IN_EWORD; reuse = bitmap_word_alloc(i); memset(reuse->words, 0xFF, i * sizeof(eword_t)); @@ -1296,7 +1302,7 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git, for (i = 0; i < eindex->count; ++i) { if (eindex->objects[i]->type == type && - bitmap_get(objects, bitmap_git->pack->num_objects + i)) + bitmap_get(objects, bitmap_num_objects(bitmap_git) + i)) count++; } @@ -1517,7 +1523,7 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; - num_objects = bitmap_git->pack->num_objects; + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); for (i = 0; i < num_objects; ++i) { @@ -1600,7 +1606,6 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; struct eindex *eindex = &bitmap_git->ext_index; off_t total = 0; struct object_info oi = OBJECT_INFO_INIT; @@ -1612,7 +1617,7 @@ static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git) for (i = 0; i < eindex->count; i++) { struct object *obj = eindex->objects[i]; - if (!bitmap_get(result, pack->num_objects + i)) + if (!bitmap_get(result, bitmap_num_objects(bitmap_git) + i)) continue; if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0) From patchwork Tue Aug 31 20:52:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B410AC43214 for ; Tue, 31 Aug 2021 20:52:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C8CC61053 for ; Tue, 31 Aug 2021 20:52:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241285AbhHaUxX (ORCPT ); Tue, 31 Aug 2021 16:53:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241298AbhHaUxL (ORCPT ); Tue, 31 Aug 2021 16:53:11 -0400 Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 011DEC0617AD for ; Tue, 31 Aug 2021 13:52:16 -0700 (PDT) Received: by mail-il1-x12c.google.com with SMTP id z2so812965iln.0 for ; Tue, 31 Aug 2021 13:52:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=zq/v/JdXgximi7tjH/gLOQeRUsYSH2LAxO/CeqbSkrw=; b=OAlMxIP0bsyUzTf1mdZYs6sb9hmbjnrLeHVv74phJ+ZYbSIJAWZ6eXxJJvFjaerk2D b3yt49v6Pa9aqJ8WP6CiNlSQ5gXkL2BnJgvyPwcjqpRcZgboHZ86FRumm2Hp5ULjyp9P B1jORlVXDjawGEYMNWqVu2DfvZOdfUCoIYrDoNF44Ggn+qsQkD+e6ldU0v25v2YSzZf0 fJabQoHUnLnfIVxe0TmJTrzBs+lDDBq0iMw6JryTYu8QDBWE483WB9CClf7ZhbjAtoC1 PzcVTMJ3RlupUe5lrK1rLufSIEvV+V+ObESe6UxF122lWUOwsNlLfGqx+ocy0jr9TNSv /Vdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=zq/v/JdXgximi7tjH/gLOQeRUsYSH2LAxO/CeqbSkrw=; b=Bx6VrZDSBwu8Pz2HfvnzyojA7XobuYnDLfvQaMb9oi84OKv6asTrRF9CL4qyJoggjl wcK6XLgGyqMTQaJCBzdrJc1wb4l5chP7vQ4Z0Ol588m5FF+mHoYkmOHqUQdCMMHPu/E3 88aE5PEJqjdlEaGluRHFlRvvHqPA9f1rd3GvIUp3OfUW4/8oGcwXPOxnFvg7vvWgL7+P vh/BYdAzOAnoo+0ubjb8jHLJCLXgs2ME+gx8bYdss9WxPZ85KGchJfVpt/sRBw0CeEMT VTnzvScD+GKRxGS8hXsmhZqoPcGlj7LHRBMjt/1YPfYsGOu+MKGgN51jw+zqg1lMsLb9 LY5w== X-Gm-Message-State: AOAM530gZiFWJSJA7mkM2c4lqeGFmZtRZBeePzeHJawpco4R6rFRSs7v JTNvUNxubaBK4iTmjf7HW1vdEUW2iZRD/6bv X-Google-Smtp-Source: ABdhPJwgPp1TNWnzd7ubnSrEHCg74zHhzRscd9C8HPO6rP6jIbo4fv6SYqLNh0ABzBWnatXk4W2wAA== X-Received: by 2002:a92:ca89:: with SMTP id t9mr22459679ilo.178.1630443135301; Tue, 31 Aug 2021 13:52:15 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id x15sm10715540ilp.23.2021.08.31.13.52.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:15 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:14 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 13/27] pack-bitmap.c: introduce 'nth_bitmap_object_oid()' Message-ID: <9f165037ce5e002fc7f4cc8d41969693f2376772.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A subsequent patch to support reading MIDX bitmaps will be less noisy after extracting a generic function to fetch the nth OID contained in the bitmap. Signed-off-by: Taylor Blau --- pack-bitmap.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 65356f9657..612f62da97 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -223,6 +223,13 @@ static inline uint8_t read_u8(const unsigned char *buffer, size_t *pos) #define MAX_XOR_OFFSET 160 +static int nth_bitmap_object_oid(struct bitmap_index *index, + struct object_id *oid, + uint32_t n) +{ + return nth_packed_object_id(oid, index->pack, n); +} + static int load_bitmap_entries_v1(struct bitmap_index *index) { uint32_t i; @@ -242,7 +249,7 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) xor_offset = read_u8(index->map, &index->map_pos); flags = read_u8(index->map, &index->map_pos); - if (nth_packed_object_id(&oid, index->pack, commit_idx_pos) < 0) + if (nth_bitmap_object_oid(index, &oid, commit_idx_pos) < 0) return error("corrupt ewah bitmap: commit index %u out of range", (unsigned)commit_idx_pos); @@ -868,8 +875,8 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, off_t ofs = pack_pos_to_offset(pack, pos); if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; - nth_packed_object_id(&oid, pack, - pack_pos_to_index(pack, pos)); + nth_bitmap_object_oid(bitmap_git, &oid, + pack_pos_to_index(pack, pos)); die(_("unable to get size of %s"), oid_to_hex(&oid)); } } else { From patchwork Tue Aug 31 20:52:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37DF9C432BE for ; Tue, 31 Aug 2021 20:52:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1840461026 for ; Tue, 31 Aug 2021 20:52:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241317AbhHaUxd (ORCPT ); Tue, 31 Aug 2021 16:53:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241312AbhHaUxV (ORCPT ); Tue, 31 Aug 2021 16:53:21 -0400 Received: from mail-io1-xd2f.google.com (mail-io1-xd2f.google.com [IPv6:2607:f8b0:4864:20::d2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53053C061292 for ; Tue, 31 Aug 2021 13:52:18 -0700 (PDT) Received: by mail-io1-xd2f.google.com with SMTP id n24so752756ion.10 for ; Tue, 31 Aug 2021 13:52:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=8NYgMRF3jBTzW2/0/2YVZB3StET0438Dob1HTkgbmY8=; b=Zz765yJPpVUdPbDQJSzrY2H+3s3L73YkbwJ8h4TKjPFatTi50wPtfDn4y1y7lAI1UA SuU9u4yQJVQTGhFXKPgZMRm/1AfRF1iIFDC2uqT8//r60pe8yfvVcGcWPEA5yuUBTNbo DpXB80OkxV/yyU5GL7y3IRa5i1df92lNfubb0/f5e9FF07xLX/3Qwp0ht5ONCAYhNTfM gDWziymo8K6sRQHD5e+dqt5GXWKiZDQafgE2rsmYOd0q1XjFiJM3etSVgEWFqvshCNQ8 8RQOVwUZ6Ciz7f9hzMdMmXkIO7jH8vYoTzUFFI+W/+VR1m6UMB4TFeUZ6zQ2HJVkGRLM yjqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=8NYgMRF3jBTzW2/0/2YVZB3StET0438Dob1HTkgbmY8=; b=XImtIRbb4RW4LpvmICyYLJP+M1g417c631brM/W09O3LBafUpKZYSurxpNV2QVqbSD 8hX9V2Lu5JIfJZXOfc0pnnMIW1BhS29uABcfVYf9aDG346z0OrwBclKP3Gs2+NavwTyH v8jypjOMkyf/CStB/eS/oY6xHbThkjg8AufhK47gIiD7VHSl0uSPGwGOORhszdt9g8KJ PVEoJk8RLz/SHxVr1rtFdJXI5G1nyJc3eaOJOJ8pRJ19aOhSP3wD2RfDe+iU84pzD2Px W5k9i8tEw2NS+tm+fTBojiYeXwJOXSukk6mBgvN7p3giSnsSdFuk4AEr5uwY173Po9cl w+cw== X-Gm-Message-State: AOAM5302Kwp68M8XyJN5nPJdKaOOJTXa2Lf8YtN1X1pDD73cWbNX4nQA vxSJzBPuhZpbrkp2EpPege0yIzlznrGKM3Qc X-Google-Smtp-Source: ABdhPJyfyLriTzL1Db9tZ7LVymJlRvD36pXFuq/tn0MSVSXAH0U5fEYFJz++AEJyFq9SR908UfOnKA== X-Received: by 2002:a05:6602:2436:: with SMTP id g22mr24683053iob.109.1630443137647; Tue, 31 Aug 2021 13:52:17 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id b12sm10938978ios.0.2021.08.31.13.52.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:17 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:16 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 14/27] pack-bitmap.c: introduce 'bitmap_is_preferred_refname()' Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In a recent commit, pack-objects learned support for the 'pack.preferBitmapTips' configuration. This patch prepares the multi-pack bitmap code to respect this configuration, too. The yet-to-be implemented code will find that it is more efficient to check whether each reference contains a prefix found in the configured set of values rather than doing an additional traversal. Implement a function 'bitmap_is_preferred_refname()' which will perform that check. Its caller will be added in a subsequent patch. Signed-off-by: Taylor Blau --- pack-bitmap.c | 16 ++++++++++++++++ pack-bitmap.h | 1 + 2 files changed, 17 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 612f62da97..d5296750eb 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1658,3 +1658,19 @@ const struct string_list *bitmap_preferred_tips(struct repository *r) { return repo_config_get_value_multi(r, "pack.preferbitmaptips"); } + +int bitmap_is_preferred_refname(struct repository *r, const char *refname) +{ + const struct string_list *preferred_tips = bitmap_preferred_tips(r); + struct string_list_item *item; + + if (!preferred_tips) + return 0; + + for_each_string_list_item(item, preferred_tips) { + if (starts_with(refname, item->string)) + return 1; + } + + return 0; +} diff --git a/pack-bitmap.h b/pack-bitmap.h index 020cd8d868..52ea10de51 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -94,5 +94,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint16_t options); const struct string_list *bitmap_preferred_tips(struct repository *r); +int bitmap_is_preferred_refname(struct repository *r, const char *refname); #endif From patchwork Tue Aug 31 20:52:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467825 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4620C432BE for ; Tue, 31 Aug 2021 20:52:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AD2B46103A for ; Tue, 31 Aug 2021 20:52:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241326AbhHaUxf (ORCPT ); Tue, 31 Aug 2021 16:53:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241322AbhHaUxV (ORCPT ); Tue, 31 Aug 2021 16:53:21 -0400 Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BAA1AC061575 for ; Tue, 31 Aug 2021 13:52:20 -0700 (PDT) Received: by mail-il1-x12c.google.com with SMTP id s16so676642ilo.9 for ; Tue, 31 Aug 2021 13:52:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=zVk2Sv4JxjtOV0AzqifItb828qzQjqxuFrEIv8yMvE8=; b=TQwNsEQEtjzmIg/cLUXvmqUtT9DNNZ1e0ZM60fOf4XjeprJwq8JQYYX67UIVW3Ildn N901mv29ekdPbtua19nXb35WeZEA02Gza8EHo29rXg9Nq/SD8ckuo6tikJQGbQPE+2hm scbE7Qj458bRERbnQIL+vYy9/YFE5r53om9GsyQPB7Si4L3iwniS63gjxq8wJp7kRh2c yyCA7L7eqN4bOnxkkR+X9/D3Gx8f/NJDWiQd0FADBT4RHJYYqE1V24FJ9vn1PeVlEr50 9WkAeunQ6/ijIrfmcIh6uAikLy/hNw6Jp/QdfSuzv/wjSQXiFppHedubiGoM8uzexIwy XCIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=zVk2Sv4JxjtOV0AzqifItb828qzQjqxuFrEIv8yMvE8=; b=M5wy7Mti0qtAsLqPiSMfABF/dfAiVHHadvtye2Esy2S9K8wci0qUdFfzHHN7JoWOQR ni1d+ijK9ReJqYzk+O1hccRnDqk7OJKA7bD3wYYWNvuESjKPF0RZK/FQOrJZJZMz0fU5 36rIAZ/P71pYl/EC4HLF/oEyq8Cry4E2hOmiV+gzH4LUtRruvpChN+dfkSMd6JIwjbmb ebJZgdj+N0KbwSd9Ry1pKh+WxgkbxzEJCd+ZC7WGX3QRdGpgwB/kWPSd2Uv+D5MtAnqF umKMxCwZUxEyzOFpxTrNaP7FDt56wqLRyzQed8yTbdRlWMS5JJ2x9W+l7rTUpzu/s5LY YuKg== X-Gm-Message-State: AOAM5334Gat4VvXtqhkRwrq1HJXc6Q3fPdqql7LMlv0WOTx1//QUt39s DxOoJcHBKXoxGfSio9J2ouvt7jgc3RvFlhY/ X-Google-Smtp-Source: ABdhPJzWb5HxLJOn+aqiBsF+sh3+7y5R+bdqKsRgzB5xY+S6y28XK/19lV94Xz3VSByO2zuRvEI+gA== X-Received: by 2002:a92:cb4d:: with SMTP id f13mr22090863ilq.220.1630443140075; Tue, 31 Aug 2021 13:52:20 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s5sm10472581iol.33.2021.08.31.13.52.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:19 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:19 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 15/27] pack-bitmap.c: avoid redundant calls to try_partial_reuse Message-ID: <06db8dbbc163ac41f0bd6e45b83e684660ca2746.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org try_partial_reuse() is used to mark any bits in the beginning of a bitmap whose objects can be reused verbatim from the pack they came from. Currently this function returns void, and signals nothing to the caller when bits could not be reused. But multi-pack bitmaps would benefit from having such a signal, because they may try to pass objects which are in bounds, but from a pack other than the preferred one. Any extra calls are noops because of a conditional in reuse_partial_packfile_from_bitmap(), but those loop iterations can be avoided by letting try_partial_reuse() indicate when it can't accept any more bits for reuse, and then listening to that signal. Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- pack-bitmap.c | 40 +++++++++++++++++++++++++++++----------- 1 file changed, 29 insertions(+), 11 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index d5296750eb..4e37f5d574 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1140,22 +1140,26 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, return NULL; } -static void try_partial_reuse(struct bitmap_index *bitmap_git, - size_t pos, - struct bitmap *reuse, - struct pack_window **w_curs) +/* + * -1 means "stop trying further objects"; 0 means we may or may not have + * reused, but you can keep feeding bits. + */ +static int try_partial_reuse(struct bitmap_index *bitmap_git, + size_t pos, + struct bitmap *reuse, + struct pack_window **w_curs) { off_t offset, header; enum object_type type; unsigned long size; if (pos >= bitmap_num_objects(bitmap_git)) - return; /* not actually in the pack or MIDX */ + return -1; /* not actually in the pack or MIDX */ offset = header = pack_pos_to_offset(bitmap_git->pack, pos); type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); if (type < 0) - return; /* broken packfile, punt */ + return -1; /* broken packfile, punt */ if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) { off_t base_offset; @@ -1172,9 +1176,9 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, base_offset = get_delta_base(bitmap_git->pack, w_curs, &offset, type, header); if (!base_offset) - return; + return 0; if (offset_to_pack_pos(bitmap_git->pack, base_offset, &base_pos) < 0) - return; + return 0; /* * We assume delta dependencies always point backwards. This @@ -1186,7 +1190,7 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, * odd parameters. */ if (base_pos >= pos) - return; + return 0; /* * And finally, if we're not sending the base as part of our @@ -1197,13 +1201,14 @@ static void try_partial_reuse(struct bitmap_index *bitmap_git, * object_entry code path handle it. */ if (!bitmap_get(reuse, base_pos)) - return; + return 0; } /* * If we got here, then the object is OK to reuse. Mark it. */ bitmap_set(reuse, pos); + return 0; } int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, @@ -1239,10 +1244,23 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, break; offset += ewah_bit_ctz64(word >> offset); - try_partial_reuse(bitmap_git, pos + offset, reuse, &w_curs); + if (try_partial_reuse(bitmap_git, pos + offset, reuse, + &w_curs) < 0) { + /* + * try_partial_reuse indicated we couldn't reuse + * any bits, so there is no point in trying more + * bits in the current word, or any other words + * in result. + * + * Jump out of both loops to avoid future + * unnecessary calls to try_partial_reuse. + */ + goto done; + } } } +done: unuse_pack(&w_curs); *entries = bitmap_popcount(reuse); From patchwork Tue Aug 31 20:52:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F22EBC4320A for ; Tue, 31 Aug 2021 20:52:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DE6D660ED4 for ; Tue, 31 Aug 2021 20:52:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241319AbhHaUxe (ORCPT ); Tue, 31 Aug 2021 16:53:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42108 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241327AbhHaUxV (ORCPT ); Tue, 31 Aug 2021 16:53:21 -0400 Received: from mail-io1-xd30.google.com (mail-io1-xd30.google.com [IPv6:2607:f8b0:4864:20::d30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A68D2C0612A7 for ; Tue, 31 Aug 2021 13:52:23 -0700 (PDT) Received: by mail-io1-xd30.google.com with SMTP id b10so764222ioq.9 for ; Tue, 31 Aug 2021 13:52:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=CFCC1BXGCVDgg84FP4CQpcCdwosXuE7A34ek0ZMYfdI=; b=wJdZheh8xrt74r+W/OHNIJDY4dZf7M0Yovwkk/of+kQLELKlWU79sLOnGWax9Hm/bQ tNJSbVq3U2LuWmYYQRH2ejRmbdxd6zJKZesDDD0bTNwP19MV3V+smRS9SgKUIQgUlyoj 8+dK0+lkERQHJ9r1zg0sEKFFY9+KSk6EbuAYOxVwubrFHG/lAjLM/uxa1IGCY0NXuh+c mLHyYJvpS583l1WZ2pI/VTNvafQ/F9aVg67Or33lYs97xPfTRJpbCwGpgOqbt9HnSbhR 6yluMmonqOPeKMGSE/dNg7BZmNc5z8P0jLbeE/f+2z92H2wfItG07CoJW/nvrzP9L8CI IiUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=CFCC1BXGCVDgg84FP4CQpcCdwosXuE7A34ek0ZMYfdI=; b=Y9qzrSO63szlesy6F2xZJXxzJbJyFNPaA6bRzwWT3eJhsMvABpbbaI5MgO3q82XbV0 Bu2Ou40K+RnNJAD+oxNyIPMr65W7ZhcLd/5g6hmud24N0PH/CPtjjV9zfHygLv0l9Ug/ 9ijdtc9tUIxny5UAawHLf6Yixq64fRMJ9Q6haOVykNqYbgYwp8a++OohkoG5NGahxFzk Jgnv8SOvblcBubRVYds8TqAUCtvnGGVttw3dQNI/0DnbJkbn5NCDZ4e1974K4SA+ueca mwc7ucY+tkUlMuAL4TZLVb9FSMg9arJPdefXMkH5KGJ6qfO9jQIfhJcsy1l+DdkfB700 gpwQ== X-Gm-Message-State: AOAM531za6QEPzoflraZHG+hRXMB9Ig69y6iJAtgItBMQjiLpHAzLGqf eLnAi1XjFCDTuorEaaWGNsdQAox4g5y5GrDb X-Google-Smtp-Source: ABdhPJwr+6aiYta0YtFeQGyzRItZvnrIKz2bcENvz//OU1gHShrSijOtRN5wZQgMIC792WILI4G+kQ== X-Received: by 2002:a5d:9ada:: with SMTP id x26mr23488262ion.50.1630443142559; Tue, 31 Aug 2021 13:52:22 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 12sm10673837ilq.37.2021.08.31.13.52.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:22 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:21 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 16/27] pack-bitmap: read multi-pack bitmaps Message-ID: <61798853b6c5bec0ecc92d8aacc6c676441840d3.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This prepares the code in pack-bitmap to interpret the new multi-pack bitmaps described in Documentation/technical/bitmap-format.txt, which mostly involves converting bit positions to accommodate looking them up in a MIDX. Note that there are currently no writers who write multi-pack bitmaps, and that this will be implemented in the subsequent commit. Note also that get_midx_checksum() and get_midx_filename() are made non-static so they can be called from pack-bitmap.c. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 5 + midx.c | 4 +- midx.h | 2 + pack-bitmap-write.c | 2 +- pack-bitmap.c | 357 ++++++++++++++++++++++++++++++++++++----- pack-bitmap.h | 6 + packfile.c | 2 +- 7 files changed, 336 insertions(+), 42 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index b63e06e46c..c26dedfe5d 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1124,6 +1124,11 @@ static void write_reused_pack(struct hashfile *f) break; offset += ewah_bit_ctz64(word >> offset); + /* + * Can use bit positions directly, even for MIDX + * bitmaps. See comment in try_partial_reuse() + * for why. + */ write_reused_pack_one(pos + offset, f, &w_curs); display_progress(progress_state, ++written); } diff --git a/midx.c b/midx.c index 43510290dc..6a10f7a042 100644 --- a/midx.c +++ b/midx.c @@ -48,12 +48,12 @@ static uint8_t oid_version(void) } } -static const unsigned char *get_midx_checksum(struct multi_pack_index *m) +const unsigned char *get_midx_checksum(struct multi_pack_index *m) { return m->data + m->data_len - the_hash_algo->rawsz; } -static char *get_midx_filename(const char *object_dir) +char *get_midx_filename(const char *object_dir) { return xstrfmt("%s/pack/multi-pack-index", object_dir); } diff --git a/midx.h b/midx.h index 8684cf0fef..1172df1a71 100644 --- a/midx.h +++ b/midx.h @@ -42,6 +42,8 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) #define MIDX_WRITE_REV_INDEX (1 << 1) +const unsigned char *get_midx_checksum(struct multi_pack_index *m); +char *get_midx_filename(const char *object_dir); char *get_midx_rev_filename(struct multi_pack_index *m); struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c index 142fd0adb8..9c55c1531e 100644 --- a/pack-bitmap-write.c +++ b/pack-bitmap-write.c @@ -48,7 +48,7 @@ void bitmap_writer_show_progress(int show) } /** - * Build the initial type index for the packfile + * Build the initial type index for the packfile or multi-pack-index */ void bitmap_writer_build_type_index(struct packing_data *to_pack, struct pack_idx_entry **index, diff --git a/pack-bitmap.c b/pack-bitmap.c index 4e37f5d574..fa69ed7a6d 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -13,6 +13,7 @@ #include "repository.h" #include "object-store.h" #include "list-objects-filter-options.h" +#include "midx.h" #include "config.h" /* @@ -35,8 +36,15 @@ struct stored_bitmap { * the active bitmap index is the largest one. */ struct bitmap_index { - /* Packfile to which this bitmap index belongs to */ + /* + * The pack or multi-pack index (MIDX) that this bitmap index belongs + * to. + * + * Exactly one of these must be non-NULL; this specifies the object + * order used to interpret this bitmap. + */ struct packed_git *pack; + struct multi_pack_index *midx; /* * Mark the first `reuse_objects` in the packfile as reused: @@ -71,6 +79,9 @@ struct bitmap_index { /* If not NULL, this is a name-hash cache pointing into map. */ uint32_t *hashes; + /* The checksum of the packfile or MIDX; points into map. */ + const unsigned char *checksum; + /* * Extended index. * @@ -138,6 +149,8 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index) static uint32_t bitmap_num_objects(struct bitmap_index *index) { + if (index->midx) + return index->midx->num_objects; return index->pack->num_objects; } @@ -175,6 +188,7 @@ static int load_bitmap_header(struct bitmap_index *index) } index->entry_count = ntohl(header->entry_count); + index->checksum = header->checksum; index->map_pos += header_size; return 0; } @@ -227,6 +241,8 @@ static int nth_bitmap_object_oid(struct bitmap_index *index, struct object_id *oid, uint32_t n) { + if (index->midx) + return nth_midxed_object_oid(oid, index->midx, n) ? 0 : -1; return nth_packed_object_id(oid, index->pack, n); } @@ -274,7 +290,14 @@ static int load_bitmap_entries_v1(struct bitmap_index *index) return 0; } -static char *pack_bitmap_filename(struct packed_git *p) +char *midx_bitmap_filename(struct multi_pack_index *midx) +{ + return xstrfmt("%s-%s.bitmap", + get_midx_filename(midx->object_dir), + hash_to_hex(get_midx_checksum(midx))); +} + +char *pack_bitmap_filename(struct packed_git *p) { size_t len; @@ -283,6 +306,57 @@ static char *pack_bitmap_filename(struct packed_git *p) return xstrfmt("%.*s.bitmap", (int)len, p->pack_name); } +static int open_midx_bitmap_1(struct bitmap_index *bitmap_git, + struct multi_pack_index *midx) +{ + struct stat st; + char *idx_name = midx_bitmap_filename(midx); + int fd = git_open(idx_name); + + free(idx_name); + + if (fd < 0) + return -1; + + if (fstat(fd, &st)) { + close(fd); + return -1; + } + + if (bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ + warning("ignoring extra bitmap file: %s", + get_midx_filename(midx->object_dir)); + close(fd); + return -1; + } + + bitmap_git->midx = midx; + bitmap_git->map_size = xsize_t(st.st_size); + bitmap_git->map_pos = 0; + bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, + MAP_PRIVATE, fd, 0); + close(fd); + + if (load_bitmap_header(bitmap_git) < 0) + goto cleanup; + + if (!hasheq(get_midx_checksum(bitmap_git->midx), bitmap_git->checksum)) + goto cleanup; + + if (load_midx_revindex(bitmap_git->midx) < 0) { + warning(_("multi-pack bitmap is missing required reverse index")); + goto cleanup; + } + return 0; + +cleanup: + munmap(bitmap_git->map, bitmap_git->map_size); + bitmap_git->map_size = 0; + bitmap_git->map = NULL; + return -1; +} + static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git *packfile) { int fd; @@ -304,7 +378,8 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return -1; } - if (bitmap_git->pack) { + if (bitmap_git->pack || bitmap_git->midx) { + /* ignore extra bitmap file; we can only handle one */ warning("ignoring extra bitmap file: %s", packfile->pack_name); close(fd); return -1; @@ -331,13 +406,39 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git return 0; } -static int load_pack_bitmap(struct bitmap_index *bitmap_git) +static int load_reverse_index(struct bitmap_index *bitmap_git) +{ + if (bitmap_is_midx(bitmap_git)) { + uint32_t i; + int ret; + + /* + * The multi-pack-index's .rev file is already loaded via + * open_pack_bitmap_1(). + * + * But we still need to open the individual pack .rev files, + * since we will need to make use of them in pack-objects. + */ + for (i = 0; i < bitmap_git->midx->num_packs; i++) { + if (prepare_midx_pack(the_repository, bitmap_git->midx, i)) + die(_("load_reverse_index: could not open pack")); + ret = load_pack_revindex(bitmap_git->midx->packs[i]); + if (ret) + return ret; + } + return 0; + } + return load_pack_revindex(bitmap_git->pack); +} + +static int load_bitmap(struct bitmap_index *bitmap_git) { assert(bitmap_git->map); bitmap_git->bitmaps = kh_init_oid_map(); bitmap_git->ext_index.positions = kh_init_oid_pos(); - if (load_pack_revindex(bitmap_git->pack)) + + if (load_reverse_index(bitmap_git)) goto failed; if (!(bitmap_git->commits = read_bitmap_1(bitmap_git)) || @@ -381,11 +482,47 @@ static int open_pack_bitmap(struct repository *r, return ret; } +static int open_midx_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *midx; + + assert(!bitmap_git->map); + + for (midx = get_multi_pack_index(r); midx; midx = midx->next) { + if (!open_midx_bitmap_1(bitmap_git, midx)) + return 0; + } + return -1; +} + +static int open_bitmap(struct repository *r, + struct bitmap_index *bitmap_git) +{ + assert(!bitmap_git->map); + + if (!open_midx_bitmap(r, bitmap_git)) + return 0; + return open_pack_bitmap(r, bitmap_git); +} + struct bitmap_index *prepare_bitmap_git(struct repository *r) { struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); - if (!open_pack_bitmap(r, bitmap_git) && !load_pack_bitmap(bitmap_git)) + if (!open_bitmap(r, bitmap_git) && !load_bitmap(bitmap_git)) + return bitmap_git; + + free_bitmap_index(bitmap_git); + return NULL; +} + +struct bitmap_index *prepare_midx_bitmap_git(struct repository *r, + struct multi_pack_index *midx) +{ + struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git)); + + if (!open_midx_bitmap_1(bitmap_git, midx) && !load_bitmap(bitmap_git)) return bitmap_git; free_bitmap_index(bitmap_git); @@ -435,10 +572,26 @@ static inline int bitmap_position_packfile(struct bitmap_index *bitmap_git, return pos; } +static int bitmap_position_midx(struct bitmap_index *bitmap_git, + const struct object_id *oid) +{ + uint32_t want, got; + if (!bsearch_midx(oid, bitmap_git->midx, &want)) + return -1; + + if (midx_to_pack_pos(bitmap_git->midx, want, &got) < 0) + return -1; + return got; +} + static int bitmap_position(struct bitmap_index *bitmap_git, const struct object_id *oid) { - int pos = bitmap_position_packfile(bitmap_git, oid); + int pos; + if (bitmap_is_midx(bitmap_git)) + pos = bitmap_position_midx(bitmap_git, oid); + else + pos = bitmap_position_packfile(bitmap_git, oid); return (pos >= 0) ? pos : bitmap_position_extended(bitmap_git, oid); } @@ -749,6 +902,7 @@ static void show_objects_for_type( continue; for (offset = 0; offset < BITS_IN_EWORD; ++offset) { + struct packed_git *pack; struct object_id oid; uint32_t hash = 0, index_pos; off_t ofs; @@ -758,14 +912,28 @@ static void show_objects_for_type( offset += ewah_bit_ctz64(word >> offset); - index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); - ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); - nth_packed_object_id(&oid, bitmap_git->pack, index_pos); + if (bitmap_is_midx(bitmap_git)) { + struct multi_pack_index *m = bitmap_git->midx; + uint32_t pack_id; + + index_pos = pack_pos_to_midx(m, pos + offset); + ofs = nth_midxed_offset(m, index_pos); + nth_midxed_object_oid(&oid, m, index_pos); + + pack_id = nth_midxed_pack_int_id(m, index_pos); + pack = bitmap_git->midx->packs[pack_id]; + } else { + index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset); + ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset); + nth_bitmap_object_oid(bitmap_git, &oid, index_pos); + + pack = bitmap_git->pack; + } if (bitmap_git->hashes) hash = get_be32(bitmap_git->hashes + index_pos); - show_reach(&oid, object_type, 0, hash, bitmap_git->pack, ofs); + show_reach(&oid, object_type, 0, hash, pack, ofs); } } } @@ -777,8 +945,13 @@ static int in_bitmapped_pack(struct bitmap_index *bitmap_git, struct object *object = roots->item; roots = roots->next; - if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) - return 1; + if (bitmap_is_midx(bitmap_git)) { + if (bsearch_midx(&object->oid, bitmap_git->midx, NULL)) + return 1; + } else { + if (find_pack_entry_one(object->oid.hash, bitmap_git->pack) > 0) + return 1; + } } return 0; @@ -865,14 +1038,26 @@ static void filter_bitmap_blob_none(struct bitmap_index *bitmap_git, static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git, uint32_t pos) { - struct packed_git *pack = bitmap_git->pack; unsigned long size; struct object_info oi = OBJECT_INFO_INIT; oi.sizep = &size; if (pos < bitmap_num_objects(bitmap_git)) { - off_t ofs = pack_pos_to_offset(pack, pos); + struct packed_git *pack; + off_t ofs; + + if (bitmap_is_midx(bitmap_git)) { + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, pos); + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + + pack = bitmap_git->midx->packs[pack_id]; + ofs = nth_midxed_offset(bitmap_git->midx, midx_pos); + } else { + pack = bitmap_git->pack; + ofs = pack_pos_to_offset(pack, pos); + } + if (packed_object_info(the_repository, pack, ofs, &oi) < 0) { struct object_id oid; nth_bitmap_object_oid(bitmap_git, &oid, @@ -1053,7 +1238,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, /* try to open a bitmapped pack, but don't parse it yet * because we may not need to use it */ CALLOC_ARRAY(bitmap_git, 1); - if (open_pack_bitmap(revs->repo, bitmap_git) < 0) + if (open_bitmap(revs->repo, bitmap_git) < 0) goto cleanup; for (i = 0; i < revs->pending.nr; ++i) { @@ -1097,7 +1282,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * from disk. this is the point of no return; after this the rev_list * becomes invalidated and we must perform the revwalk through bitmaps */ - if (load_pack_bitmap(bitmap_git) < 0) + if (load_bitmap(bitmap_git) < 0) goto cleanup; object_array_clear(&revs->pending); @@ -1145,19 +1330,43 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, * reused, but you can keep feeding bits. */ static int try_partial_reuse(struct bitmap_index *bitmap_git, + struct packed_git *pack, size_t pos, struct bitmap *reuse, struct pack_window **w_curs) { - off_t offset, header; + off_t offset, delta_obj_offset; enum object_type type; unsigned long size; - if (pos >= bitmap_num_objects(bitmap_git)) - return -1; /* not actually in the pack or MIDX */ + /* + * try_partial_reuse() is called either on (a) objects in the + * bitmapped pack (in the case of a single-pack bitmap) or (b) + * objects in the preferred pack of a multi-pack bitmap. + * Importantly, the latter can pretend as if only a single pack + * exists because: + * + * - The first pack->num_objects bits of a MIDX bitmap are + * reserved for the preferred pack, and + * + * - Ties due to duplicate objects are always resolved in + * favor of the preferred pack. + * + * Therefore we do not need to ever ask the MIDX for its copy of + * an object by OID, since it will always select it from the + * preferred pack. Likewise, the selected copy of the base + * object for any deltas will reside in the same pack. + * + * This means that we can reuse pos when looking up the bit in + * the reuse bitmap, too, since bits corresponding to the + * preferred pack precede all bits from other packs. + */ - offset = header = pack_pos_to_offset(bitmap_git->pack, pos); - type = unpack_object_header(bitmap_git->pack, w_curs, &offset, &size); + if (pos >= pack->num_objects) + return -1; /* not actually in the pack or MIDX preferred pack */ + + offset = delta_obj_offset = pack_pos_to_offset(pack, pos); + type = unpack_object_header(pack, w_curs, &offset, &size); if (type < 0) return -1; /* broken packfile, punt */ @@ -1173,11 +1382,11 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, * and the normal slow path will complain about it in * more detail. */ - base_offset = get_delta_base(bitmap_git->pack, w_curs, - &offset, type, header); + base_offset = get_delta_base(pack, w_curs, &offset, type, + delta_obj_offset); if (!base_offset) return 0; - if (offset_to_pack_pos(bitmap_git->pack, base_offset, &base_pos) < 0) + if (offset_to_pack_pos(pack, base_offset, &base_pos) < 0) return 0; /* @@ -1211,24 +1420,48 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, return 0; } +static uint32_t midx_preferred_pack(struct bitmap_index *bitmap_git) +{ + struct multi_pack_index *m = bitmap_git->midx; + if (!m) + BUG("midx_preferred_pack: requires non-empty MIDX"); + return nth_midxed_pack_int_id(m, pack_pos_to_midx(bitmap_git->midx, 0)); +} + int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct packed_git **packfile_out, uint32_t *entries, struct bitmap **reuse_out) { + struct packed_git *pack; struct bitmap *result = bitmap_git->result; struct bitmap *reuse; struct pack_window *w_curs = NULL; size_t i = 0; uint32_t offset; - uint32_t objects_nr = bitmap_num_objects(bitmap_git); + uint32_t objects_nr; assert(result); + load_reverse_index(bitmap_git); + + if (bitmap_is_midx(bitmap_git)) + pack = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)]; + else + pack = bitmap_git->pack; + objects_nr = pack->num_objects; + while (i < result->word_alloc && result->words[i] == (eword_t)~0) i++; - /* Don't mark objects not in the packfile */ + /* + * Don't mark objects not in the packfile or preferred pack. This bitmap + * marks objects eligible for reuse, but the pack-reuse code only + * understands how to reuse a single pack. Since the preferred pack is + * guaranteed to have all bases for its deltas (in a multi-pack bitmap), + * we use it instead of another pack. In single-pack bitmaps, the choice + * is made for us. + */ if (i > objects_nr / BITS_IN_EWORD) i = objects_nr / BITS_IN_EWORD; @@ -1244,8 +1477,8 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, break; offset += ewah_bit_ctz64(word >> offset); - if (try_partial_reuse(bitmap_git, pos + offset, reuse, - &w_curs) < 0) { + if (try_partial_reuse(bitmap_git, pack, pos + offset, + reuse, &w_curs) < 0) { /* * try_partial_reuse indicated we couldn't reuse * any bits, so there is no point in trying more @@ -1274,7 +1507,7 @@ int reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, * need to be handled separately. */ bitmap_and_not(result, reuse); - *packfile_out = bitmap_git->pack; + *packfile_out = pack; *reuse_out = reuse; return 0; } @@ -1548,6 +1781,12 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, uint32_t i, num_objects; uint32_t *reposition; + if (!bitmap_is_midx(bitmap_git)) + load_reverse_index(bitmap_git); + else if (load_midx_revindex(bitmap_git->midx) < 0) + BUG("rebuild_existing_bitmaps: missing required rev-cache " + "extension"); + num_objects = bitmap_num_objects(bitmap_git); CALLOC_ARRAY(reposition, num_objects); @@ -1555,8 +1794,13 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git, struct object_id oid; struct object_entry *oe; - nth_packed_object_id(&oid, bitmap_git->pack, - pack_pos_to_index(bitmap_git->pack, i)); + if (bitmap_is_midx(bitmap_git)) + nth_midxed_object_oid(&oid, + bitmap_git->midx, + pack_pos_to_midx(bitmap_git->midx, i)); + else + nth_packed_object_id(&oid, bitmap_git->pack, + pack_pos_to_index(bitmap_git->pack, i)); oe = packlist_find(mapping, &oid); if (oe) @@ -1582,6 +1826,19 @@ void free_bitmap_index(struct bitmap_index *b) free(b->ext_index.hashes); bitmap_free(b->result); bitmap_free(b->haves); + if (bitmap_is_midx(b)) { + /* + * Multi-pack bitmaps need to have resources associated with + * their on-disk reverse indexes unmapped so that stale .rev and + * .bitmap files can be removed. + * + * Unlike pack-based bitmaps, multi-pack bitmaps can be read and + * written in the same 'git multi-pack-index write --bitmap' + * process. Close resources so they can be removed safely on + * platforms like Windows. + */ + close_midx_revindex(b->midx); + } free(b); } @@ -1596,7 +1853,6 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, enum object_type object_type) { struct bitmap *result = bitmap_git->result; - struct packed_git *pack = bitmap_git->pack; off_t total = 0; struct ewah_iterator it; eword_t filter; @@ -1613,15 +1869,35 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git, continue; for (offset = 0; offset < BITS_IN_EWORD; offset++) { - size_t pos; - if ((word >> offset) == 0) break; offset += ewah_bit_ctz64(word >> offset); - pos = base + offset; - total += pack_pos_to_offset(pack, pos + 1) - - pack_pos_to_offset(pack, pos); + + if (bitmap_is_midx(bitmap_git)) { + uint32_t pack_pos; + uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, base + offset); + off_t offset = nth_midxed_offset(bitmap_git->midx, midx_pos); + + uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos); + struct packed_git *pack = bitmap_git->midx->packs[pack_id]; + + if (offset_to_pack_pos(pack, offset, &pack_pos) < 0) { + struct object_id oid; + nth_midxed_object_oid(&oid, bitmap_git->midx, midx_pos); + + die(_("could not find %s in pack %s at offset %"PRIuMAX), + oid_to_hex(&oid), + pack->pack_name, + (uintmax_t)offset); + } + + total += pack_pos_to_offset(pack, pack_pos + 1) - offset; + } else { + size_t pos = base + offset; + total += pack_pos_to_offset(bitmap_git->pack, pos + 1) - + pack_pos_to_offset(bitmap_git->pack, pos); + } } } @@ -1672,6 +1948,11 @@ off_t get_disk_usage_from_bitmap(struct bitmap_index *bitmap_git, return total; } +int bitmap_is_midx(struct bitmap_index *bitmap_git) +{ + return !!bitmap_git->midx; +} + const struct string_list *bitmap_preferred_tips(struct repository *r) { return repo_config_get_value_multi(r, "pack.preferbitmaptips"); diff --git a/pack-bitmap.h b/pack-bitmap.h index 52ea10de51..81664f933f 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -44,6 +44,8 @@ typedef int (*show_reachable_fn)( struct bitmap_index; struct bitmap_index *prepare_bitmap_git(struct repository *r); +struct bitmap_index *prepare_midx_bitmap_git(struct repository *r, + struct multi_pack_index *midx); void count_bitmap_commit_list(struct bitmap_index *, uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags); void traverse_bitmap_commit_list(struct bitmap_index *, @@ -92,6 +94,10 @@ void bitmap_writer_finish(struct pack_idx_entry **index, uint32_t index_nr, const char *filename, uint16_t options); +char *midx_bitmap_filename(struct multi_pack_index *midx); +char *pack_bitmap_filename(struct packed_git *p); + +int bitmap_is_midx(struct bitmap_index *bitmap_git); const struct string_list *bitmap_preferred_tips(struct repository *r); int bitmap_is_preferred_refname(struct repository *r, const char *refname); diff --git a/packfile.c b/packfile.c index 9ef6d98292..371f5488cf 100644 --- a/packfile.c +++ b/packfile.c @@ -860,7 +860,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (starts_with(file_name, "multi-pack-index") && - ends_with(file_name, ".rev")) + (ends_with(file_name, ".bitmap") || ends_with(file_name, ".rev"))) return; if (ends_with(file_name, ".idx") || ends_with(file_name, ".rev") || From patchwork Tue Aug 31 20:52:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467823 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DBD4C4320E for ; Tue, 31 Aug 2021 20:52:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 573D860F9E for ; Tue, 31 Aug 2021 20:52:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241296AbhHaUxh (ORCPT ); Tue, 31 Aug 2021 16:53:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241329AbhHaUxV (ORCPT ); Tue, 31 Aug 2021 16:53:21 -0400 Received: from mail-io1-xd34.google.com (mail-io1-xd34.google.com [IPv6:2607:f8b0:4864:20::d34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA363C0612A8 for ; Tue, 31 Aug 2021 13:52:25 -0700 (PDT) Received: by mail-io1-xd34.google.com with SMTP id m11so787893ioo.6 for ; Tue, 31 Aug 2021 13:52:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Muvs/cJJzdr3kmctCFncuxuPrh8XRR1dmMQhkYttInM=; b=K6JCA7oSERmb5GIm7mwLPI17NbvFvQ1DkuFDUFfxuuSVMjGhwzG5hk6fRoQ172lHvA pnE2KNjtIhjj68ELpAhqvr7FnPmCmRNcXNYzDJEQdEbvquvM/kfWwvInH6XngM/TtP+p JrZ8J5cYzkTOsNcPzjREdHn5teaRc53GGKDk4U3NmOtJXmvmWjXiNniUelCwRifHt4Dh AmGfNHLQQyY53QrxmGgPX0/9ahxIR0cTebW1WAKd4p82cfjZjADVQUlDveB5yfi7su4l ig+3fgUcYaEp9dpsujYPlEZyDI0KSLTIo7Nf0OTYWVA5/1gpjQWvcp0jX9HXOtHcqVrK +iuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Muvs/cJJzdr3kmctCFncuxuPrh8XRR1dmMQhkYttInM=; b=Xbd7F2p7Qye7CBS+T4+c39T3BGmkzXhSOIBdzmGnE93yA2w3SzmFToCoepmbIJCmNm xXRqQM4YvWWB8ZyDUF6DoZtGmmpCgRVbdaxnGKQ11GG0loDKyWVtoA12/dEUq8SGybi5 TG/vCYGuDgY6iMR+XcEInRp8lYlxAyuGOia/SMoXsEP2ih6sgJpAJRmoWYXyFKzqWVvE GrU0iTYrfxjM+FsorNjuLDv5sN/ou3c2KmjMg/afcWb8562MvCL+TiB1RkIsSuKKi4pY Lbn45EacIajKmW51aLxwLmnU5rxWCOp8Prtldujx5YKPF9faTG+ssQiukojb/dBESaTD /Lpg== X-Gm-Message-State: AOAM5311vsP1xvByrHmL2li/1sk4f/Wn3Kb5dE5l7wNxT691U+FQrUn2 RClk5ckUQW6WiXgwFgUHFFtas/4PD6ywPkYu X-Google-Smtp-Source: ABdhPJx0GsFAEUpEG5SuDofx/cS7gHi1rBZGa0ztl0gtzgxupWNTRHCHv6LYGHBJ5LIU4e2zfsWyFg== X-Received: by 2002:a05:6602:2219:: with SMTP id n25mr23769292ion.185.1630443144968; Tue, 31 Aug 2021 13:52:24 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s16sm10778226iln.5.2021.08.31.13.52.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:24 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:24 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 17/27] pack-bitmap: write multi-pack bitmaps Message-ID: <4968229663ae98588b70e46f43540cb346f47455.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Write multi-pack bitmaps in the format described by Documentation/technical/bitmap-format.txt, inferring their presence with the absence of '--bitmap'. To write a multi-pack bitmap, this patch attempts to reuse as much of the existing machinery from pack-objects as possible. Specifically, the MIDX code prepares a packing_data struct that pretends as if a single packfile has been generated containing all of the objects contained within the MIDX. Signed-off-by: Taylor Blau --- Documentation/git-multi-pack-index.txt | 12 +- builtin/multi-pack-index.c | 2 + midx.c | 209 ++++++++++++++++++++++++- midx.h | 1 + 4 files changed, 215 insertions(+), 9 deletions(-) diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 0af6beb2dd..a9df3dbd32 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -10,7 +10,7 @@ SYNOPSIS -------- [verse] 'git multi-pack-index' [--object-dir=] [--[no-]progress] - [--preferred-pack=] + [--preferred-pack=] [--[no-]bitmap] DESCRIPTION ----------- @@ -42,6 +42,9 @@ write:: multiple packs contain the same object. `` must contain at least one object. If not given, ties are broken in favor of the pack with the lowest mtime. + + --[no-]bitmap:: + Control whether or not a multi-pack bitmap is written. -- verify:: @@ -83,6 +86,13 @@ EXAMPLES $ git multi-pack-index write ----------------------------------------------- +* Write a MIDX file for the packfiles in the current .git folder with a +corresponding bitmap. ++ +------------------------------------------------------------- +$ git multi-pack-index write --preferred-pack= --bitmap +------------------------------------------------------------- + * Write a MIDX file for the packfiles in an alternate object store. + ----------------------------------------------- diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 8ff0dee2ec..73c0113b48 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -68,6 +68,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) OPT_STRING(0, "preferred-pack", &opts.preferred_pack, N_("preferred-pack"), N_("pack for reuse when computing a multi-pack bitmap")), + OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"), + MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), OPT_END(), }; diff --git a/midx.c b/midx.c index 6a10f7a042..284221ae62 100644 --- a/midx.c +++ b/midx.c @@ -13,6 +13,10 @@ #include "repository.h" #include "chunk-format.h" #include "pack.h" +#include "pack-bitmap.h" +#include "refs.h" +#include "revision.h" +#include "list-objects.h" #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ #define MIDX_VERSION 1 @@ -893,6 +897,166 @@ static int midx_checksum_valid(struct multi_pack_index *m) return hashfile_checksum_valid(m->data, m->data_len); } +static void prepare_midx_packing_data(struct packing_data *pdata, + struct write_midx_context *ctx) +{ + uint32_t i; + + memset(pdata, 0, sizeof(struct packing_data)); + prepare_packing_data(the_repository, pdata); + + for (i = 0; i < ctx->entries_nr; i++) { + struct pack_midx_entry *from = &ctx->entries[ctx->pack_order[i]]; + struct object_entry *to = packlist_alloc(pdata, &from->oid); + + oe_set_in_pack(pdata, to, + ctx->info[ctx->pack_perm[from->pack_int_id]].p); + } +} + +static int add_ref_to_pending(const char *refname, + const struct object_id *oid, + int flag, void *cb_data) +{ + struct rev_info *revs = (struct rev_info*)cb_data; + struct object *object; + + if ((flag & REF_ISSYMREF) && (flag & REF_ISBROKEN)) { + warning("symbolic ref is dangling: %s", refname); + return 0; + } + + object = parse_object_or_die(oid, refname); + if (object->type != OBJ_COMMIT) + return 0; + + add_pending_object(revs, object, ""); + if (bitmap_is_preferred_refname(revs->repo, refname)) + object->flags |= NEEDS_BITMAP; + return 0; +} + +struct bitmap_commit_cb { + struct commit **commits; + size_t commits_nr, commits_alloc; + + struct write_midx_context *ctx; +}; + +static const struct object_id *bitmap_oid_access(size_t index, + const void *_entries) +{ + const struct pack_midx_entry *entries = _entries; + return &entries[index].oid; +} + +static void bitmap_show_commit(struct commit *commit, void *_data) +{ + struct bitmap_commit_cb *data = _data; + int pos = oid_pos(&commit->object.oid, data->ctx->entries, + data->ctx->entries_nr, + bitmap_oid_access); + if (pos < 0) + return; + + ALLOC_GROW(data->commits, data->commits_nr + 1, data->commits_alloc); + data->commits[data->commits_nr++] = commit; +} + +static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr_p, + struct write_midx_context *ctx) +{ + struct rev_info revs; + struct bitmap_commit_cb cb = {0}; + + cb.ctx = ctx; + + repo_init_revisions(the_repository, &revs, NULL); + setup_revisions(0, NULL, &revs, NULL); + for_each_ref(add_ref_to_pending, &revs); + + /* + * Skipping promisor objects here is intentional, since it only excludes + * them from the list of reachable commits that we want to select from + * when computing the selection of MIDX'd commits to receive bitmaps. + * + * Reachability bitmaps do require that their objects be closed under + * reachability, but fetching any objects missing from promisors at this + * point is too late. But, if one of those objects can be reached from + * an another object that is included in the bitmap, then we will + * complain later that we don't have reachability closure (and fail + * appropriately). + */ + fetch_if_missing = 0; + revs.exclude_promisor_objects = 1; + + if (prepare_revision_walk(&revs)) + die(_("revision walk setup failed")); + + traverse_commit_list(&revs, bitmap_show_commit, NULL, &cb); + if (indexed_commits_nr_p) + *indexed_commits_nr_p = cb.commits_nr; + + return cb.commits; +} + +static int write_midx_bitmap(char *midx_name, unsigned char *midx_hash, + struct write_midx_context *ctx, + unsigned flags) +{ + struct packing_data pdata; + struct pack_idx_entry **index; + struct commit **commits = NULL; + uint32_t i, commits_nr; + char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name, hash_to_hex(midx_hash)); + int ret; + + prepare_midx_packing_data(&pdata, ctx); + + commits = find_commits_for_midx_bitmap(&commits_nr, ctx); + + /* + * Build the MIDX-order index based on pdata.objects (which is already + * in MIDX order; c.f., 'midx_pack_order_cmp()' for the definition of + * this order). + */ + ALLOC_ARRAY(index, pdata.nr_objects); + for (i = 0; i < pdata.nr_objects; i++) + index[i] = &pdata.objects[i].idx; + + bitmap_writer_show_progress(flags & MIDX_PROGRESS); + bitmap_writer_build_type_index(&pdata, index, pdata.nr_objects); + + /* + * bitmap_writer_finish expects objects in lex order, but pack_order + * gives us exactly that. use it directly instead of re-sorting the + * array. + * + * This changes the order of objects in 'index' between + * bitmap_writer_build_type_index and bitmap_writer_finish. + * + * The same re-ordering takes place in the single-pack bitmap code via + * write_idx_file(), which is called by finish_tmp_packfile(), which + * happens between bitmap_writer_build_type_index() and + * bitmap_writer_finish(). + */ + for (i = 0; i < pdata.nr_objects; i++) + index[ctx->pack_order[i]] = &pdata.objects[i].idx; + + bitmap_writer_select_commits(commits, commits_nr, -1); + ret = bitmap_writer_build(&pdata); + if (ret < 0) + goto cleanup; + + bitmap_writer_set_checksum(midx_hash); + bitmap_writer_finish(index, pdata.nr_objects, bitmap_name, 0); + +cleanup: + free(index); + free(bitmap_name); + return ret; +} + static int write_midx_internal(const char *object_dir, struct string_list *packs_to_drop, const char *preferred_pack_name, @@ -938,7 +1102,7 @@ static int write_midx_internal(const char *object_dir, ctx.info[ctx.nr].orig_pack_int_id = i; ctx.info[ctx.nr].pack_name = xstrdup(ctx.m->pack_names[i]); - ctx.info[ctx.nr].p = NULL; + ctx.info[ctx.nr].p = ctx.m->packs[i]; ctx.info[ctx.nr].expired = 0; if (flags & MIDX_WRITE_REV_INDEX) { @@ -972,8 +1136,26 @@ static int write_midx_internal(const char *object_dir, for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); stop_progress(&ctx.progress); - if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) - goto cleanup; + if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop) { + struct bitmap_index *bitmap_git; + int bitmap_exists; + int want_bitmap = flags & MIDX_WRITE_BITMAP; + + bitmap_git = prepare_midx_bitmap_git(the_repository, ctx.m); + bitmap_exists = bitmap_git && bitmap_is_midx(bitmap_git); + free_bitmap_index(bitmap_git); + + if (bitmap_exists || !want_bitmap) { + /* + * The correct MIDX already exists, and so does a + * corresponding bitmap (or one wasn't requested). + */ + if (!want_bitmap) + clear_midx_files_ext(object_dir, ".bitmap", + NULL); + goto cleanup; + } + } if (preferred_pack_name) { int found = 0; @@ -989,7 +1171,8 @@ static int write_midx_internal(const char *object_dir, if (!found) warning(_("unknown preferred pack: '%s'"), preferred_pack_name); - } else if (ctx.nr && (flags & MIDX_WRITE_REV_INDEX)) { + } else if (ctx.nr && + (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP))) { struct packed_git *oldest = ctx.info[ctx.preferred_pack_idx].p; ctx.preferred_pack_idx = 0; @@ -1121,9 +1304,6 @@ static int write_midx_internal(const char *object_dir, hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR); f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); - if (ctx.m) - close_object_store(the_repository->objects); - if (ctx.nr - dropped_packs == 0) { error(_("no pack files to index.")); result = 1; @@ -1154,14 +1334,25 @@ static int write_midx_internal(const char *object_dir, finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM); free_chunkfile(cf); - if (flags & MIDX_WRITE_REV_INDEX) + if (flags & (MIDX_WRITE_REV_INDEX | MIDX_WRITE_BITMAP)) ctx.pack_order = midx_pack_order(&ctx); if (flags & MIDX_WRITE_REV_INDEX) write_midx_reverse_index(midx_name, midx_hash, &ctx); + if (flags & MIDX_WRITE_BITMAP) { + if (write_midx_bitmap(midx_name, midx_hash, &ctx, flags) < 0) { + error(_("could not write multi-pack bitmap")); + result = 1; + goto cleanup; + } + } + + if (ctx.m) + close_object_store(the_repository->objects); commit_lock_file(&lk); + clear_midx_files_ext(object_dir, ".bitmap", midx_hash); clear_midx_files_ext(object_dir, ".rev", midx_hash); cleanup: @@ -1178,6 +1369,7 @@ static int write_midx_internal(const char *object_dir, free(ctx.pack_perm); free(ctx.pack_order); free(midx_name); + return result; } @@ -1238,6 +1430,7 @@ void clear_midx_file(struct repository *r) if (remove_path(midx)) die(_("failed to clear multi-pack-index at %s"), midx); + clear_midx_files_ext(r->objects->odb->path, ".bitmap", NULL); clear_midx_files_ext(r->objects->odb->path, ".rev", NULL); free(midx); diff --git a/midx.h b/midx.h index 1172df1a71..350f4d0a7b 100644 --- a/midx.h +++ b/midx.h @@ -41,6 +41,7 @@ struct multi_pack_index { #define MIDX_PROGRESS (1 << 0) #define MIDX_WRITE_REV_INDEX (1 << 1) +#define MIDX_WRITE_BITMAP (1 << 2) const unsigned char *get_midx_checksum(struct multi_pack_index *m); char *get_midx_filename(const char *object_dir); From patchwork Tue Aug 31 20:52:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467827 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C68ECC4320A for ; Tue, 31 Aug 2021 20:52:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A909D61026 for ; Tue, 31 Aug 2021 20:52:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241283AbhHaUxk (ORCPT ); Tue, 31 Aug 2021 16:53:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42042 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241288AbhHaUx2 (ORCPT ); Tue, 31 Aug 2021 16:53:28 -0400 Received: from mail-il1-x129.google.com (mail-il1-x129.google.com [IPv6:2607:f8b0:4864:20::129]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5064EC06179A for ; Tue, 31 Aug 2021 13:52:28 -0700 (PDT) Received: by mail-il1-x129.google.com with SMTP id y3so697282ilm.6 for ; Tue, 31 Aug 2021 13:52:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Zdw/SkdDOqw8Qpj695y7K2joDOIde56MzCyVtDdtkAM=; b=jL0lXiXUpg7LG2i9j75P50BJP+52+UZBecYVHPEWvhFQLVu0c+fVy39wooO9YuLwn+ naxIIvpQFauskqCKZcICBMgMy2yc/syPw9tg6AinzDSBKpY5khV5abOK8f225LvHuK47 qFm5U/fuBM1oHOeRszJ/ombrLu+OMXjohbAygszwJVA5nL+NVd3ityDEFDhyhOIfrUUC LmIF/SoBh3THB2+0jsiHyoctZxJPdJm3+7r1Eg2LuMAPvqBP6l0dnSlfCyQKAUtz+5Lq KDL3JDrjAgEmJmW5Rp0eE69vVDAaylTCZuZu3ka1gZe4BMKsQKeONEIxZJDNSJq9dPRa KnEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Zdw/SkdDOqw8Qpj695y7K2joDOIde56MzCyVtDdtkAM=; b=eVt5jMHn0wiFmxWa/T8V5s/CxrMpp+wmNGT6vx6Rp06/XDOdtiqAegJFXL5MigODHb AieDF9/61n9G+ZyaIlB/56L3WQZxCpECaWTWlum5l7pR8QZSQ7D+BOMYIk+bDU3fJ9jj J9kG9BUCED6WjvWPJ1bb3HHhKO42Yb118688sv96pnSge0yrabGBmwNdiSy4ncBU1KE9 D1E2EDXGXDN05TKiMsPvJQASVPETAC+ejanHyowTS2nl65qLfSrvr98+XLtnDVjy1pIr gS2eXAMUhHQrLZcCTTQWBTp747gWd0BdbAGjZI/vhsvp3U8ERMHb3mel1wj66NHxZw+M PzWg== X-Gm-Message-State: AOAM532uqzwWkicfvzvgWGNsnCgkGbmaCgKFgih5bO4yN9NqT+NQEJW8 Ags6muEBRyMV9xtqJxvWWVMYtC6dD2AsIjXc X-Google-Smtp-Source: ABdhPJxlqhBJXENQSR1aj386rKTGD/jU5Knj/qVEZ5DxN3K1pUXemge3WP0o8ylscPACd5b+grH2oA== X-Received: by 2002:a92:cb4d:: with SMTP id f13mr22091215ilq.220.1630443147394; Tue, 31 Aug 2021 13:52:27 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id l15sm10259377iow.4.2021.08.31.13.52.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:27 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:26 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 18/27] t5310: move some tests to lib-bitmap.sh Message-ID: <5d60b07e2e0631eaaefe21cc94d3d1c7d9f7cfe3.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We'll soon be adding a test script that will cover many of the same bitmap concepts as t5310, but for MIDX bitmaps. Let's pull out as many of the applicable tests as we can so we don't have to rewrite them. There should be no functional change to t5310; we still run the same operations in the same order. Signed-off-by: Taylor Blau --- t/lib-bitmap.sh | 236 ++++++++++++++++++++++++++++++++++++++++ t/t5310-pack-bitmaps.sh | 227 +------------------------------------- 2 files changed, 240 insertions(+), 223 deletions(-) diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh index fe3f98be24..77464da6fd 100644 --- a/t/lib-bitmap.sh +++ b/t/lib-bitmap.sh @@ -1,3 +1,6 @@ +# Helpers for scripts testing bitmap functionality; see t5310 for +# example usage. + # Compare a file containing rev-list bitmap traversal output to its non-bitmap # counterpart. You can't just use test_cmp for this, because the two produce # subtly different output: @@ -24,3 +27,236 @@ test_bitmap_traversal () { test_cmp "$1.normalized" "$2.normalized" && rm -f "$1.normalized" "$2.normalized" } + +# To ensure the logic for "maximal commits" is exercised, make +# the repository a bit more complicated. +# +# other second +# * * +# (99 commits) (99 commits) +# * * +# |\ /| +# | * octo-other octo-second * | +# |/|\_________ ____________/|\| +# | \ \/ __________/ | +# | | ________/\ / | +# * |/ * merge-right * +# | _|__________/ \____________ | +# |/ | \| +# (l1) * * merge-left * (r1) +# | / \________________________ | +# |/ \| +# (l2) * * (r2) +# \___________________________ | +# \| +# * (base) +# +# We only push bits down the first-parent history, which +# makes some of these commits unimportant! +# +# The important part for the maximal commit algorithm is how +# the bitmasks are extended. Assuming starting bit positions +# for second (bit 0) and other (bit 1), the bitmasks at the +# end should be: +# +# second: 1 (maximal, selected) +# other: 01 (maximal, selected) +# (base): 11 (maximal) +# +# This complicated history was important for a previous +# version of the walk that guarantees never walking a +# commit multiple times. That goal might be important +# again, so preserve this complicated case. For now, this +# test will guarantee that the bitmaps are computed +# correctly, even with the repeat calculations. +setup_bitmap_history() { + test_expect_success 'setup repo with moderate-sized history' ' + test_commit_bulk --id=file 10 && + git branch -M second && + git checkout -b other HEAD~5 && + test_commit_bulk --id=side 10 && + + # add complicated history setup, including merges and + # ambiguous merge-bases + + git checkout -b merge-left other~2 && + git merge second~2 -m "merge-left" && + + git checkout -b merge-right second~1 && + git merge other~1 -m "merge-right" && + + git checkout -b octo-second second && + git merge merge-left merge-right -m "octopus-second" && + + git checkout -b octo-other other && + git merge merge-left merge-right -m "octopus-other" && + + git checkout other && + git merge octo-other -m "pull octopus" && + + git checkout second && + git merge octo-second -m "pull octopus" && + + # Remove these branches so they are not selected + # as bitmap tips + git branch -D merge-left && + git branch -D merge-right && + git branch -D octo-other && + git branch -D octo-second && + + # add padding to make these merges less interesting + # and avoid having them selected for bitmaps + test_commit_bulk --id=file 100 && + git checkout other && + test_commit_bulk --id=side 100 && + git checkout second && + + bitmaptip=$(git rev-parse second) && + blob=$(echo tagged-blob | git hash-object -w --stdin) && + git tag tagged-blob $blob + ' +} + +rev_list_tests_head () { + test_expect_success "counting commits via bitmap ($state, $branch)" ' + git rev-list --count $branch >expect && + git rev-list --use-bitmap-index --count $branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting partial commits via bitmap ($state, $branch)" ' + git rev-list --count $branch~5..$branch >expect && + git rev-list --use-bitmap-index --count $branch~5..$branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting commits with limit ($state, $branch)" ' + git rev-list --count -n 1 $branch >expect && + git rev-list --use-bitmap-index --count -n 1 $branch >actual && + test_cmp expect actual + ' + + test_expect_success "counting non-linear history ($state, $branch)" ' + git rev-list --count other...second >expect && + git rev-list --use-bitmap-index --count other...second >actual && + test_cmp expect actual + ' + + test_expect_success "counting commits with limiting ($state, $branch)" ' + git rev-list --count $branch -- 1.t >expect && + git rev-list --use-bitmap-index --count $branch -- 1.t >actual && + test_cmp expect actual + ' + + test_expect_success "counting objects via bitmap ($state, $branch)" ' + git rev-list --count --objects $branch >expect && + git rev-list --use-bitmap-index --count --objects $branch >actual && + test_cmp expect actual + ' + + test_expect_success "enumerate commits ($state, $branch)" ' + git rev-list --use-bitmap-index $branch >actual && + git rev-list $branch >expect && + test_bitmap_traversal --no-confirm-bitmaps expect actual + ' + + test_expect_success "enumerate --objects ($state, $branch)" ' + git rev-list --objects --use-bitmap-index $branch >actual && + git rev-list --objects $branch >expect && + test_bitmap_traversal expect actual + ' + + test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" ' + git rev-list --objects --use-bitmap-index $branch tagged-blob >actual && + grep $blob actual + ' +} + +rev_list_tests () { + state=$1 + + for branch in "second" "other" + do + rev_list_tests_head + done +} + +basic_bitmap_tests () { + tip="$1" + test_expect_success 'rev-list --test-bitmap verifies bitmaps' " + git rev-list --test-bitmap "${tip:-HEAD}" + " + + rev_list_tests 'full bitmap' + + test_expect_success 'clone from bitmapped repository' ' + rm -fr clone.git && + git clone --no-local --bare . clone.git && + git rev-parse HEAD >expect && + git --git-dir=clone.git rev-parse HEAD >actual && + test_cmp expect actual + ' + + test_expect_success 'partial clone from bitmapped repository' ' + test_config uploadpack.allowfilter true && + rm -fr partial-clone.git && + git clone --no-local --bare --filter=blob:none . partial-clone.git && + ( + cd partial-clone.git && + pack=$(echo objects/pack/*.pack) && + git verify-pack -v "$pack" >have && + awk "/blob/ { print \$1 }" blobs && + # we expect this single blob because of the direct ref + git rev-parse refs/tags/tagged-blob >expect && + test_cmp expect blobs + ) + ' + + test_expect_success 'setup further non-bitmapped commits' ' + test_commit_bulk --id=further 10 + ' + + rev_list_tests 'partial bitmap' + + test_expect_success 'fetch (partial bitmap)' ' + git --git-dir=clone.git fetch origin second:second && + git rev-parse HEAD >expect && + git --git-dir=clone.git rev-parse HEAD >actual && + test_cmp expect actual + ' + + test_expect_success 'enumerating progress counts pack-reused objects' ' + count=$(git rev-list --objects --all --count) && + git repack -adb && + + # check first with only reused objects; confirm that our + # progress showed the right number, and also that we did + # pack-reuse as expected. Check only the final "done" + # line of the meter (there may be an arbitrary number of + # intermediate lines ending with CR). + GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + /dev/null 2>stderr && + grep "Enumerating objects: $count, done" stderr && + grep "pack-reused $count" stderr && + + # now the same but with one non-reused object + git commit --allow-empty -m "an extra commit object" && + GIT_PROGRESS_DELAY=0 \ + git pack-objects --all --stdout --progress \ + /dev/null 2>stderr && + grep "Enumerating objects: $((count+1)), done" stderr && + grep "pack-reused $count" stderr + ' +} + +# have_delta +# +# Note that because this relies on cat-file, it might find _any_ copy of an +# object in the repository. The caller is responsible for making sure +# there's only one (e.g., via "repack -ad", or having just fetched a copy). +have_delta () { + echo $2 >expect && + echo $1 | git cat-file --batch-check="%(deltabase)" >actual && + test_cmp expect actual +} diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index b02838750e..4318f84d53 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -25,93 +25,10 @@ has_any () { grep -Ff "$1" "$2" } -# To ensure the logic for "maximal commits" is exercised, make -# the repository a bit more complicated. -# -# other second -# * * -# (99 commits) (99 commits) -# * * -# |\ /| -# | * octo-other octo-second * | -# |/|\_________ ____________/|\| -# | \ \/ __________/ | -# | | ________/\ / | -# * |/ * merge-right * -# | _|__________/ \____________ | -# |/ | \| -# (l1) * * merge-left * (r1) -# | / \________________________ | -# |/ \| -# (l2) * * (r2) -# \___________________________ | -# \| -# * (base) -# -# We only push bits down the first-parent history, which -# makes some of these commits unimportant! -# -# The important part for the maximal commit algorithm is how -# the bitmasks are extended. Assuming starting bit positions -# for second (bit 0) and other (bit 1), the bitmasks at the -# end should be: -# -# second: 1 (maximal, selected) -# other: 01 (maximal, selected) -# (base): 11 (maximal) -# -# This complicated history was important for a previous -# version of the walk that guarantees never walking a -# commit multiple times. That goal might be important -# again, so preserve this complicated case. For now, this -# test will guarantee that the bitmaps are computed -# correctly, even with the repeat calculations. +setup_bitmap_history -test_expect_success 'setup repo with moderate-sized history' ' - test_commit_bulk --id=file 10 && - git branch -M second && - git checkout -b other HEAD~5 && - test_commit_bulk --id=side 10 && - - # add complicated history setup, including merges and - # ambiguous merge-bases - - git checkout -b merge-left other~2 && - git merge second~2 -m "merge-left" && - - git checkout -b merge-right second~1 && - git merge other~1 -m "merge-right" && - - git checkout -b octo-second second && - git merge merge-left merge-right -m "octopus-second" && - - git checkout -b octo-other other && - git merge merge-left merge-right -m "octopus-other" && - - git checkout other && - git merge octo-other -m "pull octopus" && - - git checkout second && - git merge octo-second -m "pull octopus" && - - # Remove these branches so they are not selected - # as bitmap tips - git branch -D merge-left && - git branch -D merge-right && - git branch -D octo-other && - git branch -D octo-second && - - # add padding to make these merges less interesting - # and avoid having them selected for bitmaps - test_commit_bulk --id=file 100 && - git checkout other && - test_commit_bulk --id=side 100 && - git checkout second && - - bitmaptip=$(git rev-parse second) && - blob=$(echo tagged-blob | git hash-object -w --stdin) && - git tag tagged-blob $blob && - git config repack.writebitmaps true +test_expect_success 'setup writing bitmaps during repack' ' + git config repack.writeBitmaps true ' test_expect_success 'full repack creates bitmaps' ' @@ -123,109 +40,7 @@ test_expect_success 'full repack creates bitmaps' ' grep "\"key\":\"num_maximal_commits\",\"value\":\"107\"" trace ' -test_expect_success 'rev-list --test-bitmap verifies bitmaps' ' - git rev-list --test-bitmap HEAD -' - -rev_list_tests_head () { - test_expect_success "counting commits via bitmap ($state, $branch)" ' - git rev-list --count $branch >expect && - git rev-list --use-bitmap-index --count $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting partial commits via bitmap ($state, $branch)" ' - git rev-list --count $branch~5..$branch >expect && - git rev-list --use-bitmap-index --count $branch~5..$branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limit ($state, $branch)" ' - git rev-list --count -n 1 $branch >expect && - git rev-list --use-bitmap-index --count -n 1 $branch >actual && - test_cmp expect actual - ' - - test_expect_success "counting non-linear history ($state, $branch)" ' - git rev-list --count other...second >expect && - git rev-list --use-bitmap-index --count other...second >actual && - test_cmp expect actual - ' - - test_expect_success "counting commits with limiting ($state, $branch)" ' - git rev-list --count $branch -- 1.t >expect && - git rev-list --use-bitmap-index --count $branch -- 1.t >actual && - test_cmp expect actual - ' - - test_expect_success "counting objects via bitmap ($state, $branch)" ' - git rev-list --count --objects $branch >expect && - git rev-list --use-bitmap-index --count --objects $branch >actual && - test_cmp expect actual - ' - - test_expect_success "enumerate commits ($state, $branch)" ' - git rev-list --use-bitmap-index $branch >actual && - git rev-list $branch >expect && - test_bitmap_traversal --no-confirm-bitmaps expect actual - ' - - test_expect_success "enumerate --objects ($state, $branch)" ' - git rev-list --objects --use-bitmap-index $branch >actual && - git rev-list --objects $branch >expect && - test_bitmap_traversal expect actual - ' - - test_expect_success "bitmap --objects handles non-commit objects ($state, $branch)" ' - git rev-list --objects --use-bitmap-index $branch tagged-blob >actual && - grep $blob actual - ' -} - -rev_list_tests () { - state=$1 - - for branch in "second" "other" - do - rev_list_tests_head - done -} - -rev_list_tests 'full bitmap' - -test_expect_success 'clone from bitmapped repository' ' - git clone --no-local --bare . clone.git && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' - -test_expect_success 'partial clone from bitmapped repository' ' - test_config uploadpack.allowfilter true && - git clone --no-local --bare --filter=blob:none . partial-clone.git && - ( - cd partial-clone.git && - pack=$(echo objects/pack/*.pack) && - git verify-pack -v "$pack" >have && - awk "/blob/ { print \$1 }" blobs && - # we expect this single blob because of the direct ref - git rev-parse refs/tags/tagged-blob >expect && - test_cmp expect blobs - ) -' - -test_expect_success 'setup further non-bitmapped commits' ' - test_commit_bulk --id=further 10 -' - -rev_list_tests 'partial bitmap' - -test_expect_success 'fetch (partial bitmap)' ' - git --git-dir=clone.git fetch origin second:second && - git rev-parse HEAD >expect && - git --git-dir=clone.git rev-parse HEAD >actual && - test_cmp expect actual -' +basic_bitmap_tests test_expect_success 'incremental repack fails when bitmaps are requested' ' test_commit more-1 && @@ -461,40 +276,6 @@ test_expect_success 'truncated bitmap fails gracefully (cache)' ' test_i18ngrep corrupted.bitmap.index stderr ' -test_expect_success 'enumerating progress counts pack-reused objects' ' - count=$(git rev-list --objects --all --count) && - git repack -adb && - - # check first with only reused objects; confirm that our progress - # showed the right number, and also that we did pack-reuse as expected. - # Check only the final "done" line of the meter (there may be an - # arbitrary number of intermediate lines ending with CR). - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - /dev/null 2>stderr && - grep "Enumerating objects: $count, done" stderr && - grep "pack-reused $count" stderr && - - # now the same but with one non-reused object - git commit --allow-empty -m "an extra commit object" && - GIT_PROGRESS_DELAY=0 \ - git pack-objects --all --stdout --progress \ - /dev/null 2>stderr && - grep "Enumerating objects: $((count+1)), done" stderr && - grep "pack-reused $count" stderr -' - -# have_delta -# -# Note that because this relies on cat-file, it might find _any_ copy of an -# object in the repository. The caller is responsible for making sure -# there's only one (e.g., via "repack -ad", or having just fetched a copy). -have_delta () { - echo $2 >expect && - echo $1 | git cat-file --batch-check="%(deltabase)" >actual && - test_cmp expect actual -} - # Create a state of history with these properties: # # - refs that allow a client to fetch some new history, while sharing some old From patchwork Tue Aug 31 20:52:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467829 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 116DFC432BE for ; Tue, 31 Aug 2021 20:52:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DF90261026 for ; Tue, 31 Aug 2021 20:52:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241321AbhHaUxm (ORCPT ); Tue, 31 Aug 2021 16:53:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241330AbhHaUxc (ORCPT ); Tue, 31 Aug 2021 16:53:32 -0400 Received: from mail-io1-xd32.google.com (mail-io1-xd32.google.com [IPv6:2607:f8b0:4864:20::d32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66603C0617AF for ; Tue, 31 Aug 2021 13:52:30 -0700 (PDT) Received: by mail-io1-xd32.google.com with SMTP id j18so766541ioj.8 for ; Tue, 31 Aug 2021 13:52:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=yG/JDN1Qp48z4r4jiHxdtCZXIj9giIJW34Xgsm0vA2s=; b=Jd536pFRzVyoUiu2NGJNyChA5012e9IwkHYtj3t6U1IFTCcAFqCL2+B6NkFGALa1zu +1+5FHQj2o8BrBU1UBytdTlE9MTQWb9bxdA8V58ZE1/jOCE330kE1pT3dS3M/FF8+FO+ FSAboLAc06lqLKcR4zn9HinhhipMrn2EY7p6DEG1qVpBDTV/jZ471TpeVKquL/0/7K8G w64oyMwMGhRXh7GM4ch+EJCOIv0aYHbIOG89Try76XeyIW1iPrEOBQi97f7GKAjgxGze 1ueYUGjbOqTQStR9yihyzStP7oKwWKAulEJ7hZQ71OqDXcJesPFLpR6FUOQfDCwzD+sG 9vQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=yG/JDN1Qp48z4r4jiHxdtCZXIj9giIJW34Xgsm0vA2s=; b=cdX6ywHjWAHLViU+38hUhqJC98juAqQOJBU9B301zlbxS732NrOS1QPP61NSlbzoVa SjGTaY7maqd8KRPDcQI7OCXQS9/OKBR02r5ucgNFqvAf0/EkIk3Om6DZygvAnBQCXre0 ycf22xuB3yqbP8CQY+sLX25YYc1JYMFFBZh46BnkNrFKzfDbs++VEuE2o01Yy5jBygrO SS6aPB09OCcEvrbt2r2bkuz3rI3atZV2FAt6A9vvfXfgWycZ8KiB9yO+6llfdsMkiq7+ oTRZlTKjlgd6iLcgRtmcYNgHXxoRlmforX8c4bliusJIS2z2bKlZGQD85EW48KHMOtJP sJSQ== X-Gm-Message-State: AOAM531TU5cEfYgw1e90ysdMv3SEE0LWMA4rvFYd3dWtOAYWH/gBTcjr o2N6tTZmNEF56fkYcysAPaDkOIt5jKj3dUhp X-Google-Smtp-Source: ABdhPJzKSAmZ3fHBLWxSd+gkm9bySlSmTZ9p9junZZSQys5UkllYDeJPv9ZZ6GsbtVz5mzr2iECrZQ== X-Received: by 2002:a02:7348:: with SMTP id a8mr4582425jae.116.1630443149786; Tue, 31 Aug 2021 13:52:29 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id d14sm11407457iod.18.2021.08.31.13.52.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:29 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:28 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 19/27] t/helper/test-read-midx.c: add --checksum mode Message-ID: <1a9c3538dbb7b7077e4a34b1d40628a770e44af8.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Subsequent tests will want to check for the existence of a multi-pack bitmap which matches the multi-pack-index stored in the pack directory. The multi-pack bitmap includes the hex checksum of the MIDX it corresponds to in its filename (for example, '$packdir/multi-pack-index-.bitmap'). As a result, some tests want a way to learn what '' is. This helper addresses that need by printing the checksum of the repository's multi-pack-index. Signed-off-by: Taylor Blau --- t/helper/test-read-midx.c | 16 +++++++++++++++- t/lib-bitmap.sh | 4 ++++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 7c2eb11a8e..cb0d27049a 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -60,12 +60,26 @@ static int read_midx_file(const char *object_dir, int show_objects) return 0; } +static int read_midx_checksum(const char *object_dir) +{ + struct multi_pack_index *m; + + setup_git_directory(); + m = load_multi_pack_index(object_dir, 1); + if (!m) + return 1; + printf("%s\n", hash_to_hex(get_midx_checksum(m))); + return 0; +} + int cmd__read_midx(int argc, const char **argv) { if (!(argc == 2 || argc == 3)) - usage("read-midx [--show-objects] "); + usage("read-midx [--show-objects|--checksum] "); if (!strcmp(argv[1], "--show-objects")) return read_midx_file(argv[2], 1); + else if (!strcmp(argv[1], "--checksum")) + return read_midx_checksum(argv[2]); return read_midx_file(argv[1], 0); } diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh index 77464da6fd..21d0392dda 100644 --- a/t/lib-bitmap.sh +++ b/t/lib-bitmap.sh @@ -260,3 +260,7 @@ have_delta () { echo $1 | git cat-file --batch-check="%(deltabase)" >actual && test_cmp expect actual } + +midx_checksum () { + test-tool read-midx --checksum "$1" +} From patchwork Tue Aug 31 20:52:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467831 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AFE6BC4320E for ; Tue, 31 Aug 2021 20:52:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9973F6103A for ; Tue, 31 Aug 2021 20:52:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241330AbhHaUxn (ORCPT ); Tue, 31 Aug 2021 16:53:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241333AbhHaUxc (ORCPT ); Tue, 31 Aug 2021 16:53:32 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17454C0612AF for ; Tue, 31 Aug 2021 13:52:33 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id y18so833950ioc.1 for ; Tue, 31 Aug 2021 13:52:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=j5I5iCRwXj1odODBDC0Zi2ykOdJx9BfcfMSR4YOfpZQ=; b=kwN56Xj6+kqflYnOOeto0g+BVXDE2Q98mJXAMb1Gu9yVyRXdo7Kz/+oj2g7W1+VoxX DbALaAtm/dRHHqiV0yX6Aknubcyt8h4yIz3Y7Owei0rB6anq37rtMqDCDFZZOG90XQos 2T2ipmAxQLfqbro0q0jdiCEryMOFvVLHw9BKosWV82YWbPDsZJ/M0Pjql/HaPHQ/v9kb wkRiMawCx/y/Dc/Cmi23jHf2ZhGlCgennwDzRkkx2ztxyUYYoWcdkV8SiWWEFoa7JR5y oNWd82aCTsVVWNbu/bMvkCBP+vi+qUWCeb779uSFbhf3tf11EEv6pEs2ssl5TwgZAPNZ k+tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=j5I5iCRwXj1odODBDC0Zi2ykOdJx9BfcfMSR4YOfpZQ=; b=Y2if3Bwi0y6uvgqDAxqLu4jtCSWaoo9i6ohwykQFDYXKd45F5E5gc5Vik93DtAJrle HmeRqpqRfO0AG+nyPJzCZ4pS0YRvcg8E48NeBIF9Mkollh35HxVRM2JSCFKnd1jAAG0p QsDWOgixoAcf+4/h60xMJFE3Zx99Q4oV0g0M1BpbnY7YMaKVBDgD5b8vuDI14gv0isLO T9NadS448wUPxpYY3fXc3J2wLrOs6XujrgDjQwZvVtIXLiCI6D0WhBpcsT3ukADAvsCd tfNT2eaxfYvjFSAeRCZ9N4pZgyk1uY1mbpE7b4p8227DC1wjrXfOSQr+urZ6XNS1stVb OutQ== X-Gm-Message-State: AOAM532EyWBOrgewPwrWCpZHIzIuo462S/ohGrLl5xLZux48L2N+pfTa TWiyV8dWx1Lqckk4Az5lUNZFyCSn8jvIFE/O X-Google-Smtp-Source: ABdhPJxB01PpZ1GtaIW5hUDfPZwPalHBxz5iQ7ekksstnBcCC17LuzZxvIHEKbJSyrRIVpzbYEEIVQ== X-Received: by 2002:a6b:5c0c:: with SMTP id z12mr18053584ioh.171.1630443152252; Tue, 31 Aug 2021 13:52:32 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id a17sm10886203ilp.75.2021.08.31.13.52.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:31 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:31 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 20/27] t5326: test multi-pack bitmap behavior Message-ID: <8895114ace2e9d5392f5c71b7ea4190b90c01db8.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This patch introduces a new test, t5326, which tests the basic functionality of multi-pack bitmaps. Some trivial behavior is tested, such as: - Whether bitmaps can be generated with more than one pack. - Whether clones can be served with all objects in the bitmap. - Whether follow-up fetches can be served with some objects outside of the server's bitmap These use lib-bitmap's tests (which in turn were pulled from t5310), and we cover cases where the MIDX represents both a single pack and multiple packs. In addition, some non-trivial and MIDX-specific behavior is tested, too, including: - Whether multi-pack bitmaps behave correctly with respect to the pack-reuse machinery when the base for some object is selected from a different pack than the delta. - Whether multi-pack bitmaps correctly respect the pack.preferBitmapTips configuration. Signed-off-by: Taylor Blau --- t/t5326-multi-pack-bitmaps.sh | 286 ++++++++++++++++++++++++++++++++++ 1 file changed, 286 insertions(+) create mode 100755 t/t5326-multi-pack-bitmaps.sh diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh new file mode 100755 index 0000000000..4ad7c2c969 --- /dev/null +++ b/t/t5326-multi-pack-bitmaps.sh @@ -0,0 +1,286 @@ +#!/bin/sh + +test_description='exercise basic multi-pack bitmap functionality' +. ./test-lib.sh +. "${TEST_DIRECTORY}/lib-bitmap.sh" + +# We'll be writing our own midx and bitmaps, so avoid getting confused by the +# automatic ones. +GIT_TEST_MULTI_PACK_INDEX=0 +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + +objdir=.git/objects +midx=$objdir/pack/multi-pack-index + +# midx_pack_source +midx_pack_source () { + test-tool read-midx --show-objects .git/objects | grep "^$1 " | cut -f2 +} + +setup_bitmap_history + +test_expect_success 'enable core.multiPackIndex' ' + git config core.multiPackIndex true +' + +test_expect_success 'create single-pack midx with bitmaps' ' + git repack -ad && + git multi-pack-index write --bitmap && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev +' + +basic_bitmap_tests + +test_expect_success 'create new additional packs' ' + for i in $(test_seq 1 16) + do + test_commit "$i" && + git repack -d || return 1 + done && + + git checkout -b other2 HEAD~8 && + for i in $(test_seq 1 8) + do + test_commit "side-$i" && + git repack -d || return 1 + done && + git checkout second +' + +test_expect_success 'create multi-pack midx with bitmaps' ' + git multi-pack-index write --bitmap && + + ls $objdir/pack/pack-*.pack >packs && + test_line_count = 25 packs && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev +' + +basic_bitmap_tests + +test_expect_success '--no-bitmap is respected when bitmaps exist' ' + git multi-pack-index write --bitmap && + + test_commit respect--no-bitmap && + git repack -d && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev && + + git multi-pack-index write --no-bitmap && + + test_path_is_file $midx && + test_path_is_missing $midx-$(midx_checksum $objdir).bitmap && + test_path_is_missing $midx-$(midx_checksum $objdir).rev +' + +test_expect_success 'setup midx with base from later pack' ' + # Write a and b so that "a" is a delta on top of base "b", since Git + # prefers to delete contents out of a base rather than add to a shorter + # object. + test_seq 1 128 >a && + test_seq 1 130 >b && + + git add a b && + git commit -m "initial commit" && + + a=$(git rev-parse HEAD:a) && + b=$(git rev-parse HEAD:b) && + + # In the first pack, "a" is stored as a delta to "b". + p1=$(git pack-objects .git/objects/pack/pack <<-EOF + $a + $b + EOF + ) && + + # In the second pack, "a" is missing, and "b" is not a delta nor base to + # any other object. + p2=$(git pack-objects .git/objects/pack/pack <<-EOF + $b + $(git rev-parse HEAD) + $(git rev-parse HEAD^{tree}) + EOF + ) && + + git prune-packed && + # Use the second pack as the preferred source, so that "b" occurs + # earlier in the MIDX object order, rendering "a" unusable for pack + # reuse. + git multi-pack-index write --bitmap --preferred-pack=pack-$p2.idx && + + have_delta $a $b && + test $(midx_pack_source $a) != $(midx_pack_source $b) +' + +rev_list_tests 'full bitmap with backwards delta' + +test_expect_success 'clone with bitmaps enabled' ' + git clone --no-local --bare . clone-reverse-delta.git && + test_when_finished "rm -fr clone-reverse-delta.git" && + + git rev-parse HEAD >expect && + git --git-dir=clone-reverse-delta.git rev-parse HEAD >actual && + test_cmp expect actual +' + +bitmap_reuse_tests() { + from=$1 + to=$2 + + test_expect_success "setup pack reuse tests ($from -> $to)" ' + rm -fr repo && + git init repo && + ( + cd repo && + test_commit_bulk 16 && + git tag old-tip && + + git config core.multiPackIndex true && + if test "MIDX" = "$from" + then + git repack -Ad && + git multi-pack-index write --bitmap + else + git repack -Adb + fi + ) + ' + + test_expect_success "build bitmap from existing ($from -> $to)" ' + ( + cd repo && + test_commit_bulk --id=further 16 && + git tag new-tip && + + if test "MIDX" = "$to" + then + git repack -d && + git multi-pack-index write --bitmap + else + git repack -Adb + fi + ) + ' + + test_expect_success "verify resulting bitmaps ($from -> $to)" ' + ( + cd repo && + git for-each-ref && + git rev-list --test-bitmap refs/tags/old-tip && + git rev-list --test-bitmap refs/tags/new-tip + ) + ' +} + +bitmap_reuse_tests 'pack' 'MIDX' +bitmap_reuse_tests 'MIDX' 'pack' +bitmap_reuse_tests 'MIDX' 'MIDX' + +test_expect_success 'missing object closure fails gracefully' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit loose && + test_commit packed && + + # Do not pass "--revs"; we want a pack without the "loose" + # commit. + git pack-objects $objdir/pack/pack <<-EOF && + $(git rev-parse packed) + EOF + + test_must_fail git multi-pack-index write --bitmap 2>err && + grep "doesn.t have full closure" err && + test_path_is_missing $midx + ) +' + +test_expect_success 'setup partial bitmaps' ' + test_commit packed && + git repack && + test_commit loose && + git multi-pack-index write --bitmap 2>err && + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev +' + +basic_bitmap_tests HEAD~ + +test_expect_success 'removing a MIDX clears stale bitmaps' ' + rm -fr repo && + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + test_commit base && + git repack && + git multi-pack-index write --bitmap && + + # Write a MIDX and bitmap; remove the MIDX but leave the bitmap. + stale_bitmap=$midx-$(midx_checksum $objdir).bitmap && + stale_rev=$midx-$(midx_checksum $objdir).rev && + rm $midx && + + # Then write a new MIDX. + test_commit new && + git repack && + git multi-pack-index write --bitmap && + + test_path_is_file $midx && + test_path_is_file $midx-$(midx_checksum $objdir).bitmap && + test_path_is_file $midx-$(midx_checksum $objdir).rev && + test_path_is_missing $stale_bitmap && + test_path_is_missing $stale_rev + ) +' + +test_expect_success 'pack.preferBitmapTips' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit_bulk --message="%s" 103 && + + git log --format="%H" >commits.raw && + sort commits && + + git log --format="create refs/tags/%s %H" HEAD >refs && + git update-ref --stdin bitmaps && + comm -13 bitmaps commits >before && + test_line_count = 1 before && + + perl -ne "printf(\"create refs/tags/include/%d \", $.); print" \ + bitmaps && + comm -13 bitmaps commits >after && + + ! test_cmp before after + ) +' + +test_done From patchwork Tue Aug 31 20:52:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467837 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A195C432BE for ; Tue, 31 Aug 2021 20:52:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1F1C561026 for ; Tue, 31 Aug 2021 20:52:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241341AbhHaUxo (ORCPT ); Tue, 31 Aug 2021 16:53:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241339AbhHaUxc (ORCPT ); Tue, 31 Aug 2021 16:53:32 -0400 Received: from mail-io1-xd32.google.com (mail-io1-xd32.google.com [IPv6:2607:f8b0:4864:20::d32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 660EFC0612E7 for ; Tue, 31 Aug 2021 13:52:35 -0700 (PDT) Received: by mail-io1-xd32.google.com with SMTP id j18so766973ioj.8 for ; Tue, 31 Aug 2021 13:52:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=ZEa3Fi1uZnCurzx/46RtiTv/7nsRoAGFEHKRIkqUa1Q=; b=W5VNYn0L7Gh1dw2HKV1i3MB82PQ6ch/PLFyRJByLXrpmIVFmfbxXDkoHBUnEM8Scqj HnYUkTrQfhU3QeUKbrehKQoOHizvryl/AlyyqPH6bbYUDPqZRxC4BGIbwG+vX3EC7WGv Yw5g4qzgotAfl27iHc1dsCaoqBdAvZKNerDt1A7Ad8BJ/qJ9wNM1/bDlMCf0q9RKSu23 DLHbDT2n/qDegx8vhV+mku+hiYk6zbhVkIK8mYBI9oqExlMBmWJTv5Iz3KOM3hKozOek kctUZhJx+YJYBSjxCb0GUbo3wlADcoITa/CrnmPo4oiK8IcP/XGs94uasaQPV7/2DbE6 OB/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=ZEa3Fi1uZnCurzx/46RtiTv/7nsRoAGFEHKRIkqUa1Q=; b=EcBLIyvjiliSkstu4mJclLyGDPiBN+yj1RRU1UGAACN6+VqC04Xa5PQo0QTyd4YI8T 6z80cEshVZBL3w8cl4SYDj+S8ySXtMqvhFdiD0Pdo80pqUtaCwm3Jz3Cv2NeO4PhQLti GqbWtfjdS/F8ocPNRJMu62gsSNYBEyxMTS8o2zqNVAKfI7knFJ3SZjH1jXmC74XXo5WW Lh5H35kPzuiJ9gTrcObxoXzwrlTkRP9bUXfwfo7lMhbc02EAtwa1q897hgjDiZSOxqYP un1rYEG08yQBjlFxca6B8srOOdEsxFd+nKRtEbLQOxdUH8gbtUbVA7jFHVaTLAxzFW57 TOcQ== X-Gm-Message-State: AOAM532fWCrIipTLYG5HQmSRk8J1W2ntCE0gOOzgWndZAw+iIwcrctdA ofuorI8FDQBSOUQGIgPMvem6QkGOL5hmHBnJ X-Google-Smtp-Source: ABdhPJzSoqr3whXloFwfjTQKr23vw4YPEkOXAx1wgc6k3crKDjG29BJUyI4GFFakcFRAuf0JgFDGrQ== X-Received: by 2002:a02:cf05:: with SMTP id q5mr4586794jar.132.1630443154715; Tue, 31 Aug 2021 13:52:34 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id a4sm10484075ioe.19.2021.08.31.13.52.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:34 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:33 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 21/27] t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP Message-ID: <94b1317e0c42bf5fd7492eb6e51e8ff6079595cc.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Generating a MIDX bitmap causes tests which repack in a partial clone to fail because they are missing objects. Missing objects is an expected component of tests in t0410, so disable this knob altogether. Graceful degradation when writing a bitmap with missing objects is tested in t5326. Signed-off-by: Taylor Blau --- t/t0410-partial-clone.sh | 3 +++ 1 file changed, 3 insertions(+) diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh index bbcc51ee8e..bba679685f 100755 --- a/t/t0410-partial-clone.sh +++ b/t/t0410-partial-clone.sh @@ -4,6 +4,9 @@ test_description='partial clone' . ./test-lib.sh +# missing promisor objects cause repacks which write bitmaps to fail +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + delete_object () { rm $1/.git/objects/$(echo $2 | sed -e 's|^..|&/|') } From patchwork Tue Aug 31 20:52:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467833 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA1C6C4320A for ; Tue, 31 Aug 2021 20:52:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D421C61026 for ; Tue, 31 Aug 2021 20:52:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241350AbhHaUxs (ORCPT ); Tue, 31 Aug 2021 16:53:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42108 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241309AbhHaUxd (ORCPT ); Tue, 31 Aug 2021 16:53:33 -0400 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B94CAC061575 for ; Tue, 31 Aug 2021 13:52:37 -0700 (PDT) Received: by mail-io1-xd2b.google.com with SMTP id z1so779067ioh.7 for ; Tue, 31 Aug 2021 13:52:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=n6qnx6eeG0uYPYB1EanAtySI9p8uXGfLDW6fS1AT9ks=; b=ELX4iEhxKOni/PW/FbUd3/XmKj9+e5X9m+XHCI38Nff6D3Ua9HqLckJnUkxXXl5Gb+ S26kNxPG1TlZX2AElKox6CCjbZ0TL+C+ZWb2du5u1KSqmHGa6xn4ENfAmzx+kqZffxnO tYWrMYOVMfHnxL2JKIPvbkwqPrx1iv+TVVrGsY12iqITJLMv084KlWIyNzV11ud4Drim ko2bAAkOD3U8a581KzMxq5LohlUJJ2jftWoDMJpJSQWq4qv18CD0Uy2n1cBPSSNDkT5f b/9qNclVwp7WqkJHUyyqIVrdUw8mr4ijNlaf7BWOaV0vx0zdPYIFjJXhDsjNnlwKzvtJ vDeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=n6qnx6eeG0uYPYB1EanAtySI9p8uXGfLDW6fS1AT9ks=; b=PtzpGbb2OAjKwyAXxrEvXP8Gi0SDDSHSka4jGwKjOZ/19kKi2TpauN1ExusOp+BNT/ tU4mkf2rQ/SkpABtRd8BWSQ9gFZP1OE+JMRmPf4w3Y08TJsieq2gtkN1f5WU2DMvgULu Q19kkRTuFsqEMGMSM34ER+4K3OFuzvLtznWRf6B9ypgRSIEUVTPhNIhCph8n3YTtZqfa UulY5hlqwrWn7tUPYqKm5n9YmW+XaIfptSC/ly84DCsMG60WiByZHO4voFA0UapgBdF9 xUtvGa1hOtvpzVD6PyeNuimKVf1t1AFEApS2bUQPvLxYUNgocp4/+cZHMRzBw9feVLmz S3jQ== X-Gm-Message-State: AOAM533a3WkzmfB3on2Sz9XUpMQA46I5dbG9VroaU+BY0pDxPF1d7/GL hJaBhLUD5KBtwS4BNEewIJRDUKFSwbenuPQ6 X-Google-Smtp-Source: ABdhPJy9CjYXeb5IN3EvtDyhWaL8NKKG9Hp2r2N4lIRG44j5+0nx9cItbINjzWY6qL7Wxpq4pZJn1A== X-Received: by 2002:a02:cc59:: with SMTP id i25mr4613933jaq.125.1630443157143; Tue, 31 Aug 2021 13:52:37 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id y12sm10805306ilm.81.2021.08.31.13.52.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:36 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:36 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 22/27] t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff King Generating a MIDX bitmap confuses many of the tests in t5310, which expect to control whether and how bitmaps are written. Since the relevant MIDX-bitmap tests here are covered already in t5326, let's just disable the flag for the whole t5310 script. Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- t/t5310-pack-bitmaps.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index 4318f84d53..673baa5c3c 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -8,6 +8,10 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . "$TEST_DIRECTORY"/lib-bundle.sh . "$TEST_DIRECTORY"/lib-bitmap.sh +# t5310 deals only with single-pack bitmaps, so don't write MIDX bitmaps in +# their place. +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 + objpath () { echo ".git/objects/$(echo "$1" | sed -e 's|\(..\)|\1/|')" } From patchwork Tue Aug 31 20:52:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467835 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F04EC43214 for ; Tue, 31 Aug 2021 20:52:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E934F6103A for ; Tue, 31 Aug 2021 20:52:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241357AbhHaUxt (ORCPT ); Tue, 31 Aug 2021 16:53:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241324AbhHaUxg (ORCPT ); Tue, 31 Aug 2021 16:53:36 -0400 Received: from mail-il1-x12f.google.com (mail-il1-x12f.google.com [IPv6:2607:f8b0:4864:20::12f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41FADC061575 for ; Tue, 31 Aug 2021 13:52:40 -0700 (PDT) Received: by mail-il1-x12f.google.com with SMTP id u7so690385ilk.7 for ; Tue, 31 Aug 2021 13:52:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=Ne9hO33HlOEkL0I+Xo8+9iTfXidjrHnaKTzmCB49g5g=; b=pLi80PT3ykXoONYW22CKdz12D8+QAfzj89GTJZXJSG1gNLXhGVSfSitwlQxBuxMsyb p31EvWklbZuxbeC0VLu9lvsJJP4GdUcyFb5lEjrrs7A0cQGxS3JrBIU8ppy5UrvDL9r9 zmOVmSh/NPFh9jUM9rQikVc0ZtzRkq/upR12wkElB8VJOSdTh08NKCogUhbGNzyEXy06 t61SqqduBV158iRdmOEIekR3wXPIDUsd4RoeXc6VYR8LSoGH4imzoRMsQ90tjvQ/NZEV ASSB+kLCt2nk5fpF0Wtzq1w8g+DybDB44w0fxRmgWI1wuy/0ny4t2gafqTAC8KfMtJpY /QDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=Ne9hO33HlOEkL0I+Xo8+9iTfXidjrHnaKTzmCB49g5g=; b=qvU4C+OiwVv827RzYfwoWbxGAvmxZBhQpss70IPAYqOzdAtzPB2zMIA0rf8OnM/6RE uMF4F7nwnP4IhDs3zOYyNc4BZVIxrQm4oJLaw/dLeydYSiwdsiuOA2Svy/VwwypSW+y1 vApTuyQTGLPaQWN+lChKK632MPpg12pqpT5u8FV0m675WXR+CRY/y0kaljqnZ8nBHraI SN0/HDv4s3/koHGi9TMLhPq6T9qTp4BXtRdwjJ43lBudYsDmS/FaHOis14IwjEf7PxKx TqTsz0ev7C5PUMnWu3rvcqzSGgwvJDz1nSJfOcL7RYlM1ZcWWE4RtRP2OBFaHpQfsi5L yb5w== X-Gm-Message-State: AOAM533xtriczgwjB2wpEnr2D3tTbgbPWn6ZZyc7HNq2vbA9+qcWX2UH t+vn5DR+F2p4eYPHSXoSzCDZcKHVqqkIOkmO X-Google-Smtp-Source: ABdhPJwgeYMhlo0Hk8BI8H5KoB2FuBZG/JjmHCbGHgr1ZD90xhQrNJzHJQAKgipku/c5E4pLr77Dow== X-Received: by 2002:a05:6e02:dcb:: with SMTP id l11mr21766442ilj.169.1630443159574; Tue, 31 Aug 2021 13:52:39 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id z7sm10977974ilz.25.2021.08.31.13.52.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:39 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:38 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 23/27] t5319: don't write MIDX bitmaps in t5319 Message-ID: <92a6370e7715c34a1ac7ccc090ef501c1b5dcc58.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This test is specifically about generating a midx still respecting a pack-based bitmap file. Generating a MIDX bitmap would confuse the test. Let's override the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' variable to make sure we don't do so. Signed-off-by: Taylor Blau --- t/t5319-multi-pack-index.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index d7e4988f2b..b3f9f3969d 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -532,7 +532,8 @@ test_expect_success 'repack preserves multi-pack-index when creating packs' ' compare_results_with_midx "after repack" test_expect_success 'multi-pack-index and pack-bitmap' ' - git -c repack.writeBitmaps=true repack -ad && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writeBitmaps=true repack -ad && git multi-pack-index write && git rev-list --test-bitmap HEAD ' From patchwork Tue Aug 31 20:52:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467839 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B2F6C4320E for ; Tue, 31 Aug 2021 20:52:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 829546103A for ; Tue, 31 Aug 2021 20:52:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241212AbhHaUxv (ORCPT ); Tue, 31 Aug 2021 16:53:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241322AbhHaUxi (ORCPT ); Tue, 31 Aug 2021 16:53:38 -0400 Received: from mail-il1-x129.google.com (mail-il1-x129.google.com [IPv6:2607:f8b0:4864:20::129]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6F24C061796 for ; Tue, 31 Aug 2021 13:52:42 -0700 (PDT) Received: by mail-il1-x129.google.com with SMTP id l10so683566ilh.8 for ; Tue, 31 Aug 2021 13:52:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=dXZ7G+IY/SOmjuoLnf7Th919pFizsyLyvUA/yEhVAVM=; b=K1385AgmSe7ICr84laxGhC5PK6d6bBFBavOWWccDBIWteLvjtfOdppaW7JXsW9TAh0 nr5k1ZT+HMFvYgHFQxW7LofK/D+QPYWxMXKxNF4owrJebQDScLBfs6C/18YcgdJJRTUr pT6ng0wEosbcpzkmKmqc6zgODAjNv1Ai0bnYFGn4InEdNy7QCeD7gH/tcQ81JHzOEVen ey5w9r9piUrejA1nVgUWkCEliQS0ErXjsUOYJvqijPus7gWCf3+0o2OlIqGS2XIBSLq+ s8O4dvKa6bXhWQGIX4KA6TbQBE+LtUl+mSdJJMEMHX25kBAJBjU+Fge8ok1pMyCNUzxe Dn1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=dXZ7G+IY/SOmjuoLnf7Th919pFizsyLyvUA/yEhVAVM=; b=Lau8RDCXH2EfE/qDQRFS6TMGEz950L5DTbnVm9559mLTEFnMMtuOoVjL3SlGgl73J+ XGJqVKPmJDikjbJz6IIQY3MtWIsIxhAMacVJhOk4nGZH2ABVaTo7fe1rn7yEK2cENaG3 o/XiBjpOzqB4Ov4kRKWkp4rgPko4cTNLnJQKU1b8RC3JL9gJRhFIx3B7CPK2x/G2db6i 1SP5vVbUqnBbuTjQK1DUUC+iBER+BBvopeUN56Siyq0QsmXh55ocaoIleGv3weWI5FZQ 2h/G/u5EHpOkfFdA13vNp9vlq3tQLLymWCpwSpIT8hVJhyCqoArrdWu2kQGf/XjPBF1M bP6g== X-Gm-Message-State: AOAM531Tc96id/Ajk1atxJmRBunJ7suNZpFdzJ1SrGXZujg+3klbF+q/ dZQ+KP5SNaaI2Tj4+z0oFJTurJy0/te2k3MC X-Google-Smtp-Source: ABdhPJxRwgIs3VRHwzm/EcbbTIExoPlobQ97JIsU0w1fW9R0DOXKD9Kg3wz/bqjm2dy73zaAqxNTpw== X-Received: by 2002:a92:c151:: with SMTP id b17mr21800088ilh.1.1630443162023; Tue, 31 Aug 2021 13:52:42 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id s7sm9668245ioc.42.2021.08.31.13.52.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:41 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:41 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 24/27] t7700: update to work with MIDX bitmap test knob Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A number of these tests are focused only on pack-based bitmaps and need to be updated to disable 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' where necessary. Signed-off-by: Taylor Blau --- t/t7700-repack.sh | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 25b235c063..98eda3bfeb 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -63,13 +63,14 @@ test_expect_success 'objects in packs marked .keep are not repacked' ' test_expect_success 'writing bitmaps via command-line can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git repack -Adbl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 git repack -Adbl && test_has_duplicate_object true ' test_expect_success 'writing bitmaps via config can duplicate .keep objects' ' # build on $oid, $packid, and .keep state from previous - git -c repack.writebitmaps=true repack -Adl && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writebitmaps=true repack -Adl && test_has_duplicate_object true ' @@ -189,7 +190,9 @@ test_expect_success 'repack --keep-pack' ' test_expect_success 'bitmaps are created by default in bare repos' ' git clone --bare .git bare.git && - git -C bare.git repack -ad && + rm -f bare.git/objects/pack/*.bitmap && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap) && test_path_is_file "$bitmap" ' @@ -200,7 +203,8 @@ test_expect_success 'incremental repack does not complain' ' ' test_expect_success 'bitmaps can be disabled on bare repos' ' - git -c repack.writeBitmaps=false -C bare.git repack -ad && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -c repack.writeBitmaps=false -C bare.git repack -ad && bitmap=$(ls bare.git/objects/pack/*.bitmap || :) && test -z "$bitmap" ' @@ -211,7 +215,8 @@ test_expect_success 'no bitmaps created if .keep files present' ' keep=${pack%.pack}.keep && test_when_finished "rm -f \"\$keep\"" && >"$keep" && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && test_must_be_empty stderr && find bare.git/objects/pack/ -type f -name "*.bitmap" >actual && test_must_be_empty actual @@ -222,7 +227,8 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' ' blob=$(test-tool genrandom big $((1024*1024)) | git -C bare.git hash-object -w --stdin) && git -C bare.git update-ref refs/tags/big $blob && - git -C bare.git repack -ad 2>stderr && + GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \ + git -C bare.git repack -ad 2>stderr && test_must_be_empty stderr && find bare.git/objects/pack -type f -name "*.bitmap" >actual && test_must_be_empty actual From patchwork Tue Aug 31 20:52:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467841 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2927FC432BE for ; Tue, 31 Aug 2021 20:52:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 10A7B6103A for ; Tue, 31 Aug 2021 20:52:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241384AbhHaUxx (ORCPT ); Tue, 31 Aug 2021 16:53:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241360AbhHaUxt (ORCPT ); Tue, 31 Aug 2021 16:53:49 -0400 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 380ABC061292 for ; Tue, 31 Aug 2021 13:52:45 -0700 (PDT) Received: by mail-io1-xd2b.google.com with SMTP id e186so734452iof.12 for ; Tue, 31 Aug 2021 13:52:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=C1W1zkLF6+PUEqyzPRKGz+wxHjEIxGpKnu9spXfGQyg=; b=EEmyyAWfQ0iIDdBmeTnkceImv8ZpYLaAFhnSNEXhrPmHC5oD50rZQkoTOoRCV6/ghD Pe50dZFxwSdEB/md77sL/p6rT6fj0/uW+Jz379gszrZpzlSxqLjJNgQBftwndEAhHHIe gu2CEJYhdUyckuV1Y8txqKNqU37QKhAWdq7oXsuvRHCDvip9A83+87XThCO71fSxp6Mf t/ofZFFRYgZXrHGSTMPvadSvffDiohmC92y7IPS5olXW/vHYmuj+5mIAcUT6YJ+4rjzC gVYSZczknxc/GzhPLiNYtovxqumtgpBSqOP66k95putpCiKc2umvPR9KIEjh1pEjQHC6 q8cA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=C1W1zkLF6+PUEqyzPRKGz+wxHjEIxGpKnu9spXfGQyg=; b=Xsc5AqBotopI/qooc2u3vqafX35PMv4jH8rGZYwelzsaB4srzHYluwwgQ7kDGuQyK+ YyEqqLCsvA4AveEzGI8O5lsHZNgimI+V1RRYMqu4fFg2Nnxo4lTTGkWKPHCsDJ6P6Z+f cexk8L53g2K8K77ezYCDIFRVDz5lwtGFsgm0OdQidiRG8Dq158VcdyO0fWbwhzpHB9nP mMTdXZDAvWe1H0Z4TVYoRAa1EFUep4vO/Hgh7IPyFufQ3rTWtsci+qlcN0Yfwu02d6pB Td4Pa+dpy8HOMo8lwaLAz0mfuuMo78rFamhX50QAm+RFLlUPwdSLhIOoXUXX3w5sb5YG UJLg== X-Gm-Message-State: AOAM531cImmBdog9IEB0t8oYkgmWP1Ahi8OHuC8VCpUwnKkFRFna7OkZ VE7kZaccbJCKkBTFC3balGusJHjBfFTOC9vw X-Google-Smtp-Source: ABdhPJzfizQ40jN8oaR8y/2M+WP4b84//ZfbS6cPC5Nv6DDe7538QNnsbFlT0TzlMU7SS1AAnFsT0Q== X-Received: by 2002:a02:9542:: with SMTP id y60mr4500102jah.87.1630443164516; Tue, 31 Aug 2021 13:52:44 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id a14sm10666950iol.24.2021.08.31.13.52.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:44 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:43 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 25/27] midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' Message-ID: <44a4800756de7749e17cea0db5790a876db96d28.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable to also write a multi-pack bitmap when 'GIT_TEST_MULTI_PACK_INDEX' is set. Signed-off-by: Taylor Blau --- builtin/repack.c | 12 ++++++++++-- ci/run-build-and-tests.sh | 1 + midx.h | 2 ++ t/README | 4 ++++ 4 files changed, 17 insertions(+), 2 deletions(-) diff --git a/builtin/repack.c b/builtin/repack.c index 5f9bc74adc..82ab668272 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -515,6 +515,10 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()) write_bitmaps = 0; + } else if (write_bitmaps && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) && + git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) { + write_bitmaps = 0; } if (pack_kept_objects < 0) pack_kept_objects = write_bitmaps > 0; @@ -725,8 +729,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) update_server_info(0); remove_temporary_files(); - if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), NULL, 0); + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) { + unsigned flags = 0; + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) + flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX; + write_midx_file(get_object_directory(), NULL, flags); + } string_list_clear(&names, 0); string_list_clear(&rollback, 0); diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 3ce81ffee9..7ee9ba9325 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -23,6 +23,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 + export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master export GIT_TEST_WRITE_REV_INDEX=1 diff --git a/midx.h b/midx.h index 350f4d0a7b..aa3da557bb 100644 --- a/midx.h +++ b/midx.h @@ -8,6 +8,8 @@ struct pack_entry; struct repository; #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" +#define GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP \ + "GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP" struct multi_pack_index { struct multi_pack_index *next; diff --git a/t/README b/t/README index 9e70122302..12014aa988 100644 --- a/t/README +++ b/t/README @@ -425,6 +425,10 @@ GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack- index to be written after every 'git repack' command, and overrides the 'core.multiPackIndex' setting to true. +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=, when true, sets the +'--bitmap' option on all invocations of 'git multi-pack-index write', +and ignores pack-objects' '--write-bitmap-index'. + GIT_TEST_SIDEBAND_ALL=, when true, overrides the 'uploadpack.allowSidebandAll' setting to true, and when false, forces fetch-pack to not request sideband-all (even if the server advertises From patchwork Tue Aug 31 20:52:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467843 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DED2C432BE for ; Tue, 31 Aug 2021 20:53:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2D4ED60F9E for ; Tue, 31 Aug 2021 20:53:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241356AbhHaUyH (ORCPT ); Tue, 31 Aug 2021 16:54:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241332AbhHaUxw (ORCPT ); Tue, 31 Aug 2021 16:53:52 -0400 Received: from mail-il1-x132.google.com (mail-il1-x132.google.com [IPv6:2607:f8b0:4864:20::132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 96ECCC0612A6 for ; Tue, 31 Aug 2021 13:52:47 -0700 (PDT) Received: by mail-il1-x132.google.com with SMTP id x5so730696ill.3 for ; Tue, 31 Aug 2021 13:52:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=IPXB1qBMlOGpG/jyerww7SfinUgH7Yh1oF/wkRRBjWg=; b=VyWo3Ej+Ij9j5RlhOJIvVHlSO9pwB87VJC7E7yMyLOlxiecClmE/GdB/T0IIeMO+HE G9Lpi3uO6jGBz8bTL+qSVxz9IK/oIxioje2K+ySYi833kxp33PvRvOVE4Mjnhev51tKM lOoP//Ca5rP9PtLxWJBrEBgPC7u/3XudOraNVmeKoM9Ojab8/wD/WLMqBO7WVO9YZSos 0rUnKwekbfFc4mUxlmEOSBblMmyxkMnMvki8SlTlXofBGjHqUu76HBLfLRMEF+zsSLQh KquwSIAa186Ii9xDFXkKs5R/LoFs+SeBoCj0juglWOQS4xeyg+OoSkBgjdqnq7rANyyA uFaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=IPXB1qBMlOGpG/jyerww7SfinUgH7Yh1oF/wkRRBjWg=; b=MJ6N3XmP+HHOOWQu4x7sjHcoS6h5BTpsB4VjUxA1zwQhceT/4WNTa5zbY7vPBPyC7x UIKNdnZzdopUdDWSxpe4fo/NQaiZNPpPLo4aPtv56D7sLtWdVB+sTlFRqBCKUEthPS3r x24V6FkW09q/HQ0Lzj0bFxOA93t+0Mj7pebilVgyMCwxDNVsGlyKOGJwPeM4Y/wN4OQd 6HRVkzuTDdjbfMK6cLzXNVNVW6U+bYYfMK64dVgiVrkGHuhP3M2uFEHcKKmeSvQjZJ78 w+fd2p6HXeF9GyQ91XKeXbDXOuQ3pDxRT05q4BKO+aKjaY8Cdx8TjrRLiN4oWOHrmd8A sKIg== X-Gm-Message-State: AOAM530EptyeZxV8nkTg9j28qri1aK5DLp0bJJD7GWb8asy1Hmv4RIlN jL/+NfcLei1Q7IPD45FACylzQX+pSQgrgqzW X-Google-Smtp-Source: ABdhPJxM21cbCte9hger87Mv7FJhtBQ6pB62Yivya3T4RsZPXTJ8Cdfpewk66Ck1rd6fV+UySwDj7A== X-Received: by 2002:a05:6e02:1d08:: with SMTP id i8mr22192869ila.185.1630443166865; Tue, 31 Aug 2021 13:52:46 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id o18sm10404566ilh.52.2021.08.31.13.52.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:46 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:46 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 26/27] p5310: extract full and partial bitmap tests Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org A new p5326 introduced by the next patch will want these same tests, interjecting its own setup in between. Move them out so that both perf tests can reuse them. Signed-off-by: Taylor Blau --- t/perf/lib-bitmap.sh | 69 ++++++++++++++++++++++++++++++++++++ t/perf/p5310-pack-bitmaps.sh | 65 ++------------------------------- 2 files changed, 72 insertions(+), 62 deletions(-) create mode 100644 t/perf/lib-bitmap.sh diff --git a/t/perf/lib-bitmap.sh b/t/perf/lib-bitmap.sh new file mode 100644 index 0000000000..63d3bc7cec --- /dev/null +++ b/t/perf/lib-bitmap.sh @@ -0,0 +1,69 @@ +# Helper functions for testing bitmap performance; see p5310. + +test_full_bitmap () { + test_perf 'simulated clone' ' + git pack-objects --stdout --all /dev/null + ' + + test_perf 'simulated fetch' ' + have=$(git rev-list HEAD~100 -1) && + { + echo HEAD && + echo ^$have + } | git pack-objects --revs --stdout >/dev/null + ' + + test_perf 'pack to file (bitmap)' ' + git pack-objects --use-bitmap-index --all pack1b /dev/null + ' + + test_perf 'rev-list (commits)' ' + git rev-list --all --use-bitmap-index >/dev/null + ' + + test_perf 'rev-list (objects)' ' + git rev-list --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with tag negated via --not --all (objects)' ' + git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list with negative tag (objects)' ' + git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null + ' + + test_perf 'rev-list count with blob:none' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:none >/dev/null + ' + + test_perf 'rev-list count with blob:limit=1k' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=blob:limit=1k >/dev/null + ' + + test_perf 'rev-list count with tree:0' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' + + test_perf 'simulated partial clone' ' + git pack-objects --stdout --all --filter=blob:none /dev/null + ' +} + +test_partial_bitmap () { + test_perf 'clone (partial bitmap)' ' + git pack-objects --stdout --all /dev/null + ' + + test_perf 'pack to file (partial bitmap)' ' + git pack-objects --use-bitmap-index --all pack2b /dev/null + ' + + test_perf 'rev-list with tree filter (partial bitmap)' ' + git rev-list --use-bitmap-index --count --objects --all \ + --filter=tree:0 >/dev/null + ' +} diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh index 452be01056..7ad4f237bc 100755 --- a/t/perf/p5310-pack-bitmaps.sh +++ b/t/perf/p5310-pack-bitmaps.sh @@ -2,6 +2,7 @@ test_description='Tests pack performance using bitmaps' . ./perf-lib.sh +. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" test_perf_large_repo @@ -25,56 +26,7 @@ test_perf 'repack to disk' ' git repack -ad ' -test_perf 'simulated clone' ' - git pack-objects --stdout --all /dev/null -' - -test_perf 'simulated fetch' ' - have=$(git rev-list HEAD~100 -1) && - { - echo HEAD && - echo ^$have - } | git pack-objects --revs --stdout >/dev/null -' - -test_perf 'pack to file (bitmap)' ' - git pack-objects --use-bitmap-index --all pack1b /dev/null -' - -test_perf 'rev-list (commits)' ' - git rev-list --all --use-bitmap-index >/dev/null -' - -test_perf 'rev-list (objects)' ' - git rev-list --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with tag negated via --not --all (objects)' ' - git rev-list perf-tag --not --all --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list with negative tag (objects)' ' - git rev-list HEAD --not perf-tag --use-bitmap-index --objects >/dev/null -' - -test_perf 'rev-list count with blob:none' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:none >/dev/null -' - -test_perf 'rev-list count with blob:limit=1k' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=blob:limit=1k >/dev/null -' - -test_perf 'rev-list count with tree:0' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' - -test_perf 'simulated partial clone' ' - git pack-objects --stdout --all --filter=blob:none /dev/null -' +test_full_bitmap test_expect_success 'create partial bitmap state' ' # pick a commit to represent the repo tip in the past @@ -97,17 +49,6 @@ test_expect_success 'create partial bitmap state' ' git update-ref HEAD $orig_tip ' -test_perf 'clone (partial bitmap)' ' - git pack-objects --stdout --all /dev/null -' - -test_perf 'pack to file (partial bitmap)' ' - git pack-objects --use-bitmap-index --all pack2b /dev/null -' - -test_perf 'rev-list with tree filter (partial bitmap)' ' - git rev-list --use-bitmap-index --count --objects --all \ - --filter=tree:0 >/dev/null -' +test_partial_bitmap test_done From patchwork Tue Aug 31 20:52:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 12467845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13482C4320E for ; Tue, 31 Aug 2021 20:53:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F2E146103A for ; Tue, 31 Aug 2021 20:53:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241389AbhHaUyI (ORCPT ); Tue, 31 Aug 2021 16:54:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241340AbhHaUxw (ORCPT ); Tue, 31 Aug 2021 16:53:52 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18AEFC0612AF for ; Tue, 31 Aug 2021 13:52:50 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id m11so789856ioo.6 for ; Tue, 31 Aug 2021 13:52:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=0/zjnVzIMgjSIymqIYNSYr2ZIy25x9rq9SNyHtEUciE=; b=HUgFv2mU/1pYawjkPGQC0RNfvE0kJaANaeGDCH4DZPC7aWetozjDY3C0EN7wxLlwl1 tqVDobd9q8ovq0OXcRAJivziHvFmPkEiQ8Q+cWHK0p3aWLCbR2MQgV/MbRRdUciE9m4Z Jqu6BMQRE8jeRZwTsgfZgR5OJYFlWFYkHCAYxvGLhjRzCJz6Mok3snXrQDsJbDdaoXqE WWqXm7e5G8MvvndiE5dej1jD4qcDPJYnatuDJ0UHcMuoyo68XmyFfBVbJksvg8ZevU+F A0Dp0TXjS6mocmKBK9pjW5BKSQv1a8LCeG7PdMMToLWwzsHahOXZ4bhs+WJ4JBCPYgsn FkOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=0/zjnVzIMgjSIymqIYNSYr2ZIy25x9rq9SNyHtEUciE=; b=E61nGvZka79cYVW4nsCIzg3JNe2rqbC+uKhyJQdz2BeKh7Px9GyV07/ypKLMtuEwyi GvQlr0m7dZIE2y/7xzO1porukpZy5dqpMUFLe8m+mlrHUyQdgTE70bvWJHdxvXBxF0mO vuHML+ZItScujMBiTkh1uCG4Cg1lxB1s8btuQSrCdQnbV/xhNuxtbBU2OK3h6O0AcaWy gm1tkLgdj4dOpeeV6OXQA8aZGGiCkAemMMUCBEgkOAMIfzdfxJKHKO2CwpNIunLeVYKV PK0StdQjGU0AabJhaupBZ6hzjNioLUM6RQkpeom5eSZfampFXwzhdVDoCoJiZu/2ND9J 2YAA== X-Gm-Message-State: AOAM5320qu/812gnlESf7A7FA3/6sB+2lqGrfXilpw5ppQrlx2E448ZS so1GfaaZLTMRLFf4Mlgg5JIRUc1lyiJs750Q X-Google-Smtp-Source: ABdhPJwlQMZfHj7MsSQkmC72P+Ug05IhSCCXH0O1K81UdMjp0m0djuNm/1Y4df6rq8CZwLDei+rqMQ== X-Received: by 2002:a05:6602:2219:: with SMTP id n25mr23770589ion.185.1630443169411; Tue, 31 Aug 2021 13:52:49 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id z16sm10843751ile.72.2021.08.31.13.52.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 31 Aug 2021 13:52:49 -0700 (PDT) Date: Tue, 31 Aug 2021 16:52:48 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com, jonathantanmy@google.com Subject: [PATCH v5 27/27] p5326: perf tests for MIDX bitmaps Message-ID: <6888fe01aae6710959f52f15a862f14adba02d22.1630443072.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org These new performance tests demonstrate effectively the same behavior as p5310, but use a multi-pack bitmap instead of a single-pack one. Notably, p5326 does not create a MIDX bitmap with multiple packs. This is so we can measure a direct comparison between it and p5310. Any difference between the two is measuring just the overhead of using MIDX bitmaps. Here are the results of p5310 and p5326 together, measured at the same time and on the same machine (using a Xenon W-2255 CPU): Test HEAD ------------------------------------------------------------------------ 5310.2: repack to disk 96.78(93.39+11.33) 5310.3: simulated clone 9.98(9.79+0.19) 5310.4: simulated fetch 1.75(4.26+0.19) 5310.5: pack to file (bitmap) 28.20(27.87+8.70) 5310.6: rev-list (commits) 0.41(0.36+0.05) 5310.7: rev-list (objects) 1.61(1.54+0.07) 5310.8: rev-list count with blob:none 0.25(0.21+0.04) 5310.9: rev-list count with blob:limit=1k 2.65(2.54+0.10) 5310.10: rev-list count with tree:0 0.23(0.19+0.04) 5310.11: simulated partial clone 4.34(4.21+0.12) 5310.13: clone (partial bitmap) 11.05(12.21+0.48) 5310.14: pack to file (partial bitmap) 31.25(34.22+3.70) 5310.15: rev-list with tree filter (partial bitmap) 0.26(0.22+0.04) versus the same tests (this time using a multi-pack index): Test HEAD ------------------------------------------------------------------------ 5326.2: setup multi-pack index 78.99(75.29+11.58) 5326.3: simulated clone 11.78(11.56+0.22) 5326.4: simulated fetch 1.70(4.49+0.13) 5326.5: pack to file (bitmap) 28.02(27.72+8.76) 5326.6: rev-list (commits) 0.42(0.36+0.06) 5326.7: rev-list (objects) 1.65(1.58+0.06) 5326.8: rev-list count with blob:none 0.26(0.21+0.05) 5326.9: rev-list count with blob:limit=1k 2.97(2.86+0.10) 5326.10: rev-list count with tree:0 0.25(0.20+0.04) 5326.11: simulated partial clone 5.65(5.49+0.16) 5326.13: clone (partial bitmap) 12.22(13.43+0.38) 5326.14: pack to file (partial bitmap) 30.05(31.57+7.25) 5326.15: rev-list with tree filter (partial bitmap) 0.24(0.20+0.04) There is slight overhead in "simulated clone", "simulated partial clone", and "clone (partial bitmap)". Unsurprisingly, that overhead is due to using the MIDX's reverse index to map between bit positions and MIDX positions. This can be reproduced by running "git repack -adb" along with "git multi-pack-index write --bitmap" in a large-ish repository. Then run: $ perf record -o pack.perf git -c core.multiPackIndex=false \ pack-objects --all --stdout >/dev/null /dev/null --- t/perf/p5326-multi-pack-bitmaps.sh | 43 ++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100755 t/perf/p5326-multi-pack-bitmaps.sh diff --git a/t/perf/p5326-multi-pack-bitmaps.sh b/t/perf/p5326-multi-pack-bitmaps.sh new file mode 100755 index 0000000000..5845109ac7 --- /dev/null +++ b/t/perf/p5326-multi-pack-bitmaps.sh @@ -0,0 +1,43 @@ +#!/bin/sh + +test_description='Tests performance using midx bitmaps' +. ./perf-lib.sh +. "${TEST_DIRECTORY}/perf/lib-bitmap.sh" + +test_perf_large_repo + +test_expect_success 'enable multi-pack index' ' + git config core.multiPackIndex true +' + +test_perf 'setup multi-pack index' ' + git repack -ad && + git multi-pack-index write --bitmap +' + +test_full_bitmap + +test_expect_success 'create partial bitmap state' ' + # pick a commit to represent the repo tip in the past + cutoff=$(git rev-list HEAD~100 -1) && + orig_tip=$(git rev-parse HEAD) && + + # now pretend we have just one tip + rm -rf .git/logs .git/refs/* .git/packed-refs && + git update-ref HEAD $cutoff && + + # and then repack, which will leave us with a nice + # big bitmap pack of the "old" history, and all of + # the new history will be loose, as if it had been pushed + # up incrementally and exploded via unpack-objects + git repack -Ad && + git multi-pack-index write --bitmap && + + # and now restore our original tip, as if the pushes + # had happened + git update-ref HEAD $orig_tip +' + +test_partial_bitmap + +test_done