Message ID | c9a64b1d2a9d6b3fe1f5fb0a7303e043114fcd8f.1723743050.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | a72dfab8b8bcccee06d7bf53e5c0323e82a1765a |
Series | pseudo-merge: avoid empty and non-closed pseudo-merge commits |
On Thu, Aug 15, 2024 at 01:31:20PM -0400, Taylor Blau wrote:

> Rectify this by ensuring that the commits which are pseudo-merge
> candidates can only be so if they appear somewhere in the packing order.
>
> This is sufficient, since we know that the original packing order is
> closed under reachability, so if a commit appears in that list as a
> potential pseudo-merge candidate, we know that everything reachable from
> it also appears in the list (and thus the candidate is a good one).

Right, good explanation.

> diff --git a/pseudo-merge.c b/pseudo-merge.c
> index 6422be979c..7ec9d4c51c 100644
> --- a/pseudo-merge.c
> +++ b/pseudo-merge.c
> @@ -217,6 +217,8 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
>  	c = lookup_commit(the_repository, oid);
>  	if (!c)
>  		return 0;
> +	if (!packlist_find(writer->to_pack, oid))
> +		return 0;
>
>  	has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid);
>

And the patch looks good. I wondered about checking the packlist before
calling lookup_commit(), but the latter is really not very expensive (it
is not reading the object, but just creating a struct).

> +test_expect_success 'pseudo-merge closure' '
> +	git init pseudo-merge-closure &&
> +	(
> +		cd pseudo-merge-closure &&
> +
> +		test_commit A &&
> +		git repack -d &&
> +
> +		test_commit B &&
> +
> +		# Note that the contents of A is packed, but B is not. A
> +		# (and the objects reachable from it) are thus visible
> +		# to the MIDX, but the same is not true for B and its
> +		# objects.
> +		#
> +		# Ensure that we do not attempt to create a pseudo-merge
> +		# for B, depsite it matching the below pseudo-merge
> +		# group pattern, as doing so would result in a failure
> +		# to write a non-closed bitmap.
> +		git config bitmapPseudoMerge.test.pattern refs/ &&
> +		git config bitmapPseudoMerge.test.threshold now &&
> +
> +		git multi-pack-index write --bitmap &&

OK, clever.

In the real world, I think this would happen racily, because you'd
usually suck up all of the loose objects into a pack to feed into the
midx. And the problem is new objects (whether packed or not) that are
referenced after that step. But here we just skip that step and generate
the midx directly, which lets us do it deterministically.

-Peff
diff --git a/pseudo-merge.c b/pseudo-merge.c
index 6422be979c..7ec9d4c51c 100644
--- a/pseudo-merge.c
+++ b/pseudo-merge.c
@@ -217,6 +217,8 @@ static int find_pseudo_merge_group_for_ref(const char *refname,
 	c = lookup_commit(the_repository, oid);
 	if (!c)
 		return 0;
+	if (!packlist_find(writer->to_pack, oid))
+		return 0;
 
 	has_bitmap = bitmap_writer_has_bitmapped_object_id(writer, oid);
 
diff --git a/t/t5333-pseudo-merge-bitmaps.sh b/t/t5333-pseudo-merge-bitmaps.sh
index aa1a7d26f1..1dd6284756 100755
--- a/t/t5333-pseudo-merge-bitmaps.sh
+++ b/t/t5333-pseudo-merge-bitmaps.sh
@@ -410,4 +410,40 @@ test_expect_success 'empty pseudo-merge group' '
 	)
 '
 
+test_expect_success 'pseudo-merge closure' '
+	git init pseudo-merge-closure &&
+	(
+		cd pseudo-merge-closure &&
+
+		test_commit A &&
+		git repack -d &&
+
+		test_commit B &&
+
+		# Note that the contents of A is packed, but B is not. A
+		# (and the objects reachable from it) are thus visible
+		# to the MIDX, but the same is not true for B and its
+		# objects.
+		#
+		# Ensure that we do not attempt to create a pseudo-merge
+		# for B, depsite it matching the below pseudo-merge
+		# group pattern, as doing so would result in a failure
+		# to write a non-closed bitmap.
+		git config bitmapPseudoMerge.test.pattern refs/ &&
+		git config bitmapPseudoMerge.test.threshold now &&
+
+		git multi-pack-index write --bitmap &&
+
+		test-tool bitmap dump-pseudo-merges >pseudo-merges &&
+		test_line_count = 1 pseudo-merges &&
+
+		git rev-parse A >expect &&
+
+		test-tool bitmap list-commits >actual &&
+		test_cmp expect actual &&
+		test-tool bitmap dump-pseudo-merge-commits 0 >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
When generating pseudo-merge bitmaps, it's possible that concurrent
reference updates may reveal some pseudo-merge candidates which reach
objects that are not contained in the bitmap's pack or pseudo-pack order
(in the case of MIDX bitmaps).

The latter case is relatively easy to demonstrate: if we generate a MIDX
bitmap with only half of the repository packed, then the unpacked
contents are not part of the MIDX's object order.

If we happen to select one or more commit(s) from the unpacked portion
of the repository for inclusion in a pseudo-merge, we'll get the
following message when trying to generate its bitmap:

    $ git multi-pack-index write --bitmap
    [...]
    Selecting pseudo-merge commits: 100% (1/1), done.
    warning: Failed to write bitmap index. Packfile doesn't have full closure (object ... is missing)
    Building bitmaps:  50% (1/2), done.
    error: could not write multi-pack bitmap

, and the attempted bitmap write will fail, leaving the repository
without a current bitmap.

Rectify this by ensuring that the commits which are pseudo-merge
candidates can only be so if they appear somewhere in the packing order.

This is sufficient, since we know that the original packing order is
closed under reachability, so if a commit appears in that list as a
potential pseudo-merge candidate, we know that everything reachable from
it also appears in the list (and thus the candidate is a good one).

Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pseudo-merge.c                  |  2 ++
 t/t5333-pseudo-merge-bitmaps.sh | 36 +++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)
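
To illustrate the closure argument, here is a minimal toy model in C. It
is not git's code: the struct, the toy_* helpers and the
single-parent object graph are all hypothetical, and
toy_in_packing_order() only stands in for the packlist_find() membership
test used in the patch. It mirrors the test scenario (A packed, B left
loose) and shows that filtering ref tips by membership in a
reachability-closed packing order leaves only candidates whose whole
history is in that order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): objects are small integers in an acyclic
 * graph, each with at most one parent (-1 means "no parent"). */
#define TOY_MAX_OBJECTS 8

struct toy_repo {
	int parent[TOY_MAX_OBJECTS];         /* object that i points at, or -1 */
	bool in_pack_order[TOY_MAX_OBJECTS]; /* membership in the packing order */
};

/* Stand-in for the membership test the patch does with packlist_find(). */
static bool toy_in_packing_order(const struct toy_repo *r, int obj)
{
	assert(obj >= 0 && obj < TOY_MAX_OBJECTS);
	return r->in_pack_order[obj];
}

/* True if obj and everything reachable from it are in the packing order. */
static bool toy_fully_reachable_in_order(const struct toy_repo *r, int obj)
{
	for (; obj != -1; obj = r->parent[obj])
		if (!toy_in_packing_order(r, obj))
			return false;
	return true;
}

/* Keep only the ref tips that appear in the packing order; the kept tips
 * are written to out and their count is returned. */
static size_t toy_filter_candidates(const struct toy_repo *r,
				    const int *tips, size_t nr, int *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < nr; i++)
		if (toy_in_packing_order(r, tips[i]))
			out[kept++] = tips[i];
	return kept;
}

/* Scenario from the test: object 0 is commit A (packed), object 1 is
 * commit B (child of A, not packed). */
static struct toy_repo toy_example(void)
{
	struct toy_repo r;

	for (int i = 0; i < TOY_MAX_OBJECTS; i++) {
		r.parent[i] = -1;
		r.in_pack_order[i] = false;
	}
	r.in_pack_order[0] = true; /* A is in the packing order */
	r.parent[1] = 0;           /* B points at A but stays loose */
	return r;
}
```

Calling toy_filter_candidates() on the tips {0, 1} of toy_example()
keeps only object 0. Because the toy packing order is closed under
reachability, the cheap membership test on the tip is enough, and the
full walk in toy_fully_reachable_in_order() is only there to check that
claim.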