diff mbox series

[v2,6/7] midx.c: include preferred pack correctly with existing MIDX

Message ID d301c4d87f8a490c90268aa251a94253f1e2b730.1661197803.git.me@ttaylorr.com (mailing list archive)
State Accepted
Commit cdf517be06b83c01d0411709c1fac0c2fc5c1a9b
Headers show
Series midx: permit changing the preferred pack when reusing the MIDX | expand

Commit Message

Taylor Blau Aug. 22, 2022, 7:50 p.m. UTC
This patch resolves an issue where the object order used to generate a
MIDX bitmap would violate an invariant that all of the preferred pack's
objects are represented by that pack in the MIDX.

The problem arises when reusing an existing MIDX while generating a new
one, and occurs specifically when the identity of the preferred pack
changes from one MIDX to another, along with a few other conditions:

    - the new preferred pack must also be present in the existing MIDX

    - the new preferred pack must *not* have been the preferred pack in
      the existing MIDX

    - most importantly, there must be at least one object present in the
      physical preferred pack (ie., it shows up in that pack's index)
      but was selected from a *different* pack when the previous MIDX
      was generated

When the above conditions are all met, we end up (incorrectly)
discarding copies of some objects in the pack selected as the preferred
pack. This is because `get_sorted_entries()` adds objects to its list
by doing the following at each fanout level:

    - first, adding all objects from that fanout level from an existing
      MIDX

    - then, adding all objects from that fanout level in each pack *not*
      included in the existing MIDX

So if some object was not selected from the to-be-preferred pack when
writing the previous MIDX, then we will never consider it as a candidate
when generating the new MIDX. This means that it's possible for the
preferred pack to not include all of its objects in the MIDX's
pseudo-pack object order, which is an invariant violation of that order.

Resolve this by adding all objects from the preferred pack separately
when it appears in the existing MIDX (if one was present). This will
duplicate objects from that pack that *did* appear in the MIDX, but this
is fine, since get_sorted_entries() already handles duplicates. (A
future optimization in this area could avoid adding copies of objects
that we know already existing in the MIDX.)

Note that we no longer need to compute the preferred-ness of objects
added from the MIDX, since we only want to select the preferred objects
from a single source. (We could still mark these preferred bits, but
doing so is redundant and unnecessary).

This resolves the bug demonstrated by t5326.174 ("preferred pack change
with existing MIDX bitmap").

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c                        | 14 +++++++-------
 t/t5326-multi-pack-bitmaps.sh | 13 +++++--------
 2 files changed, 12 insertions(+), 15 deletions(-)
diff mbox series

Patch

diff --git a/midx.c b/midx.c
index be8186eec2..bd1d27090e 100644
--- a/midx.c
+++ b/midx.c
@@ -595,7 +595,6 @@  static void midx_fanout_sort(struct midx_fanout *fanout)
 
 static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
 					struct multi_pack_index *m,
-					int preferred_pack,
 					uint32_t cur_fanout)
 {
 	uint32_t start = 0, end;
@@ -610,10 +609,7 @@  static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout,
 		nth_midxed_pack_midx_entry(m,
 					   &fanout->entries[fanout->nr],
 					   cur_object);
-		if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
-			fanout->entries[fanout->nr].preferred = 1;
-		else
-			fanout->entries[fanout->nr].preferred = 0;
+		fanout->entries[fanout->nr].preferred = 0;
 		fanout->nr++;
 	}
 }
@@ -684,8 +680,7 @@  static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 		fanout.nr = 0;
 
 		if (m)
-			midx_fanout_add_midx_fanout(&fanout, m, preferred_pack,
-						    cur_fanout);
+			midx_fanout_add_midx_fanout(&fanout, m, cur_fanout);
 
 		for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) {
 			int preferred = cur_pack == preferred_pack;
@@ -694,6 +689,11 @@  static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
 						    preferred, cur_fanout);
 		}
 
+		if (-1 < preferred_pack && preferred_pack < start_pack)
+			midx_fanout_add_pack_fanout(&fanout, info,
+						    preferred_pack, 1,
+						    cur_fanout);
+
 		midx_fanout_sort(&fanout);
 
 		/*
diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index c364677ae8..89ecd1062c 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -338,19 +338,16 @@  test_expect_success 'preferred pack change with existing MIDX bitmap' '
 		git pack-objects --all --unpacked $objdir/pack/pack0 &&
 
 		# Generate a new MIDX which changes the preferred pack
-		# to a pack contained in the existing MIDX, such that
-		# not all objects from p2 that appear in the MIDX had
-		# their copy selected from p2.
+		# to a pack contained in the existing MIDX.
 		git multi-pack-index write --bitmap \
 			--preferred-pack="pack-$p2.pack" &&
 		test_path_is_file $midx &&
 		test_path_is_file $midx-$(midx_checksum $objdir).bitmap &&
 
-		# When the above circumstances are met, an existing bug
-		# in the MIDX machinery will cause the reverse index to
-		# be read incorrectly, resulting in failed clones (among
-		# other things).
-		test_must_fail git clone --no-local . clone2
+		# When the above circumstances are met, the preferred
+		# pack should change appropriately and clones should
+		# (still) succeed.
+		git clone --no-local . clone2
 	)
 '