diff mbox series

[v3,4/9] pack-bitmap: don't rely on bitmap_git->reuse_objects

Message ID 20191115141541.11149-5-chriscool@tuxfamily.org (mailing list archive)
State New, archived
Headers show
Series Rewrite packfile reuse code | expand

Commit Message

Christian Couder Nov. 15, 2019, 2:15 p.m. UTC
From: Jeff King <peff@peff.net>

We will no longer compute bitmap_git->reuse_objects in a
following commit, so we cannot rely on it anymore to
terminate the loop early; we have to iterate to the end.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 pack-bitmap.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

Comments

Jeff King Dec. 9, 2019, 6:47 a.m. UTC | #1
On Fri, Nov 15, 2019 at 03:15:36PM +0100, Christian Couder wrote:

> From: Jeff King <peff@peff.net>
> 
> We will no longer compute bitmap_git->reuse_objects in a
> following commit, so we cannot rely on it anymore to
> terminate the loop early; we have to iterate to the end.

Hmm. In my original work from 2015 (which you never saw as individual
commits), this came somewhere in the middle, after moving to per-object
reuse.

I think by dropping these hunks now, the state of the code at this point
would mean that we might write the objects twice. We'd mark them as
"reused" and send them as part of write_reused_pack(). But we'd also
send them to pack-objects via the show_reachable_fn callback, and it
would add them to the usual packing list.

And indeed, t5310.10 fails at this point in the series with:

  Cloning into bare repository 'clone.git'...
  remote: Enumerating objects: 331, done.        
  remote: Counting objects: 100% (331/331), done.        
  remote: Compressing objects: 100% (111/111), done.        
  remote: Total 662 (delta 108), reused 331 (delta 108), pack-reused 331        
  Receiving objects: 100% (662/662), 53.14 KiB | 17.71 MiB/s, done.
  Resolving deltas: 100% (216/216), done.
  fatal: The same object 00c1d3730931e66eb08dabe3a3c9fa16621d728a appears twice in the pack
  fatal: index-pack failed

and then starts working again after the final patch.

-Peff
Christian Couder Dec. 13, 2019, 1:26 p.m. UTC | #2
On Mon, Dec 9, 2019 at 7:47 AM Jeff King <peff@peff.net> wrote:


> I think by dropping these hunks now, the state of the code at this point
> would mean that we might write the objects twice. We'd mark them as
> "reused" and send them as part of write_reused_pack(). But we'd also
> send them to pack-objects via the show_reachable_fn callback, and it
> would add them to the usual packing list.
>
> And indeed, t5310.10 fails at this point in the series with:
>
>   Cloning into bare repository 'clone.git'...
>   remote: Enumerating objects: 331, done.
>   remote: Counting objects: 100% (331/331), done.
>   remote: Compressing objects: 100% (111/111), done.
>   remote: Total 662 (delta 108), reused 331 (delta 108), pack-reused 331
>   Receiving objects: 100% (662/662), 53.14 KiB | 17.71 MiB/s, done.
>   Resolving deltas: 100% (216/216), done.
>   fatal: The same object 00c1d3730931e66eb08dabe3a3c9fa16621d728a appears twice in the pack
>   fatal: index-pack failed
>
> and then starts working again after the final patch.

Yeah, I thought I had tested this, but anyway I squashed this patch
into the final patch to avoid failures.
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index e07c798879..016d0319fc 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -622,7 +622,7 @@  static void show_objects_for_type(
 	enum object_type object_type,
 	show_reachable_fn show_reach)
 {
-	size_t pos = 0, i = 0;
+	size_t i = 0;
 	uint32_t offset;
 
 	struct ewah_iterator it;
@@ -630,13 +630,15 @@  static void show_objects_for_type(
 
 	struct bitmap *objects = bitmap_git->result;
 
-	if (bitmap_git->reuse_objects == bitmap_git->pack->num_objects)
-		return;
-
 	ewah_iterator_init(&it, type_filter);
 
-	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
+	for (i = 0; i < objects->word_alloc &&
+			ewah_iterator_next(&filter, &it); i++) {
 		eword_t word = objects->words[i] & filter;
+		size_t pos = (i * BITS_IN_EWORD);
+
+		if (!word)
+			continue;
 
 		for (offset = 0; offset < BITS_IN_EWORD; ++offset) {
 			struct object_id oid;
@@ -648,9 +650,6 @@  static void show_objects_for_type(
 
 			offset += ewah_bit_ctz64(word >> offset);
 
-			if (pos + offset < bitmap_git->reuse_objects)
-				continue;
-
 			entry = &bitmap_git->pack->revindex[pos + offset];
 			nth_packed_object_oid(&oid, bitmap_git->pack, entry->nr);
 
@@ -659,9 +658,6 @@  static void show_objects_for_type(
 
 			show_reach(&oid, object_type, 0, hash, bitmap_git->pack, entry->offset);
 		}
-
-		pos += BITS_IN_EWORD;
-		i++;
 	}
 }