diff mbox series

[2/2] midx.c: protect against disappearing packs

Message ID e1806d1bdc0da8061c78608e56138424ad24bed0.1606324509.git.me@ttaylorr.com (mailing list archive)
State Accepted
Commit 506ec2fbda5334c4fc60fbd9f425fff3916a2066
Series midx: prevent against racily disappearing packs

Commit Message

Taylor Blau Nov. 25, 2020, 5:17 p.m. UTC
When a packed object is stored in a multi-pack index, but that pack has
racily gone away, the MIDX code simply calls die(), when it could be
returning an error to the caller, which would in turn lead to
re-scanning the pack directory.

A pack can racily disappear, for example, due to a simultaneous 'git
repack -ad'.

You can also reproduce this with two terminals, where one is running:

    git init
    while true; do
      git commit -q --allow-empty -m foo
      git repack -ad
      git multi-pack-index write
    done

(in effect, constantly writing new MIDXs), and the other is running:

    obj=$(git rev-parse HEAD)
    while true; do
      echo $obj | git cat-file --batch-check='%(objectsize:disk)' || break
    done

That will sometimes hit the "error preparing packfile from
multi-pack-index" message, which this patch fixes.

Right now, the path to discovering a missing pack looks something like
'find_pack_entry()' calling 'fill_midx_entry()' and eventually making
its way to 'nth_midxed_pack_entry()'.

'nth_midxed_pack_entry()' already checks 'is_pack_valid()' and
propagates an error if the pack is invalid. This works if the pack
goes away between the call to 'prepare_midx_pack()' and the call to
'is_pack_valid()', but not if it disappears before
'prepare_midx_pack()' is called.

Catch the case where the pack has already disappeared before
'prepare_midx_pack()' by returning an error in that case, too.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx.c                      | 2 +-
 t/t5319-multi-pack-index.sh | 8 +++++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Nov. 25, 2020, 9:22 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> When a packed object is stored in a multi-pack index, but that pack has
> racily gone away, the MIDX code simply calls die(), when it could be
> returning an error to the caller, which would in turn lead to
> re-scanning the pack directory.

Makes sense.  Will queue.

Thanks.

Patch

diff --git a/midx.c b/midx.c
index d233b54ac7..1d2179a61f 100644
--- a/midx.c
+++ b/midx.c
@@ -298,7 +298,7 @@  static int nth_midxed_pack_entry(struct repository *r,
 	pack_int_id = nth_midxed_pack_int_id(m, pos);
 
 	if (prepare_midx_pack(r, m, pack_int_id))
-		die(_("error preparing packfile from multi-pack-index"));
+		return 0;
 	p = m->packs[pack_int_id];
 
 	/*
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index d4607daec1..297de502a9 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -755,7 +755,7 @@  test_expect_success 'repack --batch-size=<large> repacks everything' '
 	)
 '
 
-test_expect_success 'load reverse index when missing .idx' '
+test_expect_success 'load reverse index when missing .idx, .pack' '
 	git init repo &&
 	test_when_finished "rm -fr repo" &&
 	(
@@ -768,9 +768,15 @@  test_expect_success 'load reverse index when missing .idx' '
 		git multi-pack-index write &&
 
 		git rev-parse HEAD >tip &&
+		pack=$(ls .git/objects/pack/pack-*.pack) &&
 		idx=$(ls .git/objects/pack/pack-*.idx) &&
 
 		mv $idx $idx.bak &&
+		git cat-file --batch-check="%(objectsize:disk)" <tip &&
+
+		mv $idx.bak $idx &&
+
+		mv $pack $pack.bak &&
 		git cat-file --batch-check="%(objectsize:disk)" <tip
 	)
 '