diff mbox series

[v2,1/4] pack-bitmap.c: check preferred pack validity when opening MIDX bitmap

Message ID 618e8a6166473d238e62ce6243d9a0b2b72ee2f0.1653418457.git.me@ttaylorr.com (mailing list archive)
State Accepted
Commit 44f9fd649673362bdbaae7067a9919b1fe4c96d1
Series pack-objects: fix a pair of MIDX bitmap-related races

Commit Message

Taylor Blau May 24, 2022, 6:54 p.m. UTC
When pack-objects adds an entry to its packing list, it marks the
packfile and offset containing the object, which we may later use during
verbatim reuse (cf. `write_reused_pack_verbatim()`).

If the packfile in question is deleted in the background (e.g., due to a
concurrent `git repack`), we'll die() as a result of calling use_pack(),
unless we have an open file descriptor on the pack itself. 4c08018204
(pack-objects: protect against disappearing packs, 2011-10-14) worked
around this by opening the pack ahead of time before recording it as a
valid source for reuse.
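
The property that 4c08018204 relies on can be seen outside of git. The
following is a minimal sketch, assuming a POSIX system: a descriptor opened
before a file is unlinked keeps reading that file's data, while a fresh
open() by name fails. The helper name and the temporary path are
hypothetical and are not part of git.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo, not git code. Returns 1 if data can still be read
 * through a descriptor that was opened before the file was unlinked,
 * 0 if not, and -1 if the demo could not be set up.
 */
static int read_survives_unlink(void)
{
	char path[] = "/tmp/pack-demo-XXXXXX";
	char buf[4] = { 0 };
	int fd = mkstemp(path); /* creates the file and opens it read/write */
	int reopened;
	int ok;

	if (fd < 0)
		return -1;
	if (write(fd, "PACK", 4) != 4) {
		close(fd);
		unlink(path);
		return -1;
	}

	/* Stand-in for a concurrent repack deleting the pack. */
	if (unlink(path) < 0) {
		close(fd);
		return -1;
	}

	/* The name is gone, so opening it again by path should fail ... */
	reopened = open(path, O_RDONLY);
	ok = (reopened < 0);
	if (reopened >= 0)
		close(reopened);

	/* ... but the descriptor we already hold still reads the data. */
	ok = ok && pread(fd, buf, 4, 0) == 4 && !memcmp(buf, "PACK", 4);

	close(fd);
	return ok;
}
```

Calling read_survives_unlink() from a small main() is enough to try it.
This only illustrates why holding a descriptor helps; it does not model
what happens once that descriptor has to be closed.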

4c08018204's treatment meant that we could tolerate disappearing packs,
since it ensures we always have an open file descriptor on any pack that
we mark as a valid source for reuse. This tightens the race to only
happen when we need to close an open pack's file descriptor (cf. the
caller of `packfile.c::get_max_fd_limit()`) _and_ that pack was deleted,
in which case we'll complain that a pack could not be accessed and
die().

The pack bitmap code does this, too, since prior to dc1daacdcc
(pack-bitmap: check pack validity when opening bitmap, 2021-07-23) it
was vulnerable to the same race.

The MIDX bitmap code does not do this, and is vulnerable to the same
race. Apply the same treatment as dc1daacdcc to the routine responsible
for opening the multi-pack bitmap's preferred pack to close this race.

This patch handles the "preferred" pack (cf. the section
"multi-pack-index reverse indexes" in
Documentation/technical/pack-format.txt) specially, since pack-objects
depends on reusing exact chunks of that pack verbatim in
reuse_partial_packfile_from_bitmap(). So if that pack cannot be loaded,
the utility of a bitmap is significantly diminished.

Similar to dc1daacdcc, we could technically just add this check in
reuse_partial_packfile_from_bitmap(), since it's possible to use a MIDX
.bitmap without needing to open any of its packs. But it's simpler to do
the check as early as possible, covering all direct uses of the
preferred pack. Note that doing this check early requires us to call
prepare_midx_pack() early, too, so move the relevant part of that loop
from load_reverse_index() into open_midx_bitmap_1().
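
As a rough illustration of checking early, here is a toy model of the
pattern. It is a sketch under stated assumptions, not the code in this
patch: every toy_* name is hypothetical, and toy_pack_valid() is only
loosely modeled on the idea behind is_pack_valid() (make sure we hold an
open descriptor). It also leaves out the prepare_midx_pack() loop.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Toy stand-ins for git's structures; all names here are hypothetical. */
struct toy_pack {
	const char *path;
	int fd; /* -1 until the pack has been opened */
};

struct toy_midx {
	struct toy_pack *packs;
	unsigned num_packs;
	unsigned preferred; /* index of the preferred pack */
};

/* Returns 1 if we hold (or can obtain) an open descriptor on the pack. */
static int toy_pack_valid(struct toy_pack *p)
{
	if (p->fd >= 0)
		return 1;
	p->fd = open(p->path, O_RDONLY);
	return p->fd >= 0;
}

/*
 * Validate the preferred pack when the bitmap is opened, before anything
 * is recorded as a source for reuse. Returns 0 if the bitmap may be used,
 * or -1 so that the caller can fall back to a non-bitmap code path.
 */
static int toy_open_midx_bitmap(struct toy_midx *m)
{
	struct toy_pack *preferred;

	if (m->preferred >= m->num_packs)
		return -1;

	preferred = &m->packs[m->preferred];
	if (!toy_pack_valid(preferred)) {
		fprintf(stderr, "warning: preferred pack (%s) is invalid\n",
			preferred->path);
		return -1;
	}
	return 0;
}
```

The point of the design is that a failure here is only a warning plus a
fallback, whereas discovering the missing pack later, in the middle of
reuse, would leave no option other than die().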

Subsequent patches handle the non-preferred packs in a slightly
different fashion.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-bitmap.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Comments

Ævar Arnfjörð Bjarmason May 24, 2022, 7:36 p.m. UTC | #1
On Tue, May 24 2022, Taylor Blau wrote:

Just nits on the error reporting:

> @@ -353,6 +355,20 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
>  		warning(_("multi-pack bitmap is missing required reverse index"));
>  		goto cleanup;
>  	}
> +
> +	for (i = 0; i < bitmap_git->midx->num_packs; i++) {
> +		if (prepare_midx_pack(the_repository, bitmap_git->midx, i))
> +			die(_("could not open pack %s"),
> +			    bitmap_git->midx->pack_names[i]);

Some existing API users of this & their error handling suggest that this
message is wrong. I.e. it's not that we couldn't open it, but that we
could open it and there's something wrong with it. Or perhaps their
messages are misleading?

> +	}
> +
> +	preferred = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)];
> +	if (!is_pack_valid(preferred)) {
> +		warning(_("preferred pack (%s) is invalid"),
> +			preferred->pack_name);

Likewise this? E.g. perhaps the permissions are just wrong or whatever,
per open_packed_git_1().

Taylor Blau May 24, 2022, 9:38 p.m. UTC | #2
On Tue, May 24, 2022 at 09:36:45PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Tue, May 24 2022, Taylor Blau wrote:
>
> Just nits on the error reporting:
>
> > @@ -353,6 +355,20 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
> >  		warning(_("multi-pack bitmap is missing required reverse index"));
> >  		goto cleanup;
> >  	}
> > +
> > +	for (i = 0; i < bitmap_git->midx->num_packs; i++) {
> > +		if (prepare_midx_pack(the_repository, bitmap_git->midx, i))
> > +			die(_("could not open pack %s"),
> > +			    bitmap_git->midx->pack_names[i]);
>
> Some existing API users of this & their error handling suggest that this
> message is wrong. I.e. it's not that we couldn't open it, but that we
> could open it and there's something wrong with it. Or perhaps their
> messages are misleading?

I tried to reuse some similar message based on "git grep 'if
(.*prepare_midx_pack'", so this was inspired by:

  - the caller in midx.c::write_midx_internal(), whose error is "could
    not load pack", and
  - the caller in midx.c::verify_midx_file(), whose error is "failed to
    load pack"

Are you suggesting we should s/open/load here and use the above error
message? My feeling at the time was that "load" was basically synonymous
with "open" given the context, but if you think they're different
enough, or have a different suggestion LMK.

Thanks,
Taylor

Ævar Arnfjörð Bjarmason May 24, 2022, 9:51 p.m. UTC | #3
On Tue, May 24 2022, Taylor Blau wrote:

> On Tue, May 24, 2022 at 09:36:45PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Tue, May 24 2022, Taylor Blau wrote:
>>
>> Just nits on the error reporting:
>>
>> > @@ -353,6 +355,20 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
>> >  		warning(_("multi-pack bitmap is missing required reverse index"));
>> >  		goto cleanup;
>> >  	}
>> > +
>> > +	for (i = 0; i < bitmap_git->midx->num_packs; i++) {
>> > +		if (prepare_midx_pack(the_repository, bitmap_git->midx, i))
>> > +			die(_("could not open pack %s"),
>> > +			    bitmap_git->midx->pack_names[i]);
>>
>> Some existing API users of this & their error handling suggest that this
>> message is wrong. I.e. it's not that we couldn't open it, but that we
>> could open it and there's something wrong with it. Or perhaps their
>> messages are misleading?
>
> I tried to reuse some similar message based on "git grep 'if
> (.*prepare_midx_pack'", so this was inspired by:
>
>   - the caller in midx.c::write_midx_internal(), whose error is "could
>     not load pack", and
>   - the caller in midx.c::verify_midx_file(), whose error is "failed to
>     load pack"
>
> Are you suggesting we should s/open/load here and use the above error
> message? My feeling at the time was that "load" was basically synonymous
> with "open" given the context, but if you think they're different
> enough, or have a different suggestion LMK.

Perhaps "parse" or something? Anyway with "could not open" I'd assume
open() failed, but in this case it looks like we could open it, but
(mostly?) failed later.

Maybe "could not load midx"? I don't know...
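
To make the distinction concrete, a hedged sketch of reporting "open()
failed" separately from "opened, but the contents look wrong" might look
like the following. None of this is git's API: the toy_* names and the
message strings are hypothetical, and the only check performed is for the
4-byte "PACK" signature at the start of the file.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes; git does not define these. */
enum toy_pack_status {
	TOY_PACK_OK,
	TOY_PACK_OPEN_FAILED,   /* open() itself failed */
	TOY_PACK_BAD_SIGNATURE  /* opened fine, but the contents look wrong */
};

static enum toy_pack_status toy_check_pack(const char *path)
{
	char sig[4];
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return TOY_PACK_OPEN_FAILED;

	n = read(fd, sig, sizeof(sig));
	close(fd);

	if (n != (ssize_t)sizeof(sig) || memcmp(sig, "PACK", sizeof(sig)))
		return TOY_PACK_BAD_SIGNATURE;
	return TOY_PACK_OK;
}

/* Map each status to a message that says which step failed. */
static const char *toy_pack_status_message(enum toy_pack_status status)
{
	switch (status) {
	case TOY_PACK_OPEN_FAILED:
		return "could not open pack";
	case TOY_PACK_BAD_SIGNATURE:
		return "could not load pack";
	case TOY_PACK_OK:
		break;
	}
	return "pack looks usable";
}
```

With a split like this, "could not open" is only printed when open()
really failed, which is the reading suggested above.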

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 97909d48da..d607918407 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -315,6 +315,8 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 	struct stat st;
 	char *idx_name = midx_bitmap_filename(midx);
 	int fd = git_open(idx_name);
+	uint32_t i;
+	struct packed_git *preferred;
 
 	free(idx_name);
 
@@ -353,6 +355,20 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 		warning(_("multi-pack bitmap is missing required reverse index"));
 		goto cleanup;
 	}
+
+	for (i = 0; i < bitmap_git->midx->num_packs; i++) {
+		if (prepare_midx_pack(the_repository, bitmap_git->midx, i))
+			die(_("could not open pack %s"),
+			    bitmap_git->midx->pack_names[i]);
+	}
+
+	preferred = bitmap_git->midx->packs[midx_preferred_pack(bitmap_git)];
+	if (!is_pack_valid(preferred)) {
+		warning(_("preferred pack (%s) is invalid"),
+			preferred->pack_name);
+		goto cleanup;
+	}
+
 	return 0;
 
 cleanup:
@@ -429,8 +445,6 @@ static int load_reverse_index(struct bitmap_index *bitmap_git)
 		 * since we will need to make use of them in pack-objects.
 		 */
 		for (i = 0; i < bitmap_git->midx->num_packs; i++) {
-			if (prepare_midx_pack(the_repository, bitmap_git->midx, i))
-				die(_("load_reverse_index: could not open pack"));
 			ret = load_pack_revindex(bitmap_git->midx->packs[i]);
 			if (ret)
 				return ret;