Message ID | 1838bbcf7fe6daa58a7db78b81a2d08138fe176e.1724793201.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 41cd4b478f7ee0a9049601afdcd7872bdbeae519 |
Headers | show |
Series | pack-objects: brown-paper-bag fixes for multi-pack reuse | expand |
Taylor Blau <me@ttaylorr.com> writes: > The next commit will need to use the bitmap's MIDX (if one exists) to > translate bit positions into pack-relative positions in the source pack. > > Ordinarily, we'd use the "midx" field of the bitmap_index struct. But > since that struct is defined within pack-bitmap.c, and our caller is in > a separate compilation unit, we do not have access to the MIDX field. > > Instead, add a "from_midx" field to the bitmapped_pack structure so that > we can use that piece of data from outside of pack-bitmap.c. The caller > that uses this new piece of information will be added in the following > commit. > > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > midx.c | 1 + > pack-bitmap.c | 1 + > pack-bitmap.h | 1 + > 3 files changed, 3 insertions(+) > > diff --git a/midx.c b/midx.c > index ca98bfd7c6..67e0d64004 100644 > --- a/midx.c > +++ b/midx.c > @@ -496,6 +496,7 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, > MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + > sizeof(uint32_t)); > bp->pack_int_id = pack_int_id; > + bp->from_midx = m; Do multi_pack_index objects live as long as bitmapped_pack objects that point at them live? If m later goes away without letting the bitmapped_pack know about it, the borrowed pointer in from_midx would become dangling, which is not what we want to see. "None of these objects are released or relocated while we are running pack-objects, so once the .from_midx member is assigned here, it will always be pointing at a valid multi_pack_index object" is a satisfactory answer, I guess.
On Tue, Aug 27, 2024 at 05:14:14PM -0700, Junio C Hamano wrote: > > diff --git a/midx.c b/midx.c > > index ca98bfd7c6..67e0d64004 100644 > > --- a/midx.c > > +++ b/midx.c > > @@ -496,6 +496,7 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, > > MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + > > sizeof(uint32_t)); > > bp->pack_int_id = pack_int_id; > > + bp->from_midx = m; > > Do multi_pack_index objects live as long as bitmapped_pack objects > that point at them live? If m later goes away without letting the > bitmapped_pack know about it, the borrowed pointer in from_midx > would become dangling, which is not what we want to see. > > "None of these objects are released or relocated while we are > running pack-objects, so once the .from_midx member is assigned > here, it will always be pointing at a valid multi_pack_index object" > is a satisfactory answer, I guess. Good question, and good answer ;-). This is only relevant in a read-only path where we're generating a new pack from existing packs and not altering those pack or rewriting / deleting the MIDX attached to them. So I think we're OK here and don't have any lifetime/scope issues. Thanks, Taylor
On Thu, Aug 29, 2024 at 02:58:19PM -0400, Taylor Blau wrote: > On Tue, Aug 27, 2024 at 05:14:14PM -0700, Junio C Hamano wrote: > > > diff --git a/midx.c b/midx.c > > > index ca98bfd7c6..67e0d64004 100644 > > > --- a/midx.c > > > +++ b/midx.c > > > @@ -496,6 +496,7 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, > > > MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + > > > sizeof(uint32_t)); > > > bp->pack_int_id = pack_int_id; > > > + bp->from_midx = m; > > > > Do multi_pack_index objects live as long as bitmapped_pack objects > > that point at them live? If m later goes away without letting the > > bitmapped_pack know about it, the borrowed pointer in from_midx > > would become dangling, which is not what we want to see. > > > > "None of these objects are released or relocated while we are > > running pack-objects, so once the .from_midx member is assigned > > here, it will always be pointing at a valid multi_pack_index object" > > is a satisfactory answer, I guess. > > Good question, and good answer ;-). > > This is only relevant in a read-only path where we're generating a new > pack from existing packs and not altering those pack or rewriting / > deleting the MIDX attached to them. So I think we're OK here and don't > have any lifetime/scope issues. Do we ever close/reopen the midx? For example, if a simultaneous process wrote a new one and we triggered reprepare_packed_git()? I think the answer is "no"; we might read a new midx, but we never ditch the old one (just like we do for packed_git structs). Which I suspect is needed even before this patch, since various other parts of the bitmap code (and probably others) rely on the struct staying in place across as we read many objects. -Peff
On Thu, Sep 05, 2024 at 05:00:24AM -0400, Jeff King wrote: > On Thu, Aug 29, 2024 at 02:58:19PM -0400, Taylor Blau wrote: > > > On Tue, Aug 27, 2024 at 05:14:14PM -0700, Junio C Hamano wrote: > > > > diff --git a/midx.c b/midx.c > > > > index ca98bfd7c6..67e0d64004 100644 > > > > --- a/midx.c > > > > +++ b/midx.c > > > > @@ -496,6 +496,7 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, > > > > MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + > > > > sizeof(uint32_t)); > > > > bp->pack_int_id = pack_int_id; > > > > + bp->from_midx = m; > > > > > > Do multi_pack_index objects live as long as bitmapped_pack objects > > > that point at them live? If m later goes away without letting the > > > bitmapped_pack know about it, the borrowed pointer in from_midx > > > would become dangling, which is not what we want to see. > > > > > > "None of these objects are released or relocated while we are > > > running pack-objects, so once the .from_midx member is assigned > > > here, it will always be pointing at a valid multi_pack_index object" > > > is a satisfactory answer, I guess. > > > > Good question, and good answer ;-). > > > > This is only relevant in a read-only path where we're generating a new > > pack from existing packs and not altering those pack or rewriting / > > deleting the MIDX attached to them. So I think we're OK here and don't > > have any lifetime/scope issues. > > Do we ever close/reopen the midx? For example, if a simultaneous process > wrote a new one and we triggered reprepare_packed_git()? > > I think the answer is "no"; we might read a new midx, but we never ditch > the old one (just like we do for packed_git structs). Which I suspect is > needed even before this patch, since various other parts of the bitmap > code (and probably others) rely on the struct staying in place across as > we read many objects. I think that we *could* close and reopen the MIDX via a call to close_object_store() (which dispatches a call to close_midx(), before NULL-ing out o->multi_pack_index) and then calling prepare_packed_git() or similar. But I'm not aware of any such codepaths that deal with pack reuse and calling nth_bitmapped_pack that would do such a thing, so I think that we're OK here. Thanks, Taylor
diff --git a/midx.c b/midx.c index ca98bfd7c6..67e0d64004 100644 --- a/midx.c +++ b/midx.c @@ -496,6 +496,7 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + sizeof(uint32_t)); bp->pack_int_id = pack_int_id; + bp->from_midx = m; return 0; } diff --git a/pack-bitmap.c b/pack-bitmap.c index 2e657a2aa4..218d7ac2eb 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2322,6 +2322,7 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, packs[packs_nr].pack_int_id = pack_int_id; packs[packs_nr].bitmap_nr = pack->num_objects; packs[packs_nr].bitmap_pos = 0; + packs[packs_nr].from_midx = bitmap_git->midx; objects_nr = packs[packs_nr++].bitmap_nr; } diff --git a/pack-bitmap.h b/pack-bitmap.h index ff0fd815b8..d7f4b8b8e9 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -60,6 +60,7 @@ struct bitmapped_pack { uint32_t bitmap_pos; uint32_t bitmap_nr; + struct multi_pack_index *from_midx; /* MIDX only */ uint32_t pack_int_id; /* MIDX only */ };
The next commit will need to use the bitmap's MIDX (if one exists) to translate bit positions into pack-relative positions in the source pack. Ordinarily, we'd use the "midx" field of the bitmap_index struct. But since that struct is defined within pack-bitmap.c, and our caller is in a separate compilation unit, we do not have access to the MIDX field. Instead, add a "from_midx" field to the bitmapped_pack structure so that we can use that piece of data from outside of pack-bitmap.c. The caller that uses this new piece of information will be added in the following commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- midx.c | 1 + pack-bitmap.c | 1 + pack-bitmap.h | 1 + 3 files changed, 3 insertions(+)