Message ID | 67a8196978244b56d4f60861f140b78c59d15e30.1701198172.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | pack-objects: multi-pack verbatim reuse | expand |
On Tue, Nov 28, 2023 at 02:08:37PM -0500, Taylor Blau wrote: > The function `write_reused_pack()` within `builtin/pack-objects.c` is > responsible for performing pack-reuse on a single pack, and has two main > functions: > > - it dispatches a call to `write_reused_pack_verbatim()` to see if we > can reuse portions of the packfile in whole-word chunks > > - for any remaining objects (that is, any objects that appear after > the first "gap" in the bitmap), call write_reused_pack_one() on that > object to record it for reuse. > > Prepare this function for multi-pack reuse by removing the assumption > that the bit position corresponding to the first object being reused > from a given pack may not be at bit position zero. Is this double-negation intended? We remove the assumption that we start reading at position zero, but the paragraph here states that we remove the assumption that we do _not_ start at bit zero. I'll stop reviewing here and will come back to this series somewhen next week. Patrick
On Thu, Dec 07, 2023 at 02:13:29PM +0100, Patrick Steinhardt wrote: > On Tue, Nov 28, 2023 at 02:08:37PM -0500, Taylor Blau wrote: > > The function `write_reused_pack()` within `builtin/pack-objects.c` is > > responsible for performing pack-reuse on a single pack, and has two main > > functions: > > > > - it dispatches a call to `write_reused_pack_verbatim()` to see if we > > can reuse portions of the packfile in whole-word chunks > > > > - for any remaining objects (that is, any objects that appear after > > the first "gap" in the bitmap), call write_reused_pack_one() on that > > object to record it for reuse. > > > > Prepare this function for multi-pack reuse by removing the assumption > > that the bit position corresponding to the first object being reused > > from a given pack may not be at bit position zero. > > Is this double-negation intended? We remove the assumption that we start > reading at position zero, but the paragraph here states that we remove > the assumption that we do _not_ start at bit zero. Oops, great catch. I'll s/may not/must in the last paragraph to clarify. > I'll stop reviewing here and will come back to this series somewhen next > week. Thanks as usual for your review -- I appreciate you digging through this rather dense series. Thanks, Taylor
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 3b7704d062..b5e6f6377a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1128,7 +1128,7 @@ static size_t write_reused_pack_verbatim(struct bitmapped_pack *reuse_packfile, static void write_reused_pack(struct bitmapped_pack *reuse_packfile, struct hashfile *f) { - size_t i = 0; + size_t i = reuse_packfile->bitmap_pos / BITS_IN_EWORD; uint32_t offset; off_t pack_start = hashfile_total(f) - sizeof(struct pack_header); struct pack_window *w_curs = NULL; @@ -1146,17 +1146,23 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile, break; offset += ewah_bit_ctz64(word >> offset); + if (pos + offset < reuse_packfile->bitmap_pos) + continue; + if (pos + offset >= reuse_packfile->bitmap_pos + reuse_packfile->bitmap_nr) + goto done; /* * Can use bit positions directly, even for MIDX * bitmaps. See comment in try_partial_reuse() * for why. */ - write_reused_pack_one(reuse_packfile->p, pos + offset, + write_reused_pack_one(reuse_packfile->p, + pos + offset - reuse_packfile->bitmap_pos, f, pack_start, &w_curs); display_progress(progress_state, ++written); } } +done: unuse_pack(&w_curs); }
The function `write_reused_pack()` within `builtin/pack-objects.c` is responsible for performing pack-reuse on a single pack, and has two main functions: - it dispatches a call to `write_reused_pack_verbatim()` to see if we can reuse portions of the packfile in whole-word chunks - for any remaining objects (that is, any objects that appear after the first "gap" in the bitmap), call write_reused_pack_one() on that object to record it for reuse. Prepare this function for multi-pack reuse by removing the assumption that the bit position corresponding to the first object being reused from a given pack may not be at bit position zero. The changes in this function are mostly straightforward. Initialize `i` to the position of the first word to contain bits corresponding to that reuse pack. In most situations, we throw the initialized value away, since we end up replacing it with the return value from write_reused_pack_verbatim(), moving us past the section of whole words that we reused. Likewise, modify the per-object loop to ignore any bits at the beginning of the first word that do not belong to the pack currently being reused, as well as skip to the "done" section once we have processed the last bit corresponding to this pack. Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/pack-objects.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)