diff mbox series

[3/8] midx-write.c: pass `start_pack` to `get_sorted_entries()`

Message ID 7acf4557dcb2240cb43eadebfd21b5c37515ba7f.1716482279.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series midx-write: miscellaneous clean-ups for incremental MIDXs | expand

Commit Message

Taylor Blau May 23, 2024, 4:38 p.m. UTC
The function `get_sorted_entries()` is broadly responsible for
building an array of the objects to be written into a MIDX based on the
provided list of packs.

If we have loaded an existing MIDX, however, we may not use all of its
packs, despite loading them into the ctx->info array.

The existing implementation simply skips past the first
ctx->m->num_packs (if ctx->m is non-NULL, indicating that we loaded an
existing MIDX). Future changes (outside the scope of this patch series)
to the MIDX code will require us to skip *at most* that number[^1].

We could tag each pack with a bit that indicates the pack's contents
should be included in the MIDX. But we can just as easily determine the
number of packs to skip by passing in the number of packs we learned
about after processing an existing MIDX.

[^1]: Kind of. The real number will be bounded by the number of packs in
  a MIDX layer, and the number of packs in its base layer(s), but that
  concept hasn't been fully defined yet.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 midx-write.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Comments

Patrick Steinhardt May 29, 2024, 7:48 a.m. UTC | #1
On Thu, May 23, 2024 at 12:38:09PM -0400, Taylor Blau wrote:
> The function `get_sorted_entries()` is broadly responsible for
> building an array of the objects to be written into a MIDX based on the
> provided list of packs.
> 
> If we have loaded an existing MIDX, however, we may not use all of its
> packs, despite loading them into the ctx->info array.
> 
> The existing implementation simply skips past the first
> ctx->m->num_packs (if ctx->m is non-NULL, indicating that we loaded an
> existing MIDX). Future changes (outside the scope of this patch series)
> to the MIDX code will require us to skip *at most* that number[^1].
> 
> We could tag each pack with a bit that indicates the pack's contents
> should be included in the MIDX. But we can just as easily determine the
> number of packs to skip by passing in the number of packs we learned
> about after processing an existing MIDX.

Will we always want to skip packs from the start of the array? Or may it
happen that we want to skip packs in the middle of it? It's a bit hard
to judge because there isn't a ton of context when exactly we'll want to
skip, and why.

Patrick
Taylor Blau May 29, 2024, 10:36 p.m. UTC | #2
On Wed, May 29, 2024 at 09:48:03AM +0200, Patrick Steinhardt wrote:
> Will we always want to skip packs from the start of the array? Or may it
> happen that we want to skip packs in the middle of it? It's a bit hard
> to judge because there isn't a ton of context when exactly we'll want to
> skip, and why.

It's always from the start of the array, but I'll add some more context
into the patch message as to why this is.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/midx-write.c b/midx-write.c
index ad32e8953d..cf7e391b6e 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -299,13 +299,13 @@  static void midx_fanout_add_pack_fanout(struct midx_fanout *fanout,
  * Copy only the de-duplicated entries (selected by most-recent modified time
  * of a packfile containing the object).
  */
-static struct pack_midx_entry *get_sorted_entries(struct write_midx_context *ctx)
+static struct pack_midx_entry *get_sorted_entries(struct write_midx_context *ctx,
+						  uint32_t start_pack)
 {
 	uint32_t cur_fanout, cur_pack, cur_object;
 	size_t alloc_objects, total_objects = 0;
 	struct midx_fanout fanout = { 0 };
 	struct pack_midx_entry *deduplicated_entries = NULL;
-	uint32_t start_pack = ctx->m ? ctx->m->num_packs : 0;
 
 	for (cur_pack = start_pack; cur_pack < ctx->nr; cur_pack++)
 		total_objects = st_add(total_objects,
@@ -886,7 +886,7 @@  static int write_midx_internal(const char *object_dir,
 {
 	struct strbuf midx_name = STRBUF_INIT;
 	unsigned char midx_hash[GIT_MAX_RAWSZ];
-	uint32_t i;
+	uint32_t i, start_pack;
 	struct hashfile *f = NULL;
 	struct lock_file lk;
 	struct write_midx_context ctx = { 0 };
@@ -954,6 +954,8 @@  static int write_midx_internal(const char *object_dir,
 		}
 	}
 
+	start_pack = ctx.nr;
+
 	ctx.pack_paths_checked = 0;
 	if (flags & MIDX_PROGRESS)
 		ctx.progress = start_delayed_progress(_("Adding packfiles to multi-pack-index"), 0);
@@ -1051,7 +1053,7 @@  static int write_midx_internal(const char *object_dir,
 		}
 	}
 
-	ctx.entries = get_sorted_entries(&ctx);
+	ctx.entries = get_sorted_entries(&ctx, start_pack);
 
 	ctx.large_offsets_needed = 0;
 	for (i = 0; i < ctx.entries_nr; i++) {