mbox series

[v2,0/3] midx: use `--stdin-packs` to implement `repack`

Message ID cover.1663706401.git.me@ttaylorr.com (mailing list archive)
Headers show
Series midx: use `--stdin-packs` to implement `repack` | expand

Message

Taylor Blau Sept. 20, 2022, 8:40 p.m. UTC
Here's a few patches that replace the existing "feed each OID
one-by-one" approach to implement the `repack` sub-command of the
`multi-pack-index` builtin with one that uses `pack-objects`'s
`--stdin-packs` option.

There is an additional patch at the beginning of this series in order to
extract the mtime-sorted list of packs to rollup from their home in
`fill_included_packs_batch()`. There's also one more on the end to unify
the `include_pack` array into the `repack_info` struct(s) themselves.

The second patch is the substantive one. Thanks for all of the review so
far! :-)

Taylor Blau (3):
  midx.c: compute pack_info array outside of fill_included_packs_batch()
  midx.c: use `pack-objects --stdin-packs` when repacking
  midx.c: unify `include_pack` array into `pack_info` struct

 midx.c | 76 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 38 insertions(+), 38 deletions(-)

Range-diff against v1:
-:  ---------- > 1:  a501776f6e midx.c: compute pack_info array outside of fill_included_packs_batch()
1:  7e987eb9d7 ! 2:  4218d9e08a midx.c: use `pack-objects --stdin-packs` when repacking
    @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t bat
      	struct strbuf base_name = STRBUF_INIT;
     +	struct strbuf scratch = STRBUF_INIT;
      	struct multi_pack_index *m = lookup_multi_pack_index(r, object_dir);
    + 	struct repack_info *pack_info = NULL;
      
    - 	/*
     @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
      	repo_config_get_bool(r, "repack.usedeltabaseoffset", &delta_base_offset);
      	repo_config_get_bool(r, "repack.usedeltaislands", &use_delta_islands);
    @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t bat
     -		struct object_id oid;
     -		uint32_t pack_int_id = nth_midxed_pack_int_id(m, i);
     +	for (i = 0; i < m->num_packs; i++) {
    -+		strbuf_reset(&scratch);
    ++		struct repack_info *info = &pack_info[i];
      
     -		if (!include_pack[pack_int_id])
     -			continue;
    -+		strbuf_addstr(&scratch, m->pack_names[i]);
    -+		strbuf_strip_suffix(&scratch, ".idx");
    -+		strbuf_addstr(&scratch, ".pack");
    ++		strbuf_reset(&scratch);
      
     -		nth_midxed_object_oid(&oid, m, i);
     -		fprintf(cmd_in, "%s\n", oid_to_hex(&oid));
    -+		fprintf(cmd_in, "%s%s\n", include_pack[i] ? "" : "^", scratch.buf);
    ++		strbuf_addstr(&scratch, m->pack_names[info->pack_int_id]);
    ++		strbuf_strip_suffix(&scratch, ".idx");
    ++		strbuf_addstr(&scratch, ".pack");
    ++
    ++		fprintf(cmd_in, "%s%s\n", include_pack[info->pack_int_id] ? "" : "^", scratch.buf);
      	}
      	fclose(cmd_in);
     +	strbuf_release(&scratch);
-:  ---------- > 3:  81e9ccc323 midx.c: unify `include_pack` array into `pack_info` struct

Comments

Jeff King Sept. 20, 2022, 9:54 p.m. UTC | #1
On Tue, Sep 20, 2022 at 04:40:14PM -0400, Taylor Blau wrote:

> Here's a few patches that replace the existing "feed each OID
> one-by-one" approach to implement the `repack` sub-command of the
> `multi-pack-index` builtin with one that uses `pack-objects`'s
> `--stdin-packs` option.
> 
> There is an additional patch at the beginning of this series in order to
> extract the mtime-sorted list of packs to rollup from their home in
> `fill_included_packs_batch()`. There's also one more on the end to unify
> the `include_pack` array into the `repack_info` struct(s) themselves.

The first and third make sense to me, and it looks like a nice cleanup.

The middle one looks fine, modulo all of our earlier discussion about
the extra traversal being fine. I think it would be good t get a review
from Stolee as the person who invented "git multi-index-pack repack" and
would understand its intended use. Presumably this is all weird "I have
a giant Windows repository" packing strategy. :)

-Peff
Junio C Hamano Sept. 21, 2022, 7:09 p.m. UTC | #2
Taylor Blau <me@ttaylorr.com> writes:

> Here's a few patches that replace the existing "feed each OID
> one-by-one" approach to implement the `repack` sub-command of the
> `multi-pack-index` builtin with one that uses `pack-objects`'s
> `--stdin-packs` option.

One question.  How is this series expected to interact with the
7-patch series about ignoring cruft pack while "midx repack" etc.?

I guess it is not so urgent during the -rc period, so I'll stop at
asking that question without reading further, at least for now.

Thanks.
Taylor Blau Sept. 23, 2022, 6:29 p.m. UTC | #3
On Wed, Sep 21, 2022 at 12:09:31PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Here's a few patches that replace the existing "feed each OID
> > one-by-one" approach to implement the `repack` sub-command of the
> > `multi-pack-index` builtin with one that uses `pack-objects`'s
> > `--stdin-packs` option.
>
> One question.  How is this series expected to interact with the
> 7-patch series about ignoring cruft pack while "midx repack" etc.?

The first round merges cleanly, but the second round will have a slight
conflict. I'll send you a new version of this series based on whatever
is in `master` following the release.

Thanks,
Taylor