[v4,0/2] midx: apply gitconfig to midx repack

Message-ID: pull.626.v4.git.1589126855.gitgitgadget@gmail.com

Phillip Wood via GitGitGadget May 10, 2020, 4:07 p.m. UTC
Midx repack has largely been used in Microsoft Scalar on the client side to
optimize the multi-pack state of a repository. However, when I tried to apply
it on the server side, I realized that certain features were lacking
compared to git repack. Most of these features are highly desirable on the
server side to create the most optimized pack possible.

One example is delta_base_offset: comparing midx repacks with and without
delta_base_offset, we can observe a significant size difference (roughly 3%
in the runs below).

> du objects/pack/*pack
14536   objects/pack/pack-08a017b424534c88191addda1aa5dd6f24bf7a29.pack
9435280 objects/pack/pack-8829c53ad1dca02e7311f8e5b404962ab242e8f1.pack

Latest 2.26.2 (without delta_base_offset)
> git multi-pack-index write
> git multi-pack-index repack
> git multi-pack-index expire
> du objects/pack/*pack
9446096 objects/pack/pack-366c75e2c2f987b9836d3bf0bf5e4a54b6975036.pack

With delta_base_offset
> git version
git version 2.26.2.672.g232c24e857.dirty
> git multi-pack-index write
> git multi-pack-index repack
> git multi-pack-index expire
> du objects/pack/*pack
9152512 objects/pack/pack-3bc8c1ec496ab95d26875f8367ff6807081e9e7d.pack
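
The difference between the two runs above can be worked out with a few
lines of shell. This is only a sketch of the arithmetic: the variable names
are made up for illustration, and the numbers are the du figures quoted
above, in whatever unit du reported.

```shell
# Sizes copied from the two du runs above (du units).
before=9446096   # 2.26.2, without delta_base_offset
after=9152512    # patched build, with delta_base_offset

# Absolute saving.
saved=$((before - after))

# Percentage in hundredths, using integer arithmetic only
# (the result is truncated, not rounded).
pct_x100=$((saved * 10000 / before))

printf 'saved %d (%d.%02d%%)\n' "$saved" $((pct_x100 / 100)) $((pct_x100 % 100))
# prints: saved 293584 (3.10%)
```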

Note that repack.writeBitmaps configuration is ignored, as the pack bitmap
facility is useful only with a single packfile.
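
As a rough illustration of the idea, the mapping from "git repack"
configuration to "git pack-objects" options could be sketched in shell as
below. The function name and its arguments are hypothetical and not part of
git; the real change lives in midx.c. The defaults mirror the ones used in
the patch (delta_base_offset = 1, use_delta_islands = 0).

```shell
# Hypothetical helper (not part of git): print the pack-objects options
# implied by two repack.* booleans.
#   $1: value of repack.useDeltaBaseOffset (default: true)
#   $2: value of repack.useDeltaIslands    (default: false)
midx_pack_objects_flags() {
	delta_base_offset=${1:-true}
	use_delta_islands=${2:-false}
	flags=""
	if [ "$delta_base_offset" = true ]; then
		flags="$flags --delta-base-offset"
	fi
	if [ "$use_delta_islands" = true ]; then
		flags="$flags --delta-islands"
	fi
	# Strip the leading space before printing.
	printf '%s\n' "${flags# }"
}

# In a real repository the two values could be read with, for example:
#   git config --type=bool --default=true --get repack.useDeltaBaseOffset
midx_pack_objects_flags true false   # prints: --delta-base-offset
midx_pack_objects_flags true true    # prints: --delta-base-offset --delta-islands
```

repack.writeBitmaps is deliberately left out of the sketch, for the reason
given above.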

Derrick Stolee's patch, which follows in this series, adds support for
repack.packKeptObjects.

Derrick Stolee (1):
  multi-pack-index: respect repack.packKeptObjects=false

Son Luong Ngoc (1):
  midx: teach "git multi-pack-index repack" honor "git repack"
    configurations

 Documentation/git-multi-pack-index.txt |  3 ++
 midx.c                                 | 42 +++++++++++++++++++++++---
 t/t5319-multi-pack-index.sh            | 27 +++++++++++++++++
 3 files changed, 67 insertions(+), 5 deletions(-)


base-commit: b994622632154fc3b17fb40a38819ad954a5fb88
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-626%2Fsluongng%2Fsluongngoc%2Fmidx-config-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-626/sluongng/sluongngoc/midx-config-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/626

Range-diff vs v3:

 1:  a925307d4c5 ! 1:  a8f75e34e5b midx: teach "git multi-pack-index repack" honor "git repack" configurations
     @@ Metadata
       ## Commit message ##
          midx: teach "git multi-pack-index repack" honor "git repack" configurations
      
     -    Previously, when the "repack" subcommand of "git multi-pack-index" command
     -    creates new packfile(s), it does not call the "git repack" command but
     -    instead directly calls the "git pack-objects" command, and the
     -    configuration variables meant for the "git repack" command, like
     -    "repack.usedaeltabaseoffset", are ignored.
     +    When the "repack" subcommand of "git multi-pack-index" command
     +    creates new packfile(s), it does not call the "git repack"
     +    command but instead directly calls the "git pack-objects"
     +    command, and the configuration variables meant for the "git
     +    repack" command, like "repack.usedaeltabaseoffset", are ignored.
      
     -    This patch ensured "git multi-pack-index" checks the configuration
     -    variables used by "git repack" and passes the corresponding options to
     -    the underlying "git pack-objects" command.
     +    Check the configuration variables used by "git repack" ourselves
     +    in "git multi-index-pack" and pass the corresponding options to
     +    underlying "git pack-objects".
      
          Note that `repack.writeBitmaps` configuration is ignored, as the
          pack bitmap facility is useful only with a single packfile.
     @@ Commit message
      
       ## midx.c ##
      @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
     - 	struct child_process cmd = CHILD_PROCESS_INIT;
       	struct strbuf base_name = STRBUF_INIT;
       	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
     + 
     ++	/*
     ++	 * When updating the default for these configuration
     ++	 * variables in builtin/repack.c, these must be adjusted
     ++	 * to match.
     ++	 */
      +	int delta_base_offset = 1;
      +	int use_delta_islands = 0;
     - 
     ++
       	if (!m)
       		return 0;
     + 
      @@ midx.c: int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
       	} else if (fill_included_packs_all(m, include_pack))
       		goto cleanup;
 2:  988697dd512 ! 2:  192fc785382 multi-pack-index: respect repack.packKeptObjects=false
     @@ t/t5319-multi-pack-index.sh: test_expect_success 'repack with minimum size does
      +		ls .git/objects/pack/*idx >idx-list &&
      +		test_line_count = 5 idx-list &&
      +		ls .git/objects/pack/*.pack | sed "s/\.pack/.keep/" >keep-list &&
     ++		test_line_count = 5 keep-list &&
      +		for keep in $(cat keep-list)
      +		do
      +			touch $keep || return 1
     @@ t/t5319-multi-pack-index.sh: test_expect_success 'repack with minimum size does
      +		test_line_count = 5 idx-list &&
      +		test-tool read-midx .git/objects | grep idx >midx-list &&
      +		test_line_count = 5 midx-list &&
     -+		THIRD_SMALLEST_SIZE=$(test-tool path-utils file-size .git/objects/pack/*pack | sort -n | head -n 3 | tail -n 1) &&
     -+		BATCH_SIZE=$(($THIRD_SMALLEST_SIZE + 1)) &&
     ++		THIRD_SMALLEST_SIZE=$(test-tool path-utils file-size .git/objects/pack/*pack | sort -n | sed -n 3p) &&
     ++		BATCH_SIZE=$((THIRD_SMALLEST_SIZE + 1)) &&
      +		git multi-pack-index repack --batch-size=$BATCH_SIZE &&
      +		ls .git/objects/pack/*idx >idx-list &&
      +		test_line_count = 5 idx-list &&
 3:  efeb3d7d132 < -:  ----------- Ensured t5319 follows arith expansion guideline