mbox series

[v3,0/5] Avoid spawning gzip in git archive

Message ID 217a2f4d-4fc2-aaed-f5c2-1b7e134b046d@web.de (mailing list archive)
Headers show
Series Avoid spawning gzip in git archive | expand

Message

René Scharfe June 12, 2022, 6 a.m. UTC
It's been a while, let's try again.

Changes:
- Use our own zlib helpers instead of the gz* functions of zlib,
- ... which allows us to set the OS_CODE header consistently.
- Pseudo-command "git archive gzip" to select the internal
  implementation in config.
- Use a function pointer to plug in the internal gzip.
- Tests.
- Discuss performance in commit message.

  archive: rename archiver data field to filter_command
  archive-tar: factor out write_block()
  archive-tar: add internal gzip implementation
  archive-tar: use OS_CODE 3 (Unix) for internal gzip
  archive-tar: use internal gzip by default

 Documentation/git-archive.txt |  3 +-
 archive-tar.c                 | 79 ++++++++++++++++++++++++++++++-----
 archive.h                     |  2 +-
 t/t5000-tar-tree.sh           | 28 ++++++++++---
 4 files changed, 93 insertions(+), 19 deletions(-)

--
2.36.1

Comments

Johannes Schindelin June 14, 2022, 11:28 a.m. UTC | #1
Hi René,

On Sun, 12 Jun 2022, René Scharfe wrote:

> It's been a while, let's try again.

Thank you for picking this up again!

> Changes:
> - Use our own zlib helpers instead of the gz* functions of zlib,
> - ... which allows us to set the OS_CODE header consistently.
> - Pseudo-command "git archive gzip" to select the internal
>   implementation in config.
> - Use a function pointer to plug in the internal gzip.
> - Tests.
> - Discuss performance in commit message.

Makes sense. Here is the range-diff:

-- snip --
-:  ----------- > 1:  9847267888e archive: rename archiver data field to filter_command
1:  7d50f52e490 ! 2:  a98ef655af9 archive: factor out writing blocks into a separate function
    @@
      ## Metadata ##
    -Author: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    +Author: René Scharfe <l.s.r@web.de>

      ## Commit message ##
    -    archive: factor out writing blocks into a separate function
    +    archive-tar: factor out write_block()

    -    The `git archive --format=tgz` command spawns `gzip` to perform the
    -    actual compression. However, the MinGit flavor of Git for Windows
    -    comes without `gzip` bundled inside.
    +    All tar archive writes have the same size and are done to the same file
    +    descriptor.  Move them to a common function, write_block(), to reduce
    +    code duplication and make it easy to change the destination.

    -    To help with that, we will teach `git archive` to let zlib perform the
    -    actual compression.
    -
    -    In preparation for this, we consolidate all the block writes into the
    -    function `write_block_or_die()`.
    -
    -    Note: .tar files have a well-defined, fixed block size. For that reason,
    -    it does not make any sense to pass anything but a fully-populated,
    -    full-length block to the `write_block_or_die()` function, and we can
    -    save ourselves some future trouble by not even allowing to pass an
    -    incorrect `size` parameter to it.
    -
    -    Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    -    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    +    Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    +    Signed-off-by: René Scharfe <l.s.r@web.de>
    +    Signed-off-by: Junio C Hamano <gitster@pobox.com>

      ## archive-tar.c ##
     @@ archive-tar.c: static int write_tar_filter_archive(const struct archiver *ar,
      #define USTAR_MAX_MTIME 077777777777ULL
      #endif

    -+/* writes out the whole block, or dies if fails */
    -+static void write_block_or_die(const char *block) {
    -+	write_or_die(1, block, BLOCKSIZE);
    ++static void write_block(const void *buf)
    ++{
    ++	write_or_die(1, buf, BLOCKSIZE);
     +}
     +
      /* writes out the whole block, but only if it is full */
    @@ archive-tar.c: static int write_tar_filter_archive(const struct archiver *ar,
      {
      	if (offset == BLOCKSIZE) {
     -		write_or_die(1, block, BLOCKSIZE);
    -+		write_block_or_die(block);
    ++		write_block(block);
      		offset = 0;
      	}
      }
    @@ archive-tar.c: static void do_write_blocked(const void *data, unsigned long size
      	}
      	while (size >= BLOCKSIZE) {
     -		write_or_die(1, buf, BLOCKSIZE);
    -+		write_block_or_die(buf);
    ++		write_block(buf);
      		size -= BLOCKSIZE;
      		buf += BLOCKSIZE;
      	}
    @@ archive-tar.c: static void write_trailer(void)
      	int tail = BLOCKSIZE - offset;
      	memset(block + offset, 0, tail);
     -	write_or_die(1, block, BLOCKSIZE);
    -+	write_block_or_die(block);
    ++	write_block(block);
      	if (tail < 2 * RECORDSIZE) {
      		memset(block, 0, offset);
     -		write_or_die(1, block, BLOCKSIZE);
    -+		write_block_or_die(block);
    ++		write_block(block);
      	}
      }

2:  ac2b2488a1b < -:  ----------- archive-tar: mark RECORDSIZE/BLOCKSIZE as unsigned
3:  4ea94a87848 ! 3:  5e3c0d79589 archive: optionally use zlib directly for gzip compression
    @@
      ## Metadata ##
    -Author: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    +Author: René Scharfe <l.s.r@web.de>

      ## Commit message ##
    -    archive: optionally use zlib directly for gzip compression
    +    archive-tar: add internal gzip implementation

    -    As we already link to the zlib library, we can perform the compression
    -    without even requiring gzip on the host machine.
    +    Git uses zlib for its own object store, but calls gzip when creating tgz
    +    archives.  Add an option to perform the gzip compression for the latter
    +    using zlib, without depending on the external gzip binary.

    -    Note: the `-n` flag that `git archive` passed to `gzip` wants to ensure
    -    that a reproducible file is written, i.e. no filename or mtime will be
    -    recorded in the compressed output. This is already the default for
    -    zlib's `gzopen()` function (if the file name or mtime should be
    -    recorded, the `deflateSetHeader()` function would have to be called
    -    instead).
    +    Plug it in by making write_block a function pointer and switching to a
    +    compressing variant if the filter command has the magic value "git
    +    archive gzip".  Does that indirection slow down tar creation?  Not
    +    really, at least not in this test:

    -    Note also that the `gzFile` datatype is defined as a pointer in
    -    `zlib.h`, i.e. we can rely on the fact that it can be `NULL`.
    +    $ hyperfine -w3 -L rev HEAD,origin/main -p 'git checkout {rev} && make' \
    +    './git -C ../linux archive --format=tar HEAD # {rev}'
    +    Benchmark #1: ./git -C ../linux archive --format=tar HEAD # HEAD
    +      Time (mean ± σ):      4.044 s ±  0.007 s    [User: 3.901 s, System: 0.137 s]
    +      Range (min … max):    4.038 s …  4.059 s    10 runs

    -    At this point, this new mode is hidden behind the pseudo command
    -    `:zlib`: assign this magic string to the `archive.tgz.command` config
    -    setting to enable it.
    +    Benchmark #2: ./git -C ../linux archive --format=tar HEAD # origin/main
    +      Time (mean ± σ):      4.047 s ±  0.009 s    [User: 3.903 s, System: 0.138 s]
    +      Range (min … max):    4.038 s …  4.066 s    10 runs

    -    Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    -    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    +    How does tgz creation perform?

    - ## archive-tar.c ##
    -@@ archive-tar.c: static unsigned long offset;
    -
    - static int tar_umask = 002;
    -
    -+static gzFile gzip;
    -+
    - static int write_tar_filter_archive(const struct archiver *ar,
    - 				    struct archiver_args *args);
    +    $ hyperfine -w3 -L command 'gzip -cn','git archive gzip' \
    +    './git -c tar.tgz.command="{command}" -C ../linux archive --format=tgz HEAD'
    +    Benchmark #1: ./git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD
    +      Time (mean ± σ):     20.404 s ±  0.006 s    [User: 23.943 s, System: 0.401 s]
    +      Range (min … max):   20.395 s … 20.414 s    10 runs
    +
    +    Benchmark #2: ./git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD
    +      Time (mean ± σ):     23.807 s ±  0.023 s    [User: 23.655 s, System: 0.145 s]
    +      Range (min … max):   23.782 s … 23.857 s    10 runs
    +
    +    Summary
    +      './git -c tar.tgz.command="gzip -cn" -C ../linux archive --format=tgz HEAD' ran
    +        1.17 ± 0.00 times faster than './git -c tar.tgz.command="git archive gzip" -C ../linux archive --format=tgz HEAD'
    +
    +    So the internal implementation takes 17% longer on the Linux repo, but
    +    uses 2% less CPU time.  That's because the external gzip can run in
    +    parallel on its own processor, while the internal one works sequentially
    +    and avoids the inter-process communication overhead.
    +
    +    What are the benefits?  Only an internal sequential implementation can
    +    offer this eco mode, and it allows avoiding the gzip(1) requirement.
    +
    +    This implementation uses the helper functions from our zlib.c instead of
    +    the convenient gz* functions from zlib, because the latter doesn't give
    +    the control over the generated gzip header that the next patch requires.
    +
    +    Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
    +    Signed-off-by: René Scharfe <l.s.r@web.de>
    +    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +
    + ## Documentation/git-archive.txt ##
    +@@ Documentation/git-archive.txt: tar.<format>.command::
    + 	format is given.
    + +
    + The "tar.gz" and "tgz" formats are defined automatically and default to
    +-`gzip -cn`. You may override them with custom commands.
    ++`gzip -cn`. You may override them with custom commands. An internal gzip
    ++implementation can be used by specifying the value `git archive gzip`.

    + tar.<format>.remote::
    + 	If true, enable `<format>` for use by remote clients via
    +
    + ## archive-tar.c ##
     @@ archive-tar.c: static int write_tar_filter_archive(const struct archiver *ar,
    + #define USTAR_MAX_MTIME 077777777777ULL
    + #endif

    - /* writes out the whole block, or dies if fails */
    - static void write_block_or_die(const char *block) {
    --	write_or_die(1, block, BLOCKSIZE);
    -+	if (!gzip)
    -+		write_or_die(1, block, BLOCKSIZE);
    -+	else if (gzwrite(gzip, block, (unsigned) BLOCKSIZE) != BLOCKSIZE)
    -+		die(_("gzwrite failed"));
    +-static void write_block(const void *buf)
    ++static void tar_write_block(const void *buf)
    + {
    + 	write_or_die(1, buf, BLOCKSIZE);
      }

    ++static void (*write_block)(const void *) = tar_write_block;
    ++
      /* writes out the whole block, but only if it is full */
    -@@ archive-tar.c: static int write_tar_filter_archive(const struct archiver *ar,
    - 	filter.use_shell = 1;
    - 	filter.in = -1;
    + static void write_if_needed(void)
    + {
    +@@ archive-tar.c: static int write_tar_archive(const struct archiver *ar,
    + 	return err;
    + }

    --	if (start_command(&filter) < 0)
    --		die_errno(_("unable to start '%s' filter"), argv[0]);
    --	close(1);
    --	if (dup2(filter.in, 1) < 0)
    --		die_errno(_("unable to redirect descriptor"));
    --	close(filter.in);
    -+	if (!strcmp(":zlib", ar->data)) {
    -+		struct strbuf mode = STRBUF_INIT;
    ++static git_zstream gzstream;
    ++static unsigned char outbuf[16384];
     +
    -+		strbuf_addstr(&mode, "wb");
    ++static void tgz_deflate(int flush)
    ++{
    ++	while (gzstream.avail_in || flush == Z_FINISH) {
    ++		int status = git_deflate(&gzstream, flush);
    ++		if (!gzstream.avail_out || status == Z_STREAM_END) {
    ++			write_or_die(1, outbuf, gzstream.next_out - outbuf);
    ++			gzstream.next_out = outbuf;
    ++			gzstream.avail_out = sizeof(outbuf);
    ++			if (status == Z_STREAM_END)
    ++				break;
    ++		}
    ++		if (status != Z_OK && status != Z_BUF_ERROR)
    ++			die(_("deflate error (%d)"), status);
    ++	}
    ++}
     +
    -+		if (args->compression_level >= 0 && args->compression_level <= 9)
    -+			strbuf_addf(&mode, "%d", args->compression_level);
    ++static void tgz_write_block(const void *data)
    ++{
    ++	gzstream.next_in = (void *)data;
    ++	gzstream.avail_in = BLOCKSIZE;
    ++	tgz_deflate(Z_NO_FLUSH);
    ++}
     +
    -+		gzip = gzdopen(fileno(stdout), mode.buf);
    -+		if (!gzip)
    -+			die(_("Could not gzdopen stdout"));
    -+		strbuf_release(&mode);
    -+	} else {
    -+		if (start_command(&filter) < 0)
    -+			die_errno(_("unable to start '%s' filter"), argv[0]);
    -+		close(1);
    -+		if (dup2(filter.in, 1) < 0)
    -+			die_errno(_("unable to redirect descriptor"));
    -+		close(filter.in);
    -+	}
    -
    - 	r = write_tar_archive(ar, args);
    ++static const char internal_gzip_command[] = "git archive gzip";
    ++
    + static int write_tar_filter_archive(const struct archiver *ar,
    + 				    struct archiver_args *args)
    + {
    +@@ archive-tar.c: static int write_tar_filter_archive(const struct archiver *ar,
    + 	if (!ar->filter_command)
    + 		BUG("tar-filter archiver called with no filter defined");

    --	close(1);
    --	if (finish_command(&filter) != 0)
    --		die(_("'%s' filter reported error"), argv[0]);
    -+	if (gzip) {
    -+		int ret = gzclose(gzip);
    -+		if (ret == Z_ERRNO)
    -+			die_errno(_("gzclose failed"));
    -+		else if (ret != Z_OK)
    -+			die(_("gzclose failed (%d)"), ret);
    -+	} else {
    -+		close(1);
    -+		if (finish_command(&filter) != 0)
    -+			die(_("'%s' filter reported error"), argv[0]);
    ++	if (!strcmp(ar->filter_command, internal_gzip_command)) {
    ++		write_block = tgz_write_block;
    ++		git_deflate_init_gzip(&gzstream, args->compression_level);
    ++		gzstream.next_out = outbuf;
    ++		gzstream.avail_out = sizeof(outbuf);
    ++
    ++		r = write_tar_archive(ar, args);
    ++
    ++		tgz_deflate(Z_FINISH);
    ++		git_deflate_end(&gzstream);
    ++		return r;
     +	}
    ++
    + 	strbuf_addstr(&cmd, ar->filter_command);
    + 	if (args->compression_level >= 0)
    + 		strbuf_addf(&cmd, " -%d", args->compression_level);
    +
    + ## t/t5000-tar-tree.sh ##
    +@@ t/t5000-tar-tree.sh: test_expect_success GZIP 'remote tar.gz can be disabled' '
    + 		>remote.tar.gz
    + '

    - 	strbuf_release(&cmd);
    - 	return r;
    ++test_expect_success 'git archive --format=tgz (internal gzip)' '
    ++	test_config tar.tgz.command "git archive gzip" &&
    ++	git archive --format=tgz HEAD >internal_gzip.tgz
    ++'
    ++
    ++test_expect_success 'git archive --format=tar.gz (internal gzip)' '
    ++	test_config tar.tar.gz.command "git archive gzip" &&
    ++	git archive --format=tar.gz HEAD >internal_gzip.tar.gz &&
    ++	test_cmp_bin internal_gzip.tgz internal_gzip.tar.gz
    ++'
    ++
    ++test_expect_success GZIP 'extract tgz file (internal gzip)' '
    ++	gzip -d -c <internal_gzip.tgz >internal_gzip.tar &&
    ++	test_cmp_bin b.tar internal_gzip.tar
    ++'
    ++
    + test_expect_success 'archive and :(glob)' '
    + 	git archive -v HEAD -- ":(glob)**/sh" >/dev/null 2>actual &&
    + 	cat >expect <<EOF &&
-:  ----------- > 4:  af27bea4fc3 archive-tar: use OS_CODE 3 (Unix) for internal gzip
4:  0e5826e3f25 ! 5:  62038b8e911 archive: use the internal zlib-based gzip compression by default
    @@
      ## Metadata ##
    -Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
    +Author: René Scharfe <l.s.r@web.de>

      ## Commit message ##
    -    archive: use the internal zlib-based gzip compression by default
    +    archive-tar: use internal gzip by default

    -    We just introduced support for compressing `.tar.gz` archives in the
    -    `git archive` process itself, using zlib directly instead of spawning
    -    `gzip`.
    +    Drop the dependency on gzip(1) and use our internal implementation to
    +    create tar.gz and tgz files.

    -    While this takes less CPU time overall, on multi-core machines, this is
    -    slightly slower in terms of wall clock time (it seems to be in the
    -    ballpark of 15%).
    -
    -    It does reduce the number of dependencies by one, though, which makes it
    -    desirable to turn that mode on by default.
    -
    -    Changing the default benefits most notably the MinGit flavor of Git for
    -    Windows (which intends to support 3rd-party applications that want to
    -    use Git and want to bundle a minimal set of files for that purpose, i.e.
    -    stripping out all non-essential files such as interactive commands,
    -    Perl, and yes, also `gzip`).
    -
    -    We also can now remove the `GZIP` prerequisite from quite a number of
    -    test cases in `t/t5000-tar-tree.sh`.
    -
    -    This closes https://github.com/git-for-windows/git/issues/1970
    -
    -    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    +    Signed-off-by: René Scharfe <l.s.r@web.de>
    +    Signed-off-by: Junio C Hamano <gitster@pobox.com>

      ## Documentation/git-archive.txt ##
     @@ Documentation/git-archive.txt: tar.<format>.command::
      	format is given.
      +
      The "tar.gz" and "tgz" formats are defined automatically and default to
    --`gzip -cn`. You may override them with custom commands.
    -+`:zlib`, triggering an in-process gzip compression. You may override
    -+them with custom commands, e.g. `gzip -cn` or `pigz -cn`.
    +-`gzip -cn`. You may override them with custom commands. An internal gzip
    +-implementation can be used by specifying the value `git archive gzip`.
    ++the magic value `git archive gzip`, which invokes an internal
    ++implementation of gzip. You may override them with custom commands.

      tar.<format>.remote::
      	If true, enable `<format>` for use by remote clients via
    @@ archive-tar.c: void init_tar_archiver(void)
      	register_archiver(&tar_archiver);

     -	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
    -+	tar_filter_config("tar.tgz.command", ":zlib", NULL);
    ++	tar_filter_config("tar.tgz.command", internal_gzip_command, NULL);
      	tar_filter_config("tar.tgz.remote", "true", NULL);
     -	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
    -+	tar_filter_config("tar.tar.gz.command", ":zlib", NULL);
    ++	tar_filter_config("tar.tar.gz.command", internal_gzip_command, NULL);
      	tar_filter_config("tar.tar.gz.remote", "true", NULL);
      	git_config(git_tar_config, NULL);
      	for (i = 0; i < nr_tar_filters; i++) {
    @@ t/t5000-tar-tree.sh: test_expect_success 'only enabled filters are available rem
      	git archive --output=j3.tar.gz HEAD &&
      	test_cmp_bin j.tgz j3.tar.gz
      '
    -
    -+test_expect_success 'use `archive.tgz.command=:zlib` explicitly' '
    -+	git -c archive.tgz.command=:zlib archive --output=j4.tgz HEAD &&
    -+	test_cmp_bin j.tgz j4.tgz
    -+'
    -+
    - test_expect_success GZIP 'extract tgz file' '
    - 	gzip -d -c <j.tgz >j.tar &&
    +@@ t/t5000-tar-tree.sh: test_expect_success GZIP 'extract tgz file' '
      	test_cmp_bin b.tar j.tar
      '

    @@ t/t5000-tar-tree.sh: test_expect_success 'only enabled filters are available rem
      	git config tar.tar.gz.remote false &&
      	test_must_fail git archive --remote=. --format=tar.gz HEAD \
      		>remote.tar.gz
    + '
    +
    +-test_expect_success 'git archive --format=tgz (internal gzip)' '
    +-	test_config tar.tgz.command "git archive gzip" &&
    +-	git archive --format=tgz HEAD >internal_gzip.tgz
    ++test_expect_success GZIP 'git archive --format=tgz (external gzip)' '
    ++	test_config tar.tgz.command "gzip -cn" &&
    ++	git archive --format=tgz HEAD >external_gzip.tgz
    + '
    +
    +-test_expect_success 'git archive --format=tar.gz (internal gzip)' '
    +-	test_config tar.tar.gz.command "git archive gzip" &&
    +-	git archive --format=tar.gz HEAD >internal_gzip.tar.gz &&
    +-	test_cmp_bin internal_gzip.tgz internal_gzip.tar.gz
    ++test_expect_success GZIP 'git archive --format=tar.gz (external gzip)' '
    ++	test_config tar.tar.gz.command "gzip -cn" &&
    ++	git archive --format=tar.gz HEAD >external_gzip.tar.gz &&
    ++	test_cmp_bin external_gzip.tgz external_gzip.tar.gz
    + '
    +
    +-test_expect_success GZIP 'extract tgz file (internal gzip)' '
    +-	gzip -d -c <internal_gzip.tgz >internal_gzip.tar &&
    +-	test_cmp_bin b.tar internal_gzip.tar
    ++test_expect_success GZIP 'extract tgz file (external gzip)' '
    ++	gzip -d -c <external_gzip.tgz >external_gzip.tar &&
    ++	test_cmp_bin b.tar external_gzip.tar
    + '
    +
    + test_expect_success 'archive and :(glob)' '
-- snap --

All of these changes look sensible to me, and the performance
implications, while at first glance unfavorable because wallclock time
increases even as CPU time decreases, are actually quite good. As Peff
said in
https://lore.kernel.org/git/20190501181807.GC4109@sigill.intra.peff.net/t/#u

> [...] whatever has the lowest overall CPU time is generally preferable
> [...]

By the way, the main reason why I did not work more is that in
http://madler.net/pipermail/zlib-devel_madler.net/2019-December/003308.html,
Mark Adler (the zlib maintainer) announced that...

> [...] There are many well-tested performance improvements in zlib
> waiting in the wings that will be incorporated over the next several
> months. [...]

This was in December 2019. And now it's June 2022 and I kind of wonder
whether those promised improvements will still come.

In the meantime, however, a viable alternative seems to have cropped up:
https://github.com/zlib-ng/zlib-ng. Essentially, it looks as if it is what
zlib should have become after above-quoted announcement.

In particular the CPU intrinsics support (think MMX, SSE2/3, etc) seem to
be very interesting and I would not be completely surprised if building
Git with your patches and linking against zlib-ng would paint a very
favorable picture not only in terms of CPU time but also in terms of
wallclock time. Sadly, I have not been able to set aside time to look into
that angle, but maybe I can peak your interest?

Thanks,
Dscho
René Scharfe June 14, 2022, 8:05 p.m. UTC | #2
Am 14.06.22 um 13:28 schrieb Johannes Schindelin:
>
> By the way, the main reason why I did not work more is that in
> http://madler.net/pipermail/zlib-devel_madler.net/2019-December/003308.html,
> Mark Adler (the zlib maintainer) announced that...
>
>> [...] There are many well-tested performance improvements in zlib
>> waiting in the wings that will be incorporated over the next several
>> months. [...]
>
> This was in December 2019. And now it's June 2022 and I kind of wonder
> whether those promised improvements will still come.
>
> In the meantime, however, a viable alternative seems to have cropped up:
> https://github.com/zlib-ng/zlib-ng. Essentially, it looks as if it is what
> zlib should have become after above-quoted announcement.
>
> In particular the CPU intrinsics support (think MMX, SSE2/3, etc) seem to
> be very interesting and I would not be completely surprised if building
> Git with your patches and linking against zlib-ng would paint a very
> favorable picture not only in terms of CPU time but also in terms of
> wallclock time. Sadly, I have not been able to set aside time to look into
> that angle, but maybe I can peak your interest?
I was unable to preload zlib-ng using DYLD_INSERT_LIBRARIES on macOS
12.4 so far.  The included demo proggy looks impressive, though:

$ hyperfine -w3 -L gzip gzip,../zlib-ng/minigzip "git -C ../linux archive --format=tar HEAD | {gzip} -c"
Benchmark #1: git -C ../linux archive --format=tar HEAD | gzip -c
  Time (mean ± σ):     20.424 s ±  0.006 s    [User: 23.964 s, System: 0.432 s]
  Range (min … max):   20.414 s … 20.434 s    10 runs

Benchmark #2: git -C ../linux archive --format=tar HEAD | ../zlib-ng/minigzip -c
  Time (mean ± σ):     12.158 s ±  0.006 s    [User: 13.908 s, System: 0.376 s]
  Range (min … max):   12.145 s … 12.166 s    10 runs

Summary
  'git -C ../linux archive --format=tar HEAD | ../zlib-ng/minigzip -c' ran
    1.68 ± 0.00 times faster than 'git -C ../linux archive --format=tar HEAD | gzip -c'
Johannes Schindelin June 30, 2022, 6:55 p.m. UTC | #3
Hi René,

On Tue, 14 Jun 2022, René Scharfe wrote:

> Am 14.06.22 um 13:28 schrieb Johannes Schindelin:
> >
> > By the way, the main reason why I did not work more is that in
> > http://madler.net/pipermail/zlib-devel_madler.net/2019-December/003308.html,
> > Mark Adler (the zlib maintainer) announced that...
> >
> >> [...] There are many well-tested performance improvements in zlib
> >> waiting in the wings that will be incorporated over the next several
> >> months. [...]
> >
> > This was in December 2019. And now it's June 2022 and I kind of wonder
> > whether those promised improvements will still come.
> >
> > In the meantime, however, a viable alternative seems to have cropped up:
> > https://github.com/zlib-ng/zlib-ng. Essentially, it looks as if it is what
> > zlib should have become after above-quoted announcement.
> >
> > In particular the CPU intrinsics support (think MMX, SSE2/3, etc) seem to
> > be very interesting and I would not be completely surprised if building
> > Git with your patches and linking against zlib-ng would paint a very
> > favorable picture not only in terms of CPU time but also in terms of
> > wallclock time. Sadly, I have not been able to set aside time to look into
> > that angle, but maybe I can peak your interest?
> I was unable to preload zlib-ng using DYLD_INSERT_LIBRARIES on macOS
> 12.4 so far.  The included demo proggy looks impressive, though:
>
> $ hyperfine -w3 -L gzip gzip,../zlib-ng/minigzip "git -C ../linux archive --format=tar HEAD | {gzip} -c"
> Benchmark #1: git -C ../linux archive --format=tar HEAD | gzip -c
>   Time (mean ± σ):     20.424 s ±  0.006 s    [User: 23.964 s, System: 0.432 s]
>   Range (min … max):   20.414 s … 20.434 s    10 runs
>
> Benchmark #2: git -C ../linux archive --format=tar HEAD | ../zlib-ng/minigzip -c
>   Time (mean ± σ):     12.158 s ±  0.006 s    [User: 13.908 s, System: 0.376 s]
>   Range (min … max):   12.145 s … 12.166 s    10 runs
>
> Summary
>   'git -C ../linux archive --format=tar HEAD | ../zlib-ng/minigzip -c' ran
>     1.68 ± 0.00 times faster than 'git -C ../linux archive --format=tar HEAD | gzip -c'

Intriguing.

I finally managed to play around with building and packaging zlib-ng [*1*]
(since I want to use it as a drop-in replacement for zlib, I think it is
best to configure it with `--zlib-compat`, that way I do not have to
fiddle with any equivalent of `LD_PRELOAD`). Here are my numbers:

	zlib-ng: 14.409 s ± 0.209 s
	zlib:    26.843 s ± 0.636 s

These are pretty good, which made me think that they might actually even
help regular Git operations (because we zlib every loose object).

So I tried to `fast-import` some 2500 commits from linux.git into a fresh
repository, and the zlib-ng version takes ~51s and the zlib version takes
~58s. At first I thought that it might be noise, but the trend seems to be
steady. It's not a huge improvement, of course, but I think that might be
because most of the time is spent parsing.

I then tried to test the performance focusing on writing loose object, by
using p0008 (increasing the number of files from 50 to 1500 and
restricting it to fsyncMethod=none).

Unfortunately, the numbers are not really conclusive. I do see minor
speed-ups with zlib-ng, mostly, in the single digit percentages, though
occasionally in the other direction. In other words, there is no clear-cut
change, just a vague tendency. My guess: Git writes too small files (their
contents are of the form "$basedir$test_tick.$counter") and zlib-ng's
superior performance does not come to bear.

Still, for larger workloads, zlib-ng seems to offer a quite nice and
substantial performance improvement over zlib.

Ciao,
Dscho

Footnote *1*: https://github.com/msys2/MINGW-packages/compare/master...dscho:zlib-ng
Johannes Schindelin July 1, 2022, 4:05 p.m. UTC | #4
Me again,

On Thu, 30 Jun 2022, Johannes Schindelin wrote:

> I finally managed to play around with building and packaging zlib-ng
> [*1*] (since I want to use it as a drop-in replacement for zlib, I think
> it is best to configure it with `--zlib-compat`, that way I do not have
> to fiddle with any equivalent of `LD_PRELOAD`). Here are my numbers:
>
> 	zlib-ng: 14.409 s ± 0.209 s
> 	zlib:    26.843 s ± 0.636 s
>
> These are pretty good, which made me think that they might actually even
> help regular Git operations (because we zlib every loose object).
>
> So I tried to `fast-import` some 2500 commits from linux.git into a fresh
> repository, and the zlib-ng version takes ~51s and the zlib version takes
> ~58s. At first I thought that it might be noise, but the trend seems to be
> steady. It's not a huge improvement, of course, but I think that might be
> because most of the time is spent parsing.
>
> I then tried to test the performance focusing on writing loose object, by
> using p0008 (increasing the number of files from 50 to 1500 and
> restricting it to fsyncMethod=none).
>
> Unfortunately, the numbers are not really conclusive. I do see minor
> speed-ups with zlib-ng, mostly, in the single digit percentages, though
> occasionally in the other direction. In other words, there is no clear-cut
> change, just a vague tendency. My guess: Git writes too small files (their
> contents are of the form "$basedir$test_tick.$counter") and zlib-ng's
> superior performance does not come to bear.
>
> Still, for larger workloads, zlib-ng seems to offer a quite nice and
> substantial performance improvement over zlib.

Stolee pointed out to me that objects inside pack files are also
zlib-compressed, and that measuring the speed of `git rev-list --objects
--all --count` might therefore be a better test.

And this is where things get a little messy: in the context of Git for
Windows, my local measurements indicate that zlib is better, with ~41
seconds using zlib vs ~52 seconds using zlib-ng (but the latter has a
rather large variance).

These measurements were done with a relatively straight-forward build of
zlib-ng v2.0.6, and on a hunch I then tried to build the tip of zlib-ng's
`develop` branch (which was much less straight-forward) and now get
virtually the same speed with that `rev-list` command.

But then I repeated the `archive` measurement with the `develop` version
of zlib-ng, and while it was still substantially faster than zlib, it was
slightly slower than zlib-ng v2.0.6 (zlib: ~26 seconds, zlib-ng v2.0.6:
~14 seconds, zlib-ng develop: ~16 seconds). Still, much, much faster than
using `-c tar.tgz.command="gzip -cn"` at ~24 seconds.

So: the picture is messy. The latest official release of zlib-ng seems to
offer performance wins using `archive` but slight losses using `rev-list.
Upgrading to the latest revision of zlib-ng offers slightly smaller
performance wins using `archive` and equivalent performance using
`rev-list`. Both blow `gzip -cn` out of the water, thanks to using MMX or
whatever my laptop's CPU offers.

The take-away as far as Git for Windows is concerned: It seems not _quite_
the time yet to switch from zlib to zlib-ng, I want to wait until there is
an official zlib-ng release with favorable speed.

Ciao,
Dscho

P.S.: I pushed a WIP update to this branch:

> Footnote *1*: https://github.com/msys2/MINGW-packages/compare/master...dscho:zlib-ng
Jeff King July 1, 2022, 4:27 p.m. UTC | #5
On Fri, Jul 01, 2022 at 06:05:59PM +0200, Johannes Schindelin wrote:

> Stolee pointed out to me that objects inside pack files are also
> zlib-compressed, and that measuring the speed of `git rev-list --objects
> --all --count` might therefore be a better test.

That will spend quite a lot of time doing hash-lookups for each tree
entry. A better raw zlib test might be:

  git cat-file --batch --batch-all-objects --unordered >/dev/null

which will just dump each object, and should mostly be zlib and delta
reconstruction (the --unordered is important to hit the deltas in the
right order).

> And this is where things get a little messy: in the context of Git for
> Windows, my local measurements indicate that zlib is better, with ~41
> seconds using zlib vs ~52 seconds using zlib-ng (but the latter has a
> rather large variance).

That is a surprising slow-down between the two. I'd expect the command
above to show even more pronounced results, though, as it's spending
less time doing non-zlib things. But it's still just inflating (as
opposed to git-archive, which is both inflating and deflating).

-Peff
Junio C Hamano July 1, 2022, 5:47 p.m. UTC | #6
Jeff King <peff@peff.net> writes:

> That will spend quite a lot of time doing hash-lookups for each tree
> entry. A better raw zlib test might be:
>
>   git cat-file --batch --batch-all-objects --unordered >/dev/null
>
> which will just dump each object, and should mostly be zlib and delta
> reconstruction (the --unordered is important to hit the deltas in the
> right order).

;-)

I like --unordered has the meaning "use the order Git likes" (which
is probably the packfile offset order, which we optimize for
minimizing seek during delta reconstruction).