[0/8] compat/zlib: allow use of zlib-ng as backend

Message ID 20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im (mailing list archive)

Message

Patrick Steinhardt Jan. 10, 2025, 12:55 p.m. UTC
Hi,

I have recently started to play around with zlib-ng a bit, which is a
hard fork of the zlib library. It describes itself as a zlib replacement
with optimizations for "next generation" systems. As such, it contains
several implementations of central algorithms using, for example, SSE2,
AVX2 and other vectorized CPU intrinsics that supposedly speed up
inflating and deflating data.

And indeed, compiling Git against zlib-ng leads to a significant speedup
when reading objects. The following benchmark uses git-cat-file(1) with
`--batch --batch-all-objects` in the Git repository:

    Benchmark 1: zlib
      Time (mean ± σ):     52.085 s ±  0.141 s    [User: 51.500 s, System: 0.456 s]
      Range (min … max):   52.004 s … 52.335 s    5 runs

    Benchmark 2: zlib-ng
      Time (mean ± σ):     40.324 s ±  0.134 s    [User: 39.731 s, System: 0.490 s]
      Range (min … max):   40.135 s … 40.484 s    5 runs

    Summary
      zlib-ng ran
        1.29 ± 0.01 times faster than zlib

So we're looking at a 1.29x speedup compared to zlib, or roughly 23%
less runtime. This is of course an extreme example, as it makes us read
through all objects in the repository. But regardless, it should be
possible to see some sort of speedup in most commands that end up
accessing the object database.
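
To make the relationship between "times faster" and "percent less
runtime" explicit, here is a minimal sketch that recomputes both figures
from the mean times reported above. The function names are made up for
illustration; only the two mean values come from the benchmark.

```c
#include <stdio.h>

/* How many times faster the new run is: old time divided by new time. */
double speedup_factor(double old_secs, double new_secs)
{
	return old_secs / new_secs;
}

/* Fraction of the old runtime that the new run saves. */
double runtime_reduction(double old_secs, double new_secs)
{
	return (old_secs - new_secs) / old_secs;
}

/* Print both figures for the mean times reported by the benchmark. */
void print_summary(void)
{
	const double zlib_secs = 52.085;    /* mean of "Benchmark 1: zlib" */
	const double zlib_ng_secs = 40.324; /* mean of "Benchmark 2: zlib-ng" */

	printf("speedup: %.2fx\n", speedup_factor(zlib_secs, zlib_ng_secs));
	printf("runtime saved: %.1f%%\n",
	       100.0 * runtime_reduction(zlib_secs, zlib_ng_secs));
}
```

With the means above, the factor comes out at about 1.29 and the saved
runtime at roughly 23%, so the two ways of stating the result describe
the same measurement.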

This patch series refactors how we wire up zlib in our project by
introducing a new "compat/zlib.h" header. This header is then later
extended to patch over the differences between zlib and zlib-ng, which
is mostly just that zlib-ng has a `zng_` prefix for each of its symbols.
This way, we can support both libraries directly, and a new Meson build
option allows users to pick whichever backend they like.
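
The general shape of such a prefix shim can be sketched as follows. This
is only an illustration of the technique, not the header from this
series: `demo_checksum()`, `zng_demo_checksum()`, `checksum_of()` and the
`DEMO_USE_ZLIB_NG` switch are hypothetical stand-ins so that the snippet
compiles without either library installed. A real header would include
the library's own header and redirect actual zlib symbols to their
`zng_` counterparts.

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for one function as exported by the two
 * libraries. In a real build these declarations would come from the
 * respective library headers.
 */
unsigned long demo_checksum(const unsigned char *buf, size_t len)
{
	unsigned long sum = 0;
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

unsigned long zng_demo_checksum(const unsigned char *buf, size_t len)
{
	/* Same result as demo_checksum(); only the symbol name differs. */
	unsigned long sum = 0;
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/*
 * The shim: when the (hypothetical) DEMO_USE_ZLIB_NG switch is set by
 * the build system, redirect the unprefixed name to the `zng_` one.
 * Code below this point keeps using the unprefixed name.
 */
#ifdef DEMO_USE_ZLIB_NG
# define demo_checksum(buf, len) zng_demo_checksum(buf, len)
#endif

/* Caller code is identical for both backends. */
unsigned long checksum_of(const unsigned char *buf, size_t len)
{
	return demo_checksum(buf, len);
}
```

The point of the design is that callers never mention the prefix. Only
the one header knows which backend was selected at build time.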

In theory, these changes shouldn't be necessary because zlib-ng provides
a compatibility layer that makes it directly compatible with zlib. But
most distros don't allow you to install zlib-ng with that layer, as it
would mean that zlib would need to be replaced globally. Instead, they
typically provide a version of zlib-ng that only has the `zng_`
prefixed symbols.

Given the observed speedup I do think that this is a worthwhile change
so that users (or especially hosting providers) can easily switch to
zlib-ng without impacting the rest of their system.

Thanks!

Patrick

---
Patrick Steinhardt (8):
      compat: drop `uncompress2()` compatibility shim
      git-compat-util: drop `z_const` define
      compat: introduce new "zlib.h" header
      git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"
      compat/zlib: provide `deflateBound()` shim centrally
      compat/zlib: provide stubs for `deflateSetHeader()`
      git-zlib: cast away potential constness of `next_in` pointer
      compat/zlib: allow use of zlib-ng as backend

 Makefile                  |  1 -
 archive-tar.c             |  4 --
 archive.c                 |  1 +
 compat/zlib-compat.h      | 47 +++++++++++++++++++++++
 compat/zlib-uncompress2.c | 96 -----------------------------------------------
 config.c                  |  1 +
 csum-file.c               |  3 +-
 environment.c             |  1 +
 git-compat-util.h         | 12 ------
 git-zlib.c                |  6 +--
 git-zlib.h                |  2 +
 meson.build               | 22 ++++++++---
 meson_options.txt         |  2 +
 reftable/block.c          |  1 -
 reftable/system.h         |  1 +
 15 files changed, 75 insertions(+), 125 deletions(-)


---
base-commit: 05388c0e69a3497fceb0b5c80ca76d1a6bc3afcd
change-id: 20250110-b4-pks-compat-drop-uncompress2-eb5914459c32

Comments

Taylor Blau Jan. 10, 2025, 3:50 p.m. UTC | #1
On Fri, Jan 10, 2025 at 01:55:27PM +0100, Patrick Steinhardt wrote:
> This patch series refactors how we wire up zlib in our project by
> introducing a new "compat/zlib.h" header. This header is then
> later extended to patch over the differences between zlib and zlib-ng,
> which is mostly just that zlib-ng has a `zng_` prefix for each of its
> symbols. Like this, we can support both libraries directly, and a new
> Meson build option allows users to pick whichever backend they like.

I'm very excited about the possibility of supporting zlib-ng. You
mention that there are new Meson build options here, but I don't see any
changes to the Makefile.

Can we build Git against zlib-ng out of the box with the Makefile? If
so, that is great, and we should document how to build it with zlib
versus zlib-ng when using the Makefile. If not, I am somewhat
uncomfortable about exposing new build options and the features that
they enable behind the new build system.

I think that we should continue to evolve the two more or less in
lockstep if/until we are ready to deprecate the Makefile.

Thanks,
Taylor

Patrick Steinhardt Jan. 13, 2025, 8:42 a.m. UTC | #2
On Fri, Jan 10, 2025 at 10:50:14AM -0500, Taylor Blau wrote:
> On Fri, Jan 10, 2025 at 01:55:27PM +0100, Patrick Steinhardt wrote:
> > This patch series refactors how we wire up zlib in our project by
> > introducing a new "compat/zlib.h" header. This header is then
> > later extended to patch over the differences between zlib and zlib-ng,
> > which is mostly just that zlib-ng has a `zng_` prefix for each of its
> > symbols. Like this, we can support both libraries directly, and a new
> > Meson build option allows users to pick whichever backend they like.
> 
> I'm very excited about the possibility of supporting zlib-ng. You
> mention that there are new Meson build options here, but I don't see any
> changes to the Makefile.
> 
> Can we build Git against zlib-ng out of the box with the Makefile? If
> so, that is great, and we should document how to build it with zlib
> versus zlib-ng when using the Makefile. If not, I am somewhat
> uncomfortable about exposing new build options and the features that
> they enable behind the new build system.

No, it doesn't work out of the box.

> I think that we should continue to evolve the two more or less in
> lockstep if/until we are ready to deprecate the Makefile.

Yeah, you're probably right. I was a bit annoyed when trying to figure
out how to name and document things in the Makefile, but that's not
really a good reason to punt on it. Doubly so because it's ultimately
quite easy to wire up.

Patrick