Message ID | 20250114-b4-pks-compat-drop-uncompress2-v2-0-614a2158e34e@pks.im (mailing list archive) |
---|---|
Headers | show |
Series | compat/zlib: allow use of zlib-ng as backend | expand |
Patrick Steinhardt <ps@pks.im> writes: > Changes in v2: > - Wire up zlib-ng in our Makefile. > - Exercise zlib-ng via CI by adapting our "linux-musl" job to use > Meson and installing zlib-ng. > - Link to v1: https://lore.kernel.org/r/20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im > > The series is built on top of fbe8d3079d (Git 2.48, 2025-01-10) with > ps/meson-weak-sha1-build at 6a0ee54f9a (meson: provide a summary of > configured backends, 2024-12-30) merged into it. I think you are now also textually depending on the fuzzer thing due to touching meson_options.txt and ci/run-build-and-tests.sh with a later step. > -: ---------- > 9: 7ae8f413d4 ci: switch linux-musl to use Meson > -: ---------- > 10: 2dd1b49e4f ci: make "linux-musl" job use zlib-ng I will see what other things I can find. Thanks.
Junio C Hamano <gitster@pobox.com> writes: > Patrick Steinhardt <ps@pks.im> writes: > >> Changes in v2: >> - Wire up zlib-ng in our Makefile. >> - Exercise zlib-ng via CI by adapting our "linux-musl" job to use >> Meson and installing zlib-ng. >> - Link to v1: https://lore.kernel.org/r/20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im >> >> The series is built on top of fbe8d3079d (Git 2.48, 2025-01-10) with >> ps/meson-weak-sha1-build at 6a0ee54f9a (meson: provide a summary of >> configured backends, 2024-12-30) merged into it. > > I think you are now also textually depending on the fuzzer thing due > to touching meson_options.txt and ci/run-build-and-tests.sh with a > later step. > >> -: ---------- > 9: 7ae8f413d4 ci: switch linux-musl to use Meson >> -: ---------- > 10: 2dd1b49e4f ci: make "linux-musl" job use zlib-ng > > I will see what other things I can find. Yup. The patch series for some reason still does not seem to apply cleanly ([09/10] ci/run-build-and-tests.sh somehow seems to be troublesome), but it was easy to wiggle it in when the base was prepared with these two topics merged on top of 'master': 4610af08e7 ci: make "linux-musl" job use zlib-ng b2ddd0b33e ci: switch linux-musl to use Meson 5118183ef4 compat/zlib: allow use of zlib-ng as backend 08bf6b2062 git-zlib: cast away potential constness of `next_in` pointer ebf98412e3 compat/zlib: provide stubs for `deflateSetHeader()` 29829e5714 compat/zlib: provide `deflateBound()` shim centrally 8f19b26bbe git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" 8aab230253 compat: introduce new "zlib.h" header 1ce001beaa git-compat-util: drop `z_const` define b9d4bd5467 compat: drop `uncompress2()` compatibility shim db620fad21 Merge branch 'ps/build-meson-fixes' into ps/zlib-ng 64156589d9 Merge branch 'ps/meson-weak-sha1-build' into ps/zlib-ng I think the reason is because the other topic that touches the fuzz thing we see in the context of [09/10] is not ps/build-meson-fixes but something else that is before "--fatal-meson-warnings" was added. One request. You seem to have started using --full-index when generating the patches. It is extremely annoying when a patch needs to be mucked with an editor to inspect why it does not apply and to tweak it to make it apply. 40-hex does not help at all if the base commit is not conveyed correctly, as the recipient will not have the necessary blob objects _anyway_. And 40-hex is unnecessarily long in order to protect the recipient who uses "--3way" from using a wrong blob in a fake ancestor tree. Please stop. Thanks.
On Tue, Jan 14, 2025 at 01:09:43PM -0800, Junio C Hamano wrote: > One request. You seem to have started using --full-index when > generating the patches. It is extremely annoying when a patch needs > to be mucked with an editor to inspect why it does not apply and to > tweak it to make it apply. 40-hex does not help at all if the base > commit is not conveyed correctly, as the recipient will not have the > necessary blob objects _anyway_. And 40-hex is unnecessarily long > in order to protect the recipient who uses "--3way" from using a > wrong blob in a fake ancestor tree. Please stop. I have in fact started using b4, as it makes most of the tedious housekeeping around patch series go away, and it indeed uses `--full-index` to generate patches. There isn't any way to change that, but I'll send a patch upstream that gives us an option to do so. My last patches haven't gotten any feedback though, so let's see how it goes. Patrick
On Tue, Jan 14, 2025 at 01:09:43PM -0800, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > Patrick Steinhardt <ps@pks.im> writes: > > > >> Changes in v2: > >> - Wire up zlib-ng in our Makefile. > >> - Exercise zlib-ng via CI by adapting our "linux-musl" job to use > >> Meson and installing zlib-ng. > >> - Link to v1: https://lore.kernel.org/r/20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im > >> > >> The series is built on top of fbe8d3079d (Git 2.48, 2025-01-10) with > >> ps/meson-weak-sha1-build at 6a0ee54f9a (meson: provide a summary of > >> configured backends, 2024-12-30) merged into it. > > > > I think you are now also textually depending on the fuzzer thing due > > to touching meson_options.txt and ci/run-build-and-tests.sh with a > > later step. > > > >> -: ---------- > 9: 7ae8f413d4 ci: switch linux-musl to use Meson > >> -: ---------- > 10: 2dd1b49e4f ci: make "linux-musl" job use zlib-ng > > > > I will see what other things I can find. > > Yup. The patch series for some reason still does not seem to apply > cleanly ([09/10] ci/run-build-and-tests.sh somehow seems to be > troublesome), but it was easy to wiggle it in when the base was > prepared with these two topics merged on top of 'master': > > 4610af08e7 ci: make "linux-musl" job use zlib-ng > b2ddd0b33e ci: switch linux-musl to use Meson > 5118183ef4 compat/zlib: allow use of zlib-ng as backend > 08bf6b2062 git-zlib: cast away potential constness of `next_in` pointer > ebf98412e3 compat/zlib: provide stubs for `deflateSetHeader()` > 29829e5714 compat/zlib: provide `deflateBound()` shim centrally > 8f19b26bbe git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" > 8aab230253 compat: introduce new "zlib.h" header > 1ce001beaa git-compat-util: drop `z_const` define > b9d4bd5467 compat: drop `uncompress2()` compatibility shim > db620fad21 Merge branch 'ps/build-meson-fixes' into ps/zlib-ng > 64156589d9 Merge branch 'ps/meson-weak-sha1-build' into ps/zlib-ng > > I think the reason is because the other topic that touches the fuzz > thing we see in the context of [09/10] is not ps/build-meson-fixes > but something else that is before "--fatal-meson-warnings" was > added. Okay, I'll adapt the base accordingly for the next iterations. Thanks! Patrick
On Wed, Jan 15, 2025 at 06:45:31AM +0100, Patrick Steinhardt wrote: > I have in fact started using b4, as it makes most of the tedious > housekeeping around patch series go away, and it indeed uses > `--full-index` to generate patches. There isn't any way to change that, > but I'll send a patch upstream that gives us an option to do so. This was done as part of this change: https://git.kernel.org/pub/scm/utils/b4/b4.git/commit/?id=23a9ddba10a057bfa9c438c0b50ac36d278ae022 I'm not sure why --full-index was added there -- I don't think it's needed for --binary? Please feel free to send a fix for that. > My last patches haven't gotten any feedback though, so let's see how it > goes. I had to focus on infrastructure needs over the past few months, but I'm starting on my b4 backlog soon. -K
Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes: > I'm not sure why --full-index was added there -- I don't think it's needed for > --binary? Correct. For the purpose of that quoted commit, giving "--binary" should be sufficient. It would avoid "Binary files differ" that is useless in the context of format-patch, and when a binary patch is given, it will automatically get the full length blob object name, without --full-index. Thanks.
On Wed, Jan 15, 2025 at 10:50:36AM -0500, Konstantin Ryabitsev wrote: > On Wed, Jan 15, 2025 at 06:45:31AM +0100, Patrick Steinhardt wrote: > > I have in fact started using b4, as it makes most of the tedious > > housekeeping around patch series go away, and it indeed uses > > `--full-index` to generate patches. There isn't any way to change that, > > but I'll send a patch upstream that gives us an option to do so. > > This was done as part of this change: > https://git.kernel.org/pub/scm/utils/b4/b4.git/commit/?id=23a9ddba10a057bfa9c438c0b50ac36d278ae022 > > I'm not sure why --full-index was added there -- I don't think it's needed for > --binary? Please feel free to send a fix for that. No, it shouldn't be needed. `--binary` implies `--full-index` for that particular binary diff anyway. I'll send a patch. > > My last patches haven't gotten any feedback though, so let's see how it > > goes. > > I had to focus on infrastructure needs over the past few months, but I'm > starting on my b4 backlog soon. Fair enough, thanks! Patrick
Patrick Steinhardt <ps@pks.im> writes: > Hi, > > I have recently started to play around with zlib-ng a bit, which is a > hard fork of the zlib library. It describes itself as zlib replacement > with optimizations for "next generation" systems. As such, it contains > several implementations of central algorithms using for example SSE2, > AVX2 and other vectorized CPU intrinsics that supposedly speed up in- > and deflating data. > > And indeed, compiling Git against zlib-ng leads to a significant speedup > when reading objects. The following benchmark uses git-cat-file(1) with > `--batch --batch-all-objects` in the Git repository: > > Benchmark 1: zlib > Time (mean ± σ): 52.085 s ± 0.141 s [User: 51.500 s, System: 0.456 s] > Range (min … max): 52.004 s … 52.335 s 5 runs > > Benchmark 2: zlib-ng > Time (mean ± σ): 40.324 s ± 0.134 s [User: 39.731 s, System: 0.490 s] > Range (min … max): 40.135 s … 40.484 s 5 runs > > Summary > zlib-ng ran > 1.29 ± 0.01 times faster than zlib > > So we're looking at a ~25% speedup compared to zlib. This is of course > an extreme example, as it makes us read through all objects in the > repository. But regardless, it should be possible to see some sort of > speedup in most commands that end up accessing the object database. > > This patch series refactors how we wire up zlib in our project by > introducing a new "compat/zlib.h" header function. This header is then > later extended to patch over the differences between zlib and zlib-ng, > which is mostly just that zlib-ng has a `zng_` prefix for each of its > symbols. Like this, we can support both libraries directly, and a new > Meson build options allows users to pick whichever backend they like. > > In theory, these changes shouldn't be necessary because zlib-ng provides > a compatibility layer that make it directly compatible with zlib. But > most distros don't allow you to install zlib-ng with that layer is it > would mean that zlib would need to be replaced globally. Instead, they > typically only provide a version of zlib-ng that only has the `zng_` > prefixed symbols. > > Given the observed speedup I do think that this is a worthwhile change > so that users (or especially hosting providers) can easily switch to > zlib-ng without impacting the rest of their system. > > Changes in v2: > - Wire up zlib-ng in our Makefile. > - Exercise zlib-ng via CI by adapting our "linux-musl" job to use > Meson and installing zlib-ng. > - Link to v1: https://lore.kernel.org/r/20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im > Apart from a few typos, Patrick has already answered two of questions and the series already looks good to me! Thanks [snip]
On Wed, Jan 15, 2025 at 05:25:27PM +0100, Patrick Steinhardt wrote: > > I'm not sure why --full-index was added there -- I don't think it's needed for > > --binary? Please feel free to send a fix for that. > > No, it shouldn't be needed. `--binary` implies `--full-index` for that > particular binary diff anyway. I'll send a patch. I've applied it to master and stable-0.14.y, thanks! I should have version 0.14.3 out in the near future with this fix. -K
Hi, I have recently started to play around with zlib-ng a bit, which is a hard fork of the zlib library. It describes itself as zlib replacement with optimizations for "next generation" systems. As such, it contains several implementations of central algorithms using for example SSE2, AVX2 and other vectorized CPU intrinsics that supposedly speed up in- and deflating data. And indeed, compiling Git against zlib-ng leads to a significant speedup when reading objects. The following benchmark uses git-cat-file(1) with `--batch --batch-all-objects` in the Git repository: Benchmark 1: zlib Time (mean ± σ): 52.085 s ± 0.141 s [User: 51.500 s, System: 0.456 s] Range (min … max): 52.004 s … 52.335 s 5 runs Benchmark 2: zlib-ng Time (mean ± σ): 40.324 s ± 0.134 s [User: 39.731 s, System: 0.490 s] Range (min … max): 40.135 s … 40.484 s 5 runs Summary zlib-ng ran 1.29 ± 0.01 times faster than zlib So we're looking at a ~25% speedup compared to zlib. This is of course an extreme example, as it makes us read through all objects in the repository. But regardless, it should be possible to see some sort of speedup in most commands that end up accessing the object database. This patch series refactors how we wire up zlib in our project by introducing a new "compat/zlib.h" header function. This header is then later extended to patch over the differences between zlib and zlib-ng, which is mostly just that zlib-ng has a `zng_` prefix for each of its symbols. Like this, we can support both libraries directly, and a new Meson build options allows users to pick whichever backend they like. In theory, these changes shouldn't be necessary because zlib-ng provides a compatibility layer that make it directly compatible with zlib. But most distros don't allow you to install zlib-ng with that layer is it would mean that zlib would need to be replaced globally. Instead, they typically only provide a version of zlib-ng that only has the `zng_` prefixed symbols. Given the observed speedup I do think that this is a worthwhile change so that users (or especially hosting providers) can easily switch to zlib-ng without impacting the rest of their system. Changes in v2: - Wire up zlib-ng in our Makefile. - Exercise zlib-ng via CI by adapting our "linux-musl" job to use Meson and installing zlib-ng. - Link to v1: https://lore.kernel.org/r/20250110-b4-pks-compat-drop-uncompress2-v1-0-965d0022a74d@pks.im The series is built on top of fbe8d3079d (Git 2.48, 2025-01-10) with ps/meson-weak-sha1-build at 6a0ee54f9a (meson: provide a summary of configured backends, 2024-12-30) merged into it. Thanks! Patrick --- Patrick Steinhardt (10): compat: drop `uncompress2()` compatibility shim git-compat-util: drop `z_const` define compat: introduce new "zlib.h" header git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" compat/zlib: provide `deflateBound()` shim centrally compat/zlib: provide stubs for `deflateSetHeader()` git-zlib: cast away potential constness of `next_in` pointer compat/zlib: allow use of zlib-ng as backend ci: switch linux-musl to use Meson ci: make "linux-musl" job use zlib-ng .github/workflows/main.yml | 2 +- .gitlab-ci.yml | 2 +- Makefile | 21 +++++++--- archive-tar.c | 4 -- archive.c | 1 + ci/install-dependencies.sh | 4 +- ci/lib.sh | 5 +-- ci/run-build-and-tests.sh | 3 +- compat/zlib-compat.h | 47 +++++++++++++++++++++++ compat/zlib-uncompress2.c | 96 ---------------------------------------------- config.c | 1 + csum-file.c | 3 +- environment.c | 1 + git-compat-util.h | 12 ------ git-zlib.c | 6 +-- git-zlib.h | 2 + meson.build | 24 +++++++++--- meson_options.txt | 4 ++ reftable/block.c | 1 - reftable/system.h | 1 + 20 files changed, 100 insertions(+), 140 deletions(-) Range-diff versus v1: 1: 5f650c2a6b = 1: 0d442c86cf compat: drop `uncompress2()` compatibility shim 2: a94c26ad03 = 2: 9fb474c07a git-compat-util: drop `z_const` define 3: 4431647ede = 3: d732ab51ca compat: introduce new "zlib.h" header 4: bbca17dfb5 = 4: b261f9ebcd git-compat-util: move include of "compat/zlib.h" into "git-zlib.h" 5: d5315531d8 = 5: 8fa63dc02c compat/zlib: provide `deflateBound()` shim centrally 6: 27691c395f = 6: 851c90ea6a compat/zlib: provide stubs for `deflateSetHeader()` 7: 5c47ab5f9b = 7: 54d4a95753 git-zlib: cast away potential constness of `next_in` pointer 8: 9826f57665 ! 8: e260c57b2e compat/zlib: allow use of zlib-ng as backend @@ Commit message Signed-off-by: Patrick Steinhardt <ps@pks.im> + ## Makefile ## +@@ Makefile: include shared.mak + # byte-order mark (BOM) when writing UTF-16 or UTF-32 and always writes in + # big-endian format. + # +-# Define NO_DEFLATE_BOUND if your zlib does not have deflateBound. ++# Define NO_DEFLATE_BOUND if your zlib does not have deflateBound. Define ++# ZLIB_NG if you want to use zlib-ng instead of zlib. + # + # Define NO_NORETURN if using buggy versions of gcc 4.6+ and profile feedback, + # as the compiler can crash (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49299) +@@ Makefile: else + endif + IMAP_SEND_LDFLAGS += $(OPENSSL_LINK) $(OPENSSL_LIBSSL) $(LIB_4_CRYPTO) + +-ifdef ZLIB_PATH +- BASIC_CFLAGS += -I$(ZLIB_PATH)/include +- EXTLIBS += $(call libpath_template,$(ZLIB_PATH)/$(lib)) ++ifdef ZLIB_NG ++ BASIC_CFLAGS += -DHAVE_ZLIB_NG ++ ifdef ZLIB_NG_PATH ++ BASIC_CFLAGS += -I$(ZLIB_NG_PATH)/include ++ EXTLIBS += $(call libpath_template,$(ZLIB_NG_PATH)/$(lib)) ++ endif ++ EXTLIBS += -lz-ng ++else ++ ifdef ZLIB_PATH ++ BASIC_CFLAGS += -I$(ZLIB_PATH)/include ++ EXTLIBS += $(call libpath_template,$(ZLIB_PATH)/$(lib)) ++ endif ++ EXTLIBS += -lz + endif +-EXTLIBS += -lz + + ifndef NO_OPENSSL + OPENSSL_LIBSSL = -lssl + ## compat/zlib-compat.h ## @@ #ifndef COMPAT_ZLIB_H -: ---------- > 9: 7ae8f413d4 ci: switch linux-musl to use Meson -: ---------- > 10: 2dd1b49e4f ci: make "linux-musl" job use zlib-ng --- base-commit: b2da7775f8b064ef54920eb0f2e60c7f6df8f995 change-id: 20250110-b4-pks-compat-drop-uncompress2-eb5914459c32