mbox series

[0/4] hash: introduce generic wrappers to update hash contexts

Message ID 20250131-b4-pks-hash-context-direct-v1-0-67a6d3f49d6e@pks.im (mailing list archive)
Headers show
Series hash: introduce generic wrappers to update hash contexts | expand

Message

Patrick Steinhardt Jan. 31, 2025, 12:55 p.m. UTC
Hi,

this patch series introduces a couple of generic wrappers to update hash
contexts. Instead of updating contexts via function pointers provided by
the hash algorithm, we now remember the hash algorithm in the context
itself. As a result, subsequent calls that update the hash don't need to
remember which algorithm they used:

    ```
    struct git_hash_ctx ctx;
    struct object_id oid;

    git_hash_sha1_init(&ctx);
    git_hash_update(&ctx, data);
    git_hash_final_oid(&oid, &ctx);
    ```

This was discussed in [1] and [2].

The series is built on top of master at 3b0d05c4a7 (The fifth batch,
2025-01-29) with tb/unsafe-hashtcleanup at 04292c3796 (hash.h: drop
unsafe_ function variants, 2025-01-23) merged into it.

Thanks!

Patrick

[1]: <Z3fhK1ACzJfVehM2@pks.im>
[2]: <Z4jyZCAwqOjZ-u2U@pks.im>

---
Patrick Steinhardt (4):
      hash: convert hashing context to a structure
      hash: stop typedeffing the hash context
      hash: provide generic wrappers to update hash contexts
      global: adapt callers to use generic hash context helpers

 builtin/fast-import.c      |  16 +++---
 builtin/index-pack.c       |  19 ++++---
 builtin/patch-id.c         |  14 ++---
 builtin/receive-pack.c     |  18 +++----
 builtin/unpack-objects.c   |  13 +++--
 bulk-checkin.c             |  10 ++--
 csum-file.c                |  16 +++---
 csum-file.h                |   4 +-
 diff.c                     |  34 ++++++------
 diff.h                     |   2 +-
 hash.h                     |  43 +++++++++++----
 http-push.c                |   6 +--
 http.c                     |   6 +--
 http.h                     |   2 +-
 object-file.c              | 130 +++++++++++++++++++++++----------------------
 pack-check.c               |   6 +--
 pack-write.c               |  16 +++---
 read-cache.c               |  26 ++++-----
 rerere.c                   |  18 +++----
 t/helper/test-hash-speed.c |   8 +--
 t/helper/test-hash.c       |   6 +--
 t/unit-tests/u-hash.c      |   6 +--
 trace2/tr2_sid.c           |   6 +--
 23 files changed, 224 insertions(+), 201 deletions(-)


---
base-commit: 16b8e620c866ddbff7e33cc47b9772541285da7f
change-id: 20250131-b4-pks-hash-context-direct-aa7acfda3abc

Comments

Junio C Hamano Jan. 31, 2025, 6:16 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> this patch series introduces a couple of generic wrappers to update hash
> contexts. Instead of updating contexts via function pointers provided by
> the hash algorithm, we now remember the hash algorithm in the context
> itself. As a result, subsequent calls that update the hash don't need to
> remember which algorithm they used:
>
>     ```
>     struct git_hash_ctx ctx;
>     struct object_id oid;
>
>     git_hash_sha1_init(&ctx);
>     git_hash_update(&ctx, data);
>     git_hash_final_oid(&oid, &ctx);
>     ```
>
> This was discussed in [1] and [2].
>
> The series is built on top of master at 3b0d05c4a7 (The fifth batch,
> 2025-01-29) with tb/unsafe-hashtcleanup at 04292c3796 (hash.h: drop
> unsafe_ function variants, 2025-01-23) merged into it.
>
> Thanks!
>
> Patrick
>
> [1]: <Z3fhK1ACzJfVehM2@pks.im>
> [2]: <Z4jyZCAwqOjZ-u2U@pks.im>

Sounds sensible.  

It seems to textually interact with Karthik's attempt to pass down a
hash_algo instance through the callchain in pack-write.c but I
should be able to resolve the conflicts.

Thanks.
Patrick Steinhardt Feb. 3, 2025, 5:42 a.m. UTC | #2
On Fri, Jan 31, 2025 at 10:16:19AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > this patch series introduces a couple of generic wrappers to update hash
> > contexts. Instead of updating contexts via function pointers provided by
> > the hash algorithm, we now remember the hash algorithm in the context
> > itself. As a result, subsequent calls that update the hash don't need to
> > remember which algorithm they used:
> >
> >     ```
> >     struct git_hash_ctx ctx;
> >     struct object_id oid;
> >
> >     git_hash_sha1_init(&ctx);
> >     git_hash_update(&ctx, data);
> >     git_hash_final_oid(&oid, &ctx);
> >     ```
> >
> > This was discussed in [1] and [2].
> >
> > The series is built on top of master at 3b0d05c4a7 (The fifth batch,
> > 2025-01-29) with tb/unsafe-hashtcleanup at 04292c3796 (hash.h: drop
> > unsafe_ function variants, 2025-01-23) merged into it.
> >
> > Thanks!
> >
> > Patrick
> >
> > [1]: <Z3fhK1ACzJfVehM2@pks.im>
> > [2]: <Z4jyZCAwqOjZ-u2U@pks.im>
> 
> Sounds sensible.  
> 
> It seems to textually interact with Karthik's attempt to pass down a
> hash_algo instance through the callchain in pack-write.c but I
> should be able to resolve the conflicts.

Ah, sorry, I didn't do a test merge with 'seen'. In any case, the
conflict resolution looks good to me, thanks!

Patrick