mbox series

[v3,00/13] reftable: improve ref iteration performance (pt.2)

Message ID cover.1709548907.git.ps@pks.im (mailing list archive)
Headers show
Series reftable: improve ref iteration performance (pt.2) | expand

Message

Patrick Steinhardt March 4, 2024, 10:48 a.m. UTC
Hi,

this is the third version of my patch series that aims to improve raw
ref iteration performance with the reftable backend. Changes compared to
v2:

    - Reversed the order of the second set of `SWAP()` macro calls.

    - Fixed typos in commit messages.

Thanks!

Patrick

Patrick Steinhardt (13):
  reftable/pq: use `size_t` to track iterator index
  reftable/merged: make `merged_iter` structure private
  reftable/merged: advance subiter on subsequent iteration
  reftable/merged: make subiters own their records
  reftable/merged: remove unnecessary null check for subiters
  reftable/merged: handle subiter cleanup on close only
  reftable/merged: circumvent pqueue with single subiter
  reftable/merged: avoid duplicate pqueue emptiness check
  reftable/record: reuse refname when decoding
  reftable/record: reuse refname when copying
  reftable/record: decode keys in place
  reftable: allow inlining of a few functions
  refs/reftable: precompute prefix length

 refs/reftable-backend.c    |   6 +-
 reftable/block.c           |  25 +++----
 reftable/block.h           |   2 -
 reftable/iter.c            |   5 --
 reftable/iter.h            |   4 --
 reftable/merged.c          | 139 +++++++++++++++++++------------------
 reftable/merged.h          |  11 +--
 reftable/pq.c              |  18 +----
 reftable/pq.h              |  16 +++--
 reftable/pq_test.c         |  41 +++++------
 reftable/record.c          |  64 +++++++++--------
 reftable/record.h          |  21 ++++--
 reftable/record_test.c     |   3 +-
 reftable/reftable-record.h |   1 +
 14 files changed, 175 insertions(+), 181 deletions(-)

Range-diff against v2:
 1:  292e5f8888 =  1:  c998039333 reftable/pq: use `size_t` to track iterator index
 2:  95e1ccafc4 =  2:  cb144e28a1 reftable/merged: make `merged_iter` structure private
 3:  0e327e5fe3 =  3:  1bf09661e5 reftable/merged: advance subiter on subsequent iteration
 4:  494d74deff =  4:  9aa1733aef reftable/merged: make subiters own their records
 5:  0adf34d08b =  5:  b413006159 reftable/merged: remove unnecessary null check for subiters
 6:  01152ce130 =  6:  0ab1be740e reftable/merged: handle subiter cleanup on close only
 7:  370b6cfc6c =  7:  2199881d47 reftable/merged: circumvent pqueue with single subiter
 8:  1e279f21e6 !  8:  04435f515c reftable/merged: avoid duplicate pqueue emptiness check
    @@ Commit message
         down the stack in `merged_iter_next_entry()` though, which makes this
         check redundant.
     
    -    Now if this check was there to accellerate the common case it might have
    +    Now if this check was there to accelerate the common case it might have
         made sense to keep it. But the iterator being exhausted is rather the
         uncommon case because you can expect most reftable stacks to contain
         more than two refs.
 9:  15a8cbf678 !  9:  92f83dd404 reftable/record: reuse refname when decoding
    @@ Commit message
         to the required number of bytes via `REFTABLE_ALLOC_GROW()`.
     
         This refactoring is safe to do because all functions that assigning to
    -    the refname will first call `release_reftable_record()`, which will zero
    -    out the complete record after releasing memory.
    +    the refname will first call `reftable_ref_record_release()`, which will
    +    zero out the complete record after releasing memory.
     
         This change results in a nice speedup when iterating over 1 million
         refs:
    @@ reftable/record.c: static int reftable_ref_record_decode(void *rec, struct strbu
     +	SWAP(refname, r->refname);
     +	SWAP(refname_cap, r->refname_cap);
      	reftable_ref_record_release(r);
    -+	SWAP(refname, r->refname);
    -+	SWAP(refname_cap, r->refname_cap);
    ++	SWAP(r->refname, refname);
    ++	SWAP(r->refname_cap, refname_cap);
      
     -	assert(hash_size > 0);
     -
10:  35b1af2f06 ! 10:  eb600f3bf3 reftable/record: reuse refname when copying
    @@ reftable/record.c: static void reftable_ref_record_copy_from(void *rec, const vo
     +	SWAP(refname, ref->refname);
     +	SWAP(refname_cap, ref->refname_cap);
      	reftable_ref_record_release(ref);
    -+	SWAP(refname, ref->refname);
    -+	SWAP(refname_cap, ref->refname_cap);
    ++	SWAP(ref->refname, refname);
    ++	SWAP(ref->refname_cap, refname_cap);
     +
      	if (src->refname) {
     -		ref->refname = xstrdup(src->refname);
11:  d7151ef361 = 11:  f7915f1df8 reftable/record: decode keys in place
12:  99b238a40d = 12:  527c15e5da reftable: allow inlining of a few functions
13:  627bd1f5f7 ! 13:  de4a1e2239 refs/reftable: precompute prefix length
    @@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_ite
      		}
     @@ refs/reftable-backend.c: static struct reftable_ref_iterator *ref_iterator_for_stack(struct reftable_ref_
      	iter = xcalloc(1, sizeof(*iter));
    - 	base_ref_iterator_init(&iter->base, &reftable_ref_iterator_vtable, 1);
    + 	base_ref_iterator_init(&iter->base, &reftable_ref_iterator_vtable);
      	iter->prefix = prefix;
     +	iter->prefix_len = prefix ? strlen(prefix) : 0;
      	iter->base.oid = &iter->oid;

base-commit: b387623c12f3f4a376e4d35a610fd3e55d7ea907