mbox series

[00/15] refs: implement skip lists for packed backend

Message ID cover.1683581621.git.me@ttaylorr.com (mailing list archive)
Headers show
Series refs: implement skip lists for packed backend | expand

Message

Taylor Blau May 8, 2023, 9:59 p.m. UTC
This series implements the concept of a skip list for the packed refs
backend. In situations where the caller has many references they want to
exclude, the packed-refs iterator maintains a list of (start, end)
tuples of regions to jump over (i.e. if `iter->pos == start`, jump
forward to `end`).

This series implements that concept, and uses it to power a couple of
new things:

  - `git for-each-ref --exclude`, which allows callers to specify
    prefixes they wish to discard.

    In a synthetic example, the naive implementation improved runtime
    from ~820ms to enumerate all references and discard unwanted ones,
    down to ~106ms with `--exclude`. By the end of the series, this time
    drops to under ~5ms.

  - `git receive-pack` uses the new skip-list machinery to avoid
    visiting references hidden via a hideRefs rule.

  - `git upload-pack` has an analogous optimization as above (though it
    can't kick in quite as often for reasons described in that patch).

The series is laid out as follows:

  - The first five patches are preparatory (various refs.c and
    ref-filter cleanup).
  - The sixth patch provides a naive implementation of `--exclude`.
  - The next four patches prepare for and implement skip lists in the
    packed-refs backend.
  - The last five patches apply the optimization in `upload-pack` and
    `receive-pack`.

The series is ordered so that any prefix of sections list above could be
queued independently. Thanks in advance for your review.

Jeff King (5):
  refs.c: rename `ref_filter`
  ref-filter.h: provide `REF_FILTER_INIT`
  ref-filter: clear reachable list pointers after freeing
  ref-filter: add ref_filter_clear()
  ref-filter.c: parameterize match functions over patterns

Taylor Blau (10):
  builtin/for-each-ref.c: add `--exclude` option
  refs: plumb `exclude_patterns` argument throughout
  refs/packed-backend.c: refactor `find_reference_location()`
  refs/packed-backend.c: implement skip lists to avoid excluded
    pattern(s)
  refs/packed-backend.c: add trace2 counters for skip list
  revision.h: store hidden refs in a `strvec`
  refs/packed-backend.c: ignore complicated hidden refs rules
  refs.h: let `for_each_namespaced_ref()` take excluded patterns
  upload-pack.c: avoid enumerating hidden refs where possible
  builtin/receive-pack.c: avoid enumerating hidden references

 Documentation/git-for-each-ref.txt |   6 +
 builtin/branch.c                   |   4 +-
 builtin/for-each-ref.c             |   8 +-
 builtin/receive-pack.c             |   7 +-
 builtin/tag.c                      |   4 +-
 http-backend.c                     |   2 +-
 ls-refs.c                          |   8 +-
 ref-filter.c                       |  61 ++++++--
 ref-filter.h                       |  12 ++
 refs.c                             |  61 ++++----
 refs.h                             |  15 +-
 refs/debug.c                       |   5 +-
 refs/files-backend.c               |   5 +-
 refs/packed-backend.c              | 222 ++++++++++++++++++++++++++---
 refs/refs-internal.h               |   7 +-
 revision.c                         |   4 +-
 revision.h                         |   5 +-
 t/helper/test-reach.c              |   2 +-
 t/helper/test-ref-store.c          |  10 ++
 t/t1419-exclude-refs.sh            | 131 +++++++++++++++++
 t/t6300-for-each-ref.sh            |  35 +++++
 trace2.h                           |   2 +
 trace2/tr2_ctr.c                   |   5 +
 upload-pack.c                      |  43 ++++--
 24 files changed, 559 insertions(+), 105 deletions(-)
 create mode 100755 t/t1419-exclude-refs.sh