[v2,6/6] refs/reftable: wire up support for exclude patterns

Message ID 050f4906393d1f9c58a6b6bc695b004695d219be.1726476401.git.ps@pks.im (mailing list archive)
State Accepted
Commit 18695250667912d8278e76dce453706c3d488173
Series refs/reftable: wire up exclude patterns

Commit Message

Patrick Steinhardt Sept. 16, 2024, 8:50 a.m. UTC
Exclude patterns can be used by reference backends to skip over blocks
of references that are uninteresting to the caller. Reference backends
are not required to wire up support for them, and callers are expected
to behave correctly even when a backend ignores the patterns. In fact,
the only backend that supports exclude patterns right now is the
"packed" backend.
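
For reference, the patterns are handed in as a NULL-terminated array of
plain prefixes next to the inclusion prefix. A sketch of the shape
(mirroring the backend signature in the patch below; real callers go
through the generic refs API, and `store` is assumed here):

    /*
     * Sketch only: exclude_patterns is a NULL-terminated array of
     * prefixes the caller is not interested in. A backend is free to
     * ignore it and yield the excluded references anyway.
     */
    const char *exclude_patterns[] = { "refs/hidden/", NULL };
    struct ref_iterator *it =
        reftable_be_iterator_begin(store, "refs/", exclude_patterns, 0);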

Exclude patterns can be quite an important performance optimization in
repositories that have loads of references. The patterns are set up
whenever "transfer.hideRefs" and friends are configured during a fetch,
so handling them efficiently becomes important once there are lots of
hidden refs in a served repository.

Now that we have properly re-seekable reftable iterators we can also
wire up support for these patterns in the "reftable" backend. Doing so
is conceptually simple: once we hit a reference whose prefix matches the
current exclude pattern we re-seek the iterator to the first reference
that doesn't match the pattern anymore. This scheme only works for
trivial patterns that do not have any globbing characters in them, but
the same restriction also applies to the "packed" backend.
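
The gist of the re-seek, as a minimal standalone sketch (`xstrfmt()` and
`reftable_iterator_seek_ref()` are the same helpers the patch below
uses):

    /*
     * Skip the whole block of references that share `pattern` as a
     * prefix. 0xff cannot appear in a valid refname, so seeking to
     * "<pattern>\xff" lands on the first record lexicographically
     * after the excluded block.
     */
    static int skip_excluded_block(struct reftable_iterator *it,
                                   const char *pattern)
    {
        char *after = xstrfmt("%s%c", pattern, 0xff);
        int err = reftable_iterator_seek_ref(it, after);
        free(after);
        return err;
    }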

This makes t1419 work with the "reftable" backend with some slight
modifications. Of course it also speeds up listing of references with
hidden refs. The following benchmark prints one reference with 1 million
hidden references:

    Benchmark 1: HEAD~
      Time (mean ± σ):      93.3 ms ±   2.1 ms    [User: 90.3 ms, System: 2.5 ms]
      Range (min … max):    89.8 ms …  97.2 ms    33 runs

    Benchmark 2: HEAD
      Time (mean ± σ):       4.2 ms ±   0.6 ms    [User: 2.2 ms, System: 1.8 ms]
      Range (min … max):     3.1 ms …   8.1 ms    765 runs

    Summary
      HEAD ran
       22.15 ± 3.19 times faster than HEAD~

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs/reftable-backend.c | 133 +++++++++++++++++++++++++++++++++++++++-
 t/t1419-exclude-refs.sh |  49 ++++++++++++---
 trace2.h                |   1 +
 trace2/tr2_ctr.c        |   5 ++
 4 files changed, 176 insertions(+), 12 deletions(-)

Comments

Taylor Blau Sept. 17, 2024, 9:26 a.m. UTC | #1
On Mon, Sep 16, 2024 at 10:50:16AM +0200, Patrick Steinhardt wrote:
> This makes t1419 work with the "reftable" backend with some slight
> modifications. Of course it also speeds up listing of references with
> hidden refs. The following benchmark prints one reference with 1 million
> hidden references:
>
>     Benchmark 1: HEAD~
>       Time (mean ± σ):      93.3 ms ±   2.1 ms    [User: 90.3 ms, System: 2.5 ms]
>       Range (min … max):    89.8 ms …  97.2 ms    33 runs
>
>     Benchmark 2: HEAD
>       Time (mean ± σ):       4.2 ms ±   0.6 ms    [User: 2.2 ms, System: 1.8 ms]
>       Range (min … max):     3.1 ms …   8.1 ms    765 runs
>
>     Summary
>       HEAD ran
>        22.15 ± 3.19 times faster than HEAD~

Very nice.

> +		/*
> +		 * When the reference name is lexicographically bigger than the
> +		 * current exclude pattern we know that it won't ever match any
> +		 * of the following references, either. We thus advance to the
> +		 * next pattern and re-check whether it matches.
> +		 *
> +		 * Otherwise, if it's smaller, then we do not have a match and
> +		 * thus want to show the current reference.
> +		 */
> +		cmp = strncmp(iter->ref.refname, pattern,
> +			      iter->exclude_patterns_strlen);
> +		if (cmp > 0) {
> +			iter->exclude_patterns_index++;
> +			iter->exclude_patterns_strlen = 0;
> +			continue;
> +		}
> +		if (cmp < 0)
> +			return 0;

Perhaps I am showing my ignorance of the reftable backend, but is it OK
to call strncmp() against all patterns here?

In the packed-refs implementation which I worked on with Peff sometime
last year, we rejected exclude patterns that contain special glob
characters in them for exactly this reason.

The implementation that I'm referring to has a helpful comment that
jogged my memory for what we were thinking at the time (see the
function refs/packed-backend.c::populate_excluded_jump_list()).

Ugh, I just read the next hunk below, so ignore me here ;-).

> +static char **filter_exclude_patterns(const char **exclude_patterns)
> +{
> +	size_t filtered_size = 0, filtered_alloc = 0;
> +	char **filtered = NULL;
> +
> +	if (!exclude_patterns)
> +		return NULL;
> +
> +	for (size_t i = 0; ; i++) {
> +		const char *exclude_pattern = exclude_patterns[i];
> +		int has_glob = 0;
> +
> +		if (!exclude_pattern)
> +			break;
> +
> +		for (const char *p = exclude_pattern; *p; p++) {
> +			has_glob = is_glob_special(*p);
> +			if (has_glob)
> +				break;
> +		}
> +		if (has_glob)
> +			continue;
> +
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = xstrdup(exclude_pattern);
> +	}
> +
> +	if (filtered_size) {
> +		QSORT(filtered, filtered_size, qsort_strcmp);
> +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> +		filtered[filtered_size++] = NULL;
> +	}
> +
> +	return filtered;
> +}

Ohhh, here's where we filter out un-excludeable patterns. Ignore me :-).

One question I had reading this is why we don't filter these out on the
fly in the iterator itself instead of allocating a separate array that
we have to xstrdup() into and free later on.

We may be at the point of diminishing returns here, but I wonder if
allocating this thing is more expensive than a few redundant strcmp()s
and calls to is_glob_special(). I dunno.

Thanks,
Taylor
Patrick Steinhardt Sept. 17, 2024, 9:39 a.m. UTC | #2
On Tue, Sep 17, 2024 at 05:26:48AM -0400, Taylor Blau wrote:
> On Mon, Sep 16, 2024 at 10:50:16AM +0200, Patrick Steinhardt wrote:
> > +		/*
> > +		 * When the reference name is lexicographically bigger than the
> > +		 * current exclude pattern we know that it won't ever match any
> > +		 * of the following references, either. We thus advance to the
> > +		 * next pattern and re-check whether it matches.
> > +		 *
> > +		 * Otherwise, if it's smaller, then we do not have a match and
> > +		 * thus want to show the current reference.
> > +		 */
> > +		cmp = strncmp(iter->ref.refname, pattern,
> > +			      iter->exclude_patterns_strlen);
> > +		if (cmp > 0) {
> > +			iter->exclude_patterns_index++;
> > +			iter->exclude_patterns_strlen = 0;
> > +			continue;
> > +		}
> > +		if (cmp < 0)
> > +			return 0;
> 
> Perhaps I am showing my ignorance of the reftable backend, but is it OK
> to call strncmp() against all patterns here?
> 
> In the packed-refs implementation which I worked on with Peff sometime
> last year, we rejected exclude patterns that contain special glob
> characters in them for exactly this reason.
> 
> The implementation that I'm referring to has a helpful comment that
> jogged my memory for what we were thinking at the time (see the
> function refs/packed-backend.c::populate_excluded_jump_list()).
> 
> Ugh, I just read the next hunk below, so ignore me here ;-).

;)

I was also wondering whether we'd want to amend the generic parts of the
refs interface to filter out globs. But it is entirely feasible that a
backend could handle globbed patterns efficiently, even though none of
the current ones can. So it kind of makes sense to keep things as-is and
let the backends themselves decide which patterns they can use.

> > +static char **filter_exclude_patterns(const char **exclude_patterns)
> > +{
> > +	size_t filtered_size = 0, filtered_alloc = 0;
> > +	char **filtered = NULL;
> > +
> > +	if (!exclude_patterns)
> > +		return NULL;
> > +
> > +	for (size_t i = 0; ; i++) {
> > +		const char *exclude_pattern = exclude_patterns[i];
> > +		int has_glob = 0;
> > +
> > +		if (!exclude_pattern)
> > +			break;
> > +
> > +		for (const char *p = exclude_pattern; *p; p++) {
> > +			has_glob = is_glob_special(*p);
> > +			if (has_glob)
> > +				break;
> > +		}
> > +		if (has_glob)
> > +			continue;
> > +
> > +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> > +		filtered[filtered_size++] = xstrdup(exclude_pattern);
> > +	}
> > +
> > +	if (filtered_size) {
> > +		QSORT(filtered, filtered_size, qsort_strcmp);
> > +		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
> > +		filtered[filtered_size++] = NULL;
> > +	}
> > +
> > +	return filtered;
> > +}
> 
> Ohhh, here's where we filter out un-excludeable patterns. Ignore me :-).
> 
> One question I had reading this is why we don't filter these out on the
> fly in the iterator itself instead of allocating a separate array that
> we have to xstrdup() into and free later on.
> 
> We may be at the point of diminishing returns here, but I wonder if
> allocating this thing is more expensive than a few redundant strcmp()s
> and calls to is_glob_special(). I dunno.

I think duplicating the array is the right thing to do anyway so that we
don't run into weird lifetime issues with the exclude patterns. A caller
may set up a ref iterator and end up using it for longer than they keep
the exclude patterns passed to the iterator alive. By duplicating the
array we might waste a bit of memory, but we avoid such lifetime
problems completely, which I think is a win that makes the
infrastructure less fragile.
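
For illustration, a hypothetical caller (the begin function here is made
up, not from the patch):

    struct strvec hidden = STRVEC_INIT;
    struct ref_iterator *it;

    strvec_push(&hidden, "refs/hidden/");
    it = hypothetical_iterator_begin(store, "", hidden.v, 0);
    strvec_clear(&hidden); /* the caller's patterns are freed here... */

    /*
     * ...yet iterating afterwards stays safe, because the backend
     * holds its own xstrdup()ed copy of the patterns.
     */
    while (ref_iterator_advance(it) == ITER_OK)
        ; /* excluded blocks are still skipped, no use-after-free */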

Thanks for your review!

Patrick
Taylor Blau Sept. 17, 2024, 9:53 a.m. UTC | #3
On Tue, Sep 17, 2024 at 11:39:04AM +0200, Patrick Steinhardt wrote:
> > Ugh, I just read the next hunk below, so ignore me here ;-).
>
> ;)
>
> I was also wondering whether we'd want to amend the generic parts of the
> refs interface to filter out globs. But it is entirely feasible that a
> backend could handle globbed patterns efficiently, even though none of
> the current ones can. So it kind of makes sense to keep things as-is and
> let the backends themselves decide which patterns they can use.

Yep, agreed.

> > One question I had reading this is why we don't filter these out on the
> > fly in the iterator itself instead of allocating a separate array that
> > we have to xstrdup() into and free later on.
> >
> > We may be at the point of diminishing returns here, but I wonder if
> > allocating this thing is more expensive than a few redundant strcmp()s
> > and calls to is_glob_special(). I dunno.
>
> I think duplicating the array is the right thing to do anyway so that we
> don't run into weird lifetime issues with the exclude patterns. A caller
> may set up a ref iterator and end up using it for longer than they keep
> the exclude patterns passed to the iterator alive. By duplicating the
> array we might waste a bit of memory, but we avoid such lifetime
> problems completely, which I think is a win that makes the
> infrastructure less fragile.

OK, I trust your judgement here over my own, since I don't have nearly
the familiarity with reftables that you do.

Thanks,
Taylor

Patch

diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1c4b19e737f..3e63833ce41 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -21,6 +21,7 @@ 
 #include "../reftable/reftable-iterator.h"
 #include "../setup.h"
 #include "../strmap.h"
+#include "../trace2.h"
 #include "parse.h"
 #include "refs-internal.h"
 
@@ -447,10 +448,81 @@  struct reftable_ref_iterator {
 
 	const char *prefix;
 	size_t prefix_len;
+	char **exclude_patterns;
+	size_t exclude_patterns_index;
+	size_t exclude_patterns_strlen;
 	unsigned int flags;
 	int err;
 };
 
+/*
+ * Handle exclude patterns. Returns either `1`, which tells the caller that the
+ * current reference shall not be shown, or `0`, which indicates that it should
+ * be shown.
+ */
+static int should_exclude_current_ref(struct reftable_ref_iterator *iter)
+{
+	while (iter->exclude_patterns[iter->exclude_patterns_index]) {
+		const char *pattern = iter->exclude_patterns[iter->exclude_patterns_index];
+		char *ref_after_pattern;
+		int cmp;
+
+		/*
+		 * Lazily cache the pattern length so that we don't have to
+		 * recompute it every time this function is called.
+		 */
+		if (!iter->exclude_patterns_strlen)
+			iter->exclude_patterns_strlen = strlen(pattern);
+
+		/*
+		 * When the reference name is lexicographically bigger than the
+		 * current exclude pattern we know that it won't ever match any
+		 * of the following references, either. We thus advance to the
+		 * next pattern and re-check whether it matches.
+		 *
+		 * Otherwise, if it's smaller, then we do not have a match and
+		 * thus want to show the current reference.
+		 */
+		cmp = strncmp(iter->ref.refname, pattern,
+			      iter->exclude_patterns_strlen);
+		if (cmp > 0) {
+			iter->exclude_patterns_index++;
+			iter->exclude_patterns_strlen = 0;
+			continue;
+		}
+		if (cmp < 0)
+			return 0;
+
+		/*
+		 * The reference shares a prefix with the exclude pattern and
+		 * shall thus be omitted. We skip all references that match the
+		 * pattern by seeking to the first reference after the block of
+		 * matches.
+		 *
+		 * This is done by appending the highest possible character to
+		 * the pattern. Consequently, all references that have the
+		 * pattern as prefix and whose suffix starts with anything in
+		 * the range [0x00, 0xfe] are skipped. And given that 0xff is a
+		 * non-printable character that shouldn't ever be in a ref name,
+		 * we'd not yield any such record, either.
+		 *
+		 * Note that the seeked-to reference may also be excluded. This
+		 * is not handled here though, but the caller is expected to
+		 * loop and re-verify the next reference for us.
+		 */
+		ref_after_pattern = xstrfmt("%s%c", pattern, 0xff);
+		iter->err = reftable_iterator_seek_ref(&iter->iter, ref_after_pattern);
+		iter->exclude_patterns_index++;
+		iter->exclude_patterns_strlen = 0;
+		trace2_counter_add(TRACE2_COUNTER_ID_REFTABLE_RESEEKS, 1);
+
+		free(ref_after_pattern);
+		return 1;
+	}
+
+	return 0;
+}
+
 static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 {
 	struct reftable_ref_iterator *iter =
@@ -481,6 +553,9 @@  static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
 			break;
 		}
 
+		if (iter->exclude_patterns && should_exclude_current_ref(iter))
+			continue;
+
 		if (iter->flags & DO_FOR_EACH_PER_WORKTREE_ONLY &&
 		    parse_worktree_ref(iter->ref.refname, NULL, NULL, NULL) !=
 			    REF_WORKTREE_CURRENT)
@@ -570,6 +645,11 @@  static int reftable_ref_iterator_abort(struct ref_iterator *ref_iterator)
 		(struct reftable_ref_iterator *)ref_iterator;
 	reftable_ref_record_release(&iter->ref);
 	reftable_iterator_destroy(&iter->iter);
+	if (iter->exclude_patterns) {
+		for (size_t i = 0; iter->exclude_patterns[i]; i++)
+			free(iter->exclude_patterns[i]);
+		free(iter->exclude_patterns);
+	}
 	free(iter);
 	return ITER_DONE;
 }
@@ -580,9 +660,53 @@  static struct ref_iterator_vtable reftable_ref_iterator_vtable = {
 	.abort = reftable_ref_iterator_abort
 };
 
+static int qsort_strcmp(const void *va, const void *vb)
+{
+	const char *a = *(const char **)va;
+	const char *b = *(const char **)vb;
+	return strcmp(a, b);
+}
+
+static char **filter_exclude_patterns(const char **exclude_patterns)
+{
+	size_t filtered_size = 0, filtered_alloc = 0;
+	char **filtered = NULL;
+
+	if (!exclude_patterns)
+		return NULL;
+
+	for (size_t i = 0; ; i++) {
+		const char *exclude_pattern = exclude_patterns[i];
+		int has_glob = 0;
+
+		if (!exclude_pattern)
+			break;
+
+		for (const char *p = exclude_pattern; *p; p++) {
+			has_glob = is_glob_special(*p);
+			if (has_glob)
+				break;
+		}
+		if (has_glob)
+			continue;
+
+		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
+		filtered[filtered_size++] = xstrdup(exclude_pattern);
+	}
+
+	if (filtered_size) {
+		QSORT(filtered, filtered_size, qsort_strcmp);
+		ALLOC_GROW(filtered, filtered_size + 1, filtered_alloc);
+		filtered[filtered_size++] = NULL;
+	}
+
+	return filtered;
+}
+
 static struct reftable_ref_iterator *ref_iterator_for_stack(struct reftable_ref_store *refs,
 							    struct reftable_stack *stack,
 							    const char *prefix,
+							    const char **exclude_patterns,
 							    int flags)
 {
 	struct reftable_ref_iterator *iter;
@@ -595,6 +719,7 @@  static struct reftable_ref_iterator *ref_iterator_for_stack(struct reftable_ref_
 	iter->base.oid = &iter->oid;
 	iter->flags = flags;
 	iter->refs = refs;
+	iter->exclude_patterns = filter_exclude_patterns(exclude_patterns);
 
 	ret = refs->err;
 	if (ret)
@@ -616,7 +741,7 @@  static struct reftable_ref_iterator *ref_iterator_for_stack(struct reftable_ref_
 
 static struct ref_iterator *reftable_be_iterator_begin(struct ref_store *ref_store,
 						       const char *prefix,
-						       const char **exclude_patterns UNUSED,
+						       const char **exclude_patterns,
 						       unsigned int flags)
 {
 	struct reftable_ref_iterator *main_iter, *worktree_iter;
@@ -627,7 +752,8 @@  static struct ref_iterator *reftable_be_iterator_begin(struct ref_store *ref_sto
 		required_flags |= REF_STORE_ODB;
 	refs = reftable_be_downcast(ref_store, required_flags, "ref_iterator_begin");
 
-	main_iter = ref_iterator_for_stack(refs, refs->main_stack, prefix, flags);
+	main_iter = ref_iterator_for_stack(refs, refs->main_stack, prefix,
+					   exclude_patterns, flags);
 
 	/*
 	 * The worktree stack is only set when we're in an actual worktree
@@ -641,7 +767,8 @@  static struct ref_iterator *reftable_be_iterator_begin(struct ref_store *ref_sto
 	 * Otherwise we merge both the common and the per-worktree refs into a
 	 * single iterator.
 	 */
-	worktree_iter = ref_iterator_for_stack(refs, refs->worktree_stack, prefix, flags);
+	worktree_iter = ref_iterator_for_stack(refs, refs->worktree_stack, prefix,
+					       exclude_patterns, flags);
 	return merge_ref_iterator_begin(&worktree_iter->base, &main_iter->base,
 					ref_iterator_select, NULL);
 }
diff --git a/t/t1419-exclude-refs.sh b/t/t1419-exclude-refs.sh
index 13595744190..3256e4462f9 100755
--- a/t/t1419-exclude-refs.sh
+++ b/t/t1419-exclude-refs.sh
@@ -8,12 +8,6 @@  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
-if test_have_prereq !REFFILES
-then
-	skip_all='skipping `git for-each-ref --exclude` tests; need files backend'
-	test_done
-fi
-
 for_each_ref__exclude () {
 	GIT_TRACE2_PERF=1 test-tool ref-store main \
 		for-each-ref--exclude "$@" >actual.raw
@@ -28,7 +22,14 @@  assert_jumps () {
 	local nr="$1"
 	local trace="$2"
 
-	grep -q "name:jumps_made value:$nr$" $trace
+	case "$GIT_DEFAULT_REF_FORMAT" in
+	files)
+		grep -q "name:jumps_made value:$nr$" $trace;;
+	reftable)
+		grep -q "name:reseeks_made value:$nr$" $trace;;
+	*)
+		BUG "unhandled ref format $GIT_DEFAULT_REF_FORMAT";;
+	esac
 }
 
 assert_no_jumps () {
@@ -89,7 +90,14 @@  test_expect_success 'adjacent, non-overlapping excluded regions' '
 	for_each_ref refs/heads/foo refs/heads/quux >expect &&
 
 	test_cmp expect actual &&
-	assert_jumps 1 perf
+	case "$GIT_DEFAULT_REF_FORMAT" in
+	files)
+		assert_jumps 1 perf;;
+	reftable)
+		assert_jumps 2 perf;;
+	*)
+		BUG "unhandled ref format $GIT_DEFAULT_REF_FORMAT";;
+	esac
 '
 
 test_expect_success 'overlapping excluded regions' '
@@ -106,7 +114,30 @@  test_expect_success 'several overlapping excluded regions' '
 	for_each_ref refs/heads/quux >expect &&
 
 	test_cmp expect actual &&
-	assert_jumps 1 perf
+	case "$GIT_DEFAULT_REF_FORMAT" in
+	files)
+		assert_jumps 1 perf;;
+	reftable)
+		assert_jumps 3 perf;;
+	*)
+		BUG "unhandled ref format $GIT_DEFAULT_REF_FORMAT";;
+	esac
+'
+
+test_expect_success 'unordered excludes' '
+	for_each_ref__exclude refs/heads \
+		refs/heads/foo refs/heads/baz >actual 2>perf &&
+	for_each_ref refs/heads/bar refs/heads/quux >expect &&
+
+	test_cmp expect actual &&
+	case "$GIT_DEFAULT_REF_FORMAT" in
+	files)
+		assert_jumps 1 perf;;
+	reftable)
+		assert_jumps 2 perf;;
+	*)
+		BUG "unhandled ref format $GIT_DEFAULT_REF_FORMAT";;
+	esac
 '
 
 test_expect_success 'non-matching excluded section' '
diff --git a/trace2.h b/trace2.h
index 19e04bf040f..901f39253a6 100644
--- a/trace2.h
+++ b/trace2.h
@@ -554,6 +554,7 @@  enum trace2_counter_id {
 	TRACE2_COUNTER_ID_TEST2,     /* emits summary and thread events */
 
 	TRACE2_COUNTER_ID_PACKED_REFS_JUMPS, /* counts number of jumps */
+	TRACE2_COUNTER_ID_REFTABLE_RESEEKS, /* counts number of re-seeks */
 
 	/* counts number of fsyncs */
 	TRACE2_COUNTER_ID_FSYNC_WRITEOUT_ONLY,
diff --git a/trace2/tr2_ctr.c b/trace2/tr2_ctr.c
index d3a33715c14..036b643578b 100644
--- a/trace2/tr2_ctr.c
+++ b/trace2/tr2_ctr.c
@@ -31,6 +31,11 @@  static struct tr2_counter_metadata tr2_counter_metadata[TRACE2_NUMBER_OF_COUNTER
 		.name = "jumps_made",
 		.want_per_thread_events = 0,
 	},
+	[TRACE2_COUNTER_ID_REFTABLE_RESEEKS] = {
+		.category = "reftable",
+		.name = "reseeks_made",
+		.want_per_thread_events = 0,
+	},
 	[TRACE2_COUNTER_ID_FSYNC_WRITEOUT_ONLY] = {
 		.category = "fsync",
 		.name = "writeout-only",