diff mbox series

[1/6] refs: properly apply exclude patterns to namespaced refs

Message ID 8d347bc5599e2a679d50fed073e0f09ffdad85c4.1725881266.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series refs/reftable: wire up exclude patterns | expand

Commit Message

Patrick Steinhardt Sept. 9, 2024, 11:31 a.m. UTC
Reference namespaces allow commands like git-upload-pack(1) to serve
different sets of references to the client depending on which namespace
is enabled, which is for example useful in fork networks. Namespaced
refs are stored with a `refs/namespaces/$namespace` prefix, but all the
user will ultimately see is a stripped version where that prefix is
removed.

The way that this interacts with "transfer.hideRefs" is not immediately
obvious: the hidden refs can either apply to the stripped references, or
to the non-stripped ones that still have the namespace prefix. In fact,
the "transfer.hideRefs" machinery does the former and applies to the
stripped reference by default, but rules can have "^" prefixed to switch
this behaviour to iinstead match against the rull reference name.

Namespaces are exclusively handled at the generic "refs" layer, the
respective backends have no clue that such a thing even exists. This
also has the consequence that they cannot handle hiding references as
soon as reference namespaces come into play because they neither know
whether a namespace is active, nor do they know how to strip references
if they are active.

Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
`refs_for_each_fullref_in_prefixes()` is broken though, as both support
that the user passes both namespaces and exclude patterns. In the case
where both are set we will exclude references with unstripped names,
even though we really wanted to exclude references based on their
stripped names.

This only surfaces when:

  - A repository uses reference namespaces.

  - "transfer.hideRefs" is active.

  - The namespaced references are packed into the "packed-refs" file.

None of our tests exercise this scenario, and thus we haven't ever hit
it. While t5509 exercises both (1) and (2), it does not happen to hit
(3). It is trivial to demonstrate the bug though by explicitly packing
refs in the tests, and then we indeed surface the breakage.

Fix this bug by prefixing exclude patterns with the namespace in the
generic layer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                           | 35 ++++++++++++++++++++++++++++----
 refs.h                           |  9 ++++++++
 t/t5509-fetch-push-namespaces.sh |  1 +
 3 files changed, 41 insertions(+), 4 deletions(-)

Comments

karthik nayak Sept. 13, 2024, 11:35 a.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Reference namespaces allow commands like git-upload-pack(1) to serve
> different sets of references to the client depending on which namespace
> is enabled, which is for example useful in fork networks. Namespaced
> refs are stored with a `refs/namespaces/$namespace` prefix, but all the
> user will ultimately see is a stripped version where that prefix is
> removed.
>
> The way that this interacts with "transfer.hideRefs" is not immediately
> obvious: the hidden refs can either apply to the stripped references, or
> to the non-stripped ones that still have the namespace prefix. In fact,
> the "transfer.hideRefs" machinery does the former and applies to the
> stripped reference by default, but rules can have "^" prefixed to switch
> this behaviour to iinstead match against the rull reference name.

s/iinstead/instead
s/rull/full

> Namespaces are exclusively handled at the generic "refs" layer, the
> respective backends have no clue that such a thing even exists. This
> also has the consequence that they cannot handle hiding references as
> soon as reference namespaces come into play because they neither know
> whether a namespace is active, nor do they know how to strip references
> if they are active.
>
> Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
> `refs_for_each_fullref_in_prefixes()` is broken though, as both support
> that the user passes both namespaces and exclude patterns. In the case
> where both are set we will exclude references with unstripped names,
> even though we really wanted to exclude references based on their
> stripped names.
>
> This only surfaces when:
>
>   - A repository uses reference namespaces.
>
>   - "transfer.hideRefs" is active.
>
>   - The namespaced references are packed into the "packed-refs" file.
>

So this is because we don't even apply exclude patterns to the loose
refs right?

To understand correctly, the transport layer passes on
'transfer.hideRefs' as `exclude_refs` to the generic refs layer. This is
mostly to optimize the reference backend to skip such refs. This is used
by the packed-refs currently but not used for loose refs.

The transfer layer also uses this list in `mark_our_ref()` to skip refs
as needed.

So all in all `exclude_refs` here is mostly for optimization.

> None of our tests exercise this scenario, and thus we haven't ever hit
> it. While t5509 exercises both (1) and (2), it does not happen to hit
> (3). It is trivial to demonstrate the bug though by explicitly packing
> refs in the tests, and then we indeed surface the breakage.

Nit: I know you're referring to the three points stated above, it would
be nice if they were numbered.

> Fix this bug by prefixing exclude patterns with the namespace in the
> generic layer.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c                           | 35 ++++++++++++++++++++++++++++----
>  refs.h                           |  9 ++++++++
>  t/t5509-fetch-push-namespaces.sh |  1 +
>  3 files changed, 41 insertions(+), 4 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index ceb72d4bd74..b3a367ea12c 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
>  	return hide_refs->v;
>  }
>
> +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> +					     const char *namespace,
> +					     struct strvec *out)
> +{
> +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)

What scenario would `!*namespace` be possible?

> +		return exclude_patterns;
> +
> +	for (size_t i = 0; exclude_patterns[i]; i++)
> +		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
> +
> +	return out->v;
> +}
> +
>  const char *find_descendant_ref(const char *dirname,
>  				const struct string_list *extras,
>  				const struct string_list *skip)
> @@ -1634,11 +1647,19 @@ int refs_for_each_namespaced_ref(struct ref_store *refs,
>  				 const char **exclude_patterns,
>  				 each_ref_fn fn, void *cb_data)
>  {
> -	struct strbuf buf = STRBUF_INIT;
> +	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
> +	struct strbuf prefix = STRBUF_INIT;
>  	int ret;
> -	strbuf_addf(&buf, "%srefs/", get_git_namespace());
> -	ret = do_for_each_ref(refs, buf.buf, exclude_patterns, fn, 0, 0, cb_data);
> -	strbuf_release(&buf);
> +
> +	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
> +							   get_git_namespace(),
> +							   &namespaced_exclude_patterns);
> +
> +	strbuf_addf(&prefix, "%srefs/", get_git_namespace());
> +	ret = do_for_each_ref(refs, prefix.buf, exclude_patterns, fn, 0, 0, cb_data);
> +
> +	strvec_clear(&namespaced_exclude_patterns);
> +	strbuf_release(&prefix);
>  	return ret;
>  }
>
> @@ -1719,6 +1740,7 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
>  				      const char **exclude_patterns,
>  				      each_ref_fn fn, void *cb_data)
>  {
> +	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
>  	struct string_list prefixes = STRING_LIST_INIT_DUP;
>  	struct string_list_item *prefix;
>  	struct strbuf buf = STRBUF_INIT;
> @@ -1730,6 +1752,10 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
>  		strbuf_addstr(&buf, namespace);
>  	namespace_len = buf.len;
>
> +	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
> +							   namespace,
> +							   &namespaced_exclude_patterns);
> +
>  	for_each_string_list_item(prefix, &prefixes) {
>  		strbuf_addstr(&buf, prefix->string);
>  		ret = refs_for_each_fullref_in(ref_store, buf.buf,
> @@ -1739,6 +1765,7 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
>  		strbuf_setlen(&buf, namespace_len);
>  	}
>
> +	strvec_clear(&namespaced_exclude_patterns);
>  	string_list_clear(&prefixes, 0);
>  	strbuf_release(&buf);
>  	return ret;
> diff --git a/refs.h b/refs.h
> index f8b919a1388..3f774e96d18 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -859,6 +859,15 @@ int ref_is_hidden(const char *, const char *, const struct strvec *);
>   */
>  const char **hidden_refs_to_excludes(const struct strvec *hide_refs);
>
> +/*
> + * Prefix all exclude patterns with the namespace, if any. This is required
> + * because exclude patterns apply to the stripped reference name, not the full
> + * reference name with the namespace.
> + */
> +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> +					     const char *namespace,
> +					     struct strvec *out);
> +

Do we need to expose this? Can't it be made static?
Patrick Steinhardt Sept. 16, 2024, 6:56 a.m. UTC | #2
On Fri, Sep 13, 2024 at 04:35:35AM -0700, karthik nayak wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
> > `refs_for_each_fullref_in_prefixes()` is broken though, as both support
> > that the user passes both namespaces and exclude patterns. In the case
> > where both are set we will exclude references with unstripped names,
> > even though we really wanted to exclude references based on their
> > stripped names.
> >
> > This only surfaces when:
> >
> >   - A repository uses reference namespaces.
> >
> >   - "transfer.hideRefs" is active.
> >
> >   - The namespaced references are packed into the "packed-refs" file.
> >
> 
> So this is because we don't even apply exclude patterns to the loose
> refs right?
> 
> To understand correctly, the transport layer passes on
> 'transfer.hideRefs' as `exclude_refs` to the generic refs layer. This is
> mostly to optimize the reference backend to skip such refs. This is used
> by the packed-refs currently but not used for loose refs.
> 
> The transfer layer also uses this list in `mark_our_ref()` to skip refs
> as needed.
> 
> So all in all `exclude_refs` here is mostly for optimization.

Yup.

> > diff --git a/refs.c b/refs.c
> > index ceb72d4bd74..b3a367ea12c 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
> >  	return hide_refs->v;
> >  }
> >
> > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> > +					     const char *namespace,
> > +					     struct strvec *out)
> > +{
> > +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
> 
> What scenario would `!*namespace` be possible?

It's the default value of `get_git_namespace()`.

> > diff --git a/refs.h b/refs.h
> > index f8b919a1388..3f774e96d18 100644
> > --- a/refs.h
> > +++ b/refs.h
> > @@ -859,6 +859,15 @@ int ref_is_hidden(const char *, const char *, const struct strvec *);
> >   */
> >  const char **hidden_refs_to_excludes(const struct strvec *hide_refs);
> >
> > +/*
> > + * Prefix all exclude patterns with the namespace, if any. This is required
> > + * because exclude patterns apply to the stripped reference name, not the full
> > + * reference name with the namespace.
> > + */
> > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> > +					     const char *namespace,
> > +					     struct strvec *out);
> > +
> 
> Do we need to expose this? Can't it be made static?

It will be used by the next patch. I'll amen the commit message to point
this out.

Patrick
diff mbox series

Patch

diff --git a/refs.c b/refs.c
index ceb72d4bd74..b3a367ea12c 100644
--- a/refs.c
+++ b/refs.c
@@ -1517,6 +1517,19 @@  const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
 	return hide_refs->v;
 }
 
+const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
+					     const char *namespace,
+					     struct strvec *out)
+{
+	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
+		return exclude_patterns;
+
+	for (size_t i = 0; exclude_patterns[i]; i++)
+		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
+
+	return out->v;
+}
+
 const char *find_descendant_ref(const char *dirname,
 				const struct string_list *extras,
 				const struct string_list *skip)
@@ -1634,11 +1647,19 @@  int refs_for_each_namespaced_ref(struct ref_store *refs,
 				 const char **exclude_patterns,
 				 each_ref_fn fn, void *cb_data)
 {
-	struct strbuf buf = STRBUF_INIT;
+	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
+	struct strbuf prefix = STRBUF_INIT;
 	int ret;
-	strbuf_addf(&buf, "%srefs/", get_git_namespace());
-	ret = do_for_each_ref(refs, buf.buf, exclude_patterns, fn, 0, 0, cb_data);
-	strbuf_release(&buf);
+
+	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
+							   get_git_namespace(),
+							   &namespaced_exclude_patterns);
+
+	strbuf_addf(&prefix, "%srefs/", get_git_namespace());
+	ret = do_for_each_ref(refs, prefix.buf, exclude_patterns, fn, 0, 0, cb_data);
+
+	strvec_clear(&namespaced_exclude_patterns);
+	strbuf_release(&prefix);
 	return ret;
 }
 
@@ -1719,6 +1740,7 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 				      const char **exclude_patterns,
 				      each_ref_fn fn, void *cb_data)
 {
+	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
 	struct string_list prefixes = STRING_LIST_INIT_DUP;
 	struct string_list_item *prefix;
 	struct strbuf buf = STRBUF_INIT;
@@ -1730,6 +1752,10 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 		strbuf_addstr(&buf, namespace);
 	namespace_len = buf.len;
 
+	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
+							   namespace,
+							   &namespaced_exclude_patterns);
+
 	for_each_string_list_item(prefix, &prefixes) {
 		strbuf_addstr(&buf, prefix->string);
 		ret = refs_for_each_fullref_in(ref_store, buf.buf,
@@ -1739,6 +1765,7 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 		strbuf_setlen(&buf, namespace_len);
 	}
 
+	strvec_clear(&namespaced_exclude_patterns);
 	string_list_clear(&prefixes, 0);
 	strbuf_release(&buf);
 	return ret;
diff --git a/refs.h b/refs.h
index f8b919a1388..3f774e96d18 100644
--- a/refs.h
+++ b/refs.h
@@ -859,6 +859,15 @@  int ref_is_hidden(const char *, const char *, const struct strvec *);
  */
 const char **hidden_refs_to_excludes(const struct strvec *hide_refs);
 
+/*
+ * Prefix all exclude patterns with the namespace, if any. This is required
+ * because exclude patterns apply to the stripped reference name, not the full
+ * reference name with the namespace.
+ */
+const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
+					     const char *namespace,
+					     struct strvec *out);
+
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
 
diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index 05090feaf92..98e8352b6cc 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -96,6 +96,7 @@  test_expect_success 'hide namespaced refs with transfer.hideRefs' '
 '
 
 test_expect_success 'check that transfer.hideRefs does not match unstripped refs' '
+	git -C pushee pack-refs --all &&
 	GIT_NAMESPACE=namespace \
 		git -C pushee -c transfer.hideRefs=refs/namespaces/namespace/refs/tags \
 		ls-remote "ext::git %s ." >actual &&