diff mbox series

[v2,1/6] refs: properly apply exclude patterns to namespaced refs

Message ID 7497166422ea702aabdf4159b0d7780f1422ba13.1726476401.git.ps@pks.im (mailing list archive)
State New
Headers show
Series refs/reftable: wire up exclude patterns | expand

Commit Message

Patrick Steinhardt Sept. 16, 2024, 8:50 a.m. UTC
Reference namespaces allow commands like git-upload-pack(1) to serve
different sets of references to the client depending on which namespace
is enabled, which is for example useful in fork networks. Namespaced
refs are stored with a `refs/namespaces/$namespace` prefix, but all the
user will ultimately see is a stripped version where that prefix is
removed.

The way that this interacts with "transfer.hideRefs" is not immediately
obvious: the hidden refs can either apply to the stripped references, or
to the non-stripped ones that still have the namespace prefix. In fact,
the "transfer.hideRefs" machinery does the former and applies to the
stripped reference by default, but rules can have "^" prefixed to switch
this behaviour to instead match against the full reference name.

Namespaces are exclusively handled at the generic "refs" layer, the
respective backends have no clue that such a thing even exists. This
also has the consequence that they cannot handle hiding references as
soon as reference namespaces come into play because they neither know
whether a namespace is active, nor do they know how to strip references
if they are active.

Handling such exclude patterns in `refs_for_each_namespaced_ref()` and
`refs_for_each_fullref_in_prefixes()` is broken though, as both support
that the user passes both namespaces and exclude patterns. In the case
where both are set we will exclude references with unstripped names,
even though we really wanted to exclude references based on their
stripped names.

This only surfaces when:

  - A repository uses reference namespaces.

  - "transfer.hideRefs" is active.

  - The namespaced references are packed into the "packed-refs" file.

None of our tests exercise this scenario, and thus we haven't ever hit
it. While t5509 exercises both (1) and (2), it does not happen to hit
(3). It is trivial to demonstrate the bug though by explicitly packing
refs in the tests, and then we indeed surface the breakage.

Fix this bug by prefixing exclude patterns with the namespace in the
generic layer. The newly introduced function will be used outside of
"refs.c" in the next patch, so we add a declaration to "refs.h".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c                           | 35 ++++++++++++++++++++++++++++----
 refs.h                           |  9 ++++++++
 t/t5509-fetch-push-namespaces.sh |  1 +
 3 files changed, 41 insertions(+), 4 deletions(-)

Comments

Taylor Blau Sept. 17, 2024, 9:12 a.m. UTC | #1
On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote:
> This only surfaces when:
>
>   - A repository uses reference namespaces.
>
>   - "transfer.hideRefs" is active.
>
>   - The namespaced references are packed into the "packed-refs" file.
>
> None of our tests exercise this scenario, and thus we haven't ever hit
> it. While t5509 exercises both (1) and (2), it does not happen to hit
> (3). It is trivial to demonstrate the bug though by explicitly packing
> refs in the tests, and then we indeed surface the breakage.
>
> Fix this bug by prefixing exclude patterns with the namespace in the
> generic layer. The newly introduced function will be used outside of
> "refs.c" in the next patch, so we add a declaration to "refs.h".

Thanks for finding and fixing this bug!

> diff --git a/refs.c b/refs.c
> index ceb72d4bd74..b3a367ea12c 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
>  	return hide_refs->v;
>  }
>
> +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> +					     const char *namespace,
> +					     struct strvec *out)
> +{
> +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
> +		return exclude_patterns;
> +
> +	for (size_t i = 0; exclude_patterns[i]; i++)
> +		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
> +
> +	return out->v;
> +}
> +

Is it safe to concatenate each exclude pattern with the specified
namespace? If I'm reading this correctly, I think we silently do the
wrong thing for exclude patterns that start with '^'.

I guess we reject such patterns in the hidden_refs_to_excludes()
function, but perhaps we wouldn't have to if this function stripped
those prefixes for us when the caller does or doesn't specify exclude
patterns with a '^'?

Thanks,
Taylor
Patrick Steinhardt Sept. 17, 2024, 9:33 a.m. UTC | #2
On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote:
> On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote:
> > diff --git a/refs.c b/refs.c
> > index ceb72d4bd74..b3a367ea12c 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
> >  	return hide_refs->v;
> >  }
> >
> > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> > +					     const char *namespace,
> > +					     struct strvec *out)
> > +{
> > +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
> > +		return exclude_patterns;
> > +
> > +	for (size_t i = 0; exclude_patterns[i]; i++)
> > +		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
> > +
> > +	return out->v;
> > +}
> > +
> 
> Is it safe to concatenate each exclude pattern with the specified
> namespace? If I'm reading this correctly, I think we silently do the
> wrong thing for exclude patterns that start with '^'.
> 
> I guess we reject such patterns in the hidden_refs_to_excludes()
> function, but perhaps we wouldn't have to if this function stripped
> those prefixes for us when the caller does or doesn't specify exclude
> patterns with a '^'?

Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes
completely in case there's any pattern starting with '^' or '!'. So the
current assumption should be safe because we don't use excludes in this
scenario at all.

I also agree that we should eventually think about allowing '^'-scoped
references. We can handle them correctly, but it requires a bit more
thought wire that up. So I'd prefer to keep that out of this series, as
it is basically a new feature.

Patrick
Taylor Blau Sept. 17, 2024, 9:38 a.m. UTC | #3
On Tue, Sep 17, 2024 at 11:33:25AM +0200, Patrick Steinhardt wrote:
> On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote:
> > On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote:
> > > diff --git a/refs.c b/refs.c
> > > index ceb72d4bd74..b3a367ea12c 100644
> > > --- a/refs.c
> > > +++ b/refs.c
> > > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
> > >  	return hide_refs->v;
> > >  }
> > >
> > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> > > +					     const char *namespace,
> > > +					     struct strvec *out)
> > > +{
> > > +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
> > > +		return exclude_patterns;
> > > +
> > > +	for (size_t i = 0; exclude_patterns[i]; i++)
> > > +		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
> > > +
> > > +	return out->v;
> > > +}
> > > +
> >
> > Is it safe to concatenate each exclude pattern with the specified
> > namespace? If I'm reading this correctly, I think we silently do the
> > wrong thing for exclude patterns that start with '^'.
> >
> > I guess we reject such patterns in the hidden_refs_to_excludes()
> > function, but perhaps we wouldn't have to if this function stripped
> > those prefixes for us when the caller does or doesn't specify exclude
> > patterns with a '^'?
>
> Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes
> completely in case there's any pattern starting with '^' or '!'. So the
> current assumption should be safe because we don't use excludes in this

Right... but can't exclude_patterns be arbitrary here, as it is a
parameter to the function which is exported via the *.h header file?

IOW, I don't think we can claim at all that we have passed the excluded
patterns through hidden_refs_to_excludes() before calling
get_namespaced_exclude_patterns().

> I also agree that we should eventually think about allowing '^'-scoped
> references. We can handle them correctly, but it requires a bit more
> thought wire that up. So I'd prefer to keep that out of this series, as
> it is basically a new feature.

Yeah, I agree.

Thanks,
Taylor
Patrick Steinhardt Sept. 17, 2024, 9:44 a.m. UTC | #4
On Tue, Sep 17, 2024 at 05:38:51AM -0400, Taylor Blau wrote:
> On Tue, Sep 17, 2024 at 11:33:25AM +0200, Patrick Steinhardt wrote:
> > On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote:
> > > On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote:
> > > > diff --git a/refs.c b/refs.c
> > > > index ceb72d4bd74..b3a367ea12c 100644
> > > > --- a/refs.c
> > > > +++ b/refs.c
> > > > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
> > > >  	return hide_refs->v;
> > > >  }
> > > >
> > > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
> > > > +					     const char *namespace,
> > > > +					     struct strvec *out)
> > > > +{
> > > > +	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
> > > > +		return exclude_patterns;
> > > > +
> > > > +	for (size_t i = 0; exclude_patterns[i]; i++)
> > > > +		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
> > > > +
> > > > +	return out->v;
> > > > +}
> > > > +
> > >
> > > Is it safe to concatenate each exclude pattern with the specified
> > > namespace? If I'm reading this correctly, I think we silently do the
> > > wrong thing for exclude patterns that start with '^'.
> > >
> > > I guess we reject such patterns in the hidden_refs_to_excludes()
> > > function, but perhaps we wouldn't have to if this function stripped
> > > those prefixes for us when the caller does or doesn't specify exclude
> > > patterns with a '^'?
> >
> > Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes
> > completely in case there's any pattern starting with '^' or '!'. So the
> > current assumption should be safe because we don't use excludes in this
> 
> Right... but can't exclude_patterns be arbitrary here, as it is a
> parameter to the function which is exported via the *.h header file?
> 
> IOW, I don't think we can claim at all that we have passed the excluded
> patterns through hidden_refs_to_excludes() before calling
> get_namespaced_exclude_patterns().

I think the important thing to realize is that we're talking about two
different things:

  - We have exclude patterns. These are _patterns_ only and do not
    support '^' or '!'. These patterns are handled by the ref backend.

  - We have hidden refs, which are a mechanism of our transport layer.
    These _can_ support '^' or '!'.

To use hidden refs as exclude patterns they need to get translated to
use the exclude pattern syntax, which does not support every feature
that the hidden ref syntax does.

So every caller that uses exclude patterns must already make sure to
filter things accordingly, and they must make sure to use the correct
syntax for exclude patterns. And if they do, I think that the conversion
to namespaced exclude patterns should be correct.

Or am I missing something?

Patrick
Taylor Blau Sept. 17, 2024, 9:52 a.m. UTC | #5
On Tue, Sep 17, 2024 at 11:44:44AM +0200, Patrick Steinhardt wrote:
> I think the important thing to realize is that we're talking about two
> different things:

Wow, I'm sorry -- I completely misunderstood what was going on in this
diff.

I have to say, that is quite embarrassing having written this feature
only a little over a year ago. Sorry for the confusion, this patch LGTM.

Thanks,
Taylor
Patrick Steinhardt Sept. 17, 2024, 9:55 a.m. UTC | #6
On Tue, Sep 17, 2024 at 05:52:34AM -0400, Taylor Blau wrote:
> On Tue, Sep 17, 2024 at 11:44:44AM +0200, Patrick Steinhardt wrote:
> > I think the important thing to realize is that we're talking about two
> > different things:
> 
> Wow, I'm sorry -- I completely misunderstood what was going on in this
> diff.
> 
> I have to say, that is quite embarrassing having written this feature
> only a little over a year ago. Sorry for the confusion, this patch LGTM.

No worries! If at all it points out that the patch message could've been
clearer.

Patrick
diff mbox series

Patch

diff --git a/refs.c b/refs.c
index ceb72d4bd74..b3a367ea12c 100644
--- a/refs.c
+++ b/refs.c
@@ -1517,6 +1517,19 @@  const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
 	return hide_refs->v;
 }
 
+const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
+					     const char *namespace,
+					     struct strvec *out)
+{
+	if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns)
+		return exclude_patterns;
+
+	for (size_t i = 0; exclude_patterns[i]; i++)
+		strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]);
+
+	return out->v;
+}
+
 const char *find_descendant_ref(const char *dirname,
 				const struct string_list *extras,
 				const struct string_list *skip)
@@ -1634,11 +1647,19 @@  int refs_for_each_namespaced_ref(struct ref_store *refs,
 				 const char **exclude_patterns,
 				 each_ref_fn fn, void *cb_data)
 {
-	struct strbuf buf = STRBUF_INIT;
+	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
+	struct strbuf prefix = STRBUF_INIT;
 	int ret;
-	strbuf_addf(&buf, "%srefs/", get_git_namespace());
-	ret = do_for_each_ref(refs, buf.buf, exclude_patterns, fn, 0, 0, cb_data);
-	strbuf_release(&buf);
+
+	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
+							   get_git_namespace(),
+							   &namespaced_exclude_patterns);
+
+	strbuf_addf(&prefix, "%srefs/", get_git_namespace());
+	ret = do_for_each_ref(refs, prefix.buf, exclude_patterns, fn, 0, 0, cb_data);
+
+	strvec_clear(&namespaced_exclude_patterns);
+	strbuf_release(&prefix);
 	return ret;
 }
 
@@ -1719,6 +1740,7 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 				      const char **exclude_patterns,
 				      each_ref_fn fn, void *cb_data)
 {
+	struct strvec namespaced_exclude_patterns = STRVEC_INIT;
 	struct string_list prefixes = STRING_LIST_INIT_DUP;
 	struct string_list_item *prefix;
 	struct strbuf buf = STRBUF_INIT;
@@ -1730,6 +1752,10 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 		strbuf_addstr(&buf, namespace);
 	namespace_len = buf.len;
 
+	exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns,
+							   namespace,
+							   &namespaced_exclude_patterns);
+
 	for_each_string_list_item(prefix, &prefixes) {
 		strbuf_addstr(&buf, prefix->string);
 		ret = refs_for_each_fullref_in(ref_store, buf.buf,
@@ -1739,6 +1765,7 @@  int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
 		strbuf_setlen(&buf, namespace_len);
 	}
 
+	strvec_clear(&namespaced_exclude_patterns);
 	string_list_clear(&prefixes, 0);
 	strbuf_release(&buf);
 	return ret;
diff --git a/refs.h b/refs.h
index f8b919a1388..3f774e96d18 100644
--- a/refs.h
+++ b/refs.h
@@ -859,6 +859,15 @@  int ref_is_hidden(const char *, const char *, const struct strvec *);
  */
 const char **hidden_refs_to_excludes(const struct strvec *hide_refs);
 
+/*
+ * Prefix all exclude patterns with the namespace, if any. This is required
+ * because exclude patterns apply to the stripped reference name, not the full
+ * reference name with the namespace.
+ */
+const char **get_namespaced_exclude_patterns(const char **exclude_patterns,
+					     const char *namespace,
+					     struct strvec *out);
+
 /* Is this a per-worktree ref living in the refs/ namespace? */
 int is_per_worktree_ref(const char *refname);
 
diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index 05090feaf92..98e8352b6cc 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -96,6 +96,7 @@  test_expect_success 'hide namespaced refs with transfer.hideRefs' '
 '
 
 test_expect_success 'check that transfer.hideRefs does not match unstripped refs' '
+	git -C pushee pack-refs --all &&
 	GIT_NAMESPACE=namespace \
 		git -C pushee -c transfer.hideRefs=refs/namespaces/namespace/refs/tags \
 		ls-remote "ext::git %s ." >actual &&