Message ID | 7497166422ea702aabdf4159b0d7780f1422ba13.1726476401.git.ps@pks.im (mailing list archive) |
---|---|
State | Accepted |
Commit | 155dc8447d3590ea856bb17919bfc85172b52e09 |
Headers | show |
Series | refs/reftable: wire up exclude patterns | expand |
On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote: > This only surfaces when: > > - A repository uses reference namespaces. > > - "transfer.hideRefs" is active. > > - The namespaced references are packed into the "packed-refs" file. > > None of our tests exercise this scenario, and thus we haven't ever hit > it. While t5509 exercises both (1) and (2), it does not happen to hit > (3). It is trivial to demonstrate the bug though by explicitly packing > refs in the tests, and then we indeed surface the breakage. > > Fix this bug by prefixing exclude patterns with the namespace in the > generic layer. The newly introduced function will be used outside of > "refs.c" in the next patch, so we add a declaration to "refs.h". Thanks for finding and fixing this bug! > diff --git a/refs.c b/refs.c > index ceb72d4bd74..b3a367ea12c 100644 > --- a/refs.c > +++ b/refs.c > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs) > return hide_refs->v; > } > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, > + const char *namespace, > + struct strvec *out) > +{ > + if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns) > + return exclude_patterns; > + > + for (size_t i = 0; exclude_patterns[i]; i++) > + strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]); > + > + return out->v; > +} > + Is it safe to concatenate each exclude pattern with the specified namespace? If I'm reading this correctly, I think we silently do the wrong thing for exclude patterns that start with '^'. I guess we reject such patterns in the hidden_refs_to_excludes() function, but perhaps we wouldn't have to if this function stripped those prefixes for us when the caller does or doesn't specify exclude patterns with a '^'? Thanks, Taylor
On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote: > On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote: > > diff --git a/refs.c b/refs.c > > index ceb72d4bd74..b3a367ea12c 100644 > > --- a/refs.c > > +++ b/refs.c > > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs) > > return hide_refs->v; > > } > > > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, > > + const char *namespace, > > + struct strvec *out) > > +{ > > + if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns) > > + return exclude_patterns; > > + > > + for (size_t i = 0; exclude_patterns[i]; i++) > > + strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]); > > + > > + return out->v; > > +} > > + > > Is it safe to concatenate each exclude pattern with the specified > namespace? If I'm reading this correctly, I think we silently do the > wrong thing for exclude patterns that start with '^'. > > I guess we reject such patterns in the hidden_refs_to_excludes() > function, but perhaps we wouldn't have to if this function stripped > those prefixes for us when the caller does or doesn't specify exclude > patterns with a '^'? Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes completely in case there's any pattern starting with '^' or '!'. So the current assumption should be safe because we don't use excludes in this scenario at all. I also agree that we should eventually think about allowing '^'-scoped references. We can handle them correctly, but it requires a bit more thought wire that up. So I'd prefer to keep that out of this series, as it is basically a new feature. Patrick
On Tue, Sep 17, 2024 at 11:33:25AM +0200, Patrick Steinhardt wrote: > On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote: > > On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote: > > > diff --git a/refs.c b/refs.c > > > index ceb72d4bd74..b3a367ea12c 100644 > > > --- a/refs.c > > > +++ b/refs.c > > > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs) > > > return hide_refs->v; > > > } > > > > > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, > > > + const char *namespace, > > > + struct strvec *out) > > > +{ > > > + if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns) > > > + return exclude_patterns; > > > + > > > + for (size_t i = 0; exclude_patterns[i]; i++) > > > + strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]); > > > + > > > + return out->v; > > > +} > > > + > > > > Is it safe to concatenate each exclude pattern with the specified > > namespace? If I'm reading this correctly, I think we silently do the > > wrong thing for exclude patterns that start with '^'. > > > > I guess we reject such patterns in the hidden_refs_to_excludes() > > function, but perhaps we wouldn't have to if this function stripped > > those prefixes for us when the caller does or doesn't specify exclude > > patterns with a '^'? > > Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes > completely in case there's any pattern starting with '^' or '!'. So the > current assumption should be safe because we don't use excludes in this Right... but can't exclude_patterns be arbitrary here, as it is a parameter to the function which is exported via the *.h header file? IOW, I don't think we can claim at all that we have passed the excluded patterns through hidden_refs_to_excludes() before calling get_namespaced_exclude_patterns(). > I also agree that we should eventually think about allowing '^'-scoped > references. We can handle them correctly, but it requires a bit more > thought wire that up. So I'd prefer to keep that out of this series, as > it is basically a new feature. Yeah, I agree. Thanks, Taylor
On Tue, Sep 17, 2024 at 05:38:51AM -0400, Taylor Blau wrote: > On Tue, Sep 17, 2024 at 11:33:25AM +0200, Patrick Steinhardt wrote: > > On Tue, Sep 17, 2024 at 05:12:39AM -0400, Taylor Blau wrote: > > > On Mon, Sep 16, 2024 at 10:50:03AM +0200, Patrick Steinhardt wrote: > > > > diff --git a/refs.c b/refs.c > > > > index ceb72d4bd74..b3a367ea12c 100644 > > > > --- a/refs.c > > > > +++ b/refs.c > > > > @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs) > > > > return hide_refs->v; > > > > } > > > > > > > > +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, > > > > + const char *namespace, > > > > + struct strvec *out) > > > > +{ > > > > + if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns) > > > > + return exclude_patterns; > > > > + > > > > + for (size_t i = 0; exclude_patterns[i]; i++) > > > > + strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]); > > > > + > > > > + return out->v; > > > > +} > > > > + > > > > > > Is it safe to concatenate each exclude pattern with the specified > > > namespace? If I'm reading this correctly, I think we silently do the > > > wrong thing for exclude patterns that start with '^'. > > > > > > I guess we reject such patterns in the hidden_refs_to_excludes() > > > function, but perhaps we wouldn't have to if this function stripped > > > those prefixes for us when the caller does or doesn't specify exclude > > > patterns with a '^'? > > > > Yeah, as you mention, `hidden_refs_to_excludes()` drops excludes > > completely in case there's any pattern starting with '^' or '!'. So the > > current assumption should be safe because we don't use excludes in this > > Right... but can't exclude_patterns be arbitrary here, as it is a > parameter to the function which is exported via the *.h header file? > > IOW, I don't think we can claim at all that we have passed the excluded > patterns through hidden_refs_to_excludes() before calling > get_namespaced_exclude_patterns(). I think the important thing to realize is that we're talking about two different things: - We have exclude patterns. These are _patterns_ only and do not support '^' or '!'. These patterns are handled by the ref backend. - We have hidden refs, which are a mechanism of our transport layer. These _can_ support '^' or '!'. To use hidden refs as exclude patterns they need to get translated to use the exclude pattern syntax, which does not support every feature that the hidden ref syntax does. So every caller that uses exclude patterns must already make sure to filter things accordingly, and they must make sure to use the correct syntax for exclude patterns. And if they do, I think that the conversion to namespaced exclude patterns should be correct. Or am I missing something? Patrick
On Tue, Sep 17, 2024 at 11:44:44AM +0200, Patrick Steinhardt wrote: > I think the important thing to realize is that we're talking about two > different things: Wow, I'm sorry -- I completely misunderstood what was going on in this diff. I have to say, that is quite embarrassing having written this feature only a little over a year ago. Sorry for the confusion, this patch LGTM. Thanks, Taylor
On Tue, Sep 17, 2024 at 05:52:34AM -0400, Taylor Blau wrote: > On Tue, Sep 17, 2024 at 11:44:44AM +0200, Patrick Steinhardt wrote: > > I think the important thing to realize is that we're talking about two > > different things: > > Wow, I'm sorry -- I completely misunderstood what was going on in this > diff. > > I have to say, that is quite embarrassing having written this feature > only a little over a year ago. Sorry for the confusion, this patch LGTM. No worries! If at all it points out that the patch message could've been clearer. Patrick
diff --git a/refs.c b/refs.c index ceb72d4bd74..b3a367ea12c 100644 --- a/refs.c +++ b/refs.c @@ -1517,6 +1517,19 @@ const char **hidden_refs_to_excludes(const struct strvec *hide_refs) return hide_refs->v; } +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, + const char *namespace, + struct strvec *out) +{ + if (!namespace || !*namespace || !exclude_patterns || !*exclude_patterns) + return exclude_patterns; + + for (size_t i = 0; exclude_patterns[i]; i++) + strvec_pushf(out, "%s%s", namespace, exclude_patterns[i]); + + return out->v; +} + const char *find_descendant_ref(const char *dirname, const struct string_list *extras, const struct string_list *skip) @@ -1634,11 +1647,19 @@ int refs_for_each_namespaced_ref(struct ref_store *refs, const char **exclude_patterns, each_ref_fn fn, void *cb_data) { - struct strbuf buf = STRBUF_INIT; + struct strvec namespaced_exclude_patterns = STRVEC_INIT; + struct strbuf prefix = STRBUF_INIT; int ret; - strbuf_addf(&buf, "%srefs/", get_git_namespace()); - ret = do_for_each_ref(refs, buf.buf, exclude_patterns, fn, 0, 0, cb_data); - strbuf_release(&buf); + + exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns, + get_git_namespace(), + &namespaced_exclude_patterns); + + strbuf_addf(&prefix, "%srefs/", get_git_namespace()); + ret = do_for_each_ref(refs, prefix.buf, exclude_patterns, fn, 0, 0, cb_data); + + strvec_clear(&namespaced_exclude_patterns); + strbuf_release(&prefix); return ret; } @@ -1719,6 +1740,7 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store, const char **exclude_patterns, each_ref_fn fn, void *cb_data) { + struct strvec namespaced_exclude_patterns = STRVEC_INIT; struct string_list prefixes = STRING_LIST_INIT_DUP; struct string_list_item *prefix; struct strbuf buf = STRBUF_INIT; @@ -1730,6 +1752,10 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store, strbuf_addstr(&buf, namespace); namespace_len = buf.len; + exclude_patterns = get_namespaced_exclude_patterns(exclude_patterns, + namespace, + &namespaced_exclude_patterns); + for_each_string_list_item(prefix, &prefixes) { strbuf_addstr(&buf, prefix->string); ret = refs_for_each_fullref_in(ref_store, buf.buf, @@ -1739,6 +1765,7 @@ int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store, strbuf_setlen(&buf, namespace_len); } + strvec_clear(&namespaced_exclude_patterns); string_list_clear(&prefixes, 0); strbuf_release(&buf); return ret; diff --git a/refs.h b/refs.h index f8b919a1388..3f774e96d18 100644 --- a/refs.h +++ b/refs.h @@ -859,6 +859,15 @@ int ref_is_hidden(const char *, const char *, const struct strvec *); */ const char **hidden_refs_to_excludes(const struct strvec *hide_refs); +/* + * Prefix all exclude patterns with the namespace, if any. This is required + * because exclude patterns apply to the stripped reference name, not the full + * reference name with the namespace. + */ +const char **get_namespaced_exclude_patterns(const char **exclude_patterns, + const char *namespace, + struct strvec *out); + /* Is this a per-worktree ref living in the refs/ namespace? */ int is_per_worktree_ref(const char *refname); diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh index 05090feaf92..98e8352b6cc 100755 --- a/t/t5509-fetch-push-namespaces.sh +++ b/t/t5509-fetch-push-namespaces.sh @@ -96,6 +96,7 @@ test_expect_success 'hide namespaced refs with transfer.hideRefs' ' ' test_expect_success 'check that transfer.hideRefs does not match unstripped refs' ' + git -C pushee pack-refs --all && GIT_NAMESPACE=namespace \ git -C pushee -c transfer.hideRefs=refs/namespaces/namespace/refs/tags \ ls-remote "ext::git %s ." >actual &&
Reference namespaces allow commands like git-upload-pack(1) to serve different sets of references to the client depending on which namespace is enabled, which is for example useful in fork networks. Namespaced refs are stored with a `refs/namespaces/$namespace` prefix, but all the user will ultimately see is a stripped version where that prefix is removed. The way that this interacts with "transfer.hideRefs" is not immediately obvious: the hidden refs can either apply to the stripped references, or to the non-stripped ones that still have the namespace prefix. In fact, the "transfer.hideRefs" machinery does the former and applies to the stripped reference by default, but rules can have "^" prefixed to switch this behaviour to instead match against the full reference name. Namespaces are exclusively handled at the generic "refs" layer, the respective backends have no clue that such a thing even exists. This also has the consequence that they cannot handle hiding references as soon as reference namespaces come into play because they neither know whether a namespace is active, nor do they know how to strip references if they are active. Handling such exclude patterns in `refs_for_each_namespaced_ref()` and `refs_for_each_fullref_in_prefixes()` is broken though, as both support that the user passes both namespaces and exclude patterns. In the case where both are set we will exclude references with unstripped names, even though we really wanted to exclude references based on their stripped names. This only surfaces when: - A repository uses reference namespaces. - "transfer.hideRefs" is active. - The namespaced references are packed into the "packed-refs" file. None of our tests exercise this scenario, and thus we haven't ever hit it. While t5509 exercises both (1) and (2), it does not happen to hit (3). It is trivial to demonstrate the bug though by explicitly packing refs in the tests, and then we indeed surface the breakage. Fix this bug by prefixing exclude patterns with the namespace in the generic layer. The newly introduced function will be used outside of "refs.c" in the next patch, so we add a declaration to "refs.h". Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs.c | 35 ++++++++++++++++++++++++++++---- refs.h | 9 ++++++++ t/t5509-fetch-push-namespaces.sh | 1 + 3 files changed, 41 insertions(+), 4 deletions(-)