diff mbox series

[2/2] ref-filter: support filtering of operational refs

Message ID 20231221170715.110565-3-karthik.188@gmail.com (mailing list archive)
State New, archived
Headers show
Series Initial changes to support printing all refs | expand

Commit Message

Karthik Nayak Dec. 21, 2023, 5:07 p.m. UTC
With the upcoming introduction of the reftable backend, it becomes ever
so important to provide the necessary tooling for printing all refs
associated with a repository.

While regular refs stored within the "refs/" namespace are currently
supported by multiple commands like git-for-each-ref(1),
git-show-ref(1). Neither support printing all the operational refs
within the repository.

This is crucial because with the reftable backend, all these refs will
also move to reftable. It would be necessary to identify all the refs
that are stored within the reftable since there is no easy way to do so
otherwise. This is because the reftable itself is a binary format and
all access will be via git. Unlike the filesystem backend, which allows
access directly via the filesystem.

This commit adds the necessary code to iterate over operational refs
present in a repository and provides a new filter flag
'FILTER_REFS_OPERATIONAL' to iterate over operational refs.

This flag can now be used to extend existing git commands to print
operational refs.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 ref-filter.c | 12 ++++++++++++
 ref-filter.h |  4 +++-
 refs.c       | 14 ++++++++++++++
 refs.h       |  3 +++
 4 files changed, 32 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Dec. 21, 2023, 8:40 p.m. UTC | #1
Karthik Nayak <karthik.188@gmail.com> writes:

> With the upcoming introduction of the reftable backend, it becomes ever
> so important to provide the necessary tooling for printing all refs
> associated with a repository.

We have pseudoref (those all caps files outside the refs/ hierarchy)
as an official term defined in the glossary, and Patrick's reftable
work based on Han-Wen's work revealed the need to treat FETCH_HEAD
and MERGE_HEAD as "even more pecurilar than pseudorefs" that need
different term (tentatively called "special refs").  Please avoid
coming up with yet another random name "operational" without
discussing.

With a quick look at the table in this patch, "pseudorefs" appears
to be the closest word that people are already familiar with, I
think.  A lot more reasonable thing to do may be to scan the
$GIT_DIR for files whose name satisfy refs.c:is_pseudoref_syntax()
and list them, instead of having a hardcoded list of these special
refs.  In addition, when reftable and other backends that can
natively store things outside refs/ hierarchy is in use, they ought
to know what they have so enumerating these would not be an issue
for them without having such a hardcoded table of names.
Karthik Nayak Dec. 22, 2023, 2:05 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:
> We have pseudoref (those all caps files outside the refs/ hierarchy)
> as an official term defined in the glossary, and Patrick's reftable
> work based on Han-Wen's work revealed the need to treat FETCH_HEAD
> and MERGE_HEAD as "even more pecurilar than pseudorefs" that need
> different term (tentatively called "special refs").  Please avoid
> coming up with yet another random name "operational" without
> discussing.
>

I totally agree with the term usage and will switch to "pseudorefs".

> With a quick look at the table in this patch, "pseudorefs" appears
> to be the closest word that people are already familiar with, I
> think.  A lot more reasonable thing to do may be to scan the
> $GIT_DIR for files whose name satisfy refs.c:is_pseudoref_syntax()
> and list them, instead of having a hardcoded list of these special
> refs.  In addition, when reftable and other backends that can
> natively store things outside refs/ hierarchy is in use, they ought
> to know what they have so enumerating these would not be an issue
> for them without having such a hardcoded table of names.

Thanks for that, I did play around with trying to find files which
could be refs in the $GIT_DIR, but the issue is that there will be
false positives. e.g. `COMMIT_EDITMSG` could be confused for a
pseudoref (passes is_pseudoref_syntax()) and it could potentially also
contain a commit-ish value.

While you're here, I wonder if you have any thoughts on the last block
of my first mail.

> Over this, I'm also curious on what the mailing list thinks about
> exposing this to the client side. Would an `--all` option for
> git-for-each-ref(1) be sufficient?
>

Thanks
- Karthik
Junio C Hamano Dec. 26, 2023, 5:01 p.m. UTC | #3
Karthik Nayak <karthik.188@gmail.com> writes:

> Thanks for that, I did play around with trying to find files which
> could be refs in the $GIT_DIR, but the issue is that there will be
> false positives. e.g. `COMMIT_EDITMSG` could be confused for a
> pseudoref (passes is_pseudoref_syntax()) and it could potentially also
> contain a commit-ish value.

It would not begin with 40-hex, though.  If I were doing this,
perhaps I'd say we should first split is_pseudoref_syntax() that is
overly loose into to classes (e.g. "caps with underscores that ends
with HEAD" and everything else), silently reject false positives
among the latter class.  Then we rename those that are misnamed
(there should be only few, like AUTO_MERGE that should be a
pseudoref but named without _HEAD; I do not think of anything that
ends with _HEAD that is not a ref) over time and drop the latter
class.

> While you're here, I wonder if you have any thoughts on the last block
> of my first mail.
>
>> Over this, I'm also curious on what the mailing list thinks about
>> exposing this to the client side. Would an `--all` option for
>> git-for-each-ref(1) be sufficient?

"list pseudorefs in addition to things below refs/"?  Sounds OK to
me as a feature.

However, "--all" does not mean that in the context of "git log"
family of commands.  Over there, it means "not just --branches,
--tags, and --remotes, but everything" which is still limited below
"refs/".

As "git for-each-ref" takes pattern that is prefix match, e.g.,

    $ git for-each-ref refs/remotes/

shows everything like refs/remotes/origin/main that begins with
refs/remotes/, I wonder if

    $ git for-each-ref ""

should mean what you are asking for.  After all, "git for-each-ref"
does *not* take "--branches" and others like "git log" family to
limit its operation to subhierarchy of "refs/" to begin with.
Patrick Steinhardt Dec. 28, 2023, 10:34 a.m. UTC | #4
On Thu, Dec 21, 2023 at 12:40:03PM -0800, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> > With the upcoming introduction of the reftable backend, it becomes ever
> > so important to provide the necessary tooling for printing all refs
> > associated with a repository.
> 
> We have pseudoref (those all caps files outside the refs/ hierarchy)
> as an official term defined in the glossary, and Patrick's reftable
> work based on Han-Wen's work revealed the need to treat FETCH_HEAD
> and MERGE_HEAD as "even more pecurilar than pseudorefs" that need
> different term (tentatively called "special refs").  Please avoid
> coming up with yet another random name "operational" without
> discussing.
> 
> With a quick look at the table in this patch, "pseudorefs" appears
> to be the closest word that people are already familiar with, I
> think.

Agreed, this thought also crossed my mind while reading through the
patches.

> A lot more reasonable thing to do may be to scan the
> $GIT_DIR for files whose name satisfy refs.c:is_pseudoref_syntax()
> and list them, instead of having a hardcoded list of these special
> refs.

Agreed, as well. Despite the reasons mentioned below, the chance for
such a hardcoded list to grow stale is also quite high. And while it
certainly feels very hacky to iterate over the files one by one and
check for each of them whether it could be a pseudo ref, it is the best
we can do to dynamically detect any such reference.

One interesting question is how we should treat files that look like a
pseudoref, but which really aren't. I'm not aware of any such files
written by Git itself, but it could certainly be that a user wrote such
a file into the repository manually. But given that we're adding new
behaviour that will be opt-in (e.g. via a new switch) I'd rather err on
the side of caution and mark any such file as broken instead of silently
ignoring them.

> In addition, when reftable and other backends that can
> natively store things outside refs/ hierarchy is in use, they ought
> to know what they have so enumerating these would not be an issue
> for them without having such a hardcoded table of names.

Yup, for the reftable we don't have the issue of "How do we detect refs
dynamically" at all. So I would love for there to be a way to print all
refs in the refdb, regardless of whether they start with `refs/` or look
like a pseudoref or whatever else. Otherwise it wouldn't be possible for
a user to delete anything stored in the refdb that may have a funny
name, be it intentionally, by accident or due to a bug.

In the reftable backend, the ref iterator's `_advance()` function has a
hardcoded `starts_with(refname, "refs/")` check. If false, then we'd
skip the ref in order to retain the same behaviour that the files
backend has. So maybe what we should be doing is to introduce a new flag
`DO_FOR_EACH_ALL_REFS` and expose it via git-for-each-ref(1) or
git-show-ref(1). So:

  - For the reftable backend we'd skip the `starts_with()` check and
    simply return all refs.

  - For the files backend we'd also iterate through all files in
    $GIT_DIR and check whether they are pseudoref-like.

Patrick
Karthik Nayak Jan. 2, 2024, 3:18 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:
>
> It would not begin with 40-hex, though.  If I were doing this,
> perhaps I'd say we should first split is_pseudoref_syntax() that is
> overly loose into to classes (e.g. "caps with underscores that ends
> with HEAD" and everything else), silently reject false positives
> among the latter class.  Then we rename those that are misnamed
> (there should be only few, like AUTO_MERGE that should be a
> pseudoref but named without _HEAD; I do not think of anything that
> ends with _HEAD that is not a ref) over time and drop the latter
> class.
>

I agree about checking the contents of the files to filter out false
positives around the filenames. Alright, this seems like a good way to
go.

Summarizing, we'll change `is_pseudoref_syntax()` to
1. Check for filename to be UPPERCASE including '-', '_'.
   a. If the filename ends with _HEAD, we return as positive
   b. Else check the file content for SHA1/SHA256 hex

This still leaves out HEAD ref, which is not a pseudo ref (since it can
be a symref at times). I'll figure out something there.

>
>> While you're here, I wonder if you have any thoughts on the last block
>> of my first mail.
>>
>>> Over this, I'm also curious on what the mailing list thinks about
>>> exposing this to the client side. Would an `--all` option for
>>> git-for-each-ref(1) be sufficient?
>
> "list pseudorefs in addition to things below refs/"?  Sounds OK to
> me as a feature.
>
> However, "--all" does not mean that in the context of "git log"
> family of commands.  Over there, it means "not just --branches,
> --tags, and --remotes, but everything" which is still limited below
> "refs/".
>

Good point, I agree, usage of "--all" would clash here.

> As "git for-each-ref" takes pattern that is prefix match, e.g.,
>
>     $ git for-each-ref refs/remotes/
>
> shows everything like refs/remotes/origin/main that begins with
> refs/remotes/, I wonder if
>
>     $ git for-each-ref ""
>
> should mean what you are asking for.  After all, "git for-each-ref"
> does *not* take "--branches" and others like "git log" family to
> limit its operation to subhierarchy of "refs/" to begin with.

But I don't think using an empty pattern is the best way to go forward.
This would break the pattern matching feature. For instance, what if the
user wanted to print all refs, but pattern match "*_HEAD"?

Would that be

      $ git for-each-ref "" "*_HEAD"

I think this would be confusing, since the first pattern is now acting
as an option, since its not really filtering rather its changing the
search space.

Maybe "--all-refs" or "--all-ref-types" instead?
Karthik Nayak Jan. 2, 2024, 3:23 p.m. UTC | #6
Patrick Steinhardt <ps@pks.im> writes:
> One interesting question is how we should treat files that look like a
> pseudoref, but which really aren't. I'm not aware of any such files
> written by Git itself, but it could certainly be that a user wrote such
> a file into the repository manually. But given that we're adding new
> behaviour that will be opt-in (e.g. via a new switch) I'd rather err on
> the side of caution and mark any such file as broken instead of silently
> ignoring them.
>

This is definitely tricky, especially since something like
`COMMIT_EDITMSG` passes the `is_pseudoref_syntax()` check and could
simply contain a commit-ish object ID. Here identifying that this is not
a pseudoref is hard when it satisfies both:
1. The general pseudoref syntax
2. Contains the expected file content type

>
> Yup, for the reftable we don't have the issue of "How do we detect refs
> dynamically" at all. So I would love for there to be a way to print all
> refs in the refdb, regardless of whether they start with `refs/` or look
> like a pseudoref or whatever else. Otherwise it wouldn't be possible for
> a user to delete anything stored in the refdb that may have a funny
> name, be it intentionally, by accident or due to a bug.
>
> In the reftable backend, the ref iterator's `_advance()` function has a
> hardcoded `starts_with(refname, "refs/")` check. If false, then we'd
> skip the ref in order to retain the same behaviour that the files
> backend has. So maybe what we should be doing is to introduce a new flag
> `DO_FOR_EACH_ALL_REFS` and expose it via git-for-each-ref(1) or
> git-show-ref(1). So:
>
>   - For the reftable backend we'd skip the `starts_with()` check and
>     simply return all refs.
>
>   - For the files backend we'd also iterate through all files in
>     $GIT_DIR and check whether they are pseudoref-like.
>

This makes sense to me and is along what I was considering for the
dynamic approach, thanks for writing it down, clarifies my thoughts.
Junio C Hamano Jan. 2, 2024, 4:49 p.m. UTC | #7
Karthik Nayak <karthik.188@gmail.com> writes:

>> As "git for-each-ref" takes pattern that is prefix match, e.g.,
>>
>>     $ git for-each-ref refs/remotes/
>>
>> shows everything like refs/remotes/origin/main that begins with
>> refs/remotes/, I wonder if
>>
>>     $ git for-each-ref ""
>>
>> should mean what you are asking for.  After all, "git for-each-ref"
>> does *not* take "--branches" and others like "git log" family to
>> limit its operation to subhierarchy of "refs/" to begin with.
>
> But I don't think using an empty pattern is the best way to go forward.
> This would break the pattern matching feature. For instance, what if the
> user wanted to print all refs, but pattern match "*_HEAD"?
>
> Would that be
>
>       $ git for-each-ref "" "*_HEAD"

Because you do not omit the leading hierarchy when using globs:

    $ git for-each-ref v2.4\?.\* ;# nothing
    $ git for-each-ref tags/v2.4\?.\* ;# nothing
    $ git for-each-ref refs/tags/v2.4\?.\* ;# gives tags in v2.40 and onwards

I would expect that it would be more like

	$ git for-each-ref "*_HEAD"

And because you can have two or more patterns, e.g.,

    $ git for-each-ref refs/tags/v2.4\?.\* refs/heads/m\*

to get those recent tags and branches whose name begins with 'm', I
would expect your

    $ git for-each-ref "" "*_HEAD"

would probably be equivalent to

    $ git for-each-ref ""
Taylor Blau Jan. 2, 2024, 6:47 p.m. UTC | #8
On Tue, Jan 02, 2024 at 07:18:48AM -0800, Karthik Nayak wrote:
> > As "git for-each-ref" takes pattern that is prefix match, e.g.,
> >
> >     $ git for-each-ref refs/remotes/
> >
> > shows everything like refs/remotes/origin/main that begins with
> > refs/remotes/, I wonder if
> >
> >     $ git for-each-ref ""
> >
> > should mean what you are asking for.  After all, "git for-each-ref"
> > does *not* take "--branches" and others like "git log" family to
> > limit its operation to subhierarchy of "refs/" to begin with.
>
> But I don't think using an empty pattern is the best way to go forward.
> This would break the pattern matching feature. For instance, what if the
> user wanted to print all refs, but pattern match "*_HEAD"?
>
> Would that be
>
>       $ git for-each-ref "" "*_HEAD"
>
> I think this would be confusing, since the first pattern is now acting
> as an option, since its not really filtering rather its changing the
> search space.
>
> Maybe "--all-refs" or "--all-ref-types" instead?

I tend to agree that the special empty pattern would be a good shorthand
for listing all references underneath refs/, including any top-level
psuedo-refs.

But I don't think that I quite follow what Karthik is saying here.
for-each-ref returns the union of references that match the given
pattern(s), not their intersection. So if you wanted to list just the
psudo-refs ending in '_HEAD', you'd do:

  $ git for-each-ref "*_HEAD"

I think if you wanted to list all pseudo-refs, calling the option
`--pseudo-refs` seems reasonable. But if you want to list some subset of
psueod-refs matching a given pattern, you should specify that pattern
directly.

Thanks,
Taylor
Taylor Blau Jan. 2, 2024, 6:49 p.m. UTC | #9
On Thu, Dec 28, 2023 at 11:34:22AM +0100, Patrick Steinhardt wrote:
> One interesting question is how we should treat files that look like a
> pseudoref, but which really aren't. I'm not aware of any such files
> written by Git itself, but it could certainly be that a user wrote such
> a file into the repository manually. But given that we're adding new
> behaviour that will be opt-in (e.g. via a new switch) I'd rather err on
> the side of caution and mark any such file as broken instead of silently
> ignoring them.

I probably wouldn't spend a ton of time worrying about this personally.
Without additional information, I think it's impossible for us to
determine a-priori whether or not a file underneath $GIT_DIR should be
interpreted as a pseudo-ref or not.

I agree with your reasoning that since this is opt-in via a new
command-line flag, that we're probably OK here enumerating the files in
$GIT_DIR, and printing out the ones that look like pseudo-refs.

Thanks,
Taylor
Patrick Steinhardt Jan. 3, 2024, 8:52 a.m. UTC | #10
On Tue, Jan 02, 2024 at 01:47:22PM -0500, Taylor Blau wrote:
> On Tue, Jan 02, 2024 at 07:18:48AM -0800, Karthik Nayak wrote:
> > > As "git for-each-ref" takes pattern that is prefix match, e.g.,
> > >
> > >     $ git for-each-ref refs/remotes/
> > >
> > > shows everything like refs/remotes/origin/main that begins with
> > > refs/remotes/, I wonder if
> > >
> > >     $ git for-each-ref ""
> > >
> > > should mean what you are asking for.  After all, "git for-each-ref"
> > > does *not* take "--branches" and others like "git log" family to
> > > limit its operation to subhierarchy of "refs/" to begin with.
> >
> > But I don't think using an empty pattern is the best way to go forward.
> > This would break the pattern matching feature. For instance, what if the
> > user wanted to print all refs, but pattern match "*_HEAD"?
> >
> > Would that be
> >
> >       $ git for-each-ref "" "*_HEAD"
> >
> > I think this would be confusing, since the first pattern is now acting
> > as an option, since its not really filtering rather its changing the
> > search space.
> >
> > Maybe "--all-refs" or "--all-ref-types" instead?
> 
> I tend to agree that the special empty pattern would be a good shorthand
> for listing all references underneath refs/, including any top-level
> psuedo-refs.
> 
> But I don't think that I quite follow what Karthik is saying here.
> for-each-ref returns the union of references that match the given
> pattern(s), not their intersection. So if you wanted to list just the
> psudo-refs ending in '_HEAD', you'd do:
> 
>   $ git for-each-ref "*_HEAD"
> 
> I think if you wanted to list all pseudo-refs, calling the option
> `--pseudo-refs` seems reasonable. But if you want to list some subset of
> psueod-refs matching a given pattern, you should specify that pattern
> directly.

Where I think this proposal falls short is if you have refs outside of
the "refs/" hierarchy. Granted, this is nothing that should usually
happen nowadays. But I think we should safeguard us for the future:

  - There may be bugs in the reftable backend that allow for such refs
    to be created.

  - We may even eventually end up saying that it's valid for refs to not
    start with "refs/". I consider this to be mostly an artifact of how
    the files backend works, so it is not entirely unreasonable for us
    to eventually lift the restriction for the reftable backend.

I do not want to push for the second bullet point anytime soon, nor do I
have any plans to do so in the future. But regardless of that I would
really love to have a way to ask the ref backend for _any_ reference
that it knows of, regardless of its prefix. Otherwise it becomes next to
impossible for a user to learn about what the reftable binary-format
actually contains. So I think that the current focus on pseudo-refs is
too short-sighted, and would want to aim for a more complete solution to
this problem.

This could be in the form of a `--all-refs` flag that gets translated
into a new `DO_FOR_EACH_REF_ALL_REFS` bit, which would indicate to the
ref backend to also enumerate refs outside of the "refs/" hierarchy.
This is orthogonal to the already existing `--all` pseudo-opt, because
`--all` would only ever enumerate refs inside of the "refs/" hierarchy.

Patrick
Karthik Nayak Jan. 3, 2024, 10:22 a.m. UTC | #11
Patrick Steinhardt <ps@pks.im> writes:

> On Tue, Jan 02, 2024 at 01:47:22PM -0500, Taylor Blau wrote:
>> On Tue, Jan 02, 2024 at 07:18:48AM -0800, Karthik Nayak wrote:
>> > > As "git for-each-ref" takes pattern that is prefix match, e.g.,
>> > >
>> > >     $ git for-each-ref refs/remotes/
>> > >
>> > > shows everything like refs/remotes/origin/main that begins with
>> > > refs/remotes/, I wonder if
>> > >
>> > >     $ git for-each-ref ""
>> > >
>> > > should mean what you are asking for.  After all, "git for-each-ref"
>> > > does *not* take "--branches" and others like "git log" family to
>> > > limit its operation to subhierarchy of "refs/" to begin with.
>> >
>> > But I don't think using an empty pattern is the best way to go forward.
>> > This would break the pattern matching feature. For instance, what if the
>> > user wanted to print all refs, but pattern match "*_HEAD"?
>> >
>> > Would that be
>> >
>> >       $ git for-each-ref "" "*_HEAD"
>> >
>> > I think this would be confusing, since the first pattern is now acting
>> > as an option, since its not really filtering rather its changing the
>> > search space.
>> >
>> > Maybe "--all-refs" or "--all-ref-types" instead?
>>
>> I tend to agree that the special empty pattern would be a good shorthand
>> for listing all references underneath refs/, including any top-level
>> psuedo-refs.
>>
>> But I don't think that I quite follow what Karthik is saying here.
>> for-each-ref returns the union of references that match the given
>> pattern(s), not their intersection. So if you wanted to list just the
>> psudo-refs ending in '_HEAD', you'd do:
>>
>>   $ git for-each-ref "*_HEAD"
>>
>> I think if you wanted to list all pseudo-refs, calling the option
>> `--pseudo-refs` seems reasonable. But if you want to list some subset of
>> psueod-refs matching a given pattern, you should specify that pattern
>> directly.
>
> Where I think this proposal falls short is if you have refs outside of
> the "refs/" hierarchy. Granted, this is nothing that should usually
> happen nowadays. But I think we should safeguard us for the future:
>
>   - There may be bugs in the reftable backend that allow for such refs
>     to be created.
>
>   - We may even eventually end up saying that it's valid for refs to not
>     start with "refs/". I consider this to be mostly an artifact of how
>     the files backend works, so it is not entirely unreasonable for us
>     to eventually lift the restriction for the reftable backend.
>
> I do not want to push for the second bullet point anytime soon, nor do I
> have any plans to do so in the future. But regardless of that I would
> really love to have a way to ask the ref backend for _any_ reference
> that it knows of, regardless of its prefix. Otherwise it becomes next to
> impossible for a user to learn about what the reftable binary-format
> actually contains. So I think that the current focus on pseudo-refs is
> too short-sighted, and would want to aim for a more complete solution to
> this problem.
>
> This could be in the form of a `--all-refs` flag that gets translated
> into a new `DO_FOR_EACH_REF_ALL_REFS` bit, which would indicate to the
> ref backend to also enumerate refs outside of the "refs/" hierarchy.
> This is orthogonal to the already existing `--all` pseudo-opt, because
> `--all` would only ever enumerate refs inside of the "refs/" hierarchy.
>
> Patrick

Thanks Patrick. I think the confusion was because I was referring to add
a new command to print all refs, meaning all refs including the ones
outside the "refs/" folder.

The confusion was that I thought Junio was referring to using

    $ git for-each-ref ""

to print all refs under $GIT_DIR, while he was actually talking about
"$GIT_DIR/refs/" directory.

So to loop back, I'm suggesting to add `--all-refs` to print all the
refs under $GIT_DIR. This would include refs traditionally found under
"refs/" and other refs like pseudo refs, HEAD and perhaps user created
refs under $GIT_DIR.
Junio C Hamano Jan. 3, 2024, 2:38 p.m. UTC | #12
Karthik Nayak <karthik.188@gmail.com> writes:

> The confusion was that I thought Junio was referring to using
>
>     $ git for-each-ref ""
>
> to print all refs under $GIT_DIR, while he was actually talking about
> "$GIT_DIR/refs/" directory.

I do not think you misunderstood me here, though.  

When you have your master branch (refs/heads/master), your v1.0 tag
(refs/tags/v1.0), and the usual pseudorefs, giving "refs" to "git
for-each-ref" would yield refs/heads/master and refs/tags/v1.0 but
not HEAD and others, simply because the pattern "refs" in

    $ git for-each-ref "refs"

works as a hierarchy prefix match.  You give "refs/heads" and you
get only your master branch but not tags or HEAD in such a
repository.  As a natural extension to that behaviour, an empty
string as a hierarchy prefix that matches everything would work
well: you'd get HEAD, refs/heads/master, and refs/tags/v1.0 as an
empty prefix would cover all of the hiearchies these three refs (and
pseudorefs if you had ORIG_HEAD and MERGE_HEAD there) live in.

In any case, it is not a very much interesting to define the syntax
to tell for-each-ref not to limit itself under "refs/".  My point
was that you do not need a special option for that, as shown above.

What is more interesting is what to do with refs that are specific
to other worktrees, e.g.

    $ git rev-parse "worktrees/$name/refs/bisect/bad"

would currently let you peek into (and worse yet, muck with, if you
really wanted to, with something like "git update-ref") refs that
should be only visible in another worktree.  Should for-each-ref and
friends learn a way to iterate over them?  I have no answer to that
question.
Taylor Blau Jan. 3, 2024, 3:45 p.m. UTC | #13
On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > I tend to agree that the special empty pattern would be a good shorthand
> > for listing all references underneath refs/, including any top-level
> > psuedo-refs.
> >
> > But I don't think that I quite follow what Karthik is saying here.
> > for-each-ref returns the union of references that match the given
> > pattern(s), not their intersection. So if you wanted to list just the
> > psudo-refs ending in '_HEAD', you'd do:
> >
> >   $ git for-each-ref "*_HEAD"
> >
> > I think if you wanted to list all pseudo-refs, calling the option
> > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > psueod-refs matching a given pattern, you should specify that pattern
> > directly.
>
> Where I think this proposal falls short is if you have refs outside of
> the "refs/" hierarchy. Granted, this is nothing that should usually
> happen nowadays. But I think we should safeguard us for the future:

Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
imply that we list all references (regardless of whether they appear in
the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
you're trying to accomplish here.

Thanks,
Taylor
Patrick Steinhardt Jan. 3, 2024, 3:50 p.m. UTC | #14
On Wed, Jan 03, 2024 at 06:38:02AM -0800, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> 
> > The confusion was that I thought Junio was referring to using
> >
> >     $ git for-each-ref ""
> >
> > to print all refs under $GIT_DIR, while he was actually talking about
> > "$GIT_DIR/refs/" directory.
> 
> I do not think you misunderstood me here, though.  
> 
> When you have your master branch (refs/heads/master), your v1.0 tag
> (refs/tags/v1.0), and the usual pseudorefs, giving "refs" to "git
> for-each-ref" would yield refs/heads/master and refs/tags/v1.0 but
> not HEAD and others, simply because the pattern "refs" in
> 
>     $ git for-each-ref "refs"
> 
> works as a hierarchy prefix match.  You give "refs/heads" and you
> get only your master branch but not tags or HEAD in such a
> repository.  As a natural extension to that behaviour, an empty
> string as a hierarchy prefix that matches everything would work
> well: you'd get HEAD, refs/heads/master, and refs/tags/v1.0 as an
> empty prefix would cover all of the hiearchies these three refs (and
> pseudorefs if you had ORIG_HEAD and MERGE_HEAD there) live in.
> 
> In any case, it is not a very much interesting to define the syntax
> to tell for-each-ref not to limit itself under "refs/".  My point
> was that you do not need a special option for that, as shown above.

I think you're just stating that "it's possible, but not necessarily a
good idea" (let me know if I'm misinterpreting, I'm not quite sure
whether I read this correctly). Anyway, let me add my 2c here, even
though it may ultimately be completely moot.

The downside of an empty prefix is that you wouldn't be able to filter
refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
A better alternative would be to use "/" as an indicator that you want
to list refs outside of "refs/". That'd allow for more flexible queries:

  - "/" prints all refs and pseudo refs, even those outside of the
    "refs/" hierarchy.

  - "/refs" prints your normal refs.

  - "/something/else" prints refs in "$GIT_DIR/something/else".

I'm not sure whether it's a better idea than using a flag and I'd assume
that the implementation would be more complex in that case because the
respective backends would need to special-case leading slashes.

So in the end I'm still in the camp that a flag is likely a better idea.

> What is more interesting is what to do with refs that are specific
> to other worktrees, e.g.
> 
>     $ git rev-parse "worktrees/$name/refs/bisect/bad"
> 
> would currently let you peek into (and worse yet, muck with, if you
> really wanted to, with something like "git update-ref") refs that
> should be only visible in another worktree.  Should for-each-ref and
> friends learn a way to iterate over them?  I have no answer to that
> question.

That's a good question indeed. I could certainly see an argument that
there should be the possibility to list them to get an allcompassing
view of the repository's refs. It's sure going to get more complex like
that though (which is not a good argument not to do this).

Currently, per-worktree refs are implemented as quasi-separate ref
stores (see `get_worktree_ref_store()`), and the reffiles backend will
indeed use completely separate stacks for each worktree. So ultimately
it makes me think that this is higher-level logic that the ref store
backend wouldn't need to be aware of, but that git-for-each-ref(1) or
related commands would need to handle.

So I'm not quite sure whether we should solve all these related problems
at once. If we were to implement these features via a flag then I could
see us using a value-flag with which you can control what exactly should
be included in the listing. So something like:

  - `--with-refs=repository` includes all refs of the current
    repository.

  - `--with-refs=worktrees` includes refs of all worktrees.

I dunno. I feel like I start to overthink this.

Patrick
Patrick Steinhardt Jan. 3, 2024, 3:52 p.m. UTC | #15
On Wed, Jan 03, 2024 at 10:45:49AM -0500, Taylor Blau wrote:
> On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > > I tend to agree that the special empty pattern would be a good shorthand
> > > for listing all references underneath refs/, including any top-level
> > > psuedo-refs.
> > >
> > > But I don't think that I quite follow what Karthik is saying here.
> > > for-each-ref returns the union of references that match the given
> > > pattern(s), not their intersection. So if you wanted to list just the
> > > psudo-refs ending in '_HEAD', you'd do:
> > >
> > >   $ git for-each-ref "*_HEAD"
> > >
> > > I think if you wanted to list all pseudo-refs, calling the option
> > > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > > psueod-refs matching a given pattern, you should specify that pattern
> > > directly.
> >
> > Where I think this proposal falls short is if you have refs outside of
> > the "refs/" hierarchy. Granted, this is nothing that should usually
> > happen nowadays. But I think we should safeguard us for the future:
> 
> Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
> imply that we list all references (regardless of whether they appear in
> the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
> you're trying to accomplish here.

Ah, okay. I think in that case it's simply a misunderstanding. To me a
pseudo-ref only includes refs that match `is_pseudoref_syntax()`, so
things like "HEAD", "ORIG_HEAD" or "MERGE_HEAD". So with that
understanding, a ref "something/outside/refs" would not be included,
but I'd very much like to see it listed.

Patrick
Junio C Hamano Jan. 3, 2024, 4:02 p.m. UTC | #16
Patrick Steinhardt <ps@pks.im> writes:

> The downside of an empty prefix is that you wouldn't be able to filter
> refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
> A better alternative would be to use "/" as an indicator that you want
> to list refs outside of "refs/". That'd allow for more flexible queries:
>
>   - "/" prints all refs and pseudo refs, even those outside of the
>     "refs/" hierarchy.
>
>   - "/refs" prints your normal refs.
>
>   - "/something/else" prints refs in "$GIT_DIR/something/else".

I do not get this at all, sorry.  What makes your "/" cover "refs/"
but not "something/"?  Unless you have some rule that special cases 
"/" to apply the "hierarchy prefix" matching rule unevenly, that is
not possible.  So you can easily lose the "/" all of your above
patterns share, go back to what I showed, and apply the morally
equivalent special case to an empty prefix and you'd be OK.

In any case, I do not think supporting anything other than
pseudorefs and HEAD outside "refs/" is a good idea to begin with
(the "worktrees/$name/" example), and requiring that all normal
references live inside "refs/" hierarchy is a good idea, so all of
the above is moot, I would say.

Thanks.
Patrick Steinhardt Jan. 3, 2024, 4:17 p.m. UTC | #17
On Wed, Jan 03, 2024 at 08:02:46AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > The downside of an empty prefix is that you wouldn't be able to filter
> > refs outside of the "refs/" hierarchy in case we'd use the empty prefix.
> > A better alternative would be to use "/" as an indicator that you want
> > to list refs outside of "refs/". That'd allow for more flexible queries:
> >
> >   - "/" prints all refs and pseudo refs, even those outside of the
> >     "refs/" hierarchy.
> >
> >   - "/refs" prints your normal refs.
> >
> >   - "/something/else" prints refs in "$GIT_DIR/something/else".
> 
> I do not get this at all, sorry.  What makes your "/" cover "refs/"
> but not "something/"? 

It does cover "something/". But...

> Unless you have some rule that special cases "/" to apply the
> "hierarchy prefix" matching rule unevenly, that is not possible.  So
> you can easily lose the "/" all of your above patterns share, go back
> to what I showed, and apply the morally equivalent special case to an
> empty prefix and you'd be OK.

... I think you're right -- I was arguing under the misassumption that
the typical rev-parse rules kicked in for git-for-each-ref(1) (e.g.
matching "heads/foo" to "refs/heads/foo"). But they don't, so my point
indeed becomes moot and I see what you're getting at now and agree with
you.

> In any case, I do not think supporting anything other than
> pseudorefs and HEAD outside "refs/" is a good idea to begin with
> (the "worktrees/$name/" example), and requiring that all normal
> references live inside "refs/" hierarchy is a good idea, so all of
> the above is moot, I would say.

Yeah, I'm on the same page: anything outside of "refs/" should not be
supported. But the problem is that tools like git-update-ref(1) don't
enforce this, so something like `git update-ref foo/bar HEAD` happily
creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
at arbitrary paths.

With the files backend it's easy to see that this was created and can be
rectified. But with the reftable library you wouldn't be able to learn
about the existence of this ref at all if we ignored anything but
pseudo-refs and refs prefixed with "refs/".

So while I agree that we shouldn't endorse such refs, we need to at
least give an escape hatch in case such refs end up in the refdb anyway.

Patrick
Taylor Blau Jan. 3, 2024, 5 p.m. UTC | #18
On Wed, Jan 03, 2024 at 04:52:36PM +0100, Patrick Steinhardt wrote:
> On Wed, Jan 03, 2024 at 10:45:49AM -0500, Taylor Blau wrote:
> > On Wed, Jan 03, 2024 at 09:52:33AM +0100, Patrick Steinhardt wrote:
> > > > I tend to agree that the special empty pattern would be a good shorthand
> > > > for listing all references underneath refs/, including any top-level
> > > > psuedo-refs.
> > > >
> > > > But I don't think that I quite follow what Karthik is saying here.
> > > > for-each-ref returns the union of references that match the given
> > > > pattern(s), not their intersection. So if you wanted to list just the
> > > > psudo-refs ending in '_HEAD', you'd do:
> > > >
> > > >   $ git for-each-ref "*_HEAD"
> > > >
> > > > I think if you wanted to list all pseudo-refs, calling the option
> > > > `--pseudo-refs` seems reasonable. But if you want to list some subset of
> > > > psueod-refs matching a given pattern, you should specify that pattern
> > > > directly.
> > >
> > > Where I think this proposal falls short is if you have refs outside of
> > > the "refs/" hierarchy. Granted, this is nothing that should usually
> > > happen nowadays. But I think we should safeguard us for the future:
> >
> > Hmm. Maybe I misspoke, but I was thinking that `--pseudo-refs` would
> > imply that we list all references (regardless of whether they appear in
> > the top-level refs/ hierarchy). But perhaps I'm misunderstanding what
> > you're trying to accomplish here.
>
> Ah, okay. I think in that case it's simply a misunderstanding. To me a
> pseudo-ref only includes refs that match `is_pseudoref_syntax()`, so
> things like "HEAD", "ORIG_HEAD" or "MERGE_HEAD". So with that
> understanding, a ref "something/outside/refs" would not be included,
> but I'd very much like to see it listed.

OK, I see: you're trying to add an option that lists all references
(including those outside of the top-level "refs/" hierarchy). But my
proposal to use `--pseudo-refs` was to list *just* those references
outside of the top-level hierarchy.

I wonder if we might want to do something else entirely, which is an
option which controls the top-level "namespace" of references that we
want to see. The behavior would then be to list all references under
"namespace" (which presumably would be "refs/" by default).

If you want to list references like something/outside/refs, your
namespace would then be --namespace="".

I think that this would be a bit more flexible than the current
suggestions, but I am also not as familiar as you are at this particular
problem :-).

Thanks,
Taylor
Junio C Hamano Jan. 3, 2024, 5:21 p.m. UTC | #19
Patrick Steinhardt <ps@pks.im> writes:

> ... But the problem is that tools like git-update-ref(1) don't
> enforce this, so something like `git update-ref foo/bar HEAD` happily
> creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
> at arbitrary paths.

I think we should tighten things up over time.  First by teaching
the ref backend that anything that is not a pseudoref, HEAD or a
proper ref (one item of whose definition is "lives under refs/
hierarchy) should not resolve_ref() successfully.  That should
correctly fail things like

    $ git rev-parse worktrees/$name/bisect/bad
    $ git update-ref foo/bar HEAD

I'd hope.

Thanks.
Patrick Steinhardt Jan. 3, 2024, 5:36 p.m. UTC | #20
On Wed, Jan 03, 2024 at 09:21:13AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > ... But the problem is that tools like git-update-ref(1) don't
> > enforce this, so something like `git update-ref foo/bar HEAD` happily
> > creates "$GIT_DIR/foo/bar". And I bet there are other ways to write refs
> > at arbitrary paths.
> 
> I think we should tighten things up over time.  First by teaching
> the ref backend that anything that is not a pseudoref, HEAD or a
> proper ref (one item of whose definition is "lives under refs/
> hierarchy) should not resolve_ref() successfully.  That should
> correctly fail things like
> 
>     $ git rev-parse worktrees/$name/bisect/bad
>     $ git update-ref foo/bar HEAD
> 
> I'd hope.
> 
> Thanks.

Yeah, agreed, that's something we should do. I do wonder whether this
will break existing usecases, but in any case I'd rather consider it an
accident that it is possible to write (and read) such refs in the first
place.

Patrick
Junio C Hamano Jan. 3, 2024, 5:59 p.m. UTC | #21
Patrick Steinhardt <ps@pks.im> writes:

>> I think we should tighten things up over time.  First by teaching
>> the ref backend that anything that is not a pseudoref, HEAD or a
>> proper ref (one item of whose definition is "lives under refs/
>> hierarchy) should not resolve_ref() successfully.  That should
>> correctly fail things like
>> 
>>     $ git rev-parse worktrees/$name/bisect/bad
>>     $ git update-ref foo/bar HEAD
> ...
> Yeah, agreed, that's something we should do. I do wonder whether this
> will break existing usecases, but in any case I'd rather consider it an
> accident that it is possible to write (and read) such refs in the first
> place.

Unfortunately, the worktrees/$name/refs/bisect/bad and its friends
are documented in "git worktree" and the refs.c layer is aware of
the "main-worktree/" and "worktrees/" hierarchy, so while I still
think it is a good long-term direction to make it impossible to
create random refs like "foo/bar" and "resf/heads/master" via the
commands like "git update-ref", we cannot limit ourselves only to
"refs/" hierarchy.
Patrick Steinhardt Jan. 3, 2024, 6:01 p.m. UTC | #22
On Wed, Jan 03, 2024 at 09:59:13AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> I think we should tighten things up over time.  First by teaching
> >> the ref backend that anything that is not a pseudoref, HEAD or a
> >> proper ref (one item of whose definition is "lives under refs/
> >> hierarchy) should not resolve_ref() successfully.  That should
> >> correctly fail things like
> >> 
> >>     $ git rev-parse worktrees/$name/bisect/bad
> >>     $ git update-ref foo/bar HEAD
> > ...
> > Yeah, agreed, that's something we should do. I do wonder whether this
> > will break existing usecases, but in any case I'd rather consider it an
> > accident that it is possible to write (and read) such refs in the first
> > place.
> 
> Unfortunately, the worktrees/$name/refs/bisect/bad and its friends
> are documented in "git worktree" and the refs.c layer is aware of
> the "main-worktree/" and "worktrees/" hierarchy, so while I still
> think it is a good long-term direction to make it impossible to
> create random refs like "foo/bar" and "resf/heads/master" via the
> commands like "git update-ref", we cannot limit ourselves only to
> "refs/" hierarchy.

Ah, I first wanted to point this out, but then noticed that you didn't
include the "refs/" prefix in "worktrees/$name/bisect/bad" and thought
this was intentional. But yes, per-worktree refs need to stay supported,
weird as they may be.

Patrick
Karthik Nayak Jan. 4, 2024, 11:31 a.m. UTC | #23
Patrick Steinhardt <ps@pks.im> writes:

> On Wed, Jan 03, 2024 at 09:59:13AM -0800, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> >> I think we should tighten things up over time.  First by teaching
>> >> the ref backend that anything that is not a pseudoref, HEAD or a
>> >> proper ref (one item of whose definition is "lives under refs/
>> >> hierarchy) should not resolve_ref() successfully.  That should
>> >> correctly fail things like
>> >>
>> >>     $ git rev-parse worktrees/$name/bisect/bad
>> >>     $ git update-ref foo/bar HEAD
>> > ...
>> > Yeah, agreed, that's something we should do. I do wonder whether this
>> > will break existing usecases, but in any case I'd rather consider it an
>> > accident that it is possible to write (and read) such refs in the first
>> > place.
>>
>> Unfortunately, the worktrees/$name/refs/bisect/bad and its friends
>> are documented in "git worktree" and the refs.c layer is aware of
>> the "main-worktree/" and "worktrees/" hierarchy, so while I still
>> think it is a good long-term direction to make it impossible to
>> create random refs like "foo/bar" and "resf/heads/master" via the
>> commands like "git update-ref", we cannot limit ourselves only to
>> "refs/" hierarchy.
>
> Ah, I first wanted to point this out, but then noticed that you didn't
> include the "refs/" prefix in "worktrees/$name/bisect/bad" and thought
> this was intentional. But yes, per-worktree refs need to stay supported,
> weird as they may be.
>
> Patrick

Thanks all for the discussion, I'll try to summarize the path forward
as per my understanding.

1. We want to introduce a way to output all refs. This includes refs
under "refs/", pseudo refs, HEAD, and any other ref like objects under
$GIT_DIR. The reasoning being that users are allowed currently to create
refs without any directory restrictions. So with the upcoming reftable
backend, it becomes important to provide a utility to print all the refs
held in the reftable. Ideally we want to restrict such ref's from being
created but for the time being, since thats allowed, we should also
provide the utility to print such refs.

2. In the files backend, this would involve iterating through the
$GIT_DIR and finding all ref-like objects and printing them.

3. To expose this to the user, we could do something like

   $ git for-each-ref ""

Which is a natural extension of the current syntax, where the empty
string would imply that we do not filter to the "refs/" directory.
It is still not clear whether we should support "worktrees".

Any corrections here?
Junio C Hamano Jan. 4, 2024, 11:59 p.m. UTC | #24
Karthik Nayak <karthik.188@gmail.com> writes:

> Thanks all for the discussion, I'll try to summarize the path forward
> as per my understanding.

It has already been clear for the past 5 years or so since 3a3b9d8c
(refs: new ref types to make per-worktree refs visible to all
worktrees, 2018-10-21) that we need to treat "worktrees/foo/HEAD",
"worktrees/bar/refs/bisect/bad", etc. as something end-users can
access via the normal ref mechansim (by a resolve_ref() call that is
eventually made from get_sha1() when these are passed to say "git
log"); I just did not remember that one but that does not mean we
can suddenly change the rules.

So you'd probably need to tweak the end of your bullet point #3, but
other than that it is a great summary.

Thanks.

> 1. We want to introduce a way to output all refs. This includes refs
> under "refs/", pseudo refs, HEAD, and any other ref like objects under
> $GIT_DIR. The reasoning being that users are allowed currently to create
> refs without any directory restrictions. So with the upcoming reftable
> backend, it becomes important to provide a utility to print all the refs
> held in the reftable. Ideally we want to restrict such ref's from being
> created but for the time being, since thats allowed, we should also
> provide the utility to print such refs.
>
> 2. In the files backend, this would involve iterating through the
> $GIT_DIR and finding all ref-like objects and printing them.
>
> 3. To expose this to the user, we could do something like
>
>    $ git for-each-ref ""
>
> Which is a natural extension of the current syntax, where the empty
> string would imply that we do not filter to the "refs/" directory.
> It is still not clear whether we should support "worktrees".
>
> Any corrections here?
diff mbox series

Patch

diff --git a/ref-filter.c b/ref-filter.c
index fdaabb5bb4..307a512c27 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2756,6 +2756,10 @@  static int filter_ref_kind(struct ref_filter *filter, const char *refname)
 	    filter->kind == FILTER_REFS_REMOTES ||
 	    filter->kind == FILTER_REFS_TAGS)
 		return filter->kind;
+
+	if (filter->kind & FILTER_REFS_OPERATIONAL)
+		return filter->kind;
+
 	return ref_kind_from_refname(refname);
 }
 
@@ -3032,6 +3036,14 @@  static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref
 	if (!filter->kind)
 		die("filter_refs: invalid type");
 	else {
+		size_t i;
+		if (filter->kind & FILTER_REFS_OPERATIONAL) {
+			for (i = 0; i < operational_refs_max; i++) {
+				refs_single_ref(get_main_ref_store(the_repository),
+								operational_refs[i], fn, cb_data);
+			}
+		}
+
 		/*
 		 * For common cases where we need only branches or remotes or tags,
 		 * we only iterate through those refs. If a mix of refs is needed,
diff --git a/ref-filter.h b/ref-filter.h
index 0ce5af58ab..eec1d29514 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -23,7 +23,9 @@ 
 #define FILTER_REFS_ALL            (FILTER_REFS_TAGS | FILTER_REFS_BRANCHES | \
 				    FILTER_REFS_REMOTES | FILTER_REFS_OTHERS)
 #define FILTER_REFS_DETACHED_HEAD  0x0020
-#define FILTER_REFS_KIND_MASK      (FILTER_REFS_ALL | FILTER_REFS_DETACHED_HEAD)
+#define FILTER_REFS_OPERATIONAL    0x0040
+#define FILTER_REFS_KIND_MASK      (FILTER_REFS_ALL | FILTER_REFS_DETACHED_HEAD | \
+				    FILTER_REFS_OPERATIONAL)
 
 struct atom_value;
 struct ref_sorting;
diff --git a/refs.c b/refs.c
index cebc5458d3..70f6854301 100644
--- a/refs.c
+++ b/refs.c
@@ -82,6 +82,20 @@  static unsigned char refname_disposition[256] = {
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 4, 4
 };
 
+const char *operational_refs[] = {
+	"HEAD",
+	"ORIG_HEAD",
+	"REBASE_HEAD",
+	"REVERT_HEAD",
+	"CHERRY_PICK_HEAD",
+	"BISECT_HEAD",
+	"BISECT_EXPECTED_REV",
+	"NOTES_MERGE_PARTIAL",
+	"NOTES_MERGE_REF",
+};
+
+const int operational_refs_max = ARRAY_SIZE(operational_refs) - 1;
+
 struct ref_namespace_info ref_namespace[] = {
 	[NAMESPACE_HEAD] = {
 		.ref = "HEAD",
diff --git a/refs.h b/refs.h
index e147f13a85..b01938a91a 100644
--- a/refs.h
+++ b/refs.h
@@ -15,6 +15,9 @@  int default_ref_storage_format(void);
 int ref_storage_format_by_name(const char *name);
 const char *ref_storage_format_to_name(int ref_storage_format);
 
+extern const char *operational_refs[];
+extern const int operational_refs_max;
+
 /*
  * Resolve a reference, recursively following symbolic refererences.
  *