mbox series

[0/3] log: create tighter default decoration filter

Message ID pull.1301.git.1658844250.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series log: create tighter default decoration filter | expand

Message

Linus Arver via GitGitGadget July 26, 2022, 2:04 p.m. UTC
When running 'git log', the default is to report any refs within refs/* as
decorations. This series reduces the default set to be more specific to a
set of known-useful namespaces.

This was previously reduced by adding the log.excludeDecoration config
option and modifying that config in git maintenance's prefetch task (to hide
refs/prefetch/*). I then followed that pattern again for the bundle URI
feature [1], but this caught some reviewers by surprise as an unfortunate
side-effect. This series is a way to roll back the previous decision to use
log.excludeDecoration and instead use tighter filters by default.

[1]
https://lore.kernel.org/git/a217e9a0640b45d21ef971d6e91cee3f1993f383.1656535245.git.gitgitgadget@gmail.com/

As noted in the last patch, the current design ignores the new filters if
there are any previously-specified filters. This includes the
log.excludeDecorations=refs/prefetch/ set by the git maintenance command.
This means that users who ran that command in their repo will not get the
benefits of the more strict filters. While we stop writing
log.excludeDecorations, we don't remove existing instances of it.

I'm interested if anyone has another way around this issue, or if we
consider adding the default filter as long as no --decorate=refs options are
specified.

Thanks, -Stolee

Derrick Stolee (3):
  refs: allow "HEAD" as decoration filter
  log: add default decoration filter
  maintenance: stop writing log.excludeDecoration

 Documentation/git-log.txt |  7 ++++--
 builtin/gc.c              |  6 -----
 builtin/log.c             | 53 +++++++++++++++++++++++++++++----------
 refs.c                    |  4 +--
 t/t4202-log.sh            | 29 +++++++++++++++++++++
 t/t7900-maintenance.sh    | 21 ----------------
 6 files changed, 76 insertions(+), 44 deletions(-)


base-commit: 6a475b71f8c4ce708d69fdc9317aefbde3769e25
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1301%2Fderrickstolee%2Fdecorations-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1301/derrickstolee/decorations-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1301

Comments

Ævar Arnfjörð Bjarmason July 26, 2022, 2:44 p.m. UTC | #1
On Tue, Jul 26 2022, Derrick Stolee via GitGitGadget wrote:

> This was previously reduced by adding the log.excludeDecoration config
> option and modifying that config in git maintenance's prefetch task (to hide
> refs/prefetch/*). I then followed that pattern again for the bundle URI
> feature [1], but this caught some reviewers by surprise as an unfortunate
> side-effect. This series is a way to roll back the previous decision to use
> log.excludeDecoration and instead use tighter filters by default.
>
> As noted in the last patch, the current design ignores the new filters if
> there are any previously-specified filters. This includes the
> log.excludeDecorations=refs/prefetch/ set by the git maintenance command.
> This means that users who ran that command in their repo will not get the
> benefits of the more strict filters. While we stop writing
> log.excludeDecorations, we don't remove existing instances of it.

Leaving aside the question of these magic refs, and if we need new ones
(e.g. refs/bundle/*) I have sometimes made use of out-of-standard
refspace refs.

E.g. when I build git I create refs/built-tags/* tag object refs
(i.e. not in refs/tags/*), which is a neat way to get "git tag -l" and
the like to ignore it.

But to still have it show decorated in logs (e.g. I'll see what my
"private" branch is at), and "for-each-ref --contains" still knows about
it.

Now, that's a rather obscure use-case, and I suspect other "special
refs" are similarly obscure (e.g. GitLab's refs/keep-around/* comes to
mind).

But I think this change is going about it the wrong way, let's have a
list of refs that Git knows about as magical, instead of assuming that
we can ignore everything that's not on a small list of things we're
including.

Wouldn't that give you what you want, and not exclude these sorts of
custom refs unexpectedly for users?

> I'm interested if anyone has another way around this issue, or if we
> consider adding the default filter as long as no --decorate=refs options are
> specified.

I think the resulting UX here is bad, in that we ship hardcoded list of
these if you don't specify the config in 2/3. So I can do:

      -c log.excludeDecoration=this-will-never-match

To "clear" the list, but not this:

      -c log.excludeDecoration=

Which we usually have as a synonym for "false", is it just adding a
prefix of "" that matches everything to my exclusion list (but HEAD
still shows up...).

Speaking of funny refnames isn't your:

	git -c log.excludeDecoration=HEAD log

Going to exclude refs like "HEAD/foo/bar" too? I.e. it's always a
prefix.

I suspect a much better way out of this, which I think fixes the problem
you're raising here is:

	log.useBuiltinDecorationExclusions=[bool]

Or whatever, i.e. we ship a hardcoded list of exclusions.

Leaving aside whether that's a blacklist or whitelist, or whether we
turn that to "true" or "false" by default it wouldn't have the behavior
of "flushing out" the built-in config.

Another way of doing that if you really want to use one variable would
be to have:

      -c log.excludeDecoration=foo

Add to the built-in list, but:

      -c log.excludeDecoration=
      -c log.excludeDecoration=foo

clear it, we have a couple of other list config variables that work that
way.
Derrick Stolee July 26, 2022, 4:38 p.m. UTC | #2
On 7/26/2022 10:44 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jul 26 2022, Derrick Stolee via GitGitGadget wrote:
> 
>> This was previously reduced by adding the log.excludeDecoration config
>> option and modifying that config in git maintenance's prefetch task (to hide
>> refs/prefetch/*). I then followed that pattern again for the bundle URI
>> feature [1], but this caught some reviewers by surprise as an unfortunate
>> side-effect. This series is a way to roll back the previous decision to use
>> log.excludeDecoration and instead use tighter filters by default.
>>
>> As noted in the last patch, the current design ignores the new filters if
>> there are any previously-specified filters. This includes the
>> log.excludeDecorations=refs/prefetch/ set by the git maintenance command.
>> This means that users who ran that command in their repo will not get the
>> benefits of the more strict filters. While we stop writing
>> log.excludeDecorations, we don't remove existing instances of it.
> 
> Leaving aside the question of these magic refs, and if we need new ones
> (e.g. refs/bundle/*) I have sometimes made use of out-of-standard
> refspace refs.
> 
> E.g. when I build git I create refs/built-tags/* tag object refs
> (i.e. not in refs/tags/*), which is a neat way to get "git tag -l" and
> the like to ignore it.
> 
> But to still have it show decorated in logs (e.g. I'll see what my
> "private" branch is at), and "for-each-ref --contains" still knows about
> it.

You also have the unfortunate UX of having the refs spelled out entirely
("refs/<special-place>/..." instead of "<special-place>/..." like how
"refs/remotes/" is dropped from remote refs) and not having special color.
But that's beside the point.

> Now, that's a rather obscure use-case, and I suspect other "special
> refs" are similarly obscure (e.g. GitLab's refs/keep-around/* comes to
> mind).
> 
> But I think this change is going about it the wrong way, let's have a
> list of refs that Git knows about as magical, instead of assuming that
> we can ignore everything that's not on a small list of things we're
> including.
> 
> Wouldn't that give you what you want, and not exclude these sorts of
> custom refs unexpectedly for users?

Instead of keeping track of an ever-growing list of exclusions, instead
making a clear list of "this is what most users will want for their
decorations" is a better approach.

Users who know how to create custom refs outside of this space have the
capability to figure out how to show their special refs. My general ideas
for designing these kinds of features is to have a default that is focused
on the typical user while giving config options for experts to tweak those
defaults.

You're right that this series perhaps leaves something to be desired in
that second part, since there isn't an easy _config-based_ way to enable
all decorations (or a small additional subset).

>> I'm interested if anyone has another way around this issue, or if we
>> consider adding the default filter as long as no --decorate=refs options are
>> specified.
> 
> I think the resulting UX here is bad, in that we ship hardcoded list of
> these if you don't specify the config in 2/3. So I can do:
> 
>       -c log.excludeDecoration=this-will-never-match
> 
> To "clear" the list, but not this:
> 
>       -c log.excludeDecoration=

The thing that I forgot to do, but had considered was adding a
--decorate-all option to allow clearing the default filters from the
command line. You can already do "--decorate-refs=refs" to get everything
(except HEAD).

As far as config goes, we could also create a log.includeDecoration key,
but we'd want to consider it to populate the same part of the filtering
algorithm. Similar to having any instance of log.excludeDecoration, this
would clear the default list. To get all decorations again, you could add
this to your config file:

	[log]
		includeDecoration = refs
		includeDecoration = HEAD

Alternatively, we could instead create a "filter default" option such as
"log.decorateFilter = (core|all)" where "core" is the default set being
considered by this series, and "all" is the empty filter.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason July 26, 2022, 6:19 p.m. UTC | #3
On Tue, Jul 26 2022, Derrick Stolee wrote:

> On 7/26/2022 10:44 AM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Tue, Jul 26 2022, Derrick Stolee via GitGitGadget wrote:
>> 
>>> This was previously reduced by adding the log.excludeDecoration config
>>> option and modifying that config in git maintenance's prefetch task (to hide
>>> refs/prefetch/*). I then followed that pattern again for the bundle URI
>>> feature [1], but this caught some reviewers by surprise as an unfortunate
>>> side-effect. This series is a way to roll back the previous decision to use
>>> log.excludeDecoration and instead use tighter filters by default.
>>>
>>> As noted in the last patch, the current design ignores the new filters if
>>> there are any previously-specified filters. This includes the
>>> log.excludeDecorations=refs/prefetch/ set by the git maintenance command.
>>> This means that users who ran that command in their repo will not get the
>>> benefits of the more strict filters. While we stop writing
>>> log.excludeDecorations, we don't remove existing instances of it.
>> 
>> Leaving aside the question of these magic refs, and if we need new ones
>> (e.g. refs/bundle/*) I have sometimes made use of out-of-standard
>> refspace refs.
>> 
>> E.g. when I build git I create refs/built-tags/* tag object refs
>> (i.e. not in refs/tags/*), which is a neat way to get "git tag -l" and
>> the like to ignore it.
>> 
>> But to still have it show decorated in logs (e.g. I'll see what my
>> "private" branch is at), and "for-each-ref --contains" still knows about
>> it.
>
> You also have the unfortunate UX of having the refs spelled out entirely
> ("refs/<special-place>/..." instead of "<special-place>/..." like how
> "refs/remotes/" is dropped from remote refs) and not having special color.
> But that's beside the point.

Whether that's unfortunate UX or not is debatable, since we expose the
"refs/" prefix explicitly in other parts of the UX, and e.g. "HEAD" or
"ORIG_HEAD" or whatever in in that "/"-relation to
"refs/heads/whatever", and not "heads/whatever".

E.g. try:

	mkdir .git/FOO &&
	git rev-parse origin/seen >.git/FOO/BAR &&
	git push origin FOO/BAR:refs/FOO/BAR

And then:

	mkdir .git/refs/FOO &&
	git rev-parse origin/next >.git/refs/FOO/BAR
	$ git rev-parse FOO/BAR
	warning: object-name.c:970: refname 'FOO/BAR' is ambiguous.
	[seen SHA1]
	$ git rev-parse refs/FOO/BAR
	[next SHA1]

I.e. we really do understand FOO/BAR as meaning .git/FOO/BAR, as opposed
to .git/refs/FOO/BAR. You really can't assume that you can strip a
"refs/" from a multi-level ref and not end up with ambiguities. It's
*not* the case that we just have the .git/<UPPER-CASE-HERE> and
.git/refs/* namespaces, and nothing else.

Are we *almost* there? Yes, I vaguely recall experimenting with trying
to get us 100% of the way (just locally, didn't make it on-list) a long
time ago, and there were some tricky edge cases, and we don't know who
in the wild is relying on it.

I think it would be interesting to explore getting there, but I don't
think eliding information on the display end like that would be the
right way to start.

Although to be fair e.g. "git for-each-ref" excludes these entirely, but
that's a ref-filter.c specific edge-case, per the above try it on
e.g. "git push", you'll find that you can push one or the other.

I think I was poking at this at the time because I wanted to have "git
show head" mean the same as "git show HEAD", or at least for us to start
enforcing the rule that thou shalt not use refnames that are head, HeAd
or whatever mixed-case version we have of our magical all-upper special
refs. This matters because you have commands like "git rev-parse head"
that'll work on case-insensitive FS's that promptly fail on
case-sensitive FS's.

>> Now, that's a rather obscure use-case, and I suspect other "special
>> refs" are similarly obscure (e.g. GitLab's refs/keep-around/* comes to
>> mind).
>> 
>> But I think this change is going about it the wrong way, let's have a
>> list of refs that Git knows about as magical, instead of assuming that
>> we can ignore everything that's not on a small list of things we're
>> including.
>> 
>> Wouldn't that give you what you want, and not exclude these sorts of
>> custom refs unexpectedly for users?
>
> Instead of keeping track of an ever-growing list of exclusions, instead
> making a clear list of "this is what most users will want for their
> decorations" is a better approach.
>
> Users who know how to create custom refs outside of this space have the
> capability to figure out how to show their special refs. My general ideas
> for designing these kinds of features is to have a default that is focused
> on the typical user while giving config options for experts to tweak those
> defaults.
>
> You're right that this series perhaps leaves something to be desired in
> that second part, since there isn't an easy _config-based_ way to enable
> all decorations (or a small additional subset).

Yes, but this is just side-stepping the issue. Your X-Y problem is that
you want to exclude certain refs that we're specifically creating.

I think that's fair enough, but I don't see why we're not specifically
excluding just those then.

It'll just be these bundle refs, filter refs, or whatever other magic we
want excluded. Why would we extend that to refs that we don't know about?

It seems like a safe assumption that if we don't know what it is we
should include it.

Both because that's what we do now (least chance of breaking workflows
in the wild), and because the entire point of the feature is to display
the log in relation to points on the ref namespace. If we don't *know*
that they're meaningless let's leave them out of the exclusion, no?

I'd also think that we'd want to consider this holistically. I haven't
checked, but aside from HEAD (which I think is the only special-case,
but maybe I'm wrong) isn't there a one-to-one mapping between what we'll
show in decorations, and what "git for-each-ref" diplays?

>>> I'm interested if anyone has another way around this issue, or if we
>>> consider adding the default filter as long as no --decorate=refs options are
>>> specified.
>> 
>> I think the resulting UX here is bad, in that we ship hardcoded list of
>> these if you don't specify the config in 2/3. So I can do:
>> 
>>       -c log.excludeDecoration=this-will-never-match
>> 
>> To "clear" the list, but not this:
>> 
>>       -c log.excludeDecoration=
>
> The thing that I forgot to do, but had considered was adding a
> --decorate-all option to allow clearing the default filters from the
> command line. You can already do "--decorate-refs=refs" to get everything
> (except HEAD).

...and some other exclusions, e.g. it won't show ORIG_HEAD even with
--decorate-refs=ORIG_HEAD, but perhaps I'm doing it the wrong way...

> As far as config goes, we could also create a log.includeDecoration key,
> but we'd want to consider it to populate the same part of the filtering
> algorithm. Similar to having any instance of log.excludeDecoration, this
> would clear the default list. To get all decorations again, you could add
> this to your config file:
>
> 	[log]
> 		includeDecoration = refs
> 		includeDecoration = HEAD

If we're considering new config I'd think anything that worked like this
instead would be *much less* confusing:

	some.config = refs/*
	some.config = HEAD

I.e. per the above your HEAD will presumably either match HEAD/BLAH, or
treat "refs" magically as "refs/" and not "refs".

Now, I don't think you can delete .git/HEAD and/or make .git/refs a file
and still have a functioning repository, but the same general rule
applies to "FOO" v.s. "FOO/BAR" and "custom-refs" or whatever.

We use that wildcard syntax unambiguously for the refspec matching, we
could just use the same machinery to match this.

> Alternatively, we could instead create a "filter default" option such as
> "log.decorateFilter = (core|all)" where "core" is the default set being
> considered by this series, and "all" is the empty filter.

Sure, for any key that takes N arguments we could have some way to say
"if you do this it's the same as enumerating these X values", as long as
it's not ambiguous whether you mean a special "aggregate label" or a
name of your own...
Derrick Stolee July 27, 2022, 1:41 p.m. UTC | #4
On 7/26/22 2:19 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Jul 26 2022, Derrick Stolee wrote:
> 
>> On 7/26/2022 10:44 AM, Ævar Arnfjörð Bjarmason wrote:

>>> But I think this change is going about it the wrong way, let's have a
>>> list of refs that Git knows about as magical, instead of assuming that
>>> we can ignore everything that's not on a small list of things we're
>>> including.
>>>
>>> Wouldn't that give you what you want, and not exclude these sorts of
>>> custom refs unexpectedly for users?
>>
>> Instead of keeping track of an ever-growing list of exclusions, instead
>> making a clear list of "this is what most users will want for their
>> decorations" is a better approach.
>>
>> Users who know how to create custom refs outside of this space have the
>> capability to figure out how to show their special refs. My general ideas
>> for designing these kinds of features is to have a default that is focused
>> on the typical user while giving config options for experts to tweak those
>> defaults.
>>
>> You're right that this series perhaps leaves something to be desired in
>> that second part, since there isn't an easy _config-based_ way to enable
>> all decorations (or a small additional subset).
> 
> Yes, but this is just side-stepping the issue. Your X-Y problem is that
> you want to exclude certain refs that we're specifically creating.
> 
> I think that's fair enough, but I don't see why we're not specifically
> excluding just those then.

I'm advocating that we make a one-time change to have a set of "known
useful refs" as showing up in the decorations. Perhaps some users (like
yourself) need to react to that change, but it happens _once_.

Changing the rules repeatedly as new "hidden" namespaces are added is
more likely to cause confusion multiple times.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason July 27, 2022, 8:40 p.m. UTC | #5
On Wed, Jul 27 2022, Derrick Stolee wrote:

> On 7/26/22 2:19 PM, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Tue, Jul 26 2022, Derrick Stolee wrote:
>> 
>>> On 7/26/2022 10:44 AM, Ævar Arnfjörð Bjarmason wrote:
>
>>>> But I think this change is going about it the wrong way, let's have a
>>>> list of refs that Git knows about as magical, instead of assuming that
>>>> we can ignore everything that's not on a small list of things we're
>>>> including.
>>>>
>>>> Wouldn't that give you what you want, and not exclude these sorts of
>>>> custom refs unexpectedly for users?
>>>
>>> Instead of keeping track of an ever-growing list of exclusions, instead
>>> making a clear list of "this is what most users will want for their
>>> decorations" is a better approach.
>>>
>>> Users who know how to create custom refs outside of this space have the
>>> capability to figure out how to show their special refs. My general ideas
>>> for designing these kinds of features is to have a default that is focused
>>> on the typical user while giving config options for experts to tweak those
>>> defaults.
>>>
>>> You're right that this series perhaps leaves something to be desired in
>>> that second part, since there isn't an easy _config-based_ way to enable
>>> all decorations (or a small additional subset).
>> 
>> Yes, but this is just side-stepping the issue. Your X-Y problem is that
>> you want to exclude certain refs that we're specifically creating.
>> 
>> I think that's fair enough, but I don't see why we're not specifically
>> excluding just those then.
>
> I'm advocating that we make a one-time change to have a set of "known
> useful refs"

You can't know the set of known-useful refs. Unless we explicitly lay
claim to something in the namespace it's the user's.

This series seems to have started out by observing that "git log" is
showing certain "useless" refs added in another topic, since they're
owned by git, and "internal-only", which is fair enough.

But I don't see how we can or should make the leap to things that are
not on a limited "not-useless" list git knows about should be the only
thing we should display.

Wouldn't carrying a list of what to exclude achive the same goal (but
see below...).

> as showing up in the decorations. Perhaps some users (like yourself)
> need to react to that change, but it happens _once_.

It's long-established behavior, we try not to change that sort of
thing. In this case I think we *could*, i.e. I don't imagine it would be
a *huge* disruption if we had a good reason, e.g. the use-cases of
machine-parsing this are fairly obscure.

But the disruption seems unwarranted if it's just to hide the likes of
refs/bundle/, when we can just address those specifically.

And it's not "once", it's N times for each downstream user that'll run
into this, whereas not changing the behavior is effort among a small
group of git developers...

> Changing the rules repeatedly as new "hidden" namespaces are added is
> more likely to cause confusion multiple times.

I really don't see the worry about a maintenance burden of having a
thing in refs.c (or whatever) that's just:

	hide_from_decorate[] = {
		"refs/bundle/*"
                [...],
                NULL
	};

Worst case we'll create another "magic" ref, forget to add to the list,
and someone will notice and we'll add it.

Which at that point will be a one-line change, and preferrable to N
users having to chase down some new config which explains why their
local refs aren't shown anymore with decorate...