diff mbox series

pretty: allow to override the built-in formats

Message ID 20200905192406.74411-1-dev+git@drbeat.li (mailing list archive)
State New, archived
Headers show
Series pretty: allow to override the built-in formats | expand

Commit Message

Beat Bolli Sept. 5, 2020, 7:24 p.m. UTC
In 1f0fc1db8599 (pretty: implement 'reference' format, 2019-11-19), the
"reference" format was added. As a built-in format, it cannot be
overridden, although different projects may have divergent conventions
on how to format a commit reference. E.g., Git uses

    <hash> (<subject>, <short-date>) [1]

while Linux uses

    <hash> ("<subject>") [2]

Teach pretty to look at a different set of config variables, all
starting with "override" (e.g. "pretty.overrideReference"), to override
the built-in formats. Note that a format called "override" by itself is
not affected. The prefix was chosen to make it clear to the user that
this should not be done without thought, as it may cause issues with
other tools that expect the built-in formats to be immutable.

[1] https://github.com/git/git/blob/3a238e539bcdfe3f9eb5010fd218640c1b499f7a/Documentation/SubmittingPatches#L144
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v5.9-rc3#n167

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
I intend to also submit a patch to gitk that will use "git show -s
--pretty=reference" if it is available, with a fallback to reading
"pretty.overrideReference", so there's a single point of configuration
for the reference format.

 Documentation/config/pretty.txt |  6 ++++--
 pretty.c                        | 20 +++++++++++++++-----
 t/t4205-log-pretty-formats.sh   |  8 ++++++++
 3 files changed, 27 insertions(+), 7 deletions(-)

Comments

Denton Liu Sept. 5, 2020, 7:52 p.m. UTC | #1
Hi Beat,

Thanks for doing this. It was on my todo list but I've been quite busy
recently.

On Sat, Sep 05, 2020 at 09:24:06PM +0200, Beat Bolli wrote:
> In 1f0fc1db8599 (pretty: implement 'reference' format, 2019-11-19), the
> "reference" format was added. As a built-in format, it cannot be
> overridden, although different projects may have divergent conventions
> on how to format a commit reference. E.g., Git uses
> 
>     <hash> (<subject>, <short-date>) [1]
> 
> while Linux uses
> 
>     <hash> ("<subject>") [2]
> 
> Teach pretty to look at a different set of config variables, all
> starting with "override" (e.g. "pretty.overrideReference"), to override
> the built-in formats. Note that a format called "override" by itself is
> not affected. The prefix was chosen to make it clear to the user that
> this should not be done without thought, as it may cause issues with
> other tools that expect the built-in formats to be immutable.

Hmm, I'm not sure how I feel about being able to override formats other
than "reference". Perhaps we could special-case "reference" instead of
providing users with a possible foot-gun?

> [1] https://github.com/git/git/blob/3a238e539bcdfe3f9eb5010fd218640c1b499f7a/Documentation/SubmittingPatches#L144
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v5.9-rc3#n167
> 
> Signed-off-by: Beat Bolli <dev+git@drbeat.li>
> ---
> I intend to also submit a patch to gitk that will use "git show -s
> --pretty=reference" if it is available, with a fallback to reading
> "pretty.overrideReference", so there's a single point of configuration
> for the reference format.

Very good, I'm in favour of this.
Junio C Hamano Sept. 6, 2020, 9:59 p.m. UTC | #2
Denton Liu <liu.denton@gmail.com> writes:

> Hmm, I'm not sure how I feel about being able to override formats other
> than "reference".

Is the idea to introduce a parallel namespace to pretty.<name>?  I
am not sure why that is a good idea than, say a single variable that
says "to me, pretty.<name> would override even the built-in names".

I am not sure how I feel about being able to override built-in
formats in the first place, though.

After all, pretty.<name> were introduced so that user-defined ones
can be invoked with an equal ease as the built-in ones, but
overriding common understanding among the users of the tool is a
different story.
Beat Bolli Sept. 7, 2020, 5:36 a.m. UTC | #3
On 06.09.20 23:59, Junio C Hamano wrote:
> Denton Liu <liu.denton@gmail.com> writes:
> 
>> Hmm, I'm not sure how I feel about being able to override formats other
>> than "reference".
> 
> Is the idea to introduce a parallel namespace to pretty.<name>?  I
> am not sure why that is a good idea than, say a single variable that
> says "to me, pretty.<name> would override even the built-in names".
> 
> I am not sure how I feel about being able to override built-in
> formats in the first place, though.
> 
> After all, pretty.<name> were introduced so that user-defined ones
> can be invoked with an equal ease as the built-in ones, but
> overriding common understanding among the users of the tool is a
> different story.

I gave a reason for the reference format, at least.

Would you be fine with a patch that just allows to override the
reference format (for the stated reasons)?
Junio C Hamano Sept. 7, 2020, 6:54 a.m. UTC | #4
Beat Bolli <dev+git@drbeat.li> writes:

> On 06.09.20 23:59, Junio C Hamano wrote:
>> Denton Liu <liu.denton@gmail.com> writes:
>> 
>>> Hmm, I'm not sure how I feel about being able to override formats other
>>> than "reference".
>> 
>> Is the idea to introduce a parallel namespace to pretty.<name>?  I
>> am not sure why that is a good idea than, say a single variable that
>> says "to me, pretty.<name> would override even the built-in names".
>> 
>> I am not sure how I feel about being able to override built-in
>> formats in the first place, though.
>> 
>> After all, pretty.<name> were introduced so that user-defined ones
>> can be invoked with an equal ease as the built-in ones, but
>> overriding common understanding among the users of the tool is a
>> different story.
>
> I gave a reason for the reference format, at least.
>
> Would you be fine with a patch that just allows to override the
> reference format (for the stated reasons)?

Your "reason" read pretty much the same as "I want reference to do
something else", but that leads to "depending on the configuration,
even built-in names that are well known to all Git users behave
differently---the users lose common reference (no pun intended)".

Also I am not sure how your reason applies specifically to the
reference format.  It would be widely applicable to other formats
like 'short' and 'oneline' in that depending on projects' and
personal preference, people may want "something like X but not
exactly X" for all the built-in formats.

IOW I still do not see why your "stated reasons" justify overriding
any built-in format, and/or overriding only the reference format.  I
can understand (but not necessarily agree with) the position "We'll
let any built-in format to be overridden", but I do not see what
makes "reference" so special.  Even though I think it would confuse
the users to make any built-in format overridable and therefore I do
not think it is such a good idea, if we were to allow it, I do not
see any point in limiting the damage only to the reference format.

Finally, a non-built-in name to express the format specific to a
project can already be defined and used pretty easily; e.g. the
"pretty.kernel" format may say %h ("%s") and can be used like

    $ git show -s --pretty=kernel HEAD

with the same ease as the 'reference' format.

    $ git show -s --pretty=reference HEAD

So, I dunno.
Beat Bolli Sept. 7, 2020, 7:06 a.m. UTC | #5
On 07.09.20 08:54, Junio C Hamano wrote:
> Finally, a non-built-in name to express the format specific to a
> project can already be defined and used pretty easily; e.g. the
> "pretty.kernel" format may say %h ("%s") and can be used like
> 
>     $ git show -s --pretty=kernel HEAD
> 
> with the same ease as the 'reference' format.
> 
>     $ git show -s --pretty=reference HEAD

This would work fine if there wasn't also gitk, which has a "Copy commit
reference" function. Without a common name for the format in Git and
gitk, we lose the single point of truth.

The one alternative I can see is to make the format name configurable in
gitk.
Jeff King Sept. 8, 2020, 1:53 p.m. UTC | #6
On Sun, Sep 06, 2020 at 11:54:42PM -0700, Junio C Hamano wrote:

> > I gave a reason for the reference format, at least.
> >
> > Would you be fine with a patch that just allows to override the
> > reference format (for the stated reasons)?
> 
> Your "reason" read pretty much the same as "I want reference to do
> something else", but that leads to "depending on the configuration,
> even built-in names that are well known to all Git users behave
> differently---the users lose common reference (no pun intended)".

I think there is some value in having names and rough semantics that are
well-known to all Git users (and scripts), but whose exact output is not
set in stone.  So a name like "reference" becomes a rendezvous point
between a script like gitk and the user. It is a shorthand for
"reference a commit in a human-readable way according to the user or
project preferences". The script wants to read that value, and the user
wants to specify it.

You could accomplish something similar by having gitk look up
pretty.userReference, and defaulting to something sensible if it's not
defined. For a big script like gitk that's not too much of an
imposition. But it's awfully convenient to be able to just say
--format=reference in any script and get the user's preferred format.

That's where I think your pretty.kernel example falls down; both the
repo and the script have to agree that the name "kernel" exists.

> Also I am not sure how your reason applies specifically to the
> reference format.  It would be widely applicable to other formats
> like 'short' and 'oneline' in that depending on projects' and
> personal preference, people may want "something like X but not
> exactly X" for all the built-in formats.

The things that may make "reference" different are:

 - it's new-ish, so there's less chance of historical dependencies on it
   (this is a bit hand-wavy, of course; it has been in released
   versions. On the other hand, people may well have been using
   pretty.reference for this already, and we made it stop working when
   we added "reference").

 - from the start, the point was for it to be a human-readable format
   (it's not even unambiguously parseable anyway).

So of any of the formats, it seems like the most likely candidate for
such a feature (setting "pretty.raw" would be a pretty big foot-gun, for
instance). I don't like the inconsistency it introduces between formats,
though.

Here's a slightly different proposal. I'm not sure if I like it or not,
but just thinking out loud for a moment. The issue is that we're worried
the consumer of the output may be surprised by a user-configured pretty
format. Can we give them a way to say "I don't care about the exact
output; pick what the user configured for this name, or some sane
default". I.e., something like:

  git log --format=loose:reference

? That would let pretty.reference override the built-in name, but the
behavior of plain "--format=reference" would continue to ignore it. It's
a little more annoying for a script to specify, but not nearly as
annoying as:

  format=$(git config pretty.customReference || echo "%h (%s, %d)")
  git log --format="%format"

-Peff
Junio C Hamano Sept. 8, 2020, 6:12 p.m. UTC | #7
Jeff King <peff@peff.net> writes:

> You could accomplish something similar by having gitk look up
> pretty.userReference, and defaulting to something sensible if it's not
> defined. For a big script like gitk that's not too much of an
> imposition. But it's awfully convenient to be able to just say
> --format=reference in any script and get the user's preferred format.

Or --format=userReference in any script, and then allow it to fall
back to pretty.reference that is otherwise ignored?  Ah, that indeed
is what you suggested with --format=loose:reference already.

> So of any of the formats, it seems like the most likely candidate for
> such a feature (setting "pretty.raw" would be a pretty big foot-gun, for
> instance). I don't like the inconsistency it introduces between formats,
> though.

Yes, the inconsistency was what primarily disturbed me.

> Here's a slightly different proposal. I'm not sure if I like it or not,
> but just thinking out loud for a moment. The issue is that we're worried
> the consumer of the output may be surprised by a user-configured pretty
> format. Can we give them a way to say "I don't care about the exact
> output; pick what the user configured for this name, or some sane
> default". I.e., something like:
>
>   git log --format=loose:reference

Yeah, that, or with s/loose/user/ or something.
Jeff King Sept. 9, 2020, 9:08 a.m. UTC | #8
On Tue, Sep 08, 2020 at 11:12:11AM -0700, Junio C Hamano wrote:

> > Here's a slightly different proposal. I'm not sure if I like it or not,
> > but just thinking out loud for a moment. The issue is that we're worried
> > the consumer of the output may be surprised by a user-configured pretty
> > format. Can we give them a way to say "I don't care about the exact
> > output; pick what the user configured for this name, or some sane
> > default". I.e., something like:
> >
> >   git log --format=loose:reference
> 
> Yeah, that, or with s/loose/user/ or something.

Heh, I actually called it "user:" initially but wasn't sure if that was
sufficiently descriptive, so I groped around for another word. But if
both of us thought of "user", maybe it's better.

At any rate, this was mostly just thinking out loud, and isn't something
I'm planning to follow up on with a patch. But maybe it inspires
somebody to run with it.

-Peff
Junio C Hamano Sept. 9, 2020, 6:27 p.m. UTC | #9
Jeff King <peff@peff.net> writes:

> On Tue, Sep 08, 2020 at 11:12:11AM -0700, Junio C Hamano wrote:
>
>> > Here's a slightly different proposal. I'm not sure if I like it or not,
>> > but just thinking out loud for a moment. The issue is that we're worried
>> > the consumer of the output may be surprised by a user-configured pretty
>> > format. Can we give them a way to say "I don't care about the exact
>> > output; pick what the user configured for this name, or some sane
>> > default". I.e., something like:
>> >
>> >   git log --format=loose:reference
>> 
>> Yeah, that, or with s/loose/user/ or something.
>
> Heh, I actually called it "user:" initially but wasn't sure if that was
> sufficiently descriptive, so I groped around for another word. But if
> both of us thought of "user", maybe it's better.
>
> At any rate, this was mostly just thinking out loud, and isn't something
> I'm planning to follow up on with a patch. But maybe it inspires
> somebody to run with it.

Of course, we could go the other way and follow the same approach as
the "--literal-pathspecs" feature (and what bash does with the alias
and uses "command" keyword to work around the confusion it causes).

IOW, we could force those scripts that want to be strict to pay the
price and be explicit (e.g. "--format=builtin:<name>") and allow
others that want to be affected by end user customization can keep
saying "--format=<name>".  It unfortunately breaks our long standing
stance against backward compatibility breaking changes, so I would
say it is not likely to happen.  "--format=loose:reference" does not
share the problem, and it is much safer.

In any case, I do not think "pretty.override<word>" configuration
variable, which warns with the 'override' and makes those who tweak
builtin format think twice, is a good idea.  Those who add the
custom configuration are not in the position to decide if a
particular use of --format=<word> in a script (like gitk[*1*]) should
or should not be affected by customization.  It is up to the script
writers [*2*].


[Footnote]

*1* We've been using gitk as an example but it is not the best one,
    since it was made crystal clear that gitk will not be accepting a
    single liner patch that uses --pretty=reference anyway, it is a
    moot point.

    cf. https://lore.kernel.org/git/20191211215826.GA31614@blackberry/

*2* In general, a script wants strict/builtin output if it captures
    and parses, and customizable output if it just lets the git
    command it calls directly talk to the end user.
diff mbox series

Patch

diff --git a/Documentation/config/pretty.txt b/Documentation/config/pretty.txt
index 063c6b63d9..d9dac7b3ee 100644
--- a/Documentation/config/pretty.txt
+++ b/Documentation/config/pretty.txt
@@ -5,5 +5,7 @@  pretty.<name>::
 	running `git config pretty.changelog "format:* %H %s"`
 	would cause the invocation `git log --pretty=changelog`
 	to be equivalent to running `git log "--pretty=format:* %H %s"`.
-	Note that an alias with the same name as a built-in format
-	will be silently ignored.
+	Note that you can override a built-in format by prefixing its
+	name with `override`, e.g. `pretty.overrideReference` to override
+	the built-in reference format. Doing so can cause interoperability
+	issues with tools that expect a built-in format to be immutable.
diff --git a/pretty.c b/pretty.c
index 2a3d46bf42..a8f8ade470 100644
--- a/pretty.c
+++ b/pretty.c
@@ -46,18 +46,28 @@  static int git_pretty_formats_config(const char *var, const char *value, void *c
 {
 	struct cmt_fmt_map *commit_format = NULL;
 	const char *name;
+	const char *suffix;
 	const char *fmt;
 	int i;
 
 	if (!skip_prefix(var, "pretty.", &name))
 		return 0;
-
-	for (i = 0; i < builtin_formats_len; i++) {
-		if (!strcmp(commit_formats[i].name, name))
-			return 0;
+	if (skip_prefix(name, "override", &suffix) && *suffix) {
+		name = suffix;
+		/* also search the built-in formats */
+		i = 0;
+	} else {
+		for (i = 0; i < builtin_formats_len; i++) {
+			if (!strcmp(commit_formats[i].name, name))
+				return 0;
+		}
+		/*
+		 * Here, i == builtin_formats_len, so we only search the
+		 * user-defined formats
+		 */
 	}
 
-	for (i = builtin_formats_len; i < commit_formats_len; i++) {
+	for (; i < commit_formats_len; i++) {
 		if (!strcmp(commit_formats[i].name, name)) {
 			commit_format = &commit_formats[i];
 			break;
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index 204c149d5a..55c37be392 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -52,6 +52,14 @@  test_expect_success 'alias masking builtin format' '
 	test_cmp expected actual
 '
 
+test_expect_success 'overriding builtin format' '
+	git log --pretty=%H >expected &&
+	git config pretty.overrideOneline "%H" &&
+	git log --pretty=oneline >actual &&
+	git config --unset pretty.overrideOneline &&
+	test_cmp expected actual
+'
+
 test_expect_success 'alias user-defined format' '
 	git log --pretty="format:%h" >expected &&
 	git config pretty.test-alias "format:%h" &&