[1/2] pretty: add %(describe)

Message ID	5561d11b-08c3-bcf7-5d37-a7d6c6bfb715@web.de (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Subject: [PATCH 1/2] pretty: add %(describe) To: Eli Schwartz <eschwartz@archlinux.org>, git@vger.kernel.org References: <7418f1d8-78c2-61a7-4f03-62360b986a41@archlinux.org> Cc: Junio C Hamano <gitster@pobox.com> From: =?utf-8?q?Ren=C3=A9_Scharfe?= <l.s.r@web.de> Message-ID: <5561d11b-08c3-bcf7-5d37-a7d6c6bfb715@web.de> Date: Sun, 14 Feb 2021 11:04:34 +0100 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 MIME-Version: 1.0 In-Reply-To: <7418f1d8-78c2-61a7-4f03-62360b986a41@archlinux.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Precedence: bulk
Series	[1/2] pretty: add %(describe) \| expand [1/2] pretty: add %(describe) [2/2] pretty: add merge and exclude options to %(describe)

René Scharfe Feb. 14, 2021, 10:04 a.m. UTC

Add a format placeholder for describe output.  Implement it by actually
calling git describe, which is simple and guarantees correctness.  It's
intended to be used with $Format:...$ in files with the attribute
export-subst and git archive.  It can also be used with git log etc.,
even though that's going to be slow due to the fork for each commit.

Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 Documentation/pretty-formats.txt |  2 ++
 pretty.c                         | 17 +++++++++++++++++
 t/t4205-log-pretty-formats.sh    | 10 ++++++++++
 3 files changed, 29 insertions(+)

--
2.30.1

Eli Schwartz Feb. 16, 2021, 5:04 a.m. UTC | #1

On 2/14/21 5:04 AM, René Scharfe wrote:
> Add a format placeholder for describe output.  Implement it by actually
> calling git describe, which is simple and guarantees correctness.  It's
> intended to be used with $Format:...$ in files with the attribute
> export-subst and git archive.  It can also be used with git log etc.,
> even though that's going to be slow due to the fork for each commit.

This patch works great for me.

In fact, it even works fast enough for git log to not noticeably slow
down unless I really stomp on the "Page Down" button. At least on Linux...

Thanks for working on this!

> Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  Documentation/pretty-formats.txt |  2 ++
>  pretty.c                         | 17 +++++++++++++++++
>  t/t4205-log-pretty-formats.sh    | 10 ++++++++++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
> index 6b59e28d44..bb8c05bc21 100644
> --- a/Documentation/pretty-formats.txt
> +++ b/Documentation/pretty-formats.txt
> @@ -208,6 +208,8 @@ The placeholders are:
>  '%cs':: committer date, short format (`YYYY-MM-DD`)
>  '%d':: ref names, like the --decorate option of linkgit:git-log[1]
>  '%D':: ref names without the " (", ")" wrapping.
> +'%(describe)':: human-readable name, like linkgit:git-describe[1];
> +		empty string for undescribable commits
>  '%S':: ref name given on the command line by which the commit was reached
>         (like `git log --source`), only works with `git log`
>  '%e':: encoding
> diff --git a/pretty.c b/pretty.c
> index b4ff3f602f..a0c427fb61 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -12,6 +12,7 @@
>  #include "reflog-walk.h"
>  #include "gpg-interface.h"
>  #include "trailer.h"
> +#include "run-command.h"
> 
>  static char *user_format;
>  static struct cmt_fmt_map {
> @@ -1214,6 +1215,22 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
>  		return parse_padding_placeholder(placeholder, c);
>  	}
> 
> +	if (skip_prefix(placeholder, "(describe)", &arg)) {
> +		struct child_process cmd = CHILD_PROCESS_INIT;
> +		struct strbuf out = STRBUF_INIT;
> +		struct strbuf err = STRBUF_INIT;
> +
> +		cmd.git_cmd = 1;
> +		strvec_push(&cmd.args, "describe");
> +		strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
> +		pipe_command(&cmd, NULL, 0, &out, 0, &err, 0);
> +		strbuf_rtrim(&out);
> +		strbuf_addbuf(sb, &out);
> +		strbuf_release(&out);
> +		strbuf_release(&err);
> +		return arg - placeholder;
> +	}
> +
>  	/* these depend on the commit */
>  	if (!commit->object.parsed)
>  		parse_object(the_repository, &commit->object.oid);
> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 749bc1431a..5a44fa447d 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> @@ -962,4 +962,14 @@ test_expect_success 'log --pretty=reference is colored appropriately' '
>  	test_cmp expect actual
>  '
> 
> +test_expect_success '%(describe) vs git describe' '
> +	git log --format="%H" | while read hash
> +	do
> +		echo "$hash $(git describe $hash)"
> +	done >expect &&
> +	git log --format="%H %(describe)" >actual 2>err &&
> +	test_cmp expect actual &&
> +	test_must_be_empty err
> +'
> +
>  test_done
> --
> 2.30.1
>

Ævar Arnfjörð Bjarmason Feb. 16, 2021, 1 p.m. UTC | #2

On Sun, Feb 14 2021, René Scharfe wrote:

> Add a format placeholder for describe output.  Implement it by actually
> calling git describe, which is simple and guarantees correctness.  It's
> intended to be used with $Format:...$ in files with the attribute
> export-subst and git archive. 

Does it really guarantee correctness though? In "builtin/describe.c" we
first walk over the refs and use that to format all N items we're asked
about.

Under "git log" this is presumably in a race where refs added/deleted
during the run of "git log" will change the describe output to be
inconsistent with earlier lines.

> It can also be used with git log etc., even though that's going to be
> slow due to the fork for each commit.

René Scharfe Feb. 16, 2021, 5:13 p.m. UTC | #3

Am 16.02.21 um 14:00 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Feb 14 2021, René Scharfe wrote:
>
>> Add a format placeholder for describe output.  Implement it by actually
>> calling git describe, which is simple and guarantees correctness.  It's
>> intended to be used with $Format:...$ in files with the attribute
>> export-subst and git archive.
>
> Does it really guarantee correctness though? In "builtin/describe.c" we
> first walk over the refs and use that to format all N items we're asked
> about.
>
> Under "git log" this is presumably in a race where refs added/deleted
> during the run of "git log" will change the describe output to be
> inconsistent with earlier lines.

Right, didn't think of that aspect.  So we'd better warn about the
raciness in the documentation.

We could improve that by keeping a single describe process around and
feeding it object names through a pipe as we go.  The results would
still become outdated after a ref is added or removed, but they'd be
consistent.  This would be faster as well.

René

Junio C Hamano Feb. 16, 2021, 6:44 p.m. UTC | #4

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sun, Feb 14 2021, René Scharfe wrote:
>
>> Add a format placeholder for describe output.  Implement it by actually
>> calling git describe, which is simple and guarantees correctness.  It's
>> intended to be used with $Format:...$ in files with the attribute
>> export-subst and git archive. 
>
> Does it really guarantee correctness though? In "builtin/describe.c" we
> first walk over the refs and use that to format all N items we're asked
> about.
>
> Under "git log" this is presumably in a race where refs added/deleted
> during the run of "git log" will change the describe output to be
> inconsistent with earlier lines.

Yes, but it is not a news that the describe for a given commit will
not stay the constant while you add more objects to the repository
or you change the tags, whether the "describe" is driven internally
by "git log" or by the end-user, so I am not sure how that becomes
an issue.  If the output is inconsistent in a quiescent repository,
that would be a problem, though.

Puzzled.

Thanks.

Ævar Arnfjörð Bjarmason Feb. 17, 2021, 12:47 a.m. UTC | #5

On Tue, Feb 16 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Sun, Feb 14 2021, René Scharfe wrote:
>>
>>> Add a format placeholder for describe output.  Implement it by actually
>>> calling git describe, which is simple and guarantees correctness.  It's
>>> intended to be used with $Format:...$ in files with the attribute
>>> export-subst and git archive. 
>>
>> Does it really guarantee correctness though? In "builtin/describe.c" we
>> first walk over the refs and use that to format all N items we're asked
>> about.
>>
>> Under "git log" this is presumably in a race where refs added/deleted
>> during the run of "git log" will change the describe output to be
>> inconsistent with earlier lines.
>
> Yes, but it is not a news that the describe for a given commit will
> not stay the constant while you add more objects to the repository
> or you change the tags, whether the "describe" is driven internally
> by "git log" or by the end-user, so I am not sure how that becomes
> an issue.  If the output is inconsistent in a quiescent repository,
> that would be a problem, though.
>
> Puzzled.

Usually something shelling out has going for it is that even if it's
slow it's at least known to be bug-free, after all we use the command
like that already.

I just wanted to point out this edge case, for "git describe <commits>"
we do the ref list once at the beginning so our N list will be
consistent within itself.

Whereas one might expect "git log" to also format such a list, but due
to the internal implementation the semantics are different.

Ævar Arnfjörð Bjarmason Feb. 17, 2021, 12:58 a.m. UTC | #6

On Sun, Feb 14 2021, René Scharfe wrote:

> +'%(describe)':: human-readable name, like linkgit:git-describe[1];
> +		empty string for undescribable commits

In the case of undescribable we've got the subcommand exiting non-zero
and we ignore it. The right thing in this case given how the rest of
format arguments work, but maybe something to explicitly test for?
>
> +	if (skip_prefix(placeholder, "(describe)", &arg)) {
> +		struct child_process cmd = CHILD_PROCESS_INIT;
> +		struct strbuf out = STRBUF_INIT;
> +		struct strbuf err = STRBUF_INIT;
> +
> +		cmd.git_cmd = 1;
> +		strvec_push(&cmd.args, "describe");
> +		strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
> +		pipe_command(&cmd, NULL, 0, &out, 0, &err, 0);
> +		strbuf_rtrim(&out);
> +		strbuf_addbuf(sb, &out);
> +		strbuf_release(&out);
> +		strbuf_release(&err);
> +		return arg - placeholder;
> +	}

There's another edge case in this: if you do "%(describe)%(describe)"
it'll be run twice for the rev, 3 times if you add another "%(describe)"
etc. I don't know if pretty.c has an easy way to cache/avoid that.

Junio C Hamano Feb. 17, 2021, 6:12 p.m. UTC | #7

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sun, Feb 14 2021, René Scharfe wrote:
>
>> +'%(describe)':: human-readable name, like linkgit:git-describe[1];
>> +		empty string for undescribable commits
>
> In the case of undescribable we've got the subcommand exiting non-zero
> and we ignore it. The right thing in this case given how the rest of
> format arguments work, but maybe something to explicitly test for?
>>
>> +	if (skip_prefix(placeholder, "(describe)", &arg)) {
>> +		struct child_process cmd = CHILD_PROCESS_INIT;
>> +		struct strbuf out = STRBUF_INIT;
>> +		struct strbuf err = STRBUF_INIT;
>> +
>> +		cmd.git_cmd = 1;
>> +		strvec_push(&cmd.args, "describe");
>> +		strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
>> +		pipe_command(&cmd, NULL, 0, &out, 0, &err, 0);
>> +		strbuf_rtrim(&out);
>> +		strbuf_addbuf(sb, &out);
>> +		strbuf_release(&out);
>> +		strbuf_release(&err);
>> +		return arg - placeholder;
>> +	}
>
> There's another edge case in this: if you do "%(describe)%(describe)"
> it'll be run twice for the rev, 3 times if you add another "%(describe)"
> etc. I don't know if pretty.c has an easy way to cache/avoid that.

Just like for-each-ref that has the "atom" system to lazily parse
and cache a computed result so that multiple invocations of the same
placeholder will have to prepare a string only once, the pretty side
has the "format_commit_context" structure that can be used to cache
values that are expensive to compute in case it is used more than
once, so we could use it.

I however suspect that we may already be leaking some cacheed values
in the current code.  For example, there is "signature_check"
instance that is used to handle %G and we do use it to call
check_signature(), but a call to signature_check_clear() to release
resources is nowhere to be seen as far as I can tell.

René Scharfe Feb. 28, 2021, 11:22 a.m. UTC | #8

Am 17.02.21 um 01:58 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Feb 14 2021, René Scharfe wrote:
>
>> +'%(describe)':: human-readable name, like linkgit:git-describe[1];
>> +		empty string for undescribable commits
>
> In the case of undescribable we've got the subcommand exiting non-zero
> and we ignore it. The right thing in this case given how the rest of
> format arguments work, but maybe something to explicitly test for?

The test convers it, but we can surely make that easier to see.

-- >8 --
Subject: [PATCH] t4205: assert %(describe) test coverage

Document that the test is covering both describable and
undescribable commits.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/t4205-log-pretty-formats.sh | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index b47a0bd9eb..cabdf7d57a 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -965,8 +965,17 @@ test_expect_success 'log --pretty=reference is colored appropriately' '
 test_expect_success '%(describe) vs git describe' '
 	git log --format="%H" | while read hash
 	do
-		echo "$hash $(git describe $hash)"
+		if desc=$(git describe $hash)
+		then
+			: >expect-contains-good
+		else
+			: >expect-contains-bad
+		fi &&
+		echo "$hash $desc"
 	done >expect &&
+	test_path_exists expect-contains-good &&
+	test_path_exists expect-contains-bad &&
+
 	git log --format="%H %(describe)" >actual 2>err &&
 	test_cmp expect actual &&
 	test_must_be_empty err
--
2.30.1

René Scharfe Feb. 28, 2021, 11:22 a.m. UTC | #9

Am 17.02.21 um 01:47 schrieb Ævar Arnfjörð Bjarmason:
> Usually something shelling out has going for it is that even if it's
> slow it's at least known to be bug-free, after all we use the command
> like that already.
>
> I just wanted to point out this edge case, for "git describe <commits>"
> we do the ref list once at the beginning so our N list will be
> consistent within itself.
>
> Whereas one might expect "git log" to also format such a list, but due
> to the internal implementation the semantics are different.

It shouldn't matter for the intended use case (git archive containing a
spec file with a single "$Format:%(describe)$") and I cannot imagine how
consistency in the face of tag changes would be useful (what application
would be OK with consistently outdated output, but break with partly
outdated descriptions).  But it makes sense to mention it in the docs.

-- >8 --
Subject: [PATCH] pretty: document multiple %(describe) being inconsistent

Each %(describe) placeholder is expanded using a separate git describe
call.  Their outputs depend on the tags present at the time, so there's
no consistency guarantee.  Document that fact.

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 Documentation/pretty-formats.txt | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 231010e6ef..45133066e4 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -212,7 +212,9 @@ The placeholders are:
 			  linkgit:git-describe[1]; empty string for
 			  undescribable commits.  The `describe` string
 			  may be followed by a colon and zero or more
-			  comma-separated options.
+			  comma-separated options.  Descriptions can be
+			  inconsistent when tags are added or removed at
+			  the same time.
 +
 ** 'match=<pattern>': Only consider tags matching the given
    `glob(7)` pattern, excluding the "refs/tags/" prefix.
--
2.30.1

René Scharfe March 2, 2021, 4 p.m. UTC | #10

Am 01.03.21 um 18:50 schrieb Junio C Hamano:
> René Scharfe. <l.s.r@web.de> writes:
>
>> Am 17.02.21 um 01:47 schrieb Ævar Arnfjörð Bjarmason:
>>> Usually something shelling out has going for it is that even if it's
>>> slow it's at least known to be bug-free, after all we use the command
>>> like that already.
>>>
>>> I just wanted to point out this edge case, for "git describe <commits>"
>>> we do the ref list once at the beginning so our N list will be
>>> consistent within itself.
>>>
>>> Whereas one might expect "git log" to also format such a list, but due
>>> to the internal implementation the semantics are different.
>>
>> It shouldn't matter for the intended use case (git archive containing a
>> spec file with a single "$Format:%(describe)$") and I cannot imagine how
>> consistency in the face of tag changes would be useful (what application
>> would be OK with consistently outdated output, but break with partly
>> outdated descriptions).  But it makes sense to mention it in the docs.
>
> Does it?
>
> As long as the reader understands "git describe" is about measuring
> the "location" of given commits relative to the annotated tags (or
> whatever anchor points the invocation chooses to use), I would say
> that it is unlikely that the reader does not to realize the
> ramifications of changing tags in the middle.

I actually agree, but I'm biased.  The thing is: The question came up
and needed answering, so there is a chance that it might help someone
else as well.

> While it may not be incorrect per-se (hence it may "not hurt"),
> making the description longer does hurt the readers' experience.

The added sentence would look better in a smaller font, or in a
footnote. :-|

> So, I am a bit skeptical if this is a good change, but will queue
> for now anyway.
>
> Thanks.
>
>
>
>> -- >8 --
>> Subject: [PATCH] pretty: document multiple %(describe) being inconsistent
>>
>> Each %(describe) placeholder is expanded using a separate git describe
>> call.  Their outputs depend on the tags present at the time, so there's
>> no consistency guarantee.  Document that fact.
>>
>> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>>  Documentation/pretty-formats.txt | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
>> index 231010e6ef..45133066e4 100644
>> --- a/Documentation/pretty-formats.txt
>> +++ b/Documentation/pretty-formats.txt
>> @@ -212,7 +212,9 @@ The placeholders are:
>>  			  linkgit:git-describe[1]; empty string for
>>  			  undescribable commits.  The `describe` string
>>  			  may be followed by a colon and zero or more
>> -			  comma-separated options.
>> +			  comma-separated options.  Descriptions can be
>> +			  inconsistent when tags are added or removed at
>> +			  the same time.
>>  +
>>  ** 'match=<pattern>': Only consider tags matching the given
>>     `glob(7)` pattern, excluding the "refs/tags/" prefix.
>> --
>> 2.30.1

René Scharfe March 2, 2021, 4 p.m. UTC | #11

Am 01.03.21 um 18:45 schrieb Junio C Hamano:
> René Scharfe. <l.s.r@web.de> writes:
>
>> Subject: [PATCH] t4205: assert %(describe) test coverage
>>
>> Document that the test is covering both describable and
>> undescribable commits.
>>
>> Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>>  t/t4205-log-pretty-formats.sh | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
>> index b47a0bd9eb..cabdf7d57a 100755
>> --- a/t/t4205-log-pretty-formats.sh
>> +++ b/t/t4205-log-pretty-formats.sh
>> @@ -965,8 +965,17 @@ test_expect_success 'log --pretty=reference is colored appropriately' '
>>  test_expect_success '%(describe) vs git describe' '
>>  	git log --format="%H" | while read hash
>>  	do
>> -		echo "$hash $(git describe $hash)"
>> +		if desc=$(git describe $hash)
>> +		then
>> +			: >expect-contains-good
>> +		else
>> +			: >expect-contains-bad
>> +		fi &&
>> +		echo "$hash $desc"
>>  	done >expect &&
>> +	test_path_exists expect-contains-good &&
>> +	test_path_exists expect-contains-bad &&
>
> Hmph, I am not sure why we want temporary files for this (and I
> doubt this "documenting" adds that much value to the tests to begin
> with), but OK.  Will queue.

Variables would suffice, but make debugging harder.  test_path_exists
will at least print a suggestive file name.  Perhaps we should add a
test_assert?

The added checks guard against neutering the test accidentally e.g.
by tagging the currently undescribable commit in the setup phase.
That would be hard to detect without it.

>
>>  	git log --format="%H %(describe)" >actual 2>err &&
>>  	test_cmp expect actual &&
>>  	test_must_be_empty err
>> --
>> 2.30.1

[1/2] pretty: add %(describe)

Commit Message

Comments

Patch