[v3,2/2] ref-filter: add new "describe" atom

Message ID	20230719162424.70781-3-five231003@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@vger.kernel.org> From: Kousik Sanagavarapu <five231003@gmail.com> To: git@vger.kernel.org Cc: Junio C Hamano <gitster@pobox.com>, Christian Couder <christian.couder@gmail.com>, Hariom Verma <hariom18599@gmail.com>, Kousik Sanagavarapu <five231003@gmail.com> Subject: [PATCH v3 2/2] ref-filter: add new "describe" atom Date: Wed, 19 Jul 2023 21:45:06 +0530 Message-ID: <20230719162424.70781-3-five231003@gmail.com> In-Reply-To: <20230719162424.70781-1-five231003@gmail.com> References: <20230714194249.66862-1-five231003@gmail.com> <20230719162424.70781-1-five231003@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Add new "describe" atom \| expand [v3,0/2] Add new "describe" atom [v3,1/2] ref-filter: add multiple-option parsing functions [v3,2/2] ref-filter: add new "describe" atom

Message ID

20230719162424.70781-3-five231003@gmail.com (mailing list archive)

State

Superseded

Headers

From: Kousik Sanagavarapu <five231003@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
        Christian Couder <christian.couder@gmail.com>,
        Hariom Verma <hariom18599@gmail.com>,
        Kousik Sanagavarapu <five231003@gmail.com>
Subject: [PATCH v3 2/2] ref-filter: add new "describe" atom
Date: Wed, 19 Jul 2023 21:45:06 +0530
Message-ID: <20230719162424.70781-3-five231003@gmail.com>
In-Reply-To: <20230719162424.70781-1-five231003@gmail.com>
References: <20230714194249.66862-1-five231003@gmail.com>
 <20230719162424.70781-1-five231003@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

Add new "describe" atom | expand

Commit Message

Kousik Sanagavarapu July 19, 2023, 4:15 p.m. UTC

Duplicate the logic of %(describe) and friends from pretty to
ref-filter. In the future, this change helps in unifying both the
formats as ref-filter will be able to do everything that pretty is doing
and we can have a single interface.

The new atom "describe" and its friends are equivalent to the existing
pretty formats with the same name.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
---
 Documentation/git-for-each-ref.txt |  23 +++++
 ref-filter.c                       | 130 +++++++++++++++++++++++++++++
 t/t6300-for-each-ref.sh            | 114 +++++++++++++++++++++++++
 3 files changed, 267 insertions(+)

Comments

Junio C Hamano July 19, 2023, 10:56 p.m. UTC | #1

Kousik Sanagavarapu <five231003@gmail.com> writes:

> Duplicate the logic of %(describe) and friends from pretty to
> ref-filter. In the future, this change helps in unifying both the
> formats as ref-filter will be able to do everything that pretty is doing
> and we can have a single interface.
>
> The new atom "describe" and its friends are equivalent to the existing
> pretty formats with the same name.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
> ---
>  Documentation/git-for-each-ref.txt |  23 +++++
>  ref-filter.c                       | 130 +++++++++++++++++++++++++++++
>  t/t6300-for-each-ref.sh            | 114 +++++++++++++++++++++++++
>  3 files changed, 267 insertions(+)
>
> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
> index 2e0318770b..395daf1b22 100644
> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt
> @@ -258,6 +258,29 @@ ahead-behind:<committish>::
>  	commits ahead and behind, respectively, when comparing the output
>  	ref to the `<committish>` specified in the format.
>  
> +describe[:options]:: Human-readable name, like
> +		     link-git:git-describe[1]; empty string for
> +		     undescribable commits. The `describe` string may be
> +		     followed by a colon and zero or more comma-separated
> +		     options.

Why do these new items formatted so differently from the previous
ones?  By indenting the lines so deeply you are forcing yourself to
wrap these lines many times.  How about imitating the previous entry
for ahead-behind and writing this like so:

        describe[:<options>]::
                A human-readable name, like linkgit:git-describe[1];
                empty string is given for an undescribable commit.
		...

By the way, there is a typo "link-git" above that needs to be
corrected.

It is curious that we support "describe:" (i.e. having no options,
but colon is still present).  It may not be wrong per se, but it
looks strange.  "may be followed by a colon and one or more
comma-separated options" would be more intuitive (I haven't seen the
implementation yet, so if we go that route, the implementation may
also need to be updated).

>               .... Descriptions can be inconsistent when tags
> +		     are added or removed at the same time.

"at the same time" meaning "while the description are being
computed"?  I think this was copied from 273c9901 (pretty: document
multiple %(describe) being inconsistent, 2021-02-28) where the
pretty placeholder for "git log" and friends are described, and the
implementation used there go one formatting element at a time,
unlike for-each-ref that can compute a description for a given ref
just once in populate_value() and reuse the same atom number of
times in the format, each instance giving exactly the same value.
So I am not sure if the "can be inconsistent" disclaimer applies
to the %(describe) on this side the same way.  Are you sure?

As %(describe) is fairly expensive to compute, if the format string
wants two, e.g. --format="%(refname) %(describe) %(describe)", there
should be some effort to make these two share the same used_atom(),
so that there will be only one "git describe" invocation from
populate_value() that lets get_ref_atom_value() reuse that result
of a single invocation to fill the two placeholder.

> @@ -219,6 +222,7 @@ static struct used_atom {
>  			enum { S_BARE, S_GRADE, S_SIGNER, S_KEY,
>  			       S_FINGERPRINT, S_PRI_KEY_FP, S_TRUST_LEVEL } option;
>  		} signature;
> +		const char **describe_args;
>  		struct refname_atom refname;
>  		char *head;
>  	} u;

Nice and simple ;-).

> +static int describe_atom_option_parser(struct strvec *args, const char **arg,
> +				       struct strbuf *err)
> +{
> +	const char *argval;
> +	size_t arglen = 0;
> +	int optval = 0;
> +
> +	if (match_atom_bool_arg(*arg, "tags", arg, &optval)) {
> +		if (!optval)
> +			strvec_push(args, "--no-tags");
> +		else
> +			strvec_push(args, "--tags");
> +		return 1;
> +	}

OK.  One thing that I hate about the split of this series into two
steps is that [1/2] has to be read without knowing what the expected
use of those two helper functions are.  It was especially bad as the
functions lacked any documentation on how they are supposed to be
called.

Now, if we go back to the implementation of match_atom_bool_arg(),
it first called match_atom_arg_value(), which stripped the given key
("tags" in this case) from the argument being parsed, and allowed
'=' (i.e. followed by a val), ',' (i.e. no val, but more "key[=val]"
to follow), or '\0' (i.e. end of the argument string).  Anything
else after the matched key meant that the key did not match
(e.g. the arg had "tagsabcd", which should not match "tag").  And it
stored the byte position after '=' if the key was terminated with
'=', or NULL otherwise, to signal where the optional value starts.
match_atom_bool_arg() uses this and correctly translates a key
without the optional [=val] part into "true".  So the above is
given, after the caller skips "%(describe:", things like "tags,...",
"tags=no,...", "tags=yes,..." (or replace ",..." with a NUL for the
final option), and chooses between --no-tags and --tags.  Sounds
good.

> + ...
> +	if (match_atom_arg_value(*arg, "exclude", arg, &argval, &arglen)) {
> +		if (!arglen)
> +			return strbuf_addf_ret(err, -1,
> +					       _("value expected %s="),
> +					       "describe:exclude");
> +
> +		strvec_pushf(args, "--exclude=%.*s", (int)arglen, argval);
> +		return 1;
> +	}

I would have expected that these become if/else if/.../else cascade,
i.e.

	if (is that "tags"?) {
	} else if (is that "abbrev"?) {
		...
	} else
		return 0; /* nothing matched */
	return 1;

but I do not mind the above.  Each "block" that matches and handles
one key looks more indenendent the way the patch was written, which
may be a good thing.

> +	return 0;
> +}
> +
> +static int describe_atom_parser(struct ref_format *format UNUSED,
> +				struct used_atom *atom,
> +				const char *arg, struct strbuf *err)
> +{
> +	struct strvec args = STRVEC_INIT;

OK, parse_ref_fitler_atom() saw "%(describe", possibly followed by a
colon and zero or more comma-separated key[=val], and the location
after ':' (or NULL) is given to arg.  Specifically, %(describe) and
%(describe:) both pass NULL in arg.

> +	for (;;) {
> +		int found = 0;
> +		const char *bad_arg = NULL;
> +
> +		if (!arg || !*arg)
> +			break;

And we stop when there is no more key[=val].

> +		bad_arg = arg;
> +		found = describe_atom_option_parser(&args, &arg, err);

This one moves arg forward and arranges the next key[=val] to be seen
in the next iteration of this loop.  Makes sense.

In the remainder of the code changes, I saw nothing strange.
Quite well made.

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2e0318770b..395daf1b22 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -258,6 +258,29 @@  ahead-behind:<committish>::
 	commits ahead and behind, respectively, when comparing the output
 	ref to the `<committish>` specified in the format.
 
+describe[:options]:: Human-readable name, like
+		     link-git:git-describe[1]; empty string for
+		     undescribable commits. The `describe` string may be
+		     followed by a colon and zero or more comma-separated
+		     options. Descriptions can be inconsistent when tags
+		     are added or removed at the same time.
++
+--
+tags=<bool-value>;; Instead of only considering annotated tags, consider
+		    lightweight tags as well; see the corresponding option
+		    in linkgit:git-describe[1] for details.
+abbrev=<number>;; Use at least <number> hexadecimal digits; see
+		  the corresponding option in linkgit:git-describe[1]
+		  for details.
+match=<pattern>;; Only consider tags matching the given `glob(7)` pattern,
+		  excluding the "refs/tags/" prefix; see the corresponding
+		  option in linkgit:git-describe[1] for details.
+exclude=<pattern>;; Do not consider tags matching the given `glob(7)`
+		    pattern, excluding the "refs/tags/" prefix; see the
+		    corresponding option in linkgit:git-describe[1] for
+		    details.
+--
+
 In addition to the above, for commit and tag objects, the header
 field names (`tree`, `parent`, `object`, `type`, and `tag`) can
 be used to specify the value in the header field.
diff --git a/ref-filter.c b/ref-filter.c
index f64437e781..43316643be 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1,9 +1,11 @@ 
 #include "git-compat-util.h"
 #include "environment.h"
 #include "gettext.h"
+#include "config.h"
 #include "gpg-interface.h"
 #include "hex.h"
 #include "parse-options.h"
+#include "run-command.h"
 #include "refs.h"
 #include "wildmatch.h"
 #include "object-name.h"
@@ -145,6 +147,7 @@  enum atom_type {
 	ATOM_TAGGERDATE,
 	ATOM_CREATOR,
 	ATOM_CREATORDATE,
+	ATOM_DESCRIBE,
 	ATOM_SUBJECT,
 	ATOM_BODY,
 	ATOM_TRAILERS,
@@ -219,6 +222,7 @@  static struct used_atom {
 			enum { S_BARE, S_GRADE, S_SIGNER, S_KEY,
 			       S_FINGERPRINT, S_PRI_KEY_FP, S_TRUST_LEVEL } option;
 		} signature;
+		const char **describe_args;
 		struct refname_atom refname;
 		char *head;
 	} u;
@@ -554,6 +558,91 @@  static int contents_atom_parser(struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
+static int describe_atom_option_parser(struct strvec *args, const char **arg,
+				       struct strbuf *err)
+{
+	const char *argval;
+	size_t arglen = 0;
+	int optval = 0;
+
+	if (match_atom_bool_arg(*arg, "tags", arg, &optval)) {
+		if (!optval)
+			strvec_push(args, "--no-tags");
+		else
+			strvec_push(args, "--tags");
+		return 1;
+	}
+
+	if (match_atom_arg_value(*arg, "abbrev", arg, &argval, &arglen)) {
+		char *endptr;
+
+		if (!arglen)
+			return strbuf_addf_ret(err, -1,
+					       _("argument expected for %s"),
+					       "describe:abbrev");
+		if (strtol(argval, &endptr, 10) < 0)
+			return strbuf_addf_ret(err, -1,
+					       _("positive value expected %s=%s"),
+					       "describe:abbrev", argval);
+		if (endptr - argval != arglen)
+			return strbuf_addf_ret(err, -1,
+					       _("cannot fully parse %s=%s"),
+					       "describe:abbrev", argval);
+
+		strvec_pushf(args, "--abbrev=%.*s", (int)arglen, argval);
+		return 1;
+	}
+
+	if (match_atom_arg_value(*arg, "match", arg, &argval, &arglen)) {
+		if (!arglen)
+			return strbuf_addf_ret(err, -1,
+					       _("value expected %s="),
+					       "describe:match");
+
+		strvec_pushf(args, "--match=%.*s", (int)arglen, argval);
+		return 1;
+	}
+
+	if (match_atom_arg_value(*arg, "exclude", arg, &argval, &arglen)) {
+		if (!arglen)
+			return strbuf_addf_ret(err, -1,
+					       _("value expected %s="),
+					       "describe:exclude");
+
+		strvec_pushf(args, "--exclude=%.*s", (int)arglen, argval);
+		return 1;
+	}
+
+	return 0;
+}
+
+static int describe_atom_parser(struct ref_format *format UNUSED,
+				struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	struct strvec args = STRVEC_INIT;
+
+	for (;;) {
+		int found = 0;
+		const char *bad_arg = NULL;
+
+		if (!arg || !*arg)
+			break;
+
+		bad_arg = arg;
+		found = describe_atom_option_parser(&args, &arg, err);
+		if (found < 0)
+			return found;
+		if (!found) {
+			if (bad_arg && *bad_arg)
+				return err_bad_arg(err, "describe", bad_arg);
+			break;
+		}
+	}
+	atom->u.describe_args = strvec_detach(&args);
+	return 0;
+}
+
 static int raw_atom_parser(struct ref_format *format UNUSED,
 			   struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
@@ -756,6 +845,7 @@  static struct {
 	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
 	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
 	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_DESCRIBE] = { "describe", SOURCE_OBJ, FIELD_STR, describe_atom_parser },
 	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
@@ -1662,6 +1752,44 @@  static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 	}
 }
 
+static void grab_describe_values(struct atom_value *val, int deref,
+				 struct object *obj)
+{
+	struct commit *commit = (struct commit *)obj;
+	int i;
+
+	for (i = 0; i < used_atom_cnt; i++) {
+		struct used_atom *atom = &used_atom[i];
+		enum atom_type type = atom->atom_type;
+		const char *name = atom->name;
+		struct atom_value *v = &val[i];
+
+		struct child_process cmd = CHILD_PROCESS_INIT;
+		struct strbuf out = STRBUF_INIT;
+		struct strbuf err = STRBUF_INIT;
+
+		if (type != ATOM_DESCRIBE)
+			continue;
+
+		if (!!deref != (*name == '*'))
+			continue;
+
+		cmd.git_cmd = 1;
+		strvec_push(&cmd.args, "describe");
+		strvec_pushv(&cmd.args, atom->u.describe_args);
+		strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
+		if (pipe_command(&cmd, NULL, 0, &out, 0, &err, 0) < 0) {
+			error(_("failed to run 'describe'"));
+			v->s = xstrdup("");
+			continue;
+		}
+		strbuf_rtrim(&out);
+		v->s = strbuf_detach(&out, NULL);
+
+		strbuf_release(&err);
+	}
+}
+
 /* See grab_values */
 static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
@@ -1771,6 +1899,7 @@  static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		grab_tag_values(val, deref, obj);
 		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
+		grab_describe_values(val, deref, obj);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
@@ -1778,6 +1907,7 @@  static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		grab_signature(val, deref, obj);
+		grab_describe_values(val, deref, obj);
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 6e6ec852b5..4bbba76874 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -562,6 +562,120 @@  test_expect_success 'color.ui=always does not override tty check' '
 	test_cmp expected.bare actual
 '
 
+test_expect_success 'setup for describe atom tests' '
+	git init describe-repo &&
+	(
+		cd describe-repo &&
+
+		test_commit --no-tag one &&
+		git tag tagone &&
+
+		test_commit --no-tag two &&
+		git tag -a -m "tag two" tagtwo
+	)
+'
+
+test_expect_success 'describe atom vs git describe' '
+	(
+		cd describe-repo &&
+
+		git for-each-ref --format="%(objectname)" \
+			refs/tags/ >obj &&
+		while read hash
+		do
+			if desc=$(git describe $hash)
+			then
+				: >expect-contains-good
+			else
+				: >expect-contains-bad
+			fi &&
+			echo "$hash $desc" || return 1
+		done <obj >expect &&
+		test_path_exists expect-contains-good &&
+		test_path_exists expect-contains-bad &&
+
+		git for-each-ref --format="%(objectname) %(describe)" \
+			refs/tags/ >actual 2>err &&
+		test_cmp expect actual &&
+		test_must_be_empty err
+	)
+'
+
+test_expect_success 'describe:tags vs describe --tags' '
+	(
+		cd describe-repo &&
+		git describe --tags >expect &&
+		git for-each-ref --format="%(describe:tags)" \
+				refs/heads/master >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'describe:abbrev=... vs describe --abbrev=...' '
+	(
+		cd describe-repo &&
+
+		# Case 1: We have commits between HEAD and the most
+		#	  recent tag reachable from it
+		test_commit --no-tag file &&
+		git describe --abbrev=14 >expect &&
+		git for-each-ref --format="%(describe:abbrev=14)" \
+			refs/heads/master >actual &&
+		test_cmp expect actual &&
+
+		# Make sure the hash used is atleast 14 digits long
+		sed -e "s/^.*-g\([0-9a-f]*\)$/\1/" <actual >hexpart &&
+		test 15 -le $(wc -c <hexpart) &&
+
+		# Case 2: We have a tag at HEAD, describe directly gives
+		#	  the name of the tag
+		git tag -a -m tagged tagname &&
+		git describe --abbrev=14 >expect &&
+		git for-each-ref --format="%(describe:abbrev=14)" \
+			refs/heads/master >actual &&
+		test_cmp expect actual &&
+		test tagname = $(cat actual)
+	)
+'
+
+test_expect_success 'describe:match=... vs describe --match ...' '
+	(
+		cd describe-repo &&
+		git tag -a -m "tag foo" tag-foo &&
+		git describe --match "*-foo" >expect &&
+		git for-each-ref --format="%(describe:match="*-foo")" \
+			refs/heads/master >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'describe:exclude:... vs describe --exclude ...' '
+	(
+		cd describe-repo &&
+		git tag -a -m "tag bar" tag-bar &&
+		git describe --exclude "*-bar" >expect &&
+		git for-each-ref --format="%(describe:exclude="*-bar")" \
+			refs/heads/master >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'deref with describe atom' '
+	(
+		cd describe-repo &&
+		cat >expect <<-\EOF &&
+
+		tagname
+		tagname
+		tagname
+
+		tagtwo
+		EOF
+		git for-each-ref --format="%(*describe)" >actual &&
+		test_cmp expect actual
+	)
+'
+
 cat >expected <<\EOF
 heads/main
 tags/main

[v3,2/2] ref-filter: add new "describe" atom

Commit Message

Comments

Patch