diff mbox series

[v2,2/2,GSOC] ref-filter: introduce enum atom_type

Message ID a1f70b39b7efbadff9e2202968dd1ca65ea3c1b4.1620659000.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Series ref-filter: introduce enum atom_type

Commit Message

ZheNing Hu May 10, 2021, 3:03 p.m. UTC
From: ZheNing Hu <adlternative@gmail.com>

In the original ref-filter design, the atom parsing step copies
the parsed atom's name and attributes to `used_atom[i].name`,
and the later step that fills in specific ref attributes matches
against that string again. It uses a lot of string matching to
determine which atom we need.

Introduce the enum "atom_type". Each enum value is named
`ATOM_*` and is the index of the corresponding valid_atom
entry. In the first step of atom parsing,
`used_atom.atom_type` records the enum value taken from the
valid_atom entry index; the specific reference attribute
filling step then only needs to compare the value of
`used_atom.atom_type` to determine the atom type.

The enum value of `ATOM_UNKNOWN` is equal to zero, which
ensures that we can easily distinguish a struct whose
atom_type is known from one whose atom_type is not yet
known.

The enum value of `ATOM_INVALID` is equal to the size of
the valid_atom array, which helps us iterate over the
valid_atom array using something like:

for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
        /* do something with valid_atom[i] */;

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 214 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 143 insertions(+), 71 deletions(-)
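
Editorial sketch, not part of the patch: the scheme described in the commit message can be illustrated with a three-atom table. All `DEMO_*` and `demo_*` names are hypothetical stand-ins for the real `ATOM_*` values and `valid_atom[]`, and the table holds only names rather than the full entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-atom subset; the real list is in the patch. */
enum demo_atom_type {
	DEMO_ATOM_UNKNOWN,	/* 0: value of a zero-initialized field */
	DEMO_ATOM_REFNAME,
	DEMO_ATOM_OBJECTTYPE,
	DEMO_ATOM_SYMREF,
	DEMO_ATOM_INVALID	/* one past the last real atom */
};

/* Designated initializers tie each entry to its enum value. */
static const char *demo_valid_atom[] = {
	[DEMO_ATOM_REFNAME] = "refname",
	[DEMO_ATOM_OBJECTTYPE] = "objecttype",
	[DEMO_ATOM_SYMREF] = "symref",
};

/* Parsing step: the only place that compares strings. */
static enum demo_atom_type demo_parse_atom(const char *sp, size_t atom_len)
{
	int i;

	assert(sp);
	for (i = DEMO_ATOM_UNKNOWN + 1; i < DEMO_ATOM_INVALID; i++) {
		size_t len = strlen(demo_valid_atom[i]);
		if (len == atom_len && !memcmp(demo_valid_atom[i], sp, len))
			break;
	}
	return i;	/* DEMO_ATOM_INVALID when nothing matched */
}

/* Filling step: an integer comparison replaces strcmp()/starts_with(). */
static int demo_needs_symref(enum demo_atom_type type)
{
	return type == DEMO_ATOM_SYMREF;
}
```

For example, `demo_parse_atom("refname:short", 7)` matches only the seven-byte atom name and yields `DEMO_ATOM_REFNAME`, after which later code never has to look at the string again.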

Comments

Junio C Hamano May 11, 2021, 2:14 a.m. UTC | #1
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> the enum value of `ATOM_UNKNOWN` is equals to zero, which

s/the/The/;

> could ensure that we can easily distinguish such a struct
> where the atom_type is known from such a struct where it
> is unknown yet.
>
> the enum value of `ATOM_INVALID` is equals to the size of

Ditto.

> +/*
> + * The enum atom_type is used as the coordinates of valid_atom entry.
> + * In the atom parsing stage, it will be passed to used_atom.atom_type
> + * as the identifier of the atom type. We can judge the type of used_atom
> + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
> + *
> + * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
> + * atom_type.

Shouldn't it be (-1)?

And I'd assume I am right in the following.

> + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> + * iterate over valid_atom array like this:
> + *
> + * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {

I find it far more intuitive to say

	for (i = 0; i < ATOM_INVALID; i++)

than having to say UNKNOWN+1.

In any case, the values should be indented, and a comment should
ensure that the final one stays at the end, perhaps like this.

	enum atom_type {
		ATOM_INVALID = -2,
		ATOM_UNKNOWN = -1,
		ATOM_REFNAME,
		...
		ATOM_ELSE,
		ATOM_MAX /* MUST BE AT THE END */
	}

(note that the trailing comma is deliberately omitted).

It would allow people to say

	for (i = 0; i < ATOM_MAX; i++)

instead, which would be even nicer.
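
Editorial sketch: the layout suggested above, with negative sentinels and a trailing `ATOM_MAX`, might look like this on a three-atom table. The `NEG_*` and `neg_*` names are hypothetical; only the shape of the enum comes from the review.

```c
#include <assert.h>
#include <string.h>

enum neg_atom_type {
	NEG_ATOM_INVALID = -2,
	NEG_ATOM_UNKNOWN = -1,
	NEG_ATOM_REFNAME,	/* 0, so it indexes the first table entry */
	NEG_ATOM_SYMREF,
	NEG_ATOM_ELSE,
	NEG_ATOM_MAX /* MUST BE AT THE END */
};

static const char *neg_valid_atom[NEG_ATOM_MAX] = {
	[NEG_ATOM_REFNAME] = "refname",
	[NEG_ATOM_SYMREF] = "symref",
	[NEG_ATOM_ELSE] = "else",
};

/* Plain 0..MAX iteration; the sentinels never appear as indices. */
static enum neg_atom_type neg_find_atom(const char *name)
{
	int i;

	assert(name);
	for (i = 0; i < NEG_ATOM_MAX; i++)
		if (!strcmp(neg_valid_atom[i], name))
			return i;
	return NEG_ATOM_INVALID;
}
```

Because the sentinels sit below zero, the real atoms occupy exactly the indices 0 to `NEG_ATOM_MAX - 1`, and the table has no unused slot.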

Christian Couder May 11, 2021, 5:51 a.m. UTC | #2
On Tue, May 11, 2021 at 4:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> > +/*
> > + * The enum atom_type is used as the coordinates of valid_atom entry.
> > + * In the atom parsing stage, it will be passed to used_atom.atom_type
> > + * as the identifier of the atom type. We can judge the type of used_atom
> > + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
> > + *
> > + * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
> > + * atom_type.
>
> Shouldn't it be (-1)?

If it's -1 instead of 0, then it might be a bit more complex to
initialize structs that contain such a field, as it cannot be done
with only xcalloc().
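
Editorial sketch of the initialization concern: the two allocators below contrast an enum whose "unknown" value is 0 with one where it is -1. Standard `calloc()` stands in for git's `xcalloc()`, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum zu_atom_type { ZU_UNKNOWN, ZU_REFNAME };		/* UNKNOWN == 0 */
enum mu_atom_type { MU_UNKNOWN = -1, MU_REFNAME };	/* REFNAME == 0 */

struct zu_used_atom { enum zu_atom_type atom_type; };
struct mu_used_atom { enum mu_atom_type atom_type; };

/* calloc() alone is enough: all-zero bytes already mean "unknown". */
static struct zu_used_atom *zu_alloc(size_t n)
{
	struct zu_used_atom *atoms = calloc(n, sizeof(*atoms));

	assert(atoms);
	return atoms;
}

/*
 * With UNKNOWN == -1 every element needs an explicit store; without
 * the loop a fresh element would read as MU_REFNAME.
 */
static struct mu_used_atom *mu_alloc(size_t n)
{
	struct mu_used_atom *atoms = calloc(n, sizeof(*atoms));
	size_t i;

	assert(atoms);
	for (i = 0; i < n; i++)
		atoms[i].atom_type = MU_UNKNOWN;
	return atoms;
}
```

The extra loop is small, but it has to be repeated at every site that allocates such structs, which is the cost being weighed here.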

> And I'd assume I am right in the following.
>
> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > + * iterate over valid_atom array like this:
> > + *
> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>
> I find it far more intuitive to say
>
>         for (i = 0; i < ATOM_INVALID; i++)
>
> than having to say UNKNOWN+1.

Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
1` instead of `0` at least shouldn't often result in more lines of
code, and should be a bit easier to get right, compared to having to
initialize the field with ATOM_UNKNOWN.

> In any case, the values should be indented, and a comment should
> ensure that the final one stays at the end, perhaps like this.
>
>         enum atom_type {
>                 ATOM_INVALID = -2,
>                 ATOM_UNKNOWN = -1,
>                 ATOM_REFNAME,
>                 ...
>                 ATOM_ELSE,
>                 ATOM_MAX /* MUST BE AT THE END */

I agree that a comment telling people that it must be at the end is good.

>         }
>
> (note that the trailing comma is deliberately omitted).

Yeah.

> It would allow people to say
>
>         for (i = 0; i < ATOM_MAX; i++)
>
> instead, which would be even nicer.

Yeah, it's also a tradeoff to have the last one called ATOM_MAX
instead of ATOM_INVALID, and to have a separate ATOM_INVALID if it's
needed.

Junio C Hamano May 11, 2021, 6:12 a.m. UTC | #3
Christian Couder <christian.couder@gmail.com> writes:

>> I find it far more intuitive to say
>>
>>         for (i = 0; i < ATOM_INVALID; i++)
>>
>> than having to say UNKNOWN+1.
>
> Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> 1` instead of `0` at least shouldn't often result in more lines of
> code, and should be a bit easier to get right, compared to having to
> initialize the field with ATOM_UNKNOWN.

Number of lines is not all that important.

But the developers must remember that UNKNOWN is at the bottom end
and INVALID is at the top end, which is very taxing.  Tying UNKNOWN
to the top end and INVALID to the bottom end would equally be
plausible and there is no memory aid to help us remember which one
is which.  Compare it to "array indices begin at 0, and the upper
end is MAX".  Your scheme is much easier for developers to screw up.

ZheNing Hu May 11, 2021, 12:18 p.m. UTC | #4
> And I'd assume I am right in the following.
>
> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > + * iterate over valid_atom array like this:
> > + *
> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>
> I find it far more intuitive to say
>
>         for (i = 0; i < ATOM_INVALID; i++)
>
> than having to say UNKNOWN+1.
>
> In any case, the values should be indented, and a comment should
> ensure that the final one stays at the end, perhaps like this.
>
>         enum atom_type {
>                 ATOM_INVALID = -2,
>                 ATOM_UNKNOWN = -1,
>                 ATOM_REFNAME,
>                 ...
>                 ATOM_ELSE,
>                 ATOM_MAX /* MUST BE AT THE END */
>         }
>
> (note that the trailing comma is deliberately omitted).
>
> It would allow people to say
>
>         for (i = 0; i < ATOM_MAX; i++)
>
> instead, which would be even nicer.

I think ATOM_INVALID and ATOM_MAX would have a
similar effect. Why don't we omit one of them?

For the time being, every used_atom entry is created in
`parse_ref_filter_atom()`, where we directly set
`used_atom[at].atom_type = i;` after reallocating used_atom.
There is no point at which "ATOM_UNKNOWN" is used at all.

I don't know if it makes much sense to use "ATOM_UNKNOWN"
at the moment.

--
ZheNing Hu

ZheNing Hu May 11, 2021, 12:37 p.m. UTC | #5
On Tue, May 11, 2021 at 1:51 PM Christian Couder <christian.couder@gmail.com> wrote:
>
> >
> > Shouldn't it be (-1)?
>
> If it's -1 instead of 0, then it might be a bit more complex to
> initialize structs that contain such a field, as it cannot be done
> with only xcalloc().
>

I agree. If the traversal starts from 0, a zero-initialized atom_type
will hold the junk value "ATOM_REFNAME". Making users manually reset it
to ATOM_UNKNOWN seems very troublesome.

> > And I'd assume I am right in the following.
> >
> > > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > > + * iterate over valid_atom array like this:
> > > + *
> > > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
> >
> > I find it far more intuitive to say
> >
> >         for (i = 0; i < ATOM_INVALID; i++)
> >
> > than having to say UNKNOWN+1.
>
> Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> 1` instead of `0` at least shouldn't often result in more lines of
> code, and should be a bit easier to get right, compared to having to
> initialize the field with ATOM_UNKNOWN.
>

There will be a trade-off: either traverse from 0, or avoid having to
set the initialized atom_type to UNKNOWN.

>
> > It would allow people to say
> >
> >         for (i = 0; i < ATOM_MAX; i++)
> >
> > instead, which would be even nicer.
>
> Yeah, it's also a tradeoff to have the last one called ATOM_MAX
> instead of ATOM_INVALID, and to have a separate ATOM_INVALID if it's
> needed.

About ATOM_MAX or ATOM_INVALID, I have an idea:

enum atom_type {
ATOM_UNKNOWN,
...
ATOM_ELSE,
ATOM_INVALID,
+ATOM_MAX = ATOM_INVALID
 };

This might be able to do both.
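
Editorial sketch of the aliasing idea above: C allows two enumerators to share one value, so a trailing alias adds a second name without adding a table slot. The `ALIAS_*` names and the counting helper are hypothetical.

```c
#include <assert.h>

enum alias_atom_type {
	ALIAS_ATOM_UNKNOWN,
	ALIAS_ATOM_REFNAME,
	ALIAS_ATOM_ELSE,
	ALIAS_ATOM_INVALID,
	ALIAS_ATOM_MAX = ALIAS_ATOM_INVALID	/* alias, not a new value */
};

/* Counts the real atoms by walking the range between the two ends. */
static int alias_count_atoms(void)
{
	int i, n = 0;

	for (i = ALIAS_ATOM_UNKNOWN + 1; i < ALIAS_ATOM_MAX; i++)
		n++;
	assert(n == ALIAS_ATOM_INVALID - 1);
	return n;
}
```

One consequence of sharing a value is that both names cannot be used as `case` labels in the same `switch`, since duplicate case values are rejected by the compiler.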

Thanks.
--
ZheNing Hu

ZheNing Hu May 11, 2021, 12:53 p.m. UTC | #6
On Tue, May 11, 2021 at 2:12 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> >> I find it far more intuitive to say
> >>
> >>         for (i = 0; i < ATOM_INVALID; i++)
> >>
> >> than having to say UNKNOWN+1.
> >
> > Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> > 1` instead of `0` at least shouldn't often result in more lines of
> > code, and should be a bit easier to get right, compared to having to
> > initialize the field with ATOM_UNKNOWN.
>
> Number of lines is not all that important.
>
> But the developers must remember that UNKNOWN is at the bottom end
> and INVALID is at the top end, which is very taxing.  Tying UNKNOWN
> to the top end and INVALID to the bottom end would equally be
> plausible and there is no memory aid to help us remember which one
> is which.  Compare it to "array indices begin at 0, and the upper
> end is MAX".  Your scheme is much easier for developers to screw up.
>

Yes, UNKNOWN + 1 is difficult to use. But with UNKNOWN = -1,
a coder may unknowingly use a zero-initialized atom_type holding
the junk value "ATOM_REFNAME"; they may not notice that they need
to reinitialize the value to UNKNOWN.

I thought that perhaps such a naming would be better:

ATOM_BEGIN = ATOM_UNKNOWN + 1,
ATOM_END = ATOM_INVALID

       for (i = ATOM_BEGIN; i < ATOM_END; i++) {
       }

But ATOM_END has been used...

--
ZheNing Hu

Junio C Hamano May 11, 2021, 1:05 p.m. UTC | #7
Christian Couder <christian.couder@gmail.com> writes:

> If it's -1 instead of 0, then it might be a bit more complex to
> initialize structs that contain such a field, as it cannot be done
> with only xcalloc().

In general, yes, but I would have thought that the codepath that
allocates used_atom[] elements is pretty isolated---it is not like
there are random xcalloc() all over the code.  In fact, there is
only one "used_atom_cnt++" in the whole file, which means there is
only one place that (re)allocates the array.

	...
	at = used_atom_cnt;
	used_atom_cnt++;
	REALLOC_ARRAY(used_atom, used_atom_cnt);
	used_atom[at].name = xmemdupz(atom, ep - atom);
	used_atom[at].type = valid_atom[i].cmp_type;
	used_atom[at].source = valid_atom[i].source;
	...

So, I do not think there is even any need to worry about "initialize
to invalid and fill it in as we discover what it really is"; if
there were such a use pattern, UNKNOWN would be handy, but that is
not what we are dealing with here.  In the above snippet, we already
have found from which valid_atom[] element to instantiate the new
element in used_atom[] array.
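
Editorial sketch of the point above: when an array grows at a single site and the type is already known there, the field can be filled in immediately and no "unknown" state is ever observable. Standard `realloc()` stands in for git's `REALLOC_ARRAY()`, and the `grow_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum grow_atom_type { GROW_ATOM_REFNAME, GROW_ATOM_SYMREF };

struct grow_used_atom { enum grow_atom_type atom_type; };

static struct grow_used_atom *grow_used;
static size_t grow_used_cnt;

/* Appends one element and returns its index; the type is known up front. */
static size_t grow_add_atom(enum grow_atom_type found)
{
	size_t at = grow_used_cnt;
	struct grow_used_atom *p;

	p = realloc(grow_used, (grow_used_cnt + 1) * sizeof(*grow_used));
	assert(p);	/* git's REALLOC_ARRAY() dies on failure instead */
	grow_used = p;
	grow_used_cnt++;
	grow_used[at].atom_type = found;	/* never left uninitialized */
	return at;
}
```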

>> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
>> > + * iterate over valid_atom array like this:
>> > + *
>> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>>
>> I find it far more intuitive to say
>>
>>         for (i = 0; i < ATOM_INVALID; i++)
>>
>> than having to say UNKNOWN+1.

And here I was being embarrassingly silly.

As long as we do not waste any entry in valid_atom[] with leading
gap, trailing gap or gap in the middle, the way to iterate over such
an array is

	for (i = 0; i < ARRAY_SIZE(valid_atom); i++)

hence, there is no need for ATOM_MAX, and no need to burden us to
remember that UNKNOWN is near the bottom of the range, and INVALID
is near the top of the range.
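
Editorial sketch of the conclusion above: when the first real atom is 0 and the table has no gaps, the array size itself bounds the loop and no sentinel enumerator is needed. `TBL_ARRAY_SIZE` is a plain sizeof-based stand-in for git's `ARRAY_SIZE()`, and the `TBL_*` and `tbl_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TBL_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

enum tbl_atom_type {
	TBL_ATOM_REFNAME,	/* first atom is 0: no leading gap */
	TBL_ATOM_SYMREF,
	TBL_ATOM_ELSE
};

static const char *tbl_valid_atom[] = {
	[TBL_ATOM_REFNAME] = "refname",
	[TBL_ATOM_SYMREF] = "symref",
	[TBL_ATOM_ELSE] = "else",
};

/* Returns the index of the matching entry, or the array size if none. */
static size_t tbl_find_atom(const char *sp, size_t atom_len)
{
	size_t i;

	assert(sp);
	for (i = 0; i < TBL_ARRAY_SIZE(tbl_valid_atom); i++) {
		size_t len = strlen(tbl_valid_atom[i]);
		if (len == atom_len && !memcmp(tbl_valid_atom[i], sp, len))
			break;
	}
	return i;
}
```

A caller detects a miss with the same test the pre-patch code used, `TBL_ARRAY_SIZE(tbl_valid_atom) <= i`, so adding an atom only requires a new enumerator and a new table entry.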

Patch

diff --git a/ref-filter.c b/ref-filter.c
index f420bae6e5ba..e3bf2cd1aaec 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -108,6 +108,69 @@  static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+/*
+ * The enum atom_type is used as the coordinates of valid_atom entry.
+ * In the atom parsing stage, it will be passed to used_atom.atom_type
+ * as the identifier of the atom type. We can judge the type of used_atom
+ * entry by `if (used_atom[i].atom_type == ATOM_*)`.
+ *
+ * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
+ * atom_type.
+ * ATOM_INVALID equals to the size of valid_atom array, which could help us
+ * iterate over valid_atom array like this:
+ *
+ * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
+ *		int len = strlen(valid_atom[i].name);
+ *		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
+ *			break;
+ *	}
+ */
+enum atom_type {
+ATOM_UNKNOWN,
+ATOM_REFNAME,
+ATOM_OBJECTTYPE,
+ATOM_OBJECTSIZE,
+ATOM_OBJECTNAME,
+ATOM_DELTABASE,
+ATOM_TREE,
+ATOM_PARENT,
+ATOM_NUMPARENT,
+ATOM_OBJECT,
+ATOM_TYPE,
+ATOM_TAG,
+ATOM_AUTHOR,
+ATOM_AUTHORNAME,
+ATOM_AUTHOREMAIL,
+ATOM_AUTHORDATE,
+ATOM_COMMITTER,
+ATOM_COMMITTERNAME,
+ATOM_COMMITTEREMAIL,
+ATOM_COMMITTERDATE,
+ATOM_TAGGER,
+ATOM_TAGGERNAME,
+ATOM_TAGGEREMAIL,
+ATOM_TAGGERDATE,
+ATOM_CREATOR,
+ATOM_CREATORDATE,
+ATOM_SUBJECT,
+ATOM_BODY,
+ATOM_TRAILERS,
+ATOM_CONTENTS,
+ATOM_UPSTREAM,
+ATOM_PUSH,
+ATOM_SYMREF,
+ATOM_FLAG,
+ATOM_HEAD,
+ATOM_COLOR,
+ATOM_WORKTREEPATH,
+ATOM_ALIGN,
+ATOM_END,
+ATOM_IF,
+ATOM_THEN,
+ATOM_ELSE,
+ATOM_INVALID,
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -119,6 +182,7 @@  static struct ref_to_worktree_map {
  * array.
  */
 static struct used_atom {
+	enum atom_type atom_type;
 	const char *name;
 	cmp_type type;
 	info_source source;
@@ -506,47 +570,47 @@  static struct {
 	int (*parser)(const struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
-	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
-	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
-	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
-	{ "object", SOURCE_OBJ },
-	{ "type", SOURCE_OBJ },
-	{ "tag", SOURCE_OBJ },
-	{ "author", SOURCE_OBJ },
-	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "authordate", SOURCE_OBJ, FIELD_TIME },
-	{ "committer", SOURCE_OBJ },
-	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "tagger", SOURCE_OBJ },
-	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "creator", SOURCE_OBJ },
-	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
-	{ "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
-	{ "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
-	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
-	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
-	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "flag", SOURCE_NONE },
-	{ "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
-	{ "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
-	{ "worktreepath", SOURCE_NONE },
-	{ "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
-	{ "end", SOURCE_NONE },
-	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
-	{ "then", SOURCE_NONE },
-	{ "else", SOURCE_NONE },
+	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+	[ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
+	[ATOM_OBJECTNAME] = { "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
+	[ATOM_DELTABASE] = { "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
+	[ATOM_TREE] = { "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_PARENT] = { "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_NUMPARENT] = { "numparent", SOURCE_OBJ, FIELD_ULONG },
+	[ATOM_OBJECT] = { "object", SOURCE_OBJ },
+	[ATOM_TYPE] = { "type", SOURCE_OBJ },
+	[ATOM_TAG] = { "tag", SOURCE_OBJ },
+	[ATOM_AUTHOR] = { "author", SOURCE_OBJ },
+	[ATOM_AUTHORNAME] = { "authorname", SOURCE_OBJ },
+	[ATOM_AUTHOREMAIL] = { "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_AUTHORDATE] = { "authordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_COMMITTER] = { "committer", SOURCE_OBJ },
+	[ATOM_COMMITTERNAME] = { "committername", SOURCE_OBJ },
+	[ATOM_COMMITTEREMAIL] = { "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_COMMITTERDATE] = { "committerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_TAGGER] = { "tagger", SOURCE_OBJ },
+	[ATOM_TAGGERNAME] = { "taggername", SOURCE_OBJ },
+	[ATOM_TAGGEREMAIL] = { "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
+	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_FLAG] = { "flag", SOURCE_NONE },
+	[ATOM_HEAD] = { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+	[ATOM_COLOR] = { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+	[ATOM_WORKTREEPATH] = { "worktreepath", SOURCE_NONE },
+	[ATOM_ALIGN] = { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+	[ATOM_END] = { "end", SOURCE_NONE },
+	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+	[ATOM_THEN] = { "then", SOURCE_NONE },
+	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -610,13 +674,13 @@  static int parse_ref_filter_atom(const struct ref_format *format,
 	atom_len = (arg ? arg : ep) - sp;
 
 	/* Is the atom a valid one? */
-	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
+	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
 		int len = strlen(valid_atom[i].name);
 		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
 			break;
 	}
 
-	if (ARRAY_SIZE(valid_atom) <= i)
+	if (i == ATOM_INVALID)
 		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
 				       (int)(ep-atom), atom);
 	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
@@ -628,6 +692,7 @@  static int parse_ref_filter_atom(const struct ref_format *format,
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
+	used_atom[at].atom_type = i;
 	used_atom[at].name = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
@@ -652,7 +717,7 @@  static int parse_ref_filter_atom(const struct ref_format *format,
 		return -1;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(valid_atom[i].name, "symref"))
+	if (i == ATOM_SYMREF)
 		need_symref = 1;
 	return at;
 }
@@ -965,14 +1030,15 @@  static void grab_common_values(struct atom_value *val, int deref, struct expand_
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "objecttype"))
+		if (atom_type == ATOM_OBJECTTYPE)
 			v->s = xstrdup(type_name(oi->type));
-		else if (starts_with(name, "objectsize")) {
+		else if (atom_type == ATOM_OBJECTSIZE) {
 			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
 				v->value = oi->disk_size;
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
@@ -980,9 +1046,9 @@  static void grab_common_values(struct atom_value *val, int deref, struct expand_
 				v->value = oi->size;
 				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
 			}
-		} else if (!strcmp(name, "deltabase"))
+		} else if (atom_type == ATOM_DELTABASE)
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
-		else if (deref)
+		else if (atom_type == ATOM_OBJECTNAME && deref)
 			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
@@ -995,16 +1061,17 @@  static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tag"))
+		if (atom_type == ATOM_TAG)
 			v->s = xstrdup(tag->tag);
-		else if (!strcmp(name, "type") && tag->tagged)
+		else if (atom_type == ATOM_TYPE && tag->tagged)
 			v->s = xstrdup(type_name(tag->tagged->type));
-		else if (!strcmp(name, "object") && tag->tagged)
+		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
 }
@@ -1017,18 +1084,20 @@  static void grab_commit_values(struct atom_value *val, int deref, struct object
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+		if (atom_type == ATOM_TREE &&
+		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
-		if (!strcmp(name, "numparent")) {
+		if (atom_type == ATOM_NUMPARENT) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (starts_with(name, "parent")) {
+		else if (atom_type == ATOM_PARENT) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
@@ -1208,15 +1277,16 @@  static void grab_person(const char *who, struct atom_value *val, int deref, void
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
-		if (starts_with(name, "creatordate"))
+		if (atom_type == ATOM_CREATORDATE)
 			grab_date(wholine, v, name);
-		else if (!strcmp(name, "creator"))
+		else if (atom_type == ATOM_CREATOR)
 			v->s = copy_line(wholine);
 	}
 }
@@ -1696,6 +1766,7 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
+		enum atom_type atom_type = atom->atom_type;
 		const char *name = used_atom[i].name;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
@@ -1710,18 +1781,18 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (starts_with(name, "refname"))
+		if (atom_type == ATOM_REFNAME)
 			refname = get_refname(atom, ref);
-		else if (!strcmp(name, "worktreepath")) {
+		else if (atom_type == ATOM_WORKTREEPATH) {
 			if (ref->kind == FILTER_REFS_BRANCHES)
 				v->s = get_worktree_path(atom, ref);
 			else
 				v->s = xstrdup("");
 			continue;
 		}
-		else if (starts_with(name, "symref"))
+		else if (atom_type == ATOM_SYMREF)
 			refname = get_symref(atom, ref);
-		else if (starts_with(name, "upstream")) {
+		else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1737,7 +1808,7 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
-		} else if (atom->u.remote_ref.push) {
+		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1756,10 +1827,10 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
 			continue;
-		} else if (starts_with(name, "color:")) {
+		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
 			continue;
-		} else if (!strcmp(name, "flag")) {
+		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
 				cp = copy_advance(cp, ",symref");
@@ -1772,23 +1843,24 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
-			continue;
-		} else if (!strcmp(name, "HEAD")) {
+		} else if (!deref && atom_type == ATOM_OBJECTNAME &&
+			   grab_oid(name, "objectname", &ref->objectname, v, atom)) {
+				continue;
+		} else if (atom_type == ATOM_HEAD) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
 				v->s = xstrdup("*");
 			else
 				v->s = xstrdup(" ");
 			continue;
-		} else if (starts_with(name, "align")) {
+		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "end")) {
+		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (starts_with(name, "if")) {
+		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
 				v->s = xstrdup(s);
@@ -1796,11 +1868,11 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
 			continue;
-		} else if (!strcmp(name, "then")) {
+		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "else")) {
+		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;