
[v2,1/4] refs: introduce `is_pseudoref()` and `is_headref()`

Message ID 20240124152726.124873-2-karthik.188@gmail.com (mailing list archive)
State Superseded
Series for-each-ref: print all refs on empty string pattern

Commit Message

Karthik Nayak Jan. 24, 2024, 3:27 p.m. UTC
Introduce two new functions `is_pseudoref()` and `is_headref()`. This
provides the necessary functionality for us to add pseudorefs and HEAD
to the loose ref cache in the files backend, allowing us to build
tooling to print these refs.

The `is_pseudoref()` function internally calls `is_pseudoref_syntax()`
and additionally checks that the pseudoref either ends with a "_HEAD"
suffix or matches a list of exceptions. After that, we also parse the
contents of the pseudoref to ensure that they conform to the ref
format.

We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
because the function is also used by `is_current_worktree_ref()`, and
making it stricter to match only known pseudorefs might have unintended
consequences due to files like 'BISECT_START', which isn't a pseudoref
but sometimes contains an object ID.

Keeping this in mind, we leave `is_pseudoref_syntax()` as is and create
`is_pseudoref()`, which is stricter. Ideally we'd want to move the new
syntax checks to `is_pseudoref_syntax()`, but a prerequisite for this
would be to remove the exception list by converting those pseudorefs to
also carry a '_HEAD' suffix, and perhaps to move bisect-related files
like 'BISECT_START' to a new directory similar to the 'rebase-merge'
directory.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 refs.c | 32 ++++++++++++++++++++++++++++++++
 refs.h |  3 +++
 2 files changed, 35 insertions(+)
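
The name-level part of the check described in the commit message can be
sketched outside of Git as follows. This is a minimal, hypothetical
model and not the patch itself: `has_suffix()` stands in for Git's
`ends_with()`, `name_has_pseudoref_syntax()` assumes that
`is_pseudoref_syntax()` accepts only uppercase letters, '-' and '_',
and the ref store lookup that the real helper performs afterwards is
left out entirely.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Git's ends_with(). */
static int has_suffix(const char *s, const char *suffix)
{
	size_t len = strlen(s), suffix_len = strlen(suffix);

	return len >= suffix_len && !strcmp(s + len - suffix_len, suffix);
}

/* Assumption: pseudoref syntax means uppercase letters, '-' and '_' only. */
static int name_has_pseudoref_syntax(const char *name)
{
	const char *c;

	for (c = name; *c; c++)
		if (!(*c >= 'A' && *c <= 'Z') && *c != '-' && *c != '_')
			return 0;
	return 1;
}

/* Name-only classification; the real helper also consults the ref store. */
static int name_is_pseudoref(const char *name)
{
	static const char *const irregular[] = {
		"AUTO_MERGE",
		"BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL",
		"NOTES_MERGE_REF",
		"MERGE_AUTOSTASH",
	};
	size_t i;

	if (!name_has_pseudoref_syntax(name))
		return 0;
	if (has_suffix(name, "_HEAD"))
		return 1;
	for (i = 0; i < sizeof(irregular) / sizeof(irregular[0]); i++)
		if (!strcmp(name, irregular[i]))
			return 1;
	return 0;
}
```

Under these assumptions, `FETCH_HEAD` and `AUTO_MERGE` are accepted,
while `BISECT_START`, `HEAD` and `refs/heads/main` are rejected, which
is the behaviour the looser `is_pseudoref_syntax()` cannot provide on
its own.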

Comments

Junio C Hamano Jan. 24, 2024, 7:09 p.m. UTC | #1
Karthik Nayak <karthik.188@gmail.com> writes:

> We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
> because the function is also used by `is_current_worktree_ref()` and
> making it stricter to match only known pseudorefs might have unintended
> consequences due to files like 'BISECT_START' which isn't a pseudoref
> but sometimes contains object ID.

Well described.

> diff --git a/refs.c b/refs.c
> index 20e8f1ff1f..4b6bfc66fb 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -859,6 +859,38 @@ static int is_pseudoref_syntax(const char *refname)
>  	return 1;
>  }
>  
> +int is_pseudoref(struct ref_store *refs, const char *refname)
> +{
> +	static const char *const irregular_pseudorefs[] = {
> +		"AUTO_MERGE",
> +		"BISECT_EXPECTED_REV",
> +		"NOTES_MERGE_PARTIAL",
> +		"NOTES_MERGE_REF",
> +		"MERGE_AUTOSTASH"

Let's end an array's initializer with a trailing comma, to help
future patches to add entries to this array without unnecessary
patch noise. 

> +	};
> +	size_t i;
> +
> +	if (!is_pseudoref_syntax(refname))
> +		return 0;
> +
> +	if (ends_with(refname, "_HEAD"))
> +		return refs_ref_exists(refs, refname);
> +
> +	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
> +		 if (!strcmp(refname, irregular_pseudorefs[i]))
> +			 return refs_ref_exists(refs, refname);
> +
> +	return 0;
> +}

The above uses refs_ref_exists() because we want these to
successfully resolve for reading.

> +int is_headref(struct ref_store *refs, const char *refname)
> +{
> +	if (!strcmp(refname, "HEAD"))
> +		return refs_ref_exists(refs, refname);

Given that "git for-each-ref refs/remotes" does not show
"refs/remotes/origin/HEAD" in the output when we do not have the
remote-tracking branch that symref points at, we probably do want
to omit "HEAD" from the output when the HEAD symref points at an
unborn branch.  If refs_ref_exists() says "no, it does not exist"
in such a case, we are perfectly fine with this code.

We do not have to worry about the unborn state for pseudorefs
because they would never be symbolic.  But that in turn makes me
suspect that the check done with refs_ref_exists() in the
is_pseudoref() helper is a bit too lenient by allowing it to be a
symbolic ref.  Shouldn't we be using a check based on
read_ref_full(), like we did in another topic recently [*]?


[Reference]

 * https://lore.kernel.org/git/xmqqzfxa9usx.fsf@gitster.g/
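
The difference between a lenient existence check and a strict one can
be illustrated with a toy model that has nothing to do with Git's
actual ref backends. All names here (`struct toy_ref`, `toy_store`,
`toy_exists()`, `toy_is_regular()`) are hypothetical and the object IDs
are placeholders. `toy_exists()` follows symbolic refs until it finds
an object ID, loosely in the spirit of a check that resolves the ref
for reading, while `toy_is_regular()` refuses to recurse and rejects
symbolic refs outright.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a ref: either holds an object ID or points at another ref. */
struct toy_ref {
	const char *name;
	const char *symref_target; /* NULL for a regular ref */
	const char *oid;           /* NULL for a symbolic ref */
};

static const struct toy_ref toy_store[] = {
	{ "HEAD", "refs/heads/main", NULL },       /* unborn: target is absent */
	{ "FETCH_HEAD", NULL, "1111111" },         /* regular pseudoref */
	{ "ORIG_HEAD", "refs/heads/topic", NULL }, /* symref with a pseudoref name */
	{ "refs/heads/topic", NULL, "2222222" },
};

static const struct toy_ref *toy_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(toy_store) / sizeof(toy_store[0]); i++)
		if (!strcmp(toy_store[i].name, name))
			return &toy_store[i];
	return NULL;
}

/* Lenient: follow symrefs (up to a small depth) until an object ID is found. */
static int toy_exists(const char *name)
{
	int depth;

	for (depth = 0; depth < 5; depth++) {
		const struct toy_ref *ref = toy_lookup(name);

		if (!ref)
			return 0;
		if (!ref->symref_target)
			return 1;
		name = ref->symref_target;
	}
	return 0;
}

/* Strict: the ref itself must hold an object ID; symrefs are rejected. */
static int toy_is_regular(const char *name)
{
	const struct toy_ref *ref = toy_lookup(name);

	return ref && !ref->symref_target;
}
```

In this model the unborn `HEAD` is reported as missing by the lenient
check, which is the desired outcome for HEAD. `ORIG_HEAD`, deliberately
set up here as a symbolic ref, passes the lenient check but fails the
strict one, which is the kind of case a stricter pseudoref check would
be meant to catch.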

Karthik Nayak Jan. 25, 2024, 4:20 p.m. UTC | #2
Hello Junio,

Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> We cannot directly add the new syntax checks to `is_pseudoref_syntax()`
>> because the function is also used by `is_current_worktree_ref()` and
>> making it stricter to match only known pseudorefs might have unintended
>> consequences due to files like 'BISECT_START' which isn't a pseudoref
>> but sometimes contains object ID.
>
> Well described.
>
>> diff --git a/refs.c b/refs.c
>> index 20e8f1ff1f..4b6bfc66fb 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -859,6 +859,38 @@ static int is_pseudoref_syntax(const char *refname)
>>  	return 1;
>>  }
>>
>> +int is_pseudoref(struct ref_store *refs, const char *refname)
>> +{
>> +	static const char *const irregular_pseudorefs[] = {
>> +		"AUTO_MERGE",
>> +		"BISECT_EXPECTED_REV",
>> +		"NOTES_MERGE_PARTIAL",
>> +		"NOTES_MERGE_REF",
>> +		"MERGE_AUTOSTASH"
>
> Let's end an array's initializer with a trailing comma, to help
> future patches to add entries to this array without unnecessary
> patch noise.

Sure, will add!

>> +	};
>> +	size_t i;
>> +
>> +	if (!is_pseudoref_syntax(refname))
>> +		return 0;
>> +
>> +	if (ends_with(refname, "_HEAD"))
>> +		return refs_ref_exists(refs, refname);
>> +
>> +	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
>> +		 if (!strcmp(refname, irregular_pseudorefs[i]))
>> +			 return refs_ref_exists(refs, refname);
>> +
>> +	return 0;
>> +}
>
> The above uses refs_ref_exists() because we want these to
> successfully resolve for reading.
>
>> +int is_headref(struct ref_store *refs, const char *refname)
>> +{
>> +	if (!strcmp(refname, "HEAD"))
>> +		return refs_ref_exists(refs, refname);
>
> Given that "git for-each-ref refs/remotes" does not show
> "refs/remotes/origin/HEAD" in the output when we do not have the
> remote-tracking branch that symref points at, we probably do want
> to omit "HEAD" from the output when the HEAD symref points at an
> unborn branch.  If refs_ref_exists() says "no, it does not exist"
> in such a case, we are perfectly fine with this code.
>
> We do not have to worry about the unborn state for pseudorefs
> because they would never be symbolic.  But that in turn makes me
> suspect that the check done with refs_ref_exists() in the
> is_pseudoref() helper is a bit too lenient by allowing it to be a
> symbolic ref.  Shouldn't we be using a check based on
> read_ref_full(), like we did in another topic recently [*]?
>
>
> [Reference]
>
>  * https://lore.kernel.org/git/xmqqzfxa9usx.fsf@gitster.g/
>

Thanks, this makes sense and the link is helpful. I'll do something
similar, but since HEAD can be a symref, I'll drop the
`RESOLVE_REF_NO_RECURSE` flag and only use `RESOLVE_REF_READING`.

I'll wait a day or two, before sending in the new version with the
fixes. The current diff is

diff --git a/refs.c b/refs.c
index b5e63f133a..4a1fd30ef2 100644
--- a/refs.c
+++ b/refs.c
@@ -866,7 +866,7 @@ int is_pseudoref(struct ref_store *refs, const char *refname)
 		"BISECT_EXPECTED_REV",
 		"NOTES_MERGE_PARTIAL",
 		"NOTES_MERGE_REF",
-		"MERGE_AUTOSTASH"
+		"MERGE_AUTOSTASH",
 	};
 	size_t i;

@@ -885,10 +885,23 @@ int is_pseudoref(struct ref_store *refs, const char *refname)

 int is_headref(struct ref_store *refs, const char *refname)
 {
-	if (!strcmp(refname, "HEAD"))
-		return refs_ref_exists(refs, refname);
+	struct object_id oid;
+	int flag;

-	return 0;
+	if (strcmp(refname, "HEAD"))
+		return 0;
+
+	/*
+	 * If HEAD doesn't exist, we don't have to die, but rather,
+	 * we simply return 0.
+	 */
+	if (read_ref_full("HEAD", RESOLVE_REF_READING, &oid, &flag))
+		return 0;
+
+	if (is_null_oid(&oid))
+		return 0;
+
+	return 1;
 }

 static int is_current_worktree_ref(const char *ref) {

- Karthik

Junio C Hamano Jan. 25, 2024, 4:28 p.m. UTC | #3
Karthik Nayak <karthik.188@gmail.com> writes:

>>> +int is_headref(struct ref_store *refs, const char *refname)
>>> +{
>>> +	if (!strcmp(refname, "HEAD"))
>>> +		return refs_ref_exists(refs, refname);
>>
>> Given that "git for-each-ref refs/remotes" does not show
>> "refs/remotes/origin/HEAD" in the output when we do not have the
>> remote-tracking branch that symref points at, we probably do want
>> to omit "HEAD" from the output when the HEAD symref points at an
>> unborn branch.  If refs_ref_exists() says "no, it does not exist"
>> in such a case, we are perfectly fine with this code.
>>
>> We do not have to worry about the unborn state for pseudorefs
>> because they would never be symbolic.  But that in turn makes me
>> suspect that the check done with refs_ref_exists() in the
>> is_pseudoref() helper is a bit too lenient by allowing it to be a
>> symbolic ref.  Shouldn't we be using a check based on
>> read_ref_full(), like we did in another topic recently [*]?
>>
>>
>> [Reference]
>>
>>  * https://lore.kernel.org/git/xmqqzfxa9usx.fsf@gitster.g/
>>
>
> Thanks, this makes sense and the link is helpful. I'll do something
> similar, but since HEAD can be a symref, I'll drop the
> `RESOLVE_REF_NO_RECURSE` flag and only use `RESOLVE_REF_READING`.

Just to make sure there is no misunderstanding, I think how
is_headref() does what it does in the patch is perfectly fine,
including its use of refs_ref_exists().  The side I was referring to
with "in turn makes me suspect" is the other helper function that
will never have to deal with a symref.  Use of refs_ref_exists() in
that function is too loose.

Karthik Nayak Jan. 25, 2024, 9:48 p.m. UTC | #4
On Thu, Jan 25, 2024 at 5:28 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Karthik Nayak <karthik.188@gmail.com> writes:
>
> >>> +int is_headref(struct ref_store *refs, const char *refname)
> >>> +{
> >>> +   if (!strcmp(refname, "HEAD"))
> >>> +           return refs_ref_exists(refs, refname);
> >>
> >> Given that "git for-each-ref refs/remotes" does not show
> >> "refs/remotes/origin/HEAD" in the output when we do not have the
> >> remote-tracking branch that symref points at, we probably do want
> >> to omit "HEAD" from the output when the HEAD symref points at an
> >> unborn branch.  If refs_ref_exists() says "no, it does not exist"
> >> in such a case, we are perfectly fine with this code.
> >>
> >> We do not have to worry about the unborn state for pseudorefs
> >> because they would never be symbolic.  But that in turn makes me
> >> suspect that the check done with refs_ref_exists() in the
> >> is_pseudoref() helper is a bit too lenient by allowing it to be a
> >> symbolic ref.  Shouldn't we be using a check based on
> >> read_ref_full(), like we did in another topic recently [*]?
> >>
> >>
> >> [Reference]
> >>
> >>  * https://lore.kernel.org/git/xmqqzfxa9usx.fsf@gitster.g/
> >>
> >
> > Thanks, this makes sense and the link is helpful. I'll do something
> > similar, but since HEAD can be a symref, I'll drop the
> > `RESOLVE_REF_NO_RECURSE` flag and only use `RESOLVE_REF_READING`.
>
> Just to make sure there is no misunderstanding, I think how
> is_headref() does what it does in the patch is perfectly fine,
> including its use of refs_ref_exists().  The side I was referring to
> with "in turn makes me suspect" is the other helper function that
> will never have to deal with a symref.  Use of refs_ref_exists() in
> that function is too loose.
>

AH! Totally misunderstood, thanks for reiterating.

Patch

diff --git a/refs.c b/refs.c
index 20e8f1ff1f..4b6bfc66fb 100644
--- a/refs.c
+++ b/refs.c
@@ -859,6 +859,38 @@ static int is_pseudoref_syntax(const char *refname)
 	return 1;
 }
 
+int is_pseudoref(struct ref_store *refs, const char *refname)
+{
+	static const char *const irregular_pseudorefs[] = {
+		"AUTO_MERGE",
+		"BISECT_EXPECTED_REV",
+		"NOTES_MERGE_PARTIAL",
+		"NOTES_MERGE_REF",
+		"MERGE_AUTOSTASH"
+	};
+	size_t i;
+
+	if (!is_pseudoref_syntax(refname))
+		return 0;
+
+	if (ends_with(refname, "_HEAD"))
+		return refs_ref_exists(refs, refname);
+
+	for (i = 0; i < ARRAY_SIZE(irregular_pseudorefs); i++)
+		 if (!strcmp(refname, irregular_pseudorefs[i]))
+			 return refs_ref_exists(refs, refname);
+
+	return 0;
+}
+
+int is_headref(struct ref_store *refs, const char *refname)
+{
+	if (!strcmp(refname, "HEAD"))
+		return refs_ref_exists(refs, refname);
+
+	return 0;
+}
+
 static int is_current_worktree_ref(const char *ref) {
 	return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
 }
diff --git a/refs.h b/refs.h
index 11b3b6ccea..46b8085d63 100644
--- a/refs.h
+++ b/refs.h
@@ -1021,4 +1021,7 @@ extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
  */
 void update_ref_namespace(enum ref_namespace namespace, char *ref);
 
+int is_pseudoref(struct ref_store *refs, const char *refname);
+int is_headref(struct ref_store *refs, const char *refname);
+
 #endif /* REFS_H */