diff mbox series

[v13,2/9] refs: standardize output of refs_read_symbolic_ref

Message ID 20241118151755.756265-3-bence@ferdinandy.com (mailing list archive)
State New
Headers show
Series set-head/fetch remote/HEAD updates | expand

Commit Message

Bence Ferdinandy Nov. 18, 2024, 3:09 p.m. UTC
When the symbolic reference we want to read with refs_read_symbolic_ref
is actually not a symbolic reference, the files and the reftable
backends return different values (1 and -1 respectively). Standardize
the returned values so that 0 is success, -1 is a generic error and -2
is that the reference was actually non-symbolic.

Signed-off-by: Bence Ferdinandy <bence@ferdinandy.com>
---

Notes:
    v13: new patch

 refs.h                  | 6 ++++++
 refs/files-backend.c    | 7 +++----
 refs/refs-internal.h    | 6 ++++++
 refs/reftable-backend.c | 4 +++-
 4 files changed, 18 insertions(+), 5 deletions(-)

Comments

Junio C Hamano Nov. 19, 2024, 1:22 a.m. UTC | #1
Bence Ferdinandy <bence@ferdinandy.com> writes:

> Subject: Re: [PATCH v13 2/9] refs: standardize output of refs_read_symbolic_ref

"output" -> "return values", or something.

> When the symbolic reference we want to read with refs_read_symbolic_ref
> is actually not a symbolic reference, the files and the reftable
> backends return different values (1 and -1 respectively). Standardize
> the returned values so that 0 is success, -1 is a generic error and -2
> is that the reference was actually non-symbolic.

Are all the existing callers OK with this switch from 1 to -2?

IOW, if a caller using the ref-files backend start behaving
differently with this change upon seeing a return value of -2 (where
it previously was expecting to see 1), that would not be nice.

Because "reftable was already broken" is not a good excuse to
introduce a separate regression to ref-files users, we'd want to be
careful if we want to do this kind of "while at it" change.

> +/*
> + * Return 0 if the symbolic reference could be read without error.
> + * Return -1 for generic errors.
> + * Return -2 if the reference was actually non-symbolic.
> + */

As this is an implementation of ref_stroage_be.read_symbolic_ref,
the above comment must stay in sync with the comment over there (and
a comment before the corresponding function in the other backend).

I personally would not add the above comment for that reason, but as
long as we are consistent, I am OK either way.  So if we add this,
we should do the same to the reftable side as well.

By the way, it is arguable that it is an error when a caller asks
"here is a ref, please read it as a symbolic ref" and the ref turns
out to be not a symbolic ref.  I'd say that it is a valid view that
the caller asked the question to find out if the ref was a symbolic
before making the call, and "that ref is not symbolic" is one of the
valid and expected results.  So if you wanted to change the value
from 1 to -2 only because "you called read_symbolic_ref() without
checking if it is a symbolic to begin with, which is your fault and
you deserve to get an error", I am not sure if I agree with that
view and reasoning.

In any case, this "not a symbolic ref" may want to have a symbolic
constant.  With NOT_A_SYMREF being a non-error, you could structure
the caller like so:

	status = read_symref()
	if (status < 0)
		... we got an error so we stop here ...
		... we do not behave differently on error type ...

	switch (status) {
	case 0:
		... everything is peachy ...
		break ;;
	case NOT_A_SYMREF:
		... we want to handle non-symref here ...
		break ;;
	default:
		BUG("unhandled case");
	}

Also, even if we decide that we want to treat this as an error,
lumping everything else as "generic error" might make it awkward to
evolve the API, I suspect.

> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index 38eb14d591..60cb83f23a 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -830,7 +830,9 @@ static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
>  		return ret;
>  
>  	ret = reftable_stack_read_ref(stack, refname, &ref);
> -	if (ret == 0 && ref.value_type == REFTABLE_REF_SYMREF)
> +	if (!ret && (ref.value_type != REFTABLE_REF_SYMREF))
> +		ret = -2;
> +	else if (!ret && (ref.value_type == REFTABLE_REF_SYMREF))
>  		strbuf_addstr(referent, ref.value.symref);
>  	else
>  		ret = -1;
Junio C Hamano Nov. 19, 2024, 5:10 a.m. UTC | #2
Bence Ferdinandy <bence@ferdinandy.com> writes:

> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index 38eb14d591..60cb83f23a 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -830,7 +830,9 @@ static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
>  		return ret;
>  
>  	ret = reftable_stack_read_ref(stack, refname, &ref);
> -	if (ret == 0 && ref.value_type == REFTABLE_REF_SYMREF)
> +	if (!ret && (ref.value_type != REFTABLE_REF_SYMREF))
> +		ret = -2;
> +	else if (!ret && (ref.value_type == REFTABLE_REF_SYMREF))
>  		strbuf_addstr(referent, ref.value.symref);
>  	else
>  		ret = -1;

The ref.value_type can be either equal to REFTABLE_REF_SYMREF or not
equal to it, and there is no other choice.

Wouldn't it be easier to reason about if the above code were written
more like this:

        if (ret)
		ret = -1;
	else if (ref.value_type == REFTABLE_REF_SYMREF)
		strbuf_addstr(...);
	else
		ret = -2;

I found it curious when I read it again while attempting to resolve
conflicts with 5413d69f (refs/reftable: refactor reading symbolic
refs to use reftable backend, 2024-11-05).  The resolution has to
update this part of code to use the new implementation that asks
reftable_backend_read_ref() and becomes a lot simpler, so the way it
is written in your topic does not make much difference in the longer
term when both topics graduate.

IOW, if we were rebuilding your topic on top of Patrick's topoic
that includes 5413d69f, this part would read like so, I think.

diff --git c/refs/reftable-backend.c w/refs/reftable-backend.c
index 6298991da7..b6bc3039a5 100644
--- c/refs/reftable-backend.c
+++ w/refs/reftable-backend.c
@@ -920,8 +920,10 @@ static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
 		return ret;
 
 	ret = reftable_backend_read_ref(be, refname, &oid, referent, &type);
-	if (type != REF_ISSYMREF)
+	if (ret)
 		ret = -1;
+	else if (type != REF_ISSYMREF)
+		ret = -2;
 	return ret;
 }
Patrick Steinhardt Nov. 19, 2024, 6:44 a.m. UTC | #3
On Tue, Nov 19, 2024 at 10:22:31AM +0900, Junio C Hamano wrote:
> Bence Ferdinandy <bence@ferdinandy.com> writes:
> 
> > Subject: Re: [PATCH v13 2/9] refs: standardize output of refs_read_symbolic_ref
> 
> "output" -> "return values", or something.
> 
> > When the symbolic reference we want to read with refs_read_symbolic_ref
> > is actually not a symbolic reference, the files and the reftable
> > backends return different values (1 and -1 respectively). Standardize
> > the returned values so that 0 is success, -1 is a generic error and -2
> > is that the reference was actually non-symbolic.
> 
> Are all the existing callers OK with this switch from 1 to -2?
> 
> IOW, if a caller using the ref-files backend start behaving
> differently with this change upon seeing a return value of -2 (where
> it previously was expecting to see 1), that would not be nice.
> 
> Because "reftable was already broken" is not a good excuse to
> introduce a separate regression to ref-files users, we'd want to be
> careful if we want to do this kind of "while at it" change.

There are only three callers:

  - "remote.c:ignore_symref_update()" only cares whether the return
    value is 0 or not.

  - "builtin/remote.c:mv()" is the same.

  - "refs.c:migrate_one_ref()" assumes the behaviour of the reftable
    backend and only checks for negative error codes.

So you could argue that it's the "files" backend that is broken, not the
"reftable" backend. Doesn't matter ultimately though, the real problem
is that this wasn't ever documented anywhere.

I agree that this should be part of the commit message.

> > +/*
> > + * Return 0 if the symbolic reference could be read without error.
> > + * Return -1 for generic errors.
> > + * Return -2 if the reference was actually non-symbolic.
> > + */
> 
> As this is an implementation of ref_stroage_be.read_symbolic_ref,
> the above comment must stay in sync with the comment over there (and
> a comment before the corresponding function in the other backend).
> 
> I personally would not add the above comment for that reason, but as
> long as we are consistent, I am OK either way.  So if we add this,
> we should do the same to the reftable side as well.

Another solution could be to have the comment in "refs.h" for the
caller-facing function, while the backend pointer simply says "Please
refer to the documentation of `refs_read_symbolic_ref()`."

> By the way, it is arguable that it is an error when a caller asks
> "here is a ref, please read it as a symbolic ref" and the ref turns
> out to be not a symbolic ref.  I'd say that it is a valid view that
> the caller asked the question to find out if the ref was a symbolic
> before making the call, and "that ref is not symbolic" is one of the
> valid and expected results.  So if you wanted to change the value
> from 1 to -2 only because "you called read_symbolic_ref() without
> checking if it is a symbolic to begin with, which is your fault and
> you deserve to get an error", I am not sure if I agree with that
> view and reasoning.

The reason why I've been proposing to return negative is because we have
the idiom of checking `err < 0` in many places, so a function that
returns a positive value in the case where it didn't return the expected
result can easily lead to bugs.

> In any case, this "not a symbolic ref" may want to have a symbolic
> constant.  With NOT_A_SYMREF being a non-error, you could structure
> the caller like so:

That'd be a good idea.

> 	status = read_symref()
> 	if (status < 0)
> 		... we got an error so we stop here ...
> 		... we do not behave differently on error type ...
> 
> 	switch (status) {
> 	case 0:
> 		... everything is peachy ...
> 		break ;;
> 	case NOT_A_SYMREF:
> 		... we want to handle non-symref here ...
> 		break ;;
> 	default:
> 		BUG("unhandled case");
> 	}
> 
> Also, even if we decide that we want to treat this as an error,
> lumping everything else as "generic error" might make it awkward to
> evolve the API, I suspect.

I guess most callers don't care about the exact error code, they only
care about whether or not they have been able to read the symref. I
think this idiom would easily be extensible in the future if all callers
know to check for `err < 0` by default, because introducing a new error
code would continue to work for them. And if they want to handle that
new error code they can adapt as you propose above with the switch.

Patrick
Patrick Steinhardt Nov. 19, 2024, 6:48 a.m. UTC | #4
On Mon, Nov 18, 2024 at 04:09:21PM +0100, Bence Ferdinandy wrote:
> diff --git a/refs.h b/refs.h
> index 108dfc93b3..f8b714ca1d 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -83,6 +83,12 @@ int refs_read_ref_full(struct ref_store *refs, const char *refname,
>  
>  int refs_read_ref(struct ref_store *refs, const char *refname, struct object_id *oid);
>  
> +/*
> + * Return 0 if the symbolic reference could be read without error.
> + * Return -1 for generic errors.
> + * Return -2 if the reference was actually non-symbolic.
> + */
> +

Extraneous empty newline.

Also, how about the following:

    /*
     * Read the symbolic ref named "refname" and write its immediate
     * referent into the provided buffer. This does not resolve the
     * symbolic ref recursively in case the target is a symbolic ref, as
     * well.
     *
     * Returns 0 on success, -2 if the "refname" is not a symbolic ref,
     * -1 otherwise.
     */

> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index 2313c830d8..f0ef354bce 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -673,6 +673,12 @@ struct ref_storage_be {
>  
>  	ref_iterator_begin_fn *iterator_begin;
>  	read_raw_ref_fn *read_raw_ref;
> +
> +	/*
> +	 * Return 0 if the symbolic reference could be read without error.
> +	 * Return -1 for generic errors.
> +	 * Return -2 if the reference was actually non-symbolic.
> +	 */
>  	read_symbolic_ref_fn *read_symbolic_ref;

As proposed in the other thread, this could instead be:

    /*
     * Please refer to `refs_read_symbolic_ref()` for the expected
     * behaviour.
     /

Patrick
Junio C Hamano Nov. 19, 2024, 6:54 a.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

> There are only three callers:
>
>   - "remote.c:ignore_symref_update()" only cares whether the return
>     value is 0 or not.
>
>   - "builtin/remote.c:mv()" is the same.
>
>   - "refs.c:migrate_one_ref()" assumes the behaviour of the reftable
>     backend and only checks for negative error codes.
>
> So you could argue that it's the "files" backend that is broken, not the
> "reftable" backend. Doesn't matter ultimately though, the real problem
> is that this wasn't ever documented anywhere.

You're correct that it does not matter ultimately.  But as a general
rule, which also applies here, anything newly introduced one does
differently without having a good reason is a bug in the new thing,
I would say.

> Another solution could be to have the comment in "refs.h" for the
> caller-facing function, while the backend pointer simply says "Please
> refer to the documentation of `refs_read_symbolic_ref()`."

Yup, avoiding unnecessary duplication is a good thing.

> The reason why I've been proposing to return negative is because we have
> the idiom of checking `err < 0` in many places, so a function that
> returns a positive value in the case where it didn't return the expected
> result can easily lead to bugs.

I agree with the general reasoning.  I am saying this may or may not
be an error, and if it turns out that it is not an error but is just
one of the generally expected outcome, treating it as an error and
having "if (status < 0)" to lump the case together with other error
cases may not be nice to the callers.
Patrick Steinhardt Nov. 19, 2024, 7:26 a.m. UTC | #6
On Tue, Nov 19, 2024 at 03:54:05PM +0900, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > The reason why I've been proposing to return negative is because we have
> > the idiom of checking `err < 0` in many places, so a function that
> > returns a positive value in the case where it didn't return the expected
> > result can easily lead to bugs.
> 
> I agree with the general reasoning.  I am saying this may or may not
> be an error, and if it turns out that it is not an error but is just
> one of the generally expected outcome, treating it as an error and
> having "if (status < 0)" to lump the case together with other error
> cases may not be nice to the callers.

The question to me is whether the function returns something sensible in
all non-error cases that a caller can use properly without having to
explicitly check for the value. And I'd say that this is not the case
with `refs_read_symbolic_ref()`, which wouldn't end up setting the value
of `referent`.

So regardless of whether we define this as error or non-error, the
caller would have to exlicitly handle the case where it's not a symref
in order to make sense of it because the result is not well-defined.

Patrick
Bence Ferdinandy Nov. 19, 2024, 10:04 a.m. UTC | #7
On Tue Nov 19, 2024 at 06:10, Junio C Hamano <gitster@pobox.com> wrote:
> Bence Ferdinandy <bence@ferdinandy.com> writes:
>
>> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
>> index 38eb14d591..60cb83f23a 100644
>> --- a/refs/reftable-backend.c
>> +++ b/refs/reftable-backend.c
>> @@ -830,7 +830,9 @@ static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
>>  		return ret;
>>  
>>  	ret = reftable_stack_read_ref(stack, refname, &ref);
>> -	if (ret == 0 && ref.value_type == REFTABLE_REF_SYMREF)
>> +	if (!ret && (ref.value_type != REFTABLE_REF_SYMREF))
>> +		ret = -2;
>> +	else if (!ret && (ref.value_type == REFTABLE_REF_SYMREF))
>>  		strbuf_addstr(referent, ref.value.symref);
>>  	else
>>  		ret = -1;
>
> The ref.value_type can be either equal to REFTABLE_REF_SYMREF or not
> equal to it, and there is no other choice.

Ah, ok, I didn't realize this.

>
> Wouldn't it be easier to reason about if the above code were written
> more like this:
>
>         if (ret)
> 		ret = -1;
> 	else if (ref.value_type == REFTABLE_REF_SYMREF)
> 		strbuf_addstr(...);
> 	else
> 		ret = -2;
>
> I found it curious when I read it again while attempting to resolve
> conflicts with 5413d69f (refs/reftable: refactor reading symbolic
> refs to use reftable backend, 2024-11-05).  The resolution has to
> update this part of code to use the new implementation that asks
> reftable_backend_read_ref() and becomes a lot simpler, so the way it
> is written in your topic does not make much difference in the longer
> term when both topics graduate.

I'll update the patch with the above, 

>
> IOW, if we were rebuilding your topic on top of Patrick's topoic
> that includes 5413d69f, this part would read like so, I think.
>
> diff --git c/refs/reftable-backend.c w/refs/reftable-backend.c
> index 6298991da7..b6bc3039a5 100644
> --- c/refs/reftable-backend.c
> +++ w/refs/reftable-backend.c
> @@ -920,8 +920,10 @@ static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
>  		return ret;
>  
>  	ret = reftable_backend_read_ref(be, refname, &oid, referent, &type);
> -	if (type != REF_ISSYMREF)
> +	if (ret)
>  		ret = -1;
> +	else if (type != REF_ISSYMREF)
> +		ret = -2;
>  	return ret;
>  }
>  

but I'll save this as well, I would not be completely surprised if Patrick's
topic makes it in sooner :)

Thanks,
Bence
Bence Ferdinandy Nov. 19, 2024, 10:10 a.m. UTC | #8
On Tue Nov 19, 2024 at 08:26, Patrick Steinhardt <ps@pks.im> wrote:
> On Tue, Nov 19, 2024 at 03:54:05PM +0900, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>> > The reason why I've been proposing to return negative is because we have
>> > the idiom of checking `err < 0` in many places, so a function that
>> > returns a positive value in the case where it didn't return the expected
>> > result can easily lead to bugs.
>> 
>> I agree with the general reasoning.  I am saying this may or may not
>> be an error, and if it turns out that it is not an error but is just
>> one of the generally expected outcome, treating it as an error and
>> having "if (status < 0)" to lump the case together with other error
>> cases may not be nice to the callers.
>
> The question to me is whether the function returns something sensible in
> all non-error cases that a caller can use properly without having to
> explicitly check for the value. And I'd say that this is not the case
> with `refs_read_symbolic_ref()`, which wouldn't end up setting the value
> of `referent`.
>
> So regardless of whether we define this as error or non-error, the
> caller would have to exlicitly handle the case where it's not a symref
> in order to make sense of it because the result is not well-defined.

I agree with Patrick re the -1, -2 return values. The non-error behavior should
be when referent is set, anything else is something the caller would need to
consider if they want to do something with it or not.
Bence Ferdinandy Nov. 19, 2024, 10:17 a.m. UTC | #9
On Tue Nov 19, 2024 at 07:48, Patrick Steinhardt <ps@pks.im> wrote:
> On Mon, Nov 18, 2024 at 04:09:21PM +0100, Bence Ferdinandy wrote:
>> diff --git a/refs.h b/refs.h
>> index 108dfc93b3..f8b714ca1d 100644
>> --- a/refs.h
>> +++ b/refs.h
>> @@ -83,6 +83,12 @@ int refs_read_ref_full(struct ref_store *refs, const char *refname,
>>  
>>  int refs_read_ref(struct ref_store *refs, const char *refname, struct object_id *oid);
>>  
>> +/*
>> + * Return 0 if the symbolic reference could be read without error.
>> + * Return -1 for generic errors.
>> + * Return -2 if the reference was actually non-symbolic.
>> + */
>> +
>
> Extraneous empty newline.
>
> Also, how about the following:
>
>     /*
>      * Read the symbolic ref named "refname" and write its immediate
>      * referent into the provided buffer. This does not resolve the
>      * symbolic ref recursively in case the target is a symbolic ref, as
>      * well.
>      *
>      * Returns 0 on success, -2 if the "refname" is not a symbolic ref,
>      * -1 otherwise.
>      */
>
>> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
>> index 2313c830d8..f0ef354bce 100644
>> --- a/refs/refs-internal.h
>> +++ b/refs/refs-internal.h
>> @@ -673,6 +673,12 @@ struct ref_storage_be {
>>  
>>  	ref_iterator_begin_fn *iterator_begin;
>>  	read_raw_ref_fn *read_raw_ref;
>> +
>> +	/*
>> +	 * Return 0 if the symbolic reference could be read without error.
>> +	 * Return -1 for generic errors.
>> +	 * Return -2 if the reference was actually non-symbolic.
>> +	 */
>>  	read_symbolic_ref_fn *read_symbolic_ref;
>
> As proposed in the other thread, this could instead be:
>
>     /*
>      * Please refer to `refs_read_symbolic_ref()` for the expected
>      * behaviour.
>      /

Thanks, that makes sense. So  as a summary, I'll update the comments and also
the implementation detail Junio pointed out.
diff mbox series

Patch

diff --git a/refs.h b/refs.h
index 108dfc93b3..f8b714ca1d 100644
--- a/refs.h
+++ b/refs.h
@@ -83,6 +83,12 @@  int refs_read_ref_full(struct ref_store *refs, const char *refname,
 
 int refs_read_ref(struct ref_store *refs, const char *refname, struct object_id *oid);
 
+/*
+ * Return 0 if the symbolic reference could be read without error.
+ * Return -1 for generic errors.
+ * Return -2 if the reference was actually non-symbolic.
+ */
+
 int refs_read_symbolic_ref(struct ref_store *ref_store, const char *refname,
 			   struct strbuf *referent);
 
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0824c0b8a9..81e650ec48 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -596,10 +596,9 @@  static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
 	unsigned int type;
 
 	ret = read_ref_internal(ref_store, refname, &oid, referent, &type, &failure_errno, 1);
-	if (ret)
-		return ret;
-
-	return !(type & REF_ISSYMREF);
+	if (!ret && !(type & REF_ISSYMREF))
+		return -2;
+	return ret;
 }
 
 int parse_loose_ref_contents(const struct git_hash_algo *algop,
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..f0ef354bce 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -673,6 +673,12 @@  struct ref_storage_be {
 
 	ref_iterator_begin_fn *iterator_begin;
 	read_raw_ref_fn *read_raw_ref;
+
+	/*
+	 * Return 0 if the symbolic reference could be read without error.
+	 * Return -1 for generic errors.
+	 * Return -2 if the reference was actually non-symbolic.
+	 */
 	read_symbolic_ref_fn *read_symbolic_ref;
 
 	reflog_iterator_begin_fn *reflog_iterator_begin;
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 38eb14d591..60cb83f23a 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -830,7 +830,9 @@  static int reftable_be_read_symbolic_ref(struct ref_store *ref_store,
 		return ret;
 
 	ret = reftable_stack_read_ref(stack, refname, &ref);
-	if (ret == 0 && ref.value_type == REFTABLE_REF_SYMREF)
+	if (!ret && (ref.value_type != REFTABLE_REF_SYMREF))
+		ret = -2;
+	else if (!ret && (ref.value_type == REFTABLE_REF_SYMREF))
 		strbuf_addstr(referent, ref.value.symref);
 	else
 		ret = -1;