diff mbox series

[3/6] object.c: make type_from_string() return "enum object_type"

Message ID patch-3.6-7fd86f6699-20210409T082935Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series {tag,object}*.c: refactorings + prep for a larger change | expand

Commit Message

Ævar Arnfjörð Bjarmason April 9, 2021, 8:32 a.m. UTC
Change the type_from_string*() functions to return an "enum
object_type", but don't refactor their callers to check for "==
OBJ_BAD" instead of "< 0".

Refactoring the check of the return value to check == OBJ_BAD would
now be equivalent to "ret < 0", but the consensus on an earlier
version of this patch was to not do that, and to instead use -1
consistently as a return value. It just so happens that OBJ_BAD == -1,
but let's not put a hard reliance on that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 object.c | 8 ++++----
 object.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

Jeff King April 9, 2021, 6:14 p.m. UTC | #1
On Fri, Apr 09, 2021 at 10:32:51AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Change the type_from_string*() functions to return an "enum
> object_type", but don't refactor their callers to check for "==
> OBJ_BAD" instead of "< 0".
> 
> Refactoring the check of the return value to check == OBJ_BAD would
> now be equivalent to "ret < 0", but the consensus on an earlier
> version of this patch was to not do that, and to instead use -1
> consistently as a return value. It just so happens that OBJ_BAD == -1,
> but let's not put a hard reliance on that.

I think what the patch is doing is good, but this rationale misses the
main point of that discussion, I think. I doubt that the value of
OBJ_BAD would ever change. But the point was that we could grow a new
"failure" value at "-2", and we would want to catch here (I do consider
it relatively unlikely, but that IMHO is the reason to keep the negative
check).

I think for the same reason that "return OBJ_BAD" instead of "return -1"
would be just fine (it is not "just so happens" that OBJ_BAD is
negative; that was deliberate to allow exactly this convention). But I
am also OK with leaving the "return -1" calls.

-Peff
Ævar Arnfjörð Bjarmason April 9, 2021, 7:42 p.m. UTC | #2
On Fri, Apr 09 2021, Jeff King wrote:

> On Fri, Apr 09, 2021 at 10:32:51AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> Change the type_from_string*() functions to return an "enum
>> object_type", but don't refactor their callers to check for "==
>> OBJ_BAD" instead of "< 0".
>> 
>> Refactoring the check of the return value to check == OBJ_BAD would
>> now be equivalent to "ret < 0", but the consensus on an earlier
>> version of this patch was to not do that, and to instead use -1
>> consistently as a return value. It just so happens that OBJ_BAD == -1,
>> but let's not put a hard reliance on that.
>
> I think what the patch is doing is good, but this rationale misses the
> main point of that discussion, I think. I doubt that the value of
> OBJ_BAD would ever change. But the point was that we could grow a new
> "failure" value at "-2", and we would want to catch here (I do consider
> it relatively unlikely, but that IMHO is the reason to keep the negative
> check).
>
> I think for the same reason that "return OBJ_BAD" instead of "return -1"
> would be just fine (it is not "just so happens" that OBJ_BAD is
> negative; that was deliberate to allow exactly this convention). But I
> am also OK with leaving the "return -1" calls.

I'm beginning to think in response to this and the comment on 5/6 that
it might be cleaner to split up the object_type enum, as demonstrated
for a config.[ch] feature in [1].

Converting back and forth between them is a bit nasty, and having
multiple interchangable OBJ_* constants with identical values just to
satisfy them being in different enums, but it would allow having the
compiler explicitly help check that callers cover all possible cases of
values they could get.

Most callers just get OBJ_{COMMIT,TREE,BLOB,TAG} some more get that plus
OBJ_{BAD,NONE}, almost nobody gets OBJ_{OFS,REF}_DELTA, and AFAICT just
the peel code cares about OBJ_ANY. We then have an OBJ_MAX nobody's ever
used for anything (I've got some unsubmitted patch somewhere to remove
it).

What do you think about that sort of approach? I haven't convinced
myself that it's a good idea, so far I just thought bridging the gap of
things that return "enum" actually having that as part of their
signature for human legibility, even if C itself doesn't care about the
difference, and we currently can't get much/any of the benefits of the
compiler catching non-exhaustive "case" statements (unless every
callsite is to include OBJ_OFS etc.).

1. https://lore.kernel.org/git/875z0wicmp.fsf@evledraar.gmail.com/
Jeff King April 9, 2021, 9:29 p.m. UTC | #3
On Fri, Apr 09, 2021 at 09:42:17PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I think what the patch is doing is good, but this rationale misses the
> > main point of that discussion, I think. I doubt that the value of
> > OBJ_BAD would ever change. But the point was that we could grow a new
> > "failure" value at "-2", and we would want to catch here (I do consider
> > it relatively unlikely, but that IMHO is the reason to keep the negative
> > check).
> >
> > I think for the same reason that "return OBJ_BAD" instead of "return -1"
> > would be just fine (it is not "just so happens" that OBJ_BAD is
> > negative; that was deliberate to allow exactly this convention). But I
> > am also OK with leaving the "return -1" calls.
> 
> I'm beginning to think in response to this and the comment on 5/6 that
> it might be cleaner to split up the object_type enum, as demonstrated
> for a config.[ch] feature in [1].
> 
> Converting back and forth between them is a bit nasty, and having
> multiple interchangable OBJ_* constants with identical values just to
> satisfy them being in different enums, but it would allow having the
> compiler explicitly help check that callers cover all possible cases of
> values they could get.
> 
> Most callers just get OBJ_{COMMIT,TREE,BLOB,TAG} some more get that plus
> OBJ_{BAD,NONE}, almost nobody gets OBJ_{OFS,REF}_DELTA, and AFAICT just
> the peel code cares about OBJ_ANY. We then have an OBJ_MAX nobody's ever
> used for anything (I've got some unsubmitted patch somewhere to remove
> it).
> 
> What do you think about that sort of approach? I haven't convinced
> myself that it's a good idea, so far I just thought bridging the gap of
> things that return "enum" actually having that as part of their
> signature for human legibility, even if C itself doesn't care about the
> difference, and we currently can't get much/any of the benefits of the
> compiler catching non-exhaustive "case" statements (unless every
> callsite is to include OBJ_OFS etc.).

I suspect you'll end up with a lot of awkward spots. I do agree that
_most_ callers only care about getting one of the actual 4 object types,
or an error. But those values are tied to the delta ones internally
(when we see a delta type, and then later decide to promote it to the
"real" type). And of course those are all used to read and write the
on-disk bits, too.

So while there might be some way of doing this cleaner, it hasn't
historically been a place we've seen a lot of problems (at least not
that I recall). So it seems like a pretty deep rabbit hole that is not
likely to give a lot of benefit.

-Peff
diff mbox series

Patch

diff --git a/object.c b/object.c
index 5477abc97c..2216cdcda2 100644
--- a/object.c
+++ b/object.c
@@ -35,9 +35,9 @@  const char *type_name(unsigned int type)
 	return object_type_strings[type];
 }
 
-int type_from_string_gently(const char *str, ssize_t len)
+enum object_type type_from_string_gently(const char *str, ssize_t len)
 {
-	int i;
+	enum object_type i;
 
 	for (i = 1; i < ARRAY_SIZE(object_type_strings); i++)
 		if (!strncmp(str, object_type_strings[i], len) &&
@@ -46,10 +46,10 @@  int type_from_string_gently(const char *str, ssize_t len)
 	return -1;
 }
 
-int type_from_string(const char *str)
+enum object_type type_from_string(const char *str)
 {
 	size_t len = strlen(str);
-	int ret = type_from_string_gently(str, len);
+	enum object_type ret = type_from_string_gently(str, len);
 	if (ret < 0)
 		die(_("invalid object type \"%s\""), str);
 	return ret;
diff --git a/object.h b/object.h
index ffdc129830..5e7a523e85 100644
--- a/object.h
+++ b/object.h
@@ -93,8 +93,8 @@  struct object {
 };
 
 const char *type_name(unsigned int type);
-int type_from_string_gently(const char *str, ssize_t len);
-int type_from_string(const char *str);
+enum object_type type_from_string_gently(const char *str, ssize_t len);
+enum object_type type_from_string(const char *str);
 
 /*
  * Return the current number of buckets in the object hashmap.