diff mbox series

[v9,5/9] ls-tree: optimize naming and handling of "return" in show_tree()

Message ID 75503c41a7e2f3fdbb59ce3568853049b55a2d3b.1641440700.git.dyroneteng@gmail.com (mailing list archive)
State New, archived
Headers show
Series ls-tree.c: introduce "--format" option | expand

Commit Message

Teng Long Jan. 6, 2022, 4:31 a.m. UTC
The variable which "show_tree()" return is named "retval", a name that's
a little hard to understand. This commit tries to make the variable
and the related codes more clear in the context.

The change is based on three steps. The first is to rename "retval" to
a more meaningful name.

The second is that there are different "return" cases in "show_tree",
some places use "return retval;", some just directly use "return 0;",
this maybe cause some confusion when reading these "returns". For this
, we change all the "return" cases to the new uniform name.

The last is there are some nested "if" judgments surround the "returns",
this even make the codes here a little hard to understand. So we put
some logic in individual methods, "init_type()" and "init_recursive()".

After the steps, let us look at "show_tree()" again. It has a uniform
return variable name now, and first we init the "type" by "mode", then
call "init_recursive" to init the value of "recursive" which means
whether to go on reading recusively into the "tree". The codes here
become a little bit clearer, so we do not need to take a look at
"read_tree_at()" in "tree.c" to make sure the context of the return
value.

Signed-off-by: Teng Long <dyronetengb@gmail.com>
---
 builtin/ls-tree.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

Comments

Junio C Hamano Jan. 6, 2022, 8:44 p.m. UTC | #1
Teng Long <dyroneteng@gmail.com> writes:

> +static void init_type(unsigned mode, enum object_type *type)
> +{
> +	if (S_ISGITLINK(mode))
> +		*type = OBJ_COMMIT;
> +	else if (S_ISDIR(mode))
> +		*type = OBJ_TREE;
> +}
> +
> +static void init_recursive(struct strbuf *base, const char *pathname,
> +				int *recursive)
> +{
> +	if (show_recursive(base->buf, base->len, pathname))
> +		*recursive = READ_TREE_RECURSIVE;
> +}
> +
>  static int show_tree(const struct object_id *oid, struct strbuf *base,
>  		const char *pathname, unsigned mode, void *context)
>  {
> -	int retval = 0;
> +	int recursive = 0;
>  	size_t baselen;
>  	enum object_type type = OBJ_BLOB;
>  
> -	if (S_ISGITLINK(mode)) {
> -		type = OBJ_COMMIT;
> -	} else if (S_ISDIR(mode)) {
> -		if (show_recursive(base->buf, base->len, pathname)) {
> -			retval = READ_TREE_RECURSIVE;
> -			if (!(ls_options & LS_SHOW_TREES))
> -				return retval;
> -		}
> -		type = OBJ_TREE;
> -	}
> -	else if (ls_options & LS_TREE_ONLY)
> -		return 0;
> +	init_type(mode, &type);

What this one is doing sounds more like setting the type variable
based on the mode bits, and doing only half a job at it.  The name
"init" does not sound like a good match to what it does.

If we make it a separate function, we probably should add the "else"
clause to set *type to OBJ_BLOB there, so that the caller does not
say "we'd assume it is BLOB initially, but tweak it based on mode
bits".

I.e.

	type = get_type(mode);

where

	static enum object_type get_type(unsigned int mode)
	{
		return (S_ISGITLINK(mode) 
		        ? OBJ_COMMIT
			: S_ISDIR(mode)
			? OBJ_TREE
			: OBJ_BLOB);
	}

or something like that, perhaps?  But I think open-coding the whole
thing, after losing the "We assume BLOB" initialization, would be
much easier to follow, i.e.

	if (S_ISGITLINK(mode))
		type = OBJ_COMMIT;
	else if (S_ISDIR(mode))
		type = OBJ_TREE;
	else
		type = OBJ_BLOB;

without adding init_type() helper function.

> +	init_recursive(base, pathname, &recursive);

This is even less readable.  In the original, it was clear that we
only call show_recursive() on a path that is a true directory; we
now seem to unconditionally make a call to it.  Is that intended?

	Side note.  show_recursive() has a confusing name; it does
	not show anything---it only decides if we want to go
	recursive.

At least, losing the "we assume recursive is 0" upfront in the
variable declaration and writing

	if (type == OBJ_TREE && show_recursive(...))
		recursive = READ_TREE_RECURSIVE;
	else
		recursive = 0;

here, without introducing init_recursive(), would make it easier to
follow.  If we really want to add a new function, perhaps

	recursive = get_recursive(type, base, pathname);
 
where

	static int get_recursive(enum object_type type,
				 struct strbuf *base, const char *pathname)
	{
		if (type == OBJ_TREE && show_recursive(...))
			return READ_TREE_RECURSIVE;
		else
			return 0;
	}

but I fail to see the point of doing so; open-coded 4 lines here
would make the flow of thought much better to me.

In any case, I think your splitting the original into "this is about
type" and "this is about the recursive bit" is a good idea to help
making the resulting code easier to follow.

> +	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
> +		return recursive;

We are looking at an entry that is a directory.  We are running in
recursive mode.  And we are not told to show the directory itself in
the output.  We skip the rest of the function, which is about to
show this single entry.  Makes sense.


> +	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
> +		return !READ_TREE_RECURSIVE;

Negation of a non-zero integer constant is 0, so it is the same as
the original that returned 0, but I am not sure if it is enhancing
or hurting readability of the code.  The user of the value, in
tree.c::read_tree_at(), knows that the possible and valid values are
0 and READ_TREE_RECURSIVE, so returning 0 would probably be a better
idea.  After all, the initializer in the original for the variable
definition of "retval" used "0", not "!READ_TREE_RECURSIVE".

The name "recursive" is much more specific than the overly generic
"retval".  Its value is to be consumed by read_tree_at(), i.e. our
caller, to decide if we want it to recurse into the contents of the
directory.  I would have called it "recurse" (or even "to_recurse"),
if I were doing this change, though.
Teng Long Jan. 11, 2022, 9:14 a.m. UTC | #2
On Fri, Jan 7, 2022 at 4:44 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Teng Long <dyroneteng@gmail.com> writes:
>

> What this one is doing sounds more like setting the type variable
> based on the mode bits, and doing only half a job at it.  The name
> "init" does not sound like a good match to what it does.
>
> If we make it a separate function, we probably should add the "else"
> clause to set *type to OBJ_BLOB there, so that the caller does not
> say "we'd assume it is BLOB initially, but tweak it based on mode
> bits".
>
> I.e.
>
>         type = get_type(mode);
>
> where
>
>         static enum object_type get_type(unsigned int mode)
>         {
>                 return (S_ISGITLINK(mode)
>                         ? OBJ_COMMIT
>                         : S_ISDIR(mode)
>                         ? OBJ_TREE
>                         : OBJ_BLOB);
>         }

> or something like that, perhaps?  But I think open-coding the whole
> thing, after losing the "We assume BLOB" initialization, would be
> much easier to follow, i.e.
>
>         if (S_ISGITLINK(mode))
>                 type = OBJ_COMMIT;
>         else if (S_ISDIR(mode))
>                 type = OBJ_TREE;
>         else
>                 type = OBJ_BLOB;
>
> without adding init_type() helper function.
>
> > +     init_recursive(base, pathname, &recursive);
>
> This is even less readable.  In the original, it was clear that we
> only call show_recursive() on a path that is a true directory; we
> now seem to unconditionally make a call to it.  Is that intended?
>
>         Side note.  show_recursive() has a confusing name; it does
>         not show anything---it only decides if we want to go
>         recursive.
>
> At least, losing the "we assume recursive is 0" upfront in the
> variable declaration and writing
>
>         if (type == OBJ_TREE && show_recursive(...))
>                 recursive = READ_TREE_RECURSIVE;
>         else
>                 recursive = 0;
>
> here, without introducing init_recursive(), would make it easier to
> follow.  If we really want to add a new function, perhaps
>
>         recursive = get_recursive(type, base, pathname);
>
> where
>
>         static int get_recursive(enum object_type type,
>                                  struct strbuf *base, const char *pathname)
>         {
>                 if (type == OBJ_TREE && show_recursive(...))
>                         return READ_TREE_RECURSIVE;
>                 else
>                         return 0;
>         }
>
> but I fail to see the point of doing so; open-coded 4 lines here
> would make the flow of thought much better to me.
>
> In any case, I think your splitting the original into "this is about
> type" and "this is about the recursive bit" is a good idea to help
> making the resulting code easier to follow.
>
> > +     if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
> > +             return recursive;
>
> We are looking at an entry that is a directory.  We are running in
> recursive mode.  And we are not told to show the directory itself in
> the output.  We skip the rest of the function, which is about to
> show this single entry.  Makes sense.
>
>
> > +     if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
> > +             return !READ_TREE_RECURSIVE;
>
> Negation of a non-zero integer constant is 0, so it is the same as
> the original that returned 0, but I am not sure if it is enhancing
> or hurting readability of the code.  The user of the value, in
> tree.c::read_tree_at(), knows that the possible and valid values are
> 0 and READ_TREE_RECURSIVE, so returning 0 would probably be a better
> idea.  After all, the initializer in the original for the variable
> definition of "retval" used "0", not "!READ_TREE_RECURSIVE".
>
> The name "recursive" is much more specific than the overly generic
> "retval".  Its value is to be consumed by read_tree_at(), i.e. our
> caller, to decide if we want it to recurse into the contents of the
> directory.  I would have called it "recurse" (or even "to_recurse"),
> if I were doing this change, though.

Thanks, will apply in the next patch.
diff mbox series

Patch

diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index eecc7482d5..7383dddf8c 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -61,25 +61,35 @@  static int show_recursive(const char *base, size_t baselen, const char *pathname
 	return 0;
 }
 
+static void init_type(unsigned mode, enum object_type *type)
+{
+	if (S_ISGITLINK(mode))
+		*type = OBJ_COMMIT;
+	else if (S_ISDIR(mode))
+		*type = OBJ_TREE;
+}
+
+static void init_recursive(struct strbuf *base, const char *pathname,
+				int *recursive)
+{
+	if (show_recursive(base->buf, base->len, pathname))
+		*recursive = READ_TREE_RECURSIVE;
+}
+
 static int show_tree(const struct object_id *oid, struct strbuf *base,
 		const char *pathname, unsigned mode, void *context)
 {
-	int retval = 0;
+	int recursive = 0;
 	size_t baselen;
 	enum object_type type = OBJ_BLOB;
 
-	if (S_ISGITLINK(mode)) {
-		type = OBJ_COMMIT;
-	} else if (S_ISDIR(mode)) {
-		if (show_recursive(base->buf, base->len, pathname)) {
-			retval = READ_TREE_RECURSIVE;
-			if (!(ls_options & LS_SHOW_TREES))
-				return retval;
-		}
-		type = OBJ_TREE;
-	}
-	else if (ls_options & LS_TREE_ONLY)
-		return 0;
+	init_type(mode, &type);
+	init_recursive(base, pathname, &recursive);
+
+	if (type == OBJ_TREE && recursive && !(ls_options & LS_SHOW_TREES))
+		return recursive;
+	if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
+		return !READ_TREE_RECURSIVE;
 
 	if (!(ls_options & LS_NAME_ONLY)) {
 		if (ls_options & LS_SHOW_SIZE) {
@@ -109,7 +119,7 @@  static int show_tree(const struct object_id *oid, struct strbuf *base,
 				   chomp_prefix ? ls_tree_prefix : NULL,
 				   stdout, line_termination);
 	strbuf_setlen(base, baselen);
-	return retval;
+	return recursive;
 }
 
 int cmd_ls_tree(int argc, const char **argv, const char *prefix)