diff mbox series

[8/8] worktree: simplify search for unique worktree

Message ID 8383c246f8c23e61dedd69d6e69c72d51fd6b469.1599762679.git.martin.agren@gmail.com
State Superseded
Headers show
Series various wt-status/worktree cleanups | expand

Commit Message

Martin Ågren Sept. 10, 2020, 7:03 p.m. UTC
We track the number of worktrees we've found and break out of the loop
early once we hit 2. This is because we're not really interested in the
number of matches -- we just want to make sure that we don't find more
than one worktree that matches the suffix. This can be done just as well
by checking the NULL-ness of the pointer where we collect our
answer-to-be. Drop the redundant `nr_found` variable.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 worktree.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Eric Sunshine Sept. 10, 2020, 7:28 p.m. UTC | #1
On Thu, Sep 10, 2020 at 3:08 PM Martin Ågren <martin.agren@gmail.com> wrote:
> We track the number of worktrees we've found and break out of the loop
> early once we hit 2. This is because we're not really interested in the
> number of matches -- we just want to make sure that we don't find more
> than one worktree that matches the suffix. This can be done just as well
> by checking the NULL-ness of the pointer where we collect our
> answer-to-be. Drop the redundant `nr_found` variable.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> diff --git a/worktree.c b/worktree.c
> @@ -172,13 +172,13 @@ static struct worktree *find_worktree_by_suffix(struct worktree **list,
>                                                 const char *suffix)
>  {
> -       for (; *list && nr_found < 2; list++) {
> +       for (; *list; list++) {
> @@ -186,11 +186,12 @@ static struct worktree *find_worktree_by_suffix(struct worktree **list,
>                 if ((!start || (start > 0 && is_dir_sep(path[start - 1]))) &&
>                     !fspathcmp(suffix, path + start)) {
> +                       if (found)
> +                               return NULL;
>                         found = *list;
> -                       nr_found++;
>                 }
>         }
> -       return nr_found == 1 ? found : NULL;
> +       return found;

Although this change appears to be correct and does simplify the code,
I think it also makes it a bit more opaque. With the explicit
`nr_found == 1`, it is quite obvious that the function considers
"success" to be when one and only one entry is found and any other
number is failure. But with the revised code, it is harder to work out
precisely what the conditions are. Having said that, I think a simple
comment before the function would suffice to clear up the opaqueness.
Perhaps something like:

    /* If exactly one worktree matches 'target', return it, else NULL. */
Martin Ågren Sept. 10, 2020, 7:48 p.m. UTC | #2
On Thu, 10 Sep 2020 at 21:28, Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Sep 10, 2020 at 3:08 PM Martin Ågren <martin.agren@gmail.com> wrote:
> > We track the number of worktrees we've found and break out of the loop
> > early once we hit 2. This is because we're not really interested in the
> > number of matches -- we just want to make sure that we don't find more
> > than one worktree that matches the suffix. This can be done just as well
> > by checking the NULL-ness of the pointer where we collect our
> > answer-to-be. Drop the redundant `nr_found` variable.
> >
> > Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> > ---
> > diff --git a/worktree.c b/worktree.c
> > @@ -172,13 +172,13 @@ static struct worktree *find_worktree_by_suffix(struct worktree **list,
> >                                                 const char *suffix)
> >  {
> > -       for (; *list && nr_found < 2; list++) {
> > +       for (; *list; list++) {
> > @@ -186,11 +186,12 @@ static struct worktree *find_worktree_by_suffix(struct worktree **list,
> >                 if ((!start || (start > 0 && is_dir_sep(path[start - 1]))) &&
> >                     !fspathcmp(suffix, path + start)) {
> > +                       if (found)
> > +                               return NULL;
> >                         found = *list;
> > -                       nr_found++;
> >                 }
> >         }
> > -       return nr_found == 1 ? found : NULL;
> > +       return found;
>
> Although this change appears to be correct and does simplify the code,
> I think it also makes it a bit more opaque. With the explicit
> `nr_found == 1`, it is quite obvious that the function considers
> "success" to be when one and only one entry is found and any other
> number is failure. But with the revised code, it is harder to work out
> precisely what the conditions are.

Thanks for commenting. I found the original trickier than it had to be.
It spreads out the logic in several places and is careful to short-cut
the loop. My first thought was "why doesn't this just use the standard
form?". But I'm open to the idea that it might be a fairly personal
standard form... If there's any risk that someone else comes along and
simplifies this to use a `nr_found` variable, then maybe file this under
code churning?

> Having said that, I think a simple
> comment before the function would suffice to clear up the opaqueness.
> Perhaps something like:
>
>     /* If exactly one worktree matches 'target', return it, else NULL. */

That's a good suggestion regardless.

Thanks
Martin
Eric Sunshine Sept. 10, 2020, 8:01 p.m. UTC | #3
On Thu, Sep 10, 2020 at 3:48 PM Martin Ågren <martin.agren@gmail.com> wrote:
> On Thu, 10 Sep 2020 at 21:28, Eric Sunshine <sunshine@sunshineco.com> wrote:
> > Although this change appears to be correct and does simplify the code,
> > I think it also makes it a bit more opaque. With the explicit
> > `nr_found == 1`, it is quite obvious that the function considers
> > "success" to be when one and only one entry is found and any other
> > number is failure. But with the revised code, it is harder to work out
> > precisely what the conditions are.
>
> Thanks for commenting. I found the original trickier than it had to be.
> It spreads out the logic in several places and is careful to short-cut
> the loop. My first thought was "why doesn't this just use the standard
> form?". But I'm open to the idea that it might be a fairly personal
> standard form... If there's any risk that someone else comes along and
> simplifies this to use a `nr_found` variable, then maybe file this under
> code churning?

Maybe. Dunno. Even with the suggested function comment, I still find
that the revised code has a higher cognitive load because the reader
has to remember or figure out mentally what `found` contains by the
`return found;` at the end of the function. Compare that with the old
code, in which the reader doesn't have to remember or figure out
anything. The final `return nr_found == 1 ? found : NULL;` condition
spells out everything the reader needs to know -- even if the reader
didn't pay close attention to the meat of the function. So, we each
have a different take on the apparent complexity.

> > Having said that, I think a simple
> > comment before the function would suffice to clear up the opaqueness.
> > Perhaps something like:
> >
> >     /* If exactly one worktree matches 'target', return it, else NULL. */
>
> That's a good suggestion regardless.

The function is so small that the increased cognitive load (for me) in
the rewrite probably doesn't matter at all, so I don't feel strongly
one way or the other. Keeping the patch (amended with the suggested
comment) or dropping it are both suitable options.
Junio C Hamano Sept. 10, 2020, 9:08 p.m. UTC | #4
Eric Sunshine <sunshine@sunshineco.com> writes:

>> Thanks for commenting. I found the original trickier than it had to be.
>> It spreads out the logic in several places and is careful to short-cut
>> the loop. My first thought was "why doesn't this just use the standard
>> form?". But I'm open to the idea that it might be a fairly personal
>> standard form... If there's any risk that someone else comes along and
>> simplifies this to use a `nr_found` variable, then maybe file this under
>> code churning?
>
> Maybe. Dunno. Even with the suggested function comment, I still find
> that the revised code has a higher cognitive load because the reader
> has to remember or figure out mentally what `found` contains by the
> `return found;` at the end of the function.

Judging from the phrase "standard form", I have a feeling that
Martin thinks that the updated code is more intuitive (i.e. "we see
another one while keeping the one we already found, so we know there
is no unique answer").  I do not claim that would be the standard way
and using a counter clamped to 2 is substandard, but I find the code
more readable with the patch than the original.

Even though it helps somewhat that the counter is named "nr_found"
and not "nr_match", I found it a bit awkward that the counter in the
original pretends to count all the answers, yet does not really
count all of them and stops at 2.

I agree with Martin that this would probably be subjective.

Thanks.
diff mbox series

Patch

diff --git a/worktree.c b/worktree.c
index faac87422c..ac754b88ff 100644
--- a/worktree.c
+++ b/worktree.c
@@ -172,13 +172,13 @@  static struct worktree *find_worktree_by_suffix(struct worktree **list,
 						const char *suffix)
 {
 	struct worktree *found = NULL;
-	int nr_found = 0, suffixlen;
+	int suffixlen;
 
 	suffixlen = strlen(suffix);
 	if (!suffixlen)
 		return NULL;
 
-	for (; *list && nr_found < 2; list++) {
+	for (; *list; list++) {
 		const char	*path	 = (*list)->path;
 		int		 pathlen = strlen(path);
 		int		 start	 = pathlen - suffixlen;
@@ -186,11 +186,12 @@  static struct worktree *find_worktree_by_suffix(struct worktree **list,
 		/* suffix must start at directory boundary */
 		if ((!start || (start > 0 && is_dir_sep(path[start - 1]))) &&
 		    !fspathcmp(suffix, path + start)) {
+			if (found)
+				return NULL;
 			found = *list;
-			nr_found++;
 		}
 	}
-	return nr_found == 1 ? found : NULL;
+	return found;
 }
 
 struct worktree *find_worktree(struct worktree **list,