diff mbox series

sequencer: clarify intention to break out of loop

Message ID 20181028153145.25734-1-martin.agren@gmail.com (mailing list archive)
State New, archived
Headers show
Series sequencer: clarify intention to break out of loop | expand

Commit Message

Martin Ågren Oct. 28, 2018, 3:31 p.m. UTC
When we find a space, we set `len = i`, which gives us the answer we are
looking for, but which also breaks out of the loop through these steps:

  1. `len = i`

  2. `i = i + 1`

  3. Is `i < len`? No, so break out.

Since `i` is signed, step 2 is undefined if `i` has the value `INT_MAX`.
It can't actually have that value, but that doesn't stop my copy of gcc
7.3.0 from throwing the following:

> sequencer.c:2853:3: error: assuming signed overflow does not occur when
> assuming that (X + c) < X is always false [-Werror=strict-overflow]
>    for (i = 0; i < len; i++)
>    ^~~

That is, the compiler has realized that the code is essentially
evaluating "(len + 1) < len" and that for `len = INT_MAX`, this is
undefined behavior. What it hasn't figured out is that if `i` and `len`
are both `INT_MAX` after step 1, then `len` must have had a value larger
than `INT_MAX` before that step, which it can't have had.

Let's be explicit about breaking out of the loop. This helps the
compiler grok our intention. As a bonus, it might make it (even) more
obvious to human readers that the loop stops at the first space.

While at it, reduce the scope of `i`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 sequencer.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Eric Sunshine Oct. 28, 2018, 7:01 p.m. UTC | #1
On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren <martin.agren@gmail.com> wrote:
> [...]
> Let's be explicit about breaking out of the loop. This helps the
> compiler grok our intention. As a bonus, it might make it (even) more
> obvious to human readers that the loop stops at the first space.

This did come up in review[1,2]. I had a hard time understanding the
loop because it looked idiomatic but wasn't (and we have preconceived
notions about behavior of things which look idiomatic).

[1]: https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/
[2]: https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/

> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> diff --git a/sequencer.c b/sequencer.c
> @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>                 /* Determine the length of the label */
> +               for (i = 0; i < len; i++) {
> +                       if (isspace(name[i])) {
>                                 len = i;
> +                               break;
> +                       }
> +               }
>                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);

At least for me, this would be more idiomatic if it was coded like this:

    for (i = 0; i < len; i++)
        if (isspace(name[i]))
            break;
    strbuf_addf(..., i, name);

or, perhaps (less concise):

    for (i = 0; i < len; i++)
        if (isspace(name[i]))
            break;
    len = i;
    strbuf_addf(..., len, name);
Martin Ågren Oct. 28, 2018, 8:37 p.m. UTC | #2
On Sun, 28 Oct 2018 at 20:01, Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Sun, Oct 28, 2018 at 11:32 AM Martin Ågren <martin.agren@gmail.com> wrote:
> > [...]
> > Let's be explicit about breaking out of the loop. This helps the
> > compiler grok our intention. As a bonus, it might make it (even) more
> > obvious to human readers that the loop stops at the first space.
>
> This did come up in review[1,2]. I had a hard time understanding the
> loop because it looked idiomatic but wasn't (and we have preconceived
> notions about behavior of things which look idiomatic).
>
> [1]: https://public-inbox.org/git/CAPig+cQbG2s-LrAo9+7C7=dXifbWFJ3SzuNa-QePHDk7egK=jg@mail.gmail.com/
> [2]: https://public-inbox.org/git/CAPig+cRjU6niXpT2FrDWZ0x1HmGf1ojVZj3uk2qXEGe-S7i_HQ@mail.gmail.com/

Hmm, I should have been able to dig those up myself. Thanks for the
pointers.

> >                 /* Determine the length of the label */
> > +               for (i = 0; i < len; i++) {
> > +                       if (isspace(name[i])) {
> >                                 len = i;
> > +                               break;
> > +                       }
> > +               }
> >                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
>
> At least for me, this would be more idiomatic if it was coded like this:
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     strbuf_addf(..., i, name);
>
> or, perhaps (less concise):
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     len = i;
>     strbuf_addf(..., len, name);

This second one is more true to the original in that it updates `len` to
the new, shorter length. Which actually seems to be needed -- towards
the very end of the function, `len` is used, so the first of these
suggestions would change the behavior.

Thanks a lot for a review. I'll wait for any additional comments and
will try a v2 with your second suggestion.

Martin
Junio C Hamano Oct. 29, 2018, 3:43 a.m. UTC | #3
Eric Sunshine <sunshine@sunshineco.com> writes:

>> diff --git a/sequencer.c b/sequencer.c
>> @@ -2849,10 +2849,14 @@ static int do_reset(const char *name, int len, struct replay_opts *opts)
>>                 /* Determine the length of the label */
>> +               for (i = 0; i < len; i++) {
>> +                       if (isspace(name[i])) {
>>                                 len = i;
>> +                               break;
>> +                       }
>> +               }
>>                 strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
>
> At least for me, this would be more idiomatic if it was coded like this:
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     strbuf_addf(..., i, name);
>
> or, perhaps (less concise):
>
>     for (i = 0; i < len; i++)
>         if (isspace(name[i]))
>             break;
>     len = i;
>     strbuf_addf(..., len, name);

Yup, the former is more idiomatic between these two; the latter is
also fine.

The result of Martin's patch shares the same "Huh?" factor as the
original that mucks with the "supposedly constant" side of the
termination condition, and from that point of view, it is not that
much of a readability improvement, I would think.
diff mbox series

Patch

diff --git a/sequencer.c b/sequencer.c
index 0c164d5f98..a351638ad9 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2829,7 +2829,7 @@  static int do_reset(const char *name, int len, struct replay_opts *opts)
 	struct tree_desc desc;
 	struct tree *tree;
 	struct unpack_trees_options unpack_tree_opts;
-	int ret = 0, i;
+	int ret = 0;
 
 	if (hold_locked_index(&lock, LOCK_REPORT_ON_ERROR) < 0)
 		return -1;
@@ -2849,10 +2849,14 @@  static int do_reset(const char *name, int len, struct replay_opts *opts)
 		}
 		oidcpy(&oid, &opts->squash_onto);
 	} else {
+		int i;
 		/* Determine the length of the label */
-		for (i = 0; i < len; i++)
-			if (isspace(name[i]))
+		for (i = 0; i < len; i++) {
+			if (isspace(name[i])) {
 				len = i;
+				break;
+			}
+		}
 
 		strbuf_addf(&ref_name, "refs/rewritten/%.*s", len, name);
 		if (get_oid(ref_name.buf, &oid) &&