diff mbox series

[v3] diffcore-break: use a goto instead of a redundant if statement

Message ID 20190929005646.734046-1-alexhenrie24@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v3] diffcore-break: use a goto instead of a redundant if statement | expand

Commit Message

Alex Henrie Sept. 29, 2019, 12:56 a.m. UTC
The condition "if (q->nr <= j)" checks whether the loop exited normally
or via a break statement. This check can be avoided by replacing the
jump to the end of the loop with a jump to the end of the function.

With the break replaced by a goto, the two diff_q calls then can be
replaced with a single diff_q call outside of the outer if statement.

Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
---
 diffcore-break.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

Comments

CB Bailey Sept. 29, 2019, 9:37 a.m. UTC | #1
On Sat, Sep 28, 2019 at 06:56:46PM -0600, Alex Henrie wrote:
> The condition "if (q->nr <= j)" checks whether the loop exited normally
> or via a break statement. This check can be avoided by replacing the
> jump to the end of the loop with a jump to the end of the function.
> 
> With the break replaced by a goto, the two diff_q calls then can be
> replaced with a single diff_q call outside of the outer if statement.
> 
> Reviewed-by: Derrick Stolee <stolee@gmail.com>
> Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
> ---
>  diffcore-break.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)

For easier discussion, I've snipped the original patch and replaced with
one with enough context to show the entire function.

I was reviewing this patch and it appeared to introduce a change in
behaviour.

> diff --git a/diffcore-break.c b/diffcore-break.c
> index 875aefd3fe..f6ab74141b 100644
> --- a/diffcore-break.c
> +++ b/diffcore-break.c
> @@ -262,44 +262,43 @@ static void merge_broken(struct diff_filepair *p,
> 
>  void diffcore_merge_broken(void)
>  {
>  	struct diff_queue_struct *q = &diff_queued_diff;
>  	struct diff_queue_struct outq;
>  	int i, j;
> 
>  	DIFF_QUEUE_CLEAR(&outq);
> 
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filepair *p = q->queue[i];
>  		if (!p)
>  			/* we already merged this with its peer */
>  			continue;
>  		else if (p->broken_pair &&
>  			 !strcmp(p->one->path, p->two->path)) {
>  			/* If the peer also survived rename/copy, then
>  			 * we merge them back together.
>  			 */
>  			for (j = i + 1; j < q->nr; j++) {
>  				struct diff_filepair *pp = q->queue[j];
>  				if (pp->broken_pair &&
>  				    !strcmp(pp->one->path, pp->two->path) &&
>  				    !strcmp(p->one->path, pp->two->path)) {
>  					/* Peer survived.  Merge them */
>  					merge_broken(p, pp, &outq);
>  					q->queue[j] = NULL;
> -					break;
> +					goto done;

Previously, if the condition matched in the inner loop, the function
would null out the entry in the queue that that inner loop had reached
(q->queue[j] = NULL) and then break out of the inner loop. This meant
that the outer loop would skip over this entry (if (!p)).

The change introduced seems to break out of both loops as soon as we
reach one match, whereas before other subsequent matches would be
considered and merged. Not only this, but the outer 'else' case for all
subsequent entries is skipped so the rest of the entries the original
queue are missing from 'outq'.

>  				}
>  			}
> -			if (q->nr <= j)
> -				/* The peer did not survive, so we keep
> -				 * it in the output.
> -				 */
> -				diff_q(&outq, p);
> +			/* The peer did not survive, so we keep
> +			 * it in the output.
> +			 */
>  		}
> -		else
> -			diff_q(&outq, p);
> +		diff_q(&outq, p);
>  	}
> +
> +done:
>  	free(q->queue);
>  	*q = outq;
> 
>  	return;
>  }

I spent a bit of time trying to see if this change was user visible
which turned out to be unneeded as t4008-diff-break-rewrite.sh already
fails with this change for me in my environment, initially with this
test but also 3 other tests in this file.

> expecting success of 4008.6 'run diff with -B (#3)':
> 	git diff-index -B reference >current &&
> 	cat >expect <<-EOF &&
> 	:100644 100644 $blob0_id $blob1_id M100	file0
> 	:100644 100644 $blob1_id $blob0_id M100	file1
> 	EOF
> 	compare_diff_raw expect current
> 
> --- .tmp-1	2019-09-29 09:21:07.089070076 +0000
> +++ .tmp-2	2019-09-29 09:21:07.093070086 +0000
> @@ -1,2 +1 @@
>  :100644 100644 548142c327a6790ff8821d67c2ee1eff7a656b52 6ff87c4664981e4397625791c8ea3bbb5f2279a3 M#	file0
> -:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 548142c327a6790ff8821d67c2ee1eff7a656b52 M#	file1
> not ok 6 - run diff with -B (#3)
Alex Henrie Sept. 29, 2019, 8:10 p.m. UTC | #2
On Sun, Sep 29, 2019 at 3:37 AM CB Bailey <cb@hashpling.org> wrote:
>
> Previously, if the condition matched in the inner loop, the function
> would null out the entry in the queue that that inner loop had reached
> (q->queue[j] = NULL) and then break out of the inner loop. This meant
> that the outer loop would skip over this entry (if (!p)).
>
> The change introduced seems to break out of both loops as soon as we
> reach one match, whereas before other subsequent matches would be
> considered and merged. Not only this, but the outer 'else' case for all
> subsequent entries is skipped so the rest of the entries the original
> queue are missing from 'outq'.

> I spent a bit of time trying to see if this change was user visible
> which turned out to be unneeded as t4008-diff-break-rewrite.sh already
> fails with this change for me in my environment, initially with this
> test but also 3 other tests in this file.

Thank you for reviewing this. I should have run `make test` myself
before sending the patch; I do indeed see the same test failure that
you saw. I will send a v4 of this patch with the label in the right
place.

-Alex
Junio C Hamano Sept. 30, 2019, 1:14 a.m. UTC | #3
CB Bailey <cb@hashpling.org> writes:

> For easier discussion, I've snipped the original patch and replaced with
> one with enough context to show the entire function.
>
> I was reviewing this patch and it appeared to introduce a change in
> behaviour.
>
>> diff --git a/diffcore-break.c b/diffcore-break.c
>> index 875aefd3fe..f6ab74141b 100644
>> --- a/diffcore-break.c
>> +++ b/diffcore-break.c
>> @@ -262,44 +262,43 @@ static void merge_broken(struct diff_filepair *p,
>> 
>>  void diffcore_merge_broken(void)
>>  {
>>  	struct diff_queue_struct *q = &diff_queued_diff;
>>  	struct diff_queue_struct outq;
>>  	int i, j;
>> 
>>  	DIFF_QUEUE_CLEAR(&outq);
>> 
>>  	for (i = 0; i < q->nr; i++) {
>>  		struct diff_filepair *p = q->queue[i];
>>  		if (!p)
>>  			/* we already merged this with its peer */
>>  			continue;
>>  		else if (p->broken_pair &&
>>  			 !strcmp(p->one->path, p->two->path)) {
>>  			/* If the peer also survived rename/copy, then
>>  			 * we merge them back together.
>>  			 */
>>  			for (j = i + 1; j < q->nr; j++) {
>>  				struct diff_filepair *pp = q->queue[j];
>>  				if (pp->broken_pair &&
>>  				    !strcmp(pp->one->path, pp->two->path) &&
>>  				    !strcmp(p->one->path, pp->two->path)) {
>>  					/* Peer survived.  Merge them */
>>  					merge_broken(p, pp, &outq);
>>  					q->queue[j] = NULL;
>> -					break;
>> +					goto done;
>
> Previously, if the condition matched in the inner loop, the function
> would null out the entry in the queue that that inner loop had reached
> (q->queue[j] = NULL) and then break out of the inner loop. This meant
> that the outer loop would skip over this entry (if (!p)).
>
> The change introduced seems to break out of both loops as soon as we
> reach one match, whereas before other subsequent matches would be
> considered and merged. Not only this, but the outer 'else' case for all
> subsequent entries is skipped so the rest of the entries the original
> queue are missing from 'outq'.

Thanks.

Sometimes judicious use of 'goto' makes the resulting code easier to
follow, but quite honestly, I do not see it happening with this
change.  The original makes it much more clear that there are three
cases to worry about:

    A. an earlier round handled this one already;

    B. we have a broken pair and need to find the other one,
       B-1. if there is, we process it;
       B-2. otherwise we keep it in the outq.

    C. a normal one that does not need the complication of B is
       sent to the outq.

and I find it much easier to follow without any goto.

>
>>  				}
>>  			}
>> -			if (q->nr <= j)
>> -				/* The peer did not survive, so we keep
>> -				 * it in the output.
>> -				 */
>> -				diff_q(&outq, p);
>> +			/* The peer did not survive, so we keep
>> +			 * it in the output.
>> +			 */
>>  		}
>> -		else
>> -			diff_q(&outq, p);
>> +		diff_q(&outq, p);
>>  	}
>> +
>> +done:
>>  	free(q->queue);
>>  	*q = outq;
>> 
>>  	return;
>>  }
>
> I spent a bit of time trying to see if this change was user visible
> which turned out to be unneeded as t4008-diff-break-rewrite.sh already
> fails with this change for me in my environment, initially with this
> test but also 3 other tests in this file.
>
>> expecting success of 4008.6 'run diff with -B (#3)':
>> 	git diff-index -B reference >current &&
>> 	cat >expect <<-EOF &&
>> 	:100644 100644 $blob0_id $blob1_id M100	file0
>> 	:100644 100644 $blob1_id $blob0_id M100	file1
>> 	EOF
>> 	compare_diff_raw expect current
>> 
>> --- .tmp-1	2019-09-29 09:21:07.089070076 +0000
>> +++ .tmp-2	2019-09-29 09:21:07.093070086 +0000
>> @@ -1,2 +1 @@
>>  :100644 100644 548142c327a6790ff8821d67c2ee1eff7a656b52 6ff87c4664981e4397625791c8ea3bbb5f2279a3 M#	file0
>> -:100644 100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 548142c327a6790ff8821d67c2ee1eff7a656b52 M#	file1
>> not ok 6 - run diff with -B (#3)
diff mbox series

Patch

diff --git a/diffcore-break.c b/diffcore-break.c
index 875aefd3fe..f6ab74141b 100644
--- a/diffcore-break.c
+++ b/diffcore-break.c
@@ -286,18 +286,17 @@  void diffcore_merge_broken(void)
 					/* Peer survived.  Merge them */
 					merge_broken(p, pp, &outq);
 					q->queue[j] = NULL;
-					break;
+					goto done;
 				}
 			}
-			if (q->nr <= j)
-				/* The peer did not survive, so we keep
-				 * it in the output.
-				 */
-				diff_q(&outq, p);
+			/* The peer did not survive, so we keep
+			 * it in the output.
+			 */
 		}
-		else
-			diff_q(&outq, p);
+		diff_q(&outq, p);
 	}
+
+done:
 	free(q->queue);
 	*q = outq;