diff mbox series

[2/2] merge: avoid searching for strategies with fewer than 0 conflicts

Message ID 5657a05e7635ecadbb8d2e41ad97fe19f3633fdd.1661056709.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Miscellaneous merge fixes | expand

Commit Message

Elijah Newren Aug. 21, 2022, 4:38 a.m. UTC
From: Elijah Newren <newren@gmail.com>

builtin/merge.c has a loop over the specified strategies, where if
they all fail with conflicts, it picks the one with the least number
of conflicts.

In the codepath that finds a successful merge, if an automatic commit
was wanted, the code breaks out of the above loop, which makes sense.
However, if the user requested there be no automatic commit, the loop
would continue looking for a "better" strategy.  Since it had just
found a strategy with 0 conflicts though, and it is not possible to
have fewer than 0 conflicts, the continuing search is guaranteed to be
futile.

While searching additional strategies won't cause problems other than
wasting energy, it is wasteful.  Avoid searching for other strategies
with fewer than 0 conflicts.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/merge.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Junio C Hamano Aug. 21, 2022, 6:05 p.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> builtin/merge.c has a loop over the specified strategies, where if
> they all fail with conflicts, it picks the one with the least number
> of conflicts.
>
> In the codepath that finds a successful merge, if an automatic commit
> was wanted, the code breaks out of the above loop, which makes sense.
> However, if the user requested there be no automatic commit, the loop
> would continue looking for a "better" strategy.  Since it had just
> found a strategy with 0 conflicts though, and it is not possible to
> have fewer than 0 conflicts, the continuing search is guaranteed to be
> futile.

Let's imagine "git merge -s ours -s ort X", both of which resolve
the merge cleanly but differently.

The "best_cnt" thing tries to find the last strategy with the lowest
score from evaluate_result(), so strictly speaking I think this
changes the behaviour.  Am I mistaken?

When two strategies 'ours' and 'ort' resolved a given merge cleanly
(but differently), we would have taken the result from 'ort' in the
original code, but now we stop at seeing that 'ours' resolved the
merge cleanly and use its result.

>  			cnt = (use_strategies_nr > 1) ? evaluate_result() : 0;
>  			if (best_cnt <= 0 || cnt <= best_cnt) {

While I think "we see one clean merge, so it is OK to quit" is more
intuitive than the current behaviour, we need to update this
comparison to require the new candidate to have strictly better
result to take over the tentative best from the current candidate.
That will make things consistent with this change and makes it
easier to sell.

As the userbase of Git is so wide, it is very plausible that
somebody already invented and publisized that prepending "-s ours"
before the real strategy they want to use as an idiom to say "fall
back to pretend merge cleanly succeeded but did not affect the tree"
in a script that automate rebuilding tentative integration branch to
test, or something.  They can "fix" their "idiom" to append instead
of prepend "-s ours", and that would arguably make the resulting
command line more intuitive and easier to understand, but the fact
that whatever that was working for them to their liking suddenly
changes the behaviour is a regression we have to ask them to accept.

I do not mind writing something like this

    "git merge" with multiple strategies chooses to leave the least
    number of conflicted paths as the result.  Among multiple strategies
    that give the same number of conflicted paths, it used to use the
    last such strategy with the smallest number of conflicted paths.
    The command now prefers the first such strategy instead.

    If you have been using "git merge -s ours -s another X" as a way to
    say "we prefer 'another' strategy to merge, but use 'ours' strategy
    to make progress if it does not", you now have to swap the order of
    strategies you list on the command line.

in the "Backward Incompatibility Notes" section of the release
notes.

Thanks.
Elijah Newren Aug. 22, 2022, 3 p.m. UTC | #2
On Sun, Aug 21, 2022 at 11:05 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > builtin/merge.c has a loop over the specified strategies, where if
> > they all fail with conflicts, it picks the one with the least number
> > of conflicts.
> >
> > In the codepath that finds a successful merge, if an automatic commit
> > was wanted, the code breaks out of the above loop, which makes sense.
> > However, if the user requested there be no automatic commit, the loop
> > would continue looking for a "better" strategy.  Since it had just
> > found a strategy with 0 conflicts though, and it is not possible to
> > have fewer than 0 conflicts, the continuing search is guaranteed to be
> > futile.
>
> Let's imagine "git merge -s ours -s ort X", both of which resolve
> the merge cleanly but differently.
>
> The "best_cnt" thing tries to find the last strategy with the lowest
> score from evaluate_result(), so strictly speaking I think this
> changes the behaviour.  Am I mistaken?

Yes, you are.

Though, to be fair, my commit message is wrong too.

I'll explain below...

> When two strategies 'ours' and 'ort' resolved a given merge cleanly
> (but differently), we would have taken the result from 'ort' in the
> original code, but now we stop at seeing that 'ours' resolved the
> merge cleanly and use its result.
>
> >                       cnt = (use_strategies_nr > 1) ? evaluate_result() : 0;
> >                       if (best_cnt <= 0 || cnt <= best_cnt) {

No, the original code would not have taken 'ort'.  You have overlooked
the part of the code immediately above these two lines:

    if (ret < 2) {
        if (!ret) {
            if (option_commit) {
                /* Automerge succeeded. */
                automerge_was_ok = 1;
                break;
            }
        merge_was_ok = 1;
        }

In particular, the "break" is key.  If the first strategy succeeds
(and the user did not specify --no-commit), then the loop will hit
that break statement and bail out of the loop, preventing any other
merge strategy from being tried.  In contrast, if the user did specify
--no-commit and the previous strategy succeeded, then we will continue
the loop.  That seems rather inconsistent, since --no-commit should
not affect the choice of strategy.

However, I missed two things in my reading.  You are correct that I
missed the "<=" as opposed to "<" when I wrote my commit message,
though that turns out to not matter much due to the second thing.  The
second thing I missed was part of the code at the beginning of the
loop:

    for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {

In particular, that "!merge_was_ok" check means that the code would
bail early whenever ret was 0, regardless of whether --no-commit was
passed.  So, my code turns out to not only leave behavior alone, but
also doesn't count as an optimization either -- it's simply a
half-baked cleanup.  If I also get rid of the now-redundant
"!merge_was_ok" check in the for loop and fix my commit message, then
I think it'd be a fully-baked code cleanup.

I'll submit an update.
Junio C Hamano Aug. 22, 2022, 4:19 p.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

> No, the original code would not have taken 'ort'.  You have overlooked
> the part of the code immediately above these two lines:
>
>     if (ret < 2) {
>         if (!ret) {
>             if (option_commit) {
>                 /* Automerge succeeded. */
>                 automerge_was_ok = 1;
>                 break;
>             }
>         merge_was_ok = 1;
>         }
>
> In particular, the "break" is key.

I noticed that much but then would "git merge --no-commit -s A -s B"
have the issue?

> ...  In contrast, if the user did specify
> --no-commit and the previous strategy succeeded, then we will continue
> the loop.  That seems rather inconsistent, since --no-commit should
> not affect the choice of strategy.

Yeah, exactly.

> However, I missed two things in my reading.  You are correct that I
> missed the "<=" as opposed to "<" when I wrote my commit message,
> though that turns out to not matter much due to the second thing.  The
> second thing I missed was part of the code at the beginning of the
> loop:
>
>     for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {

Ahhhh, that explains it.  We leave as soon as we find merge_was_ok
so this patch is not necessary.  There is nothing to fix.  The
original was fine as-is.

Thanks.
Elijah Newren Aug. 23, 2022, 1:18 a.m. UTC | #4
On Mon, Aug 22, 2022 at 9:19 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
[...]
> > ...  In contrast, if the user did specify
> > --no-commit and the previous strategy succeeded, then we will continue
> > the loop.  That seems rather inconsistent, since --no-commit should
> > not affect the choice of strategy.
>
> Yeah, exactly.
>
> > However, I missed two things in my reading.  You are correct that I
> > missed the "<=" as opposed to "<" when I wrote my commit message,
> > though that turns out to not matter much due to the second thing.  The
> > second thing I missed was part of the code at the beginning of the
> > loop:
> >
> >     for (i = 0; !merge_was_ok && i < use_strategies_nr; i++) {
>
> Ahhhh, that explains it.  We leave as soon as we find merge_was_ok
> so this patch is not necessary.  There is nothing to fix.  The
> original was fine as-is.

Right, there was no bug or even optimization opportunity in the
original, it was just confusing...as evidenced by the fact that we
both misread the code.

The fact that we both misread the code, though, suggests there is
something to fix: code readability.
diff mbox series

Patch

diff --git a/builtin/merge.c b/builtin/merge.c
index b4253710d19..f04100ce0da 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1718,12 +1718,18 @@  int cmd_merge(int argc, const char **argv, const char *prefix)
 		 */
 		if (ret < 2) {
 			if (!ret) {
-				if (option_commit) {
+				if (option_commit)
 					/* Automerge succeeded. */
 					automerge_was_ok = 1;
-					break;
-				}
-				merge_was_ok = 1;
+				else
+					/* Merge good, but let user commit */
+					merge_was_ok = 1;
+				/*
+				 * This strategy worked; no point in trying
+				 * another.
+				 */
+				best_strategy = wt_strategy;
+				break;
 			}
 			cnt = (use_strategies_nr > 1) ? evaluate_result() : 0;
 			if (best_cnt <= 0 || cnt <= best_cnt) {