diff mbox series

[v2,4/8] merge-tree: implement real merges

Message ID 1710ba4a9e432e2a854579c4c929e7f2cfc92211.1641403655.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series RFC: Server side merges (no ref updating, no commit creating, no touching worktree or index) | expand

Commit Message

Elijah Newren Jan. 5, 2022, 5:27 p.m. UTC
From: Elijah Newren <newren@gmail.com>

This adds the ability to perform real merges rather than just trivial
merges (meaning handling three way content merges, recursive ancestor
consolidation, renames, proper directory/file conflict handling, and so
forth).  However, unlike `git merge`, the working tree and index are
left alone and no branch is updated.

The only output is:
  - the toplevel resulting tree printed on stdout
  - exit status of 0 (clean) or 1 (conflicts present)

This output is mean to be used by some higher level script, perhaps in a
sequence of steps like this:

   NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2)
   test $? -eq 0 || die "There were conflicts..."
   NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
   git update-ref $BRANCH1 $NEWCOMMIT

Note that higher level scripts may also want to access the
conflict/warning messages normally output during a merge, or have quick
access to a list of files with conflicts.  That is not available in this
preliminary implementation, but subsequent commits will add that
ability.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/git-merge-tree.txt | 28 +++++++----
 builtin/merge-tree.c             | 55 +++++++++++++++++++++-
 t/t4301-merge-tree-real.sh       | 81 ++++++++++++++++++++++++++++++++
 3 files changed, 153 insertions(+), 11 deletions(-)
 create mode 100755 t/t4301-merge-tree-real.sh

Comments

Johannes Schindelin Jan. 7, 2022, 3:30 p.m. UTC | #1
Hi Elijah,

On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> This adds the ability to perform real merges rather than just trivial
> merges (meaning handling three way content merges, recursive ancestor
> consolidation, renames, proper directory/file conflict handling, and so
> forth).  However, unlike `git merge`, the working tree and index are
> left alone and no branch is updated.
>
> The only output is:
>   - the toplevel resulting tree printed on stdout
>   - exit status of 0 (clean) or 1 (conflicts present)
>
> This output is mean to be used by some higher level script, perhaps in a
                 ^^^^

My apologies for pointing out a grammar issue: This probably intended to
say "meant", as the word "mean" changes the sense of the sentence.

In my defense, I have more substantial suggestions below.

> sequence of steps like this:
>
>    NEWTREE=$(git merge-tree --real $BRANCH1 $BRANCH2)
>    test $? -eq 0 || die "There were conflicts..."
>    NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
>    git update-ref $BRANCH1 $NEWCOMMIT
>
> Note that higher level scripts may also want to access the
> conflict/warning messages normally output during a merge, or have quick
> access to a list of files with conflicts.  That is not available in this
> preliminary implementation, but subsequent commits will add that
> ability.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  Documentation/git-merge-tree.txt | 28 +++++++----
>  builtin/merge-tree.c             | 55 +++++++++++++++++++++-
>  t/t4301-merge-tree-real.sh       | 81 ++++++++++++++++++++++++++++++++
>  3 files changed, 153 insertions(+), 11 deletions(-)
>  create mode 100755 t/t4301-merge-tree-real.sh
>
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index 58731c19422..5823938937f 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -3,26 +3,34 @@ git-merge-tree(1)
>
>  NAME
>  ----
> -git-merge-tree - Show three-way merge without touching index
> +git-merge-tree - Perform merge without touching index or working tree
>
>
>  SYNOPSIS
>  --------
>  [verse]
> +'git merge-tree' --real <branch1> <branch2>
>  'git merge-tree' <base-tree> <branch1> <branch2>

Here is an idea: How about aiming for this synopsis instead, exploiting
the fact that the "real" mode takes a different amount of arguments?

   'git merge-tree' [--write-tree] <branch1> <branch2>
   'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>

That way, the old mode can still function, and can even at some stage be
deprecated and eventually removed.

>
>  DESCRIPTION
>  -----------
> -Reads three tree-ish, and output trivial merge results and
> -conflicting stages to the standard output.  This is similar to
> -what three-way 'git read-tree -m' does, but instead of storing the
> -results in the index, the command outputs the entries to the
> -standard output.
> +Performs a merge, but does not make any new commits and does not read
> +from or write to either the working tree or index.
>
> -This is meant to be used by higher level scripts to compute
> -merge results outside of the index, and stuff the results back into the
> -index.  For this reason, the output from the command omits
> -entries that match the <branch1> tree.
> +The first form will merge the two branches, doing a full recursive
> +merge with rename detection.  If the merge is clean, the exit status
> +will be `0`, and if the merge has conflicts, the exit status will be
> +`1`.  The output will consist solely of the resulting toplevel tree
> +(which may have files including conflict markers).
> +
> +The second form is meant for backward compatibility and will only do a
> +trival merge.  It reads three tree-ish, and outputs trivial merge
> +results and conflicting stages to the standard output in a semi-diff
> +format.  Since this was designed for higher level scripts to consume
> +and merge the results back into the index, it omits entries that match
> +<branch1>.  The result of this second form is is similar to what
> +three-way 'git read-tree -m' does, but instead of storing the results
> +in the index, the command outputs the entries to the standard output.
>
>  GIT
>  ---
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index e1d2832c809..ac50f3d108b 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -2,6 +2,9 @@
>  #include "builtin.h"
>  #include "tree-walk.h"
>  #include "xdiff-interface.h"
> +#include "help.h"
> +#include "commit-reach.h"
> +#include "merge-ort.h"
>  #include "object-store.h"
>  #include "parse-options.h"
>  #include "repository.h"
> @@ -392,7 +395,57 @@ struct merge_tree_options {
>  static int real_merge(struct merge_tree_options *o,
>  		      const char *branch1, const char *branch2)
>  {
> -	die(_("real merges are not yet implemented"));
> +	struct commit *parent1, *parent2;
> +	struct commit_list *common;
> +	struct commit_list *merge_bases = NULL;
> +	struct commit_list *j;
> +	struct merge_options opt;
> +	struct merge_result result = { 0 };
> +
> +	parent1 = get_merge_parent(branch1);
> +	if (!parent1)
> +		help_unknown_ref(branch1, "merge",
> +				 _("not something we can merge"));
> +
> +	parent2 = get_merge_parent(branch2);
> +	if (!parent2)
> +		help_unknown_ref(branch2, "merge",
> +				 _("not something we can merge"));
> +
> +	init_merge_options(&opt, the_repository);
> +	/*
> +	 * TODO: Support subtree and other -X options?
> +	if (use_strategies_nr == 1 &&
> +	    !strcmp(use_strategies[0]->name, "subtree"))
> +		opt.subtree_shift = "";
> +	for (x = 0; x < xopts_nr; x++)
> +		if (parse_merge_opt(&opt, xopts[x]))
> +			die(_("Unknown strategy option: -X%s"), xopts[x]);
> +	*/
> +
> +	opt.show_rename_progress = 0;
> +
> +	opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */
> +	opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */
> +
> +	/*
> +	 * Get the merge bases, in reverse order; see comment above
> +	 * merge_incore_recursive in merge-ort.h
> +	 */
> +	common = get_merge_bases(parent1, parent2);
> +	for (j = common; j; j = j->next)
> +		commit_list_insert(j->item, &merge_bases);
> +
> +	/*
> +	 * TODO: notify if merging unrelated histories?

I guess that it would make most sense to add a flag whether this is
allowed or not, and I would suggest the default to be `off`.

> +	if (!common)
> +		fprintf(stderr, _("merging unrelated histories"));
> +	 */
> +
> +	merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> +	printf("%s\n", oid_to_hex(&result.tree->object.oid));
> +	merge_switch_to_result(&opt, NULL, &result, 0, 0);

This looks to be idempotent to `merge_finalize(&opt, &result)`, so maybe
use that instead?

> +	return result.clean ? 0 : 1;
>  }
>
>  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> new file mode 100755
> index 00000000000..f7aa310f8c1
> --- /dev/null
> +++ b/t/t4301-merge-tree-real.sh
> @@ -0,0 +1,81 @@
> +#!/bin/sh
> +
> +test_description='git merge-tree --real'
> +
> +. ./test-lib.sh
> +
> +# This test is ort-specific
> +GIT_TEST_MERGE_ALGORITHM=ort
> +export GIT_TEST_MERGE_ALGORITHM

It might make sense to skip the entire test if the user asked for
`recursive` to be tested:

	test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort ||
		skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
		test_done
	}

> +
> +test_expect_success setup '
> +	test_write_lines 1 2 3 4 5 >numbers &&
> +	echo hello >greeting &&
> +	echo foo >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m initial &&

I would really like to encourage the use of `test_tick`. It makes the
commit consistent, just in case you run into an issue that depends on some
hash order.

> +
> +	git branch side1 &&
> +	git branch side2 &&
> +
> +	git checkout side1 &&

Please use `git switch -c side1` or `git checkout -b side1`: it is more
compact than `git branch ... && git checkout ...`.

> +	test_write_lines 1 2 3 4 5 6 >numbers &&
> +	echo hi >greeting &&
> +	echo bar >whatever &&
> +	git add numbers greeting whatever &&
> +	git commit -m modify-stuff &&
> +
> +	git checkout side2 &&

This could be written as `git checkout -b side2 HEAD^`, to make the setup
more succinct.

> +	test_write_lines 0 1 2 3 4 5 >numbers &&
> +	echo yo >greeting &&
> +	git rm whatever &&
> +	mkdir whatever &&
> +	>whatever/empty &&
> +	git add numbers greeting whatever/empty &&
> +	git commit -m other-modifications
> +'
> +
> +test_expect_success 'Content merge and a few conflicts' '
> +	git checkout side1^0 &&
> +	test_must_fail git merge side2 &&
> +	cp .git/AUTO_MERGE EXPECT &&
> +	E_TREE=$(cat EXPECT) &&

The file `EXPECT` is not used below. And can we use a more obvious name?
SOmething like:

	expected_tree=$(cat .git/AUTO_MERGE)

> +	git reset --hard &&

For an extra bonus, we could delay this via `test_when_finished`, to prove
that `git merge-tree --real` works even in a dirty worktree _with
conflicts_.

> +	test_must_fail git merge-tree --real side1 side2 >RESULT &&
> +	R_TREE=$(cat RESULT) &&

How about `actual_tree` instead?

> +
> +	# Due to differences of e.g. "HEAD" vs "side1", the results will not
> +	# exactly match.  Dig into individual files.
> +
> +	# Numbers should have three-way merged cleanly
> +	test_write_lines 0 1 2 3 4 5 6 >expect &&
> +	git show ${R_TREE}:numbers >actual &&
> +	test_cmp expect actual &&
> +
> +	# whatever and whatever~<branch> should have same HASHES
> +	git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect &&
> +	git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual &&
> +	test_cmp expect actual &&
> +
> +	# greeting should have a merge conflict
> +	git show ${E_TREE}:greeting >tmp &&
> +	cat tmp | sed -e s/HEAD/side1/ >expect &&
> +	git show ${R_TREE}:greeting >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'Barf on misspelled option' '
> +	# Mis-spell with single "s" instead of double "s"
> +	test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> +
> +	grep "error: unknown option.*mesages" expect
> +'

I do not think that this test case adds much, and we already test the
`parse_options()` machinery elsewhere.

> +
> +test_expect_success 'Barf on too many arguments' '
> +	test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect &&
> +
> +	grep "^usage: git merge-tree" expect
> +'
> +
> +test_done

The rest looks awesome. Thank you for working on it! I will definitely
come back to review the rest (have to take a break now), and then probably
add quite a bit of food for thought based on my experience _actually_
using `merge-ort` on the server-side. Stay tuned.

Thank you,
Dscho
Christian Couder Jan. 7, 2022, 6:12 p.m. UTC | #2
On Wed, Jan 5, 2022 at 6:27 PM Elijah Newren via GitGitGadget
<gitgitgadget@gmail.com> wrote:

> The only output is:
>   - the toplevel resulting tree printed on stdout
>   - exit status of 0 (clean) or 1 (conflicts present)

I thought that the merge-ort API could (at least theoretically
according to merge-ort.h) return something < 0 in case of internal
error. In this case I would be interested in knowing what's the output
of the command.

> +The first form will merge the two branches, doing a full recursive
> +merge with rename detection.  If the merge is clean, the exit status
> +will be `0`, and if the merge has conflicts, the exit status will be
> +`1`.

No mention of what happens in case of an internal error in the merge-ort API.

> +       merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> +       printf("%s\n", oid_to_hex(&result.tree->object.oid));
> +       merge_switch_to_result(&opt, NULL, &result, 0, 0);
> +       return result.clean ? 0 : 1;

If result.clean can be < 0, this might pretend that the merge was clean.
Johannes Schindelin Jan. 7, 2022, 6:22 p.m. UTC | #3
Hi Elijah,


On Fri, 7 Jan 2022, Elijah Newren wrote:

> On Fri, Jan 7, 2022 at 7:30 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:
> >
> > >  SYNOPSIS
> > >  --------
> > >  [verse]
> > > +'git merge-tree' --real <branch1> <branch2>
> > >  'git merge-tree' <base-tree> <branch1> <branch2>
> >
> > Here is an idea: How about aiming for this synopsis instead, exploiting
> > the fact that the "real" mode takes a different amount of arguments?
>
> My turn on the grammar thing: s/amount/number/.   :-)

See? I know why I'm refraining from nitpicking. It's just not good for
anyone involved.

> > > diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> > > new file mode 100755
> > > index 00000000000..f7aa310f8c1
> > > --- /dev/null
> > > +++ b/t/t4301-merge-tree-real.sh
> > > @@ -0,0 +1,81 @@
> > > +#!/bin/sh
> > > +
> > > +test_description='git merge-tree --real'
> > > +
> > > +. ./test-lib.sh
> > > +
> > > +# This test is ort-specific
> > > +GIT_TEST_MERGE_ALGORITHM=ort
> > > +export GIT_TEST_MERGE_ALGORITHM
> >
> > It might make sense to skip the entire test if the user asked for
> > `recursive` to be tested:
> >
> >         test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort ||
> >                 skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
> >                 test_done
> >         }
>
> The idea makes sense, but it took me a bit to understand this code
> block.  I think you're just missing an opening left curly brace right
> after the '||'?

Yes. Sorry.

> > > +test_expect_success setup '
> > > +     test_write_lines 1 2 3 4 5 >numbers &&
> > > +     echo hello >greeting &&
> > > +     echo foo >whatever &&
> > > +     git add numbers greeting whatever &&
> > > +     git commit -m initial &&
> >
> > I would really like to encourage the use of `test_tick`. It makes the
> > commit consistent, just in case you run into an issue that depends on some
> > hash order.
>
> I've used test_tick before, but I already know this test can't depend
> on hash order.  Further, the hashes in the output are also replaced
> before comparing in order to make the tests also work as-is under
> sha256.  So the tests are explicitly ignoring precise hashes.  As
> such, I'm not sure I see the value of test_tick here.

Nevertheless. To make comparing logs of two different test runs easier, it
makes more sense to insist on consistency.

> > > +
> > > +     git branch side1 &&
> > > +     git branch side2 &&
> > > +
> > > +     git checkout side1 &&
> >
> > Please use `git switch -c side1` or `git checkout -b side1`: it is more
> > compact than `git branch ... && git checkout ...`.
>
> Yes, but less forgiving to later modification where I go and add
> additional commits on one of the sides, because...
>
> >
> > > +     test_write_lines 1 2 3 4 5 6 >numbers &&
> > > +     echo hi >greeting &&
> > > +     echo bar >whatever &&
> > > +     git add numbers greeting whatever &&
> > > +     git commit -m modify-stuff &&
> > > +
> > > +     git checkout side2 &&
> >
> > This could be written as `git checkout -b side2 HEAD^`, to make the setup
> > more succinct.
>
> ...the presumption of HEAD^ is hardcoded and has to be parsed by
> readers to understand the test.  It felt like more cognitive overhead
> to me, in addition to being less malleable.

Right. Different developers, different preferences. I wish we had a
standard way in the test suite to initialize a test setup that _everybody_
could agree to be succinct and helpful. So far, we use shell scripted Git
commands to recreate an initial commit topology, but especially when
comparing to existing test suites with fixtures that are not only
well-documented but also easy to wrap your head around, I find Git's test
suite awfully lacking. Mind you, the code _I_ introduced isn't stellar in
this respect, either, not by a very far stretch.

> > > +test_expect_success 'Barf on misspelled option' '
> > > +     # Mis-spell with single "s" instead of double "s"
> > > +     test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> > > +
> > > +     grep "error: unknown option.*mesages" expect
> > > +'
> >
> > I do not think that this test case adds much, and we already test the
> > `parse_options()` machinery elsewhere.
>
> It's more about verifying that exit codes of 0 & 1 are reserved for
> "completed with no conflicts" and "completed with conflicts".  The 129
> bit in this test is the important bit (and perhaps is well-known to
> lots of other folks, but I thought it was worth highlighting).

Fair enough.

Ciao,
Dscho
Elijah Newren Jan. 7, 2022, 7:09 p.m. UTC | #4
On Fri, Jan 7, 2022 at 10:12 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Wed, Jan 5, 2022 at 6:27 PM Elijah Newren via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>
> > The only output is:
> >   - the toplevel resulting tree printed on stdout
> >   - exit status of 0 (clean) or 1 (conflicts present)
>
> I thought that the merge-ort API could (at least theoretically
> according to merge-ort.h) return something < 0 in case of internal
> error. In this case I would be interested in knowing what's the output
> of the command.
>
> > +The first form will merge the two branches, doing a full recursive
> > +merge with rename detection.  If the merge is clean, the exit status
> > +will be `0`, and if the merge has conflicts, the exit status will be
> > +`1`.
>
> No mention of what happens in case of an internal error in the merge-ort API.
>
> > +       merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
> > +       printf("%s\n", oid_to_hex(&result.tree->object.oid));
> > +       merge_switch_to_result(&opt, NULL, &result, 0, 0);
> > +       return result.clean ? 0 : 1;
>
> If result.clean can be < 0, this might pretend that the merge was clean.

Ooh, these are very good points.  Thanks for bringing them up; I'll
try to address them in a re-roll.
Elijah Newren Jan. 7, 2022, 7:15 p.m. UTC | #5
Hi Dscho,

On Fri, Jan 7, 2022 at 10:23 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Fri, 7 Jan 2022, Elijah Newren wrote:
>
> > On Fri, Jan 7, 2022 at 7:30 AM Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> > >
> > > On Wed, 5 Jan 2022, Elijah Newren via GitGitGadget wrote:
> > >
> > > >  SYNOPSIS
> > > >  --------
> > > >  [verse]
> > > > +'git merge-tree' --real <branch1> <branch2>
> > > >  'git merge-tree' <base-tree> <branch1> <branch2>
> > >
> > > Here is an idea: How about aiming for this synopsis instead, exploiting
> > > the fact that the "real" mode takes a different amount of arguments?
> >
> > My turn on the grammar thing: s/amount/number/.   :-)
>
> See? I know why I'm refraining from nitpicking. It's just not good for
> anyone involved.

Well, in your case, the point you brought up will improve the commit
message for future readers, and so it was totally justified (and I'm
glad you brought it up).  My comment is useful for nothing more than a
bit of good-natured ribbing.  But I'm not sure it was taken that way,
so I'm sorry if my comment had any effect other than making you smile.

> > > > diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
> > > > new file mode 100755
> > > > index 00000000000..f7aa310f8c1
> > > > --- /dev/null
> > > > +++ b/t/t4301-merge-tree-real.sh
> > > > @@ -0,0 +1,81 @@
> > > > +#!/bin/sh
> > > > +
> > > > +test_description='git merge-tree --real'
> > > > +
> > > > +. ./test-lib.sh
> > > > +
> > > > +# This test is ort-specific
> > > > +GIT_TEST_MERGE_ALGORITHM=ort
> > > > +export GIT_TEST_MERGE_ALGORITHM
> > >
> > > It might make sense to skip the entire test if the user asked for
> > > `recursive` to be tested:
> > >
> > >         test "${GIT_TEST_MERGE_ALGORITHM:-ort}" = ort ||
> > >                 skip_all="GIT_TEST_MERGE_ALGORITHM != ort"
> > >                 test_done
> > >         }
> >
> > The idea makes sense, but it took me a bit to understand this code
> > block.  I think you're just missing an opening left curly brace right
> > after the '||'?
>
> Yes. Sorry.
>
> > > > +test_expect_success setup '
> > > > +     test_write_lines 1 2 3 4 5 >numbers &&
> > > > +     echo hello >greeting &&
> > > > +     echo foo >whatever &&
> > > > +     git add numbers greeting whatever &&
> > > > +     git commit -m initial &&
> > >
> > > I would really like to encourage the use of `test_tick`. It makes the
> > > commit consistent, just in case you run into an issue that depends on some
> > > hash order.
> >
> > I've used test_tick before, but I already know this test can't depend
> > on hash order.  Further, the hashes in the output are also replaced
> > before comparing in order to make the tests also work as-is under
> > sha256.  So the tests are explicitly ignoring precise hashes.  As
> > such, I'm not sure I see the value of test_tick here.
>
> Nevertheless. To make comparing logs of two different test runs easier, it
> makes more sense to insist on consistency.

Ah...comparing logs between two different test runs; that sounds like
a reasonable justification.  I'll add the test_tick's.

> > > > +
> > > > +     git branch side1 &&
> > > > +     git branch side2 &&
> > > > +
> > > > +     git checkout side1 &&
> > >
> > > Please use `git switch -c side1` or `git checkout -b side1`: it is more
> > > compact than `git branch ... && git checkout ...`.
> >
> > Yes, but less forgiving to later modification where I go and add
> > additional commits on one of the sides, because...
> >
> > >
> > > > +     test_write_lines 1 2 3 4 5 6 >numbers &&
> > > > +     echo hi >greeting &&
> > > > +     echo bar >whatever &&
> > > > +     git add numbers greeting whatever &&
> > > > +     git commit -m modify-stuff &&
> > > > +
> > > > +     git checkout side2 &&
> > >
> > > This could be written as `git checkout -b side2 HEAD^`, to make the setup
> > > more succinct.
> >
> > ...the presumption of HEAD^ is hardcoded and has to be parsed by
> > readers to understand the test.  It felt like more cognitive overhead
> > to me, in addition to being less malleable.
>
> Right. Different developers, different preferences. I wish we had a
> standard way in the test suite to initialize a test setup that _everybody_
> could agree to be succinct and helpful. So far, we use shell scripted Git
> commands to recreate an initial commit topology, but especially when
> comparing to existing test suites with fixtures that are not only
> well-documented but also easy to wrap your head around, I find Git's test
> suite awfully lacking. Mind you, the code _I_ introduced isn't stellar in
> this respect, either, not by a very far stretch.
>
> > > > +test_expect_success 'Barf on misspelled option' '
> > > > +     # Mis-spell with single "s" instead of double "s"
> > > > +     test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
> > > > +
> > > > +     grep "error: unknown option.*mesages" expect
> > > > +'
> > >
> > > I do not think that this test case adds much, and we already test the
> > > `parse_options()` machinery elsewhere.
> >
> > It's more about verifying that exit codes of 0 & 1 are reserved for
> > "completed with no conflicts" and "completed with conflicts".  The 129
> > bit in this test is the important bit (and perhaps is well-known to
> > lots of other folks, but I thought it was worth highlighting).
>
> Fair enough.
>
> Ciao,
> Dscho
Junio C Hamano Jan. 7, 2022, 8:56 p.m. UTC | #6
Elijah Newren <newren@gmail.com> writes:

>>    'git merge-tree' [--write-tree] <branch1> <branch2>
>>    'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>
>>
>> That way, the old mode can still function, and can even at some stage be
>> deprecated and eventually removed.
>
> Ooh, interesting.

I wondered if we can _also_ extend the trivial-merge mode so that we
do not have to call it "demo".

The internal result is expressed in this way:

    struct merge_list {
            struct merge_list *next;
            struct merge_list *link;	/* other stages for this object */

            unsigned int stage : 2;
            unsigned int mode;
            const char *path;
            struct blob *blob;
    };

because the command was not designed to resolve content level
merges, but show the half-resolved state with the "stage" number.
The "explanation" the command gives on the result is truly trivial,
but there is no reason for it to stay that way.
Johannes Schindelin Jan. 11, 2022, 1:39 p.m. UTC | #7
Hi Junio,

On Fri, 7 Jan 2022, Junio C Hamano wrote:

> Elijah Newren <newren@gmail.com> writes:
>
> >>    'git merge-tree' [--write-tree] <branch1> <branch2>
> >>    'git merge-tree' [--demo-trivial-merge] <base-tree> <branch1> <branch2>
> >>
> >> That way, the old mode can still function, and can even at some stage be
> >> deprecated and eventually removed.
> >
> > Ooh, interesting.
>
> I wondered if we can _also_ extend the trivial-merge mode so that we
> do not have to call it "demo".
>
> The internal result is expressed in this way:
>
>     struct merge_list {
>             struct merge_list *next;
>             struct merge_list *link;	/* other stages for this object */
>
>             unsigned int stage : 2;
>             unsigned int mode;
>             const char *path;
>             struct blob *blob;
>     };
>
> because the command was not designed to resolve content level
> merges, but show the half-resolved state with the "stage" number.
> The "explanation" the command gives on the result is truly trivial,
> but there is no reason for it to stay that way.

The original `merge-tree` code outputs a diff, which I think has now been
firmly established as something a low-level merge tool should not do at
all.

So I am not sure how necessary it is to maintain the original UI. I don't
think it is a good UI. In fact, I am rather certain that we will want to
get rid of it.

We can keep it for backwards-compatibility for now, keeping it working for
existing users (if any!) by that 3-arg vs 2-arg trick, eventually
deprecate and then remove it.

Ciao,
Dscho
diff mbox series

Patch

diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
index 58731c19422..5823938937f 100644
--- a/Documentation/git-merge-tree.txt
+++ b/Documentation/git-merge-tree.txt
@@ -3,26 +3,34 @@  git-merge-tree(1)
 
 NAME
 ----
-git-merge-tree - Show three-way merge without touching index
+git-merge-tree - Perform merge without touching index or working tree
 
 
 SYNOPSIS
 --------
 [verse]
+'git merge-tree' --real <branch1> <branch2>
 'git merge-tree' <base-tree> <branch1> <branch2>
 
 DESCRIPTION
 -----------
-Reads three tree-ish, and output trivial merge results and
-conflicting stages to the standard output.  This is similar to
-what three-way 'git read-tree -m' does, but instead of storing the
-results in the index, the command outputs the entries to the
-standard output.
+Performs a merge, but does not make any new commits and does not read
+from or write to either the working tree or index.
 
-This is meant to be used by higher level scripts to compute
-merge results outside of the index, and stuff the results back into the
-index.  For this reason, the output from the command omits
-entries that match the <branch1> tree.
+The first form will merge the two branches, doing a full recursive
+merge with rename detection.  If the merge is clean, the exit status
+will be `0`, and if the merge has conflicts, the exit status will be
+`1`.  The output will consist solely of the resulting toplevel tree
+(which may have files including conflict markers).
+
+The second form is meant for backward compatibility and will only do a
+trival merge.  It reads three tree-ish, and outputs trivial merge
+results and conflicting stages to the standard output in a semi-diff
+format.  Since this was designed for higher level scripts to consume
+and merge the results back into the index, it omits entries that match
+<branch1>.  The result of this second form is is similar to what
+three-way 'git read-tree -m' does, but instead of storing the results
+in the index, the command outputs the entries to the standard output.
 
 GIT
 ---
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index e1d2832c809..ac50f3d108b 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -2,6 +2,9 @@ 
 #include "builtin.h"
 #include "tree-walk.h"
 #include "xdiff-interface.h"
+#include "help.h"
+#include "commit-reach.h"
+#include "merge-ort.h"
 #include "object-store.h"
 #include "parse-options.h"
 #include "repository.h"
@@ -392,7 +395,57 @@  struct merge_tree_options {
 static int real_merge(struct merge_tree_options *o,
 		      const char *branch1, const char *branch2)
 {
-	die(_("real merges are not yet implemented"));
+	struct commit *parent1, *parent2;
+	struct commit_list *common;
+	struct commit_list *merge_bases = NULL;
+	struct commit_list *j;
+	struct merge_options opt;
+	struct merge_result result = { 0 };
+
+	parent1 = get_merge_parent(branch1);
+	if (!parent1)
+		help_unknown_ref(branch1, "merge",
+				 _("not something we can merge"));
+
+	parent2 = get_merge_parent(branch2);
+	if (!parent2)
+		help_unknown_ref(branch2, "merge",
+				 _("not something we can merge"));
+
+	init_merge_options(&opt, the_repository);
+	/*
+	 * TODO: Support subtree and other -X options?
+	if (use_strategies_nr == 1 &&
+	    !strcmp(use_strategies[0]->name, "subtree"))
+		opt.subtree_shift = "";
+	for (x = 0; x < xopts_nr; x++)
+		if (parse_merge_opt(&opt, xopts[x]))
+			die(_("Unknown strategy option: -X%s"), xopts[x]);
+	*/
+
+	opt.show_rename_progress = 0;
+
+	opt.branch1 = merge_remote_util(parent1)->name; /* or just branch1? */
+	opt.branch2 = merge_remote_util(parent2)->name; /* or just branch2? */
+
+	/*
+	 * Get the merge bases, in reverse order; see comment above
+	 * merge_incore_recursive in merge-ort.h
+	 */
+	common = get_merge_bases(parent1, parent2);
+	for (j = common; j; j = j->next)
+		commit_list_insert(j->item, &merge_bases);
+
+	/*
+	 * TODO: notify if merging unrelated histories?
+	if (!common)
+		fprintf(stderr, _("merging unrelated histories"));
+	 */
+
+	merge_incore_recursive(&opt, merge_bases, parent1, parent2, &result);
+	printf("%s\n", oid_to_hex(&result.tree->object.oid));
+	merge_switch_to_result(&opt, NULL, &result, 0, 0);
+	return result.clean ? 0 : 1;
 }
 
 int cmd_merge_tree(int argc, const char **argv, const char *prefix)
diff --git a/t/t4301-merge-tree-real.sh b/t/t4301-merge-tree-real.sh
new file mode 100755
index 00000000000..f7aa310f8c1
--- /dev/null
+++ b/t/t4301-merge-tree-real.sh
@@ -0,0 +1,81 @@ 
+#!/bin/sh
+
+test_description='git merge-tree --real'
+
+. ./test-lib.sh
+
+# This test is ort-specific
+GIT_TEST_MERGE_ALGORITHM=ort
+export GIT_TEST_MERGE_ALGORITHM
+
+test_expect_success setup '
+	test_write_lines 1 2 3 4 5 >numbers &&
+	echo hello >greeting &&
+	echo foo >whatever &&
+	git add numbers greeting whatever &&
+	git commit -m initial &&
+
+	git branch side1 &&
+	git branch side2 &&
+
+	git checkout side1 &&
+	test_write_lines 1 2 3 4 5 6 >numbers &&
+	echo hi >greeting &&
+	echo bar >whatever &&
+	git add numbers greeting whatever &&
+	git commit -m modify-stuff &&
+
+	git checkout side2 &&
+	test_write_lines 0 1 2 3 4 5 >numbers &&
+	echo yo >greeting &&
+	git rm whatever &&
+	mkdir whatever &&
+	>whatever/empty &&
+	git add numbers greeting whatever/empty &&
+	git commit -m other-modifications
+'
+
+test_expect_success 'Content merge and a few conflicts' '
+	git checkout side1^0 &&
+	test_must_fail git merge side2 &&
+	cp .git/AUTO_MERGE EXPECT &&
+	E_TREE=$(cat EXPECT) &&
+
+	git reset --hard &&
+	test_must_fail git merge-tree --real side1 side2 >RESULT &&
+	R_TREE=$(cat RESULT) &&
+
+	# Due to differences of e.g. "HEAD" vs "side1", the results will not
+	# exactly match.  Dig into individual files.
+
+	# Numbers should have three-way merged cleanly
+	test_write_lines 0 1 2 3 4 5 6 >expect &&
+	git show ${R_TREE}:numbers >actual &&
+	test_cmp expect actual &&
+
+	# whatever and whatever~<branch> should have same HASHES
+	git rev-parse ${E_TREE}:whatever ${E_TREE}:whatever~HEAD >expect &&
+	git rev-parse ${R_TREE}:whatever ${R_TREE}:whatever~side1 >actual &&
+	test_cmp expect actual &&
+
+	# greeting should have a merge conflict
+	git show ${E_TREE}:greeting >tmp &&
+	cat tmp | sed -e s/HEAD/side1/ >expect &&
+	git show ${R_TREE}:greeting >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'Barf on misspelled option' '
+	# Mis-spell with single "s" instead of double "s"
+	test_expect_code 129 git merge-tree --real --mesages FOOBAR side1 side2 2>expect &&
+
+	grep "error: unknown option.*mesages" expect
+'
+
+test_expect_success 'Barf on too many arguments' '
+	test_expect_code 129 git merge-tree --real side1 side2 side3 2>expect &&
+
+	grep "^usage: git merge-tree" expect
+'
+
+test_done