diff mbox series

merge-tree: accept 3 trees as arguments

Message ID pull.1647.git.1706277694231.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series merge-tree: accept 3 trees as arguments | expand

Commit Message

Johannes Schindelin Jan. 26, 2024, 2:01 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

When specifying a merge base explicitly, there is actually no good
reason why the inputs need to be commits: that's only needed if the
merge base has to be deduced from the commit graph.

This commit is best viewed with `--color-moved
--color-moved-ws=allow-indentation-change`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    merge-tree: accept 3 trees as arguments
    
    I was asked to implement this at $dayjob and it seems like a feature
    that might be useful to other users, too.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1647%2Fdscho%2Fallow-merge-tree-to-accept-3-trees-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1647/dscho/allow-merge-tree-to-accept-3-trees-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1647

 Documentation/git-merge-tree.txt |  5 +++-
 builtin/merge-tree.c             | 42 +++++++++++++++++++-------------
 t/t4301-merge-tree-write-tree.sh |  8 ++++++
 3 files changed, 37 insertions(+), 18 deletions(-)


base-commit: e02ecfcc534e2021aae29077a958dd11c3897e4c

Comments

Eric Sunshine Jan. 26, 2024, 2:15 p.m. UTC | #1
On Fri, Jan 26, 2024 at 9:01 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> When specifying a merge base explicitly, there is actually no good
> reason why the inputs need to be commits: that's only needed if the
> merge base has to be deduced from the commit graph.
>
> This commit is best viewed with `--color-moved
> --color-moved-ws=allow-indentation-change`.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> @@ -64,10 +64,13 @@ OPTIONS
> +--merge-base=<tree-ish>::
>         Instead of finding the merge-bases for <branch1> and <branch2>,
>         specify a merge-base for the merge, and specifying multiple bases is
>         currently not supported. This option is incompatible with `--stdin`.
> ++
> +As the merge-base is provided directly, <branch1> and <branch2> do not need
> +o specify commits; it is sufficient if they specify trees.

presumably: s/o/to/
Junio C Hamano Jan. 26, 2024, 4:01 p.m. UTC | #2
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When specifying a merge base explicitly, there is actually no good
> reason why the inputs need to be commits: that's only needed if the
> merge base has to be deduced from the commit graph.

yes, Yes, YES, YEAHHHHH!

>     I was asked to implement this at $dayjob and it seems like a feature
>     that might be useful to other users, too.

Yup, I think it is an obvious building block for machinery to
perform any mergy operation to grow history.  Many of the time you
may have a commit, but requiring them to be commits when you know
you will not do a virtual ancestor synthesis smells fundamentally
wrong.  Thanks for fixing it.

> ---merge-base=<commit>::
> +--merge-base=<tree-ish>::
>  	Instead of finding the merge-bases for <branch1> and <branch2>,
>  	specify a merge-base for the merge, and specifying multiple bases is
>  	currently not supported. This option is incompatible with `--stdin`.
> ++
> +As the merge-base is provided directly, <branch1> and <branch2> do not need
> +o specify commits; it is sufficient if they specify trees.

"... do not need to specify commits; trees are enough"?

> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index 3bdec53fbe5..cbd8e15af6d 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -429,35 +429,43 @@ static int real_merge(struct merge_tree_options *o,
>  	struct merge_options opt;
>  
>  	copy_merge_options(&opt, &o->merge_options);
> -	parent1 = get_merge_parent(branch1);
> -	if (!parent1)
> -		help_unknown_ref(branch1, "merge-tree",
> -				 _("not something we can merge"));
> -
> -	parent2 = get_merge_parent(branch2);
> -	if (!parent2)
> -		help_unknown_ref(branch2, "merge-tree",
> -				 _("not something we can merge"));
> -
>  	opt.show_rename_progress = 0;
>  
>  	opt.branch1 = branch1;
>  	opt.branch2 = branch2;
>  
>  	if (merge_base) {
> -		struct commit *base_commit;
>  		struct tree *base_tree, *parent1_tree, *parent2_tree;
>  
> -		base_commit = lookup_commit_reference_by_name(merge_base);
> -		if (!base_commit)
> -			die(_("could not lookup commit '%s'"), merge_base);
> +		/*
> +		 * We actually only need the trees because we already
> +		 * have a merge base.
> +		 */
> +		struct object_id base_oid, head_oid, merge_oid;
> +
> +		if (repo_get_oid_treeish(the_repository, merge_base, &base_oid))
> +			die(_("could not parse as tree '%s'"), merge_base);
> +		base_tree = parse_tree_indirect(&base_oid);
> +		if (repo_get_oid_treeish(the_repository, branch1, &head_oid))
> +			die(_("could not parse as tree '%s'"), branch1);
> +		parent1_tree = parse_tree_indirect(&head_oid);
> +		if (repo_get_oid_treeish(the_repository, branch2, &merge_oid))
> +			die(_("could not parse as tree '%s'"), branch2);
> +		parent2_tree = parse_tree_indirect(&merge_oid);
>  
>  		opt.ancestor = merge_base;
> -		base_tree = repo_get_commit_tree(the_repository, base_commit);
> -		parent1_tree = repo_get_commit_tree(the_repository, parent1);
> -		parent2_tree = repo_get_commit_tree(the_repository, parent2);
>  		merge_incore_nonrecursive(&opt, base_tree, parent1_tree, parent2_tree, &result);
>  	} else {
> +		parent1 = get_merge_parent(branch1);
> +		if (!parent1)
> +			help_unknown_ref(branch1, "merge-tree",
> +					 _("not something we can merge"));
> +
> +		parent2 = get_merge_parent(branch2);
> +		if (!parent2)
> +			help_unknown_ref(branch2, "merge-tree",
> +					 _("not something we can merge"));
> +
>  		/*
>  		 * Get the merge bases, in reverse order; see comment above
>  		 * merge_incore_recursive in merge-ort.h

OK.  The changes here look quite straight-forward.

> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 12ac4368736..71f21bb834f 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -945,4 +945,12 @@ test_expect_success 'check the input format when --stdin is passed' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success '--merge-base with tree OIDs' '
> +	git merge-tree --merge-base=side1^ side1 side3 >tree &&
> +	tree=$(cat tree) &&
> +	git merge-tree --merge-base=side1^^{tree} side1^{tree} side3^{tree} >tree2 &&
> +	tree2=$(cat tree2) &&
> +	test $tree = $tree2
> +'

You do not need $tree and $tree2 variables that would make it harder
to diagnose a failure case when we break merge-tree.  You have tree
and tree2 as files, and I think it is sufficient to do

	git merge-tree ... >result-from-commits &&
	git merge-tree ... >result-from-trees &&
	test_cmp result-from-commits result-from-trees

(no, I am not suggesting to rename these tree and tree2 files; I
just needed them to be descriptive in my illustration to show what
is going on to myself).

Thanks.
Elijah Newren Jan. 30, 2024, 7:04 a.m. UTC | #3
On Fri, Jan 26, 2024 at 6:18 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When specifying a merge base explicitly, there is actually no good
> reason why the inputs need to be commits: that's only needed if the
> merge base has to be deduced from the commit graph.
>
> This commit is best viewed with `--color-moved
> --color-moved-ws=allow-indentation-change`.

Makes sense.

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     merge-tree: accept 3 trees as arguments
>
>     I was asked to implement this at $dayjob and it seems like a feature
>     that might be useful to other users, too.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1647%2Fdscho%2Fallow-merge-tree-to-accept-3-trees-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1647/dscho/allow-merge-tree-to-accept-3-trees-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1647
>
>  Documentation/git-merge-tree.txt |  5 +++-
>  builtin/merge-tree.c             | 42 +++++++++++++++++++-------------
>  t/t4301-merge-tree-write-tree.sh |  8 ++++++
>  3 files changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index b50acace3bc..214e30c70ba 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -64,10 +64,13 @@ OPTIONS
>         share no common history.  This flag can be given to override that
>         check and make the merge proceed anyway.
>
> ---merge-base=<commit>::
> +--merge-base=<tree-ish>::

A very minor point, but any chance we can just use `<tree>`, like
git-restore does?  I've never liked the '-ish' that we use as it seems
to throw users, and I think they understand that they can use a commit
or a tag where a tree is expected

>         Instead of finding the merge-bases for <branch1> and <branch2>,
>         specify a merge-base for the merge, and specifying multiple bases is
>         currently not supported. This option is incompatible with `--stdin`.
> ++
> +As the merge-base is provided directly, <branch1> and <branch2> do not need
> +o specify commits; it is sufficient if they specify trees.
>
>  [[OUTPUT]]
>  OUTPUT
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index 3bdec53fbe5..cbd8e15af6d 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -429,35 +429,43 @@ static int real_merge(struct merge_tree_options *o,
>         struct merge_options opt;
>
>         copy_merge_options(&opt, &o->merge_options);
> -       parent1 = get_merge_parent(branch1);
> -       if (!parent1)
> -               help_unknown_ref(branch1, "merge-tree",
> -                                _("not something we can merge"));
> -
> -       parent2 = get_merge_parent(branch2);
> -       if (!parent2)
> -               help_unknown_ref(branch2, "merge-tree",
> -                                _("not something we can merge"));
> -
>         opt.show_rename_progress = 0;
>
>         opt.branch1 = branch1;
>         opt.branch2 = branch2;

If branch1 and branch2 refer to trees, then when users hit conflicts
they'll see e.g.

<<<<<<< aaaaaaa
  somecode();
=======
  othercode();
>>>>>>> bbbbbbb

but aaaaaaa and bbbbbbb are not commits that they can find.  They may
discover how to show the contents of the tree objects (most users I
run into seem to be unaware that hashes in git can refer to anything
other than commits), but they certainly won't get any context from
git-log as they might hope.

The other places in the code that deal directly with merging trees,
git-am and git-merge-recursive, both provide specially overridden
values for both branch1 and branch2.  (They probably ought to do
something with opt->ancestor as well, but that's been the single ugly
edge case problem that is solely responsible for merge-recursive not
having been fully replaced by merge-ort yet and I haven't wanted to go
back and mess with it.)

That raises the question: if the user passes trees in, should we
require helpful names be provided as additional parameters?  Or are
the usecases such that we don't expect callers to have any useful
information about where these trees come from and these suboptimal
conflicts are the best we can do?
Junio C Hamano Jan. 30, 2024, 5:15 p.m. UTC | #4
Elijah Newren <newren@gmail.com> writes:

>> ---merge-base=<commit>::
>> +--merge-base=<tree-ish>::
>
> A very minor point, but any chance we can just use `<tree>`, like
> git-restore does?  I've never liked the '-ish' that we use as it seems
> to throw users, and I think they understand that they can use a commit
> or a tag where a tree is expected

You are right that the "X-ish" was invented exactly so that folks
(including the nitpicky types among us) can tell if only "X" is
accepted, or other types of objects that uniquely can be peeled to a
"X" also can be used.  Many commands at the Porcelain level may take
a commit or a tag that eventually peel to a tree where at the lowest
level we only need it to be a tree, but it still matters in some
lower level corners of the system where the plumbing commands that
only accept a tree but not a tree-ish exist.  We've discussed this
in the past, e.g.:

    https://lore.kernel.org/git/7voc8ihq4a.fsf@alter.siamese.dyndns.org/

In general, I am OK to update the documentation and usage strings to
drop the "-ish" suffix when it is clear from the context.  But what
does it exactly mean that "--merge-base=<tree>" is meant to take any
tree-ish from the context of this documentation we are discussing?

 - Is it because merge-tree is a Porcelain command and we would
   adopt "Porcelains are to peel liberally what they accept, and
   when they are documented to take X they always accept any X-ish"
   rule?

 - Is it because the description that follows "--merge-base=<tree>"
   header does not mention "contrary to our usual convention, this
   option only accepts a tree and not a commit or a tag that points
   at a tree" and we would adopt "all commands and options that are
   documented to take X take X-ish, unless explicitly documented
   they only take X"?

As long as we clearly spell out such a ground rule and make sure
readers of the documentation understand it, I will not object to
such a tree-wide clean-up.  The current ground rule is "We write X
when we mean to take only X and we write X-ish otherwise", if I am
not mistaken.

>>         opt.branch1 = branch1;
>>         opt.branch2 = branch2;
>
> If branch1 and branch2 refer to trees, then when users hit conflicts
> they'll see e.g.
>
> <<<<<<< aaaaaaa
>   somecode();
> =======
>   othercode();
>>>>>>>> bbbbbbb
>
> but aaaaaaa and bbbbbbb are not commits that they can find.

Correct.  They are what they fed as two trees to be merged, aren't
they?  They (or the script that drives merge-tree and fed these two
trees) should be recognise these two trees, as long as they are told
that is what we show, no?

> That raises the question: if the user passes trees in, should we
> require helpful names be provided as additional parameters?

If the user wants to give human-readable name to override whatever
the code would internally produce, such new options may help them,
and should probably apply equally to input that happens to be a
commit (or a tag that is a tree-ish) as well, I would think.

> Or are
> the usecases such that we don't expect callers to have any useful
> information about where these trees come from and these suboptimal
> conflicts are the best we can do?

I do not necessarily think the output is "suboptimal" from the point
of view of our users, exactly because that is what they fed into the
command themselves.
Johannes Schindelin Jan. 30, 2024, 8:04 p.m. UTC | #5
Hi Elijah,

On Mon, 29 Jan 2024, Elijah Newren wrote:

> On Fri, Jan 26, 2024 at 6:18 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> >  Documentation/git-merge-tree.txt |  5 +++-
> >  builtin/merge-tree.c             | 42 +++++++++++++++++++-------------
> >  t/t4301-merge-tree-write-tree.sh |  8 ++++++
> >  3 files changed, 37 insertions(+), 18 deletions(-)
> >
> > diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> > index b50acace3bc..214e30c70ba 100644
> > --- a/Documentation/git-merge-tree.txt
> > +++ b/Documentation/git-merge-tree.txt
> > @@ -64,10 +64,13 @@ OPTIONS
> >         share no common history.  This flag can be given to override that
> >         check and make the merge proceed anyway.
> >
> > ---merge-base=<commit>::
> > +--merge-base=<tree-ish>::
>
> A very minor point, but any chance we can just use `<tree>`, like
> git-restore does?  I've never liked the '-ish' that we use as it seems
> to throw users, and I think they understand that they can use a commit
> or a tag where a tree is expected

That's funny: I asked Victoria Dye to look over the patch, and she pointed
out the exact opposite: I had written `<tree>` and she remarked that most
of Git's manual pages would call this a `<tree-ish>` :-)

On another funny note, I tried to establish the term "ent" for this category
almost 222 months ago because I also disliked the "-ish" convention:
https://lore.kernel.org/git/Pine.LNX.4.63.0508051655480.8418@wgmdd8.biozentrum.uni-wuerzburg.de/

> > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > index 3bdec53fbe5..cbd8e15af6d 100644
> > --- a/builtin/merge-tree.c
> > +++ b/builtin/merge-tree.c
> > @@ -429,35 +429,43 @@ static int real_merge(struct merge_tree_options *o,
> >         struct merge_options opt;
> >
> >         copy_merge_options(&opt, &o->merge_options);
> > -       parent1 = get_merge_parent(branch1);
> > -       if (!parent1)
> > -               help_unknown_ref(branch1, "merge-tree",
> > -                                _("not something we can merge"));
> > -
> > -       parent2 = get_merge_parent(branch2);
> > -       if (!parent2)
> > -               help_unknown_ref(branch2, "merge-tree",
> > -                                _("not something we can merge"));
> > -
> >         opt.show_rename_progress = 0;
> >
> >         opt.branch1 = branch1;
> >         opt.branch2 = branch2;
>
> If branch1 and branch2 refer to trees, then when users hit conflicts
> they'll see e.g.
>
> <<<<<<< aaaaaaa
>   somecode();
> =======
>   othercode();
> >>>>>>> bbbbbbb
>
> but aaaaaaa and bbbbbbb are not commits that they can find.

That is true. And it is not totally obvious to many users that they could
then call `git show aaaaaaa:file` to see the full pre-image on the
first-parent side.

On the other hand, the label that is shown is precisely what the user
specified on the command-line. For example:

	$ git merge-tree --merge-base=v2.42.0:t v2.43.0~11:t v2.43.0~10^2:t

will result in the following conflict markers:

	$ git show 021c3ce211:t0091-bugreport.sh
	[...]
	<<<<<<< v2.43.0~11:t
		grep usage output &&
		test_path_is_missing git-bugreport-*
	'

	test_expect_success 'incorrect positional arguments abort with usage and hint' '
		test_must_fail git bugreport false 2>output &&
		grep usage output &&
		grep false output &&
	=======
		test_grep usage output &&
	>>>>>>> v2.43.0~10^2:t
	[...]

which I personally find very pleasing output.

Besides, the manual page of `git merge-tree` says in no sugar-coated
words:

	Do NOT look through the resulting toplevel tree to try to find which
	files conflict; [...]

:-)

Ciao,
Johannes
Johannes Schindelin Jan. 30, 2024, 8:05 p.m. UTC | #6
Hi Junio,

On Fri, 26 Jan 2024, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > When specifying a merge base explicitly, there is actually no good
> > reason why the inputs need to be commits: that's only needed if the
> > merge base has to be deduced from the commit graph.
>
> yes, Yes, YES, YEAHHHHH!

:-D

> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > index 12ac4368736..71f21bb834f 100755
> > --- a/t/t4301-merge-tree-write-tree.sh
> > +++ b/t/t4301-merge-tree-write-tree.sh
> > @@ -945,4 +945,12 @@ test_expect_success 'check the input format when --stdin is passed' '
> >  	test_cmp expect actual
> >  '
> >
> > +test_expect_success '--merge-base with tree OIDs' '
> > +	git merge-tree --merge-base=side1^ side1 side3 >tree &&
> > +	tree=$(cat tree) &&
> > +	git merge-tree --merge-base=side1^^{tree} side1^{tree} side3^{tree} >tree2 &&
> > +	tree2=$(cat tree2) &&
> > +	test $tree = $tree2
> > +'
>
> You do not need $tree and $tree2 variables that would make it harder
> to diagnose a failure case when we break merge-tree.  You have tree
> and tree2 as files, and I think it is sufficient to do
>
> 	git merge-tree ... >result-from-commits &&
> 	git merge-tree ... >result-from-trees &&
> 	test_cmp result-from-commits result-from-trees

That's valuable feedback, thank you! As you saw, I changed it accordingly
in v2.

Ciao,
Johannes
Elijah Newren Jan. 31, 2024, 4:34 p.m. UTC | #7
On Tue, Jan 30, 2024 at 9:15 AM Junio C Hamano <gitster@pobox.com> wrote:
>
[...]
> >>         opt.branch1 = branch1;
> >>         opt.branch2 = branch2;
> >
> > If branch1 and branch2 refer to trees, then when users hit conflicts
> > they'll see e.g.
> >
> > <<<<<<< aaaaaaa
> >   somecode();
> > =======
> >   othercode();
> >>>>>>>> bbbbbbb
> >
> > but aaaaaaa and bbbbbbb are not commits that they can find.
>
> Correct.  They are what they fed as two trees to be merged, aren't
> they?  They (or the script that drives merge-tree and fed these two
> trees) should be recognise these two trees, as long as they are told
> that is what we show, no?
>
> > That raises the question: if the user passes trees in, should we
> > require helpful names be provided as additional parameters?
>
> If the user wants to give human-readable name to override whatever
> the code would internally produce, such new options may help them,
> and should probably apply equally to input that happens to be a
> commit (or a tag that is a tree-ish) as well, I would think.
>
> > Or are
> > the usecases such that we don't expect callers to have any useful
> > information about where these trees come from and these suboptimal
> > conflicts are the best we can do?
>
> I do not necessarily think the output is "suboptimal" from the point
> of view of our users, exactly because that is what they fed into the
> command themselves.

Yeah, I was worried though that the end user wasn't the one running
the git-merge-tree command, and was trying to dig more into usage
cases.  Johannes example of using revision:path specification lessens
my worry, and you are right to point out that we can add additional
options in the future to allow overriding with more human readable
names.
Elijah Newren Jan. 31, 2024, 4:40 p.m. UTC | #8
Hi Johannes,

On Tue, Jan 30, 2024 at 12:04 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Elijah,
>
> On Mon, 29 Jan 2024, Elijah Newren wrote:
>
> > On Fri, Jan 26, 2024 at 6:18 AM Johannes Schindelin via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > >
[...]
> That's funny: I asked Victoria Dye to look over the patch, and she pointed
> out the exact opposite: I had written `<tree>` and she remarked that most
> of Git's manual pages would call this a `<tree-ish>` :-)

A code review isn't complete until you get two contradictory requests, I guess?

> On another funny note, I tried to establish the term "ent" for this category
> almost 222 months ago because I also disliked the "-ish" convention:
> https://lore.kernel.org/git/Pine.LNX.4.63.0508051655480.8418@wgmdd8.biozentrum.uni-wuerzburg.de/

:-)

> > > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > > index 3bdec53fbe5..cbd8e15af6d 100644
> > > --- a/builtin/merge-tree.c
> > > +++ b/builtin/merge-tree.c
> > > @@ -429,35 +429,43 @@ static int real_merge(struct merge_tree_options *o,
> > >         struct merge_options opt;
> > >
> > >         copy_merge_options(&opt, &o->merge_options);
> > > -       parent1 = get_merge_parent(branch1);
> > > -       if (!parent1)
> > > -               help_unknown_ref(branch1, "merge-tree",
> > > -                                _("not something we can merge"));
> > > -
> > > -       parent2 = get_merge_parent(branch2);
> > > -       if (!parent2)
> > > -               help_unknown_ref(branch2, "merge-tree",
> > > -                                _("not something we can merge"));
> > > -
> > >         opt.show_rename_progress = 0;
> > >
> > >         opt.branch1 = branch1;
> > >         opt.branch2 = branch2;
> >
> > If branch1 and branch2 refer to trees, then when users hit conflicts
> > they'll see e.g.
> >
> > <<<<<<< aaaaaaa
> >   somecode();
> > =======
> >   othercode();
> > >>>>>>> bbbbbbb
> >
> > but aaaaaaa and bbbbbbb are not commits that they can find.
>
> That is true. And it is not totally obvious to many users that they could
> then call `git show aaaaaaa:file` to see the full pre-image on the
> first-parent side.
>
> On the other hand, the label that is shown is precisely what the user
> specified on the command-line.

So this is only for direct use?  I was curious if the user was using
some other tool of yours, perhaps even some web GUI, and thus that
something else was controlling what was passed to git-merge-tree.

> For example:
>
>         $ git merge-tree --merge-base=v2.42.0:t v2.43.0~11:t v2.43.0~10^2:t
>
> will result in the following conflict markers:
>
>         $ git show 021c3ce211:t0091-bugreport.sh
>         [...]
>         <<<<<<< v2.43.0~11:t
>                 grep usage output &&
>                 test_path_is_missing git-bugreport-*
>         '
>
>         test_expect_success 'incorrect positional arguments abort with usage and hint' '
>                 test_must_fail git bugreport false 2>output &&
>                 grep usage output &&
>                 grep false output &&
>         =======
>                 test_grep usage output &&
>         >>>>>>> v2.43.0~10^2:t
>         [...]
>
> which I personally find very pleasing output.

Oh, good point -- if users provide trees in the revision:path format
then they still have access to more information about why the change
was made via the revision.

Of course, if users are using the tool directly, presumably they have
access to more information about where those trees came from anyway
even without the conflict label.

> Besides, the manual page of `git merge-tree` says in no sugar-coated
> words:
>
>         Do NOT look through the resulting toplevel tree to try to find which
>         files conflict; [...]
>
> :-)

Right, but this isn't using the tree to find which files conflict;
they already are looking at the conflict.  They are instead wanting to
learn why certain textual changes were made on one side of history to
better inform how to resolve an otherwise less than obvious conflict
resolution.  At least, that's the only thing I've seen those conflict
labels be used for.
Junio C Hamano Jan. 31, 2024, 6:14 p.m. UTC | #9
Elijah Newren <newren@gmail.com> writes:

> On Tue, Jan 30, 2024 at 9:15 AM Junio C Hamano <gitster@pobox.com> wrote:
>> ...
>> > but aaaaaaa and bbbbbbb are not commits that they can find.
>>
>> Correct.  They are what they fed as two trees to be merged, aren't
>> they?  They (or the script that drives merge-tree and fed these two
>> trees) should be recognise these two trees, as long as they are told

I notice "able to" is missing here...

>> that is what we show, no?
> ...
> Yeah, I was worried though that the end user wasn't the one running
> the git-merge-tree command, and was trying to dig more into usage
> cases.

Sure.  That is exactly why I wrote "They (or the script that
drives..."  above.
diff mbox series

Patch

diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
index b50acace3bc..214e30c70ba 100644
--- a/Documentation/git-merge-tree.txt
+++ b/Documentation/git-merge-tree.txt
@@ -64,10 +64,13 @@  OPTIONS
 	share no common history.  This flag can be given to override that
 	check and make the merge proceed anyway.
 
---merge-base=<commit>::
+--merge-base=<tree-ish>::
 	Instead of finding the merge-bases for <branch1> and <branch2>,
 	specify a merge-base for the merge, and specifying multiple bases is
 	currently not supported. This option is incompatible with `--stdin`.
++
+As the merge-base is provided directly, <branch1> and <branch2> do not need
+o specify commits; it is sufficient if they specify trees.
 
 [[OUTPUT]]
 OUTPUT
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 3bdec53fbe5..cbd8e15af6d 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -429,35 +429,43 @@  static int real_merge(struct merge_tree_options *o,
 	struct merge_options opt;
 
 	copy_merge_options(&opt, &o->merge_options);
-	parent1 = get_merge_parent(branch1);
-	if (!parent1)
-		help_unknown_ref(branch1, "merge-tree",
-				 _("not something we can merge"));
-
-	parent2 = get_merge_parent(branch2);
-	if (!parent2)
-		help_unknown_ref(branch2, "merge-tree",
-				 _("not something we can merge"));
-
 	opt.show_rename_progress = 0;
 
 	opt.branch1 = branch1;
 	opt.branch2 = branch2;
 
 	if (merge_base) {
-		struct commit *base_commit;
 		struct tree *base_tree, *parent1_tree, *parent2_tree;
 
-		base_commit = lookup_commit_reference_by_name(merge_base);
-		if (!base_commit)
-			die(_("could not lookup commit '%s'"), merge_base);
+		/*
+		 * We actually only need the trees because we already
+		 * have a merge base.
+		 */
+		struct object_id base_oid, head_oid, merge_oid;
+
+		if (repo_get_oid_treeish(the_repository, merge_base, &base_oid))
+			die(_("could not parse as tree '%s'"), merge_base);
+		base_tree = parse_tree_indirect(&base_oid);
+		if (repo_get_oid_treeish(the_repository, branch1, &head_oid))
+			die(_("could not parse as tree '%s'"), branch1);
+		parent1_tree = parse_tree_indirect(&head_oid);
+		if (repo_get_oid_treeish(the_repository, branch2, &merge_oid))
+			die(_("could not parse as tree '%s'"), branch2);
+		parent2_tree = parse_tree_indirect(&merge_oid);
 
 		opt.ancestor = merge_base;
-		base_tree = repo_get_commit_tree(the_repository, base_commit);
-		parent1_tree = repo_get_commit_tree(the_repository, parent1);
-		parent2_tree = repo_get_commit_tree(the_repository, parent2);
 		merge_incore_nonrecursive(&opt, base_tree, parent1_tree, parent2_tree, &result);
 	} else {
+		parent1 = get_merge_parent(branch1);
+		if (!parent1)
+			help_unknown_ref(branch1, "merge-tree",
+					 _("not something we can merge"));
+
+		parent2 = get_merge_parent(branch2);
+		if (!parent2)
+			help_unknown_ref(branch2, "merge-tree",
+					 _("not something we can merge"));
+
 		/*
 		 * Get the merge bases, in reverse order; see comment above
 		 * merge_incore_recursive in merge-ort.h
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 12ac4368736..71f21bb834f 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -945,4 +945,12 @@  test_expect_success 'check the input format when --stdin is passed' '
 	test_cmp expect actual
 '
 
+test_expect_success '--merge-base with tree OIDs' '
+	git merge-tree --merge-base=side1^ side1 side3 >tree &&
+	tree=$(cat tree) &&
+	git merge-tree --merge-base=side1^^{tree} side1^{tree} side3^{tree} >tree2 &&
+	tree2=$(cat tree2) &&
+	test $tree = $tree2
+'
+
 test_done