diff mbox series

[v5] submodule merge: update conflict error message

Message ID 20220718214349.3379328-1-calvinwan@google.com (mailing list archive)
State Superseded
Headers show
Series [v5] submodule merge: update conflict error message | expand

Commit Message

Calvin Wan July 18, 2022, 9:43 p.m. UTC
Changes since v4:
 - Rebased onto gitster/master
 - Fixed test cases to work with only merge-ort (and not merge-recursive)

== Description ==

When attempting to merge in a superproject with conflicting submodule
pointers that cannot be fast-forwarded or trivially resolved, the merge
fails and git prints the following error:

Failed to merge submodule <submodule>
CONFLICT (submodule): Merge conflict in <submodule>
Automatic merge failed; fix conflicts and then commit the result.

Git is left in a conflicted state, which requires the user to:
 1. merge submodules or update submodules to an already existing
	commit that reflects the merge
 2. add submodules changes to the superproject
 3. finish merging superproject
These steps are non-obvious for newer submodule users to figure out
based on the error message and neither `git submodule status` nor `git
status` provide any useful pointers. 

Update error message to match the steps above. If git does not detect a 
merge resolution, the following is printed:

====

Failed to merge submodule <submodule>
CONFLICT (submodule): Merge conflict in <submodule>
Recursive merging with submodules currently only supports trivial cases.
Please manually handle the merging of each conflicted submodule.
This can be accomplished with the following steps:
 - go to submodule (<submodule>), and either merge commit <commit>
or update to an existing commit which has merged those changes
 - come back to superproject, and `git add <submodule>` to record the above merge or update
 - resolve any other conflicts in the superproject
 - commit the resulting index in the superproject
Automatic merge failed; fix conflicts and then commit the result.

====

If git detects a possible merge resolution, the following is printed:

====

Failed to merge submodule sub, but a possible merge resolution exists:
    <commit> Merge branch '<branch1>' into <branch2>

CONFLICT (submodule): Merge conflict in <submodule>
Recursive merging with submodules currently only supports trivial cases.
Please manually handle the merging of each conflicted submodule.
This can be accomplished with the following steps:
To manually complete the merge:
 - go to submodule (<submodule>), and either merge commit <commit>
or update to an existing commit which has merged those changes
such as one listed above
 - come back to superproject, and `git add <submodule>` to record the above merge or update
 - resolve any other conflicts in the superproject
 - commit the resulting index in the superproject
Automatic merge failed; fix conflicts and then commit the result.

====

If git detects multiple possible merge resolutions, the following is printed:

====

Failed to merge submodule sub, but multiple possible merges exist:
    <commit> Merge branch '<branch1>' into <branch2>
    <commit> Merge branch '<branch1>' into <branch3>

CONFLICT (submodule): Merge conflict in <submodule>
Recursive merging with submodules currently only supports trivial cases.
Please manually handle the merging of each conflicted submodule.
This can be accomplished with the following steps:
To manually complete the merge:
 - go to submodule (<submodule>), and either merge commit <commit>
or update to an existing commit which has merged those changes
such as one listed above
 - come back to superproject, and `git add <submodule>` to record the above merge or update
 - resolve any other conflicts in the superproject
 - commit the resulting index in the superproject
Automatic merge failed; fix conflicts and then commit the result.

== Previous Changes ==

Changes since v3:
Thank you again Elijah for the helpful feedback! I have removed any code
touching merge-recursive.c, and refactored the rest into merge-ort.c.
The error message has been updated as well as any relevant test cases. I
had added a jump in v3 to "ret:" in merge_submodule() to accomodate
early returns, but this has been proven to not be necessary since an
early return means the submodule was either renamed or deleted, and this
case is already taken care of with the message "CONFLICT (modify/delete):"

Changes since v2:
There are three major changes in this patch thanks to all the helpful
feedback! I have moved the print function from builtin/merge.c to
merge-ort.c and added it to merge_finalize() in merge-recursive.c and
merge_switch_to_result() in merge-ort.c. This allows other merge
machinery commands besides merge to print submodule conflict advice.

I have moved the check for submodule conflicts from process_entry() to 
merge_submodule(). This also required removing the early returns and
instead going to my submodule conflict check allowing us to pass back
information on whether git detected a possible merge resolution or not.

I have also updated the error message to better reflect the merge
status. Specifically, if git detects a possible merge resolution, the
error message now also suggest updating to one of those resolutions.

Other small changes: 
 - Moved fields that hold submodule conflict information to new object
 - Shortened printed commit id to DEFAULT_ABBREV
 - Added a test to t6437-submodule-merge.sh for merges with no
   resolution existence
 - Modified a test in t7402-submodule-rebase to show error message
   prints in other parts of the merge machinery

Changes since v1:
 - Removed advice to abort merge
 - Error message updated to contain more commit/submodule information

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 merge-ort.c                 | 56 +++++++++++++++++++++++++++++++++++++
 t/t6437-submodule-merge.sh  | 38 +++++++++++++++++++++----
 t/t7402-submodule-rebase.sh |  9 ++++--
 3 files changed, 95 insertions(+), 8 deletions(-)


base-commit: 9dd64cb4d310986dd7b8ca7fff92f9b61e0bd21a

Comments

Junio C Hamano July 19, 2022, 6:39 a.m. UTC | #1
Calvin Wan <calvinwan@google.com> writes:

> Changes since v4:
>  - Rebased onto gitster/master
>  - Fixed test cases to work with only merge-ort (and not merge-recursive)

That's not a log-message material.  You could have probably written
scissors ...

> == Description ==

--- >8 ---

... here, perhaps?

> When attempting to merge in a superproject with conflicting submodule
> pointers that cannot be fast-forwarded or trivially resolved, the merge
> fails and git prints the following error:
>
> Failed to merge submodule <submodule>
> CONFLICT (submodule): Merge conflict in <submodule>
> Automatic merge failed; fix conflicts and then commit the result.
>
> Git is left in a conflicted state, which requires the user to:
>  1. merge submodules or update submodules to an already existing
> 	commit that reflects the merge
>  2. add submodules changes to the superproject
>  3. finish merging superproject
> These steps are non-obvious for newer submodule users to figure out
> based on the error message and neither `git submodule status` nor `git
> status` provide any useful pointers. 
>
> Update error message to match the steps above. If git does not detect a 
> merge resolution, the following is printed:
>
> ====
>
> Failed to merge submodule <submodule>
> CONFLICT (submodule): Merge conflict in <submodule>
> Recursive merging with submodules currently only supports trivial cases.
> Please manually handle the merging of each conflicted submodule.
> This can be accomplished with the following steps:
>  - go to submodule (<submodule>), and either merge commit <commit>
> or update to an existing commit which has merged those changes
>  - come back to superproject, and `git add <submodule>` to record the above merge or update
>  - resolve any other conflicts in the superproject
>  - commit the resulting index in the superproject
> Automatic merge failed; fix conflicts and then commit the result.
>
> ====
>
> If git detects a possible merge resolution, the following is printed:
>
> ====
>
> Failed to merge submodule sub, but a possible merge resolution exists:
>     <commit> Merge branch '<branch1>' into <branch2>
>
> CONFLICT (submodule): Merge conflict in <submodule>
> Recursive merging with submodules currently only supports trivial cases.
> Please manually handle the merging of each conflicted submodule.
> This can be accomplished with the following steps:
> To manually complete the merge:
>  - go to submodule (<submodule>), and either merge commit <commit>
> or update to an existing commit which has merged those changes
> such as one listed above
>  - come back to superproject, and `git add <submodule>` to record the above merge or update
>  - resolve any other conflicts in the superproject
>  - commit the resulting index in the superproject
> Automatic merge failed; fix conflicts and then commit the result.
>
> ====
>
> If git detects multiple possible merge resolutions, the following is printed:
>
> ====
>
> Failed to merge submodule sub, but multiple possible merges exist:
>     <commit> Merge branch '<branch1>' into <branch2>
>     <commit> Merge branch '<branch1>' into <branch3>
>
> CONFLICT (submodule): Merge conflict in <submodule>
> Recursive merging with submodules currently only supports trivial cases.
> Please manually handle the merging of each conflicted submodule.
> This can be accomplished with the following steps:
> To manually complete the merge:
>  - go to submodule (<submodule>), and either merge commit <commit>
> or update to an existing commit which has merged those changes
> such as one listed above
>  - come back to superproject, and `git add <submodule>` to record the above merge or update
>  - resolve any other conflicts in the superproject
>  - commit the resulting index in the superproject
> Automatic merge failed; fix conflicts and then commit the result.

But cutting the cruft at the top is still not enough, because the
below are not log-message material, either.

These extraneous stuff may help reviewers while the patch is being
polished, but once committed to our history, readers of "git log"
should not have to care what other attempts were made that did not
become part of the final hstory.  The log message proper should be
understandable standalone.

The stuff to help reviewers who may have seen earlier round are
usually written in the cover letter, or after the three-dash line.

> == Previous Changes ==
>
> Changes since v3:
> ...
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
> ---
>  merge-ort.c                 | 56 +++++++++++++++++++++++++++++++++++++
>  t/t6437-submodule-merge.sh  | 38 +++++++++++++++++++++----
>  t/t7402-submodule-rebase.sh |  9 ++++--
>  3 files changed, 95 insertions(+), 8 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 01f150ef3b..125ee3c0d1 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> ...
> +	if (csub && csub->nr > 0) {
> +		int i;
> +		printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
> +			"Please manually handle the merging of each conflicted submodule.\n"
> +			"This can be accomplished with the following steps:\n"));

This makes me wonder if these "helpful but verbose" messages should
use the advise mechanism.  Also, those reviewers who care about l10n
may suggest use of printf_ln() to lose the LF at the end of these
messages (i.e. not just the above one, but others we see below as
well).

> +		for (i = 0; i < csub->nr; i++) {
> +			printf(_(" - go to submodule (%s), and either merge commit %s\n"
> +				    "or update to an existing commit which has merged those changes\n"),
> +					csub->items[i].path,
> +					csub->items[i].oid);
> +			if (csub->items[i].resolution_exists)
> +				printf(_("such as one listed above\n"));
> +		}
> +		printf(_(" - come back to superproject, and `git add"));
> +		for (i = 0; i < csub->nr; i++)
> +			printf(_(" %s"), csub->items[i].path);
> +		printf(_("` to record the above merge or update \n"
> +			" - resolve any other conflicts in the superproject\n"
> +			" - commit the resulting index in the superproject\n"));
> +	}
> +}
> +
>  void merge_display_update_messages(struct merge_options *opt,
>  				   int detailed,
>  				   struct merge_result *result)
> @@ -4461,6 +4515,8 @@ void merge_display_update_messages(struct merge_options *opt,
>  	}
>  	string_list_clear(&olist, 0);
>  
> +	print_submodule_conflict_suggestion(&opti->conflicted_submodules);
> +
>  	/* Also include needed rename limit adjustment now */
>  	diff_warn_rename_limit("merge.renamelimit",
>  			       opti->renames.needed_limit, 0);

Thanks.
Junio C Hamano July 19, 2022, 7:13 a.m. UTC | #2
Calvin Wan <calvinwan@google.com> writes:

> @@ -1835,6 +1853,7 @@ static int merge_submodule(struct merge_options *opt,
>  			   "resolution exists: %s"),
>  			 path, sb.buf);
>  		strbuf_release(&sb);
> +		resolution_exists = 1;
>  		break;
>  	default:
>  		for (i = 0; i < merges.nr; i++)
> @@ -1845,10 +1864,22 @@ static int merge_submodule(struct merge_options *opt,
>  			 _("Failed to merge submodule %s, but multiple "
>  			   "possible merges exist:\n%s"), path, sb.buf);
>  		strbuf_release(&sb);
> +		resolution_exists = 1;
>  	}

These work on the result of calling find_first_merges(), but is it
possible that we are asked to call this function more than once
because we see conflicted submodule updates at two or more paths?

I may be misreading the code, but find_first_merges(), either the
version we see in this file, or the one in merge-recursive.c, or its
original introduced in 68d03e4a (Implement automatic fast-forward
merge for submodules, 2010-07-07), look safe to be called twice.  It
runs the get_revision() machinery, smudging the object flags while
walking the history, but I do not see any code that cleans up these
flags for the second traversal.

Also, this is not a new problem, but I am afraid that the logic to
find existing merges in find_first_merges() might be overly loose.
It tries to find existing merges that can reach the two commits, and
then finds, among these merges, the one that is not descendant of
any other such merges.  Don't we end up finding a merge M

    A---o---M
           /
          B

when a superproject merge needs a merge of A and B in the submodule?
That is certainly a merge that contains both A and B and it may be
closer to A and B than any other existing merges, but it still may
not be a merge between A and B (in the depicted case, an extra
commit 'o' nobody ordered is included for free in the result).  I am
not seeing how existing code tries to avoid such a situation.

Thanks.
Calvin Wan July 19, 2022, 7:07 p.m. UTC | #3
> These work on the result of calling find_first_merges(), but is it
> possible that we are asked to call this function more than once
> because we see conflicted submodule updates at two or more paths?

This does get called multiple times if we see conflicted submodule
updates at two or more paths.

> I may be misreading the code, but find_first_merges(), either the
> version we see in this file, or the one in merge-recursive.c, or its
> original introduced in 68d03e4a (Implement automatic fast-forward
> merge for submodules, 2010-07-07), look safe to be called twice.  It
> runs the get_revision() machinery, smudging the object flags while
> walking the history, but I do not see any code that cleans up these
> flags for the second traversal.

I don't quite understand which flags need to be cleaned up for the
second traversal.

> Also, this is not a new problem, but I am afraid that the logic to
> find existing merges in find_first_merges() might be overly loose.
> It tries to find existing merges that can reach the two commits, and
> then finds, among these merges, the one that is not descendant of
> any other such merges.  Don't we end up finding a merge M
>
>     A---o---M
>            /
>           B
>
> when a superproject merge needs a merge of A and B in the submodule?
> That is certainly a merge that contains both A and B and it may be
> closer to A and B than any other existing merges, but it still may
> not be a merge between A and B (in the depicted case, an extra
> commit 'o' nobody ordered is included for free in the result).  I am
> not seeing how existing code tries to avoid such a situation.

It is true that we find merge M and it isn't representative of a merge of A
and B in the submodule. In this case, the existing code prints:

"Failed to merge submodule %s, but a possible merge resolution exists: %s"

While this part doesn't claim M to be a guaranteed merge resolution, my
change adds this line:

"or update to an existing commit which has merged those changes such as
one listed above"

Instead of adding more verbosity to this language, it seems like a better
idea to remove "such as one listed above" entirely (and subsequently any
of my code that flags merge resolutions).
Calvin Wan July 19, 2022, 7:30 p.m. UTC | #4
> The stuff to help reviewers who may have seen earlier round are
> usually written in the cover letter, or after the three-dash line.
Ah I see since this patch doesn't have a cover letter, I should put
all the reviewer-centric stuff after the three-dash line.

> > +     if (csub && csub->nr > 0) {
> > +             int i;
> > +             printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
> > +                     "Please manually handle the merging of each conflicted submodule.\n"
> > +                     "This can be accomplished with the following steps:\n"));
>
> This makes me wonder if these "helpful but verbose" messages should
> use the advise mechanism.

I agree. The only loss of information if someone turned off this message
would be the commit id that possibly needs to be merged.

> Also, those reviewers who care about l10n
> may suggest use of printf_ln() to lose the LF at the end of these
> messages (i.e. not just the above one, but others we see below as
> well).

ack
Junio C Hamano July 19, 2022, 8:16 p.m. UTC | #5
Calvin Wan <calvinwan@google.com> writes:

>> The stuff to help reviewers who may have seen earlier round are
>> usually written in the cover letter, or after the three-dash line.
> Ah I see since this patch doesn't have a cover letter, I should put
> all the reviewer-centric stuff after the three-dash line.
>
>> > +     if (csub && csub->nr > 0) {
>> > +             int i;
>> > +             printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
>> > +                     "Please manually handle the merging of each conflicted submodule.\n"
>> > +                     "This can be accomplished with the following steps:\n"));
>>
>> This makes me wonder if these "helpful but verbose" messages should
>> use the advise mechanism.
>
> I agree. The only loss of information if someone turned off this message
> would be the commit id that possibly needs to be merged.

If the commit found by find_first_merges() are useful, then it is
losing information, so it is one argument against using the advise
mechanism.
Junio C Hamano July 19, 2022, 8:30 p.m. UTC | #6
Calvin Wan <calvinwan@google.com> writes:

>> These work on the result of calling find_first_merges(), but is it
>> possible that we are asked to call this function more than once
>> because we see conflicted submodule updates at two or more paths?
>
> This does get called multiple times if we see conflicted submodule
> updates at two or more paths.
>
>> I may be misreading the code, but find_first_merges(), either the
>> version we see in this file, or the one in merge-recursive.c, or its
>> original introduced in 68d03e4a (Implement automatic fast-forward
>> merge for submodules, 2010-07-07), look safe to be called twice.  It
>> runs the get_revision() machinery, smudging the object flags while
>> walking the history, but I do not see any code that cleans up these
>> flags for the second traversal.
>
> I don't quite understand which flags need to be cleaned up for the
> second traversal.

UNINTERESTING, TREESAME, ADDED, SEEN, SHOWN are among the flags used
by the object walk (if MyFirstObjectWalk does not talk about them,
it probably should), and they need to be cleared before you prepare
a new "struct rev_info" and throw it at setup_revisions(),
prepare_revision_walk(), and start calling get_revision().

submodule.c::collect_changed_submodules() has its own revision walk,
but it calls reset_revision_walk() to clear these flags from all
objects in the superproject (i.e. the_repository).

I _think_ the reason why this never turned out to be a problem in
practice is because we do not run this helper twice for the same
submodule.  Even though we may smudge many objects from a submodule
with an object walk without clearing their flags, as long as we
run the next object walk in a different submodule whose object flags
are still unsmudged, it would be OK.
Ævar Arnfjörð Bjarmason July 25, 2022, 6:05 a.m. UTC | #7
On Mon, Jul 18 2022, Calvin Wan wrote:


One of the CI failures in "seen" is because of my topic to mark (among
other things) t1500*.sh as passing with SANITIZE=leak, and this change.

Because...

> ..
>  	object_array_clear(&merges);
>  cleanup:
> +	if (!ret) {
> +		if (!csub) {
> +			CALLOC_ARRAY(csub, 1);
> +		}
> +		csub_item.oid = xstrdup(repo_find_unique_abbrev(&subrepo, b, DEFAULT_ABBREV));
> +		csub_item.path = xstrdup(path);
> +		csub_item.resolution_exists = resolution_exists;
> +		ALLOC_GROW(csub->items, csub->nr + 1, csub->alloc);

... in "cleanup" we're ALLOC_GROW()-ing? I haven't looked into this yet,
but this seems susppect. This is line 1879 in the following stacktrace:
	
	+ git -C super merge branch1
	Failed to merge submodule dir/sub
	CONFLICT (submodule): Merge conflict in dir/sub
	Recursive merging with submodules currently only supports trivial cases.
	Please manually handle the merging of each conflicted submodule.
	This can be accomplished with the following steps:
	 - go to submodule (dir/sub), and either merge commit 7018b5f
	or update to an existing commit which has merged those changes
	 - come back to superproject, and `git add dir/sub` to record the above merge or update
	 - resolve any other conflicts in the superproject
	 - commit the resulting index in the superproject
	Automatic merge failed; fix conflicts and then commit the result.
	
	=================================================================
	==31261==ERROR: LeakSanitizer: detected memory leaks
	
	Direct leak of 576 byte(s) in 1 object(s) allocated from:
	    #0 0x4565ad in __interceptor_realloc (git+0x4565ad)
	    #1 0x76ecfd in xrealloc wrapper.c:136:8
	    #2 0x64fcd3 in merge_submodule merge-ort.c:1879:3
	    #3 0x64ee9b in handle_content_merge merge-ort.c:2118:11
	    #4 0x651c14 in process_entry merge-ort.c:4056:17
	    #5 0x648c05 in process_entries merge-ort.c:4267:4
	    #6 0x646c03 in merge_ort_nonrecursive_internal merge-ort.c:4893:2
	    #7 0x6470f3 in merge_ort_internal merge-ort.c:4982:2
	    #8 0x646de0 in merge_incore_recursive merge-ort.c:5033:2
	    #9 0x652d1a in merge_ort_recursive merge-ort-wrappers.c:57:2
	    #10 0x4ec0f6 in try_merge_strategy builtin/merge.c:764:12
	    #11 0x4e9bf2 in cmd_merge builtin/merge.c:1710:9
	    #12 0x45a3aa in run_builtin git.c:466:11
	    #13 0x458e41 in handle_builtin git.c:720:3
	    #14 0x459d85 in run_argv git.c:787:4
	    #15 0x458bfa in cmd_main git.c:920:19
	    #16 0x56a049 in main common-main.c:56:11
	    #17 0x7fe592bca81c in __libc_start_main csu/../csu/libc-start.c:332:16
	    #18 0x431139 in _start (git+0x431139)
Ævar Arnfjörð Bjarmason July 25, 2022, 12:11 p.m. UTC | #8
On Mon, Jul 25 2022, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Jul 18 2022, Calvin Wan wrote:
>
>
> One of the CI failures in "seen" is because of my topic to mark (among
> other things) t1500*.sh as passing with SANITIZE=leak, and this change.
>
> Because...
>
>> ..
>>  	object_array_clear(&merges);
>>  cleanup:
>> +	if (!ret) {
>> +		if (!csub) {
>> +			CALLOC_ARRAY(csub, 1);
>> +		}
>> +		csub_item.oid = xstrdup(repo_find_unique_abbrev(&subrepo, b, DEFAULT_ABBREV));
>> +		csub_item.path = xstrdup(path);
>> +		csub_item.resolution_exists = resolution_exists;
>> +		ALLOC_GROW(csub->items, csub->nr + 1, csub->alloc);
>
> ... in "cleanup" we're ALLOC_GROW()-ing? I haven't looked into this yet,
> but this seems susppect. This is line 1879 in the following stacktrace:
> 	
> 	+ git -C super merge branch1
> 	Failed to merge submodule dir/sub
> 	CONFLICT (submodule): Merge conflict in dir/sub
> 	Recursive merging with submodules currently only supports trivial cases.
> 	Please manually handle the merging of each conflicted submodule.
> 	This can be accomplished with the following steps:
> 	 - go to submodule (dir/sub), and either merge commit 7018b5f
> 	or update to an existing commit which has merged those changes
> 	 - come back to superproject, and `git add dir/sub` to record the above merge or update
> 	 - resolve any other conflicts in the superproject
> 	 - commit the resulting index in the superproject
> 	Automatic merge failed; fix conflicts and then commit the result.
> 	
> 	=================================================================
> 	==31261==ERROR: LeakSanitizer: detected memory leaks
> 	
> 	Direct leak of 576 byte(s) in 1 object(s) allocated from:
> 	    #0 0x4565ad in __interceptor_realloc (git+0x4565ad)
> 	    #1 0x76ecfd in xrealloc wrapper.c:136:8
> 	    #2 0x64fcd3 in merge_submodule merge-ort.c:1879:3
> 	    #3 0x64ee9b in handle_content_merge merge-ort.c:2118:11
> 	    #4 0x651c14 in process_entry merge-ort.c:4056:17
> 	    #5 0x648c05 in process_entries merge-ort.c:4267:4
> 	    #6 0x646c03 in merge_ort_nonrecursive_internal merge-ort.c:4893:2
> 	    #7 0x6470f3 in merge_ort_internal merge-ort.c:4982:2
> 	    #8 0x646de0 in merge_incore_recursive merge-ort.c:5033:2
> 	    #9 0x652d1a in merge_ort_recursive merge-ort-wrappers.c:57:2
> 	    #10 0x4ec0f6 in try_merge_strategy builtin/merge.c:764:12
> 	    #11 0x4e9bf2 in cmd_merge builtin/merge.c:1710:9
> 	    #12 0x45a3aa in run_builtin git.c:466:11
> 	    #13 0x458e41 in handle_builtin git.c:720:3
> 	    #14 0x459d85 in run_argv git.c:787:4
> 	    #15 0x458bfa in cmd_main git.c:920:19
> 	    #16 0x56a049 in main common-main.c:56:11
> 	    #17 0x7fe592bca81c in __libc_start_main csu/../csu/libc-start.c:332:16
> 	    #18 0x431139 in _start (git+0x431139)

Looking at this a bit more this fixes it, and also the NIH allocation
pattern & what we can do with a "struct string_list" instead of managing
our own allocations.

I did wonder why you need to allocate the "oid" at all, i.e. at the time
of merge_submodules() can't we just squirrel away the "const struct
object_id *" and only when we emit the message later run
repo_find_unique_abbrev()?

I think we can do that, but it means re-init-ing the submodule, and I'm
not sure if there's any edge cases with the state changing between us
trying merge_submodules() and the eventual messages we emit.

But it would mean that we'd just need to allocate a pointer to that oid
+ resolution_exists.

Perhaps we could also continue down that path and do this with just one
allocation for that "oid/resolution_exists" side-data. I.e. you'd have a
second string_list in the struct, have a non-NULL util mean "resolution
exists", and the "string" there would be nodup'd (we could refer to the
"string" in the first one).

But I think all of that probably isn't worth it, but the below seems
like a good trade-off & good place to stop, and fixes the memory leak
this topic introduces.

 merge-ort.c | 69 +++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 40 insertions(+), 29 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 154d33f2f45..5cf87f70b58 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -293,16 +293,17 @@ struct rename_info {
 };
 
 struct conflicted_submodule_item {
-	char *oid;
-	char *path;
-	int resolution_exists;
+	char *abbrev;
+	int resolution_exists:1;
 };
 
-struct conflicted_submodule_list {
-	struct conflicted_submodule_item *items;
-	size_t nr;
-	size_t alloc;
-};
+static void conflicted_submodule_item_free(void *util, const char *str)
+{
+	struct conflicted_submodule_item *item = util;
+
+	free(item->abbrev);
+	free(item);
+}
 
 struct merge_options_internal {
 	/*
@@ -401,7 +402,7 @@ struct merge_options_internal {
 	int call_depth;
 
 	/* field that holds submodule conflict information */
-	struct conflicted_submodule_list conflicted_submodules;
+	struct string_list conflicted_submodules;
 };
 
 struct version_info {
@@ -701,6 +702,9 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 
 	mem_pool_discard(&opti->pool, 0);
 
+	string_list_clear_func(&opti->conflicted_submodules,
+			       conflicted_submodule_item_free);
+
 	/* Clean out callback_data as well. */
 	FREE_AND_NULL(renames->callback_data);
 	renames->callback_data_nr = renames->callback_data_alloc = 0;
@@ -1756,8 +1760,6 @@ static int merge_submodule(struct merge_options *opt,
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
-	struct conflicted_submodule_list *csub = &opt->priv->conflicted_submodules;
-	struct conflicted_submodule_item csub_item;
 	int resolution_exists = 0;
 
 	int i;
@@ -1870,15 +1872,17 @@ static int merge_submodule(struct merge_options *opt,
 	object_array_clear(&merges);
 cleanup:
 	if (!ret) {
-		if (!csub) {
-			CALLOC_ARRAY(csub, 1);
-		}
-		csub_item.oid = xstrdup(repo_find_unique_abbrev(&subrepo, b, DEFAULT_ABBREV));
-		csub_item.path = xstrdup(path);
-		csub_item.resolution_exists = resolution_exists;
-		ALLOC_GROW(csub->items, csub->nr + 1, csub->alloc);
-		csub->items[csub->nr++] = csub_item;
-		opt->priv->conflicted_submodules = *csub;
+		struct string_list *csub = &opt->priv->conflicted_submodules;
+		struct conflicted_submodule_item *util;
+		const char *abbrev;
+
+		abbrev = repo_find_unique_abbrev(&subrepo, b, DEFAULT_ABBREV);;
+
+		util = xmalloc(sizeof(*util));
+		util->abbrev = xstrdup(abbrev);
+		util->resolution_exists = resolution_exists;
+
+		string_list_append(csub, path)->util = util;
 	}
 	repo_clear(&subrepo);
 	return ret;
@@ -4465,23 +4469,26 @@ static int record_conflicted_index_entries(struct merge_options *opt)
 	return errs;
 }
 
-static void print_submodule_conflict_suggestion(struct conflicted_submodule_list *csub) {
-	if (csub && csub->nr > 0) {
-		int i;
+static void print_submodule_conflict_suggestion(struct string_list *csub) {
+	if (csub->nr > 0) {
+		struct string_list_item *item;
+
 		printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
 			"Please manually handle the merging of each conflicted submodule.\n"
 			"This can be accomplished with the following steps:\n"));
-		for (i = 0; i < csub->nr; i++) {
+
+		for_each_string_list_item(item, csub) {
+			struct conflicted_submodule_item *util = item->util;
+
 			printf(_(" - go to submodule (%s), and either merge commit %s\n"
 				    "or update to an existing commit which has merged those changes\n"),
-					csub->items[i].path,
-					csub->items[i].oid);
-			if (csub->items[i].resolution_exists)
+					item->string, util->abbrev);
+			if (util->resolution_exists)
 				printf(_("such as one listed above\n"));
 		}
 		printf(_(" - come back to superproject, and `git add"));
-		for (i = 0; i < csub->nr; i++)
-			printf(_(" %s"), csub->items[i].path);
+		for_each_string_list_item(item, csub)
+			printf(_(" %s"), item->string);
 		printf(_("` to record the above merge or update \n"
 			" - resolve any other conflicts in the superproject\n"
 			" - commit the resulting index in the superproject\n"));
@@ -4538,6 +4545,8 @@ void merge_display_update_messages(struct merge_options *opt,
 	string_list_clear(&olist, 0);
 
 	print_submodule_conflict_suggestion(&opti->conflicted_submodules);
+	string_list_clear_func(&opti->conflicted_submodules,
+			       conflicted_submodule_item_free);
 
 	/* Also include needed rename limit adjustment now */
 	diff_warn_rename_limit("merge.renamelimit",
@@ -4795,6 +4804,8 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 	 */
 	strmap_init(&opt->priv->conflicts);
 
+	string_list_init_dup(&opt->priv->conflicted_submodules);
+
 	trace2_region_leave("merge", "allocate/init", opt->repo);
 }
Ævar Arnfjörð Bjarmason July 25, 2022, 12:31 p.m. UTC | #9
On Mon, Jul 18 2022, Calvin Wan wrote:

I'll have some more comments, but just on the output/i18n:

> +		printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
> +			"Please manually handle the merging of each conflicted submodule.\n"
> +			"This can be accomplished with the following steps:\n"));
> +		for (i = 0; i < csub->nr; i++) {
> +			printf(_(" - go to submodule (%s), and either merge commit %s\n"
> +				    "or update to an existing commit which has merged those changes\n"),

This seems to have some formatting issues, i.e. we'll emit things like:

 - one
one continued
 - tow
two continued

Whereas we want this, surely?;

 - one
   one continued
 - two
   two continued

I.e. we're using " - " to mark list items, but then not indenting the
items.


> +					csub->items[i].path,
> +					csub->items[i].oid);
> +			if (csub->items[i].resolution_exists)
> +				printf(_("such as one listed above\n"));

	that people might read this backwards
	You need to consider

:)

I.e. this is "translation lego" that we try to avoid.

It's a bit more verbose (but it often is, unfortunately)

I think you can borrow a bit from ba5e8a0eb80 (object-name: make
ambiguous object output translatable, 2022-01-27) here, i.e.:

 1. Just translate a message like "go to submodule ...\nor update to
    an", have another variant for the "resolution exists".

 2. As translators retain those \n split those lines and format them
    with e.g. _(" %s") (you could borrow the string used in
    object-name.c via ba5e8a0eb80), or _("- %s").

    This will give you list-itemized output, which will also format
    correctly in RTL languages, and takes the formatting concerns
    completely out of the hands of translators, and allows us to change
    it later ourselves.

    I.e. we can safely assume that for a \n-delimited translation we can
    take \n-delimited input from a translator and treat the first line
    specially as a "first line in new list item".

> +		}
> +		printf(_(" - come back to superproject, and `git add"));
> +		for (i = 0; i < csub->nr; i++)
> +			printf(_(" %s"), csub->items[i].path);

More inconsistent indentation, and per ba5e8a0eb80 you should explain
any addition of magical formatting like " %s" with a TRANSLATORS
comment.


> +		printf(_("` to record the above merge or update \n"
> +			" - resolve any other conflicts in the superproject\n"
> +			" - commit the resulting index in the superproject\n"));

A bit more odd formatting, i.e.:

 - A trailing  " " before a \n?

 - " " after `, what is that ` doing? At first I thought it was a typo,
   but looking again it's a continuation of `` from above

   It's quite odd to tell a user to run a command with the `` syntax,
   which is used for interpolation. Let's instead suggest:

        blah blah blah run:

		git add foo \
			bar \ [...]
	                baz

   I.e. use \ at the end of lines to note a multi-line command, not wrap
   the whole thing in ``-quotes.

 - If a message must always end in a \n just add it between _()'s,
   instead of making it part of the _() string.
Calvin Wan July 25, 2022, 9:27 p.m. UTC | #10
Hi Ævar,

> Whereas we want this, surely?;
>
>  - one
>    one continued
>  - two
>    two continued
>
> I.e. we're using " - " to mark list items, but then not indenting the
> items.

Yes we surely do!

> I think you can borrow a bit from ba5e8a0eb80 (object-name: make
> ambiguous object output translatable, 2022-01-27) here, i.e.:
>
>  1. Just translate a message like "go to submodule ...\nor update to
>     an", have another variant for the "resolution exists".

I am removing the "resolution exists" variant because the text that
came with it was unclear to begin with.

> per ba5e8a0eb80 you should explain
> any addition of magical formatting like " %s" with a TRANSLATORS
> comment.

ack

>  - A trailing  " " before a \n?

ack

>    It's quite odd to tell a user to run a command with the `` syntax,
>    which is used for interpolation. Let's instead suggest:
>
>         blah blah blah run:
>
>                 git add foo \
>                         bar \ [...]
>                         baz
>
>    I.e. use \ at the end of lines to note a multi-line command, not wrap
>    the whole thing in ``-quotes.

That looks much better than what I currently have.

>  - If a message must always end in a \n just add it between _()'s,
>    instead of making it part of the _() string.

ack
Calvin Wan July 25, 2022, 10:03 p.m. UTC | #11
Hi Ævar,

Thanks in advance for the suggestions and catching my memory leak.

> Perhaps we could also continue down that path and do this with just one
> allocation for that "oid/resolution_exists" side-data. I.e. you'd have a
> second string_list in the struct, have a non-NULL util mean "resolution
> exists", and the "string" there would be nodup'd (we could refer to the
> "string" in the first one).

Since I am removing the "resolution exists" field, I modified your changes
to use the util to only hold the abbrev.
diff mbox series

Patch

diff --git a/merge-ort.c b/merge-ort.c
index 01f150ef3b..125ee3c0d1 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -292,6 +292,18 @@  struct rename_info {
 	int needed_limit;
 };
 
+struct conflicted_submodule_item {
+	char *oid;
+	char *path;
+	int resolution_exists;
+};
+
+struct conflicted_submodule_list {
+	struct conflicted_submodule_item *items;
+	size_t nr;
+	size_t alloc;
+};
+
 struct merge_options_internal {
 	/*
 	 * paths: primary data structure in all of merge ort.
@@ -387,6 +399,9 @@  struct merge_options_internal {
 
 	/* call_depth: recursion level counter for merging merge bases */
 	int call_depth;
+
+	/* field that holds submodule conflict information */
+	struct conflicted_submodule_list conflicted_submodules;
 };
 
 struct version_info {
@@ -1741,6 +1756,9 @@  static int merge_submodule(struct merge_options *opt,
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
+	struct conflicted_submodule_list *csub = &opt->priv->conflicted_submodules;
+	struct conflicted_submodule_item csub_item;
+	int resolution_exists = 0;
 
 	int i;
 	int search = !opt->priv->call_depth;
@@ -1835,6 +1853,7 @@  static int merge_submodule(struct merge_options *opt,
 			   "resolution exists: %s"),
 			 path, sb.buf);
 		strbuf_release(&sb);
+		resolution_exists = 1;
 		break;
 	default:
 		for (i = 0; i < merges.nr; i++)
@@ -1845,10 +1864,22 @@  static int merge_submodule(struct merge_options *opt,
 			 _("Failed to merge submodule %s, but multiple "
 			   "possible merges exist:\n%s"), path, sb.buf);
 		strbuf_release(&sb);
+		resolution_exists = 1;
 	}
 
 	object_array_clear(&merges);
 cleanup:
+	if (!ret) {
+		if (!csub) {
+			CALLOC_ARRAY(csub, 1);
+		}
+		csub_item.oid = xstrdup(repo_find_unique_abbrev(&subrepo, b, DEFAULT_ABBREV));
+		csub_item.path = xstrdup(path);
+		csub_item.resolution_exists = resolution_exists;
+		ALLOC_GROW(csub->items, csub->nr + 1, csub->alloc);
+		csub->items[csub->nr++] = csub_item;
+		opt->priv->conflicted_submodules = *csub;
+	}
 	repo_clear(&subrepo);
 	return ret;
 }
@@ -4412,6 +4443,29 @@  static int record_conflicted_index_entries(struct merge_options *opt)
 	return errs;
 }
 
+static void print_submodule_conflict_suggestion(struct conflicted_submodule_list *csub) {
+	if (csub && csub->nr > 0) {
+		int i;
+		printf(_("Recursive merging with submodules currently only supports trivial cases.\n"
+			"Please manually handle the merging of each conflicted submodule.\n"
+			"This can be accomplished with the following steps:\n"));
+		for (i = 0; i < csub->nr; i++) {
+			printf(_(" - go to submodule (%s), and either merge commit %s\n"
+				    "or update to an existing commit which has merged those changes\n"),
+					csub->items[i].path,
+					csub->items[i].oid);
+			if (csub->items[i].resolution_exists)
+				printf(_("such as one listed above\n"));
+		}
+		printf(_(" - come back to superproject, and `git add"));
+		for (i = 0; i < csub->nr; i++)
+			printf(_(" %s"), csub->items[i].path);
+		printf(_("` to record the above merge or update \n"
+			" - resolve any other conflicts in the superproject\n"
+			" - commit the resulting index in the superproject\n"));
+	}
+}
+
 void merge_display_update_messages(struct merge_options *opt,
 				   int detailed,
 				   struct merge_result *result)
@@ -4461,6 +4515,8 @@  void merge_display_update_messages(struct merge_options *opt,
 	}
 	string_list_clear(&olist, 0);
 
+	print_submodule_conflict_suggestion(&opti->conflicted_submodules);
+
 	/* Also include needed rename limit adjustment now */
 	diff_warn_rename_limit("merge.renamelimit",
 			       opti->renames.needed_limit, 0);
diff --git a/t/t6437-submodule-merge.sh b/t/t6437-submodule-merge.sh
index c253bf759a..84913525cf 100755
--- a/t/t6437-submodule-merge.sh
+++ b/t/t6437-submodule-merge.sh
@@ -103,8 +103,25 @@  test_expect_success 'setup for merge search' '
 	 echo "file-c" > file-c &&
 	 git add file-c &&
 	 git commit -m "sub-c") &&
-	git commit -a -m "c" &&
+	git commit -a -m "c")
+'
 
+test_expect_success 'merging should conflict for non fast-forward' '
+	test_when_finished "git -C merge-search reset --hard" &&
+	(cd merge-search &&
+	 git checkout -b test-nonforward-a b &&
+	  if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+	  then
+		test_must_fail git merge c >actual &&
+		sub_expect="go to submodule (sub), and either merge commit $(git -C sub rev-parse --short sub-c)" &&
+	 	grep "$sub_expect" actual
+	  else
+		test_must_fail git merge c 2> actual
+	  fi)
+'
+
+test_expect_success 'finish setup for merge-search' '
+	(cd merge-search &&
 	git checkout -b d a &&
 	(cd sub &&
 	 git checkout -b sub-d sub-b &&
@@ -129,14 +146,16 @@  test_expect_success 'merge with one side as a fast-forward of the other' '
 	 test_cmp expect actual)
 '
 
-test_expect_success 'merging should conflict for non fast-forward' '
+test_expect_success 'merging should conflict for non fast-forward (resolution exists)' '
 	(cd merge-search &&
-	 git checkout -b test-nonforward b &&
+	 git checkout -b test-nonforward-b b &&
 	 (cd sub &&
 	  git rev-parse --short sub-d > ../expect) &&
 	  if test "$GIT_TEST_MERGE_ALGORITHM" = ort
 	  then
-		test_must_fail git merge c >actual
+		test_must_fail git merge c >actual &&
+		sub_expect="go to submodule (sub), and either merge commit $(git -C sub rev-parse --short sub-c)" &&
+	 	grep "$sub_expect" actual
 	  else
 		test_must_fail git merge c 2> actual
 	  fi &&
@@ -161,7 +180,9 @@  test_expect_success 'merging should fail for ambiguous common parent' '
 	 ) &&
 	 if test "$GIT_TEST_MERGE_ALGORITHM" = ort
 	 then
-		test_must_fail git merge c >actual
+		test_must_fail git merge c >actual &&
+		sub_expect="go to submodule (sub), and either merge commit $(git -C sub rev-parse --short sub-c)" &&
+		grep "$sub_expect" actual
 	 else
 		test_must_fail git merge c 2> actual
 	 fi &&
@@ -205,7 +226,12 @@  test_expect_success 'merging should fail for changes that are backwards' '
 	git commit -a -m "f" &&
 
 	git checkout -b test-backward e &&
-	test_must_fail git merge f)
+	test_must_fail git merge f >actual &&
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+    then
+		sub_expect="go to submodule (sub), and either merge commit $(git -C sub rev-parse --short sub-d)" &&
+		grep "$sub_expect" actual
+	fi)
 '
 
 
diff --git a/t/t7402-submodule-rebase.sh b/t/t7402-submodule-rebase.sh
index 8e32f19007..ebeca12a71 100755
--- a/t/t7402-submodule-rebase.sh
+++ b/t/t7402-submodule-rebase.sh
@@ -104,7 +104,7 @@  test_expect_success 'rebasing submodule that should conflict' '
 	test_tick &&
 	git commit -m fourth &&
 
-	test_must_fail git rebase --onto HEAD^^ HEAD^ HEAD^0 &&
+	test_must_fail git rebase --onto HEAD^^ HEAD^ HEAD^0 >actual_output &&
 	git ls-files -s submodule >actual &&
 	(
 		cd submodule &&
@@ -112,7 +112,12 @@  test_expect_success 'rebasing submodule that should conflict' '
 		echo "160000 $(git rev-parse HEAD^^) 2	submodule" &&
 		echo "160000 $(git rev-parse HEAD) 3	submodule"
 	) >expect &&
-	test_cmp expect actual
+	test_cmp expect actual &&
+	if test "$GIT_TEST_MERGE_ALGORITHM" = ort
+    then
+		sub_expect="go to submodule (submodule), and either merge commit $(git -C submodule rev-parse --short HEAD^0)" &&
+		grep "$sub_expect" actual_output
+	fi
 '
 
 test_done