[v4,04/10] merge-recursive: new function for better colliding conflict resolutions

Message ID	20181102185317.31015-5-newren@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> From: Elijah Newren <newren@gmail.com> To: <git@vger.kernel.org> CC: <gitster@pobox.com>, <stolee@gmail.com>, Elijah Newren <newren@gmail.com> Subject: [PATCH v4 04/10] merge-recursive: new function for better colliding conflict resolutions Date: Fri, 2 Nov 2018 11:53:12 -0700 Message-ID: <20181102185317.31015-5-newren@gmail.com> In-Reply-To: <20181102185317.31015-1-newren@gmail.com> References: <20181019193111.12051-1-newren@gmail.com> <20181102185317.31015-1-newren@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII Sender: git-owner@vger.kernel.org Precedence: bulk
Series	Improve path collision conflict resolutions \| expand [v4,00/10] Improve path collision conflict resolutions [v4,01/10] Add testcases for consistency in file collision conflict handling [v4,02/10] t6036, t6042: testcases for rename collision of already conflicting files [v4,03/10] merge-recursive: increase marker length with depth of recursion [v4,04/10] merge-recursive: new function for better colliding conflict resolutions [v4,05/10] merge-recursive: fix rename/add conflict handling [v4,06/10] merge-recursive: improve handling for rename/rename(2to1) conflicts [v4,07/10] merge-recursive: use handle_file_collision for add/add conflicts [v4,08/10] merge-recursive: improve rename/rename(1to2)/add[/add] handling [RFC,v4,09/10] fixup! merge-recursive: fix rename/add conflict handling [v4,10/10] fixup! merge-recursive: improve rename/rename(1to2)/add[/add] handling

=== Handling the working tree === The previous behavior for these conflict types in regards to the working tree (assuming the file collision occurs at 'foo') was: * add/add does a two-way merge of the two files and records it as 'foo'. * rename/rename(2to1) records the two different files into two new uniquely named files (foo~HEAD and foo~$MERGE), while removing 'foo' from the working tree. * rename/add records the two different files into two different locations, recording the add at foo~$SIDE and, oddly, recording the rename at foo (why is the rename more important than the add?) So, the question for what to write to the working tree boils down to whether the two colliding files should be two-way merged and recorded in place, or recorded into separate files. As per discussion on the git mailing lit, two-way merging was deemed to always be preferred, as that makes these cases all more like content conflicts that users can handle from within their favorite editor, IDE, or merge tool. Note that since renames already involve a content merge, rename/add and rename/rename(2to1) conflicts could result in nested conflict markers. === Handling of the index === For a typical rename, unpack_trees() would set up the index in the following fashion: old_path new_path stage1: 5ca1ab1e 00000000 stage2: f005ba11 00000000 stage3: 00000000 b0a710ad And merge-recursive would rewrite this to new_path stage1: 5ca1ab1e stage2: f005ba11 stage3: b0a710ad Removing old_path from the index means the user won't have to `git rm old_path` manually every time a renamed path has a content conflict. It also means they can use `git checkout [--ours|--theirs|--conflict|-m] new_path`, `git diff [--ours|--theirs]` and various other commands that would be difficult otherwise. This strategy becomes a problem when we have a rename/add or rename/rename(2to1) conflict, however, because then we have only three slots to store blob sha1s and we need either four or six. Previously, this was handled by continuing to delete old_path from the index, and just outright ignoring any blob shas from old_path. That had the downside of deleting any trace of changes made to old_path on the other side of history. This function instead does a three-way content merge of the renamed file, and stores the blob sha1 for that at either stage2 or stage3 for new_path (depending on which side the rename came from). That has the advantage of bringing information about changes on both sides and still allows for easy resolution (no need to git rm old_path, etc.), but does have the downside that if the content merge had conflict markers, then what we store in the index is the sha1 of a blob with conflict markers. While that is a downside, it seems less problematic than the downsides of any obvious alternatives, and certainly makes more sense than the previous handling. Further, it has a precedent in that when we do recursive merges, we may accept a file with conflict markers as the resolution for the merge of the merge-bases, which will then show up in the index of the outer merge at stage 1 if a conflict exists at the outer level. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-recursive.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 121 insertions(+) diff --git a/merge-recursive.c b/merge-recursive.c index f795c92a69..24d979022e 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -1559,6 +1559,127 @@ static struct diff_filespec *filespec_from_entry(struct diff_filespec *target, return target; } +#if 0 // #if-0-ing avoids unused function warning; will make live in next commit +static int handle_file_collision(struct merge_options *o, + const char *collide_path, + const char *prev_path1, + const char *prev_path2, + const char *branch1, const char *branch2, + const struct object_id *a_oid, + unsigned int a_mode, + const struct object_id *b_oid, + unsigned int b_mode) +{ + struct merge_file_info mfi; + struct diff_filespec null, a, b; + char *alt_path = NULL; + const char *update_path = collide_path; + + /* + * In the recursive case, we just opt to undo renames + */ + if (o->call_depth && (prev_path1 || prev_path2)) { + /* Put first file (a_oid, a_mode) in its original spot */ + if (prev_path1) { + if (update_file(o, 1, a_oid, a_mode, prev_path1)) + return -1; + } else { + if (update_file(o, 1, a_oid, a_mode, collide_path)) + return -1; + } + + /* Put second file (b_oid, b_mode) in its original spot */ + if (prev_path2) { + if (update_file(o, 1, b_oid, b_mode, prev_path2)) + return -1; + } else { + if (update_file(o, 1, b_oid, b_mode, collide_path)) + return -1; + } + + /* Don't leave something at collision path if unrenaming both */ + if (prev_path1 && prev_path2) + remove_file(o, 1, collide_path, 0); + + return 0; + } + + /* Remove rename sources if rename/add or rename/rename(2to1) */ + if (prev_path1) + remove_file(o, 1, prev_path1, + o->call_depth || would_lose_untracked(prev_path1)); + if (prev_path2) + remove_file(o, 1, prev_path2, + o->call_depth || would_lose_untracked(prev_path2)); + + /* + * Remove the collision path, if it wouldn't cause dirty contents + * or an untracked file to get lost. We'll either overwrite with + * merged contents, or just write out to differently named files. + */ + if (was_dirty(o, collide_path)) { + output(o, 1, _("Refusing to lose dirty file at %s"), + collide_path); + update_path = alt_path = unique_path(o, collide_path, "merged"); + } else if (would_lose_untracked(collide_path)) { + /* + * Only way we get here is if both renames were from + * a directory rename AND user had an untracked file + * at the location where both files end up after the + * two directory renames. See testcase 10d of t6043. + */ + output(o, 1, _("Refusing to lose untracked file at " + "%s, even though it's in the way."), + collide_path); + update_path = alt_path = unique_path(o, collide_path, "merged"); + } else { + /* + * FIXME: It's possible that the two files are identical + * and that the current working copy happens to match, in + * which case we are unnecessarily touching the working + * tree file. It's not a likely enough scenario that I + * want to code up the checks for it and a better fix is + * available if we restructure how unpack_trees() and + * merge-recursive interoperate anyway, so punting for + * now... + */ + remove_file(o, 0, collide_path, 0); + } + + /* Store things in diff_filespecs for functions that need it */ + memset(&a, 0, sizeof(struct diff_filespec)); + memset(&b, 0, sizeof(struct diff_filespec)); + null.path = a.path = b.path = (char *)collide_path; + oidcpy(&null.oid, &null_oid); + null.mode = 0; + oidcpy(&a.oid, a_oid); + a.mode = a_mode; + a.oid_valid = 1; + oidcpy(&b.oid, b_oid); + b.mode = b_mode; + b.oid_valid = 1; + + if (merge_mode_and_contents(o, &null, &a, &b, collide_path, + branch1, branch2, o->call_depth * 2, &mfi)) + return -1; + mfi.clean &= !alt_path; + if (update_file(o, mfi.clean, &mfi.oid, mfi.mode, update_path)) + return -1; + if (!mfi.clean && !o->call_depth && + update_stages(o, collide_path, NULL, &a, &b)) + return -1; + free(alt_path); + /* + * FIXME: If both a & b both started with conflicts (only possible + * if they came from a rename/rename(2to1)), but had IDENTICAL + * contents including those conflicts, then in the next line we claim + * it was clean. If someone cares about this case, we should have the + * caller notify us if we started with conflicts. + */ + return mfi.clean; +} +#endif + static int handle_file(struct merge_options *o, struct diff_filespec *rename, int stage,

[v4,04/10] merge-recursive: new function for better colliding conflict resolutions

Commit Message

Patch