diff mbox

[v1,0/2] Be nicer to the user on tracked/untracked merge conflicts

Message ID 20220425202721.20066-1-git.jonathan.bressat@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jonathan Bressat April 25, 2022, 8:27 p.m. UTC
Sorry for being a bit slow to answer.

Junio C Hamano <gitster@pobox.com> wrote:
> 	if (result) {
> 		...
> -	}
> +	} else if (!ie_modified(...)) {
> +		return 0;
> +	}
> 	return add_rejected_path(...);
> 
> is what you want, perhaps?

Yes, but that made us ask if it can be a good idea to extend our patch
to overwrite all unstaged file. However if we want to overwrite all
unstaged file even tracked one just this may not be enough:

if (result) {
	...
}

+if (!ie_modified(...)) {
+	return 0;
+}

Because with this merge still fail for unstaged file that has the same
content, because unstaged file are not exactly treated the same way.

Our patch broke some test in t6436-merge-overwrite.sh so we think that
we need to modify those tests to make them follow the patch.

Thanks for your reviews

Jonathan (2):
  t7615: test how merge behave when there is untracked file
  merge with untracked file that are the same without failure

 t/t7615-merge-untracked.sh | 63 ++++++++++++++++++++++++++++++++++++++
 unpack-trees.c             |  5 ++-
 2 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100755 t/t7615-merge-untracked.sh

Interdiff vs v0 :

Comments

Junio C Hamano April 25, 2022, 9:16 p.m. UTC | #1
Jonathan <git.jonathan.bressat@gmail.com> writes:

> Because with this merge still fail for unstaged file that has the same
> content, because unstaged file are not exactly treated the same way.

Correct.  If you want to do this correctly, you'd need to make sure
that you'd clobber untracked files ONLY when you are not losing any
information.

And even with that, I think some existing users will be hurt with
this change in a huge way.  They may have untracked change locally
because they are not quite done with it yet, and somebody else
throws a pull request at them that has the same change as the local
modification.

They make a trial merge, look at the result, and discard it because
there are also unwanted changes in the branch they pulled into.

    $ git pull $URL $branch ;# responding to the pull request
    ... examine the result, finding it unsatisfactory ...
    $ git reset --hard ORIG_HEAD
    ... now we are back to where we started; well not really ...

Now, without this change, "git pull" used to stop until they stashed
away the untracked change safely.  But with this change, "git pull"
will succeed, and then "reset --hard" will discard it together with
other changes that came to us from $URL/$branch.  They lost their
local, uncommitted change.

And "but you can pull the equivalent out of $URL/$branch" is not a
good excuse.  They may not notice the lossage long after having
dealt with this pull request (there are busy people who are handling
many pull requests from many people) and they have been relying on
"git pull" that never clobbers their local uncommitted changes.  And
when they noticed the lossage, they may not even remember which one
of pull requests happened to have an identical change as their local
change to cause this lossage, simply because "git pull" that used to
stop just continued without a noise.

So, I am not sure if this is really a good idea to begin with.  It
certainly would make it slightly simpler in a trivial case, but it
surely looks like a dangerous behaviour change, especially if it is
done unconditionally.

> Our patch broke some test in t6436-merge-overwrite.sh so we think that
> we need to modify those tests to make them follow the patch.

Wait.  Isn't it backwards?  The existing tests _may_ be casting an
undesirable current behaviour in stone, but most of the time it is
protecting existing user's expectations.  If you have an untracked
file, you can rest assured that they won't be clobbered by a merge.

So we'd need to think twice and carefully examine if it makes sense
to update the expectations.  I haven't read the change to the tests,
so I cannot tell which case it is.

Thanks.
Guillaume Cogoni April 25, 2022, 10:28 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> So, I am not sure if this is really a good idea to begin with.  It
> certainly would make it slightly simpler in a trivial case, but it
> surely looks like a dangerous behaviour change, especially if it is
> done unconditionally.

Can we create a configuration variable to avoid this problem?
We keep the old behavior by default, and make a configuration variable
for people who wants to have this new behavior, but if the user set the variable
a message informs it about the problem that you mention.

Or, we add an option like git pull --doSomething.

Maybe, we can think about another behaviour.
When the user git pull and this error occurs:
error: The following untracked working tree files would be overwritten by merge:
file1.txt
file2.txt
Please move or remove them before you merge.
Aborting
We don't abort, but we prompt a yes/no for each file, if the user
wants to remove it.

We just make suggestions, we will think more about it.

> Wait.  Isn't it backwards?  The existing tests _may_ be casting an
> undesirable current behaviour in stone, but most of the time it is
> protecting existing user's expectations.  If you have an untracked
> file, you can rest assured that they won't be clobbered by a merge.

> So we'd need to think twice and carefully examine if it makes sense
> to update the expectations.  I haven't read the change to the tests,
> so I cannot tell which case it is.

Yes, we'll figure out a solution, if there is one.

Thanks for your review and your quick response, it give us a lot of information,

COGONI Guillaume and BRESSAT Jonathan
Junio C Hamano April 25, 2022, 11:10 p.m. UTC | #3
Guillaume Cogoni <cogoni.guillaume@gmail.com> writes:

>> So, I am not sure if this is really a good idea to begin with.  It
>> certainly would make it slightly simpler in a trivial case, but it
>> surely looks like a dangerous behaviour change, especially if it is
>> done unconditionally.
>
> Can we create a configuration variable to avoid this problem?
> We keep the old behavior by default, and make a configuration variable
> for people who wants to have this new behavior, but if the user set the variable
> a message informs it about the problem that you mention.
>
> Or, we add an option like git pull --doSomething.

Probably a command line option ("git merge" would probably want the
same one) plus a configuration varaible to give it the default (the
latter is optional).

> Maybe, we can think about another behaviour.
> When the user git pull and this error occurs:
> error: The following untracked working tree files would be overwritten by merge:
> file1.txt
> file2.txt
> Please move or remove them before you merge.
> Aborting
> We don't abort, but we prompt a yes/no for each file, if the user
> wants to remove it.

I doubt this would fly as-is.  Especially if the action that is
offered by the prompt is "remove", not "move", as that implies we
are not prepared against loss of information.

There is no indication whether the untracked file1.txt matches the
contents we are pulling in.  Most of the time, it is very unlikely
that the contents being lost is identical to what the other side
has, so answering "yes" to the prompt means "No, I do not care about
my garbage, and it is OK that it will forever be lost."  I do not
think we want to be encouraging people to habitually make such a
statement.  If we move (instead of removing) them away to somewhere,
and give users to easily recover them after running "git pull", it
might become more palatable.

I wonder if this whole thing is an attempt to work around whatever
"stash --untracked" fails to do well (or perhaps there are no such
shortcomings, but just the users are not made aware of the command
enough).  If you have these two untracked files (file1.txt and
file2.txt) are "in the way" for a merge to succeed, I have to wonder
if "Please move or remove" message that was introduced by 23cbf11b
(merge-recursive: porcelain messages for checkout, 2010-08-11) is
still giving a good piece of advice to users today.

Would "git stash push -u file1.txt file2.txt" be an easier and safer
alternative that lets you take these files back later?  Back in
2010, when 23cbf11b was current, "git stash" was a shell script and
it seems there was no "untracked" option, so from that point of
view, "move or remove" may have been the best they could do.

Note that I never use "git stash" with "untracked" option, so I do
not know if it works well in this context already, or we need more
work before it becomes usable in this scenario.  But it smells like
it is exactly what we might want to use in such a situation to stash
away these untracked file1.txt and file2.txt while running the
merge, while allowing us to recover them after running the merge or
discarding it.  I dunno.
Matthieu Moy April 26, 2022, 6:38 a.m. UTC | #4
On 4/26/22 00:28, Guillaume Cogoni wrote:
>   Junio C Hamano <gitster@pobox.com> writes:
> 
>> So, I am not sure if this is really a good idea to begin with.  It
>> certainly would make it slightly simpler in a trivial case, but it
>> surely looks like a dangerous behaviour change, especially if it is
>> done unconditionally.
> 
> Can we create a configuration variable to avoid this problem?
> We keep the old behavior by default, and make a configuration variable
> for people who wants to have this new behavior, but if the user set the variable
> a message informs it about the problem that you mention.
> 
> Or, we add an option like git pull --doSomething.
> 
> Maybe, we can think about another behaviour.
> When the user git pull and this error occurs:
> error: The following untracked working tree files would be overwritten by merge:
> file1.txt
> file2.txt
> Please move or remove them before you merge.
> Aborting
> We don't abort, but we prompt a yes/no for each file, if the user
> wants to remove it.

Git very rarely goes interactive like this (only a few special command 
like git send-email, git clean -i, git add -i/-p prompt the user).

Prompting the user in the middle of an operation has several drawbacks:

- When the command is launched from a script, the script may work most 
of the time, and sometimes pause on an interactive prompt which wasn't 
expected from the author of the script. This can be a bit nasty when the 
script isn't ran from a place where you can type to the standard input 
of the command or when its output is redirected.

- Asking for each individual file can be tedious when there are many 
files. Similarly, "rm -i" (plain rm, not "git rm") is a nice safety 
measure, but not really convenient to me at least.

In this particular case, actually, I can't imagine a sane behavior when 
the user wants a mix of "yes" / "no". If a single untracked file 
conflicts with what's being merged, the merge aborts, even if you're OK 
to replace other files. So I can only imagine a single yes/no answer. 
And then the question can be replaced with a suggestion to re-run with a 
command-line flag when all the conflicting files are unmodified.
Matthieu Moy April 26, 2022, 6:48 a.m. UTC | #5
On 4/25/22 22:27, Jonathan wrote:
> when there is untracked file that has the same name than file in the
> merged branch git refuse to proceed, even when the file has the same
> content
> 
> t6436 test a similar thing but not especially with same content file

Write your commit message like normal english: capitalize start of 
sentence, and period at the end (we omit the period in the subject line, 
though).

> +test_expect_success 'fastforward fail when untracked file has the same content' '

Here and other test names: third person => s (fail_s_, and overwrite_s_ 
in the next patch).

> +	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
> +	git checkout -b B &&
> +	test_commit --no-tag "tracked" file "content" &&
> +	git checkout A &&
> +	echo content >file &&
> +	test_must_fail git merge B

It would make sense to grep for the correct error message in the output, 
but maybe that's overkill.
Junio C Hamano April 26, 2022, 4:13 p.m. UTC | #6
Matthieu Moy <Matthieu.Moy@univ-lyon1.fr> writes:

> Git very rarely goes interactive like this (only a few special command
> like git send-email, git clean -i, git add -i/-p prompt the user).
>
> Prompting the user in the middle of an operation has several drawbacks:
> ...
> In this particular case, actually, I can't imagine a sane behavior
> when the user wants a mix of "yes" / "no". If a single untracked file 
> conflicts with what's being merged, the merge aborts, even if you're
> OK to replace other files. So I can only imagine a single yes/no
> answer. And then the question can be replaced with a suggestion to
> re-run with a command-line flag when all the conflicting files are
> unmodified.

Nicely explained.  Thanks.
Jonathan Bressat April 28, 2022, 10:33 a.m. UTC | #7
Junio C Hamano <gitster@pobox.com> writes:

> Probably a command line option ("git merge" would probably want the
> same one) plus a configuration varaible to give it the default (the
> latter is optional).

First, we think that add an option to pull and merge is more suited to
our situation, and next, it could be good to add the configuration
variable

In unpack-trees.c there is a list of files that cause problem with merge.
We want to split this list to list files that have the same content, then if
all the files have the same content, we can suggest to use the option
to overwrite those files.
Then we can modify the error message to show the files that have the
same content apart.

> I wonder if this whole thing is an attempt to work around whatever
> "stash --untracked" fails to do well (or perhaps there are no such
> shortcomings, but just the users are not made aware of the command
> enough).  If you have these two untracked files (file1.txt and
> file2.txt) are "in the way" for a merge to succeed, I have to wonder
> if "Please move or remove" message that was introduced by 23cbf11b
> (merge-recursive: porcelain messages for checkout, 2010-08-11) is
> still giving a good piece of advice to users today.

We got a similar idea, but we finally decide that it was not a very good
approach because it's not efficient if we have a lot of files or some big files.
And because if there are files that doesn't block the merge, we treat them
anyway and they will move from the work tree, it's a bit overkill.

> Note that I never use "git stash" with "untracked" option, so I do
>  not know if it works well in this context already, or we need more
> work before it becomes usable in this scenario.  But it smells like
> it is exactly what we might want to use in such a situation to stash
> away these untracked file1.txt and file2.txt while running the
> merge, while allowing us to recover them after running the merge or
> discarding it.  I dunno.

Indeed, git stash works well with this kind of problem, however an option
would be easier in that specific case.

Thanks for you're helpfull review, you always give us a lot of good
information and ideas.

Cogoni Guillaume and
Bressat Jonathan
diff mbox

Patch

diff --git a/t/t7615-merge-untracked.sh b/t/t7615-merge-untracked.sh
index 71a34041d2..99f8bae4c0 100755
--- a/t/t7615-merge-untracked.sh
+++ b/t/t7615-merge-untracked.sh
@@ -2,78 +2,62 @@ 
 
 test_description='test when merge with untracked file'
 
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
 . ./test-lib.sh
 
+test_expect_success 'setup' '
+	test_commit "init" README.md "content" &&
+	git checkout -b A
+'
+
+test_expect_success 'fastforward overwrite untracked file that has the same content' '
+	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
+	git checkout -b B &&
+	test_commit --no-tag "tracked" file "content" &&
+	git checkout A &&
+	echo content >file &&
+	git merge B
+'
 
-test_expect_success 'overwrite the file when fastforward and the same content' '
-    echo content >README.md &&
-    test_commit "init" README.md &&
-    git branch A &&
-    git checkout -b B &&
-    echo content >file &&
-    git add file &&
-    git commit -m "tracked" &&
-    git switch A &&
-    echo content >file &&
-    git merge B
+test_expect_success 'fastforward fail when untracked file has different content' '
+	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
+	git checkout -b B &&
+	test_commit --no-tag "tracked" file "content" &&
+	git switch A &&
+	echo other >file &&
+	test_must_fail git merge B
 '
 
-test_expect_success 'merge fail with fastforward and different content' '
-    rm * &&
-    rm -r .git &&
-    git init &&
-    echo content >README.md &&
-    test_commit "init" README.md &&
-    git branch A &&
-    git checkout -b B &&
-    echo content >file &&
-    git add file &&
-    git commit -m "tracked" &&
-    git switch A &&
-    echo dif >file &&
-    test_must_fail git merge B
+test_expect_success 'normal merge overwrite untracked file that has the same content' '
+	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
+	git checkout -b B &&
+	test_commit --no-tag "tracked" file "content" fileB "content" &&
+	git switch A &&
+	test_commit --no-tag "exA" fileA "content" &&
+	echo content >file &&
+	git merge B
 '
 
-test_expect_success 'normal merge with untracked with the same content' '
-    rm * &&
-    rm -r .git &&
-    git init &&
-    echo content >README.md &&
-    test_commit "init" README.md &&
-    git branch A &&
-    git checkout -b B &&
-    echo content >fileB &&
-    echo content >file &&
-    git add fileB &&
-    git add file &&
-    git commit -m "tracked" &&
-    git switch A &&
-    echo content >fileA &&
-    git add fileA &&
-    git commit -m "exA" &&
-    echo content >file &&
-    git merge B -m "merge"
+test_expect_success 'normal merge fail when untracked file has different content' '
+	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
+	git checkout -b B &&
+	test_commit --no-tag "tracked" file "content" fileB "content" &&
+	git switch A &&
+	test_commit --no-tag "exA" fileA "content" &&
+	echo dif >file &&
+	test_must_fail git merge B
 '
 
-test_expect_success 'normal merge fail when untracked with different content' '
-    rm * &&
-    rm -r .git &&
-    git init &&
-    echo content >README.md &&
-    test_commit "init" README.md &&
-    git branch A &&
-    git checkout -b B &&
-    echo content >fileB &&
-    echo content >file &&
-    git add fileB &&
-    git add file &&
-    git commit -m "tracked" &&
-    git switch A &&
-    echo content >fileA &&
-    git add fileA &&
-    git commit -m "exA" &&
-    echo dif >file &&
-    test_must_fail git merge B -m "merge"
+test_expect_success 'merge fail when tracked file modification is unstaged' '
+	test_when_finished "git branch -D B && git reset --hard init && git clean --force" &&
+	test_commit --no-tag "unstaged" file "other" &&
+	git checkout -b B &&
+	test_commit --no-tag "staged" file "content" &&
+	git switch A &&
+	echo content >file &&
+	test_must_fail git merge B
 '
 
-test_done
\ No newline at end of file
+test_done
diff --git a/unpack-trees.c b/unpack-trees.c
index 834aca0da9..61e06c04be 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2257,18 +2257,17 @@  static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (result) {
 		if (result->ce_flags & CE_REMOVE)
 			return 0;
-	}
-
-	if (!ie_modified(&o->result, ce, st, 0))
+	} else if (!ie_modified(&o->result, ce, st, 0)) {
 		return 0;
-
+	}
 
 	return add_rejected_path(o, error_type, name);
 }
 
 /*
  * We do not want to remove or overwrite a working tree file that
- * is not tracked, unless it is ignored.
+ * is not tracked, unless it is ignored and unless it has the same
+ * content than the merged file.
  */
 static int verify_absent_1(const struct cache_entry *ce,
 			   enum unpack_trees_error_types error_type,