diff mbox series

checkout.txt: note about losing staged changes with --merge

Message ID 20190319093910.20229-1-pclouds@gmail.com (mailing list archive)
State New, archived
Headers show
Series checkout.txt: note about losing staged changes with --merge | expand

Commit Message

Duy Nguyen March 19, 2019, 9:39 a.m. UTC
If you have staged changes in path A and perform 'checkout
--merge' (which could result in conflicts in a totally unrelated path
B), changes in A will be gone. Which is unexpected. We are supposed
to keep all changes, or kick and scream otherwise.

This is the result of how --merge is implemented, from the very first
day in 1be0659efc (checkout: merge local modifications while switching
branches., 2006-01-12):

1. a merge is done, unmerged entries are collected
2. a hard switch to a new branch is done, then unmerged entries added
   back

There is no trivial fix for this. Going with 3-way merge one file at a
time loses rename detection. Going with 3-way merge by trees requires
teaching the algorithm to pick up staged changes. And even if we detect
staged changes with --merge and abort for safety, an option to continue
--merge is very weird. Such an option would keep worktree changes, but
drop staged changes.

Because the problem has been with us since the introduction of --merge
and everybody has been pretty happy (except Phillip, who found this
problem), I'll just take a note here to acknowledge it and wait for
merge wizards to come in and work their magic. There may be a way
forward [1].

[1] CABPp-BFoL_U=bzON4SEMaQSKU2TKwnOgNqjt5MUaOejTKGUJxw@mail.gmail.com

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 This is my "fix" for Phillip's second problem. I chose to reply here
 because this is where an actual fix was discussed. The test script to
 demonstate it is here

 https://public-inbox.org/git/7d3742d6-73e4-2750-6ecb-9edf761d96dd@gmail.com/

 Documentation/git-checkout.txt | 2 ++
 builtin/checkout.c             | 9 +++++++++
 2 files changed, 11 insertions(+)

Comments

Phillip Wood March 19, 2019, 11:24 a.m. UTC | #1
Hi Duy

On 19/03/2019 09:39, Nguyễn Thái Ngọc Duy wrote:
> If you have staged changes in path A and perform 'checkout
> --merge' (which could result in conflicts in a totally unrelated path
> B), changes in A will be gone. Which is unexpected. We are supposed
> to keep all changes, or kick and scream otherwise.
> 
> This is the result of how --merge is implemented, from the very first
> day in 1be0659efc (checkout: merge local modifications while switching
> branches., 2006-01-12):
> 
> 1. a merge is done, unmerged entries are collected
> 2. a hard switch to a new branch is done, then unmerged entries added
>     back
> 
> There is no trivial fix for this. Going with 3-way merge one file at a
> time loses rename detection. Going with 3-way merge by trees requires
> teaching the algorithm to pick up staged changes. And even if we detect
> staged changes with --merge and abort for safety, an option to continue
> --merge is very weird. Such an option would keep worktree changes, but
> drop staged changes.
> 
> Because the problem has been with us since the introduction of --merge
> and everybody has been pretty happy (except Phillip, who found this
> problem), I'll just take a note here to acknowledge it and wait for
> merge wizards to come in and work their magic. There may be a way
> forward [1].
> 
> [1] CABPp-BFoL_U=bzON4SEMaQSKU2TKwnOgNqjt5MUaOejTKGUJxw@mail.gmail.com
> 
> Reported-by: Phillip Wood <phillip.wood123@gmail.com>

I try to use phillip.wood@dunelm.org.uk for git stuff as it shouldn't 
change in the future.

> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>   This is my "fix" for Phillip's second problem. I chose to reply here
>   because this is where an actual fix was discussed. The test script to
>   demonstate it is here
> 
>   https://public-inbox.org/git/7d3742d6-73e4-2750-6ecb-9edf761d96dd@gmail.com/
> 
>   Documentation/git-checkout.txt | 2 ++
>   builtin/checkout.c             | 9 +++++++++
>   2 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
> index f179b43732..877e5f503a 100644
> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -242,6 +242,8 @@ should result in deletion of the path).
>   +
>   When checking out paths from the index, this option lets you recreate
>   the conflicted merge in the specified paths.
> ++
> +When switching branches with `--merge`, staged changes may be lost.
>   
>   --conflict=<style>::
>   	The same as --merge option above, but changes the way the
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 0e6037b296..f95e7975f7 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -726,6 +726,8 @@ static int merge_working_tree(const struct checkout_opts *opts,
>   			struct tree *result;
>   			struct tree *work;
>   			struct merge_options o;
> +			struct strbuf sb = STRBUF_INIT;
> +
>   			if (!opts->merge)
>   				return 1;
>   
> @@ -736,6 +738,13 @@ static int merge_working_tree(const struct checkout_opts *opts,
>   			if (!old_branch_info->commit)
>   				return 1;
>   
> +			if (repo_index_has_changes(the_repository,
> +						   get_commit_tree(old_branch_info->commit),
> +						   &sb))
> +				warning(_("staged changes in the following files may be lost: %s"),
> +					sb.buf);
> +			strbuf_release(&sb);

Thanks for doing this, I think having some sort of warning is a good 
idea, I wonder if this could be quite noisy though. I guess it depends 
on how many staged changes people have that don't match the new index. 
If we diff against the new tree and only print names that are in both 
lists does that give a definitive list of what will be lost? If it does 
then if there are a lot of files affected then it will still be noisy 
(using columns may help) but at least it will not contain false 
positives. It is more work though, maybe we should just say "staged 
changes may be lost" and leave it at that.

Best Wishes

Phillip

> +
>   			/* Do more real merge */
>   
>   			/*
>
Junio C Hamano March 20, 2019, 12:23 a.m. UTC | #2
Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> If you have staged changes in path A and perform 'checkout
> --merge' (which could result in conflicts in a totally unrelated path
> B), changes in A will be gone. Which is unexpected. We are supposed
> to keep all changes, or kick and scream otherwise.
>
> This is the result of how --merge is implemented, from the very first
> day in 1be0659efc (checkout: merge local modifications while switching
> branches., 2006-01-12):
>
> 1. a merge is done, unmerged entries are collected
> 2. a hard switch to a new branch is done, then unmerged entries added
>    back
>
> There is no trivial fix for this. Going with 3-way merge one file at a
> time loses rename detection. Going with 3-way merge by trees requires
> teaching the algorithm to pick up staged changes. And even if we detect
> staged changes with --merge and abort for safety, an option to continue
> --merge is very weird. Such an option would keep worktree changes, but
> drop staged changes.

I think "checkout -m <otherbranch>" with a dirty index should refuse
to run; there is nothing to "continue" after such a failure, so I am
not sure what you mean by "an option to continue" (iow, I do not see
a need for such an option, and if that option makes the whole notion
strange, we can just decide not to have it, can't we?).
Duy Nguyen March 20, 2019, 12:40 a.m. UTC | #3
On Wed, Mar 20, 2019 at 7:24 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
> > If you have staged changes in path A and perform 'checkout
> > --merge' (which could result in conflicts in a totally unrelated path
> > B), changes in A will be gone. Which is unexpected. We are supposed
> > to keep all changes, or kick and scream otherwise.
> >
> > This is the result of how --merge is implemented, from the very first
> > day in 1be0659efc (checkout: merge local modifications while switching
> > branches., 2006-01-12):
> >
> > 1. a merge is done, unmerged entries are collected
> > 2. a hard switch to a new branch is done, then unmerged entries added
> >    back
> >
> > There is no trivial fix for this. Going with 3-way merge one file at a
> > time loses rename detection. Going with 3-way merge by trees requires
> > teaching the algorithm to pick up staged changes. And even if we detect
> > staged changes with --merge and abort for safety, an option to continue
> > --merge is very weird. Such an option would keep worktree changes, but
> > drop staged changes.
>
> I think "checkout -m <otherbranch>" with a dirty index should refuse
> to run; there is nothing to "continue" after such a failure, so I am
> not sure what you mean by "an option to continue" (iow, I do not see
> a need for such an option, and if that option makes the whole notion
> strange, we can just decide not to have it, can't we?).

We have --force to continue even when we have local changes, which
will be overwritten. I was thinking a similar option which gives us
permission to destroy staged changes.

Refusing to run fails the test suite though (I tried that even before
this patch), in t7201.10, "switch to another branch while carrying a
deletion", because of this line

    git rm two
Junio C Hamano March 20, 2019, 1:19 a.m. UTC | #4
Duy Nguyen <pclouds@gmail.com> writes:

>> I think "checkout -m <otherbranch>" with a dirty index should refuse
>> to run; there is nothing to "continue" after such a failure, so I am
>> not sure what you mean by "an option to continue" (iow, I do not see
>> a need for such an option, and if that option makes the whole notion
>> strange, we can just decide not to have it, can't we?).
>
> We have --force to continue even when we have local changes, which
> will be overwritten. I was thinking a similar option which gives us
> permission to destroy staged changes.

Ah, then that is not "checkout --continue", but "checkout --force
-m"?  That sounds sensible, and should behave as if "checkout -f
HEAD && checkout -m <otherbranch>" was done, with respect to local
changes, I would think.
Duy Nguyen March 20, 2019, 1:22 a.m. UTC | #5
On Wed, Mar 20, 2019 at 8:19 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Duy Nguyen <pclouds@gmail.com> writes:
>
> >> I think "checkout -m <otherbranch>" with a dirty index should refuse
> >> to run; there is nothing to "continue" after such a failure, so I am
> >> not sure what you mean by "an option to continue" (iow, I do not see
> >> a need for such an option, and if that option makes the whole notion
> >> strange, we can just decide not to have it, can't we?).
> >
> > We have --force to continue even when we have local changes, which
> > will be overwritten. I was thinking a similar option which gives us
> > permission to destroy staged changes.
>
> Ah, then that is not "checkout --continue", but "checkout --force
> -m"?  That sounds sensible, and should behave as if "checkout -f
> HEAD && checkout -m <otherbranch>" was done, with respect to local
> changes, I would think.

Kinda. But "--force --merge" makes no sense. --force discards all
local changes by definition, which means you can't have conflicts and
will not need --merge. I think this is the reason why we die() out
when both are specified. So we need something like
--discard-staged-changes-only...
Junio C Hamano March 20, 2019, 1:50 a.m. UTC | #6
Duy Nguyen <pclouds@gmail.com> writes:

> Kinda. But "--force --merge" makes no sense. --force discards all
> local changes by definition, which means you can't have conflicts and
> will not need --merge. I think this is the reason why we die() out
> when both are specified. So we need something like
> --discard-staged-changes-only...

At that point, I would have to say that we do not need anything.
The use case is already covered with "git reset && git checkout -m",
isn't it?

Thanks.
Elijah Newren March 20, 2019, 1:53 p.m. UTC | #7
On Tue, Mar 19, 2019 at 7:50 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Duy Nguyen <pclouds@gmail.com> writes:
>
> > Kinda. But "--force --merge" makes no sense. --force discards all
> > local changes by definition, which means you can't have conflicts and
> > will not need --merge. I think this is the reason why we die() out
> > when both are specified. So we need something like
> > --discard-staged-changes-only...
>
> At that point, I would have to say that we do not need anything.
> The use case is already covered with "git reset && git checkout -m",
> isn't it?

I guess the problem is just that 'git checkout -m' has not refused to
run with either a dirty index or a dirty working tree, and if both are
dirty (making us require more of a four-way merge), then our three-way
merge has to have some kind of casualty in the implementation for at
least some case.  The current casualty as highlighted by Philip is
that newly staged files before the 'checkout -m' become untracked and
any carefully staged pieces before that command are lost amongst the
unstaged changes again even if there weren't any conflicts.

One solution is to just accept and document or warn about this
shortcoming for now as Duy did in his patch.  Another is to do as you
mentioned earlier in this thread when you stated 'I think "checkout -m
<otherbranch>" with a dirty index should refuse to run'.  Duy linked
to a third option that I outlined in his commit message, though it'd
require a bit more capability from merge-recursive than we have today.

So, I think we do need something (eventually at least).  Would you
prefer we dropped this patch from Duy and instead made 'checkout -m'
abort when the index is dirty?
Duy Nguyen March 20, 2019, 1:57 p.m. UTC | #8
On Wed, Mar 20, 2019 at 8:53 PM Elijah Newren <newren@gmail.com> wrote:
> So, I think we do need something (eventually at least).  Would you
> prefer we dropped this patch from Duy and instead made 'checkout -m'
> abort when the index is dirty?

I have no problem with this. Still scratching my head wondering if
t7201-co.sh has a slightly incorrect setup, or aborting is actually
wrong. You're probably a better person to understand that test case
;-)
Junio C Hamano March 21, 2019, 12:38 a.m. UTC | #9
Elijah Newren <newren@gmail.com> writes:

> On Tue, Mar 19, 2019 at 7:50 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Duy Nguyen <pclouds@gmail.com> writes:
>>
>> > Kinda. But "--force --merge" makes no sense. --force discards all
>> > local changes by definition, which means you can't have conflicts and
>> > will not need --merge. I think this is the reason why we die() out
>> > when both are specified. So we need something like
>> > --discard-staged-changes-only...
>>
>> At that point, I would have to say that we do not need anything.
>> The use case is already covered with "git reset && git checkout -m",
>> isn't it?
>
> I guess the problem is just that 'git checkout -m' has not refused to
> run with either a dirty index or a dirty working tree, and if both are
> dirty (making us require more of a four-way merge), then our three-way
> merge has to ...

I didn't actually mean "nothing to do here" relative to the current
code; instead, I meant "nothing more than just stop when the index
has updates" (which is hard to read from the above quoted part, as
"Kinda." is a response in a discussion started with my "checkout -m
should probably refuse to do anything when the index is dirty").

> So, I think we do need something (eventually at least).  Would you
> prefer we dropped this patch from Duy and instead made 'checkout -m'
> abort when the index is dirty?

Let's go with the doc update first, as the patch has already
written.  I think in the longer term, just aborting when the index
is dirty would be a vast improvement over the status quo + a doc
update and is a good place to stop.

Thanks.
Elijah Newren March 21, 2019, 1:46 p.m. UTC | #10
On Wed, Mar 20, 2019 at 7:58 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Wed, Mar 20, 2019 at 8:53 PM Elijah Newren <newren@gmail.com> wrote:
> > So, I think we do need something (eventually at least).  Would you
> > prefer we dropped this patch from Duy and instead made 'checkout -m'
> > abort when the index is dirty?
>
> I have no problem with this. Still scratching my head wondering if
> t7201-co.sh has a slightly incorrect setup, or aborting is actually
> wrong. You're probably a better person to understand that test case
> ;-)

It doesn't surprise me at all that some testcases would fail with this
change; it's a change of behavior from the previous implementation.
However, taking a look at that testcase, it looks like it's not a
simple change to make it do something similar because there's at least
one other bug that we need to fix first.  I'll dig in...though I
really want to get my
directory-rename-detection-defaults-to-reporting-conflict series
updated and sent out first.  Since Junio seems to be okay with just
taking your doc update for now, hopefully that's not a problem.  :-)
diff mbox series

Patch

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index f179b43732..877e5f503a 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -242,6 +242,8 @@  should result in deletion of the path).
 +
 When checking out paths from the index, this option lets you recreate
 the conflicted merge in the specified paths.
++
+When switching branches with `--merge`, staged changes may be lost.
 
 --conflict=<style>::
 	The same as --merge option above, but changes the way the
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 0e6037b296..f95e7975f7 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -726,6 +726,8 @@  static int merge_working_tree(const struct checkout_opts *opts,
 			struct tree *result;
 			struct tree *work;
 			struct merge_options o;
+			struct strbuf sb = STRBUF_INIT;
+
 			if (!opts->merge)
 				return 1;
 
@@ -736,6 +738,13 @@  static int merge_working_tree(const struct checkout_opts *opts,
 			if (!old_branch_info->commit)
 				return 1;
 
+			if (repo_index_has_changes(the_repository,
+						   get_commit_tree(old_branch_info->commit),
+						   &sb))
+				warning(_("staged changes in the following files may be lost: %s"),
+					sb.buf);
+			strbuf_release(&sb);
+
 			/* Do more real merge */
 
 			/*