diff mbox series

[v2,2/3] builtin/rebase.c: Emit warning when rebasing without a forkpoint

Message ID 20230902221641.1399624-3-wesleys@opperschaap.net (mailing list archive)
State New, archived
Headers show
Series [v2,1/3] rebase.c: Make a distiction between rebase.forkpoint and --fork-point arguments | expand

Commit Message

Wesley Schwengle Sept. 2, 2023, 10:16 p.m. UTC
This patch adds a warning where it will indicate that `rebase.forkpoint'
must be set in the git configuration and/or that you can supply a
`--fork-point' or `--no-fork-point' command line option to your `git
rebase' invocation.

When commit d1e894c6d7 (Document `rebase.forkpoint` in rebase man page,
2021-09-16) was submitted there was a discussion on if the forkpoint
behaviour of `git rebase' was sane. In my experience this wasn't sane.

Git rebase doesn't work if you don't have an upstream branch configured
(or something that says `merge = refs/heads/master' in the git config).
You would than need to use `git rebase <upstream>' to rebase. If you
configure an upstream it would seem logical to be able to run `git
rebase' without arguments. However doing so would trigger a different
kind of behavior.  `git rebase <upstream>' behaves as if
`--no-fork-point' was supplied and without it behaves as if
`--fork-point' was supplied. This behavior can result in a loss of
commits and can surprise users. The following reproduction path exposes
this behavior:

    git init reproduction
    cd reproduction
    echo "commit a" > file.txt
    git add file.txt
    git commit -m "First commit" file.txt
    echo "commit b" >> file.txt
    git commit -m "Second commit" file.txt

    git switch -c foo
    echo "commit c" >> file.txt"
    git commit -m "Third commit" file.txt
    git branch --set-upstream-to=master

    git status
    On branch foo
    Your branch is ahead of 'master' by 1 commit.

    git switch master
    git merge foo
    git reset --hard HEAD^
    git switch foo
    Switched to branch 'foo'
    Your branch is ahead of 'master' by 1 commit.

    git log --oneline
    5f427e3 Third commit
    03ad791 Second commit
    411e6d4 First commit

    git rebase
    git status
    On branch foo
    Your branch is up to date with 'master'.

    git log --oneline
    03ad791 Second commit
    411e6d4 First commit

Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net>
---
 builtin/rebase.c             | 16 +++++++++-
 t/t3431-rebase-fork-point.sh | 62 ++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Sept. 2, 2023, 11:37 p.m. UTC | #1
Wesley Schwengle <wesleys@opperschaap.net> writes:

> Subject: Re: [PATCH v2 2/3] builtin/rebase.c: Emit warning when rebasing without a forkpoint

(applies to all three patches) downcase "Emit" after the <area>: prefix.

> This patch adds a warning where it will indicate that `rebase.forkpoint'
> must be set in the git configuration and/or that you can supply a
> `--fork-point' or `--no-fork-point' command line option to your `git
> rebase' invocation.
>
> When commit d1e894c6d7 (Document `rebase.forkpoint` in rebase man page,
> 2021-09-16) was submitted there was a discussion on if the forkpoint
> behaviour of `git rebase' was sane. In my experience this wasn't sane.

I already said that the above is not true, so I will not repeat myself.

> Git rebase doesn't work if you don't have an upstream branch configured

"git rebase foo" works just fine, so this statement needs a lot of
tightening.

> (or something that says `merge = refs/heads/master' in the git config).
> You would than need to use `git rebase <upstream>' to rebase. If you
> configure an upstream it would seem logical to be able to run `git
> rebase' without arguments. However doing so would trigger a different
> kind of behavior.  `git rebase <upstream>' behaves as if
> `--no-fork-point' was supplied and without it behaves as if
> `--fork-point' was supplied. This behavior can result in a loss of
> commits and can surprise users.

No, what is causing the loss in this particular case is allowing to
use the fork-point heuristics.  If you do not want it, you can
either explicitly give --no-fork-point or <upstream> (or both if you
feel that you need to absolutely be clear).  Or you can set the
configuration to "false" to disable this "auto" behaviour.

> The following reproduction path exposes
> this behavior:

I actually do not think having this example in the proposed log
message adds more value than it distracts readers from the real
point of this change.

If you rewind to lose commits from the branch you are (re)building
against, and what was rewound and discarded was part of the work you
are building, whether it is on a local branch or on a remote branch
that contains what you have already pushed, they will be discarded,
it is by design, and it is a known deficiency with the fork-point
heuristics.  How the fork-point heuristics breaks down is rather
well known and it is pretty much orthogonal to the point of this
patch, which is to make it harder to trigger by folks who are not
familiar with "git rebase" and yet try to be lazy by not specifying
the <upstream> from the command line.

By the way, while I do agree with the need to make users _aware_ of
the "auto" behaviour [*1*], I am not yet convinced that there is a
need to change the default in the future.

	Side note: It allows those who originally advocated the
	fork-point heuristics to be extra lazy and allow fork-point
	heuristics to be used when they rebuild on top of what they
	usually rebuild on (and the "usually" part is signalled by
	using "git rebase" without saying what to build on from the
	command line).  The default allows them not to worry about
	the heuristics to kick in when they explicitly say on which
	exact commit they want to rebuild on.

And when we do not know if the default will change, the new warning
message will lose value.  Many of those who see the message are
already familiar with when the forkpoint heuristics will kick in,
and those who weren't familiar with will not know what the default
change is about, without consulting the documentation.

It might be better to extend the documentation instead, which will
not distract those who are using the tool just fine already.

> diff --git a/t/t3431-rebase-fork-point.sh b/t/t3431-rebase-fork-point.sh
> index 4bfc779bb8..908867ae0f 100755
> --- a/t/t3431-rebase-fork-point.sh
> +++ b/t/t3431-rebase-fork-point.sh
> @@ -113,4 +113,66 @@ test_expect_success 'rebase.forkPoint set to true and --root given' '
>  	git rebase --root
>  '
>  
> +# The use of the diff -qw is because there is some kind of whitespace character
> +# magic going on which probably has to do with the tabs. It only occurs when we
> +# check STDERR
> +test_expect_success 'rebase without rebase.forkpoint' '
> +	git init rebase-forkpoint &&
> +	cd rebase-forkpoint &&
> +	git status >/tmp/foo &&
> +	echo "commit a" > file.txt &&

Style???

> +	git add file.txt &&
> +	git commit -m "First commit" file.txt &&
> +	echo "commit b" >> file.txt &&
> +	git commit -m "Second commit" file.txt &&
> +	git switch -c foo &&
> +	echo "commit c" >> file.txt &&
> +	git commit -m "Third commit" file.txt &&
> +	git branch --set-upstream-to=main &&
> +	git switch main &&
> +	git merge foo &&
> +	git reset --hard HEAD^ &&
> +	git switch foo &&
> +	commit=$(git log -n1 --format="%h") &&
> +	git rebase >out 2>err &&
> +	test_must_be_empty out &&
> +	cat <<-\OEF > expect &&

Why does this have to be orgiinal in such a strange way?  When
everybody else uses string "EOF" as the end-of-here-doc-marker, and
if there is no downside to use the same string here, we should just
use the same "EOF" to avoid distracting readers.

> +	warning: When "git rebase" is run without specifying <upstream> on the
> +	command line, the current default is to use the fork-point
> +	heuristics. This is expected to change in a future version
> +	of Git, and you will have to explicitly give "--fork-point" from
> +	the command line if you keep using the fork-point mode.  You can
> +	run "git config rebase.forkpoint false" to adopt the new default
> +	in advance and that will also squelch the message.
> +
> +	You can replace "git config" with "git config --global" to set a default
> +	preference for all repositories. You can also pass --no-fork-point, --fork-point
> +	on the command line to override the configured default per invocation.
> +
> +	Successfully rebased and updated refs/heads/foo.
> +	OEF
> +	diff -qw expect err &&

Why not "test_cmp expect actual" like everybody else?

> +...
> +	echo "Current branch foo is up to date." > expect &&
> +	test_cmp out expect
> +'
> +
>  test_done
Wesley Schwengle Sept. 3, 2023, 2:29 a.m. UTC | #2
On 9/2/23 19:37, Junio C Hamano wrote:
> Wesley Schwengle <wesleys@opperschaap.net> writes:

Thanks for the feedback. I won't continue the patch series because some 
of the feedback you've given below.

>> However doing so would trigger a different
>> kind of behavior.  `git rebase <upstream>' behaves as if
>> `--no-fork-point' was supplied and without it behaves as if
>> `--fork-point' was supplied. This behavior can result in a loss of
>> commits and can surprise users.
> 
> No, what is causing the loss in this particular case is allowing to
> use the fork-point heuristics.  If you do not want it, you can
> either explicitly give --no-fork-point or <upstream> (or both if you
> feel that you need to absolutely be clear).  Or you can set the
> configuration to "false" to disable this "auto" behaviour.

Isn't that what I'm saying? At least I'm trying to say what you are saying.

> By the way, while I do agree with the need to make users _aware_ of
> the "auto" behaviour [*1*], I am not yet convinced that there is a
> need to change the default in the future.

In that case, I'll abort this patch series. I don't agree with the `git 
rebase' in the lazy form and `git rebase <upstream>' acting differently, 
but I already have the rebase.forkpoint set to false to counter it.

> It might be better to extend the documentation instead, which will
> not distract those who are using the tool just fine already.

That is with the current viewpoints the best option I think.

>> +	diff -qw expect err &&
> 
> Why not "test_cmp expect actual" like everybody else?

As said in the initial patch series and the comment above the tests:

> There is one point where I'm a little confused, the `test_cmp' function in the
> testsuite doesn't like the output that is captured from STDERR, it seems that
> there is a difference in regards to whitespace. My workaround is to use
> `diff -wq`. I don't know if this is an accepted solution.

That's why.

Cheers,
Wesley
Junio C Hamano Sept. 3, 2023, 4:50 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> If you rewind to lose commits from the branch you are (re)building
> against, and what was rewound and discarded was part of the work you
> are building, whether it is on a local branch or on a remote branch
> that contains what you have already pushed, they will be discarded,
> it is by design, and it is a known deficiency with the fork-point
> heuristics.  How the fork-point heuristics breaks down is rather
> well known ...

Another tangent, this time very closely related to this topic, is
that it may be worth warning when the fork-point heuristics chooses
the base commit that is different from the original upstream,
regardless of how we ended up using fork-point heuristics.

Experienced users may not be confused when the heuristics kicks in
and when it does not (e.g. because they configured, because they
used the "lazy" form, or because they gave "--fork-point" from the
command line explicitly), but they still may get surprising results
if a reflog entry chosen to be used as the base by the heuristics is
not what they expected to be used, and can lose their work that way.
Imagine that you pushed your work to the remote that is a shared
repository, and then continued building on top of it, while others
rewound the remote branch to eject your work, and your "git fetch"
updated the remote-tracking branch.  You'll be pretty much in the
same situation you had in your reproduction recipe that rewound your
own local branch that you used to build your derived work on and
would lose your work the same way, if you do not notice that the
remote branch has been rewound (and the fork-point heuristics chose
a "wrong" commit from the reflog of your remote-tracking branch.

Perhaps something along the lines of this (not even compile tested,
though)...  It might even be useful to show a shortlog between the
.restrict_revision and .upstream, which is the list of commits that
is potentially lost, but that might turn out to be excessively loud
and noisy in the workflow of those who do benefit from the
fork-point heuristics because their project rewinds branches too
often and too wildly for them to manually keep track of.  I dunno.


 builtin/rebase.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git c/builtin/rebase.c w/builtin/rebase.c
index 50cb85751f..432a97e205 100644
--- c/builtin/rebase.c
+++ w/builtin/rebase.c
@@ -1721,9 +1721,15 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
 	if (keep_base && options.reapply_cherry_picks)
 		options.upstream = options.onto;
 
-	if (options.fork_point > 0)
+	if (options.fork_point > 0) {
 		options.restrict_revision =
 			get_fork_point(options.upstream_name, options.orig_head);
+		if (options.restrict_revision &&
+		    options.restrict_revision != options.upstream)
+			warning(_("fork-point heuristics using %s from the reflog of %s"),
+				oid_to_hex(&options.restrict_revision->object.oid),
+				options.upstream_name);
+	}
 
 	if (repo_read_index(the_repository) < 0)
 		die(_("could not read index"));
Wesley Schwengle Sept. 3, 2023, 12:34 p.m. UTC | #4
On 9/3/23 00:50, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> If you rewind to lose commits from the branch you are (re)building
>> against, and what was rewound and discarded was part of the work you
>> are building, whether it is on a local branch or on a remote branch
>> that contains what you have already pushed, they will be discarded,
>> it is by design, and it is a known deficiency with the fork-point
>> heuristics.  How the fork-point heuristics breaks down is rather
>> well known ...
> 
> Another tangent, this time very closely related to this topic, is
> that it may be worth warning when the fork-point heuristics chooses
> the base commit that is different from the original upstream,
> regardless of how we ended up using fork-point heuristics.
> 
> [snip]
> 
> Perhaps something along the lines of this (not even compile tested,
> though)...  It might even be useful to show a shortlog between the
> .restrict_revision and .upstream, which is the list of commits that
> is potentially lost, but that might turn out to be excessively loud
> and noisy in the workflow of those who do benefit from the
> fork-point heuristics because their project rewinds branches too
> often and too wildly for them to manually keep track of.  I dunno.

I like the idea of the warning, but it could be loud indeed and you'll 
want to turn it off in that case.
Phillip Wood Sept. 4, 2023, 10:16 a.m. UTC | #5
On 03/09/2023 05:50, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> If you rewind to lose commits from the branch you are (re)building
>> against, and what was rewound and discarded was part of the work you
>> are building, whether it is on a local branch or on a remote branch
>> that contains what you have already pushed, they will be discarded,
>> it is by design, and it is a known deficiency with the fork-point
>> heuristics.  How the fork-point heuristics breaks down is rather
>> well known ...
> 
> Another tangent, this time very closely related to this topic, is
> that it may be worth warning when the fork-point heuristics chooses
> the base commit that is different from the original upstream,
> regardless of how we ended up using fork-point heuristics.

I think that is a good idea and would help to mitigate the surprise that 
some users have expressed when --fork-point kicks and they didn't know 
about it. I think we may want to compare "branch_base" which holds the 
merge-base of HEAD and upstream with "restrict_revision" to decide when 
to warn.

Best Wishes

Phillip

> Experienced users may not be confused when the heuristics kicks in
> and when it does not (e.g. because they configured, because they
> used the "lazy" form, or because they gave "--fork-point" from the
> command line explicitly), but they still may get surprising results
> if a reflog entry chosen to be used as the base by the heuristics is
> not what they expected to be used, and can lose their work that way.
> Imagine that you pushed your work to the remote that is a shared
> repository, and then continued building on top of it, while others
> rewound the remote branch to eject your work, and your "git fetch"
> updated the remote-tracking branch.  You'll be pretty much in the
> same situation you had in your reproduction recipe that rewound your
> own local branch that you used to build your derived work on and
> would lose your work the same way, if you do not notice that the
> remote branch has been rewound (and the fork-point heuristics chose
> a "wrong" commit from the reflog of your remote-tracking branch.
> 
> Perhaps something along the lines of this (not even compile tested,
> though)...  It might even be useful to show a shortlog between the
> .restrict_revision and .upstream, which is the list of commits that
> is potentially lost, but that might turn out to be excessively loud
> and noisy in the workflow of those who do benefit from the
> fork-point heuristics because their project rewinds branches too
> often and too wildly for them to manually keep track of.  I dunno.
> 
> 
>   builtin/rebase.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git c/builtin/rebase.c w/builtin/rebase.c
> index 50cb85751f..432a97e205 100644
> --- c/builtin/rebase.c
> +++ w/builtin/rebase.c
> @@ -1721,9 +1721,15 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	if (keep_base && options.reapply_cherry_picks)
>   		options.upstream = options.onto;
>   
> -	if (options.fork_point > 0)
> +	if (options.fork_point > 0) {
>   		options.restrict_revision =
>   			get_fork_point(options.upstream_name, options.orig_head);
> +		if (options.restrict_revision &&
> +		    options.restrict_revision != options.upstream)
> +			warning(_("fork-point heuristics using %s from the reflog of %s"),
> +				oid_to_hex(&options.restrict_revision->object.oid),
> +				options.upstream_name);
> +	}
>   
>   	if (repo_read_index(the_repository) < 0)
>   		die(_("could not read index"));
>
Junio C Hamano Sept. 5, 2023, 10:01 p.m. UTC | #6
Wesley Schwengle <wesleys@opperschaap.net> writes:

> I like the idea of the warning, but it could be loud indeed and you'll
> want to turn it off in that case.

I tend to think that a single-liner warning would not be too
intrusive (it might actually be too subtle to be noticed),
especially given that it is issued only when the fork-point does
move the target commit from what was given.

Giving a shortlog of what is lost in the history does sound like a
bit too loud, I am afraind, though.
diff mbox series

Patch

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 2108001600..ee7db9ba0c 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1608,8 +1608,22 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 								    NULL);
 			if (!options.upstream_name)
 				error_on_missing_default_upstream();
-			if (options.fork_point < 0)
+			if (options.fork_point < 0) {
+				warning(_(
+					"When \"git rebase\" is run without specifying <upstream> on the\n"
+					"command line, the current default is to use the fork-point\n"
+					"heuristics. This is expected to change in a future version\n"
+					"of Git, and you will have to explicitly give \"--fork-point\" from\n"
+					"the command line if you keep using the fork-point mode.  You can\n"
+					"run \"git config rebase.forkpoint false\" to adopt the new default\n"
+					"in advance and that will also squelch the message.\n"
+					"\n"
+					"You can replace \"git config\" with \"git config --global\" to set a default\n"
+					"preference for all repositories. You can also pass --no-fork-point, --fork-point\n"
+					"on the command line to override the configured default per invocation.\n"
+				));
 				options.fork_point = 1;
+			}
 		} else {
 			options.upstream_name = argv[0];
 			argc--;
diff --git a/t/t3431-rebase-fork-point.sh b/t/t3431-rebase-fork-point.sh
index 4bfc779bb8..908867ae0f 100755
--- a/t/t3431-rebase-fork-point.sh
+++ b/t/t3431-rebase-fork-point.sh
@@ -113,4 +113,66 @@  test_expect_success 'rebase.forkPoint set to true and --root given' '
 	git rebase --root
 '
 
+# The use of the diff -qw is because there is some kind of whitespace character
+# magic going on which probably has to do with the tabs. It only occurs when we
+# check STDERR
+test_expect_success 'rebase without rebase.forkpoint' '
+	git init rebase-forkpoint &&
+	cd rebase-forkpoint &&
+	git status >/tmp/foo &&
+	echo "commit a" > file.txt &&
+	git add file.txt &&
+	git commit -m "First commit" file.txt &&
+	echo "commit b" >> file.txt &&
+	git commit -m "Second commit" file.txt &&
+	git switch -c foo &&
+	echo "commit c" >> file.txt &&
+	git commit -m "Third commit" file.txt &&
+	git branch --set-upstream-to=main &&
+	git switch main &&
+	git merge foo &&
+	git reset --hard HEAD^ &&
+	git switch foo &&
+	commit=$(git log -n1 --format="%h") &&
+	git rebase >out 2>err &&
+	test_must_be_empty out &&
+	cat <<-\OEF > expect &&
+	warning: When "git rebase" is run without specifying <upstream> on the
+	command line, the current default is to use the fork-point
+	heuristics. This is expected to change in a future version
+	of Git, and you will have to explicitly give "--fork-point" from
+	the command line if you keep using the fork-point mode.  You can
+	run "git config rebase.forkpoint false" to adopt the new default
+	in advance and that will also squelch the message.
+
+	You can replace "git config" with "git config --global" to set a default
+	preference for all repositories. You can also pass --no-fork-point, --fork-point
+	on the command line to override the configured default per invocation.
+
+	Successfully rebased and updated refs/heads/foo.
+	OEF
+	diff -qw expect err &&
+	git reset --hard $commit &&
+	git rebase --fork-point >out 2>err &&
+	test_must_be_empty out &&
+	echo "Successfully rebased and updated refs/heads/foo." > expect &&
+	diff -qw expect err &&
+	git reset --hard $commit &&
+	git rebase --no-fork-point >out 2>err &&
+	test_must_be_empty err &&
+	echo "Current branch foo is up to date." > expect &&
+	test_cmp out expect &&
+	git config --add rebase.forkpoint true &&
+	git rebase >out 2>err &&
+	test_must_be_empty out &&
+	echo "Successfully rebased and updated refs/heads/foo." > expect &&
+	diff -qw expect err &&
+	git reset --hard $commit &&
+	git config --replace-all rebase.forkpoint false &&
+	git rebase >out 2>err &&
+	test_must_be_empty err &&
+	echo "Current branch foo is up to date." > expect &&
+	test_cmp out expect
+'
+
 test_done