diff mbox series

[2/2] rebase: don't rebase linear topology with --fork-point

Message ID 20190221214059.9195-3-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series rebase: fix 2.11.0-era --fork-point regression | expand

Commit Message

Ævar Arnfjörð Bjarmason Feb. 21, 2019, 9:40 p.m. UTC
Fix a regression introduced in 4f21454b55 ("merge-base: handle
--fork-point without reflog", 2016-10-12).

Before that change having a linear history on top of an upstream
master would with --fork-point (aka argument-less rebase) tell us
there was nothing to be done:

    $ git rebase
    Current branch master is up to date.

After that change "rebase" will always redundantly find that it has
work to do (it doesn't):

    $ git rebase
    First, rewinding head to replay your work on top of it...
    Applying: [...]

Whereas equivalently running:

    $ git rebase @{upstream}
    $ git rebase $(git merge-base --fork-point @{u})

Gives us the old behavior of doing nothing.

Now, why did we have this regression? Fully digging into it yields an
interesting combination of causes:

Way back in 1308c17b3e ("Allow rebase to run if upstream is completely
merged", 2007-07-04) "rebase" learned to not do this redundant work
when asked to rebase on a commit that was already an ancestor of the
current commit.

Then in 1e0dacdbdb ("rebase: omit patch-identical commits with
--fork-point", 2014-07-16) a rebase bug was fixed for a case where the
history to be rebased was divergent by entirely skipping the 2007-era
logic if --fork-point was provided.

But here's the critical thing, *only* if the --fork-point was
divergent. At that time "git merge-base --fork-point A B" would return
nothing if the two commits weren't divergent.

Then in 4f21454b55 ("merge-base: handle --fork-point without reflog",
2016-10-12) which introduced the regression being fixed here, a bug
fix for "git merge-base --fork-point" being run stand-alone by proxy
broke this use-case git-rebase.sh was relying on, since it was still
assuming that if we didn't have divergent history we'd have no output.

Finally, when "rebase" was rewritten in C a combination of
9a48a615b4 ("builtin rebase: try to fast forward when possible",
2018-09-04), 103148aad8 ("merge-base --fork-point: extract libified
function", 2018-09-04) and 92d0d74e8d ("builtin rebase: support
`fork-point` option", 2018-09-04) faithfully re-implemented the
then-buggy behavior.

So let's fix this. It's easy enough, we just stop explicitly excluding
--fork-point from the can_fast_forward(...) test we're doing, which as
discussed above is faithfully ported over from buggy shellscript-era
logic.

I'm not bothering to fix this in the legacy rebase mode. As discussed
in 9aea5e9286 ("rebase: fix regression in rebase.useBuiltin=false test
mode", 2019-02-13) it'll be going away shortly after 2.21.0 lands.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/rebase.c                  |  6 ++++--
 t/t3421-rebase-topology-linear.sh | 15 +++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

Comments

Jeff King Feb. 22, 2019, 3:08 p.m. UTC | #1
On Thu, Feb 21, 2019 at 10:40:59PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Fix a regression introduced in 4f21454b55 ("merge-base: handle
> --fork-point without reflog", 2016-10-12).
> [...]

OK, your explanation mostly makes sense to me, except for one thing.

> Then in 4f21454b55 ("merge-base: handle --fork-point without reflog",
> 2016-10-12) which introduced the regression being fixed here, a bug
> fix for "git merge-base --fork-point" being run stand-alone by proxy
> broke this use-case git-rebase.sh was relying on, since it was still
> assuming that if we didn't have divergent history we'd have no output.

I still don't quite see how 4f21454b55 is involved here, except by
returning a fork-point value when there is no reflog, and thus
triggering the bug in more cases.

In particular, imagine this case:

  git init
  for i in $(seq 1 3); do echo $i >$i; git add $i; git commit -m $i; done
  git checkout -t -b other
  for i in $(seq 4 6); do echo $i >$i; git add $i; git commit -m $i; done
  git rebase

With the current code, that will rewind and replay 4-6, and I understand
that to be a bug from your description. And it happens at 4f21454b55,
too. But it _also_ happens at 4f21454b55^.

I.e., I still think that the only thing that commit changed is that we
found a fork-point in more cases. But the bug was still demonstrably
there when you actually have a reflog entry.

With the fix you have here, that case now produces "Current branch other
is up to date".

This is splitting hairs a little (and of course I'm trying to exonerate
the commit I'm responsible for ;) ), but I just want to make sure we
understand fully what's going on. Your fix looks plausibly correct to
me, but I admit I don't quite grok all the details of that conditional.

-Peff
Ævar Arnfjörð Bjarmason Feb. 22, 2019, 4:49 p.m. UTC | #2
On Fri, Feb 22 2019, Jeff King wrote:

> On Thu, Feb 21, 2019 at 10:40:59PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> Fix a regression introduced in 4f21454b55 ("merge-base: handle
>> --fork-point without reflog", 2016-10-12).
>> [...]
>
> OK, your explanation mostly makes sense to me, except for one thing.
>
>> Then in 4f21454b55 ("merge-base: handle --fork-point without reflog",
>> 2016-10-12) which introduced the regression being fixed here, a bug
>> fix for "git merge-base --fork-point" being run stand-alone by proxy
>> broke this use-case git-rebase.sh was relying on, since it was still
>> assuming that if we didn't have divergent history we'd have no output.
>
> I still don't quite see how 4f21454b55 is involved here, except by
> returning a fork-point value when there is no reflog, and thus
> triggering the bug in more cases.
>
> In particular, imagine this case:
>
>   git init
>   for i in $(seq 1 3); do echo $i >$i; git add $i; git commit -m $i; done
>   git checkout -t -b other
>   for i in $(seq 4 6); do echo $i >$i; git add $i; git commit -m $i; done
>   git rebase
>
> With the current code, that will rewind and replay 4-6, and I understand
> that to be a bug from your description. And it happens at 4f21454b55,
> too. But it _also_ happens at 4f21454b55^.
>
> I.e., I still think that the only thing that commit changed is that we
> found a fork-point in more cases. But the bug was still demonstrably
> there when you actually have a reflog entry.
>
> With the fix you have here, that case now produces "Current branch other
> is up to date".
>
> This is splitting hairs a little (and of course I'm trying to exonerate
> the commit I'm responsible for ;) ), but I just want to make sure we
> understand fully what's going on.

Yes. I didn't dig far enough into this and will re-word & re-submit,
also with the feedback you had on 1/2.

So here's my current understanding of this.

It's b6266dc88b ("rebase--am: use --cherry-pick instead of
--ignore-if-in-upstream", 2014-07-15) that broke this in the general
case.

I.e. if you set a tracking branch within the same repo (which I'd
betnobody does) but *also* if you have an established clone you have a
reflog for the upstream. Then we'll find the fork point, and we'll
always redundantly rebase.

But this hung on by a thread until your 4f21454b55 ("merge-base: handle
--fork-point without reflog", 2016-10-12). In particular when you:

 1. Clone some *new* repo
 2. commit on top
 3. git pull --rebase

You'll redundantly rebase on top, even though you have nothing to
do. Since there's no reflog.

This is why I ran into this most of the time, because my "patch some
random thing" is that, and I have pull.rebase=true in my config.

What had me confused about this being the primary cause was that when
trying to test this I was re-cloning, so I'd always get this empty
reflog case.

> Your fix looks plausibly correct to me, but I admit I don't quite grok
> all the details of that conditional.

We just consider whether we can fast-forward now, and then don't need to
rebase (unless "git rebase -i" etc.). I.e. that --fork-point was
considered for "do we need to do stuff" was a bug introduced in
b6266dc88b.
Jeff King Feb. 24, 2019, 10:10 a.m. UTC | #3
On Fri, Feb 22, 2019 at 05:49:57PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Yes. I didn't dig far enough into this and will re-word & re-submit,
> also with the feedback you had on 1/2.
> 
> So here's my current understanding of this.
> 
> It's b6266dc88b ("rebase--am: use --cherry-pick instead of
> --ignore-if-in-upstream", 2014-07-15) that broke this in the general
> case.
> 
> I.e. if you set a tracking branch within the same repo (which I'd
> betnobody does) but *also* if you have an established clone you have a
> reflog for the upstream. Then we'll find the fork point, and we'll
> always redundantly rebase.
> 
> But this hung on by a thread until your 4f21454b55 ("merge-base: handle
> --fork-point without reflog", 2016-10-12). In particular when you:
> 
>  1. Clone some *new* repo
>  2. commit on top
>  3. git pull --rebase
> 
> You'll redundantly rebase on top, even though you have nothing to
> do. Since there's no reflog.
> 
> This is why I ran into this most of the time, because my "patch some
> random thing" is that, and I have pull.rebase=true in my config.
> 
> What had me confused about this being the primary cause was that when
> trying to test this I was re-cloning, so I'd always get this empty
> reflog case.

OK, thanks, that all makes sense to me and matches what I'm seeing. I
think we're on the same page now.

> > Your fix looks plausibly correct to me, but I admit I don't quite grok
> > all the details of that conditional.
> 
> We just consider whether we can fast-forward now, and then don't need to
> rebase (unless "git rebase -i" etc.). I.e. that --fork-point was
> considered for "do we need to do stuff" was a bug introduced in
> b6266dc88b.

Right, that makes sense. But I'm just not clear on the new oidcmp() rule
that's put in place.

When we're not using fork point, in the old code we'd check whether
upstream and onto are the same. Which makes sense; if we're rebuilding
onto the same spot, then it's a noop.

And in the fork-point case, we'd instead want to do something similar
with restrict_revision. But you compare upstream and restrict_revision,
and my understanding is that with --fork-point the latter basically
replaces the former.  Shouldn't we be comparing restrict_revision and
onto?

E.g., if I do:

  git checkout -b foo origin
  echo content >file && git add file && git commit -m foo
  git rebase --onto origin^ --fork-point origin

we should do an actual rebase, but with your patch we get "Current
branch foo is up to date". Whereas dropping --fork-point, it works.

-Peff
diff mbox series

Patch

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 7c7bc13e91..7a16b8051c 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1664,9 +1664,11 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 	 * and if this is not an interactive rebase.
 	 */
 	if (can_fast_forward(options.onto, &options.orig_head, &merge_base) &&
-	    !is_interactive(&options) && !options.restrict_revision &&
+	    !is_interactive(&options) &&
 	    options.upstream &&
-	    !oidcmp(&options.upstream->object.oid, &options.onto->object.oid)) {
+	    (options.restrict_revision
+	     ? !oidcmp(&options.upstream->object.oid, &options.restrict_revision->object.oid)
+	     : !oidcmp(&options.upstream->object.oid, &options.onto->object.oid))) {
 		int flag;
 
 		if (!(options.flags & REBASE_FORCE)) {
diff --git a/t/t3421-rebase-topology-linear.sh b/t/t3421-rebase-topology-linear.sh
index b847064f91..1754537789 100755
--- a/t/t3421-rebase-topology-linear.sh
+++ b/t/t3421-rebase-topology-linear.sh
@@ -55,6 +55,21 @@  test_run_rebase success -m
 test_run_rebase success -i
 test_have_prereq !REBASE_P || test_run_rebase success -p
 
+test_run_rebase () {
+	result=$1
+	shift
+	test_expect_$result "rebase $* is no-op if remote upstream is an ancestor" "
+		reset_rebase &&
+		GIT_TEST_REBASE_USE_BUILTIN=true git rebase $* branch-b branch-e &&
+		test_cmp_rev e HEAD
+	"
+}
+test_run_rebase success ''
+test_run_rebase success --fork-point
+test_run_rebase success -m
+test_run_rebase success -i
+test_have_prereq !REBASE_P || test_run_rebase success -p
+
 test_run_rebase () {
 	result=$1
 	shift