Message ID | 20230819203528.562156-2-wesleys@opperschaap.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] builtin/rebase.c: Emit warning when rebasing without a forkpoint | expand |
Wesley Schwengle <wesleys@opperschaap.net> writes: > The behaviour of `git rebase' was that if you supply an upstream on the > command line that it behaves as if `--no-forkpoint' was supplied and if > you don't supply an upstream, it behaves as if `--forkpoint' was > supplied. I actually think it is a reasonable, if a bit too clever (for my taste at least), default for those who do not want to type the "--fork-point" option from the command line and still want to use that option when they are pulling from or rebasing on the source they usually interact with, while still allowing them to be precise when they do want to specify exactly what commit they want to base it on. And the way how you tell if they are using the "usual" source is to see if they used the lazy "git rebase" (without arguments) form. So I do not think it is particularly a bad design to allow "git rebase master" and "git rebase" to behave differently. The latter may use the "fork point computed using 'master' branch" (when the current branch is configured to rebuild on top of 'master') while the former may use "exactly the commit pointed at by the 'master' branch". > This can result in a loss of commits if you don't know that > and if you don't know about `git reflog' or have other copies of your > changes. Surely, but you would lose commits if you don't know these things and explicitly gave the --fork-point option the same way. So I am not sure if switching of the default is warranted. > - if (options.fork_point < 0) > + if (options.fork_point < 0) { > + warning(_( > + "Rebasing without specifying a forkpoint is discouraged. You can squelch\n" > + "this message by running one of the following commands something before your\n" > + "next rebase:\n" > + "\n" > + " git config rebase.forkpoint = false # This will become the new default\n" > + " git config rebase.forkpoint = true # This is the old default\n" > + "\n" The message "Rebasing without specifying a forkpoint" reads as if you are encouraging the use of forkpoint mode (which you are not, I know), but then what the message advertises as a future default stops not make sense. "If we hate the forkpoint mode so much to disable it by default, why so we discourage running the command without specifying it?" would be the confused message the users will read from it. Your "git config" example command lines are not correct, are they? There should be no '=' assignment operator. I am also afraid that this is giving a way too broad an advice. What you want to discourage is to rebase without specifying what to rebase on and without saying if you want or you do not want the forkpoint behaviour, which will opt the user into the more dangerous forkpoint behaviour. The above makes it sound as if we will discourage even the more precise "git rebase <newbase>" form, but I do not think it is the case. We would and should not trigger the folk-point behaviour if there is an explicit <upstream> and the user does not say "--fork-point" from the command line. Here is my attempt to rewrite the above: When 'git rebase' is run without specifying <upstream> on the command line, the current default is to use the fork-point heuristics, but this is expected to change in a future version of Git, and you will have to explicitly give "--fork-point" from the command line if you keep using the fork-point mode. You can run "git config rebase.forkpoint false" to adopt the new default in advance and that will also squelch the message. Note that the parsing of "rebase.forkpoint" is a bit peculiar in that - By leaving it unspecified, the .fork_point = -1 in REBASE_OPTIONS_INIT takes effect (which is unsurprising); - By setting it to false, .fork_point becomes 0; but - If you set the configuration variable to true, .fork_point becomes -1, not 1. And this is very much deliberate if I understand it correctly [*1*]. By the time we get to this part of the code (i.e. .fork_point is -1), the user may already have rebase.forkpoint set to true. IOW, setting it to 'true' is not a valid way to squelch this message. I am not commenting on the tests, as the above code probably needs to be corrected first so that folks who want to squelch the message and want the "forkpoint behaviour by default when rebuilding on the usual upstream" behaviour can do so by setting the variable to true. And that obviously need to be tested, too. Thanks. [References] *1* https://lore.kernel.org/git/xmqqturbdxi2.fsf@gitster.c.googlers.com/
Junio C Hamano <gitster@pobox.com> writes: > I am not commenting on the tests, as the above code probably needs > to be corrected first so that folks who want to squelch the message > and want the "forkpoint behaviour by default when rebuilding on the > usual upstream" behaviour can do so by setting the variable to true. > > And that obviously need to be tested, too. Another worrysome thing about rebase.forkpoint is that it will be inevitable for folks to start complaining that it does not work the way other configuration variables do. Setting the variable to 'true' is not the same as passing '--fork-point=true' from the command line. I actually think it would be a lot larger behaviour change with a huge potential to be received as a regression if we start making the variable to mean the same thing as passing '--fork-point=true'. People may like the current "if you are rebuilding your branch on its usual upstream, pay attention to the rebase and rewind of the upstream itself, but if you are giving an explicit upstream from the command line, the tool does not second guess you with the fork-point heuristics" behaviour and prefer to set it to true. We would be breaking them big time if suddenly the rebase.forkpoint=true they set previously starts triggering the fork-point heuristics when they run "git rebase upstream". So that needs to be kept in mind when/if we fix the "setting the variable, even to 'true', will squelch the warning".
Hi Wesley On 19/08/2023 21:34, Wesley Schwengle wrote: > When commit d1e894c6d7 (Document `rebase.forkpoint` in rebase man page, > 2021-09-16) was submitted there was a discussion on if the forkpoint > behaviour of `git rebase' was sane. In my experience this wasn't sane. > Git rebase doesn't work if you don't have an upstream branch configured > (or something that says `merge = refs/heads/master' in the git config). > The behaviour of `git rebase' was that if you supply an upstream on the > command line that it behaves as if `--no-forkpoint' was supplied and if > you don't supply an upstream, it behaves as if `--forkpoint' was > supplied. This can result in a loss of commits if you don't know that > and if you don't know about `git reflog' or have other copies of your > changes. This can be seen with the following reproduction path: > > mkdir reproduction > cd reproduction > git init . > echo "commit a" > file.txt > git add file.txt > git commit -m "First commit" file.txt > echo "commit b" >> file.txt > git commit -m "Second commit" file.txt > > git switch -c foo > echo "commit c" >> file.txt" > git commit -m "Third commit" file.txt > git branch --set-upstream-to=master > > git status > On branch foo > Your branch is ahead of 'master' by 1 commit. > > git switch master > git merge foo Here "git merge" fast-forwards I think, if instead it created a merge commit there would be no problem as the tip of branch "foo" would not end up in master's reflog. > git reset --hard HEAD^ > git switch foo > Switched to branch 'foo' > Your branch is ahead of 'master' by 1 commit. > > git log --format='%C(yellow)%h%Creset %Cgreen%s%Creset' For a reproduction recipe I think "git log --oneline" would suffice. > 5f427e3 Third commit > 03ad791 Second commit > 411e6d4 First commit > > git rebase > git status > On branch foo > Your branch is up to date with 'master'. > > git log --format='%C(yellow)%h%Creset %Cgreen%s%Creset' > 03ad791 Second commit > 411e6d4 First commit Thanks for the detailed reproduction recipe, I think it would be helpful to summarize what's happening in the commit message, especially as it seems to depend on "git merge" fast-forwarding. Do you often merge a branch into it's upstream and then reset the upstream branch? I tend to agree with Junio that the current default is pretty reasonable. Looking through the links from the cover letter it seems that the current behavior came from a desire for git fetch && git rebase to behave like git pull --rebase I think the commit message for any change to the default should address why that is undesirable. Also we should consider what problems may arise from not defaulting to --fork-point when rebasing on an upstream branch that has itself been rebased or rewound. Best Wishes Phillip
On 31/08/2023 22:52, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> I am not commenting on the tests, as the above code probably needs >> to be corrected first so that folks who want to squelch the message >> and want the "forkpoint behaviour by default when rebuilding on the >> usual upstream" behaviour can do so by setting the variable to true. >> >> And that obviously need to be tested, too. > > Another worrysome thing about rebase.forkpoint is that it will be > inevitable for folks to start complaining that it does not work the > way other configuration variables do. Setting the variable to > 'true' is not the same as passing '--fork-point=true' from the > command line. It does seem strange, it looks like the variable was really added as a way to turn off the current default. If we do change the default to --no-fork-point when no upstream is given on the commandline then I think we should consider allowing "auto" for rebase.forkpoint with the some meaning as "true" and recommend that instead. Best Wishes Phillip > I actually think it would be a lot larger behaviour change with a > huge potential to be received as a regression if we start making the > variable to mean the same thing as passing '--fork-point=true'. > People may like the current "if you are rebuilding your branch on > its usual upstream, pay attention to the rebase and rewind of the > upstream itself, but if you are giving an explicit upstream from the > command line, the tool does not second guess you with the fork-point > heuristics" behaviour and prefer to set it to true. We would be > breaking them big time if suddenly the rebase.forkpoint=true they > set previously starts triggering the fork-point heuristics when they > run "git rebase upstream". So that needs to be kept in mind when/if > we fix the "setting the variable, even to 'true', will squelch the > warning". >
Phillip Wood <phillip.wood123@gmail.com> writes: > It does seem strange, it looks like the variable was really added as a > way to turn off the current default. If we do change the default to > --no-fork-point when no upstream is given on the commandline then I > think we should consider allowing "auto" for rebase.forkpoint with the > some meaning as "true" and recommend that instead. Perhaps. The current and existing users do not need to change anything and 'true' should keep working fine, but given that we are discouraging the use of fork-point heuristics, it is not clear if it makes sense to entice new users with a new 'auto' synonym, so, I dunno Thanks. .
Hello Phillip, On 9/1/23 09:19, Phillip Wood wrote: > Hi Wesley > > On 19/08/2023 21:34, Wesley Schwengle wrote: >> When commit d1e894c6d7 (Document `rebase.forkpoint` in rebase man page, >> 2021-09-16) was submitted there was a discussion on if the forkpoint >> behaviour of `git rebase' was sane. In my experience this wasn't sane. >> Git rebase doesn't work if you don't have an upstream branch configured >> (or something that says `merge = refs/heads/master' in the git config). >> The behaviour of `git rebase' was that if you supply an upstream on the >> command line that it behaves as if `--no-forkpoint' was supplied and if >> you don't supply an upstream, it behaves as if `--forkpoint' was >> supplied. This can result in a loss of commits if you don't know that >> and if you don't know about `git reflog' or have other copies of your >> changes. This can be seen with the following reproduction path: >> >> mkdir reproduction >> cd reproduction >> git init . >> echo "commit a" > file.txt >> git add file.txt >> git commit -m "First commit" file.txt >> echo "commit b" >> file.txt >> git commit -m "Second commit" file.txt >> >> git switch -c foo >> echo "commit c" >> file.txt" >> git commit -m "Third commit" file.txt >> git branch --set-upstream-to=master >> >> git status >> On branch foo >> Your branch is ahead of 'master' by 1 commit. >> >> git switch master >> git merge foo > > Here "git merge" fast-forwards I think, if instead it created a merge > commit there would be no problem as the tip of branch "foo" would not > end up in master's reflog. If you do git merge foo --no-ff git reset --hard HEAD^ git switch foo git rebase You'll end up with just the commits that are in master. You'll lose all commits from foo. >> git log --format='%C(yellow)%h%Creset %Cgreen%s%Creset' > > For a reproduction recipe I think "git log --oneline" would suffice. Will update, thanks. >> [snip] > > Thanks for the detailed reproduction recipe, I think it would be helpful > to summarize what's happening in the commit message, especially as it > seems to depend on "git merge" fast-forwarding. Do you often merge a > branch into it's upstream and then reset the upstream branch? Tricky question. When I encountered this behavior I was working on an epic/topic branch that I had locally. And I made a commit that I thought should have been in another branch. I moved the commit to another branch and than later on rebased it. I didn't reply to Juno yet, but he refers to the discussion about --fork-point and --root command line options. This discussion links to a blogpost [*1*] where the same behavior is experienced. The quirk is this: --fork-point looks at the reflog and reflog is local. Meaning, having an remote upstream branch will make --fork-point a noop. Only where you have an upstream which is local and your reflog has seen dropped commits it does something. In all other cases (including supplying the upstream) it behaves as if --no-fork-point was set. If you do the same action in two different clones, you get a different result, depending on what is in your reflog. I find this very tricky behavior for a default. I've set it to false myself, to get a more consistent behavior. I usually have a remote upstream (gitlab/github) and work with that, so the --fork-point behaviour isn't present because there is no reflog for that, so it behaves as --no-fork-point. Cheers, Wesley [1]: https://commaok.xyz/post/fork-point/
Wesley <wesleys@opperschaap.net> writes: > The quirk is this: --fork-point looks at the reflog and reflog is > local. Meaning, having an remote upstream branch will make > --fork-point a noop. Only where you have an upstream which is local > and your reflog has seen dropped commits it does something. Why do you lack reflog on your remote-tracking branches in the first place? The fork-point heuristics, as far as I understand it, was invented exactly to protect you from your upstream repository rewinding and rebuilding the branch you have been building on top of. The default fetch refspec +refs/heads/*:refs/remotes/origin/* has the "force" option "+" in front exactly because the fetching repository is expected to keep the reflog for remote-tracking branches to help recovering from such a rewind & rebuild. Puzzled.
On 9/1/23 14:10, Junio C Hamano wrote: > Wesley <wesleys@opperschaap.net> writes: > >> The quirk is this: --fork-point looks at the reflog and reflog is >> local. Meaning, having an remote upstream branch will make >> --fork-point a noop. Only where you have an upstream which is local >> and your reflog has seen dropped commits it does something. > > Why do you lack reflog on your remote-tracking branches in the first > place? I do not know? I tested with a bare repo and two clones. And I also tested it with just a remote upstream in another branch. When in repo-1 I do the reset --hard HEAD^, and push the results, and pull them in in repo-2 the behavior doesn't replicate. The git reflog command doesn't show the reset. However, if I delete the reflog entry for removal of the reset HEAD^, git rebase exposes the fork-point behavior. > The fork-point heuristics, as far as I understand it, was invented > exactly to protect you from your upstream repository rewinding and > rebuilding the branch you have been building on top of. The default > fetch refspec +refs/heads/*:refs/remotes/origin/* has the "force" > option "+" in front exactly because the fetching repository is > expected to keep the reflog for remote-tracking branches to help > recovering from such a rewind & rebuild. I haven't force pushed anything btw, maybe that could explain things? Cheers, Wesley
This is the second version of the patch series. Patch 1: Be able to use rebase.forkpoint and --root Patch 2: Adding the warning + tests Patch 3: Update documenation I think I have covered most of your concerns and feedback in this second version. On 8/31/23 16:57, Junio C Hamano wrote: > Wesley Schwengle <wesleys@opperschaap.net> writes: > > Here is my attempt to rewrite the above: > > When 'git rebase' is run without specifying <upstream> on the > command line, the current default is to use the fork-point > heuristics, but this is expected to change in a future version > of Git, and you will have to explicitly give "--fork-point" from > the command line if you keep using the fork-point mode. You can > run "git config rebase.forkpoint false" to adopt the new default > in advance and that will also squelch the message. I agree. I'll change the text to your version. > Note that the parsing of "rebase.forkpoint" is a bit peculiar in > that > > - By leaving it unspecified, the .fork_point = -1 in > REBASE_OPTIONS_INIT takes effect (which is unsurprising); > > - By setting it to false, .fork_point becomes 0; but > > - If you set the configuration variable to true, .fork_point > becomes -1, not 1. I changed this in patch 1. > And this is very much deliberate if I understand it correctly [*1*]. > By the time we get to this part of the code (i.e. .fork_point is > -1), the user may already have rebase.forkpoint set to true. IOW, > setting it to 'true' is not a valid way to squelch this message. So this works now with patch 2. > Another worrysome thing about rebase.forkpoint is that it will be > inevitable for folks to start complaining that it does not work the > way other configuration variables do. Setting the variable to > 'true' is not the same as passing '--fork-point=true' from the > command line. I think it is now with the current series. > I actually think it would be a lot larger behaviour change with a > huge potential to be received as a regression if we start making the > variable to mean the same thing as passing '--fork-point=true'. > People may like the current "if you are rebuilding your branch on > its usual upstream, pay attention to the rebase and rewind of the > upstream itself, but if you are giving an explicit upstream from the > command line, the tool does not second guess you with the fork-point > heuristics" behaviour and prefer to set it to true. We would be > breaking them big time if suddenly the rebase.forkpoint=true they > set previously starts triggering the fork-point heuristics when they > run "git rebase upstream". So that needs to be kept in mind when/if > we fix the "setting the variable, even to 'true', will squelch the > warning". I get what you are saying. My solution is to make the --fork-point or --no-fork-point more explicit. People could use an alias for this? It would mean a different approach to the problem and deprecating rebase.forkpoint as a boolean value. It could become one of three values: "true", "false" and "legacy". Where "legacy" can be "implicit" or "auto". Although you had some ideas on "auto" already. I'm not sure on how I would call it. "no-upstream"?
Wesley <wesleys@opperschaap.net> writes: > On 9/1/23 14:10, Junio C Hamano wrote: >> Wesley <wesleys@opperschaap.net> writes: >> >>> The quirk is this: --fork-point looks at the reflog and reflog is >>> local. Meaning, having an remote upstream branch will make >>> --fork-point a noop. Only where you have an upstream which is local >>> and your reflog has seen dropped commits it does something. >> Why do you lack reflog on your remote-tracking branches in the first >> place? > > I do not know? I tested with a bare repo and two clones. And I also > tested it with just a remote upstream in another branch. IIRC, a non-bare repository (i.e. with working tree) should get core.logallrefupdates set to true by default, so all your refs, not just local and remote-tracking branches, should have records. > I haven't force pushed anything btw, maybe that could explain things? If your "remote" is never force-pushed, then the movements of refs at the remote (which you will observe whenever you fetch from it) will always fast-forward, and the remote-tracking branches in your local repository that keeps track of the movement will also record the fast-forwarding movement in the reflog. But then there is no need for the fork-point heurisitics to trigger, and even if it triggered the heuristics would not change the outcome, when rebasing against such a remote branch, as their tip will always a decendant of all commits that ever sat at the tip of that remote branch.
diff --git a/builtin/rebase.c b/builtin/rebase.c index 50cb85751f..41dd9b6256 100644 --- a/builtin/rebase.c +++ b/builtin/rebase.c @@ -1604,8 +1604,21 @@ int cmd_rebase(int argc, const char **argv, const char *prefix) NULL); if (!options.upstream_name) error_on_missing_default_upstream(); - if (options.fork_point < 0) + if (options.fork_point < 0) { + warning(_( + "Rebasing without specifying a forkpoint is discouraged. You can squelch\n" + "this message by running one of the following commands something before your\n" + "next rebase:\n" + "\n" + " git config rebase.forkpoint = false # This will become the new default\n" + " git config rebase.forkpoint = true # This is the old default\n" + "\n" + "You can replace \"git config\" with \"git config --global\" to set a default\n" + "preference for all repositories. You can also pass --no-fork-point, --fork-point\n" + "on the command line to override the configured default per invocation.\n" + )); options.fork_point = 1; + } } else { options.upstream_name = argv[0]; argc--; diff --git a/t/t3431-rebase-fork-point.sh b/t/t3431-rebase-fork-point.sh index 4bfc779bb8..a583ca6228 100755 --- a/t/t3431-rebase-fork-point.sh +++ b/t/t3431-rebase-fork-point.sh @@ -113,4 +113,54 @@ test_expect_success 'rebase.forkPoint set to true and --root given' ' git rebase --root ' +# The use of the diff -qw is because there is some kind of whitespace character +# magic going on which probably has to do with the tabs. It only occurs when we +# check STDERR +test_expect_success 'rebase without forkpoint' ' + git init rebase-forkpoint && + cd rebase-forkpoint && + git status >/tmp/foo && + echo "commit a" > file.txt && + git add file.txt && + git commit -m "First commit" file.txt && + echo "commit b" >> file.txt && + git commit -m "Second commit" file.txt && + git switch -c foo && + echo "commit c" >> file.txt && + git commit -m "Third commit" file.txt && + git branch --set-upstream-to=main && + git switch main && + git merge foo && + git reset --hard HEAD^ && + git switch foo && + commit=$(git log -n1 --format="%h") && + git rebase >out 2>err && + test_must_be_empty out && + cat <<-\OEF > expect && + warning: Rebasing without specifying a forkpoint is discouraged. You can squelch + this message by running one of the following commands something before your + next rebase: + + git config rebase.forkpoint = false # This will become the new default + git config rebase.forkpoint = true # This is the old default + + You can replace "git config" with "git config --global" to set a default + preference for all repositories. You can also pass --no-fork-point, --fork-point + on the command line to override the configured default per invocation. + + Successfully rebased and updated refs/heads/foo. + OEF + diff -qw expect err && + git reset --hard $commit && + git rebase --fork-point >out 2>err && + test_must_be_empty out && + echo "Successfully rebased and updated refs/heads/foo." > expect && + diff -qw expect err && + git reset --hard $commit && + git rebase --no-fork-point >out 2>err && + test_must_be_empty err && + echo "Current branch foo is up to date." > expect && + test_cmp out expect +' + test_done
When commit d1e894c6d7 (Document `rebase.forkpoint` in rebase man page, 2021-09-16) was submitted there was a discussion on if the forkpoint behaviour of `git rebase' was sane. In my experience this wasn't sane. Git rebase doesn't work if you don't have an upstream branch configured (or something that says `merge = refs/heads/master' in the git config). The behaviour of `git rebase' was that if you supply an upstream on the command line that it behaves as if `--no-forkpoint' was supplied and if you don't supply an upstream, it behaves as if `--forkpoint' was supplied. This can result in a loss of commits if you don't know that and if you don't know about `git reflog' or have other copies of your changes. This can be seen with the following reproduction path: mkdir reproduction cd reproduction git init . echo "commit a" > file.txt git add file.txt git commit -m "First commit" file.txt echo "commit b" >> file.txt git commit -m "Second commit" file.txt git switch -c foo echo "commit c" >> file.txt" git commit -m "Third commit" file.txt git branch --set-upstream-to=master git status On branch foo Your branch is ahead of 'master' by 1 commit. git switch master git merge foo git reset --hard HEAD^ git switch foo Switched to branch 'foo' Your branch is ahead of 'master' by 1 commit. git log --format='%C(yellow)%h%Creset %Cgreen%s%Creset' 5f427e3 Third commit 03ad791 Second commit 411e6d4 First commit git rebase git status On branch foo Your branch is up to date with 'master'. git log --format='%C(yellow)%h%Creset %Cgreen%s%Creset' 03ad791 Second commit 411e6d4 First commit This patch adds a warning where it will indicate that `rebase.forkpoint' must be set in the git configuration and/or that you can supply a `--forkpoint' or `--no-forkpoint' command line option to your `git rebase' invocation. Signed-off-by: Wesley Schwengle <wesleys@opperschaap.net> --- builtin/rebase.c | 15 ++++++++++- t/t3431-rebase-fork-point.sh | 50 ++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 1 deletion(-)