mbox series

[0/2,RFC] Revert/delay performance regression in 'git checkout -b'

Message ID pull.317.git.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Revert/delay performance regression in 'git checkout -b' | expand

Message

Linus Arver via GitGitGadget Aug. 21, 2019, 7:18 p.m. UTC
As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo. This
was an intentional change when writing the "git switch" builtin. Here is the
commit message for 65f099b ("switch: no worktree status unless real branch
switch happens" 2019-03-29):

When we switch from one branch to another, it makes sense to show a
summary of local changes since there could be conflicts, or some files
left modified.... When switch is used solely for creating a new
branch (and "switch" to the same commit) or detaching, we don't really
need to show anything.

"git checkout" does it anyway for historical reasons. But we can start
with a clean slate with switch and don't have to.

This essentially reverts fa655d8411 (checkout: optimize "git checkout
-b <new_branch>" - 2018-08-16) and make it default for switch,
but also for -B and --detach. Users of big repos are encouraged to
move to switch.

I was considering doing a full, long-term revert of this change to get the
performance back to normal, but I also saw this feedback on the list for
this patch:

I like this last bit. The skip_merge_working_tree() function which
this removes was ugly, difficult to maintain, and difficult to get
just right (and easy to break -- even by changing parts of the system
which one might not expect to impact it).

So, the goal is to reduce the complication given by
skip_merge_working_tree() by recommending that users use 'git switch -c'.
The only problem is: users will take a while to move, unless prompted.

This series does the following:

 1. Reverts the change that makes 'git checkout -b' slow again.
 2. Creates a warning that recommends users start using 'git switch -c'
    instead.

This allows us to strip out this performance feature after users have had
time to adopt the new way of doing things.

Derrick Stolee (2):
  Revert "switch: no worktree status unless real branch switch happens"
  DEPRECATION: warn about 'git checkout -b'

 Documentation/config/checkout.txt |   8 ++
 builtin/checkout.c                | 130 +++++++++++++++++++++++++++++-
 t/t1090-sparse-checkout-scope.sh  |  14 ++++
 3 files changed, 150 insertions(+), 2 deletions(-)


base-commit: 5fa0f5238b0cd46cfe7f6fa76c3f526ea98148d9
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-317%2Fderrickstolee%2Frevert-switch-slow-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-317/derrickstolee/revert-switch-slow-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/317

Comments

SZEDER Gábor Aug. 21, 2019, 10:31 p.m. UTC | #1
On Wed, Aug 21, 2019 at 12:18:32PM -0700, Derrick Stolee via GitGitGadget wrote:
> As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
> checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo.

Does this slowdown only affect the Windows OS repo with VFS for Git,
or other biggish repos without VFS for Git as well?

> This series does the following:
> 
>  1. Reverts the change that makes 'git checkout -b' slow again.
>  2. Creates a warning that recommends users start using 'git switch -c'
>     instead.

'git help switch' says loudly and clearly that "THIS COMMAND IS
EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."  It's too early to recommend
it this aggressively, and to deprecate any parts of 'git checkout'.
Elijah Newren Aug. 22, 2019, 3:16 a.m. UTC | #2
Hi,

On Wed, Aug 21, 2019 at 12:21 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
> checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo. This
> was an intentional change when writing the "git switch" builtin. Here is the
> commit message for 65f099b ("switch: no worktree status unless real branch
> switch happens" 2019-03-29):
>
> When we switch from one branch to another, it makes sense to show a
> summary of local changes since there could be conflicts, or some files
> left modified.... When switch is used solely for creating a new
> branch (and "switch" to the same commit) or detaching, we don't really
> need to show anything.
>
> "git checkout" does it anyway for historical reasons. But we can start
> with a clean slate with switch and don't have to.
>
> This essentially reverts fa655d8411 (checkout: optimize "git checkout
> -b <new_branch>" - 2018-08-16) and make it default for switch,
> but also for -B and --detach. Users of big repos are encouraged to
> move to switch.
>
> I was considering doing a full, long-term revert of this change to get the
> performance back to normal, but I also saw this feedback on the list for
> this patch:
>
> I like this last bit. The skip_merge_working_tree() function which
> this removes was ugly, difficult to maintain, and difficult to get
> just right (and easy to break -- even by changing parts of the system
> which one might not expect to impact it).

Instead of restoring this easy-to-break code, could we instead
simplify it and make it more robust?  As per the original commit
message, the whole point of the patch is that when you have a huge
index, operations take a while.  But in the special case of "git
checkout -b <new_branch>", there's no need to even check the index.
So, we could:

  * Check if the user is running "git checkout -b <new_branch>"
  * If so, use the performance hack to skip touching the index at all.

This would be much better than what the patch currently does:

  * Check if the user has specified -m, if so they clearly didn't just
specify "git checkout -b <new_branch>"
  * Check if the user has specified -f, if so they clearly didn't just
specify "git checkout -b <new_branch>"
  * Check if the user has specified --detach, if so they clearly
didn't just specify "git checkout -b <new_branch>"
  * ...<lots of other similar steps>...
  * If we got here, since we've checked all other cases (assuming
other people who have touched checkout remembered to add the necessary
checks for each and every new flag), then by deduction the user must
have specified "git checkout -b <new_branch>", so...
  * Use the performance hack to skip touching the index at all.
Elijah Newren Aug. 22, 2019, 3:18 a.m. UTC | #3
On Wed, Aug 21, 2019 at 5:01 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Wed, Aug 21, 2019 at 12:18:32PM -0700, Derrick Stolee via GitGitGadget wrote:
> > As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
> > checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo.
>
> Does this slowdown only affect the Windows OS repo with VFS for Git,
> or other biggish repos without VFS for Git as well?

It will also affect other biggish repos without VFS for Git; see the
"does not seem to be GFVS-specific in any way" paragraph of
https://public-inbox.org/git/CABPp-BGir_5xyqEfwytDog0rZDydPHXjuqXCpNKk67dVPXjUjA@mail.gmail.com/
Derrick Stolee Aug. 22, 2019, 1:43 p.m. UTC | #4
On 8/21/2019 6:31 PM, SZEDER Gábor wrote:
> On Wed, Aug 21, 2019 at 12:18:32PM -0700, Derrick Stolee via GitGitGadget wrote:
>> As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
>> checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo.
> 
> Does this slowdown only affect the Windows OS repo with VFS for Git,
> or other biggish repos without VFS for Git as well?
> 
>> This series does the following:
>>
>>  1. Reverts the change that makes 'git checkout -b' slow again.
>>  2. Creates a warning that recommends users start using 'git switch -c'
>>     instead.
> 
> 'git help switch' says loudly and clearly that "THIS COMMAND IS
> EXPERIMENTAL. THE BEHAVIOR MAY CHANGE."  It's too early to recommend
> it this aggressively, and to deprecate any parts of 'git checkout'.

Even more reason why the performance boost should not have been
removed from 'git checkout -b' so quickly.

Thanks,
-Stolee
Derrick Stolee Aug. 22, 2019, 1:45 p.m. UTC | #5
On 8/21/2019 11:16 PM, Elijah Newren wrote:
> Hi,
> 
> On Wed, Aug 21, 2019 at 12:21 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> As we were integrating Git 2.23.0 into VFS for Git, we discovered that "git
>> checkout -b new-branch" went from 0.3s to 10+s on the Windows OS repo. This
>> was an intentional change when writing the "git switch" builtin. Here is the
>> commit message for 65f099b ("switch: no worktree status unless real branch
>> switch happens" 2019-03-29):
>>
>> When we switch from one branch to another, it makes sense to show a
>> summary of local changes since there could be conflicts, or some files
>> left modified.... When switch is used solely for creating a new
>> branch (and "switch" to the same commit) or detaching, we don't really
>> need to show anything.
>>
>> "git checkout" does it anyway for historical reasons. But we can start
>> with a clean slate with switch and don't have to.
>>
>> This essentially reverts fa655d8411 (checkout: optimize "git checkout
>> -b <new_branch>" - 2018-08-16) and make it default for switch,
>> but also for -B and --detach. Users of big repos are encouraged to
>> move to switch.
>>
>> I was considering doing a full, long-term revert of this change to get the
>> performance back to normal, but I also saw this feedback on the list for
>> this patch:
>>
>> I like this last bit. The skip_merge_working_tree() function which
>> this removes was ugly, difficult to maintain, and difficult to get
>> just right (and easy to break -- even by changing parts of the system
>> which one might not expect to impact it).
> 
> Instead of restoring this easy-to-break code, could we instead
> simplify it and make it more robust?  As per the original commit
> message, the whole point of the patch is that when you have a huge
> index, operations take a while.  But in the special case of "git
> checkout -b <new_branch>", there's no need to even check the index.
> So, we could:
> 
>   * Check if the user is running "git checkout -b <new_branch>"
>   * If so, use the performance hack to skip touching the index at all.
> 
> This would be much better than what the patch currently does:
> 
>   * Check if the user has specified -m, if so they clearly didn't just
> specify "git checkout -b <new_branch>"
>   * Check if the user has specified -f, if so they clearly didn't just
> specify "git checkout -b <new_branch>"
>   * Check if the user has specified --detach, if so they clearly
> didn't just specify "git checkout -b <new_branch>"
>   * ...<lots of other similar steps>...
>   * If we got here, since we've checked all other cases (assuming
> other people who have touched checkout remembered to add the necessary
> checks for each and every new flag), then by deduction the user must
> have specified "git checkout -b <new_branch>", so...
>   * Use the performance hack to skip touching the index at all.

I can look into a simpler implementation of this direction. I'm
getting the feeling that this immediate recommendation of "git switch -c"
is too hasty, so the best thing to do is to create a simpler way
to detect "git checkout -b" and use the fast method.

Thanks,
-Stolee