Message ID | dcf5c60c69d8275a557ffe3d3ae30911d2140162.1567098090.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | checkout: add simple check for 'git checkout -b' |

On Thu, Aug 29, 2019 at 10:04 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The 'git switch' command was created to separate half of the
> behavior of 'git checkout'. It specifically has the mode to
> do nothing with the index and working directory if the user
> only specifies to create a new branch and change HEAD to that
> branch. This is also the behavior most users expect from
> 'git checkout -b', but for historical reasons it also performs
> an index update by scanning the working directory. This can be
> slow for even moderately-sized repos.
>
> A performance fix for 'git checkout -b' was introduced by
> fa655d8411 (checkout: optimize "git checkout -b <new_branch>"
> 2018-08-16). That change includes details about the config
> setting checkout.optimizeNewBranch when the sparse-checkout
> feature is required. The way this change detected if this
> behavior change is safe was through the skip_merge_working_tree()
> method. This method was complex and needed to be updated
> as new options were introduced.
>
> This behavior was essentially reverted by 65f099b ("switch:
> no worktree status unless real branch switch happens"
> 2019-03-29). Instead, two members of the checkout_opts struct
> were used to distinguish between 'git checkout' and 'git switch':
>
> * switch_branch_doing_nothing_is_ok
> * only_merge_on_switching_branches
>
> These settings have opposite values depending on if we start
> in cmd_checkout or cmd_switch.
>
> The message for 65f099b includes "Users of big repos are
> encouraged to move to switch." Making this change while
> 'git switch' is still experimental is too aggressive.
>
> Create a happy medium between these two options by making
> 'git checkout -b <branch>' behave just like 'git switch',
> but only if we read exactly those arguments. This must
> be done in cmd_checkout to avoid the arguments being
> consumed by the option parsing logic.
>
> This differs from the previous change by fa655d8 in that
> the config option checkout.optimizeNewBranch remains
> deleted. This means that 'git checkout -b' will ignore
> the index merge even if we have a sparse-checkout file.
> While this is a behavior change for 'git checkout -b',
> it matches the behavior of 'git switch -c'.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/checkout.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 6123f732a2..116200cf90 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -1713,6 +1713,15 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	opts.overlay_mode = -1;
>  	opts.checkout_index = -2; /* default on */
>  	opts.checkout_worktree = -2; /* default on */
> +
> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> +		/*
> +		 * User ran 'git checkout -b <branch>' and expects
> +		 * the same behavior as 'git switch -c <branch>'.
> +		 */
> +		opts.switch_branch_doing_nothing_is_ok = 0;
> +		opts.only_merge_on_switching_branches = 1;
> +	}
>
>  	options = parse_options_dup(checkout_options);
>  	options = add_common_options(&opts, options);
> --
> gitgitgadget

Nice! Thanks for doing this; a small and localized performance hack is
much nicer than a big and non-localized one. I also appreciate the
detailed history in the commit message.

Just for fun, I tested on linux (with a relatively fast SSD) using a
simple git-bomb repo with 10M index entries but a sparse checkout of
just one file. 'git switch -c' takes approximately 0.004s before or
after this patch.

'git checkout -b' before this patch:

$ time git checkout -b newbranch1
Switched to a new branch 'newbranch1'

real 0m13.533s
user 0m9.824s
sys 0m2.828s

After this patch:

$ time git checkout -b newbranch2
Switched to a new branch 'newbranch2'

real 0m0.003s
user 0m0.000s
sys 0m0.000s

Anyway, looks good to me.

Hi Stolee

On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The 'git switch' command was created to separate half of the
> behavior of 'git checkout'. It specifically has the mode to
> do nothing with the index and working directory if the user
> only specifies to create a new branch and change HEAD to that
> branch. This is also the behavior most users expect from
> 'git checkout -b', but for historical reasons it also performs
> an index update by scanning the working directory. This can be
> slow for even moderately-sized repos.
>
> A performance fix for 'git checkout -b' was introduced by
> fa655d8411 (checkout: optimize "git checkout -b <new_branch>"
> 2018-08-16). That change includes details about the config
> setting checkout.optimizeNewBranch when the sparse-checkout
> feature is required. The way this change detected if this
> behavior change is safe was through the skip_merge_working_tree()
> method. This method was complex and needed to be updated
> as new options were introduced.
>
> This behavior was essentially reverted by 65f099b ("switch:
> no worktree status unless real branch switch happens"
> 2019-03-29). Instead, two members of the checkout_opts struct
> were used to distinguish between 'git checkout' and 'git switch':
>
> * switch_branch_doing_nothing_is_ok
> * only_merge_on_switching_branches
>
> These settings have opposite values depending on if we start
> in cmd_checkout or cmd_switch.
>
> The message for 65f099b includes "Users of big repos are
> encouraged to move to switch." Making this change while
> 'git switch' is still experimental is too aggressive.
>
> Create a happy medium between these two options by making
> 'git checkout -b <branch>' behave just like 'git switch',
> but only if we read exactly those arguments. This must
> be done in cmd_checkout to avoid the arguments being
> consumed by the option parsing logic.
>
> This differs from the previous change by fa655d8 in that
> the config option checkout.optimizeNewBranch remains
> deleted. This means that 'git checkout -b' will ignore
> the index merge even if we have a sparse-checkout file.
> While this is a behavior change for 'git checkout -b',
> it matches the behavior of 'git switch -c'.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/checkout.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 6123f732a2..116200cf90 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -1713,6 +1713,15 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	opts.overlay_mode = -1;
>  	opts.checkout_index = -2; /* default on */
>  	opts.checkout_worktree = -2; /* default on */
> +
> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> +		/*
> +		 * User ran 'git checkout -b <branch>' and expects

What if the user ran 'git checkout -b<branch>'? Then argc == 2.

Best Wishes

Phillip

> +		 * the same behavior as 'git switch -c <branch>'.
> +		 */
> +		opts.switch_branch_doing_nothing_is_ok = 0;
> +		opts.only_merge_on_switching_branches = 1;
> +	}
>
>  	options = parse_options_dup(checkout_options);
>  	options = add_common_options(&opts, options);
>

On 8/29/2019 2:54 PM, Phillip Wood wrote:
> Hi Stolee
>
> On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
>> +
>> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
>> +		/*
>> +		 * User ran 'git checkout -b <branch>' and expects
>
> What if the user ran 'git checkout -b<branch>'? Then argc == 2.

Good catch. I'm tempted to say "don't do that" to keep this
simple. They won't have incorrect results, just slower than
the "with space" option.

However, if there is enough interest in correcting the "-b<branch>"
case, then I can make another attempt at this.

-Stolee

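To make the stuck-form case concrete, here is a minimal sketch of the check from the patch pulled out into a helper. The name matches_fast_path is hypothetical and not part of the patch, and the sketch assumes the argument vector is the one cmd_checkout() receives, so argv[0] is "checkout".

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper that repeats the condition used in the patch.
 * Assumes argv is what cmd_checkout() sees: argv[0] is "checkout".
 */
int matches_fast_path(int argc, const char **argv)
{
	assert(argc >= 1 && argv != NULL);

	/* Only the exact three-element form "checkout -b <branch>" matches. */
	return argc == 3 && !strcmp(argv[1], "-b");
}
```

With { "checkout", "-b", "topic" } the helper returns 1. With { "checkout", "-btopic" } argc is 2, so it returns 0 and the command would simply take the existing, slower path.
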
On 29/08/19 04:07PM, Derrick Stolee wrote:
> On 8/29/2019 2:54 PM, Phillip Wood wrote:
> > Hi Stolee
> >
> > On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
> >> +
> >> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> >> +		/*
> >> +		 * User ran 'git checkout -b <branch>' and expects
> >
> > What if the user ran 'git checkout -b<branch>'? Then argc == 2.
>
> Good catch. I'm tempted to say "don't do that" to keep this
> simple. They won't have incorrect results, just slower than
> the "with space" option.
>
> However, if there is enough interest in correcting the "-b<branch>"
> case, then I can make another attempt at this.

You can probably do this with:

!strncmp(argv[1], "-b", 2)

The difference is so little, might as well do it IMO.

On 30/08/19 02:00AM, Pratyush Yadav wrote:
> On 29/08/19 04:07PM, Derrick Stolee wrote:
> > On 8/29/2019 2:54 PM, Phillip Wood wrote:
> > > Hi Stolee
> > >
> > > On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
> > >> +
> > >> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> > >> +		/*
> > >> +		 * User ran 'git checkout -b <branch>' and expects
> > >
> > > What if the user ran 'git checkout -b<branch>'? Then argc == 2.
> >
> > Good catch. I'm tempted to say "don't do that" to keep this
> > simple. They won't have incorrect results, just slower than
> > the "with space" option.
> >
> > However, if there is enough interest in correcting the "-b<branch>"
> > case, then I can make another attempt at this.
>
> You can probably do this with:
>
> !strncmp(argv[1], "-b", 2)
>
> The difference is so little, might as well do it IMO.

Actually, that is not correct. I took a quick look before writing this
and missed the fact that argc == 3 is the bigger problem.

Thinking a little more about this, you can mix other options with
checkout -b, like --track. You can also specify <start_point>.

Now I don't know enough about this optimization you are doing to know
whether we need to optimize when these options are given, but at least
for --track I don't see any reason not to.

So maybe you are better off using something like getopt() (warning:
getopt modifies the input string so you probably want to duplicate it)
if you want to support all cases. Though for this simple case you can
probably get away by just directly scanning the argv list for "-b"
(using strncmp instead of strcmp to account for "-b<branch-name>").

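The argv scan suggested in this message could look roughly like the sketch below. The helper name has_new_branch_flag is hypothetical. It only reports whether some argument starts with "-b"; it does not decide whether the other arguments (such as --track or <start_point>) make the shortcut appropriate, which the message itself leaves open.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch of the suggested scan: report whether any argument
 * after argv[0] begins with "-b", which covers both "-b <branch>" and the
 * stuck form "-b<branch>". It says nothing about the remaining arguments.
 */
int has_new_branch_flag(int argc, const char **argv)
{
	int i;

	assert(argc >= 1 && argv != NULL);

	for (i = 1; i < argc; i++) {
		if (!strncmp(argv[i], "-b", 2))
			return 1;
	}
	return 0;
}
```

Because the scan looks at every position, it also returns 1 for an invocation such as { "checkout", "--track", "-b", "topic", "origin/main" }, so a caller would still need a separate rule for which extra arguments are acceptable.
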
On Thu, Aug 29, 2019 at 2:42 PM Pratyush Yadav <me@yadavpratyush.com> wrote:
>
> On 30/08/19 02:00AM, Pratyush Yadav wrote:
> > On 29/08/19 04:07PM, Derrick Stolee wrote:
> > > On 8/29/2019 2:54 PM, Phillip Wood wrote:
> > > > Hi Stolee
> > > >
> > > > On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
> > > >> +
> > > >> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> > > >> +		/*
> > > >> +		 * User ran 'git checkout -b <branch>' and expects
> > > >
> > > > What if the user ran 'git checkout -b<branch>'? Then argc == 2.
> > >
> > > Good catch. I'm tempted to say "don't do that" to keep this
> > > simple. They won't have incorrect results, just slower than
> > > the "with space" option.
> > >
> > > However, if there is enough interest in correcting the "-b<branch>"
> > > case, then I can make another attempt at this.
> >
> > You can probably do this with:
> >
> > !strncmp(argv[1], "-b", 2)
> >
> > The difference is so little, might as well do it IMO.
>
> Actually, that is not correct. I took a quick look before writing this
> and missed the fact that argc == 3 is the bigger problem.
>
> Thinking a little more about this, you can mix other options with
> checkout -b, like --track. You can also specify <start_point>.
>
> Now I don't know enough about this optimization you are doing to know
> whether we need to optimize when these options are given, but at least
> for --track I don't see any reason not to.
>
> So maybe you are better off using something like getopt() (warning:
> getopt modifies the input string so you probably want to duplicate it)
> if you want to support all cases. Though for this simple case you can
> probably get away by just directly scanning the argv list for "-b"
> (using strncmp instead of strcmp to account for "-b<branch-name>").

NO. This would be unsafe to use if <start_point> is specified. I
think either -f or -m together with -b make no sense unless
<start_point> is specified, but if they do make sense separately, I'm
guessing this hack should not be used with those flags. And
additional flags may appear in the future that should not be used
together with this hack.

Personally, although I understand the desire to support any possible
cases in general, *this is a performance hack*. As such, it should be
as simple and localized as possible. I don't think supporting
old-style stuck flags (-b$BRANCH) is worth complicating this. I'm
even leery of adding support for --track (do any users of huge repos
use -b with --track? Does anyone at all use --track anymore? I'm not
sure I've ever seen any user use that flag in the last 10 years other
than myself.) Besides, in the *worst* possible case, the command the
user specifies works just fine...it just takes a little longer. My
opinion is that Stolee's patch is perfect as-is and should not be
generalized at all.

Just my $0.02,
Elijah

Hi Elijah,

On Thu, Aug 29, 2019 at 05:19:44PM -0700, Elijah Newren wrote:
> On Thu, Aug 29, 2019 at 2:42 PM Pratyush Yadav <me@yadavpratyush.com> wrote:
> >
> > On 30/08/19 02:00AM, Pratyush Yadav wrote:
> > > On 29/08/19 04:07PM, Derrick Stolee wrote:
> > > > On 8/29/2019 2:54 PM, Phillip Wood wrote:
> > > > > Hi Stolee
> > > > >
> > > > > On 29/08/2019 18:01, Derrick Stolee via GitGitGadget wrote:
> > > > >> +
> > > > >> +	if (argc == 3 && !strcmp(argv[1], "-b")) {
> > > > >> +		/*
> > > > >> +		 * User ran 'git checkout -b <branch>' and expects
> > > > >
> > > > > What if the user ran 'git checkout -b<branch>'? Then argc == 2.
> > > >
> > > > Good catch. I'm tempted to say "don't do that" to keep this
> > > > simple. They won't have incorrect results, just slower than
> > > > the "with space" option.
> > > >
> > > > However, if there is enough interest in correcting the "-b<branch>"
> > > > case, then I can make another attempt at this.
> > >
> > > You can probably do this with:
> > >
> > > !strncmp(argv[1], "-b", 2)
> > >
> > > The difference is so little, might as well do it IMO.
> >
> > Actually, that is not correct. I took a quick look before writing this
> > and missed the fact that argc == 3 is the bigger problem.
> >
> > Thinking a little more about this, you can mix other options with
> > checkout -b, like --track. You can also specify <start_point>.
> >
> > Now I don't know enough about this optimization you are doing to know
> > whether we need to optimize when these options are given, but at least
> > for --track I don't see any reason not to.
> >
> > So maybe you are better off using something like getopt() (warning:
> > getopt modifies the input string so you probably want to duplicate it)
> > if you want to support all cases. Though for this simple case you can
> > probably get away by just directly scanning the argv list for "-b"
> > (using strncmp instead of strcmp to account for "-b<branch-name>").
>
> NO. This would be unsafe to use if <start_point> is specified. I
> think either -f or -m together with -b make no sense unless
> <start_point> is specified, but if they do make sense separately, I'm
> guessing this hack should not be used with those flags. And
> additional flags may appear in the future that should not be used
> together with this hack.
>
> Personally, although I understand the desire to support any possible
> cases in general, *this is a performance hack*. As such, it should be
> as simple and localized as possible. I don't think supporting
> old-style stuck flags (-b$BRANCH) is worth complicating this. I'm
> even leery of adding support for --track (do any users of huge repos
> use -b with --track? Does anyone at all use --track anymore? I'm not
> sure I've ever seen any user use that flag in the last 10 years other
> than myself.) Besides, in the *worst* possible case, the command the
> user specifies works just fine...it just takes a little longer. My
> opinion is that Stolee's patch is perfect as-is and should not be
> generalized at all.

I wholeheartedly agree with this, and pledge my $.02 towards it as well.
Now with a combined total of $.04, I think that this patch is ready for
queueing as-is.

> Just my $0.02,
> Elijah

Thanks,
Taylor

On 8/29/2019 8:43 PM, Taylor Blau wrote:
> Hi Elijah,
>
> On Thu, Aug 29, 2019 at 05:19:44PM -0700, Elijah Newren wrote:
>> Personally, although I understand the desire to support any possible
>> cases in general, *this is a performance hack*. As such, it should be
>> as simple and localized as possible. I don't think supporting
>> old-style stuck flags (-b$BRANCH) is worth complicating this. I'm
>> even leery of adding support for --track (do any users of huge repos
>> use -b with --track? Does anyone at all use --track anymore? I'm not
>> sure I've ever seen any user use that flag in the last 10 years other
>> than myself.) Besides, in the *worst* possible case, the command the
>> user specifies works just fine...it just takes a little longer. My
>> opinion is that Stolee's patch is perfect as-is and should not be
>> generalized at all.
>
> I wholeheartedly agree with this, and pledge my $.02 towards it as well.
> Now with a combined total of $.04, I think that this patch is ready for
> queueing as-is.

Thanks, both!

Taylor Blau <me@ttaylorr.com> writes:

> I wholeheartedly agree with this, and pledge my $.02 towards it as well.
> Now with a combined total of $.04, I think that this patch is ready for
> queueing as-is.

;-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 6123f732a2..116200cf90 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1713,6 +1713,15 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 	opts.overlay_mode = -1;
 	opts.checkout_index = -2; /* default on */
 	opts.checkout_worktree = -2; /* default on */
+
+	if (argc == 3 && !strcmp(argv[1], "-b")) {
+		/*
+		 * User ran 'git checkout -b <branch>' and expects
+		 * the same behavior as 'git switch -c <branch>'.
+		 */
+		opts.switch_branch_doing_nothing_is_ok = 0;
+		opts.only_merge_on_switching_branches = 1;
+	}
 
 	options = parse_options_dup(checkout_options);
 	options = add_common_options(&opts, options);
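
For readers who want to try the effect of the hunk outside of Git, the following is a standalone model, not the real code. The struct branch_flags and the function init_checkout_flags are hypothetical stand-ins for the two checkout_opts members and the setup in cmd_checkout(). The default values 1 and 0 are an assumption, inferred from the commit message's statement that checkout and switch start with opposite values.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the two checkout_opts members named in the patch. */
struct branch_flags {
	int switch_branch_doing_nothing_is_ok;
	int only_merge_on_switching_branches;
};

/*
 * Model of the flag setup at the top of cmd_checkout(). The defaults
 * (1 and 0) are assumed, not taken from the diff.
 */
void init_checkout_flags(struct branch_flags *flags, int argc, const char **argv)
{
	assert(flags != NULL && argv != NULL && argc >= 1);

	flags->switch_branch_doing_nothing_is_ok = 1;
	flags->only_merge_on_switching_branches = 0;

	/* The patch: exactly "checkout -b <branch>" gets the 'git switch' values. */
	if (argc == 3 && !strcmp(argv[1], "-b")) {
		flags->switch_branch_doing_nothing_is_ok = 0;
		flags->only_merge_on_switching_branches = 1;
	}
}
```

Any other shape of command line, for example one with a <start_point> appended, leaves the assumed defaults in place, which corresponds to the point made in the thread that unmatched invocations still work and only miss the speedup.
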