diff mbox series

[4/5] rebase -i: don't fork git checkout

Message ID 39ad40c9297531a2d42b7263a1d41b1ecbc23c0a.1631108472.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series rebase -i: a couple of small improvements | expand

Commit Message

Phillip Wood Sept. 8, 2021, 1:41 p.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

The "apply" based rebase has avoided forking git checkout since
ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
2018-08-07). The code that handles the checkout was moved into libgit
by b309a97108 ("reset: extract reset_head() from rebase", 2020-04-07)
so lets start using it for the "merge" based rebase as well. This
opens the way for us to stop calling the post-checkout hook in the
future.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 sequencer.c | 29 ++++-------------------------
 1 file changed, 4 insertions(+), 25 deletions(-)

Comments

Philippe Blain Sept. 8, 2021, 6:14 p.m. UTC | #1
Hi Phillip,

Le 2021-09-08 à 09:41, Phillip Wood via GitGitGadget a écrit :
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> The "apply" based rebase has avoided forking git checkout since
> ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
> 2018-08-07). The code that handles the checkout was moved into libgit
> by b309a97108 ("reset: extract reset_head() from rebase", 2020-04-07)
> so lets start using it for the "merge" based rebase as well. This
> opens the way for us to stop calling the post-checkout hook in the
> future.
> 

While in general I think it's a good thing to avoid forking, this change
might result in behavioral differences. Any config that affects
'git checkout' but not the internal 'reset.c::reset_head' function might
play a role in the rebase UX.

One that immediately came to mind is 'submodule.recurse'. This initial 'onto'
checkout was pretty much the only part of 'git rebase' that did something useful
for submodules, so it's kind of sad to see it regress. [That is, until someone
takes the time to implement 'git rebase --recurse-submodules' and makes sure *all*
code paths that touch the working tree pay attention to this flag, and that will
probably necessitate 'git merge --recurse-submodules' first because of the 'merge'
backend... as far as I'm aware it's on Emily's list [1], it's also on mine but
I don't know when I'll get the time.]

Anyway, I'm not saying that we should not do what this patch is proposing, but
I think caveats such as that should be documented in the commit message, and maybe
an audit of other configs that might results in behavioural differences should be done.

Thanks,

Philippe.

[1] https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/t/#m0229af9183a84c2367f21e82adfbd21f08aa4437
Phillip Wood Sept. 9, 2021, 10:09 a.m. UTC | #2
Hi Philippe

On 08/09/2021 19:14, Philippe Blain wrote:
> Hi Phillip,
> 
> Le 2021-09-08 à 09:41, Phillip Wood via GitGitGadget a écrit :
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> The "apply" based rebase has avoided forking git checkout since
>> ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
>> 2018-08-07). The code that handles the checkout was moved into libgit
>> by b309a97108 ("reset: extract reset_head() from rebase", 2020-04-07)
>> so lets start using it for the "merge" based rebase as well. This
>> opens the way for us to stop calling the post-checkout hook in the
>> future.
>>
> 
> While in general I think it's a good thing to avoid forking, this change
> might result in behavioral differences. Any config that affects
> 'git checkout' but not the internal 'reset.c::reset_head' function might
> play a role in the rebase UX.
> 
> One that immediately came to mind is 'submodule.recurse'. This initial 
> 'onto'
> checkout was pretty much the only part of 'git rebase' that did 
> something useful
> for submodules, so it's kind of sad to see it regress. 

Thanks for pointing that out. As a non-submodule user my question would 
be is it actually useful for the initial checkout to work that way if 
the rest of rebase (and the checkout for the am backend) ignores 
submodules? reset.c::reset_head() just uses unpack trees like checkout 
so if rebase read 'submodule.recurse' then reset_head() would work like 
'git checkout' and also 'git rebase --abort' and the "reset" command in 
the todo list would start checking out submodules. I'm reluctant to do 
that until the merge backend also handles submodules unless there is a 
good reason that such partial submodule support would help submodule users.

[That is, until
> someone
> takes the time to implement 'git rebase --recurse-submodules' and makes 
> sure *all*
> code paths that touch the working tree pay attention to this flag, and 
> that will
> probably necessitate 'git merge --recurse-submodules' first because of 
> the 'merge'
> backend... as far as I'm aware it's on Emily's list [1], it's also on 
> mine but
> I don't know when I'll get the time.]
> 
> Anyway, I'm not saying that we should not do what this patch is 
> proposing, but
> I think caveats such as that should be documented in the commit message, 

Yes I'll update the commit message. It should also mention that this 
fixes the bug reported in [1] where a failing post-checkout hook aborts 
a rebase.

Best Wishes

Phillip

[1] 
https://lore.kernel.org/git/CAHr-Uu_KeAJZrd+GzymNP47iFi+dZkvVYsWQPtzT_FQrVnWTDg@mail.gmail.com/

> and maybe
> an audit of other configs that might results in behavioural differences 
> should be done.
> 
> Thanks,
> 
> Philippe.
> 
> [1] 
> https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/t/#m0229af9183a84c2367f21e82adfbd21f08aa4437 
>
Johannes Schindelin Sept. 9, 2021, 10:53 a.m. UTC | #3
Hi Philippe,

On Wed, 8 Sep 2021, Philippe Blain wrote:

> Hi Phillip,
>
> Le 2021-09-08 à 09:41, Phillip Wood via GitGitGadget a écrit :
> > From: Phillip Wood <phillip.wood@dunelm.org.uk>
> >
> > The "apply" based rebase has avoided forking git checkout since
> > ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
> > 2018-08-07). The code that handles the checkout was moved into libgit
> > by b309a97108 ("reset: extract reset_head() from rebase", 2020-04-07)
> > so lets start using it for the "merge" based rebase as well. This
> > opens the way for us to stop calling the post-checkout hook in the
> > future.
> >
>
> While in general I think it's a good thing to avoid forking, this change
> might result in behavioral differences. Any config that affects
> 'git checkout' but not the internal 'reset.c::reset_head' function might
> play a role in the rebase UX.
>
> One that immediately came to mind is 'submodule.recurse'. This initial
> 'onto' checkout was pretty much the only part of 'git rebase' that did
> something useful for submodules, so it's kind of sad to see it regress.
> [That is, until someone takes the time to implement 'git rebase
> --recurse-submodules' and makes sure *all* code paths that touch the
> working tree pay attention to this flag, and that will probably
> necessitate 'git merge --recurse-submodules' first because of the
> 'merge' backend... as far as I'm aware it's on Emily's list [1], it's
> also on mine but I don't know when I'll get the time.]

Good point, there is more in the `git_checkout_config()` function (which
is `static` in `builtin/checkout.c`, and could easily be moved to
`checkout.[ch]`). We should probably fall back to calling that function
rather than `git_default_config()` in `rebase_config()`.

> Anyway, I'm not saying that we should not do what this patch is
> proposing, but I think caveats such as that should be documented in the
> commit message, and maybe an audit of other configs that might results
> in behavioural differences should be done.

Since this is already a bug in the `apply` backend, it would be even
better to follow-up with a fix, hint, hint, nudge, nudge ;-)

Ciao,
Dscho
Philippe Blain Sept. 9, 2021, 12:40 p.m. UTC | #4
Hi Phillip,

Le 2021-09-09 à 06:09, Phillip Wood a écrit :
> Hi Philippe
> 
> On 08/09/2021 19:14, Philippe Blain wrote:
>> Hi Phillip,
>> 
>> Le 2021-09-08 à 09:41, Phillip Wood via GitGitGadget a écrit :
>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>> 
>>> The "apply" based rebase has avoided forking git checkout since 
>>> ac7f467fef ("builtin/rebase: support running "git rebase
>>> <upstream>"", 2018-08-07). The code that handles the checkout was
>>> moved into libgit by b309a97108 ("reset: extract reset_head()
>>> from rebase", 2020-04-07) so lets start using it for the "merge"
>>> based rebase as well. This opens the way for us to stop calling
>>> the post-checkout hook in the future.
>>> 
>> 
>> While in general I think it's a good thing to avoid forking, this
>> change might result in behavioral differences. Any config that
>> affects 'git checkout' but not the internal 'reset.c::reset_head'
>> function might play a role in the rebase UX.
>> 
>> One that immediately came to mind is 'submodule.recurse'. This
>> initial 'onto' checkout was pretty much the only part of 'git
>> rebase' that did something useful for submodules, so it's kind of
>> sad to see it regress.
> 
> Thanks for pointing that out. As a non-submodule user my question
> would be is it actually useful for the initial checkout to work that
> way if the rest of rebase (and the checkout for the am backend)
> ignores submodules? reset.c::reset_head() just uses unpack trees like
> checkout so if rebase read 'submodule.recurse' then reset_head()
> would work like 'git checkout' and also 'git rebase --abort' and the
> "reset" command in the todo list would start checking out submodules.
> I'm reluctant to do that until the merge backend also handles
> submodules unless there is a good reason that such partial submodule
> support would help submodule users.

Yeah, it's not that useful, I have to admit; it can also be very confusing
since some parts of rebase are affected, and some not. For example, any time
the rebase stops, like for 'edit', 'break', and when there are conflicts, the
submodules are not updated. So I think a full solution is better than a partial
solution; in the meantime I'm thinking the change you are proposing would actually
be less confusing, even if it slightly changes behaviour...

As an aside, I *think* reading submodule.recurse in rebase like it's done in checkout
et al., i.e. something like this:

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 33e0961900..125ec907e4 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -26,6 +26,7 @@
  #include "rerere.h"
  #include "branch.h"
  #include "sequencer.h"
+#include "submodule.h"
  #include "rebase-interactive.h"
  #include "reset.h"
  
@@ -1106,6 +1107,9 @@ static int rebase_config(const char *var, const char *value, void *data)
  		return git_config_string(&opts->default_backend, var, value);
  	}
  
+	if (!strcmp(var, "submodule.recurse"))
+		return git_default_submodule_config(var, value, data);
+
  	return git_default_config(var, value, data);
  }
  

would actually also affect the merges
performed during the rebase, since that would affect the "global" state in submodule.c.
I hacked exactly that the other day but did not test extensively...

Cheers,
Philippe.
Philippe Blain Sept. 9, 2021, 12:44 p.m. UTC | #5
Hi Dscho,

Le 2021-09-09 à 06:53, Johannes Schindelin a écrit :
> Hi Philippe,
> 
> On Wed, 8 Sep 2021, Philippe Blain wrote:
> 
>> Anyway, I'm not saying that we should not do what this patch is
>> proposing, but I think caveats such as that should be documented in the
>> commit message, and maybe an audit of other configs that might results
>> in behavioural differences should be done.
> 
> Since this is already a bug in the `apply` backend, it would be even
> better to follow-up with a fix, hint, hint, nudge, nudge ;-)
> 

I'm not sure I understand what you are saying. The fact that 'rebase'
does not pay attention to 'submodule.recurse' is not a bug in my opinion,
it's just a limitation of the current code... Or do you mean something else?

Thanks,
Philippe.
Phillip Wood Sept. 9, 2021, 1:57 p.m. UTC | #6
Hi Philippe

On 09/09/2021 13:40, Philippe Blain wrote:
>>> While in general I think it's a good thing to avoid forking, this
>>> change might result in behavioral differences. Any config that
>>> affects 'git checkout' but not the internal 'reset.c::reset_head'
>>> function might play a role in the rebase UX.
>>>
>>> One that immediately came to mind is 'submodule.recurse'. This
>>> initial 'onto' checkout was pretty much the only part of 'git
>>> rebase' that did something useful for submodules, so it's kind of
>>> sad to see it regress.
>>
>> Thanks for pointing that out. As a non-submodule user my question
>> would be is it actually useful for the initial checkout to work that
>> way if the rest of rebase (and the checkout for the am backend)
>> ignores submodules? reset.c::reset_head() just uses unpack trees like
>> checkout so if rebase read 'submodule.recurse' then reset_head()
>> would work like 'git checkout' and also 'git rebase --abort' and the
>> "reset" command in the todo list would start checking out submodules.

it would also affect fast-forwards

>> I'm reluctant to do that until the merge backend also handles
>> submodules unless there is a good reason that such partial submodule
>> support would help submodule users.
> 
> Yeah, it's not that useful, I have to admit; it can also be very confusing
> since some parts of rebase are affected, and some not. For example, any 
> time
> the rebase stops, like for 'edit', 'break', and when there are 
> conflicts, the
> submodules are not updated. So I think a full solution is better than a 
> partial
> solution; in the meantime I'm thinking the change you are proposing 
> would actually
> be less confusing, even if it slightly changes behaviour...
> 
> As an aside, I *think* reading submodule.recurse in rebase like it's 
> done in checkout
> et al., i.e. something like this:
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 33e0961900..125ec907e4 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -26,6 +26,7 @@
>   #include "rerere.h"
>   #include "branch.h"
>   #include "sequencer.h"
> +#include "submodule.h"
>   #include "rebase-interactive.h"
>   #include "reset.h"
> 
> @@ -1106,6 +1107,9 @@ static int rebase_config(const char *var, const 
> char *value, void *data)
>           return git_config_string(&opts->default_backend, var, value);
>       }
> 
> +    if (!strcmp(var, "submodule.recurse"))
> +        return git_default_submodule_config(var, value, data);

That looks about right to me though I think it would be safer to call 
git_default_submodule_config() for submodule.* rather than just 
submodule.recurse

>       return git_default_config(var, value, data);
>   }
> 
> 
> would actually also affect the merges
> performed during the rebase, since that would affect the "global" state 
> in submodule.c.
> I hacked exactly that the other day but did not test extensively...

merge-ort.c:checkout() which is used by merge_switch_to_result() uses 
unpack_trees() so it will pick up the global state and hopefully should 
just work (cc'ing Elijah to check as I didn't look what happens when 
there are conflicts). merge-recursive.c:update_file_flags() does this 
when updating the work tree

        if (S_ISGITLINK(contents->mode)) { 

                 /* 

                  * We may later decide to recursively descend into 

                  * the submodule directory and update its index 

                  * and/or work tree, but we do not do that now. 

                  */ 

                 update_wd = 0; 

                 goto update_index; 

        } 

 

so it looks like it does not update the submodule directory. Given 
merge-ort is now the default perhaps it's time for rebase (and 
cherry-pick/revert) to start reading the submodule config settings (we 
parse the config before we know if we'll be using merge-ort so I don't 
know how easy it would be to only parse the submodule settings if we are 
using merge-ort).

Best Wishes

Phillip
Elijah Newren Sept. 9, 2021, 3:01 p.m. UTC | #7
Hi,

On Thu, Sep 9, 2021 at 6:57 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Philippe
>
> On 09/09/2021 13:40, Philippe Blain wrote:
> >>> While in general I think it's a good thing to avoid forking, this
> >>> change might result in behavioral differences. Any config that
> >>> affects 'git checkout' but not the internal 'reset.c::reset_head'
> >>> function might play a role in the rebase UX.
> >>>
> >>> One that immediately came to mind is 'submodule.recurse'. This
> >>> initial 'onto' checkout was pretty much the only part of 'git
> >>> rebase' that did something useful for submodules, so it's kind of
> >>> sad to see it regress.
> >>
> >> Thanks for pointing that out. As a non-submodule user my question
> >> would be is it actually useful for the initial checkout to work that
> >> way if the rest of rebase (and the checkout for the am backend)
> >> ignores submodules? reset.c::reset_head() just uses unpack trees like
> >> checkout so if rebase read 'submodule.recurse' then reset_head()
> >> would work like 'git checkout' and also 'git rebase --abort' and the
> >> "reset" command in the todo list would start checking out submodules.
>
> it would also affect fast-forwards
>
> >> I'm reluctant to do that until the merge backend also handles
> >> submodules unless there is a good reason that such partial submodule
> >> support would help submodule users.
> >
> > Yeah, it's not that useful, I have to admit; it can also be very confusing
> > since some parts of rebase are affected, and some not. For example, any
> > time
> > the rebase stops, like for 'edit', 'break', and when there are
> > conflicts, the
> > submodules are not updated. So I think a full solution is better than a
> > partial
> > solution; in the meantime I'm thinking the change you are proposing
> > would actually
> > be less confusing, even if it slightly changes behaviour...
> >
> > As an aside, I *think* reading submodule.recurse in rebase like it's
> > done in checkout
> > et al., i.e. something like this:
> >
> > diff --git a/builtin/rebase.c b/builtin/rebase.c
> > index 33e0961900..125ec907e4 100644
> > --- a/builtin/rebase.c
> > +++ b/builtin/rebase.c
> > @@ -26,6 +26,7 @@
> >   #include "rerere.h"
> >   #include "branch.h"
> >   #include "sequencer.h"
> > +#include "submodule.h"
> >   #include "rebase-interactive.h"
> >   #include "reset.h"
> >
> > @@ -1106,6 +1107,9 @@ static int rebase_config(const char *var, const
> > char *value, void *data)
> >           return git_config_string(&opts->default_backend, var, value);
> >       }
> >
> > +    if (!strcmp(var, "submodule.recurse"))
> > +        return git_default_submodule_config(var, value, data);
>
> That looks about right to me though I think it would be safer to call
> git_default_submodule_config() for submodule.* rather than just
> submodule.recurse
>
> >       return git_default_config(var, value, data);
> >   }
> >
> >
> > would actually also affect the merges
> > performed during the rebase, since that would affect the "global" state
> > in submodule.c.
> > I hacked exactly that the other day but did not test extensively...
>
> merge-ort.c:checkout() which is used by merge_switch_to_result() uses
> unpack_trees() so it will pick up the global state and hopefully should
> just work (cc'ing Elijah to check as I didn't look what happens when
> there are conflicts).

Yep, merge-ort was designed to just piggy back on checkout code.  The
checkout() function was basically just code lifted from
builtin/checkout.c.  Using that code means that merges now also
benefit from all the special working tree handling that is encoded
into git-checkout -- whether that's parallel checkout, submodule
handling, tricky D/F switches or symlink handling, etc.  In contrast
to merge-recursive, it does not need hundreds and hundreds of lines of
special worktree updating code sprayed all over the codebase.

Conflicts are not special in this regard; merge-ort creates a tree
which has files that include conflict markers, and then merge-ort
calls checkout() to switch the working copy over to that tree.

The only issue conflicts present for merge-ort, is that AFTER it has
checked out that special tree with conflict markers, it then has to go
and touch up the index afterwards to replace the entries for
conflicted files with multiple higher order stages.  (You could say
that merge-recursive is "index-first", since its design focuses on the
index -- updating it first and then figuring out everything else like
updating the working tree with special code afterwards.  In contrast,
merge-ort ignores the index entirely until the very end -- after a new
merge tree is created and after the working tree is updated.)

> merge-recursive.c:update_file_flags() does this
> when updating the work tree
>
>         if (S_ISGITLINK(contents->mode)) {
>
>                  /*
>
>                   * We may later decide to recursively descend into
>
>                   * the submodule directory and update its index
>
>                   * and/or work tree, but we do not do that now.
>
>                   */
>
>                  update_wd = 0;
>
>                  goto update_index;
>
>         }
>
>
>
> so it looks like it does not update the submodule directory. Given
> merge-ort is now the default perhaps it's time for rebase (and
> cherry-pick/revert) to start reading the submodule config settings (we
> parse the config before we know if we'll be using merge-ort so I don't
> know how easy it would be to only parse the submodule settings if we are
> using merge-ort).

I'd just parse any needed config in all cases.  The submodule settings
aren't going to hurt merge-recursive; it'll just ignore them.  (Or are
you worried about a mix-and-match of rebase calling both checkout and
merge code doing weird things, and you'd rather not have the checkout
bits update submodules if the merges won't?)
Elijah Newren Sept. 9, 2021, 3:03 p.m. UTC | #8
On Wed, Sep 8, 2021 at 6:44 AM Phillip Wood via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> The "apply" based rebase has avoided forking git checkout since
> ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
> 2018-08-07). The code that handles the checkout was moved into libgit
> by b309a97108 ("reset: extract reset_head() from rebase", 2020-04-07)
> so lets start using it for the "merge" based rebase as well. This
> opens the way for us to stop calling the post-checkout hook in the
> future.

Wahoo!  So exciting to see this, and for the future plans mentioned here.  :-)

> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>  sequencer.c | 29 ++++-------------------------
>  1 file changed, 4 insertions(+), 25 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index d51440ddcd9..1a9dbc70d3c 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4192,42 +4192,21 @@ int apply_autostash_oid(const char *stash_oid)
>         return apply_save_autostash_oid(stash_oid, 1);
>  }
>
> -static int run_git_checkout(struct repository *r, struct replay_opts *opts,
> -                           const char *commit, const char *action)
> -{
> -       struct child_process cmd = CHILD_PROCESS_INIT;
> -       int ret;
> -
> -       cmd.git_cmd = 1;
> -
> -       strvec_push(&cmd.args, "checkout");
> -       strvec_push(&cmd.args, commit);
> -       strvec_pushf(&cmd.env_array, GIT_REFLOG_ACTION "=%s", action);
> -
> -       if (opts->verbose)
> -               ret = run_command(&cmd);
> -       else
> -               ret = run_command_silent_on_success(&cmd);
> -
> -       if (!ret)
> -               discard_index(r->index);
> -
> -       return ret;
> -}
> -
>  static int checkout_onto(struct repository *r, struct replay_opts *opts,
>                          const char *onto_name, const struct object_id *onto,
>                          const struct object_id *orig_head)
>  {
>         const char *action = reflog_message(opts, "start", "checkout %s", onto_name);
>
> -       if (run_git_checkout(r, opts, oid_to_hex(onto), action)) {
> +       if (reset_head(r, onto, "checkout", NULL, RESET_HEAD_DETACH |
> +                      RESET_ORIG_HEAD | RESET_HEAD_RUN_POST_CHECKOUT_HOOK,
> +                      NULL, action, "rebase")) {
>                 apply_autostash(rebase_path_autostash());
>                 sequencer_remove_state(opts);
>                 return error(_("could not detach HEAD"));
>         }
>
> -       return update_ref(NULL, "ORIG_HEAD", orig_head, NULL, 0, UPDATE_REFS_MSG_ON_ERR);
> +       return 0;
>  }
>
>  static int stopped_at_head(struct repository *r)
> --
> gitgitgadget
Johannes Schindelin Sept. 9, 2021, 9:43 p.m. UTC | #9
Hi Philippe,

On Thu, 9 Sep 2021, Philippe Blain wrote:

> Le 2021-09-09 à 06:53, Johannes Schindelin a écrit :
> >
> > On Wed, 8 Sep 2021, Philippe Blain wrote:
> >
> > > Anyway, I'm not saying that we should not do what this patch is
> > > proposing, but I think caveats such as that should be documented in the
> > > commit message, and maybe an audit of other configs that might results
> > > in behavioural differences should be done.
> >
> > Since this is already a bug in the `apply` backend, it would be even
> > better to follow-up with a fix, hint, hint, nudge, nudge ;-)
>
> I'm not sure I understand what you are saying.

I am saying that indeed, you found what I consider a bug, but it is
already present in the `apply` backend. And then I am hoping that you
could find the time to fix it ;-)

> The fact that 'rebase' does not pay attention to 'submodule.recurse' is
> not a bug in my opinion, it's just a limitation of the current code...

But the code that spawned `git checkout` _did_ pay attention to the
`submodule.*` settings, no?

Ciao,
Dscho

> Or do you mean something else?
>
> Thanks,
> Philippe.
>
>
Johannes Schindelin Sept. 10, 2021, 10:46 a.m. UTC | #10
Hi Philippe,

On Thu, 9 Sep 2021, Philippe Blain wrote:

> Le 2021-09-09 à 06:53, Johannes Schindelin a écrit :
> >
> > On Wed, 8 Sep 2021, Philippe Blain wrote:
> >
> > > Anyway, I'm not saying that we should not do what this patch is
> > > proposing, but I think caveats such as that should be documented in the
> > > commit message, and maybe an audit of other configs that might results
> > > in behavioural differences should be done.
> >
> > Since this is already a bug in the `apply` backend, it would be even
> > better to follow-up with a fix, hint, hint, nudge, nudge ;-)
> >
>
> I'm not sure I understand what you are saying. The fact that 'rebase'
> does not pay attention to 'submodule.recurse' is not a bug in my opinion,
> it's just a limitation of the current code... Or do you mean something else?

I must have misunderstood you. I thought you were saying that Phillip's
patch introduces the regression where `submodule.recurse` is no longer
respected.

Ciao,
Dscho
Philippe Blain Sept. 10, 2021, 11:58 a.m. UTC | #11
Hi Dscho,

Le 2021-09-10 à 06:46, Johannes Schindelin a écrit :
> Hi Philippe,
> 
> On Thu, 9 Sep 2021, Philippe Blain wrote:
> 
>> Le 2021-09-09 à 06:53, Johannes Schindelin a écrit :
>>>
>>> On Wed, 8 Sep 2021, Philippe Blain wrote:
>>>
>>>> Anyway, I'm not saying that we should not do what this patch is
>>>> proposing, but I think caveats such as that should be documented in the
>>>> commit message, and maybe an audit of other configs that might results
>>>> in behavioural differences should be done.
>>>
>>> Since this is already a bug in the `apply` backend, it would be even
>>> better to follow-up with a fix, hint, hint, nudge, nudge ;-)
>>>
>>
>> I'm not sure I understand what you are saying. The fact that 'rebase'
>> does not pay attention to 'submodule.recurse' is not a bug in my opinion,
>> it's just a limitation of the current code... Or do you mean something else?
> 
> I must have misunderstood you. I thought you were saying that Phillip's
> patch introduces the regression where `submodule.recurse` is no longer
> respected.
> 
> Ciao,
> Dscho
> 

Well it does, but only for the initial checkout of 'onto'. But as I wrote in
[1], I think that half respecting 'submodule.recurse' is confusing UX. So
I would think that it is not *that* bad to keep Phillip's patch as-is in the meantime
i.e. while we wait for 'git rebase --recurse-submodules' to materialize. I think
that it would not be good UI either to have rebase respect 'submodule.recurse',
but not having a --recurse-submodules flag like all other commands that honor
that config.

Thanks,
Philippe.

P.S. I did not CC you in [1], should I have? What's the etiquette around this,
i.e. should I manually add CC's for people involved in "parallel" threads in
addition to everyone I get by doing "reply-all" ?

1. https://lore.kernel.org/git/pull.1034.git.1631108472.gitgitgadget@gmail.com/T/#m182b8d5f24b41c2ff8e919819229974d71258cd9
Philippe Blain Sept. 10, 2021, 12:07 p.m. UTC | #12
Hi Elijah,

Le 2021-09-09 à 11:01, Elijah Newren a écrit :
> Hi,
> 
> On Thu, Sep 9, 2021 at 6:57 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
<snip>
>> merge-recursive.c:update_file_flags() does this
>> when updating the work tree
>>
>>          if (S_ISGITLINK(contents->mode)) {
>>
>>                   /*
>>
>>                    * We may later decide to recursively descend into
>>
>>                    * the submodule directory and update its index
>>
>>                    * and/or work tree, but we do not do that now.
>>
>>                    */
>>
>>                   update_wd = 0;
>>
>>                   goto update_index;
>>
>>          }
>>
>>
>>
>> so it looks like it does not update the submodule directory. Given
>> merge-ort is now the default perhaps it's time for rebase (and
>> cherry-pick/revert) to start reading the submodule config settings (we
>> parse the config before we know if we'll be using merge-ort so I don't
>> know how easy it would be to only parse the submodule settings if we are
>> using merge-ort).
> 
> I'd just parse any needed config in all cases.  The submodule settings
> aren't going to hurt merge-recursive; it'll just ignore them.  (Or are
> you worried about a mix-and-match of rebase calling both checkout and
> merge code doing weird things, and you'd rather not have the checkout
> bits update submodules if the merges won't?)
> 

Thanks for your input. I agree that reading the config in all cases would
be simpler. We could even decide that since ort is the new default, the
submodule support will not be "backported" to merge recursive (that would
be way simpler to implement, I think) This way we can just document it as
such and be done with it. But anyway, I think this is kind of
orthogonal to this here series and should be done separately.

Cheers,
Philippe.
Phillip Wood Sept. 15, 2021, 3:44 p.m. UTC | #13
Hi Elijah

On 09/09/2021 16:01, Elijah Newren wrote:
> [...]
>> merge-ort.c:checkout() which is used by merge_switch_to_result() uses
>> unpack_trees() so it will pick up the global state and hopefully should
>> just work (cc'ing Elijah to check as I didn't look what happens when
>> there are conflicts).
> 
> Yep, merge-ort was designed to just piggy back on checkout code.  The
> checkout() function was basically just code lifted from
> builtin/checkout.c.  Using that code means that merges now also
> benefit from all the special working tree handling that is encoded
> into git-checkout -- whether that's parallel checkout, submodule
> handling, tricky D/F switches or symlink handling, etc.  In contrast
> to merge-recursive, it does not need hundreds and hundreds of lines of
> special worktree updating code sprayed all over the codebase.
> 
> Conflicts are not special in this regard; merge-ort creates a tree
> which has files that include conflict markers, and then merge-ort
> calls checkout() to switch the working copy over to that tree.
> 
> The only issue conflicts present for merge-ort, is that AFTER it has
> checked out that special tree with conflict markers, it then has to go
> and touch up the index afterwards to replace the entries for
> conflicted files with multiple higher order stages.  (You could say
> that merge-recursive is "index-first", since its design focuses on the
> index -- updating it first and then figuring out everything else like
> updating the working tree with special code afterwards.  In contrast,
> merge-ort ignores the index entirely until the very end -- after a new
> merge tree is created and after the working tree is updated.)

Thanks for explaining, it's a nice design feature that you can just reuse
the checkout code to update the working copy with the merge result

>> merge-recursive.c:update_file_flags() does this
>> when updating the work tree
>>
>>          if (S_ISGITLINK(contents->mode)) {
>>                   /*
>>                    * We may later decide to recursively descend into
>>                    * the submodule directory and update its index
>>                    * and/or work tree, but we do not do that now.
>>                    */
>>                   update_wd = 0;
>>                   goto update_index;
>>          }
>>
>> so it looks like it does not update the submodule directory. Given
>> merge-ort is now the default perhaps it's time for rebase (and
>> cherry-pick/revert) to start reading the submodule config settings (we
>> parse the config before we know if we'll be using merge-ort so I don't
>> know how easy it would be to only parse the submodule settings if we are
>> using merge-ort).
> 
> I'd just parse any needed config in all cases.  The submodule settings
> aren't going to hurt merge-recursive; it'll just ignore them.  (Or are
> you worried about a mix-and-match of rebase calling both checkout and
> merge code doing weird things, and you'd rather not have the checkout
> bits update submodules if the merges won't?)

I'd rather just parse the config when we know submodules are going to be
rebased, I think it's confusing if some bit work and others don't. I've
tried the diff below locally, but t7402-submodule-rebase.sh does not show
any change (I was hoping some text_expect_failure would be fixed) so I'm
not sure if it's working or not and I ran out of time.

Best Wishes

Phillip

--- >8 ---
diff --git a/builtin/rebase.c b/builtin/rebase.c
index eed70168df..a35a9e3460 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -26,6 +26,7 @@
  #include "rerere.h"
  #include "branch.h"
  #include "sequencer.h"
+#include "submodule.h"
  #include "rebase-interactive.h"
  #include "reset.h"
  
@@ -1114,6 +1115,10 @@ static int rebase_config(const char *var, const char *value, void *data)
                 return git_config_string(&opts->default_backend, var, value);
         }
  
+       if (starts_with(var, "submodule.")) {
+               return git_default_submodule_config(var, value, NULL);
+       }
+
         return git_default_config(var, value, data);
  }
  
@@ -1820,6 +1825,12 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
             getenv("GIT_TEST_MERGE_ALGORITHM"))
                 options.strategy = xstrdup(getenv("GIT_TEST_MERGE_ALGORITHM"));
  
+       /* only the "ort" merge strategy handles submodules correctly */
+       if (!is_merge(&options) ||
+           (options.strategy && strcmp(options.strategy, "ort")))
+               git_default_submodule_config("submodule.recurse", "false",
+                                            NULL);
+
         switch (options.type) {
         case REBASE_MERGE:
         case REBASE_PRESERVE_MERGES:
diff mbox series

Patch

diff --git a/sequencer.c b/sequencer.c
index d51440ddcd9..1a9dbc70d3c 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4192,42 +4192,21 @@  int apply_autostash_oid(const char *stash_oid)
 	return apply_save_autostash_oid(stash_oid, 1);
 }
 
-static int run_git_checkout(struct repository *r, struct replay_opts *opts,
-			    const char *commit, const char *action)
-{
-	struct child_process cmd = CHILD_PROCESS_INIT;
-	int ret;
-
-	cmd.git_cmd = 1;
-
-	strvec_push(&cmd.args, "checkout");
-	strvec_push(&cmd.args, commit);
-	strvec_pushf(&cmd.env_array, GIT_REFLOG_ACTION "=%s", action);
-
-	if (opts->verbose)
-		ret = run_command(&cmd);
-	else
-		ret = run_command_silent_on_success(&cmd);
-
-	if (!ret)
-		discard_index(r->index);
-
-	return ret;
-}
-
 static int checkout_onto(struct repository *r, struct replay_opts *opts,
 			 const char *onto_name, const struct object_id *onto,
 			 const struct object_id *orig_head)
 {
 	const char *action = reflog_message(opts, "start", "checkout %s", onto_name);
 
-	if (run_git_checkout(r, opts, oid_to_hex(onto), action)) {
+	if (reset_head(r, onto, "checkout", NULL, RESET_HEAD_DETACH |
+		       RESET_ORIG_HEAD | RESET_HEAD_RUN_POST_CHECKOUT_HOOK,
+		       NULL, action, "rebase")) {
 		apply_autostash(rebase_path_autostash());
 		sequencer_remove_state(opts);
 		return error(_("could not detach HEAD"));
 	}
 
-	return update_ref(NULL, "ORIG_HEAD", orig_head, NULL, 0, UPDATE_REFS_MSG_ON_ERR);
+	return 0;
 }
 
 static int stopped_at_head(struct repository *r)