[0/1] revert/cherry-pick: add --show-current-patch option

Message ID 20231218121048.68290-1-mi.al.lohmann@gmail.com (mailing list archive)

Message

Michael Lohmann Dec. 18, 2023, 12:10 p.m. UTC
Hi,
I am a lead developer of a small team and quite often I have to
cherry-pick commits (and sometimes also revert them). When
cherry-picking multiple commits at once and there is a merge conflict it
sometimes can be hard to understand what the current patch is trying to
do in order to resolve the conflict properly. With `rebase` there is
`--show-current-patch` and since that is quite helpful I would suggest
adding this flag to `cherry-pick` and `revert` as well.

Since this is my first contribution to git I am not exactly sure where
the best place for this functionality is. From my initial understanding
there are two options for where to put the actual invocation of `show`:
- Duplicate the code (with the needed adaptations) of builtin/rebase.c
  in builtin/revert.c
- Create a central function that shows the respective `*_HEAD` depending
  on the current `action`.

In this first draft I went with the second option, since I felt that it
reduces code duplication and the sequencer already has the action enum
with exactly those three cases. On the other hand I don’t really have a
good understanding of the role that this `sequencer` should play and if
this adds additional coupling that is unwanted. My current impression
is that this would be the right place, since it looks to be the core of
the commands with which a user can apply a sequence of commits. In my
opinion, even if additional actions were added, they could also fail,
so it would be good for them to offer the `--show-current-patch` option
as well.
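
As a rough illustration of the second option, here is a shell sketch of
the dispatch idea. It is not the C code of the patch: the function names
and the action labels `pick`, `revert` and `rebase` are made up for this
example, and only `git rev-parse --verify --quiet` and `git show` are
existing git commands.

```shell
# Hypothetical helper, not part of the patch: map an action label to the
# pseudo-ref that records the commit currently being applied.
pseudo_ref_for_action() {
    case "$1" in
        pick)   echo CHERRY_PICK_HEAD ;;
        revert) echo REVERT_HEAD ;;
        rebase) echo REBASE_HEAD ;;
        *)      echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Show the commit behind the current step. The existence check is done
# here so that every caller gets the same precondition.
show_current_patch() {
    ref=$(pseudo_ref_for_action "$1") || return 1
    if ! git rev-parse --verify --quiet "$ref" >/dev/null; then
        echo "no $1 in progress ($ref not found)" >&2
        return 1
    fi
    git show "$ref"
}
```

With this, `show_current_patch pick` during a stopped cherry-pick would
print the same output as `git show CHERRY_PICK_HEAD`.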

Side note: my only C(++) experience was ~10 years ago and only for a
single university course, so my perspective is much more from a general
architecture point of view than based on any C experience, let alone in
this code base and so I would be very grateful for criticism!


Side note: The check for the `REBASE_HEAD` would not be necessary, since
that is already taken care of in the builtin/rebase.c before.
Nevertheless I opted for this check, because I would much rather require
the same preconditions no matter from where I call this function. The
whole argument parsing / option struct are very different between rebase
and revert. Maybe it would make sense to align them a bit further?
Initial observations: `rebase_options->type` is functionally similar to
`replay_opts->action` (as in "what general action am I performing? -
interactive rebase / cherry-pick / revert / ...") whereas
`rebase_options->action` is not part of the `replay_opts` struct at all.
Instead the role is taken over in builtin/revert.c by `int cmd = 0;`.
I am preparing a patch that converts this to an enum, so that no
arbitrary chars have to be kept in sync manually in different places.
Or is the current form a deliberate design decision?

I looked through the mailing list archive and did not find anything
related on this topic. The only slightly related thread I could find was
in [1] by Elijah Newren and that one was talking about a separate
possible feature and how to get certain information if CHERRY_PICK_HEAD
and REVERT_HEAD were to be replaced by a different construct. I hope I
did not miss something...

Cheers
Michael

[1]:
https://lore.kernel.org/git/CABPp-BGd-W8T7EsvKYyjdi3=mfSTJ8zM-uzVsFnh1AWyV2wEzQ@mail.gmail.com

Michael Lohmann (1):
  revert/cherry-pick: add --show-current-patch option

 Documentation/git-cherry-pick.txt      |  2 +-
 Documentation/git-revert.txt           |  2 +-
 Documentation/sequencer.txt            |  5 +++++
 builtin/rebase.c                       |  7 ++----
 builtin/revert.c                       |  9 ++++++--
 contrib/completion/git-completion.bash |  2 +-
 sequencer.c                            | 24 +++++++++++++++++++++
 sequencer.h                            |  2 ++
 t/t3507-cherry-pick-conflict.sh        | 30 ++++++++++++++++++++++++++
 9 files changed, 73 insertions(+), 10 deletions(-)

Comments

Phillip Wood Dec. 18, 2023, 4:42 p.m. UTC | #1
Hi Michael

On 18/12/2023 12:10, Michael Lohmann wrote:
> Hi,
> I am a lead developer of a small team and quite often I have to
> cherry-pick commits (and sometimes also revert them). When
> cherry-picking multiple commits at once and there is a merge conflict it
> sometimes can be hard to understand what the current patch is trying to
> do in order to resolve the conflict properly. With `rebase` there is
> `--show-current-patch` and since that is quite helpful I would suggest
> to also add this flag also to `cherry-pick` and `revert`.

Thanks for bringing this up. I agree it can be very helpful to look at
the original commit when resolving cherry-pick and revert conflicts. I'm 
in two minds about this change though - I wonder if it'd be better to 
improve the documentation for CHERRY_PICK_HEAD and REVERT_HEAD and tell 
users to run "git show CHERRY_PICK_HEAD" instead. I think the main 
reason we have a "--show-current-patch" option for "rebase" is that 
there are two different implementations of that command and the 
patch-based one of them does not support REBASE_HEAD. That reasoning
does not apply to "cherry-pick" and "revert" and "--show-current-patch" 
suggests a patch-based implementation which is also not the case for 
these commands.

Best Wishes

Phillip

Michael Lohmann Dec. 20, 2023, 6:51 a.m. UTC | #2
Hi Phillip

On 18/12/2023 16:42, Phillip Wood wrote:
> Thanks for bringing this up I agree it can be very helpful to look at
> the original commit when resolving cherry-pick and revert conflicts.
> I'm in two minds about this change though - I wonder if it'd be better
> to improve the documentation for CHERRY_PICK_HEAD and REVERT_HEAD and
> tell users to run "git show CHERRY_PICK_HEAD" instead. I think the
> main reason we have a "--show-current-patch" option for "rebase" is
> that there are two different implementations of that command and the
> patched-based one of them does not support REBASE_HEAD. That reasoning
> does not apply to "cherry-pick" and "revert" and
> "--show-current-patch" suggests a patch-based implementation which is
> also not the case for these commands.

I appreciate the urge to limit the interface to the minimum needed
and not to duplicate functionality that already exists. On the other
hand, this would
a) give the user the same experience across commands, without having to
wonder about implementation details such as rebase having different
backends while revert/cherry-pick do not, and
b) make the feature easier to find. (I know this says more about me,
but:) when I am looking for a feature in software and open the
respective man page, I tend to focus first on the synopsis instead of
reading the whole page (or sometimes I even just rely on the shell
autocompletion for discoverability).

So yes, mentioning REVERT_HEAD and CHERRY_PICK_HEAD in the respective
docs would technically be sufficient, but I don't think it is as
discoverable to an average user (who does not know about the details of
all the existing pseudo refs) as a toplevel action would be. But
weighing the pros and cons is not for me to decide.

I have to be honest: I have trouble distinguishing between a "patch" and
a "diff". According to the documentation, `git show <commit>` shows the
latter ("For commits it shows the log message and textual diff."),
though my understanding was that a patch is a diff + context lines,
which is what `git show` actually shows... I think this is probably why
I don't feel so strongly about the potentially loose usage of the word
here.

Also the documentation of cherry-pick already uses the word "patch" in a
(according to my understanding from a technical perspective) sloppy (but
from a layman's point of view probably nevertheless helpful) way:

> The following sequence attempts to backport a patch, bails out because
> the code the patch applies to has changed too much, and then tries
> again, this time exercising more care about matching up context lines.
> 
> ------------
> $ git cherry-pick topic^             <1>
> $ git diff                           <2>
> $ git cherry-pick --abort            <3>
> $ git cherry-pick -Xpatience topic^  <4>
> ------------
> <1> apply the change that would be shown by `git show topic^`.
>     In this example, the patch does not apply cleanly, so
>     information about the conflict is written to the index and
>     working tree and no new commit results.

Should that also be rephrased?


Out of curiosity: The following from the rebase docs seems to imply that
the apply backend will probably be removed in the future:
> --apply
>           Use applying strategies to rebase (calling git-am
>           internally). This option may become a no-op in the future
>           once the merge backend handles everything the apply one
>           does.

But I would expect `rebase --show-current-patch` to keep working.
Would it then only be a legacy compatibility flag, with
`git show REBASE_HEAD` becoming the recommended option for rebases as
well?

Best Wishes

Michael
Phillip Wood Dec. 21, 2023, 4:32 p.m. UTC | #3
Hi Michael

On 20/12/2023 06:51, Michael Lohmann wrote:
> Hi Phillip
> 
> On 18/12/2023 16:42, Phillip Wood wrote:
>> Thanks for bringing this up I agree it can be very helpful to look at
>> the original commit when resolving cherry-pick and revert conflicts.

As an aside, I find it useful to do a kind of range-diff before
committing the conflict resolution. Unfortunately one cannot use "git 
range-diff" because the conflict resolution is not yet committed. 
Instead I use

     diff <(git diff CHERRY_PICK_HEAD^-) <(git diff HEAD)

In practice it is helpful to pipe the diffs through sed to delete the
"index" lines and normalize the hunk headers.

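The sed step could be sketched roughly as follows. `normalize_diff` is a
made-up name, and the two expressions are only one possible way to drop
the "index" lines and blank out the line numbers in the hunk headers.

```shell
# Hypothetical filter: remove "index" lines and strip the line numbers
# from hunk headers, so that only real differences between the two
# diffs remain when they are compared.
normalize_diff() {
    sed -e '/^index /d' \
        -e 's/^@@ -[0-9,]* +[0-9,]* @@/@@ @@/'
}
```

In a shell with process substitution (bash, for example) the comparison
then becomes

     diff <(git diff CHERRY_PICK_HEAD^- | normalize_diff) \
          <(git diff HEAD | normalize_diff)
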
>> I'm in two minds about this change though - I wonder if it'd be better
>> to improve the documentation for CHERRY_PICK_HEAD and REVERT_HEAD and
>> tell users to run "git show CHERRY_PICK_HEAD" instead. I think the
>> main reason we have a "--show-current-patch" option for "rebase" is
>> that there are two different implementations of that command and the
>> patched-based one of them does not support REBASE_HEAD. That reasoning
>> does not apply to "cherry-pick" and "revert" and
>> "--show-current-patch" suggests a patch-based implementation which is
>> also not the case for these commands.
> 
> I appreciate the urge of limiting the interface to the minimum needed
> and not to duplicate functionality that already exists. On the other
> hand, this would
> a) grant the user the same experience, not having to wonder about
> implementation details such as different backends for rebase, but not
> for revert/cherry-pick and
> b) (I know it is more indicative of me, but:) when I am looking for a
> feature in software and I look into the respective man page I tend to
> focus first on the synopsis instead of reading the whole page (or
> sometimes I even just rely on the shell autocompletion for
> discoverability).
> 
> So yes, mentioning REVERT_HEAD and CHERRY_PICK_HEAD in the respective
> docs would technically be sufficient, but I don't think it is as
> discoverable to an average user (who does not know about the details of
> all the existing pseudo refs) as a toplevel action would be. But an
> assessment of the pros and cons is not on me to decide.

To make the pseudo refs discoverable we should certainly be mentioning
them in the section about resolving conflicts. I haven't checked what 
the docs say at the moment but a worked example showing how to inspect 
the conflicts and the original changes would be helpful I think. That 
does assume that the user actually reads the section about resolving 
conflicts rather than just scanning the available command line options 
though.
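
A worked example along those lines might look like the following
sketch. `inspect_conflict` is a hypothetical helper name chosen for this
example; the git commands it calls already exist, and the wording of the
section headings is only a suggestion.

```shell
# Hypothetical helper: summarise a stopped cherry-pick or revert.
# $1 is the pseudo-ref to inspect: CHERRY_PICK_HEAD or REVERT_HEAD.
inspect_conflict() {
    ref=$1

    echo "== conflicted paths =="
    git diff --name-only --diff-filter=U

    echo "== conflicts as they stand in the working tree =="
    git diff

    echo "== original commit being applied =="
    git show "$ref"
}
```

After a cherry-pick has stopped with conflicts, running
`inspect_conflict CHERRY_PICK_HEAD` lists the unmerged files, then the
conflicts themselves, then the commit that was being picked.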

> I have to be honest: I have troubles distinguishing a "patch" and a
> "diff", the latter of which `git show <commit>` shows according to the
> documentation ("For commits it shows the log message and textual
> diff."), though my understanding was that a patch is a diff + context
> lines, which is what `git show` actually shows... I think this is
> probably why I don't feel so strong about the potential loose usage of
> the word here.

I think for the purposes of this discussion "patch" and "diff" are 
largely interchangeable (a "patch" is essentially a "diff" with a commit 
message). Maybe I'm overthinking it but the reason I'm not very keen on 
"--show-current-patch" (in addition to the "duplicate functionality" 
argument you mention above) is that cherry-pick and revert do not work 
by applying patches (or diffs) but use a 3-way merge instead. I think 
--show-current-patch first appeared as an option to "git am" which makes 
sense as that command is all about applying patches.

I'd be interested to hear what other people think about whether
"--show-current-patch" makes sense for other commands.

> Also the documentation of cherry-pick already uses the word "patch" in a
> (according to my understanding from a technical perspective) sloppy (but
> from a layman's point of view probably nevertheless helpful) way:
> 
>> The following sequence attempts to backport a patch, bails out because
>> the code the patch applies to has changed too much, and then tries
>> again, this time exercising more care about matching up context lines.
>>
>> ------------
>> $ git cherry-pick topic^             <1>
>> $ git diff                           <2>
>> $ git cherry-pick --abort            <3>
>> $ git cherry-pick -Xpatience topic^  <4>
>> ------------
>> <1> apply the change that would be shown by `git show topic^`.
>>      In this example, the patch does not apply cleanly, so
>>      information about the conflict is written to the index and
>>      working tree and no new commit results.
> 
> Should that also be rephrased?

It would certainly be more accurate for the first paragraph to say 
something like

     The following sequence tries to backport a commit. It bails out
     because the code modified by the commit has conflicting changes in
     the current branch.

The bit about exercising more care about matching up context lines is
moot these days, as the default merge strategy is "ort", which uses the
histogram diff algorithm to do just that, so commands <3> & <4> should
not be needed.

> 
> Out of curiosity: The following from the rebase docs seems to imply that
> the apply backend will probably be removed in the future:
>> --apply
>>            Use applying strategies to rebase (calling git-am
>>            internally). This option may become a no-op in the future
>>            once the merge backend handles everything the apply one
>>            does.
> 
> But I would expect the `rebase --show-current-patch` still to be
> working. Would that only be a legacy compatibility flag and instead also
> for rebases the recommended option would be to run
> `git show REBASE_HEAD`?

The long term goal is to remove the apply backend but I don't think 
anyone is actively working on it at the moment. We'd certainly need to 
keep the --show-current-patch option for backwards compatibility.

I'll be off the list for the next couple of weeks but I'll be sure to 
catch up with this thread in the New Year.

Best Wishes

Phillip