add-patch: edit the hunk again

Message ID	21ddf64f-10c2-4087-a778-0bd2e82aef42@gmail.com (mailing list archive)
State	Superseded
Headers	show Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0926225570 for <git@vger.kernel.org>; Sun, 15 Sep 2024 11:38:08 +0000 (UTC) Message-ID: <21ddf64f-10c2-4087-a778-0bd2e82aef42@gmail.com> Date: Sun, 15 Sep 2024 13:38:05 +0200 Precedence: bulk MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: =?utf-8?q?Rub=C3=A9n_Justo?= <rjusto@gmail.com> Content-Language: en-US To: Git List <git@vger.kernel.org> Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>, Phillip Wood <phillip.wood@dunelm.org.uk> Subject: [PATCH] add-patch: edit the hunk again Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	add-patch: edit the hunk again \| expand add-patch: edit the hunk again

Rubén Justo Sept. 15, 2024, 11:38 a.m. UTC

The "edit" option allows the user to directly modify the hunk to be
applied.

If the modified hunk returned is not an applicable patch, we give the
opportunity to try again.

For this new attempt we provide, again, the original hunk;  the user
has to repeat the modification from scratch.

Instead, let's give them the faulty modified patch back, so they can
identify and fix the problem.

If they really want to start over with a fresh patch they still can
say 'no' to cancel the "edit" and start anew [*].

    * In the old script-based version of "add -p", this "no" meant
      discarding the current hunk and moving on to the next one.

      This changed, presumably unintentionally, during the conversion
      to C in bcdd297b78 (built-in add -p: implement hunk editing,
      2019-12-13).

      Now makes perfect sense not to move to the next hunk when the
      user requests to discard their edits.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---

The message "saying 'no' discards!" comes from ac083c47ea
(git-add--interactive: manual hunk editing mode, 2008-07-03).

I think it was referring to discarding user modifications, not the
current hunk; which is what we were doing then (and now regaining).

However, we stopped behaving that way in 2b8ea7f3c7 (add -p:
calculate offset delta for edited patches, 2018-03-05), perhaps for
some reason I'm missing.

Therefore, this patch also modifies what we did, possibly
unintentionally, in 2b8ea7f3c7.

Thanks.


 add-patch.c                | 26 ++++++++++++++++----------
 t/t3701-add-interactive.sh | 14 ++++++++++++++
 2 files changed, 30 insertions(+), 10 deletions(-)

Phillip Wood Sept. 16, 2024, 1:33 p.m. UTC | #1

Hi Rubén

On 15/09/2024 12:38, Rubén Justo wrote:
> The "edit" option allows the user to directly modify the hunk to be
> applied.
> 
> If the modified hunk returned is not an applicable patch, we give the
> opportunity to try again.
> 
> For this new attempt we provide, again, the original hunk;  the user
> has to repeat the modification from scratch.

As you say below it looks like we started doing this by accident with 
2b8ea7f3c7 (add -p: calculate offset delta for edited patches, 
2018-03-05). I think that although the change was accidental it was 
actually a move in the right direction for several reasons.

  - The error message from "git apply" makes it is virtually impossible
    to tell what is wrong with the edited patch. The line numbers in the
    error message refer to the complete patch but the user is editing a
    single hunk so the user has no idea which line of the hunk the error
    message applies to.

  - If the user uses a terminal based editor then they cannot see the
    error messages while they're re-editing the hunk.

  - If the user has deleted a pre-image line then they need to somehow
    magic it back before the hunk will apply.

> Instead, let's give them the faulty modified patch back, so they can
> identify and fix the problem.

The problem is how do they identify the problem? I have some unfinished 
patches [1] that annotate the edited patch with comments explaining 
what's wrong. Because we know what the unedited patch looked like and 
that the pre-image lines should be unchanged it is possible to provide 
much better error messages than we get from trying to apply the whole 
patch with "git apply". It also makes it possible to restore deleted 
pre-image lines.

[1] https://github.com/phillipwood/git/tree/wip/add-p-editing-improvements
     Note that the later patches do not even compile at the moment. I've
     been meaning to split out the first eight patches and clean them up
     as they're mostly functional and just need the commit messages
     cleaning up.

> diff --git a/add-patch.c b/add-patch.c
> index 557903310d..125e79a5ae 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -1146,6 +1147,10 @@ static int edit_hunk_manually(struct add_p_state *s, struct hunk *hunk)
>   				      "addp-hunk-edit.diff", NULL) < 0)
>   		return -1;
>   
> +	/* Drop possible previous edits */
> +	strbuf_setlen(&s->plain, plain_len);
> +	strbuf_setlen(&s->colored, colored_len);
> +

At this point hunk->end points past s->plain.len. If the user has 
deleted all the lines then we return with hunk->end in this invalid 
state. I think that turns out not to matter as we end up restoring 
hunk->end from the backup we make at the beginning of edit_hunk_loop() 
but it is not straight forward to reason about.

> @@ -1273,10 +1277,6 @@ static int edit_hunk_loop(struct add_p_state *s,
>   				return 0;
>   		}
>   
> -		/* Drop edits (they were appended to s->plain) */
> -		strbuf_setlen(&s->plain, plain_len);
> -		strbuf_setlen(&s->colored, colored_len);
> -		*hunk = backup;

In the old version we always restore the hunk from the backup when we 
trim the edited patch which maintains the invariant "hunk->end <= 
s->plain->end"

> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 718438ffc7..6af5636221 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -165,6 +165,20 @@ test_expect_success 'dummy edit works' '
>   	diff_cmp expected diff
>   '
>   
> +test_expect_success 'setup re-edit editor' '
> +	write_script "fake_editor.sh" <<-\EOF &&
> +	grep been-here "$1" && echo found >output

'grep been-here "$1" >output' should be sufficient I think

> +	echo been-here > "$1"
> +	EOF
> +	test_set_editor "$(pwd)/fake_editor.sh"
> +'

I don't think we need to write the fake editor in a separate test. Also 
it would be better to call test_set_editor in a subshell so that it does 
not affect later tests.

> +test_expect_success 'editing again works' '
> +	git reset &&
> +	test_write_lines e y | GIT_TRACE=1 git add -p &&

It would be nice to add "n q" to the input to make it complete.

> +	grep found output

Using test_grep makes it easier to debug test failures.

Best Wishes

Phillip

Junio C Hamano Sept. 16, 2024, 5:35 p.m. UTC | #2

Phillip Wood <phillip.wood123@gmail.com> writes:

> Because we know what the unedited patch
> looked like and that the pre-image lines should be unchanged it is
> possible to provide much better error messages than we get from trying
> to apply the whole patch with "git apply". It also makes it possible
> to restore deleted pre-image lines.
> ...
> [1] https://github.com/phillipwood/git/tree/wip/add-p-editing-improvements
>     Note that the later patches do not even compile at the moment. I've
>     been meaning to split out the first eight patches and clean them up
>     as they're mostly functional and just need the commit messages
>     cleaning up.

I'd love this.  With both patches before and after the edit session,
we should be able to give more accurate line counts than having "git
apply" look at only the post-edit shape of the hunk, which wouldn't
see where any brokenness of the hunk comes from even if it wanted
to.  We should probably be able to stop relying on "--recount" once
we do this right, which would be very nice outcome, too.

Thanks.

Rubén Justo Sept. 16, 2024, 10:09 p.m. UTC | #3

On Mon, Sep 16, 2024 at 02:33:54PM +0100, Phillip Wood wrote:

> > The "edit" option allows the user to directly modify the hunk to be
> > applied.
> > 
> > If the modified hunk returned is not an applicable patch, we give the
> > opportunity to try again.
> > 
> > For this new attempt we provide, again, the original hunk;  the user
> > has to repeat the modification from scratch.
> 
> As you say below it looks like we started doing this by accident with
> 2b8ea7f3c7 (add -p: calculate offset delta for edited patches, 2018-03-05).
> I think that although the change was accidental it was actually a move in
> the right direction for several reasons.
> 
>  - The error message from "git apply" makes it is virtually impossible
>    to tell what is wrong with the edited patch. The line numbers in the
>    error message refer to the complete patch but the user is editing a
>    single hunk so the user has no idea which line of the hunk the error
>    message applies to.
> 
>  - If the user uses a terminal based editor then they cannot see the
>    error messages while they're re-editing the hunk.
> 
>  - If the user has deleted a pre-image line then they need to somehow
>    magic it back before the hunk will apply.
> 
> > Instead, let's give them the faulty modified patch back, so they can
> > identify and fix the problem.
> 
> The problem is how do they identify the problem? I have some unfinished
> patches [1] that annotate the edited patch with comments explaining what's
> wrong. Because we know what the unedited patch looked like and that the
> pre-image lines should be unchanged it is possible to provide much better
> error messages than we get from trying to apply the whole patch with "git
> apply". It also makes it possible to restore deleted pre-image lines.
> 
> [1] https://github.com/phillipwood/git/tree/wip/add-p-editing-improvements
>     Note that the later patches do not even compile at the moment. I've
>     been meaning to split out the first eight patches and clean them up
>     as they're mostly functional and just need the commit messages
>     cleaning up.

I can imagine that we could give the flawed and annotated patch back to
the user, if they want to fix it and try again.  Am I misunderstanding
your envision?

At any rate, I'm thinking about small fixes and/or avoiding to use a
backup (":w! /tmp/patch" + ":r /tmp/patch") if I have doubts about
making a mistake after spending some time thinking about a hunk, so as
not to lose some work.

> 
> > diff --git a/add-patch.c b/add-patch.c
> > index 557903310d..125e79a5ae 100644
> > --- a/add-patch.c
> > +++ b/add-patch.c
> > @@ -1146,6 +1147,10 @@ static int edit_hunk_manually(struct add_p_state *s, struct hunk *hunk)
> >   				      "addp-hunk-edit.diff", NULL) < 0)
> >   		return -1;
> > +	/* Drop possible previous edits */
> > +	strbuf_setlen(&s->plain, plain_len);
> > +	strbuf_setlen(&s->colored, colored_len);
> > +
> 
> At this point hunk->end points past s->plain.len. If the user has deleted
> all the lines then we return with hunk->end in this invalid state. I think
> that turns out not to matter as we end up restoring hunk->end from the
> backup we make at the beginning of edit_hunk_loop() but it is not straight
> forward to reason about.

I'm not sure I understand your comment.  We are adjusting "hunk" right
after that, no?

> 
> > @@ -1273,10 +1277,6 @@ static int edit_hunk_loop(struct add_p_state *s,
> >   				return 0;
> >   		}
> > -		/* Drop edits (they were appended to s->plain) */
> > -		strbuf_setlen(&s->plain, plain_len);
> > -		strbuf_setlen(&s->colored, colored_len);
> > -		*hunk = backup;
> 
> In the old version we always restore the hunk from the backup when we trim
> the edited patch which maintains the invariant "hunk->end <= s->plain->end"

Same here.  Are we losing that invariant?

> 
> > diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> > index 718438ffc7..6af5636221 100755
> > --- a/t/t3701-add-interactive.sh
> > +++ b/t/t3701-add-interactive.sh
> > @@ -165,6 +165,20 @@ test_expect_success 'dummy edit works' '
> >   	diff_cmp expected diff
> >   '
> > +test_expect_success 'setup re-edit editor' '
> > +	write_script "fake_editor.sh" <<-\EOF &&
> > +	grep been-here "$1" && echo found >output
> 
> 'grep been-here "$1" >output' should be sufficient I think

As I was writing the test, it was clearer to me using "&& echo found"
here and "grep found" below.

> 
> > +	echo been-here > "$1"
> > +	EOF
> > +	test_set_editor "$(pwd)/fake_editor.sh"
> > +'
> 
> I don't think we need to write the fake editor in a separate test. Also it
> would be better to call test_set_editor in a subshell so that it does not
> affect later tests.

Yes, t3701 deserves an update.  I tried to respect its current style.
I didn't want to start a mix.

> 
> > +test_expect_success 'editing again works' '
> > +	git reset &&
> > +	test_write_lines e y | GIT_TRACE=1 git add -p &&
> 
> It would be nice to add "n q" to the input to make it complete.

I have no objection to that.

> 
> > +	grep found output
> 
> Using test_grep makes it easier to debug test failures.
> 
> 
> Best Wishes
> 
> Phillip

Thanks for your review.

Phillip Wood Sept. 18, 2024, 10:06 a.m. UTC | #4

Hi Rubén

On 16/09/2024 23:09, Rubén Justo wrote:
> On Mon, Sep 16, 2024 at 02:33:54PM +0100, Phillip Wood wrote:
> 
> I can imagine that we could give the flawed and annotated patch back to
> the user, if they want to fix it and try again.

Exactly

> At any rate, I'm thinking about small fixes and/or avoiding to use a
> backup (":w! /tmp/patch" + ":r /tmp/patch") if I have doubts about
> making a mistake after spending some time thinking about a hunk, so as
> not to lose some work.

The problem is there is no good solution at the moment. Either we throw 
away the user's work if the edited patch does not apply or we keep the 
broken patch and say "this is broken, please figure out what's wrong 
with it and fix it". As I explained previously fixing a broken patch is 
not necessarily straight forward especially for new users. A few times 
when editing patches that are going to be applied in reverse (from "git 
checkout HEAD -- <path>") I've found it impossible to figure out why a 
particular edit was being rejected. In that case starting again with the 
original patch is my only hope. If you want to be able to re-edit a 
broken hunk perhaps we should add an option for that when we ask the 
user if they want to try again.

>>> diff --git a/add-patch.c b/add-patch.c
>>> index 557903310d..125e79a5ae 100644
>>> --- a/add-patch.c
>>> +++ b/add-patch.c
>>> @@ -1146,6 +1147,10 @@ static int edit_hunk_manually(struct add_p_state *s, struct hunk *hunk)
>>>    				      "addp-hunk-edit.diff", NULL) < 0)
>>>    		return -1;
>>> +	/* Drop possible previous edits */
>>> +	strbuf_setlen(&s->plain, plain_len);
>>> +	strbuf_setlen(&s->colored, colored_len);
>>> +
>>
>> At this point hunk->end points past s->plain.len. If the user has deleted
>> all the lines then we return with hunk->end in this invalid state. I think
>> that turns out not to matter as we end up restoring hunk->end from the
>> backup we make at the beginning of edit_hunk_loop() but it is not straight
>> forward to reason about.
> 
> I'm not sure I understand your comment.  We are adjusting "hunk" right
> after that, no?

Sorry I should have said hunk->colored_end and s->colored.len. If we 
return early then we don't call recolor_hunk().

>>> +	echo been-here > "$1"
>>> +	EOF
>>> +	test_set_editor "$(pwd)/fake_editor.sh"
>>> +'
>>
>> I don't think we need to write the fake editor in a separate test. Also it
>> would be better to call test_set_editor in a subshell so that it does not
>> affect later tests.
> 
> Yes, t3701 deserves an update.  I tried to respect its current style.
> I didn't want to start a mix.

I see are four instances of "test_set_editor" in this file, two of which 
setup the editor within the test that uses them and are called from a 
subshell. We should do the same here rather than creating more work for 
whoever decides to clean up this file in the future.

Best Wishes

Phillip

Rubén Justo Sept. 18, 2024, 5:46 p.m. UTC | #5

On Wed, Sep 18, 2024 at 11:06:34AM +0100, phillip.wood123@gmail.com wrote:
> Hi Rubén
> 
> On 16/09/2024 23:09, Rubén Justo wrote:
> > On Mon, Sep 16, 2024 at 02:33:54PM +0100, Phillip Wood wrote:
> > 
> > I can imagine that we could give the flawed and annotated patch back to
> > the user, if they want to fix it and try again.
> 
> Exactly

So we agree on where we're going ...

> 
> > At any rate, I'm thinking about small fixes and/or avoiding to use a
> > backup (":w! /tmp/patch" + ":r /tmp/patch") if I have doubts about
> > making a mistake after spending some time thinking about a hunk, so as
> > not to lose some work.
> 
> The problem is there is no good solution at the moment.

although not in the length of the stride :)

Maybe in the future we can provide better error descriptions, or even
add annotations to the faulty patch explaining the faults.

But we're not there yet, and honestly, it's not my intention to work
on that.

> Either we keep the broken
> patch and say "this is broken, please figure out what's wrong with it and
> fix it"

Yes, we should keep the users's work if they say "yes" to:

    Your edited hunk does not apply. Edit again (saying "no" discards!) [y/n]?

> or throw away
> the user's work if the edited patch does not apply.

Only if the user says "no" (as we say in the message).

After that "no", the user has another opportunity to decide about the
hunk:

    Your edited hunk does not apply. Edit again (saying "no" discards!) [y/n]? no

    (n/m) Stage this hunk [y,n,q,a,d,e,p,?]

And then, "edit" will allow them to start over and edit the original
hunk, again.

> As I explained previously fixing a broken patch is not necessarily
> straight forward especially for new users.

Very true.  But I don't think that should be a reason to stop the user
from trying.

> A few times when editing patches
> that are going to be applied in reverse (from "git checkout HEAD -- <path>")
> I've found it impossible to figure out why a particular edit was being
> rejected. In that case starting again with the original patch is my only
> hope.

My experience is usually small last-minute adjustments that aren't
worth canceling the interactive session for, and I don't want to have
to remember to make them later.

A small error in a large hunk has been frustrating several times
because I have to go back and review the whole thing.

> If you want to be able to re-edit a broken hunk perhaps we should add
> an option for that when we ask the user if they want to try again.

As we commented in a previous message, this is what we are regaining
with this patch.  The option was introduced in ac083c47ea
(git-add--interactive: manual hunk editing mode, 2008-07-03) and lost
in 2b8ea7f3c7 (add -p: calculate offset delta for edited patches,
2018-03-05).

Thanks.

add-patch: edit the hunk again

Commit Message

Comments

Patch