diff mbox series

apply: tell user location of corrupted patch file

Message ID ec38908d05f0d40190173158ef3f0753fa9f1184.1570226253.git.liu.denton@gmail.com (mailing list archive)
State New, archived
Headers show
Series apply: tell user location of corrupted patch file | expand

Commit Message

Denton Liu Oct. 4, 2019, 9:59 p.m. UTC
When `git am` runs into a corrupt patch, it'll error out and give a
message such as,

	error: corrupt patch at line 87

Casual users of am may assume that this line number refers to the <mbox>
file that they provided on the command-line. This assumption, however,
is incorrect. The line count really refers to the
.git/rebase-apply/patch file genrated by am.

Teach am to print the location of corrupted patch files so that users of
the tool will know where to look when fixing their corrupted patch. Thus
the error message will look like this:

	error: corrupt patch at .git/rebase-apply/patch:87

An alternate design was considered which involved printing the line
numbers relative to the output of `git am --show-current-patch` (in
other words, the actual mail file that's provided to am). This design
was not chosen because am does not store the whole mail and instead,
splits the mail into several files. As a result of this, this would
break existing users' workflow if they piped their mail directly to am
from their mail client, the whole mail would not exist in any file and
they would have to manually recreate the mail to see the line number.

Let's avoid breaking Junio's workflow since he's probably one of the
most frequent user of `git am` in the world. ;)

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 apply.c                | 2 +-
 t/t4012-diff-binary.sh | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Oct. 5, 2019, 8:33 a.m. UTC | #1
Denton Liu <liu.denton@gmail.com> writes:

> When `git am` runs into a corrupt patch, it'll error out and give a
> message such as,
>
> 	error: corrupt patch at line 87
>
> Casual users of am may assume that this line number refers to the <mbox>
> file that they provided on the command-line. This assumption, however,
> is incorrect. The line count really refers to the
> .git/rebase-apply/patch file genrated by am.
>
> Teach am to print the location of corrupted patch files so that users of

s/corrupted/corrupt/;

> the tool will know where to look when fixing their corrupted patch. Thus

Likewise.

> the error message will look like this:
>
> 	error: corrupt patch at .git/rebase-apply/patch:87
>
> An alternate design was considered which involved printing the line
> numbers relative to the output of `git am --show-current-patch` (in
> other words, the actual mail file that's provided to am). This design
> was not chosen because am does not store the whole mail and instead,
> splits the mail into several files. As a result of this, this would
> break existing users' workflow if they piped their mail directly to am
> from their mail client, the whole mail would not exist in any file and
> they would have to manually recreate the mail to see the line number.

More importantly, a change to apply.c (hence "git apply", not "git
am") will mean the tool can only talk about its input.  If you
run, instead of "git am mbox", "git apply mbox" (assuming you are
lucky and the piece of mail does not use any fancy features like
MIME or RFC 1342) may work just as well and it would report the
corrupt patch relative to the entire mail text.

>
> Let's avoid breaking Junio's workflow since he's probably one of the
> most frequent user of `git am` in the world. ;)

Don't name me.

>  		if (len <= 0) {
>  			free(fragment);
> -			return error(_("corrupt patch at line %d"), state->linenr);
> +			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
>  		}

Do not forget that you can run "git apply" and feed the patch from
its standard input, e.g.

	$ git apply <patchfile
	$ git show -R | git apply

Make sure state->patch_input_file is a reasonable string before
considering this.

Also, if you have a mbox file

	$ cd sub/direc/tory
	$ git am -s /var/tmp/mbox

The "git apply" process thatis run inside "git am" would be running
at the top level of the working tree, so state->patch_input_file may
say ".git/rebase-apply/patch" (i.e. relative pathname) that is not
relative to where the end user is in.  I personally do not thinkg it
matters too much, but some people may complain.

Other than that, looks good.  I am kind-of surprised that there is
only one place that we report an unusable input with a line number.
Nicely found.

> diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
> index 6579c81216..42cb2dd404 100755
> --- a/t/t4012-diff-binary.sh
> +++ b/t/t4012-diff-binary.sh
> @@ -68,7 +68,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
>  	sed -e "s/-CIT/xCIT/" <output >broken &&
>  	test_must_fail git apply --stat --summary broken 2>detected &&
>  	detected=$(cat detected) &&
> -	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
> +	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
>  	detected=$(sed -ne "${detected}p" broken) &&
>  	test "$detected" = xCIT
>  '
> @@ -77,7 +77,7 @@ test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
>  	git diff --binary | sed -e "s/-CIT/xCIT/" >broken &&
>  	test_must_fail git apply --stat --summary broken 2>detected &&
>  	detected=$(cat detected) &&
> -	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
> +	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
>  	detected=$(sed -ne "${detected}p" broken) &&
>  	test "$detected" = xCIT
>  '

These existing tests can serve a good test for this new feature, but
I think you'd also need a case where "apply" is fed the patch from
the standard input, and possibly another case where it is run from a
subdirectory of a working tree.

Thanks.
Junio C Hamano Oct. 5, 2019, 10:44 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

>> An alternate design was considered which involved printing the line
>> numbers relative to the output of `git am --show-current-patch` (in
>> other words, the actual mail file that's provided to am). This design
>> was not chosen because am does not store the whole mail and instead,
>> splits the mail into several files. As a result of this, this would
>> break existing users' workflow if they piped their mail directly to am
>> from their mail client, the whole mail would not exist in any file and
>> they would have to manually recreate the mail to see the line number.
>
> More importantly,...

Addendum.

I think the primary reason why the "alternate design" will not fly
is *NOT* that it breaks existing users (which it would), but giving
a line number in the original mbox file is not always possible.

Imagine the message you received was munged by the sending mailer,
or a relaying mailer, and what you received is encoded in base64 ;-)
Junio C Hamano Oct. 5, 2019, 10:51 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

>>  		if (len <= 0) {
>>  			free(fragment);
>> -			return error(_("corrupt patch at line %d"), state->linenr);
>> +			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
>>  		}
>
> Do not forget that you can run "git apply" and feed the patch from
> its standard input, e.g.
>
> 	$ git apply <patchfile
> 	$ git show -R | git apply
>
> Make sure state->patch_input_file is a reasonable string before
> considering this.

I think what the patch does is safe in this case; callsites of
apply_patch(), which sets the .patch_input_file field, pass the
string "<stdin>", so you'd say

	error: corrupt patch at <stdin>:43

We lost the word "line" in the message, but it would be picked up
rather quickly by users that colon + integer is a line number, so
I think it is OK.

> Also, if you have a mbox file
>
> 	$ cd sub/direc/tory
> 	$ git am -s /var/tmp/mbox
>
> The "git apply" process thatis run inside "git am" would be running
> at the top level of the working tree, so state->patch_input_file may
> say ".git/rebase-apply/patch" (i.e. relative pathname) that is not
> relative to where the end user is in.  I personally do not thinkg it
> matters too much, but some people may complain.
>
> Other than that, looks good.  I am kind-of surprised that there is
> only one place that we report an unusable input with a line number.
> Nicely found.

I still do not know if we have a relative-path problem, how severe
it would be if there is, or if it is fixable if we wanted to and
how, though.


Thanks.
diff mbox series

Patch

diff --git a/apply.c b/apply.c
index 57a61f2881..b0ba2e7b1a 100644
--- a/apply.c
+++ b/apply.c
@@ -1785,7 +1785,7 @@  static int parse_single_patch(struct apply_state *state,
 		len = parse_fragment(state, line, size, patch, fragment);
 		if (len <= 0) {
 			free(fragment);
-			return error(_("corrupt patch at line %d"), state->linenr);
+			return error(_("corrupt patch at %s:%d"), state->patch_input_file, state->linenr);
 		}
 		fragment->patch = line;
 		fragment->size = len;
diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh
index 6579c81216..42cb2dd404 100755
--- a/t/t4012-diff-binary.sh
+++ b/t/t4012-diff-binary.sh
@@ -68,7 +68,7 @@  test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
 	sed -e "s/-CIT/xCIT/" <output >broken &&
 	test_must_fail git apply --stat --summary broken 2>detected &&
 	detected=$(cat detected) &&
-	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
+	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
 	detected=$(sed -ne "${detected}p" broken) &&
 	test "$detected" = xCIT
 '
@@ -77,7 +77,7 @@  test_expect_success C_LOCALE_OUTPUT 'apply detecting corrupt patch correctly' '
 	git diff --binary | sed -e "s/-CIT/xCIT/" >broken &&
 	test_must_fail git apply --stat --summary broken 2>detected &&
 	detected=$(cat detected) &&
-	detected=$(expr "$detected" : "error.*at line \\([0-9]*\\)\$") &&
+	detected=$(expr "$detected" : "error.*at broken:\\([0-9]*\\)\$") &&
 	detected=$(sed -ne "${detected}p" broken) &&
 	test "$detected" = xCIT
 '