diff mbox series

[v2,2/2] format-patch: warn if commit msg contains a patch delimiter

Message ID a2c4514aa03657f3b1d822efe3dd630713287ee6.1662559356.git.matheus.bernardino@usp.br (mailing list archive)
State New, archived
Headers show
Series format-patch: warn if commit msg contains a patch delimiter | expand

Commit Message

Matheus Tavares Sept. 7, 2022, 2:44 p.m. UTC
When applying a patch, `git am` looks for special delimiter strings
(such as "---") to know where the message ends and the actual diff
starts. If one of these strings appears in the commit message itself,
`am` might get confused and fail to apply the patch properly. This has
already caused inconveniences in the past [1][2]. To help avoid such
problem, let's make `git format-patch` warn on commit messages
containing one of the said strings.

[1]: https://lore.kernel.org/git/20210113085846-mutt-send-email-mst@kernel.org/
[2]: https://lore.kernel.org/git/16297305.cDA1TJNmNo@earendil/

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/log.c           |  1 +
 log-tree.c              |  1 +
 mailinfo.c              |  4 ++--
 mailinfo.h              |  3 +++
 pretty.c                | 16 +++++++++++++++-
 pretty.h                |  3 ++-
 revision.h              |  3 ++-
 t/t4014-format-patch.sh | 26 ++++++++++++++++++++++++++
 8 files changed, 52 insertions(+), 5 deletions(-)

Comments

Phillip Wood Sept. 7, 2022, 6:09 p.m. UTC | #1
Hi Matheus

On 07/09/2022 15:44, Matheus Tavares wrote:
> When applying a patch, `git am` looks for special delimiter strings
> (such as "---") to know where the message ends and the actual diff
> starts. If one of these strings appears in the commit message itself,
> `am` might get confused and fail to apply the patch properly. This has
> already caused inconveniences in the past [1][2]. To help avoid such
> problem, let's make `git format-patch` warn on commit messages
> containing one of the said strings.

Thanks for working on this, having a warning for this is a useful 
addition. If the user embeds a diff in their commit message then they 
will receive three warnings

warning: commit message has a patch delimiter: 'diff --git a/file b/file'
warning: commit message has a patch delimiter: '--- file'
warning: git am might fail to apply this patch. Consider indenting the 
offending lines.

I guess it's helpful to show all the lines that are considered 
delimiters but it gets quite noisy.


> diff --git a/pretty.c b/pretty.c
> index 6d819103fb..913d974b3a 100644
> --- a/pretty.c
> +++ b/pretty.c
> @@ -2107,6 +2108,14 @@ void pp_remainder(struct pretty_print_context *pp,
>   		if (!linelen)
>   			break;
>   
> +		if (pp->check_in_body_patch_breaks &&
> +		    (patchbreak(line, linelen) || is_scissors_line(line, linelen))) {
> +			warning(_("commit message has a patch delimiter: '%.*s'"),
> +				line[linelen - 1] == '\n' ? linelen - 1 : linelen,
> +				line);
> +			found_delimiter = 1;
> +		}
> +
>   		if (is_blank_line(line, &linelen)) {
>   			if (first)
>   				continue;
> @@ -2133,6 +2142,11 @@ void pp_remainder(struct pretty_print_context *pp,
>   		}
>   		strbuf_addch(sb, '\n');
>   	}
> +
> +	if (found_delimiter) {
> +		warning(_("git am might fail to apply this patch. "
> +			  "Consider indenting the offending lines."));

The message says the patch might fail to apply, but isn't it guaranteed 
to fail?

> diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> index fbec8ad2ef..4bbf1156e9 100755
> --- a/t/t4014-format-patch.sh
> +++ b/t/t4014-format-patch.sh
> @@ -2329,4 +2329,30 @@ test_expect_success 'interdiff: solo-patch' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success 'warn if commit message contains patch delimiter' '
> +	>delim &&
> +	git add delim &&
> +	cat >msg <<-\EOF &&
> +	title
> +
> +	---
> +	EOF
> +	git commit -F msg &&
> +	git format-patch -1 2>stderr &&
> +	grep "warning: commit message has a patch delimiter" stderr

I think it would be worth checking for the second message as well in the 
tests.


Best Wishes

Phillip
Junio C Hamano Sept. 7, 2022, 6:36 p.m. UTC | #2
Phillip Wood <phillip.wood123@gmail.com> writes:

> Hi Matheus
>
> On 07/09/2022 15:44, Matheus Tavares wrote:
>> When applying a patch, `git am` looks for special delimiter strings
>> (such as "---") to know where the message ends and the actual diff
>> starts. If one of these strings appears in the commit message itself,
>> `am` might get confused and fail to apply the patch properly. This has
>> already caused inconveniences in the past [1][2]. To help avoid such
>> problem, let's make `git format-patch` warn on commit messages
>> containing one of the said strings.
>
> Thanks for working on this, having a warning for this is a useful
> addition. If the user embeds a diff in their commit message then they
> will receive three warnings
>
> warning: commit message has a patch delimiter: 'diff --git a/file b/file'
> warning: commit message has a patch delimiter: '--- file'
> warning: git am might fail to apply this patch. Consider indenting the
> offending lines.
>
> I guess it's helpful to show all the lines that are considered
> delimiters but it gets quite noisy.

True.  I wonder if automatically indenting these lines is an option ;-)

>> +
>> +	if (found_delimiter) {
>> +		warning(_("git am might fail to apply this patch. "
>> +			  "Consider indenting the offending lines."));
>
> The message says the patch might fail to apply, but isn't it
> guaranteed to fail?

Worse is it may apply a wrong thing (i.e. an illustration patch in
the proposed log message gets applied and committed with a truncated
log message).
Matheus Tavares Sept. 9, 2022, 1:08 a.m. UTC | #3
On Wed, Sep 7, 2022 at 3:36 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
> > Hi Matheus
> >
> > Thanks for working on this, having a warning for this is a useful
> > addition. If the user embeds a diff in their commit message then they
> > will receive three warnings
> >
> > warning: commit message has a patch delimiter: 'diff --git a/file b/file'
> > warning: commit message has a patch delimiter: '--- file'
> > warning: git am might fail to apply this patch. Consider indenting the
> > offending lines.
> >
> > I guess it's helpful to show all the lines that are considered
> > delimiters but it gets quite noisy.

Hmm, right :/ Perhaps we could avoid repeating the warning message:

warning: commit message has a patch delimiter(s):
diff --git a/file b/file
--- file
....
warning: git am might fail to apply this patch.

> True.  I wonder if automatically indenting these lines is an option ;-)

Makes sense. Perhaps under a config option? The difficult part would
be for the scissors; just indenting it with whitespaces wouldn't
suffice, right?
Junio C Hamano Sept. 9, 2022, 4:47 p.m. UTC | #4
Matheus Tavares <matheus.bernardino@usp.br> writes:

> Makes sense. Perhaps under a config option? The difficult part would
> be for the scissors; just indenting it with whitespaces wouldn't
> suffice, right?

It may be difficult not because of any mechanical reasons, but
because we cannot guess WHY the author wrote it there in the log in
the first place.  It could be that the author writes explanatory
text that is not to become part of the permanent history at the
beginning, place scissors, and follow that with log for posterity,
EXPECTING that all of them is output by format-patch and transmit to
the receiving end without modified.

Another thing is a three-dash marker line in the log message.  I
myself did use them to leave a note for myself (which should be left
outside the official history when it is sent to the list and then
applied), and I would have been upset if it was stripped or the tool
even warned against it---I knew what I was doing after all.

Compared to these two, an unindented "diff " and its output in the
log has no reason to be pre-recoded in the commit message and make
the rest of the message a part of the patch, so I am perfectly fine
if we unconditionally "escaped" them.  But I personally think
scissors and three-dash lines should be left intact.
diff mbox series

Patch

diff --git a/builtin/log.c b/builtin/log.c
index 56e2d95e86..edc84abaef 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1973,6 +1973,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	rev.diffopt.flags.recursive = 1;
 	rev.diffopt.no_free = 1;
 	rev.subject_prefix = fmt_patch_subject_prefix;
+	rev.check_in_body_patch_breaks = 1;
 	memset(&s_r_opt, 0, sizeof(s_r_opt));
 	s_r_opt.def = "HEAD";
 	s_r_opt.revarg_opt = REVARG_COMMITTISH;
diff --git a/log-tree.c b/log-tree.c
index 3e8c70ddcf..25ed5452b1 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -766,6 +766,7 @@  void show_log(struct rev_info *opt)
 	ctx.after_subject = extra_headers;
 	ctx.preserve_subject = opt->preserve_subject;
 	ctx.encode_email_headers = opt->encode_email_headers;
+	ctx.check_in_body_patch_breaks = opt->check_in_body_patch_breaks;
 	ctx.reflog_info = opt->reflog_info;
 	ctx.fmt = opt->commit_format;
 	ctx.mailmap = opt->mailmap;
diff --git a/mailinfo.c b/mailinfo.c
index f0a690b6e8..d227397f1c 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -646,7 +646,7 @@  static void decode_transfer_encoding(struct mailinfo *mi, struct strbuf *line)
 	free(ret);
 }
 
-static inline int patchbreak(const char *buf, size_t len)
+int patchbreak(const char *buf, size_t len)
 {
 	/* Beginning of a "diff -" header? */
 	if (skip_prefix_mem(buf, len, "diff -", &buf, &len))
@@ -680,7 +680,7 @@  static inline int patchbreak(const char *buf, size_t len)
 	return 0;
 }
 
-static int is_scissors_line(const char *line, size_t len)
+int is_scissors_line(const char *line, size_t len)
 {
 	const char *c;
 	int scissors = 0, gap = 0;
diff --git a/mailinfo.h b/mailinfo.h
index f2ffd0349e..347eefe856 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -53,4 +53,7 @@  void setup_mailinfo(struct mailinfo *);
 int mailinfo(struct mailinfo *, const char *msg, const char *patch);
 void clear_mailinfo(struct mailinfo *);
 
+int patchbreak(const char *line, size_t len);
+int is_scissors_line(const char *line, size_t len);
+
 #endif /* MAILINFO_H */
diff --git a/pretty.c b/pretty.c
index 6d819103fb..913d974b3a 100644
--- a/pretty.c
+++ b/pretty.c
@@ -5,6 +5,7 @@ 
 #include "diff.h"
 #include "revision.h"
 #include "string-list.h"
+#include "mailinfo.h"
 #include "mailmap.h"
 #include "log-tree.h"
 #include "notes.h"
@@ -2097,7 +2098,7 @@  void pp_remainder(struct pretty_print_context *pp,
 		  int indent)
 {
 	struct grep_opt *opt = pp->rev ? &pp->rev->grep_filter : NULL;
-	int first = 1;
+	int first = 1, found_delimiter = 0;
 
 	for (;;) {
 		const char *line = *msg_p;
@@ -2107,6 +2108,14 @@  void pp_remainder(struct pretty_print_context *pp,
 		if (!linelen)
 			break;
 
+		if (pp->check_in_body_patch_breaks &&
+		    (patchbreak(line, linelen) || is_scissors_line(line, linelen))) {
+			warning(_("commit message has a patch delimiter: '%.*s'"),
+				line[linelen - 1] == '\n' ? linelen - 1 : linelen,
+				line);
+			found_delimiter = 1;
+		}
+
 		if (is_blank_line(line, &linelen)) {
 			if (first)
 				continue;
@@ -2133,6 +2142,11 @@  void pp_remainder(struct pretty_print_context *pp,
 		}
 		strbuf_addch(sb, '\n');
 	}
+
+	if (found_delimiter) {
+		warning(_("git am might fail to apply this patch. "
+			  "Consider indenting the offending lines."));
+	}
 }
 
 void pretty_print_commit(struct pretty_print_context *pp,
diff --git a/pretty.h b/pretty.h
index f34e24c53a..12df2f4a39 100644
--- a/pretty.h
+++ b/pretty.h
@@ -49,7 +49,8 @@  struct pretty_print_context {
 	struct string_list *mailmap;
 	int color;
 	struct ident_split *from_ident;
-	unsigned encode_email_headers:1;
+	unsigned encode_email_headers:1,
+		 check_in_body_patch_breaks:1;
 	struct pretty_print_describe_status *describe_status;
 
 	/*
diff --git a/revision.h b/revision.h
index 61a9b1316b..f384ab716f 100644
--- a/revision.h
+++ b/revision.h
@@ -230,7 +230,8 @@  struct rev_info {
 			date_mode_explicit:1,
 			preserve_subject:1,
 			encode_email_headers:1,
-			include_header:1;
+			include_header:1,
+			check_in_body_patch_breaks:1;
 	unsigned int	disable_stdin:1;
 	/* --show-linear-break */
 	unsigned int	track_linear:1,
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index fbec8ad2ef..4bbf1156e9 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -2329,4 +2329,30 @@  test_expect_success 'interdiff: solo-patch' '
 	test_cmp expect actual
 '
 
+test_expect_success 'warn if commit message contains patch delimiter' '
+	>delim &&
+	git add delim &&
+	cat >msg <<-\EOF &&
+	title
+
+	---
+	EOF
+	git commit -F msg &&
+	git format-patch -1 2>stderr &&
+	grep "warning: commit message has a patch delimiter" stderr
+'
+
+test_expect_success 'warn if commit message contains scissors' '
+	>scissors &&
+	git add scissors &&
+	cat >msg <<-\EOF &&
+	title
+
+	-- >8 --
+	EOF
+	git commit -F msg &&
+	git format-patch -1 2>stderr &&
+	grep "warning: commit message has a patch delimiter" stderr
+'
+
 test_done