
format-patch: teach --no-encode-headers

Message ID: 20200405231109.8249-1-me@pluvano.com
State New, archived

Commit Message

Emma Brooks April 5, 2020, 11:11 p.m. UTC
When commit subjects or authors have non-ASCII characters, git
format-patch Q-encodes them so they can be safely sent over email.
However, if the patch transfer method is something other than email (web
review tools, sneakernet), this only serves to make the patch metadata
harder to read without first applying it (unless you can decode RFC 2047
in your head). git am as well as some email software supports
non-Q-encoded mail as described in RFC 6531.

Add --[no-]encode-headers and format.encodeHeaders to let the user
control this behavior.

Signed-off-by: Emma Brooks <me@pluvano.com>
---
 Documentation/config/format.txt    |  4 +++
 Documentation/git-format-patch.txt |  7 ++++
 builtin/log.c                      |  7 ++++
 log-tree.c                         |  1 +
 pretty.c                           |  6 ++--
 pretty.h                           |  1 +
 revision.c                         |  4 +++
 revision.h                         |  3 +-
 t/t4014-format-patch.sh            | 53 ++++++++++++++++++++++++++++++
 9 files changed, 83 insertions(+), 3 deletions(-)
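To make the encoding described in the commit message concrete, here is a minimal C sketch of RFC 2047 Q-encoding gated by an on/off switch, in the spirit of the `pp->encode_headers && needs_rfc2047_encoding(...)` check in the patch. It is not git's implementation: the helper names (`has_non_ascii`, `q_encode_word`, `format_header_value`) are hypothetical, the charset is assumed to be UTF-8, only ASCII letters and digits are passed through unescaped, and no line folding is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's needs_rfc2047_encoding(): this sketch
 * only looks for bytes outside 7-bit ASCII.
 */
static int has_non_ascii(const char *s)
{
	for (; *s; s++)
		if ((unsigned char)*s & 0x80)
			return 1;
	return 0;
}

/*
 * Q-encode `in` as a single RFC 2047 encoded-word, without line folding.
 * ASCII letters and digits are copied through, a space becomes '_', and
 * every other byte becomes "=XX". Returns 0 on success, -1 if `out` is
 * too small.
 */
static int q_encode_word(const char *in, const char *charset,
			 char *out, size_t outsz)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t pos;
	int n = snprintf(out, outsz, "=?%s?q?", charset);

	if (n < 0 || (size_t)n >= outsz)
		return -1;
	pos = (size_t)n;
	for (; *in; in++) {
		unsigned char c = (unsigned char)*in;

		if (pos + 3 > outsz)	/* worst case: three bytes for "=XX" */
			return -1;
		if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
		    (c >= '0' && c <= '9')) {
			out[pos++] = (char)c;
		} else if (c == ' ') {
			out[pos++] = '_';
		} else {
			out[pos++] = '=';
			out[pos++] = hex[c >> 4];
			out[pos++] = hex[c & 0x0F];
		}
	}
	if (pos + 3 > outsz)		/* room for "?=" plus the NUL */
		return -1;
	memcpy(out + pos, "?=", 3);
	return 0;
}

/*
 * Write a header value to `out`: Q-encoded when encoding is enabled and
 * the value needs it, verbatim otherwise. Returns 0 on success, -1 if
 * `out` is too small.
 */
int format_header_value(const char *value, int encode_headers,
			char *out, size_t outsz)
{
	assert(value && out);
	if (encode_headers && has_non_ascii(value))
		return q_encode_word(value, "UTF-8", out, outsz);
	if (strlen(value) >= outsz)
		return -1;
	strcpy(out, value);
	return 0;
}
```

With encoding enabled, the UTF-8 bytes of "Foö" come out as `=?UTF-8?q?Fo=C3=B6?=`, the same form the tests in this patch expect for the Subject line. With encoding disabled, the bytes are copied through unchanged.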

Comments

brian m. carlson April 6, 2020, 3:04 a.m. UTC | #1
On 2020-04-05 at 23:11:09, Emma Brooks wrote:
> When commit subjects or authors have non-ASCII characters, git
> format-patch Q-encodes them so they can be safely sent over email.
> However, if the patch transfer method is something other than email (web
> review tools, sneakernet), this only serves to make the patch metadata
> harder to read without first applying it (unless you can decode RFC 2047
> in your head). git am as well as some email software supports
> non-Q-encoded mail as described in RFC 6531.

Do we always output UTF-8 in this case, or do we sometimes output other
encodings if the user has specified one for the commit message?  Do we
know how git send-email handles such a message if it receives one?

I know it isn't your intention to work with git send-email in this
patch, but it would be nice to know whether there's additional value in
someone sending a followup patch to make git send-email use SMTPUTF8 if
that's necessary.

Junio C Hamano April 6, 2020, 3:29 a.m. UTC | #2
Emma Brooks <me@pluvano.com> writes:

> When commit subjects or authors have non-ASCII characters, git
> format-patch Q-encodes them so they can be safely sent over email.
> However, if the patch transfer method is something other than email (web
> review tools, sneakernet), this only serves to make the patch metadata
> harder to read without first applying it (unless you can decode RFC 2047
> in your head). git am as well as some email software supports
> non-Q-encoded mail as described in RFC 6531.
>
> Add --[no-]encode-headers and format.encodeHeaders to let the user
> control this behavior.

This would be immensely useful.  I often find the in-body headers
that are Q-encoded too ugly to live.

Is RFC 2047 the only thing we do to message headers?  What I am
trying to figure out is whether "encode-headers - yes/no?" would be a
stable (iow, would we be gaining other kinds of encoding over time?)
and well-defined (iow, is there a case where one kind of 'encoding'
is still desirable while disabling other kinds of 'encoding' is
wanted?) question.  If there is any doubt in your answers to the
above question, we may have to make sure the name of the option
makes it clear to users what kind of encoding we're talking about.

> +format.encodeHeaders::
> +	Encode email headers that have non-ASCII characters with
> +	"Q-encoding" for email transmission. Defaults to true.

OK.

>  
> +--[no-]encode-headers::

I think we'd want to standardize on writing these out, i.e.

        --encode-headers::
        --no-encode-headers::

so let's follow that when adding a new option.

> +	Encode email headers that have non-ASCII characters with
> +	"Q-encoding", instead of outputting the headers verbatim. The

I wonder if calling RFC2047 out helps readers here, when they wonder
what Q is and how they can decipher it.

> +	default is `--encode-headers` unless the `format.encodeHeaders`
> +	configuration variable is set.

I am wondering if we can go even shorter, e.g.

	The default is set to the value of `format.encodeHeaders`
	configuration variable.

> -		if (needs_rfc2047_encoding(namebuf, namelen)) {
> +		if (pp->encode_headers &&
> +				needs_rfc2047_encoding(namebuf, namelen)) {

Don't overly indent the second line like this.  The same comment
applies to the next hunk (not quoted).

Thanks.
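On the question of how a reader can decipher a Q-encoded header, the following is a minimal C sketch of decoding one RFC 2047 encoded-word. The function name `q_decode_word` is hypothetical and this is not code from git. It handles only the "q" encoding and returns the raw bytes in whatever charset the word names, with no charset conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if `c` is not a hex digit. */
static int hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Decode a single encoded-word of the form "=?charset?q?payload?=".
 * The decoded bytes are NUL-terminated in `out`. Returns the decoded
 * length, or -1 on malformed input or if `out` is too small.
 */
int q_decode_word(const char *word, char *out, size_t outsz)
{
	const char *p, *end;
	size_t len, n = 0;

	assert(word && out);
	len = strlen(word);
	if (outsz == 0 || len < 2)
		return -1;
	if (strncmp(word, "=?", 2) || strcmp(word + len - 2, "?="))
		return -1;

	p = strchr(word + 2, '?');	/* '?' that ends the charset */
	if (!p || (p[1] != 'q' && p[1] != 'Q') || p[2] != '?')
		return -1;
	p += 3;				/* start of the payload */
	end = word + len - 2;		/* start of the closing "?=" */
	if (p > end)
		return -1;

	while (p < end) {
		if (n + 1 >= outsz)	/* keep room for the NUL */
			return -1;
		if (*p == '_') {
			out[n++] = ' ';
			p++;
		} else if (*p == '=') {
			int hi, lo;

			if (end - p < 3)
				return -1;
			hi = hexval((unsigned char)p[1]);
			lo = hexval((unsigned char)p[2]);
			if (hi < 0 || lo < 0)
				return -1;
			out[n++] = (char)(hi * 16 + lo);
			p += 3;
		} else {
			out[n++] = *p++;
		}
	}
	out[n] = '\0';
	return (int)n;
}
```

Given `=?UTF-8?q?Fo=C3=B6?=`, the function yields the four bytes `F`, `o`, 0xC3, 0xB6, which is "Foö" in UTF-8.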
Jeff King April 6, 2020, 1:30 p.m. UTC | #3
On Mon, Apr 06, 2020 at 03:04:44AM +0000, brian m. carlson wrote:

> On 2020-04-05 at 23:11:09, Emma Brooks wrote:
> > When commit subjects or authors have non-ASCII characters, git
> > format-patch Q-encodes them so they can be safely sent over email.
> > However, if the patch transfer method is something other than email (web
> > review tools, sneakernet), this only serves to make the patch metadata
> > harder to read without first applying it (unless you can decode RFC 2047
> > in your head). git am as well as some email software supports
> > non-Q-encoded mail as described in RFC 6531.
> 
> Do we always output UTF-8 in this case, or do we sometimes output other
> encodings if the user has specified one for the commit message?

That was my first question, too. But I think even without this option,
we always respect i18n.logOutputEncoding before we even hit the email
pretty-printing code. So by default it would always be utf8 (and
otherwise whatever the user has asked us to output).

That would obviously be disastrous for an output encoding that isn't an
ASCII superset, but that's already true for any of our output formats.

> Do we know how git send-email handles such a message if it receives
> one?
> 
> I know it isn't your intention to work with git send-email in this
> patch, but it would be nice to know whether there's additional value in
> someone sending a followup patch to make git send-email use SMTPUTF8 if
> that's necessary.

I suspect this is mostly orthogonal, as that deals only with the
SMTP-level addresses, which include only the actual email part (not the
name) and aren't RFC2047-encoded anyway. It looks like we already leave
characters in addresses untouched (I'm not even 100% sure that RFC2047
allows modifying within the local part of an addr):

  $ echo foo >file
  $ git add file
  $ git -c user.email=péff@peff.net commit -m foo
  $ git format-patch -1 --stdout | grep From:
  From: Jeff King <péff@peff.net>

I did wonder if there are any standards around 8bit headers. Certainly
the de facto standard for local tools (e.g., mutt reading a message
you've edited in vim) is that they can be treated like a stream of
ASCII-compatible bytes, and that works pretty well in practice. But if
there's an IETF-endorsed method for 8bit headers, it would be nice to
use it. For 8bit bodies, we're able to give a content-transfer-encoding
and a content-type with the charset. But I don't know of an equivalent
for headers.

-Peff

brian m. carlson April 6, 2020, 3:17 p.m. UTC | #4
On 2020-04-06 at 13:30:40, Jeff King wrote:
> I suspect this is mostly orthogonal, as that deals only with the
> SMTP-level addresses, which include only the actual email part (not the
> name) and aren't RFC2047-encoded anyway. It looks like we already leave
> characters in addresses untouched (I'm not even 100% sure that RFC2047
> allows modifying within the local part of an addr):
> 
>   $ echo foo >file
>   $ git add file
>   $ git -c user.email=péff@peff.net commit -m foo
>   $ git format-patch -1 --stdout | grep From:
>   From: Jeff King <péff@peff.net>
> 
> I did wonder if there are any standards around 8bit headers. Certainly
> the de facto standard for local tools (e.g., mutt reading a message
> you've edited in vim) is that they can be treated like a stream of
> ASCII-compatible bytes, and that works pretty well in practice. But if
> there's an IETF-endorsed method for 8bit headers, it would be nice to
> use it. For 8bit bodies, we're able to give a content-transfer-encoding
> and a content-type with the charset. But I don't know of an equivalent
> for headers.

That's RFC 6532, Internationalized Email Headers, the companion document
to RFC 6531.  (The RFC editor has cleverly kept the last digits in sync
between the RFC 532x and 653x series).

The basic summary is that header field names are not internationalized,
but the field values do allow UTF-8 if they contain unstructured text
(e.g., Subject), anything using atoms (e.g., Message-ID), quoted strings
(e.g., local-parts of an email address), domains, and a few other
constructs.  RFC 2047 (MIME encoded words) is allowed "only in a subset
of the places allowed by" RFC 6532, so just not encoding should be safe
here, as long as it's UTF-8.
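The "as long as it's UTF-8" condition can be checked mechanically. Below is a minimal C sketch of a well-formedness test for a header value. The name `is_valid_utf8` is hypothetical, and nothing in the patch as posted performs such a check. It rejects truncated sequences, stray continuation bytes, overlong forms, surrogates, and code points above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const char *s, size_t len)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i = 0;

	assert(s || len == 0);
	while (i < len) {
		unsigned char c = p[i];
		unsigned long cp, min;
		size_t need, k;

		if (c < 0x80) {			/* plain ASCII */
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {	/* 2-byte sequence */
			need = 1; cp = c & 0x1F; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {	/* 3-byte sequence */
			need = 2; cp = c & 0x0F; min = 0x800;
		} else if ((c & 0xF8) == 0xF0) {	/* 4-byte sequence */
			need = 3; cp = c & 0x07; min = 0x10000;
		} else {
			return 0;	/* stray continuation or invalid lead */
		}

		if (len - i <= need)
			return 0;	/* sequence is truncated */
		for (k = 1; k <= need; k++) {
			if ((p[i + k] & 0xC0) != 0x80)
				return 0;	/* not a continuation byte */
			cp = (cp << 6) | (p[i + k] & 0x3F);
		}
		if (cp < min)
			return 0;	/* overlong encoding */
		if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return 0;	/* out of range or surrogate */
		i += need + 1;
	}
	return 1;
}
```

For example, the UTF-8 bytes of "Foö" (`F`, `o`, 0xC3, 0xB6) pass. The ISO-8859-1 bytes of the same text (`F`, `o`, 0xF6) fail, because 0xF6 announces a four-byte sequence that never arrives.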
Jeff King April 6, 2020, 3:30 p.m. UTC | #5
On Mon, Apr 06, 2020 at 03:17:34PM +0000, brian m. carlson wrote:

> > I did wonder if there are any standards around 8bit headers. Certainly
> > the de facto standard for local tools (e.g., mutt reading a message
> > you've edited in vim) is that they can be treated like a stream of
> > ASCII-compatible bytes, and that works pretty well in practice. But if
> > there's an IETF-endorsed method for 8bit headers, it would be nice to
> > use it. For 8bit bodies, we're able to give a content-transfer-encoding
> > and a content-type with the charset. But I don't know of an equivalent
> > for headers.
> 
> That's RFC 6532, Internationalized Email Headers, the companion document
> to RFC 6531.  (The RFC editor has cleverly kept the last digits in sync
> between the RFC 532x and 653x series).

Ah, thanks, that's exactly what I was looking for.

> The basic summary is that header field names are not internationalized,
> but the field values do allow UTF-8 if they contain unstructured text
> (e.g., Subject), anything using atoms (e.g., Message-ID), quoted strings
> (e.g., local-parts of an email address), domains, and a few other
> constructs.  RFC 2047 (MIME encoded words) is allowed "only in a subset
> of the places allowed by" RFC 6532, so just not encoding should be safe
> here, as long as it's UTF-8.

That makes sense. It looks like such messages are technically
message/global rather than message/rfc822. But since there's no
content-type given for the outermost message of an mbox, I guess that
just becomes implied.

The utf8 thing means that doing:

  git format-patch --encoding=iso8859-1 --no-encode-headers

violates the standard. But I think that's OK. If you really prefer that
charset for your local use, it does what you want. And if you try to
send it over SMTP and somebody complains, I think that falls under "if
it hurts, don't do that".

-Peff

Emma Brooks April 7, 2020, 3:46 a.m. UTC | #6
On 2020-04-05 20:29:57-0700, Junio C Hamano wrote:
> Is RFC 2047 the only thing we do to message headers?  What I am
> trying to figure out is whether "encode-headers - yes/no?" would be a
> stable (iow, would we be gaining other kinds of encoding over time?)
> and well-defined (iow, is there a case where one kind of 'encoding'
> is still desirable while disabling other kinds of 'encoding' is
> wanted?) question.  If there is any doubt in your answers to the
> above question, we may have to make sure the name of the option
> makes it clear to users what kind of encoding we're talking about.

It's also too vague and it's not entirely clear from the option itself
what sort of encoding it refers to. I will change it to
--[no-]q-encode-headers and format.qEncodeHeaders in v2 unless there are
other suggestions.

> > +--[no-]encode-headers::
> 
> I think we'd want to standardize on writing these out, i.e.
> 
>         --encode-headers::
>         --no-encode-headers::
> 
> so let's follow that when adding a new option.

OK.

> > +	Encode email headers that have non-ASCII characters with
> > +	"Q-encoding", instead of outputting the headers verbatim. The
> 
> I wonder if calling RFC2047 out helps readers here, when they wonder
> what Q is and how they can decipher it.

I'll reference the RFC directly in v2.

> > +	default is `--encode-headers` unless the `format.encodeHeaders`
> > +	configuration variable is set.
> 
> I am wondering if we can go even shorter, e.g.
> 
> 	The default is set to the value of `format.encodeHeaders`
> 	configuration variable.

OK, I'll go with that.

> > -		if (needs_rfc2047_encoding(namebuf, namelen)) {
> > +		if (pp->encode_headers &&
> > +				needs_rfc2047_encoding(namebuf, namelen)) {
> 
> Don't overly indent the second line like this.  The same comment
> applies to the next hunk (not quoted).
> 
> Thanks.

OK.

Junio C Hamano April 7, 2020, 7:37 p.m. UTC | #7
Emma Brooks <me@pluvano.com> writes:

> It's also too vague and it's not entirely clear from the option itself
> what sort of encoding it refers to. I will change it to
> --[no-]q-encode-headers and format.qEncodeHeaders in v2 unless there are
> other suggestions.

I actually did not mean to push you into that direction.  We can,
and do want to, keep the most generic "--[no-]encode-headers" if we
do not anticipate us wanting to special case the Q encoding.  A
sample question to ask is "would it make sense to disable q-encoding
but still perform other parts of 'encode headers'?"  I haven't
thought deeply about such questions, but as a proposer of this
topic, you would certainly have, and I was hoping that you'd say
things like "Q-encoding is the only thing that we do to munge
headers, so there aren't any 'other parts of encoding headers' we
need to worry about", "there are things like X, Y and Z that we do
to the headers when we enable Q-encoding, but they all are what we
do not want when we do not want the Q-encoding", which would be a
very good sign that assures us that "--[no-]encode-headers" is a
good name.

Thanks.

Jeff King April 7, 2020, 8:31 p.m. UTC | #8
On Tue, Apr 07, 2020 at 12:37:31PM -0700, Junio C Hamano wrote:

> Emma Brooks <me@pluvano.com> writes:
> 
> > It's also too vague and it's not entirely clear from the option itself
> > what sort of encoding it refers to. I will change it to
> > --[no-]q-encode-headers and format.qEncodeHeaders in v2 unless there are
> > other suggestions.
> 
> I actually did not mean to push you into that direction.  We can,
> and do want to, keep the most generic "--[no-]encode-headers" if we
> do not anticipate us wanting to special case the Q encoding.  A
> sample question to ask is "would it make sense to disable q-encoding
> but still perform other parts of 'encode headers'?"  I haven't
> thought deeply about such questions, but as a proposer of this
> topic, you would certainly have, and I was hoping that you'd say
> things like "Q-encoding is the only thing that we do to munge
> headers, so there aren't any 'other parts of encoding headers' we
> need to worry about", "there are things like X, Y and Z that we do
> to the headers when we enable Q-encoding, but they all are what we
> do not want when we do not want the Q-encoding", which would be a
> very good sign that assures us that "--[no-]encode-headers" is a
> good name.

I thought we might b-encode some headers, but couldn't find any code to
do so (after about 5 minutes of looking).

However, this new option isn't just for format-patch. It is available
for all revision walkers (as it should be; I can say "log
--format=email" and I might want to use it there). And there it is
less clear that "headers" means email headers, and not other
object headers (e.g., that you might see with --format=raw).

Saying "--no-rfc2047-encoding" would be more descriptive to _me_, but I
wonder if people not so familiar with the standards would find it a bit
obscure. Another option is to invert it to "--8bit-email-headers" or
something.

-Peff

Junio C Hamano April 7, 2020, 10:20 p.m. UTC | #9
Jeff King <peff@peff.net> writes:

> Saying "--no-rfc2047-encoding" would be more descriptive to _me_, but I
> wonder if people not so familiar with the standards would find it a bit
> obscure. Another option is to invert it to "--8bit-email-headers" or
> something.

Yup, having "rfc2047" in the name of the option was one of the
things I considered suggesting, but I didn't for the same reason.  I
am OK with "--[no-]8bit-email-headers" (when --8bit, rfc2047 is
skipped).  Or "--[no-]email-header-encoding".

Thanks.

Emma Brooks April 8, 2020, 4:08 a.m. UTC | #10
On 2020-04-07 12:37:31-0700, Junio C Hamano wrote:
> Emma Brooks <me@pluvano.com> writes:
> 
> > It's also too vague and it's not entirely clear from the option itself
> > what sort of encoding it refers to. I will change it to
> > --[no-]q-encode-headers and format.qEncodeHeaders in v2 unless there are
> > other suggestions.
> 
> I actually did not mean to push you into that direction.  We can,
> and do want to, keep the most generic "--[no-]encode-headers" if we
> do not anticipate us wanting to special case the Q encoding.  A
> sample question to ask is "would it make sense to disable q-encoding
> but still perform other parts of 'encode headers'?"  I haven't
> thought deeply about such questions, but as a proposer of this
> topic, you would certainly have, and I was hoping that you'd say
> things like "Q-encoding is the only thing that we do to munge
> headers, so there aren't any 'other parts of encoding headers' we
> need to worry about", "there are things like X, Y and Z that we do
> to the headers when we enable Q-encoding, but they all are what we
> do not want when we do not want the Q-encoding", which would be a
> very good sign that assures us that "--[no-]encode-headers" is a
> good name.

Ah. I don't think there are any cases where we do other sorts of
encoding, or want to enable one "part" of encoding and disable another.
I do think the name needs to be more obviously about *email* headers as
Jeff pointed out, though.

Patch

diff --git a/Documentation/config/format.txt b/Documentation/config/format.txt
index 45c7bd5a8f..ee0eb4c5da 100644
--- a/Documentation/config/format.txt
+++ b/Documentation/config/format.txt
@@ -57,6 +57,10 @@  format.suffix::
 	`.patch`. Use this variable to change that suffix (make sure to
 	include the dot if you want it).
 
+format.encodeHeaders::
+	Encode email headers that have non-ASCII characters with
+	"Q-encoding" for email transmission. Defaults to true.
+
 format.pretty::
 	The default pretty format for log/show/whatchanged command,
 	See linkgit:git-log[1], linkgit:git-show[1],
diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt
index 0d4f8951bb..a1483a6a34 100644
--- a/Documentation/git-format-patch.txt
+++ b/Documentation/git-format-patch.txt
@@ -24,6 +24,7 @@  SYNOPSIS
 		   [(--reroll-count|-v) <n>]
 		   [--to=<email>] [--cc=<email>]
 		   [--[no-]cover-letter] [--quiet]
+		   [--[no-]encode-headers]
 		   [--no-notes | --notes[=<ref>]]
 		   [--interdiff=<previous>]
 		   [--range-diff=<previous> [--creation-factor=<percent>]]
@@ -253,6 +254,12 @@  feeding the result to `git send-email`.
 	containing the branch description, shortlog and the overall diffstat.  You can
 	fill in a description in the file before sending it out.
 
+--[no-]encode-headers::
+	Encode email headers that have non-ASCII characters with
+	"Q-encoding", instead of outputting the headers verbatim. The
+	default is `--encode-headers` unless the `format.encodeHeaders`
+	configuration variable is set.
+
 --interdiff=<previous>::
 	As a reviewer aid, insert an interdiff into the cover letter,
 	or as commentary of the lone patch of a 1-patch series, showing
diff --git a/builtin/log.c b/builtin/log.c
index 83a4a6188e..1a27049c88 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -46,6 +46,7 @@  static int default_abbrev_commit;
 static int default_show_root = 1;
 static int default_follow;
 static int default_show_signature;
+static int default_encode_headers = 1;
 static int decoration_style;
 static int decoration_given;
 static int use_mailmap_config = 1;
@@ -151,6 +152,7 @@  static void cmd_log_init_defaults(struct rev_info *rev)
 	rev->show_root_diff = default_show_root;
 	rev->subject_prefix = fmt_patch_subject_prefix;
 	rev->show_signature = default_show_signature;
+	rev->encode_headers = default_encode_headers;
 	rev->diffopt.flags.allow_textconv = 1;
 
 	if (default_date_mode)
@@ -438,6 +440,10 @@  static int git_log_config(const char *var, const char *value, void *cb)
 		return git_config_string(&fmt_pretty, var, value);
 	if (!strcmp(var, "format.subjectprefix"))
 		return git_config_string(&fmt_patch_subject_prefix, var, value);
+	if (!strcmp(var, "format.encodeheaders")) {
+		default_encode_headers = git_config_bool(var, value);
+		return 0;
+	}
 	if (!strcmp(var, "log.abbrevcommit")) {
 		default_abbrev_commit = git_config_bool(var, value);
 		return 0;
@@ -1719,6 +1725,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	rev.show_notes = show_notes;
 	memcpy(&rev.notes_opt, &notes_opt, sizeof(notes_opt));
 	rev.commit_format = CMIT_FMT_EMAIL;
+	rev.encode_headers = default_encode_headers;
 	rev.expand_tabs_in_log_default = 0;
 	rev.verbose_header = 1;
 	rev.diff = 1;
diff --git a/log-tree.c b/log-tree.c
index 897a90233e..eaec299762 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -693,6 +693,7 @@  void show_log(struct rev_info *opt)
 	ctx.abbrev = opt->diffopt.abbrev;
 	ctx.after_subject = extra_headers;
 	ctx.preserve_subject = opt->preserve_subject;
+	ctx.encode_headers = opt->encode_headers;
 	ctx.reflog_info = opt->reflog_info;
 	ctx.fmt = opt->commit_format;
 	ctx.mailmap = opt->mailmap;
diff --git a/pretty.c b/pretty.c
index 28afc701b6..12959cca4d 100644
--- a/pretty.c
+++ b/pretty.c
@@ -474,7 +474,8 @@  void pp_user_info(struct pretty_print_context *pp,
 		}
 
 		strbuf_addstr(sb, "From: ");
-		if (needs_rfc2047_encoding(namebuf, namelen)) {
+		if (pp->encode_headers &&
+				needs_rfc2047_encoding(namebuf, namelen)) {
 			add_rfc2047(sb, namebuf, namelen,
 				    encoding, RFC2047_ADDRESS);
 			max_length = 76; /* per rfc2047 */
@@ -1767,7 +1768,8 @@  void pp_title_line(struct pretty_print_context *pp,
 	if (pp->print_email_subject) {
 		if (pp->rev)
 			fmt_output_email_subject(sb, pp->rev);
-		if (needs_rfc2047_encoding(title.buf, title.len))
+		if (pp->encode_headers &&
+				needs_rfc2047_encoding(title.buf, title.len))
 			add_rfc2047(sb, title.buf, title.len,
 						encoding, RFC2047_SUBJECT);
 		else
diff --git a/pretty.h b/pretty.h
index 4ad1fc31ff..4840f7e559 100644
--- a/pretty.h
+++ b/pretty.h
@@ -43,6 +43,7 @@  struct pretty_print_context {
 	struct string_list *mailmap;
 	int color;
 	struct ident_split *from_ident;
+	unsigned encode_headers:1;
 
 	/*
 	 * Fields below here are manipulated internally by pp_* functions and
diff --git a/revision.c b/revision.c
index 8136929e23..961a901985 100644
--- a/revision.c
+++ b/revision.c
@@ -2241,6 +2241,10 @@  static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->topo_order = 1;
 		revs->rewrite_parents = 1;
 		revs->graph = graph_init(revs);
+	} else if (!strcmp(arg, "--encode-headers")) {
+		revs->encode_headers = 1;
+	} else if (!strcmp(arg, "--no-encode-headers")) {
+		revs->encode_headers = 0;
 	} else if (!strcmp(arg, "--root")) {
 		revs->show_root_diff = 1;
 	} else if (!strcmp(arg, "--no-commit-id")) {
diff --git a/revision.h b/revision.h
index 475f048fb6..e4dff23d62 100644
--- a/revision.h
+++ b/revision.h
@@ -203,7 +203,8 @@  struct rev_info {
 			use_terminator:1,
 			missing_newline:1,
 			date_mode_explicit:1,
-			preserve_subject:1;
+			preserve_subject:1,
+			encode_headers:1;
 	unsigned int	disable_stdin:1;
 	/* --show-linear-break */
 	unsigned int	track_linear:1,
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index b653dd7d44..d9c0fe7a45 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -1160,6 +1160,59 @@  test_expect_success 'format-patch wraps extremely long from-header (rfc2047)' '
 	check_author "Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar"
 '
 
+cat >expect <<'EOF'
+From: Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar
+ Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo
+ Bar Foo Bar Foo Bar Foo Bar <author@example.com>
+EOF
+test_expect_success 'format-patch wraps extremely long from-header (non-ASCII without Q-encoding)' '
+	echo content >>file &&
+	git add file &&
+	GIT_AUTHOR_NAME="Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar" \
+	git commit -m author-check &&
+	git format-patch --no-encode-headers --stdout -1 >patch &&
+	sed -n "/^From: /p; /^ /p; /^$/q" patch >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+Subject: [PATCH] Foö
+EOF
+test_expect_success 'subject lines are unencoded with --no-encode-headers' '
+	echo content >>file &&
+	git add file &&
+	git commit -m "Foö" &&
+	git format-patch --no-encode-headers -1 --stdout >patch &&
+	grep ^Subject: patch >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+Subject: [PATCH] Foö
+EOF
+test_expect_success 'subject lines are unencoded with format.encodeHeaders=false' '
+	echo content >>file &&
+	git add file &&
+	git commit -m "Foö" &&
+	git config format.encodeHeaders false &&
+	git format-patch -1 --stdout >patch &&
+	grep ^Subject: patch >actual &&
+	test_cmp expect actual
+'
+
+cat >expect <<'EOF'
+Subject: [PATCH] =?UTF-8?q?Fo=C3=B6?=
+EOF
+test_expect_success '--encode-headers overrides format.encodeHeaders' '
+	echo content >>file &&
+	git add file &&
+	git commit -m "Foö" &&
+	git config format.encodeHeaders false &&
+	git format-patch --encode-headers -1 --stdout >patch &&
+	grep ^Subject: patch >actual &&
+	test_cmp expect actual
+'
+
 cat >expect <<'EOF'
 Subject: header with . in it
 EOF