[2/2] send-email: add support for --mailmap

Message ID 20240816-jk-send-email-mailmap-support-v1-2-68ca5b4a6078@gmail.com (mailing list archive)
State Superseded
Series send-email: add --mailmap support

Commit Message

Jacob Keller Aug. 16, 2024, 11:06 p.m. UTC
From: Jacob Keller <jacob.keller@gmail.com>

In certain cases, a user may be generating a patch for an old commit
which now has an out-of-date author or other identity. For example,
consider a team member who contributes to an internal fork of a project,
and then later leaves the company.

It may be desired to submit this change upstream, but the author
identity now points to an invalid email address which will bounce. This
is likely to annoy users who respond to the email on the public mailing
list.

This can be manually corrected, but requires a bit of effort, as it may
require --suppress-cc or otherwise formatting a patch separately and
manually removing any unintended email addresses.

Git already has support for the mailmap, which allows mapping addresses
for old commits to new canonical names and addresses.

Teach git send-email the --mailmap option. When supplied, use git
check-mailmap (with the --no-brackets mode) as a final stage when
processing address lists. This will convert all addresses to their
canonical name and email according to the mailmap file.

A mailmap file can then be configured to point the invalid addresses
either to their current canonical email (if they still participate in
the open source project), or possibly to a new owner within the company.

This enables the sender to avoid accidentally listing an invalid address
when sending such a change.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
 git-send-email.perl   | 14 ++++++++++++++
 t/t9001-send-email.sh | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)
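
As a rough illustration of the mapping step the commit message describes, the sketch below mimics the lookup in plain shell and awk. It is not how git implements mailmap handling, and `map_address` is a hypothetical helper, not a git command. It assumes only the two entry forms used in this patch's tests (`<canonical> <old>` and `Name <canonical> <old>`), compares addresses exactly, and uses an output format chosen for the sketch that may differ from what git check-mailmap prints.

```shell
#!/bin/sh
# Illustration only: a tiny stand-in for a mailmap lookup.
# map_address is a hypothetical helper, not part of git.
# usage: map_address MAILMAP_FILE ADDRESS
map_address() {
	awk -v want="$2" '
		# Skip comments and blank lines.
		/^[ \t]*#/ || /^[ \t]*$/ { next }
		{
			line = $0
			name = ""
			n = 0
			# Peel off each <address>; text before the first one is the name.
			while (match(line, /<[^>]*>/)) {
				if (n == 0) {
					name = substr(line, 1, RSTART - 1)
					gsub(/^[ \t]+|[ \t]+$/, "", name)
				}
				addr[++n] = substr(line, RSTART + 1, RLENGTH - 2)
				line = substr(line, RSTART + RLENGTH)
			}
			# Entry form: [Name] <canonical> <old>; exact match on <old>.
			if (n == 2 && addr[2] == want) {
				print (name == "" ? addr[1] : name " <" addr[1] ">")
				found = 1
				exit
			}
		}
		# Addresses without an entry pass through unchanged.
		END { if (!found) print want }
	' "$1"
}
```

With the entry `Some Body <someone@example.com> <someone@example.org>` in `mailmap.test`, running `map_address mailmap.test someone@example.org` prints `Some Body <someone@example.com>`, while an address with no entry is printed as given.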

Comments

Eric Sunshine Aug. 16, 2024, 11:41 p.m. UTC | #1
On Fri, Aug 16, 2024 at 7:06 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> In certain cases, a user may be generating a patch for an old commit
> which now has an out-of-date author or other identity. For example,
> consider a team member who contributes to an internal fork of a project,
> and then later leaves the company.
>
> It may be desired to submit this change upstream, but the author
> identity now points to an invalid email address which will bounce. This
> is likely to annoy users who respond to the email on the public mailing
> list.
>
> This can be manually corrected, but requires a bit of effort, as it may
> require --suppress-cc or otherwise formatting a patch separately and
> manually removing any unintended email addresses.
>
> Git already has support for the mailmap, which allows mapping addresses
> for old commits to new canonical names and addresses.
>
> Teach git send-email the --mailmap option. When supplied, use git
> check-mailmap (with the --no-brackets mode) as a final stage when
> processing address lists. This will convert all addresses to their
> canonical name and email according to the mailmap file.
>
> A mailmap file can then be configured to point the invalid addresses
> either to their current canonical email (if they still participate in
> the open source project), or possibly to a new owner within the company.
>
> This enables the sender to avoid accidentally listing an invalid address
> when sending such a change.

Nit: The final two paragraphs appear to repeat what was already stated
or implied earlier, thus don't seem to add any value to the commit
message.

Nit aside, similar to the question I asked about [1/2], are there
downsides to merely enabling this new behavior by default? It seems
like it would be generally desirable to have this translation happen
by default, so making everyone opt-in may be a disservice. On the
other hand, starting out with it disabled by default is understandable
as a cautious first step, though it might be nice to explain that in
the commit message. Similarly, one can imagine a world in which people
want to enable this and forget about it, thus would like it to be
controlled by configuration (though that can, of course, be left for a
future change).

> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
> diff --git a/git-send-email.perl b/git-send-email.perl
> @@ -1085,6 +1090,14 @@ sub expand_one_alias {
> +sub mailmap_address_list {
> +       my @addr_list = @_;
> +       if ($mailmap and @addr_list) {
> +               @addr_list = Git::command('check-mailmap', '--no-brackets', @_);
> +       }
> +       return @addr_list;
> +}

For some reason, I found this logic more difficult to follow than
expected, possibly because it doesn't feel quite Perlish, or possibly
because in this codebase, we often take care of the easy cases first
and return early. Thus, I may have been expecting the above to be
written more along the lines of:

    sub mailmap_address_list {
        return @_ unless @_ && $mailmap;
        return Git::command('check-mailmap', '--no-brackets', @_);
    }

Of course, it's highly subjective and not at all worth a reroll.

Jacob Keller Aug. 16, 2024, 11:49 p.m. UTC | #2
On 8/16/2024 4:41 PM, Eric Sunshine wrote:
> On Fri, Aug 16, 2024 at 7:06 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>> In certain cases, a user may be generating a patch for an old commit
>> which now has an out-of-date author or other identity. For example,
>> consider a team member who contributes to an internal fork of a project,
>> and then later leaves the company.
>>
>> It may be desired to submit this change upstream, but the author
>> identity now points to an invalid email address which will bounce. This
>> is likely to annoy users who respond to the email on the public mailing
>> list.
>>
>> This can be manually corrected, but requires a bit of effort, as it may
>> require --suppress-cc or otherwise formatting a patch separately and
>> manually removing any unintended email addresses.
>>
>> Git already has support for the mailmap, which allows mapping addresses
>> for old commits to new canonical names and addresses.
>>
>> Teach git send-email the --mailmap option. When supplied, use git
>> check-mailmap (with the --no-brackets mode) as a final stage when
>> processing address lists. This will convert all addresses to their
>> canonical name and email according to the mailmap file.
>>
>> A mailmap file can then be configured to point the invalid addresses
>> either to their current canonical email (if they still participate in
>> the open source project), or possibly to a new owner within the company.
>>
>> This enables the sender to avoid accidentally listing an invalid address
>> when sending such a change.
> 
> Nit: The final two paragraphs appear to repeat what was already stated
> or implied earlier, thus don't seem to add any value to the commit
> message.
> 

Sure, I think I got a bit verbose here.

> Nit aside, similar to the question I asked about [1/2], are there
> downsides to merely enabling this new behavior by default? It seems
> like it would be generally desirable to have this translation happen
> by default, so making everyone opt-in may be a disservice. On the
> other hand, starting out with it disabled by default is understandable
> as a cautious first step, though it might be nice to explain that in
> the commit message. Similarly, one can imagine a world in which people
> want to enable this and forget about it, thus would like it to be
> controlled by configuration (though that can, of course, be left for a
> future change).

I definitely did it mostly out of conservatism: "don't change the
default behavior".

For a general mailmap I think enabling it by default, with a config
option to disable it, would make sense. I think it might also make sense
to have per-identity configuration, so that different identities could
point at a different mailmap.

For example, one of the use cases we have is to have a mailmap file that
takes the now-invalid addresses and points them all to the current
maintainer. This way, the maintainer (who is sending these patches)
would have all the old addresses automatically point to him instead of
generating the bounced messages as currently happens by accident. This
type of mailmap file likely does not make sense as a general-purpose file.

I think for per-identity configurations, we would either need to add an
option to git check-mailmap to pass the mailmap file, or I would need to
figure out how to set config when calling Git::command() to force a
specific mailmap.
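
For the second of those options, the command-line equivalent can be sketched as below. `git -c <name>=<value>` and the `mailmap.file` setting are existing git features; `check_with_mailmap` and the file name `maintainer.mailmap` are hypothetical, and how this would be wired into Git::command() is left open. The contact is written in `<address>` form, on the assumption that the bare-address mode comes only from patch [1/2].

```shell
#!/bin/sh
# check_with_mailmap is a hypothetical wrapper, not part of git.
# usage: check_with_mailmap MAILMAP_FILE CONTACT...
check_with_mailmap() {
	map_file=$1
	shift
	# "git -c" overrides configuration for this single invocation only,
	# so each identity could pass its own mailmap file here.
	git -c mailmap.file="$map_file" check-mailmap "$@"
}

# Example (run from inside a git repository):
#   check_with_mailmap maintainer.mailmap "<someone@example.org>"
```

Because the override applies to one invocation, nothing is written to the repository or user configuration.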

> 
>> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
>> ---
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> @@ -1085,6 +1090,14 @@ sub expand_one_alias {
>> +sub mailmap_address_list {
>> +       my @addr_list = @_;
>> +       if ($mailmap and @addr_list) {
>> +               @addr_list = Git::command('check-mailmap', '--no-brackets', @_);
>> +       }
>> +       return @addr_list;
>> +}
> 
> For some reason, I found this logic more difficult to follow than
> expected, possibly because it doesn't feel quite Perlish, or possibly
> because in this codebase, we often take care of the easy cases first
> and return early. Thus, I may have been expecting the above to be
> written more along the lines of:
> 
>     sub mailmap_address_list {
>         return @_ unless @_ && $mailmap;
>         return Git::command('check-mailmap', '--no-brackets', @_);
>     }
> 

Ah, yea that makes more sense. I had been trying to figure that out but
I am not as used to the unless syntax.

> Of course, it's highly subjective and not at all worth a reroll.
>
Patch

diff --git a/git-send-email.perl b/git-send-email.perl
index 72044e5ef3a8..9a081e9f9b41 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -46,6 +46,8 @@  sub usage {
     --compose-encoding      <str>  * Encoding to assume for introduction.
     --8bit-encoding         <str>  * Encoding to assume 8bit mails if undeclared
     --transfer-encoding     <str>  * Transfer encoding to use (quoted-printable, 8bit, base64)
+    --[no-]mailmap                 * Use mailmap file to map all email addresses to canonical
+                                     real names and email addresses.
 
   Sending:
     --envelope-sender       <str>  * Email envelope sender.
@@ -278,6 +280,7 @@  sub do_edit {
 my $chain_reply_to = 0;
 my $use_xmailer = 1;
 my $validate = 1;
+my $mailmap = 0;
 my $target_xfer_encoding = 'auto';
 my $forbid_sendmail_variables = 1;
 
@@ -524,6 +527,8 @@  sub config_regexp {
 		    "thread!" => \$thread,
 		    "validate!" => \$validate,
 		    "transfer-encoding=s" => \$target_xfer_encoding,
+		    "mailmap!" => \$mailmap,
+		    "use-mailmap!" => \$mailmap,
 		    "format-patch!" => \$format_patch,
 		    "8bit-encoding=s" => \$auto_8bit_encoding,
 		    "compose-encoding=s" => \$compose_encoding,
@@ -1085,6 +1090,14 @@  sub expand_one_alias {
 our ($message_id, %mail, $subject, $in_reply_to, $references, $message,
 	$needs_confirm, $message_num, $ask_default);
 
+sub mailmap_address_list {
+	my @addr_list = @_;
+	if ($mailmap and @addr_list) {
+		@addr_list = Git::command('check-mailmap', '--no-brackets', @_);
+	}
+	return @addr_list;
+}
+
 sub extract_valid_address {
 	my $address = shift;
 	my $local_part_regexp = qr/[^<>"\s@]+/;
@@ -1294,6 +1307,7 @@  sub process_address_list {
 	@addr_list = expand_aliases(@addr_list);
 	@addr_list = sanitize_address_list(@addr_list);
 	@addr_list = validate_address_list(@addr_list);
+	@addr_list = mailmap_address_list(@addr_list);
 	return @addr_list;
 }
 
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 64a4ab3736ef..185697d22563 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -2379,6 +2379,55 @@  test_expect_success $PREREQ 'leading and trailing whitespaces are removed' '
 	test_cmp expected-list actual-list
 '
 
+test_expect_success $PREREQ 'mailmap support with --to' '
+	clean_fake_sendmail &&
+	test_config mailmap.file "mailmap.test" &&
+	cat >mailmap.test <<-EOF &&
+	Some Body <someone@example.com> <someone@example.org>
+	EOF
+	git format-patch --stdout -1 >a.patch &&
+	git send-email \
+		--from="Example <nobody@example.com>" \
+		--smtp-server="$(pwd)/fake.sendmail" \
+		--to=someone@example.org \
+		--mailmap \
+		a.patch \
+		2>errors >out &&
+	grep "^!someone@example\.com!$" commandline1
+'
+
+test_expect_success $PREREQ 'mailmap support in To header' '
+	clean_fake_sendmail &&
+	test_config mailmap.file "mailmap.test" &&
+	cat >mailmap.test <<-EOF &&
+	<someone@example.com> <someone@example.org>
+	EOF
+	git format-patch --stdout -1 --to=someone@example.org >a.patch &&
+	git send-email \
+		--from="Example <nobody@example.com>" \
+		--smtp-server="$(pwd)/fake.sendmail" \
+		--mailmap \
+		a.patch \
+		2>errors >out &&
+	grep "^!someone@example\.com!$" commandline1
+'
+
+test_expect_success $PREREQ 'mailmap support in Cc header' '
+	clean_fake_sendmail &&
+	test_config mailmap.file "mailmap.test" &&
+	cat >mailmap.test <<-EOF &&
+	<someone@example.com> <someone@example.org>
+	EOF
+	git format-patch --stdout -1 --cc=someone@example.org >a.patch &&
+	git send-email \
+		--from="Example <nobody@example.com>" \
+		--smtp-server="$(pwd)/fake.sendmail" \
+		--mailmap \
+		a.patch \
+		2>errors >out &&
+	grep "^!someone@example\.com!$" commandline1
+'
+
 test_expect_success $PREREQ 'test using command name with --sendmail-cmd' '
 	clean_fake_sendmail &&
 	PATH="$PWD:$PATH" \