Message ID | 20240813-jk-support-mailmap-git-format-patch-v1-1-1aea690ea5dd@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | format-patch: add support for mailmap file | expand |
On 2024.08.13 14:45, Jacob Keller wrote: > From: Jacob Keller <jacob.keller@gmail.com> > > Git has support for a mailmap file which translates author and committer > names and email addresses to canonical values. > > Git log has log.mailmap, and the associated --mailmap and --use-mailmap > options. > > Teach git format-patch the format.mailmap and --mailmap options so that > formatting a patch can also reflect the canonical values from the > mailmap file. > > Reported-by: Anthony Nguyen <anthony.l.nguyen@intel.com> > Signed-off-by: Jacob Keller <jacob.keller@gmail.com> I am not sure I understand the utility here; using mailmap at log time makes sense because these are old, established commits that may have outdated contact information. But when writing patches with format-patch, presumably these are still somewhat WIP patches. Is it not better to just reset the author information before running git-format-patch in this case? If I've misunderstood the use case, please let me know.
On 8/13/2024 3:45 PM, Josh Steadmon wrote: > On 2024.08.13 14:45, Jacob Keller wrote: >> From: Jacob Keller <jacob.keller@gmail.com> >> >> Git has support for a mailmap file which translates author and committer >> names and email addresses to canonical values. >> >> Git log has log.mailmap, and the associated --mailmap and --use-mailmap >> options. >> >> Teach git format-patch the format.mailmap and --mailmap options so that >> formatting a patch can also reflect the canonical values from the >> mailmap file. >> >> Reported-by: Anthony Nguyen <anthony.l.nguyen@intel.com> >> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> > > I am not sure I understand the utility here; using mailmap at log time > makes sense because these are old, established commits that may have > outdated contact information. But when writing patches with > format-patch, presumably these are still somewhat WIP patches. Is it not > better to just reset the author information before running > git-format-patch in this case? > I suppose that this is rather an uncommon circumstance. > If I've misunderstood the use case, please let me know. > I've had a few cases where I was formatting an old commit. The example in this case was a change made to an internal tree by one author quite some time ago. In the meantime, that person has left the company, and his company address is no longer valid. We still typically put the original author on such a patch in order to give them credit even if they're no longer on our team when sending the change, as patches are made by people, not companies :) If we left the address alone, it would cause a bounce on the mailing list if it gets included in the cc. In some cases, the upstream project mailmap already includes a mapping from their old company address to their current public address. The internal tree commits are already baked and can't be changed. We can of course fix the generated patches from these commits manually. It seemed convenient to get mailmap to do this for us. I guess this is not very common, and may not be worth the trouble to maintain in format-patch, I can understand that. Another alternative we considered was something like a "--no-cc" option or configuration to prevent including CC for a dead email address when sending such patches. Given that the original author no longer works for us, it is not like we expect them to actually respond or maintain the code, and in all cases the patches have a sign-off from someone else on the team. Thanks, Jake
On Tue, Aug 13, 2024 at 05:20:41PM -0700, Jacob Keller wrote: > I've had a few cases where I was formatting an old commit. The example > in this case was a change made to an internal tree by one author quite > some time ago. In the meantime, that person has left the company, and > his company address is no longer valid. We still typically put the > original author on such a patch in order to give them credit even if > they're no longer on our team when sending the change, as patches are > made by people, not companies :) > > If we left the address alone, it would cause a bounce on the mailing > list if it gets included in the cc. In some cases, the upstream project > mailmap already includes a mapping from their old company address to > their current public address. > > The internal tree commits are already baked and can't be changed. We can > of course fix the generated patches from these commits manually. It > seemed convenient to get mailmap to do this for us. I think that makes sense, especially if the caller is specifically asking to enable address mapping. I do wonder if the new format.mailmap might be surprising for some callers, though. For example, would a rebase using the "apply" backend quietly rewrite commit authors using the mailmap? -Peff
On Wed, Aug 14, 2024, 12:26 AM Jeff King <peff@peff.net> wrote: > On Tue, Aug 13, 2024 at 05:20:41PM -0700, Jacob Keller wrote: > > The internal tree commits are already baked and can't be changed. We can > > of course fix the generated patches from these commits manually. It > > seemed convenient to get mailmap to do this for us. > > I think that makes sense, especially if the caller is specifically > asking to enable address mapping. I do wonder if the new format.mailmap > might be surprising for some callers, though. For example, would a > rebase using the "apply" backend quietly rewrite commit authors using > the mailmap? > > -Peff Ya, I think the config probably doesn't make sense thinking about it. I also realized the actual problem we have is that the mail we send is including a dead address. It would make more sense to just allow send-email to translate from the old addresses to the canonical ones. Specifically the issue we had is that an old commit to the internal repo was prepped for submission by a new owner, and when he sent the email, it automatically included the address of the original author, which ofcourse no longer is valid. This resulted in the mail annoying several people due to bouncing. So the real problem to solve here is perhaps some way to filter the mails either via translation or removal so that we don't have it tagged to or cc a dead email. As I said above, its not like we expect the original author to actually respond or own the change, but we did not want to completely remove their name from the change, because they are the original person who wrote it. In our case, it may be more useful to have the mailmap not at format-patch time but instead at email time... Thanks, Jake
Jacob Keller <jacob.keller@gmail.com> writes: > On Wed, Aug 14, 2024, 12:26 AM Jeff King <peff@peff.net> wrote: >> On Tue, Aug 13, 2024 at 05:20:41PM -0700, Jacob Keller wrote: >> > The internal tree commits are already baked and can't be changed. We can >> > of course fix the generated patches from these commits manually. It >> > seemed convenient to get mailmap to do this for us. >> >> I think that makes sense, especially if the caller is specifically >> asking to enable address mapping. I do wonder if the new format.mailmap >> might be surprising for some callers, though. For example, would a >> rebase using the "apply" backend quietly rewrite commit authors using >> the mailmap? >> >> -Peff > > Ya, I think the config probably doesn't make sense thinking about it. This depends on what an average user's perception is. People like I would know intimately that "format-patch" is just a glorified "log" with special features, so I wouldn't be surprised if somebody sends in a bug report saying "git -c log.mailmap=1 format-patch" does not honor the mailmap. In other words, people may expect "git log -1 -p --format=email HEAD" and "git format-patch -1 --stdout HEAD" to be pretty similar. > In our case, it may be more useful to have the mailmap not at > format-patch time but instead at email time... And probably you want not the regular mailmap but the custom one meant to be used only by send-email to ignore the recipient. You'd still want to see who wrote it back when they were employed with you in "git log" output, and not having them in your regular mailmap would be a way to do so. You can instead choose to use your regular mailmap to map them to their current address, which is what we seem to do here in this project, but it would probably be less common in $Corp settings. If the mailmap you give to send-email maps these dead mailboxes to some known pattern (e.g. a fixed string in the e-mail part, like "A U Thor <invalid@invalid>"), teaching send-email to recognise them would be trivial. Thanks.
Jacob Keller <jacob.e.keller@intel.com> writes: > diff --git a/builtin/log.c b/builtin/log.c > index 4d4b60caa76a..94560add6fbc 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -975,6 +975,7 @@ struct format_config { > struct log_config log; > enum thread_level thread; > int do_signoff; > + int use_mailmap; As we share the "--[no-]mailmap" option from the command line with "git log", shouldn't we be able to reuse log.use_mailmap_config as well without adding yet another member to the struct? "git log" defaults use_mailmap_config to true, but this command would want to default it to false to avoid disrupting existing users, or something, perhaps? > diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt > index 8708b3159309..f3de349990bf 100644 > --- a/Documentation/git-format-patch.txt > +++ b/Documentation/git-format-patch.txt > @@ -30,6 +30,7 @@ SYNOPSIS > [--range-diff=<previous> [--creation-factor=<percent>]] > [--filename-max-length=<n>] > [--progress] > + [(--mailmap|--no-mailmap|--use-mailmap|--no-use-mailmap)] We seem to say "[--[no-]cover-letter]" to abbreviate, and because "--[no-]use-mailmap" is merely a synonym, shouldn't it be sufficient to say [--[no-]mailmap] without any other frills? I find the use of the (al|terna|tive) here especially annoying, as it is not like it is an error if you give "--mailmap" and then say "--no-mailmap" later on the same command line---it's just the usual "last one wins". I haven't decided what my response to Peff's concern on the fallout to "rebase --apply". On one hand, those who conciously choose to rebase by creating patches and applying them would find it puzzling if it did not honor format.mailmap setting. But I would not be strongly opposed if we hardcoded to pass "--no-mailmap" to the internal invocation of "format-patch", just like we hardcode "-k" and other options and justified it with "the use of format-patch is a mere implementation detail". Thanks.
diff --git a/builtin/log.c b/builtin/log.c index 4d4b60caa76a..94560add6fbc 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -975,6 +975,7 @@ struct format_config { struct log_config log; enum thread_level thread; int do_signoff; + int use_mailmap; enum auto_base_setting auto_base; char *base_commit; char *from; @@ -1131,6 +1132,10 @@ static int git_format_config(const char *var, const char *value, cfg->do_signoff = git_config_bool(var, value); return 0; } + if (!strcmp(var, "format.mailmap")) { + cfg->use_mailmap = git_config_bool(var, value); + return 0; + } if (!strcmp(var, "format.signature")) { FREE_AND_NULL(cfg->signature); return git_config_string(&cfg->signature, var, value); @@ -2042,6 +2047,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) N_("generate a cover letter")), OPT_BOOL(0, "numbered-files", &just_numbers, N_("use simple number sequence for output file names")), + OPT_BOOL(0, "use-mailmap", &cfg.use_mailmap, N_("use mail map file")), + OPT_ALIAS(0, "mailmap", "use-mailmap"), OPT_STRING(0, "suffix", &fmt_patch_suffix, N_("sfx"), N_("use <sfx> instead of '.patch'")), OPT_INTEGER(0, "start-number", &start_number, @@ -2160,6 +2167,12 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) rev.force_in_body_from = force_in_body_from; + if (cfg.use_mailmap) { + rev.mailmap = xmalloc(sizeof(struct string_list)); + string_list_init_nodup(rev.mailmap); + read_mailmap(rev.mailmap); + } + if (!fmt_patch_suffix) fmt_patch_suffix = cfg.fmt_patch_suffix; diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index 8708b3159309..f3de349990bf 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -30,6 +30,7 @@ SYNOPSIS [--range-diff=<previous> [--creation-factor=<percent>]] [--filename-max-length=<n>] [--progress] + [(--mailmap|--no-mailmap|--use-mailmap|--no-use-mailmap)] [<common-diff-options>] [ <since> | <revision-range> ] @@ -145,6 +146,12 @@ include::diff-options.txt[] Print all commits to the standard output in mbox format, instead of creating a file for each one. +--[no-]mailmap:: +--[no-]use-mailmap:: + Use mailmap file to map author and committer names and email + addresses to canonical real names and email addresses. See + linkgit:git-shortlog[1]. + --attach[=<boundary>]:: Create multipart/mixed attachment, the first part of which is the commit message and the patch itself in the diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh index 884f83fb8a45..3a3ebddfe5c4 100755 --- a/t/t4014-format-patch.sh +++ b/t/t4014-format-patch.sh @@ -1215,7 +1215,7 @@ check_author() { echo content >>file && git add file && GIT_AUTHOR_NAME=$1 git commit -m author-check && - git format-patch --stdout -1 >patch && + git format-patch $2 --stdout -1 >patch && sed -n "/^From: /p; /^ /p; /^$/q" patch >actual && test_cmp expect actual } @@ -1285,6 +1285,53 @@ test_expect_success 'format-patch wraps extremely long from-header (rfc2047)' ' check_author "Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar" ' +cat >mail.map <<'EOF' +Foo B. Baz <author@example.com> +EOF + +cat >expect <<'EOF' +From: "Foo B. Baz" <author@example.com> +EOF +test_expect_success 'format-patch format.mailmap maps properly' ' + test_config format.mailmap true && + test_config mailmap.file mail.map && + check_author "Foo B. Bar" +' + +cat >expect <<'EOF' +From: "Foo B. Bar" <author@example.com> +EOF +test_expect_success 'format-patch --no-mailmap overrides format.mailmap' ' + test_config format.mailmap true && + test_config mailmap.file mail.map && + check_author "Foo B. Bar" "--no-mailmap" +' + +cat >expect <<'EOF' +From: "Foo B. Bar" <author@example.com> +EOF +test_expect_success 'format-patch --no-use-mailmap overrides format.mailmap' ' + test_config format.mailmap true && + test_config mailmap.file mail.map && + check_author "Foo B. Bar" "--no-use-mailmap" +' + +cat >expect <<'EOF' +From: "Foo B. Baz" <author@example.com> +EOF +test_expect_success 'format-patch --mailmap' ' + test_config mailmap.file mail.map && + check_author "Foo B. Bar" "--mailmap" +' + +cat >expect <<'EOF' +From: "Foo B. Baz" <author@example.com> +EOF +test_expect_success 'format-patch --use-mailmap' ' + test_config mailmap.file mail.map && + check_author "Foo B. Bar" "--use-mailmap" +' + cat >expect <<'EOF' From: Foö Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo Bar Foo