diff mbox series

[RFC] mailmap: fix check-mailmap with full mailmap line

Message ID 20250213-jk-fix-sendemail-mailinfo-v1-1-c0b06c215f21@gmail.com (mailing list archive)
State New
Headers show
Series [RFC] mailmap: fix check-mailmap with full mailmap line | expand

Commit Message

Jacob Keller Feb. 13, 2025, 10:47 p.m. UTC
From: Jacob Keller <jacob.keller@gmail.com>

I recently had reported to me a crash from a coworker using the recently
added sendemail mailmap support:

  3724814 Segmentation fault      (core dumped) git check-mailmap "bugs@company.xx"

This appears to happen because of the NULL pointer name passed into
map_user(). Fixing this, by assigning the name to be the empty string I
still saw somewhat unexpected results.

With a mailmap file containing:

A <a@domain.com> B <b@domain.com>

I get the following unexpected result:

$ git check-mailmap b@domain.com
<b@domain.com>

Based on my interpretation of the mailmap documentation, I would have
expected this to translate to "A <a@domain.com>".

I checked the map_user implementation, and noticed that it has a block
to perform a name lookup with lookup_prefix on the subitems under the
email. This appears to fail when given the empty string, as it somehow
doesn't consider the empty string as a valid prefix for any of the
names. It then defaults to using the "simple" entry of the map.

This doesn't make sense, because "full" mailmap entries (those with both
an email and an old name) don't store anything in the
mailmap_entry->name and mailmap_entry->email. At least, not without a
shortform entry filling that in.

This results in the item having a NULL name and email, which triggers
the early return and results in failure to update the return parameters.

I tried my hand at fixing this with a new check to select the only entry
in a mailmap entry with a single subitem. This fixed my test case, but
resulted in a different test case failure, which I don't quite
understand. The whitespace syntax test now seems to fail expectations by
translating a previously untranslated name.

I suspect this isn't the most elegant solution to the situation, and I
don't fully grasp the mailmap code, so help would be appreciated in
finding a good solution.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
 builtin/check-mailmap.c |  2 +-
 mailmap.c               |  9 ++++++++-
 t/t4203-mailmap.sh      | 12 ++++++++++++
 3 files changed, 21 insertions(+), 2 deletions(-)


---
base-commit: 5ffbd7fcf84b313bb07e91246eb9419ebd94a7e7
change-id: 20250213-jk-fix-sendemail-mailinfo-32f027b1b9e7

Best regards,

Comments

Junio C Hamano Feb. 13, 2025, 11:45 p.m. UTC | #1
Jacob Keller <jacob.e.keller@intel.com> writes:

> I recently had reported to me a crash from a coworker using the recently
> added sendemail mailmap support:
>
>   3724814 Segmentation fault      (core dumped) git check-mailmap "bugs@company.xx"

Thanks for relaying the report.
    
I can easily reproduce your segfault with our own mailmap, by
picking at random an entry with both name and e-mail listed as
the mapping source, e.g.

    $ git check-mailmap ksaitoh560@gmail.com

> With a mailmap file containing:
>
> A <a@domain.com> B <b@domain.com>
>
> I get the following unexpected result:
>
> $ git check-mailmap b@domain.com
> <b@domain.com>
>
> Based on my interpretation of the mailmap documentation, I would have
> expected this to translate to "A <a@domain.com>".

After reading "git help mailmap" twice, my interpretation is
different (disclaimer: I haven't read the implementation of the
mailmap code lately, and the last time I read any part of it is
probably at least a few years ago if not before).

Unlike "please map anybody with this e-mail address to 'A <a>'"
entry, which is spelled "A <a> <b>", the "fully spelled" form limits
the damage to those that match both name and e-mail, in order to
avoid "D <b>" from getting modified, while rewriting "B <b>" to "A
<a>".  So I would not expect a request with no name to be mapped at
all.

And the command emits the e-mail intact when it does not find any
match, "b@domain.com" being answered by "<b@domain.com>" is quite
expected from my point of view.

Thanks.
Jacob Keller Feb. 15, 2025, 12:24 a.m. UTC | #2
On Thu, Feb 13, 2025 at 3:45 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
> > I recently had reported to me a crash from a coworker using the recently
> > added sendemail mailmap support:
> >
> >   3724814 Segmentation fault      (core dumped) git check-mailmap "bugs@company.xx"
>
> Thanks for relaying the report.
>
> I can easily reproduce your segfault with our own mailmap, by
> picking at random an entry with both name and e-mail listed as
> the mapping source, e.g.
>
>     $ git check-mailmap ksaitoh560@gmail.com
>
> > With a mailmap file containing:
> >
> > A <a@domain.com> B <b@domain.com>
> >
> > I get the following unexpected result:
> >
> > $ git check-mailmap b@domain.com
> > <b@domain.com>
> >
> > Based on my interpretation of the mailmap documentation, I would have
> > expected this to translate to "A <a@domain.com>".
>
> After reading "git help mailmap" twice, my interpretation is
> different (disclaimer: I haven't read the implementation of the
> mailmap code lately, and the last time I read any part of it is
> probably at least a few years ago if not before).
>
> Unlike "please map anybody with this e-mail address to 'A <a>'"
> entry, which is spelled "A <a> <b>", the "fully spelled" form limits
> the damage to those that match both name and e-mail, in order to
> avoid "D <b>" from getting modified, while rewriting "B <b>" to "A
> <a>".  So I would not expect a request with no name to be mapped at
> all.
>
> And the command emits the e-mail intact when it does not find any
> match, "b@domain.com" being answered by "<b@domain.com>" is quite
> expected from my point of view.
>

Re-reading the manual, that is a fair interpretation. I can share that
with my coworker and he can adapt his mailmap to match this
expectation I think.

In that case, I think the simple fix is to just replace the NULL with
a "" to resolve the segmentation fault and add a suitable test case
for that?

Thanks,
Jake

> Thanks.
diff mbox series

Patch

diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
index df00b5ee13adb87881b8c1e92cac256e6ad319d1..be2cebe12152e38d3bb8cf12948823c8d710bdda 100644
--- a/builtin/check-mailmap.c
+++ b/builtin/check-mailmap.c
@@ -35,7 +35,7 @@  static void check_mailmap(struct string_list *mailmap, const char *contact)
 		mail = ident.mail_begin;
 		maillen = ident.mail_end - ident.mail_begin;
 	} else {
-		name = NULL;
+		name = "";
 		namelen = 0;
 		mail = contact;
 		maillen = strlen(contact);
diff --git a/mailmap.c b/mailmap.c
index f35d20ed7fd1ef251e65419805fec49a3c30bcbb..7237ceef8a32a66183e4bbbe4ea01c4ea0264cd4 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -297,7 +297,7 @@  int map_user(struct string_list *map,
 	item = lookup_prefix(map, *email, *emaillen);
 	if (item) {
 		me = (struct mailmap_entry *)item->util;
-		if (me->namemap.nr) {
+		if (me->namemap.nr > 1) {
 			/*
 			 * The item has multiple items, so we'll look up on
 			 * name too. If the name is not found, we choose the
@@ -307,6 +307,13 @@  int map_user(struct string_list *map,
 			subitem = lookup_prefix(&me->namemap, *name, *namelen);
 			if (subitem)
 				item = subitem;
+		} else if (me->name == NULL && me->email == NULL &&
+			   me->namemap.nr == 1) {
+			/*
+			 * The item has one subitem, but no simple entry. Use
+			 * the full entry.
+			 */
+			item = &me->namemap.items[0];
 		}
 	}
 	if (item) {
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 24214919312777b76e4d3b2b784bcb953583750a..2aab735d2b326caf0d904cb6eee3d602fad96997 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -113,6 +113,18 @@  test_expect_success 'check-mailmap --stdin simple address: no mapping' '
 	test_cmp expect actual
 '
 
+test_expect_success 'check-mailmap name and address: mapping' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-EOF &&
+	Bug Reports <bugs-new@company.xx> Bugs <bugs@company.xx>
+	EOF
+	cat >expect <<-EOF &&
+	Bug Reports <bugs-new@company.xx>
+	EOF
+	git check-mailmap "bugs@company.xx" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'No mailmap' '
 	cat >expect <<-EOF &&
 	$GIT_AUTHOR_NAME (1):