diff mbox series

ident: don't consider trailing dot crud

Message ID 20230731214435.1462098-1-sandals@crustytoothpaste.net (mailing list archive)
State Accepted
Commit 1c04cb0744d2acdcaebc77b0e78c47efbba67fd5
Headers show
Series ident: don't consider trailing dot crud | expand

Commit Message

brian m. carlson July 31, 2023, 9:44 p.m. UTC
When we process a user's name (as in user.name), we strip all trailing
crud from it.  Right now, we consider a dot trailing crud, and strip it
off.

However, this is unsuitable for many personal names because humans
frequently have abbreviated suffixes, such as "Jr." or "Sr." at the end
of their names, and this corrupts them.  Some other users may wish to
use an abbreviated name or initial, which will pose a problem especially
in cultures that write the family name first, followed by the personal
name.

Since the current approach causes lots of practical problems, let's
avoid it by no longer considering a dot to be crud.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 ident.c                       |  1 -
 t/t4203-mailmap.sh            |  4 ++--
 t/t7518-ident-corner-cases.sh | 11 ++++++++++-
 3 files changed, 12 insertions(+), 4 deletions(-)

Comments

Junio C Hamano July 31, 2023, 9:49 p.m. UTC | #1
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> When we process a user's name (as in user.name), we strip all trailing
> crud from it.  Right now, we consider a dot trailing crud, and strip it
> off.

We consider a leading or trailing dot crud, I think (applies also to
the title of the patch).  Otherwise the change, together with the
test updates, all look good.

I wonder if this needs some credit to those involved in the original
thread?

Thanks.

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  ident.c                       |  1 -
>  t/t4203-mailmap.sh            |  4 ++--
>  t/t7518-ident-corner-cases.sh | 11 ++++++++++-
>  3 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/ident.c b/ident.c
> index 08be4d0747..cc7afdbf81 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -203,7 +203,6 @@ void reset_ident_date(void)
>  static int crud(unsigned char c)
>  {
>  	return  c <= 32  ||
> -		c == '.' ||
>  		c == ',' ||
>  		c == ':' ||
>  		c == ';' ||
> diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
> index fa7f987284..2016132f51 100755
> --- a/t/t4203-mailmap.sh
> +++ b/t/t4203-mailmap.sh
> @@ -466,7 +466,7 @@ test_expect_success 'gitmailmap(5) example output: example #1' '
>  	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@laptop.(none)>
>  	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
>  
> -	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
> +	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
>  	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
>  	EOF
>  	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
> @@ -494,7 +494,7 @@ test_expect_success 'gitmailmap(5) example output: example #2' '
>  	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@example.com>
>  	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
>  
> -	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
> +	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
>  	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
>  	EOF
>  	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
> diff --git a/t/t7518-ident-corner-cases.sh b/t/t7518-ident-corner-cases.sh
> index fffdb6ff2e..9ab2ae2f3b 100755
> --- a/t/t7518-ident-corner-cases.sh
> +++ b/t/t7518-ident-corner-cases.sh
> @@ -20,10 +20,19 @@ test_expect_success 'empty name and missing email' '
>  '
>  
>  test_expect_success 'commit rejects all-crud name' '
> -	test_must_fail env GIT_AUTHOR_NAME=" .;<>" \
> +	test_must_fail env GIT_AUTHOR_NAME=" ,;<>" \
>  		git commit --allow-empty -m foo
>  '
>  
> +test_expect_success 'commit does not strip trailing dot' '
> +	author_name="Pat Doe Jr." &&
> +	env GIT_AUTHOR_NAME="$author_name" \
> +		git commit --allow-empty -m foo &&
> +	git log -1 --format=%an >actual &&
> +	echo "$author_name" >expected &&
> +	test_cmp actual expected
> +'
> +
>  # We must test the actual error message here, as an unwanted
>  # auto-detection could fail for other reasons.
>  test_expect_success 'empty configured name does not auto-detect' '
Thomas J. Faughnan Jr. July 31, 2023, 9:56 p.m. UTC | #2
I agree that simply dropping the "." from the crud list is the best
solution. Hope this gets merged.
Junio C Hamano July 31, 2023, 10:05 p.m. UTC | #3
"Thomas J. Faughnan Jr." <thomas@faughnan.net> writes:

> I agree that simply dropping the "." from the crud list is the best
> solution. Hope this gets merged.

Sure, I do too hope this gets into a shape that can be merged.
Junio C Hamano Aug. 2, 2023, 4:49 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> I wonder if this needs some credit to those involved in the original
> thread?

I've had this on 'seen' as-is, hoping to see a quick update so that
we can merge it down to 'next' before -rc0; here is a minimally
touched-up version I'll replace it with.

Thanks.

------- >8 ------------- >8 ------------- >8 ------------- >8 -------
From: "brian m. carlson" <sandals@crustytoothpaste.net>
Subject: [PATCH] ident: don't consider '.' a crud

When we process a user's name (as in user.name), we strip all
leading and trailing crud from it.  Right now, we consider a dot
a crud character, and strip it off.

However, this is unsuitable for many personal names because humans
frequently have abbreviated suffixes, such as "Jr." or "Sr." at the end
of their names, and this corrupts them.  Some other users may wish to
use an abbreviated name or initial, which will pose a problem especially
in cultures that write the family name first, followed by the personal
name.

Since the current approach causes lots of practical problems, let's
avoid it by no longer considering a dot to be crud.

Note that "." in the name forces the entire name to be quoted to
please mailers, but stripping "." only at the beginning and the end
does not help a name with "." in the middle (like "brian m. carlson")
so this change will not make it much worse.  A name like "Given
Family, Jr." that did not have to be quoted now would need to be, in
order to be placed on the e-mail headers, though.

This is based on a weather-balloon patch by Jeff King sent in Aug 2021
https://lore.kernel.org/git/YSKm8Q8nyTavQaox@coredump.intra.peff.net/

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

1:  4ce0751970 ! 1:  9478b6dadc ident: don't consider trailing dot crud
    @@ Metadata
     Author: brian m. carlson <sandals@crustytoothpaste.net>
     
      ## Commit message ##
    -    ident: don't consider trailing dot crud
    +    ident: don't consider '.' a crud
     
    -    When we process a user's name (as in user.name), we strip all trailing
    -    crud from it.  Right now, we consider a dot trailing crud, and strip it
    -    off.
    +    When we process a user's name (as in user.name), we strip all
    +    leading and trailing crud from it.  Right now, we consider a dot
    +    a crud character, and strip it off.
     
         However, this is unsuitable for many personal names because humans
         frequently have abbreviated suffixes, such as "Jr." or "Sr." at the end
    @@ Commit message
         Since the current approach causes lots of practical problems, let's
         avoid it by no longer considering a dot to be crud.
     
    +    Note that "." in the name forces the entire name to be quoted to
    +    please mailers, but stripping "." only at the beginning and the end
    +    does not help a name with "." in the middle (like "brian m. carlson")
    +    so this change will not make it much worse.  A name like "Given
    +    Family, Jr." that did not have to be quoted now would need to be, in
    +    order to be placed on the e-mail headers, though.
    +
    +    This is based on a weather-balloon patch by Jeff King sent in Aug 2021
    +    https://lore.kernel.org/git/YSKm8Q8nyTavQaox@coredump.intra.peff.net/
    +
         Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     



 ident.c                       |  1 -
 t/t4203-mailmap.sh            |  4 ++--
 t/t7518-ident-corner-cases.sh | 11 ++++++++++-
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/ident.c b/ident.c
index 8fad92d700..8d490df7d5 100644
--- a/ident.c
+++ b/ident.c
@@ -203,7 +203,6 @@ void reset_ident_date(void)
 static int crud(unsigned char c)
 {
 	return  c <= 32  ||
-		c == '.' ||
 		c == ',' ||
 		c == ':' ||
 		c == ';' ||
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index fa7f987284..2016132f51 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -466,7 +466,7 @@ test_expect_success 'gitmailmap(5) example output: example #1' '
 	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@laptop.(none)>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 
-	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
+	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 	EOF
 	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
@@ -494,7 +494,7 @@ test_expect_success 'gitmailmap(5) example output: example #2' '
 	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@example.com>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 
-	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
+	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 	EOF
 	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
diff --git a/t/t7518-ident-corner-cases.sh b/t/t7518-ident-corner-cases.sh
index fffdb6ff2e..9ab2ae2f3b 100755
--- a/t/t7518-ident-corner-cases.sh
+++ b/t/t7518-ident-corner-cases.sh
@@ -20,10 +20,19 @@ test_expect_success 'empty name and missing email' '
 '
 
 test_expect_success 'commit rejects all-crud name' '
-	test_must_fail env GIT_AUTHOR_NAME=" .;<>" \
+	test_must_fail env GIT_AUTHOR_NAME=" ,;<>" \
 		git commit --allow-empty -m foo
 '
 
+test_expect_success 'commit does not strip trailing dot' '
+	author_name="Pat Doe Jr." &&
+	env GIT_AUTHOR_NAME="$author_name" \
+		git commit --allow-empty -m foo &&
+	git log -1 --format=%an >actual &&
+	echo "$author_name" >expected &&
+	test_cmp actual expected
+'
+
 # We must test the actual error message here, as an unwanted
 # auto-detection could fail for other reasons.
 test_expect_success 'empty configured name does not auto-detect' '
brian m. carlson Aug. 2, 2023, 9:27 p.m. UTC | #5
On 2023-08-02 at 16:49:32, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I wonder if this needs some credit to those involved in the original
> > thread?
> 
> I've had this on 'seen' as-is, hoping to see a quick update so that
> we can merge it down to 'next' before -rc0; here is a minimally
> touched-up version I'll replace it with.

That's fine.  I was planning to do a re-roll today (since my Tuesdays
tend to be occupied with French classes), but I'm happy with what you
have.  Let me know if you're expecting a re-roll nevertheless, and I'll
try to get one out to you.
Junio C Hamano Aug. 2, 2023, 9:38 p.m. UTC | #6
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2023-08-02 at 16:49:32, Junio C Hamano wrote:
>> Junio C Hamano <gitster@pobox.com> writes:
>> 
>> > I wonder if this needs some credit to those involved in the original
>> > thread?
>> 
>> I've had this on 'seen' as-is, hoping to see a quick update so that
>> we can merge it down to 'next' before -rc0; here is a minimally
>> touched-up version I'll replace it with.
>
> That's fine.  I was planning to do a re-roll today (since my Tuesdays
> tend to be occupied with French classes), but I'm happy with what you
> have.  Let me know if you're expecting a re-roll nevertheless, and I'll
> try to get one out to you.

Nah, your message I am responding to is just as good as a reroll.

After all we are not propagating cryptographic signatures in your
e-mail to resulting commits, so it would not make any difference ;-)

Thanks.
diff mbox series

Patch

diff --git a/ident.c b/ident.c
index 08be4d0747..cc7afdbf81 100644
--- a/ident.c
+++ b/ident.c
@@ -203,7 +203,6 @@  void reset_ident_date(void)
 static int crud(unsigned char c)
 {
 	return  c <= 32  ||
-		c == '.' ||
 		c == ',' ||
 		c == ':' ||
 		c == ';' ||
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index fa7f987284..2016132f51 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -466,7 +466,7 @@  test_expect_success 'gitmailmap(5) example output: example #1' '
 	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@laptop.(none)>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 
-	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
+	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@desktop.(none)>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 	EOF
 	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
@@ -494,7 +494,7 @@  test_expect_success 'gitmailmap(5) example output: example #2' '
 	Author Jane Doe <jane@laptop.(none)> maps to Jane Doe <jane@example.com>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 
-	Author Jane D <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
+	Author Jane D. <jane@desktop.(none)> maps to Jane Doe <jane@example.com>
 	Committer C O Mitter <committer@example.com> maps to C O Mitter <committer@example.com>
 	EOF
 	git -C doc log --reverse --pretty=format:"Author %an <%ae> maps to %aN <%aE>%nCommitter %cn <%ce> maps to %cN <%cE>%n" >actual &&
diff --git a/t/t7518-ident-corner-cases.sh b/t/t7518-ident-corner-cases.sh
index fffdb6ff2e..9ab2ae2f3b 100755
--- a/t/t7518-ident-corner-cases.sh
+++ b/t/t7518-ident-corner-cases.sh
@@ -20,10 +20,19 @@  test_expect_success 'empty name and missing email' '
 '
 
 test_expect_success 'commit rejects all-crud name' '
-	test_must_fail env GIT_AUTHOR_NAME=" .;<>" \
+	test_must_fail env GIT_AUTHOR_NAME=" ,;<>" \
 		git commit --allow-empty -m foo
 '
 
+test_expect_success 'commit does not strip trailing dot' '
+	author_name="Pat Doe Jr." &&
+	env GIT_AUTHOR_NAME="$author_name" \
+		git commit --allow-empty -m foo &&
+	git log -1 --format=%an >actual &&
+	echo "$author_name" >expected &&
+	test_cmp actual expected
+'
+
 # We must test the actual error message here, as an unwanted
 # auto-detection could fail for other reasons.
 test_expect_success 'empty configured name does not auto-detect' '