Message ID | 20191022232847.5212-1-prarit@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pretty: Add "%aU"|"%au" option to output author's username | expand |
Prarit Bhargava <prarit@redhat.com> writes: > Subject: Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username Downcase "Add" (see "git shortlog --no-merges -100 master" and mimick the project convention). > Add a "%aU"|"%au" option that outputs the author's email username. Even though I personally do not see the use for it, I agree it would make sense to have an option to show the local part only where the e-mail address is shown. I do not know if u/U is a good mnemonic; it hints too strongly that it may come from GIT_{AUTHOR/COMMITTER}_NAME but that is not what you are doing---isn't there a letter that better conveys that this is about RFC 2822 local-part (cf. page 16 ieft.org/rfc/rfc2822.txt)? > diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt > index b87e2e83e6d0..479a15a8ab12 100644 > --- a/Documentation/pretty-formats.txt > +++ b/Documentation/pretty-formats.txt > @@ -163,6 +163,9 @@ The placeholders are: > '%ae':: author email > '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1] > or linkgit:git-blame[1]) > +'%au':: author username > +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1] > + or linkgit:git-blame[1]) > '%ad':: author date (format respects --date= option) > '%aD':: author date, RFC2822 style > '%ar':: author date, relative > diff --git a/pretty.c b/pretty.c > index b32f0369531c..2a5b93022050 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, > strbuf_add(sb, mail, maillen); > return placeholder_len; > } > + if (part == 'u' || part == 'U') { /* username */ > + maillen = strstr(s.mail_begin, "@") - s.mail_begin; > + strbuf_add(sb, mail, maillen); > + return placeholder_len; > + } I think users get %eu and %eU for free with this change, which should be documented.
On 2019-10-22 at 23:28:47, Prarit Bhargava wrote: > In many projects the number of contributors is low enough that users know > each other and the full email address doesn't need to be displayed. > Displaying only the author's username saves a lot of columns on the screen. > For example displaying "prarit" instead of "prarit@redhat.com" saves 11 > columns. > > Add a "%aU"|"%au" option that outputs the author's email username. I have no position on whether or not this is a useful change. I don't think I'll end up using it, but I can imagine cases where it is useful, such as certain corporate environments. It would be interesting to see what others think. > diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt > index b87e2e83e6d0..479a15a8ab12 100644 > --- a/Documentation/pretty-formats.txt > +++ b/Documentation/pretty-formats.txt > @@ -163,6 +163,9 @@ The placeholders are: > '%ae':: author email > '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1] > or linkgit:git-blame[1]) > +'%au':: author username > +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1] > + or linkgit:git-blame[1]) I think we need to actually document what "username" means here. I expect you'll have people think that this magically means their $HOSTING_PLATFORM username, which it is not. > diff --git a/pretty.c b/pretty.c > index b32f0369531c..2a5b93022050 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, > strbuf_add(sb, mail, maillen); > return placeholder_len; > } > + if (part == 'u' || part == 'U') { /* username */ > + maillen = strstr(s.mail_begin, "@") - s.mail_begin; > + strbuf_add(sb, mail, maillen); > + return placeholder_len; > + } This branch doesn't appear to do anything different for the mailmap and non-mailmap cases. Perhaps adding an additional test that demonstrates the difference would be a good idea.
On Tue, Oct 22, 2019 at 07:28:47PM -0400, Prarit Bhargava wrote: > In many projects the number of contributors is low enough that users know > each other and the full email address doesn't need to be displayed. > Displaying only the author's username saves a lot of columns on the screen. > For example displaying "prarit" instead of "prarit@redhat.com" saves 11 > columns. > > Add a "%aU"|"%au" option that outputs the author's email username. Like others, this seems potentially useful even if I probably wouldn't use it myself. Another more complicated way to think of it would be to give a list of domains to omit (so if 90% of the committers are @redhat.com, we can skip that, but the one-off contributor from another domain gets their fully qualified name. But that's a lot more complicated. I don't mind doing the easy thing now, and even if we later grew the more complicated thing, I wouldn't be sad to still have this easy one as an option. > --- a/pretty.c > +++ b/pretty.c > @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, > strbuf_add(sb, mail, maillen); > return placeholder_len; > } > + if (part == 'u' || part == 'U') { /* username */ > + maillen = strstr(s.mail_begin, "@") - s.mail_begin; > + strbuf_add(sb, mail, maillen); > + return placeholder_len; > + } What happens if the email doesn't have an "@"? I think you'd either end up printing a bunch of extra cruft (because you're not limiting the search for "@" to the boundaries from split_ident_line) or you'd crash (if there's no "@" at all, and you'd get a huge maillen). There's also no need to use the slower strstr() when looking for a single character. So perhaps: const char *at = memchr(mail, '@', maillen); if (at) maillen = at - mail; strbuf_add(sb, mail, maillen); > +test_expect_success 'log pretty %an %ae %au' ' As others noted, this could cover %aU, too (which is broken; you need to handle 'U' alongside 'E' and 'N' earlier in format_person_part()). > + git checkout -b anaeau && > + test_commit anaeau_test anaeau_test_file && > + git log --pretty="%an" > actual && > + git log --pretty="%ae" >> actual && > + git log --pretty="%au" >> actual && Maybe: git log --pretty="%an %ae %au" or git log --pretty="%an%n%ae%n%au" which is shorter and runs more efficiently? > + git log > full && > + name=$(cat full | grep "^Author: " | awk -F "Author: " " { print \$2 } " | awk -F " <" " { print \$1 } ") && > + email=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } ") && > + username=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } " | awk -F "@" " { print \$1 } " ) && > + echo "${name}" > expect && > + echo "${email}" >> expect && > + echo "${username}" >> expect && These values come from our hard-coded test setup, so it would be more readable to just expect those: { echo "$GIT_AUTHOR_NAME" && echo "$GIT_AUTHOR_EMAIL" && echo "$GIT_AUTHOR_EMAIL" | sed "s/@.*//" } >expect For the last one, also I wouldn't be upset to see test-lib.sh do something like: TEST_AUTHOR_USERNAME=author TEST_AUTHOR_DOMAIN=example.com GIT_AUTHOR_NAME=$TEST_AUTHOR_USERNAME@$TEST_AUTHOR_DOMAIN to let tests like this pick out the individual values if they want. -Peff
On 10/22/19 7:46 PM, Junio C Hamano wrote: > Prarit Bhargava <prarit@redhat.com> writes: > >> Subject: Re: [PATCH] pretty: Add "%aU"|"%au" option to output author's username > > Downcase "Add" (see "git shortlog --no-merges -100 master" and > mimick the project convention). I'll fix that. > >> Add a "%aU"|"%au" option that outputs the author's email username. > > Even though I personally do not see the use for it, I agree it would > make sense to have an option to show the local part only where the > e-mail address is shown. > > I do not know if u/U is a good mnemonic; it hints too strongly that > it may come from GIT_{AUTHOR/COMMITTER}_NAME but that is not what > you are doing---isn't there a letter that better conveys that this > is about RFC 2822 local-part (cf. page 16 ieft.org/rfc/rfc2822.txt)? I'll go with "l" and "L" for local-part as defined on page 16. I will also include a comment in braces that says (the portion of the email address preceding the '@' symbol)". Admittedly that doesn't convey the meaning of the mailbox concept of the email address it does tell a user what is going to be output. > >> diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt >> index b87e2e83e6d0..479a15a8ab12 100644 >> --- a/Documentation/pretty-formats.txt >> +++ b/Documentation/pretty-formats.txt >> @@ -163,6 +163,9 @@ The placeholders are: >> '%ae':: author email >> '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1] >> or linkgit:git-blame[1]) >> +'%au':: author username >> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1] >> + or linkgit:git-blame[1]) >> '%ad':: author date (format respects --date= option) >> '%aD':: author date, RFC2822 style >> '%ar':: author date, relative > >> diff --git a/pretty.c b/pretty.c >> index b32f0369531c..2a5b93022050 100644 >> --- a/pretty.c >> +++ b/pretty.c >> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, >> strbuf_add(sb, mail, maillen); >> return placeholder_len; >> } >> + if (part == 'u' || part == 'U') { /* username */ >> + maillen = strstr(s.mail_begin, "@") - s.mail_begin; >> + strbuf_add(sb, mail, maillen); >> + return placeholder_len; >> + } > > I think users get %eu and %eU for free with this change, which should > be documented. > Did you mean %cu and %cU? Or am I missing something with "%e"? P.
On 10/22/19 7:48 PM, brian m. carlson wrote: > On 2019-10-22 at 23:28:47, Prarit Bhargava wrote: >> In many projects the number of contributors is low enough that users know >> each other and the full email address doesn't need to be displayed. >> Displaying only the author's username saves a lot of columns on the screen. >> For example displaying "prarit" instead of "prarit@redhat.com" saves 11 >> columns. >> >> Add a "%aU"|"%au" option that outputs the author's email username. > > I have no position on whether or not this is a useful change. I don't > think I'll end up using it, but I can imagine cases where it is useful, > such as certain corporate environments. It would be interesting to see > what others think. > >> diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt >> index b87e2e83e6d0..479a15a8ab12 100644 >> --- a/Documentation/pretty-formats.txt >> +++ b/Documentation/pretty-formats.txt >> @@ -163,6 +163,9 @@ The placeholders are: >> '%ae':: author email >> '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1] >> or linkgit:git-blame[1]) >> +'%au':: author username >> +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1] >> + or linkgit:git-blame[1]) > > I think we need to actually document what "username" means here. I > expect you'll have people think that this magically means their > $HOSTING_PLATFORM username, which it is not. > Based on Junio's input, I'm going to change the option to "al" and "aL" where L means "local-part" as defined by the rfc2822.txt specification, and include a comment that says "(the portion of the email address preceding the '@' symbol)". Admittedly that doesn't convey the meaning of the mailbox concept of the email address it does tell a user what is going to be output. >> diff --git a/pretty.c b/pretty.c >> index b32f0369531c..2a5b93022050 100644 >> --- a/pretty.c >> +++ b/pretty.c >> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, >> strbuf_add(sb, mail, maillen); >> return placeholder_len; >> } >> + if (part == 'u' || part == 'U') { /* username */ >> + maillen = strstr(s.mail_begin, "@") - s.mail_begin; >> + strbuf_add(sb, mail, maillen); >> + return placeholder_len; >> + } > > This branch doesn't appear to do anything different for the mailmap and > non-mailmap cases. Perhaps adding an additional test that demonstrates > the difference would be a good idea. > Thanks for the suggestion. I'll look into it for v2. P.
On 10/23/19 1:02 AM, Jeff King wrote: > On Tue, Oct 22, 2019 at 07:28:47PM -0400, Prarit Bhargava wrote: > >> In many projects the number of contributors is low enough that users know >> each other and the full email address doesn't need to be displayed. >> Displaying only the author's username saves a lot of columns on the screen. >> For example displaying "prarit" instead of "prarit@redhat.com" saves 11 >> columns. >> >> Add a "%aU"|"%au" option that outputs the author's email username. > > Like others, this seems potentially useful even if I probably wouldn't > use it myself. Another more complicated way to think of it would be to > give a list of domains to omit (so if 90% of the committers are > @redhat.com, we can skip that, but the one-off contributor from another > domain gets their fully qualified name.> > But that's a lot more complicated. I don't mind doing the easy thing > now, and even if we later grew the more complicated thing, I wouldn't be > sad to still have this easy one as an option. FWIW, I went through the exact same thought process as you did and came to the same conclusion. > >> --- a/pretty.c >> +++ b/pretty.c >> @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, >> strbuf_add(sb, mail, maillen); >> return placeholder_len; >> } >> + if (part == 'u' || part == 'U') { /* username */ >> + maillen = strstr(s.mail_begin, "@") - s.mail_begin; >> + strbuf_add(sb, mail, maillen); >> + return placeholder_len; >> + } > > What happens if the email doesn't have an "@"? I think you'd either end > up printing a bunch of extra cruft (because you're not limiting the > search for "@" to the boundaries from split_ident_line) or you'd > crash (if there's no "@" at all, and you'd get a huge maillen). > > There's also no need to use the slower strstr() when looking for a > single character. So perhaps: > > const char *at = memchr(mail, '@', maillen); > if (at) > maillen = at - mail; > strbuf_add(sb, mail, maillen); TBH I had assumed that the email address was RFC2822 compliant. Thanks for the code. I've incorporated it into v2. > >> +test_expect_success 'log pretty %an %ae %au' ' > > As others noted, this could cover %aU, too (which is broken; you need to > handle 'U' alongside 'E' and 'N' earlier in format_person_part()). Whups. Thanks for the pointer. Also fixed in v2. > >> + git checkout -b anaeau && >> + test_commit anaeau_test anaeau_test_file && >> + git log --pretty="%an" > actual && >> + git log --pretty="%ae" >> actual && >> + git log --pretty="%au" >> actual && > > Maybe: > > git log --pretty="%an %ae %au" > > or > > git log --pretty="%an%n%ae%n%au" > > which is shorter and runs more efficiently? > >> + git log > full && >> + name=$(cat full | grep "^Author: " | awk -F "Author: " " { print \$2 } " | awk -F " <" " { print \$1 } ") && >> + email=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } ") && >> + username=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } " | awk -F "@" " { print \$1 } " ) && >> + echo "${name}" > expect && >> + echo "${email}" >> expect && >> + echo "${username}" >> expect && > > These values come from our hard-coded test setup, so it would be more > readable to just expect those: > > { > echo "$GIT_AUTHOR_NAME" && > echo "$GIT_AUTHOR_EMAIL" && > echo "$GIT_AUTHOR_EMAIL" | sed "s/@.*//" > } >expect Added to v2 (along with Brian's suggestion to test %aE, %aN, and %aL). > > For the last one, also I wouldn't be upset to see test-lib.sh do > something like: > > TEST_AUTHOR_USERNAME=author > TEST_AUTHOR_DOMAIN=example.com > GIT_AUTHOR_NAME=$TEST_AUTHOR_USERNAME@$TEST_AUTHOR_DOMAIN I like this suggestion a lot. I'm incorporating into v2 as well as similar changes for the COMMITTER fields. P. > > to let tests like this pick out the individual values if they want. > > -Peff >
"brian m. carlson" <sandals@crustytoothpaste.net> writes: >> + if (part == 'u' || part == 'U') { /* username */ >> + maillen = strstr(s.mail_begin, "@") - s.mail_begin; >> + strbuf_add(sb, mail, maillen); >> + return placeholder_len; >> + } > > This branch doesn't appear to do anything different for the mailmap and > non-mailmap cases. Perhaps adding an additional test that demonstrates > the difference would be a good idea. Yes, and the bug that would be exposed is the lack of call to mailmap. Perhaps along this line (I may have off-by-one or two tho)? pretty.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/pretty.c b/pretty.c index e4ed14effe..4b76d022c6 100644 --- a/pretty.c +++ b/pretty.c @@ -696,15 +696,27 @@ static size_t format_person_part(struct strbuf *sb, char part, mail = s.mail_begin; maillen = s.mail_end - s.mail_begin; - if (part == 'N' || part == 'E') /* mailmap lookup */ + if (part == 'N' || part == 'E' || part == 'L') /* mailmap lookup */ mailmap_name(&mail, &maillen, &name, &namelen); - if (part == 'n' || part == 'N') { /* name */ + + switch (part) { + case 'n': case 'N': strbuf_add(sb, name, namelen); return placeholder_len; - } - if (part == 'e' || part == 'E') { /* email */ + case 'l': case 'L': + { + const char *at = memchr(mail, '@', maillen); + if (at) { + maillen -= at - mail + 1; + mail = at + 1; + } + } + /* fall through */ + case 'e': case 'E': strbuf_add(sb, mail, maillen); return placeholder_len; + default: + break; } if (!s.date_begin)
On 10/23/19 10:29 PM, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >>> + if (part == 'u' || part == 'U') { /* username */ >>> + maillen = strstr(s.mail_begin, "@") - s.mail_begin; >>> + strbuf_add(sb, mail, maillen); >>> + return placeholder_len; >>> + } >> >> This branch doesn't appear to do anything different for the mailmap and >> non-mailmap cases. Perhaps adding an additional test that demonstrates >> the difference would be a good idea. > > Yes, and the bug that would be exposed is the lack of call to > mailmap. > > Perhaps along this line (I may have off-by-one or two tho)? > > pretty.c | 20 ++++++++++++++++---- > 1 file changed, 16 insertions(+), 4 deletions(-) > > diff --git a/pretty.c b/pretty.c > index e4ed14effe..4b76d022c6 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -696,15 +696,27 @@ static size_t format_person_part(struct strbuf *sb, char part, > mail = s.mail_begin; > maillen = s.mail_end - s.mail_begin; > > - if (part == 'N' || part == 'E') /* mailmap lookup */ > + if (part == 'N' || part == 'E' || part == 'L') /* mailmap lookup */ > mailmap_name(&mail, &maillen, &name, &namelen); > - if (part == 'n' || part == 'N') { /* name */ > + > + switch (part) { > + case 'n': case 'N': > strbuf_add(sb, name, namelen); > return placeholder_len; > - } > - if (part == 'e' || part == 'E') { /* email */ > + case 'l': case 'L': > + { > + const char *at = memchr(mail, '@', maillen); > + if (at) { > + maillen -= at - mail + 1; > + mail = at + 1; ^^^ If the mail is 'prarit@redhat.com' the output of these lines is maillen = maillen - at - mail - 1 which is (16 - [6] - 1) = 9. and mail = at + 1 points to "redhat.com" This is the reverse of what we want. IMO I suggest (sorry for the cut-and-paste) which keeps the code easy to read. diff --git a/pretty.c b/pretty.c index b32f0369531c..93eb6e837071 100644 --- a/pretty.c +++ b/pretty.c @@ -696,7 +696,7 @@ static size_t format_person_part(struct strbuf *sb, char par t, mail = s.mail_begin; maillen = s.mail_end - s.mail_begin; - if (part == 'N' || part == 'E') /* mailmap lookup */ + if (part == 'N' || part == 'E' || part == 'L') /* mailmap lookup */ mailmap_name(&mail, &maillen, &name, &namelen); if (part == 'n' || part == 'N') { /* name */ strbuf_add(sb, name, namelen); @@ -706,6 +706,13 @@ static size_t format_person_part(struct strbuf *sb, char pa rt, strbuf_add(sb, mail, maillen); return placeholder_len; } + if (part == 'l' || part == 'L') { /* local-part */ + const char *at = memchr(mail, '@', maillen); + if (at) + maillen = at - mail; + strbuf_add(sb, mail, maillen); + return placeholder_len; + } if (!s.date_begin) Let me submit a v2 with all the suggested changes so that you can review my work so far. We can continue the discussion there. P. > + } > + } > + /* fall through */ > + case 'e': case 'E': > strbuf_add(sb, mail, maillen); > return placeholder_len; > + default: > + break; > } > > if (!s.date_begin) >
diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt index b87e2e83e6d0..479a15a8ab12 100644 --- a/Documentation/pretty-formats.txt +++ b/Documentation/pretty-formats.txt @@ -163,6 +163,9 @@ The placeholders are: '%ae':: author email '%aE':: author email (respecting .mailmap, see linkgit:git-shortlog[1] or linkgit:git-blame[1]) +'%au':: author username +'%aU':: author username (respecting .mailmap, see linkgit:git-shortlog[1] + or linkgit:git-blame[1]) '%ad':: author date (format respects --date= option) '%aD':: author date, RFC2822 style '%ar':: author date, relative diff --git a/pretty.c b/pretty.c index b32f0369531c..2a5b93022050 100644 --- a/pretty.c +++ b/pretty.c @@ -706,6 +706,11 @@ static size_t format_person_part(struct strbuf *sb, char part, strbuf_add(sb, mail, maillen); return placeholder_len; } + if (part == 'u' || part == 'U') { /* username */ + maillen = strstr(s.mail_begin, "@") - s.mail_begin; + strbuf_add(sb, mail, maillen); + return placeholder_len; + } if (!s.date_begin) goto skip; diff --git a/t/t4202-log.sh b/t/t4202-log.sh index e803ba402e9e..2fee0c067197 100755 --- a/t/t4202-log.sh +++ b/t/t4202-log.sh @@ -1729,4 +1729,20 @@ test_expect_success 'log --end-of-options' ' test_cmp expect actual ' +test_expect_success 'log pretty %an %ae %au' ' + git checkout -b anaeau && + test_commit anaeau_test anaeau_test_file && + git log --pretty="%an" > actual && + git log --pretty="%ae" >> actual && + git log --pretty="%au" >> actual && + git log > full && + name=$(cat full | grep "^Author: " | awk -F "Author: " " { print \$2 } " | awk -F " <" " { print \$1 } ") && + email=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } ") && + username=$(cat full | grep "^Author: " | awk -F "<" " { print \$2 } " | awk -F ">" " { print \$1 } " | awk -F "@" " { print \$1 } " ) && + echo "${name}" > expect && + echo "${email}" >> expect && + echo "${username}" >> expect && + test_cmp expect actual +' + test_done
In many projects the number of contributors is low enough that users know each other and the full email address doesn't need to be displayed. Displaying only the author's username saves a lot of columns on the screen. For example displaying "prarit" instead of "prarit@redhat.com" saves 11 columns. Add a "%aU"|"%au" option that outputs the author's email username. Also add tests for "%ae" and "%an". Signed-off-by: Prarit Bhargava <prarit@redhat.com> --- Documentation/pretty-formats.txt | 3 +++ pretty.c | 5 +++++ t/t4202-log.sh | 16 ++++++++++++++++ 3 files changed, 24 insertions(+)