mbox series

[v6,0/4] Add support for mailmap in cat-file

Message ID 20220718195102.66321-1-siddharthasthana31@gmail.com (mailing list archive)
Headers show
Series Add support for mailmap in cat-file | expand

Message

Siddharth Asthana July 18, 2022, 7:50 p.m. UTC
Thanks a lot Junio for your suggestion :) I have made the suggested
changes.

= Description

This patch series adds mailmap support to the git-cat-file command. It
adds the mailmap support only for the commit and tag objects by
replacing the idents for "author", "committer" and "tagger" headers. The
mailmap only takes effect when --[no-]-use-mailmap or --[no-]-mailmap
option is passed to the git cat-file command. The changes will work with
the batch mode as well.

So, if one wants to enable mailmap they can use either of the following
commands:
$ git cat-file --use-mailmap -p <object>
$ git cat-file --use-mailmap <type> <object>

To use it in the batch mode, one can use the following command:
$ git cat-file --use-mailmap --batch

= Patch Organization

- The first patch improves the commit_rewrite_person() by restricting it 
  to traverse only through the header part of the commit object buffer.
  It also adds an argument called headers which the callers can pass. 
  The function will replace idents only on these  passed headers. 
  Thus, the caller won't have to make repeated calls to the function.
- The second patch moves commit_rewrite_person() to ident.c to expose it
  as a public function so that it can be used to replace idents in the
  headers of desired objects.
- The third patch renames commit_rewrite_person() to a name which
  describes its functionality clearly. It is renamed to
  apply_mailmap_to_header().
- The last patch adds mailmap support to the git cat-file command. It
  adds the required documentation and tests as well.

Changes in v6:
- The function rewrite_ident_line() returns the difference between the
  new and the old length of the ident line. We were not using this
  information and instead parsing the buffer again to look for the line
  ending. This patch set starts using that information to update the
  buf_offset value in commit_rewrite_person().
- This patch set also tweaks the commit_rewrite_person() so that it is
  easier to understand and avoids unnecessary parsing of the buffer
  wherever possible.

Siddharth Asthana (4):
  revision: improve commit_rewrite_person()
  ident: move commit_rewrite_person() to ident.c
  ident: rename commit_rewrite_person() to apply_mailmap_to_header()
  cat-file: add mailmap support

 Documentation/git-cat-file.txt |  6 +++
 builtin/cat-file.c             | 43 +++++++++++++++++++-
 cache.h                        |  6 +++
 ident.c                        | 74 ++++++++++++++++++++++++++++++++++
 revision.c                     | 50 ++---------------------
 t/t4203-mailmap.sh             | 59 +++++++++++++++++++++++++++
 6 files changed, 190 insertions(+), 48 deletions(-)

Range-diff against v5:
1:  8c29ad9351 ! 1:  7c086e4c8a revision: improve commit_rewrite_person()
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     +/*
     + * Returns the difference between the new and old length of the ident line.
     + */
    -+static ssize_t rewrite_ident_line(const char *person, struct strbuf *buf,
    -+								  struct string_list *mailmap)
    ++static ssize_t rewrite_ident_line(const char *person, size_t len,
    ++				   struct strbuf *buf,
    ++				   struct string_list *mailmap)
      {
     -	char *person, *endp;
    -+	char *endp;
    - 	size_t len, namelen, maillen;
    +-	size_t len, namelen, maillen;
    ++	size_t namelen, maillen;
      	const char *name;
      	const char *mail;
      	struct ident_split ident;
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     -		return 0;
     -
     -	person += strlen(what);
    - 	endp = strchr(person, '\n');
    - 	if (!endp)
    +-	endp = strchr(person, '\n');
    +-	if (!endp)
    +-		return 0;
    +-
    +-	len = endp - person;
    +-
    + 	if (split_ident_line(&ident, person, len))
      		return 0;
    + 
     @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *what, struct st
      
      	if (map_user(mailmap, &mail, &maillen, &name, &namelen)) {
    @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *wha
      		strbuf_addf(&namemail, "%.*s <%.*s>",
      			    (int)namelen, name, (int)maillen, mail);
     @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *what, struct st
    + 		strbuf_splice(buf, ident.name_begin - buf->buf,
      			      ident.mail_end - ident.name_begin + 1,
      			      namemail.buf, namemail.len);
    - 
     +		newlen = namemail.len;
    -+
    + 
      		strbuf_release(&namemail);
      
     -		return 1;
    -+		return newlen - (ident.mail_end - ident.name_begin + 1);
    ++		return newlen - (ident.mail_end - ident.name_begin);
      	}
      
      	return 0;
      }
      
     +static void commit_rewrite_person(struct strbuf *buf, const char **header,
    -+								  struct string_list *mailmap)
    ++				   struct string_list *mailmap)
     +{
     +	size_t buf_offset = 0;
     +
    @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *wha
     +	for (;;) {
     +		const char *person, *line;
     +		size_t i;
    ++		int found_header = 0;
     +
     +		line = buf->buf + buf_offset;
     +		if (!*line || *line == '\n')
    -+			return; /* End of header */
    ++			return; /* End of headers */
     +
     +		for (i = 0; header[i]; i++)
     +			if (skip_prefix(line, header[i], &person)) {
    -+				rewrite_ident_line(person, buf, mailmap);
    ++				const char *endp = strchrnul(person, '\n');
    ++				found_header = 1;
    ++				buf_offset += endp - line;
    ++				buf_offset += rewrite_ident_line(person, endp - person, buf, mailmap);
     +				break;
     +			}
     +
    -+		buf_offset = strchrnul(buf->buf + buf_offset, '\n') - buf->buf;
    -+		if (buf->buf[buf_offset] == '\n')
    -+			++buf_offset;
    ++		if (!found_header) {
    ++			buf_offset = strchrnul(line, '\n') - buf->buf;
    ++			if (buf->buf[buf_offset] == '\n')
    ++				buf_offset++;
    ++		}
     +	}
     +}
     +
2:  ccb7f72fcb ! 2:  2f8fba7f57 ident: move commit_rewrite_person() to ident.c
    @@ ident.c: int split_ident_line(struct ident_split *split, const char *line, int l
     +/*
     + * Returns the difference between the new and old length of the ident line.
     + */
    -+static ssize_t rewrite_ident_line(const char *person, struct strbuf *buf,
    -+								  struct string_list *mailmap)
    ++static ssize_t rewrite_ident_line(const char *person, size_t len,
    ++				   struct strbuf *buf,
    ++				   struct string_list *mailmap)
     +{
    -+	char *endp;
    -+	size_t len, namelen, maillen;
    ++	size_t namelen, maillen;
     +	const char *name;
     +	const char *mail;
     +	struct ident_split ident;
     +
    -+	endp = strchr(person, '\n');
    -+	if (!endp)
    -+		return 0;
    -+
    -+	len = endp - person;
    -+
     +	if (split_ident_line(&ident, person, len))
     +		return 0;
     +
    @@ ident.c: int split_ident_line(struct ident_split *split, const char *line, int l
     +		strbuf_splice(buf, ident.name_begin - buf->buf,
     +			      ident.mail_end - ident.name_begin + 1,
     +			      namemail.buf, namemail.len);
    -+
     +		newlen = namemail.len;
     +
     +		strbuf_release(&namemail);
     +
    -+		return newlen - (ident.mail_end - ident.name_begin + 1);
    ++		return newlen - (ident.mail_end - ident.name_begin);
     +	}
     +
     +	return 0;
     +}
     +
     +void commit_rewrite_person(struct strbuf *buf, const char **header,
    -+						   struct string_list *mailmap)
    ++			    struct string_list *mailmap)
     +{
     +	size_t buf_offset = 0;
     +
    @@ ident.c: int split_ident_line(struct ident_split *split, const char *line, int l
     +	for (;;) {
     +		const char *person, *line;
     +		size_t i;
    ++		int found_header = 0;
     +
     +		line = buf->buf + buf_offset;
     +		if (!*line || *line == '\n')
    -+			return; /* End of header */
    ++			return; /* End of headers */
     +
     +		for (i = 0; header[i]; i++)
     +			if (skip_prefix(line, header[i], &person)) {
    -+				rewrite_ident_line(person, buf, mailmap);
    ++				const char *endp = strchrnul(person, '\n');
    ++				found_header = 1;
    ++				buf_offset += endp - line;
    ++				buf_offset += rewrite_ident_line(person, endp - person, buf, mailmap);
     +				break;
     +			}
     +
    -+		buf_offset = strchrnul(buf->buf + buf_offset, '\n') - buf->buf;
    -+		if (buf->buf[buf_offset] == '\n')
    -+			++buf_offset;
    ++		if (!found_header) {
    ++			buf_offset = strchrnul(line, '\n') - buf->buf;
    ++			if (buf->buf[buf_offset] == '\n')
    ++				buf_offset++;
    ++		}
     +	}
     +}
      
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     -/*
     - * Returns the difference between the new and old length of the ident line.
     - */
    --static ssize_t rewrite_ident_line(const char *person, struct strbuf *buf,
    --								  struct string_list *mailmap)
    +-static ssize_t rewrite_ident_line(const char *person, size_t len,
    +-				   struct strbuf *buf,
    +-				   struct string_list *mailmap)
     -{
    --	char *endp;
    --	size_t len, namelen, maillen;
    +-	size_t namelen, maillen;
     -	const char *name;
     -	const char *mail;
     -	struct ident_split ident;
     -
    --	endp = strchr(person, '\n');
    --	if (!endp)
    --		return 0;
    --
    --	len = endp - person;
    --
     -	if (split_ident_line(&ident, person, len))
     -		return 0;
     -
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     -		strbuf_splice(buf, ident.name_begin - buf->buf,
     -			      ident.mail_end - ident.name_begin + 1,
     -			      namemail.buf, namemail.len);
    --
     -		newlen = namemail.len;
     -
     -		strbuf_release(&namemail);
     -
    --		return newlen - (ident.mail_end - ident.name_begin + 1);
    +-		return newlen - (ident.mail_end - ident.name_begin);
     -	}
     -
     -	return 0;
     -}
     -
     -static void commit_rewrite_person(struct strbuf *buf, const char **header,
    --								  struct string_list *mailmap)
    +-				   struct string_list *mailmap)
     -{
     -	size_t buf_offset = 0;
     -
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     -	for (;;) {
     -		const char *person, *line;
     -		size_t i;
    +-		int found_header = 0;
     -
     -		line = buf->buf + buf_offset;
     -		if (!*line || *line == '\n')
    --			return; /* End of header */
    +-			return; /* End of headers */
     -
     -		for (i = 0; header[i]; i++)
     -			if (skip_prefix(line, header[i], &person)) {
    --				rewrite_ident_line(person, buf, mailmap);
    +-				const char *endp = strchrnul(person, '\n');
    +-				found_header = 1;
    +-				buf_offset += endp - line;
    +-				buf_offset += rewrite_ident_line(person, endp - person, buf, mailmap);
     -				break;
     -			}
     -
    --		buf_offset = strchrnul(buf->buf + buf_offset, '\n') - buf->buf;
    --		if (buf->buf[buf_offset] == '\n')
    --			++buf_offset;
    +-		if (!found_header) {
    +-			buf_offset = strchrnul(line, '\n') - buf->buf;
    +-			if (buf->buf[buf_offset] == '\n')
    +-				buf_offset++;
    +-		}
     -	}
     -}
     -
3:  38c18fd10d ! 3:  b4f2477b14 ident: rename commit_rewrite_person() to apply_mailmap_to_header()
    @@ cache.h: struct ident_split {
       * Compare split idents for equality or strict ordering. Note that we
     
      ## ident.c ##
    -@@ ident.c: static ssize_t rewrite_ident_line(const char *person, struct strbuf *buf,
    +@@ ident.c: static ssize_t rewrite_ident_line(const char *person, size_t len,
      	return 0;
      }
      
     -void commit_rewrite_person(struct strbuf *buf, const char **header,
    --						   struct string_list *mailmap)
    +-			    struct string_list *mailmap)
     +void apply_mailmap_to_header(struct strbuf *buf, const char **header,
    -+							 struct string_list *mailmap)
    ++			       struct string_list *mailmap)
      {
      	size_t buf_offset = 0;
      
4:  0a459d4c53 = 4:  63d6f8c201 cat-file: add mailmap support

Comments

Junio C Hamano July 25, 2022, 6:58 p.m. UTC | #1
Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> Changes in v6:
> - The function rewrite_ident_line() returns the difference between the
>   new and the old length of the ident line. We were not using this
>   information and instead parsing the buffer again to look for the line
>   ending. This patch set starts using that information to update the
>   buf_offset value in commit_rewrite_person().
> - This patch set also tweaks the commit_rewrite_person() so that it is
>   easier to understand and avoids unnecessary parsing of the buffer
>   wherever possible.
>
> Siddharth Asthana (4):
>   revision: improve commit_rewrite_person()
>   ident: move commit_rewrite_person() to ident.c
>   ident: rename commit_rewrite_person() to apply_mailmap_to_header()
>   cat-file: add mailmap support
>
>  Documentation/git-cat-file.txt |  6 +++
>  builtin/cat-file.c             | 43 +++++++++++++++++++-
>  cache.h                        |  6 +++
>  ident.c                        | 74 ++++++++++++++++++++++++++++++++++
>  revision.c                     | 50 ++---------------------
>  t/t4203-mailmap.sh             | 59 +++++++++++++++++++++++++++
>  6 files changed, 190 insertions(+), 48 deletions(-)

I haven't seen any comments or objections to this round.  Are people
happy about it going forward?  I am planning to merge it to 'next'
and down to 'master' soonish.

Thanks.
Christian Couder July 28, 2022, 7:07 p.m. UTC | #2
On Mon, Jul 25, 2022 at 8:58 PM Junio C Hamano <gitster@pobox.com> wrote:
> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>
> > Changes in v6:
> > - The function rewrite_ident_line() returns the difference between the
> >   new and the old length of the ident line. We were not using this
> >   information and instead parsing the buffer again to look for the line
> >   ending. This patch set starts using that information to update the
> >   buf_offset value in commit_rewrite_person().
> > - This patch set also tweaks the commit_rewrite_person() so that it is
> >   easier to understand and avoids unnecessary parsing of the buffer
> >   wherever possible.
> >
> > Siddharth Asthana (4):
> >   revision: improve commit_rewrite_person()
> >   ident: move commit_rewrite_person() to ident.c
> >   ident: rename commit_rewrite_person() to apply_mailmap_to_header()
> >   cat-file: add mailmap support
> >
> >  Documentation/git-cat-file.txt |  6 +++
> >  builtin/cat-file.c             | 43 +++++++++++++++++++-
> >  cache.h                        |  6 +++
> >  ident.c                        | 74 ++++++++++++++++++++++++++++++++++
> >  revision.c                     | 50 ++---------------------
> >  t/t4203-mailmap.sh             | 59 +++++++++++++++++++++++++++
> >  6 files changed, 190 insertions(+), 48 deletions(-)
>
> I haven't seen any comments or objections to this round.  Are people
> happy about it going forward?  I am planning to merge it to 'next'
> and down to 'master' soonish.

I am biased, but I am happy with the current state of this patch
series. During the last versions of this patch series there were only
comments related to the first patch in the series (revision: improve
commit_rewrite_person()). It seems to me that they were all properly
taken into account, and that the code in that patch is now correct and
relatively simple to understand.

Thanks,
Christian.
Junio C Hamano July 28, 2022, 7:32 p.m. UTC | #3
Christian Couder <christian.couder@gmail.com> writes:

> On Mon, Jul 25, 2022 at 8:58 PM Junio C Hamano <gitster@pobox.com> wrote:
>> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>>
>> > Changes in v6:
>> > - The function rewrite_ident_line() returns the difference between the
>> >   new and the old length of the ident line. We were not using this
>> >   information and instead parsing the buffer again to look for the line
>> >   ending. This patch set starts using that information to update the
>> >   buf_offset value in commit_rewrite_person().
>> > - This patch set also tweaks the commit_rewrite_person() so that it is
>> >   easier to understand and avoids unnecessary parsing of the buffer
>> >   wherever possible.
>> >
>> > Siddharth Asthana (4):
>> >   revision: improve commit_rewrite_person()
>> >   ident: move commit_rewrite_person() to ident.c
>> >   ident: rename commit_rewrite_person() to apply_mailmap_to_header()
>> >   cat-file: add mailmap support
>> >
>> >  Documentation/git-cat-file.txt |  6 +++
>> >  builtin/cat-file.c             | 43 +++++++++++++++++++-
>> >  cache.h                        |  6 +++
>> >  ident.c                        | 74 ++++++++++++++++++++++++++++++++++
>> >  revision.c                     | 50 ++---------------------
>> >  t/t4203-mailmap.sh             | 59 +++++++++++++++++++++++++++
>> >  6 files changed, 190 insertions(+), 48 deletions(-)
>>
>> I haven't seen any comments or objections to this round.  Are people
>> happy about it going forward?  I am planning to merge it to 'next'
>> and down to 'master' soonish.
>
> I am biased, but I am happy with the current state of this patch
> series. During the last versions of this patch series there were only
> comments related to the first patch in the series (revision: improve
> commit_rewrite_person()). It seems to me that they were all properly
> taken into account, and that the code in that patch is now correct and
> relatively simple to understand.

Thanks, let's move it forward.
Siddharth Asthana July 30, 2022, 7:50 a.m. UTC | #4
On 29/07/22 01:02, Junio C Hamano wrote:
> Christian Couder <christian.couder@gmail.com> writes:
> 
>> On Mon, Jul 25, 2022 at 8:58 PM Junio C Hamano <gitster@pobox.com> wrote:
>>> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>>>
>>>> Changes in v6:
>>>> - The function rewrite_ident_line() returns the difference between the
>>>>    new and the old length of the ident line. We were not using this
>>>>    information and instead parsing the buffer again to look for the line
>>>>    ending. This patch set starts using that information to update the
>>>>    buf_offset value in commit_rewrite_person().
>>>> - This patch set also tweaks the commit_rewrite_person() so that it is
>>>>    easier to understand and avoids unnecessary parsing of the buffer
>>>>    wherever possible.
>>>>
>>>> Siddharth Asthana (4):
>>>>    revision: improve commit_rewrite_person()
>>>>    ident: move commit_rewrite_person() to ident.c
>>>>    ident: rename commit_rewrite_person() to apply_mailmap_to_header()
>>>>    cat-file: add mailmap support
>>>>
>>>>   Documentation/git-cat-file.txt |  6 +++
>>>>   builtin/cat-file.c             | 43 +++++++++++++++++++-
>>>>   cache.h                        |  6 +++
>>>>   ident.c                        | 74 ++++++++++++++++++++++++++++++++++
>>>>   revision.c                     | 50 ++---------------------
>>>>   t/t4203-mailmap.sh             | 59 +++++++++++++++++++++++++++
>>>>   6 files changed, 190 insertions(+), 48 deletions(-)
>>>
>>> I haven't seen any comments or objections to this round.  Are people
>>> happy about it going forward?  I am planning to merge it to 'next'
>>> and down to 'master' soonish.
>>
>> I am biased, but I am happy with the current state of this patch
>> series. During the last versions of this patch series there were only
>> comments related to the first patch in the series (revision: improve
>> commit_rewrite_person()). It seems to me that they were all properly
>> taken into account, and that the code in that patch is now correct and
>> relatively simple to understand.
> 
> Thanks, let's move it forward.
Thanks a lot Junio, Phillip, Đoàn, Ævar, Johannes, Christian and John 
for helping me with the reviews and making this patch better. Thanks a 
lot for accepting it :)