mbox series

[v3,0/4] Add support for mailmap in cat-file

Message ID 20220709154149.165524-1-siddharthasthana31@gmail.com (mailing list archive)
Headers show
Series Add support for mailmap in cat-file | expand

Message

Siddharth Asthana July 9, 2022, 3:41 p.m. UTC
Thanks a lot for the review and suggestions Junio, Danh and Johannes.
Really grateful for that :)

= Description

This patch series adds mailmap support to the git-cat-file command. It
adds the mailmap support only for the commit and tag objects by
replacing the idents for "author", "committer" and "tagger" headers. The
mailmap only takes effect when --[no-]-use-mailmap or --[no-]-mailmap
option is passed to the git cat-file command. The changes will work with
the batch mode as well.

So, if one wants to enable mailmap they can use either of the following
commands:
$ git cat-file --use-mailmap -p <object>
$ git cat-file --use-mailmap <type> <object>

To use it in the batch mode, one can use the following command:
$ git cat-file --use-mailmap --batch

= Patch Organization

- The first patch improves the commit_rewrite_person() by restricting it 
  to traverse only through the header part of the commit object buffer.
  It also adds an argument called headers which the callers can pass. 
  The function will replace idents only on these  passed headers. 
  Thus, the caller won't have to make repeated calls to the function.
- The second patch moves commit_rewrite_person() to ident.c to expose it
  as a public function so that it can be used to replace idents in the
  headers of desired objects.
- The third patch renames commit_rewrite_person() to a name which
  describes its functionality clearly. It is renamed to
  apply_mailmap_to_header().
- The last patch adds mailmap support to the git cat-file command. It
  adds the required documentation and tests as well.

Changes in v3:
- The decl-after-statement warnings have been fixed in all the patches.
- In commit_rewrite_person(), the data type of linelen and i variables
  have been changed from int to size_t.
- The return type of replace_idents_using_mailmap() function, size_t,
  has been explicitly typecasted to unsigned long using the
  cast_size_t_to_ulong() helper method.

Siddharth Asthana (4):
  revision: improve commit_rewrite_person()
  ident: move commit_rewrite_person() to ident.c
  ident: rename commit_rewrite_person() to apply_mailmap_to_header()
  cat-file: add mailmap support

 Documentation/git-cat-file.txt |  6 +++
 builtin/cat-file.c             | 43 +++++++++++++++++++-
 cache.h                        |  6 +++
 ident.c                        | 72 ++++++++++++++++++++++++++++++++++
 revision.c                     | 50 ++---------------------
 t/t4203-mailmap.sh             | 54 +++++++++++++++++++++++++
 6 files changed, 183 insertions(+), 48 deletions(-)

Range-diff against v2:
1:  64e1f750e1 ! 1:  9e95326c58 revision: improve commit_rewrite_person()
    @@ Commit message
     
         The function, commit_rewrite_person(), is designed to find and replace
         an ident string in the header part, and the way it avoids a random
    -    occuranace of "author A U Thor <author@example.com" in the text is by
    +    occurrence of "author A U Thor <author@example.com" in the text is by
         insisting "author" to appear at the beginning of line by passing
         "\nauthor " as "what".
     
    @@ Commit message
         changed from int to void. This has been done because the caller of the
         function doesn't do anything with the return value of the function.
     
    -    By simplyfying the interface of the commit_rewrite_person(), we also
    +    By simplifying the interface of the commit_rewrite_person(), we also
         intend to expose it as a public function. We will also be renaming the
         function in a future commit to a different name which clearly tells that
         the function replaces idents in the header of the commit buffer.
     
         Mentored-by: Christian Couder <christian.couder@gmail.com>
         Mentored-by: John Cai <johncai86@gmail.com>
    +    Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## revision.c ##
    @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *wha
     +		return;
     +
     +	for (;;) {
    -+		const char *person, *line = buf->buf + buf_offset;
    -+		int i, linelen = strchrnul(line, '\n') - line + 1;
    ++		const char *person, *line;
    ++		size_t i, linelen;
     +
    -+		if (!linelen || linelen == 1)
    ++		line = buf->buf + buf_offset;
    ++		linelen = strchrnul(line, '\n') - line + 1;
    ++
    ++		if (linelen <= 1)
     +			/* End of header */
     +			return;
     +
    @@ revision.c: static int commit_rewrite_person(struct strbuf *buf, const char *wha
      {
      	int retval;
     @@ revision.c: static int commit_match(struct commit *commit, struct rev_info *opt)
    + 		strbuf_addstr(&buf, message);
    + 
    + 	if (opt->grep_filter.header_list && opt->mailmap) {
    ++		const char *commit_headers[] = { "author ", "committer ", NULL };
    ++
      		if (!buf.len)
      			strbuf_addstr(&buf, message);
      
     -		commit_rewrite_person(&buf, "\nauthor ", opt->mailmap);
     -		commit_rewrite_person(&buf, "\ncommitter ", opt->mailmap);
    -+		const char *commit_headers[] = { "author ", "committer ", NULL };
     +		commit_rewrite_person(&buf, commit_headers, opt->mailmap);
      	}
      
2:  b18ced0ece ! 2:  d9395cb8b2 ident: move commit_rewrite_person() to ident.c
    @@ ident.c: int split_ident_line(struct ident_split *split, const char *line, int l
     +		return;
     +
     +	for (;;) {
    -+		const char *person, *line = buf->buf + buf_offset;
    -+		int i, linelen = strchrnul(line, '\n') - line + 1;
    ++		const char *person, *line;
    ++		size_t i, linelen;
     +
    -+		if (!linelen || linelen == 1)
    ++		line = buf->buf + buf_offset;
    ++		linelen = strchrnul(line, '\n') - line + 1;
    ++
    ++		if (linelen <= 1)
     +			/* End of header */
     +			return;
     +
    @@ revision.c: int rewrite_parents(struct rev_info *revs, struct commit *commit,
     -		return;
     -
     -	for (;;) {
    --		const char *person, *line = buf->buf + buf_offset;
    --		int i, linelen = strchrnul(line, '\n') - line + 1;
    +-		const char *person, *line;
    +-		size_t i, linelen;
    +-
    +-		line = buf->buf + buf_offset;
    +-		linelen = strchrnul(line, '\n') - line + 1;
     -
    --		if (!linelen || linelen == 1)
    +-		if (linelen <= 1)
     -			/* End of header */
     -			return;
     -
3:  2494ce1ed2 ! 3:  355bbda25e ident: rename commit_rewrite_person() to apply_mailmap_to_header()
    @@ ident.c: static ssize_t rewrite_ident_line(const char* person, struct strbuf *bu
     
      ## revision.c ##
     @@ revision.c: static int commit_match(struct commit *commit, struct rev_info *opt)
    + 		if (!buf.len)
      			strbuf_addstr(&buf, message);
      
    - 		const char *commit_headers[] = { "author ", "committer ", NULL };
     -		commit_rewrite_person(&buf, commit_headers, opt->mailmap);
     +		apply_mailmap_to_header(&buf, commit_headers, opt->mailmap);
      	}
4:  94838a2566 ! 4:  69b7ad898b cat-file: add mailmap support
    @@ Commit message
         Mentored-by: Christian Couder <christian.couder@gmail.com>
         Mentored-by: John Cai <johncai86@gmail.com>
         Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
    +    Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
         Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
     
      ## Documentation/git-cat-file.txt ##
    @@ builtin/cat-file.c: struct batch_options {
     +static struct string_list mailmap = STRING_LIST_INIT_NODUP;
     +static int use_mailmap;
     +
    -+char *replace_idents_using_mailmap(char *object_buf, size_t *size)
    ++static char *replace_idents_using_mailmap(char *, size_t *);
    ++
    ++static char *replace_idents_using_mailmap(char *object_buf, size_t *size)
     +{
     +	struct strbuf sb = STRBUF_INIT;
    -+	strbuf_attach(&sb, object_buf, *size, *size + 1);
     +	const char *headers[] = { "author ", "committer ", "tagger ", NULL };
    ++
    ++	strbuf_attach(&sb, object_buf, *size, *size + 1);
     +	apply_mailmap_to_header(&sb, headers, &mailmap);
     +	*size = sb.len;
     +	return strbuf_detach(&sb, NULL);
    @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const
      		if (!buf)
      			die("Cannot read object %s", obj_name);
      
    -+		if (use_mailmap)
    -+			buf = replace_idents_using_mailmap(buf, &size);
    ++		if (use_mailmap) {
    ++			size_t s = size;
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			size = cast_size_t_to_ulong(s);
    ++		}
     +
      		/* otherwise just spit out the data */
      		break;
    @@ builtin/cat-file.c: static int cat_one_file(int opt, const char *exp_type, const
      		buf = read_object_with_reference(the_repository, &oid,
      						 exp_type_id, &size, NULL);
     +
    -+		if (use_mailmap)
    -+			buf = replace_idents_using_mailmap(buf, &size);
    ++		if (use_mailmap) {
    ++			size_t s = size;
    ++			buf = replace_idents_using_mailmap(buf, &s);
    ++			size = cast_size_t_to_ulong(s);
    ++		}
      		break;
      	}
      	default:
    @@ builtin/cat-file.c: static void print_object_or_die(struct batch_options *opt, s
      
      		contents = read_object_file(oid, &type, &size);
     +
    -+		if (use_mailmap)
    -+			contents = replace_idents_using_mailmap(contents, &size);
    ++		if (use_mailmap) {
    ++			size_t s = size;
    ++			contents = replace_idents_using_mailmap(contents, &s);
    ++			size = cast_size_t_to_ulong(s);
    ++		}
     +
      		if (!contents)
      			die("object %s disappeared", oid_to_hex(oid));

Comments

Junio C Hamano July 10, 2022, 5:34 a.m. UTC | #1
Siddharth Asthana <siddharthasthana31@gmail.com> writes:

> = Patch Organization
>
> - The first patch improves the commit_rewrite_person() by restricting it 
>   to traverse only through the header part of the commit object buffer.
>   It also adds an argument called headers which the callers can pass. 
>   The function will replace idents only on these  passed headers. 
>   Thus, the caller won't have to make repeated calls to the function.
> - The second patch moves commit_rewrite_person() to ident.c to expose it
>   as a public function so that it can be used to replace idents in the
>   headers of desired objects.
> - The third patch renames commit_rewrite_person() to a name which
>   describes its functionality clearly. It is renamed to
>   apply_mailmap_to_header().
> - The last patch adds mailmap support to the git cat-file command. It
>   adds the required documentation and tests as well.
>
> Changes in v3:
> - The decl-after-statement warnings have been fixed in all the patches.
> - In commit_rewrite_person(), the data type of linelen and i variables
>   have been changed from int to size_t.
> - The return type of replace_idents_using_mailmap() function, size_t,
>   has been explicitly typecasted to unsigned long using the
>   cast_size_t_to_ulong() helper method.

https://github.com/git/git/actions/runs/2642867380 seems to tell us
that tests added by this series are broken on Windows.  I am not
sure what exactly in this series depends on being LF-only system,
but the symptom makes me suspect that is the cause of the problem.
Johannes Schindelin July 12, 2022, 12:34 p.m. UTC | #2
Hi Junio & Siddarth,

On Sat, 9 Jul 2022, Junio C Hamano wrote:

> Siddharth Asthana <siddharthasthana31@gmail.com> writes:
>
> > = Patch Organization
> >
> > - The first patch improves the commit_rewrite_person() by restricting it
> >   to traverse only through the header part of the commit object buffer.
> >   It also adds an argument called headers which the callers can pass.
> >   The function will replace idents only on these  passed headers.
> >   Thus, the caller won't have to make repeated calls to the function.
> > - The second patch moves commit_rewrite_person() to ident.c to expose it
> >   as a public function so that it can be used to replace idents in the
> >   headers of desired objects.
> > - The third patch renames commit_rewrite_person() to a name which
> >   describes its functionality clearly. It is renamed to
> >   apply_mailmap_to_header().
> > - The last patch adds mailmap support to the git cat-file command. It
> >   adds the required documentation and tests as well.
> >
> > Changes in v3:
> > - The decl-after-statement warnings have been fixed in all the patches.
> > - In commit_rewrite_person(), the data type of linelen and i variables
> >   have been changed from int to size_t.
> > - The return type of replace_idents_using_mailmap() function, size_t,
> >   has been explicitly typecasted to unsigned long using the
> >   cast_size_t_to_ulong() helper method.
>
> https://github.com/git/git/actions/runs/2642867380 seems to tell us
> that tests added by this series are broken on Windows.  I am not
> sure what exactly in this series depends on being LF-only system,
> but the symptom makes me suspect that is the cause of the problem.

It has nothing to do with LF-only, but everything to do with symlinks (I
suspected that when I saw the skipped test cases at
https://github.com/git/git/runs/7292710632?check_suite_focus=true#step:5:195,
and could validate that suspicion via disabling the test cases by
inverting the prereq: this caused the very same symptoms even in a Linux
setup):

-- snipsnap --
From 5bc6d52c95401f60e67312823ed406bd5a3c1026 Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Tue, 12 Jul 2022 14:28:05 +0200
Subject: [PATCH] fixup??? cat-file: add mailmap support

This patch introduced new test cases that rely on the side effects of
the earlier test case `set up symlink tests`. However, that test case is
guarded behind the `SYMLINKS` prereq, therefore it is not run e.g. on
Windows.

Let's fix that by removing the prereq from the `set up` test case, and
adjusting its title to reflect its broadened responsibility.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t4203-mailmap.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index c60a90615cc..94afd4717a2 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -932,7 +932,7 @@ test_expect_success 'find top-level mailmap from subdir' '
 	test_cmp expect actual
 '

-test_expect_success SYMLINKS 'set up symlink tests' '
+test_expect_success 'set up symlink/--use-mailmap tests' '
 	git commit --allow-empty -m foo --author="Orig <orig@example.com>" &&
 	echo "New <new@example.com> <orig@example.com>" >map &&
 	rm -f .mailmap
--
2.37.0.rc2.windows.1.7.g45a475aeb84
Junio C Hamano July 12, 2022, 2:16 p.m. UTC | #3
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> This patch introduced new test cases that rely on the side effects of
> the earlier test case `set up symlink tests`. However, that test case is
> guarded behind the `SYMLINKS` prereq, therefore it is not run e.g. on
> Windows.

Ah, that explains why it only fails there.

> Let's fix that by removing the prereq from the `set up` test case, and
> adjusting its title to reflect its broadened responsibility.
>
> -test_expect_success SYMLINKS 'set up symlink tests' '
> +test_expect_success 'set up symlink/--use-mailmap tests' '
>  	git commit --allow-empty -m foo --author="Orig <orig@example.com>" &&
>  	echo "New <new@example.com> <orig@example.com>" >map &&
>  	rm -f .mailmap

OK, this sets up

 * one commit that can be used in a test, authored by "Orig" person;
 * the "map" file that maps the "Orig" person; and
 * ensures .mailmap is not there.

with the intention to make a symbolic link that points at the "map"
to use as the mailmap file in later tests.  This step does not require
symbolic links at all, but because the point of this set-up is to serve
the later tests, all requiring symbolic link support, it was OK to have
the prerequisite.

The cat-file tests does not have to use the "map" file to do its
thing at all.  In fact, these tests prepare their own .mailmap file
inside them.  But because it chose to run in the history prepared by
previous tests, it broke, because without SYMLINKS, the sought-for
commit does not get created.

Makes sense.  I would have retitled it to s/set up/prepare for/ but
that is minor.

Thanks.  Siddharth, please squash the fix in when rerolling.
Siddharth Asthana July 12, 2022, 4:01 p.m. UTC | #4
On 12/07/22 19:46, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
>> This patch introduced new test cases that rely on the side effects of
>> the earlier test case `set up symlink tests`. However, that test case is
>> guarded behind the `SYMLINKS` prereq, therefore it is not run e.g. on
>> Windows.
> 
> Ah, that explains why it only fails there.
> 
>> Let's fix that by removing the prereq from the `set up` test case, and
>> adjusting its title to reflect its broadened responsibility.
>>
>> -test_expect_success SYMLINKS 'set up symlink tests' '
>> +test_expect_success 'set up symlink/--use-mailmap tests' '
>>   	git commit --allow-empty -m foo --author="Orig <orig@example.com>" &&
>>   	echo "New <new@example.com> <orig@example.com>" >map &&
>>   	rm -f .mailmap
> 
> OK, this sets up
> 
>   * one commit that can be used in a test, authored by "Orig" person;
>   * the "map" file that maps the "Orig" person; and
>   * ensures .mailmap is not there.
> 
> with the intention to make a symbolic link that points at the "map"
> to use as the mailmap file in later tests.  This step does not require
> symbolic links at all, but because the point of this set-up is to serve
> the later tests, all requiring symbolic link support, it was OK to have
> the prerequisite.
> 
> The cat-file tests does not have to use the "map" file to do its
> thing at all.  In fact, these tests prepare their own .mailmap file
> inside them.  But because it chose to run in the history prepared by
> previous tests, it broke, because without SYMLINKS, the sought-for
> commit does not get created.
> 
> Makes sense.  I would have retitled it to s/set up/prepare for/ but
> that is minor.
> 
> Thanks.  Siddharth, please squash the fix in when rerolling.
> 
Thanks a ton Johannes and Junio for helping me fix the test :D
Will squash the fix in v4!
Junio C Hamano July 12, 2022, 4:06 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> The cat-file tests does not have to use the "map" file to do its
> thing at all.  In fact, these tests prepare their own .mailmap file
> inside them.  But because it chose to run in the history prepared by
> previous tests, it broke, because without SYMLINKS, the sought-for
> commit does not get created.

So, an alternative solution is to keep the existing tests on
symlinks totally unrelated to these new tests.  These cat-file tests
can prepare the commit to munge at the beginning of the sequence,
and then do its thing.  This way, a platform without symlink support
does not have to create the "map" file that nobody uses, something
like the attached.

I do not have strong preference either way, though.

 t/t4203-mailmap.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git c/t/t4203-mailmap.sh w/t/t4203-mailmap.sh
index c60a90615c..cd1cab3e54 100755
--- c/t/t4203-mailmap.sh
+++ w/t/t4203-mailmap.sh
@@ -963,6 +963,11 @@ test_expect_success SYMLINKS 'symlinks not respected in-tree' '
 	test_cmp expect actual
 '
 
+test_expect_success 'prepare for cat-file --mailmap' '
+	rm -f .mailmap &&
+	git commit --allow-empty -m foo --author="Orig <orig@example.com>"
+'
+
 test_expect_success '--no-use-mailmap disables mailmap in cat-file' '
 	test_when_finished "rm .mailmap" &&
 	cat >.mailmap <<-EOF &&