diff mbox series

blame: fix unblamable and ignored lines in porcelain mode

Message ID 20250321-514-git-blame-1-s-porcelain-output-does-not-emit-unblamable-and-ignored-markers-v1-1-44b562d9beb8@gmail.com (mailing list archive)
State New
Headers show
Series blame: fix unblamable and ignored lines in porcelain mode | expand

Commit Message

Karthik Nayak March 21, 2025, 4:39 p.m. UTC
The 'git-blame(1)' command allows users to ignore specific revisions via
the '--ignore-rev <rev>' and '--ignore-revs-file <file>' flags. These
flags are often combined with the 'blame.markIgnoredLines' and
'blame.markUnblamableLines' config options. These config options prefix
ignored and unblamable lines with a '?' and '*', respectively.

However, this option was never extended to the porcelain mode of
'git-blame(1)'. Since the documentation does not indicate this
exclusion, it is a bug.

Fix this by ensuring porcelain mode also prints the markers and add
tests to verify the behavior.

Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
 builtin/blame.c              | 10 ++++++++++
 t/t8013-blame-ignore-revs.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

Karthik Nayak (1):
      blame: fix unblamable and ignored lines in porcelain mode



base-commit: 683c54c999c301c2cd6f715c411407c413b1d84e
change-id: 20250321-514-git-blame-1-s-porcelain-output-does-not-emit-unblamable-and-ignored-markers-4af46f02847e

Thanks
- Karthik
---

---
 builtin/blame.c              | 10 ++++++++++
 t/t8013-blame-ignore-revs.sh | 30 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

Comments

Junio C Hamano March 23, 2025, 3:58 p.m. UTC | #1
Karthik Nayak <karthik.188@gmail.com> writes:

> However, this option was never extended to the porcelain mode of
> 'git-blame(1)'. Since the documentation does not indicate this
> exclusion, it is a bug.

I agree it is a bug when people added ignore or unblamable support
that they did not _consider_ what to do with their new pieces of
information to help porcelain writers.  It is not a bug in the code
per-se, but it is a bug in the brain of these people ;-)

But prefixing random garbage to the commit object name line in the
porcelain mode output does not sound like the right solution to the
bug, either.

When enhancing an existing output format, make sure that your
changes will have minimum empact to existing parsers that do not
know about your extension.  It is reasonably expected that existing
Porcelain scripts reading from --porcelain mode output works by

 - Recognizing a line that match "^[0-9a-f]{40} \d+ \d+ \d+$" and
   take it as the beginning of a new record;

 - Collect all info lines before the payload line.  Lines that
   describe per-commit information are not repeated if it is already
   shown, so remember them when you see the commit for the first
   time, and recall them when you recognize the commit you already
   saw.

 - A payload line is indented with HT and terminates the record.

If you start to add unrecognizable garbage to the line with very
well known fixed format that is used as record delimiter, you would
break the existing parsers, which is not a very nice thing to do.
Are there other and better ways you can think of to add new pieces
of information like this in a way with less severe damage?
Patrick Steinhardt March 24, 2025, 10:16 a.m. UTC | #2
On Sun, Mar 23, 2025 at 08:58:03AM -0700, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> 
> > However, this option was never extended to the porcelain mode of
> > 'git-blame(1)'. Since the documentation does not indicate this
> > exclusion, it is a bug.
> 
> I agree it is a bug when people added ignore or unblamable support
> that they did not _consider_ what to do with their new pieces of
> information to help porcelain writers.  It is not a bug in the code
> per-se, but it is a bug in the brain of these people ;-)
> 
> But prefixing random garbage to the commit object name line in the
> porcelain mode output does not sound like the right solution to the
> bug, either.
> 
> When enhancing an existing output format, make sure that your
> changes will have minimum empact to existing parsers that do not
> know about your extension.  It is reasonably expected that existing
> Porcelain scripts reading from --porcelain mode output works by
> 
>  - Recognizing a line that match "^[0-9a-f]{40} \d+ \d+ \d+$" and
>    take it as the beginning of a new record;
> 
>  - Collect all info lines before the payload line.  Lines that
>    describe per-commit information are not repeated if it is already
>    shown, so remember them when you see the commit for the first
>    time, and recall them when you recognize the commit you already
>    saw.
> 
>  - A payload line is indented with HT and terminates the record.
> 
> If you start to add unrecognizable garbage to the line with very
> well known fixed format that is used as record delimiter, you would
> break the existing parsers, which is not a very nice thing to do.
> Are there other and better ways you can think of to add new pieces
> of information like this in a way with less severe damage?

I think the porcelain mode is already built so that it can be extended
with arbitrary new information, no? In `emit_one_suspect_detail()` we
end up printing one line per info we want to display. I would have
expected that we can extend that function to also print information
around unblamable or ignored commits, like we already do for boundary
commits. E.g. something like the patch further down.

Thanks!

Patrick

diff --git a/builtin/blame.c b/builtin/blame.c
index c470654c7ec..cd8322e2619 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -255,7 +255,8 @@ static void write_filename_info(struct blame_origin *suspect)
  * the first time each commit appears in the output (unless the
  * user has specifically asked for us to repeat).
  */
-static int emit_one_suspect_detail(struct blame_origin *suspect, int repeat)
+static int emit_one_suspect_detail(struct blame_entry *ent,
+				   struct blame_origin *suspect, int repeat)
 {
 	struct commit_info ci = COMMIT_INFO_INIT;
 
@@ -275,6 +276,10 @@ static int emit_one_suspect_detail(struct blame_origin *suspect, int repeat)
 	printf("summary %s\n", ci.summary.buf);
 	if (suspect->commit->object.flags & UNINTERESTING)
 		printf("boundary\n");
+	if (mark_unblamable_lines && ent->unblamable)
+		printf("unblamable\n");
+	if (mark_ignored_lines && ent->ignored)
+		printf("ignored\n");
 
 	commit_info_destroy(&ci);
 
@@ -295,7 +300,7 @@ static void found_guilty_entry(struct blame_entry *ent, void *data)
 		printf("%s %d %d %d\n",
 		       oid_to_hex(&suspect->commit->object.oid),
 		       ent->s_lno + 1, ent->lno + 1, ent->num_lines);
-		emit_one_suspect_detail(suspect, 0);
+		emit_one_suspect_detail(ent, suspect, 0);
 		write_filename_info(suspect);
 		maybe_flush_or_die(stdout, "stdout");
 	}
@@ -344,9 +349,10 @@ static const char *format_time(timestamp_t time, const char *tz_str,
 #define OUTPUT_COLOR_LINE           (1U<<10)
 #define OUTPUT_SHOW_AGE_WITH_COLOR  (1U<<11)
 
-static void emit_porcelain_details(struct blame_origin *suspect, int repeat)
+static void emit_porcelain_details(struct blame_entry *ent,
+				   struct blame_origin *suspect, int repeat)
 {
-	if (emit_one_suspect_detail(suspect, repeat) ||
+	if (emit_one_suspect_detail(ent, suspect, repeat) ||
 	    (suspect->commit->object.flags & MORE_THAN_ONE_PATH))
 		write_filename_info(suspect);
 }
@@ -366,7 +372,7 @@ static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
 	       ent->s_lno + 1,
 	       ent->lno + 1,
 	       ent->num_lines);
-	emit_porcelain_details(suspect, repeat);
+	emit_porcelain_details(ent, suspect, repeat);
 
 	cp = blame_nth_line(sb, ent->lno);
 	for (cnt = 0; cnt < ent->num_lines; cnt++) {
@@ -376,7 +382,7 @@ static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
 			       ent->s_lno + 1 + cnt,
 			       ent->lno + 1 + cnt);
 			if (repeat)
-				emit_porcelain_details(suspect, 1);
+				emit_porcelain_details(ent, suspect, 1);
 		}
 		putchar('\t');
 		do {
Toon Claes March 24, 2025, 10:37 a.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> I think the porcelain mode is already built so that it can be extended
> with arbitrary new information, no? In `emit_one_suspect_detail()` we
> end up printing one line per info we want to display. I would have
> expected that we can extend that function to also print information
> around unblamable or ignored commits, like we already do for boundary
> commits. E.g. something like the patch further down.

Yeah, I think the porcelain format exists to be easy to machine-parse.
Having an optional prefix symbol on the commit OID would complicate
process that.

And I've been thinking about a similar solution as you've been
suggesting below. I was only wondering whether we only do this when
using `--line-porcelain`. When using `--porcelain` the function
`emit_one_suspect_detail()` doesn't print most of the commit info when
it was already printed. But the "unblamable" and "ignored" info might be
different for each line, even if they blame down to the same commit.
Karthik Nayak March 24, 2025, 7:56 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> However, this option was never extended to the porcelain mode of
>> 'git-blame(1)'. Since the documentation does not indicate this
>> exclusion, it is a bug.
>
> I agree it is a bug when people added ignore or unblamable support
> that they did not _consider_ what to do with their new pieces of
> information to help porcelain writers.  It is not a bug in the code
> per-se, but it is a bug in the brain of these people ;-)
>
> But prefixing random garbage to the commit object name line in the
> porcelain mode output does not sound like the right solution to the
> bug, either.
>
> When enhancing an existing output format, make sure that your
> changes will have minimum empact to existing parsers that do not
> know about your extension.  It is reasonably expected that existing
> Porcelain scripts reading from --porcelain mode output works by
>
>  - Recognizing a line that match "^[0-9a-f]{40} \d+ \d+ \d+$" and
>    take it as the beginning of a new record;
>
>  - Collect all info lines before the payload line.  Lines that
>    describe per-commit information are not repeated if it is already
>    shown, so remember them when you see the commit for the first
>    time, and recall them when you recognize the commit you already
>    saw.
>
>  - A payload line is indented with HT and terminates the record.
>
> If you start to add unrecognizable garbage to the line with very
> well known fixed format that is used as record delimiter, you would
> break the existing parsers, which is not a very nice thing to do.
> Are there other and better ways you can think of to add new pieces
> of information like this in a way with less severe damage?

Fair enough, I was having this argument and convinced myself because the
option is an explicit setting. So users who have
'blame.markIgnoredLines' or/and 'blame.markUnblamableLines' set and are
using one of the 'ignore-rev' options would be expecting either an '?'
or a '*' as mentioned in the documentation.

Let's me figure if there is a more backward-compatible way of doing
this.
Karthik Nayak March 24, 2025, 8 p.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

> On Sun, Mar 23, 2025 at 08:58:03AM -0700, Junio C Hamano wrote:
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>> > However, this option was never extended to the porcelain mode of
>> > 'git-blame(1)'. Since the documentation does not indicate this
>> > exclusion, it is a bug.
>>
>> I agree it is a bug when people added ignore or unblamable support
>> that they did not _consider_ what to do with their new pieces of
>> information to help porcelain writers.  It is not a bug in the code
>> per-se, but it is a bug in the brain of these people ;-)
>>
>> But prefixing random garbage to the commit object name line in the
>> porcelain mode output does not sound like the right solution to the
>> bug, either.
>>
>> When enhancing an existing output format, make sure that your
>> changes will have minimum empact to existing parsers that do not
>> know about your extension.  It is reasonably expected that existing
>> Porcelain scripts reading from --porcelain mode output works by
>>
>>  - Recognizing a line that match "^[0-9a-f]{40} \d+ \d+ \d+$" and
>>    take it as the beginning of a new record;
>>
>>  - Collect all info lines before the payload line.  Lines that
>>    describe per-commit information are not repeated if it is already
>>    shown, so remember them when you see the commit for the first
>>    time, and recall them when you recognize the commit you already
>>    saw.
>>
>>  - A payload line is indented with HT and terminates the record.
>>
>> If you start to add unrecognizable garbage to the line with very
>> well known fixed format that is used as record delimiter, you would
>> break the existing parsers, which is not a very nice thing to do.
>> Are there other and better ways you can think of to add new pieces
>> of information like this in a way with less severe damage?
>
> I think the porcelain mode is already built so that it can be extended
> with arbitrary new information, no? In `emit_one_suspect_detail()` we
> end up printing one line per info we want to display. I would have
> expected that we can extend that function to also print information
> around unblamable or ignored commits, like we already do for boundary
> commits. E.g. something like the patch further down.
>

This indeed looks like a much better way of doing this. Let me

> Thanks!
>
> Patrick
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index c470654c7ec..cd8322e2619 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -255,7 +255,8 @@ static void write_filename_info(struct blame_origin *suspect)
>   * the first time each commit appears in the output (unless the
>   * user has specifically asked for us to repeat).
>   */
> -static int emit_one_suspect_detail(struct blame_origin *suspect, int repeat)
> +static int emit_one_suspect_detail(struct blame_entry *ent,
> +				   struct blame_origin *suspect, int repeat)
>  {
>  	struct commit_info ci = COMMIT_INFO_INIT;
>
> @@ -275,6 +276,10 @@ static int emit_one_suspect_detail(struct blame_origin *suspect, int repeat)
>  	printf("summary %s\n", ci.summary.buf);
>  	if (suspect->commit->object.flags & UNINTERESTING)
>  		printf("boundary\n");
> +	if (mark_unblamable_lines && ent->unblamable)
> +		printf("unblamable\n");
> +	if (mark_ignored_lines && ent->ignored)
> +		printf("ignored\n");
>
>  	commit_info_destroy(&ci);
>
> @@ -295,7 +300,7 @@ static void found_guilty_entry(struct blame_entry *ent, void *data)
>  		printf("%s %d %d %d\n",
>  		       oid_to_hex(&suspect->commit->object.oid),
>  		       ent->s_lno + 1, ent->lno + 1, ent->num_lines);
> -		emit_one_suspect_detail(suspect, 0);
> +		emit_one_suspect_detail(ent, suspect, 0);
>  		write_filename_info(suspect);
>  		maybe_flush_or_die(stdout, "stdout");
>  	}
> @@ -344,9 +349,10 @@ static const char *format_time(timestamp_t time, const char *tz_str,
>  #define OUTPUT_COLOR_LINE           (1U<<10)
>  #define OUTPUT_SHOW_AGE_WITH_COLOR  (1U<<11)
>
> -static void emit_porcelain_details(struct blame_origin *suspect, int repeat)
> +static void emit_porcelain_details(struct blame_entry *ent,
> +				   struct blame_origin *suspect, int repeat)
>  {
> -	if (emit_one_suspect_detail(suspect, repeat) ||
> +	if (emit_one_suspect_detail(ent, suspect, repeat) ||
>  	    (suspect->commit->object.flags & MORE_THAN_ONE_PATH))
>  		write_filename_info(suspect);
>  }
> @@ -366,7 +372,7 @@ static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
>  	       ent->s_lno + 1,
>  	       ent->lno + 1,
>  	       ent->num_lines);
> -	emit_porcelain_details(suspect, repeat);
> +	emit_porcelain_details(ent, suspect, repeat);
>
>  	cp = blame_nth_line(sb, ent->lno);
>  	for (cnt = 0; cnt < ent->num_lines; cnt++) {
> @@ -376,7 +382,7 @@ static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
>  			       ent->s_lno + 1 + cnt,
>  			       ent->lno + 1 + cnt);
>  			if (repeat)
> -				emit_porcelain_details(suspect, 1);
> +				emit_porcelain_details(ent, suspect, 1);
>  		}
>  		putchar('\t');
>  		do {

Thanks Patrick, I will send in the new version with this and modified
tests.
Karthik Nayak March 24, 2025, 8:04 p.m. UTC | #6
Toon Claes <toon@iotcl.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>> I think the porcelain mode is already built so that it can be extended
>> with arbitrary new information, no? In `emit_one_suspect_detail()` we
>> end up printing one line per info we want to display. I would have
>> expected that we can extend that function to also print information
>> around unblamable or ignored commits, like we already do for boundary
>> commits. E.g. something like the patch further down.
>
> Yeah, I think the porcelain format exists to be easy to machine-parse.
> Having an optional prefix symbol on the commit OID would complicate
> process that.
>
> And I've been thinking about a similar solution as you've been
> suggesting below. I was only wondering whether we only do this when
> using `--line-porcelain`. When using `--porcelain` the function
> `emit_one_suspect_detail()` doesn't print most of the commit info when
> it was already printed. But the "unblamable" and "ignored" info might be
> different for each line, even if they blame down to the same commit.
>

I'm curious, how would it be different, if they blame down to the same
commit? My understanding was "unblamable" and "ignored" are tied to
commits.

> --
> Toon
Toon Claes March 25, 2025, 8:45 a.m. UTC | #7
Karthik Nayak <karthik.188@gmail.com> writes:

> I'm curious, how would it be different, if they blame down to the same
> commit? My understanding was "unblamable" and "ignored" are tied to
> commits.

Let me include an example, let's blame `varint.h`.

First have a look at the non-porcelain format:

    $ git blame varint.h -l
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 1) #ifndef VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 2) #define VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 3)
    554544276a604c144df45efcb060c80aa322088c (Denton Liu     2019-04-29 04:28:14 -0400 4) int encode_varint(uintmax_t, unsigned char *);
    554544276a604c144df45efcb060c80aa322088c (Denton Liu     2019-04-29 04:28:14 -0400 5) uintmax_t decode_varint(const unsigned char **);
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 6)
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 7) #endif /* VARINT_H */

Now if we put `554544276a604c144df45efcb060c80aa322088c` in `.git-blame-ignore-revs`:

    $ git -c blame.markUnblamableLines=true -c blame.markIgnoredLines=true blame varint.h --ignore-revs-file .git-blame-ignore-revs -l
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 1) #ifndef VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 2) #define VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 3)
    ?d2c1898571a6a2324593e92163e8754880e0c1f (Junio C Hamano 2012-04-03 15:53:08 -0700 4) int encode_varint(uintmax_t, unsigned char *);
    ?d2c1898571a6a2324593e92163e8754880e0c1f (Junio C Hamano 2012-04-03 15:53:08 -0700 5) uintmax_t decode_varint(const unsigned char **);
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 6)
    d2c1898571a6a2324593e92163e8754880e0c1fb (Junio C Hamano 2012-04-03 15:53:08 -0700 7) #endif /* VARINT_H */

If we compare that to the porcelain format:

    $ git blame varint.h -l --porcelain
    d2c1898571a6a2324593e92163e8754880e0c1fb 1 1 3
    author Junio C Hamano
    author-mail <gitster@pobox.com>
    author-time 1333493588
    author-tz -0700
    committer Junio C Hamano
    committer-mail <gitster@pobox.com>
    committer-time 1333495484
    committer-tz -0700
    summary varint: make it available outside the context of pack
    filename varint.h
            #ifndef VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 2 2
            #define VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 3 3

    554544276a604c144df45efcb060c80aa322088c 4 4 2
    author Denton Liu
    author-mail <liu.denton@gmail.com>
    author-time 1556526494
    author-tz -0400
    committer Junio C Hamano
    committer-mail <gitster@pobox.com>
    committer-time 1557037206
    committer-tz +0900
    summary *.[ch]: remove extern from function declarations using spatch
    previous ffac537e6cbbf934b08745a378932722df287a53 varint.h
    filename varint.h
            int encode_varint(uintmax_t, unsigned char *);
    554544276a604c144df45efcb060c80aa322088c 5 5
            uintmax_t decode_varint(const unsigned char **);
    d2c1898571a6a2324593e92163e8754880e0c1fb 8 6 2

    d2c1898571a6a2324593e92163e8754880e0c1fb 9 7
            #endif /* VARINT_H */

And now with the `.git-blame-ignore-revs` file:

    $ git -c blame.markUnblamableLines=true -c blame.markIgnoredLines=true blame varint.h --ignore-revs-file .git-blame-ignore-revs -l --porcelain
    d2c1898571a6a2324593e92163e8754880e0c1fb 1 1 3
    author Junio C Hamano
    author-mail <gitster@pobox.com>
    author-time 1333493588
    author-tz -0700
    committer Junio C Hamano
    committer-mail <gitster@pobox.com>
    committer-time 1333495484
    committer-tz -0700
    summary varint: make it available outside the context of pack
    filename varint.h
            #ifndef VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 2 2
            #define VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 3 3

    d2c1898571a6a2324593e92163e8754880e0c1fb 6 4 2
            int encode_varint(uintmax_t, unsigned char *);
    d2c1898571a6a2324593e92163e8754880e0c1fb 7 5
            uintmax_t decode_varint(const unsigned char **);
    d2c1898571a6a2324593e92163e8754880e0c1fb 8 6 2

    d2c1898571a6a2324593e92163e8754880e0c1fb 9 7
            #endif /* VARINT_H */

So every line now blames down to commit
d2c1898571a6a2324593e92163e8754880e0c1fb. The lines which used to
blame down to 554544276a604c144df45efcb060c80aa322088c should be marked
as "ignored", but we only emit the details once for each commit. The
commit details (author, committer) are only relevant once, but the
"ignored" info can differ for each line (as you also can see in the
non-porcelain format).

We could make the output look something like:

    $ git -c blame.markUnblamableLines=true -c blame.markIgnoredLines=true blame varint.h --ignore-revs-file .git-blame-ignore-revs -l --porcelain
    d2c1898571a6a2324593e92163e8754880e0c1fb 1 1 3
    author Junio C Hamano
    author-mail <gitster@pobox.com>
    author-time 1333493588
    author-tz -0700
    committer Junio C Hamano
    committer-mail <gitster@pobox.com>
    committer-time 1333495484
    committer-tz -0700
    summary varint: make it available outside the context of pack
    filename varint.h
            #ifndef VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 2 2
            #define VARINT_H
    d2c1898571a6a2324593e92163e8754880e0c1fb 3 3

    d2c1898571a6a2324593e92163e8754880e0c1fb 6 4 2
    ignored
            int encode_varint(uintmax_t, unsigned char *);
    d2c1898571a6a2324593e92163e8754880e0c1fb 7 5
    ignored
            uintmax_t decode_varint(const unsigned char **);
    d2c1898571a6a2324593e92163e8754880e0c1fb 8 6 2

    d2c1898571a6a2324593e92163e8754880e0c1fb 9 7
            #endif /* VARINT_H */

It feels odd to me only the "ignored" info is emitted and the rest
of the details isn't. But that might be just me...
Karthik Nayak March 25, 2025, 10:31 a.m. UTC | #8
Toon Claes <toon@iotcl.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> I'm curious, how would it be different, if they blame down to the same
>> commit? My understanding was "unblamable" and "ignored" are tied to
>> commits.

[snip]

> And now with the `.git-blame-ignore-revs` file:
>
>     $ git -c blame.markUnblamableLines=true -c blame.markIgnoredLines=true blame varint.h --ignore-revs-file .git-blame-ignore-revs -l --porcelain
>     d2c1898571a6a2324593e92163e8754880e0c1fb 1 1 3
>     author Junio C Hamano
>     author-mail <gitster@pobox.com>
>     author-time 1333493588
>     author-tz -0700
>     committer Junio C Hamano
>     committer-mail <gitster@pobox.com>
>     committer-time 1333495484
>     committer-tz -0700
>     summary varint: make it available outside the context of pack
>     filename varint.h
>             #ifndef VARINT_H
>     d2c1898571a6a2324593e92163e8754880e0c1fb 2 2
>             #define VARINT_H
>     d2c1898571a6a2324593e92163e8754880e0c1fb 3 3
>
>     d2c1898571a6a2324593e92163e8754880e0c1fb 6 4 2
>             int encode_varint(uintmax_t, unsigned char *);
>     d2c1898571a6a2324593e92163e8754880e0c1fb 7 5
>             uintmax_t decode_varint(const unsigned char **);
>     d2c1898571a6a2324593e92163e8754880e0c1fb 8 6 2
>
>     d2c1898571a6a2324593e92163e8754880e0c1fb 9 7
>             #endif /* VARINT_H */
>
> So every line now blames down to commit
> d2c1898571a6a2324593e92163e8754880e0c1fb. The lines which used to
> blame down to 554544276a604c144df45efcb060c80aa322088c should be marked
> as "ignored", but we only emit the details once for each commit. The
> commit details (author, committer) are only relevant once, but the
> "ignored" info can differ for each line (as you also can see in the
> non-porcelain format).
>

Ah! So if a rev is ignored via the `--ignore-rev[s-file]` flag, then the
parent revision is shown in the blame. It could happen that in porcelain
mode the parent revision is clubbed with previous lines if they share
the same revision. This would skip the 'unblamable' or 'ignored'
information.

This would be solved in '--line-porcelain' since details aren't clubbed.

I agree, it makes the most sense to only do this in 'line-porcelain'
mode.

> We could make the output look something like:
>
>     $ git -c blame.markUnblamableLines=true -c blame.markIgnoredLines=true blame varint.h --ignore-revs-file .git-blame-ignore-revs -l --porcelain
>     d2c1898571a6a2324593e92163e8754880e0c1fb 1 1 3
>     author Junio C Hamano
>     author-mail <gitster@pobox.com>
>     author-time 1333493588
>     author-tz -0700
>     committer Junio C Hamano
>     committer-mail <gitster@pobox.com>
>     committer-time 1333495484
>     committer-tz -0700
>     summary varint: make it available outside the context of pack
>     filename varint.h
>             #ifndef VARINT_H
>     d2c1898571a6a2324593e92163e8754880e0c1fb 2 2
>             #define VARINT_H
>     d2c1898571a6a2324593e92163e8754880e0c1fb 3 3
>
>     d2c1898571a6a2324593e92163e8754880e0c1fb 6 4 2
>     ignored
>             int encode_varint(uintmax_t, unsigned char *);
>     d2c1898571a6a2324593e92163e8754880e0c1fb 7 5
>     ignored
>             uintmax_t decode_varint(const unsigned char **);
>     d2c1898571a6a2324593e92163e8754880e0c1fb 8 6 2
>
>     d2c1898571a6a2324593e92163e8754880e0c1fb 9 7
>             #endif /* VARINT_H */
>
> It feels odd to me only the "ignored" info is emitted and the rest
> of the details isn't. But that might be just me...
>

I'm with you on this, this would also require us to explain this odd
exclusion where only for 'ignored' and 'unblamable' lines we output
details on every commit. But the other way is also an exclusion, where
we would say that 'ignored' and 'unblamable' lines are only shown in
'--line-porcelain'.

But the latter can be extended into the former in the future but not the
other way around. So I would say it makes more sense to restrict it to
'--line-porcelain' in that sense.

> --
> Toon
Junio C Hamano March 25, 2025, 7:44 p.m. UTC | #9
Toon Claes <toon@iotcl.com> writes:

> It feels odd to me only the "ignored" info is emitted and the rest
> of the details isn't. But that might be just me...

If we have more per-line (and not per-commit/file tuple) pieces of
information, we would have to treat them just like you had "ignored"
in your illustration above.  It is "ignored" that feels "odd", since
it is the only single oddball right now.
diff mbox series

Patch

diff --git a/builtin/blame.c b/builtin/blame.c
index c470654c7e..9a8d7ce7af 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -360,6 +360,11 @@  static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
 	struct blame_origin *suspect = ent->suspect;
 	char hex[GIT_MAX_HEXSZ + 1];
 
+	if (mark_unblamable_lines && ent->unblamable)
+		putchar('*');
+	if (mark_ignored_lines && ent->ignored)
+		putchar('?');
+
 	oid_to_hex_r(hex, &suspect->commit->object.oid);
 	printf("%s %d %d %d\n",
 	       hex,
@@ -372,6 +377,11 @@  static void emit_porcelain(struct blame_scoreboard *sb, struct blame_entry *ent,
 	for (cnt = 0; cnt < ent->num_lines; cnt++) {
 		char ch;
 		if (cnt) {
+			if (mark_unblamable_lines && ent->unblamable)
+				putchar('*');
+			if (mark_ignored_lines && ent->ignored)
+				putchar('?');
+
 			printf("%s %d %d\n", hex,
 			       ent->s_lno + 1 + cnt,
 			       ent->lno + 1 + cnt);
diff --git a/t/t8013-blame-ignore-revs.sh b/t/t8013-blame-ignore-revs.sh
index 370b768149..2722eb4598 100755
--- a/t/t8013-blame-ignore-revs.sh
+++ b/t/t8013-blame-ignore-revs.sh
@@ -158,6 +158,16 @@  test_expect_success mark_unblamable_lines '
 	test_cmp expect actual
 '
 
+test_expect_success 'mark_unblamable_lines porcelain' '
+	sha=$(git rev-parse Y) &&
+
+	git -c blame.markUnblamableLines=false blame -p --ignore-rev Y file >blame_raw &&
+	sed "s/^${sha}/*${sha}/g" blame_raw >expect &&
+
+	git -c blame.markUnblamableLines=true blame -p --ignore-rev Y file >actual &&
+	test_cmp expect actual
+'
+
 # Commit Z will touch the first two lines.  Y touched all four.
 # 	A--B--X--Y--Z
 # The blame output when ignoring Z should be:
@@ -191,6 +201,26 @@  test_expect_success mark_ignored_lines '
 	! test_cmp expect actual
 '
 
+test_expect_success 'mark_ignored_lines porcelain' '
+	sha=$(git rev-parse Y) &&
+
+	git -c blame.markIgnoredLines=true blame -p --ignore-rev Z file | grep $sha >blame_raw &&
+
+	echo "?" >expect &&
+
+	sed -n "1p" blame_raw | cut -c1 >actual &&
+	test_cmp expect actual &&
+
+	sed -n "2p" blame_raw | cut -c1 >actual &&
+	test_cmp expect actual &&
+
+	sed -n "3p" blame_raw | cut -c1 >actual &&
+	! test_cmp expect actual &&
+
+	sed -n "4p" blame_raw | cut -c1 >actual &&
+	! test_cmp expect actual
+'
+
 # For ignored revs that added 'unblamable' lines and more recent commits changed
 # the blamable lines, mark the unblamable lines with a
 # '*'