
[v2,1/2] blame: respect .git-blame-ignore-revs automatically

Message ID 4ed930cab1b7f5e9738e73c7b9374d927a8acd94.1728707867.git.gitgitgadget@gmail.com (mailing list archive)
State New
Series blame: respect .git-blame-ignore-revs automatically

Commit Message

Abhijeetsingh Meena Oct. 12, 2024, 4:37 a.m. UTC
From: Abhijeetsingh Meena <abhijeet040403@gmail.com>

git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
This is useful for marking uninteresting commits like formatting
changes, refactors and whatever else should not be “blamed”.  Some
projects even version control this file so that all contributors can
use it; the conventional name is `.git-blame-ignore-revs`.

But each user still has to opt in to the standard ignore list,
either with this option or with the config `blame.ignoreRevsFile`.
Let’s teach git-blame(1) to respect this conventional file in order
to streamline the process.

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
---
 builtin/blame.c                      |  8 ++++++++
 t/t8015-blame-default-ignore-revs.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)
 create mode 100755 t/t8015-blame-default-ignore-revs.sh

Comments

Eric Sunshine Oct. 12, 2024, 6:07 a.m. UTC | #1
On Sat, Oct 12, 2024 at 12:38 AM Abhijeetsingh Meena via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
> This is useful for marking uninteresting commits like formatting
> changes, refactors and whatever else should not be “blamed”.  Some
> projects even version control this file so that all contributors can
> use it; the conventional name is `.git-blame-ignore-revs`.
>
> But each user still has to opt-in to the standard ignore list,
> either with this option or with the config `blame.ignoreRevsFile`.
> Let’s teach git-blame(1) to respect this conventional file in order
> to streamline the process.
>
> Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> ---
>  builtin/blame.c                      |  8 ++++++++
>  t/t8015-blame-default-ignore-revs.sh | 26 ++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)

This change should be accompanied by a documentation update, I would think.

> diff --git a/builtin/blame.c b/builtin/blame.c
> @@ -1105,6 +1105,14 @@ parse_done:
> +       /*
> +       * By default, add .git-blame-ignore-revs to the list of files
> +       * containing revisions to ignore if it exists.
> +       */
> +       if (access(".git-blame-ignore-revs", F_OK) == 0) {
> +               string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
> +       }

A couple style nits and a couple questions...

nit: drop the braces around the one-line `if` body

nit: this project uses `!foo(...)` rather than `foo(...) == 0`
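
For illustration, the hunk with both nits applied might look like the sketch below. It is not the author's follow-up patch: `struct file_list`, `file_list_append()` and `append_if_exists()` are hypothetical stand-ins so the snippet compiles on its own, where the real code uses Git's `string_list` and `ignore_revs_file_list`. Only `access()` and `F_OK` are real (POSIX).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FILES 8

/* Minimal stand-in for Git's string_list, only for this sketch. */
struct file_list {
	const char *items[MAX_FILES];
	size_t nr;
};

static void file_list_append(struct file_list *list, const char *path)
{
	assert(list->nr < MAX_FILES);
	list->items[list->nr++] = path;
}

/*
 * Both nits applied: no braces around the one-line body, and
 * !access(...) instead of access(...) == 0.
 */
static void append_if_exists(struct file_list *list, const char *path)
{
	if (!access(path, F_OK))
		file_list_append(list, path);
}
```

In the patch itself the path would be the literal ".git-blame-ignore-revs"; it is a parameter here only so the helper can be exercised with any path.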

Presumably this consults ".git-blame-ignore-revs" in the top-level
directory (as you intended) rather than ".git-blame-ignore-revs" in
whatever subdirectory you happen to issue the command because the
current-working-directory has already been set to the top-level
directory by the time cmd_blame() has been called, right?

But that leads to the next question. Should automatic consulting of
".git-blame-ignore-revs" be restricted to just the top-level
directory, or should it be modeled after, say, ".gitignore" which may
be strewn around project directories and in which ".gitignore" files
are consulted rootward starting from the directory in which the
command is invoked. My knee-jerk thought was that the ".gitignore"
model may not make sense for ".git-blame-ignore-revs", but the fact
that `git blame` can accept and work with multiple ignore-revs files
makes me question that knee-jerk response.
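
To make the ".gitignore"-like model concrete, here is a minimal sketch of the lookup order it would imply. Nothing in it comes from the patch: `rootward_candidates()`, `IGNORE_REVS_NAME` and the buffer sizes are hypothetical, and the function only computes candidate paths relative to the top-level directory without touching the filesystem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IGNORE_REVS_NAME ".git-blame-ignore-revs"
#define MAX_DIR_LEN 1024
#define MAX_PATH_LEN (MAX_DIR_LEN + 32)

/*
 * Hypothetical helper (not part of Git): given a repository-relative
 * directory such as "src/lib", write the candidate ignore-revs paths
 * into out[], deepest directory first and the top level last.
 * Returns the number of candidates written (at most max).
 */
static size_t rootward_candidates(const char *rel_dir,
				  char out[][MAX_PATH_LEN], size_t max)
{
	char dir[MAX_DIR_LEN];
	size_t n = 0;

	assert(strlen(rel_dir) < sizeof(dir));
	strcpy(dir, rel_dir);

	while (n < max) {
		char *slash;

		if (!*dir) {
			/* Reached the top level of the worktree. */
			snprintf(out[n++], MAX_PATH_LEN, "%s", IGNORE_REVS_NAME);
			break;
		}
		snprintf(out[n++], MAX_PATH_LEN, "%s/%s", dir, IGNORE_REVS_NAME);

		/* Strip the last path component to move one level up. */
		slash = strrchr(dir, '/');
		if (slash)
			*slash = '\0';
		else
			*dir = '\0';
	}
	return n;
}
```

For "src/lib" this yields three candidates: the file in "src/lib", then the one in "src", then the top-level one. Whether deeper files should add to or override shallower ones is exactly the open question raised above; the sketch takes no position on it.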

> diff --git a/t/t8015-blame-default-ignore-revs.sh b/t/t8015-blame-default-ignore-revs.sh
> new file mode 100755

Let's avoid allocating a new test number just for this single new
test. Instead, the existing t8013-blame-ignore-revs.sh would probably
be a good home for this new test.

> +test_expect_success 'blame: default-ignore-revs-file' '
> +    test_commit first-commit hello.txt hello &&
> +
> +    echo world >>hello.txt &&
> +    test_commit second-commit hello.txt &&
> +
> +    sed "1s/hello/hi/" <hello.txt > hello.txt.tmp &&

style: drop space after redirection operator

    sed "1s/hello/hi/" <hello.txt >hello.txt.tmp &&

> +    mv hello.txt.tmp hello.txt &&
> +    test_commit third-commit hello.txt &&
> +
> +    git rev-parse HEAD >ignored-file &&
> +    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
> +    git rev-parse HEAD >.git-blame-ignore-revs &&
> +    git blame hello.txt >actual &&

I would suggest copying or renaming "ignored-file" to
".git-blame-ignore-revs" rather than running `git rev-parse HEAD`
twice. This way readers won't have to waste mental effort verifying
that the result of `git rev-parse HEAD` isn't intended to change
between invocations.
Eric Sunshine Oct. 12, 2024, 6:43 a.m. UTC | #2
On Sat, Oct 12, 2024 at 2:07 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sat, Oct 12, 2024 at 12:38 AM Abhijeetsingh Meena via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > +       /*
> > +       * By default, add .git-blame-ignore-revs to the list of files
> > +       * containing revisions to ignore if it exists.
> > +       */
> > +       if (access(".git-blame-ignore-revs", F_OK) == 0) {
> > +               string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
> > +       }
>
> A couple style nits and a couple questions...

One other observation: The comment above this code block doesn't say
anything that isn't already stated just as clearly by the code itself.
Hence, the comment adds no value, thus should be dropped.
Kristoffer Haugsbakk Oct. 12, 2024, 1:58 p.m. UTC | #3
Hi Abhijeetsingh

For what it’s worth here’s how I imagine this feature could work
conceptually:

Before this feature/change, the effective config for Git use looks like this:

```
[blame]
```

No `blame.ignoreRevsFile`.

But with/after it:

```
[blame]
	ignoreRevsFile=.git-blame-ignore-revs
```

This is the effective config.  Not what the user has typed out.

If the user types out this:

```
[blame]
	ignoreRevsFile=.git-blame-more-revs
```

Then this becomes their effective config:

```
[blame]
	ignoreRevsFile=.git-blame-ignore-revs
	ignoreRevsFile=.git-blame-more-revs
```

Now there are two files: the default one and the user-supplied one (this
config variable is documented as being multi-valued: “This option may be
repeated multiple times.”).

§ How to ignore this new default §§§

Considering users who do not want this new default:

```
[blame]
	ignoreRevsFile=
```

This is the change they would have to make.  Because a blank/empty
resets/empties the list of files.
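
The conceptual model above can be sketched in a few lines of C. This is only an illustration of the proposal, not how Git reads its config: `effective_ignore_files()` and `DEFAULT_IGNORE_REVS` are hypothetical names, and the configured values are passed in as a plain array.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_IGNORE_REVS ".git-blame-ignore-revs"

/*
 * Hypothetical model of the proposal, not Git's implementation:
 * seed the list with the default file, then apply the configured
 * values in order. An empty value resets everything seen so far.
 * Returns the number of entries stored in out[].
 */
static size_t effective_ignore_files(const char *const *configured, size_t nr,
				     const char **out, size_t max)
{
	size_t n = 0, i;

	assert(max > 0);
	out[n++] = DEFAULT_IGNORE_REVS;

	for (i = 0; i < nr; i++) {
		if (!*configured[i])
			n = 0;	/* blank value: drop default and earlier entries */
		else if (n < max)
			out[n++] = configured[i];
	}
	return n;
}
```

With no configured values the result is just the default file; with ".git-blame-more-revs" configured it is the default followed by that file; with a single blank value the list ends up empty.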

On Sat, Oct 12, 2024, at 06:37, Abhijeetsingh Meena via GitGitGadget wrote:
> From: Abhijeetsingh Meena <abhijeet040403@gmail.com>
>
> git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
> This is useful for marking uninteresting commits like formatting
> changes, refactors and whatever else should not be “blamed”.  Some
> projects even version control this file so that all contributors can
> use it; the conventional name is `.git-blame-ignore-revs`.
>
> But each user still has to opt-in to the standard ignore list,
> either with this option or with the config `blame.ignoreRevsFile`.
> Let’s teach git-blame(1) to respect this conventional file in order
> to streamline the process.
>
> Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> ---
>  builtin/blame.c                      |  8 ++++++++
>  t/t8015-blame-default-ignore-revs.sh | 26 ++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
>  create mode 100755 t/t8015-blame-default-ignore-revs.sh
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index e407a22da3b..1eddabaf60f 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -1105,6 +1105,14 @@ parse_done:
>  		add_pending_object(&revs, &head_commit->object, "HEAD");
>  	}
>
> +	/*
> +	* By default, add .git-blame-ignore-revs to the list of files
> +	* containing revisions to ignore if it exists.
> +	*/
> +	if (access(".git-blame-ignore-revs", F_OK) == 0) {
> +		string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
> +	}
> +

I have not tested these patches.  But I see why you check for file access/existence.  Because with this config:

```
[blame]
	ignoreRevsFile=.git-blame-ignore-revs
```

I get this warning in repositories that don’t have the file:

```
fatal: could not open object name list: .git-blame-ignore-revs
```

Which is just noise.

I get the same thing with Git Notes namespace configurations.  I need to
configure them for certain repositories (like `amlog` in this project),
but then I get warnings about them when using the relevant commands in a
project that does not have them.

Maybe this is totally off-topic but I think it would make more sense if
`blame.ignoreRevsFile` just didn’t say anything if it didn’t find the
file.  Because the point of the config might be to opt in to this file
for those projects that do have it.

>  	init_scoreboard(&sb);
>  	sb.revs = &revs;
>  	sb.contents_from = contents_from;
> diff --git a/t/t8015-blame-default-ignore-revs.sh
> b/t/t8015-blame-default-ignore-revs.sh
> new file mode 100755
> index 00000000000..d4ab686f14d
> --- /dev/null
> +++ b/t/t8015-blame-default-ignore-revs.sh
> @@ -0,0 +1,26 @@
> +#!/bin/sh
> +
> +test_description='default revisions to ignore when blaming'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
> +. ./test-lib.sh
> +
> +test_expect_success 'blame: default-ignore-revs-file' '
> +    test_commit first-commit hello.txt hello &&
> +
> +    echo world >>hello.txt &&
> +    test_commit second-commit hello.txt &&
> +
> +    sed "1s/hello/hi/" <hello.txt > hello.txt.tmp &&
> +    mv hello.txt.tmp hello.txt &&
> +    test_commit third-commit hello.txt &&
> +
> +    git rev-parse HEAD >ignored-file &&
> +    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
> +    git rev-parse HEAD >.git-blame-ignore-revs &&
> +    git blame hello.txt >actual &&
> +
> +    test_cmp expect actual
> +'
> +
> +test_done
> --
> gitgitgadget
Phillip Wood Oct. 13, 2024, 3:18 p.m. UTC | #4
Hi Abhijeetsingh

On 12/10/2024 05:37, Abhijeetsingh Meena via GitGitGadget wrote:
> From: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> 
> git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
> This is useful for marking uninteresting commits like formatting
> changes, refactors and whatever else should not be “blamed”.  Some
> projects even version control this file so that all contributors can
> use it; the conventional name is `.git-blame-ignore-revs`.
> 
> But each user still has to opt-in to the standard ignore list,
> either with this option or with the config `blame.ignoreRevsFile`.
> Let’s teach git-blame(1) to respect this conventional file in order
> to streamline the process.

It's good that the commit message now mentions the config setting. It
would be helpful to explain why the original implementation deliberately
decided not to implement a default file, and why it is now a good idea
to do so. Supporting a default file in addition to the files listed in
the blame.ignoreRevsFile config setting leaves us in an odd position
compared to other settings, which either use a fixed name like
.gitignore, have a default that can be overridden by a config setting
like core.excludesFile, or require a config setting to enable the
feature like diff.orderFile.

I've left a couple of code comments below but really the most important 
things are to come up with a convincing reason for changing the behavior 
and figuring out how the default file should interact with the config 
setting.

> +	/*
> +	* By default, add .git-blame-ignore-revs to the list of files
> +	* containing revisions to ignore if it exists.
> +	*/
> +	if (access(".git-blame-ignore-revs", F_OK) == 0) {

There are some uses of "access(.., F_OK)" in our code base but it is 
more usual to call file_exists() these days.

> +		string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");

If the user already has this path in their config we'll waste time 
parsing it twice. We could avoid that by using a "struct strset" rather 
than a "struct string_list". I don't think we have OPT_STRSET but it 
should be easy to add one by copying OPT_STRING_LIST.
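
As a rough illustration of the de-duplication idea, the sketch below adds a path only if an equal string is not already present. It deliberately does not use Git's "struct strset"; `struct path_set` and `path_set_add()` are hypothetical stand-ins so the snippet is self-contained, and a linear scan replaces the hash lookup a real set would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 16

/* Hypothetical stand-in for a set of paths; not Git's "struct strset". */
struct path_set {
	const char *items[MAX_FILES];
	size_t nr;
};

/*
 * Add path unless an equal string is already present.
 * Returns 1 if the path was added, 0 if it was a duplicate.
 */
static int path_set_add(struct path_set *set, const char *path)
{
	size_t i;

	for (i = 0; i < set->nr; i++)
		if (!strcmp(set->items[i], path))
			return 0;
	assert(set->nr < MAX_FILES);
	set->items[set->nr++] = path;
	return 1;
}
```

Note that a plain string comparison treats ".git-blame-ignore-revs" and "./.git-blame-ignore-revs" as different entries, so this only avoids parsing the file twice when the config spells the path exactly like the default.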

> +    echo world >>hello.txt &&
> +    test_commit second-commit hello.txt &&

test_commit overwrites the file it is committing so you need to use the 
--printf option

	test_commit --printf second-commit hello.txt "hello\nworld\n"

> +    git rev-parse HEAD >ignored-file &&
> +    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
> +    git rev-parse HEAD >.git-blame-ignore-revs &&
> +    git blame hello.txt >actual &&
> +    test_cmp expect actual

I have mixed feelings about this sort of differential testing;
comparing the actual output of git blame to what we expect makes it
unambiguous that the test is checking what we want it to.

Best Wishes

Phillip
Phillip Wood Oct. 13, 2024, 3:25 p.m. UTC | #5
Hi Kristoffer

On 12/10/2024 14:58, Kristoffer Haugsbakk wrote:
> Hi Abhijeetsingh
> 
> Maybe this is totally off-topic but I think it would make more sense if
> `blame.ignoreRevsFile` just didn’t say anything if it didn’t find the
> file.  Because the point of the config might be to opt-in to this file
> for those projects that does have it.

See https://lore.kernel.org/git/xmqqr1f5hszw.fsf@gitster.g/ for some 
discussion about this

Best Wishes

Phillip
Kristoffer Haugsbakk Oct. 14, 2024, 9 p.m. UTC | #6
On Sun, Oct 13, 2024, at 17:25, Phillip Wood wrote:
> Hi Kristoffer
>> […]
>
> See https://lore.kernel.org/git/xmqqr1f5hszw.fsf@gitster.g/ for some
> discussion about this
>
> Best Wishes
>
> Phillip

Thanks!  That was an interesting read.  And an interesting idea.

And then today we got this:

https://lore.kernel.org/git/xmqq5ywehb69.fsf@gitster.g/T/#mce170a493a7b324c585124a9124356a0f87c77a6
Taylor Blau Oct. 14, 2024, 9:08 p.m. UTC | #7
On Sat, Oct 12, 2024 at 02:07:36AM -0400, Eric Sunshine wrote:
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > @@ -1105,6 +1105,14 @@ parse_done:
> > +       /*
> > +       * By default, add .git-blame-ignore-revs to the list of files
> > +       * containing revisions to ignore if it exists.
> > +       */
> > +       if (access(".git-blame-ignore-revs", F_OK) == 0) {
> > +               string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
> > +       }
>
> A couple style nits and a couple questions...
>
> nit: drop the braces around the one-line `if` body
>
> nit: this project uses `!foo(...)` rather than `foo(...) == 0`
>
> Presumably this consults ".git-blame-ignore-revs" in the top-level
> directory (as you intended) rather than ".git-blame-ignore-revs" in
> whatever subdirectory you happen to issue the command because the
> current-working-directory has already been set to the top-level
> directory by the time cmd_blame() has been called, right?
>
> But that leads to the next question. Should automatic consulting of
> ".git-blame-ignore-revs" be restricted to just the top-level
> directory, or should it be modeled after, say, ".gitignore" which may
> be strewn around project directories and in which ".gitignore" files
> are consulted rootward starting from the directory in which the
> command is invoked. My knee-jerk thought was that the ".gitignore"
> model may not make sense for ".git-blame-ignore-revs", but the fact
> that `git blame` can accept and work with multiple ignore-revs files
> makes me question that knee-jerk response.

All very good suggestions and questions for Abhijeetsingh to consider.

At a minimum, I think the style nits need to be addressed here. But I
also think it is worth considering seriously whether or not multiple
`.git-blame-ignore-revs` files should be considered, and if so, in what
order and how they override (or not) each other.

I am generally OK with adding one of these special files and having 'git
blame' respect it automatically. But once we do so, it is going to be
considered part of our compatibility guarantee, so we should get it
right the first time.

Thanks,
Taylor
Abhijeetsingh Meena Oct. 16, 2024, 6:04 a.m. UTC | #8
Hi Eric,

Thank you for your thoughtful feedback on v2 of the patch.
Before I proceed with v3, I'd like to address some of the
non-code-related questions and seek your input.

> Presumably this consults ".git-blame-ignore-revs" in the top-level
> directory (as you intended) rather than ".git-blame-ignore-revs" in
> whatever subdirectory you happen to issue the command because the
> current-working-directory has already been set to the top-level
> directory by the time cmd_blame() has been called, right?

Yes, it seems that the current-working-directory is set to the root of
the repository, as I tested this behaviour locally. The
.git-blame-ignore-revs file in the root worked as expected, while a
similar file in a subdirectory did not.


> But that leads to the next question. Should automatic consulting of
> ".git-blame-ignore-revs" be restricted to just the top-level
> directory, or should it be modeled after, say, ".gitignore" which may
> be strewn around project directories and in which ".gitignore" files
> are consulted rootward starting from the directory in which the
> command is invoked. My knee-jerk thought was that the ".gitignore"
> model may not make sense for ".git-blame-ignore-revs", but the fact
> that `git blame` can accept and work with multiple ignore-revs files
> makes me question that knee-jerk response.

I think both approaches have their merits:

1. Single file:
*Purpose:* Having a single .git-blame-ignore-revs file aligns with the
idea of globally ignoring revisions, making it easier for maintainers
to control irrelevant commits.
*Simplicity:* Keeping the file in the root ensures centralized
management, simplifying configuration.

2. Multiple files:
*Large repositories:* In large monorepos, different teams working in
separate subdirectories may want to manage their own ignored revisions.
Multiple files would offer flexibility, particularly for modular
projects or those with distinct submodules.
*Flexibility:* Subdirectory-level .git-blame-ignore-revs files could
allow users to fine-tune blame results for their specific areas,
especially when local refactors are limited to certain parts of the
codebase.

Given this, I would like to know your suggestions, as I’m not too
experienced with the user workflows and what would be more helpful to
them. For now, I think we should stick with the single
.git-blame-ignore-revs file at the top level. However, we could keep
the option open for future enhancements, allowing multiple files to be
consulted by setting a configuration flag if a specific use case
arises.


> Is the all-or-nothing behavior implemented by this patch desirable? If
> so, should the command warn or error out if the user gives conflicting
> options like --ignore-revs-file and --override-ignore-revs together?
>
> A common behavior of many Git commands when dealing with options is
> "last wins", and following that precedent could make this new option
> even much more useful by allowing the user to ignore project-supplied
> ignore-revs but still take advantage of the feature with a different
> set of ignore-revs that make sense to the local user. For instance:
>
>     git blame --override-ignore-revs --ignore-revs-file=my-ignore-revs ...

I don’t think the all-or-nothing approach is ideal. Based on Phillip's
suggestions and Kristoffer's conceptual workflow, I explored using
`git_config_set` to set the `blame.ignoreRevsFile` configuration. This
would integrate well with existing configuration logic and provide
greater flexibility.

With `git_config_set`:

git blame hello.txt
would consult the default .git-blame-ignore-revs file.

git blame --no-ignore-revs-file hello.txt
would disable the default ignore file.

git blame --no-ignore-revs-file --ignore-revs-file=ignore-list hello.txt
would allow the user to specify a custom ignore list while bypassing
the global list, offering the flexibility you suggested.
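
The three invocations above follow a "last wins" pattern that can be modelled roughly as below. This is a sketch under stated assumptions, not Git's parse-options code: `apply_ignore_file_options()` is a hypothetical helper, it only understands the `--ignore-revs-file=<file>` spelling, and the caller is assumed to have seeded `files[]` with the default .git-blame-ignore-revs entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPT_PREFIX "--ignore-revs-file="

/*
 * Hypothetical model of last-wins handling: walk the arguments in
 * order, let --no-ignore-revs-file empty the list built so far, and
 * let each --ignore-revs-file=<file> append to it.
 * files[] holds nr entries on entry; the new count is returned.
 */
static size_t apply_ignore_file_options(const char *const *args, size_t nr_args,
					const char **files, size_t nr, size_t max)
{
	size_t i;

	assert(nr <= max);
	for (i = 0; i < nr_args; i++) {
		if (!strcmp(args[i], "--no-ignore-revs-file"))
			nr = 0;	/* negation drops default and earlier files */
		else if (!strncmp(args[i], OPT_PREFIX, strlen(OPT_PREFIX)) &&
			 nr < max)
			files[nr++] = args[i] + strlen(OPT_PREFIX);
	}
	return nr;
}
```

Starting from a list holding only the default file, no options leave one entry, `--no-ignore-revs-file` alone leaves none, and `--no-ignore-revs-file --ignore-revs-file=ignore-list` leaves just "ignore-list".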


> What is this test actually checking? It doesn't seem to use
> --override-ignore-revs at all.

Actually, I used the short form -O to represent --override-ignore-revs
in this test.

Thank you again for your time and feedback. I look forward to your thoughts on
these points before finalising the next patch revision.

Best regards,
Abhijeet

Abhijeetsingh Meena Oct. 16, 2024, 6:06 a.m. UTC | #9
Hi Kristoffer,

Thank you for reviewing the v2 of my patch. I appreciate your
thoughtful feedback.
Before proceeding with v3, I’d like to address some of your questions
and suggestions.

> Hi Abhijeetsingh
>
> For what it’s worth here’s how I imagine this feature could work
> conceptually:
>
> Before this feature/change, the effective config for Git use looks like this:
>
> ```
> [blame]
> ```
>
> No `blame.ignoreRevsFile`.
>
> But with/after it:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
> ```
>
> This is the effective config.  Not what the user has typed out.
>
> If the user types out this:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-more-revs
> ```
>
> Then this becomes their effective config:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
>         ignoreRevsFile=.git-blame-more-revs
> ```
>
> Now there are two files: the default one and the user-supplied one (this
> config variable is documented as being multi-valued: “This option may be
> repeated multiple times.”).
>
> § How to ignore this new default §§§
>
> Considering users who do not want this new default:
>
> ```
> [blame]
>         ignoreRevsFile=
> ```
>
> This is the change they would have to make.  Because a blank/empty
> resets/empties the list of files.

Thanks, Kristoffer. Your conceptual explanation gave me a new
perspective on how this feature can be implemented using the existing
configuration flow without disrupting other settings. It has helped
shape the solution, as I described in my response to Eric earlier.

Based on Phillip's hint to explore how this feature would interact with
existing configuration settings, and on your conceptual workflow, I
explored git_config_set and used it to set the blame.ignoreRevsFile
configuration. This approach fits well with the existing configuration
logic and provides greater flexibility.

With git_config_set to set blame.ignoreRevsFile:

git blame hello.txt
would consult the default .git-blame-ignore-revs file.

git blame --no-ignore-revs-file hello.txt
would disable the default ignore file.

git blame --no-ignore-revs-file --ignore-revs-file=ignore-list hello.txt
would allow the user to specify a custom ignore list while bypassing
the global list, offering the flexibility you suggested.

This would maintain consistency with Git’s existing behavior, allowing
users to modify configurations with a “last-wins” approach and enabling
both global and custom ignore lists as needed.


> I have not tested these patches.  But I see why you check for file access/existence.  Because with this config:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
> ```
>
> I get this warning in repositories that don’t have the file:
>
> ```
> fatal: could not open object name list: .git-blame-ignore-revs
> ```
>
> Which is just noise.
>
> I get the same thing with Git Notes namespace configurations.  I need to
> configure them for certain repositories (like `amlog` in this project),
> but then I get warnings about them when using the relevant commands in a
> project that does not have them.
>
> Maybe this is totally off-topic but I think it would make more sense if
> `blame.ignoreRevsFile` just didn’t say anything if it didn’t find the
> file.  Because the point of the config might be to opt-in to this file
> for those projects that does have it.

Yes, I agree. For a default ignore file, we shouldn't raise a fatal
error if the file is missing, especially if it’s not present in every
repository. Suppressing the warning for the default file would improve
user experience and prevent unnecessary noise.


> > However, users may encounter cases where they need to
> > temporarily override these configurations to inspect all commits,
> > even those excluded by the ignore list. Currently, there is no
> > simple way to bypass all ignore revisions settings in one go.
>
> “No simple way” gives me pause.  But there are those options/methods
> that we discussed before:
>
> • `--no-ignore-rev`
> • `--no-ignore-revs-file`
>
> These are not documented but I can provide these options and get a
> different output from git-blame(1).
>
> `builtin/blame.c` uses `parse-options.h` which provides automatic
> negated options.  I just looked at the code today (so it’s new to me)
> but it seems like it will empty the lists that are associated with these
> options.  See `parse-options-cb.c:parse_opt_string_list`.
>
> So I think this should be sufficient to reset all “ignore” options:
>
> ```
> git blame --no-ignore-rev --no-ignore-revs-file
> ```
>
> However I tested with this:
>
> ```
> git blame --ignore-revs-file=.git-blame-ignore-revs --no-ignore-revs
> ```
>
> And the output suggests to me that `--no-ignore-revs` affect the result
> of the before-mentioned list of files.  Even though these are two
> different lists.  I can’t make sense of that from the code.  But I’m not
> a C programmer so this might just be a me-problem.

Yes, the --no-ignore-revs-file and --no-ignore-rev flags work as
intended to bypass the configuration that ignores revisions. They are
separate lists, so --no-ignore-revs shouldn’t affect the
--ignore-revs-file list. My testing after v1 had some issues in the
test setup, which led me to believe that the --no-ignore flags didn’t
work, so I worked on --override-ignore-revs.


> > which allows users to easily bypass the --ignore-revs-file
> > option, --ignore-rev option and the blame.ignoreRevsFile
>
> I can see no precedent for the name “override” for an option in this
> project.  The convention is `--[no-]option`.
>
> Like Eric Sunshine discussed: a common convention is to let the user
> activate and negate options according to the last-wins rule.  This is
> pretty useful in my opinion.  Because I can then make an alias which
> displays some Git Note:
>
> ```
> timber = log [options] --notes=results
> ```
>
> But then what if I don’t want any notes for a specific invocation?  I
> don’t have to copy the whole alias and modify it.  I can just:
>
> ```
> git timber --no-notes
> ```
>
> And the same goes for an alias which disables notes:
>
> ```
> timber = log [options] --no-notes
> ```
>
> Because then I can use `git timber --notes=results`.

I agree that the override option is unnecessary, as both
--no-ignore-rev and --no-ignore-revs-file already allow users to bypass
the ignore configurations. Also, the “last-wins” approach is more
useful and aligns with how Git typically handles configurations. It’s
flexible and user-friendly, allowing for easy toggling within aliases
or individual commands. Implementing this using the existing
configuration method, such as git_config_set, would be a clean and
effective solution to ensure that users can quickly modify or negate
options as needed.


> > configuration. When this option is used, git blame will completely
> > disregard all configured ignore revisions lists.
> >
> > The motivation behind this feature is to provide users with more
> > flexibility when dealing with large codebases that rely on
> > .git-blame-ignore-revs files for shared configurations, while
> > still allowing them to disable the ignore list when necessary
> > for troubleshooting or deeper inspections.
>
> You might be able to achieve the same thing with the existing negated
> options.
>
> If you *cannot* disable all “ignore” config and options in one negated
> one then you might want an option like `--no-ignores` which acts like:
>
> ```
> git blame --no-ignore-rev --no-ignore-revs-file
> ```

Yes, the override option isn’t necessary since the existing flags work
as intended. If needed in the future, we can add a single flag to reset
both lists or, as you mentioned, it can be an alias too.


> > + if (!override_ignore_revs) {
> > + build_ignorelist(&sb, &ignore_revs_file_list, &ignore_rev_list);
> > + }
> > +
>
> This demonstrates the more limited behavior: you either override
> (discard) the ignores or you don’t.  With the negated options you build
> up and reset/empty those lists before you get to this point.  That ends
> up being more flexible for the user.

Yes, this approach was more limited. We can follow the approach
described earlier, which uses git_config_set to handle ignoring
revisions and revision lists more flexibly.


Thanks again for your detailed feedback. I hope this approach is
better than my previous approach. I’ll incorporate these changes and
move forward with v3. Looking forward to your further thoughts!

Best regards,
Abhijeetsingh

On Sat, Oct 12, 2024 at 7:28 PM Kristoffer Haugsbakk
<code@khaugsbakk.name> wrote:
>
> Hi Abhijeetsingh
>
> For what it’s worth here’s how I imagine this feature could work
> conceptually:
>
> Before this feature/change, the effective config for Git use looks like this:
>
> ```
> [blame]
> ```
>
> No `blame.ignoreRevsFile`.
>
> But with/after it:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
> ```
>
> This is the effective config.  Not what the user has typed out.
>
> If the user types out this:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-more-revs
> ```
>
> Then this becomes their effective config:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
>         ignoreRevsFile=.git-blame-more-revs
> ```
>
> Now there are two files: the default one and the user-supplied one (this
> config variable is documented as being multi-valued: “This option may be
> repeated multiple times.”).
>
> § How to ignore this new default §§§
>
> Considering users who do not want this new default:
>
> ```
> [blame]
>         ignoreRevsFile=
> ```
>
> This is the change they would have to make.  Because a blank/empty
> resets/empties the list of files.
>
> On Sat, Oct 12, 2024, at 06:37, Abhijeetsingh Meena via GitGitGadget wrote:
> > From: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> >
> > git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
> > This is useful for marking uninteresting commits like formatting
> > changes, refactors and whatever else should not be “blamed”.  Some
> > projects even version control this file so that all contributors can
> > use it; the conventional name is `.git-blame-ignore-revs`.
> >
> > But each user still has to opt-in to the standard ignore list,
> > either with this option or with the config `blame.ignoreRevsFile`.
> > Let’s teach git-blame(1) to respect this conventional file in order
> > to streamline the process.
> >
> > Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> > ---
> >  builtin/blame.c                      |  8 ++++++++
> >  t/t8015-blame-default-ignore-revs.sh | 26 ++++++++++++++++++++++++++
> >  2 files changed, 34 insertions(+)
> >  create mode 100755 t/t8015-blame-default-ignore-revs.sh
> >
> > diff --git a/builtin/blame.c b/builtin/blame.c
> > index e407a22da3b..1eddabaf60f 100644
> > --- a/builtin/blame.c
> > +++ b/builtin/blame.c
> > @@ -1105,6 +1105,14 @@ parse_done:
> >               add_pending_object(&revs, &head_commit->object, "HEAD");
> >       }
> >
> > +     /*
> > +     * By default, add .git-blame-ignore-revs to the list of files
> > +     * containing revisions to ignore if it exists.
> > +     */
> > +     if (access(".git-blame-ignore-revs", F_OK) == 0) {
> > +             string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
> > +     }
> > +
>
> I have not tested these patches.  But I see why you check for file access/existence.  Because with this config:
>
> ```
> [blame]
>         ignoreRevsFile=.git-blame-ignore-revs
> ```
>
> I get this warning in repositories that don’t have the file:
>
> ```
> fatal: could not open object name list: .git-blame-ignore-revs
> ```
>
> Which is just noise.
>
> I get the same thing with Git Notes namespace configurations.  I need to
> configure them for certain repositories (like `amlog` in this project),
> but then I get warnings about them when using the relevant commands in a
> project that does not have them.
>
> Maybe this is totally off-topic but I think it would make more sense if
> `blame.ignoreRevsFile` just didn’t say anything if it didn’t find the
> file.  Because the point of the config might be to opt-in to this file
> for those projects that do have it.
>
> >       init_scoreboard(&sb);
> >       sb.revs = &revs;
> >       sb.contents_from = contents_from;
> > diff --git a/t/t8015-blame-default-ignore-revs.sh
> > b/t/t8015-blame-default-ignore-revs.sh
> > new file mode 100755
> > index 00000000000..d4ab686f14d
> > --- /dev/null
> > +++ b/t/t8015-blame-default-ignore-revs.sh
> > @@ -0,0 +1,26 @@
> > +#!/bin/sh
> > +
> > +test_description='default revisions to ignore when blaming'
> > +
> > +TEST_PASSES_SANITIZE_LEAK=true
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'blame: default-ignore-revs-file' '
> > +    test_commit first-commit hello.txt hello &&
> > +
> > +    echo world >>hello.txt &&
> > +    test_commit second-commit hello.txt &&
> > +
> > +    sed "1s/hello/hi/" <hello.txt > hello.txt.tmp &&
> > +    mv hello.txt.tmp hello.txt &&
> > +    test_commit third-commit hello.txt &&
> > +
> > +    git rev-parse HEAD >ignored-file &&
> > +    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
> > +    git rev-parse HEAD >.git-blame-ignore-revs &&
> > +    git blame hello.txt >actual &&
> > +
> > +    test_cmp expect actual
> > +'
> > +
> > +test_done
> > --
> > gitgitgadget
>
> --
> Kristoffer
Abhijeetsingh Meena Oct. 16, 2024, 6:07 a.m. UTC | #10
Hi Phillip,
Thank you for reviewing the patch and providing valuable feedback.
I’d like to address some of your points below:


> Supporting a default file in addition to the files listed in
> blame.ignoreRevsFile config setting leaves us in an odd position
> compared to other settings which use a fixed name like .gitignore
> or have a default that can be overridden by a config setting like
> core.excludesFile or require a config setting to enable the feature
> like diff.orderFile.

Yes, I now understand that we can solve this by using the existing
method for interacting with configurations, as suggested by you and
Kristoffer. We can work with the existing configuration method, like
git_config_set, to set the ignore-revisions file. This (I hope) will
also keep it consistent with how other settings like .gitignore and
core.excludesFile work, making the interaction more predictable for
users.
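
As a rough illustration of the “effective config” model Kristoffer
described earlier in the thread, here is a small C sketch. It is not
Git's implementation and does not touch Git's config machinery:
`struct file_list`, `list_append()`, `effective_ignore_files()` and the
`default_exists` parameter are hypothetical names made up for this
example. The default file comes first when it exists, each configured
`blame.ignoreRevsFile` value is appended after it, and an empty value
resets the list, default included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES 16 /* arbitrary capacity, chosen only for this sketch */
#define DEFAULT_IGNORE_REVS_FILE ".git-blame-ignore-revs"

/* Hypothetical stand-in for the list of ignore-revs files. */
struct file_list {
	const char *items[MAX_FILES];
	size_t nr;
};

void list_append(struct file_list *list, const char *path)
{
	assert(list->nr < MAX_FILES);
	list->items[list->nr++] = path;
}

/*
 * Compute the effective list of ignore-revs files.
 *
 * default_exists: whether the conventional file is present (the caller
 *                 decides this; the sketch does no file access)
 * values:         configured values, in the order they were read
 */
void effective_ignore_files(struct file_list *out, int default_exists,
			    const char **values, size_t nr_values)
{
	size_t i;

	out->nr = 0;
	if (default_exists)
		list_append(out, DEFAULT_IGNORE_REVS_FILE);

	for (i = 0; i < nr_values; i++) {
		if (!*values[i])
			out->nr = 0; /* empty value resets, default included */
		else
			list_append(out, values[i]);
	}
}
```

Under these assumptions, a single configured value of
`.git-blame-more-revs` yields both files, while a single empty value
yields an empty list, which is how a user would opt out of the default.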


> I've left a couple of code comments below but really
> the most important things are to come up with a convincing
> reason for changing the behavior and figuring out how
> the default file should interact with the config setting.

I agree. After revisiting the use case and the flow, I see now that
the solution can be more straightforward with git_config_set than my
previous approach. This behavior allows for interaction through the
configuration system without the need to introduce new options.
Kristoffer’s suggestion clarified that handling .git-blame-ignore-revs
as a default file and allowing it to be overridden or disabled via
--no-ignore-revs-file is sufficient.


> As Kristoffer has pointed out --no-ignore-revs-file should
> be sufficient to disable the default file. If it isn't we
> should fix it so that it is, not add a new option.

Absolutely, you're right. After revisiting my earlier testing issues,
I realized that the --no-ignore-revs-file and --no-ignore-rev flags
work as intended. My previous confusion was due to a mistake in my test
setup. I agree with your suggestion that we should not add a new option
and instead focus on ensuring that the current flag behavior is clear
and functions correctly.


Thanks again for your review. I hope this approach is better than my
previous approach. I’ll make sure the changes are implemented correctly
in v3 and test the interaction between the default file and config
settings more thoroughly. Looking forward to your further thoughts!

Best regards,
Abhijeetsingh

On Sun, Oct 13, 2024 at 8:48 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Abhijeetsingh
>
> On 12/10/2024 05:37, Abhijeetsingh Meena via GitGitGadget wrote:
> > From: Abhijeetsingh Meena <abhijeet040403@gmail.com>
> >
> > git-blame(1) can ignore a list of commits with `--ignore-revs-file`.
> > This is useful for marking uninteresting commits like formatting
> > changes, refactors and whatever else should not be “blamed”.  Some
> > projects even version control this file so that all contributors can
> > use it; the conventional name is `.git-blame-ignore-revs`.
> >
> > But each user still has to opt-in to the standard ignore list,
> > either with this option or with the config `blame.ignoreRevsFile`.
> > Let’s teach git-blame(1) to respect this conventional file in order
> > to streamline the process.
>
> It's good that the commit message now mentions the config setting. It
> would be helpful to explain why the original implementation deliberately
> decided not to implement a default file and explain why it is now a good
> idea to do so. Supporting a default file in addition to the files listed
> in blame.ignoreRevsFile config setting leaves us in an odd position
> compared to other settings which use a fixed name like .gitignore or
> have a default that can be overridden by a config setting like
> core.excludesFile or require a config setting to enable the feature like
> diff.orderFile.
>
> I've left a couple of code comments below but really the most important
> things are to come up with a convincing reason for changing the behavior
> and figuring out how the default file should interact with the config
> setting.
>
> > +     /*
> > +     * By default, add .git-blame-ignore-revs to the list of files
> > +     * containing revisions to ignore if it exists.
> > +     */
> > +     if (access(".git-blame-ignore-revs", F_OK) == 0) {
>
> There are some uses of "access(.., F_OK)" in our code base but it is
> more usual to call file_exists() these days.
>
> > +             string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
>
> If the user already has this path in their config we'll waste time
> parsing it twice. We could avoid that by using a "struct strset" rather
> than a "struct string_list". I don't think we have OPT_STRSET but it
> should be easy to add one by copying OPT_STRING_LIST.
>
> > +    echo world >>hello.txt &&
> > +    test_commit second-commit hello.txt &&
>
> test_commit overwrites the file it is committing so you need to use the
> --printf option
>
>         test_commit --printf second-commit hello.txt "hello\nworld\n"
>
> > +    git rev-parse HEAD >ignored-file &&
> > +    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
> > +    git rev-parse HEAD >.git-blame-ignore-revs &&
> > +    git blame hello.txt >actual &&
> > +    test_cmp expect actual
>
> I have mixed feelings about this sort of differential testing;
> comparing the actual output of git blame to what we expect makes it
> unambiguous that the test is checking what we want it to.
>
> Best Wishes
>
> Phillip
>

Patch

diff --git a/builtin/blame.c b/builtin/blame.c
index e407a22da3b..1eddabaf60f 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1105,6 +1105,14 @@  parse_done:
 		add_pending_object(&revs, &head_commit->object, "HEAD");
 	}
 
+	/*
+	* By default, add .git-blame-ignore-revs to the list of files
+	* containing revisions to ignore if it exists.
+	*/
+	if (access(".git-blame-ignore-revs", F_OK) == 0) {
+		string_list_append(&ignore_revs_file_list, ".git-blame-ignore-revs");
+	}
+
 	init_scoreboard(&sb);
 	sb.revs = &revs;
 	sb.contents_from = contents_from;
diff --git a/t/t8015-blame-default-ignore-revs.sh b/t/t8015-blame-default-ignore-revs.sh
new file mode 100755
index 00000000000..d4ab686f14d
--- /dev/null
+++ b/t/t8015-blame-default-ignore-revs.sh
@@ -0,0 +1,26 @@ 
+#!/bin/sh
+
+test_description='default revisions to ignore when blaming'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+test_expect_success 'blame: default-ignore-revs-file' '
+    test_commit first-commit hello.txt hello &&
+
+    echo world >>hello.txt &&
+    test_commit second-commit hello.txt &&
+
+    sed "1s/hello/hi/" <hello.txt > hello.txt.tmp &&
+    mv hello.txt.tmp hello.txt &&
+    test_commit third-commit hello.txt &&
+
+    git rev-parse HEAD >ignored-file &&
+    git blame --ignore-revs-file=ignored-file hello.txt >expect &&
+    git rev-parse HEAD >.git-blame-ignore-revs &&
+    git blame hello.txt >actual &&
+
+    test_cmp expect actual
+'
+
+test_done