[0/3] clean: add `config.exclude` and `--remove-excluded`

Message ID 20250210191504.309661-1-intelfx@intelfx.name (mailing list archive)

Message

Ivan Shapovalov Feb. 10, 2025, 7:14 p.m. UTC
This series extends the concept of "excluded files" in `git clean` to
make it useful to protect "precious files" that might be present in a
specific developer's working tree (see below).

Specifically, this series adds a `clean.exclude` knob to configure
"always excluded" files (same as `-e` on the command line), and a
`--remove-excluded` flag (intentionally without a short form) to
"REALLY remove everything, dammit!"

This might seem like a euphemism treadmill, but there is a specific
use-case for all of the exclusion methods and options:

.gitignore:     files that _the project_ does not want to track or touch
                (build artifacts)
clean.exclude:  files that _the user_ does not want to track or touch
                (IDE configuration)
git clean -x:   remove build artifacts, but keep precious files
                (when a pristine build is desired)
git clean -x --remove-excluded:
                remove everything, including precious files
                (e.g. for redistribution)
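
As a rough illustration of the table above, here is a minimal sketch in
Python. It is not code from the series; `would_remove` and its parameters
are hypothetical names, and the behaviour of `--remove-excluded` is
inferred from the description above rather than from the patches. It only
models the four cases listed and ignores `-X` and directory handling.

```python
def would_remove(gitignored: bool, clean_excluded: bool,
                 x: bool = False, remove_excluded: bool = False) -> bool:
    """Return True if an untracked path would be removed by `git clean`.

    gitignored:      path matches the standard ignore rules (.gitignore etc.)
    clean_excluded:  path matches a clean.exclude / -e pattern
    x:               `-x` was given
    remove_excluded: the proposed `--remove-excluded` was given (assumed semantics)
    """
    # Assumption: clean.exclude / -e patterns protect a path
    # unless --remove-excluded is given.
    if clean_excluded and not remove_excluded:
        return False
    # Standard ignore rules protect a path unless -x is given.
    if gitignored and not x:
        return False
    return True
```

Under these assumptions, a build artifact (ignored, not excluded) survives
a plain `git clean` but not `git clean -x`, while a path listed in
`clean.exclude` survives both and is removed only with `--remove-excluded`.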

For instance, if I use Sublime Text or JetBrains IDEs to work on
projects, I might want to add this to my ~/.gitconfig:

[clean]
  exclude = /*.sublime-*
  exclude = /.idea

Or, if I make use of the Bear compiler wrapper to generate the
compilation database in those projects that do not use any of the
modern build-systems to automate such generation, I might write:

[clean]
  exclude = /compile_commands.json

This way, even if I run `git clean -fxd` to test a clean build, I do
not need to worry about accidentally removing the compilation database
that would take a bunch of CPU-time to regenerate.

Ivan Shapovalov (3):
  clean, dir: add and use new helper `add_patterns_from_string_list()`
  clean: rename `ignored` -> `remove_ignored`
  clean: add `config.exclude` and `--remove-excluded`

 Documentation/config/clean.txt | 11 +++++++++++
 Documentation/git-clean.txt    | 22 +++++++++++++++-------
 builtin/clean.c                | 32 +++++++++++++++++++++-----------
 dir.c                          | 15 +++++++++++++++
 dir.h                          |  4 ++++
 5 files changed, 66 insertions(+), 18 deletions(-)

Comments

Junio C Hamano Feb. 11, 2025, 6:37 p.m. UTC | #1
Ivan Shapovalov <intelfx@intelfx.name> writes:

> This series extends the concept of "excluded files" in `git clean` to
> make it useful to protect "precious files" that might be present in a
> specific developer's working tree (see below).

How does it interact with "git status"?

> Specifically, this series adds a `clean.exclude` knob to configure
> "always excluded" files (same as `-e` on the command line), and a
> `--remove-excluded` flag (intentionally without a short form) to
> "REALLY remove everything, dammit!"

I am not sure if this uses the adjective `precious` to mean the same
thing as we historically talked about `precious`, in the context of
"Git does not have `precious files`.  What we call `ignored` are
synonymous to `expendables`, and we'd eventually want to add the
`precious` class of files that are separate from `ignored` files".

If the feature is about _turning_ the existing `ignored/excluded`
into precious and require a new option to clean those files that
have always been treated as expendables, then that is a grave
usability regression.  I am hoping that it is not the case.

Let's read on.

> This might seem like a euphemism treadmill, but there is a specific
> use-case for all of the exclusion methods and options:
>
> .gitignore:     files that _the project_ does not want to track or touch
>                 (build artifacts)
> clean.exclude:  files that _the user_ does not want to track or touch
>                 (IDE configuration)

The above two share the same "does not want to track or touch"
explanation and readers do not know if you want them to have
distinct meaning, or just two different places the user has to store
the same information, one project-wide, given by and shared with
others, the other personal.

You need to say something like "`clean.exclude` introduces a new
`precious` class, the user does not want to track or touch but
unlike those that match the patterns in .gitignore, they are not
expendables" here, if that is what you are trying to say (I am just
guessing).

Without that ...

> git clean -x:   remove build artifacts, but keep precious files
>                 (when a pristine build is desired)

... this would merely be wishful thinking, but once the reader
understands that you are introducing a new class, yes, it does make
sense.  And it is a backward-compatible enhancement, which is very
good.

> git clean -x --remove-excluded:
>                 remove everything, including precious files
>                 (e.g. for redistribution)

Ditto.

Another common theme around `precious` is not IDE configuration but
things like config.mak file we have.  Or perhaps deploy key files?

It is a clever UI hack to notice that the `precious` things are not
something you'd share with the project, and to take advantage of the
distinction between the project-wide vs personal preference in the
configuration system to introduce the `precious` class.  For that,
it might even make sense to call the variable "clean.precious", as
its semantics is VASTLY different from what we called `exclude` or
`ignore` (they are synonyms---and they mean expendable files that
are not to be tracked).

And when people want non-project-wide but personal paths that are
excluded and expendable, they can use $GIT_DIR/info/exclude file.
So a possible alternative is to have the dir.[ch] infrastructure to
start paying attention to a new file $GIT_DIR/info/precious instead
of the configuration variables.  I am not making an assessment on
the relative merit between clean.precious vs $GIT_DIR/info/precious
yet---just throwing an alternative for others to discuss.

By the way, I notice Ævar is CC'ed, but I haven't seen him for quite
a while around here, and am wondering how you decided to do so.  Did
you have private conversations with him and get suggestions, or
something?  Just being curious, but at the same time, if somebody's
influence in the resulting design is big enough, crediting them with
"Helped-by:" or some other trailer might be worth considering.

Ivan Shapovalov Feb. 11, 2025, 6:47 p.m. UTC | #2
On 2025-02-11 at 10:37 -0800, Junio C Hamano wrote:
> Ivan Shapovalov <intelfx@intelfx.name> writes:
> 
> > This series extends the concept of "excluded files" in `git clean` to
> > make it useful to protect "precious files" that might be present in a
> > specific developer's working tree (see below).
> 
> How does it interact with "git status"?

In the same way as `git clean -e`, i.e., there is no interaction.

> 
> > Specifically, this series adds a `clean.exclude` knob to configure
> > "always excluded" files (same as `-e` on the command line), and a
> > `--remove-excluded` flag (intentionally without a short form) to
> > "REALLY remove everything, dammit!"
> 
> I am not sure if this uses the adjective `precious` to mean the same
> thing as we historically talked about `precious`, in the context of
> "Git does not have `precious files`.  What we call `ignored` are
> synonymous to `expendables`, and we'd eventually want to add the
> `precious` class of files that are separate from `ignored` files".

There were no implications behind my usage of the word "precious".

> 
> If the feature is about _turning_ the existing `ignored/excluded`
> into precious and require a new option to clean those files that
> have always been treated as expendables, then that is a grave
> usability regression.  I am hoping that it is not the case.
> 
> Let's read on.
> 
> > This might seem like a euphemism treadmill, but there is a specific
> > use-case for all of the exclusion methods and options:
> > 
> > .gitignore:     files that _the project_ does not want to track or touch
> >                 (build artifacts)
> > clean.exclude:  files that _the user_ does not want to track or touch
> >                 (IDE configuration)
> 
> The above two share the same "does not want to track or touch"
> explanation and readers do not know if you want them to have
> distinct meaning, or just two different places the user has to store
> the same information, one project-wide, given by and shared with
> others, the other personal.
> 
> You need to say something like "`clean.exclude` introduces a new
> `precious` class, the user does not want to track or touch but
> unlike those that match the patterns in .gitignore, they are not
> expendables" here, if that is what you are trying to say (I am just
> guessing).

I don't think I'm trying to introduce any new fundamental concepts to
Git. This patch is merely extending an existing command line option
into a configuration knob, because I noticed myself passing the same
arguments over and over and eventually creating an alias that does
nothing but `git clean -e ...`, with the `-e` flag repeated a good 20
or so times.
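
For illustration only, the kind of wrapper described above can be sketched
in Python as follows. `clean_argv` is a hypothetical helper, not part of
Git or of this series; it relies only on the existing `-e <pattern>` option
of `git clean`, and defaults to `-n` (dry run) so that nothing is deleted
by accident.

```python
from typing import Iterable, List


def clean_argv(patterns: Iterable[str],
               flags: Iterable[str] = ("-n", "-x", "-d")) -> List[str]:
    """Build a `git clean` command line with one `-e` per exclude pattern."""
    argv = ["git", "clean", *flags]
    for pattern in patterns:
        # Each pattern needs its own -e, which is what makes the
        # hand-written alias grow so long.
        argv += ["-e", pattern]
    return argv
```

The resulting list could be passed to `subprocess.run(..., check=True)`
from inside a working tree; swapping `-n` for `-f` would make it actually
remove files.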

> 
> Without that ...
> 
> > git clean -x:   remove build artifacts, but keep precious files
> >                 (when a pristine build is desired)
> 
> ... this would merely be wishful thinking, but once the reader
> understands that you are introducing a new class, yes, it does make
> sense.  And it is a backward-compatible enhancement, which is very
> good.
> 
> > git clean -x --remove-excluded:
> >                 remove everything, including precious files
> >                 (e.g. for redistribution)
> 
> Ditto.

The above descriptions are just that, free-form descriptions to help
understand the intended use-case. I'm not sure I understand the reasons
behind the "wishful thinking" label applied here.

> 
> Another common theme around `precious` is not IDE configuration but
> things like config.mak file we have.  Or perhaps deploy key files?

config.mak is precisely one of such files that I now have in my own
`clean.exclude`.

> 
> It is a clever UI hack to notice that the `precious` things are not
> something you'd share with the project, and to take advantage of the
> distinction between the project-wide vs personal preference in the
> configuration system to introduce the `precious` class.  For that,
> it might even make sense to call the variable "clean.precious", as
> its semantics is VASTLY different from what we called `exclude` or
> `ignore` (they are synonyms---and they mean expendable files that
> are not to be tracked).
> 
> And when people want non-project-wide but personal paths that are
> excluded and expendable, they can use $GIT_DIR/info/exclude file.
> So a possible alternative is to have the dir.[ch] infrastructure to
> start paying attention to a new file $GIT_DIR/info/precious instead
> of the configuration variables.  I am not making an assessment on
> the relative merit between clean.precious vs $GIT_DIR/info/precious
> yet---just throwing an alternative for others to discuss.
> 
> By the way, I notice Ævar is CC'ed, but I haven't seen him for quite
> a while around here, and am wondering how you decided to do so.  Did
> you have private conversations with him and get suggestions, or
> something?  Just being curious, but at the same time, if somebody's
> influence in the resulting design is big enough, crediting them with
> "Helped-by:" or some other trailer might be worth considering.

This email was part of the `perl contrib/contacts/git-contacts` output
for this patchset, as documented in Documentation/SubmittingPatches
and Documentation/MyFirstContribution.txt. Should I not have done that?

Junio C Hamano Feb. 11, 2025, 9:24 p.m. UTC | #3
Ivan Shapovalov <intelfx@intelfx.name> writes:

> On 2025-02-11 at 10:37 -0800, Junio C Hamano wrote:
>> Ivan Shapovalov <intelfx@intelfx.name> writes:
>> 
>> > This series extends the concept of "excluded files" in `git clean` to
>> > make it useful to protect "precious files" that might be present in a
>> > specific developer's working tree (see below).
>> 
>> How does it interact with "git status"?
>
> In the same way as `git clean -e`, i.e., there is no interaction.

That is disappointing.  I was hoping that "git status -u" would list
precious and ignored ones in two separate sections.

> There were no implications behind my usage of the word "precious".

Then you should ;-)  We'd like to see us use the same language to
refer to the same concept within this same project (and more
importantly, avoid misleading people by calling two different things
with the same phrase).

> This email was part of the `perl contrib/contacts/git-contacts` output
> for this patchset, as documented in Documentation/SubmittingPatches
> and Documentation/MyFirstContribution.txt. Should I have not done that?

No, as I said, I was curious if he is getting involved with the
project back again behind the curtain.

Thanks.

Ivan Shapovalov Feb. 11, 2025, 9:42 p.m. UTC | #4
On 2025-02-11 at 13:24 -0800, Junio C Hamano wrote:
> Ivan Shapovalov <intelfx@intelfx.name> writes:
> 
> > On 2025-02-11 at 10:37 -0800, Junio C Hamano wrote:
> > > Ivan Shapovalov <intelfx@intelfx.name> writes:
> > > 
> > > > This series extends the concept of "excluded files" in `git clean` to
> > > > make it useful to protect "precious files" that might be present in a
> > > > specific developer's working tree (see below).
> > > 
> > > How does it interact with "git status"?
> > 
> > In the same way as `git clean -e`, i.e., there is no interaction.
> 
> That is disappointing.  I was hoping that "git status -u" would list
> precious and ignored ones in two separate sections.

Do I need to implement those interactions in order for this patch set
to be considered viable?

> 
> > There were no implications behind my usage of the word "precious".
> 
> Then you should ;-)  We'd like to see us use the same language to
> refer to the same concept within this same project (and more
> importantly, avoid misleading people by calling two different things
> with the same phrase).

I did not intend to mislead anyone (as evidenced by the fact that I was
simply not aware of any preexisting connotations). I'd appreciate
suggestions for a replacement term.