mbox series

[0/2] Support diff.wordDiff config

Message ID 20240302095751.123138-1-karthik.188@gmail.com (mailing list archive)
Headers show
Series Support diff.wordDiff config | expand

Message

Karthik Nayak March 2, 2024, 9:57 a.m. UTC
This patch series adds the diff.wordDiff config option. This mimics the
'--word-diff' option of `git-diff(1)`.

The first patch is more of a preparatory patch, which makes it easier to
add tests when the actual config is added in patch 2.

Karthik Nayak (2):
  t4034: extract out `diff_with_opts`
  diff: add 'diff.wordDiff' config option

 Documentation/config/diff.txt |  4 +++
 diff.c                        | 50 +++++++++++++++++++++------
 t/t4034-diff-words.sh         | 63 ++++++++++++++++++++---------------
 3 files changed, 80 insertions(+), 37 deletions(-)

Comments

Junio C Hamano March 2, 2024, 5:03 p.m. UTC | #1
Karthik Nayak <karthik.188@gmail.com> writes:

> This patch series adds the diff.wordDiff config option. This mimics the
> '--word-diff' option of `git-diff(1)`.

Is it even be sensible to introduce this configuration variable in
the first place?  What would this do to users who set this variable
and use third-party or their own scripts that run "git diff" under
the hood?

The usual answer is "these tools should be using the low-level
plumbing commands like diff-files, diff-index, and diff-tree", so I
am not worried about it too much myself, and the above is purely the
devil's advocate comment.

Having said that, running

	$ git grep -e 'git diff '

in the collection of scripts I use [*] to work on this project, I am
reminded that I may have to be a bit more conservative than I
currently am about the risk of breaking scripts with the changes
like the one being proposed.

The proposed feature also may break those who use the git-prompt and
diff-highlight available in conrib/, even though I am not sure how
badly they would break, because I only looked at the lines given by
this command:

	$ git grep -e 'git diff ' -- \*.sh ':!t/'

and didn't check how the output from 'git diff' is used.


[Footnote]

 * They can be seen in the 'todo' branch, if anybody is interested.
Karthik Nayak March 2, 2024, 6:02 p.m. UTC | #2
On Sat, Mar 2, 2024 at 6:03 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Karthik Nayak <karthik.188@gmail.com> writes:
>
> > This patch series adds the diff.wordDiff config option. This mimics the
> > '--word-diff' option of `git-diff(1)`.
>
> Is it even be sensible to introduce this configuration variable in
> the first place?  What would this do to users who set this variable
> and use third-party or their own scripts that run "git diff" under
> the hood?

This is definitely a good question to ask. I'm primarily not a user of this
option, and this patch series was more of to start this discussion, based
on the request. I'm comfortable dropping the patch series too if it doesn't
make much sense.

>
> The usual answer is "these tools should be using the low-level
> plumbing commands like diff-files, diff-index, and diff-tree", so I
> am not worried about it too much myself, and the above is purely the
> devil's advocate comment.
>
> Having said that, running
>
>         $ git grep -e 'git diff '
>
> in the collection of scripts I use [*] to work on this project, I am
> reminded that I may have to be a bit more conservative than I
> currently am about the risk of breaking scripts with the changes
> like the one being proposed.
>
> The proposed feature also may break those who use the git-prompt and
> diff-highlight available in conrib/, even though I am not sure how
> badly they would break, because I only looked at the lines given by
> this command:
>
>         $ git grep -e 'git diff ' -- \*.sh ':!t/'
>
> and didn't check how the output from 'git diff' is used.
>
>
> [Footnote]
>
>  * They can be seen in the 'todo' branch, if anybody is interested.

Having said that, wouldn't this cause a problem only if the config is set up?
Meaning the user must explicitly set `diff.wordDiff` for their scripts
to potentially
break. In that sense, is it a breaking feature?
Kristoffer Haugsbakk March 2, 2024, 7:57 p.m. UTC | #3
Hi

On Sat, Mar 2, 2024, at 19:02, Karthik Nayak wrote:
> On Sat, Mar 2, 2024 at 6:03 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>> > This patch series adds the diff.wordDiff config option. This mimics the
>> > '--word-diff' option of `git-diff(1)`.
>>
>> Is it even be sensible to introduce this configuration variable in
>> the first place?  What would this do to users who set this variable
>> and use third-party or their own scripts that run "git diff" under
>> the hood?
>
> This is definitely a good question to ask. I'm primarily not a user of this
> option, and this patch series was more of to start this discussion, based
> on the request. I'm comfortable dropping the patch series too if it doesn't
> make much sense.

This looks similar to the discussion from a [stash] topic:

• Proposed introducing config variables which change how `git stash
  push` and `git stash save` behave (what they save)
• Concern about how that could break third-party scripts

Like here it would be opt-in. But the user might have no idea what kind
of scripts/programs that they use that happen to use git-stash(1).

(That’s at least how I read the thread)

I guess the concern might be worse for git-stash(1) since it seems very
natural to use that command in scripts in order to deal with a working
tree that might be in a who-knows condition: just get these things out
of the way so I can do what I want.


Chris Torek March 3, 2024, 7:23 a.m. UTC | #4
Continuing the digression a bit:

On Sat, Mar 2, 2024 at 11:58 AM Kristoffer Haugsbakk
<code@khaugsbakk.name> wrote:
> This looks similar to the discussion from a [stash] topic:
>
> • Proposed introducing config variables which change how `git stash
>   push` and `git stash save` behave (what they save)
> • Concern about how that could break third-party scripts
[snippage]
> 
Junio C Hamano March 3, 2024, 5:45 p.m. UTC | #5
Chris Torek <chris.torek@gmail.com> writes:

> This tension is relieved somewhat when there *are* separate
> plumbing commands, such as `git diff-index` and `git diff-tree`
> and so on, or `git rev-list` vs `git log`. Unfortunately there
> are some commands, including `git log` itself, that have options
> that are missing from the roughly-equivalent plumbing command,
> and there are commands (such as `git stash` and `git status`)
> that either do not have, or at one time lacked, plumbing command
> equivalents or options.

Yup.  It is my pet peeve that more and more contributors got lazy
and tweaked only Porcelain commands, without bothering to improve
plumbing commands to match, while adding more features during the
last decade.  Unfortunately there is no easy remedy after such sins
have been committed.  Once people start using `git log` in their
scripts, it is way too late to tell them to update their scripts to
use `git log --porcelain`.  The fact that you need to tell them is
an admission that you already broke their scripts.
Olliver Schinagl March 22, 2024, 9:57 p.m. UTC | #6
Hey list,

On 02-03-2024 18:03, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
> 
>> This patch series adds the diff.wordDiff config option. This mimics the
>> '--word-diff' option of `git-diff(1)`.
> 
> Is it even be sensible to introduce this configuration variable in
> the first place?

Of course it is :p as a human, I crave it :p

On a slightly more serious note though, I always have to use an alias, 
or the command line option I cannot use `git diff` with this as default. 
 From a human UX point of view, this is odd, and we have tons of 
configuration options to do exactly what is desired, without aliases.

I suppose the deeper discussion would be, do we distinct between user 
(human) facing options and arguments, and machine facing options and 
argument (in theory, yes we do; in practice things get abused).

Git is often blamed due to its horrible UX. I think the problem comes 
from the deeper issue mentioned above. Because things get abused, they 
can no longer be touched, not even to improve UX for the human.

> What would this do to users who set this variable
> and use third-party or their own scripts that run "git diff" under
> the hood?
> 
> The usual answer is "these tools should be using the low-level
> plumbing commands like diff-files, diff-index, and diff-tree", so I
> am not worried about it too much myself, and the above is purely the
> devil's advocate comment.
> 
> Having said that, running
> 
> 	$ git grep -e 'git diff '
> 
> in the collection of scripts I use [*] to work on this project, I am
> reminded that I may have to be a bit more conservative than I
> currently am about the risk of breaking scripts with the changes
> like the one being proposed.
> 
> The proposed feature also may break those who use the git-prompt and
> diff-highlight available in conrib/, even though I am not sure how
> badly they would break, because I only looked at the lines given by
> this command:
> 
> 	$ git grep -e 'git diff ' -- \*.sh ':!t/'
> 
> and didn't check how the output from 'git diff' is used.
> 
> 
> [Footnote]
> 
>   * They can be seen in the 'todo' branch, if anybody is interested.
Olliver Schinagl March 22, 2024, 9:59 p.m. UTC | #7
On 02-03-2024 19:02, Karthik Nayak wrote:
> On Sat, Mar 2, 2024 at 6:03 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>>> This patch series adds the diff.wordDiff config option. This mimics the
>>> '--word-diff' option of `git-diff(1)`.
>>
>> Is it even be sensible to introduce this configuration variable in
>> the first place?  What would this do to users who set this variable
>> and use third-party or their own scripts that run "git diff" under
>> the hood?
> 
> This is definitely a good question to ask. I'm primarily not a user of this
> option, and this patch series was more of to start this discussion, based
> on the request. I'm comfortable dropping the patch series too if it doesn't
> make much sense.

As a human user, I would very much like to see this feature :) It helps 
much to visually distinct things, but `git diff` sits way to deep in my 
muscle memory to remember my alias. Also, we have a configuration system 
to set configuration options for many things.

I wonder though, how many other configuration options we already have, 
that potentially break 'random scripts' because the user has set it ...

> 
>>
>> The usual answer is "these tools should be using the low-level
>> plumbing commands like diff-files, diff-index, and diff-tree", so I
>> am not worried about it too much myself, and the above is purely the
>> devil's advocate comment.
>>
>> Having said that, running
>>
>>          $ git grep -e 'git diff '
>>
>> in the collection of scripts I use [*] to work on this project, I am
>> reminded that I may have to be a bit more conservative than I
>> currently am about the risk of breaking scripts with the changes
>> like the one being proposed.
>>
>> The proposed feature also may break those who use the git-prompt and
>> diff-highlight available in conrib/, even though I am not sure how
>> badly they would break, because I only looked at the lines given by
>> this command:
>>
>>          $ git grep -e 'git diff ' -- \*.sh ':!t/'
>>
>> and didn't check how the output from 'git diff' is used.
>>
>>
>> [Footnote]
>>
>>   * They can be seen in the 'todo' branch, if anybody is interested.
> 
> Having said that, wouldn't this cause a problem only if the config is set up?
> Meaning the user must explicitly set `diff.wordDiff` for their scripts
> to potentially
> break. In that sense, is it a breaking feature?
Olliver Schinagl March 22, 2024, 10:05 p.m. UTC | #8
On 03-03-2024 08:23, Chris Torek wrote:
> Continuing the digression a bit:
> 
> On Sat, Mar 2, 2024 at 11:58 AM Kristoffer Haugsbakk
> <code@khaugsbakk.name> wrote:
>> This looks similar to the discussion from a [stash] topic:
>>
>> • Proposed introducing config variables which change how `git stash
>>    push` and `git stash save` behave (what they save)
>> • Concern about how that could break third-party scripts
> [snippage]
>> 
Olliver Schinagl March 22, 2024, 10:08 p.m. UTC | #9
On 03-03-2024 18:45, Junio C Hamano wrote:
> Chris Torek <chris.torek@gmail.com> writes:
> 
>> This tension is relieved somewhat when there *are* separate
>> plumbing commands, such as `git diff-index` and `git diff-tree`
>> and so on, or `git rev-list` vs `git log`. Unfortunately there
>> are some commands, including `git log` itself, that have options
>> that are missing from the roughly-equivalent plumbing command,
>> and there are commands (such as `git stash` and `git status`)
>> that either do not have, or at one time lacked, plumbing command
>> equivalents or options.
> 
> Yup.  It is my pet peeve that more and more contributors got lazy
> and tweaked only Porcelain commands, without bothering to improve
> plumbing commands to match, while adding more features during the
> last decade.  Unfortunately there is no easy remedy after such sins
> have been committed.  Once people start using `git log` in their
> scripts, it is way too late to tell them to update their scripts to
> use `git log --porcelain`.  The fact that you need to tell them is
> an admission that you already broke their scripts.
> 
To avoid this request from dieing quietly, I will ask (complain) again. 
Who's the client for. How important is the human UX?

Even introducing a new cli, 'git-cli-for-humans' it will be abused again 
for sure. So what's a good way forward? Personally, as I mentioned 
before, it's in the docs to not script around non-plumbing commands, 
which gives an opening to the admission. And why is admitting things a 
bad thing, when it improves things for the human? Even if it hurts.

One could argue 'git3 will break things! Human and machine control is 
split. Use --porcelain (or plumbing commands) in your scripts or expect 
breakage from time to time. You have been warned!'

We do in the end want progress, do we not? :)

Olliver
Dragan Simic March 25, 2024, 9:53 p.m. UTC | #10
On 2024-03-22 23:08, Olliver Schinagl wrote:
> On 03-03-2024 18:45, Junio C Hamano wrote:
>> Chris Torek <chris.torek@gmail.com> writes:
>> 
>>> This tension is relieved somewhat when there *are* separate
>>> plumbing commands, such as `git diff-index` and `git diff-tree`
>>> and so on, or `git rev-list` vs `git log`. Unfortunately there
>>> are some commands, including `git log` itself, that have options
>>> that are missing from the roughly-equivalent plumbing command,
>>> and there are commands (such as `git stash` and `git status`)
>>> that either do not have, or at one time lacked, plumbing command
>>> equivalents or options.
>> 
>> Yup.  It is my pet peeve that more and more contributors got lazy
>> and tweaked only Porcelain commands, without bothering to improve
>> plumbing commands to match, while adding more features during the
>> last decade.  Unfortunately there is no easy remedy after such sins
>> have been committed.  Once people start using `git log` in their
>> scripts, it is way too late to tell them to update their scripts to
>> use `git log --porcelain`.  The fact that you need to tell them is
>> an admission that you already broke their scripts.
>> 
> To avoid this request from dieing quietly, I will ask (complain)
> again. Who's the client for. How important is the human UX?
> 
> Even introducing a new cli, 'git-cli-for-humans' it will be abused
> again for sure. So what's a good way forward? Personally, as I
> mentioned before, it's in the docs to not script around non-plumbing
> commands, which gives an opening to the admission. And why is
> admitting things a bad thing, when it improves things for the human?
> Even if it hurts.
> 
> One could argue 'git3 will break things! Human and machine control is
> split. Use --porcelain (or plumbing commands) in your scripts or
> expect breakage from time to time. You have been warned!'
> 
> We do in the end want progress, do we not? :)

Maybe, but just maybe, a possible solution for introducing such new
configuration options could be introduce a new category of configuration
options, which could be set in the user's git configuration only?

That way, a repository enabling some troublesome configuration option
wouldn't cause the user's scripts to break.