[v3,0/2] diff: copies-harder support

Message ID 20240311213928.1872437-1-sam@gentoo.org (mailing list archive)

Message

Sam James March 11, 2024, 9:38 p.m. UTC
range-diff:
```
1:  4ad89a3f1a ! 1:  879565c99a diff: implement config.diff.renames=copies-harder
    @@ Commit message
         This allows specifying that 'git log -p', 'git diff', etc should always act
         as if '-C --find-copies-harder' was passed.

    -    I've found this especially useful for certain types of repository (like
    +    It has proven this especially useful for certain types of repository (like
         Gentoo's ebuild repositories) because files are often copies of a previous
    -    version.
    +    version:
    +
    +    Suppose a directory 'sys-devel/gcc' contains recipes for building
    +    GCC, with one file for each supported upstream branch:
    +      gcc-13.x.build.recipe
    +      gcc-12.x.build.recipe
    +      gcc-11.x.build.recipe
    +      gcc-10.x.build.recipe
    +
    +    gcc-13.x.build.recipe was started as a copy of gcc-12.x.build.recipe
    +    (which was started as a copy of gcc-11.x.build.recipe, etc.). Previous versions
    +    are kept around to support parallel installation of multiple versions.
    +
    +    Being able to easily observe the diff relative to other recipes within the
    +    directory has been a quality of life improvement for such repo layouts.

         Signed-off-by: Sam James <sam@gentoo.org>

    @@ Documentation/config/diff.txt: diff.renames::
      	rename detection is disabled. If set to "true", basic rename
      	detection is enabled.  If set to "copies" or "copy", Git will
     -	detect copies, as well.  Defaults to true.  Note that this
    -+	detect copies, as well.  If set to "copies-harder", Git will try harder
    -+	to detect copies.  Defaults to true.  Note that this
    - 	affects only 'git diff' Porcelain like linkgit:git-diff[1] and
    - 	linkgit:git-log[1], and not lower level commands such as
    +-	affects only 'git diff' Porcelain like linkgit:git-diff[1] and
    +-	linkgit:git-log[1], and not lower level commands such as
    ++	detect copies, as well.  If set to "copies-harder", Git will spend extra
    ++	cycles to find more copies even in unmodified paths, see
    ++	'--find-copies-harder' in linkgit:git-diff[1]. Defaults to true.
    ++	Note that this affects only 'git diff' Porcelain like linkgit:git-diff[1]
    ++	and linkgit:git-log[1], and not lower level commands such as
      	linkgit:git-diff-files[1].
    +
    + diff.suppressBlankEmpty::

      ## Documentation/config/status.txt ##
     @@ Documentation/config/status.txt: status.renames::
    @@ Documentation/config/status.txt: status.renames::
      	linkgit:git-commit[1] .  If set to "false", rename detection is
      	disabled. If set to "true", basic rename detection is enabled.
     -	If set to "copies" or "copy", Git will detect copies, as well.
    -+	If set to "copies" or "copy", Git will detect copies, as well.  If
    -+	set to "copies-harder", Git will try harder to detect copies.
    ++	If set to "copies" or "copy", Git will detect copies, as well.  If set
    ++	to "copies-harder", Git will spend extra cycles to find more copies even
    ++	in unmodified paths, see '--find-copies-harder' in linkgit:git-diff[1].
      	Defaults to the value of diff.renames.

      status.showStash::
    @@ diff.c: int git_config_rename(const char *var, const char *value)
     +	if (!strcasecmp(value, "copies-harder"))
     +		return DIFF_DETECT_COPY_HARDER;
      	if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
    --		return  DIFF_DETECT_COPY;
    -+		return DIFF_DETECT_COPY;
    -+
    + 		return  DIFF_DETECT_COPY;
      	return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
    - }
    -
     @@ diff.c: void diff_setup_done(struct diff_options *options)
      	else
      		options->flags.diff_from_contents = 0;

     -	if (options->flags.find_copies_harder)
     +	/* Just fold this in as it makes the patch-to-git smaller */
    -+	if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {
    ++	if (options->flags.find_copies_harder ||
    ++	    options->detect_rename == DIFF_DETECT_COPY_HARDER) {
     +		options->flags.find_copies_harder = 1;
      		options->detect_rename = DIFF_DETECT_COPY;
     +	}
    @@ diff.c: static int diff_opt_find_copies(const struct option *opt,
      		return error(_("invalid argument to %s"), opt->long_name);

     -	if (options->detect_rename == DIFF_DETECT_COPY)
    -+	if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)
    ++	if (options->detect_rename == DIFF_DETECT_COPY ||
    ++	    options->detect_rename == DIFF_DETECT_COPY_HARDER)
      		options->flags.find_copies_harder = 1;
      	else
      		options->detect_rename = DIFF_DETECT_COPY;
    @@ diffcore-rename.c: static int find_identical_files(struct hashmap *srcs,
      		/* Give higher scores to sources that haven't been used already */
      		score = !source->rename_used;
     -		if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
    -+		if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)
    ++		if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY &&
    ++		    options->detect_rename != DIFF_DETECT_COPY_HARDER)
      			continue;
      		score += basename_same(source, target);
      		if (score > best_score) {
    @@ diffcore-rename.c: void diffcore_rename_extended(struct diff_options *options,
      	info.setup = 0;
      	assert(!dir_rename_count || strmap_empty(dir_rename_count));
     -	want_copies = (detect_rename == DIFF_DETECT_COPY);
    -+	want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);
    ++	want_copies = (detect_rename == DIFF_DETECT_COPY ||
    ++		       detect_rename == DIFF_DETECT_COPY_HARDER);
      	if (dirs_removed && (break_idx || want_copies))
      		BUG("dirs_removed incompatible with break/copy detection");
      	if (break_idx && relevant_sources)
-:  ---------- > 2:  eda1e07ac2 diff: whitespace cleanup
```
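
The two diff.c hunks in the range-diff above can be read together as a small, self-contained sketch. It is only an illustration under stated assumptions: the `sketch_*` names, the `detect_level` enum and its numeric values (in particular the value chosen for the new level) and the simplified boolean parser are hypothetical stand-ins, not the code in the patch. It shows the new "copies-harder" string being matched before "copies"/"copy", and the setup step folding the new level back into plain copy detection with the find-copies-harder flag set.

```c
#include <assert.h>
#include <stdbool.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Detection levels; names and numeric values are assumptions of this sketch. */
enum detect_level {
	DETECT_NONE = 0,
	DETECT_RENAME = 1,
	DETECT_COPY = 2,
	DETECT_COPY_HARDER = 3
};

/* Hypothetical, trimmed-down stand-in for struct diff_options. */
struct sketch_options {
	int detect_rename;
	bool find_copies_harder;
};

/* Simplified stand-in for git_config_bool(): accepts only a few spellings. */
bool sketch_parse_bool(const char *value)
{
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcasecmp(value, "1");
}

/* Map a diff.renames-style value to a detection level. */
int sketch_config_rename(const char *value)
{
	if (!value) /* assumption: a valueless key means plain rename detection */
		return DETECT_RENAME;
	if (!strcasecmp(value, "copies-harder"))
		return DETECT_COPY_HARDER;
	if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
		return DETECT_COPY;
	return sketch_parse_bool(value) ? DETECT_RENAME : DETECT_NONE;
}

/* Fold the "harder" level into copy detection plus a flag. */
void sketch_setup_done(struct sketch_options *o)
{
	if (o->find_copies_harder || o->detect_rename == DETECT_COPY_HARDER) {
		o->find_copies_harder = true;
		o->detect_rename = DETECT_COPY;
	}
}

/* Usage: the configured value ends up equivalent to -C --find-copies-harder. */
void sketch_config_self_check(void)
{
	struct sketch_options o = { DETECT_NONE, false };

	o.detect_rename = sketch_config_rename("copies-harder");
	assert(o.detect_rename == DETECT_COPY_HARDER);
	sketch_setup_done(&o);
	assert(o.detect_rename == DETECT_COPY && o.find_copies_harder);
}
```

In this sketch, code that runs after the setup step only ever sees the copy level plus the flag, which is one way to read the "Just fold this in" comment in the hunk.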

Sam James (2):
  diff: implement config.diff.renames=copies-harder
  diff: whitespace cleanup

 Documentation/config/diff.txt   |  8 +++++---
 Documentation/config/status.txt |  4 +++-
 diff.c                          | 14 +++++++++++---
 diff.h                          |  1 +
 diffcore-rename.c               |  6 ++++--
 merge-ort.c                     |  2 +-
 merge-recursive.c               |  2 +-
 7 files changed, 26 insertions(+), 11 deletions(-)

Comments

Sam James April 8, 2024, 3:32 p.m. UTC | #1
Sam James <sam@gentoo.org> writes:

> range-diff:
> ```
> [...]
> ```
>
> Sam James (2):
>   diff: implement config.diff.renames=copies-harder
>   diff: whitespace cleanup
>

It was pointed out at
https://github.com/gitgitgadget/git/pull/1606#issuecomment-2002137907
that I forgot to list the changes made in v2/v3.

v2: Documentation phrasing fixes.
v3: Split out whitespace & formatting changes into their own commit and
apply missed documentation phrasing tweaks.


>  Documentation/config/diff.txt   |  8 +++++---
>  Documentation/config/status.txt |  4 +++-
>  diff.c                          | 14 +++++++++++---
>  diff.h                          |  1 +
>  diffcore-rename.c               |  6 ++++--
>  merge-ort.c                     |  2 +-
>  merge-recursive.c               |  2 +-
>  7 files changed, 26 insertions(+), 11 deletions(-)
Sam James April 16, 2024, 2:42 a.m. UTC | #2
Sam James <sam@gentoo.org> writes:

> Sam James <sam@gentoo.org> writes:
>
>> range-diff:
>> ```
>> [...]
>> ```
>>
>> Sam James (2):
>>   diff: implement config.diff.renames=copies-harder
>>   diff: whitespace cleanup
>>
>
> It was pointed out at
> https://github.com/gitgitgadget/git/pull/1606#issuecomment-2002137907
> that I forgot to list the changes made in v2/v3.
>
> v2: Documentation phrasing fixes.
> v3: Split out whitespace & formatting changes into their own commit and
> apply missed documentation phrasing tweaks.

ping

I'm not sure of the etiquette for git development, so if it's too soon
to ping, my apologies.

>
>
>>  Documentation/config/diff.txt   |  8 +++++---
>>  Documentation/config/status.txt |  4 +++-
>>  diff.c                          | 14 +++++++++++---
>>  diff.h                          |  1 +
>>  diffcore-rename.c               |  6 ++++--
>>  merge-ort.c                     |  2 +-
>>  merge-recursive.c               |  2 +-
>>  7 files changed, 26 insertions(+), 11 deletions(-)
Sam James May 15, 2024, 10:27 p.m. UTC | #3
Sam James <sam@gentoo.org> writes:

> Sam James <sam@gentoo.org> writes:
>
>> Sam James <sam@gentoo.org> writes:
>>
>>> range-diff:
>>> ```
>>> [...]
>>> ```
>>>
>>> Sam James (2):
>>>   diff: implement config.diff.renames=copies-harder
>>>   diff: whitespace cleanup
>>>
>>
>> It was pointed out at
>> https://github.com/gitgitgadget/git/pull/1606#issuecomment-2002137907
>> that I forgot to list the changes made in v2/v3.
>>
>> v2: Documentation phrasing fixes.
>> v3: Split out whitespace & formatting changes into their own commit and
>> apply missed documentation phrasing tweaks.
>
> ping
>
> I'm not sure of the etiquette for git development, so if it's too soon
> to ping, my apologies.
>

ping - let me know if I need to do anything different. Thanks!

>>
>>
>>>  Documentation/config/diff.txt   |  8 +++++---
>>>  Documentation/config/status.txt |  4 +++-
>>>  diff.c                          | 14 +++++++++++---
>>>  diff.h                          |  1 +
>>>  diffcore-rename.c               |  6 ++++--
>>>  merge-ort.c                     |  2 +-
>>>  merge-recursive.c               |  2 +-
>>>  7 files changed, 26 insertions(+), 11 deletions(-)
Junio C Hamano May 16, 2024, 3:36 p.m. UTC | #4
Sam James <sam@gentoo.org> writes:

> ping - let me know if I need to do anything different. Thanks!
>
>>>
>>>
>>>>  Documentation/config/diff.txt   |  8 +++++---
>>>>  Documentation/config/status.txt |  4 +++-
>>>>  diff.c                          | 14 +++++++++++---
>>>>  diff.h                          |  1 +
>>>>  diffcore-rename.c               |  6 ++++--
>>>>  merge-ort.c                     |  2 +-
>>>>  merge-recursive.c               |  2 +-
>>>>  7 files changed, 26 insertions(+), 11 deletions(-)

Copies-harder is supported from the command line already.  We do not
want a configuration variable for it.  The diff.renames configuration
was already mistake enough.  Let's not pile a new mistake on top of an
old one that it is too late for us to take back.

Thanks.
Sam James May 17, 2024, 3:38 a.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> Sam James <sam@gentoo.org> writes:
>
>> ping - let me know if I need to do anything different. Thanks!
>>
>>>>
>>>>
>>>>>  Documentation/config/diff.txt   |  8 +++++---
>>>>>  Documentation/config/status.txt |  4 +++-
>>>>>  diff.c                          | 14 +++++++++++---
>>>>>  diff.h                          |  1 +
>>>>>  diffcore-rename.c               |  6 ++++--
>>>>>  merge-ort.c                     |  2 +-
>>>>>  merge-recursive.c               |  2 +-
>>>>>  7 files changed, 26 insertions(+), 11 deletions(-)
>
> Copies-harder is supported from the command line already.  We do not
> want a configuration variable for it.  The diff.renames configuration
> was already mistake enough.  Let's not pile a new mistake on top of an
> old one that it is too late for us to take back.

Thanks for the reply. It's a shame that a conceptual NACK wasn't
delivered in v1 [0] though. Also, Elijah said a configuration option made
sense in v1 and you responded to him and didn't disagree, so I took it
as conceptually okay.

I'm aware of the command line option existing. It doesn't work well for
us because it's really only suitable for certain classes of repos where
you essentially *always* want it enabled (any ebuild repository). In
other repos you don't want it, given how slow it is, and you may not
even be expecting many copies/renames there.
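
As a rough illustration of that trade-off, here is a toy model in C. It is not how git implements copy detection: the `sketch_*` names and the path list are hypothetical, and the only behaviour it borrows is the documented one, namely that plain -C considers a file as a copy source only if it was modified in the same change, while --find-copies-harder also inspects unmodified files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for a path that exists before the commit. */
struct sketch_path {
	const char *name;
	bool modified; /* true if the commit being diffed also touches it */
};

/*
 * Count the paths that would be considered as copy sources.
 * harder == false models plain -C, harder == true models
 * --find-copies-harder.
 */
size_t sketch_count_copy_sources(const struct sketch_path *paths,
				 size_t n, bool harder)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (harder || paths[i].modified)
			count++;
	return count;
}

/*
 * Usage: a commit adds gcc-13.x.build.recipe as a copy of
 * gcc-12.x.build.recipe without touching any existing recipe.
 */
void sketch_sources_self_check(void)
{
	const struct sketch_path tree[] = {
		{ "gcc-12.x.build.recipe", false },
		{ "gcc-11.x.build.recipe", false },
		{ "gcc-10.x.build.recipe", false },
	};
	size_t n = sizeof(tree) / sizeof(tree[0]);

	/* Plain -C has no candidate source, so the copy goes unnoticed. */
	assert(sketch_count_copy_sources(tree, n, false) == 0);
	/* The harder mode considers every existing path. */
	assert(sketch_count_copy_sources(tree, n, true) == n);
}
```

In this model the harder mode is what makes the copy detectable at all in the recipe layout, and its candidate set grows with the number of paths in the tree rather than with the size of the change, which is why it is unattractive as a default everywhere.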

[0] https://lore.kernel.org/git/xmqq7cmu9s29.fsf@gitster.g/

>
> Thanks.

thanks,
sam