diff mbox series

[1/3] doc: clarify documentation for rename/copy limits

Message ID 7dc448df6ec6edb41464a4115017064cf41b2d83.1625964399.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Improve the documentation and warnings dealing with rename/copy limits | expand

Commit Message

Elijah Newren July 11, 2021, 12:46 a.m. UTC
From: Elijah Newren <newren@gmail.com>

A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection.  The following two commits each
introduced an exception to this:

    9027f53cb505 (Do linear-time/space rename logic for exact renames,
                  2007-10-25)
    bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based
                  on basenames, 2021-02-14)

(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection.  So, for copy detection, the documentation was
closer to correct.)

Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm.  While at it, also note the
default values for all these settings.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/config/diff.txt  |  7 ++++---
 Documentation/config/merge.txt | 10 ++++++----
 Documentation/diff-options.txt | 11 ++++++-----
 3 files changed, 16 insertions(+), 12 deletions(-)

Comments

Bagas Sanjaya July 11, 2021, 4:37 a.m. UTC | #1
On 11/07/21 07.46, Elijah Newren via GitGitGadget wrote:
>   -l<num>::
> -	The `-M` and `-C` options require O(n^2) processing time where n
> -	is the number of potential rename/copy targets.  This
> -	option prevents rename/copy detection from running if
> -	the number of rename/copy targets exceeds the specified
> -	number.
> +	The `-M` and `-C` options can require O(n^2) processing time
> +	where n is the number of potential rename/copy targets.  This
> +	option prevents the quadratic portion of rename/copy detection
> +	from running if the number of rename/copy targets exceeds the
> +	specified number.  Defaults to diff.renameLimit, or if that is
> +	also unspecified, to 400.

Why not just defaults to diff.renameLimit if the default for it and -l 
are both 400?
Elijah Newren July 11, 2021, 4:52 a.m. UTC | #2
On Sat, Jul 10, 2021 at 9:37 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On 11/07/21 07.46, Elijah Newren via GitGitGadget wrote:
> >   -l<num>::
> > -     The `-M` and `-C` options require O(n^2) processing time where n
> > -     is the number of potential rename/copy targets.  This
> > -     option prevents rename/copy detection from running if
> > -     the number of rename/copy targets exceeds the specified
> > -     number.
> > +     The `-M` and `-C` options can require O(n^2) processing time
> > +     where n is the number of potential rename/copy targets.  This
> > +     option prevents the quadratic portion of rename/copy detection
> > +     from running if the number of rename/copy targets exceeds the
> > +     specified number.  Defaults to diff.renameLimit, or if that is
> > +     also unspecified, to 400.
>
> Why not just defaults to diff.renameLimit if the default for it and -l
> are both 400?

Good point; I'll make that change and include it in the next re-roll.
Derrick Stolee July 12, 2021, 3:03 p.m. UTC | #3
On 7/10/2021 8:46 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
...
>  diff.renameLimit::
> -	The number of files to consider when performing the copy/rename
> -	detection; equivalent to the 'git diff' option `-l`. This setting
> -	has no effect if rename detection is turned off.
> +	The number of files to consider in the quadratic portion of
> +	copy/rename detection; equivalent to the 'git diff' option
> +	`-l`.  If not set, the default value is 400.  This setting has
> +	no effect if rename detection is turned off.
...
>  merge.renameLimit::
> -	The number of files to consider when performing rename detection
> -	during a merge; if not specified, defaults to the value of
> -	diff.renameLimit. This setting has no effect if rename detection
> -	is turned off.
> +	The number of files to consider in the quadratic portion of
> +	rename detection during a merge.  If not specified, defaults
> +	to the value of diff.renameLimit.  If neither
> +	merge.renameLimit nor diff.renameLimit are specified, defaults
> +	to 1000.  This setting has no effect if rename detection is
> +	turned off.
...
>  -l<num>::
> -	The `-M` and `-C` options require O(n^2) processing time where n
> -	is the number of potential rename/copy targets.  This
> -	option prevents rename/copy detection from running if
> -	the number of rename/copy targets exceeds the specified
> -	number.
> +	The `-M` and `-C` options can require O(n^2) processing time
> +	where n is the number of potential rename/copy targets.  This
> +	option prevents the quadratic portion of rename/copy detection
> +	from running if the number of rename/copy targets exceeds the
> +	specified number.  Defaults to diff.renameLimit, or if that is
> +	also unspecified, to 400.

These changes are clear to me, but I also know how a bit about how
the rename algorithm works and what "the quadratic portion" means.
I wonder if this is sufficient detail to help a user less versed in
Git's internals.

Perhaps we should instead be describing the set of "potential inexact
renames" as a more descriptive approach? Here is a possible way to
use this wording instead:

  merge.renameLimit::
      The number of potential inexact renames to consider when
      performing rename detection during a merge; if this limit
      is exceeded, then no inexact renames are computed. Renames
      where the content does not change are excluded from this
      limit. If not specified, defaults to...

Feel free to take or leave any of this example.

Thanks,
-Stolee
Junio C Hamano July 12, 2021, 9:27 p.m. UTC | #4
Derrick Stolee <stolee@gmail.com> writes:

>   merge.renameLimit::
>       The number of potential inexact renames to consider when
>       performing rename detection during a merge; if this limit
>       is exceeded, then no inexact renames are computed. Renames
>       where the content does not change are excluded from this
>       limit. If not specified, defaults to...
>
> Feel free to take or leave any of this example.

Phrases like "finding inexact renames" and "finding which files were
moved with changes" may tell what we are spending cycles on to the
end user, but I have to say I agree that "quadratic portion" would
not resonate with the intended audiences at all.
diff mbox series

Patch

diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
index 2d3331f55c2..31b96e8294f 100644
--- a/Documentation/config/diff.txt
+++ b/Documentation/config/diff.txt
@@ -118,9 +118,10 @@  diff.orderFile::
 	relative to the top of the working tree.
 
 diff.renameLimit::
-	The number of files to consider when performing the copy/rename
-	detection; equivalent to the 'git diff' option `-l`. This setting
-	has no effect if rename detection is turned off.
+	The number of files to consider in the quadratic portion of
+	copy/rename detection; equivalent to the 'git diff' option
+	`-l`.  If not set, the default value is 400.  This setting has
+	no effect if rename detection is turned off.
 
 diff.renames::
 	Whether and how Git detects renames.  If set to "false",
diff --git a/Documentation/config/merge.txt b/Documentation/config/merge.txt
index cb2ed589075..d030b936d1f 100644
--- a/Documentation/config/merge.txt
+++ b/Documentation/config/merge.txt
@@ -33,10 +33,12 @@  merge.verifySignatures::
 include::fmt-merge-msg.txt[]
 
 merge.renameLimit::
-	The number of files to consider when performing rename detection
-	during a merge; if not specified, defaults to the value of
-	diff.renameLimit. This setting has no effect if rename detection
-	is turned off.
+	The number of files to consider in the quadratic portion of
+	rename detection during a merge.  If not specified, defaults
+	to the value of diff.renameLimit.  If neither
+	merge.renameLimit nor diff.renameLimit are specified, defaults
+	to 1000.  This setting has no effect if rename detection is
+	turned off.
 
 merge.renames::
 	Whether Git detects renames.  If set to "false", rename detection
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 32e6dee5ac3..b5682b83956 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -588,11 +588,12 @@  When used together with `-B`, omit also the preimage in the deletion part
 of a delete/create pair.
 
 -l<num>::
-	The `-M` and `-C` options require O(n^2) processing time where n
-	is the number of potential rename/copy targets.  This
-	option prevents rename/copy detection from running if
-	the number of rename/copy targets exceeds the specified
-	number.
+	The `-M` and `-C` options can require O(n^2) processing time
+	where n is the number of potential rename/copy targets.  This
+	option prevents the quadratic portion of rename/copy detection
+	from running if the number of rename/copy targets exceeds the
+	specified number.  Defaults to diff.renameLimit, or if that is
+	also unspecified, to 400.
 
 ifndef::git-format-patch[]
 --diff-filter=[(A|C|D|M|R|T|U|X|B)...[*]]::