diff: implement config.diff.renames=copies-harder

Message ID pull.1606.git.1699010701704.gitgitgadget@gmail.com (mailing list archive)
State New, archived

Commit Message

Sam James Nov. 3, 2023, 11:25 a.m. UTC
From: Sam James <sam@gentoo.org>

This patch adds a config value for 'diff.renames' called 'copies-harder'
which makes it so '-C -C' is in effect always passed for 'git log -p',
'git diff', etc.

This allows specifying that 'git log -p', 'git diff', etc. should always act
as if '-C --find-copies-harder' was passed.

I've found this especially useful for certain types of repository (like
Gentoo's ebuild repositories) because files are often copies of a previous
version.

Signed-off-by: Sam James <sam@gentoo.org>
---
    diff: implement config.diff.renames=copies-harder
    
    This patch adds a config value for 'diff.renames' called 'copies-harder'
    which makes it so '-C -C' is in effect always passed for 'git log -p',
    'git diff', etc.
    
    This allows specifying that 'git log -p', 'git diff', etc. should always
    act as if '-C --find-copies-harder' was passed.
    
    I've found this especially useful for certain types of repository (like
    Gentoo's ebuild repositories) because files are often copies of a
    previous version.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1606%2Fthesamesam%2Fconfig-copies-harder-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1606/thesamesam/config-copies-harder-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1606

 Documentation/config/diff.txt   |  3 ++-
 Documentation/config/status.txt |  3 ++-
 diff.c                          | 12 +++++++++---
 diff.h                          |  1 +
 diffcore-rename.c               |  4 ++--
 merge-ort.c                     |  2 +-
 merge-recursive.c               |  2 +-
 7 files changed, 18 insertions(+), 9 deletions(-)


base-commit: 692be87cbba55e8488f805d236f2ad50483bd7d5

Comments

Elijah Newren Nov. 7, 2023, 2:45 a.m. UTC | #1
Hi,

On Fri, Nov 3, 2023 at 4:25 AM Sam James via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Sam James <sam@gentoo.org>
>
> This patch adds a config value for 'diff.renames' called 'copies-harder'
> which makes it so '-C -C' is in effect always passed for 'git log -p',
> 'git diff', etc.
>
> This allows specifying that 'git log -p', 'git diff', etc. should always act
> as if '-C --find-copies-harder' was passed.
>
> I've found this especially useful for certain types of repository (like
> Gentoo's ebuild repositories) because files are often copies of a previous
> version.

These must be very small repositories?  --find-copies-harder is really
expensive...

But, if you are willing to pay the price, the idea of making this a
configuration item makes sense.

> Signed-off-by: Sam James <sam@gentoo.org>
> ---
>     diff: implement config.diff.renames=copies-harder
>
>     This patch adds a config value for 'diff.renames' called 'copies-harder'
>     which makes it so '-C -C' is in effect always passed for 'git log -p',
>     'git diff', etc.
>
>     This allows specifying that 'git log -p', 'git diff', etc. should always
>     act as if '-C --find-copies-harder' was passed.
>
>     I've found this especially useful for certain types of repository (like
>     Gentoo's ebuild repositories) because files are often copies of a
>     previous version.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1606%2Fthesamesam%2Fconfig-copies-harder-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1606/thesamesam/config-copies-harder-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1606
>
>  Documentation/config/diff.txt   |  3 ++-
>  Documentation/config/status.txt |  3 ++-
>  diff.c                          | 12 +++++++++---
>  diff.h                          |  1 +
>  diffcore-rename.c               |  4 ++--
>  merge-ort.c                     |  2 +-
>  merge-recursive.c               |  2 +-
>  7 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
> index bd5ae0c3378..d2ff3c62d41 100644
> --- a/Documentation/config/diff.txt
> +++ b/Documentation/config/diff.txt
> @@ -131,7 +131,8 @@ diff.renames::
>         Whether and how Git detects renames.  If set to "false",
>         rename detection is disabled. If set to "true", basic rename
>         detection is enabled.  If set to "copies" or "copy", Git will
> -       detect copies, as well.  Defaults to true.  Note that this
> +       detect copies, as well.  If set to "copies-harder", Git will try harder
> +       to detect copies.  Defaults to true.  Note that this

"try harder to detect copies" feels like an unhelpful explanation.  I
understand that a lengthy explanation (like the one found under the
`--find-copies-harder` option in git-diff) may not be wanted here
since we are trying to describe things succinctly, but could we at
least reference the `--find-copies-harder` option so that people know
where to go to get a more detailed explanation?

>         affects only 'git diff' Porcelain like linkgit:git-diff[1] and
>         linkgit:git-log[1], and not lower level commands such as
>         linkgit:git-diff-files[1].
> diff --git a/Documentation/config/status.txt b/Documentation/config/status.txt
> index 2ff8237f8fc..7ca7a4becd7 100644
> --- a/Documentation/config/status.txt
> +++ b/Documentation/config/status.txt
> @@ -33,7 +33,8 @@ status.renames::
>         Whether and how Git detects renames in linkgit:git-status[1] and
>         linkgit:git-commit[1] .  If set to "false", rename detection is
>         disabled. If set to "true", basic rename detection is enabled.
> -       If set to "copies" or "copy", Git will detect copies, as well.
> +       If set to "copies" or "copy", Git will detect copies, as well.  If
> +       set to "copies-harder", Git will try harder to detect copies.

Same here.

>         Defaults to the value of diff.renames.
>
>  status.showStash::
> diff --git a/diff.c b/diff.c
> index 2c602df10a3..0ca906611f5 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -206,8 +206,11 @@ int git_config_rename(const char *var, const char *value)
>  {
>         if (!value)
>                 return DIFF_DETECT_RENAME;
> +       if (!strcasecmp(value, "copies-harder"))
> +               return DIFF_DETECT_COPY_HARDER;
>         if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
> -               return  DIFF_DETECT_COPY;
> +               return DIFF_DETECT_COPY;
> +

As per CodingGuidelines:
"""
 - Fixing style violations while working on a real change as a
   preparatory clean-up step is good, but otherwise avoid useless code
   churn for the sake of conforming to the style.
"""
So, the fixing of extra space and the extra blank line should be
placed in a separate patch.

>         return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
>  }
>
> @@ -4832,8 +4835,11 @@ void diff_setup_done(struct diff_options *options)
>         else
>                 options->flags.diff_from_contents = 0;
>
> -       if (options->flags.find_copies_harder)
> +       /* Just fold this in as it makes the patch-to-git smaller */
> +       if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {

As per CodingGuidelines, this line is too long and should be split
across two lines at the `||`.

> +               options->flags.find_copies_harder = 1;
>                 options->detect_rename = DIFF_DETECT_COPY;
> +       }
>
>         if (!options->flags.relative_name)
>                 options->prefix = NULL;
> @@ -5264,7 +5270,7 @@ static int diff_opt_find_copies(const struct option *opt,
>         if (*arg != 0)
>                 return error(_("invalid argument to %s"), opt->long_name);
>
> -       if (options->detect_rename == DIFF_DETECT_COPY)
> +       if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)

Also too long.

>                 options->flags.find_copies_harder = 1;
>         else
>                 options->detect_rename = DIFF_DETECT_COPY;
> diff --git a/diff.h b/diff.h
> index 66bd8aeb293..b29e5b777f8 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -555,6 +555,7 @@ int git_config_rename(const char *var, const char *value);
>
>  #define DIFF_DETECT_RENAME     1
>  #define DIFF_DETECT_COPY       2
> +#define DIFF_DETECT_COPY_HARDER 3
>
>  #define DIFF_PICKAXE_ALL       1
>  #define DIFF_PICKAXE_REGEX     2
> diff --git a/diffcore-rename.c b/diffcore-rename.c
> index 5a6e2bcac71..856291d66f2 100644
> --- a/diffcore-rename.c
> +++ b/diffcore-rename.c
> @@ -299,7 +299,7 @@ static int find_identical_files(struct hashmap *srcs,
>                 }
>                 /* Give higher scores to sources that haven't been used already */
>                 score = !source->rename_used;
> -               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
> +               if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)

This line should also be split.

>                         continue;
>                 score += basename_same(source, target);
>                 if (score > best_score) {
> @@ -1405,7 +1405,7 @@ void diffcore_rename_extended(struct diff_options *options,
>         trace2_region_enter("diff", "setup", options->repo);
>         info.setup = 0;
>         assert(!dir_rename_count || strmap_empty(dir_rename_count));
> -       want_copies = (detect_rename == DIFF_DETECT_COPY);
> +       want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);

and so should this one.

>         if (dirs_removed && (break_idx || want_copies))
>                 BUG("dirs_removed incompatible with break/copy detection");
>         if (break_idx && relevant_sources)
> diff --git a/merge-ort.c b/merge-ort.c
> index 6491070d965..77498354652 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4782,7 +4782,7 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
>          * sanity check them anyway.
>          */
>         assert(opt->detect_renames >= -1 &&
> -              opt->detect_renames <= DIFF_DETECT_COPY);
> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>         assert(opt->verbosity >= 0 && opt->verbosity <= 5);
>         assert(opt->buffer_output <= 2);
>         assert(opt->obuf.len == 0);
> diff --git a/merge-recursive.c b/merge-recursive.c
> index e3beb0801b1..d52dd536606 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -3708,7 +3708,7 @@ static int merge_start(struct merge_options *opt, struct tree *head)
>         assert(opt->branch1 && opt->branch2);
>
>         assert(opt->detect_renames >= -1 &&
> -              opt->detect_renames <= DIFF_DETECT_COPY);
> +              opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
>         assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
>                opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
>         assert(opt->rename_limit >= -1);
>
> base-commit: 692be87cbba55e8488f805d236f2ad50483bd7d5
> --
> gitgitgadget

The overall patch makes sense and looks good, modulo some minor
stylistic things that need cleaning up.
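
For illustration, the line splitting requested in this review could look roughly like the sketch below. It is not the revised patch: `struct opts`, `setup_done()` and `want_copies()` are hypothetical stand-ins for `struct diff_options`, `diff_setup_done()` and the assignment in `diffcore_rename_extended()`; only the `DIFF_DETECT_*` values come from the patch itself.

```c
#include <assert.h>

/* Values as defined in the patch's diff.h hunk. */
#define DIFF_DETECT_RENAME      1
#define DIFF_DETECT_COPY        2
#define DIFF_DETECT_COPY_HARDER 3

/* Hypothetical stand-in for the relevant fields of struct diff_options. */
struct opts {
	int detect_rename;
	int find_copies_harder;
};

/* Mirrors the diff_setup_done() hunk, with the condition split at "||". */
void setup_done(struct opts *o)
{
	if (o->find_copies_harder ||
	    o->detect_rename == DIFF_DETECT_COPY_HARDER) {
		o->find_copies_harder = 1;
		o->detect_rename = DIFF_DETECT_COPY;
	}
	/* After folding, the "harder" mode lives only in the flag. */
	assert(o->detect_rename != DIFF_DETECT_COPY_HARDER);
}

/* Mirrors the diffcore_rename_extended() hunk, split the same way. */
int want_copies(int detect_rename)
{
	return (detect_rename == DIFF_DETECT_COPY ||
		detect_rename == DIFF_DETECT_COPY_HARDER);
}
```

The same wrapping would apply to the long conditions in `diff_opt_find_copies()` and `find_identical_files()`.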

Junio C Hamano Nov. 7, 2023, 3:10 a.m. UTC | #2
Elijah Newren <newren@gmail.com> writes:

> On Fri, Nov 3, 2023 at 4:25 AM Sam James via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Sam James <sam@gentoo.org>
>>
>> This patch adds a config value for 'diff.renames' called 'copies-harder'
>> which makes it so '-C -C' is in effect always passed for 'git log -p',
>> 'git diff', etc.
>>
>> This allows specifying that 'git log -p', 'git diff', etc. should always act
>> as if '-C --find-copies-harder' was passed.
>>
>> I've found this especially useful for certain types of repository (like
>> Gentoo's ebuild repositories) because files are often copies of a previous
>> version.
>
> These must be very small repositories?  --find-copies-harder is really
> expensive...

True.  "often copies of a previous version" means that it is a
directory that has a collection of subdirectories, one for each
version?  In a source tree managed in a version control system,
files are often rewritten in place from the previous version,
so I am puzzled by that justification.

It is, in the proposed log message of our commits, a bit unusual to
see "This patch does X" and "I do Y", by the way, which made my
reading hiccup a bit, but perhaps it is just me?

>> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
>> index bd5ae0c3378..d2ff3c62d41 100644
>> --- a/Documentation/config/diff.txt
>> +++ b/Documentation/config/diff.txt
>> @@ -131,7 +131,8 @@ diff.renames::
>>         Whether and how Git detects renames.  If set to "false",
>>         rename detection is disabled. If set to "true", basic rename
>>         detection is enabled.  If set to "copies" or "copy", Git will
>> -       detect copies, as well.  Defaults to true.  Note that this
>> +       detect copies, as well.  If set to "copies-harder", Git will try harder
>> +       to detect copies.  Defaults to true.  Note that this
>
> "try harder to detect copies" feels like an unhelpful explanation.

Yup.  "will spend extra cycles to find more copies", perhaps?

Elijah Newren Nov. 7, 2023, 5:19 p.m. UTC | #3
On Mon, Nov 6, 2023 at 7:10 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Fri, Nov 3, 2023 at 4:25 AM Sam James via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Sam James <sam@gentoo.org>
> >>
> >> This patch adds a config value for 'diff.renames' called 'copies-harder'
> >> which makes it so '-C -C' is in effect always passed for 'git log -p',
> >> 'git diff', etc.
> >>
> >> This allows specifying that 'git log -p', 'git diff', etc. should always act
> >> as if '-C --find-copies-harder' was passed.
> >>
> >> I've found this especially useful for certain types of repository (like
> >> Gentoo's ebuild repositories) because files are often copies of a previous
> >> version.
> >
> > These must be very small repositories?  --find-copies-harder is really
> > expensive...
>
> True.  "often copies of a previous version" means that it is a
> directory that has a collection of subdirectories, one for each
> version?  In a source tree managed in a version control system,
> files are often rewritten in place from the previous version,
> so I am puzzled by that justification.
>
> It is, in the proposed log message of our commits, a bit unusual to
> see "This patch does X" and "I do Y", by the way, which made my
> reading hiccup a bit, but perhaps it is just me?

I think I read Sam's description a bit differently than you.  My
assumption was they'd have files with names like the following in the
same directory:
   gcc-13.x.build.recipe
   gcc-12.x.build.recipe
   gcc-11.x.build.recipe
   gcc-10.x.build.recipe

And that gcc-13.x.build.recipe was started as a copy of
gcc-12.x.build.recipe (which was started as a copy of
gcc-11.x.build.recipe, etc.).  They keep all versions because they
want users to be able to build and install multiple gcc versions.

I could be completely off, but that's what I was imagining from the description.

> >> diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
> >> index bd5ae0c3378..d2ff3c62d41 100644
> >> --- a/Documentation/config/diff.txt
> >> +++ b/Documentation/config/diff.txt
> >> @@ -131,7 +131,8 @@ diff.renames::
> >>         Whether and how Git detects renames.  If set to "false",
> >>         rename detection is disabled. If set to "true", basic rename
> >>         detection is enabled.  If set to "copies" or "copy", Git will
> >> -       detect copies, as well.  Defaults to true.  Note that this
> >> +       detect copies, as well.  If set to "copies-harder", Git will try harder
> >> +       to detect copies.  Defaults to true.  Note that this
> >
> > "try harder to detect copies" feels like an unhelpful explanation.
>
> Yup.  "will spend extra cycles to find more copies", perhaps?

I find that marginally better; but I still don't think it answers the
user's question of why they should pick one option or the other.  The
wording for the `--find-copies-harder` option does explain when it's useful:

        For performance reasons, by default, `-C` option finds copies only
        if the original file of the copy was modified in the same
        changeset.  This flag makes the command
        inspect unmodified files as candidates for the source of
        copy.  This is a very expensive operation for large
        projects, so use it with caution.

We probably don't want to copy all three of those sentences here, but
I think we need to make sure users can find them, thus my suggestion
to reference the `--find-copies-harder` option to git-diff so that
affected users can get the info they need to choose.
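
The difference described in the quoted documentation can be modelled with a small sketch. This is a deliberately simplified model and not Git's implementation: `struct path_entry` and `count_copy_sources()` are hypothetical names, and real copy detection also scores content similarity, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one path in a changeset. */
struct path_entry {
	const char *name;
	int modified;	/* non-zero if the path was changed in this changeset */
};

/*
 * Count the paths that would be considered as possible copy sources.
 * With plain -C only paths modified in the changeset qualify; with
 * --find-copies-harder (or the proposed "copies-harder" value)
 * unmodified paths are inspected as well, which is what makes the
 * operation expensive in large trees.
 */
size_t count_copy_sources(const struct path_entry *paths, size_t nr,
			  int find_copies_harder)
{
	size_t i, count = 0;

	assert(paths || !nr);
	for (i = 0; i < nr; i++)
		if (paths[i].modified || find_copies_harder)
			count++;
	return count;
}
```

In the recipe scenario above, a new gcc-13.x.build.recipe copied from an untouched gcc-12.x.build.recipe has no modified source to match against, so under this model only the "harder" mode would consider gcc-12.x.build.recipe as a candidate.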

Junio C Hamano Nov. 8, 2023, 1:26 a.m. UTC | #4
Elijah Newren <newren@gmail.com> writes:

>> True.  "often copies of a previous version" means that it is a
>> directory that has a collection of subdirectories, one for each
>> version?  In a source tree managed in a version control system,
>> files are often rewritten in place from the previous version,
>> so I am puzzled by that justification.
>>
>> It is, in the proposed log message of our commits, a bit unusual to
>> see "This patch does X" and "I do Y", by the way, which made my
>> reading hiccup a bit, but perhaps it is just me?
>
> I think I read Sam's description a bit differently than you.  My
> assumption was they'd have files with names like the following in the
> same directory:
>    gcc-13.x.build.recipe
>    gcc-12.x.build.recipe
>    gcc-11.x.build.recipe
>    gcc-10.x.build.recipe
>
> And that gcc-13.x.build.recipe was started as a copy of
> gcc-12.x.build.recipe (which was started as a copy of
> gcc-11.x.build.recipe, etc.).  They keep all versions because they
> want users to be able to build and install multiple gcc versions.

OK, "previous version" is within the context of "variants of gcc",
and to us, there is no distinction among them (we do not care which
ones are older than the others---we need to keep track of them all).

Which makes sense.  OK.

> I find that marginally better; but I still don't think it answers the
> user's question of why they should pick one option or the other.  The
> wording for the `--find-copies-harder` option does explain when it's useful:
>
>         For performance reasons, by default, `-C` option finds copies only
>         if the original file of the copy was modified in the same
>         changeset.  This flag makes the command
>         inspect unmodified files as candidates for the source of
>         copy.  This is a very expensive operation for large
>         projects, so use it with caution.
>
> We probably don't want to copy all three of those sentences here, but
> I think we need to make sure users can find them, thus my suggestion
> to reference the `--find-copies-harder` option to git-diff so that
> affected users can get the info they need to choose.

"in addition to paths that are different, will look for more copies
even in unmodified paths" then?

Elijah Newren Nov. 8, 2023, 3:30 a.m. UTC | #5
On Tue, Nov 7, 2023 at 5:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > I find that marginally better; but I still don't think it answers the
> > user's question of why they should pick one option or the other.  The
> > wording for the `--find-copies-harder` option does explain when it's useful:
> >
> >         For performance reasons, by default, `-C` option finds copies only
> >         if the original file of the copy was modified in the same
> >         changeset.  This flag makes the command
> >         inspect unmodified files as candidates for the source of
> >         copy.  This is a very expensive operation for large
> >         projects, so use it with caution.
> >
> > We probably don't want to copy all three of those sentences here, but
> > I think we need to make sure users can find them, thus my suggestion
> > to reference the `--find-copies-harder` option to git-diff so that
> > affected users can get the info they need to choose.
>
> "in addition to paths that are different, will look for more copies
> even in unmodified paths" then?

That's much better.  I still slightly prefer referencing
`--find-copies-harder` so that there's a link between "copies-harder"
and `--find-copies-harder`; but this version would also be fine.

Junio C Hamano Nov. 8, 2023, 4:06 a.m. UTC | #6
Elijah Newren <newren@gmail.com> writes:

>> > We probably don't want to copy all three of those sentences here, but
>> > I think we need to make sure users can find them, thus my suggestion
>> > to reference the `--find-copies-harder` option to git-diff so that
>> > affected users can get the info they need to choose.
>>
>> "in addition to paths that are different, will look for more copies
>> even in unmodified paths" then?
>
> That's much better.  I still slightly prefer referencing
> `--find-copies-harder` so that there's a link between "copies-harder"
> and `--find-copies-harder`; but this version would also be fine.

Oh, I didn't mean "use this rewrite and do not make any external
reference".  More like "external reference is a good idea and
necessary to help motivated readers, but we should give enough
information inline, and I think this level of detail would be
sufficient".

Thanks.

Elijah Newren Nov. 8, 2023, 4:38 a.m. UTC | #7
On Tue, Nov 7, 2023 at 8:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> > We probably don't want to copy all three of those sentences here, but
> >> > I think we need to make sure users can find them, thus my suggestion
> >> > to reference the `--find-copies-harder` option to git-diff so that
> >> > affected users can get the info they need to choose.
> >>
> >> "in addition to paths that are different, will look for more copies
> >> even in unmodified paths" then?
> >
> > That's much better.  I still slightly prefer referencing
> > `--find-copies-harder` so that there's a link between "copies-harder"
> > and `--find-copies-harder`; but this version would also be fine.
>
> Oh, I didn't mean "use this rewrite and do not make any external
> reference".  More like "external reference is a good idea and
> necessary to help motivated readers, but we should give enough
> information inline, and I think this level of detail would be
> sufficient".

Ah, gotcha.  Yeah, this sounds good.

Sam James March 11, 2024, 9:42 p.m. UTC | #8
"Sam James via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Sam James <sam@gentoo.org>
>
> This patch adds a config value for 'diff.renames' called 'copies-harder'
> which makes it so '-C -C' is in effect always passed for 'git log -p',
> 'git diff', etc.
>
> This allows specifying that 'git log -p', 'git diff', etc. should always act
> as if '-C --find-copies-harder' was passed.
>
> I've found this especially useful for certain types of repository (like
> Gentoo's ebuild repositories) because files are often copies of a previous
> version.
>
> Signed-off-by: Sam James <sam@gentoo.org>
> ---
>     diff: implement config.diff.renames=copies-harder
>     
>     This patch adds a config value for 'diff.renames' called 'copies-harder'
>     which makes it so '-C -C' is in effect always passed for 'git log -p',
>     'git diff', etc.
>     
>     This allows specifying that 'git log -p', 'git diff', etc. should always
>     act as if '-C --find-copies-harder' was passed.
>     
>     I've found this especially useful for certain types of repository (like
>     Gentoo's ebuild repositories) because files are often copies of a
>     previous version.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1606%2Fthesamesam%2Fconfig-copies-harder-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1606/thesamesam/config-copies-harder-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1606
>

v2: https://lore.kernel.org/git/20231226202102.3392518-1-sam@gentoo.org/
v3: https://lore.kernel.org/git/20240311213928.1872437-1-sam@gentoo.org/

Patch

diff --git a/Documentation/config/diff.txt b/Documentation/config/diff.txt
index bd5ae0c3378..d2ff3c62d41 100644
--- a/Documentation/config/diff.txt
+++ b/Documentation/config/diff.txt
@@ -131,7 +131,8 @@  diff.renames::
 	Whether and how Git detects renames.  If set to "false",
 	rename detection is disabled. If set to "true", basic rename
 	detection is enabled.  If set to "copies" or "copy", Git will
-	detect copies, as well.  Defaults to true.  Note that this
+	detect copies, as well.  If set to "copies-harder", Git will try harder
+	to detect copies.  Defaults to true.  Note that this
 	affects only 'git diff' Porcelain like linkgit:git-diff[1] and
 	linkgit:git-log[1], and not lower level commands such as
 	linkgit:git-diff-files[1].
diff --git a/Documentation/config/status.txt b/Documentation/config/status.txt
index 2ff8237f8fc..7ca7a4becd7 100644
--- a/Documentation/config/status.txt
+++ b/Documentation/config/status.txt
@@ -33,7 +33,8 @@  status.renames::
 	Whether and how Git detects renames in linkgit:git-status[1] and
 	linkgit:git-commit[1] .  If set to "false", rename detection is
 	disabled. If set to "true", basic rename detection is enabled.
-	If set to "copies" or "copy", Git will detect copies, as well.
+	If set to "copies" or "copy", Git will detect copies, as well.  If
+	set to "copies-harder", Git will try harder to detect copies.
 	Defaults to the value of diff.renames.
 
 status.showStash::
diff --git a/diff.c b/diff.c
index 2c602df10a3..0ca906611f5 100644
--- a/diff.c
+++ b/diff.c
@@ -206,8 +206,11 @@  int git_config_rename(const char *var, const char *value)
 {
 	if (!value)
 		return DIFF_DETECT_RENAME;
+	if (!strcasecmp(value, "copies-harder"))
+		return DIFF_DETECT_COPY_HARDER;
 	if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
-		return  DIFF_DETECT_COPY;
+		return DIFF_DETECT_COPY;
+
 	return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
 }
 
@@ -4832,8 +4835,11 @@  void diff_setup_done(struct diff_options *options)
 	else
 		options->flags.diff_from_contents = 0;
 
-	if (options->flags.find_copies_harder)
+	/* Just fold this in as it makes the patch-to-git smaller */
+	if (options->flags.find_copies_harder || options->detect_rename == DIFF_DETECT_COPY_HARDER) {
+		options->flags.find_copies_harder = 1;
 		options->detect_rename = DIFF_DETECT_COPY;
+	}
 
 	if (!options->flags.relative_name)
 		options->prefix = NULL;
@@ -5264,7 +5270,7 @@  static int diff_opt_find_copies(const struct option *opt,
 	if (*arg != 0)
 		return error(_("invalid argument to %s"), opt->long_name);
 
-	if (options->detect_rename == DIFF_DETECT_COPY)
+	if (options->detect_rename == DIFF_DETECT_COPY || options->detect_rename == DIFF_DETECT_COPY_HARDER)
 		options->flags.find_copies_harder = 1;
 	else
 		options->detect_rename = DIFF_DETECT_COPY;
diff --git a/diff.h b/diff.h
index 66bd8aeb293..b29e5b777f8 100644
--- a/diff.h
+++ b/diff.h
@@ -555,6 +555,7 @@  int git_config_rename(const char *var, const char *value);
 
 #define DIFF_DETECT_RENAME	1
 #define DIFF_DETECT_COPY	2
+#define DIFF_DETECT_COPY_HARDER 3
 
 #define DIFF_PICKAXE_ALL	1
 #define DIFF_PICKAXE_REGEX	2
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 5a6e2bcac71..856291d66f2 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -299,7 +299,7 @@  static int find_identical_files(struct hashmap *srcs,
 		}
 		/* Give higher scores to sources that haven't been used already */
 		score = !source->rename_used;
-		if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY)
+		if (source->rename_used && options->detect_rename != DIFF_DETECT_COPY && options->detect_rename != DIFF_DETECT_COPY_HARDER)
 			continue;
 		score += basename_same(source, target);
 		if (score > best_score) {
@@ -1405,7 +1405,7 @@  void diffcore_rename_extended(struct diff_options *options,
 	trace2_region_enter("diff", "setup", options->repo);
 	info.setup = 0;
 	assert(!dir_rename_count || strmap_empty(dir_rename_count));
-	want_copies = (detect_rename == DIFF_DETECT_COPY);
+	want_copies = (detect_rename == DIFF_DETECT_COPY || detect_rename == DIFF_DETECT_COPY_HARDER);
 	if (dirs_removed && (break_idx || want_copies))
 		BUG("dirs_removed incompatible with break/copy detection");
 	if (break_idx && relevant_sources)
diff --git a/merge-ort.c b/merge-ort.c
index 6491070d965..77498354652 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4782,7 +4782,7 @@  static void merge_start(struct merge_options *opt, struct merge_result *result)
 	 * sanity check them anyway.
 	 */
 	assert(opt->detect_renames >= -1 &&
-	       opt->detect_renames <= DIFF_DETECT_COPY);
+	       opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
 	assert(opt->verbosity >= 0 && opt->verbosity <= 5);
 	assert(opt->buffer_output <= 2);
 	assert(opt->obuf.len == 0);
diff --git a/merge-recursive.c b/merge-recursive.c
index e3beb0801b1..d52dd536606 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3708,7 +3708,7 @@  static int merge_start(struct merge_options *opt, struct tree *head)
 	assert(opt->branch1 && opt->branch2);
 
 	assert(opt->detect_renames >= -1 &&
-	       opt->detect_renames <= DIFF_DETECT_COPY);
+	       opt->detect_renames <= DIFF_DETECT_COPY_HARDER);
 	assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
 	       opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
 	assert(opt->rename_limit >= -1);
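
As a rough standalone sketch of how the patch maps config values to detection modes, the logic of the `git_config_rename()` hunk can be reproduced outside of Git as below. The names `parse_rename_value()`, `simple_bool()` and `check_detect_renames()` are hypothetical, and `simple_bool()` is a much-reduced substitute for `git_config_bool()` that accepts only a few spellings, so this illustrates the control flow rather than Git's exact behaviour.

```c
#include <assert.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Values as defined in the patch's diff.h hunk. */
#define DIFF_DETECT_RENAME      1
#define DIFF_DETECT_COPY        2
#define DIFF_DETECT_COPY_HARDER 3

/* Hypothetical, much-reduced replacement for git_config_bool(). */
static int simple_bool(const char *value)
{
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcasecmp(value, "1");
}

/* Same branch order as the git_config_rename() hunk in the patch. */
int parse_rename_value(const char *value)
{
	if (!value)
		return DIFF_DETECT_RENAME;
	if (!strcasecmp(value, "copies-harder"))
		return DIFF_DETECT_COPY_HARDER;
	if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
		return DIFF_DETECT_COPY;
	return simple_bool(value) ? DIFF_DETECT_RENAME : 0;
}

/* Mirrors the widened range check from the merge-ort.c hunk. */
void check_detect_renames(int detect_renames)
{
	assert(detect_renames >= -1 &&
	       detect_renames <= DIFF_DETECT_COPY_HARDER);
}
```

Because the new mode takes the value 3, any range check that previously stopped at `DIFF_DETECT_COPY` has to be widened, which is why the patch touches the assertions in merge-ort.c and merge-recursive.c.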