diff mbox series

[RFC] gitk: Activate --find-copies-harder

Message ID b12574f0-3ebc-95c0-9def-555150257e46@posteo.net (mailing list archive)
State New, archived
Headers show
Series [RFC] gitk: Activate --find-copies-harder | expand

Commit Message

Robert Pollak Jan. 4, 2021, 7:54 p.m. UTC
Naively forward the diff arguments to make --find-copies-harder work.

Signed-off-by: Robert Pollak <robert.pollak@posteo.net>
---
Dear Paul Mackerras,

This patch is an obviously naive attempt to make gitk observe
--find-copies-harder. I am a gitk user with many small projects, so I am
currently using this patch to get better diffs.

With this, gitk displays the copy in the second commit of my test
repository [1] as desired:

similarity index 100%
copy from a
copy to b


I see the following problems with my patch:

1) It is totally untested with all the other args that are collected in
diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.

2) Even if --find-copies-harder is the only diff argument used, there
might be unintended side effects, since the modified procedure 'diffcmd'
is called in several places. I did not systematically test all these
code paths.


To deal with 1), I could rename the variable 'vdflags' to
'vdflags_ignored' and continue using 'vdflags' only for
--find-copies-harder. Later, other flags could be moved over, after
their harmlessness has been proven. Would this be good?

Ad 2), maybe someone with more code knowledge can tell whether this
a real risk? Also, would it be preferable to add the new flag(s) only to
the arguments of the diffcmd call in 'getblobdiffs'?
(as in:
             lappend flags --root
         }
-        set cmd [concat | git diff-tree -r $flags $ids]
+        set cmd [concat | git diff-tree -r $vdflags($curview) $flags $ids]
     }
     return $cmd
 }

Comments

Laszlo Ersek Jan. 6, 2021, 3:58 p.m. UTC | #1
On 01/04/21 20:54, Robert Pollak wrote:
> Naively forward the diff arguments to make --find-copies-harder work.
> 
> Signed-off-by: Robert Pollak <robert.pollak@posteo.net>
> ---
> Dear Paul Mackerras,
> 
> This patch is an obviously naive attempt to make gitk observe
> --find-copies-harder. I am a gitk user with many small projects, so I am
> currently using this patch to get better diffs.
> 
> With this, gitk displays the copy in the second commit of my test
> repository [1] as desired:
> 
> similarity index 100%
> copy from a
> copy to b
> 
> 
> I see the following problems with my patch:
> 
> 1) It is totally untested with all the other args that are collected in
> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.

It would be really great if gitk supported both "-O<orderfile>" and
--find-copies-harder!

I can't offer a review, but I very much support the use case.

Thanks,
Laszlo

> 
> 2) Even if --find-copies-harder is the only diff argument used, there
> might be unintended side effects, since the modified procedure 'diffcmd'
> is called in several places. I did not systematically test all these
> code paths.
> 
> 
> To deal with 1), I could rename the variable 'vdflags' to
> 'vdflags_ignored' and continue using 'vdflags' only for
> --find-copies-harder. Later, other flags could be moved over, after
> their harmlessness has been proven. Would this be good?
> 
> Ad 2), maybe someone with more code knowledge can tell whether this
> a real risk? Also, would it be preferable to add the new flag(s) only to
> the arguments of the diffcmd call in 'getblobdiffs'?
> (as in:
> diff --git a/gitk b/gitk
> index 23d9dd1..da6b372 100755
> --- a/gitk
> +++ b/gitk
> @@ -8017,7 +8017,7 @@ proc initblobdiffvars {} {
>  proc getblobdiffs {ids} {
>      global blobdifffd diffids env
>      global treediffs
> -    global diffcontext
> +    global vdflags diffcontext
>      global ignorespace
>      global worddiff
>      global limitdiffs vfilelimit curview
> @@ -8031,7 +8031,7 @@ proc getblobdiffs {ids} {
>      if {[package vcompare $git_version "1.6.6"] >= 0} {
>          set submodule "--submodule"
>      }
> -    set cmd [diffcmd $ids "-p $textconv $submodule  -C --cc --no-commit-id -U$diffcontext"]
> +    set cmd [diffcmd $ids "$vdflags($curview) -p $textconv $submodule  -C --cc --no-commit-id -U$diffcontext"]
>      if {$ignorespace} {
>          append cmd " -w"
>      }
> )
> For my test case, this also works.
> 
> 
> I'd be happy to prepare an updated patch incorporating your feedback.
> 
> Having this functionality in gitk will hopefully make some people stop
> crafting their git history for copy detection, like described e.g. in
> https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904 .
> 
> I am CCing Laszlo Ersek, because he expressed interest in a similar
> topic a year ago:
> 'gitk feature requests: (1) "diff.orderFile" and (2) "--function-context"',
> https://public-inbox.org/git/d972c1f1-c49a-f644-ab1c-6a3e26c43ee3@redhat.com/
> .
> 
> -- Robert
> 
> [1] My minimal test case:
>> git init
>> echo "a file" > a
>> git add a
>> git commit -m "a file"
>> cp a b
>> git add b
>> git commit -m "a copy"
> 
> 
>  gitk | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gitk b/gitk
> index 23d9dd1..eb6ba9a 100755
> --- a/gitk
> +++ b/gitk
> @@ -7869,7 +7869,7 @@ proc addtocflist {ids} {
>  }
>   proc diffcmd {ids flags} {
> -    global log_showroot nullid nullid2 git_version
> +    global log_showroot nullid nullid2 git_version vdflags curview
>       set i [lsearch -exact $ids $nullid]
>      set j [lsearch -exact $ids $nullid2]
> @@ -7909,7 +7909,7 @@ proc diffcmd {ids flags} {
>          if {$log_showroot} {
>              lappend flags --root
>          }
> -        set cmd [concat | git diff-tree -r $flags $ids]
> +        set cmd [concat | git diff-tree -r $vdflags($curview) $flags $ids]
>      }
>      return $cmd
>  }
>
Robert Pollak Jan. 10, 2021, 12:59 p.m. UTC | #2
On 2021-01-06 16:58, Laszlo Ersek wrote:
> On 01/04/21 20:54, Robert Pollak wrote:
[...]
>> I see the following problems with my patch:
>>
>> 1) It is totally untested with all the other args that are collected in
>> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.
> 
> It would be really great if gitk supported both "-O<orderfile>" and
> --find-copies-harder!

Can you please test these options with my patch and report back?

-- Robert
Laszlo Ersek Jan. 11, 2021, 9:21 a.m. UTC | #3
On 01/10/21 13:59, Robert Pollak wrote:
> On 2021-01-06 16:58, Laszlo Ersek wrote:
>> On 01/04/21 20:54, Robert Pollak wrote:
> [...]
>>> I see the following problems with my patch:
>>>
>>> 1) It is totally untested with all the other args that are collected in
>>> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.
>>
>> It would be really great if gitk supported both "-O<orderfile>" and
>> --find-copies-harder!
>
> Can you please test these options with my patch and report back?
>
> -- Robert
>

The patch doesn't apply with git-am (I'm trying on top of 72c4083ddf91):

> Applying: gitk: Activate --find-copies-harder
> error: corrupt patch at line 100
> Patch failed at 0001 gitk: Activate --find-copies-harder
> hint: Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

One problem could be the embedded diff in the notes section (I guess it
could confuse git-am).

Also, "gitk" has existed at "gitk-git/gitk" since commit 62ba5143ec2a
("Move gitk to its own subdirectory", 2007-11-18), so the pathname
headers in the patch look wrong.

Thanks
Laszlo
Pratyush Yadav Jan. 11, 2021, 12:33 p.m. UTC | #4
On Mon, Jan 11 2021, Laszlo Ersek wrote:

> On 01/10/21 13:59, Robert Pollak wrote:
>> On 2021-01-06 16:58, Laszlo Ersek wrote:
>>> On 01/04/21 20:54, Robert Pollak wrote:
>> [...]
>>>> I see the following problems with my patch:
>>>>
>>>> 1) It is totally untested with all the other args that are collected in
>>>> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.
>>>
>>> It would be really great if gitk supported both "-O<orderfile>" and
>>> --find-copies-harder!
>>
>> Can you please test these options with my patch and report back?
>>
>> -- Robert
>>
>
> The patch doesn't apply with git-am (I'm trying on top of 72c4083ddf91):
>
>> Applying: gitk: Activate --find-copies-harder
>> error: corrupt patch at line 100
>> Patch failed at 0001 gitk: Activate --find-copies-harder
>> hint: Use 'git am --show-current-patch' to see the failed patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>
> One problem could be the embedded diff in the notes section (I guess it
> could confuse git-am).
>
> Also, "gitk" has existed at "gitk-git/gitk" since commit 62ba5143ec2a
> ("Move gitk to its own subdirectory", 2007-11-18), so the pathname
> headers in the patch look wrong.

gitk is maintained as a separate repo by Paul Mackerras at
git://ozlabs.org/~paulus/gitk, and then is pulled into the main Git repo
from time to time using a subtree merge. That's how gitk changes end up
in gitk-git/. Patches for gitk should be based on the gitk repo to make
it easier for Paul to apply them. In short, the paths are fine.

> Thanks
> Laszlo
>
Laszlo Ersek Jan. 11, 2021, 4:28 p.m. UTC | #5
On 01/11/21 13:33, Pratyush Yadav wrote:
> On Mon, Jan 11 2021, Laszlo Ersek wrote:
> 
>> On 01/10/21 13:59, Robert Pollak wrote:
>>> On 2021-01-06 16:58, Laszlo Ersek wrote:
>>>> On 01/04/21 20:54, Robert Pollak wrote:
>>> [...]
>>>>> I see the following problems with my patch:
>>>>>
>>>>> 1) It is totally untested with all the other args that are collected in
>>>>> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.
>>>>
>>>> It would be really great if gitk supported both "-O<orderfile>" and
>>>> --find-copies-harder!
>>>
>>> Can you please test these options with my patch and report back?
>>>
>>> -- Robert
>>>
>>
>> The patch doesn't apply with git-am (I'm trying on top of 72c4083ddf91):
>>
>>> Applying: gitk: Activate --find-copies-harder
>>> error: corrupt patch at line 100
>>> Patch failed at 0001 gitk: Activate --find-copies-harder
>>> hint: Use 'git am --show-current-patch' to see the failed patch
>>> When you have resolved this problem, run "git am --continue".
>>> If you prefer to skip this patch, run "git am --skip" instead.
>>> To restore the original branch and stop patching, run "git am --abort".
>>
>> One problem could be the embedded diff in the notes section (I guess it
>> could confuse git-am).
>>
>> Also, "gitk" has existed at "gitk-git/gitk" since commit 62ba5143ec2a
>> ("Move gitk to its own subdirectory", 2007-11-18), so the pathname
>> headers in the patch look wrong.
> 
> gitk is maintained as a separate repo by Paul Mackerras at
> git://ozlabs.org/~paulus/gitk, and then is pulled into the main Git repo
> from time to time using a subtree merge. That's how gitk changes end up
> in gitk-git/. Patches for gitk should be based on the gitk repo to make
> it easier for Paul to apply them.

Thank you for explaining!

(I guess I'm more familiar with submodules than the subtree merge
strategy in general; I've had to look up the latter on the web now.)

> In short, the paths are fine.

I tried applying the patch again, now on top of 6cd80496e9d3 ("gitk:
Resize panes correctly when reducing window size", 2020-10-03) -- it
turns out I'd actually had a clone of the stand-alone gitk repo you
mention, I just needed to pull.

But: git-am fails the same way.

Thanks
Laszlo
Junio C Hamano Jan. 11, 2021, 9 p.m. UTC | #6
Laszlo Ersek <lersek@redhat.com> writes:

> On 01/11/21 13:33, Pratyush Yadav wrote:
>> On Mon, Jan 11 2021, Laszlo Ersek wrote:
>> 
>>> On 01/10/21 13:59, Robert Pollak wrote:
>>>> On 2021-01-06 16:58, Laszlo Ersek wrote:
>>>>> On 01/04/21 20:54, Robert Pollak wrote:
>>>> [...]
>>>>>> I see the following problems with my patch:
>>>>>>
>>>>>> 1) It is totally untested with all the other args that are collected in
>>>>>> diffargs, like e.g. "-O<orderfile>", since I didn't need them yet.
>>>>>
>>>>> It would be really great if gitk supported both "-O<orderfile>" and
>>>>> --find-copies-harder!
>>>>
>>>> Can you please test these options with my patch and report back?
>>>>
>>>> -- Robert
>>>>
>>>
>>> The patch doesn't apply with git-am (I'm trying on top of 72c4083ddf91):
>>>
>>>> Applying: gitk: Activate --find-copies-harder
>>>> error: corrupt patch at line 100
>>>> Patch failed at 0001 gitk: Activate --find-copies-harder
>>>> hint: Use 'git am --show-current-patch' to see the failed patch
>>>> When you have resolved this problem, run "git am --continue".
>>>> If you prefer to skip this patch, run "git am --skip" instead.
>>>> To restore the original branch and stop patching, run "git am --abort".
>>>
>>> One problem could be the embedded diff in the notes section (I guess it
>>> could confuse git-am).
>>>
>>> Also, "gitk" has existed at "gitk-git/gitk" since commit 62ba5143ec2a
>>> ("Move gitk to its own subdirectory", 2007-11-18), so the pathname
>>> headers in the patch look wrong.
>> 
>> gitk is maintained as a separate repo by Paul Mackerras at
>> git://ozlabs.org/~paulus/gitk, and then is pulled into the main Git repo
>> from time to time using a subtree merge. That's how gitk changes end up
>> in gitk-git/. Patches for gitk should be based on the gitk repo to make
>> it easier for Paul to apply them.
>
> Thank you for explaining!
>
> (I guess I'm more familiar with submodules than the subtree merge
> strategy in general; I've had to look up the latter on the web now.)
>
>> In short, the paths are fine.
>
> I tried applying the patch again, now on top of 6cd80496e9d3 ("gitk:
> Resize panes correctly when reducing window size", 2020-10-03) -- it
> turns out I'd actually had a clone of the stand-alone gitk repo you
> mention, I just needed to pull.
>
> But: git-am fails the same way.

That is because it has a patch-looking material that is not to be
applied before the real patch.  It is customery to indent such "diff"
that is meant to be read by humans as part of reading log message
and justification of the patch, not the patch itself.
diff mbox series

Patch

diff --git a/gitk b/gitk
index 23d9dd1..da6b372 100755
--- a/gitk
+++ b/gitk
@@ -8017,7 +8017,7 @@  proc initblobdiffvars {} {
 proc getblobdiffs {ids} {
     global blobdifffd diffids env
     global treediffs
-    global diffcontext
+    global vdflags diffcontext
     global ignorespace
     global worddiff
     global limitdiffs vfilelimit curview
@@ -8031,7 +8031,7 @@  proc getblobdiffs {ids} {
     if {[package vcompare $git_version "1.6.6"] >= 0} {
         set submodule "--submodule"
     }
-    set cmd [diffcmd $ids "-p $textconv $submodule  -C --cc --no-commit-id -U$diffcontext"]
+    set cmd [diffcmd $ids "$vdflags($curview) -p $textconv $submodule  -C --cc --no-commit-id -U$diffcontext"]
     if {$ignorespace} {
         append cmd " -w"
     }
)
For my test case, this also works.


I'd be happy to prepare an updated patch incorporating your feedback.

Having this functionality in gitk will hopefully make some people stop
crafting their git history for copy detection, like described e.g. in
https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904 .

I am CCing Laszlo Ersek, because he expressed interest in a similar
topic a year ago:
'gitk feature requests: (1) "diff.orderFile" and (2) "--function-context"',
https://public-inbox.org/git/d972c1f1-c49a-f644-ab1c-6a3e26c43ee3@redhat.com/
.

-- Robert

[1] My minimal test case:
> git init
> echo "a file" > a
> git add a
> git commit -m "a file"
> cp a b
> git add b
> git commit -m "a copy"


 gitk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gitk b/gitk
index 23d9dd1..eb6ba9a 100755
--- a/gitk
+++ b/gitk
@@ -7869,7 +7869,7 @@  proc addtocflist {ids} {
 }
  proc diffcmd {ids flags} {
-    global log_showroot nullid nullid2 git_version
+    global log_showroot nullid nullid2 git_version vdflags curview
      set i [lsearch -exact $ids $nullid]
     set j [lsearch -exact $ids $nullid2]
@@ -7909,7 +7909,7 @@  proc diffcmd {ids flags} {
         if {$log_showroot} {