diff mbox series

[6/6] blame: enable funcname blaming with userdiff driver

Message ID a1e1c977d0978424fb07c97be0479f43a325cbea.1603889270.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series blame: enable funcname blaming with userdiff driver | expand

Commit Message

Philippe Blain Oct. 28, 2020, 12:47 p.m. UTC
From: Philippe Blain <levraiphilippeblain@gmail.com>

In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
blame_scoreboard' as the 'path' argument to
'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
to the local variable 'path' a few lines later at line 1137.

This 'path' argument is only used in 'parse_range_arg' if we are blaming
a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
is sent to 'parse_range_funcname', where it is used to determine if a
userdiff driver should be used for said <path> to match the given
funcname.

Since 'path' is yet unset, the userdiff driver is never used, so we fall
back to the default funcname regex, which is usually not appropriate for
paths that are set to use a specific userdiff driver, and thus either we
match some unrelated lines, or we die with

    fatal: -L parameter '<funcname>' starting at line 1: no match

This has been the case ever since `git blame` learned to blame a
funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
funcname, 2013-03-28).

Enable funcname blaming for paths using specific userdiff drivers by
sending the local variable 'path' to 'parse_range_arg' instead of the
yet unset 'sb.path'.

Add a regression test in 'annotate-tests.sh', which is sourced in
t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used
to test the userdiff patterns in t4018-diff-funcname.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
---
 builtin/blame.c     |  2 +-
 t/annotate-tests.sh | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Oct. 29, 2020, 8:40 p.m. UTC | #1
"Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
> blame_scoreboard' as the 'path' argument to
> 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
> to the local variable 'path' a few lines later at line 1137.
>
> This 'path' argument is only used in 'parse_range_arg' if we are blaming
> a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
> is sent to 'parse_range_funcname', where it is used to determine if a
> userdiff driver should be used for said <path> to match the given
> funcname.
>
> Since 'path' is yet unset, the userdiff driver is never used, so we fall
> back to the default funcname regex, which is usually not appropriate for
> paths that are set to use a specific userdiff driver, and thus either we
> match some unrelated lines, or we die with
>
>     fatal: -L parameter '<funcname>' starting at line 1: no match
>
> This has been the case ever since `git blame` learned to blame a
> funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
> funcname, 2013-03-28).

Good find.

> Enable funcname blaming for paths using specific userdiff drivers by
> sending the local variable 'path' to 'parse_range_arg' instead of the
> yet unset 'sb.path'.

Hmph.  The reason why sb.path is not set to path at this point but
later gets set is?  I am wondering if this is leaving an unneeded
room for sb.path and path to diverge by mistake.  IOW, I wonder if
it is better to set sb.path as early as we figure out which path we
are interested in, forget we have local variable "path" and use
sb.path consistently throughout the code.

But that is merely a potential future clean-up.  The local variable
path is still used one more time in the error message given when
this parse_range_arg() fails, so at least this change makes the use
of path more consistent.  I like the simplicity of this fix.

> Add a regression test in 'annotate-tests.sh', which is sourced in
> t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used
> to test the userdiff patterns in t4018-diff-funcname.

Thanks, that change also makes sense.
Philippe Blain Oct. 31, 2020, 6:02 p.m. UTC | #2
> Le 29 oct. 2020 à 16:40, Junio C Hamano <gitster@pobox.com> a écrit :
> 
> "Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>> 
>> In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
>> blame_scoreboard' as the 'path' argument to
>> 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
>> to the local variable 'path' a few lines later at line 1137.
>> 
>> This 'path' argument is only used in 'parse_range_arg' if we are blaming
>> a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
>> is sent to 'parse_range_funcname', where it is used to determine if a
>> userdiff driver should be used for said <path> to match the given
>> funcname.
>> 
>> Since 'path' is yet unset, the userdiff driver is never used, so we fall
>> back to the default funcname regex, which is usually not appropriate for
>> paths that are set to use a specific userdiff driver, and thus either we
>> match some unrelated lines, or we die with
>> 
>>    fatal: -L parameter '<funcname>' starting at line 1: no match
>> 
>> This has been the case ever since `git blame` learned to blame a
>> funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
>> funcname, 2013-03-28).
> 
> Good find.
> 
>> Enable funcname blaming for paths using specific userdiff drivers by
>> sending the local variable 'path' to 'parse_range_arg' instead of the
>> yet unset 'sb.path'.
> 
> Hmph.  The reason why sb.path is not set to path at this point but
> later gets set is?  I am wondering if this is leaving an unneeded
> room for sb.path and path to diverge by mistake.  IOW, I wonder if
> it is better to set sb.path as early as we figure out which path we
> are interested in, forget we have local variable "path" and use
> sb.path consistently throughout the code.

I asked myself the same question, and could not come up with a 
good answer. I don't know if you read what I wrote in the cover letter
(which I would have included in the in-patch commentary for this here
patch if GitGitGadget had this feature), but I'm copying it here just 
in case you did not: 

* In patch 6, I considered fixing that bug in a different way by
  initializing sb.path inside blame.c::setup_scoreboard. This function
  already receives 'path' as its second argument, so changing that would not
  impact the API. Probably Thomas thought that sb.path was already
  initialized in sb when he modified builtin/blame.c::prepare_blame_range 
  to also send sb.path to line-range.c::parse_range_arg in 13b8f68c1f (log
  -L: :pattern:file syntax to find by funcname, 2013-03-28). 

  Initializing the path in setup_scoreboard would mean we could also
  simplify the API of blame.c::setup_blame_bloom_data since it would not
  need to receive path separately and so its second argument could be
  removed. I went for the simpler alternative of just sending 'path' to 
  parse_range_arg instead of sb.path since it felt simpler, but if we feel
  it would be better to actually initialize sb.path in setup_scoreboard,
  I'll gladly tweak that for v2.

> But that is merely a potential future clean-up.  The local variable
> path is still used one more time in the error message given when
> this parse_range_arg() fails, so at least this change makes the use
> of path more consistent.  I like the simplicity of this fix.

I also like its simplicity, and that's why I chose to submit this as v1.
But I completely agree with you that it is "dangerous" in the sense
that some further modifications to the code could then make the same mistake
and use 'sb.path' thinking it is defined when it is not.

So I'm thinking of instead initializing it in setup_scoreboard for v2.

Cheers,

Philippe.
Junio C Hamano Oct. 31, 2020, 6:58 p.m. UTC | #3
Philippe Blain <levraiphilippeblain@gmail.com> writes:

> * In patch 6, I considered fixing that bug in a different way by
>   initializing sb.path inside blame.c::setup_scoreboard. This function
>   already receives 'path' as its second argument, so changing that would not
>   impact the API. Probably Thomas thought that sb.path was already
>   initialized in sb when he modified builtin/blame.c::prepare_blame_range 
>   to also send sb.path to line-range.c::parse_range_arg in 13b8f68c1f (log
>   -L: :pattern:file syntax to find by funcname, 2013-03-28). 
>
>   Initializing the path in setup_scoreboard would mean we could also
>   simplify the API of blame.c::setup_blame_bloom_data since it would not
>   need to receive path separately and so its second argument could be
>   removed. I went for the simpler alternative of just sending 'path' to 
>   parse_range_arg instead of sb.path since it felt simpler, but if we feel
>   it would be better to actually initialize sb.path in setup_scoreboard,
>   I'll gladly tweak that for v2.
>
>> But that is merely a potential future clean-up.  The local variable
>> path is still used one more time in the error message given when
>> this parse_range_arg() fails, so at least this change makes the use
>> of path more consistent.  I like the simplicity of this fix.
>
> I also like its simplicity, and that's why I chose to submit this as v1.
> But I completely agree with you that it is "dangerous" in the sense
> that some further modifications to the code could then make the same mistake
> and use 'sb.path' thinking it is defined when it is not.
>
> So I'm thinking of instead initializing it in setup_scoreboard for v2.

That does sound like a sensible way to clean it up.  Thanks.
diff mbox series

Patch

diff --git a/builtin/blame.c b/builtin/blame.c
index 05f69211dd..917fedc635 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1104,7 +1104,7 @@  int cmd_blame(int argc, const char **argv, const char *prefix)
 		long bottom, top;
 		if (parse_range_arg(range_list.items[range_i].string,
 				    nth_line_cb, &sb, lno, anchor,
-				    &bottom, &top, sb.path,
+				    &bottom, &top, path,
 				    the_repository->index))
 			usage(blame_usage);
 		if ((!lno && (top || bottom)) || lno < bottom)
diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
index d933af5714..3aee61d2cc 100644
--- a/t/annotate-tests.sh
+++ b/t/annotate-tests.sh
@@ -479,6 +479,24 @@  test_expect_success 'blame -L ^:RE (absolute: end-of-file)' '
 	check_count -f hello.c -L$n -L^:ma.. F 4 G 1 H 1
 '
 
+test_expect_success 'setup -L :funcname with userdiff driver' '
+	echo "fortran-* diff=fortran" >.gitattributes &&
+	fortran_file=fortran-external-function &&
+	orig_file="$TEST_DIRECTORY/t4018/$fortran_file" &&
+	cp $orig_file . &&
+	git add $fortran_file &&
+	GIT_AUTHOR_NAME="A" GIT_AUTHOR_EMAIL="A@test.git" \
+	git commit -m "add fortran file" &&
+	sed -e "s/ChangeMe/IWasChanged/" <"$orig_file" >$fortran_file &&
+	git add $fortran_file &&
+	GIT_AUTHOR_NAME="B" GIT_AUTHOR_EMAIL="B@test.git" \
+	git commit -m "change fortran file"
+'
+
+test_expect_success 'blame -L :funcname with userdiff driver' '
+	check_count -f fortran-external-function -L:RIGHT A 7 B 1
+'
+
 test_expect_success 'setup incremental' '
 	(
 	GIT_AUTHOR_NAME=I &&