diff mbox series

blame: report correct number of lines in progress when using ranges

Message ID 20220403165038.52803-1-eantoranz@gmail.com (mailing list archive)
State Superseded
Headers show
Series blame: report correct number of lines in progress when using ranges | expand

Commit Message

Edmundo Carmona Antoranz April 3, 2022, 4:50 p.m. UTC
When using ranges, use their sizes as the limit for progress
instead of the size of the full file.

Before:
$ git blame --progress builtin/blame.c > /dev/null
Blaming lines: 100% (1210/1210), done.
$ git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
Blaming lines:  10% (122/1210), done.
$

After:
$ ./git blame --progress builtin/blame.c > /dev/null
Blaming lines: 100% (1210/1210), done.
$ ./git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
Blaming lines: 100% (122/122), done.
$

Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
---
 builtin/blame.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Junio C Hamano April 3, 2022, 10:42 p.m. UTC | #1
Edmundo Carmona Antoranz <eantoranz@gmail.com> writes:

> When using ranges, use their sizes as the limit for progress
> instead of the size of the full file.
>
> Before:
> $ git blame --progress builtin/blame.c > /dev/null
> Blaming lines: 100% (1210/1210), done.
> $ git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
> Blaming lines:  10% (122/1210), done.
> $
>
> After:
> $ ./git blame --progress builtin/blame.c > /dev/null
> Blaming lines: 100% (1210/1210), done.
> $ ./git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
> Blaming lines: 100% (122/122), done.
> $
>
> Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
> ---
>  builtin/blame.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Wow.  I am reasonably sure that this was broken since the
introduction of the progress meter to "git blame".  Thanks for
finding and fixing.

Can we have a test, too, or is that too cumbersome to add for some
reason?

Thanks.

> diff --git a/builtin/blame.c b/builtin/blame.c
> index 8d15b68afc..e33372c56b 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -898,6 +898,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	unsigned int range_i;
>  	long anchor;
>  	const int hexsz = the_hash_algo->hexsz;
> +	long num_lines = 0;
>  
>  	setup_default_color_by_age();
>  	git_config(git_blame_config, &output_option);
> @@ -1129,7 +1130,10 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	for (range_i = ranges.nr; range_i > 0; --range_i) {
>  		const struct range *r = &ranges.ranges[range_i - 1];
>  		ent = blame_entry_prepend(ent, r->start, r->end, o);
> +		num_lines += (r->end - r->start);
>  	}
> +	if (!num_lines)
> +		num_lines = sb.num_lines;
>  
>  	o->suspects = ent;
>  	prio_queue_put(&sb.commits, o->commit);
> @@ -1158,7 +1162,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	sb.found_guilty_entry = &found_guilty_entry;
>  	sb.found_guilty_entry_data = &pi;
>  	if (show_progress)
> -		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
> +		pi.progress = start_delayed_progress(_("Blaming lines"), num_lines);
>  
>  	assign_blame(&sb, opt);
Edmundo Carmona Antoranz April 4, 2022, 5:27 a.m. UTC | #2
On Mon, Apr 4, 2022 at 12:42 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Wow.  I am reasonably sure that this was broken since the
> introduction of the progress meter to "git blame".  Thanks for
> finding and fixing.

As the person who originally wrote said support, I feel like I
am just nurturing it. :-D

>
> Can we have a test, too, or is that too cumbersome to add for some
> reason?

Correct me if I'm wrong but it could be a little tricky because "progress
display" shows up only if it happens to "lose" a race. Progress is
skipped altogether if blame process goes too fast. Even if you run
blame on a file with a lot of history, if box is fast enough and info is
cached, it will fail to display progress. So, all in all, it would be like
trying to unit test output coming out of a race condition.

Let me know what you think.

>
> Thanks.
yw
Ævar Arnfjörð Bjarmason April 4, 2022, 6:12 a.m. UTC | #3
On Sun, Apr 03 2022, Edmundo Carmona Antoranz wrote:

> When using ranges, use their sizes as the limit for progress
> instead of the size of the full file.
>
> Before:
> $ git blame --progress builtin/blame.c > /dev/null
> Blaming lines: 100% (1210/1210), done.
> $ git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
> Blaming lines:  10% (122/1210), done.
> $
>
> After:
> $ ./git blame --progress builtin/blame.c > /dev/null
> Blaming lines: 100% (1210/1210), done.
> $ ./git blame --progress -L 100,120 -L 200,300 builtin/blame.c > /dev/null
> Blaming lines: 100% (122/122), done.
> $
>
> Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
> ---
>  builtin/blame.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/blame.c b/builtin/blame.c
> index 8d15b68afc..e33372c56b 100644
> --- a/builtin/blame.c
> +++ b/builtin/blame.c
> @@ -898,6 +898,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	unsigned int range_i;
>  	long anchor;
>  	const int hexsz = the_hash_algo->hexsz;
> +	long num_lines = 0;

Here ranges's nr is unsigned int, and the "num_lines" is an int, and the
argument to start_delayed_progress() is uint64_t, but both of "start"
and "end" are "long".

But we appand multiple differences to num_lines, are we sure we won't
overflow here?

>  
>  	setup_default_color_by_age();
>  	git_config(git_blame_config, &output_option);
> @@ -1129,7 +1130,10 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	for (range_i = ranges.nr; range_i > 0; --range_i) {
>  		const struct range *r = &ranges.ranges[range_i - 1];
>  		ent = blame_entry_prepend(ent, r->start, r->end, o);
> +		num_lines += (r->end - r->start);
>  	}
> +	if (!num_lines)
> +		num_lines = sb.num_lines;
>  
>  	o->suspects = ent;
>  	prio_queue_put(&sb.commits, o->commit);
> @@ -1158,7 +1162,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>  	sb.found_guilty_entry = &found_guilty_entry;
>  	sb.found_guilty_entry_data = &pi;
>  	if (show_progress)
> -		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
> +		pi.progress = start_delayed_progress(_("Blaming lines"), num_lines);
>  
>  	assign_blame(&sb, opt);
Ævar Arnfjörð Bjarmason April 4, 2022, 6:15 a.m. UTC | #4
On Mon, Apr 04 2022, Edmundo Carmona Antoranz wrote:

> On Mon, Apr 4, 2022 at 12:42 AM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Wow.  I am reasonably sure that this was broken since the
>> introduction of the progress meter to "git blame".  Thanks for
>> finding and fixing.
>
> As the person who originally wrote said support, I feel like I
> am just nurturing it. :-D
>
>>
>> Can we have a test, too, or is that too cumbersome to add for some
>> reason?
>
> Correct me if I'm wrong but it could be a little tricky because "progress
> display" shows up only if it happens to "lose" a race. Progress is
> skipped altogether if blame process goes too fast. Even if you run
> blame on a file with a lot of history, if box is fast enough and info is
> cached, it will fail to display progress. So, all in all, it would be like
> trying to unit test output coming out of a race condition.
>
> Let me know what you think.

You can make it always show up by using GIT_PROGRESS_DELAY=0, grep for
it in existing tests, skimming the code that coupled with --progress
should guarantee that it shows up.
Junio C Hamano April 4, 2022, 4:34 p.m. UTC | #5
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> diff --git a/builtin/blame.c b/builtin/blame.c
>> index 8d15b68afc..e33372c56b 100644
>> --- a/builtin/blame.c
>> +++ b/builtin/blame.c
>> @@ -898,6 +898,7 @@ int cmd_blame(int argc, const char **argv, const char *prefix)
>>  	unsigned int range_i;
>>  	long anchor;
>>  	const int hexsz = the_hash_algo->hexsz;
>> +	long num_lines = 0;
>
> Here ranges's nr is unsigned int, and the "num_lines" is an int, and the
> argument to start_delayed_progress() is uint64_t, but both of "start"
> and "end" are "long".

I see num_lines is a long, which I am guessing was made to match
what start and end are, which in turn is to use parse_range_arg().

> But we appand multiple differences to num_lines, are we sure we won't
> overflow here?

Looking at members of blame_entry and blame_scoreboard structures, I
think we make liberal use of "int" in the blame codepath without
worrying about those with 16-bit int, or those with files longer
than 2G lines, and progress meter showing incorrect values to them
is probably the least of our worries ;-)
diff mbox series

Patch

diff --git a/builtin/blame.c b/builtin/blame.c
index 8d15b68afc..e33372c56b 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -898,6 +898,7 @@  int cmd_blame(int argc, const char **argv, const char *prefix)
 	unsigned int range_i;
 	long anchor;
 	const int hexsz = the_hash_algo->hexsz;
+	long num_lines = 0;
 
 	setup_default_color_by_age();
 	git_config(git_blame_config, &output_option);
@@ -1129,7 +1130,10 @@  int cmd_blame(int argc, const char **argv, const char *prefix)
 	for (range_i = ranges.nr; range_i > 0; --range_i) {
 		const struct range *r = &ranges.ranges[range_i - 1];
 		ent = blame_entry_prepend(ent, r->start, r->end, o);
+		num_lines += (r->end - r->start);
 	}
+	if (!num_lines)
+		num_lines = sb.num_lines;
 
 	o->suspects = ent;
 	prio_queue_put(&sb.commits, o->commit);
@@ -1158,7 +1162,7 @@  int cmd_blame(int argc, const char **argv, const char *prefix)
 	sb.found_guilty_entry = &found_guilty_entry;
 	sb.found_guilty_entry_data = &pi;
 	if (show_progress)
-		pi.progress = start_delayed_progress(_("Blaming lines"), sb.num_lines);
+		pi.progress = start_delayed_progress(_("Blaming lines"), num_lines);
 
 	assign_blame(&sb, opt);