diff mbox series

[2/3] midx: don't provide a total for QSORT() progress

Message ID patch-2.3-3411fe0515-20210722T121801Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series progress.c API users: fix bogus counting | expand

Commit Message

Ævar Arnfjörð Bjarmason July 22, 2021, 12:20 p.m. UTC
The quicksort algorithm can be anywhere between O(n) and O(n^2), so
providing a "num objects" as a total means that in some cases we're
going to go past 100%.

This fixes a logic error in 5ae18df9d8e (midx: during verify group
objects by packfile to speed verification, 2019-03-21), which in turn
seems to have been diligently copied from my own logic error in the
commit-graph.c code, see 890226ccb57 (commit-graph write: add
itermediate progress, 2019-01-19).

That commit-graph code of mine was removed in
1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
2020-12-07), so we don't need to fix that too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 midx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano July 23, 2021, 9:56 p.m. UTC | #1
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
> providing a "num objects" as a total means that in some cases we're
> going to go past 100%.
>
> This fixes a logic error in 5ae18df9d8e (midx: during verify group
> objects by packfile to speed verification, 2019-03-21), which in turn
> seems to have been diligently copied from my own logic error in the
> commit-graph.c code, see 890226ccb57 (commit-graph write: add
> itermediate progress, 2019-01-19).

Interesting.

>
> That commit-graph code of mine was removed in
> 1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
> 2020-12-07), so we don't need to fix that too.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  midx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/midx.c b/midx.c
> index 9a35b0255d..eaae75ab19 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>  
>  	if (flags & MIDX_PROGRESS)
>  		progress = start_sparse_progress(_("Sorting objects by packfile"),
> -						 m->num_objects);
> +						 0);
>  	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
>  	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
>  	stop_progress(&progress);
Phillip Wood Aug. 5, 2021, 3:07 p.m. UTC | #2
On 22/07/2021 13:20, Ævar Arnfjörð Bjarmason wrote:
> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
> providing a "num objects" as a total means that in some cases we're
> going to go past 100%.

Being pedantic QSORT() is not necessarily a quicksort, for example 
compact/qsort_s.c implements a merge sort.

I'm confused as to how we go past 100% when we only call 
display_progress(progress, 0) and then stop_progress(). If my reading of 
progress.c is correct then this change means we'll stop displaying a 
percentage as progress->total will be zero in

static void display(struct progress *progress, uint64_t n, const char *done)
{
	/* ... */
	if (progress->total) {
		unsigned percent = n * 100 / progress->total;
		if (percent != progress->last_percent || progress_update) {
			progress->last_percent = percent;

			strbuf_reset(counters_sb);
			strbuf_addf(counters_sb,
				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
				    (uintmax_t)n, (uintmax_t)progress->total,
				    tp);
			show_update = 1;
		}
	} else if (progress_update) {
		strbuf_reset(counters_sb);
		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
		show_update = 1;
	}
	/* ... */
}

Best Wishes

Phillip

> This fixes a logic error in 5ae18df9d8e (midx: during verify group
> objects by packfile to speed verification, 2019-03-21), which in turn
> seems to have been diligently copied from my own logic error in the
> commit-graph.c code, see 890226ccb57 (commit-graph write: add
> itermediate progress, 2019-01-19).
> 
> That commit-graph code of mine was removed in
> 1cbdbf3bef7 (commit-graph: drop count_distinct_commits() function,
> 2020-12-07), so we don't need to fix that too.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   midx.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/midx.c b/midx.c
> index 9a35b0255d..eaae75ab19 100644
> --- a/midx.c
> +++ b/midx.c
> @@ -1291,7 +1291,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
>   
>   	if (flags & MIDX_PROGRESS)
>   		progress = start_sparse_progress(_("Sorting objects by packfile"),
> -						 m->num_objects);
> +						 0);
>   	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
>   	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
>   	stop_progress(&progress);
>
Ævar Arnfjörð Bjarmason Aug. 5, 2021, 7:07 p.m. UTC | #3
On Thu, Aug 05 2021, Phillip Wood wrote:

> On 22/07/2021 13:20, Ævar Arnfjörð Bjarmason wrote:
>> The quicksort algorithm can be anywhere between O(n) and O(n^2), so
>> providing a "num objects" as a total means that in some cases we're
>> going to go past 100%.
>
> Being pedantic QSORT() is not necessarily a quicksort, for example
> compact/qsort_s.c implements a merge sort.
>
> I'm confused as to how we go past 100% when we only call
> display_progress(progress, 0) and then stop_progress(). If my reading
> of progress.c is correct then this change means we'll stop displaying
> a percentage as progress->total will be zero in
>
> static void display(struct progress *progress, uint64_t n, const char *done)
> {
> 	/* ... */
> 	if (progress->total) {
> 		unsigned percent = n * 100 / progress->total;
> 		if (percent != progress->last_percent || progress_update) {
> 			progress->last_percent = percent;
>
> 			strbuf_reset(counters_sb);
> 			strbuf_addf(counters_sb,
> 				    "%3u%% (%"PRIuMAX"/%"PRIuMAX")%s", percent,
> 				    (uintmax_t)n, (uintmax_t)progress->total,
> 				    tp);
> 			show_update = 1;
> 		}
> 	} else if (progress_update) {
> 		strbuf_reset(counters_sb);
> 		strbuf_addf(counters_sb, "%"PRIuMAX"%s", (uintmax_t)n, tp);
> 		show_update = 1;
> 	}
> 	/* ... */
> }

Hrm, now I'm also confused. Yes you're right. This whole patch doesn't
make sense.

I.e. it's perfectly fine to provide a target of <num> for QSORT and then
use display_progress() just to bump the progress bar's display like
this, and we have other code in commit-graph.c that does the same.

I think I might have (badly) extracted this patch from some WIP work
where I was giving qsort() calls progress of some sort, or maybe I was
just being stupid...

(I have a few local experiments in progress bar output, one of those is
that you can give the API a <num> target, and not be sure if it'll be
O(n), O(log(n)), O(n^2) etc, and we'll show progress output
appropriately, i.e. "still working, at X items of work for N, so north
of O(N)...".)
diff mbox series

Patch

diff --git a/midx.c b/midx.c
index 9a35b0255d..eaae75ab19 100644
--- a/midx.c
+++ b/midx.c
@@ -1291,7 +1291,7 @@  int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 
 	if (flags & MIDX_PROGRESS)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
-						 m->num_objects);
+						 0);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);