
[4/7] commit-graph: fix bogus counter in "Scanning merged commits" progress line

Message ID 20210620200303.2328957-5-szeder.dev@gmail.com (mailing list archive)
State New, archived
Series progress: verify progress counters in the test suite

Commit Message

SZEDER Gábor June 20, 2021, 8:03 p.m. UTC
The final value of the counter of the "Scanning merged commits"
progress line is always one less than its expected total, e.g.:

  Scanning merged commits:  83% (5/6), done.

This happens because while iterating over an array the loop variable
is passed to display_progress() as-is, but while C arrays (and thus
the loop variable) start at 0 and end at N-1, the progress counter
must end at N.  This causes the failures of the tests
'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.

Fix this by passing 'i + 1' to display_progress(), like most other
callsites do.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 commit-graph.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ævar Arnfjörð Bjarmason June 20, 2021, 10:13 p.m. UTC | #1
On Sun, Jun 20 2021, SZEDER Gábor wrote:

> The final value of the counter of the "Scanning merged commits"
> progress line is always one less than its expected total, e.g.:
>
>   Scanning merged commits:  83% (5/6), done.
>
> This happens because while iterating over an array the loop variable
> is passed to display_progress() as-is, but while C arrays (and thus
> the loop variable) start at 0 and end at N-1, the progress counter
> must end at N.  This causes the failures of the tests
> 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules'
> in 't5510-fetch.sh' when run with GIT_TEST_CHECK_PROGRESS=1.
>
> Fix this by passing 'i + 1' to display_progress(), like most other
> callsites do.
>
> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
> ---
>  commit-graph.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 2bcb4e0f89..3181906368 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2096,7 +2096,7 @@ static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
>  
>  	ctx->num_extra_edges = 0;
>  	for (i = 0; i < ctx->commits.nr; i++) {
> -		display_progress(ctx->progress, i);
> +		display_progress(ctx->progress, i + 1);
>  
>  		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
>  			  &ctx->commits.list[i]->object.oid)) {

I think this fix makes sense, but FWIW there's a large thread starting
at [1] where René disagrees with me, and thinks the fix for this sort of
thing would be to display_progress(..., i + 1) at the end of that
for-loop, or just before the stop_progress().

I don't agree, but just noting the disagreement, and that if that
argument wins then a patch like this would involve changing the other
20-some calls to display_progress() in commit-graph.c to work
differently (and to be more complex, we'd need to deal with loop
break/continue etc.).

1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
René Scharfe June 21, 2021, 6:32 p.m. UTC | #2
Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>
> [...]
>
> I think this fix makes sense, but FWIW there's a large thread starting
> at [1] where René disagrees with me, and thinks the fix for this sort of
> thing would be to display_progress(..., i + 1) at the end of that
> for-loop, or just before the stop_progress().
>
> I don't agree, but just noting the disagreement, and that if that
> argument wins then a patch like this would involve changing the other
> 20-some calls to display_progress() in commit-graph.c to work
> differently (and to be more complex, we'd need to deal with loop
> break/continue etc.).
>
> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/

*sigh*  (And sorry, Ævar.)

Before an item is done, it should be reported as not done.  After an
item is done, it should be reported as done.  One loop iteration
finishes one item.  Thus the number of items to report at the bottom of
the loop is one higher than at the top.  i is the correct number to
report at the top of a zero-based loop, i+1 at the bottom.

There is another place: In the loop header.  It's a weird place for a
function call, but it gets triggered before, between and after all
items, just as we need it:

	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {

We could hide this unseemly sight in a macro:

  #define progress_foreach(index, count, progress) \
  for (index = 0; display_progress(progress, index), index < count; index++)

Hmm?

René
Ævar Arnfjörð Bjarmason June 21, 2021, 8:08 p.m. UTC | #3
On Mon, Jun 21 2021, René Scharfe wrote:

> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>
>> [...]
>>
>> I think this fix makes sense, but FWIW there's a large thread starting
>> at [1] where René disagrees with me, and thinks the fix for this sort of
>> thing would be to display_progress(..., i + 1) at the end of that
>> for-loop, or just before the stop_progress().
>>
>> I don't agree, but just noting the disagreement, and that if that
>> argument wins then a patch like this would involve changing the other
>> 20-some calls to display_progress() in commit-graph.c to work
>> differently (and to be more complex, we'd need to deal with loop
>> break/continue etc.).
>>
>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>
> *sigh*  (And sorry, Ævar.)
>
> Before an item is done, it should be reported as not done.  After an
> item is done, it should be reported as done.  One loop iteration
> finishes one item.  Thus the number of items to report at the bottom of
> the loop is one higher than at the top.  i is the correct number to
> report at the top of a zero-based loop, i+1 at the bottom.
>
> There is another place: In the loop header.  It's a weird place for a
> function call, but it gets triggered before, between and after all
> items, just as we need it:
>
> 	for (i = 0; display_progress(ctx->progress, i), i < ctx->commits.nr; i++) {
>
> We could hide this unseemly sight in a macro:
>
>   #define progress_foreach(index, count, progress) \
>   for (index = 0; display_progress(progress, index), index < count; index++)

Anyone with more time than sense can go and read over our linked back &
forth thread where we're disagreeing on that point :). I think the pattern
in commit-graph.c makes sense, you don't.

Anyway, aside from that. I think, and I really would be advocating this
too, even if our respective positions were reversed, that *in this case*
it makes sense to just take something like SZEDER's patch here
as-is. Because in that file there's some dozen occurrences of that exact
pattern.

Let's just bring this one case in line with the rest, if we then want to
argue that one or the other use of the progress.c API is wrong as a
general thing, I think it makes more sense to discuss that as some
follow-up series that changes these various API uses en-masse than
holding back isolated fixes that leave the state of the progress bar at
!= 100%.
René Scharfe June 26, 2021, 8:27 a.m. UTC | #4
Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Jun 21 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> [...]
>>>
>>> I think this fix makes sense, but FWIW there's a large thread starting
>>> at [1] where René disagrees with me, and thinks the fix for this sort of
>>> thing would be to display_progress(..., i + 1) at the end of that
>>> for-loop, or just before the stop_progress().
>>>
>>> I don't agree, but just noting the disagreement, and that if that
>>> argument wins then a patch like this would involve changing the other
>>> 20-some calls to display_progress() in commit-graph.c to work
>>> differently (and to be more complex, we'd need to deal with loop
>>> break/continue etc.).
>>>
>>> 1. https://lore.kernel.org/git/patch-2.2-042f598826-20210607T144206Z-avarab@gmail.com/
>>
>> *sigh*  (And sorry, Ævar.)
>>
>> Before an item is done, it should be reported as not done.  After an
>> item is done, it should be reported as done.  One loop iteration
>> finishes one item.  Thus the number of items to report at the bottom of
>> the loop is one higher than at the top.  i is the correct number to
>> report at the top of a zero-based loop, i+1 at the bottom.

> Anyone with more time than sense can go and read over our linked back &
> forth thread where we're disagreeing on that point :). I think the pattern
> in commit-graph.c makes sense, you don't.

Thanks for this comment, I think I got it now: Work doesn't count in the
commit-graph.c model of measuring progress, literally.  I.e. progress is
the same before and after one item of work.  Instead it counts the
number of loop iterations.  The model I describe above counts finished
work items instead.  The results of the two models differ by at most one
despite their inverted axiom regarding the value of work.

Phew, that took me a while.

> Anyway, aside from that. I think, and I really would be advocating this
> too, even if our respective positions were reversed, that *in this case*
> it makes sense to just take something like SZEDER's patch here
> as-is. Because in that file there's some dozen occurrences of that exact
> pattern.

The code without the patch either forgets to report the last work item
in the count-work-items model or is one short in the count-iterations
model, so a fix is needed either way.

The number of the other occurrences wouldn't matter if they were
buggy, but in this case they indicate that Stolee consistently used
the count-iterations model.  Thus using it in the patch as well makes
sense.

> Let's just bring this one case in line with the rest, if we then want to
> argue that one or the other use of the progress.c API is wrong as a
> general thing, I think it makes more sense to discuss that as some
> follow-up series that changes these various API uses en-masse than
> holding back isolated fixes that leave the state of the progress bar at
> != 100%.

Agreed.

René
Ævar Arnfjörð Bjarmason June 26, 2021, 2:11 p.m. UTC | #5
On Sat, Jun 26 2021, René Scharfe wrote:

> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Mon, Jun 21 2021, René Scharfe wrote:
>>
>>> Am 21.06.21 um 00:13 schrieb Ævar Arnfjörð Bjarmason:
>>>> [...]
>>>
>>> *sigh*  (And sorry, Ævar.)
>>>
>>> Before an item is done, it should be reported as not done.  After an
>>> item is done, it should be reported as done.  One loop iteration
>>> finishes one item.  Thus the number of items to report at the bottom of
>>> the loop is one higher than at the top.  i is the correct number to
>>> report at the top of a zero-based loop, i+1 at the bottom.
>
>> Anyone with more time than sense can go and read over our linked back &
>> forth thread where we're disagreeing on that point :). I think the pattern
>> in commit-graph.c makes sense, you don't.
>
> Thanks for this comment, I think I got it now: Work doesn't count in the
> commit-graph.c model of measuring progress, literally.  I.e. progress is
> the same before and after one item of work.

The progress isn't the same, we update the count. Or do you mean in the
time it takes us to go from the end of the for-loop & jump to the start
of it and update the count?

> Instead it counts the number of loop iterations.  The model I describe
> above counts finished work items instead.  The results of the two
> models differ by at most one despite their inverted axiom regarding
> the value of work.
>
> Phew, that took me a while.

For what it's worth I had some extensive examples in our initial
thread[1][2] (search for "apple" and "throughput", respectively), that
you cut out when replying to the relevant E-Mails. I'd think we could
probably have gotten here earlier :)

I'm a bit confused about this "value of work" comment.

If you pick up a copy of say a video game like Mario Kart you'll find
that for a 3-lap race you start at 1/3, and still have an entire lap to
go when the count is at 3/3.

So it's just a question of whether you report progress on item N or work
finished on item N, not whether laps in a race have more or less
value.

To reference my earlier E-Mail[1] are you eating the first apple or the
zeroeth apple? I don't think one is more or less right in the
mathematical sense, I just think for UX aimed at people counting "laps"
makes more sense than counting completed items.

>> Anyway, aside from that. I think, and I really would be advocating this
>> too, even if our respective positions were reversed, that *in this case*
>> it makes sense to just take something like SZEDER's patch here
>> as-is. Because in that file there's some dozen occurrences of that exact
>> pattern.
>
> The code without the patch either forgets to report the last work item
> in the count-work-items model or is one short in the count-iterations
> model, so a fix is needed either way.

It won't be one short, for a loop of 2 items we'll go from:

     0/2
     1/2
     1/2, done

To:

     1/2
     2/2
     2/2, done

Just like the rest of the uses of the progress API in that file.

Which is one of the two reasons I prefer this pattern, i.e. this is less
verbose:

    start_progress()
    for i in (0..X-1):
        display_progress(i+1)
        work()
    stop_progress()

Than one of these, which AFAICT would be your recommendation:

    # Simplest, but stalls on work()
    start_progress()
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

    # More verbose, but doesn't:
    start_progress()
    for i in (0..X-1):
        display_progress(i)
        work()
        display_progress(i+1)
    stop_progress()

    # Ditto:
    start_progress()
    display_progress(0)
    for i in (0..X-1):
        work()
        display_progress(i+1)
    stop_progress()

And of course if your loop continues or whatever you'll need a last
"display_progress(X)" before the "stop_progress()".

The other is that if you count laps you can have your progress bar
optionally show progress on that item. E.g. we could if we stall show
seconds spent that we're hung on that item, or '3/3 ETA 40s'. I have a
patch[3] that takes an initial step towards that, with some more queued
locally.

> The number of the other occurrences wouldn't matter if they were
> buggy, but in this case they indicate that Stolee consistently used
> the count-iterations model.  Thus using it in the patch as well makes
> sense.

>> Let's just bring this one case in line with the rest, if we then want to
>> argue that one or the other use of the progress.c API is wrong as a
>> general thing, I think it makes more sense to discuss that as some
>> follow-up series that changes these various API uses en-masse than
>> holding back isolated fixes that leave the state of the progress bar at
>> != 100%.
>
> Agreed.

Sorry to go on about this again :)

1. https://lore.kernel.org/git/87lf7k2bem.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87o8c8z105.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
René Scharfe June 26, 2021, 8:22 p.m. UTC | #6
Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 21.06.21 um 22:08 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Mon, Jun 21 2021, René Scharfe wrote:
>>>
>>>> Before an item is done, it should be reported as not done.  After an
>>>> item is done, it should be reported as done.  One loop iteration
>>>> finishes one item.  Thus the number of items to report at the bottom of
>>>> the loop is one higher than at the top.  i is the correct number to
>>>> report at the top of a zero-based loop, i+1 at the bottom.
>>
>>> Anyone with more time than sense can go and read over our linked back &
>>> forth thread where we're disagreeing on that point :). I think the pattern
>>> in commit-graph.c makes sense, you don't.
>>
>> Thanks for this comment, I think I got it now: Work doesn't count in the
>> commit-graph.c model of measuring progress, literally.  I.e. progress is
>> the same before and after one item of work.
>
> The progress isn't the same, we update the count. Or do you mean in the
> time it takes us to go from the end of the for-loop & jump to the start
> of it and update the count?
>
>> Instead it counts the number of loop iterations.  The model I describe
>> above counts finished work items instead.  The results of the two
>> models differ by at most one despite their inverted axiom regarding
>> the value of work.
>>
>> Phew, that took me a while.
>
> For what it's worth I had some extensive examples in our initial
> thread[1][2] (search for "apple" and "throughput", respectively), that
> you cut out when replying to the relevant E-Mails. I'd think we could
> probably have gotten here earlier :)

Perhaps, but the key point for me was to invert my basic assumption that
a work item has value, and for that I had to realize and state it first
(done above).  A mathematician would have done that in an instant, I
guess ("Invert, always invert").

> I'm a bit confused about this "value of work" comment.

Progress is a counter.  The difference of the counter before and after
a work item is done is one in the count-work model, but zero in the
count-iterations model.

> If you pick up a copy of say a video game like Mario Kart you'll find
> that for a 3-lap race you start at 1/3, and still have an entire lap to
> go when the count is at 3/3.
>
> So it's just a question of whether you report progress on item N or work
> finished on item N, not whether laps in a race have more or less
> value.

These are linked.  If you want to know which lap you are in, the answer
won't change until you start a new lap:

	for (i = 0; i < 3; i++) {
		display_progress(p, i + 1);
		drive_one_lap();
		display_progress(p, i + 1);
	}

If you want to know how many laps you finished, the answer will
increase after a lap is done:

	for (i = 0; i < 3; i++) {
		display_progress(p, i);
		drive_one_lap();
		display_progress(p, i + 1);
	}

> To reference my earlier E-Mail[1] are you eating the first apple or the
> zeroeth apple? I don't think one is more or less right in the
> mathematical sense, I just think for UX aimed at people counting "laps"
> makes more sense than counting completed items.

The difference between counting iterations and work items vanishes as
their numbers increase.  The most pronounced difference is observed when
there is only a single item of work.  The count-iterations model shows
1/1 from start to finish.  The count-work model shows 0/1 initially and
1/1 after the work is done.

As a user I prefer the second one.  If presented with just a number and
a percentage then I assume 100% means all work is done and would cancel
the program if that status is shown for too long.  With Git I have
learned that only the final ", done" really means done in some cases,
but that's an unnecessary lesson and still surprising to me.

>>> Anyway, aside from that. I think, and I really would be advocating this
>>> too, even if our respective positions were reversed, that *in this case*
>>> it makes sense to just take something like SZEDER's patch here
>>> as-is. Because in that file there's some dozen occurrences of that exact
>>> pattern.
>>
>> The code without the patch either forgets to report the last work item
>> in the count-work-items model or is one short in the count-iterations
>> model, so a fix is needed either way.
>
> It won't be one short, for a loop of 2 items we'll go from:
>
>      0/2
>      1/2
>      1/2, done
>
> To:
>
>      1/2
>      2/2
>      2/2, done
>
> Just like the rest of the uses of the progress API in that file.

Yes, just like I wrote -- the old code is one short compared to the
correct output of the count-iterations method.

For completeness' sake, the correct output of the count-work method
would be:

	0/2
	1/2
	2/2
	2/2, done

> Which is one of the two reasons I prefer this pattern, i.e. this is less
> verbose:
>
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i+1)
>         work()
>     stop_progress()
>
> Than one of these, which AFAICT would be your recommendation:
>
>     # Simplest, but stalls on work()
>     start_progress()
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # More verbose, but doesn't:
>     start_progress()
>     for i in (0..X-1):
>         display_progress(i)
>         work()
>         display_progress(i+1)
>     stop_progress()
>
>     # Ditto:
>     start_progress()
>     display_progress(0)
>     for i in (0..X-1):
>         work()
>         display_progress(i+1)
>     stop_progress()
>
> And of course if your loop continues or whatever you'll need a last
> "display_progress(X)" before the "stop_progress()".

The count-work model needs one more progress update than the
count-iteration model.  We could do all updates in the loop header,
which is evaluated just the right number of times.  But I think that we
rather should choose between the models based on their results.

If each work item finishes within a progress display update period
(half a second) then there won't be any user-visible difference and
both models would do.

> The other is that if you count laps you can have your progress bar
> optionally show progress on that item. E.g. we could if we stall show
> seconds spend that we're hung on that item, or '3/3 ETA 40s". I have a
> patch[3] that takes an initial step towards that, with some more queued
> locally.

A time estimate for the whole operation (until ", done") would be nice.
It can help with the decision to go for a break or to keep staring at
the screen.  I guess we just need to remember when start_progress() was
called and can then estimate the remaining time once the first item is
done.  Stalling items would push the estimate further into the future.

A time estimate per item wouldn't help me much.  I'd have to subtract
to get the number of unfinished items, catch the maximum estimated
duration and multiply those values.  OK, by the time I manage that Git
is probably done -- but I'd rather leave arithmetic tasks to the
computer.

Seconds spent for the current item can be shown with both models.  The
progress value is not sufficient to identify the problem case in most
cases.  An ID of some kind (e.g. a file name or hash) would have to be
shown as well for that.  But how would I use that information?

René
Ævar Arnfjörð Bjarmason June 26, 2021, 9:38 p.m. UTC | #7
On Sat, Jun 26 2021, René Scharfe wrote:

> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>> [...]
>> To reference my earlier E-Mail[1] are you eating the first apple or the
>> zeroeth apple? I don't think one is more or less right in the
>> mathematical sense, I just think for UX aimed at people counting "laps"
>> makes more sense than counting completed items.
>
> The difference between counting iterations and work items vanishes as
> their numbers increase.  The most pronounced difference is observed when
> there is only a single item of work.  The count-iterations model shows
> 1/1 from start to finish.  The count-work model shows 0/1 initially and
> 1/1 after the work is done.
>
> As a user I prefer the second one.  If presented with just a number and
> a percentage then I assume 100% means all work is done and would cancel
> the program if that status is shown for too long.  With Git I have
> learned that only the final ", done" really means done in some cases,
> but that's an unnecessary lesson and still surprising to me.

What progress bar of ours goes slow enough that the difference matters
for you in either case?

The only one I know of is "Enumerating objects", which notably stalls at
the start, and which I'm proposing changing the output of in:
https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

>> [...]
>> Which is one of the two reasons I prefer this pattern, i.e. this is less
>> verbose:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i+1)
>>         work()
>>     stop_progress()
>>
>> Than one of these, which AFAICT would be your recommendation:
>>
>>     # Simplest, but stalls on work()
>>     start_progress()
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # More verbose, but doesn't:
>>     start_progress()
>>     for i in (0..X-1):
>>         display_progress(i)
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>>     # Ditto:
>>     start_progress()
>>     display_progress(0)
>>     for i in (0..X-1):
>>         work()
>>         display_progress(i+1)
>>     stop_progress()
>>
>> And of course if your loop continues or whatever you'll need a last
>> "display_progress(X)" before the "stop_progress()".
>
> The count-work model needs one more progress update than the
> count-iteration model.  We could do all updates in the loop header,
> which is evaluated just the right number of times.  But I think that we
> rather should choose between the models based on their results.

I think we should be more biased towards API convenience here than
anything else, because for most of these they'll go so fast that users
won't see the difference. I just also happen to think that the easy way
to do it is also more correct.

Also, for these cases that you're focusing on, where we count up
to exactly 100% and we therefore expect N calls to display_progress()
(ignoring the rare but allowed duplicate calls with the same number,
which most callers don't use), we could just have a convenience API of:

    start_progress()
    for i in (0..X-1):
        progress_update() /* passing "i" not needed, we increment internally */
        work()
    stop_progress()

Then we could even make showing 0/N or 1/N the first time configurable,
but we could only do both if we use the API as I'm suggesting, not as
you want to use it.

You can also sort of get me what I want with what you're
suggesting, but you'd conflate "setup" work with the first item, which
matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
more verbose at the code level, and complex (need to deal with "break",
"continue"), so why would you?

Which I think is the main point here: not so much a disagreement as a
bit of talking past one another.

I.e. I think you're narrowly focused on what I think of as a display
issue of the current progress bars we show; I'm mainly interested in how
we use the API, and we should pick a way to use it that allows us to do
more with displaying progress better in the future.

> If each work item finishes within a progress display update period
> (half a second) then there won't be any user-visible difference and
> both models would do.

A trivial point, but don't you mean a second? AFAICT for "delayed" we
display after 2 seconds, then update every second; it's only if we
have display_throughput() that we do every 0.5s.

>> The other is that if you count laps you can have your progress bar
>> optionally show progress on that item. E.g. we could, if we stall, show
>> seconds spent hung on that item, or "3/3, ETA 40s". I have a
>> patch[3] that takes an initial step towards that, with some more queued
>> locally.
>
> A time estimate for the whole operation (until ", done") would be nice.
> It can help with the decision to go for a break or to keep staring at
> the screen.  I guess we just need to remember when start_progress() was
> called and can then estimate the remaining time once the first item is
> done.  Stalling items would push the estimate further into the future.
>
> A time estimate per item wouldn't help me much.  I'd have to subtract
> to get the number of unfinished items, catch the maximum estimated
> duration and multiply those values.  OK, by the time I manage that Git
> is probably done -- but I'd rather like to leave arithmetic tasks to
> the computer.
>
> Seconds spent for the current item can be shown with both models.  The
> progress value is not sufficient to identify the problem case in most
> cases.  An ID of some kind (e.g. a file name or hash) would have to be
> shown as well for that.  But how would I use that information?

If we're spending enough time on one item to update progress for it N
times we probably want to show throughput/progress/ETA mainly for that
item, not the work as a whole.

If we do run into those cases and want to convert them to show some
intra-item progress we'd need to first migrate them over to the suggested
way of using the API if we picked yours first; with my suggested use we
only need to add new API calls (display_throughput(), and similar future
calls/implicit display).

Consider e.g. using the packfile-uri response to ask the user to
download N URLs; just because we grab one at 1MB/s, that
probably won't do much to inform our estimate of the next one (which may
be on a different CDN etc.).

The throughput API was intended (and mainly used) for the estimate for
the whole batch; I just wonder whether, as we use it more widely, that
use-case won't be the exception.
Felipe Contreras June 27, 2021, 5:31 p.m. UTC | #8
René Scharfe wrote:
> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:

> > For what it's worth I had some extensive examples in our initial
> > thread[1][2] (search for "apple" and "throughput", respectively), that
> > you cut out when replying to the relevant E-Mails. I'd think we could
> > probably have gotten here earlier :)
> 
> Perhaps, but the key point for me was to invert my basic assumption that
> a work item has value, and for that I had to realize and state it first
> (done above).  A mathematician would have done that in an instant, I
> guess ("Invert, always invert").

When you get down to it, numbers almost never mean what most people
think they mean.

If work is a continuum, the probability that you land exactly at
1/3 is zero: P(X = 1/3) = 0. What you want is the probability of at
most 1/3, P(X <= 1/3), and that includes 0.

So anything from 0 to 1/3 is part of the first chunk of work.
René Scharfe July 4, 2021, 12:15 p.m. UTC | #9
Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sat, Jun 26 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> [...]
>>> To reference my earlier E-Mail[1] are you eating the first apple or the
>>> zeroeth apple? I don't think one is more or less right in the
>>> mathematical sense, I just think for UX aimed at people counting "laps"
>>> makes more sense than counting completed items.
>>
>> The difference between counting iterations and work items vanishes as
>> their numbers increase.  The most pronounced difference is observed when
>> there is only a single item of work.  The count-iterations model shows
>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>> 1/1 after the work is done.
>>
>> As a user I prefer the second one.  If presented with just a number and
>> a percentage then I assume 100% means all work is done and would cancel
>> the program if that status is shown for too long.  With Git I have
>> learned that only the final ", done" really means done in some cases,
>> but that's an unnecessary lesson and still surprising to me.
>
> What progress bar of ours goes slow enough that the difference matters
> for you in either case?

I don't have an example -- Git, network and SSD are quick enough for my
small use cases.

The advantage of the count-work method is that the question doesn't even
come up.

> The only one I know of is "Enumerating objects", which notably stalls at
> the start, and which I'm proposing changing the output of in:
> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/

That's annoying, but the first number I see there has five or six digits,
so it's not an example of the issue mentioned above for me.

Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
this helps.  Perhaps there are cases when it gets stuck, but it's hard to
determine by the clock alone.  When I run gc, it just needs a few seconds
to prepare something and then starts visibly counting objects.  A more
fine-grained report of the preparation steps would help, but seeing
"stalled" would just scare me.

>> The count-work model needs one more progress update than the
>> count-iteration model.  We could do all updates in the loop header,
>> which is evaluated just the right number of times.  But I think that we
>> rather should choose between the models based on their results.
>
> I think we should be more biased towards API convenience here than
> anything else, because for most of these they'll go so fast that users
> won't see the difference. I just also happen to think that the easy way
> to do it is also more correct.

The convenience of having one less display_progress() call is only a
slight advantage.

Correctness is a matter of definitions.  Recently I learned that in Arabic
a person's age is given using the count-iterations model.  I.e. on the day
of your birth your age is one.  That causes trouble if you deal with
state officials that use the count-work, err, count-completed-years model,
where your age is one only after living through a full year.

The solution around here is to avoid ambiguity by not using terms like
"age" in laws, regulations and forms, but to state explicitly "full years
since birth" or so.

"2/3 (33%)" means something else to me than to you by default.  So a
solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
latter case.  Thoughts?

> Also, because for these cases that you're focusing on where we count up
> to exactly 100% and we therefore expect N calls to display_progress()
> (ignoring the rare but allowed duplicate calls with the same number,
> which most callers don't use). We could just have a convenience API of:
>
>     start_progress()
>     for i in (0..X-1):
>         progress_update() /* passing "i" not needed, we increment internally */
>         work()
>     stop_progress()
>
> Then we could even make showing 0/N or 1/N the first time configurable,
> but we could only do both if we use the API as I'm suggesting, not as
> you want to use it.

A function that increments the progress number relatively can be used
with both models.  It's more useful for the count-iterations model,
though, as in the count-work model you can piggy-back on the loop
counter check:

	for (i = 0; display_progress(p, i), i < X; i++)
		work();

> You can also sort of get me what I want with what you're
> suggesting, but you'd conflate "setup" work with the first item, which
> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
> more verbose at the code level, and complex (need to deal with "break",
> "continue"), so why would you?

It's not complicated, just slightly odd, because function calls are
seldom put into the loop counter check.

>> If each work item finishes within a progress display update period
>> (half a second) then there won't be any user-visible difference and
>> both models would do.
>
> A trivial point, but don't you mean a second? AFAICT for "delayed" we
> display after 2 seconds, then update every second; it's only if we
> have display_throughput() that we do every 0.5s.

Right, I mixed those up.

>>> The other is that if you count laps you can have your progress bar
>>> optionally show progress on that item. E.g. we could, if we stall, show
>>> seconds spent hung on that item, or "3/3, ETA 40s". I have a
>>> patch[3] that takes an initial step towards that, with some more queued
>>> locally.
>>
>> A time estimate for the whole operation (until ", done") would be nice.
>> It can help with the decision to go for a break or to keep staring at
>> the screen.  I guess we just need to remember when start_progress() was
>> called and can then estimate the remaining time once the first item is
>> done.  Stalling items would push the estimate further into the future.
>>
>> A time estimate per item wouldn't help me much.  I'd have to subtract
>> to get the number of unfinished items, catch the maximum estimated
>> duration and multiply those values.  OK, by the time I manage that Git
>> is probably done -- but I'd rather like to leave arithmetic tasks to
>> the computer.
>>
>> Seconds spent for the current item can be shown with both models.  The
>> progress value is not sufficient to identify the problem case in most
>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>> shown as well for that.  But how would I use that information?
>
> If we're spending enough time on one item to update progress for it N
> times we probably want to show throughput/progress/ETA mainly for that
> item, not the work as a whole.

Throughput is shown for the last time period.  It is independent of the
item or items being worked on during that period.  If one item takes
multiple periods to finish then indeed only its current progress is
shown automatically, as you want.

Showing intra-item progress requires some kind of hierarchical API to
keep track of both parent and child progress and show them in some
readable way.  Perhaps appending another progress display would suffice?
"Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Calculating the ETA of a single item seems hard.  It does require intra-
item progress to be reported by the work code.

> If we do run into those cases and want to convert them to show some
> intra-item progress we'd need to first migrate them over to the suggested
> way of using the API if we picked yours first, with my suggested use we
> only need to add new API calls (display_throughput(), and similar future
> calls/implicit display).

I don't see why.  The intra-item progress numbers need to be reported in
any case if they are to be shown somehow.  If the model is clear then we
can show unambiguous output.

> Consider e.g. using the packfile-uri response to ask the user to
> download N URLs; just because we grab one at 1MB/s, that
> probably won't do much to inform our estimate of the next one (which may
> be on a different CDN etc.).

Sure, if the speed of work items varies wildly then estimates will be
unreliable.

I can vaguely imagine that it would be kind of useful to know the
throughput of different data sources, to allow e.g. use a different
mirror next time.  The current API doesn't distinguish work items in a
meaningful way, though.  They only have numbers.  I'd need a name (e.g.
the URL) for intra-item progress numbers to mean something.

René
Junio C Hamano July 5, 2021, 2:09 p.m. UTC | #10
René Scharfe <l.s.r@web.de> writes:

> ...  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

True.

> The convenience of having one less display_progress() call is only a
> slight advantage.

True, too.

> "2/3 (33%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?


I still see "2/3 done" is how we should look at it, but either way,
that's a good way to view at the problem.

Thanks.


[Unrelated Tangent]

> ...  Recently I learned that in Arabic a person's age is given
> using the count-iterations model.  I.e. on the day of your birth
> your age is one.

East Asian age reckoning is shared among EA countries and works the
same way, not just Arabic.
Ævar Arnfjörð Bjarmason July 5, 2021, 11:28 p.m. UTC | #11
On Sun, Jul 04 2021, René Scharfe wrote:

> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>
>> On Sat, Jun 26 2021, René Scharfe wrote:
>>
>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>>> [...]
>>>> To reference my earlier E-Mail[1] are you eating the first apple or the
>>>> zeroeth apple? I don't think one is more or less right in the
>>>> mathematical sense, I just think for UX aimed at people counting "laps"
>>>> makes more sense than counting completed items.
>>>
>>> The difference between counting iterations and work items vanishes as
>>> their numbers increase.  The most pronounced difference is observed when
>>> there is only a single item of work.  The count-iterations model shows
>>> 1/1 from start to finish.  The count-work model shows 0/1 initially and
>>> 1/1 after the work is done.
>>>
>>> As a user I prefer the second one.  If presented with just a number and
>>> a percentage then I assume 100% means all work is done and would cancel
>>> the program if that status is shown for too long.  With Git I have
>>> learned that only the final ", done" really means done in some cases,
>>> but that's an unnecessary lesson and still surprising to me.
>>
>> What progress bar of ours goes slow enough that the difference matters
>> for you in either case?
>
> I don't have an example -- Git, network and SSD are quick enough for my
> small use cases.
>
> The advantage of the count-work method is that the question doesn't even
> come up.
>
>> The only one I know of is "Enumerating objects", which notably stalls at
>> the start, and which I'm proposing changing the output of in:
>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>
> That's annoying, but the first number I see there has five or six digits,
> so it's not an example of the issue mentioned above for me.

Because it stalls and shows nothing, but with my patch it'll show
something while stalling. FWIW, on linux.git from a cold cache it took
5-10s before showing anything.

> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
> determine by the clock alone.  When I run gc, it just needs a few seconds
> to prepare something and then starts visibly counting objects.  A more
> fine-grained report of the preparation steps would help, but seeing
> "stalled" would just scare me.

Fair enough, I have other patches to have it show a spinner. Again, API
vs. UI. The idea is that we show something before we start the loop.

>>> The count-work model needs one more progress update than the
>>> count-iteration model.  We could do all updates in the loop header,
>>> which is evaluated just the right number of times.  But I think that we
>>> rather should choose between the models based on their results.
>>
>> I think we should be more biased towards API convenience here than
>> anything else, because for most of these they'll go so fast that users
>> won't see the difference. I just also happen to think that the easy way
>> to do it is also more correct.
>
> The convenience of having one less display_progress() call is only a
> slight advantage.
>
> Correctness is a matter of definitions.  Recently I learned that in Arabic
> a person's age is given using the count-iterations model.  I.e. on the day
> of your birth your age is one.  That causes trouble if you deal with
> state officials that use the count-work, err, count-completed-years model,
> where your age is one only after living through a full year.
>
> The solution around here is to avoid ambiguity by not using terms like
> "age" in laws, regulations and forms, but to state explicitly "full years
> since birth" or so.
>
> "2/3 (33%)" means something else to me than to you by default.  So a
> solution could be to state the model explicitly.  I.e. "2/3 (66%) done"
> or "working on 2/3 (66%)", but the percentage doesn't quite fit in the
> latter case.  Thoughts?

OK, UI again.

>> Also, because for these cases that you're focusing on where we count up
>> to exactly 100% and we therefore expect N calls to display_progress()
>> (ignoring the rare but allowed duplicate calls with the same number,
>> which most callers don't use). We could just have a convenience API of:
>>
>>     start_progress()
>>     for i in (0..X-1):
>>         progress_update() /* passing "i" not needed, we increment internally */
>>         work()
>>     stop_progress()
>>
>> Then we could even make showing 0/N or 1/N the first time configurable,
>> but we could only do both if we use the API as I'm suggesting, not as
>> you want to use it.
>
> A function that increments the progress number relatively can be used
> with both models.  It's more useful for the count-iterations model,
> though, as in the count-work model you can piggy-back on the loop
> counter check:
>
> 	for (i = 0; display_progress(p, i), i < X; i++)
> 		work();

Aside from this whole progress API discussion I find sticking stuff like
that in the for-loop condition to be less readable.

But no, that can't be used with both models, because it conflates the 0
of the 1st iteration with 0 of doing prep work. I.e.:

    p = start_progress();
    display_progress(p, 0);
    prep_work();
    for (i = 0; i < 100; i++) {
        display_progress(p, i + 1);
        work();
    }

Which is implicitly how that "stalled" patch views the world, i.e. our
count is -1 at start_progress() (that's already the case in
progress.c).

If you set it to 0 you're not working on the 1st item yet, but
explicitly doing setup. 

Then at n=1 you're starting work on the 1st item.

>> You can also sort of get me what I want with what you're
>> suggesting, but you'd conflate "setup" work with the first item, which
>> matters e.g. for "Enumerating objects" and my "stalled" patch. It's also
>> more verbose at the code level, and complex (need to deal with "break",
>> "continue"), so why would you?
>
> It's not complicated, just slightly odd, because function calls are
>> seldom put into the loop counter check.

FWIW the "complicated" here was referring to dealing with break/continue.

Yes, I'll grant you that there are cases where the ugliness/oddity of that
for-loop trick is going to be better than dealing with that, but there are
also while loops doing progress, callbacks, etc.

Picking an API pattern that works with all of that makes sense, since
the UI can render the count one way or the other.

>>> If each work item finishes within a progress display update period
>>> (half a second) then there won't be any user-visible difference and
>>> both models would do.
>>
>> A trivial point, but don't you mean a second? AFAICT for "delayed" we
>> display after 2 seconds, then update every second; it's only if we
>> have display_throughput() that we do every 0.5s.
>
> Right, I mixed those up.
>
>>>> The other is that if you count laps you can have your progress bar
>>>> optionally show progress on that item. E.g. we could, if we stall, show
>>>> seconds spent hung on that item, or "3/3, ETA 40s". I have a
>>>> patch[3] that takes an initial step towards that, with some more queued
>>>> locally.
>>>
>>> A time estimate for the whole operation (until ", done") would be nice.
>>> It can help with the decision to go for a break or to keep staring at
>>> the screen.  I guess we just need to remember when start_progress() was
>>> called and can then estimate the remaining time once the first item is
>>> done.  Stalling items would push the estimate further into the future.
>>>
>>> A time estimate per item wouldn't help me much.  I'd have to subtract
>>> to get the number of unfinished items, catch the maximum estimated
>>> duration and multiply those values.  OK, by the time I manage that Git
>>> is probably done -- but I'd rather like to leave arithmetic tasks to
>>> the computer.
>>>
>>> Seconds spent for the current item can be shown with both models.  The
>>> progress value is not sufficient to identify the problem case in most
>>> cases.  An ID of some kind (e.g. a file name or hash) would have to be
>>> shown as well for that.  But how would I use that information?
>>
>> If we're spending enough time on one item to update progress for it N
>> times we probably want to show throughput/progress/ETA mainly for that
>> item, not the work as a whole.
>
> Throughput is shown for the last time period.  It is independent of the
> item or items being worked on during that period.  If one item takes
> multiple periods to finish then indeed only its current progress is
> shown automatically, as you want.
>
> Showing intra-item progress requires some kind of hierarchical API to
> keep track of both parent and child progress and show them in some
> readable way.  Perhaps appending another progress display would suffice?
> "Files 1/3 (33%) Bytes 17kB/9GB (0%)".  Not sure.

Yes, this is another thing I'm heading for with the patches I posted.

For now I just fixed bugs in the state machine of how many characters we
erase; now we always reset exactly as much as we need to, and pass
things like ", done" around, not ", done\n" or ", done\r" (i.e. the
output we're emitting isn't conflated with whether we're clearing the
line or creating a new line).

It's a relatively straightforward change from there to have N progress
structs that each track/emit their part of a larger progress bar,
e.g. something like the progress prove(1) shows you (test status for
each concurrent test you're running).

You just need a "parent" progress struct that has the "title" (or none),
and receives the signal, and to have N registered sub-progress structs.

> Calculating the ETA of a single item seems hard.  It does require intra-
> item progress to be reported by the work code.
>
>> If we do run into those cases and want to convert them to show some
>> intra-item progress we'd need to first migrate them over to the suggested
>> way of using the API if we picked yours first, with my suggested use we
>> only need to add new API calls (display_throughput(), and similar future
>> calls/implicit display).
>
> I don't see why.  The intra-item progress numbers need to be reported in
> any case if they are to be shown somehow.  If the model is clear then we
> can show unambiguous output.

Because you want to show:

    Files 1/3 (33%) Bytes 17kB/9GB (0%)

Not:

    Files 0/3 (0%) Bytes 17kB/9GB (0%)

You're downloading the 1st file, not the 0th file, so the code is a
for-loop (or equivalent) with a display_progress(p, i + 1) for that
file, not display_progress(p, i).

This is the main reason I prefer the API and UI of reporting "what item
am I on?" v.s. "how many items are done?", because it's easy to add
intra-item state to the former.

>> Consider e.g. using the packfile-uri response to ask the user to
>> download N URLs; just because we grab one at 1MB/s, that
>> probably won't do much to inform our estimate of the next one (which may
>> be on a different CDN etc.).
>
> Sure, if the speed of work items varies wildly then estimates will be
> unreliable.
>
> I can vaguely imagine that it would be kind of useful to know the
> throughput of different data sources, to allow e.g. use a different
> mirror next time.  The current API doesn't distinguish work items in a
> meaningful way, though.  They only have numbers.  I'd need a name (e.g.
> the URL) for intra-item progress numbers to mean something.

Sure, anyway, let's assume all those numbers are magically known and
constant. The point was that as noted above you're downloading the 1st
file, not the 0th file, and want to show throughput/ETA etc. for that
file.
René Scharfe July 6, 2021, 4:02 p.m. UTC | #12
Am 06.07.21 um 01:28 schrieb Ævar Arnfjörð Bjarmason:
>
> On Sun, Jul 04 2021, René Scharfe wrote:
>
>> Am 26.06.21 um 23:38 schrieb Ævar Arnfjörð Bjarmason:
>>>
>>> On Sat, Jun 26 2021, René Scharfe wrote:
>>>
>>>> Am 26.06.21 um 16:11 schrieb Ævar Arnfjörð Bjarmason:
>>> The only one I know of is "Enumerating objects", which notably stalls at
>>> the start, and which I'm proposing changing the output of in:
>>> https://lore.kernel.org/git/patch-18.25-e21fc66623f-20210623T155626Z-avarab@gmail.com/
>>
>> That's annoying, but the first number I see there has five or six digits,
>> so it's not an example of the issue mentioned above for me.
>
> Because it stalls and shows nothing, but with my patch it'll show
> something while stalling, FWIW on linux.git from a cold cache it took
> 5-10s before showing anything.
>
>> Your patch shows ", stalled." while pack-objects starts up.  I'm not sure
>> this helps.  Perhaps there are cases when it gets stuck, but it's hard to
>> determine by the clock alone.  When I run gc, it just needs a few seconds
>> to prepare something and then starts visibly counting objects.  A more
>> fine-grained report of the preparation steps would help, but seeing
>> "stalled" would just scare me.
>
> Fair enough, I have other patches to have it show a spinner. Again, API
> v.s. UI. The idea is that we show something before we start the loop.

A spinner would be nicer, but I would be more interested to see what it is
actually spending all that time on.  A separate progress line might be
justified here.

>>> Also, because for these cases that you're focusing on where we count up
>>> to exactly 100% and we therefore expect N calls to display_progress()
>>> (ignoring the rare but allowed duplicate calls with the same number,
>>> which most callers don't use). We could just have a convenience API of:
>>>
>>>     start_progress()
>>>     for i in (0..X-1):
>>>         progress_update() /* passing "i" not needed, we increment internally */
>>>         work()
>>>     stop_progress()
>>>
>>> Then we could even make showing 0/N or 1/N the first time configurable,
>>> but we could only do both if we use the API as I'm suggesting, not as
>>> you want to use it.
>>
>> A function that increments the progress number relatively can be used
>> with both models.  It's more useful for the count-iterations model,
>> though, as in the count-work model you can piggy-back on the loop
>> counter check:
>>
>> 	for (i = 0; display_progress(p, i), i < X; i++)
>> 		work();
>
> Aside from this whole progress API discussion I find sticking stuff like
> that in the for-loop body to be less readable.
>
> But no, that can't be used with both models, because it conflates the 0
> of the 1st iteration with 0 of doing prep work. I.e.:
>
>     p = start_progress();
>     display_progress(p, 0);
>     prep_work();
>     for (i = 0; i < 100; i++)
>         display_progress(p, i + 1);
>
> Which is implicitly how that "stalled" patch views the world, i.e. our
> count is -1 is at start_progress() (that's already the case in
> progress.c).
>
> If you set it to 0 you're not working on the 1st item yet, but
> explicitly doing setup.
>
> Then at n=1 you're starting work on the 1st item.

A distinct preparation phase feels like an extension to the progress
API.  A symmetric cleanup phase at the end may make sense as well then.

I assume that preparations would be done between the start_progress call
and the first display_progress (no matter what number it reports).  And
cleanup would be done between the last display_progress call and the
stop_progress call.

In the count-iterations model this might report the time taken for the
first or last item as preparation or cleanup depending on the placement
of the display_progress call.  That shouldn't be much of a problem,
though, as the value of one work item is zero in that model.

> FWIW the "complicated" here was referring to dealing with break/continue.
>
> Yes I'll grant you that there's cases where the uglyness/oddity of that
> for-loop trick is going to be better than dealing with that, but there's
> also while loops doing progress, callbacks etc.

while loops can easily be converted to for loops, of course.

Callbacks are a different matter.  I think we should use them less in
general (they force different operations to use the same set of
parameters, which is worked around with context structs).  A function
to increment progress would help them because then they wouldn't need
to keep track of the item/iteration count themselves in a context
variable.

However, in some cases display_progress calls are rate-limited, e.g.
midx_display_sparse_progress does that for performance reasons.  I
wonder why, and whether this is a problem that needs to be addressed
for all callers.  We don't want the progress API to delay the actual
progress significantly!  Currently display_progress avoids updating
the progress counter; an increment function would need to write an
updated value at each call.

> Picking an API pattern that works with all of that makes sense, since
> the UI can render the count one way or the other.

Right.

>>> If we do run into those cases and want to convert them to show some
>>> intra-item progress we'd need to first migrate them over to the suggested
>>> way of using the API if we picked yours first, with my suggested use we
>>> only need to add new API calls (display_throughput(), and similar future
>>> calls/implicit display).
>>
>> I don't see why.  The intra-item progress numbers need to be reported in
>> any case if they are to be shown somehow.  If the model is clear then we
>> can show unambiguous output.
>
> Because you want to show:
>
>     Files 1/3 (33%) Bytes 17kB/9GB (0%)
>
> Not:
>
>     Files 0/3 (33%) Bytes 17kB/9GB (0%)
>
> You're downloading the 1st file, not the 0th file, so the code is a
> for-loop (or equivalent) with a display_progress(p, i + 1) for that
> file, not display_progress(p, i).
>
> This is the main reason I prefer the API and UI of reporting "what item
> am I on?" v.s. "how many items are done?", because it's easy to add
> intra-item state to the former.

Both look confusing.  If I'd care enough about one of the files or each
of them that I'd like to know their individual progress then I'd
certainly want to see their names instead of some random number.

And as you write above: The display part can easily add or subtract one
to convert the number between models.

> Sure, anyway, let's assume all those numbers are magically known and
> constant. The point was that as noted above you're downloading the 1st
> file, not the 0th file, and want to show throughput/ETA etc. for that
> file.

OK, but still some kind of indication would have to be given that the
Bytes relate to a particular File instead of being the total for this
activity.  Perhaps like this, but it's a bit cluttered:

   File 1 (Bytes 17kB/9GB, 0% done) of 3 (0% done in total)

René
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 2bcb4e0f89..3181906368 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -2096,7 +2096,7 @@  static void sort_and_scan_merged_commits(struct write_commit_graph_context *ctx)
 
 	ctx->num_extra_edges = 0;
 	for (i = 0; i < ctx->commits.nr; i++) {
-		display_progress(ctx->progress, i);
+		display_progress(ctx->progress, i + 1);
 
 		if (i && oideq(&ctx->commits.list[i - 1]->object.oid,
 			  &ctx->commits.list[i]->object.oid)) {