diff mbox series

trace2: log progress time and throughput

Message ID 20200512214420.36329-1-emilyshaffer@google.com (mailing list archive)
State New, archived
Headers show
Series trace2: log progress time and throughput | expand

Commit Message

Emily Shaffer May 12, 2020, 9:44 p.m. UTC
Rather than teaching only one operation, like 'git fetch', how to write
down throughput to traces, we can learn about a wide range of user
operations that may seem slow by adding tooling to the progress library
itself. Operations which display progress are likely to be slow-running
and the kind of thing we want to monitor for performance anyways. By
showing object counts and data transfer size, we should be able to
make some derived measurements to ensure operations are scaling the way
we expect.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
One note: by putting trace collection into the progress library, we end
up with data events which have titles like "Receiving objects" - not
very machine-parseable. An alternative might be to ask for a
machine-readable title in the progress struct, but I didn't think it was
worth the code churn. However, I don't have experience with processing
the trace data after it's been collected, so if this is a bigger problem
than I think, please say so and I'll figure something out.

CI run here, although it failed on the same error Junio noted today[1]:
https://github.com/nasamuffin/git/runs/668457062

 - Emily

[1]: https://lore.kernel.org/git/xmqqtv0kc2q1.fsf@gitster.c.googlers.com

 progress.c                  | 17 +++++++++++++++++
 t/t0500-progress-display.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

Comments

Junio C Hamano May 12, 2020, 10:14 p.m. UTC | #1
Emily Shaffer <emilyshaffer@google.com> writes:

> Rather than teaching only one operation, like 'git fetch', how to write
> down throughput to traces, we can learn about a wide range of user
> operations that may seem slow by adding tooling to the progress library
> itself. Operations which display progress are likely to be slow-running
> and the kind of thing we want to monitor for performance anyways. By
> showing object counts and data transfer size, we should be able to
> make some derived measurements to ensure operations are scaling the way
> we expect.

Excellent observation ;-)

>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> One note: by putting trace collection into the progress library, we end
> up with data events which have titles like "Receiving objects" - not
> very machine-parseable. An alternative might be to ask for a
> machine-readable title in the progress struct, but I didn't think it was
> worth the code churn.

Perhaps make it an optional member of the struct, and convert only a
selected few callers to set it and see how well it helps, while
leaving all the other callers set NULL to the member to keep showing
not-very-useful titles?

> CI run here, although it failed on the same error Junio noted today[1]:

I think Dscho's latest patch, even though it didn't mention my message,
is to fix that error.  So with that queued in the same batch, we'll
see a full CI coverage again.

Will queue.  Thanks.
Josh Steadmon May 13, 2020, 7:46 p.m. UTC | #2
On 2020.05.12 14:44, Emily Shaffer wrote:
> Rather than teaching only one operation, like 'git fetch', how to write
> down throughput to traces, we can learn about a wide range of user
> operations that may seem slow by adding tooling to the progress library
> itself. Operations which display progress are likely to be slow-running
> and the kind of thing we want to monitor for performance anyways. By
> showing object counts and data transfer size, we should be able to
> make some derived measurements to ensure operations are scaling the way
> we expect.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> One note: by putting trace collection into the progress library, we end
> up with data events which have titles like "Receiving objects" - not
> very machine-parseable. An alternative might be to ask for a
> machine-readable title in the progress struct, but I didn't think it was
> worth the code churn. However, I don't have experience with processing
> the trace data after it's been collected, so if this is a bigger problem
> than I think, please say so and I'll figure something out.
> 
> CI run here, although it failed on the same error Junio noted today[1]:
> https://github.com/nasamuffin/git/runs/668457062
> 
>  - Emily
> 
> [1]: https://lore.kernel.org/git/xmqqtv0kc2q1.fsf@gitster.c.googlers.com


I like Junio's idea of adding an optional machine-readable field in the
progress struct, but I don't think it is necessarily a blocker for this
change. Everything looks good to me.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Derrick Stolee May 15, 2020, 10:59 a.m. UTC | #3
On 5/12/2020 5:44 PM, Emily Shaffer wrote:
> Rather than teaching only one operation, like 'git fetch', how to write
> down throughput to traces, we can learn about a wide range of user
> operations that may seem slow by adding tooling to the progress library
> itself. Operations which display progress are likely to be slow-running
> and the kind of thing we want to monitor for performance anyways. By
> showing object counts and data transfer size, we should be able to
> make some derived measurements to ensure operations are scaling the way
> we expect.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
> One note: by putting trace collection into the progress library, we end
> up with data events which have titles like "Receiving objects" - not
> very machine-parseable. An alternative might be to ask for a
> machine-readable title in the progress struct, but I didn't think it was
> worth the code churn. However, I don't have experience with processing
> the trace data after it's been collected, so if this is a bigger problem
> than I think, please say so and I'll figure something out.
> 
> CI run here, although it failed on the same error Junio noted today[1]:
> https://github.com/nasamuffin/git/runs/668457062
> 
>  - Emily
> 
> [1]: https://lore.kernel.org/git/xmqqtv0kc2q1.fsf@gitster.c.googlers.com
> 
>  progress.c                  | 17 +++++++++++++++++
>  t/t0500-progress-display.sh | 26 ++++++++++++++++++++++++++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/progress.c b/progress.c
> index 75633e9c5e..6d2dcff0b6 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -265,6 +265,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>  	progress->title_len = utf8_strwidth(title);
>  	progress->split = 0;
>  	set_progress_signal();
> +	trace2_region_enter("progress", title, the_repository);
>  	return progress;
>  }
>  
> @@ -320,6 +321,22 @@ void stop_progress(struct progress **p_progress)
>  {
>  	finish_if_sparse(*p_progress);
>  
> +	if (p_progress && *p_progress) {
> +		trace2_data_intmax("progress", the_repository, "total_objects",
> +				   (*p_progress)->total);

Should this be "total_objects"? Progress can iterate over lots of things,
such as cache entries. Perhaps leave it as "total"?

> +
> +		if ((*p_progress)->throughput)
> +			trace2_data_intmax("progress", the_repository,
> +					   "total_bytes",
> +					   (*p_progress)->throughput->curr_total);

I like the extra detail here for the specific kind of progress used in
network transfer.

> +	}
> +
> +	trace2_region_leave("progress",
> +			    p_progress && *p_progress
> +				? (*p_progress)->title
> +				: NULL,
> +			    the_repository);
> +
>  	stop_progress_msg(p_progress, _("done"));
>  }

This trace2_region_leave() needs to be conditional on the progress
being non-null. The pattern used by consumers of the progress API is:

	if (my_progress_condition)
		start_progress(&progress);

	do {
		display_progress(&progress, count);
	} while (condition);

	stop_progress(&progress);

The condition to show progress or not is conditional on an option that
is external to the progress API.

The fix for this patch is simple: make the trace2_region_leave() be
conditional on "p_progress && *p_progress".

This leads to an extra problem: if a user uses an option such as "--quiet",
then the trace2 regions won't appear at all. This becomes even more important
when thinking about scripts or tools that have stderr as a non-TTY, which
disables progress most of the time.

It's best to have trace2 data be consistent across commands. I think this can
be accomplished, but it is a more invasive change to the rest of the codebase.
It requires invoking the progress API in all cases, and having the progress
API conditionally initialize the progress struct. The new pattern would look
like this:

	start_progress(&progress, my_progress_condition);

	do {
		display_progress(&progress, count);
	} while (condition);

	stop_progress(&progress);

Then, start_progress() (and variants) could always start the trace2 region,
and stop_progress() could always end the trace2 region.

>  
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index d2d088d9a0..1ed1df351c 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -283,4 +283,30 @@ test_expect_success 'cover up after throughput shortens a lot' '
>  	test_i18ncmp expect out
>  '
>  
> +test_expect_success 'progress generates traces' '
> +	cat >in <<-\EOF &&
> +	throughput 102400 1000
> +	update
> +	progress 10
> +	throughput 204800 2000
> +	update
> +	progress 20
> +	throughput 307200 3000
> +	update
> +	progress 30
> +	throughput 409600 4000
> +	update
> +	progress 40
> +	EOF
> +
> +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> +		"Working hard" <in 2>stderr &&
> +
> +	# t0212/parse_events.perl intentionally omits regions and data.
> +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> +'
> +

Thanks for the test! While this is sufficient to test the trace2 in the
happy path, I think that if you run a Git command such as `git commit-graph write`
that automatically quiets progress in the test suite (non-TTY stderr) you will
find the trace2 logs malformed due to excess end regions.

Thanks,
-Stolee
Junio C Hamano May 15, 2020, 3:27 p.m. UTC | #4
Derrick Stolee <stolee@gmail.com> writes:

>> +	if (p_progress && *p_progress) {
>> +		trace2_data_intmax("progress", the_repository, "total_objects",
>> +				   (*p_progress)->total);
>
> Should this be "total_objects"? Progress can iterate over lots of things,
> such as cache entries. Perhaps leave it as "total"?

Good point.  

I think the literal strings like "total_objects" we see in this
patch can later be customized in the same future change that lets us
give a more useful output than "Receiving objects" by having a
handful of customizable strings in the progress struct as hinted by
Emily under the three-dash line.  So I do not mind leaving it as-is.
If a reroll has to come without making these strings customizable,
it may be an improvement to change it to "total" if we do not forget,
but as long as we know where we are going in the longer term, I do
not think it is a big deal.

>> +
>> +		if ((*p_progress)->throughput)
>> +			trace2_data_intmax("progress", the_repository,
>> +					   "total_bytes",
>> +					   (*p_progress)->throughput->curr_total);
>
> I like the extra detail here for the specific kind of progress used in
> network transfer.

The curr_total field must be counted in bytes, by the fact that
strbuf_humanise_bytes() is called by throughput_string() to show it,
so "total_bytes" does make perfect sense.  The field might want to
get renamed to reflect this but obviously that's the kind of change
that we would not want in the middle of a more meaningful change
like this one.

>> +	}
>> +
>> +	trace2_region_leave("progress",
>> +			    p_progress && *p_progress
>> +				? (*p_progress)->title
>> +				: NULL,
>> +			    the_repository);
>> +
>>  	stop_progress_msg(p_progress, _("done"));
>>  }
>
> This trace2_region_leave() needs to be conditional on the progress
> being non-null. The pattern used by consumers of the progress API is:
>
> 	if (my_progress_condition)
> 		start_progress(&progress);
>
> 	do {
> 		display_progress(&progress, count);
> 	} while (condition);
>
> 	stop_progress(&progress);
>
> The condition to show progress or not is conditional on an option that
> is external to the progress API.
>
> The fix for this patch is simple: make the trace2_region_leave() be
> conditional on "p_progress && *p_progress".

Makes sense.

> This leads to an extra problem: if a user uses an option such as "--quiet",
> then the trace2 regions won't appear at all. This becomes even more important
> when thinking about scripts or tools that have stderr as a non-TTY, which
> disables progress most of the time.
>
> It's best to have trace2 data be consistent across commands. I think this can
> be accomplished, but it is a more invasive change to the rest of the codebase.

True, because "--quiet" means we cannot piggyback on the progress code.

> It requires invoking the progress API in all cases, and having the progress
> API conditionally initialize the progress struct. The new pattern would look
> like this:
>
> 	start_progress(&progress, my_progress_condition);
>
> 	do {
> 		display_progress(&progress, count);
> 	} while (condition);
>
> 	stop_progress(&progress);
>
> Then, start_progress() (and variants) could always start the trace2 region,
> and stop_progress() could always end the trace2 region.

OK.  Good analysis.

Thanks.
Junio C Hamano May 15, 2020, 4:09 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> Derrick Stolee <stolee@gmail.com> writes:
> ...
>> This trace2_region_leave() needs to be conditional on the progress
>> being non-null. The pattern used by consumers of the progress API is:
>>
>> 	if (my_progress_condition)
>> 		start_progress(&progress);
>>
>> 	do {
>> 		display_progress(&progress, count);
>> 	} while (condition);
>>
>> 	stop_progress(&progress);
>>
>> The condition to show progress or not is conditional on an option that
>> is external to the progress API.
>>
>> The fix for this patch is simple: make the trace2_region_leave() be
>> conditional on "p_progress && *p_progress".

Oops.  That means that we need to apply the fix before -rc1 to
'master' X-<.

Something like this?

-- >8 --
Subject: progress: call trace2_region_leave() only after calling _enter()
From: Derrick Stolee <dstolee@microsoft.com>

A user of progress API calls start_progress() conditionally and
depends on the display_progress() and stop_progress() functions to
become no-op when start_progress() hasn't been called.

As we added a call to trace2_region_enter() to start_progress(), the
calls to other trace2 API calls from the progress API functions must
make sure that these trace2 calls are skipped when start_progress()
hasn't been called on the progress struct.  Specifically, do not
call trace2_region_leave() from stop_progress() when we haven't
called start_progress(), which would have called the matching
trace2_region_enter().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 progress.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/progress.c b/progress.c
index 6d2dcff0b6..3eda914518 100644
--- a/progress.c
+++ b/progress.c
@@ -329,13 +329,9 @@ void stop_progress(struct progress **p_progress)
 			trace2_data_intmax("progress", the_repository,
 					   "total_bytes",
 					   (*p_progress)->throughput->curr_total);
-	}
 
-	trace2_region_leave("progress",
-			    p_progress && *p_progress
-				? (*p_progress)->title
-				: NULL,
-			    the_repository);
+		trace2_region_leave("progress", (*p_progress)->title, the_repository);
+	}
 
 	stop_progress_msg(p_progress, _("done"));
 }
Derrick Stolee May 15, 2020, 4:49 p.m. UTC | #6
On 5/15/2020 12:09 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Derrick Stolee <stolee@gmail.com> writes:
>> ...
>>> This trace2_region_leave() needs to be conditional on the progress
>>> being non-null. The pattern used by consumers of the progress API is:
>>>
>>> 	if (my_progress_condition)
>>> 		start_progress(&progress);
>>>
>>> 	do {
>>> 		display_progress(&progress, count);
>>> 	} while (condition);
>>>
>>> 	stop_progress(&progress);
>>>
>>> The condition to show progress or not is conditional on an option that
>>> is external to the progress API.
>>>
>>> The fix for this patch is simple: make the trace2_region_leave() be
>>> conditional on "p_progress && *p_progress".
> 
> Oops.  That means that we need to apply the fix before -rc1 to
> 'master' X-<.

Sorry I didn't catch this in time for rc0!

> Something like this?

The patch below is exactly what I would have recommended. It looks like
I wrote it, too! ;)

> -- >8 --
> Subject: progress: call trace2_region_leave() only after calling _enter()
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> A user of progress API calls start_progress() conditionally and
> depends on the display_progress() and stop_progress() functions to
> become no-op when start_progress() hasn't been called.
> 
> As we added a call to trace2_region_enter() to start_progress(), the
> calls to other trace2 API calls from the progress API functions must
> make sure that these trace2 calls are skipped when start_progress()
> hasn't been called on the progress struct.  Specifically, do not
> call trace2_region_leave() from stop_progress() when we haven't
> called start_progress(), which would have called the matching
> trace2_region_enter().
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  progress.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/progress.c b/progress.c
> index 6d2dcff0b6..3eda914518 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -329,13 +329,9 @@ void stop_progress(struct progress **p_progress)
>  			trace2_data_intmax("progress", the_repository,
>  					   "total_bytes",
>  					   (*p_progress)->throughput->curr_total);
> -	}
>  
> -	trace2_region_leave("progress",
> -			    p_progress && *p_progress
> -				? (*p_progress)->title
> -				: NULL,
> -			    the_repository);
> +		trace2_region_leave("progress", (*p_progress)->title, the_repository);
> +	}
>  
>  	stop_progress_msg(p_progress, _("done"));
>  }

Thanks,
-Stolee
Junio C Hamano May 15, 2020, 5 p.m. UTC | #7
Derrick Stolee <stolee@gmail.com> writes:

>> Something like this?
>
> The patch below is exactly what I would have recommended. It looks like
> I wrote it, too! ;)

Thanks, will queue.
Jeff Hostetler May 15, 2020, 7:37 p.m. UTC | #8
On 5/12/20 5:44 PM, Emily Shaffer wrote:
> Rather than teaching only one operation, like 'git fetch', how to write
> down throughput to traces, we can learn about a wide range of user
> operations that may seem slow by adding tooling to the progress library
> itself. Operations which display progress are likely to be slow-running
> and the kind of thing we want to monitor for performance anyways. By
> showing object counts and data transfer size, we should be able to
> make some derived measurements to ensure operations are scaling the way
> we expect.
> 
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>

We need to be careful here.  The region_enter and _leave calls need
to always be properly nested.  Having an implicit region within the
progress code means the code path between all of the different start_ 
and stop_progress calls need to not interleave with explicit regions.

How about putting something like this in stop_progress:

     struct json_writer jw = JSON_WRITER_INIT;
     jw_object_begin(&jw, 0);
     jw_object_intmax(&jw, "total", p->total);
     if (p->throughput)
         jw_object_intmax(&jw, "throughput", p->throughput->curr_total);
     /*
      * and so on...
      * and maybe include the elapsed time in since the start_progress.
      */
     jw_end(&jw);
     trace2_data_json("progress", NULL, p->title, &jw);
     jw_release(&jw);

That will give you a single record summary of the progress meter
and you won't have to worry about any interleaving.

You could also add a bit to `struct progress` to let you opt-in
or opt-out of the message on a case-by-case basis.

Jeff



> ---
> One note: by putting trace collection into the progress library, we end
> up with data events which have titles like "Receiving objects" - not
> very machine-parseable. An alternative might be to ask for a
> machine-readable title in the progress struct, but I didn't think it was
> worth the code churn. However, I don't have experience with processing
> the trace data after it's been collected, so if this is a bigger problem
> than I think, please say so and I'll figure something out.
> 
> CI run here, although it failed on the same error Junio noted today[1]:
> https://github.com/nasamuffin/git/runs/668457062
> 
>   - Emily
> 
> [1]: https://lore.kernel.org/git/xmqqtv0kc2q1.fsf@gitster.c.googlers.com
> 
>   progress.c                  | 17 +++++++++++++++++
>   t/t0500-progress-display.sh | 26 ++++++++++++++++++++++++++
>   2 files changed, 43 insertions(+)
> 
> diff --git a/progress.c b/progress.c
> index 75633e9c5e..6d2dcff0b6 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -265,6 +265,7 @@ static struct progress *start_progress_delay(const char *title, uint64_t total,
>   	progress->title_len = utf8_strwidth(title);
>   	progress->split = 0;
>   	set_progress_signal();
> +	trace2_region_enter("progress", title, the_repository);
>   	return progress;
>   }
>   
> @@ -320,6 +321,22 @@ void stop_progress(struct progress **p_progress)
>   {
>   	finish_if_sparse(*p_progress);
>   
> +	if (p_progress && *p_progress) {
> +		trace2_data_intmax("progress", the_repository, "total_objects",
> +				   (*p_progress)->total);
> +
> +		if ((*p_progress)->throughput)
> +			trace2_data_intmax("progress", the_repository,
> +					   "total_bytes",
> +					   (*p_progress)->throughput->curr_total);
> +	}
> +
> +	trace2_region_leave("progress",
> +			    p_progress && *p_progress
> +				? (*p_progress)->title
> +				: NULL,
> +			    the_repository);
> +
>   	stop_progress_msg(p_progress, _("done"));
>   }
>   
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index d2d088d9a0..1ed1df351c 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -283,4 +283,30 @@ test_expect_success 'cover up after throughput shortens a lot' '
>   	test_i18ncmp expect out
>   '
>   
> +test_expect_success 'progress generates traces' '
> +	cat >in <<-\EOF &&
> +	throughput 102400 1000
> +	update
> +	progress 10
> +	throughput 204800 2000
> +	update
> +	progress 20
> +	throughput 307200 3000
> +	update
> +	progress 30
> +	throughput 409600 4000
> +	update
> +	progress 40
> +	EOF
> +
> +	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
> +		"Working hard" <in 2>stderr &&
> +
> +	# t0212/parse_events.perl intentionally omits regions and data.
> +	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
> +	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
> +	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
> +	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
> +'
> +
>   test_done
>
Jeff Hostetler May 15, 2020, 7:44 p.m. UTC | #9
On 5/15/20 3:37 PM, Jeff Hostetler wrote:
> 
> 
> On 5/12/20 5:44 PM, Emily Shaffer wrote:
>> Rather than teaching only one operation, like 'git fetch', how to write
>> down throughput to traces, we can learn about a wide range of user
>> operations that may seem slow by adding tooling to the progress library
>> itself. Operations which display progress are likely to be slow-running
>> and the kind of thing we want to monitor for performance anyways. By
>> showing object counts and data transfer size, we should be able to
>> make some derived measurements to ensure operations are scaling the way
>> we expect.
>>
>> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> 
> We need to be careful here.  The region_enter and _leave calls need
> to always be properly nested.  Having an implicit region within the
> progress code means the code path between all of the different start_ 
> and stop_progress calls need to not interleave with explicit regions.
> 
> How about putting something like this in stop_progress:
> 
>      struct json_writer jw = JSON_WRITER_INIT;
>      jw_object_begin(&jw, 0);
>      jw_object_intmax(&jw, "total", p->total);
>      if (p->throughput)
>          jw_object_intmax(&jw, "throughput", p->throughput->curr_total);
>      /*
>       * and so on...
>       * and maybe include the elapsed time in since the start_progress.
>       */
>      jw_end(&jw);
>      trace2_data_json("progress", NULL, p->title, &jw);
>      jw_release(&jw);
> 
> That will give you a single record summary of the progress meter
> and you won't have to worry about any interleaving.
> 
> You could also add a bit to `struct progress` to let you opt-in
> or opt-out of the message on a case-by-case basis.
> 
> Jeff

Hit send too quickly...

One of the advantages of a "data_json" event is that it will keep
the multiple values together in one event.  This makes it easier to
correlate them during post processing.

For example, you could grep for "\"data_json\".*\"progress\".*<title>"
and easily parse and see the total and time and etc in one record.
diff mbox series

Patch

diff --git a/progress.c b/progress.c
index 75633e9c5e..6d2dcff0b6 100644
--- a/progress.c
+++ b/progress.c
@@ -265,6 +265,7 @@  static struct progress *start_progress_delay(const char *title, uint64_t total,
 	progress->title_len = utf8_strwidth(title);
 	progress->split = 0;
 	set_progress_signal();
+	trace2_region_enter("progress", title, the_repository);
 	return progress;
 }
 
@@ -320,6 +321,22 @@  void stop_progress(struct progress **p_progress)
 {
 	finish_if_sparse(*p_progress);
 
+	if (p_progress && *p_progress) {
+		trace2_data_intmax("progress", the_repository, "total_objects",
+				   (*p_progress)->total);
+
+		if ((*p_progress)->throughput)
+			trace2_data_intmax("progress", the_repository,
+					   "total_bytes",
+					   (*p_progress)->throughput->curr_total);
+	}
+
+	trace2_region_leave("progress",
+			    p_progress && *p_progress
+				? (*p_progress)->title
+				: NULL,
+			    the_repository);
+
 	stop_progress_msg(p_progress, _("done"));
 }
 
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index d2d088d9a0..1ed1df351c 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -283,4 +283,30 @@  test_expect_success 'cover up after throughput shortens a lot' '
 	test_i18ncmp expect out
 '
 
+test_expect_success 'progress generates traces' '
+	cat >in <<-\EOF &&
+	throughput 102400 1000
+	update
+	progress 10
+	throughput 204800 2000
+	update
+	progress 20
+	throughput 307200 3000
+	update
+	progress 30
+	throughput 409600 4000
+	update
+	progress 40
+	EOF
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" test-tool progress --total=40 \
+		"Working hard" <in 2>stderr &&
+
+	# t0212/parse_events.perl intentionally omits regions and data.
+	grep -e "region_enter" -e "\"category\":\"progress\"" trace.event &&
+	grep -e "region_leave" -e "\"category\":\"progress\"" trace.event &&
+	grep "\"key\":\"total_objects\",\"value\":\"40\"" trace.event &&
+	grep "\"key\":\"total_bytes\",\"value\":\"409600\"" trace.event
+'
+
 test_done