diff mbox series

pack-bitmap: remove trace2 region from hot path

Message ID pull.1365.git.1663938034607.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series pack-bitmap: remove trace2 region from hot path | expand

Commit Message

Derrick Stolee Sept. 23, 2022, 1 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The trace2 region around the call to lazy_bitmap_for_commit() in
bitmap_for_commit() was added in 28cd730680d (pack-bitmap: prepare to
read lookup table extension, 2022-08-14). While adding trace2 regions is
typically helpful for tracking performance, this method is called
possibly thousands of times as a commit walk explores commit history
looking for a matching bitmap. When trace2 output is enabled, this
region is emitted many times and performance is throttled by that
output.

For now, remove these regions entirely.

This is a critical path, and it would be valuable to measure that the
time spent in bitmap_for_commit() does not increase when using the
commit lookup table. The best way to do that would be to use a mechanism
that sums the time spent in a region and reports a single value at the
end of the process. This technique was introduced but not merged by [1]
so maybe this example presents some justification to revisit that
approach.

[1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
    [For 2.38.0] pack-bitmap: remove trace2 region from hot path
    
    I noticed this while trying to backport the Abhradeep's lookup table
    work into GitHub's fork. Something went wrong in that process, causing
    this region to significantly slow down. It turns out that slow down does
    not reproduce on current 'master', which is good. I must have missed
    something while I was backporting.
    
    Regardless, the use of trace2_region_enter() here should be removed or
    replaced. For the sake of 2.38.0, this simple removal is safe enough.
    However, to really dig into what was happening I had to construct a
    rebase [2] of Jeff's trace2 stopwatch work, then apply changes on top
    [3] that could replace this region with trace2_timer_*() methods.
    
    [2]
    https://github.com/git/git/compare/master...derrickstolee:trace2-stopwatch
    [3]
    https://github.com/derrickstolee/git/compare/trace2-stopwatch...bitmap-trace2
    
    As a separate discussion, it might be worth revisiting that stopwatch
    work so we have it available as a tool when doing these kinds of
    investigations.
    
    Thanks, -Stolee

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1365%2Fderrickstolee%2Fbitmap-remove-trace2-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1365/derrickstolee/bitmap-remove-trace2-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1365

 pack-bitmap.c | 2 --
 1 file changed, 2 deletions(-)


base-commit: 1b3d6e17fe83eb6f79ffbac2f2c61bbf1eaef5f8

Comments

Junio C Hamano Sept. 23, 2022, 5:05 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:


> -		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
>  		/* NEEDSWORK: cache misses aren't recorded */
>  		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
> -		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
>  		if (!bitmap)
>  			return NULL;
>  		return lookup_stored_bitmap(bitmap);

As "git blame" is bad in finding a piece of code that no longer
exists, it may make sense to leave a comment around here why we do
not have a trace2 region around this call, perhaps?

+	/* do not add trace2_region around here in the hot path */
	bitmap = lazy_bitmap_for_commit(...);

or something?

Thanks.
Junio C Hamano Sept. 23, 2022, 5:25 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>
>> -		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
>>  		/* NEEDSWORK: cache misses aren't recorded */
>>  		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
>> -		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
>>  		if (!bitmap)
>>  			return NULL;
>>  		return lookup_stored_bitmap(bitmap);
>
> As "git blame" is bad in finding a piece of code that no longer
> exists, it may make sense to leave a comment around here why we do
> not have a trace2 region around this call, perhaps?
>
> +	/* do not add trace2_region around here in the hot path */
> 	bitmap = lazy_bitmap_for_commit(...);
>
> or something?
>
> Thanks.

Also, we'd need to disable t5310.150 that expects to see
reading_lookup_table label in the trace.
Junio C Hamano Sept. 23, 2022, 5:36 p.m. UTC | #3
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

>     I noticed this while trying to backport the Abhradeep's lookup table
>     work into GitHub's fork. Something went wrong in that process, causing
>     this region to significantly slow down. It turns out that slow down does
>     not reproduce on current 'master', which is good. I must have missed
>     something while I was backporting.
>     
>     Regardless, the use of trace2_region_enter() here should be removed or
>     replaced. For the sake of 2.38.0, this simple removal is safe enough.

Well, not doing this removal is even safer, if we know that it is
not hurting in the -rc code.  It would be better than breaking our
tests without knowing ;-)  I am not upset that the patch broke the
test, by the way.  Compared to things silently breaking without
surfacing, a loud breakage in tests is a much easier thing to
handle.

My test run with the attached just finished, so that's what I am
going to queue tentatively on top.

Thanks.

 pack-bitmap.c           | 1 +
 t/t5310-pack-bitmaps.sh | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git c/pack-bitmap.c w/pack-bitmap.c
index 13457dd77e..a2bbbbd811 100644
--- c/pack-bitmap.c
+++ w/pack-bitmap.c
@@ -830,6 +830,7 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 		if (!bitmap_git->table_lookup)
 			return NULL;
 
+		/* this is a fairly hot codepath - no trace2_region please */
 		/* NEEDSWORK: cache misses aren't recorded */
 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
 		if (!bitmap)
diff --git c/t/t5310-pack-bitmaps.sh w/t/t5310-pack-bitmaps.sh
index 7e50f8e765..2b1efc923b 100755
--- c/t/t5310-pack-bitmaps.sh
+++ w/t/t5310-pack-bitmaps.sh
@@ -455,7 +455,7 @@ test_expect_success 'verify writing bitmap lookup table when enabled' '
 	grep "\"label\":\"writing_lookup_table\"" trace2
 '
 
-test_expect_success 'lookup table is actually used to traverse objects' '
+: test_expect_success 'lookup table is actually used to traverse objects' '
 	git repack -adb &&
 	GIT_TRACE2_EVENT="$(pwd)/trace3" \
 		git rev-list --use-bitmap-index --count --all &&
Derrick Stolee Sept. 23, 2022, 6:07 p.m. UTC | #4
On 9/23/2022 1:36 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>     I noticed this while trying to backport the Abhradeep's lookup table
>>     work into GitHub's fork. Something went wrong in that process, causing
>>     this region to significantly slow down. It turns out that slow down does
>>     not reproduce on current 'master', which is good. I must have missed
>>     something while I was backporting.
>>     
>>     Regardless, the use of trace2_region_enter() here should be removed or
>>     replaced. For the sake of 2.38.0, this simple removal is safe enough.
> 
> Well, not doing this removal is even safer, if we know that it is
> not hurting in the -rc code.  It would be better than breaking our
> tests without knowing ;-)  I am not upset that the patch broke the
> test, by the way.  Compared to things silently breaking without
> surfacing, a loud breakage in tests is a much easier thing to
> handle.
> 
> My test run with the attached just finished, so that's what I am
> going to queue tentatively on top.

Whoops! I'm very sorry for not noticing that there was a test checking
for this trace. Your modification makes sense and I'll try to do a more
careful replacement of that test, if possible.

Thanks,
-Stolee
Taylor Blau Sept. 23, 2022, 7:17 p.m. UTC | #5
On Fri, Sep 23, 2022 at 10:36:44AM -0700, Junio C Hamano wrote:
> diff --git c/t/t5310-pack-bitmaps.sh w/t/t5310-pack-bitmaps.sh
> index 7e50f8e765..2b1efc923b 100755
> --- c/t/t5310-pack-bitmaps.sh
> +++ w/t/t5310-pack-bitmaps.sh
> @@ -455,7 +455,7 @@ test_expect_success 'verify writing bitmap lookup table when enabled' '
>  	grep "\"label\":\"writing_lookup_table\"" trace2
>  '
>
> -test_expect_success 'lookup table is actually used to traverse objects' '
> +: test_expect_success 'lookup table is actually used to traverse objects' '
>  	git repack -adb &&
>  	GIT_TRACE2_EVENT="$(pwd)/trace3" \
>  		git rev-list --use-bitmap-index --count --all &&

Yeah, I would advocate for dropping this test entirely. That trace2
marker is the only indication that is so obvious that we're using the
lookup table to read bitmaps.

Perhaps the pack-bitmap helper could be taught to dump which extensions
it sees, and then write a test based on that? Although that isn't really
testing anything new, either, since we sometimes *notice* the extension
but don't actually read anything based on it (e.g., when disabled).

So I'd be content just dropping this.

Thanks,
Taylor
Jeff Hostetler Sept. 27, 2022, 6:37 p.m. UTC | #6
On 9/23/22 9:00 AM, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The trace2 region around the call to lazy_bitmap_for_commit() in
> bitmap_for_commit() was added in 28cd730680d (pack-bitmap: prepare to
> read lookup table extension, 2022-08-14). While adding trace2 regions is
> typically helpful for tracking performance, this method is called
> possibly thousands of times as a commit walk explores commit history
> looking for a matching bitmap. When trace2 output is enabled, this
> region is emitted many times and performance is throttled by that
> output.
> 
> For now, remove these regions entirely.
> 
> This is a critical path, and it would be valuable to measure that the
> time spent in bitmap_for_commit() does not increase when using the
> commit lookup table. The best way to do that would be to use a mechanism
> that sums the time spent in a region and reports a single value at the
> end of the process. This technique was introduced but not merged by [1]
> so maybe this example presents some justification to revisit that
> approach.
> 
> [1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/

I'll dust off my "timers and counters" series, address the
various issues that were raised back in December, and resubmit.

Thanks for the reminder.
Jeff
diff mbox series

Patch

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9a208abc1fd..13457dd77e5 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -830,10 +830,8 @@  struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 		if (!bitmap_git->table_lookup)
 			return NULL;
 
-		trace2_region_enter("pack-bitmap", "reading_lookup_table", the_repository);
 		/* NEEDSWORK: cache misses aren't recorded */
 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
-		trace2_region_leave("pack-bitmap", "reading_lookup_table", the_repository);
 		if (!bitmap)
 			return NULL;
 		return lookup_stored_bitmap(bitmap);