[0/3] pack-bitmap: boundary-based bitmap traversal

Message ID	cover.1682380788.git.me@ttaylorr.com (mailing list archive)
Headers	show Return-Path: <git-owner@vger.kernel.org> Date: Mon, 24 Apr 2023 20:00:17 -0400 From: Taylor Blau <me@ttaylorr.com> To: git@vger.kernel.org Cc: Jeff King <peff@peff.net>, Derrick Stolee <derrickstolee@github.com>, Junio C Hamano <gitster@pobox.com> Subject: [PATCH 0/3] pack-bitmap: boundary-based bitmap traversal Message-ID: <cover.1682380788.git.me@ttaylorr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Precedence: bulk
Series	pack-bitmap: boundary-based bitmap traversal \| expand [0/3] pack-bitmap: boundary-based bitmap traversal [1/3] revision: support tracking uninteresting commits [2/3] pack-bitmap.c: extract `fill_in_bitmap()` [3/3] pack-bitmap.c: use commit boundary during bitmap traversal

Taylor Blau April 25, 2023, midnight UTC

Here is a short (but dense) series that I worked on towards the end of
2021 with Peff.

Its goal is to improve the bitmap traversal we implement in
prepare_bitmap_walk(). Specifically, it avoids building up a complete
bitmap of the "haves" side, and instead uses a combination of (a)
UNINTERESTING commits between the tips and boundary, and (b) the
boundary itself.

The gory details are laid out in 3/3, but the high-level overview of the
new algorithm to compute the set of objects between "haves" and "wants"
using bitmaps is:

  1. Build a (partial) bitmap of the haves side by first OR-ing any
     bitmap(s) that already exist for UNINTERESTING commits between the
     haves and the boundary.

  2. For each commit along the boundary, add it as a fill-in traversal
     tip (where the traversal terminates once an existing bitmap is
     found), and perform fill-in traversal.

  3. Build up a complete bitmap of the wants side as usual, stopping any
     time we intersect the (partial) haves side.

This improves many cases where using bitmaps was significantly *slower*
than regular, non-bitmap traversal. In some instances, using bitmaps is
still slower than the non-bitmap case, but it is a substantial
improvement over the classic bitmap walk.

    $ ours="$(git branch --show-current)"
      argv="--count --objects $ours --not --exclude=$ours --branches"
      hyperfine \
        -n 'no bitmaps' "git.compile rev-list $argv" \
        -n 'existing bitmap traversal' "git rev-list --use-bitmap-index $argv" \
        -n 'this commit' "git.compile rev-list --use-bitmap-index $argv"
    Benchmark 1: no bitmaps
      Time (mean ± σ):      15.1 ms ±   4.1 ms    [User: 8.1 ms, System: 6.9 ms]
      Range (min … max):     7.4 ms …  21.8 ms    131 runs

    Benchmark 2: existing bitmap traversal
      Time (mean ± σ):      82.6 ms ±   9.2 ms    [User: 63.6 ms, System: 19.0 ms]
      Range (min … max):    73.8 ms … 105.4 ms    28 runs

    Benchmark 3: this commit
      Time (mean ± σ):      19.8 ms ±   3.1 ms    [User: 13.0 ms, System: 6.8 ms]
      Range (min … max):    17.7 ms …  38.6 ms    158 runs

    Summary
      'no bitmaps' ran
        1.31 ± 0.41 times faster than 'this commit'
        5.49 ± 1.62 times faster than 'existing bitmap traversal'

In a large repository on GitHub.com, the timings to compute the objects
unique to "master", as in:

    $ git rev-list --count --objects master --not --exclude=master --branches

improve as follows:

  - 1.012 sec (without bitmaps)
  - 29.571 sec (with the existing bitmap walk)
  - 2.279 sec (with these patches)

The first couple of patches are preparatory, and the third patch is the
substantive one.

Please have a look and poke any holes you can find in the new approach
:-). Thanks in advance for your review.

Taylor Blau (3):
  revision: support tracking uninteresting commits
  pack-bitmap.c: extract `fill_in_bitmap()`
  pack-bitmap.c: use commit boundary during bitmap traversal

 pack-bitmap.c | 252 ++++++++++++++++++++++++++++++++++++++------------
 revision.c    |  10 +-
 revision.h    |   5 +
 3 files changed, 205 insertions(+), 62 deletions(-)

Junio C Hamano April 25, 2023, 6:02 p.m. UTC | #1

Taylor Blau <me@ttaylorr.com> writes:

> This improves many cases where using bitmaps was significantly *slower*
> than regular, non-bitmap traversal. In some instances, using bitmaps is
> still slower than the non-bitmap case, but it is a substantial
> improvement over the classic bitmap walk.
> ...
> In a large repository on GitHub.com, the timings to compute the objects
> unique to "master", as in:
>
>     $ git rev-list --count --objects master --not --exclude=master --branches
>
> improve as follows:

Curious---when would it be significantly faster to use bitmaps?
"Most of the time"?  "In not-too-large repository?"

Derrick Stolee April 25, 2023, 6:06 p.m. UTC | #2

On 4/24/2023 8:00 PM, Taylor Blau wrote:
> Here is a short (but dense) series that I worked on towards the end of
> 2021 with Peff.
> 
> Its goal is to improve the bitmap traversal we implement in
> prepare_bitmap_walk(). Specifically, it avoids building up a complete
> bitmap of the "haves" side, and instead uses a combination of (a)
> UNINTERESTING commits between the tips and boundary, and (b) the
> boundary itself.
> 
> The gory details are laid out in 3/3, but the high-level overview of the
> new algorithm to compute the set of objects between "haves" and "wants"
> using bitmaps is:
> 
>   1. Build a (partial) bitmap of the haves side by first OR-ing any
>      bitmap(s) that already exist for UNINTERESTING commits between the
>      haves and the boundary.
> 
>   2. For each commit along the boundary, add it as a fill-in traversal
>      tip (where the traversal terminates once an existing bitmap is
>      found), and perform fill-in traversal.
> 
>   3. Build up a complete bitmap of the wants side as usual, stopping any
>      time we intersect the (partial) haves side.

In other words: this generates something closer to the object set in the
non-bitmapped object walk. The only difference is that the new bitmapped
algorithm will see objects that were re-introduced across the boundary
(say, a blob was reverted to its older mode).
 
> This improves many cases where using bitmaps was significantly *slower*
> than regular, non-bitmap traversal. In some instances, using bitmaps is
> still slower than the non-bitmap case, but it is a substantial
> improvement over the classic bitmap walk.> 
>     $ ours="$(git branch --show-current)"
>       argv="--count --objects $ours --not --exclude=$ours --branches"
>       hyperfine \
>         -n 'no bitmaps' "git.compile rev-list $argv" \
>         -n 'existing bitmap traversal' "git rev-list --use-bitmap-index $argv" \
>         -n 'this commit' "git.compile rev-list --use-bitmap-index $argv"
>     Benchmark 1: no bitmaps
>       Time (mean ± σ):      15.1 ms ±   4.1 ms    [User: 8.1 ms, System: 6.9 ms]
>       Range (min … max):     7.4 ms …  21.8 ms    131 runs
> 
>     Benchmark 2: existing bitmap traversal
>       Time (mean ± σ):      82.6 ms ±   9.2 ms    [User: 63.6 ms, System: 19.0 ms]
>       Range (min … max):    73.8 ms … 105.4 ms    28 runs
> 
>     Benchmark 3: this commit
>       Time (mean ± σ):      19.8 ms ±   3.1 ms    [User: 13.0 ms, System: 6.8 ms]
>       Range (min … max):    17.7 ms …  38.6 ms    158 runs
> 
>     Summary
>       'no bitmaps' ran
>         1.31 ± 0.41 times faster than 'this commit'
>         5.49 ± 1.62 times faster than 'existing bitmap traversal'
> 
> In a large repository on GitHub.com, the timings to compute the objects
> unique to "master", as in:
> 
>     $ git rev-list --count --objects master --not --exclude=master --branches
> 
> improve as follows:
> 
>   - 1.012 sec (without bitmaps)
>   - 29.571 sec (with the existing bitmap walk)
>   - 2.279 sec (with these patches)

For my curiosity, and since you already have a test environment set up,
could you redo the "without bitmaps" case with pack.useSparse true and
false? When the option was created and defaulted to true, we never
really explored comparing it to the bitmap case. In fact, I assumed the
bitmap case was faster in important cases like this (where there is a
substantial difference in object counts), but your data is surprising me
that the sparse algorithm is outperforming bitmaps, even with this new
algorithm.

The main question I'd like to ask is: is pack.useSparse contributing
in an important way to the performance here?
 
I'll go poking into the patches now.

Thanks,
-Stolee

Derrick Stolee April 25, 2023, 6:30 p.m. UTC | #3

On 4/25/2023 2:02 PM, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> 
>> This improves many cases where using bitmaps was significantly *slower*
>> than regular, non-bitmap traversal. In some instances, using bitmaps is
>> still slower than the non-bitmap case, but it is a substantial
>> improvement over the classic bitmap walk.
>> ...
>> In a large repository on GitHub.com, the timings to compute the objects
>> unique to "master", as in:
>>
>>     $ git rev-list --count --objects master --not --exclude=master --branches
>>
>> improve as follows:
> 
> Curious---when would it be significantly faster to use bitmaps?
> "Most of the time"?  "In not-too-large repository?"

This is an interesting question that is very difficult to predict
in advance.

When building an independent implementation of reachability bitmaps
for Azure Repos, we never managed to make the bitmap method faster
for incremental fetches, on average, and so only enabled them during
full clones (no haves).

Of course, it's relatively easy to construct something that looks
like an incremental fetch but is as expensive as a clone (include
one have being a root commit), but it is very rare in practice.

It would be interesting if we used the initial commit walk to the
boundary as a predictor of whether bitmaps would be good, or if
we should use the tree-parsing walk instead.

But perhaps this series gets the bitmap walk "close enough" in
the typical case that it's better to use bitmaps in the chance that
we have accidentally hit a case where we'd normally need to walk a
lot of trees.

Thanks,
-Stolee

Taylor Blau April 25, 2023, 6:57 p.m. UTC | #4

On Tue, Apr 25, 2023 at 02:30:01PM -0400, Derrick Stolee wrote:
> On 4/25/2023 2:02 PM, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> >> This improves many cases where using bitmaps was significantly *slower*
> >> than regular, non-bitmap traversal. In some instances, using bitmaps is
> >> still slower than the non-bitmap case, but it is a substantial
> >> improvement over the classic bitmap walk.
> >> ...
> >> In a large repository on GitHub.com, the timings to compute the objects
> >> unique to "master", as in:
> >>
> >>     $ git rev-list --count --objects master --not --exclude=master --branches
> >>
> >> improve as follows:
> >
> > Curious---when would it be significantly faster to use bitmaps?
> > "Most of the time"?  "In not-too-large repository?"
>
> This is an interesting question that is very difficult to predict
> in advance.

Indeed ;-).

> It would be interesting if we used the initial commit walk to the
> boundary as a predictor of whether bitmaps would be good, or if
> we should use the tree-parsing walk instead.

I think that this is tough, too. When making a prediction, it would be
good if we could avoid walking down to the nearest bitmap, since by that
point, you might as well have used bitmaps (if you found one before
reaching the root(s) of history).

I think the best heuristic we have available is if we have is something
like "# of UNINTERESTING commits between the tips and boundary", since
we OR those in ahead of time, and then the fill-in traversal will
(usually, not always) be cheap.

But I think that heuristic probably isn't too predictive, since it would
often give you the wrong prediction, say, when there are no
UNINTERESTING commits between the tips and boundary, but the boundary is
bitmapped.

Another approach you could take is to cut off the fill-in traversal at
some point, or start with whichever is presumed to be faster and then
dynamically switch to the other at some point during the walk. But that
also depends on repository shape/size, and machine configuration.

So I think it's extremely difficult to get "right", but if others have
suggestions, I would certainly be receptive to them :-).

> But perhaps this series gets the bitmap walk "close enough" in
> the typical case that it's better to use bitmaps in the chance that
> we have accidentally hit a case where we'd normally need to walk a
> lot of trees.

Yeah, that is definitely part of this series's goal. Another framing is,
for configurations that are using bitmaps everywhere, this should reduce
the number of cases where using bitmaps is *significantly* slower than
not.

Thanks,
Taylor

Taylor Blau April 25, 2023, 7:01 p.m. UTC | #5

On Tue, Apr 25, 2023 at 02:06:25PM -0400, Derrick Stolee wrote:
> In other words: this generates something closer to the object set in the
> non-bitmapped object walk. The only difference is that the new bitmapped
> algorithm will see objects that were re-introduced across the boundary
> (say, a blob was reverted to its older mode).

Very well put, thank you.

> For my curiosity, and since you already have a test environment set up,
> could you redo the "without bitmaps" case with pack.useSparse true and
> false? When the option was created and defaulted to true, we never
> really explored comparing it to the bitmap case. In fact, I assumed the
> bitmap case was faster in important cases like this (where there is a
> substantial difference in object counts), but your data is surprising me
> that the sparse algorithm is outperforming bitmaps, even with this new
> algorithm.
>
> The main question I'd like to ask is: is pack.useSparse contributing
> in an important way to the performance here?

I don't know enough about pack.useSparse to say with certainty, but
trying again on the same repository (which is reasonably well-packed at
the moment), they appear about the same:

    $ time git -c pack.useSparse=false rev-list --count --objects master \
      --not --exclude=master --branches
    14

    real	0m0.986s
    user	0m0.599s
    sys	0m0.387s

    $ time git -c pack.useSparse=true rev-list --count --objects master \
      --not --exclude=master --branches
    14

    real	0m0.985s
    user	0m0.600s
    sys	0m0.385s

> I'll go poking into the patches now.

Thanks in advance for your review :-).

Thanks,
Taylor

Derrick Stolee April 25, 2023, 7:52 p.m. UTC | #6

On 4/25/2023 2:57 PM, Taylor Blau wrote:
> On Tue, Apr 25, 2023 at 02:30:01PM -0400, Derrick Stolee wrote:
>> It would be interesting if we used the initial commit walk to the
>> boundary as a predictor of whether bitmaps would be good, or if
>> we should use the tree-parsing walk instead.
> 
> I think that this is tough, too. When making a prediction, it would be
> good if we could avoid walking down to the nearest bitmap, since by that
> point, you might as well have used bitmaps (if you found one before
> reaching the root(s) of history).

This isn't quite as cut-and-dry, because walking commits is much
faster than walking trees. I agree that we shouldn't walk too far
to avoid wasting time, but somehow we'd want to think about the
distance between the commits we want to cover and the bitmap(s)
we found in their history.

>> But perhaps this series gets the bitmap walk "close enough" in
>> the typical case that it's better to use bitmaps in the chance that
>> we have accidentally hit a case where we'd normally need to walk a
>> lot of trees.
> 
> Yeah, that is definitely part of this series's goal. Another framing is,
> for configurations that are using bitmaps everywhere, this should reduce
> the number of cases where using bitmaps is *significantly* slower than
> not.

It's a good opportunity to re-think if we should be using bitmaps in
these contexts at all, or if there is a simpler thing to do by
disabling them when 'haves' exist. I don't think that's the right
thing to do in all cases, but the experiments you've demonstrated here
seem to point in that direction.

Thanks,
-Stolee

Derrick Stolee April 25, 2023, 8:27 p.m. UTC | #7

On 4/25/2023 3:01 PM, Taylor Blau wrote:
> On Tue, Apr 25, 2023 at 02:06:25PM -0400, Derrick Stolee wrote:
>> For my curiosity, and since you already have a test environment set up,
>> could you redo the "without bitmaps" case with pack.useSparse true and
>> false? When the option was created and defaulted to true, we never
>> really explored comparing it to the bitmap case. In fact, I assumed the
>> bitmap case was faster in important cases like this (where there is a
>> substantial difference in object counts), but your data is surprising me
>> that the sparse algorithm is outperforming bitmaps, even with this new
>> algorithm.
>>
>> The main question I'd like to ask is: is pack.useSparse contributing
>> in an important way to the performance here?
> 
> I don't know enough about pack.useSparse to say with certainty, but
> trying again on the same repository (which is reasonably well-packed at
> the moment), they appear about the same:

Thanks for running that. I apologize, but I didn't sync in my head
that you're purposefully using 'git rev-list' to do the walk and the
config only makes a difference for 'git pack-objects'. The suspiciously
close timings is indeed too suspicious. Thus, the performance numbers
you are considering are not identical to what would happen in 'git
pack-objects' by default.

So, I created my own example using git.git, which included this input:

	refs/remotes/origin/master
	refs/remotes/origin/maint
	refs/remotes/origin/next
	refs/remotes/origin/seen
	^refs/remotes/origin/master~1000

And there is a clear preference for pack.useSparse=false in this
case:

Benchmark 1: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-true.txt" ~/_git/git/git-review/git -c pack.useSparse=true -c pack.useBitmaps=false pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):      2.857 s ±  0.031 s    [User: 6.850 s, System: 0.378 s]
  Range (min … max):    2.814 s …  2.922 s    10 runs

Benchmark 2: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-false.txt" ~/_git/git/git-review/git -c pack.useSparse=false -c pack.useBitmaps=false pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):      2.682 s ±  0.029 s    [User: 6.678 s, System: 0.388 s]
  Range (min … max):    2.653 s …  2.753 s    10 runs

and I used the perf traces to extract that the enumerate objects
phase took ~550ms for the sparse walk and ~375ms for the non-sparse
version.

But using this same input, I'm able to construct an example where
your modified bitmap walk is faster than the non-sparse object walk:

Benchmark 1: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-true.txt" ~/_git/git/git-review/git -c pack.useSparse=false -c pack.useBitmaps=true pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):      2.045 s ±  0.039 s    [User: 6.464 s, System: 0.334 s]
  Range (min … max):    1.966 s …  2.089 s    10 runs

Benchmark 2: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-false.txt" ~/_git/git/git-review/git -c pack.useSparse=false -c pack.useBitmaps=false pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):      2.138 s ±  0.034 s    [User: 6.070 s, System: 0.333 s]
  Range (min … max):    2.058 s …  2.182 s    10 runs

(Here, the enumerate objects phase takes ~230ms on average.)

I can easily swap this by changing my negative ref to

  ^revs/remotes/origin/master

Benchmark 1: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-true.txt" ~/_git/git/git-review/git -c pack.useSparse=false -c pack.useBitmaps=true pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):     521.0 ms ±  28.8 ms    [User: 1190.9 ms, System: 103.6 ms]
  Range (min … max):   496.5 ms … 589.8 ms    10 runs

Benchmark 2: GIT_TRACE2_PERF="/home/stolee/_git/git/git-review/trace-false.txt" ~/_git/git/git-review/git -c pack.useSparse=false -c pack.useBitmaps=false pack-objects --stdout --revs <in >/dev/null
  Time (mean ± σ):     498.5 ms ±  13.2 ms    [User: 1089.1 ms, System: 123.3 ms]
  Range (min … max):   471.3 ms … 513.6 ms    10 runs

which gives us reason to think about "small fetches" don't need
bitmaps, but "big fetches" do. But that can be done separately
from this patch.

Context about the sparse walk:

The sparse walk uses the names of the paths with respect to the root 
trees as a way to walk only the "new" objects without walking the
entire object set at the boundary. This makes a big difference in
repos with many files at HEAD, but the risk is that those path names
become significant overhead if there are enough new changes spread
across a large fraction of the tree. For clients pushing the work
of a single developer, the sparse algorithm outperforms the default
walk.

I wrote about this years ago in [1] and almost brought it up earlier
as a description of the commit boundary (called the frontier in [1]),
but the sparse object walk is helpful to visualize.

[1] https://devblogs.microsoft.com/devops/exploring-new-frontiers-for-git-push-performance/

Thanks,
-Stolee

Junio C Hamano May 1, 2023, 10:08 p.m. UTC | #8

Taylor Blau <me@ttaylorr.com> writes:

> Here is a short (but dense) series that I worked on towards the end of
> 2021 with Peff.

The topic gathered interest on the day it was posted but after 24
hours or so activities seem to have quieted down.  Are we expecting
a polished version?  Or did it open a can of worms that is a bit
larger than we want to deal with and the topic is left on backburner?

Thanks.

Taylor Blau May 2, 2023, 11:52 p.m. UTC | #9

On Mon, May 01, 2023 at 03:08:30PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Here is a short (but dense) series that I worked on towards the end of
> > 2021 with Peff.
>
> The topic gathered interest on the day it was posted but after 24
> hours or so activities seem to have quieted down.  Are we expecting
> a polished version?  Or did it open a can of worms that is a bit
> larger than we want to deal with and the topic is left on backburner?

We are expecting a polished version ;-). Though I haven't gotten to it
as soon as I had hoped. I should have something on the list tomorrow,
though.

But, no, the can of worms is as expected here, and I need to provide
some more thorough explanation (e.g., we can collect UNINTERESTING
commits in limit_list() only because we only need to walk down to the
first UNINTERESTING commit common to both sides, and no further).

Sorry for the delay, more soon.

Thanks,
Taylor

Taylor Blau May 3, 2023, 9:43 p.m. UTC | #10

On Tue, Apr 25, 2023 at 03:52:29PM -0400, Derrick Stolee wrote:
> It's a good opportunity to re-think if we should be using bitmaps in
> these contexts at all, or if there is a simpler thing to do by
> disabling them when 'haves' exist. I don't think that's the right
> thing to do in all cases, but the experiments you've demonstrated here
> seem to point in that direction.

Yes, I agree that it's a worthwhile direction to consider in the future.
I fear that it will be hard to come up with absolute rules here, but
that doesn't mean it isn't worth trying :-).

Thanks,
Taylor

[0/3] pack-bitmap: boundary-based bitmap traversal

Message

Comments