Message ID | Ysw4JtoHW1vWmqhz@coredump.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ref-filter: disable save_commit_buffer while traversing | expand |
On Mon, Jul 11 2022, Jeff King wrote: > Various ref-filter options like "--contains" or "--merged" may cause us > to traverse large segments of the history graph. It's counter-productive > to have save_commit_buffer turned on, as that will instruct the commit > code to cache in-memory the object contents for each commit we traverse. > > This increases the amount of heap memory used while providing little or > no benefit, since we're not actually planning to display those commits > (which is the usual reason that tools like git-log want to keep them > around). We can easily disable this feature while ref-filter is running. > This lowers peak heap (as measured by massif) for running: > > git tag --contains 1da177e4c3 > > in linux.git from ~100MB to ~20MB. It also seems to improve runtime by > 4-5% (600ms vs 630ms). > > A few points to note: > > - it should be safe to temporarily disable save_commit_buffer like > this. The saved buffers are accessed through get_commit_buffer(), > which treats the saved ones like a cache, and loads on-demand from > the object database on a cache miss. So any code that was using this > would not be wrong, it might just incur an extra object lookup for > some objects. But... > > - I don't think any ref-filter related code is using the cache. While > it's true that an option like "--format=%(*contents:subject)" or > "--sort=*authordate" will need to look at the commit contents, > ref-filter doesn't use get_commit_buffer() to do so! It always reads > the objects directly via read_object_file(), though it does avoid > re-reading objects if the format can be satisfied without them. > > Timing "git tag --format=%(*authordate)" shows that we're the same > before and after, as expected. Hrm, so for doing the format we're leaving some performance on the table as we're currently not making use of this cache, so this makes nothing worse on that front. But doesn't this approach then also close the door on using the same cache for performance improvements in that area? I.e. spotting that we've already parsed that commit, so we can get it from the cache? B.t.w. did you try to benchmark this with --no-contains too, I tried e.g.: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*" Which gives me: $ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' -w 1 Benchmark 1: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1 Time (mean ± σ): 1.437 s ± 0.107 s [User: 1.252 s, System: 0.082 s] Range (min … max): 1.306 s … 1.653 s 10 runs Benchmark 2: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD Time (mean ± σ): 1.335 s ± 0.044 s [User: 1.230 s, System: 0.050 s] Range (min … max): 1.260 s … 1.417 s 10 runs Summary './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD' ran 1.08 ± 0.09 times faster than './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1' Whereas just --contains shows the benefit you're noting: $ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git -P tag --contains 88ce3ef636b -- "v*"' -w 1 Benchmark 1: ./git -P tag --contains 88ce3ef636b -- "v*"' in 'HEAD~1 Time (mean ± σ): 1.068 s ± 0.102 s [User: 0.886 s, System: 0.068 s] Range (min … max): 0.889 s … 1.272 s 10 runs Benchmark 2: ./git -P tag --contains 88ce3ef636b -- "v*"' in 'HEAD Time (mean ± σ): 931.6 ms ± 39.9 ms [User: 865.3 ms, System: 34.3 ms] Range (min … max): 880.5 ms … 990.1 ms 10 runs Summary './git -P tag --contains 88ce3ef636b -- "v*"' in 'HEAD' ran 1.15 ± 0.12 times faster than './git -P tag --contains 88ce3ef636b -- "v*"' in 'HEAD~1' But this is against git.git on a loaded system, so maybe it means nothing...
On Mon, Jul 11, 2022 at 05:12:37PM +0200, Ævar Arnfjörð Bjarmason wrote: > Hrm, so for doing the format we're leaving some performance on the table > as we're currently not making use of this cache, so this makes nothing > worse on that front. > > But doesn't this approach then also close the door on using the same > cache for performance improvements in that area? I.e. spotting that > we've already parsed that commit, so we can get it from the cache? Yes, it does close that door, or at least make it more challenging. But I suspect it's not a very fruitful door in the first place: - it only helps at all if your sort field or format dereferences the tag to get to the commit - it only helps if you actually use the traversal options - done naively, it's a tradeoff. You might traverse a million commits, but display only a handful of them. Is using all of that memory worth avoiding re-inflating a few commits? Certainly you can come up with a pathological case, but I doubt that it helps in practice. - you _could_ do it less naively by caching only the commits directly pointed to by refs. But the depths of the traversals don't know what those are. So you'd probably do it by pre-loading those commits before any traversal, which would work just fine before or after my patch. But I probably wouldn't because of the extra code complexity, and because... - if the commit is in a commit-graph file, then pre-loading it is actively harmful (because you might not need it anyway!). So really, the best performance will come from just having a commit graph. I don't see the point in writing more code to micro-optimize the fallback case. For this patch, it's just that the existing code is being actively stupid. > B.t.w. did you try to benchmark this with --no-contains too, I tried e.g.: > > ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*" I didn't, but doing it now showed the same 5% speedup. > Which gives me: > > $ git hyperfine -L rev HEAD~1,HEAD -s 'make CFLAGS=-O3' './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' -w 1 > Benchmark 1: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1 > Time (mean ± σ): 1.437 s ± 0.107 s [User: 1.252 s, System: 0.082 s] > Range (min … max): 1.306 s … 1.653 s 10 runs > > Benchmark 2: ./git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD > Time (mean ± σ): 1.335 s ± 0.044 s [User: 1.230 s, System: 0.050 s] > Range (min … max): 1.260 s … 1.417 s 10 runs > > Summary > './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD' ran > 1.08 ± 0.09 times faster than './git -P tag --contains 88ce3ef636b --no-contains a39b4003f0e -- "v*"' in 'HEAD~1' Isn't that showing that it's slightly faster with my patch? It's not as fast as your "just --contains" example, but the exact improvement will depend on how many commits your particular query has to traverse (and from the improved overall time, it seems that "just --contains" is traversing less). And I'd expect a fair bit of noise here. This really isn't making anything faster; it's just using less memory, so any speedup is coming from things like less-full memory caches. BTW, I picked "--contains 1da177e4c3" in linux.git because I thought it would have to traverse a lot (because that's the oldest commit), but it actually isn't the worst case. Just asking about "--contains HEAD" requires more traversal, but still shows a speed improvement (7.45s before my patch, 7.26s after). But I was curious about the heap savings there, since the numbers should be larger. And indeed they are. Here's peak heap for two runs: 1093721590 ./git.old --no-pager tag --contains HEAD 218077941 ./git.new --no-pager tag --contains HEAD Whoops, that is indeed 1GB of heap, which is what Olaf was seeing (though I don't know what repo he's using, it's roughly proportional to the number of commits). And 200MB afterwards. So indeed, I'd expect this patch (or the hackier version I showed earlier) to make a significant dent in his workload. Of course the even better solution is to use commit-graphs. That drops the memory usage to a few megabytes, and the time to only a few milliseconds (because generation numbers let us avoid most of the traversal). -Peff
Jeff King <peff@peff.net> writes: > - Note that all of this assumes you don't have a commit-graph file. if > you do, then the heap usage is even lower, and the runtime is 10x > faster. So in that sense this is not urgent, as there's a much > better solution. But since it's such an obvious and easy win for > fallback cases (including commits which aren't yet in the graph > file), there's no reason not to. Sounds sensible. > Signed-off-by: Jeff King <peff@peff.net> > --- > Just pulling this out of the discussion in: > > https://lore.kernel.org/git/YswuaPx6Mk7YkIim@coredump.intra.peff.net/ > > as it's an easy win. > > I doubt that anyone even cares about restoring the value of > save_commit_buffer. So this _could_ be a one-liner turning it off, > rather than doing the save/restore dance. I was mostly erring on the > conservative side, but maybe fewer lines of code is a worthwhile thing. > > ref-filter.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/ref-filter.c b/ref-filter.c > index d3c90e5dbe..bdf39fa761 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -2405,13 +2405,17 @@ static void reach_filter(struct ref_array *array, > int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int type) > { > struct ref_filter_cbdata ref_cbdata; > + int save_commit_buffer_orig; > int ret = 0; > > ref_cbdata.array = array; > ref_cbdata.filter = filter; > > filter->kind = type & FILTER_REFS_KIND_MASK; > > + save_commit_buffer_orig = save_commit_buffer; > + save_commit_buffer = 0; > + > init_contains_cache(&ref_cbdata.contains_cache); > init_contains_cache(&ref_cbdata.no_contains_cache); > > @@ -2444,6 +2448,7 @@ int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int > reach_filter(array, filter->reachable_from, INCLUDE_REACHED); > reach_filter(array, filter->unreachable_from, EXCLUDE_REACHED); > > + save_commit_buffer = save_commit_buffer_orig; > return ret; > }
diff --git a/ref-filter.c b/ref-filter.c index d3c90e5dbe..bdf39fa761 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2405,13 +2405,17 @@ static void reach_filter(struct ref_array *array, int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int type) { struct ref_filter_cbdata ref_cbdata; + int save_commit_buffer_orig; int ret = 0; ref_cbdata.array = array; ref_cbdata.filter = filter; filter->kind = type & FILTER_REFS_KIND_MASK; + save_commit_buffer_orig = save_commit_buffer; + save_commit_buffer = 0; + init_contains_cache(&ref_cbdata.contains_cache); init_contains_cache(&ref_cbdata.no_contains_cache); @@ -2444,6 +2448,7 @@ int filter_refs(struct ref_array *array, struct ref_filter *filter, unsigned int reach_filter(array, filter->reachable_from, INCLUDE_REACHED); reach_filter(array, filter->unreachable_from, EXCLUDE_REACHED); + save_commit_buffer = save_commit_buffer_orig; return ret; }
Various ref-filter options like "--contains" or "--merged" may cause us to traverse large segments of the history graph. It's counter-productive to have save_commit_buffer turned on, as that will instruct the commit code to cache in-memory the object contents for each commit we traverse. This increases the amount of heap memory used while providing little or no benefit, since we're not actually planning to display those commits (which is the usual reason that tools like git-log want to keep them around). We can easily disable this feature while ref-filter is running. This lowers peak heap (as measured by massif) for running: git tag --contains 1da177e4c3 in linux.git from ~100MB to ~20MB. It also seems to improve runtime by 4-5% (600ms vs 630ms). A few points to note: - it should be safe to temporarily disable save_commit_buffer like this. The saved buffers are accessed through get_commit_buffer(), which treats the saved ones like a cache, and loads on-demand from the object database on a cache miss. So any code that was using this would not be wrong, it might just incur an extra object lookup for some objects. But... - I don't think any ref-filter related code is using the cache. While it's true that an option like "--format=%(*contents:subject)" or "--sort=*authordate" will need to look at the commit contents, ref-filter doesn't use get_commit_buffer() to do so! It always reads the objects directly via read_object_file(), though it does avoid re-reading objects if the format can be satisfied without them. Timing "git tag --format=%(*authordate)" shows that we're the same before and after, as expected. - Note that all of this assumes you don't have a commit-graph file. if you do, then the heap usage is even lower, and the runtime is 10x faster. So in that sense this is not urgent, as there's a much better solution. But since it's such an obvious and easy win for fallback cases (including commits which aren't yet in the graph file), there's no reason not to. Signed-off-by: Jeff King <peff@peff.net> --- Just pulling this out of the discussion in: https://lore.kernel.org/git/YswuaPx6Mk7YkIim@coredump.intra.peff.net/ as it's an easy win. I doubt that anyone even cares about restoring the value of save_commit_buffer. So this _could_ be a one-liner turning it off, rather than doing the save/restore dance. I was mostly erring on the conservative side, but maybe fewer lines of code is a worthwhile thing. ref-filter.c | 5 +++++ 1 file changed, 5 insertions(+)