[2/3] commit-graph: free large diffs, too

Message ID 20191222093216.GB3460818@coredump.intra.peff.net
State New, archived
Series [1/3] commit-graph: examine changed-path objects in pack order

Commit Message

Jeff King Dec. 22, 2019, 9:32 a.m. UTC
If a diff we compute for --changed-paths has more than 512 entries, we
don't bother generating a Bloom filter for it. But since we don't
iterate over diff_queued_diff, we also don't free the filepairs and
filespecs from the diff before clearing the queue. Let's make sure we do
so.

This drops the peak heap usage of "commit-graph write --changed-paths"
on linux.git from ~8GB to ~4GB.

Signed-off-by: Jeff King <peff@peff.net>
---
 bloom.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Derrick Stolee Dec. 27, 2019, 2:52 p.m. UTC | #1
On 12/22/2019 4:32 AM, Jeff King wrote:
> If a diff we compute for --changed-paths has more than 512 entries, we
> don't bother generating a Bloom filter for it. But since we don't
> iterate over diff_queued_diff, we also don't free the filepairs and
> filespecs from the diff before clearing the queue. Let's make sure we do
> so.
> 
> This drops the peak heap usage of "commit-graph write --changed-paths"
> on linux.git from ~8GB to ~4GB.

In my testing, the heap size went from ~10GB to ~6GB.

Thanks,
-Stolee

Patch

diff --git a/bloom.c b/bloom.c
index 0c7505d3d6..d1d3796e11 100644
--- a/bloom.c
+++ b/bloom.c
@@ -226,6 +226,8 @@  struct bloom_filter *get_bloom_filter(struct repository *r,
 
 		hashmap_free_entries(&pathmap, struct pathmap_hash_entry, entry);
 	} else {
+		for (i = 0; i < diff_queued_diff.nr; i++)
+			diff_free_filepair(diff_queued_diff.queue[i]);
 		filter->data = NULL;
 		filter->len = 0;
 	}
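
To illustrate the leak the commit message describes, here is a minimal
standalone sketch. It does not use git's real diff API: every toy_* name
below is a hypothetical stand-in (toy_queue for diff_queued_diff,
toy_free_filepair for diff_free_filepair), and the free_large flag exists
only to compare the behavior before and after the patch. The assumption it
models is the one stated above: clearing the queue drops the pointer array
but not the pairs it points to, so a branch that skips the per-entry loop
must free the pairs itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for git's diff queue; all names here are hypothetical. */
struct toy_filepair {
	char *path;
};

struct toy_queue {
	struct toy_filepair **queue;
	int nr, alloc;
};

static int toy_live_pairs; /* pairs allocated and not yet freed */

static void toy_queue_push(struct toy_queue *q, const char *path)
{
	struct toy_filepair *p = malloc(sizeof(*p));
	size_t len = strlen(path) + 1;

	assert(p);
	p->path = malloc(len);
	assert(p->path);
	memcpy(p->path, path, len);

	if (q->nr == q->alloc) {
		q->alloc = q->alloc ? 2 * q->alloc : 16;
		q->queue = realloc(q->queue, q->alloc * sizeof(*q->queue));
		assert(q->queue);
	}
	q->queue[q->nr++] = p;
	toy_live_pairs++;
}

static void toy_free_filepair(struct toy_filepair *p)
{
	free(p->path);
	free(p);
	toy_live_pairs--;
}

/* Releases only the pointer array, never the pairs it points to. */
static void toy_queue_clear(struct toy_queue *q)
{
	free(q->queue);
	q->queue = NULL;
	q->nr = q->alloc = 0;
}

/*
 * Queue nr_entries pairs, then tear the queue down. Returns the number
 * of pairs still allocated afterwards. free_large selects whether the
 * "too many entries" branch frees its pairs (patched) or not (unpatched).
 */
int toy_process_diff(int nr_entries, int max_changes, int free_large)
{
	struct toy_queue q = { NULL, 0, 0 };
	int i;

	toy_live_pairs = 0;
	for (i = 0; i < nr_entries; i++)
		toy_queue_push(&q, "some/path");

	if (q.nr <= max_changes) {
		/* small diff: the per-entry loop frees each pair as it goes */
		for (i = 0; i < q.nr; i++)
			toy_free_filepair(q.queue[i]);
	} else if (free_large) {
		/* large diff: no filter is built, but the pairs still need freeing */
		for (i = 0; i < q.nr; i++)
			toy_free_filepair(q.queue[i]);
	}
	toy_queue_clear(&q);
	return toy_live_pairs;
}
```

In this sketch, toy_process_diff(600, 512, 0) leaves all 600 pairs
allocated, while toy_process_diff(600, 512, 1) leaves none. Diffs at or
below the limit are unaffected either way, which matches the patch touching
only the else branch.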