From patchwork Wed Aug 5 17:02:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 11701963 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2BF5E1510 for ; Wed, 5 Aug 2020 17:05:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0B1A822D3E for ; Wed, 5 Aug 2020 17:05:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20150623.gappssmtp.com header.i=@ttaylorr-com.20150623.gappssmtp.com header.b="DHp+dJyQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728481AbgHERFd (ORCPT ); Wed, 5 Aug 2020 13:05:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728268AbgHERCq (ORCPT ); Wed, 5 Aug 2020 13:02:46 -0400 Received: from mail-qv1-xf42.google.com (mail-qv1-xf42.google.com [IPv6:2607:f8b0:4864:20::f42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86527C0617A2 for ; Wed, 5 Aug 2020 10:02:46 -0700 (PDT) Received: by mail-qv1-xf42.google.com with SMTP id b2so10482335qvp.9 for ; Wed, 05 Aug 2020 10:02:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=1eU3I9uZzpxdnR2nm0s+2T/AjhX9gGIMEwKMBanKHvY=; b=DHp+dJyQqSHAhGvB0MZ8lMEhw4kyuup7hbh1TcAaMElIjeNDeP779CMAoXdEKXicHy yXEvrbgNUuR/D4OBSD5EpszmwT6NBrSGMpHUdg76mdKYe2VeRSV1THMxj3E/J24cRSHu u2xdFCVSeGsRbYodnjI07aINk62lAsUxY1WY6eBgZ40W3p+x/tnl4RGyxUMF77ZNxMZY s/OljxENOtz+HhH894OKToBbKQl0NRQeBvLs7uUvIpEkCX2W5sm4KViXgM8R8p2rcO/i okghpUaWf8x07/Ut5OdU3jgK2iv473tut2wGblX7fYc3Ga9ifXoy9oFXAQk1fxtXIUZi RUgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=1eU3I9uZzpxdnR2nm0s+2T/AjhX9gGIMEwKMBanKHvY=; b=aJ6ZCovl/jtXfojq+VnNKJWRltCzZBhxIm2lVr0MLs4Yw4s/6pJjm7Qs/w8yZmPDe9 3FhiPhYS6ysRsUSg2nHcEBwZ1x1NKuzPybYOCfuMD7sZZ3J7MYrr/vstyJWq+yiB5+QI vfX6tiXMS6VaqJVYg7/QIIXtIDkmp2Uim++HZqunZKpZkvV58AdfPPtIzSwYqZqk6awB GaYlRzDRLz3AdFjyeTR9KixcGkrG+HANhnbiiVM8mZMpi44re+Ns3eHb0sda9P3PgP6D U6KxSOXrOjq8pC3HaY2+ZV9SPDK7yFHPBPHmQ8tmqRdvhTzCZa+lOFnmB4JqMQX1rkRv FZvw== X-Gm-Message-State: AOAM530DYgvMjy9M0UL7EEgsfjKHYcaeYvKUV5lwxz9WgUmqoduCoHy+ jxNQj3TT9DNh0+0a6OZe3FvF9uWx7AvchA== X-Google-Smtp-Source: ABdhPJwiNh/al1VcZYRQbPAHAEceeGwfo/zzDgNhFpXakykjk1GadZgf4Ony8BJajwOj89fPLJipmw== X-Received: by 2002:ad4:5748:: with SMTP id q8mr4776114qvx.217.1596646965127; Wed, 05 Aug 2020 10:02:45 -0700 (PDT) Received: from localhost ([2605:9480:22e:ff10:d118:9acc:fdba:dee7]) by smtp.gmail.com with ESMTPSA id e129sm1983527qkf.132.2020.08.05.10.02.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 05 Aug 2020 10:02:44 -0700 (PDT) Date: Wed, 5 Aug 2020 13:02:42 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: peff@peff.net, dstolee@microsoft.com, szeder.dev@gmail.com Subject: [PATCH v2 08/14] bloom: use provided 'struct bloom_filter_settings' Message-ID: <4deb724fc13e580430628f0ac408685394c8689f.1596646576.git.me@ttaylorr.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When 'get_or_compute_bloom_filter()' needs to compute a Bloom filter from scratch, it looks to the default 'struct bloom_filter_settings' in order to determine the maximum number of changed paths, number of bits per entry, and so on. All of these values have so far been constant, and so there was no need to pass in a pointer from the caller (eg., the one that is stored in the 'struct write_commit_graph_context'). Start passing in a 'struct bloom_filter_settings *' instead of using the default values to respect graph-specific settings (eg., in the case of setting 'GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS'). In order to have an initialized value for these settings, move its initialization to earlier in the commit-graph write. Signed-off-by: Taylor Blau --- bloom.c | 13 ++++++------- bloom.h | 3 ++- commit-graph.c | 21 ++++++++++----------- t/helper/test-bloom.c | 1 + 4 files changed, 19 insertions(+), 19 deletions(-) diff --git a/bloom.c b/bloom.c index a8a21762f4..0cf1962dc5 100644 --- a/bloom.c +++ b/bloom.c @@ -180,13 +180,12 @@ static int pathmap_cmp(const void *hashmap_cmp_fn_data, struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, struct commit *c, int compute_if_not_present, + const struct bloom_filter_settings *settings, int *computed) { struct bloom_filter *filter; - struct bloom_filter_settings settings = DEFAULT_BLOOM_FILTER_SETTINGS; int i; struct diff_options diffopt; - int max_changes = 512; if (computed) *computed = 0; @@ -211,7 +210,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, repo_diff_setup(r, &diffopt); diffopt.flags.recursive = 1; diffopt.detect_rename = 0; - diffopt.max_changes = max_changes; + diffopt.max_changes = settings->max_changed_paths; diff_setup_done(&diffopt); /* ensure commit is parsed so we have parent information */ @@ -223,7 +222,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, diff_tree_oid(NULL, &c->object.oid, "", &diffopt); diffcore_std(&diffopt); - if (diffopt.num_changes <= max_changes) { + if (diffopt.num_changes <= settings->max_changed_paths) { struct hashmap pathmap; struct pathmap_hash_entry *e; struct hashmap_iter iter; @@ -260,13 +259,13 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, diff_free_filepair(diff_queued_diff.queue[i]); } - filter->len = (hashmap_get_size(&pathmap) * settings.bits_per_entry + BITS_PER_WORD - 1) / BITS_PER_WORD; + filter->len = (hashmap_get_size(&pathmap) * settings->bits_per_entry + BITS_PER_WORD - 1) / BITS_PER_WORD; filter->data = xcalloc(filter->len, sizeof(unsigned char)); hashmap_for_each_entry(&pathmap, &iter, e, entry) { struct bloom_key key; - fill_bloom_key(e->path, strlen(e->path), &key, &settings); - add_key_to_filter(&key, filter, &settings); + fill_bloom_key(e->path, strlen(e->path), &key, settings); + add_key_to_filter(&key, filter, settings); } hashmap_free_entries(&pathmap, struct pathmap_hash_entry, entry); diff --git a/bloom.h b/bloom.h index c4107a6415..2353c74a24 100644 --- a/bloom.h +++ b/bloom.h @@ -92,11 +92,12 @@ void init_bloom_filters(void); struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, struct commit *c, int compute_if_not_present, + const struct bloom_filter_settings *settings, int *computed); #define DEFAULT_BLOOM_MAX_CHANGES 512 #define get_bloom_filter(r, c) get_or_compute_bloom_filter( \ - (r), (c), 0, NULL) + (r), (c), 0, NULL, NULL) int bloom_filter_contains(const struct bloom_filter *filter, const struct bloom_key *key, diff --git a/commit-graph.c b/commit-graph.c index ba2a2cfb22..48d4697f54 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1432,6 +1432,7 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx) ctx->r, c, 1, + ctx->bloom_settings, &computed); if (computed) { ctx->count_bloom_filter_computed++; @@ -1688,17 +1689,6 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) int num_chunks = 3; uint64_t chunk_offset; struct object_id file_hash; - struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS; - - if (!ctx->bloom_settings) { - bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY", - bloom_settings.bits_per_entry); - bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES", - bloom_settings.num_hashes); - bloom_settings.max_changed_paths = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS", - bloom_settings.max_changed_paths); - ctx->bloom_settings = &bloom_settings; - } if (ctx->split) { struct strbuf tmp_file = STRBUF_INIT; @@ -2144,6 +2134,7 @@ int write_commit_graph(struct object_directory *odb, uint32_t i, count_distinct = 0; int res = 0; int replace = 0; + struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS; if (!commit_graph_compatible(the_repository)) return 0; @@ -2157,6 +2148,14 @@ int write_commit_graph(struct object_directory *odb, ctx->split_opts = split_opts; ctx->total_bloom_filter_data_size = 0; + bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY", + bloom_settings.bits_per_entry); + bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES", + bloom_settings.num_hashes); + bloom_settings.max_changed_paths = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS", + bloom_settings.max_changed_paths); + ctx->bloom_settings = &bloom_settings; + if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) ctx->changed_paths = 1; if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) { diff --git a/t/helper/test-bloom.c b/t/helper/test-bloom.c index 531af439c2..4af949164c 100644 --- a/t/helper/test-bloom.c +++ b/t/helper/test-bloom.c @@ -40,6 +40,7 @@ static void get_bloom_filter_for_commit(const struct object_id *commit_oid) setup_git_directory(); c = lookup_commit(the_repository, commit_oid); filter = get_or_compute_bloom_filter(the_repository, c, 1, + &settings, NULL); print_bloom_filter(filter); }