From patchwork Wed Jul 1 13:27:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636369 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 617F513B4 for ; Wed, 1 Jul 2020 13:27:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 44DB7207F5 for ; Wed, 1 Jul 2020 13:27:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="FCqozT7A" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730771AbgGAN1f (ORCPT ); Wed, 1 Jul 2020 09:27:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729689AbgGAN1e (ORCPT ); Wed, 1 Jul 2020 09:27:34 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BF7EC08C5DB for ; Wed, 1 Jul 2020 06:27:34 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id r12so23766497wrj.13 for ; Wed, 01 Jul 2020 06:27:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=ppLWTMM/QtjlkJ6BAXu8a4TJ2DJ2gu6KkEEaeIgaoCc=; b=FCqozT7AODSylQspbimbhvivLGmv5M5oKrC/q/JBoTPXtSe16xlK8ThDw1wE4ZBo7b KTryPluaclfpve3crSuC2M1/YWumanMSsSHMtzfY3QCuWqVissb5hQYZFug4P3Me0Ra4 a/AFEKPxnW6JeylZ/0V4CBnOboiKN9ytPyQhvr4QVYrt3Milvur1/mjIiXZAvro50BgJ SS25yzA0QRSSC9Dtmy6meEFUjUvZ66AQIWxroIP717FV1ZwOiK9yx1z+JXlz7puOF9g+ QD5s8jH6st0JjDFbpPlERFCqP22Qff8tvXyrlZI1g/L2WNdgUjsH6McvRXqf+tiA3WAN T+zg== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=ppLWTMM/QtjlkJ6BAXu8a4TJ2DJ2gu6KkEEaeIgaoCc=; b=YXnI1Mwii1qTXTbMXarpNysPntEEa2rrQbdnC+g+B2JlT+99pzK8oipahhM9+102ie bTPePEE2b6ijtCf957/oqQ/r+tOERiXg7trcl1D90xt4iWPhCFUs/WdBBQhGLkIbFrN6 lUYg/h0j+pobFkJ72+gXVuTip0hHBjdb43OthAjeWCtp2S3E6resxiRxghQgaNxT2u69 Fzei6ESFESPbFlZHzX82gRs7/CiCaxDT0TipT0ZfJnCbyNIyoFJcdbtdi1pxQFAuJnio hIHQIpkXtM+NfVF+nzldXQ61en0oPAWtK5kACgMOIxG+FDSQjE1AqE72M08Lcb042oPo aFxg== X-Gm-Message-State: AOAM532OE3ROmUiqbqnezIwh6PlkrvkhiTQQOyRHg0URknGu+RF31V0T PLSw7gz8jH+B5nKptYMyDVGnMGrl X-Google-Smtp-Source: ABdhPJxZQP6np7KqeUueYBTTlEs6PXE2g9hYnegFZkRcTKe0VCBIuP/NBi13hkT1mzSN9O4CJxYs2Q== X-Received: by 2002:adf:f34f:: with SMTP id e15mr27119283wrp.415.1593610052767; Wed, 01 Jul 2020 06:27:32 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a22sm6838412wmb.4.2020.07.01.06.27.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 06:27:32 -0700 (PDT) Message-Id: <57002040bc83668d4998da6a1e34a6efe287ae22.1593610050.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Wed, 01 Jul 2020 13:27:21 +0000 Subject: [PATCH v4 01/10] commit-graph: place bloom_settings in context MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee Place an instance of struct bloom_settings into the struct write_commit_graph_context. This allows simplifying the function prototype of write_graph_chunk_bloom_data(). This will allow us to combine the function prototypes and use function pointers to simplify write_commit_graph_file(). 
By using a pointer, we can later replace the settings to match those that exist in the current commit-graph, in case a future Git version allows customization of these parameters. Reported-by: SZEDER Gábor Signed-off-by: Derrick Stolee --- commit-graph.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 887837e882..d0fedcd9b1 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -882,6 +882,7 @@ struct write_commit_graph_context { const struct split_commit_graph_opts *split_opts; size_t total_bloom_filter_data_size; + const struct bloom_filter_settings *bloom_settings; }; static void write_graph_chunk_fanout(struct hashfile *f, @@ -1103,8 +1104,7 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f, } static void write_graph_chunk_bloom_data(struct hashfile *f, - struct write_commit_graph_context *ctx, - const struct bloom_filter_settings *settings) + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1116,9 +1116,9 @@ static void write_graph_chunk_bloom_data(struct hashfile *f, _("Writing changed paths Bloom filters data"), ctx->commits.nr); - hashwrite_be32(f, settings->hash_version); - hashwrite_be32(f, settings->num_hashes); - hashwrite_be32(f, settings->bits_per_entry); + hashwrite_be32(f, ctx->bloom_settings->hash_version); + hashwrite_be32(f, ctx->bloom_settings->num_hashes); + hashwrite_be32(f, ctx->bloom_settings->bits_per_entry); while (list < last) { struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0); @@ -1541,6 +1541,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) struct object_id file_hash; const struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS; + ctx->bloom_settings = &bloom_settings; + if (ctx->split) { struct strbuf tmp_file = STRBUF_INIT; @@ -1642,7 +1644,7 @@ static int write_commit_graph_file(struct 
write_commit_graph_context *ctx) write_graph_chunk_extra_edges(f, ctx); if (ctx->changed_paths) { write_graph_chunk_bloom_indexes(f, ctx); - write_graph_chunk_bloom_data(f, ctx, &bloom_settings); + write_graph_chunk_bloom_data(f, ctx); } if (ctx->num_commit_graphs_after > 1 && write_graph_chunk_base(f, ctx)) { From patchwork Wed Jul 1 13:27:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636371 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 597ED13B4 for ; Wed, 1 Jul 2020 13:27:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 40B34207F5 for ; Wed, 1 Jul 2020 13:27:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="n+2lBX8Y" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730930AbgGAN1g (ORCPT ); Wed, 1 Jul 2020 09:27:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730729AbgGAN1f (ORCPT ); Wed, 1 Jul 2020 09:27:35 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFCDEC08C5C1 for ; Wed, 1 Jul 2020 06:27:34 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id f18so23259471wml.3 for ; Wed, 01 Jul 2020 06:27:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=UiCtqTJGVEULmS7sPq1CCgWHFh2xZ9/Kbd4Wgi4nLYI=; b=n+2lBX8YWZ8dIuvSkUm97QgrOTxpDvFKgtou6UVeqGTD4MZS45SGRpyB9ocUcOyU4S 
zQXMCAUhv3Om7lYllcQmtsKaafnDozYWgizzMsvqlITV/GRZ5UxoPh6ueJRmfeI7eVRA UzpvM+pkcEV7fYpB3w3XImAYOuus4AgAghR8g297ecOEFjz5WzB9/sAS1/ip2PQ0azlU 3lqm+RLgE+7HJopkeanHNt8CSSSW7d0D4yGJnJKKZ2s3mOg8Qm/o9FCQmIEnMhMAdO+g tV9vpxzZfcAyHvRoXr92iKg3GJZvj2kRGyjBwXv/BzaBKkCofwW4Tpnc9eBeDJqEVlmM tfIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=UiCtqTJGVEULmS7sPq1CCgWHFh2xZ9/Kbd4Wgi4nLYI=; b=aTS132QF6wdiCe31o3ED9XZpljmG5tqdYsws3u88tk/XM6wBo9scmRrlJxCnI6Pqhb +ZzVZvKcLuyuFZ5YTjWHD2He1nzaNohgSiEtgPJNNrUreTrZWUH6IKeaiidMNYCJCbSO 3Wsl1uDwvdTyDkp8H7FeJYCIArRbdYJAFLODflQ6mWidY8PHIrT59jiWe/zLhQnMyePJ K4Vyt8jTWfIO2Z+JAS7cySpqmy6+/UvElP3ER4ebjGX9GEfB0476p5mqXcrP7AbJPwba ocsXyD8RhTfa00+DPPB8IIPPEsKT5/EHNSUYeByfDrfpfEgMhoVj+8yiqHinLmA7TBnx wZ4g== X-Gm-Message-State: AOAM533R3aNSZV+TPz6HtVpUDN9sb1FUXS9IqzBxT3/fjZHphNWz+rUN 25MZluCfLcXgHpk8uyB81CjHddAv X-Google-Smtp-Source: ABdhPJwEd/iwCw61522g8qsYx3XO186NRbvqbXwbkjCCrI+TyHPKEgbLGKaSfDT6Pun8xaiTxAtJkQ== X-Received: by 2002:a1c:a70d:: with SMTP id q13mr24808353wme.55.1593610053545; Wed, 01 Jul 2020 06:27:33 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y6sm7601763wmy.0.2020.07.01.06.27.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 06:27:33 -0700 (PDT) Message-Id: <6b63f9bd8a2a7e18d7ac1be7066d4bcd1df2a729.1593610050.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Wed, 01 Jul 2020 13:27:22 +0000 Subject: [PATCH v4 02/10] commit-graph: change test to die on parse, not load Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee 43d3561 (commit-graph write: 
don't die if the existing graph is corrupt, 2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment variable. This was created to verify that commit-graph was not loaded when writing a new non-incremental commit-graph. An upcoming change wants to load a commit-graph in some valuable cases, but we want to maintain that we don't trust the commit-graph data when writing our new file. Instead of dying on load, die if we ever try to parse a commit from the commit-graph. This functionally verifies the same intended behavior, but allows a more advanced feature in the next change. Signed-off-by: Derrick Stolee --- commit-graph.c | 12 ++++++++---- commit-graph.h | 2 +- t/t5318-commit-graph.sh | 2 +- 3 files changed, 10 insertions(+), 6 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index d0fedcd9b1..6a28d4a5a6 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -564,10 +564,6 @@ static int prepare_commit_graph(struct repository *r) return !!r->objects->commit_graph; r->objects->commit_graph_attempted = 1; - if (git_env_bool(GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD, 0)) - die("dying as requested by the '%s' variable on commit-graph load!", - GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD); - prepare_repo_settings(r); if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0) && @@ -790,6 +786,14 @@ static int parse_commit_in_graph_one(struct repository *r, int parse_commit_in_graph(struct repository *r, struct commit *item) { + static int checked_env = 0; + + if (!checked_env && + git_env_bool(GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE, 0)) + die("dying as requested by the '%s' variable on commit-graph parse!", + GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE); + checked_env = 1; + if (!prepare_commit_graph(r)) return 0; return parse_commit_in_graph_one(r, r->objects->commit_graph, item); diff --git a/commit-graph.h b/commit-graph.h index 881c9b46e5..f0fb13e3f2 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -5,7 +5,7 @@ #include "object-store.h" #define GIT_TEST_COMMIT_GRAPH 
"GIT_TEST_COMMIT_GRAPH" -#define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD" +#define GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE "GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE" #define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS" /* diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 1073f9e3cf..5ec01abdaa 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -436,7 +436,7 @@ corrupt_graph_verify() { cp $objdir/info/commit-graph commit-graph-pre-write-test fi && git status --short && - GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD=true git commit-graph write && + GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE=true git commit-graph write && git commit-graph verify } From patchwork Wed Jul 1 13:27:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636373 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 74DF8161F for ; Wed, 1 Jul 2020 13:27:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 579BF207F9 for ; Wed, 1 Jul 2020 13:27:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DSJELQeK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731089AbgGAN1i (ORCPT ); Wed, 1 Jul 2020 09:27:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731084AbgGAN1g (ORCPT ); Wed, 1 Jul 2020 09:27:36 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAA17C08C5C1 for ; Wed, 1 Jul 2020 06:27:35 -0700 (PDT) Received: by 
mail-wr1-x443.google.com with SMTP id j4so21378213wrp.10 for ; Wed, 01 Jul 2020 06:27:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=ufveTMOU4PgwU7CXupDCHbMnWVklWfJci0tSzN1i++o=; b=DSJELQeKSPYrRNCOxvgjm+Whu2cxEHKg6O2i3y4x4poWlMqNAfVxh4xeSuAUvva/Mt QjJzIRAUe2F9PTLTjYgsdaZ3TL8iHfydqZQJrFZ8o0KC3r+f8a60p8CD2iowlZjRLfZV Cv9dqAw/EABUAJJ84OfeaOpecKYssIr+Jx9895x1WV9+2gnVA9fMSiBVC2RxYVYJOOpy potjzXHMhSW4zR6cvdIqg14AOiQW6UkfYGDB1qCT909+Nq4pbmShET8PPrXTD8BHak6k J/cHwie3hpe80PUAvSn76Q1oYXH0mAsD5lDkDEUPbvGvV+NfJU9hsvWkYy0gh0bimFja VyvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=ufveTMOU4PgwU7CXupDCHbMnWVklWfJci0tSzN1i++o=; b=dCW6cgy/2hBrRIeCXwkzhRNk6J3/nxaKoe8VIP3hjhOM9sGYzbStQkFb6k/M3d9JJb TbpGUkYl6MM6nJ2H83c/Mq/KXu5nfsQUM9FieLnNhPAzP/+18HZTtLZg/TpcGRyNUW1B cGoXw7rpZBBgn9LFZ7J1WYkI5r3ilo6FhuN/ChQL6U4hqvTOiJT0x4gZsXfnfomUsHQL 5IliHvCvd0aCEZjg+W4k9NQjCwhb88uXPBFSE1/YVXWbU3Za7PzHcp4XWUu298b3sTki mXjHD5KYaNulkQEA6alzMT7mGW3YRDUI+D8FsTYWpwV4ivKEzC/4+xbteg6jV+RWGGjK c5YQ== X-Gm-Message-State: AOAM532jUaHAE9HEApm0bP9zGSTHYldpXC+KSqbtE3XyFwcLmp3yxQqN 7/KbEjsAQJXyHlsjITq6D9IoXCdn X-Google-Smtp-Source: ABdhPJzrCkE3UkGkXU8wXvn0VTiLwhMu4v3IeYJgtj0hixgh/dsXBw4L7MXuEao5eEyCqxxRrSHPFg== X-Received: by 2002:adf:f104:: with SMTP id r4mr27954536wro.90.1593610054407; Wed, 01 Jul 2020 06:27:34 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n16sm7212839wra.19.2020.07.01.06.27.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 06:27:33 -0700 (PDT) Message-Id: <3c532ebabc17fef9149e477b7263cbf11c4d7cb9.1593610050.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" 
Date: Wed, 01 Jul 2020 13:27:23 +0000 Subject: [PATCH v4 03/10] bloom: fix logic in get_bloom_filter() MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The get_bloom_filter() method is a bit complicated in some parts where it does not need to be. In particular, it needs to return a NULL filter only when compute_if_not_present is zero AND the filter data cannot be loaded from a commit-graph file. This currently happens by accident because the commit-graph does not load changed-path Bloom filters from an existing commit-graph when writing a new one. This will change in a later patch. Also clean up some style issues while we are here. One side-effect of returning a NULL filter is that the filters that are reported as "too large" will now be reported as NULL instead of length zero. This case was not properly covered before, so add a test. Further, remove the counting of the zero-length filters from revision.c and the trace2 logs. 
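The control flow described above can be sketched in isolation. This is only an illustration, not the code in bloom.c: struct sketch_filter, sketch_load_from_graph(), sketch_get_filter() and sketch_filter_len() are hypothetical stand-ins, and the "computed" filter is a placeholder for the real diff-based computation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bloom_filter: only the two fields
 * that the control flow depends on. */
struct sketch_filter {
	unsigned char *data;
	size_t len;
};

/* Hypothetical loader. It models reading filter data from an existing
 * commit-graph: if nothing is stored, filter->data stays NULL. */
void sketch_load_from_graph(struct sketch_filter *filter,
			    unsigned char *stored, size_t stored_len)
{
	if (stored) {
		filter->data = stored;
		filter->len = stored_len;
	}
}

/* Placeholder for a filter computed from a diff. */
unsigned char sketch_computed_bits[1] = { 0xff };

/* Return NULL only when the data could not be loaded AND the caller
 * did not ask for it to be computed. */
struct sketch_filter *sketch_get_filter(struct sketch_filter *slot,
					unsigned char *stored,
					size_t stored_len,
					int compute_if_not_present)
{
	if (!slot->data)
		sketch_load_from_graph(slot, stored, stored_len);

	if (slot->data)
		return slot;
	if (!compute_if_not_present)
		return NULL;

	/* Stand-in for the real computation. */
	slot->data = sketch_computed_bits;
	slot->len = sizeof(sketch_computed_bits);
	return slot;
}

/* Callers that only need a length must now tolerate a NULL filter. */
size_t sketch_filter_len(const struct sketch_filter *filter)
{
	return filter ? filter->len : 0;
}
```

The last helper mirrors the "filter ? filter->len : 0" pattern that the writers in commit-graph.c adopt in this patch.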
Helped-by: René Scharfe Helped-by: SZEDER Gábor Signed-off-by: Derrick Stolee --- bloom.c | 14 ++++++-------- commit-graph.c | 8 ++++++-- revision.c | 7 ------- t/t4216-log-bloom.sh | 24 ++++++++++++++++++++++-- 4 files changed, 34 insertions(+), 19 deletions(-) diff --git a/bloom.c b/bloom.c index c38d1cff0c..2af5389795 100644 --- a/bloom.c +++ b/bloom.c @@ -186,7 +186,7 @@ struct bloom_filter *get_bloom_filter(struct repository *r, struct diff_options diffopt; int max_changes = 512; - if (bloom_filters.slab_size == 0) + if (!bloom_filters.slab_size) return NULL; filter = bloom_filter_slab_at(&bloom_filters, c); @@ -194,16 +194,14 @@ struct bloom_filter *get_bloom_filter(struct repository *r, if (!filter->data) { load_commit_graph_info(r, c); if (c->graph_pos != COMMIT_NOT_FROM_GRAPH && - r->objects->commit_graph->chunk_bloom_indexes) { - if (load_bloom_filter_from_graph(r->objects->commit_graph, filter, c)) - return filter; - else - return NULL; - } + r->objects->commit_graph->chunk_bloom_indexes) + load_bloom_filter_from_graph(r->objects->commit_graph, filter, c); } - if (filter->data || !compute_if_not_present) + if (filter->data) return filter; + if (!compute_if_not_present) + return NULL; repo_diff_setup(r, &diffopt); diffopt.flags.recursive = 1; diff --git a/commit-graph.c b/commit-graph.c index 6a28d4a5a6..50ce039a53 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1098,7 +1098,8 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f, while (list < last) { struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0); - cur_pos += filter->len; + size_t len = filter ? filter->len : 0; + cur_pos += len; display_progress(progress, ++i); hashwrite_be32(f, cur_pos); list++; @@ -1126,8 +1127,11 @@ static void write_graph_chunk_bloom_data(struct hashfile *f, while (list < last) { struct bloom_filter *filter = get_bloom_filter(ctx->r, *list, 0); + size_t len = filter ? 
filter->len : 0; display_progress(progress, ++i); - hashwrite(f, filter->data, filter->len * sizeof(unsigned char)); + + if (len) + hashwrite(f, filter->data, len * sizeof(unsigned char)); list++; } diff --git a/revision.c b/revision.c index c644c66091..7339750af1 100644 --- a/revision.c +++ b/revision.c @@ -633,7 +633,6 @@ static unsigned int count_bloom_filter_maybe; static unsigned int count_bloom_filter_definitely_not; static unsigned int count_bloom_filter_false_positive; static unsigned int count_bloom_filter_not_present; -static unsigned int count_bloom_filter_length_zero; static void trace2_bloom_filter_statistics_atexit(void) { @@ -641,7 +640,6 @@ static void trace2_bloom_filter_statistics_atexit(void) jw_object_begin(&jw, 0); jw_object_intmax(&jw, "filter_not_present", count_bloom_filter_not_present); - jw_object_intmax(&jw, "zero_length_filter", count_bloom_filter_length_zero); jw_object_intmax(&jw, "maybe", count_bloom_filter_maybe); jw_object_intmax(&jw, "definitely_not", count_bloom_filter_definitely_not); jw_object_intmax(&jw, "false_positive", count_bloom_filter_false_positive); @@ -735,11 +733,6 @@ static int check_maybe_different_in_bloom_filter(struct rev_info *revs, return -1; } - if (!filter->len) { - count_bloom_filter_length_zero++; - return -1; - } - result = bloom_filter_contains(filter, revs->bloom_key, revs->bloom_filter_settings); diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index c7011f33e2..2761208e74 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -60,7 +60,7 @@ setup () { test_bloom_filters_used () { log_args=$1 - bloom_trace_prefix="statistics:{\"filter_not_present\":0,\"zero_length_filter\":0,\"maybe\"" + bloom_trace_prefix="statistics:{\"filter_not_present\":0,\"maybe\"" setup "$log_args" && grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" && test_cmp log_wo_bloom log_w_bloom && @@ -142,7 +142,7 @@ test_expect_success 'setup - add commit-graph to the chain with Bloom filters' ' 
test_bloom_filters_used_when_some_filters_are_missing () { log_args=$1 - bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"zero_length_filter\":0,\"maybe\":8,\"definitely_not\":6" + bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":8,\"definitely_not\":6" setup "$log_args" && grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" && test_cmp log_wo_bloom log_w_bloom @@ -152,4 +152,24 @@ test_expect_success 'Use Bloom filters if they exist in the latest but not all c test_bloom_filters_used_when_some_filters_are_missing "-- A/B" ' +test_expect_success 'correctly report changes over limit' ' + git init 513changes && + ( + cd 513changes && + for i in $(test_seq 1 513) + do + echo $i >file$i.txt || return 1 + done && + git add . && + git commit -m "files" && + git commit-graph write --reachable --changed-paths && + for i in $(test_seq 1 513) + do + git -c core.commitGraph=false log -- file$i.txt >expect && + git log -- file$i.txt >actual && + test_cmp expect actual || return 1 + done + ) +' + test_done \ No newline at end of file From patchwork Wed Jul 1 13:27:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636387 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8658B13B4 for ; Wed, 1 Jul 2020 13:27:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 698A1207F9 for ; Wed, 1 Jul 2020 13:27:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IJBa8TFE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731094AbgGAN1l (ORCPT ); Wed, 1 Jul 2020 09:27:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41138 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730729AbgGAN1h (ORCPT ); Wed, 1 Jul 2020 09:27:37 -0400 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3F2EC08C5DB for ; Wed, 1 Jul 2020 06:27:36 -0700 (PDT) Received: by mail-wm1-x342.google.com with SMTP id f139so23242925wmf.5 for ; Wed, 01 Jul 2020 06:27:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=IQiadWATS38fXopJv59PiR9qTXIDXVfzvd4sBRhCSSg=; b=IJBa8TFEZQfm2xQTzWXVf8+kgiXSqOIqIMTchm1nLif+AgvTOqODbVlibv5doKeEdU x4EVaQE9BUHZW2/aJ03/9UJjoy4crP0PNx2k5oE+UdSGjZ7wjbr34Qz+jItTShK/106B sajvqlK3V0+LOMPck/Tjeb3zmRHNbqgQbFYh5wK3Y6f3YUXsuMEsvKxZHKCQyaggIfk0 blyTRtIjpNps7KwQ7/oODQfBpGZ0A7zj5IHEmdD7eRZ/NCoJYucO+aiwdMZfs0K/sc+c JK8mnzKyEi+8SWBN0aK7gyavGHN5kLKhutjV+hyUOfXCiT0HiH1H1UQ9DoUGHIFeDuUn F8OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=IQiadWATS38fXopJv59PiR9qTXIDXVfzvd4sBRhCSSg=; b=HPeiGWHDTZ9ihKvICyHPDVkfpqrsRAsfugBfrw5D1IZDp3pdl4v73uDrMQod0GDY2G hwdgoyvRFZPlfuCmqxvFlxU54iQP9wj7d5U3CYXc5OF52OwcBZufYa7b3p7jVbx+kT21 K3JmH16MI/WxxqmT4Lfvuo1juwCjt3FRl8izX3A86X+hAa00Z3nE6XmLaPKvpZQZiuYs 2xJ60dH1qh0/7bIjtJNmUO3MOiCmIhzvdstj+rYyJmhzeTilRA/1er06H3jyVA59Vc7o s2KG4AfpdNrqFAMOLaEQiwn1Yj5qTYEBkNJdUVOMv2EWhiKGXoIg+5iZAyd1WQ9HF7Pu P1Cw== X-Gm-Message-State: AOAM533PVV2IIOMsIyhFTWrtGBzhENLlk5iWjayCEpTMRZzriJeQ3oFD 1f9Innjwgcp0q6kIc63Z0qYq/TuV X-Google-Smtp-Source: ABdhPJyBOOhBRkQZEqoSxIcozTm8NsZGt8mTw/PQOLQkwjL0KYKSVwvz2UVgQQmcIXrOMj0l7dGeCQ== X-Received: by 2002:a1c:7d56:: with SMTP id y83mr25784668wmc.154.1593610055292; Wed, 01 Jul 2020 06:27:35 -0700 (PDT) Received: 
from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h13sm7382194wml.42.2020.07.01.06.27.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Jul 2020 06:27:34 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Derrick Stolee via GitGitGadget" Date: Wed, 01 Jul 2020 13:27:24 +0000 Subject: [PATCH v4 04/10] commit-graph: persist existence of changed-paths Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee , Derrick Stolee Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Derrick Stolee The changed-path Bloom filters were released in v2.27.0, but have a significant drawback. A user can opt-in to writing the changed-path filters using the "--changed-paths" option to "git commit-graph write" but the next write will drop the filters unless that option is specified. This becomes even more important when considering the interaction with gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of features.experimental). These config options trigger commit-graph writes that the user did not signal, and hence there is no --changed-paths option available. Allow a user that opts-in to the changed-path filters to persist the property of "my commit-graph has changed-path filters" automatically. A user can drop filters using the --no-changed-paths option. In the process, we need to be extremely careful to match the Bloom filter settings as specified by the commit-graph. This will allow future versions of Git to customize these settings, and the version with this change will persist those settings as commit-graphs are rewritten on top. Use the trace2 API to signal the settings used during the write, and check that output in a test after manually adjusting the correct bytes in the commit-graph file. 
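As a rough model of the behavior described above, the sketch below maps the tri-state command-line option onto the two write flags and then decides whether filters are written and with which settings. All sketch_* names are hypothetical; only the flag bit positions and the -1/0/1 convention are taken from the diff that follows, and the fallback to defaults is folded into one function here for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Bit positions follow the enum in commit-graph.h; the names are
 * hypothetical. */
enum sketch_flags {
	SKETCH_WRITE_BLOOM_FILTERS = 1 << 4,
	SKETCH_NO_WRITE_BLOOM_FILTERS = 1 << 5
};

struct sketch_settings {
	unsigned hash_version;
	unsigned num_hashes;
	unsigned bits_per_entry;
};

/* Hypothetical view of the commit-graph already on disk. */
struct sketch_graph {
	int has_bloom_data;
	const struct sketch_settings *settings;
};

struct sketch_ctx {
	int changed_paths;
	const struct sketch_settings *settings;
};

/* opt is -1 when neither option was given, 0 for --no-changed-paths,
 * and 1 for --changed-paths. */
unsigned sketch_flags_from_option(int opt)
{
	unsigned flags = 0;

	if (!opt)
		flags |= SKETCH_NO_WRITE_BLOOM_FILTERS;
	if (opt == 1)
		flags |= SKETCH_WRITE_BLOOM_FILTERS;
	return flags;
}

/* Existing filters are kept, together with their settings, unless the
 * user explicitly asked to drop them. */
void sketch_resolve(struct sketch_ctx *ctx, unsigned flags,
		    const struct sketch_graph *existing,
		    const struct sketch_settings *defaults)
{
	ctx->changed_paths = 0;
	ctx->settings = NULL;

	if (flags & SKETCH_WRITE_BLOOM_FILTERS)
		ctx->changed_paths = 1;
	if (!(flags & SKETCH_NO_WRITE_BLOOM_FILTERS) &&
	    existing && existing->has_bloom_data) {
		ctx->changed_paths = 1;
		ctx->settings = existing->settings;
	}
	if (!ctx->settings)
		ctx->settings = defaults;
}
```

Note that settings found in an existing graph take precedence even when --changed-paths is given explicitly, which is what the 'persist filter settings' test below checks.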
Signed-off-by: Derrick Stolee --- Documentation/git-commit-graph.txt | 5 +++- builtin/commit-graph.c | 5 +++- commit-graph.c | 45 ++++++++++++++++++++++++++++-- commit-graph.h | 1 + t/t4216-log-bloom.sh | 17 ++++++++++- 5 files changed, 67 insertions(+), 6 deletions(-) diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt index f4b13c005b..369b222b08 100644 --- a/Documentation/git-commit-graph.txt +++ b/Documentation/git-commit-graph.txt @@ -60,7 +60,10 @@ existing commit-graph file. With the `--changed-paths` option, compute and write information about the paths changed between a commit and it's first parent. This operation can take a while on large repositories. It provides significant performance gains -for getting history of a directory or a file with `git log -- `. +for getting history of a directory or a file with `git log -- `. If +this option is given, future commit-graph writes will automatically assume +that this option was intended. Use `--no-changed-paths` to stop storing this +data. + With the `--split` option, write the commit-graph as a chain of multiple commit-graph files stored in `/info/commit-graphs`. 
The new commits diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index 59009837dc..ff7b177c33 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -151,6 +151,7 @@ static int graph_write(int argc, const char **argv) }; opts.progress = isatty(2); + opts.enable_changed_paths = -1; split_opts.size_multiple = 2; split_opts.max_commits = 0; split_opts.expire_time = 0; @@ -171,7 +172,9 @@ static int graph_write(int argc, const char **argv) flags |= COMMIT_GRAPH_WRITE_SPLIT; if (opts.progress) flags |= COMMIT_GRAPH_WRITE_PROGRESS; - if (opts.enable_changed_paths || + if (!opts.enable_changed_paths) + flags |= COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS; + if (opts.enable_changed_paths == 1 || git_env_bool(GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS, 0)) flags |= COMMIT_GRAPH_WRITE_BLOOM_FILTERS; diff --git a/commit-graph.c b/commit-graph.c index 50ce039a53..6762704324 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -16,6 +16,8 @@ #include "progress.h" #include "bloom.h" #include "commit-slab.h" +#include "json-writer.h" +#include "trace2.h" void git_test_write_commit_graph_or_die(void) { @@ -1108,6 +1110,21 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f, stop_progress(&progress); } +static void trace2_bloom_filter_settings(struct write_commit_graph_context *ctx) +{ + struct json_writer jw = JSON_WRITER_INIT; + + jw_object_begin(&jw, 0); + jw_object_intmax(&jw, "hash_version", ctx->bloom_settings->hash_version); + jw_object_intmax(&jw, "num_hashes", ctx->bloom_settings->num_hashes); + jw_object_intmax(&jw, "bits_per_entry", ctx->bloom_settings->bits_per_entry); + jw_end(&jw); + + trace2_data_json("bloom", ctx->r, "settings", &jw); + + jw_release(&jw); +} + static void write_graph_chunk_bloom_data(struct hashfile *f, struct write_commit_graph_context *ctx) { @@ -1116,6 +1133,8 @@ static void write_graph_chunk_bloom_data(struct hashfile *f, struct progress *progress = NULL; int i = 0; + trace2_bloom_filter_settings(ctx); + if 
(ctx->report_progress) progress = start_delayed_progress( _("Writing changed paths Bloom filters data"), @@ -1547,9 +1566,15 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) int num_chunks = 3; uint64_t chunk_offset; struct object_id file_hash; - const struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS; + struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS; - ctx->bloom_settings = &bloom_settings; + if (!ctx->bloom_settings) { + bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY", + bloom_settings.bits_per_entry); + bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES", + bloom_settings.num_hashes); + ctx->bloom_settings = &bloom_settings; + } if (ctx->split) { struct strbuf tmp_file = STRBUF_INIT; @@ -1974,9 +1999,23 @@ int write_commit_graph(struct object_directory *odb, ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0; ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0; ctx->split_opts = split_opts; - ctx->changed_paths = flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS ? 1 : 0; ctx->total_bloom_filter_data_size = 0; + if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) + ctx->changed_paths = 1; + if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) { + struct commit_graph *g; + prepare_commit_graph_one(ctx->r, ctx->odb); + + g = ctx->r->objects->commit_graph; + + /* We have changed-paths already. Keep them in the next graph */ + if (g && g->chunk_bloom_data) { + ctx->changed_paths = 1; + ctx->bloom_settings = g->bloom_filter_settings; + } + } + if (ctx->split) { struct commit_graph *g; prepare_commit_graph(ctx->r); diff --git a/commit-graph.h b/commit-graph.h index f0fb13e3f2..45b1e5bca3 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -96,6 +96,7 @@ enum commit_graph_write_flags { /* Make sure that each OID in the input is a valid commit OID. 
*/ COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3), COMMIT_GRAPH_WRITE_BLOOM_FILTERS = (1 << 4), + COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS = (1 << 5), }; struct split_commit_graph_opts { diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index 2761208e74..52ad998f9e 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -126,7 +126,7 @@ test_expect_success 'setup - add commit-graph to the chain without Bloom filters test_commit c14 A/anotherFile2 && test_commit c15 A/B/anotherFile2 && test_commit c16 A/B/C/anotherFile2 && - GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=0 git commit-graph write --reachable --split && + git commit-graph write --reachable --split --no-changed-paths && test_line_count = 2 .git/objects/info/commit-graphs/commit-graph-chain ' @@ -152,6 +152,21 @@ test_expect_success 'Use Bloom filters if they exist in the latest but not all c test_bloom_filters_used_when_some_filters_are_missing "-- A/B" ' +test_expect_success 'persist filter settings' ' + test_when_finished rm -rf .git/objects/info/commit-graph* && + rm -rf .git/objects/info/commit-graph* && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + GIT_TRACE2_EVENT_NESTING=5 \ + GIT_TEST_BLOOM_SETTINGS_NUM_HASHES=9 \ + GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY=15 \ + git commit-graph write --reachable --changed-paths && + grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15}" trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2-auto.txt" \ + GIT_TRACE2_EVENT_NESTING=5 \ + git commit-graph write --reachable --changed-paths && + grep "{\"hash_version\":1,\"num_hashes\":9,\"bits_per_entry\":15}" trace2-auto.txt +' + test_expect_success 'correctly report changes over limit' ' git init 513changes && ( From patchwork Wed Jul 1 13:27:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636377 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
From: "SZEDER Gábor via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:25 +0000
Subject: [PATCH v4 05/10] commit-graph: unify the signatures of all write_graph_chunk_*() functions
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, SZEDER Gábor

From: SZEDER Gábor

Update the write_graph_chunk_*() helper functions to have the same
signature:

  - Return an int error code from all these functions.
    write_graph_chunk_base() already has an int error code, now the
    others will have one, too, but since they don't indicate any
    error, they will always return 0.

  - Drop the hash size parameter of write_graph_chunk_oids() and
    write_graph_chunk_data(); its value can be read directly from
    'the_hash_algo' inside these functions as well.

This opens up the possibility for further cleanups and foolproofing in
the following two patches.
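To illustrate why a shared signature is convenient, here is a minimal,
standalone sketch (not code from this series). The names writer_ctx,
write_fn, run_writers() and the toy writers are hypothetical; the only
point is that functions with an identical "int fn(ctx)" shape can be
stored in an array and driven by one loop that stops at the first
non-zero return.

```c
#include <stddef.h>

/* Hypothetical stand-in for a write context; it only counts bytes. */
struct writer_ctx {
	size_t written;
};

/* Every writer shares this signature and returns 0 on success. */
typedef int (*write_fn)(struct writer_ctx *ctx);

static int write_a(struct writer_ctx *ctx)
{
	ctx->written += 4;
	return 0;
}

static int write_b(struct writer_ctx *ctx)
{
	ctx->written += 8;
	return 0;
}

/* A writer that always reports an error, to show the early exit. */
static int write_fail(struct writer_ctx *ctx)
{
	(void)ctx;
	return -1;
}

/* Call each writer in order; give up at the first error. */
static int run_writers(const write_fn *fns, size_t nr, struct writer_ctx *ctx)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (fns[i](ctx))
			return -1;
	return 0;
}
```

With void-returning writers or differing parameter lists, such a table
would need per-function glue; with one signature the caller needs no
special cases.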
Helped-by: René Scharfe Signed-off-by: SZEDER Gábor Signed-off-by: Derrick Stolee --- commit-graph.c | 42 ++++++++++++++++++++++++++---------------- 1 file changed, 26 insertions(+), 16 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 6762704324..1a6d26f864 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -891,8 +891,8 @@ struct write_commit_graph_context { const struct bloom_filter_settings *bloom_settings; }; -static void write_graph_chunk_fanout(struct hashfile *f, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_fanout(struct hashfile *f, + struct write_commit_graph_context *ctx) { int i, count = 0; struct commit **list = ctx->commits.list; @@ -913,17 +913,21 @@ static void write_graph_chunk_fanout(struct hashfile *f, hashwrite_be32(f, count); } + + return 0; } -static void write_graph_chunk_oids(struct hashfile *f, int hash_len, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_oids(struct hashfile *f, + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; int count; for (count = 0; count < ctx->commits.nr; count++, list++) { display_progress(ctx->progress, ++ctx->progress_cnt); - hashwrite(f, (*list)->object.oid.hash, (int)hash_len); + hashwrite(f, (*list)->object.oid.hash, the_hash_algo->rawsz); } + + return 0; } static const unsigned char *commit_to_sha1(size_t index, void *table) @@ -932,8 +936,8 @@ static const unsigned char *commit_to_sha1(size_t index, void *table) return commits[index]->object.oid.hash; } -static void write_graph_chunk_data(struct hashfile *f, int hash_len, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_data(struct hashfile *f, + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -950,7 +954,7 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len, die(_("unable to parse commit %s"), 
oid_to_hex(&(*list)->object.oid)); tree = get_commit_tree_oid(*list); - hashwrite(f, tree->hash, hash_len); + hashwrite(f, tree->hash, the_hash_algo->rawsz); parent = (*list)->parents; @@ -1030,10 +1034,12 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len, list++; } + + return 0; } -static void write_graph_chunk_extra_edges(struct hashfile *f, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_extra_edges(struct hashfile *f, + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1082,10 +1088,12 @@ static void write_graph_chunk_extra_edges(struct hashfile *f, list++; } + + return 0; } -static void write_graph_chunk_bloom_indexes(struct hashfile *f, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_bloom_indexes(struct hashfile *f, + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1108,6 +1116,7 @@ static void write_graph_chunk_bloom_indexes(struct hashfile *f, } stop_progress(&progress); + return 0; } static void trace2_bloom_filter_settings(struct write_commit_graph_context *ctx) @@ -1125,8 +1134,8 @@ static void trace2_bloom_filter_settings(struct write_commit_graph_context *ctx) jw_release(&jw); } -static void write_graph_chunk_bloom_data(struct hashfile *f, - struct write_commit_graph_context *ctx) +static int write_graph_chunk_bloom_data(struct hashfile *f, + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1155,6 +1164,7 @@ static void write_graph_chunk_bloom_data(struct hashfile *f, } stop_progress(&progress); + return 0; } static int oid_compare(const void *_a, const void *_b) @@ -1671,8 +1681,8 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) num_chunks * ctx->commits.nr); } 
write_graph_chunk_fanout(f, ctx); - write_graph_chunk_oids(f, hashsz, ctx); - write_graph_chunk_data(f, hashsz, ctx); + write_graph_chunk_oids(f, ctx); + write_graph_chunk_data(f, ctx); if (ctx->num_extra_edges) write_graph_chunk_extra_edges(f, ctx); if (ctx->changed_paths) { From patchwork Wed Jul 1 13:27:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636379 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDBA7161F for ; Wed, 1 Jul 2020 13:27:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D4E17207F9 for ; Wed, 1 Jul 2020 13:27:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="j3CP1p7P" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731100AbgGAN1n (ORCPT ); Wed, 1 Jul 2020 09:27:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41146 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731084AbgGAN1j (ORCPT ); Wed, 1 Jul 2020 09:27:39 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D14FC08C5DC for ; Wed, 1 Jul 2020 06:27:38 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id z13so23834060wrw.5 for ; Wed, 01 Jul 2020 06:27:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=qC2zQUw85XBuWBnNo0E+B0/zAE11gJ3LrNgTrM3pySo=; b=j3CP1p7PMR6ws3njDMdMdyaFza4kZhkf4LKr3cW3OHlHdeCj8o+wS2LniG9RXgEvC0 
Message-Id: <5ed0ce20a4a82d09ec2e9b82e12b0315dc6fd7d4.1593610050.git.gitgitgadget@gmail.com>
From: "SZEDER Gábor via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:26 +0000
Subject: [PATCH v4 06/10] commit-graph: simplify chunk writes into loop
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, SZEDER Gábor

From: SZEDER Gábor

In write_commit_graph_file() we now have one block of code filling the
array of 'struct chunk_info' with the IDs and sizes of chunks to be
written, and another block of code calling the functions responsible
for writing individual chunks. In case of optional chunks like Extra
Edge List and Base Graphs List there is also a condition checking
whether that chunk is necessary/desired, and that same condition is
repeated in both blocks of code. Other, newer chunks have similar
optional conditions.

Eliminate these repeated conditions by storing the function pointers
responsible for writing individual chunks in the 'struct chunk_info'
array as well, and calling them in a loop to write the commit-graph
file. This will open up the possibility for a bit of foolproofing in
the following patch.

Helped-by: René Scharfe
Signed-off-by: SZEDER Gábor
Signed-off-by: Derrick Stolee
---
 commit-graph.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c index 1a6d26f864..2b26a9dad3 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1559,9 +1559,13 @@ static int write_graph_chunk_base(struct hashfile *f, return 0; } +typedef int (*chunk_write_fn)(struct hashfile *f, + struct write_commit_graph_context *ctx); + struct chunk_info { uint32_t id; uint64_t size; + chunk_write_fn write_fn; }; static int write_commit_graph_file(struct write_commit_graph_context *ctx) @@ -1624,27 +1628,34 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) chunks[0].id = GRAPH_CHUNKID_OIDFANOUT; chunks[0].size = GRAPH_FANOUT_SIZE; + chunks[0].write_fn = write_graph_chunk_fanout; chunks[1].id = GRAPH_CHUNKID_OIDLOOKUP; chunks[1].size = hashsz * ctx->commits.nr; + chunks[1].write_fn = write_graph_chunk_oids; chunks[2].id = GRAPH_CHUNKID_DATA; chunks[2].size = (hashsz + 16) * ctx->commits.nr; + chunks[2].write_fn = write_graph_chunk_data; if (ctx->num_extra_edges) {
chunks[num_chunks].id = GRAPH_CHUNKID_EXTRAEDGES; chunks[num_chunks].size = 4 * ctx->num_extra_edges; + chunks[num_chunks].write_fn = write_graph_chunk_extra_edges; num_chunks++; } if (ctx->changed_paths) { chunks[num_chunks].id = GRAPH_CHUNKID_BLOOMINDEXES; chunks[num_chunks].size = sizeof(uint32_t) * ctx->commits.nr; + chunks[num_chunks].write_fn = write_graph_chunk_bloom_indexes; num_chunks++; chunks[num_chunks].id = GRAPH_CHUNKID_BLOOMDATA; chunks[num_chunks].size = sizeof(uint32_t) * 3 + ctx->total_bloom_filter_data_size; + chunks[num_chunks].write_fn = write_graph_chunk_bloom_data; num_chunks++; } if (ctx->num_commit_graphs_after > 1) { chunks[num_chunks].id = GRAPH_CHUNKID_BASE; chunks[num_chunks].size = hashsz * (ctx->num_commit_graphs_after - 1); + chunks[num_chunks].write_fn = write_graph_chunk_base; num_chunks++; } @@ -1680,19 +1691,12 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) progress_title.buf, num_chunks * ctx->commits.nr); } - write_graph_chunk_fanout(f, ctx); - write_graph_chunk_oids(f, ctx); - write_graph_chunk_data(f, ctx); - if (ctx->num_extra_edges) - write_graph_chunk_extra_edges(f, ctx); - if (ctx->changed_paths) { - write_graph_chunk_bloom_indexes(f, ctx); - write_graph_chunk_bloom_data(f, ctx); - } - if (ctx->num_commit_graphs_after > 1 && - write_graph_chunk_base(f, ctx)) { - return -1; + + for (i = 0; i < num_chunks; i++) { + if (chunks[i].write_fn(f, ctx)) + return -1; } + stop_progress(&ctx->progress); strbuf_release(&progress_title); From patchwork Wed Jul 1 13:27:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636381 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A13E13B4 for ; Wed, 1 Jul 2020 13:27:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org 
From: "SZEDER Gábor via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:27 +0000
Subject: [PATCH v4 07/10] commit-graph: check chunk sizes after writing
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, SZEDER Gábor

From: SZEDER Gábor

In my experience while experimenting with new commit-graph chunks,
early versions of the corresponding new write_commit_graph_my_chunk()
functions are, sadly but not surprisingly, often buggy, and write more
or less data than they are supposed to, especially if the chunk size
is not directly proportional to the number of commits. This then
causes all kinds of issues when reading such a bogus commit-graph
file, raising the question of whether the writing or the reading part
happens to be buggy this time.

Let's catch such issues early, already when writing the commit-graph
file, and check that each write_graph_chunk_*() function wrote the
amount of data that it was expected to, and what has been encoded in
the Chunk Lookup table.
Now that all commit-graph chunks are written in a loop we can do this check in a single place for all chunks, and any chunks added in the future will get checked as well. Helped-by: René Scharfe Signed-off-by: SZEDER Gábor Signed-off-by: Derrick Stolee --- commit-graph.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index 2b26a9dad3..6752916c1a 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1693,8 +1693,15 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) } for (i = 0; i < num_chunks; i++) { + uint64_t start_offset = f->total + f->offset; + if (chunks[i].write_fn(f, ctx)) return -1; + + if (f->total + f->offset != start_offset + chunks[i].size) + BUG("expected to write %"PRId64" bytes to chunk %"PRIx32", but wrote %"PRId64" instead", + chunks[i].size, chunks[i].id, + f->total + f->offset - start_offset); } stop_progress(&ctx->progress); From patchwork Wed Jul 1 13:27:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636389 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6917C13B4 for ; Wed, 1 Jul 2020 13:27:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4E08D207F5 for ; Wed, 1 Jul 2020 13:27:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Dy6hEayg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731107AbgGAN1u (ORCPT ); Wed, 1 Jul 2020 09:27:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731090AbgGAN1k (ORCPT ); Wed, 1 Jul 2020 09:27:40 -0400 Received: from 
From: "Derrick Stolee via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:28 +0000
Subject: [PATCH v4 08/10] revision.c: fix whitespace
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, Derrick Stolee

From: Derrick Stolee

Here, four spaces were used instead of tab characters.

Reported-by: Taylor Blau
Signed-off-by: Derrick Stolee
---
 revision.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/revision.c b/revision.c index 7339750af1..ddf09ab0aa 100644 --- a/revision.c +++ b/revision.c @@ -695,11 +695,11 @@ static void prepare_to_use_bloom_filter(struct rev_info *revs) /* remove single trailing slash from path, if needed */ if (pi->match[last_index] == '/') { - path_alloc = xstrdup(pi->match); - path_alloc[last_index] = '\0'; - path = path_alloc; + path_alloc = xstrdup(pi->match); + path_alloc[last_index] = '\0'; + path = path_alloc; } else - path = pi->match; + path = pi->match; len = strlen(path);

From patchwork Wed Jul 1 13:27:29 2020
X-Patchwork-Submitter: John Cai via GitGitGadget
X-Patchwork-Id: 11636385
From: "Taylor Blau via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:29 +0000
Subject: [PATCH v4 09/10] revision: empty pathspecs should not use Bloom filters
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, Taylor Blau

From: Taylor Blau

The prepare_to_use_bloom_filter() method was not intended to be called
on an empty pathspec. However, 'git log -- .' and 'git log' are subtly
different: the latter reports all commits while the former will
simplify commits that do not change the root tree.

This means that the path used to construct the bloom_key might be
empty, and that value is not added to the Bloom filter during
construction. That means that the results are likely incorrect!

To resolve the issue, be careful about the length of the path and stop
filling Bloom filters. To be completely sure we do not use them, drop
the pointer to the bloom_filter_settings from the commit-graph. That
allows our test to look at the trace2 logs to verify no Bloom filter
statistics are reported.
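A minimal, standalone sketch of the length check described above. The
helper names bloom_key_len() and can_use_bloom_filter() are made up for
illustration and do not exist in Git. The sketch also assumes the
pathspec has already been normalized, so that "." at the repository
root arrives as an empty string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Length of the path that would be hashed into a Bloom key:
 * the match string with a single trailing slash removed.
 */
static size_t bloom_key_len(const char *match)
{
	size_t len = strlen(match);

	if (len && match[len - 1] == '/')
		len--;
	return len;
}

/*
 * An empty path was never added to any filter, so querying for it
 * would give misleading answers; report that filters must be skipped.
 */
static int can_use_bloom_filter(const char *match)
{
	return bloom_key_len(match) != 0;
}
```

A caller that gets 0 back would fall back to the regular tree diff for
every commit, which is the behavior the new test expects for "-- .".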
Signed-off-by: Taylor Blau Signed-off-by: Derrick Stolee --- revision.c | 4 ++++ t/t4216-log-bloom.sh | 4 ++++ 2 files changed, 8 insertions(+) diff --git a/revision.c b/revision.c index ddf09ab0aa..667ca36e1c 100644 --- a/revision.c +++ b/revision.c @@ -702,6 +702,10 @@ static void prepare_to_use_bloom_filter(struct rev_info *revs) path = pi->match; len = strlen(path); + if (!len) { + revs->bloom_filter_settings = NULL; + return; + } revs->bloom_key = xmalloc(sizeof(struct bloom_key)); fill_bloom_key(path, len, revs->bloom_key, revs->bloom_filter_settings); diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index 52ad998f9e..29338f36bf 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -112,6 +112,10 @@ test_expect_success 'git log -- multiple path specs does not use Bloom filters' test_bloom_filters_not_used "-- file4 A/file1" ' +test_expect_success 'git log -- "." pathspec at root does not use Bloom filters' ' + test_bloom_filters_not_used "-- ." +' + test_expect_success 'git log with wildcard that resolves to a single path uses Bloom filters' ' test_bloom_filters_used "-- *4" && test_bloom_filters_used "-- *renamed" From patchwork Wed Jul 1 13:27:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Cai via GitGitGadget X-Patchwork-Id: 11636383 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8F93D13B4 for ; Wed, 1 Jul 2020 13:27:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6F101207F5 for ; Wed, 1 Jul 2020 13:27:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="KulU/lmX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731103AbgGAN1p (ORCPT ); Wed, 1 Jul 2020 09:27:45 -0400 
Message-Id: <9c4a00ab089c48870f0e3ad02ed15c1cc71f8625.1593610050.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "SZEDER Gábor via GitGitGadget"
Date: Wed, 01 Jul 2020 13:27:30 +0000
Subject: [PATCH v4 10/10] commit-graph: check all leading directories in changed path Bloom filters
MIME-Version: 1.0
Fcc: Sent
To: git@vger.kernel.org
Cc: me@ttaylorr.com, szeder.dev@gmail.com, l.s.r@web.de, Derrick Stolee, SZEDER Gábor
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: SZEDER Gábor

The file 'dir/subdir/file' can only be modified if its leading
directories 'dir' and 'dir/subdir' are modified as well.

So when checking modified path Bloom filters looking for commits
modifying a path with multiple path components, then check not only
the full path in the Bloom filters, but all its leading directories as
well. Take care to check these paths in "deepest first" order,
because it's the full path that is least likely to be modified, and
the Bloom filter queries can short circuit sooner.

This can significantly reduce the average false positive rate, by
about an order of magnitude or three(!), and can further speed up
pathspec-limited revision walks.
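
The "deepest first" order described above can be sketched in plain C. This is only an illustration under stated assumptions: leading_prefixes() is a hypothetical helper that does not exist in git, it works on fixed 256-byte buffers instead of struct bloom_key, and it assumes a non-empty, normalized path with '/' separators and no trailing slash.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of git): copy the prefixes of `path`
 * into `out`, full path first, then each leading directory from
 * deepest to shallowest. Returns the number of prefixes written.
 *
 * Assumes a normalized path: '/' separators, no trailing slash.
 * Returns 0 for an empty path or one that does not fit the buffers.
 */
static size_t leading_prefixes(const char *path, char out[][256], size_t max)
{
	size_t len = strlen(path);
	size_t n = 0;
	const char *p;

	if (!len || !max || len >= 256)
		return 0;

	/* Entry 0 is always the full path. */
	memcpy(out[n], path, len);
	out[n][len] = '\0';
	n++;

	/* Walk backwards; every '/' found ends one leading directory. */
	for (p = path + len - 1; p > path && n < max; p--) {
		if (*p == '/') {
			size_t plen = (size_t)(p - path);

			memcpy(out[n], path, plen);
			out[n][plen] = '\0';
			n++;
		}
	}
	return n;
}
```

Called with "dir/subdir/file", the helper fills the array with 'dir/subdir/file', 'dir/subdir' and 'dir', in that order, which is the order in which the keys are meant to be queried.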

The table below compares the average false positive rate and runtime of

  git rev-list HEAD -- "$path"

before and after this change for 5000+ randomly* selected paths from
each repository:

                    Average false           Average        Average
                    positive rate           runtime        runtime
                  before     after     before     after   difference
  ------------------------------------------------------------------
  git             3.220%   0.7853%    0.0558s   0.0387s     -30.6%
  linux           2.453%   0.0296%    0.1046s   0.0766s     -26.8%
  tensorflow      2.536%   0.6977%    0.0594s   0.0420s     -29.2%

*Path selection was done with the following pipeline:

  git ls-tree -r --name-only HEAD | sort -R | head -n 5000

The improvements in runtime are much smaller than the improvements in
average false positive rate, as we are clearly reaching diminishing
returns here. However, all these timings depend on access to tree
objects being reasonably fast (warm caches). If we had a partial
clone and the tree objects had to be fetched from a promisor remote,
e.g.:

  $ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git
  $ git -C webkit.git -c core.modifiedPathBloomFilters=1 \
        commit-graph write --reachable
  $ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/
  $ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \
        rev-list HEAD -- "$path"

then checking all leading path components can reduce the runtime from
over an hour to a few seconds (and this is with the clone and the
promisor on the same machine).

This adjusts the tracing values in t4216-log-bloom.sh, which provides a
concrete way to notice the improvement.

Helped-by: Taylor Blau
Helped-by: René Scharfe
Signed-off-by: SZEDER Gábor
Signed-off-by: Derrick Stolee
---
 revision.c           | 46 +++++++++++++++++++++++++++++++++++---------
 revision.h           |  6 ++++--
 t/t4216-log-bloom.sh |  2 +-
 3 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/revision.c b/revision.c
index 667ca36e1c..b9118001f9 100644
--- a/revision.c
+++ b/revision.c
@@ -668,9 +668,10 @@ static void prepare_to_use_bloom_filter(struct rev_info *revs)
 {
 	struct pathspec_item *pi;
 	char *path_alloc = NULL;
-	const char *path;
+	const char *path, *p;
 	int last_index;
-	int len;
+	size_t len;
+	int path_component_nr = 1;
 	if (!revs->commits)
 		return;
@@ -707,8 +708,33 @@ static void prepare_to_use_bloom_filter(struct rev_info *revs)
 		return;
 	}
-	revs->bloom_key = xmalloc(sizeof(struct bloom_key));
-	fill_bloom_key(path, len, revs->bloom_key, revs->bloom_filter_settings);
+	p = path;
+	while (*p) {
+		/*
+		 * At this point, the path is normalized to use Unix-style
+		 * path separators. This is required due to how the
+		 * changed-path Bloom filters store the paths.
+		 */
+		if (*p == '/')
+			path_component_nr++;
+		p++;
+	}
+
+	revs->bloom_keys_nr = path_component_nr;
+	ALLOC_ARRAY(revs->bloom_keys, revs->bloom_keys_nr);
+
+	fill_bloom_key(path, len, &revs->bloom_keys[0],
+		       revs->bloom_filter_settings);
+	path_component_nr = 1;
+
+	p = path + len - 1;
+	while (p > path) {
+		if (*p == '/')
+			fill_bloom_key(path, p - path,
+				       &revs->bloom_keys[path_component_nr++],
+				       revs->bloom_filter_settings);
+		p--;
+	}
 	if (trace2_is_enabled() && !bloom_filter_atexit_registered) {
 		atexit(trace2_bloom_filter_statistics_atexit);
@@ -722,7 +748,7 @@ static int check_maybe_different_in_bloom_filter(struct rev_info *revs,
 						 struct commit *commit)
 {
 	struct bloom_filter *filter;
-	int result;
+	int result = 1, j;
 	if (!revs->repo->objects->commit_graph)
 		return -1;
@@ -737,9 +763,11 @@ static int check_maybe_different_in_bloom_filter(struct rev_info *revs,
 		return -1;
 	}
-	result = bloom_filter_contains(filter,
-				       revs->bloom_key,
-				       revs->bloom_filter_settings);
+	for (j = 0; result && j < revs->bloom_keys_nr; j++) {
+		result = bloom_filter_contains(filter,
+					       &revs->bloom_keys[j],
+					       revs->bloom_filter_settings);
+	}
 	if (result)
 		count_bloom_filter_maybe++;
@@ -779,7 +807,7 @@ static int rev_compare_tree(struct rev_info *revs,
 		return REV_TREE_SAME;
 	}
-	if (revs->bloom_key && !nth_parent) {
+	if (revs->bloom_keys_nr && !nth_parent) {
 		bloom_ret = check_maybe_different_in_bloom_filter(revs, commit);
 		if (bloom_ret == 0)
diff --git a/revision.h b/revision.h
index 7c026fe41f..abbfb4ab59 100644
--- a/revision.h
+++ b/revision.h
@@ -295,8 +295,10 @@ struct rev_info {
 	struct topo_walk_info *topo_walk_info;
 	/* Commit graph bloom filter fields */
-	/* The bloom filter key for the pathspec */
-	struct bloom_key *bloom_key;
+	/* The bloom filter key(s) for the pathspec */
+	struct bloom_key *bloom_keys;
+	int bloom_keys_nr;
+
 	/*
 	 * The bloom filter settings used to generate the key.
 	 * This is loaded from the commit-graph being used.
diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh
index 29338f36bf..4892364e74 100755
--- a/t/t4216-log-bloom.sh
+++ b/t/t4216-log-bloom.sh
@@ -146,7 +146,7 @@ test_expect_success 'setup - add commit-graph to the chain with Bloom filters' '
 test_bloom_filters_used_when_some_filters_are_missing () {
 	log_args=$1
-	bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":8,\"definitely_not\":6"
+	bloom_trace_prefix="statistics:{\"filter_not_present\":3,\"maybe\":6,\"definitely_not\":8"
 	setup "$log_args" &&
 	grep -q "$bloom_trace_prefix" "$TRASH_DIRECTORY/trace.perf" &&
 	test_cmp log_wo_bloom log_w_bloom
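
For readers who want to experiment with the short-circuit idea from the loop in check_maybe_different_in_bloom_filter() outside of git, here is a toy model. Everything in it is an assumption made for illustration: the filter is a single 64-bit word, the hash is seeded FNV-1a rather than the hash git uses, and the toy_* names are invented. It also assumes, as the commit message implies, that a commit's filter records each changed path together with its leading directories.

```c
#include <stdint.h>
#include <string.h>

#define TOY_BITS   64u	/* toy size; real filters are sized per commit */
#define TOY_HASHES 3u	/* number of probes per key, chosen arbitrarily */

/* Seeded FNV-1a: a stand-in hash, NOT the one git uses. */
static uint32_t toy_hash(const char *s, size_t len, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 16777619u;
	}
	return h;
}

static void toy_add(uint64_t *filter, const char *s, size_t len)
{
	uint32_t k;

	for (k = 0; k < TOY_HASHES; k++)
		*filter |= UINT64_C(1) << (toy_hash(s, len, k) % TOY_BITS);
}

static int toy_contains(uint64_t filter, const char *s, size_t len)
{
	uint32_t k;

	for (k = 0; k < TOY_HASHES; k++)
		if (!(filter & (UINT64_C(1) << (toy_hash(s, len, k) % TOY_BITS))))
			return 0;
	return 1;
}

/* Record a changed path and every one of its leading directories. */
static void toy_add_with_dirs(uint64_t *filter, const char *path)
{
	size_t len = strlen(path), i;

	if (!len)
		return;
	toy_add(filter, path, len);
	for (i = len - 1; i > 0; i--)
		if (path[i] == '/')
			toy_add(filter, path, i);
}

/*
 * Returns 0 for "definitely not changed", 1 for "maybe changed".
 * The full path is queried first, then the leading directories from
 * deepest to shallowest; the first miss ends the search.
 */
static int toy_maybe_changed(uint64_t filter, const char *path)
{
	size_t len = strlen(path), i;

	if (!len)
		return 1;	/* nothing to query, so nothing can be ruled out */
	if (!toy_contains(filter, path, len))
		return 0;
	for (i = len - 1; i > 0; i--)
		if (path[i] == '/' && !toy_contains(filter, path, i))
			return 0;
	return 1;
}
```

A path that was added with toy_add_with_dirs() always yields "maybe", since Bloom filters have no false negatives. A path that was not added is reported as "maybe" only if the full path and every leading directory collide in the filter, which is the reason the extra queries can lower the false positive rate.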