From patchwork Sun Aug 9 02:53:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706475 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 922BE14B7 for ; Sun, 9 Aug 2020 02:53:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 734B1206D8 for ; Sun, 9 Aug 2020 02:53:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="nFQjwjmE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726097AbgHICxv (ORCPT ); Sat, 8 Aug 2020 22:53:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40418 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725988AbgHICxv (ORCPT ); Sat, 8 Aug 2020 22:53:51 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E8CD4C061756 for ; Sat, 8 Aug 2020 19:53:50 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id 3so5215838wmi.1 for ; Sat, 08 Aug 2020 19:53:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=8HfzOFn9EHvhlcJ4nPceB7zs4j6XWa23+RSfvwPSb0Y=; b=nFQjwjmEv4hXma8csi+ypq0SveVUtmCPw3Ep90S4aNO8JdYOcUKN1WZbymHAcY7yFm PNo/72mCGGkgQEUXIcbkM7XIFqC/QF02Cj+ecMupokenJaBh4QN+nsO9P8pz9iVoqx7W dYQOEClOaQwp5+V0vg2IgatU38m93Xojueogjts4H5867Zh31UJbJGyBeNpbc2pynl9S TsF4/yTOJdgV63dXoOpz80QcCxln4Buvr0t061fSR2jXdB8VYlsFMZk9avvyncbwSiFP Dvolxm+UpOIzn/NSlPfLp5ZMhBx2HL+5EPKHZ+R/iJEWD34PEEWVRLnhrjVa+ANyHWMJ UwJw== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=8HfzOFn9EHvhlcJ4nPceB7zs4j6XWa23+RSfvwPSb0Y=; b=Zn/ut+THJHO4StwgD9K6pJMwdTL4SIAq4fu1J1DYIPYXsBicgExuOXep8Eprdrz2aY sPZwZ1Yn7ZCh69PnSqj413pp+2czmpZTqqYuJwR4gcSAvQM8AngoDPz4Dakhzm/0iHMm OrcZN1C9flJ/VgrbS0wFm9h4G4HVcrdMds5aJjyRk6SUX/crea+sF9cqSfCPmjTRF1po 338fP/l8C8XFB4UV+ItesQ0amO8QGHHjfyMRpFG3U0FLV3XD6D330pzZHcKYMLA1mcam wPZSfxqRP5iibFXTQYTiOBwDALn0FsUsu2Co6ulgZq0xOsRcYX8GJ/tH40x4m7I4Byyl ZJOg== X-Gm-Message-State: AOAM533XZKLgn7RG8m+BdCzZePzTj42jOSRI0VXr0mjhgd2qQAaS51zI yGXGEOgw2flWHunwbOJ2v7WwIpeE X-Google-Smtp-Source: ABdhPJxeSqNCM+IMfHpPjKbik4/x+N7gWwAFgBNfYKVCC6uyxLrB+HPyc2slfVDVFm8bHf2KLrdMrg== X-Received: by 2002:a7b:c14e:: with SMTP id z14mr19547996wmi.34.1596941629546; Sat, 08 Aug 2020 19:53:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b129sm15667379wmb.29.2020.08.08.19.53.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:49 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:35 +0000 Subject: [PATCH v2 01/10] commit-graph: fix regression when computing bloom filter Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar commit_gen_cmp is used when writing a commit-graph to sort commits in generation order before computing Bloom filters. Since c49c82aa (commit: move members graph_pos, generation to a slab, 2020-06-17) made it so that 'commit_graph_generation()' returns 'GENERATION_NUMBER_INFINITY' during writing, we cannot call it within this function. 
Instead, access the generation number directly through the slab (i.e., by calling 'commit_graph_data_at(c)->generation') in order to access it while writing. Signed-off-by: Abhishek Kumar --- commit-graph.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index e51c91dd5b..ace7400a1a 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -144,8 +144,8 @@ static int commit_gen_cmp(const void *va, const void *vb) const struct commit *a = *(const struct commit **)va; const struct commit *b = *(const struct commit **)vb; - uint32_t generation_a = commit_graph_generation(a); - uint32_t generation_b = commit_graph_generation(b); + uint32_t generation_a = commit_graph_data_at(a)->generation; + uint32_t generation_b = commit_graph_data_at(b)->generation; /* lower generation commits first */ if (generation_a < generation_b) return -1; From patchwork Sun Aug 9 02:53:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706479 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6967A138A for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 50EC1206D8 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="o5XZ7ZiG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726299AbgHICx7 (ORCPT ); Sat, 8 Aug 2020 22:53:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40434 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725988AbgHICxy (ORCPT ); Sat, 8 Aug 2020 22:53:54 -0400 Received: from mail-wr1-x444.google.com 
(mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1256C061756 for ; Sat, 8 Aug 2020 19:53:53 -0700 (PDT) Received: by mail-wr1-x444.google.com with SMTP id 88so5052325wrh.3 for ; Sat, 08 Aug 2020 19:53:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=lIJ686Wi8AQ681+Y0AaT7IH17wTkCl0SzjBrEmyX2N8=; b=o5XZ7ZiG3ex/zGCEaWLa5TMRY8vMV6es6OiekrsaBKeAN7A3stoPNhqe3QALrMmrtc OqJ2aGqyu560MZdmOu5IYiEEivOuDqXwn5sChIsusKqVJbOcc7hyNM7JB4aTxQP8oolc YYz79IKEVRfFAEaIGX1TB+D/oCUNKttSPfQBmYAxkS77DSwoJKk4MCfO1Xgnua8k9vkA q5DzXhBmYRcsHLjG0JYpE2vUk06I2YF32Zm2OR9syF/nBnyOO1VaLUAD8Q2q9uOzrzR7 JVPZ7FklKouz0TG3HBwf4QEs0G2ccJNShPsmqIyFgha5JJGKf2jVG40yoqibjBzAxA5+ rktA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=lIJ686Wi8AQ681+Y0AaT7IH17wTkCl0SzjBrEmyX2N8=; b=P3Lk03Vgr4GcXGMFoxCHiq23PiHeVAYStP51ZkbQO2zta9hpINGQWE9lwtkmJlSRgy obqNebnmHhemxlkq30o+cw+qE3Qle1SiNBKokphp/oN+ZF4LmXWEU96hZWnI7NldSdyX mPI9GxH5Cl1PSro69+XlOwGIV6lfhJ0B6qbtRI1yc7Y3nRQU98ZqePJN6JKLkw/YAeLz 4zvHMbe4bJ4JNRFGSOi1iThd/rHmUiQQ1t6ELXGY1CK8p/C4TOQDEJHKT6YvxPvciiqT fOykovqsT1TNujZlNlLHyUnZbK79ORi+bE4gbjIzbJ8KWP+Vyv2OS8Qq1mCYNTn8rVwf jfcA== X-Gm-Message-State: AOAM531QBS6Iclhs+M0qpEYPn+ZcY3bQ3hqD9deTWXSaRLkJT7gPcKrY H67AHlojObEnEXGLyVPZaP01iuKx X-Google-Smtp-Source: ABdhPJx2ruT6FsrVoWBvy3W8fN5ukK2zgGI19u9eraqTUnNSIhkOI7rlUEmatwVA4npT85vSSsER0Q== X-Received: by 2002:adf:e845:: with SMTP id d5mr18757363wrn.228.1596941632343; Sat, 08 Aug 2020 19:53:52 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f16sm14555693wro.34.2020.08.08.19.53.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 
19:53:49 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:36 +0000 Subject: [PATCH v2 02/10] revision: parse parent in indegree_walk_step() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar In indegree_walk_step(), we add unvisited parents to the indegree queue. However, parents are not guaranteed to be parsed. As the indegree queue sorts by generation number, let's parse parents before inserting them to ensure the correct priority order. Signed-off-by: Abhishek Kumar --- revision.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/revision.c b/revision.c index 6de29cdf7a..4ec82ed5ab 100644 --- a/revision.c +++ b/revision.c @@ -3365,6 +3365,9 @@ static void indegree_walk_step(struct rev_info *revs) struct commit *parent = p->item; int *pi = indegree_slab_at(&info->indegree, parent); + if (parse_commit_gently(parent, 1) < 0) + return; + if (*pi) (*pi)++; else From patchwork Sun Aug 9 02:53:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706481 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9C7F916B1 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 84EA6206D8 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="fRCV1NEN" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726360AbgHICyA (ORCPT ); Sat, 8 Aug 2020 
22:54:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40436 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726199AbgHICxz (ORCPT ); Sat, 8 Aug 2020 22:53:55 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1981C061A27 for ; Sat, 8 Aug 2020 19:53:54 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id 88so5052343wrh.3 for ; Sat, 08 Aug 2020 19:53:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=6AVQeslCBa1mSfJ9qfMqVwa16RSD2qGpgXX3ZnaVk/A=; b=fRCV1NENfWIJmgC8wlE1O37CUgehbaCsK59bAZfdo4xStO3QYk11Sr29hSBF3yBE/c l6xZeN8gOw8EW0mJGiMsvndS18T7G7axRtmn30rJnO5pCCNd8YSmjqGIZ/sgsjtBlPGJ k64cErZCs9c3jscbmXbQ6syv5R0pJvQrKQvYoOzwmh9CBX1sEsbGRV/3YOmRHLr+vMwU N8Qya82Res9PW+BdzuvTa+j+c95tTUxwBZqAeLsgrz+K35Y5Ab8/oV0ob1xXOBJCIA2y F7yJHkki2H1y+1XejRlEbKsK39nsoyMnjqIMiECYKo/eWxhYeLSybwU3CCbX/CuB5psZ 8A6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=6AVQeslCBa1mSfJ9qfMqVwa16RSD2qGpgXX3ZnaVk/A=; b=TjfHky3Q/uXwVue5Af2naZgDqVGUiHl98Xlh85uUaFAd+pSJlQcQ3qRlogflDJwRBY gs9z58c7bY8Ebh75ODcOFZ9giynzpbfyV0yuDdySPijd1swiuKpJ2vM5F9DFkfoNshDR 9kzi2e1dHTpUzF4PS+1fI76Oi7+5VmSr8KyoaSWhwoTtgTz+PEoknW4yT8orx3zM0tVG RUBgHt5S4ES2es1ifTHJDfDkzhmLeL3BK7SnPYK2SuOrazSQrkg8bkyOOrNtCjz372lk sNPs7EHCI5K3YCD/E4nTV1tL1Q+wirbtDRGmspqnYH0m+Qgw3Wwv/7tNDrwsERtVs9Vq YCIQ== X-Gm-Message-State: AOAM5300WFKMUUt3p5GuP//pjPhBpUxxsaQn3+3nIxvb+g097Z7eaRFR U/DV/XI7Pps158LclX5jGNwKZTzf X-Google-Smtp-Source: ABdhPJzHhiloenrCvjOOImXRszIv47blnIfVR9/JbHXewitUmV5a9n7MwhKDB66i0aH4bPGwTxAtew== X-Received: by 2002:adf:ec10:: with SMTP id 
x16mr17858319wrn.74.1596941633328; Sat, 08 Aug 2020 19:53:53 -0700 (PDT)
Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i82sm16473697wmi.10.2020.08.08.19.53.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:52 -0700 (PDT)
Message-Id: <32da955e318c143db9605a1b2b312598b3fc5231.1596941624.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Abhishek Kumar via GitGitGadget"
Date: Sun, 09 Aug 2020 02:53:37 +0000
Subject: [PATCH v2 03/10] commit-graph: consolidate fill_commit_graph_info
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

Both fill_commit_graph_info() and fill_commit_in_graph() parse
information present in the commit data chunk. Let's simplify the
implementation by calling fill_commit_graph_info() within
fill_commit_in_graph().

The test 'generate tar with future mtime' creates a commit with a commit
time of (2 ^ 36 + 1) seconds since EPOCH. The CDAT chunk has only 34 bits
for the commit time, so this value overflows into the generation number
(within the CDAT chunk) and has undefined behavior.

The test used to pass as fill_commit_in_graph() guarantees the values of
graph position and generation number, and did not load the timestamp.
However, with corrected commit date we will need to load the timestamp
as well to populate the generation number.

Let's fix the test by setting a timestamp of (2 ^ 34 - 1) seconds, the
largest value that fits in 34 bits.
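
The 34-bit limit mentioned in this message can be illustrated with a small decoding sketch. It mirrors the two get_be32() reads in fill_commit_graph_info(), but the helper names (cdat_unpack_date, cdat_unpack_generation, cdat_date_fits) are hypothetical and not part of Git; the two words are assumed to be already converted to host byte order.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not Git's API. "high" is the 32-bit word at
 * offset hash_len + 8 of a CDAT entry and "low" is the word at offset
 * hash_len + 12, both already in host byte order.
 */

/* The date occupies the low 2 bits of "high" plus all 32 bits of "low". */
static uint64_t cdat_unpack_date(uint32_t high, uint32_t low)
{
	uint64_t date_high = high & 0x3;

	return (date_high << 32) | low;
}

/* The generation number (topological level) is the upper 30 bits of "high". */
static uint32_t cdat_unpack_generation(uint32_t high)
{
	return high >> 2;
}

/* Returns 1 if the date can be stored in the 34-bit field, 0 otherwise. */
static int cdat_date_fits(uint64_t date)
{
	return date < (UINT64_C(1) << 34);
}
```

Under these assumptions, 17179869183 (2 ^ 34 - 1) fits in the date field, while the previous test value 68719476737 (2 ^ 36 + 1) does not.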
Signed-off-by: Abhishek Kumar --- commit-graph.c | 29 +++++++++++------------------ t/t5000-tar-tree.sh | 4 ++-- 2 files changed, 13 insertions(+), 20 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index ace7400a1a..af8d9cc45e 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -725,15 +725,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, const unsigned char *commit_data; struct commit_graph_data *graph_data; uint32_t lex_index; + uint64_t date_high, date_low; while (pos < g->num_commits_in_base) g = g->base_graph; + if (pos >= g->num_commits + g->num_commits_in_base) + die(_("invalid commit position. commit-graph is likely corrupt")); + lex_index = pos - g->num_commits_in_base; commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; graph_data = commit_graph_data_at(item); graph_data->graph_pos = pos; + + date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; + date_low = get_be32(commit_data + g->hash_len + 12); + item->date = (timestamp_t)((date_high << 32) | date_low); + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; } @@ -748,38 +757,22 @@ static int fill_commit_in_graph(struct repository *r, { uint32_t edge_value; uint32_t *parent_data_ptr; - uint64_t date_low, date_high; struct commit_list **pptr; - struct commit_graph_data *graph_data; const unsigned char *commit_data; uint32_t lex_index; while (pos < g->num_commits_in_base) g = g->base_graph; - if (pos >= g->num_commits + g->num_commits_in_base) - die(_("invalid commit position. commit-graph is likely corrupt")); + fill_commit_graph_info(item, g, pos); - /* - * Store the "full" position, but then use the - * "local" position for the rest of the calculation. 
- */ - graph_data = commit_graph_data_at(item); - graph_data->graph_pos = pos; lex_index = pos - g->num_commits_in_base; - - commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index; + commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; item->object.parsed = 1; set_commit_tree(item, NULL); - date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; - date_low = get_be32(commit_data + g->hash_len + 12); - item->date = (timestamp_t)((date_high << 32) | date_low); - - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; - pptr = &item->parents; edge_value = get_be32(commit_data + g->hash_len); diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh index 37655a237c..1986354fc3 100755 --- a/t/t5000-tar-tree.sh +++ b/t/t5000-tar-tree.sh @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' rm -f .git/index && echo content >file && git add file && - GIT_COMMITTER_DATE="@68719476737 +0000" \ + GIT_COMMITTER_DATE="@17179869183 +0000" \ git commit -m "tempori parendum" ' @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' ' test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' ' - echo 4147 >expect && + echo 2514 >expect && tar_info future.tar | cut -d" " -f2 >actual && test_cmp expect actual ' From patchwork Sun Aug 9 02:53:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706495 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E786D138A for ; Sun, 9 Aug 2020 02:54:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C64DC206E2 for ; Sun, 9 Aug 2020 02:54:10 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="PuLrMSwq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726399AbgHICyH (ORCPT ); Sat, 8 Aug 2020 22:54:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726210AbgHICx5 (ORCPT ); Sat, 8 Aug 2020 22:53:57 -0400 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB03FC061A29 for ; Sat, 8 Aug 2020 19:53:56 -0700 (PDT) Received: by mail-wr1-x436.google.com with SMTP id a5so5056136wrm.6 for ; Sat, 08 Aug 2020 19:53:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=0oCQKiZpvmFFQwrKwuo8n2v4wexVlGSS2JtelNy+Og8=; b=PuLrMSwqP0PbEeWDglQj61rOeYq4KMCQS0nFl5lu+HZ9qXuDsipyBnlIuMzHYA0/oS naddTT3+NT/riymBJeh2y9L1OUlSWbsJyXAp4Dy5YWBXgOIWJRIJdfSvQbKc6pmCMYtp 831ll76O3/r/UTKdfMELdPiX2LiZiZca5NrGvgku7HYQ8Ke7lr6iLcXHDQoF3jp1saen JEvJpGP9FtfAnbylrD3XQt5q7Vzx5kCFEfiGTRk0WzeB2gK1eYtokvoI4b8hfIRxK0Og 1eSKeFg0KS4p53spamBy4ZNW0pF+7nY5Z4J6PsfSgAMJyQUFQvivtpcI6ge3SxwvDMt7 x+WQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=0oCQKiZpvmFFQwrKwuo8n2v4wexVlGSS2JtelNy+Og8=; b=aI6qefvSMe9Eq/fsZmlTTEH5NAxoLbTozyI2hfhawcediSQ7NZMDc/+YPKggp3Q2LR fd9K24yhUi/cRmHVim8yPuyjVCdAhS7nZxbFVuuolp/4bI6ZTWP6t+1lPk4OAb3mMwW+ yQSk9lBa/5hzBHHkTo/OMwfR2MgGi2mVjxvYWoLGIqPLAyAsx+6kPnXoOcT3sIJonFz3 PYpmCjmnP9/vxl/t2h1EvA/Yo4+aflsUTtBoN2QjIm+XakVeMpzaRymnXwTnleWB0Kd+ 7RO92chv7Mt8FmTJ+hZjTtx7kR9cztUEBwXVRVe4T8iXufWdhql11ExT87/je/WgOlaz SVFw== X-Gm-Message-State: 
AOAM533RWCwHELCZy6UZo+leHyFBKO/jBmxPlAAAutGYqY4UIdZ7tXke MjR3BrxQVB8u7a5B2uoos7A1DDMO X-Google-Smtp-Source: ABdhPJwjG/iLFX5ujBERR8Z4exuX/xEi3LT2XduZiXIKCztMakFuL63Hzr+Hla5yZyPmiviByj1mcA== X-Received: by 2002:adf:fb01:: with SMTP id c1mr17805498wrr.119.1596941634099; Sat, 08 Aug 2020 19:53:54 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id u66sm15967850wmu.37.2020.08.08.19.53.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:53 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:38 +0000 Subject: [PATCH v2 04/10] commit-graph: consolidate compare_commits_by_gen Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar Comparing commits by generation has been independently defined twice, in commit-reach and commit. Let's simplify the implementation by moving compare_commits_by_gen() to commit-graph. 
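
As a rough sketch of the idea, the "newer first" ordering can be derived from a single "older first" comparator by negating its result. The struct toy_commit type and both function names below are assumptions made for illustration, not Git's struct commit or its comparators, and the date tie-break is assumed to follow the same "newer first" direction.

```c
#include <stdint.h>

/* Toy stand-in for struct commit; the field names are assumptions. */
struct toy_commit {
	uint32_t generation;
	uint64_t date;
};

/* Ascending by generation: older (lower generation) commits first. */
static int toy_compare_by_gen(const void *a_, const void *b_)
{
	const struct toy_commit *a = a_, *b = b_;

	if (a->generation < b->generation)
		return -1;
	if (a->generation > b->generation)
		return 1;
	return 0;
}

/* Descending by generation, then by date: newer commits first. */
static int toy_compare_by_gen_then_date(const void *a_, const void *b_)
{
	const struct toy_commit *a = a_, *b = b_;
	int ret = toy_compare_by_gen(a_, b_);

	/* Reuse the shared comparator and flip its sign. */
	if (ret)
		return -ret;

	/* Use the date as a tie-breaker when generations are equal. */
	if (a->date < b->date)
		return 1;
	if (a->date > b->date)
		return -1;
	return 0;
}
```

Note that both toy comparators take pointers to the commits themselves, not pointers to array slots holding commit pointers, so a caller sorting an array of pointers would need a small wrapper.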
Signed-off-by: Abhishek Kumar Reviewed-by: Taylor Blau --- commit-graph.c | 15 +++++++++++++++ commit-graph.h | 2 ++ commit-reach.c | 15 --------------- commit.c | 9 +++------ 4 files changed, 20 insertions(+), 21 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index af8d9cc45e..fb6e2bf18f 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -112,6 +112,21 @@ uint32_t commit_graph_generation(const struct commit *c) return data->generation; } +int compare_commits_by_gen(const void *_a, const void *_b) +{ + const struct commit *a = _a, *b = _b; + const uint32_t generation_a = commit_graph_generation(a); + const uint32_t generation_b = commit_graph_generation(b); + + /* older commits first */ + if (generation_a < generation_b) + return -1; + else if (generation_a > generation_b) + return 1; + + return 0; +} + static struct commit_graph_data *commit_graph_data_at(const struct commit *c) { unsigned int i, nth_slab; diff --git a/commit-graph.h b/commit-graph.h index 09a97030dc..701e3d41aa 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -146,4 +146,6 @@ struct commit_graph_data { */ uint32_t commit_graph_generation(const struct commit *); uint32_t commit_graph_position(const struct commit *); + +int compare_commits_by_gen(const void *_a, const void *_b); #endif diff --git a/commit-reach.c b/commit-reach.c index efd5925cbb..c83cc291e7 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -561,21 +561,6 @@ int commit_contains(struct ref_filter *filter, struct commit *commit, return repo_is_descendant_of(the_repository, commit, list); } -static int compare_commits_by_gen(const void *_a, const void *_b) -{ - const struct commit *a = *(const struct commit * const *)_a; - const struct commit *b = *(const struct commit * const *)_b; - - uint32_t generation_a = commit_graph_generation(a); - uint32_t generation_b = commit_graph_generation(b); - - if (generation_a < generation_b) - return -1; - if (generation_a > generation_b) - return 1; - return 0; -} - int 
can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, diff --git a/commit.c b/commit.c index 7128895c3a..bed63b41fb 100644 --- a/commit.c +++ b/commit.c @@ -731,14 +731,11 @@ int compare_commits_by_author_date(const void *a_, const void *b_, int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused) { const struct commit *a = a_, *b = b_; - const uint32_t generation_a = commit_graph_generation(a), - generation_b = commit_graph_generation(b); + int ret_val = compare_commits_by_gen(a_, b_); /* newer commits first */ - if (generation_a < generation_b) - return 1; - else if (generation_a > generation_b) - return -1; + if (ret_val) + return -ret_val; /* use date as a heuristic when generations are equal */ if (a->date < b->date) From patchwork Sun Aug 9 02:53:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706493 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A345F14B7 for ; Sun, 9 Aug 2020 02:54:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 80E35206E2 for ; Sun, 9 Aug 2020 02:54:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kGo/D5vC" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726412AbgHICyJ (ORCPT ); Sat, 8 Aug 2020 22:54:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726200AbgHICx5 (ORCPT ); Sat, 8 Aug 2020 22:53:57 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 96DE6C061A28 for ; Sat, 8 Aug 2020 19:53:56 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id l2so5046660wrc.7 for ; Sat, 08 Aug 2020 19:53:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=x+NMQmyEPyym9fv5eB5Ov1DQmnxpJwCo0d/JGwLDegM=; b=kGo/D5vC19ET2T3eV9zeKBeAOkMGeC113BJ9b0J6lrKnkM8y9vXS90JlOvccqfZh4W C+vMyHZYQrGwQlxhmc2tqn4V4MHQIzDlpiOK/oxIXXg0sQNC/rFCsZm8bGJeFKl6Wf9r Z+bx9g4ApSJo3dMN1fH+NM/53/sL80PD5H5Ja2PH3dIT8YTei5KmTOKcbqj8ZJYFbJds Acr2Os87SSOEH/uRyG0q/SxcsWrMwvq2wOs95G6P3+yUG6b3V9dnN28OsbqAqF3XBm0h OAX02uZVK+FhQgiYj+07ocSagZSiIHCdETu3ESfnnCA6QyJoAS8s5Ez30nW+gX/pCLQ+ vdLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=x+NMQmyEPyym9fv5eB5Ov1DQmnxpJwCo0d/JGwLDegM=; b=fR7d7smwDlAVAZFu9hUtu7VH8quYDscICZS9hTdn1+XlzLcAry1AvxifHygRYuZfNX Gi5FPBMFqYebXUpOKk+Gax6QRLgWGh8VVNXip48jLRNye7M2UQXHVSH9Elc1u2L6loaf t62Xs7aon1+GjISEYor+vTD3mFK3UEZ2YJk8Uai3fRw+HZ2x2YwkBnLVnhMakHWoEgr2 dctcP7uLNAKh5NiPntLsW+vukstuIvCvtO3lpOz95gi48xXrS5YTvFr3/juRDqf2hnor P6/WHhW42B4Fup1L/ViVrIm27z62KxIF2SgrDkEIwWHyc5/e+42ZXoyGo/9v2fH+uqsJ ucAQ== X-Gm-Message-State: AOAM532SkrWayA+D9K278nX8W+CyevmH9JnMoCQSXc3ItiYg00wtO5pf 5ZOFZ3zAHdYXMLs0g9moBlimvZMG X-Google-Smtp-Source: ABdhPJy5o8yLuY6xcBug2bvDGow8I8ANKckSM6C09490YHZdjl3vJmfCEGyMVvPZuBESaU5c3+G8zg== X-Received: by 2002:adf:a4c8:: with SMTP id h8mr18365226wrb.262.1596941634828; Sat, 08 Aug 2020 19:53:54 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i82sm16473742wmi.10.2020.08.08.19.53.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:54 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via 
GitGitGadget"
Date: Sun, 09 Aug 2020 02:53:39 +0000
Subject: [PATCH v2 05/10] commit-graph: implement generation data chunk
MIME-Version: 1.0
Fcc: Sent
To: git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

As discovered by Ævar, we cannot increment the graph version to
distinguish between generation numbers v1 and v2 [1]. Thus, one of the
prerequisites for implementing generation number v2 was to distinguish
between graph versions in a backwards compatible manner.

We are going to introduce a new chunk called the Generation Data chunk
(or GDAT). GDAT stores generation number v2 (and any subsequent
versions), whereas CDAT will still store the topological level.

Old Git does not understand the GDAT chunk and would ignore it, reading
topological levels from CDAT. New Git can parse GDAT and take advantage
of newer generation numbers, falling back to topological levels when the
GDAT chunk is missing (as would happen with a commit-graph written by
old Git).

We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT'
which forces the commit-graph file to be written without the generation
data chunk, to emulate a commit-graph file written by old Git.
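
The fallback described above can be sketched as follows. This is a simplified model and not the code from this patch: struct toy_graph, toy_get_be32() and toy_read_generation() are hypothetical names, and the layout assumes one 32-bit big-endian value per commit in GDAT and a CDAT entry of hash_len + 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a parsed commit-graph; not Git's struct commit_graph. */
struct toy_graph {
	const unsigned char *chunk_commit_data;     /* CDAT, always present */
	const unsigned char *chunk_generation_data; /* GDAT, NULL if absent */
	size_t hash_len;
};

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t toy_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Prefer GDAT when present; otherwise fall back to the level in CDAT. */
static uint32_t toy_read_generation(const struct toy_graph *g,
				    uint32_t lex_index)
{
	const unsigned char *commit_data;

	if (g->chunk_generation_data)
		return toy_get_be32(g->chunk_generation_data +
				    sizeof(uint32_t) * lex_index);

	assert(g->chunk_commit_data != NULL);
	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
	/* The topological level sits in the upper 30 bits of this word. */
	return toy_get_be32(commit_data + g->hash_len + 8) >> 2;
}
```

A reader that does not know about GDAT only ever takes the second path, which is why keeping topological levels in CDAT preserves compatibility.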
[1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar --- commit-graph.c | 40 +++++++++++++++++++--- commit-graph.h | 2 ++ t/README | 3 ++ t/helper/test-read-graph.c | 2 ++ t/t4216-log-bloom.sh | 4 +-- t/t5318-commit-graph.sh | 27 +++++++-------- t/t5324-split-commit-graph.sh | 12 +++---- t/t6600-test-reach.sh | 62 +++++++++++++++++++---------------- 8 files changed, 99 insertions(+), 53 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index fb6e2bf18f..d5da1e8028 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -38,11 +38,12 @@ void git_test_write_commit_graph_or_die(void) #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ #define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ +#define GRAPH_CHUNKID_GENERATION_DATA 0x47444154 /* "GDAT" */ #define GRAPH_CHUNKID_EXTRAEDGES 0x45444745 /* "EDGE" */ #define GRAPH_CHUNKID_BLOOMINDEXES 0x42494458 /* "BIDX" */ #define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */ #define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */ -#define MAX_NUM_CHUNKS 7 +#define MAX_NUM_CHUNKS 8 #define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16) @@ -392,6 +393,13 @@ struct commit_graph *parse_commit_graph(void *graph_map, size_t graph_size) graph->chunk_commit_data = data + chunk_offset; break; + case GRAPH_CHUNKID_GENERATION_DATA: + if (graph->chunk_generation_data) + chunk_repeated = 1; + else + graph->chunk_generation_data = data + chunk_offset; + break; + case GRAPH_CHUNKID_EXTRAEDGES: if (graph->chunk_extra_edges) chunk_repeated = 1; @@ -758,7 +766,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, date_low = get_be32(commit_data + g->hash_len + 12); item->date = (timestamp_t)((date_high << 32) | date_low); - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; + if (g->chunk_generation_data) + graph_data->generation = get_be32(g->chunk_generation_data + sizeof(uint32_t) * 
lex_index); + else + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; } static inline void set_commit_tree(struct commit *c, struct tree *t) @@ -951,7 +962,8 @@ struct write_commit_graph_context { report_progress:1, split:1, changed_paths:1, - order_by_pack:1; + order_by_pack:1, + write_generation_data:1; const struct split_commit_graph_opts *split_opts; size_t total_bloom_filter_data_size; @@ -1105,8 +1117,21 @@ static int write_graph_chunk_data(struct hashfile *f, return 0; } +static int write_graph_chunk_generation_data(struct hashfile *f, + struct write_commit_graph_context *ctx) +{ + int i; + for (i = 0; i < ctx->commits.nr; i++) { + struct commit *c = ctx->commits.list[i]; + display_progress(ctx->progress, ++ctx->progress_cnt); + hashwrite_be32(f, commit_graph_data_at(c)->generation); + } + + return 0; +} + static int write_graph_chunk_extra_edges(struct hashfile *f, - struct write_commit_graph_context *ctx) + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1710,6 +1735,12 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) chunks[2].id = GRAPH_CHUNKID_DATA; chunks[2].size = (hashsz + 16) * ctx->commits.nr; chunks[2].write_fn = write_graph_chunk_data; + if (ctx->write_generation_data) { + chunks[num_chunks].id = GRAPH_CHUNKID_GENERATION_DATA; + chunks[num_chunks].size = sizeof(uint32_t) * ctx->commits.nr; + chunks[num_chunks].write_fn = write_graph_chunk_generation_data; + num_chunks++; + } if (ctx->num_extra_edges) { chunks[num_chunks].id = GRAPH_CHUNKID_EXTRAEDGES; chunks[num_chunks].size = 4 * ctx->num_extra_edges; @@ -2113,6 +2144,7 @@ int write_commit_graph(struct object_directory *odb, ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 
1 : 0; ctx->split_opts = split_opts; ctx->total_bloom_filter_data_size = 0; + ctx->write_generation_data = !git_env_bool(GIT_TEST_COMMIT_GRAPH_NO_GDAT, 0); if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) ctx->changed_paths = 1; diff --git a/commit-graph.h b/commit-graph.h index 701e3d41aa..cc232e0678 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -6,6 +6,7 @@ #include "oidset.h" #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH" +#define GIT_TEST_COMMIT_GRAPH_NO_GDAT "GIT_TEST_COMMIT_GRAPH_NO_GDAT" #define GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE "GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE" #define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS" @@ -67,6 +68,7 @@ struct commit_graph { const uint32_t *chunk_oid_fanout; const unsigned char *chunk_oid_lookup; const unsigned char *chunk_commit_data; + const unsigned char *chunk_generation_data; const unsigned char *chunk_extra_edges; const unsigned char *chunk_base_graphs; const unsigned char *chunk_bloom_indexes; diff --git a/t/README b/t/README index 70ec61cf88..6647ef132e 100644 --- a/t/README +++ b/t/README @@ -379,6 +379,9 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the commit-graph to be written after every 'git commit' command, and overrides the 'core.commitGraph' setting to true. +GIT_TEST_COMMIT_GRAPH_NO_GDAT=, when true, forces the +commit-graph to be written without generation data chunk. 
+ GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=<boolean>, when true, forces commit-graph write to compute and write changed path Bloom filters for every 'git commit-graph write', as if the `--changed-paths` option was diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c index 6d0c962438..1c2a5366c7 100644 --- a/t/helper/test-read-graph.c +++ b/t/helper/test-read-graph.c @@ -32,6 +32,8 @@ int cmd__read_graph(int argc, const char **argv) printf(" oid_lookup"); if (graph->chunk_commit_data) printf(" commit_metadata"); + if (graph->chunk_generation_data) + printf(" generation_data"); if (graph->chunk_extra_edges) printf(" extra_edges"); if (graph->chunk_bloom_indexes) diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index c21cc160f3..55c94e9ebd 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -33,11 +33,11 @@ test_expect_success 'setup test - repo, commits, commit graph, log outputs' ' git commit-graph write --reachable --changed-paths ' graph_read_expect () { - NUM_CHUNKS=5 + NUM_CHUNKS=6 cat >expect <<- EOF header: 43475048 1 1 $NUM_CHUNKS 0 num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data + chunks: oid_fanout oid_lookup commit_metadata generation_data bloom_indexes bloom_data EOF test-tool read-graph >actual && test_cmp expect actual diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 2804b0dd45..fef05c33d7 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -72,7 +72,7 @@ graph_git_behavior 'no graph' full commits/3 commits/1 graph_read_expect() { OPTIONAL="" NUM_CHUNKS=3 - if test ! -z $2 + if test ! 
-z "$2" then OPTIONAL=" $2" NUM_CHUNKS=$((3 + $(echo "$2" | wc -w))) @@ -99,14 +99,14 @@ test_expect_success 'exit with correct error on bad input to --stdin-commits' ' # valid commit and tree OID git rev-parse HEAD HEAD^{tree} >in && git commit-graph write --stdin-commits >commits-in && cat commits-in | git commit-graph write --stdin-commits && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "6" + graph_read_expect "6" "generation_data" ' graph_git_behavior 'graph from commits, commit 8 vs merge 1' full commits/8 merge/1 @@ -293,7 +293,7 @@ test_expect_success 'build graph from commits with append' ' cd "$TRASH_DIRECTORY/full" && git rev-parse merge/3 | git commit-graph write --stdin-commits --append && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "10" "extra_edges" + graph_read_expect "10" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -303,7 +303,7 @@ test_expect_success 'build graph using --reachable' ' cd "$TRASH_DIRECTORY/full" && git commit-graph write --reachable && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -324,7 +324,7 @@ test_expect_success 'write graph in bare repo' ' cd "$TRASH_DIRECTORY/bare" && git commit-graph write && test_path_is_file $baredir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'bare repo with graph, commit 8 vs merge 1' bare commits/8 merge/1 @@ -421,8 +421,9 @@ test_expect_success 'replace-objects invalidates commit-graph' ' test_expect_success 'git commit-graph verify' ' cd "$TRASH_DIRECTORY/full" && - git rev-parse commits/8 | git commit-graph write --stdin-commits && - git commit-graph verify >output + git rev-parse commits/8 | GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 
git commit-graph write --stdin-commits && + git commit-graph verify >output && + graph_read_expect 9 extra_edges ' NUM_COMMITS=9 diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 9b850ea907..6b25c3d9ce 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -14,11 +14,11 @@ test_expect_success 'setup repo' ' graphdir="$infodir/commit-graphs" && test_oid_init && test_oid_cache <<-EOM - shallow sha1:1760 - shallow sha256:2064 + shallow sha1:2132 + shallow sha256:2436 - base sha1:1376 - base sha256:1496 + base sha1:1408 + base sha256:1528 EOM ' @@ -29,9 +29,9 @@ graph_read_expect() { NUM_BASE=$2 fi cat >expect <<- EOF - header: 43475048 1 1 3 $NUM_BASE + header: 43475048 1 1 4 $NUM_BASE num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata + chunks: oid_fanout oid_lookup commit_metadata generation_data EOF test-tool read-graph >output && test_cmp expect output diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 475564bee7..d14b129f06 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -55,10 +55,13 @@ test_expect_success 'setup' ' git show-ref -s commit-5-5 | git commit-graph write --stdin-commits && mv .git/objects/info/commit-graph commit-graph-half && chmod u+w commit-graph-half && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable && + mv .git/objects/info/commit-graph commit-graph-no-gdat && + chmod u+w commit-graph-no-gdat && git config core.commitGraph true ' -run_three_modes () { +run_all_modes () { test_when_finished rm -rf .git/objects/info/commit-graph && "$@" <input >actual && test_cmp expect actual && @@ -67,11 +70,14 @@ run_three_modes () { test_cmp expect actual && cp commit-graph-half .git/objects/info/commit-graph && "$@" <input >actual && + test_cmp expect actual && + cp commit-graph-no-gdat .git/objects/info/commit-graph && + "$@" <input >actual && test_cmp expect actual } -test_three_modes () { - run_three_modes test-tool reach "$@" +test_all_modes () { + 
run_all_modes test-tool reach "$@" } test_expect_success 'ref_newer:miss' ' @@ -80,7 +86,7 @@ test_expect_success 'ref_newer:miss' ' B:commit-4-9 EOF echo "ref_newer(A,B):0" >expect && - test_three_modes ref_newer + test_all_modes ref_newer ' test_expect_success 'ref_newer:hit' ' @@ -89,7 +95,7 @@ test_expect_success 'ref_newer:hit' ' B:commit-2-3 EOF echo "ref_newer(A,B):1" >expect && - test_three_modes ref_newer + test_all_modes ref_newer ' test_expect_success 'in_merge_bases:hit' ' @@ -98,7 +104,7 @@ test_expect_success 'in_merge_bases:hit' ' B:commit-8-8 EOF echo "in_merge_bases(A,B):1" >expect && - test_three_modes in_merge_bases + test_all_modes in_merge_bases ' test_expect_success 'in_merge_bases:miss' ' @@ -107,7 +113,7 @@ test_expect_success 'in_merge_bases:miss' ' B:commit-5-9 EOF echo "in_merge_bases(A,B):0" >expect && - test_three_modes in_merge_bases + test_all_modes in_merge_bases ' test_expect_success 'is_descendant_of:hit' ' @@ -118,7 +124,7 @@ test_expect_success 'is_descendant_of:hit' ' X:commit-1-1 EOF echo "is_descendant_of(A,X):1" >expect && - test_three_modes is_descendant_of + test_all_modes is_descendant_of ' test_expect_success 'is_descendant_of:miss' ' @@ -129,7 +135,7 @@ test_expect_success 'is_descendant_of:miss' ' X:commit-7-6 EOF echo "is_descendant_of(A,X):0" >expect && - test_three_modes is_descendant_of + test_all_modes is_descendant_of ' test_expect_success 'get_merge_bases_many' ' @@ -144,7 +150,7 @@ test_expect_success 'get_merge_bases_many' ' git rev-parse commit-5-6 \ commit-4-7 | sort } >expect && - test_three_modes get_merge_bases_many + test_all_modes get_merge_bases_many ' test_expect_success 'reduce_heads' ' @@ -166,7 +172,7 @@ test_expect_success 'reduce_heads' ' commit-2-8 \ commit-1-10 | sort } >expect && - test_three_modes reduce_heads + test_all_modes reduce_heads ' test_expect_success 'can_all_from_reach:hit' ' @@ -189,7 +195,7 @@ test_expect_success 'can_all_from_reach:hit' ' Y:commit-8-1 EOF echo 
"can_all_from_reach(X,Y):1" >expect && - test_three_modes can_all_from_reach + test_all_modes can_all_from_reach ' test_expect_success 'can_all_from_reach:miss' ' @@ -211,7 +217,7 @@ test_expect_success 'can_all_from_reach:miss' ' Y:commit-8-5 EOF echo "can_all_from_reach(X,Y):0" >expect && - test_three_modes can_all_from_reach + test_all_modes can_all_from_reach ' test_expect_success 'can_all_from_reach_with_flag: tags case' ' @@ -234,7 +240,7 @@ test_expect_success 'can_all_from_reach_with_flag: tags case' ' Y:commit-8-1 EOF echo "can_all_from_reach_with_flag(X,_,_,0,0):1" >expect && - test_three_modes can_all_from_reach_with_flag + test_all_modes can_all_from_reach_with_flag ' test_expect_success 'commit_contains:hit' ' @@ -250,8 +256,8 @@ test_expect_success 'commit_contains:hit' ' X:commit-9-3 EOF echo "commit_contains(_,A,X,_):1" >expect && - test_three_modes commit_contains && - test_three_modes commit_contains --tag + test_all_modes commit_contains && + test_all_modes commit_contains --tag ' test_expect_success 'commit_contains:miss' ' @@ -267,8 +273,8 @@ test_expect_success 'commit_contains:miss' ' X:commit-9-3 EOF echo "commit_contains(_,A,X,_):0" >expect && - test_three_modes commit_contains && - test_three_modes commit_contains --tag + test_all_modes commit_contains && + test_all_modes commit_contains --tag ' test_expect_success 'rev-list: basic topo-order' ' @@ -280,7 +286,7 @@ test_expect_success 'rev-list: basic topo-order' ' commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \ commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ >expect && - run_three_modes git rev-list --topo-order commit-6-6 + run_all_modes git rev-list --topo-order commit-6-6 ' test_expect_success 'rev-list: first-parent topo-order' ' @@ -292,7 +298,7 @@ test_expect_success 'rev-list: first-parent topo-order' ' commit-6-2 \ commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ >expect && - run_three_modes git rev-list --first-parent 
--topo-order commit-6-6 + run_all_modes git rev-list --first-parent --topo-order commit-6-6 ' test_expect_success 'rev-list: range topo-order' ' @@ -304,7 +310,7 @@ test_expect_success 'rev-list: range topo-order' ' commit-6-2 commit-5-2 commit-4-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --topo-order commit-3-3..commit-6-6 + run_all_modes git rev-list --topo-order commit-3-3..commit-6-6 ' test_expect_success 'rev-list: range topo-order' ' @@ -316,7 +322,7 @@ test_expect_success 'rev-list: range topo-order' ' commit-6-2 commit-5-2 commit-4-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --topo-order commit-3-8..commit-6-6 + run_all_modes git rev-list --topo-order commit-3-8..commit-6-6 ' test_expect_success 'rev-list: first-parent range topo-order' ' @@ -328,7 +334,7 @@ test_expect_success 'rev-list: first-parent range topo-order' ' commit-6-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 + run_all_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 ' test_expect_success 'rev-list: ancestry-path topo-order' ' @@ -338,7 +344,7 @@ test_expect_success 'rev-list: ancestry-path topo-order' ' commit-6-4 commit-5-4 commit-4-4 commit-3-4 \ commit-6-3 commit-5-3 commit-4-3 \ >expect && - run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 + run_all_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 ' test_expect_success 'rev-list: symmetric difference topo-order' ' @@ -352,7 +358,7 @@ test_expect_success 'rev-list: symmetric difference topo-order' ' commit-3-8 commit-2-8 commit-1-8 \ commit-3-7 commit-2-7 commit-1-7 \ >expect && - run_three_modes git rev-list --topo-order commit-3-8...commit-6-6 + run_all_modes git rev-list --topo-order commit-3-8...commit-6-6 ' test_expect_success 'get_reachable_subset:all' ' @@ -372,7 +378,7 @@ test_expect_success 
'get_reachable_subset:all' ' commit-1-7 \ commit-5-6 | sort ) >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_expect_success 'get_reachable_subset:some' ' @@ -390,7 +396,7 @@ test_expect_success 'get_reachable_subset:some' ' git rev-parse commit-3-3 \ commit-1-7 | sort ) >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_expect_success 'get_reachable_subset:none' ' @@ -404,7 +410,7 @@ test_expect_success 'get_reachable_subset:none' ' Y:commit-2-8 EOF echo "get_reachable_subset(X,Y)" >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_done From patchwork Sun Aug 9 02:53:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706489 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 109ED14B7 for ; Sun, 9 Aug 2020 02:54:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E092A206D8 for ; Sun, 9 Aug 2020 02:54:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ur35OWu4" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726398AbgHICyF (ORCPT ); Sat, 8 Aug 2020 22:54:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726212AbgHICx5 (ORCPT ); Sat, 8 Aug 2020 22:53:57 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1FED5C061A2A for ; Sat, 8 Aug 2020 19:53:57 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id 
l2so5046671wrc.7 for ; Sat, 08 Aug 2020 19:53:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=4ArW47UXhDWLACISWKffd9HxQQpBFINq7O/wlB4NeZ4=; b=ur35OWu4T4jqXyfZUJhi97XnRr+u//Y41J5a+Gfe61w6iHgii9UiQQk2+4UyWWaEfG oSLHt7xdgl+/8mXjKYCeR9rO+gj7KXH8IGi+LQHQLeOOYt51wohgyrnWmM4mLV4z4WG4 th9/1jtV5Zg1AyfRFG2x8dTGIKvWa/klxORA0wMTDSsKlNk/lR5hGmdcf2QEsTX3NPZi L9SvJ9+24PSh+JBUVclwt9oSrjxD/g/FbcvLmrsYzMiVIa5AvLlkOM8Iv7n5RfMiUCMq XVjbHbTPA08aVWVTUYzyW1z7bTnEpiqzBMLTDPj6zogBpm3j7cZaXDvkwj92xyayVgCf FjRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=4ArW47UXhDWLACISWKffd9HxQQpBFINq7O/wlB4NeZ4=; b=LOWLMbW9U+tl4YMbJQRzsKmbLHFYxZKQz8SnmFISdETNGxv9Vg9RD7URDAm5LnRWW1 PJR1FHLLw4EVyhl9ygN+nPQkP3kXL04SLvW4V/N7dcBgAQGhVcRLopqe3Tnao+5zF01O aTFqkgzL7cV/krGUwelNHHaVszEZk7Qu0SSQgq7qb1rGaXBMjQGu2uq8fekulT+s0fAZ dxKoOMJjSWT2LO2XdC4oLUA6q584EJS5ste2Nn3e6lxkXAK5AJdkGiOTfFzPuvtfF0w5 tM96sg94DZ/ujC0XxxpYd0QZKZQLluhD0/KYVTmdUEp+3HOkpbvrKjZtcQWIZH8SO59L DQ+g== X-Gm-Message-State: AOAM533odN2OLiQCmJyoqwV/0yCs0jcFirlIObqEB2RHRqDX3AVWlidB YRbdLtHzLu580PoFI1mYdjfoZLlc X-Google-Smtp-Source: ABdhPJz5LCJyYtKUUV9hYf011jY26e6+0uG4DFrNywYaFW3Hw4tl73qxoUBFdaIEM1oIfJ6J+M0DIQ== X-Received: by 2002:a5d:42c2:: with SMTP id t2mr18485749wrr.396.1596941635603; Sat, 08 Aug 2020 19:53:55 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z15sm16278690wrn.89.2020.08.08.19.53.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:55 -0700 (PDT) Message-Id: <1aa2a00a7a2a0e4c884bf95261b5e308c3611fbc.1596941625.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:40 +0000 
Subject: [PATCH v2 06/10] commit-graph: return 64-bit generation number Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar In a preparatory step, let's return timestamp_t values from commit_graph_generation(), use timestamp_t for local variables and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead. Signed-off-by: Abhishek Kumar --- commit-graph.c | 18 +++++++++--------- commit-graph.h | 4 ++-- commit-reach.c | 32 ++++++++++++++++---------------- commit-reach.h | 2 +- commit.h | 3 ++- revision.c | 10 +++++----- upload-pack.c | 2 +- 7 files changed, 36 insertions(+), 35 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index d5da1e8028..42f3ec5460 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -100,7 +100,7 @@ uint32_t commit_graph_position(const struct commit *c) return data ? 
data->graph_pos : COMMIT_NOT_FROM_GRAPH; } -uint32_t commit_graph_generation(const struct commit *c) +timestamp_t commit_graph_generation(const struct commit *c) { struct commit_graph_data *data = commit_graph_data_slab_peek(&commit_graph_data_slab, c); @@ -116,8 +116,8 @@ uint32_t commit_graph_generation(const struct commit *c) int compare_commits_by_gen(const void *_a, const void *_b) { const struct commit *a = _a, *b = _b; - const uint32_t generation_a = commit_graph_generation(a); - const uint32_t generation_b = commit_graph_generation(b); + const timestamp_t generation_a = commit_graph_generation(a); + const timestamp_t generation_b = commit_graph_generation(b); /* older commits first */ if (generation_a < generation_b) @@ -160,8 +160,8 @@ static int commit_gen_cmp(const void *va, const void *vb) const struct commit *a = *(const struct commit **)va; const struct commit *b = *(const struct commit **)vb; - uint32_t generation_a = commit_graph_data_at(a)->generation; - uint32_t generation_b = commit_graph_data_at(b)->generation; + const timestamp_t generation_a = commit_graph_data_at(a)->generation; + const timestamp_t generation_b = commit_graph_data_at(b)->generation; /* lower generation commits first */ if (generation_a < generation_b) return -1; @@ -1363,7 +1363,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation; display_progress(ctx->progress, i + 1); - if (generation != GENERATION_NUMBER_INFINITY && + if (generation != GENERATION_NUMBER_V1_INFINITY && generation != GENERATION_NUMBER_ZERO) continue; @@ -1377,7 +1377,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) for (parent = current->parents; parent; parent = parent->next) { generation = commit_graph_data_at(parent->item)->generation; - if (generation == GENERATION_NUMBER_INFINITY || + if (generation == GENERATION_NUMBER_V1_INFINITY || generation == 
GENERATION_NUMBER_ZERO) { all_parents_computed = 0; commit_list_insert(parent->item, &list); @@ -2387,8 +2387,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; - uint32_t max_generation = 0; - uint32_t generation; + timestamp_t max_generation = 0; + timestamp_t generation; display_progress(progress, i + 1); hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); diff --git a/commit-graph.h b/commit-graph.h index cc232e0678..f89614ecd5 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -140,13 +140,13 @@ void disable_commit_graph(struct repository *r); struct commit_graph_data { uint32_t graph_pos; - uint32_t generation; + timestamp_t generation; }; /* * Commits should be parsed before accessing generation, graph positions. */ -uint32_t commit_graph_generation(const struct commit *); +timestamp_t commit_graph_generation(const struct commit *); uint32_t commit_graph_position(const struct commit *); int compare_commits_by_gen(const void *_a, const void *_b); diff --git a/commit-reach.c b/commit-reach.c index c83cc291e7..470bc80139 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -32,12 +32,12 @@ static int queue_has_nonstale(struct prio_queue *queue) static struct commit_list *paint_down_to_common(struct repository *r, struct commit *one, int n, struct commit **twos, - int min_generation) + timestamp_t min_generation) { struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; struct commit_list *result = NULL; int i; - uint32_t last_gen = GENERATION_NUMBER_INFINITY; + timestamp_t last_gen = GENERATION_NUMBER_INFINITY; if (!min_generation) queue.compare = compare_commits_by_commit_date; @@ -58,10 +58,10 @@ static struct commit_list *paint_down_to_common(struct repository *r, struct commit *commit = prio_queue_get(&queue); struct commit_list *parents; int flags; - uint32_t generation 
= commit_graph_generation(commit); + timestamp_t generation = commit_graph_generation(commit); if (min_generation && generation > last_gen) - BUG("bad generation skip %8x > %8x at %s", + BUG("bad generation skip %"PRItime" > %"PRItime" at %s", generation, last_gen, oid_to_hex(&commit->object.oid)); last_gen = generation; @@ -177,12 +177,12 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt repo_parse_commit(r, array[i]); for (i = 0; i < cnt; i++) { struct commit_list *common; - uint32_t min_generation = commit_graph_generation(array[i]); + timestamp_t min_generation = commit_graph_generation(array[i]); if (redundant[i]) continue; for (j = filled = 0; j < cnt; j++) { - uint32_t curr_generation; + timestamp_t curr_generation; if (i == j || redundant[j]) continue; filled_index[filled] = j; @@ -321,7 +321,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit, { struct commit_list *bases; int ret = 0, i; - uint32_t generation, min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t generation, min_generation = GENERATION_NUMBER_INFINITY; if (repo_parse_commit(r, commit)) return ret; @@ -470,7 +470,7 @@ static int in_commit_list(const struct commit_list *want, struct commit *c) static enum contains_result contains_test(struct commit *candidate, const struct commit_list *want, struct contains_cache *cache, - uint32_t cutoff) + timestamp_t cutoff) { enum contains_result *cached = contains_cache_at(cache, candidate); @@ -506,11 +506,11 @@ static enum contains_result contains_tag_algo(struct commit *candidate, { struct contains_stack contains_stack = { 0, 0, NULL }; enum contains_result result; - uint32_t cutoff = GENERATION_NUMBER_INFINITY; + timestamp_t cutoff = GENERATION_NUMBER_INFINITY; const struct commit_list *p; for (p = want; p; p = p->next) { - uint32_t generation; + timestamp_t generation; struct commit *c = p->item; load_commit_graph_info(the_repository, c); generation = commit_graph_generation(c); @@ 
-565,7 +565,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation) + timestamp_t min_generation) { struct commit **list = NULL; int i; @@ -666,13 +666,13 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0; struct commit_list *from_iter = from, *to_iter = to; int result; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; while (from_iter) { add_object_array(&from_iter->item->object, NULL, &from_objs); if (!parse_commit(from_iter->item)) { - uint32_t generation; + timestamp_t generation; if (from_iter->item->date < min_commit_date) min_commit_date = from_iter->item->date; @@ -686,7 +686,7 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, while (to_iter) { if (!parse_commit(to_iter->item)) { - uint32_t generation; + timestamp_t generation; if (to_iter->item->date < min_commit_date) min_commit_date = to_iter->item->date; @@ -726,13 +726,13 @@ struct commit_list *get_reachable_subset(struct commit **from, int nr_from, struct commit_list *found_commits = NULL; struct commit **to_last = to + nr_to; struct commit **from_last = from + nr_from; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; int num_to_find = 0; struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; for (item = to; item < to_last; item++) { - uint32_t generation; + timestamp_t generation; struct commit *c = *item; parse_commit(c); diff --git a/commit-reach.h b/commit-reach.h index b49ad71a31..148b56fea5 100644 --- a/commit-reach.h +++ b/commit-reach.h @@ -87,7 +87,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation); + 
timestamp_t min_generation); int can_all_from_reach(struct commit_list *from, struct commit_list *to, int commit_date_cutoff); diff --git a/commit.h b/commit.h index e901538909..bc0732a4fe 100644 --- a/commit.h +++ b/commit.h @@ -11,7 +11,8 @@ #include "commit-slab.h" #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF -#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF +#define GENERATION_NUMBER_INFINITY ((1ULL << 63) - 1) +#define GENERATION_NUMBER_V1_INFINITY 0xFFFFFFFF #define GENERATION_NUMBER_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 diff --git a/revision.c b/revision.c index 4ec82ed5ab..bd7b39c806 100644 --- a/revision.c +++ b/revision.c @@ -3292,7 +3292,7 @@ define_commit_slab(indegree_slab, int); define_commit_slab(author_date_slab, timestamp_t); struct topo_walk_info { - uint32_t min_generation; + timestamp_t min_generation; struct prio_queue explore_queue; struct prio_queue indegree_queue; struct prio_queue topo_queue; @@ -3338,7 +3338,7 @@ static void explore_walk_step(struct rev_info *revs) } static void explore_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3381,7 +3381,7 @@ static void indegree_walk_step(struct rev_info *revs) } static void compute_indegrees_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3439,7 +3439,7 @@ static void init_topo_walk(struct rev_info *revs) info->min_generation = GENERATION_NUMBER_INFINITY; for (list = revs->commits; list; list = list->next) { struct commit *c = list->item; - uint32_t generation; + timestamp_t generation; if (parse_commit_gently(c, 1)) continue; @@ -3500,7 +3500,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit) for (p = commit->parents; p; p = p->next) { struct commit *parent = p->item; int *pi; - uint32_t generation; + timestamp_t generation; if (parent->object.flags 
& UNINTERESTING) continue; diff --git a/upload-pack.c b/upload-pack.c index 8673741070..18ee29db67 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -490,7 +490,7 @@ static int got_oid(struct upload_pack_data *data, static int ok_to_give_up(struct upload_pack_data *data) { - uint32_t min_generation = GENERATION_NUMBER_ZERO; + timestamp_t min_generation = GENERATION_NUMBER_ZERO; if (!data->have_obj.nr) return 0; From patchwork Sun Aug 9 02:53:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706487 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0502E175A for ; Sun, 9 Aug 2020 02:54:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E2297206D8 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="HCuCFKBR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726395AbgHICyD (ORCPT ); Sat, 8 Aug 2020 22:54:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40454 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726242AbgHICx6 (ORCPT ); Sat, 8 Aug 2020 22:53:58 -0400 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE4B0C061A2B for ; Sat, 8 Aug 2020 19:53:57 -0700 (PDT) Received: by mail-wr1-x444.google.com with SMTP id p20so5059181wrf.0 for ; Sat, 08 Aug 2020 19:53:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=pr2Zc/i+T1F6Y+cZHGY7FqZwxY9Ft3aIIPl4yRzi6Xs=; b=HCuCFKBRnmsskT62haz6FS669ebg+MrRmKvpJu43FqGZOyvyX4PDtZK35rBgjnYWMN 6LXxonXLOctMRfr7OZeKMuHbqvCVwYHu9U8jKp44mHfHTqT8xSfohfzi1nFQEapHVUXW ymUbb/inQW+EYvv7J8UfvaEd2tRewoVrEWcJ8G+n/TaPSNIpgjN/kcMrtY1YX+zRC7Cv 47cNvFPC4lI2UsIUyRJDsRVxepjG0BNTD58l4UNipzKzMneIU73ZD/LJaqVcOh1MN/f0 ljbUsSKnDpDZeXGYyM6tBlBtpiGPPveZrixGeymeztcN83sHkCYBWdhy8ANVEzTFaajn Ur+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=pr2Zc/i+T1F6Y+cZHGY7FqZwxY9Ft3aIIPl4yRzi6Xs=; b=P24vuffd7E7A141jGo6QloaxzMGbTixhX7d4Cbbmljw9K3ZWKQEQRIj+AuWgtRB9ml F2QT5+IJrF/h1sIUwyHqse5PoY9xRWjc3tr90QCyThKcFBmcr7VVwNMh3OzTikg15SiC Map4iH8fQxLD71BaGm5Cypg06Gv/DUxzEKZDZAo4WYupUlkPAlvU/5QZgtd7oe7wmCwC 8zLn8415I5PTF03nj3HFfh5wmW63rGDiHJTiSs3S8pXQUfGuUmRGzcnniP4gYOASvRkq DmtzIJYnP4GiTO8ZTWs3ApJVJ4FlVQ+Aq4EvY1dWxz/Hzmo2EOQ+izDraslAPgqK1w5J qC6g== X-Gm-Message-State: AOAM531D+cbUYzHPm5eN7cvsFkqMoJyENmygscL7MTczIuLtYC8jt1Ro CthsGm417PY/vP03NMf706jKVxXF X-Google-Smtp-Source: ABdhPJz+I14H6OjVwFQlCfC5hHP8n4rBIHheQcdxKUP2TRg7zrxsgflhg/UvIv7pcQIGYoewXCkX7g== X-Received: by 2002:a5d:5746:: with SMTP id q6mr18625136wrw.59.1596941636377; Sat, 08 Aug 2020 19:53:56 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p14sm17002984wrx.90.2020.08.08.19.53.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:55 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:41 +0000 Subject: [PATCH v2 07/10] commit-graph: implement corrected commit date Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: 
Abhishek Kumar With most of the preparations done, let's implement the corrected commit date offset. We add a new commit-slab to store topological levels while writing the commit-graph and upgrade the generation member in struct commit_graph_data to a 64-bit timestamp. We store topological levels to ensure that older versions of Git will still have the performance benefits from generation number v2. Signed-off-by: Abhishek Kumar Signed-off-by: Derrick Stolee Signed-off-by: Derrick Stolee Signed-off-by: Abhishek Kumar --- commit-graph.c | 89 ++++++++++++++++++++++++++++---------------------- commit.h | 1 + 2 files changed, 51 insertions(+), 39 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 42f3ec5460..d0f977852b 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -65,6 +65,8 @@ void git_test_write_commit_graph_or_die(void) /* Remember to update object flag allocation in object.h */ #define REACHABLE (1u<<15) +define_commit_slab(topo_level_slab, uint32_t); + /* Keep track of the order in which commits are added to our list. 
*/ define_commit_slab(commit_pos, int); static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos); @@ -168,11 +170,6 @@ static int commit_gen_cmp(const void *va, const void *vb) else if (generation_a > generation_b) return 1; - /* use date as a heuristic when generations are equal */ - if (a->date < b->date) - return -1; - else if (a->date > b->date) - return 1; return 0; } @@ -767,7 +764,10 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, item->date = (timestamp_t)((date_high << 32) | date_low); if (g->chunk_generation_data) - graph_data->generation = get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); + { + graph_data->generation = item->date + + (timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); + } else graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; } @@ -948,6 +948,7 @@ struct write_commit_graph_context { struct progress *progress; int progress_done; uint64_t progress_cnt; + struct topo_level_slab *topo_levels; char *base_graph_name; int num_commit_graphs_before; @@ -1106,7 +1107,7 @@ static int write_graph_chunk_data(struct hashfile *f, else packedDate[0] = 0; - packedDate[0] |= htonl(commit_graph_data_at(*list)->generation << 2); + packedDate[0] |= htonl(*topo_level_slab_at(ctx->topo_levels, *list) << 2); packedDate[1] = htonl((*list)->date); hashwrite(f, packedDate, 8); @@ -1123,8 +1124,13 @@ static int write_graph_chunk_generation_data(struct hashfile *f, int i; for (i = 0; i < ctx->commits.nr; i++) { struct commit *c = ctx->commits.list[i]; + timestamp_t offset = commit_graph_data_at(c)->generation - c->date; display_progress(ctx->progress, ++ctx->progress_cnt); - hashwrite_be32(f, commit_graph_data_at(c)->generation); + + if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) + offset = GENERATION_NUMBER_V2_OFFSET_MAX; + + hashwrite_be32(f, offset); } return 0; @@ -1360,11 +1366,11 @@ static void compute_generation_numbers(struct 
write_commit_graph_context *ctx) _("Computing commit graph generation numbers"), ctx->commits.nr); for (i = 0; i < ctx->commits.nr; i++) { - uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation; + uint32_t topo_level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]); display_progress(ctx->progress, i + 1); - if (generation != GENERATION_NUMBER_V1_INFINITY && - generation != GENERATION_NUMBER_ZERO) + if (topo_level != GENERATION_NUMBER_V1_INFINITY && + topo_level != GENERATION_NUMBER_ZERO) continue; commit_list_insert(ctx->commits.list[i], &list); @@ -1372,29 +1378,38 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) struct commit *current = list->item; struct commit_list *parent; int all_parents_computed = 1; - uint32_t max_generation = 0; + uint32_t max_level = 0; + timestamp_t max_corrected_commit_date = current->date - 1; for (parent = current->parents; parent; parent = parent->next) { - generation = commit_graph_data_at(parent->item)->generation; + topo_level = *topo_level_slab_at(ctx->topo_levels, parent->item); - if (generation == GENERATION_NUMBER_V1_INFINITY || - generation == GENERATION_NUMBER_ZERO) { + if (topo_level == GENERATION_NUMBER_V1_INFINITY || + topo_level == GENERATION_NUMBER_ZERO) { all_parents_computed = 0; commit_list_insert(parent->item, &list); break; - } else if (generation > max_generation) { - max_generation = generation; + } else { + struct commit_graph_data *data = commit_graph_data_at(parent->item); + + if (topo_level > max_level) + max_level = topo_level; + + if (data->generation > max_corrected_commit_date) + max_corrected_commit_date = data->generation; } } if (all_parents_computed) { struct commit_graph_data *data = commit_graph_data_at(current); - data->generation = max_generation + 1; - pop_commit(&list); + if (max_level > GENERATION_NUMBER_MAX - 1) + max_level = GENERATION_NUMBER_MAX - 1; + + *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1; + 
data->generation = max_corrected_commit_date + 1; - if (data->generation > GENERATION_NUMBER_MAX) - data->generation = GENERATION_NUMBER_MAX; + pop_commit(&list); } } } @@ -2132,6 +2147,7 @@ int write_commit_graph(struct object_directory *odb, uint32_t i, count_distinct = 0; int res = 0; int replace = 0; + struct topo_level_slab topo_levels; if (!commit_graph_compatible(the_repository)) return 0; @@ -2146,6 +2162,9 @@ int write_commit_graph(struct object_directory *odb, ctx->total_bloom_filter_data_size = 0; ctx->write_generation_data = !git_env_bool(GIT_TEST_COMMIT_GRAPH_NO_GDAT, 0); + init_topo_level_slab(&topo_levels); + ctx->topo_levels = &topo_levels; + if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) ctx->changed_paths = 1; if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) { @@ -2387,8 +2406,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; - timestamp_t max_generation = 0; - timestamp_t generation; + timestamp_t max_parent_corrected_commit_date = 0; + timestamp_t corrected_commit_date; display_progress(progress, i + 1); hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -2427,9 +2446,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) oid_to_hex(&graph_parents->item->object.oid), oid_to_hex(&odb_parents->item->object.oid)); - generation = commit_graph_generation(graph_parents->item); - if (generation > max_generation) - max_generation = generation; + corrected_commit_date = commit_graph_generation(graph_parents->item); + if (corrected_commit_date > max_parent_corrected_commit_date) + max_parent_corrected_commit_date = corrected_commit_date; graph_parents = graph_parents->next; odb_parents = odb_parents->next; @@ -2451,20 +2470,12 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) if (generation_zero == 
GENERATION_ZERO_EXISTS) continue; - /* - * If one of our parents has generation GENERATION_NUMBER_MAX, then - * our generation is also GENERATION_NUMBER_MAX. Decrement to avoid - * extra logic in the following condition. - */ - if (max_generation == GENERATION_NUMBER_MAX) - max_generation--; - - generation = commit_graph_generation(graph_commit); - if (generation != max_generation + 1) - graph_report(_("commit-graph generation for commit %s is %u != %u"), + corrected_commit_date = commit_graph_generation(graph_commit); + if (corrected_commit_date < max_parent_corrected_commit_date + 1) + graph_report(_("commit-graph generation for commit %s is %"PRItime" < %"PRItime), oid_to_hex(&cur_oid), - generation, - max_generation + 1); + corrected_commit_date, + max_parent_corrected_commit_date + 1); if (graph_commit->date != odb_commit->date) graph_report(_("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime), diff --git a/commit.h b/commit.h index bc0732a4fe..bb846e0025 100644 --- a/commit.h +++ b/commit.h @@ -15,6 +15,7 @@ #define GENERATION_NUMBER_V1_INFINITY 0xFFFFFFFF #define GENERATION_NUMBER_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 +#define GENERATION_NUMBER_V2_OFFSET_MAX 0xFFFFFFFF struct commit_list { struct commit *item; From patchwork Sun Aug 9 02:53:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706483 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B876D14B7 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A142A206D8 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ORClujzL" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726380AbgHICyB (ORCPT ); Sat, 8 Aug 2020 22:54:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726293AbgHICx7 (ORCPT ); Sat, 8 Aug 2020 22:53:59 -0400 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB42EC061756 for ; Sat, 8 Aug 2020 19:53:58 -0700 (PDT) Received: by mail-wr1-x441.google.com with SMTP id y3so5048583wrl.4 for ; Sat, 08 Aug 2020 19:53:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=/kisH7idb9PejmUQk6u8VOA3OdZBoJWM11ilrahD6zw=; b=ORClujzL9ZJaZ4jmESy66WPbnL6ZZjuk/ABZm/R75V++1iL8wuM9ms6cE16bU4CT2z fN+ACH5451zk6f3Pr9gqFI4VD9VSznphoC46hAkkBzS44u09wMzuA9fivATs90VUcMLE LuSBd2gJqSVG9+kiorkVCmJxI/yUeqv+TwUv7YVv5+YjFYGJwmZwDrNWlzR0RsUrahVw a3NxImzcFFDnXlEzS5+z7/9sKHusekpfRcunySFgebcHeDajio0ysT++jxrDfCpb7l9u 2GUgOVHomtSeKHjDM50thEfoAxzSEsA99ub8GKWU3k18eZlg2yOcfb/uuCtmXkM9hX+1 pO1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=/kisH7idb9PejmUQk6u8VOA3OdZBoJWM11ilrahD6zw=; b=gT0eKq9dbMdB0XPxyxsq+6SohonyrF8LkBPias96GadICN58//xtjBK+4/gdKwQOoy Qz4AcqY3gcDsTVm4zlwPOPJSRzOq89y/+Tmcd92wlQXpHA9CHQxtwN+OXMu3K+YZlf1C 6LzNOFCr1TZy+HNjL140SY2xMJ8E18nfvBlz3yMziIlTjXAOT7BQhz9NLk04oD05I9KI M9ZxEAobafvjxXSYRelufY6xV+zfjG7XnQ8WGXjEfDXp1xE1W7VTr6wQIbfqXnBbAVin OjOQoDUbkwMZ0930+LPFmNwlQaElGEPJsTbvHGqmn7OCM+/iB54woCqV/vYMs9TQ65M3 9g9g== X-Gm-Message-State: AOAM533OU2GnAsBbR40wCInxF/obd6FD5wU3bld1uL4g19gfTQWKIqV/ wxwD4jnrAwCmB8GmIwRB6mIVKXEZ X-Google-Smtp-Source: 
ABdhPJxaZtQMdf4/yGr3Q+lodCvEEBpBwc6s8Q855yr+rAgpzc1VHLNgpuyfIbc9XIkEO0pKkGLRUQ== X-Received: by 2002:adf:f74f:: with SMTP id z15mr5561330wrp.365.1596941637241; Sat, 08 Aug 2020 19:53:57 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y84sm16249868wmg.38.2020.08.08.19.53.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:56 -0700 (PDT) Message-Id: <833779ad53eb4f57ae514f4e8964e397845f1ddd.1596941625.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:42 +0000 Subject: [PATCH v2 08/10] commit-graph: handle mixed generation commit chains Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar As corrected commit dates and topological levels cannot be compared directly, we must handle commit graph chains with mixed generation number definitions. While reading a commit graph file, we disable generation numbers if the chain contains mixed generation numbers. While writing to a commit graph chain, we write the generation data chunk only if the previous tip of the chain had a generation data chunk. Using `--split=replace` overwrites the existing chain and writes the generation data chunk regardless of the previous tip. In t5324-split-commit-graph, we set up a repo with twelve commits and write a base commit graph file with no generation data chunk. When we add three commits and write to the chain again, Git does not write the generation data chunk even without setting GIT_TEST_COMMIT_GRAPH_NO_GDAT=1. Then, as we replace the existing chain, Git writes a commit graph file with a generation data chunk.
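As a rough illustration of the read-side rule described above (and not the code in this patch), the check can be sketched as a walk over adjacent files in the chain. `struct layer`, `has_gdat` and `chain_is_mixed()` are hypothetical names used only for this sketch; they stand in for `struct commit_graph`, `chunk_generation_data` and the loop added to generation_numbers_enabled().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one file in a commit-graph chain. */
struct layer {
	int has_gdat;        /* non-zero if this file has a generation data chunk */
	struct layer *base;  /* next older file, or NULL for the base file */
};

/*
 * Return 1 if any two adjacent files in the chain disagree on whether
 * they carry a generation data chunk, 0 otherwise. An empty chain or a
 * chain with a single file is never mixed.
 */
int chain_is_mixed(const struct layer *tip)
{
	const struct layer *g;

	for (g = tip; g && g->base; g = g->base)
		if (!g->has_gdat != !g->base->has_gdat)
			return 1;
	return 0;
}
```

Under these assumptions, a chain whose tip has the chunk while an older file does not is reported as mixed, which is the case where generation numbers would be disabled; a chain in which no file has the chunk is not.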
Signed-off-by: Abhishek Kumar --- commit-graph.c | 14 ++++++++ t/t5324-split-commit-graph.sh | 66 +++++++++++++++++++++++++++++++++++ 2 files changed, 80 insertions(+) diff --git a/commit-graph.c b/commit-graph.c index d0f977852b..c6b6111adf 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -674,6 +674,14 @@ int generation_numbers_enabled(struct repository *r) if (!g->num_commits) return 0; + /* We cannot compare topological levels and corrected commit dates */ + while (g->base_graph) { + warning(_("commit-graph-chain contains mixed generation versions")); + if ((g->chunk_generation_data == NULL) ^ (g->base_graph->chunk_generation_data == NULL)) + return 0; + g = g->base_graph; + } + first_generation = get_be32(g->chunk_commit_data + g->hash_len + 8) >> 2; @@ -2186,6 +2194,9 @@ int write_commit_graph(struct object_directory *odb, g = ctx->r->objects->commit_graph; + if (g && !g->chunk_generation_data) + ctx->write_generation_data = 0; + while (g) { ctx->num_commit_graphs_before++; g = g->base_graph; @@ -2204,6 +2215,9 @@ int write_commit_graph(struct object_directory *odb, if (ctx->split_opts) replace = ctx->split_opts->flags & COMMIT_GRAPH_SPLIT_REPLACE; + + if (replace) + ctx->write_generation_data = 1; } ctx->approx_nr_objects = approximate_object_count(); diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 6b25c3d9ce..1a9be5e656 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -425,4 +425,70 @@ done <<\EOF 0600 -r-------- EOF +test_expect_success 'setup repo for mixed generation commit-graph-chain' ' + mkdir mixed && + graphdir=".git/objects/info/commit-graphs" && + cd "$TRASH_DIRECTORY/mixed" && + git init && + git config core.commitGraph true && + git config gc.writeCommitGraph false && + for i in $(test_seq 3) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git reset --hard commits/1 && + for i in $(test_seq 4 5) + do + test_commit $i && + git branch commits/$i || return 
1 + done && + git reset --hard commits/2 && + for i in $(test_seq 6 10) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git reset --hard commits/2 && + git merge commits/4 && + git branch merge/1 && + git reset --hard commits/4 && + git merge commits/6 && + git branch merge/2 && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable --split && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 1 3 0 + num_commits: 12 + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output +' + +test_expect_success 'does not write generation data chunk if not present on existing tip' ' + cd "$TRASH_DIRECTORY/mixed" && + git reset --hard commits/3 && + git merge merge/1 && + git merge commits/5 && + git merge merge/2 && + git branch merge/3 && + git commit-graph write --reachable --split && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 1 4 1 + num_commits: 3 + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output +' + +test_expect_success 'writes generation data chunk when commit-graph chain is replaced' ' + cd "$TRASH_DIRECTORY/mixed" && + git commit-graph write --reachable --split='replace' && + test_path_is_file $graphdir/commit-graph-chain && + test_line_count = 1 $graphdir/commit-graph-chain && + verify_chain_files_exist $graphdir && + graph_read_expect 15 +' + test_done From patchwork Sun Aug 9 02:53:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706485 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD6DB17C5 for ; Sun, 9 Aug 2020 02:54:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD067206D8 for ; Sun, 9 Aug 2020 02:54:05 
+0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="agQQ9fnI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726386AbgHICyC (ORCPT ); Sat, 8 Aug 2020 22:54:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725988AbgHICyA (ORCPT ); Sat, 8 Aug 2020 22:54:00 -0400 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A556BC061A27 for ; Sat, 8 Aug 2020 19:53:59 -0700 (PDT) Received: by mail-wr1-x442.google.com with SMTP id p20so5059196wrf.0 for ; Sat, 08 Aug 2020 19:53:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=7h7jK5uvxqsvZDY4WJcTkp4cmOBz3r+zPD48AjNXcAE=; b=agQQ9fnIOBMd9zNK5SNqE32LV7XZ4wHeg59JVD/Pai74XM2wubx77tjXh1/QPlnqoq nC7T0vz9GYpDmaNNmSH4cWT0YUg+cJvMDPZu10Dm+HKDo709o6+BDoYHAwm5Ez8AMy5r majZRW5YMpVlIiSul4DkLBOBoC1WdQKgBiQFX8+buD6RY7HJ7bA6ZViHmTBTM7RzNMLM 79puf5ndhdQJEuGtzRFMzsfyM3h5LSzcD0SAUhBX6kUOdXQ5p9y1zYr/vlzaJaNCbtKr Wla+dLCZ9sXvV4ycINjmfjLreizTyAZSOfvZyVvCxIxFzsfJe0ql41iNuaITbxMx36uu iq9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=7h7jK5uvxqsvZDY4WJcTkp4cmOBz3r+zPD48AjNXcAE=; b=uZXPdnepusbbazNIFqHpVP2FnXONAu4Px71dHqsvYX6v+R1d1USz/MBmH+6EFQPget 83+B8CS4/gcNPWVG9Hr8kFkYspPITJ6z9hTL4uLvjdPKcPQbS/ECHOcOIqnz6vBTkEvL s/6bdpjNpPttLMUB22yJH+E5YR1SylnF8zoOSOuFk2U7PF7IGz64pjSCAp3aP0oVBTgL PBED6I9Gky/PW1iVLn2y6JpviBAaJ5qkkXlDwOl0dpx2uu8ZjWtUfuKYDUhEvl4DOUu1 AJpNH3JbbEQN2M5CzMNshcgwwcnc1nY7ffB6retUFGMpUgY6RtFDf/3HAztjvHev6oXA 0GhQ== 
X-Gm-Message-State: AOAM532gBwSo5+DPBOQWzqw3X0HyEyUGK9C5nq6NLLTAYaYFRvWNrc55 4BtQQ1S88QfXjwFMC5w1j56qWExJ X-Google-Smtp-Source: ABdhPJxVroM9LpJmBaUPB8QvkZkRscXkXMc/T8TDCYACHrQ93iEjPi1FoogBHX/MXzVUT9gGPN4ymg== X-Received: by 2002:adf:9ec5:: with SMTP id b5mr17445120wrf.190.1596941638312; Sat, 08 Aug 2020 19:53:58 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h14sm15278284wml.30.2020.08.08.19.53.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:57 -0700 (PDT) Message-Id: <58a2d5da0105e6572305b07d4e39ef6be9ee0044.1596941625.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:43 +0000 Subject: [PATCH v2 09/10] commit-reach: use corrected commit dates in paint_down_to_common() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar With corrected commit dates implemented, we no longer have to rely on commit date as a heuristic in paint_down_to_common(). t6024-recursive-merge sets up a unique repository where all commits have the same committer date and there is no well-defined merge base. As this has already caused problems (as noted in 859fdc0 (commit-graph: define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable the commit graph within the test script.
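To make the role of corrected commit dates concrete, here is a minimal sketch of how they are derived, following the rule implemented earlier in this series: a commit's corrected date is the larger of its own committer date and one more than the largest corrected date among its parents. `struct toy_commit` and `corrected_date()` are hypothetical names for this illustration only; the real computation is iterative and lives in compute_generation_numbers().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature commit: a committer date and up to two parents. */
struct toy_commit {
	uint64_t date;
	const struct toy_commit *parents[2];
};

/*
 * Corrected commit date: the commit's own date, or one more than the
 * largest corrected date among its parents, whichever is larger.
 * Recursive for brevity, which is only reasonable for a tiny history.
 */
uint64_t corrected_date(const struct toy_commit *c)
{
	uint64_t best;
	size_t i;

	assert(c != NULL);
	best = c->date;
	for (i = 0; i < 2 && c->parents[i]; i++) {
		uint64_t from_parent = corrected_date(c->parents[i]) + 1;

		if (from_parent > best)
			best = from_parent;
	}
	return best;
}
```

Even when every commit carries the same committer date, as in the t6024 setup, the value computed this way strictly increases from parent to child, which is the property a priority-queue walk can rely on in place of the raw commit date.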
Signed-off-by: Abhishek Kumar --- commit-graph.c | 14 ++++++++++++++ commit-graph.h | 6 ++++++ commit-reach.c | 2 +- t/t6024-recursive-merge.sh | 4 +++- 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index c6b6111adf..eb78af3dad 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -688,6 +688,20 @@ int generation_numbers_enabled(struct repository *r) return !!first_generation; } +int corrected_commit_dates_enabled(struct repository *r) +{ + struct commit_graph *g; + if (!prepare_commit_graph(r)) + return 0; + + g = r->objects->commit_graph; + + if (!g->num_commits) + return 0; + + return !!g->chunk_generation_data; +} + static void close_commit_graph_one(struct commit_graph *g) { if (!g) diff --git a/commit-graph.h b/commit-graph.h index f89614ecd5..d3a485faa6 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -89,6 +89,12 @@ struct commit_graph *parse_commit_graph(void *graph_map, size_t graph_size); */ int generation_numbers_enabled(struct repository *r); +/* + * Return 1 if and only if the repository has a commit-graph + * file and generation data chunk has been written for the file. 
+ */ +int corrected_commit_dates_enabled(struct repository *r); + enum commit_graph_write_flags { COMMIT_GRAPH_WRITE_APPEND = (1 << 0), COMMIT_GRAPH_WRITE_PROGRESS = (1 << 1), diff --git a/commit-reach.c b/commit-reach.c index 470bc80139..3a1b925274 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -39,7 +39,7 @@ static struct commit_list *paint_down_to_common(struct repository *r, int i; timestamp_t last_gen = GENERATION_NUMBER_INFINITY; - if (!min_generation) + if (!min_generation && !corrected_commit_dates_enabled(r)) queue.compare = compare_commits_by_commit_date; one->object.flags |= PARENT1; diff --git a/t/t6024-recursive-merge.sh b/t/t6024-recursive-merge.sh index 332cfc53fd..d3def66e7d 100755 --- a/t/t6024-recursive-merge.sh +++ b/t/t6024-recursive-merge.sh @@ -15,6 +15,8 @@ GIT_COMMITTER_DATE="2006-12-12 23:28:00 +0100" export GIT_COMMITTER_DATE test_expect_success 'setup tests' ' + GIT_TEST_COMMIT_GRAPH=0 && + export GIT_TEST_COMMIT_GRAPH && echo 1 >a1 && git add a1 && GIT_AUTHOR_DATE="2006-12-12 23:00:00" git commit -m 1 a1 && @@ -66,7 +68,7 @@ test_expect_success 'setup tests' ' ' test_expect_success 'combined merge conflicts' ' - test_must_fail env GIT_TEST_COMMIT_GRAPH=0 git merge -m final G + test_must_fail git merge -m final G ' test_expect_success 'result contains a conflict' ' From patchwork Sun Aug 9 02:53:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11706491 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8476A14B7 for ; Sun, 9 Aug 2020 02:54:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6750F206E2 for ; Sun, 9 Aug 2020 02:54:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com 
header.i=@gmail.com header.b="LtbAV0/y" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726401AbgHICyI (ORCPT ); Sat, 8 Aug 2020 22:54:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726370AbgHICyB (ORCPT ); Sat, 8 Aug 2020 22:54:01 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 26AE8C061756 for ; Sat, 8 Aug 2020 19:54:01 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id 9so4778175wmj.5 for ; Sat, 08 Aug 2020 19:54:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ToI4nwdtL3GkFmibdyswuTEHfpGjEVUQmYdZnttMOF0=; b=LtbAV0/yP2xhFfmtECUA7xa5DtvEGeYSrGUtc2xee16iEjiE79uJWy4pgjrmzhRKk/ I7Hxb6/4xvoA8GqyWizql/6K/Ez+8tNHAGGtjVGVH7EodQCgyjR10I3kbD5B2iTx9Fzt 1eiuEYI0ee/Y5gL58idqj6ChGCBwiA0TjFQ1ACjEhCdqzhr5WtS5/oWVNfazmUIZXzTb SD6LZMWStvsHr4O8tl0+c5fHBIZL7CXmf4u/FMEiT9ZrAngaCQY8fP4InJb4qI2QArNJ 7yAVzShXUsXV7FhznvA5e5ygHQovXamCGHjlpRwHccv6jnAcrisWQuEdJYMUoDPJwnYe GJ/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ToI4nwdtL3GkFmibdyswuTEHfpGjEVUQmYdZnttMOF0=; b=A3kVn2M5tblSAAjKOQsXmncKLgGaufK3l5jGTD8Uqy72Zf76CQG2yBIU8H1f55ap1A iscjgFHDgmgpyCpq5VwnIgqDIoIJdtZE0UT6RlplvV873X89PVu+S4BtLDKDMmAhXVt7 5Kvlmf8brdYf4kAT8YOMPwrEXiatE5i3LSaBysDuTzvYhWS51H3c8EzKqn1ZWju/xYl2 4xSq9wmRBWt1E+l0suvzIusTJcpo/odSZN1tOB0NSac1+XsAKSkOBu4xiK7CFy5vRnG7 dUg65AWiFi/RHu36DU2fNKV4Is3otcV0mhyEXadFIpeP5mIw7u8AOayN+l9p9JcoxFYr YM1Q== X-Gm-Message-State: AOAM531HRwFDJifT1IpA0sELTR1v8k4GPlzV+yIH9r6S82nyCzFYLg+7 
UhfzntVwurSuZLV/CTdk5adZzUfe X-Google-Smtp-Source: ABdhPJxwnc6VvLeAYJEbFIldSZ5DodhweTIpis9eRJvnUAGprEQnNkuDk4PaJWesepltQmX4ucLoLA== X-Received: by 2002:a1c:4e0c:: with SMTP id g12mr19146401wmh.136.1596941639622; Sat, 08 Aug 2020 19:53:59 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j5sm16520464wmb.12.2020.08.08.19.53.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Aug 2020 19:53:58 -0700 (PDT) Message-Id: <4c34294602b23f4427b024bbd38e4403a397fc50.1596941625.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sun, 09 Aug 2020 02:53:44 +0000 Subject: [PATCH v2 10/10] doc: add corrected commit date info Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar With generation data chunk and corrected commit dates implemented, let's update the technical documentation for commit-graph. Signed-off-by: Abhishek Kumar --- .../technical/commit-graph-format.txt | 12 ++--- Documentation/technical/commit-graph.txt | 45 ++++++++++++------- 2 files changed, 36 insertions(+), 21 deletions(-) diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt index 440541045d..71c43884ec 100644 --- a/Documentation/technical/commit-graph-format.txt +++ b/Documentation/technical/commit-graph-format.txt @@ -4,11 +4,7 @@ Git commit graph format The Git commit graph stores a list of commit OIDs and some associated metadata, including: -- The generation number of the commit. Commits with no parents have - generation number 1; commits with parents have generation number - one more than the maximum generation number of its parents. 
We - reserve zero as special, and can be used to mark a generation - number invalid or as "not computed". +- The generation number of the commit. - The root tree OID. @@ -88,6 +84,12 @@ CHUNK DATA: 2 bits of the lowest byte, storing the 33rd and 34th bit of the commit time. + Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional] + * This list of 4-byte values stores corrected commit date offsets for the + commits, arranged in the same order as commit data chunk. + * This list can be later modified to store future generation number related + data. + Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional] This list of 4-byte values store the second through nth parents for all octopus merges. The second parent value in the commit data stores diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index 808fa30b99..f27145328c 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -38,14 +38,27 @@ A consumer may load the following info for a commit from the graph: Values 1-4 satisfy the requirements of parse_commit_gently(). -Define the "generation number" of a commit recursively as follows: +There are two definitions of generation number: +1. Corrected committer dates +2. Topological levels + +Define the "corrected committer date" of a commit recursively as follows: + + * A commit with no parents (a root commit) has corrected committer date + equal to its committer date. + + * A commit with at least one parent has corrected committer date equal to + the maximum of its committer date and one more than the largest corrected + committer date among its parents. + +Define the "topological level" of a commit recursively as follows: * A commit with no parents (a root commit) has generation number one. - * A commit with at least one parent has generation number one more than - the largest generation number among its parents. 
+ * A commit with at least one parent has topological level one more than + the largest topological level among its parents. -Equivalently, the generation number of a commit A is one more than the +Equivalently, the topological level of a commit A is one more than the length of a longest path from A to a root commit. The recursive definition is easier to use for computation and observing the following property: @@ -67,17 +80,12 @@ numbers, the general heuristic is the following: If A and B are commits with commit time X and Y, respectively, and X < Y, then A _probably_ cannot reach B. -This heuristic is currently used whenever the computation is allowed to -violate topological relationships due to clock skew (such as "git log" -with default order), but is not used when the topological order is -required (such as merge base calculations, "git log --graph"). - In practice, we expect some commits to be created recently and not stored in the commit graph. We can treat these commits as having "infinite" generation number and walk until reaching commits with known generation number. -We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not +We use the macro GENERATION_NUMBER_INFINITY to mark commits not in the commit-graph file. If a commit-graph file was written by a version of Git that did not compute generation numbers, then those commits will have generation number represented by the macro GENERATION_NUMBER_ZERO = 0. @@ -93,12 +101,11 @@ fully-computed generation numbers. Using strict inequality may result in walking a few extra commits, but the simplicity in dealing with commits with generation number *_INFINITY or *_ZERO is valuable. -We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose -generation numbers are computed to be at least this value. We limit at -this value since it is the largest value that can be stored in the -commit-graph file using the 30 bits available to generation numbers. 
This -presents another case where a commit can have generation number equal to -that of a parent. +We use the macro GENERATION_NUMBER_MAX for commits whose generation numbers +are computed to be at least this value. We limit at this value since it is +the largest value that can be stored in the commit-graph file using the +bits available to generation numbers. This presents another case where a +commit can have generation number equal to that of a parent. Design Details -------------- @@ -267,6 +274,12 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum number of commits) could be extracted into config settings for full flexibility. +We also merge commit-graph chains when we try to write a commit graph with +two different generation number definitions as they cannot be compared directly. +We overwrite the existing chain and create a commit-graph with the newer or more +efficient definition. For example, we overwrite a chain that uses topological +levels to create a chain that uses corrected commit dates. + ## Deleting graph-{hash} files After a new tip file is written, some `graph-{hash}` files may no longer