From patchwork Mon Dec 28 11:15:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69342C43381 for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 38A59229C5 for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727310AbgL1LQz (ORCPT ); Mon, 28 Dec 2020 06:16:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727303AbgL1LQy (ORCPT ); Mon, 28 Dec 2020 06:16:54 -0500 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2C9C0C061796 for ; Mon, 28 Dec 2020 03:16:14 -0800 (PST) Received: by mail-wm1-x334.google.com with SMTP id r4so9528309wmh.5 for ; Mon, 28 Dec 2020 03:16:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=LgKUK8hJbbqtv2edclen7bw3MGngHPyxn35OHpHBvhQ=; b=QU5pBdQbUx0DH+kq7o7s/HPPvAoAERPY+F/mG8EYWC09oZL647Qv98+gwxqslP+EjD 5EW6+Rhwf8/DE8V8aHJWsc15vkzkyEbEDWH9V/QMNFhBOa4tRddybXNZ1zTUVEovE+BA 
R4Ms6ZRBUnYzPfLcutbmwgzuM5b7XV8TEaM1gI8oe7vjT2iEkWoj9KlSAXBYXHpQrfbl SMgkDl47/z12yOWsd/lVIHV3DWdYcIaOu+XcsVr0ZdAna0WgNnWiNDcRS9DqmYvjlrva pLZlIGbuuWISyiDPkehC2YgclO/wAqqzxysiCvvZ6nM262NkxhHIqohqZCrff4j9iWXi ChqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=LgKUK8hJbbqtv2edclen7bw3MGngHPyxn35OHpHBvhQ=; b=Ds8dcYp1y3oDMfyIBzb4WFYptVuStyoYR4BWxoUkzkJxVmIqfJ4k7PwmRl1uscGsEh szsVwKnaq0NGhRZVRB1ygPLGArtaSGr9iThS9jp8jem2DZ4J5W3QwP8EkuEcT+NGgmrQ fgBA5QjraS2uxrFCSj7ouht+R7E7zmCM/vqn5nHclrAhPzegkF/7y8ZpWyuZbjzzCIrN wpQWsSMr7rUYZg081dH1u/3hNuR9X1hNyAGPNqWdkNFApxoVWhTan+aGadVaHwhXRoQA Rihv33nm91LpJE2TX1P4fXtrLqDYc4SWFkiw1g13WkqphSTzskvDq3ytQNrptZUNDyd3 SxlA== X-Gm-Message-State: AOAM532C3sHzoO6e1BzLApSwgZ5UqgGL4uIiYG2RNdIurEcKeUjPQ6xc XhvLdkIqBX8jSl23P+MCK1E7jOzhNWU= X-Google-Smtp-Source: ABdhPJxj0AJrkBUcfLOJRgfBNgmBmRtRVuQk3mjVXq9BW+lBKG0/sxqzuUMjLCMOgnuXrr6JXg3p8A== X-Received: by 2002:a1c:9692:: with SMTP id y140mr20293753wmd.128.1609154172754; Mon, 28 Dec 2020 03:16:12 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z22sm18865407wml.1.2020.12.28.03.16.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:12 -0800 (PST) Message-Id: <7645e0bcef08c8fd726148b3545c0ca3adeefdec.1609154168.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 28 Dec 2020 11:15:59 +0000 Subject: [PATCH v5 02/11] revision: parse parent in indegree_walk_step() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar In indegree_walk_step(), we add unvisited parents to the indegree queue. However, parents are not guaranteed to be parsed. 
As the indegree queue sorts by generation number, let's parse parents
before inserting them to ensure the correct priority order.

Signed-off-by: Abhishek Kumar
---
 revision.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/revision.c b/revision.c
index 9dff845bed6..de8e45f462f 100644
--- a/revision.c
+++ b/revision.c
@@ -3373,6 +3373,9 @@ static void indegree_walk_step(struct rev_info *revs)
 		struct commit *parent = p->item;
 		int *pi = indegree_slab_at(&info->indegree, parent);
 
+		if (repo_parse_commit_gently(revs->repo, parent, 1) < 0)
+			return;
+
 		if (*pi)
 			(*pi)++;
 		else

From patchwork Mon Dec 28 11:16:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56174C433E9 for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1382B229C4 for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727318AbgL1LQ4 (ORCPT ); Mon, 28 Dec 2020 06:16:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727314AbgL1LQz (ORCPT ); Mon, 28 Dec 2020 06:16:55 -0500 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28331C061798 for ; Mon, 28 Dec 2020 03:16:15 -0800 (PST) Received: by mail-wm1-x335.google.com with SMTP id r4so9528352wmh.5 for ; Mon, 28 Dec 2020 03:16:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=XLxhN5a+VA5tg9a5eySYHW11oiSTsE+4vg/aiMZVm54=; b=VihxzZ2UBZ1vhsNKNqIafFJrAyrW1TkX86W5fSMp3fXnUnDcp8Vub0NiN8ZDWCTm+X wyCyVoM8iQw/uBjAgrkjH+1Zv40G/EbdT0/tvrByetHI1ue5zG/o39XPeCYfsVcMOtI2 nLwlOXzGHzeaHLRdok79atCpsg/yjURwjg7QqsoSeJ+4Qwr3pPfTUANX2+KUz4v8wgOY OU+vo/zaN13TNimyLBamV6Ex8F8cB5+AoiAsRKd5Yry9bRhg+bg7eClzDnWoed++hAW/ rsEUCTHXE9zKUsCJuDPKRPMeZlCSreEANU4lLULJsMhMKfKqGTy0dCyKnHhd6qeBt1dP absg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=XLxhN5a+VA5tg9a5eySYHW11oiSTsE+4vg/aiMZVm54=; b=cPhECMEGecNejuLOcey99YvFGF6BBisXD1hc1W1MaQkCNHpRue8ePoEZwv9PIRsJxI ym7kooCqdJytEIJUULf51vqQwIJvzAvgwsN6uqUFyn9v4jo6T4g9/f4CJ01fXg+XcqWo Xc42QwAkzTpr+Ftd4y0rqaHLtsfMkpdFNLjsmnnDndvZDk75zlD2JixlyuQi3WkdyLo3 NNp5mfmsD1jO+A2v+kHFSptNUGwrkH4RUwoQdexC6wPoh4x7FjKWhtyWbzKcoBV+s4nj 8se5u81pinz8dOT1VCWwdpWRE5XhTKbYUiPJaVWIxQ1poONofXngCiYpOrXs9LuwFc7x bHUQ== X-Gm-Message-State: AOAM532F/Is38ECXp4lfAZGxUwMnf1d6ODyCmO61WzCSObZinYEhqZdW 5yEbRm7dnHgWqtjoI2vWrEySRCxUg6A= X-Google-Smtp-Source: ABdhPJwonrzvAe7FPcGv1HRLZ7b2x/6lRB/qG39MBThUv3PHN/+qT+zEl2SazASKrfSi/yT44oR2Wg== X-Received: by 2002:a1c:e445:: with SMTP id b66mr20467891wmh.187.1609154173582; Mon, 28 Dec 2020 03:16:13 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f14sm42023209wre.69.2020.12.28.03.16.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:13 -0800 (PST) Message-Id: In-Reply-To: 
References:
Date: Mon, 28 Dec 2020 11:16:00 +0000
Subject: [PATCH v5 03/11] commit-graph: consolidate fill_commit_graph_info
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Abhishek Kumar

From: Abhishek Kumar

Both fill_commit_graph_info() and fill_commit_in_graph() parse
information present in the commit data chunk. Let's simplify the
implementation by calling fill_commit_graph_info() within
fill_commit_in_graph().

fill_commit_graph_info() used to not load the committer date from the
commit data chunk. However, with the upcoming switch to using corrected
committer date as generation number v2, we will have to load committer
date to compute generation number value anyway.

e51217e15 (t5000: test tar files that overflow ustar headers,
30-06-2016) introduced a test 'generate tar with future mtime' that
creates a commit with committer date of (2^36 + 1) seconds since EPOCH.
The CDAT chunk provides 34-bits for storing committer date, thus
committer time overflows into generation number (within CDAT chunk) and
has undefined behavior.

The test used to pass as fill_commit_graph_info() would not set struct
member `date` of struct commit and load committer date from the object
database, generating a tar file with the expected mtime.

However, with corrected commit date, we will load the committer date
from CDAT chunk (truncated to lower 34-bits) to populate the generation
number. Thus, Git sets date and generates tar file with the truncated
mtime.

The ustar format (the header format used by most modern tar programs)
only has room for 11 (or 12, depending on some implementations) octal
digits for the size and mtime of each file.
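The digit arithmetic above can be checked with a minimal C sketch. It is
not part of this patch: the helper names (`max_octal_value`,
`fits_in_octal_digits`, `fits_in_cdat_date`) are made up for
illustration, and the 34-bit limit is simply taken from the CDAT
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only; these helpers do not exist in Git. */

/* Largest value that fits in `digits` octal digits: 8^digits - 1. */
static uint64_t max_octal_value(unsigned digits)
{
	assert(digits <= 21); /* 3 * 21 = 63 bits, keeps the shift defined */
	return (UINT64_C(1) << (3 * digits)) - 1;
}

/* Does `value` fit in a ustar-style field of `digits` octal digits? */
static int fits_in_octal_digits(uint64_t value, unsigned digits)
{
	return value <= max_octal_value(digits);
}

/* Does `date` fit in the 34 bits CDAT reserves for the committer date? */
static int fits_in_cdat_date(uint64_t date)
{
	return date < (UINT64_C(1) << 34);
}
```

Under these assumptions, 11 octal digits hold values up to 2^33 - 1 and
12 octal digits hold values up to 2^36 - 1, so a date of 2^36 + 1 fits
neither a 12-digit field nor the 34-bit CDAT field.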
As a date that overflows 12 octal digits also overflows the CDAT chunk,
while a date that overflows 11 octal digits need not, we split the
existing tests to test both implementations separately and add a new
explicit test for the 11-digit implementation.

To test the 11-octal digit implementation, we create a future commit
with committer date of 2^34 - 1, which overflows 11-octal digits without
overflowing the 34-bits of the Commit Data chunk.

To test the 12-octal digit implementation, the smallest committer date
possible is 2^36 + 1, which overflows the CDAT chunk and thus
commit-graph must be disabled for the test.

Signed-off-by: Abhishek Kumar
---
 commit-graph.c      | 27 ++++++++++-----------------
 t/t5000-tar-tree.sh | 24 +++++++++++++++++++++---
 2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index caf823295f4..d5b33b4f7ac 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -749,15 +749,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	const unsigned char *commit_data;
 	struct commit_graph_data *graph_data;
 	uint32_t lex_index;
+	uint64_t date_high, date_low;
 
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
+	if (pos >= g->num_commits + g->num_commits_in_base)
+		die(_("invalid commit position. commit-graph is likely corrupt"));
+
 	lex_index = pos - g->num_commits_in_base;
 	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
 
 	graph_data = commit_graph_data_at(item);
 	graph_data->graph_pos = pos;
+
+	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
+	date_low = get_be32(commit_data + g->hash_len + 12);
+	item->date = (timestamp_t)((date_high << 32) | date_low);
+
 	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
 }
 
@@ -772,38 +781,22 @@ static int fill_commit_in_graph(struct repository *r,
 {
 	uint32_t edge_value;
 	uint32_t *parent_data_ptr;
-	uint64_t date_low, date_high;
 	struct commit_list **pptr;
-	struct commit_graph_data *graph_data;
 	const unsigned char *commit_data;
 	uint32_t lex_index;
 
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
-	if (pos >= g->num_commits + g->num_commits_in_base)
-		die(_("invalid commit position. commit-graph is likely corrupt"));
+	fill_commit_graph_info(item, g, pos);
 
-	/*
-	 * Store the "full" position, but then use the
-	 * "local" position for the rest of the calculation.
-	 */
-	graph_data = commit_graph_data_at(item);
-	graph_data->graph_pos = pos;
 	lex_index = pos - g->num_commits_in_base;
-
 	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
 
 	item->object.parsed = 1;
 
 	set_commit_tree(item, NULL);
 
-	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
-	date_low = get_be32(commit_data + g->hash_len + 12);
-	item->date = (timestamp_t)((date_high << 32) | date_low);
-
-	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
-
 	pptr = &item->parents;
 
 	edge_value = get_be32(commit_data + g->hash_len);
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 3ebb0d3b652..7204799a0b5 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -431,15 +431,33 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our huge size' '
 	test_cmp expect actual
 '
 
-test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
+test_expect_success TIME_IS_64BIT 'set up repository with far-future (2^34 - 1) commit' '
+	rm -f .git/index &&
+	echo foo >file &&
+	git add file &&
+	GIT_COMMITTER_DATE="@17179869183 +0000" \
+		git commit -m "tempori parendum"
+'
+
+test_expect_success TIME_IS_64BIT 'generate tar with far-future mtime' '
+	git archive HEAD >future.tar
+'
+
+test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
+	echo 2514 >expect &&
+	tar_info future.tar | cut -d" " -f2 >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success TIME_IS_64BIT 'set up repository with far-far-future (2^36 + 1) commit' '
 	rm -f .git/index &&
 	echo content >file &&
 	git add file &&
-	GIT_COMMITTER_DATE="@68719476737 +0000" \
+	GIT_TEST_COMMIT_GRAPH=0 GIT_COMMITTER_DATE="@68719476737 +0000" \
 		git commit -m "tempori parendum"
 '
 
-test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
+test_expect_success TIME_IS_64BIT 'generate tar with far-far-future mtime' '
 	git archive HEAD >future.tar
 '
 

From patchwork Mon Dec 28 11:16:01 2020 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7935C4332B for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6350C22583 for ; Mon, 28 Dec 2020 11:17:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727323AbgL1LRA (ORCPT ); Mon, 28 Dec 2020 06:17:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727314AbgL1LQ4 (ORCPT ); Mon, 28 Dec 2020 06:16:56 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 523D4C061799 for ; Mon, 28 Dec 2020 03:16:16 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id d13so10983331wrc.13 for ; Mon, 28 Dec 2020 03:16:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=cEe95guw9s9mzUDrRNnTP/221TBTHwdatVA+Owv7Nr0=; b=qn7fAp3Tt7H3AW/2+PbNFLsSxpZbW63yNzfYmofm51EGfLj6uCKYdbFUK0O/qpww9T DKotAlDmcpGjTs69hSYAK8KQSxHjOeEe8GLKiV+uJi9diU8sRRrFYgepByqGAl5f/J0q 
AEi6xYG8vtMrVDWWX32lEi8C34z7PREavIjJfhYVMNuqG/LgS+CHXtdWd0TI5yurZOgL AuRUUeMc7UV2o4/PSvEwO7r11LWPaW8vvnA1j4r+slo80ux4ZD8rEzRaOFCXgFYfHrqG LU1/h95a6DHwM3rL49RNiV7BGE8HlKqQ/+wLur/fJObVdMkl7ThtKHE2+kena56DGOyo oN1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=cEe95guw9s9mzUDrRNnTP/221TBTHwdatVA+Owv7Nr0=; b=HU/ryf3vR0/JXEBol8MqtvDBFq45+/IIHk9DXmeUuPfaBG/zI99jxHRrFXYKb7PPRi lFbhFeiP2knGbja8pFi/0nY9SlShOhjYD9jbzkW9ycdLoF9GpDF5WydWmmhrhRPxRdU6 E/jL2jQc2tUYbD2SVifwA0sjJG7TzHCSU/3r5Z3SiXzjKArhbbIdoZUk4UhTTPS9xWo9 Gs1k+lNM8nJS21v0u8ebQkeQaoLYEUwWOTGuvMBtOwF2Lm1Wp4yJi1/RkhiFGBjOzwgh A4N62loUD5Jo3lBDjZO13o3WNhIIEGebxpmmVW6qOxZht7tqfAOib5j5hTiBzQTPxRS9 1QHw== X-Gm-Message-State: AOAM530b+AjM0ZugiFMiicivU9E7n6uUlspJ5Gxhf7hiDx6+RDwkzD9P nmKJqrqE7vy2Eld4mFrZPpGcUAtqwbQ= X-Google-Smtp-Source: ABdhPJzw6aeFzfxk3Vuw21185Zp1A1nErXfPbx4bu82lykxis07PkD7DY/IkkMh7+pv+8paIpMkwXw== X-Received: by 2002:adf:9567:: with SMTP id 94mr50953440wrs.394.1609154174815; Mon, 28 Dec 2020 03:16:14 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k18sm61175811wrd.45.2020.12.28.03.16.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:14 -0800 (PST) Message-Id: <591935075f1dce264b24c91715c81ce1ff15fd47.1609154168.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:01 +0000 Subject: [PATCH v5 04/11] t6600-test-reach: generalize *_three_modes Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar In a preparatory step to implement generation number v2, we add tests to ensure Git can read and parse commit-graph files 
without the Generation Data chunk. These files represent commit-graph
files written by Old Git and are necessary for backward compatibility.

We extend run_three_modes() and test_three_modes() to *_all_modes() with
the fourth mode being "commit-graph without generation data chunk".

Signed-off-by: Abhishek Kumar
---
 t/t6600-test-reach.sh | 62 +++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index f807276337d..af10f0dc090 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -58,7 +58,7 @@ test_expect_success 'setup' '
 	git config core.commitGraph true
 '
 
-run_three_modes () {
+run_all_modes () {
 	test_when_finished rm -rf .git/objects/info/commit-graph &&
 	"$@" <input >actual &&
 	test_cmp expect actual &&
@@ -70,8 +70,8 @@ run_three_modes () {
 	test_cmp expect actual
 }
 
-test_three_modes () {
-	run_three_modes test-tool reach "$@"
+test_all_modes () {
+	run_all_modes test-tool reach "$@"
 }
 
 test_expect_success 'ref_newer:miss' '
@@ -80,7 +80,7 @@ test_expect_success 'ref_newer:miss' '
 	B:commit-4-9
 	EOF
 	echo "ref_newer(A,B):0" >expect &&
-	test_three_modes ref_newer
+	test_all_modes ref_newer
 '
 
 test_expect_success 'ref_newer:hit' '
@@ -89,7 +89,7 @@ test_expect_success 'ref_newer:hit' '
 	B:commit-2-3
 	EOF
 	echo "ref_newer(A,B):1" >expect &&
-	test_three_modes ref_newer
+	test_all_modes ref_newer
 '
 
 test_expect_success 'in_merge_bases:hit' '
@@ -98,7 +98,7 @@ test_expect_success 'in_merge_bases:hit' '
 	B:commit-8-8
 	EOF
 	echo "in_merge_bases(A,B):1" >expect &&
-	test_three_modes in_merge_bases
+	test_all_modes in_merge_bases
 '
 
 test_expect_success 'in_merge_bases:miss' '
@@ -107,7 +107,7 @@ test_expect_success 'in_merge_bases:miss' '
 	B:commit-5-9
 	EOF
 	echo "in_merge_bases(A,B):0" >expect &&
-	test_three_modes in_merge_bases
+	test_all_modes in_merge_bases
 '
 
 test_expect_success 'in_merge_bases_many:hit' '
@@ -117,7 +117,7 @@ test_expect_success 'in_merge_bases_many:hit' '
 	X:commit-5-7
 	EOF
 	echo "in_merge_bases_many(A,X):1" >expect &&
-	test_three_modes in_merge_bases_many
+	test_all_modes in_merge_bases_many
 '
 
 test_expect_success 'in_merge_bases_many:miss' '
@@ -127,7 +127,7 @@ test_expect_success 'in_merge_bases_many:miss' '
 	X:commit-8-6
 	EOF
 	echo "in_merge_bases_many(A,X):0" >expect &&
-	test_three_modes in_merge_bases_many
+	test_all_modes in_merge_bases_many
 '
 
 test_expect_success 'in_merge_bases_many:miss-heuristic' '
@@ -137,7 +137,7 @@ test_expect_success 'in_merge_bases_many:miss-heuristic' '
 	X:commit-6-6
 	EOF
 	echo "in_merge_bases_many(A,X):0" >expect &&
-	test_three_modes in_merge_bases_many
+	test_all_modes in_merge_bases_many
 '
 
 test_expect_success 'is_descendant_of:hit' '
@@ -148,7 +148,7 @@ test_expect_success 'is_descendant_of:hit' '
 	X:commit-1-1
 	EOF
 	echo "is_descendant_of(A,X):1" >expect &&
-	test_three_modes is_descendant_of
+	test_all_modes is_descendant_of
 '
 
 test_expect_success 'is_descendant_of:miss' '
@@ -159,7 +159,7 @@ test_expect_success 'is_descendant_of:miss' '
 	X:commit-7-6
 	EOF
 	echo "is_descendant_of(A,X):0" >expect &&
-	test_three_modes is_descendant_of
+	test_all_modes is_descendant_of
 '
 
 test_expect_success 'get_merge_bases_many' '
@@ -174,7 +174,7 @@ test_expect_success 'get_merge_bases_many' '
 		git rev-parse commit-5-6 \
 			      commit-4-7 | sort
 	} >expect &&
-	test_three_modes get_merge_bases_many
+	test_all_modes get_merge_bases_many
 '
 
 test_expect_success 'reduce_heads' '
@@ -196,7 +196,7 @@ test_expect_success 'reduce_heads' '
 			      commit-2-8 \
 			      commit-1-10 | sort
 	} >expect &&
-	test_three_modes reduce_heads
+	test_all_modes reduce_heads
 '
 
 test_expect_success 'can_all_from_reach:hit' '
@@ -219,7 +219,7 @@ test_expect_success 'can_all_from_reach:hit' '
 	Y:commit-8-1
 	EOF
 	echo "can_all_from_reach(X,Y):1" >expect &&
-	test_three_modes can_all_from_reach
+	test_all_modes can_all_from_reach
 '
 
 test_expect_success 'can_all_from_reach:miss' '
@@ -241,7 +241,7 @@ test_expect_success 'can_all_from_reach:miss' '
 	Y:commit-8-5
 	EOF
 	echo "can_all_from_reach(X,Y):0" >expect &&
-	test_three_modes can_all_from_reach
+	test_all_modes can_all_from_reach
 '
 
 test_expect_success 'can_all_from_reach_with_flag: tags case' '
@@ -264,7 +264,7 @@ test_expect_success 'can_all_from_reach_with_flag: tags case' '
 	Y:commit-8-1
 	EOF
 	echo "can_all_from_reach_with_flag(X,_,_,0,0):1" >expect &&
-	test_three_modes can_all_from_reach_with_flag
+	test_all_modes can_all_from_reach_with_flag
 '
 
 test_expect_success 'commit_contains:hit' '
@@ -280,8 +280,8 @@ test_expect_success 'commit_contains:hit' '
 	X:commit-9-3
 	EOF
 	echo "commit_contains(_,A,X,_):1" >expect &&
-	test_three_modes commit_contains &&
-	test_three_modes commit_contains --tag
+	test_all_modes commit_contains &&
+	test_all_modes commit_contains --tag
 '
 
 test_expect_success 'commit_contains:miss' '
@@ -297,8 +297,8 @@ test_expect_success 'commit_contains:miss' '
 	X:commit-9-3
 	EOF
 	echo "commit_contains(_,A,X,_):0" >expect &&
-	test_three_modes commit_contains &&
-	test_three_modes commit_contains --tag
+	test_all_modes commit_contains &&
+	test_all_modes commit_contains --tag
 '
 
 test_expect_success 'rev-list: basic topo-order' '
@@ -310,7 +310,7 @@ test_expect_success 'rev-list: basic topo-order' '
 		commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \
 		commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \
 	>expect &&
-	run_three_modes git rev-list --topo-order commit-6-6
+	run_all_modes git rev-list --topo-order commit-6-6
 '
 
 test_expect_success 'rev-list: first-parent topo-order' '
@@ -322,7 +322,7 @@ test_expect_success 'rev-list: first-parent topo-order' '
 		commit-6-2 \
 		commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \
 	>expect &&
-	run_three_modes git rev-list --first-parent --topo-order commit-6-6
+	run_all_modes git rev-list --first-parent --topo-order commit-6-6
 '
 
 test_expect_success 'rev-list: range topo-order' '
@@ -334,7 +334,7 @@ test_expect_success 'rev-list: range topo-order' '
 		commit-6-2 commit-5-2 commit-4-2 \
 		commit-6-1 commit-5-1 commit-4-1 \
 	>expect &&
-	run_three_modes git rev-list --topo-order commit-3-3..commit-6-6
+	run_all_modes git rev-list --topo-order commit-3-3..commit-6-6
 '
 
 test_expect_success 'rev-list: range topo-order' '
@@ -346,7 +346,7 @@ test_expect_success 'rev-list: range topo-order' '
 		commit-6-2 commit-5-2 commit-4-2 \
 		commit-6-1 commit-5-1 commit-4-1 \
 	>expect &&
-	run_three_modes git rev-list --topo-order commit-3-8..commit-6-6
+	run_all_modes git rev-list --topo-order commit-3-8..commit-6-6
 '
 
 test_expect_success 'rev-list: first-parent range topo-order' '
@@ -358,7 +358,7 @@ test_expect_success 'rev-list: first-parent range topo-order' '
 		commit-6-2 \
 		commit-6-1 commit-5-1 commit-4-1 \
 	>expect &&
-	run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6
+	run_all_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6
 '
 
 test_expect_success 'rev-list: ancestry-path topo-order' '
@@ -368,7 +368,7 @@ test_expect_success 'rev-list: ancestry-path topo-order' '
 		commit-6-4 commit-5-4 commit-4-4 commit-3-4 \
 		commit-6-3 commit-5-3 commit-4-3 \
 	>expect &&
-	run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6
+	run_all_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6
 '
 
 test_expect_success 'rev-list: symmetric difference topo-order' '
@@ -382,7 +382,7 @@ test_expect_success 'rev-list: symmetric difference topo-order' '
 		commit-3-8 commit-2-8 commit-1-8 \
 		commit-3-7 commit-2-7 commit-1-7 \
 	>expect &&
-	run_three_modes git rev-list --topo-order commit-3-8...commit-6-6
+	run_all_modes git rev-list --topo-order commit-3-8...commit-6-6
 '
 
 test_expect_success 'get_reachable_subset:all' '
@@ -402,7 +402,7 @@ test_expect_success 'get_reachable_subset:all' '
 			      commit-1-7 \
 			      commit-5-6 | sort
 	) >expect &&
-	test_three_modes get_reachable_subset
+	test_all_modes get_reachable_subset
 '
 
 test_expect_success 'get_reachable_subset:some' '
@@ -420,7 +420,7 @@ test_expect_success 'get_reachable_subset:some' '
 		git rev-parse commit-3-3 \
 			      commit-1-7 | sort
 	) >expect &&
-	test_three_modes get_reachable_subset
+	test_all_modes get_reachable_subset
 '
 
 test_expect_success 'get_reachable_subset:none' '
@@ -434,7 +434,7 @@ test_expect_success 'get_reachable_subset:none' '
 	Y:commit-2-8
 	EOF
 	echo "get_reachable_subset(X,Y)" >expect &&
-	test_three_modes get_reachable_subset
+	test_all_modes get_reachable_subset
 '
 
 test_done

From patchwork Mon Dec 28 11:16:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991085 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BBAFC433E0 for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D6FE9229C4 for ; Mon, 28 Dec 2020 11:17:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727335AbgL1LRf (ORCPT ); Mon, 28 Dec 2020 06:17:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727314AbgL1LRe (ORCPT ); Mon, 28 Dec 2020 06:17:34 -0500 Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99B0BC06179A for ; Mon, 28 Dec 2020 03:16:17 -0800 (PST) Received: by mail-wr1-x42a.google.com with SMTP id 
q18so11061727wrn.1 for ; Mon, 28 Dec 2020 03:16:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=rjn+0mwSTH6YTjRp31OOFmg0eS5Ry84Z1I2Z2bLweJU=; b=phhfmkcmWwlXBJhmSw7ZVyHwM4cU77ejfCV3uv/EE22nEd4cnQRay2at/5OU4wkChz AjHHJXFWzPitFQq+0c6P1i3HgijAVHBfOQxCqQZbpP9cAXQm8bnhHcMc7DjkmpdABU/K 8+EGoSEIOiHiAPEJ61fVAGRZdu3WoPV2ErW78Ab6vwgZi8vAVI+wydgaXxQTeiOErll3 WsGUacu9gH4lrsPozwcp13DnSMHJ9qMj+/qtltUOb4TN+EjRYRBFwh7KwjybeJx2UF/5 wT42i6HsAa/W4BZgqJ7/zjSbUwkhkRGYJgWEyLsZJYSywKMuYwN8fJaPvzwyAhaMY+JY pG/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=rjn+0mwSTH6YTjRp31OOFmg0eS5Ry84Z1I2Z2bLweJU=; b=VnZ2GAOQzgUk2fSoZJCGJiYxDb/toImcKiv8XOXdH5W8ur3QdiHYsOOMeUND6hcgVz NSG0gxo7Bl5VDFgQ4Duh0WPJUre/NCbvBv41lyxYVN7sO8H0elibYU0iaSJNCq6ahGGg q+/pFAmNOcIrJOHHAiWzyE/CrnmTECF+n6bGKjr7CCn86gSfl2QGs6pjsQ84K3hskQOh HfhtZ9zkZJ7z8o5t55NBoKMHHYHsKyAfrPoMKRxmm8SOTdDHsyFKCh0bv2ULud6myHb8 koacRGbZnHCJGG6nhmjs9a1CBi+ywA+M3Sl7MWbql3Iy+edQIx5RKmZumzZCHldS3Vui nnMg== X-Gm-Message-State: AOAM5325tYdsoLEdhVYzkysM7vJFN+0HXheU+4tLH8CTjWJrSp0WbPXt MCnWlfDFEsdSjWb9OltnUzTcYRe/oGg= X-Google-Smtp-Source: ABdhPJyS/kvzBjlgB41Y1ioidjLC4S/4ETFJkpO8XEoiF3HKpHmbOPJAKvYs+D+LqV58IOVjiBUvVw== X-Received: by 2002:a5d:4ccf:: with SMTP id c15mr51274334wrt.237.1609154176125; Mon, 28 Dec 2020 03:16:16 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h4sm53377836wrt.65.2020.12.28.03.16.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:15 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:02 +0000 Subject: [PATCH v5 05/11] commit-graph: add a slab to store topological levels Fcc: Sent MIME-Version: 1.0 To: 
git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Abhishek Kumar

From: Abhishek Kumar

In a later commit we will introduce corrected commit date as the
generation number v2. Corrected commit dates will be stored in the new
separate Generation Data chunk. However, to ensure backwards
compatibility with "Old" Git we need to continue to write generation
number v1 (topological levels) to the commit data chunk. Thus, we need
to compute and store both versions of generation numbers to write the
commit-graph file.

Therefore, let's introduce a commit-slab `topo_level_slab` to store
topological levels; corrected commit date will be stored in the member
`generation` of struct commit_graph_data.

The macros `GENERATION_NUMBER_INFINITY` and `GENERATION_NUMBER_ZERO`
mark commits not in the commit-graph file and commits written by a
version of Git that did not compute generation numbers respectively.
Generation numbers are computed identically for both kinds of commits.

A "slab-miss" should return `GENERATION_NUMBER_INFINITY` as the commit
is not in the commit-graph file. However, since the slab is
zero-initialized, it returns 0 (or rather `GENERATION_NUMBER_ZERO`).
Thus, we no longer need to check if the topological level of a commit is
`GENERATION_NUMBER_INFINITY`.

We will add a pointer to the slab in `struct write_commit_graph_context`
and `struct commit_graph` to populate the slab in
`fill_commit_graph_info` if the commit has a pre-computed topological
level as in the case of split commit-graphs.
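The zero-means-"not computed" convention described above can be sketched
in isolation. This is a simplified stand-in, not Git's code:
`struct toy_commit`, `compute_levels`, `LEVEL_ZERO`, `LEVEL_MAX` and the
fixed-size arrays are hypothetical, a plain zero-initialized array plays
the role of the commit slab, and the 30-bit cap is an assumption about
the value of `GENERATION_NUMBER_MAX`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in Git. */
#define LEVEL_ZERO  0u           /* "not computed yet" */
#define LEVEL_MAX   0x3FFFFFFFu  /* assumed 30-bit cap */
#define MAX_PARENTS 2
#define MAX_COMMITS 64

struct toy_commit {
	int parents[MAX_PARENTS]; /* indices into the commit array, -1 = none */
};

/*
 * Fill levels[] so that each commit gets 1 + the maximum level of its
 * parents. levels[] must start zeroed (like a fresh slab) and the
 * history must be acyclic. An explicit stack replaces recursion.
 */
static void compute_levels(const struct toy_commit *commits, size_t nr,
			   uint32_t *levels)
{
	size_t stack[MAX_COMMITS];
	size_t i;

	assert(nr <= MAX_COMMITS);
	for (i = 0; i < nr; i++) {
		size_t top = 0;

		if (levels[i] != LEVEL_ZERO)
			continue; /* already computed */
		stack[top++] = i;
		while (top) {
			size_t cur = stack[top - 1];
			uint32_t max_level = 0;
			int all_parents_computed = 1;
			int k;

			for (k = 0; k < MAX_PARENTS; k++) {
				int p = commits[cur].parents[k];

				if (p < 0)
					continue;
				if (levels[p] == LEVEL_ZERO) {
					/* visit this parent first, then retry cur */
					all_parents_computed = 0;
					assert(top < MAX_COMMITS);
					stack[top++] = (size_t)p;
					break;
				}
				if (levels[p] > max_level)
					max_level = levels[p];
			}
			if (all_parents_computed) {
				top--;
				/* clamp before adding 1 so the result never exceeds the cap */
				if (max_level > LEVEL_MAX - 1)
					max_level = LEVEL_MAX - 1;
				levels[cur] = max_level + 1;
			}
		}
	}
}
```

Because every computed level is at least 1, a zero entry can only mean
"not computed", which is why a single comparison against zero suffices
in this sketch.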
Signed-off-by: Abhishek Kumar
---
 commit-graph.c | 45 ++++++++++++++++++++++++++++++---------------
 commit-graph.h |  1 +
 2 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index d5b33b4f7ac..c98e8910fe2 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -64,6 +64,8 @@ void git_test_write_commit_graph_or_die(void)
 /* Remember to update object flag allocation in object.h */
 #define REACHABLE (1u<<15)
 
+define_commit_slab(topo_level_slab, uint32_t);
+
 /* Keep track of the order in which commits are added to our list. */
 define_commit_slab(commit_pos, int);
 static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos);
@@ -768,6 +770,9 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	item->date = (timestamp_t)((date_high << 32) | date_low);
 
 	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+
+	if (g->topo_levels)
+		*topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2;
 }
 
 static inline void set_commit_tree(struct commit *c, struct tree *t)
@@ -956,6 +961,7 @@ struct write_commit_graph_context {
 		 changed_paths:1,
 		 order_by_pack:1;
 
+	struct topo_level_slab *topo_levels;
 	const struct commit_graph_opts *opts;
 	size_t total_bloom_filter_data_size;
 	const struct bloom_filter_settings *bloom_settings;
@@ -1102,7 +1108,7 @@ static int write_graph_chunk_data(struct hashfile *f,
 		else
 			packedDate[0] = 0;
 
-		packedDate[0] |= htonl(commit_graph_data_at(*list)->generation << 2);
+		packedDate[0] |= htonl(*topo_level_slab_at(ctx->topo_levels, *list) << 2);
 
 		packedDate[1] = htonl((*list)->date);
 		hashwrite(f, packedDate, 8);
@@ -1332,11 +1338,10 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 					_("Computing commit graph generation numbers"),
 					ctx->commits.nr);
 	for (i = 0; i < ctx->commits.nr; i++) {
-		uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation;
+		uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]);
 
 		display_progress(ctx->progress, i + 1);
-		if (generation != GENERATION_NUMBER_INFINITY &&
-		    generation != GENERATION_NUMBER_ZERO)
+		if (level != GENERATION_NUMBER_ZERO)
 			continue;
 
 		commit_list_insert(ctx->commits.list[i], &list);
@@ -1344,29 +1349,26 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 			struct commit *current = list->item;
 			struct commit_list *parent;
 			int all_parents_computed = 1;
-			uint32_t max_generation = 0;
+			uint32_t max_level = 0;
 
 			for (parent = current->parents; parent; parent = parent->next) {
-				generation = commit_graph_data_at(parent->item)->generation;
+				level = *topo_level_slab_at(ctx->topo_levels, parent->item);
 
-				if (generation == GENERATION_NUMBER_INFINITY ||
-				    generation == GENERATION_NUMBER_ZERO) {
+				if (level == GENERATION_NUMBER_ZERO) {
 					all_parents_computed = 0;
 					commit_list_insert(parent->item, &list);
 					break;
-				} else if (generation > max_generation) {
-					max_generation = generation;
+				} else if (level > max_level) {
+					max_level = level;
 				}
 			}
 
 			if (all_parents_computed) {
-				struct commit_graph_data *data = commit_graph_data_at(current);
-
-				data->generation = max_generation + 1;
 				pop_commit(&list);
 
-				if (data->generation > GENERATION_NUMBER_MAX)
-					data->generation = GENERATION_NUMBER_MAX;
+				if (max_level > GENERATION_NUMBER_MAX - 1)
+					max_level = GENERATION_NUMBER_MAX - 1;
+				*topo_level_slab_at(ctx->topo_levels, current) = max_level + 1;
 			}
 		}
 	}
@@ -2102,6 +2104,7 @@ int write_commit_graph(struct object_directory *odb,
 	int res = 0;
 	int replace = 0;
 	struct bloom_filter_settings bloom_settings = DEFAULT_BLOOM_FILTER_SETTINGS;
+	struct topo_level_slab topo_levels;
 
 	prepare_repo_settings(the_repository);
 	if (!the_repository->settings.core_commit_graph) {
@@ -2128,6 +2131,18 @@ int write_commit_graph(struct object_directory *odb,
 						 bloom_settings.max_changed_paths);
 	ctx->bloom_settings = &bloom_settings;
 
+	init_topo_level_slab(&topo_levels);
+	ctx->topo_levels
= &topo_levels; + + if (ctx->r->objects->commit_graph) { + struct commit_graph *g = ctx->r->objects->commit_graph; + + while (g) { + g->topo_levels = &topo_levels; + g = g->base_graph; + } + } + if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) ctx->changed_paths = 1; if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) { diff --git a/commit-graph.h b/commit-graph.h index f8e92500c6e..00f00745b79 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -73,6 +73,7 @@ struct commit_graph { const unsigned char *chunk_bloom_indexes; const unsigned char *chunk_bloom_data; + struct topo_level_slab *topo_levels; struct bloom_filter_settings *bloom_filter_settings; }; From patchwork Mon Dec 28 11:16:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991087 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DE1BC433E6 for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0CDAC229C6 for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727339AbgL1LRf (ORCPT ); Mon, 28 Dec 2020 06:17:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727325AbgL1LRe (ORCPT ); Mon, 28 Dec 2020 06:17:34 -0500 Received: from mail-wm1-x331.google.com 
(mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FF10C06179B for ; Mon, 28 Dec 2020 03:16:18 -0800 (PST) Received: by mail-wm1-x331.google.com with SMTP id 3so9532499wmg.4 for ; Mon, 28 Dec 2020 03:16:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=zbyEcLxXymMRNtykZi/9U3zxzfVEp9oLGbTgro/5XhA=; b=qBtO6KDgh5QlNrGiOp7guMu43kChMViziAX2xPB0dP94hJ3o7bUlYBZ67YKglbfCX/ MCvo4of5bUx4wIqsifJT/NlMURbsyiXD2Q7ieW2oopdb+KrQQmSq6uN1ZIxEwzGE7I7v sMaoJReBkJxqygHfZgvKaUspFT8gBoNaNJIGIWJZNvmnbVIiyhYf6E4LsMwHOsQpLETz k7od1L5Qaq+QYWdjDhiypeJnEDfxawFnmY/GvibkMaml+f3t+fwTh6StyOZ0PkW528Wv 1SZtbtlVS11893kFmMA5A6V2ElvAP4kYrMStX5vbaTolxCGh4C84kIyb4bIzObb2SRol QDuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=zbyEcLxXymMRNtykZi/9U3zxzfVEp9oLGbTgro/5XhA=; b=BfJ9eZT5Aepd+R7O7OsDP2VrDxwLZL/a9lPmSMMm39BxjtSt1r0UWnQKAJ2HL+usYN MwtioRzsgPoJkJMQ1zVrWNtQAZb2lxTKvtCZKOxdckvxkv1ou/Ijxg8fwt/h9UZE74h9 I2ciJFXwwPS/vTpsIG0ouRDfZLPSnGXL8KXGXE7YAP/zjWZZ/AjH2C6bsiCiyzJlgHx2 XCHLPITwYOFyYKt+eCNr/FNIjldmukU1xV+ct0zDs4oJnvZz2Nz3eC+3EYhgIq220BF6 d/yYDzTRYZ/Cld5Z5Ea/1fuWDSP0erSXrMxBj9zrsTEcibTesDI0BWpR2iRZkkRgOvSh jyKg== X-Gm-Message-State: AOAM5326SmZCgjwBH5FkW237Wc+UkhlNd8LkFtgMI3FarJpJUIx4YCjb DzCQM59IabS7h/wp2BWebyy44SyAn3c= X-Google-Smtp-Source: ABdhPJxytqWKLT1Z5JvLbYj2YG9wtf7mJGKTgnFHMuBZhRO9/2xJ0m7nfEj/nbbtbIvwGfJ8dtNrmA== X-Received: by 2002:a1c:4156:: with SMTP id o83mr20034079wma.178.1609154177103; Mon, 28 Dec 2020 03:16:17 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k10sm52780361wrq.38.2020.12.28.03.16.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 
2020 03:16:16 -0800 (PST) Message-Id: <26bd6f4910059a0900a9ce48b2d6b668da6e34e7.1609154168.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:03 +0000 Subject: [PATCH v5 06/11] commit-graph: return 64-bit generation number Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar In a preparatory step for introducing corrected commit dates, let's return timestamp_t values from commit_graph_generation(), use timestamp_t for local variables and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead. We rename GENERATION_NUMBER_MAX to GENERATION_NUMBER_V1_MAX to represent the largest topological level we can store in the commit data chunk. With corrected commit dates implemented, we will have two such *_MAX variables to denote the largest offset and largest topological level that can be stored. Signed-off-by: Abhishek Kumar --- commit-graph.c | 22 +++++++++++----------- commit-graph.h | 4 ++-- commit-reach.c | 36 ++++++++++++++++++------------------ commit-reach.h | 2 +- commit.c | 4 ++-- commit.h | 4 ++-- revision.c | 10 +++++----- upload-pack.c | 2 +- 8 files changed, 42 insertions(+), 42 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index c98e8910fe2..1b2a015f92f 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -101,7 +101,7 @@ uint32_t commit_graph_position(const struct commit *c) return data ? 
data->graph_pos : COMMIT_NOT_FROM_GRAPH; } -uint32_t commit_graph_generation(const struct commit *c) +timestamp_t commit_graph_generation(const struct commit *c) { struct commit_graph_data *data = commit_graph_data_slab_peek(&commit_graph_data_slab, c); @@ -146,8 +146,8 @@ static int commit_gen_cmp(const void *va, const void *vb) const struct commit *a = *(const struct commit **)va; const struct commit *b = *(const struct commit **)vb; - uint32_t generation_a = commit_graph_data_at(a)->generation; - uint32_t generation_b = commit_graph_data_at(b)->generation; + const timestamp_t generation_a = commit_graph_data_at(a)->generation; + const timestamp_t generation_b = commit_graph_data_at(b)->generation; /* lower generation commits first */ if (generation_a < generation_b) return -1; @@ -1366,8 +1366,8 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) if (all_parents_computed) { pop_commit(&list); - if (max_level > GENERATION_NUMBER_MAX - 1) - max_level = GENERATION_NUMBER_MAX - 1; + if (max_level > GENERATION_NUMBER_V1_MAX - 1) + max_level = GENERATION_NUMBER_V1_MAX - 1; *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1; } } @@ -2363,8 +2363,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; - uint32_t max_generation = 0; - uint32_t generation; + timestamp_t max_generation = 0; + timestamp_t generation; display_progress(progress, i + 1); hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); @@ -2428,16 +2428,16 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) continue; /* - * If one of our parents has generation GENERATION_NUMBER_MAX, then - * our generation is also GENERATION_NUMBER_MAX. 
Decrement to avoid + * If one of our parents has generation GENERATION_NUMBER_V1_MAX, then + * our generation is also GENERATION_NUMBER_V1_MAX. Decrement to avoid * extra logic in the following condition. */ - if (max_generation == GENERATION_NUMBER_MAX) + if (max_generation == GENERATION_NUMBER_V1_MAX) max_generation--; generation = commit_graph_generation(graph_commit); if (generation != max_generation + 1) - graph_report(_("commit-graph generation for commit %s is %u != %u"), + graph_report(_("commit-graph generation for commit %s is %"PRItime" != %"PRItime), oid_to_hex(&cur_oid), generation, max_generation + 1); diff --git a/commit-graph.h b/commit-graph.h index 00f00745b79..2e9aa7824ee 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -145,12 +145,12 @@ void disable_commit_graph(struct repository *r); struct commit_graph_data { uint32_t graph_pos; - uint32_t generation; + timestamp_t generation; }; /* * Commits should be parsed before accessing generation, graph positions. */ -uint32_t commit_graph_generation(const struct commit *); +timestamp_t commit_graph_generation(const struct commit *); uint32_t commit_graph_position(const struct commit *); #endif diff --git a/commit-reach.c b/commit-reach.c index 50175b159e7..9b24b0378d5 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -32,12 +32,12 @@ static int queue_has_nonstale(struct prio_queue *queue) static struct commit_list *paint_down_to_common(struct repository *r, struct commit *one, int n, struct commit **twos, - int min_generation) + timestamp_t min_generation) { struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; struct commit_list *result = NULL; int i; - uint32_t last_gen = GENERATION_NUMBER_INFINITY; + timestamp_t last_gen = GENERATION_NUMBER_INFINITY; if (!min_generation) queue.compare = compare_commits_by_commit_date; @@ -58,10 +58,10 @@ static struct commit_list *paint_down_to_common(struct repository *r, struct commit *commit = prio_queue_get(&queue); struct commit_list 
*parents; int flags; - uint32_t generation = commit_graph_generation(commit); + timestamp_t generation = commit_graph_generation(commit); if (min_generation && generation > last_gen) - BUG("bad generation skip %8x > %8x at %s", + BUG("bad generation skip %"PRItime" > %"PRItime" at %s", generation, last_gen, oid_to_hex(&commit->object.oid)); last_gen = generation; @@ -177,12 +177,12 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt repo_parse_commit(r, array[i]); for (i = 0; i < cnt; i++) { struct commit_list *common; - uint32_t min_generation = commit_graph_generation(array[i]); + timestamp_t min_generation = commit_graph_generation(array[i]); if (redundant[i]) continue; for (j = filled = 0; j < cnt; j++) { - uint32_t curr_generation; + timestamp_t curr_generation; if (i == j || redundant[j]) continue; filled_index[filled] = j; @@ -321,7 +321,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit, { struct commit_list *bases; int ret = 0, i; - uint32_t generation, max_generation = GENERATION_NUMBER_ZERO; + timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO; if (repo_parse_commit(r, commit)) return ret; @@ -470,7 +470,7 @@ static int in_commit_list(const struct commit_list *want, struct commit *c) static enum contains_result contains_test(struct commit *candidate, const struct commit_list *want, struct contains_cache *cache, - uint32_t cutoff) + timestamp_t cutoff) { enum contains_result *cached = contains_cache_at(cache, candidate); @@ -506,11 +506,11 @@ static enum contains_result contains_tag_algo(struct commit *candidate, { struct contains_stack contains_stack = { 0, 0, NULL }; enum contains_result result; - uint32_t cutoff = GENERATION_NUMBER_INFINITY; + timestamp_t cutoff = GENERATION_NUMBER_INFINITY; const struct commit_list *p; for (p = want; p; p = p->next) { - uint32_t generation; + timestamp_t generation; struct commit *c = p->item; load_commit_graph_info(the_repository, c); generation 
= commit_graph_generation(c); @@ -566,8 +566,8 @@ static int compare_commits_by_gen(const void *_a, const void *_b) const struct commit *a = *(const struct commit * const *)_a; const struct commit *b = *(const struct commit * const *)_b; - uint32_t generation_a = commit_graph_generation(a); - uint32_t generation_b = commit_graph_generation(b); + timestamp_t generation_a = commit_graph_generation(a); + timestamp_t generation_b = commit_graph_generation(b); if (generation_a < generation_b) return -1; @@ -580,7 +580,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation) + timestamp_t min_generation) { struct commit **list = NULL; int i; @@ -681,13 +681,13 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0; struct commit_list *from_iter = from, *to_iter = to; int result; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; while (from_iter) { add_object_array(&from_iter->item->object, NULL, &from_objs); if (!parse_commit(from_iter->item)) { - uint32_t generation; + timestamp_t generation; if (from_iter->item->date < min_commit_date) min_commit_date = from_iter->item->date; @@ -701,7 +701,7 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, while (to_iter) { if (!parse_commit(to_iter->item)) { - uint32_t generation; + timestamp_t generation; if (to_iter->item->date < min_commit_date) min_commit_date = to_iter->item->date; @@ -741,13 +741,13 @@ struct commit_list *get_reachable_subset(struct commit **from, int nr_from, struct commit_list *found_commits = NULL; struct commit **to_last = to + nr_to; struct commit **from_last = from + nr_from; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; int num_to_find = 0; struct 
prio_queue queue = { compare_commits_by_gen_then_commit_date }; for (item = to; item < to_last; item++) { - uint32_t generation; + timestamp_t generation; struct commit *c = *item; parse_commit(c); diff --git a/commit-reach.h b/commit-reach.h index b49ad71a317..148b56fea50 100644 --- a/commit-reach.h +++ b/commit-reach.h @@ -87,7 +87,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation); + timestamp_t min_generation); int can_all_from_reach(struct commit_list *from, struct commit_list *to, int commit_date_cutoff); diff --git a/commit.c b/commit.c index fe1fa3dc41f..17abf92a2d2 100644 --- a/commit.c +++ b/commit.c @@ -731,8 +731,8 @@ int compare_commits_by_author_date(const void *a_, const void *b_, int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused) { const struct commit *a = a_, *b = b_; - const uint32_t generation_a = commit_graph_generation(a), - generation_b = commit_graph_generation(b); + const timestamp_t generation_a = commit_graph_generation(a), + generation_b = commit_graph_generation(b); /* newer commits first */ if (generation_a < generation_b) diff --git a/commit.h b/commit.h index 5467786c7be..33c66b2177c 100644 --- a/commit.h +++ b/commit.h @@ -11,8 +11,8 @@ #include "commit-slab.h" #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF -#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF -#define GENERATION_NUMBER_MAX 0x3FFFFFFF +#define GENERATION_NUMBER_INFINITY ((1ULL << 63) - 1) +#define GENERATION_NUMBER_V1_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 struct commit_list { diff --git a/revision.c b/revision.c index de8e45f462f..d55c2e4d566 100644 --- a/revision.c +++ b/revision.c @@ -3300,7 +3300,7 @@ define_commit_slab(indegree_slab, int); define_commit_slab(author_date_slab, timestamp_t); struct topo_walk_info { - uint32_t min_generation; + timestamp_t min_generation; struct prio_queue explore_queue; struct 
prio_queue indegree_queue; struct prio_queue topo_queue; @@ -3346,7 +3346,7 @@ static void explore_walk_step(struct rev_info *revs) } static void explore_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3389,7 +3389,7 @@ static void indegree_walk_step(struct rev_info *revs) } static void compute_indegrees_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3447,7 +3447,7 @@ static void init_topo_walk(struct rev_info *revs) info->min_generation = GENERATION_NUMBER_INFINITY; for (list = revs->commits; list; list = list->next) { struct commit *c = list->item; - uint32_t generation; + timestamp_t generation; if (repo_parse_commit_gently(revs->repo, c, 1)) continue; @@ -3508,7 +3508,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit) for (p = commit->parents; p; p = p->next) { struct commit *parent = p->item; int *pi; - uint32_t generation; + timestamp_t generation; if (parent->object.flags & UNINTERESTING) continue; diff --git a/upload-pack.c b/upload-pack.c index 3b66bf92ba8..b87607e0dd4 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -500,7 +500,7 @@ static int got_oid(struct upload_pack_data *data, static int ok_to_give_up(struct upload_pack_data *data) { - uint32_t min_generation = GENERATION_NUMBER_ZERO; + timestamp_t min_generation = GENERATION_NUMBER_ZERO; if (!data->have_obj.nr) return 0; From patchwork Mon Dec 28 11:16:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991091 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74D9EC433DB for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 32BF022583 for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727344AbgL1LRg (ORCPT ); Mon, 28 Dec 2020 06:17:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727328AbgL1LRf (ORCPT ); Mon, 28 Dec 2020 06:17:35 -0500 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6321BC06179C for ; Mon, 28 Dec 2020 03:16:19 -0800 (PST) Received: by mail-wr1-x42d.google.com with SMTP id i9so11030557wrc.4 for ; Mon, 28 Dec 2020 03:16:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=uAuu72FlkmWGfiSbzW2PBA+6L6K8jcsYnFjikJdp3xE=; b=LlqcsN3sN7tJ8NvB6uN1308e342FEtDb+iZK4aKcCsVQdOYsfcUq3Mw/9+9/DAmau/ +TsHh9L6dwne9sRG5zkQsqbYu1v6pSSD5c64dsSdgdw9mmjlu0OZyFKejuK+Olq5tcAl PQdvNVTgXS+Xkjr5Yi6bNzvwvU1xMq6TD0Snz5igNIwfXsvb3gEP4fPOt1YdUtVWZr5E bfpxpNTB3FkPOLISUEQ9htrm+IAt5pxt9T5u0ZDCbKzxUJQy5tOdVyGrYJtmNETx+YI9 YCreQ4+8r8T/jJPYPrATrtPSJnX7lhfXMSeRTH/FeCy0AF5rmT7L3kFcVonUxpZijjTv TJ+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=uAuu72FlkmWGfiSbzW2PBA+6L6K8jcsYnFjikJdp3xE=; b=EkmpBY8lg1WyY5A7obQ3uROCYUJIKzJD4W2tH31rTzcWQpBDMBl8Sr2PEpiWDTM+yd Oc7ipDjiojYASg/ssuxYtVmSjpZTNgTgVWWFUQ7sDVjR6Tl5QGXw6h5W8wAbUXOxxRN5 ISkdPIZlj6MxjaJsEI0JWW6/tW5YQJEzPnbhh+LFQkeMwuh6UjQdHeh1NWRBxzmRIFCo 4GuuWm5HGrOkitzDIKWp6fsx+0DEGNh7ZjSqSKOdzQJD3m6c1G1eZn61oKY9lQa6E9dt EOaWFgFXp25LoYEB14Vx3WoXlzuLB9c6uH5EQpmH+nTfwn9ZDrVzHU3Ra+z/Zw9Cnv4m kDww== X-Gm-Message-State: AOAM531OaK0hBZ38f1tA4MTnbVyyjhVndStZJtmmiiWIBUIgh1WCrbto 6xGHc1m2tjnTrHyOov2JKibm/HNLCTg= X-Google-Smtp-Source: ABdhPJyix5DXEgRQoiOnaSCNb4QDvqDwP2ZYaeguEJhYParihqNFvj7hY6uyZqwnQr2AlQFrOfFt3w== X-Received: by 2002:adf:a3c3:: with SMTP id m3mr51771977wrb.105.1609154177998; Mon, 28 Dec 2020 03:16:17 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id i18sm56499455wrp.74.2020.12.28.03.16.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:17 -0800 (PST) Message-Id: <859c39eff52e32ad322969d024184971acec82e7.1609154168.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:04 +0000 Subject: [PATCH v5 07/11] commit-graph: implement corrected commit date Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar

With most of the preparations done, let's implement corrected commit date.

The corrected commit date for a commit is defined as:

* A commit with no parents (a root commit) has corrected commit date equal to its committer date.

* A commit with at least one parent has corrected commit date equal to the maximum of its commit date and one more than the largest corrected commit date among its parents.

As a special case, a root commit with a timestamp of zero (01.01.1970 00:00:00Z) has a corrected commit date of one, so that it can be distinguished from GENERATION_NUMBER_ZERO (that is, an uncomputed corrected commit date).

To minimize the space required to store corrected commit dates, Git stores corrected commit date offsets in the commit-graph file. The corrected commit date offset for a commit is defined as the difference between its corrected commit date and its actual commit date.

Storing a corrected commit date requires sizeof(timestamp_t) bytes, which in most cases is 64 bits (uintmax_t). However, corrected commit date offsets can usually be stored in only 32 bits. This halves the size of the GDAT chunk, which is a reduction of around 6% in the size of the commit-graph file.

However, using offsets can be problematic if one of the commits is malformed but valid and has a committer date of 0 Unix time, as the offset would be the same as the corrected commit date and thus require 64 bits to be stored properly.

While Git does not write out offsets at this stage, it stores the corrected commit dates in the member generation of struct commit_graph_data. It will begin writing commit date offsets with the introduction of the generation data chunk.
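The definition and the offset can be sketched in a few lines of standalone C. This is an illustration under stated assumptions rather than the code in commit-graph.c: `toy_timestamp` (a uint64_t standing in for timestamp_t), `toy_corrected_commit_date()` and `toy_offset()` are hypothetical names, and the parents' corrected commit dates are assumed to be already computed and passed in as an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: uint64_t stands in for Git's timestamp_t. */
typedef uint64_t toy_timestamp;

/*
 * Corrected commit date: the maximum of the commit's own date and one
 * more than the largest corrected commit date among its parents. A root
 * commit with date zero yields one, so the result is never zero.
 */
static toy_timestamp toy_corrected_commit_date(toy_timestamp date,
					       const toy_timestamp *parent_dates,
					       size_t nr_parents)
{
	toy_timestamp max_parent = 0;
	size_t i;

	for (i = 0; i < nr_parents; i++) {
		/* Zero means "not computed yet"; parents must come first. */
		assert(parent_dates[i] != 0);
		if (parent_dates[i] > max_parent)
			max_parent = parent_dates[i];
	}

	/* Same shape as the patch: raise the floor to date - 1, then add one. */
	if (date && date > max_parent)
		max_parent = date - 1;
	return max_parent + 1;
}

/* Offset as defined above: corrected commit date minus actual commit date. */
static toy_timestamp toy_offset(toy_timestamp date, toy_timestamp corrected)
{
	assert(corrected >= date);
	return corrected - date;
}
```

For example, a commit dated 50 whose only parent has corrected commit date 100 gets corrected commit date 101 and offset 51, while a commit dated 200 with parents at 100 and 150 keeps its own date and has offset 0.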
Signed-off-by: Abhishek Kumar --- commit-graph.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 1b2a015f92f..bfc3aae5f93 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -1339,9 +1339,11 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) ctx->commits.nr); for (i = 0; i < ctx->commits.nr; i++) { uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]); + timestamp_t corrected_commit_date = commit_graph_data_at(ctx->commits.list[i])->generation; display_progress(ctx->progress, i + 1); - if (level != GENERATION_NUMBER_ZERO) + if (level != GENERATION_NUMBER_ZERO && + corrected_commit_date != GENERATION_NUMBER_ZERO) continue; commit_list_insert(ctx->commits.list[i], &list); @@ -1350,16 +1352,23 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) struct commit_list *parent; int all_parents_computed = 1; uint32_t max_level = 0; + timestamp_t max_corrected_commit_date = 0; for (parent = current->parents; parent; parent = parent->next) { level = *topo_level_slab_at(ctx->topo_levels, parent->item); + corrected_commit_date = commit_graph_data_at(parent->item)->generation; - if (level == GENERATION_NUMBER_ZERO) { + if (level == GENERATION_NUMBER_ZERO || + corrected_commit_date == GENERATION_NUMBER_ZERO) { all_parents_computed = 0; commit_list_insert(parent->item, &list); break; - } else if (level > max_level) { - max_level = level; + } else { + if (level > max_level) + max_level = level; + + if (corrected_commit_date > max_corrected_commit_date) + max_corrected_commit_date = corrected_commit_date; } } @@ -1369,6 +1378,10 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) if (max_level > GENERATION_NUMBER_V1_MAX - 1) max_level = GENERATION_NUMBER_V1_MAX - 1; *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1; + + if (current->date && current->date > 
max_corrected_commit_date) + max_corrected_commit_date = current->date - 1; + commit_graph_data_at(current)->generation = max_corrected_commit_date + 1; } } } From patchwork Mon Dec 28 11:16:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD4C6C4332B for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9DAD9229C6 for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727362AbgL1LRm (ORCPT ); Mon, 28 Dec 2020 06:17:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727329AbgL1LRe (ORCPT ); Mon, 28 Dec 2020 06:17:34 -0500 Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 771C1C06179E for ; Mon, 28 Dec 2020 03:16:20 -0800 (PST) Received: by mail-wr1-x42f.google.com with SMTP id i9so11030611wrc.4 for ; Mon, 28 Dec 2020 03:16:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; 
bh=5FdB9pecxzRhb/f7Jf8BisIavo9M2IWJpKMI4JN0M44=; b=EFkNzmcJBuIHERzaDb+akIRGNp9sJOnmiOum6/nqEqzOGThEP3N23fOwYJ0rj8B7dI PCP/KLr6/VDJHpe/r13fMIBzcrIJVqlLQ9A/Wm/BWVxhPVoGdCKRtn9PWPIt/1u4Q+CM uOVgz5hvE3l8ZU01kuRIzuR3qM2iPkFWTf0oxIdej3GYd0KDnBxwfPIq3CgvDAHEP8KR DAiRwUgKHjhZd0EhDuEfXXzUXy30R/N9ORiihtXVWe2o3DLCyBkywtZJQdUzT+60aPcS oX+TcIUHs43ZsRy6Qx2aMudcKTC3n58P2hdDklxYtOXOdk4nWNVrpaEbVRDM2HjzriZb KCMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=5FdB9pecxzRhb/f7Jf8BisIavo9M2IWJpKMI4JN0M44=; b=BIicPCtIYwNRka1xibCZR+YDOETEautAs3s2LVEbSvGfGNqdk34kF7N3+gd7wJ+NfD lh7jxrNUkojLlqmra5jDPxzHW0wncNMx8Wu+FKZlY/pHjYI2N+qsGD2tzT9KU/BmtoS2 fgwg4N+LCXPk8E3v0RdtIeoNUbSRDjQdYM/WwW1ee6f+GeCTCJHHDcFsepymMxs0LvtO JQ4Tg0j1D1og1Ak9iOpjgDtJC9i5+HfTioVl62V0lMvK9Z9ApadSat6khbIGkgc3R8y0 nm6E0jz/OCzFr2j267Jz8WdezM9akeKxgNiN6IdYjlVYAOX8idNeKHIUZB6s7FjFgVmm 9A6w== X-Gm-Message-State: AOAM533L9F6UBkRtux+OmxRqD6iozfydKnjrCos7FSOqHGTewNhuru7i cnUYxeU5ltmLpELaYALjdlSjkYY2exY= X-Google-Smtp-Source: ABdhPJwFR36lunrmx7z5S+iftcsHtWrwB6x2BPocPn8TKgmP80G01Q/YKldx0L4drBqQs1FISG0MVw== X-Received: by 2002:a5d:6a4f:: with SMTP id t15mr52213427wrw.62.1609154178848; Mon, 28 Dec 2020 03:16:18 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id v20sm58099510wra.19.2020.12.28.03.16.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:18 -0800 (PST) Message-Id: <8403c4d025727bbc4b69ca12c42dd1db7826159b.1609154169.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:05 +0000 Subject: [PATCH v5 08/11] commit-graph: implement generation data chunk MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar

As discovered by Ævar, we cannot increment the graph version to distinguish between generation numbers v1 and v2 [1]. Thus, one of the prerequisites for implementing generation number v2 was to distinguish between graph versions in a backwards-compatible manner.

We are going to introduce a new chunk called Generation DATa chunk (or GDAT). GDAT will store corrected committer date offsets, whereas CDAT will still store topological levels.

Old Git does not understand the GDAT chunk and would ignore it, reading topological levels from CDAT. New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when the GDAT chunk is missing (as it would happen with a commit-graph written by old Git).

We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces the commit-graph file to be written without the generation data chunk, to emulate a commit-graph file written by old Git.

To minimize the space required to store corrected commit dates, Git stores corrected commit date offsets in the commit-graph file, instead of corrected commit dates. This saves us 4 bytes per commit, decreasing the GDAT chunk size by half, but it's possible for the offset to overflow the 4 bytes allocated for storage. As such overflows are and should be exceedingly rare, we use the following overflow management scheme:

We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV'), to store corrected commit dates for commits with offsets greater than GENERATION_NUMBER_V2_OFFSET_MAX.

If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set the MSB of the offset and the other bits store the position of the corrected commit date in the GDOV chunk, similar to how the Extra Edge List is maintained.
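The overflow scheme can be illustrated with a small in-memory sketch. It is hedged in several ways: `toy_gdov`, `toy_encode_offset()` and `toy_decode_offset()` are hypothetical names, the GDOV chunk is modelled as a fixed-size array instead of big-endian file data, `TOY_OFFSET_OVERFLOW` mirrors CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW from this patch, and `TOY_OFFSET_MAX` assumes that GENERATION_NUMBER_V2_OFFSET_MAX is the largest value that leaves the MSB clear.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_OFFSET_OVERFLOW (1ULL << 31)     /* MSB of the 32-bit slot */
#define TOY_OFFSET_MAX ((1ULL << 31) - 1)    /* assumed value, see above */
#define TOY_GDOV_CAPACITY 8

/* Hypothetical in-memory model of the GDOV chunk. */
struct toy_gdov {
	uint64_t dates[TOY_GDOV_CAPACITY];
	uint32_t nr;
};

/*
 * Returns the 32-bit value for the GDAT slot. Small offsets are stored
 * directly; large ones store MSB | position and push the full corrected
 * commit date into the overflow table.
 */
static uint32_t toy_encode_offset(uint64_t date, uint64_t corrected,
				  struct toy_gdov *ov)
{
	uint64_t offset = corrected - date;

	if (offset > TOY_OFFSET_MAX) {
		assert(ov->nr < TOY_GDOV_CAPACITY);
		ov->dates[ov->nr] = corrected;
		return (uint32_t)(TOY_OFFSET_OVERFLOW | ov->nr++);
	}
	return (uint32_t)offset;
}

/* Inverse of the above, following the shape of fill_commit_graph_info(). */
static uint64_t toy_decode_offset(uint64_t date, uint32_t stored,
				  const struct toy_gdov *ov)
{
	if (stored & TOY_OFFSET_OVERFLOW)
		return ov->dates[stored ^ TOY_OFFSET_OVERFLOW];
	return date + stored;
}
```

Under these assumptions, an offset of exactly 2 ^ 31 is the smallest one that takes the overflow path, while 2 ^ 31 - 1 is still stored inline.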
We test the overflow-related code with the following repo history:

              F - N - U
             /         \
    U - N - U           N
             \         /
              N - F - N

Where the commits denoted by U have committer date of zero seconds since Unix epoch, the commits denoted by N have committer date of 1112354055 (default committer date for the test suite) seconds since Unix epoch and the commits denoted by F have committer date of (2 ^ 31 - 2) seconds since Unix epoch.

The largest offset observed is 2 ^ 31, just large enough to overflow.

[1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/

Signed-off-by: Abhishek Kumar
---
 commit-graph.c | 111 ++++++++++++++++++++++++++++++----
 commit-graph.h | 3 +
 commit.h | 1 +
 t/README | 3 +
 t/helper/test-read-graph.c | 4 ++
 t/t4216-log-bloom.sh | 4 +-
 t/t5318-commit-graph.sh | 79 ++++++++++++++++++++----
 t/t5324-split-commit-graph.sh | 12 ++--
 t/t6600-test-reach.sh | 6 ++
 t/test-lib-functions.sh | 6 ++
 10 files changed, 197 insertions(+), 32 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c index bfc3aae5f93..629b2f17fbc 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -38,11 +38,13 @@ void git_test_write_commit_graph_or_die(void) #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ #define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ +#define GRAPH_CHUNKID_GENERATION_DATA 0x47444154 /* "GDAT" */ +#define GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW 0x47444f56 /* "GDOV" */ #define GRAPH_CHUNKID_EXTRAEDGES 0x45444745 /* "EDGE" */ #define GRAPH_CHUNKID_BLOOMINDEXES 0x42494458 /* "BIDX" */ #define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */ #define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */ -#define MAX_NUM_CHUNKS 7 +#define MAX_NUM_CHUNKS 9 #define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16) @@ -61,6 +63,8 @@ void git_test_write_commit_graph_or_die(void) #define GRAPH_MIN_SIZE (GRAPH_HEADER_SIZE + 4 * GRAPH_CHUNKLOOKUP_WIDTH \ + GRAPH_FANOUT_SIZE + the_hash_algo->rawsz) +#define 
CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW (1ULL << 31) + /* Remember to update object flag allocation in object.h */ #define REACHABLE (1u<<15) @@ -390,6 +394,20 @@ struct commit_graph *parse_commit_graph(struct repository *r, graph->chunk_commit_data = data + chunk_offset; break; + case GRAPH_CHUNKID_GENERATION_DATA: + if (graph->chunk_generation_data) + chunk_repeated = 1; + else + graph->chunk_generation_data = data + chunk_offset; + break; + + case GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW: + if (graph->chunk_generation_data_overflow) + chunk_repeated = 1; + else + graph->chunk_generation_data_overflow = data + chunk_offset; + break; + case GRAPH_CHUNKID_EXTRAEDGES: if (graph->chunk_extra_edges) chunk_repeated = 1; @@ -750,8 +768,8 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, { const unsigned char *commit_data; struct commit_graph_data *graph_data; - uint32_t lex_index; - uint64_t date_high, date_low; + uint32_t lex_index, offset_pos; + uint64_t date_high, date_low, offset; while (pos < g->num_commits_in_base) g = g->base_graph; @@ -769,7 +787,16 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, date_low = get_be32(commit_data + g->hash_len + 12); item->date = (timestamp_t)((date_high << 32) | date_low); - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; + if (g->chunk_generation_data) { + offset = (timestamp_t)get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); + + if (offset & CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW) { + offset_pos = offset ^ CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW; + graph_data->generation = get_be64(g->chunk_generation_data_overflow + 8 * offset_pos); + } else + graph_data->generation = item->date + offset; + } else + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; if (g->topo_levels) *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2; @@ -941,6 +968,7 @@ struct 
write_commit_graph_context { struct oid_array oids; struct packed_commit_list commits; int num_extra_edges; + int num_generation_data_overflows; unsigned long approx_nr_objects; struct progress *progress; int progress_done; @@ -959,7 +987,8 @@ struct write_commit_graph_context { report_progress:1, split:1, changed_paths:1, - order_by_pack:1; + order_by_pack:1, + write_generation_data:1; struct topo_level_slab *topo_levels; const struct commit_graph_opts *opts; @@ -1119,6 +1148,45 @@ static int write_graph_chunk_data(struct hashfile *f, return 0; } +static int write_graph_chunk_generation_data(struct hashfile *f, + struct write_commit_graph_context *ctx) +{ + int i, num_generation_data_overflows = 0; + + for (i = 0; i < ctx->commits.nr; i++) { + struct commit *c = ctx->commits.list[i]; + timestamp_t offset = commit_graph_data_at(c)->generation - c->date; + display_progress(ctx->progress, ++ctx->progress_cnt); + + if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) { + offset = CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW | num_generation_data_overflows; + num_generation_data_overflows++; + } + + hashwrite_be32(f, offset); + } + + return 0; +} + +static int write_graph_chunk_generation_data_overflow(struct hashfile *f, + struct write_commit_graph_context *ctx) +{ + int i; + for (i = 0; i < ctx->commits.nr; i++) { + struct commit *c = ctx->commits.list[i]; + timestamp_t offset = commit_graph_data_at(c)->generation - c->date; + display_progress(ctx->progress, ++ctx->progress_cnt); + + if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) { + hashwrite_be32(f, offset >> 32); + hashwrite_be32(f, (uint32_t) offset); + } + } + + return 0; +} + static int write_graph_chunk_extra_edges(struct hashfile *f, struct write_commit_graph_context *ctx) { @@ -1382,6 +1450,9 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) if (current->date && current->date > max_corrected_commit_date) max_corrected_commit_date = current->date - 1; commit_graph_data_at(current)->generation 
= max_corrected_commit_date + 1; + + if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX) + ctx->num_generation_data_overflows++; } } } @@ -1715,6 +1786,21 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) chunks[2].id = GRAPH_CHUNKID_DATA; chunks[2].size = (hashsz + 16) * ctx->commits.nr; chunks[2].write_fn = write_graph_chunk_data; + + if (git_env_bool(GIT_TEST_COMMIT_GRAPH_NO_GDAT, 0)) + ctx->write_generation_data = 0; + if (ctx->write_generation_data) { + chunks[num_chunks].id = GRAPH_CHUNKID_GENERATION_DATA; + chunks[num_chunks].size = sizeof(uint32_t) * ctx->commits.nr; + chunks[num_chunks].write_fn = write_graph_chunk_generation_data; + num_chunks++; + } + if (ctx->num_generation_data_overflows) { + chunks[num_chunks].id = GRAPH_CHUNKID_GENERATION_DATA_OVERFLOW; + chunks[num_chunks].size = sizeof(timestamp_t) * ctx->num_generation_data_overflows; + chunks[num_chunks].write_fn = write_graph_chunk_generation_data_overflow; + num_chunks++; + } if (ctx->num_extra_edges) { chunks[num_chunks].id = GRAPH_CHUNKID_EXTRAEDGES; chunks[num_chunks].size = 4 * ctx->num_extra_edges; @@ -2135,6 +2221,8 @@ int write_commit_graph(struct object_directory *odb, ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0; ctx->opts = opts; ctx->total_bloom_filter_data_size = 0; + ctx->write_generation_data = 1; + ctx->num_generation_data_overflows = 0; bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY", bloom_settings.bits_per_entry); @@ -2441,16 +2529,17 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) continue; /* - * If one of our parents has generation GENERATION_NUMBER_V1_MAX, then - * our generation is also GENERATION_NUMBER_V1_MAX. Decrement to avoid - * extra logic in the following condition. 
+ * If we are using topological level and one of our parents has + * generation GENERATION_NUMBER_V1_MAX, then our generation is + * also GENERATION_NUMBER_V1_MAX. Decrement to avoid extra logic + * in the following condition. */ - if (max_generation == GENERATION_NUMBER_V1_MAX) + if (!g->chunk_generation_data && max_generation == GENERATION_NUMBER_V1_MAX) max_generation--; generation = commit_graph_generation(graph_commit); - if (generation != max_generation + 1) - graph_report(_("commit-graph generation for commit %s is %"PRItime" != %"PRItime), + if (generation < max_generation + 1) + graph_report(_("commit-graph generation for commit %s is %"PRItime" < %"PRItime), oid_to_hex(&cur_oid), generation, max_generation + 1); diff --git a/commit-graph.h b/commit-graph.h index 2e9aa7824ee..19a02001fde 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -6,6 +6,7 @@ #include "oidset.h" #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH" +#define GIT_TEST_COMMIT_GRAPH_NO_GDAT "GIT_TEST_COMMIT_GRAPH_NO_GDAT" #define GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE "GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE" #define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS" @@ -68,6 +69,8 @@ struct commit_graph { const uint32_t *chunk_oid_fanout; const unsigned char *chunk_oid_lookup; const unsigned char *chunk_commit_data; + const unsigned char *chunk_generation_data; + const unsigned char *chunk_generation_data_overflow; const unsigned char *chunk_extra_edges; const unsigned char *chunk_base_graphs; const unsigned char *chunk_bloom_indexes; diff --git a/commit.h b/commit.h index 33c66b2177c..251d877fcf6 100644 --- a/commit.h +++ b/commit.h @@ -14,6 +14,7 @@ #define GENERATION_NUMBER_INFINITY ((1ULL << 63) - 1) #define GENERATION_NUMBER_V1_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 +#define GENERATION_NUMBER_V2_OFFSET_MAX ((1ULL << 31) - 1) struct commit_list { struct commit *item; diff --git a/t/README b/t/README index c730a707705..8a121487279 100644 --- a/t/README +++ 
b/t/README @@ -393,6 +393,9 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the commit-graph to be written after every 'git commit' command, and overrides the 'core.commitGraph' setting to true. +GIT_TEST_COMMIT_GRAPH_NO_GDAT=, when true, forces the +commit-graph to be written without generation data chunk. + GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=, when true, forces commit-graph write to compute and write changed path Bloom filters for every 'git commit-graph write', as if the `--changed-paths` option was diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c index 5f585a17256..75927b2c81d 100644 --- a/t/helper/test-read-graph.c +++ b/t/helper/test-read-graph.c @@ -33,6 +33,10 @@ int cmd__read_graph(int argc, const char **argv) printf(" oid_lookup"); if (graph->chunk_commit_data) printf(" commit_metadata"); + if (graph->chunk_generation_data) + printf(" generation_data"); + if (graph->chunk_generation_data_overflow) + printf(" generation_data_overflow"); if (graph->chunk_extra_edges) printf(" extra_edges"); if (graph->chunk_bloom_indexes) diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index d11040ce41c..dbde0161882 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -40,11 +40,11 @@ test_expect_success 'setup test - repo, commits, commit graph, log outputs' ' ' graph_read_expect () { - NUM_CHUNKS=5 + NUM_CHUNKS=6 cat >expect <<- EOF header: 43475048 1 $(test_oid oid_version) $NUM_CHUNKS 0 num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data + chunks: oid_fanout oid_lookup commit_metadata generation_data bloom_indexes bloom_data EOF test-tool read-graph >actual && test_cmp expect actual diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 2ed0c1544da..fa27df579a5 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -76,7 +76,7 @@ graph_git_behavior 'no graph' full commits/3 commits/1 graph_read_expect() { OPTIONAL="" NUM_CHUNKS=3 - if test ! -z $2 + if test ! 
-z "$2" then OPTIONAL=" $2" NUM_CHUNKS=$((3 + $(echo "$2" | wc -w))) @@ -103,14 +103,14 @@ test_expect_success 'exit with correct error on bad input to --stdin-commits' ' # valid commit and tree OID git rev-parse HEAD HEAD^{tree} >in && git commit-graph write --stdin-commits >commits-in && cat commits-in | git commit-graph write --stdin-commits && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "6" + graph_read_expect "6" "generation_data" ' graph_git_behavior 'graph from commits, commit 8 vs merge 1' full commits/8 merge/1 @@ -297,7 +297,7 @@ test_expect_success 'build graph from commits with append' ' cd "$TRASH_DIRECTORY/full" && git rev-parse merge/3 | git commit-graph write --stdin-commits --append && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "10" "extra_edges" + graph_read_expect "10" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -307,7 +307,7 @@ test_expect_success 'build graph using --reachable' ' cd "$TRASH_DIRECTORY/full" && git commit-graph write --reachable && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -328,7 +328,7 @@ test_expect_success 'write graph in bare repo' ' cd "$TRASH_DIRECTORY/bare" && git commit-graph write && test_path_is_file $baredir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'bare repo with graph, commit 8 vs merge 1' bare commits/8 merge/1 @@ -454,8 +454,9 @@ test_expect_success 'warn on improper hash version' ' test_expect_success 'git commit-graph verify' ' cd "$TRASH_DIRECTORY/full" && - git rev-parse commits/8 | git commit-graph write --stdin-commits && - git commit-graph verify >output + git rev-parse commits/8 | GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git 
commit-graph write --stdin-commits && + git commit-graph verify >output && + graph_read_expect 9 extra_edges ' NUM_COMMITS=9 @@ -741,4 +742,56 @@ test_expect_success 'corrupt commit-graph write (missing tree)' ' ) ' +# We test the overflow-related code with the following repo history: +# +# 4:F - 5:N - 6:U +# / \ +# 1:U - 2:N - 3:U M:N +# \ / +# 7:N - 8:F - 9:N +# +# Here the commits denoted by U have committer date of zero seconds +# since Unix epoch, the commits denoted by N have committer date +# starting from 1112354055 seconds since Unix epoch (default committer +# date for the test suite), and the commits denoted by F have committer +# date of (2 ^ 31 - 2) seconds since Unix epoch. +# +# The largest offset observed is 2 ^ 31, just large enough to overflow. +# + +test_expect_success 'set up and verify repo with generation data overflow chunk' ' + objdir=".git/objects" && + UNIX_EPOCH_ZERO="@0 +0000" && + FUTURE_DATE="@2147483646 +0000" && + test_oid_cache <<-EOF && + oid_version sha1:1 + oid_version sha256:2 + EOF + cd "$TRASH_DIRECTORY" && + mkdir repo && + cd repo && + git init && + test_commit --date "$UNIX_EPOCH_ZERO" 1 && + test_commit 2 && + test_commit --date "$UNIX_EPOCH_ZERO" 3 && + git commit-graph write --reachable && + graph_read_expect 3 generation_data && + test_commit --date "$FUTURE_DATE" 4 && + test_commit 5 && + test_commit --date "$UNIX_EPOCH_ZERO" 6 && + git branch left && + git reset --hard 3 && + test_commit 7 && + test_commit --date "$FUTURE_DATE" 8 && + test_commit 9 && + git branch right && + git reset --hard 3 && + test_merge M left right && + git commit-graph write --reachable && + graph_read_expect 10 "generation_data generation_data_overflow" && + git commit-graph verify +' + +graph_git_behavior 'generation data overflow chunk repo' repo left right + test_done diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 4d3842b83b9..587757b62d9 100755 --- a/t/t5324-split-commit-graph.sh +++ 
b/t/t5324-split-commit-graph.sh @@ -13,11 +13,11 @@ test_expect_success 'setup repo' ' infodir=".git/objects/info" && graphdir="$infodir/commit-graphs" && test_oid_cache <<-EOM - shallow sha1:1760 - shallow sha256:2064 + shallow sha1:2132 + shallow sha256:2436 - base sha1:1376 - base sha256:1496 + base sha1:1408 + base sha256:1528 oid_version sha1:1 oid_version sha256:2 @@ -31,9 +31,9 @@ graph_read_expect() { NUM_BASE=$2 fi cat >expect <<- EOF - header: 43475048 1 $(test_oid oid_version) 3 $NUM_BASE + header: 43475048 1 $(test_oid oid_version) 4 $NUM_BASE num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata + chunks: oid_fanout oid_lookup commit_metadata generation_data EOF test-tool read-graph >output && test_cmp expect output diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index af10f0dc090..e2d33a8a4c4 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -55,6 +55,9 @@ test_expect_success 'setup' ' git show-ref -s commit-5-5 | git commit-graph write --stdin-commits && mv .git/objects/info/commit-graph commit-graph-half && chmod u+w commit-graph-half && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable && + mv .git/objects/info/commit-graph commit-graph-no-gdat && + chmod u+w commit-graph-no-gdat && git config core.commitGraph true ' @@ -67,6 +70,9 @@ run_all_modes () { test_cmp expect actual && cp commit-graph-half .git/objects/info/commit-graph && "$@" actual && + test_cmp expect actual && + cp commit-graph-no-gdat .git/objects/info/commit-graph && + "$@" actual && test_cmp expect actual } diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 999982fe4a9..3ad712c3acc 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -202,6 +202,12 @@ test_commit () { --signoff) signoff="$1" ;; + --date) + notick=yes + GIT_COMMITTER_DATE="$2" + GIT_AUTHOR_DATE="$2" + shift + ;; -C) indir="$2" shift From patchwork Mon Dec 28 11:16:06 2020 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Abhishek Kumar X-Patchwork-Id: 11991095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0B49FC4332E for ; Mon, 28 Dec 2020 11:17:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CAAD8229EF for ; Mon, 28 Dec 2020 11:17:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727358AbgL1LRm (ORCPT ); Mon, 28 Dec 2020 06:17:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727330AbgL1LRe (ORCPT ); Mon, 28 Dec 2020 06:17:34 -0500 Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7FF4CC06179F for ; Mon, 28 Dec 2020 03:16:21 -0800 (PST) Received: by mail-wm1-x332.google.com with SMTP id v14so10093382wml.1 for ; Mon, 28 Dec 2020 03:16:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=s+qADjefv6K6z3JcMzIyR6hH/npnjBVgHftgEbr5pao=; b=AGv4EwN4GPi/U+Ceya0j143QJY9zDbcGocZLbgUyNTEKnPSDU+zk6jcih8n84HyAWb irkK2fwkS+50wEy1ERXFsvPaD7Pj76z3ANoO3mycNxL9YEqKpnUdhR3cSzEF7id2WVaA j+lRSJofBnqAdfIUGu9EfusPvgl9l3EhKnqSb6NHewPQ0ruqZEMPtidel0CjCJQ99ndS 
VIyqVtfpIf3WAF/GMSb5dmZntiDCsGHwNbhOYT0rp94IKtyechkM3zludu00VUswDhJB aOZwBKX0iJLDxK04IQOTjnDWlDRclEVRF+BvVIx/mV8/hPgXb9mrl1KNrI4g9wov03Iu 4icg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=s+qADjefv6K6z3JcMzIyR6hH/npnjBVgHftgEbr5pao=; b=X+rZf+1ejTi4f4g7DJYMmhMHl1D/xcCHolH7WLsB+XWkuuuNIl7PZT3pHc/ymZCp9M DEj2oFDEaUFS9hT9cskL8ec6zRjHQ2rjvZVNkmClbsCKrBSUBfpODKdhUDfA8Rck3Yjs SOshC3PYup8IXN5dm7g4qNhzbqDbe3uBuCqex0E+9Rk5b99jgAzN0RoS2CZR1h0MGCWJ dFOT4A4xIbwGeW6dbjszCLI03rGPKK8DwIPUVPaG2K+4zFHt6d70r/UIVLEd+Uyitj1I BzpGE/oHedlUCQUBCsmj1lSFAEczEc24sBafE0NwQvJ14kTPcHEwaCjuIcDvk5xD1Mje vxYg== X-Gm-Message-State: AOAM532ydHq6oA/xBrgxz/nLOfquUdUkhelqaNcES/oS0Kqzdmmt3aJe Hp7RqSyYtklwAQKa5UFaWajvsxhpfls= X-Google-Smtp-Source: ABdhPJzqUWf99j46/ThP/7Cy8eqfoUJSbM8ZmXDERVda7I6jv6pHOKo5oweZfgtImkJk1kFSwdrVCQ== X-Received: by 2002:a1c:a9c4:: with SMTP id s187mr20085250wme.116.1609154179816; Mon, 28 Dec 2020 03:16:19 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d191sm18340904wmd.24.2020.12.28.03.16.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Dec 2020 03:16:19 -0800 (PST) Message-Id: In-Reply-To: References: Date: Mon, 28 Dec 2020 11:16:06 +0000 Subject: [PATCH v5 09/11] commit-graph: use generation v2 only if entire chain does Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar , Abhishek Kumar Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar From: Abhishek Kumar

Since there are released versions of Git that understand generation numbers in the commit-graph's CDAT chunk but do not understand the GDAT chunk, the following scenario is possible:

1. "New" Git writes a commit-graph with the GDAT chunk.
2. "Old" Git writes a split commit-graph on top without a GDAT chunk.

If each layer of split commit-graph is treated independently, as was the case before this commit, with Git inspecting only the current layer for the chunk_generation_data pointer, commits in the lower layer (one with GDAT) would have corrected commit date as their generation number, while commits in the upper layer would have topological levels as their generation. Corrected commit dates usually have much larger values than topological levels. This means that if we take two commits, one from the upper layer, and one reachable from it in the lower layer, then the expectation that the generation of a parent is smaller than the generation of a child would be violated.

It is difficult to expose this issue in a test. Since we _start_ with artificially low generation numbers, any commit walk that prioritizes generation numbers will walk all of the commits with high generation number before walking the commits with low generation number. In all the cases I tried, the commit-graph layers themselves "protect" any incorrect behavior since none of the commits in the lower layer can reach the commits in the upper layer.

This issue would manifest itself as a performance problem in this case, especially with something like "git log --graph" since the low generation numbers would cause the in-degree queue to walk all of the commits in the lower layer before allowing the topo-order queue to write anything to output (depending on the size of the upper layer).

Therefore, when writing the new layer in split commit-graph, we write a GDAT chunk only if the topmost layer has a GDAT chunk. This guarantees that if a layer has GDAT chunk, all lower layers must have a GDAT chunk as well.
Rewriting layers follows similar approach: if the topmost layer below the set of layers being rewritten (in the split commit-graph chain) exists, and it does not contain GDAT chunk, then the result of rewrite does not have GDAT chunks either. Signed-off-by: Derrick Stolee Signed-off-by: Abhishek Kumar --- commit-graph.c | 29 +++++- commit-graph.h | 1 + t/t5324-split-commit-graph.sh | 181 ++++++++++++++++++++++++++++++++++ 3 files changed, 209 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index 629b2f17fbc..41a65d98738 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -610,6 +610,21 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r, return graph_chain; } +static void validate_mixed_generation_chain(struct commit_graph *g) +{ + int read_generation_data; + + if (!g) + return; + + read_generation_data = !!g->chunk_generation_data; + + while (g) { + g->read_generation_data = read_generation_data; + g = g->base_graph; + } +} + struct commit_graph *read_commit_graph_one(struct repository *r, struct object_directory *odb) { @@ -618,6 +633,8 @@ struct commit_graph *read_commit_graph_one(struct repository *r, if (!g) g = load_commit_graph_chain(r, odb); + validate_mixed_generation_chain(g); + return g; } @@ -787,7 +804,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, date_low = get_be32(commit_data + g->hash_len + 12); item->date = (timestamp_t)((date_high << 32) | date_low); - if (g->chunk_generation_data) { + if (g->read_generation_data) { offset = (timestamp_t)get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); if (offset & CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW) { @@ -2012,6 +2029,13 @@ static void split_graph_merge_strategy(struct write_commit_graph_context *ctx) if (i < ctx->num_commit_graphs_after) ctx->commit_graph_hash_after[i] = xstrdup(oid_to_hex(&g->oid)); + /* + * If the topmost remaining layer has generation data chunk, the + * resultant layer also has 
generation data chunk. + */ + if (i == ctx->num_commit_graphs_after - 2) + ctx->write_generation_data = !!g->chunk_generation_data; + i--; g = g->base_graph; } @@ -2239,6 +2263,7 @@ int write_commit_graph(struct object_directory *odb, struct commit_graph *g = ctx->r->objects->commit_graph; while (g) { + g->read_generation_data = 1; g->topo_levels = &topo_levels; g = g->base_graph; } @@ -2534,7 +2559,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) * also GENERATION_NUMBER_V1_MAX. Decrement to avoid extra logic * in the following condition. */ - if (!g->chunk_generation_data && max_generation == GENERATION_NUMBER_V1_MAX) + if (!g->read_generation_data && max_generation == GENERATION_NUMBER_V1_MAX) max_generation--; generation = commit_graph_generation(graph_commit); diff --git a/commit-graph.h b/commit-graph.h index 19a02001fde..ad52130883b 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -64,6 +64,7 @@ struct commit_graph { struct object_directory *odb; uint32_t num_commits_in_base; + unsigned int read_generation_data; struct commit_graph *base_graph; const uint32_t *chunk_oid_fanout; diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 587757b62d9..8e90f3423b8 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -453,4 +453,185 @@ test_expect_success 'prevent regression for duplicate commits across layers' ' git -C dup commit-graph verify ' +NUM_FIRST_LAYER_COMMITS=64 +NUM_SECOND_LAYER_COMMITS=16 +NUM_THIRD_LAYER_COMMITS=7 +NUM_FOURTH_LAYER_COMMITS=8 +NUM_FIFTH_LAYER_COMMITS=16 +SECOND_LAYER_SEQUENCE_START=$(($NUM_FIRST_LAYER_COMMITS + 1)) +SECOND_LAYER_SEQUENCE_END=$(($SECOND_LAYER_SEQUENCE_START + $NUM_SECOND_LAYER_COMMITS - 1)) +THIRD_LAYER_SEQUENCE_START=$(($SECOND_LAYER_SEQUENCE_END + 1)) +THIRD_LAYER_SEQUENCE_END=$(($THIRD_LAYER_SEQUENCE_START + $NUM_THIRD_LAYER_COMMITS - 1)) +FOURTH_LAYER_SEQUENCE_START=$(($THIRD_LAYER_SEQUENCE_END + 1)) 
+FOURTH_LAYER_SEQUENCE_END=$(($FOURTH_LAYER_SEQUENCE_START + $NUM_FOURTH_LAYER_COMMITS - 1)) +FIFTH_LAYER_SEQUENCE_START=$(($FOURTH_LAYER_SEQUENCE_END + 1)) +FIFTH_LAYER_SEQUENCE_END=$(($FIFTH_LAYER_SEQUENCE_START + $NUM_FIFTH_LAYER_COMMITS - 1)) + +# Current split graph chain: +# +# 16 commits (No GDAT) +# ------------------------ +# 64 commits (GDAT) +# +test_expect_success 'setup repo for mixed generation commit-graph-chain' ' + graphdir=".git/objects/info/commit-graphs" && + test_oid_cache <<-EOF && + oid_version sha1:1 + oid_version sha256:2 + EOF + git init mixed && + ( + cd mixed && + git config core.commitGraph true && + git config gc.writeCommitGraph false && + for i in $(test_seq $NUM_FIRST_LAYER_COMMITS) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git commit-graph write --reachable --split && + graph_read_expect $NUM_FIRST_LAYER_COMMITS && + test_line_count = 1 $graphdir/commit-graph-chain && + for i in $(test_seq $SECOND_LAYER_SEQUENCE_START $SECOND_LAYER_SEQUENCE_END) + do + test_commit $i && + git branch commits/$i || return 1 + done && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable --split=no-merge && + test_line_count = 2 $graphdir/commit-graph-chain && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 $(test_oid oid_version) 4 1 + num_commits: $NUM_SECOND_LAYER_COMMITS + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output && + git commit-graph verify && + cat $graphdir/commit-graph-chain + ) +' + +# The new layer will be added without generation data chunk as it was not +# present on the layer underneath it. 
+# +# 7 commits (No GDAT) +# ------------------------ +# 16 commits (No GDAT) +# ------------------------ +# 64 commits (GDAT) +# +test_expect_success 'do not write generation data chunk if not present on existing tip' ' + git clone mixed mixed-no-gdat && + ( + cd mixed-no-gdat && + for i in $(test_seq $THIRD_LAYER_SEQUENCE_START $THIRD_LAYER_SEQUENCE_END) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git commit-graph write --reachable --split=no-merge && + test_line_count = 3 $graphdir/commit-graph-chain && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 $(test_oid oid_version) 4 2 + num_commits: $NUM_THIRD_LAYER_COMMITS + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output && + git commit-graph verify + ) +' + +# Number of commits in each layer of the split-commit graph before merge: +# +# 8 commits (No GDAT) +# ------------------------ +# 7 commits (No GDAT) +# ------------------------ +# 16 commits (No GDAT) +# ------------------------ +# 64 commits (GDAT) +# +# The top two layers are merged and do not have generation data chunk as layer below them does +# not have generation data chunk. 
+# +# 15 commits (No GDAT) +# ------------------------ +# 16 commits (No GDAT) +# ------------------------ +# 64 commits (GDAT) +# +test_expect_success 'do not write generation data chunk if the topmost remaining layer does not have generation data chunk' ' + git clone mixed-no-gdat mixed-merge-no-gdat && + ( + cd mixed-merge-no-gdat && + for i in $(test_seq $FOURTH_LAYER_SEQUENCE_START $FOURTH_LAYER_SEQUENCE_END) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git commit-graph write --reachable --split --size-multiple 1 && + test_line_count = 3 $graphdir/commit-graph-chain && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 $(test_oid oid_version) 4 2 + num_commits: $(($NUM_THIRD_LAYER_COMMITS + $NUM_FOURTH_LAYER_COMMITS)) + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output && + git commit-graph verify + ) +' + +# Number of commits in each layer of the split-commit graph before merge: +# +# 16 commits (No GDAT) +# ------------------------ +# 15 commits (No GDAT) +# ------------------------ +# 16 commits (No GDAT) +# ------------------------ +# 64 commits (GDAT) +# +# The top three layers are merged and has generation data chunk as the topmost remaining layer +# has generation data chunk. 
+#
+# 47 commits (GDAT)
+# ------------------------
+# 64 commits (GDAT)
+#
+test_expect_success 'write generation data chunk if topmost remaining layer has generation data chunk' '
+	git clone mixed-merge-no-gdat mixed-merge-gdat &&
+	(
+		cd mixed-merge-gdat &&
+		for i in $(test_seq $FIFTH_LAYER_SEQUENCE_START $FIFTH_LAYER_SEQUENCE_END)
+		do
+			test_commit $i &&
+			git branch commits/$i || return 1
+		done &&
+		git commit-graph write --reachable --split --size-multiple 1 &&
+		test_line_count = 2 $graphdir/commit-graph-chain &&
+		test-tool read-graph >output &&
+		cat >expect <<-EOF &&
+		header: 43475048 1 $(test_oid oid_version) 5 1
+		num_commits: $(($NUM_SECOND_LAYER_COMMITS + $NUM_THIRD_LAYER_COMMITS + $NUM_FOURTH_LAYER_COMMITS + $NUM_FIFTH_LAYER_COMMITS))
+		chunks: oid_fanout oid_lookup commit_metadata generation_data
+		EOF
+		test_cmp expect output
+	)
+'
+
+test_expect_success 'write generation data chunk when commit-graph chain is replaced' '
+	git clone mixed mixed-replace &&
+	(
+		cd mixed-replace &&
+		git commit-graph write --reachable --split=replace &&
+		test_path_is_file $graphdir/commit-graph-chain &&
+		test_line_count = 1 $graphdir/commit-graph-chain &&
+		verify_chain_files_exist $graphdir &&
+		graph_read_expect $(($NUM_FIRST_LAYER_COMMITS + $NUM_SECOND_LAYER_COMMITS)) &&
+		git commit-graph verify
+	)
+'
+
 test_done

From patchwork Mon Dec 28 11:16:07 2020
X-Patchwork-Submitter: Abhishek Kumar
X-Patchwork-Id: 11991089
Message-Id: <093101f908b166099af41d99250cf8e79d921740.1609154169.git.gitgitgadget@gmail.com>
Date: Mon, 28 Dec 2020 11:16:07 +0000
Subject: [PATCH v5 10/11] commit-reach: use corrected commit dates in paint_down_to_common()
To: git@vger.kernel.org
Cc: Derrick Stolee, Jakub Narębski, Taylor Blau, Abhishek Kumar
From: Abhishek Kumar

091f4cf (commit: don't use generation numbers if not needed,
2018-08-30) changed paint_down_to_common() to use commit dates instead
of generation numbers v1 (topological levels) as the performance
regressed on certain topologies. With generation number v2 (corrected
commit dates) implemented, we no longer have to rely on commit dates and
can use generation numbers.

For example, the command `git merge-base v4.8 v4.9` on the Linux
repository walks 167468 commits, taking 0.135s for committer date and
167496 commits, taking 0.157s for corrected committer date respectively.
Although Git walks nearly the same number of commits with corrected
commit dates as with commit dates, the process is slower because each
comparison has to access a commit-slab (for the corrected committer
date) instead of a struct member (for the committer date).

This change incidentally broke the fragile t6404-recursive-merge test.
t6404-recursive-merge sets up a unique repository where all commits
have the same committer date without a well-defined merge-base.

While running tests with GIT_TEST_COMMIT_GRAPH unset, we use committer
date as a heuristic in paint_down_to_common(). 6404.1 'combined merge
conflicts' merges commits in the order:
- Merge C with B to form an intermediate commit.
- Merge the intermediate commit with A.

With GIT_TEST_COMMIT_GRAPH=1, we write a commit-graph and subsequently
use the corrected committer date, which changes the order in which
commits are merged:
- Merge A with B to form an intermediate commit.
- Merge the intermediate commit with C.

While the resulting repositories are equivalent, 6404.4 'virtual trees
were processed' fails with GIT_TEST_COMMIT_GRAPH=1 as we are selecting
different merge-bases and thus have different object ids for the
intermediate commits.

As this has already caused problems (as noted in 859fdc0 (commit-graph:
define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph
within t6404-recursive-merge.
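To see why identical committer dates stop being a tie once corrected
committer dates are in use, here is a minimal sketch of the recursive
definition (maximum of the commit's own committer date and one more than
the largest corrected date among its parents). The toy_commit struct and
toy_compute_corrected_dates() are hypothetical names invented for this
illustration; they are not Git's data structures or API, and the sketch
assumes the commits are supplied parents-first.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_PARENTS 2

/* Hypothetical toy commit for illustration; this is not Git's struct commit. */
struct toy_commit {
	uint64_t date;                /* committer date */
	int parents[TOY_MAX_PARENTS]; /* indices into the array, -1 if unused */
	uint64_t corrected;           /* corrected committer date (output) */
};

/*
 * Fill in corrected committer dates. Commits must be listed parents-first,
 * so every parent's value is already computed when its child is visited.
 */
void toy_compute_corrected_dates(struct toy_commit *commits, int nr)
{
	for (int i = 0; i < nr; i++) {
		uint64_t corrected = commits[i].date;

		for (int p = 0; p < TOY_MAX_PARENTS; p++) {
			int parent = commits[i].parents[p];

			if (parent < 0)
				continue;
			assert(parent < i); /* parents-first ordering */
			if (commits[parent].corrected + 1 > corrected)
				corrected = commits[parent].corrected + 1;
		}

		/* A root with timestamp zero gets 1, keeping 0 for "not computed". */
		if (!corrected)
			corrected = 1;
		commits[i].corrected = corrected;
	}
}
```

With four commits that all share one committer date (a root, two
children of the root, and a merge of the two children), the committer
dates compare equal everywhere, while the corrected dates grow by one per
generation. A priority queue keyed on corrected dates therefore visits
such a history in a different order than one keyed on committer dates.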
Signed-off-by: Abhishek Kumar
---
 commit-graph.c             | 14 ++++++++++++++
 commit-graph.h             |  6 ++++++
 commit-reach.c             |  2 +-
 t/t6404-recursive-merge.sh |  5 ++++-
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index 41a65d98738..c8d7ed13302 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -710,6 +710,20 @@ int generation_numbers_enabled(struct repository *r)
 	return !!first_generation;
 }
 
+int corrected_commit_dates_enabled(struct repository *r)
+{
+	struct commit_graph *g;
+	if (!prepare_commit_graph(r))
+		return 0;
+
+	g = r->objects->commit_graph;
+
+	if (!g->num_commits)
+		return 0;
+
+	return g->read_generation_data;
+}
+
 struct bloom_filter_settings *get_bloom_filter_settings(struct repository *r)
 {
 	struct commit_graph *g = r->objects->commit_graph;
diff --git a/commit-graph.h b/commit-graph.h
index ad52130883b..97f3497c279 100644
--- a/commit-graph.h
+++ b/commit-graph.h
@@ -95,6 +95,12 @@ struct commit_graph *parse_commit_graph(struct repository *r,
  */
 int generation_numbers_enabled(struct repository *r);
 
+/*
+ * Return 1 if and only if the repository has a commit-graph
+ * file and generation data chunk has been written for the file.
+ */
+int corrected_commit_dates_enabled(struct repository *r);
+
 struct bloom_filter_settings *get_bloom_filter_settings(struct repository *r);
 
 enum commit_graph_write_flags {
diff --git a/commit-reach.c b/commit-reach.c
index 9b24b0378d5..e38771ca5a1 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -39,7 +39,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 
-	if (!min_generation)
+	if (!min_generation && !corrected_commit_dates_enabled(r))
 		queue.compare = compare_commits_by_commit_date;
 
 	one->object.flags |= PARENT1;
diff --git a/t/t6404-recursive-merge.sh b/t/t6404-recursive-merge.sh
index b1c3d4dda49..86f74ae5847 100755
--- a/t/t6404-recursive-merge.sh
+++ b/t/t6404-recursive-merge.sh
@@ -15,6 +15,8 @@ GIT_COMMITTER_DATE="2006-12-12 23:28:00 +0100"
 export GIT_COMMITTER_DATE
 
 test_expect_success 'setup tests' '
+	GIT_TEST_COMMIT_GRAPH=0 &&
+	export GIT_TEST_COMMIT_GRAPH &&
 	echo 1 >a1 &&
 	git add a1 &&
 	GIT_AUTHOR_DATE="2006-12-12 23:00:00" git commit -m 1 a1 &&
@@ -66,7 +68,7 @@ test_expect_success 'setup tests' '
 '
 
 test_expect_success 'combined merge conflicts' '
-	test_must_fail env GIT_TEST_COMMIT_GRAPH=0 git merge -m final G
+	test_must_fail git merge -m final G
 '
 
 test_expect_success 'result contains a conflict' '
@@ -82,6 +84,7 @@ test_expect_success 'result contains a conflict' '
 '
 
 test_expect_success 'virtual trees were processed' '
+	# TODO: fragile test, relies on ambiguous merge-base resolution
 	git ls-files --stage >out &&
 
 	cat >expect <<-EOF &&

From patchwork Mon Dec 28 11:16:08 2020
X-Patchwork-Submitter: Abhishek Kumar
X-Patchwork-Id: 11991093
Message-Id: <20299e574574690ba11961d493aad378b804e5b5.1609154169.git.gitgitgadget@gmail.com>
Date: Mon, 28 Dec 2020 11:16:08 +0000
Subject: [PATCH v5 11/11] doc: add corrected commit date info
To: git@vger.kernel.org
Cc: Derrick Stolee, Jakub Narębski, Taylor Blau, Abhishek Kumar
From: Abhishek Kumar

With generation data chunk and corrected commit dates implemented, let's
update the technical documentation for commit-graph.
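As a rough illustration of the Generation Data layout that the
documentation change below describes, the following sketch packs an
offset into a 4-byte value and spills it into an overflow list when it
does not fit in 31 bits. All names here (toy_overflow,
toy_encode_offset(), toy_decode_offset(), the fixed capacity of 16) are
hypothetical and chosen for this example only; this is not the code Git
uses to write or read the chunk. It also assumes that an "offset" means
the corrected commit date minus the committer date.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_OFFSET_MAX   0x7FFFFFFFu /* largest offset that fits in 31 bits */
#define TOY_OVERFLOW_BIT 0x80000000u /* most-significant bit of the 4-byte value */
#define TOY_OVERFLOW_CAP 16          /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for the Generation Data Overflow chunk. */
struct toy_overflow {
	uint64_t values[TOY_OVERFLOW_CAP];
	uint32_t nr;
};

/* Return the 4-byte value to store for one commit's offset. */
uint32_t toy_encode_offset(uint64_t offset, struct toy_overflow *ov)
{
	if (offset <= TOY_OFFSET_MAX)
		return (uint32_t)offset;

	/* Too large: store it in the overflow list and record its position. */
	assert(ov->nr < TOY_OVERFLOW_CAP);
	ov->values[ov->nr] = offset;
	return TOY_OVERFLOW_BIT | ov->nr++;
}

/* Recover the offset from a stored 4-byte value. */
uint64_t toy_decode_offset(uint32_t stored, const struct toy_overflow *ov)
{
	if (stored & TOY_OVERFLOW_BIT)
		return ov->values[stored & ~TOY_OVERFLOW_BIT];
	return stored;
}
```

A small offset round-trips unchanged, while an offset such as 2^32 is
stored as the overflow bit plus its index in the overflow list and is
read back from there.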
Signed-off-by: Abhishek Kumar
---
 .../technical/commit-graph-format.txt    | 28 +++++--
 Documentation/technical/commit-graph.txt | 77 +++++++++++++++----
 2 files changed, 86 insertions(+), 19 deletions(-)

diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
index b3b58880b92..b6658eff188 100644
--- a/Documentation/technical/commit-graph-format.txt
+++ b/Documentation/technical/commit-graph-format.txt
@@ -4,11 +4,7 @@ Git commit graph format
 The Git commit graph stores a list of commit OIDs and some associated
 metadata, including:
 
-- The generation number of the commit. Commits with no parents have
-  generation number 1; commits with parents have generation number
-  one more than the maximum generation number of its parents. We
-  reserve zero as special, and can be used to mark a generation
-  number invalid or as "not computed".
+- The generation number of the commit.
 
 - The root tree OID.
 
@@ -86,13 +82,33 @@ CHUNK DATA:
       position. If there are more than two parents, the second value
       has its most-significant bit on and the other bits store an array
       position into the Extra Edge List chunk.
-    * The next 8 bytes store the generation number of the commit and
+    * The next 8 bytes store the topological level (generation number v1)
+      of the commit and
       the commit time in seconds since EPOCH. The generation number
       uses the higher 30 bits of the first 4 bytes, while the commit
       time uses the 32 bits of the second 4 bytes, along with the lowest
       2 bits of the lowest byte, storing the 33rd and 34th bit of the
       commit time.
 
+  Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional]
+    * This list of 4-byte values stores corrected commit date offsets for the
+      commits, arranged in the same order as commit data chunk.
+    * If the corrected commit date offset cannot be stored within 31 bits,
+      the value has its most-significant bit on and the other bits store
+      the position of corrected commit date into the Generation Data Overflow
+      chunk.
+    * Generation Data chunk is present only when commit-graph file is written
+      by compatible versions of Git and in case of split commit-graph chains,
+      the topmost layer also has Generation Data chunk.
+
+  Generation Data Overflow (ID: {'G', 'D', 'O', 'V' }) [Optional]
+    * This list of 8-byte values stores the corrected commit date offsets
+      for commits with corrected commit date offsets that cannot be
+      stored within 31 bits.
+    * Generation Data Overflow chunk is present only when Generation Data
+      chunk is present and at least one corrected commit date offset cannot
+      be stored within 31 bits.
+
   Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
       This list of 4-byte values store the second through nth parents for
       all octopus merges. The second parent value in the commit data stores
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
index f14a7659aa8..f05e7bda1a9 100644
--- a/Documentation/technical/commit-graph.txt
+++ b/Documentation/technical/commit-graph.txt
@@ -38,14 +38,31 @@ A consumer may load the following info for a commit from the graph:
 
 Values 1-4 satisfy the requirements of parse_commit_gently().
 
-Define the "generation number" of a commit recursively as follows:
+There are two definitions of generation number:
+1. Corrected committer dates (generation number v2)
+2. Topological levels (generation number v1)
 
- * A commit with no parents (a root commit) has generation number one.
+Define "corrected committer date" of a commit recursively as follows:
 
- * A commit with at least one parent has generation number one more than
-   the largest generation number among its parents.
+ * A commit with no parents (a root commit) has corrected committer date
+   equal to its committer date.
 
-Equivalently, the generation number of a commit A is one more than the
+ * A commit with at least one parent has corrected committer date equal to
+   the maximum of its committer date and one more than the largest corrected
+   committer date among its parents.
+
+ * As a special case, a root commit with timestamp zero has corrected commit
+   date of 1, to be able to distinguish it from GENERATION_NUMBER_ZERO
+   (that is, an uncomputed corrected commit date).
+
+Define the "topological level" of a commit recursively as follows:
+
+ * A commit with no parents (a root commit) has topological level of one.
+
+ * A commit with at least one parent has topological level one more than
+   the largest topological level among its parents.
+
+Equivalently, the topological level of a commit A is one more than the
 length of a longest path from A to a root commit. The recursive definition
 is easier to use for computation and observing the following property:
 
@@ -60,6 +77,9 @@ is easier to use for computation and observing the following property:
     generation numbers, then we always expand the boundary commit with highest
     generation number and can easily detect the stopping condition.
 
+The property applies to both versions of generation number, that is both
+corrected committer dates and topological levels.
+
 This property can be used to significantly reduce the time it takes to
 walk commits and determine topological relationships. Without generation
 numbers, the general heuristic is the following:
@@ -67,7 +87,9 @@ numbers, the general heuristic is the following:
     If A and B are commits with commit time X and Y, respectively, and
     X < Y, then A _probably_ cannot reach B.
 
-This heuristic is currently used whenever the computation is allowed to
+In absence of corrected commit dates (for example, old versions of Git or
+mixed generation graph chains),
+this heuristic is currently used whenever the computation is allowed to
 violate topological relationships due to clock skew (such as "git log"
 with default order), but is not used when the topological order is
 required (such as merge base calculations, "git log --graph").
@@ -77,7 +99,7 @@ in the commit graph. We can treat these commits as having "infinite"
 generation number and walk until reaching commits with known generation
 number.
 
-We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not
+We use the macro GENERATION_NUMBER_INFINITY to mark commits not
 in the commit-graph file. If a commit-graph file was written by a version
 of Git that did not compute generation numbers, then those commits will
 have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
@@ -93,12 +115,12 @@ fully-computed generation numbers. Using strict inequality may result in
 walking a few extra commits, but the simplicity in dealing with commits
 with generation number *_INFINITY or *_ZERO is valuable.
 
-We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose
-generation numbers are computed to be at least this value. We limit at
-this value since it is the largest value that can be stored in the
-commit-graph file using the 30 bits available to generation numbers. This
-presents another case where a commit can have generation number equal to
-that of a parent.
+We use the macro GENERATION_NUMBER_V1_MAX = 0x3FFFFFFF for commits whose
+topological levels (generation number v1) are computed to be at least
+this value. We limit at this value since it is the largest value that
+can be stored in the commit-graph file using the 30 bits available
+to topological levels. This presents another case where a commit can
+have generation number equal to that of a parent.
 
 Design Details
 --------------
@@ -267,6 +289,35 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum
 number of commits) could be extracted into config settings for full
 flexibility.
 
+## Handling Mixed Generation Number Chains
+
+With the introduction of generation number v2 and generation data chunk, the
+following scenario is possible:
+
+1. "New" Git writes a commit-graph with the corrected commit dates.
+2. "Old" Git writes a split commit-graph on top without corrected commit dates.
+
+A naive approach of using the newest available generation number from
+each layer would lead to violated expectations: the lower layer would
+use corrected commit dates which are much larger than the topological
+levels of the higher layer. For this reason, Git inspects the topmost
+layer to see if the layer is missing corrected commit dates. In such a case
+Git only uses topological level for generation numbers.
+
+When writing a new layer in split commit-graph, we write corrected commit
+dates if the topmost layer has corrected commit dates written. This
+guarantees that if a layer has corrected commit dates, all lower layers
+must have corrected commit dates as well.
+
+When merging layers, we do not consider whether the merged layers had corrected
+commit dates. Instead, the new layer will have corrected commit dates if the
+layer below the new layer has corrected commit dates.
+
+While writing or merging layers, if the new layer is the only layer, it will
+have corrected commit dates when written by compatible versions of Git. Thus,
+rewriting split commit-graph as a single file (`--split=replace`) creates a
+single layer with corrected commit dates.
+
 ## Deleting graph-{hash} files
 
 After a new tip file is written, some `graph-{hash}` files may no longer