From patchwork Sat Aug 15 16:39:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715665 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9C079913 for ; Sat, 15 Aug 2020 21:52:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 846C12053B for ; Sat, 15 Aug 2020 21:52:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="h2xlFNBs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728998AbgHOVwO (ORCPT ); Sat, 15 Aug 2020 17:52:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729000AbgHOVwI (ORCPT ); Sat, 15 Aug 2020 17:52:08 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 60045C0A3BEC for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id r4so10920717wrx.9 for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=8HfzOFn9EHvhlcJ4nPceB7zs4j6XWa23+RSfvwPSb0Y=; b=h2xlFNBsPnsXOhdli9NBDo4eATEjiUJqeadAZOv5y5WjB6H0RpaUmzhgpjIm4hx9Sj d0wiLR/lPUjGc8t2fquVn/pjWym5jc+I9bNVMrlmzGLdK8aAP8BprXfZI5BXwFr0c4LC bC2A76+Xq1jWQVPwvznTYDNABnxLAa/vKE+r1dwiWIwXWKyLl9SeO6vz5jmG1YOx7ea/ WZrXcxBMiiQDs4zyOlfbC5/nY2qxC7GVTi+JfwtwBOf3QQVIphoiEFXnr1TBNuag78D+ o/1ldupTb6eQ5ufhXhMFOEXCxL1S5lAbaTfFZvEQmbqC2UbOVUTwNYhkpAafrOw369bK 4w5Q== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=8HfzOFn9EHvhlcJ4nPceB7zs4j6XWa23+RSfvwPSb0Y=; b=pFPgeIX0MbpYHRggymmKp3jNnRloff3aq2yBZ7ij3jQcoY6REFYvpQY6q8Brx8QnjN Yr3TvbXpA/BP73ePxcy5Jl1SDeNfEIdJNbOG/MqnuALqNl0jDpDoJ8/1DVobZIJjoxzG 8j3O1HAG4TeoksPZtA2nLxQ0OAFNkAdYApd8Bw5OFaN16RcytzvOozIuZGfjmdauB9JW teXpEM7biCP/2Rq4tdBngj0NOQ4YAEUOojcgSCx2zWxkZyjUIdGLkRlb5nwI14DJKwtB ApPWIzBkOxCssyFVU3ZgqVSc3HNg2a4Sl7k5gzLsfdKD9OQ/RQy2ZAsIOqm7wZtfMPVu VtaA== X-Gm-Message-State: AOAM531RR9PGHo3HE9ySu22oA+R+kK4baT3gUMxHKLEC6zI7JgGoM/Zm r5LQlV60566JvZhW3g7o2Ucs9GvIluU= X-Google-Smtp-Source: ABdhPJxPoyNQ7ysTjuWUJiabZ2jePIRZHyKll9QXFnKlEw26WywMGh+h5bvrOzR4Yz6Z3WBmVb5TwA== X-Received: by 2002:a5d:514e:: with SMTP id u14mr7395784wrt.20.1597509585811; Sat, 15 Aug 2020 09:39:45 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l1sm24124280wrb.12.2020.08.15.09.39.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:45 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:33 +0000 Subject: [PATCH v3 01/11] commit-graph: fix regression when computing bloom filter Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar commit_gen_cmp is used when writing a commit-graph to sort commits in generation order before computing Bloom filters. Since c49c82aa (commit: move members graph_pos, generation to a slab, 2020-06-17) made it so that 'commit_graph_generation()' returns 'GENERATION_NUMBER_INFINITY' during writing, we cannot call it within this function. 
Instead, access the generation number directly through the slab (i.e., by calling 'commit_graph_data_at(c)->generation') in order to access it while writing. Signed-off-by: Abhishek Kumar --- commit-graph.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index e51c91dd5b..ace7400a1a 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -144,8 +144,8 @@ static int commit_gen_cmp(const void *va, const void *vb) const struct commit *a = *(const struct commit **)va; const struct commit *b = *(const struct commit **)vb; - uint32_t generation_a = commit_graph_generation(a); - uint32_t generation_b = commit_graph_generation(b); + uint32_t generation_a = commit_graph_data_at(a)->generation; + uint32_t generation_b = commit_graph_data_at(b)->generation; /* lower generation commits first */ if (generation_a < generation_b) return -1; From patchwork Sat Aug 15 16:39:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715817 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F3BE3109B for ; Sat, 15 Aug 2020 22:05:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D951B20639 for ; Sat, 15 Aug 2020 22:05:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="XNbZBnYs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729430AbgHOWFP (ORCPT ); Sat, 15 Aug 2020 18:05:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45610 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728443AbgHOVux (ORCPT ); Sat, 15 Aug 2020 17:50:53 -0400 Received: from mail-wr1-x444.google.com 
(mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85E26C0A3BEE for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Received: by mail-wr1-x444.google.com with SMTP id c15so10915578wrs.11 for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=gEltWc4E8IF32606ynbvm1ztlj4ryx+s7moKETG0/go=; b=XNbZBnYszcZWxjF6coIeKTOu2Z/9NJrdZciUcMCjGGx03qa9jXxdGrvUmb99E6W+sH kr7RiyIQYjeDcSW1W81GAOcAxw8+cjTdADaSUD4eZzQvg4c5ar6bap0+/WazpJ0Kdwq0 v4jcSEnp5PVERqGif/B7JiHN/VCd2Bq9X1t99CURXb5WqkEwsPChQmWF5W0pn03o0Mwx uo1wLiqyEwhjUuH9QIh+W9rynrcvA0LtVReo2IoUzI1PRPsnS5XNMSKTdNiN4aNsVEyB SxUjxMtJFoQvUuW6YI5+ACmSP3OaMCgFdLbLuNo5+XAC/mtjBk3eNa5Fk5F1Os3HxZCr jOAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=gEltWc4E8IF32606ynbvm1ztlj4ryx+s7moKETG0/go=; b=JDSmMhCmHzUutAZ8Ev7tZtIgDjm0ZL1JNcwSmNe910erexz+8+StEeTmBwGsRVGRwj De5zaV8VVv1jE3sscp72sMKyaMxN0aE/ZuBYCZz2DMzTxafEpD40e/MeYT+gOpXNsR/5 krOcnMPwfCGd6gV+JJyVugFuC/1v+OxioB7TBbgnqiF/OeoiZhnvbmAdWBo1hEu57Oy0 XxH/lQxR9/EfNyl803ibRAn3wKxB2VZw6MabV9veO6X5cJiLVP5Je4gnNflCDlZvKTOI fM7V0yDy+Pq4udxC9kUZCmvon/z4ddL2iUnHM7qISM6qR2uDq7bJ48HXz2GA9ZVp97IV VrWQ== X-Gm-Message-State: AOAM533lAiQ45NG3yijb4jHH5U20T3C0ex97To5woPUBukzqYHG2fw62 /qV/F+Z2tJ7FQuyyTSYrOOlDE8QedFE= X-Google-Smtp-Source: ABdhPJzz1+oxppibR8waSg6rY4WClNOsXNSvGe4vYM2GcrsJgFqweyJ3wmPPFjnRT/FUhKEMfR/1kg== X-Received: by 2002:adf:f486:: with SMTP id l6mr7545953wro.265.1597509586929; Sat, 15 Aug 2020 09:39:46 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f63sm21468993wmf.9.2020.08.15.09.39.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 
2020 09:39:46 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:34 +0000 Subject: [PATCH v3 02/11] revision: parse parent in indegree_walk_step() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar In indegree_walk_step(), we add unvisited parents to the indegree queue. However, parents are not guaranteed to be parsed. As the indegree queue sorts by generation number, let's parse parents before inserting them to ensure the correct priority order. Signed-off-by: Abhishek Kumar --- revision.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/revision.c b/revision.c index 3dcf689341..ecf757c327 100644 --- a/revision.c +++ b/revision.c @@ -3363,6 +3363,9 @@ static void indegree_walk_step(struct rev_info *revs) struct commit *parent = p->item; int *pi = indegree_slab_at(&info->indegree, parent); + if (parse_commit_gently(parent, 1) < 0) + return; + if (*pi) (*pi)++; else From patchwork Sat Aug 15 16:39:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715809 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C37C8109B for ; Sat, 15 Aug 2020 22:05:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A3CF62053B for ; Sat, 15 Aug 2020 22:05:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="FkwgPo11" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729876AbgHOWFE (ORCPT ); Sat, 15 Aug 
2020 18:05:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728455AbgHOVux (ORCPT ); Sat, 15 Aug 2020 17:50:53 -0400 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DEEBC0A3BEF for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Received: by mail-wm1-x344.google.com with SMTP id k8so10428109wma.2 for ; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=6AVQeslCBa1mSfJ9qfMqVwa16RSD2qGpgXX3ZnaVk/A=; b=FkwgPo11Bc0VQFZOBv7a8d4PXg7xUAJolRUW4vVjUwJ44Ec9hrpFqsuwDz3LC10KdW 3CE25Kr2Z2MrcgomH3+adxUBsZotC3aYL3r2kBBZPvZThWCa7pqsA1pYYAFTPcUrh+wp Qd+/0vhuL/bpKAZgqbRTIblehVn3feEAlT4ub3svU2xjyRWSDdypmmFCQap9OT7cZM6I 6BHKCStxeoL0NnG2GOtcN5omsfvKds9C2AMS0ANtXDJ4INWV7MI5YdcbSUoL7R6fNRUH uFerv05GAfcKXCTp3wY2RJfYoUfujuRpWEsHgZvo7GYCDmW4J2keYmd71y4KNxLfl9Qv 2isg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=6AVQeslCBa1mSfJ9qfMqVwa16RSD2qGpgXX3ZnaVk/A=; b=Tuyx+U/zAHNKtHlEHBfV/fjaPMNUpcrVxBLl2m3akw/fOEi+EfpbuI6QB9eWRBEM1g zFQ30NrIqBVaumDJsrZCK2NrkPSZXiKGxrdDBjMOa98UVsLrSEyubDGR8jhdzVSR1bBS 61Q0XnwO5f9UdBeTUJ51QtY+tkDWQDtKpW4JRc+JCB50rKGCFwTBs13xoqJfGa9n5Grh YDFu01adTB1a4Im6IJncsCdyYzV8ldhAAWv4+aV9DrqroBs6/53Xdot4UpzvAHSy2NsW 9h1cKTb5qbdWhiDoxIV236x6hcc5TaZHK2lTtQsYu8dLeye5tA4M2ZFJmiS8W+aDg6uI znyA== X-Gm-Message-State: AOAM530nJYUh8N6aULzXOlRACwIO0ejYddtpNcsJCJ16yvu1Og/nGM1O BntD23JSoF7Y5Tf77N9FXFkagwPWc4A= X-Google-Smtp-Source: ABdhPJzYmkweA7j1E4fQFrh9Zj64Aarif4Qy/wGrbeXvgH3kpXR0Sh6NyTTUxwQk41lCaEx1fFbHoA== X-Received: by 2002:a1c:32c3:: 
with SMTP id y186mr7331105wmy.15.1597509587646; Sat, 15 Aug 2020 09:39:47 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t189sm23251642wmf.47.2020.08.15.09.39.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:47 -0700 (PDT) Message-Id: <18d5864f81e89585cc94cd12eca166a9d8b929a5.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:35 +0000 Subject: [PATCH v3 03/11] commit-graph: consolidate fill_commit_graph_info Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar Both fill_commit_graph_info() and fill_commit_in_graph() parse information present in the commit data chunk. Let's simplify the implementation by calling fill_commit_graph_info() within fill_commit_in_graph(). The test 'generate tar with future mtime' creates a commit with commit time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into the generation number (within the CDAT chunk) and has undefined behavior. The test used to pass as fill_commit_in_graph() guarantees the values of graph position and generation number, and did not load the timestamp. However, with corrected commit dates we will need to load the timestamp as well to populate the generation number. Let's fix the test by setting a timestamp of (2 ^ 34 - 1) seconds. 
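The 34-bit limit follows from the decoding arithmetic in fill_commit_graph_info(): the low 2 bits of one 32-bit word hold date bits 33..32, the next word holds date bits 31..0, and the remaining 30 bits hold the generation number. Below is a minimal sketch of that layout, assuming only what the decoding code in this patch shows. The names cdat_words, pack_words(), unpack_date(), unpack_generation() and date_fits() are hypothetical and not part of Git, and the sketch says nothing about how Git's writer treats an out-of-range date.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two 32-bit words decoded in the patch. */
struct cdat_words {
	uint32_t gen_and_date_high; /* (generation << 2) | date bits 33..32 */
	uint32_t date_low;          /* date bits 31..0 */
};

#define CDAT_DATE_BITS 34
#define CDAT_DATE_MAX ((UINT64_C(1) << CDAT_DATE_BITS) - 1)

/* Mirrors the date_high/date_low arithmetic in fill_commit_graph_info(). */
static uint64_t unpack_date(struct cdat_words w)
{
	uint64_t date_high = w.gen_and_date_high & 0x3;
	uint64_t date_low = w.date_low;

	return (date_high << 32) | date_low;
}

/* Mirrors the ">> 2" used to extract the generation number. */
static uint32_t unpack_generation(struct cdat_words w)
{
	return w.gen_and_date_high >> 2;
}

/* Packs values that are already known to fit their fields. */
static struct cdat_words pack_words(uint32_t generation, uint64_t date)
{
	struct cdat_words w;

	assert(generation <= 0x3FFFFFFF); /* 30-bit field */
	assert(date <= CDAT_DATE_MAX);    /* 34-bit field */
	w.gen_and_date_high = (generation << 2) | (uint32_t)(date >> 32);
	w.date_low = (uint32_t)(date & 0xFFFFFFFF);
	return w;
}

/* Returns nonzero when a timestamp fits in the 34-bit date field. */
static int date_fits(uint64_t date)
{
	return date <= CDAT_DATE_MAX;
}
```

With these helpers, pack_words(5, 17179869183) round-trips through unpack_date() and unpack_generation(), while date_fits(68719476737) reports that the old test value (2 ^ 36 + 1) does not fit.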
Signed-off-by: Abhishek Kumar --- commit-graph.c | 29 +++++++++++------------------ t/t5000-tar-tree.sh | 4 ++-- 2 files changed, 13 insertions(+), 20 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index ace7400a1a..af8d9cc45e 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -725,15 +725,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, const unsigned char *commit_data; struct commit_graph_data *graph_data; uint32_t lex_index; + uint64_t date_high, date_low; while (pos < g->num_commits_in_base) g = g->base_graph; + if (pos >= g->num_commits + g->num_commits_in_base) + die(_("invalid commit position. commit-graph is likely corrupt")); + lex_index = pos - g->num_commits_in_base; commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; graph_data = commit_graph_data_at(item); graph_data->graph_pos = pos; + + date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; + date_low = get_be32(commit_data + g->hash_len + 12); + item->date = (timestamp_t)((date_high << 32) | date_low); + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; } @@ -748,38 +757,22 @@ static int fill_commit_in_graph(struct repository *r, { uint32_t edge_value; uint32_t *parent_data_ptr; - uint64_t date_low, date_high; struct commit_list **pptr; - struct commit_graph_data *graph_data; const unsigned char *commit_data; uint32_t lex_index; while (pos < g->num_commits_in_base) g = g->base_graph; - if (pos >= g->num_commits + g->num_commits_in_base) - die(_("invalid commit position. commit-graph is likely corrupt")); + fill_commit_graph_info(item, g, pos); - /* - * Store the "full" position, but then use the - * "local" position for the rest of the calculation. 
- */ - graph_data = commit_graph_data_at(item); - graph_data->graph_pos = pos; lex_index = pos - g->num_commits_in_base; - - commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index; + commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index; item->object.parsed = 1; set_commit_tree(item, NULL); - date_high = get_be32(commit_data + g->hash_len + 8) & 0x3; - date_low = get_be32(commit_data + g->hash_len + 12); - item->date = (timestamp_t)((date_high << 32) | date_low); - - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; - pptr = &item->parents; edge_value = get_be32(commit_data + g->hash_len); diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh index 37655a237c..1986354fc3 100755 --- a/t/t5000-tar-tree.sh +++ b/t/t5000-tar-tree.sh @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' ' rm -f .git/index && echo content >file && git add file && - GIT_COMMITTER_DATE="@68719476737 +0000" \ + GIT_COMMITTER_DATE="@17179869183 +0000" \ git commit -m "tempori parendum" ' @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' ' ' test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' ' - echo 4147 >expect && + echo 2514 >expect && tar_info future.tar | cut -d" " -f2 >actual && test_cmp expect actual ' From patchwork Sat Aug 15 16:39:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715679 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 20720913 for ; Sat, 15 Aug 2020 21:53:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 07CEA206B6 for ; Sat, 15 Aug 2020 21:53:15 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WyTbWC8L" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729215AbgHOVxN (ORCPT ); Sat, 15 Aug 2020 17:53:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729157AbgHOVwn (ORCPT ); Sat, 15 Aug 2020 17:52:43 -0400 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC721C0A3BF1 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Received: by mail-wr1-x430.google.com with SMTP id 88so10968419wrh.3 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=8yKwkpMYec9D/7NkNOP3qoKkDCzYHq6Bs1apFgJT+rs=; b=WyTbWC8Lr9PMsPA66qumdHWg72BmF8syIO6/W7ch5HVy2MOtD2xH0j6ONOdFZHTt0j N51pvLIq1A+XOEx4WY7xceshch9AMUKmbbIE/34y+4JirxkUTVfJiqU0qk6gdvSI3Pxf chNzcpidhO4XPGlTalEA1MSr4HAgYh26ku76maSjX7rCq6Hj2jRAjRVo7I3O+gwC5VVh Bax4cipclxNy6ukwFznokF3HgerAX5X6MzrbJNxSXDgOfjvy8uYn4brKiqTMs8qmgDZ3 2iK2tZBx74qn7C2m+elixpGx8A4rAcWtuPzaTdHtps74qqHFPyFiYb0DupCHeg//FY4A 3v0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=8yKwkpMYec9D/7NkNOP3qoKkDCzYHq6Bs1apFgJT+rs=; b=TtzJ1CWjp/V7bYS/AobnfezasxYdbCaMnR2K1FcHA7bcry3XlVBccjERiFChqMYk8T OpL6/apdAC4xm6g34iAzpzuTY6lLC5ptaFmAOmY2fsvcT4fOdlEZd9a5D1rkrPyryHl/ rlwpzUw9BJwDl57DanWyvDL7biThzWwKOupbMKKxgLUfG1nIKKSeOGXbKHVbdO6sYYtt Rf9HJjlflzehOjjR0Heb1V4X/CLLRx0V2u1k8fOD7JQIrXG3E0ZMSoM+Qzh2j/+hu9q+ IqIUwmSu9SpMhOnAVbwCOe5Sotn6K2e9PkPPCbtaWVqD250ENGmlP56i3/Pnm8K6yGw7 0x1A== X-Gm-Message-State: 
AOAM531XE49T+5hk136s8+DfUo7jfl5ncO+p+rK3yWSC4orEKQKnu0pg E5usNW7ccHMUoj6I+2Jl3ZQTcpJbJ64= X-Google-Smtp-Source: ABdhPJzh+TVXmuuzS3NtsEBffy+fs+WG0MkNeEtOcIPs0CYZ7GH4C5EnNQvnxoQq5REGayJDT/pLvQ== X-Received: by 2002:adf:ed85:: with SMTP id c5mr7526084wro.307.1597509588769; Sat, 15 Aug 2020 09:39:48 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id b8sm22702664wrv.4.2020.08.15.09.39.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:48 -0700 (PDT) Message-Id: <6a0cde983d9ed20f043a4977313d714154602012.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:36 +0000 Subject: [PATCH v3 04/11] commit-graph: consolidate compare_commits_by_gen Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar Comparing commits by generation has been independently defined twice, in commit-reach and commit. Let's simplify the implementation by moving compare_commits_by_gen() to commit-graph. 
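The consolidation can be pictured as one ascending comparator plus a wrapper that negates its result and falls back to the commit date, which is the shape the commit.c hunk in this patch takes. Below is a minimal sketch under simplifying assumptions: struct toy_commit and the toy_* functions are hypothetical stand-ins, not Git's types or API. Note that a qsort() callback receives pointers to the array elements, so sorting an array of commit pointers needs one extra dereference, as the version removed from commit-reach.c did; the adapter at the end only illustrates that point.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct commit plus its slab data. */
struct toy_commit {
	uint64_t generation;
	uint64_t date;
};

/* Shared comparator: lower generation sorts first. */
static int toy_compare_by_gen(const void *a_, const void *b_)
{
	const struct toy_commit *a = a_, *b = b_;

	if (a->generation < b->generation)
		return -1;
	if (a->generation > b->generation)
		return 1;
	return 0;
}

/* Higher generation first; the date breaks ties (newer first). */
static int toy_compare_by_gen_then_date(const void *a_, const void *b_)
{
	const struct toy_commit *a = a_, *b = b_;
	int ret = toy_compare_by_gen(a_, b_);

	if (ret)
		return -ret;
	if (a->date < b->date)
		return 1;
	if (a->date > b->date)
		return -1;
	return 0;
}

/* qsort() adapter: array elements are pointers, so dereference once. */
static int toy_compare_by_gen_qsort(const void *a_, const void *b_)
{
	const struct toy_commit *a = *(const struct toy_commit *const *)a_;
	const struct toy_commit *b = *(const struct toy_commit *const *)b_;

	return toy_compare_by_gen(a, b);
}
```

Keeping the ascending order in one place means each caller only decides whether to negate the result or to add a tie-breaker.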
Signed-off-by: Abhishek Kumar Reviewed-by: Taylor Blau Signed-off-by: Abhishek Kumar --- commit-graph.c | 15 +++++++++++++++ commit-graph.h | 2 ++ commit-reach.c | 15 --------------- commit.c | 9 +++------ 4 files changed, 20 insertions(+), 21 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index af8d9cc45e..fb6e2bf18f 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -112,6 +112,21 @@ uint32_t commit_graph_generation(const struct commit *c) return data->generation; } +int compare_commits_by_gen(const void *_a, const void *_b) +{ + const struct commit *a = _a, *b = _b; + const uint32_t generation_a = commit_graph_generation(a); + const uint32_t generation_b = commit_graph_generation(b); + + /* older commits first */ + if (generation_a < generation_b) + return -1; + else if (generation_a > generation_b) + return 1; + + return 0; +} + static struct commit_graph_data *commit_graph_data_at(const struct commit *c) { unsigned int i, nth_slab; diff --git a/commit-graph.h b/commit-graph.h index 09a97030dc..701e3d41aa 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -146,4 +146,6 @@ struct commit_graph_data { */ uint32_t commit_graph_generation(const struct commit *); uint32_t commit_graph_position(const struct commit *); + +int compare_commits_by_gen(const void *_a, const void *_b); #endif diff --git a/commit-reach.c b/commit-reach.c index efd5925cbb..c83cc291e7 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -561,21 +561,6 @@ int commit_contains(struct ref_filter *filter, struct commit *commit, return repo_is_descendant_of(the_repository, commit, list); } -static int compare_commits_by_gen(const void *_a, const void *_b) -{ - const struct commit *a = *(const struct commit * const *)_a; - const struct commit *b = *(const struct commit * const *)_b; - - uint32_t generation_a = commit_graph_generation(a); - uint32_t generation_b = commit_graph_generation(b); - - if (generation_a < generation_b) - return -1; - if (generation_a > generation_b) - return 1; - 
return 0; -} - int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, diff --git a/commit.c b/commit.c index 4ce8cb38d5..bd6d5e587f 100644 --- a/commit.c +++ b/commit.c @@ -731,14 +731,11 @@ int compare_commits_by_author_date(const void *a_, const void *b_, int compare_commits_by_gen_then_commit_date(const void *a_, const void *b_, void *unused) { const struct commit *a = a_, *b = b_; - const uint32_t generation_a = commit_graph_generation(a), - generation_b = commit_graph_generation(b); + int ret_val = compare_commits_by_gen(a_, b_); /* newer commits first */ - if (generation_a < generation_b) - return 1; - else if (generation_a > generation_b) - return -1; + if (ret_val) + return -ret_val; /* use date as a heuristic when generations are equal */ if (a->date < b->date) From patchwork Sat Aug 15 16:39:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715707 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 13DC214F6 for ; Sat, 15 Aug 2020 21:56:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E7CAF206B6 for ; Sat, 15 Aug 2020 21:56:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="rZsb7kPi" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729027AbgHOVwT (ORCPT ); Sat, 15 Aug 2020 17:52:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729026AbgHOVwN (ORCPT ); Sat, 15 Aug 2020 17:52:13 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) 
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7A40BC0A3BF0 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Received: by mail-wm1-x343.google.com with SMTP id g8so9878842wmk.3 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=uS+mjap2jFwH7Dc8dQ/gHl0IFxMHTx1C1ALI8i/PBpY=; b=rZsb7kPibwEJgUyti+ZDQYoipvK8Vp20hQWQ0RNgugrCIapSUQzbOis4yAZvNmZDm+ CW3SJqXHyDYFUpSUEYT6chXILSfYoA0A9zYsAM6oeOWLr+9DX3EQrEfLxIHHDMOE7/Di aW/K5P7y1SVFEXE3evlp0SNZKQryW23PSIg+LPQBlWnHC6PsQWtxu0BffmvVQdfRJk9R SjcmNHuubh5CSv8nwdn5dm5Poc+CPbmPEBxWiQoHaAOkwIUMYYF8L9NUtFdEpwA0NXC5 WUrQFejbVelci/HQ3MPv6Krw0gPg4IN3GykX8iUwCefH7oTnhOO4b1nDO6YdNj0CtpN+ SPow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=uS+mjap2jFwH7Dc8dQ/gHl0IFxMHTx1C1ALI8i/PBpY=; b=NQR7wPj2q+5DEHwyY4XP+iKB+TVjowyyQ5QkO2PYr35/h4b6OnB7FbFSYPhB+Ji6gW ClogXp2dwJVz8mcEztA8brDUHHDo0u6fuYBpeHvcNMd2OV6q5+ZmK2DlLlSUQfP9qw/0 NAfkUlvXHugPGlSrPTeUCwzBVD+xB5Nz22E6OXX/kFMkMSViok3Mgi88KijAe8kZn84u tVWPh2e8S43Qvfu/uIIFtbKm5E0ZyM7WeZk0LEmlYglcLCv8IH+WK16VAnT1uGFXSGNg LVGdoUpDireHRs9UKl0Gwh2YXapoGZbvS4qUxVenCktYKiCkgEHLp1A1YS1EwIEjCA64 vrkg== X-Gm-Message-State: AOAM531OHLwkEXVx5D/3uOfCfNWkKg+dS2FxDMAQ46/bdX6x8Z5tMeUu IpiGTHhOVLRNdwjskp6UK/cmtAQF+Q8= X-Google-Smtp-Source: ABdhPJx7KckH4LGWiOk0WduRxpbp7e86FQon9C6rExCQtb/1xLgkHCsRghlQTZszxRox5NPDaW0qKQ== X-Received: by 2002:a7b:c0c8:: with SMTP id s8mr7648296wmh.4.1597509589547; Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j11sm21631027wrq.69.2020.08.15.09.39.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Message-Id: 
<6be759a9542114e4de41422efa18491085e19682.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:37 +0000 Subject: [PATCH v3 05/11] commit-graph: return 64-bit generation number Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar In a preparatory step, let's return timestamp_t values from commit_graph_generation(), use timestamp_t for local variables and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead. Signed-off-by: Abhishek Kumar --- commit-graph.c | 18 +++++++++--------- commit-graph.h | 4 ++-- commit-reach.c | 32 ++++++++++++++++---------------- commit-reach.h | 2 +- commit.h | 3 ++- revision.c | 10 +++++----- upload-pack.c | 2 +- 7 files changed, 36 insertions(+), 35 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index fb6e2bf18f..7f9f858577 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -99,7 +99,7 @@ uint32_t commit_graph_position(const struct commit *c) return data ? 
data->graph_pos : COMMIT_NOT_FROM_GRAPH; } -uint32_t commit_graph_generation(const struct commit *c) +timestamp_t commit_graph_generation(const struct commit *c) { struct commit_graph_data *data = commit_graph_data_slab_peek(&commit_graph_data_slab, c); @@ -115,8 +115,8 @@ uint32_t commit_graph_generation(const struct commit *c) int compare_commits_by_gen(const void *_a, const void *_b) { const struct commit *a = _a, *b = _b; - const uint32_t generation_a = commit_graph_generation(a); - const uint32_t generation_b = commit_graph_generation(b); + const timestamp_t generation_a = commit_graph_generation(a); + const timestamp_t generation_b = commit_graph_generation(b); /* older commits first */ if (generation_a < generation_b) @@ -159,8 +159,8 @@ static int commit_gen_cmp(const void *va, const void *vb) const struct commit *a = *(const struct commit **)va; const struct commit *b = *(const struct commit **)vb; - uint32_t generation_a = commit_graph_data_at(a)->generation; - uint32_t generation_b = commit_graph_data_at(b)->generation; + const timestamp_t generation_a = commit_graph_data_at(a)->generation; + const timestamp_t generation_b = commit_graph_data_at(b)->generation; /* lower generation commits first */ if (generation_a < generation_b) return -1; @@ -1338,7 +1338,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation; display_progress(ctx->progress, i + 1); - if (generation != GENERATION_NUMBER_INFINITY && + if (generation != GENERATION_NUMBER_V1_INFINITY && generation != GENERATION_NUMBER_ZERO) continue; @@ -1352,7 +1352,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) for (parent = current->parents; parent; parent = parent->next) { generation = commit_graph_data_at(parent->item)->generation; - if (generation == GENERATION_NUMBER_INFINITY || + if (generation == GENERATION_NUMBER_V1_INFINITY || generation == 
GENERATION_NUMBER_ZERO) { all_parents_computed = 0; commit_list_insert(parent->item, &list); @@ -2355,8 +2355,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; - uint32_t max_generation = 0; - uint32_t generation; + timestamp_t max_generation = 0; + timestamp_t generation; display_progress(progress, i + 1); hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i); diff --git a/commit-graph.h b/commit-graph.h index 701e3d41aa..430bc830bb 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -138,13 +138,13 @@ void disable_commit_graph(struct repository *r); struct commit_graph_data { uint32_t graph_pos; - uint32_t generation; + timestamp_t generation; }; /* * Commits should be parsed before accessing generation, graph positions. */ -uint32_t commit_graph_generation(const struct commit *); +timestamp_t commit_graph_generation(const struct commit *); uint32_t commit_graph_position(const struct commit *); int compare_commits_by_gen(const void *_a, const void *_b); diff --git a/commit-reach.c b/commit-reach.c index c83cc291e7..470bc80139 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -32,12 +32,12 @@ static int queue_has_nonstale(struct prio_queue *queue) static struct commit_list *paint_down_to_common(struct repository *r, struct commit *one, int n, struct commit **twos, - int min_generation) + timestamp_t min_generation) { struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; struct commit_list *result = NULL; int i; - uint32_t last_gen = GENERATION_NUMBER_INFINITY; + timestamp_t last_gen = GENERATION_NUMBER_INFINITY; if (!min_generation) queue.compare = compare_commits_by_commit_date; @@ -58,10 +58,10 @@ static struct commit_list *paint_down_to_common(struct repository *r, struct commit *commit = prio_queue_get(&queue); struct commit_list *parents; int flags; - uint32_t generation 
= commit_graph_generation(commit); + timestamp_t generation = commit_graph_generation(commit); if (min_generation && generation > last_gen) - BUG("bad generation skip %8x > %8x at %s", + BUG("bad generation skip %"PRItime" > %"PRItime" at %s", generation, last_gen, oid_to_hex(&commit->object.oid)); last_gen = generation; @@ -177,12 +177,12 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt repo_parse_commit(r, array[i]); for (i = 0; i < cnt; i++) { struct commit_list *common; - uint32_t min_generation = commit_graph_generation(array[i]); + timestamp_t min_generation = commit_graph_generation(array[i]); if (redundant[i]) continue; for (j = filled = 0; j < cnt; j++) { - uint32_t curr_generation; + timestamp_t curr_generation; if (i == j || redundant[j]) continue; filled_index[filled] = j; @@ -321,7 +321,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit, { struct commit_list *bases; int ret = 0, i; - uint32_t generation, min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t generation, min_generation = GENERATION_NUMBER_INFINITY; if (repo_parse_commit(r, commit)) return ret; @@ -470,7 +470,7 @@ static int in_commit_list(const struct commit_list *want, struct commit *c) static enum contains_result contains_test(struct commit *candidate, const struct commit_list *want, struct contains_cache *cache, - uint32_t cutoff) + timestamp_t cutoff) { enum contains_result *cached = contains_cache_at(cache, candidate); @@ -506,11 +506,11 @@ static enum contains_result contains_tag_algo(struct commit *candidate, { struct contains_stack contains_stack = { 0, 0, NULL }; enum contains_result result; - uint32_t cutoff = GENERATION_NUMBER_INFINITY; + timestamp_t cutoff = GENERATION_NUMBER_INFINITY; const struct commit_list *p; for (p = want; p; p = p->next) { - uint32_t generation; + timestamp_t generation; struct commit *c = p->item; load_commit_graph_info(the_repository, c); generation = commit_graph_generation(c); @@ 
-565,7 +565,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation) + timestamp_t min_generation) { struct commit **list = NULL; int i; @@ -666,13 +666,13 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, time_t min_commit_date = cutoff_by_min_date ? from->item->date : 0; struct commit_list *from_iter = from, *to_iter = to; int result; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; while (from_iter) { add_object_array(&from_iter->item->object, NULL, &from_objs); if (!parse_commit(from_iter->item)) { - uint32_t generation; + timestamp_t generation; if (from_iter->item->date < min_commit_date) min_commit_date = from_iter->item->date; @@ -686,7 +686,7 @@ int can_all_from_reach(struct commit_list *from, struct commit_list *to, while (to_iter) { if (!parse_commit(to_iter->item)) { - uint32_t generation; + timestamp_t generation; if (to_iter->item->date < min_commit_date) min_commit_date = to_iter->item->date; @@ -726,13 +726,13 @@ struct commit_list *get_reachable_subset(struct commit **from, int nr_from, struct commit_list *found_commits = NULL; struct commit **to_last = to + nr_to; struct commit **from_last = from + nr_from; - uint32_t min_generation = GENERATION_NUMBER_INFINITY; + timestamp_t min_generation = GENERATION_NUMBER_INFINITY; int num_to_find = 0; struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; for (item = to; item < to_last; item++) { - uint32_t generation; + timestamp_t generation; struct commit *c = *item; parse_commit(c); diff --git a/commit-reach.h b/commit-reach.h index b49ad71a31..148b56fea5 100644 --- a/commit-reach.h +++ b/commit-reach.h @@ -87,7 +87,7 @@ int can_all_from_reach_with_flag(struct object_array *from, unsigned int with_flag, unsigned int assign_flag, time_t min_commit_date, - uint32_t min_generation); + 
timestamp_t min_generation); int can_all_from_reach(struct commit_list *from, struct commit_list *to, int commit_date_cutoff); diff --git a/commit.h b/commit.h index e901538909..bc0732a4fe 100644 --- a/commit.h +++ b/commit.h @@ -11,7 +11,8 @@ #include "commit-slab.h" #define COMMIT_NOT_FROM_GRAPH 0xFFFFFFFF -#define GENERATION_NUMBER_INFINITY 0xFFFFFFFF +#define GENERATION_NUMBER_INFINITY ((1ULL << 63) - 1) +#define GENERATION_NUMBER_V1_INFINITY 0xFFFFFFFF #define GENERATION_NUMBER_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 diff --git a/revision.c b/revision.c index ecf757c327..411852468b 100644 --- a/revision.c +++ b/revision.c @@ -3290,7 +3290,7 @@ define_commit_slab(indegree_slab, int); define_commit_slab(author_date_slab, timestamp_t); struct topo_walk_info { - uint32_t min_generation; + timestamp_t min_generation; struct prio_queue explore_queue; struct prio_queue indegree_queue; struct prio_queue topo_queue; @@ -3336,7 +3336,7 @@ static void explore_walk_step(struct rev_info *revs) } static void explore_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3379,7 +3379,7 @@ static void indegree_walk_step(struct rev_info *revs) } static void compute_indegrees_to_depth(struct rev_info *revs, - uint32_t gen_cutoff) + timestamp_t gen_cutoff) { struct topo_walk_info *info = revs->topo_walk_info; struct commit *c; @@ -3437,7 +3437,7 @@ static void init_topo_walk(struct rev_info *revs) info->min_generation = GENERATION_NUMBER_INFINITY; for (list = revs->commits; list; list = list->next) { struct commit *c = list->item; - uint32_t generation; + timestamp_t generation; if (parse_commit_gently(c, 1)) continue; @@ -3498,7 +3498,7 @@ static void expand_topo_walk(struct rev_info *revs, struct commit *commit) for (p = commit->parents; p; p = p->next) { struct commit *parent = p->item; int *pi; - uint32_t generation; + timestamp_t generation; if (parent->object.flags 
& UNINTERESTING) continue; diff --git a/upload-pack.c b/upload-pack.c index 80ad9a38d8..bcb8b5dfda 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -497,7 +497,7 @@ static int got_oid(struct upload_pack_data *data, static int ok_to_give_up(struct upload_pack_data *data) { - uint32_t min_generation = GENERATION_NUMBER_ZERO; + timestamp_t min_generation = GENERATION_NUMBER_ZERO; if (!data->have_obj.nr) return 0; From patchwork Sat Aug 15 16:39:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715765 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D05DD618 for ; Sat, 15 Aug 2020 22:01:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B634120781 for ; Sat, 15 Aug 2020 22:01:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UNPoVGMH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728803AbgHOVvc (ORCPT ); Sat, 15 Aug 2020 17:51:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728721AbgHOVvY (ORCPT ); Sat, 15 Aug 2020 17:51:24 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0940FC0A3BF2 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id g75so10414090wme.4 for ; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; 
bh=Tecyx3UhIpZWqH03UciY9yF/in1nxi1c0KS0xS7R+SY=; b=UNPoVGMHvJVDnNPGsSjNrd9grBGo7/dg+7VYOa2pNwjqRNy+WNim9Xt9Fc/yzsr6bE uFzHX7bddaQ2HjP8qtn5TEYAictL7F+79UI7Qows5YnJkQc2dGp5VZmgalRhRnlJPBAZ FOKOhiDWYgEuqXOvV2U18HMM53dHFsnjVaWjQiaPvsBzpk82BX8GuRJZIy5bW+ZQ02x5 x64N6O2L14a0IfLRX4UQQcpFE4eZjJbXuMHvrokanZCXsB74Xaz6HaNCGJsHIFP6gHYa LBvwn/Pt884WEgg59OFsol96BL3tzvYiVvDcKEgmY44AKYSIyqwALtLoAIkNpnJyqraF Z3ow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Tecyx3UhIpZWqH03UciY9yF/in1nxi1c0KS0xS7R+SY=; b=oZ9YaFiQE1YNWJ175Y/oZLFxNcb+knP6zL259aMgp+adafy0Dr6RLqa7WLbg3wOn+1 EzpkWuZNr7uG24iptLo36SI3kk7ZGdtXdR/etrPVhMtEeJiIi7gmxAYTiCbkozm2gMi2 Ch0eolXFQ/9g3HtUKgKW3tXSEUkThaP4D98AiDHsPHDI6ghnj5v1AKoZTisyP+MdBJua ymqky2Gxz8Ubog3J1VJgKFJAm2A8hrnbHbowMGVrIG97As4+wthn6TvftKOhdDFLsi2P mJLlUDM7k740YIGzQJBK4kJvY7wulVKG+nM7PEOvgrZiUYbtVXyRuA3LoySKr0s+ECEE Up0Q== X-Gm-Message-State: AOAM5301Wxtk+hU3ZrUb63u3LMRcSpj/nk6mkfEIQPg/ypIqA1L4+L1z 20yt6+pDuyyHEk6Y7wnh7WcKs49awhE= X-Google-Smtp-Source: ABdhPJx/Zng1QW3r93mNmV+4rYbQcZ6AL2y7jKxSDWiM5EfAnzfg94c+2WlgB6tfRtmJzbRNfOQe7w== X-Received: by 2002:a1c:1f8b:: with SMTP id f133mr7352212wmf.65.1597509590222; Sat, 15 Aug 2020 09:39:50 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l11sm21388123wme.11.2020.08.15.09.39.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:49 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:38 +0000 Subject: [PATCH v3 06/11] commit-graph: add a slab to store topological levels Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org

From: Abhishek Kumar

Topological levels are still written to the commit data chunk to stay
backwards compatible with "Old" Git, but the member `generation` of
struct commit_graph_data will store the corrected commit date in a later
commit. So let's introduce a commit-slab to store topological levels
while writing the commit-graph.

When Git creates a split commit-graph, it takes advantage of the
generation values that have already been computed and are present in
existing commit-graph files. So let's add a pointer to the topological
level commit-slab to struct commit_graph and populate the slab with
topological levels while writing a split commit-graph.

Signed-off-by: Abhishek Kumar
---
 commit-graph.c | 47 ++++++++++++++++++++++++++++++++---------------
 commit-graph.h | 1 +
 commit.h | 1 +
 3 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c index 7f9f858577..a2f15b2825 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -64,6 +64,8 @@ void git_test_write_commit_graph_or_die(void) /* Remember to update object flag allocation in object.h */ #define REACHABLE (1u<<15) +define_commit_slab(topo_level_slab, uint32_t); + /* Keep track of the order in which commits are added to our list.
*/ define_commit_slab(commit_pos, int); static struct commit_pos commit_pos = COMMIT_SLAB_INIT(1, commit_pos); @@ -759,6 +761,9 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, item->date = (timestamp_t)((date_high << 32) | date_low); graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; + + if (g->topo_levels) + *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2; } static inline void set_commit_tree(struct commit *c, struct tree *t) @@ -953,6 +958,7 @@ struct write_commit_graph_context { changed_paths:1, order_by_pack:1; + struct topo_level_slab *topo_levels; const struct split_commit_graph_opts *split_opts; size_t total_bloom_filter_data_size; const struct bloom_filter_settings *bloom_settings; @@ -1094,7 +1100,7 @@ static int write_graph_chunk_data(struct hashfile *f, else packedDate[0] = 0; - packedDate[0] |= htonl(commit_graph_data_at(*list)->generation << 2); + packedDate[0] |= htonl(*topo_level_slab_at(ctx->topo_levels, *list) << 2); packedDate[1] = htonl((*list)->date); hashwrite(f, packedDate, 8); @@ -1335,11 +1341,11 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) _("Computing commit graph generation numbers"), ctx->commits.nr); for (i = 0; i < ctx->commits.nr; i++) { - uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation; + uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]); display_progress(ctx->progress, i + 1); - if (generation != GENERATION_NUMBER_V1_INFINITY && - generation != GENERATION_NUMBER_ZERO) + if (level != GENERATION_NUMBER_V1_INFINITY && + level != GENERATION_NUMBER_ZERO) continue; commit_list_insert(ctx->commits.list[i], &list); @@ -1347,29 +1353,27 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) struct commit *current = list->item; struct commit_list *parent; int all_parents_computed = 1; - uint32_t max_generation = 0; + uint32_t 
max_level = 0; for (parent = current->parents; parent; parent = parent->next) { - generation = commit_graph_data_at(parent->item)->generation; + level = *topo_level_slab_at(ctx->topo_levels, parent->item); - if (generation == GENERATION_NUMBER_V1_INFINITY || - generation == GENERATION_NUMBER_ZERO) { + if (level == GENERATION_NUMBER_V1_INFINITY || + level == GENERATION_NUMBER_ZERO) { all_parents_computed = 0; commit_list_insert(parent->item, &list); break; - } else if (generation > max_generation) { - max_generation = generation; + } else if (level > max_level) { + max_level = level; } } if (all_parents_computed) { - struct commit_graph_data *data = commit_graph_data_at(current); - - data->generation = max_generation + 1; pop_commit(&list); - if (data->generation > GENERATION_NUMBER_MAX) - data->generation = GENERATION_NUMBER_MAX; + if (max_level > GENERATION_NUMBER_MAX - 1) + max_level = GENERATION_NUMBER_MAX - 1; + *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1; } } } @@ -2101,6 +2105,7 @@ int write_commit_graph(struct object_directory *odb, uint32_t i, count_distinct = 0; int res = 0; int replace = 0; + struct topo_level_slab topo_levels; if (!commit_graph_compatible(the_repository)) return 0; @@ -2179,6 +2184,18 @@ int write_commit_graph(struct object_directory *odb, } } + init_topo_level_slab(&topo_levels); + ctx->topo_levels = &topo_levels; + + if (ctx->r->objects->commit_graph) { + struct commit_graph *g = ctx->r->objects->commit_graph; + + while (g) { + g->topo_levels = &topo_levels; + g = g->base_graph; + } + } + if (pack_indexes) { ctx->order_by_pack = 1; if ((res = fill_oids_from_packs(ctx, pack_indexes))) diff --git a/commit-graph.h b/commit-graph.h index 430bc830bb..1152a9642e 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -72,6 +72,7 @@ struct commit_graph { const unsigned char *chunk_bloom_indexes; const unsigned char *chunk_bloom_data; + struct topo_level_slab *topo_levels; struct bloom_filter_settings *bloom_filter_settings; }; 
diff --git a/commit.h b/commit.h index bc0732a4fe..bb846e0025 100644 --- a/commit.h +++ b/commit.h @@ -15,6 +15,7 @@ #define GENERATION_NUMBER_V1_INFINITY 0xFFFFFFFF #define GENERATION_NUMBER_MAX 0x3FFFFFFF #define GENERATION_NUMBER_ZERO 0 +#define GENERATION_NUMBER_V2_OFFSET_MAX 0xFFFFFFFF struct commit_list { struct commit *item; From patchwork Sat Aug 15 16:39:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715815 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 67F5E13B1 for ; Sat, 15 Aug 2020 22:05:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 45C57205CB for ; Sat, 15 Aug 2020 22:05:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="s5o+bE5/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729879AbgHOWFQ (ORCPT ); Sat, 15 Aug 2020 18:05:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45648 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728340AbgHOVux (ORCPT ); Sat, 15 Aug 2020 17:50:53 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A045C0A3BF3 for ; Sat, 15 Aug 2020 09:39:52 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id a5so10945607wrm.6 for ; Sat, 15 Aug 2020 09:39:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=3YuPMWbjM1ZnPG3IVZSLcSZcfSHeNABaP5tzmsVTiKU=; 
b=s5o+bE5/5PTiH4lKDujfr33VaV9h6FiwcpkvPIFHAhRE8BrSb72S9CdPkqHmE894Xj DBqp382gkJ5Ag8QOgKbY4lX3urUbKph8onWtrlX9nNN7d0BITlrsv21OG3eu7p1C8uaZ zgfNH4/ldYaP88vHQyvLxZDWsmEOVjVLuAiv1BC9PlFAEzS2ScKoZfm2UvamKqho0XES gcjg+wg4WF4EnRpNu32XFwT3b+wZ19E98QoLI1rTeVIVrEjAjwD48C2o+OXP0fyo5ZDR dMqWGzjSAte1F8x5HVymrvlM3XDxRAkkZk5g2xHjyuB7WpZfwprY0E8phTs/jyYwVabL 5fwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=3YuPMWbjM1ZnPG3IVZSLcSZcfSHeNABaP5tzmsVTiKU=; b=VfbyMK9ur4r6sfho2zT/W3ZMJtP1jWFrKbbHs6anDjv3CtT0b1R/yF5gHDQkWDd4oU 0K9kuOGDIYUQkAmmvOGItgl2NlRE+ZdD2l3Ejq3XWRTwFJzwcNbCXu6PtGRrqq7Glfu7 FWOFZOd2FlHUMOlthz0pwtlyuBybmD1HHpeQqcDOlBL6FmKmvGLt0fY+e5UpFm5EuQGz U8i6UcGue9KouO9dn2HQ3p8gpfvcFwuocoWVaKTxc1XQKTimxJKlBRMO7cqF2Jx2NnVF qTEgMeNkRqlZJhpF1i8IS5WFClBgeTMI0Wzuc/yBdGDjhKHWXAFaewJ4GeG+0TkKSsTp W+6g== X-Gm-Message-State: AOAM531b+WKvJTD6NcdlTh1+fh/l1fPvKXmRT85FvKY9s/dnlUcqoCQT fI2/Wg56ua8RFLxtnalD0yX9Q6eveew= X-Google-Smtp-Source: ABdhPJz7r8kYN2m6bSdjVEDlMr/5ZlUfKuDbpdinGQB64C1lBGt30DiQiLrc4qxS0hizmqNP3cxhVA== X-Received: by 2002:a5d:414d:: with SMTP id c13mr2423778wrq.78.1597509591084; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r16sm25724784wrr.13.2020.08.15.09.39.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:50 -0700 (PDT) Message-Id: <4074ace65be3094d35dd0aaedb89eb5a0ec98cee.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:39 +0000 Subject: [PATCH v3 07/11] commit-graph: implement corrected commit date Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: 
X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

With most of the preparations done, let's implement corrected commit
date.

The corrected commit date for a commit is defined as:

* A commit with no parents (a root commit) has corrected commit date
  equal to its committer date.

* A commit with at least one parent has corrected commit date equal to
  the maximum of its committer date and one more than the largest
  corrected commit date among its parents.

To minimize the space required to store corrected commit dates, Git
stores corrected commit date offsets in the commit-graph file. The
corrected commit date offset for a commit is defined as the difference
between its corrected commit date and its actual commit date.

While Git does not write out offsets at this stage, it stores the
corrected commit dates in the member `generation` of struct
commit_graph_data. It will begin writing commit date offsets with the
introduction of the generation data chunk.

Signed-off-by: Abhishek Kumar
---
 commit-graph.c | 58 +++++++++++++++++++++++++++-----------------------
 1 file changed, 31 insertions(+), 27 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c index a2f15b2825..fd69534dd5 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -169,11 +169,6 @@ static int commit_gen_cmp(const void *va, const void *vb) else if (generation_a > generation_b) return 1; - /* use date as a heuristic when generations are equal */ - if (a->date < b->date) - return -1; - else if (a->date > b->date) - return 1; return 0; } @@ -1342,10 +1337,14 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) ctx->commits.nr); for (i = 0; i < ctx->commits.nr; i++) { uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]); + timestamp_t corrected_commit_date = commit_graph_data_at(ctx->commits.list[i])->generation; display_progress(ctx->progress, i + 1); if (level != GENERATION_NUMBER_V1_INFINITY && - level != GENERATION_NUMBER_ZERO) + level != GENERATION_NUMBER_ZERO && +
corrected_commit_date != GENERATION_NUMBER_INFINITY && + corrected_commit_date != GENERATION_NUMBER_ZERO + ) continue; commit_list_insert(ctx->commits.list[i], &list); @@ -1354,17 +1353,26 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) struct commit_list *parent; int all_parents_computed = 1; uint32_t max_level = 0; + timestamp_t max_corrected_commit_date = 0; for (parent = current->parents; parent; parent = parent->next) { level = *topo_level_slab_at(ctx->topo_levels, parent->item); + corrected_commit_date = commit_graph_data_at(parent->item)->generation; if (level == GENERATION_NUMBER_V1_INFINITY || - level == GENERATION_NUMBER_ZERO) { + level == GENERATION_NUMBER_ZERO || + corrected_commit_date == GENERATION_NUMBER_INFINITY || + corrected_commit_date == GENERATION_NUMBER_ZERO + ) { all_parents_computed = 0; commit_list_insert(parent->item, &list); break; - } else if (level > max_level) { - max_level = level; + } else { + if (level > max_level) + max_level = level; + + if (corrected_commit_date > max_corrected_commit_date) + max_corrected_commit_date = corrected_commit_date; } } @@ -1374,6 +1382,10 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx) if (max_level > GENERATION_NUMBER_MAX - 1) max_level = GENERATION_NUMBER_MAX - 1; *topo_level_slab_at(ctx->topo_levels, current) = max_level + 1; + + if (current->date > max_corrected_commit_date) + max_corrected_commit_date = current->date - 1; + commit_graph_data_at(current)->generation = max_corrected_commit_date + 1; } } } @@ -2372,8 +2384,8 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) for (i = 0; i < g->num_commits; i++) { struct commit *graph_commit, *odb_commit; struct commit_list *graph_parents, *odb_parents; - timestamp_t max_generation = 0; - timestamp_t generation; + timestamp_t max_corrected_commit_date = 0; + timestamp_t corrected_commit_date; display_progress(progress, i + 1); hashcpy(cur_oid.hash, 
g->chunk_oid_lookup + g->hash_len * i); @@ -2412,9 +2424,9 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) oid_to_hex(&graph_parents->item->object.oid), oid_to_hex(&odb_parents->item->object.oid)); - generation = commit_graph_generation(graph_parents->item); - if (generation > max_generation) - max_generation = generation; + corrected_commit_date = commit_graph_generation(graph_parents->item); + if (corrected_commit_date > max_corrected_commit_date) + max_corrected_commit_date = corrected_commit_date; graph_parents = graph_parents->next; odb_parents = odb_parents->next; @@ -2436,20 +2448,12 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags) if (generation_zero == GENERATION_ZERO_EXISTS) continue; - /* - * If one of our parents has generation GENERATION_NUMBER_MAX, then - * our generation is also GENERATION_NUMBER_MAX. Decrement to avoid - * extra logic in the following condition. - */ - if (max_generation == GENERATION_NUMBER_MAX) - max_generation--; - - generation = commit_graph_generation(graph_commit); - if (generation != max_generation + 1) - graph_report(_("commit-graph generation for commit %s is %u != %u"), + corrected_commit_date = commit_graph_generation(graph_commit); + if (corrected_commit_date < max_corrected_commit_date + 1) + graph_report(_("commit-graph generation for commit %s is %"PRItime" < %"PRItime), oid_to_hex(&cur_oid), - generation, - max_generation + 1); + corrected_commit_date, + max_corrected_commit_date + 1); if (graph_commit->date != odb_commit->date) graph_report(_("commit date for commit %s in commit-graph is %"PRItime" != %"PRItime), From patchwork Sat Aug 15 16:39:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715757 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EC92109B for ; Sat, 15 Aug 2020 22:00:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8E9E23358 for ; Sat, 15 Aug 2020 22:00:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="rLBg1qEW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729622AbgHOWAM (ORCPT ); Sat, 15 Aug 2020 18:00:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728850AbgHOVvg (ORCPT ); Sat, 15 Aug 2020 17:51:36 -0400 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5F25C0A3BF4 for ; Sat, 15 Aug 2020 09:39:53 -0700 (PDT) Received: by mail-wm1-x341.google.com with SMTP id g8so9878901wmk.3 for ; Sat, 15 Aug 2020 09:39:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:mime-version :content-transfer-encoding:fcc:to:cc; bh=DHDhFjl1hDkzjQZSWcMqXTXs/ruIOAbqHvg+QbxqMK4=; b=rLBg1qEWlyjVt7eQUa4HIY+GdqBgqjeOXrsQ6ctVd+rB7f9I8DAH0VYg0j71fgPl4X FjuhFobmyWxkkIPuQYTBqsXPkW8s9Kj/u3QXr887LUjwdfeDHZzWsbqjYXSUoj2fpIuf SicbX4tTq8quvYi7bvg/VBzZDiO/8q5xEbTE4Ylkeckog3Z4jz4OqFgX8rFOHhSNpgS7 9X5h7gKdt9CPE2fp3RXra+1CBD28M7M2WrXu6CLmahYzZdCqmGZfOysgl6Ga/0SE2UYL IL8aRmLreWphe9ZiJPxAihZnS1VVbk2sTWZCmWcM+cYTmIU7ph6O5uLTbUfTqmh+bC5s qe+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:mime-version:content-transfer-encoding:fcc:to:cc; bh=DHDhFjl1hDkzjQZSWcMqXTXs/ruIOAbqHvg+QbxqMK4=; b=Q8S4YMR8sLlXXGp/MGekLwJH0Hb3clttplvyV2p1ywzaMYEupwusbqepO+H6xkeRHB 
ZtDSzXix+pHFULR1fNVhsGLLGNgJegi2zhvgG16EMgcaaIyXhIofug4f5WIoe4zlGaLe REcGgmkicDigYSXBXaZGaUu7bjxrz3sFz/+1LJg4cn8rNPDwUCF/3pcdlbtEVSqUiqjy eN3ueHKL0iY1FEBZw7KWPbRvr7bj9duS7rLCPiypNjqmkx1gSHU+SHeDwn4/ePYcUikV 18RQKd4nB9y4r/y+ItnmJYcILP2xwkeZ5WBoneczPvpd7gOYFQO9PtquQTfTEwshK9VC zJQQ== X-Gm-Message-State: AOAM531uxTym76GOPkHQUVIPEc/CFXsdXap7ZOO9Nki8NX0JiMQNzem5 iEzTmFrsGKhy46PpYrf2c8FpGBaxO3Y= X-Google-Smtp-Source: ABdhPJwEkK4eR8/RhkdECBcNiXA+OiCZAtnZe1F3k+5U7WR/1dyCIS0Gxrl40vZVDW/NYM6m8opX6Q== X-Received: by 2002:a1c:c913:: with SMTP id f19mr7105658wmb.173.1597509591968; Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o2sm25247087wrh.70.2020.08.15.09.39.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:51 -0700 (PDT) Message-Id: <4e746628acdb49af5e8eb788864156f54724d4fa.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:40 +0000 Subject: [PATCH v3 08/11] commit-graph: implement generation data chunk MIME-Version: 1.0 Fcc: Sent To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

As discovered by Ævar, we cannot increment the graph version to
distinguish between generation numbers v1 and v2 [1]. Thus, one of the
prerequisites for implementing generation number v2 was to distinguish
between graph versions in a backwards-compatible manner.

We introduce a new chunk called the Generation Data chunk (or GDAT).
GDAT stores generation number v2 (and any subsequent versions), whereas
CDAT still stores topological levels.

Old Git does not understand the GDAT chunk and ignores it, reading
topological levels from CDAT.
New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when GDAT chunk is missing (as it would happen with a commit graph written by old Git). We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces commit-graph file to be written without generation data chunk to emulate a commit-graph file written by old Git. [1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar --- commit-graph.c | 48 ++++++++++++++++++++++++--- commit-graph.h | 2 ++ t/README | 3 ++ t/helper/test-read-graph.c | 2 ++ t/t4216-log-bloom.sh | 4 +-- t/t5318-commit-graph.sh | 27 +++++++-------- t/t5324-split-commit-graph.sh | 12 +++---- t/t6600-test-reach.sh | 62 +++++++++++++++++++---------------- 8 files changed, 107 insertions(+), 53 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index fd69534dd5..b7a72b40db 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -38,11 +38,12 @@ void git_test_write_commit_graph_or_die(void) #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */ #define GRAPH_CHUNKID_OIDLOOKUP 0x4f49444c /* "OIDL" */ #define GRAPH_CHUNKID_DATA 0x43444154 /* "CDAT" */ +#define GRAPH_CHUNKID_GENERATION_DATA 0x47444154 /* "GDAT" */ #define GRAPH_CHUNKID_EXTRAEDGES 0x45444745 /* "EDGE" */ #define GRAPH_CHUNKID_BLOOMINDEXES 0x42494458 /* "BIDX" */ #define GRAPH_CHUNKID_BLOOMDATA 0x42444154 /* "BDAT" */ #define GRAPH_CHUNKID_BASE 0x42415345 /* "BASE" */ -#define MAX_NUM_CHUNKS 7 +#define MAX_NUM_CHUNKS 8 #define GRAPH_DATA_WIDTH (the_hash_algo->rawsz + 16) @@ -389,6 +390,13 @@ struct commit_graph *parse_commit_graph(void *graph_map, size_t graph_size) graph->chunk_commit_data = data + chunk_offset; break; + case GRAPH_CHUNKID_GENERATION_DATA: + if (graph->chunk_generation_data) + chunk_repeated = 1; + else + graph->chunk_generation_data = data + chunk_offset; + break; + case GRAPH_CHUNKID_EXTRAEDGES: if (graph->chunk_extra_edges) chunk_repeated = 1; @@ 
-755,7 +763,11 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, date_low = get_be32(commit_data + g->hash_len + 12); item->date = (timestamp_t)((date_high << 32) | date_low); - graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; + if (g->chunk_generation_data) + graph_data->generation = item->date + + (timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); + else + graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2; if (g->topo_levels) *topo_level_slab_at(g->topo_levels, item) = get_be32(commit_data + g->hash_len + 8) >> 2; @@ -951,7 +963,8 @@ struct write_commit_graph_context { report_progress:1, split:1, changed_paths:1, - order_by_pack:1; + order_by_pack:1, + write_generation_data:1; struct topo_level_slab *topo_levels; const struct split_commit_graph_opts *split_opts; @@ -1106,8 +1119,25 @@ static int write_graph_chunk_data(struct hashfile *f, return 0; } +static int write_graph_chunk_generation_data(struct hashfile *f, + struct write_commit_graph_context *ctx) +{ + int i; + for (i = 0; i < ctx->commits.nr; i++) { + struct commit *c = ctx->commits.list[i]; + timestamp_t offset = commit_graph_data_at(c)->generation - c->date; + display_progress(ctx->progress, ++ctx->progress_cnt); + + if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) + offset = GENERATION_NUMBER_V2_OFFSET_MAX; + hashwrite_be32(f, offset); + } + + return 0; +} + static int write_graph_chunk_extra_edges(struct hashfile *f, - struct write_commit_graph_context *ctx) + struct write_commit_graph_context *ctx) { struct commit **list = ctx->commits.list; struct commit **last = ctx->commits.list + ctx->commits.nr; @@ -1726,6 +1756,15 @@ static int write_commit_graph_file(struct write_commit_graph_context *ctx) chunks[2].id = GRAPH_CHUNKID_DATA; chunks[2].size = (hashsz + 16) * ctx->commits.nr; chunks[2].write_fn = write_graph_chunk_data; + + if (git_env_bool(GIT_TEST_COMMIT_GRAPH_NO_GDAT, 0)) + 
ctx->write_generation_data = 0; + if (ctx->write_generation_data) { + chunks[num_chunks].id = GRAPH_CHUNKID_GENERATION_DATA; + chunks[num_chunks].size = sizeof(uint32_t) * ctx->commits.nr; + chunks[num_chunks].write_fn = write_graph_chunk_generation_data; + num_chunks++; + } if (ctx->num_extra_edges) { chunks[num_chunks].id = GRAPH_CHUNKID_EXTRAEDGES; chunks[num_chunks].size = 4 * ctx->num_extra_edges; @@ -2130,6 +2169,7 @@ int write_commit_graph(struct object_directory *odb, ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0; ctx->split_opts = split_opts; ctx->total_bloom_filter_data_size = 0; + ctx->write_generation_data = 1; if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS) ctx->changed_paths = 1; diff --git a/commit-graph.h b/commit-graph.h index 1152a9642e..f78c892fc0 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -6,6 +6,7 @@ #include "oidset.h" #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH" +#define GIT_TEST_COMMIT_GRAPH_NO_GDAT "GIT_TEST_COMMIT_GRAPH_NO_GDAT" #define GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE "GIT_TEST_COMMIT_GRAPH_DIE_ON_PARSE" #define GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS "GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS" @@ -67,6 +68,7 @@ struct commit_graph { const uint32_t *chunk_oid_fanout; const unsigned char *chunk_oid_lookup; const unsigned char *chunk_commit_data; + const unsigned char *chunk_generation_data; const unsigned char *chunk_extra_edges; const unsigned char *chunk_base_graphs; const unsigned char *chunk_bloom_indexes; diff --git a/t/README b/t/README index 70ec61cf88..6647ef132e 100644 --- a/t/README +++ b/t/README @@ -379,6 +379,9 @@ GIT_TEST_COMMIT_GRAPH=, when true, forces the commit-graph to be written after every 'git commit' command, and overrides the 'core.commitGraph' setting to true. +GIT_TEST_COMMIT_GRAPH_NO_GDAT=, when true, forces the +commit-graph to be written without generation data chunk. 
+ GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=, when true, forces commit-graph write to compute and write changed path Bloom filters for every 'git commit-graph write', as if the `--changed-paths` option was diff --git a/t/helper/test-read-graph.c b/t/helper/test-read-graph.c index 6d0c962438..1c2a5366c7 100644 --- a/t/helper/test-read-graph.c +++ b/t/helper/test-read-graph.c @@ -32,6 +32,8 @@ int cmd__read_graph(int argc, const char **argv) printf(" oid_lookup"); if (graph->chunk_commit_data) printf(" commit_metadata"); + if (graph->chunk_generation_data) + printf(" generation_data"); if (graph->chunk_extra_edges) printf(" extra_edges"); if (graph->chunk_bloom_indexes) diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index c21cc160f3..55c94e9ebd 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -33,11 +33,11 @@ test_expect_success 'setup test - repo, commits, commit graph, log outputs' ' git commit-graph write --reachable --changed-paths ' graph_read_expect () { - NUM_CHUNKS=5 + NUM_CHUNKS=6 cat >expect <<- EOF header: 43475048 1 1 $NUM_CHUNKS 0 num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata bloom_indexes bloom_data + chunks: oid_fanout oid_lookup commit_metadata generation_data bloom_indexes bloom_data EOF test-tool read-graph >actual && test_cmp expect actual diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 044cf8a3de..b41b2160c6 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -71,7 +71,7 @@ graph_git_behavior 'no graph' full commits/3 commits/1 graph_read_expect() { OPTIONAL="" NUM_CHUNKS=3 - if test ! -z $2 + if test ! 
-z "$2" then OPTIONAL=" $2" NUM_CHUNKS=$((3 + $(echo "$2" | wc -w))) @@ -98,14 +98,14 @@ test_expect_success 'exit with correct error on bad input to --stdin-commits' ' # valid commit and tree OID git rev-parse HEAD HEAD^{tree} >in && git commit-graph write --stdin-commits >commits-in && cat commits-in | git commit-graph write --stdin-commits && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "6" + graph_read_expect "6" "generation_data" ' graph_git_behavior 'graph from commits, commit 8 vs merge 1' full commits/8 merge/1 @@ -292,7 +292,7 @@ test_expect_success 'build graph from commits with append' ' cd "$TRASH_DIRECTORY/full" && git rev-parse merge/3 | git commit-graph write --stdin-commits --append && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "10" "extra_edges" + graph_read_expect "10" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -302,7 +302,7 @@ test_expect_success 'build graph using --reachable' ' cd "$TRASH_DIRECTORY/full" && git commit-graph write --reachable && test_path_is_file $objdir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'append graph, commit 8 vs merge 1' full commits/8 merge/1 @@ -323,7 +323,7 @@ test_expect_success 'write graph in bare repo' ' cd "$TRASH_DIRECTORY/bare" && git commit-graph write && test_path_is_file $baredir/info/commit-graph && - graph_read_expect "11" "extra_edges" + graph_read_expect "11" "generation_data extra_edges" ' graph_git_behavior 'bare repo with graph, commit 8 vs merge 1' bare commits/8 merge/1 @@ -420,8 +420,9 @@ test_expect_success 'replace-objects invalidates commit-graph' ' test_expect_success 'git commit-graph verify' ' cd "$TRASH_DIRECTORY/full" && - git rev-parse commits/8 | git commit-graph write --stdin-commits && - git commit-graph verify >output + git rev-parse commits/8 | GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 
git commit-graph write --stdin-commits && + git commit-graph verify >output && + graph_read_expect 9 extra_edges ' NUM_COMMITS=9 diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index ea28d522b8..531016f405 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -13,11 +13,11 @@ test_expect_success 'setup repo' ' infodir=".git/objects/info" && graphdir="$infodir/commit-graphs" && test_oid_cache <<-EOM - shallow sha1:1760 - shallow sha256:2064 + shallow sha1:2132 + shallow sha256:2436 - base sha1:1376 - base sha256:1496 + base sha1:1408 + base sha256:1528 EOM ' @@ -28,9 +28,9 @@ graph_read_expect() { NUM_BASE=$2 fi cat >expect <<- EOF - header: 43475048 1 1 3 $NUM_BASE + header: 43475048 1 1 4 $NUM_BASE num_commits: $1 - chunks: oid_fanout oid_lookup commit_metadata + chunks: oid_fanout oid_lookup commit_metadata generation_data EOF test-tool read-graph >output && test_cmp expect output diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 475564bee7..d14b129f06 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -55,10 +55,13 @@ test_expect_success 'setup' ' git show-ref -s commit-5-5 | git commit-graph write --stdin-commits && mv .git/objects/info/commit-graph commit-graph-half && chmod u+w commit-graph-half && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable && + mv .git/objects/info/commit-graph commit-graph-no-gdat && + chmod u+w commit-graph-no-gdat && git config core.commitGraph true ' -run_three_modes () { +run_all_modes () { test_when_finished rm -rf .git/objects/info/commit-graph && "$@" actual && test_cmp expect actual && @@ -67,11 +70,14 @@ run_three_modes () { test_cmp expect actual && cp commit-graph-half .git/objects/info/commit-graph && "$@" actual && + test_cmp expect actual && + cp commit-graph-no-gdat .git/objects/info/commit-graph && + "$@" actual && test_cmp expect actual } -test_three_modes () { - run_three_modes test-tool reach "$@" +test_all_modes 
() { + run_all_modes test-tool reach "$@" } test_expect_success 'ref_newer:miss' ' @@ -80,7 +86,7 @@ test_expect_success 'ref_newer:miss' ' B:commit-4-9 EOF echo "ref_newer(A,B):0" >expect && - test_three_modes ref_newer + test_all_modes ref_newer ' test_expect_success 'ref_newer:hit' ' @@ -89,7 +95,7 @@ test_expect_success 'ref_newer:hit' ' B:commit-2-3 EOF echo "ref_newer(A,B):1" >expect && - test_three_modes ref_newer + test_all_modes ref_newer ' test_expect_success 'in_merge_bases:hit' ' @@ -98,7 +104,7 @@ test_expect_success 'in_merge_bases:hit' ' B:commit-8-8 EOF echo "in_merge_bases(A,B):1" >expect && - test_three_modes in_merge_bases + test_all_modes in_merge_bases ' test_expect_success 'in_merge_bases:miss' ' @@ -107,7 +113,7 @@ test_expect_success 'in_merge_bases:miss' ' B:commit-5-9 EOF echo "in_merge_bases(A,B):0" >expect && - test_three_modes in_merge_bases + test_all_modes in_merge_bases ' test_expect_success 'is_descendant_of:hit' ' @@ -118,7 +124,7 @@ test_expect_success 'is_descendant_of:hit' ' X:commit-1-1 EOF echo "is_descendant_of(A,X):1" >expect && - test_three_modes is_descendant_of + test_all_modes is_descendant_of ' test_expect_success 'is_descendant_of:miss' ' @@ -129,7 +135,7 @@ test_expect_success 'is_descendant_of:miss' ' X:commit-7-6 EOF echo "is_descendant_of(A,X):0" >expect && - test_three_modes is_descendant_of + test_all_modes is_descendant_of ' test_expect_success 'get_merge_bases_many' ' @@ -144,7 +150,7 @@ test_expect_success 'get_merge_bases_many' ' git rev-parse commit-5-6 \ commit-4-7 | sort } >expect && - test_three_modes get_merge_bases_many + test_all_modes get_merge_bases_many ' test_expect_success 'reduce_heads' ' @@ -166,7 +172,7 @@ test_expect_success 'reduce_heads' ' commit-2-8 \ commit-1-10 | sort } >expect && - test_three_modes reduce_heads + test_all_modes reduce_heads ' test_expect_success 'can_all_from_reach:hit' ' @@ -189,7 +195,7 @@ test_expect_success 'can_all_from_reach:hit' ' Y:commit-8-1 EOF echo 
"can_all_from_reach(X,Y):1" >expect && - test_three_modes can_all_from_reach + test_all_modes can_all_from_reach ' test_expect_success 'can_all_from_reach:miss' ' @@ -211,7 +217,7 @@ test_expect_success 'can_all_from_reach:miss' ' Y:commit-8-5 EOF echo "can_all_from_reach(X,Y):0" >expect && - test_three_modes can_all_from_reach + test_all_modes can_all_from_reach ' test_expect_success 'can_all_from_reach_with_flag: tags case' ' @@ -234,7 +240,7 @@ test_expect_success 'can_all_from_reach_with_flag: tags case' ' Y:commit-8-1 EOF echo "can_all_from_reach_with_flag(X,_,_,0,0):1" >expect && - test_three_modes can_all_from_reach_with_flag + test_all_modes can_all_from_reach_with_flag ' test_expect_success 'commit_contains:hit' ' @@ -250,8 +256,8 @@ test_expect_success 'commit_contains:hit' ' X:commit-9-3 EOF echo "commit_contains(_,A,X,_):1" >expect && - test_three_modes commit_contains && - test_three_modes commit_contains --tag + test_all_modes commit_contains && + test_all_modes commit_contains --tag ' test_expect_success 'commit_contains:miss' ' @@ -267,8 +273,8 @@ test_expect_success 'commit_contains:miss' ' X:commit-9-3 EOF echo "commit_contains(_,A,X,_):0" >expect && - test_three_modes commit_contains && - test_three_modes commit_contains --tag + test_all_modes commit_contains && + test_all_modes commit_contains --tag ' test_expect_success 'rev-list: basic topo-order' ' @@ -280,7 +286,7 @@ test_expect_success 'rev-list: basic topo-order' ' commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \ commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ >expect && - run_three_modes git rev-list --topo-order commit-6-6 + run_all_modes git rev-list --topo-order commit-6-6 ' test_expect_success 'rev-list: first-parent topo-order' ' @@ -292,7 +298,7 @@ test_expect_success 'rev-list: first-parent topo-order' ' commit-6-2 \ commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ >expect && - run_three_modes git rev-list --first-parent 
--topo-order commit-6-6 + run_all_modes git rev-list --first-parent --topo-order commit-6-6 ' test_expect_success 'rev-list: range topo-order' ' @@ -304,7 +310,7 @@ test_expect_success 'rev-list: range topo-order' ' commit-6-2 commit-5-2 commit-4-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --topo-order commit-3-3..commit-6-6 + run_all_modes git rev-list --topo-order commit-3-3..commit-6-6 ' test_expect_success 'rev-list: range topo-order' ' @@ -316,7 +322,7 @@ test_expect_success 'rev-list: range topo-order' ' commit-6-2 commit-5-2 commit-4-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --topo-order commit-3-8..commit-6-6 + run_all_modes git rev-list --topo-order commit-3-8..commit-6-6 ' test_expect_success 'rev-list: first-parent range topo-order' ' @@ -328,7 +334,7 @@ test_expect_success 'rev-list: first-parent range topo-order' ' commit-6-2 \ commit-6-1 commit-5-1 commit-4-1 \ >expect && - run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 + run_all_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 ' test_expect_success 'rev-list: ancestry-path topo-order' ' @@ -338,7 +344,7 @@ test_expect_success 'rev-list: ancestry-path topo-order' ' commit-6-4 commit-5-4 commit-4-4 commit-3-4 \ commit-6-3 commit-5-3 commit-4-3 \ >expect && - run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 + run_all_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 ' test_expect_success 'rev-list: symmetric difference topo-order' ' @@ -352,7 +358,7 @@ test_expect_success 'rev-list: symmetric difference topo-order' ' commit-3-8 commit-2-8 commit-1-8 \ commit-3-7 commit-2-7 commit-1-7 \ >expect && - run_three_modes git rev-list --topo-order commit-3-8...commit-6-6 + run_all_modes git rev-list --topo-order commit-3-8...commit-6-6 ' test_expect_success 'get_reachable_subset:all' ' @@ -372,7 +378,7 @@ test_expect_success 
'get_reachable_subset:all' ' commit-1-7 \ commit-5-6 | sort ) >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_expect_success 'get_reachable_subset:some' ' @@ -390,7 +396,7 @@ test_expect_success 'get_reachable_subset:some' ' git rev-parse commit-3-3 \ commit-1-7 | sort ) >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_expect_success 'get_reachable_subset:none' ' @@ -404,7 +410,7 @@ test_expect_success 'get_reachable_subset:none' ' Y:commit-2-8 EOF echo "get_reachable_subset(X,Y)" >expect && - test_three_modes get_reachable_subset + test_all_modes get_reachable_subset ' test_done From patchwork Sat Aug 15 16:39:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715749 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BAB9F109B for ; Sat, 15 Aug 2020 21:59:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E73723343 for ; Sat, 15 Aug 2020 21:59:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="uUQR70QY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729554AbgHOV7B (ORCPT ); Sat, 15 Aug 2020 17:59:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728930AbgHOVvx (ORCPT ); Sat, 15 Aug 2020 17:51:53 -0400 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 45235C0A3BF5 for ; Sat, 15 Aug 2020 09:39:54 -0700 (PDT) Received: by mail-wm1-x343.google.com with 
SMTP id t14so10423080wmi.3 for ; Sat, 15 Aug 2020 09:39:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=uQlqr/gCizoF6WiYW8C6e1RMxvpJvuXmp1DBXcJ33U0=; b=uUQR70QYEcVwwCbj0BmyVUtA2y6IeQAWtQDaCXmetMHvgsvlpl2rX5FPtR6WztUskU KuRdZ9rQ9ororaOhhygkdl2jn6DQUDFF5AMOOn5cFxIjWlegqsTtLTtEHOC6PFkG29mY 10CvzYvmdHlC1gsReOEz05v7t0X53ZjMYMrz3btAmFrdYzYUiXQajt4V7pKfi7vaMrpu DRZG9Ny5WXXyO+V5LyLWFqvBpHhBKM7J9lZeJyoHQsGurwH6hy00sTlL/NBQ8XJuZI3N 2ecoXEqOWSMy696wkx879vnDdHSSw6C01zN5s53+YL9qJfdPfs7W68LFUlxXUmbV3hk2 medA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=uQlqr/gCizoF6WiYW8C6e1RMxvpJvuXmp1DBXcJ33U0=; b=iwqJ5T/rV4zaQ0MBCbrfTyWrgSka8XqW9qE2T3RBWdUx3/qNcBUIwbKn8s3c72jdWw +/8cS1z3B0WtptPlJeNhjEp5LsnG2MKzJmKREbtCYkg1tCqxRc4twTRbJ3EBpPb6qxxJ P2Xl5fInFgWdNOIhiEED5DJnuILpUWapGqdlUphWl5LPeWIjrAcb9rFV9mWEdsCl7R+T 3jCzr1g01R6BnP7XqzxNh2AaNw0+ebcSja/Dfo6Xt62Og6p9HizaZbC+VY3tykJdT34A MScKZdtrBc12PTa4aepT1ixuVzxrnBRadJjc3my1ok2IbYR9SFsjoWHsT5hxiejp79BJ 1Hsw== X-Gm-Message-State: AOAM532YyMXcgEEkk4o37EGS7RGO/07HI9KRGxzKn88BxXrC7vk/pjkt 74RtsNvHoBjrU4ZHd4qFsk4UP9a6Mpo= X-Google-Smtp-Source: ABdhPJztk50S5CY02oKigQb/BXAolb4gySDT3+dg7nxfkIqM5VVADHDfB9N3vLBkJgVMB8W1i/JyDQ== X-Received: by 2002:a7b:c084:: with SMTP id r4mr7253322wmh.23.1597509592795; Sat, 15 Aug 2020 09:39:52 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g145sm29769305wmg.23.2020.08.15.09.39.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:52 -0700 (PDT) Message-Id: <5a147a9704f0f8d8644c92ea38583e966378b931.1597509583.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 
16:39:41 +0000
Subject: [PATCH v3 09/11] commit-graph: use generation v2 only if entire chain does
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

Since there are released versions of Git that understand generation
numbers in the commit-graph's CDAT chunk but do not understand the GDAT
chunk, the following scenario is possible:

1. "New" Git writes a commit-graph with the GDAT chunk.
2. "Old" Git writes a split commit-graph on top without a GDAT chunk.

Because the reader currently inspects only the current layer for a
chunk_generation_data pointer, the commits in the lower layer will be
interpreted as having very large generation values (commit date plus
offset) compared to the generation numbers in the top layer (topological
level). This violates the expectation that the generation of a parent is
strictly smaller than the generation of a child.

It is difficult to expose this issue in a test. Since we _start_ with
artificially low generation numbers, any commit walk that prioritizes
generation numbers will walk all of the commits with high generation
number before walking the commits with low generation number. In all the
cases I tried, the commit-graph layers themselves "protect" against any
incorrect behavior since none of the commits in the lower layer can
reach the commits in the upper layer.

This issue would manifest itself as a performance problem in this case,
especially with something like "git log --graph" since the low
generation numbers would cause the in-degree queue to walk all of the
commits in the lower layer before allowing the topo-order queue to write
anything to output (depending on the size of the upper layer).
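
The mixed-chain problem described above can be illustrated with a small
standalone sketch. It does not use Git's real data structures: the
struct toy_commit type and both helper functions are hypothetical names
invented for this illustration, and the values are made up. It only
shows why deciding "v2 or v1" per layer can make a parent's generation
exceed its child's, while a chain-wide decision avoids that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one commit as stored in some layer. */
struct toy_commit {
	uint64_t date;        /* committer date */
	uint32_t gdat_offset; /* corrected date minus date (GDAT value) */
	uint32_t topo_level;  /* topological level (CDAT value) */
	int layer_has_gdat;   /* does the commit's own layer carry GDAT? */
};

/*
 * Per-layer decision: use the GDAT value whenever the commit's own layer
 * has the chunk. This is the behaviour the commit message warns about.
 */
static uint64_t generation_per_layer(const struct toy_commit *c)
{
	assert(c);
	return c->layer_has_gdat ? c->date + c->gdat_offset : c->topo_level;
}

/*
 * Chain-wide decision: use the GDAT value only when every layer in the
 * chain has the chunk (chain_all_gdat != 0); otherwise fall back to
 * topological levels for all commits.
 */
static uint64_t generation_chain_wide(const struct toy_commit *c,
				      int chain_all_gdat)
{
	assert(c);
	if (chain_all_gdat && c->layer_has_gdat)
		return c->date + c->gdat_offset;
	return c->topo_level;
}
```

With a parent in a GDAT layer and its child in a layer written without
GDAT, the per-layer helper yields a parent generation in the range of a
timestamp and a child generation equal to a small topological level,
which breaks the "parent is strictly smaller" expectation. The
chain-wide helper compares topological levels on both sides instead.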
Signed-off-by: Derrick Stolee Signed-off-by: Abhishek Kumar --- commit-graph.c | 32 +++++++++++++++- commit-graph.h | 1 + t/t5324-split-commit-graph.sh | 70 +++++++++++++++++++++++++++++++++++ 3 files changed, 102 insertions(+), 1 deletion(-) diff --git a/commit-graph.c b/commit-graph.c index b7a72b40db..c1292f8e08 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -597,6 +597,27 @@ static struct commit_graph *load_commit_graph_chain(struct repository *r, return graph_chain; } +static void validate_mixed_generation_chain(struct repository *r) +{ + struct commit_graph *g = r->objects->commit_graph; + int read_generation_data = 1; + + while (g) { + if (!g->chunk_generation_data) { + read_generation_data = 0; + break; + } + g = g->base_graph; + } + + g = r->objects->commit_graph; + + while (g) { + g->read_generation_data = read_generation_data; + g = g->base_graph; + } +} + struct commit_graph *read_commit_graph_one(struct repository *r, struct object_directory *odb) { @@ -605,6 +626,8 @@ struct commit_graph *read_commit_graph_one(struct repository *r, if (!g) g = load_commit_graph_chain(r, odb); + validate_mixed_generation_chain(r); + return g; } @@ -763,7 +786,7 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g, date_low = get_be32(commit_data + g->hash_len + 12); item->date = (timestamp_t)((date_high << 32) | date_low); - if (g->chunk_generation_data) + if (g->chunk_generation_data && g->read_generation_data) graph_data->generation = item->date + (timestamp_t) get_be32(g->chunk_generation_data + sizeof(uint32_t) * lex_index); else @@ -885,6 +908,7 @@ void load_commit_graph_info(struct repository *r, struct commit *item) uint32_t pos; if (!prepare_commit_graph(r)) return; + if (find_commit_in_graph(item, r->objects->commit_graph, &pos)) fill_commit_graph_info(item, r->objects->commit_graph, pos); } @@ -2192,6 +2216,9 @@ int write_commit_graph(struct object_directory *odb, g = ctx->r->objects->commit_graph; + if (g && 
!g->chunk_generation_data) + ctx->write_generation_data = 0; + while (g) { ctx->num_commit_graphs_before++; g = g->base_graph; @@ -2210,6 +2237,9 @@ int write_commit_graph(struct object_directory *odb, if (ctx->split_opts) replace = ctx->split_opts->flags & COMMIT_GRAPH_SPLIT_REPLACE; + + if (replace) + ctx->write_generation_data = 1; } ctx->approx_nr_objects = approximate_object_count(); diff --git a/commit-graph.h b/commit-graph.h index f78c892fc0..3cf89d895d 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -63,6 +63,7 @@ struct commit_graph { struct object_directory *odb; uint32_t num_commits_in_base; + uint32_t read_generation_data; struct commit_graph *base_graph; const uint32_t *chunk_oid_fanout; diff --git a/t/t5324-split-commit-graph.sh b/t/t5324-split-commit-graph.sh index 531016f405..ac5e7783fb 100755 --- a/t/t5324-split-commit-graph.sh +++ b/t/t5324-split-commit-graph.sh @@ -424,4 +424,74 @@ done <<\EOF 0600 -r-------- EOF +test_expect_success 'setup repo for mixed generation commit-graph-chain' ' + mkdir mixed && + graphdir=".git/objects/info/commit-graphs" && + cd "$TRASH_DIRECTORY/mixed" && + git init && + git config core.commitGraph true && + git config gc.writeCommitGraph false && + for i in $(test_seq 3) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git reset --hard commits/1 && + for i in $(test_seq 4 5) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git reset --hard commits/2 && + for i in $(test_seq 6 10) + do + test_commit $i && + git branch commits/$i || return 1 + done && + git commit-graph write --reachable --split && + git reset --hard commits/2 && + git merge commits/4 && + git branch merge/1 && + git reset --hard commits/4 && + git merge commits/6 && + git branch merge/2 && + GIT_TEST_COMMIT_GRAPH_NO_GDAT=1 git commit-graph write --reachable --split=no-merge && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 1 4 1 + num_commits: 2 + chunks: oid_fanout 
oid_lookup commit_metadata + EOF + test_cmp expect output && + git commit-graph verify +' + +test_expect_success 'does not write generation data chunk if not present on existing tip' ' + cd "$TRASH_DIRECTORY/mixed" && + git reset --hard commits/3 && + git merge merge/1 && + git merge commits/5 && + git merge merge/2 && + git branch merge/3 && + git commit-graph write --reachable --split=no-merge && + test-tool read-graph >output && + cat >expect <<-EOF && + header: 43475048 1 1 4 2 + num_commits: 3 + chunks: oid_fanout oid_lookup commit_metadata + EOF + test_cmp expect output && + git commit-graph verify +' + +test_expect_success 'writes generation data chunk when commit-graph chain is replaced' ' + cd "$TRASH_DIRECTORY/mixed" && + git commit-graph write --reachable --split=replace && + test_path_is_file $graphdir/commit-graph-chain && + test_line_count = 1 $graphdir/commit-graph-chain && + verify_chain_files_exist $graphdir && + graph_read_expect 15 && + git commit-graph verify +' + test_done From patchwork Sat Aug 15 16:39:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715735 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4BEFD109B for ; Sat, 15 Aug 2020 21:57:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E744206B6 for ; Sat, 15 Aug 2020 21:57:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mCm3TVYJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728973AbgHOVwC (ORCPT ); Sat, 15 Aug 2020 17:52:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45624 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S1728929AbgHOVvx (ORCPT ); Sat, 15 Aug 2020 17:51:53 -0400 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 48547C0A3BF6 for ; Sat, 15 Aug 2020 09:39:55 -0700 (PDT) Received: by mail-wr1-x443.google.com with SMTP id z18so10921545wrm.12 for ; Sat, 15 Aug 2020 09:39:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=IcWebLKiGfBEwgtIAv7pSrcEhe1QF9/VXMB8mF2w3zg=; b=mCm3TVYJ/ExcJbRcQ9bmIdXtD+nrkMumnBDHWuk8kZ6MWIzFYRsdy7q0yYK/CKAF3I +xdY0UH4+wLV+wXZqgkhPFWBskOw0oi7SIK3oNTcWnMgWJCx4kLGFhXbilGTd6+kpPRn WCtO8iMB4m08Mg75DqCgWmri4T6B9H/zEi9JzQEiOtEtw4/hnuSx9LczwgE4blYRwHOe y7bKA3ysd3ZM3wdOXqnmry8XZ276YfhxSNSAr1c3w/KnWSTAc6B6eTirAGQ3QOrM/pQI xXOkUQn6xbdeW63UNw4Nelf8REmGLPBw7b0vSWHvG3bLpuSEq5zNE2n1T4G0+AgkcCB2 YTGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=IcWebLKiGfBEwgtIAv7pSrcEhe1QF9/VXMB8mF2w3zg=; b=GtTuI3di/zxHgw5HJI5WP8BLMHhoOmpmhSsPSsT88JuKIQ+cWkoUaxmOymLezSg/fl gZs8KAa2wfoyIydhLrXKKdxFlep2K7aDqUfUQJdzKOph952IIwWNxaWzDY/s6AS5XKjD glxiW7Ek/ojDtaTzLvx2gbCqGftL7NJzCMbIykpYOi+vbPTqnNa3hglna7rDlmmoSygg Uxmek1H7ea+4Oxo5H73BVGjQEHdHV0qbjj7PngsVRv32P7tfAfgJnlfxy8Cobybpyexu F3SP97FRUl6hcOaAD94tKpLAuLDP00S3rOJ5IHrVImd+FemF/DfaDmxZBsmxH5Q4msSh cOyw== X-Gm-Message-State: AOAM532N+6G/AFMK+TJAG8wIE9w7bihUtQmUiww/cMQlupSHZI3uM4N2 e2m2P999XObw4aVbrU80taRaLPQMFt8= X-Google-Smtp-Source: ABdhPJw/CPopLF0O+/uxjwHrejxthN12rVJ78Qnbo33fqX9DNYDlxL6pjSKK0S9Ip2EXIOXJnn7c4w== X-Received: by 2002:a5d:5446:: with SMTP id w6mr7392943wrv.127.1597509593839; Sat, 15 Aug 2020 09:39:53 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by 
smtp.gmail.com with ESMTPSA id e5sm24162660wrc.37.2020.08.15.09.39.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:53 -0700 (PDT)
Message-Id: <439adc1718d6cc37f18c1eaeafd605f5c2961733.1597509583.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
From: "Abhishek Kumar via GitGitGadget"
Date: Sat, 15 Aug 2020 16:39:42 +0000
Subject: [PATCH v3 10/11] commit-reach: use corrected commit dates in paint_down_to_common()
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

From: Abhishek Kumar

With corrected commit dates implemented, we no longer have to rely on
commit date as a heuristic in paint_down_to_common().

t6024-recursive-merge sets up a unique repository where all commits have
the same committer date without a well-defined merge-base. As this has
already caused problems (as noted in 859fdc0 (commit-graph: define
GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph within the
test script.
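
For context, here is a minimal sketch of how a corrected commit date
behaves when all commits share one committer date, as in the t6024
setup. It follows the recursive definition given in this series'
documentation patch (a root keeps its committer date; otherwise take the
maximum of the committer date and one more than the largest corrected
date among the parents). The function name corrected_date() and its
array-based signature are assumptions made for this illustration, not
Git's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the corrected commit date of one commit, given the already
 * computed corrected dates of its parents. With nr_parents == 0 (a root
 * commit) the result is simply the committer date.
 */
static uint64_t corrected_date(uint64_t committer_date,
			       const uint64_t *parent_corrected,
			       size_t nr_parents)
{
	uint64_t result = committer_date;
	size_t i;

	for (i = 0; i < nr_parents; i++) {
		/* Guard the "+ 1" below against wrap-around. */
		assert(parent_corrected[i] < UINT64_MAX);
		if (parent_corrected[i] + 1 > result)
			result = parent_corrected[i] + 1;
	}
	return result;
}
```

Even when every committer date is identical, each child ends up at least
one larger than its parents, so the value strictly increases along every
parent-to-child edge. That property is what lets the walk order by
generation instead of falling back to the raw commit date.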
Signed-off-by: Abhishek Kumar --- commit-graph.c | 14 ++++++++++++++ commit-graph.h | 6 ++++++ commit-reach.c | 2 +- t/t6024-recursive-merge.sh | 4 +++- 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/commit-graph.c b/commit-graph.c index c1292f8e08..6411068411 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -703,6 +703,20 @@ int generation_numbers_enabled(struct repository *r) return !!first_generation; } +int corrected_commit_dates_enabled(struct repository *r) +{ + struct commit_graph *g; + if (!prepare_commit_graph(r)) + return 0; + + g = r->objects->commit_graph; + + if (!g->num_commits) + return 0; + + return !!g->chunk_generation_data; +} + static void close_commit_graph_one(struct commit_graph *g) { if (!g) diff --git a/commit-graph.h b/commit-graph.h index 3cf89d895d..e22ec1e626 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -91,6 +91,12 @@ struct commit_graph *parse_commit_graph(void *graph_map, size_t graph_size); */ int generation_numbers_enabled(struct repository *r); +/* + * Return 1 if and only if the repository has a commit-graph + * file and generation data chunk has been written for the file. 
+ */ +int corrected_commit_dates_enabled(struct repository *r); + enum commit_graph_write_flags { COMMIT_GRAPH_WRITE_APPEND = (1 << 0), COMMIT_GRAPH_WRITE_PROGRESS = (1 << 1), diff --git a/commit-reach.c b/commit-reach.c index 470bc80139..3a1b925274 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -39,7 +39,7 @@ static struct commit_list *paint_down_to_common(struct repository *r, int i; timestamp_t last_gen = GENERATION_NUMBER_INFINITY; - if (!min_generation) + if (!min_generation && !corrected_commit_dates_enabled(r)) queue.compare = compare_commits_by_commit_date; one->object.flags |= PARENT1; diff --git a/t/t6024-recursive-merge.sh b/t/t6024-recursive-merge.sh index 332cfc53fd..d3def66e7d 100755 --- a/t/t6024-recursive-merge.sh +++ b/t/t6024-recursive-merge.sh @@ -15,6 +15,8 @@ GIT_COMMITTER_DATE="2006-12-12 23:28:00 +0100" export GIT_COMMITTER_DATE test_expect_success 'setup tests' ' + GIT_TEST_COMMIT_GRAPH=0 && + export GIT_TEST_COMMIT_GRAPH && echo 1 >a1 && git add a1 && GIT_AUTHOR_DATE="2006-12-12 23:00:00" git commit -m 1 a1 && @@ -66,7 +68,7 @@ test_expect_success 'setup tests' ' ' test_expect_success 'combined merge conflicts' ' - test_must_fail env GIT_TEST_COMMIT_GRAPH=0 git merge -m final G + test_must_fail git merge -m final G ' test_expect_success 'result contains a conflict' ' From patchwork Sat Aug 15 16:39:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Philippe Blain via GitGitGadget X-Patchwork-Id: 11715699 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0FF60913 for ; Sat, 15 Aug 2020 21:55:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E37AF2053B for ; Sat, 15 Aug 2020 21:55:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com 
header.i=@gmail.com header.b="ssyy5pFp" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729190AbgHOVzl (ORCPT ); Sat, 15 Aug 2020 17:55:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45626 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729068AbgHOVwZ (ORCPT ); Sat, 15 Aug 2020 17:52:25 -0400 Received: from mail-wm1-x336.google.com (mail-wm1-x336.google.com [IPv6:2a00:1450:4864:20::336]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC008C0A3BF7 for ; Sat, 15 Aug 2020 09:39:56 -0700 (PDT) Received: by mail-wm1-x336.google.com with SMTP id c19so10334844wmd.1 for ; Sat, 15 Aug 2020 09:39:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ToI4nwdtL3GkFmibdyswuTEHfpGjEVUQmYdZnttMOF0=; b=ssyy5pFpIzHtxNoUSAxmynoLOXkgIk2TQb137WERJKP3CZb+ItQKOSAIBfvuK0IPWS EOEjOad699+eR2MkmDHwRiyx3GapP+TWJm76vh+BjZMULCzFjU65lQgA3XgpKU8PVLjR B/UEidtDApudZH4idAh1CreVH0j6pzYAMbcBHjttSDR2nw6AVJ8uLTQcez5LDJ7B5OGP qmulG4VW99RBecHtj9wm0Ply8TXIf2rQDeZs65IEDnOKfaqLeubQfn4kpWqbTZ33RSlq ziWW7h110YVDX8VCISKqNRLDSbAQRh5ZjXWdEDDJAUmeQAqssqgLwXTIqa5OoUAJ26t7 pQUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ToI4nwdtL3GkFmibdyswuTEHfpGjEVUQmYdZnttMOF0=; b=l1+LnqFzyU2W/FP2JNBVyg3hY6x3sHuVeyFZ0J7HWuyDQQH00/5JCBT77iZ/tw9xbO +efSmGayF9GFE/6fK0xGqTv8xz+30wYuyAOE5eWlpFnhFOvVlnrRoyFH6XI4UDOJsGpf INMy5y6tBNlXcGq59UdW+c0y60AJgo6cBj0eu4hAl9+a4Q2TBtqo3lR9SNebFxYeC3jG BwH4iQoIex4Errjja9kAv6SNXt7wd8kWEQl2+2lV2Z5Rl0w0IBftle1xJRJWX2AG3lqS 437DKD2ksLQBqKA/K+1EZpmC2LRRY9dUFG8xUoQUjorXACA4C63ZNvFL/VBgqZtBmU0r 6KWQ== X-Gm-Message-State: AOAM532Qf8uHUamznqmiTMAgQU4ZmFnESjMRK/lAr5rDfmpAkMs8SAP7 
VleZ0jt2cMNrHtXagv73dnDpjclERTo= X-Google-Smtp-Source: ABdhPJwPaH1Rb9LxHd/d8WVFU5EfeX0BXGpi8t09uVFqWBivp7p0ZrpgjHmZQO1FGX4/h94h6lInYg== X-Received: by 2002:a1c:24d5:: with SMTP id k204mr7121288wmk.159.1597509594645; Sat, 15 Aug 2020 09:39:54 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c24sm22425925wrb.11.2020.08.15.09.39.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 15 Aug 2020 09:39:54 -0700 (PDT) Message-Id: In-Reply-To: References: From: "Abhishek Kumar via GitGitGadget" Date: Sat, 15 Aug 2020 16:39:43 +0000 Subject: [PATCH v3 11/11] doc: add corrected commit date info Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Derrick Stolee , Jakub =?utf-8?b?TmFyxJlic2tp?= , Taylor Blau , Abhishek Kumar , Abhishek Kumar Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Abhishek Kumar With generation data chunk and corrected commit dates implemented, let's update the technical documentation for commit-graph. Signed-off-by: Abhishek Kumar --- .../technical/commit-graph-format.txt | 12 ++--- Documentation/technical/commit-graph.txt | 45 ++++++++++++------- 2 files changed, 36 insertions(+), 21 deletions(-) diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt index 440541045d..71c43884ec 100644 --- a/Documentation/technical/commit-graph-format.txt +++ b/Documentation/technical/commit-graph-format.txt @@ -4,11 +4,7 @@ Git commit graph format The Git commit graph stores a list of commit OIDs and some associated metadata, including: -- The generation number of the commit. Commits with no parents have - generation number 1; commits with parents have generation number - one more than the maximum generation number of its parents. We - reserve zero as special, and can be used to mark a generation - number invalid or as "not computed". +- The generation number of the commit. - The root tree OID. 
@@ -88,6 +84,12 @@ CHUNK DATA: 2 bits of the lowest byte, storing the 33rd and 34th bit of the commit time. + Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional] + * This list of 4-byte values stores corrected commit date offsets for the + commits, arranged in the same order as commit data chunk. + * This list can be later modified to store future generation number related + data. + Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional] This list of 4-byte values store the second through nth parents for all octopus merges. The second parent value in the commit data stores diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index 808fa30b99..f27145328c 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -38,14 +38,27 @@ A consumer may load the following info for a commit from the graph: Values 1-4 satisfy the requirements of parse_commit_gently(). -Define the "generation number" of a commit recursively as follows: +There are two definitions of generation number: +1. Corrected committer dates +2. Topological levels + +Define "corrected committer date" of a commit recursively as follows: + + * A commit with no parents (a root commit) has corrected committer date + equal to its committer date. + + * A commit with at least one parent has corrected committer date equal to + the maximum of its committer date and one more than the largest corrected + committer date among its parents. + +Define the "topological level" of a commit recursively as follows: * A commit with no parents (a root commit) has generation number one. - * A commit with at least one parent has generation number one more than - the largest generation number among its parents. + * A commit with at least one parent has topological level one more than + the largest topological level among its parents. 
-Equivalently, the generation number of a commit A is one more than the +Equivalently, the topological level of a commit A is one more than the length of a longest path from A to a root commit. The recursive definition is easier to use for computation and observing the following property: @@ -67,17 +80,12 @@ numbers, the general heuristic is the following: If A and B are commits with commit time X and Y, respectively, and X < Y, then A _probably_ cannot reach B. -This heuristic is currently used whenever the computation is allowed to -violate topological relationships due to clock skew (such as "git log" -with default order), but is not used when the topological order is -required (such as merge base calculations, "git log --graph"). - In practice, we expect some commits to be created recently and not stored in the commit graph. We can treat these commits as having "infinite" generation number and walk until reaching commits with known generation number. -We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not +We use the macro GENERATION_NUMBER_INFINITY to mark commits not in the commit-graph file. If a commit-graph file was written by a version of Git that did not compute generation numbers, then those commits will have generation number represented by the macro GENERATION_NUMBER_ZERO = 0. @@ -93,12 +101,11 @@ fully-computed generation numbers. Using strict inequality may result in walking a few extra commits, but the simplicity in dealing with commits with generation number *_INFINITY or *_ZERO is valuable. -We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose -generation numbers are computed to be at least this value. We limit at -this value since it is the largest value that can be stored in the -commit-graph file using the 30 bits available to generation numbers. This -presents another case where a commit can have generation number equal to -that of a parent. 
+We use the macro GENERATION_NUMBER_MAX for commits whose generation numbers +are computed to be at least this value. We limit at this value since it is +the largest value that can be stored in the commit-graph file using the bits +available to generation numbers. This presents another case where a +commit can have generation number equal to that of a parent. Design Details -------------- @@ -267,6 +274,12 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum number of commits) could be extracted into config settings for full flexibility. +We also merge commit-graph chains when we try to write a commit graph with +two different generation number definitions as they cannot be compared directly. +We overwrite the existing chain and create a commit-graph with the newer or more +efficient definition. For example, we overwrite a topological levels commit graph +chain to create a corrected commit dates commit graph chain. + ## Deleting graph-{hash} files After a new tip file is written, some `graph-{hash}` files may no longer