diff mbox series

[4/7] commit-graph: fix generation number v2 overflow values

Message ID de7ab2f39d90c6b33b21a5adf126ec2ef5ce27e6.1645735117.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Commit-graph: Generation Number v2 Fixes, v3 implementation | expand

Commit Message

Derrick Stolee Feb. 24, 2022, 8:38 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The Generation Data Chunk was implemented and tested in e8b63005c
(commit-graph: implement generation data chunk, 2021-01-16), but the
test was carefully constructed to work on systems with 32-bit dates.
Since the corrected commit date offsets still required more than 31
bits, this triggered writing the generation_data_overflow chunk.

However, upon closer look, the
write_graph_chunk_generation_data_overflow() method writes the offsets
to the chunk (as dictated by the format) but fill_commit_graph_info()
treats the value in the chunk as if it is the full corrected commit date
(not an offset). For some reason, this does not cause an issue when
using the FUTURE_DATE specified in t5318-commit-graph.sh, but it does
show up as a failure in 'git commit-graph verify' if we increase that
FUTURE_DATE to be above four billion.

Fix this error and update the test to require 64-bit dates so we can
safely use this large value in our test.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 commit-graph.c          |  2 +-
 t/t5318-commit-graph.sh | 21 +++++++++++++++++++--
 2 files changed, 20 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Feb. 24, 2022, 10:35 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -			graph_data->generation = get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
> +			graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);

Wow, that's embarrassing.

> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 1afee1c2705..5e4b0216fa6 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -815,6 +815,19 @@ test_expect_success 'corrupt commit-graph write (missing tree)' '
>  	)
>  '
>  
> +# The remaining tests check timestamps that flow over
> +# 32-bits. The graph_git_behavior checks can't take a
> +# prereq, so just stop here if we are on a 32-bit machine.
> +
> +if ! test_have_prereq TIME_IS_64BIT
> +then
> +	test_done
> +fi
> +if ! test_have_prereq TIME_T_IS_64BIT
> +then
> +	test_done
> +fi

The above is OK but is there a reason why we cannot do

	if A || B
	then
		test_done
	fi

here (I am assuming not, but in case I am missing the reason why it
has to be separate)?

> @@ -832,10 +845,10 @@ test_expect_success 'corrupt commit-graph write (missing tree)' '
>  # The largest offset observed is 2 ^ 31, just large enough to overflow.
>  #
>  
> -test_expect_success 'set up and verify repo with generation data overflow chunk' '
> +test_expect_success TIME_IS_64BIT,TIME_T_IS_64BIT 'set up and verify repo with generation data overflow chunk' '
>  	objdir=".git/objects" &&
>  	UNIX_EPOCH_ZERO="@0 +0000" &&
> -	FUTURE_DATE="@2147483646 +0000" &&
> +	FUTURE_DATE="@4000000000 +0000" &&

OK. 16#EE6B2800 too large to fit and will cause wrapping around with
signed 32-bit integer.

> @@ -867,4 +880,8 @@ test_expect_success 'set up and verify repo with generation data overflow chunk'
>  
>  graph_git_behavior 'generation data overflow chunk repo' repo left right
>  
> +# Do not add tests at the end of this file, unless they require 64-bit
> +# timestamps, since this portion of the script is only executed when
> +# time data types have 64 bits.
> +
>  test_done
Derrick Stolee Feb. 25, 2022, 1:53 p.m. UTC | #2
On 2/24/2022 5:35 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> -			graph_data->generation = get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
>> +			graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
> 
> Wow, that's embarrassing.
> 
>> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
>> index 1afee1c2705..5e4b0216fa6 100755
>> --- a/t/t5318-commit-graph.sh
>> +++ b/t/t5318-commit-graph.sh
>> @@ -815,6 +815,19 @@ test_expect_success 'corrupt commit-graph write (missing tree)' '
>>  	)
>>  '
>>  
>> +# The remaining tests check timestamps that flow over
>> +# 32-bits. The graph_git_behavior checks can't take a
>> +# prereq, so just stop here if we are on a 32-bit machine.
>> +
>> +if ! test_have_prereq TIME_IS_64BIT
>> +then
>> +	test_done
>> +fi
>> +if ! test_have_prereq TIME_T_IS_64BIT
>> +then
>> +	test_done
>> +fi
> 
> The above is OK but is there a reason why we cannot do
> 
> 	if A || B
> 	then
> 		test_done
> 	fi
> 
> here (I am assuming not, but in case I am missing the reason why it
> has to be separate)?

Does not need to be separate. I just discovered the two different
prereqs for similar, but not exact, checks. I can swap this to an
or statement.
 
>> @@ -832,10 +845,10 @@ test_expect_success 'corrupt commit-graph write (missing tree)' '
>>  # The largest offset observed is 2 ^ 31, just large enough to overflow.
>>  #
>>  
>> -test_expect_success 'set up and verify repo with generation data overflow chunk' '
>> +test_expect_success TIME_IS_64BIT,TIME_T_IS_64BIT 'set up and verify repo with generation data overflow chunk' '
>>  	objdir=".git/objects" &&
>>  	UNIX_EPOCH_ZERO="@0 +0000" &&
>> -	FUTURE_DATE="@2147483646 +0000" &&
>> +	FUTURE_DATE="@4000000000 +0000" &&
> 
> OK. 16#EE6B2800 too large to fit and will cause wrapping around with
> signed 32-bit integer.

Right. I wanted it to be right on that boundary of needing the 32nd
bit but not being over that on its own. I did check that without
the prereqs this code fails on 32-bit systems due to parsing the
time.

Thanks,
-Stolee
Junio C Hamano Feb. 25, 2022, 5:38 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

>>> +# The remaining tests check timestamps that flow over
>>> +# 32-bits. The graph_git_behavior checks can't take a
>>> +# prereq, so just stop here if we are on a 32-bit machine.
>>> +
>>> +if ! test_have_prereq TIME_IS_64BIT
>>> +then
>>> +	test_done
>>> +fi
>>> +if ! test_have_prereq TIME_T_IS_64BIT
>>> +then
>>> +	test_done
>>> +fi
>> 
>> The above is OK but is there a reason why we cannot do
>> 
>> 	if A || B
>> 	then
>> 		test_done
>> 	fi
>> 
>> here (I am assuming not, but in case I am missing the reason why it
>> has to be separate)?
>
> Does not need to be separate. I just discovered the two different
> prereqs for similar, but not exact, checks. I can swap this to an
> or statement.

I do not think a single condition with single test_done is
necessarily better. I was just curious if there was anything subtle
going on.

Thanks.
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 8e52bb09552..b86a6a634fe 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -806,7 +806,7 @@  static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 				die(_("commit-graph requires overflow generation data but has none"));
 
 			offset_pos = offset ^ CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW;
-			graph_data->generation = get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
+			graph_data->generation = item->date + get_be64(g->chunk_generation_data_overflow + 8 * offset_pos);
 		} else
 			graph_data->generation = item->date + offset;
 	} else
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 1afee1c2705..5e4b0216fa6 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -815,6 +815,19 @@  test_expect_success 'corrupt commit-graph write (missing tree)' '
 	)
 '
 
+# The remaining tests check timestamps that flow over
+# 32-bits. The graph_git_behavior checks can't take a
+# prereq, so just stop here if we are on a 32-bit machine.
+
+if ! test_have_prereq TIME_IS_64BIT
+then
+	test_done
+fi
+if ! test_have_prereq TIME_T_IS_64BIT
+then
+	test_done
+fi
+
 # We test the overflow-related code with the following repo history:
 #
 #               4:F - 5:N - 6:U
@@ -832,10 +845,10 @@  test_expect_success 'corrupt commit-graph write (missing tree)' '
 # The largest offset observed is 2 ^ 31, just large enough to overflow.
 #
 
-test_expect_success 'set up and verify repo with generation data overflow chunk' '
+test_expect_success TIME_IS_64BIT,TIME_T_IS_64BIT 'set up and verify repo with generation data overflow chunk' '
 	objdir=".git/objects" &&
 	UNIX_EPOCH_ZERO="@0 +0000" &&
-	FUTURE_DATE="@2147483646 +0000" &&
+	FUTURE_DATE="@4000000000 +0000" &&
 	test_oid_cache <<-EOF &&
 	oid_version sha1:1
 	oid_version sha256:2
@@ -867,4 +880,8 @@  test_expect_success 'set up and verify repo with generation data overflow chunk'
 
 graph_git_behavior 'generation data overflow chunk repo' repo left right
 
+# Do not add tests at the end of this file, unless they require 64-bit
+# timestamps, since this portion of the script is only executed when
+# time data types have 64 bits.
+
 test_done