diff mbox series

[3/6] commit-graph: consolidate fill_commit_graph_info

Message ID 701f5912369c0fcc07cf604c3129cb5017a125ce.1595927632.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Implement Corrected Commit Date | expand

Commit Message

Linus Arver via GitGitGadget July 28, 2020, 9:13 a.m. UTC
From: Abhishek Kumar <abhishekkumar8222@gmail.com>

Both fill_commit_graph_info() and fill_commit_in_graph() parse
information present in commit data chunk. Let's simplify the
implementation by calling fill_commit_graph_info() within
fill_commit_in_graph().

The test 'generate tar with future mtime' creates a commit with commit
time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
generation number and has undefined behavior. The test used to pass as
fill_commit_in_graph() did not read commit time from commit graph,
reading commit date from odb instead.

Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
---
 commit-graph.c      | 31 ++++++++++++-------------------
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 14 insertions(+), 21 deletions(-)

Comments

Derrick Stolee July 28, 2020, 1:14 p.m. UTC | #1
On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
> From: Abhishek Kumar <abhishekkumar8222@gmail.com>
> 
> Both fill_commit_graph_info() and fill_commit_in_graph() parse
> information present in commit data chunk. Let's simplify the
> implementation by calling fill_commit_graph_info() within
> fill_commit_in_graph().
> 
> The test 'generate tar with future mtime' creates a commit with commit
> time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
> generation number and has undefined behavior. The test used to pass as
> fill_commit_in_graph() did not read commit time from commit graph,
> reading commit date from odb instead.

I was first confused as to why fill_commit_graph_info() did not
load the timestamp, but the reason is that it is only used by
two methods:

1. fill_commit_in_graph(): this actually leaves the commit in a
   "parsed" state, so the date must be correct. Thus, it parses
   the date out of the commit-graph.

2. load_commit_graph_info(): this only helps to guarantee we
   know the graph_pos and generation number values.

Perhaps add this extra context: you will _need_ the commit date
from the commit-graph in order to populate the generation number
v2 in fill_commit_graph_info().

> Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.

The timestamp limit placed in the commit-graph is more restrictive
than 64-bit timestamps, but as your test points out, the maximum
timestamp allowed takes place in the year 2514. That is far enough
away for all real data.

> Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> ---
>  commit-graph.c      | 31 ++++++++++++-------------------
>  t/t5000-tar-tree.sh |  4 ++--
>  2 files changed, 14 insertions(+), 21 deletions(-)
> 
> diff --git a/commit-graph.c b/commit-graph.c
> index 5d3c9bd23c..204eb454b2 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -735,15 +735,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
>  	const unsigned char *commit_data;
>  	struct commit_graph_data *graph_data;
>  	uint32_t lex_index;
> +	uint64_t date_high, date_low;
>  
>  	while (pos < g->num_commits_in_base)
>  		g = g->base_graph;
>  
> +	if (pos >= g->num_commits + g->num_commits_in_base)
> +		die(_("invalid commit position. commit-graph is likely corrupt"));
> +
>  	lex_index = pos - g->num_commits_in_base;
>  	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
>  
>  	graph_data = commit_graph_data_at(item);
>  	graph_data->graph_pos = pos;
> +
> +	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> +	date_low = get_be32(commit_data + g->hash_len + 12);
> +	item->date = (timestamp_t)((date_high << 32) | date_low);
> +
>  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
>  }
>  
> @@ -758,38 +767,22 @@ static int fill_commit_in_graph(struct repository *r,
>  {
>  	uint32_t edge_value;
>  	uint32_t *parent_data_ptr;
> -	uint64_t date_low, date_high;
>  	struct commit_list **pptr;
> -	struct commit_graph_data *graph_data;
>  	const unsigned char *commit_data;
>  	uint32_t lex_index;
>  
> +	fill_commit_graph_info(item, g, pos);
> +
>  	while (pos < g->num_commits_in_base)
>  		g = g->base_graph;

This 'while' loop happens in both implementations, so you could
save a miniscule amount of time by placing the call to
fill_commit_graph_info() after the while loop.

> -	if (pos >= g->num_commits + g->num_commits_in_base)
> -		die(_("invalid commit position. commit-graph is likely corrupt"));

> -	/*
> -	 * Store the "full" position, but then use the
> -	 * "local" position for the rest of the calculation.
> -	 */
> -	graph_data = commit_graph_data_at(item);
> -	graph_data->graph_pos = pos;
>  	lex_index = pos - g->num_commits_in_base;
> -
> -	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
> +	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;

I was about to complain about this change, but GRAPH_DATA_WIDTH
is a macro that does an equivalent thing (except the_hash_algo->rawsz
instead of g->hash_len).

>  
>  	item->object.parsed = 1;
>  
>  	set_commit_tree(item, NULL);
>  
> -	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> -	date_low = get_be32(commit_data + g->hash_len + 12);
> -	item->date = (timestamp_t)((date_high << 32) | date_low);
> -
> -	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> -
>  	pptr = &item->parents;
>  
>  	edge_value = get_be32(commit_data + g->hash_len);
> diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> index 37655a237c..1986354fc3 100755
> --- a/t/t5000-tar-tree.sh
> +++ b/t/t5000-tar-tree.sh
> @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
>  	rm -f .git/index &&
>  	echo content >file &&
>  	git add file &&
> -	GIT_COMMITTER_DATE="@68719476737 +0000" \
> +	GIT_COMMITTER_DATE="@17179869183 +0000" \
>  		git commit -m "tempori parendum"
>  '
>  
> @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
>  '
>  
>  test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
> -	echo 4147 >expect &&
> +	echo 2514 >expect &&
>  	tar_info future.tar | cut -d" " -f2 >actual &&
>  	test_cmp expect actual
>  '
> 

Thanks,
-Stolee
René Scharfe July 28, 2020, 3:19 p.m. UTC | #2
[Had to remove stolee@gmail.com because with it my mail provider
 rejected this email with the following error message:

   Requested action not taken: mailbox unavailable
   invalid DNS MX or A/AAAA resource record.]

Am 28.07.20 um 15:14 schrieb Derrick Stolee:
> On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
>> From: Abhishek Kumar <abhishekkumar8222@gmail.com>
>>
>> Both fill_commit_graph_info() and fill_commit_in_graph() parse
>> information present in commit data chunk. Let's simplify the
>> implementation by calling fill_commit_graph_info() within
>> fill_commit_in_graph().
>>
>> The test 'generate tar with future mtime' creates a commit with commit
>> time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
>> generation number and has undefined behavior. The test used to pass as
>> fill_commit_in_graph() did not read commit time from commit graph,
>> reading commit date from odb instead.
>
> I was first confused as to why fill_commit_graph_info() did not
> load the timestamp, but the reason is that it is only used by
> two methods:
>
> 1. fill_commit_in_graph(): this actually leaves the commit in a
>    "parsed" state, so the date must be correct. Thus, it parses
>    the date out of the commit-graph.
>
> 2. load_commit_graph_info(): this only helps to guarantee we
>    know the graph_pos and generation number values.
>
> Perhaps add this extra context: you will _need_ the commit date
> from the commit-graph in order to populate the generation number
> v2 in fill_commit_graph_info().
>
>> Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.
>
> The timestamp limit placed in the commit-graph is more restrictive
> than 64-bit timestamps, but as your test points out, the maximum
> timestamp allowed takes place in the year 2514. That is far enough
> away for all real data.

We all may feel like the end of the world is imminent, but do we really
need to set such an arbitrary limit?  OK, that limit was already set two
years ago, and I'm really late.  But still: It's sad to see anything
else than signed 64-bit timestamps to be used in fresh code (after Y2K).
The extra four bytes would fatten up the structures less than the
transition from SHA-1 to SHA-256 will, and no bit twiddling would be
required.  *sigh*

René
Derrick Stolee July 28, 2020, 3:58 p.m. UTC | #3
On 7/28/2020 11:19 AM, René Scharfe wrote:
> [Had to remove stolee@gmail.com because with it my mail provider
>  rejected this email with the following error message:
> 
>    Requested action not taken: mailbox unavailable
>    invalid DNS MX or A/AAAA resource record.]
> 
> Am 28.07.20 um 15:14 schrieb Derrick Stolee:
>> On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
>>> From: Abhishek Kumar <abhishekkumar8222@gmail.com>
>>>
>>> Both fill_commit_graph_info() and fill_commit_in_graph() parse
>>> information present in commit data chunk. Let's simplify the
>>> implementation by calling fill_commit_graph_info() within
>>> fill_commit_in_graph().
>>>
>>> The test 'generate tar with future mtime' creates a commit with commit
>>> time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
>>> generation number and has undefined behavior. The test used to pass as
>>> fill_commit_in_graph() did not read commit time from commit graph,
>>> reading commit date from odb instead.
>>
>> I was first confused as to why fill_commit_graph_info() did not
>> load the timestamp, but the reason is that it is only used by
>> two methods:
>>
>> 1. fill_commit_in_graph(): this actually leaves the commit in a
>>    "parsed" state, so the date must be correct. Thus, it parses
>>    the date out of the commit-graph.
>>
>> 2. load_commit_graph_info(): this only helps to guarantee we
>>    know the graph_pos and generation number values.
>>
>> Perhaps add this extra context: you will _need_ the commit date
>> from the commit-graph in order to populate the generation number
>> v2 in fill_commit_graph_info().
>>
>>> Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.
>>
>> The timestamp limit placed in the commit-graph is more restrictive
>> than 64-bit timestamps, but as your test points out, the maximum
>> timestamp allowed takes place in the year 2514. That is far enough
>> away for all real data.
> 
> We all may feel like the end of the world is imminent, but do we really
> need to set such an arbitrary limit?  OK, that limit was already set two
> years ago, and I'm really late.  But still: It's sad to see anything
> else than signed 64-bit timestamps to be used in fresh code (after Y2K).
> The extra four bytes would fatten up the structures less than the
> transition from SHA-1 to SHA-256 will, and no bit twiddling would be
> required.  *sigh*

One thing to consider after generation number v2 is out long enough
is if we could drop the topo-levels and write zeroes for the topo-
level portion. This was valid data in the first version of the
commit-graph, so it would still be valid. Then, we could allow
full 64-bit timestamps again.

This is something to think about again in a year, maybe.

Thanks,
-Stolee
Taylor Blau July 28, 2020, 4:01 p.m. UTC | #4
On Tue, Jul 28, 2020 at 09:14:42AM -0400, Derrick Stolee wrote:
> On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
> > From: Abhishek Kumar <abhishekkumar8222@gmail.com>
> >
> > Both fill_commit_graph_info() and fill_commit_in_graph() parse
> > information present in commit data chunk. Let's simplify the
> > implementation by calling fill_commit_graph_info() within
> > fill_commit_in_graph().
> >
> > The test 'generate tar with future mtime' creates a commit with commit
> > time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
> > generation number and has undefined behavior. The test used to pass as
> > fill_commit_in_graph() did not read commit time from commit graph,
> > reading commit date from odb instead.
>
> I was first confused as to why fill_commit_graph_info() did not
> load the timestamp, but the reason is that it is only used by
> two methods:
>
> 1. fill_commit_in_graph(): this actually leaves the commit in a
>    "parsed" state, so the date must be correct. Thus, it parses
>    the date out of the commit-graph.
>
> 2. load_commit_graph_info(): this only helps to guarantee we
>    know the graph_pos and generation number values.
>
> Perhaps add this extra context: you will _need_ the commit date
> from the commit-graph in order to populate the generation number
> v2 in fill_commit_graph_info().
>
> > Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.
>
> The timestamp limit placed in the commit-graph is more restrictive
> than 64-bit timestamps, but as your test points out, the maximum
> timestamp allowed takes place in the year 2514. That is far enough
> away for all real data.
>
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > ---
> >  commit-graph.c      | 31 ++++++++++++-------------------
> >  t/t5000-tar-tree.sh |  4 ++--
> >  2 files changed, 14 insertions(+), 21 deletions(-)
> >
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 5d3c9bd23c..204eb454b2 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -735,15 +735,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
> >  	const unsigned char *commit_data;
> >  	struct commit_graph_data *graph_data;
> >  	uint32_t lex_index;
> > +	uint64_t date_high, date_low;
> >
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> >
> > +	if (pos >= g->num_commits + g->num_commits_in_base)
> > +		die(_("invalid commit position. commit-graph is likely corrupt"));
> > +
> >  	lex_index = pos - g->num_commits_in_base;
> >  	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
> >
> >  	graph_data = commit_graph_data_at(item);
> >  	graph_data->graph_pos = pos;
> > +
> > +	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > +	date_low = get_be32(commit_data + g->hash_len + 12);
> > +	item->date = (timestamp_t)((date_high << 32) | date_low);
> > +
> >  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> >  }
> >
> > @@ -758,38 +767,22 @@ static int fill_commit_in_graph(struct repository *r,
> >  {
> >  	uint32_t edge_value;
> >  	uint32_t *parent_data_ptr;
> > -	uint64_t date_low, date_high;
> >  	struct commit_list **pptr;
> > -	struct commit_graph_data *graph_data;
> >  	const unsigned char *commit_data;
> >  	uint32_t lex_index;
> >
> > +	fill_commit_graph_info(item, g, pos);
> > +
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
>
> This 'while' loop happens in both implementations, so you could
> save a miniscule amount of time by placing the call to
> fill_commit_graph_info() after the while loop.
>
> > -	if (pos >= g->num_commits + g->num_commits_in_base)
> > -		die(_("invalid commit position. commit-graph is likely corrupt"));
>
> > -	/*
> > -	 * Store the "full" position, but then use the
> > -	 * "local" position for the rest of the calculation.
> > -	 */
> > -	graph_data = commit_graph_data_at(item);
> > -	graph_data->graph_pos = pos;
> >  	lex_index = pos - g->num_commits_in_base;
> > -
> > -	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
> > +	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
>
> I was about to complain about this change, but GRAPH_DATA_WIDTH
> is a macro that does an equivalent thing (except the_hash_algo->rawsz
> instead of g->hash_len).
>
> >
> >  	item->object.parsed = 1;
> >
> >  	set_commit_tree(item, NULL);
> >
> > -	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > -	date_low = get_be32(commit_data + g->hash_len + 12);
> > -	item->date = (timestamp_t)((date_high << 32) | date_low);
> > -
> > -	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> > -
> >  	pptr = &item->parents;
> >
> >  	edge_value = get_be32(commit_data + g->hash_len);
> > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> > index 37655a237c..1986354fc3 100755
> > --- a/t/t5000-tar-tree.sh
> > +++ b/t/t5000-tar-tree.sh
> > @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
> >  	rm -f .git/index &&
> >  	echo content >file &&
> >  	git add file &&
> > -	GIT_COMMITTER_DATE="@68719476737 +0000" \
> > +	GIT_COMMITTER_DATE="@17179869183 +0000" \
> >  		git commit -m "tempori parendum"
> >  '
> >
> > @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
> >  '
> >
> >  test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
> > -	echo 4147 >expect &&
> > +	echo 2514 >expect &&
> >  	tar_info future.tar | cut -d" " -f2 >actual &&
> >  	test_cmp expect actual
> >  '
> >
>
> Thanks,
> -Stolee

Agreed with Stolee's review, but otherwise this looks like a faithful
transformation.

Thanks,
Taylor
Abhishek Kumar July 30, 2020, 6:07 a.m. UTC | #5
On Tue, Jul 28, 2020 at 09:14:42AM -0400, Derrick Stolee wrote:
> On 7/28/2020 5:13 AM, Abhishek Kumar via GitGitGadget wrote:
> > From: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > 
> > Both fill_commit_graph_info() and fill_commit_in_graph() parse
> > information present in commit data chunk. Let's simplify the
> > implementation by calling fill_commit_graph_info() within
> > fill_commit_in_graph().
> > 
> > The test 'generate tar with future mtime' creates a commit with commit
> > time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
> > generation number and has undefined behavior. The test used to pass as
> > fill_commit_in_graph() did not read commit time from commit graph,
> > reading commit date from odb instead.
> 
> I was first confused as to why fill_commit_graph_info() did not
> load the timestamp, but the reason is that it is only used by
> two methods:
> 
> 1. fill_commit_in_graph(): this actually leaves the commit in a
>    "parsed" state, so the date must be correct. Thus, it parses
>    the date out of the commit-graph.
> 
> 2. load_commit_graph_info(): this only helps to guarantee we
>    know the graph_pos and generation number values.
> 
> Perhaps add this extra context: you will _need_ the commit date
> from the commit-graph in order to populate the generation number
> v2 in fill_commit_graph_info().

Thanks, that makes sense. I have revised the commit message to:

commit-graph: consolidate fill_commit_graph_info
    
    Both fill_commit_graph_info() and fill_commit_in_graph() parse
    information present in commit data chunk. Let's simplify the
    implementation by calling fill_commit_graph_info() within
    fill_commit_in_graph().
    
    The test 'generate tar with future mtime' creates a commit with commit
    time of (2 ^ 36 + 1) seconds since EPOCH. The commit time overflows into
    generation number (within CDAT chunk) and has undefined behavior.
    
    The test used to pass as fill_commit_in_graph() guarantees the values of
    graph position and generation number, and did not load timestamp.
    However, with corrected commit date we will need load the timestamp as
    well to populate the generation number.
> 
> > Let's fix that by setting commit time of (2 ^ 34 - 1) seconds.
> 
> The timestamp limit placed in the commit-graph is more restrictive
> than 64-bit timestamps, but as your test points out, the maximum
> timestamp allowed takes place in the year 2514. That is far enough
> away for all real data.
> 
> > Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
> > ---
> >  commit-graph.c      | 31 ++++++++++++-------------------
> >  t/t5000-tar-tree.sh |  4 ++--
> >  2 files changed, 14 insertions(+), 21 deletions(-)
> > 
> > diff --git a/commit-graph.c b/commit-graph.c
> > index 5d3c9bd23c..204eb454b2 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -735,15 +735,24 @@ static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
> >  	const unsigned char *commit_data;
> >  	struct commit_graph_data *graph_data;
> >  	uint32_t lex_index;
> > +	uint64_t date_high, date_low;
> >  
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> >  
> > +	if (pos >= g->num_commits + g->num_commits_in_base)
> > +		die(_("invalid commit position. commit-graph is likely corrupt"));
> > +
> >  	lex_index = pos - g->num_commits_in_base;
> >  	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
> >  
> >  	graph_data = commit_graph_data_at(item);
> >  	graph_data->graph_pos = pos;
> > +
> > +	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > +	date_low = get_be32(commit_data + g->hash_len + 12);
> > +	item->date = (timestamp_t)((date_high << 32) | date_low);
> > +
> >  	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> >  }
> >  
> > @@ -758,38 +767,22 @@ static int fill_commit_in_graph(struct repository *r,
> >  {
> >  	uint32_t edge_value;
> >  	uint32_t *parent_data_ptr;
> > -	uint64_t date_low, date_high;
> >  	struct commit_list **pptr;
> > -	struct commit_graph_data *graph_data;
> >  	const unsigned char *commit_data;
> >  	uint32_t lex_index;
> >  
> > +	fill_commit_graph_info(item, g, pos);
> > +
> >  	while (pos < g->num_commits_in_base)
> >  		g = g->base_graph;
> 
> This 'while' loop happens in both implementations, so you could
> save a miniscule amount of time by placing the call to
> fill_commit_graph_info() after the while loop.
> 
> > -	if (pos >= g->num_commits + g->num_commits_in_base)
> > -		die(_("invalid commit position. commit-graph is likely corrupt"));
> 
> > -	/*
> > -	 * Store the "full" position, but then use the
> > -	 * "local" position for the rest of the calculation.
> > -	 */
> > -	graph_data = commit_graph_data_at(item);
> > -	graph_data->graph_pos = pos;
> >  	lex_index = pos - g->num_commits_in_base;
> > -
> > -	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
> > +	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
> 
> I was about to complain about this change, but GRAPH_DATA_WIDTH
> is a macro that does an equivalent thing (except the_hash_algo->rawsz
> instead of g->hash_len).
> 
> >  
> >  	item->object.parsed = 1;
> >  
> >  	set_commit_tree(item, NULL);
> >  
> > -	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
> > -	date_low = get_be32(commit_data + g->hash_len + 12);
> > -	item->date = (timestamp_t)((date_high << 32) | date_low);
> > -
> > -	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> > -
> >  	pptr = &item->parents;
> >  
> >  	edge_value = get_be32(commit_data + g->hash_len);
> > diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
> > index 37655a237c..1986354fc3 100755
> > --- a/t/t5000-tar-tree.sh
> > +++ b/t/t5000-tar-tree.sh
> > @@ -406,7 +406,7 @@ test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
> >  	rm -f .git/index &&
> >  	echo content >file &&
> >  	git add file &&
> > -	GIT_COMMITTER_DATE="@68719476737 +0000" \
> > +	GIT_COMMITTER_DATE="@17179869183 +0000" \
> >  		git commit -m "tempori parendum"
> >  '
> >  
> > @@ -415,7 +415,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
> >  '
> >  
> >  test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
> > -	echo 4147 >expect &&
> > +	echo 2514 >expect &&
> >  	tar_info future.tar | cut -d" " -f2 >actual &&
> >  	test_cmp expect actual
> >  '
> > 
> 
> Thanks,
> -Stolee
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 5d3c9bd23c..204eb454b2 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -735,15 +735,24 @@  static void fill_commit_graph_info(struct commit *item, struct commit_graph *g,
 	const unsigned char *commit_data;
 	struct commit_graph_data *graph_data;
 	uint32_t lex_index;
+	uint64_t date_high, date_low;
 
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
+	if (pos >= g->num_commits + g->num_commits_in_base)
+		die(_("invalid commit position. commit-graph is likely corrupt"));
+
 	lex_index = pos - g->num_commits_in_base;
 	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
 
 	graph_data = commit_graph_data_at(item);
 	graph_data->graph_pos = pos;
+
+	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
+	date_low = get_be32(commit_data + g->hash_len + 12);
+	item->date = (timestamp_t)((date_high << 32) | date_low);
+
 	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
 }
 
@@ -758,38 +767,22 @@  static int fill_commit_in_graph(struct repository *r,
 {
 	uint32_t edge_value;
 	uint32_t *parent_data_ptr;
-	uint64_t date_low, date_high;
 	struct commit_list **pptr;
-	struct commit_graph_data *graph_data;
 	const unsigned char *commit_data;
 	uint32_t lex_index;
 
+	fill_commit_graph_info(item, g, pos);
+
 	while (pos < g->num_commits_in_base)
 		g = g->base_graph;
 
-	if (pos >= g->num_commits + g->num_commits_in_base)
-		die(_("invalid commit position. commit-graph is likely corrupt"));
-
-	/*
-	 * Store the "full" position, but then use the
-	 * "local" position for the rest of the calculation.
-	 */
-	graph_data = commit_graph_data_at(item);
-	graph_data->graph_pos = pos;
 	lex_index = pos - g->num_commits_in_base;
-
-	commit_data = g->chunk_commit_data + (g->hash_len + 16) * lex_index;
+	commit_data = g->chunk_commit_data + GRAPH_DATA_WIDTH * lex_index;
 
 	item->object.parsed = 1;
 
 	set_commit_tree(item, NULL);
 
-	date_high = get_be32(commit_data + g->hash_len + 8) & 0x3;
-	date_low = get_be32(commit_data + g->hash_len + 12);
-	item->date = (timestamp_t)((date_high << 32) | date_low);
-
-	graph_data->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
-
 	pptr = &item->parents;
 
 	edge_value = get_be32(commit_data + g->hash_len);
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 37655a237c..1986354fc3 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -406,7 +406,7 @@  test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
 	rm -f .git/index &&
 	echo content >file &&
 	git add file &&
-	GIT_COMMITTER_DATE="@68719476737 +0000" \
+	GIT_COMMITTER_DATE="@17179869183 +0000" \
 		git commit -m "tempori parendum"
 '
 
@@ -415,7 +415,7 @@  test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
 '
 
 test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can read our future mtime' '
-	echo 4147 >expect &&
+	echo 2514 >expect &&
 	tar_info future.tar | cut -d" " -f2 >actual &&
 	test_cmp expect actual
 '