diff mbox series

[2/7] commit-graph: fix ordering bug in generation numbers

Message ID 6e47ffed25795260c2b8614d4589fb58d892c8df.1645735117.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Commit-graph: Generation Number v2 Fixes, v3 implementation | expand

Commit Message

Derrick Stolee Feb. 24, 2022, 8:38 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

When computing the generation numbers for a commit-graph, we compute
the corrected commit dates and then check if their offsets from the
actual dates is too large to fit in the 32-bit Generation Data chunk.
However, there is a problem with this approach: if we have parsed the
generation data from the previous commit-graph, then we continue the
loop because the corrected commit date is already computed.

It is incorrect to add an increment to num_generation_data_overflows
here, because we might start double-counting commits that are computed
because of the depth-first search walk from a commit with an earlier
OID.

Instead, iterate over the full commit list at the end, checking the
offsets to see how many grow beyond the maximum value.

Update a test in t5318 to use a larger time value, which will help
demonstrate this bug in more cases. It still won't hit all potential
cases until the next change, which reenables reading generation numbers.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 commit-graph.c          | 10 +++++++---
 t/t5318-commit-graph.sh |  4 ++--
 2 files changed, 9 insertions(+), 5 deletions(-)

Comments

Junio C Hamano Feb. 24, 2022, 10:15 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> When computing the generation numbers for a commit-graph, we compute
> the corrected commit dates and then check if their offsets from the
> actual dates is too large to fit in the 32-bit Generation Data chunk.
> However, there is a problem with this approach: if we have parsed the
> generation data from the previous commit-graph, then we continue the
> loop because the corrected commit date is already computed.
>
> It is incorrect to add an increment to num_generation_data_overflows
> here, because we might start double-counting commits that are computed
> because of the depth-first search walk from a commit with an earlier
> OID.
>
> Instead, iterate over the full commit list at the end, checking the
> offsets to see how many grow beyond the maximum value.

Hmph, I can see how the new code correctly counts the commits that
require offsets that are too large, but I am not sure why the fix is
needed.  The overall loop structure is

    for each commit ctx->commits.list[i]:
        continue if generation number has been computed for it already

	set up a commit-list for depth first search
	while (we are still digging) {
		for each parent {
			if generation for the parent is not known yet:
				push it down and redo
			else
				compute max of parents' generation number
		}
                if (all parents' generation number is known) {
			set the generation number for ourselves
			count if we needed an offset that is too big
		}
	}
    }

The only case where we may double-count near the end of inner loop I
can think of is when we end up computing generation for the same
commit in the while () loop.  But isn't that "we dig the same thing
twice" by itself something we want to fix, regardless of the
double-counting issue?

IOW,

>  				if (current->date && current->date > max_corrected_commit_date)
>  					max_corrected_commit_date = current->date - 1;
>  				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
> -
> -				if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
> -					ctx->num_generation_data_overflows++;
>  			}
>  		}
>  	}

here, before doing the assignment for the "current" commit's
generation number, if we added

		if (commit_graph_data_at(current)->generation !=
		    GENERATION_NUMBER_ZERO)
			BUG("why are we digging it twice?");

would it trigger?  If so, isn't that already a bug worth fixing?

Perhaps avoiding the second round, perhaps like this, may be a
better fix?

	while (list) {
		struct commit *current = list->item;
		struct commit_list *parent;
		int all_parents_computed = 1;
		timestamp_t max_corrected_commit_date = 0;

+		if (commit_graph_data_at(current)->generation !=
+		    GENERATION_NUMBER_ZERO) {
+			pop_commit(&list);
+			continue;
+		}
+
		for (parent = current->parents; parent; parent = parent->next) {

Or am I grossly misunderstanding why the original code is incorrect
to have the counting at this place?

> +
> +	for (i = 0; i < ctx->commits.nr; i++) {
> +		struct commit *c = ctx->commits.list[i];
> +		timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
> +		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX)
> +			ctx->num_generation_data_overflows++;
> +	}
>  	stop_progress(&ctx->progress);
>  }
>  
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index 2b05026cf6d..f9bffe38013 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -467,10 +467,10 @@ test_expect_success 'warn on improper hash version' '
>  	)
>  '
>  
> -test_expect_success 'lower layers have overflow chunk' '
> +test_expect_success TIME_IS_64BIT,TIME_T_IS_64BIT 'lower layers have overflow chunk' '
>  	cd "$TRASH_DIRECTORY/full" &&
>  	UNIX_EPOCH_ZERO="@0 +0000" &&
> -	FUTURE_DATE="@2147483646 +0000" &&
> +	FUTURE_DATE="@4147483646 +0000" &&
>  	rm -f .git/objects/info/commit-graph &&
>  	test_commit --date "$FUTURE_DATE" future-1 &&
>  	test_commit --date "$UNIX_EPOCH_ZERO" old-1 &&
Derrick Stolee Feb. 25, 2022, 1:51 p.m. UTC | #2
On 2/24/2022 5:15 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> When computing the generation numbers for a commit-graph, we compute
>> the corrected commit dates and then check if their offsets from the
>> actual dates is too large to fit in the 32-bit Generation Data chunk.
>> However, there is a problem with this approach: if we have parsed the
>> generation data from the previous commit-graph, then we continue the
>> loop because the corrected commit date is already computed.
>>
>> It is incorrect to add an increment to num_generation_data_overflows
>> here, because we might start double-counting commits that are computed
>> because of the depth-first search walk from a commit with an earlier
>> OID.
>>
>> Instead, iterate over the full commit list at the end, checking the
>> offsets to see how many grow beyond the maximum value.
> 
> Hmph, I can see how the new code correctly counts the commits that
> require offsets that are too large, but I am not sure why the fix is
> needed.  The overall loop structure is

It is very subtle, which is why it took me a while to debug this
issue once I managed to trigger it.

>     for each commit ctx->commits.list[i]:
>         continue if generation number has been computed for it already

This is the critical line in the current version. This includes
"continue if the generation number was loaded from the previous
commit-graph file." This means we under-count when building from
an existing commit-graph with overflows.

If we insert an increment here, then we risk double-counting. I
should have described this better.

> 	set up a commit-list for depth first search
> 	while (we are still digging) {
> 		for each parent {
> 			if generation for the parent is not known yet:
> 				push it down and redo
> 			else
> 				compute max of parents' generation number
> 		}
>                 if (all parents' generation number is known) {
> 			set the generation number for ourselves
> 			count if we needed an offset that is too big
> 		}
> 	}
>     }
> 
> The only case where we may double-count near the end of inner loop I
> can think of is when we end up computing generation for the same
> commit in the while () loop.  But isn't that "we dig the same thing
> twice" by itself something we want to fix, regardless of the
> double-counting issue?

By "we dig the same thing twice" I think you mean "we look across
every edge in the commit-graph, and some commits have multiple
direct children." There is no way around this, but we do skip
recalculating generation numbers for parents that are already
computed.

> IOW,
> 
>>  				if (current->date && current->date > max_corrected_commit_date)
>>  					max_corrected_commit_date = current->date - 1;
>>  				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
>> -
>> -				if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
>> -					ctx->num_generation_data_overflows++;
>>  			}
>>  		}
>>  	}
> 
> here, before doing the assignment for the "current" commit's
> generation number, if we added
> 
> 		if (commit_graph_data_at(current)->generation !=
> 		    GENERATION_NUMBER_ZERO)
> 			BUG("why are we digging it twice?");
> 
> would it trigger?  If so, isn't that already a bug worth fixing?

This would not trigger, since 'current' did not have its
generation when adding to the stack and it could not possibly
have been added a second time when doing a depth-first search
from that commit.

> Perhaps avoiding the second round, perhaps like this, may be a
> better fix?
> 
> 	while (list) {
> 		struct commit *current = list->item;
> 		struct commit_list *parent;
> 		int all_parents_computed = 1;
> 		timestamp_t max_corrected_commit_date = 0;
> 
> +		if (commit_graph_data_at(current)->generation !=
> +		    GENERATION_NUMBER_ZERO) {
> +			pop_commit(&list);
> +			continue;
> +		}
> +
> 		for (parent = current->parents; parent; parent = parent->next) {
> 
> Or am I grossly misunderstanding why the original code is incorrect
> to have the counting at this place?

Hopefully I cleared up the issue earlier in my reply. Let me
know if this is still confusing.

Thanks,
-Stolee
Junio C Hamano Feb. 25, 2022, 5:35 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

> It is very subtle, which is why it took me a while to debug this
> issue once I managed to trigger it.
>
>>     for each commit ctx->commits.list[i]:
>>         continue if generation number has been computed for it already
>
> This is the critical line in the current version. This includes
> "continue if the generation number was loaded from the previous
> commit-graph file." This means we under-count when building from
> an existing commit-graph with overflows.
>
> If we insert an increment here, then we risk double-counting. I
> should have described this better.

Ah, that obviously I missed.  Thanks.
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 265c010122e..a19bd96c2ee 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1556,12 +1556,16 @@  static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 				if (current->date && current->date > max_corrected_commit_date)
 					max_corrected_commit_date = current->date - 1;
 				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
-
-				if (commit_graph_data_at(current)->generation - current->date > GENERATION_NUMBER_V2_OFFSET_MAX)
-					ctx->num_generation_data_overflows++;
 			}
 		}
 	}
+
+	for (i = 0; i < ctx->commits.nr; i++) {
+		struct commit *c = ctx->commits.list[i];
+		timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
+		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX)
+			ctx->num_generation_data_overflows++;
+	}
 	stop_progress(&ctx->progress);
 }
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 2b05026cf6d..f9bffe38013 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -467,10 +467,10 @@  test_expect_success 'warn on improper hash version' '
 	)
 '
 
-test_expect_success 'lower layers have overflow chunk' '
+test_expect_success TIME_IS_64BIT,TIME_T_IS_64BIT 'lower layers have overflow chunk' '
 	cd "$TRASH_DIRECTORY/full" &&
 	UNIX_EPOCH_ZERO="@0 +0000" &&
-	FUTURE_DATE="@2147483646 +0000" &&
+	FUTURE_DATE="@4147483646 +0000" &&
 	rm -f .git/objects/info/commit-graph &&
 	test_commit --date "$FUTURE_DATE" future-1 &&
 	test_commit --date "$UNIX_EPOCH_ZERO" old-1 &&