diff mbox series

[v2,2/6] commit-graph: always parse before commit_graph_data_at()

Message ID 454b183b9ba502da7f40dc36aaa95cc3d12b5c2f.1612234883.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 90cb1c47c7cbf6cdd5152c1371a2732af59911a4
Headers show
Series Generation Number v2: Fix a tricky split graph bug | expand

Commit Message

Derrick Stolee Feb. 2, 2021, 3:01 a.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

There is a subtle failure happening when computing corrected commit
dates with --split enabled. It requires a base layer needing the
generation_data_overflow chunk. Then, the next layer on top
erroneously thinks it needs an overflow chunk due to a bug leading
to recalculating all reachable generation numbers. The output of
the failure is

  BUG: commit-graph.c:1912: expected to write 8 bytes to
  chunk 47444f56, but wrote 0 instead

These "expected" 8 bytes are due to re-computing the corrected
commit date for the lower layer but the new layer does not need
any overflow.

Add a test to t5318-commit-graph.sh that demonstrates this bug. However,
it does not trigger consistently with the existing code.

The generation number data is stored in a slab and accessed by
commit_graph_data_at(). This data is initialized when parsing a commit,
but is otherwise used assuming it has been populated. The loop in
compute_generation_numbers() did not enforce that all reachable
commits were parsed and had correct values. This could lead to some
problems when writing a commit-graph with corrected commit dates based
on a commit-graph without them.

It has been difficult to identify the issue here because it was so hard
to reproduce. It relies on this uninitialized data having a non-zero
value, but also on specifically in a way that overwrites the existing
data.

This patch adds the extra parse to ensure the data is filled before we
compute the generation number of a commit. This triggers the new test
to fail because the generation number overflow count does not match
between this computation and the write for that chunk.

The actual fix will follow as the next few changes.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-graph.c          | 16 ++++++++++++----
 t/t5318-commit-graph.sh | 21 +++++++++++++++++++++
 2 files changed, 33 insertions(+), 4 deletions(-)

Comments

Jonathan Nieder Feb. 3, 2021, 1:08 a.m. UTC | #1
Derrick Stolee wrote:

> There is a subtle failure happening when computing corrected commit
> dates with --split enabled. It requires a base layer needing the
> generation_data_overflow chunk. Then, the next layer on top
> erroneously thinks it needs an overflow chunk due to a bug leading
> to recalculating all reachable generation numbers. The output of
> the failure is
>
>   BUG: commit-graph.c:1912: expected to write 8 bytes to
>   chunk 47444f56, but wrote 0 instead

At Google, we're running into a commit-graph issue that appears to
have also arrived as part of this last week's rollout.  This one is a
bit worse --- it is reproducible for affected users and stops them
from being able to do day-to-day development:

  $ git pull
  remote: Finding sources: 100% (33/33)
  remote: Total 33 (delta 18), reused 33 (delta 18)
  Unpacking objects: 100% (33/33), 27.44 KiB | 460.00 KiB/s, done.
  From https://example.com/path/to/repo
     05ba0d775..279e4e6d0  master     -> origin/master
  BUG: commit-reach.c:64: bad generation skip     29e3 >      628 at 62abdabd1be00ebadbf73061ecf72b35042338e3
  error: merge died of signal 6

"git commit-graph verify" agrees that the generation numbers are wrong:

  $ git commit-graph verify
  commit-graph generation for commit 4290b2214cdf50263118322735347d151715a272 is 3 != 1586
  Verifying commits in commit graph: 100% (1/1), done.
  commit-graph generation for commit b6c73a8472c7cb503cce0668849150a4b4329230 is 1576 != 10724
  Verifying commits in commit graph: 100% (10/10), done.
  Verifying commits in commit graph: 100% (88/88), done.
  Verifying commits in commit graph: 100% (208/208), done.
  Verifying commits in commit graph: 100% (592/592), done.
  Verifying commits in commit graph: 100% (1567/1567), done.
  Verifying commits in commit graph: 100% (8358/8358), done.

We have some examples of repositories that were corrupted this way,
but we didn't catch them in the act of corruption --- it started
happening to several users with this release so we immediately rolled
back.

Questions:

- is this likely to be due to the same cause, or is it orthogonal?

- what is the recommended way to recover from this state?  "git fsck"
  shows the repositories to have no problems.  "git help commit-graph"
  doesn't show a command for users to use; is
  `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
  command?

- is there configuration or a patch we can roll out to help affected
  users recover from this state?

Thanks,
Jonathan
Derrick Stolee Feb. 3, 2021, 1:35 a.m. UTC | #2
On 2/2/2021 8:08 PM, Jonathan Nieder wrote:
> Derrick Stolee wrote:
> 
>> There is a subtle failure happening when computing corrected commit
>> dates with --split enabled. It requires a base layer needing the
>> generation_data_overflow chunk. Then, the next layer on top
>> erroneously thinks it needs an overflow chunk due to a bug leading
>> to recalculating all reachable generation numbers. The output of
>> the failure is
>>
>>   BUG: commit-graph.c:1912: expected to write 8 bytes to
>>   chunk 47444f56, but wrote 0 instead
> 
> At Google, we're running into a commit-graph issue that appears to
> have also arrived as part of this last week's rollout.  This one is a
> bit worse --- it is reproducible for affected users and stops them
> from being able to do day-to-day development:

You're shipping 'next' widely? I appreciate the extra eyes on
early bits, so we can find more issues and get them resolved.

>   $ git pull
>   remote: Finding sources: 100% (33/33)
>   remote: Total 33 (delta 18), reused 33 (delta 18)
>   Unpacking objects: 100% (33/33), 27.44 KiB | 460.00 KiB/s, done.
>   From https://example.com/path/to/repo
>      05ba0d775..279e4e6d0  master     -> origin/master
>   BUG: commit-reach.c:64: bad generation skip     29e3 >      628 at 62abdabd1be00ebadbf73061ecf72b35042338e3
>   error: merge died of signal 6
> 
> "git commit-graph verify" agrees that the generation numbers are wrong:
> 
>   $ git commit-graph verify
>   commit-graph generation for commit 4290b2214cdf50263118322735347d151715a272 is 3 != 1586
>   Verifying commits in commit graph: 100% (1/1), done.
>   commit-graph generation for commit b6c73a8472c7cb503cce0668849150a4b4329230 is 1576 != 10724
>   Verifying commits in commit graph: 100% (10/10), done.
>   Verifying commits in commit graph: 100% (88/88), done.
>   Verifying commits in commit graph: 100% (208/208), done.
>   Verifying commits in commit graph: 100% (592/592), done.
>   Verifying commits in commit graph: 100% (1567/1567), done.
>   Verifying commits in commit graph: 100% (8358/8358), done.
> 
> We have some examples of repositories that were corrupted this way,
> but we didn't catch them in the act of corruption --- it started
> happening to several users with this release so we immediately rolled
> back.

It is definitely related to the split commit-graph during the
upgrade scenario. Your verify output shows that you are using
the --split option heavily (possibly with fetch.writeCommitGraph?
or are you using 'git maintenance run --task=commit-graph'?)

> Questions:
> 
> - is this likely to be due to the same cause, or is it orthogonal?

My guess is that the reason is the same. I think that you might
have some strangeness of a commit-graph layer with corrected commit
dates being below a commit-graph layer without it.

> - what is the recommended way to recover from this state?  "git fsck"
>   shows the repositories to have no problems.  "git help commit-graph"
>   doesn't show a command for users to use; is
>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>   command?

That, followed by `git commit-graph write --reachable [--changed-paths]`
depending on what they want.

> - is there configuration or a patch we can roll out to help affected
>   users recover from this state?

If you are willing, then take v2 of this series and follow through by
clearing the commit-graph files of affected users. Note that you can
be proactive using `git commit-graph verify` to see who needs rewrites.

Thanks,
-Stolee
Jonathan Nieder Feb. 3, 2021, 1:48 a.m. UTC | #3
Hi,

Derrick Stolee wrote:
> On 2/2/2021 8:08 PM, Jonathan Nieder wrote:

>> At Google, we're running into a commit-graph issue that appears to
>> have also arrived as part of this last week's rollout.  This one is a
>> bit worse --- it is reproducible for affected users and stops them
>> from being able to do day-to-day development:
>
> You're shipping 'next' widely? I appreciate the extra eyes on
> early bits, so we can find more issues and get them resolved.

Yes.  Changes in 'next' have already gotten all the vetting via code
review that they're going to get; the difference between changes in
'next' and 'master' is that the latter have had some production
exposure among users of 'next' with the ability to get help from a
local expert, roll back quickly when there's a problem, and so on.  I
recommend that anyone with an installation with that ability use
'next', to improve the quality of code that ultimately is released
from 'master'.

It also helps us get the chance to use our experience to affect the
direction of a topic before it's too late.

[...]
>> We have some examples of repositories that were corrupted this way,
>> but we didn't catch them in the act of corruption --- it started
>> happening to several users with this release so we immediately rolled
>> back.
>
> It is definitely related to the split commit-graph during the
> upgrade scenario. Your verify output shows that you are using
> the --split option heavily (possibly with fetch.writeCommitGraph?
> or are you using 'git maintenance run --task=commit-graph'?)

Yep, the splits come from fetch.writeCommitGraph.

[...]
>> - what is the recommended way to recover from this state?  "git fsck"
>>   shows the repositories to have no problems.  "git help commit-graph"
>>   doesn't show a command for users to use; is
>>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>>   command?
>
> That, followed by `git commit-graph write --reachable [--changed-paths]`
> depending on what they want.

Can we package this as something more user-friendly?  E.g.

	git commit-graph clear
	git commit-graph write --reachable

If that makes sense to you, I'm happy to send a patch (or to review
one if someone else gets to it first).  I'm mostly asking to find out
whether this matches your idea of what the UI should be like.

>> - is there configuration or a patch we can roll out to help affected
>>   users recover from this state?
>
> If you are willing, then take v2 of this series and follow through by
> clearing the commit-graph files of affected users. Note that you can
> be proactive using `git commit-graph verify` to see who needs rewrites.

Does this mean we should change the BUG error message to help affected
users discover how they can recover for themselves (for example, using
commands like the above)?

Also, should "git fsck" call "git commit-graph verify" to make the
latter more discoverable?

Thanks,
Jonathan
Junio C Hamano Feb. 3, 2021, 2:06 a.m. UTC | #4
Derrick Stolee <stolee@gmail.com> writes:

>> - what is the recommended way to recover from this state?  "git fsck"
>>   shows the repositories to have no problems.  "git help commit-graph"
>>   doesn't show a command for users to use; is
>>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>>   command?

"rm -f .git/objects/info/commit-graph" as well, no?

> That, followed by `git commit-graph write --reachable [--changed-paths]`
> depending on what they want.

Just out of curiosity, how important is "--reachable"?  It only
traverses from the tips of refs and unlike fsck and repack, not from
reflog entries (or the index for that matter, but that shouldn't
make much difference as there is no _commit_ in the index).

>> - is there configuration or a patch we can roll out to help affected
>>   users recover from this state?
>
> If you are willing, then take v2 of this series and follow through by
> clearing the commit-graph files of affected users. Note that you can
> be proactive using `git commit-graph verify` to see who needs rewrites.

FYI, today's 368b6599 (Merge branch 'ds/commit-graph-genno-fix' into
jch, 2021-02-01) has this stuff.
Derrick Stolee Feb. 3, 2021, 3:07 a.m. UTC | #5
On 2/2/2021 8:48 PM, Jonathan Nieder wrote:
> Hi,
> 
> Derrick Stolee wrote:
>> On 2/2/2021 8:08 PM, Jonathan Nieder wrote:
> 
>>> At Google, we're running into a commit-graph issue that appears to
>>> have also arrived as part of this last week's rollout.  This one is a
>>> bit worse --- it is reproducible for affected users and stops them
>>> from being able to do day-to-day development:
>>
>> You're shipping 'next' widely? I appreciate the extra eyes on
>> early bits, so we can find more issues and get them resolved.
> 
> Yes.  Changes in 'next' have already gotten all the vetting via code
> review that they're going to get; the difference between changes in
> 'next' and 'master' is that the latter have had some production
> exposure among users of 'next' with the ability to get help from a
> local expert, roll back quickly when there's a problem, and so on.  I
> recommend that anyone with an installation with that ability use
> 'next', to improve the quality of code that ultimately is released
> from 'master'.
> 
> It also helps us get the chance to use our experience to affect the
> direction of a topic before it's too late.

This is a good practice. It's also how I found the issues fixed
in this series, but that's because I install it locally for my own
extra additional testing before shipping it to users.

> [...]
>>> We have some examples of repositories that were corrupted this way,
>>> but we didn't catch them in the act of corruption --- it started
>>> happening to several users with this release so we immediately rolled
>>> back.
>>
>> It is definitely related to the split commit-graph during the
>> upgrade scenario. Your verify output shows that you are using
>> the --split option heavily (possibly with fetch.writeCommitGraph?
>> or are you using 'git maintenance run --task=commit-graph'?)
> 
> Yep, the splits come from fetch.writeCommitGraph.
> 
> [...]
>>> - what is the recommended way to recover from this state?  "git fsck"
>>>   shows the repositories to have no problems.  "git help commit-graph"
>>>   doesn't show a command for users to use; is
>>>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>>>   command?
>>
>> That, followed by `git commit-graph write --reachable [--changed-paths]`
>> depending on what they want.
> 
> Can we package this as something more user-friendly?  E.g.
> 
> 	git commit-graph clear
> 	git commit-graph write --reachable
> 
> If that makes sense to you, I'm happy to send a patch (or to review
> one if someone else gets to it first).  I'm mostly asking to find out
> whether this matches your idea of what the UI should be like.

'clear' is probably fine. I was thinking it might be good to have
an option to the 'write' subcommand to clear the existing data, but
it's probably better as separate steps.

>>> - is there configuration or a patch we can roll out to help affected
>>>   users recover from this state?
>>
>> If you are willing, then take v2 of this series and follow through by
>> clearing the commit-graph files of affected users. Note that you can
>> be proactive using `git commit-graph verify` to see who needs rewrites.
> 
> Does this mean we should change the BUG error message to help affected
> users discover how they can recover for themselves (for example, using
> commands like the above)?

It _is_ a bug that led to this, but it's more about incorrect
commit-graph data which could be caused by anything. Better to
have a better message such as "your commit-graph file is
probably corrupt".

> Also, should "git fsck" call "git commit-graph verify" to make the
> latter more discoverable?

Yes. I thought it did, but I must be incorrect.

Thanks,
-Stolee
Derrick Stolee Feb. 3, 2021, 3:09 a.m. UTC | #6
On 2/2/2021 9:06 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
>>> - what is the recommended way to recover from this state?  "git fsck"
>>>   shows the repositories to have no problems.  "git help commit-graph"
>>>   doesn't show a command for users to use; is
>>>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>>>   command?
> 
> "rm -f .git/objects/info/commit-graph" as well, no?

In this case, that won't be necessary since they are using a
split commit-graph. However, the following is what I do to
be extra sure:

	rm -rf .git/objects/info/commit-graph*

Deletes the singleton file and the directory.

>> That, followed by `git commit-graph write --reachable [--changed-paths]`
>> depending on what they want.
> 
> Just out of curiosity, how important is "--reachable"?  It only
> traverses from the tips of refs and unlike fsck and repack, not from
> reflog entries (or the index for that matter, but that shouldn't
> make much difference as there is no _commit_ in the index).

I just like to focus on the reachable commits starting at refs
instead of scanning all packed objects to see which are commits
or not.

Thanks,
-stolee
Taylor Blau Feb. 3, 2021, 3:34 p.m. UTC | #7
On Tue, Feb 02, 2021 at 10:07:32PM -0500, Derrick Stolee wrote:
> > Can we package this as something more user-friendly?  E.g.
> >
> > 	git commit-graph clear
> > 	git commit-graph write --reachable
> >
> > If that makes sense to you, I'm happy to send a patch (or to review
> > one if someone else gets to it first).  I'm mostly asking to find out
> > whether this matches your idea of what the UI should be like.
>
> 'clear' is probably fine. I was thinking it might be good to have
> an option to the 'write' subcommand to clear the existing data, but
> it's probably better as separate steps.

Wouldn't 'git commit-graph write --split=replace --reachable' do the
same thing? I know that you changed some of the spots where we load an
existing commit graph, so my claim might be out-of-date, but I'm pretty
sure that this would get you out of a broken state.

Thinking aloud, I'm not totally sure that we should be exposing "git
commit-graph clear" to users. The only time that you'd want to run this
is if you were trying to remove a corrupted commit-graph, so I'd rather
see guidance on how to do that safely show up in
Documentation/git-commit-graph.txt.

On the other hand, now I'm encouraging running "rm -fr
$GIT_DIR/objects/info/commit-graph*", which feels dangerous.

Somewhere in the middle would be something like:

  git -c core.commitGraph=false commit-graph write --reachable

which would disable reading existing commit-graph files. Since
85102ac71b (commit-graph: don't write commit-graph when disabled,
2020-10-09), that causes us to exit immediately.

I think that reverting that patch and advertising setting
'core.commitGraph=false' in the documentation makes the most sense.

Thanks,
Taylor
Eric Sunshine Feb. 3, 2021, 5:37 p.m. UTC | #8
On Wed, Feb 3, 2021 at 10:55 AM Taylor Blau <me@ttaylorr.com> wrote:
> On Tue, Feb 02, 2021 at 10:07:32PM -0500, Derrick Stolee wrote:
> > 'clear' is probably fine. I was thinking it might be good to have
> > an option to the 'write' subcommand to clear the existing data, but
> > it's probably better as separate steps.
>
> Wouldn't 'git commit-graph write --split=replace --reachable' do the
> same thing? I know that you changed some of the spots where we load an
> existing commit graph, so my claim might be out-of-date, but I'm pretty
> sure that this would get you out of a broken state.
>
> Thinking aloud, I'm not totally sure that we should be exposing "git
> commit-graph clear" to users. The only time that you'd want to run this
> is if you were trying to remove a corrupted commit-graph, so I'd rather
> see guidance on how to do that safely show up in
> Documentation/git-commit-graph.txt.

Throwing one more idea into the mix, git-worktree recently got a
`repair` subcommand. Even though it presently repairs a small set of
problems, the subcommand name is intentionally generic so as to
provide room for growth. One could imagine `git commit-graph repair`
being added to provide a user-friendly mechanism for recovering from
commit-graph damage.
Junio C Hamano Feb. 3, 2021, 6:41 p.m. UTC | #9
Taylor Blau <me@ttaylorr.com> writes:

> Thinking aloud, I'm not totally sure that we should be exposing "git
> commit-graph clear" to users. The only time that you'd want to run this
> is if you were trying to remove a corrupted commit-graph, so I'd rather
> see guidance on how to do that safely show up in
> Documentation/git-commit-graph.txt.
>
> On the other hand, now I'm encouraging running "rm -fr
> $GIT_DIR/objects/info/commit-graph*", which feels dangerous.

True.

As this is, like pack .idx file, supposed to be "precomputed cached
data that can be fully recreated using primary information" [*], I
am perfectly fine to say "commit-graph may have unexplored corners,
and when you hit a BUG(), you can safely use 'commit-graph clear'
and recreate it from scratch, or operate without it if you feel you
do not yet want to trust your data to it for now."  Giving safer and
easier way to opt out for those who need to get today's release
done, with enough performance incentive to re-enable it when the
crunch is over, would be an honest thing to do, I would think.

	Side note: the index file also used to be considered to hold
	such cached data, that can be recreated from the working
	tree data and the tip commit.  We no longer treat it that
	way, though.

> Somewhere in the middle would be something like:
>
>   git -c core.commitGraph=false commit-graph write --reachable

I am a bit worried about the thinking along this line, because it
gives the users an impression that there is no escaping from
trusting commit-graph---the one that was created from scratch is
bug-free and they only need to be cautious about incrementals.

But (1) we do not know that, and (2) it is an unconvincing message
to somebody who just got hit by a BUG().

> which would disable reading existing commit-graph files. Since
> 85102ac71b (commit-graph: don't write commit-graph when disabled,
> 2020-10-09), that causes us to exit immediately.

Meaning the three command sequence

	git commit-graph clear
	git commit-graph write --reachable
        git config core.commitGraph false

to force a clean build of a graph and forbid further updates until
the bug is squashed???  But should't core.commitGraph forbid reading
and using the data in the existing files, too?  In which case, shouldn't
it be equivalent to "git commit-graph clear"?

> I think that reverting that patch and advertising setting
> 'core.commitGraph=false' in the documentation makes the most sense.
Taylor Blau Feb. 3, 2021, 9:08 p.m. UTC | #10
On Wed, Feb 03, 2021 at 10:41:08AM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Thinking aloud, I'm not totally sure that we should be exposing "git
> > commit-graph clear" to users. The only time that you'd want to run this
> > is if you were trying to remove a corrupted commit-graph, so I'd rather
> > see guidance on how to do that safely show up in
> > Documentation/git-commit-graph.txt.
> >
> > On the other hand, now I'm encouraging running "rm -fr
> > $GIT_DIR/objects/info/commit-graph*", which feels dangerous.
>
> True.
>
> As this is, like pack .idx file, supposed to be "precomputed cached
> data that can be fully recreated using primary information" [*], I
> am perfectly fine to say "commit-graph may have unexplored corners,
> and when you hit a BUG(), you can safely use 'commit-graph clear'
> and recreate it from scratch, or operate without it if you feel you
> do not yet want to trust your data to it for now."  Giving safer and
> easier way to opt out for those who need to get today's release
> done, with enough performance incentive to re-enable it when the
> crunch is over, would be an honest thing to do, I would think.
>
> 	Side note: the index file also used to be considered to hold
> 	such cached data, that can be recreated from the working
> 	tree data and the tip commit.  We no longer treat it that
> 	way, though.
>
> > Somewhere in the middle would be something like:
> >
> >   git -c core.commitGraph=false commit-graph write --reachable
>
> I am a bit worried about the thinking along this line, because it
> gives the users an impression that there is no escaping from
> trusting commit-graph---the one that was created from scratch is
> bug-free and they only need to be cautious about incrementals.
>
> But (1) we do not know that, and (2) it is an unconvincing message
> to somebody who just got hit by a BUG().

This is a convincing counter-point to my proposal. Yeah, I agree that we
shouldn't be advertising that commit-graph is completely trustworthy.

> > which would disable reading existing commit-graph files. Since
> > 85102ac71b (commit-graph: don't write commit-graph when disabled,
> > 2020-10-09), that causes us to exit immediately.
>
> Meaning the three command sequence
>
> 	git commit-graph clear
> 	git commit-graph write --reachable
>         git config core.commitGraph false
>
> to force a clean build of a graph and forbid further updates until
> the bug is squashed???  But should't core.commitGraph forbid reading
> and using the data in the existing files, too?  In which case, shouldn't
> it be equivalent to "git commit-graph clear"?

I think we may be saying the same thing. I was suggesting that if we
reverted 85102ac71b, that 'git -c core.commitGraph=false commit-graph
write ...' would rewrite your commit-graph from scratch (without opening
up existing ones and propagating corruption).

So I was saying that that *would* be a viable "git commit-grpah clear"
(if 85102ac71b were reverted).

Thanks,
Taylor
SZEDER Gábor Feb. 7, 2021, 7:04 p.m. UTC | #11
On Tue, Feb 02, 2021 at 06:06:51PM -0800, Junio C Hamano wrote:
> Derrick Stolee <stolee@gmail.com> writes:
> 
> >> - what is the recommended way to recover from this state?  "git fsck"
> >>   shows the repositories to have no problems.  "git help commit-graph"
> >>   doesn't show a command for users to use; is
> >>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
> >>   command?
> 
> "rm -f .git/objects/info/commit-graph" as well, no?
> 
> > That, followed by `git commit-graph write --reachable [--changed-paths]`
> > depending on what they want.
> 
> Just out of curiosity, how important is "--reachable"?  It only
> traverses from the tips of refs and unlike fsck and repack, not from
> reflog entries (or the index for that matter, but that shouldn't
> make much difference as there is no _commit_ in the index).

Scanning all objects in all packfiles is a very inefficient way to
find the commits to be recorded in the commit-graph, and depending on
the repository's shape and size can have several times higher runtime
and memory footprint.
Junio C Hamano Feb. 7, 2021, 8:12 p.m. UTC | #12
SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Tue, Feb 02, 2021 at 06:06:51PM -0800, Junio C Hamano wrote:
>> Derrick Stolee <stolee@gmail.com> writes:
>> 
>> >> - what is the recommended way to recover from this state?  "git fsck"
>> >>   shows the repositories to have no problems.  "git help commit-graph"
>> >>   doesn't show a command for users to use; is
>> >>   `rm -fr .git/objects/info/commit-graphs/` the recommended recovery
>> >>   command?
>> 
>> "rm -f .git/objects/info/commit-graph" as well, no?
>> 
>> > That, followed by `git commit-graph write --reachable [--changed-paths]`
>> > depending on what they want.
>> 
>> Just out of curiosity, how important is "--reachable"?  It only
>> traverses from the tips of refs and unlike fsck and repack, not from
>> reflog entries (or the index for that matter, but that shouldn't
>> make much difference as there is no _commit_ in the index).
>
> Scanning all objects in all packfiles is a very inefficient way to
> find the commits to be recorded in the commit-graph, and depending on
> the repository's shape and size can have several times higher runtime
> and memory footprint.

But wouldn't it make the resulting graph file not very useful for
the purpose of say deciding what object to pack when running "gc" or
"repack" or "prune"?  The fact that it ignores the index and the reflog
entries as roots of traversal with "--reachable" bothers me.
Derrick Stolee Feb. 8, 2021, 2:01 a.m. UTC | #13
On 2/7/2021 3:12 PM, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
>> Scanning all objects in all packfiles is a very inefficient way to
>> find the commits to be recorded in the commit-graph, and depending on
>> the repository's shape and size can have several times higher runtime
>> and memory footprint.
> 
> But wouldn't it make the resulting graph file not very useful for
> the purpose of say deciding what object to pack when running "gc" or
> "repack" or "prune"?  The fact that it ignores the index and the reflog
> entries as roots of traversal with "--reachable" bothers me.

The reflog _might_ have something of value there, but the hope is
that very few commits are actually being force-pushed away. The focus
is to prioritize the deep history, and it would definitely be an
anti-pattern if the commits in the reflog are so numerous that they
must be tracked by the commit-graph. Of course, skipping the --reachable
option enables a way to gather these commits as necessary.

The index won't have commit oids that matter (yes, for submodules they
will exist, but those are not commits for the super repo).

Thanks,
-Stolee
Junio C Hamano Feb. 8, 2021, 5:55 a.m. UTC | #14
Derrick Stolee <stolee@gmail.com> writes:

> The reflog _might_ have something of value there, but the hope is
> that very few commits are actually being force-pushed away. The focus
> is to prioritize the deep history, and it would definitely be an
> anti-pattern if the commits in the reflog are so numerous that they
> must be tracked by the commit-graph. Of course, skipping the --reachable
> option enables a way to gather these commits as necessary.

Sure.  As long as gaps in commit-graph coverage only affect
performance and never correctness (e.g. even if the things that are
only reachable from reflog entries are not covered by commit-graph,
when 'git prune' wants to know if an object can safely pruned, we
will correctly say "no, do not prune it yet"), that is perfectly fine.
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 03e5a987968..edbb3a0f2cc 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1193,7 +1193,9 @@  static int write_graph_chunk_generation_data(struct hashfile *f,
 
 	for (i = 0; i < ctx->commits.nr; i++) {
 		struct commit *c = ctx->commits.list[i];
-		timestamp_t offset = commit_graph_data_at(c)->generation - c->date;
+		timestamp_t offset;
+		repo_parse_commit(ctx->r, c);
+		offset = commit_graph_data_at(c)->generation - c->date;
 		display_progress(ctx->progress, ++ctx->progress_cnt);
 
 		if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) {
@@ -1444,15 +1446,20 @@  static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 					_("Computing commit graph generation numbers"),
 					ctx->commits.nr);
 	for (i = 0; i < ctx->commits.nr; i++) {
-		uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]);
-		timestamp_t corrected_commit_date = commit_graph_data_at(ctx->commits.list[i])->generation;
+		struct commit *c = ctx->commits.list[i];
+		uint32_t level;
+		timestamp_t corrected_commit_date;
+
+		repo_parse_commit(ctx->r, c);
+		level = *topo_level_slab_at(ctx->topo_levels, c);
+		corrected_commit_date = commit_graph_data_at(c)->generation;
 
 		display_progress(ctx->progress, i + 1);
 		if (level != GENERATION_NUMBER_ZERO &&
 		    corrected_commit_date != GENERATION_NUMBER_ZERO)
 			continue;
 
-		commit_list_insert(ctx->commits.list[i], &list);
+		commit_list_insert(c, &list);
 		while (list) {
 			struct commit *current = list->item;
 			struct commit_list *parent;
@@ -1461,6 +1468,7 @@  static void compute_generation_numbers(struct write_commit_graph_context *ctx)
 			timestamp_t max_corrected_commit_date = 0;
 
 			for (parent = current->parents; parent; parent = parent->next) {
+				repo_parse_commit(ctx->r, parent->item);
 				level = *topo_level_slab_at(ctx->topo_levels, parent->item);
 				corrected_commit_date = commit_graph_data_at(parent->item)->generation;
 
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index fa27df579a5..2cf29f425a0 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -446,6 +446,27 @@  test_expect_success 'warn on improper hash version' '
 	)
 '
 
+test_expect_failure 'lower layers have overflow chunk' '
+	cd "$TRASH_DIRECTORY/full" &&
+	UNIX_EPOCH_ZERO="@0 +0000" &&
+	FUTURE_DATE="@2147483646 +0000" &&
+	rm -f .git/objects/info/commit-graph &&
+	test_commit --date "$FUTURE_DATE" future-1 &&
+	test_commit --date "$UNIX_EPOCH_ZERO" old-1 &&
+	git commit-graph write --reachable &&
+	test_commit --date "$FUTURE_DATE" future-2 &&
+	test_commit --date "$UNIX_EPOCH_ZERO" old-2 &&
+	git commit-graph write --reachable --split=no-merge &&
+	test_commit extra &&
+	git commit-graph write --reachable --split=no-merge &&
+	git commit-graph write --reachable &&
+	graph_read_expect 16 "generation_data generation_data_overflow extra_edges" &&
+	mv .git/objects/info/commit-graph commit-graph-upgraded &&
+	git commit-graph write --reachable &&
+	graph_read_expect 16 "generation_data generation_data_overflow extra_edges" &&
+	test_cmp .git/objects/info/commit-graph commit-graph-upgraded
+'
+
 # the verify tests below expect the commit-graph to contain
 # exactly the commits reachable from the commits/8 branch.
 # If the file changes the set of commits in the list, then the