Message ID | cover.1708643825.git.steadmon@google.com (mailing list archive) |
---|---|
Headers | show |
Series | commit-graph: suggest deleting corrupt graphs | expand |
Josh Steadmon <steadmon@google.com> writes: > At $WORK, we've had a few occasions where someone's commit-graph becomes > corrupt, and hits various BUG()s that block their day-to-day work. When > this happens, we advise the user to either disable the commit graph, or > to delete it and let it be regenerated. > > It would be a nicer user experience if we can make this a self-serve > procedure. To do this, let's add a new `git commit-graph clear` > subcommand so that users don't need to manually delete files under their > .git directories. And to make it self-documenting, update various BUG(), > die(), and error() messages to suggest removing the commit graph to > recover from the corruption. I am of two minds. For one, if we know there is a corruption and if we know that we will certainly recover cleanly if we removed these files, it would be fair for an end-user to respond with: instead of telling me to run "commit-graph clear", you can run it for me, can't you? The other one is if it hinders debugging the root cause to run "clear", whether it is done by the end-user or by the mechanism that detects and dies upon discovery of a corruption. Do we know how these commit-graph files become corrupt? How valuable would these corrupt files be to help us track down where the corruption comes from? If they are not all that useful in debugging, then removing them ourselves or telling users to remove them may be OK, of course. Do these BUG()s come from corruption that can be diagnosed upfront when we "open" the commit-graph files? I am wondering if it would be the matter of teaching prepare_commit_graph() to check for corruption and return without enabling the support. Thanks.
On 2024.02.22 16:05, Junio C Hamano wrote: > Josh Steadmon <steadmon@google.com> writes: > > > At $WORK, we've had a few occasions where someone's commit-graph becomes > > corrupt, and hits various BUG()s that block their day-to-day work. When > > this happens, we advise the user to either disable the commit graph, or > > to delete it and let it be regenerated. > > > > It would be a nicer user experience if we can make this a self-serve > > procedure. To do this, let's add a new `git commit-graph clear` > > subcommand so that users don't need to manually delete files under their > > .git directories. And to make it self-documenting, update various BUG(), > > die(), and error() messages to suggest removing the commit graph to > > recover from the corruption. > > I am of two minds. > > For one, if we know there is a corruption and if we know that we > will certainly recover cleanly if we removed these files, it would > be fair for an end-user to respond with: instead of telling me to > run "commit-graph clear", you can run it for me, can't you? > > The other one is if it hinders debugging the root cause to run > "clear", whether it is done by the end-user or by the mechanism that > detects and dies upon discovery of a corruption. Do we know how > these commit-graph files become corrupt? How valuable would these > corrupt files be to help us track down where the corruption comes > from? If they are not all that useful in debugging, then removing > them ourselves or telling users to remove them may be OK, of course. > > Do these BUG()s come from corruption that can be diagnosed upfront > when we "open" the commit-graph files? I am wondering if it would > be the matter of teaching prepare_commit_graph() to check for > corruption and return without enabling the support. > > Thanks. Sorry for the late reply, this got buried in my inbox. The corruption we saw was related to a generation numbers bug [1] that I think was only present for a short while in 'next'. [1] https://lore.kernel.org/git/YBn3fxFe978Up5Ly@google.com/ I believe that being able to examine the files after the corruption was detected did help us narrow down the issue, so I would lean towards not automatically deleting them upon detecting corruption. I don't think that this case would be detectable without running a full `git commit-graph verify` up front.
Josh Steadmon <steadmon@google.com> writes: > I believe that being able to examine the files after the corruption was > detected did help us narrow down the issue, so I would lean towards not > automatically deleting them upon detecting corruption. Understood. > I don't think that this case would be detectable without running a full > `git commit-graph verify` up front. OK, then that approach is not worth pursuing. Thanks for a dose of sanity.