Message ID | 20221018175530.086c8c74@apus (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | revision: ignore non-existent objects in resolve-undo list | expand |
Mathias Rav <m@git.strova.dk> writes: > Garbage collection could inadvertently prune blobs mentioned only in the > resolve-undo extension prior to the bugfix in 5a5ea141e7 > ("revision: mark blobs needed for resolve-undo as reachable", 2022-06-09). Older versions of Git did not consider blobs referenced by the resolve-undo as reachable, and allowed "git gc" to expire them out of existence. And "git fsck" in these versions did report that these blobs are unreachable. Newer versions of Git on the other hand do consider these blobs as reachable, so "git gc" would not expire them. And "git fsck" would complain when they are missing, because by definition we should not lose reachable objects. The error discussed recently on the list was only because older version was used to "git gc" away blobs that are still in use. I think the right solution for such a transitory error is not to hide the problem and pretend that such a blob reference does not exist, which is what ... > Fix the error by emitting a warning when the resolve-undo list mentions > objects that do not exist and then ignoring the nonexistent object. ... this approach is about. I think it is backwards to sweep the problem under the rug without fixing the underlying problem. We should instead be removing the reference that is no longer even usable for the purpose of resolve-undo, e.g. when "rerere forget <pathspec>" reads from the resolve-undo extension to recreate the conflicts. Perhaps "git reflog --state-fix" is a good model to follow. Back when the option was introduced, we found that there was a buggy implementation of "git gc" that did not consider commits referenced by reflog entries reachable and removed them, breaking "git reflog". The solution was to remove these reflog entries that accidentally lost commits that they reference because they no longer are usable. The manual procedure Peff gave in the thread does work OK, but if it makes it more friendly, a new option to "update-index" to fix the index file by removing things that refer to missing objects would not be a bad idea. Thanks.
Junio C Hamano <gitster@pobox.com> writes: >> Fix the error by emitting a warning when the resolve-undo list mentions >> objects that do not exist and then ignoring the nonexistent object. > > ... this approach is about. I think it is backwards to sweep the > problem under the rug without fixing the underlying problem. > > We should instead be removing the reference that is no longer even > usable for the purpose of resolve-undo, e.g. when "rerere forget > <pathspec>" reads from the resolve-undo extension to recreate the > conflicts. Ah, I take half of it back. "instead" -> "in addition". The patch corresponds to revision.c::handle_one_reflog_commit() that skips the object referenced by a reflog entry that no longer exists as part of the solution to the same problem as "reflog --stale-fix" solved, and needs to be an integral half of the solution to the "older gc lose blobs referenced by resolve-undo extension" problem. And the patch goes in the right direction. It is a bit sad that it now has to do parse_object() but in the normal case, the object referenced should be a blob that exists, for which the cost of parsing it would be none (just setting .parsed member to true), so it should be OK. Thanks. Will queue.
On Tue, Oct 18, 2022 at 09:40:01AM -0700, Junio C Hamano wrote: > And the patch goes in the right direction. It is a bit sad that it > now has to do parse_object() but in the normal case, the object > referenced should be a blob that exists, for which the cost of > parsing it would be none (just setting .parsed member to true), so > it should be OK. This isn't quite true. parse_object() will still inflate the object contents to check the sha1. I think has_object_file() is probably the right thing here. We want to know if the object is missing entirely. We'd not notice corrupted bytes, of course, but that is OK. Traversal does not open blobs we reach via trees, either. For pack-objects, we rely on either: - for repacking to disk, we check the pack crc for already-packed objects (which avoids inflating them). For loose objects, we'll inflate them later when we convert them to packed form. - for packing to stdout for fetch/push, the receiver is expected to check the sha1 via index-pack, etc. So I think just checking "do we have it? If not, gently skip it" is the right thing here. And in the long run we'd hopefully remove that code, as "we don't have it" becomes less "this was probably gc'd with an older version of git" to "oops, there is a bug in Git that lost this object". I notice that 5a5ea141e7 (revision: mark blobs needed for resolve-undo as reachable, 2022-06-09) uses parse_object() in the fsck code path. That _might_ be better as lookup_object(), as earlier stages of fsck would have checked the bytes of each object and created an in-memory object struct. Though I guess in that sense, it doesn't matter; parse_object() will hit lookup_object() first and see that in-memory struct. -Peff
diff --git a/revision.c b/revision.c index 36e31942ce..03bc45bef1 100644 --- a/revision.c +++ b/revision.c @@ -1720,18 +1720,18 @@ static void add_resolve_undo_to_pending(struct index_state *istate, struct rev_i if (!ru) continue; for (i = 0; i < 3; i++) { - struct blob *blob; + struct object *obj; if (!ru->mode[i] || !S_ISREG(ru->mode[i])) continue; - blob = lookup_blob(revs->repo, &ru->oid[i]); - if (!blob) { + obj = parse_object(revs->repo, &ru->oid[i]); + if (!obj) { warning(_("resolve-undo records `%s` which is missing"), oid_to_hex(&ru->oid[i])); continue; } - add_pending_object_with_path(revs, &blob->object, "", + add_pending_object_with_path(revs, obj, "", ru->mode[i], path); } }
Garbage collection could inadvertently prune blobs mentioned only in the resolve-undo extension prior to the bugfix in 5a5ea141e7 ("revision: mark blobs needed for resolve-undo as reachable", 2022-06-09). If a repository is affected by this bug, an obscure error can occur in `git gc` after updating to a version of git that has the bugfix: $ git gc Enumerating objects: 327687, done. Counting objects: 100% (327687/327687), done. Delta compression using up to 8 threads Compressing objects: 100% (70883/70883), done. fatal: unable to read 616c8d17f4625f227708aae480e71233f7f58dce fatal: failed to run repack A similar error occurs in `git rev-list --objects --indexed-objects`. Fix the error by emitting a warning when the resolve-undo list mentions objects that do not exist and then ignoring the nonexistent object. The bugfix 5a5ea141e7 already contained code to emit this warning, but since the code used lookup_blob() (and not parse_object()), it would only warn in the unlikely scenario where the resolve-undo list mentions an existing object that is not a blob. I have encountered this error on two different clones of a large mono-repository in checkouts with dozens of worktrees that see frequent rebasing (causing a lot of git gc churn) and frequent merge conflicts during rebasing, leading to resolve-undo lists in the index. Somehow it seems the resolve-undo lists in the index persist after the merge conflicts are resolved, which makes the error more frequent than if the resolve-undo lists were only populated during the merge conflict. Signed-off-by: Mathias Rav <m@git.strova.dk> Fixes: 5a5ea141e7 Reported-by: Paul Wagland <pwagland@gmail.com> --- revision.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)