Message ID | cover.1730409376.git.jonathantanmy@google.com (mailing list archive) |
---|---|
Headers | show |
Series | When fetching, die if in commit graph but not obj db | expand |
On 2024.10.31 14:18, Jonathan Tan wrote: > Thanks everyone for your review comments. I've updated the patch 1 > commit message as Josh requested. I'll reply individually to comments on > patch 2. > > Jonathan Tan (2): > Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" > fetch-pack: warn if in commit graph but not obj db > > fetch-pack.c | 42 +++++++++++----------- > t/t5330-no-lazy-fetch-with-commit-graph.sh | 2 +- > 2 files changed, 23 insertions(+), 21 deletions(-) This version looks good to me, thanks! Reviewed-by: Josh Steadmon <steadmon@google.com>
Thanks everyone for your review comments. I've updated the patch 1 commit message as Josh requested. I'll reply individually to comments on patch 2. Jonathan Tan (2): Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" fetch-pack: warn if in commit graph but not obj db fetch-pack.c | 42 +++++++++++----------- t/t5330-no-lazy-fetch-with-commit-graph.sh | 2 +- 2 files changed, 23 insertions(+), 21 deletions(-) Range-diff against v1: 1: 4dea8933cf ! 1: 34e87b8388 Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" @@ Commit message This reverts commit a6e65fb39caf18259c660c1c7910d5bf80bc15cb. + This revert simplifies the next patch in this patch set. + The commit message of that commit mentions that the new function "will be used for the bundle-uri client in a subsequent commit", but it seems that eventually it wasn't used. 2: 1027ff2cb7 ! 2: 631b9a8677 fetch-pack: warn if in commit graph but not obj db @@ Commit message action, because the object is present in the commit graph file but not in the object DB. - Therefore, detect when this occurs and print a warning. (Note that - we cannot proceed to include this object in the list of objects to - be fetched without changing at least the fetch negotiation code: - what would happen is that the client will send "want X" and "have X" - and when I tested at $DAYJOB with a work server that uses JGit, the - server reasonably returned an empty packfile. And changing the fetch - negotiation code to only use the object DB when deciding what to report - as "have" would be an unnecessary slowdown, I think.) + Therefore, make it a fatal error when this occurs. (Note that we cannot + proceed to include this object in the list of objects to be fetched + without changing at least the fetch negotiation code: what would happen + is that the client will send "want X" and "have X" and when I tested + at $DAYJOB with a work server that uses JGit, the server reasonably + returned an empty packfile. And changing the fetch negotiation code to + only use the object DB when deciding what to report as "have" would be + an unnecessary slowdown, I think.) This was discovered when a lazy fetch of a missing commit completed with nothing actually fetched, and the writing of the commit graph file after every fetch then attempted to read said missing commit, triggering a lazy fetch of said missing commit, resulting in an infinite loop with no user-visible indication (until they check the list of processes running - on their computer). With this fix, at least a warning message will be - printed. Note that although the repo corruption we discovered was caused - by a bug in GC in a partial clone, the behavior that this patch teaches - Git to warn about applies to any repo with commit graph enabled and with - a missing commit, whether it is a partial clone or not. + on their computer). With this fix, there is no infinite loop. Note that + although the repo corruption we discovered was caused by a bug in GC in + a partial clone, the behavior that this patch teaches Git to warn about + applies to any repo with commit graph enabled and with a missing commit, + whether it is a partial clone or not. + + t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in + lookup_commit_in_graph(), 2022-07-01), tests that an interaction between + fetch and the commit graph does not cause an infinite loop. This patch + changes the exit code in that situation, so that test had to be changed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> ## fetch-pack.c ## -@@ fetch-pack.c: static struct string_list uri_protocols = STRING_LIST_INIT_DUP; - #define ALTERNATE (1U << 1) - #define COMMON (1U << 6) - #define REACH_SCRATCH (1U << 7) -+#define COMPLETE_FROM_COMMIT_GRAPH (1U << 8) - - /* - * After sending this many "have"s if we do not get any new ACK , we @@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *negotiator, + cb(negotiator, cache.items[i]); } ++static void die_in_commit_graph_only(const struct object_id *oid) ++{ ++ die(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database.\n" ++ "This is probably due to repo corruption.\n" ++ "If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object."), ++ oid_to_hex(oid)); ++} ++ static struct commit *deref_without_lazy_fetch(const struct object_id *oid, - int mark_tags_complete) -+ int mark_additional_complete_information) ++ int mark_tags_complete_and_check_obj_db) { enum object_type type; struct object_info info = { .typep = &type }; @@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *neg commit = lookup_commit_in_graph(the_repository, oid); - if (commit) + if (commit) { -+ if (mark_additional_complete_information) -+ commit->object.flags |= COMPLETE_FROM_COMMIT_GRAPH; ++ if (mark_tags_complete_and_check_obj_db) { ++ if (!has_object(the_repository, oid, 0)) ++ die_in_commit_graph_only(oid); ++ } return commit; + } @@ fetch-pack.c: static struct commit *deref_without_lazy_fetch(const struct object if (!tag->tagged) return NULL; - if (mark_tags_complete) -+ if (mark_additional_complete_information) ++ if (mark_tags_complete_and_check_obj_db) tag->object.flags |= COMPLETE; oid = &tag->tagged->oid; } else { -@@ fetch-pack.c: static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, - save_commit_buffer = old_save_commit_buffer; - } - -+static void warn_in_commit_graph_only(const struct object_id *oid) -+{ -+ warning(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database."), -+ oid_to_hex(oid)); -+ warning(_("This is probably due to repo corruption.")); -+ warning(_("If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object.")); -+} -+ - /* - * Returns 1 if every object pointed to by the given remote refs is available - * locally and reachable from a local ref, and 0 otherwise. -@@ fetch-pack.c: static int everything_local(struct fetch_pack_args *args, - ref->name); - continue; - } -+ if (o->flags & COMPLETE_FROM_COMMIT_GRAPH) { -+ if (!has_object(the_repository, remote, 0)) -+ warn_in_commit_graph_only(remote); -+ } - print_verbose(args, _("already have %s (%s)"), oid_to_hex(remote), - ref->name); - } - ## object.h ## -@@ object.h: void object_array_init(struct object_array *array); - /* - * object flag allocation: - * revision.h: 0---------10 15 23------27 -- * fetch-pack.c: 01 67 -+ * fetch-pack.c: 01 6-8 - * negotiator/default.c: 2--5 - * walker.c: 0-2 - * upload-pack.c: 4 11-----14 16-----19 + ## t/t5330-no-lazy-fetch-with-commit-graph.sh ## +@@ t/t5330-no-lazy-fetch-with-commit-graph.sh: test_expect_success 'fetch any commit from promisor with the usage of the commit + test_commit -C with-commit any-commit && + anycommit=$(git -C with-commit rev-parse HEAD) && + GIT_TRACE="$(pwd)/trace.txt" \ +- git -C with-commit-graph fetch origin $anycommit 2>err && ++ test_must_fail git -C with-commit-graph fetch origin $anycommit 2>err && + ! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err && + grep "git fetch origin" trace.txt >actual && + test_line_count = 1 actual