Message ID | 19fd72c34dcd1332df638d76b0b028e9d9da3d41.1662025272.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | list-object-filter: introduce depth filter | expand |
On 9/1/2022 5:41 AM, ZheNing Hu via GitGitGadget wrote: > From: ZheNing Hu <adlternative@gmail.com> > > In repo_parse_commit_internal(), if we want to use > commit graph, it will call parse_commit_in_graph() to > parse commit's content from commit graph, otherwise > call repo_read_object_file() to parse commit's content > from commit object. > > repo_read_object_file() will respect commit graft, > which can correctly amend commit's parents. But > parse_commit_in_graph() not. Inconsistencies here may > result in incorrect processing of shallow clone. > > So let parse_commit_in_graph() respect commit graft as > repo_read_object_file() does, which can solve this problem. If grafts or replace-objects exist, then the commit-graph is disabled and this code will never be called. I would expect a test case demonstrating the change in behavior here, but that is impossible. The commit-graph parsing should not be bogged down with this logic. Thanks, -Stolee
Derrick Stolee <derrickstolee@github.com> 于2022年9月2日周五 03:18写道: > > On 9/1/2022 5:41 AM, ZheNing Hu via GitGitGadget wrote: > > From: ZheNing Hu <adlternative@gmail.com> > > > > In repo_parse_commit_internal(), if we want to use > > commit graph, it will call parse_commit_in_graph() to > > parse commit's content from commit graph, otherwise > > call repo_read_object_file() to parse commit's content > > from commit object. > > > > repo_read_object_file() will respect commit graft, > > which can correctly amend commit's parents. But > > parse_commit_in_graph() not. Inconsistencies here may > > result in incorrect processing of shallow clone. > > > > So let parse_commit_in_graph() respect commit graft as > > repo_read_object_file() does, which can solve this problem. > > If grafts or replace-objects exist, then the commit-graph > is disabled and this code will never be called. I would > expect a test case demonstrating the change in behavior > here, but that is impossible. > Thanks for the clarification. I don't really know what's the wrong here, but just let do a little test: 1. Revert this commit 19fd72c34dcd1332df638d76b0b028e9d9da3d41 $ git revert 19fd72 2. Clone the git repo $ git clone --bare git@github.com:git/git.git 3. Write commit graph $ git commit-graph write 4. Use the depth=<depth> to clone (depth=1) $ git clone --no-checkout --no-local --=depth=1 git.git git1 Cloning into 'git1'... remote: Enumerating objects: 4306, done. remote: Counting objects: 100% (4306/4306), done. remote: Compressing objects: 100% (3785/3785), done. 4. Use the depth=<depth> to clone (depth=2) $ git clone --no-checkout --no-local --=depth=2 git.git git2 Cloning into 'git2'... remote: Enumerating objects: 4311, done. remote: Counting objects: 100% (4311/4311), done. remote: Compressing objects: 100% (3788/3788), done. 5. Use the depth filter to clone (depth=1) $ git clone --no-checkout --no-local --filter=depth:1 git.git git3 Cloning into 'git3'... remote: Enumerating objects: 4306, done. remote: Counting objects: 100% (4306/4306), done. remote: Compressing objects: 100% (3785/3785), done. 6. Use the depth filter to clone (depth=2) $ git clone --no-checkout --no-local --filter=depth:2 git.git git4 Cloning into 'git4'... remote: Enumerating objects: 322987, done. remote: Counting objects: 100% (322987/322987), done. remote: Compressing objects: 100% (77441/77441), done. As we can see, when we use --filter=depth:<depth> (depth >= 2), it seems like we clone a lot of objects. The result is significantly different from git clone --depth=<depth> (depth >= 2). So I debug it by reproducing the git pack-objects process: I find there are different action between --filter=depth:<depth> and --depth=<depth> . --filter=depth:<depth> will be successfully resolved commit parents in parse_commit_in_graph(), Call stack( cmd_pack_objects -> get_object_list -> traverse_commit_list -> traverse_commit_list_filtered -> do_traverse -> get_revision -> get_revision_internal -> get_revision_1 -> process_parents -> repo_parse_commit_gently -> repo_parse_commit_internal -> parse_commit_in_graph) --depth=<depth> will failed in parse_commit_in_graph(), and call repo_read_object_file() to resolved commit parents. Call stack( cmd_pack_objects -> get_object_list -> traverse_commit_list -> traverse_commit_list_filtered -> do_traverse -> get_revision -> get_revision_internal -> get_revision_1 -> process_parents -> repo_parse_commit_gently -> repo_parse_commit_internal -> repo_read_object_file) > The commit-graph parsing should not be bogged down with > this logic. > So I try to fix this problem by let commit-graph respect commit-graft. I don't know if I overlook something before... > Thanks, > -Stolee > Thanks, ZheNing Hu
diff --git a/commit-graph.c b/commit-graph.c index f2a36032f84..89bb6f87079 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -820,6 +820,7 @@ static int fill_commit_in_graph(struct repository *r, struct commit_list **pptr; const unsigned char *commit_data; uint32_t lex_index; + struct commit_graft *graft; while (pos < g->num_commits_in_base) g = g->base_graph; @@ -833,31 +834,54 @@ static int fill_commit_in_graph(struct repository *r, set_commit_tree(item, NULL); + graft = lookup_commit_graft(r, &item->object.oid); + if (graft) + r->parsed_objects->substituted_parent = 1; + pptr = &item->parents; edge_value = get_be32(commit_data + g->hash_len); if (edge_value == GRAPH_PARENT_NONE) return 1; - pptr = insert_parent_or_die(r, g, edge_value, pptr); + if (!(graft && (graft->nr_parent < 0 || grafts_replace_parents))) + pptr = insert_parent_or_die(r, g, edge_value, pptr); edge_value = get_be32(commit_data + g->hash_len + 4); if (edge_value == GRAPH_PARENT_NONE) return 1; if (!(edge_value & GRAPH_EXTRA_EDGES_NEEDED)) { - pptr = insert_parent_or_die(r, g, edge_value, pptr); - return 1; + if (!(graft && (graft->nr_parent < 0 || grafts_replace_parents))) { + pptr = insert_parent_or_die(r, g, edge_value, pptr); + return 1; + } } parent_data_ptr = (uint32_t*)(g->chunk_extra_edges + 4 * (uint64_t)(edge_value & GRAPH_EDGE_LAST_MASK)); do { edge_value = get_be32(parent_data_ptr); - pptr = insert_parent_or_die(r, g, - edge_value & GRAPH_EDGE_LAST_MASK, - pptr); + if (!(graft && (graft->nr_parent < 0 || grafts_replace_parents))) { + pptr = insert_parent_or_die(r, g, + edge_value & GRAPH_EDGE_LAST_MASK, + pptr); + } parent_data_ptr++; } while (!(edge_value & GRAPH_LAST_EDGE)); + if (graft) { + int i; + struct commit *new_parent; + for (i = 0; i < graft->nr_parent; i++) { + new_parent = lookup_commit(r, + &graft->parent[i]); + if (!new_parent) + die(_("bad graft parent %s in commit %s"), + oid_to_hex(&graft->parent[i]), + oid_to_hex(&item->object.oid)); + pptr = &commit_list_insert(new_parent, pptr)->next; + } + } + return 1; }