Message ID | 2f2f0db78bf85c14ef132e1924ab5021298aace3.1733170252.git.jonathantanmy@google.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Performance improvements for repacking non-promisor objects | expand |
Jonathan Tan <jonathantanmy@google.com> writes: > Subject: Re: [PATCH 3/3] index-pack: commit tree during outgoing link check > Commit c08589efdc (index-pack: repack local links into promisor packs, > 2024-11-01) seems to contain an oversight in that the tree of a commit > is not checked. I am having a hard time linking the subject with the statement. The verb "commit" should probably be something else, as you are not creating a tree object while checking, but I am not sure? > The fix slows down a fetch from a certain repo at > $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch > correct, it seems worth it. And "the fix" is not described so a reader is left wondering. Is the fix for an oversight of not checking merely to check it? IOW, is c08589efdc made outgoing links to be checked for commits, but failed to do so for trees. Make sure we check both what is happening? > In order to test this, we could create server and client repos as > follows... > > C S > \ / > O > > (O and C are commits both on the client and server. S is a commit > only on the server. C and S have the same tree but different commit > messages.) > > ...and then, from the client, fetch S from the server. > > In theory, the client declares "have C" and the server can use this > information to exclude S's tree (since it knows that the client has C's > tree, which is the same as S's tree). OK. > However, it is also possible for > the server to compute that it needs to send S and not O, and proceed > from there; If O, C, and S have all identical trees, then wouldn't such a test work well? At that point it does not matter which between O and C the server bases its decision to send S but not S's tree on, no? In any case, will queue. Thanks. > therefore the objects of C are not considered at all when > determining what to send in the packfile. In order to prevent a test of > client functionality from having such a dependence on server behavior, I > have not included such a test. > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > builtin/index-pack.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/builtin/index-pack.c b/builtin/index-pack.c > index 58d24540dc..338aeeadc8 100644 > --- a/builtin/index-pack.c > +++ b/builtin/index-pack.c > @@ -838,6 +838,7 @@ static void do_record_outgoing_links(struct object *obj) > struct commit *commit = (struct commit *) obj; > struct commit_list *parents = commit->parents; > > + record_outgoing_link(get_commit_tree_oid(commit)); > for (; parents; parents = parents->next) > record_outgoing_link(&parents->item->object.oid); > } else if (obj->type == OBJ_TAG) {
Junio C Hamano <gitster@pobox.com> writes: > > The fix slows down a fetch from a certain repo at > > $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch > > correct, it seems worth it. > > And "the fix" is not described so a reader is left wondering. Is > the fix for an oversight of not checking merely to check it? IOW, > is > > c08589efdc made outgoing links to be checked for commits, but > failed to do so for trees. Make sure we check both > > what is happening? Yes. I was trying to keep to the character limit and in doing so, made the commit message title hard to understand. I think the new title should be easier to understand (and also stated explicitly in the commit message what is being taught to Git). > > However, it is also possible for > > the server to compute that it needs to send S and not O, and proceed > > from there; > > If O, C, and S have all identical trees, then wouldn't such a test > work well? At that point it does not matter which between O and C > the server bases its decision to send S but not S's tree on, no? > > In any case, will queue. Thanks. O has a different tree from C and S. I will add a note to clarify this.
Jonathan Tan <jonathantanmy@google.com> writes: > Junio C Hamano <gitster@pobox.com> writes: >> > The fix slows down a fetch from a certain repo at >> > $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch >> > correct, it seems worth it. >> >> And "the fix" is not described so a reader is left wondering. Is >> the fix for an oversight of not checking merely to check it? IOW, >> is >> >> c08589efdc made outgoing links to be checked for commits, but >> failed to do so for trees. Make sure we check both >> >> what is happening? > > Yes. I was trying to keep to the character limit and in doing so, made > the commit message title hard to understand. I think the new title > should be easier to understand (and also stated explicitly in the commit > message what is being taught to Git). Thanks. >> > However, it is also possible for >> > the server to compute that it needs to send S and not O, and proceed >> > from there; >> >> If O, C, and S have all identical trees, then wouldn't such a test >> work well? At that point it does not matter which between O and C >> the server bases its decision to send S but not S's tree on, no? >> >> In any case, will queue. Thanks. > > O has a different tree from C and S. I will add a note to clarify this. No, that is not what I meant. "If you arrange your test so that all three have the same tree, then would't the reason why such a test would not work you cited disappear and make this fix testable?" is what I wanted to ask.
Junio C Hamano <gitster@pobox.com> writes: > >> > However, it is also possible for > >> > the server to compute that it needs to send S and not O, and proceed > >> > from there; > >> > >> If O, C, and S have all identical trees, then wouldn't such a test > >> work well? At that point it does not matter which between O and C > >> the server bases its decision to send S but not S's tree on, no? > >> > >> In any case, will queue. Thanks. > > > > O has a different tree from C and S. I will add a note to clarify this. > > No, that is not what I meant. "If you arrange your test so that all > three have the same tree, then would't the reason why such a test > would not work you cited disappear and make this fix testable?" is > what I wanted to ask. Copying and pasting the diagram for reference: C S \ / O I looked into this and I don't think that making O, C, and S have the same tree will make this fix testable. If they all had the same tree, whether we check S's tree or not doesn't matter, since that tree is already in a promisor pack (O and its tree was previously fetched from the promisor remote, and thus is in a promisor pack). (We are checking trees in order to repack them into promisor packs if necessary.)
Jonathan Tan <jonathantanmy@google.com> writes: > Copying and pasting the diagram for reference: > > C S > \ / > O > > I looked into this and I don't think that making O, C, and S have the > same tree will make this fix testable. If they all had the same tree, > whether we check S's tree or not doesn't matter, since that tree is > already in a promisor pack (O and its tree was previously fetched from > the promisor remote, and thus is in a promisor pack). (We are checking > trees in order to repack them into promisor packs if necessary.) I see; thanks.
diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 58d24540dc..338aeeadc8 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -838,6 +838,7 @@ static void do_record_outgoing_links(struct object *obj) struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; + record_outgoing_link(get_commit_tree_oid(commit)); for (; parents; parents = parents->next) record_outgoing_link(&parents->item->object.oid); } else if (obj->type == OBJ_TAG) {
Commit c08589efdc (index-pack: repack local links into promisor packs, 2024-11-01) seems to contain an oversight in that the tree of a commit is not checked. The fix slows down a fetch from a certain repo at $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch correct, it seems worth it. In order to test this, we could create server and client repos as follows... C S \ / O (O and C are commits both on the client and server. S is a commit only on the server. C and S have the same tree but different commit messages.) ...and then, from the client, fetch S from the server. In theory, the client declares "have C" and the server can use this information to exclude S's tree (since it knows that the client has C's tree, which is the same as S's tree). However, it is also possible for the server to compute that it needs to send S and not O, and proceed from there; therefore the objects of C are not considered at all when determining what to send in the packfile. In order to prevent a test of client functionality from having such a dependence on server behavior, I have not included such a test. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- builtin/index-pack.c | 1 + 1 file changed, 1 insertion(+)