Message ID | 53ebceb7f11d1e1ea39ae4ca86850190ae839eff.1539883476.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | commit-reach: fix first-parent heuristic | expand |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > We can also re-run the performance tests from commit 4fbcca4e > "commit-reach: make can_all_from_reach... linear". > > Performance was measured on the Linux repository using > 'test-tool reach can_all_from_reach'. The input included rows seeded by > tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as > v3.[0-9]*. This mimics a (very large) fetch that says "I have all major > v3 releases and want all major v4 releases." The "large" case included > X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate > tags to the set, which does not greatly increase the number of objects > that are considered, but does increase the number of 'from' commits, > demonstrating the quadratic nature of the previous code. These micro-benchmarks are interesting, but we should also remember to describe the impact of the bug being fixed in the larger picture. What end-user visible operations are affected? Computing merge-base? Finding if a push fast-forwards? Something else?
On 10/19/2018 1:24 AM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> We can also re-run the performance tests from commit 4fbcca4e >> "commit-reach: make can_all_from_reach... linear". >> >> Performance was measured on the Linux repository using >> 'test-tool reach can_all_from_reach'. The input included rows seeded by >> tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as >> v3.[0-9]*. This mimics a (very large) fetch that says "I have all major >> v3 releases and want all major v4 releases." The "large" case included >> X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate >> tags to the set, which does not greatly increase the number of objects >> that are considered, but does increase the number of 'from' commits, >> demonstrating the quadratic nature of the previous code. > These micro-benchmarks are interesting, but we should also remember > to describe the impact of the bug being fixed in the larger picture. > What end-user visible operations are affected? Computing merge-base? > Finding if a push fast-forwards? Something else? Sorry about that. Here is a description that could be inserted into the commit message: The can_all_from_reach() method synthesizes the logic for one iteration of can_all_from_reach_with_flags() which is used by the server during fetch negotiation to determine if more haves/wants are needed. The logic is also used by is_descendant_of() which is used to check if a force-push is required and in 'git branch --contains'. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: > On 10/19/2018 1:24 AM, Junio C Hamano wrote: >> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: >> >>> We can also re-run the performance tests from commit 4fbcca4e >>> "commit-reach: make can_all_from_reach... linear". >>> >>> Performance was measured on the Linux repository using >>> 'test-tool reach can_all_from_reach'. The input included rows seeded by >>> tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as >>> v3.[0-9]*. This mimics a (very large) fetch that says "I have all major >>> v3 releases and want all major v4 releases." The "large" case included >>> X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate >>> tags to the set, which does not greatly increase the number of objects >>> that are considered, but does increase the number of 'from' commits, >>> demonstrating the quadratic nature of the previous code. >> These micro-benchmarks are interesting, but we should also remember >> to describe the impact of the bug being fixed in the larger picture. >> What end-user visible operations are affected? Computing merge-base? >> Finding if a push fast-forwards? Something else? > > Sorry about that. Here is a description that could be inserted into > the commit message: > > The can_all_from_reach() method synthesizes the logic for one > iteration of can_all_from_reach_with_flags() which is used by the > server during fetch negotiation to determine if more haves/wants are > needed. The logic is also used by is_descendant_of() which is used to > check if a force-push is required and in 'git branch --contains'. I am afraid that it is still not end-user serving enough. The level of "larger picture" I had in mind was those that would appear as an entry in the release notes, e.g. (this is for illustration purposes only; I do not claim its actual contents is correct). We started using commit generation numbers in various reachability computations, but due to a bug, negitiation between the "git fetch" and the server started to require 30% more roundtrips than necessary, and it has become less efficient to see if a commit is a descendant of another commit in certain cases, which has been corrected in this release. Thanks.
diff --git a/commit-reach.c b/commit-reach.c index 9f79ce0a22..79419be8af 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -593,8 +593,10 @@ int can_all_from_reach_with_flag(struct object_array *from, while (stack) { struct commit_list *parent; - if (stack->item->object.flags & with_flag) { + if (stack->item->object.flags & (with_flag | RESULT)) { pop_commit(&stack); + if (stack) + stack->item->object.flags |= RESULT; continue; }