Message ID | 364c526a5d674e3f137578b24143678256601d08.1573069152.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fetch: add trace2 instrumentation | expand |
"Erik Chen via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Erik Chen <erikchen@chromium.org> > > Add trace2 regions to fetch-pack.c to better track time spent in the various > phases of a fetch: > > * matching common remote and local refs > * marking local refs as complete (part of the matching process) > > Both of these stages can be slow for repositories with many refs. > > Signed-off-by: Erik Chen <erikchen@chromium.org> > --- > fetch-pack.c | 8 ++++++++ > 1 file changed, 8 insertions(+) OK. > diff --git a/fetch-pack.c b/fetch-pack.c > index 0130b44112..5e3eee0477 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -669,6 +669,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, > > save_commit_buffer = 0; > > + trace2_region_enter("fetch-pack", "mark_complete_and_common_ref", NULL); > + > for (ref = *refs; ref; ref = ref->next) { > struct object *o; > > @@ -690,6 +692,10 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, > } > } > > + /* This block marks all local refs as COMPLETE, and then recursively marks all > + * parents of those refs as COMPLETE. > + */ /* * We write our multi-line comments like this, with the * slash-asterisk at the beginning and the asterisk-slash * at the end on its own line. Learn such local conventions * from the existing surrounding code and imitate, which * would reduce stylistic errors. */ Will fix-up while queuing (no need to reroll only to fix this). > + trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL); > if (!args->deepen) { > for_each_ref(mark_complete_oid, NULL); > for_each_cached_alternate(NULL, mark_alternate_complete); > @@ -697,6 +703,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, > if (cutoff) > mark_recent_complete_commits(args, cutoff); > } > + trace2_region_leave("fetch-pack", "mark_complete_local_refs", NULL); > > /* > * Mark all complete remote refs as common refs. > @@ -716,6 +723,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, > } > > save_commit_buffer = old_save_commit_buffer; > + trace2_region_leave("fetch-pack", "mark_complete_and_common_ref", NULL); So this introduces a single region around the entire function body of mark_complete_and_common_ref(), within which only one subpart is also enclosed in a nested region. Is that because the parts inside the outer region before and after the inner region are known to consume negligible time? IOW I would understand F () { <region 1 begin> <region 1.1 begin> code <region 1.1 end> <region 1.2 begin> code <region 1.2 end> <region 1.3 begin> code <region 1.3 end> <region 1 end> } or F () { trivial code <region 1 begin> heavy code <region 1 end> trivial code } but this appears to do F () { <region 1 begin> code <region 1.1 begin> code <region 1.1 end> code <region 1 end> } which is somewhat puzzling. Thanks.
On 11/7/2019 12:32 AM, Junio C Hamano wrote: > "Erik Chen via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> From: Erik Chen <erikchen@chromium.org> >> >> Add trace2 regions to fetch-pack.c to better track time spent in the various >> phases of a fetch: >> >> * matching common remote and local refs >> * marking local refs as complete (part of the matching process) >> >> Both of these stages can be slow for repositories with many refs. >> >> Signed-off-by: Erik Chen <erikchen@chromium.org> >> --- >> fetch-pack.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) > > OK. > >> diff --git a/fetch-pack.c b/fetch-pack.c >> index 0130b44112..5e3eee0477 100644 >> --- a/fetch-pack.c >> +++ b/fetch-pack.c >> @@ -669,6 +669,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, >> >> save_commit_buffer = 0; >> >> + trace2_region_enter("fetch-pack", "mark_complete_and_common_ref", NULL); >> + >> for (ref = *refs; ref; ref = ref->next) { >> struct object *o; >> >> @@ -690,6 +692,10 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, >> } >> } >> >> + /* This block marks all local refs as COMPLETE, and then recursively marks all >> + * parents of those refs as COMPLETE. >> + */ > > /* > * We write our multi-line comments like this, with the > * slash-asterisk at the beginning and the asterisk-slash > * at the end on its own line. Learn such local conventions > * from the existing surrounding code and imitate, which > * would reduce stylistic errors. > */ > > Will fix-up while queuing (no need to reroll only to fix this). > >> + trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL); >> if (!args->deepen) { >> for_each_ref(mark_complete_oid, NULL); >> for_each_cached_alternate(NULL, mark_alternate_complete); >> @@ -697,6 +703,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, >> if (cutoff) >> mark_recent_complete_commits(args, cutoff); >> } >> + trace2_region_leave("fetch-pack", "mark_complete_local_refs", NULL); >> >> /* >> * Mark all complete remote refs as common refs. >> @@ -716,6 +723,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, >> } >> >> save_commit_buffer = old_save_commit_buffer; >> + trace2_region_leave("fetch-pack", "mark_complete_and_common_ref", NULL); > > > So this introduces a single region around the entire function body > of mark_complete_and_common_ref(), within which only one subpart is > also enclosed in a nested region. Is that because the parts inside > the outer region before and after the inner region are known to > consume negligible time? IOW I would understand > > F () { > <region 1 begin> > <region 1.1 begin> > code > <region 1.1 end> > <region 1.2 begin> > code > <region 1.2 end> > <region 1.3 begin> > code > <region 1.3 end> > <region 1 end> > } > > or > > F () { > trivial code > <region 1 begin> > heavy code > <region 1 end> > trivial code > } > > but this appears to do > > > F () { > <region 1 begin> > code > <region 1.1 begin> > code > <region 1.1 end> > code > <region 1 end> > } > > which is somewhat puzzling. I notice that a v4 was sent that adds more sub-regions without actually responding to this request. (It is worth also pointing out that you ignored Junio's request you use the cover letter to explain your reasoning for changes between versions.) There is a real downside to nesting regions like this. Specifically, we frequently limit the depth that we report nested regions to avoid overwhelming the logs. In general, these sub-regions should be avoided when possible and instead create regions around important sections, such as the second option Junio lists above. Thanks, -Stolee
Derrick Stolee <stolee@gmail.com> writes: > On 11/7/2019 12:32 AM, Junio C Hamano wrote: >> So this introduces a single region around the entire function body >> of mark_complete_and_common_ref(), within which only one subpart is >> also enclosed in a nested region. Is that because the parts inside >> the outer region before and after the inner region are known to >> consume negligible time? IOW I would understand >> >> F () { >> ... >> } >> >> or >> >> F () { >> trivial code >> <region 1 begin> >> heavy code >> <region 1 end> >> trivial code >> } >> >> but this appears to do >> ... >> which is somewhat puzzling. > > I notice that a v4 was sent that adds more sub-regions without actually > responding to this request. (It is worth also pointing out that you > ignored Junio's request you use the cover letter to explain your reasoning > for changes between versions.) Thanks for noticing. I wasn't requesting any change in particular (at least not yet), but was inquiring the reasoning behind what was done. From that point of view, the lack of answers was worse than yet another patch that does not explain why it was done that other way this time around. > There is a real downside to nesting regions like this. Specifically, we > frequently limit the depth that we report nested regions to avoid > overwhelming the logs. > > In general, these sub-regions should be avoided when possible and instead > create regions around important sections, such as the second option Junio > lists above. Thanks for a clear direction as the area expert of trace2.
On 11/18/19 5:55 PM, Junio C Hamano wrote: > Derrick Stolee <stolee@gmail.com> writes: > >> On 11/7/2019 12:32 AM, Junio C Hamano wrote: >>> So this introduces a single region around the entire function body >>> of mark_complete_and_common_ref(), within which only one subpart is >>> also enclosed in a nested region. Is that because the parts inside >>> the outer region before and after the inner region are known to >>> consume negligible time? IOW I would understand >>> >>> F () { >>> ... >>> } >>> >>> or >>> >>> F () { >>> trivial code >>> <region 1 begin> >>> heavy code >>> <region 1 end> >>> trivial code >>> } >>> >>> but this appears to do >>> ... >>> which is somewhat puzzling. >> I notice that a v4 was sent that adds more sub-regions without actually >> responding to this request. (It is worth also pointing out that you >> ignored Junio's request you use the cover letter to explain your reasoning >> for changes between versions.) > Thanks for noticing. I wasn't requesting any change in particular > (at least not yet), but was inquiring the reasoning behind what was > done. From that point of view, the lack of answers was worse than > yet another patch that does not explain why it was done that other > way this time around. Sorry, I've been replying on the GitGitGadget pull request thread: e.g. https://github.com/gitgitgadget/git/pull/451#issuecomment-555044068 But none of those replies have been making their way to this mailing list. I'll attempt to send these replies to the mailing list now. > >> There is a real downside to nesting regions like this. Specifically, we >> frequently limit the depth that we report nested regions to avoid >> overwhelming the logs. >> >> In general, these sub-regions should be avoided when possible and instead >> create regions around important sections, such as the second option Junio >> lists above. > Thanks for a clear direction as the area expert of trace2. > >
Importing response from https://github.com/gitgitgadget/git/pull/451#issuecomment-555044068 > > /* > * We write our multi-line comments like this, with the > * slash-asterisk at the beginning and the asterisk-slash > * at the end on its own line. Learn such local conventions > * from the existing surrounding code and imitate, which > * would reduce stylistic errors. > */ > > Will fix-up while queuing (no need to reroll only to fix this). Sorry about that. I was copying from line 682, which looks like it was also using the wrong style. I've fixed that line as well. > So this introduces a single region around the entire function body > of mark_complete_and_common_ref(), within which only one subpart is > also enclosed in a nested region. Is that because the parts inside > the outer region before and after the inner region are known to > consume negligible time? IOW I would understand Good point. I used this structure because I observed locally that the middle block was slow, but was concerned that the initial/ending block might be slow as well. I've gone ahead and adding tracing regions for each block.
Importing response from https://github.com/gitgitgadget/git/pull/451#issuecomment-555077933 > I notice that a v4 was sent that adds more sub-regions without actually > responding to this request. Sorry, did I misunderstand the request? I was trying to implement the suggestion: <Junio's first suggestion> > (It is worth also pointing out that you > ignored Junio's request you use the cover letter to explain your reasoning > for changes between versions.) Sorry, this was not intentional. I'm using GitGitGadget, and it wasn't clear to me how to change this. See my comment here: <https://github.com/gitgitgadget/git/pull/451#issuecomment-555044068> > > There is a real downside to nesting regions like this. Specifically, we > frequently limit the depth that we report nested regions to avoid > overwhelming the logs. > > In general, these sub-regions should be avoided when possible and instead > create regions around important sections, such as the second option Junio > lists above. I thought it would be nice to have an encapsulating region for the full function, but I can switch to Junio's second option if that's preferable. It wasn't clear to me that that was preferred over the first option?
>> In general, these sub-regions should be avoided when possible and instead >> create regions around important sections, such as the second option Junio >> lists above. > > Thanks for a clear direction as the area expert of trace2. I'll upload a new patch set to do that.
diff --git a/fetch-pack.c b/fetch-pack.c index 0130b44112..5e3eee0477 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -669,6 +669,8 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, save_commit_buffer = 0; + trace2_region_enter("fetch-pack", "mark_complete_and_common_ref", NULL); + for (ref = *refs; ref; ref = ref->next) { struct object *o; @@ -690,6 +692,10 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, } } + /* This block marks all local refs as COMPLETE, and then recursively marks all + * parents of those refs as COMPLETE. + */ + trace2_region_enter("fetch-pack", "mark_complete_local_refs", NULL); if (!args->deepen) { for_each_ref(mark_complete_oid, NULL); for_each_cached_alternate(NULL, mark_alternate_complete); @@ -697,6 +703,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, if (cutoff) mark_recent_complete_commits(args, cutoff); } + trace2_region_leave("fetch-pack", "mark_complete_local_refs", NULL); /* * Mark all complete remote refs as common refs. @@ -716,6 +723,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, } save_commit_buffer = old_save_commit_buffer; + trace2_region_leave("fetch-pack", "mark_complete_and_common_ref", NULL); } /*