Message ID | pull.927.git.1617631280402.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [GSOC] ref-filter: use single strbuf for all output | expand |
On Mon, Apr 5, 2021 at 10:01 AM ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote: > When we use `git for-each-ref`, every ref will call > `show_ref_array_item()` and allocate its own final strbuf > and error strbuf. Instead, we can provide two single strbuf: > final_buf and error_buf that get reused for each output. > [...] > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- Was there a discussion leading up to this change? If so, it may be a good idea to provide a link to it in the mailing list here under the "---" line. Some comments below... > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c > @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > struct ref_format format = REF_FORMAT_INIT; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > > for (i = 0; i < maxcount; i++) > - show_ref_array_item(array.items[i], &format); > + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); > ref_array_clear(&array); > return 0; > } The call to strbuf_reset() within show_ref_array_item() does not free the memory from the strbufs, so the memory is being leaked. Therefore, at the end of this function, you should have: strbuf_release(final_buf); strbuf_release(error_buf); > diff --git a/builtin/tag.c b/builtin/tag.c > @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > struct ref_array array; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > > for (i = 0; i < array.nr; i++) > - show_ref_array_item(array.items[i], format); > + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); > ref_array_clear(&array); > free(to_free); Leaking `final_buf` and `error_buf`. > diff --git a/ref-filter.c b/ref-filter.c > @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, > void show_ref_array_item(struct ref_array_item *info, > - const struct ref_format *format) > + const struct ref_format *format, > + struct strbuf *final_buf, > + struct strbuf *error_buf) > { > - struct strbuf final_buf = STRBUF_INIT; > - struct strbuf error_buf = STRBUF_INIT; > > - if (format_ref_array_item(info, format, &final_buf, &error_buf)) > - die("%s", error_buf.buf); > - fwrite(final_buf.buf, 1, final_buf.len, stdout); > - strbuf_release(&error_buf); > - strbuf_release(&final_buf); > + if (format_ref_array_item(info, format, final_buf, error_buf)) > + die("%s", error_buf->buf); > + fwrite(final_buf->buf, 1, final_buf->len, stdout); > + strbuf_reset(error_buf); > + strbuf_reset(final_buf); > putchar('\n'); > } A couple comments: It is especially ugly that `error_buf` needs to be passed in by the caller since it is only ever used in case of an error, at which point the program will die() anyhow, so it's not on a critical, speed-sensitive path. The initialization with STRBUF_INIT is practically cost-free, so this variable could easily stay local to this function without cost-penalty rather than forcing the caller to pass it in. (This assumes that none of the consumers of `error_buf` down the line insert into the buffer unnecessarily, which is probably a reasonable assumption.) It is an unsafe assumption to only call strbuf_reset() at the end of the function. For this to be robust, you can't assume that the caller has given you an empty strbuf. Instead, you must ensure it by calling strbuf_reset() at the start. (It doesn't hurt to also call strbuf_reset() at the end, but that is not critical to correct operation, so could be omitted.) > @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, > struct ref_array_item *ref_item; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > + > ref_item = new_ref_array_item(name, oid); > ref_item->kind = ref_kind_from_refname(name); > - show_ref_array_item(ref_item, format); > + show_ref_array_item(ref_item, format, &final_buf, &error_buf); > free_array_item(ref_item); > } Leaking `final_buf` and `error_buf`. > diff --git a/ref-filter.h b/ref-filter.h > @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, > /* Print the ref using the given format and quote_style */ > -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); > +void show_ref_array_item(struct ref_array_item *info, > + const struct ref_format *format, > + struct strbuf *final_buf, > + struct strbuf *error_buf); It is not clear to the person reading this what these two new arguments are for or what should be provided, so the comment above the function deserves an update explaining what these arguments are for and how to provide them. This is especially important since this is a public function. If this function was merely internal to some builtin command, this sort of change for the sake of optimization might not be so bad, but as a public function, it comes across as rather ugly. In general, we don't necessarily want to pollute an otherwise clean API with changes which make the API uglier merely for the sake of tiny optimizations like this (IMHO). The extra burden placed on callers by this change, coupled with the ugliness this introduces into the API, makes the change seem less than desirable. One way you might be able to mitigate the undesirableness would be to have two versions of this function. For instance: /* Print the ref using the given format and quote_style */ show_ref_array_item(...); /* This is like show_ref_array_item() but ... */ show_ref_array_item_optim(...); The comment of the second function would, of course, need to explain that it is similar to show_ref_array_item() but more optimal because <some reasons> and that the caller is responsible for allocating and releasing the strbufs, and that the strbufs are used only for temporary storage, thus the caller should not assume anything about them. This way, callers which don't invoke show_ref_array_item() in a tight loop don't need to be burdened by the new arguments (and won't have to remember to release them), and callers with a tight loop can take advantage of the optimization with the bit of extra work of having to declare and release the strbufs. So, having said all that, it's not clear that the ugliness and extra work are worth the gain. However, if it is decided that the change is worthwhile, then the commit message probably should explain cases in which such an optimization will be really beneficial.
On 4/5/2021 10:01 AM, ZheNing Hu via GitGitGadget wrote: > From: ZheNing Hu <adlternative@gmail.com> > > When we use `git for-each-ref`, every ref will call > `show_ref_array_item()` and allocate its own final strbuf > and error strbuf. Instead, we can provide two single strbuf: > final_buf and error_buf that get reused for each output. s/two single strbuf/two buffers/ > When run it 100 times: > > $ git for-each-ref > > on git.git : > > 3.19s user > 3.88s system > 35% cpu > 20.199 total > > to: > > 2.89s user > 4.00s system > 34% cpu > 19.741 total > > The performance has been slightly improved. This is a nice amount of time! Perhaps simplify the presentation: The performance for 'git for-each-ref' on the Git repository itself with X references changes from 3.2 seconds to 2.9 seconds. > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- > [GSOC] ref-filter: use single strbuf for all output > > This patch learned Jeff King's optimization measures in git > cat-file(79ed0a5): using a single strbuf for all objects output Instead > of allocating a large number of small strbuf for every object. This part of the cover letter could be put into the commit message itself. Use the correct formatting, though: This approach is similar to the one used by 79ed0a5 (cat-file: use a single strbuf for all output, 2018-08-14) to speed up the cat-file builtin. I found the code change to be correct. Thanks! -Stolee
On Mon, Apr 05, 2021 at 02:01:19PM +0000, ZheNing Hu via GitGitGadget wrote: > When we use `git for-each-ref`, every ref will call > `show_ref_array_item()` and allocate its own final strbuf > and error strbuf. Instead, we can provide two single strbuf: > final_buf and error_buf that get reused for each output. > > When run it 100 times: > > $ git for-each-ref > > on git.git : > > 3.19s user > 3.88s system > 35% cpu > 20.199 total > > to: > > 2.89s user > 4.00s system > 34% cpu > 19.741 total That's a bigger performance improvement than I'd expect from this. I'm having trouble reproducing it here (I get the same time before and after). I also notice that you don't seem to be CPU bound, and we spend most of our time on system CPU (so program startup stuff, not the loop you're optimizing). I think a more interesting test is timing a single invocation with a large number of refs. If you are planning to do a lot of work on the formatting code, it might be worth adding such a test into t/perf (both to show off results, but also to catch any regressions after we make things faster). > This patch learned Jeff King's optimization measures in git > cat-file(79ed0a5): using a single strbuf for all objects output Instead > of allocating a large number of small strbuf for every object. I do think this is a good direction (for all the reasons laid out in 79ed0a5), though it wasn't actually the part I was most worried about for ref-filter performance. The bigger issue in ref-filter is that each individual atom will generally have its own allocated string, and then we'll format all of the atoms into the final strbuf output. In most cases we could avoid those intermediate copies entirely. I do think this would be a useful optimization to have in addition, though. As for the patch itself, I looked over the review that Eric gave, and he already said everything I would have. :) -Peff
Eric Sunshine <sunshine@sunshineco.com> 于2021年4月6日周二 上午1:05写道: > > On Mon, Apr 5, 2021 at 10:01 AM ZheNing Hu via GitGitGadget > <gitgitgadget@gmail.com> wrote: > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf > > and error strbuf. Instead, we can provide two single strbuf: > > final_buf and error_buf that get reused for each output. > > [...] > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > > --- > > Was there a discussion leading up to this change? If so, it may be a > good idea to provide a link to it in the mailing list here under the > "---" line. > Okay, I will add them in cover-letter next time. > Some comments below... > > > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c > > @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > > struct ref_format format = REF_FORMAT_INIT; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > > > for (i = 0; i < maxcount; i++) > > - show_ref_array_item(array.items[i], &format); > > + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); > > ref_array_clear(&array); > > return 0; > > } > > The call to strbuf_reset() within show_ref_array_item() does not free > the memory from the strbufs, so the memory is being leaked. Therefore, > at the end of this function, you should have: > > strbuf_release(final_buf); > strbuf_release(error_buf); > Thanks, I ignored this point. > > diff --git a/builtin/tag.c b/builtin/tag.c > > @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > > struct ref_array array; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > > > for (i = 0; i < array.nr; i++) > > - show_ref_array_item(array.items[i], format); > > + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); > > ref_array_clear(&array); > > free(to_free); > > Leaking `final_buf` and `error_buf`. > > > diff --git a/ref-filter.c b/ref-filter.c > > @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, > > void show_ref_array_item(struct ref_array_item *info, > > - const struct ref_format *format) > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf) > > { > > - struct strbuf final_buf = STRBUF_INIT; > > - struct strbuf error_buf = STRBUF_INIT; > > > > - if (format_ref_array_item(info, format, &final_buf, &error_buf)) > > - die("%s", error_buf.buf); > > - fwrite(final_buf.buf, 1, final_buf.len, stdout); > > - strbuf_release(&error_buf); > > - strbuf_release(&final_buf); > > + if (format_ref_array_item(info, format, final_buf, error_buf)) > > + die("%s", error_buf->buf); > > + fwrite(final_buf->buf, 1, final_buf->len, stdout); > > + strbuf_reset(error_buf); > > + strbuf_reset(final_buf); > > putchar('\n'); > > } > > A couple comments: > > It is especially ugly that `error_buf` needs to be passed in by the > caller since it is only ever used in case of an error, at which point > the program will die() anyhow, so it's not on a critical, > speed-sensitive path. The initialization with STRBUF_INIT is > practically cost-free, so this variable could easily stay local to > this function without cost-penalty rather than forcing the caller to > pass it in. (This assumes that none of the consumers of `error_buf` > down the line insert into the buffer unnecessarily, which is probably > a reasonable assumption.) > What you said makes sense. The `error_buf` may not need to be passed as a parameter, because errors are generally less. > It is an unsafe assumption to only call strbuf_reset() at the end of > the function. For this to be robust, you can't assume that the caller > has given you an empty strbuf. Instead, you must ensure it by calling > strbuf_reset() at the start. (It doesn't hurt to also call > strbuf_reset() at the end, but that is not critical to correct > operation, so could be omitted.) > Well, indeed, it would be better to use `strbuf_reset()` first, as it do in `cat-file.c`. > > @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, > > struct ref_array_item *ref_item; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > + > > ref_item = new_ref_array_item(name, oid); > > ref_item->kind = ref_kind_from_refname(name); > > - show_ref_array_item(ref_item, format); > > + show_ref_array_item(ref_item, format, &final_buf, &error_buf); > > free_array_item(ref_item); > > } > > Leaking `final_buf` and `error_buf`. > > > diff --git a/ref-filter.h b/ref-filter.h > > @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, > > /* Print the ref using the given format and quote_style */ > > -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); > > +void show_ref_array_item(struct ref_array_item *info, > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf); > > It is not clear to the person reading this what these two new > arguments are for or what should be provided, so the comment above the > function deserves an update explaining what these arguments are for > and how to provide them. This is especially important since this is a > public function. > > If this function was merely internal to some builtin command, this > sort of change for the sake of optimization might not be so bad, but > as a public function, it comes across as rather ugly. In general, we > don't necessarily want to pollute an otherwise clean API with changes > which make the API uglier merely for the sake of tiny optimizations > like this (IMHO). The extra burden placed on callers by this change, > coupled with the ugliness this introduces into the API, makes the > change seem less than desirable. > Well, for the time being, there are relatively few places where`show_ref_array_item()` is used in the entire git repository. I may need to pay attention to this later: the ease of use of public interfaces is also important. > One way you might be able to mitigate the undesirableness would be to > have two versions of this function. For instance: > > /* Print the ref using the given format and quote_style */ > show_ref_array_item(...); > /* This is like show_ref_array_item() but ... */ > show_ref_array_item_optim(...); > > The comment of the second function would, of course, need to explain > that it is similar to show_ref_array_item() but more optimal because > <some reasons> and that the caller is responsible for allocating and > releasing the strbufs, and that the strbufs are used only for > temporary storage, thus the caller should not assume anything about > them. > Yes, this will ensure that this new public interface will not be misused. > This way, callers which don't invoke show_ref_array_item() in a tight > loop don't need to be burdened by the new arguments (and won't have to > remember to release them), and callers with a tight loop can take > advantage of the optimization with the bit of extra work of having to > declare and release the strbufs. > For external calls, reasonable release of strbuf does require attention, which is indeed a disadvantage at some time. > So, having said all that, it's not clear that the ugliness and extra > work are worth the gain. However, if it is decided that the change is > worthwhile, then the commit message probably should explain cases in > which such an optimization will be really beneficial. I now estimate that the optimization brought here may appear in a more refs git repo like `linux.git`. I have to experiment first. Thanks, Eric. -- ZheNing Hu
Derrick Stolee <stolee@gmail.com> 于2021年4月6日周二 上午5:02写道: > > On 4/5/2021 10:01 AM, ZheNing Hu via GitGitGadget wrote: > > From: ZheNing Hu <adlternative@gmail.com> > > > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf > > and error strbuf. Instead, we can provide two single strbuf: > > final_buf and error_buf that get reused for each output. > > s/two single strbuf/two buffers/ > > > When run it 100 times: > > > > $ git for-each-ref > > > > on git.git : > > > > 3.19s user > > 3.88s system > > 35% cpu > > 20.199 total > > > > to: > > > > 2.89s user > > 4.00s system > > 34% cpu > > 19.741 total > > > > The performance has been slightly improved. > > This is a nice amount of time! Perhaps simplify the > presentation: > Thanks! But now it may still need more experimental examples to prove that this optimization is visible. > The performance for 'git for-each-ref' on the Git > repository itself with X references changes from > 3.2 seconds to 2.9 seconds. > > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > > --- > > [GSOC] ref-filter: use single strbuf for all output > > > > This patch learned Jeff King's optimization measures in git > > cat-file(79ed0a5): using a single strbuf for all objects output Instead > > of allocating a large number of small strbuf for every object. > > This part of the cover letter could be put into the > commit message itself. Use the correct formatting, > though: > > This approach is similar to the one used by 79ed0a5 > (cat-file: use a single strbuf for all output, 2018-08-14) > to speed up the cat-file builtin. > OK, I will do it. > I found the code change to be correct. Thanks! > -Stolee Thanks. -- ZheNing Hu
Jeff King <peff@peff.net> 于2021年4月6日周二 上午6:17写道: > > On Mon, Apr 05, 2021 at 02:01:19PM +0000, ZheNing Hu via GitGitGadget wrote: > > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf > > and error strbuf. Instead, we can provide two single strbuf: > > final_buf and error_buf that get reused for each output. > > > > When run it 100 times: > > > > $ git for-each-ref > > > > on git.git : > > > > 3.19s user > > 3.88s system > > 35% cpu > > 20.199 total > > > > to: > > > > 2.89s user > > 4.00s system > > 34% cpu > > 19.741 total > > That's a bigger performance improvement than I'd expect from this. I'm > having trouble reproducing it here (I get the same time before and > after). I also notice that you don't seem to be CPU bound, and we spend > most of our time on system CPU (so program startup stuff, not the loop > you're optimizing). > > I think a more interesting test is timing a single invocation with a > large number of refs. If you are planning to do a lot of work on the > formatting code, it might be worth adding such a test into t/perf (both > to show off results, but also to catch any regressions after we make > things faster). > It makes sense. A lot of refs can be convincing. Just like the number of objects measured in `cat-files` is large enough. But this is the first time I use `t/perf/*` and there is a little problem. It seem like whatever I run single script like `sh ./p0007-write-cache.sh` or just `make` or `./run ${HOME}/git -- ./p0002-read-cache.sh` , these tests will fail. > > This patch learned Jeff King's optimization measures in git > > cat-file(79ed0a5): using a single strbuf for all objects output Instead > > of allocating a large number of small strbuf for every object. > > I do think this is a good direction (for all the reasons laid out in > 79ed0a5), though it wasn't actually the part I was most worried about > for ref-filter performance. The bigger issue in ref-filter is that each > individual atom will generally have its own allocated string, and then > we'll format all of the atoms into the final strbuf output. In most > cases we could avoid those intermediate copies entirely. > Yes! In `ref-filter` we set object info in `v->s` and then append them to current `stack` buffer, and finally set in `final_buf`, the copy of the string is expensive. I don’t know if the optimization should start by removing the stack buffer? > I do think this would be a useful optimization to have in addition, > though. As for the patch itself, I looked over the review that Eric > gave, and he already said everything I would have. :) > I think it should be optimized, It will reduce the overhead of malloc and free, but it is not obvious enough. Yes, there is a lot of bad code in my patch. > -Peff Thanks. -- ZheNing Hu
ZheNing Hu <adlternative@gmail.com> 于2021年4月6日周二 下午5:49写道: > But this is the first time I use `t/perf/*` and there is a little problem. > It seem like whatever I run single script like `sh ./p0007-write-cache.sh` > or just `make` or `./run ${HOME}/git -- ./p0002-read-cache.sh` , these > tests will fail. > It's because I don't have /usr/bin/time, solved after installation. So best have this: --- a/t/perf/perf-lib.sh +++ b/t/perf/perf-lib.sh @@ -152,6 +152,10 @@ immediate=t # Perf tests require GNU time case "$(uname -s)" in Darwin) GTIME="${GTIME:-gtime}";; esac GTIME="${GTIME:-/usr/bin/time}" +if ! test -f "$GTIME" +then + error "command not found: "$GTIME"" +fi Thanks. -- ZheNing Hu
On Tue, Apr 06, 2021 at 06:35:57PM +0800, ZheNing Hu wrote: > ZheNing Hu <adlternative@gmail.com> 于2021年4月6日周二 下午5:49写道: > > But this is the first time I use `t/perf/*` and there is a little problem. > > It seem like whatever I run single script like `sh ./p0007-write-cache.sh` > > or just `make` or `./run ${HOME}/git -- ./p0002-read-cache.sh` , these > > tests will fail. > > > It's because I don't have /usr/bin/time, solved after installation. > So best have this: > > --- a/t/perf/perf-lib.sh > +++ b/t/perf/perf-lib.sh > @@ -152,6 +152,10 @@ immediate=t > # Perf tests require GNU time > case "$(uname -s)" in Darwin) GTIME="${GTIME:-gtime}";; esac > GTIME="${GTIME:-/usr/bin/time}" > +if ! test -f "$GTIME" > +then > + error "command not found: "$GTIME"" > +fi This patch would create problems when we expect to find the value of $GTIME in the $PATH e.g., you can see in the Darwin case it is set to just "gtime", not an absolute path). I am sympathetic to helping people see what's wrong, but I think in this case we're better off pointing people to using "-v". E.g.: $ GTIME=pretend-we-do-not-have-gtime ./p0001-rev-list.sh perf 1 - rev-list --all: not ok 1 - rev-list --all # # git rev-list --all >/dev/null # Uh oh, that wasn't very informative. But how about this: $ GTIME=pretend-we-do-not-have-gtime ./p0001-rev-list.sh -v [...] perf 1 - rev-list --all: running: git rev-list --all >/dev/null ./p0001-rev-list.sh: 160: pretend-we-do-not-have-gtime: not found not ok 1 - rev-list --all # # git rev-list --all >/dev/null # which I think makes it reasonably clear. -Peff
Jeff King <peff@peff.net> 于2021年4月6日周二 下午10:00写道: > > On Tue, Apr 06, 2021 at 06:35:57PM +0800, ZheNing Hu wrote: > > > ZheNing Hu <adlternative@gmail.com> 于2021年4月6日周二 下午5:49写道: > > > But this is the first time I use `t/perf/*` and there is a little problem. > > > It seem like whatever I run single script like `sh ./p0007-write-cache.sh` > > > or just `make` or `./run ${HOME}/git -- ./p0002-read-cache.sh` , these > > > tests will fail. > > > > > It's because I don't have /usr/bin/time, solved after installation. > > So best have this: > > > > --- a/t/perf/perf-lib.sh > > +++ b/t/perf/perf-lib.sh > > @@ -152,6 +152,10 @@ immediate=t > > # Perf tests require GNU time > > case "$(uname -s)" in Darwin) GTIME="${GTIME:-gtime}";; esac > > GTIME="${GTIME:-/usr/bin/time}" > > +if ! test -f "$GTIME" > > +then > > + error "command not found: "$GTIME"" > > +fi > > This patch would create problems when we expect to find the value of > $GTIME in the $PATH e.g., you can see in the Darwin case it is set to > just "gtime", not an absolute path). > > I am sympathetic to helping people see what's wrong, but I think in this > case we're better off pointing people to using "-v". E.g.: > > $ GTIME=pretend-we-do-not-have-gtime ./p0001-rev-list.sh > perf 1 - rev-list --all: > not ok 1 - rev-list --all > # > # git rev-list --all >/dev/null > # > > Uh oh, that wasn't very informative. But how about this: > > $ GTIME=pretend-we-do-not-have-gtime ./p0001-rev-list.sh -v > [...] > perf 1 - rev-list --all: > running: > git rev-list --all >/dev/null > > ./p0001-rev-list.sh: 160: pretend-we-do-not-have-gtime: not found > not ok 1 - rev-list --all > # > # git rev-list --all >/dev/null > # > > which I think makes it reasonably clear. > > -Peff I just make a small suggestion. ;) You are right, "-v" is enough. In addition, I found that the performance was basically unchanged after testing. It seems that this optimization is indeed too small, not as practical as in `cat-file`. This shows that the performance bottleneck of `ref-filter` lies elsewhere. E.g. you mentioned "intermediate copies". -- ZheNing Hu
Am 05.04.21 um 16:01 schrieb ZheNing Hu via GitGitGadget: > From: ZheNing Hu <adlternative@gmail.com> > > When we use `git for-each-ref`, every ref will call > `show_ref_array_item()` and allocate its own final strbuf > and error strbuf. Instead, we can provide two single strbuf: > final_buf and error_buf that get reused for each output. > > When run it 100 times: > > $ git for-each-ref > > on git.git : > > 3.19s user > 3.88s system > 35% cpu > 20.199 total > > to: > > 2.89s user > 4.00s system > 34% cpu > 19.741 total > > The performance has been slightly improved. I like to use hyperfine (https://github.com/sharkdp/hyperfine) to get more stable benchmark numbers, incl. standard deviation. With three warmup runs I get the following results for running git for-each-ref on Git's own repo with the current master (2e36527f23): Benchmark #1: ./git for-each-ref Time (mean ± σ): 18.8 ms ± 0.3 ms [User: 12.7 ms, System: 5.6 ms] Range (min … max): 18.2 ms … 19.8 ms 148 runs With your patch on top I get this: Benchmark #1: ./git for-each-ref Time (mean ± σ): 18.5 ms ± 0.4 ms [User: 12.3 ms, System: 5.6 ms] Range (min … max): 17.8 ms … 19.6 ms 147 runs So there seems to be a slight improvement here, but it is within the noise. I'm quite surprised how much longer this takes on your machine, however, and (like Peff already mentioned) how much of the total time it spends in system calls. Is an antivirus program or similar interferring? Or some kind of emulator or similar, e.g. Valgrind? Or has it been a long time since you ran "git gc"? The benchmark certainly depends on the number of local and remote branches in the repo; my copy currently has 4304 according to "git for-each-ref | wc -l". > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- > [GSOC] ref-filter: use single strbuf for all output > > This patch learned Jeff King's optimization measures in git > cat-file(79ed0a5): using a single strbuf for all objects output Instead > of allocating a large number of small strbuf for every object. > > So ref-filter can learn same thing: use single buffer: final_buf and > error_buf for all refs output. > > Thanks. > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-927%2Fadlternative%2Fref-filter-single-buf-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-927/adlternative/ref-filter-single-buf-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/927 > > builtin/for-each-ref.c | 4 +++- > builtin/tag.c | 4 +++- > ref-filter.c | 21 ++++++++++++--------- > ref-filter.h | 5 ++++- > 4 files changed, 22 insertions(+), 12 deletions(-) > > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c > index cb9c81a04606..9dc41f48bfa0 100644 > --- a/builtin/for-each-ref.c > +++ b/builtin/for-each-ref.c > @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > struct ref_array array; > struct ref_filter filter; > struct ref_format format = REF_FORMAT_INIT; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > > struct option opts[] = { > OPT_BIT('s', "shell", &format.quote_style, > @@ -81,7 +83,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > if (!maxcount || array.nr < maxcount) > maxcount = array.nr; > for (i = 0; i < maxcount; i++) > - show_ref_array_item(array.items[i], &format); > + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); This user of show_ref_array_item() calls it in a loop on an array. > ref_array_clear(&array); > return 0; > } > diff --git a/builtin/tag.c b/builtin/tag.c > index d403417b5625..8a38b3e2de34 100644 > --- a/builtin/tag.c > +++ b/builtin/tag.c > @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > struct ref_format *format) > { > struct ref_array array; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > char *to_free = NULL; > int i; > > @@ -64,7 +66,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > ref_array_sort(sorting, &array); > > for (i = 0; i < array.nr; i++) > - show_ref_array_item(array.items[i], format); > + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); Dito. > ref_array_clear(&array); > free(to_free); > > diff --git a/ref-filter.c b/ref-filter.c > index f0bd32f71416..51ff6af64ebc 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, > } > > void show_ref_array_item(struct ref_array_item *info, > - const struct ref_format *format) > + const struct ref_format *format, > + struct strbuf *final_buf, > + struct strbuf *error_buf) > { > - struct strbuf final_buf = STRBUF_INIT; > - struct strbuf error_buf = STRBUF_INIT; > > - if (format_ref_array_item(info, format, &final_buf, &error_buf)) > - die("%s", error_buf.buf); > - fwrite(final_buf.buf, 1, final_buf.len, stdout); > - strbuf_release(&error_buf); > - strbuf_release(&final_buf); > + if (format_ref_array_item(info, format, final_buf, error_buf)) > + die("%s", error_buf->buf); > + fwrite(final_buf->buf, 1, final_buf->len, stdout); > + strbuf_reset(error_buf); > + strbuf_reset(final_buf); > putchar('\n'); > } > > @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, > const struct ref_format *format) > { > struct ref_array_item *ref_item; > + struct strbuf final_buf = STRBUF_INIT; > + struct strbuf error_buf = STRBUF_INIT; > + > ref_item = new_ref_array_item(name, oid); > ref_item->kind = ref_kind_from_refname(name); > - show_ref_array_item(ref_item, format); > + show_ref_array_item(ref_item, format, &final_buf, &error_buf); This third and final caller works with a single item; there is no loop. > free_array_item(ref_item); > } > > diff --git a/ref-filter.h b/ref-filter.h > index 19ea4c413409..95498c9f4467 100644 > --- a/ref-filter.h > +++ b/ref-filter.h > @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, > struct strbuf *final_buf, > struct strbuf *error_buf); > /* Print the ref using the given format and quote_style */ > -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); > +void show_ref_array_item(struct ref_array_item *info, > + const struct ref_format *format, > + struct strbuf *final_buf, > + struct strbuf *error_buf); This bring-your-own-buffer approach pushes responsibilities back to the callers, in exchange for improved performance. The number of users of this interface is low, so that's defensible. But that added effort is also non-trivial -- as you demonstrated by leaking the allocated memory. ;-) How about offering to do more instead? In particular you could add a count parameter and have show_ref_array_item() handle an array of struct ref_array_item objects. It could reuse the buffers internally to get the same performance benefit, and would free callers from having to iterate loops themselves. Something like: void show_ref_array_items(struct ref_array_item **info, size_t n, const struct ref_format *format); Callers that deal with a single element can pass n = 1. Perhaps the "format" parameter should go first, like with printf. The double reference in "**info" is a bit ugly, though (array of pointers instead of a simple array of objects). That's dictated by struct ref_array_item containing a flexible array member, which seems to be hard to change. > /* Parse a single sort specifier and add it to the list */ > void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom); > /* Callback function for parsing the sort option */ > > base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48 >
René Scharfe <l.s.r@web.de> 于2021年4月7日周三 上午2:34写道: > > Am 05.04.21 um 16:01 schrieb ZheNing Hu via GitGitGadget: > > From: ZheNing Hu <adlternative@gmail.com> > > > > When we use `git for-each-ref`, every ref will call > > `show_ref_array_item()` and allocate its own final strbuf > > and error strbuf. Instead, we can provide two single strbuf: > > final_buf and error_buf that get reused for each output. > > > > When run it 100 times: > > > > $ git for-each-ref > > > > on git.git : > > > > 3.19s user > > 3.88s system > > 35% cpu > > 20.199 total > > > > to: > > > > 2.89s user > > 4.00s system > > 34% cpu > > 19.741 total > > > > The performance has been slightly improved. > > I like to use hyperfine (https://github.com/sharkdp/hyperfine) to get > more stable benchmark numbers, incl. standard deviation. With three > warmup runs I get the following results for running git for-each-ref on > Git's own repo with the current master (2e36527f23): > Yes, hyperfine is really easy to use! > Benchmark #1: ./git for-each-ref > Time (mean ± σ): 18.8 ms ± 0.3 ms [User: 12.7 ms, System: 5.6 ms] > Range (min … max): 18.2 ms … 19.8 ms 148 runs > > With your patch on top I get this: > > Benchmark #1: ./git for-each-ref > Time (mean ± σ): 18.5 ms ± 0.4 ms [User: 12.3 ms, System: 5.6 ms] > Range (min … max): 17.8 ms … 19.6 ms 147 runs > > So there seems to be a slight improvement here, but it is within the > noise. > Yeah. I meet same noise when I do such test. > I'm quite surprised how much longer this takes on your machine, however, > and (like Peff already mentioned) how much of the total time it spends > in system calls. Is an antivirus program or similar interferring? Or > some kind of emulator or similar, e.g. Valgrind? Or has it been a long > time since you ran "git gc"? > Yes, I haven't used `git gc` for a long time. In addition, when I did the test before, I ran the network proxy software, so there have a bit notice. > The benchmark certainly depends on the number of local and remote > branches in the repo; my copy currently has 4304 according to > "git for-each-ref | wc -l". > Yes i understand this point. But In my git.git, the result of "git for-each-ref | wc -l" is 8716 refs. > > > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > > --- > > [GSOC] ref-filter: use single strbuf for all output > > > > This patch learned Jeff King's optimization measures in git > > cat-file(79ed0a5): using a single strbuf for all objects output Instead > > of allocating a large number of small strbuf for every object. > > > > So ref-filter can learn same thing: use single buffer: final_buf and > > error_buf for all refs output. > > > > Thanks. > > > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-927%2Fadlternative%2Fref-filter-single-buf-v1 > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-927/adlternative/ref-filter-single-buf-v1 > > Pull-Request: https://github.com/gitgitgadget/git/pull/927 > > > > builtin/for-each-ref.c | 4 +++- > > builtin/tag.c | 4 +++- > > ref-filter.c | 21 ++++++++++++--------- > > ref-filter.h | 5 ++++- > > 4 files changed, 22 insertions(+), 12 deletions(-) > > > > diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c > > index cb9c81a04606..9dc41f48bfa0 100644 > > --- a/builtin/for-each-ref.c > > +++ b/builtin/for-each-ref.c > > @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > > struct ref_array array; > > struct ref_filter filter; > > struct ref_format format = REF_FORMAT_INIT; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > > > struct option opts[] = { > > OPT_BIT('s', "shell", &format.quote_style, > > @@ -81,7 +83,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) > > if (!maxcount || array.nr < maxcount) > > maxcount = array.nr; > > for (i = 0; i < maxcount; i++) > > - show_ref_array_item(array.items[i], &format); > > + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); > > This user of show_ref_array_item() calls it in a loop on an array. > > > ref_array_clear(&array); > > return 0; > > } > > diff --git a/builtin/tag.c b/builtin/tag.c > > index d403417b5625..8a38b3e2de34 100644 > > --- a/builtin/tag.c > > +++ b/builtin/tag.c > > @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > > struct ref_format *format) > > { > > struct ref_array array; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > char *to_free = NULL; > > int i; > > > > @@ -64,7 +66,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, > > ref_array_sort(sorting, &array); > > > > for (i = 0; i < array.nr; i++) > > - show_ref_array_item(array.items[i], format); > > + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); > > Dito. > > > ref_array_clear(&array); > > free(to_free); > > > > diff --git a/ref-filter.c b/ref-filter.c > > index f0bd32f71416..51ff6af64ebc 100644 > > --- a/ref-filter.c > > +++ b/ref-filter.c > > @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, > > } > > > > void show_ref_array_item(struct ref_array_item *info, > > - const struct ref_format *format) > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf) > > { > > - struct strbuf final_buf = STRBUF_INIT; > > - struct strbuf error_buf = STRBUF_INIT; > > > > - if (format_ref_array_item(info, format, &final_buf, &error_buf)) > > - die("%s", error_buf.buf); > > - fwrite(final_buf.buf, 1, final_buf.len, stdout); > > - strbuf_release(&error_buf); > > - strbuf_release(&final_buf); > > + if (format_ref_array_item(info, format, final_buf, error_buf)) > > + die("%s", error_buf->buf); > > + fwrite(final_buf->buf, 1, final_buf->len, stdout); > > + strbuf_reset(error_buf); > > + strbuf_reset(final_buf); > > putchar('\n'); > > } > > > > @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, > > const struct ref_format *format) > > { > > struct ref_array_item *ref_item; > > + struct strbuf final_buf = STRBUF_INIT; > > + struct strbuf error_buf = STRBUF_INIT; > > + > > ref_item = new_ref_array_item(name, oid); > > ref_item->kind = ref_kind_from_refname(name); > > - show_ref_array_item(ref_item, format); > > + show_ref_array_item(ref_item, format, &final_buf, &error_buf); > > This third and final caller works with a single item; there is no loop. > > > free_array_item(ref_item); > > } > > > > diff --git a/ref-filter.h b/ref-filter.h > > index 19ea4c413409..95498c9f4467 100644 > > --- a/ref-filter.h > > +++ b/ref-filter.h > > @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, > > struct strbuf *final_buf, > > struct strbuf *error_buf); > > /* Print the ref using the given format and quote_style */ > > -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); > > +void show_ref_array_item(struct ref_array_item *info, > > + const struct ref_format *format, > > + struct strbuf *final_buf, > > + struct strbuf *error_buf); > > This bring-your-own-buffer approach pushes responsibilities back to > the callers, in exchange for improved performance. The number of > users of this interface is low, so that's defensible. But that added > effort is also non-trivial -- as you demonstrated by leaking the > allocated memory. ;-) > Yes, this may be burden for the function caller. > How about offering to do more instead? In particular you could add > a count parameter and have show_ref_array_item() handle an array of > struct ref_array_item objects. It could reuse the buffers internally > to get the same performance benefit, and would free callers from > having to iterate loops themselves. Something like: > > void show_ref_array_items(struct ref_array_item **info, > size_t n, > const struct ref_format *format); > > Callers that deal with a single element can pass n = 1. > > Perhaps the "format" parameter should go first, like with printf. > > The double reference in "**info" is a bit ugly, though (array of > pointers instead of a simple array of objects). That's dictated > by struct ref_array_item containing a flexible array member, which > seems to be hard to change. > I personally think this idea is great. In this way, there is no need to pass in two strbuf from the outside. +void show_ref_array_items(struct ref_array_item **info, + const struct ref_format *format, + size_t n) +{ + struct strbuf final_buf = STRBUF_INIT; + struct strbuf error_buf = STRBUF_INIT; + size_t i; + + for (i = 0; i < n; i++) { + if (format_ref_array_item(info[i], format, &final_buf, &error_buf)) + die("%s", error_buf.buf); + fwrite(final_buf.buf, 1, final_buf.len, stdout); + strbuf_reset(&error_buf); + strbuf_reset(&final_buf); + putchar('\n'); + } + strbuf_release(&error_buf); + strbuf_release(&final_buf); +} + And the result is here(close the network proxy program): HEAD~ result : (git)-[heads/master] % hyperfine "./bin-wrappers/git for-each-ref" --warmup=10 Benchmark #1: ./bin-wrappers/git for-each-ref Time (mean ± σ): 18.7 ms ± 0.4 ms [User: 14.9 ms, System: 3.9 ms] Range (min … max): 18.1 ms … 19.8 ms 141 runs With the new patch : (git)-[ref-filter-single-buf] % hyperfine "./bin-wrappers/git for-each-ref" --warmup=10 Benchmark #1: ./bin-wrappers/git for-each-ref Time (mean ± σ): 18.2 ms ± 0.3 ms [User: 14.1 ms, System: 4.2 ms] Range (min … max): 17.4 ms … 19.2 ms 140 runs Seem like it does have some small advantages ;-) > > /* Parse a single sort specifier and add it to the list */ > > void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom); > > /* Callback function for parsing the sort option */ > > > > base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48 > > > A new iteration will be sent later. Thanks! -- ZheNing Hu
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c index cb9c81a04606..9dc41f48bfa0 100644 --- a/builtin/for-each-ref.c +++ b/builtin/for-each-ref.c @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) struct ref_array array; struct ref_filter filter; struct ref_format format = REF_FORMAT_INIT; + struct strbuf final_buf = STRBUF_INIT; + struct strbuf error_buf = STRBUF_INIT; struct option opts[] = { OPT_BIT('s', "shell", &format.quote_style, @@ -81,7 +83,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) if (!maxcount || array.nr < maxcount) maxcount = array.nr; for (i = 0; i < maxcount; i++) - show_ref_array_item(array.items[i], &format); + show_ref_array_item(array.items[i], &format, &final_buf, &error_buf); ref_array_clear(&array); return 0; } diff --git a/builtin/tag.c b/builtin/tag.c index d403417b5625..8a38b3e2de34 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, struct ref_format *format) { struct ref_array array; + struct strbuf final_buf = STRBUF_INIT; + struct strbuf error_buf = STRBUF_INIT; char *to_free = NULL; int i; @@ -64,7 +66,7 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting, ref_array_sort(sorting, &array); for (i = 0; i < array.nr; i++) - show_ref_array_item(array.items[i], format); + show_ref_array_item(array.items[i], format, &final_buf, &error_buf); ref_array_clear(&array); free(to_free); diff --git a/ref-filter.c b/ref-filter.c index f0bd32f71416..51ff6af64ebc 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2436,16 +2436,16 @@ int format_ref_array_item(struct ref_array_item *info, } void show_ref_array_item(struct ref_array_item *info, - const struct ref_format *format) + const struct ref_format *format, + struct strbuf *final_buf, + struct strbuf *error_buf) { - struct strbuf final_buf = STRBUF_INIT; - struct strbuf error_buf = STRBUF_INIT; - if (format_ref_array_item(info, format, &final_buf, &error_buf)) - die("%s", error_buf.buf); - fwrite(final_buf.buf, 1, final_buf.len, stdout); - strbuf_release(&error_buf); - strbuf_release(&final_buf); + if (format_ref_array_item(info, format, final_buf, error_buf)) + die("%s", error_buf->buf); + fwrite(final_buf->buf, 1, final_buf->len, stdout); + strbuf_reset(error_buf); + strbuf_reset(final_buf); putchar('\n'); } @@ -2453,9 +2453,12 @@ void pretty_print_ref(const char *name, const struct object_id *oid, const struct ref_format *format) { struct ref_array_item *ref_item; + struct strbuf final_buf = STRBUF_INIT; + struct strbuf error_buf = STRBUF_INIT; + ref_item = new_ref_array_item(name, oid); ref_item->kind = ref_kind_from_refname(name); - show_ref_array_item(ref_item, format); + show_ref_array_item(ref_item, format, &final_buf, &error_buf); free_array_item(ref_item); } diff --git a/ref-filter.h b/ref-filter.h index 19ea4c413409..95498c9f4467 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -120,7 +120,10 @@ int format_ref_array_item(struct ref_array_item *info, struct strbuf *final_buf, struct strbuf *error_buf); /* Print the ref using the given format and quote_style */ -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format); +void show_ref_array_item(struct ref_array_item *info, + const struct ref_format *format, + struct strbuf *final_buf, + struct strbuf *error_buf); /* Parse a single sort specifier and add it to the list */ void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom); /* Callback function for parsing the sort option */