Message ID | 20180919141016.27930-1-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | reflog expire: add progress output | expand |
On Wed, Sep 19, 2018 at 4:23 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > Before this change the "git reflog expire" command didn't report any > progress. I love these progress additions you've been pushing lately :) > diff --git a/builtin/reflog.c b/builtin/reflog.c > index 3acef5a0ab..d3075ee75a 100644 > --- a/builtin/reflog.c > +++ b/builtin/reflog.c > @@ -10,6 +10,7 @@ > #include "diff.h" > #include "revision.h" > #include "reachable.h" > +#include "progress.h" > > /* NEEDSWORK: switch to using parse_options */ > static const char reflog_expire_usage[] = > @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) > struct commit_list *pending; > timestamp_t expire_limit = cb->mark_limit; > struct commit_list *leftover = NULL; > + struct progress *progress = NULL; > + int i = 0; > > for (pending = cb->mark_list; pending; pending = pending->next) > pending->item->object.flags &= ~REACHABLE; > > pending = cb->mark_list; > + progress = start_delayed_progress( > + _("Marking unreachable commits in reflog for expiry"), 0); Maybe just "Searching expired reflog entries" or something like that. It's not technically as accurate, but I think it's easier to understand by by new people. Do we have --quiet option or something that needs to completely suppress this progress thing? > while (pending) { > struct commit_list *parent; > struct commit *commit = pop_commit(&pending); > + > + display_progress(progress, ++i); maybe rename it to commit_count or something and leave "i" for temporary/short lived usage. > if (commit->object.flags & REACHABLE) > continue; > if (parse_commit(commit)) > @@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) > } > } > cb->mark_list = leftover; > + stop_progress(&progress); > } > > static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid) > -- > 2.19.0.444.g18242da7ef >
On Wed, Sep 19 2018, Duy Nguyen wrote: > On Wed, Sep 19, 2018 at 4:23 PM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: >> Before this change the "git reflog expire" command didn't report any >> progress. > > I love these progress additions you've been pushing lately :) > >> diff --git a/builtin/reflog.c b/builtin/reflog.c >> index 3acef5a0ab..d3075ee75a 100644 >> --- a/builtin/reflog.c >> +++ b/builtin/reflog.c >> @@ -10,6 +10,7 @@ >> #include "diff.h" >> #include "revision.h" >> #include "reachable.h" >> +#include "progress.h" >> >> /* NEEDSWORK: switch to using parse_options */ >> static const char reflog_expire_usage[] = >> @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) >> struct commit_list *pending; >> timestamp_t expire_limit = cb->mark_limit; >> struct commit_list *leftover = NULL; >> + struct progress *progress = NULL; >> + int i = 0; >> >> for (pending = cb->mark_list; pending; pending = pending->next) >> pending->item->object.flags &= ~REACHABLE; >> >> pending = cb->mark_list; >> + progress = start_delayed_progress( >> + _("Marking unreachable commits in reflog for expiry"), 0); > > Maybe just "Searching expired reflog entries" or something like that. > It's not technically as accurate, but I think it's easier to > understand by by new people. Or "Pruning reflog entries"? I was aiming for some combination of a) making it clear what we're doing (pruning stuff) b) that we're doing it on a subset of the counter of the (very large) values we're showing. So the current one has "a" && "b", "Searching..." has neither, and "Pruning..." has "a" but not "b". Maybe making a && b clear isn't that important, but I'm currently leaning towards keeping the current one because it's not *that* long and makes things clearer to the user. > Do we have --quiet option or something that needs to completely > suppress this progress thing? Yes. I also see my commit graph process patches sitting in "next" broke the "git gc --quiet" mode, and I'll need to submit something on top (which'll be easy), and submit a v2 on this (pending further comments...). Is there a better way to test that (fake up the file descriptor check) in the tests other than adding getenv("GIT_TEST...") to the progress.c logic? >> while (pending) { >> struct commit_list *parent; >> struct commit *commit = pop_commit(&pending); >> + >> + display_progress(progress, ++i); > > maybe rename it to commit_count or something and leave "i" for > temporary/short lived usage. Good point. Willdo. >> if (commit->object.flags & REACHABLE) >> continue; >> if (parse_commit(commit)) >> @@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) >> } >> } >> cb->mark_list = leftover; >> + stop_progress(&progress); >> } >> >> static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid) >> -- >> 2.19.0.444.g18242da7ef >>
On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote: > > Do we have --quiet option or something that needs to completely > > suppress this progress thing? > > Yes. I also see my commit graph process patches sitting in "next" broke > the "git gc --quiet" mode, and I'll need to submit something on top > (which'll be easy), and submit a v2 on this (pending further > comments...). > > Is there a better way to test that (fake up the file descriptor check) > in the tests other than adding getenv("GIT_TEST...") to the progress.c > logic? The progress code doesn't do the isatty() check at all. The caller has to do it (and ideally would respect --progress/--no-progress to override, along with having --quiet imply --no-progress if such an option exists). Once you have all that, then you can test --progress explicitly. If you want to check the isatty() handling, you can use test_terminal(). -Peff
On Wed, Sep 19 2018, Jeff King wrote: > On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > Do we have --quiet option or something that needs to completely >> > suppress this progress thing? >> >> Yes. I also see my commit graph process patches sitting in "next" broke >> the "git gc --quiet" mode, and I'll need to submit something on top >> (which'll be easy), and submit a v2 on this (pending further >> comments...). >> >> Is there a better way to test that (fake up the file descriptor check) >> in the tests other than adding getenv("GIT_TEST...") to the progress.c >> logic? > > The progress code doesn't do the isatty() check at all. The caller has > to do it (and ideally would respect --progress/--no-progress to > override, along with having --quiet imply --no-progress if such an > option exists). Yeah, what I was confused by was testing this with "git gc", and until my recent commit graph progress patches + this (which I wasn't testing against) the progress output was all from pack-objects, which checks the --progress option, and then proceeds to ignore all of that and just check isatty(). > Once you have all that, then you can test --progress explicitly. If you > want to check the isatty() handling, you can use test_terminal(). Thanks.
On Wed, Sep 19, 2018 at 07:22:44PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) > >> struct commit_list *pending; > >> timestamp_t expire_limit = cb->mark_limit; > >> struct commit_list *leftover = NULL; > >> + struct progress *progress = NULL; > >> + int i = 0; > >> > >> for (pending = cb->mark_list; pending; pending = pending->next) > >> pending->item->object.flags &= ~REACHABLE; > >> > >> pending = cb->mark_list; > >> + progress = start_delayed_progress( > >> + _("Marking unreachable commits in reflog for expiry"), 0); > > > > Maybe just "Searching expired reflog entries" or something like that. > > It's not technically as accurate, but I think it's easier to > > understand by by new people. > > Or "Pruning reflog entries"? I was aiming for some combination of a) > making it clear what we're doing (pruning stuff) b) that we're doing it > on a subset of the counter of the (very large) values we're showing. > > So the current one has "a" && "b", "Searching..." has neither, and > "Pruning..." has "a" but not "b". > > Maybe making a && b clear isn't that important, but I'm currently > leaning towards keeping the current one because it's not *that* long and > makes things clearer to the user. OK > >> while (pending) { > >> struct commit_list *parent; > >> struct commit *commit = pop_commit(&pending); > >> + > >> + display_progress(progress, ++i); > > > > maybe rename it to commit_count or something and leave "i" for > > temporary/short lived usage. > > Good point. Willdo. Actually I'm still not sure if it's valuable to show the actual commit count here. Some patch like this could show "activity" without the number. But this is tangent. -- 8< -- Subject: [PATCH] progress: add api for displaying a throbber --- progress.c | 29 +++++++++++++++++++++++++---- progress.h | 1 + 2 files changed, 26 insertions(+), 4 deletions(-) diff --git a/progress.c b/progress.c index 5a99c9fbf0..fada556bf0 100644 --- a/progress.c +++ b/progress.c @@ -36,6 +36,7 @@ struct progress { unsigned delay; struct throughput *throughput; uint64_t start_ns; + int show_throbber; }; static volatile sig_atomic_t progress_update; @@ -81,11 +82,13 @@ static int is_foreground_fd(int fd) static int display(struct progress *progress, uint64_t n, const char *done) { const char *eol, *tp; + static char throbber[] = "\\|/-"; if (progress->delay && (!progress_update || --progress->delay)) return 0; - progress->last_value = n; + if (!progress->show_throbber) + progress->last_value = n; tp = (progress->throughput) ? progress->throughput->display.buf : ""; eol = done ? done : " \r"; if (progress->total) { @@ -104,8 +107,15 @@ static int display(struct progress *progress, uint64_t n, const char *done) } } else if (progress_update) { if (is_foreground_fd(fileno(stderr)) || done) { - fprintf(stderr, "%s: %"PRIuMAX"%s%s", - progress->title, (uintmax_t)n, tp, eol); + if (progress->show_throbber) { + progress->last_value++; + fprintf(stderr, "%s: %c%s%s", + progress->title, + throbber[progress->last_value % 4], + tp, eol); + } else + fprintf(stderr, "%s: %"PRIuMAX"%s%s", + progress->title, (uintmax_t)n, tp, eol); fflush(stderr); } progress_update = 0; @@ -193,6 +203,14 @@ int display_progress(struct progress *progress, uint64_t n) return progress ? display(progress, n, NULL) : 0; } +void display_throbber(struct progress *progress) +{ + if (progress) { + progress->show_throbber = 1; + display(progress, 0, NULL); + } +} + static struct progress *start_progress_delay(const char *title, uint64_t total, unsigned delay) { @@ -248,7 +266,10 @@ void stop_progress_msg(struct progress **p_progress, const char *msg) throughput_string(&tp->display, tp->curr_total, rate); } progress_update = 1; - buf = xstrfmt(", %s.\n", msg); + if (progress->show_throbber) + buf = xstrfmt(" %s.\n", msg); + else + buf = xstrfmt(", %s.\n", msg); display(progress, progress->last_value, buf); free(buf); } diff --git a/progress.h b/progress.h index 70a4d4a0d6..9d23ff242f 100644 --- a/progress.h +++ b/progress.h @@ -5,6 +5,7 @@ struct progress; void display_throughput(struct progress *progress, uint64_t total); int display_progress(struct progress *progress, uint64_t n); +void display_throbber(struct progress *progress); struct progress *start_progress(const char *title, uint64_t total); struct progress *start_delayed_progress(const char *title, uint64_t total); void stop_progress(struct progress **progress);
On Wed, Sep 19, 2018 at 10:10 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > Before this change the "git reflog expire" command didn't report any > progress. This is the second command (after "pack-refs --all --prune") > that the "gc" command will run. > [...] > That may just be some artifact of the repositories I've tested, but I > suspect It'll be true in general. As the callstack above shows, in s/It'll/it'll/ > order to guarantee that we don't do that we'd need to pass some > "progress" variable through 10 levels of functions, many of which are > "for_each" callback functions with void* cb_data.
diff --git a/builtin/reflog.c b/builtin/reflog.c index 3acef5a0ab..d3075ee75a 100644 --- a/builtin/reflog.c +++ b/builtin/reflog.c @@ -10,6 +10,7 @@ #include "diff.h" #include "revision.h" #include "reachable.h" +#include "progress.h" /* NEEDSWORK: switch to using parse_options */ static const char reflog_expire_usage[] = @@ -225,14 +226,20 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) struct commit_list *pending; timestamp_t expire_limit = cb->mark_limit; struct commit_list *leftover = NULL; + struct progress *progress = NULL; + int i = 0; for (pending = cb->mark_list; pending; pending = pending->next) pending->item->object.flags &= ~REACHABLE; pending = cb->mark_list; + progress = start_delayed_progress( + _("Marking unreachable commits in reflog for expiry"), 0); while (pending) { struct commit_list *parent; struct commit *commit = pop_commit(&pending); + + display_progress(progress, ++i); if (commit->object.flags & REACHABLE) continue; if (parse_commit(commit)) @@ -253,6 +260,7 @@ static void mark_reachable(struct expire_reflog_policy_cb *cb) } } cb->mark_list = leftover; + stop_progress(&progress); } static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, struct object_id *oid)
Before this change the "git reflog expire" command didn't report any progress. This is the second command (after "pack-refs --all --prune") that the "gc" command will run. On small repositories like this command won't take long to run, my test system it takes just under 1 second to run on git.git, but just around 8 seconds on linux.git, and much longer on the 2015-04-03-1M-git.git[1] large test repository. Taking so long means that "gc" will appear to hang at the beginning of its run. It might still do so after this change if the earlier "pack-refs" command takes a really long time to run, but that'll only impact repositories with a really large set of refs to pack, and can be addressed in some future change. One thing that's bad about this change is that we might *in theory* print a "Marking unreachable commits in reflog for expiry" message for each ref with a reflog. This is because the abbreviated callstack looks like this: 0 mark_reachable at builtin/reflog.c:227 1 in unreachable at builtin/reflog.c:290 2 in should_expire_reflog_ent at builtin/reflog.c:317 3 in expire_reflog_ent at refs/files-backend.c:2956 4 in show_one_reflog_ent at refs/files-backend.c:1879 # This is the last function that has the refname (e.g. "HEAD") available 5 in files_for_each_reflog_ent at refs/files-backend.c:2025 6 in refs_for_each_reflog_ent at refs.c:2066 7 in files_reflog_expire at refs/files-backend.c:3043 8 in refs_reflog_expire at refs.c:2117 9 in reflog_expire at refs.c:2129 # Here's where we collect reflogs to expire, and expire each one 10 in cmd_reflog_expire at builtin/reflog.c:595 I.e. this progress is being reported for each expired reflog. So if start_progress() were used instead of start_delayed_progress() we'd print (e.g. on my git.git) hundreds of these lines. In practice I haven't been able to make it print anything except one line. This is because validating the reflogs for these other branches (not "HEAD") takes such a short amount of time. That may just be some artifact of the repositories I've tested, but I suspect It'll be true in general. As the callstack above shows, in order to guarantee that we don't do that we'd need to pass some "progress" variable through 10 levels of functions, many of which are "for_each" callback functions with void* cb_data. 1. https://github.com/avar/2015-04-03-1M-git Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/reflog.c | 8 ++++++++ 1 file changed, 8 insertions(+)