Message ID | 0fd18b6d740f1e8a6f62a25947bc3ad49c2674a6.1678111599.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | ahead-behind: new builtin for counting multiple commit ranges |
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > For example, we will be able to track all local branches relative to an > upstream branch using an invocation such as > > git for-each-ref --format=%(refname) refs/heads/* | > git ahead-behind --base=origin/main --stdin Stepping back a bit, this motivating example makes me wonder if $ git for-each-ref --format='%(refname) %(aheadbehind)' refs/heads/\* that computes the ahead-behind number for each ref (that matches the pattern) based on their own "upstream" (presumably each branch is configured to track the same, or different, upstreams), or overrriding @{upstream}, a specified base, i.e. $ git for-each-ref --format='%(refname) %(aheadbehind:origin/main)' refs/heads/\* would be a more intuitive interface to the end-users. It would probably work well in conjunction with git for-each-ref --format='%(refname)' --merged origin/main refs/heads/\* which is a way to list local branches that are already merged into the upstream, to have the feature appear in the same command, perhaps?
On Mon, Mar 06, 2023 at 10:48:45AM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > For example, we will be able to track all local branches relative to an
> > upstream branch using an invocation such as
> >
> >   git for-each-ref --format=%(refname) refs/heads/* |
> >     git ahead-behind --base=origin/main --stdin
>
> Stepping back a bit, this motivating example makes me wonder if
>
>     $ git for-each-ref --format='%(refname) %(aheadbehind)' refs/heads/\*

One disadvantage to using for-each-ref here is that we are bound to use
all of the ref-sorting code, so callers can't see intermediate results
until the entire walk is complete.

I can't remember enough of the details about the custom traversal we use
here to know if that would even matter or not (i.e., do we need to
traverse through the whole set of objects entirely before outputting a
single result anyway?). But something to think about nonetheless.

At the very least, it is quite a cute idea (especially something like
'%(aheadbehind:origin/main)') ;-).

> that computes the ahead-behind number for each ref (that matches the
> pattern) based on their own "upstream" (presumably each branch is
> configured to track the same, or different, upstreams), or
> overriding @{upstream}, a specified base, i.e.
>
>     $ git for-each-ref --format='%(refname) %(aheadbehind:origin/main)' refs/heads/\*
>
> would be a more intuitive interface to the end-users.
>
> It would probably work well in conjunction with
>
>     git for-each-ref --format='%(refname)' --merged origin/main refs/heads/\*
>
> which is a way to list local branches that are already merged into
> the upstream, to have the feature appear in the same command,
> perhaps?

One thing that we had talked about internally[^1] was the idea of
specifying multiple bases. IOW, having some way to invoke the
ahead-behind builtin that gives some set of tips with a common base B1,
and another set of tips (which could--but doesn't have to--intersect
with the first) and a common base to compare *them* to, say, B2.

There are some technical reasons that we might want to consider such a
thing at least motivated by GitHub's proposed future use of it. But they
are kind of technical and not that interesting to this discussion, so I
wouldn't be sad if we didn't have a way to specify multiple bases.

OTOH, it would be nice to avoid painting ourselves into a corner from a
UI-perspective if we can avoid it.

Thanks,
Taylor

[^1]: ...and couldn't decide if it was going to be a nice future
      addition or simply another case of YAGNI ;-).
On 3/6/2023 7:40 PM, Taylor Blau wrote:
> On Mon, Mar 06, 2023 at 10:48:45AM -0800, Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> For example, we will be able to track all local branches relative to an
>>> upstream branch using an invocation such as
>>>
>>>   git for-each-ref --format=%(refname) refs/heads/* |
>>>     git ahead-behind --base=origin/main --stdin
>>
>> Stepping back a bit, this motivating example makes me wonder if
>>
>>     $ git for-each-ref --format='%(refname) %(aheadbehind)' refs/heads/\*
>
> One disadvantage to using for-each-ref here is that we are bound to use
> all of the ref-sorting code, so callers can't see intermediate results
> until the entire walk is complete.
>
> I can't remember enough of the details about the custom traversal we use
> here to know if that would even matter or not (i.e., do we need to
> traverse through the whole set of objects entirely before outputting a
> single result anyway?). But something to think about nonetheless.
>
> At the very least, it is quite a cute idea (especially something like
> '%(aheadbehind:origin/main)') ;-).
>
>> that computes the ahead-behind number for each ref (that matches the
>> pattern) based on their own "upstream" (presumably each branch is
>> configured to track the same, or different, upstreams), or
>> overriding @{upstream}, a specified base, i.e.
>>
>>     $ git for-each-ref --format='%(refname) %(aheadbehind:origin/main)' refs/heads/\*
>>
>> would be a more intuitive interface to the end-users.
>>
>> It would probably work well in conjunction with
>>
>>     git for-each-ref --format='%(refname)' --merged origin/main refs/heads/\*
>>
>> which is a way to list local branches that are already merged into
>> the upstream, to have the feature appear in the same command,
>> perhaps?
>
> One thing that we had talked about internally[^1] was the idea of
> specifying multiple bases. IOW, having some way to invoke the
> ahead-behind builtin that gives some set of tips with a common base B1,
> and another set of tips (which could--but doesn't have to--intersect
> with the first) and a common base to compare *them* to, say, B2.
>
> There are some technical reasons that we might want to consider such a
> thing at least motivated by GitHub's proposed future use of it. But they
> are kind of technical and not that interesting to this discussion, so I
> wouldn't be sad if we didn't have a way to specify multiple bases.
>
> OTOH, it would be nice to avoid painting ourselves into a corner from a
> UI-perspective if we can avoid it.
>
> Thanks,
> Taylor
>
> [^1]: ...and couldn't decide if it was going to be a nice future
>       addition or simply another case of YAGNI ;-).

This use of 'git for-each-ref --format=""' actually fixes some of the
issues I had with how to specify multiple bases. I'm not sure there is
a huge need for it, except that if we allow a "%(ahead-behind:<ref>)"
format token, then we would need to support multiple bases.
Thankfully, the implementation in this series is already prepared for
that, so the following diff implements this format token:

--- >8 ---

 builtin/for-each-ref.c | 50 ++++++++++++++++++++++++++++++++++++++++++
 ref-filter.c           | 23 +++++++++++++++++++
 ref-filter.h           | 15 ++++++++++++-
 3 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 6f62f40d126..c8dd21d7e13 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -5,6 +5,7 @@
 #include "object.h"
 #include "parse-options.h"
 #include "ref-filter.h"
+#include "commit-reach.h"
 
 static char const * const for_each_ref_usage[] = {
 	N_("git for-each-ref [<options>] [<pattern>]"),
@@ -14,6 +15,51 @@ static char const * const for_each_ref_usage[] = {
 	NULL
 };
 
+static void compute_ahead_behind(struct ref_format *format,
+				 struct ref_array *array)
+{
+	struct commit **commits;
+	size_t commits_nr = format->bases.nr + array->nr;
+
+	if (!format->bases.nr || !array->nr)
+		return;
+
+	ALLOC_ARRAY(commits, commits_nr);
+	for (size_t i = 0; i < format->bases.nr; i++) {
+		const char *name = format->bases.items[i].string;
+		commits[i] = lookup_commit_reference_by_name(name);
+		if (!commits[i])
+			die("failed to find '%s'", name);
+	}
+
+	ALLOC_ARRAY(array->counts, st_mult(format->bases.nr, array->nr));
+
+	commits_nr = format->bases.nr;
+	array->counts_nr = 0;
+	for (size_t i = 0; i < array->nr; i++) {
+		const char *name = array->items[i]->refname;
+		commits[commits_nr] = lookup_commit_reference_by_name(name);
+
+		if (!commits[commits_nr]) {
+			warning(_("could not find '%s'"), name);
+			continue;
+		}
+
+		CALLOC_ARRAY(array->items[i]->counts, format->bases.nr);
+		for (size_t j = 0; j < format->bases.nr; j++) {
+			struct ahead_behind_count *count;
+			count = &array->counts[array->counts_nr++];
+			count->tip_index = format->bases.nr + i;
+			count->base_index = j;
+
+			array->items[i]->counts[j] = count;
+		}
+		commits_nr++;
+	}
+
+	ahead_behind(commits, commits_nr, array->counts, array->counts_nr);
+}
+
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 {
 	int i;
@@ -78,6 +124,10 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	filter.name_patterns = argv;
 	filter.match_as_path = 1;
 	filter_refs(&array, &filter, FILTER_REFS_ALL);
+
+	/* Do ahead-behind things, if necessary. */
+	compute_ahead_behind(&format, &array);
+
 	ref_array_sort(sorting, &array);
 
 	if (!maxcount || array.nr < maxcount)
diff --git a/ref-filter.c b/ref-filter.c
index f8203c6b052..1706b9dd0d5 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -158,6 +158,7 @@ enum atom_type {
 	ATOM_THEN,
 	ATOM_ELSE,
 	ATOM_REST,
+	ATOM_AHEADBEHIND,
 };
 
 /*
@@ -586,6 +587,16 @@ static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int ahead_behind_atom_parser(struct ref_format *format, struct used_atom *atom,
+				    const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		return strbuf_addf_ret(err, -1, _("expected format: %%(ahead-behind:<ref>)"));
+
+	string_list_append(&format->bases, arg);
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
@@ -645,6 +656,7 @@ static struct {
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
+	[ATOM_AHEADBEHIND] = { "ahead-behind", SOURCE_OTHER, FIELD_STR, ahead_behind_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -1848,6 +1860,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	struct object *obj;
 	int i;
 	struct object_info empty = OBJECT_INFO_INIT;
+	int ahead_behind_atoms = 0;
 
 	CALLOC_ARRAY(ref->value, used_atom_cnt);
 
@@ -1978,6 +1991,16 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_AHEADBEHIND) {
+			if (ref->counts) {
+				const struct ahead_behind_count *count;
+				count = ref->counts[ahead_behind_atoms++];
+				v->s = xstrfmt("%d %d", count->ahead, count->behind);
+			} else {
+				/* Not a commit. */
+				v->s = xstrdup("");
+			}
+			continue;
 		} else
 			continue;
 
diff --git a/ref-filter.h b/ref-filter.h
index aa0eea4ecf5..937a857ddee 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -5,6 +5,7 @@
 #include "refs.h"
 #include "commit.h"
 #include "parse-options.h"
+#include "string-list.h"
 
 /* Quoting styles */
 #define QUOTE_NONE 0
@@ -24,6 +25,7 @@
 
 struct atom_value;
 struct ref_sorting;
+struct ahead_behind_count;
 
 enum ref_sorting_order {
 	REF_SORTING_REVERSE = 1<<0,
@@ -40,6 +42,8 @@ struct ref_array_item {
 	const char *symref;
 	struct commit *commit;
 	struct atom_value *value;
+	struct ahead_behind_count **counts;
+
 	char refname[FLEX_ARRAY];
 };
 
@@ -47,6 +51,9 @@ struct ref_array {
 	int nr, alloc;
 	struct ref_array_item **items;
 	struct rev_info *revs;
+
+	struct ahead_behind_count *counts;
+	size_t counts_nr;
 };
 
 struct ref_filter {
@@ -80,9 +87,15 @@ struct ref_format {
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
+
+	/* List of bases for ahead-behind counts. */
+	struct string_list bases;
 };
 
-#define REF_FORMAT_INIT { .use_color = -1 }
+#define REF_FORMAT_INIT { \
+	.use_color = -1, \
+	.bases = STRING_LIST_INIT_DUP, \
+}
 
 /* Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
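The index layout that compute_ahead_behind() in the diff above relies on can be looked at in isolation. The sketch below assumes a commit array that holds all bases first and all tips after them, and builds one (tip, base) query per pair. `struct toy_query` and `toy_build_queries()` are hypothetical names used for illustration; they are not the `struct ahead_behind_count` API from this series, and the sketch assumes every tip resolved to a commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one ahead/behind query: two array indices. */
struct toy_query {
	size_t tip_index;
	size_t base_index;
};

/*
 * Assumed layout of the commit array:
 *
 *   [ base_0 .. base_{nr_bases-1} | tip_0 .. tip_{nr_tips-1} ]
 *
 * Writes nr_bases * nr_tips queries into `out`, grouped by tip so that
 * the queries for one tip are adjacent and ordered by base. Returns the
 * number of queries written.
 */
static size_t toy_build_queries(size_t nr_bases, size_t nr_tips,
				struct toy_query *out)
{
	size_t n = 0;

	assert(out != NULL);

	for (size_t t = 0; t < nr_tips; t++) {
		for (size_t b = 0; b < nr_bases; b++) {
			out[n].tip_index = nr_bases + t; /* tips follow the bases */
			out[n].base_index = b;
			n++;
		}
	}
	return n;
}
```

Grouping by tip means the j-th "%(ahead-behind:<ref>)" atom in a format string maps to the j-th query of that tip. If some refs were skipped because they do not resolve to commits, the tip index would have to follow the slot actually used in the commit array rather than the ref's position in the list.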
Derrick Stolee <derrickstolee@github.com> writes:

> What I have yet to determine is that 'git for-each-ref' does
> not have significant overhead due to how its implementation is
> built around listing "all refs that match" versus an explicit
> input list of refs. There's also the concept of '--stdin' that
> would be interesting to interact with.

Yeah, we could add --no-sort to allow streaming better and --stdin
to feed a list of refs to work on, if the end-user facing interface
based on --format is what people find reasonable.

> I'll continue to investigate this path and report back when I
> have more of this information. This is as far as I could get
> today.

Thanks. It is a very interesting experiment.
diff --git a/.gitignore b/.gitignore
index e875c590545..cc064a4817a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -14,6 +14,7 @@
 /bin-wrappers/
 /git
 /git-add
+/git-ahead-behind
 /git-am
 /git-annotate
 /git-apply
diff --git a/Documentation/git-ahead-behind.txt b/Documentation/git-ahead-behind.txt
new file mode 100644
index 00000000000..0e2f989a1a0
--- /dev/null
+++ b/Documentation/git-ahead-behind.txt
@@ -0,0 +1,62 @@
+git-ahead-behind(1)
+===================
+
+NAME
+----
+git-ahead-behind - Count the commits on each side of a revision range
+
+SYNOPSIS
+--------
+[verse]
+'git ahead-behind' --base=<ref> [ --stdin | <revs> ]
+
+DESCRIPTION
+-----------
+
+Given a list of commit ranges, report the number of commits reachable from
+each of the sides of the range, but not the other. Consider a commit range
+specified as `<base>...<tip>`, allowing for the following definitions:
+
+* The `<tip>` is *ahead* of `<base>` by the number of commits reachable
+  from `<tip>` but not reachable from `<base>`. This is the same as the
+  number of the commits in the range `<base>..<tip>`.
+
+* The `<tip>` is *behind* `<base>` by the number of commits reachable from
+  `<base>` but not reachable from `<tip>`. This is the same as the number
+  of commits in the range `<tip>..<base>`.
+
+The sum of the ahead and behind counts equals the number of commits in the
+symmetric difference, the range `<base>...<tip>`.
+
+Multiple revisions may be specified, and they are all compared against a
+common base revision, as specified by the `--base` option. The values are
+reported to stdout one line at a time as follows:
+
+---
+  <rev> <ahead> <behind>
+---
+
+There will be exactly one line per input revision, but the lines may be
+in an arbitrary order.
+
+
+OPTIONS
+-------
+--base=<ref>::
+	Specify that `<ref>` should be used as a common base for all
+	provided revisions that are not specified in the form of a range.
+
+--stdin::
+	Read revision tips and ranges from stdin instead of from the
+	command-line.
+
+
+SEE ALSO
+--------
+linkgit:git-branch[1]
+linkgit:git-rev-list[1]
+linkgit:git-tag[1]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 50ee51fde32..691f84e8d4e 100644
--- a/Makefile
+++ b/Makefile
@@ -1199,6 +1199,7 @@ LIB_OBJS += xdiff-interface.o
 LIB_OBJS += zlib.o
 
 BUILTIN_OBJS += builtin/add.o
+BUILTIN_OBJS += builtin/ahead-behind.o
 BUILTIN_OBJS += builtin/am.o
 BUILTIN_OBJS += builtin/annotate.o
 BUILTIN_OBJS += builtin/apply.o
diff --git a/builtin.h b/builtin.h
index 46cc7897898..1ae168fa3e3 100644
--- a/builtin.h
+++ b/builtin.h
@@ -108,6 +108,7 @@ void setup_auto_pager(const char *cmd, int def);
 int is_builtin(const char *s);
 
 int cmd_add(int argc, const char **argv, const char *prefix);
+int cmd_ahead_behind(int argc, const char **argv, const char *prefix);
 int cmd_am(int argc, const char **argv, const char *prefix);
 int cmd_annotate(int argc, const char **argv, const char *prefix);
 int cmd_apply(int argc, const char **argv, const char *prefix);
diff --git a/builtin/ahead-behind.c b/builtin/ahead-behind.c
new file mode 100644
index 00000000000..a56cc565def
--- /dev/null
+++ b/builtin/ahead-behind.c
@@ -0,0 +1,30 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "config.h"
+
+static const char * const ahead_behind_usage[] = {
+	N_("git ahead-behind --base=<ref> [ --stdin | <revs> ]"),
+	NULL
+};
+
+int cmd_ahead_behind(int argc, const char **argv, const char *prefix)
+{
+	const char *base_ref = NULL;
+	int from_stdin = 0;
+
+	struct option ahead_behind_opts[] = {
+		OPT_STRING('b', "base", &base_ref, N_("base"), N_("base reference to process")),
+		OPT_BOOL(0 , "stdin", &from_stdin, N_("read rev names from stdin")),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, NULL, ahead_behind_opts,
+			     ahead_behind_usage, PARSE_OPT_KEEP_UNKNOWN_OPT);
+
+	if (!base_ref)
+		usage_with_options(ahead_behind_usage, ahead_behind_opts);
+
+	git_config(git_default_config, NULL);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index 6171fd6769d..64e3d493561 100644
--- a/git.c
+++ b/git.c
@@ -467,6 +467,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 
 static struct cmd_struct commands[] = {
 	{ "add", cmd_add, RUN_SETUP | NEED_WORK_TREE },
+	{ "ahead-behind", cmd_ahead_behind, RUN_SETUP },
 	{ "am", cmd_am, RUN_SETUP | NEED_WORK_TREE },
 	{ "annotate", cmd_annotate, RUN_SETUP },
 	{ "apply", cmd_apply, RUN_SETUP_GENTLY },
diff --git a/t/t4218-ahead-behind.sh b/t/t4218-ahead-behind.sh
new file mode 100755
index 00000000000..bc08f1207a0
--- /dev/null
+++ b/t/t4218-ahead-behind.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description='git ahead-behind command-line options'
+
+. ./test-lib.sh
+
+test_expect_success 'git ahead-behind -h' '
+	test_must_fail git ahead-behind -h >out &&
+	grep "usage:" out
+'
+
+test_expect_success 'git ahead-behind without --base' '
+	test_must_fail git ahead-behind HEAD 2>err &&
+	grep "usage:" err
+'
+
+test_done
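The ahead and behind definitions in the documentation added by this patch can be checked on a toy history. The following C sketch computes both counts by brute force, taking the set of commits reachable from each side and subtracting one from the other. Everything in it (`struct toy_commit`, `toy_reachable()`, `toy_ahead_behind()`, the 64-commit limit) is a hypothetical illustration of the definitions, not the traversal this series implements.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_COMMITS 64
#define MAX_PARENTS 2
#define NO_PARENT (-1)

/* Toy commit: indices of up to two parents in the same array. */
struct toy_commit {
	int parents[MAX_PARENTS];
};

struct toy_counts {
	int ahead;
	int behind;
};

/* Bitmask of every commit reachable from `start`, including itself. */
static uint64_t toy_reachable(const struct toy_commit *graph, int start)
{
	uint64_t seen = 0;
	int stack[TOY_MAX_COMMITS]; /* each commit is pushed at most once */
	int top = 0;

	assert(start >= 0 && start < TOY_MAX_COMMITS);

	seen |= UINT64_C(1) << start;
	stack[top++] = start;
	while (top > 0) {
		int cur = stack[--top];
		for (int p = 0; p < MAX_PARENTS; p++) {
			int parent = graph[cur].parents[p];
			if (parent == NO_PARENT || (seen & (UINT64_C(1) << parent)))
				continue;
			seen |= UINT64_C(1) << parent;
			stack[top++] = parent;
		}
	}
	return seen;
}

/* Number of set bits in `bits`. */
static int toy_popcount(uint64_t bits)
{
	int n = 0;
	while (bits) {
		bits &= bits - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* ahead = |base..tip|, behind = |tip..base|, per the definitions above. */
static struct toy_counts toy_ahead_behind(const struct toy_commit *graph,
					  int base, int tip)
{
	uint64_t from_base = toy_reachable(graph, base);
	uint64_t from_tip = toy_reachable(graph, tip);
	struct toy_counts counts;

	counts.ahead = toy_popcount(from_tip & ~from_base);
	counts.behind = toy_popcount(from_base & ~from_tip);
	return counts;
}
```

As an example, take a history where commit 1 has parent 0, commits 2 and 3 both have parent 1, and commit 4 has parent 3. With base 2 and tip 4, the tip reaches commits 3 and 4 that the base does not, and the base reaches commit 2 that the tip does not, so the tip is 2 ahead and 1 behind. The sum, 3, is the size of the symmetric difference.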