diff mbox series

Make "git log --count" work like "git rev-list"

Message ID CAHk-=wg0NUNFjZumgC-9f=kmU3L4T+qOAgXwiDAfPaNtuFfvFg@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series Make "git log --count" work like "git rev-list" | expand

Commit Message

Linus Torvalds Jan. 5, 2019, 10:46 p.m. UTC
This is a ridiculous patch. And I understand entirely if nobody else
cares, because I don't think anybody else has ever even noticed.

It turns out that I still use "git rev-list" outside of some hard-core
scripting for one silly reason: the linux-next statistics are all
about non-merge commits, and so to do a rough comparison with
linux-next, I do

        git rev-list --count --no-merges v4.20..

to get an approximate idea of how much I've merged compared to what is
in linux-next.

(See also

        http://neuling.org/linux-next-size.html

for the graphical view of it all, even though it looks a bit odd right
now because of how linux-next wasn't being updated over the holidats
and right at the 4.19 release).

Anyway, I've occasionally thought to myself that I should just fix
"git log" to learn that too, so that I wouldn't have to type out "git
rev-list". Because "git log" does actually take the "--count"
argument, it just doesn't honor it.

This is that patch.

                    Linus

Comments

Stefan Beller Jan. 9, 2019, 5:21 p.m. UTC | #1
On Sat, Jan 5, 2019 at 2:47 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This is a ridiculous patch. And I understand entirely if nobody else
> cares, because I don't think anybody else has ever even noticed.
>
> It turns out that I still use "git rev-list" outside of some hard-core
> scripting for one silly reason: the linux-next statistics are all
> about non-merge commits, and so to do a rough comparison with
> linux-next, I do
>
>         git rev-list --count --no-merges v4.20..
>
> to get an approximate idea of how much I've merged compared to what is
> in linux-next.
>
> (See also
>
>         http://neuling.org/linux-next-size.html
>
> for the graphical view of it all, even though it looks a bit odd right
> now because of how linux-next wasn't being updated over the holidats
> and right at the 4.19 release).
>
> Anyway, I've occasionally thought to myself that I should just fix
> "git log" to learn that too, so that I wouldn't have to type out "git
> rev-list". Because "git log" does actually take the "--count"
> argument, it just doesn't honor it.
>
> This is that patch.

Sounds reasonable to me to have such functionality,
as I tend to use
  git log --oneline origin/master..origin/next --no-merges |wc -l
for such queries, which I always assume to be doing useless
work as I'd be interested in *only* the count, and not the
intermediate oneline output, but that is the best output
that wc works on.

So maybe the --count option would want to suppress
other output if given (or we'd want to have another option
for no output)?

Instead of printing, do we want to redirect to
rev->diffopt.file, or send it through the diff codes
buffer if we ever anticipate needing this output in
buffered?

A test would be nice, too.

Stefan
Linus Torvalds Jan. 9, 2019, 5:44 p.m. UTC | #2
On Wed, Jan 9, 2019 at 9:21 AM Stefan Beller <sbeller@google.com> wrote:
>
> Sounds reasonable to me to have such functionality,
> as I tend to use
>   git log --oneline origin/master..origin/next --no-merges |wc -l
> for such queries, which I always assume to be doing useless
> work as I'd be interested in *only* the count, and not the
> intermediate oneline output, but that is the best output
> that wc works on.

Right. I've been known to do that too, but because I grew up with "git
rev-list", I know about --count and tend to use it.

In fact, I've occasionally used it with "git log" already (before my
patch), and it would silently accept the parameter (because it's
parsed by the generic revision parsing), it just wouldn't work.

> So maybe the --count option would want to suppress
> other output if given (or we'd want to have another option
> for no output)?

It already does so in my patch, exactly because it uses the same logic
as "git rev-list --count" does.

Which is to simply react to the "--count" thing early in
log_tree_commit(), and do the counting and then say "I showed this"
(without showing anything).

So you can do silly things like this:

    [torvalds@i7 linux]$ git log --count -10
    10
    [torvalds@i7 linux]$

and it just works. Nonsensical, but logical.

              Linus
Jeff King Jan. 9, 2019, 7:54 p.m. UTC | #3
On Sat, Jan 05, 2019 at 02:46:37PM -0800, Linus Torvalds wrote:

> "git log" is really the human-facing useful command that long long ago
> used to scripted around "git rev-list".
> 
> In fact, if you want to use the old scripting code, you can still
> approximate "git log" with something like
> 
>    git rev-list --pretty HEAD | less
> 
> but you'd be silly to do that, since "git log" is clearly a much nicer
> interface and is what people should use.
> 
> Except I find myself still occasionally using "git rev-list" simply
> because "git log" doesn't do one odd little quirk: commit counting.

I suspect it's more than that one little quirk. E.g., "git log" does not
do anything useful with "--objects" or "--use-bitmap-index", and perhaps
some others. I'm not at all opposed to bringing the feature-sets more
inline (where it makes sense -- I'm not sure what "log --objects" would
do), but I hope that we can generally do so by pushing features into
revision.c, and not just re-implementing them.

The counting part may have to be specific to each command, since it
depends on the loop around get_revision(). But I wonder if we can push
the logic into a helper function (and ditto for the printing code).

> So if you want to count the number of non-merge commits since the last
> kernel version, you'd have to do something like
> 
>    git rev-list --count --no-merges v4.20..
> 
> because while "git log" actually silently _accepts_ the "--count"
> argument, it doesn't do anything about it.

Yeah, silently ignoring "--count" is just awful. I agree we should make
it do the sensible thing here. For something like "log --objects", we
should probably flag an error.

> diff --git a/builtin/log.c b/builtin/log.c
> index e8e51068bd..62bef62f8a 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -411,6 +411,17 @@ static int cmd_log_walk(struct rev_info *rev)
>  	if (close_file)
>  		fclose(rev->diffopt.file);
>  
> +	if (rev->count) {
> +		if (rev->left_right && rev->cherry_mark)
> +			printf("%d\t%d\t%d\n", rev->count_left, rev->count_right, rev->count_same);
> +		else if (rev->left_right)
> +			printf("%d\t%d\n", rev->count_left, rev->count_right);
> +		else if (rev->cherry_mark)
> +			printf("%d\t%d\n", rev->count_left + rev->count_right, rev->count_same);
> +		else
> +			printf("%d\n", rev->count_left + rev->count_right);
> +	}

OK, this makes sense, and is lifted from rev-list.

> diff --git a/log-tree.c b/log-tree.c
> index 10680c139e..49ff485320 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -913,6 +913,16 @@ int log_tree_commit(struct rev_info *opt, struct commit *commit)
>  	struct log_info log;
>  	int shown, close_file = opt->diffopt.close_file;
>  
> +	if (opt->count) {
> +		if (commit->object.flags & PATCHSAME)
> +			opt->count_same++;
> +		else if (commit->object.flags & SYMMETRIC_LEFT)
> +			opt->count_left++;
> +		else
> +			opt->count_right++;
> +		return 1;
> +	}

And the logic here make sense, but I wonder if this is the best place to
put it.

We call log_tree_commit() from lots of places besides git-log, but
presumably none of would set opt->count recreationally, so that's
probably not a big deal.

But does this catch all of the limiting that git-log would do? I notice
that it happens before the call to log_tree_diff(), which conditionally
returns a "shown" flag. So you get weird results with some options. For
example:

  # works, because pathspec limiting happens early
  git log --count builtin/log.c

  # doesn't work, because --follow disables pruning
  git log --follow --count builtin/log.c

I know "--follow" is a bit hacky in general, but I think there are other
cases where log_tree_diff() may decide not to show a commit (maybe
without --root, though I guess that's the default these days).

Unfortunately I'm not sure of an easy fix. We'd have to tell the log
code "figure out if you're going to show the commit, but don't actually
show anything", which means respecting the count flag virtually
everywhere that would produce output.

I dunno. Certainly respecting "--count" even for the simple cases is an
improvement over the status quo. Maybe it would be enough to give a
warning in the manpage that it may not work with exotic options.

-Peff
Stefan Beller Jan. 9, 2019, 8:14 p.m. UTC | #4
> Unfortunately I'm not sure of an easy fix. We'd have to tell the log
> code "figure out if you're going to show the commit, but don't actually
> show anything", which means respecting the count flag virtually
> everywhere that would produce output.

This is (partially) solved for commit parts already,
e6e045f803 (diff.c: buffer all output if asked to, 2017-06-29)
but we'd have to extend that to revision walking?
Junio C Hamano Jan. 10, 2019, 10:19 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> But does this catch all of the limiting that git-log would do? I notice
> that it happens before the call to log_tree_diff(), which conditionally
> returns a "shown" flag. So you get weird results with some options. For
> example:
>
>   # works, because pathspec limiting happens early
>   git log --count builtin/log.c
>
>   # doesn't work, because --follow disables pruning
>   git log --follow --count builtin/log.c

Well, that's a bad one.

> I know "--follow" is a bit hacky in general, but I think there are other
> cases where log_tree_diff() may decide not to show a commit (maybe
> without --root, though I guess that's the default these days).
>
> I dunno. Certainly respecting "--count" even for the simple cases is an
> improvement over the status quo. Maybe it would be enough to give a
> warning in the manpage that it may not work with exotic options.

Hmph, perhaps.  I wonder if it is easy enough to redirect the entire
codeflow to that of rev-list when we see --count in cmd_log().
Jeff King Jan. 11, 2019, 3:35 p.m. UTC | #6
On Thu, Jan 10, 2019 at 02:19:51PM -0800, Junio C Hamano wrote:

> > I know "--follow" is a bit hacky in general, but I think there are other
> > cases where log_tree_diff() may decide not to show a commit (maybe
> > without --root, though I guess that's the default these days).
> >
> > I dunno. Certainly respecting "--count" even for the simple cases is an
> > improvement over the status quo. Maybe it would be enough to give a
> > warning in the manpage that it may not work with exotic options.
> 
> Hmph, perhaps.  I wonder if it is easy enough to redirect the entire
> codeflow to that of rev-list when we see --count in cmd_log().

Interesting idea. I think it might end up creating weird inconsistencies
for other cases, though (e.g., log respecting config like log.showroot
that rev-list does not).

I suppose it depends on whether we want to advertise that "log --count"
is just a convenience wrapper for "rev-list --count". Personally, that
sounds hacky to me. I'd expect "log --count <options>" to give the same
answer as "log --oneline <options> | wc -l".

-Peff
diff mbox series

Patch

From e3bc5387404bcefbd86081fbc82a93fc5c9efa99 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sat, 5 Jan 2019 14:39:41 -0800
Subject: [PATCH] git log: honor the '--count' argument

"git log" is really the human-facing useful command that long long ago
used to scripted around "git rev-list".

In fact, if you want to use the old scripting code, you can still
approximate "git log" with something like

   git rev-list --pretty HEAD | less

but you'd be silly to do that, since "git log" is clearly a much nicer
interface and is what people should use.

Except I find myself still occasionally using "git rev-list" simply
because "git log" doesn't do one odd little quirk: commit counting.

So if you want to count the number of non-merge commits since the last
kernel version, you'd have to do something like

   git rev-list --count --no-merges v4.20..

because while "git log" actually silently _accepts_ the "--count"
argument, it doesn't do anything about it.

This little patch copies the rev-list code for counting to "git log".

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 builtin/log.c | 11 +++++++++++
 log-tree.c    | 10 ++++++++++
 2 files changed, 21 insertions(+)

diff --git a/builtin/log.c b/builtin/log.c
index e8e51068bd..62bef62f8a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -411,6 +411,17 @@  static int cmd_log_walk(struct rev_info *rev)
 	if (close_file)
 		fclose(rev->diffopt.file);
 
+	if (rev->count) {
+		if (rev->left_right && rev->cherry_mark)
+			printf("%d\t%d\t%d\n", rev->count_left, rev->count_right, rev->count_same);
+		else if (rev->left_right)
+			printf("%d\t%d\n", rev->count_left, rev->count_right);
+		else if (rev->cherry_mark)
+			printf("%d\t%d\n", rev->count_left + rev->count_right, rev->count_same);
+		else
+			printf("%d\n", rev->count_left + rev->count_right);
+	}
+
 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
 	    rev->diffopt.flags.check_failed) {
 		return 02;
diff --git a/log-tree.c b/log-tree.c
index 10680c139e..49ff485320 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -913,6 +913,16 @@  int log_tree_commit(struct rev_info *opt, struct commit *commit)
 	struct log_info log;
 	int shown, close_file = opt->diffopt.close_file;
 
+	if (opt->count) {
+		if (commit->object.flags & PATCHSAME)
+			opt->count_same++;
+		else if (commit->object.flags & SYMMETRIC_LEFT)
+			opt->count_left++;
+		else
+			opt->count_right++;
+		return 1;
+	}
+
 	log.commit = commit;
 	log.parent = NULL;
 	opt->loginfo = &log;
-- 
2.20.1.101.g60ba6df0c4