diff mbox series

[2/8] shortlog: refactor committer/author grouping

Message ID 20200925070211.GB62741@coredump.intra.peff.net (mailing list archive)
State Superseded
Headers show
Series parsing trailers with shortlog | expand

Commit Message

Jeff King Sept. 25, 2020, 7:02 a.m. UTC
In preparation for adding more grouping types, let's
refactor the committer/author grouping code. In particular:

  - the master option is now "--group", to make it clear
    that the various group types are mutually exclusive. The
    "--committer" option is an alias for "--group=committer".

  - we keep an enum rather than a binary flag, to prepare
    for more values

  - we prefer switch statements to ternary assignment, since
    other group types will need more custom code

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-shortlog.txt |  9 ++++++
 builtin/shortlog.c             | 59 +++++++++++++++++++++++++++-------
 shortlog.h                     |  6 +++-
 t/t4201-shortlog.sh            |  5 +++
 4 files changed, 67 insertions(+), 12 deletions(-)

Comments

Eric Sunshine Sept. 25, 2020, 8:05 p.m. UTC | #1
On Fri, Sep 25, 2020 at 3:02 AM Jeff King <peff@peff.net> wrote:
> In preparation for adding more grouping types, let's
> refactor the committer/author grouping code. In particular:
>
>   - the master option is now "--group", to make it clear
>     that the various group types are mutually exclusive. The
>     "--committer" option is an alias for "--group=committer".
>
>   - we keep an enum rather than a binary flag, to prepare
>     for more values
>
>   - we prefer switch statements to ternary assignment, since
>     other group types will need more custom code
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt
> @@ -47,9 +47,18 @@ OPTIONS
> +--group=<type>::
> +       By default, `shortlog` collects and collates author identities;
> +       using `--group` will collect other fields.
> +       `<type>` is one of:
> ++
> + - `author`, commits are grouped by author (this is the default)
> + - `committer`, commits are grouped by committer (the same as `-c`)

I had trouble interpreting "(this is the default)". It made me think
that <type> is optional:

    --group[=<type>]

but that isn't the case at all. Instead, it means that if --group
isn't specified, then grouping is done by `author` by default. It also
repeats what the general description of --group already says with
regard to the default, thus it is redundant to say it again when
describing the `author` type. Therefore, perhaps drop "(this is the
default)" altogether?
Martin Ågren Sept. 26, 2020, 12:31 p.m. UTC | #2
Hi Peff,

Some random thoughts follow on this patch and a few others.

On Fri, 25 Sep 2020 at 09:04, Jeff King <peff@peff.net> wrote:
>
> In preparation for adding more grouping types, let's
> refactor the committer/author grouping code. In particular:
>
>   - the master option is now "--group", to make it clear
>     that the various group types are mutually exclusive. The
>     "--committer" option is an alias for "--group=committer".

I think this is more than a refactoring of the code. The user interface
is also refactored (if that's even the right word). From the subject and
the first sentence above, I did not expect this first "-" item, nor that
the patch would touch Documentation/.

> +       enum {
> +               SHORTLOG_GROUP_AUTHOR = 0,
> +               SHORTLOG_GROUP_COMMITTER
> +       } group;

You could reduce the patch noise by adding a trailing comma, see
cc0c42975a ("CodingGuidelines: spell out post-C89 rules", 2019-07-16).
(I do realize that you later redefine all values anyway.)

Martin
Jeff King Sept. 27, 2020, 7:59 a.m. UTC | #3
On Sat, Sep 26, 2020 at 02:31:32PM +0200, Martin Ågren wrote:

> On Fri, 25 Sep 2020 at 09:04, Jeff King <peff@peff.net> wrote:
> >
> > In preparation for adding more grouping types, let's
> > refactor the committer/author grouping code. In particular:
> >
> >   - the master option is now "--group", to make it clear
> >     that the various group types are mutually exclusive. The
> >     "--committer" option is an alias for "--group=committer".
> 
> I think this is more than a refactoring of the code. The user interface
> is also refactored (if that's even the right word). From the subject and
> the first sentence above, I did not expect this first "-" item, nor that
> the patch would touch Documentation/.

Yeah, I agree it's a bit misleading. I've reworded it (especially the
title) to make the intent more clear.

> > +       enum {
> > +               SHORTLOG_GROUP_AUTHOR = 0,
> > +               SHORTLOG_GROUP_COMMITTER
> > +       } group;
> 
> You could reduce the patch noise by adding a trailing comma, see
> cc0c42975a ("CodingGuidelines: spell out post-C89 rules", 2019-07-16).
> (I do realize that you later redefine all values anyway.)

Thanks, that's worth doing. This part of the series was actually from
2015, so perhaps before we had that guideline. :)

-Peff
Jeff King Sept. 27, 2020, 8:03 a.m. UTC | #4
On Fri, Sep 25, 2020 at 04:05:04PM -0400, Eric Sunshine wrote:

> > +--group=<type>::
> > +       By default, `shortlog` collects and collates author identities;
> > +       using `--group` will collect other fields.
> > +       `<type>` is one of:
> > ++
> > + - `author`, commits are grouped by author (this is the default)
> > + - `committer`, commits are grouped by committer (the same as `-c`)
> 
> I had trouble interpreting "(this is the default)". It made me think
> that <type> is optional:
> 
>     --group[=<type>]
> 
> but that isn't the case at all. Instead, it means that if --group
> isn't specified, then grouping is done by `author` by default.

Right.

> It also
> repeats what the general description of --group already says with
> regard to the default, thus it is redundant to say it again when
> describing the `author` type. Therefore, perhaps drop "(this is the
> default)" altogether?

OK. I was worried somebody wouldn't quite pick up on the default. We
could spell it out as:

  - `author`, commits are grouped by author (this is the default if no
    --group is given)

but that's pretty lengthy. We could also reword the whole thing as:

  --group=<type>::
          Collect and collate values of type `<type>`; if no `--group`
	  option is given, the default is `author`:
   - `author`, commits are grouped by author
   - `committer`, commits are grouped by committer (the same as `-c`)

I'm not sure if that's clearer or not. The first sentence seems a bit
opaque, but I'm not sure how to improve it.

For now (and what plan to send out in v2), I'll take your suggestion to
just drop the phrase.

-Peff
Jeff King Sept. 27, 2020, 8:08 a.m. UTC | #5
On Sun, Sep 27, 2020 at 04:03:45AM -0400, Jeff King wrote:

> but that's pretty lengthy. We could also reword the whole thing as:
> 
>   --group=<type>::
>           Collect and collate values of type `<type>`; if no `--group`
> 	  option is given, the default is `author`:
>    - `author`, commits are grouped by author
>    - `committer`, commits are grouped by committer (the same as `-c`)
> 
> I'm not sure if that's clearer or not. The first sentence seems a bit
> opaque, but I'm not sure how to improve it.
> 
> For now (and what plan to send out in v2), I'll take your suggestion to
> just drop the phrase.

OK, I lied. After reading your other email, I ended up with:

  --group=<type>::
          Group commits based on `<type>`. If no `--group` option is
          specified, the default is `author`. `<type>` is one of:
  +
   - `author`, commits are grouped by author
   - `committer`, commits are grouped by committer (the same as `-c`)

  -c::
  --committer::
          This is an alias for `--group=committer`.

which I think works by itself, and extends nicely to trailers.

-Peff
Eric Sunshine Sept. 27, 2020, 8:23 a.m. UTC | #6
On Sun, Sep 27, 2020 at 4:08 AM Jeff King <peff@peff.net> wrote:
> OK, I lied. After reading your other email, I ended up with:
>
>   --group=<type>::
>           Group commits based on `<type>`. If no `--group` option is
>           specified, the default is `author`. `<type>` is one of:
>   +
>    - `author`, commits are grouped by author
>    - `committer`, commits are grouped by committer (the same as `-c`)

Sounds good. Makes it much more clear to me.
diff mbox series

Patch

diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt
index a72ea7f7ba..41b7679aa1 100644
--- a/Documentation/git-shortlog.txt
+++ b/Documentation/git-shortlog.txt
@@ -47,9 +47,18 @@  OPTIONS
 
 	Each pretty-printed commit will be rewrapped before it is shown.
 
+--group=<type>::
+	By default, `shortlog` collects and collates author identities;
+	using `--group` will collect other fields.
+	`<type>` is one of:
++
+ - `author`, commits are grouped by author (this is the default)
+ - `committer`, commits are grouped by committer (the same as `-c`)
+
 -c::
 --committer::
 	Collect and show committer identities instead of authors.
+	This is an alias for `--group=committer`.
 
 -w[<width>[,<indent1>[,<indent2>]]]::
 	Linewrap the output by wrapping each line at `width`.  The first
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index edcf2e0d54..880ce19304 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -129,7 +129,17 @@  static void read_from_stdin(struct shortlog *log)
 	static const char *committer_match[2] = { "Commit: ", "committer " };
 	const char **match;
 
-	match = log->committer ? committer_match : author_match;
+	switch (log->group) {
+	case SHORTLOG_GROUP_AUTHOR:
+		match = author_match;
+		break;
+	case SHORTLOG_GROUP_COMMITTER:
+		match = committer_match;
+		break;
+	default:
+		BUG("unhandled shortlog group");
+	}
+
 	while (strbuf_getline_lf(&ident, stdin) != EOF) {
 		const char *v;
 		if (!skip_prefix(ident.buf, match[0], &v) &&
@@ -158,27 +168,36 @@  void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 	struct strbuf ident = STRBUF_INIT;
 	struct strbuf oneline = STRBUF_INIT;
 	struct pretty_print_context ctx = {0};
-	const char *fmt;
+	const char *oneline_str;
 
 	ctx.fmt = CMIT_FMT_USERFORMAT;
 	ctx.abbrev = log->abbrev;
 	ctx.print_email_subject = 1;
 	ctx.date_mode.type = DATE_NORMAL;
 	ctx.output_encoding = get_log_output_encoding();
 
-	fmt = log->committer ?
-		(log->email ? "%cN <%cE>" : "%cN") :
-		(log->email ? "%aN <%aE>" : "%aN");
-
-	format_commit_message(commit, fmt, &ident, &ctx);
 	if (!log->summary) {
 		if (log->user_format)
 			pretty_print_commit(&ctx, commit, &oneline);
 		else
 			format_commit_message(commit, "%s", &oneline, &ctx);
 	}
-
-	insert_one_record(log, ident.buf, oneline.len ? oneline.buf : "<none>");
+	oneline_str = oneline.len ? oneline.buf : "<none>";
+
+	switch (log->group) {
+	case SHORTLOG_GROUP_AUTHOR:
+		format_commit_message(commit,
+				      log->email ? "%aN <%aE>" : "%aN",
+				      &ident, &ctx);
+		insert_one_record(log, ident.buf, oneline_str);
+		break;
+	case SHORTLOG_GROUP_COMMITTER:
+		format_commit_message(commit,
+				      log->email ? "%cN <%cE>" : "%cN",
+				      &ident, &ctx);
+		insert_one_record(log, ident.buf, oneline_str);
+		break;
+	}
 
 	strbuf_release(&ident);
 	strbuf_release(&oneline);
@@ -241,6 +260,21 @@  static int parse_wrap_args(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+static int parse_group_option(const struct option *opt, const char *arg, int unset)
+{
+	struct shortlog *log = opt->value;
+
+	if (unset || !strcasecmp(arg, "author"))
+		log->group = SHORTLOG_GROUP_AUTHOR;
+	else if (!strcasecmp(arg, "committer"))
+		log->group = SHORTLOG_GROUP_COMMITTER;
+	else
+		return error(_("unknown group type: %s"), arg);
+
+	return 0;
+}
+
+
 void shortlog_init(struct shortlog *log)
 {
 	memset(log, 0, sizeof(*log));
@@ -260,8 +294,9 @@  int cmd_shortlog(int argc, const char **argv, const char *prefix)
 	int nongit = !startup_info->have_repository;
 
 	const struct option options[] = {
-		OPT_BOOL('c', "committer", &log.committer,
-			 N_("Group by committer rather than author")),
+		OPT_SET_INT('c', "committer", &log.group,
+			    N_("Group by committer rather than author"),
+			    SHORTLOG_GROUP_COMMITTER),
 		OPT_BOOL('n', "numbered", &log.sort_by_number,
 			 N_("sort output according to the number of commits per author")),
 		OPT_BOOL('s', "summary", &log.summary,
@@ -271,6 +306,8 @@  int cmd_shortlog(int argc, const char **argv, const char *prefix)
 		OPT_CALLBACK_F('w', NULL, &log, N_("<w>[,<i1>[,<i2>]]"),
 			N_("Linewrap output"), PARSE_OPT_OPTARG,
 			&parse_wrap_args),
+		OPT_CALLBACK(0, "group", &log, N_("field"),
+			N_("Group by field"), parse_group_option),
 		OPT_END(),
 	};
 
diff --git a/shortlog.h b/shortlog.h
index 2fa61c4294..9deeeb3ffb 100644
--- a/shortlog.h
+++ b/shortlog.h
@@ -15,7 +15,11 @@  struct shortlog {
 	int in2;
 	int user_format;
 	int abbrev;
-	int committer;
+
+	enum {
+		SHORTLOG_GROUP_AUTHOR = 0,
+		SHORTLOG_GROUP_COMMITTER
+	} group;
 
 	char *common_repo_prefix;
 	int email;
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index d3a7ce6bbb..65e4468746 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -215,4 +215,9 @@  test_expect_success 'shortlog --committer (external)' '
 	test_cmp expect actual
 '
 
+test_expect_success '--group=committer is the same as --committer' '
+	git shortlog -ns --group=committer HEAD >actual &&
+	test_cmp expect actual
+'
+
 test_done