[v2,7/8] fetch: introduce new `--output-format` option

Message ID 0335e5eeb4ded336c5ff7c8888c8aab9dfed2505.1682593865.git.ps@pks.im
State Superseded
Series fetch: introduce machine-parseable output

Commit Message

Patrick Steinhardt April 27, 2023, 11:13 a.m. UTC
Currently, the only way to configure the output format that git-fetch(1)
uses is to set it via a config key. This interface may be fine as long
as we only have the "full" and "compact" output formats, where it is
unlikely that the user will have to change them regularly. But we're
about to introduce a new machine-parseable format, for which the
current mechanism feels a little bit indirect and rigid.

Introduce a new `--output-format` option that allows the user to change
the desired format more directly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/fetch-options.txt |  5 +++
 builtin/fetch.c                 | 48 ++++++++++++++++++----
 t/t5574-fetch-output.sh         | 72 +++++++++++++++++++++++++++------
 3 files changed, 106 insertions(+), 19 deletions(-)

Comments

Junio C Hamano April 27, 2023, 10:01 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 97a510649c..30099b2ac3 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -52,6 +52,13 @@ enum display_format {
>  	DISPLAY_FORMAT_UNKNOWN = 0,
>  	DISPLAY_FORMAT_FULL,
>  	DISPLAY_FORMAT_COMPACT,
> +	DISPLAY_FORMAT_MAX,
> +};
> +
> +static const char * const display_formats[DISPLAY_FORMAT_MAX] = {
> +	NULL,
> +	"full",
> +	"compact",
>  };

Hmph, the _MAX thing that is only needed to size the array and never
used elsewhere (i.e. parse_display_format() uses ARRAY_SIZE() of the
thing, instead of the constant, and that is just fine) is an eyesore.

I wonder if

	static const char *const display_format[] = {
		[DISPLAY_FORMAT_UNKNOWN] = NULL,
		[DISPLAY_FORMAT_FULL] = "full",
		[DISPLAY_FORMAT_COMPACT] = "compact",
	};

would be easier to maintain?
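
For reference, a minimal self-contained sketch of the designated-initializer
variant together with a lookup in the style of parse_display_format(). The
array name `display_format_name`, the simplified ARRAY_SIZE() macro, and the
display_format_to_string() helper are assumptions made for illustration and
are not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the ARRAY_SIZE() macro (assumption). */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

enum display_format {
	DISPLAY_FORMAT_UNKNOWN = 0,
	DISPLAY_FORMAT_FULL,
	DISPLAY_FORMAT_COMPACT,
};

/*
 * Designated initializers tie each string to its enum value, so the
 * array is sized by the compiler and no _MAX sentinel is needed.
 */
static const char *const display_format_name[] = {
	[DISPLAY_FORMAT_UNKNOWN] = NULL,
	[DISPLAY_FORMAT_FULL] = "full",
	[DISPLAY_FORMAT_COMPACT] = "compact",
};

/* Map a user-supplied string to its enum value; UNKNOWN if no match. */
static enum display_format parse_display_format(const char *format)
{
	for (size_t i = 0; i < ARRAY_SIZE(display_format_name); i++)
		if (display_format_name[i] &&
		    !strcmp(display_format_name[i], format))
			return (enum display_format)i;
	return DISPLAY_FORMAT_UNKNOWN;
}

/* Hypothetical helper: reverse mapping, NULL for UNKNOWN. */
static const char *display_format_to_string(enum display_format format)
{
	assert((size_t)format < ARRAY_SIZE(display_format_name));
	return display_format_name[format];
}
```

With this layout, adding a new format means adding one enumerator and one
initializer line; reordering the enum cannot silently shift the strings.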

I'll omit my usual "name your array in singular" lecture, as I think
you've heard it already.

Thanks.

Glen Choo April 28, 2023, 10:03 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> I wonder if
>
> 	static const char *const display_format[] = {
> 		[DISPLAY_FORMAT_UNKNOWN] = NULL,
> 		[DISPLAY_FORMAT_FULL] = "full",
> 		[DISPLAY_FORMAT_COMPACT] = "compact",
> 	};
>
> would be easier to maintain?

It's easier to read, so I'd think so.

Glen Choo April 28, 2023, 10:31 p.m. UTC | #3
Patrick Steinhardt <ps@pks.im> writes:

> @@ -1894,6 +1902,9 @@ static int fetch_multiple(struct string_list *list, int max_children)
>  		     "--no-write-commit-graph", NULL);
>  	add_options_to_argv(&argv);
>  
> +	if (format != DISPLAY_FORMAT_UNKNOWN)
> +		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
> +

I think these lines belong inside add_options_to_argv(), since that's
also used to prepare argv for fetch_submodules(), so we'd also get
support for --recurse-submodules. (I wish I had spotted that in v1,
sorry. Thankfully they use the same helper function, so we only have to
do this once.)

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
  diff --git a/builtin/fetch.c b/builtin/fetch.c
  index 422e29a914..7aa385aed5 100644
  --- a/builtin/fetch.c
  +++ b/builtin/fetch.c
  @@ -1796,8 +1796,11 @@ static int add_remote_or_group(const char *name, struct string_list *list)
    return 1;
  }

  -static void add_options_to_argv(struct strvec *argv)
  +static void add_options_to_argv(struct strvec *argv,
  +				enum display_format format)
  {
  /* Maybe this shouldn't be first, idk */
  +	if (format != DISPLAY_FORMAT_UNKNOWN)
  +		strvec_pushf(argv, "--output-format=%s", display_formats[format]);
    if (dry_run)
      strvec_push(argv, "--dry-run");
    if (prune != -1)
  @@ -1908,10 +1911,7 @@ static int fetch_multiple(struct string_list *list, int max_children,
    strvec_pushl(&argv, "-c", "fetch.bundleURI=",
          "fetch", "--append", "--no-auto-gc",
          "--no-write-commit-graph", NULL);
  -	add_options_to_argv(&argv);
  -
  -	if (format != DISPLAY_FORMAT_UNKNOWN)
  -		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
  +	add_options_to_argv(&argv, format);

    if (max_children != 1 && list->nr != 1) {
      struct parallel_fetch_state state = { argv.v, list, 0, 0 };
  @@ -2403,7 +2403,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
      if (max_children < 0)
        max_children = fetch_parallel_config;

  -		add_options_to_argv(&options);
  +		add_options_to_argv(&options, display_format);
      result = fetch_submodules(the_repository,
              &options,
              submodule_prefix,

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----

I tested the result of that locally with --recurse-submodules, and
it works.

Patrick Steinhardt May 3, 2023, 9:12 a.m. UTC | #4
On Fri, Apr 28, 2023 at 03:03:43PM -0700, Glen Choo wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I wonder if
> >
> > 	static const char *const display_format[] = {
> > 		[DISPLAY_FORMAT_UNKNOWN] = NULL,
> > 		[DISPLAY_FORMAT_FULL] = "full",
> > 		[DISPLAY_FORMAT_COMPACT] = "compact",
> > 	};
> >
> > would be easier to maintain?
> 
> It's easier to read, so I'd think so.

Yeah, I'll adopt this approach in v3.

Patrick

Patrick Steinhardt May 3, 2023, 9:43 a.m. UTC | #5
On Fri, Apr 28, 2023 at 03:31:08PM -0700, Glen Choo wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > @@ -1894,6 +1902,9 @@ static int fetch_multiple(struct string_list *list, int max_children)
> >  		     "--no-write-commit-graph", NULL);
> >  	add_options_to_argv(&argv);
> >  
> > +	if (format != DISPLAY_FORMAT_UNKNOWN)
> > +		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
> > +
> 
> I think these lines belong inside add_options_to_argv(), since that's
> also used to prepare argv for fetch_submodules(), so we'd also get
> support for --recurse-submodules. (I wish I had spotted that in v1,
> sorry. Thankfully they use the same helper function, so we only have to
> do this once.)
> 
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>   diff --git a/builtin/fetch.c b/builtin/fetch.c
>   index 422e29a914..7aa385aed5 100644
>   --- a/builtin/fetch.c
>   +++ b/builtin/fetch.c
>   @@ -1796,8 +1796,11 @@ static int add_remote_or_group(const char *name, struct string_list *list)
>     return 1;
>   }
> 
>   -static void add_options_to_argv(struct strvec *argv)
>   +static void add_options_to_argv(struct strvec *argv,
>   +				enum display_format format)
>   {
>   /* Maybe this shouldn't be first, idk */
>   +	if (format != DISPLAY_FORMAT_UNKNOWN)
>   +		strvec_pushf(argv, "--output-format=%s", display_formats[format]);
>     if (dry_run)
>       strvec_push(argv, "--dry-run");
>     if (prune != -1)
>   @@ -1908,10 +1911,7 @@ static int fetch_multiple(struct string_list *list, int max_children,
>     strvec_pushl(&argv, "-c", "fetch.bundleURI=",
>           "fetch", "--append", "--no-auto-gc",
>           "--no-write-commit-graph", NULL);
>   -	add_options_to_argv(&argv);
>   -
>   -	if (format != DISPLAY_FORMAT_UNKNOWN)
>   -		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
>   +	add_options_to_argv(&argv, format);
> 
>     if (max_children != 1 && list->nr != 1) {
>       struct parallel_fetch_state state = { argv.v, list, 0, 0 };
>   @@ -2403,7 +2403,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>       if (max_children < 0)
>         max_children = fetch_parallel_config;
> 
>   -		add_options_to_argv(&options);
>   +		add_options_to_argv(&options, display_format);
>       result = fetch_submodules(the_repository,
>               &options,
>               submodule_prefix,
> 
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> 
> I tested the result of that locally with --recurse-submodules, and
> it works.

Unfortunately it doesn't quite work: while the porcelain format does
indeed get inherited by the child process correctly, the parallel
process API will cause us to group output per submodule fetch. As a
consequence, the child's stdout is redirected into stderr, which
breaks the assumption that all machine-parseable output goes to
stdout.
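
To illustrate the effect, here is a small POSIX sketch (not git's
run-command code) in which a child has both stdout and stderr pointed at
the same pipe, roughly what grouped output amounts to. The function name
capture_grouped_child() and the two message strings are made up for the
example:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child whose stdout and stderr both refer to one pipe, then
 * read everything the child wrote into buf. Returns the number of
 * bytes captured, or -1 if the pipe or fork could not be set up.
 */
static ssize_t capture_grouped_child(char *buf, size_t len)
{
	static const char machine[] = "machine-readable\n";
	static const char human[] = "human-readable\n";
	int fds[2];
	size_t total = 0;
	ssize_t n;
	pid_t pid;

	assert(buf && len > 0);
	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: merge both streams into the write end of the pipe. */
		close(fds[0]);
		if (dup2(fds[1], STDOUT_FILENO) < 0 ||
		    dup2(fds[1], STDERR_FILENO) < 0)
			_exit(1);
		close(fds[1]);
		if (write(STDOUT_FILENO, machine, strlen(machine)) < 0 ||
		    write(STDERR_FILENO, human, strlen(human)) < 0)
			_exit(1);
		_exit(0);
	}

	/* Parent: a single byte stream arrives, with no stream marker. */
	close(fds[1]);
	while (total < len - 1 &&
	       (n = read(fds[0], buf + total, len - 1 - total)) > 0)
		total += (size_t)n;
	buf[total] = '\0';
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return (ssize_t)total;
}
```

The parent receives both lines back to back and has no way to tell which
one was meant for stdout, which is why a script reading only stdout would
miss the machine-parseable part.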

My initial reflex is to just outright reject porcelain mode when
submodule fetches are enabled. But that would require the caller to
always explicitly pass `--recurse-submodules=off`, which isn't exactly
great usability-wise.

The alternative would be to ungroup the output so that we can continue
to print to the correct output streams. That works alright, and I've got
a working version that does exactly that. But now we have the issue that
the porcelain output is misleading: you cannot tell whether a specific
reference update happens in the parent repository or in the submodule as
that information is not part of the output.

I consider the second option to be much worse than the first option
because it can cause scripts to do the wrong thing. So I'll send v3 with
the first option, even though it's kind of an awful workaround. I'd be
happy to hear any alternative proposals though.

Patrick

Patrick Steinhardt May 3, 2023, 11:36 a.m. UTC | #6
On Wed, May 03, 2023 at 11:43:31AM +0200, Patrick Steinhardt wrote:
> On Fri, Apr 28, 2023 at 03:31:08PM -0700, Glen Choo wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
> > 
> > > @@ -1894,6 +1902,9 @@ static int fetch_multiple(struct string_list *list, int max_children)
> > >  		     "--no-write-commit-graph", NULL);
> > >  	add_options_to_argv(&argv);
> > >  
> > > +	if (format != DISPLAY_FORMAT_UNKNOWN)
> > > +		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
> > > +
> > 
> > I think these lines belong inside add_options_to_argv(), since that's
> > also used to prepare argv for fetch_submodules(), so we'd also get
> > support for --recurse-submodules. (I wish I had spotted that in v1,
> > sorry. Thankfully they use the same helper function, so we only have to
> > do this once.)
> > 
> > ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> >   diff --git a/builtin/fetch.c b/builtin/fetch.c
> >   index 422e29a914..7aa385aed5 100644
> >   --- a/builtin/fetch.c
> >   +++ b/builtin/fetch.c
> >   @@ -1796,8 +1796,11 @@ static int add_remote_or_group(const char *name, struct string_list *list)
> >     return 1;
> >   }
> > 
> >   -static void add_options_to_argv(struct strvec *argv)
> >   +static void add_options_to_argv(struct strvec *argv,
> >   +				enum display_format format)
> >   {
> >   /* Maybe this shouldn't be first, idk */
> >   +	if (format != DISPLAY_FORMAT_UNKNOWN)
> >   +		strvec_pushf(argv, "--output-format=%s", display_formats[format]);
> >     if (dry_run)
> >       strvec_push(argv, "--dry-run");
> >     if (prune != -1)
> >   @@ -1908,10 +1911,7 @@ static int fetch_multiple(struct string_list *list, int max_children,
> >     strvec_pushl(&argv, "-c", "fetch.bundleURI=",
> >           "fetch", "--append", "--no-auto-gc",
> >           "--no-write-commit-graph", NULL);
> >   -	add_options_to_argv(&argv);
> >   -
> >   -	if (format != DISPLAY_FORMAT_UNKNOWN)
> >   -		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
> >   +	add_options_to_argv(&argv, format);
> > 
> >     if (max_children != 1 && list->nr != 1) {
> >       struct parallel_fetch_state state = { argv.v, list, 0, 0 };
> >   @@ -2403,7 +2403,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> >       if (max_children < 0)
> >         max_children = fetch_parallel_config;
> > 
> >   -		add_options_to_argv(&options);
> >   +		add_options_to_argv(&options, display_format);
> >       result = fetch_submodules(the_repository,
> >               &options,
> >               submodule_prefix,
> > 
> > ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
> > 
> > I tested the result of that locally with --recurse-submodules, and
> > it works.
> 
> Unfortunately it doesn't quite work alright: while the porcelain format
> does indeed get inherited to the child process correctly, the parallel
> process API will cause us to group output per submodule-fetch. This has
> the consequence that stdout will be redirected into stderr, and that
> then breaks the assumption that all machine-parseable output goes to
> stdout.
> 
> My initial reflex is to just outright reject porcelain mode when
> submodule fetches are enabled. But that would require the caller to
> always explicitly pass `--recurse-submodules=off`, which isn't exactly
> great usability-wise.
> 
> The alternative would be to ungroup the output so that we can continue
> to print to the correct output streams. That works alright, and I've got
> a working version that does exactly that. But now we have the issue that
> the porcelain output is misleading: you cannot tell whether a specific
> reference update happens in the parent repository or in the submodule as
> that information is not part of the output.
> 
> I consider the second option to be much worse than the first option
> because it can cause scripts to do the wrong thing. So I'll send v3 with
> the first option, even though it's kind of an awful workaround. I'd be
> happy to hear any alternative proposals though.
> 
> Patrick

I've gone with a slightly different variant of the first option that is
inspired by `--negotiate-only`: instead of always refusing to run, we
disable submodule fetches by default in porcelain mode. Only when they
are explicitly requested on the command line do we return an error.
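
Roughly, the decision could be sketched as below. This is only an
illustration of the rule described above, not the code from v3; the
`recurse_request` enum, the `configured_default` parameter and the
function name are all hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the user said about submodule recursion on the command line. */
enum recurse_request {
	RECURSE_UNSET = 0,	/* option not given */
	RECURSE_OFF,		/* explicitly disabled */
	RECURSE_ON,		/* explicitly enabled */
};

/*
 * Decide whether to fetch submodules. Returns 0 and stores the decision
 * in *recurse, or -1 if porcelain output was combined with an explicit
 * request to recurse.
 */
static int resolve_submodule_recursion(bool porcelain,
				       enum recurse_request cli,
				       bool configured_default,
				       bool *recurse)
{
	assert(recurse != NULL);

	if (porcelain) {
		/* An explicit request cannot be honored: report an error. */
		if (cli == RECURSE_ON)
			return -1;
		/* Otherwise quietly disable recursion, ignoring config. */
		*recurse = false;
		return 0;
	}

	/* Non-porcelain: the command line wins, config is the fallback. */
	if (cli == RECURSE_UNSET)
		*recurse = configured_default;
	else
		*recurse = (cli == RECURSE_ON);
	return 0;
}
```

The point of this shape is that callers who merely have submodule
recursion enabled in their configuration do not need to pass
`--recurse-submodules=off` themselves.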

Patrick

Patch

diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 622bd84768..654f96f79d 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -78,6 +78,11 @@  linkgit:git-config[1].
 --dry-run::
 	Show what would be done, without making any changes.
 
+--output-format::
+	Control how ref update status is printed. Valid values are
+	`full` and `compact`. Default value is `full`. See section
+	OUTPUT in linkgit:git-fetch[1] for detail.
+
 ifndef::git-pull[]
 --[no-]write-fetch-head::
 	Write the list of remote refs fetched in the `FETCH_HEAD`
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 97a510649c..30099b2ac3 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -52,6 +52,13 @@  enum display_format {
 	DISPLAY_FORMAT_UNKNOWN = 0,
 	DISPLAY_FORMAT_FULL,
 	DISPLAY_FORMAT_COMPACT,
+	DISPLAY_FORMAT_MAX,
+};
+
+static const char * const display_formats[DISPLAY_FORMAT_MAX] = {
+	NULL,
+	"full",
+	"compact",
 };
 
 struct display_state {
@@ -1879,7 +1886,8 @@  static int fetch_finished(int result, struct strbuf *out,
 	return 0;
 }
 
-static int fetch_multiple(struct string_list *list, int max_children)
+static int fetch_multiple(struct string_list *list, int max_children,
+			  enum display_format format)
 {
 	int i, result = 0;
 	struct strvec argv = STRVEC_INIT;
@@ -1894,6 +1902,9 @@  static int fetch_multiple(struct string_list *list, int max_children)
 		     "--no-write-commit-graph", NULL);
 	add_options_to_argv(&argv);
 
+	if (format != DISPLAY_FORMAT_UNKNOWN)
+		strvec_pushf(&argv, "--output-format=%s", display_formats[format]);
+
 	if (max_children != 1 && list->nr != 1) {
 		struct parallel_fetch_state state = { argv.v, list, 0, 0 };
 		const struct run_process_parallel_opts opts = {
@@ -2050,6 +2061,29 @@  static int fetch_one(struct remote *remote, int argc, const char **argv,
 	return exit_code;
 }
 
+static enum display_format parse_display_format(const char *format)
+{
+	for (int i = 0; i < ARRAY_SIZE(display_formats); i++)
+		if (display_formats[i] && !strcmp(display_formats[i], format))
+			return i;
+	return DISPLAY_FORMAT_UNKNOWN;
+}
+
+static int opt_parse_output_format(const struct option *opt, const char *arg, int unset)
+{
+	enum display_format *format = opt->value, parsed;
+
+	if (unset || !arg)
+		return 1;
+
+	parsed = parse_display_format(arg);
+	if (parsed == DISPLAY_FORMAT_UNKNOWN)
+		return error(_("unsupported output format '%s'"), arg);
+	*format = parsed;
+
+	return 0;
+}
+
 int cmd_fetch(int argc, const char **argv, const char *prefix)
 {
 	const char *bundle_uri;
@@ -2102,6 +2136,8 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 			    PARSE_OPT_OPTARG, option_fetch_parse_recurse_submodules),
 		OPT_BOOL(0, "dry-run", &dry_run,
 			 N_("dry run")),
+		OPT_CALLBACK(0, "output-format", &display_format, N_("format"), N_("output format"),
+			     opt_parse_output_format),
 		OPT_BOOL(0, "write-fetch-head", &write_fetch_head,
 			 N_("write fetched references to the FETCH_HEAD file")),
 		OPT_BOOL('k', "keep", &keep, N_("keep downloaded pack")),
@@ -2181,11 +2217,9 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 		const char *format = "full";
 
 		git_config_get_string_tmp("fetch.output", &format);
-		if (!strcasecmp(format, "full"))
-			display_format = DISPLAY_FORMAT_FULL;
-		else if (!strcasecmp(format, "compact"))
-			display_format = DISPLAY_FORMAT_COMPACT;
-		else
+
+		display_format = parse_display_format(format);
+		if (display_format == DISPLAY_FORMAT_UNKNOWN)
 			die(_("invalid value for '%s': '%s'"),
 			    "fetch.output", format);
 	}
@@ -2339,7 +2373,7 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 			max_children = fetch_parallel_config;
 
 		/* TODO should this also die if we have a previous partial-clone? */
-		result = fetch_multiple(&list, max_children);
+		result = fetch_multiple(&list, max_children, display_format);
 	}
 
 
diff --git a/t/t5574-fetch-output.sh b/t/t5574-fetch-output.sh
index b9dcdade63..662c960f94 100755
--- a/t/t5574-fetch-output.sh
+++ b/t/t5574-fetch-output.sh
@@ -24,14 +24,37 @@  test_expect_success 'fetch with invalid output format configuration' '
 	test_cmp expect actual
 '
 
+test_expect_success 'fetch with invalid output format via command line' '
+	test_must_fail git fetch --output-format >actual 2>&1 &&
+	cat >expect <<-EOF &&
+	error: option \`output-format${SQ} requires a value
+	EOF
+	test_cmp expect actual &&
+
+	test_must_fail git fetch --output-format= origin >actual 2>&1 &&
+	cat >expect <<-EOF &&
+	error: unsupported output format ${SQ}${SQ}
+	EOF
+	test_cmp expect actual &&
+
+	test_must_fail git fetch --output-format=garbage origin >actual 2>&1 &&
+	cat >expect <<-EOF &&
+	error: unsupported output format ${SQ}garbage${SQ}
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'fetch aligned output' '
-	git clone . full-output &&
+	test_when_finished "rm -rf full-cfg full-cli" &&
+	git clone . full-cfg &&
+	git clone . full-cli &&
 	test_commit looooooooooooong-tag &&
-	(
-		cd full-output &&
-		git -c fetch.output=full fetch origin >actual 2>&1 &&
-		grep -e "->" actual | cut -c 22- >../actual
-	) &&
+
+	git -C full-cfg -c fetch.output=full fetch origin >actual-cfg 2>&1 &&
+	git -C full-cli fetch --output-format=full origin >actual-cli 2>&1 &&
+	test_cmp actual-cfg actual-cli &&
+
+	grep -e "->" actual-cfg | cut -c 22- >actual &&
 	cat >expect <<-\EOF &&
 	main                 -> origin/main
 	looooooooooooong-tag -> looooooooooooong-tag
@@ -40,13 +63,16 @@  test_expect_success 'fetch aligned output' '
 '
 
 test_expect_success 'fetch compact output' '
-	git clone . compact &&
+	test_when_finished "rm -rf compact-cfg compact-cli" &&
+	git clone . compact-cli &&
+	git clone . compact-cfg &&
 	test_commit extraaa &&
-	(
-		cd compact &&
-		git -c fetch.output=compact fetch origin >actual 2>&1 &&
-		grep -e "->" actual | cut -c 22- >../actual
-	) &&
+
+	git -C compact-cfg -c fetch.output=compact fetch origin >actual-cfg 2>&1 &&
+	git -C compact-cli fetch --output-format=compact origin >actual-cli 2>&1 &&
+	test_cmp actual-cfg actual-cli &&
+
+	grep -e "->" actual-cfg | cut -c 22- >actual &&
 	cat >expect <<-\EOF &&
 	main       -> origin/*
 	extraaa    -> *
@@ -54,6 +80,28 @@  test_expect_success 'fetch compact output' '
 	test_cmp expect actual
 '
 
+test_expect_success 'fetch compact output with multiple remotes' '
+	test_when_finished "rm -rf compact-cfg compact-cli" &&
+
+	git clone . compact-cli &&
+	git -C compact-cli remote add second-remote "$PWD" &&
+	git clone . compact-cfg &&
+	git -C compact-cfg remote add second-remote "$PWD" &&
+	test_commit multi-commit &&
+
+	git -C compact-cfg -c fetch.output=compact fetch --all >actual-cfg 2>&1 &&
+	git -C compact-cli fetch --output-format=compact --all >actual-cli 2>&1 &&
+	test_cmp actual-cfg actual-cli &&
+
+	grep -e "->" actual-cfg | cut -c 22- >actual &&
+	cat >expect <<-\EOF &&
+	main         -> origin/*
+	multi-commit -> *
+	main       -> second-remote/*
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'fetch output with HEAD and --dry-run' '
 	test_when_finished "rm -rf head" &&
 	git clone . head &&