diff mbox series

[v2,8/8] fetch: introduce machine-parseable "porcelain" output format

Message ID d7c1bc1a80406ad320c2de684e0c97ba14827c7a.1682593865.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series fetch: introduce machine-parseable output | expand

Commit Message

Patrick Steinhardt April 27, 2023, 11:13 a.m. UTC
The output of git-fetch(1) is obviously designed for consumption by
users, only: we neatly columnize data, we abbreviate reference names, we
print neat arrows and we don't provide information about actual object
IDs that have changed. This makes the output format basically unusable
in the context of scripted invocations of git-fetch(1) that want to
learn about the exact changes that the command performs.

Introduce a new machine-parseable "porcelain" output format that is
supposed to fix this shortcoming. This output format is intended to
provide information about every reference that is about to be updated,
the old object ID that the reference has been pointing to and the new
object ID it will be updated to. Furthermore, the output format provides
the same flags as the human-readable format to indicate basic conditions
for each reference update like whether it was a fast-forward update, a
branch deletion, a rejected update or others.

The output format is quite simple:

```
<flag> <old-object-id> <new-object-id> <local-reference>\n
```

We assume two conditions which are generally true:

    - The old and new object IDs have fixed known widths and cannot
      contain spaces.

    - References cannot contain newlines.

With these assumptions, the output format becomes unambiguously
parseable. Furthermore, given that this output is designed to be
consumed by scripts, the machine-readable data is printed to stdout
instead of stderr like the human-readable output is. This is mostly done
so that other data printed to stderr, like error messages or progress
meters, don't interfere with the parseable data.

A notable ommission here is that the output format does not include the
remote from which a reference was fetched, which might be important
information especially in the context of multi-remote fetches. But as
such a format would require us to print the remote for every single
reference update due to parallelizable fetches it feels wasteful for the
most likely usecase, which is when fetching from a single remote. If
usecases come up for this in the future though it is easy enough to add
a new "porcelain-v2" format that adds this information.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/fetch.txt  |  4 +-
 Documentation/fetch-options.txt |  4 +-
 Documentation/git-fetch.txt     | 17 ++++++-
 builtin/fetch.c                 | 47 +++++++++++++-----
 t/t5574-fetch-output.sh         | 84 +++++++++++++++++++++++++++++++++
 5 files changed, 139 insertions(+), 17 deletions(-)

Comments

Jacob Keller April 27, 2023, 7:52 p.m. UTC | #1
On 4/27/2023 4:13 AM, Patrick Steinhardt wrote:
> A notable ommission here is that the output format does not include the
> remote from which a reference was fetched, which might be important
> information especially in the context of multi-remote fetches. But as
> such a format would require us to print the remote for every single
> reference update due to parallelizable fetches it feels wasteful for the
> most likely usecase, which is when fetching from a single remote. If
> usecases come up for this in the future though it is easy enough to add
> a new "porcelain-v2" format that adds this information.

Thanks for clarifying this. I think generally its ok to leave out since
you can often infer what remote a ref came from (either because of
refs/remotes, or because of how you configured your refspecs for fetching).

We do have the option to extend this in the future if needed, so I think
that's ok. In that case if we wanted an even more verbose output I'd
probably argue for something more structured like adding it as part of a
JSON output.

Thanks,
Jake
Glen Choo April 28, 2023, 10:42 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> A notable ommission here is that the output format does not include the
> remote from which a reference was fetched, which might be important
> information especially in the context of multi-remote fetches. But as
> such a format would require us to print the remote for every single
> reference update due to parallelizable fetches it feels wasteful for the
> most likely usecase, which is when fetching from a single remote. If
> usecases come up for this in the future though it is easy enough to add
> a new "porcelain-v2" format that adds this information.

We discussed this elsewhere in the thread, but if we are just adding
information (and not omitting existing information or shuffling it
around), I would prefer for us to make the format extensible using flags
than to add a whole new format enum. We can't imagine what other
information users might want (maybe the remote ref name?), and it would
be nice to avoid bumping the 'porcelain version' unnecessarily.

I agree that this format is good enough as a starting point, though.

It's not new to v2, but I've mentioned my reservations on
--output-format on:

  https://lore.kernel.org/git/kl6ledo33ovx.fsf@chooglen-macbookpro.roam.corp.google.com/

I will be out of office all of next week. If, in the meantime, everyone
else decides that --output-format=porcelain is good enough, I'm happy to
accept the result.

> @@ -127,6 +193,24 @@ test_expect_success 'fetch output with HEAD and --dry-run' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'fetch porcelain output with HEAD and --dry-run' '
> +	test_when_finished "rm -rf head" &&
> +	git clone . head &&
> +	COMMIT_ID=$(git rev-parse HEAD) &&
> +
> +	git -C head fetch --output-format=porcelain --dry-run origin HEAD >actual &&
> +	cat >expect <<-EOF &&
> +	* $ZERO_OID $COMMIT_ID FETCH_HEAD
> +	EOF
> +	test_cmp expect actual &&
> +
> +	git -C head fetch --output-format=porcelain --dry-run origin HEAD:foo >actual &&
> +	cat >expect <<-EOF &&
> +	* $ZERO_OID $COMMIT_ID refs/heads/foo
> +	EOF
> +	test_cmp expect actual
> +'
> +

Now that the earlier test also tests without --dry-run, shouldn't this
one too?
diff mbox series

Patch

diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
index 568f0f75b3..70734226c0 100644
--- a/Documentation/config/fetch.txt
+++ b/Documentation/config/fetch.txt
@@ -52,8 +52,8 @@  fetch.pruneTags::
 
 fetch.output::
 	Control how ref update status is printed. Valid values are
-	`full` and `compact`. Default value is `full`. See section
-	OUTPUT in linkgit:git-fetch[1] for detail.
+	`full`, `compact` and `porcelain`. Default value is `full`.
+	See section OUTPUT in linkgit:git-fetch[1] for detail.
 
 fetch.negotiationAlgorithm::
 	Control how information about the commits in the local repository
diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt
index 654f96f79d..5ca8a67fe8 100644
--- a/Documentation/fetch-options.txt
+++ b/Documentation/fetch-options.txt
@@ -80,8 +80,8 @@  linkgit:git-config[1].
 
 --output-format::
 	Control how ref update status is printed. Valid values are
-	`full` and `compact`. Default value is `full`. See section
-	OUTPUT in linkgit:git-fetch[1] for detail.
+	`full`, `compact` and `porcelain`. Default value is `full`.
+	See section OUTPUT in linkgit:git-fetch[1] for detail.
 
 ifndef::git-pull[]
 --[no-]write-fetch-head::
diff --git a/Documentation/git-fetch.txt b/Documentation/git-fetch.txt
index fba66f1460..efd22cd372 100644
--- a/Documentation/git-fetch.txt
+++ b/Documentation/git-fetch.txt
@@ -197,13 +197,26 @@  The output of "git fetch" depends on the transport method used; this
 section describes the output when fetching over the Git protocol
 (either locally or via ssh) and Smart HTTP protocol.
 
-The status of the fetch is output in tabular form, with each line
-representing the status of a single ref. Each line is of the form:
+The output format can be chosen either via the `fetch.output` config
+(see linkgit:git-config[1]), or via the `--output-format` switch.
+Supported values include:
+
+For the `full` and `compact` output formats, the status of the fetch is
+output in tabular, with each line representing the status of a single
+ref. Each line is of the form:
 
 -------------------------------
  <flag> <summary> <from> -> <to> [<reason>]
 -------------------------------
 
+The `porcelain` output format is intended to be machine-parseable. In
+contrast to the human-readable output formats it thus prints to standard
+output instead of standard error. Each line is of the form:
+
+-------------------------------
+<flag> <old-object-id> <new-object-id> <local-reference>
+-------------------------------
+
 The status of up-to-date refs is shown only if the --verbose option is
 used.
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 30099b2ac3..abe6b8879d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -52,6 +52,7 @@  enum display_format {
 	DISPLAY_FORMAT_UNKNOWN = 0,
 	DISPLAY_FORMAT_FULL,
 	DISPLAY_FORMAT_COMPACT,
+	DISPLAY_FORMAT_PORCELAIN,
 	DISPLAY_FORMAT_MAX,
 };
 
@@ -59,6 +60,7 @@  static const char * const display_formats[DISPLAY_FORMAT_MAX] = {
 	NULL,
 	"full",
 	"compact",
+	"porcelain",
 };
 
 struct display_state {
@@ -752,8 +754,11 @@  static void display_state_init(struct display_state *display_state, struct ref *
 
 		break;
 	}
+	case DISPLAY_FORMAT_PORCELAIN:
+		/* We don't need to precompute anything here. */
+		break;
 	default:
-		BUG("unexpected display foramt %d", display_state->format);
+		BUG("unexpected display format %d", display_state->format);
 	}
 }
 
@@ -822,8 +827,12 @@  static void print_compact(struct display_state *display_state,
 static void display_ref_update(struct display_state *display_state, char code,
 			       const char *summary, const char *error,
 			       const char *remote, const char *local,
+			       const struct object_id *old_oid,
+			       const struct object_id *new_oid,
 			       int summary_width)
 {
+	FILE *f = stderr;
+
 	if (verbosity < 0)
 		return;
 
@@ -856,12 +865,17 @@  static void display_ref_update(struct display_state *display_state, char code,
 
 		break;
 	}
+	case DISPLAY_FORMAT_PORCELAIN:
+		strbuf_addf(&display_state->buf, "%c %s %s %s", code,
+			    oid_to_hex(old_oid), oid_to_hex(new_oid), local);
+		f = stdout;
+		break;
 	default:
 		BUG("unexpected display format %d", display_state->format);
 	};
 	strbuf_addch(&display_state->buf, '\n');
 
-	fputs(display_state->buf.buf, stderr);
+	fputs(display_state->buf.buf, f);
 }
 
 static int update_local_ref(struct ref *ref,
@@ -879,7 +893,8 @@  static int update_local_ref(struct ref *ref,
 	if (oideq(&ref->old_oid, &ref->new_oid)) {
 		if (verbosity > 0)
 			display_ref_update(display_state, '=', _("[up to date]"), NULL,
-					   remote_ref->name, ref->name, summary_width);
+					   remote_ref->name, ref->name,
+					   &ref->old_oid, &ref->new_oid, summary_width);
 		return 0;
 	}
 
@@ -892,7 +907,8 @@  static int update_local_ref(struct ref *ref,
 		 */
 		display_ref_update(display_state, '!', _("[rejected]"),
 				   _("can't fetch into checked-out branch"),
-				   remote_ref->name, ref->name, summary_width);
+				   remote_ref->name, ref->name,
+				   &ref->old_oid, &ref->new_oid, summary_width);
 		return 1;
 	}
 
@@ -903,12 +919,14 @@  static int update_local_ref(struct ref *ref,
 			r = s_update_ref("updating tag", ref, transaction, 0);
 			display_ref_update(display_state, r ? '!' : 't', _("[tag update]"),
 					   r ? _("unable to update local ref") : NULL,
-					   remote_ref->name, ref->name, summary_width);
+					   remote_ref->name, ref->name,
+					   &ref->old_oid, &ref->new_oid, summary_width);
 			return r;
 		} else {
 			display_ref_update(display_state, '!', _("[rejected]"),
 					   _("would clobber existing tag"),
-					   remote_ref->name, ref->name, summary_width);
+					   remote_ref->name, ref->name,
+					   &ref->old_oid, &ref->new_oid, summary_width);
 			return 1;
 		}
 	}
@@ -941,7 +959,8 @@  static int update_local_ref(struct ref *ref,
 		r = s_update_ref(msg, ref, transaction, 0);
 		display_ref_update(display_state, r ? '!' : '*', what,
 				   r ? _("unable to update local ref") : NULL,
-				   remote_ref->name, ref->name, summary_width);
+				   remote_ref->name, ref->name,
+				   &ref->old_oid, &ref->new_oid, summary_width);
 		return r;
 	}
 
@@ -963,7 +982,8 @@  static int update_local_ref(struct ref *ref,
 		r = s_update_ref("fast-forward", ref, transaction, 1);
 		display_ref_update(display_state, r ? '!' : ' ', quickref.buf,
 				   r ? _("unable to update local ref") : NULL,
-				   remote_ref->name, ref->name, summary_width);
+				   remote_ref->name, ref->name,
+				   &ref->old_oid, &ref->new_oid, summary_width);
 		strbuf_release(&quickref);
 		return r;
 	} else if (force || ref->force) {
@@ -975,12 +995,14 @@  static int update_local_ref(struct ref *ref,
 		r = s_update_ref("forced-update", ref, transaction, 1);
 		display_ref_update(display_state, r ? '!' : '+', quickref.buf,
 				   r ? _("unable to update local ref") : _("forced update"),
-				   remote_ref->name, ref->name, summary_width);
+				   remote_ref->name, ref->name,
+				   &ref->old_oid, &ref->new_oid, summary_width);
 		strbuf_release(&quickref);
 		return r;
 	} else {
 		display_ref_update(display_state, '!', _("[rejected]"), _("non-fast-forward"),
-				   remote_ref->name, ref->name, summary_width);
+				   remote_ref->name, ref->name,
+				   &ref->old_oid, &ref->new_oid, summary_width);
 		return 1;
 	}
 }
@@ -1221,7 +1243,9 @@  static int store_updated_refs(struct display_state *display_state,
 				display_ref_update(display_state, '*',
 						   *kind ? kind : "branch", NULL,
 						   rm->name,
-						   "FETCH_HEAD", summary_width);
+						   "FETCH_HEAD",
+						   &rm->new_oid, &rm->old_oid,
+						   summary_width);
 			}
 		}
 	}
@@ -1361,6 +1385,7 @@  static int prune_refs(struct display_state *display_state,
 		for (ref = stale_refs; ref; ref = ref->next) {
 			display_ref_update(display_state, '-', _("[deleted]"), NULL,
 					   _("(none)"), ref->name,
+					   &ref->new_oid, &ref->old_oid,
 					   summary_width);
 			warn_dangling_symref(stderr, dangling_msg, ref->name);
 		}
diff --git a/t/t5574-fetch-output.sh b/t/t5574-fetch-output.sh
index 662c960f94..d88ad8af31 100755
--- a/t/t5574-fetch-output.sh
+++ b/t/t5574-fetch-output.sh
@@ -102,6 +102,72 @@  test_expect_success 'fetch compact output with multiple remotes' '
 	test_cmp expect actual
 '
 
+test_expect_success 'fetch porcelain output' '
+	test_when_finished "rm -rf porcelain-cfg porcelain-cli" &&
+
+	# Set up a bunch of references that we can use to demonstrate different
+	# kinds of flag symbols in the output format.
+	MAIN_OLD=$(git rev-parse HEAD) &&
+	git branch "fast-forward" &&
+	git branch "deleted-branch" &&
+	git checkout -b force-updated &&
+	test_commit --no-tag force-update-old &&
+	FORCE_UPDATED_OLD=$(git rev-parse HEAD) &&
+	git checkout main &&
+
+	# Clone and pre-seed the repositories. We fetch references into two
+	# namespaces so that we can test that rejected and force-updated
+	# references are reported properly.
+	refspecs="refs/heads/*:refs/unforced/* +refs/heads/*:refs/forced/*" &&
+	git clone . porcelain-cli &&
+	git clone . porcelain-cfg &&
+	git -C porcelain-cfg fetch origin $refspecs &&
+	git -C porcelain-cli fetch origin $refspecs &&
+
+	# Now that we have set up the client repositories we can change our
+	# local references.
+	git branch new-branch &&
+	git branch -d deleted-branch &&
+	git checkout fast-forward &&
+	test_commit --no-tag fast-forward-new &&
+	FAST_FORWARD_NEW=$(git rev-parse HEAD) &&
+	git checkout force-updated &&
+	git reset --hard HEAD~ &&
+	test_commit --no-tag force-update-new &&
+	FORCE_UPDATED_NEW=$(git rev-parse HEAD) &&
+
+	cat >expect <<-EOF &&
+	- $MAIN_OLD $ZERO_OID refs/forced/deleted-branch
+	- $MAIN_OLD $ZERO_OID refs/unforced/deleted-branch
+	  $MAIN_OLD $FAST_FORWARD_NEW refs/unforced/fast-forward
+	! $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/unforced/force-updated
+	* $ZERO_OID $MAIN_OLD refs/unforced/new-branch
+	  $MAIN_OLD $FAST_FORWARD_NEW refs/forced/fast-forward
+	+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/forced/force-updated
+	* $ZERO_OID $MAIN_OLD refs/forced/new-branch
+	  $MAIN_OLD $FAST_FORWARD_NEW refs/remotes/origin/fast-forward
+	+ $FORCE_UPDATED_OLD $FORCE_UPDATED_NEW refs/remotes/origin/force-updated
+	* $ZERO_OID $MAIN_OLD refs/remotes/origin/new-branch
+	EOF
+
+	# Execute a dry-run fetch first. We do this to assert that the dry-run
+	# and non-dry-run fetches produces the same output. Execution of the
+	# fetch is expected to fail as we have a rejected reference update.
+	test_must_fail git -C porcelain-cfg -c fetch.output=porcelain fetch --dry-run --prune origin $refspecs >actual-dry-run-cfg &&
+	test_must_fail git -C porcelain-cli fetch --output-format=porcelain --dry-run --prune origin $refspecs >actual-dry-run-cli &&
+	test_cmp actual-dry-run-cfg actual-dry-run-cli &&
+	test_cmp expect actual-dry-run-cfg &&
+
+	# And now we perform a non-dry-run fetch.
+	test_must_fail git -C porcelain-cfg -c fetch.output=porcelain fetch --prune origin $refspecs >actual-cfg &&
+	test_must_fail git -C porcelain-cli fetch --output-format=porcelain --prune origin $refspecs >actual-cli &&
+	test_cmp actual-cfg actual-cli &&
+	test_cmp expect actual-cfg &&
+
+	# Ensure that the dry-run and non-dry-run output matches.
+	test_cmp actual-dry-run-cfg actual-cfg
+'
+
 test_expect_success 'fetch output with HEAD and --dry-run' '
 	test_when_finished "rm -rf head" &&
 	git clone . head &&
@@ -127,6 +193,24 @@  test_expect_success 'fetch output with HEAD and --dry-run' '
 	test_cmp expect actual
 '
 
+test_expect_success 'fetch porcelain output with HEAD and --dry-run' '
+	test_when_finished "rm -rf head" &&
+	git clone . head &&
+	COMMIT_ID=$(git rev-parse HEAD) &&
+
+	git -C head fetch --output-format=porcelain --dry-run origin HEAD >actual &&
+	cat >expect <<-EOF &&
+	* $ZERO_OID $COMMIT_ID FETCH_HEAD
+	EOF
+	test_cmp expect actual &&
+
+	git -C head fetch --output-format=porcelain --dry-run origin HEAD:foo >actual &&
+	cat >expect <<-EOF &&
+	* $ZERO_OID $COMMIT_ID refs/heads/foo
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success '--no-show-forced-updates' '
 	mkdir forced-updates &&
 	(