diff mbox series

worktree: add -z option for list subcommand

Message ID 20210105110219.99610-1-phillip.wood123@gmail.com (mailing list archive)
State New, archived
Headers show
Series worktree: add -z option for list subcommand | expand

Commit Message

Phillip Wood Jan. 5, 2021, 11:02 a.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

Add a -z option to be used in conjunction with --porcelain that gives
NUL-terminated output. This enables 'worktree list --porcelain' to
handle worktree paths that contain newlines.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
I found an old patch that added NUL termination. I've rebased it
but the new test fails as there seems to be another worktree thats
been added since I wrote this, anyway I thought it might be a useful
start for adding a `-z` option.

 Documentation/git-worktree.txt | 13 +++++++++---
 builtin/worktree.c             | 36 ++++++++++++++++++++++------------
 t/t2402-worktree-list.sh       | 22 +++++++++++++++++++++
 3 files changed, 56 insertions(+), 15 deletions(-)

Comments

Eric Sunshine Jan. 7, 2021, 3:34 a.m. UTC | #1
On Tue, Jan 5, 2021 at 6:02 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> Add a -z option to be used in conjunction with --porcelain that gives
> NUL-terminated output. This enables 'worktree list --porcelain' to
> handle worktree paths that contain newlines.

Adding a -z mode makes a lot of sense. This, along with a fix to call
quote_c_style() on paths in normal (not `-z`) porcelain mode, can
easily be done after Rafael's series lands. Or they could be done
before his series, though that might make a lot of extra work for him.

> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> I found an old patch that added NUL termination. I've rebased it
> but the new test fails as there seems to be another worktree thats
> been added since I wrote this, anyway I thought it might be a useful
> start for adding a `-z` option.

The test failure is fallout from a "bug" in a test added by Rafael's
earlier series[1] which was not caught by the reviewer[2]. In fact,
his current series has a drive-by fix[3] for this bug in patch [6/7].
Applying that fix (or the refinement[4] I suggested in my review)
makes your test work.

[1]: https://lore.kernel.org/git/20201011101152.5291-2-rafaeloliveira.cs@gmail.com/
[2]: https://lore.kernel.org/git/CAPig+cS8hvX4GCsnfLBnQ4Q_AkUad=bw7rjVcaOqSEqcLZvx8w@mail.gmail.com/
[3]: https://lore.kernel.org/git/20210104162128.95281-7-rafaeloliveira.cs@gmail.com/
[4]: https://lore.kernel.org/git/CAPig+cRysXpK0e1xXOuVd+EtkeyTk8FR6MUrL=Bg3W4ve8jdNA@mail.gmail.com/

> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> @@ -217,7 +217,13 @@ This can also be set up as the default behaviour by using the
> +-z::
> +       When `--porcelain` is specified with `list` terminate each line with a
> +       NUL rather than a newline. This makes it possible to parse the output
> +       when a worktree path contains a newline character.

If we fix normal-mode porcelain to call quote_c_style(), then we can
drop the last sentence or refine it to say something along the lines
of it being easier to deal with paths with embedded newlines than in
normal porcelain mode.

> diff --git a/builtin/worktree.c b/builtin/worktree.c
> @@ -640,19 +640,25 @@ static int add(int ac, const char **av, const char *prefix)
> +static void show_worktree_porcelain(struct worktree *wt, int line_terminator)
>  {
> +       printf("worktree %s", wt->path);
> +       fputc(line_terminator, stdout);
> +       if (wt->is_bare) {
> +               printf("bare");
> +               fputc(line_terminator, stdout);
> +       } else {
> +               printf("HEAD %s", oid_to_hex(&wt->head_oid));
> +               fputc(line_terminator, stdout);
> +               if (wt->is_detached) {
> +                       printf("detached");
> +                       fputc(line_terminator, stdout);
> +               } else if (wt->head_ref) {
> +                       printf("branch %s", wt->head_ref);
> +                       fputc(line_terminator, stdout);
> +               }

Perhaps this could all be done a bit more concisely with printf()
alone rather than combining it with fputc(). For instance:

    printf("worktree %s%c", wt->path, line_terminator);

and so on.

> -       printf("\n");
> +       fputc(line_terminator, stdout);

This use of fputc() makes sense.

> @@ -720,15 +726,20 @@ static void pathsort(struct worktree **wt)
>  static int list(int ac, const char **av, const char *prefix)
>  {
> +               OPT_SET_INT('z', NULL, &line_terminator,
> +                           N_("fields are separated with NUL character"), '\0'),

"fields" sounds a little odd. "lines" might make more sense. "records"
might be even better and matches the wording of some other Git
commands accepting `-z`.

> +       else if (!line_terminator && !porcelain)
> +               die(_("'-z' requires '--porcelain'"));

Other error messages in this file don't quote command-line options, so:

    die(_("-z requires --porcelain"));

would be more consistent.

> diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
> @@ -71,6 +71,28 @@ test_expect_success '"list" all worktrees with locked annotation' '
> +test_expect_success '"list" all worktrees --porcelain -z' '
> +       test_when_finished "rm -rf here _actual actual expect &&
> +                               git worktree prune" &&
> +       printf "worktree %sQHEAD %sQbranch %sQQ" \
> +               "$(git rev-parse --show-toplevel)" \
> +               "$(git rev-parse HEAD)" \
> +               "$(git symbolic-ref HEAD)" >expect &&
> +       git worktree add --detach here master &&
> +       printf "worktree %sQHEAD %sQdetachedQQ" \
> +               "$(git -C here rev-parse --show-toplevel)" \
> +               "$(git rev-parse HEAD)" >>expect &&
> +       git worktree list --porcelain -z >_actual &&
> +       cat _actual | tr "\0" Q >actual &&

Or just use nul_to_q():

    nul_to_q <_actual >actual &&

(And there's no need to `cat` the file.)

> +       test_cmp expect actual
> +'
> +
> +test_expect_success '"list" -z fails without --porcelain' '
> +       test_when_finished "rm -rf here && git worktree prune" &&
> +       git worktree add --detach here master &&
> +       test_must_fail git worktree list -z
> +'

I don't think there's any need for this test to create (and cleanup) a
worktree. So, the entire test could collapse to:

    test_expect_success '"list" -z fails without --porcelain' '
        test_must_fail git worktree list -z
    '
Phillip Wood Jan. 8, 2021, 10:33 a.m. UTC | #2
Hi Eric & Rafael

On 07/01/2021 03:34, Eric Sunshine wrote:
> On Tue, Jan 5, 2021 at 6:02 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>> Add a -z option to be used in conjunction with --porcelain that gives
>> NUL-terminated output. This enables 'worktree list --porcelain' to
>> handle worktree paths that contain newlines.
> 
> Adding a -z mode makes a lot of sense. This, along with a fix to call
> quote_c_style() on paths in normal (not `-z`) porcelain mode, 

I'm concerned that starting to quote paths will break backwards 
compatibility. Even if we restricted the quoting to just those paths 
that contain '\n' there is no way to distinguish between a quoted path 
and one that begins and ends with ". This is the reason I prefer to add 
`-z` instead of taking Rafael's patch to quote the lock reason as that 
patch still leaves the output of `git worktree list --porcelain` 
ambiguous and it cannot be fixed without breaking existing users. A 
counter argument to all this is that there are thousands of users on 
file systems that cannot have newlines in paths and Rafael's patch is 
definitely a net win for them.

> can
> easily be done after Rafael's series lands. Or they could be done
> before his series, though that might make a lot of extra work for him.

I would definitely be easiest to wait for Rafael's series to land if we 
want to add `-z` separately

>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>> I found an old patch that added NUL termination. I've rebased it
>> but the new test fails as there seems to be another worktree thats
>> been added since I wrote this, anyway I thought it might be a useful
>> start for adding a `-z` option.
> 
> The test failure is fallout from a "bug" in a test added by Rafael's
> earlier series[1] which was not caught by the reviewer[2]. In fact,
> his current series has a drive-by fix[3] for this bug in patch [6/7].
> Applying that fix (or the refinement[4] I suggested in my review)
> makes your test work.

Thanks ever so much for taking the time to track down the cause of the 
test failure.

Best Wishes

Phillip

> [1]: https://lore.kernel.org/git/20201011101152.5291-2-rafaeloliveira.cs@gmail.com/
> [2]: https://lore.kernel.org/git/CAPig+cS8hvX4GCsnfLBnQ4Q_AkUad=bw7rjVcaOqSEqcLZvx8w@mail.gmail.com/
> [3]: https://lore.kernel.org/git/20210104162128.95281-7-rafaeloliveira.cs@gmail.com/
> [4]: https://lore.kernel.org/git/CAPig+cRysXpK0e1xXOuVd+EtkeyTk8FR6MUrL=Bg3W4ve8jdNA@mail.gmail.com/
> 
>> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
>> @@ -217,7 +217,13 @@ This can also be set up as the default behaviour by using the
>> +-z::
>> +       When `--porcelain` is specified with `list` terminate each line with a
>> +       NUL rather than a newline. This makes it possible to parse the output
>> +       when a worktree path contains a newline character.
> 
> If we fix normal-mode porcelain to call quote_c_style(), then we can
> drop the last sentence or refine it to say something along the lines
> of it being easier to deal with paths with embedded newlines than in
> normal porcelain mode.
> 
>> diff --git a/builtin/worktree.c b/builtin/worktree.c
>> @@ -640,19 +640,25 @@ static int add(int ac, const char **av, const char *prefix)
>> +static void show_worktree_porcelain(struct worktree *wt, int line_terminator)
>>   {
>> +       printf("worktree %s", wt->path);
>> +       fputc(line_terminator, stdout);
>> +       if (wt->is_bare) {
>> +               printf("bare");
>> +               fputc(line_terminator, stdout);
>> +       } else {
>> +               printf("HEAD %s", oid_to_hex(&wt->head_oid));
>> +               fputc(line_terminator, stdout);
>> +               if (wt->is_detached) {
>> +                       printf("detached");
>> +                       fputc(line_terminator, stdout);
>> +               } else if (wt->head_ref) {
>> +                       printf("branch %s", wt->head_ref);
>> +                       fputc(line_terminator, stdout);
>> +               }
> 
> Perhaps this could all be done a bit more concisely with printf()
> alone rather than combining it with fputc(). For instance:
> 
>      printf("worktree %s%c", wt->path, line_terminator);
> 
> and so on.
> 
>> -       printf("\n");
>> +       fputc(line_terminator, stdout);
> 
> This use of fputc() makes sense.
> 
>> @@ -720,15 +726,20 @@ static void pathsort(struct worktree **wt)
>>   static int list(int ac, const char **av, const char *prefix)
>>   {
>> +               OPT_SET_INT('z', NULL, &line_terminator,
>> +                           N_("fields are separated with NUL character"), '\0'),
> 
> "fields" sounds a little odd. "lines" might make more sense. "records"
> might be even better and matches the wording of some other Git
> commands accepting `-z`.
> 
>> +       else if (!line_terminator && !porcelain)
>> +               die(_("'-z' requires '--porcelain'"));
> 
> Other error messages in this file don't quote command-line options, so:
> 
>      die(_("-z requires --porcelain"));
> 
> would be more consistent.
> 
>> diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
>> @@ -71,6 +71,28 @@ test_expect_success '"list" all worktrees with locked annotation' '
>> +test_expect_success '"list" all worktrees --porcelain -z' '
>> +       test_when_finished "rm -rf here _actual actual expect &&
>> +                               git worktree prune" &&
>> +       printf "worktree %sQHEAD %sQbranch %sQQ" \
>> +               "$(git rev-parse --show-toplevel)" \
>> +               "$(git rev-parse HEAD)" \
>> +               "$(git symbolic-ref HEAD)" >expect &&
>> +       git worktree add --detach here master &&
>> +       printf "worktree %sQHEAD %sQdetachedQQ" \
>> +               "$(git -C here rev-parse --show-toplevel)" \
>> +               "$(git rev-parse HEAD)" >>expect &&
>> +       git worktree list --porcelain -z >_actual &&
>> +       cat _actual | tr "\0" Q >actual &&
> 
> Or just use nul_to_q():
> 
>      nul_to_q <_actual >actual &&
> 
> (And there's no need to `cat` the file.)
> 
>> +       test_cmp expect actual
>> +'
>> +
>> +test_expect_success '"list" -z fails without --porcelain' '
>> +       test_when_finished "rm -rf here && git worktree prune" &&
>> +       git worktree add --detach here master &&
>> +       test_must_fail git worktree list -z
>> +'
> 
> I don't think there's any need for this test to create (and cleanup) a
> worktree. So, the entire test could collapse to:
> 
>      test_expect_success '"list" -z fails without --porcelain' '
>          test_must_fail git worktree list -z
>      '
>
Eric Sunshine Jan. 10, 2021, 7:27 a.m. UTC | #3
On Fri, Jan 8, 2021 at 5:33 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> On 07/01/2021 03:34, Eric Sunshine wrote:
> > On Tue, Jan 5, 2021 at 6:02 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
> >> Add a -z option to be used in conjunction with --porcelain that gives
> >> NUL-terminated output. This enables 'worktree list --porcelain' to
> >> handle worktree paths that contain newlines.
> >
> > Adding a -z mode makes a lot of sense. This, along with a fix to call
> > quote_c_style() on paths in normal (not `-z`) porcelain mode,
>
> I'm concerned that starting to quote paths will break backwards
> compatibility. Even if we restricted the quoting to just those paths
> that contain '\n' there is no way to distinguish between a quoted path
> and one that begins and ends with ".

Backward compatibility is a valid concern, though I haven't managed to
convince myself that it would matter much in this case. In one sense,
the failure of the porcelain format to properly quote/escape paths
when needed can be viewed as an outright bug and, although we value
backward compatibility, we also value correctness, and such bug fixes
have been accepted in the past. Especially in a case such as this, it
seems exceedingly unlikely that fixing the bug would be harmful or
break existing tooling (though, of course that possibility always
exists, even if remotely so).

I can imagine ways in which tooling might be engineered to work around
the shortcoming that `git worktree list --porcelain` doesn't properly
handle newlines embedded in paths, but such tooling would almost
certainly be so fragile anyhow that it would break as we add more keys
to the extensible porcelain format. Coupled with the fact that
newlines embedded in paths are so exceedingly unlikely, it's hard to
imagine that fixing this bug would have a big impact on existing
tooling.

The case you mention about a path which happens to have a double-quote
as its first character does concern me a bit more since existing
tooling wouldn't have had to jump through hoops, or indeed do anything
special, with such paths, unlike the embedded newline case. But then,
too, it's pretty hard to imagine this coming up much, if at all, in
practice. That's not to say that I can't imagine a path, in general,
beginning with a quote, but keeping in mind that we're talking only
about worktree paths, it seems exceedingly unlikely.

My gut feeling (for what it's worth) is that worktree paths containing
embedded newlines (or other special characters) or beginning with a
double-quote is so unlikely to come in in actual practice that viewing
this as a bug fix is probably a reasonable approach, whereas some
other approach -- such as introducing porcelain-v2 or creating a new
porcelain key, say "worktreepath" which supersedes "worktree" (without
retiring "worktree") -- may be overkill.

None of the above is an argument against a complementary `-z` mode,
which I think is a very good idea.

> This is the reason I prefer to add
> `-z` instead of taking Rafael's patch to quote the lock reason as that
> patch still leaves the output of `git worktree list --porcelain`
> ambiguous and it cannot be fixed without breaking existing users. A
> counter argument to all this is that there are thousands of users on
> file systems that cannot have newlines in paths and Rafael's patch is
> definitely a net win for them.

Rafael's patch is quoting only the lock-reason, not the worktree path,
so I think it's orthogonal to this discussion. Also, his patch is
introducing `lock` as a new attribute in porcelain output, not
modifying behavior of an existing `lock` attribute.
diff mbox series

Patch

diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index af06128cc9..a07b593cc3 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -10,7 +10,7 @@  SYNOPSIS
 --------
 [verse]
 'git worktree add' [-f] [--detach] [--checkout] [--lock] [-b <new-branch>] <path> [<commit-ish>]
-'git worktree list' [--porcelain]
+'git worktree list' [--porcelain [-z]]
 'git worktree lock' [--reason <string>] <worktree>
 'git worktree move' <worktree> <new-path>
 'git worktree prune' [-n] [-v] [--expire <expire>]
@@ -217,7 +217,13 @@  This can also be set up as the default behaviour by using the
 --porcelain::
 	With `list`, output in an easy-to-parse format for scripts.
 	This format will remain stable across Git versions and regardless of user
-	configuration.  See below for details.
+	configuration.  It is recommended to combine this with `-z`.
+	See below for details.
+
+-z::
+	When `--porcelain` is specified with `list` terminate each line with a
+	NUL rather than a newline. This makes it possible to parse the output
+	when a worktree path contains a newline character.
 
 -q::
 --quiet::
@@ -369,7 +375,8 @@  $ git worktree list
 
 Porcelain Format
 ~~~~~~~~~~~~~~~~
-The porcelain format has a line per attribute.  Attributes are listed with a
+The porcelain format has a line per attribute.  If `-z` is given then the lines
+are terminated with NUL rather than a newline.  Attributes are listed with a
 label and value separated by a single space.  Boolean attributes (like `bare`
 and `detached`) are listed as a label only, and are present only
 if the value is true.  The first attribute of a working tree is always
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 197fd24a55..0cd331873c 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -640,19 +640,25 @@  static int add(int ac, const char **av, const char *prefix)
 	return add_worktree(path, branch, &opts);
 }
 
-static void show_worktree_porcelain(struct worktree *wt)
+static void show_worktree_porcelain(struct worktree *wt, int line_terminator)
 {
-	printf("worktree %s\n", wt->path);
-	if (wt->is_bare)
-		printf("bare\n");
-	else {
-		printf("HEAD %s\n", oid_to_hex(&wt->head_oid));
-		if (wt->is_detached)
-			printf("detached\n");
-		else if (wt->head_ref)
-			printf("branch %s\n", wt->head_ref);
+	printf("worktree %s", wt->path);
+	fputc(line_terminator, stdout);
+	if (wt->is_bare) {
+		printf("bare");
+		fputc(line_terminator, stdout);
+	} else {
+		printf("HEAD %s", oid_to_hex(&wt->head_oid));
+		fputc(line_terminator, stdout);
+		if (wt->is_detached) {
+			printf("detached");
+			fputc(line_terminator, stdout);
+		} else if (wt->head_ref) {
+			printf("branch %s", wt->head_ref);
+			fputc(line_terminator, stdout);
+		}
 	}
-	printf("\n");
+	fputc(line_terminator, stdout);
 }
 
 static void show_worktree(struct worktree *wt, int path_maxlen, int abbrev_len)
@@ -720,15 +726,20 @@  static void pathsort(struct worktree **wt)
 static int list(int ac, const char **av, const char *prefix)
 {
 	int porcelain = 0;
+	int line_terminator = '\n';
 
 	struct option options[] = {
 		OPT_BOOL(0, "porcelain", &porcelain, N_("machine-readable output")),
+		OPT_SET_INT('z', NULL, &line_terminator,
+			    N_("fields are separated with NUL character"), '\0'),
 		OPT_END()
 	};
 
 	ac = parse_options(ac, av, prefix, options, worktree_usage, 0);
 	if (ac)
 		usage_with_options(worktree_usage, options);
+	else if (!line_terminator && !porcelain)
+		die(_("'-z' requires '--porcelain'"));
 	else {
 		struct worktree **worktrees = get_worktrees();
 		int path_maxlen = 0, abbrev = DEFAULT_ABBREV, i;
@@ -741,7 +752,8 @@  static int list(int ac, const char **av, const char *prefix)
 
 		for (i = 0; worktrees[i]; i++) {
 			if (porcelain)
-				show_worktree_porcelain(worktrees[i]);
+				show_worktree_porcelain(worktrees[i],
+							line_terminator);
 			else
 				show_worktree(worktrees[i], path_maxlen, abbrev);
 		}
diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
index 795ddca2e4..acd9f8ab84 100755
--- a/t/t2402-worktree-list.sh
+++ b/t/t2402-worktree-list.sh
@@ -71,6 +71,28 @@  test_expect_success '"list" all worktrees with locked annotation' '
 	! grep "/unlocked  *[0-9a-f].* locked$" out
 '
 
+test_expect_success '"list" all worktrees --porcelain -z' '
+	test_when_finished "rm -rf here _actual actual expect &&
+				git worktree prune" &&
+	printf "worktree %sQHEAD %sQbranch %sQQ" \
+		"$(git rev-parse --show-toplevel)" \
+		"$(git rev-parse HEAD)" \
+		"$(git symbolic-ref HEAD)" >expect &&
+	git worktree add --detach here master &&
+	printf "worktree %sQHEAD %sQdetachedQQ" \
+		"$(git -C here rev-parse --show-toplevel)" \
+		"$(git rev-parse HEAD)" >>expect &&
+	git worktree list --porcelain -z >_actual &&
+	cat _actual | tr "\0" Q >actual	&&
+	test_cmp expect actual
+'
+
+test_expect_success '"list" -z fails without --porcelain' '
+	test_when_finished "rm -rf here && git worktree prune" &&
+	git worktree add --detach here master &&
+	test_must_fail git worktree list -z
+'
+
 test_expect_success 'bare repo setup' '
 	git init --bare bare1 &&
 	echo "data" >file1 &&