diff mbox series

Message ID 20200414074204.677139-1-matheus.bernardino@usp.br (mailing list archive)
State New, archived
Headers show
Series | expand

Commit Message

Matheus Tavares April 14, 2020, 7:42 a.m. UTC
On Mon, Apr 13, 2020 at 6:57 PM Greg Hurrell <greg@hurrell.net> wrote:
>
> It seems that `git-grep -lz` behaves differently depending on whether
> it is inside a subdirectory:
[...]
> $ git grep -lz content
> an "example".txt^@nested/other "example".txt^@
>
> Note that, as expected, the files are NUL-terminated and not wrapped
> in quotes. ("^@" represents NUL byte.)
>
> $ cd nested
> $ git grep -lz content
> "other \"example\".txt"^@
>
> As soon as we move into a subdirectory, files are wrapped in quotes
> and contain escapes, despite the "-z" switch.

I think this is a bug in "plain" git-grep, not in -z specifically. In some git
commands, -z has the effect of unquoting the output paths. For example, the
docs for git-ls-files says: "-z: \0 line termination on output and do not quote
filenames." In git-grep, however, the -z doc entry only says: "Output \0
instead of the character that normally follows a file name."

So -z does not seem to affect quoting in git-grep. But should we change this, to
quote unusual pathnames (relative or not) by default, and let -z fall back to
the old behavior? This would be more consistent with other commands such as
git-ls-files and to the core.quotePath setting.

However, perhaps output paths are never intended to be displayed quoted in
git-grep (with or without -z) in order to be consistent with GNU grep. I don't
know which of these options is correct (if any), so I would love to hear what
others have to say about it.

But if the second is correct, I think we can use the following patch to solve
the reported bug:

(I'm just wondering: should we also add the information at core.quotePath that
git-grep does not comply with this setting, to be consistent with GNU grep {or
for any other reason}?)

-- >8 --
Subject: [RFC PATCH] grep: fix inconsistent output of unusual pathnames

When grepping from the repository root, paths that contain unusual
characters are not output quoted. However, they are quoted when grepping
from a subdir without --full-name. For example:

$ echo content >'an "unusual" name.txt'
$ mkdir subdir
$ echo content >'subdir/another "unusual" name.txt'
$ git add .
$ git commit -m content

Then:
$ git grep content
an "unusual" name.txt:1:content
subdir/another "unusual" name.txt:1:content

But:
$ cd subdir
$ git grep content
"another \"unusual\" name.txt":1:content

Fix this inconsistency by not quoting unusual pathnames (relative or
not), which is also the default behavior in GNU grep. Also add some
tests to prevent regressions.

Note: some non-related tests that compare git-grep output against
git-ls-files also had to be modified. This is because the testing repo
now contains some unusual pathnames. And git-ls-files will quote such
paths by default, whereas git-grep won't.

Reported-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c  | 15 ++++++++++-----
 t/t7810-grep.sh | 33 ++++++++++++++++++++++++++++++---
 2 files changed, 40 insertions(+), 8 deletions(-)

--
2.26.0
diff mbox series

Patch

diff --git a/builtin/grep.c b/builtin/grep.c
index 99e2685090..ca21f98315 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -303,8 +303,12 @@  static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 	struct grep_source gs;

 	if (opt->relative && opt->prefix_length) {
-		quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
-		strbuf_insert(&pathbuf, 0, filename, tree_name_len);
+		struct strbuf rel_buf = STRBUF_INIT;
+		const char *rel_name = relative_path(filename + tree_name_len,
+						     opt->prefix, &rel_buf);
+		strbuf_add(&pathbuf, filename, tree_name_len);
+		strbuf_addstr(&pathbuf, rel_name);
+		strbuf_release(&rel_buf);
 	} else {
 		strbuf_addstr(&pathbuf, filename);
 	}
@@ -333,13 +337,14 @@  static int grep_file(struct grep_opt *opt, const char *filename)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct grep_source gs;
+	const char *gs_name;

 	if (opt->relative && opt->prefix_length)
-		quote_path_relative(filename, opt->prefix, &buf);
+		gs_name = relative_path(filename, opt->prefix, &buf);
 	else
-		strbuf_addstr(&buf, filename);
+		gs_name = filename;

-	grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename);
+	grep_source_init(&gs, GREP_SOURCE_FILE, gs_name, filename, filename);
 	strbuf_release(&buf);

 	if (num_threads > 1) {
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 7d7b396c23..23911a3574 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -72,6 +72,8 @@  test_expect_success setup '
 	# Still a no-op.
 	function dummy() {}
 	EOF
+	echo unusual >"\"unusual\" pathname" &&
+	echo unusual >"t/\"unusual\" pathname2" &&
 	git add . &&
 	test_tick &&
 	git commit -m initial
@@ -481,6 +483,26 @@  do
 		git grep --count -h -e b $H -- ab >actual &&
 		test_cmp expected actual
 	'
+
+	test_expect_success "grep $L should not quote unusual pathnames" '
+		cat >expected <<-EOF &&
+		${HC}"unusual" pathname:unusual
+		${HC}t/"unusual" pathname2:unusual
+		EOF
+		git grep unusual $H >actual &&
+		test_cmp expected actual
+	'
+
+	test_expect_success "grep $L should not quote unusual relative pathnames" '
+		cat >expected <<-EOF &&
+		${HC}"unusual" pathname2:unusual
+		EOF
+		(
+			cd t &&
+			git grep unusual $H
+		) >actual &&
+		test_cmp expected actual
+	'
 done

 cat >expected <<EOF
@@ -500,7 +522,8 @@  test_expect_success 'grep -c -C' '
 '

 test_expect_success 'grep -L -C' '
-	git ls-files >expected &&
+	git ls-files -z >ls-files-z &&
+	tr "\0" "\n" <ls-files-z >expected &&
 	git grep -L -C1 nonexistent_string >actual &&
 	test_cmp expected actual
 '
@@ -1686,7 +1709,9 @@  test_expect_success 'grep does not report i-t-a with -L --cached' '
 	echo "intend to add this" >intend-to-add &&
 	git add -N intend-to-add &&
 	test_when_finished "git rm -f intend-to-add" &&
-	git ls-files | grep -v "^intend-to-add\$" >expected &&
+	git ls-files -z >ls-files-z &&
+	tr "\0" "\n" <ls-files-z >files &&
+	grep -v "^intend-to-add\$" files >expected &&
 	git grep -L --cached "nonexistent_string" >actual &&
 	test_cmp expected actual
 '
@@ -1696,7 +1721,9 @@  test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
 	git add -N intend-to-add-assume-unchanged &&
 	test_when_finished "git rm -f intend-to-add-assume-unchanged" &&
 	git update-index --assume-unchanged intend-to-add-assume-unchanged &&
-	git ls-files | grep -v "^intend-to-add-assume-unchanged\$" >expected &&
+	git ls-files -z >ls-files-z &&
+	tr "\0" "\n" <ls-files-z >files &&
+	grep -v "^intend-to-add-assume-unchanged\$" files >expected &&
 	git grep -L "nonexistent_string" >actual &&
 	test_cmp expected actual
 '