Message ID | eaae7214925189f562056b1ee6972c05dcf76a32.1587103366.git.matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | grep: follow conventions for printing paths w/ unusual chars | expand |
Matheus Tavares <matheus.bernardino@usp.br> writes: > + if (opt->relative && opt->prefix_length) > + quote_path_relative(filename + tree_name_len, opt->prefix, out); > + else > + quote_c_style(filename + tree_name_len, out, NULL, 0); Yup. This solves the discrepancy reported correctly (i.e. both sides should do the quoting, the original only quoted when relative, and the new code corrects the other side). > + if (tree_name_len) > + strbuf_insert(out, 0, filename, tree_name_len); I am not quite sure about this part, though. Earlier we inserted the latter part of filename (after offset tree_name_len) to strbuf "out" after quoting, and then we are prefixing the earlier part of the filename without quoting to that same "out". Wouldn't a path ABCDEF (I do not literally mean that these ascii alphabets need quoting---just imagine each of these stands for a different letter and some causes the path to be quoted) with tree_name_len pointing somewhere in the middle be added as (an analog of) ABC"DEF", i.e. literal prefix with remainder quoted? I would (perhaps näively) expect that the whole thing would be placed inside a dq pair in such a case, even if the prefix part alone would not require quoting. Thanks.
On Fri, Apr 17, 2020 at 3:45 AM Junio C Hamano <gitster@pobox.com> wrote: > > Matheus Tavares <matheus.bernardino@usp.br> writes: > > > + if (opt->relative && opt->prefix_length) > > + quote_path_relative(filename + tree_name_len, opt->prefix, out); > > + else > > + quote_c_style(filename + tree_name_len, out, NULL, 0); > > Yup. This solves the discrepancy reported correctly (i.e. both > sides should do the quoting, the original only quoted when relative, > and the new code corrects the other side). > > > + if (tree_name_len) > > + strbuf_insert(out, 0, filename, tree_name_len); > > I am not quite sure about this part, though. > > Earlier we inserted the latter part of filename (after offset > tree_name_len) to strbuf "out" after quoting, and then we are > prefixing the earlier part of the filename without quoting to that > same "out". Wouldn't a path ABCDEF (I do not literally mean that > these ascii alphabets need quoting---just imagine each of these > stands for a different letter and some causes the path to be quoted) > with tree_name_len pointing somewhere in the middle be added as (an > analog of) ABC"DEF", i.e. literal prefix with remainder quoted? Right. But the ABC prefix here is always a tree name, so the output is one of the following: <sha1>:"unusual path":line <ref name>:"unusual path":line "unusual path":line (Always having the entire "unusual path" inside the double quotes, though) > I would (perhaps näively) expect that the whole thing would be placed > inside a dq pair in such a case, even if the prefix part alone would > not require quoting. Hm, it might be just me, but I think that including the tree name and path in the same dq pair could be a bit confusing, as they can seem like one unique thing. There is also the case of ref names containing unusual chars, as dq's. Such ref names are not currently escaped in the output of other commands as checkout, branch or describe. But if we include the ref name in grep's output quote, it would be escaped, which would be a minor inconsistency (I don't know if that's relevant, though).
Matheus Tavares Bernardino <matheus.bernardino@usp.br> writes:
> Right. But the ABC prefix here is always a tree name,...
Oh, then that is perfectly fine. I did not expect that the code
passes <tree-name>:<pathnname> (like "HEAD~2:t/Makefile") to us as a
single string.
Thanks.
Hi Matheus, On Fri, 17 Apr 2020, Matheus Tavares wrote: > grep does not follow the conventions used by other Git commands when > printing paths that contain unusual characters (as double-quotes or > newlines). Commands such as ls-files, commit, status and diff will: > > - Quote and escape unusual pathnames, by default. > - Print names verbatim and unquoted when "-z" is used. > > But grep *never* quotes/escapes absolute paths with unusual chars and > *always* quotes/escapes relative ones, even with "-z". Besides being > inconsistent in its own output, the deviation from other Git commands > can be confusing. So let's make it follow the two rules above and add > some tests for this new behavior. Note that, making grep quote/escape > all unusual paths by default, also make it fully compliant with the > core.quotePath configuration, which is currently ignored for absolute > paths. > > Reported-by: Greg Hurrell <greg@hurrell.net> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > --- > Documentation/git-grep.txt | 6 +++-- > builtin/grep.c | 46 ++++++++++++++++++++++++++++---------- > t/t7810-grep.sh | 44 ++++++++++++++++++++++++++++++++++++ > 3 files changed, 82 insertions(+), 14 deletions(-) Unfortunately, this causes eight test failures on Windows: https://dev.azure.com/gitgitgadget/git/_build/results?buildId=38023&view=ms.vss-test-web.build-test-results-tab Could you please have a look? I suspect that this has something to do with those new tests needing the `FUNNYNAMES` prereq. Ciao, Dscho > > diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt > index ddb6acc025..3109ce8fbe 100644 > --- a/Documentation/git-grep.txt > +++ b/Documentation/git-grep.txt > @@ -206,8 +206,10 @@ providing this option will cause it to die. > > -z:: > --null:: > - Output \0 instead of the character that normally follows a > - file name. > + Use \0 as the delimiter for pathnames in the output, and print > + them verbatim. Without this option, pathnames with "unusual" > + characters are quoted as explained for the configuration > + variable core.quotePath (see git-config(1)). > > -o:: > --only-matching:: > diff --git a/builtin/grep.c b/builtin/grep.c > index 99e2685090..bdf1a4bbc9 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -295,6 +295,38 @@ static int grep_cmd_config(const char *var, const char *value, void *cb) > return st; > } > > +static void grep_source_name(struct grep_opt *opt, const char *filename, > + int tree_name_len, struct strbuf *out) > +{ > + strbuf_reset(out); > + > + if (opt->null_following_name) { > + if (opt->relative && opt->prefix_length) { > + struct strbuf rel_buf = STRBUF_INIT; > + const char *rel_name = > + relative_path(filename + tree_name_len, > + opt->prefix, &rel_buf); > + > + if (tree_name_len) > + strbuf_add(out, filename, tree_name_len); > + > + strbuf_addstr(out, rel_name); > + strbuf_release(&rel_buf); > + } else { > + strbuf_addstr(out, filename); > + } > + return; > + } > + > + if (opt->relative && opt->prefix_length) > + quote_path_relative(filename + tree_name_len, opt->prefix, out); > + else > + quote_c_style(filename + tree_name_len, out, NULL, 0); > + > + if (tree_name_len) > + strbuf_insert(out, 0, filename, tree_name_len); > +} > + > static int grep_oid(struct grep_opt *opt, const struct object_id *oid, > const char *filename, int tree_name_len, > const char *path) > @@ -302,13 +334,7 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid, > struct strbuf pathbuf = STRBUF_INIT; > struct grep_source gs; > > - if (opt->relative && opt->prefix_length) { > - quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf); > - strbuf_insert(&pathbuf, 0, filename, tree_name_len); > - } else { > - strbuf_addstr(&pathbuf, filename); > - } > - > + grep_source_name(opt, filename, tree_name_len, &pathbuf); > grep_source_init(&gs, GREP_SOURCE_OID, pathbuf.buf, path, oid); > strbuf_release(&pathbuf); > > @@ -334,11 +360,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) > struct strbuf buf = STRBUF_INIT; > struct grep_source gs; > > - if (opt->relative && opt->prefix_length) > - quote_path_relative(filename, opt->prefix, &buf); > - else > - strbuf_addstr(&buf, filename); > - > + grep_source_name(opt, filename, 0, &buf); > grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); > strbuf_release(&buf); > > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh > index 7d7b396c23..ab495dba28 100755 > --- a/t/t7810-grep.sh > +++ b/t/t7810-grep.sh > @@ -72,6 +72,8 @@ test_expect_success setup ' > # Still a no-op. > function dummy() {} > EOF > + echo unusual >"\"unusual\" pathname" && > + echo unusual >"t/nested \"unusual\" pathname" && > git add . && > test_tick && > git commit -m initial > @@ -481,6 +483,48 @@ do > git grep --count -h -e b $H -- ab >actual && > test_cmp expected actual > ' > + > + test_expect_success "grep $L should quote unusual pathnames" ' > + cat >expected <<-EOF && > + ${HC}"\"unusual\" pathname":unusual > + ${HC}"t/nested \"unusual\" pathname":unusual > + EOF > + git grep unusual $H >actual && > + test_cmp expected actual > + ' > + > + test_expect_success "grep $L in subdir should quote unusual relative pathnames" ' > + cat >expected <<-EOF && > + ${HC}"nested \"unusual\" pathname":unusual > + EOF > + ( > + cd t && > + git grep unusual $H > + ) >actual && > + test_cmp expected actual > + ' > + > + test_expect_success "grep -z $L with unusual pathnames" ' > + cat >expected <<-EOF && > + ${HC}"unusual" pathname:unusual > + ${HC}t/nested "unusual" pathname:unusual > + EOF > + git grep -z unusual $H >actual && > + tr "\0" ":" <actual >actual-replace-null && > + test_cmp expected actual-replace-null > + ' > + > + test_expect_success "grep -z $L in subdir with unusual relative pathnames" ' > + cat >expected <<-EOF && > + ${HC}nested "unusual" pathname:unusual > + EOF > + ( > + cd t && > + git grep -z unusual $H > + ) >actual && > + tr "\0" ":" <actual >actual-replace-null && > + test_cmp expected actual-replace-null > + ' > done > > cat >expected <<EOF > -- > 2.26.0 > > >
Hi Matheus, On Sat, 18 Apr 2020, Johannes Schindelin wrote: > On Fri, 17 Apr 2020, Matheus Tavares wrote: > > > grep does not follow the conventions used by other Git commands when > > printing paths that contain unusual characters (as double-quotes or > > newlines). Commands such as ls-files, commit, status and diff will: > > > > - Quote and escape unusual pathnames, by default. > > - Print names verbatim and unquoted when "-z" is used. > > > > But grep *never* quotes/escapes absolute paths with unusual chars and > > *always* quotes/escapes relative ones, even with "-z". Besides being > > inconsistent in its own output, the deviation from other Git commands > > can be confusing. So let's make it follow the two rules above and add > > some tests for this new behavior. Note that, making grep quote/escape > > all unusual paths by default, also make it fully compliant with the > > core.quotePath configuration, which is currently ignored for absolute > > paths. > > > > Reported-by: Greg Hurrell <greg@hurrell.net> > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > --- > > Documentation/git-grep.txt | 6 +++-- > > builtin/grep.c | 46 ++++++++++++++++++++++++++++---------- > > t/t7810-grep.sh | 44 ++++++++++++++++++++++++++++++++++++ > > 3 files changed, 82 insertions(+), 14 deletions(-) > > Unfortunately, this causes eight test failures on Windows: > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=38023&view=ms.vss-test-web.build-test-results-tab > > Could you please have a look? I suspect that this has something to do with > those new tests needing the `FUNNYNAMES` prereq. I need this commit to fix it: https://github.com/git-for-windows/git/commit/7ca815e1ab89d6ffdb1a17b3cbacdf22a508d33c I'll paste it here, for your convenience: -- snipsnap -- From 7ca815e1ab89d6ffdb1a17b3cbacdf22a508d33c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Sat, 18 Apr 2020 16:49:43 +0200 Subject: [PATCH] fixup??? grep: follow conventions for printing paths w/ unusual chars It is easy to be fooled by the Bash included in Git for Windows, which leads you to believe that quotes are valid parts of file names. On Windows, they are not. But Cygwin (which is the base of MSYS2, which is the POSIX emulation layer used by that Bash) only _pretends_ that it is a valid file name character. In reality, it will map the character into the private Unicode page. Cygwin knows about this. The rest of Windows applications (including Git for Windows), however, does not. As a consequence, `>\"with\ quotes\"` will claim to succeed, but the file on disk will have Unicode characters in place of those quotes that literally no application but Cygwin ones can handle, and this leads to those beautiful new tests to fail. Let's just use the prereq we introduced to guard precisely against this problem: `FUNNYNAMES`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t7810-grep.sh | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index ab495dba28a7..991d5bd9c03f 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -72,8 +72,11 @@ test_expect_success setup ' # Still a no-op. function dummy() {} EOF - echo unusual >"\"unusual\" pathname" && - echo unusual >"t/nested \"unusual\" pathname" && + if test_have_prereq FUNNYNAMES + then + echo unusual >"\"unusual\" pathname" && + echo unusual >"t/nested \"unusual\" pathname" + fi && git add . && test_tick && git commit -m initial @@ -484,7 +487,7 @@ do test_cmp expected actual ' - test_expect_success "grep $L should quote unusual pathnames" ' + test_expect_success FUNNYNAMES "grep $L should quote unusual pathnames" ' cat >expected <<-EOF && ${HC}"\"unusual\" pathname":unusual ${HC}"t/nested \"unusual\" pathname":unusual @@ -493,7 +496,7 @@ do test_cmp expected actual ' - test_expect_success "grep $L in subdir should quote unusual relative pathnames" ' + test_expect_success FUNNYNAMES "grep $L in subdir should quote unusual relative pathnames" ' cat >expected <<-EOF && ${HC}"nested \"unusual\" pathname":unusual EOF @@ -504,7 +507,7 @@ do test_cmp expected actual ' - test_expect_success "grep -z $L with unusual pathnames" ' + test_expect_success FUNNYNAMES "grep -z $L with unusual pathnames" ' cat >expected <<-EOF && ${HC}"unusual" pathname:unusual ${HC}t/nested "unusual" pathname:unusual @@ -514,7 +517,7 @@ do test_cmp expected actual-replace-null ' - test_expect_success "grep -z $L in subdir with unusual relative pathnames" ' + test_expect_success FUNNYNAMES "grep -z $L in subdir with unusual relative pathnames" ' cat >expected <<-EOF && ${HC}nested "unusual" pathname:unusual EOF -- 2.26.1.windows.1
Hi, Dscho On Sat, Apr 18, 2020 at 11:57 AM Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > On Sat, 18 Apr 2020, Johannes Schindelin wrote: > > > > On Fri, 17 Apr 2020, Matheus Tavares wrote: ... > > > > > > Reported-by: Greg Hurrell <greg@hurrell.net> > > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > > --- > > > Documentation/git-grep.txt | 6 +++-- > > > builtin/grep.c | 46 ++++++++++++++++++++++++++++---------- > > > t/t7810-grep.sh | 44 ++++++++++++++++++++++++++++++++++++ > > > 3 files changed, 82 insertions(+), 14 deletions(-) > > > > Unfortunately, this causes eight test failures on Windows: > > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=38023&view=ms.vss-test-web.build-test-results-tab > > > > Could you please have a look? I suspect that this has something to do with > > those new tests needing the `FUNNYNAMES` prereq. > > I need this commit to fix it: > https://github.com/git-for-windows/git/commit/7ca815e1ab89d6ffdb1a17b3cbacdf22a508d33c > > I'll paste it here, for your convenience: > > -- snipsnap -- Thanks for reporting and correcting the testing issue. I'll send a v2 with your patch squashed in.
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt index ddb6acc025..3109ce8fbe 100644 --- a/Documentation/git-grep.txt +++ b/Documentation/git-grep.txt @@ -206,8 +206,10 @@ providing this option will cause it to die. -z:: --null:: - Output \0 instead of the character that normally follows a - file name. + Use \0 as the delimiter for pathnames in the output, and print + them verbatim. Without this option, pathnames with "unusual" + characters are quoted as explained for the configuration + variable core.quotePath (see git-config(1)). -o:: --only-matching:: diff --git a/builtin/grep.c b/builtin/grep.c index 99e2685090..bdf1a4bbc9 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -295,6 +295,38 @@ static int grep_cmd_config(const char *var, const char *value, void *cb) return st; } +static void grep_source_name(struct grep_opt *opt, const char *filename, + int tree_name_len, struct strbuf *out) +{ + strbuf_reset(out); + + if (opt->null_following_name) { + if (opt->relative && opt->prefix_length) { + struct strbuf rel_buf = STRBUF_INIT; + const char *rel_name = + relative_path(filename + tree_name_len, + opt->prefix, &rel_buf); + + if (tree_name_len) + strbuf_add(out, filename, tree_name_len); + + strbuf_addstr(out, rel_name); + strbuf_release(&rel_buf); + } else { + strbuf_addstr(out, filename); + } + return; + } + + if (opt->relative && opt->prefix_length) + quote_path_relative(filename + tree_name_len, opt->prefix, out); + else + quote_c_style(filename + tree_name_len, out, NULL, 0); + + if (tree_name_len) + strbuf_insert(out, 0, filename, tree_name_len); +} + static int grep_oid(struct grep_opt *opt, const struct object_id *oid, const char *filename, int tree_name_len, const char *path) @@ -302,13 +334,7 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid, struct strbuf pathbuf = STRBUF_INIT; struct grep_source gs; - if (opt->relative && opt->prefix_length) { - quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf); - strbuf_insert(&pathbuf, 0, filename, tree_name_len); - } else { - strbuf_addstr(&pathbuf, filename); - } - + grep_source_name(opt, filename, tree_name_len, &pathbuf); grep_source_init(&gs, GREP_SOURCE_OID, pathbuf.buf, path, oid); strbuf_release(&pathbuf); @@ -334,11 +360,7 @@ static int grep_file(struct grep_opt *opt, const char *filename) struct strbuf buf = STRBUF_INIT; struct grep_source gs; - if (opt->relative && opt->prefix_length) - quote_path_relative(filename, opt->prefix, &buf); - else - strbuf_addstr(&buf, filename); - + grep_source_name(opt, filename, 0, &buf); grep_source_init(&gs, GREP_SOURCE_FILE, buf.buf, filename, filename); strbuf_release(&buf); diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 7d7b396c23..ab495dba28 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -72,6 +72,8 @@ test_expect_success setup ' # Still a no-op. function dummy() {} EOF + echo unusual >"\"unusual\" pathname" && + echo unusual >"t/nested \"unusual\" pathname" && git add . && test_tick && git commit -m initial @@ -481,6 +483,48 @@ do git grep --count -h -e b $H -- ab >actual && test_cmp expected actual ' + + test_expect_success "grep $L should quote unusual pathnames" ' + cat >expected <<-EOF && + ${HC}"\"unusual\" pathname":unusual + ${HC}"t/nested \"unusual\" pathname":unusual + EOF + git grep unusual $H >actual && + test_cmp expected actual + ' + + test_expect_success "grep $L in subdir should quote unusual relative pathnames" ' + cat >expected <<-EOF && + ${HC}"nested \"unusual\" pathname":unusual + EOF + ( + cd t && + git grep unusual $H + ) >actual && + test_cmp expected actual + ' + + test_expect_success "grep -z $L with unusual pathnames" ' + cat >expected <<-EOF && + ${HC}"unusual" pathname:unusual + ${HC}t/nested "unusual" pathname:unusual + EOF + git grep -z unusual $H >actual && + tr "\0" ":" <actual >actual-replace-null && + test_cmp expected actual-replace-null + ' + + test_expect_success "grep -z $L in subdir with unusual relative pathnames" ' + cat >expected <<-EOF && + ${HC}nested "unusual" pathname:unusual + EOF + ( + cd t && + git grep -z unusual $H + ) >actual && + tr "\0" ":" <actual >actual-replace-null && + test_cmp expected actual-replace-null + ' done cat >expected <<EOF
grep does not follow the conventions used by other Git commands when printing paths that contain unusual characters (as double-quotes or newlines). Commands such as ls-files, commit, status and diff will: - Quote and escape unusual pathnames, by default. - Print names verbatim and unquoted when "-z" is used. But grep *never* quotes/escapes absolute paths with unusual chars and *always* quotes/escapes relative ones, even with "-z". Besides being inconsistent in its own output, the deviation from other Git commands can be confusing. So let's make it follow the two rules above and add some tests for this new behavior. Note that, making grep quote/escape all unusual paths by default, also make it fully compliant with the core.quotePath configuration, which is currently ignored for absolute paths. Reported-by: Greg Hurrell <greg@hurrell.net> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> --- Documentation/git-grep.txt | 6 +++-- builtin/grep.c | 46 ++++++++++++++++++++++++++++---------- t/t7810-grep.sh | 44 ++++++++++++++++++++++++++++++++++++ 3 files changed, 82 insertions(+), 14 deletions(-)