Message ID | 087c92e3904dd774f672373727c300bf7f5f6369.1697317276.git.code@khaugsbakk.name (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | grep: die gracefully when outside repository | expand |
On Sat, Oct 14, 2023 at 11:02:38PM +0200, Kristoffer Haugsbakk wrote: > Die gracefully when `git grep --no-index` is run outside of a Git > repository and the path is outside the directory tree. > > If you are not in a Git repository and say: > > git grep --no-index search .. > > You trigger a `BUG`: > > BUG: environment.c:213: git environment hasn't been setup > Aborted (core dumped) > > Because `..` is a valid path which is treated as a pathspec. Then > `pathspec` figures out that it is not in the current directory tree. The > `BUG` is triggered when `pathspec` tries to advice the user about the path > to the (non-existing) repository. Is it even reasonable for "grep --no-index" to care about leaving the tree in the first place? That is, is there a reason we should not allow: git grep --no-index foo ../bar ? And if we do want to care, there is a weirdness here that even with your patch, we check to see if the file exists: $ git grep --no-index foo ../does-exist fatal: '../does-exist' is outside the directory tree $ git grep --no-index foo ../does-not-exist fatal: ../does-not-exist: no such path in the working tree. If we want to avoid leaving the current directory, then I think we need to be checking much sooner (but again, I would argue that it is not worth caring about in no-index mode). I do think your patch does not make anything worse (and indeed makes the error output much better). So I do not mind it in the meantime. But I have a feeling that we'd end up reverting it as part of the fix for the larger issue. -Peff
On Sun, Oct 15, 2023, at 05:26, Jeff King wrote: > Is it even reasonable for "grep --no-index" to care about leaving the > tree in the first place? That is, is there a reason we should not allow: > > git grep --no-index foo ../bar > > ? On second thought yeah, it doesn't make sense. We are outside of any repository already so what does it matter where the file is relative to the current working directory? It seems that `pathspec.c:init_pathspec_item` should let you through in this case.
Jeff King <peff@peff.net> writes: > Is it even reasonable for "grep --no-index" to care about leaving the > tree in the first place? That is, is there a reason we should not allow: > > git grep --no-index foo ../bar A huge difference between the bare "grep" and "git grep" is that we know the scope of the project tree, so it goes recursive by default. Should the above command line recursively go below ../bar? Would we allow "/" to be given? I actually do not think these "we are allowing Git tools to be used on random garbage" is a good idea to begin with X-<. If we invented something nice for our variant in "git grep" and wish we can use it outside the repository, contributing the feature to implementations of "grep" would have been the right way to move forward, instead of contaminating the codebase with things that are not related to Git. Whoever did 59332d13 (Resurrect "git grep --no-index", 2010-02-06) should be punished X-<. Anyway. 2e48fcdb (grep docs: document --no-index option, 2010-02-25) seems to have wanted to explicitly limit the search within the "current directory", and I am fine to keep the search space limited by the cwd. On the other hand, of course, the users can shoot themselves in the foot with "grep -r foo /", so letting them use "git grep" the same way is perhaps OK. Especially if it simplifies the code if we lift the limitation, that is a very tempting thing to do. > If we want to avoid leaving the current directory, then I think we need > to be checking much sooner (but again, I would argue that it is not > worth caring about in no-index mode).
Kristoffer Haugsbakk <code@khaugsbakk.name> writes: > diff --git a/pathspec.c b/pathspec.c > index 3a3a5724c44..e115832f17a 100644 > --- a/pathspec.c > +++ b/pathspec.c > @@ -468,6 +468,9 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags, > &prefixlen, copyfrom); > if (!match) { > const char *hint_path = get_git_work_tree(); > + if (!have_git_dir()) > + die(_("'%s' is outside the directory tree"), > + copyfrom); > if (!hint_path) > hint_path = get_git_dir(); > die(_("%s: '%s' is outside repository at '%s'"), elt, It is curious that the original has two sources of hint_path (i.e., get_git_dir() is used as a fallback for get_git_work_tree()). Are we certain that the check is at the right place? If we do not have a repository, then both would fail by returning NULL, so it should not matter if we add the new check before we check either or both, or even after we checked both before dying. I wonder if const char *hint_path = get_git_work_tree(); if (!hint_path) hint_path = get_git_dir(); if (hint_path) die(_("%s: '%s' is outside repository at '%s'"), elt, copyfrom, absolute_path(hint_path)); else die(_("%s: '%s' is outside the directory tree"), elt, copyfrom); makes the intent of the code clearer. We want to hint the location of the repository by computing hint_path, and if we can compute it, we use it in the error message, but otherwise we don't add hint. And we apply that conditional whether we have repository or not---what we care about is the NULL-ness of the hint string we computed. > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh > index 39d6d713ecb..b976f81a166 100755 > --- a/t/t7810-grep.sh > +++ b/t/t7810-grep.sh > @@ -1234,6 +1234,19 @@ test_expect_success 'outside of git repository with fallbackToNoIndex' ' > ) > ' > > +test_expect_success 'outside of git repository with pathspec outside the directory tree' ' > + test_when_finished rm -fr non && > + rm -fr non && > + mkdir -p non/git/sub && > + ( > + GIT_CEILING_DIRECTORIES="$(pwd)/non" && > + export GIT_CEILING_DIRECTORIES && > + cd non/git && > + test_expect_code 128 git grep --no-index search .. 2>error && > + grep "is outside the directory tree" error Excellent. This is a very good use of the GIT_CEILING_DIRECTORIES facility. > + ) > +' > + > test_expect_success 'inside git repository but with --no-index' ' > rm -fr is && > mkdir -p is/git/sub && > > base-commit: 43c8a30d150ecede9709c1f2527c8fba92c65f40
On Tue, Oct 17, 2023, at 18:42, Junio C Hamano wrote: > It is curious that the original has two sources of hint_path (i.e., > get_git_dir() is used as a fallback for get_git_work_tree()). Are > we certain that the check is at the right place? If we do not have > a repository, then both would fail by returning NULL, so it should > not matter if we add the new check before we check either or both, > or even after we checked both before dying. > > I wonder if > > const char *hint_path = get_git_work_tree(); > > if (!hint_path) > hint_path = get_git_dir(); > if (hint_path) > die(_("%s: '%s' is outside repository at '%s'"), > elt, copyfrom, absolute_path(hint_path)); > else > die(_("%s: '%s' is outside the directory tree"), > elt, copyfrom); > > makes the intent of the code clearer. That doesn't work since `get_git_dir()` triggers `BUG` instead of returning `NULL`. The `hint_path` declaration has to be at the start because of style rules. But we can initialize it after. I can also have a second look at the test since I am using `grep` to test the failure output and not the translation string variant. -- >8 -- Subject: [PATCH] fixup! grep: die gracefully when outside repository --- pathspec.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/pathspec.c b/pathspec.c index e115832f17a..0c1061fad11 100644 --- a/pathspec.c +++ b/pathspec.c @@ -467,10 +467,11 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags, match = prefix_path_gently(prefix, prefixlen, &prefixlen, copyfrom); if (!match) { - const char *hint_path = get_git_work_tree(); + const char *hint_path; if (!have_git_dir()) die(_("'%s' is outside the directory tree"), copyfrom); + hint_path = get_git_work_tree(); if (!hint_path) hint_path = get_git_dir(); die(_("%s: '%s' is outside repository at '%s'"), elt,
Kristoffer Haugsbakk <code@khaugsbakk.name> writes: > On Tue, Oct 17, 2023, at 18:42, Junio C Hamano wrote: >> It is curious that the original has two sources of hint_path (i.e., >> get_git_dir() is used as a fallback for get_git_work_tree()). Are >> we certain that the check is at the right place? If we do not have >> a repository, then both would fail by returning NULL, so it should >> not matter if we add the new check before we check either or both, >> or even after we checked both before dying. >> >> I wonder if >> >> const char *hint_path = get_git_work_tree(); >> >> if (!hint_path) >> hint_path = get_git_dir(); >> if (hint_path) >> die(_("%s: '%s' is outside repository at '%s'"), >> elt, copyfrom, absolute_path(hint_path)); >> else >> die(_("%s: '%s' is outside the directory tree"), >> elt, copyfrom); >> >> makes the intent of the code clearer. > > That doesn't work since `get_git_dir()` triggers `BUG` instead of > returning `NULL`. Ah, interesting. > The `hint_path` declaration has to be at the start because of style > rules. But we can initialize it after. Yes, what you have below (but please leave a blank line between the last line of decl and the first line of statement for readablility) looks very readable and sensible. > I can also have a second look at the test since I am using `grep` to > test the failure output and not the translation string variant. That is not necessary, as we no longer run under phoney i18n that required us to use test_i18ngrep. It is OK to assume that the tests are run under "C" locale. Thanks. > -- >8 -- > Subject: [PATCH] fixup! grep: die gracefully when outside repository > > --- > pathspec.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/pathspec.c b/pathspec.c > index e115832f17a..0c1061fad11 100644 > --- a/pathspec.c > +++ b/pathspec.c > @@ -467,10 +467,11 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags, > match = prefix_path_gently(prefix, prefixlen, > &prefixlen, copyfrom); > if (!match) { > - const char *hint_path = get_git_work_tree(); > + const char *hint_path; > if (!have_git_dir()) > die(_("'%s' is outside the directory tree"), > copyfrom); > + hint_path = get_git_work_tree(); > if (!hint_path) > hint_path = get_git_dir(); > die(_("%s: '%s' is outside repository at '%s'"), elt,
Kristoffer Haugsbakk <code@khaugsbakk.name> writes: > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh > index 39d6d713ecb..b976f81a166 100755 > --- a/t/t7810-grep.sh > +++ b/t/t7810-grep.sh > @@ -1234,6 +1234,19 @@ test_expect_success 'outside of git repository with fallbackToNoIndex' ' > ) > ' > > +test_expect_success 'outside of git repository with pathspec outside the directory tree' ' > + test_when_finished rm -fr non && > + rm -fr non && > + mkdir -p non/git/sub && > + ( > + GIT_CEILING_DIRECTORIES="$(pwd)/non" && > + export GIT_CEILING_DIRECTORIES && > + cd non/git && > + test_expect_code 128 git grep --no-index search .. 2>error && > + grep "is outside the directory tree" error > + ) > +' > + So you create non/git/sub, go to non/git (so there is sub/ directory), and try running "..". If you had a directory non/tig next to non/git and used ../tig instead of .. as the path given to "git grep", it would also correctly fail. Searching in a non-existing path, ../non, dies in a different error, with an error message that is not technically wrong, but it probably can be improved. It has been a while since I looked at the pathspec matching code, but if we are lucky, it might be just the matter of swapping the order of checking (in other words, check "is it outside" first and then "does it exist" next, or something like that)? t/t7810-grep.sh | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git c/t/t7810-grep.sh w/t/t7810-grep.sh index b976f81a16..84838c0fe1 100755 --- c/t/t7810-grep.sh +++ w/t/t7810-grep.sh @@ -1234,16 +1234,30 @@ test_expect_success 'outside of git repository with fallbackToNoIndex' ' ) ' -test_expect_success 'outside of git repository with pathspec outside the directory tree' ' +test_expect_success 'no repository with path outside $cwd' ' test_when_finished rm -fr non && rm -fr non && - mkdir -p non/git/sub && + mkdir -p non/git/sub non/tig && ( GIT_CEILING_DIRECTORIES="$(pwd)/non" && export GIT_CEILING_DIRECTORIES && cd non/git && test_expect_code 128 git grep --no-index search .. 2>error && grep "is outside the directory tree" error + ) && + ( + GIT_CEILING_DIRECTORIES="$(pwd)/non" && + export GIT_CEILING_DIRECTORIES && + cd non/git && + test_expect_code 128 git grep --no-index search ../tig 2>error && + grep "is outside the directory tree" error + ) && + ( + GIT_CEILING_DIRECTORIES="$(pwd)/non" && + export GIT_CEILING_DIRECTORIES && + cd non/git && + test_expect_code 128 git grep --no-index search ../non 2>error && + grep "no such path in the working tree" error ) '
diff --git a/pathspec.c b/pathspec.c index 3a3a5724c44..e115832f17a 100644 --- a/pathspec.c +++ b/pathspec.c @@ -468,6 +468,9 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags, &prefixlen, copyfrom); if (!match) { const char *hint_path = get_git_work_tree(); + if (!have_git_dir()) + die(_("'%s' is outside the directory tree"), + copyfrom); if (!hint_path) hint_path = get_git_dir(); die(_("%s: '%s' is outside repository at '%s'"), elt, diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh index 39d6d713ecb..b976f81a166 100755 --- a/t/t7810-grep.sh +++ b/t/t7810-grep.sh @@ -1234,6 +1234,19 @@ test_expect_success 'outside of git repository with fallbackToNoIndex' ' ) ' +test_expect_success 'outside of git repository with pathspec outside the directory tree' ' + test_when_finished rm -fr non && + rm -fr non && + mkdir -p non/git/sub && + ( + GIT_CEILING_DIRECTORIES="$(pwd)/non" && + export GIT_CEILING_DIRECTORIES && + cd non/git && + test_expect_code 128 git grep --no-index search .. 2>error && + grep "is outside the directory tree" error + ) +' + test_expect_success 'inside git repository but with --no-index' ' rm -fr is && mkdir -p is/git/sub &&
Die gracefully when `git grep --no-index` is run outside of a Git repository and the path is outside the directory tree. If you are not in a Git repository and say: git grep --no-index search .. You trigger a `BUG`: BUG: environment.c:213: git environment hasn't been setup Aborted (core dumped) Because `..` is a valid path which is treated as a pathspec. Then `pathspec` figures out that it is not in the current directory tree. The `BUG` is triggered when `pathspec` tries to advice the user about the path to the (non-existing) repository. Reported-by: ks1322 ks1322 <ks1322@gmail.com> Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> --- pathspec.c | 3 +++ t/t7810-grep.sh | 13 +++++++++++++ 2 files changed, 16 insertions(+) base-commit: 43c8a30d150ecede9709c1f2527c8fba92c65f40