Message ID | 5bd9dce8f611c5fe380c9f58dbdfa2dc6d2fd51f.1612813249.git.matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | Accepted |
Commit | e1fca7910453527765acf39529fa64a1960030e5 |
Headers | show |
Series | grep: error out if --untracked is used with --cached | expand |
On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote: > The options --untracked and --cached are not compatible, but if they are > used together, grep just silently ignores --cached and searches the > working tree. Error out, instead, to avoid any potential confusion. > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > --- > builtin/grep.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/builtin/grep.c b/builtin/grep.c > index ca259af441..392acf8cab 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > if (!use_index && (untracked || cached)) > die(_("--cached or --untracked cannot be used with --no-index")); > > + if (untracked && cached) > + die(_("--untracked cannot be used with --cached")); > + Are these really incompatible? --untracked says that untracked files are searched in addition to tracked ones in the working tree. --cached says that the index is searched instead of tracked files. From my reading, that seems to imply that the combination you're proposing getting rid of would mean: "search the index,and untracked files". That's a narrow use-case, but I couldn't think of a reason that it shouldn't work (but it's been a while since I've looked or thought much about the "git grep" code...). Assuming that they are incompatible, though, a few thoughts: Should this come before the "!use_index && (untracked || cached)" guard? right now passing all three options first says you can't combine --cached/--untracked with --no-index. Presumably the next invocation would come without --no-index, only to come back that the remaining options are incompatible. I dunno. I'm thinking out loud, but it feels like this guard should come before the one above it, not after. Should this appear in 'git-grep(1)' too? I guess not, since looking for "[iI]ncompatible" in that file turns up zero results. Thanks, Taylor
On Mon, Feb 8, 2021 at 6:48 PM Taylor Blau <me@ttaylorr.com> wrote: > > On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote: > > The options --untracked and --cached are not compatible, but if they are > > used together, grep just silently ignores --cached and searches the > > working tree. Error out, instead, to avoid any potential confusion. > > > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > --- > > builtin/grep.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/builtin/grep.c b/builtin/grep.c > > index ca259af441..392acf8cab 100644 > > --- a/builtin/grep.c > > +++ b/builtin/grep.c > > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > > if (!use_index && (untracked || cached)) > > die(_("--cached or --untracked cannot be used with --no-index")); > > > > + if (untracked && cached) > > + die(_("--untracked cannot be used with --cached")); > > + > > Are these really incompatible? --untracked says that untracked files are > searched in addition to tracked ones in the working tree. > --cached says that the index is searched instead of tracked files. From > my reading, that seems to imply that the combination you're proposing > getting rid of would mean: "search the index,and untracked files". Yeah, I agree that there is nothing conceptually wrong with the use case "search the index, and untracked files". The problem is that git-grep is currently not able to search both these places in the same execution. In fact, I don't think it can combine working tree and index searches in any way (besides with --assume-unchanged paths). When --cached is used with --untracked, git-grep silently ignores --cached and searches the working tree only: $ echo 'cached A' >A $ git add A $ git commit -m A $ echo 'modified A' >A $ echo 'untracked' >B $ git grep --cached --untracked . A:modified A B:untracked Perhaps, instead of erroring out with this currently invalid combination, should we make it valid by teaching git-grep how to search the two places on a single run? I.e. something like: diff --git a/builtin/grep.c b/builtin/grep.c index ca259af441..b0e99096ff 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -699,7 +699,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, } static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec, - int exc_std, int use_index) + int exc_std, int use_index, int untracked_only) { struct dir_struct dir; int i, hit = 0; @@ -712,7 +712,13 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec, fill_directory(&dir, opt->repo->index, pathspec); for (i = 0; i < dir.nr; i++) { - hit |= grep_file(opt, dir.entries[i]->name); + struct dir_entry *ent = dir.entries[i]; + + if (untracked_only && + !index_name_is_other(opt->repo->index, ent->name, ent->len)) + continue; + + hit |= grep_file(opt, ent->name); if (hit && opt->status_only) break; } @@ -1157,9 +1163,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix) if (!use_index && (untracked || cached)) die(_("--cached or --untracked cannot be used with --no-index")); - if (!use_index || untracked) { + if (cached && untracked) { + hit = grep_cache(&opt, &pathspec, 1); + hit |= grep_directory(&opt, &pathspec, !!opt_exclude, 1, 1); + + } else if (!use_index || untracked) { int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude; - hit = grep_directory(&opt, &pathspec, use_exclude, use_index); + hit |= grep_directory(&opt, &pathspec, use_exclude, use_index, 0); + } else if (0 <= opt_exclude) { die(_("--[no-]exclude-standard cannot be used for tracked contents")); } else if (!list.nr) { With this, the `git grep --cached --untracked .` search from the previous example would produce the following output: A:cached A B:untracked In this case, should we also add an 'untracked:' / 'cached:' prefix to the filenames, like we do for revision searches (e.g. 'HEAD:A:cached A') ?
On Mon, Feb 8, 2021 at 3:27 PM Matheus Tavares <matheus.bernardino@usp.br> wrote: > > > On Mon, Feb 8, 2021 at 6:48 PM Taylor Blau <me@ttaylorr.com> wrote: > > > > On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote: > > > The options --untracked and --cached are not compatible, but if they are > > > used together, grep just silently ignores --cached and searches the > > > working tree. Error out, instead, to avoid any potential confusion. > > > > > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > > --- > > > builtin/grep.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/builtin/grep.c b/builtin/grep.c > > > index ca259af441..392acf8cab 100644 > > > --- a/builtin/grep.c > > > +++ b/builtin/grep.c > > > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > > > if (!use_index && (untracked || cached)) > > > die(_("--cached or --untracked cannot be used with --no-index")); > > > > > > + if (untracked && cached) > > > + die(_("--untracked cannot be used with --cached")); > > > + > > > > Are these really incompatible? --untracked says that untracked files are > > searched in addition to tracked ones in the working tree. > > --cached says that the index is searched instead of tracked files. From > > my reading, that seems to imply that the combination you're proposing > > getting rid of would mean: "search the index,and untracked files". > > Yeah, I agree that there is nothing conceptually wrong with the use case > "search the index, and untracked files". The problem is that git-grep is > currently not able to search both these places in the same execution. > In fact, I don't think it can combine working tree and index searches > in any way (besides with --assume-unchanged paths). > > When --cached is used with --untracked, git-grep silently ignores > --cached and searches the working tree only: > > $ echo 'cached A' >A > $ git add A > $ git commit -m A > $ echo 'modified A' >A > $ echo 'untracked' >B > $ git grep --cached --untracked . > A:modified A > B:untracked > > Perhaps, instead of erroring out with this currently invalid > combination, should we make it valid by teaching git-grep how to search > the two places on a single run? I.e. something like: > > diff --git a/builtin/grep.c b/builtin/grep.c > index ca259af441..b0e99096ff 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -699,7 +699,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec, > } > > static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec, > - int exc_std, int use_index) > + int exc_std, int use_index, int untracked_only) > { > struct dir_struct dir; > int i, hit = 0; > @@ -712,7 +712,13 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec, > > fill_directory(&dir, opt->repo->index, pathspec); > for (i = 0; i < dir.nr; i++) { > - hit |= grep_file(opt, dir.entries[i]->name); > + struct dir_entry *ent = dir.entries[i]; > + > + if (untracked_only && > + !index_name_is_other(opt->repo->index, ent->name, ent->len)) > + continue; > + > + hit |= grep_file(opt, ent->name); > if (hit && opt->status_only) > break; > } > @@ -1157,9 +1163,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > if (!use_index && (untracked || cached)) > die(_("--cached or --untracked cannot be used with --no-index")); > > - if (!use_index || untracked) { > + if (cached && untracked) { > + hit = grep_cache(&opt, &pathspec, 1); > + hit |= grep_directory(&opt, &pathspec, !!opt_exclude, 1, 1); > + > + } else if (!use_index || untracked) { > int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude; > - hit = grep_directory(&opt, &pathspec, use_exclude, use_index); > + hit |= grep_directory(&opt, &pathspec, use_exclude, use_index, 0); > + > } else if (0 <= opt_exclude) { > die(_("--[no-]exclude-standard cannot be used for tracked contents")); > } else if (!list.nr) { > > With this, the `git grep --cached --untracked .` search from the > previous example would produce the following output: > > A:cached A > B:untracked > > In this case, should we also add an 'untracked:' / 'cached:' prefix to > the filenames, like we do for revision searches (e.g. 'HEAD:A:cached A') ? Ick, no. --untracked was a modifier on working tree searches. I think doing this adds a really weird inconsistency for all the other option combinations. I say mark --untracked and --cached as incompatible like you did in your original patch, unless you're willing to make grep generally be able to have options to search two or more locations from any of (revisions of history, the cache, the working tree), including being able to simultaneously search two different revisions of history. Even then, I'd be tempted to say that --untracked is only used in combination with a search of the working tree.
On Mon, Feb 8, 2021 at 11:47 AM Matheus Tavares <matheus.bernardino@usp.br> wrote: > > The options --untracked and --cached are not compatible, but if they are > used together, grep just silently ignores --cached and searches the > working tree. Error out, instead, to avoid any potential confusion. > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > --- > builtin/grep.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/builtin/grep.c b/builtin/grep.c > index ca259af441..392acf8cab 100644 > --- a/builtin/grep.c > +++ b/builtin/grep.c > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > if (!use_index && (untracked || cached)) > die(_("--cached or --untracked cannot be used with --no-index")); > > + if (untracked && cached) > + die(_("--untracked cannot be used with --cached")); > + > if (!use_index || untracked) { > int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude; > hit = grep_directory(&opt, &pathspec, use_exclude, use_index); > -- > 2.29.2 Reviewed-by: Elijah Newren <newren@gmail.com>
Elijah Newren <newren@gmail.com> writes: > ... Even then, I'd be tempted to say that > --untracked is only used in combination with a search of the working > tree. I tend to agree. Something like $ git grep -ne POISON maint master next seen -- t/test-lib.sh would be a useful thing, but I do not see how --untracked would usefully interact with any of these "tracked" contents, be the search from the index or tree.
On Tue, Feb 09, 2021 at 12:38:02PM -0800, Junio C Hamano wrote: > Elijah Newren <newren@gmail.com> writes: > > > ... Even then, I'd be tempted to say that > > --untracked is only used in combination with a search of the working > > tree. > > I tend to agree. Something like > > $ git grep -ne POISON maint master next seen -- t/test-lib.sh > > would be a useful thing, but I do not see how --untracked would > usefully interact with any of these "tracked" contents, be the > search from the index or tree. I agree with both you and Elijah. I don't mind banning the pair as unimplemented and "won't possibly work correctly together", since it leaves open the possibility that we might find time to make them compatible, at which point we could revert this patch. But what Matheus provided looks good to me. Thanks, Taylor
diff --git a/builtin/grep.c b/builtin/grep.c index ca259af441..392acf8cab 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) if (!use_index && (untracked || cached)) die(_("--cached or --untracked cannot be used with --no-index")); + if (untracked && cached) + die(_("--untracked cannot be used with --cached")); + if (!use_index || untracked) { int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude; hit = grep_directory(&opt, &pathspec, use_exclude, use_index);
The options --untracked and --cached are not compatible, but if they are used together, grep just silently ignores --cached and searches the working tree. Error out, instead, to avoid any potential confusion. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> --- builtin/grep.c | 3 +++ 1 file changed, 3 insertions(+)