diff mbox series

grep: error out if --untracked is used with --cached

Message ID 5bd9dce8f611c5fe380c9f58dbdfa2dc6d2fd51f.1612813249.git.matheus.bernardino@usp.br (mailing list archive)
State Accepted
Commit e1fca7910453527765acf39529fa64a1960030e5
Headers show
Series grep: error out if --untracked is used with --cached | expand

Commit Message

Matheus Tavares Feb. 8, 2021, 7:43 p.m. UTC
The options --untracked and --cached are not compatible, but if they are
used together, grep just silently ignores --cached and searches the
working tree. Error out, instead, to avoid any potential confusion.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Taylor Blau Feb. 8, 2021, 9:48 p.m. UTC | #1
On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote:
> The options --untracked and --cached are not compatible, but if they are
> used together, grep just silently ignores --cached and searches the
> working tree. Error out, instead, to avoid any potential confusion.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  builtin/grep.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index ca259af441..392acf8cab 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  	if (!use_index && (untracked || cached))
>  		die(_("--cached or --untracked cannot be used with --no-index"));
>
> +	if (untracked && cached)
> +		die(_("--untracked cannot be used with --cached"));
> +

Are these really incompatible? --untracked says that untracked files are
searched in addition to tracked ones in the working tree.
--cached says that the index is searched instead of tracked files. From
my reading, that seems to imply that the combination you're proposing
getting rid of would mean: "search the index,and untracked files".

That's a narrow use-case, but I couldn't think of a reason that it
shouldn't work (but it's been a while since I've looked or thought much
about the "git grep" code...).

Assuming that they are incompatible, though, a few thoughts:

Should this come before the "!use_index && (untracked || cached)" guard?
right now passing all three options first says you can't combine
--cached/--untracked with --no-index. Presumably the next invocation
would come without --no-index, only to come back that the remaining
options are incompatible.

I dunno. I'm thinking out loud, but it feels like this guard should come
before the one above it, not after.

Should this appear in 'git-grep(1)' too? I guess not, since looking for
"[iI]ncompatible" in that file turns up zero results.

Thanks,
Taylor
Matheus Tavares Feb. 8, 2021, 11:21 p.m. UTC | #2
On Mon, Feb 8, 2021 at 6:48 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote:
> > The options --untracked and --cached are not compatible, but if they are
> > used together, grep just silently ignores --cached and searches the
> > working tree. Error out, instead, to avoid any potential confusion.
> >
> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> > ---
> >  builtin/grep.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index ca259af441..392acf8cab 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >       if (!use_index && (untracked || cached))
> >               die(_("--cached or --untracked cannot be used with --no-index"));
> >
> > +     if (untracked && cached)
> > +             die(_("--untracked cannot be used with --cached"));
> > +
>
> Are these really incompatible? --untracked says that untracked files are
> searched in addition to tracked ones in the working tree.
> --cached says that the index is searched instead of tracked files. From
> my reading, that seems to imply that the combination you're proposing
> getting rid of would mean: "search the index,and untracked files".

Yeah, I agree that there is nothing conceptually wrong with the use case
"search the index, and untracked files". The problem is that git-grep is
currently not able to search both these places in the same execution.
In fact, I don't think it can combine working tree and index searches
in any way (besides with --assume-unchanged paths).

When --cached is used with --untracked, git-grep silently ignores
--cached and searches the working tree only:

$ echo 'cached A' >A
$ git add A
$ git commit -m A
$ echo 'modified A' >A
$ echo 'untracked' >B
$ git grep --cached --untracked .
A:modified A
B:untracked

Perhaps, instead of erroring out with this currently invalid
combination, should we make it valid by teaching git-grep how to search
the two places on a single run? I.e. something like:

diff --git a/builtin/grep.c b/builtin/grep.c
index ca259af441..b0e99096ff 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -699,7 +699,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 }

 static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec,
-			  int exc_std, int use_index)
+			  int exc_std, int use_index, int untracked_only)
 {
 	struct dir_struct dir;
 	int i, hit = 0;
@@ -712,7 +712,13 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec,

 	fill_directory(&dir, opt->repo->index, pathspec);
 	for (i = 0; i < dir.nr; i++) {
-		hit |= grep_file(opt, dir.entries[i]->name);
+		struct dir_entry *ent = dir.entries[i];
+
+		if (untracked_only &&
+		    !index_name_is_other(opt->repo->index, ent->name, ent->len))
+			continue;
+
+		hit |= grep_file(opt, ent->name);
 		if (hit && opt->status_only)
 			break;
 	}
@@ -1157,9 +1163,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	if (!use_index && (untracked || cached))
 		die(_("--cached or --untracked cannot be used with --no-index"));

-	if (!use_index || untracked) {
+	if (cached && untracked) {
+		hit = grep_cache(&opt, &pathspec, 1);
+		hit |= grep_directory(&opt, &pathspec, !!opt_exclude, 1, 1);
+
+	} else if (!use_index || untracked) {
 		int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
-		hit = grep_directory(&opt, &pathspec, use_exclude, use_index);
+		hit |= grep_directory(&opt, &pathspec, use_exclude, use_index, 0);
+
 	} else if (0 <= opt_exclude) {
 		die(_("--[no-]exclude-standard cannot be used for tracked contents"));
 	} else if (!list.nr) {

With this, the `git grep --cached --untracked .` search from the
previous example would produce the following output:

A:cached A
B:untracked

In this case, should we also add an 'untracked:' / 'cached:' prefix to
the filenames, like we do for revision searches (e.g. 'HEAD:A:cached A') ?
Elijah Newren Feb. 9, 2021, 10:06 a.m. UTC | #3
On Mon, Feb 8, 2021 at 3:27 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
>
> On Mon, Feb 8, 2021 at 6:48 PM Taylor Blau <me@ttaylorr.com> wrote:
> >
> > On Mon, Feb 08, 2021 at 04:43:28PM -0300, Matheus Tavares wrote:
> > > The options --untracked and --cached are not compatible, but if they are
> > > used together, grep just silently ignores --cached and searches the
> > > working tree. Error out, instead, to avoid any potential confusion.
> > >
> > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> > > ---
> > >  builtin/grep.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/builtin/grep.c b/builtin/grep.c
> > > index ca259af441..392acf8cab 100644
> > > --- a/builtin/grep.c
> > > +++ b/builtin/grep.c
> > > @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> > >       if (!use_index && (untracked || cached))
> > >               die(_("--cached or --untracked cannot be used with --no-index"));
> > >
> > > +     if (untracked && cached)
> > > +             die(_("--untracked cannot be used with --cached"));
> > > +
> >
> > Are these really incompatible? --untracked says that untracked files are
> > searched in addition to tracked ones in the working tree.
> > --cached says that the index is searched instead of tracked files. From
> > my reading, that seems to imply that the combination you're proposing
> > getting rid of would mean: "search the index,and untracked files".
>
> Yeah, I agree that there is nothing conceptually wrong with the use case
> "search the index, and untracked files". The problem is that git-grep is
> currently not able to search both these places in the same execution.
> In fact, I don't think it can combine working tree and index searches
> in any way (besides with --assume-unchanged paths).
>
> When --cached is used with --untracked, git-grep silently ignores
> --cached and searches the working tree only:
>
> $ echo 'cached A' >A
> $ git add A
> $ git commit -m A
> $ echo 'modified A' >A
> $ echo 'untracked' >B
> $ git grep --cached --untracked .
> A:modified A
> B:untracked
>
> Perhaps, instead of erroring out with this currently invalid
> combination, should we make it valid by teaching git-grep how to search
> the two places on a single run? I.e. something like:
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index ca259af441..b0e99096ff 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -699,7 +699,7 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
>  }
>
>  static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec,
> -                         int exc_std, int use_index)
> +                         int exc_std, int use_index, int untracked_only)
>  {
>         struct dir_struct dir;
>         int i, hit = 0;
> @@ -712,7 +712,13 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec,
>
>         fill_directory(&dir, opt->repo->index, pathspec);
>         for (i = 0; i < dir.nr; i++) {
> -               hit |= grep_file(opt, dir.entries[i]->name);
> +               struct dir_entry *ent = dir.entries[i];
> +
> +               if (untracked_only &&
> +                   !index_name_is_other(opt->repo->index, ent->name, ent->len))
> +                       continue;
> +
> +               hit |= grep_file(opt, ent->name);
>                 if (hit && opt->status_only)
>                         break;
>         }
> @@ -1157,9 +1163,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         if (!use_index && (untracked || cached))
>                 die(_("--cached or --untracked cannot be used with --no-index"));
>
> -       if (!use_index || untracked) {
> +       if (cached && untracked) {
> +               hit = grep_cache(&opt, &pathspec, 1);
> +               hit |= grep_directory(&opt, &pathspec, !!opt_exclude, 1, 1);
> +
> +       } else if (!use_index || untracked) {
>                 int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
> -               hit = grep_directory(&opt, &pathspec, use_exclude, use_index);
> +               hit |= grep_directory(&opt, &pathspec, use_exclude, use_index, 0);
> +
>         } else if (0 <= opt_exclude) {
>                 die(_("--[no-]exclude-standard cannot be used for tracked contents"));
>         } else if (!list.nr) {
>
> With this, the `git grep --cached --untracked .` search from the
> previous example would produce the following output:
>
> A:cached A
> B:untracked
>
> In this case, should we also add an 'untracked:' / 'cached:' prefix to
> the filenames, like we do for revision searches (e.g. 'HEAD:A:cached A') ?

Ick, no.  --untracked was a modifier on working tree searches.  I
think doing this adds a really weird inconsistency for all the other
option combinations.  I say mark --untracked and --cached as
incompatible like you did in your original patch, unless you're
willing to make grep generally be able to have options to search two
or more locations from any of (revisions of history, the cache, the
working tree), including being able to simultaneously search two
different revisions of history.  Even then, I'd be tempted to say that
--untracked is only used in combination with a search of the working
tree.
Elijah Newren Feb. 9, 2021, 10:07 a.m. UTC | #4
On Mon, Feb 8, 2021 at 11:47 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> The options --untracked and --cached are not compatible, but if they are
> used together, grep just silently ignores --cached and searches the
> working tree. Error out, instead, to avoid any potential confusion.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  builtin/grep.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index ca259af441..392acf8cab 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -1157,6 +1157,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         if (!use_index && (untracked || cached))
>                 die(_("--cached or --untracked cannot be used with --no-index"));
>
> +       if (untracked && cached)
> +               die(_("--untracked cannot be used with --cached"));
> +
>         if (!use_index || untracked) {
>                 int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
>                 hit = grep_directory(&opt, &pathspec, use_exclude, use_index);
> --
> 2.29.2

Reviewed-by: Elijah Newren <newren@gmail.com>
Junio C Hamano Feb. 9, 2021, 8:38 p.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

> ...  Even then, I'd be tempted to say that
> --untracked is only used in combination with a search of the working
> tree.

I tend to agree.  Something like

  $ git grep -ne POISON maint master next seen -- t/test-lib.sh

would be a useful thing, but I do not see how --untracked would
usefully interact with any of these "tracked" contents, be the
search from the index or tree.
Taylor Blau Feb. 10, 2021, 2:43 a.m. UTC | #6
On Tue, Feb 09, 2021 at 12:38:02PM -0800, Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
>
> > ...  Even then, I'd be tempted to say that
> > --untracked is only used in combination with a search of the working
> > tree.
>
> I tend to agree.  Something like
>
>   $ git grep -ne POISON maint master next seen -- t/test-lib.sh
>
> would be a useful thing, but I do not see how --untracked would
> usefully interact with any of these "tracked" contents, be the
> search from the index or tree.

I agree with both you and Elijah. I don't mind banning the pair as
unimplemented and "won't possibly work correctly together", since it
leaves open the possibility that we might find time to make them
compatible, at which point we could revert this patch.

But what Matheus provided looks good to me.


Thanks,
Taylor
diff mbox series

Patch

diff --git a/builtin/grep.c b/builtin/grep.c
index ca259af441..392acf8cab 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1157,6 +1157,9 @@  int cmd_grep(int argc, const char **argv, const char *prefix)
 	if (!use_index && (untracked || cached))
 		die(_("--cached or --untracked cannot be used with --no-index"));
 
+	if (untracked && cached)
+		die(_("--untracked cannot be used with --cached"));
+
 	if (!use_index || untracked) {
 		int use_exclude = (opt_exclude < 0) ? use_index : !!opt_exclude;
 		hit = grep_directory(&opt, &pathspec, use_exclude, use_index);