diff mbox series

[4/4] completion: avoid user confusion in non-cone mode

Message ID fe8669a3f4f05c186e497f870c7e7ba9a94ac63f.1700761448.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Sparse checkout completion fixes | expand

Commit Message

Elijah Newren Nov. 23, 2023, 5:44 p.m. UTC
From: Elijah Newren <newren@gmail.com>

It is tempting to think of "files and directories" of the current
directory as valid inputs to the add and set subcommands of git
sparse-checkout.  However, in non-cone mode, they often aren't and using
them as potential completions leads to *many* forms of confusion:

Issue #1. It provides the *wrong* files and directories.

For
    git sparse-checkout add
we always want to add files and directories not currently in our sparse
checkout, which means we want file and directories not currently present
in the current working tree.  Providing the files and directories
currently present is thus always wrong.

For
    git sparse-checkout set
we have a similar problem except in the subset of cases where we are
trying to narrow our checkout to a strict subset of what we already
have.  That is not a very common scenario, especially since it often
does not even happen to be true for the first use of the command; for
years we required users to create a sparse-checkout via
    git sparse-checkout init
    git sparse-checkout set <args...>
(or use a clone option that did the init step for you at clone time).
The init command creates a minimal sparse-checkout with just the
top-level directory present, meaning the set command has to be used to
expand the checkout.  Thus, only in a special and perhaps unusual cases
would any of the suggestions from normal file and directory completion
be appropriate.

Issue #2: Suggesting patterns that lead to warnings is unfriendly.

If the user specifies any regular file and omits the leading '/', then
the sparse-checkout command will warn the user that their command is
problematic and suggest they use a leading slash instead.

Issue #3: Completion gets confused by leading '/', and provides wrong paths.

Users often want to anchor their patterns to the toplevel of the
repository, especially when listing individual files.  There are a
number of reasons for this, but notably even sparse-checkout encourages
them to do so (as noted above).  However, if users do so (via adding a
leading '/' to their pattern), then bash completion will interpret the
leading slash not as a request for a path at the toplevel of the
repository, but as a request for a path at the root of the filesytem.
That means at best that completion cannot help with such paths, and if
it does find any completions, they are almost guaranteed to be wrong.

Issue #4: Suggesting invalid patterns from subdirectories is unfriendly.

There is no per-directory equivalent to .gitignore with
sparse-checkouts.  There is only a single worktree-global
$GIT_DIR/info/sparse-checkout file.  As such, paths to files must be
specified relative to the toplevel of a repository.  Providing
suggestions of paths that are relative to the current working directory,
as bash completion defaults to, is wrong when the current working
directory is not the worktree toplevel directory.

Issue #5: Paths with special characters will be interpreted incorrectly

The entries in the sparse-checkout file are patterns, not paths.  While
most paths also qualify as patterns (though even in such cases it would
be better for users to not use them directly but prefix them with a
leading '/'), there are a variety of special characters that would need
special escaping beyond the normal shell escaping: '*', '?', '\', '[',
']', and any leading '#' or '!'.  If completion suggests any such paths,
users will likely expect them to be treated as an exact path rather than
as a pattern that might match some number of files other than 1.

Because of the combination of the above issues, turn completion off for
the `set` and `add` subcommands of `sparse-checkout` when in non-cone
mode, but leave a NEEDSWORK comment specifying what could theoretically
be done if someone wanted to provide completion rules that were more
helpful than harmful.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 contrib/completion/git-completion.bash | 61 ++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

Comments

Junio C Hamano Nov. 24, 2023, 1:19 a.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  		if [[ "$using_cone" == "true" ]]; then
>  			__gitcomp_directories

Hmph, doesn't "Providing the files and directories currently present
is thus always wrong." apply equally to cone mode?

> +		else
> +			# NEEDSWORK: It might be useful to provide a
> +			# completion function which:
> +			#
> +			#     1. Provides completions based on
> +			#        files/directories that exist in HEAD, not
> +			#        just those currently present in the working
> +			#        tree.

This makes a lot of sense.  May make even more sense with
s/HEAD/index/, though.

> +			#     4. Provides no completions when run from a
> +			#        subdirectory of the repository root.  (If we
> +			#        did provide file/directory completions, the
> +			#        user would just get a "please run from the
> +			#        toplevel directory" error message when they
> +			#        ran it.  *Further*, if the user did rerun
> +			#        the command from the toplevel, the
> +			#        completions we previously provided would
> +			#        likely be wrong as they'd be relative to the
> +			#        subdirectory rather than the repository
> +			#        root.  That could lead to users getting a
> +			#        nasty surprise based on trying to use a
> +			#        command we helped them create.)

Hmph, would an obvious alternative to (1) check against the HEAD (or
the index) to see if the prefix string matches an entity at the
current directory level, and then (2) to prefix the result of the
previous step with "/$(git rev-parse --show-prefix)" work?  That is
something like this:

    $ cd t
    $ git sparse-checkout add help<TAB>
    ->
    $ git sparse-checkout add /t/helper/

and when the user gave the full path from the root level, do the
obvious:

    $ cd t
    $ git sparse-checkout add /t/help<TAB>
    ->
    $ git sparse-checkout add /t/helper/

Another more fundamental approach to avoid "confusion" this bullet
item tries to side step might be to *fix* the command that gets
completed.  As "git sparse-checkout --help" is marked as
EXPERIMENTAL in capital letters, we should be able to say "what was
traditionally known as 'add' is from now on called 'add-pattern' and
command line completion would not get in the way; the 'add'
subcommand now takes only literal paths, not patterns, that are
relative to the current directory" if we wanted to.

> +			#     5. Provides escaped completions for any paths
> +			#        containing a '*', '?', '\', '[', ']', or
> +			#        leading '#' or '!'.  (These characters might
> +			#        already be escaped to protect from the
> +			#        shell, but they need an *extra* layer of
> +			#        escaping to prevent the pattern parsing in
> +			#        Git from seeing them as special characters.)
> +			#
> +			# Of course, this would be a lot of work, so for now,
> +			# just avoid the many forms of user confusion that
> +			# could be caused by providing bad completions by
> +			# providing a fake completion to avoid falling back to
> +			# bash's normal file and directory completion.

> +			COMPREPLY=( "" )
>  		fi
>  	esac
>  }
Elijah Newren Nov. 24, 2023, 3:28 p.m. UTC | #2
On Thu, Nov 23, 2023 at 5:19 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >               if [[ "$using_cone" == "true" ]]; then
> >                       __gitcomp_directories
>
> Hmph, doesn't "Providing the files and directories currently present
> is thus always wrong." apply equally to cone mode?

Absolutely, it definitely applies to cone mode.  We (mostly) fixed
that a long time ago, making it not complete on the files &
directories currently present.  In particular, the
__gitcomp_directories() function highlighted here completes on the
output of `git ls-tree -z -d --name-only HEAD`.

However, before this series, there was a problem when
__gitcomp_directories() finds no possible completions.  In that case,
the code would fall back to bash-completion's default of completing on
all files and directories currently present.  But that was fixed in
patch 3 of this series to avoid that fallback.

This patch, though, isn't about cone mode.  It's about fixing (or at
least improving) non-cone mode.

> > +             else
> > +                     # NEEDSWORK: It might be useful to provide a
> > +                     # completion function which:
> > +                     #
> > +                     #     1. Provides completions based on
> > +                     #        files/directories that exist in HEAD, not
> > +                     #        just those currently present in the working
> > +                     #        tree.
>
> This makes a lot of sense.  May make even more sense with
> s/HEAD/index/, though.

Ooh, interesting.  That wouldn't work with the sparse index (where
paths we want to complete on are currently missing from the index
too), but sparse index is restricted to cone mode, and we're
discussing non-cone-mode here.  So, this might be a basis for a good
alternative.

> > +                     #     4. Provides no completions when run from a
> > +                     #        subdirectory of the repository root.  (If we
> > +                     #        did provide file/directory completions, the
> > +                     #        user would just get a "please run from the
> > +                     #        toplevel directory" error message when they
> > +                     #        ran it.  *Further*, if the user did rerun
> > +                     #        the command from the toplevel, the
> > +                     #        completions we previously provided would
> > +                     #        likely be wrong as they'd be relative to the
> > +                     #        subdirectory rather than the repository
> > +                     #        root.  That could lead to users getting a
> > +                     #        nasty surprise based on trying to use a
> > +                     #        command we helped them create.)
>
> Hmph, would an obvious alternative to (1) check against the HEAD (or
> the index) to see if the prefix string matches an entity at the
> current directory level, and then (2) to prefix the result of the
> previous step with "/$(git rev-parse --show-prefix)" work?  That is
> something like this:
>
>     $ cd t
>     $ git sparse-checkout add help<TAB>
>     ->
>     $ git sparse-checkout add /t/helper/

I thought bash-completion was only for completions, not for startings
as well.  Was I mistaken?

> and when the user gave the full path from the root level, do the
> obvious:
>
>     $ cd t
>     $ git sparse-checkout add /t/help<TAB>
>     ->
>     $ git sparse-checkout add /t/helper/
>
> Another more fundamental approach to avoid "confusion" this bullet
> item tries to side step might be to *fix* the command that gets
> completed.  As "git sparse-checkout --help" is marked as
> EXPERIMENTAL in capital letters, we should be able to say "what was
> traditionally known as 'add' is from now on called 'add-pattern' and
> command line completion would not get in the way; the 'add'
> subcommand now takes only literal paths, not patterns, that are
> relative to the current directory" if we wanted to.

That's interesting...but it opens up a new can of worms:
  * Would we also need both `set-patterns` and `set`, in addition to
`add-patterns` and `add`?
  * In cone mode, the paths passed are literal directories (and only
directories; no individual files), but the thing added is a
telescoping "cone" of leading directories as well.  Does this make it
potentially confusing to users to say that `add` only takes literal
paths?
  * In cone mode (the default), should `add-patterns` just be an
error, since no pattern specification is allowed?
  * In the git-sparse-checkout manual, for performance reasons, we
recommend users _not_ specify individual paths in non-cone mode.
Would our recommendation then be to just not use `add` or `set` and
only use `add-patterns` and `set-patterns`?  If so, what have we
accomplished by adding the new names?

Maybe I'm missing something about your suggestion, but this seems much
more complex than the simple solution we implemented in bb8b5e9a90d
("sparse-checkout: pay attention to prefix for {set, add}",
2022-02-19) for the !core_sparse_checkout_cone case.  I like the
simple solution there, though that simple solution omitted modifying
the completion rules in a way that was consistent (i.e. that returns
nothing when the user is running from a subdirectory).


> > +                     #     5. Provides escaped completions for any paths
> > +                     #        containing a '*', '?', '\', '[', ']', or
> > +                     #        leading '#' or '!'.  (These characters might
> > +                     #        already be escaped to protect from the
> > +                     #        shell, but they need an *extra* layer of
> > +                     #        escaping to prevent the pattern parsing in
> > +                     #        Git from seeing them as special characters.)
> > +                     #
> > +                     # Of course, this would be a lot of work, so for now,
> > +                     # just avoid the many forms of user confusion that
> > +                     # could be caused by providing bad completions by
> > +                     # providing a fake completion to avoid falling back to
> > +                     # bash's normal file and directory completion.
>
> > +                     COMPREPLY=( "" )
> >               fi
> >       esac
> >  }
Junio C Hamano Nov. 27, 2023, 1:39 a.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

>> >               if [[ "$using_cone" == "true" ]]; then
>> >                       __gitcomp_directories
>>
>> Hmph, doesn't "Providing the files and directories currently present
>> is thus always wrong." apply equally to cone mode?
>
> Absolutely, it definitely applies to cone mode.  We (mostly) fixed
> that a long time ago, making it not complete on the files &
> directories currently present.  In particular, the
> __gitcomp_directories() function highlighted here completes on the
> output of `git ls-tree -z -d --name-only HEAD`.

Thanks; what I missed was exactly what __gitcomp_directories does
not do you explained above.

>> > +                     #     4. Provides no completions when run from a
>> > +                     #        subdirectory of the repository root.  (If we
>> > +                     #        did provide file/directory completions, the
>> > +                     #        user would just get a "please run from the
>> > +                     #        toplevel directory" error message when they
>> > +                     #        ran it.  *Further*, if the user did rerun
>> > +                     #        the command from the toplevel, the
>> > +                     #        completions we previously provided would
>> > +                     #        likely be wrong as they'd be relative to the
>> > +                     #        subdirectory rather than the repository
>> > +                     #        root.  That could lead to users getting a
>> > +                     #        nasty surprise based on trying to use a
>> > +                     #        command we helped them create.)
>>
>> Hmph, would an obvious alternative to (1) check against the HEAD (or
>> the index) to see if the prefix string matches an entity at the
>> current directory level, and then (2) to prefix the result of the
>> previous step with "/$(git rev-parse --show-prefix)" work?  That is
>> something like this:
>>
>>     $ cd t
>>     $ git sparse-checkout add help<TAB>
>>     ->
>>     $ git sparse-checkout add /t/helper/
>
> I thought bash-completion was only for completions, not for startings
> as well.  Was I mistaken?

To my mind, the completion is what I as an end user get when I type
<TAB> to help me formulate input that is acceptable by the command.
As I said, I consider it a bug (or UI mistake) in the a command if
it pretends to work inside a subdirecctory but complains when it is
given a path relative to the current directory, so I'd rather prefer
the approach to "fix" the underlying command, but if that is too
much work or cannot be done for whatever reason, the second best
would be to turn whatever we can do to help the end-user input into
a form that is accepted by the command without changing what the
input means.  If it takes more than "appending at the end", that is
fine, at least by me as an end user.

If you are saying "completion code can only append at the end
because we can only return strings to be appended, not the entire
strings, to the readline machinery, so mucking with the start of the
string is not doable", then sorry---I accept that what we cannot do
cannot be done, and in that case you are "not mistaken".

But from the existing use of COMPREPLY[], it didn't look that way
(it seems __gitcomp is equipped to take fixed prefix to all
candidates by passing it in $2 and used to complete names of
configuration variables in a section, but it seems to me that it can
be repurposed when prefixing "$(git rev-parse --show-prefix)" to a
given pathname relative to the $cwd).  And if that can be done, then
you are "not mistaken", but merely being dogmatic and limiting what
your code can do yourself.

>> Another more fundamental approach to avoid "confusion" this bullet
>> item tries to side step might be to *fix* the command that gets
>> completed.  As "git sparse-checkout --help" is marked as
>> EXPERIMENTAL in capital letters, we should be able to say "what was
>> traditionally known as 'add' is from now on called 'add-pattern' and
>> command line completion would not get in the way; the 'add'
>> subcommand now takes only literal paths, not patterns, that are
>> relative to the current directory" if we wanted to.
>
> That's interesting...but it opens up a new can of worms:
>   * Would we also need both `set-patterns` and `set`, in addition to
> `add-patterns` and `add`?

If "set" has a similar UI issue that confuses end-users, then sure,
I do not see a reason why we want to leave it confusing---the
experimental labelling is to allow us to fix these warts more
easily, no?

>   * In cone mode, the paths passed are literal directories (and only
> directories; no individual files), but the thing added is a
> telescoping "cone" of leading directories as well.  Does this make it
> potentially confusing to users to say that `add` only takes literal
> paths?

I do not know.

>   * In cone mode (the default), should `add-patterns` just be an
> error, since no pattern specification is allowed?

I do not really care.  "add-patterns" is a potential tool you can
use to reduce friction while fixing the UI warts in an experimental
command.

>   * In the git-sparse-checkout manual, for performance reasons, we
> recommend users _not_ specify individual paths in non-cone mode.
> Would our recommendation then be to just not use `add` or `set` and
> only use `add-patterns` and `set-patterns`?

Very likely.  If the desired behaviour from the command can only be
had by castrating features, then such a recommendation would not
mean much to end-users anyway, though.

> If so, what have we
> accomplished by adding the new names?

It is valuable for those who do need to go against recommendation
(because the recommendation robs usability from them way too much),
will have much less confusing and working completion when they use
'add' or 'set', no?

> Maybe I'm missing something about your suggestion, but this seems much
> more complex than the simple solution we implemented in bb8b5e9a90d
> ("sparse-checkout: pay attention to prefix for {set, add}",
> 2022-02-19) for the !core_sparse_checkout_cone case.

Oh, if we do honor the $(git rev-parse --show-prefix), then that
changes the equation somewhat.  I got an impression from your log
message or cover letter that it wasn't the case, and that was where
the "if the command is so broken, then completion can add it for the
user" and "if the command is so broken, then fix it to take relative
paths" came from.
Elijah Newren Dec. 3, 2023, 5:57 a.m. UTC | #4
On Sun, Nov 26, 2023 at 5:39 PM Junio C Hamano <gitster@pobox.com> wrote:
[...]
> >> Hmph, would an obvious alternative to (1) check against the HEAD (or
> >> the index) to see if the prefix string matches an entity at the
> >> current directory level, and then (2) to prefix the result of the
> >> previous step with "/$(git rev-parse --show-prefix)" work?  That is
> >> something like this:
> >>
> >>     $ cd t
> >>     $ git sparse-checkout add help<TAB>
> >>     ->
> >>     $ git sparse-checkout add /t/helper/
> >
> > I thought bash-completion was only for completions, not for startings
> > as well.  Was I mistaken?
>
> To my mind, the completion is what I as an end user get when I type
> <TAB> to help me formulate input that is acceptable by the command.
> As I said, I consider it a bug (or UI mistake) in the a command if
> it pretends to work inside a subdirecctory but complains when it is
> given a path relative to the current directory, so I'd rather prefer
> the approach to "fix" the underlying command, but if that is too
> much work or cannot be done for whatever reason, the second best
> would be to turn whatever we can do to help the end-user input into
> a form that is accepted by the command without changing what the
> input means.  If it takes more than "appending at the end", that is
> fine, at least by me as an end user.
>
> If you are saying "completion code can only append at the end
> because we can only return strings to be appended, not the entire
> strings, to the readline machinery, so mucking with the start of the
> string is not doable", then sorry---I accept that what we cannot do
> cannot be done, and in that case you are "not mistaken".

This was what I thought; that bash completion didn't support this.

> But from the existing use of COMPREPLY[], it didn't look that way
> (it seems __gitcomp is equipped to take fixed prefix to all
> candidates by passing it in $2 and used to complete names of
> configuration variables in a section, but it seems to me that it can
> be repurposed when prefixing "$(git rev-parse --show-prefix)" to a
> given pathname relative to the $cwd).

Ooh, that's really interesting; I had no idea it had this kind of
flexibility.  It does feel like we're abusing "bash completions" to be
both "bash completions AND startings", but I agree that this is a case
where it makes sense to do so.

I changed patch 4 to implement this for non-cone mode, and submitted
v3 with that change.
diff mbox series

Patch

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 136faeca1e9..7d460da2fab 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -3068,6 +3068,67 @@  _git_sparse_checkout ()
 		fi
 		if [[ "$using_cone" == "true" ]]; then
 			__gitcomp_directories
+		else
+			# NEEDSWORK: It might be useful to provide a
+			# completion function which:
+			#
+			#     1. Provides completions based on
+			#        files/directories that exist in HEAD, not
+			#        just those currently present in the working
+			#        tree.  Bash's default file and directory
+			#        completion is totally useless for "git
+			#        sparse-checkout add" because of this.  It is
+			#        likewise problematic for "git
+			#        sparse-checkout set" except in those subset
+			#        of cases when trying to narrow scope to a
+			#        strict subset of what you already have
+			#        checked out.
+			#
+			#     2. Always provides file/directory completions
+			#        with a prepended leading '/', so that
+			#        files/directories are only searched at the
+			#        relevant level rather than throughout all
+			#        trees in the hierarchy.  Doing this also
+			#        avoids suggesting the user run a
+			#        sparse-checkout command that will result in
+			#        a warning be thrown at the user.
+			#
+			#     3. Does not accidentally search the root of
+			#        the filesystem when a path with a leading
+			#        slash is specified.  ("git sparse-checkout
+			#        add /ho<TAB>" should not complete to
+			#        "/home" but to e.g. "/hooks" if there is a
+			#        "hooks" in the top of the repository.)
+			#
+			#     4. Provides no completions when run from a
+			#        subdirectory of the repository root.  (If we
+			#        did provide file/directory completions, the
+			#        user would just get a "please run from the
+			#        toplevel directory" error message when they
+			#        ran it.  *Further*, if the user did rerun
+			#        the command from the toplevel, the
+			#        completions we previously provided would
+			#        likely be wrong as they'd be relative to the
+			#        subdirectory rather than the repository
+			#        root.  That could lead to users getting a
+			#        nasty surprise based on trying to use a
+			#        command we helped them create.)
+			#
+			#     5. Provides escaped completions for any paths
+			#        containing a '*', '?', '\', '[', ']', or
+			#        leading '#' or '!'.  (These characters might
+			#        already be escaped to protect from the
+			#        shell, but they need an *extra* layer of
+			#        escaping to prevent the pattern parsing in
+			#        Git from seeing them as special characters.)
+			#
+			# Of course, this would be a lot of work, so for now,
+			# just avoid the many forms of user confusion that
+			# could be caused by providing bad completions by
+			# providing a fake completion to avoid falling back to
+			# bash's normal file and directory completion.
+
+			COMPREPLY=( "" )
 		fi
 	esac
 }