diff mbox series

[v3,2/3] sparse-checkout: custom tab completion

Message ID 256e5f034c6072b6e3621adfa96c5c6319752fae.1641841193.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series sparse checkout: custom bash completion updates | expand

Commit Message

Lessley Dennington Jan. 10, 2022, 6:59 p.m. UTC
From: Lessley Dennington <lessleydennington@gmail.com>

Fix custom tab completion for sparse-checkout command. This will ensure:

1. The full list of subcommands is provided when users enter git
sparse-checkout <TAB>.
2. The --help option is tab-completable.
3. Subcommand options are tab-completable.
4. A list of directories (but not files) is provided when users enter git
sparse-checkout add <TAB> or git sparse-checkout set <TAB>. It is
important to note that this will apply for both cone mode and non-cone
mode (even though non-cone mode matches on patterns rather than
directories).

Failing tests that were added in the previous commit to verify these
scenarios are now passing with these updates.

Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
---
 contrib/completion/git-completion.bash | 38 ++++++++++++++++++--------
 t/t9902-completion.sh                  |  8 +++---
 2 files changed, 30 insertions(+), 16 deletions(-)

Comments

SZEDER Gábor Jan. 15, 2022, 9:57 a.m. UTC | #1
On Mon, Jan 10, 2022 at 06:59:51PM +0000, Lessley Dennington via GitGitGadget wrote:
> Subject: Re: [PATCH v3 2/3] sparse-checkout: custom tab completion

None of these patches touch sparse-checkout, but only the completion
script and its tests.  Therefore "completion:" would be a better
matching area prefix.

> Fix custom tab completion for sparse-checkout command. This will ensure:
> 
> 1. The full list of subcommands is provided when users enter git
> sparse-checkout <TAB>.
> 2. The --help option is tab-completable.

This is inconsistent with the rest of the completion script, because
it doesn't list '--help' for any commands or subcommand (with the sole
exception of 'git --<TAB>').

> 3. Subcommand options are tab-completable.
> 4. A list of directories (but not files) is provided when users enter git
> sparse-checkout add <TAB> or git sparse-checkout set <TAB>.

Why limit completion only to directories?  Both of those subcommands
accept files, and I think 'git sparse-checkout set README.md' is a
perfectly reasonable command.

> It is
> important to note that this will apply for both cone mode and non-cone
> mode (even though non-cone mode matches on patterns rather than
> directories).
> 
> Failing tests that were added in the previous commit to verify these
> scenarios are now passing with these updates.
> 
> Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
> ---
>  contrib/completion/git-completion.bash | 38 ++++++++++++++++++--------
>  t/t9902-completion.sh                  |  8 +++---
>  2 files changed, 30 insertions(+), 16 deletions(-)
> 
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index c82ccaebcc7..f478883f51c 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2986,24 +2986,38 @@ _git_show_branch ()
>  	__git_complete_revlist
>  }
>  
> +__git_sparse_checkout_subcommand_opts="--cone --no-cone --sparse-index --no-sparse-index"
> +
>  _git_sparse_checkout ()
>  {
> -	local subcommands="list init set disable"
> +	local subcommands="list init set disable add reapply"
>  	local subcommand="$(__git_find_on_cmdline "$subcommands")"
> +
>  	if [ -z "$subcommand" ]; then
> -		__gitcomp "$subcommands"
> -		return
> +		case "$cur" in
> +			--*)
> +				__gitcomp "--help"
> +				;;
> +			*)
> +				__gitcomp "$subcommands"
> +				;;
> +		esac
>  	fi
>  
> -	case "$subcommand,$cur" in
> -	init,--*)
> -		__gitcomp "--cone"
> -		;;
> -	set,--*)
> -		__gitcomp "--stdin"
> -		;;
> -	*)
> -		;;
> +	case "$subcommand" in
> +		set)
> +			__gitcomp "$__git_sparse_checkout_subcommand_opts --stdin"

Oh, a hard-coded list of --options is so old school :)

All subcommands of 'git sparse-checkout' use parse-options (even those
that don't have any --options at the moment), so their options can be
completed programmatically, and then we wouldn't have to worry about
updating the list of options in the completion script ever again.  It
would also make several tests added in the previous patch unnecessary.

You could use the completion function of 'git notes' for inspiration
on how to do that.

> +			__gitcomp "$(git ls-tree -d -r HEAD --name-only)"

This will have troubles with unusual characters in pathnames:

  - If a pathname contains a space (or any other IFS character), then
    the shell will split that into multiple words.  

  - If a pathname contains a special character, e.g. double quote or
    backslash, then 'git ls-tree' will quote that path.  Furthermore,
    by default it will quote pathnames containing e.g. UTF-8
    characters as well.

  - The shell has its own special characteers that have to be
    quoted/escaped on the command line to strip them from their
    special meaning, that quoting/escaping will interfere with listing
    only paths matching the current word to be completed.

You should use the __git_complete_index_file() helper function
instead, which takes care of most of this.

Furthermore, these two subsequent __gitcomp() calls will offer both
options and paths for 'git sparse-checkout set <TAB>', which is
inconsistent with the rest of the completion script.  Usually it lists
--options only after e.g. 'git rm --<TAB>' or 'git log --<TAB>', but
after 'git rm <TAB>' and 'git log <TAB>' it lists only files and refs,
respectively.

> +			;;
> +		add)
> +			__gitcomp "--stdin"
> +			__gitcomp "$(git ls-tree -d -r HEAD --name-only)"

Likewise.

> +			;;
> +		init|reapply)
> +			__gitcomp "$__git_sparse_checkout_subcommand_opts"
> +			;;
> +		*)
> +			;;
>  	esac
>  }
>
Elijah Newren Jan. 16, 2022, 1:03 a.m. UTC | #2
On Sat, Jan 15, 2022 at 1:57 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 06:59:51PM +0000, Lessley Dennington via GitGitGadget wrote:
> > Subject: Re: [PATCH v3 2/3] sparse-checkout: custom tab completion
>
> None of these patches touch sparse-checkout, but only the completion
> script and its tests.  Therefore "completion:" would be a better
> matching area prefix.

Thanks for the detailed feedback and guidance in your review.  Very
helpful.  I'll omit quoting most of it here, but I do want to comment
on the point about directories.

...
> > 4. A list of directories (but not files) is provided when users enter git
> > sparse-checkout add <TAB> or git sparse-checkout set <TAB>.
>
> Why limit completion only to directories?  Both of those subcommands
> accept files,

Discussed in part at [1], but let me give a more detailed answer.

Both of these commands accept not only directories and files, but also
nearly arbitrary input as far as I can tell.  (In cone-mode, it'll
accept anything so long as it doesn't look like a relative path that
tries to reach above the toplevel directory with '../' sequences.  In
non-cone mode, I think it accepts completely arbitrary input).  If our
guide is merely what the command swallows, then we should forgo
completion for these subcommands, because it's not possible to
enumerate all possible completions.  I don't think that's a useful
guide or starting point, so we instead need to discuss what are
reasonable completions.

cone-mode works exclusively on directories.  So, in that mode,
directories are what we want to complete on.  (And if a file is
specified, cone-mode will treat it as a directory and add expressions
for including all the files under that "directory", which might be
confusing.  sparse-checkout doesn't verify it is a diretory, because
it *might* name a directory in a different branch, including one not
yet downloaded.  But "might name a directory on another branch" is no
reason to suggest picking that pathname with completion.)

In non-cone mode, arbitrary expressions are valid and will be treated
as gitignore-style expressions.  That again leaves us with either not
providing completions, or choosing a subset of possible inputs that
are reasonable suggestions for users.  I prefer the latter, and in
particular I feel that directories are reasonable suggestions.  In
contrast, I don't think providing files is helpful, because it
reinforces the design flaw of non-cone mode.  Non-cone mode has
quadratic performance baked into its design, and since
sparse-checkouts are about performance, non-cone mode kind of defeats
the purpose of the command.  (In addition to other problems[2].)  So,
I think non-cone mode should be deprecated and excised.  Patches
elsewhere are moving in the direction of deprecation already[3], and
we've already discussed multiple steps we'll likely take soon
continuing in that direction.  In the meantime, providing just
directories for completion seems like a good direction to me.

[1] https://lore.kernel.org/git/CABPp-BG=wr81CPtW1M12xFN_0dyS8mAZjM6o=77LA20Zge8Xng@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BF=-1aZd=nFHF6spo7Ksa7f7Wb7ervCt0QvtNitMY=ZBA@mail.gmail.com/
[3] https://lore.kernel.org/git/0af00779128e594aff0ee4ec5378addeac8e88a2.1642175983.git.gitgitgadget@gmail.com/
("This mode is harder to use and less performant, and is thus not
recommended.")

> and I think 'git sparse-checkout set README.md' is a
> perfectly reasonable command.

Reasonable in what sense?  That it makes it (vastly) easier to
implement the completion and sparse-checkout set|add will swallow it,
or that it's something that should actually be recommended for users
doing sparse-checkouts?  While the former certainly holds, I don't
think the latter does.
Junio C Hamano Jan. 16, 2022, 10:13 p.m. UTC | #3
Elijah Newren <newren@gmail.com> writes:

> ...  If our
> guide is merely what the command swallows, then we should forgo
> completion for these subcommands, because it's not possible to
> enumerate all possible completions.

I am not sure if I follow.  I do not think it makes sense to aim for
enumerating EVERYTHING the command would accept.  But I do not know
that is the same as "merely what the command swallows".

> I don't think that's a useful guide or starting point, so we
> instead need to discuss what are reasonable completions.

I do not think it is a good idea to refrain from suggesting anything
that has a possibility of being wring, either, though.  If a path
that is not a directory (either because it is a file in the current
checkout, or because it is a directory only in a different branch)
is given, it might not make sense in the cone-mode for the current
checkout, but it may well make sense when a different branch is
checked out.  Or you may not even in the cone-mode, and in which
case, as SZEDER suggested, a single filename may perfectly make
sense.  A user who said READM<TAB> and does not see it completed to
README.md would be quite confused.

Are we limiting ourselves to directories only when we know we are in
the cone-mode and showing both directories and files otherwise?

I think the guiding principle ought to be that we show completion
that

 - is cheap enough to compute in interactive use (e.g. we should
   refrain from looking for directories in all possible branches,
   but instead just look at the working tree and possibly in the
   index),

 - is simple enough to explain to the users to understand what the
   rules are, and

 - gives useful enough candidates.

"We only look for directories (without going recursive) in the
working tree" does satisfy the first two, but I am not sure it is
more useful than "We show files and directories (without going
recursive) in the working tree", which also satisfies the first
two.

Of course, if the completion limits to directories only in a
repository in the cone-mode, I would not worry about the exclusion
of non-directories will confuse users.
Elijah Newren Jan. 17, 2022, 6:14 p.m. UTC | #4
On Sun, Jan 16, 2022 at 2:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > ...  If our
> > guide is merely what the command swallows, then we should forgo
> > completion for these subcommands, because it's not possible to
> > enumerate all possible completions.
>
> I am not sure if I follow.  I do not think it makes sense to aim for
> enumerating EVERYTHING the command would accept.  But I do not know
> that is the same as "merely what the command swallows".

I don't understand your distinction; I think your second sentence is
the same thing I said.

> > I don't think that's a useful guide or starting point, so we
> > instead need to discuss what are reasonable completions.
>
> I do not think it is a good idea to refrain from suggesting anything
> that has a possibility of being wring, either, though.  If a path
> that is not a directory (either because it is a file in the current
> checkout, or because it is a directory only in a different branch)
> is given, it might not make sense in the cone-mode for the current
> checkout, but it may well make sense when a different branch is
> checked out.

Completing on files and directories would be a reasonable first cut,
but that's what we already had before this series was proposed.  It
has the downsides of
  * [cone-mode] making the majority of suggested completions wrong
  * [non-cone mode] subtly recommending individually adding all wanted
files, increasing the number of patterns and exacerbating the
quadratic behavior that non-cone mode experiences with every
unpack_trees() call that touches the working tree.
  * [both modes] potentially making it hard to find or select the
valid choices you do want (there are many more files than directories)

But, despite all that, completing on files and directories is a
somewhat reasonable first cut.

Lessley was trying to aim higher with her submission, though, and I
don't see why she should be dissuaded.  Something better would be
nice.

>  Or you may not even in the cone-mode, and in which
> case, as SZEDER suggested, a single filename may perfectly make
> sense.  A user who said READM<TAB> and does not see it completed to
> README.md would be quite confused.

I'm not sure I follow.  READM<TAB> already doesn't complete to
README.md in the following example command lines:
   'cd READM<TAB>'
   'ssh READM<TAB>'
That doesn't seem to cause user confusion, so I don't think
disallowing it in cone-mode would cause confusion.  Suggesting it, on
the other hand, may well cause confusion given that cone-mode is
explicitly about directories and is documented as such.

If you're only talking about non-cone mode, then this may be a
reasonable objection, though I'm not sure I agree with it even then.
In addition to the fact that individual file completion can be
detrimental to the user in non-cone mode for performance reasons, the
documentation never explicitly states files are acceptable to
sparse-checkout {set,add}.  It always mentions "directories" and
"patterns".  We can't complete all patterns, so we have to pick a
subset.  I don't see why "directories and files" is a better subset to
pick than "just directories".

> Are we limiting ourselves to directories only when we know we are in
> the cone-mode and showing both directories and files otherwise?

That is one possibility, though not the one Lessley proposed.

> I think the guiding principle ought to be that we show completion
> that
>
>  - is cheap enough to compute in interactive use (e.g. we should
>    refrain from looking for directories in all possible branches,
>    but instead just look at the working tree and possibly in the
>    index),

I agree, except with the suggestion to use the working tree or index
in the case of sparse-checkouts.  The working tree shouldn't be used
because it doesn't have the needed information:
  * `git clone --sparse` starts with zero subdirectories present
  * `set` and `add` are likely being used to change sparsity to
include something not already present

The index shouldn't be used because it is not cheap for interactive use:
  * it contains recursive entries below directories of interest that
have to be filtered out
  * with the sparse-index, it may have sparse-directories hiding what
we want, forcing us to fully inflate the index to find the pieces we
do want

I agree with Lessley's choice of using ls-tree and HEAD to achieve
cheap interactive use.

>  - is simple enough to explain to the users to understand what the
>    rules are, and
>
>  - gives useful enough candidates.

I agree with these 3 criteria.

> "We only look for directories (without going recursive) in the
> working tree" does satisfy the first two, but I am not sure it is
> more useful than "We show files and directories (without going
> recursive) in the working tree", which also satisfies the first
> two.

Well, Lessley certainly thought directories-only was more useful, and
in fact labelled "files and directories" as a bug/issue in her cover
letter.  Stolee commented on the series and also didn't see anything
wrong with that claim.

> Of course, if the completion limits to directories only in a
> repository in the cone-mode, I would not worry about the exclusion
> of non-directories will confuse users.

I'd prefer directories-only for both modes, but could accept
directories-only for cone mode and both-files-and-directories for
non-cone mode.

Especially since I think we're going to deprecate non-cone mode regardless.
Junio C Hamano Jan. 17, 2022, 7:40 p.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

> I'm not sure I follow.  READM<TAB> already doesn't complete to
> README.md in the following example command lines:
>    'cd READM<TAB>'
>    'ssh READM<TAB>'

If the example were "ls READM<TAB>", it would have been worth the
time to read and think about this again, but all users expect "cd"
to go into directories, and "ssh" to go visit another host, and
taking a local filename as a hint to complete would be nonsense.

But that is not what we are talking about (and it is frustrating
that you know it).  

To users, what sparse-checkout takes look like local pathnames, not
necessarily limited to directory names (if you disagree, go back and
read what SZEDER wrote to trigger this thread).

I know it is your preference to complete only directories and
exclude filenames, but I question if the confusion such a design
causes to end-users is worth it.
Lessley Dennington Jan. 18, 2022, 5:56 p.m. UTC | #6
On 1/17/22 11:40 AM, Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
>> I'm not sure I follow.  READM<TAB> already doesn't complete to
>> README.md in the following example command lines:
>>     'cd READM<TAB>'
>>     'ssh READM<TAB>'
> 
> If the example were "ls READM<TAB>", it would have been worth the
> time to read and think about this again, but all users expect "cd"
> to go into directories, and "ssh" to go visit another host, and
> taking a local filename as a hint to complete would be nonsense.
> 
> But that is not what we are talking about (and it is frustrating
> that you know it).
> 
> To users, what sparse-checkout takes look like local pathnames, not
> necessarily limited to directory names (if you disagree, go back and
> read what SZEDER wrote to trigger this thread).
> 
> I know it is your preference to complete only directories and
> exclude filenames, but I question if the confusion such a design
> causes to end-users is worth it.

I think perhaps we're a little caught up in exemplifying commands that
are unrelated to sparse-checkout. As Elijah said in [1], the documentation
states that directories and patterns are acceptable to sparse-checkout but
not files. While it is not reasonable to try to offer every pattern a user
could possibly pass to sparse-checkout, it is reasonable to offer
directories and (in my opinion) will help guide users toward correct usage
of the command.

However, since completion on directories is cone-mode-specific, I am
willing to accept the suggestion to only complete directories if we are in
a cone-mode sparse-checkout and apply it in v4 of this series.

[1]: 
https://lore.kernel.org/git/CABPp-BErg-RtyycXaRXYfQHEQXA4q-FU9Q6nYkSHJsqL-04oXw@mail.gmail.com/
Lessley Dennington Jan. 18, 2022, 5:59 p.m. UTC | #7
On 1/15/22 5:03 PM, Elijah Newren wrote:
> On Sat, Jan 15, 2022 at 1:57 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>>
>> On Mon, Jan 10, 2022 at 06:59:51PM +0000, Lessley Dennington via GitGitGadget wrote:
>>> Subject: Re: [PATCH v3 2/3] sparse-checkout: custom tab completion
>>
>> None of these patches touch sparse-checkout, but only the completion
>> script and its tests.  Therefore "completion:" would be a better
>> matching area prefix.
> 
> Thanks for the detailed feedback and guidance in your review.  Very
> helpful.  I'll omit quoting most of it here, but I do want to comment
> on the point about directories.
> 
I second this - am planning to apply all feedback (with the exception of 
completing
on both files and directories) in v4 of this series.
SZEDER Gábor Jan. 18, 2022, 9:02 p.m. UTC | #8
On Mon, Jan 17, 2022 at 10:14:58AM -0800, Elijah Newren wrote:
> the
> documentation never explicitly states files are acceptable to
> sparse-checkout {set,add}.  It always mentions "directories" and
> "patterns".

The "Full Pattern Set" section of the manpage does in fact explicitly
states that files are supported.
Elijah Newren Jan. 18, 2022, 9:43 p.m. UTC | #9
On Tue, Jan 18, 2022 at 1:02 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Mon, Jan 17, 2022 at 10:14:58AM -0800, Elijah Newren wrote:
> > the
> > documentation never explicitly states files are acceptable to
> > sparse-checkout {set,add}.  It always mentions "directories" and
> > "patterns".
>
> The "Full Pattern Set" section of the manpage does in fact explicitly
> states that files are supported.

Not sure it matters since Junio has stated his position pretty
strongly, making it irrelevant at this point, but.... I'm not sure I'm
following you.  Is this the quote you're referring to?

"""
While $GIT_DIR/info/sparse-checkout is usually used to specify what
files are included,
"""

If so,
    /foo[A-M]*/pref?x_*.c
could be an entry in the file "used to specify what files are
included", but it's not a file.  The two example lines immediately
following aren't filenames either (though the '!'-prefixed line is
close).  Or are you referring to something else?
SZEDER Gábor Jan. 18, 2022, 10:22 p.m. UTC | #10
On Sat, Jan 15, 2022 at 05:03:24PM -0800, Elijah Newren wrote:
> On Sat, Jan 15, 2022 at 1:57 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> >
> > On Mon, Jan 10, 2022 at 06:59:51PM +0000, Lessley Dennington via GitGitGadget wrote:
> > > Subject: Re: [PATCH v3 2/3] sparse-checkout: custom tab completion
> >
> > None of these patches touch sparse-checkout, but only the completion
> > script and its tests.  Therefore "completion:" would be a better
> > matching area prefix.
> 
> Thanks for the detailed feedback and guidance in your review.  Very
> helpful.  I'll omit quoting most of it here, but I do want to comment
> on the point about directories.
> 
> ...
> > > 4. A list of directories (but not files) is provided when users enter git
> > > sparse-checkout add <TAB> or git sparse-checkout set <TAB>.
> >
> > Why limit completion only to directories?  Both of those subcommands
> > accept files,
> 
> Discussed in part at [1], but let me give a more detailed answer.

It was a semi-rhetorical question.  Whether the reasons for expluding
files are sound or not, it should be convincingly justified in the
commit message.

> Both of these commands accept not only directories and files, but also
> nearly arbitrary input as far as I can tell.  (In cone-mode, it'll
> accept anything so long as it doesn't look like a relative path that
> tries to reach above the toplevel directory with '../' sequences.  In
> non-cone mode, I think it accepts completely arbitrary input).  If our
> guide is merely what the command swallows, then we should forgo
> completion for these subcommands, because it's not possible to
> enumerate all possible completions.  I don't think that's a useful
> guide or starting point, so we instead need to discuss what are
> reasonable completions.
> 
> cone-mode works exclusively on directories.  So, in that mode,
> directories are what we want to complete on.  (And if a file is
> specified, cone-mode will treat it as a directory and add expressions
> for including all the files under that "directory", which might be
> confusing.  sparse-checkout doesn't verify it is a diretory, because
> it *might* name a directory in a different branch, including one not
> yet downloaded.  But "might name a directory on another branch" is no
> reason to suggest picking that pathname with completion.)
> 
> In non-cone mode, arbitrary expressions are valid and will be treated
> as gitignore-style expressions.  That again leaves us with either not
> providing completions, or choosing a subset of possible inputs that
> are reasonable suggestions for users.  I prefer the latter, and in
> particular I feel that directories are reasonable suggestions.  In
> contrast, I don't think providing files is helpful, because it
> reinforces the design flaw of non-cone mode.  Non-cone mode has
> quadratic performance baked into its design, and since
> sparse-checkouts are about performance, non-cone mode kind of defeats
> the purpose of the command.

Or about disk space.  Which, because of the potentially significantly
reduced number of files in the work tree can bring along significant
performance benefits, even with quadratic behavior.

>  (In addition to other problems[2].)  So,
> I think non-cone mode should be deprecated and excised.  Patches
> elsewhere are moving in the direction of deprecation already[3], and
> we've already discussed multiple steps we'll likely take soon
> continuing in that direction.  In the meantime, providing just
> directories for completion seems like a good direction to me.
> 
> [1] https://lore.kernel.org/git/CABPp-BG=wr81CPtW1M12xFN_0dyS8mAZjM6o=77LA20Zge8Xng@mail.gmail.com/
> [2] https://lore.kernel.org/git/CABPp-BF=-1aZd=nFHF6spo7Ksa7f7Wb7ervCt0QvtNitMY=ZBA@mail.gmail.com/
> [3] https://lore.kernel.org/git/0af00779128e594aff0ee4ec5378addeac8e88a2.1642175983.git.gitgitgadget@gmail.com/
> ("This mode is harder to use and less performant, and is thus not
> recommended.")
> 
> > and I think 'git sparse-checkout set README.md' is a
> > perfectly reasonable command.
> 
> Reasonable in what sense?  That it makes it (vastly) easier to
> implement the completion and sparse-checkout set|add will swallow it,
> or that it's something that should actually be recommended for users
> doing sparse-checkouts?  While the former certainly holds, I don't
> think the latter does.

I used the following command to create a sparse-checkout from
linux.git to build and play with 'perf':

  git sparse-checkout set tools/perf/ tools/scripts/ tools/build/ tools/include/ tools/arch/x86/ tools/lib/ /.gitignore /.gitattributes

Including the top-level '.gitignore' and '.gitattributes' was
important, becase those ignore object files and specify userdiff.
Now, I wouldn't mind having other files present in the top-level
directory, because there are only a handful of files there.  However,
being able to specify just those two files to 'git sparse-checkout'
was great, because I didn't even have to think about what wildcard
pattern to use, and what negative pattern to use to exclude anything
that might have been included recursively.

I don't remember having any performance issues with it, on the
contrary, I do remember that Git suddenly became much faster that in
the full worktree.

So I'm fairly convinced that specifying files to sparse-checkout is a
feature that can make users' life easier.  It certainly made my life
easier.


On a related note: I just noticed the leading slashes in '/.gitignore'
and '/.gitattributes'.  __git_complete_index_file() is not ready for
that, I'm afraid; but I don't think the proposed patches could deal
with that, either (but I didn't actually try).

It would be great if completion could cope with patterns starting with
'/' and/or '!'.
Elijah Newren Jan. 18, 2022, 11:30 p.m. UTC | #11
On Tue, Jan 18, 2022 at 2:22 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Sat, Jan 15, 2022 at 05:03:24PM -0800, Elijah Newren wrote:
> > On Sat, Jan 15, 2022 at 1:57 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> > >
> > > On Mon, Jan 10, 2022 at 06:59:51PM +0000, Lessley Dennington via GitGitGadget wrote:
...
> > > > 4. A list of directories (but not files) is provided when users enter git
> > > > sparse-checkout add <TAB> or git sparse-checkout set <TAB>.
> > >
> > > Why limit completion only to directories?  Both of those subcommands
> > > accept files,
> >
> > Discussed in part at [1], but let me give a more detailed answer.
>
> It was a semi-rhetorical question.  Whether the reasons for expluding
> files are sound or not, it should be convincingly justified in the
> commit message.

Fair enough.

...
> I used the following command to create a sparse-checkout from
> linux.git to build and play with 'perf':
>
>   git sparse-checkout set tools/perf/ tools/scripts/ tools/build/ tools/include/ tools/arch/x86/ tools/lib/ /.gitignore /.gitattributes
>
> Including the top-level '.gitignore' and '.gitattributes' was
> important, becase those ignore object files and specify userdiff.
> Now, I wouldn't mind having other files present in the top-level
> directory, because there are only a handful of files there.  However,
> being able to specify just those two files to 'git sparse-checkout'
> was great, because I didn't even have to think about what wildcard
> pattern to use, and what negative pattern to use to exclude anything
> that might have been included recursively.

Yeah, we should really make cone mode the default, because then this
simplifies to:

    git sparse-checkout set tools/perf/ tools/scripts/ tools/build/
tools/include/ tools/arch/x86/ tools/lib/

In cone-mode you automatically get the directories specified PLUS:

  * all files in the toplevel directory (you always get these in cone mode)
  * all files directly within a parent (or ancestor) of one of the
directories you specified (thus all files directly under tools/ and
tools/arch/).

> I don't remember having any performance issues with it, on the
> contrary, I do remember that Git suddenly became much faster that in
> the full worktree.

Cool, thanks for testing it out and reporting.

If I can be permitted to share a bit of my experience...

In this specific case, you did keep it to 8 patterns and the linux
kernel "only" has ~60k files.  So O(N*M) isn't too bad here...but
experience at $DAYJOB is that even with only 60k files, the pattern
list tends to grow and it's not clear to users why there are so many
ugly pauses when "Git used to be fast".  Users tend to not make the
connection either between the sparsity patterns and the slowness for
whatever reason, and when I'm asked questions about the slowness, I
may have to check a few other possible causes before finally realizing
that it's just "too many sparsity patterns".

At least, that's the way it was at $DAYJOB before I switched our
internal "sparsify" tool to just use cone mode.  Then that type of
problem went away.

> So I'm fairly convinced that specifying files to sparse-checkout is a
> feature that can make users' life easier.  It certainly made my life
> easier.

Or we can make it even easier and faster by using cone mode.  That
really, really should be the default (as we've been talking about for
a couple cycles now), and I think your email reinforces my belief in
it.  I've got a series ready after some other things merge down to
next.

> On a related note: I just noticed the leading slashes in '/.gitignore'
> and '/.gitattributes'.  __git_complete_index_file() is not ready for
> that, I'm afraid; but I don't think the proposed patches could deal
> with that, either (but I didn't actually try).
>
> It would be great if completion could cope with patterns starting with
> '/' and/or '!'.

Ah, very good point.  The leading '/' is kind of critical here,
otherwise you get all .gitignore files in every directory at every
depth, so completing on files (as currently done) wouldn't really help
at all in your case.


Since you brought up a new good point here, can I also mention two
others not previously covered in this thread (i.e. besides scaling)?

* If we complete file and directory names in non-cone mode, and those
file and directory names contain a leading '#' or '!' or contain
anywhere in their names a '*', '?', '\', '[', or ']', then the users
may be in for a very nasty surprise.  (Which perhaps suggests that in
non-cone mode, we shouldn't provide file OR directory completions for
set/add?)

* sparse-checkout currently ignores prefix[1], causing file and
directory completions to be wrong if the user is running from a
subdirectory.  This is just a bug in cone mode, though I'm not sure if
it's bug or feature in non-cone mode[2].  A decision about what it is
might have a bearing on what kinds of completions make sense in
non-cone mode (and might reinforce files/directories, or suggest
something else.).

[1] https://lore.kernel.org/git/29f0410e-6dfa-2e86-394d-b1fb735e7608@gmail.com/
[2] https://lore.kernel.org/git/CABPp-BH5woi6KY7OBpnsS-M2EmgLHii9zs8rSwrgcPFkOAvn_A@mail.gmail.com/
Lessley Dennington Jan. 22, 2022, 1:07 a.m. UTC | #12
>> I know it is your preference to complete only directories and
>> exclude filenames, but I question if the confusion such a design
>> causes to end-users is worth it.
> 
> I think perhaps we're a little caught up in exemplifying commands that
> are unrelated to sparse-checkout. As Elijah said in [1], the documentation
> states that directories and patterns are acceptable to sparse-checkout but
> not files. While it is not reasonable to try to offer every pattern a user
> could possibly pass to sparse-checkout, it is reasonable to offer
> directories and (in my opinion) will help guide users toward correct usage
> of the command.
> 
> However, since completion on directories is cone-mode-specific, I am
> willing to accept the suggestion to only complete directories if we are in
> a cone-mode sparse-checkout and apply it in v4 of this series.
> 
> [1]: 
> https://lore.kernel.org/git/CABPp-BErg-RtyycXaRXYfQHEQXA4q-FU9Q6nYkSHJsqL-04oXw@mail.gmail.com/ 
> 

In light of non-cone mode being removed in the near future (see [1]), it
actually seems it does not make sense to add different behaviors for cone
mode and non-cone mode. I also ran this by some other contributors, who
thought it would be best to complete on both files and directories so as
not to confuse users (as Junio and Szeder have indicated). So, instead of
differentiating between cone mode and non-cone mode in V4, I will plan to
remove directory completion.

[1]: 
https://lore.kernel.org/git/CABPp-BEwMAPHGt5xD9jDU58grbrAqCdqNY9Nh8UJGLKuLbArXQ@mail.gmail.com/
Lessley Dennington Jan. 22, 2022, 1:08 a.m. UTC | #13
On 1/21/22 5:07 PM, Lessley Dennington wrote:
>>> I know it is your preference to complete only directories and
>>> exclude filenames, but I question if the confusion such a design
>>> causes to end-users is worth it.
>>
>> I think perhaps we're a little caught up in exemplifying commands that
>> are unrelated to sparse-checkout. As Elijah said in [1], the documentation
>> states that directories and patterns are acceptable to sparse-checkout but
>> not files. While it is not reasonable to try to offer every pattern a user
>> could possibly pass to sparse-checkout, it is reasonable to offer
>> directories and (in my opinion) will help guide users toward correct usage
>> of the command.
>>
>> However, since completion on directories is cone-mode-specific, I am
>> willing to accept the suggestion to only complete directories if we are in
>> a cone-mode sparse-checkout and apply it in v4 of this series.
>>
>> [1]: 
>> https://lore.kernel.org/git/CABPp-BErg-RtyycXaRXYfQHEQXA4q-FU9Q6nYkSHJsqL-04oXw@mail.gmail.com/ 
>>
> 
> In light of non-cone mode being removed in the near future (see [1]), it
> actually seems it does not make sense to add different behaviors for cone
> mode and non-cone mode. I also ran this by some other contributors, who
> thought it would be best to complete on both files and directories so as
> not to confuse users (as Junio and Szeder have indicated). So, instead of
> differentiating between cone mode and non-cone mode in V4, I will plan to
> remove directory completion.
> 
> [1]: 
> https://lore.kernel.org/git/CABPp-BEwMAPHGt5xD9jDU58grbrAqCdqNY9Nh8UJGLKuLbArXQ@mail.gmail.com/ 
> 
My apologies, I will not remove directory completion, but rather will
return to completing on both files and directories.
Lessley Dennington Jan. 22, 2022, 2:11 a.m. UTC | #14
On 1/21/22 5:08 PM, Lessley Dennington wrote:
> 
> 
> On 1/21/22 5:07 PM, Lessley Dennington wrote:
>>>> I know it is your preference to complete only directories and
>>>> exclude filenames, but I question if the confusion such a design
>>>> causes to end-users is worth it.
>>>
>>> I think perhaps we're a little caught up in exemplifying commands that
>>> are unrelated to sparse-checkout. As Elijah said in [1], the documentation
>>> states that directories and patterns are acceptable to sparse-checkout but
>>> not files. While it is not reasonable to try to offer every pattern a user
>>> could possibly pass to sparse-checkout, it is reasonable to offer
>>> directories and (in my opinion) will help guide users toward correct usage
>>> of the command.
>>>
>>> However, since completion on directories is cone-mode-specific, I am
>>> willing to accept the suggestion to only complete directories if we are in
>>> a cone-mode sparse-checkout and apply it in v4 of this series.
>>>
>>> [1]: 
>>> https://lore.kernel.org/git/CABPp-BErg-RtyycXaRXYfQHEQXA4q-FU9Q6nYkSHJsqL-04oXw@mail.gmail.com/ 
>>>
>>
>> In light of non-cone mode being removed in the near future (see [1]), it
>> actually seems it does not make sense to add different behaviors for cone
>> mode and non-cone mode. I also ran this by some other contributors, who
>> thought it would be best to complete on both files and directories so as
>> not to confuse users (as Junio and Szeder have indicated). So, instead of
>> differentiating between cone mode and non-cone mode in V4, I will plan to
>> remove directory completion.
>>
>> [1]: 
>> https://lore.kernel.org/git/CABPp-BEwMAPHGt5xD9jDU58grbrAqCdqNY9Nh8UJGLKuLbArXQ@mail.gmail.com/ 
>>
> My apologies, I will not remove directory completion, but rather will
> return to completing on both files and directories.

My apologies again, I think I mistakenly assumed Junio did not want
directory-only completion for cone mode based on this comment from [1]:

If a path that is not a directory (either because it is a file in the
current checkout, or because it is a directory only in a different branch)
is given, it might not make sense in the cone-mode for the current
checkout, but it may well make sense when a different branch is
checked out.

However, this line (which I overlooked when I was re-reading this
evening), leads me to believe that is not the case:

Are we limiting ourselves to directories only when we know we are in
the cone-mode and showing both directories and files otherwise?

It has also been pointed out to me that saying cone mode is going away is
incorrect - it is just being deprecated.

So, I will stick to my original plan of continuing to complete on
directories for cone mode and completing on both files and directories
for non-cone mode.

Again, apologies for the noise.

[1]: https://lore.kernel.org/git/xmqqv8yjz5us.fsf@gitster.g/
diff mbox series

Patch

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index c82ccaebcc7..f478883f51c 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2986,24 +2986,38 @@  _git_show_branch ()
 	__git_complete_revlist
 }
 
+__git_sparse_checkout_subcommand_opts="--cone --no-cone --sparse-index --no-sparse-index"
+
 _git_sparse_checkout ()
 {
-	local subcommands="list init set disable"
+	local subcommands="list init set disable add reapply"
 	local subcommand="$(__git_find_on_cmdline "$subcommands")"
+
 	if [ -z "$subcommand" ]; then
-		__gitcomp "$subcommands"
-		return
+		case "$cur" in
+			--*)
+				__gitcomp "--help"
+				;;
+			*)
+				__gitcomp "$subcommands"
+				;;
+		esac
 	fi
 
-	case "$subcommand,$cur" in
-	init,--*)
-		__gitcomp "--cone"
-		;;
-	set,--*)
-		__gitcomp "--stdin"
-		;;
-	*)
-		;;
+	case "$subcommand" in
+		set)
+			__gitcomp "$__git_sparse_checkout_subcommand_opts --stdin"
+			__gitcomp "$(git ls-tree -d -r HEAD --name-only)"
+			;;
+		add)
+			__gitcomp "--stdin"
+			__gitcomp "$(git ls-tree -d -r HEAD --name-only)"
+			;;
+		init|reapply)
+			__gitcomp "$__git_sparse_checkout_subcommand_opts"
+			;;
+		*)
+			;;
 	esac
 }
 
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 51d0f2d93a1..4dc93ee0595 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1447,7 +1447,7 @@  test_expect_success 'git checkout - with --detach, complete only references' '
 	EOF
 '
 
-test_expect_failure 'sparse-checkout completes subcommands' '
+test_expect_success 'sparse-checkout completes subcommands' '
 	test_completion "git sparse-checkout " <<-\EOF
 	list Z
 	init Z
@@ -1458,13 +1458,13 @@  test_expect_failure 'sparse-checkout completes subcommands' '
 	EOF
 '
 
-test_expect_failure 'sparse-checkout completes options' '
+test_expect_success 'sparse-checkout completes options' '
 	test_completion "git sparse-checkout --" <<-\EOF
 	--help Z
 	EOF
 '
 
-test_expect_failure 'sparse-checkout completes subcommand options' '
+test_expect_success 'sparse-checkout completes subcommand options' '
 	test_completion "git sparse-checkout init --" <<-\EOF &&
 	--cone Z
 	--no-cone Z
@@ -1492,7 +1492,7 @@  test_expect_failure 'sparse-checkout completes subcommand options' '
 	EOF
 '
 
-test_expect_failure 'sparse-checkout completes directory names' '
+test_expect_success 'sparse-checkout completes directory names' '
 	# set up sparse-checkout repo
 	git init sparse-checkout &&
 	(