diff mbox series

[v5,3/3] completion: handle unusual characters for sparse-checkout

Message ID ddf5e583dd71a729d6fb0513aed9fc4eb6ebbdd7.1643921091.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series completion: sparse-checkout updates | expand

Commit Message

Lessley Dennington Feb. 3, 2022, 8:44 p.m. UTC
From: Lessley Dennington <lessleydennington@gmail.com>

Update the __gitcomp_directories method to de-quote and handle unusual
characters in directory names. Although this initially involved an attempt
to re-use the logic in __git_index_files, this method removed
subdirectories (e.g. folder1/0/ became folder1/), so instead new custom
logic was placed directly in the __gitcomp_directories method.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
---
 contrib/completion/git-completion.bash | 50 +++++++++++++-------------
 t/t9902-completion.sh                  | 28 +++++++++++++++
 2 files changed, 52 insertions(+), 26 deletions(-)

Comments

Elijah Newren Feb. 3, 2022, 11:58 p.m. UTC | #1
On Thu, Feb 3, 2022 at 12:44 PM Lessley Dennington via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Lessley Dennington <lessleydennington@gmail.com>
>
> Update the __gitcomp_directories method to de-quote and handle unusual
> characters in directory names. Although this initially involved an attempt
> to re-use the logic in __git_index_files, this method removed
> subdirectories (e.g. folder1/0/ became folder1/), so instead new custom
> logic was placed directly in the __gitcomp_directories method.
>
> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
> Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
> ---
>  contrib/completion/git-completion.bash | 50 +++++++++++++-------------
>  t/t9902-completion.sh                  | 28 +++++++++++++++
>  2 files changed, 52 insertions(+), 26 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index c5c8df6b6e5..c47e9ce09b2 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -2988,32 +2988,30 @@ _git_show_branch ()
>
>  __gitcomp_directories ()
>  {
> -     local _tmp_dir _tmp_completions
> -
> -     # Get the directory of the current token; this differs from dirname
> -     # in that it keeps up to the final trailing slash.  If no slash found
> -     # that's fine too.
> -     [[ "$cur" =~ .*/ ]]
> -     _tmp_dir=$BASH_REMATCH
> -
> -     # Find possible directory completions, adding trailing '/' characters
> -     _tmp_completions="$(git ls-tree -d --name-only HEAD $_tmp_dir |
> -         sed -e s%$%/%)"
> -
> -     if [[ -n "$_tmp_completions" ]]; then
> -         # There were some directory completions, so find ones that
> -         # start with "$cur", the current token, and put those in COMPREPLY
> -         local i=0 c IFS=$' \t\n'
> -         for c in $_tmp_completions; do
> -             if [[ $c == "$cur"* ]]; then
> -                 COMPREPLY+=("$c")
> -             fi
> -         done
> -     elif [[ "$cur" =~ /$ ]]; then
> -         # No possible further completions any deeper, so assume we're at
> -         # a leaf directory and just consider it complete
> -         __gitcomp_direct_append "$cur "
> -     fi
> +       local _tmp_dir _tmp_completions _found=0
> +
> +       # Get the directory of the current token; this differs from dirname
> +       # in that it keeps up to the final trailing slash.  If no slash found
> +       # that's fine too.
> +       [[ "$cur" =~ .*/ ]]
> +       _tmp_dir=$BASH_REMATCH
> +
> +       # Find possible directory completions, adding trailing '/' characters,
> +       # de-quoting, and handling unusual characters.
> +       while IFS= read -r -d $'\0' c ; do
> +               # If there are directory completions, find ones that start
> +               # with "$cur", the current token, and put those in COMPREPLY
> +               if [[ $c == "$cur"* ]]; then
> +                       COMPREPLY+=("$c/")
> +                       _found=1
> +               fi
> +       done < <(git ls-tree -z -d --name-only HEAD $_tmp_dir)
> +
> +       if [[ $_found == 0 ]] && [[ "$cur" =~ /$ ]]; then
> +               # No possible further completions any deeper, so assume we're at
> +               # a leaf directory and just consider it complete
> +               __gitcomp_direct_append "$cur "
> +       fi

The indentation changes are distracting and make the patch harder to
review.  Could you either remove those, or apply the indentation
changes to patch 2 so that it starts with the right indentation?

I'm slightly surprised that __gitcomp_direct_append handles the
quoting for us, but the testcases below seem to cover it, so that's
cool.

Anyway, looks pretty clever to me; I was worried this was going to
require a much bigger change.

>  }
>
>  _git_sparse_checkout ()
> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
> index b38a7302249..7f63d6057be 100755
> --- a/t/t9902-completion.sh
> +++ b/t/t9902-completion.sh
> @@ -1508,6 +1508,34 @@ test_expect_success 'cone mode sparse-checkout completes directory names' '
>         )
>  '
>
> +# use FUNNYNAMES to avoid running on Windows, which doesn't permit backslashes in paths
> +test_expect_success FUNNYNAMES 'cone mode sparse-checkout completes directory names with special characters' '
> +       # reset sparse-checkout
> +       git -C sparse-checkout sparse-checkout disable &&
> +       (
> +               cd sparse-checkout &&
> +               mkdir "directory with spaces" &&
> +               mkdir "$(printf "directory\twith\ttabs")" &&
> +               mkdir "directory\with\backslashes" &&
> +               mkdir "directory-with-áccent" &&
> +               >"directory with spaces/randomfile" &&
> +               >"$(printf "directory\twith\ttabs")/randomfile" &&
> +               >"directory\with\backslashes/randomfile" &&
> +               >"directory-with-áccent/randomfile" &&
> +               git add . &&
> +               git commit -m "Add directories containing unusual characters" &&
> +               git sparse-checkout set --cone "directory with spaces" \
> +                       "$(printf "directory\twith\ttabs")" "directory\with\backslashes" \
> +                       "directory-with-áccent" &&
> +               test_completion "git sparse-checkout add dir" <<-\EOF
> +               directory with spaces/
> +               directory       with    tabs/
> +               directory\with\backslashes/
> +               directory-with-áccent/
> +               EOF
> +       )
> +'

I'm glad you tested with lots of special characters -- spaces, tabs,
backslashes, and accents.  Newlines might also be nice, but probably
makes the test hard.  Anyway, looks good to me, other than the
indentation change.

> +
>  test_expect_success 'non-cone mode sparse-checkout uses bash completion' '
>         # reset sparse-checkout repo to non-cone mode
>         git -C sparse-checkout sparse-checkout disable &&
> --
> gitgitgadget
Lessley Dennington Feb. 4, 2022, 12:37 a.m. UTC | #2
On 2/3/22 3:58 PM, Elijah Newren wrote:
> On Thu, Feb 3, 2022 at 12:44 PM Lessley Dennington via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Lessley Dennington <lessleydennington@gmail.com>
>>
>> Update the __gitcomp_directories method to de-quote and handle unusual
>> characters in directory names. Although this initially involved an attempt
>> to re-use the logic in __git_index_files, this method removed
>> subdirectories (e.g. folder1/0/ became folder1/), so instead new custom
>> logic was placed directly in the __gitcomp_directories method.
>>
>> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>> Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
>> Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
>> ---
>>   contrib/completion/git-completion.bash | 50 +++++++++++++-------------
>>   t/t9902-completion.sh                  | 28 +++++++++++++++
>>   2 files changed, 52 insertions(+), 26 deletions(-)
>>
>> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
>> index c5c8df6b6e5..c47e9ce09b2 100644
>> --- a/contrib/completion/git-completion.bash
>> +++ b/contrib/completion/git-completion.bash
>> @@ -2988,32 +2988,30 @@ _git_show_branch ()
>>
>>   __gitcomp_directories ()
>>   {
>> -     local _tmp_dir _tmp_completions
>> -
>> -     # Get the directory of the current token; this differs from dirname
>> -     # in that it keeps up to the final trailing slash.  If no slash found
>> -     # that's fine too.
>> -     [[ "$cur" =~ .*/ ]]
>> -     _tmp_dir=$BASH_REMATCH
>> -
>> -     # Find possible directory completions, adding trailing '/' characters
>> -     _tmp_completions="$(git ls-tree -d --name-only HEAD $_tmp_dir |
>> -         sed -e s%$%/%)"
>> -
>> -     if [[ -n "$_tmp_completions" ]]; then
>> -         # There were some directory completions, so find ones that
>> -         # start with "$cur", the current token, and put those in COMPREPLY
>> -         local i=0 c IFS=$' \t\n'
>> -         for c in $_tmp_completions; do
>> -             if [[ $c == "$cur"* ]]; then
>> -                 COMPREPLY+=("$c")
>> -             fi
>> -         done
>> -     elif [[ "$cur" =~ /$ ]]; then
>> -         # No possible further completions any deeper, so assume we're at
>> -         # a leaf directory and just consider it complete
>> -         __gitcomp_direct_append "$cur "
>> -     fi
>> +       local _tmp_dir _tmp_completions _found=0
>> +
>> +       # Get the directory of the current token; this differs from dirname
>> +       # in that it keeps up to the final trailing slash.  If no slash found
>> +       # that's fine too.
>> +       [[ "$cur" =~ .*/ ]]
>> +       _tmp_dir=$BASH_REMATCH
>> +
>> +       # Find possible directory completions, adding trailing '/' characters,
>> +       # de-quoting, and handling unusual characters.
>> +       while IFS= read -r -d $'\0' c ; do
>> +               # If there are directory completions, find ones that start
>> +               # with "$cur", the current token, and put those in COMPREPLY
>> +               if [[ $c == "$cur"* ]]; then
>> +                       COMPREPLY+=("$c/")
>> +                       _found=1
>> +               fi
>> +       done < <(git ls-tree -z -d --name-only HEAD $_tmp_dir)
>> +
>> +       if [[ $_found == 0 ]] && [[ "$cur" =~ /$ ]]; then
>> +               # No possible further completions any deeper, so assume we're at
>> +               # a leaf directory and just consider it complete
>> +               __gitcomp_direct_append "$cur "
>> +       fi
> 
> The indentation changes are distracting and make the patch harder to
> review.  Could you either remove those, or apply the indentation
> changes to patch 2 so that it starts with the right indentation?

I've also corrected this as part of the fix I'm about to submit.
> 
> I'm slightly surprised that __gitcomp_direct_append handles the
> quoting for us, but the testcases below seem to cover it, so that's
> cool.
>
> Anyway, looks pretty clever to me; I was worried this was going to
> require a much bigger change.
> 
>>   }
>>
>>   _git_sparse_checkout ()
>> diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
>> index b38a7302249..7f63d6057be 100755
>> --- a/t/t9902-completion.sh
>> +++ b/t/t9902-completion.sh
>> @@ -1508,6 +1508,34 @@ test_expect_success 'cone mode sparse-checkout completes directory names' '
>>          )
>>   '
>>
>> +# use FUNNYNAMES to avoid running on Windows, which doesn't permit backslashes in paths
>> +test_expect_success FUNNYNAMES 'cone mode sparse-checkout completes directory names with special characters' '
>> +       # reset sparse-checkout
>> +       git -C sparse-checkout sparse-checkout disable &&
>> +       (
>> +               cd sparse-checkout &&
>> +               mkdir "directory with spaces" &&
>> +               mkdir "$(printf "directory\twith\ttabs")" &&
>> +               mkdir "directory\with\backslashes" &&
>> +               mkdir "directory-with-áccent" &&
>> +               >"directory with spaces/randomfile" &&
>> +               >"$(printf "directory\twith\ttabs")/randomfile" &&
>> +               >"directory\with\backslashes/randomfile" &&
>> +               >"directory-with-áccent/randomfile" &&
>> +               git add . &&
>> +               git commit -m "Add directories containing unusual characters" &&
>> +               git sparse-checkout set --cone "directory with spaces" \
>> +                       "$(printf "directory\twith\ttabs")" "directory\with\backslashes" \
>> +                       "directory-with-áccent" &&
>> +               test_completion "git sparse-checkout add dir" <<-\EOF
>> +               directory with spaces/
>> +               directory       with    tabs/
>> +               directory\with\backslashes/
>> +               directory-with-áccent/
>> +               EOF
>> +       )
>> +'
> 
> I'm glad you tested with lots of special characters -- spaces, tabs,
> backslashes, and accents.  Newlines might also be nice, but probably
> makes the test hard.  Anyway, looks good to me, other than the
> indentation change.
> 
>> +
>>   test_expect_success 'non-cone mode sparse-checkout uses bash completion' '
>>          # reset sparse-checkout repo to non-cone mode
>>          git -C sparse-checkout sparse-checkout disable &&
>> --
>> gitgitgadget
Ævar Arnfjörð Bjarmason Feb. 4, 2022, 4:25 a.m. UTC | #3
On Thu, Feb 03 2022, Elijah Newren wrote:

> On Thu, Feb 3, 2022 at 12:44 PM Lessley Dennington via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Lessley Dennington <lessleydennington@gmail.com>
>> +# use FUNNYNAMES to avoid running on Windows, which doesn't permit backslashes in paths
>> +test_expect_success FUNNYNAMES 'cone mode sparse-checkout completes directory names with special characters' '
>> +       # reset sparse-checkout
>> +       git -C sparse-checkout sparse-checkout disable &&
>> +       (
>> +               cd sparse-checkout &&
>> +               mkdir "directory with spaces" &&
>> +               mkdir "$(printf "directory\twith\ttabs")" &&
>> +               mkdir "directory\with\backslashes" &&
>> +               mkdir "directory-with-áccent" &&
>> +               >"directory with spaces/randomfile" &&
>> +               >"$(printf "directory\twith\ttabs")/randomfile" &&
>> +               >"directory\with\backslashes/randomfile" &&
>> +               >"directory-with-áccent/randomfile" &&
>> +               git add . &&
>> +               git commit -m "Add directories containing unusual characters" &&
>> +               git sparse-checkout set --cone "directory with spaces" \
>> +                       "$(printf "directory\twith\ttabs")" "directory\with\backslashes" \
>> +                       "directory-with-áccent" &&
>> +               test_completion "git sparse-checkout add dir" <<-\EOF
>> +               directory with spaces/
>> +               directory       with    tabs/
>> +               directory\with\backslashes/
>> +               directory-with-áccent/
>> +               EOF
>> +       )
>> +'
>
> I'm glad you tested with lots of special characters -- spaces, tabs,
> backslashes, and accents.  Newlines might also be nice, but probably
> makes the test hard.  Anyway, looks good to me, other than the
> indentation change.

This looks like its over-skipping tests, the comment says Windows
doesn't like \ in paths, but there's a lo more being skipped here than
that.

Shouldn't e.g. "directory-with-áccent/" be pulled out into another test,
leaving the directory\with\backslashes/ etc. as non-Windows?

Another useful method is to just try to create those paths
unconditionally, but with "mkdir" etc., and if that works run the test
with git. After all if the OS can create a given path, but we're not
handling it that's our bug.

But if the FS just doesn't support the path we'd catch that in the
"mkdir" (or whatever), which we can assume behaves sanely on that
platform (or we're horribly broken anyway...).
Junio C Hamano Feb. 4, 2022, 9:55 p.m. UTC | #4
"Lessley Dennington via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +test_expect_success FUNNYNAMES 'cone mode sparse-checkout completes directory names with special characters' '
> +	# reset sparse-checkout
> +	git -C sparse-checkout sparse-checkout disable &&
> +	(
> +		cd sparse-checkout &&
> +		mkdir "directory with spaces" &&
> +		mkdir "$(printf "directory\twith\ttabs")" &&
> +		mkdir "directory\with\backslashes" &&
> +		mkdir "directory-with-áccent" &&
> +		>"directory with spaces/randomfile" &&
> +		>"$(printf "directory\twith\ttabs")/randomfile" &&
> +		>"directory\with\backslashes/randomfile" &&
> +		>"directory-with-áccent/randomfile" &&
> +		git add . &&
> +		git commit -m "Add directories containing unusual characters" &&
> +		git sparse-checkout set --cone "directory with spaces" \
> +			"$(printf "directory\twith\ttabs")" "directory\with\backslashes" \
> +			"directory-with-áccent" &&
> +		test_completion "git sparse-checkout add dir" <<-\EOF
> +		directory with spaces/
> +		directory	with	tabs/
> +		directory\with\backslashes/
> +		directory-with-áccent/
> +		EOF
> +	)
> +'

I wonder if we care about the repetition, which can burden anybody
who will later have to change this script.  "directory with spaces"
appears three times in the above (i.e. mkdir uses it, then redirection
to create a file under it uses it, and "sparse-checkout set" uses it,
and then finally the expected output has it.

An obvious way to reduce repetition is to add variables for them,
and because this is in a subshell, we do not have to worry about the
use of extra variables contaminating the namespace for later tests.

	D1="directory with spaces" &&
	D2=$(printf "directory\t\with\ttabs") &&
	D3="directory\with\backslashes" &&
        D4="directory-with-áccent" &&
	for D in "$D1" "$D2" "$D3" "$D4"
	do
		>"$D/samplefile" || return 1
	done &&
	git add . &&
	git commit -m "add many" &&
	git sparse-checkout set --cone "$D1" "$D2" "$D3" "$D4" &&
	test_completion "git sparse-checkout add dir" <<-EOF
	$D1/
	$D2/
	$D3/
	$D4/
	EOF

Having written it, I am undecided.  It does reduce repetition, and
it does reduce the chance to make typo, but did I make it easier to
follow?  I am not sure about the last part.

Thanks.
diff mbox series

Patch

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index c5c8df6b6e5..c47e9ce09b2 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -2988,32 +2988,30 @@  _git_show_branch ()
 
 __gitcomp_directories ()
 {
-     local _tmp_dir _tmp_completions
-
-     # Get the directory of the current token; this differs from dirname
-     # in that it keeps up to the final trailing slash.  If no slash found
-     # that's fine too.
-     [[ "$cur" =~ .*/ ]]
-     _tmp_dir=$BASH_REMATCH
-
-     # Find possible directory completions, adding trailing '/' characters
-     _tmp_completions="$(git ls-tree -d --name-only HEAD $_tmp_dir |
-         sed -e s%$%/%)"
-
-     if [[ -n "$_tmp_completions" ]]; then
-         # There were some directory completions, so find ones that
-         # start with "$cur", the current token, and put those in COMPREPLY
-         local i=0 c IFS=$' \t\n'
-         for c in $_tmp_completions; do
-             if [[ $c == "$cur"* ]]; then
-                 COMPREPLY+=("$c")
-             fi
-         done
-     elif [[ "$cur" =~ /$ ]]; then
-         # No possible further completions any deeper, so assume we're at
-         # a leaf directory and just consider it complete
-         __gitcomp_direct_append "$cur "
-     fi
+	local _tmp_dir _tmp_completions _found=0
+
+	# Get the directory of the current token; this differs from dirname
+	# in that it keeps up to the final trailing slash.  If no slash found
+	# that's fine too.
+	[[ "$cur" =~ .*/ ]]
+	_tmp_dir=$BASH_REMATCH
+
+	# Find possible directory completions, adding trailing '/' characters,
+	# de-quoting, and handling unusual characters.
+	while IFS= read -r -d $'\0' c ; do
+		# If there are directory completions, find ones that start
+		# with "$cur", the current token, and put those in COMPREPLY
+		if [[ $c == "$cur"* ]]; then
+			COMPREPLY+=("$c/")
+			_found=1
+		fi
+	done < <(git ls-tree -z -d --name-only HEAD $_tmp_dir)
+
+	if [[ $_found == 0 ]] && [[ "$cur" =~ /$ ]]; then
+		# No possible further completions any deeper, so assume we're at
+		# a leaf directory and just consider it complete
+		__gitcomp_direct_append "$cur "
+	fi
 }
 
 _git_sparse_checkout ()
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index b38a7302249..7f63d6057be 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1508,6 +1508,34 @@  test_expect_success 'cone mode sparse-checkout completes directory names' '
 	)
 '
 
+# use FUNNYNAMES to avoid running on Windows, which doesn't permit backslashes in paths
+test_expect_success FUNNYNAMES 'cone mode sparse-checkout completes directory names with special characters' '
+	# reset sparse-checkout
+	git -C sparse-checkout sparse-checkout disable &&
+	(
+		cd sparse-checkout &&
+		mkdir "directory with spaces" &&
+		mkdir "$(printf "directory\twith\ttabs")" &&
+		mkdir "directory\with\backslashes" &&
+		mkdir "directory-with-áccent" &&
+		>"directory with spaces/randomfile" &&
+		>"$(printf "directory\twith\ttabs")/randomfile" &&
+		>"directory\with\backslashes/randomfile" &&
+		>"directory-with-áccent/randomfile" &&
+		git add . &&
+		git commit -m "Add directories containing unusual characters" &&
+		git sparse-checkout set --cone "directory with spaces" \
+			"$(printf "directory\twith\ttabs")" "directory\with\backslashes" \
+			"directory-with-áccent" &&
+		test_completion "git sparse-checkout add dir" <<-\EOF
+		directory with spaces/
+		directory	with	tabs/
+		directory\with\backslashes/
+		directory-with-áccent/
+		EOF
+	)
+'
+
 test_expect_success 'non-cone mode sparse-checkout uses bash completion' '
 	# reset sparse-checkout repo to non-cone mode
 	git -C sparse-checkout sparse-checkout disable &&