[2/3] mergetool: dissect strings with shell variable magic instead of `expr`

Message ID 2a33ca20af41d68a5bb4e2cf1e5ae32fddf2796c.1560152205.git.j6t@kdbg.org (mailing list archive)
State New, archived
Series Reduce number of processes spawned by git-mergetool

Commit Message

Johannes Sixt June 10, 2019, 8:58 a.m. UTC
git-mergetool spawns an enormous number of processes. For this reason,
the test script, t7610, is exceptionally slow, in particular on
Windows. Most of the processes are invocations of git, but there are
also some that can be replaced with shell builtins. Do so for the
invocations of `expr`.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 git-mergetool.sh | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

Comments

Junio C Hamano June 10, 2019, 5:17 p.m. UTC | #1
Johannes Sixt <j6t@kdbg.org> writes:

> git-mergetool spawns an enormous number of processes. For this reason,
> the test script, t7610, is exceptionally slow, in particular on
> Windows. Most of the processes are invocations of git, but there are
> also some that can be replaced with shell builtins. Do so for the
> invocations of `expr`.

I see these as improvements independent of whatever test may or may
not be slow ;-)  s/^.*but there are/There are/.  Thanks for working
on it.

>  checkout_staged_file () {
> -	tmpfile=$(expr \
> -		"$(git checkout-index --temp --stage="$1" "$2" 2>/dev/null)" \
> -		: '\([^	]*\)	')

So this wants to grab the leading non-HT substring that comes before
an HT; we are trying to grab the name of the temporary file picked by
the checkout-index command, whose output is ".merge_file_XXXXXX"
followed by HT followed by the original filename "$2".

> +	tmpfile="$(git checkout-index --temp --stage="$1" "$2" 2>/dev/null)" &&
> +	tmpfile=${tmpfile%%'	'*}

And this obviously is an equivalent, at least in the successful
case.  The ".merge_file_XXXXXX" temporary filename never has HT in
it, and we are stripping everything after the first HT.
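
The suffix removal described above can be tried in isolation. This is a minimal sketch, assuming a made-up output line of the shape "tempname HT path"; the sample names and the `tab` helper variable are illustrative and not taken from the patch, which uses a literal HT inside single quotes instead.

```shell
# Hypothetical sample line: temporary name, HT, then a path that itself
# happens to contain an HT (the worst case for the split).
tab=$(printf '\t')
out=".merge_file_Ab12Cd${tab}src/hello${tab}world.c"

# %% removes the longest suffix matching "HT*", i.e. everything from the
# first HT onwards.
tmpfile=${out%%"$tab"*}

# A single % removes only the shortest such suffix, i.e. from the last HT.
shortest=${out%"$tab"*}

printf '%s\n' "$tmpfile"
```

With `%%` the result stops at the first HT, so an HT inside the path cannot leak into the temporary name.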

And this rewrite makes the error behaviour much better.  In the
original, the exit code checked in the next "if test $? -eq 0" is
that of "expr" (i.e. does the pattern match?); with this version, we
are looking at the exit status of the checkout-index command.

Good.
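
The difference in the checked exit status can be sketched as follows. `fake_checkout` is a hypothetical stand-in for the checkout-index call that prints a plausible line but returns a chosen status; it is an assumption for illustration only.

```shell
tab=$(printf '\t')

# Hypothetical stand-in: prints "tempname HT path" and exits with status $1.
fake_checkout () {
	printf '.merge_file_XXXXXX\tsome/path\n'
	return "$1"
}

# Old shape: $? is the status of expr, which only reports whether the
# pattern matched, so the failure of the inner command goes unnoticed.
old_tmp=$(expr "$(fake_checkout 1)" : "\\([^$tab]*\\)$tab")
old_status=$?

# New shape: the assignment takes the status of the command substitution,
# and && skips the trimming step when that status is non-zero.
new_tmp=$(fake_checkout 1) && new_tmp=${new_tmp%%"$tab"*}
new_status=$?

printf 'old=%s new=%s\n' "$old_status" "$new_status"
```

Here the old shape reports success while the new shape reports the failure of the simulated command.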

> @@ -255,13 +254,16 @@ merge_file () {
>  		return 1
>  	fi
>  
> -	if BASE=$(expr "$MERGED" : '\(.*\)\.[^/]*$')
> -	then
> -		ext=$(expr "$MERGED" : '.*\(\.[^/]*\)$')
> -	else
> +	# extract file extension from the last path component
> +	case "${MERGED##*/}" in
> +	*.*)
> +		ext=.${MERGED##*.}
> +		BASE=${MERGED%"$ext"}

This rewrite can potentially change the behaviour when $ext has
glob metacharacters.  Wouldn't BASE=${MERGED%.*} be a more faithful
conversion?
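
For ordinary paths the two spellings can be compared side by side. The sketch below wraps the new hunk in a hypothetical helper, `split_ext`, and records the suggested alternative in an extra variable `ALT`; both names and the sample paths are assumptions for illustration.

```shell
# Hypothetical helper: sets BASE and ext as the new hunk does, and ALT
# with the suggested ${MERGED%.*} spelling.
split_ext () {
	MERGED=$1
	# Look for a dot only in the last path component.
	case "${MERGED##*/}" in
	*.*)
		ext=.${MERGED##*.}
		BASE=${MERGED%"$ext"}
		ALT=${MERGED%.*}
		;;
	*)
		BASE=$MERGED
		ext=
		ALT=$MERGED
		;;
	esac
}

for path in src/main.c pkg/data.tar.gz v1.2/README
do
	split_ext "$path"
	printf '%s -> BASE=%s ext=%s ALT=%s\n' "$path" "$BASE" "$ext" "$ALT"
done
```

For these inputs BASE and ALT agree, and a dot that appears only in a directory name does not produce an extension.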

> +		;;
> +	*)
>  		BASE=$MERGED
>  		ext=
> -	fi
> +	esac
> @@ -406,7 +408,7 @@ main () {
>  		-t|--tool*)
>  			case "$#,$1" in
>  			*,*=*)
> -				merge_tool=$(expr "z$1" : 'z-[^=]*=\(.*\)')
> +				merge_tool=${1#*=}

OK, we strip the leading substring up to and including the first '='
from "$1", and the case/esac ensures that there is such an '=' sign
in "$1", so the rewrite is correct.
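
A small sketch of this option parsing, assuming a hypothetical wrapper function `parse_tool` and a made-up tool value; only the case pattern and the expansion come from the hunk.

```shell
# Hypothetical wrapper around the case arm from the hunk.
parse_tool () {
	case "$#,$1" in
	*,*=*)
		# The '#' operator removes the shortest prefix matching "*=",
		# so only the text after the first '=' remains.
		merge_tool=${1#*=}
		;;
	*)
		merge_tool=
		;;
	esac
}

parse_tool --tool=kdiff3
printf '%s\n' "$merge_tool"

# For comparison, the expr form that the hunk replaces.
old_tool=$(expr "z--tool=kdiff3" : 'z-[^=]*=\(.*\)')
```

Because only the shortest prefix is removed, a value that itself contains '=' keeps everything after the first one.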

Looks good.  Thanks.

Johannes Sixt June 10, 2019, 9:34 p.m. UTC | #2
On 10.06.19 at 19:17, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
>> git-mergetool spawns an enormous number of processes. For this reason,
>> the test script, t7610, is exceptionally slow, in particular on
>> Windows. Most of the processes are invocations of git, but there are
>> also some that can be replaced with shell builtins. Do so for the
>> invocations of `expr`.
> 
> I see these as improvements independent of whatever test may or may
> not be slow ;-)  s/^.*but there are/There are/.  Thanks for working
> on it.

Noted.

>> @@ -255,13 +254,16 @@ merge_file () {
>>  		return 1
>>  	fi
>>  
>> -	if BASE=$(expr "$MERGED" : '\(.*\)\.[^/]*$')
>> -	then
>> -		ext=$(expr "$MERGED" : '.*\(\.[^/]*\)$')
>> -	else
>> +	# extract file extension from the last path component
>> +	case "${MERGED##*/}" in
>> +	*.*)
>> +		ext=.${MERGED##*.}
>> +		BASE=${MERGED%"$ext"}
> 
> This rewrite can potentially change the behaviour when $ext has
> glob metacharacters.  Wouldn't BASE=${MERGED%.*} be a more faithful
> conversion?

Since "$ext" is quoted inside the braces of the parameter expansion, the
pattern counts as quoted, so all glob characters in $ext lose their
special meaning. At least that's how I read the spec.

I do see the symmetry in your proposed version. Nevertheless, I have a
slight preference for my version because it specifies exactly what is to
be removed from the end of the value.
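
The effect of the quoting can be checked with a path whose extension contains glob metacharacters. A minimal sketch, assuming a made-up file name; the variables `quoted` and `unquoted` are illustrative only.

```shell
# Made-up path: the part after the last dot is "[ab]*".
MERGED='dir/notes.[ab]*'
ext=.${MERGED##*.}

# Quoted inside the braces: $ext is matched literally.
quoted=${MERGED%"$ext"}

# Unquoted: $ext is interpreted as a pattern, here "a dot, then a or b,
# then anything", which matches no suffix of this particular path.
unquoted=${MERGED%$ext}

printf 'quoted=%s\nunquoted=%s\n' "$quoted" "$unquoted"
```

Only the quoted form removes the extension for this input; the unquoted form leaves the value untouched.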

-- Hannes

Patch

diff --git a/git-mergetool.sh b/git-mergetool.sh
index 88fa6a914a..8a937f680f 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -228,9 +228,8 @@  stage_submodule () {
 }
 
 checkout_staged_file () {
-	tmpfile=$(expr \
-		"$(git checkout-index --temp --stage="$1" "$2" 2>/dev/null)" \
-		: '\([^	]*\)	')
+	tmpfile="$(git checkout-index --temp --stage="$1" "$2" 2>/dev/null)" &&
+	tmpfile=${tmpfile%%'	'*}
 
 	if test $? -eq 0 && test -n "$tmpfile"
 	then
@@ -255,13 +254,16 @@  merge_file () {
 		return 1
 	fi
 
-	if BASE=$(expr "$MERGED" : '\(.*\)\.[^/]*$')
-	then
-		ext=$(expr "$MERGED" : '.*\(\.[^/]*\)$')
-	else
+	# extract file extension from the last path component
+	case "${MERGED##*/}" in
+	*.*)
+		ext=.${MERGED##*.}
+		BASE=${MERGED%"$ext"}
+		;;
+	*)
 		BASE=$MERGED
 		ext=
-	fi
+	esac
 
 	mergetool_tmpdir_init
 
@@ -406,7 +408,7 @@  main () {
 		-t|--tool*)
 			case "$#,$1" in
 			*,*=*)
-				merge_tool=$(expr "z$1" : 'z-[^=]*=\(.*\)')
+				merge_tool=${1#*=}
 				;;
 			1,*)
 				usage ;;