[v3] userdiff: support Bash

Message ID 6c6b5ed2166ec2c308c53bf87c78b422fdc5084f.camel@engmark.name (mailing list archive)
State New, archived

Commit Message

Victor Engmark Oct. 21, 2020, 11:45 p.m. UTC
Support POSIX, bashism and mixed function declarations, all four
compound command types, trailing comments and mixed whitespace.

Even though Bash allows locale-dependent characters in function names
<https://unix.stackexchange.com/a/245336/3645>, only detect function
names with characters allowed by POSIX.1-2017
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_235>
for simplicity. This should cover the vast majority of use cases, and
produces system-agnostic results.
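
As a rough illustration of the two paragraphs above, the following sketch rebuilds the function-name pattern from the `userdiff.c` hunk of this patch as an extended regular expression and runs a few lines modelled on the `t/t4018/bash-*` fixtures through `grep -E`. The helper name `is_hunk_header` and the variable names are hypothetical, and `grep` under `LC_ALL=C` only approximates the regex engine Git uses, so treat the result as indicative rather than authoritative.

```shell
#!/bin/sh
# Sketch only: approximates the "bash" funcname pattern with grep -E.
# In the C source, \t inside a string is a real tab, so a literal tab is used here.
tab=$(printf '\t')
ws="[ ${tab}]"
name='[a-zA-Z_][a-zA-Z0-9_]*'   # POSIX.1-2017 "Name" characters only

# Same grouping as the concatenated C string literals in the patch.
funcname_re="^${ws}*((${name}${ws}*\\(${ws}*\\))|(function${ws}+${name}((${ws}*\\(${ws}*\\))|(${ws}+)))${ws}*(\\{|\\(\\(?|\\[\\[))"

# Hypothetical helper: succeeds if the line matches the pattern.
is_hunk_header() {
	printf '%s\n' "$1" | LC_ALL=C grep -Eq -- "$funcname_re"
}

# Report which sample lines match.
while IFS= read -r line; do
	if is_hunk_header "$line"; then
		printf 'match:    %s\n' "$line"
	else
		printf 'no match: %s\n' "$line"
	fi
done <<'EOF'
RIGHT() {
function RIGHT(){
RIGHT() ((
    functionInvalidSyntax {
    function InvalidSyntax{
fünction() {
EOF
```

To exercise the real driver rather than this approximation, a `.gitattributes` entry such as `*.sh diff=bash` should select it once the patch is applied.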

Since a word pattern has to be specified, but there is no easy way to
know the default word pattern, treat the default `IFS` characters as
word separators as a starting point. A later patch can improve this.
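
To make the effect of that word pattern concrete, here is a minimal sketch of how a line would be split if every run of characters other than space and tab counts as one word, which is what the `[^ \t]+` pattern in the `userdiff.c` hunk expresses. The helper `split_words` is hypothetical, and `grep -o` is an extension found in GNU and BSD grep rather than a POSIX option, so this is an approximation of what `git diff --word-diff` would see, not a reproduction of Git's own tokenizer.

```shell
#!/bin/sh
# Sketch only: mimic the word pattern "[^ \t]+" with grep -o.
tab=$(printf '\t')

# Hypothetical helper: print each word of the given line on its own line.
split_words() {
	printf '%s\n' "$1" | LC_ALL=C grep -oE -- "[^ ${tab}]+"
}

# Punctuation stays attached to neighbouring characters, because only
# space and tab separate words under this pattern.
split_words "RIGHT() { # Comment"
```

Under this pattern a token such as `RIGHT()` is a single word, so a word diff would flag the whole token as changed even if only one character in it differs.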

Signed-off-by: Victor Engmark <victor@engmark.name>
---
Includes suggestions by Johannes Sixt <j6t@kdbg.org>.
---
 Documentation/gitattributes.txt       |  3 +++
 t/t4018-diff-funcname.sh              |  1 +
 t/t4018/bash-arithmetic-function      |  4 ++++
 t/t4018/bash-bashism-style-compact    |  6 ++++++
 t/t4018/bash-bashism-style-function   |  4 ++++
 t/t4018/bash-bashism-style-whitespace |  4 ++++
 t/t4018/bash-conditional-function     |  4 ++++
 t/t4018/bash-missing-parentheses      |  6 ++++++
 t/t4018/bash-mixed-style-compact      |  4 ++++
 t/t4018/bash-mixed-style-function     |  4 ++++
 t/t4018/bash-nested-functions         |  6 ++++++
 t/t4018/bash-other-characters         |  4 ++++
 t/t4018/bash-posix-style-compact      |  4 ++++
 t/t4018/bash-posix-style-function     |  4 ++++
 t/t4018/bash-posix-style-whitespace   |  4 ++++
 t/t4018/bash-subshell-function        |  4 ++++
 t/t4018/bash-trailing-comment         |  4 ++++
 userdiff.c                            | 21 +++++++++++++++++++++
 18 files changed, 91 insertions(+)
 create mode 100644 t/t4018/bash-arithmetic-function
 create mode 100644 t/t4018/bash-bashism-style-compact
 create mode 100644 t/t4018/bash-bashism-style-function
 create mode 100644 t/t4018/bash-bashism-style-whitespace
 create mode 100644 t/t4018/bash-conditional-function
 create mode 100644 t/t4018/bash-missing-parentheses
 create mode 100644 t/t4018/bash-mixed-style-compact
 create mode 100644 t/t4018/bash-mixed-style-function
 create mode 100644 t/t4018/bash-nested-functions
 create mode 100644 t/t4018/bash-other-characters
 create mode 100644 t/t4018/bash-posix-style-compact
 create mode 100644 t/t4018/bash-posix-style-function
 create mode 100644 t/t4018/bash-posix-style-whitespace
 create mode 100644 t/t4018/bash-subshell-function
 create mode 100644 t/t4018/bash-trailing-comment

Comments

Johannes Sixt Oct. 22, 2020, 6:08 a.m. UTC | #1
On 22.10.20 at 01:45, Victor Engmark wrote:
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 2d0a03715b..5e8a973449 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -802,6 +802,9 @@ patterns are available:
>  
>  - `ada` suitable for source code in the Ada language.
>  
> +- `bash` suitable for source code in the Bourne-Again SHell language.
> +  Covers a superset of POSIX function definitions.

OK. POSIX *shell* function definitions would have been even better, but
I think I can live with this version.

> diff --git a/t/t4018/bash-bashism-style-compact b/t/t4018/bash-bashism-style-compact
> new file mode 100644
> index 0000000000..1ca3126f61
> --- /dev/null
> +++ b/t/t4018/bash-bashism-style-compact
> @@ -0,0 +1,6 @@
> +function RIGHT {
> +    function InvalidSyntax{

Nicely done!

> +        :
> +        echo 'ChangeMe'
> +    }
> +}

> diff --git a/t/t4018/bash-nested-functions b/t/t4018/bash-nested-functions
> new file mode 100644
> index 0000000000..2c9237ead4
> --- /dev/null
> +++ b/t/t4018/bash-nested-functions
> @@ -0,0 +1,6 @@
> +outer() {
> +    RIGHT() {
> +        :
> +        echo 'ChangeMe'
> +    }
> +}

That's another very good addition!

> diff --git a/userdiff.c b/userdiff.c
> index fde02f225b..eb698eaca7 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -23,6 +23,27 @@ IPATTERN("ada",
>  	 "[a-zA-Z][a-zA-Z0-9_]*"
>  	 "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
>  	 "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
> +PATTERNS("bash",
> +	 /* Optional leading indentation */
> +	 "^[ \t]*"
> +	 /* Start of captured text */
> +	 "("
> +	 "("
> +	     /* POSIX identifier with mandatory parentheses */
> +	     "[a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
> +	 "|"
> +	     /* Bashism identifier with optional parentheses */
> +	     "(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+))"
> +	 ")"
> +	 /* Optional whitespace */
> +	 "[ \t]*"
> +	 /* Compound command starting with `{`, `(`, `((` or `[[` */
> +	 "(\\{|\\(\\(?|\\[\\[)"
> +	 /* End of captured text */
> +	 ")",
> +	 /* -- */
> +	 /* Characters not in the default $IFS value */
> +	 "[^ \t]+"),
>  PATTERNS("dts",
>  	 "!;\n"
>  	 "!=\n"
> 

This is very well done. Thank you!

Acked-by: Johannes Sixt <j6t@kdbg.org>

-- Hannes

Junio C Hamano Oct. 22, 2020, 5:30 p.m. UTC | #2
Johannes Sixt <j6t@kdbg.org> writes:

> On 22.10.20 at 01:45, Victor Engmark wrote:
>> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
>> index 2d0a03715b..5e8a973449 100644
>> --- a/Documentation/gitattributes.txt
>> +++ b/Documentation/gitattributes.txt
>> @@ -802,6 +802,9 @@ patterns are available:
>>  
>>  - `ada` suitable for source code in the Ada language.
>>  
>> +- `bash` suitable for source code in the Bourne-Again SHell language.
>> +  Covers a superset of POSIX function definitions.
>
> OK. POSIX *shell* function definitions would have been even better, but
> I think I can live with this version.

I can't, so I'll locally amend ...

> This is very well done. Thank you!
>
> Acked-by: Johannes Sixt <j6t@kdbg.org>

... and with this in the trailer block.

Thanks, both.  Queued.

Patch

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 2d0a03715b..5e8a973449 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -802,6 +802,9 @@  patterns are available:
 
 - `ada` suitable for source code in the Ada language.
 
+- `bash` suitable for source code in the Bourne-Again SHell language.
+  Covers a superset of POSIX function definitions.
+
 - `bibtex` suitable for files with BibTeX coded references.
 
 - `cpp` suitable for source code in the C and C++ languages.
diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 9d07797579..9675bc17db 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -27,6 +27,7 @@  test_expect_success 'setup' '
 
 diffpatterns="
 	ada
+	bash
 	bibtex
 	cpp
 	csharp
diff --git a/t/t4018/bash-arithmetic-function b/t/t4018/bash-arithmetic-function
new file mode 100644
index 0000000000..c0b276cb50
--- /dev/null
+++ b/t/t4018/bash-arithmetic-function
@@ -0,0 +1,4 @@ 
+RIGHT() ((
+
+    ChangeMe = "$x" + "$y"
+))
diff --git a/t/t4018/bash-bashism-style-compact b/t/t4018/bash-bashism-style-compact
new file mode 100644
index 0000000000..1ca3126f61
--- /dev/null
+++ b/t/t4018/bash-bashism-style-compact
@@ -0,0 +1,6 @@ 
+function RIGHT {
+    function InvalidSyntax{
+        :
+        echo 'ChangeMe'
+    }
+}
diff --git a/t/t4018/bash-bashism-style-function b/t/t4018/bash-bashism-style-function
new file mode 100644
index 0000000000..f1de4fa831
--- /dev/null
+++ b/t/t4018/bash-bashism-style-function
@@ -0,0 +1,4 @@ 
+function RIGHT {
+    :
+    echo 'ChangeMe'
+}
diff --git a/t/t4018/bash-bashism-style-whitespace b/t/t4018/bash-bashism-style-whitespace
new file mode 100644
index 0000000000..ade85dd3a5
--- /dev/null
+++ b/t/t4018/bash-bashism-style-whitespace
@@ -0,0 +1,4 @@ 
+	 function 	RIGHT 	( 	) 	{
+
+	    ChangeMe
+	 }
diff --git a/t/t4018/bash-conditional-function b/t/t4018/bash-conditional-function
new file mode 100644
index 0000000000..c5949e829b
--- /dev/null
+++ b/t/t4018/bash-conditional-function
@@ -0,0 +1,4 @@ 
+RIGHT() [[ \
+
+    "$a" > "$ChangeMe"
+]]
diff --git a/t/t4018/bash-missing-parentheses b/t/t4018/bash-missing-parentheses
new file mode 100644
index 0000000000..8c8a05dd7a
--- /dev/null
+++ b/t/t4018/bash-missing-parentheses
@@ -0,0 +1,6 @@ 
+function RIGHT {
+    functionInvalidSyntax {
+        :
+        echo 'ChangeMe'
+    }
+}
diff --git a/t/t4018/bash-mixed-style-compact b/t/t4018/bash-mixed-style-compact
new file mode 100644
index 0000000000..d9364cba67
--- /dev/null
+++ b/t/t4018/bash-mixed-style-compact
@@ -0,0 +1,4 @@ 
+function RIGHT(){
+    :
+    echo 'ChangeMe'
+}
diff --git a/t/t4018/bash-mixed-style-function b/t/t4018/bash-mixed-style-function
new file mode 100644
index 0000000000..555f9b2466
--- /dev/null
+++ b/t/t4018/bash-mixed-style-function
@@ -0,0 +1,4 @@ 
+function RIGHT() {
+
+    ChangeMe
+}
diff --git a/t/t4018/bash-nested-functions b/t/t4018/bash-nested-functions
new file mode 100644
index 0000000000..2c9237ead4
--- /dev/null
+++ b/t/t4018/bash-nested-functions
@@ -0,0 +1,6 @@ 
+outer() {
+    RIGHT() {
+        :
+        echo 'ChangeMe'
+    }
+}
diff --git a/t/t4018/bash-other-characters b/t/t4018/bash-other-characters
new file mode 100644
index 0000000000..a3f390d525
--- /dev/null
+++ b/t/t4018/bash-other-characters
@@ -0,0 +1,4 @@ 
+_RIGHT_0n() {
+
+    ChangeMe
+}
diff --git a/t/t4018/bash-posix-style-compact b/t/t4018/bash-posix-style-compact
new file mode 100644
index 0000000000..045bd2029b
--- /dev/null
+++ b/t/t4018/bash-posix-style-compact
@@ -0,0 +1,4 @@ 
+RIGHT(){
+
+    ChangeMe
+}
diff --git a/t/t4018/bash-posix-style-function b/t/t4018/bash-posix-style-function
new file mode 100644
index 0000000000..a4d144856e
--- /dev/null
+++ b/t/t4018/bash-posix-style-function
@@ -0,0 +1,4 @@ 
+RIGHT() {
+
+    ChangeMe
+}
diff --git a/t/t4018/bash-posix-style-whitespace b/t/t4018/bash-posix-style-whitespace
new file mode 100644
index 0000000000..4d984f0aa4
--- /dev/null
+++ b/t/t4018/bash-posix-style-whitespace
@@ -0,0 +1,4 @@ 
+	 RIGHT 	( 	) 	{
+
+	    ChangeMe
+	 }
diff --git a/t/t4018/bash-subshell-function b/t/t4018/bash-subshell-function
new file mode 100644
index 0000000000..80baa09484
--- /dev/null
+++ b/t/t4018/bash-subshell-function
@@ -0,0 +1,4 @@ 
+RIGHT() (
+
+    ChangeMe=2
+)
diff --git a/t/t4018/bash-trailing-comment b/t/t4018/bash-trailing-comment
new file mode 100644
index 0000000000..f1edbeda31
--- /dev/null
+++ b/t/t4018/bash-trailing-comment
@@ -0,0 +1,4 @@ 
+RIGHT() { # Comment
+
+    ChangeMe
+}
diff --git a/userdiff.c b/userdiff.c
index fde02f225b..eb698eaca7 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -23,6 +23,27 @@  IPATTERN("ada",
 	 "[a-zA-Z][a-zA-Z0-9_]*"
 	 "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
 	 "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
+PATTERNS("bash",
+	 /* Optional leading indentation */
+	 "^[ \t]*"
+	 /* Start of captured text */
+	 "("
+	 "("
+	     /* POSIX identifier with mandatory parentheses */
+	     "[a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
+	 "|"
+	     /* Bashism identifier with optional parentheses */
+	     "(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+))"
+	 ")"
+	 /* Optional whitespace */
+	 "[ \t]*"
+	 /* Compound command starting with `{`, `(`, `((` or `[[` */
+	 "(\\{|\\(\\(?|\\[\\[)"
+	 /* End of captured text */
+	 ")",
+	 /* -- */
+	 /* Characters not in the default $IFS value */
+	 "[^ \t]+"),
 PATTERNS("dts",
 	 "!;\n"
 	 "!=\n"