userdiff: add builtin driver for kotlin language

Message ID 20220302142608.2754709-2-jaydeepjd.8914@gmail.com (mailing list archive)
State New, archived

Commit Message

Jaydeep Das March 2, 2022, 2:26 p.m. UTC
The xfuncname pattern finds function and class declarations
in diffs to display as the hunk header. The word_regex
pattern splits Kotlin code into individual tokens so that
word diffs are generated appropriately.

This patch adds an xfuncname regex and a word_regex for the
Kotlin language.

Signed-off-by: Jaydeep P Das <jaydeepjd.8914@gmail.com>
---
 Documentation/gitattributes.txt |  2 ++
 t/t4018/kotlin-class            |  5 +++++
 t/t4018/kotlin-enum-class       |  5 +++++
 t/t4018/kotlin-fun              |  5 +++++
 t/t4018/kotlin-inheritace-class |  5 +++++
 t/t4018/kotlin-inline-class     |  5 +++++
 t/t4018/kotlin-interface        |  5 +++++
 t/t4018/kotlin-nested-fun       |  9 +++++++++
 t/t4018/kotlin-public-class     |  5 +++++
 t/t4018/kotlin-sealed-class     |  5 +++++
 t/t4034-diff-words.sh           |  1 +
 t/t4034/kotlin/expect           | 34 +++++++++++++++++++++++++++++++++
 t/t4034/kotlin/post             | 21 ++++++++++++++++++++
 t/t4034/kotlin/pre              | 21 ++++++++++++++++++++
 userdiff.c                      | 10 ++++++++++
 15 files changed, 138 insertions(+)
 create mode 100644 t/t4018/kotlin-class
 create mode 100644 t/t4018/kotlin-enum-class
 create mode 100644 t/t4018/kotlin-fun
 create mode 100644 t/t4018/kotlin-inheritace-class
 create mode 100644 t/t4018/kotlin-inline-class
 create mode 100644 t/t4018/kotlin-interface
 create mode 100644 t/t4018/kotlin-nested-fun
 create mode 100644 t/t4018/kotlin-public-class
 create mode 100644 t/t4018/kotlin-sealed-class
 create mode 100644 t/t4034/kotlin/expect
 create mode 100644 t/t4034/kotlin/post
 create mode 100644 t/t4034/kotlin/pre
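
The xfuncname pattern described in the commit message can be tried outside of git. The sketch below is only an illustration in Python, not part of the patch. `hunk_header` is a hypothetical helper that returns the nearest matching line above a hunk, which simplifies how git picks a hunk header, and Python's `re` module stands in for the POSIX regex engine that git uses. The sample lines are the contents of t/t4018/kotlin-nested-fun.

```python
import re

# xfuncname pattern copied from the patch (C string escapes resolved).
XFUNCNAME = re.compile(r"^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$")

def hunk_header(lines, hunk_start):
    """Return the text captured from the nearest line above hunk_start
    (0-based index) that matches XFUNCNAME, or None if no line matches.
    Hypothetical helper; a simplification of git's behaviour."""
    for line in reversed(lines[:hunk_start]):
        m = XFUNCNAME.match(line)
        if m:
            return m.group(1)
    return None

# Contents of t/t4018/kotlin-nested-fun from the patch.
nested = [
    "class LEFT{",
    "\tclass CENTER{",
    "\t\tfun RIGHT(  a:Int){",
    "\t\t\t//comment",
    "\t\t\t//comment",
    "\t\t\tChangeMe",
    "\t\t}",
    "\t}",
    "}",
]

# Assumption: the hunk begins one line above ChangeMe (index 4).
print(hunk_header(nested, 4))  # fun RIGHT(  a:Int){
```

Under that assumption the innermost declaration, `fun RIGHT`, is reported rather than the enclosing classes, and a line such as `return ChangeMe` is never taken as a header.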

Comments

Johannes Sixt March 2, 2022, 8:18 p.m. UTC | #1
On 02.03.22 at 15:26, Jaydeep P Das wrote:
> diff --git a/t/t4034/kotlin/expect b/t/t4034/kotlin/expect
> new file mode 100644
> index 0000000000..7062b67319
> --- /dev/null
> +++ b/t/t4034/kotlin/expect
> @@ -0,0 +1,34 @@
> +<BOLD>diff --git a/pre b/post<RESET>
> +<BOLD>index 3cfa271..20d26cc 100644<RESET>
> +<BOLD>--- a/pre<RESET>
> +<BOLD>+++ b/post<RESET>
> +<CYAN>@@ -1,21 +1,21 @@<RESET>
> +println("Hello World<RED>!\n<RESET><GREEN>?<RESET>")
> +<GREEN>(<RESET>1<GREEN>) (<RESET>-1e10<GREEN>) (<RESET>0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>y<RESET>'
> +[<RED>a<RESET><GREEN>x<RESET>] <RED>a<RESET><GREEN>x<RESET>-><RED>b a<RESET><GREEN>y x<RESET>.<RED>b<RESET><GREEN>y<RESET>
> +!<RED>a a<RESET><GREEN>x x<RESET>.inv() <RED>a<RESET><GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>&<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b a<RESET><GREEN>y x<RESET>%<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET><GREEN>y<RESET>
> +a <RED>shr<RESET><GREEN>shl<RESET> b
> +<RED>a<RESET><GREEN>x<RESET><<RED>b a<RESET><GREEN>y x<RESET><=<RED>b a<RESET><GREEN>y x<RESET>><RED>b a<RESET><GREEN>y x<RESET>>=<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>==<RED>b a<RESET><GREEN>y x<RESET>!=<RED>b a<RESET><GREEN>y x<RESET>===<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET> and <RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>^<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET> or <RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>&&<RED>b a<RESET><GREEN>y x<RESET>||<RED>b<RESET>
> +<RED>a<RESET><GREEN>y<RESET>
> +<GREEN>x<RESET>=<RED>b a<RESET><GREEN>y x<RESET>+=<RED>b a<RESET><GREEN>y x<RESET>-=<RED>b a<RESET><GREEN>y x<RESET>*=<RED>b a<RESET><GREEN>y x<RESET>/=<RED>b a<RESET><GREEN>y x<RESET>%=<RED>b a<RESET><GREEN>y x<RESET><<=<RED>b a<RESET><GREEN>y x<RESET>>>=<RED>b a<RESET><GREEN>y x<RESET>&=<RED>b a<RESET><GREEN>y x<RESET>^=<RED>b a<RESET><GREEN>y x<RESET>|=<RED>b<RESET><GREEN>y<RESET>
> +a<RED>=<RESET><GREEN>+=<RESET>b c<RED>+=<RESET><GREEN>=<RESET>d e<RED>-=<RESET><GREEN><=<RESET>f g<RED>*=<RESET><GREEN>>=<RESET>h i<RED>/=<RESET><GREEN>/<RESET>j k<RED>%=<RESET><GREEN>%<RESET>l m<RED><<=<RESET><GREEN><<<RESET>n o<RED>>>=<RESET><GREEN>>><RESET>p q<RED>&=<RESET><GREEN>&<RESET>r s<RED>^=<RESET><GREEN>^<RESET>t u<RED>|=<RESET><GREEN>|<RESET>v
> +a<RED><<=<RESET><GREEN><=<RESET>b
> +a<RED>||<RESET><GREEN>|<RESET>b a<RED>&&<RESET><GREEN>&<RESET>b
> +<RED>a<RESET><GREEN>x<RESET>,y
> +--a<RED>==<RESET><GREEN>!=<RESET>--b
> +a++<RED>==<RESET><GREEN>!=<RESET>++b
> +<RED>0xFF_EC_DE_5E 0b100_000 100_000<RESET><GREEN>0xFF_E1_DE_5E 0b100_100 200_000<RESET>

Many of the a->x, b->y changes are redundant IMHO, but they do not hurt.
This looks good.

> diff --git a/userdiff.c b/userdiff.c
> index 8578cb0d12..bb701100c6 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -168,6 +168,16 @@ PATTERNS("java",
>  	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
>  	 "|[-+*/<>%&^|=!]="
>  	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
> +PATTERNS("kotlin",
> +	 "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$",
> +	 /* -- */
> +	 "[_]?[a-zA-Z][a-zA-Z0-9_]*"
> +	 /* hexadecimal and binary numbers */
> +	 "|0[xXbB][0-9a-fA-F_]+[lLuU]*"
> +	 /* integers and floats */
> +	 "|[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*"
> +	 /* unary and binary operators */
> +	 "|[-+*/<>%&^|=!]?=(=)?|--|\\+\\+|<<?=?|>>?=?|&&?|[|]?\\||\\|->\\*?|\\.\\*"),

Some of these sub-expressions match single-character operators, but that
does not hurt.

How many tokens will the word-regex find in the expression X.e+200UL?
Here, .e+200UL is a single token. Also, X.Find consists of the three
tokens X, .F and ind.

It's most easily fixed by requiring a digit before the full stop. But if
floating-point numbers can begin with a full stop, then we need a second
expression that requires a digit after a leading full stop.
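
The two results above can be reproduced with a small Python sketch. It is an approximation, not git's implementation. `tokenize` is a hypothetical helper, and the operator branch of the pattern is omitted. `\S` models the catch-all for single non-space characters that git is assumed to append to built-in word regexes. Taking the longest branch at each position only approximates POSIX leftmost-longest matching.

```python
import re

# Branches of the patch's word regex (C string escapes resolved).
BRANCHES = [
    r"[_]?[a-zA-Z][a-zA-Z0-9_]*",            # identifiers
    r"0[xXbB][0-9a-fA-F_]+[lLuU]*",          # hexadecimal and binary numbers
    r"[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*",  # integers and floats
    r"\S",                                   # assumed catch-all, one character
]

def tokenize(text, branches=BRANCHES):
    """Split text into words, taking the longest branch match at each
    non-space position (hypothetical helper, not git code)."""
    compiled = [re.compile(b) for b in branches]
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace never starts a word
            continue
        ends = [m.end() for m in (c.match(text, pos) for c in compiled) if m]
        end = max(ends, default=pos + 1)
        tokens.append(text[pos:end])
        pos = end
    return tokens

print(tokenize("X.e+200UL"))  # ['X', '.e+200UL']
print(tokenize("X.Find"))     # ['X', '.F', 'ind']
```

In both cases the number branch starts matching at the full stop because `[0-9._]+` accepts a lone `.`, which is why requiring a leading digit removes the problem.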

>  PATTERNS("markdown",
>  	 "^ {0,3}#{1,6}[ \t].*",
>  	 /* -- */

-- Hannes
Jaydeep Das March 3, 2022, 11:41 a.m. UTC | #2
How about modifying the number match regex to:

`[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?

The `[^a-zA-Z]` in the end would make sure to not match
the `.F` in `X.Find`.

Additionally, we can add another regex for matching just
the method calls:

`[.][a-zA-Z()0-9]+`

Both of these changes would make word_regex match 2 tokens in
X.Find() : X and .Find() (Here X can be any valid identifier name)


> How many tokens will the word-regex find in the expression X.e+200UL?
> .e+200UL is a single token.
> It's most easily fixed by requiring a digit before the fullstop. But if
> floatingpoint numbers can begin with a fullstop, then we need a second
> expression that requires a digit after a leading fullstop.

But that syntax would be wrong. I tried making a condition like you said,
but it always ended up breaking something else (like breaking 2.e+200UL
into 2, .e, + and 200UL).

Also, I realized I made a small mistake in the identifier regex.
Both _abc and __abc are valid identifiers. _3432 and __3232 are valid
identifiers too (not numbers).

The previous regex matched only one `_`, so in the next patch,
I plan to implement the following regex:

Identifier: `([_]*[a-zA-Z]|[_]+[0-9]+)[a-zA-Z0-9_]*`

Numbers: `[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]`
(It makes sure that in X.Find, .F is not matched.)

Additionally, an extra regex for method calls:

`[.][a-zA-Z()0-9]+`

What do you think?


Thanks,
Jaydeep.
Ævar Arnfjörð Bjarmason March 3, 2022, 4:54 p.m. UTC | #3
On Thu, Mar 03 2022, Jaydeep Das wrote:

> How about modifying the number match regex to:
>
> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
>
> The `[^a-zA-Z]` in the end would make sure to not match
> the `.F` in `X.Find`.
>
> Additionally, we can add another regex for matching just
> the method calls:
>
> `[.][a-zA-Z()0-9]+`
>
> Both of these changes would make word_regex match 2 tokens in
> X.Find() : X and .Find() (Here X can be any valid identifier name)
>
>
>> How many tokens will the word-regex find in the expression X.e+200UL?
>> .e+200UL is a single token.
>> It's most easily fixed by requiring a digit before the fullstop. But if
>> floatingpoint numbers can begin with a fullstop, then we need a second
>> expression that requires a digit after a leading fullstop.
>
> But that syntax would be wrong. I tried making a condition like you said,
> but it always ended up breaking something else(like breaking 2.e+200UL into 2, .e, + and 200UL)
>
> Also, I realized I did a bit of mistake in the identifier regex.
> Both _abc and __abc are valid identifiers. _3432, __3232 are valid identifiers too.(not numbers)
>
> The previous regex matched only one `_`, so in the next patch,
> I plan to implement the following regex:
>
> Identifier: `([_]*[a-zA-Z]|[_]+[0-9]+)[a-zA-Z0-9_]*`
>
> Numbers: `[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]`
> (It makes sure that in X.Find, .F is not matched )
>
> Additionally, An extra regex for method calls:
>
> `[.][a-zA-Z()0-9]+`
>
> What do you think?

Just a small note on regex syntax: [.] can be handy to escape "." (you
can also use "\\.", but that's arguably not as easy to read).

But there's no reason to use [_]* over just _*.

(Also, I have an in-flight change to userdiff.c that would conflict, but
I wonder if it wouldn't be handy to make the word_regex a "struct
userdiff_funcname". Then we could specify icase flags, which in this
case would make it a lot easier to read).
Junio C Hamano March 3, 2022, 7:47 p.m. UTC | #4
Jaydeep Das <jaydeepjd.8914@gmail.com> writes:

> How about modifying the number match regex to:
>
> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
>
> The `[^a-zA-Z]` in the end would make sure to not match
> the `.F` in `X.Find`.

Do we want to match "foo.F<EOL>"?  If requiring at least one
non-alpha after [fFlLuU]* is OK, then please ignore this message ;-)

Thanks.
Johannes Sixt March 3, 2022, 8:04 p.m. UTC | #5
On 03.03.22 at 12:41, Jaydeep Das wrote:
> How about modifying the number match regex to:
> 
> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
> 
> The `[^a-zA-Z]` in the end would make sure to not match
> the `.F` in `X.Find`.

No, you cannot do that, because then in X.u+1 you have three tokens X
.u+ 1, which you do not want, either.

> Additionally, we can add another regex for matching just
> the method calls:
> 
> `[.][a-zA-Z()0-9]+`
> 
> Both of these changes would make word_regex match 2 tokens in
> X.Find() : X and .Find() (Here X can be any valid identifier name)

Well, you can do that. But I would not do that if it is allowed to have
a blank between the fullstop and a method name.

>> How many tokens will the word-regex find in the expression X.e+200UL?
>> .e+200UL is a single token.
>> It's most easily fixed by requiring a digit before the fullstop. But if
>> floatingpoint numbers can begin with a fullstop, then we need a second
>> expression that requires a digit after a leading fullstop.
> 
> But that syntax would be wrong. I tried making a condition like you said,
> but it always ended up breaking something else(like breaking 2.e+200UL
> into 2, .e, + and 200UL)
> 
> Also, I realized I did a bit of mistake in the identifier regex.
> Both _abc and __abc are valid identifiers. _3432, __3232 are valid
> identifiers too.(not numbers)
> 
> The previous regex matched only one `_`, so in the next patch,
> I plan to implement the following regex:
> 
> Identifier: `([_]*[a-zA-Z]|[_]+[0-9]+)[a-zA-Z0-9_]*`

But then you can use the regex you had in the first round:

   [a-zA-Z_][a-zA-Z0-9_]*

> 
> Numbers: `[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]`
> (It makes sure that in X.Find, .F is not matched )
> 
> Additionally, An extra regex for method calls:
> 
> `[.][a-zA-Z()0-9]+`
> 
> What do you think?

Have a look at the regex in the cpp driver. I think we need something
like this:

  /* integers and floating-point numbers */
  "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[FfLl]*"
  /* floating-point numbers that begin with a decimal point */
  "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[FfLl]*"

Drop the second option if numbers such as .5 are invalid syntax in Kotlin.

-- Hannes
Jaydeep Das March 4, 2022, 12:28 p.m. UTC | #6
On 3/4/22 01:34, Johannes Sixt wrote:
> On 03.03.22 at 12:41, Jaydeep Das wrote:
>> How about modifying the number match regex to:
>>
>> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
>>
>> The `[^a-zA-Z]` in the end would make sure to not match
>> the `.F` in `X.Find`.
  
> No, you cannot do that, because then in X.u+1 you have three tokens X
> .u+ 1, which you do not want, either.

If X is an integer here, then:

In C/C++, 2.f is equivalent to 2.000000.
In Kotlin, however, 2.f is invalid syntax; 2.0f is valid.

So is implementing a proper regex for invalid syntax really
necessary?


> But then you can use the regex you had in the first round:
> 
>     [a-zA-Z_][a-zA-Z0-9_]*

Right. I will change that.
  
>> Numbers: `[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]`
>> (It makes sure that in X.Find, .F is not matched )
>>
>> Additionally, An extra regex for method calls:
>>
>> `[.][a-zA-Z()0-9]+`
>>
>> What do you think?
> 
> Have a look at the regex in the cpp driver. I think we need something
> like this:
> 
>    /* integers and floating-point numbers */
>    "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[FfLl]*"
>    /* floating-point numbers that begin with a decimal point */
>    "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[FfLl]*"


> Drop the second option if numbers such as .5 are invalid syntax in Kotlin.

.5 is valid syntax in Kotlin.

--
Thanks,
Jaydeep.
Johannes Sixt March 4, 2022, 1:59 p.m. UTC | #7
On 04.03.22 at 13:28, Jaydeep Das wrote:
> On 3/4/22 01:34, Johannes Sixt wrote:
>> On 03.03.22 at 12:41, Jaydeep Das wrote:
>>> How about modifying the number match regex to:
>>>
>>> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
>>>
>>> The `[^a-zA-Z]` in the end would make sure to not match
>>> the `.F` in `X.Find`.
>  
>> No, you cannot do that, because then in X.u+1 you have three tokens X
>> .u+ 1, which you do not want, either.
> 
> If X is an integer here, then

No, I mean X literally, i.e., an identifier.

> 
> In C/C++ 2.f is equivalent to 2.000000
> However in Kotlin 2.f is invalid syntax. 2.0f is valid.
> 
> So is implementing a proper regex for invalid syntax really
> necessary?

No, that's not necessary. It can be assumed that invalid syntax does not
occur. For this reason...

>> Have a look at the regex in the cpp driver. I think we need something
>> like this:
>>
>>    /* integers and floating-point numbers */
>>    "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[FfLl]*"

... I propose this loose [0-9_.]* after the first digit, even though it
would match "9.8_7._65"; we can assume that this invalid token will not
occur.

BTW, make that [FfLlUu] near the end.

>>    /* floating-point numbers that begin with a decimal point */
>>    "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[FfLl]*"
> 
> 
>> Drop the second option if numbers such as .5 are invalid syntax in
>> Kotlin.
> .5 is valid syntax in Kotlin.

OK, then we need this second branch, which ensures that there is a digit
after the fullstop.
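
As a rough check of the two number branches discussed here, the following Python sketch tokenizes the earlier problem cases. It rests on several assumptions: the exponent sign class is taken to be `[-+]` as in the original patch, the suffix class is taken to be `[FfLlUu]`, the operator branch is omitted, and `\S` models the single-character catch-all that git is assumed to append. `tokenize` is a hypothetical helper that takes the longest branch at each position to approximate POSIX leftmost-longest matching.

```python
import re

# Word-regex branches under the assumptions named above.
BRANCHES = [
    r"[a-zA-Z_][a-zA-Z0-9_]*",                       # identifiers
    r"0[xXbB][0-9a-fA-F_]+[lLuU]*",                  # hexadecimal and binary numbers
    r"[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[FfLlUu]*",     # numbers starting with a digit
    r"[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[FfLlUu]*",   # numbers starting with a point
    r"\S",                                           # assumed catch-all
]

def tokenize(text, branches=BRANCHES):
    """Split text into words, taking the longest branch match at each
    non-space position (hypothetical helper, not git code)."""
    compiled = [re.compile(b) for b in branches]
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace never starts a word
            continue
        ends = [m.end() for m in (c.match(text, pos) for c in compiled) if m]
        end = max(ends, default=pos + 1)
        tokens.append(text[pos:end])
        pos = end
    return tokens

print(tokenize("X.e+200UL"))  # ['X', '.', 'e', '+', '200UL']
print(tokenize("X.Find"))     # ['X', '.', 'Find']
print(tokenize(".5 2.0f"))    # ['.5', '2.0f']
print(tokenize("9.8_7._65"))  # ['9.8_7._65']
```

The last line shows the deliberate looseness: the invalid token is accepted as one word, which is harmless as long as such input does not occur.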

-- Hannes

Patch

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a71dad2674..4b36d51beb 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -829,6 +829,8 @@  patterns are available:
 
 - `java` suitable for source code in the Java language.
 
+- `kotlin` suitable for source code in the Kotlin language.
+
 - `markdown` suitable for Markdown documents.
 
 - `matlab` suitable for source code in the MATLAB and Octave languages.
diff --git a/t/t4018/kotlin-class b/t/t4018/kotlin-class
new file mode 100644
index 0000000000..bb864f22e6
--- /dev/null
+++ b/t/t4018/kotlin-class
@@ -0,0 +1,5 @@ 
+class RIGHT {
+	//comment
+	//comment
+	return ChangeMe
+}
diff --git a/t/t4018/kotlin-enum-class b/t/t4018/kotlin-enum-class
new file mode 100644
index 0000000000..8885f908fd
--- /dev/null
+++ b/t/t4018/kotlin-enum-class
@@ -0,0 +1,5 @@ 
+enum class RIGHT{
+	// Left
+	// a comment
+	ChangeMe
+}
diff --git a/t/t4018/kotlin-fun b/t/t4018/kotlin-fun
new file mode 100644
index 0000000000..2a60280256
--- /dev/null
+++ b/t/t4018/kotlin-fun
@@ -0,0 +1,5 @@ 
+fun RIGHT(){
+	//a comment
+	//b comment
+    return ChangeMe()
+}
diff --git a/t/t4018/kotlin-inheritace-class b/t/t4018/kotlin-inheritace-class
new file mode 100644
index 0000000000..77376c1f05
--- /dev/null
+++ b/t/t4018/kotlin-inheritace-class
@@ -0,0 +1,5 @@ 
+open class RIGHT{
+	// a comment
+	// b comment
+	// ChangeMe
+}
diff --git a/t/t4018/kotlin-inline-class b/t/t4018/kotlin-inline-class
new file mode 100644
index 0000000000..7bf46dd8d4
--- /dev/null
+++ b/t/t4018/kotlin-inline-class
@@ -0,0 +1,5 @@ 
+value class RIGHT(Args){
+	// a comment
+	// b comment
+	ChangeMe
+}
diff --git a/t/t4018/kotlin-interface b/t/t4018/kotlin-interface
new file mode 100644
index 0000000000..f686ba7770
--- /dev/null
+++ b/t/t4018/kotlin-interface
@@ -0,0 +1,5 @@ 
+interface RIGHT{
+	//another comment
+	//another comment
+	//ChangeMe
+}
diff --git a/t/t4018/kotlin-nested-fun b/t/t4018/kotlin-nested-fun
new file mode 100644
index 0000000000..12186858cb
--- /dev/null
+++ b/t/t4018/kotlin-nested-fun
@@ -0,0 +1,9 @@ 
+class LEFT{
+	class CENTER{
+		fun RIGHT(  a:Int){
+			//comment
+			//comment
+			ChangeMe
+		}
+	}
+}
diff --git a/t/t4018/kotlin-public-class b/t/t4018/kotlin-public-class
new file mode 100644
index 0000000000..9433fcc226
--- /dev/null
+++ b/t/t4018/kotlin-public-class
@@ -0,0 +1,5 @@ 
+public class RIGHT{
+	//comment1
+	//comment2
+	ChangeMe
+}
diff --git a/t/t4018/kotlin-sealed-class b/t/t4018/kotlin-sealed-class
new file mode 100644
index 0000000000..0efa4a4eaf
--- /dev/null
+++ b/t/t4018/kotlin-sealed-class
@@ -0,0 +1,5 @@ 
+sealed class RIGHT {
+	// a comment
+	// b comment
+	ChangeMe
+}
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index d5abcf4b4c..15764ee9ac 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -324,6 +324,7 @@  test_language_driver dts
 test_language_driver fortran
 test_language_driver html
 test_language_driver java
+test_language_driver kotlin
 test_language_driver matlab
 test_language_driver objc
 test_language_driver pascal
diff --git a/t/t4034/kotlin/expect b/t/t4034/kotlin/expect
new file mode 100644
index 0000000000..7062b67319
--- /dev/null
+++ b/t/t4034/kotlin/expect
@@ -0,0 +1,34 @@ 
+<BOLD>diff --git a/pre b/post<RESET>
+<BOLD>index 3cfa271..20d26cc 100644<RESET>
+<BOLD>--- a/pre<RESET>
+<BOLD>+++ b/post<RESET>
+<CYAN>@@ -1,21 +1,21 @@<RESET>
+println("Hello World<RED>!\n<RESET><GREEN>?<RESET>")
+<GREEN>(<RESET>1<GREEN>) (<RESET>-1e10<GREEN>) (<RESET>0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>y<RESET>'
+[<RED>a<RESET><GREEN>x<RESET>] <RED>a<RESET><GREEN>x<RESET>-><RED>b a<RESET><GREEN>y x<RESET>.<RED>b<RESET><GREEN>y<RESET>
+!<RED>a a<RESET><GREEN>x x<RESET>.inv() <RED>a<RESET><GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>&<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b a<RESET><GREEN>y x<RESET>%<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET><GREEN>y<RESET>
+a <RED>shr<RESET><GREEN>shl<RESET> b
+<RED>a<RESET><GREEN>x<RESET><<RED>b a<RESET><GREEN>y x<RESET><=<RED>b a<RESET><GREEN>y x<RESET>><RED>b a<RESET><GREEN>y x<RESET>>=<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>==<RED>b a<RESET><GREEN>y x<RESET>!=<RED>b a<RESET><GREEN>y x<RESET>===<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET> and <RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>^<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET> or <RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>&&<RED>b a<RESET><GREEN>y x<RESET>||<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>=<RED>b a<RESET><GREEN>y x<RESET>+=<RED>b a<RESET><GREEN>y x<RESET>-=<RED>b a<RESET><GREEN>y x<RESET>*=<RED>b a<RESET><GREEN>y x<RESET>/=<RED>b a<RESET><GREEN>y x<RESET>%=<RED>b a<RESET><GREEN>y x<RESET><<=<RED>b a<RESET><GREEN>y x<RESET>>>=<RED>b a<RESET><GREEN>y x<RESET>&=<RED>b a<RESET><GREEN>y x<RESET>^=<RED>b a<RESET><GREEN>y x<RESET>|=<RED>b<RESET><GREEN>y<RESET>
+a<RED>=<RESET><GREEN>+=<RESET>b c<RED>+=<RESET><GREEN>=<RESET>d e<RED>-=<RESET><GREEN><=<RESET>f g<RED>*=<RESET><GREEN>>=<RESET>h i<RED>/=<RESET><GREEN>/<RESET>j k<RED>%=<RESET><GREEN>%<RESET>l m<RED><<=<RESET><GREEN><<<RESET>n o<RED>>>=<RESET><GREEN>>><RESET>p q<RED>&=<RESET><GREEN>&<RESET>r s<RED>^=<RESET><GREEN>^<RESET>t u<RED>|=<RESET><GREEN>|<RESET>v
+a<RED><<=<RESET><GREEN><=<RESET>b
+a<RED>||<RESET><GREEN>|<RESET>b a<RED>&&<RESET><GREEN>&<RESET>b
+<RED>a<RESET><GREEN>x<RESET>,y
+--a<RED>==<RESET><GREEN>!=<RESET>--b
+a++<RED>==<RESET><GREEN>!=<RESET>++b
+<RED>0xFF_EC_DE_5E 0b100_000 100_000<RESET><GREEN>0xFF_E1_DE_5E 0b100_100 200_000<RESET>
diff --git a/t/t4034/kotlin/post b/t/t4034/kotlin/post
new file mode 100644
index 0000000000..20d26cca5f
--- /dev/null
+++ b/t/t4034/kotlin/post
@@ -0,0 +1,21 @@ 
+println("Hello World?")
+(1) (-1e10) (0xabcdef) 'y'
+[x] x->y x.y
+!x x.inv() x*y x&y
+x*y x/y x%y
+x+y x-y
+a shl b
+x<y x<=y x>y x>=y
+x==y x!=y x===y
+x and y
+x^y
+x or y
+x&&y x||y
+x=y x+=y x-=y x*=y x/=y x%=y x<<=y x>>=y x&=y x^=y x|=y
+a+=b c=d e<=f g>=h i/j k%l m<<n o>>p q&r s^t u|v
+a<=b
+a|b a&b
+x,y
+--a!=--b
+a++!=++b
+0xFF_E1_DE_5E 0b100_100 200_000
diff --git a/t/t4034/kotlin/pre b/t/t4034/kotlin/pre
new file mode 100644
index 0000000000..3cfa271e37
--- /dev/null
+++ b/t/t4034/kotlin/pre
@@ -0,0 +1,21 @@ 
+println("Hello World!\n")
+1 -1e10 0xabcdef 'x'
+[a] a->b a.b
+!a a.inv() a*b a&b
+a*b a/b a%b
+a+b a-b
+a shr b
+a<b a<=b a>b a>=b
+a==b a!=b a===b
+a and b
+a^b
+a or b
+a&&b a||b
+a=b a+=b a-=b a*=b a/=b a%=b a<<=b a>>=b a&=b a^=b a|=b
+a=b c+=d e-=f g*=h i/=j k%=l m<<=n o>>=p q&=r s^=t u|=v
+a<<=b
+a||b a&&b
+a,y
+--a==--b
+a++==++b
+0xFF_EC_DE_5E 0b100_000 100_000
diff --git a/userdiff.c b/userdiff.c
index 8578cb0d12..bb701100c6 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -168,6 +168,16 @@  PATTERNS("java",
 	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
 	 "|[-+*/<>%&^|=!]="
 	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
+PATTERNS("kotlin",
+	 "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$",
+	 /* -- */
+	 "[_]?[a-zA-Z][a-zA-Z0-9_]*"
+	 /* hexadecimal and binary numbers */
+	 "|0[xXbB][0-9a-fA-F_]+[lLuU]*"
+	 /* integers and floats */
+	 "|[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*"
+	 /* unary and binary operators */
+	 "|[-+*/<>%&^|=!]?=(=)?|--|\\+\\+|<<?=?|>>?=?|&&?|[|]?\\||\\|->\\*?|\\.\\*"),
 PATTERNS("markdown",
 	 "^ {0,3}#{1,6}[ \t].*",
 	 /* -- */