Message ID | 20220305094004.250570-1-jaydeepjd.8914@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v4] userdiff: add builtin diff driver for Kotlin language. |
On 05.03.22 at 10:40, Jaydeep P Das wrote:
> +<RED>_32<RESET><GREEN>_33<RESET>.find(arr)
> +X<RED>.fill()<RESET><GREEN>.find()<RESET>
> +X<RED>.u<RESET><GREEN>.f<RESET>+1
> +X.u<RED>-<RESET><GREEN>+<RESET>2

Nice move to include these new tests!

> diff --git a/userdiff.c b/userdiff.c
> index 8578cb0d12..24821a0f69 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -168,6 +168,18 @@ PATTERNS("java",
>  	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
>  	 "|[-+*/<>%&^|=!]="
>  	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
> +PATTERNS("kotlin",
> +	 "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$",
> +	 /* -- */
> +	 "[a-zA-Z_][a-zA-Z0-9_]*"
> +	 /* hexadecimal and binary numbers */
> +	 "|0[xXbB][0-9a-fA-F_]+[lLuU]*"
> +	 /* integers and floats */
> +	 "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[fFlL]*"

Good!

> +	 /* method calls */
> +	 "|[.][a-zA-Z()0-9]+"

This matches both .empty() and .125, but only the .5e part of
.5e-3 and only the .find(x part of .find(x/2). Is that intended?

I find the desire to have method calls as an entire token a bit strange.
In other languages, the last expression part is actually split into many
tokens: . find ( x / 2 ).

BTW, I'm in no way saying that this must be changed (personally I do not
care at all as I'm not writing Kotlin), so if you say that is how people
want Kotlin code to be split with --word-diff, I will believe you.

> +	 /* unary and binary operators */
> +	 "|[-+*/<>%&^|=!]?=(=)?|--|\\+\\+|<<?=?|>>?=?|&&?|[|]?\\||\\|->\\*?|\\.\\*"),

Is the part

   "|\\|->\\*?|"

actually meant to be something else? Does Kotlin have the tokens "|->"
and "|->*"?

A final minor nit: There is "|&&?|[|]?\\||" that could just be
"|&&|\\|\\||" (remember: single character operators are matched implicitly).

>  PATTERNS("markdown",
>  	 "^ {0,3}#{1,6}[ \t].*",
>  	 /* -- */

-- Hannes
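The prefix behaviour described in the review above can be checked with a small C sketch. It assumes the alternative is compiled as a POSIX extended regular expression via `regcomp()` with `REG_EXTENDED`; `first_match` and `METHOD_CALL_RE` are hypothetical names chosen for illustration and are not part of Git.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* The "method calls" alternative from the patch, written as a POSIX ERE. */
static const char METHOD_CALL_RE[] = "[.][a-zA-Z()0-9]+";

/*
 * Copy the first match of `pattern` in `text` into `out` (NUL-terminated).
 * Returns the match length, or -1 if nothing matched or it did not fit.
 * Hypothetical helper for illustration only.
 */
static int first_match(const char *pattern, const char *text,
                       char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[1];
    int len = -1;

    /* Assumption: extended syntax, as used for the built-in patterns. */
    int rc = regcomp(&re, pattern, REG_EXTENDED);
    assert(rc == 0);
    (void)rc;

    if (regexec(&re, text, 1, m, 0) == 0) {
        len = (int)(m[0].rm_eo - m[0].rm_so);
        if ((size_t)len < outsz) {
            memcpy(out, text + m[0].rm_so, (size_t)len);
            out[len] = '\0';
        } else {
            len = -1; /* match does not fit into the caller's buffer */
        }
    }
    regfree(&re);
    return len;
}
```

Called on `.empty()` and `.125`, the helper copies the whole input; on `.5e-3` it stops after `.5e`, and on `.find(x/2)` it stops after `.find(x`, because `-` and `/` are not in the bracket expression.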
> > +	 /* method calls */
> > +	 "|[.][a-zA-Z()0-9]+"
>
> This matches both .empty() as well as .125, but only the .5e part of
> .5e-3 and only the .find(x part of .find(x/2). Is that intended?

Oh. It completely slipped my mind. Anyway, that method call regex is
better gone. For matching these, maybe we could just use cpp's regex for
floating-point numbers starting with a decimal point:

   "|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"

Or maybe we can make the current regex for floats and integers a bit
looser:

   "|[0-9.][0-9_.]*([Ee][*-]?[0-9]+)?[FfLl]*"

Which do you think would be better?

> I find the desire to have method calls as an entire token a bit strange.
> In other languages, the last expression part is actually split into many
> tokens: . find ( x / 2 ).
>
> BTW, I'm in no way saying that this must be changed (personally I do not
> care at all as I'm not writing Kotlin), so if you say that is how people
> want Kotlin code to be split with --word-diff, I will believe you.

Yes. The tokenisation does not make sense if it's something like
`X.find(2)`. I think I should remove it.

> > +	 /* unary and binary operators */
> > +	 "|[-+*/<>%&^|=!]?=(=)?|--|\\+\\+|<<?=?|>>?=?|&&?|[|]?\\||\\|->\\*?|\\.\\*"),
>
> Is the part
>
>    "|\\|->\\*?|"
>
> actually meant to be something else? Does Kotlin have the tokens "|->"
> and "|->*"?

Ah, yes. Kotlin does have the "->" operator but not "|->". Also, there
are a few more compound operators like "..", "!!", etc., which I forgot
to add. I will add these in the next patch.

> A final minor nit: There is "|&&?|[|]?\\||" that could just be
> "|&&|\\|\\||" (remember: single character operators are matched implicitly).
>

Yes. Right.

--
Thanks :]
Jaydeep
On 05.03.22 at 20:18, jaydeepjd.8914@gmail.com wrote:
> Maybe we could just use cpp's regex for floating numbers starting with
> decimal point:
>
>    "|\\.[0-9][0-9]*([Ee][-+]?[0-9]+)?[fFlL]?"

Yes, but with '_' permitted after the first digit:

   "|\\.[0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlL]?"

>
> Or maybe, we can make the current regex for floats and integers a bit
> more loose:
>
>    "|[0-9.][0-9_.]*([Ee][*-]?[0-9]+)?[FfLl]*"
> What do you think would be better?

No, that does not work, either, because it splits X.Find into X .F ind.
I fixed that very problem with the cpp driver recently. We do need a
separate alternative for the floating-point numbers that start with a
decimal point.

-- Hannes
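The `X.Find` split mentioned above can be reproduced with a small tokenizer sketch in C. It makes several assumptions: the patterns are POSIX extended regular expressions, tokens are taken as successive leftmost matches, and the trailing `[^[:space:]]` alternative stands in for the implicit single-character matching mentioned earlier in the thread. The names `tokenize`, `LOOSE_RE` and `SEPARATE_RE` are hypothetical, and the patterns are reduced to the alternatives under discussion rather than the full driver.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

#define MAX_TOKEN 64

/* Identifier alternative from the patch. */
#define IDENT  "[a-zA-Z_][a-zA-Z0-9_]*"
/* Assumption: models the implicit "any single non-space character" match. */
#define SINGLE "|[^[:space:]]"

/* Loosened number alternative proposed in the thread (copied verbatim). */
static const char LOOSE_RE[] =
    IDENT "|[0-9.][0-9_.]*([Ee][*-]?[0-9]+)?[FfLl]*" SINGLE;

/* Original number alternative plus a separate one for ".5"-style floats. */
static const char SEPARATE_RE[] =
    IDENT "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[fFlL]*"
    "|\\.[0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlL]?" SINGLE;

/*
 * Split `text` into successive matches of `pattern`.
 * Returns the number of tokens stored in `tokens`.
 * Hypothetical helper for illustration only.
 */
static int tokenize(const char *pattern, const char *text,
                    char tokens[][MAX_TOKEN], int max_tokens)
{
    regex_t re;
    regmatch_t m[1];
    int n = 0;

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    assert(rc == 0);
    (void)rc;

    while (n < max_tokens && *text && regexec(&re, text, 1, m, 0) == 0) {
        size_t len = (size_t)(m[0].rm_eo - m[0].rm_so);
        if (len == 0 || len >= MAX_TOKEN)
            break; /* avoid looping forever or overflowing a token slot */
        memcpy(tokens[n], text + m[0].rm_so, len);
        tokens[n][len] = '\0';
        n++;
        text += m[0].rm_eo; /* continue right after this token */
    }
    regfree(&re);
    return n;
}
```

With `LOOSE_RE`, `X.Find` comes out as `X`, `.F`, `ind`, because the leading `.` is accepted as the start of a number and the trailing `[FfLl]*` then swallows the `F`. With `SEPARATE_RE`, the same input comes out as `X`, `.`, `Find`, while `.5e-3` remains a single token.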
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index a71dad2674..4b36d51beb 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -829,6 +829,8 @@ patterns are available:
 
 - `java` suitable for source code in the Java language.
 
+- `kotlin` suitable for source code in the Kotlin language.
+
 - `markdown` suitable for Markdown documents.
 
 - `matlab` suitable for source code in the MATLAB and Octave languages.
diff --git a/t/t4018/kotlin-class b/t/t4018/kotlin-class
new file mode 100644
index 0000000000..bb864f22e6
--- /dev/null
+++ b/t/t4018/kotlin-class
@@ -0,0 +1,5 @@
+class RIGHT {
+    //comment
+    //comment
+    return ChangeMe
+}
diff --git a/t/t4018/kotlin-enum-class b/t/t4018/kotlin-enum-class
new file mode 100644
index 0000000000..8885f908fd
--- /dev/null
+++ b/t/t4018/kotlin-enum-class
@@ -0,0 +1,5 @@
+enum class RIGHT{
+    // Left
+    // a comment
+    ChangeMe
+}
diff --git a/t/t4018/kotlin-fun b/t/t4018/kotlin-fun
new file mode 100644
index 0000000000..2a60280256
--- /dev/null
+++ b/t/t4018/kotlin-fun
@@ -0,0 +1,5 @@
+fun RIGHT(){
+    //a comment
+    //b comment
+    return ChangeMe()
+}
diff --git a/t/t4018/kotlin-inheritace-class b/t/t4018/kotlin-inheritace-class
new file mode 100644
index 0000000000..77376c1f05
--- /dev/null
+++ b/t/t4018/kotlin-inheritace-class
@@ -0,0 +1,5 @@
+open class RIGHT{
+    // a comment
+    // b comment
+    // ChangeMe
+}
diff --git a/t/t4018/kotlin-inline-class b/t/t4018/kotlin-inline-class
new file mode 100644
index 0000000000..7bf46dd8d4
--- /dev/null
+++ b/t/t4018/kotlin-inline-class
@@ -0,0 +1,5 @@
+value class RIGHT(Args){
+    // a comment
+    // b comment
+    ChangeMe
+}
diff --git a/t/t4018/kotlin-interface b/t/t4018/kotlin-interface
new file mode 100644
index 0000000000..f686ba7770
--- /dev/null
+++ b/t/t4018/kotlin-interface
@@ -0,0 +1,5 @@
+interface RIGHT{
+    //another comment
+    //another comment
+    //ChangeMe
+}
diff --git a/t/t4018/kotlin-nested-fun b/t/t4018/kotlin-nested-fun
new file mode 100644
index 0000000000..12186858cb
--- /dev/null
+++ b/t/t4018/kotlin-nested-fun
@@ -0,0 +1,9 @@
+class LEFT{
+    class CENTER{
+        fun RIGHT( a:Int){
+            //comment
+            //comment
+            ChangeMe
+        }
+    }
+}
diff --git a/t/t4018/kotlin-public-class b/t/t4018/kotlin-public-class
new file mode 100644
index 0000000000..9433fcc226
--- /dev/null
+++ b/t/t4018/kotlin-public-class
@@ -0,0 +1,5 @@
+public class RIGHT{
+    //comment1
+    //comment2
+    ChangeMe
+}
diff --git a/t/t4018/kotlin-sealed-class b/t/t4018/kotlin-sealed-class
new file mode 100644
index 0000000000..0efa4a4eaf
--- /dev/null
+++ b/t/t4018/kotlin-sealed-class
@@ -0,0 +1,5 @@
+sealed class RIGHT {
+    // a comment
+    // b comment
+    ChangeMe
+}
diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index d5abcf4b4c..15764ee9ac 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -324,6 +324,7 @@ test_language_driver dts
 test_language_driver fortran
 test_language_driver html
 test_language_driver java
+test_language_driver kotlin
 test_language_driver matlab
 test_language_driver objc
 test_language_driver pascal
diff --git a/t/t4034/kotlin/expect b/t/t4034/kotlin/expect
new file mode 100644
index 0000000000..228da3e95e
--- /dev/null
+++ b/t/t4034/kotlin/expect
@@ -0,0 +1,39 @@
+<BOLD>diff --git a/pre b/post<RESET>
+<BOLD>index 0c2a5a8..a9e7b41 100644<RESET>
+<BOLD>--- a/pre<RESET>
+<BOLD>+++ b/post<RESET>
+<CYAN>@@ -1,26 +1,26 @@<RESET>
+println("Hello World<RED>!\n<RESET><GREEN>?<RESET>")
+<GREEN>(<RESET>1<GREEN>) (<RESET>-1e10<GREEN>) (<RESET>0xabcdef<GREEN>)<RESET> '<RED>x<RESET><GREEN>y<RESET>'
+[<RED>a<RESET><GREEN>x<RESET>] <RED>a<RESET><GREEN>x<RESET>-><RED>b a.b<RESET><GREEN>y x.y<RESET>
+!<RED>a a<RESET><GREEN>x x<RESET>.inv() <RED>a<RESET><GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>&<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b a<RESET><GREEN>y x<RESET>%<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET><GREEN>y<RESET>
+a <RED>shr<RESET><GREEN>shl<RESET> b
+<RED>a<RESET><GREEN>x<RESET><<RED>b a<RESET><GREEN>y x<RESET><=<RED>b a<RESET><GREEN>y x<RESET>><RED>b a<RESET><GREEN>y x<RESET>>=<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>==<RED>b a<RESET><GREEN>y x<RESET>!=<RED>b a<RESET><GREEN>y x<RESET>===<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET> and <RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>^<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET> or <RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>&&<RED>b a<RESET><GREEN>y x<RESET>||<RED>b<RESET>
+<RED>a<RESET><GREEN>y<RESET>
+<GREEN>x<RESET>=<RED>b a<RESET><GREEN>y x<RESET>+=<RED>b a<RESET><GREEN>y x<RESET>-=<RED>b a<RESET><GREEN>y x<RESET>*=<RED>b a<RESET><GREEN>y x<RESET>/=<RED>b a<RESET><GREEN>y x<RESET>%=<RED>b a<RESET><GREEN>y x<RESET><<=<RED>b a<RESET><GREEN>y x<RESET>>>=<RED>b a<RESET><GREEN>y x<RESET>&=<RED>b a<RESET><GREEN>y x<RESET>^=<RED>b a<RESET><GREEN>y x<RESET>|=<RED>b<RESET><GREEN>y<RESET>
+a<RED>=<RESET><GREEN>+=<RESET>b c<RED>+=<RESET><GREEN>=<RESET>d e<RED>-=<RESET><GREEN><=<RESET>f g<RED>*=<RESET><GREEN>>=<RESET>h i<RED>/=<RESET><GREEN>/<RESET>j k<RED>%=<RESET><GREEN>%<RESET>l m<RED><<=<RESET><GREEN><<<RESET>n o<RED>>>=<RESET><GREEN>>><RESET>p q<RED>&=<RESET><GREEN>&<RESET>r s<RED>^=<RESET><GREEN>^<RESET>t u<RED>|=<RESET><GREEN>|<RESET>v
+a<RED><<=<RESET><GREEN><=<RESET>b
+a<RED>||<RESET><GREEN>|<RESET>b a<RED>&&<RESET><GREEN>&<RESET>b
+<RED>a<RESET><GREEN>x<RESET>,y
+--a<RED>==<RESET><GREEN>!=<RESET>--b
+a++<RED>==<RESET><GREEN>!=<RESET>++b
+<RED>0xFF_EC_DE_5E 0b100_000 100_000<RESET><GREEN>0xFF_E1_DE_5E 0b100_100 200_000<RESET>
+a<RED>==<RESET><GREEN>===<RESET>b
+<RED>_32<RESET><GREEN>_33<RESET>.find(arr)
+X<RED>.fill()<RESET><GREEN>.find()<RESET>
+X<RED>.u<RESET><GREEN>.f<RESET>+1
+X.u<RED>-<RESET><GREEN>+<RESET>2
diff --git a/t/t4034/kotlin/post b/t/t4034/kotlin/post
new file mode 100644
index 0000000000..a9e7b41631
--- /dev/null
+++ b/t/t4034/kotlin/post
@@ -0,0 +1,26 @@
+println("Hello World?")
+(1) (-1e10) (0xabcdef) 'y'
+[x] x->y x.y
+!x x.inv() x*y x&y
+x*y x/y x%y
+x+y x-y
+a shl b
+x<y x<=y x>y x>=y
+x==y x!=y x===y
+x and y
+x^y
+x or y
+x&&y x||y
+x=y x+=y x-=y x*=y x/=y x%=y x<<=y x>>=y x&=y x^=y x|=y
+a+=b c=d e<=f g>=h i/j k%l m<<n o>>p q&r s^t u|v
+a<=b
+a|b a&b
+x,y
+--a!=--b
+a++!=++b
+0xFF_E1_DE_5E 0b100_100 200_000
+a===b
+_33.find(arr)
+X.find()
+X.f+1
+X.u+2
diff --git a/t/t4034/kotlin/pre b/t/t4034/kotlin/pre
new file mode 100644
index 0000000000..0c2a5a82b3
--- /dev/null
+++ b/t/t4034/kotlin/pre
@@ -0,0 +1,26 @@
+println("Hello World!\n")
+1 -1e10 0xabcdef 'x'
+[a] a->b a.b
+!a a.inv() a*b a&b
+a*b a/b a%b
+a+b a-b
+a shr b
+a<b a<=b a>b a>=b
+a==b a!=b a===b
+a and b
+a^b
+a or b
+a&&b a||b
+a=b a+=b a-=b a*=b a/=b a%=b a<<=b a>>=b a&=b a^=b a|=b
+a=b c+=d e-=f g*=h i/=j k%=l m<<=n o>>=p q&=r s^=t u|=v
+a<<=b
+a||b a&&b
+a,y
+--a==--b
+a++==++b
+0xFF_EC_DE_5E 0b100_000 100_000
+a==b
+_32.find(arr)
+X.fill()
+X.u+1
+X.u-2
diff --git a/userdiff.c b/userdiff.c
index 8578cb0d12..24821a0f69 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -168,6 +168,18 @@ PATTERNS("java",
 	 "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
 	 "|[-+*/<>%&^|=!]="
 	 "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"),
+PATTERNS("kotlin",
+	 "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$",
+	 /* -- */
+	 "[a-zA-Z_][a-zA-Z0-9_]*"
+	 /* hexadecimal and binary numbers */
+	 "|0[xXbB][0-9a-fA-F_]+[lLuU]*"
+	 /* integers and floats */
+	 "|[0-9][0-9_.]*([Ee][-+]?[0-9]+)?[fFlL]*"
+	 /* method calls */
+	 "|[.][a-zA-Z()0-9]+"
+	 /* unary and binary operators */
+	 "|[-+*/<>%&^|=!]?=(=)?|--|\\+\\+|<<?=?|>>?=?|&&?|[|]?\\||\\|->\\*?|\\.\\*"),
 PATTERNS("markdown",
 	 "^ {0,3}#{1,6}[ \t].*",
 	 /* -- */
The xfuncname pattern finds func/class declarations
in diffs to display as a hunk header. The word_regex
pattern finds individual tokens in Kotlin code to generate
appropriate diffs.

This patch adds xfuncname regex and word_regex for Kotlin
language.

Signed-off-by: Jaydeep P Das <jaydeepjd.8914@gmail.com>
---
 Documentation/gitattributes.txt |  2 ++
 t/t4018/kotlin-class            |  5 +++++
 t/t4018/kotlin-enum-class       |  5 +++++
 t/t4018/kotlin-fun              |  5 +++++
 t/t4018/kotlin-inheritace-class |  5 +++++
 t/t4018/kotlin-inline-class     |  5 +++++
 t/t4018/kotlin-interface        |  5 +++++
 t/t4018/kotlin-nested-fun       |  9 ++++++++
 t/t4018/kotlin-public-class     |  5 +++++
 t/t4018/kotlin-sealed-class     |  5 +++++
 t/t4034-diff-words.sh           |  1 +
 t/t4034/kotlin/expect           | 39 +++++++++++++++++++++++++++++++++
 t/t4034/kotlin/post             | 26 ++++++++++++++++++++++
 t/t4034/kotlin/pre              | 26 ++++++++++++++++++++++
 userdiff.c                      | 12 ++++++++++
 15 files changed, 155 insertions(+)
 create mode 100644 t/t4018/kotlin-class
 create mode 100644 t/t4018/kotlin-enum-class
 create mode 100644 t/t4018/kotlin-fun
 create mode 100644 t/t4018/kotlin-inheritace-class
 create mode 100644 t/t4018/kotlin-inline-class
 create mode 100644 t/t4018/kotlin-interface
 create mode 100644 t/t4018/kotlin-nested-fun
 create mode 100644 t/t4018/kotlin-public-class
 create mode 100644 t/t4018/kotlin-sealed-class
 create mode 100644 t/t4034/kotlin/expect
 create mode 100644 t/t4034/kotlin/post
 create mode 100644 t/t4034/kotlin/pre
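To illustrate what the xfuncname pattern from the commit message accepts, here is a minimal C sketch that applies it to single lines taken from the t4018 fixtures. It assumes POSIX `regcomp()` with `REG_EXTENDED` and, as far as I understand Git's behaviour, that the text of capture group 1 is what ends up in the hunk header; `hunk_header` and `KOTLIN_FUNCNAME_RE` are hypothetical names and not Git functions.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* xfuncname pattern from the patch; "\t" is a literal tab in the C string. */
static const char KOTLIN_FUNCNAME_RE[] =
    "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$";

/*
 * If `line` (without a trailing newline) matches the pattern, copy capture
 * group 1 into `out` and return 1; otherwise return 0.
 * Hypothetical helper for illustration only.
 */
static int hunk_header(const char *line, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[2];
    int found = 0;

    int rc = regcomp(&re, KOTLIN_FUNCNAME_RE, REG_EXTENDED);
    assert(rc == 0);
    (void)rc;

    if (regexec(&re, line, 2, m, 0) == 0 && m[1].rm_so >= 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < outsz) {
            /* Group 1 excludes the leading indentation. */
            memcpy(out, line + m[1].rm_so, len);
            out[len] = '\0';
            found = 1;
        }
    }
    regfree(&re);
    return found;
}
```

Lines such as `fun RIGHT(){`, `enum class RIGHT{` and an indented `fun RIGHT( a:Int){` match, with the indentation dropped from the captured text. Comment lines and `return ChangeMe()` do not match, since any leading lowercase words must be followed by `fun`, `class` or `interface`.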