diff mbox series

[v2] feat: add Elixir to supported userdiff languages

Message ID 20191106174844.23482-1-lukasz@niemier.pl (mailing list archive)
State New, archived
Headers show
Series [v2] feat: add Elixir to supported userdiff languages | expand

Commit Message

Łukasz Niemier Nov. 6, 2019, 5:48 p.m. UTC
---
Remove file that landed accidentally in the commit
 t/t4018-diff-funcname.sh | 1 +
 t/t4018/elixir-function  | 5 +++++
 t/t4018/elixir-module    | 9 +++++++++
 userdiff.c               | 7 +++++++
 4 files changed, 22 insertions(+)
 create mode 100644 t/t4018/elixir-function
 create mode 100644 t/t4018/elixir-module

Comments

Johannes Sixt Nov. 6, 2019, 8:39 p.m. UTC | #1
Thank you for your contribution!

Am 06.11.19 um 18:48 schrieb Łukasz Niemier:
> ---

Please use "subsystem: short description" in the subject. For example:

   userdiff: support Elixir

would be sufficient in this case.

Please add your sign-off before the three-dash line so that we know that
you are entitled to publish this patch. See Documentation/SubmittingPatches.

It would be enlightening to know what Elixir is. (I haven't googled it,
yet.) If it were a popular language, I think I would have heard about
it. But it may well be possible that I have lived under a rock for too
long... ;)

> diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
> index 6f5ef0035e..194310377e 100755
> --- a/t/t4018-diff-funcname.sh
> +++ b/t/t4018-diff-funcname.sh
> @@ -31,6 +31,7 @@ diffpatterns="
>  	cpp
>  	csharp
>  	css
> +	elixir
>  	dts
>  	fortran
>  	fountain

This list is sorted, basically, but your addition perturbates the order.

> diff --git a/t/t4018/elixir-function b/t/t4018/elixir-function
> new file mode 100644
> index 0000000000..d452f495a7
> --- /dev/null
> +++ b/t/t4018/elixir-function
> @@ -0,0 +1,5 @@
> +def function(RIGHT, arg) do
> +  # comment
> +  # comment
> +  ChangeMe
> +end
> diff --git a/t/t4018/elixir-module b/t/t4018/elixir-module
> new file mode 100644
> index 0000000000..91a4e7aa20
> --- /dev/null
> +++ b/t/t4018/elixir-module
> @@ -0,0 +1,9 @@
> +defmodule RIGHT do
> +  @moduledoc """
> +  Foo bar
> +  """
> +
> +  def ChangeMe(a) where is_map(a) do
> +    a
> +  end
> +end

The default hunk header pattern picks up lines that begin with a letter
without leading whitespace. The tests that you present here do not show
that the language specific hunk header pattern is better. The default
would have picked up the correct lines. And, in fact, when I remove the
pattern from the code, these tests still pass!

I'm not saying that the pattern is bad; I say that the tests do not show
its worthiness. More tests are needed. For example:

--- 8< ---
defmodule RIGHT do
end
#
#
# ChangeMe; do not pick up 'end' line
--- 8< ---

BTW, I guess that any def, defmodule, etc. as the first word on a line
in the docstring would be picked up incorrectly. Is that a problem?

> diff --git a/userdiff.c b/userdiff.c
> index e187d356f6..31fff34e1e 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -32,6 +32,13 @@ PATTERNS("dts",
>  	 /* Property names and math operators */
>  	 "[a-zA-Z0-9,._+?#-]+"
>  	 "|[-+*/%&^|!~]|>>|<<|&&|\\|\\|"),
> +PATTERNS("elixir",
> +	 "^[ \t]*((def(macro|module|impl|guard|protocol)?p?|test)[ \t].*)$",

Good. A test case that shows that indented def lines are picked up, if
that is the intent, would be appreciated.

> +	 "[a-zA-Z0-9_.]+"
> +	 "|:[a-zA-Z0-9@_]+"
> +	 "|:'a-zA-Z0-9@_]+'"

The opening bracket is missing here.

> +	 "|:\"[a-zA-Z0-9@_]+\""
> +	 "|@[a-zA-Z0-9_]+"),
Would it be an option to collapse all but the first pattern (because I
do not want to start the pattern with an optional part) to

	"[:@]['\"]?[a-zA-Z0-9@_]"

This assumes that @"x1 and @'y2 cannot occur in a syntactically valid
program. Remember: the patterns can be loose; they do not have to
validate the input, but can assume that it is syntactically valid.

Does the language not have any two-character operators, such as '<='?

>  IPATTERN("fortran",
>  	 "!^([C*]|[ \t]*!)\n"
>  	 "!^[ \t]*MODULE[ \t]+PROCEDURE[ \t]\n"
> 

-- Hannes
Łukasz Niemier Nov. 6, 2019, 10:07 p.m. UTC | #2
> Please use "subsystem: short description" in the subject. For example:
> 
>   userdiff: support Elixir
> 
> would be sufficient in this case.

Ok, will fix.

> Please add your sign-off before the three-dash line so that we know that
> you are entitled to publish this patch. See Documentation/SubmittingPatches.

Yeah, I have seen it too late to fix. Will do.

> It would be enlightening to know what Elixir is. (I haven't googled it,
> yet.)

It is language with Ruby-like syntax for BEAM (Erlang virtual machine).

https://elixir-lang.org

> If it were a popular language, I think I would have heard about
> it. But it may well be possible that I have lived under a rock for too
> long... ;)

It is quite popular among few services (Pinterest, Discord, Bleacher Report),
but it is still pretty new, and still need to get a little more spotlight.

> This list is sorted, basically, but your addition perturbates the order.

Oh, sorry, I missed dts.

> The default hunk header pattern picks up lines that begin with a letter
> without leading whitespace. The tests that you present here do not show
> that the language specific hunk header pattern is better. The default
> would have picked up the correct lines. And, in fact, when I remove the
> pattern from the code, these tests still pass!
> 
> I'm not saying that the pattern is bad; I say that the tests do not show
> its worthiness. More tests are needed. For example:
> 
> --- 8< ---
> defmodule RIGHT do
> end
> #
> #
> # ChangeMe; do not pick up 'end' line
> --- 8< —

Yeah, I will provide such

> BTW, I guess that any def, defmodule, etc. as the first word on a line
> in the docstring would be picked up incorrectly. Is that a problem?

No, as this would (almost?) always be the definition of the module/function,
so it is not a problem at all.

>> +	 "|:'a-zA-Z0-9@_]+'"
> 
> The opening bracket is missing here.

Whoops.

> Would it be an option to collapse all but the first pattern (because I
> do not want to start the pattern with an optional part) to
> 
> 	"[:@]['\"]?[a-zA-Z0-9@_]"
> 
> This assumes that @"x1 and @'y2 cannot occur in a syntactically valid
> program.

No, these aren’t valid.

> Remember: the patterns can be loose; they do not have to
> validate the input, but can assume that it is syntactically valid.

Ok

> Does the language not have any two-character operators, such as '<=‚?

It has, I should add them as well (it even has 3 letter operators).

--

Łukasz Niemier
lukasz@niemier.pl
diff mbox series

Patch

diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 6f5ef0035e..194310377e 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -31,6 +31,7 @@  diffpatterns="
 	cpp
 	csharp
 	css
+	elixir
 	dts
 	fortran
 	fountain
diff --git a/t/t4018/elixir-function b/t/t4018/elixir-function
new file mode 100644
index 0000000000..d452f495a7
--- /dev/null
+++ b/t/t4018/elixir-function
@@ -0,0 +1,5 @@ 
+def function(RIGHT, arg) do
+  # comment
+  # comment
+  ChangeMe
+end
diff --git a/t/t4018/elixir-module b/t/t4018/elixir-module
new file mode 100644
index 0000000000..91a4e7aa20
--- /dev/null
+++ b/t/t4018/elixir-module
@@ -0,0 +1,9 @@ 
+defmodule RIGHT do
+  @moduledoc """
+  Foo bar
+  """
+
+  def ChangeMe(a) where is_map(a) do
+    a
+  end
+end
diff --git a/userdiff.c b/userdiff.c
index e187d356f6..31fff34e1e 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -32,6 +32,13 @@  PATTERNS("dts",
 	 /* Property names and math operators */
 	 "[a-zA-Z0-9,._+?#-]+"
 	 "|[-+*/%&^|!~]|>>|<<|&&|\\|\\|"),
+PATTERNS("elixir",
+	 "^[ \t]*((def(macro|module|impl|guard|protocol)?p?|test)[ \t].*)$",
+	 "[a-zA-Z0-9_.]+"
+	 "|:[a-zA-Z0-9@_]+"
+	 "|:'a-zA-Z0-9@_]+'"
+	 "|:\"[a-zA-Z0-9@_]+\""
+	 "|@[a-zA-Z0-9_]+"),
 IPATTERN("fortran",
 	 "!^([C*]|[ \t]*!)\n"
 	 "!^[ \t]*MODULE[ \t]+PROCEDURE[ \t]\n"