Message ID | 20240227160253.104011-2-sergiusnyah@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] Subject: [GSOC][RFC PATCH 1/2] Add builtin patterns for JavaScript function detection in userdiff. | expand |
On Tue Feb 27, 2024 at 9:32 PM IST, Sergius Nyah wrote: > [PATCH 1/2] Subject: [GSOC][RFC PATCH 1/2] Add builtin patterns for JavaScript function detection in userdiff. I think these prefixes somehow got added twice. In any case, this makes it so that "Subject: [GSOC][RFC PATCH 1/2]" is part of the commit log, which shoud not be. And the subject can be better written as: userdiff: add builtin patterns for Javascript > This patch adds the regular expression for detecting JavaScript functions and expressions in Git diffs. The pattern accurately identifies function declerations, expressions, and definitions inside blocks. commit message should be wrapped at 72 characters. > --- > userdiff.c | 17 +++++++++++++++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/userdiff.c b/userdiff.c > index e399543823..12e31ff14d 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -1,7 +1,7 @@ > #include "git-compat-util.h" > #include "config.h" > #include "userdiff.h" > -#include "attr.h" > +#include "attr.h" This change adds trailing whitespace which is not desired. > #include "strbuf.h" > > static struct userdiff_driver *drivers; > @@ -183,6 +183,19 @@ PATTERNS("java", > "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?" > "|[-+*/<>%&^|=!]=" > "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"), > +PATTERNS("javascript", > + /* > + * Looks for lines that start with optional whitespace, followed > + * by 'function'* and any characters (for function declarations), > + * or valid JavaScript identifiers, equals sign '=', 'function' keyword > + * and any characters (for function expressions). > + * Also considers functions defined inside blocks with '{...}'. > + */ > + "^[ \t]*(function[ \t]*.*|[a-zA-Z_$][0-9a-zA-Z_$]*[ \t]*=[ \t]*function[ \t]*.*|(\\{[ \t]*)?)\n", > + /* This pattern matches JavaScript identifiers */ > + "[a-zA-Z_$][0-9a-zA-Z_$]*" > + "|[-+0-9.eE]+|0[xX][0-9a-fA-F]+" > + "|[-+*/<>%&^|=!:]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\|"), Majority of the lines above (including comment) have trailing whitespace. Consider removing them. Also, tabs should be used for indentation instead of spaces. (cf. Documentation/CodingGuidelines) > PATTERNS("kotlin", > "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$", > /* -- */ > @@ -192,7 +205,7 @@ PATTERNS("kotlin", > /* integers and floats */ > "|[0-9][0-9_]*([.][0-9_]*)?([Ee][-+]?[0-9]+)?[fFlLuU]*" > /* floating point numbers beginning with decimal point */ > - "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlLuU]?" > + "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+ )?[fFlLuU]?" This adds extra whitespace in the middle. Changes not related to the patch topic should be avoided. Also this was attempted before, see [1]. I think you can take some inspiration and inputs from that also. Aside from that I think the separate patch which adds tests for this can be squashed into this one. [1]: https://lore.kernel.org/git/20220403132508.28196-1-a97410985new@gmail.com/ Thanks.
On Tue, Feb 27, 2024 at 8:06 PM Ghanshyam Thakkar <shyamthakkar001@gmail.com> wrote: > > On Tue Feb 27, 2024 at 9:32 PM IST, Sergius Nyah wrote: > > [PATCH 1/2] Subject: [GSOC][RFC PATCH 1/2] Add builtin patterns for JavaScript function detection in userdiff. > > I think these prefixes somehow got added twice. In any case, this makes > it so that "Subject: [GSOC][RFC PATCH 1/2]" is part of the commit log, > which shoud not be. And the subject can be better written as: > > userdiff: add builtin patterns for Javascript > Noted, thanks! > > This patch adds the regular expression for detecting JavaScript functions and expressions in Git diffs. The pattern accurately identifies function declerations, expressions, and definitions inside blocks. > > commit message should be wrapped at 72 characters. > > --- Thank you for the correction. > > userdiff.c | 17 +++++++++++++++-- > > 1 file changed, 15 insertions(+), 2 deletions(-) > > > > diff --git a/userdiff.c b/userdiff.c > > index e399543823..12e31ff14d 100644 > > --- a/userdiff.c > > +++ b/userdiff.c > > @@ -1,7 +1,7 @@ > > #include "git-compat-util.h" > > #include "config.h" > > #include "userdiff.h" > > -#include "attr.h" > > +#include "attr.h" > > This change adds trailing whitespace which is not desired. > Okay. Would avoid them in subsequent commits. > > #include "strbuf.h" > > > > static struct userdiff_driver *drivers; > > @@ -183,6 +183,19 @@ PATTERNS("java", > > "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?" > > "|[-+*/<>%&^|=!]=" > > "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"), > > +PATTERNS("javascript", > > + /* > > + * Looks for lines that start with optional whitespace, followed > > + * by 'function'* and any characters (for function declarations), > > + * or valid JavaScript identifiers, equals sign '=', 'function' keyword > > + * and any characters (for function expressions). > > + * Also considers functions defined inside blocks with '{...}'. > > + */ > > + "^[ \t]*(function[ \t]*.*|[a-zA-Z_$][0-9a-zA-Z_$]*[ \t]*=[ \t]*function[ \t]*.*|(\\{[ \t]*)?)\n", > > + /* This pattern matches JavaScript identifiers */ > > + "[a-zA-Z_$][0-9a-zA-Z_$]*" > > + "|[-+0-9.eE]+|0[xX][0-9a-fA-F]+" > > + "|[-+*/<>%&^|=!:]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\|"), > > Majority of the lines above (including comment) have trailing whitespace. > Consider removing them. Also, tabs should be used for indentation instead > of spaces. (cf. Documentation/CodingGuidelines) Well received. Thanks! > > > PATTERNS("kotlin", > > "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$", > > /* -- */ > > @@ -192,7 +205,7 @@ PATTERNS("kotlin", > > /* integers and floats */ > > "|[0-9][0-9_]*([.][0-9_]*)?([Ee][-+]?[0-9]+)?[fFlLuU]*" > > /* floating point numbers beginning with decimal point */ > > - "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlLuU]?" > > + "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+ )?[fFlLuU]?" > > This adds extra whitespace in the middle. Changes not related to the patch > topic should be avoided. Sorry about that, I just touched what I shouldn't have. > > Also this was attempted before, see [1]. I think you can take some > inspiration and inputs from that also. Aside from that I think the > separate patch which adds tests for this can be squashed into this > one. > > [1]: > https://lore.kernel.org/git/20220403132508.28196-1-a97410985new@gmail.com/ > > Thanks. Thank you for all the corrections. I'd submit a patch series with the correct practices.
diff --git a/userdiff.c b/userdiff.c index e399543823..12e31ff14d 100644 --- a/userdiff.c +++ b/userdiff.c @@ -1,7 +1,7 @@ #include "git-compat-util.h" #include "config.h" #include "userdiff.h" -#include "attr.h" +#include "attr.h" #include "strbuf.h" static struct userdiff_driver *drivers; @@ -183,6 +183,19 @@ PATTERNS("java", "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?" "|[-+*/<>%&^|=!]=" "|--|\\+\\+|<<=?|>>>?=?|&&|\\|\\|"), +PATTERNS("javascript", + /* + * Looks for lines that start with optional whitespace, followed + * by 'function'* and any characters (for function declarations), + * or valid JavaScript identifiers, equals sign '=', 'function' keyword + * and any characters (for function expressions). + * Also considers functions defined inside blocks with '{...}'. + */ + "^[ \t]*(function[ \t]*.*|[a-zA-Z_$][0-9a-zA-Z_$]*[ \t]*=[ \t]*function[ \t]*.*|(\\{[ \t]*)?)\n", + /* This pattern matches JavaScript identifiers */ + "[a-zA-Z_$][0-9a-zA-Z_$]*" + "|[-+0-9.eE]+|0[xX][0-9a-fA-F]+" + "|[-+*/<>%&^|=!:]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\|"), PATTERNS("kotlin", "^[ \t]*(([a-z]+[ \t]+)*(fun|class|interface)[ \t]+.*)$", /* -- */ @@ -192,7 +205,7 @@ PATTERNS("kotlin", /* integers and floats */ "|[0-9][0-9_]*([.][0-9_]*)?([Ee][-+]?[0-9]+)?[fFlLuU]*" /* floating point numbers beginning with decimal point */ - "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+)?[fFlLuU]?" + "|[.][0-9][0-9_]*([Ee][-+]?[0-9]+ )?[fFlLuU]?" /* unary and binary operators */ "|[-+*/<>%&^|=!]==?|--|\\+\\+|<<=|>>=|&&|\\|\\||->|\\.\\*|!!|[?:.][.:]"), PATTERNS("markdown",