Message ID | 20190511041331.51642-1-liboxuan@connect.hku.hk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] userdiff.c & doc/gitattributes.txt: add Octave | expand |
Boxuan Li <liboxuan@connect.hku.hk> writes: > Octave pattern is almost the same as matlab. Besides, > octave also uses '%%%' or '##' to begin code sections. My reading of the above hiccupped at around "Besides, octave also uses...". Checking the differences in patterns, I think ... the same as matlab, except that '%%%' and '##' can also be used to begin code sections, in addition to '%%' that is understood by both. may be easier to read. It makes it clear that you'd want to stay away from %%% and ## if you want to be compatible. Thanks. > Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk> > --- > Update from v2: fix indentation > --- > Documentation/gitattributes.txt | 2 ++ > userdiff.c | 5 +++++ > 2 files changed, 7 insertions(+) > > diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt > index 4fb20cd0e9..45374c7dd3 100644 > --- a/Documentation/gitattributes.txt > +++ b/Documentation/gitattributes.txt > @@ -821,6 +821,8 @@ patterns are available: > > - `matlab` suitable for source code in the MATLAB language. > > +- `octave` suitable for source code in the Octave language. > + > - `objc` suitable for source code in the Objective-C language. > > - `pascal` suitable for source code in the Pascal/Delphi language. > diff --git a/userdiff.c b/userdiff.c > index 3a78fbf504..7d07b82116 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -60,6 +60,11 @@ PATTERNS("java", > PATTERNS("matlab", > "^[[:space:]]*((classdef|function)[[:space:]].*)$|^%%[[:space:]].*$", > "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > +PATTERNS("octave", > + /* Mostly the same as matlab. In addition, Octave > + * supports '##' and '%%%' for code sections */ > + "^[[:space:]]*((classdef|function)[[:space:]].*)$|^(%%%?|##)[[:space:]].*$", > + "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > PATTERNS("objc", > /* Negate C statements that can look like functions */ > "!^[ \t]*(do|for|if|else|return|switch|while)\n"
Am 11.05.19 um 06:13 schrieb Boxuan Li: > Octave pattern is almost the same as matlab. Besides, > octave also uses '%%%' or '##' to begin code sections. > > @@ -60,6 +60,11 @@ PATTERNS("java", > PATTERNS("matlab", > "^[[:space:]]*((classdef|function)[[:space:]].*)$|^%%[[:space:]].*$", > "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > +PATTERNS("octave", > + /* Mostly the same as matlab. In addition, Octave > + * supports '##' and '%%%' for code sections */ > + "^[[:space:]]*((classdef|function)[[:space:]].*)$|^(%%%?|##)[[:space:]].*$", > + "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > PATTERNS("objc", > /* Negate C statements that can look like functions */ > "!^[ \t]*(do|for|if|else|return|switch|while)\n" > In Matlab, are %%% and ## valid syntax? If not, instead of introducing a new language, please just extend the Matlab rule to treat %%% and ## as you need for Octave and mark your Octave files as Matlab. -- Hannes
On Wed, May 15, 2019 at 1:57 PM Johannes Sixt <j6t@kdbg.org> wrote: > > Am 11.05.19 um 06:13 schrieb Boxuan Li: > > Octave pattern is almost the same as matlab. Besides, > > octave also uses '%%%' or '##' to begin code sections. > > > > > @@ -60,6 +60,11 @@ PATTERNS("java", > > PATTERNS("matlab", > > "^[[:space:]]*((classdef|function)[[:space:]].*)$|^%%[[:space:]].*$", > > "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > > +PATTERNS("octave", > > + /* Mostly the same as matlab. In addition, Octave > > + * supports '##' and '%%%' for code sections */ > > + "^[[:space:]]*((classdef|function)[[:space:]].*)$|^(%%%?|##)[[:space:]].*$", > > + "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), > > PATTERNS("objc", > > /* Negate C statements that can look like functions */ > > "!^[ \t]*(do|for|if|else|return|switch|while)\n" > > > > In Matlab, are %%% and ## valid syntax? If not, instead of introducing a > new language, please just extend the Matlab rule to treat %%% and ## as > you need for Octave and mark your Octave files as Matlab. > > -- Hannes Hi Hannes, '##' is not valid syntax in Matlab scripts. '%%%' is valid syntax in Matlab. However, it is not used as a section divider. ref: https://www.mathworks.com/help/matlab/matlab_prog/run-sections-of-programs.html ref: https://octave.org/doc/interpreter/Sections.html Best regards, Boxuan
Am 15.05.19 um 08:15 schrieb LI, BO XUAN: > On Wed, May 15, 2019 at 1:57 PM Johannes Sixt <j6t@kdbg.org> wrote: >> >> Am 11.05.19 um 06:13 schrieb Boxuan Li: >>> Octave pattern is almost the same as matlab. Besides, >>> octave also uses '%%%' or '##' to begin code sections. >>> >> >>> @@ -60,6 +60,11 @@ PATTERNS("java", >>> PATTERNS("matlab", >>> "^[[:space:]]*((classdef|function)[[:space:]].*)$|^%%[[:space:]].*$", >>> "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), >>> +PATTERNS("octave", >>> + /* Mostly the same as matlab. In addition, Octave >>> + * supports '##' and '%%%' for code sections */ >>> + "^[[:space:]]*((classdef|function)[[:space:]].*)$|^(%%%?|##)[[:space:]].*$", >>> + "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), >>> PATTERNS("objc", >>> /* Negate C statements that can look like functions */ >>> "!^[ \t]*(do|for|if|else|return|switch|while)\n" >>> >> >> In Matlab, are %%% and ## valid syntax? If not, instead of introducing a >> new language, please just extend the Matlab rule to treat %%% and ## as >> you need for Octave and mark your Octave files as Matlab. > > '##' is not valid syntax in Matlab scripts. > > '%%%' is valid syntax in Matlab. However, it is not used as a section divider. In Matlab, is %%% followed by space at the beginning of a line *commonly* used for something different? If I were to make a guess, I would say no. If I'm right, it does not hurt to merge the Octave rules into the Matlab rules. -- Hannes
Johannes Sixt <j6t@kdbg.org> writes: > In Matlab, is %%% followed by space at the beginning of a line > *commonly* used for something different? If I were to make a guess, I > would say no. If I'm right, it does not hurt to merge the Octave rules > into the Matlab rules. That is true because we are not syntax-aware and error-highlighting text editor. If we were, I'd suspect that your stance may probably be different. But instead we apply these patterns to a program that is assumed to be correctly written. And from that point of view, I agree with you that it would not hurt to make the existing patterns for Matlab slightly more receptive so that a correctly written programs in either language would be matched appropriately. But would it hurt to have two similar entries, with a clear description on how they are different, in our code there, given how infrequently individual entries have historically been updated?
Am 16.05.19 um 11:19 schrieb Junio C Hamano: > Johannes Sixt <j6t@kdbg.org> writes: > >> In Matlab, is %%% followed by space at the beginning of a line >> *commonly* used for something different? If I were to make a guess, I >> would say no. If I'm right, it does not hurt to merge the Octave rules >> into the Matlab rules. > > That is true because we are not syntax-aware and error-highlighting > text editor. If we were, I'd suspect that your stance may probably > be different. But instead we apply these patterns to a program that > is assumed to be correctly written. Correct! > And from that point of view, I agree with you that it would not hurt > to make the existing patterns for Matlab slightly more receptive so > that a correctly written programs in either language would be matched > appropriately. > > But would it hurt to have two similar entries, with a clear > description on how they are different, in our code there, given how > infrequently individual entries have historically been updated? Would it hurt? Probably not. Should we open the door for everybody's favorite dialect? Probably not, either. See, we do not even have separate entries for C and C++. Heck, we don't have an XML entry, because the HTML entry is good enough, and there are probably many more XML users than Matlab and Octave users together. I'd prefer to keep this list at the minimum necessary as long as it is hard-coded in C. I would take a different stance if this were some configuration file that we ship. -- Hannes
Johannes Sixt <j6t@kdbg.org> writes: > I'd prefer to keep this list at the minimum necessary as long as it is > hard-coded in C. Yeah, I know that feeling. > I would take a different stance if this were some > configuration file that we ship. Hmm, now you reminded me of my ancient wish. Perhaps it is not too bad to ship $(sharedir)/git-core/userdiff that can be read using git_config_from_file() interface, using a very narrow callback function that understands only diff.*.xfuncname and diff.*.wordregex and discards everything else, without even following the include/includeIf stuff? As long as that can be done safely and without too much overhead, we could get rid of the hardcoded patterns in userdiff.c::builtin_drivers[] and that would be wonderful ;-)
On Fri, May 17, 2019 at 7:34 AM Junio C Hamano <gitster@pobox.com> wrote: > > Johannes Sixt <j6t@kdbg.org> writes: > > > I'd prefer to keep this list at the minimum necessary as long as it is > > hard-coded in C. > > Yeah, I know that feeling. > So do we reach a consensus? Should I merge the Octave rules into Matlab rules? Best regards, Boxuan > > I would take a different stance if this were some > > configuration file that we ship. > > Hmm, now you reminded me of my ancient wish. > > Perhaps it is not too bad to ship $(sharedir)/git-core/userdiff that > can be read using git_config_from_file() interface, using a very > narrow callback function that understands only diff.*.xfuncname and > diff.*.wordregex and discards everything else, without even > following the include/includeIf stuff? As long as that can be done > safely and without too much overhead, we could get rid of the > hardcoded patterns in userdiff.c::builtin_drivers[] and that would > be wonderful ;-) >
Am 17.05.19 um 14:19 schrieb LI, BO XUAN: > On Fri, May 17, 2019 at 7:34 AM Junio C Hamano <gitster@pobox.com> wrote: >> >> Johannes Sixt <j6t@kdbg.org> writes: >> >>> I'd prefer to keep this list at the minimum necessary as long as it is >>> hard-coded in C. >> >> Yeah, I know that feeling. > > So do we reach a consensus? Should I merge the Octave rules into Matlab rules? Yes, please, that would be very much appreciated. Bonus points if you add some test cases for the hunk header patterns to the t/t4018 directory. -- Hannes
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index 4fb20cd0e9..45374c7dd3 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -821,6 +821,8 @@ patterns are available: - `matlab` suitable for source code in the MATLAB language. +- `octave` suitable for source code in the Octave language. + - `objc` suitable for source code in the Objective-C language. - `pascal` suitable for source code in the Pascal/Delphi language. diff --git a/userdiff.c b/userdiff.c index 3a78fbf504..7d07b82116 100644 --- a/userdiff.c +++ b/userdiff.c @@ -60,6 +60,11 @@ PATTERNS("java", PATTERNS("matlab", "^[[:space:]]*((classdef|function)[[:space:]].*)$|^%%[[:space:]].*$", "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), +PATTERNS("octave", + /* Mostly the same as matlab. In addition, Octave + * supports '##' and '%%%' for code sections */ + "^[[:space:]]*((classdef|function)[[:space:]].*)$|^(%%%?|##)[[:space:]].*$", + "[a-zA-Z_][a-zA-Z0-9_]*|[-+0-9.e]+|[=~<>]=|\\.[*/\\^']|\\|\\||&&"), PATTERNS("objc", /* Negate C statements that can look like functions */ "!^[ \t]*(do|for|if|else|return|switch|while)\n"
Octave pattern is almost the same as matlab. Besides, octave also uses '%%%' or '##' to begin code sections. Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk> --- Update from v2: fix indentation --- Documentation/gitattributes.txt | 2 ++ userdiff.c | 5 +++++ 2 files changed, 7 insertions(+)