[2/2] userdiff: improve Fortran xfuncname regex

Message ID 69fe977b66f9744c914cfdfa2da4b9be5e720e4f.1597271429.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 75c3b6b2e8a72239fa23e039c46f9a5cf8c24142
Series Improve and test Fortran xfuncname regex

Commit Message

Philippe Blain via GitGitGadget Aug. 12, 2020, 10:30 p.m. UTC
From: Philippe Blain <levraiphilippeblain@gmail.com>

The third part of the Fortran xfuncname regex wants to match the
beginning of a subroutine or function, so it allows for all characters
except `'`, `"` or whitespace before the keyword 'function' or
'subroutine'. This is meant to match the 'recursive', 'elemental' or
'pure' keywords, as well as function return types, and to prevent
matches inside strings.

However, the negated set does not contain the `!` comment character,
so a line with an end-of-line comment containing the keyword 'function' or
'subroutine' followed by another word is mistakenly chosen as a hunk header.

Improve the regex by adding `!` to the negated set.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
---
 t/t4018/fortran-comment-keyword | 1 -
 userdiff.c                      | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)
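
A minimal sketch of the behaviour described in the commit message, assuming POSIX `regcomp` with `REG_EXTENDED | REG_ICASE` is a fair stand-in for how Git compiles an `IPATTERN`. The two pattern strings are copied from the userdiff.c hunk of this patch; the names `OLD_PATTERN`, `NEW_PATTERN` and `matches_xfuncname` are hypothetical and exist only for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/* xfuncname pattern before the patch: '!' is allowed in the leading words. */
const char OLD_PATTERN[] =
	"^[ \t]*((END[ \t]+)?(PROGRAM|MODULE|BLOCK[ \t]+DATA"
	"|([^'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$";

/* xfuncname pattern after the patch: '!' is added to the negated set. */
const char NEW_PATTERN[] =
	"^[ \t]*((END[ \t]+)?(PROGRAM|MODULE|BLOCK[ \t]+DATA"
	"|([^!'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$";

/*
 * Hypothetical helper, not part of Git.
 * Returns 1 if `line` matches `pattern`, 0 if it does not,
 * and -1 if the pattern fails to compile.
 */
int matches_xfuncname(const char *pattern, const char *line)
{
	regex_t re;
	int rc;

	/* Case-insensitive extended regex; no submatch positions are needed. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB))
		return -1;
	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With these definitions, `matches_xfuncname(OLD_PATTERN, "      real funcB  ! grid function b")` reports a match, because `!` is consumed as one of the leading words before `function`. The same call with `NEW_PATTERN` does not match, while a genuine header such as `      subroutine RIGHT` matches both patterns.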

Comments

Elijah Newren Aug. 13, 2020, 2:10 a.m. UTC | #1
On Wed, Aug 12, 2020 at 3:34 PM Philippe Blain via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> The third part of the Fortran xfuncname regex wants to match the
> beginning of a subroutine or function, so it allows for all characters
> except `'`, `"` or whitespace before the keyword 'function' or
> 'subroutine'. This is meant to match the 'recursive', 'elemental' or
> 'pure' keywords, as well as function return types, and to prevent
> matches inside strings.
>
> However, the negated set does not contain the `!` comment character,
> so a line with an end-of-line comment containing the keyword 'function' or
> 'subroutine' followed by another word is mistakenly chosen as a hunk header.
>
> Improve the regex by adding `!` to the negated set.
>
> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> ---
>  t/t4018/fortran-comment-keyword | 1 -
>  userdiff.c                      | 2 +-
>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/t/t4018/fortran-comment-keyword b/t/t4018/fortran-comment-keyword
> index c5dbdb4c61..e9206a5379 100644
> --- a/t/t4018/fortran-comment-keyword
> +++ b/t/t4018/fortran-comment-keyword
> @@ -8,7 +8,6 @@
>        real funcB  ! grid function b
>
>        real ChangeMe
> -      integer broken
>
>        end subroutine RIGHT
>

This change seems orthogonal to the explanation in the commit message.
What is its purpose, and does it belong in this commit or a different
one?

> diff --git a/userdiff.c b/userdiff.c
> index 707d82435a..fde02f225b 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -53,7 +53,7 @@ IPATTERN("fortran",
>          /* Program, module, block data */
>          "^[ \t]*((END[ \t]+)?(PROGRAM|MODULE|BLOCK[ \t]+DATA"
>                 /* Subroutines and functions */
> -               "|([^'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$",
> +               "|([^!'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$",
>          /* -- */
>          "[a-zA-Z][a-zA-Z0-9_]*"
>          "|\\.([Ee][Qq]|[Nn][Ee]|[Gg][TtEe]|[Ll][TtEe]|[Tt][Rr][Uu][Ee]|[Ff][Aa][Ll][Ss][Ee]|[Aa][Nn][Dd]|[Oo][Rr]|[Nn]?[Ee][Qq][Vv]|[Nn][Oo][Tt])\\."
> --
> gitgitgadget
Philippe Blain Aug. 13, 2020, 12:45 p.m. UTC | #2
Hi Elijah, 

> On Aug 12, 2020, at 22:10, Elijah Newren <newren@gmail.com> wrote:
> 
> On Wed, Aug 12, 2020 at 3:34 PM Philippe Blain via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> 
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>> 
>> The third part of the Fortran xfuncname regex wants to match the
>> beginning of a subroutine or function, so it allows for all characters
>> except `'`, `"` or whitespace before the keyword 'function' or
>> 'subroutine'. This is meant to match the 'recursive', 'elemental' or
>> 'pure' keywords, as well as function return types, and to prevent
>> matches inside strings.
>> 
>> However, the negated set does not contain the `!` comment character,
>> so a line with an end-of-line comment containing the keyword 'function' or
>> 'subroutine' followed by another word is mistakenly chosen as a hunk header.
>> 
>> Improve the regex by adding `!` to the negated set.
>> 
>> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
>> ---
>> t/t4018/fortran-comment-keyword | 1 -
>> userdiff.c                      | 2 +-
>> 2 files changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/t/t4018/fortran-comment-keyword b/t/t4018/fortran-comment-keyword
>> index c5dbdb4c61..e9206a5379 100644
>> --- a/t/t4018/fortran-comment-keyword
>> +++ b/t/t4018/fortran-comment-keyword
>> @@ -8,7 +8,6 @@
>>       real funcB  ! grid function b
>> 
>>       real ChangeMe
>> -      integer broken
>> 
>>       end subroutine RIGHT
>> 
> 
> This change seems orthogonal to the explanation in the commit message.
> What is its purpose, and does it belong in this commit or a different
> one?

If you take a look at t/t4018/README, the way to mark t4018 tests as "known failures"
is to insert "broken" somewhere in the file. Since I'm fixing the regex in this commit to be able 
to cope with the situation in t/t4018/fortran-comment-keyword, I'm unmarking this test as broken.

Cheers,

Philippe.
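
The "broken" marker convention described above can be sketched as follows. This is only an illustration: it assumes the harness does nothing more than a substring search for the word, and `is_known_failure` is a hypothetical name, not a function from Git's test suite.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the text of a t4018 test file
 * contains the "broken" marker anywhere, 0 otherwise.
 */
int is_known_failure(const char *file_contents)
{
	return strstr(file_contents, "broken") != NULL;
}
```

Under that assumption, removing the `integer broken` line from t/t4018/fortran-comment-keyword is what turns the test from an expected failure into an expected success.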
Elijah Newren Aug. 13, 2020, 2:04 p.m. UTC | #3
Hi Philippe,

On Thu, Aug 13, 2020 at 5:45 AM Philippe Blain
<levraiphilippeblain@gmail.com> wrote:
>
> Hi Elijah,
>
> > On Aug 12, 2020, at 22:10, Elijah Newren <newren@gmail.com> wrote:
> >
> > On Wed, Aug 12, 2020 at 3:34 PM Philippe Blain via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Philippe Blain <levraiphilippeblain@gmail.com>
> >>
> >> The third part of the Fortran xfuncname regex wants to match the
> >> beginning of a subroutine or function, so it allows for all characters
> >> except `'`, `"` or whitespace before the keyword 'function' or
> >> 'subroutine'. This is meant to match the 'recursive', 'elemental' or
> >> 'pure' keywords, as well as function return types, and to prevent
> >> matches inside strings.
> >>
> >> However, the negated set does not contain the `!` comment character,
> >> so a line with an end-of-line comment containing the keyword 'function' or
> >> 'subroutine' followed by another word is mistakenly chosen as a hunk header.
> >>
> >> Improve the regex by adding `!` to the negated set.
> >>
> >> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> >> ---
> >> t/t4018/fortran-comment-keyword | 1 -
> >> userdiff.c                      | 2 +-
> >> 2 files changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/t/t4018/fortran-comment-keyword b/t/t4018/fortran-comment-keyword
> >> index c5dbdb4c61..e9206a5379 100644
> >> --- a/t/t4018/fortran-comment-keyword
> >> +++ b/t/t4018/fortran-comment-keyword
> >> @@ -8,7 +8,6 @@
> >>       real funcB  ! grid function b
> >>
> >>       real ChangeMe
> >> -      integer broken
> >>
> >>       end subroutine RIGHT
> >>
> >
> > This change seems orthogonal to the explanation in the commit message.
> > What is its purpose, and does it belong in this commit or a different
> > one?
>
> If you take a look at t/t4018/README, the way to mark t4018 tests as "known failures"
> is to insert "broken" somewhere in the file. Since I'm fixing the regex in this commit to be able
> to cope with the situation in t/t4018/fortran-comment-keyword, I'm unmarking this test as broken.

Ah, gotcha.  I guess that's what I get for trying to review a random
patch outside my area of expertise.  :-)  Thanks for explaining how
this works to me.

Elijah
Patch

diff --git a/t/t4018/fortran-comment-keyword b/t/t4018/fortran-comment-keyword
index c5dbdb4c61..e9206a5379 100644
--- a/t/t4018/fortran-comment-keyword
+++ b/t/t4018/fortran-comment-keyword
@@ -8,7 +8,6 @@ 
       real funcB  ! grid function b
 
       real ChangeMe
-      integer broken
 
       end subroutine RIGHT
 
diff --git a/userdiff.c b/userdiff.c
index 707d82435a..fde02f225b 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -53,7 +53,7 @@  IPATTERN("fortran",
 	 /* Program, module, block data */
 	 "^[ \t]*((END[ \t]+)?(PROGRAM|MODULE|BLOCK[ \t]+DATA"
 		/* Subroutines and functions */
-		"|([^'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$",
+		"|([^!'\" \t]+[ \t]+)*(SUBROUTINE|FUNCTION))[ \t]+[A-Z].*)$",
 	 /* -- */
 	 "[a-zA-Z][a-zA-Z0-9_]*"
 	 "|\\.([Ee][Qq]|[Nn][Ee]|[Gg][TtEe]|[Ll][TtEe]|[Tt][Rr][Uu][Ee]|[Ff][Aa][Ll][Ss][Ee]|[Aa][Nn][Dd]|[Oo][Rr]|[Nn]?[Ee][Qq][Vv]|[Nn][Oo][Tt])\\."