[2/8] Documentation/sphinx: fix Python string escapes

Message ID: 20230814060704.79655-3-bgray@linux.ibm.com
State: Handled Elsewhere, archived
Series: Fix Python string escapes

Commit Message

Benjamin Gray Aug. 14, 2023, 6:06 a.m. UTC
Python 3.6 introduced a DeprecationWarning for invalid escape sequences.
This is upgraded to a SyntaxWarning in Python 3.12, and will eventually
be a syntax error.

Fix these now to get ahead of it before it's an error.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
---
 Documentation/sphinx/cdomain.py             | 2 +-
 Documentation/sphinx/kernel_abi.py          | 2 +-
 Documentation/sphinx/kernel_feat.py         | 2 +-
 Documentation/sphinx/kerneldoc.py           | 2 +-
 Documentation/sphinx/maintainers_include.py | 8 ++++----
 5 files changed, 8 insertions(+), 8 deletions(-)
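
The warning described in the commit message can be reproduced without touching the tree. The following is a minimal sketch, assuming Python 3.6 or later; `warning_categories` is a hypothetical helper written for illustration and is not part of the series.

```python
import warnings

def warning_categories(source):
    """Compile `source` and return the names of the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings that are hidden by default
        compile(source, "<example>", "exec")
    return {w.category.__name__ for w in caught}

# Source text containing the invalid escape "\S" in a normal string literal.
bad = 'pat = "\\S+"\n'
# The same pattern written as a raw string literal.
good = 'pat = r"\\S+"\n'

print(warning_categories(bad))   # DeprecationWarning before 3.12, SyntaxWarning from 3.12
print(warning_categories(good))  # set()
```

Only the non-raw literal triggers a warning; the raw literal compiles cleanly.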

Comments

Jonathan Corbet Aug. 14, 2023, 1:35 p.m. UTC | #1
Benjamin Gray <bgray@linux.ibm.com> writes:

> Python 3.6 introduced a DeprecationWarning for invalid escape sequences.
> This is upgraded to a SyntaxWarning in Python 3.12, and will eventually
> be a syntax error.
>
> Fix these now to get ahead of it before it's an error.
>
> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
> ---
>  Documentation/sphinx/cdomain.py             | 2 +-
>  Documentation/sphinx/kernel_abi.py          | 2 +-
>  Documentation/sphinx/kernel_feat.py         | 2 +-
>  Documentation/sphinx/kerneldoc.py           | 2 +-
>  Documentation/sphinx/maintainers_include.py | 8 ++++----
>  5 files changed, 8 insertions(+), 8 deletions(-)

So I am the maintainer for this stuff...is there a reason you didn't
copy me on this work?

> diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
> index ca8ac9e59ded..dbdc74bd0772 100644
> --- a/Documentation/sphinx/cdomain.py
> +++ b/Documentation/sphinx/cdomain.py
> @@ -93,7 +93,7 @@ def markup_ctype_refs(match):
>  #
>  RE_expr = re.compile(r':c:(expr|texpr):`([^\`]+)`')
>  def markup_c_expr(match):
> -    return '\ ``' + match.group(2) + '``\ '
> +    return '\\ ``' + match.group(2) + '``\\ '

I have to wonder about this one; I doubt the intent was to insert a
literal backslash.  I have to fire up my ancient build environment to
even try this, but even if it's right...

>  #
>  # Parse Sphinx 3.x C markups, replacing them by backward-compatible ones
> diff --git a/Documentation/sphinx/kernel_abi.py b/Documentation/sphinx/kernel_abi.py
> index b5feb5b1d905..b9f026f016fd 100644
> --- a/Documentation/sphinx/kernel_abi.py
> +++ b/Documentation/sphinx/kernel_abi.py
> @@ -138,7 +138,7 @@ class KernelCmd(Directive):
>                  code_block += "\n    " + l
>              lines = code_block + "\n\n"
>  
> -        line_regex = re.compile("^\.\. LINENO (\S+)\#([0-9]+)$")
> +        line_regex = re.compile("^\\.\\. LINENO (\\S+)\\#([0-9]+)$")

All of these really just want to be raw strings - a much more minimal
fix that makes the result quite a bit more readable:

     line_regex = re.compile(r"^\.\. LINENO (\S+)\#([0-9]+)$")
                             ^
                             |
  ---------------------------+

That, I think, is how these should be fixed.

Thanks,

jon
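
For reference, the doubled-backslash form in the patch and the raw-string form suggested in the review denote the same string, so either one removes the warning. A short sketch follows; the marker line passed to `search()` is made up for illustration and may not match what the real tooling emits.

```python
import re

escaped = "^\\.\\. LINENO (\\S+)\\#([0-9]+)$"   # form used in the patch
raw = r"^\.\. LINENO (\S+)\#([0-9]+)$"           # form suggested in the review

assert escaped == raw  # both literals denote the same string

# Hypothetical marker line, invented for this example.
m = re.compile(raw).search(".. LINENO some/file.rst#42")
print(m.groups())  # ('some/file.rst', '42')
```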
Benjamin Gray Aug. 14, 2023, 11:26 p.m. UTC | #2
On 14/8/23 11:35 pm, Jonathan Corbet wrote:
> Benjamin Gray <bgray@linux.ibm.com> writes:
> 
>> Python 3.6 introduced a DeprecationWarning for invalid escape sequences.
>> This is upgraded to a SyntaxWarning in Python 3.12, and will eventually
>> be a syntax error.
>>
>> Fix these now to get ahead of it before it's an error.
>>
>> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
>> ---
>>   Documentation/sphinx/cdomain.py             | 2 +-
>>   Documentation/sphinx/kernel_abi.py          | 2 +-
>>   Documentation/sphinx/kernel_feat.py         | 2 +-
>>   Documentation/sphinx/kerneldoc.py           | 2 +-
>>   Documentation/sphinx/maintainers_include.py | 8 ++++----
>>   5 files changed, 8 insertions(+), 8 deletions(-)
> 
> So I am the maintainer for this stuff...is there a reason you didn't
> copy me on this work?

Sorry, I thought the list linux-doc@vger.kernel.org itself was enough. I
haven't done a cross-tree series before, and I was a bit averse to CC'ing
everyone that appears as a maintainer for every patch.

> 
>> diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
>> index ca8ac9e59ded..dbdc74bd0772 100644
>> --- a/Documentation/sphinx/cdomain.py
>> +++ b/Documentation/sphinx/cdomain.py
>> @@ -93,7 +93,7 @@ def markup_ctype_refs(match):
>>   #
>>   RE_expr = re.compile(r':c:(expr|texpr):`([^\`]+)`')
>>   def markup_c_expr(match):
>> -    return '\ ``' + match.group(2) + '``\ '
>> +    return '\\ ``' + match.group(2) + '``\\ '
> 
> I have to wonder about this one; I doubt the intent was to insert a
> literal backslash.  I have to fire up my ancient build environment to
> even try this, but even if it's right...

Yeah, there is even a file that just has a syntax error. I don't have a 
way to verify the original script was correct, but I have verified this 
series doesn't change the parsed AST.

In this case, though, it's generating reST, so the backslash-space might
just be conservatively guarding against generating bad markup [1].

[1]: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#inline-markup
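
The AST check mentioned above can be approximated with the standard `ast` module. This is a sketch of one possible approach, not the script actually used for the series; `same_ast` is a hypothetical helper. It relies on the fact that Python keeps the backslash of an unrecognised escape, so the old `'\ '` literal already evaluated to a backslash followed by a space.

```python
import ast
import warnings

def same_ast(old_source, new_source):
    """Return True if both source texts parse to identical ASTs."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence the invalid-escape warning for old_source
        old_tree = ast.parse(old_source)
        new_tree = ast.parse(new_source)
    # ast.dump() omits line and column attributes by default, so only
    # the structure and the literal values are compared.
    return ast.dump(old_tree) == ast.dump(new_tree)

# Source text modelled on the cdomain.py line before and after the patch.
old = r"""x = '\ ``' + match.group(2) + '``\ '"""
new = r"""x = '\\ ``' + match.group(2) + '``\\ '"""

print(same_ast(old, new))  # True
```

An identical AST means the patched line produces exactly the same string at run time, whatever the original intent was.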


>>   #
>>   # Parse Sphinx 3.x C markups, replacing them by backward-compatible ones
>> diff --git a/Documentation/sphinx/kernel_abi.py b/Documentation/sphinx/kernel_abi.py
>> index b5feb5b1d905..b9f026f016fd 100644
>> --- a/Documentation/sphinx/kernel_abi.py
>> +++ b/Documentation/sphinx/kernel_abi.py
>> @@ -138,7 +138,7 @@ class KernelCmd(Directive):
>>                   code_block += "\n    " + l
>>               lines = code_block + "\n\n"
>>   
>> -        line_regex = re.compile("^\.\. LINENO (\S+)\#([0-9]+)$")
>> +        line_regex = re.compile("^\\.\\. LINENO (\\S+)\\#([0-9]+)$")
> 
> All of these really just want to be raw strings - a much more minimal
> fix that makes the result quite a bit more readable:
> 
>       line_regex = re.compile(r"^\.\. LINENO (\S+)\#([0-9]+)$")
>                               ^
>                               |
>    ---------------------------+
> 
> That, I think, is how these should be fixed.

Yup, I mentioned that at the end of the cover letter. I can automate and 
verify the conversion, but automating what _should_ be treated as a 
'regex' string is fuzzier. Checking if there's a `re.*(` prefix on the 
string should work for most though. I'll give it a shot.
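
One way to approximate the `re.*(` check is to walk the AST rather than match on text. The sketch below is an assumption about how such a tool might look, not the tool used for the follow-up; `regex_literal_args` is a hypothetical name, and it assumes Python 3.8+ for `ast.Constant`.

```python
import ast

def regex_literal_args(source):
    """Yield (lineno, value) for string literals passed first to re.<func>(...)."""
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "re"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            yield node.args[0].lineno, node.args[0].value

# Sample input modelled on the files in the patch (already-escaped form).
SAMPLE = r'''
import re
line_regex = re.compile("^\\.\\. FILE (\\S+)$")
pat = '(Documentation/([^\\s\\?\\*]*)\\.rst)'
m = re.search(pat, "x")
'''

for lineno, pattern in regex_literal_args(SAMPLE):
    print(lineno, pattern)  # 3 ^\.\. FILE (\S+)$
```

The literal passed directly to `re.compile()` is found, but the pattern assigned to `pat` and only later handed to `re.search()` is missed, which illustrates why the heuristic covers most cases rather than all of them.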

> Thanks,
> 
> jon
Patch

diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
index ca8ac9e59ded..dbdc74bd0772 100644
--- a/Documentation/sphinx/cdomain.py
+++ b/Documentation/sphinx/cdomain.py
@@ -93,7 +93,7 @@  def markup_ctype_refs(match):
 #
 RE_expr = re.compile(r':c:(expr|texpr):`([^\`]+)`')
 def markup_c_expr(match):
-    return '\ ``' + match.group(2) + '``\ '
+    return '\\ ``' + match.group(2) + '``\\ '
 
 #
 # Parse Sphinx 3.x C markups, replacing them by backward-compatible ones
diff --git a/Documentation/sphinx/kernel_abi.py b/Documentation/sphinx/kernel_abi.py
index b5feb5b1d905..b9f026f016fd 100644
--- a/Documentation/sphinx/kernel_abi.py
+++ b/Documentation/sphinx/kernel_abi.py
@@ -138,7 +138,7 @@  class KernelCmd(Directive):
                 code_block += "\n    " + l
             lines = code_block + "\n\n"
 
-        line_regex = re.compile("^\.\. LINENO (\S+)\#([0-9]+)$")
+        line_regex = re.compile("^\\.\\. LINENO (\\S+)\\#([0-9]+)$")
         ln = 0
         n = 0
         f = fname
diff --git a/Documentation/sphinx/kernel_feat.py b/Documentation/sphinx/kernel_feat.py
index 27b701ed3681..d17adc1a367a 100644
--- a/Documentation/sphinx/kernel_feat.py
+++ b/Documentation/sphinx/kernel_feat.py
@@ -104,7 +104,7 @@  class KernelFeat(Directive):
 
         lines = self.runCmd(cmd, shell=True, cwd=cwd, env=shell_env)
 
-        line_regex = re.compile("^\.\. FILE (\S+)$")
+        line_regex = re.compile("^\\.\\. FILE (\\S+)$")
 
         out_lines = ""
 
diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerneldoc.py
index 9395892c7ba3..d6ec34ce2232 100644
--- a/Documentation/sphinx/kerneldoc.py
+++ b/Documentation/sphinx/kerneldoc.py
@@ -130,7 +130,7 @@  class KernelDocDirective(Directive):
             result = ViewList()
 
             lineoffset = 0;
-            line_regex = re.compile("^\.\. LINENO ([0-9]+)$")
+            line_regex = re.compile("^\\.\\. LINENO ([0-9]+)$")
             for line in lines:
                 match = line_regex.search(line)
                 if match:
diff --git a/Documentation/sphinx/maintainers_include.py b/Documentation/sphinx/maintainers_include.py
index 328b3631a585..73be47963153 100755
--- a/Documentation/sphinx/maintainers_include.py
+++ b/Documentation/sphinx/maintainers_include.py
@@ -77,7 +77,7 @@  class MaintainersInclude(Include):
             line = line.rstrip()
 
             # Linkify all non-wildcard refs to ReST files in Documentation/.
-            pat = '(Documentation/([^\s\?\*]*)\.rst)'
+            pat = '(Documentation/([^\\s\\?\\*]*)\\.rst)'
             m = re.search(pat, line)
             if m:
                 # maintainers.rst is in a subdirectory, so include "../".
@@ -90,11 +90,11 @@  class MaintainersInclude(Include):
                 output = "| %s" % (line.replace("\\", "\\\\"))
                 # Look for and record field letter to field name mappings:
                 #   R: Designated *reviewer*: FullName <address@domain>
-                m = re.search("\s(\S):\s", line)
+                m = re.search("\\s(\\S):\\s", line)
                 if m:
                     field_letter = m.group(1)
                 if field_letter and not field_letter in fields:
-                    m = re.search("\*([^\*]+)\*", line)
+                    m = re.search("\\*([^\\*]+)\\*", line)
                     if m:
                         fields[field_letter] = m.group(1)
             elif subsystems:
@@ -112,7 +112,7 @@  class MaintainersInclude(Include):
                     field_content = ""
 
                     # Collapse whitespace in subsystem name.
-                    heading = re.sub("\s+", " ", line)
+                    heading = re.sub("\\s+", " ", line)
                     output = output + "%s\n%s" % (heading, "~" * len(heading))
                     field_prev = ""
                 else: