diff mbox series

[v4,05/19] qapi/expr.py: constrain incoming expression types

Message ID 20210325060356.4040114-6-jsnow@redhat.com (mailing list archive)
State New, archived
Headers show
Series qapi: static typing conversion, pt3 | expand

Commit Message

John Snow March 25, 2021, 6:03 a.m. UTC
mypy does not know the types of values stored in Dicts that masquerade
as objects. Help the type checker out by constraining the type.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/expr.py | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

Comments

Markus Armbruster March 25, 2021, 2:04 p.m. UTC | #1
John Snow <jsnow@redhat.com> writes:

> mypy does not know the types of values stored in Dicts that masquerade
> as objects. Help the type checker out by constraining the type.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/expr.py | 26 +++++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
> index b4bbcd54c0..b75c85c160 100644
> --- a/scripts/qapi/expr.py
> +++ b/scripts/qapi/expr.py
> @@ -15,9 +15,18 @@
>  # See the COPYING file in the top-level directory.
>  
>  import re
> +from typing import Dict, Optional
>  
>  from .common import c_name
>  from .error import QAPISemError
> +from .parser import QAPIDoc
> +from .source import QAPISourceInfo
> +
> +
> +# Deserialized JSON objects as returned by the parser;
> +# The values of this mapping are not necessary to exhaustively type

Not necessary and also not practical with current mypy.  Correct?

> +# here, because the purpose of this module is to interrogate that type.
> +_JSONObject = Dict[str, object]
>  
>  
>  # Names consist of letters, digits, -, and _, starting with a letter.
> @@ -315,9 +324,20 @@ def check_event(expr, info):
>  
>  def check_exprs(exprs):
>      for expr_elem in exprs:
> -        expr = expr_elem['expr']
> -        info = expr_elem['info']
> -        doc = expr_elem.get('doc')
> +        # Expression
> +        assert isinstance(expr_elem['expr'], dict)

I dislike relaxing OrderedDict to dict here.  I'm going to accept it
anyway, because the difference between the two is going away in 3.7, and
because so far order actually matters only in certain sub-expressions,
not top-level expressions.

> +        for key in expr_elem['expr'].keys():
> +            assert isinstance(key, str)
> +        expr: _JSONObject = expr_elem['expr']
> +
> +        # QAPISourceInfo
> +        assert isinstance(expr_elem['info'], QAPISourceInfo)
> +        info: QAPISourceInfo = expr_elem['info']
> +
> +        # Optional[QAPIDoc]
> +        tmp = expr_elem.get('doc')
> +        assert tmp is None or isinstance(tmp, QAPIDoc)
> +        doc: Optional[QAPIDoc] = tmp
>  
>          if 'include' in expr:
>              continue

I see you didn't bite on the idea to do less checking here.  Okay.
John Snow March 25, 2021, 8:48 p.m. UTC | #2
On 3/25/21 10:04 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> mypy does not know the types of values stored in Dicts that masquerade
>> as objects. Help the type checker out by constraining the type.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>   scripts/qapi/expr.py | 26 +++++++++++++++++++++++---
>>   1 file changed, 23 insertions(+), 3 deletions(-)
>>
>> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
>> index b4bbcd54c0..b75c85c160 100644
>> --- a/scripts/qapi/expr.py
>> +++ b/scripts/qapi/expr.py
>> @@ -15,9 +15,18 @@
>>   # See the COPYING file in the top-level directory.
>>   
>>   import re
>> +from typing import Dict, Optional
>>   
>>   from .common import c_name
>>   from .error import QAPISemError
>> +from .parser import QAPIDoc
>> +from .source import QAPISourceInfo
>> +
>> +
>> +# Deserialized JSON objects as returned by the parser;
>> +# The values of this mapping are not necessary to exhaustively type
> 
> Not necessary and also not practical with current mypy.  Correct?
> 

Neither necessary nor practical. Typing as 'object' guarantees that 
these values will never be used in a manner not supported by all python 
objects. Mypy does not complain, so we know that we don't misuse the type.

This is helpful for proving the validity of the expr.py validator 
itself: we know that we are not forgetting to perform type narrowing and 
using the value contained therein inappropriately.

Adding a more exhaustive typing here is impractical (for reasons we 
learned during introspect.py), but also provides no benefit to the 
static analysis here anyway.

(None of the functions written here *assume* the shape of the structure, 
so there are no functions that benefit from having a more laboriously 
specified type.)

If the comment needs more work, suggest away -- I tried to follow our 
last discussion here as best as I was able.

>> +# here, because the purpose of this module is to interrogate that type.
>> +_JSONObject = Dict[str, object]
>>   
>>   
>>   # Names consist of letters, digits, -, and _, starting with a letter.
>> @@ -315,9 +324,20 @@ def check_event(expr, info):
>>   
>>   def check_exprs(exprs):
>>       for expr_elem in exprs:
>> -        expr = expr_elem['expr']
>> -        info = expr_elem['info']
>> -        doc = expr_elem.get('doc')
>> +        # Expression
>> +        assert isinstance(expr_elem['expr'], dict)
> 
> I dislike relaxing OrderedDict to dict here.  I'm going to accept it
> anyway, because the difference between the two is going away in 3.7, and
> because so far order actually matters only in certain sub-expressions,
> not top-level expressions.
> 

Right, there is a semantic piece of information missing from this type.

You've asked if we care if we ditch OrderedDict and claim that we only 
support CPython -- I don't know. I assume nobody uses anything else, but 
I sincerely don't know. My hunch is that I really doubt it, but I don't 
see a reason to complicate ./configure et al and don't think it needs to 
be messed with.

I am content with leaving OrderedDict in parser.py, but it does make it 
easier for me to play with the pieces to not impose a constraint on a 
specific *type name* and there is no way (that I am presently aware of) 
to write a type constraint on just the semantic information that the 
keys are ordered.

The largest loss I am aware of here is that a newcomer to this file *may 
not know* that these keys are ordered, however, the order of the keys in 
and of itself has no impact on the operation of expr.py itself, so I am 
not sure if it is necessary to repeat that fact for a theoretical 
visitor here. parser.py of course still uses OrderedDict and will 
continue to do so for the forseeable future.

"Why bother relaxing the type at all, then?"

Strictly it makes life easier for me, because I am experimenting with 
different validation backends, different parsers, and so on.

Can I just patch it out in every branch I want to play with these 
changes? I could, yes.

I am asking for a favor in exchange for my continued diligence in adding 
documentation and static time type analysis to a critical component used 
for generating the API interface for QEMU.

>> +        for key in expr_elem['expr'].keys():
>> +            assert isinstance(key, str)
>> +        expr: _JSONObject = expr_elem['expr']
>> +
>> +        # QAPISourceInfo
>> +        assert isinstance(expr_elem['info'], QAPISourceInfo)
>> +        info: QAPISourceInfo = expr_elem['info']
>> +
>> +        # Optional[QAPIDoc]
>> +        tmp = expr_elem.get('doc')
>> +        assert tmp is None or isinstance(tmp, QAPIDoc)
>> +        doc: Optional[QAPIDoc] = tmp
>>   
>>           if 'include' in expr:
>>               continue
> 
> I see you didn't bite on the idea to do less checking here.  Okay.
> 

Almost all of this goes away in part 5, because I add a typed structure 
that parser returns and we no longer have to do the runtime type 
narrowing here as a result.

I started to shift things around a bit here, but it'll just cause more 
work to rebase it again anyway, so I left it alone. I did reorder one of 
the other checks here, but wound up leaving this one alone.

(I will admit to liking the assertion in the interim because it 
convinced me I was on terra-firma. Through all of the rebase churn, some 
more brain-dead looking bits help keep my expectations tethered to the 
current reality.)
Markus Armbruster March 26, 2021, 5:40 a.m. UTC | #3
John Snow <jsnow@redhat.com> writes:

> On 3/25/21 10:04 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> mypy does not know the types of values stored in Dicts that masquerade
>>> as objects. Help the type checker out by constraining the type.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   scripts/qapi/expr.py | 26 +++++++++++++++++++++++---
>>>   1 file changed, 23 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
>>> index b4bbcd54c0..b75c85c160 100644
>>> --- a/scripts/qapi/expr.py
>>> +++ b/scripts/qapi/expr.py
>>> @@ -15,9 +15,18 @@
>>>   # See the COPYING file in the top-level directory.
>>>   
>>>   import re
>>> +from typing import Dict, Optional
>>>   
>>>   from .common import c_name
>>>   from .error import QAPISemError
>>> +from .parser import QAPIDoc
>>> +from .source import QAPISourceInfo
>>> +
>>> +
>>> +# Deserialized JSON objects as returned by the parser;
>>> +# The values of this mapping are not necessary to exhaustively type
>> 
>> Not necessary and also not practical with current mypy.  Correct?
>
> Neither necessary nor practical. Typing as 'object' guarantees that 
> these values will never be used in a manner not supported by all python 
> objects. Mypy does not complain, so we know that we don't misuse the type.
>
> This is helpful for proving the validity of the expr.py validator 
> itself: we know that we are not forgetting to perform type narrowing and 
> using the value contained therein inappropriately.
>
> Adding a more exhaustive typing here is impractical (for reasons we 
> learned during introspect.py), but also provides no benefit to the 
> static analysis here anyway.
>
> (None of the functions written here *assume* the shape of the structure, 
> so there are no functions that benefit from having a more laboriously 
> specified type.)
>
> If the comment needs more work, suggest away -- I tried to follow our 
> last discussion here as best as I was able.

"Needs more work" sounds like "inadequate", which isn't the case.

The comment focuses on what we need from mypy here.  We may or may not
want to hint at the other aspect: what mypy can provide.

>>> +# here, because the purpose of this module is to interrogate that type.
>>> +_JSONObject = Dict[str, object]
[...]

If we want to, maybe:

     # Deserialized JSON objects as returned by the parser.
     # The values of this mapping are not necessary to exhaustively type
     # here (and also not practical as long as mypy lacks recursive
     # types), because the purpose of this module is to interrogate that
     # type.

Thoughts?
John Snow March 26, 2021, 5:12 p.m. UTC | #4
On 3/26/21 1:40 AM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
>> On 3/25/21 10:04 AM, Markus Armbruster wrote:
>>> John Snow <jsnow@redhat.com> writes:
>>>
>>>> mypy does not know the types of values stored in Dicts that masquerade
>>>> as objects. Help the type checker out by constraining the type.
>>>>
>>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>>> ---
>>>>    scripts/qapi/expr.py | 26 +++++++++++++++++++++++---
>>>>    1 file changed, 23 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
>>>> index b4bbcd54c0..b75c85c160 100644
>>>> --- a/scripts/qapi/expr.py
>>>> +++ b/scripts/qapi/expr.py
>>>> @@ -15,9 +15,18 @@
>>>>    # See the COPYING file in the top-level directory.
>>>>    
>>>>    import re
>>>> +from typing import Dict, Optional
>>>>    
>>>>    from .common import c_name
>>>>    from .error import QAPISemError
>>>> +from .parser import QAPIDoc
>>>> +from .source import QAPISourceInfo
>>>> +
>>>> +
>>>> +# Deserialized JSON objects as returned by the parser;
>>>> +# The values of this mapping are not necessary to exhaustively type
>>>
>>> Not necessary and also not practical with current mypy.  Correct?
>>
>> Neither necessary nor practical. Typing as 'object' guarantees that
>> these values will never be used in a manner not supported by all python
>> objects. Mypy does not complain, so we know that we don't misuse the type.
>>
>> This is helpful for proving the validity of the expr.py validator
>> itself: we know that we are not forgetting to perform type narrowing and
>> using the value contained therein inappropriately.
>>
>> Adding a more exhaustive typing here is impractical (for reasons we
>> learned during introspect.py), but also provides no benefit to the
>> static analysis here anyway.
>>
>> (None of the functions written here *assume* the shape of the structure,
>> so there are no functions that benefit from having a more laboriously
>> specified type.)
>>
>> If the comment needs more work, suggest away -- I tried to follow our
>> last discussion here as best as I was able.
> 
> "Needs more work" sounds like "inadequate", which isn't the case.
> 
> The comment focuses on what we need from mypy here.  We may or may not
> want to hint at the other aspect: what mypy can provide.
> 
>>>> +# here, because the purpose of this module is to interrogate that type.
>>>> +_JSONObject = Dict[str, object]
> [...]
> 
> If we want to, maybe:
> 
>       # Deserialized JSON objects as returned by the parser.
>       # The values of this mapping are not necessary to exhaustively type
>       # here (and also not practical as long as mypy lacks recursive
>       # types), because the purpose of this module is to interrogate that
>       # type.
> 
> Thoughts?
> 

If it occurs to you to want the extra explanation, I won't say no to it. 
I will fold it in.

--js
diff mbox series

Patch

diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
index b4bbcd54c0..b75c85c160 100644
--- a/scripts/qapi/expr.py
+++ b/scripts/qapi/expr.py
@@ -15,9 +15,18 @@ 
 # See the COPYING file in the top-level directory.
 
 import re
+from typing import Dict, Optional
 
 from .common import c_name
 from .error import QAPISemError
+from .parser import QAPIDoc
+from .source import QAPISourceInfo
+
+
+# Deserialized JSON objects as returned by the parser;
+# The values of this mapping are not necessary to exhaustively type
+# here, because the purpose of this module is to interrogate that type.
+_JSONObject = Dict[str, object]
 
 
 # Names consist of letters, digits, -, and _, starting with a letter.
@@ -315,9 +324,20 @@  def check_event(expr, info):
 
 def check_exprs(exprs):
     for expr_elem in exprs:
-        expr = expr_elem['expr']
-        info = expr_elem['info']
-        doc = expr_elem.get('doc')
+        # Expression
+        assert isinstance(expr_elem['expr'], dict)
+        for key in expr_elem['expr'].keys():
+            assert isinstance(key, str)
+        expr: _JSONObject = expr_elem['expr']
+
+        # QAPISourceInfo
+        assert isinstance(expr_elem['info'], QAPISourceInfo)
+        info: QAPISourceInfo = expr_elem['info']
+
+        # Optional[QAPIDoc]
+        tmp = expr_elem.get('doc')
+        assert tmp is None or isinstance(tmp, QAPIDoc)
+        doc: Optional[QAPIDoc] = tmp
 
         if 'include' in expr:
             continue