[v2,09/19] qapi/schema: allow resolve_type to be used for built-in types

Message ID 20240112222945.3033854-10-jsnow@redhat.com (mailing list archive)
State New, archived
Series qapi: statically type schema.py

Commit Message

John Snow Jan. 12, 2024, 10:29 p.m. UTC
allow resolve_type to be used for both built-in and user-specified
type definitions. In the event that the type cannot be resolved, assert
that 'info' and 'what' were both provided in order to create a usable
QAPISemError.

In practice, 'info' will only be None for built-in definitions, which
*should not fail* type lookup.

As a convenience, allow the 'what' and 'info' parameters to be elided
entirely so that it can be used as a can-not-fail version of
lookup_type.

Note: there are only three callsites to resolve_type at present where
"info" is perceived to be possibly None:

    1) QAPISchemaArrayType.check()
    2) QAPISchemaObjectTypeMember.check()
    3) QAPISchemaEvent.check()

    Of those three, only the first actually ever passes None; the other two
    are limited by their base class initializers which accept info=None, but
    neither actually use it in practice.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/schema.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Markus Armbruster Jan. 16, 2024, 11:09 a.m. UTC | #1
John Snow <jsnow@redhat.com> writes:

> allow resolve_type to be used for both built-in and user-specified
> type definitions. In the event that the type cannot be resolved, assert
> that 'info' and 'what' were both provided in order to create a usable
> QAPISemError.
>
> In practice, 'info' will only be None for built-in definitions, which
> *should not fail* type lookup.
>
> As a convenience, allow the 'what' and 'info' parameters to be elided
> entirely so that it can be used as a can-not-fail version of
> lookup_type.

The convenience remains unused until the next patch.  It should be added
there.

> Note: there are only three callsites to resolve_type at present where
> "info" is perceived to be possibly None:
>
>     1) QAPISchemaArrayType.check()
>     2) QAPISchemaObjectTypeMember.check()
>     3) QAPISchemaEvent.check()
>
>     Of those three, only the first actually ever passes None;

Yes.  More below.

>                                                               the other two
>     are limited by their base class initializers which accept info=None, but

They do?

>     neither actually use it in practice.
>
> Signed-off-by: John Snow <jsnow@redhat.com>

Hmm.

We look up types by name in two ways:

1. Failure is a semantic error

   Use .resolve_type(), passing real @info and @what.

   Users:

   * QAPISchemaArrayType.check() resolving the element type

     Fine print: when the array type is built-in, we pass None @info and
     @what.  The built-in array type's element type must exist for
     .resolve_type() to work.  This commit changes .resolve_type() to
     assert it does.

   * QAPISchemaObjectType.check() resolving the base type

   * QAPISchemaObjectTypeMember.check() resolving the member type

   * QAPISchemaCommand.check() resolving argument type (if named) and
     return type (which is always named).

   * QAPISchemaEvent.check() resolving argument type (if named).

   Note all users are in .check() methods.  That's where type names get
   resolved.

2. Handle failure

   Use .lookup_type(), which returns None when the named type doesn't
   exist.

   Users:

   * QAPISchemaVariants.check(), to look up the base type containing the
     tag member for error reporting purposes.  Failure would be a
     programming error.

   * .resolve_type(), which handles failure as semantic error

   * ._make_array_type(), which uses it as "type exists already"
      predicate.

   * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
     built-in types.  Failure would be a programming error.

The next commit switches the uses where failure would be a programming
error from .lookup_type() to .resolve_type() without @info and @what, so
failure trips its assertion.  I don't like it, because it overloads
.resolve_type() to serve two rather different use cases:

1. Failure is a semantic error; pass @info and @what

2. Failure is a programming error; don't pass @info and @what

The odd one out is of course QAPISchemaArrayType.check(), which wants to
use 1. for the user's types and 2. for built-in types.  Let's ignore it
for a second.

I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
this out into its own helper if that helps (pardon the pun).

Back to QAPISchemaArrayType.check().  Its need to resolve built-in
element types, which have no info, necessitates .resolve_type() taking
Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
unless it leads to mypy complications I can't see.

We can simply leave it as is.  Adding the assertion to .resolve_type()
is fine.

Or we complicate QAPISchemaArrayType.check() to simplify
.resolve_type()'s typing, roughly like this:

            if self.info:
                self.element_type = schema.resolve_type(
                    self._element_type_name,
                    self.info, self.info.defn_meta)
            else:               # built-in type
                self.element_type = schema.lookup_type(
                    self._element_type_name)
                assert self.element_type

Not sure it's worth the trouble.  Thoughts?
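
The first option above (keep a single code path and let .resolve_type() assert) can be illustrated with a toy model.  This is only a sketch under stated assumptions: Schema, SourceInfo and SemError are simplified stand-ins for the QAPI classes, types are plain strings, and the error message text is made up for the example.

```python
from typing import Dict, Optional


class SourceInfo:
    """Simplified stand-in for QAPISourceInfo (hypothetical)."""
    def __init__(self, fname: str, line: int, defn_meta: str):
        self.fname = fname
        self.line = line
        self.defn_meta = defn_meta


class SemError(Exception):
    """Simplified stand-in for QAPISemError (hypothetical)."""
    def __init__(self, info: SourceInfo, msg: str):
        super().__init__(f"{info.fname}:{info.line}: {msg}")


class Schema:
    """Toy schema: a name -> type table with both lookup flavours."""
    def __init__(self) -> None:
        self._types: Dict[str, str] = {'int': 'builtin int'}

    def lookup_type(self, name: str) -> Optional[str]:
        # Returns None when the named type does not exist.
        return self._types.get(name)

    def resolve_type(self, name, info, what):
        typ = self.lookup_type(name)
        if not typ:
            # Built-in callers pass info=None and must never get here.
            assert info and what
            if callable(what):
                what = what(info)
            # Message wording is invented for this sketch.
            raise SemError(info, f"{what}: unknown type '{name}'")
        return typ


schema = Schema()
info = SourceInfo('example.json', 3, 'struct')
# Built-in style call: no info, lookup succeeds, no assertion trips.
print(schema.resolve_type('int', None, None))
# User-defined style call: failure becomes a semantic error.
try:
    schema.resolve_type('nope', info, info.defn_meta)
except SemError as err:
    print(err)
```

In this model a failed lookup with info=None trips the assertion instead of building an error object that cannot be rendered.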

> ---
>  scripts/qapi/schema.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index 66a78f28fd4..a77b51d1b96 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py
> @@ -1001,9 +1001,10 @@ def lookup_type(self, name):
>          assert typ is None or isinstance(typ, QAPISchemaType)
>          return typ
>  
> -    def resolve_type(self, name, info, what):
> +    def resolve_type(self, name, info=None, what=None):
>          typ = self.lookup_type(name)
>          if not typ:
> +            assert info and what  # built-in types must not fail lookup
>              if callable(what):
>                  what = what(info)
>              raise QAPISemError(
John Snow Jan. 17, 2024, 4:44 p.m. UTC | #2
On Tue, Jan 16, 2024 at 6:09 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> John Snow <jsnow@redhat.com> writes:
>
> > allow resolve_type to be used for both built-in and user-specified
> > type definitions. In the event that the type cannot be resolved, assert
> > that 'info' and 'what' were both provided in order to create a usable
> > QAPISemError.
> >
> > In practice, 'info' will only be None for built-in definitions, which
> > *should not fail* type lookup.
> >
> > As a convenience, allow the 'what' and 'info' parameters to be elided
> > entirely so that it can be used as a can-not-fail version of
> > lookup_type.
>
> The convenience remains unused until the next patch.  It should be added
> there.

Okie-ducky.

>
> > Note: there are only three callsites to resolve_type at present where
> > "info" is perceived to be possibly None:
> >
> >     1) QAPISchemaArrayType.check()
> >     2) QAPISchemaObjectTypeMember.check()
> >     3) QAPISchemaEvent.check()
> >
> >     Of those three, only the first actually ever passes None;
>
> Yes.  More below.

Scary...

>
> >                                                               the other two
> >     are limited by their base class initializers which accept info=None, but
>
> They do?
>

In the case of QAPISchemaObjectTypeMember, the parent class
QAPISchemaMember allows initialization with info=None. I can't fully
trace all of the callsites, but one of them at least is in types.py:

>     enum_members = members + [QAPISchemaEnumMember('_MAX', None)]

which necessitates, for now, info-less QAPISchemaEnumMember, which
necessitates info-less QAPISchemaMember. There are others, etc.

> >     neither actually use it in practice.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
>
> Hmm.

Scary.

>
> We look up types by name in two ways:
>
> 1. Failure is a semantic error
>
>    Use .resolve_type(), passing real @info and @what.
>
>    Users:
>
>    * QAPISchemaArrayType.check() resolving the element type
>
>      Fine print: when the array type is built-in, we pass None @info and
>      @what.  The built-in array type's element type must exist for
>      .resolve_type() to work.  This commit changes .resolve_type() to
>      assert it does.
>
>    * QAPISchemaObjectType.check() resolving the base type
>
>    * QAPISchemaObjectTypeMember.check() resolving the member type
>
>    * QAPISchemaCommand.check() resolving argument type (if named) and
>      return type (which is always named).
>
>    * QAPISchemaEvent.check() resolving argument type (if named).
>
>    Note all users are in .check() methods.  That's where type names get
>    resolved.
>
> 2. Handle failure
>
>    Use .lookup_type(), which returns None when the named type doesn't
>    exist.
>
>    Users:
>
>    * QAPISchemaVariants.check(), to look up the base type containing the
>      tag member for error reporting purposes.  Failure would be a
>      programming error.
>
>    * .resolve_type(), which handles failure as semantic error
>
>    * ._make_array_type(), which uses it as "type exists already"
>       predicate.
>
>    * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
>      built-in types.  Failure would be a programming error.
>
> The next commit switches the uses where failure would be a programming
> error from .lookup_type() to .resolve_type() without @info and @what, so
> failure trips its assertion.  I don't like it, because it overloads
> .resolve_type() to serve two rather different use cases:
>
> 1. Failure is a semantic error; pass @info and @what
>
> 2. Failure is a programming error; don't pass @info and @what
>
> The odd one out is of course QAPISchemaArrayType.check(), which wants to
> use 1. for the user's types and 2. for built-in types.  Let's ignore it
> for a second.

"Let's ignore what motivated this patch" aww...

>
> I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
> this out into its own helper if that helps (pardon the pun).
>
> Back to QAPISchemaArrayType.check().  Its need to resolve built-in
> element types, which have no info, necessitates .resolve_type() taking
> Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
> unless it leads to mypy complications I can't see.

Well, with this patch I allowed it to take Optional[QAPISourceInfo] -
just keep in mind that QAPISemError *requires* an info object, even
though the typing there is also Optional[QAPISourceInfo] ... It will
assert that info is present in __str__.

Actually, I'd love to change that too - and make it fully required -
but since built-in types have no info, there's too many places I'd
need to change to enforce this as a static type.

Still.
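
The behaviour described above, an error type whose info is typed as optional but is required once the message is rendered, can be sketched roughly as follows.  SourceInfo and SourceError here are hypothetical stand-ins, not the actual QAPI classes, and the message format is an assumption.

```python
from typing import Optional


class SourceInfo:
    """Simplified stand-in for QAPISourceInfo (hypothetical)."""
    def __init__(self, fname: str, line: int):
        self.fname = fname
        self.line = line

    def __str__(self) -> str:
        return f"{self.fname}:{self.line}"


class SourceError(Exception):
    """Stand-in for an error that carries optional source info."""
    def __init__(self, info: Optional[SourceInfo], msg: str):
        super().__init__(msg)
        self.info = info
        self.msg = msg

    def __str__(self) -> str:
        # The static type allows None, but rendering the error does not:
        # the check happens at run time, when the message is formatted.
        assert self.info is not None
        return f"{self.info}: {self.msg}"
```

With this shape, constructing SourceError(None, ...) succeeds and the problem only surfaces when str() is called on it, which is why an earlier assertion in the caller makes the failure easier to locate.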

>
> We can simply leave it as is.  Adding the assertion to .resolve_type()
> is fine.
>
> Or we complicate QAPISchemaArrayType.check() to simplify
> .resolve_type()'s typing, roughly like this:
>
>             if self.info:
>                 self.element_type = schema.resolve_type(
>                     self._element_type_name,
>                     self.info, self.info.defn_meta)
>             else:               # built-in type
>                 self.element_type = schema.lookup_type(
>                     self._element_type_name)
>                 assert self.element_type
>
> Not sure it's worth the trouble.  Thoughts?

I suppose it's your call, ultimately. This patch exists primarily to
help in two places:

(A) QAPISchemaArrayType.check(), as you've noticed, because it uses
the same path for both built-in and user-defined types. This is the
only place in the code where this occurs *at the moment*, but I can't
predict the future.

(B) Calls to lookup_type in introspect.py which look up built-in types
and must-not-fail. It was cumbersome in the old patchset, but this one
makes it simpler.

I suppose at the moment, having the assert directly in resolve_type
just means we get to use the same helper/pathway for both user-defined
and built-in types, which matches the infrastructure we already have,
which doesn't differentiate between the two. (By which I mean, all of
the Schema classes are not split into built-in and user-defined types,
so it is invisible to the type system.)

I could add conditional logic to the array check, and leave the
lookup_type calls in introspect.py being a little cumbersome - my main
concern with that solution is that I might be leaving a nasty
booby-trap in the future if someone wants to add a new built-in type
or something gets refactored to share more code pathways. Maybe that's
not fully rational, but it's why I went the way I did.

(P.S. I still violently want to create an info object that represents
built-in definitions so I can just get rid of all the
Optional[QAPISourceInfo] types from everywhere. I know I tried to do
it before and you vetoed it, but the desire lives on in my heart.)

>
> > ---
> >  scripts/qapi/schema.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> > index 66a78f28fd4..a77b51d1b96 100644
> > --- a/scripts/qapi/schema.py
> > +++ b/scripts/qapi/schema.py
> > @@ -1001,9 +1001,10 @@ def lookup_type(self, name):
> >          assert typ is None or isinstance(typ, QAPISchemaType)
> >          return typ
> >
> > -    def resolve_type(self, name, info, what):
> > +    def resolve_type(self, name, info=None, what=None):
> >          typ = self.lookup_type(name)
> >          if not typ:
> > +            assert info and what  # built-in types must not fail lookup
> >              if callable(what):
> >                  what = what(info)
> >              raise QAPISemError(
>
Markus Armbruster Jan. 22, 2024, 1:12 p.m. UTC | #3
John Snow <jsnow@redhat.com> writes:

> On Tue, Jan 16, 2024 at 6:09 AM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> John Snow <jsnow@redhat.com> writes:
>>
>> > allow resolve_type to be used for both built-in and user-specified
>> > type definitions. In the event that the type cannot be resolved, assert
>> > that 'info' and 'what' were both provided in order to create a usable
>> > QAPISemError.
>> >
>> > In practice, 'info' will only be None for built-in definitions, which
>> > *should not fail* type lookup.
>> >
>> > As a convenience, allow the 'what' and 'info' parameters to be elided
>> > entirely so that it can be used as a can-not-fail version of
>> > lookup_type.
>>
>> The convenience remains unused until the next patch.  It should be added
>> there.
>
> Okie-ducky.
>
>>
>> > Note: there are only three callsites to resolve_type at present where
>> > "info" is perceived to be possibly None:
>> >
>> >     1) QAPISchemaArrayType.check()
>> >     2) QAPISchemaObjectTypeMember.check()
>> >     3) QAPISchemaEvent.check()
>> >
>> >     Of those three, only the first actually ever passes None;
>>
>> Yes.  More below.
>
> Scary...

I know...

>> >                                                               the other two
>> >     are limited by their base class initializers which accept info=None, but
>>
>> They do?
>
> In the case of QAPISchemaObjectTypeMember, the parent class
> QAPISchemaMember allows initialization with info=None. I can't fully
> trace all of the callsites, but one of them at least is in types.py:
>
>>     enum_members = members + [QAPISchemaEnumMember('_MAX', None)]

I see.

We may want to do the _MAX thingy differently.  Not now.

> which necessitates, for now, info-less QAPISchemaEnumMember, which
> necessitates info-less QAPISchemaMember. There are others, etc.

Overriding an inherited attribute of type Optional[T] so it's
non-optional T makes mypy unhappy?

>> >     neither actually use it in practice.
>> >
>> > Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> Hmm.
>
> Scary.
>
>>
>> We look up types by name in two ways:
>>
>> 1. Failure is a semantic error
>>
>>    Use .resolve_type(), passing real @info and @what.
>>
>>    Users:
>>
>>    * QAPISchemaArrayType.check() resolving the element type
>>
>>      Fine print: when the array type is built-in, we pass None @info and
>>      @what.  The built-in array type's element type must exist for
>>      .resolve_type() to work.  This commit changes .resolve_type() to
>>      assert it does.
>>
>>    * QAPISchemaObjectType.check() resolving the base type
>>
>>    * QAPISchemaObjectTypeMember.check() resolving the member type
>>
>>    * QAPISchemaCommand.check() resolving argument type (if named) and
>>      return type (which is always named).
>>
>>    * QAPISchemaEvent.check() resolving argument type (if named).
>>
>>    Note all users are in .check() methods.  That's where type names get
>>    resolved.
>>
>> 2. Handle failure
>>
>>    Use .lookup_type(), which returns None when the named type doesn't
>>    exist.
>>
>>    Users:
>>
>>    * QAPISchemaVariants.check(), to look up the base type containing the
>>      tag member for error reporting purposes.  Failure would be a
>>      programming error.
>>
>>    * .resolve_type(), which handles failure as semantic error
>>
>>    * ._make_array_type(), which uses it as "type exists already"
>>       predicate.
>>
>>    * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
>>      built-in types.  Failure would be a programming error.
>>
>> The next commit switches the uses where failure would be a programming
>> error from .lookup_type() to .resolve_type() without @info and @what, so
>> failure trips its assertion.  I don't like it, because it overloads
>> .resolve_type() to serve two rather different use cases:
>>
>> 1. Failure is a semantic error; pass @info and @what
>>
>> 2. Failure is a programming error; don't pass @info and @what
>>
>> The odd one out is of course QAPISchemaArrayType.check(), which wants to
>> use 1. for the user's types and 2. for built-in types.  Let's ignore it
>> for a second.
>
> "Let's ignore what motivated this patch" aww...

Just for a second, I swear!

>> I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
>> this out into its own helper if that helps (pardon the pun).
>>
>> Back to QAPISchemaArrayType.check().  Its need to resolve built-in
>> element types, which have no info, necessitates .resolve_type() taking
>> Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
>> unless it leads to mypy complications I can't see.
>
> Well, with this patch I allowed it to take Optional[QAPISourceInfo] -
> just keep in mind that QAPISemError *requires* an info object, even
> though the typing there is also Optional[QAPISourceInfo] ... It will
> assert that info is present in __str__.
>
> Actually, I'd love to change that too - and make it fully required -
> but since built-in types have no info, there's too many places I'd
> need to change to enforce this as a static type.
>
> Still.

Invariant: no error reports for built-in types.

Checked since forever by asserting info is not None, exploiting the fact
that info is None exactly for built-in types.

This makes info: Optional[QAPISourceInfo] by design.

Works.

Specializing it to just QAPISourceInfo moves the assertion check from
run time to compile time.  Might give a nice feeling, but I don't think
it's practical everywhere, and it doesn't really matter anyway.

Using a special value of QAPISourceInfo instead of None would also get
rid of the Optional, along with the potential of checking at compile
time.  Good trade *if* it simplifies the code.  See also the very end of
my reply.
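
For illustration only, the "special value" idea mentioned above might look like the sketch below.  The builtin flag, the BUILTIN_INFO name and format_error() are all hypothetical; nothing in this thread says the QAPI code has them.

```python
class SourceInfo:
    """Stand-in for QAPISourceInfo with a hypothetical 'builtin' flag."""
    def __init__(self, fname: str, line: int, builtin: bool = False):
        self.fname = fname
        self.line = line
        self.builtin = builtin


# One shared instance would mark every built-in definition, so callers
# could take a plain SourceInfo instead of Optional[SourceInfo].
BUILTIN_INFO = SourceInfo('<built-in>', 0, builtin=True)


def format_error(info: SourceInfo, msg: str) -> str:
    # Invariant from the discussion: no error reports for built-in types.
    # With a sentinel, the run-time check tests the flag instead of None.
    assert not info.builtin
    return f"{info.fname}:{info.line}: {msg}"
```

The Optional disappears from the signatures, but the invariant still has to be checked somewhere at run time, which is the cost/benefit question raised above.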

>> We can simply leave it as is.  Adding the assertion to .resolve_type()
>> is fine.
>>
>> Or we complicate QAPISchemaArrayType.check() to simplify
>> .resolve_type()'s typing, roughly like this:
>>
>>             if self.info:
>>                 self.element_type = schema.resolve_type(
>>                     self._element_type_name,
>>                     self.info, self.info.defn_meta)
>>             else:               # built-in type
>>                 self.element_type = schema.lookup_type(
>>                     self._element_type_name)
>>                 assert self.element_type
>>
>> Not sure it's worth the trouble.  Thoughts?
>
> I suppose it's your call, ultimately. This patch exists primarily to
> help in two places:
>
> (A) QAPISchemaArrayType.check(), as you've noticed, because it uses
> the same path for both built-in and user-defined types. This is the
> only place in the code where this occurs *at the moment*, but I can't
> predict the future.
>
> (B) Calls to lookup_type in introspect.py which look up built-in types
> and must-not-fail. It was cumbersome in the old patchset, but this one
> makes it simpler.
>
> I suppose at the moment, having the assert directly in resolve_type
> just means we get to use the same helper/pathway for both user-defined
> and built-in types, which matches the infrastructure we already have,
> which doesn't differentiate between the two. (By which I mean, all of
> the Schema classes are not split into built-in and user-defined types,
> so it is invisible to the type system.)

Yes.

> I could add conditional logic to the array check, and leave the
> lookup_type calls in introspect.py being a little cumbersome - my main
> concern with that solution is that I might be leaving a nasty
> booby-trap in the future if someone wants to add a new built-in type
> or something gets refactored to share more code pathways. Maybe that's
> not fully rational, but it's why I went the way I did.

In my mind, .resolve_type() is strictly for resolving types during
semantic analysis: look up a type by name, report an error if it doesn't
exist.

Before this patch:

(A) QAPISchemaArrayType.check() works.  The invariant check is buried
somewhat deep, in QAPISourceError.

(B) introspect.py works.  The invariant is not checked there.

(C) QAPISchemaVariants.check() works.  A rather loosely related invariant
is checked there: the tag member's type exists.

This patch conflates two changes.

One, it adds an invariant check right to .resolve_type().  Impact:

    (A) Adds an invariant check closer to the surface.

    (B) Not touched.

    (C) Not touched.

No objection.

Two, it defaults .resolve_type()'s arguments to None.  Belongs to the
next patch.

The next patch overloads .resolve_type() to serve two use cases,
1. failure is a semantic error, and 2. failure is a programming error.
The first kind passes the arguments, the second doesn't.  Impact:

    (A) Not touched.

    (B) Adds invariant checking, in the callee.

    (C) Pushes the invariant checking into the callee.

I don't like overloading .resolve_type() this way.  Again: in my mind,
it's strictly for resolving the user's type names in semantic analysis.

If I drop this patch and the next one, mypy complains

    scripts/qapi/schema.py:1219: error: Argument 1 has incompatible type "QAPISourceInfo | None"; expected "QAPISourceInfo"  [arg-type]
    scripts/qapi/introspect.py:230: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
    scripts/qapi/introspect.py:233: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]

Retaining the assertion added in this patch takes care of the first one.

To get rid of the two in introspect.py, we need to actually check the
invariant:

diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index 67c7d89aae..4679b1bc2c 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -227,10 +227,14 @@ def _use_type(self, typ: QAPISchemaType) -> str:
 
         # Map the various integer types to plain int
         if typ.json_type() == 'int':
-            typ = self._schema.lookup_type('int')
+            type_int = self._schema.lookup_type('int')
+            assert type_int
+            typ = type_int
         elif (isinstance(typ, QAPISchemaArrayType) and
               typ.element_type.json_type() == 'int'):
-            typ = self._schema.lookup_type('intList')
+            type_intList = self._schema.lookup_type('intList')
+            assert type_intList
+            typ = type_intList
         # Add type to work queue if new
         if typ not in self._used_types:
             self._used_types.append(typ)

Straightforward enough, although with a bit of notational overhead.

We use t = .lookup_type(...); assert t in three places then.  Feel free
to factor it out into a new helper.
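
One possible shape for such a helper, as a sketch: the method name must_lookup_type() and the toy Schema and SchemaType classes are assumptions made for this example, not names from the patch series.

```python
from typing import Dict, Optional


class SchemaType:
    """Simplified stand-in for QAPISchemaType (hypothetical)."""
    def __init__(self, name: str):
        self.name = name


class Schema:
    """Toy schema holding a couple of built-in types."""
    def __init__(self) -> None:
        self._types: Dict[str, SchemaType] = {
            n: SchemaType(n) for n in ('int', 'intList')
        }

    def lookup_type(self, name: str) -> Optional[SchemaType]:
        # Returns None when the named type does not exist.
        return self._types.get(name)

    def must_lookup_type(self, name: str) -> SchemaType:
        # Hypothetical helper: failure here is a programming error,
        # so assert instead of reporting a semantic error.
        typ = self.lookup_type(name)
        assert typ is not None, f"type '{name}' must exist"
        return typ
```

Because the helper returns a non-optional SchemaType, a caller such as the _use_type() hunk above could assign the result directly without a temporary and a separate assert.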

> (P.S. I still violently want to create an info object that represents
> built-in definitions so I can just get rid of all the
> Optional[QAPISourceInfo] types from everywhere. I know I tried to do
> it before and you vetoed it, but the desire lives on in my heart.)

Once everything is properly typed, the cost and benefit of such a change
should be more clearly visible.

For now, let's try to type what we have, unless what we have complicates
typing too much.

[...]
John Snow Jan. 31, 2024, 10:28 p.m. UTC | #4
On Mon, Jan 22, 2024 at 8:12 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> John Snow <jsnow@redhat.com> writes:
>
> > On Tue, Jan 16, 2024 at 6:09 AM Markus Armbruster <armbru@redhat.com> wrote:
> >>
> >> John Snow <jsnow@redhat.com> writes:
> >>
> >> > allow resolve_type to be used for both built-in and user-specified
> >> > type definitions. In the event that the type cannot be resolved, assert
> >> > that 'info' and 'what' were both provided in order to create a usable
> >> > QAPISemError.
> >> >
> >> > In practice, 'info' will only be None for built-in definitions, which
> >> > *should not fail* type lookup.
> >> >
> >> > As a convenience, allow the 'what' and 'info' parameters to be elided
> >> > entirely so that it can be used as a can-not-fail version of
> >> > lookup_type.
> >>
> >> The convenience remains unused until the next patch.  It should be added
> >> there.
> >
> > Okie-ducky.
> >
> >>
> >> > Note: there are only three callsites to resolve_type at present where
> >> > "info" is perceived to be possibly None:
> >> >
> >> >     1) QAPISchemaArrayType.check()
> >> >     2) QAPISchemaObjectTypeMember.check()
> >> >     3) QAPISchemaEvent.check()
> >> >
> >> >     Of those three, only the first actually ever passes None;
> >>
> >> Yes.  More below.
> >
> > Scary...
>
> I know...
>
> >> >                                                               the other two
> >> >     are limited by their base class initializers which accept info=None, but
> >>
> >> They do?
> >
> > In the case of QAPISchemaObjectTypeMember, the parent class
> > QAPISchemaMember allows initialization with info=None. I can't fully
> > trace all of the callsites, but one of them at least is in types.py:
> >
> >>     enum_members = members + [QAPISchemaEnumMember('_MAX', None)]
>
> I see.
>
> We may want to do the _MAX thingy differently.  Not now.
>
> > which necessitates, for now, info-less QAPISchemaEnumMember, which
> > necessitates info-less QAPISchemaMember. There are others, etc.
>
> Overriding an inherited attribute of type Optional[T] so it's
> non-optional T makes mypy unhappy?

Yeah, it considers it to be improper OO - it remembers only the
broadest type from the base class, which is Optional[T]. We aren't
overriding the property itself, we've just redefined a different
initializer, which doesn't carry through to the actual object.

(i.e. the initializer takes a T, the core object has an Optional[T],
there's no problem - but the field remains Optional[T].)
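
A small sketch of the situation in the parenthetical above, using made-up class names (Info, Member, ObjectTypeMember) rather than the real QAPI classes: the subclass initializer accepts only a non-optional value, yet the attribute keeps the optional type it got in the base class.

```python
from typing import Optional


class Info:
    """Stand-in for a source-info object (hypothetical)."""
    def __init__(self, fname: str):
        self.fname = fname


class Member:
    def __init__(self, name: str, info: Optional[Info]):
        self.name = name
        self.info = info  # attribute type becomes Optional[Info]


class ObjectTypeMember(Member):
    def __init__(self, name: str, info: Info):
        # Narrower parameter type, but the inherited attribute
        # declaration is not changed by this.
        super().__init__(name, info)

    def where(self) -> str:
        # A type checker still sees self.info as Optional[Info] here,
        # so it has to be narrowed before use.
        assert self.info is not None
        return self.info.fname
```

At run time ObjectTypeMember never holds None in this sketch; the assertion exists only to tell the type checker what the initializer already guarantees.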

>
> >> >     neither actually use it in practice.
> >> >
> >> > Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >> Hmm.
> >
> > Scary.
> >
> >>
> >> We look up types by name in two ways:
> >>
> >> 1. Failure is a semantic error
> >>
> >>    Use .resolve_type(), passing real @info and @what.
> >>
> >>    Users:
> >>
> >>    * QAPISchemaArrayType.check() resolving the element type
> >>
> >>      Fine print: when the array type is built-in, we pass None @info and
> >>      @what.  The built-in array type's element type must exist for
> >>      .resolve_type() to work.  This commit changes .resolve_type() to
> >>      assert it does.
> >>
> >>    * QAPISchemaObjectType.check() resolving the base type
> >>
> >>    * QAPISchemaObjectTypeMember.check() resolving the member type
> >>
> >>    * QAPISchemaCommand.check() resolving argument type (if named) and
> >>      return type (which is always named).
> >>
> >>    * QAPISchemaEvent.check() resolving argument type (if named).
> >>
> >>    Note all users are in .check() methods.  That's where type names get
> >>    resolved.
> >>
> >> 2. Handle failure
> >>
> >>    Use .lookup_type(), which returns None when the named type doesn't
> >>    exist.
> >>
> >>    Users:
> >>
> >>    * QAPISchemaVariants.check(), to look up the base type containing the
> >>      tag member for error reporting purposes.  Failure would be a
> >>      programming error.
> >>
> >>    * .resolve_type(), which handles failure as semantic error
> >>
> >>    * ._make_array_type(), which uses it as "type exists already"
> >>       predicate.
> >>
> >>    * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
> >>      built-in types.  Failure would be a programming error.
> >>
> >> The next commit switches the uses where failure would be a programming
> >> error from .lookup_type() to .resolve_type() without @info and @what, so
> >> failure trips its assertion.  I don't like it, because it overloads
> >> .resolve_type() to serve two rather different use cases:
> >>
> >> 1. Failure is a semantic error; pass @info and @what
> >>
> >> 2. Failure is a programming error; don't pass @info and @what
> >>
> >> The odd one out is of course QAPISchemaArrayType.check(), which wants to
> >> use 1. for the user's types and 2. for built-in types.  Let's ignore it
> >> for a second.
> >
> > "Let's ignore what motivated this patch" aww...
>
> Just for a second, I swear!
>
> >> I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
> >> this out into its own helper if that helps (pardon the pun).
> >>
> >> Back to QAPISchemaArrayType.check().  Its need to resolve built-in
> >> element types, which have no info, necessitates .resolve_type() taking
> >> Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
> >> unless it leads to mypy complications I can't see.
> >
> > Well, with this patch I allowed it to take Optional[QAPISourceInfo] -
> > just keep in mind that QAPISemError *requires* an info object, even
> > though the typing there is also Optional[QAPISourceInfo] ... It will
> > assert that info is present in __str__.
> >
> > Actually, I'd love to change that too - and make it fully required -
> > but since built-in types have no info, there's too many places I'd
> > need to change to enforce this as a static type.
> >
> > Still.
>
> Invariant: no error reports for built-in types.
>
> Checked since forever by asserting info is not None, exploiting the fact
> that info is None exactly for built-in types.
>
> This makes info: Optional[QAPISourceInfo] by design.
>
> Works.
>
> Specializing it to just QAPISourceInfo moves the assertion check from
> run time to compile time.  Might give a nice feeling, but I don't think
> it's practical everywhere, and it doesn't really matter anyway.
>
> Using a special value of QAPISourceInfo instead of None would also get
> rid of the Optional, along with the potential of checking at compile
> time.  Good trade *if* it simplifies the code.  See also the very end of
> my reply.
>
> >> We can simply leave it as is.  Adding the assertion to .resolve_type()
> >> is fine.
> >>
> >> Or we complicate QAPISchemaArrayType.check() to simplify
> >> .resolve_type()'s typing, roughly like this:
> >>
> >>             if self.info:
> >>                 self.element_type = schema.resolve_type(
> >>                     self._element_type_name,
> >>                     self.info, self.info.defn_meta)
> >>             else:               # built-in type
> >>                 self.element_type = schema.lookup_type(
> >>                     self._element_type_name)
> >>                 assert self.element_type
> >>
> >> Not sure it's worth the trouble.  Thoughts?
> >
> > I suppose it's your call, ultimately. This patch exists primarily to
> > help in two places:
> >
> > (A) QAPISchemaArrayType.check(), as you've noticed, because it uses
> > the same path for both built-in and user-defined types. This is the
> > only place in the code where this occurs *at the moment*, but I can't
> > predict the future.
> >
> > (B) Calls to lookup_type in introspect.py which look up built-in types
> > and must-not-fail. It was cumbersome in the old patchset, but this one
> > makes it simpler.
> >
> > I suppose at the moment, having the assert directly in resolve_type
> > just means we get to use the same helper/pathway for both user-defined
> > and built-in types, which matches the infrastructure we already have,
> > which doesn't differentiate between the two. (By which I mean, all of
> > the Schema classes are not split into built-in and user-defined types,
> > so it is invisible to the type system.)
>
> Yes.
>
> > I could add conditional logic to the array check, and leave the
> > lookup_type calls in introspect.py being a little cumbersome - my main
> > concern with that solution is that I might be leaving a nasty
> > booby-trap in the future if someone wants to add a new built-in type
> > or something gets refactored to share more code pathways. Maybe that's
> > not fully rational, but it's why I went the way I did.
>
> In my mind, .resolve_type() is strictly for resolving types during
> semantic analysis: look up a type by name, report an error if it doesn't
> exist.
>
> Before this patch:
>
> (A) QAPISchemaArrayType.check() works.  The invariant check is buried
> somewhat deep, in QAPISourceError.
>
> (B) introspect.py works.  The invariant is not checked there.
>
> (C) QAPISchemaVariants.check() works.  A rather loosely related invariant
> is checked there: the tag member's type exists.
>
> This patch conflates two changes.
>
> One, it adds an invariant check right to .resolve_type().  Impact:
>
>     (A) Adds an invariant check closer to the surface.
>
>     (B) Not touched.
>
>     (C) Not touched.
>
> No objection.
>
> Two, it defaults .resolve_type()'s arguments to None.  Belongs to the
> next patch.
>
> The next patch overloads .resolve_type() to serve two use cases,
> 1. failure is a semantic error, and 2. failure is a programming error.
> The first kind passes the arguments, the second doesn't.  Impact:
>
>     (A) Not touched.
>
>     (B) Adds invariant checking, in the callee.
>
>     (C) Pushes the invariant checking into the callee.
>
> I don't like overloading .resolve_type() this way.  Again: in my mind,
> it's strictly for resolving the user's type names in semantic analysis.
>
> If I drop this patch and the next one, mypy complains
>
>     scripts/qapi/schema.py:1219: error: Argument 1 has incompatible type "QAPISourceInfo | None"; expected "QAPISourceInfo"  [arg-type]
>     scripts/qapi/introspect.py:230: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
>     scripts/qapi/introspect.py:233: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
>
> Retaining the assertion added in this patch takes care of the first one.
>
> To get rid of the two in introspect.py, we need to actually check the
> invariant:
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 67c7d89aae..4679b1bc2c 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -227,10 +227,14 @@ def _use_type(self, typ: QAPISchemaType) -> str:
>
>          # Map the various integer types to plain int
>          if typ.json_type() == 'int':
> -            typ = self._schema.lookup_type('int')
> +            type_int = self._schema.lookup_type('int')
> +            assert type_int
> +            typ = type_int
>          elif (isinstance(typ, QAPISchemaArrayType) and
>                typ.element_type.json_type() == 'int'):
> -            typ = self._schema.lookup_type('intList')
> +            type_intList = self._schema.lookup_type('intList')
> +            assert type_intList
> +            typ = type_intList
>          # Add type to work queue if new
>          if typ not in self._used_types:
>              self._used_types.append(typ)
>
> Straightforward enough, although with a bit of notational overhead.
>
> We use t = .lookup_type(...); assert t in three places then.  Feel free
> to factor it out into a new helper.
>
> > (P.S. I still violently want to create an info object that represents
> > built-in definitions so I can just get rid of all the
> > Optional[QAPISourceInfo] types from everywhere. I know I tried to do
> > it before and you vetoed it, but the desire lives on in my heart.)
>
> Once everything is properly typed, the cost and benefit of such a change
> should be more clearly visible.
>
> For now, let's try to type what we have, unless what we have complicates
> typing too much.
John Snow Jan. 31, 2024, 11:04 p.m. UTC | #5
On Mon, Jan 22, 2024 at 8:12 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> John Snow <jsnow@redhat.com> writes:
>
> > On Tue, Jan 16, 2024 at 6:09 AM Markus Armbruster <armbru@redhat.com> wrote:
> >>
> >> John Snow <jsnow@redhat.com> writes:
> >>
> >> > allow resolve_type to be used for both built-in and user-specified
> >> > type definitions. In the event that the type cannot be resolved, assert
> >> > that 'info' and 'what' were both provided in order to create a usable
> >> > QAPISemError.
> >> >
> >> > In practice, 'info' will only be None for built-in definitions, which
> >> > *should not fail* type lookup.
> >> >
> >> > As a convenience, allow the 'what' and 'info' parameters to be elided
> >> > entirely so that it can be used as a can-not-fail version of
> >> > lookup_type.
> >>
> >> The convenience remains unused until the next patch.  It should be added
> >> there.
> >
> > Okie-ducky.
> >
> >>
> >> > Note: there are only three callsites to resolve_type at present where
> >> > "info" is perceived to be possibly None:
> >> >
> >> >     1) QAPISchemaArrayType.check()
> >> >     2) QAPISchemaObjectTypeMember.check()
> >> >     3) QAPISchemaEvent.check()
> >> >
> >> >     Of those three, only the first actually ever passes None;
> >>
> >> Yes.  More below.
> >
> > Scary...
>
> I know...
>
> >> >                                                               the other two
> >> >     are limited by their base class initializers which accept info=None, but
> >>
> >> They do?
> >
> > In the case of QAPISchemaObjectTypeMember, the parent class
> > QAPISchemaMember allows initialization with info=None. I can't fully
> > trace all of the callsites, but one of them at least is in types.py:
> >
> >>     enum_members = members + [QAPISchemaEnumMember('_MAX', None)]
>
> I see.
>
> We may want to do the _MAX thingy differently.  Not now.
>
> > which necessitates, for now, info-less QAPISchemaEnumMember, which
> > necessitates info-less QAPISchemaMember. There are others, etc.
>
> Overriding an inherited attribute of type Optional[T] so it's
> non-optional T makes mypy unhappy?
>
> >> >     neither actually use it in practice.
> >> >
> >> > Signed-off-by: John Snow <jsnow@redhat.com>
> >>
> >> Hmm.
> >
> > Scary.
> >
> >>
> >> We look up types by name in two ways:
> >>
> >> 1. Failure is a semantic error
> >>
> >>    Use .resolve_type(), passing real @info and @what.
> >>
> >>    Users:
> >>
> >>    * QAPISchemaArrayType.check() resolving the element type
> >>
> >>      Fine print: when the array type is built-in, we pass None @info and
> >>      @what.  The built-in array type's element type must exist for
> >>      .resolve_type() to work.  This commit changes .resolve_type() to
> >>      assert it does.
> >>
> >>    * QAPISchemaObjectType.check() resolving the base type
> >>
> >>    * QAPISchemaObjectTypeMember.check() resolving the member type
> >>
> >>    * QAPISchemaCommand.check() resolving argument type (if named) and
> >>      return type (which is always named).
> >>
> >>    * QAPISchemaEvent.check() resolving argument type (if named).
> >>
> >>    Note all users are in .check() methods.  That's where type names get
> >>    resolved.
> >>
> >> 2. Handle failure
> >>
> >>    Use .lookup_type(), which returns None when the named type doesn't
> >>    exist.
> >>
> >>    Users:
> >>
> >>    * QAPISchemaVariants.check(), to look up the base type containing the
> >>      tag member for error reporting purposes.  Failure would be a
> >>      programming error.
> >>
> >>    * .resolve_type(), which handles failure as semantic error
> >>
> >>    * ._make_array_type(), which uses it as "type exists already"
> >>       predicate.
> >>
> >>    * QAPISchemaGenIntrospectVisitor._use_type(), to look up certain
> >>      built-in types.  Failure would be a programming error.
> >>
> >> The next commit switches the uses where failure would be a programming
> >> error from .lookup_type() to .resolve_type() without @info and @what, so
> >> failure trips its assertion.  I don't like it, because it overloads
> >> .resolve_type() to serve two rather different use cases:
> >>
> >> 1. Failure is a semantic error; pass @info and @what
> >>
> >> 2. Failure is a programming error; don't pass @info and @what
> >>
> >> The odd one out is of course QAPISchemaArrayType.check(), which wants to
> >> use 1. for the user's types and 2. for built-in types.  Let's ignore it
> >> for a second.
> >
> > "Let's ignore what motivated this patch" aww...
>
> Just for a second, I swear!
>
> >> I prefer to do 2. like typ = .lookup_type(); assert typ.  We can factor
> >> this out into its own helper if that helps (pardon the pun).
> >>
> >> Back to QAPISchemaArrayType.check().  Its need to resolve built-in
> >> element types, which have no info, necessitates .resolve_type() taking
> >> Optional[QAPISourceInfo].  This might bother you.  It doesn't bother me,
> >> unless it leads to mypy complications I can't see.
> >
> > Well, with this patch I allowed it to take Optional[QAPISourceInfo] -
> > just keep in mind that QAPISemError *requires* an info object, even
> > though the typing there is also Optional[QAPISourceInfo] ... It will
> > assert that info is present in __str__.
> >
> > Actually, I'd love to change that too - and make it fully required -
> > but since built-in types have no info, there are too many places I'd
> > need to change to enforce this as a static type.
> >
> > Still.
>
> Invariant: no error reports for built-in types.
>
> Checked since forever by asserting info is not None, exploiting the fact
> that info is None exactly for built-in types.
>
> This makes info: Optional[QAPISourceInfo] by design.
>
> Works.
>
> Specializing it to just QAPISourceInfo moves the assertion check from
> run time to compile time.  Might give a nice feeling, but I don't think
> it's practical everywhere, and it doesn't really matter anyway.
>
> Using a special value of QAPISourceInfo instead of None would also get
> rid of the Optional, along with the potential of checking at compile
> time.  Good trade *if* it simplifies the code.  See also the very end of
> my reply.
>
> >> We can simply leave it as is.  Adding the assertion to .resolve_type()
> >> is fine.
> >>
> >> Or we complicate QAPISchemaArrayType.check() to simplify
> >> .resolve_type()'s typing, roughly like this:
> >>
> >>             if self.info:
> >>                 self.element_type = schema.resolve_type(
> >>                     self._element_type_name,
> >>                     self.info, self.info.defn_meta)
> >>             else:               # built-in type
> >>                 self.element_type = schema.lookup_type(
> >>                     self._element_type_name)
> >>                 assert self.element_type
> >>
> >> Not sure it's worth the trouble.  Thoughts?
> >
> > I suppose it's your call, ultimately. This patch exists primarily to
> > help in two places:
> >
> > (A) QAPISchemaArrayType.check(), as you've noticed, because it uses
> > the same path for both built-in and user-defined types. This is the
> > only place in the code where this occurs *at the moment*, but I can't
> > predict the future.
> >
> > (B) Calls to lookup_type in introspect.py which look up built-in types
> > and must-not-fail. It was cumbersome in the old patchset, but this one
> > makes it simpler.
> >
> > I suppose at the moment, having the assert directly in resolve_type
> > just means we get to use the same helper/pathway for both user-defined
> > and built-in types, which matches the infrastructure we already have,
> > which doesn't differentiate between the two. (By which I mean, all of
> > the Schema classes are not split into built-in and user-defined types,
> > so it is invisible to the type system.)
>
> Yes.
>
> > I could add conditional logic to the array check, and leave the
> > lookup_type calls in introspect.py being a little cumbersome - my main
> > concern with that solution is that I might be leaving a nasty
> > booby-trap in the future if someone wants to add a new built-in type
> > or something gets refactored to share more code pathways. Maybe that's
> > not fully rational, but it's why I went the way I did.
>
> In my mind, .resolve_type() is strictly for resolving types during
> semantic analysis: look up a type by name, report an error if it doesn't
> exist.
>

In my mind, it's a function which must not return None, which makes it
useful. If it has different failure modes for different arguments,
that doesn't matter much to me. Assertions for programmer errors and
QAPISemError for semantic errors seem fine.
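
To make the two failure modes concrete, here is a minimal sketch. It
does not use the real scripts/qapi classes: Schema, SemError, the
dict-backed type table and the string-valued info are stand-ins assumed
purely for illustration.

```python
from typing import Callable, Dict, Optional, Union


class SemError(Exception):
    """Stand-in for QAPISemError: a problem in the user's schema."""


class Schema:
    """Hypothetical, heavily simplified stand-in for QAPISchema."""

    def __init__(self, types: Dict[str, object]) -> None:
        self._types = types

    def lookup_type(self, name: str) -> Optional[object]:
        # May return None; the caller has to deal with that.
        return self._types.get(name)

    def resolve_type(
        self,
        name: str,
        info: Optional[str] = None,
        what: Union[str, Callable[[str], str], None] = None,
    ) -> object:
        typ = self.lookup_type(name)
        if typ is None:
            # Without info/what the caller promised that lookup cannot
            # fail, so a miss is a programming error, not a schema error.
            assert info and what, f"type '{name}' must not fail lookup"
            if callable(what):
                what = what(info)
            raise SemError(f"{info}: {what} uses unknown type '{name}'")
        return typ
```

Either way the caller never sees None: a miss with info and what
raises the semantic error, a miss without them trips the assertion.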

> Before this patch:
>
> (A) QAPISchemaArrayType.check() works.  The invariant check is buried
> somewhat deep, in QAPISourceError.
>

It also completely obscures what has actually failed with a pretty
unreadable error. It's a programmer error, sure, but I'm a programmer
and I hate being inconvenienced. (I have tripped on this bomb multiple
times while writing this series.)
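
A small sketch of why that failure is hard to read, assuming a
stand-in error class (SourceError and report() below are hypothetical,
not the real QAPISourceError): the error object is built without
complaint, and the invariant is only checked later, when something
formats it.

```python
from typing import Optional


class SourceError(Exception):
    """Stand-in for an error that needs source info to format itself."""

    def __init__(self, info: Optional[str], msg: str) -> None:
        super().__init__()
        self.info = info
        self.msg = msg

    def __str__(self) -> str:
        # The invariant is only checked here, at formatting time.
        assert self.info is not None
        return f"{self.info}: {self.msg}"


def report(err: Exception) -> str:
    # With info=None this is where things blow up: far away from the
    # failed lookup that created the error, and without its message.
    return str(err)
```

Constructing SourceError(None, ...) succeeds; the bare AssertionError
only appears once report() is called, which is what hides the original
cause.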

> (B) introspect.py works.  The invariant is not checked there.
>
> (C) QAPISchemaVariants.check() works.  A rather loosely related invariant
> is checked there: the tag member's type exists.
>
> This patch conflates two changes.
>
> One, it adds an invariant check right to .resolve_type().  Impact:
>
>     (A) Adds an invariant check closer to the surface.
>
>     (B) Not touched.
>
>     (C) Not touched.
>
> No objection.
>

OK, so I'll just keep the single new assert for this patch ...

> Two, it defaults .resolve_type()'s arguments to None.  Belongs to the
> next patch.
>
> The next patch overloads .resolve_type() to serve two use cases,
> 1. failure is a semantic error, and 2. failure is a programming error.
> The first kind passes the arguments, the second doesn't.  Impact:
>
>     (A) Not touched.
>
>     (B) Adds invariant checking, in the callee.
>
>     (C) Pushes the invariant checking into the callee.
>
> I don't like overloading .resolve_type() this way.  Again: in my mind,
> it's strictly for resolving the user's type names in semantic analysis.

It's already not *strictly* used for that, though, because of (C) in
particular. We have a lot less *goop* at the callsite by just teaching
resolve_type to understand which case it is being used for and
adapting it to raise the correct error in response (Assertion for
programmer failure, QAPISemError for semantic error.)

>
> If I drop this patch and the next one, mypy complains
>
>     scripts/qapi/schema.py:1219: error: Argument 1 has incompatible type "QAPISourceInfo | None"; expected "QAPISourceInfo"  [arg-type]
>     scripts/qapi/introspect.py:230: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
>     scripts/qapi/introspect.py:233: error: Incompatible types in assignment (expression has type "QAPISchemaType | None", variable has type "QAPISchemaType")  [assignment]
>
> Retaining the assertion added in this patch takes care of the first one.
>

Yep.

> To get rid of the two in introspect.py, we need to actually check the
> invariant:
>
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index 67c7d89aae..4679b1bc2c 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -227,10 +227,14 @@ def _use_type(self, typ: QAPISchemaType) -> str:
>
>          # Map the various integer types to plain int
>          if typ.json_type() == 'int':
> -            typ = self._schema.lookup_type('int')
> +            type_int = self._schema.lookup_type('int')
> +            assert type_int
> +            typ = type_int
>          elif (isinstance(typ, QAPISchemaArrayType) and
>                typ.element_type.json_type() == 'int'):
> -            typ = self._schema.lookup_type('intList')
> +            type_intList = self._schema.lookup_type('intList')
> +            assert type_intList
> +            typ = type_intList
>          # Add type to work queue if new
>          if typ not in self._used_types:
>              self._used_types.append(typ)
>
> Straightforward enough, although with a bit of notational overhead.

Yeah. It's goopy. I don't like the goop.

In my mind:

(1) resolve_type: idiomatic python for type resolution; Exception one
way or another if we fail.
(2) lookup_type: C-brained type resolution, return None if we fail and
make it the caller's problem to perform due diligence.

>
> We use t = .lookup_type(...); assert t in three places then.  Feel free
> to factor it out into a new helper.
>

It'd cut down on the goop. Not convinced we need yet-another-helper (I
even dropped my patch refactoring these because I decided it wasn't
worth it), but if you would *really* like to maintain some semantic
difference between lookup/resolve beyond the return type, I'll
probably go this route because I think it makes callsites the
cleanest.
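
For reference, such a helper could look roughly like this. The name
must_lookup_type is made up for this sketch, and Schema and SchemaType
are simplified stand-ins for the real classes.

```python
from typing import Dict, Optional


class SchemaType:
    """Stand-in for QAPISchemaType."""

    def __init__(self, name: str) -> None:
        self.name = name


class Schema:
    """Hypothetical, heavily simplified stand-in for QAPISchema."""

    def __init__(self, types: Dict[str, SchemaType]) -> None:
        self._types = types

    def lookup_type(self, name: str) -> Optional[SchemaType]:
        # "Handle failure" flavour: None when the type does not exist.
        return self._types.get(name)

    def must_lookup_type(self, name: str) -> SchemaType:
        # Hypothetical helper: failure here is a programming error.
        # The assert also narrows Optional[SchemaType] to SchemaType
        # for mypy, so callers need no extra temporaries.
        typ = self.lookup_type(name)
        assert typ is not None, f"type '{name}' must exist"
        return typ
```

A callsite then shrinks to something like
typ = schema.must_lookup_type('int'), while resolve_type stays
reserved for semantic analysis.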

> > (P.S. I still violently want to create an info object that represents
> > built-in definitions so I can just get rid of all the
> > Optional[QAPISourceInfo] types from everywhere. I know I tried to do
> > it before and you vetoed it, but the desire lives on in my heart.)
>
> Once everything is properly typed, the cost and benefit of such a change
> should be more clearly visible.
>
> For now, let's try to type what we have, unless what we have complicates
> typing too much.
>

Yes, it just would help sweep all of the dirt into a more consolidated
location. Trying to audit when and where info can be None takes more
brain cycles than I'd prefer. I'm not advocating for it to happen in
this series, I am just advocating for it to happen.
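
Purely as an illustration of the idea, and not something this series
does: a sentinel info object could look roughly like the sketch below.
SourceInfo, its builtin flag, BUILTIN_INFO and describe() are all
hypothetical names assumed for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceInfo:
    """Stand-in for QAPISourceInfo with an explicit built-in marker."""

    fname: str
    line: int
    builtin: bool = False

    def loc(self) -> str:
        # The built-in case is handled in exactly one place.
        return "[builtin]" if self.builtin else f"{self.fname}:{self.line}"


# Single shared sentinel used instead of None for built-in definitions.
BUILTIN_INFO = SourceInfo(fname="", line=0, builtin=True)


def describe(name: str, info: SourceInfo) -> str:
    # The parameter is not Optional, so there is no None check here
    # and mypy can verify callers statically.
    return f"{name} defined at {info.loc()}"
```

The trade-off is the one already discussed above: the Optional goes
away everywhere, but "no error reports for built-in types" would then
have to be checked via the flag rather than via info being None.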

> [...]
>

Patch

diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
index 66a78f28fd4..a77b51d1b96 100644
--- a/scripts/qapi/schema.py
+++ b/scripts/qapi/schema.py
@@ -1001,9 +1001,10 @@  def lookup_type(self, name):
         assert typ is None or isinstance(typ, QAPISchemaType)
         return typ
 
-    def resolve_type(self, name, info, what):
+    def resolve_type(self, name, info=None, what=None):
         typ = self.lookup_type(name)
         if not typ:
+            assert info and what  # built-in types must not fail lookup
             if callable(what):
                 what = what(info)
             raise QAPISemError(