[v4,1/3] qapi/qdev.json: add DEVICE_UNPLUG_ERROR QAPI event

Message ID 20210707003314.37110-2-danielhb413@gmail.com (mailing list archive)
State New, archived
Series DEVICE_UNPLUG_ERROR QAPI event

Commit Message

Daniel Henrique Barboza July 7, 2021, 12:33 a.m. UTC
At this moment we only provide one event to report a hotunplug error,
MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
machine is now able to report unplug errors for other device types, such
as CPUs.

Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
create a generic DEVICE_UNPLUG_ERROR event that can be used by all
unplug errors in the future.

With this new generic event, MEM_UNPLUG_ERROR is now marked as deprecated.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 docs/system/deprecated.rst | 10 ++++++++++
 qapi/machine.json          |  6 +++++-
 qapi/qdev.json             | 27 ++++++++++++++++++++++++++-
 3 files changed, 41 insertions(+), 2 deletions(-)

Comments

Greg Kurz July 7, 2021, 9:26 a.m. UTC | #1
On Tue,  6 Jul 2021 21:33:12 -0300
Daniel Henrique Barboza <danielhb413@gmail.com> wrote:

> At this moment we only provide one event to report a hotunplug error,
> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
> machine is now able to report unplug errors for other device types, such
> as CPUs.
> 
> Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
> create a generic DEVICE_UNPLUG_ERROR event that can be used by all
> unplug errors in the future.
> 
> With this new generic event, MEM_UNPLUG_ERROR is now marked as deprecated.
> 
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  docs/system/deprecated.rst | 10 ++++++++++
>  qapi/machine.json          |  6 +++++-
>  qapi/qdev.json             | 27 ++++++++++++++++++++++++++-
>  3 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> index 70e08baff6..ca6c7f9d43 100644
> --- a/docs/system/deprecated.rst
> +++ b/docs/system/deprecated.rst
> @@ -204,6 +204,16 @@ The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
>  (the ISA has never been upstreamed to a compiler toolchain). Therefore
>  this CPU is also deprecated.
>  
> +
> +QEMU API (QAPI) events
> +----------------------
> +
> +``MEM_UNPLUG_ERROR`` (since 6.1)
> +''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Use the more generic event ``DEVICE_UNPLUG_ERROR`` instead.
> +
> +
>  System emulator machines
>  ------------------------
>  
> diff --git a/qapi/machine.json b/qapi/machine.json
> index c3210ee1fb..a595c753d2 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1271,6 +1271,9 @@
>  #
>  # @msg: Informative message
>  #
> +# Features:
> +# @deprecated: This event is deprecated. Use @DEVICE_UNPLUG_ERROR instead.
> +#
>  # Since: 2.4
>  #
>  # Example:
> @@ -1283,7 +1286,8 @@
>  #
>  ##
>  { 'event': 'MEM_UNPLUG_ERROR',
> -  'data': { 'device': 'str', 'msg': 'str' } }
> +  'data': { 'device': 'str', 'msg': 'str' },
> +  'features': ['deprecated'] }
>  
>  ##
>  # @SMPConfiguration:
> diff --git a/qapi/qdev.json b/qapi/qdev.json
> index b83178220b..349d7439fa 100644
> --- a/qapi/qdev.json
> +++ b/qapi/qdev.json
> @@ -84,7 +84,9 @@
>  #        This command merely requests that the guest begin the hot removal
>  #        process.  Completion of the device removal process is signaled with a
>  #        DEVICE_DELETED event. Guest reset will automatically complete removal
> -#        for all devices.
> +#        for all devices. If an error in the hot removal process is detected,
> +#        the device will not be removed and a DEVICE_UNPLUG_ERROR event is
> +#        sent.
>  #
>  # Since: 0.14
>  #
> @@ -124,3 +126,26 @@
>  ##
>  { 'event': 'DEVICE_DELETED',
>    'data': { '*device': 'str', 'path': 'str' } }
> +
> +##
> +# @DEVICE_UNPLUG_ERROR:
> +#
> +# Emitted when a device hot unplug error occurs.
> +#
> +# @device: device name
> +#
> +# @msg: Informative message
> +#
> +# Since: 6.1
> +#
> +# Example:
> +#
> +# <- { "event": "DEVICE_UNPLUG_ERROR",
> +#      "data": { "device": "dimm1",
> +#                "msg": "Memory hotunplug rejected by the guest for device dimm1"
> +#      },
> +#      "timestamp": { "seconds": 1615570772, "microseconds": 202844 } }
> +#
> +##
> +{ 'event': 'DEVICE_UNPLUG_ERROR',
> +  'data': { 'device': 'str', 'msg': 'str' } }
Markus Armbruster July 8, 2021, 1:01 p.m. UTC | #2
Daniel Henrique Barboza <danielhb413@gmail.com> writes:

> At this moment we only provide one event to report a hotunplug error,
> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
> machine is now able to report unplug errors for other device types, such
> as CPUs.
>
> Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
> create a generic DEVICE_UNPLUG_ERROR event that can be used by all
> unplug errors in the future.
>
> With this new generic event, MEM_UNPLUG_ERROR is now marked as deprecated.
>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> ---
>  docs/system/deprecated.rst | 10 ++++++++++
>  qapi/machine.json          |  6 +++++-
>  qapi/qdev.json             | 27 ++++++++++++++++++++++++++-
>  3 files changed, 41 insertions(+), 2 deletions(-)
>
> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> index 70e08baff6..ca6c7f9d43 100644
> --- a/docs/system/deprecated.rst
> +++ b/docs/system/deprecated.rst
> @@ -204,6 +204,16 @@ The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
>  (the ISA has never been upstreamed to a compiler toolchain). Therefore
>  this CPU is also deprecated.
>  
> +
> +QEMU API (QAPI) events
> +----------------------
> +
> +``MEM_UNPLUG_ERROR`` (since 6.1)
> +''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Use the more generic event ``DEVICE_UNPLUG_ERROR`` instead.
> +
> +
>  System emulator machines
>  ------------------------
>  
> diff --git a/qapi/machine.json b/qapi/machine.json
> index c3210ee1fb..a595c753d2 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1271,6 +1271,9 @@
>  #
>  # @msg: Informative message
>  #
> +# Features:
> +# @deprecated: This event is deprecated. Use @DEVICE_UNPLUG_ERROR instead.
> +#
>  # Since: 2.4
>  #
>  # Example:
> @@ -1283,7 +1286,8 @@
>  #
>  ##
>  { 'event': 'MEM_UNPLUG_ERROR',
> -  'data': { 'device': 'str', 'msg': 'str' } }
> +  'data': { 'device': 'str', 'msg': 'str' },
> +  'features': ['deprecated'] }
>  
>  ##
>  # @SMPConfiguration:
> diff --git a/qapi/qdev.json b/qapi/qdev.json
> index b83178220b..349d7439fa 100644
> --- a/qapi/qdev.json
> +++ b/qapi/qdev.json
> @@ -84,7 +84,9 @@
>  #        This command merely requests that the guest begin the hot removal
>  #        process.  Completion of the device removal process is signaled with a
>  #        DEVICE_DELETED event. Guest reset will automatically complete removal
> -#        for all devices.
> +#        for all devices. If an error in the hot removal process is detected,
> +#        the device will not be removed and a DEVICE_UNPLUG_ERROR event is
> +#        sent.

"If an error ... is detected" kind of implies that some errors may go
undetected.  Let's spell this out more clearly.  Perhaps append "Some
errors cannot be detected."

DEVICE_UNPLUG_ERROR's unreliability is awkward.  Best we can do in the
general case.  Can we do better in special cases, and would it be
worthwhile?  If yes, it should probably be done on top.

Two spaces between sentences for consistency with the existing text, please.

>  #
>  # Since: 0.14
>  #
> @@ -124,3 +126,26 @@
>  ##
>  { 'event': 'DEVICE_DELETED',
>    'data': { '*device': 'str', 'path': 'str' } }
> +
> +##
> +# @DEVICE_UNPLUG_ERROR:
> +#
> +# Emitted when a device hot unplug error occurs.
> +#
> +# @device: device name
> +#
> +# @msg: Informative message
> +#
> +# Since: 6.1
> +#
> +# Example:
> +#
> +# <- { "event": "DEVICE_UNPLUG_ERROR",
> +#      "data": { "device": "dimm1",
> +#                "msg": "Memory hotunplug rejected by the guest for device dimm1"
> +#      },
> +#      "timestamp": { "seconds": 1615570772, "microseconds": 202844 } }
> +#
> +##
> +{ 'event': 'DEVICE_UNPLUG_ERROR',
> +  'data': { 'device': 'str', 'msg': 'str' } }

Hmm.

DEVICE_DELETED provides the device ID if the device has one, and the QOM
path.  Documentation is less than clear for both (not your patch's
fault).

DEVICE_UNPLUG_ERROR provides the device ID unconditionally, and doesn't
provide the QOM path.  What if the device doesn't have a device ID?

I suspect DEVICE_UNPLUG_ERROR should match DEVICE_DELETED exactly.

Bonus (for me, not for you): improving the unclear documentation becomes
your patch's problem.  Here's my attempt:

   # @device: the device's ID if it has one
   #
   # @path: the device's path within the object model
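
To illustrate the suggestion above from a management application's side,
here is a minimal Python sketch.  It assumes DEVICE_UNPLUG_ERROR ends up
with the same members as DEVICE_DELETED: an optional 'device' and a
mandatory 'path'.  The helper name describe_unplug_event is hypothetical,
and the sketch is not taken from any existing QMP client library.

```python
from typing import Optional


def describe_unplug_event(event: dict) -> Optional[str]:
    """Return a label for the device named by an unplug-related event.

    Assumption: DEVICE_UNPLUG_ERROR carries the same 'device' (optional)
    and 'path' (always present) members as DEVICE_DELETED.
    """
    if event.get("event") not in ("DEVICE_DELETED", "DEVICE_UNPLUG_ERROR"):
        return None
    data = event.get("data", {})
    # Prefer the user-assigned ID; fall back to the QOM path when the
    # device has no ID.
    return data.get("device") or data["path"]
```

With this shape, a client can identify the device through one code path
for both events, whether or not the device has an ID.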
Daniel Henrique Barboza July 8, 2021, 2:20 p.m. UTC | #3
On 7/8/21 10:01 AM, Markus Armbruster wrote:
> Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> 
>> At this moment we only provide one event to report a hotunplug error,
>> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
>> machine is now able to report unplug errors for other device types, such
>> as CPUs.
>>
>> Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
>> create a generic DEVICE_UNPLUG_ERROR event that can be used by all
>> unplug errors in the future.
>>
>> With this new generic event, MEM_UNPLUG_ERROR is now marked as deprecated.
>>
>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
>> ---
>>   docs/system/deprecated.rst | 10 ++++++++++
>>   qapi/machine.json          |  6 +++++-
>>   qapi/qdev.json             | 27 ++++++++++++++++++++++++++-
>>   3 files changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
>> index 70e08baff6..ca6c7f9d43 100644
>> --- a/docs/system/deprecated.rst
>> +++ b/docs/system/deprecated.rst
>> @@ -204,6 +204,16 @@ The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
>>   (the ISA has never been upstreamed to a compiler toolchain). Therefore
>>   this CPU is also deprecated.
>>   
>> +
>> +QEMU API (QAPI) events
>> +----------------------
>> +
>> +``MEM_UNPLUG_ERROR`` (since 6.1)
>> +''''''''''''''''''''''''''''''''''''''''''''''''''''''''
>> +
>> +Use the more generic event ``DEVICE_UNPLUG_ERROR`` instead.
>> +
>> +
>>   System emulator machines
>>   ------------------------
>>   
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index c3210ee1fb..a595c753d2 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -1271,6 +1271,9 @@
>>   #
>>   # @msg: Informative message
>>   #
>> +# Features:
>> +# @deprecated: This event is deprecated. Use @DEVICE_UNPLUG_ERROR instead.
>> +#
>>   # Since: 2.4
>>   #
>>   # Example:
>> @@ -1283,7 +1286,8 @@
>>   #
>>   ##
>>   { 'event': 'MEM_UNPLUG_ERROR',
>> -  'data': { 'device': 'str', 'msg': 'str' } }
>> +  'data': { 'device': 'str', 'msg': 'str' },
>> +  'features': ['deprecated'] }
>>   
>>   ##
>>   # @SMPConfiguration:
>> diff --git a/qapi/qdev.json b/qapi/qdev.json
>> index b83178220b..349d7439fa 100644
>> --- a/qapi/qdev.json
>> +++ b/qapi/qdev.json
>> @@ -84,7 +84,9 @@
>>   #        This command merely requests that the guest begin the hot removal
>>   #        process.  Completion of the device removal process is signaled with a
>>   #        DEVICE_DELETED event. Guest reset will automatically complete removal
>> -#        for all devices.
>> +#        for all devices. If an error in the hot removal process is detected,
>> +#        the device will not be removed and a DEVICE_UNPLUG_ERROR event is
>> +#        sent.
> 
> "If an error ... is detected" kind of implies that some errors may go
> undetected.  Let's spell this out more clearly.  Perhaps append "Some
> errors cannot be detected."
> 
> DEVICE_UNPLUG_ERROR's unreliability is awkward.  Best we can do in the
> general case.  Can we do better in special cases, and would it be
> worthwhile?  If yes, it should probably be done on top.
> 
> Two spaces between sentences for consistency with the existing text, please.

Ok!

> 
>>   #
>>   # Since: 0.14
>>   #
>> @@ -124,3 +126,26 @@
>>   ##
>>   { 'event': 'DEVICE_DELETED',
>>     'data': { '*device': 'str', 'path': 'str' } }
>> +
>> +##
>> +# @DEVICE_UNPLUG_ERROR:
>> +#
>> +# Emitted when a device hot unplug error occurs.
>> +#
>> +# @device: device name
>> +#
>> +# @msg: Informative message
>> +#
>> +# Since: 6.1
>> +#
>> +# Example:
>> +#
>> +# <- { "event": "DEVICE_UNPLUG_ERROR",
>> +#      "data": { "device": "dimm1",
>> +#                "msg": "Memory hotunplug rejected by the guest for device dimm1"
>> +#      },
>> +#      "timestamp": { "seconds": 1615570772, "microseconds": 202844 } }
>> +#
>> +##
>> +{ 'event': 'DEVICE_UNPLUG_ERROR',
>> +  'data': { 'device': 'str', 'msg': 'str' } }
> 
> Hmm.
> 
> DEVICE_DELETED provides the device ID if the device has one, and the QOM
> path.  Documentation is less than clear for both (not your patch's
> fault).

Now that you mention it, I realize that I was seeing both 'device' and 'path'
being propagated in this event all this time without noticing it. E.g.:

{"timestamp": {"seconds": 1625617532, "microseconds": 50228}, "event": "DEVICE_DELETED", "data": {"device": "core1", "path": "/machine/peripheral/core1"}}


> 
> DEVICE_UNPLUG_ERROR provides the device ID unconditionally, and doesn't
> provide the QOM path.  What if the device doesn't have a device ID?
> 
> I suspect DEVICE_UNPLUG_ERROR should match DEVICE_DELETED exactly.

Agree. That will allow us to send DEVICE_UNPLUG_ERROR events even if dev->id is
NULL since we're also providing the path.

DEVICE_UNPLUG_ERROR was inspired by MEM_UNPLUG_ERROR since the usage was similar,
but I guess we're better off basing the new event on the DEVICE_DELETED API instead.

This will also fix most of your inquiries in patches 2 and 3 as well.


I'll do the proper adjustments and re-send.


> 
> Bonus (for me, not for you): improving the unclear documentation becomes
> your patch's problem.  Here's my attempt:
> 
>     # @device: the device's ID if it has one
>     #
>     # @path: the device's path within the object model
> 

I can make a pre-patch that adds this information to the DEVICE_DELETED documentation
if you prefer, instead of putting everything into the same patch (since the amended
DEVICE_DELETED docs are useful regardless of this work).



Thanks,



Daniel
David Gibson July 12, 2021, 2:26 a.m. UTC | #4
On Thu, Jul 08, 2021 at 03:01:20PM +0200, Markus Armbruster wrote:
> Daniel Henrique Barboza <danielhb413@gmail.com> writes:
> 
> > At this moment we only provide one event to report a hotunplug error,
> > MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries
> > machine is now able to report unplug errors for other device types, such
> > as CPUs.
> >
> > Instead of creating a (device_type)_UNPLUG_ERROR for each new device,
> > create a generic DEVICE_UNPLUG_ERROR event that can be used by all
> > unplug errors in the future.
> >
> > With this new generic event, MEM_UNPLUG_ERROR is now marked as deprecated.
> >
> > Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> > ---
> >  docs/system/deprecated.rst | 10 ++++++++++
> >  qapi/machine.json          |  6 +++++-
> >  qapi/qdev.json             | 27 ++++++++++++++++++++++++++-
> >  3 files changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > index 70e08baff6..ca6c7f9d43 100644
> > --- a/docs/system/deprecated.rst
> > +++ b/docs/system/deprecated.rst
> > @@ -204,6 +204,16 @@ The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
> >  (the ISA has never been upstreamed to a compiler toolchain). Therefore
> >  this CPU is also deprecated.
> >  
> > +
> > +QEMU API (QAPI) events
> > +----------------------
> > +
> > +``MEM_UNPLUG_ERROR`` (since 6.1)
> > +''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> > +
> > +Use the more generic event ``DEVICE_UNPLUG_ERROR`` instead.
> > +
> > +
> >  System emulator machines
> >  ------------------------
> >  
> > diff --git a/qapi/machine.json b/qapi/machine.json
> > index c3210ee1fb..a595c753d2 100644
> > --- a/qapi/machine.json
> > +++ b/qapi/machine.json
> > @@ -1271,6 +1271,9 @@
> >  #
> >  # @msg: Informative message
> >  #
> > +# Features:
> > +# @deprecated: This event is deprecated. Use @DEVICE_UNPLUG_ERROR instead.
> > +#
> >  # Since: 2.4
> >  #
> >  # Example:
> > @@ -1283,7 +1286,8 @@
> >  #
> >  ##
> >  { 'event': 'MEM_UNPLUG_ERROR',
> > -  'data': { 'device': 'str', 'msg': 'str' } }
> > +  'data': { 'device': 'str', 'msg': 'str' },
> > +  'features': ['deprecated'] }
> >  
> >  ##
> >  # @SMPConfiguration:
> > diff --git a/qapi/qdev.json b/qapi/qdev.json
> > index b83178220b..349d7439fa 100644
> > --- a/qapi/qdev.json
> > +++ b/qapi/qdev.json
> > @@ -84,7 +84,9 @@
> >  #        This command merely requests that the guest begin the hot removal
> >  #        process.  Completion of the device removal process is signaled with a
> >  #        DEVICE_DELETED event. Guest reset will automatically complete removal
> > -#        for all devices.
> > +#        for all devices. If an error in the hot removal process is detected,
> > +#        the device will not be removed and a DEVICE_UNPLUG_ERROR event is
> > +#        sent.
> 
> "If an error ... is detected" kind of implies that some errors may go
> undetected.  Let's spell this out more clearly.  Perhaps append "Some
> errors cannot be detected."
> 
> DEVICE_UNPLUG_ERROR's unreliability is awkward.  Best we can do in the
> general case.  Can we do better in special cases, and would it be
> worthwhile?  If yes, it should probably be done on top.

I can't rule out such a special case entirely, but it's pretty hard to
imagine.  If we need any kind of acknowledgement from the guest to
complete the unplug, then the unplug failing but the guest never
reporting anything is going to be indistinguishable from the guest
working on the unplug but being super slow.
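
The practical consequence for a management application can be sketched in
Python as follows.  This is only an illustration under stated assumptions:
next_event is a hypothetical callback that blocks for up to the given
number of seconds and returns the next QMP event as a dict (or None), and
DEVICE_UNPLUG_ERROR is assumed to carry a 'path' member like
DEVICE_DELETED does.  The point is that a timeout can only be reported as
"unknown", never as a failure.

```python
import time
from typing import Callable, Optional


def wait_for_unplug(next_event: Callable[[float], Optional[dict]],
                    path: str, timeout: float) -> str:
    """Classify the outcome of an unplug request for the device at `path`.

    Returns "deleted", "error", or "unknown".  "unknown" means no event
    arrived in time: the guest may have failed silently or may just be slow.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "unknown"
        event = next_event(remaining)  # hypothetical event source
        if event is None:
            return "unknown"
        if event.get("data", {}).get("path") != path:
            continue  # event concerns another device; keep waiting
        if event.get("event") == "DEVICE_DELETED":
            return "deleted"
        if event.get("event") == "DEVICE_UNPLUG_ERROR":
            return "error"
```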

Patch

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 70e08baff6..ca6c7f9d43 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -204,6 +204,16 @@  The ``I7200`` guest CPU relies on the nanoMIPS ISA, which is deprecated
 (the ISA has never been upstreamed to a compiler toolchain). Therefore
 this CPU is also deprecated.
 
+
+QEMU API (QAPI) events
+----------------------
+
+``MEM_UNPLUG_ERROR`` (since 6.1)
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Use the more generic event ``DEVICE_UNPLUG_ERROR`` instead.
+
+
 System emulator machines
 ------------------------
 
diff --git a/qapi/machine.json b/qapi/machine.json
index c3210ee1fb..a595c753d2 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1271,6 +1271,9 @@ 
 #
 # @msg: Informative message
 #
+# Features:
+# @deprecated: This event is deprecated. Use @DEVICE_UNPLUG_ERROR instead.
+#
 # Since: 2.4
 #
 # Example:
@@ -1283,7 +1286,8 @@ 
 #
 ##
 { 'event': 'MEM_UNPLUG_ERROR',
-  'data': { 'device': 'str', 'msg': 'str' } }
+  'data': { 'device': 'str', 'msg': 'str' },
+  'features': ['deprecated'] }
 
 ##
 # @SMPConfiguration:
diff --git a/qapi/qdev.json b/qapi/qdev.json
index b83178220b..349d7439fa 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -84,7 +84,9 @@ 
 #        This command merely requests that the guest begin the hot removal
 #        process.  Completion of the device removal process is signaled with a
 #        DEVICE_DELETED event. Guest reset will automatically complete removal
-#        for all devices.
+#        for all devices. If an error in the hot removal process is detected,
+#        the device will not be removed and a DEVICE_UNPLUG_ERROR event is
+#        sent.
 #
 # Since: 0.14
 #
@@ -124,3 +126,26 @@ 
 ##
 { 'event': 'DEVICE_DELETED',
   'data': { '*device': 'str', 'path': 'str' } }
+
+##
+# @DEVICE_UNPLUG_ERROR:
+#
+# Emitted when a device hot unplug error occurs.
+#
+# @device: device name
+#
+# @msg: Informative message
+#
+# Since: 6.1
+#
+# Example:
+#
+# <- { "event": "DEVICE_UNPLUG_ERROR",
+#      "data": { "device": "dimm1",
+#                "msg": "Memory hotunplug rejected by the guest for device dimm1"
+#      },
+#      "timestamp": { "seconds": 1615570772, "microseconds": 202844 } }
+#
+##
+{ 'event': 'DEVICE_UNPLUG_ERROR',
+  'data': { 'device': 'str', 'msg': 'str' } }
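
As a rough illustration of how a QMP client might consume the events as
defined in this v4 patch, the Python sketch below treats the deprecated
MEM_UNPLUG_ERROR and the new DEVICE_UNPLUG_ERROR alike, since both carry
'device' and 'msg' here.  The function name parse_unplug_error is
hypothetical, and the event layout may change in later revisions of the
series.

```python
import json

# Event names taken from this patch; MEM_UNPLUG_ERROR is the deprecated one.
UNPLUG_ERROR_EVENTS = {"MEM_UNPLUG_ERROR", "DEVICE_UNPLUG_ERROR"}


def parse_unplug_error(line: str):
    """Return (device, msg, deprecated) for an unplug error event, else None."""
    event = json.loads(line)
    name = event.get("event")
    if name not in UNPLUG_ERROR_EVENTS:
        return None
    data = event["data"]
    # Flag the deprecated event so callers can log a migration hint.
    return data["device"], data["msg"], name == "MEM_UNPLUG_ERROR"
```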