Message ID | 20210604200353.1206897-2-danielhb413@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | DEVICE_UNPLUG_ERROR QAPI event | expand |
On Fri, Jun 04, 2021 at 05:03:52PM -0300, Daniel Henrique Barboza wrote: > At this moment we only provide one event to report a hotunplug error, > MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries > machine is now able to report unplug errors for other device types, such > as CPUs. > > Instead of creating a (device_type)_UNPLUG_ERROR for each new device, > create a generic DEVICE_UNPLUG_ERROR event that can be used by all > unplug errors in the future. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > --- > qapi/machine.json | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/qapi/machine.json b/qapi/machine.json > index 58a9c86b36..f0c7e56be0 100644 > --- a/qapi/machine.json > +++ b/qapi/machine.json > @@ -1274,3 +1274,26 @@ > ## > { 'event': 'MEM_UNPLUG_ERROR', > 'data': { 'device': 'str', 'msg': 'str' } } > + > +## > +# @DEVICE_UNPLUG_ERROR: > +# > +# Emitted when a device hot unplug error occurs. > +# > +# @device: device name > +# > +# @msg: Informative message > +# > +# Since: 6.1 > +# > +# Example: > +# > +# <- { "event": "DEVICE_UNPLUG_ERROR" > +# "data": { "device": "dimm1", > +# "msg": "Memory hotunplug rejected by the guest for device dimm1" > +# }, > +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } > +# > +## > +{ 'event': 'DEVICE_UNPLUG_ERROR', > + 'data': { 'device': 'str', 'msg': 'str' } }
On Mon, Jun 07, 2021 at 12:23:08PM +1000, David Gibson wrote: > On Fri, Jun 04, 2021 at 05:03:52PM -0300, Daniel Henrique Barboza wrote: > > At this moment we only provide one event to report a hotunplug error, > > MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries > > machine is now able to report unplug errors for other device types, such > > as CPUs. > > > > Instead of creating a (device_type)_UNPLUG_ERROR for each new device, > > create a generic DEVICE_UNPLUG_ERROR event that can be used by all > > unplug errors in the future. > > > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Markus, I'm happy to take this through my tree if that's convenient for you, but I'd like to get an ack. > > > --- > > qapi/machine.json | 23 +++++++++++++++++++++++ > > 1 file changed, 23 insertions(+) > > > > diff --git a/qapi/machine.json b/qapi/machine.json > > index 58a9c86b36..f0c7e56be0 100644 > > --- a/qapi/machine.json > > +++ b/qapi/machine.json > > @@ -1274,3 +1274,26 @@ > > ## > > { 'event': 'MEM_UNPLUG_ERROR', > > 'data': { 'device': 'str', 'msg': 'str' } } > > + > > +## > > +# @DEVICE_UNPLUG_ERROR: > > +# > > +# Emitted when a device hot unplug error occurs. > > +# > > +# @device: device name > > +# > > +# @msg: Informative message > > +# > > +# Since: 6.1 > > +# > > +# Example: > > +# > > +# <- { "event": "DEVICE_UNPLUG_ERROR" > > +# "data": { "device": "dimm1", > > +# "msg": "Memory hotunplug rejected by the guest for device dimm1" > > +# }, > > +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } > > +# > > +## > > +{ 'event': 'DEVICE_UNPLUG_ERROR', > > + 'data': { 'device': 'str', 'msg': 'str' } } >
On Mon, Jun 07, 2021 at 12:23:53PM +1000, David Gibson wrote: > On Mon, Jun 07, 2021 at 12:23:08PM +1000, David Gibson wrote: > > On Fri, Jun 04, 2021 at 05:03:52PM -0300, Daniel Henrique Barboza wrote: > > > At this moment we only provide one event to report a hotunplug error, > > > MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries > > > machine is now able to report unplug errors for other device types, such > > > as CPUs. > > > > > > Instead of creating a (device_type)_UNPLUG_ERROR for each new device, > > > create a generic DEVICE_UNPLUG_ERROR event that can be used by all > > > unplug errors in the future. > > > > > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > > > > Reviewed-by: David Gibson <david@gibson.dropbear.id.au> > > Markus, I'm happy to take this through my tree if that's convenient > for you, but I'd like to get an ack. I'm not Markus, but if you're happy with my interface review, Reviewed-by: Eric Blake <eblake@redhat.com>
Daniel Henrique Barboza <danielhb413@gmail.com> writes: > At this moment we only provide one event to report a hotunplug error, > MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries > machine is now able to report unplug errors for other device types, such > as CPUs. > > Instead of creating a (device_type)_UNPLUG_ERROR for each new device, > create a generic DEVICE_UNPLUG_ERROR event that can be used by all > unplug errors in the future. > > Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> > --- > qapi/machine.json | 23 +++++++++++++++++++++++ > 1 file changed, 23 insertions(+) > > diff --git a/qapi/machine.json b/qapi/machine.json > index 58a9c86b36..f0c7e56be0 100644 > --- a/qapi/machine.json > +++ b/qapi/machine.json > @@ -1274,3 +1274,26 @@ > ## > { 'event': 'MEM_UNPLUG_ERROR', > 'data': { 'device': 'str', 'msg': 'str' } } > + > +## > +# @DEVICE_UNPLUG_ERROR: > +# > +# Emitted when a device hot unplug error occurs. > +# > +# @device: device name > +# > +# @msg: Informative message > +# > +# Since: 6.1 > +# > +# Example: > +# > +# <- { "event": "DEVICE_UNPLUG_ERROR" > +# "data": { "device": "dimm1", > +# "msg": "Memory hotunplug rejected by the guest for device dimm1" > +# }, > +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } > +# > +## > +{ 'event': 'DEVICE_UNPLUG_ERROR', > + 'data': { 'device': 'str', 'msg': 'str' } } Missing: update of device_add's doc comment in qdev.json: # Notes: When this command completes, the device may not be removed from the # guest. Hot removal is an operation that requires guest cooperation. # This command merely requests that the guest begin the hot removal # process. Completion of the device removal process is signaled with a # DEVICE_DELETED event. Guest reset will automatically complete removal # for all devices. This sure could use some polish. If I understand things correctly, we're aiming for the following device unplug protocol: Unplug the device with device_del (or possibly equivalent) If we know we can't unplug the device, fail immediately. Also emit DEVICE_UNPLUG_ERROR. If possible, unplug the device synchronously and succeed. Also emit DEVICE_DELETED. Else, initiate unplug and succeed. When unplug finishes, emit either DEVICE_DELETED or DEVICE_UNPLUG_ERROR. For some machines and devices, unplug may never finish. Correct? Any particular reason for not putting event DEVICE_UNPLUG_ERROR next to DEVICE_DELETED in qdev.json?
On 6/11/21 9:12 AM, Markus Armbruster wrote: > Daniel Henrique Barboza <danielhb413@gmail.com> writes: > >> At this moment we only provide one event to report a hotunplug error, >> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries >> machine is now able to report unplug errors for other device types, such >> as CPUs. >> >> Instead of creating a (device_type)_UNPLUG_ERROR for each new device, >> create a generic DEVICE_UNPLUG_ERROR event that can be used by all >> unplug errors in the future. >> >> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >> --- >> qapi/machine.json | 23 +++++++++++++++++++++++ >> 1 file changed, 23 insertions(+) >> >> diff --git a/qapi/machine.json b/qapi/machine.json >> index 58a9c86b36..f0c7e56be0 100644 >> --- a/qapi/machine.json >> +++ b/qapi/machine.json >> @@ -1274,3 +1274,26 @@ >> ## >> { 'event': 'MEM_UNPLUG_ERROR', >> 'data': { 'device': 'str', 'msg': 'str' } } >> + >> +## >> +# @DEVICE_UNPLUG_ERROR: >> +# >> +# Emitted when a device hot unplug error occurs. >> +# >> +# @device: device name >> +# >> +# @msg: Informative message >> +# >> +# Since: 6.1 >> +# >> +# Example: >> +# >> +# <- { "event": "DEVICE_UNPLUG_ERROR" >> +# "data": { "device": "dimm1", >> +# "msg": "Memory hotunplug rejected by the guest for device dimm1" >> +# }, >> +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } >> +# >> +## >> +{ 'event': 'DEVICE_UNPLUG_ERROR', >> + 'data': { 'device': 'str', 'msg': 'str' } } > > Missing: update of device_add's doc comment in qdev.json: > > # Notes: When this command completes, the device may not be removed from the > # guest. Hot removal is an operation that requires guest cooperation. > # This command merely requests that the guest begin the hot removal > # process. Completion of the device removal process is signaled with a > # DEVICE_DELETED event. Guest reset will automatically complete removal > # for all devices. Ok > > This sure could use some polish. > > If I understand things correctly, we're aiming for the following device > unplug protocol: One thing to note is that DEVICE_UNPLUG_ERROR isn't guaranteed to be send for every hotunplug error. The event depends on machine/architecture support to detect a guest side error. > > Unplug the device with device_del (or possibly equivalent) > > If we know we can't unplug the device, fail immediately. Also emit > DEVICE_UNPLUG_ERROR. I haven't predicted to use this event in those cases as well, although it seems reasonable to do so now that you mentioned it. > > If possible, unplug the device synchronously and succeed. Also emit > DEVICE_DELETED. > > Else, initiate unplug and succeed. > > When unplug finishes, emit either DEVICE_DELETED or > DEVICE_UNPLUG_ERROR. Since there's no 100% guarantee that DEVICE_UNPLUG_ERROR will be emitted for guest side errors, the wording here would be "When unplug finishes, emit DEVICE_DELETED. A DEVICE_UNPLUG_ERROR can be emitted if a guest side error was detected" > > For some machines and devices, unplug may never finish. > > Correct? > > Any particular reason for not putting event DEVICE_UNPLUG_ERROR next to > DEVICE_DELETED in qdev.json? Not really. I looked where MEM_UNPLUG_ERROR was declared and put it right after it. I can change it to qdev.json near DEVICE_DELETED. Thanks, Daniel > >
Daniel Henrique Barboza <danielhb413@gmail.com> writes: > On 6/11/21 9:12 AM, Markus Armbruster wrote: >> Daniel Henrique Barboza <danielhb413@gmail.com> writes: >> >>> At this moment we only provide one event to report a hotunplug error, >>> MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries >>> machine is now able to report unplug errors for other device types, such >>> as CPUs. >>> >>> Instead of creating a (device_type)_UNPLUG_ERROR for each new device, >>> create a generic DEVICE_UNPLUG_ERROR event that can be used by all >>> unplug errors in the future. >>> >>> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> >>> --- >>> qapi/machine.json | 23 +++++++++++++++++++++++ >>> 1 file changed, 23 insertions(+) >>> >>> diff --git a/qapi/machine.json b/qapi/machine.json >>> index 58a9c86b36..f0c7e56be0 100644 >>> --- a/qapi/machine.json >>> +++ b/qapi/machine.json >>> @@ -1274,3 +1274,26 @@ >>> ## >>> { 'event': 'MEM_UNPLUG_ERROR', >>> 'data': { 'device': 'str', 'msg': 'str' } } >>> + >>> +## >>> +# @DEVICE_UNPLUG_ERROR: >>> +# >>> +# Emitted when a device hot unplug error occurs. >>> +# >>> +# @device: device name >>> +# >>> +# @msg: Informative message >>> +# >>> +# Since: 6.1 >>> +# >>> +# Example: >>> +# >>> +# <- { "event": "DEVICE_UNPLUG_ERROR" >>> +# "data": { "device": "dimm1", >>> +# "msg": "Memory hotunplug rejected by the guest for device dimm1" >>> +# }, >>> +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } >>> +# >>> +## >>> +{ 'event': 'DEVICE_UNPLUG_ERROR', >>> + 'data': { 'device': 'str', 'msg': 'str' } } >> >> Missing: update of device_add's doc comment in qdev.json: >> >> # Notes: When this command completes, the device may not be removed from the >> # guest. Hot removal is an operation that requires guest cooperation. >> # This command merely requests that the guest begin the hot removal >> # process. Completion of the device removal process is signaled with a >> # DEVICE_DELETED event. Guest reset will automatically complete removal >> # for all devices. > > Ok > >> >> This sure could use some polish. >> >> If I understand things correctly, we're aiming for the following device >> unplug protocol: > > One thing to note is that DEVICE_UNPLUG_ERROR isn't guaranteed to be send for > every hotunplug error. The event depends on machine/architecture support to > detect a guest side error. Yes. I tried to provide for that in my description of the protocol. >> >> Unplug the device with device_del (or possibly equivalent) >> >> If we know we can't unplug the device, fail immediately. Also emit >> DEVICE_UNPLUG_ERROR. > > > I haven't predicted to use this event in those cases as well, although it > seems reasonable to do so now that you mentioned it. I think this is a matter of taste. For what it's worth, we do emit DEVICE_DELETED on immediate success (see next paragraph). Emitting DEVICE_DELETED always lets QMP clients track device deletion even when it's triggered by something else. Feature. No such tracking is needed for unplug failure. >> If possible, unplug the device synchronously and succeed. Also emit >> DEVICE_DELETED. >> >> Else, initiate unplug and succeed. >> >> When unplug finishes, emit either DEVICE_DELETED or >> DEVICE_UNPLUG_ERROR. > > Since there's no 100% guarantee that DEVICE_UNPLUG_ERROR will be emitted for > guest side errors, the wording here would be > > "When unplug finishes, emit DEVICE_DELETED. A DEVICE_UNPLUG_ERROR can be > emitted if a guest side error was detected" My assumption is that QEMU can detect asynchronous success reliably, but not asynchronous failure. From QEMU's point of view, "asynchronous unplug has not finished" is indistinguishable from "asynchronous unplug finished unsuccessfully, but we can't detect it". So I lumped these two cases together in the next paragraph: >> For some machines and devices, unplug may never finish. My goal is a clear description of the state machine as it can be observed in QMP. >> Correct? >> >> Any particular reason for not putting event DEVICE_UNPLUG_ERROR next to >> DEVICE_DELETED in qdev.json? > > > Not really. I looked where MEM_UNPLUG_ERROR was declared and put it right > after it. I can change it to qdev.json near DEVICE_DELETED. Yes, please.
diff --git a/qapi/machine.json b/qapi/machine.json index 58a9c86b36..f0c7e56be0 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1274,3 +1274,26 @@ ## { 'event': 'MEM_UNPLUG_ERROR', 'data': { 'device': 'str', 'msg': 'str' } } + +## +# @DEVICE_UNPLUG_ERROR: +# +# Emitted when a device hot unplug error occurs. +# +# @device: device name +# +# @msg: Informative message +# +# Since: 6.1 +# +# Example: +# +# <- { "event": "DEVICE_UNPLUG_ERROR" +# "data": { "device": "dimm1", +# "msg": "Memory hotunplug rejected by the guest for device dimm1" +# }, +# "timestamp": { "seconds": 1615570772, "microseconds": 202844 } } +# +## +{ 'event': 'DEVICE_UNPLUG_ERROR', + 'data': { 'device': 'str', 'msg': 'str' } }
At this moment we only provide one event to report a hotunplug error, MEM_UNPLUG_ERROR. As of Linux kernel 5.12 and QEMU 6.0.0, the pseries machine is now able to report unplug errors for other device types, such as CPUs. Instead of creating a (device_type)_UNPLUG_ERROR for each new device, create a generic DEVICE_UNPLUG_ERROR event that can be used by all unplug errors in the future. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- qapi/machine.json | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)