Message ID | 20180102170645.5877-1-danielhb@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 01/02/2018 11:06 AM, Daniel Henrique Barboza wrote: > The qmp/hmp command 'system_wakeup' is simply a direct call to > 'qemu_system_wakeup_request' from vl.c. This function verifies if > runstate is SUSPENDED and if the wake up reason is valid before > proceeding. However, no error or warning is thrown if any of those > pre-requirements isn't met. > > This leads to situations such as the one described in > https://github.com/open-power-host-os/qemu/issues/31, where one > can induce the OS to be suspended by using pm-suspend (via > dompmsuspend, for example) but for some reason the machine failed to > go to the SUSPENDED runstate, staying at runstate RUNNING. The user > then tries to wake up the guest using system_wakeup (or dompmwakeup), > no error is thrown but nothing happened either because the wake up > wasn't fired at all. In the end, the user is left with a guest that > is dormant and believing that system_wakeup isn't working. > > This patch changes qmp_system_wakeup to make the runstate verification > before proceeding to call qemu_system_wakeup_request, firing up > an error message if the user tries to wake up a machine that > isn't in SUSPENDED state. The change isn't made inside > qemu_system_wakeup_request because it is used in migration, > ACPI and others where this usage might be valid. This patch > by no means fixes the situation described above, but it can direct > the user/management closer to the real problem. Converting something that silently did nothing successfully into now returning an error is somewhat backwards-incompatible. Is this something that needs to go through a deprecation cycle, to give people a chance to fix QMP scripts that were relying on the previous no-op behavior to now work with the new semantics? Or should we add an optional bool parameter to the command, where omitting the parameter gives the old semantics, but new enough clients can pass the bool to choose which behavior (no-op or error) they prefer? > +++ b/qmp.c > @@ -201,6 +201,11 @@ void qmp_cont(Error **errp) > > void qmp_system_wakeup(Error **errp) > { > + if (!runstate_check(RUN_STATE_SUSPENDED)) { > + error_setg(errp, > + "Unable to wake up: guest is not in suspended state"); > + return; > + } > qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); > } > >
On 01/02/2018 06:03 PM, Eric Blake wrote: > On 01/02/2018 11:06 AM, Daniel Henrique Barboza wrote: >> The qmp/hmp command 'system_wakeup' is simply a direct call to >> 'qemu_system_wakeup_request' from vl.c. This function verifies if >> runstate is SUSPENDED and if the wake up reason is valid before >> proceeding. However, no error or warning is thrown if any of those >> pre-requirements isn't met. >> >> This leads to situations such as the one described in >> https://github.com/open-power-host-os/qemu/issues/31, where one >> can induce the OS to be suspended by using pm-suspend (via >> dompmsuspend, for example) but for some reason the machine failed to >> go to the SUSPENDED runstate, staying at runstate RUNNING. The user >> then tries to wake up the guest using system_wakeup (or dompmwakeup), >> no error is thrown but nothing happened either because the wake up >> wasn't fired at all. In the end, the user is left with a guest that >> is dormant and believing that system_wakeup isn't working. >> >> This patch changes qmp_system_wakeup to make the runstate verification >> before proceeding to call qemu_system_wakeup_request, firing up >> an error message if the user tries to wake up a machine that >> isn't in SUSPENDED state. The change isn't made inside >> qemu_system_wakeup_request because it is used in migration, >> ACPI and others where this usage might be valid. This patch >> by no means fixes the situation described above, but it can direct >> the user/management closer to the real problem. > Converting something that silently did nothing successfully into now > returning an error is somewhat backwards-incompatible. Is this > something that needs to go through a deprecation cycle, to give people a > chance to fix QMP scripts that were relying on the previous no-op > behavior to now work with the new semantics? Or should we add an > optional bool parameter to the command, where omitting the parameter > gives the old semantics, but new enough clients can pass the bool to > choose which behavior (no-op or error) they prefer? Good point. Back when I first coded this patch I made sure that no QEMU test were broken, but I didn't consider that existing user scripts that uses system_wakeup might now yield a different result. The extra optional parameter seems a good approach to take here IMO. The current system_wakeup implementation works fine if the user is careful enough to not call it if the guest isn't on SUSPENDED state, so I am unsure if deprecating this behavior in favor of this new one is a bit too extreme. Thanks, Daniel > >> +++ b/qmp.c >> @@ -201,6 +201,11 @@ void qmp_cont(Error **errp) >> >> void qmp_system_wakeup(Error **errp) >> { >> + if (!runstate_check(RUN_STATE_SUSPENDED)) { >> + error_setg(errp, >> + "Unable to wake up: guest is not in suspended state"); >> + return; >> + } >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); >> } >> >>
Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> writes: > On 01/02/2018 06:03 PM, Eric Blake wrote: >> On 01/02/2018 11:06 AM, Daniel Henrique Barboza wrote: >>> The qmp/hmp command 'system_wakeup' is simply a direct call to >>> 'qemu_system_wakeup_request' from vl.c. This function verifies if >>> runstate is SUSPENDED and if the wake up reason is valid before >>> proceeding. However, no error or warning is thrown if any of those >>> pre-requirements isn't met. >>> >>> This leads to situations such as the one described in >>> https://github.com/open-power-host-os/qemu/issues/31, where one >>> can induce the OS to be suspended by using pm-suspend (via >>> dompmsuspend, for example) but for some reason the machine failed to >>> go to the SUSPENDED runstate, staying at runstate RUNNING. The user >>> then tries to wake up the guest using system_wakeup (or dompmwakeup), >>> no error is thrown but nothing happened either because the wake up >>> wasn't fired at all. In the end, the user is left with a guest that >>> is dormant and believing that system_wakeup isn't working. >>> >>> This patch changes qmp_system_wakeup to make the runstate verification >>> before proceeding to call qemu_system_wakeup_request, firing up >>> an error message if the user tries to wake up a machine that >>> isn't in SUSPENDED state. The change isn't made inside >>> qemu_system_wakeup_request because it is used in migration, >>> ACPI and others where this usage might be valid. This patch >>> by no means fixes the situation described above, but it can direct >>> the user/management closer to the real problem. >> Converting something that silently did nothing successfully into now >> returning an error is somewhat backwards-incompatible. Is this >> something that needs to go through a deprecation cycle, to give people a >> chance to fix QMP scripts that were relying on the previous no-op >> behavior to now work with the new semantics? I doubt it. We're fixing a silent failure to be "loud". How exactly could that upset applications? Applications that don't bother to check the result remain exactly as broken as before. Applications that do check become aware of the fact that the command did not work as advertized. I guess this risks breaking them differently. qmp-spec.txt section 5. gives us license to add errors: "Clients must not assume any particular [...] errors generated by a command, that is, new errors can be added to any existing command in newer versions of the Server. >> behavior to now work with the new semantics? Or should we add an >> optional bool parameter to the command, where omitting the parameter >> gives the old semantics, but new enough clients can pass the bool to >> choose which behavior (no-op or error) they prefer? > > Good point. Back when I first coded this patch I made sure that no QEMU test > were broken, but I didn't consider that existing user scripts that uses > system_wakeup might now yield a different result. > > The extra optional parameter seems a good approach to take here IMO. The > current system_wakeup implementation works fine if the user is careful > enough to not call it if the guest isn't on SUSPENDED state, so I am unsure > if deprecating this behavior in favor of this new one is a bit too extreme. I'd say we call the silent failure to wake up the guest a bug, and your patch a bug fix. I figure we could find precedence for turning broken successful behavior into a clean error if we looked for it.
On 01/03/2018 11:41 AM, Markus Armbruster wrote: > Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> writes: > >> On 01/02/2018 06:03 PM, Eric Blake wrote: >>> On 01/02/2018 11:06 AM, Daniel Henrique Barboza wrote: >>>> The qmp/hmp command 'system_wakeup' is simply a direct call to >>>> 'qemu_system_wakeup_request' from vl.c. This function verifies if >>>> runstate is SUSPENDED and if the wake up reason is valid before >>>> proceeding. However, no error or warning is thrown if any of those >>>> pre-requirements isn't met. >>>> >>>> This leads to situations such as the one described in >>>> https://github.com/open-power-host-os/qemu/issues/31, where one >>>> can induce the OS to be suspended by using pm-suspend (via >>>> dompmsuspend, for example) but for some reason the machine failed to >>>> go to the SUSPENDED runstate, staying at runstate RUNNING. The user >>>> then tries to wake up the guest using system_wakeup (or dompmwakeup), >>>> no error is thrown but nothing happened either because the wake up >>>> wasn't fired at all. In the end, the user is left with a guest that >>>> is dormant and believing that system_wakeup isn't working. >>>> >>>> This patch changes qmp_system_wakeup to make the runstate verification >>>> before proceeding to call qemu_system_wakeup_request, firing up >>>> an error message if the user tries to wake up a machine that >>>> isn't in SUSPENDED state. The change isn't made inside >>>> qemu_system_wakeup_request because it is used in migration, >>>> ACPI and others where this usage might be valid. This patch >>>> by no means fixes the situation described above, but it can direct >>>> the user/management closer to the real problem. >>> Converting something that silently did nothing successfully into now >>> returning an error is somewhat backwards-incompatible. Is this >>> something that needs to go through a deprecation cycle, to give people a >>> chance to fix QMP scripts that were relying on the previous no-op >>> behavior to now work with the new semantics? > I doubt it. > > We're fixing a silent failure to be "loud". How exactly could that > upset applications? > > Applications that don't bother to check the result remain exactly as > broken as before. That's true. > > Applications that do check become aware of the fact that the command did > not work as advertized. I guess this risks breaking them differently. > > qmp-spec.txt section 5. gives us license to add errors: "Clients must > not assume any particular [...] errors generated by a command, that is, > new errors can be added to any existing command in newer versions of the > Server. > >>> behavior to now work with the new semantics? Or should we add an >>> optional bool parameter to the command, where omitting the parameter >>> gives the old semantics, but new enough clients can pass the bool to >>> choose which behavior (no-op or error) they prefer? >> Good point. Back when I first coded this patch I made sure that no QEMU test >> were broken, but I didn't consider that existing user scripts that uses >> system_wakeup might now yield a different result. >> >> The extra optional parameter seems a good approach to take here IMO. The >> current system_wakeup implementation works fine if the user is careful >> enough to not call it if the guest isn't on SUSPENDED state, so I am unsure >> if deprecating this behavior in favor of this new one is a bit too extreme. > I'd say we call the silent failure to wake up the guest a bug, and your > patch a bug fix. I am fine with this approach, specially after your argument about an application that doesn't check the result of system_wakeup is already broken. This patch would simply make the error explicit. Daniel > > I figure we could find precedence for turning broken successful behavior > into a clean error if we looked for it. >
Ping On 01/02/2018 03:06 PM, Daniel Henrique Barboza wrote: > The qmp/hmp command 'system_wakeup' is simply a direct call to > 'qemu_system_wakeup_request' from vl.c. This function verifies if > runstate is SUSPENDED and if the wake up reason is valid before > proceeding. However, no error or warning is thrown if any of those > pre-requirements isn't met. > > This leads to situations such as the one described in > https://github.com/open-power-host-os/qemu/issues/31, where one > can induce the OS to be suspended by using pm-suspend (via > dompmsuspend, for example) but for some reason the machine failed to > go to the SUSPENDED runstate, staying at runstate RUNNING. The user > then tries to wake up the guest using system_wakeup (or dompmwakeup), > no error is thrown but nothing happened either because the wake up > wasn't fired at all. In the end, the user is left with a guest that > is dormant and believing that system_wakeup isn't working. > > This patch changes qmp_system_wakeup to make the runstate verification > before proceeding to call qemu_system_wakeup_request, firing up > an error message if the user tries to wake up a machine that > isn't in SUSPENDED state. The change isn't made inside > qemu_system_wakeup_request because it is used in migration, > ACPI and others where this usage might be valid. This patch > by no means fixes the situation described above, but it can direct > the user/management closer to the real problem. > > Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> > CC: Markus Armbruster <armbru@redhat.com> > CC: Dr. David Alan Gilbert <dgilbert@redhat.com> > --- > hmp.c | 4 +++- > qmp.c | 5 +++++ > 2 files changed, 8 insertions(+), 1 deletion(-) > > diff --git a/hmp.c b/hmp.c > index 35a7041824..2c2326a00e 100644 > --- a/hmp.c > +++ b/hmp.c > @@ -1158,7 +1158,9 @@ void hmp_cont(Monitor *mon, const QDict *qdict) > > void hmp_system_wakeup(Monitor *mon, const QDict *qdict) > { > - qmp_system_wakeup(NULL); > + Error *err = NULL; > + qmp_system_wakeup(&err); > + hmp_handle_error(mon, &err); > } > > void hmp_nmi(Monitor *mon, const QDict *qdict) > diff --git a/qmp.c b/qmp.c > index 52cfd2d81c..d0be43fa1a 100644 > --- a/qmp.c > +++ b/qmp.c > @@ -201,6 +201,11 @@ void qmp_cont(Error **errp) > > void qmp_system_wakeup(Error **errp) > { > + if (!runstate_check(RUN_STATE_SUSPENDED)) { > + error_setg(errp, > + "Unable to wake up: guest is not in suspended state"); > + return; > + } > qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); > } >
diff --git a/hmp.c b/hmp.c index 35a7041824..2c2326a00e 100644 --- a/hmp.c +++ b/hmp.c @@ -1158,7 +1158,9 @@ void hmp_cont(Monitor *mon, const QDict *qdict) void hmp_system_wakeup(Monitor *mon, const QDict *qdict) { - qmp_system_wakeup(NULL); + Error *err = NULL; + qmp_system_wakeup(&err); + hmp_handle_error(mon, &err); } void hmp_nmi(Monitor *mon, const QDict *qdict) diff --git a/qmp.c b/qmp.c index 52cfd2d81c..d0be43fa1a 100644 --- a/qmp.c +++ b/qmp.c @@ -201,6 +201,11 @@ void qmp_cont(Error **errp) void qmp_system_wakeup(Error **errp) { + if (!runstate_check(RUN_STATE_SUSPENDED)) { + error_setg(errp, + "Unable to wake up: guest is not in suspended state"); + return; + } qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); }
The qmp/hmp command 'system_wakeup' is simply a direct call to 'qemu_system_wakeup_request' from vl.c. This function verifies if runstate is SUSPENDED and if the wake up reason is valid before proceeding. However, no error or warning is thrown if any of those pre-requirements isn't met. This leads to situations such as the one described in https://github.com/open-power-host-os/qemu/issues/31, where one can induce the OS to be suspended by using pm-suspend (via dompmsuspend, for example) but for some reason the machine failed to go to the SUSPENDED runstate, staying at runstate RUNNING. The user then tries to wake up the guest using system_wakeup (or dompmwakeup), no error is thrown but nothing happened either because the wake up wasn't fired at all. In the end, the user is left with a guest that is dormant and believing that system_wakeup isn't working. This patch changes qmp_system_wakeup to make the runstate verification before proceeding to call qemu_system_wakeup_request, firing up an error message if the user tries to wake up a machine that isn't in SUSPENDED state. The change isn't made inside qemu_system_wakeup_request because it is used in migration, ACPI and others where this usage might be valid. This patch by no means fixes the situation described above, but it can direct the user/management closer to the real problem. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> CC: Markus Armbruster <armbru@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> --- hmp.c | 4 +++- qmp.c | 5 +++++ 2 files changed, 8 insertions(+), 1 deletion(-)