Message ID | 20200702182136.GA3511@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com (mailing list archive)
---|---
State | New, archived
Series | Fix PM hibernation in Xen guests
On 7/2/20 2:21 PM, Anchal Agarwal wrote:
> From: Munehisa Kamata <kamatam@amazon.com>
>
> Guest hibernation is different from xen suspend/resume/live migration.
> Xen save/restore does not use pm_ops as is needed by guest hibernation.
> Hibernation in the guest follows the ACPI path and is guest initiated;
> the hibernation image is saved within the guest, as compared to the
> latter modes, which are xen toolstack assisted and where image
> creation/storage is under the control of the hypervisor/host machine.
> To differentiate between Xen suspend and PM hibernation, keep track
> of the on-going suspend mode, mainly by using a new PM notifier.
> Introduce simple functions which help to know the on-going suspend mode
> so that other Xen-related code can behave differently according to the
> current suspend mode.
> Since Xen suspend doesn't have a corresponding PM event, its main logic
> is modified to acquire pm_mutex and set the current mode.
>
> Though acquiring pm_mutex is still the right thing to do, we may
> see a deadlock if PM hibernation is interrupted by Xen suspend.
> PM hibernation depends on the xenwatch thread to process xenbus state
> transactions, but the thread will sleep waiting for pm_mutex, which is
> already held by the PM hibernation context in that scenario. Xen shutdown
> code may need some changes to avoid the issue.
>
> [Anchal Agarwal: Changelog]:
> RFC v1->v2: Code refactoring
> v1->v2: Remove unused functions for PM SUSPEND/PM hibernation
>
> Signed-off-by: Anchal Agarwal <anchalag@amazon.com>
> Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
> ---
>  drivers/xen/manage.c  | 60 +++++++++++++++++++++++++++++++++++++++++++
>  include/xen/xen-ops.h |  1 +
>  2 files changed, 61 insertions(+)
>
> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> index cd046684e0d1..69833fd6cfd1 100644
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -14,6 +14,7 @@
>  #include <linux/freezer.h>
>  #include <linux/syscore_ops.h>
>  #include <linux/export.h>
> +#include <linux/suspend.h>
>
>  #include <xen/xen.h>
>  #include <xen/xenbus.h>
> @@ -40,6 +41,20 @@ enum shutdown_state {
>  /* Ignore multiple shutdown requests. */
>  static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
>
> +enum suspend_modes {
> +        NO_SUSPEND = 0,
> +        XEN_SUSPEND,
> +        PM_HIBERNATION,
> +};
> +
> +/* Protected by pm_mutex */
> +static enum suspend_modes suspend_mode = NO_SUSPEND;
> +
> +bool xen_is_xen_suspend(void)


Weren't you going to call this pv suspend? (And also --- is this suspend
or hibernation? Your commit messages and cover letter talk about fixing
hibernation.)


> +{
> +        return suspend_mode == XEN_SUSPEND;
> +}
> +
> +
> +static int xen_pm_notifier(struct notifier_block *notifier,
> +                           unsigned long pm_event, void *unused)
> +{
> +        switch (pm_event) {
> +        case PM_SUSPEND_PREPARE:
> +        case PM_HIBERNATION_PREPARE:
> +        case PM_RESTORE_PREPARE:
> +                suspend_mode = PM_HIBERNATION;


Do you ever use this mode? It seems to me all you care about is whether
or not we are doing XEN_SUSPEND. And so perhaps suspend_mode could become
a boolean. And then maybe even drop it altogether, because you should be
able to key off (shutting_down == SHUTDOWN_SUSPEND).


> +                break;
> +        case PM_POST_SUSPEND:
> +        case PM_POST_RESTORE:
> +        case PM_POST_HIBERNATION:
> +                /* Set back to the default */
> +                suspend_mode = NO_SUSPEND;
> +                break;
> +        default:
> +                pr_warn("Receive unknown PM event 0x%lx\n", pm_event);
> +                return -EINVAL;
> +        }
> +
> +        return 0;
> +};


> +static int xen_setup_pm_notifier(void)
> +{
> +        if (!xen_hvm_domain())
> +                return -ENODEV;


I forgot --- what did we decide about non-x86 (i.e. ARM)?

And PVH dom0.


-boris
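The hunk quoted above is cut off at the -ENODEV check, so the registration half
of xen_setup_pm_notifier() never appears in this thread. For context, it
presumably follows the standard PM-notifier pattern sketched below; the
xen_pm_notifier_block name is an assumption, not taken from the patch:

#include <linux/notifier.h>
#include <linux/suspend.h>      /* register_pm_notifier() */
#include <xen/xen.h>            /* xen_hvm_domain() */

/* Assumed name: wraps the xen_pm_notifier() callback shown in the diff. */
static struct notifier_block xen_pm_notifier_block = {
        .notifier_call = xen_pm_notifier,
};

static int xen_setup_pm_notifier(void)
{
        if (!xen_hvm_domain())
                return -ENODEV;

        return register_pm_notifier(&xen_pm_notifier_block);
}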
On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On 7/2/20 2:21 PM, Anchal Agarwal wrote: > > From: Munehisa Kamata <kamatam@amazon.com> > > > > Guest hibernation is different from xen suspend/resume/live migration. > > Xen save/restore does not use pm_ops as is needed by guest hibernation. > > Hibernation in guest follows ACPI path and is guest inititated , the > > hibernation image is saved within guest as compared to later modes > > which are xen toolstack assisted and image creation/storage is in > > control of hypervisor/host machine. > > To differentiate between Xen suspend and PM hibernation, keep track > > of the on-going suspend mode by mainly using a new PM notifier. > > Introduce simple functions which help to know the on-going suspend mode > > so that other Xen-related code can behave differently according to the > > current suspend mode. > > Since Xen suspend doesn't have corresponding PM event, its main logic > > is modfied to acquire pm_mutex and set the current mode. > > > > Though, acquirng pm_mutex is still right thing to do, we may > > see deadlock if PM hibernation is interrupted by Xen suspend. > > PM hibernation depends on xenwatch thread to process xenbus state > > transactions, but the thread will sleep to wait pm_mutex which is > > already held by PM hibernation context in the scenario. Xen shutdown > > code may need some changes to avoid the issue. > > > > [Anchal Agarwal: Changelog]: > > RFC v1->v2: Code refactoring > > v1->v2: Remove unused functions for PM SUSPEND/PM hibernation > > > > Signed-off-by: Anchal Agarwal <anchalag@amazon.com> > > Signed-off-by: Munehisa Kamata <kamatam@amazon.com> > > --- > > drivers/xen/manage.c | 60 +++++++++++++++++++++++++++++++++++++++++++ > > include/xen/xen-ops.h | 1 + > > 2 files changed, 61 insertions(+) > > > > diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c > > index cd046684e0d1..69833fd6cfd1 100644 > > --- a/drivers/xen/manage.c > > +++ b/drivers/xen/manage.c > > @@ -14,6 +14,7 @@ > > #include <linux/freezer.h> > > #include <linux/syscore_ops.h> > > #include <linux/export.h> > > +#include <linux/suspend.h> > > > > #include <xen/xen.h> > > #include <xen/xenbus.h> > > @@ -40,6 +41,20 @@ enum shutdown_state { > > /* Ignore multiple shutdown requests. */ > > static enum shutdown_state shutting_down = SHUTDOWN_INVALID; > > > > +enum suspend_modes { > > + NO_SUSPEND = 0, > > + XEN_SUSPEND, > > + PM_HIBERNATION, > > +}; > > + > > +/* Protected by pm_mutex */ > > +static enum suspend_modes suspend_mode = NO_SUSPEND; > > + > > +bool xen_is_xen_suspend(void) > > > Weren't you going to call this pv suspend? (And also --- is this suspend > or hibernation? Your commit messages and cover letter talk about fixing > hibernation). > > This is for hibernation is for pvhvm/hvm/pv-on-hvm guests as you may call it. The method is just there to check if "xen suspend" is in progress. I do not see "xen_suspend" differentiating between pv or hvm domain until later in the code hence, I abstracted it to xen_is_xen_suspend. 
> > +{ > > + return suspend_mode == XEN_SUSPEND; > > +} > > + > > > > > + > > +static int xen_pm_notifier(struct notifier_block *notifier, > > + unsigned long pm_event, void *unused) > > +{ > > + switch (pm_event) { > > + case PM_SUSPEND_PREPARE: > > + case PM_HIBERNATION_PREPARE: > > + case PM_RESTORE_PREPARE: > > + suspend_mode = PM_HIBERNATION; > > > Do you ever use this mode? It seems to me all you care about is whether > or not we are doing XEN_SUSPEND. And so perhaps suspend_mode could > become a boolean. And then maybe even drop it altogether because it you > should be able to key off (shutting_down == SHUTDOWN_SUSPEND). > > The mode was left there in case its needed for restore prepare cases. But you are right the only thing I currently care about whether shutting_down == SHUTDOWN_SUSPEND. Infact, the notifier may not be needed in first place. xen_is_xen_suspend could work right off the bat using 'shutting_down' variable itself. *I think so* I will test it on my end and send an updated patch. > > + break; > > + case PM_POST_SUSPEND: > > + case PM_POST_RESTORE: > > + case PM_POST_HIBERNATION: > > + /* Set back to the default */ > > + suspend_mode = NO_SUSPEND; > > + break; > > + default: > > + pr_warn("Receive unknown PM event 0x%lx\n", pm_event); > > + return -EINVAL; > > + } > > + > > + return 0; > > +}; > > > > > +static int xen_setup_pm_notifier(void) > > +{ > > + if (!xen_hvm_domain()) > > + return -ENODEV; > > > I forgot --- what did we decide about non-x86 (i.e. ARM)? It would be great to support that however, its out of scope for this patch set. I’ll be happy to discuss it separately. > > > And PVH dom0. That's another good use case to make it work with however, I still think that should be tested/worked upon separately as the feature itself (PVH Dom0) is very new. > > Thanks, Anchal > -boris > > >
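A minimal sketch of the simplification Anchal agrees to try above: drop the
notifier-maintained suspend_mode and key the helper off the existing
'shutting_down' state in drivers/xen/manage.c, which is already set to
SHUTDOWN_SUSPEND on the Xen suspend path. Illustrative only, not the code that
was eventually posted:

bool xen_is_xen_suspend(void)
{
        /* shutting_down is set to SHUTDOWN_SUSPEND by the Xen suspend path. */
        return shutting_down == SHUTDOWN_SUSPEND;
}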
On 7/15/20 4:49 PM, Anchal Agarwal wrote: > On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. >> >> >> >> On 7/2/20 2:21 PM, Anchal Agarwal wrote: >>> + >>> +bool xen_is_xen_suspend(void) >> >> Weren't you going to call this pv suspend? (And also --- is this suspend >> or hibernation? Your commit messages and cover letter talk about fixing >> hibernation). >> >> > This is for hibernation is for pvhvm/hvm/pv-on-hvm guests as you may call it. > The method is just there to check if "xen suspend" is in progress. > I do not see "xen_suspend" differentiating between pv or hvm > domain until later in the code hence, I abstracted it to xen_is_xen_suspend. I meant "pv suspend" in the sense that this is paravirtual suspend, not suspend for paravirtual guests. Just like pv drivers are for both pv and hvm guests. And then --- should it be pv suspend or pv hibernation? >>> +{ >>> + return suspend_mode == XEN_SUSPEND; >>> +} >>> + >> >> +static int xen_setup_pm_notifier(void) >> +{ >> + if (!xen_hvm_domain()) >> + return -ENODEV; >> >> I forgot --- what did we decide about non-x86 (i.e. ARM)? > It would be great to support that however, its out of > scope for this patch set. > I’ll be happy to discuss it separately. I wasn't implying that this *should* work on ARM but rather whether this will break ARM somehow (because xen_hvm_domain() is true there). >> >> And PVH dom0. > That's another good use case to make it work with however, I still > think that should be tested/worked upon separately as the feature itself > (PVH Dom0) is very new. Same question here --- will this break PVH dom0? -boris
On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On 7/15/20 4:49 PM, Anchal Agarwal wrote: > > On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > >> > >> > >> > >> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > >>> + > >>> +bool xen_is_xen_suspend(void) > >> > >> Weren't you going to call this pv suspend? (And also --- is this suspend > >> or hibernation? Your commit messages and cover letter talk about fixing > >> hibernation). > >> > >> > > This is for hibernation is for pvhvm/hvm/pv-on-hvm guests as you may call it. > > The method is just there to check if "xen suspend" is in progress. > > I do not see "xen_suspend" differentiating between pv or hvm > > domain until later in the code hence, I abstracted it to xen_is_xen_suspend. > > > I meant "pv suspend" in the sense that this is paravirtual suspend, not > suspend for paravirtual guests. Just like pv drivers are for both pv and > hvm guests. > > > And then --- should it be pv suspend or pv hibernation? > > Ok so I think I am lot confused by this question. Here is what this function for, function xen_is_xen_suspend() just tells us whether the guest is in "SHUTDOWN_SUSPEND" state or not. This check is needed for correct invocation of syscore_ops callbacks registered for guest's hibernation and for xenbus to invoke respective callbacks[suspend/resume vs freeze/thaw/restore]. Since "shutting_down" state is defined static and is not directly available to other parts of the code, the function solves the purpose. I am having hard time understanding why this should be called pv suspend/hibernation unless you are suggesting something else? Am I missing your point here? > > >>> +{ > >>> + return suspend_mode == XEN_SUSPEND; > >>> +} > >>> + > >> > >> +static int xen_setup_pm_notifier(void) > >> +{ > >> + if (!xen_hvm_domain()) > >> + return -ENODEV; > >> > >> I forgot --- what did we decide about non-x86 (i.e. ARM)? > > It would be great to support that however, its out of > > scope for this patch set. > > I’ll be happy to discuss it separately. > > > I wasn't implying that this *should* work on ARM but rather whether this > will break ARM somehow (because xen_hvm_domain() is true there). > > Ok makes sense. TBH, I haven't tested this part of code on ARM and the series was only support x86 guests hibernation. Moreover, this notifier is there to distinguish between 2 PM events PM SUSPEND and PM hibernation. Now since we only care about PM HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. However, I may have to fix other patches in the series where this check may appear and cater it only for x86 right? > > >> > >> And PVH dom0. > > That's another good use case to make it work with however, I still > > think that should be tested/worked upon separately as the feature itself > > (PVH Dom0) is very new. > > > Same question here --- will this break PVH dom0? > I haven't tested it as a part of this series. Is that a blocker here? > Thanks, Anchal > -boris > >
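To make the intended use of xen_is_xen_suspend() concrete, here is a hedged
sketch of the consumer described in the reply above: a syscore suspend hook
(added by a later patch in the series; the body below is an assumption based
on that description) that backs off when the toolstack-driven Xen suspend path
is active:

static int xen_syscore_suspend(void)
{
        /* Xen suspend/checkpoint: xen_suspend() handles this path itself. */
        if (xen_is_xen_suspend())
                return 0;

        /*
         * Guest-initiated hibernation: quiesce and save Xen-specific state
         * (shared info page, grant tables, etc.); done in later patches.
         */
        return 0;
}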
(Roger, question for you at the very end) On 7/17/20 3:10 PM, Anchal Agarwal wrote: > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. >> >> >> >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. >>>> >>>> >>>> >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: >>>>> + >>>>> +bool xen_is_xen_suspend(void) >>>> Weren't you going to call this pv suspend? (And also --- is this suspend >>>> or hibernation? Your commit messages and cover letter talk about fixing >>>> hibernation). >>>> >>>> >>> This is for hibernation is for pvhvm/hvm/pv-on-hvm guests as you may call it. >>> The method is just there to check if "xen suspend" is in progress. >>> I do not see "xen_suspend" differentiating between pv or hvm >>> domain until later in the code hence, I abstracted it to xen_is_xen_suspend. >> >> I meant "pv suspend" in the sense that this is paravirtual suspend, not >> suspend for paravirtual guests. Just like pv drivers are for both pv and >> hvm guests. >> >> >> And then --- should it be pv suspend or pv hibernation? >> >> > Ok so I think I am lot confused by this question. Here is what this > function for, function xen_is_xen_suspend() just tells us whether > the guest is in "SHUTDOWN_SUSPEND" state or not. This check is needed > for correct invocation of syscore_ops callbacks registered for guest's > hibernation and for xenbus to invoke respective callbacks[suspend/resume > vs freeze/thaw/restore]. > Since "shutting_down" state is defined static and is not directly available > to other parts of the code, the function solves the purpose. > > I am having hard time understanding why this should be called pv > suspend/hibernation unless you are suggesting something else? > Am I missing your point here? I think I understand now what you are trying to say --- it's whether we are going to use xen_suspend() routine, right? If that's the case then sure, you can use "xen_suspend" term. (I'd probably still change xen_is_xen_suspend() to is_xen_suspend()) >>>>> +{ >>>>> + return suspend_mode == XEN_SUSPEND; >>>>> +} >>>>> + >>>> +static int xen_setup_pm_notifier(void) >>>> +{ >>>> + if (!xen_hvm_domain()) >>>> + return -ENODEV; >>>> >>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? >>> It would be great to support that however, its out of >>> scope for this patch set. >>> I’ll be happy to discuss it separately. >> >> I wasn't implying that this *should* work on ARM but rather whether this >> will break ARM somehow (because xen_hvm_domain() is true there). >> >> > Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > was only support x86 guests hibernation. > Moreover, this notifier is there to distinguish between 2 PM > events PM SUSPEND and PM hibernation. Now since we only care about PM > HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > However, I may have to fix other patches in the series where this check may > appear and cater it only for x86 right? I don't know what would happen if ARM guest tries to handle hibernation callbacks. The only ones that you are introducing are in block and net fronts and that's arch-independent. 
You do add a bunch of x86-specific code though (syscore ops), would something similar be needed for ARM? >>>> And PVH dom0. >>> That's another good use case to make it work with however, I still >>> think that should be tested/worked upon separately as the feature itself >>> (PVH Dom0) is very new. >> >> Same question here --- will this break PVH dom0? >> > I haven't tested it as a part of this series. Is that a blocker here? I suspect dom0 will not do well now as far as hibernation goes, in which case you are not breaking anything. Roger? -boris
On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > (Roger, question for you at the very end) > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > >> > >> > >> > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > >>>> > >>>> > >>>> > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > >>>> And PVH dom0. > >>> That's another good use case to make it work with however, I still > >>> think that should be tested/worked upon separately as the feature itself > >>> (PVH Dom0) is very new. > >> > >> Same question here --- will this break PVH dom0? > >> > > I haven't tested it as a part of this series. Is that a blocker here? > > > I suspect dom0 will not do well now as far as hibernation goes, in which > case you are not breaking anything. > > > Roger? I sadly don't have any box ATM that supports hibernation where I could test it. We have hibernation support for PV dom0, so while I haven't done anything specific to support or test hibernation on PVH dom0 I would at least aim to not make this any worse, and hence the check should at least also fail for a PVH dom0? if (!xen_hvm_domain() || xen_initial_domain()) return -ENODEV; Ie: none of this should be applied to a PVH dom0, as it doesn't have PV devices and hence should follow the bare metal device suspend. Also I would contact the QubesOS guys, they rely heavily on the suspend feature for dom0, and that's something not currently tested by osstest so any breakages there go unnoticed. Thanks, Roger.
On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > (Roger, question for you at the very end) > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > >> > >> > >> > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > >>>> > >>>> > >>>> > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > >>>>> + > >>>>> +bool xen_is_xen_suspend(void) > >>>> Weren't you going to call this pv suspend? (And also --- is this suspend > >>>> or hibernation? Your commit messages and cover letter talk about fixing > >>>> hibernation). > >>>> > >>>> > >>> This is for hibernation is for pvhvm/hvm/pv-on-hvm guests as you may call it. > >>> The method is just there to check if "xen suspend" is in progress. > >>> I do not see "xen_suspend" differentiating between pv or hvm > >>> domain until later in the code hence, I abstracted it to xen_is_xen_suspend. > >> > >> I meant "pv suspend" in the sense that this is paravirtual suspend, not > >> suspend for paravirtual guests. Just like pv drivers are for both pv and > >> hvm guests. > >> > >> > >> And then --- should it be pv suspend or pv hibernation? > >> > >> > > Ok so I think I am lot confused by this question. Here is what this > > function for, function xen_is_xen_suspend() just tells us whether > > the guest is in "SHUTDOWN_SUSPEND" state or not. This check is needed > > for correct invocation of syscore_ops callbacks registered for guest's > > hibernation and for xenbus to invoke respective callbacks[suspend/resume > > vs freeze/thaw/restore]. > > Since "shutting_down" state is defined static and is not directly available > > to other parts of the code, the function solves the purpose. > > > > I am having hard time understanding why this should be called pv > > suspend/hibernation unless you are suggesting something else? > > Am I missing your point here? > > > > I think I understand now what you are trying to say --- it's whether we > are going to use xen_suspend() routine, right? If that's the case then > sure, you can use "xen_suspend" term. (I'd probably still change > xen_is_xen_suspend() to is_xen_suspend()) > I think so too. Will change it. > > >>>>> +{ > >>>>> + return suspend_mode == XEN_SUSPEND; > >>>>> +} > >>>>> + > >>>> +static int xen_setup_pm_notifier(void) > >>>> +{ > >>>> + if (!xen_hvm_domain()) > >>>> + return -ENODEV; > >>>> > >>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > >>> It would be great to support that however, its out of > >>> scope for this patch set. > >>> I’ll be happy to discuss it separately. > >> > >> I wasn't implying that this *should* work on ARM but rather whether this > >> will break ARM somehow (because xen_hvm_domain() is true there). > >> > >> > > Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > > was only support x86 guests hibernation. 
> > Moreover, this notifier is there to distinguish between 2 PM > > events PM SUSPEND and PM hibernation. Now since we only care about PM > > HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > > However, I may have to fix other patches in the series where this check may > > appear and cater it only for x86 right? > > > I don't know what would happen if ARM guest tries to handle hibernation > callbacks. The only ones that you are introducing are in block and net > fronts and that's arch-independent. > > > You do add a bunch of x86-specific code though (syscore ops), would > something similar be needed for ARM? > > I don't expect this to work out of the box on ARM. To start with something similar will be needed for ARM too. We may still want to keep the driver code as-is. I understand the concern here wrt ARM, however, currently the support is only proposed for x86 guests here and similar work could be carried out for ARM. Also, if regular hibernation works correctly on arm, then all is needed is to fix Xen side of things. I am not sure what could be done to achieve any assurances on arm side as far as this series is concerned. > >>>> And PVH dom0. > >>> That's another good use case to make it work with however, I still > >>> think that should be tested/worked upon separately as the feature itself > >>> (PVH Dom0) is very new. > >> > >> Same question here --- will this break PVH dom0? > >> > > I haven't tested it as a part of this series. Is that a blocker here? > > > I suspect dom0 will not do well now as far as hibernation goes, in which > case you are not breaking anything. > > > Roger? > > > -boris Thanks, Anchal > > >
On Mon, Jul 20, 2020 at 11:37:05AM +0200, Roger Pau Monné wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > > (Roger, question for you at the very end) > > > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > >> > > >> > > >> > > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > >>>> > > >>>> > > >>>> > > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > > >>>> And PVH dom0. > > >>> That's another good use case to make it work with however, I still > > >>> think that should be tested/worked upon separately as the feature itself > > >>> (PVH Dom0) is very new. > > >> > > >> Same question here --- will this break PVH dom0? > > >> > > > I haven't tested it as a part of this series. Is that a blocker here? > > > > > > I suspect dom0 will not do well now as far as hibernation goes, in which > > case you are not breaking anything. > > > > > > Roger? > > I sadly don't have any box ATM that supports hibernation where I > could test it. We have hibernation support for PV dom0, so while I > haven't done anything specific to support or test hibernation on PVH > dom0 I would at least aim to not make this any worse, and hence the > check should at least also fail for a PVH dom0? > > if (!xen_hvm_domain() || xen_initial_domain()) > return -ENODEV; > > Ie: none of this should be applied to a PVH dom0, as it doesn't have > PV devices and hence should follow the bare metal device suspend. > So from what I understand you meant for any guest running on pvh dom0 should not hibernate if hibernation is triggered from within the guest or should they? > Also I would contact the QubesOS guys, they rely heavily on the > suspend feature for dom0, and that's something not currently tested by > osstest so any breakages there go unnoticed. > Was this for me or Boris? If its the former then I have no idea how to? > Thanks, Roger. Thanks, Anchal
Marek: I'm adding you in case you could be able to give this a try and make sure it doesn't break suspend for dom0. On Tue, Jul 21, 2020 at 12:17:36AM +0000, Anchal Agarwal wrote: > On Mon, Jul 20, 2020 at 11:37:05AM +0200, Roger Pau Monné wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > > > (Roger, question for you at the very end) > > > > > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > > > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > >> > > > >> > > > >> > > > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > > > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > > > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > >>>> > > > >>>> > > > >>>> > > > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > > > >>>> And PVH dom0. > > > >>> That's another good use case to make it work with however, I still > > > >>> think that should be tested/worked upon separately as the feature itself > > > >>> (PVH Dom0) is very new. > > > >> > > > >> Same question here --- will this break PVH dom0? > > > >> > > > > I haven't tested it as a part of this series. Is that a blocker here? > > > > > > > > > I suspect dom0 will not do well now as far as hibernation goes, in which > > > case you are not breaking anything. > > > > > > > > > Roger? > > > > I sadly don't have any box ATM that supports hibernation where I > > could test it. We have hibernation support for PV dom0, so while I > > haven't done anything specific to support or test hibernation on PVH > > dom0 I would at least aim to not make this any worse, and hence the > > check should at least also fail for a PVH dom0? > > > > if (!xen_hvm_domain() || xen_initial_domain()) > > return -ENODEV; > > > > Ie: none of this should be applied to a PVH dom0, as it doesn't have > > PV devices and hence should follow the bare metal device suspend. > > > So from what I understand you meant for any guest running on pvh dom0 should not > hibernate if hibernation is triggered from within the guest or should they? Er no to both I think. What I meant is that a PVH dom0 should be able to properly suspend, and we should make sure this work doesn't make this any harder (or breaks it if it's currently working). Or at least that's how I understood the question raised by Boris. You are adding code to the generic suspend path that's also used by dom0 in order to perform bare metal suspension. This is fine now for a PV dom0 because the code is gated on xen_hvm_domain, but you should also take into account that a PVH dom0 is considered a HVM domain, and hence will get the notifier registered. > > Also I would contact the QubesOS guys, they rely heavily on the > > suspend feature for dom0, and that's something not currently tested by > > osstest so any breakages there go unnoticed. > > > Was this for me or Boris? If its the former then I have no idea how to? I've now added Marek. Roger.
On Tue, Jul 21, 2020 at 10:30:18AM +0200, Roger Pau Monné wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > Marek: I'm adding you in case you could be able to give this a try and > make sure it doesn't break suspend for dom0. > > On Tue, Jul 21, 2020 at 12:17:36AM +0000, Anchal Agarwal wrote: > > On Mon, Jul 20, 2020 at 11:37:05AM +0200, Roger Pau Monné wrote: > > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > > > > > On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > > > > (Roger, question for you at the very end) > > > > > > > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > > > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > > > > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > >> > > > > >> > > > > >> > > > > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > > > > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > > > > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > >>>> > > > > >>>> > > > > >>>> > > > > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > > > > >>>> And PVH dom0. > > > > >>> That's another good use case to make it work with however, I still > > > > >>> think that should be tested/worked upon separately as the feature itself > > > > >>> (PVH Dom0) is very new. > > > > >> > > > > >> Same question here --- will this break PVH dom0? > > > > >> > > > > > I haven't tested it as a part of this series. Is that a blocker here? > > > > > > > > > > > > I suspect dom0 will not do well now as far as hibernation goes, in which > > > > case you are not breaking anything. > > > > > > > > > > > > Roger? > > > > > > I sadly don't have any box ATM that supports hibernation where I > > > could test it. We have hibernation support for PV dom0, so while I > > > haven't done anything specific to support or test hibernation on PVH > > > dom0 I would at least aim to not make this any worse, and hence the > > > check should at least also fail for a PVH dom0? > > > > > > if (!xen_hvm_domain() || xen_initial_domain()) > > > return -ENODEV; > > > > > > Ie: none of this should be applied to a PVH dom0, as it doesn't have > > > PV devices and hence should follow the bare metal device suspend. > > > > > So from what I understand you meant for any guest running on pvh dom0 should not > > hibernate if hibernation is triggered from within the guest or should they? > > Er no to both I think. What I meant is that a PVH dom0 should be able > to properly suspend, and we should make sure this work doesn't make > this any harder (or breaks it if it's currently working). > > Or at least that's how I understood the question raised by Boris. > > You are adding code to the generic suspend path that's also used by dom0 > in order to perform bare metal suspension. This is fine now for a PV > dom0 because the code is gated on xen_hvm_domain, but you should also > take into account that a PVH dom0 is considered a HVM domain, and > hence will get the notifier registered. > Ok that makes sense now. 
This is good to be safe, but my patch series is only to support domU hibernation, so I am not sure if this will affect pvh dom0. However, since I do not have a good way of testing sure I will add the check. Moreover, in Patch-0004, I do register suspend/resume syscore_ops specifically for domU hibernation only if its xen_hvm_domain. I don't see any reason that should not be registered for domU's running on pvh dom0. Those suspend/resume callbacks will only be invoked in case hibernation and will be skipped if generic suspend path is in progress. Do you see any issue with that? > > > Also I would contact the QubesOS guys, they rely heavily on the > > > suspend feature for dom0, and that's something not currently tested by > > > osstest so any breakages there go unnoticed. > > > > > Was this for me or Boris? If its the former then I have no idea how to? > > I've now added Marek. > > Roger. Anchal
>>>>>> +static int xen_setup_pm_notifier(void) >>>>>> +{ >>>>>> + if (!xen_hvm_domain()) >>>>>> + return -ENODEV; >>>>>> >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? >>>>> It would be great to support that however, its out of >>>>> scope for this patch set. >>>>> I’ll be happy to discuss it separately. >>>> >>>> I wasn't implying that this *should* work on ARM but rather whether this >>>> will break ARM somehow (because xen_hvm_domain() is true there). >>>> >>>> >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series >>> was only support x86 guests hibernation. >>> Moreover, this notifier is there to distinguish between 2 PM >>> events PM SUSPEND and PM hibernation. Now since we only care about PM >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. >>> However, I may have to fix other patches in the series where this check may >>> appear and cater it only for x86 right? >> >> >> I don't know what would happen if ARM guest tries to handle hibernation >> callbacks. The only ones that you are introducing are in block and net >> fronts and that's arch-independent. >> >> >> You do add a bunch of x86-specific code though (syscore ops), would >> something similar be needed for ARM? >> >> > I don't expect this to work out of the box on ARM. To start with something > similar will be needed for ARM too. > We may still want to keep the driver code as-is. > > I understand the concern here wrt ARM, however, currently the support is only > proposed for x86 guests here and similar work could be carried out for ARM. > Also, if regular hibernation works correctly on arm, then all is needed is to > fix Xen side of things. > > I am not sure what could be done to achieve any assurances on arm side as far as > this series is concerned. If you are not sure what the effects are (or sure that it won't work) on ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) return -ENODEV; (plus '|| xen_initial_domain()' for PVH dom0 as Roger suggested.) -boris
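Folded into the setup function, the guard Boris is proposing would read roughly
as follows. The register_pm_notifier() call and the notifier block name are
assumptions, since that part of the patch is not quoted in this thread:

static int xen_setup_pm_notifier(void)
{
        /*
         * x86 only for now (effects elsewhere are unknown), HVM/PVHVM guests
         * only, and never a (PVH) dom0.
         */
        if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain() ||
            xen_initial_domain())
                return -ENODEV;

        return register_pm_notifier(&xen_pm_notifier_block);
}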
On Tue, 21 Jul 2020, Boris Ostrovsky wrote: > >>>>>> +static int xen_setup_pm_notifier(void) > >>>>>> +{ > >>>>>> + if (!xen_hvm_domain()) > >>>>>> + return -ENODEV; > >>>>>> > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > >>>>> It would be great to support that however, its out of > >>>>> scope for this patch set. > >>>>> I’ll be happy to discuss it separately. > >>>> > >>>> I wasn't implying that this *should* work on ARM but rather whether this > >>>> will break ARM somehow (because xen_hvm_domain() is true there). > >>>> > >>>> > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > >>> was only support x86 guests hibernation. > >>> Moreover, this notifier is there to distinguish between 2 PM > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > >>> However, I may have to fix other patches in the series where this check may > >>> appear and cater it only for x86 right? > >> > >> > >> I don't know what would happen if ARM guest tries to handle hibernation > >> callbacks. The only ones that you are introducing are in block and net > >> fronts and that's arch-independent. > >> > >> > >> You do add a bunch of x86-specific code though (syscore ops), would > >> something similar be needed for ARM? > >> > >> > > I don't expect this to work out of the box on ARM. To start with something > > similar will be needed for ARM too. > > We may still want to keep the driver code as-is. > > > > I understand the concern here wrt ARM, however, currently the support is only > > proposed for x86 guests here and similar work could be carried out for ARM. > > Also, if regular hibernation works correctly on arm, then all is needed is to > > fix Xen side of things. > > > > I am not sure what could be done to achieve any assurances on arm side as far as > > this series is concerned. Just to clarify: new features don't need to work on ARM or cause any addition efforts to you to make them work on ARM. The patch series only needs not to break existing code paths (on ARM and any other platforms). It should also not make it overly difficult to implement the ARM side of things (if there is one) at some point in the future. FYI drivers/xen/manage.c is compiled and working on ARM today, however Xen suspend/resume is not supported. I don't know for sure if guest-initiated hibernation works because I have not tested it. > If you are not sure what the effects are (or sure that it won't work) on > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) > return -ENODEV; That is a good principle to have and thanks for suggesting it. However, in this specific case there is nothing in this patch that doesn't work on ARM. From an ARM perspective I think we should enable it and &xen_pm_notifier_block should be registered. Given that all guests are HVM guests on ARM, it should work fine as is. I gave a quick look at the rest of the series and everything looks fine to me from an ARM perspective. I cannot imaging that the new freeze, thaw, and restore callbacks for net and block are going to cause any trouble on ARM. The two main x86-specific functions are xen_syscore_suspend/resume and they look trivial to implement on ARM (in the sense that they are likely going to look exactly the same.) 
One question for Anchal: what's going to happen if you trigger a hibernation when you have the new callbacks but are missing xen_syscore_suspend/resume? Is it any worse than not having the new freeze, thaw and restore callbacks at all and trying to do a hibernation?
On Tue, Jul 21, 2020 at 07:55:09PM +0000, Anchal Agarwal wrote: > On Tue, Jul 21, 2020 at 10:30:18AM +0200, Roger Pau Monné wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > Marek: I'm adding you in case you could be able to give this a try and > > make sure it doesn't break suspend for dom0. > > > > On Tue, Jul 21, 2020 at 12:17:36AM +0000, Anchal Agarwal wrote: > > > On Mon, Jul 20, 2020 at 11:37:05AM +0200, Roger Pau Monné wrote: > > > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > > > > > > > > > On Sat, Jul 18, 2020 at 09:47:04PM -0400, Boris Ostrovsky wrote: > > > > > (Roger, question for you at the very end) > > > > > > > > > > On 7/17/20 3:10 PM, Anchal Agarwal wrote: > > > > > > On Wed, Jul 15, 2020 at 05:18:08PM -0400, Boris Ostrovsky wrote: > > > > > >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > >> > > > > > >> > > > > > >> > > > > > >> On 7/15/20 4:49 PM, Anchal Agarwal wrote: > > > > > >>> On Mon, Jul 13, 2020 at 11:52:01AM -0400, Boris Ostrovsky wrote: > > > > > >>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> On 7/2/20 2:21 PM, Anchal Agarwal wrote: > > > > > >>>> And PVH dom0. > > > > > >>> That's another good use case to make it work with however, I still > > > > > >>> think that should be tested/worked upon separately as the feature itself > > > > > >>> (PVH Dom0) is very new. > > > > > >> > > > > > >> Same question here --- will this break PVH dom0? > > > > > >> > > > > > > I haven't tested it as a part of this series. Is that a blocker here? > > > > > > > > > > > > > > > I suspect dom0 will not do well now as far as hibernation goes, in which > > > > > case you are not breaking anything. > > > > > > > > > > > > > > > Roger? > > > > > > > > I sadly don't have any box ATM that supports hibernation where I > > > > could test it. We have hibernation support for PV dom0, so while I > > > > haven't done anything specific to support or test hibernation on PVH > > > > dom0 I would at least aim to not make this any worse, and hence the > > > > check should at least also fail for a PVH dom0? > > > > > > > > if (!xen_hvm_domain() || xen_initial_domain()) > > > > return -ENODEV; > > > > > > > > Ie: none of this should be applied to a PVH dom0, as it doesn't have > > > > PV devices and hence should follow the bare metal device suspend. > > > > > > > So from what I understand you meant for any guest running on pvh dom0 should not > > > hibernate if hibernation is triggered from within the guest or should they? > > > > Er no to both I think. What I meant is that a PVH dom0 should be able > > to properly suspend, and we should make sure this work doesn't make > > this any harder (or breaks it if it's currently working). > > > > Or at least that's how I understood the question raised by Boris. > > > > You are adding code to the generic suspend path that's also used by dom0 > > in order to perform bare metal suspension. 
This is fine now for a PV > > dom0 because the code is gated on xen_hvm_domain, but you should also > > take into account that a PVH dom0 is considered a HVM domain, and > > hence will get the notifier registered. > > > Ok that makes sense now. This is good to be safe, but my patch series is only to > support domU hibernation, so I am not sure if this will affect pvh dom0. > However, since I do not have a good way of testing sure I will add the check. > > Moreover, in Patch-0004, I do register suspend/resume syscore_ops specifically for domU > hibernation only if its xen_hvm_domain. So if the hooks are only registered for domU, do you still need this xen_hvm_domain check here? I have to admit I'm not familiar with Linux PM suspend. > I don't see any reason that should not > be registered for domU's running on pvh dom0. To be clear: it should be registered for all HVM domUs, regardless of whether they are running on a PV or a PVH dom0. My intention was never to suggest otherwise. It should be enabled for all HVM domUs, but shouldn't be enabled for HVM dom0. > Those suspend/resume callbacks will > only be invoked in case hibernation and will be skipped if generic suspend path > is in progress. Do you see any issue with that? No, I think it's fine. Roger.
On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On Tue, 21 Jul 2020, Boris Ostrovsky wrote: > > >>>>>> +static int xen_setup_pm_notifier(void) > > >>>>>> +{ > > >>>>>> + if (!xen_hvm_domain()) > > >>>>>> + return -ENODEV; > > >>>>>> > > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > > >>>>> It would be great to support that however, its out of > > >>>>> scope for this patch set. > > >>>>> I’ll be happy to discuss it separately. > > >>>> > > >>>> I wasn't implying that this *should* work on ARM but rather whether this > > >>>> will break ARM somehow (because xen_hvm_domain() is true there). > > >>>> > > >>>> > > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > > >>> was only support x86 guests hibernation. > > >>> Moreover, this notifier is there to distinguish between 2 PM > > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM > > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > > >>> However, I may have to fix other patches in the series where this check may > > >>> appear and cater it only for x86 right? > > >> > > >> > > >> I don't know what would happen if ARM guest tries to handle hibernation > > >> callbacks. The only ones that you are introducing are in block and net > > >> fronts and that's arch-independent. > > >> > > >> > > >> You do add a bunch of x86-specific code though (syscore ops), would > > >> something similar be needed for ARM? > > >> > > >> > > > I don't expect this to work out of the box on ARM. To start with something > > > similar will be needed for ARM too. > > > We may still want to keep the driver code as-is. > > > > > > I understand the concern here wrt ARM, however, currently the support is only > > > proposed for x86 guests here and similar work could be carried out for ARM. > > > Also, if regular hibernation works correctly on arm, then all is needed is to > > > fix Xen side of things. > > > > > > I am not sure what could be done to achieve any assurances on arm side as far as > > > this series is concerned. > > Just to clarify: new features don't need to work on ARM or cause any > addition efforts to you to make them work on ARM. The patch series only > needs not to break existing code paths (on ARM and any other platforms). > It should also not make it overly difficult to implement the ARM side of > things (if there is one) at some point in the future. > > FYI drivers/xen/manage.c is compiled and working on ARM today, however > Xen suspend/resume is not supported. I don't know for sure if > guest-initiated hibernation works because I have not tested it. > > > > > If you are not sure what the effects are (or sure that it won't work) on > > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. > > > > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) > > return -ENODEV; > > That is a good principle to have and thanks for suggesting it. However, > in this specific case there is nothing in this patch that doesn't work > on ARM. From an ARM perspective I think we should enable it and > &xen_pm_notifier_block should be registered. > This question is for Boris, I think you we decided to get rid of the notifier in V3 as all we need to check is SHUTDOWN_SUSPEND state which sounds plausible to me. So this check may go away. 
It may still be needed for sycore_ops callbacks registration. > Given that all guests are HVM guests on ARM, it should work fine as is. > > > I gave a quick look at the rest of the series and everything looks fine > to me from an ARM perspective. I cannot imaging that the new freeze, > thaw, and restore callbacks for net and block are going to cause any > trouble on ARM. The two main x86-specific functions are > xen_syscore_suspend/resume and they look trivial to implement on ARM (in > the sense that they are likely going to look exactly the same.) > Yes but for now since things are not tested I will put this !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe and not break anything. > > One question for Anchal: what's going to happen if you trigger a > hibernation, you have the new callbacks, but you are missing > xen_syscore_suspend/resume? > > Is it any worse than not having the new freeze, thaw and restore > callbacks at all and try to do a hibernation? If callbacks are not there, I don't expect hibernation to work correctly. These callbacks takes care of xen primitives like shared_info_page, grant table, sched clock, runstate time which are important to save the correct state of the guest and bring it back up. Other patches in the series, adds all the logic to these syscore callbacks. Freeze/thaw/restore are just there for at driver level. Thanks, Anchal
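A sketch of the gated syscore_ops registration Anchal says she will add. The
xen_hibernate_syscore_ops name and the initcall placement are assumptions;
xen_syscore_suspend/resume are introduced by later patches in the series:

#include <linux/init.h>
#include <linux/syscore_ops.h>
#include <xen/xen.h>

/* Implemented in a later patch of the series. */
extern int xen_syscore_suspend(void);
extern void xen_syscore_resume(void);

static struct syscore_ops xen_hibernate_syscore_ops = {
        .suspend = xen_syscore_suspend,
        .resume  = xen_syscore_resume,
};

static int __init xen_setup_syscore_ops(void)
{
        /* x86 HVM/PVHVM guests only, never dom0, until other cases are tested. */
        if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain() || xen_initial_domain())
                return -ENODEV;

        register_syscore_ops(&xen_hibernate_syscore_ops);
        return 0;
}
late_initcall(xen_setup_syscore_ops);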
On 7/22/20 2:02 PM, Anchal Agarwal wrote: > On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote: >> >> >>> If you are not sure what the effects are (or sure that it won't work) on >>> ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. >>> >>> >>> if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) >>> return -ENODEV; >> That is a good principle to have and thanks for suggesting it. However, >> in this specific case there is nothing in this patch that doesn't work >> on ARM. From an ARM perspective I think we should enable it and >> &xen_pm_notifier_block should be registered. >> > This question is for Boris, I think you we decided to get rid of the notifier > in V3 as all we need to check is SHUTDOWN_SUSPEND state which sounds plausible > to me. So this check may go away. It may still be needed for sycore_ops > callbacks registration. If this check is going away then I guess there is nothing to do here. My concern isn't about this particular notifier but rather whether this feature may affect existing functionality (ARM and PVH dom0). If Stefano feels this should be fine for ARM then so be it. -boris >> Given that all guests are HVM guests on ARM, it should work fine as is. >> >> >> I gave a quick look at the rest of the series and everything looks fine >> to me from an ARM perspective. I cannot imaging that the new freeze, >> thaw, and restore callbacks for net and block are going to cause any >> trouble on ARM. The two main x86-specific functions are >> xen_syscore_suspend/resume and they look trivial to implement on ARM (in >> the sense that they are likely going to look exactly the same.) >> > Yes but for now since things are not tested I will put this > !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe > and not break anything. >> One question for Anchal: what's going to happen if you trigger a >> hibernation, you have the new callbacks, but you are missing >> xen_syscore_suspend/resume? >> >> Is it any worse than not having the new freeze, thaw and restore >> callbacks at all and try to do a hibernation? > If callbacks are not there, I don't expect hibernation to work correctly. > These callbacks takes care of xen primitives like shared_info_page, > grant table, sched clock, runstate time which are important to save the correct > state of the guest and bring it back up. Other patches in the series, adds all > the logic to these syscore callbacks. Freeze/thaw/restore are just there for at driver > level. > > Thanks, > Anchal
On Wed, 22 Jul 2020, Anchal Agarwal wrote: > On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote: > > On Tue, 21 Jul 2020, Boris Ostrovsky wrote: > > > >>>>>> +static int xen_setup_pm_notifier(void) > > > >>>>>> +{ > > > >>>>>> + if (!xen_hvm_domain()) > > > >>>>>> + return -ENODEV; > > > >>>>>> > > > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > > > >>>>> It would be great to support that however, its out of > > > >>>>> scope for this patch set. > > > >>>>> I’ll be happy to discuss it separately. > > > >>>> > > > >>>> I wasn't implying that this *should* work on ARM but rather whether this > > > >>>> will break ARM somehow (because xen_hvm_domain() is true there). > > > >>>> > > > >>>> > > > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > > > >>> was only support x86 guests hibernation. > > > >>> Moreover, this notifier is there to distinguish between 2 PM > > > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM > > > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > > > >>> However, I may have to fix other patches in the series where this check may > > > >>> appear and cater it only for x86 right? > > > >> > > > >> > > > >> I don't know what would happen if ARM guest tries to handle hibernation > > > >> callbacks. The only ones that you are introducing are in block and net > > > >> fronts and that's arch-independent. > > > >> > > > >> > > > >> You do add a bunch of x86-specific code though (syscore ops), would > > > >> something similar be needed for ARM? > > > >> > > > >> > > > > I don't expect this to work out of the box on ARM. To start with something > > > > similar will be needed for ARM too. > > > > We may still want to keep the driver code as-is. > > > > > > > > I understand the concern here wrt ARM, however, currently the support is only > > > > proposed for x86 guests here and similar work could be carried out for ARM. > > > > Also, if regular hibernation works correctly on arm, then all is needed is to > > > > fix Xen side of things. > > > > > > > > I am not sure what could be done to achieve any assurances on arm side as far as > > > > this series is concerned. > > > > Just to clarify: new features don't need to work on ARM or cause any > > addition efforts to you to make them work on ARM. The patch series only > > needs not to break existing code paths (on ARM and any other platforms). > > It should also not make it overly difficult to implement the ARM side of > > things (if there is one) at some point in the future. > > > > FYI drivers/xen/manage.c is compiled and working on ARM today, however > > Xen suspend/resume is not supported. I don't know for sure if > > guest-initiated hibernation works because I have not tested it. > > > > > > > > > If you are not sure what the effects are (or sure that it won't work) on > > > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. > > > > > > > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) > > > return -ENODEV; > > > > That is a good principle to have and thanks for suggesting it. However, > > in this specific case there is nothing in this patch that doesn't work > > on ARM. From an ARM perspective I think we should enable it and > > &xen_pm_notifier_block should be registered. > > > This question is for Boris, I think you we decided to get rid of the notifier > in V3 as all we need to check is SHUTDOWN_SUSPEND state which sounds plausible > to me. So this check may go away. 
It may still be needed for sycore_ops > callbacks registration. > > Given that all guests are HVM guests on ARM, it should work fine as is. > > > > > > I gave a quick look at the rest of the series and everything looks fine > > to me from an ARM perspective. I cannot imaging that the new freeze, > > thaw, and restore callbacks for net and block are going to cause any > > trouble on ARM. The two main x86-specific functions are > > xen_syscore_suspend/resume and they look trivial to implement on ARM (in > > the sense that they are likely going to look exactly the same.) > > > Yes but for now since things are not tested I will put this > !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe > and not break anything. > > > > One question for Anchal: what's going to happen if you trigger a > > hibernation, you have the new callbacks, but you are missing > > xen_syscore_suspend/resume? > > > > Is it any worse than not having the new freeze, thaw and restore > > callbacks at all and try to do a hibernation? > If callbacks are not there, I don't expect hibernation to work correctly. > These callbacks takes care of xen primitives like shared_info_page, > grant table, sched clock, runstate time which are important to save the correct > state of the guest and bring it back up. Other patches in the series, adds all > the logic to these syscore callbacks. Freeze/thaw/restore are just there for at driver > level. I meant the other way around :-) Let me rephrase the question. Do you think that implementing freeze/thaw/restore at the driver level without having xen_syscore_suspend/resume can potentially make things worse compared to not having freeze/thaw/restore at the driver level at all?
On Wed, Jul 22, 2020 at 04:49:16PM -0700, Stefano Stabellini wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > On Wed, 22 Jul 2020, Anchal Agarwal wrote: > > On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote: > > > On Tue, 21 Jul 2020, Boris Ostrovsky wrote: > > > > >>>>>> +static int xen_setup_pm_notifier(void) > > > > >>>>>> +{ > > > > >>>>>> + if (!xen_hvm_domain()) > > > > >>>>>> + return -ENODEV; > > > > >>>>>> > > > > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > > > > >>>>> It would be great to support that however, its out of > > > > >>>>> scope for this patch set. > > > > >>>>> I’ll be happy to discuss it separately. > > > > >>>> > > > > >>>> I wasn't implying that this *should* work on ARM but rather whether this > > > > >>>> will break ARM somehow (because xen_hvm_domain() is true there). > > > > >>>> > > > > >>>> > > > > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > > > > >>> was only support x86 guests hibernation. > > > > >>> Moreover, this notifier is there to distinguish between 2 PM > > > > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM > > > > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > > > > >>> However, I may have to fix other patches in the series where this check may > > > > >>> appear and cater it only for x86 right? > > > > >> > > > > >> > > > > >> I don't know what would happen if ARM guest tries to handle hibernation > > > > >> callbacks. The only ones that you are introducing are in block and net > > > > >> fronts and that's arch-independent. > > > > >> > > > > >> > > > > >> You do add a bunch of x86-specific code though (syscore ops), would > > > > >> something similar be needed for ARM? > > > > >> > > > > >> > > > > > I don't expect this to work out of the box on ARM. To start with something > > > > > similar will be needed for ARM too. > > > > > We may still want to keep the driver code as-is. > > > > > > > > > > I understand the concern here wrt ARM, however, currently the support is only > > > > > proposed for x86 guests here and similar work could be carried out for ARM. > > > > > Also, if regular hibernation works correctly on arm, then all is needed is to > > > > > fix Xen side of things. > > > > > > > > > > I am not sure what could be done to achieve any assurances on arm side as far as > > > > > this series is concerned. > > > > > > Just to clarify: new features don't need to work on ARM or cause any > > > addition efforts to you to make them work on ARM. The patch series only > > > needs not to break existing code paths (on ARM and any other platforms). > > > It should also not make it overly difficult to implement the ARM side of > > > things (if there is one) at some point in the future. > > > > > > FYI drivers/xen/manage.c is compiled and working on ARM today, however > > > Xen suspend/resume is not supported. I don't know for sure if > > > guest-initiated hibernation works because I have not tested it. > > > > > > > > > > > > > If you are not sure what the effects are (or sure that it won't work) on > > > > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. > > > > > > > > > > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) > > > > return -ENODEV; > > > > > > That is a good principle to have and thanks for suggesting it. 
However, > > > in this specific case there is nothing in this patch that doesn't work > > > on ARM. From an ARM perspective I think we should enable it and > > > &xen_pm_notifier_block should be registered. > > > > > This question is for Boris, I think you we decided to get rid of the notifier > > in V3 as all we need to check is SHUTDOWN_SUSPEND state which sounds plausible > > to me. So this check may go away. It may still be needed for sycore_ops > > callbacks registration. > > > Given that all guests are HVM guests on ARM, it should work fine as is. > > > > > > > > > I gave a quick look at the rest of the series and everything looks fine > > > to me from an ARM perspective. I cannot imaging that the new freeze, > > > thaw, and restore callbacks for net and block are going to cause any > > > trouble on ARM. The two main x86-specific functions are > > > xen_syscore_suspend/resume and they look trivial to implement on ARM (in > > > the sense that they are likely going to look exactly the same.) > > > > > Yes but for now since things are not tested I will put this > > !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe > > and not break anything. > > > > > > One question for Anchal: what's going to happen if you trigger a > > > hibernation, you have the new callbacks, but you are missing > > > xen_syscore_suspend/resume? > > > > > > Is it any worse than not having the new freeze, thaw and restore > > > callbacks at all and try to do a hibernation? > > If callbacks are not there, I don't expect hibernation to work correctly. > > These callbacks takes care of xen primitives like shared_info_page, > > grant table, sched clock, runstate time which are important to save the correct > > state of the guest and bring it back up. Other patches in the series, adds all > > the logic to these syscore callbacks. Freeze/thaw/restore are just there for at driver > > level. > > I meant the other way around :-) Let me rephrase the question. > > Do you think that implementing freeze/thaw/restore at the driver level > without having xen_syscore_suspend/resume can potentially make things > worse compared to not having freeze/thaw/restore at the driver level at > all? I think in both the cases I don't expect it to work. System may end up in different state if you register vs not. Hibernation does not work properly at least for domU instances without these changes on x86 and I am assuming the same for ARM. If you do not register freeze/thaw/restore callbacks for arm, then on invocation of xenbus_dev_suspend, default suspend/resume callbacks will be called for each driver and since you do not have any code to save domU's xen primitives state (syscore_ops), hibernation will either fail or will demand a reboot. I do no have setup to test the current state of ARM's hibernation If you only register freeze/thaw/restore and no syscore_ops, it will again fail. Since, I do not have an ARM setup running, I quickly ran a similar test on x86, may not be an apple to apple comparison but instance failed to resume or I should say stuck showing huge jump in time and required a reboot. Now if this doesn't happen currently when you trigger hibernation on arm domU instances or if system is still alive when you trigger hibernation in xen guest then not registering the callbacks may be a better idea. In that case may be I need to put arch specific check when registering freeze/thaw/restore handlers. Hope that answers your question. Thanks, Anchal
On Thu, 23 Jul 2020, Anchal Agarwal wrote: > On Wed, Jul 22, 2020 at 04:49:16PM -0700, Stefano Stabellini wrote: > > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > > > > > On Wed, 22 Jul 2020, Anchal Agarwal wrote: > > > On Tue, Jul 21, 2020 at 05:18:34PM -0700, Stefano Stabellini wrote: > > > > On Tue, 21 Jul 2020, Boris Ostrovsky wrote: > > > > > >>>>>> +static int xen_setup_pm_notifier(void) > > > > > >>>>>> +{ > > > > > >>>>>> + if (!xen_hvm_domain()) > > > > > >>>>>> + return -ENODEV; > > > > > >>>>>> > > > > > >>>>>> I forgot --- what did we decide about non-x86 (i.e. ARM)? > > > > > >>>>> It would be great to support that however, its out of > > > > > >>>>> scope for this patch set. > > > > > >>>>> I’ll be happy to discuss it separately. > > > > > >>>> > > > > > >>>> I wasn't implying that this *should* work on ARM but rather whether this > > > > > >>>> will break ARM somehow (because xen_hvm_domain() is true there). > > > > > >>>> > > > > > >>>> > > > > > >>> Ok makes sense. TBH, I haven't tested this part of code on ARM and the series > > > > > >>> was only support x86 guests hibernation. > > > > > >>> Moreover, this notifier is there to distinguish between 2 PM > > > > > >>> events PM SUSPEND and PM hibernation. Now since we only care about PM > > > > > >>> HIBERNATION I may just remove this code and rely on "SHUTDOWN_SUSPEND" state. > > > > > >>> However, I may have to fix other patches in the series where this check may > > > > > >>> appear and cater it only for x86 right? > > > > > >> > > > > > >> > > > > > >> I don't know what would happen if ARM guest tries to handle hibernation > > > > > >> callbacks. The only ones that you are introducing are in block and net > > > > > >> fronts and that's arch-independent. > > > > > >> > > > > > >> > > > > > >> You do add a bunch of x86-specific code though (syscore ops), would > > > > > >> something similar be needed for ARM? > > > > > >> > > > > > >> > > > > > > I don't expect this to work out of the box on ARM. To start with something > > > > > > similar will be needed for ARM too. > > > > > > We may still want to keep the driver code as-is. > > > > > > > > > > > > I understand the concern here wrt ARM, however, currently the support is only > > > > > > proposed for x86 guests here and similar work could be carried out for ARM. > > > > > > Also, if regular hibernation works correctly on arm, then all is needed is to > > > > > > fix Xen side of things. > > > > > > > > > > > > I am not sure what could be done to achieve any assurances on arm side as far as > > > > > > this series is concerned. > > > > > > > > Just to clarify: new features don't need to work on ARM or cause any > > > > addition efforts to you to make them work on ARM. The patch series only > > > > needs not to break existing code paths (on ARM and any other platforms). > > > > It should also not make it overly difficult to implement the ARM side of > > > > things (if there is one) at some point in the future. > > > > > > > > FYI drivers/xen/manage.c is compiled and working on ARM today, however > > > > Xen suspend/resume is not supported. I don't know for sure if > > > > guest-initiated hibernation works because I have not tested it. > > > > > > > > > > > > > > > > > If you are not sure what the effects are (or sure that it won't work) on > > > > > ARM then I'd add IS_ENABLED(CONFIG_X86) check, i.e. 
> > > > > > > > > > > > > > > if (!IS_ENABLED(CONFIG_X86) || !xen_hvm_domain()) > > > > > return -ENODEV; > > > > > > > > That is a good principle to have and thanks for suggesting it. However, > > > > in this specific case there is nothing in this patch that doesn't work > > > > on ARM. From an ARM perspective I think we should enable it and > > > > &xen_pm_notifier_block should be registered. > > > > > > > This question is for Boris, I think you we decided to get rid of the notifier > > > in V3 as all we need to check is SHUTDOWN_SUSPEND state which sounds plausible > > > to me. So this check may go away. It may still be needed for sycore_ops > > > callbacks registration. > > > > Given that all guests are HVM guests on ARM, it should work fine as is. > > > > > > > > > > > > I gave a quick look at the rest of the series and everything looks fine > > > > to me from an ARM perspective. I cannot imaging that the new freeze, > > > > thaw, and restore callbacks for net and block are going to cause any > > > > trouble on ARM. The two main x86-specific functions are > > > > xen_syscore_suspend/resume and they look trivial to implement on ARM (in > > > > the sense that they are likely going to look exactly the same.) > > > > > > > Yes but for now since things are not tested I will put this > > > !IS_ENABLED(CONFIG_X86) on syscore_ops calls registration part just to be safe > > > and not break anything. > > > > > > > > One question for Anchal: what's going to happen if you trigger a > > > > hibernation, you have the new callbacks, but you are missing > > > > xen_syscore_suspend/resume? > > > > > > > > Is it any worse than not having the new freeze, thaw and restore > > > > callbacks at all and try to do a hibernation? > > > If callbacks are not there, I don't expect hibernation to work correctly. > > > These callbacks takes care of xen primitives like shared_info_page, > > > grant table, sched clock, runstate time which are important to save the correct > > > state of the guest and bring it back up. Other patches in the series, adds all > > > the logic to these syscore callbacks. Freeze/thaw/restore are just there for at driver > > > level. > > > > I meant the other way around :-) Let me rephrase the question. > > > > Do you think that implementing freeze/thaw/restore at the driver level > > without having xen_syscore_suspend/resume can potentially make things > > worse compared to not having freeze/thaw/restore at the driver level at > > all? > I think in both the cases I don't expect it to work. System may end up in > different state if you register vs not. Hibernation does not work properly > at least for domU instances without these changes on x86 and I am assuming the > same for ARM. > > If you do not register freeze/thaw/restore callbacks for arm, then on > invocation of xenbus_dev_suspend, default suspend/resume callbacks > will be called for each driver and since you do not have any code to save domU's > xen primitives state (syscore_ops), hibernation will either fail or will demand a reboot. > I do no have setup to test the current state of ARM's hibernation > > If you only register freeze/thaw/restore and no syscore_ops, it will again fail. > Since, I do not have an ARM setup running, I quickly ran a similar test on x86, > may not be an apple to apple comparison but instance failed to resume or I > should say stuck showing huge jump in time and required a reboot. 
>
> Now if this doesn't happen currently when you trigger hibernation on arm domU
> instances or if system is still alive when you trigger hibernation in xen guest
> then not registering the callbacks may be a better idea. In that case may be
> I need to put arch specific check when registering freeze/thaw/restore handlers.
>
> Hope that answers your question.

Yes, it does, thank you. I'd rather not introduce unknown regressions so
I would recommend to add an arch-specific check on registering
freeze/thaw/restore handlers. Maybe something like the following:

#ifdef CONFIG_X86
    .freeze = blkfront_freeze,
    .thaw = blkfront_restore,
    .restore = blkfront_restore
#endif

maybe Boris has a better suggestion on how to do it
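[Editorial note] To make the suggestion above concrete, here is a minimal sketch of how the guard could sit in blkfront's xenbus_driver definition. It assumes the freeze/thaw/restore hooks that this series adds to struct xenbus_driver, and reuses the blkfront_freeze()/blkfront_restore() handler names from Stefano's snippet; the other fields are the existing blkfront_driver entries, shown only for context, and this is not code from the series itself.

    static struct xenbus_driver blkfront_driver = {
    	.ids = blkfront_ids,
    	.probe = blkfront_probe,
    	.remove = blkfront_remove,
    	.resume = blkfront_resume,
    	.otherend_changed = blkback_changed,
    	.is_ready = blkfront_is_ready,
    #ifdef CONFIG_X86
    	/* PM hibernation handlers are only wired up for x86 guests for now. */
    	.freeze = blkfront_freeze,
    	.thaw = blkfront_restore,
    	.restore = blkfront_restore,
    #endif
    };

The same pattern would apply to netfront's xenbus_driver, since the discussion covers both the block and net frontends.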
On 7/24/20 7:01 PM, Stefano Stabellini wrote:
> Yes, it does, thank you. I'd rather not introduce unknown regressions so
> I would recommend to add an arch-specific check on registering
> freeze/thaw/restore handlers. Maybe something like the following:
>
> #ifdef CONFIG_X86
>     .freeze = blkfront_freeze,
>     .thaw = blkfront_restore,
>     .restore = blkfront_restore
> #endif
>
>
> maybe Boris has a better suggestion on how to do it

An alternative might be to still install pm notifier in
drivers/xen/manage.c (I think as result of latest discussions we decided
we won't need it) and return -ENOTSUPP for ARM for
PM_HIBERNATION_PREPARE and friends. Would that work?

-boris
On Mon, Jul 27, 2020 at 06:08:29PM -0400, Boris Ostrovsky wrote:
> On 7/24/20 7:01 PM, Stefano Stabellini wrote:
> > Yes, it does, thank you. I'd rather not introduce unknown regressions so
> > I would recommend to add an arch-specific check on registering
> > freeze/thaw/restore handlers. Maybe something like the following:
> >
> > #ifdef CONFIG_X86
> >     .freeze = blkfront_freeze,
> >     .thaw = blkfront_restore,
> >     .restore = blkfront_restore
> > #endif
> >
> >
> > maybe Boris has a better suggestion on how to do it
>
> An alternative might be to still install pm notifier in
> drivers/xen/manage.c (I think as result of latest discussions we decided
> we won't need it) and return -ENOTSUPP for ARM for
> PM_HIBERNATION_PREPARE and friends. Would that work?
>
I think the question here is for registering driver specific freeze/thaw/restore
callbacks for x86 only. I have dropped the pm_notifier in the v3 still pending
testing. So I think just registering driver specific callbacks for x86 only is a
good option. What do you think?

Anchal
>
> -boris
>
>
On 7/30/20 7:06 PM, Anchal Agarwal wrote:
> On Mon, Jul 27, 2020 at 06:08:29PM -0400, Boris Ostrovsky wrote:
>> On 7/24/20 7:01 PM, Stefano Stabellini wrote:
>>> Yes, it does, thank you. I'd rather not introduce unknown regressions so
>>> I would recommend to add an arch-specific check on registering
>>> freeze/thaw/restore handlers. Maybe something like the following:
>>>
>>> #ifdef CONFIG_X86
>>>     .freeze = blkfront_freeze,
>>>     .thaw = blkfront_restore,
>>>     .restore = blkfront_restore
>>> #endif
>>>
>>>
>>> maybe Boris has a better suggestion on how to do it
>>
>> An alternative might be to still install pm notifier in
>> drivers/xen/manage.c (I think as result of latest discussions we decided
>> we won't need it) and return -ENOTSUPP for ARM for
>> PM_HIBERNATION_PREPARE and friends. Would that work?
>>
> I think the question here is for registering driver specific freeze/thaw/restore
> callbacks for x86 only. I have dropped the pm_notifier in the v3 still pending
> testing. So I think just registering driver specific callbacks for x86 only is a
> good option. What do you think?


I suggested using the notifier under assumption that if it returns an
error then that will prevent callbacks to be called because hibernation
will be effectively disabled. But I haven't looked at PM code so I don't
know whether this is actually the case.


The advantage of doing it in the notifier is that instead of adding
ifdefs to each driver you will be able to prevent callbacks from a
single place. Plus you can use this do disable hibernation for PVH dom0
as well.


-boris
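[Editorial note] To show what the "single place" check could look like, here is a minimal sketch of the notifier variant Boris describes, reusing the xen_pm_notifier()/xen_setup_pm_notifier() names from the patch. The xen_initial_domain() test is only a stand-in for "PVH dom0", and notifier_from_errno() is used on the assumption that the PM core converts it back into an error and aborts the transition (the reply below confirms the abort behaviour); none of this is the code that actually went into v3.

    static int xen_pm_notifier(struct notifier_block *notifier,
    			       unsigned long pm_event, void *unused)
    {
    	switch (pm_event) {
    	case PM_HIBERNATION_PREPARE:
    	case PM_RESTORE_PREPARE:
    		/*
    		 * Guest hibernation is only wired up for x86 domU so far;
    		 * veto it on ARM and on dom0 so the PM core aborts the
    		 * transition before any driver callbacks are invoked.
    		 */
    		if (!IS_ENABLED(CONFIG_X86) || xen_initial_domain())
    			return notifier_from_errno(-ENOTSUPP);
    		break;
    	default:
    		break;
    	}

    	return NOTIFY_OK;
    }

    static struct notifier_block xen_pm_notifier_block = {
    	.notifier_call = xen_pm_notifier,
    };

    static int xen_setup_pm_notifier(void)
    {
    	if (!xen_hvm_domain())
    		return -ENODEV;

    	return register_pm_notifier(&xen_pm_notifier_block);
    }
    subsys_initcall(xen_setup_pm_notifier);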
On Fri, Jul 31, 2020 at 4:14 PM Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
>
> On 7/30/20 7:06 PM, Anchal Agarwal wrote:
> > On Mon, Jul 27, 2020 at 06:08:29PM -0400, Boris Ostrovsky wrote:
> >> On 7/24/20 7:01 PM, Stefano Stabellini wrote:
> >>> Yes, it does, thank you. I'd rather not introduce unknown regressions so
> >>> I would recommend to add an arch-specific check on registering
> >>> freeze/thaw/restore handlers. Maybe something like the following:
> >>>
> >>> #ifdef CONFIG_X86
> >>>     .freeze = blkfront_freeze,
> >>>     .thaw = blkfront_restore,
> >>>     .restore = blkfront_restore
> >>> #endif
> >>>
> >>>
> >>> maybe Boris has a better suggestion on how to do it
> >>
> >> An alternative might be to still install pm notifier in
> >> drivers/xen/manage.c (I think as result of latest discussions we decided
> >> we won't need it) and return -ENOTSUPP for ARM for
> >> PM_HIBERNATION_PREPARE and friends. Would that work?
> >>
> > I think the question here is for registering driver specific freeze/thaw/restore
> > callbacks for x86 only. I have dropped the pm_notifier in the v3 still pending
> > testing. So I think just registering driver specific callbacks for x86 only is a
> > good option. What do you think?
>
>
> I suggested using the notifier under assumption that if it returns an
> error then that will prevent callbacks to be called because hibernation
> will be effectively disabled.

That's correct.
On Fri, Jul 31, 2020 at 10:13:48AM -0400, Boris Ostrovsky wrote:
> On 7/30/20 7:06 PM, Anchal Agarwal wrote:
> > On Mon, Jul 27, 2020 at 06:08:29PM -0400, Boris Ostrovsky wrote:
> >> On 7/24/20 7:01 PM, Stefano Stabellini wrote:
> >>> Yes, it does, thank you. I'd rather not introduce unknown regressions so
> >>> I would recommend to add an arch-specific check on registering
> >>> freeze/thaw/restore handlers. Maybe something like the following:
> >>>
> >>> #ifdef CONFIG_X86
> >>>     .freeze = blkfront_freeze,
> >>>     .thaw = blkfront_restore,
> >>>     .restore = blkfront_restore
> >>> #endif
> >>>
> >>>
> >>> maybe Boris has a better suggestion on how to do it
> >>
> >> An alternative might be to still install pm notifier in
> >> drivers/xen/manage.c (I think as result of latest discussions we decided
> >> we won't need it) and return -ENOTSUPP for ARM for
> >> PM_HIBERNATION_PREPARE and friends. Would that work?
> >>
> > I think the question here is for registering driver specific freeze/thaw/restore
> > callbacks for x86 only. I have dropped the pm_notifier in the v3 still pending
> > testing. So I think just registering driver specific callbacks for x86 only is a
> > good option. What do you think?
>
> I suggested using the notifier under assumption that if it returns an
> error then that will prevent callbacks to be called because hibernation
> will be effectively disabled. But I haven't looked at PM code so I don't
> know whether this is actually the case.
>
I think this could be done. PM_HIBERNATION_PREPARE could return -ENOTSUPP
for arm and pvh dom0 when the notifier call chain is invoked for this case
in hibernate(). This will then be an empty notifier just for checking two
usecases.
Also, for pvh dom0, the earlier code didn't register any notifier,
with this approach you are suggesting setup the notifier for hvm/pvh dom0 and
arm but fail during notifier call chain during PM_HIBERNATION_PREPARE ?

I think still getting rid of suspend mode that was earlier a part of this
notifier is a good idea as it seems redundant as you pointed out earlier.
>
> The advantage of doing it in the notifier is that instead of adding
> ifdefs to each driver you will be able to prevent callbacks from a
> single place. Plus you can use this do disable hibernation for PVH dom0
> as well.
>
>
> -boris
>
Anchal
>
>
On 8/4/20 7:42 PM, Anchal Agarwal wrote:
>
> I think this could be done. PM_HIBERNATION_PREPARE could return -ENOTSUPP
> for arm and pvh dom0 when the notifier call chain is invoked for this case
> in hibernate(). This will then be an empty notifier just for checking two
> usecases.
> Also, for pvh dom0, the earlier code didn't register any notifier,
> with this approach you are suggesting setup the notifier for hvm/pvh dom0 and
> arm but fail during notifier call chain during PM_HIBERNATION_PREPARE ?


Right.


(Although the earlier code did register the notifier:
xen_setup_pm_notifier() would return an error for !xen_hvm_domain() and
PVH *is* an HVM domain, so registration would actually happen)


>
> I think still getting rid of suspend mode that was earlier a part of this
> notifier is a good idea as it seems redundant as you pointed out earlier.


Yes.


-boris
On Wed, Aug 05, 2020 at 09:31:13AM -0400, Boris Ostrovsky wrote:
> On 8/4/20 7:42 PM, Anchal Agarwal wrote:
> >
> > I think this could be done. PM_HIBERNATION_PREPARE could return -ENOTSUPP
> > for arm and pvh dom0 when the notifier call chain is invoked for this case
> > in hibernate(). This will then be an empty notifier just for checking two
> > usecases.
> > Also, for pvh dom0, the earlier code didn't register any notifier,
> > with this approach you are suggesting setup the notifier for hvm/pvh dom0 and
> > arm but fail during notifier call chain during PM_HIBERNATION_PREPARE ?
>
> Right.
>
> (Although the earlier code did register the notifier:
> xen_setup_pm_notifier() would return an error for !xen_hvm_domain() and
> PVH *is* an HVM domain, so registration would actually happen)
>
Yes, you are right. My bad, what I meant by "earlier code" was whatever we
discussed w.r.t. removing the notifier altogether, i.e. it won't be registered
for pvh dom0. Anyway, got the point :)
>
> >
> > I think still getting rid of suspend mode that was earlier a part of this
> > notifier is a good idea as it seems redundant as you pointed out earlier.
>
> Yes.
>
> -boris

Thanks,
Anchal
>
>
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index cd046684e0d1..69833fd6cfd1 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -14,6 +14,7 @@
 #include <linux/freezer.h>
 #include <linux/syscore_ops.h>
 #include <linux/export.h>
+#include <linux/suspend.h>
 
 #include <xen/xen.h>
 #include <xen/xenbus.h>
@@ -40,6 +41,20 @@ enum shutdown_state {
 /* Ignore multiple shutdown requests. */
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
+enum suspend_modes {
+	NO_SUSPEND = 0,
+	XEN_SUSPEND,
+	PM_HIBERNATION,
+};
+
+/* Protected by pm_mutex */
+static enum suspend_modes suspend_mode = NO_SUSPEND;
+
+bool xen_is_xen_suspend(void)
+{
+	return suspend_mode == XEN_SUSPEND;
+}
+
 struct suspend_info {
 	int cancelled;
 };
@@ -99,6 +114,10 @@ static void do_suspend(void)
 	int err;
 	struct suspend_info si;
 
+	lock_system_sleep();
+
+	suspend_mode = XEN_SUSPEND;
+
 	shutting_down = SHUTDOWN_SUSPEND;
 
 	err = freeze_processes();
@@ -162,6 +181,10 @@ static void do_suspend(void)
 	thaw_processes();
 out:
 	shutting_down = SHUTDOWN_INVALID;
+
+	suspend_mode = NO_SUSPEND;
+
+	unlock_system_sleep();
 }
 #endif /* CONFIG_HIBERNATE_CALLBACKS */
 
@@ -387,3 +410,40 @@ int xen_setup_shutdown_event(void)
 EXPORT_SYMBOL_GPL(xen_setup_shutdown_event);
 
 subsys_initcall(xen_setup_shutdown_event);
+
+static int xen_pm_notifier(struct notifier_block *notifier,
+			   unsigned long pm_event, void *unused)
+{
+	switch (pm_event) {
+	case PM_SUSPEND_PREPARE:
+	case PM_HIBERNATION_PREPARE:
+	case PM_RESTORE_PREPARE:
+		suspend_mode = PM_HIBERNATION;
+		break;
+	case PM_POST_SUSPEND:
+	case PM_POST_RESTORE:
+	case PM_POST_HIBERNATION:
+		/* Set back to the default */
+		suspend_mode = NO_SUSPEND;
+		break;
+	default:
+		pr_warn("Receive unknown PM event 0x%lx\n", pm_event);
+		return -EINVAL;
+	}
+
+	return 0;
+};
+
+static struct notifier_block xen_pm_notifier_block = {
+	.notifier_call = xen_pm_notifier
+};
+
+static int xen_setup_pm_notifier(void)
+{
+	if (!xen_hvm_domain())
+		return -ENODEV;
+
+	return register_pm_notifier(&xen_pm_notifier_block);
+}
+
+subsys_initcall(xen_setup_pm_notifier);
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 39a5580f8feb..2521d6a306cd 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -40,6 +40,7 @@ u64 xen_steal_clock(int cpu);
 
 int xen_setup_shutdown_event(void);
 
+bool xen_is_xen_suspend(void);
 extern unsigned long *xen_contiguous_bitmap;
 #if defined(CONFIG_XEN_PV) || defined(CONFIG_ARM) || defined(CONFIG_ARM64)
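[Editorial note] For context on how the new helper is meant to be consumed, here is a purely hypothetical caller sketch. xen_foo_suspend() and xen_foo_freeze() are made-up names standing in for whatever per-subsystem handlers later patches in the series hook up; this is not code from the series.

    static int xen_foo_suspend_handler(void)
    {
    	/* Toolstack-assisted Xen suspend keeps taking the existing path... */
    	if (xen_is_xen_suspend())
    		return xen_foo_suspend();

    	/* ...while guest-initiated PM hibernation takes the new one. */
    	return xen_foo_freeze();
    }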