Message ID | 20190318131155.29450-5-jgross@suse.com (mailing list archive)
---|---
State | Superseded
Series | xen: simplify suspend/resume handling
On Mon, 2019-03-18 at 14:11 +0100, Juergen Gross wrote:
> Instead of freeing percpu areas during suspend and allocating them
> again when resuming keep them. Only free an area in case a cpu didn't
> come up again when resuming.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Dario Faggioli <dfaggioli@suse.com>

Dario
On 18/03/2019 13:11, Juergen Gross wrote:
> Instead of freeing percpu areas during suspend and allocating them
> again when resuming keep them. Only free an area in case a cpu didn't
> come up again when resuming.
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Hmm - this is slightly problematic, given the dual nature of this code.

I agree that this change is beneficial for the suspend case, but it is
a problem when we are parking an individual CPU for smt=0 or xen-hptool
reasons.

Do we have any hint we can use when taking the CPU down as to whether
we're expecting it to come straight back up again?

~Andrew
On 27/03/2019 16:55, Andrew Cooper wrote:
> On 18/03/2019 13:11, Juergen Gross wrote:
>> Instead of freeing percpu areas during suspend and allocating them
>> again when resuming keep them. Only free an area in case a cpu didn't
>> come up again when resuming.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>
> Hmm - this is slightly problematic, given the dual nature of this code.
>
> I agree that this change is beneficial for the suspend case, but it
> is a problem when we are parking an individual CPU for smt=0 or
> xen-hptool reasons.
>
> Do we have any hint we can use when taking the CPU down as to whether
> we're expecting it to come straight back up again?

Did you look into the patch? I did this by testing system_state.

Juergen
>>> On 27.03.19 at 17:18, <jgross@suse.com> wrote:
> On 27/03/2019 16:55, Andrew Cooper wrote:
>> On 18/03/2019 13:11, Juergen Gross wrote:
>>> Instead of freeing percpu areas during suspend and allocating them
>>> again when resuming keep them. Only free an area in case a cpu didn't
>>> come up again when resuming.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>
>> Hmm - this is slightly problematic, given the dual nature of this code.
>>
>> I agree that this change is beneficial for the suspend case, but it
>> is a problem when we are parking an individual CPU for smt=0 or
>> xen-hptool reasons.
>>
>> Do we have any hint we can use when taking the CPU down as to whether
>> we're expecting it to come straight back up again?
>
> Did you look into the patch? I did this by testing system_state.

I think there's a wider problem here: enable_nonboot_cpus() only brings
back up the CPUs that were previously online. Parked ones would be left
alone, yet after resume they'd need to be put back into parked state.

Jan
On 27/03/2019 17:38, Jan Beulich wrote:
>>>> On 27.03.19 at 17:18, <jgross@suse.com> wrote:
>> [...]
>> Did you look into the patch? I did this by testing system_state.
>
> I think there's a wider problem here: enable_nonboot_cpus()
> only brings back up the CPUs that were previously online.
> Parked ones would be left alone, yet after resume they'd
> need to be put back into parked state.

I can add that handling in the respin of the series.

Juergen
On 27/03/2019 17:52, Juergen Gross wrote:
> On 27/03/2019 17:38, Jan Beulich wrote:
>> [...]
>> I think there's a wider problem here: enable_nonboot_cpus()
>> only brings back up the CPUs that were previously online.
>> Parked ones would be left alone, yet after resume they'd
>> need to be put back into parked state.
>
> I can add that handling in the respin of the series.

Looking deeper into that mess I believe that should be a series of its
own. Cpu parking needs to be handled for cpu hotplug and core parking
(XENPF_core_parking), too.

Juergen
>>> On 18.03.19 at 14:11, <jgross@suse.com> wrote:
> Instead of freeing percpu areas during suspend and allocating them
> again when resuming keep them. Only free an area in case a cpu didn't
> come up again when resuming.

There's another aspect here which needs at least mentioning: Right now,
CPUs coming back up have their per-CPU data cleared. Not doing so may
indeed be an improvement in some cases (like statistics collected), but
may also get in the way in other cases.

Jan
On 28/03/2019 08:46, Jan Beulich wrote:
>>>> On 18.03.19 at 14:11, <jgross@suse.com> wrote:
>> Instead of freeing percpu areas during suspend and allocating them
>> again when resuming keep them. Only free an area in case a cpu didn't
>> come up again when resuming.
>
> There's another aspect here which needs at least mentioning:
> Right now, CPUs coming back up have their per-CPU data
> cleared. Not doing so may indeed be an improvement in
> some cases (like statistics collected), but may also get in the
> way in other cases.

I have checked the respective cpu notifier hooks and I think this is no
problem. I hope I didn't overlook something.

I'll add a note to the commit message.

Juergen
>>> On 28.03.19 at 07:59, <jgross@suse.com> wrote:
> On 27/03/2019 17:52, Juergen Gross wrote:
>> [...]
>> I can add that handling in the respin of the series.
>
> Looking deeper into that mess I believe that should be a series of its
> own. Cpu parking needs to be handled for cpu hotplug and core parking
> (XENPF_core_parking), too.

What issue do you see for CPU hotplug? cpu_up_helper() has been
modified by the parking series.

For core parking I wonder whether core_parking_helper() shouldn't,
first of all, invoke cpu_{up,down}_helper(). This wouldn't be enough,
though - the policy hooks need to honor opt_smt as well.

As to this wanting to be a patch / series of its own - I don't mind,
but preferably it would come ahead of your changes here, so that it can
be backported independently and (sufficiently) easily (unless of course
there's really no collision between the two).

Jan
>>> On 28.03.19 at 08:53, <jgross@suse.com> wrote:
> On 28/03/2019 08:46, Jan Beulich wrote:
>> [...]
>> There's another aspect here which needs at least mentioning:
>> Right now, CPUs coming back up have their per-CPU data
>> cleared. Not doing so may indeed be an improvement in
>> some cases (like statistics collected), but may also get in the
>> way in other cases.
>
> I have checked the respective cpu notifier hooks and I think this is no
> problem. I hope I didn't overlook something.

Is checking the hooks sufficient? I'd rather expect we need to check
all per-CPU data objects.

Jan
On 28/03/2019 09:04, Jan Beulich wrote:
> [...]
> Is checking the hooks sufficient? I'd rather expect we need to
> check all per-CPU data objects.

Why? The main concern would be a hook not doing its job due to some
data not being zeroed.

In case there is a dependency somewhere else I'd rather expect subtle
negative effects with today's solution, as percpu data suddenly becomes
zero after suspend/resume without the owning component having been
involved in the suspend.

Sure, there could be negative effects with my series, but I'd assess
the chances as lower than without my series. Suspend/resume tends not
to be tested very often, it seems. I guess I'm the only one having
tried it recently (it was broken for at least 3 months before my patch
repairing an issue with the IOMMU).

Juergen
On 28/03/2019 09:03, Jan Beulich wrote:
>>>> On 28.03.19 at 07:59, <jgross@suse.com> wrote:
>> [...]
>> Looking deeper into that mess I believe that should be a series of its
>> own. Cpu parking needs to be handled for cpu hotplug and core parking
>> (XENPF_core_parking), too.
>
> What issue do you see for CPU hotplug? cpu_up_helper() has
> been modified by the parking series.

I was thinking of hot unplug. cpu_down() won't do the job for a parked
cpu.

> For core parking I wonder whether core_parking_helper()
> shouldn't, first of all, invoke cpu_{up,down}_helper(). This
> wouldn't be enough, though - the policy hooks need to honor
> opt_smt as well.

Right.

> As to this wanting to be a patch / series of its own - I don't
> mind, but preferably it would come ahead of your changes
> here, so that it can be backported independently and
> (sufficiently) easily (unless of course there's really no
> collision between the two).

In case there is a collision it should be fairly minimal.

I'd prefer not to block my series, as it is a prerequisite for my core
scheduling series, which I believe should go in rather sooner than
later, as it should see a lot of testing before the next release.

Juergen
>>> On 28.03.19 at 09:35, <jgross@suse.com> wrote:
> On 28/03/2019 09:03, Jan Beulich wrote:
>> [...]
>> What issue do you see for CPU hotplug? cpu_up_helper() has
>> been modified by the parking series.
>
> I was thinking of hot unplug. cpu_down() won't do the job for a parked
> cpu.

There's nothing to be done when soft-offlining a CPU. And I'm not
convinced physical CPU unplug was actually ever tested to work.

> In case there is a collision it should be fairly minimal.
>
> I'd prefer not to block my series as it is a prerequisite for my core
> scheduling series, which I believe should go in rather sooner than later
> as it probably should see lot of testing before the next release.

Understood.

Jan
>>> On 28.03.19 at 09:03, <JBeulich@suse.com> wrote:
> For core parking I wonder whether core_parking_helper()
> shouldn't, first of all, invoke cpu_{up,down}_helper(). This
> wouldn't be enough, though - the policy hooks need to honor
> opt_smt as well.

Actually no, there was no problem at the time: With opt_smt set to
false, no secondary thread would ever have made it into
core_parking_cpunum[] (that's where CPU numbers to be passed to
cpu_up() get taken from). A problem here was introduced only by
Andrew's 2bed1bc241, making it possible for opt_smt to change at
runtime. I think I'll make a patch to have smt_up_down_helper() call
into core-parking code to purge CPUs from core_parking_cpunum[] as
needed.

The interaction of core-parking and xen-hptool activities is up for
discussion anyway, I think. At least to me it's not immediately clear
which of the two should take priority. Allowing admins to shoot
themselves in the foot (as we appear to do now) is a reasonable
possibility, but not the only one, the more that the platform is liable
to notice higher power consumption and to subsequently request
offlining of CPUs again anyway.

Jan
diff --git a/xen/arch/x86/percpu.c b/xen/arch/x86/percpu.c
index 8be4ebddf4..5ea14b6ec3 100644
--- a/xen/arch/x86/percpu.c
+++ b/xen/arch/x86/percpu.c
@@ -76,7 +76,8 @@ static int cpu_percpu_callback(
         break;
     case CPU_UP_CANCELED:
     case CPU_DEAD:
-        if ( !park_offline_cpus )
+    case CPU_RESUME_FAILED:
+        if ( !park_offline_cpus && system_state != SYS_STATE_suspend )
             free_percpu_area(cpu);
         break;
     case CPU_REMOVE:
Instead of freeing percpu areas during suspend and allocating them
again when resuming keep them. Only free an area in case a cpu didn't
come up again when resuming.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/arch/x86/percpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)