diff mbox

x86/vpmu: add cpu hot unplug notifier for vpmu

Message ID 1494955785-31546-1-git-send-email-luwei.kang@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Luwei Kang May 16, 2017, 5:29 p.m. UTC
Currently, hot unplug a cpu with vpmu enabled may cause system
hang due to send IPI to a die physical cpu. This patch add a
cpu hot unplug notifer to save vpmu context before cpu offline.

Consider one scene, hotplug physical cpu N with vpmu is enabled.
The vcpu which running on this physical cpu before will be switch
to other online cpu. Before load the vpmu context to new physical
cpu, a IPI will be send to cpu N to save the vpmu context.
System will hang in function on_select_cpus because of that
physical cpu is offline and can not do any response.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/cpu/vpmu.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

Comments

Jan Beulich May 17, 2017, 9:28 a.m. UTC | #1
>>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
> Currently, hot unplug a cpu with vpmu enabled may cause system
> hang due to send IPI to a die physical cpu. This patch add a
> cpu hot unplug notifer to save vpmu context before cpu offline.
> 
> Consider one scene, hotplug physical cpu N with vpmu is enabled.

I think you mean "scenario" and "hot unplug".

> The vcpu which running on this physical cpu before will be switch
> to other online cpu. Before load the vpmu context to new physical
> cpu, a IPI will be send to cpu N to save the vpmu context.
> System will hang in function on_select_cpus because of that
> physical cpu is offline and can not do any response.

Doesn't this make clear that you would better also make sure
->last_pcpu doesn't hold to the then stale CPU anymore? For
example, vpmu_load() compares it with smp_processor_id()
(the subsequent use is guarded by a VPMU_CONTEXT_LOADED
flag check), allowing badness if the same or another CPU with
the same number comes up again quickly enough. Similarly
vpmu_arch_destroy() uses it without checking
VPMU_CONTEXT_LOADED.

> --- a/xen/arch/x86/cpu/vpmu.c
> +++ b/xen/arch/x86/cpu/vpmu.c
> @@ -35,6 +35,7 @@
>  #include <asm/apic.h>
>  #include <public/pmu.h>
>  #include <xsm/xsm.h>
> +#include <xen/cpu.h>

Please place this in the group of other xen/ includes.

> +static int cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
> +{
> +    unsigned int cpu = (unsigned long)hcpu;
> +    struct vcpu *vcpu = per_cpu(last_vcpu, cpu);
> +    struct vpmu_struct *vpmu;
> +
> +    if ( !vcpu )
> +        return NOTIFY_DONE;
> +
> +    vpmu = vcpu_vpmu(vcpu);
> +    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> +        return NOTIFY_DONE;
> +
> +    switch ( action )
> +    {
> +    case CPU_DYING:
> +        vpmu_save_force(vcpu);
> +        vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
> +        break;
> +    default:
> +        break;

Pointless default case.

> @@ -871,10 +902,11 @@ static int __init vpmu_init(void)
>          break;
>      }
>  
> -    if ( vpmu_mode != XENPMU_MODE_OFF )
> +    if ( vpmu_mode != XENPMU_MODE_OFF ) {
> +        register_cpu_notifier(&vpmu_cpu_nfb);
>          printk(XENLOG_INFO "VPMU: version " __stringify(XENPMU_VER_MAJ) "."
>                 __stringify(XENPMU_VER_MIN) "\n");
> -    else
> +    } else

Coding style (brace placement).

Jan
Luwei Kang May 17, 2017, 12:40 p.m. UTC | #2
> >>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
> > Currently, hot unplug a cpu with vpmu enabled may cause system hang
> > due to send IPI to a die physical cpu. This patch add a cpu hot unplug
> > notifer to save vpmu context before cpu offline.
> >
> > Consider one scene, hotplug physical cpu N with vpmu is enabled.
> 
> I think you mean "scenario" and "hot unplug".
> 
> > The vcpu which running on this physical cpu before will be switch to
> > other online cpu. Before load the vpmu context to new physical cpu, a
> > IPI will be send to cpu N to save the vpmu context.
> > System will hang in function on_select_cpus because of that physical
> > cpu is offline and can not do any response.
> 
> Doesn't this make clear that you would better also make sure
> ->last_pcpu doesn't hold to the then stale CPU anymore? For
> example, vpmu_load() compares it with smp_processor_id() (the subsequent use is guarded by a VPMU_CONTEXT_LOADED flag
> check), allowing badness if the same or another CPU with the same number comes up again quickly enough. Similarly
> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.

Hi Jan,
    I think it may can't make sure  "->last_pcpu" doesn't hold to the then stale CPU. The purpose of this notifier is to save the vpmu context before cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There is no reason to change the value except it saving (vpmu_save()) in another physical cpu.
    Regarding vpmu_arch_destroy(), it indeed will cause same issue. What about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu pointer of this physical cpu. 
    In addition, add VPMU_CONTEXT_LOADED check before execute on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) in vpmu_arch_destroy(). Because of force save operation has been finished in notifier function.

Thanks,
Luwei Kang

> 
> > --- a/xen/arch/x86/cpu/vpmu.c
> > +++ b/xen/arch/x86/cpu/vpmu.c
> > @@ -35,6 +35,7 @@
> >  #include <asm/apic.h>
> >  #include <public/pmu.h>
> >  #include <xsm/xsm.h>
> > +#include <xen/cpu.h>
> 
> Please place this in the group of other xen/ includes.
> 
> > +static int cpu_callback(struct notifier_block *nfb, unsigned long
> > +action, void *hcpu) {
> > +    unsigned int cpu = (unsigned long)hcpu;
> > +    struct vcpu *vcpu = per_cpu(last_vcpu, cpu);
> > +    struct vpmu_struct *vpmu;
> > +
> > +    if ( !vcpu )
> > +        return NOTIFY_DONE;
> > +
> > +    vpmu = vcpu_vpmu(vcpu);
> > +    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> > +        return NOTIFY_DONE;
> > +
> > +    switch ( action )
> > +    {
> > +    case CPU_DYING:
> > +        vpmu_save_force(vcpu);
> > +        vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
> > +        break;
> > +    default:
> > +        break;
> 
> Pointless default case.
> 
> > @@ -871,10 +902,11 @@ static int __init vpmu_init(void)
> >          break;
> >      }
> >
> > -    if ( vpmu_mode != XENPMU_MODE_OFF )
> > +    if ( vpmu_mode != XENPMU_MODE_OFF ) {
> > +        register_cpu_notifier(&vpmu_cpu_nfb);
> >          printk(XENLOG_INFO "VPMU: version " __stringify(XENPMU_VER_MAJ) "."
> >                 __stringify(XENPMU_VER_MIN) "\n");
> > -    else
> > +    } else
> 
> Coding style (brace placement).
> 
> Jan
Jan Beulich May 17, 2017, 12:54 p.m. UTC | #3
>>> On 17.05.17 at 14:40, <luwei.kang@intel.com> wrote:
>> >>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
>> > Currently, hot unplug a cpu with vpmu enabled may cause system hang
>> > due to send IPI to a die physical cpu. This patch add a cpu hot unplug
>> > notifer to save vpmu context before cpu offline.
>> >
>> > Consider one scene, hotplug physical cpu N with vpmu is enabled.
>> 
>> I think you mean "scenario" and "hot unplug".
>> 
>> > The vcpu which running on this physical cpu before will be switch to
>> > other online cpu. Before load the vpmu context to new physical cpu, a
>> > IPI will be send to cpu N to save the vpmu context.
>> > System will hang in function on_select_cpus because of that physical
>> > cpu is offline and can not do any response.
>> 
>> Doesn't this make clear that you would better also make sure
>> ->last_pcpu doesn't hold to the then stale CPU anymore? For
>> example, vpmu_load() compares it with smp_processor_id() (the subsequent use 
> is guarded by a VPMU_CONTEXT_LOADED flag
>> check), allowing badness if the same or another CPU with the same number 
> comes up again quickly enough. Similarly
>> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.
> 
>     I think it may can't make sure  "->last_pcpu" doesn't hold to the then 
> stale CPU. The purpose of this notifier is to save the vpmu context before 
> cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There 
> is no reason to change the value except it saving (vpmu_save()) in another 
> physical cpu.

I'm afraid I don't understand most of your reply.

>     Regarding vpmu_arch_destroy(), it indeed will cause same issue. What 
> about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu 
> pointer of this physical cpu. 

That's being done by vpmu_save_force() already afaict (assuming
you mean this_cpu(last_vcpu)), albeit for whatever reason open
coding this_cpu().

>     In addition, add VPMU_CONTEXT_LOADED check before execute 
> on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) in 
> vpmu_arch_destroy(). Because of force save operation has been finished in 
> notifier function.

I'm not sure whether that would be correct. Boris?

Jan
Boris Ostrovsky May 17, 2017, 1:58 p.m. UTC | #4
On 05/17/2017 08:54 AM, Jan Beulich wrote:
>>>> On 17.05.17 at 14:40, <luwei.kang@intel.com> wrote:
>>>>>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
>>>> Currently, hot unplug a cpu with vpmu enabled may cause system hang
>>>> due to send IPI to a die physical cpu. This patch add a cpu hot unplug
>>>> notifer to save vpmu context before cpu offline.
>>>>
>>>> Consider one scene, hotplug physical cpu N with vpmu is enabled.
>>> I think you mean "scenario" and "hot unplug".
>>>
>>>> The vcpu which running on this physical cpu before will be switch to
>>>> other online cpu. Before load the vpmu context to new physical cpu, a
>>>> IPI will be send to cpu N to save the vpmu context.
>>>> System will hang in function on_select_cpus because of that physical
>>>> cpu is offline and can not do any response.
>>> Doesn't this make clear that you would better also make sure
>>> ->last_pcpu doesn't hold to the then stale CPU anymore? For
>>> example, vpmu_load() compares it with smp_processor_id() (the subsequent use 
>> is guarded by a VPMU_CONTEXT_LOADED flag
>>> check), allowing badness if the same or another CPU with the same number 
>> comes up again quickly enough. Similarly
>>> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.
>>     I think it may can't make sure  "->last_pcpu" doesn't hold to the then 
>> stale CPU. The purpose of this notifier is to save the vpmu context before 
>> cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There 
>> is no reason to change the value except it saving (vpmu_save()) in another 
>> physical cpu.
> I'm afraid I don't understand most of your reply.
>
>>     Regarding vpmu_arch_destroy(), it indeed will cause same issue. What 
>> about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu 
>> pointer of this physical cpu. 
> That's being done by vpmu_save_force() already afaict (assuming
> you mean this_cpu(last_vcpu)), albeit for whatever reason open
> coding this_cpu().
>
>>     In addition, add VPMU_CONTEXT_LOADED check before execute 
>> on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) in 
>> vpmu_arch_destroy(). Because of force save operation has been finished in 
>> notifier function.
> I'm not sure whether that would be correct. Boris?


I believe we still have a race with vpmu_load(): it can be past
VPMU_CONTEXT_LOADED test and committed to the remote call when the
remote VCPU becomes offlined.

Taking vpmu_lock in vpmu_load() and cpu_callback() (which IMO should be
called vpmu_cpu_callback() or some such) may be one solution, although
holding a lock across a remote call is not optimal, obviously.

And I think the same argument is applicable to vpmu_arch_destroy().

-boris
Jan Beulich May 17, 2017, 2:09 p.m. UTC | #5
>>> On 17.05.17 at 15:58, <boris.ostrovsky@oracle.com> wrote:
> On 05/17/2017 08:54 AM, Jan Beulich wrote:
>>>>> On 17.05.17 at 14:40, <luwei.kang@intel.com> wrote:
>>>>>>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
>>>>> Currently, hot unplug a cpu with vpmu enabled may cause system hang
>>>>> due to send IPI to a die physical cpu. This patch add a cpu hot unplug
>>>>> notifer to save vpmu context before cpu offline.
>>>>>
>>>>> Consider one scene, hotplug physical cpu N with vpmu is enabled.
>>>> I think you mean "scenario" and "hot unplug".
>>>>
>>>>> The vcpu which running on this physical cpu before will be switch to
>>>>> other online cpu. Before load the vpmu context to new physical cpu, a
>>>>> IPI will be send to cpu N to save the vpmu context.
>>>>> System will hang in function on_select_cpus because of that physical
>>>>> cpu is offline and can not do any response.
>>>> Doesn't this make clear that you would better also make sure
>>>> ->last_pcpu doesn't hold to the then stale CPU anymore? For
>>>> example, vpmu_load() compares it with smp_processor_id() (the subsequent use 
> 
>>> is guarded by a VPMU_CONTEXT_LOADED flag
>>>> check), allowing badness if the same or another CPU with the same number 
>>> comes up again quickly enough. Similarly
>>>> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.
>>>     I think it may can't make sure  "->last_pcpu" doesn't hold to the then 
>>> stale CPU. The purpose of this notifier is to save the vpmu context before 
>>> cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There 
>>> is no reason to change the value except it saving (vpmu_save()) in another 
>>> physical cpu.
>> I'm afraid I don't understand most of your reply.
>>
>>>     Regarding vpmu_arch_destroy(), it indeed will cause same issue. What 
>>> about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu 
>>> pointer of this physical cpu. 
>> That's being done by vpmu_save_force() already afaict (assuming
>> you mean this_cpu(last_vcpu)), albeit for whatever reason open
>> coding this_cpu().
>>
>>>     In addition, add VPMU_CONTEXT_LOADED check before execute 
>>> on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) in 
>>> vpmu_arch_destroy(). Because of force save operation has been finished in 
>>> notifier function.
>> I'm not sure whether that would be correct. Boris?
> 
> 
> I believe we still have a race with vpmu_load(): it can be past
> VPMU_CONTEXT_LOADED test and committed to the remote call when the
> remote VCPU becomes offlined.

The offlined entity is a pCPU, and such offlining happens in stop-
machine context iirc.

Jan

> Taking vpmu_lock in vpmu_load() and cpu_callback() (which IMO should be
> called vpmu_cpu_callback() or some such) may be one solution, although
> holding a lock across a remote call is not optimal, obviously.
> 
> And I think the same argument is applicable to vpmu_arch_destroy().
> 
> -boris
Boris Ostrovsky May 17, 2017, 2:32 p.m. UTC | #6
On 05/17/2017 10:09 AM, Jan Beulich wrote:
>>>> On 17.05.17 at 15:58, <boris.ostrovsky@oracle.com> wrote:
>> On 05/17/2017 08:54 AM, Jan Beulich wrote:
>>>>>> On 17.05.17 at 14:40, <luwei.kang@intel.com> wrote:
>>>>>>>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
>>>>>> Currently, hot unplug a cpu with vpmu enabled may cause system hang
>>>>>> due to send IPI to a die physical cpu. This patch add a cpu hot unplug
>>>>>> notifer to save vpmu context before cpu offline.
>>>>>>
>>>>>> Consider one scene, hotplug physical cpu N with vpmu is enabled.
>>>>> I think you mean "scenario" and "hot unplug".
>>>>>
>>>>>> The vcpu which running on this physical cpu before will be switch to
>>>>>> other online cpu. Before load the vpmu context to new physical cpu, a
>>>>>> IPI will be send to cpu N to save the vpmu context.
>>>>>> System will hang in function on_select_cpus because of that physical
>>>>>> cpu is offline and can not do any response.
>>>>> Doesn't this make clear that you would better also make sure
>>>>> ->last_pcpu doesn't hold to the then stale CPU anymore? For
>>>>> example, vpmu_load() compares it with smp_processor_id() (the subsequent use 
>>>> is guarded by a VPMU_CONTEXT_LOADED flag
>>>>> check), allowing badness if the same or another CPU with the same number 
>>>> comes up again quickly enough. Similarly
>>>>> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.
>>>>     I think it may can't make sure  "->last_pcpu" doesn't hold to the then 
>>>> stale CPU. The purpose of this notifier is to save the vpmu context before 
>>>> cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There 
>>>> is no reason to change the value except it saving (vpmu_save()) in another 
>>>> physical cpu.
>>> I'm afraid I don't understand most of your reply.
>>>
>>>>     Regarding vpmu_arch_destroy(), it indeed will cause same issue. What 
>>>> about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu 
>>>> pointer of this physical cpu. 
>>> That's being done by vpmu_save_force() already afaict (assuming
>>> you mean this_cpu(last_vcpu)), albeit for whatever reason open
>>> coding this_cpu().
>>>
>>>>     In addition, add VPMU_CONTEXT_LOADED check before execute 
>>>> on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) in 
>>>> vpmu_arch_destroy(). Because of force save operation has been finished in 
>>>> notifier function.
>>> I'm not sure whether that would be correct. Boris?
>>
>> I believe we still have a race with vpmu_load(): it can be past
>> VPMU_CONTEXT_LOADED test and committed to the remote call when the
>> remote VCPU becomes offlined.
> The offlined entity is a pCPU, and such offlining happens in stop-
> machine context iirc.

Oh, then I think this should work --- remote calls are predicated on
VPMU_CONTEXT_LOADED being set and the callback will clear it.

Nevertheless, I'd still make sure that last_pcpu doesn't point to an
offlined processor.

-boris

>
> Jan
>
>> Taking vpmu_lock in vpmu_load() and cpu_callback() (which IMO should be
>> called vpmu_cpu_callback() or some such) may be one solution, although
>> holding a lock across a remote call is not optimal, obviously.
>>
>> And I think the same argument is applicable to vpmu_arch_destroy().
>>
>> -boris
>
>
Jan Beulich May 17, 2017, 2:45 p.m. UTC | #7
>>> On 17.05.17 at 16:32, <boris.ostrovsky@oracle.com> wrote:
> On 05/17/2017 10:09 AM, Jan Beulich wrote:
>>>>> On 17.05.17 at 15:58, <boris.ostrovsky@oracle.com> wrote:
>>> On 05/17/2017 08:54 AM, Jan Beulich wrote:
>>>>>>> On 17.05.17 at 14:40, <luwei.kang@intel.com> wrote:
>>>>>>>>> On 16.05.17 at 19:29, <luwei.kang@intel.com> wrote:
>>>>>>> Currently, hot unplug a cpu with vpmu enabled may cause system hang
>>>>>>> due to send IPI to a die physical cpu. This patch add a cpu hot unplug
>>>>>>> notifer to save vpmu context before cpu offline.
>>>>>>>
>>>>>>> Consider one scene, hotplug physical cpu N with vpmu is enabled.
>>>>>> I think you mean "scenario" and "hot unplug".
>>>>>>
>>>>>>> The vcpu which running on this physical cpu before will be switch to
>>>>>>> other online cpu. Before load the vpmu context to new physical cpu, a
>>>>>>> IPI will be send to cpu N to save the vpmu context.
>>>>>>> System will hang in function on_select_cpus because of that physical
>>>>>>> cpu is offline and can not do any response.
>>>>>> Doesn't this make clear that you would better also make sure
>>>>>> ->last_pcpu doesn't hold to the then stale CPU anymore? For
>>>>>> example, vpmu_load() compares it with smp_processor_id() (the subsequent use 
> 
>>>>> is guarded by a VPMU_CONTEXT_LOADED flag
>>>>>> check), allowing badness if the same or another CPU with the same number 
>>>>> comes up again quickly enough. Similarly
>>>>>> vpmu_arch_destroy() uses it without checking VPMU_CONTEXT_LOADED.
>>>>>     I think it may can't make sure  "->last_pcpu" doesn't hold to the then 
>>>>> stale CPU. The purpose of this notifier is to save the vpmu context before 
>>>>> cpu offline. Avoid save vpmu context by send IPI to that offline cpu. There 
>>>>> is no reason to change the value except it saving (vpmu_save()) in another 
>>>>> physical cpu.
>>>> I'm afraid I don't understand most of your reply.
>>>>
>>>>>     Regarding vpmu_arch_destroy(), it indeed will cause same issue. What 
>>>>> about add " this_cpu(cpu) = NULL" in cpu_callback() to clean the last_vcpu 
>>>>> pointer of this physical cpu. 
>>>> That's being done by vpmu_save_force() already afaict (assuming
>>>> you mean this_cpu(last_vcpu)), albeit for whatever reason open
>>>> coding this_cpu().
>>>>
>>>>>     In addition, add VPMU_CONTEXT_LOADED check before execute 
>>>>> on_selected_cpus(cpumask_of(vcpu_vpmu(v)->last_pcpu), vpmu_save_force, v, 1) 
> in 
>>>>> vpmu_arch_destroy(). Because of force save operation has been finished in 
>>>>> notifier function.
>>>> I'm not sure whether that would be correct. Boris?
>>>
>>> I believe we still have a race with vpmu_load(): it can be past
>>> VPMU_CONTEXT_LOADED test and committed to the remote call when the
>>> remote VCPU becomes offlined.
>> The offlined entity is a pCPU, and such offlining happens in stop-
>> machine context iirc.
> 
> Oh, then I think this should work --- remote calls are predicated on
> VPMU_CONTEXT_LOADED being set and the callback will clear it.
> 
> Nevertheless, I'd still make sure that last_pcpu doesn't point to an
> offlined processor.

Well, we can't reasonably use some random online one; my
intention rather was to write some visibly invalid value there,
so that comparing with any possibly online CPU would always
produce false.

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 03401fd..89e0ea8 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -35,6 +35,7 @@ 
 #include <asm/apic.h>
 #include <public/pmu.h>
 #include <xsm/xsm.h>
+#include <xen/cpu.h>
 
 #include <compat/pmu.h>
 CHECK_pmu_cntr_pair;
@@ -835,6 +836,36 @@  long do_xenpmu_op(unsigned int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
     return ret;
 }
 
+static int cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+    unsigned int cpu = (unsigned long)hcpu;
+    struct vcpu *vcpu = per_cpu(last_vcpu, cpu);
+    struct vpmu_struct *vpmu;
+
+    if ( !vcpu )
+        return NOTIFY_DONE;
+
+    vpmu = vcpu_vpmu(vcpu);
+    if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
+        return NOTIFY_DONE;
+
+    switch ( action )
+    {
+    case CPU_DYING:
+        vpmu_save_force(vcpu);
+        vpmu_reset(vpmu, VPMU_CONTEXT_LOADED);
+        break;
+    default:
+        break;
+    }
+
+    return NOTIFY_DONE;
+}
+
+static struct notifier_block vpmu_cpu_nfb = {
+    .notifier_call = cpu_callback
+};
+
 static int __init vpmu_init(void)
 {
     int vendor = current_cpu_data.x86_vendor;
@@ -871,10 +902,11 @@  static int __init vpmu_init(void)
         break;
     }
 
-    if ( vpmu_mode != XENPMU_MODE_OFF )
+    if ( vpmu_mode != XENPMU_MODE_OFF ) {
+        register_cpu_notifier(&vpmu_cpu_nfb);
         printk(XENLOG_INFO "VPMU: version " __stringify(XENPMU_VER_MAJ) "."
                __stringify(XENPMU_VER_MIN) "\n");
-    else
+    } else
         opt_vpmu_enabled = 0;
 
     return 0;