diff mbox series

[v3] xen/x86: vpmu: Unmap per-vCPU PMU page when the domain is destroyed

Message ID 20191128093828.8462-1-pdurrant@amazon.com (mailing list archive)
State New, archived
Headers show
Series [v3] xen/x86: vpmu: Unmap per-vCPU PMU page when the domain is destroyed | expand

Commit Message

Paul Durrant Nov. 28, 2019, 9:38 a.m. UTC
From: Julien Grall <jgrall@amazon.com>

A guest will setup a shared page with the hypervisor for each vCPU via
XENPMU_init. The page will then get mapped in the hypervisor and only
released when XENPMU_finish is called.

This means that if the guest fails to invoke XENPMU_finish, e.g if it is
destroyed rather than cleanly shut down, the page will stay mapped in the
hypervisor. One of the consequences is the domain can never be fully
destroyed as a page reference is still held.

As Xen should never rely on the guest to correctly clean-up any
allocation in the hypervisor, we should also unmap such pages during the
domain destruction if there are any left.

We can re-use the same logic as in pvpmu_finish(). To avoid
duplication, move the logic in a new function that can also be called
from vpmu_destroy().

NOTE: - The call to vpmu_destroy() must also be moved from
        arch_vcpu_destroy() into domain_relinquish_resources() such that
        the reference on the mapped page does not prevent domain_destroy()
        (which calls arch_vcpu_destroy()) from being called.
      - Whilst it appears that vpmu_arch_destroy() is idempotent it is
        by no means obvious. Hence make sure the VPMU_CONTEXT_ALLOCATED
        flag is cleared at the end of vpmu_arch_destroy().
      - This is not an XSA because vPMU is not security supported (see
        XSA-163).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v2:
 - Re-word commit comment slightly
 - Re-enforce idempotency of vmpu_arch_destroy()
 - Move invocation of vpmu_destroy() earlier in
   domain_relinquish_resources()
---
 xen/arch/x86/cpu/vpmu.c | 47 +++++++++++++++++++++++------------------
 xen/arch/x86/domain.c   | 10 +++++----
 2 files changed, 33 insertions(+), 24 deletions(-)

Comments

Jan Beulich Nov. 28, 2019, 10:23 a.m. UTC | #1
On 28.11.2019 10:38, Paul Durrant wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> A guest will setup a shared page with the hypervisor for each vCPU via
> XENPMU_init. The page will then get mapped in the hypervisor and only
> released when XENPMU_finish is called.
> 
> This means that if the guest fails to invoke XENPMU_finish, e.g if it is
> destroyed rather than cleanly shut down, the page will stay mapped in the
> hypervisor. One of the consequences is the domain can never be fully
> destroyed as a page reference is still held.
> 
> As Xen should never rely on the guest to correctly clean-up any
> allocation in the hypervisor, we should also unmap such pages during the
> domain destruction if there are any left.
> 
> We can re-use the same logic as in pvpmu_finish(). To avoid
> duplication, move the logic in a new function that can also be called
> from vpmu_destroy().
> 
> NOTE: - The call to vpmu_destroy() must also be moved from
>         arch_vcpu_destroy() into domain_relinquish_resources() such that
>         the reference on the mapped page does not prevent domain_destroy()
>         (which calls arch_vcpu_destroy()) from being called.
>       - Whilst it appears that vpmu_arch_destroy() is idempotent it is
>         by no means obvious. Hence make sure the VPMU_CONTEXT_ALLOCATED
>         flag is cleared at the end of vpmu_arch_destroy().
>       - This is not an XSA because vPMU is not security supported (see
>         XSA-163).
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Wei Liu <wl@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v2:
>  - Re-word commit comment slightly
>  - Re-enforce idempotency of vmpu_arch_destroy()
>  - Move invocation of vpmu_destroy() earlier in
>    domain_relinquish_resources()

What about v3?

> --- a/xen/arch/x86/cpu/vpmu.c
> +++ b/xen/arch/x86/cpu/vpmu.c
> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>  
>           vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>      }
> +
> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>  }

Boris, to be on the safe side - are you in agreement with this
change, now that the setting of the flag is being left untouched?

Jan
Paul Durrant Nov. 28, 2019, 10:28 a.m. UTC | #2
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 28 November 2019 10:23
> To: Durrant, Paul <pdurrant@amazon.com>; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>
> Cc: xen-devel@lists.xenproject.org; Grall, Julien <jgrall@amazon.com>;
> Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>; Kevin Tian
> <kevin.tian@intel.com>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH v3] xen/x86: vpmu: Unmap per-vCPU PMU page when the
> domain is destroyed
> 
> On 28.11.2019 10:38, Paul Durrant wrote:
> > From: Julien Grall <jgrall@amazon.com>
> >
> > A guest will setup a shared page with the hypervisor for each vCPU via
> > XENPMU_init. The page will then get mapped in the hypervisor and only
> > released when XENPMU_finish is called.
> >
> > This means that if the guest fails to invoke XENPMU_finish, e.g if it is
> > destroyed rather than cleanly shut down, the page will stay mapped in
> the
> > hypervisor. One of the consequences is the domain can never be fully
> > destroyed as a page reference is still held.
> >
> > As Xen should never rely on the guest to correctly clean-up any
> > allocation in the hypervisor, we should also unmap such pages during the
> > domain destruction if there are any left.
> >
> > We can re-use the same logic as in pvpmu_finish(). To avoid
> > duplication, move the logic in a new function that can also be called
> > from vpmu_destroy().
> >
> > NOTE: - The call to vpmu_destroy() must also be moved from
> >         arch_vcpu_destroy() into domain_relinquish_resources() such that
> >         the reference on the mapped page does not prevent
> domain_destroy()
> >         (which calls arch_vcpu_destroy()) from being called.
> >       - Whilst it appears that vpmu_arch_destroy() is idempotent it is
> >         by no means obvious. Hence make sure the VPMU_CONTEXT_ALLOCATED
> >         flag is cleared at the end of vpmu_arch_destroy().
> >       - This is not an XSA because vPMU is not security supported (see
> >         XSA-163).
> >
> > Signed-off-by: Julien Grall <jgrall@amazon.com>
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: Wei Liu <wl@xen.org>
> > Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> > Cc: Jun Nakajima <jun.nakajima@intel.com>
> > Cc: Kevin Tian <kevin.tian@intel.com>
> >
> > v2:
> >  - Re-word commit comment slightly
> >  - Re-enforce idempotency of vmpu_arch_destroy()
> >  - Move invocation of vpmu_destroy() earlier in
> >    domain_relinquish_resources()
> 
> What about v3?

Oh, sorry:

v3:
 - Add comment regarding XSA-163
 - Revert changes setting VPMU_CONTEXT_ALLOCATED in common code

  Paul

> 
> > --- a/xen/arch/x86/cpu/vpmu.c
> > +++ b/xen/arch/x86/cpu/vpmu.c
> > @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
> >
> >           vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
> >      }
> > +
> > +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
> >  }
> 
> Boris, to be on the safe side - are you in agreement with this
> change, now that the setting of the flag is being left untouched?
> 
> Jan
Boris Ostrovsky Nov. 28, 2019, 9:50 p.m. UTC | #3
On 11/28/19 5:23 AM, Jan Beulich wrote:
> On 28.11.2019 10:38, Paul Durrant wrote:
>
>> --- a/xen/arch/x86/cpu/vpmu.c
>> +++ b/xen/arch/x86/cpu/vpmu.c
>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>  
>>           vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>      }
>> +
>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>  }
> Boris, to be on the safe side - are you in agreement with this
> change, now that the setting of the flag is being left untouched?

Yes, this is fine.

(I probably would clear it in arch_vpmu_destroy op since it is set in
arch-specific code but either way works)

-boris
Jan Beulich Nov. 29, 2019, 8:45 a.m. UTC | #4
On 28.11.2019 10:38, Paul Durrant wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> A guest will setup a shared page with the hypervisor for each vCPU via
> XENPMU_init. The page will then get mapped in the hypervisor and only
> released when XENPMU_finish is called.
> 
> This means that if the guest fails to invoke XENPMU_finish, e.g if it is
> destroyed rather than cleanly shut down, the page will stay mapped in the
> hypervisor. One of the consequences is the domain can never be fully
> destroyed as a page reference is still held.
> 
> As Xen should never rely on the guest to correctly clean-up any
> allocation in the hypervisor, we should also unmap such pages during the
> domain destruction if there are any left.
> 
> We can re-use the same logic as in pvpmu_finish(). To avoid
> duplication, move the logic in a new function that can also be called
> from vpmu_destroy().
> 
> NOTE: - The call to vpmu_destroy() must also be moved from
>         arch_vcpu_destroy() into domain_relinquish_resources() such that
>         the reference on the mapped page does not prevent domain_destroy()
>         (which calls arch_vcpu_destroy()) from being called.
>       - Whilst it appears that vpmu_arch_destroy() is idempotent it is
>         by no means obvious. Hence make sure the VPMU_CONTEXT_ALLOCATED
>         flag is cleared at the end of vpmu_arch_destroy().
>       - This is not an XSA because vPMU is not security supported (see
>         XSA-163).
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Julien Grall Dec. 4, 2019, 3:55 p.m. UTC | #5
Hi Boris,

On 28/11/2019 21:50, Boris Ostrovsky wrote:
> On 11/28/19 5:23 AM, Jan Beulich wrote:
>> On 28.11.2019 10:38, Paul Durrant wrote:
>>
>>> --- a/xen/arch/x86/cpu/vpmu.c
>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>>   
>>>            vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>>       }
>>> +
>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>>   }
>> Boris, to be on the safe side - are you in agreement with this
>> change, now that the setting of the flag is being left untouched?
> 
> Yes, this is fine.

Can we take this as an ACK?

> 
> (I probably would clear it in arch_vpmu_destroy op since it is set in
> arch-specific code but either way works)
> 
> -boris
> 

Cheers,
Boris Ostrovsky Dec. 4, 2019, 3:59 p.m. UTC | #6
> On Dec 4, 2019, at 10:55 AM, Julien Grall <jgrall@amazon.com> wrote:
> 
> Hi Boris,
> 
> On 28/11/2019 21:50, Boris Ostrovsky wrote:
>> On 11/28/19 5:23 AM, Jan Beulich wrote:
>>> On 28.11.2019 10:38, Paul Durrant wrote:
>>> 
>>>> --- a/xen/arch/x86/cpu/vpmu.c
>>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>>>             vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>>>      }
>>>> +
>>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>>>  }
>>> Boris, to be on the safe side - are you in agreement with this
>>> change, now that the setting of the flag is being left untouched?
>> Yes, this is fine.
> 
> Can we take this as an ACK?


Yes, of course.


-boris


> 
>> (I probably would clear it in arch_vpmu_destroy op since it is set in
>> arch-specific code but either way works)
>> -boris
> 
> Cheers,
Julien Grall Dec. 4, 2019, 4:02 p.m. UTC | #7
On 04/12/2019 15:59, Boris Ostrovsky wrote:
> 
> 
>> On Dec 4, 2019, at 10:55 AM, Julien Grall <jgrall@amazon.com> wrote:
>>
>> Hi Boris,
>>
>> On 28/11/2019 21:50, Boris Ostrovsky wrote:
>>> On 11/28/19 5:23 AM, Jan Beulich wrote:
>>>> On 28.11.2019 10:38, Paul Durrant wrote:
>>>>
>>>>> --- a/xen/arch/x86/cpu/vpmu.c
>>>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>>>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>>>>              vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>>>>       }
>>>>> +
>>>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>>>>   }
>>>> Boris, to be on the safe side - are you in agreement with this
>>>> change, now that the setting of the flag is being left untouched?
>>> Yes, this is fine.
>>
>> Can we take this as an ACK?
> 
> 
> Yes, of course.

Thank you!

@Andrew, @Jan: this is x86 code, but I am happy to commit it if you prefer.

Cheers,
Jan Beulich Dec. 4, 2019, 4:08 p.m. UTC | #8
On 04.12.2019 17:02, Julien Grall wrote:
> 
> 
> On 04/12/2019 15:59, Boris Ostrovsky wrote:
>>
>>
>>> On Dec 4, 2019, at 10:55 AM, Julien Grall <jgrall@amazon.com> wrote:
>>>
>>> Hi Boris,
>>>
>>> On 28/11/2019 21:50, Boris Ostrovsky wrote:
>>>> On 11/28/19 5:23 AM, Jan Beulich wrote:
>>>>> On 28.11.2019 10:38, Paul Durrant wrote:
>>>>>
>>>>>> --- a/xen/arch/x86/cpu/vpmu.c
>>>>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>>>>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>>>>>              vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>>>>>       }
>>>>>> +
>>>>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>>>>>   }
>>>>> Boris, to be on the safe side - are you in agreement with this
>>>>> change, now that the setting of the flag is being left untouched?
>>>> Yes, this is fine.
>>>
>>> Can we take this as an ACK?
>>
>>
>> Yes, of course.
> 
> Thank you!
> 
> @Andrew, @Jan: this is x86 code, but I am happy to commit it if you prefer.

It has been committed already.

Jan
Julien Grall Dec. 4, 2019, 4:12 p.m. UTC | #9
On 04/12/2019 16:08, Jan Beulich wrote:
> On 04.12.2019 17:02, Julien Grall wrote:
>>
>>
>> On 04/12/2019 15:59, Boris Ostrovsky wrote:
>>>
>>>
>>>> On Dec 4, 2019, at 10:55 AM, Julien Grall <jgrall@amazon.com> wrote:
>>>>
>>>> Hi Boris,
>>>>
>>>> On 28/11/2019 21:50, Boris Ostrovsky wrote:
>>>>> On 11/28/19 5:23 AM, Jan Beulich wrote:
>>>>>> On 28.11.2019 10:38, Paul Durrant wrote:
>>>>>>
>>>>>>> --- a/xen/arch/x86/cpu/vpmu.c
>>>>>>> +++ b/xen/arch/x86/cpu/vpmu.c
>>>>>>> @@ -576,11 +576,36 @@ static void vpmu_arch_destroy(struct vcpu *v)
>>>>>>>               vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
>>>>>>>        }
>>>>>>> +
>>>>>>> +    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
>>>>>>>    }
>>>>>> Boris, to be on the safe side - are you in agreement with this
>>>>>> change, now that the setting of the flag is being left untouched?
>>>>> Yes, this is fine.
>>>>
>>>> Can we take this as an ACK?
>>>
>>>
>>> Yes, of course.
>>
>> Thank you!
>>
>> @Andrew, @Jan: this is x86 code, but I am happy to commit it if you prefer.
> 
> It has been committed already.

Oh, I didn't spot it in git log. Sorry for the noise.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index f397183ec3..792953e7cb 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -576,11 +576,36 @@  static void vpmu_arch_destroy(struct vcpu *v)
 
          vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
     }
+
+    vpmu_reset(vpmu, VPMU_CONTEXT_ALLOCATED);
 }
 
-void vpmu_destroy(struct vcpu *v)
+static void vpmu_cleanup(struct vcpu *v)
 {
+    struct vpmu_struct *vpmu = vcpu_vpmu(v);
+    mfn_t mfn;
+    void *xenpmu_data;
+
+    spin_lock(&vpmu->vpmu_lock);
+
     vpmu_arch_destroy(v);
+    xenpmu_data = vpmu->xenpmu_data;
+    vpmu->xenpmu_data = NULL;
+
+    spin_unlock(&vpmu->vpmu_lock);
+
+    if ( xenpmu_data )
+    {
+        mfn = domain_page_map_to_mfn(xenpmu_data);
+        ASSERT(mfn_valid(mfn));
+        unmap_domain_page_global(xenpmu_data);
+        put_page_and_type(mfn_to_page(mfn));
+    }
+}
+
+void vpmu_destroy(struct vcpu *v)
+{
+    vpmu_cleanup(v);
 
     put_vpmu(v);
 }
@@ -639,9 +664,6 @@  static int pvpmu_init(struct domain *d, xen_pmu_params_t *params)
 static void pvpmu_finish(struct domain *d, xen_pmu_params_t *params)
 {
     struct vcpu *v;
-    struct vpmu_struct *vpmu;
-    mfn_t mfn;
-    void *xenpmu_data;
 
     if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )
         return;
@@ -650,22 +672,7 @@  static void pvpmu_finish(struct domain *d, xen_pmu_params_t *params)
     if ( v != current )
         vcpu_pause(v);
 
-    vpmu = vcpu_vpmu(v);
-    spin_lock(&vpmu->vpmu_lock);
-
-    vpmu_arch_destroy(v);
-    xenpmu_data = vpmu->xenpmu_data;
-    vpmu->xenpmu_data = NULL;
-
-    spin_unlock(&vpmu->vpmu_lock);
-
-    if ( xenpmu_data )
-    {
-        mfn = domain_page_map_to_mfn(xenpmu_data);
-        ASSERT(mfn_valid(mfn));
-        unmap_domain_page_global(xenpmu_data);
-        put_page_and_type(mfn_to_page(mfn));
-    }
+    vpmu_cleanup(v);
 
     if ( v != current )
         vcpu_unpause(v);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f1dd86e12e..f5c0c378ef 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -454,9 +454,6 @@  void arch_vcpu_destroy(struct vcpu *v)
     xfree(v->arch.msrs);
     v->arch.msrs = NULL;
 
-    if ( !is_idle_domain(v->domain) )
-        vpmu_destroy(v);
-
     if ( is_hvm_vcpu(v) )
         hvm_vcpu_destroy(v);
     else
@@ -2136,12 +2133,17 @@  int domain_relinquish_resources(struct domain *d)
 
     PROGRESS(vcpu_pagetables):
 
-        /* Drop the in-use references to page-table bases. */
+        /*
+         * Drop the in-use references to page-table bases and clean
+         * up vPMU instances.
+         */
         for_each_vcpu ( d, v )
         {
             ret = vcpu_destroy_pagetables(v);
             if ( ret )
                 return ret;
+
+            vpmu_destroy(v);
         }
 
         if ( altp2m_active(d) )