Message ID | 20191126171715.10881-1-pdurrant@amazon.com
---|---
State | Superseded
Series | xen/x86: vpmu: Unmap per-vCPU PMU page when the domain is destroyed
On 26.11.2019 18:17, Paul Durrant wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> A guest will setup a shared page with the hypervisor for each vCPU via
> XENPMU_init. The page will then get mapped in the hypervisor and only
> released when XENPMU_finish is called.
>
> This means that if the guest is not shutdown gracefully (such as via xl
> destroy), the page will stay mapped in the hypervisor.

Isn't this still too weak a description? It's not the tool stack
invoking XENPMU_finish, but the guest itself afaics. I.e. a
misbehaving guest could prevent proper cleanup even with graceful
shutdown.

> @@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
>      if ( is_hvm_domain(d) )
>          hvm_domain_relinquish_resources(d);
>
> +    for_each_vcpu ( d, v )
> +        vpmu_destroy(v);
> +
>      return 0;
>  }

I think simple things which may allow shrinking the page lists
should be done early in the function. As vpmu_destroy() looks
to be idempotent, how about leveraging the very first
for_each_vcpu() loop in the function (there are too many of them
there anyway, at least for my taste)?

Jan
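For context on the interface being discussed: the per-vCPU page is handed to Xen by the guest itself via XENPMU_init and is only released when the guest issues XENPMU_finish. The sketch below is not part of the thread; it illustrates the guest-kernel side of that lifecycle. The struct and op names follow the public pmu interface, while the helpers (HYPERVISOR_xenpmu_op(), get_zeroed_page(), virt_to_gfn(), and the guest_pmu_* function names) are assumptions modeled on a Linux-style PV guest rather than code quoted from this discussion.

```c
/*
 * Illustrative sketch only: how a PV guest kernel might drive the
 * XENPMU_init / XENPMU_finish lifecycle discussed above. Helper names
 * are assumed Linux-style wrappers, not code from this patch.
 */
#include <linux/errno.h>
#include <linux/gfp.h>
#include <xen/interface/xenpmu.h>   /* struct xen_pmu_params, XENPMU_* */
#include <asm/xen/hypercall.h>      /* HYPERVISOR_xenpmu_op() */
#include <xen/page.h>               /* virt_to_gfn() */

static void *xenpmu_page;           /* in reality one such page per vCPU */

static int guest_pmu_init(unsigned int vcpu)
{
    struct xen_pmu_params xp = {
        .version.maj = XENPMU_VER_MAJ,
        .version.min = XENPMU_VER_MIN,
        .vcpu = vcpu,
    };

    xenpmu_page = (void *)get_zeroed_page(GFP_KERNEL);
    if (!xenpmu_page)
        return -ENOMEM;

    /* Tell Xen which frame to map; Xen keeps a global mapping of it. */
    xp.val = virt_to_gfn(xenpmu_page);

    return HYPERVISOR_xenpmu_op(XENPMU_init, &xp);
}

static void guest_pmu_finish(unsigned int vcpu)
{
    struct xen_pmu_params xp = {
        .version.maj = XENPMU_VER_MAJ,
        .version.min = XENPMU_VER_MIN,
        .vcpu = vcpu,
    };

    /*
     * If this point is never reached (e.g. the domain is forcibly
     * destroyed), the hypervisor-side mapping stays in place; that is
     * the leak the patch below addresses in domain_relinquish_resources().
     */
    HYPERVISOR_xenpmu_op(XENPMU_finish, &xp);
    free_page((unsigned long)xenpmu_page);
    xenpmu_page = NULL;
}
```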
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 27 November 2019 09:44
> To: Durrant, Paul <pdurrant@amazon.com>; Grall, Julien <jgrall@amazon.com>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei
> Liu <wl@xen.org>
> Subject: Re: [PATCH] xen/x86: vpmu: Unmap per-vCPU PMU page when the
> domain is destroyed
>
> On 26.11.2019 18:17, Paul Durrant wrote:
> > From: Julien Grall <jgrall@amazon.com>
> >
> > A guest will setup a shared page with the hypervisor for each vCPU via
> > XENPMU_init. The page will then get mapped in the hypervisor and only
> > released when XENPMU_finish is called.
> >
> > This means that if the guest is not shutdown gracefully (such as via xl
> > destroy), the page will stay mapped in the hypervisor.
>
> Isn't this still too weak a description? It's not the tool stack
> invoking XENPMU_finish, but the guest itself afaics. I.e. a
> misbehaving guest could prevent proper cleanup even with graceful
> shutdown.
>

Ok, how about 'if the guest fails to invoke XENPMU_finish, e.g. if
it is destroyed, rather than cleanly shut down'?

> > @@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
> >      if ( is_hvm_domain(d) )
> >          hvm_domain_relinquish_resources(d);
> >
> > +    for_each_vcpu ( d, v )
> > +        vpmu_destroy(v);
> > +
> >      return 0;
> >  }
>
> I think simple things which may allow shrinking the page lists
> should be done early in the function. As vpmu_destroy() looks
> to be idempotent, how about leveraging the very first
> for_each_vcpu() loop in the function (there are too many of them
> there anyway, at least for my taste)?
>

Ok. I did wonder where in the sequence was best... Leaving it to the end
obviously puts it closer to where it was previously called, but I can't
see any harm in moving it earlier.

  Paul

> Jan
On 27.11.2019 11:57, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 27 November 2019 09:44
>> To: Durrant, Paul <pdurrant@amazon.com>; Grall, Julien <jgrall@amazon.com>
>> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
>> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei
>> Liu <wl@xen.org>
>> Subject: Re: [PATCH] xen/x86: vpmu: Unmap per-vCPU PMU page when the
>> domain is destroyed
>>
>> On 26.11.2019 18:17, Paul Durrant wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> A guest will setup a shared page with the hypervisor for each vCPU via
>>> XENPMU_init. The page will then get mapped in the hypervisor and only
>>> released when XENPMU_finish is called.
>>>
>>> This means that if the guest is not shutdown gracefully (such as via xl
>>> destroy), the page will stay mapped in the hypervisor.
>>
>> Isn't this still too weak a description? It's not the tool stack
>> invoking XENPMU_finish, but the guest itself afaics. I.e. a
>> misbehaving guest could prevent proper cleanup even with graceful
>> shutdown.
>>
>
> Ok, how about 'if the guest fails to invoke XENPMU_finish, e.g. if
> it is destroyed, rather than cleanly shut down'?

Sounds good.

Jan
Hi,

On 27/11/2019 09:44, Jan Beulich wrote:
> On 26.11.2019 18:17, Paul Durrant wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> A guest will setup a shared page with the hypervisor for each vCPU via
>> XENPMU_init. The page will then get mapped in the hypervisor and only
>> released when XENPMU_finish is called.
>>
>> This means that if the guest is not shutdown gracefully (such as via xl
>> destroy), the page will stay mapped in the hypervisor.
>
> Isn't this still too weak a description? It's not the tool stack
> invoking XENPMU_finish, but the guest itself afaics. I.e. a
> misbehaving guest could prevent proper cleanup even with graceful
> shutdown.
>
>> @@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
>>      if ( is_hvm_domain(d) )
>>          hvm_domain_relinquish_resources(d);
>>
>> +    for_each_vcpu ( d, v )
>> +        vpmu_destroy(v);
>> +
>>      return 0;
>>  }
>
> I think simple things which may allow shrinking the page lists
> should be done early in the function. As vpmu_destroy() looks
> to be idempotent, how about leveraging the very first
> for_each_vcpu() loop in the function (there are too many of them
> there anyway, at least for my taste)?

It is not entirely obvious that vpmu_destroy() is idempotent.

For instance, I can't find out who is clearing VPMU_CONTEXT_ALLOCATED,
so I think vpmu_arch_destroy() would be executed over and over.

I don't know whether this is an issue, but I can't figure out that it is
not one. Did I miss anything?

Cheers,
> -----Original Message-----
> From: Julien Grall <jgrall@amazon.com>
> Sent: 27 November 2019 11:15
> To: Jan Beulich <jbeulich@suse.com>; Durrant, Paul <pdurrant@amazon.com>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> <andrew.cooper3@citrix.com>; Roger Pau Monné <roger.pau@citrix.com>; Wei
> Liu <wl@xen.org>
> Subject: Re: [PATCH] xen/x86: vpmu: Unmap per-vCPU PMU page when the
> domain is destroyed
>
> Hi,
>
> On 27/11/2019 09:44, Jan Beulich wrote:
> > On 26.11.2019 18:17, Paul Durrant wrote:
> >> From: Julien Grall <jgrall@amazon.com>
> >>
> >> A guest will setup a shared page with the hypervisor for each vCPU via
> >> XENPMU_init. The page will then get mapped in the hypervisor and only
> >> released when XENPMU_finish is called.
> >>
> >> This means that if the guest is not shutdown gracefully (such as via xl
> >> destroy), the page will stay mapped in the hypervisor.
> >
> > Isn't this still too weak a description? It's not the tool stack
> > invoking XENPMU_finish, but the guest itself afaics. I.e. a
> > misbehaving guest could prevent proper cleanup even with graceful
> > shutdown.
> >
> >> @@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
> >>      if ( is_hvm_domain(d) )
> >>          hvm_domain_relinquish_resources(d);
> >>
> >> +    for_each_vcpu ( d, v )
> >> +        vpmu_destroy(v);
> >> +
> >>      return 0;
> >>  }
> >
> > I think simple things which may allow shrinking the page lists
> > should be done early in the function. As vpmu_destroy() looks
> > to be idempotent, how about leveraging the very first
> > for_each_vcpu() loop in the function (there are too many of them
> > there anyway, at least for my taste)?
>
> It is not entirely obvious that vpmu_destroy() is idempotent.
>
> For instance, I can't find out who is clearing VPMU_CONTEXT_ALLOCATED,
> so I think vpmu_arch_destroy() would be executed over and over.
>
> I don't know whether this is an issue, but I can't figure out that it is
> not one. Did I miss anything?

It's sufficiently unobvious that it is a concern whether a guest invoking
XENPMU_finish multiple times can cause harm. I'll see if I can clean that
up.

  Paul

>
> Cheers,
On 27.11.2019 12:14, Julien Grall wrote:
> Hi,
>
> On 27/11/2019 09:44, Jan Beulich wrote:
>> On 26.11.2019 18:17, Paul Durrant wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> A guest will setup a shared page with the hypervisor for each vCPU via
>>> XENPMU_init. The page will then get mapped in the hypervisor and only
>>> released when XENPMU_finish is called.
>>>
>>> This means that if the guest is not shutdown gracefully (such as via xl
>>> destroy), the page will stay mapped in the hypervisor.
>>
>> Isn't this still too weak a description? It's not the tool stack
>> invoking XENPMU_finish, but the guest itself afaics. I.e. a
>> misbehaving guest could prevent proper cleanup even with graceful
>> shutdown.
>>
>>> @@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
>>>      if ( is_hvm_domain(d) )
>>>          hvm_domain_relinquish_resources(d);
>>>
>>> +    for_each_vcpu ( d, v )
>>> +        vpmu_destroy(v);
>>> +
>>>      return 0;
>>>  }
>>
>> I think simple things which may allow shrinking the page lists
>> should be done early in the function. As vpmu_destroy() looks
>> to be idempotent, how about leveraging the very first
>> for_each_vcpu() loop in the function (there are too many of them
>> there anyway, at least for my taste)?
>
> It is not entirely obvious that vpmu_destroy() is idempotent.
>
> For instance, I can't find out who is clearing VPMU_CONTEXT_ALLOCATED,
> so I think vpmu_arch_destroy() would be executed over and over.
>
> I don't know whether this is an issue, but I can't figure out that it is
> not one. Did I miss anything?

If the function wasn't idempotent, then calling it unconditionally from
domain_relinquish_resources() would be wrong too. After all the guest may
have invoked XENPMU_finish.

As to VPMU_CONTEXT_ALLOCATED - I don't think this ever gets cleared
anywhere. But this by itself doesn't make the function non-idempotent.
The vpmu_clear_last invocation, for example, will happen just once. And
{amd,core2}_vpmu_destroy() look okay at first glance, too.

Jan
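The idempotency question above hinges on the shape of vpmu_cleanup() in the patch below: the shared-page pointer is "stolen" and set to NULL under vpmu_lock, so a second call skips the unmap/put entirely, and only vpmu_arch_destroy() has to tolerate being re-run. The following is a stripped-down, self-contained illustration of that pattern using plain pthreads; the names are invented for the example and it is not Xen code.

```c
/*
 * Self-contained illustration of an idempotent teardown: only the call
 * that successfully "steals" the resource pointer releases it.
 */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static struct {
    pthread_mutex_t lock;
    void *shared;              /* stands in for vpmu->xenpmu_data */
} ctx = { .lock = PTHREAD_MUTEX_INITIALIZER };

static void cleanup(void)
{
    void *stolen;

    pthread_mutex_lock(&ctx.lock);
    stolen = ctx.shared;       /* steal the reference... */
    ctx.shared = NULL;         /* ...and mark it gone while holding the lock */
    pthread_mutex_unlock(&ctx.lock);

    if ( stolen )              /* second and later calls see NULL */
        free(stolen);
}

int main(void)
{
    ctx.shared = malloc(64);

    cleanup();                 /* releases the resource */
    cleanup();                 /* harmless no-op */
    assert(ctx.shared == NULL);
    return 0;
}
```

The same reasoning is why pvpmu_finish() followed by domain destruction (or repeated XENPMU_finish calls) does not double-free in the patch: whoever clears xenpmu_data first is the one that unmaps and drops the page reference.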
```diff
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index f397183ec3..9ae4ed48c8 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -578,9 +578,32 @@ static void vpmu_arch_destroy(struct vcpu *v)
     }
 }
 
-void vpmu_destroy(struct vcpu *v)
+static void vpmu_cleanup(struct vcpu *v)
 {
+    struct vpmu_struct *vpmu = vcpu_vpmu(v);
+    mfn_t mfn;
+    void *xenpmu_data;
+
+    spin_lock(&vpmu->vpmu_lock);
+
     vpmu_arch_destroy(v);
+    xenpmu_data = vpmu->xenpmu_data;
+    vpmu->xenpmu_data = NULL;
+
+    spin_unlock(&vpmu->vpmu_lock);
+
+    if ( xenpmu_data )
+    {
+        mfn = domain_page_map_to_mfn(xenpmu_data);
+        ASSERT(mfn_valid(mfn));
+        unmap_domain_page_global(xenpmu_data);
+        put_page_and_type(mfn_to_page(mfn));
+    }
+}
+
+void vpmu_destroy(struct vcpu *v)
+{
+    vpmu_cleanup(v);
     put_vpmu(v);
 }
 
@@ -639,9 +662,6 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t *params)
 static void pvpmu_finish(struct domain *d, xen_pmu_params_t *params)
 {
     struct vcpu *v;
-    struct vpmu_struct *vpmu;
-    mfn_t mfn;
-    void *xenpmu_data;
 
     if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )
         return;
@@ -650,22 +670,7 @@ static void pvpmu_finish(struct domain *d, xen_pmu_params_t *params)
     if ( v != current )
         vcpu_pause(v);
 
-    vpmu = vcpu_vpmu(v);
-    spin_lock(&vpmu->vpmu_lock);
-
-    vpmu_arch_destroy(v);
-    xenpmu_data = vpmu->xenpmu_data;
-    vpmu->xenpmu_data = NULL;
-
-    spin_unlock(&vpmu->vpmu_lock);
-
-    if ( xenpmu_data )
-    {
-        mfn = domain_page_map_to_mfn(xenpmu_data);
-        ASSERT(mfn_valid(mfn));
-        unmap_domain_page_global(xenpmu_data);
-        put_page_and_type(mfn_to_page(mfn));
-    }
+    vpmu_cleanup(v);
 
     if ( v != current )
         vcpu_unpause(v);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index f1dd86e12e..1d75b2e6c3 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -454,9 +454,6 @@ void arch_vcpu_destroy(struct vcpu *v)
     xfree(v->arch.msrs);
     v->arch.msrs = NULL;
 
-    if ( !is_idle_domain(v->domain) )
-        vpmu_destroy(v);
-
     if ( is_hvm_vcpu(v) )
         hvm_vcpu_destroy(v);
     else
@@ -2224,6 +2221,9 @@ int domain_relinquish_resources(struct domain *d)
     if ( is_hvm_domain(d) )
         hvm_domain_relinquish_resources(d);
 
+    for_each_vcpu ( d, v )
+        vpmu_destroy(v);
+
     return 0;
 }
```
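Following Jan's placement suggestion, the relinquish-time cleanup could instead be folded into the existing per-vCPU loop near the top of domain_relinquish_resources() rather than adding a new loop at the end. The fragment below is only a sketch of that idea; it assumes the first loop is the one calling vcpu_destroy_pagetables(), omits the surrounding continuation handling, and is not part of the posted patch.

```c
    /* Hypothetical placement per the review suggestion (not from the patch):
     * piggy-back on the loop that already walks every vCPU early on. */
    for_each_vcpu ( d, v )
    {
        ret = vcpu_destroy_pagetables(v);
        if ( ret )
            return ret;

        /* Safe even if the guest already issued XENPMU_finish, since the
         * cleanup path is intended to be idempotent. */
        vpmu_destroy(v);
    }
```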