Message ID | 20191014162247.61461-1-xiaoyao.li@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: X86: Make fpu allocation a common function | expand |
Xiaoyao Li <xiaoyao.li@intel.com> writes: > They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX > and SVM. Make them common functions. > > No functional change intended. Would it rather make sense to move this code to kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > --- > arch/x86/kvm/svm.c | 20 +++----------------- > arch/x86/kvm/vmx/vmx.c | 20 +++----------------- > arch/x86/kvm/x86.h | 26 ++++++++++++++++++++++++++ > 3 files changed, 32 insertions(+), 34 deletions(-) > > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c > index e479ea9bc9da..0116a3c37a07 100644 > --- a/arch/x86/kvm/svm.c > +++ b/arch/x86/kvm/svm.c > @@ -2156,21 +2156,9 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) > goto out; > } > > - svm->vcpu.arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, > - GFP_KERNEL_ACCOUNT); > - if (!svm->vcpu.arch.user_fpu) { > - printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); > - err = -ENOMEM; > + err = kvm_vcpu_create_fpu(&svm->vcpu); > + if (err) > goto free_partial_svm; > - } > - > - svm->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, > - GFP_KERNEL_ACCOUNT); > - if (!svm->vcpu.arch.guest_fpu) { > - printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); > - err = -ENOMEM; > - goto free_user_fpu; > - } > > err = kvm_vcpu_init(&svm->vcpu, kvm, id); > if (err) > @@ -2231,9 +2219,7 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) > uninit: > kvm_vcpu_uninit(&svm->vcpu); > free_svm: > - kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.guest_fpu); > -free_user_fpu: > - kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.user_fpu); > + kvm_vcpu_free_fpu(&svm->vcpu); > free_partial_svm: > kmem_cache_free(kvm_vcpu_cache, svm); > out: > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index e660e28e9ae0..53d9298ff648 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -6710,21 
+6710,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) > if (!vmx) > return ERR_PTR(-ENOMEM); > > - vmx->vcpu.arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, > - GFP_KERNEL_ACCOUNT); > - if (!vmx->vcpu.arch.user_fpu) { > - printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); > - err = -ENOMEM; > + err = kvm_vcpu_create_fpu(&vmx->vcpu); > + if (err) > goto free_partial_vcpu; > - } > - > - vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, > - GFP_KERNEL_ACCOUNT); > - if (!vmx->vcpu.arch.guest_fpu) { > - printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); > - err = -ENOMEM; > - goto free_user_fpu; > - } > > vmx->vpid = allocate_vpid(); > > @@ -6825,9 +6813,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) > kvm_vcpu_uninit(&vmx->vcpu); > free_vcpu: > free_vpid(vmx->vpid); > - kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu); > -free_user_fpu: > - kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.user_fpu); > + kvm_vcpu_free_fpu(&vmx->vcpu); > free_partial_vcpu: > kmem_cache_free(kvm_vcpu_cache, vmx); > return ERR_PTR(err); > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h > index 45d82b8277e5..c27e7ac91337 100644 > --- a/arch/x86/kvm/x86.h > +++ b/arch/x86/kvm/x86.h > @@ -367,4 +367,30 @@ static inline bool kvm_pat_valid(u64 data) > void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu); > void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu); > > +static inline int kvm_vcpu_create_fpu(struct kvm_vcpu *vcpu) > +{ > + vcpu->arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, > + GFP_KERNEL_ACCOUNT); > + if (!vcpu->arch.user_fpu) { > + printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); > + return -ENOMEM; > + } > + > + vcpu->arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, > + GFP_KERNEL_ACCOUNT); > + if (!vcpu->arch.guest_fpu) { > + printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); > + kmem_cache_free(x86_fpu_cache, vcpu->arch.user_fpu); > + return -ENOMEM; > + } > + 
> + return 0; > +} > + > +static inline void kvm_vcpu_free_fpu(struct kvm_vcpu *vcpu) > +{ > + kmem_cache_free(x86_fpu_cache, vcpu->arch.guest_fpu); > + kmem_cache_free(x86_fpu_cache, vcpu->arch.user_fpu); > +} > + > #endif
On Mon, Oct 14, 2019 at 06:58:49PM +0200, Vitaly Kuznetsov wrote: > Xiaoyao Li <xiaoyao.li@intel.com> writes: > > > They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX > > and SVM. Make them common functions. > > > > No functional change intended. > > Would it rather make sense to move this code to > kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? Does it make sense? Yes. Would it actually work? No. Well, not without other shenanigans. FPU allocation can't be placed after the call to .create_vcpu() because it's consumed in kvm_arch_vcpu_init(). FPU allocation can't come before .create_vcpu() because the vCPU struct itself hasn't been allocated. The latter could be solved by passing the FPU pointer into .create_vcpu(), but that's a bit ugly and is not a precedent we want to set. At a glance, FPU allocation can be moved to kvm_arch_vcpu_init(), maybe right before the call to fx_init().
On 10/15/2019 2:37 AM, Sean Christopherson wrote: > On Mon, Oct 14, 2019 at 06:58:49PM +0200, Vitaly Kuznetsov wrote: >> Xiaoyao Li <xiaoyao.li@intel.com> writes: >> >>> They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX >>> and SVM. Make them common functions. >>> >>> No functional change intended. >> >> Would it rather make sense to move this code to >> kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? > > Does it make sense? Yes. Would it actually work? No. Well, not without > other shenanigans. > > FPU allocation can't be placed after the call to .create_vcpu() becuase > it's consumed in kvm_arch_vcpu_init(). FPU allocation can't come before > .create_vcpu() because the vCPU struct itself hasn't been allocated. The > latter could be solved by passed the FPU pointer into .create_vcpu(), but > that's a bit ugly and is not a precedent we want to set. > That's exactly what I found. > At a glance, FPU allocation can be moved to kvm_arch_vcpu_init(), maybe > right before the call to fx_init(). > Yeah, putting here is better. I'm wondering the semantic of create, init, setup. There are vcpu_{create,init,setup}, and IIUC, vcpu_create is mainly for data structure allocation and vcpu_{init,setup} should be for data structure initialization/setup (and maybe they could/should merge into one) But I feel the current codes for vcpu creation a bit messed, especially of vmx.
On 14/10/19 18:58, Vitaly Kuznetsov wrote: > Xiaoyao Li <xiaoyao.li@intel.com> writes: > >> They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX >> and SVM. Make them common functions. >> >> No functional change intended. > Would it rather make sense to move this code to > kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? > user_fpu could be made percpu too... That would save a bit of memory for each vCPU. I'm holding on Xiaoyao's patch because a lot of the code he's touching would go away then. Paolo
Sean Christopherson <sean.j.christopherson@intel.com> writes: > On Mon, Oct 14, 2019 at 06:58:49PM +0200, Vitaly Kuznetsov wrote: >> Xiaoyao Li <xiaoyao.li@intel.com> writes: >> >> > They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX >> > and SVM. Make them common functions. >> > >> > No functional change intended. >> >> Would it rather make sense to move this code to >> kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? > > Does it make sense? Yes. Would it actually work? No. Well, not without > other shenanigans. > > FPU allocation can't be placed after the call to .create_vcpu() becuase > it's consumed in kvm_arch_vcpu_init(). FPU allocation can't come before > .create_vcpu() because the vCPU struct itself hasn't been allocated. A very theoretical question: why do we have 'struct vcpu' embedded in vcpu_vmx/vcpu_svm and not the other way around (e.g. in a union)? That would've allowed us to allocate memory in common code and then fill in vendor-specific details in .create_vcpu().
On 15/10/19 12:53, Vitaly Kuznetsov wrote: > A very theoretical question: why do we have 'struct vcpu' embedded in > vcpu_vmx/vcpu_svm and not the other way around (e.g. in a union)? That > would've allowed us to allocate memory in common code and then fill in > vendor-specific details in .create_vcpu(). Probably "because it's always been like that" is the most accurate answer. Paolo
Paolo Bonzini <pbonzini@redhat.com> writes: > On 15/10/19 12:53, Vitaly Kuznetsov wrote: >> A very theoretical question: why do we have 'struct vcpu' embedded in >> vcpu_vmx/vcpu_svm and not the other way around (e.g. in a union)? That >> would've allowed us to allocate memory in common code and then fill in >> vendor-specific details in .create_vcpu(). > > Probably "because it's always been like that" is the most accurate answer. > OK, so let me make my question a bit less theoretical: would you be in favor of changing the status quo? :-)
On Tue, Oct 15, 2019 at 04:36:57PM +0200, Vitaly Kuznetsov wrote: > Paolo Bonzini <pbonzini@redhat.com> writes: > > > On 15/10/19 12:53, Vitaly Kuznetsov wrote: > >> A very theoretical question: why do we have 'struct vcpu' embedded in > >> vcpu_vmx/vcpu_svm and not the other way around (e.g. in a union)? That > >> would've allowed us to allocate memory in common code and then fill in > >> vendor-specific details in .create_vcpu(). A union would waste a non-trivial amount of memory on SVM. SVM: struct size = 14560 VMX: struct size = 16192 There are ways around that, but... > > > > Probably "because it's always been like that" is the most accurate answer. > > > > OK, so let me make my question a bit less theoretical: would you be in > favor of changing the status quo? :-) ... we don't need to invert the struct embedding to re-order the create flow. 'struct kvm_vcpu' must be at offset zero and the size of the vcpu is vendor defined, so kvm_arch_vcpu_create() can allocate the struct and directly cast it to a 'struct kvm_vcpu *'.
On 15/10/19 16:36, Vitaly Kuznetsov wrote: >> On 15/10/19 12:53, Vitaly Kuznetsov wrote: >>> A very theoretical question: why do we have 'struct vcpu' embedded in >>> vcpu_vmx/vcpu_svm and not the other way around (e.g. in a union)? That >>> would've allowed us to allocate memory in common code and then fill in >>> vendor-specific details in .create_vcpu(). >> Probably "because it's always been like that" is the most accurate answer. >> > OK, so let me make my question a bit less theoretical: would you be in > favor of changing the status quo? :-) Not really, it would be a lot of churn for debatable benefit. Paolo
On 10/15/2019 5:28 PM, Paolo Bonzini wrote: > On 14/10/19 18:58, Vitaly Kuznetsov wrote: >> Xiaoyao Li <xiaoyao.li@intel.com> writes: >> >>> They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX >>> and SVM. Make them common functions. >>> >>> No functional change intended. >> Would it rather make sense to move this code to >> kvm_arch_vcpu_create()/kvm_arch_vcpu_destroy() instead? >> > > user_fpu could be made percpu too... That would save a bit of memory > for each vCPU. I'm holding on Xiaoyao's patch because a lot of the code > he's touching would go away then. Sorry, I'm not clear on your position. Do you mean that the generic common function is not good enough, and that I should implement the percpu solution instead? > Paolo >
On 16/10/19 03:52, Xiaoyao Li wrote: >> >> user_fpu could be made percpu too... That would save a bit of memory >> for each vCPU. I'm holding on Xiaoyao's patch because a lot of the code >> he's touching would go away then. > > Sorry, I don't get clear your attitude. > Do you mean the generic common function is not so better that I'd better > to implement the percpu solution? I wanted some time to give further thought to the percpu user_fpu idea. But kvm_load_guest_fpu and kvm_put_guest_fpu are not part of vcpu_load, so it would not be so easy. I'll just apply your patch now. Paolo
On 10/16/2019 3:35 PM, Paolo Bonzini wrote: > On 16/10/19 03:52, Xiaoyao Li wrote: >>> >>> user_fpu could be made percpu too... That would save a bit of memory >>> for each vCPU. I'm holding on Xiaoyao's patch because a lot of the code >>> he's touching would go away then. >> >> Sorry, I don't get clear your attitude. >> Do you mean the generic common function is not so better that I'd better >> to implement the percpu solution? > > I wanted some time to give further thought to the percpu user_fpu idea. > But kvm_load_guest_fpu and kvm_put_guest_fpu are not part of vcpu_load, > so it would not be so easy. I'll just apply your patch now. Got it, thanks. BTW, could you have a look at the series I sent yesterday to refactor the vcpu creation flow, which is inspired partly by this issue. Any comment and suggestion is welcomed since I don't want to waste time on wrong direction.
On 16/10/19 09:48, Xiaoyao Li wrote: > BTW, could you have a look at the series I sent yesterday to refactor > the vcpu creation flow, which is inspired partly by this issue. Any > comment and suggestion is welcomed since I don't want to waste time on > wrong direction. Yes, that's the series from which I'll take your patch. Paolo
On Wed, Oct 16, 2019 at 11:41:05AM +0200, Paolo Bonzini wrote: > On 16/10/19 09:48, Xiaoyao Li wrote: > > BTW, could you have a look at the series I sent yesterday to refactor > > the vcpu creation flow, which is inspired partly by this issue. Any > > comment and suggestion is welcomed since I don't want to waste time on > > wrong direction. > > Yes, that's the series from which I'll take your patch. Can you hold off on taking that patch? I'm pretty sure we can do more cleanup in that area, with less code.
On 17/10/19 18:05, Sean Christopherson wrote: > On Wed, Oct 16, 2019 at 11:41:05AM +0200, Paolo Bonzini wrote: >> On 16/10/19 09:48, Xiaoyao Li wrote: >>> BTW, could you have a look at the series I sent yesterday to refactor >>> the vcpu creation flow, which is inspired partly by this issue. Any >>> comment and suggestion is welcomed since I don't want to waste time on >>> wrong direction. >> >> Yes, that's the series from which I'll take your patch. > > Can you hold off on taking that patch? I'm pretty sure we can do more > cleanup in that area, with less code. > Should I hold off on the whole "Refactor vcpu creation flow of x86 arch" series then? Paolo
On 10/21/2019 9:09 PM, Paolo Bonzini wrote: > On 17/10/19 18:05, Sean Christopherson wrote: >> On Wed, Oct 16, 2019 at 11:41:05AM +0200, Paolo Bonzini wrote: >>> On 16/10/19 09:48, Xiaoyao Li wrote: >>>> BTW, could you have a look at the series I sent yesterday to refactor >>>> the vcpu creation flow, which is inspired partly by this issue. Any >>>> comment and suggestion is welcomed since I don't want to waste time on >>>> wrong direction. >>> >>> Yes, that's the series from which I'll take your patch. >> >> Can you hold off on taking that patch? I'm pretty sure we can do more >> cleanup in that area, with less code. >> > > Should I hold off on the whole "Refactor vcpu creation flow of x86 arch" > series then? > Yes, please just leave them aside. If you could, please take a look at my "v3 Minor cleanup and refactor about vmcs" Thanks, -Xiaoyao
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index e479ea9bc9da..0116a3c37a07 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2156,21 +2156,9 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) goto out; } - svm->vcpu.arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, - GFP_KERNEL_ACCOUNT); - if (!svm->vcpu.arch.user_fpu) { - printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); - err = -ENOMEM; + err = kvm_vcpu_create_fpu(&svm->vcpu); + if (err) goto free_partial_svm; - } - - svm->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, - GFP_KERNEL_ACCOUNT); - if (!svm->vcpu.arch.guest_fpu) { - printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); - err = -ENOMEM; - goto free_user_fpu; - } err = kvm_vcpu_init(&svm->vcpu, kvm, id); if (err) @@ -2231,9 +2219,7 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) uninit: kvm_vcpu_uninit(&svm->vcpu); free_svm: - kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.guest_fpu); -free_user_fpu: - kmem_cache_free(x86_fpu_cache, svm->vcpu.arch.user_fpu); + kvm_vcpu_free_fpu(&svm->vcpu); free_partial_svm: kmem_cache_free(kvm_vcpu_cache, svm); out: diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index e660e28e9ae0..53d9298ff648 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6710,21 +6710,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) if (!vmx) return ERR_PTR(-ENOMEM); - vmx->vcpu.arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, - GFP_KERNEL_ACCOUNT); - if (!vmx->vcpu.arch.user_fpu) { - printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); - err = -ENOMEM; + err = kvm_vcpu_create_fpu(&vmx->vcpu); + if (err) goto free_partial_vcpu; - } - - vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, - GFP_KERNEL_ACCOUNT); - if (!vmx->vcpu.arch.guest_fpu) { - printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); - err = -ENOMEM; - goto free_user_fpu; - } vmx->vpid = 
allocate_vpid(); @@ -6825,9 +6813,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) kvm_vcpu_uninit(&vmx->vcpu); free_vcpu: free_vpid(vmx->vpid); - kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu); -free_user_fpu: - kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.user_fpu); + kvm_vcpu_free_fpu(&vmx->vcpu); free_partial_vcpu: kmem_cache_free(kvm_vcpu_cache, vmx); return ERR_PTR(err); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 45d82b8277e5..c27e7ac91337 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -367,4 +367,30 @@ static inline bool kvm_pat_valid(u64 data) void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu); void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu); +static inline int kvm_vcpu_create_fpu(struct kvm_vcpu *vcpu) +{ + vcpu->arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache, + GFP_KERNEL_ACCOUNT); + if (!vcpu->arch.user_fpu) { + printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n"); + return -ENOMEM; + } + + vcpu->arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache, + GFP_KERNEL_ACCOUNT); + if (!vcpu->arch.guest_fpu) { + printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n"); + kmem_cache_free(x86_fpu_cache, vcpu->arch.user_fpu); + return -ENOMEM; + } + + return 0; +} + +static inline void kvm_vcpu_free_fpu(struct kvm_vcpu *vcpu) +{ + kmem_cache_free(x86_fpu_cache, vcpu->arch.guest_fpu); + kmem_cache_free(x86_fpu_cache, vcpu->arch.user_fpu); +} + #endif
They are duplicated codes to create vcpu.arch.{user,guest}_fpu in VMX and SVM. Make them common functions. No functional change intended. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> --- arch/x86/kvm/svm.c | 20 +++----------------- arch/x86/kvm/vmx/vmx.c | 20 +++----------------- arch/x86/kvm/x86.h | 26 ++++++++++++++++++++++++++ 3 files changed, 32 insertions(+), 34 deletions(-)