KVM: x86: dynamically allocate large struct in em_fxrstor

Message ID	20170524062433.20680-1-nick.desaulniers@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Nick Desaulniers <nick.desaulniers@gmail.com> Cc: Nick Desaulniers <nick.desaulniers@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar@redhat.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor Date: Tue, 23 May 2017 23:24:33 -0700 Message-Id: <20170524062433.20680-1-nick.desaulniers@gmail.com> To: unlisted-recipients:; (no To-header on input) Sender: kvm-owner@vger.kernel.org Precedence: bulk

Nick Desaulniers May 24, 2017, 6:24 a.m. UTC

Fixes the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in
function
      'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
           ^

Found with CONFIG_FRAME_WARN set to 1024.

Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
---
 arch/x86/kvm/emulate.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

Radim Krčmář May 24, 2017, 2:19 p.m. UTC | #1

2017-05-23 23:24-0700, Nick Desaulniers:
> Fixes the warning:
> 
> arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in
> function
>       'em_fxrstor' [-Wframe-larger-than=]
> static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>            ^
> 
> Found with CONFIG_FRAME_WARN set to 1024.
> 
> Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
> ---
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> @@ -4017,30 +4017,38 @@ static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
>  
>  static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
>  {
> -	struct fxregs_state fx_state;
> +	struct fxregs_state *fx_state;
>  	int rc;
>  
>  	rc = check_fxsr(ctxt);
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;
>  
> -	rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
> +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);

fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so
this needs manual correction.

Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so
we again have only one storage type.

> +	if (!fx_state)
> +		return -ENOMEM;

The caller does not understand -ENOMEM.  The appropriate return value is
X86EMUL_UNHANDLEABLE.

>  	if (ctxt->mode < X86EMUL_MODE_PROT64)
> -		rc = fxrstor_fixup(ctxt, &fx_state);
> +		rc = fxrstor_fixup(ctxt, fx_state);

Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
fxregs_state on the stack ... noinline attribute should solve the
warning too.

Thanks.

Nick Desaulniers May 25, 2017, 1:36 a.m. UTC | #2

On Wed, May 24, 2017 at 04:19:57PM +0200, Radim Krčmář wrote:
> 2017-05-23 23:24-0700, Nick Desaulniers:
> > +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> > +	fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL);
> 
> fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so
> this needs manual correction.

Radim, thanks for the code review. I appreciate it! I'm fairly new to
kernel patching, but I'd like to resolve this with your guidance.

Looking through the available functions in slab.h, it seems the only
function that allows specifying alignment is kmem_cache_create.

Is that the simplest solution?

If so, then it seems like a `struct kmem_cache *` should be added as a
member to `struct x86_emulate_ctxt`? I'm not sure yet where the context
gets initialized and destroyed.  Do you have any insight?

Also, is there a #define'd value to use for the 16B alignment that I
should be using?

> Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so
> we again have only one storage type.

Then it should be easy to call kmem_cache_alloc() at these sites.

> > +	if (!fx_state)
> > +		return -ENOMEM;
> 
> The caller does not understand -ENOMEM.  The appropriate return value is
> X86EMUL_UNHANDLEABLE.

Ack.

> >  	if (ctxt->mode < X86EMUL_MODE_PROT64)
> > -		rc = fxrstor_fixup(ctxt, &fx_state);
> > +		rc = fxrstor_fixup(ctxt, fx_state);
> 
> Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
> fxregs_state on the stack ... noinline attribute should solve the
> warning too.

While that would change fewer lines, doesn't the problem still exist in
the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two
stack frames?  As in shouldn't we still dynamically allocate fx_state?

Paolo Bonzini May 25, 2017, 2:07 p.m. UTC | #3

On 25/05/2017 03:36, Nick Desaulniers wrote:
>>>  	if (ctxt->mode < X86EMUL_MODE_PROT64)
>>> -		rc = fxrstor_fixup(ctxt, &fx_state);
>>> +		rc = fxrstor_fixup(ctxt, fx_state);
>> Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte
>> fxregs_state on the stack ... noinline attribute should solve the
>> warning too.
> While that would change fewer lines, doesn't the problem still exist in
> the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two
> stack frames?  As in shouldn't we still dynamically allocate fx_state?

I think we should do the fixup backwards.

That is:

- first do get_fpu

- if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
fxsave into &fxstate.

- then do segmented_read_std with the correct size, which is
  - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
    if ctxt->mode == X86EMUL_MODE_PROT64
  - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
    if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
  - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
    if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0

- then check fx_state.mxcsr

- then do fxrstor

- finally do put_fpu

This will remove one of the two fxregs_state structs.

Paolo

Nick Desaulniers May 26, 2017, 4:13 a.m. UTC | #4

On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote:
> I think we should do the fixup backwards.
> 
> That is:
> 
> - first do get_fpu
> 
> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
> fxsave into &fxstate.
> 
> - then do segmented_read_std with the correct size, which is
>   - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
>     if ctxt->mode == X86EMUL_MODE_PROT64
>   - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
>   - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0

but we still want to do a segmented_read_std with size 512 otherwise,
correct?

> - then check fx_state.mxcsr
> 
> - then do fxrstor

This sounds like we conditionally do the fxsave, but then always do the
fxrstor.  Is that ok? I guess the original code kind of does that as
well.

> - finally do put_fpu

Sounds straight forward.  I can see how fxsave and CR4.OSFXSR are
accessed in fxstor_fixup.  Is it ok to skip those memcpy's that would
otherwise occur when calling fxrstor_fixup() (which after these changes,
we would not be)?

Paolo Bonzini May 26, 2017, 7:18 a.m. UTC | #5

On 26/05/2017 06:13, Nick Desaulniers wrote:
> On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote:
>> I think we should do the fixup backwards.
>>
>> That is:
>>
>> - first do get_fpu
>>
>> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
>> fxsave into &fxstate.
>>
>> - then do segmented_read_std with the correct size, which is
>>   - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
>>     if ctxt->mode == X86EMUL_MODE_PROT64
>>   - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
>>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
>>   - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
>>     if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0
> 
> but we still want to do a segmented_read_std with size 512 otherwise,
> correct?

No, 512 is never necessary.  ctxt->mode is never > X86EMUL_MODE_PROT64
(see the definition of enum x86emul_mode in
arch/x86/include/asm/kvm_emulate.h.

>> - then check fx_state.mxcsr
>>
>> - then do fxrstor
> 
> This sounds like we conditionally do the fxsave, but then always do the
> fxrstor.  Is that ok? I guess the original code kind of does that as
> well.

Correct.  They idea is that fxrstor on Linux always accesses 416 bytes,
but we may have to restore only the first part for accurate emulation.
The fxsave retrieves the current state so that we can leave it
unmodified when we do fxrstor.

>> - finally do put_fpu
> 
> Sounds straight forward.  I can see how fxsave and CR4.OSFXSR are
> accessed in fxstor_fixup.  Is it ok to skip those memcpy's that would
> otherwise occur when calling fxrstor_fixup() (which after these changes,
> we would not be)?

Yes, the memcpys are replaced with a shorter segmented_read_std.

Thanks,

Paolo

KVM: x86: dynamically allocate large struct in em_fxrstor

Commit Message

Comments

Patch