Message ID | 20170524062433.20680-1-nick.desaulniers@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
2017-05-23 23:24-0700, Nick Desaulniers: > Fixes the warning: > > arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in > function > 'em_fxrstor' [-Wframe-larger-than=] > static int em_fxrstor(struct x86_emulate_ctxt *ctxt) > ^ > > Found with CONFIG_FRAME_WARN set to 1024. > > Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> > --- > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > @@ -4017,30 +4017,38 @@ static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt, > > static int em_fxrstor(struct x86_emulate_ctxt *ctxt) > { > - struct fxregs_state fx_state; > + struct fxregs_state *fx_state; > int rc; > > rc = check_fxsr(ctxt); > if (rc != X86EMUL_CONTINUE) > return rc; > > - rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512); > + fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL); > + fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL); fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so this needs manual correction. Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so we again have only one storage type. > + if (!fx_state) > + return -ENOMEM; The caller does not understand -ENOMEM. The appropriate return value is X86EMUL_UNHANDLEABLE. > if (ctxt->mode < X86EMUL_MODE_PROT64) > - rc = fxrstor_fixup(ctxt, &fx_state); > + rc = fxrstor_fixup(ctxt, fx_state); Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte fxregs_state on the stack ... noinline attribute should solve the warning too. Thanks.
On Wed, May 24, 2017 at 04:19:57PM +0200, Radim Krčmář wrote: > 2017-05-23 23:24-0700, Nick Desaulniers: > > + fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL); > > + fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL); > > fx_state must be 16 byte aligned and x86 ARCH_KMALLOC_MINALIGN is 8, so > this needs manual correction. Radim, thanks for the code review. I appreciate it! I'm fairly new to kernel patching, but I'd like to resolve this with your guidance. Looking through the available functions in slab.h, it seems the only function that allows specifying alignment is kmem_cache_create. Is that the simplest solution? If so, then it seems like a `struct kmem_cache *` should be added as a member to `struct x86_emulate_ctxt`? I'm not sure yet where the context gets initialized and destroyed. Do you have any insight? Also, is there a #define'd value to use for the 16B alignment that I should be using? > Also, please kmalloc also fxregs_state in fxrstor_fixup and em_fxsave so > we again have only one storage type. Then it should be easy to call kmem_cache_alloc() at these sites. > > + if (!fx_state) > > + return -ENOMEM; > > The caller does not understand -ENOMEM. The appropriate return value is > X86EMUL_UNHANDLEABLE. Ack. > > if (ctxt->mode < X86EMUL_MODE_PROT64) > > - rc = fxrstor_fixup(ctxt, &fx_state); > > + rc = fxrstor_fixup(ctxt, fx_state); > > Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte > fxregs_state on the stack ... noinline attribute should solve the > warning too. While that would change fewer lines, doesn't the problem still exist in the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two stack frames? As in shouldn't we still dynamically allocate fx_state?
On 25/05/2017 03:36, Nick Desaulniers wrote: >>> if (ctxt->mode < X86EMUL_MODE_PROT64) >>> - rc = fxrstor_fixup(ctxt, &fx_state); >>> + rc = fxrstor_fixup(ctxt, fx_state); >> Ah, fxrstor_fixup most likely got inlined and both of them put ~512 byte >> fxregs_state on the stack ... noinline attribute should solve the >> warning too. > While that would change fewer lines, doesn't the problem still exist in > the case of `ctxt->mode < X86EMUL_MODE_PROT64`, just split across two > stack frames? As in shouldn't we still dynamically allocate fx_state? I think we should do the fixup backwards. That is: - first do get_fpu - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do fxsave into &fxstate. - then do segmented_read_std with the correct size, which is - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416 if ctxt->mode == X86EMUL_MODE_PROT64 - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288 if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1 - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160 if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0 - then check fx_state.mxcsr - then do fxrstor - finally do put_fpu This will remove one of the two fxregs_state structs. Paolo
On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote: > I think we should do the fixup backwards. > > That is: > > - first do get_fpu > > - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do > fxsave into &fxstate. > > - then do segmented_read_std with the correct size, which is > - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416 > if ctxt->mode == X86EMUL_MODE_PROT64 > - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288 > if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1 > - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160 > if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0 but we still want to do a segmented_read_std with size 512 otherwise, correct? > - then check fx_state.mxcsr > > - then do fxrstor This sounds like we conditionally do the fxsave, but then always do the fxrstor. Is that ok? I guess the original code kind of does that as well. > - finally do put_fpu Sounds straight forward. I can see how fxsave and CR4.OSFXSR are accessed in fxstor_fixup. Is it ok to skip those memcpy's that would otherwise occur when calling fxrstor_fixup() (which after these changes, we would not be)?
On 26/05/2017 06:13, Nick Desaulniers wrote: > On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote: >> I think we should do the fixup backwards. >> >> That is: >> >> - first do get_fpu >> >> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do >> fxsave into &fxstate. >> >> - then do segmented_read_std with the correct size, which is >> - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416 >> if ctxt->mode == X86EMUL_MODE_PROT64 >> - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288 >> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1 >> - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160 >> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0 > > but we still want to do a segmented_read_std with size 512 otherwise, > correct? No, 512 is never necessary. ctxt->mode is never > X86EMUL_MODE_PROT64 (see the definition of enum x86emul_mode in arch/x86/include/asm/kvm_emulate.h. >> - then check fx_state.mxcsr >> >> - then do fxrstor > > This sounds like we conditionally do the fxsave, but then always do the > fxrstor. Is that ok? I guess the original code kind of does that as > well. Correct. They idea is that fxrstor on Linux always accesses 416 bytes, but we may have to restore only the first part for accurate emulation. The fxsave retrieves the current state so that we can leave it unmodified when we do fxrstor. >> - finally do put_fpu > > Sounds straight forward. I can see how fxsave and CR4.OSFXSR are > accessed in fxstor_fixup. Is it ok to skip those memcpy's that would > otherwise occur when calling fxrstor_fixup() (which after these changes, > we would not be)? Yes, the memcpys are replaced with a shorter segmented_read_std. Thanks, Paolo
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 0816ab2e8adc..1d7c9ceeff56 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -4017,30 +4017,38 @@ static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt, static int em_fxrstor(struct x86_emulate_ctxt *ctxt) { - struct fxregs_state fx_state; + struct fxregs_state *fx_state; int rc; rc = check_fxsr(ctxt); if (rc != X86EMUL_CONTINUE) return rc; - rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512); + fx_state = kmalloc(sizeof(*fx_state), GFP_KERNEL); + if (!fx_state) + return -ENOMEM; + + rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, fx_state, 512); if (rc != X86EMUL_CONTINUE) - return rc; + goto out; - if (fx_state.mxcsr >> 16) - return emulate_gp(ctxt, 0); + if (fx_state->mxcsr >> 16) { + rc = emulate_gp(ctxt, 0); + goto out; + } ctxt->ops->get_fpu(ctxt); if (ctxt->mode < X86EMUL_MODE_PROT64) - rc = fxrstor_fixup(ctxt, &fx_state); + rc = fxrstor_fixup(ctxt, fx_state); if (rc == X86EMUL_CONTINUE) - rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state)); + rc = asm_safe("fxrstor %[fx]", : [fx] "m"(*fx_state)); ctxt->ops->put_fpu(ctxt); +out: + kfree(fx_state); return rc; }
Fixes the warning: arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes in function 'em_fxrstor' [-Wframe-larger-than=] static int em_fxrstor(struct x86_emulate_ctxt *ctxt) ^ Found with CONFIG_FRAME_WARN set to 1024. Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> --- arch/x86/kvm/emulate.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)