Message ID | 1582570669-45822-1-git-send-email-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [GIT,PULL] KVM changes for Linux 5.6-rc4 | expand |
The pull request you sent on Mon, 24 Feb 2020 19:57:49 +0100:
> https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/63623fd44972d1ed2bfb6e0fb631dfcf547fd1e7
Thank you!
Paolo Bonzini <pbonzini@redhat.com> writes:
> KVM: nVMX: Don't emulate instructions in guest mode
I just discovered that this patch breaks Hyper-V on KVM completely;
Oliver's 86f7e90ce8 ("KVM: VMX: check descriptor table exits on
instruction emulation") doesn't fix it either. The breakage manifests
itself as
qemu-system-x86-23579 [005] 22018.775584: kvm_exit: reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0
qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit: rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0
qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault: address febd0000 error_code 181
qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn: 0:fffff802987d6169: f3 a5
qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn: 0:fffff802987d6169: f3 a5 FAIL
qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception: #UD (0x0)
We probably need to re-enable instruction emulation for something...
On 02/03/20 19:40, Vitaly Kuznetsov wrote: > > qemu-system-x86-23579 [005] 22018.775584: kvm_exit: reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0 > qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit: rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0 > qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault: address febd0000 error_code 181 > qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn: 0:fffff802987d6169: f3 a5 > qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn: 0:fffff802987d6169: f3 a5 FAIL > qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception: #UD (0x0) > > We probably need to re-enable instruction emulation for something... This is a rep movsw instruction, it shouldn't be intercepted. I think we have a stale ctxt->intercept because the /* Fields above regs are cleared together. */ comment is not true anymore since commit c44b4c6ab80eef3a9c52c7b3f0c632942e6489aa Author: Bandan Das <bsd@redhat.com> Date: Wed Apr 16 12:46:12 2014 -0400 KVM: emulate: clean up initializations in init_decode_cache A lot of initializations are unnecessary as they get set to appropriate values before actually being used. Optimize placement of fields in x86_emulate_ctxt Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Paolo
Paolo Bonzini <pbonzini@redhat.com> writes: > On 02/03/20 19:40, Vitaly Kuznetsov wrote: >> >> qemu-system-x86-23579 [005] 22018.775584: kvm_exit: reason EPT_VIOLATION rip 0xfffff802987d6169 info 181 0 >> qemu-system-x86-23579 [005] 22018.775584: kvm_nested_vmexit: rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0 >> qemu-system-x86-23579 [005] 22018.775585: kvm_page_fault: address febd0000 error_code 181 >> qemu-system-x86-23579 [005] 22018.775592: kvm_emulate_insn: 0:fffff802987d6169: f3 a5 >> qemu-system-x86-23579 [005] 22018.775593: kvm_emulate_insn: 0:fffff802987d6169: f3 a5 FAIL >> qemu-system-x86-23579 [005] 22018.775596: kvm_inj_exception: #UD (0x0) >> >> We probably need to re-enable instruction emulation for something... > > This is a rep movsw instruction, it shouldn't be intercepted. I think > we have a stale ctxt->intercept because the > > /* Fields above regs are cleared together. */ > > comment is not true anymore since > > commit c44b4c6ab80eef3a9c52c7b3f0c632942e6489aa > Author: Bandan Das <bsd@redhat.com> > Date: Wed Apr 16 12:46:12 2014 -0400 > > KVM: emulate: clean up initializations in init_decode_cache > Right you are, a big hammer like diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index 2a8f2bd..52c9bce 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -324,14 +324,6 @@ struct x86_emulate_ctxt { */ /* current opcode length in bytes */ - u8 opcode_len; - u8 b; - u8 intercept; - u8 op_bytes; - u8 ad_bytes; - struct operand src; - struct operand src2; - struct operand dst; union { int (*execute)(struct x86_emulate_ctxt *ctxt); fastop_t fop; @@ -343,6 +335,14 @@ struct x86_emulate_ctxt { * or elsewhere */ bool rip_relative; + u8 opcode_len; + u8 b; + u8 intercept; + u8 op_bytes; + u8 ad_bytes; + struct operand src; + struct operand src2; + struct operand dst; u8 rex_prefix; u8 lock_prefix; u8 rep_prefix; seems to make the issue go away. (For those wondering why fielf shuffling makes a difference: init_decode_cache() clears [rip_relative, modrm) range) How did this even work before... (I'm still looking at the code, stay tuned...)
On 03/03/20 14:02, Vitaly Kuznetsov wrote: > Right you are, > > a big hammer like > > diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h > index 2a8f2bd..52c9bce 100644 > --- a/arch/x86/include/asm/kvm_emulate.h > +++ b/arch/x86/include/asm/kvm_emulate.h > @@ -324,14 +324,6 @@ struct x86_emulate_ctxt { > */ > > /* current opcode length in bytes */ > - u8 opcode_len; > - u8 b; > - u8 intercept; > - u8 op_bytes; > - u8 ad_bytes; > - struct operand src; > - struct operand src2; > - struct operand dst; > union { > int (*execute)(struct x86_emulate_ctxt *ctxt); > fastop_t fop; > @@ -343,6 +335,14 @@ struct x86_emulate_ctxt { > * or elsewhere > */ > bool rip_relative; > + u8 opcode_len; > + u8 b; > + u8 intercept; > + u8 op_bytes; > + u8 ad_bytes; > + struct operand src; > + struct operand src2; > + struct operand dst; > u8 rex_prefix; > u8 lock_prefix; > u8 rep_prefix; > > seems to make the issue go away. (For those wondering why fielf > shuffling makes a difference: init_decode_cache() clears > [rip_relative, modrm) range) How did this even work before... > (I'm still looking at the code, stay tuned...) On AMD, probably because all these instructions were normally trapped by L1. Of these, however, most need not be zeroed again. op_bytes, ad_bytes, opcode_len and b are initialized by x86_decode_insn, and dst/src/src2 also by decode_operand. So only intercept is affected, adding "ctxt->intercept = x86_intercept_none" should be enough. Paolo
Paolo Bonzini <pbonzini@redhat.com> writes: > On 03/03/20 14:02, Vitaly Kuznetsov wrote: >> Right you are, >> >> a big hammer like >> >> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h >> index 2a8f2bd..52c9bce 100644 >> --- a/arch/x86/include/asm/kvm_emulate.h >> +++ b/arch/x86/include/asm/kvm_emulate.h >> @@ -324,14 +324,6 @@ struct x86_emulate_ctxt { >> */ >> >> /* current opcode length in bytes */ >> - u8 opcode_len; >> - u8 b; >> - u8 intercept; >> - u8 op_bytes; >> - u8 ad_bytes; >> - struct operand src; >> - struct operand src2; >> - struct operand dst; >> union { >> int (*execute)(struct x86_emulate_ctxt *ctxt); >> fastop_t fop; >> @@ -343,6 +335,14 @@ struct x86_emulate_ctxt { >> * or elsewhere >> */ >> bool rip_relative; >> + u8 opcode_len; >> + u8 b; >> + u8 intercept; >> + u8 op_bytes; >> + u8 ad_bytes; >> + struct operand src; >> + struct operand src2; >> + struct operand dst; >> u8 rex_prefix; >> u8 lock_prefix; >> u8 rep_prefix; >> >> seems to make the issue go away. (For those wondering why fielf >> shuffling makes a difference: init_decode_cache() clears >> [rip_relative, modrm) range) How did this even work before... >> (I'm still looking at the code, stay tuned...) > > On AMD, probably because all these instructions were normally trapped by L1. > > Of these, however, most need not be zeroed again. op_bytes, ad_bytes, > opcode_len and b are initialized by x86_decode_insn, and dst/src/src2 > also by decode_operand. So only intercept is affected, adding > "ctxt->intercept = x86_intercept_none" should be enough. This matches my findings, thank you! Patch[es] are coming.