Message ID | 1513590316-4702-4-git-send-email-liran.alon@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 18/12/2017 10:45, Liran Alon wrote: > Next commits are going introduce support for accessing VMware backdoor > ports even though guest's TSS I/O permissions bitmap doesn't allow > access. This mimic VMware hypervisor behavior. > > In order to support this, next commits will change VMX/SVM to > intercept #GP which was raised by such access and handle it by calling > the x86 emulator to emulate instruction. Since commit "KVM: x86: > Always allow access to VMware backdoor I/O ports", the x86 emulator > handles access to these I/O ports by not checking these ports against > the TSS I/O permission bitmap. > > It turns out that the x86 emulator is incomplete and therefore > certain instructions that can cause #GP cannot be emulated. > Such an example is "INT x" (opcode 0xcd) which reach emulate_int() > which can only emulate instruction if vCPU is in real-mode. > > In those cases, we would like the #GP intercept to just forward #GP > as-is to guest as if there was no intercept to begin with. > However, current emulator code always queue #UD exception in case > emulator fails which is not what is wanted in this flow. > > This commit address this issue by adding a new emulation_type flag > that will allow the #GP intercept handler to specify it wish to just > be aware of when instruction emulation fails and doesn't want #UD > exception to be queued. > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> > --- > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kvm/x86.c | 12 ++++++++---- > 2 files changed, 9 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index 516798431328..2b7ea1ac4f86 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -1168,6 +1168,7 @@ enum emulation_result { > #define EMULTYPE_SKIP (1 << 2) > #define EMULTYPE_RETRY (1 << 3) > #define EMULTYPE_NO_REEXECUTE (1 << 4) > +#define EMULTYPE_NO_UD_ON_FAIL (1 << 5) > int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2, > int emulation_type, void *insn, int insn_len); > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 5fef09674de1..8fd2d3e1bcd4 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip) > } > EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt); > > -static int handle_emulation_failure(struct kvm_vcpu *vcpu) > +static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type) > { > int r = EMULATE_DONE; > > @@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu) > vcpu->run->internal.ndata = 0; > r = EMULATE_USER_EXIT; > } > - kvm_queue_exception(vcpu, UD_VECTOR); > + > + if (emulation_type & EMULTYPE_NO_UD_ON_FAIL) > + r = EMULATE_FAIL; There seems to be some overlap between EMULTYPE_VMWARE and EMULTYPE_NO_UD_ON_FAIL. Even if you don't specify EMULTYPE_VMWARE, the emulation could fail, but there should have been no writeback and injecting the original #GP exception should be safe. Maybe should EMULTYPE_NO_UD_ON_FAIL return straight away, skipping even the user-mode exit. On the other hand, you may want to have EMULTYPE_VMWARE so as to reduce the attack surface from the emulator (as the emulator would then be very easy to trigger, just by executing an instruction that causes #GP). In that case, however, emulation of the {in,out}{s,} instructions shouldn't fail and you shouldn't need EMULTYPE_NO_UD_ON_FAIL. Paolo > + else > + kvm_queue_exception(vcpu, UD_VECTOR); > > return r; > } > @@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, > return EMULATE_DONE; > if (emulation_type & EMULTYPE_SKIP) > return EMULATE_FAIL; > - return handle_emulation_failure(vcpu); > + return handle_emulation_failure(vcpu, emulation_type); > } > } > > @@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, > emulation_type)) > return EMULATE_DONE; > > - return handle_emulation_failure(vcpu); > + return handle_emulation_failure(vcpu, emulation_type); > } > > if (ctxt->have_exception) { >
On 21/12/17 17:11, Paolo Bonzini wrote: > On 18/12/2017 10:45, Liran Alon wrote: >> Next commits are going introduce support for accessing VMware backdoor >> ports even though guest's TSS I/O permissions bitmap doesn't allow >> access. This mimic VMware hypervisor behavior. >> >> In order to support this, next commits will change VMX/SVM to >> intercept #GP which was raised by such access and handle it by calling >> the x86 emulator to emulate instruction. Since commit "KVM: x86: >> Always allow access to VMware backdoor I/O ports", the x86 emulator >> handles access to these I/O ports by not checking these ports against >> the TSS I/O permission bitmap. >> >> It turns out that the x86 emulator is incomplete and therefore >> certain instructions that can cause #GP cannot be emulated. >> Such an example is "INT x" (opcode 0xcd) which reach emulate_int() >> which can only emulate instruction if vCPU is in real-mode. >> >> In those cases, we would like the #GP intercept to just forward #GP >> as-is to guest as if there was no intercept to begin with. >> However, current emulator code always queue #UD exception in case >> emulator fails which is not what is wanted in this flow. >> >> This commit address this issue by adding a new emulation_type flag >> that will allow the #GP intercept handler to specify it wish to just >> be aware of when instruction emulation fails and doesn't want #UD >> exception to be queued. >> >> Signed-off-by: Liran Alon <liran.alon@oracle.com> >> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> >> --- >> arch/x86/include/asm/kvm_host.h | 1 + >> arch/x86/kvm/x86.c | 12 ++++++++---- >> 2 files changed, 9 insertions(+), 4 deletions(-) >> >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h >> index 516798431328..2b7ea1ac4f86 100644 >> --- a/arch/x86/include/asm/kvm_host.h >> +++ b/arch/x86/include/asm/kvm_host.h >> @@ -1168,6 +1168,7 @@ enum emulation_result { >> #define EMULTYPE_SKIP (1 << 2) >> #define EMULTYPE_RETRY (1 << 3) >> #define EMULTYPE_NO_REEXECUTE (1 << 4) >> +#define EMULTYPE_NO_UD_ON_FAIL (1 << 5) >> int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2, >> int emulation_type, void *insn, int insn_len); >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 5fef09674de1..8fd2d3e1bcd4 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip) >> } >> EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt); >> >> -static int handle_emulation_failure(struct kvm_vcpu *vcpu) >> +static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type) >> { >> int r = EMULATE_DONE; >> >> @@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu) >> vcpu->run->internal.ndata = 0; >> r = EMULATE_USER_EXIT; >> } >> - kvm_queue_exception(vcpu, UD_VECTOR); >> + >> + if (emulation_type & EMULTYPE_NO_UD_ON_FAIL) >> + r = EMULATE_FAIL; > > There seems to be some overlap between EMULTYPE_VMWARE and > EMULTYPE_NO_UD_ON_FAIL. > > Even if you don't specify EMULTYPE_VMWARE, the emulation could fail, but > there should have been no writeback and injecting the original #GP > exception should be safe. What you mention is true in case x86_emulate_instruction() fails on instruction emulation but it could also fail on instruction disassembly. (See more details below). > Maybe should EMULTYPE_NO_UD_ON_FAIL return > straight away, skipping even the user-mode exit. I agree it is possible to check "if (emulation_type & EMULTYPE_NO_UD_ON_FAIL)" immediately after statistics & tracing, as if this flag is set, we will return EMULATE_FAIL anyway (maybe overwriting EMULATE_USER_EXIT). If that's important, I can do such change in a v2 of this patch. > > On the other hand, you may want to have EMULTYPE_VMWARE so as to reduce > the attack surface from the emulator (as the emulator would then be very > easy to trigger, just by executing an instruction that causes #GP). In > that case, however, emulation of the {in,out}{s,} instructions shouldn't > fail and you shouldn't need EMULTYPE_NO_UD_ON_FAIL. Consider the case where the CPU raises a #GP on some instruction which is now intercepted by KVM. The #GP intercept will call x86_emulate_instruction(). If the x86 emulator disassembly engine is incomplete and therefore doesn't know how to parse the instruction which caused the #GP, x86_decode_insn() will fail which will also reach handle_emulation_failure(). If there is no EMULTYPE_NO_UD_ON_FAIL flag, this will cause a #UD exception to be queued which is not what we want. (We would like to preserve behaviour of raising a #GP to guest) Therefore we can summarize these flags usage as follows: 1. EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to disassemble the instruction, I just want you to return failure. Do not queue a #UD and let me decide what should be the proper response". 2. EMULTYPE_VMWARE is indeed used to avoid making all instructions that could raise #GP to reach instruction-emulation as the x86 emulator is incomplete anyway and it just, as you say, increase attack surface. Having said that, I agree the commit messages of the 2 commits introducing these flags may not be indicative enough. If we agree on the written above, I can fix them in v2 of this series. What do you think? Regards, -Liran > > Paolo > > >> + else >> + kvm_queue_exception(vcpu, UD_VECTOR); >> >> return r; >> } >> @@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, >> return EMULATE_DONE; >> if (emulation_type & EMULTYPE_SKIP) >> return EMULATE_FAIL; >> - return handle_emulation_failure(vcpu); >> + return handle_emulation_failure(vcpu, emulation_type); >> } >> } >> >> @@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, >> emulation_type)) >> return EMULATE_DONE; >> >> - return handle_emulation_failure(vcpu); >> + return handle_emulation_failure(vcpu, emulation_type); >> } >> >> if (ctxt->have_exception) { >> >
On 22/12/2017 02:11, Liran Alon wrote: > Consider the case where the CPU raises a #GP on some instruction > which is now intercepted by KVM. The #GP intercept will call > x86_emulate_instruction(). If the x86 emulator disassembly engine is > incomplete and therefore doesn't know how to parse the instruction > which caused the #GP, x86_decode_insn() will fail which will also > reach handle_emulation_failure(). If there is no > EMULTYPE_NO_UD_ON_FAIL flag, this will cause a #UD exception to be > queued which is not what we want. Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it? So in this case you shouldn't fail the decoding. > Therefore we can summarize these flags usage as follows: 1. > EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to > disassemble the instruction, I just want you to return failure. Do > not queue a #UD and let me decide what should be the proper > response". > > 2. EMULTYPE_VMWARE is indeed used to avoid making all > instructions that could raise #GP to reach instruction-emulation as > the x86 emulator is incomplete anyway and it just, as you say, > increase attack surface. > > Having said that, I agree the commit messages of the 2 commits > introducing these flags may not be indicative enough. If we agree on > the written above, I can fix them in v2 of this series. Yeah, that's good. In particular it's important to note that EMULTYPE_VMWARE is not for correctness, only for hardening. Thanks, Paolo
On 22/12/17 17:16, Paolo Bonzini wrote: > On 22/12/2017 02:11, Liran Alon wrote: >> Consider the case where the CPU raises a #GP on some instruction >> which is now intercepted by KVM. The #GP intercept will call >> x86_emulate_instruction(). If the x86 emulator disassembly engine is >> incomplete and therefore doesn't know how to parse the instruction >> which caused the #GP, x86_decode_insn() will fail which will also >> reach handle_emulation_failure(). If there is no >> EMULTYPE_NO_UD_ON_FAIL flag, this will cause a #UD exception to be >> queued which is not what we want. > > Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it? So in > this case you shouldn't fail the decoding. In my current implementation EMULTYPE_VMWARE is considered only after the disassembly engine (x86_decode_insn()) has succeeded. It is true I could have filtered the opcodes before invoking the disassembly engine but that will make code a bit more complex. In addition, I didn't saw a lot of value in reducing the attack surface from the disassembly engine itself. Only from the emulation. Therefore, I decided to make the EMULTYPE_NO_UD_ON_FAIL flag which may be also useful in the future for other use cases. Regards, -Liran > >> Therefore we can summarize these flags usage as follows: 1. >> EMULTYPE_NO_UD_ON_FAIL is used to tell emulator "if you fail to >> disassemble the instruction, I just want you to return failure. Do >> not queue a #UD and let me decide what should be the proper >> response". >> >> 2. EMULTYPE_VMWARE is indeed used to avoid making all >> instructions that could raise #GP to reach instruction-emulation as >> the x86 emulator is incomplete anyway and it just, as you say, >> increase attack surface. >> >> Having said that, I agree the commit messages of the 2 commits >> introducing these flags may not be indicative enough. If we agree on >> the written above, I can fix them in v2 of this series. > > Yeah, that's good. In particular it's important to note that > EMULTYPE_VMWARE is not for correctness, only for hardening. > > Thanks, > > Paolo >
On 22/12/2017 16:53, Liran Alon wrote: >> >> Yup, however EMULTYPE_VMWARE has filtered the opcodes, hasn't it? So in >> this case you shouldn't fail the decoding. > > In my current implementation EMULTYPE_VMWARE is considered only after > the disassembly engine (x86_decode_insn()) has succeeded. It is true I > could have filtered the opcodes before invoking the disassembly engine > but that will make code a bit more complex. In addition, I didn't saw a > lot of value in reducing the attack surface from the disassembly engine > itself. Only from the emulation. We've had oopses from incorrect decoding, so it can be useful. You're right that the RDPMC case has a two-byte opcode and therefore it makes decoding a bit more complex. I'd say, let's keep it simple and start with EMULTYPE_NO_UD_ON_FAIL. Thanks, Paolo > Therefore, I decided to make the EMULTYPE_NO_UD_ON_FAIL flag which may > be also useful in the future for other use cases.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 516798431328..2b7ea1ac4f86 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1168,6 +1168,7 @@ enum emulation_result { #define EMULTYPE_SKIP (1 << 2) #define EMULTYPE_RETRY (1 << 3) #define EMULTYPE_NO_REEXECUTE (1 << 4) +#define EMULTYPE_NO_UD_ON_FAIL (1 << 5) int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2, int emulation_type, void *insn, int insn_len); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5fef09674de1..8fd2d3e1bcd4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5425,7 +5425,7 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip) } EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt); -static int handle_emulation_failure(struct kvm_vcpu *vcpu) +static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type) { int r = EMULATE_DONE; @@ -5437,7 +5437,11 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu) vcpu->run->internal.ndata = 0; r = EMULATE_USER_EXIT; } - kvm_queue_exception(vcpu, UD_VECTOR); + + if (emulation_type & EMULTYPE_NO_UD_ON_FAIL) + r = EMULATE_FAIL; + else + kvm_queue_exception(vcpu, UD_VECTOR); return r; } @@ -5731,7 +5735,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, return EMULATE_DONE; if (emulation_type & EMULTYPE_SKIP) return EMULATE_FAIL; - return handle_emulation_failure(vcpu); + return handle_emulation_failure(vcpu, emulation_type); } } @@ -5766,7 +5770,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, emulation_type)) return EMULATE_DONE; - return handle_emulation_failure(vcpu); + return handle_emulation_failure(vcpu, emulation_type); } if (ctxt->have_exception) {