Message ID | 1566911210-30059-4-git-send-email-jan.dakinevich@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
Series | fix emulation error on Windows bootup |
+Cc Peng Hao and Yi Wang

On Tue, Aug 27, 2019 at 01:07:09PM +0000, Jan Dakinevich wrote:
> inject_emulated_exception() returns true if and only if nested page
> fault happens. However, page fault can come from guest page tables
> walk, either nested or not nested. In both cases we should stop an
> attempt to read under RIP and give guest to step over its own page
> fault handler.
>
> Fixes: 6ea6e84 ("KVM: x86: inject exceptions produced by x86_decode_insn")
> Cc: Denis Lunev <den@virtuozzo.com>
> Cc: Roman Kagan <rkagan@virtuozzo.com>
> Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
> Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
> ---
>  arch/x86/kvm/x86.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 93b0bd4..45caa69 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6521,8 +6521,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>  		if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
>  					emulation_type))
>  			return EMULATE_DONE;
> -		if (ctxt->have_exception && inject_emulated_exception(vcpu))
> +		if (ctxt->have_exception) {
> +			inject_emulated_exception(vcpu);
>  			return EMULATE_DONE;
> +		}

Yikes, this patch and the previous have quite the sordid history.

The non-void return from inject_emulated_exception() was added by commit

  ef54bcfeea6c ("KVM: x86: skip writeback on injection of nested exception")

for the purpose of skipping writeback. At the time, the above blob in the
decode flow didn't exist.

Decode exception handling was added by commit

  6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn")

but it was dead code even then. The patch discussion[1] even points out that
it was dead code, i.e. the change probably should have been reverted.

Peng Hao and Yi Wang later ran into what appears to be the same bug you're
hitting[2][3], and even had patches temporarily queued[4][5], but the
patches never made it to mainline as they broke kvm-unit-tests. Fun side
note, Radim even pointed out[4] the bug fixed by patch 1/3.

So, the patches look correct, but there's the open question of why the
hypercall test was failing for Paolo. I've tried to reproduce the #DF to
no avail.

[1] https://lore.kernel.org/patchwork/patch/850077/
[2] https://lkml.kernel.org/r/1537311828-4547-1-git-send-email-penghao122@sina.com.cn
[3] https://lkml.kernel.org/r/20190111133002.GA14852@flask
[4] https://lkml.kernel.org/r/20190111133002.GA14852@flask
[5] https://lkml.kernel.org/r/9835d255-dd9a-222b-f4a2-93611175b326@redhat.com

>  		if (emulation_type & EMULTYPE_SKIP)
>  			return EMULATE_FAIL;
>  		return handle_emulation_failure(vcpu, emulation_type);
> --
> 2.1.4
>

Actually adding Peng Hao and Yi Wang...

On Tue, Aug 27, 2019 at 07:50:30AM -0700, Sean Christopherson wrote:
> +Cc Peng Hao and Yi Wang
> [...]

On Tue, 27 Aug 2019 07:50:30 -0700
Sean Christopherson <sean.j.christopherson@intel.com> wrote:

> [...]
> So, the patches look correct, but there's the open question of why the
> hypercall test was failing for Paolo.

Sorry, I'm a little confused. Could you please point me to which test or tests
were broken? I've just run kvm-unit-tests and I see the same results with and
without my changes.

> I've tried to reproduce the #DF to
> no avail.
> [...]

On Wed, Aug 28, 2019 at 10:19:51AM +0000, Jan Dakinevich wrote:
> On Tue, 27 Aug 2019 07:50:30 -0700
> Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> > Yikes, this patch and the previous have quite the sordid history.
> >
> > The non-void return from inject_emulated_exception() was added by commit
> >
> >   ef54bcfeea6c ("KVM: x86: skip writeback on injection of nested exception")
> >
> > for the purpose of skipping writeback. At the time, the above blob in the
> > decode flow didn't exist.
> >
> > Decode exception handling was added by commit
> >
> >   6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn")
> >
> > but it was dead code even then. The patch discussion[1] even points out that
> > it was dead code, i.e. the change probably should have been reverted.
> >
> > Peng Hao and Yi Wang later ran into what appears to be the same bug you're
> > hitting[2][3], and even had patches temporarily queued[4][5], but the
> > patches never made it to mainline as they broke kvm-unit-tests. Fun side
> > note, Radim even pointed out[4] the bug fixed by patch 1/3.
> >
> > So, the patches look correct, but there's the open question of why the
> > hypercall test was failing for Paolo.
>
> Sorry, I'm a little confused. Could you please point me to which test or tests
> were broken? I've just run kvm-unit-tests and I see the same results with and
> without my changes.
>
> > I've tried to reproduce the #DF to
> > no avail.

Aha! The #DF occurs if patch 2/3, but not patch 3/3, is applied, and the
VMware backdoor is enabled. The backdoor is off by default, which is why
only Paolo was seeing the #DF.

To handle the VMware backdoor, KVM intercepts #GP faults, which includes the
non-canonical #GP from the hypercall unit test. With only patch 2/3 applied,
x86_emulate_instruction() injects a #GP for the non-canonical RIP but returns
EMULATE_FAIL instead of EMULATE_DONE. EMULATE_FAIL causes
handle_exception_nmi() (or gp_interception() for SVM) to re-inject the
original #GP because it thinks emulation failed due to a non-VMware opcode.

Applying patch 3/3 resolves the issue as x86_emulate_instruction() returns
EMULATE_DONE after injecting the #GP.

TL;DR: Swap the order of patches and everything should be hunky dory. Please
rebase to the latest kvm/queue, which has an equivalent to patch 1/3.

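The failure mode described in the message above can be modeled with a small stand-alone C sketch. This is a toy, not kernel code: every `toy_*` name is hypothetical, the vCPU is reduced to a single pending-exception slot, and the model assumes that queuing a second #GP while one is already pending escalates to #DF. It only illustrates why returning a failure code after the #GP has already been injected leads to a double injection.

```c
#include <stdbool.h>

/* Toy result codes, named after the ones used in the thread. */
enum toy_result { TOY_EMULATE_DONE, TOY_EMULATE_FAIL };

/* Toy vCPU state: only the pending exception is tracked. */
enum toy_exception { TOY_NONE, TOY_GP, TOY_DF };

struct toy_vcpu {
	enum toy_exception pending;
};

/* Queue a #GP; a #GP on top of a pending exception becomes #DF (assumption). */
static void toy_queue_gp(struct toy_vcpu *vcpu)
{
	vcpu->pending = (vcpu->pending == TOY_NONE) ? TOY_GP : TOY_DF;
}

/*
 * Decode-failure path: the #GP for the non-canonical RIP is injected in
 * both cases; only the return value depends on whether patch 3/3 is applied.
 */
static enum toy_result toy_emulate(struct toy_vcpu *vcpu, bool patch3_applied)
{
	toy_queue_gp(vcpu);
	return patch3_applied ? TOY_EMULATE_DONE : TOY_EMULATE_FAIL;
}

/* Stand-in for the #GP intercept used for the VMware backdoor. */
static enum toy_exception toy_gp_intercept(bool patch3_applied)
{
	struct toy_vcpu vcpu = { .pending = TOY_NONE };

	/* On failure the handler assumes "not a VMware opcode" and re-injects. */
	if (toy_emulate(&vcpu, patch3_applied) == TOY_EMULATE_FAIL)
		toy_queue_gp(&vcpu);

	return vcpu.pending;
}
```

In this model `toy_gp_intercept(false)` ends with `TOY_DF` pending, while `toy_gp_intercept(true)` ends with a single `TOY_GP`, which matches the behavior reported with and without patch 3/3.
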
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 93b0bd4..45caa69 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6521,8 +6521,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 		if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
 					emulation_type))
 			return EMULATE_DONE;
-		if (ctxt->have_exception && inject_emulated_exception(vcpu))
+		if (ctxt->have_exception) {
+			inject_emulated_exception(vcpu);
 			return EMULATE_DONE;
+		}
 		if (emulation_type & EMULTYPE_SKIP)
 			return EMULATE_FAIL;
 		return handle_emulation_failure(vcpu, emulation_type);

inject_emulated_exception() returns true if and only if nested page
fault happens. However, page fault can come from guest page tables
walk, either nested or not nested. In both cases we should stop an
attempt to read under RIP and give guest to step over its own page
fault handler.

Fixes: 6ea6e84 ("KVM: x86: inject exceptions produced by x86_decode_insn")
Cc: Denis Lunev <den@virtuozzo.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
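
The difference between the old and new condition can be illustrated with a hedged, stand-alone C sketch. The `toy_*` names and the two-value fault enum are hypothetical; the only behavior taken from the commit message is that the injection helper reports true solely for a nested page fault. The sketch compares whether the decode-failure path stops early before and after the change.

```c
#include <stdbool.h>

/* Hypothetical classification of where the page fault came from. */
enum toy_fault { TOY_FAULT_NESTED, TOY_FAULT_GUEST };

/* Per the commit message: true if and only if the fault is a nested one. */
static bool toy_inject(enum toy_fault fault)
{
	return fault == TOY_FAULT_NESTED;
}

/* Old condition: stop only when the injection helper reports a nested fault. */
static bool toy_stops_before_patch(bool have_exception, enum toy_fault fault)
{
	return have_exception && toy_inject(fault);
}

/* New condition: stop whenever decode produced an exception. */
static bool toy_stops_after_patch(bool have_exception, enum toy_fault fault)
{
	if (have_exception) {
		toy_inject(fault);	/* return value deliberately ignored */
		return true;
	}
	return false;
}
```

With `have_exception` set and a non-nested fault, the old condition does not stop and falls through to the failure handling, whereas the new one stops after injecting, which is the case the patch is meant to cover.
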