Message ID | 1489761691-11441-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive) |
---|---|
State | New, archived |
On 17/03/2017 15:41, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The L2 guest hangs with shadow page tables on EPT. The trace on L1 shows
> L2 repeatedly hitting kvm_exit with reason EXCEPTION_NMI and a page fault:
>
> qemu-system-x86-2821 [003] d..2 45.848814: kvm_entry: vcpu 0
> qemu-system-x86-2821 [003] ...1 45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
> qemu-system-x86-2821 [003] ...1 45.848827: kvm_page_fault: address fe05b error_code 14
>
> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
> prevents loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
> uninitialized in real mode. Hyper-V L1 will emulate L2 real mode with PAE
> paging and EPT enabled. However, during system boot the guest switches from
> legacy mode's Protected mode sub-mode to Long mode, and the check in
> nested_vmx_load_cr3() prevents loading the PDPTRs while the guest is still in
> Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. Actually,
> the original commit was only meant to prevent dereferencing L2's CR3
> if the L1 hypervisor emulates L2's real mode through vm8086.
>
> This patch fixes it by allowing the PDPTRs to be loaded if PAE paging and
> EPT are enabled and !vm86_active.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Ladi Prosek <lprosek@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>

Please provide a testcase. I know this is a regression, but I'm not
going to merge the fix without a corresponding patch to kvm-unit-tests.

Paolo

> ---
>  arch/x86/kvm/vmx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c664365..2b2a05f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>  			       u32 *entry_failure_code)
>  {
> -	if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
> +	if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>  		if (!nested_cr3_valid(vcpu, cr3)) {
>  			*entry_failure_code = ENTRY_FAIL_DEFAULT;
>  			return 1;
> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>  		 * must not be dereferenced.
>  		 */
>  		if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
> -		    !nested_ept) {
> +		    !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>  			if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
>  				*entry_failure_code = ENTRY_FAIL_PDPTE;
>  				return 1;
>

On Fri, Mar 17, 2017 at 3:41 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> The L2 guest hangs with shadow page tables on EPT. The trace on L1 shows
> L2 repeatedly hitting kvm_exit with reason EXCEPTION_NMI and a page fault:
>
> qemu-system-x86-2821 [003] d..2 45.848814: kvm_entry: vcpu 0
> qemu-system-x86-2821 [003] ...1 45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
> qemu-system-x86-2821 [003] ...1 45.848827: kvm_page_fault: address fe05b error_code 14
>
> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
> prevents loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
> uninitialized in real mode. Hyper-V L1 will emulate L2 real mode with PAE
> paging and EPT enabled. However, during system boot the guest switches from
> legacy mode's Protected mode sub-mode to Long mode, and the check in
> nested_vmx_load_cr3() prevents loading the PDPTRs while the guest is still in
> Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. Actually,
> the original commit was only meant to prevent dereferencing L2's CR3
> if the L1 hypervisor emulates L2's real mode through vm8086.
>
> This patch fixes it by allowing the PDPTRs to be loaded if PAE paging and
> EPT are enabled and !vm86_active.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Ladi Prosek <lprosek@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/vmx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c664365..2b2a05f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>  			       u32 *entry_failure_code)
>  {
> -	if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
> +	if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>  		if (!nested_cr3_valid(vcpu, cr3)) {
>  			*entry_failure_code = ENTRY_FAIL_DEFAULT;
>  			return 1;
> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>  		 * must not be dereferenced.
>  		 */
>  		if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
> -		    !nested_ept) {
> +		    !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {

This change breaks Hyper-V on KVM. L2 hangs on start-up, with the same
symptoms as before 7ca29de21362.

I'll take a closer look next week. Is there an easy way for me to
reproduce the issue you're seeing?

>  			if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
>  				*entry_failure_code = ENTRY_FAIL_PDPTE;
>  				return 1;
> --
> 2.7.4
>

On 17/03/2017 18:28, Ladi Prosek wrote:
> On Fri, Mar 17, 2017 at 3:41 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The L2 guest hangs with shadow page tables on EPT. The trace on L1 shows
>> L2 repeatedly hitting kvm_exit with reason EXCEPTION_NMI and a page fault:
>>
>> qemu-system-x86-2821 [003] d..2 45.848814: kvm_entry: vcpu 0
>> qemu-system-x86-2821 [003] ...1 45.848827: kvm_exit: reason EXCEPTION_NMI rip 0xe05b info fe05b 80000b0e
>> qemu-system-x86-2821 [003] ...1 45.848827: kvm_page_fault: address fe05b error_code 14
>>
>> Commit 7ca29de21362 (KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT)
>> prevents loading L2's PDPTRs by dereferencing L2's CR3, since CR3 is
>> uninitialized in real mode. Hyper-V L1 will emulate L2 real mode with PAE
>> paging and EPT enabled. However, during system boot the guest switches from
>> legacy mode's Protected mode sub-mode to Long mode, and the check in
>> nested_vmx_load_cr3() prevents loading the PDPTRs while the guest is still in
>> Protected mode w/ PAE paging and nested EPT/shadow page tables on EPT. Actually,
>> the original commit was only meant to prevent dereferencing L2's CR3
>> if the L1 hypervisor emulates L2's real mode through vm8086.
>>
>> This patch fixes it by allowing the PDPTRs to be loaded if PAE paging and
>> EPT are enabled and !vm86_active.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Cc: Ladi Prosek <lprosek@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>  arch/x86/kvm/vmx.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c664365..2b2a05f 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
>>  static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
>>  			       u32 *entry_failure_code)
>>  {
>> -	if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
>> +	if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
>>  		if (!nested_cr3_valid(vcpu, cr3)) {
>>  			*entry_failure_code = ENTRY_FAIL_DEFAULT;
>>  			return 1;
>> @@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
>>  		 * must not be dereferenced.
>>  		 */
>>  		if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
>> -		    !nested_ept) {
>> +		    !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
>
> This change breaks Hyper-V on KVM. L2 hangs on start-up, with the same
> symptoms as before 7ca29de21362.

Looks like we need _two_ testcases then... :)

Paolo

> I'll take a closer look next week. Is there an easy way for me to
> reproduce the issue you're seeing?

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c664365..2b2a05f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9933,7 +9933,7 @@ static bool nested_cr3_valid(struct kvm_vcpu *vcpu, unsigned long val)
 static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
 			       u32 *entry_failure_code)
 {
-	if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
+	if (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu)) {
 		if (!nested_cr3_valid(vcpu, cr3)) {
 			*entry_failure_code = ENTRY_FAIL_DEFAULT;
 			return 1;
@@ -9944,7 +9944,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
 		 * must not be dereferenced.
 		 */
 		if (!is_long_mode(vcpu) && is_pae(vcpu) && is_paging(vcpu) &&
-		    !nested_ept) {
+		    !(nested_ept && to_vmx(vcpu)->rmode.vm86_active)) {
 			if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
 				*entry_failure_code = ENTRY_FAIL_PDPTE;
 				return 1;
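
For readers comparing the two versions of the second hunk, the following is a rough standalone sketch of the PDPTR-load condition before and after the patch. It is not kernel code: `struct l2_state`, `loads_pdptrs_before()`, `loads_pdptrs_after()` and `count_pdptr_load_differences()` are hypothetical names invented here, and the boolean fields only stand in for is_long_mode(), is_pae(), is_paging(), nested_ept and rmode.vm86_active from the diff.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state nested_vmx_load_cr3() consults. */
struct l2_state {
	bool long_mode;   /* models is_long_mode(vcpu) */
	bool pae;         /* models is_pae(vcpu) */
	bool paging;      /* models is_paging(vcpu) */
	bool nested_ept;  /* models the nested_ept argument */
	bool vm86_active; /* models to_vmx(vcpu)->rmode.vm86_active */
};

/* PAE paging outside long mode: the only case in which PDPTRs matter. */
static bool pae_paging(const struct l2_state *s)
{
	return !s->long_mode && s->pae && s->paging;
}

/* Condition before the patch: never load PDPTRs when nested EPT is used. */
static bool loads_pdptrs_before(const struct l2_state *s)
{
	return pae_paging(s) && !s->nested_ept;
}

/* Condition after the patch: skip only for nested EPT with vm86 active. */
static bool loads_pdptrs_after(const struct l2_state *s)
{
	return pae_paging(s) && !(s->nested_ept && s->vm86_active);
}

/*
 * Enumerate nested_ept x vm86_active with PAE paging enabled, print one
 * row per combination, and return how many rows differ between versions.
 */
static int count_pdptr_load_differences(void)
{
	int differences = 0;

	printf("nested_ept vm86_active before after\n");
	for (int ept = 0; ept <= 1; ept++) {
		for (int vm86 = 0; vm86 <= 1; vm86++) {
			struct l2_state s = {
				.long_mode = false, .pae = true, .paging = true,
				.nested_ept = ept, .vm86_active = vm86,
			};
			bool before = loads_pdptrs_before(&s);
			bool after = loads_pdptrs_after(&s);

			printf("%10d %11d %6d %5d\n", ept, vm86, before, after);
			if (before != after)
				differences++;
		}
	}
	return differences;
}
```

Calling count_pdptr_load_differences() from a small main() prints the four combinations. Under this simplified model the two conditions disagree only when nested EPT is in use and vm86 is not active, which is the case the commit message wants to allow and also the case Ladi reports as hanging Hyper-V L2 again.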