diff mbox

[2/3] KVM: VMX: Simplify pdptr and cr3 management

Message ID 1243862524-22120-3-git-send-email-avi@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Avi Kivity June 1, 2009, 1:22 p.m. UTC
Instead of reading the PDPTRs from memory after every exit (which is slow
and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
memory to the VMCS before entry, and from the VMCS to memory after exit.
Do the same for cr3.

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 arch/x86/kvm/vmx.c |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

Comments

Sheng Yang June 2, 2009, 9:22 a.m. UTC | #1
On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
> Instead of reading the PDPTRs from memory after every exit (which is slow
> and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
> memory to the VMCS before entry, and from the VMCS to memory after exit.
> Do the same for cr3.
>

Thanks for fixing!

After review my original code, I found a potential bug. For SDM 3B have this:

23.3.4 Saving Non-Register State
...
If the logical processor supports the 1-setting of the “enable EPT” VM-
execution control, values are saved into the four (4) PDPTE fields as follows:
— If the “enable EPT” VM-execution control is 1 and the logical processor was
using PAE paging at the time of the VM exit, the PDPTE values currently in use
are saved:
• The values saved into bits 11:9 of each of the fields is undefined.
• If the value saved into one of the fields has bit 0 (present) clear, the 
value saved into bits 63:1 of that field is undefined. That value need not
correspond to the value that was loaded by VM entry or to any value that
might have been loaded in VMX non-root operation.
• If the value saved into one of the fields has bit 0 (present) set, the value
saved into bits 63:12 of the field is a guest-physical address.
— If the “enable EPT” VM-execution control is 0 or the logical processor was 
not using PAE paging at the time of the VM exit, the values saved are 
undefined.

But drop the ept_load_pdptrs() when exit and add it in cr0 handling result in 
Windows PAE guest hang on boot. I am checking it now. Any thoughts?...
Avi Kivity June 2, 2009, 9:26 a.m. UTC | #2
Sheng Yang wrote:
> On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
>   
>> Instead of reading the PDPTRs from memory after every exit (which is slow
>> and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs from
>> memory to the VMCS before entry, and from the VMCS to memory after exit.
>> Do the same for cr3.
>>
>>     
>
> Thanks for fixing!
>
> After review my original code, I found a potential bug. For SDM 3B have this:
>
> 23.3.4 Saving Non-Register State
> ...
> If the logical processor supports the 1-setting of the “enable EPT” VM-
> execution control, values are saved into the four (4) PDPTE fields as follows:
> — If the “enable EPT” VM-execution control is 1 and the logical processor was
> using PAE paging at the time of the VM exit, the PDPTE values currently in use
> are saved:
> • The values saved into bits 11:9 of each of the fields is undefined.
> • If the value saved into one of the fields has bit 0 (present) clear, the 
> value saved into bits 63:1 of that field is undefined. That value need not
> correspond to the value that was loaded by VM entry or to any value that
> might have been loaded in VMX non-root operation.
> • If the value saved into one of the fields has bit 0 (present) set, the value
> saved into bits 63:12 of the field is a guest-physical address.
> — If the “enable EPT” VM-execution control is 0 or the logical processor was 
> not using PAE paging at the time of the VM exit, the values saved are 
> undefined.
>
> But drop the ept_load_pdptrs() when exit and add it in cr0 handling result in 
> Windows PAE guest hang on boot. I am checking it now. Any thoughts?...
>   

You mean with the new code?  What version of Windows exactly?

I'll check it out, though EPTs are a little hard to find here.
Sheng Yang June 2, 2009, 9:30 a.m. UTC | #3
On Tuesday 02 June 2009 17:26:27 Avi Kivity wrote:
> Sheng Yang wrote:
> > On Monday 01 June 2009 21:22:02 Avi Kivity wrote:
> >> Instead of reading the PDPTRs from memory after every exit (which is
> >> slow and wrong, as the PDPTRs are stored on the cpu), sync the PDPTRs
> >> from memory to the VMCS before entry, and from the VMCS to memory after
> >> exit. Do the same for cr3.
> >
> > Thanks for fixing!
> >
> > After review my original code, I found a potential bug. For SDM 3B have
> > this:
> >
> > 23.3.4 Saving Non-Register State
> > ...
> > If the logical processor supports the 1-setting of the “enable EPT” VM-
> > execution control, values are saved into the four (4) PDPTE fields as
> > follows: — If the “enable EPT” VM-execution control is 1 and the logical
> > processor was using PAE paging at the time of the VM exit, the PDPTE
> > values currently in use are saved:
> > • The values saved into bits 11:9 of each of the fields is undefined.
> > • If the value saved into one of the fields has bit 0 (present) clear,
> > the value saved into bits 63:1 of that field is undefined. That value
> > need not correspond to the value that was loaded by VM entry or to any
> > value that might have been loaded in VMX non-root operation.
> > • If the value saved into one of the fields has bit 0 (present) set, the
> > value saved into bits 63:12 of the field is a guest-physical address.
> > — If the “enable EPT” VM-execution control is 0 or the logical processor
> > was not using PAE paging at the time of the VM exit, the values saved are
> > undefined.
> >
> > But drop the ept_load_pdptrs() when exit and add it in cr0 handling
> > result in Windows PAE guest hang on boot. I am checking it now. Any
> > thoughts?...
>
> You mean with the new code?  What version of Windows exactly?
>
> I'll check it out, though EPTs are a little hard to find here.

No, no, not with the new code. For CPU can load pdptrs if EPT enabled with PAE 
from VM exit, there should not be necessary load it explicitly. So I estimate 
the ept_load_pdptr() in exit handler, and put it in CR0 handling. Just tried 
to optimize load-pdptr according to the spec, but not got the desired 
result...

So I am trying to find the failure reason...
Avi Kivity June 2, 2009, 9:46 a.m. UTC | #4
Sheng Yang wrote:
> No, no, not with the new code. For CPU can load pdptrs if EPT enabled with PAE 
> from VM exit, there should not be necessary load it explicitly. So I estimate 
> the ept_load_pdptr() in exit handler, and put it in CR0 handling. Just tried 
> to optimize load-pdptr according to the spec, but not got the desired 
> result...
>
> So I am trying to find the failure reason...
>   

so from your point of view, the patches are okay?
Sheng Yang June 2, 2009, 9:56 a.m. UTC | #5
On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
> Sheng Yang wrote:
> > No, no, not with the new code. For CPU can load pdptrs if EPT enabled
> > with PAE from VM exit, there should not be necessary load it explicitly.
> > So I estimate the ept_load_pdptr() in exit handler, and put it in CR0
> > handling. Just tried to optimize load-pdptr according to the spec, but
> > not got the desired result...
> >
> > So I am trying to find the failure reason...
>
> so from your point of view, the patches are okay?

Yes. But we should can optimize it more by avoiding most of PDPTR loading, 
according to the spec.
Avi Kivity June 2, 2009, 10:16 a.m. UTC | #6
Sheng Yang wrote:
> On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
>   
>> Sheng Yang wrote:
>>     
>>> No, no, not with the new code. For CPU can load pdptrs if EPT enabled
>>> with PAE from VM exit, there should not be necessary load it explicitly.
>>> So I estimate the ept_load_pdptr() in exit handler, and put it in CR0
>>> handling. Just tried to optimize load-pdptr according to the spec, but
>>> not got the desired result...
>>>
>>> So I am trying to find the failure reason...
>>>       
>> so from your point of view, the patches are okay?
>>     
>
> Yes. But we should can optimize it more by avoiding most of PDPTR loading, 
> according to the spec.
>   

Patch 3 makes reloading a very rare event.  It only happens on pae mode 
changes or if userspace sets cr3.
Sheng Yang June 2, 2009, 11:31 a.m. UTC | #7
On Tuesday 02 June 2009 18:16:14 Avi Kivity wrote:
> Sheng Yang wrote:
> > On Tuesday 02 June 2009 17:46:39 Avi Kivity wrote:
> >> Sheng Yang wrote:
> >>> No, no, not with the new code. For CPU can load pdptrs if EPT enabled
> >>> with PAE from VM exit, there should not be necessary load it
> >>> explicitly. So I estimate the ept_load_pdptr() in exit handler, and put
> >>> it in CR0 handling. Just tried to optimize load-pdptr according to the
> >>> spec, but not got the desired result...
> >>>
> >>> So I am trying to find the failure reason...
> >>
> >> so from your point of view, the patches are okay?
> >
> > Yes. But we should can optimize it more by avoiding most of PDPTR
> > loading, according to the spec.
>
> Patch 3 makes reloading a very rare event.  It only happens on pae mode
> changes or if userspace sets cr3.

Yes, you are right. Quite beautiful. :)

Sorry for misunderstood the meaning of "on demand" without looking at the 
patch carefully...

Looks fine now. Thanks!

BTW: one extra blank line above function kvm_pdptr_read() in patch 3. ;)
Avi Kivity June 2, 2009, 11:44 a.m. UTC | #8
Sheng Yang wrote:
> Yes, you are right. Quite beautiful. :)
>   

Thanks :)

It's basically a continuation of RIP/RSP caching.  I plan to continue it 
for cr0/cr3.

> Sorry for misunderstood the meaning of "on demand" without looking at the 
> patch carefully...
>
> Looks fine now. Thanks!
>
> BTW: one extra blank line above function kvm_pdptr_read() in patch 3. ;)
>   

Will fix.
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5607de8..1783606 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1538,10 +1538,6 @@  static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
 static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
 {
 	if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
-		if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
-			printk(KERN_ERR "EPT: Fail to load pdptrs!\n");
-			return;
-		}
 		vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
 		vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
 		vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
@@ -1549,6 +1545,16 @@  static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void ept_save_pdptrs(struct kvm_vcpu *vcpu)
+{
+	if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
+		vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
+		vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
+		vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
+		vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
+	}
+}
+
 static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 
 static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
@@ -1642,7 +1648,6 @@  static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
 	if (enable_ept) {
 		eptp = construct_eptp(cr3);
 		vmcs_write64(EPT_POINTER, eptp);
-		ept_load_pdptrs(vcpu);
 		guest_cr3 = is_paging(vcpu) ? vcpu->arch.cr3 :
 			VMX_EPT_IDENTITY_PAGETABLE_ADDR;
 	}
@@ -3199,7 +3204,7 @@  static int vmx_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	 * to sync with guest real CR3. */
 	if (enable_ept && is_paging(vcpu)) {
 		vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
-		ept_load_pdptrs(vcpu);
+		ept_save_pdptrs(vcpu);
 	}
 
 	if (unlikely(vmx->fail)) {
@@ -3376,6 +3381,10 @@  static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (enable_ept && is_paging(vcpu)) {
+		vmcs_writel(GUEST_CR3, vcpu->arch.cr3);
+		ept_load_pdptrs(vcpu);
+	}
 	/* Record the guest's net vcpu time for enforced NMI injections. */
 	if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
 		vmx->entry_time = ktime_get();