diff mbox

[v4,11/13] nEPT: Advertise EPT to L1

Message ID 1374750001-28527-12-git-send-email-gleb@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Gleb Natapov July 25, 2013, 10:59 a.m. UTC
From: Nadav Har'El <nyh@il.ibm.com>

Advertise the support of EPT to the L1 guest, through the appropriate MSR.

This is the last patch of the basic Nested EPT feature, so as to allow
bisection through this patch series: The guest will not see EPT support until
this last patch, and will not attempt to use the half-applied feature.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/vmx.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Paolo Bonzini July 29, 2013, 9:21 a.m. UTC | #1
Il 25/07/2013 12:59, Gleb Natapov ha scritto:
> +	if (enable_ept) {
> +		/* nested EPT: emulate EPT also to L1 */
> +		nested_vmx_secondary_ctls_high |= SECONDARY_EXEC_ENABLE_EPT;
> +		nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT;
> +		nested_vmx_ept_caps |=
> +			VMX_EPT_INVEPT_BIT | VMX_EPT_EXTENT_GLOBAL_BIT |
> +			VMX_EPT_EXTENT_CONTEXT_BIT |
> +			VMX_EPT_EXTENT_INDIVIDUAL_BIT;

What version of the Intel manual defines bit 24?  Mine is January 2013
and only has 20/25/26.

> +		nested_vmx_ept_caps &= vmx_capability.ept;

This is always missing VMX_EPT_EXECUTE_ONLY_BIT, should it be added
before the "&=".

Also, the three extent bits should always be fine for the MSR,
independent of the host support, because the processor will do the
INVEPT vmexit before checking the INVEPT type against the processor
capabilities.  So they can be added after the "&=".

Related to this, I suppose enable_ept should depend on
VMX_EPT_INVEPT_BIT too (it currently doesn't) since vmx_flush_tlb does
an unconditional invept under "if (enable_ept)".  This is really just
nitpicking though.

Paolo

> +	} else
> +		nested_vmx_ept_caps = 0;

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 29, 2013, 11:11 a.m. UTC | #2
On Mon, Jul 29, 2013 at 11:21:38AM +0200, Paolo Bonzini wrote:
> Il 25/07/2013 12:59, Gleb Natapov ha scritto:
> > +	if (enable_ept) {
> > +		/* nested EPT: emulate EPT also to L1 */
> > +		nested_vmx_secondary_ctls_high |= SECONDARY_EXEC_ENABLE_EPT;
> > +		nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT;
> > +		nested_vmx_ept_caps |=
> > +			VMX_EPT_INVEPT_BIT | VMX_EPT_EXTENT_GLOBAL_BIT |
> > +			VMX_EPT_EXTENT_CONTEXT_BIT |
> > +			VMX_EPT_EXTENT_INDIVIDUAL_BIT;
> 
> What version of the Intel manual defines bit 24?  Mine is January 2013
> and only has 20/25/26.
> 
Good question. Looks like this crept in while Jun rebased Nadavs patch
on more resent master. This bit was defined by initial EPT implementation,
but was dropped by commit 2b3c5cbc0d814437fe4d70cc11ed60550b95b29f not
long time ago.

> > +		nested_vmx_ept_caps &= vmx_capability.ept;
> 
> This is always missing VMX_EPT_EXECUTE_ONLY_BIT, should it be added
> before the "&=".
I am not at all sure our current shadow implementation can support
execute only pages. Best to leave it off for now.

> 
> Also, the three extent bits should always be fine for the MSR,
> independent of the host support, because the processor will do the
> INVEPT vmexit before checking the INVEPT type against the processor
> capabilities.  So they can be added after the "&=".
> 
Good point.

> Related to this, I suppose enable_ept should depend on
> VMX_EPT_INVEPT_BIT too (it currently doesn't) since vmx_flush_tlb does
> an unconditional invept under "if (enable_ept)".  This is really just
> nitpicking though.
> 
> Paolo
> 
> > +	} else
> > +		nested_vmx_ept_caps = 0;

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini July 29, 2013, 11:33 a.m. UTC | #3
Il 29/07/2013 13:11, Gleb Natapov ha scritto:
> > > +		nested_vmx_ept_caps &= vmx_capability.ept;
> > 
> > This is always missing VMX_EPT_EXECUTE_ONLY_BIT, should it be added
> > before the "&=".
> 
> I am not at all sure our current shadow implementation can support
> execute only pages. Best to leave it off for now.

Ok, I was tricked by this reference to nested_vmx_ept_caps's execonly bit:

+	int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu,
+			nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT);

It's probably best to add a comment there, saying that the bit will
always be zero for now.

>> Also, the three extent bits should always be fine for the MSR,
>> independent of the host support, because the processor will do the
>> INVEPT vmexit before checking the INVEPT type against the processor
>> capabilities.  So they can be added after the "&=".
>>
> Good point.

For v5 you probably should leave out individual-addr invalidation from
this and the EPT patch too, though.

Paolo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov July 29, 2013, 11:35 a.m. UTC | #4
On Mon, Jul 29, 2013 at 01:33:26PM +0200, Paolo Bonzini wrote:
> Il 29/07/2013 13:11, Gleb Natapov ha scritto:
> > > > +		nested_vmx_ept_caps &= vmx_capability.ept;
> > > 
> > > This is always missing VMX_EPT_EXECUTE_ONLY_BIT, should it be added
> > > before the "&=".
> > 
> > I am not at all sure our current shadow implementation can support
> > execute only pages. Best to leave it off for now.
> 
> Ok, I was tricked by this reference to nested_vmx_ept_caps's execonly bit:
> 
> +	int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu,
> +			nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT);
> 
> It's probably best to add a comment there, saying that the bit will
> always be zero for now.
> 
> >> Also, the three extent bits should always be fine for the MSR,
> >> independent of the host support, because the processor will do the
> >> INVEPT vmexit before checking the INVEPT type against the processor
> >> capabilities.  So they can be added after the "&=".
> >>
> > Good point.
> 
> For v5 you probably should leave out individual-addr invalidation from
> this and the EPT patch too, though.
> 
Of course. The define should not be introduces again.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6b79db7..a77f902 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2249,6 +2249,18 @@  static __init void nested_vmx_setup_ctls_msrs(void)
 		SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
 		SECONDARY_EXEC_WBINVD_EXITING;
 
+	if (enable_ept) {
+		/* nested EPT: emulate EPT also to L1 */
+		nested_vmx_secondary_ctls_high |= SECONDARY_EXEC_ENABLE_EPT;
+		nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT;
+		nested_vmx_ept_caps |=
+			VMX_EPT_INVEPT_BIT | VMX_EPT_EXTENT_GLOBAL_BIT |
+			VMX_EPT_EXTENT_CONTEXT_BIT |
+			VMX_EPT_EXTENT_INDIVIDUAL_BIT;
+		nested_vmx_ept_caps &= vmx_capability.ept;
+	} else
+		nested_vmx_ept_caps = 0;
+
 	/* miscellaneous data */
 	rdmsr(MSR_IA32_VMX_MISC, nested_vmx_misc_low, nested_vmx_misc_high);
 	nested_vmx_misc_low &= VMX_MISC_PREEMPTION_TIMER_RATE_MASK |
@@ -2357,8 +2369,8 @@  static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 					nested_vmx_secondary_ctls_high);
 		break;
 	case MSR_IA32_VMX_EPT_VPID_CAP:
-		/* Currently, no nested ept or nested vpid */
-		*pdata = 0;
+		/* Currently, no nested vpid support */
+		*pdata = nested_vmx_ept_caps;
 		break;
 	default:
 		return 0;