Message ID: 1395286089-5406-4-git-send-email-bsd@redhat.com (mailing list archive)
State: New, archived
On 2014-03-20 04:28, Bandan Das wrote:
> Some L1 hypervisors such as Xen seem to be calling invept after
> vmclear or before vmptrld on L2. In this case, proceed with
> falling through and syncing roots as a case where
> context wide invalidation can't be supported

Can we also base this behaviour on a statement in the SDM? But on first
glance, I do not find anything like this over there.

Jan

>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c707389..b407b3a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>
>  	switch (type) {
>  	case VMX_EPT_EXTENT_CONTEXT:
> -		if ((operand.eptp & eptp_mask) !=
> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
> +		if (get_vmcs12(vcpu) &&
> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
> +		    eptp_mask)))
> 			break;
>  	case VMX_EPT_EXTENT_GLOBAL:
> 		kvm_mmu_sync_roots(vcpu);
>
On 2014-03-20 04:28, Bandan Das wrote:
> Some L1 hypervisors such as Xen seem to be calling invept after
> vmclear or before vmptrld on L2. In this case, proceed with
> falling through and syncing roots as a case where
> context wide invalidation can't be supported
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index c707389..b407b3a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>
>  	switch (type) {
>  	case VMX_EPT_EXTENT_CONTEXT:
> -		if ((operand.eptp & eptp_mask) !=
> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
> +		if (get_vmcs12(vcpu) &&
> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
> +		    eptp_mask)))
> 			break;
>  	case VMX_EPT_EXTENT_GLOBAL:
> 		kvm_mmu_sync_roots(vcpu);
>

Please add a /* fall through */ comment as well.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-03-20 04:28, Bandan Das wrote:
>> Some L1 hypervisors such as Xen seem to be calling invept after
>> vmclear or before vmptrld on L2. In this case, proceed with
>> falling through and syncing roots as a case where
>> context wide invalidation can't be supported
>
> Can we also base this behaviour on a statement in the SDM? But on first
> glance, I do not find anything like this over there.

The SDM has nothing of this sort explicitly mentioned, but 28.3.3.1
"Operations that Invalidate Cached Mappings" does mention that
the instruction may invalidate mappings associated with other
EP4TAs (even in single context).

Note that I based this on what we currently do for context invalidation:

static inline void ept_sync_context(u64 eptp)
{
	if (enable_ept) {
		if (cpu_has_vmx_invept_context())
			__invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
		else
			ept_sync_global();
	}
}

Seemed easier and cleaner than having a cached eptp after vmcs12 is
long gone :)

If you prefer, I can modify the commit message to reflect this.

> Jan
>
>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  arch/x86/kvm/vmx.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c707389..b407b3a 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>
>>  	switch (type) {
>>  	case VMX_EPT_EXTENT_CONTEXT:
>> -		if ((operand.eptp & eptp_mask) !=
>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>> +		if (get_vmcs12(vcpu) &&
>> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
>> +		    eptp_mask)))
>> 			break;
>>  	case VMX_EPT_EXTENT_GLOBAL:
>> 		kvm_mmu_sync_roots(vcpu);
>>
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 2014-03-20 04:28, Bandan Das wrote:
>> Some L1 hypervisors such as Xen seem to be calling invept after
>> vmclear or before vmptrld on L2. In this case, proceed with
>> falling through and syncing roots as a case where
>> context wide invalidation can't be supported
>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  arch/x86/kvm/vmx.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c707389..b407b3a 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>
>>  	switch (type) {
>>  	case VMX_EPT_EXTENT_CONTEXT:
>> -		if ((operand.eptp & eptp_mask) !=
>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>> +		if (get_vmcs12(vcpu) &&
>> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
>> +		    eptp_mask)))
>> 			break;
>>  	case VMX_EPT_EXTENT_GLOBAL:
>> 		kvm_mmu_sync_roots(vcpu);
>>
>
> Please add a /* fall through */ comment as well.

Thanks for reviewing, Paolo; will fix in v2.

> Paolo
On 2014-03-20 21:58, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
>
>> On 2014-03-20 04:28, Bandan Das wrote:
>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>> vmclear or before vmptrld on L2. In this case, proceed with
>>> falling through and syncing roots as a case where
>>> context wide invalidation can't be supported
>>
>> Can we also base this behaviour on a statement in the SDM? But on first
>> glance, I do not find anything like this over there.
>
> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
> "Operations that Invalidate Cached Mappings" does mention that
> the instruction may invalidate mappings associated with other
> EP4TAs (even in single context).

Yes, "may". So we are implementing undefined behavior in order to please
a broken hypervisor that relies on it? Then please state this in the
patch and probably also inform Xen about their issue.

>
> Note that I based this on what we currently do for context invalidation:
>
> static inline void ept_sync_context(u64 eptp)
> {
> 	if (enable_ept) {
> 		if (cpu_has_vmx_invept_context())
> 			__invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
> 		else
> 			ept_sync_global();
> 	}
> }

Don't get your point. This test is about testing for CPU support for
context invalidation, then falling back to global invalidation if there
is no support.

Jan

>
> Seemed easier and cleaner than having a cached eptp after vmcs12 is
> long gone :)
>
> If you prefer, I can modify the commit message to reflect this.
>
>> Jan
>>
>>>
>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>> ---
>>>  arch/x86/kvm/vmx.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index c707389..b407b3a 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>>
>>>  	switch (type) {
>>>  	case VMX_EPT_EXTENT_CONTEXT:
>>> -		if ((operand.eptp & eptp_mask) !=
>>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>>> +		if (get_vmcs12(vcpu) &&
>>> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
>>> +		    eptp_mask)))
>>> 			break;
>>>  	case VMX_EPT_EXTENT_GLOBAL:
>>> 		kvm_mmu_sync_roots(vcpu);
>>>
Jan Kiszka <jan.kiszka@web.de> writes:

> On 2014-03-20 21:58, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>
>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>> falling through and syncing roots as a case where
>>>> context wide invalidation can't be supported
>>>
>>> Can we also base this behaviour on a statement in the SDM? But on first
>>> glance, I do not find anything like this over there.
>>
>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
>> "Operations that Invalidate Cached Mappings" does mention that
>> the instruction may invalidate mappings associated with other
>> EP4TAs (even in single context).
>
> Yes, "may". So we are implementing undefined behavior in order to please
> a broken hypervisor that relies on it? Then please state this in the
> patch and probably also inform Xen about their issue.

Why undefined behavior? We don't do anything specific for
the single context invalidation case anyway, i.e. if the eptp matches what
vmcs12 has, single context invalidation does fall through to the global
invalidation case already. All this change does is add the "L1 calls
invept after vmclear and before vmptrld" case to the list of cases that fall
through to global invalidation, since nvmx doesn't have any knowledge of
the current eptp in that situation.

Or do you think we should rethink this approach?

>> Note that I based this on what we currently do for context invalidation:
>>
>> static inline void ept_sync_context(u64 eptp)
>> {
>> 	if (enable_ept) {
>> 		if (cpu_has_vmx_invept_context())
>> 			__invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
>> 		else
>> 			ept_sync_global();
>> 	}
>> }
>
> Don't get your point. This test is about testing for CPU support for
> context invalidation, then falling back to global invalidation if there
> is no support.

Sorry if this was confusing. All I was trying to say is that switching
to global invalidation if we can't do single context invalidation for
some reason is not unusual.

Thanks,
Bandan

> Jan
>
>>
>> Seemed easier and cleaner than having a cached eptp after vmcs12 is
>> long gone :)
>>
>> If you prefer, I can modify the commit message to reflect this.
>>
>>> Jan
>>>
>>>>
>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>> ---
>>>>  arch/x86/kvm/vmx.c | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index c707389..b407b3a 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>>>
>>>>  	switch (type) {
>>>>  	case VMX_EPT_EXTENT_CONTEXT:
>>>> -		if ((operand.eptp & eptp_mask) !=
>>>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>>>> +		if (get_vmcs12(vcpu) &&
>>>> +		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
>>>> +		    eptp_mask)))
>>>> 			break;
>>>>  	case VMX_EPT_EXTENT_GLOBAL:
>>>> 		kvm_mmu_sync_roots(vcpu);
>>>>
On 2014-03-22 17:43, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@web.de> writes:
>
>> On 2014-03-20 21:58, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>> falling through and syncing roots as a case where
>>>>> context wide invalidation can't be supported
>>>>
>>>> Can we also base this behaviour on a statement in the SDM? But on first
>>>> glance, I do not find anything like this over there.
>>>
>>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
>>> "Operations that Invalidate Cached Mappings" does mention that
>>> the instruction may invalidate mappings associated with other
>>> EP4TAs (even in single context).
>>
>> Yes, "may". So we are implementing undefined behavior in order to please
>> a broken hypervisor that relies on it? Then please state this in the
>> patch and probably also inform Xen about their issue.
>
> Why undefined behavior? We don't do anything specific for
> the single context invalidation case anyway, i.e. if the eptp matches what
> vmcs12 has, single context invalidation does fall through to the global
> invalidation case already. All this change does is add the "L1 calls
> invept after vmclear and before vmptrld" case to the list of cases that fall
> through to global invalidation, since nvmx doesn't have any knowledge of
> the current eptp in that situation.

OK, I think I misunderstood what the guest expects and how we currently
achieve this: we do not track the mapping between guest and host eptp,
thus cannot properly emulate its behaviour. We therefore need to flush
everything.

>
> Or do you think we should rethink this approach?

Well, I wonder if we should expose single-context invept support at all.

I'm also wondering if we are returning proper flags on

	if ((operand.eptp & eptp_mask) !=
			(nested_ept_get_cr3(vcpu) & eptp_mask))
		break;

Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
condition evaluates to true.

Jan
Jan Kiszka <jan.kiszka@web.de> writes:

> On 2014-03-22 17:43, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@web.de> writes:
>>
>>> On 2014-03-20 21:58, Bandan Das wrote:
>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>
>>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>>> falling through and syncing roots as a case where
>>>>>> context wide invalidation can't be supported
>>>>>
>>>>> Can we also base this behaviour on a statement in the SDM? But on first
>>>>> glance, I do not find anything like this over there.
>>>>
>>>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
>>>> "Operations that Invalidate Cached Mappings" does mention that
>>>> the instruction may invalidate mappings associated with other
>>>> EP4TAs (even in single context).
>>>
>>> Yes, "may". So we are implementing undefined behavior in order to please
>>> a broken hypervisor that relies on it? Then please state this in the
>>> patch and probably also inform Xen about their issue.
>>
>> Why undefined behavior? We don't do anything specific for
>> the single context invalidation case anyway, i.e. if the eptp matches what
>> vmcs12 has, single context invalidation does fall through to the global
>> invalidation case already. All this change does is add the "L1 calls
>> invept after vmclear and before vmptrld" case to the list of cases that fall
>> through to global invalidation, since nvmx doesn't have any knowledge of
>> the current eptp in that situation.
>
> OK, I think I misunderstood what the guest expects and how we currently
> achieve this: we do not track the mapping between guest and host eptp,
> thus cannot properly emulate its behaviour. We therefore need to flush
> everything.
>
>>
>> Or do you think we should rethink this approach?
>
> Well, I wonder if we should expose single-context invept support at all.
>
> I'm also wondering if we are returning proper flags on
>
> 	if ((operand.eptp & eptp_mask) !=
> 			(nested_ept_get_cr3(vcpu) & eptp_mask))
> 		break;

That does sound plausible, but then we would have to get rid of this
little optimization in the code you have quoted above. We would have
to flush and sync roots for all invept calls. So, my preference is to
keep it.

> Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
> condition evaluates to true.

It appears we should always call nested_vmx_succeed(). If you are ok,
I will send a v2.

Thanks,
Bandan

> Jan
On 2014-03-26 21:22, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@web.de> writes:
>
>> On 2014-03-22 17:43, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@web.de> writes:
>>>
>>>> On 2014-03-20 21:58, Bandan Das wrote:
>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>
>>>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>>>> falling through and syncing roots as a case where
>>>>>>> context wide invalidation can't be supported
>>>>>>
>>>>>> Can we also base this behaviour on a statement in the SDM? But on first
>>>>>> glance, I do not find anything like this over there.
>>>>>
>>>>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
>>>>> "Operations that Invalidate Cached Mappings" does mention that
>>>>> the instruction may invalidate mappings associated with other
>>>>> EP4TAs (even in single context).
>>>>
>>>> Yes, "may". So we are implementing undefined behavior in order to please
>>>> a broken hypervisor that relies on it? Then please state this in the
>>>> patch and probably also inform Xen about their issue.
>>>
>>> Why undefined behavior? We don't do anything specific for
>>> the single context invalidation case anyway, i.e. if the eptp matches what
>>> vmcs12 has, single context invalidation does fall through to the global
>>> invalidation case already. All this change does is add the "L1 calls
>>> invept after vmclear and before vmptrld" case to the list of cases that fall
>>> through to global invalidation, since nvmx doesn't have any knowledge of
>>> the current eptp in that situation.
>>
>> OK, I think I misunderstood what the guest expects and how we currently
>> achieve this: we do not track the mapping between guest and host eptp,
>> thus cannot properly emulate its behaviour. We therefore need to flush
>> everything.
>>
>>>
>>> Or do you think we should rethink this approach?
>>
>> Well, I wonder if we should expose single-context invept support at all.
>>
>> I'm also wondering if we are returning proper flags on
>>
>> 	if ((operand.eptp & eptp_mask) !=
>> 			(nested_ept_get_cr3(vcpu) & eptp_mask))
>> 		break;
>
> That does sound plausible, but then we would have to get rid of this
> little optimization in the code you have quoted above. We would have
> to flush and sync roots for all invept calls. So, my preference is to
> keep it.

Do you have any numbers on how many per-context invept calls we are able
to ignore? At least for KVM this should be 0, as we only flush on
changes to the current EPT structures.

Jan

>
>> Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
>> condition evaluates to true.
>
> It appears we should always call nested_vmx_succeed(). If you are ok,
> I will send a v2.
>
> Thanks,
> Bandan
>
>> Jan
Jan Kiszka <jan.kiszka@web.de> writes:

> On 2014-03-26 21:22, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@web.de> writes:
>>
>>> On 2014-03-22 17:43, Bandan Das wrote:
>>>> Jan Kiszka <jan.kiszka@web.de> writes:
>>>>
>>>>> On 2014-03-20 21:58, Bandan Das wrote:
>>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>>
>>>>>>> On 2014-03-20 04:28, Bandan Das wrote:
>>>>>>>> Some L1 hypervisors such as Xen seem to be calling invept after
>>>>>>>> vmclear or before vmptrld on L2. In this case, proceed with
>>>>>>>> falling through and syncing roots as a case where
>>>>>>>> context wide invalidation can't be supported
>>>>>>>
>>>>>>> Can we also base this behaviour on a statement in the SDM? But on first
>>>>>>> glance, I do not find anything like this over there.
>>>>>>
>>>>>> The SDM has nothing of this sort explicitly mentioned but 28.3.3.1
>>>>>> "Operations that Invalidate Cached Mappings" does mention that
>>>>>> the instruction may invalidate mappings associated with other
>>>>>> EP4TAs (even in single context).
>>>>>
>>>>> Yes, "may". So we are implementing undefined behavior in order to please
>>>>> a broken hypervisor that relies on it? Then please state this in the
>>>>> patch and probably also inform Xen about their issue.
>>>>
>>>> Why undefined behavior? We don't do anything specific for
>>>> the single context invalidation case anyway, i.e. if the eptp matches what
>>>> vmcs12 has, single context invalidation does fall through to the global
>>>> invalidation case already. All this change does is add the "L1 calls
>>>> invept after vmclear and before vmptrld" case to the list of cases that fall
>>>> through to global invalidation, since nvmx doesn't have any knowledge of
>>>> the current eptp in that situation.
>>>
>>> OK, I think I misunderstood what the guest expects and how we currently
>>> achieve this: we do not track the mapping between guest and host eptp,
>>> thus cannot properly emulate its behaviour. We therefore need to flush
>>> everything.
>>>
>>>>
>>>> Or do you think we should rethink this approach?
>>>
>>> Well, I wonder if we should expose single-context invept support at all.
>>>
>>> I'm also wondering if we are returning proper flags on
>>>
>>> 	if ((operand.eptp & eptp_mask) !=
>>> 			(nested_ept_get_cr3(vcpu) & eptp_mask))
>>> 		break;
>>
>> That does sound plausible, but then we would have to get rid of this
>> little optimization in the code you have quoted above. We would have
>> to flush and sync roots for all invept calls. So, my preference is to
>> keep it.
>
> Do you have any numbers on how many per-context invept calls we are able
> to ignore? At least for KVM this should be 0, as we only flush on
> changes to the current EPT structures.

Just instrumented a Debian L2 bootup under L1 Xen and there were 0
per-context invept calls (with a valid vmcs12). I will spin out a
version for testing where we don't advertise single-context invept
and, if everything works, put in a comment as to why we are removing
support.

Thanks,
Bandan

> Jan
>
>>
>>> Neither nested_vmx_succeed nor nested_vmx_fail* is called if this
>>> condition evaluates to true.
>>
>> It appears we should always call nested_vmx_succeed(). If you are ok,
>> I will send a v2.
>>
>> Thanks,
>> Bandan
>>
>>> Jan
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c707389..b407b3a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6435,8 +6435,9 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 
 	switch (type) {
 	case VMX_EPT_EXTENT_CONTEXT:
-		if ((operand.eptp & eptp_mask) !=
-				(nested_ept_get_cr3(vcpu) & eptp_mask))
+		if (get_vmcs12(vcpu) &&
+		    ((operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) &
+		    eptp_mask)))
 			break;
 	case VMX_EPT_EXTENT_GLOBAL:
 		kvm_mmu_sync_roots(vcpu);
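For readers following the discussion, the dispatch logic of the patched handle_invept() can be modeled as a small self-contained C sketch. This is illustrative only: the function name invept_syncs_roots, the boolean parameters, and the ~0xfff mask are our inventions (the kernel derives eptp_mask from the guest's physical-address width and operates on struct kvm_vcpu), but the control flow mirrors the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extent encodings, mirroring the VMX names used in the patch. */
enum { VMX_EPT_EXTENT_CONTEXT = 1, VMX_EPT_EXTENT_GLOBAL = 2 };

/* Returns true when the emulation falls through and syncs the EPT
 * roots (kvm_mmu_sync_roots() in the real code), false when the
 * request is ignored. */
bool invept_syncs_roots(int type, uint64_t operand_eptp,
                        bool vmcs12_loaded, uint64_t current_eptp)
{
	const uint64_t eptp_mask = ~0xfffull; /* compare EP4TA bits only (sketch) */

	switch (type) {
	case VMX_EPT_EXTENT_CONTEXT:
		/* Ignore the request only when a vmcs12 is loaded AND its
		 * EPTP differs from the operand. Without a loaded vmcs12
		 * (invept after vmclear / before vmptrld), the current
		 * context is unknown, so fall through conservatively. */
		if (vmcs12_loaded &&
		    (operand_eptp & eptp_mask) != (current_eptp & eptp_mask))
			return false;
		/* fall through */
	case VMX_EPT_EXTENT_GLOBAL:
		return true;
	default:
		return false;
	}
}
```

Per the review feedback above, a v2 would also call nested_vmx_succeed() on every exit path rather than returning without setting the VMX result flags.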
Some L1 hypervisors such as Xen seem to be calling invept after
vmclear or before vmptrld on L2. In this case, proceed with
falling through and syncing roots as a case where
context wide invalidation can't be supported

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
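A note on why the comparison in the patch masks both EPTPs: an EPTP is not just a pointer. Its low bits carry configuration (memory type in bits 2:0, EPT page-walk length minus one in bits 5:3), while the EP4TA, the physical address of the EPT PML4 table, occupies bits 12 and up. Two EPTPs naming the same translation context can therefore differ bit-for-bit. A hedged sketch (the macro names are ours, and ~0xfff stands in for the kernel's maxphyaddr-derived eptp_mask):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decode the EPTP configuration fields (layout per the VMX spec). */
#define EPTP_MEMTYPE(eptp)   ((eptp) & 0x7ull)            /* bits 2:0 */
#define EPTP_WALK_LEN(eptp)  ((((eptp) >> 3) & 0x7) + 1)  /* bits 5:3 hold length-1 */
#define EPTP_EP4TA(eptp)     ((eptp) & ~0xfffull)         /* bits 12+ (model mask) */

/* Two EPTPs refer to the same EPT context when their root addresses
 * match, regardless of the low configuration bits -- the same idea as
 * (operand.eptp & eptp_mask) != (nested_ept_get_cr3(vcpu) & eptp_mask)
 * in handle_invept(). */
bool same_ept_context(uint64_t a, uint64_t b)
{
	return EPTP_EP4TA(a) == EPTP_EP4TA(b);
}
```

This is why a raw 64-bit equality test on the EPTPs would spuriously treat a matching context as a mismatch whenever the configuration bits differ.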