[v2,1/3] KVM: nVMX: Don't advertise single context invalidation for invept

Message ID 1396299625-8285-2-git-send-email-bsd@redhat.com (mailing list archive)
State New, archived
Commit Message

Bandan Das March 31, 2014, 9 p.m. UTC
For single context invalidation, we fall through to global
invalidation in handle_invept() except for one case - when
the operand supplied by L1 is different from what we have in
vmcs12. However, typically hypervisors will only call invept
for the currently loaded eptp, so the condition will
never be true.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

Comments

Marcelo Tosatti April 10, 2014, 8:47 p.m. UTC | #1
On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> For single context invalidation, we fall through to global
> invalidation in handle_invept() except for one case - when
> the operand supplied by L1 is different from what we have in
> vmcs12. However, typically hypervisors will only call invept
> for the currently loaded eptp, so the condition will
> never be true.
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>

Bandan,

Why not fix INVEPT single-context rather than removing it entirely?

"Single-context. If the INVEPT type is 1, the logical processor
invalidates all guest-physical mappings and combined mappings associated
with the EP4TA specified in the INVEPT descriptor. Combined mappings for
that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
may invalidate mappings associated with other EP4TAs.)"

So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
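
As a rough illustration of this suggestion, here is a userspace sketch of the type dispatch with the EPTP comparison dropped, so that single-context simply shares the global path. It is a model, not the kernel code: `model_vcpu`, `model_handle_invept()` and the counter are hypothetical stand-ins for the kvm_mmu_sync_roots(), kvm_mmu_flush_tlb() and nested_vmx_succeed() calls in the patch quoted below. The extent values 1 and 2 follow the INVEPT types in the SDM.

```c
#include <assert.h>
#include <stdbool.h>

/* INVEPT types as defined by the SDM: 1 = single-context, 2 = global. */
enum invept_extent {
	INVEPT_EXTENT_CONTEXT = 1,
	INVEPT_EXTENT_GLOBAL  = 2,
};

/* Hypothetical stand-in for the vcpu state touched by the real handler. */
struct model_vcpu {
	unsigned int flushes;	/* models sync roots + TLB flush */
	bool succeeded;		/* models nested_vmx_succeed() */
};

/*
 * Model of the dispatch with no EPTP comparison: a single-context
 * request is served by the same full flush as a global one.
 * Returns false for an extent that this model does not handle.
 */
static bool model_handle_invept(struct model_vcpu *vcpu, unsigned long type)
{
	switch (type) {
	case INVEPT_EXTENT_CONTEXT:
		/* fall through: over-invalidation is permitted by the spec */
	case INVEPT_EXTENT_GLOBAL:
		vcpu->flushes++;
		vcpu->succeeded = true;
		return true;
	default:
		return false;
	}
}
```

In this model every single-context request flushes and reports success, regardless of which EPTP the caller names.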

> ---
>  arch/x86/kvm/vmx.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3927528..3e7f60c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2331,12 +2331,11 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>  			 VMX_EPT_INVEPT_BIT;
>  		nested_vmx_ept_caps &= vmx_capability.ept;
>  		/*
> -		 * Since invept is completely emulated we support both global
> -		 * and context invalidation independent of what host cpu
> -		 * supports
> +		 * For nested guests, we don't do anything specific
> +		 * for single context invalidation. Hence, only advertise
> +		 * support for global context invalidation.
>  		 */
> -		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
> -			VMX_EPT_EXTENT_CONTEXT_BIT;
> +		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
>  	} else
>  		nested_vmx_ept_caps = 0;
>  
> @@ -6383,7 +6382,6 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  	struct {
>  		u64 eptp, gpa;
>  	} operand;
> -	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
>  
>  	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
>  	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
> @@ -6423,16 +6421,13 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  	}
>  
>  	switch (type) {
> -	case VMX_EPT_EXTENT_CONTEXT:
> -		if ((operand.eptp & eptp_mask) !=
> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
> -			break;
>  	case VMX_EPT_EXTENT_GLOBAL:
>  		kvm_mmu_sync_roots(vcpu);
>  		kvm_mmu_flush_tlb(vcpu);
>  		nested_vmx_succeed(vcpu);
>  		break;
>  	default:
> +		/* Trap single context invalidation invept calls */
>  		BUG_ON(1);
>  		break;
>  	}
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bandan Das April 11, 2014, 12:27 a.m. UTC | #2
Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>> For single context invalidation, we fall through to global
>> invalidation in handle_invept() except for one case - when
>> the operand supplied by L1 is different from what we have in
>> vmcs12. However, typically hypervisors will only call invept
>> for the currently loaded eptp, so the condition will
>> never be true.
>> 
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>
> Bandan,
>
> Why not fix INVEPT single-context rather than removing it entirely?
>
> "Single-context. If the INVEPT type is 1, the logical processor
> invalidates all guest-physical mappings and combined mappings associated
> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> may invalidate mappings associated with other EP4TAs.)"
>
> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.

The single context invalidation in handle_invept() doesn't do 
anything different. It just falls down to the global case.
And the invept code in Xen and KVM both seemed to fall back
to global invalidation if support for single context wasn't found.
So, it was proposed not to advertise it at all.

But rethinking this again, I agree with you. If there's a hypervisor
with a single context invept implementation that does not fallback,
this will unfortunately not work. Jan, do you agree with this?

Bandan

>> ---
>>  arch/x86/kvm/vmx.c | 15 +++++----------
>>  1 file changed, 5 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 3927528..3e7f60c 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -2331,12 +2331,11 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>>  			 VMX_EPT_INVEPT_BIT;
>>  		nested_vmx_ept_caps &= vmx_capability.ept;
>>  		/*
>> -		 * Since invept is completely emulated we support both global
>> -		 * and context invalidation independent of what host cpu
>> -		 * supports
>> +		 * For nested guests, we don't do anything specific
>> +		 * for single context invalidation. Hence, only advertise
>> +		 * support for global context invalidation.
>>  		 */
>> -		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
>> -			VMX_EPT_EXTENT_CONTEXT_BIT;
>> +		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
>>  	} else
>>  		nested_vmx_ept_caps = 0;
>>  
>> @@ -6383,7 +6382,6 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>  	struct {
>>  		u64 eptp, gpa;
>>  	} operand;
>> -	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
>>  
>>  	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
>>  	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
>> @@ -6423,16 +6421,13 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>  	}
>>  
>>  	switch (type) {
>> -	case VMX_EPT_EXTENT_CONTEXT:
>> -		if ((operand.eptp & eptp_mask) !=
>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>> -			break;
>>  	case VMX_EPT_EXTENT_GLOBAL:
>>  		kvm_mmu_sync_roots(vcpu);
>>  		kvm_mmu_flush_tlb(vcpu);
>>  		nested_vmx_succeed(vcpu);
>>  		break;
>>  	default:
>> +		/* Trap single context invalidation invept calls */
>>  		BUG_ON(1);
>>  		break;
>>  	}
>> -- 
>> 1.8.3.1
>> 
Jan Kiszka April 11, 2014, 6:22 a.m. UTC | #3
On 2014-04-11 02:27, Bandan Das wrote:
> Marcelo Tosatti <mtosatti@redhat.com> writes:
> 
>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>> For single context invalidation, we fall through to global
>>> invalidation in handle_invept() except for one case - when
>>> the operand supplied by L1 is different from what we have in
>>> vmcs12. However, typically hypervisors will only call invept
>>> for the currently loaded eptp, so the condition will
>>> never be true.
>>>
>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>
>> Bandan,
>>
>> Why not fix INVEPT single-context rather than removing it entirely?
>>
>> "Single-context. If the INVEPT type is 1, the logical processor
>> invalidates all guest-physical mappings and combined mappings associated
>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>> may invalidate mappings associated with other EP4TAs.)"
>>
>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> 
> The single context invalidation in handle_invept() doesn't do 
> anything different. It just falls down to the global case.
> And the invept code in Xen and KVM both seemed to fall back
> to global invalidation if support for single context wasn't found.
> So, it was proposed not to advertise it at all.
> 
> But rethinking this again, I agree with you. If there's a hypervisor
> with a single context invept implementation that does not fallback,
> this will unfortunately not work. Jan, do you agree with this?

A hypervisor that doesn't properly check the HW caps is just broken. And
one that mandates single context invalidation support is silly.

Jan
Bandan Das April 11, 2014, 5:26 p.m. UTC | #4
Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 02:27, Bandan Das wrote:
>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>> 
>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>> For single context invalidation, we fall through to global
>>>> invalidation in handle_invept() except for one case - when
>>>> the operand supplied by L1 is different from what we have in
>>>> vmcs12. However, typically hypervisors will only call invept
>>>> for the currently loaded eptp, so the condition will
>>>> never be true.
>>>>
>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>
>>> Bandan,
>>>
>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>
>>> "Single-context. If the INVEPT type is 1, the logical processor
>>> invalidates all guest-physical mappings and combined mappings associated
>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>> may invalidate mappings associated with other EP4TAs.)"
>>>
>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>> 
>> The single context invalidation in handle_invept() doesn't do 
>> anything different. It just falls down to the global case.
>> And the invept code in Xen and KVM both seemed to fall back
>> to global invalidation if support for single context wasn't found.
>> So, it was proposed not to advertise it at all.
>> 
>> But rethinking this again, I agree with you. If there's a hypervisor
>> with a single context invept implementation that does not fallback,
>> this will unfortunately not work. Jan, do you agree with this?
>
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.

Well, but we could make life a little bit easier for the unfortunate user
using the broken hypervisor :) And advertising single context invalidation
doesn't really seem to have any downsides.

> Jan
Jan Kiszka April 11, 2014, 6:01 p.m. UTC | #5
On 2014-04-11 19:26, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 2014-04-11 02:27, Bandan Das wrote:
>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>
>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>> For single context invalidation, we fall through to global
>>>>> invalidation in handle_invept() except for one case - when
>>>>> the operand supplied by L1 is different from what we have in
>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>> for the currently loaded eptp, so the condition will
>>>>> never be true.
>>>>>
>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>
>>>> Bandan,
>>>>
>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>
>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>> invalidates all guest-physical mappings and combined mappings associated
>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>
>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>
>>> The single context invalidation in handle_invept() doesn't do 
>>> anything different. It just falls down to the global case.
>>> And the invept code in Xen and KVM both seemed to fall back
>>> to global invalidation if support for single context wasn't found.
>>> So, it was proposed not to advertise it at all.
>>>
>>> But rethinking this again, I agree with you. If there's a hypervisor
>>> with a single context invept implementation that does not fallback,
>>> this will unfortunately not work. Jan, do you agree with this?
>>
>> A hypervisor that doesn't properly check the HW caps is just broken. And
>> one that mandates single context invalidation support is silly.
> 
> Well, but we could make life a little bit easier for the unfortunate user
> using the broken hypervisor :) And advertising single context invalidation
> doesn't really seem to have any downsides.

Ok, let's try it this way: single-context invalidation is inherently
tied to VPID support (that's how you address a context). However, KVM
does not expose VPID to its guest. So this discussion is moot: no
hypervisor will make use of this feature as it has no means to fill in
the required parameter.

Once we start supporting VPID, we can also think about how to address
single-context invalidation reasonably.

Jan
Bandan Das April 11, 2014, 6:35 p.m. UTC | #6
Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 19:26, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>
>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>> For single context invalidation, we fall through to global
>>>>>> invalidation in handle_invept() except for one case - when
>>>>>> the operand supplied by L1 is different from what we have in
>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>> for the currently loaded eptp, so the condition will
>>>>>> never be true.
>>>>>>
>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>
>>>>> Bandan,
>>>>>
>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>
>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>
>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>
>>>> The single context invalidation in handle_invept() doesn't do 
>>>> anything different. It just falls down to the global case.
>>>> And the invept code in Xen and KVM both seemed to fall back
>>>> to global invalidation if support for single context wasn't found.
>>>> So, it was proposed not to advertise it at all.
>>>>
>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>> with a single context invept implementation that does not fallback,
>>>> this will unfortunately not work. Jan, do you agree with this?
>>>
>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>> one that mandates single context invalidation support is silly.
>> 
>> Well, but we could make life a little bit easier for the unfortunate user
>> using the broken hypervisor :) And advertising single context invalidation
>> doesn't really seem to have any downsides.
>
> Ok, let's try it this way: single-context invalidation is inherently
> tied to VPID support (that's how you address a context). However, KVM
> does not expose VPID to its guest. So this discussion is moot: no
> hypervisor will make use of this feature as it has no means to fill in
> the required parameter.

I thought (from the spec) invept single context invalidation
takes the EP4TA as the second argument. invvpid single context
however takes the VPID as its descriptor.
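
To make the difference between the two descriptors concrete, here is a small sketch of their 128-bit layouts as I read them in the SDM. The struct and field names are my own, chosen for illustration; only the bit positions come from the spec. The INVEPT layout lines up with the `u64 eptp, gpa` operand struct in handle_invept().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* INVEPT descriptor: bits 63:0 hold the EPTP, bits 127:64 are reserved. */
struct invept_desc {
	uint64_t eptp;
	uint64_t reserved;	/* must be zero */
};

/* INVVPID descriptor: bits 15:0 VPID, 63:16 reserved, 127:64 linear address. */
struct invvpid_desc {
	uint16_t vpid;
	uint16_t reserved[3];	/* must be zero */
	uint64_t linear_addr;
};

/* Both descriptors are exactly 128 bits wide. */
_Static_assert(sizeof(struct invept_desc) == 16, "INVEPT descriptor is 16 bytes");
_Static_assert(sizeof(struct invvpid_desc) == 16, "INVVPID descriptor is 16 bytes");
```

So a single-context invept only needs an EPTP value, which L1 always has, while the VPID field exists only in the invvpid descriptor.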

The Xen L1 hypervisor was actually calling single context invept
multiple times. That's how I hit this bug.

> Once we start supporting VPID, we can also think about how to address
> single-context invalidation reasonably.
>
> Jan
Marcelo Tosatti April 11, 2014, 6:48 p.m. UTC | #7
On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
> On 2014-04-11 02:27, Bandan Das wrote:
> > Marcelo Tosatti <mtosatti@redhat.com> writes:
> > 
> >> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> >>> For single context invalidation, we fall through to global
> >>> invalidation in handle_invept() except for one case - when
> >>> the operand supplied by L1 is different from what we have in
> >>> vmcs12. However, typically hypervisors will only call invept
> >>> for the currently loaded eptp, so the condition will
> >>> never be true.
> >>>
> >>> Signed-off-by: Bandan Das <bsd@redhat.com>
> >>
> >> Bandan,
> >>
> >> Why not fix INVEPT single-context rather than removing it entirely?
> >>
> >> "Single-context. If the INVEPT type is 1, the logical processor
> >> invalidates all guest-physical mappings and combined mappings associated
> >> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> >> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> >> may invalidate mappings associated with other EP4TAs.)"
> >>
> >> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> > 
> > The single context invalidation in handle_invept() doesn't do 
> > anything different. It just falls down to the global case.
> > And the invept code in Xen and KVM both seemed to fall back
> > to global invalidation if support for single context wasn't found.
> > So, it was proposed not to advertise it at all.
> > 
> > But rethinking this again, I agree with you. If there's a hypervisor
> > with a single context invept implementation that does not fallback,

What do you mean by "does not fallback"? The hypervisor cannot detect
fallback because:

"(The instruction may invalidate mappings associated with other EP4TAs.)"

So the spec says single context can behave as global context (similar
to TLB entries and INVLPG).

So it is valid to implement single context as global context.

> > this will unfortunately not work. Jan, do you agree with this?
> 
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.
> 
> Jan

I imagined Xen broke because of KVM's broken implementation of INVEPT
single context (so that should be fixed).

If Xen still fails for some reason with a proper implementation of INVEPT
single context in KVM, we would have to understand why it is failing.

Jan Kiszka April 11, 2014, 6:53 p.m. UTC | #8
On 2014-04-11 20:35, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 2014-04-11 19:26, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>
>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>> For single context invalidation, we fall through to global
>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>> never be true.
>>>>>>>
>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>
>>>>>> Bandan,
>>>>>>
>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>
>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>
>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>
>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>> anything different. It just falls down to the global case.
>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>> to global invalidation if support for single context wasn't found.
>>>>> So, it was proposed not to advertise it at all.
>>>>>
>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>> with a single context invept implementation that does not fallback,
>>>>> this will unfortunately not work. Jan, do you agree with this?
>>>>
>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>> one that mandates single context invalidation support is silly.
>>>
>>> Well, but we could make life a little bit easier for the unfortunate user
>>> using the broken hypervisor :) And advertising single context invalidation
>>> doesn't really seem to have any downsides.
>>
>> Ok, let's try it this way: single-context invalidation is inherently
>> tied to VPID support (that's how you address a context). However, KVM
>> does not expose VPID to its guest. So this discussion is moot: no
>> hypervisor will make use of this feature as it has no means to fill in
>> the required parameter.
> 
> I thought (from the spec) invept single context invalidation
> takes the EP4TA as the second argument. invvpid single context
> however takes the VPID as its descriptor.

Oops, invept/invvpid mess-up while re-reading the spec - sorry.

> 
> The Xen L1 hypervisor was actually calling single context invept
> multiple times. That's how I hit this bug.

...and it's no longer doing it now, I suppose. The question remains,
which hypervisor we want to cater to with a
"single-context-that-is-current-context" invalidation (that is my
understanding of Marcelo's proposal). On the other hand, if some
hypervisor actually uses invept to invalidate a non-current mapping, we
would regress compared to not exposing single context invept. Hope I got
this conclusion right. ;)

Jan
Marcelo Tosatti April 11, 2014, 7:02 p.m. UTC | #9
On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
> > But rethinking this again, I agree with you. If there's a hypervisor
> > with a single context invept implementation that does not fallback,
> > this will unfortunately not work. Jan, do you agree with this?
> 
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.

Is this a justification for removing INVEPT single-context until it 
is implemented as single-context?

Bandan Das April 11, 2014, 7:33 p.m. UTC | #10
Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
>> On 2014-04-11 02:27, Bandan Das wrote:
>> > Marcelo Tosatti <mtosatti@redhat.com> writes:
>> > 
>> >> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>> >>> For single context invalidation, we fall through to global
>> >>> invalidation in handle_invept() except for one case - when
>> >>> the operand supplied by L1 is different from what we have in
>> >>> vmcs12. However, typically hypervisors will only call invept
>> >>> for the currently loaded eptp, so the condition will
>> >>> never be true.
>> >>>
>> >>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> >>
>> >> Bandan,
>> >>
>> >> Why not fix INVEPT single-context rather than removing it entirely?
>> >>
>> >> "Single-context. If the INVEPT type is 1, the logical processor
>> >> invalidates all guest-physical mappings and combined mappings associated
>> >> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>> >> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>> >> may invalidate mappings associated with other EP4TAs.)"
>> >>
>> >> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>> > 
>> > The single context invalidation in handle_invept() doesn't do 
>> > anything different. It just falls down to the global case.
>> > And the invept code in Xen and KVM both seemed to fall back
>> > to global invalidation if support for single context wasn't found.
>> > So, it was proposed not to advertise it at all.
>> > 
>> > But rethinking this again, I agree with you. If there's a hypervisor
>> > with a single context invept implementation that does not fallback,
>
> What do you mean by "does not fallback"? The hypervisor cannot detect
> fallback because:
>
> "(The instruction may invalidate mappings associated with other EP4TAs.)"
>
> So the spec says single context can behave as global context (similar
> to TLB entries and INVLPG).
>
> So it is valid to implement single context as global context.

I meant if single context invalidation isn't supported,
the hypervisor falls back to global invalidation like in kvm -

static inline void ept_sync_context(u64 eptp)
{
...
		if (cpu_has_vmx_invept_context())
			__invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
		else
			ept_sync_global();
...
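
The same fallback decision can be written as a standalone sketch. The capability bit positions (25 for single-context, 26 for global) are the ones the SDM gives for IA32_VMX_EPT_VPID_CAP; `pick_invept_type()` and the macro names are hypothetical, made up only to show what a hypervisor that checks the caps would do.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in IA32_VMX_EPT_VPID_CAP, per the SDM. */
#define EPT_CAP_EXTENT_CONTEXT	(1ull << 25)	/* single-context INVEPT */
#define EPT_CAP_EXTENT_GLOBAL	(1ull << 26)	/* all-context INVEPT */

#define INVEPT_TYPE_CONTEXT	1
#define INVEPT_TYPE_GLOBAL	2

/*
 * Hypothetical helper: the INVEPT type a hypervisor would issue for a
 * per-context flush, given the advertised capabilities. Returns -1 if
 * neither extent is available.
 */
static int pick_invept_type(uint64_t ept_caps)
{
	if (ept_caps & EPT_CAP_EXTENT_CONTEXT)
		return INVEPT_TYPE_CONTEXT;
	if (ept_caps & EPT_CAP_EXTENT_GLOBAL)
		return INVEPT_TYPE_GLOBAL;
	return -1;
}
```

With only the global bit advertised, such a hypervisor never issues type 1 at all, which is the assumption the patch relies on.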

>> > this will unfortunately not work. Jan, do you agree with this?
>> 
>> A hypervisor that doesn't properly check the HW caps is just broken. And
>> one that mandates single context invalidation support is silly.
>> 
>> Jan
>
> I imagined Xen broke because of KVM's broken implementation of INVEPT
> single context (so that should be fixed).

It's failing because of this check in handle_invept -
	if ((operand.eptp & eptp_mask) !=
			(nested_ept_get_cr3(vcpu) & eptp_mask))
		break;
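
For reference, this is roughly what that comparison computes, as a userspace sketch. It assumes 4 KiB pages, so `MODEL_PAGE_MASK` stands in for the kernel's PAGE_MASK, and `eptp_same_root()` is a made-up name. The mask expression itself is the one from the line the patch removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_MASK	(~0xfffull)	/* assumes 4 KiB pages */

/* Same expression as in handle_invept(): keeps address bits 12..50. */
static const uint64_t eptp_mask = ((1ull << 51) - 1) & MODEL_PAGE_MASK;

/* True when both EPTP values name the same EPT root, ignoring low bits. */
static bool eptp_same_root(uint64_t a, uint64_t b)
{
	return (a & eptp_mask) == (b & eptp_mask);
}
```

When the two roots differ, the handler takes the early break, so it skips both the flush and nested_vmx_succeed().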

The problem is that invept can get called even after a vmclear, and Jan
pointed out that there's probably no case where this if will
evaluate to true (at least not for kvm/xen).

> If Xen still fails for some reason with a proper implementation of INVEPT
> single context in KVM, we would have to understand why it is failing.

The argument was that since kvm doesn't do anything different
for single context invalidation, does it make sense to not advertise
it at all, assuming that the above snippet of invept code is used
by all hypervisors?

Marcelo Tosatti April 11, 2014, 7:35 p.m. UTC | #11
On Fri, Apr 11, 2014 at 08:53:09PM +0200, Jan Kiszka wrote:
> On 2014-04-11 20:35, Bandan Das wrote:
> > Jan Kiszka <jan.kiszka@siemens.com> writes:
> > 
> >> On 2014-04-11 19:26, Bandan Das wrote:
> >>> Jan Kiszka <jan.kiszka@siemens.com> writes:
> >>>
> >>>> On 2014-04-11 02:27, Bandan Das wrote:
> >>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
> >>>>>
> >>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> >>>>>>> For single context invalidation, we fall through to global
> >>>>>>> invalidation in handle_invept() except for one case - when
> >>>>>>> the operand supplied by L1 is different from what we have in
> >>>>>>> vmcs12. However, typically hypervisors will only call invept
> >>>>>>> for the currently loaded eptp, so the condition will
> >>>>>>> never be true.
> >>>>>>>
> >>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
> >>>>>>
> >>>>>> Bandan,
> >>>>>>
> >>>>>> Why not fix INVEPT single-context rather than removing it entirely?
> >>>>>>
> >>>>>> "Single-context. If the INVEPT type is 1, the logical processor
> >>>>>> invalidates all guest-physical mappings and combined mappings associated
> >>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> >>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> >>>>>> may invalidate mappings associated with other EP4TAs.)"
> >>>>>>
> >>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> >>>>>
> >>>>> The single context invalidation in handle_invept() doesn't do 
> >>>>> anything different. It just falls down to the global case.
> >>>>> And the invept code in Xen and KVM both seemed to fall back
> >>>>> to global invalidation if support for single context wasn't found.
> >>>>> So, it was proposed not to advertise it at all.
> >>>>>
> >>>>> But rethinking this again, I agree with you. If there's a hypervisor
> >>>>> with a single context invept implementation that does not fallback,
> >>>>> this will unfortunately not work. Jan, do you agree with this?
> >>>>
> >>>> A hypervisor that doesn't properly check the HW caps is just broken. And
> >>>> one that mandates single context invalidation support is silly.
> >>>
> >>> Well, but we could make life a little bit easier for the unfortunate user
> >>> using the broken hypervisor :) And advertising single context invalidation
> >>> doesn't really seem to have any downsides.
> >>
> >> Ok, let's try it this way: single-context invalidation is inherently
> >> tied to VPID support (that's how you address a context). However, KVM
> >> does not expose VPID to its guest. So this discussion is moot: no
> >> hypervisor will make use of this feature as it has no means to fill in
> >> the required parameter.
> > 
> > I thought (from the spec) invept single context invalidation
> > takes the EP4TA as the second argument. invvpid single context
> > however takes the VPID as its descriptor.
> 
> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
> 
> > 
> > The Xen L1 hypervisor was actually calling single context invept
> > multiple times. That's how I hit this bug.
> 
> ...and it's no longer doing it now, I suppose. The question remains,
> which hypervisor we want to cater to with a
> "single-context-that-is-current-context" invalidation (that is my
> understanding of Marcelo's proposal). 

My proposal is to implement what is in the spec.

> On the other hand, if some hypervisor actually uses invept to
> invalidate a non-current mapping, we would regress compared to not
> exposing single context invept. Hope I got this conclusion right. ;)

In that case INVEPT global would also be broken.

Bandan Das April 11, 2014, 7:38 p.m. UTC | #12
Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 20:35, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 2014-04-11 19:26, Bandan Das wrote:
>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>
>>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>>
>>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>>> For single context invalidation, we fall through to global
>>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>>> never be true.
>>>>>>>>
>>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>>
>>>>>>> Bandan,
>>>>>>>
>>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>>
>>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>>
>>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>>
>>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>>> anything different. It just falls down to the global case.
>>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>>> to global invalidation if support for single context wasn't found.
>>>>>> So, it was proposed not to advertise it at all.
>>>>>>
>>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>>> with a single context invept implementation that does not fall back,
>>>>>> this will unfortunately not work. Jan, do you agree with this?
>>>>>
>>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>>> one that mandates single context invalidation support is silly.
>>>>
>>>> Well, but we could make life a little bit easier for the unfortunate user
>>>> using the broken hypervisor :) And advertising single context invalidation
>>>> doesn't really seem to have any downsides.
>>>
>>> Ok, let's try it this way: single-context invalidation is inherently
>>> tied to VPID support (that's how you address a context). However, KVM
>>> does not expose VPID to its guest. So this discussion is moot: no
>>> hypervisor will make use of this feature as it has no means to fill in
>>> the required parameter.
>> 
>> I thought (from the spec) invept single context invalidation
>> takes the EP4TA as the second argument. invvpid single context
>> however takes the VPID as its descriptor.
>
> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
>
>> 
>> The Xen L1 hypervisor was actually calling single context invept
>> multiple times. That's how I hit this bug.
>
> ...and it's no longer doing it now, I suppose. The question remains,
Yes.

> which hypervisor we want to cater to with a
> "single-context-that-is-current-context" invalidation (that is my
> understanding of Marcelo's proposal). On the other hand, if some
> hypervisor actually uses invept to invalidate a non-current mapping, we
> would regress compared to not exposing single context invept. Hope I got
> this conclusion right. ;)

Yep, not sure if this holds true for any hypervisor. I traced this change
back to http://www.spinics.net/lists/kvm/msg94802.html but the
conversation there doesn't mention the reasoning.

> Jan
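
For reference, the invept/invvpid mix-up above comes down to what each
instruction's 128-bit descriptor carries. Below is a minimal sketch of the
two layouts as I read them from the SDM; the struct and helper names are
made up for illustration and are not taken from vmx.c (handle_invept()
itself reads the operand as two u64s named eptp and gpa).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* INVEPT descriptor: bits 63:0 hold the EPTP, bits 127:64 are reserved. */
struct invept_desc {
	uint64_t eptp;
	uint64_t reserved;	/* expected to be zero */
};

/* INVVPID descriptor: VPID in bits 15:0, linear address in bits 127:64. */
struct invvpid_desc {
	uint16_t vpid;
	uint16_t reserved[3];	/* bits 63:16, expected to be zero */
	uint64_t linear_addr;
};

/* Both descriptors are 16 bytes; the address sits in the upper half. */
static_assert(sizeof(struct invept_desc) == 16, "INVEPT descriptor size");
static_assert(sizeof(struct invvpid_desc) == 16, "INVVPID descriptor size");
static_assert(offsetof(struct invvpid_desc, linear_addr) == 8,
	      "linear address offset");

/* Hypothetical helper: build the operand for a single-context INVEPT. */
static struct invept_desc make_invept_desc(uint64_t eptp)
{
	struct invept_desc d = { .eptp = eptp, .reserved = 0 };
	return d;
}
```

So a single-context invept is addressed purely by the EPTP; no VPID is
involved, which is why it can be used by an L1 that has no VPID support.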
Jan Kiszka April 14, 2014, 5:46 a.m. UTC | #13
On 2014-04-11 21:35, Marcelo Tosatti wrote:
> On Fri, Apr 11, 2014 at 08:53:09PM +0200, Jan Kiszka wrote:
>> On 2014-04-11 20:35, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 2014-04-11 19:26, Bandan Das wrote:
>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>
>>>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>>>
>>>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>>>> For single context invalidation, we fall through to global
>>>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>>>> never be true.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>>>
>>>>>>>> Bandan,
>>>>>>>>
>>>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>>>
>>>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>>>
>>>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>>>
>>>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>>>> anything different. It just falls down to the global case.
>>>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>>>> to global invalidation if support for single context wasn't found.
>>>>>>> So, it was proposed not to advertise it at all.
>>>>>>>
>>>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>>>> with a single context invept implementation that does not fall back,
>>>>>>> this will unfortunately not work. Jan, do you agree with this?
>>>>>>
>>>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>>>> one that mandates single context invalidation support is silly.
>>>>>
>>>>> Well, but we could make life a little bit easier for the unfortunate user
>>>>> using the broken hypervisor :) And advertising single context invalidation
>>>>> doesn't really seem to have any downsides.
>>>>
>>>> Ok, let's try it this way: single-context invalidation is inherently
>>>> tied to VPID support (that's how you address a context). However, KVM
>>>> does not expose VPID to its guest. So this discussion is moot: no
>>>> hypervisor will make use of this feature as it has no means to fill in
>>>> the required parameter.
>>>
>>> I thought (from the spec) invept single context invalidation
>>> takes the EP4TA as the second argument. invvpid single context
>>> however takes the VPID as its descriptor.
>>
>> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
>>
>>>
>>> The Xen L1 hypervisor was actually calling single context invept
>>> multiple times. That's how I hit this bug.
>>
>> ...and it's no longer doing it now, I suppose. The question remains,
>> which hypervisor we want to cater to with a
>> "single-context-that-is-current-context" invalidation (that is my
>> understanding of Marcelo's proposal). 
> 
> My proposal is to implement what is in the spec.
> 
>> On the other hand, if some hypervisor actually uses invept to
>> invalidate a non-current mapping, we would regress compared to not
>> exposing single context invept. Hope I got this conclusion right. ;)
> 
> In that case INVEPT global would also be broken.

I'm all for having proper invept single-context support, but that,
first of all, requires tracking the vEPTP->EPTP mappings.

Jan

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3927528..3e7f60c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2331,12 +2331,11 @@  static __init void nested_vmx_setup_ctls_msrs(void)
 			 VMX_EPT_INVEPT_BIT;
 		nested_vmx_ept_caps &= vmx_capability.ept;
 		/*
-		 * Since invept is completely emulated we support both global
-		 * and context invalidation independent of what host cpu
-		 * supports
+		 * For nested guests, we don't do anything specific
+		 * for single context invalidation. Hence, only advertise
+		 * support for global context invalidation.
 		 */
-		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
-			VMX_EPT_EXTENT_CONTEXT_BIT;
+		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
 	} else
 		nested_vmx_ept_caps = 0;
 
@@ -6383,7 +6382,6 @@  static int handle_invept(struct kvm_vcpu *vcpu)
 	struct {
 		u64 eptp, gpa;
 	} operand;
-	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
 
 	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
 	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
@@ -6423,16 +6421,13 @@  static int handle_invept(struct kvm_vcpu *vcpu)
 	}
 
 	switch (type) {
-	case VMX_EPT_EXTENT_CONTEXT:
-		if ((operand.eptp & eptp_mask) !=
-				(nested_ept_get_cr3(vcpu) & eptp_mask))
-			break;
 	case VMX_EPT_EXTENT_GLOBAL:
 		kvm_mmu_sync_roots(vcpu);
 		kvm_mmu_flush_tlb(vcpu);
 		nested_vmx_succeed(vcpu);
 		break;
 	default:
+		/* Trap single context invalidation invept calls */
 		BUG_ON(1);
 		break;
 	}
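
To make the two behaviours discussed in the thread easier to compare, here
is a small user-space model of the switch in handle_invept(). It is only a
sketch under assumptions: the enum and function names are invented for
illustration, PAGE_SHIFT is assumed to be 12, and the returned action stands
in for the kvm_mmu_sync_roots()/kvm_mmu_flush_tlb() calls rather than
performing them. The first function mirrors the code this patch removes; the
second is my reading of Marcelo's suggestion to drop the EPTP comparison.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define PAGE_MASK	(~((1ULL << PAGE_SHIFT) - 1))
/* Same expression as the eptp_mask removed by the patch: bits 50:12. */
#define EPTP_ADDR_MASK	(((1ULL << 51) - 1) & PAGE_MASK)

static_assert(EPTP_ADDR_MASK == 0x0007fffffffff000ULL, "mask covers bits 50:12");

enum invept_type { INVEPT_CONTEXT = 1, INVEPT_GLOBAL = 2 };
enum invept_action { ACT_NONE, ACT_FLUSH_ALL, ACT_UNSUPPORTED };

/* Pre-patch logic: single context only flushes if it names the current EPTP. */
static enum invept_action invept_pre_patch(unsigned int type,
					   uint64_t operand_eptp,
					   uint64_t current_eptp)
{
	switch (type) {
	case INVEPT_CONTEXT:
		if ((operand_eptp & EPTP_ADDR_MASK) !=
		    (current_eptp & EPTP_ADDR_MASK))
			return ACT_NONE;
		/* fall through */
	case INVEPT_GLOBAL:
		return ACT_FLUSH_ALL;
	default:
		return ACT_UNSUPPORTED;
	}
}

/* Suggested alternative: treat single context exactly like global. */
static enum invept_action invept_always_flush(unsigned int type)
{
	switch (type) {
	case INVEPT_CONTEXT:
	case INVEPT_GLOBAL:
		return ACT_FLUSH_ALL;
	default:
		return ACT_UNSUPPORTED;
	}
}
```

With the pre-patch logic, a single-context request for a non-current EPTP
takes no action, whereas the alternative over-invalidates, which the SDM
text quoted above explicitly permits.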