
[2/9] KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support

Message ID 20240518000430.1118488-3-seanjc@google.com
State New
Series KVM: x86: Fixes for KVM_INTEL_PROVE_VE

Commit Message

Sean Christopherson May 18, 2024, 12:04 a.m. UTC
Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
enabled and a VE info address pointing at pfn 0.

Fixes: 8131cf5b4fd8 ("KVM: VMX: Introduce test mode related to EPT violation VE")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Huang, Kai May 20, 2024, 11:09 p.m. UTC | #1
On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> enabled and a VE info address pointing at pfn 0.

How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control 
unconditionally for vmcs02?  Your next patch says:

"
Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
are KVM's responsibility.
"

> 
> Fixes: 8131cf5b4fd8 ("KVM: VMX: Introduce test mode related to EPT violation VE")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/vmx/nested.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d5b832126e34..6798fadaa335 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2242,6 +2242,9 @@ static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
>   		vmcs_write64(EPT_POINTER,
>   			     construct_eptp(&vmx->vcpu, 0, PT64_ROOT_4LEVEL));
>   
> +	if (vmx->ve_info)
> +		vmcs_write64(VE_INFORMATION_ADDRESS, __pa(vmx->ve_info));
> +
>   	/* All VMFUNCs are currently emulated through L0 vmexits.  */
>   	if (cpu_has_vmx_vmfunc())
>   		vmcs_write64(VM_FUNCTION_CONTROL, 0);
Sean Christopherson May 20, 2024, 11:22 p.m. UTC | #2
On Tue, May 21, 2024, Kai Huang wrote:
> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> > Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> > initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> > enabled and a VE info address pointing at pfn 0.
> 
> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
> unconditionally for vmcs02?

Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
provides unique coverage.  Doing so definitely provides coverage beyond what is
strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
as it is to clear EPT_VIOLATION_VE, so why not.

> Your next patch says:
> 
> "
> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
> are KVM's responsibility.
> "

I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
while L2 is running.  That patch simply routes all #VEs to L0, it doesn't affect
whether or not it's safe to enable EPT Violation #VEs for L2.
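
For readers following along, the routing rule described above ("routes all #VEs
to L0") can be pictured with a small user-space model.  This is only a sketch
under stated assumptions, not KVM code: every toy_* name is hypothetical, and
the only architectural fact it relies on is that #VE is exception vector 20.

```c
#include <stdbool.h>
#include <stdint.h>

/* #VE (Virtualization Exception) is exception vector 20 on x86. */
#define TOY_VE_VECTOR 20u

/* Hypothetical model types; none of these names exist in KVM. */
enum toy_exit_owner { TOY_EXIT_TO_L0, TOY_EXIT_TO_L1 };

struct toy_exit {
	bool    is_exception;   /* the VM-Exit was caused by an exception */
	uint8_t vector;         /* exception vector, valid if is_exception */
	bool    l1_intercepts;  /* L1 asked to intercept this event */
};

/*
 * Decide who handles an exit taken while L2 is running.  A #VE is always
 * kept in L0, regardless of what L1 asked for, because in this model L1 is
 * never offered any #VE capability.  Everything else follows L1's request.
 */
static enum toy_exit_owner toy_route_exit(const struct toy_exit *e)
{
	if (e->is_exception && e->vector == TOY_VE_VECTOR)
		return TOY_EXIT_TO_L0;

	return e->l1_intercepts ? TOY_EXIT_TO_L1 : TOY_EXIT_TO_L0;
}
```

Note that the routing decision is independent of whether EPT Violation #VE is
enabled while L2 runs; the model only answers "who sees the exit".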
Huang, Kai May 20, 2024, 11:49 p.m. UTC | #3
On 21/05/2024 11:22 am, Sean Christopherson wrote:
> On Tue, May 21, 2024, Kai Huang wrote:
>> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
>>> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
>>> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
>>> enabled and a VE info address pointing at pfn 0.
>>
>> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
>> unconditionally for vmcs02?
> 
> Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
> evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
> provides unique coverage.  Doing so definitely provides coverage beyond what is
> strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
> as it is to clear EPT_VIOLATION_VE, so why not.
> 
>> Your next patch says:
>>
>> "
>> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
>> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
>> are KVM's responsibility.
>> "
> 
> I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
> while L2 is running.  That patch simply routes all #VEs to L0, it doesn't affect
> whether or not it's safe to enable EPT Violation #VEs for L2.

My logic is, if a #VE exit cannot possibly happen for L2, then we don't
need to deal with whether to route #VE exits to L1. :-)

Well, actually I think conceptually, it kinda makes sense to route #VE 
exits to L1:

L1 should never enable #VE related bits so L1 is certainly not expecting
to see #VE from L2.  But how to act should depend on L1's logic?
E.g., it can choose to ignore it, or just kill L2, etc.?

Unconditionally disabling #VE in vmcs02 can avoid such an issue because it's
just not possible for L2 to have the #VE exit.
Sean Christopherson May 21, 2024, 12:21 a.m. UTC | #4
On Tue, May 21, 2024, Kai Huang wrote:
> On 21/05/2024 11:22 am, Sean Christopherson wrote:
> > On Tue, May 21, 2024, Kai Huang wrote:
> > > On 18/05/2024 12:04 pm, Sean Christopherson wrote:
> > > > Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
> > > > initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
> > > > enabled and a VE info address pointing at pfn 0.
> > > 
> > > How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
> > > unconditionally for vmcs02?
> > 
> > Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
> > evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
> > provides unique coverage.  Doing so definitely provides coverage beyond what is
> > strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
> > as it is to clear EPT_VIOLATION_VE, so why not.
> > 
> > > Your next patch says:
> > > 
> > > "
> > > Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
> > > as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
> > > are KVM's responsibility.
> > > "
> > 
> > I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
> > while L2 is running.  That patch simply routes all #VEs to L0, it doesn't affect
> > whether or not it's safe to enable EPT Violation #VEs for L2.
> 
> My logic is, if a #VE exit cannot possibly happen for L2, then we don't need
> to deal with whether to route #VE exits to L1. :-)
> 
> Well, actually I think conceptually, it kinda makes sense to route #VE exits
> to L1:
> 
> L1 should never enable #VE related bits so L1 is certainly not expecting to

Not "should never", "can never".  If L1 attempts to enable EPT_VIOLATION_VE, then
VM-Enter will VM-Fail.

> see #VE from L2.  But how to act should depend on L1's logic? E.g., it
> can choose to ignore it, or just kill L2, etc.?

No.  Architecturally, from L1's perspective, a #VE VM-Exit _cannot_ occur in L2.
L1 can inject a #VE into L2, but a #VE cannot be generated by the CPU and thus
cannot cause a VM-Exit.

> Unconditionally disabling #VE in vmcs02 can avoid such an issue because it's just
> not possible for L2 to have the #VE exit.

Sure, but by that argument we could just avoid all nested VMX issues by never
enabling anything for L2.

If there's an argument to be made for disabling EPT_VIOLATION_VE in vmcs02, it's
that the potential maintenance cost of keeping nEPT, nVMX, and the shadow MMU
healthy outweighs the benefits.  I.e. we don't have a use case for enabling
EPT_VIOLATION_VE while L2 is running, so why validate it?

If whatever bug the KUT EPT test found ends up being a KVM bug that specifically only
affects nVMX, then it'd be worth revisiting whether or not it's worth enabling
EPT_VIOLATION_VE in vmcs02.  But that's a rather big "if" at this point.
Huang, Kai May 21, 2024, 12:42 a.m. UTC | #5
On 21/05/2024 12:21 pm, Sean Christopherson wrote:
> On Tue, May 21, 2024, Kai Huang wrote:
>> On 21/05/2024 11:22 am, Sean Christopherson wrote:
>>> On Tue, May 21, 2024, Kai Huang wrote:
>>>> On 18/05/2024 12:04 pm, Sean Christopherson wrote:
>>>>> Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when
>>>>> initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE
>>>>> enabled and a VE info address pointing at pfn 0.
>>>>
>>>> How about we just clear EPT_VIOLATION_VE bit in 2nd_exec_control
>>>> unconditionally for vmcs02?
>>>
>>> Because then KVM wouldn't get any EPT Violation #VE coverage for L2, and as
>>> evidenced by the KVM-Unit-Test failure, running L2 with EPT Violation #VEs enabled
>>> provides unique coverage.  Doing so definitely provides coverage beyond what is
>>> strictly needed for TDX, but it's just as easy to set the VE info page in vmcs02
>>> as it is to clear EPT_VIOLATION_VE, so why not.
>>>
>>>> Your next patch says:
>>>>
>>>> "
>>>> Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0,
>>>> as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs
>>>> are KVM's responsibility.
>>>> "
>>>
>>> I don't see how that's relevant to whether or not KVM enables EPT Violation #VEs
>>> while L2 is running.  That patch simply routes all #VEs to L0, it doesn't affect
>>> whether or not it's safe to enable EPT Violation #VEs for L2.
>>
>> My logic is, if a #VE exit cannot possibly happen for L2, then we don't need
>> to deal with whether to route #VE exits to L1. :-)
>>
>> Well, actually I think conceptually, it kinda makes sense to route #VE exits
>> to L1:
>>
>> L1 should never enable #VE related bits so L1 is certainly not expecting to
> 
> Not "should never", "can never".  If L1 attempts to enable EPT_VIOLATION_VE, then
> VM-Enter will VM-Fail.
> 
>> see #VE from L2.  But how to act should depend on L1's logic? E.g., it
>> can choose to ignore it, or just kill L2, etc.?
> 
> No.  Architecturally, from L1's perspective, a #VE VM-Exit _cannot_ occur in L2.
> L1 can inject a #VE into L2, but a #VE cannot be generated by the CPU and thus
> cannot cause a VM-Exit.

OK.  The point is not to argue about how L1 should handle it, but whether we
should inject it into L1 -- L1 can do whatever it believes is legal/sane.

But I understand the purpose is to test/validate, so it's fine for L0 to
handle it, and "handle" eventually means we just want to dump that #VE
exit.

But now L0 always handles #VE exits from L2, and AFAICT L0 will just 
kill the L1, until the patch:

	KVM: VMX: Don't kill the VM on an unexpected #VE

lands.

So it looks like that patch should at least be done first.  Otherwise it
doesn't make a lot of sense to kill L1 for #VE exits from L2.

> 
>> Unconditionally disabling #VE in vmcs02 can avoid such an issue because it's just
>> not possible for L2 to have the #VE exit.
> 
> Sure, but by that argument we could just avoid all nested VMX issues by never
> enabling anything for L2.
> 
> If there's an argument to be made for disabling EPT_VIOLATION_VE in vmcs02, it's
> that the potential maintenance cost of keeping nEPT, nVMX, and the shadow MMU
> healthy outweighs the benefits.  I.e. we don't have a use case for enabling
> EPT_VIOLATION_VE while L2 is running, so why validate it?

Yeah.  I am not sure of the purpose of validating #VE exits from L2.

> 
> If whatever bug the KUT EPT test found ends up being a KVM bug that specifically only
> affects nVMX, then it'd be worth revisiting whether or not it's worth enabling
> EPT_VIOLATION_VE in vmcs02.  But that's a rather big "if" at this point.

OK.
Sean Christopherson May 21, 2024, 1:02 a.m. UTC | #6
On Tue, May 21, 2024, Kai Huang wrote:
> But now L0 always handles #VE exits from L2, and AFAICT L0 will just kill
> the L1, until the patch:
> 
> 	KVM: VMX: Don't kill the VM on an unexpected #VE
> 
> lands.
> 
> So it looks like that patch should at least be done first.  Otherwise it
> doesn't make a lot of sense to kill L1 for #VE exits from L2.

I have no objection to changing the order.

Patch

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d5b832126e34..6798fadaa335 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2242,6 +2242,9 @@  static void prepare_vmcs02_constant_state(struct vcpu_vmx *vmx)
 		vmcs_write64(EPT_POINTER,
 			     construct_eptp(&vmx->vcpu, 0, PT64_ROOT_4LEVEL));
 
+	if (vmx->ve_info)
+		vmcs_write64(VE_INFORMATION_ADDRESS, __pa(vmx->ve_info));
+
 	/* All VMFUNCs are currently emulated through L0 vmexits.  */
 	if (cpu_has_vmx_vmfunc())
 		vmcs_write64(VM_FUNCTION_CONTROL, 0);
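
The effect of the three added lines can be illustrated with a small user-space
model.  This is a sketch under stated assumptions, not kernel code: the toy_*
types and helpers are hypothetical stand-ins (toy_pa() pretends that virtual
and physical addresses are identical, and a plain struct replaces the real
VMCS), and the with_fix flag exists only to compare behavior with and without
the hunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two vmcs02 fields that matter here. */
struct toy_vmcs {
	bool     ept_violation_ve;        /* models the EPT Violation #VE control */
	uint64_t ve_information_address;  /* models VE_INFORMATION_ADDRESS */
};

/* Hypothetical model of the vCPU; ve_info is NULL if no page was allocated. */
struct toy_vcpu {
	void *ve_info;
};

/* Stand-in for __pa(): assume an identity virtual-to-physical mapping. */
static uint64_t toy_pa(const void *va)
{
	return (uint64_t)(uintptr_t)va;
}

/*
 * Models only the part of the constant-state setup touched by the patch.
 * With with_fix == false the address field is left at its initial value.
 */
static void toy_prepare_vmcs02_constant_state(const struct toy_vcpu *vcpu,
					      struct toy_vmcs *vmcs02,
					      bool with_fix)
{
	if (with_fix && vcpu->ve_info)
		vmcs02->ve_information_address = toy_pa(vcpu->ve_info);
}

/* True if L2 would run with #VE enabled while the info address is in pfn 0. */
static bool toy_ve_info_points_at_pfn0(const struct toy_vmcs *vmcs02)
{
	return vmcs02->ept_violation_ve &&
	       (vmcs02->ve_information_address >> 12) == 0;
}
```

In the model, a vmcs02 that starts zeroed with the #VE control set reports the
"pfn 0" condition unless the setup function writes the info page address, which
mirrors the problem statement in the commit message.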