diff mbox series

[v2,3/3] x86/vmx: Disallow the use of inactivity states

Message ID 20240111231323.4043461-4-andrew.cooper3@citrix.com (mailing list archive)
State New
Headers show
Series x86/vmx: Multiple fixes | expand

Commit Message

Andrew Cooper Jan. 11, 2024, 11:13 p.m. UTC
Right now, vvmx will blindly copy L12's ACTIVITY_STATE into the L02 VMCS and
enter the vCPU.  Luckily for us, nested-virt is explicitly unsupported for
security bugs.

The inactivity states are HLT, SHUTDOWN and WAIT-FOR-SIPI, and as noted by the
SDM in Vol3 27.7 "Special Features of VM Entry":

  If VM entry ends with the logical processor in an inactive activity state,
  the VM entry generates any special bus cycle that is normally generated when
  that activity state is entered from the active state.

Also,

  Some activity states unconditionally block certain events.

I.e. A VMEntry with ACTIVITY=SHUTDOWN will initiate a platform reset, while a
VMEntry with ACTIVITY=WAIT-FOR-SIPI will really block everything other than
SIPIs.

Both of these activity states are for the TXT ACM to use, not for regular
hypervisors, and Xen doesn't support dropping the HLT intercept either.

There are two paths in Xen which operate on ACTIVITY_STATE.

1) The vmx_{get,set}_nonreg_state() helpers for VM-Fork.

   As regular VMs can't use any inactivity states, this is just duplicating
   the 0 from construct_vmcs().  Retain the ability to query activity_state,
   but crash the domain on any attempt to set an inactivity state.

2) Nested virt, because of ACTIVITY_STATE in vmcs_gstate_field[].

   Explicitly hide the inactivity states in the guest's view of MSR_VMX_MISC,
   and remove ACTIVITY_STATE from vmcs_gstate_field[].

   In virtual_vmentry(), we should trigger a VMEntry failure for the use of
   any inactivity states, but there's no support for that in the code at all
   so leave a TODO for when we finally start working on nested-virt in
   earnest.

Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
CC: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp>
CC: Takahiro Shinagawa <shina@ecc.u-tokyo.ac.jp>
CC: George Dunlap <george.dunlap@citrix.com>

v2:
 * Retain the ability to query ACTIVITY_STATE in the vmfork helpers, but veto
   attempts to set an inactivity state.

Note, entirely untested.
---
 xen/arch/x86/hvm/vmx/vmx.c              | 5 ++++-
 xen/arch/x86/hvm/vmx/vvmx.c             | 9 +++++++--
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h | 1 +
 3 files changed, 12 insertions(+), 3 deletions(-)

Comments

Jan Beulich Jan. 12, 2024, 10:37 a.m. UTC | #1
On 12.01.2024 00:13, Andrew Cooper wrote:
> Right now, vvmx will blindly copy L12's ACTIVITY_STATE into the L02 VMCS and
> enter the vCPU.  Luckily for us, nested-virt is explicitly unsupported for
> security bugs.
> 
> The inactivity states are HLT, SHUTDOWN and WAIT-FOR-SIPI, and as noted by the
> SDM in Vol3 27.7 "Special Features of VM Entry":
> 
>   If VM entry ends with the logical processor in an inactive activity state,
>   the VM entry generates any special bus cycle that is normally generated when
>   that activity state is entered from the active state.
> 
> Also,
> 
>   Some activity states unconditionally block certain events.
> 
> I.e. A VMEntry with ACTIVITY=SHUTDOWN will initiate a platform reset, while a
> VMEntry with ACTIVITY=WAIT-FOR-SIPI will really block everything other than
> SIPIs.
> 
> Both of these activity states are for the TXT ACM to use, not for regular
> hypervisors, and Xen doesn't support dropping the HLT intercept either.
> 
> There are two paths in Xen which operate on ACTIVITY_STATE.
> 
> 1) The vmx_{get,set}_nonreg_state() helpers for VM-Fork.
> 
>    As regular VMs can't use any inactivity states, this is just duplicating
>    the 0 from construct_vmcs().  Retain the ability to query activity_state,
>    but crash the domain on any attempt to set an inactivity state.
> 
> 2) Nested virt, because of ACTIVITY_STATE in vmcs_gstate_field[].
> 
>    Explicitly hide the inactivity states in the guest's view of MSR_VMX_MISC,
>    and remove ACTIVITY_STATE from vmcs_gstate_field[].
> 
>    In virtual_vmentry(), we should trigger a VMEntry failure for the use of
>    any inactivity states, but there's no support for that in the code at all
>    so leave a TODO for when we finally start working on nested-virt in
>    earnest.
> 
> Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark/suggestion:

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1551,7 +1551,10 @@ static void cf_check vmx_set_nonreg_state(struct vcpu *v,
>  {
>      vmx_vmcs_enter(v);
>  
> -    __vmwrite(GUEST_ACTIVITY_STATE, nrs->vmx.activity_state);
> +    if ( nrs->vmx.activity_state )
> +        domain_crash(v->domain, "Attempt to set activity_state %#lx\n",
> +                     nrs->vmx.activity_state);

Might be useful to log the offending vCPU here?

Jan
Andrew Cooper Jan. 12, 2024, 10:43 a.m. UTC | #2
On 12/01/2024 10:37 am, Jan Beulich wrote:
> On 12.01.2024 00:13, Andrew Cooper wrote:
>> Right now, vvmx will blindly copy L12's ACTIVITY_STATE into the L02 VMCS and
>> enter the vCPU.  Luckily for us, nested-virt is explicitly unsupported for
>> security bugs.
>>
>> The inactivity states are HLT, SHUTDOWN and WAIT-FOR-SIPI, and as noted by the
>> SDM in Vol3 27.7 "Special Features of VM Entry":
>>
>>   If VM entry ends with the logical processor in an inactive activity state,
>>   the VM entry generates any special bus cycle that is normally generated when
>>   that activity state is entered from the active state.
>>
>> Also,
>>
>>   Some activity states unconditionally block certain events.
>>
>> I.e. A VMEntry with ACTIVITY=SHUTDOWN will initiate a platform reset, while a
>> VMEntry with ACTIVITY=WAIT-FOR-SIPI will really block everything other than
>> SIPIs.
>>
>> Both of these activity states are for the TXT ACM to use, not for regular
>> hypervisors, and Xen doesn't support dropping the HLT intercept either.
>>
>> There are two paths in Xen which operate on ACTIVITY_STATE.
>>
>> 1) The vmx_{get,set}_nonreg_state() helpers for VM-Fork.
>>
>>    As regular VMs can't use any inactivity states, this is just duplicating
>>    the 0 from construct_vmcs().  Retain the ability to query activity_state,
>>    but crash the domain on any attempt to set an inactivity state.
>>
>> 2) Nested virt, because of ACTIVITY_STATE in vmcs_gstate_field[].
>>
>>    Explicitly hide the inactivity states in the guest's view of MSR_VMX_MISC,
>>    and remove ACTIVITY_STATE from vmcs_gstate_field[].
>>
>>    In virtual_vmentry(), we should trigger a VMEntry failure for the use of
>>    any inactivity states, but there's no support for that in the code at all
>>    so leave a TODO for when we finally start working on nested-virt in
>>    earnest.
>>
>> Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

> with one remark/suggestion:
>
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -1551,7 +1551,10 @@ static void cf_check vmx_set_nonreg_state(struct vcpu *v,
>>  {
>>      vmx_vmcs_enter(v);
>>  
>> -    __vmwrite(GUEST_ACTIVITY_STATE, nrs->vmx.activity_state);
>> +    if ( nrs->vmx.activity_state )
>> +        domain_crash(v->domain, "Attempt to set activity_state %#lx\n",
>> +                     nrs->vmx.activity_state);
> Might be useful to log the offending vCPU here?

Already covered.  the innards of __domain_crash() does:

    else if ( d == current->domain )
    {
        printk("Domain %d (vcpu#%d) crashed on cpu#%d:\n",
        ...

~Andrew
Jan Beulich Jan. 12, 2024, 11:04 a.m. UTC | #3
On 12.01.2024 11:43, Andrew Cooper wrote:
> On 12/01/2024 10:37 am, Jan Beulich wrote:
>> On 12.01.2024 00:13, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -1551,7 +1551,10 @@ static void cf_check vmx_set_nonreg_state(struct vcpu *v,
>>>  {
>>>      vmx_vmcs_enter(v);
>>>  
>>> -    __vmwrite(GUEST_ACTIVITY_STATE, nrs->vmx.activity_state);
>>> +    if ( nrs->vmx.activity_state )
>>> +        domain_crash(v->domain, "Attempt to set activity_state %#lx\n",
>>> +                     nrs->vmx.activity_state);
>> Might be useful to log the offending vCPU here?
> 
> Already covered.  the innards of __domain_crash() does:
> 
>     else if ( d == current->domain )
>     {
>         printk("Domain %d (vcpu#%d) crashed on cpu#%d:\n",
>         ...

Except that afaict v != current here at all times (at least as far as
current use of the function goes).

Jan
Andrew Cooper Jan. 12, 2024, 11:18 a.m. UTC | #4
On 12/01/2024 11:04 am, Jan Beulich wrote:
> On 12.01.2024 11:43, Andrew Cooper wrote:
>> On 12/01/2024 10:37 am, Jan Beulich wrote:
>>> On 12.01.2024 00:13, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>>> @@ -1551,7 +1551,10 @@ static void cf_check vmx_set_nonreg_state(struct vcpu *v,
>>>>  {
>>>>      vmx_vmcs_enter(v);
>>>>  
>>>> -    __vmwrite(GUEST_ACTIVITY_STATE, nrs->vmx.activity_state);
>>>> +    if ( nrs->vmx.activity_state )
>>>> +        domain_crash(v->domain, "Attempt to set activity_state %#lx\n",
>>>> +                     nrs->vmx.activity_state);
>>> Might be useful to log the offending vCPU here?
>> Already covered.  the innards of __domain_crash() does:
>>
>>     else if ( d == current->domain )
>>     {
>>         printk("Domain %d (vcpu#%d) crashed on cpu#%d:\n",
>>         ...
> Except that afaict v != current here at all times (at least as far as
> current use of the function goes).

Hmm.  That's irritating.

In this case, it's a dead logic path - hence why in v1 I simply deleted it.

But I would prefer not to have to rely on a human getting an error
message right in order to get proper logging.

I suppose what we really want is a vcpu_crash(), but this is now firmly
in the realms of the cleanup patch I still haven't had time to repost.

I think I'll extend this with %pv for now, which can be dropped when
vcpu_crash() appears.

~Andrew
Tamas K Lengyel Jan. 16, 2024, 2:27 p.m. UTC | #5
On Thu, Jan 11, 2024 at 6:13 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> Right now, vvmx will blindly copy L12's ACTIVITY_STATE into the L02 VMCS and
> enter the vCPU.  Luckily for us, nested-virt is explicitly unsupported for
> security bugs.
>
> The inactivity states are HLT, SHUTDOWN and WAIT-FOR-SIPI, and as noted by the
> SDM in Vol3 27.7 "Special Features of VM Entry":
>
>   If VM entry ends with the logical processor in an inactive activity state,
>   the VM entry generates any special bus cycle that is normally generated when
>   that activity state is entered from the active state.
>
> Also,
>
>   Some activity states unconditionally block certain events.
>
> I.e. A VMEntry with ACTIVITY=SHUTDOWN will initiate a platform reset, while a
> VMEntry with ACTIVITY=WAIT-FOR-SIPI will really block everything other than
> SIPIs.
>
> Both of these activity states are for the TXT ACM to use, not for regular
> hypervisors, and Xen doesn't support dropping the HLT intercept either.
>
> There are two paths in Xen which operate on ACTIVITY_STATE.
>
> 1) The vmx_{get,set}_nonreg_state() helpers for VM-Fork.
>
>    As regular VMs can't use any inactivity states, this is just duplicating
>    the 0 from construct_vmcs().  Retain the ability to query activity_state,
>    but crash the domain on any attempt to set an inactivity state.
>
> 2) Nested virt, because of ACTIVITY_STATE in vmcs_gstate_field[].
>
>    Explicitly hide the inactivity states in the guest's view of MSR_VMX_MISC,
>    and remove ACTIVITY_STATE from vmcs_gstate_field[].
>
>    In virtual_vmentry(), we should trigger a VMEntry failure for the use of
>    any inactivity states, but there's no support for that in the code at all
>    so leave a TODO for when we finally start working on nested-virt in
>    earnest.
>
> Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Tamas K Lengyel <tamas@tklengyel.com>
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index a85394232a23..fd580bd5625f 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1551,7 +1551,10 @@  static void cf_check vmx_set_nonreg_state(struct vcpu *v,
 {
     vmx_vmcs_enter(v);
 
-    __vmwrite(GUEST_ACTIVITY_STATE, nrs->vmx.activity_state);
+    if ( nrs->vmx.activity_state )
+        domain_crash(v->domain, "Attempt to set activity_state %#lx\n",
+                     nrs->vmx.activity_state);
+
     __vmwrite(GUEST_INTERRUPTIBILITY_INFO, nrs->vmx.interruptibility_info);
     __vmwrite(GUEST_PENDING_DBG_EXCEPTIONS, nrs->vmx.pending_dbg);
 
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index f14053e7637a..ece0aa243a73 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -899,7 +899,10 @@  static const u16 vmcs_gstate_field[] = {
     GUEST_LDTR_AR_BYTES,
     GUEST_TR_AR_BYTES,
     GUEST_INTERRUPTIBILITY_INFO,
+    /*
+     * ACTIVITY_STATE is handled specially.
     GUEST_ACTIVITY_STATE,
+     */
     GUEST_SYSENTER_CS,
     GUEST_PREEMPTION_TIMER,
     /* natural */
@@ -1200,6 +1203,8 @@  static void virtual_vmentry(struct cpu_user_regs *regs)
     nvcpu->nv_vmentry_pending = 0;
     nvcpu->nv_vmswitch_in_progress = 1;
 
+    /* TODO: Fail VMentry for GUEST_ACTIVITY_STATE != 0 */
+
     /*
      * EFER handling:
      * hvm_set_efer won't work if CR0.PG = 1, so we change the value
@@ -2316,8 +2321,8 @@  int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
         data = hvm_cr4_guest_valid_bits(d);
         break;
     case MSR_IA32_VMX_MISC:
-        /* Do not support CR3-target feature now */
-        data = host_data & ~VMX_MISC_CR3_TARGET;
+        /* Do not support CR3-targets or activity states. */
+        data = host_data & ~(VMX_MISC_CR3_TARGET | VMX_MISC_ACTIVITY_MASK);
         break;
     case MSR_IA32_VMX_EPT_VPID_CAP:
         data = nept_get_ept_vpid_cap();
diff --git a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
index a9afdffae547..5ec474c79c32 100644
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -277,6 +277,7 @@  extern u32 vmx_secondary_exec_control;
 #define VMX_VPID_INVVPID_SINGLE_CONTEXT_RETAINING_GLOBAL 0x80000000000ULL
 extern u64 vmx_ept_vpid_cap;
 
+#define VMX_MISC_ACTIVITY_MASK                  0x000001c0
 #define VMX_MISC_PROC_TRACE                     0x00004000
 #define VMX_MISC_CR3_TARGET                     0x01ff0000
 #define VMX_MISC_VMWRITE_ALL                    0x20000000