
[v2,6/8] x86/hvm: Enable guest access to MSR_PKRS

Message ID 20230110171845.20542-7-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Series x86: Protection Key Supervisor support

Commit Message

Andrew Cooper Jan. 10, 2023, 5:18 p.m. UTC
Have guest_{rd,wr}msr(), via hvm_{get,set}_reg(), access either the live
register, or stashed state, depending on context.  Include MSR_PKRS for
migration, and let the guest have full access.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Kevin Tian <kevin.tian@intel.com>

v2:
 * Rebase over the get/set_reg() infrastructure.
---
 xen/arch/x86/hvm/hvm.c     |  1 +
 xen/arch/x86/hvm/vmx/vmx.c | 17 +++++++++++++++++
 xen/arch/x86/msr.c         | 10 ++++++++++
 3 files changed, 28 insertions(+)

Comments

Andrew Cooper Jan. 10, 2023, 6:07 p.m. UTC | #1
On 10/01/2023 5:18 pm, Andrew Cooper wrote:
> Have guest_{rd,wr}msr(), via hvm_{get,set}_reg(), access either the live
> register, or stashed state, depending on context.  Include MSR_PKRS for
> migration, and let the guest have full access.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> CC: Kevin Tian <kevin.tian@intel.com>
>
> v2:
>  * Rebase over the get/set_reg() infrastructure.
> ---
>  xen/arch/x86/hvm/hvm.c     |  1 +
>  xen/arch/x86/hvm/vmx/vmx.c | 17 +++++++++++++++++
>  xen/arch/x86/msr.c         | 10 ++++++++++
>  3 files changed, 28 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 927a221660e8..c6c1eea18003 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1333,6 +1333,7 @@ static int cf_check hvm_load_cpu_xsave_states(
>  static const uint32_t msrs_to_send[] = {
>      MSR_SPEC_CTRL,
>      MSR_INTEL_MISC_FEATURES_ENABLES,
> +    MSR_PKRS,
>      MSR_IA32_BNDCFGS,
>      MSR_IA32_XSS,
>      MSR_VIRT_SPEC_CTRL,

Needs the following hunk too:

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c6c1eea18003..86cab7aa2627 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1487,6 +1487,7 @@ static int cf_check hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h)
 
         case MSR_SPEC_CTRL:
         case MSR_INTEL_MISC_FEATURES_ENABLES:
+        case MSR_PKRS:
         case MSR_IA32_BNDCFGS:
         case MSR_IA32_XSS:
         case MSR_VIRT_SPEC_CTRL:

for the receive side of migration to work.

~Andrew
Jan Beulich Jan. 12, 2023, 1:26 p.m. UTC | #2
On 10.01.2023 18:18, Andrew Cooper wrote:
> @@ -2471,6 +2477,9 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
>          }
>          return val;
>  
> +    case MSR_PKRS:
> +        return (v == curr) ? rdpkrs() : msrs->pkrs;

Nothing here or ...

> @@ -2514,6 +2525,12 @@ static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
>              domain_crash(d);
>          }
>          return;
> +
> +    case MSR_PKRS:
> +        msrs->pkrs = val;
> +        if ( v == curr )
> +            wrpkrs(val);
> +        return;

... here is VMX or (if we were to support it, just as an abstract
consideration) HVM specific. Which makes me wonder why this needs
handling in [gs]et_reg() in the first place. I guess I'm still not
fully in sync with your longer term plans here ...

The other thing I'd like to understand (and having an answer to this
would have been better before re-applying my R-b to this re-based
logic) concerns the lack of feature checks here. hvm_get_reg()
can be called from places other than guest_rdmsr(); for an example, see
arch_get_info_guest().

Jan
Andrew Cooper Jan. 12, 2023, 2:16 p.m. UTC | #3
On 12/01/2023 1:26 pm, Jan Beulich wrote:
> On 10.01.2023 18:18, Andrew Cooper wrote:
>> @@ -2471,6 +2477,9 @@ static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
>>          }
>>          return val;
>>  
>> +    case MSR_PKRS:
>> +        return (v == curr) ? rdpkrs() : msrs->pkrs;
> Nothing here or ...
>
>> @@ -2514,6 +2525,12 @@ static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
>>              domain_crash(d);
>>          }
>>          return;
>> +
>> +    case MSR_PKRS:
>> +        msrs->pkrs = val;
>> +        if ( v == curr )
>> +            wrpkrs(val);
>> +        return;
> ... here is VMX or (if we were to support it, just as an abstract
> consideration) HVM specific. Which makes me wonder why this needs
> handling in [gs]et_reg() in the first place. I guess I'm still not
> fully in sync with your longer term plans here ...

If (when) AMD implement it, the AMD form will need to be vmcb->pkrs and
not msrs->pkrs, because, like all other paging controls, it'll be
swapped automatically by VMRUN.

(I don't know this for certain, but I'm happy to bet on it, given a
decade of consistency in this regard.)

> The other thing I'd like to understand (and having an answer to this
> would have been better before re-applying my R-b to this re-based
> logic) concerns the lack of feature checks here. hvm_get_reg()
> can be called from places other than guest_rdmsr(); for an example, see
> arch_get_info_guest().

The point is to separate auditing logic (wants to be implemented only
once) from data shuffling logic (is the value in a register, or the MSR
lists, or VMCB/VMCS or struct vcpu, etc).  It is always the caller's
responsibility to confirm that REG exists, and that VAL is suitable for REG.

arch_get_info_guest() passes MSR_SHADOW_GS_BASE which exists
unilaterally (because we don't technically do !LM correctly.)


But this is all discussed in the comment by the function prototypes. 
I'm not sure how to make that any clearer than it already is.

~Andrew
Jan Beulich Jan. 12, 2023, 3:16 p.m. UTC | #4
On 12.01.2023 15:16, Andrew Cooper wrote:
> On 12/01/2023 1:26 pm, Jan Beulich wrote:
>> The other thing I'd like to understand (and having an answer to this
>> would have been better before re-applying my R-b to this re-based
>> logic) concerns the lack of feature checks here. hvm_get_reg()
>> can be called from places other than guest_rdmsr(); for an example, see
>> arch_get_info_guest().
> 
> The point is to separate auditing logic (wants to be implemented only
> once) from data shuffling logic (is the value in a register, or the MSR
> lists, or VMCB/VMCS or struct vcpu, etc).  It is always the caller's
> responsibility to confirm that REG exists, and that VAL is suitable for REG.
> 
> arch_get_info_guest() passes MSR_SHADOW_GS_BASE which exists
> unilaterally (because we don't technically do !LM correctly.)
> 
> 
> But this is all discussed in the comment by the function prototypes. 
> I'm not sure how to make that any clearer than it already is.

Okay, and I'm sorry for having looked at the definitions, where I found
no helpful comment, but not at the declarations. That's certainly
sufficient to confirm that my R-b can remain as you already had it.

Jan

Patch

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 927a221660e8..c6c1eea18003 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1333,6 +1333,7 @@  static int cf_check hvm_load_cpu_xsave_states(
 static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
+    MSR_PKRS,
     MSR_IA32_BNDCFGS,
     MSR_IA32_XSS,
     MSR_VIRT_SPEC_CTRL,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b1f493f009fd..57827779c305 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -657,6 +657,11 @@  static void cf_check vmx_cpuid_policy_changed(struct vcpu *v)
     else
         vmx_set_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
 
+    if ( cp->feat.pks )
+        vmx_clear_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
+    else
+        vmx_set_msr_intercept(v, MSR_PKRS, VMX_MSR_RW);
+
  out:
     vmx_vmcs_exit(v);
 
@@ -2455,6 +2460,7 @@  static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
 {
     const struct vcpu *curr = current;
     struct domain *d = v->domain;
+    const struct vcpu_msrs *msrs = v->arch.msrs;
     uint64_t val = 0;
     int rc;
 
@@ -2471,6 +2477,9 @@  static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
         }
         return val;
 
+    case MSR_PKRS:
+        return (v == curr) ? rdpkrs() : msrs->pkrs;
+
     case MSR_SHADOW_GS_BASE:
         if ( v != curr )
             return v->arch.hvm.vmx.shadow_gs;
@@ -2499,6 +2508,8 @@  static uint64_t cf_check vmx_get_reg(struct vcpu *v, unsigned int reg)
 
 static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
 {
+    const struct vcpu *curr = current;
+    struct vcpu_msrs *msrs = v->arch.msrs;
     struct domain *d = v->domain;
     int rc;
 
@@ -2514,6 +2525,12 @@  static void cf_check vmx_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
             domain_crash(d);
         }
         return;
+
+    case MSR_PKRS:
+        msrs->pkrs = val;
+        if ( v == curr )
+            wrpkrs(val);
+        return;
     }
 
     /* Logic which maybe requires remote VMCS acquisition. */
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 317b154d244d..7ddf0078c3a2 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -325,6 +325,11 @@  int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = 0;
         break;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks )
+            goto gp_fault;
+        goto get_reg;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -616,6 +621,11 @@  int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
             break;
         goto gp_fault;
 
+    case MSR_PKRS:
+        if ( !cp->feat.pks || val != (uint32_t)val )
+            goto gp_fault;
+        goto set_reg;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;