From patchwork Fri Aug 7 08:48:35 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
To: Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Xiaoyao Li
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC 1/7] KVM: VMX: Introduce PKS VMCS fields
Date: Fri, 7 Aug 2020 16:48:35 +0800
Message-Id: <20200807084841.7112-2-chenyi.qiang@intel.com>

PKS (Protection Keys for Supervisor Pages) is a feature that extends the
protection-key architecture to support thread-specific permission
restrictions on supervisor pages. A new PKS MSR (PKRS) is defined in the
kernel to support PKS; it holds a set of permissions associated with each
protection domain.

Two VMCS fields, {HOST,GUEST}_IA32_PKRS, are introduced in the
{host,guest}-state area to store the value of PKRS. Every VM exit saves
PKRS into the guest-state area. If VM_EXIT_LOAD_IA32_PKRS = 1, the VM exit
loads PKRS from the host-state area. If VM_ENTRY_LOAD_IA32_PKRS = 1, VM
entry loads PKRS from the guest-state area.
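The two control values below correspond to bit 29 of the VM-exit controls
and bit 22 of the VM-entry controls. As an illustration only (not part of
the patch), a host could probe whether the CPU allows these controls via
the VMX capability MSRs; rdmsr() here stands in for any ring-0 MSR-read
primitive, and the capability-MSR indices are the SDM-defined ones.

#include <stdint.h>

#define MSR_IA32_VMX_EXIT_CTLS   0x483
#define MSR_IA32_VMX_ENTRY_CTLS  0x484
#define VM_EXIT_LOAD_IA32_PKRS   0x20000000  /* bit 29, as defined below */
#define VM_ENTRY_LOAD_IA32_PKRS  0x00400000  /* bit 22, as defined below */

/* Stand-in for a ring-0 MSR read primitive. */
extern uint64_t rdmsr(uint32_t index);

/*
 * The high 32 bits of each VMX capability MSR report the "allowed-1"
 * settings: a control bit may be set only if the corresponding
 * capability bit is 1.
 */
static int cpu_allows_pkrs_load(void)
{
        uint32_t exit_allowed1  = rdmsr(MSR_IA32_VMX_EXIT_CTLS) >> 32;
        uint32_t entry_allowed1 = rdmsr(MSR_IA32_VMX_ENTRY_CTLS) >> 32;

        return (exit_allowed1 & VM_EXIT_LOAD_IA32_PKRS) &&
               (entry_allowed1 & VM_ENTRY_LOAD_IA32_PKRS);
}

This mirrors the cpu_has_load_ia32_pkrs() helper added in patch 2, which
checks the adjusted vmcs_config instead of the raw capability MSRs.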
Signed-off-by: Chenyi Qiang
Reviewed-by: Jim Mattson
---
 arch/x86/include/asm/vmx.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index cd7de4b401fe..425cf81dd722 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP                  0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL             0x02000000
+#define VM_EXIT_LOAD_IA32_PKRS                  0x20000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR       0x00036dff
 
@@ -107,6 +108,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP                 0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL             0x00040000
+#define VM_ENTRY_LOAD_IA32_PKRS                 0x00400000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR      0x000011ff
 
@@ -243,12 +245,16 @@ enum vmcs_field {
        GUEST_BNDCFGS_HIGH              = 0x00002813,
        GUEST_IA32_RTIT_CTL             = 0x00002814,
        GUEST_IA32_RTIT_CTL_HIGH        = 0x00002815,
+       GUEST_IA32_PKRS                 = 0x00002818,
+       GUEST_IA32_PKRS_HIGH            = 0x00002819,
        HOST_IA32_PAT                   = 0x00002c00,
        HOST_IA32_PAT_HIGH              = 0x00002c01,
        HOST_IA32_EFER                  = 0x00002c02,
        HOST_IA32_EFER_HIGH             = 0x00002c03,
        HOST_IA32_PERF_GLOBAL_CTRL      = 0x00002c04,
        HOST_IA32_PERF_GLOBAL_CTRL_HIGH = 0x00002c05,
+       HOST_IA32_PKRS                  = 0x00002c06,
+       HOST_IA32_PKRS_HIGH             = 0x00002c07,
        PIN_BASED_VM_EXEC_CONTROL       = 0x00004000,
        CPU_BASED_VM_EXEC_CONTROL       = 0x00004002,
        EXCEPTION_BITMAP                = 0x00004004,
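The new encodings are not arbitrary: they follow the SDM's VMCS
field-encoding layout, which is why GUEST_IA32_PKRS lands in the 64-bit
guest-state range and HOST_IA32_PKRS in the 64-bit host-state range. A
standalone sketch (illustration only, not part of the series) that decodes
the two new fields; the bit layout is taken from the SDM's field-encoding
description:

#include <stdio.h>

/*
 * Decode a VMCS field encoding: bit 0 = access type, bits 9:1 = index,
 * bits 11:10 = field type, bits 14:13 = width.
 */
static void decode_vmcs_field(unsigned int enc)
{
        static const char *types[]  = { "control", "read-only data",
                                        "guest-state", "host-state" };
        static const char *widths[] = { "16-bit", "64-bit", "32-bit",
                                        "natural" };

        printf("0x%04x: %s %s field, index %u, %s access\n",
               enc, widths[(enc >> 13) & 3], types[(enc >> 10) & 3],
               (enc >> 1) & 0x1ff, (enc & 1) ? "high-dword" : "full");
}

int main(void)
{
        decode_vmcs_field(0x2818);      /* GUEST_IA32_PKRS */
        decode_vmcs_field(0x2c06);      /* HOST_IA32_PKRS  */
        return 0;
}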
From patchwork Fri Aug 7 08:48:36 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 2/7] KVM: VMX: Expose IA32_PKRS MSR
Date: Fri, 7 Aug 2020 16:48:36 +0800
Message-Id: <20200807084841.7112-3-chenyi.qiang@intel.com>

Protection Keys for Supervisor Pages (PKS) uses the IA32_PKRS MSR (PKRS)
at index 0x6E1 to allow software to manage supervisor protection-key
rights. For performance reasons, the PKRS intercept is disabled so that
the guest can access PKRS without VM exits. PKS introduces dedicated VMCS
control fields to switch PKRS, which only handle the restore part. In
addition, every VM exit saves PKRS into the guest-state area in the VMCS,
while VM entry does not save the host value, on the expectation that the
host rarely changes this MSR. Update the host's value in the VMCS manually
if the kernel has changed the MSR since the last time the VMCS was run.

The function get_current_pkrs() in arch/x86/mm/pkeys.c exposes the per-cpu
variable pkrs_cache to avoid frequent rdmsr of PKRS.

Signed-off-by: Chenyi Qiang
---
 arch/x86/include/asm/pkeys.h    |  1 +
 arch/x86/kvm/vmx/capabilities.h |  6 +++
 arch/x86/kvm/vmx/nested.c       |  1 +
 arch/x86/kvm/vmx/vmcs.h         |  1 +
 arch/x86/kvm/vmx/vmx.c          | 66 ++++++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.h              |  6 +++
 arch/x86/mm/pkeys.c             |  6 +++
 include/linux/pkeys.h           |  4 ++
 8 files changed, 89 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 097abca7784c..d7c405d6eea6 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -142,6 +142,7 @@ u32 get_new_pkr(u32 old_pkr, int pkey, unsigned long init_val);
 int pks_key_alloc(const char *const pkey_user);
 void pks_key_free(int pkey);
 u32 get_new_pkr(u32 old_pkr, int pkey, unsigned long init_val);
+u32 get_current_pkrs(void);
 
 /*
  * pks_update_protection - Update the protection of the specified key
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 4bbd8b448d22..7099e3105f48 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -103,6 +103,12 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
               (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
 }
 
+static inline bool cpu_has_load_ia32_pkrs(void)
+{
+       return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PKRS) &&
+              (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PKRS);
+}
+
 static inline bool cpu_has_vmx_mpx(void)
 {
        return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 11e4df560018..df2c2e733549 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -289,6 +289,7 @@ static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
        dest->ds_sel = src->ds_sel;
        dest->es_sel = src->es_sel;
 #endif
+       dest->pkrs = src->pkrs;
 }
 
 static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 7a3675fddec2..39ec3d0c844b 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -40,6 +40,7 @@ struct vmcs_host_state {
 #ifdef CONFIG_X86_64
        u16           ds_sel, es_sel;
 #endif
+       u32           pkrs;
 };
 
 struct vmcs_controls_shadow {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13745f2a5ecd..d91d59fb46fa 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1131,6 +1131,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 #endif
        unsigned long fs_base, gs_base;
        u16 fs_sel, gs_sel;
+       u32 host_pkrs;
        int i;
 
        vmx->req_immediate_exit = false;
@@ -1163,6 +1164,20 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
         */
        host_state->ldt_sel = kvm_read_ldt();
 
+       /*
+        * Update the host pkrs vmcs field before vcpu runs.
+        * The setting of VM_EXIT_LOAD_IA32_PKRS can ensure
+        * kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+        * guest_cpuid_has(vcpu, X86_FEATURE_PKS).
+        */
+       if (vm_exit_controls_get(vmx) & VM_EXIT_LOAD_IA32_PKRS) {
+               host_pkrs = get_current_pkrs();
+               if (unlikely(host_pkrs != host_state->pkrs)) {
+                       vmcs_write64(HOST_IA32_PKRS, host_pkrs);
+                       host_state->pkrs = host_pkrs;
+               }
+       }
+
 #ifdef CONFIG_X86_64
        savesegment(ds, host_state->ds_sel);
        savesegment(es, host_state->es_sel);
@@ -1951,6 +1966,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                else
                        msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
                break;
+       case MSR_IA32_PKRS:
+               if (!kvm_cpu_cap_has(X86_FEATURE_PKS) ||
+                   (!msr_info->host_initiated &&
+                   !guest_cpuid_has(vcpu, X86_FEATURE_PKS)))
+                       return 1;
+               msr_info->data = vmcs_read64(GUEST_IA32_PKRS);
+               break;
        case MSR_TSC_AUX:
                if (!msr_info->host_initiated &&
                    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2221,6 +2243,15 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
                else
                        vmx->pt_desc.guest.addr_a[index / 2] = data;
                break;
+       case MSR_IA32_PKRS:
+               if (!kvm_pkrs_valid(data))
+                       return 1;
+               if (!kvm_cpu_cap_has(X86_FEATURE_PKS) ||
+                   (!msr_info->host_initiated &&
+                   !guest_cpuid_has(vcpu, X86_FEATURE_PKS)))
+                       return 1;
+               vmcs_write64(GUEST_IA32_PKRS, data);
+               break;
        case MSR_TSC_AUX:
                if (!msr_info->host_initiated &&
                    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2510,7 +2541,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
              VM_EXIT_LOAD_IA32_EFER |
              VM_EXIT_CLEAR_BNDCFGS |
              VM_EXIT_PT_CONCEAL_PIP |
-             VM_EXIT_CLEAR_IA32_RTIT_CTL;
+             VM_EXIT_CLEAR_IA32_RTIT_CTL |
+             VM_EXIT_LOAD_IA32_PKRS;
        if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
                                &_vmexit_control) < 0)
                return -EIO;
@@ -2534,7 +2566,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
              VM_ENTRY_LOAD_IA32_EFER |
              VM_ENTRY_LOAD_BNDCFGS |
              VM_ENTRY_PT_CONCEAL_PIP |
-             VM_ENTRY_LOAD_IA32_RTIT_CTL;
+             VM_ENTRY_LOAD_IA32_RTIT_CTL |
+             VM_ENTRY_LOAD_IA32_PKRS;
        if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
                                &_vmentry_control) < 0)
                return -EIO;
@@ -5868,6 +5901,8 @@ void dump_vmcs(void)
                       vmcs_read64(GUEST_IA32_PERF_GLOBAL_CTRL));
        if (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS)
                pr_err("BndCfgS = 0x%016llx\n", vmcs_read64(GUEST_BNDCFGS));
+       if (vmentry_ctl & VM_ENTRY_LOAD_IA32_PKRS)
+               pr_err("PKRS = 0x%016llx\n", vmcs_read64(GUEST_IA32_PKRS));
        pr_err("Interruptibility = %08x ActivityState = %08x\n",
               vmcs_read32(GUEST_INTERRUPTIBILITY_INFO),
               vmcs_read32(GUEST_ACTIVITY_STATE));
@@ -5903,6 +5938,8 @@ void dump_vmcs(void)
            vmexit_ctl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
                pr_err("PerfGlobCtl = 0x%016llx\n",
                       vmcs_read64(HOST_IA32_PERF_GLOBAL_CTRL));
+       if (vmexit_ctl & VM_EXIT_LOAD_IA32_PKRS)
+               pr_err("PKRS = 0x%016llx\n", vmcs_read64(HOST_IA32_PKRS));
 
        pr_err("*** Control State ***\n");
        pr_err("PinBased=%08x CPUBased=%08x SecondaryExec=%08x\n",
@@ -7230,6 +7267,26 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
                        vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
 }
 
+static void vmx_update_pkrs_cfg(struct kvm_vcpu *vcpu)
+{
+       struct vcpu_vmx *vmx = to_vmx(vcpu);
+       unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+       bool pks_supported = guest_cpuid_has(vcpu, X86_FEATURE_PKS);
+
+       /*
+        * Set the intercept for PKRS when the guest doesn't support PKS.
+        */
+       vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PKRS,
+                                 MSR_TYPE_RW, !pks_supported);
+
+       if (pks_supported) {
+               vm_entry_controls_setbit(vmx, VM_ENTRY_LOAD_IA32_PKRS);
+               vm_exit_controls_setbit(vmx, VM_EXIT_LOAD_IA32_PKRS);
+       } else {
+               vm_entry_controls_clearbit(vmx, VM_ENTRY_LOAD_IA32_PKRS);
+               vm_exit_controls_clearbit(vmx, VM_EXIT_LOAD_IA32_PKRS);
+       }
+}
+
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
        struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7251,6 +7308,11 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
                        ~(FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
                          FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX);
 
+       if (kvm_cpu_cap_has(X86_FEATURE_PKS))
+               vmx_update_pkrs_cfg(vcpu);
+       else
+               guest_cpuid_clear(vcpu, X86_FEATURE_PKS);
+
        if (nested_vmx_allowed(vcpu)) {
                nested_vmx_cr_fixed1_bits_update(vcpu);
                nested_vmx_entry_exit_ctls_update(vcpu);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 6eb62e97e59f..7fb206f98bed 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -361,6 +361,12 @@ static inline bool kvm_dr7_valid(u64 data)
        return !(data >> 32);
 }
 
+static inline bool kvm_pkrs_valid(u64 data)
+{
+       /* bits [63:32] must be zero */
+       return !(data >> 32);
+}
+
 void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
 void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
 u64 kvm_spec_ctrl_valid_bits(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index cc9a28a55ba3..b237c54074ba 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -333,6 +333,12 @@ void pks_key_free(int pkey)
 }
 EXPORT_SYMBOL_GPL(pks_key_free);
 
+u32 get_current_pkrs(void)
+{
+       return this_cpu_read(pkrs_cache);
+}
+EXPORT_SYMBOL_GPL(get_current_pkrs);
+
 static int pks_keys_allocated_show(struct seq_file *m, void *p)
 {
        int i;
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 1d84ab7c12d4..8ac90fae137f 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -66,6 +66,10 @@ static inline void pks_update_protection(int pkey, unsigned long protection)
 {
        WARN_ON_ONCE(1);
 }
+static inline u32 get_current_pkrs(void)
+{
+       return 0;
+}
 #endif /* ! CONFIG_ARCH_HAS_SUPERVISOR_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */
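With the intercept disabled the guest reads PKRS natively, but userspace
can still inspect the guest value through the regular MSR interface. A
minimal sketch, assuming a vCPU fd obtained through the usual KVM ioctls
and a kernel with this series applied:

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define MSR_IA32_PKRS 0x6E1     /* index from this patch's commit message */

/*
 * Read the guest PKRS through KVM_GET_MSRS. Host-initiated accesses
 * succeed even if the guest's CPUID does not advertise PKS, matching
 * the msr_info->host_initiated checks in vmx_get_msr() above.
 */
static int read_guest_pkrs(int vcpu_fd, unsigned long long *pkrs)
{
        struct {
                struct kvm_msrs hdr;
                struct kvm_msr_entry entry;
        } req;

        memset(&req, 0, sizeof(req));
        req.hdr.nmsrs = 1;
        req.entry.index = MSR_IA32_PKRS;

        if (ioctl(vcpu_fd, KVM_GET_MSRS, &req) != 1)
                return -1;

        *pkrs = req.entry.data;
        return 0;
}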
From patchwork Fri Aug 7 08:48:37 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 3/7] KVM: MMU: Rename the pkru to pkr
Date: Fri, 7 Aug 2020 16:48:37 +0800
Message-Id: <20200807084841.7112-4-chenyi.qiang@intel.com>

PKRU represents the PKU register used in the protection-key rights check
for user pages. Protection Keys for Supervisor Pages (PKS) extends the
protection-key architecture to cover supervisor pages. Rename the
*pkru*-related variables and functions to *pkr*, which stands for both
PKRU and PKRS. This makes sense because the two registers share the same
format, and PKS and PKU can also share the same bitmap that caches the
conditions where protection-key checks are needed.

Signed-off-by: Chenyi Qiang
Reviewed-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.h              | 12 ++++++------
 arch/x86/kvm/mmu/mmu.c          | 18 +++++++++---------
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b21540..6b739d0d1c97 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -427,7 +427,7 @@ struct kvm_mmu {
         * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
         * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
         */
-       u32 pkru_mask;
+       u32 pkr_mask;
 
        u64 *pae_root;
        u64 *lm_root;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 444bb9c54548..0c2fdf0abf22 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -193,8 +193,8 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
        u32 errcode = PFERR_PRESENT_MASK;
 
        WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK));
-       if (unlikely(mmu->pkru_mask)) {
-               u32 pkru_bits, offset;
+       if (unlikely(mmu->pkr_mask)) {
+               u32 pkr_bits, offset;
 
                /*
                * PKRU defines 32 bits, there are 16 domains and 2
@@ -202,15 +202,15 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                * index of the protection domain, so pte_pkey * 2 is
                * the index of the first bit for the domain.
                */
-               pkru_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+               pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
 
                /* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
                offset = (pfec & ~1) +
                        ((pte_access & PT_USER_MASK) << (PFERR_RSVD_BIT - PT_USER_SHIFT));
 
-               pkru_bits &= mmu->pkru_mask >> offset;
-               errcode |= -pkru_bits & PFERR_PK_MASK;
-               fault |= (pkru_bits != 0);
+               pkr_bits &= mmu->pkr_mask >> offset;
+               errcode |= -pkr_bits & PFERR_PK_MASK;
+               fault |= (pkr_bits != 0);
        }
 
        return -(u32)fault & errcode;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d6a0ae7800c..481442f5e27a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4716,20 +4716,20 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
 * away both AD and WD.  For all reads or if the last condition holds, WD
 * only will be masked away.
 */
-static void update_pkru_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                                bool ept)
 {
        unsigned bit;
        bool wp;
 
        if (ept) {
-               mmu->pkru_mask = 0;
+               mmu->pkr_mask = 0;
                return;
        }
 
        /* PKEY is enabled only if CR4.PKE and EFER.LMA are both set. */
        if (!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) || !is_long_mode(vcpu)) {
-               mmu->pkru_mask = 0;
+               mmu->pkr_mask = 0;
                return;
        }
@@ -4763,7 +4763,7 @@ static void update_pkru_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                /* PKRU.WD stops write access. */
                pkey_bits |= (!!check_write) << 1;
 
-               mmu->pkru_mask |= (pkey_bits & 3) << pfec;
+               mmu->pkr_mask |= (pkey_bits & 3) << pfec;
        }
 }
@@ -4785,7 +4785,7 @@ static void paging64_init_context_common(struct kvm_vcpu *vcpu,
        reset_rsvds_bits_mask(vcpu, context);
        update_permission_bitmask(vcpu, context, false);
-       update_pkru_bitmask(vcpu, context, false);
+       update_pkr_bitmask(vcpu, context, false);
        update_last_nonleaf_level(vcpu, context);
 
        MMU_WARN_ON(!is_pae(vcpu));
@@ -4815,7 +4815,7 @@ static void paging32_init_context(struct kvm_vcpu *vcpu,
        reset_rsvds_bits_mask(vcpu, context);
        update_permission_bitmask(vcpu, context, false);
-       update_pkru_bitmask(vcpu, context, false);
+       update_pkr_bitmask(vcpu, context, false);
        update_last_nonleaf_level(vcpu, context);
 
        context->page_fault = paging32_page_fault;
@@ -4925,7 +4925,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
        }
 
        update_permission_bitmask(vcpu, context, false);
-       update_pkru_bitmask(vcpu, context, false);
+       update_pkr_bitmask(vcpu, context, false);
        update_last_nonleaf_level(vcpu, context);
        reset_tdp_shadow_zero_bits_mask(vcpu, context);
 }
@@ -5032,7 +5032,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
        context->mmu_role.as_u64 = new_role.as_u64;
 
        update_permission_bitmask(vcpu, context, true);
-       update_pkru_bitmask(vcpu, context, true);
+       update_pkr_bitmask(vcpu, context, true);
        update_last_nonleaf_level(vcpu, context);
        reset_rsvds_bits_mask_ept(vcpu, context, execonly);
        reset_ept_shadow_zero_bits_mask(vcpu, context, execonly);
@@ -5103,7 +5103,7 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
        }
 
        update_permission_bitmask(vcpu, g_context, false);
-       update_pkru_bitmask(vcpu, g_context, false);
+       update_pkr_bitmask(vcpu, g_context, false);
        update_last_nonleaf_level(vcpu, g_context);
 }
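The rename leans on the fact that PKRU and PKRS share one layout: 16 keys,
two bits per key, with AD (access disable) in the even bit and WD (write
disable) in the odd bit. A small standalone illustration of that layout,
not taken from the series:

#include <stdint.h>
#include <stdio.h>

/* Print the AD/WD rights one PKRU- or PKRS-formatted value grants. */
static void show_key_rights(uint32_t pkr, int pkey)
{
        uint32_t bits = (pkr >> (pkey * 2)) & 3;

        printf("key %2d:%s%s\n", pkey,
               (bits & 1) ? " access-disable" : "",
               (bits & 2) ? " write-disable" : "");
}

int main(void)
{
        uint32_t pkr = (1u << 2) | (2u << 4);   /* key 1: AD, key 2: WD */

        for (int key = 0; key < 16; key++)
                show_key_rights(pkr, key);
        return 0;
}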
From patchwork Fri Aug 7 08:48:38 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 4/7] KVM: MMU: Refactor pkr_mask to cache condition
Date: Fri, 7 Aug 2020 16:48:38 +0800
Message-Id: <20200807084841.7112-5-chenyi.qiang@intel.com>

The pkr_mask bitmap currently indicates whether protection-key checks are
needed for user pages. It is indexed by page-fault error code bits [4:1],
with PFEC.RSVD replaced by ACC_USER_MASK from the page tables. Refactor it
to revert to the use of the real PFEC.RSVD bit, so that PKS and PKU can
share the same bitmap.

Signed-off-by: Chenyi Qiang
---
 arch/x86/kvm/mmu.h     | 10 ++++++----
 arch/x86/kvm/mmu/mmu.c | 16 ++++++++++------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 0c2fdf0abf22..7fb4c63d5704 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -202,11 +202,13 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                * index of the protection domain, so pte_pkey * 2 is
                * the index of the first bit for the domain.
                */
-               pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+               if (pte_access & PT_USER_MASK)
+                       pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+               else
+                       pkr_bits = 0;
 
-               /* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
-               offset = (pfec & ~1) +
-                       ((pte_access & PT_USER_MASK) << (PFERR_RSVD_BIT - PT_USER_SHIFT));
+               /* clear present bit */
+               offset = (pfec & ~1);
 
                pkr_bits &= mmu->pkr_mask >> offset;
                errcode |= -pkr_bits & PFERR_PK_MASK;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 481442f5e27a..333b4da739f8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4737,21 +4737,25 @@ static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
        for (bit = 0; bit < ARRAY_SIZE(mmu->permissions); ++bit) {
                unsigned pfec, pkey_bits;
-               bool check_pkey, check_write, ff, uf, wf, pte_user;
+               bool check_pkey, check_write, ff, uf, wf, rsvdf;
 
                pfec = bit << 1;
                ff = pfec & PFERR_FETCH_MASK;
                uf = pfec & PFERR_USER_MASK;
                wf = pfec & PFERR_WRITE_MASK;
 
-               /* PFEC.RSVD is replaced by ACC_USER_MASK. */
-               pte_user = pfec & PFERR_RSVD_MASK;
+               /*
+                * PFERR_RSVD_MASK bit is not set if the
+                * access is subject to PK restrictions.
+                */
+               rsvdf = pfec & PFERR_RSVD_MASK;
 
                /*
-                * Only need to check the access which is not an
-                * instruction fetch and is to a user page.
+                * need to check the access which is not an
+                * instruction fetch and is not a rsvd fault.
                 */
-               check_pkey = (!ff && pte_user);
+               check_pkey = (!ff && !rsvdf);
+
                /*
                 * write access is controlled by PKRU if it is a
                 * user access or CR0.WP = 1.
                 */
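The practical effect of the refactor is that the pkr_mask index becomes
nothing more than the error code with the present bit cleared. A tiny
sketch of the indexing rule, with the PFEC masks at their architectural
bit positions:

#include <stdint.h>

#define PFERR_PRESENT_MASK      (1u << 0)
#define PFERR_WRITE_MASK        (1u << 1)
#define PFERR_USER_MASK         (1u << 2)
#define PFERR_RSVD_MASK         (1u << 3)
#define PFERR_FETCH_MASK        (1u << 4)

/*
 * After this patch, the pkr_mask index is simply the error code with
 * the present bit cleared; PFEC.RSVD keeps its architectural meaning
 * instead of being overloaded with the page table's U/S bit.
 */
static uint32_t pkr_mask_offset(uint32_t pfec)
{
        return pfec & ~PFERR_PRESENT_MASK;
}

/*
 * Example: a user-mode write has pfec = PFERR_WRITE_MASK |
 * PFERR_USER_MASK = 6, so bits 7:6 of pkr_mask gate AD/WD for
 * that access type.
 */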
From patchwork Fri Aug 7 08:48:39 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 5/7] KVM: MMU: Add support for PKS emulation
Date: Fri, 7 Aug 2020 16:48:39 +0800
Message-Id: <20200807084841.7112-6-chenyi.qiang@intel.com>

Extend pkr_mask to cache the conditions where protection-key checks for
supervisor pages are needed. Accessed pages whose translation has the U/S
flag clear in at least one paging-structure entry controlling the
translation are supervisor pages, and PKRS enforces the access-rights
check for them.

Signed-off-by: Chenyi Qiang
---
 arch/x86/include/asm/kvm_host.h |  8 +++---
 arch/x86/kvm/mmu.h              | 12 ++++++---
 arch/x86/kvm/mmu/mmu.c          | 44 +++++++++++++++++----------------
 3 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6b739d0d1c97..736e56e023d5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -422,10 +422,10 @@ struct kvm_mmu {
        u8 permissions[16];
 
        /*
-        * The pkru_mask indicates if protection key checks are needed.  It
-        * consists of 16 domains indexed by page fault error code bits [4:1],
-        * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
-        * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
+        * The pkr_mask indicates if protection key checks are needed.
+        * It consists of 16 domains indexed by page fault error code
+        * bits[4:1]. Each domain has 2 bits which are ANDed with AD
+        * and WD from PKRU/PKRS.
         */
        u32 pkr_mask;
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 7fb4c63d5704..b840b2d9ee9f 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -195,15 +195,19 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
        WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK));
        if (unlikely(mmu->pkr_mask)) {
                u32 pkr_bits, offset;
+               u64 pkrs;
 
                /*
-               * PKRU defines 32 bits, there are 16 domains and 2
-               * attribute bits per domain in pkru. pte_pkey is the
-               * index of the protection domain, so pte_pkey * 2 is
-               * the index of the first bit for the domain.
+               * PKRU and PKRS both define 32 bits. There are 16 domains
+               * and 2 attribute bits per domain in them. pte_pkey is the
+               * index of the protection domain, so pte_pkey * 2 is the
+               * index of the first bit for the domain. The choice of
+               * PKRU and PKRS is determined by the accessed pages.
                */
                if (pte_access & PT_USER_MASK)
                        pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+               else if (!kvm_get_msr(vcpu, MSR_IA32_PKRS, &pkrs))
+                       pkr_bits = (pkrs >> (pte_pkey * 2)) & 3;
                else
                        pkr_bits = 0;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 333b4da739f8..845aea86b138 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4693,28 +4693,29 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
 }
 
 /*
-* PKU is an additional mechanism by which the paging controls access to
-* user-mode addresses based on the value in the PKRU register.  Protection
-* key violations are reported through a bit in the page fault error code.
+* Protection Keys (PKEY) is an additional mechanism by which
+* the paging controls access to user-mode/supervisor-mode address
+* based on the values in PKEY registers (PKRU/PKRS). Protection key
+* violations are reported through a bit in the page fault error code.
 * Unlike other bits of the error code, the PK bit is not known at the
 * call site of e.g. gva_to_gpa; it must be computed directly in
-* permission_fault based on two bits of PKRU, on some machine state (CR4,
-* CR0, EFER, CPL), and on other bits of the error code and the page tables.
+* permission_fault based on two bits of PKRU/PKRS, on some machine
+* state (CR4, CR0, EFER, CPL), and on other bits of the error code
+* and the page tables.
 *
 * In particular the following conditions come from the error code, the
 * page tables and the machine state:
-* - PK is always zero unless CR4.PKE=1 and EFER.LMA=1
+* - PK is always zero unless CR4.PKE=1/CR4.PKS=1 and EFER.LMA=1
 * - PK is always zero if RSVD=1 (reserved bit set) or F=1 (instruction fetch)
-* - PK is always zero if U=0 in the page tables
-* - PKRU.WD is ignored if CR0.WP=0 and the access is a supervisor access.
+* - (PKRU/PKRS).WD is ignored if CR0.WP=0 and the access is a supervisor access.
 *
-* The PKRU bitmask caches the result of these four conditions.  The error
-* code (minus the P bit) and the page table's U bit form an index into the
-* PKRU bitmask.  Two bits of the PKRU bitmask are then extracted and ANDed
-* with the two bits of the PKRU register corresponding to the protection key.
-* For the first three conditions above the bits will be 00, thus masking
-* away both AD and WD.  For all reads or if the last condition holds, WD
-* only will be masked away.
+* The pkr_mask caches the result of these three conditions. The error
+* code (minus the P bit) forms an index into the pkr_mask. Both PKU and
+* PKS share the same bitmask. Two bits of the pkr_mask are then extracted
+* and ANDed with the two bits of the PKEY register corresponding to
+* the protection key. For the first two conditions above the bits will be 00,
+* thus masking away both AD and WD. For all reads or if the last condition
+* holds, WD only will be masked away.
 */
 static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                                bool ept)
@@ -4727,8 +4728,9 @@ static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                return;
        }
 
-       /* PKEY is enabled only if CR4.PKE and EFER.LMA are both set. */
-       if (!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) || !is_long_mode(vcpu)) {
+       /* PKEY is enabled only if CR4.PKE/CR4.PKS and EFER.LMA are both set. */
+       if ((!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
+            !kvm_read_cr4_bits(vcpu, X86_CR4_PKS)) || !is_long_mode(vcpu)) {
                mmu->pkr_mask = 0;
                return;
        }
@@ -4757,14 +4759,14 @@ static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
                check_pkey = (!ff && !rsvdf);
 
                /*
-                * write access is controlled by PKRU if it is a
-                * user access or CR0.WP = 1.
+                * write access is controlled by PKRU/PKRS if
+                * it is a user access or CR0.WP = 1.
                 */
                check_write = check_pkey && wf && (uf || wp);
 
-               /* PKRU.AD stops both read and write access. */
+               /* PKRU/PKRS.AD stops both read and write access. */
                pkey_bits = !!check_pkey;
-               /* PKRU.WD stops write access. */
+               /* PKRU/PKRS.WD stops write access. */
                pkey_bits |= (!!check_write) << 1;
 
                mmu->pkr_mask |= (pkey_bits & 3) << pfec;
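Pulling the pieces together, the register-selection rule this patch adds
to permission_fault() can be modeled outside KVM as follows; pkru, pkrs,
pte_user and pte_pkey stand in for the vCPU and page-table state KVM
actually reads (illustration only, not part of the series):

#include <stdint.h>
#include <stdbool.h>

/*
 * User pages are checked against PKRU, supervisor pages (U/S clear
 * somewhere in the translation) against PKRS.
 */
static bool pk_fault(uint32_t pkru, uint32_t pkrs, bool pte_user,
                     int pte_pkey, uint32_t pkr_mask, uint32_t pfec)
{
        uint32_t reg = pte_user ? pkru : pkrs;
        uint32_t pkr_bits = (reg >> (pte_pkey * 2)) & 3;

        /* Mask with the cached conditions, as permission_fault() does. */
        pkr_bits &= pkr_mask >> (pfec & ~1u);
        return pkr_bits != 0;
}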
From patchwork Fri Aug 7 08:48:40 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 6/7] KVM: X86: Expose PKS to guest and userspace
Date: Fri, 7 Aug 2020 16:48:40 +0800
Message-Id: <20200807084841.7112-7-chenyi.qiang@intel.com>

Existence of PKS is enumerated via CPUID.(EAX=7H,ECX=0):ECX[31]. It is
enabled by setting CR4.PKS when long mode is active. PKS is currently
implemented only when EPT is enabled and requires the support of
VM_{ENTRY,EXIT}_LOAD_IA32_PKRS.

Signed-off-by: Chenyi Qiang
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c            |  3 ++-
 arch/x86/kvm/vmx/vmx.c          | 15 ++++++++++++---
 arch/x86/kvm/x86.c              |  7 +++++--
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 736e56e023d5..36c7356693c8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -99,7 +99,8 @@
                          | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
                          | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
                          | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-                         | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+                         | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+                         | X86_CR4_PKS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8a294f9747aa..897749250afd 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -325,7 +325,8 @@ void kvm_set_cpu_caps(void)
                F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
                F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
                F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-               F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+               F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+               0 /*PKS*/
        );
        /* Set LA57 based on hardware capability. */
        if (cpuid_ecx(7) & F(LA57))
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d91d59fb46fa..5cdf4d3848fb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3216,7 +3216,7 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
                }
 
                /*
-                * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+                * SMEP/SMAP/PKU/PKS is disabled if CPU is in non-paging mode in
                 * hardware. To emulate this behavior, SMEP/SMAP/PKU needs
                 * to be manually disabled when guest switches to non-paging
                 * mode.
@@ -3224,10 +3224,11 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
                 * If !enable_unrestricted_guest, the CPU is always running
                 * with CR0.PG=1 and CR4 needs to be modified.
                 * If enable_unrestricted_guest, the CPU automatically
-                * disables SMEP/SMAP/PKU when the guest sets CR0.PG=0.
+                * disables SMEP/SMAP/PKU/PKS when the guest sets CR0.PG=0.
                 */
                if (!is_paging(vcpu))
-                       hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
+                       hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE |
+                                   X86_CR4_PKS);
        }
 
        vmcs_writel(CR4_READ_SHADOW, cr4);
@@ -7348,6 +7349,14 @@ static __init void vmx_set_cpu_caps(void)
        if (vmx_pt_mode_is_host_guest())
                kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
 
+       /*
+        * PKS is not yet implemented for shadow paging.
+        * If VM_{ENTRY,EXIT}_LOAD_IA32_PKRS is not supported,
+        * don't expose PKS either.
+        */
+       if (enable_ept && cpu_has_load_ia32_pkrs())
+               kvm_cpu_cap_check_and_set(X86_FEATURE_PKS);
+
        if (vmx_umip_emulated())
                kvm_cpu_cap_set(X86_FEATURE_UMIP);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f83b28..8ad9622ee2b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -949,6 +949,8 @@ EXPORT_SYMBOL_GPL(kvm_set_xcr);
                __reserved_bits |= X86_CR4_LA57;        \
        if (!__cpu_has(__c, X86_FEATURE_UMIP))          \
                __reserved_bits |= X86_CR4_UMIP;        \
+       if (!__cpu_has(__c, X86_FEATURE_PKS))           \
+               __reserved_bits |= X86_CR4_PKS;         \
        __reserved_bits;                                \
 })
 
@@ -967,7 +969,8 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
        unsigned long old_cr4 = kvm_read_cr4(vcpu);
        unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE |
-                                  X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE;
+                                  X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE |
+                                  X86_CR4_PKS;
 
        if (kvm_valid_cr4(vcpu, cr4))
                return 1;
@@ -1202,7 +1205,7 @@ static const u32 msrs_to_save_all[] = {
        MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
        MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
        MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
-       MSR_IA32_UMWAIT_CONTROL,
+       MSR_IA32_UMWAIT_CONTROL, MSR_IA32_PKRS,
 
        MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
        MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
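A guest or userspace program can verify the enumeration with a plain
CPUID query; this sketch assumes a GCC/clang environment providing
<cpuid.h>:

#include <cpuid.h>
#include <stdio.h>

/* PKS enumeration per this patch: CPUID.(EAX=7,ECX=0):ECX[31]. */
int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                return 1;

        printf("PKS %s\n", (ecx & (1u << 31)) ? "supported" : "not supported");
        return 0;
}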
From patchwork Fri Aug 7 08:48:41 2020
From: Chenyi Qiang <chenyi.qiang@intel.com>
Subject: [RFC 7/7] KVM: VMX: Enable PKS for nested VM
Date: Fri, 7 Aug 2020 16:48:41 +0800
Message-Id: <20200807084841.7112-8-chenyi.qiang@intel.com>

The PKRS MSR is passed through to the guest directly. Configure the MSR
bitmap and the VM-entry/exit controls to match the L0/L1 settings so that
a nested VM runs PKS properly.

Signed-off-by: Chenyi Qiang
---
 arch/x86/kvm/vmx/nested.c | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmcs12.c |  2 ++
 arch/x86/kvm/vmx/vmcs12.h |  6 +++++-
 arch/x86/kvm/vmx/vmx.c    | 10 ++++++++++
 arch/x86/kvm/vmx/vmx.h    |  1 +
 5 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index df2c2e733549..1f9823d21ecd 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -647,6 +647,12 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
                                        MSR_IA32_PRED_CMD,
                                        MSR_TYPE_W);
 
+       if (!msr_write_intercepted_l01(vcpu, MSR_IA32_PKRS))
+               nested_vmx_disable_intercept_for_msr(
+                                       msr_bitmap_l1, msr_bitmap_l0,
+                                       MSR_IA32_PKRS,
+                                       MSR_TYPE_R | MSR_TYPE_W);
+
        kvm_vcpu_unmap(vcpu, &to_vmx(vcpu)->nested.msr_bitmap_map, false);
 
        return true;
@@ -2427,6 +2433,10 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
                if (kvm_mpx_supported() && vmx->nested.nested_run_pending &&
                    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
                        vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
+
+               if (vmx->nested.nested_run_pending &&
+                   (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS))
+                       vmcs_write64(GUEST_IA32_PKRS, vmcs12->guest_ia32_pkrs);
        }
 
        if (nested_cpu_has_xsaves(vmcs12))
@@ -2509,6 +2519,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
        if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
            !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
                vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs);
+
+       if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+           (!vmx->nested.nested_run_pending ||
+            !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS)))
+               vmcs_write64(GUEST_IA32_PKRS, vmx->nested.vmcs01_guest_pkrs);
 
        vmx_set_rflags(vcpu, vmcs12->guest_rflags);
 
        /* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
@@ -2849,6 +2864,10 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
                                           vmcs12->host_ia32_perf_global_ctrl)))
                return -EINVAL;
 
+       if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PKRS) &&
+           CC(!kvm_pkrs_valid(vmcs12->host_ia32_pkrs)))
+               return -EINVAL;
+
 #ifdef CONFIG_X86_64
        ia32e = !!(vcpu->arch.efer & EFER_LMA);
 #else
@@ -2998,6 +3017,10 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
        if (nested_check_guest_non_reg_state(vmcs12))
                return -EINVAL;
 
+       if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS) &&
+           CC(!kvm_pkrs_valid(vmcs12->guest_ia32_pkrs)))
+               return -EINVAL;
+
        return 0;
 }
@@ -3271,6 +3294,9 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
        if (kvm_mpx_supported() &&
            !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
                vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+       if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+           !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS))
+               vmx->nested.vmcs01_guest_pkrs = vmcs_read64(GUEST_IA32_PKRS);
 
        /*
         * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and*
@@ -3865,6 +3891,7 @@ static bool is_vmcs12_ext_field(unsigned long field)
        case GUEST_IDTR_BASE:
        case GUEST_PENDING_DBG_EXCEPTIONS:
        case GUEST_BNDCFGS:
+       case GUEST_IA32_PKRS:
                return true;
        default:
                break;
@@ -3916,6 +3943,8 @@ static void sync_vmcs02_to_vmcs12_rare(struct kvm_vcpu *vcpu,
                vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
        if (kvm_mpx_supported())
                vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+       if (kvm_cpu_cap_has(X86_FEATURE_PKS))
+               vmcs12->guest_ia32_pkrs = vmcs_read64(GUEST_IA32_PKRS);
 
        vmx->nested.need_sync_vmcs02_to_vmcs12_rare = false;
 }
@@ -4151,6 +4180,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
                WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
                                         vmcs12->host_ia32_perf_global_ctrl));
 
+       if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PKRS)
+               vmcs_write64(GUEST_IA32_PKRS, vmcs12->host_ia32_pkrs);
+
        /* Set L1 segment info according to Intel SDM
            27.5.2 Loading Host Segment and Descriptor-Table Registers */
        seg = (struct kvm_segment) {
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index c8e51c004f78..df7b2143b807 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -61,9 +61,11 @@ const unsigned short vmcs_field_to_offset_table[] = {
        FIELD64(GUEST_PDPTR2, guest_pdptr2),
        FIELD64(GUEST_PDPTR3, guest_pdptr3),
        FIELD64(GUEST_BNDCFGS, guest_bndcfgs),
+       FIELD64(GUEST_IA32_PKRS, guest_ia32_pkrs),
        FIELD64(HOST_IA32_PAT, host_ia32_pat),
        FIELD64(HOST_IA32_EFER, host_ia32_efer),
        FIELD64(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl),
+       FIELD64(HOST_IA32_PKRS, host_ia32_pkrs),
        FIELD(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control),
        FIELD(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control),
        FIELD(EXCEPTION_BITMAP, exception_bitmap),
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 80232daf00ff..009b4c317375 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -69,7 +69,9 @@ struct __packed vmcs12 {
        u64 vm_function_control;
        u64 eptp_list_address;
        u64 pml_address;
-       u64 padding64[3]; /* room for future expansion */
+       u64 guest_ia32_pkrs;
+       u64 host_ia32_pkrs;
+       u64 padding64[1]; /* room for future expansion */
        /*
         * To allow migration of L1 (complete with its L2 guests) between
         * machines of different natural widths (32 or 64 bit), we cannot have
@@ -256,6 +258,8 @@ static inline void vmx_check_vmcs12_offsets(void)
        CHECK_OFFSET(vm_function_control, 296);
        CHECK_OFFSET(eptp_list_address, 304);
        CHECK_OFFSET(pml_address, 312);
+       CHECK_OFFSET(guest_ia32_pkrs, 320);
+       CHECK_OFFSET(host_ia32_pkrs, 328);
        CHECK_OFFSET(cr0_guest_host_mask, 344);
        CHECK_OFFSET(cr4_guest_host_mask, 352);
        CHECK_OFFSET(cr0_read_shadow, 360);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5cdf4d3848fb..3c3fb554c3fc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7178,6 +7178,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
        cr4_fixed1_update(X86_CR4_PKE,        ecx, feature_bit(PKU));
        cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
        cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
+       cr4_fixed1_update(X86_CR4_PKS,        ecx, feature_bit(PKS));
 
 #undef cr4_fixed1_update
 }
@@ -7197,6 +7198,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
                        vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
                }
        }
+
+       if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+           guest_cpuid_has(vcpu, X86_FEATURE_PKS)) {
+               vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_IA32_PKRS;
+               vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_IA32_PKRS;
+       } else {
+               vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_IA32_PKRS;
+               vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_IA32_PKRS;
+       }
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 639798e4a6ca..2819d3e030b9 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -176,6 +176,7 @@ struct nested_vmx {
        /* to migrate it to L2 if VM_ENTRY_LOAD_DEBUG_CONTROLS is off */
        u64 vmcs01_debugctl;
        u64 vmcs01_guest_bndcfgs;
+       u64 vmcs01_guest_pkrs;
 
        /* to migrate it to L1 if L2 writes to L1's CR8 directly */
        int l1_tpr_threshold;
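A hypothetical guest-side smoke test for the passthrough, in the style of
kvm-unit-tests (not part of this series): it must run at CPL0 with
CR4.PKS already set, in either L1 or L2. With the intercepts disabled by
this series, the accesses below reach the hardware MSR without a VM exit.

#include <stdint.h>

#define MSR_IA32_PKRS 0x6E1

static inline uint64_t rdmsr(uint32_t idx)
{
        uint32_t lo, hi;

        asm volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(idx));
        return ((uint64_t)hi << 32) | lo;
}

static inline void wrmsr(uint32_t idx, uint64_t val)
{
        asm volatile("wrmsr" : : "c"(idx), "a"((uint32_t)val),
                     "d"((uint32_t)(val >> 32)) : "memory");
}

static int pkrs_smoke_test(void)
{
        uint64_t old = rdmsr(MSR_IA32_PKRS);

        wrmsr(MSR_IA32_PKRS, 1 << 2);           /* key 1: access-disable */
        if (rdmsr(MSR_IA32_PKRS) != (1 << 2))
                return -1;
        wrmsr(MSR_IA32_PKRS, old);
        return 0;
}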