[10/23] KVM: MMU: split cpu_role from mmu_role

Message ID

20220204115718.14934-11-pbonzini@redhat.com (mailing list archive)

State

New, archived

Headers

Return-Path: <kvm-owner@kernel.org>
From: Paolo Bonzini <pbonzini@redhat.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: dmatlack@google.com, seanjc@google.com, vkuznets@redhat.com
Subject: [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role
Date: Fri, 4 Feb 2022 06:57:05 -0500
Message-Id: <20220204115718.14934-11-pbonzini@redhat.com>
In-Reply-To: <20220204115718.14934-1-pbonzini@redhat.com>
References: <20220204115718.14934-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

KVM: MMU: MMU role refactoring

[00/23] KVM: MMU: MMU role refactoring
[01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
[02/23] KVM: MMU: nested EPT cannot be used in SMM
[03/23] KVM: MMU: remove valid from extended role
[04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs
[05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
[06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
[07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
[08/23] KVM: MMU: rephrase unclear comment
[09/23] KVM: MMU: remove "bool base_only" arguments
[10/23] KVM: MMU: split cpu_role from mmu_role
[11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs
[12/23] KVM: MMU: remove ept_ad field
[13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common
[14/23] KVM: MMU: cleanup computation of MMU roles for two-dimensional paging
[15/23] KVM: MMU: cleanup computation of MMU roles for shadow paging
[16/23] KVM: MMU: remove extended bits from mmu_role
[17/23] KVM: MMU: remove redundant bits from extended role
[18/23] KVM: MMU: fetch shadow EFER.NX from MMU role
[19/23] KVM: MMU: simplify and/or inline computation of shadow MMU roles
[20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu
[21/23] KVM: MMU: store shadow_root_level into mmu_role
[22/23] KVM: MMU: use cpu_role for root_level
[23/23] KVM: MMU: replace direct_map with mmu_role.direct

Commit Message

Paolo Bonzini Feb. 4, 2022, 11:57 a.m. UTC

Snapshot the state of the processor registers that govern page walk into
a new field of struct kvm_mmu.  This is a more natural representation
than having it *mostly* in mmu_role but not exclusively; the delta
right now is represented in other fields, such as root_level.  For
example, already in this patch we can replace role_regs_to_root_level
with the "level" field of the CPU role.

The nested MMU now has only the CPU role; in fact, the new function
kvm_calc_cpu_role is analogous to the previous kvm_calc_nested_mmu_role,
except that it sets role.base.direct to !CR0.PG instead of forcing it to
true.  It is not clear what the code meant by "setting role.base.direct
to true to detect bogus usage of the nested MMU".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kvm/mmu/mmu.c          | 100 ++++++++++++++++++++------------
 arch/x86/kvm/mmu/paging_tmpl.h  |   2 +-
 3 files changed, 64 insertions(+), 39 deletions(-)
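
The split described in the commit message can be illustrated with a small
user-space sketch.  This is not the kernel code: struct regs_model and
struct role_model are hypothetical, heavily trimmed stand-ins for
struct kvm_mmu_role_regs and union kvm_mmu_role, and the body of
regs_to_root_level() is an assumption based on the x86 paging modes (the
patch only shows the call to role_regs_to_root_level).  The TDP helper
likewise assumes that the MMU role is direct and carries the host's
page-table depth.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_mmu_role_regs. */
struct regs_model {
	bool cr0_pg, cr4_pae, cr4_la57, efer_lma;
};

/* Hypothetical, heavily trimmed stand-in for union kvm_mmu_role. */
struct role_model {
	uint8_t level;	/* page-table depth */
	bool direct;
	bool cr0_pg;
};

/* Assumed mapping from paging mode to guest page-table depth. */
uint8_t regs_to_root_level(const struct regs_model *r)
{
	if (!r->cr0_pg)
		return 0;		/* paging disabled */
	if (r->efer_lma)
		return r->cr4_la57 ? 5 : 4;
	return r->cr4_pae ? 3 : 2;
}

/* CPU role: a snapshot of what the guest's registers say. */
struct role_model calc_cpu_role(const struct regs_model *r)
{
	struct role_model role = {0};

	role.direct = !r->cr0_pg;
	if (!role.direct) {
		role.level = regs_to_root_level(r);
		role.cr0_pg = true;
	}
	return role;
}

/* MMU role under TDP: describes the pages KVM builds, not the guest. */
struct role_model calc_tdp_mmu_role(uint8_t host_tdp_level)
{
	struct role_model role = {0};

	role.direct = true;
	role.level = host_tdp_level;
	return role;
}
```

For a 32-bit non-PAE guest running under TDP, calc_cpu_role() reports a
two-level walk while calc_tdp_mmu_role() keeps whatever depth the host
uses.  That difference is the "delta" that previously lived in separate
fields such as root_level.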

Comments

David Matlack Feb. 4, 2022, 9:57 p.m. UTC | #1

On Fri, Feb 04, 2022 at 06:57:05AM -0500, Paolo Bonzini wrote:
> Snapshot the state of the processor registers that govern page walk into
> a new field of struct kvm_mmu.  This is a more natural representation
> than having it *mostly* in mmu_role but not exclusively; the delta
> right now is represented in other fields, such as root_level.  For
> example, already in this patch we can replace role_regs_to_root_level
> with the "level" field of the CPU role.
> 
> The nested MMU now has only the CPU role; in fact, the new function
> kvm_calc_cpu_role is analogous to the previous kvm_calc_nested_mmu_role,
> except that it sets role.base.direct to !CR0.PG instead of forcing it to
> true.  It is not clear what the code meant by "setting role.base.direct
> to true to detect bogus usage of the nested MMU".
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |   1 +
>  arch/x86/kvm/mmu/mmu.c          | 100 ++++++++++++++++++++------------
>  arch/x86/kvm/mmu/paging_tmpl.h  |   2 +-
>  3 files changed, 64 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4ec7d1e3aa36..427ee486309c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -432,6 +432,7 @@ struct kvm_mmu {
>  	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
>  	hpa_t root_hpa;
>  	gpa_t root_pgd;
> +	union kvm_mmu_role cpu_role;
>  	union kvm_mmu_role mmu_role;
>  	u8 root_level;
>  	u8 shadow_root_level;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index dd69cfc8c4f6..f98444e1d834 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -230,7 +230,7 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
>  #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
>  static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
>  {								\
> -	return !!(mmu->mmu_role. base_or_ext . reg##_##name);	\
> +	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
>  }
>  BUILD_MMU_ROLE_ACCESSOR(ext,  cr0, pg);
>  BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
> @@ -4658,6 +4658,38 @@ static void paging32_init_context(struct kvm_mmu *context)
>  	context->direct_map = false;
>  }
>  
> +static union kvm_mmu_role
> +kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> +{
> +	union kvm_mmu_role role = {0};
> +
> +	role.base.access = ACC_ALL;
> +	role.base.smm = is_smm(vcpu);
> +	role.base.guest_mode = is_guest_mode(vcpu);
> +	role.base.direct = !____is_cr0_pg(regs);
> +	if (!role.base.direct) {
> +		role.base.efer_nx = ____is_efer_nx(regs);
> +		role.base.cr0_wp = ____is_cr0_wp(regs);
> +		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
> +		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
> +		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
> +		role.base.level = role_regs_to_root_level(regs);
> +
> +		role.ext.cr0_pg = 1;
> +		role.ext.cr4_pae = ____is_cr4_pae(regs);
> +		role.ext.cr4_smep = ____is_cr4_smep(regs);
> +		role.ext.cr4_smap = ____is_cr4_smap(regs);
> +		role.ext.cr4_pse = ____is_cr4_pse(regs);
> +
> +		/* PKEY and LA57 are active iff long mode is active. */
> +		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
> +		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
> +		role.ext.efer_lma = ____is_efer_lma(regs);
> +	}
> +
> +	return role;
> +}
> +
>  static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
>  						   const struct kvm_mmu_role_regs *regs)
>  {
> @@ -4716,13 +4748,16 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  			     const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role new_role =
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
> +	union kvm_mmu_role mmu_role =
>  		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
>  
> -	if (new_role.as_u64 == context->mmu_role.as_u64)
> +	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
> +	    mmu_role.as_u64 == context->mmu_role.as_u64)
>  		return;
>  
> -	context->mmu_role.as_u64 = new_role.as_u64;
> +	context->cpu_role.as_u64 = cpu_role.as_u64;
> +	context->mmu_role.as_u64 = mmu_role.as_u64;
>  	context->page_fault = kvm_tdp_page_fault;
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = NULL;
> @@ -4777,13 +4812,15 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  }
>  
>  static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
> -				    const struct kvm_mmu_role_regs *regs,
> -				    union kvm_mmu_role new_role)
> +				    union kvm_mmu_role cpu_role,
> +				    union kvm_mmu_role mmu_role)
>  {
> -	if (new_role.as_u64 == context->mmu_role.as_u64)
> +	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
> +	    mmu_role.as_u64 == context->mmu_role.as_u64)
>  		return;
>  
> -	context->mmu_role.as_u64 = new_role.as_u64;
> +	context->cpu_role.as_u64 = cpu_role.as_u64;
> +	context->mmu_role.as_u64 = mmu_role.as_u64;
>  
>  	if (!is_cr0_pg(context))
>  		nonpaging_init_context(context);
> @@ -4791,20 +4828,21 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  		paging64_init_context(context);
>  	else
>  		paging32_init_context(context);
> -	context->root_level = role_regs_to_root_level(regs);
> +	context->root_level = cpu_role.base.level;
>  
>  	reset_guest_paging_metadata(vcpu, context);
> -	context->shadow_root_level = new_role.base.level;
> +	context->shadow_root_level = mmu_role.base.level;
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
>  				const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role new_role =
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
> +	union kvm_mmu_role mmu_role =
>  		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
>  
> -	shadow_mmu_init_context(vcpu, context, regs, new_role);
> +	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
>  
>  	/*
>  	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
> @@ -4839,11 +4877,10 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  		.cr4 = cr4 & ~X86_CR4_PKE,
>  		.efer = efer,
>  	};
> -	union kvm_mmu_role new_role;
> -
> -	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
> +	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
>  
> -	shadow_mmu_init_context(vcpu, context, &regs, new_role);
> +	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
>  	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
>  	kvm_mmu_new_pgd(vcpu, nested_cr3);
>  }
> @@ -4862,7 +4899,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
>  	role.base.guest_mode = true;
>  	role.base.access = ACC_ALL;
>  
> -	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
>  	role.ext.word = 0;
>  	role.ext.execonly = execonly;
>  
> @@ -4879,7 +4915,9 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
>  						   execonly, level);
>  
> -	if (new_role.as_u64 != context->mmu_role.as_u64) {
> +	if (new_role.as_u64 != context->cpu_role.as_u64) {
> +		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
> +		context->cpu_role.as_u64 = new_role.as_u64;
>  		context->mmu_role.as_u64 = new_role.as_u64;
>  
>  		context->shadow_root_level = level;
> @@ -4913,32 +4951,15 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
>  	context->inject_page_fault = kvm_inject_page_fault;
>  }
>  
> -static union kvm_mmu_role
> -kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> -{
> -	union kvm_mmu_role role;
> -
> -	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
> -
> -	/*
> -	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
> -	 * shadow pages of their own and so "direct" has no meaning.   Set it
> -	 * to "true" to try to detect bogus usage of the nested MMU.
> -	 */
> -	role.base.direct = true;
> -	role.base.level = role_regs_to_root_level(regs);
> -	return role;
> -}
> -
>  static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
>  {
> -	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
> +	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
>  	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
>  
> -	if (new_role.as_u64 == g_context->mmu_role.as_u64)
> +	if (new_role.as_u64 == g_context->cpu_role.as_u64)
>  		return;
>  
> -	g_context->mmu_role.as_u64 = new_role.as_u64;
> +	g_context->cpu_role.as_u64 = new_role.as_u64;
>  	g_context->get_guest_pgd     = get_cr3;
>  	g_context->get_pdptr         = kvm_pdptr_read;
>  	g_context->inject_page_fault = kvm_inject_page_fault;
> @@ -4997,6 +5018,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * problem is swept under the rug; KVM's CPUID API is horrific and
>  	 * it's all but impossible to solve it without introducing a new API.
>  	 */
> +	vcpu->arch.root_mmu.cpu_role.base.level = 0;
> +	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
> +	vcpu->arch.nested_mmu.cpu_role.base.level = 0;

Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
setting cpu_role.base.level to 0 might not have the desired effect.

It might not matter in practice since shadow_mmu_init_context() and
kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but it
does make this reset code confusing.

>  	vcpu->arch.root_mmu.mmu_role.base.level = 0;
>  	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
>  	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 6bb9a377bf89..b9f472f27077 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -323,7 +323,7 @@ static inline bool FNAME(is_last_gpte)(struct kvm_mmu *mmu,
>  	 * is not reserved and does not indicate a large page at this level,
>  	 * so clear PT_PAGE_SIZE_MASK in gpte if that is the case.
>  	 */
> -	gpte &= level - (PT32_ROOT_LEVEL + mmu->mmu_role.ext.cr4_pse);
> +	gpte &= level - (PT32_ROOT_LEVEL + mmu->cpu_role.ext.cr4_pse);
>  #endif
>  	/*
>  	 * PG_LEVEL_4K always terminates.  The RHS has bit 7 set
> -- 
> 2.31.1
> 
>
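
The concern raised above can be reproduced in a small user-space model.
This is only a sketch: struct cpu_role_model is a hypothetical reduction of
the cpu_role to the two fields involved, and the comparison looks at the
cpu_role alone, whereas the init functions in the patch also compare the
mmu_role.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down model of the cpu_role bits involved. */
struct cpu_role_model {
	uint8_t level;	/* stays 0 when guest paging is disabled */
	bool direct;	/* !CR0.PG, as in kvm_calc_cpu_role */
};

/* Same shape as kvm_calc_cpu_role: level is filled in only if CR0.PG=1. */
struct cpu_role_model calc_cpu_role_model(bool cr0_pg, uint8_t paging_level)
{
	struct cpu_role_model role = {0};

	role.direct = !cr0_pg;
	if (!role.direct)
		role.level = paging_level;
	return role;
}

/*
 * Models the "level = 0" reset in kvm_mmu_after_set_cpuid followed by the
 * "did the role change?" test.  Returns true if the cpu_role comparison
 * alone would cause the context to be rebuilt.
 */
bool level_reset_forces_reinit(bool cr0_pg, uint8_t paging_level)
{
	struct cpu_role_model cached = calc_cpu_role_model(cr0_pg, paging_level);
	struct cpu_role_model fresh = calc_cpu_role_model(cr0_pg, paging_level);

	cached.level = 0;	/* the reset under discussion */
	return cached.level != fresh.level || cached.direct != fresh.direct;
}
```

With paging enabled the recomputed level is nonzero, so the zeroed cached
role never matches.  With CR0.PG=0 the recomputed level is already 0, the
two roles compare equal, and the reset has no effect on the cpu_role check.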

Paolo Bonzini Feb. 5, 2022, 2:49 p.m. UTC | #2

On 2/4/22 22:57, David Matlack wrote:
>> +	vcpu->arch.root_mmu.cpu_role.base.level = 0;
>> +	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
>> +	vcpu->arch.nested_mmu.cpu_role.base.level = 0;
> Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
> setting cpu_role.base.level to 0 might not have the desired effect.
> 
> It might not matter in practice since the shadow_mmu_init_context() and
> kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but does
> make this reset code confusing.
> 

Good point.  The (still unrealized) purpose of this series is to be able 
to check mmu_role only, so for now I'll just keep the valid bit in the 
ext part of the cpu_role.  The mmu_role's level however is never zero, 
so I can already use the level when I remove the ext part from the mmu_role.

I'll remove the valid bit of the ext part only after the cpu_role check 
is removed, because then it can trivially go.

Paolo
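
Why a valid bit sidesteps the problem can be sketched in the same
user-space style.  Everything here is a hypothetical model, not the kernel
implementation: the role is reduced to three fields, and "valid" stands for
the bit in the ext part that the reply proposes to keep for now.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: cpu_role reduced to the fields that matter here. */
struct cpu_role_valid_model {
	uint8_t level;	/* 0 when guest paging is disabled */
	bool direct;	/* !CR0.PG */
	bool valid;	/* always set by the calc function */
};

/* Every freshly computed role is marked valid, whatever CR0.PG is. */
struct cpu_role_valid_model calc_valid_role(bool cr0_pg, uint8_t paging_level)
{
	struct cpu_role_valid_model role = {0};

	role.valid = true;
	role.direct = !cr0_pg;
	if (!role.direct)
		role.level = paging_level;
	return role;
}

/*
 * Invalidate the cached role by clearing "valid" instead of zeroing
 * "level", then report whether a recomputed role differs from it.
 */
bool valid_reset_forces_reinit(bool cr0_pg, uint8_t paging_level)
{
	struct cpu_role_valid_model cached = calc_valid_role(cr0_pg, paging_level);
	struct cpu_role_valid_model fresh = calc_valid_role(cr0_pg, paging_level);

	cached.valid = false;	/* no computed role ever has valid == false */
	return cached.valid != fresh.valid ||
	       cached.level != fresh.level ||
	       cached.direct != fresh.direct;
}
```

Because no computed role ever has valid cleared, the cached and recomputed
values differ regardless of CR0.PG.  Zeroing the level gives the same
guarantee only for a role whose level is never zero, which the reply says
is the case for the mmu_role.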

David Matlack Feb. 7, 2022, 9:38 p.m. UTC | #3

On Sat, Feb 5, 2022 at 6:49 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 2/4/22 22:57, David Matlack wrote:
> >> +    vcpu->arch.root_mmu.cpu_role.base.level = 0;
> >> +    vcpu->arch.guest_mmu.cpu_role.base.level = 0;
> >> +    vcpu->arch.nested_mmu.cpu_role.base.level = 0;
> > Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
> > setting cpu_role.base.level to 0 might not have the desired effect.
> >
> > It might not matter in practice since the shadow_mmu_init_context() and
> > kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but does
> > make this reset code confusing.
> >
>
> Good point.  The (still unrealized) purpose of this series is to be able
> to check mmu_role only, so for now I'll just keep the valid bit in the
> ext part of the cpu_role.  The mmu_role's level however is never zero,
> so I can already use the level when I remove the ext part from the mmu_role.

Agreed.

>
> I'll remove the valid bit of the ext part only after the cpu_role check
> is removed, because then it can trivially go.

Ok sounds good.

>
> Paolo
>

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ec7d1e3aa36..427ee486309c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -432,6 +432,7 @@  struct kvm_mmu {
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
 	hpa_t root_hpa;
 	gpa_t root_pgd;
+	union kvm_mmu_role cpu_role;
 	union kvm_mmu_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dd69cfc8c4f6..f98444e1d834 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -230,7 +230,7 @@  BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
 #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
 static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
 {								\
-	return !!(mmu->mmu_role. base_or_ext . reg##_##name);	\
+	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
 }
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr0, pg);
 BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
@@ -4658,6 +4658,38 @@  static void paging32_init_context(struct kvm_mmu *context)
 	context->direct_map = false;
 }
 
+static union kvm_mmu_role
+kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
+{
+	union kvm_mmu_role role = {0};
+
+	role.base.access = ACC_ALL;
+	role.base.smm = is_smm(vcpu);
+	role.base.guest_mode = is_guest_mode(vcpu);
+	role.base.direct = !____is_cr0_pg(regs);
+	if (!role.base.direct) {
+		role.base.efer_nx = ____is_efer_nx(regs);
+		role.base.cr0_wp = ____is_cr0_wp(regs);
+		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
+		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
+		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
+		role.base.level = role_regs_to_root_level(regs);
+
+		role.ext.cr0_pg = 1;
+		role.ext.cr4_pae = ____is_cr4_pae(regs);
+		role.ext.cr4_smep = ____is_cr4_smep(regs);
+		role.ext.cr4_smap = ____is_cr4_smap(regs);
+		role.ext.cr4_pse = ____is_cr4_pse(regs);
+
+		/* PKEY and LA57 are active iff long mode is active. */
+		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
+		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
+		role.ext.efer_lma = ____is_efer_lma(regs);
+	}
+
+	return role;
+}
+
 static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
 						   const struct kvm_mmu_role_regs *regs)
 {
@@ -4716,13 +4748,16 @@  static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 			     const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role new_role =
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
+	union kvm_mmu_role mmu_role =
 		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
 
-	if (new_role.as_u64 == context->mmu_role.as_u64)
+	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
+	    mmu_role.as_u64 == context->mmu_role.as_u64)
 		return;
 
-	context->mmu_role.as_u64 = new_role.as_u64;
+	context->cpu_role.as_u64 = cpu_role.as_u64;
+	context->mmu_role.as_u64 = mmu_role.as_u64;
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
@@ -4777,13 +4812,15 @@  kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 }
 
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
-				    const struct kvm_mmu_role_regs *regs,
-				    union kvm_mmu_role new_role)
+				    union kvm_mmu_role cpu_role,
+				    union kvm_mmu_role mmu_role)
 {
-	if (new_role.as_u64 == context->mmu_role.as_u64)
+	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
+	    mmu_role.as_u64 == context->mmu_role.as_u64)
 		return;
 
-	context->mmu_role.as_u64 = new_role.as_u64;
+	context->cpu_role.as_u64 = cpu_role.as_u64;
+	context->mmu_role.as_u64 = mmu_role.as_u64;
 
 	if (!is_cr0_pg(context))
 		nonpaging_init_context(context);
@@ -4791,20 +4828,21 @@  static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 		paging64_init_context(context);
 	else
 		paging32_init_context(context);
-	context->root_level = role_regs_to_root_level(regs);
+	context->root_level = cpu_role.base.level;
 
 	reset_guest_paging_metadata(vcpu, context);
-	context->shadow_root_level = new_role.base.level;
+	context->shadow_root_level = mmu_role.base.level;
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 				const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role new_role =
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
+	union kvm_mmu_role mmu_role =
 		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
 
-	shadow_mmu_init_context(vcpu, context, regs, new_role);
+	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 
 	/*
 	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
@@ -4839,11 +4877,10 @@  void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 		.cr4 = cr4 & ~X86_CR4_PKE,
 		.efer = efer,
 	};
-	union kvm_mmu_role new_role;
-
-	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
+	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
-	shadow_mmu_init_context(vcpu, context, &regs, new_role);
+	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
 	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
@@ -4862,7 +4899,6 @@  kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
 	role.base.guest_mode = true;
 	role.base.access = ACC_ALL;
 
-	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
 	role.ext.word = 0;
 	role.ext.execonly = execonly;
 
@@ -4879,7 +4915,9 @@  void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
 						   execonly, level);
 
-	if (new_role.as_u64 != context->mmu_role.as_u64) {
+	if (new_role.as_u64 != context->cpu_role.as_u64) {
+		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
+		context->cpu_role.as_u64 = new_role.as_u64;
 		context->mmu_role.as_u64 = new_role.as_u64;
 
 		context->shadow_root_level = level;
@@ -4913,32 +4951,15 @@  static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
 	context->inject_page_fault = kvm_inject_page_fault;
 }
 
-static union kvm_mmu_role
-kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
-{
-	union kvm_mmu_role role;
-
-	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
-
-	/*
-	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
-	 * shadow pages of their own and so "direct" has no meaning.   Set it
-	 * to "true" to try to detect bogus usage of the nested MMU.
-	 */
-	role.base.direct = true;
-	role.base.level = role_regs_to_root_level(regs);
-	return role;
-}
-
 static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 {
-	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
+	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
 	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
 
-	if (new_role.as_u64 == g_context->mmu_role.as_u64)
+	if (new_role.as_u64 == g_context->cpu_role.as_u64)
 		return;
 
-	g_context->mmu_role.as_u64 = new_role.as_u64;
+	g_context->cpu_role.as_u64 = new_role.as_u64;
 	g_context->get_guest_pgd     = get_cr3;
 	g_context->get_pdptr         = kvm_pdptr_read;
 	g_context->inject_page_fault = kvm_inject_page_fault;
@@ -4997,6 +5018,9 @@  void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * problem is swept under the rug; KVM's CPUID API is horrific and
 	 * it's all but impossible to solve it without introducing a new API.
 	 */
+	vcpu->arch.root_mmu.cpu_role.base.level = 0;
+	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
+	vcpu->arch.nested_mmu.cpu_role.base.level = 0;
 	vcpu->arch.root_mmu.mmu_role.base.level = 0;
 	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
 	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 6bb9a377bf89..b9f472f27077 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -323,7 +323,7 @@  static inline bool FNAME(is_last_gpte)(struct kvm_mmu *mmu,
 	 * is not reserved and does not indicate a large page at this level,
 	 * so clear PT_PAGE_SIZE_MASK in gpte if that is the case.
 	 */
-	gpte &= level - (PT32_ROOT_LEVEL + mmu->mmu_role.ext.cr4_pse);
+	gpte &= level - (PT32_ROOT_LEVEL + mmu->cpu_role.ext.cr4_pse);
 #endif
 	/*
 	 * PG_LEVEL_4K always terminates.  The RHS has bit 7 set
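
The masking line touched in paging_tmpl.h relies on unsigned wraparound,
which is easy to misread.  The sketch below isolates just that line in
user space.  It assumes PT32_ROOT_LEVEL is 2 and that PT_PAGE_SIZE_MASK is
bit 7 (the surrounding comment mentions bit 7); the helper names are
hypothetical and the rest of is_last_gpte is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define PT32_ROOT_LEVEL_MODEL	2u		/* assumed value of PT32_ROOT_LEVEL */
#define PT_PAGE_SIZE_MASK_MODEL	(1u << 7)	/* bit 7 of the guest PTE */

/*
 * Same arithmetic as the patched line.  If level < 2 + PSE, the unsigned
 * subtraction wraps around and yields a value with bit 7 set, so the PS
 * bit in gpte is preserved.  Otherwise the result is a small number with
 * bit 7 clear and the AND wipes the PS bit.
 */
uint32_t mask_ignored_ps_bit(uint32_t gpte, uint32_t level, bool cr4_pse)
{
	gpte &= level - (PT32_ROOT_LEVEL_MODEL + (cr4_pse ? 1u : 0u));
	return gpte;
}

/* Convenience wrapper: does the PS bit survive the masking step? */
bool ps_bit_survives(uint32_t gpte, uint32_t level, bool cr4_pse)
{
	return mask_ignored_ps_bit(gpte, level, cr4_pse) & PT_PAGE_SIZE_MASK_MODEL;
}
```

At level 2 with CR4.PSE=0 the right-hand side is 0, so bit 7 is dropped and
the entry is not treated as a large page; with CR4.PSE=1 the subtraction
wraps and bit 7 is kept.  The AND can also clear other bits of gpte, so
this sketch only inspects bit 7 of the result.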
