[02/16] KVM: x86: clamp host mapping level to max_level in kvm_mmu_max_mapping_level

Message ID 20210807134936.3083984-3-pbonzini@redhat.com (mailing list archive)
State New, archived
Series KVM: x86: pass arguments on the page fault path via struct kvm_page_fault

Commit Message

Paolo Bonzini Aug. 7, 2021, 1:49 p.m. UTC
This patch started as a way to make kvm_mmu_hugepage_adjust a bit simpler,
in preparation for switching it to struct kvm_page_fault, but it does
fix a microscopic bug in zapping collapsible PTEs.

If one large page size is disallowed but not all of them, kvm_mmu_max_mapping_level
will return the host mapping level and small PTEs will be zapped up to
that level.  However, if e.g. 1GB pages are prohibited, we can still zap
the 4KB mappings and preserve the 2MB ones.  This can happen for example
when NX huge pages are in use.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
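
To make the scenario concrete, here is a stand-alone userspace sketch of the
clamping behaviour.  The PG_LEVEL_* values and the "1GB disallowed" policy are
made up for the example; the real logic lives in kvm_mmu_max_mapping_level()
in arch/x86/kvm/mmu/mmu.c.

#include <stdio.h>

/* Hypothetical stand-ins for the kernel's page-level constants. */
enum { PG_LEVEL_4K = 1, PG_LEVEL_2M = 2, PG_LEVEL_1G = 3 };

static int min_int(int a, int b) { return a < b ? a : b; }

/* Example policy: 1GB pages disallowed, e.g. by the NX hugepage workaround. */
static int disallowed(int level) { return level == PG_LEVEL_1G; }

/*
 * Mirrors the fixed kvm_mmu_max_mapping_level(): walk down from max_level
 * until an allowed large-page level is found, then clamp the host level to it.
 */
static int max_mapping_level(int max_level, int host_level)
{
	for ( ; max_level > PG_LEVEL_4K; max_level--)
		if (!disallowed(max_level))
			break;

	if (max_level == PG_LEVEL_4K)
		return PG_LEVEL_4K;

	/* The fix: never exceed max_level even if the host page is larger. */
	return min_int(host_level, max_level);
}

int main(void)
{
	/*
	 * Host backs the pfn with a 1GB page, but 1GB is disallowed: the old
	 * code returned PG_LEVEL_1G (3), the clamped code returns PG_LEVEL_2M
	 * (2), so existing 2MB mappings are not needlessly zapped as
	 * collapsible.
	 */
	printf("%d\n", max_mapping_level(PG_LEVEL_1G, PG_LEVEL_1G));
	return 0;
}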

Comments

Sean Christopherson Aug. 13, 2021, 4:28 p.m. UTC | #1
On Sat, Aug 07, 2021, Paolo Bonzini wrote:
> This patch started as a way to make kvm_mmu_hugepage_adjust a bit simpler,
> in preparation for switching it to struct kvm_page_fault, but it does
> fix a microscopic bug in zapping collapsible PTEs.

I think this also fixes a bug where userspace backs guest memory with a 1gb hugepage
but only assigns a subset of the page to the guest.  1gb pages would be disallowed
by the memslot, but not 2mb.  kvm_mmu_max_mapping_level() would fall through to the
host_pfn_mapping_level() logic, see the 1gb hugepage, and map the whole thing into the
guest.  I can't imagine any userspace would actually do something like that, but the
failure mode is serious enough that it warrants a Fixes: + Cc: stable@.
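
The setup described above would look roughly like the hypothetical snippet
below (slot number, addresses and sizes are invented for illustration, error
handling is omitted, and MAP_HUGE_1GB may need a fallback definition on older
libcs): a 1GB hugetlb backing with only a 2MB-aligned 4MB slice exposed through
the memslot, so the memslot disallows 1GB mappings but still allows 2MB ones.

#define _GNU_SOURCE
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT	26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)
#endif

/*
 * Back guest memory with a single 1GB hugepage but expose only a 4MB slice
 * of it to the guest.  Before the clamp, KVM could pick up the 1GB host page
 * from host_pfn_mapping_level() and map memory beyond the end of the memslot
 * into the guest.
 */
static void add_partial_hugepage_slot(int vm_fd)
{
	void *backing = mmap(NULL, 1UL << 30, PROT_READ | PROT_WRITE,
			     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
			     MAP_HUGE_1GB, -1, 0);

	struct kvm_userspace_memory_region region = {
		.slot = 0,
		.guest_phys_addr = 0,
		.memory_size = 4UL << 20,	/* only 4MB of the 1GB page */
		.userspace_addr = (unsigned long)backing,
	};

	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}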

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7477f340d318..5d4de39fe5a9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2868,6 +2868,7 @@  int kvm_mmu_max_mapping_level(struct kvm *kvm,
 			      kvm_pfn_t pfn, int max_level)
 {
 	struct kvm_lpage_info *linfo;
+	int host_level;
 
 	max_level = min(max_level, max_huge_page_level);
 	for ( ; max_level > PG_LEVEL_4K; max_level--) {
@@ -2879,7 +2880,8 @@  int kvm_mmu_max_mapping_level(struct kvm *kvm,
 	if (max_level == PG_LEVEL_4K)
 		return PG_LEVEL_4K;
 
-	return host_pfn_mapping_level(kvm, gfn, pfn, slot);
+	host_level = host_pfn_mapping_level(kvm, gfn, pfn, slot);
+	return min(host_level, max_level);
 }
 
 int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
@@ -2903,17 +2905,12 @@  int kvm_mmu_hugepage_adjust(struct kvm_vcpu *vcpu, gfn_t gfn,
 	if (!slot)
 		return PG_LEVEL_4K;
 
-	level = kvm_mmu_max_mapping_level(vcpu->kvm, slot, gfn, pfn, max_level);
-	if (level == PG_LEVEL_4K)
-		return level;
-
-	*req_level = level = min(level, max_level);
-
 	/*
 	 * Enforce the iTLB multihit workaround after capturing the requested
 	 * level, which will be used to do precise, accurate accounting.
 	 */
-	if (huge_page_disallowed)
+	*req_level = level = kvm_mmu_max_mapping_level(vcpu->kvm, slot, gfn, pfn, max_level);
+	if (level == PG_LEVEL_4K || huge_page_disallowed)
 		return PG_LEVEL_4K;
 
 	/*