
[05/12] KVM: MMU: avoid NULL-pointer dereference on page freeing bugs

Message ID 20220209170020.1775368-6-pbonzini@redhat.com (mailing list archive)
State New, archived
Series: KVM: MMU: do not unload MMU roots on all role changes

Commit Message

Paolo Bonzini Feb. 9, 2022, 5 p.m. UTC
If kvm_mmu_free_roots encounters a PAE page table where a 64-bit page
table is expected, the result is a NULL pointer dereference.  Instead
just WARN and exit.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Sean Christopherson Feb. 11, 2022, 12:24 a.m. UTC | #1
On Wed, Feb 09, 2022, Paolo Bonzini wrote:
> If kvm_mmu_free_roots encounters a PAE page table where a 64-bit page
> table is expected, the result is a NULL pointer dereference.  Instead
> just WARN and exit.

This confused the heck out of me, because we obviously free PAE page tables.  What
we don't do is back the root that gets shoved into CR3 with a shadow page.  It'd
be especially confusing without the context that this WARN was helpful during
related development, as it's not super obvious why mmu_free_root_page() is a special
snowflake and deserves a WARN.

Something like this?

  WARN and bail if KVM attempts to free a root that isn't backed by a shadow
  page.  KVM allocates a bare page for "special" roots, e.g. when using PAE
  paging or shadowing 2/3/4-level page tables with 4/5-level, and so root_hpa
  will be valid but won't be backed by a shadow page.  It's all too easy to
  blindly call mmu_free_root_page() on root_hpa, be nice and WARN instead of
  crashing KVM and possibly the kernel.

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7b5765ced928..d0f2077bd798 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3201,6 +3201,8 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
>  		return;
>  
>  	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
> +	if (WARN_ON(!sp))

Should this be KVM_BUG_ON()?  I.e. when you triggered these, would continuing on
potentially corrupt guest data, or was it truly benign-ish?

> +		return;
>  
>  	if (is_tdp_mmu_page(sp))
>  		kvm_tdp_mmu_put_root(kvm, sp, false);
> -- 
> 2.31.1
> 
>
Paolo Bonzini Feb. 11, 2022, 11:21 a.m. UTC | #2
On 2/11/22 01:24, Sean Christopherson wrote:
>>   
>>   	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
>> +	if (WARN_ON(!sp))
>
> Should this be KVM_BUG_ON()?  I.e. when you triggered these, would continuing on
> potentially corrupt guest data, or was it truly benign-ish?

It only triggered on the mode_switch SVM unit test (with npt=0), i.e. in 
a very small test which just hung after the bug.  The WARN however was 
the 10-minute difference between rmmod and reboot...

I didn't use KVM_BUG_ON because we're in a pretty deep call stack 
(kvm_mmu_new_pgd, itself called from nested vmentry) and all sorts of 
stuff will happen before bailing out.  My mental model is to use 
KVM_BUG_ON in situations where error propagation is possible and clean.

Paolo

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7b5765ced928..d0f2077bd798 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3201,6 +3201,8 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 		return;
 
 	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
+	if (WARN_ON(!sp))
+		return;
 
 	if (is_tdp_mmu_page(sp))
 		kvm_tdp_mmu_put_root(kvm, sp, false);