diff mbox series

[v4,4/4] KVM: mmu: remove over-aggressive warnings

Message ID 20210929042908.1313874-5-stevensd@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: allow mapping non-refcounted pages | expand

Commit Message

David Stevens Sept. 29, 2021, 4:29 a.m. UTC
From: David Stevens <stevensd@chromium.org>

Remove two warnings that require ref counts for pages to be non-zero, as
mapped pfns from follow_pfn may not have an initialized ref count.

Signed-off-by: David Stevens <stevensd@chromium.org>
---
 arch/x86/kvm/mmu/mmu.c | 7 -------
 virt/kvm/kvm_main.c    | 2 +-
 2 files changed, 1 insertion(+), 8 deletions(-)

Comments

Sean Christopherson Oct. 13, 2021, 12:02 a.m. UTC | #1
On Wed, Sep 29, 2021, David Stevens wrote:
> From: David Stevens <stevensd@chromium.org>
> 
> Remove two warnings that require ref counts for pages to be non-zero, as
> mapped pfns from follow_pfn may not have an initialized ref count.
> 
> Signed-off-by: David Stevens <stevensd@chromium.org>
> ---
>  arch/x86/kvm/mmu/mmu.c | 7 -------
>  virt/kvm/kvm_main.c    | 2 +-
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 5a1adcc9cfbc..3b469df63bcf 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -617,13 +617,6 @@ static int mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
>  
>  	pfn = spte_to_pfn(old_spte);
>  
> -	/*
> -	 * KVM does not hold the refcount of the page used by
> -	 * kvm mmu, before reclaiming the page, we should
> -	 * unmap it from mmu first.
> -	 */
> -	WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn)));

Have you actually observed false positives with this WARN?  I would expect anything
without a struct page to get filtered out by !kvm_is_reserved_pfn(pfn).

If you have observed false positives, I would strongly prefer we find a way to
keep the page_count() sanity check, it has proven very helpful in the past in
finding/debugging bugs during MMU development.

> -
>  	if (is_accessed_spte(old_spte))
>  		kvm_set_pfn_accessed(pfn);
>
David Stevens Oct. 13, 2021, 3:29 a.m. UTC | #2
On Wed, Oct 13, 2021 at 9:02 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Sep 29, 2021, David Stevens wrote:
> > From: David Stevens <stevensd@chromium.org>
> >
> > Remove two warnings that require ref counts for pages to be non-zero, as
> > mapped pfns from follow_pfn may not have an initialized ref count.
> >
> > Signed-off-by: David Stevens <stevensd@chromium.org>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 7 -------
> >  virt/kvm/kvm_main.c    | 2 +-
> >  2 files changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 5a1adcc9cfbc..3b469df63bcf 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -617,13 +617,6 @@ static int mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
> >
> >       pfn = spte_to_pfn(old_spte);
> >
> > -     /*
> > -      * KVM does not hold the refcount of the page used by
> > -      * kvm mmu, before reclaiming the page, we should
> > -      * unmap it from mmu first.
> > -      */
> > -     WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn)));
>
> Have you actually observed false positives with this WARN?  I would expect anything
> without a struct page to get filtered out by !kvm_is_reserved_pfn(pfn).

Those are the type of pfns that were responsible for CVE-2021-22543
[1]. One specific example is that amdgpu uses ttm_pool, which makes
higher order, non-compound allocation. Without the head/tail metadata,
only the first base page in such an allocation has non-zero
page_count.

[1] https://github.com/google/security-research/security/advisories/GHSA-7wq5-phmq-m584

> If you have observed false positives, I would strongly prefer we find a way to
> keep the page_count() sanity check, it has proven very helpful in the past in
> finding/debugging bugs during MMU development.

When we see a refcount of zero, I think we can look up spte->(gfn,
slot)->hva->vma and determine whether or not the zero refcount is
okay, based on vm_flags. That's kind of heavy for a debug check,
although at least we'd only pay the cost for unusual mappings. But it
still might make sense to switch to a MMU_WARN_ON, in that case. Or we
could just ignore the cost, since at least from a superficial reading
and some basic tests, tdp_mmu doesn't seem to execute this code path.

Thoughts? I'd lean towards MMU_WARN_ON, but I'd like to know what the
maintainers' preferences are before sending an updated patch series.

-David

>
> > -
> >       if (is_accessed_spte(old_spte))
> >               kvm_set_pfn_accessed(pfn);
> >
Sean Christopherson Oct. 15, 2021, 12:10 a.m. UTC | #3
On Wed, Oct 13, 2021, David Stevens wrote:
> On Wed, Oct 13, 2021 at 9:02 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Sep 29, 2021, David Stevens wrote:
> > > From: David Stevens <stevensd@chromium.org>
> > >
> > > Remove two warnings that require ref counts for pages to be non-zero, as
> > > mapped pfns from follow_pfn may not have an initialized ref count.
> > >
> > > Signed-off-by: David Stevens <stevensd@chromium.org>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 7 -------
> > >  virt/kvm/kvm_main.c    | 2 +-
> > >  2 files changed, 1 insertion(+), 8 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index 5a1adcc9cfbc..3b469df63bcf 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -617,13 +617,6 @@ static int mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
> > >
> > >       pfn = spte_to_pfn(old_spte);
> > >
> > > -     /*
> > > -      * KVM does not hold the refcount of the page used by
> > > -      * kvm mmu, before reclaiming the page, we should
> > > -      * unmap it from mmu first.
> > > -      */
> > > -     WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn)));
> >
> > Have you actually observed false positives with this WARN?  I would expect anything
> > without a struct page to get filtered out by !kvm_is_reserved_pfn(pfn).
> 
> Those are the type of pfns that were responsible for CVE-2021-22543
> [1]. One specific example is that amdgpu uses ttm_pool, which makes
> higher order, non-compound allocation. Without the head/tail metadata,
> only the first base page in such an allocation has non-zero
> page_count.

Huh.  I hadn't actually read the CVE, or obviously thought critically about the
problem. :-)

> [1] https://github.com/google/security-research/security/advisories/GHSA-7wq5-phmq-m584
> 
> > If you have observed false positives, I would strongly prefer we find a way to
> > keep the page_count() sanity check, it has proven very helpful in the past in
> > finding/debugging bugs during MMU development.
> 
> When we see a refcount of zero, I think we can look up spte->(gfn,
> slot)->hva->vma and determine whether or not the zero refcount is
> okay, based on vm_flags. That's kind of heavy for a debug check,
> although at least we'd only pay the cost for unusual mappings. But it
> still might make sense to switch to a MMU_WARN_ON, in that case. Or we
> could just ignore the cost, since at least from a superficial reading
> and some basic tests, tdp_mmu doesn't seem to execute this code path.
> 
> Thoughts? I'd lean towards MMU_WARN_ON, but I'd like to know what the
> maintainers' preferences are before sending an updated patch series.

MMU_WARN_ON is a poor choice, but only because no one turns it on.  I think we've
discussed turning it into a proper Kconfig (and killing off mmu_audit.c) multiple
times, but no one has actually followed through.

The TDP MMU indeed doesn't hit this path.  So I'd say just keep this patch as is
and punt the whole MMU_WARN_ON / audit cleanup to the future.  I bet if we spend
any time at all, we can think of a big pile of MMU sanity checks we could add to
KVM, i.e. this can be but one of many checks that applies to all flavors of MMUs.
diff mbox series

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5a1adcc9cfbc..3b469df63bcf 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -617,13 +617,6 @@  static int mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
 
 	pfn = spte_to_pfn(old_spte);
 
-	/*
-	 * KVM does not hold the refcount of the page used by
-	 * kvm mmu, before reclaiming the page, we should
-	 * unmap it from mmu first.
-	 */
-	WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn)));
-
 	if (is_accessed_spte(old_spte))
 		kvm_set_pfn_accessed(pfn);
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 421146bd1360..c72ad995a359 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -168,7 +168,7 @@  bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
 	 * the device has been pinned, e.g. by get_user_pages().  WARN if the
 	 * page_count() is zero to help detect bad usage of this helper.
 	 */
-	if (!pfn_valid(pfn) || WARN_ON_ONCE(!page_count(pfn_to_page(pfn))))
+	if (!pfn_valid(pfn) || !page_count(pfn_to_page(pfn)))
 		return false;
 
 	return is_zone_device_page(pfn_to_page(pfn));