
KVM: arm64: Retry fault if vma_lookup() results become invalid

Message ID: 20230313235454.2964067-1-dmatlack@google.com
State: New, archived
Series: KVM: arm64: Retry fault if vma_lookup() results become invalid

Commit Message

David Matlack March 13, 2023, 11:54 p.m. UTC
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
detect if the results of vma_lookup() (e.g. vma_shift) become stale
before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
VMA could be changed by userspace after vma_lookup() and before KVM
reads the mmu_invalidate_seq, causing KVM to install page table entries
based on a (possibly) no-longer-valid vma_shift.

Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
inducing spurious fault retries).

This bug has existed since KVM/ARM's inception. It's unlikely that any
sane userspace currently modifies VMAs in such a way as to trigger this
race. And even with directed testing I was unable to reproduce it. But a
sufficiently motivated host userspace might be able to exploit this
race.
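
To make the resulting ordering concrete, here is a heavily simplified sketch
of user_mem_abort() after this change. It is not the actual hunk: helper
names approximate arch/arm64/kvm/mmu.c and the generic KVM code, and hugepage
adjustment, MTE, permission handling and most error paths are elided.

/*
 * Heavily simplified sketch of user_mem_abort() after this patch, showing
 * only the ordering.  Helper names approximate the real kernel code.
 */
static int fault_ordering_sketch(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
                                 struct kvm_memory_slot *memslot,
                                 unsigned long hva, bool write_fault)
{
        struct kvm *kvm = vcpu->kvm;
        struct vm_area_struct *vma;
        unsigned long mmu_seq;
        kvm_pfn_t pfn;
        bool writable;
        int vma_shift;
        gfn_t gfn = fault_ipa >> PAGE_SHIFT;

        /*
         * The memcache top-up now happens up here, before mmu_invalidate_seq
         * is read, so that it cannot induce spurious fault retries.
         */

        mmap_read_lock(current->mm);
        vma = vma_lookup(current->mm, hva);
        if (unlikely(!vma)) {
                mmap_read_unlock(current->mm);
                return -EFAULT;
        }
        vma_shift = get_vma_page_shift(vma, hva);  /* result that can go stale */

        /*
         * Snapshot the invalidation sequence *before* dropping mmap_lock.
         * A later VMA change (mprotect, munmap, ...) reaches KVM through the
         * MMU notifiers and bumps mmu_invalidate_seq, so it is caught by the
         * retry check below.  Per the patch, mmap_read_unlock() provides the
         * ordering the explicit smp_rmb() used to provide.
         */
        mmu_seq = kvm->mmu_invalidate_seq;
        mmap_read_unlock(current->mm);

        pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
                                   write_fault, &writable, NULL);

        read_lock(&kvm->mmu_lock);
        if (mmu_invalidate_retry(kvm, mmu_seq)) {
                /* vma_shift and/or pfn may be stale: back out. */
                read_unlock(&kvm->mmu_lock);
                kvm_release_pfn_clean(pfn);
                return 0;
        }

        /* ... install the stage-2 mapping using vma_shift, pfn, writable ... */

        read_unlock(&kvm->mmu_lock);
        kvm_release_pfn_clean(pfn);
        return 0;
}

If the sequence changed, the handler backs out without installing anything
and the guest simply takes the fault again with fresh state.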

Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
Cc: stable@vger.kernel.org
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/arm64/kvm/mmu.c | 48 +++++++++++++++++++-------------------------
 1 file changed, 21 insertions(+), 27 deletions(-)


base-commit: 96a4627dbbd48144a65af936b321701c70876026

Comments

Marc Zyngier March 14, 2023, 4:31 p.m. UTC | #1
[Dropping Christoffer's 11 year obsolete address...]

On Mon, 13 Mar 2023 23:54:54 +0000,
David Matlack <dmatlack@google.com> wrote:
> 
> Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
> detect if the results of vma_lookup() (e.g. vma_shift) become stale
> before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
> VMA could be changed by userspace after vma_lookup() and before KVM
> reads the mmu_invalidate_seq, causing KVM to install page table entries
> based on a (possibly) no-longer-valid vma_shift.
> 
> Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
> is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
> inducing spurious fault retries).
> 
> This bug has existed since KVM/ARM's inception. It's unlikely that any
> sane userspace currently modifies VMAs in such a way as to trigger this
> race. And even with directed testing I was unable to reproduce it. But a
> sufficiently motivated host userspace might be able to exploit this
> race.
> 
> Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")

Ah, good luck with that one! :D user_mem_abort() used to be so nice
and simple at the time! And yet...

> Cc: stable@vger.kernel.org
> Reported-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: David Matlack <dmatlack@google.com>

Reviewed-by: Marc Zyngier <maz@kernel.org>

Oliver, how do you want to deal with this one? queue it right now? Or
wait until the dust settles on my two other patches?

I don't mind either way, I can either take it as part of the same
series, or rebase my stuff on it.

Thanks,

	M.
Oliver Upton March 14, 2023, 4:46 p.m. UTC | #2
On Tue, Mar 14, 2023 at 04:31:38PM +0000, Marc Zyngier wrote:
> [Dropping Christoffer's 11 year obsolete address...]
> 
> On Mon, 13 Mar 2023 23:54:54 +0000,
> David Matlack <dmatlack@google.com> wrote:
> > 
> > Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
> > detect if the results of vma_lookup() (e.g. vma_shift) become stale
> > before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
> > VMA could be changed by userspace after vma_lookup() and before KVM
> > reads the mmu_invalidate_seq, causing KVM to install page table entries
> > based on a (possibly) no-longer-valid vma_shift.
> > 
> > Re-order the MMU cache top-up to earlier in user_mem_abort() so that it
> > is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid
> > inducing spurious fault retries).
> > 
> > This bug has existed since KVM/ARM's inception. It's unlikely that any
> > sane userspace currently modifies VMAs in such a way as to trigger this
> > race. And even with directed testing I was unable to reproduce it. But a
> > sufficiently motivated host userspace might be able to exploit this
> > race.
> > 
> > Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM")
> 
> Ah, good luck with that one! :D user_mem_abort() used to be so nice
> and simple at the time! And yet...
> 
> > Cc: stable@vger.kernel.org
> > Reported-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: David Matlack <dmatlack@google.com>
> 
> Reviewed-by: Marc Zyngier <maz@kernel.org>
> 
> Oliver, how do you want to deal with this one? queue it right now? Or
> wait until the dust settles on my two other patches?
> 
> I don't mind either way, I can either take it as part of the same
> series, or rebase my stuff on it.

I'll go ahead and grab it if you want to base your series on top of
this, thanks both of you!
Oliver Upton March 14, 2023, 6:10 p.m. UTC | #3
On Mon, 13 Mar 2023 16:54:54 -0700, David Matlack wrote:
> Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can
> detect if the results of vma_lookup() (e.g. vma_shift) become stale
> before it acquires kvm->mmu_lock. This fixes a theoretical bug where a
> VMA could be changed by userspace after vma_lookup() and before KVM
> reads the mmu_invalidate_seq, causing KVM to install page table entries
> based on a (possibly) no-longer-valid vma_shift.
> 
> [...]

Applied to kvmarm/fixes, thanks!

[1/1] KVM: arm64: Retry fault if vma_lookup() results become invalid
      https://git.kernel.org/kvmarm/kvmarm/c/13ec9308a857

--
Best,
Oliver

Patch

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..f54408355d1d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1217,6 +1217,20 @@  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
+	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (fault_status != ESR_ELx_FSC_PERM ||
+	    (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(kvm));
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
@@ -1269,37 +1283,17 @@  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		fault_ipa &= ~(vma_pagesize - 1);
 
 	gfn = fault_ipa >> PAGE_SHIFT;
-	mmap_read_unlock(current->mm);
-
-	/*
-	 * Permission faults just need to update the existing leaf entry,
-	 * and so normally don't require allocations from the memcache. The
-	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
-	 */
-	if (fault_status != ESR_ELx_FSC_PERM ||
-	    (logging_active && write_fault)) {
-		ret = kvm_mmu_topup_memory_cache(memcache,
-						 kvm_mmu_cache_min_pages(kvm));
-		if (ret)
-			return ret;
-	}
 
-	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	/*
-	 * Ensure the read of mmu_invalidate_seq happens before we call
-	 * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk
-	 * the page we just got a reference to gets unmapped before we have a
-	 * chance to grab the mmu_lock, which ensure that if the page gets
-	 * unmapped afterwards, the call to kvm_unmap_gfn will take it away
-	 * from us again properly. This smp_rmb() interacts with the smp_wmb()
-	 * in kvm_mmu_notifier_invalidate_<page|range_end>.
+	 * Read mmu_invalidate_seq so that KVM can detect if the results of
+	 * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to
+	 * acquiring kvm->mmu_lock.
 	 *
-	 * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is
-	 * used to avoid unnecessary overhead introduced to locate the memory
-	 * slot because it's always fixed even @gfn is adjusted for huge pages.
+	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+	 * with the smp_wmb() in kvm_mmu_invalidate_end().
 	 */
-	smp_rmb();
+	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
+	mmap_read_unlock(current->mm);
 
 	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
 				   write_fault, &writable, NULL);
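
For reference, the invalidation side that this ordering pairs with looks
roughly like the following, condensed from virt/kvm/kvm_main.c and
include/linux/kvm_host.h of this era. Range bookkeeping and most comments
are trimmed; this is not verbatim kernel code.

/* Called from the MMU notifiers while holding kvm->mmu_lock. */
void kvm_mmu_invalidate_begin(struct kvm *kvm, unsigned long start,
                              unsigned long end)
{
        kvm->mmu_invalidate_in_progress++;
        /* ... also records the range being invalidated ... */
}

void kvm_mmu_invalidate_end(struct kvm *kvm, unsigned long start,
                            unsigned long end)
{
        /*
         * Bump the sequence so a fault handler that sampled the old value
         * before this invalidation knows its lookups may be stale.
         */
        kvm->mmu_invalidate_seq++;
        smp_wmb();      /* the write side of the pairing described above */
        kvm->mmu_invalidate_in_progress--;
}

/* Checked by the fault handler once it holds kvm->mmu_lock. */
static inline int mmu_invalidate_retry(struct kvm *kvm, unsigned long mmu_seq)
{
        if (unlikely(kvm->mmu_invalidate_in_progress))
                return 1;
        smp_rmb();
        if (kvm->mmu_invalidate_seq != mmu_seq)
                return 1;
        return 0;
}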