KVM: x86/mmu: Check that root is valid/loaded when pre-faulting SPTEs

Message ID 20240723000211.3352304-1-seanjc@google.com (mailing list archive)
State New, archived

Commit Message

Sean Christopherson July 23, 2024, 12:02 a.m. UTC
Error out if kvm_mmu_reload() fails when pre-faulting memory, as trying to
fault-in SPTEs will fail miserably due to root.hpa pointing at garbage.

Note, kvm_mmu_reload() can return -EIO and thus trigger the WARN on -EIO
in kvm_vcpu_pre_fault_memory(), but all such paths also WARN, i.e. the
WARN isn't user-triggerable and won't run afoul of panic_on_warn because
the kernel would already be panicking.

  BUG: unable to handle page fault for address: 000029ffffffffe8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0000 [#1] PREEMPT SMP
  CPU: 22 PID: 1069 Comm: pre_fault_memor Not tainted 6.10.0-rc7-332d2c1d713e-next-vm #548
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:is_page_fault_stale+0x3e/0xe0 [kvm]
  RSP: 0018:ffffc9000114bd48 EFLAGS: 00010206
  RAX: 00003fffffffffc0 RBX: ffff88810a07c080 RCX: ffffc9000114bd78
  RDX: ffff88810a07c080 RSI: ffffea0000000000 RDI: ffff88810a07c080
  RBP: ffffc9000114bd78 R08: 00007fa3c8c00000 R09: 8000000000000225
  R10: ffffea00043d7d80 R11: 0000000000000000 R12: ffff88810a07c080
  R13: 0000000100000000 R14: ffffc9000114be58 R15: 0000000000000000
  FS:  00007fa3c9da0740(0000) GS:ffff888277d80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000029ffffffffe8 CR3: 000000011d698000 CR4: 0000000000352eb0
  Call Trace:
   <TASK>
   kvm_tdp_page_fault+0xcc/0x160 [kvm]
   kvm_mmu_do_page_fault+0xfb/0x1f0 [kvm]
   kvm_arch_vcpu_pre_fault_memory+0xd0/0x1a0 [kvm]
   kvm_vcpu_ioctl+0x761/0x8c0 [kvm]
   __x64_sys_ioctl+0x82/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53
   </TASK>
  Modules linked in: kvm_intel kvm
  CR2: 000029ffffffffe8
  ---[ end trace 0000000000000000 ]---

Fixes: 6e01b7601dfe ("KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()")
Reported-by: syzbot+23786faffb695f17edaa@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/0000000000002b84dc061dd73544@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

Haven't seen a reproducer from syzbot, but I verified by forcing the same
root allocation failure (to generate the above splat).

 arch/x86/kvm/mmu/mmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f

Comments

Huang, Kai July 23, 2024, 2:48 a.m. UTC | #1
On 23/07/2024 12:02 pm, Sean Christopherson wrote:
> Error out if kvm_mmu_reload() fails when pre-faulting memory, as trying to
> fault-in SPTEs will fail miserably due to root.hpa pointing at garbage.
> 
> Note, kvm_mmu_reload() can return -EIO and thus trigger the WARN on -EIO
> in kvm_vcpu_pre_fault_memory(), but all such paths also WARN, i.e. the
> WARN isn't user-triggerable and won't run afoul of panic_on_warn because
> the kernel would already be panicking.
> 
>    BUG: unable to handle page fault for address: 000029ffffffffe8
>    #PF: supervisor read access in kernel mode
>    #PF: error_code(0x0000) - not-present page
>    PGD 0 P4D 0
>    Oops: Oops: 0000 [#1] PREEMPT SMP
>    CPU: 22 PID: 1069 Comm: pre_fault_memor Not tainted 6.10.0-rc7-332d2c1d713e-next-vm #548
>    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>    RIP: 0010:is_page_fault_stale+0x3e/0xe0 [kvm]
>    RSP: 0018:ffffc9000114bd48 EFLAGS: 00010206
>    RAX: 00003fffffffffc0 RBX: ffff88810a07c080 RCX: ffffc9000114bd78
>    RDX: ffff88810a07c080 RSI: ffffea0000000000 RDI: ffff88810a07c080
>    RBP: ffffc9000114bd78 R08: 00007fa3c8c00000 R09: 8000000000000225
>    R10: ffffea00043d7d80 R11: 0000000000000000 R12: ffff88810a07c080
>    R13: 0000000100000000 R14: ffffc9000114be58 R15: 0000000000000000
>    FS:  00007fa3c9da0740(0000) GS:ffff888277d80000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: 000029ffffffffe8 CR3: 000000011d698000 CR4: 0000000000352eb0
>    Call Trace:
>     <TASK>
>     kvm_tdp_page_fault+0xcc/0x160 [kvm]
>     kvm_mmu_do_page_fault+0xfb/0x1f0 [kvm]
>     kvm_arch_vcpu_pre_fault_memory+0xd0/0x1a0 [kvm]
>     kvm_vcpu_ioctl+0x761/0x8c0 [kvm]
>     __x64_sys_ioctl+0x82/0xb0
>     do_syscall_64+0x5b/0x160
>     entry_SYSCALL_64_after_hwframe+0x4b/0x53
>     </TASK>
>    Modules linked in: kvm_intel kvm
>    CR2: 000029ffffffffe8
>    ---[ end trace 0000000000000000 ]---
> 
> Fixes: 6e01b7601dfe ("KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()")
> Reported-by: syzbot+23786faffb695f17edaa@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/0000000000002b84dc061dd73544@google.com
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---

Reviewed-by: Kai Huang <kai.huang@intel.com>

Sean Christopherson Aug. 23, 2024, 11:47 p.m. UTC | #2
On Mon, 22 Jul 2024 17:02:11 -0700, Sean Christopherson wrote:
> Error out if kvm_mmu_reload() fails when pre-faulting memory, as trying to
> fault-in SPTEs will fail miserably due to root.hpa pointing at garbage.
> 
> Note, kvm_mmu_reload() can return -EIO and thus trigger the WARN on -EIO
> in kvm_vcpu_pre_fault_memory(), but all such paths also WARN, i.e. the
> WARN isn't user-triggerable and won't run afoul of panic_on_warn because
> the kernel would already be panicking.
> 
> [...]

Applied to kvm-x86 fixes, thanks!

[1/1] KVM: x86/mmu: Check that root is valid/loaded when pre-faulting SPTEs
      https://github.com/kvm-x86/linux/commit/28cec7f08b8b

--
https://github.com/kvm-x86/linux/tree/next

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4..ee516baf3a31 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4747,7 +4747,9 @@  long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	 * reload is efficient when called repeatedly, so we can do it on
 	 * every iteration.
 	 */
-	kvm_mmu_reload(vcpu);
+	r = kvm_mmu_reload(vcpu);
+	if (r)
+		return r;
 
 	if (kvm_arch_has_private_mem(vcpu->kvm) &&
 	    kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(range->gpa)))