Message ID | 1449769137-8668-1-git-send-email-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Dec 10, 2015 at 06:38:57PM +0100, Paolo Bonzini wrote: > Invoking tracepoints within kvm_guest_enter/kvm_guest_exit causes a > lockdep splat. > > Cc: stable@vger.kernel.org > Reported-by: Borislav Petkov <bp@alien8.de> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > --- > arch/x86/kvm/svm.c | 4 ++-- > arch/x86/kvm/vmx.c | 3 ++- > arch/x86/kvm/x86.c | 2 +- > 3 files changed, 5 insertions(+), 4 deletions(-) Looks like you missed some... [ 144.296364] kvm: zapping shadow pages for mmio generation wraparound [ 164.699053] kvm: zapping shadow pages for mmio generation wraparound [ 312.115767] kvm: zapping shadow pages for mmio generation wraparound [ 432.277585] kvm: zapping shadow pages for mmio generation wraparound [ 434.547820] =============================== [ 434.552020] [ INFO: suspicious RCU usage. ] [ 434.556223] 4.4.0-rc4+ #1 Not tainted [ 434.559886] ------------------------------- [ 434.564072] arch/x86/kvm/trace.h:971 suspicious rcu_dereference_check() usage! [ 434.571303] other info that might help us debug this: [ 434.579324] RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 [ 434.590209] RCU used illegally from extended quiescent state! [ 434.595971] 1 lock held by qemu-system-x86/2402: [ 434.600596] #0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffa020872c>] vcpu_load+0x1c/0x80 [kvm] [ 434.609146] stack backtrace: [ 434.613526] CPU: 4 PID: 2402 Comm: qemu-system-x86 Not tainted 4.4.0-rc4+ #1 [ 434.620583] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014 [ 434.627987] 0000000000000001 ffff88042f79fcf0 ffffffff813c2cfc ffff880435aa0000 [ 434.635443] ffff88042f79fd20 ffffffff810c5157 ffff88042fd48000 0000000295a85563 [ 434.642886] 0000000295de483f 0000000000000000 ffff88042f79fd58 ffffffffa023ec6e [ 434.650334] Call Trace: [ 434.652804] [<ffffffff813c2cfc>] dump_stack+0x4e/0x82 [ 434.657950] [<ffffffff810c5157>] lockdep_rcu_suspicious+0xe7/0x120 [ 434.664239] [<ffffffffa023ec6e>] wait_lapic_expire+0xfe/0x1e0 [kvm] [ 434.670606] [<ffffffffa022183e>] kvm_arch_vcpu_ioctl_run+0x76e/0x19c0 [kvm] [ 434.677674] [<ffffffffa0221982>] ? kvm_arch_vcpu_ioctl_run+0x8b2/0x19c0 [kvm] [ 434.684905] [<ffffffff81858acc>] ? mutex_lock_killable_nested+0x29c/0x4c0 [ 434.691792] [<ffffffffa0208b32>] kvm_vcpu_ioctl+0x342/0x700 [kvm] [ 434.697984] [<ffffffff810c4a7d>] ? __lock_is_held+0x4d/0x70 [ 434.703655] [<ffffffff812351ae>] ? __fget+0xfe/0x200 [ 434.708719] [<ffffffff812291f1>] do_vfs_ioctl+0x301/0x550 [ 434.714208] [<ffffffff8123531a>] ? __fget_light+0x2a/0x90 [ 434.719700] [<ffffffff81229481>] SyS_ioctl+0x41/0x70 [ 434.724754] [<ffffffff8185cb36>] entry_SYSCALL_64_fastpath+0x16/0x7a [ 437.411818] kvm [2400]: vcpu0 unhandled rdmsr: 0x606 [ 437.416898] kvm [2400]: vcpu0 unhandled rdmsr: 0x34
On 10/12/2015 19:09, Borislav Petkov wrote: > On Thu, Dec 10, 2015 at 06:38:57PM +0100, Paolo Bonzini wrote: >> Invoking tracepoints within kvm_guest_enter/kvm_guest_exit causes a >> lockdep splat. >> >> Cc: stable@vger.kernel.org >> Reported-by: Borislav Petkov <bp@alien8.de> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> >> --- >> arch/x86/kvm/svm.c | 4 ++-- >> arch/x86/kvm/vmx.c | 3 ++- >> arch/x86/kvm/x86.c | 2 +- >> 3 files changed, 5 insertions(+), 4 deletions(-) > > Looks like you missed some... Yeah, wait_lapic_expire also have to be moved before __kvm_guest_enter. Paolo > [ 144.296364] kvm: zapping shadow pages for mmio generation wraparound > [ 164.699053] kvm: zapping shadow pages for mmio generation wraparound > [ 312.115767] kvm: zapping shadow pages for mmio generation wraparound > [ 432.277585] kvm: zapping shadow pages for mmio generation wraparound > > [ 434.547820] =============================== > [ 434.552020] [ INFO: suspicious RCU usage. ] > [ 434.556223] 4.4.0-rc4+ #1 Not tainted > [ 434.559886] ------------------------------- > [ 434.564072] arch/x86/kvm/trace.h:971 suspicious rcu_dereference_check() usage! > [ 434.571303] > other info that might help us debug this: > > [ 434.579324] > RCU used illegally from idle CPU! > rcu_scheduler_active = 1, debug_locks = 0 > [ 434.590209] RCU used illegally from extended quiescent state! > [ 434.595971] 1 lock held by qemu-system-x86/2402: > [ 434.600596] #0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffa020872c>] vcpu_load+0x1c/0x80 [kvm] > [ 434.609146] > stack backtrace: > [ 434.613526] CPU: 4 PID: 2402 Comm: qemu-system-x86 Not tainted 4.4.0-rc4+ #1 > [ 434.620583] Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A13 05/11/2014 > [ 434.627987] 0000000000000001 ffff88042f79fcf0 ffffffff813c2cfc ffff880435aa0000 > [ 434.635443] ffff88042f79fd20 ffffffff810c5157 ffff88042fd48000 0000000295a85563 > [ 434.642886] 0000000295de483f 0000000000000000 ffff88042f79fd58 ffffffffa023ec6e > [ 434.650334] Call Trace: > [ 434.652804] [<ffffffff813c2cfc>] dump_stack+0x4e/0x82 > [ 434.657950] [<ffffffff810c5157>] lockdep_rcu_suspicious+0xe7/0x120 > [ 434.664239] [<ffffffffa023ec6e>] wait_lapic_expire+0xfe/0x1e0 [kvm] > [ 434.670606] [<ffffffffa022183e>] kvm_arch_vcpu_ioctl_run+0x76e/0x19c0 [kvm] > [ 434.677674] [<ffffffffa0221982>] ? kvm_arch_vcpu_ioctl_run+0x8b2/0x19c0 [kvm] > [ 434.684905] [<ffffffff81858acc>] ? mutex_lock_killable_nested+0x29c/0x4c0 > [ 434.691792] [<ffffffffa0208b32>] kvm_vcpu_ioctl+0x342/0x700 [kvm] > [ 434.697984] [<ffffffff810c4a7d>] ? __lock_is_held+0x4d/0x70 > [ 434.703655] [<ffffffff812351ae>] ? __fget+0xfe/0x200 > [ 434.708719] [<ffffffff812291f1>] do_vfs_ioctl+0x301/0x550 > [ 434.714208] [<ffffffff8123531a>] ? __fget_light+0x2a/0x90 > [ 434.719700] [<ffffffff81229481>] SyS_ioctl+0x41/0x70 > [ 434.724754] [<ffffffff8185cb36>] entry_SYSCALL_64_fastpath+0x16/0x7a > [ 437.411818] kvm [2400]: vcpu0 unhandled rdmsr: 0x606 > [ 437.416898] kvm [2400]: vcpu0 unhandled rdmsr: 0x34 > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Dec 10, 2015 at 07:15:19PM +0100, Paolo Bonzini wrote:
> Yeah, wait_lapic_expire also have to be moved before __kvm_guest_enter.
Yeah, v2 doesn't splat on the Intel box anymore but the AMD box still
has, and it is a different problem. With the v2 applied, it still
explodes, see below.
And I'm willing to bet good money on that shadow pages fun. The
[ 959.466549] kernel tried to execute NX-protected page - exploit attempt? (uid: 1000)
line basically says that we're pagefaulting when trying to fetch
instructions, i.e., we're trying to execute something from a page, rIP
points to 0xffff8800b9f9bdf0 and that is most likely a page belonging to
kvm, which, however, is for some reason not executable (anymore?).
Could it have anything to do with that zapping of shadow pages, per
chance?
Can I disable the zapping and see if it still triggers? Or should I try
modprobing kvm with "npt=0" or so?
/me goes and tries it...
Nope, that doesn't help - it still splats.
Hmmm...
[ 849.272337] kvm: zapping shadow pages for mmio generation wraparound
[ 933.813871] kvm: zapping shadow pages for mmio generation wraparound
[ 959.466549] kernel tried to execute NX-protected page - exploit attempt? (uid: 1000)
[ 959.474369] BUG: unable to handle kernel paging request at ffff8800b9f9bdf0
[ 959.481407] IP: [<ffff8800b9f9bdf0>] 0xffff8800b9f9bdf0
[ 959.486677] PGD 2d7e067 PUD 43efff067 PMD 80000000b9e001e3
[ 959.492338] Oops: 0011 [#1] PREEMPT SMP
[ 959.496340] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm irqbypass crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod fam15h_power k10temp edac_mce_amd amdkfd amd_iommu_v2 radeon acpi_cpufreq
[ 959.524023] CPU: 3 PID: 3798 Comm: qemu-system-x86 Not tainted 4.4.0-rc4+ #8
[ 959.531127] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[ 959.541113] task: ffff8800b7ca5e00 ti: ffff8800b9f98000 task.ti: ffff8800b9f98000
[ 959.548625] RIP: 0010:[<ffff8800b9f9bdf0>] [<ffff8800b9f9bdf0>] 0xffff8800b9f9bdf0
[ 959.556338] RSP: 0018:ffff8800b9f9bde0 EFLAGS: 00010206
[ 959.561676] RAX: 000003993d0f82ee RBX: ffff8800b7d48000 RCX: 0000000000000001
[ 959.568844] RDX: 0000039900000000 RSI: ffffffffa02bdc7b RDI: ffff8800b7d48000
[ 959.576010] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 959.583177] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 959.590346] R13: ffff8800b7d48000 R14: 0000000000000000 R15: 0000000000000000
[ 959.597513] FS: 00007f7fae580700(0000) GS:ffff88042cc00000(0000) knlGS:0000000000000000
[ 959.605643] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 959.611414] CR2: ffff8800b9f9bdf0 CR3: 000000041b5fe000 CR4: 00000000000406e0
[ 959.618579] Stack:
[ 959.620607] ffffffffa02d5e17 ffff8800b7d48000 ffff8800b9f9be08 ffffffffa02bdb1f
[ 959.628104] 0000000000000000 ffff8800b9f9be98 ffffffffa02bdc7b ffff8804242a4400
[ 959.635601] 0000000000000070 0000000000004000 ffffffff81a3c1e0 ffff8800b7ca5e00
[ 959.643114] Call Trace:
[ 959.645599] [<ffffffffa02d5e17>] ? kvm_arch_vcpu_put+0x17/0x40 [kvm]
[ 959.652081] [<ffffffffa02bdb1f>] ? vcpu_put+0x1f/0x60 [kvm]
[ 959.657782] [<ffffffffa02bdc7b>] ? kvm_vcpu_ioctl+0x11b/0x6f0 [kvm]
[ 959.664169] [<ffffffff811a0930>] ? do_vfs_ioctl+0x2e0/0x540
[ 959.669855] [<ffffffff811ac8e9>] ? __fget_light+0x29/0x90
[ 959.675364] [<ffffffff811a0bdc>] ? SyS_ioctl+0x4c/0x90
[ 959.680618] [<ffffffff816e2d5b>] ? entry_SYSCALL_64_fastpath+0x16/0x6f
[ 959.687263] Code: 00 00 00 06 02 01 00 00 00 00 00 e0 bd f9 b9 00 88 ff ff 18 00 00 00 00 00 00 00 17 5e 2d a0 ff ff ff ff 00 80 d4 b7 00 88 ff ff <08> be f9 b9 00 88 ff ff 1f db 2b a0 ff ff ff ff 00 00 00 00 00
[ 959.707506] RIP [<ffff8800b9f9bdf0>] 0xffff8800b9f9bdf0
[ 959.712862] RSP <ffff8800b9f9bde0>
[ 959.716373] CR2: ffff8800b9f9bdf0
[ 959.735764] ---[ end trace 6826bd13f6e235cd ]---
[ 959.740465] note: qemu-system-x86[3798] exited with preempt_count 1
[ 979.163010] kvm: zapping shadow pages for mmio generation wraparound
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 83a1c643f9a5..899c40f826dd 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -3422,6 +3422,8 @@ static int handle_exit(struct kvm_vcpu *vcpu) struct kvm_run *kvm_run = vcpu->run; u32 exit_code = svm->vmcb->control.exit_code; + trace_kvm_exit(exit_code, vcpu, KVM_ISA_SVM); + if (!is_cr_intercept(svm, INTERCEPT_CR0_WRITE)) vcpu->arch.cr0 = svm->vmcb->save.cr0; if (npt_enabled) @@ -3892,8 +3894,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu) vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp; vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip; - trace_kvm_exit(svm->vmcb->control.exit_code, vcpu, KVM_ISA_SVM); - if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI)) kvm_before_handle_nmi(&svm->vcpu); diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index af823a388c19..6b5605607849 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -8042,6 +8042,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu) u32 exit_reason = vmx->exit_reason; u32 vectoring_info = vmx->idt_vectoring_info; + trace_kvm_exit(exit_reason, vcpu, KVM_ISA_VMX); + /* * Flush logged GPAs PML buffer, this will make dirty_bitmap more * updated. Another good is, in kvm_vm_ioctl_get_dirty_log, before @@ -8668,7 +8670,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) vmx->loaded_vmcs->launched = 1; vmx->exit_reason = vmcs_read32(VM_EXIT_REASON); - trace_kvm_exit(vmx->exit_reason, vcpu, KVM_ISA_VMX); /* * the KVM_REQ_EVENT optimization bit is only on for one entry, and if diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index eed32283d22c..2a8c035ec7fe 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6515,6 +6515,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (req_immediate_exit) smp_send_reschedule(vcpu->cpu); + trace_kvm_entry(vcpu->vcpu_id); __kvm_guest_enter(); if (unlikely(vcpu->arch.switch_db_regs)) { @@ -6527,7 +6528,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD; } - trace_kvm_entry(vcpu->vcpu_id); wait_lapic_expire(vcpu); kvm_x86_ops->run(vcpu);
Invoking tracepoints within kvm_guest_enter/kvm_guest_exit causes a lockdep splat. Cc: stable@vger.kernel.org Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- arch/x86/kvm/svm.c | 4 ++-- arch/x86/kvm/vmx.c | 3 ++- arch/x86/kvm/x86.c | 2 +- 3 files changed, 5 insertions(+), 4 deletions(-)