Message ID: 20150219160221.GB19057@potion.brq.redhat.com (mailing list archive)
State: New, archived
2015-02-19 17:02+0100, Radim Krčmář:
> Fixes: e011c663b9c7 ("Check all exceptions for intercept during delivery to L2")
Note: I haven't verified that it was introduced by this patch; nothing
contradicting that hypothesis turned up in a short dig through the history, though.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 19, 2015 at 05:02:22PM +0100, Radim Krčmář wrote:
> 2015-02-19 16:01+0100, Radim Krčmář:
> > 2015-02-19 13:07+0100, Kashyap Chamarthy:
> > 5f3d5799974b8 KVM: nVMX: Rework event injection and recovery:
> > This concept is based on the rule that a pending vmlaunch/vmresume is
> > not canceled. Otherwise, we would risk to lose injected events or leak
> > them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
> > entry of nested_vmx_vmexit.
> >
> > I wonder if we have broken the invariant since 3.9 ...
>
> e011c663b9c786d115c0f45e5b0bfae0c39428d4
> KVM: nVMX: Check all exceptions for intercept during delivery to L2
>
> All exceptions should be checked for intercept during delivery to L2,
> but we check only #PF currently. Drop nested_run_pending while we are
> at it since exception cannot be injected during vmentry anyway.
>
> The last sentence is not true.
>
> Can you try if the following patch works?

Sure, will test a Kernel built with the below patch and report back.

Thanks for taking a look.

--
/kashyap

> (I know little about nested, so it might be introducing another bug.)
>
> Thanks.
>
> ---8<---
> KVM: nVMX: fix L2 to L1 interrupt leak
>
> When vmx->nested.nested_run_pending is set, we aren't expected to exit
> to L1, but nested_vmx_check_exception() could, since e011c663b9c7.
> Prevent that.
>
> Fixes: e011c663b9c7 ("Check all exceptions for intercept during delivery to L2")
> Signed-off-by: Radim Krčmář
> <rkrcmar@redhat.com>
> ---
>  arch/x86/kvm/vmx.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3f73bfad0349..389166a1b79a 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2098,6 +2098,9 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
>  {
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>
> +	if (to_vmx(vcpu)->nested.nested_run_pending)
> +		return 0;
> +
>  	if (!(vmcs12->exception_bitmap & (1u << nr)))
>  		return 0;
On Thu, Feb 19, 2015 at 10:10:11PM +0100, Kashyap Chamarthy wrote:
> On Thu, Feb 19, 2015 at 05:02:22PM +0100, Radim Krčmář wrote:

[. . .]

> > Can you try if the following patch works?
>
> Sure, will test a Kernel built with the below patch and report back.

Hmm, I'm stuck with a meta issue.

I checked out the KVM tree[1] on L0, applied your patch, built[*] the
Kernel, and booted into it. Boot fails and drops into a dracut shell
because:

. . .
dracut-initqueue[3045]: Warning: Cancelling resume operation. Device not found.
[ TIME ] Timed out waiting for device dev-ma...per910\x2d\x2d02\x2droot.device.
[DEPEND] Dependency failed for /sysroot.
[DEPEND] Dependency failed for Initrd Root File SyWarning:
/dev/disk/by-uuid/4ccddb2d-4d63-4fce-b4d4-9b2f119a30cc does not exist
. . .

I saved the report from /run/initramfs/rdsosreport.txt here[2].

Then, I did another test:

- Rebooted into Kernel 3.20.0-0.rc0.git5.1.fc23.x86_64 on the physical
  host (L0).
- In L1, checked out the KVM tree, applied your patch, built the
  Kernel[*] from the current KVM tree, and booted into the newly built
  one; here too, I'm thrown into a dracut shell.

[1] git://git.kernel.org/pub/scm/virt/kvm/kvm.git
[2] https://kashyapc.fedorapeople.org/temp/kernel-boot-failure.txt

[*] Exactly, I built it this way:

# Clone the tree
$ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git

# Make a new branch:
$ git checkout -b nvmx_test
$ git describe
warning: tag 'for-linus' is really 'kvm-3.19-1' here
for-linus-14459-g49776d5

# Make a config file
$ make defconfig

# Compile
$ make -j4 && make bzImage && make modules

# Install
$ sudo -i
$ make modules_install && make install
2015-02-19 23:28+0100, Kashyap Chamarthy:
> On Thu, Feb 19, 2015 at 10:10:11PM +0100, Kashyap Chamarthy wrote:
> > On Thu, Feb 19, 2015 at 05:02:22PM +0100, Radim Krčmář wrote:
>
> [. . .]
>
> > > Can you try if the following patch works?
> >
> > Sure, will test a Kernel built with the below patch and report back.
>
> Hmm, I'm stuck with a meta issue.
>
> I checked out the KVM tree[1] on L0, applied your patch, built[*] the
> Kernel, and booted into it. Boot fails and drops into a dracut shell
> because:
>
> . . .
> dracut-initqueue[3045]: Warning: Cancelling resume operation. Device not found.
> [ TIME ] Timed out waiting for device dev-ma...per910\x2d\x2d02\x2droot.device.
> [DEPEND] Dependency failed for /sysroot.
> [DEPEND] Dependency failed for Initrd Root File SyWarning:
> /dev/disk/by-uuid/4ccddb2d-4d63-4fce-b4d4-9b2f119a30cc does not exist
> . . .
>
> I saved the report from /run/initramfs/rdsosreport.txt here[2].
>
> Then, I did another test:
>
> - Rebooted into Kernel 3.20.0-0.rc0.git5.1.fc23.x86_64 on the physical
>   host (L0).
> - In L1, checked out the KVM tree, applied your patch, built the
>   Kernel[*] from the current KVM tree, and booted into the newly built
>   one; here too, I'm thrown into a dracut shell.

Weird, but considering that boot fails on L0 as well, I think that
basing off a different commit could help ...

> [1] git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> [2] https://kashyapc.fedorapeople.org/temp/kernel-boot-failure.txt
>
> [*] Exactly, I built it this way:
>
> # Clone the tree
> $ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git
>
> # Make a new branch:
> $ git checkout -b nvmx_test
> $ git describe
> warning: tag 'for-linus' is really 'kvm-3.19-1' here
> for-linus-14459-g49776d5

Hm, it should say v3.19 -- does it stay the same if you do
`git fetch && git checkout origin/master`?

If it still does, please try to apply it on top of `git checkout v3.18`.
(The one that failed too.)
> # Make a config file
> $ make defconfig

It would be safer to copy the fedora config (from /boot) to .config and
do `make olddefconfig`.

> # Compile
> $ make -j4 && make bzImage && make modules
>
> # Install
> $ sudo -i
> $ make modules_install && make install
>
> --
> /kashyap
On Fri, Feb 20, 2015 at 05:14:15PM +0100, Radim Krčmář wrote:
> 2015-02-19 23:28+0100, Kashyap Chamarthy:
> > On Thu, Feb 19, 2015 at 10:10:11PM +0100, Kashyap Chamarthy wrote:
> > > On Thu, Feb 19, 2015 at 05:02:22PM +0100, Radim Krčmář wrote:

[. . .]

> > Then, I did another test:
> >
> > - Rebooted into Kernel 3.20.0-0.rc0.git5.1.fc23.x86_64 on the physical
> >   host (L0).
> > - In L1, checked out the KVM tree, applied your patch, built the
> >   Kernel[*] from the current KVM tree, and booted into the newly built
> >   one; here too, I'm thrown into a dracut shell.
>
> Weird, but considering that boot fails on L0 as well, I think that
> basing off a different commit could help ...

What I had missed was rebuilding the initramfs:

$ cd /boot
$ dracut initramfs-3.19.0+.img 3.19.0+ --force

Then I can boot. However, networking was hosed due to this bug[1] in
`dhclient`. (Andrea Arcangeli said it's fixed for him in the newest
Kernels, but unfortunately it's still not fixed for me, as I noted in
the bug.)

Anyway, for the nVMX bug in question, I actually built a Fedora scratch
Kernel build[2] with your fix, which was successful[3]. I will test with
it once I get the networking fixed on the physical machine, hopefully
early next week.

> > [*] Exactly, I built it this way:
> >
> > # Clone the tree
> > $ git clone git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> >
> > # Make a new branch:
> > $ git checkout -b nvmx_test
> > $ git describe
> > warning: tag 'for-linus' is really 'kvm-3.19-1' here
> > for-linus-14459-g49776d5
>
> Hm, it should say v3.19 -- does it stay the same if you do
> `git fetch && git checkout origin/master`?
>
> If it still does, please try to apply it on top of `git checkout v3.18`.
> (The one that failed too.)
>
> > # Make a config file
> > $ make defconfig
>
> It would be safer to copy the fedora config (from /boot) to .config and
> do `make olddefconfig`.

That's actually what I did on my later compiles.
For now, as noted above, I will test with the Fedora Kernel scratch
build I made.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1194809 -- `dhclient` crashes on boot
[2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708
[3] https://kojipkgs.fedoraproject.org//work/tasks/4708/9004708/build.log
Radim,

I just tested with your patch[1] in this thread. I built a Fedora
Kernel[2] with it, and installed (and booted into) it on both L0 and L1.

Result: I don't have good news, I'm afraid: L1 *still* reboots when an
L2 guest is booted. And, L0 throws the stack trace that was previously
noted in this thread:

. . .
[ 57.747345] ------------[ cut here ]------------
[ +0.004638] WARNING: CPU: 5 PID: 50206 at arch/x86/kvm/vmx.c:8962 nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]()
[ +0.009903] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables kvm_intel coretemp iTCO_wdt kvm ipmi_devintf iTCO_vendor_support i7core_edac gpio_ich crc32c_intel serio_raw edac_core ipmi_si dcdbas shpchp tpm_tis lpc_ich mfd_core tpm ipmi_msghandler wmi acpi_power_meter acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm ata_generic megaraid_sas bnx2 pata_acpi [last unloaded: kvm_intel]
[ +0.060404] CPU: 5 PID: 50206 Comm: qemu-system-x86 Not tainted 3.18.7-200.fc21.x86_64 #1
[ +0.008220] Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 2.8.2 10/25/2012
[ +0.007526] 0000000000000000 00000000a30d0ba3 ffff883f2489fc48 ffffffff8175e686
[ +0.007688] 0000000000000000 0000000000000000 ffff883f2489fc88 ffffffff810991d1
[ +0.007613] ffff883f2489fc98 ffff88bece1ba000 0000000000000000 0000000000000014
[ +0.007611] Call Trace:
[ +0.002518] [<ffffffff8175e686>] dump_stack+0x46/0x58
[ +0.005202] [<ffffffff810991d1>] warn_slowpath_common+0x81/0xa0
[ +0.006055] [<ffffffff810992ea>] warn_slowpath_null+0x1a/0x20
[ +0.005889] [<ffffffffa02f00ee>] nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]
[ +0.007014] [<ffffffffa02f05af>] ? vmx_handle_exit+0x1bf/0xaa0 [kvm_intel]
[ +0.007015] [<ffffffffa02f039c>] vmx_queue_exception+0xfc/0x150 [kvm_intel]
[ +0.007130] [<ffffffffa028cdfd>] kvm_arch_vcpu_ioctl_run+0xd9d/0x1290 [kvm]
[ +0.007111] [<ffffffffa0288528>] ? kvm_arch_vcpu_load+0x58/0x220 [kvm]
[ +0.006670] [<ffffffffa0274cbc>] kvm_vcpu_ioctl+0x32c/0x5c0 [kvm]
[ +0.006236] [<ffffffff810d0f7b>] ? put_prev_entity+0x5b/0x400
[ +0.005887] [<ffffffff810cbb37>] ? set_next_entity+0x67/0x80
[ +0.005802] [<ffffffff810d4549>] ? pick_next_task_fair+0x6c9/0x8c0
[ +0.006324] [<ffffffff810126d6>] ? __switch_to+0x1d6/0x5f0
[ +0.005626] [<ffffffff8122a1c0>] do_vfs_ioctl+0x2d0/0x4b0
[ +0.005543] [<ffffffff81760764>] ? __schedule+0x2f4/0x8a0
[ +0.005537] [<ffffffff8122a421>] SyS_ioctl+0x81/0xa0
[ +0.005106] [<ffffffff81765429>] system_call_fastpath+0x12/0x17
[ +0.006056] ---[ end trace 646ed2360b84865c ]---
[ +7.000298] kvm [50179]: vcpu0 unhandled rdmsr: 0x1c9
[ +0.005061] kvm [50179]: vcpu0 unhandled rdmsr: 0x1a6
[ +0.005053] kvm [50179]: vcpu0 unhandled rdmsr: 0x3f6
. . .

[1] http://article.gmane.org/gmane.comp.emulators.kvm.devel/132937
[2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708
2015-02-22 16:46+0100, Kashyap Chamarthy:
> Radim,
>
> I just tested with your patch[1] in this thread. I built a Fedora
> Kernel[2] with it, and installed (and booted into) it on both L0 and L1.
>
> Result: I don't have good news, I'm afraid: L1 *still* reboots when an
> L2 guest is booted. And, L0 throws the stack trace that was previously
> noted in this thread:

Thanks, I'm puzzled though ... isn't it possible that a wrong kernel
sneaked into grub?

> . . .
> [ 57.747345] ------------[ cut here ]------------
> [ +0.004638] WARNING: CPU: 5 PID: 50206 at arch/x86/kvm/vmx.c:8962 nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]()
> [ +0.060404] CPU: 5 PID: 50206 Comm: qemu-system-x86 Not tainted 3.18.7-200.fc21.x86_64 #1

This looks like a new backtrace, but the kernel is not [2].

> [ +0.006055] [<ffffffff810992ea>] warn_slowpath_null+0x1a/0x20
> [ +0.005889] [<ffffffffa02f00ee>] nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]
> [ +0.007014] [<ffffffffa02f05af>] ? vmx_handle_exit+0x1bf/0xaa0 [kvm_intel]
> [ +0.007015] [<ffffffffa02f039c>] vmx_queue_exception+0xfc/0x150 [kvm_intel]
> [ +0.007130] [<ffffffffa028cdfd>] kvm_arch_vcpu_ioctl_run+0xd9d/0x1290 [kvm]

(There is only one execution path and unless there is a race, it would
be prevented by [1].)

> [ +0.007111] [<ffffffffa0288528>] ? kvm_arch_vcpu_load+0x58/0x220 [kvm]
> [ +0.006670] [<ffffffffa0274cbc>] kvm_vcpu_ioctl+0x32c/0x5c0 [kvm]
[...]
> [1] http://article.gmane.org/gmane.comp.emulators.kvm.devel/132937
> [2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708
On Mon, Feb 23, 2015 at 02:56:11PM +0100, Radim Krčmář wrote:
> 2015-02-22 16:46+0100, Kashyap Chamarthy:
> > Radim,
> >
> > I just tested with your patch[1] in this thread. I built a Fedora
> > Kernel[2] with it, and installed (and booted into) it on both L0 and L1.
> >
> > Result: I don't have good news, I'm afraid: L1 *still* reboots when an
> > L2 guest is booted. And, L0 throws the stack trace that was previously
> > noted in this thread:
>
> Thanks, I'm puzzled though ... isn't it possible that a wrong kernel
> sneaked into grub?

Hmm, unlikely -- I just double-confirmed that I'm running the same
patched Kernel (3.20.0-0.rc0.git9.1.fc23.x86_64) on both L0 and L1.

> > . . .
> > [ 57.747345] ------------[ cut here ]------------
> > [ +0.004638] WARNING: CPU: 5 PID: 50206 at arch/x86/kvm/vmx.c:8962 nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]()
> > [ +0.060404] CPU: 5 PID: 50206 Comm: qemu-system-x86 Not tainted 3.18.7-200.fc21.x86_64 #1
>
> This looks like a new backtrace, but the kernel is not [2].

Err, looks like I pasted the wrong one, but here it is again. I just
tested with the patched Kernel (that I linked below) on both L0 and L1;
the same behavior (L1 reboot on L2 boot) manifests:

. . .
[ +0.058440] CPU: 8 PID: 1828 Comm: qemu-system-x86 Not tainted 3.20.0-0.rc0.git9.1.fc23.x86_64 #1
[ +0.008856] Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 2.8.2 10/25/2012
[ +0.007475] 0000000000000000 0000000097b7f39b ffff883f5acc3bf8 ffffffff818773cd
[ +0.007477] 0000000000000000 0000000000000000 ffff883f5acc3c38 ffffffff810ab3ba
[ +0.007495] ffff883f5acc3c68 ffff887f62678000 0000000000000000 0000000000000000
[ +0.007489] Call Trace:
[ +0.002455] [<ffffffff818773cd>] dump_stack+0x4c/0x65
[ +0.005139] [<ffffffff810ab3ba>] warn_slowpath_common+0x8a/0xc0
[ +0.006001] [<ffffffff810ab4ea>] warn_slowpath_null+0x1a/0x20
[ +0.005831] [<ffffffffa220cf8e>] nested_vmx_vmexit+0xbde/0xd30 [kvm_intel]
[ +0.006957] [<ffffffffa220fda3>] ? vmx_handle_exit+0x213/0xd80 [kvm_intel]
[ +0.006956] [<ffffffffa220d3fa>] vmx_queue_exception+0x10a/0x150 [kvm_intel]
[ +0.007160] [<ffffffffa03c8cdb>] kvm_arch_vcpu_ioctl_run+0x107b/0x1b60 [kvm]
[ +0.007138] [<ffffffffa03c833a>] ? kvm_arch_vcpu_ioctl_run+0x6da/0x1b60 [kvm]
[ +0.007219] [<ffffffff8110725d>] ? trace_hardirqs_on+0xd/0x10
[ +0.005837] [<ffffffffa03b0666>] ? vcpu_load+0x26/0x70 [kvm]
[ +0.005745] [<ffffffff8110385f>] ? lock_release_holdtime.part.29+0xf/0x200
[ +0.006966] [<ffffffffa03c3a68>] ? kvm_arch_vcpu_load+0x58/0x210 [kvm]
[ +0.006618] [<ffffffffa03b0a73>] kvm_vcpu_ioctl+0x383/0x7e0 [kvm]
[ +0.006175] [<ffffffff81027b9d>] ? native_sched_clock+0x2d/0xa0
[ +0.006000] [<ffffffff810d5c56>] ? creds_are_invalid.part.1+0x16/0x50
[ +0.006518] [<ffffffff810d5cb1>] ? creds_are_invalid+0x21/0x30
[ +0.005918] [<ffffffff813a77fa>] ? inode_has_perm.isra.48+0x2a/0xa0
[ +0.006350] [<ffffffff8128c9a8>] do_vfs_ioctl+0x2e8/0x530
[ +0.005514] [<ffffffff8128cc71>] SyS_ioctl+0x81/0xa0
[ +0.005051] [<ffffffff81880969>] system_call_fastpath+0x12/0x17
[ +0.005999] ---[ end trace 3e4dca7180cdddab ]---
[ +5.529564] kvm [1766]: vcpu0 unhandled rdmsr: 0x1c9
[ +0.005026] kvm [1766]: vcpu0 unhandled rdmsr: 0x1a6
[ +0.004998] kvm [1766]: vcpu0 unhandled rdmsr: 0x3f6
. . .

> > [ +0.006055] [<ffffffff810992ea>] warn_slowpath_null+0x1a/0x20
> > [ +0.005889] [<ffffffffa02f00ee>] nested_vmx_vmexit+0x7ee/0x880 [kvm_intel]
> > [ +0.007014] [<ffffffffa02f05af>] ? vmx_handle_exit+0x1bf/0xaa0 [kvm_intel]
> > [ +0.007015] [<ffffffffa02f039c>] vmx_queue_exception+0xfc/0x150 [kvm_intel]
> > [ +0.007130] [<ffffffffa028cdfd>] kvm_arch_vcpu_ioctl_run+0xd9d/0x1290 [kvm]
>
> (There is only one execution path and unless there is a race, it would
> be prevented by [1].)
>
> > [ +0.007111] [<ffffffffa0288528>] ? kvm_arch_vcpu_load+0x58/0x220 [kvm]
> > [ +0.006670] [<ffffffffa0274cbc>] kvm_vcpu_ioctl+0x32c/0x5c0 [kvm]
> [...]

> > [1] http://article.gmane.org/gmane.comp.emulators.kvm.devel/132937
> > [2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9004708
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3f73bfad0349..389166a1b79a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2098,6 +2098,9 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 
+	if (to_vmx(vcpu)->nested.nested_run_pending)
+		return 0;
+
 	if (!(vmcs12->exception_bitmap & (1u << nr)))
 		return 0;