regression bisected; KVM: entry failed, hardware error 0x80000021

Message ID 549A796C.50801@intel.com (mailing list archive)
State New, archived

Commit Message

Tiejun Chen Dec. 24, 2014, 8:29 a.m. UTC
On 2014/12/23 15:26, Jamie Heilman wrote:
> Chen, Tiejun wrote:
>> On 2014/12/23 9:50, Chen, Tiejun wrote:
>>> On 2014/12/22 17:23, Jamie Heilman wrote:
>>>> Chen, Tiejun wrote:
>>>>> On 2014/12/21 20:46, Jamie Heilman wrote:
>>>>>> With v3.19-rc1 when I run qemu-system-x86_64 -machine pc,accel=kvm I
>>>>>> get:
>>>>>>
>>>>>> KVM: entry failed, hardware error 0x80000021
>>>>>
>>>>> Looks like some MSR write issue caused this failed entry.
>>>>>
>>>>>> If you're running a guest on an Intel machine without unrestricted mode
>>>>>> support, the failure can be most likely due to the guest entering an
>>>>>> invalid
>>>>>> state for Intel VT. For example, the guest maybe running in big real
>>>>>> mode
>>>>>> which is not supported on less recent Intel processors.
>>>>>>
>>>>>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000663
>>>>>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>>>>>> EIP=0000e05b EFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>>>>> ES =0000 00000000 0000ffff 00009300
>>>>>> CS =f000 000f0000 0000ffff 00009b00
>>>>>> SS =0000 00000000 0000ffff 00009300
>>>>>> DS =0000 00000000 0000ffff 00009300
>>>>>> FS =0000 00000000 0000ffff 00009300
>>>>>> GS =0000 00000000 0000ffff 00009300
>>>>>> LDT=0000 00000000 0000ffff 00008200
>>>>>> TR =0000 00000000 0000ffff 00008b00
>>>>>> GDT=     00000000 0000ffff
>>>>>> IDT=     00000000 0000ffff
>>>>>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>>>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>>>>>> DR3=0000000000000000
>>>>>> DR6=00000000ffff0ff0 DR7=0000000000000400
>>>>>> EFER=0000000000000000
>>>>>
>>>>> And I don't see anything obviously wrong either. Any useful info from dmesg?
>>>>
>>>> With the simple qemu command above, on 3.18.1 I see:
>>>>
>>>> kern.info: kvm: zapping shadow pages for mmio generation wraparound
>>>>
>>>> when I fire up a full guest that's actually useful I get:
>>>>
>>>> kern.info: kvm: zapping shadow pages for mmio generation wraparound
>>>> kern.err: kvm [4073]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
>>>>
>>>> On 3.18.0-rc3-00042-g34a1cd6 nothing appears in the dmesg, just the
>>>> message I mention above to stderr.  Same thing with a stock
>>>> 3.19.0-rc1.  Once I apply your patch the simple test command produces
>>>> the same zapping shadow pages messages as 3.18.1, and a test guest of
>>>> a Debian Jessie image (w/stock distro kernel) produces the same thing
>>>> with disabled perfctr wrmsr message.  However, it doesn't look like
>>>
>>> Sorry, I'm not sure I understand the current status. It looks like
>>> 3.19-rc1 plus my patch fixes just the error above,
>>>
>>> KVM: entry failed, hardware error 0x80000021
>>> ...
>>>
>>> Right?
>>>
>>>> I'm entirely out of the woods, because one of my other guest VMs with a
>>>> custom kernel that works great under 3.18.1 now fails to run.  Nothing
>>>> in dmesg, but here's the stderr:
>>>
>>> But even if you revert 34a1cd60d17 or just apply my patch, something else
>>> introduced between 3.18.1 and 3.19-rc1 leads to the error below, right?
>>>
>>>>
>>>> KVM internal error. Suberror: 1
>>>> emulation failure
>>>> EAX=000de494 EBX=00000000 ECX=00000000 EDX=00000cfd
>>>> ESI=00000059 EDI=00000000 EBP=00000000 ESP=00006fb4
>>>> EIP=000f15c1 EFL=00010016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>>> ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>>>> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>>>> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>>>> DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>>>> FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>>>> GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>>>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>>>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>>>> GDT=     000f6be8 00000037
>>>> IDT=     000f6c26 00000000
>>>> CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>>>> DR3=0000000000000000
>>>> DR6=00000000ffff0ff0 DR7=0000000000000400
>>>> EFER=0000000000000000
>>>> Code=e8 ae fc ff ff 89 f2 a8 10 89 d8 75 0a b9 41 15 ff ff ff d1 <5b>
>>>> 5e c3 5b 5e e9 76 ff ff ff b0 11 e6 20 e6 a0 b0 08 e6 21 b0 70 e6 a1
>>>> b0 04 e6 21 b0 02
>>>>
>>>> FWIW, I get the same thing with 34a1cd60d17 reverted.  Maybe there are
>>>> two bugs, maybe there's more to this first one.  I can repro this
>>>
>>> So if my understanding is correct, this is probably another bug. And
>>> especially, I already saw the same log in another thread, "Cleaning up
>>> the KVM clock". Maybe you can continue to `git bisect` to locate that
>>> bad commit.
>>>
>>
>> It looks like Andy just found that commit,
>> 0e60b0799fedc495a5c57dbd669de3c10d72edd2 "kvm: change memslot sorting rule
>> from size to GFN". Maybe you can revert it and try your guest again.
>
> That doesn't revert cleanly for me, and I don't have much time to
> fiddle with it until the 24th---so checked out the commit before it
> (d4ae84a0), applied your patch, built, and yes, everything works fine
> at that point.  I'll probably have time for another full bisection
> later, assuming things aren't ironed out already by then.

Could you try this to fix your last error?

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
---
  virt/kvm/kvm_main.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jamie Heilman Dec. 24, 2014, 11:02 a.m. UTC | #1
Chen, Tiejun wrote:
> On 2014/12/23 15:26, Jamie Heilman wrote:
> >Chen, Tiejun wrote:
> >>On 2014/12/23 9:50, Chen, Tiejun wrote:
> >>>On 2014/12/22 17:23, Jamie Heilman wrote:
> >>>>KVM internal error. Suberror: 1
> >>>>emulation failure
> >>>>EAX=000de494 EBX=00000000 ECX=00000000 EDX=00000cfd
> >>>>ESI=00000059 EDI=00000000 EBP=00000000 ESP=00006fb4
> >>>>EIP=000f15c1 EFL=00010016 [----AP-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> >>>>ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> >>>>CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> >>>>SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> >>>>DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> >>>>FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> >>>>GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> >>>>LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> >>>>TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> >>>>GDT=     000f6be8 00000037
> >>>>IDT=     000f6c26 00000000
> >>>>CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
> >>>>DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
> >>>>DR3=0000000000000000
> >>>>DR6=00000000ffff0ff0 DR7=0000000000000400
> >>>>EFER=0000000000000000
> >>>>Code=e8 ae fc ff ff 89 f2 a8 10 89 d8 75 0a b9 41 15 ff ff ff d1 <5b>
> >>>>5e c3 5b 5e e9 76 ff ff ff b0 11 e6 20 e6 a0 b0 08 e6 21 b0 70 e6 a1
> >>>>b0 04 e6 21 b0 02
> >>>>
> >>>>FWIW, I get the same thing with 34a1cd60d17 reverted.  Maybe there are
> >>>>two bugs, maybe there's more to this first one.  I can repro this
> >>>
> >>>So if my understanding is correct, this is probably another bug. And
> >>>especially, I already saw the same log in another thread, "Cleaning up
> >>>the KVM clock". Maybe you can continue to `git bisect` to locate that
> >>>bad commit.
> >>>
> >>
> >>It looks like Andy just found that commit,
> >>0e60b0799fedc495a5c57dbd669de3c10d72edd2 "kvm: change memslot sorting rule
> >>from size to GFN". Maybe you can revert it and try your guest again.
> >
> >That doesn't revert cleanly for me, and I don't have much time to
> >fiddle with it until the 24th---so checked out the commit before it
> >(d4ae84a0), applied your patch, built, and yes, everything works fine
> >at that point.  I'll probably have time for another full bisection
> >later, assuming things aren't ironed out already by then.

3.18.0-rc3-00120-gd4ae84a0 + vmx reorder msr writes patch = OK
3.18.0-rc3-00121-g0e60b07 + vmx reorder msr writes patch = emulation failure

So that certainly points to 0e60b0799fedc495a5c57dbd669de3c10d72edd2
as well.

> Could you try this to fix your last error?

Running qemu-system-x86_64 -machine pc,accel=kvm -nodefaults works,
my real (headless) kvm guests work, but this new patch makes running
"qemu-system-x86_64 -machine pc,accel=kvm" fail again, this time with
errors in the host to the tune of:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 3901 at arch/x86/kvm/x86.c:6575 kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]()
Modules linked in: nfsv4 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative autofs4 fan nfsd auth_rpcgss nfs lockd grace fscache sunrpc bridge stp llc vhost_net tun vhost macvtap macvlan fuse cbc dm_crypt usb_storage snd_hda_codec_analog snd_hda_codec_generic kvm_intel kvm tg3 ptp pps_core sr_mod snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd sg dcdbas cdrom psmouse soundcore floppy evdev xfs dm_mod raid1 md_mod
CPU: 1 PID: 3901 Comm: qemu-system-x86 Not tainted 3.19.0-rc1-00011-g53262d1-dirty #1
Hardware name: Dell Inc. Precision WorkStation T3400  /0TP412, BIOS A14 04/30/2012
 0000000000000000 000000007e052328 ffff8800c25ffcf8 ffffffff813defbe
 0000000000000000 0000000000000000 ffff8800c25ffd38 ffffffff8103b517
 ffff8800c25ffd28 ffffffffa019bdec ffff8800caf1d000 ffff8800c2774800
Call Trace:
 [<ffffffff813defbe>] dump_stack+0x4c/0x6e
 [<ffffffff8103b517>] warn_slowpath_common+0x97/0xb1
 [<ffffffffa019bdec>] ? kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]
 [<ffffffff8103b60b>] warn_slowpath_null+0x15/0x17
 [<ffffffffa019bdec>] kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]
 [<ffffffffa02308b9>] ? vmcs_load+0x20/0x62 [kvm_intel]
 [<ffffffffa0231e03>] ? vmx_vcpu_load+0x140/0x16a [kvm_intel]
 [<ffffffffa0196ba3>] ? kvm_arch_vcpu_load+0x15c/0x161 [kvm]
 [<ffffffffa018d8b1>] kvm_vcpu_ioctl+0x189/0x4bd [kvm]
 [<ffffffff8104647a>] ? do_sigtimedwait+0x12f/0x189
 [<ffffffff810ea316>] do_vfs_ioctl+0x370/0x436
 [<ffffffff810f24f2>] ? __fget+0x67/0x72
 [<ffffffff810ea41b>] SyS_ioctl+0x3f/0x5e
 [<ffffffff813e34d2>] system_call_fastpath+0x12/0x17
---[ end trace 46abac932fb3b4a1 ]---
------------[ cut here ]------------
WARNING: CPU: 1 PID: 3901 at arch/x86/kvm/x86.c:6575 kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]()
Modules linked in: nfsv4 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative autofs4 fan nfsd auth_rpcgss nfs lockd grace fscache sunrpc bridge stp llc vhost_net tun vhost macvtap macvlan fuse cbc dm_crypt usb_storage snd_hda_codec_analog snd_hda_codec_generic kvm_intel kvm tg3 ptp pps_core sr_mod snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd sg dcdbas cdrom psmouse soundcore floppy evdev xfs dm_mod raid1 md_mod
CPU: 1 PID: 3901 Comm: qemu-system-x86 Tainted: G        W      3.19.0-rc1-00011-g53262d1-dirty #1
Hardware name: Dell Inc. Precision WorkStation T3400  /0TP412, BIOS A14 04/30/2012
 0000000000000000 000000007e052328 ffff8800c25ffcf8 ffffffff813defbe
 0000000000000000 0000000000000000 ffff8800c25ffd38 ffffffff8103b517
 ffff8800c25ffd28 ffffffffa019bdec ffff8800caf1d000 ffff8800c2774800
Call Trace:
 [<ffffffff813defbe>] dump_stack+0x4c/0x6e
 [<ffffffff8103b517>] warn_slowpath_common+0x97/0xb1
 [<ffffffffa019bdec>] ? kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]
 [<ffffffff8103b60b>] warn_slowpath_null+0x15/0x17
 [<ffffffffa019bdec>] kvm_arch_vcpu_ioctl_run+0xd63/0xe5b [kvm]
 [<ffffffffa02308b9>] ? vmcs_load+0x20/0x62 [kvm_intel]
 [<ffffffffa0231e03>] ? vmx_vcpu_load+0x140/0x16a [kvm_intel]
 [<ffffffffa0196ba3>] ? kvm_arch_vcpu_load+0x15c/0x161 [kvm]
 [<ffffffffa018d8b1>] kvm_vcpu_ioctl+0x189/0x4bd [kvm]
 [<ffffffff8104647a>] ? do_sigtimedwait+0x12f/0x189
 [<ffffffff810ea316>] do_vfs_ioctl+0x370/0x436
 [<ffffffff810f24f2>] ? __fget+0x67/0x72
 [<ffffffff810ea41b>] SyS_ioctl+0x3f/0x5e
 [<ffffffff813e34d2>] system_call_fastpath+0x12/0x17
---[ end trace 46abac932fb3b4a2 ]---

over and over and over ad nauseam, or until I kill the qemu command;
it also eats a core's worth of cpu.
Paolo Bonzini Dec. 24, 2014, 11:11 a.m. UTC | #2
On 24/12/2014 12:02, Jamie Heilman wrote:
> Running qemu-system-x86_64 -machine pc,accel=kvm -nodefaults works,
> my real (headless) kvm guests work, but this new patch makes running
> "qemu-system-x86_64 -machine pc,accel=kvm" fail again, this time with
> errors in the host to the tune of:
> 
> [...]
> 
> over and over and over ad nauseum, or until I kill the qemu command,
> it also eats a core's worth of cpu.

Yeah, I'm fairly sure that the second hunk of Tiejun's patch is not
correct, but he's on the right track.  I hope to post a fix today, else
on the 27th or 29th.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tiejun Chen Dec. 25, 2014, 7:56 a.m. UTC | #3
On 2014/12/24 19:11, Paolo Bonzini wrote:
>
>
> On 24/12/2014 12:02, Jamie Heilman wrote:
>> Running qemu-system-x86_64 -machine pc,accel=kvm -nodefaults works,
>> my real (headless) kvm guests work, but this new patch makes running
>> "qemu-system-x86_64 -machine pc,accel=kvm" fail again, this time with
>> errors in the host to the tune of:
>>
>> [...]
>>
>> over and over and over ad nauseum, or until I kill the qemu command,
>> it also eats a core's worth of cpu.

The message above seems unrelated to our memslot issue. I'm not 100%
sure, but I can actually run this case:

qemu-system-x86_64 -machine pc,accel=kvm -m 2048 -smp 2 -hda ubuntu.img

Only one patch, "kvm: x86: vmx: reorder some msr writing", is needed
here. So I guess you could try your 3.19-rc1 plus that patch, and
also pull qemu.

>
> Yeah, I'm fairly sure that the second hunk of Tiejun's patch is not
> correct, but he's on the right track.  I hope to post a fix today, else

Yeah, it looks like that will break the !next case, so I regenerated the
patch and posted it in another email. Now at least I can run both Andy's
next case and a normal case, "qemu-system-x86_64 -machine pc,accel=kvm",
at the same time. But if I'm missing something, please correct me directly :)

Tiejun
Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f528343..a2d928c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -672,6 +672,7 @@  static void update_memslots(struct kvm_memslots *slots,
         WARN_ON(mslots[i].id != id);
         if (!new->npages) {
                 new->base_gfn = 0;
+               new->flags = 0;
                 if (mslots[i].npages)
                         slots->used_slots--;
         } else {
@@ -688,7 +689,7 @@  static void update_memslots(struct kvm_memslots *slots,
                 i++;
         }
         while (i > 0 &&
-              new->base_gfn > mslots[i - 1].base_gfn) {
+              new->base_gfn >= mslots[i - 1].base_gfn) {
                 mslots[i] = mslots[i - 1];
                 slots->id_to_index[mslots[i].id] = i;
                 i--;
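
For readers following the second hunk, here is a minimal userspace sketch of the
re-sorting idea that update_memslots() appears to implement after commit
0e60b0799fed: slots are kept in descending base_gfn order, and an updated slot
is shifted toward the tail or the head until it is back in place. This is a toy
model under stated assumptions, not the kernel code. The names toy_slot,
toy_slots, toy_init, toy_update and TOY_SLOTS are hypothetical, and only the
fields needed for the ordering are kept. It uses the strict ">" comparison from
the pre-patch code.

```c
#include <assert.h>

#define TOY_SLOTS 4	/* hypothetical; stands in for KVM_MEM_SLOTS_NUM */

struct toy_slot {
	unsigned long base_gfn;
	unsigned long npages;	/* 0 means the slot is unused */
	int id;
};

struct toy_slots {
	struct toy_slot slot[TOY_SLOTS];	/* sorted by descending base_gfn */
	int id_to_index[TOY_SLOTS];
};

/* Start with every slot empty and slot id i stored at index i. */
static void toy_init(struct toy_slots *s)
{
	int i;

	for (i = 0; i < TOY_SLOTS; i++) {
		s->slot[i].base_gfn = 0;
		s->slot[i].npages = 0;
		s->slot[i].id = i;
		s->id_to_index[i] = i;
	}
}

/* Store *upd and move it to its sorted position (assumed simplification). */
static void toy_update(struct toy_slots *s, const struct toy_slot *upd)
{
	struct toy_slot *m = s->slot;
	struct toy_slot n = *upd;
	int i = s->id_to_index[n.id];

	assert(m[i].id == n.id);
	if (!n.npages)
		n.base_gfn = 0;	/* deleted slots sort to the tail */

	/* Shift toward the tail past used slots with base_gfn >= ours. */
	while (i < TOY_SLOTS - 1 && n.base_gfn <= m[i + 1].base_gfn) {
		if (!m[i + 1].npages)
			break;
		m[i] = m[i + 1];
		s->id_to_index[m[i].id] = i;
		i++;
	}
	/* Shift toward the head past slots with a strictly smaller base_gfn. */
	while (i > 0 && n.base_gfn > m[i - 1].base_gfn) {
		m[i] = m[i - 1];
		s->id_to_index[m[i].id] = i;
		i--;
	}

	m[i] = n;
	s->id_to_index[n.id] = i;
}
```

In this toy model, replacing the strict ">" in the second loop with ">=" would
let a deleted slot (base_gfn reset to 0) move ahead of a used slot that starts
at GFN 0. Whether that is the problem with the second hunk in the real code is
not established in this thread.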