Message ID | f6ad65ee00ce0207701293200f6aa4d691cac3f8.1490642724.git.jpoimboe@redhat.com (mailing list archive) |
---|---|
State | New, archived |
> While debugging a kernel issue, I found that QEMU always reboots when an
> x86 triple fault occurs, which complicates debugging. QEMU and libvirt
> have a facility for creating a dump when KVM reports
> KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
> that. This gives user space the ability to decide whether to dump,
> pause, shutdown, or reboot.

You probably want QEMU's -no-reboot option.

Triple faults are already reported to userspace with KVM_EXIT_SHUTDOWN,
and it's up to userspace to decide what to do with it. This patch cannot
be applied, because there are guests that do a triple-fault intentionally
in order to reset the machine.

Paolo

> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/kvm/vmx.c         | 3 ++-
>  include/trace/events/kvm.h | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3acde66..1f2694c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5731,7 +5731,8 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
>
>  static int handle_triple_fault(struct kvm_vcpu *vcpu)
>  {
> -        vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
> +        vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
> +        vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
>          return 0;
>  }
>
> diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
> index 8ade3eb..200a3d7 100644
> --- a/include/trace/events/kvm.h
> +++ b/include/trace/events/kvm.h
> @@ -14,7 +14,8 @@
>          ERSN(SHUTDOWN), ERSN(FAIL_ENTRY), ERSN(INTR), ERSN(SET_TPR),    \
>          ERSN(TPR_ACCESS), ERSN(S390_SIEIC), ERSN(S390_RESET), ERSN(DCR),\
>          ERSN(NMI), ERSN(INTERNAL_ERROR), ERSN(OSI), ERSN(PAPR_HCALL),   \
> -        ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH)
> +        ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH),           \
> +        ERSN(SYSTEM_EVENT)
>
>  TRACE_EVENT(kvm_userspace_exit,
>              TP_PROTO(__u32 reason, int errno),
> --
> 2.7.4
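Paolo's point that a triple fault already reaches userspace as KVM_EXIT_SHUTDOWN, and that the policy belongs there, can be sketched roughly as follows. This is only an illustration, not QEMU's actual code: `struct exit_info`, `enum vm_action`, and `classify_exit()` are hypothetical names, and the constants are local stand-ins that mirror the values in `<linux/kvm.h>` so the snippet builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring <linux/kvm.h>; real code would include that header. */
enum {
    KVM_EXIT_SHUTDOWN      = 8,
    KVM_EXIT_SYSTEM_EVENT  = 24,
    KVM_SYSTEM_EVENT_CRASH = 3,
};

/* Hypothetical, trimmed-down view of the fields of struct kvm_run used here. */
struct exit_info {
    uint32_t exit_reason;
    uint32_t system_event_type;
};

/* Hypothetical set of things a VMM might do after an exit. */
enum vm_action {
    VM_ACTION_RESET,      /* restart the guest */
    VM_ACTION_QUIT,       /* exit the VMM */
    VM_ACTION_DUMP,       /* treat as a guest crash, e.g. create a dump */
    VM_ACTION_UNHANDLED,  /* not covered by this sketch */
};

/*
 * Decide what to do with an exit. A triple fault arrives as
 * KVM_EXIT_SHUTDOWN, and because some guests triple-fault on purpose
 * to reset, the default is to reset unless the user asked otherwise.
 */
static enum vm_action classify_exit(const struct exit_info *e, bool no_reboot)
{
    switch (e->exit_reason) {
    case KVM_EXIT_SHUTDOWN:
        return no_reboot ? VM_ACTION_QUIT : VM_ACTION_RESET;
    case KVM_EXIT_SYSTEM_EVENT:
        if (e->system_event_type == KVM_SYSTEM_EVENT_CRASH)
            return VM_ACTION_DUMP;
        return VM_ACTION_UNHANDLED;
    default:
        return VM_ACTION_UNHANDLED;
    }
}
```

With the patch applied, every triple fault would take the KVM_SYSTEM_EVENT_CRASH branch, so a guest that triple-faults intentionally to reset would be treated as crashed, which is the objection raised above.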
On Tue, Mar 28, 2017 at 03:51:01AM -0400, Paolo Bonzini wrote:
>
> > While debugging a kernel issue, I found that QEMU always reboots when an
> > x86 triple fault occurs, which complicates debugging. QEMU and libvirt
> > have a facility for creating a dump when KVM reports
> > KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
> > that. This gives user space the ability to decide whether to dump,
> > pause, shutdown, or reboot.
>
> You probably want QEMU's -no-reboot option.
>
> Triple faults are already reported to userspace with KVM_EXIT_SHUTDOWN,
> and it's up to userspace to decide what to do with it. This patch cannot
> be applied, because there are guests that do a triple-fault intentionally
> in order to reset the machine.

Ok. Any idea how to force libvirt to create a dump? It has a
'coredump-destroy' option, but that only seems to work with 'on_crash':

  https://libvirt.org/formatdomain.html#elementsEvents
On 28/03/2017 13:46, Josh Poimboeuf wrote:
> On Tue, Mar 28, 2017 at 03:51:01AM -0400, Paolo Bonzini wrote:
>>
>>> While debugging a kernel issue, I found that QEMU always reboots when an
>>> x86 triple fault occurs, which complicates debugging. QEMU and libvirt
>>> have a facility for creating a dump when KVM reports
>>> KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
>>> that. This gives user space the ability to decide whether to dump,
>>> pause, shutdown, or reboot.
>>
>> You probably want QEMU's -no-reboot option.
>>
>> Triple faults are already reported to userspace with KVM_EXIT_SHUTDOWN,
>> and it's up to userspace to decide what to do with it. This patch cannot
>> be applied, because there are guests that do a triple-fault intentionally
>> in order to reset the machine.
>
> Ok. Any idea how to force libvirt to create a dump? It has a
> 'coredump-destroy' option, but that only seems to work with 'on_crash':
>
> https://libvirt.org/formatdomain.html#elementsEvents

Probably QEMU, when invoked with -no-shutdown -no-reboot, should treat
KVM_EXIT_SHUTDOWN as a panic. I can have a go at it, but note that QEMU
is now in hard freeze for the next release, so it will take a while.

However you're using libvirt and it doesn't use -no-reboot.

It's probably possible for libvirt to use -no-reboot more often. The
price would be that if libvirtd crashes and a VM wants to reset, then
the VM gets stuck.

Alternatively, we could generalize -no-shutdown and -no-reboot to
something like:

    -action reset=stop|restart|quit,
            poweroff=stop|quit,
            triple-fault=stop|panic|restart|quit

and teach libvirt about it. The current semantics map relatively easily
to the new option:

                          | reset       | poweroff   | triple-fault
--------------------------+-------------+------------+-------------------
no option                 | restart     | quit       | restart
-no-shutdown              | restart     | stop       | restart
-no-reboot                | quit        | quit       | quit
-no-shutdown -no-reboot   | stop        | stop       | stop (panic?)

Paolo
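The mapping in the table above can be written down as a small lookup, which may make the proposed semantics easier to check. This is a sketch only: the `-action` option is a proposal in this thread, and `enum action`, `struct action_policy`, and `policy_from_legacy()` are hypothetical names, not QEMU code. The open "stop (panic?)" question for the last row is modelled as plain stop.

```c
#include <stdbool.h>

/* Hypothetical per-event actions, named after the values in the proposal. */
enum action { ACT_RESTART, ACT_STOP, ACT_QUIT };

/* One action per guest event, as in the columns of the table. */
struct action_policy {
    enum action reset;
    enum action poweroff;
    enum action triple_fault;
};

/* Rows of the table, indexed as [no_shutdown][no_reboot]. */
static const struct action_policy legacy_table[2][2] = {
    {
        { ACT_RESTART, ACT_QUIT, ACT_RESTART },  /* no option */
        { ACT_QUIT,    ACT_QUIT, ACT_QUIT    },  /* -no-reboot */
    },
    {
        { ACT_RESTART, ACT_STOP, ACT_RESTART },  /* -no-shutdown */
        { ACT_STOP,    ACT_STOP, ACT_STOP    },  /* -no-shutdown -no-reboot */
    },
};

/* Translate the two legacy flags into the equivalent per-event policy. */
static struct action_policy policy_from_legacy(bool no_shutdown, bool no_reboot)
{
    return legacy_table[no_shutdown ? 1 : 0][no_reboot ? 1 : 0];
}
```

Keeping the table as data rather than a chain of conditionals means the last row could be switched from stop to a panic action in one place, should that be the eventual decision.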
On Tue, Mar 28, 2017 at 02:39:34PM +0200, Paolo Bonzini wrote:
>
>
> On 28/03/2017 13:46, Josh Poimboeuf wrote:
> > On Tue, Mar 28, 2017 at 03:51:01AM -0400, Paolo Bonzini wrote:
> >>
> >>> While debugging a kernel issue, I found that QEMU always reboots when an
> >>> x86 triple fault occurs, which complicates debugging. QEMU and libvirt
> >>> have a facility for creating a dump when KVM reports
> >>> KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
> >>> that. This gives user space the ability to decide whether to dump,
> >>> pause, shutdown, or reboot.
> >>
> >> You probably want QEMU's -no-reboot option.
> >>
> >> Triple faults are already reported to userspace with KVM_EXIT_SHUTDOWN,
> >> and it's up to userspace to decide what to do with it. This patch cannot
> >> be applied, because there are guests that do a triple-fault intentionally
> >> in order to reset the machine.
> >
> > Ok. Any idea how to force libvirt to create a dump? It has a
> > 'coredump-destroy' option, but that only seems to work with 'on_crash':
> >
> > https://libvirt.org/formatdomain.html#elementsEvents
>
> Probably QEMU, when invoked with -no-shutdown -no-reboot, should treat
> KVM_EXIT_SHUTDOWN as a panic. I can have a go at it, but note that QEMU
> is now in hard freeze for the next release, so it will take a while.
>
> However you're using libvirt and it doesn't use -no-reboot.
>
> It's probably possible for libvirt to use -no-reboot more often. The
> price would be that if libvirtd crashes and a VM wants to reset, then
> the VM gets stuck.
>
> Alternatively, we could generalize -no-shutdown and -no-reboot to
> something like:
>
>     -action reset=stop|restart|quit,
>             poweroff=stop|quit,
>             triple-fault=stop|panic|restart|quit
>
> and teach libvirt about it. The current semantics map relatively easily
> to the new option:
>
>                           | reset       | poweroff   | triple-fault
> --------------------------+-------------+------------+-------------------
> no option                 | restart     | quit       | restart
> -no-shutdown              | restart     | stop       | restart
> -no-reboot                | quit        | quit       | quit
> -no-shutdown -no-reboot   | stop        | stop       | stop (panic?)

I like your new option proposal. It makes a lot more sense, at least
from the perspective of a novice user (me).

Having some kind of framework in place for dealing with triple faults --
either pausing or dumping -- would be very useful. Right now I can't
even get libvirt to pause when it happens.
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3acde66..1f2694c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5731,7 +5731,8 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu)
 
 static int handle_triple_fault(struct kvm_vcpu *vcpu)
 {
-        vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
+        vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+        vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
         return 0;
 }
 
diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 8ade3eb..200a3d7 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -14,7 +14,8 @@
         ERSN(SHUTDOWN), ERSN(FAIL_ENTRY), ERSN(INTR), ERSN(SET_TPR),    \
         ERSN(TPR_ACCESS), ERSN(S390_SIEIC), ERSN(S390_RESET), ERSN(DCR),\
         ERSN(NMI), ERSN(INTERNAL_ERROR), ERSN(OSI), ERSN(PAPR_HCALL),   \
-        ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH)
+        ERSN(S390_UCONTROL), ERSN(WATCHDOG), ERSN(S390_TSCH),           \
+        ERSN(SYSTEM_EVENT)
 
 TRACE_EVENT(kvm_userspace_exit,
             TP_PROTO(__u32 reason, int errno),
While debugging a kernel issue, I found that QEMU always reboots when an
x86 triple fault occurs, which complicates debugging. QEMU and libvirt
have a facility for creating a dump when KVM reports
KVM_SYSTEM_EVENT_CRASH. So change the VMX triple fault handler to do
that. This gives user space the ability to decide whether to dump,
pause, shutdown, or reboot.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/kvm/vmx.c         | 3 ++-
 include/trace/events/kvm.h | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)