Message ID | 20200601222416.71303-4-jmattson@google.com
---|---
State | New, archived
Series | Add logical CPU to KVM_EXIT_FAIL_ENTRY info
On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> As we already do in svm, record the last logical processor on which a
> vCPU has run, so that it can be communicated to userspace for
> potential hardware errors.
>
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 1 +
>  arch/x86/kvm/vmx/vmx.h | 3 +++
>  2 files changed, 4 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 170cc76a581f..42856970d3b8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> 	if (vcpu->arch.cr2 != read_cr2())
> 		write_cr2(vcpu->arch.cr2);
>
> +	vmx->last_cpu = vcpu->cpu;

This is redundant in the EXIT_FASTPATH_REENTER_GUEST case. Setting it
before reenter_guest is technically wrong if emulation_required is true,
but that doesn't seem like it'd be an issue in practice.

> 	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> 				   vmx->loaded_vmcs->launched);
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 672c28f17e49..8a1e833cf4fb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -302,6 +302,9 @@ struct vcpu_vmx {
> 	u64 ept_pointer;
>
> 	struct pt_desc pt_desc;
> +
> +	/* which host CPU was used for running this vcpu */
> +	unsigned int last_cpu;

Why not put this in struct kvm_vcpu_arch? I'd also vote to name it
last_run_cpu, as last_cpu is super misleading. And if it's in arch, what
about setting it in vcpu_enter_guest?

> };
>
> enum ept_pointers_status {
> --
> 2.27.0.rc2.251.g90737beb825-goog
On Mon, Jun 1, 2020 at 6:21 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Mon, Jun 01, 2020 at 03:24:15PM -0700, Jim Mattson wrote:
> > +	vmx->last_cpu = vcpu->cpu;
>
> This is redundant in the EXIT_FASTPATH_REENTER_GUEST case. Setting it
> before reenter_guest is technically wrong if emulation_required is true,
> but that doesn't seem like it'd be an issue in practice.

I really would like to capture the last logical processor to execute
VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.

> Why not put this in struct kvm_vcpu_arch? I'd also vote to name it
> last_run_cpu, as last_cpu is super misleading.

I think last_run_cpu may also be misleading, since in the cases of
interest, nothing actually 'ran.' Maybe last_attempted_vmentry_cpu?

> And if it's in arch, what about setting it in vcpu_enter_guest?

As you point out above, this isn't entirely accurate. (But that's the
way we roll in kvm, isn't it? :-)
On Tue, Jun 02, 2020 at 10:33:51AM -0700, Jim Mattson wrote:
> On Mon, Jun 1, 2020 at 6:21 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > This is redundant in the EXIT_FASTPATH_REENTER_GUEST case. Setting it
> > before reenter_guest is technically wrong if emulation_required is
> > true, but that doesn't seem like it'd be an issue in practice.
>
> I really would like to capture the last logical processor to execute
> VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.

Does it matter though? The flows that consume the variable are all
directly in the VM-Exit path.

> I think last_run_cpu may also be misleading, since in the cases of
> interest, nothing actually 'ran.' Maybe last_attempted_vmentry_cpu?

Ya, that thought crossed my mind as well.

> > And if it's in arch, what about setting it in vcpu_enter_guest?
>
> As you point out above, this isn't entirely accurate. (But that's the
> way we roll in kvm, isn't it? :-)

As an alternative to storing the last run/attempted CPU, what about moving
the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a new hook
that is called after IRQs are enabled but before preemption is enabled,
e.g. detect_bad_exit or something? All of the paths in patch 4/4 can
easily be moved out of handle_exit. VMX would require a little bit of
refactoring for its "no handler" check, but that should be minor.
On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Jun 02, 2020 at 10:33:51AM -0700, Jim Mattson wrote:
> > I really would like to capture the last logical processor to execute
> > VMLAUNCH/VMRESUME (or VMRUN on the AMD side) on behalf of this vCPU.
>
> Does it matter though? The flows that consume the variable are all
> directly in the VM-Exit path.
>
> As an alternative to storing the last run/attempted CPU, what about
> moving the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a
> new hook that is called after IRQs are enabled but before preemption is
> enabled, e.g. detect_bad_exit or something? All of the paths in patch
> 4/4 can easily be moved out of handle_exit. VMX would require a little
> bit of refactoring for its "no handler" check, but that should be minor.

Given the alternatives, I'm willing to compromise my principles wrt
emulation_required. :-) I'll send out v4 soon.
On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> On Tue, Jun 2, 2020 at 7:24 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> > As an alternative to storing the last run/attempted CPU, what about
> > moving the "bad VM-Exit" detection into handle_exit_irqoff, or maybe a
> > new hook that is called after IRQs are enabled but before preemption
> > is enabled, e.g. detect_bad_exit or something? All of the paths in
> > patch 4/4 can easily be moved out of handle_exit. VMX would require a
> > little bit of refactoring for its "no handler" check, but that should
> > be minor.
>
> Given the alternatives, I'm willing to compromise my principles wrt
> emulation_required. :-) I'll send out v4 soon.

What do you dislike about the alternative approach?
On Thu, Jun 4, 2020 at 11:47 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Wed, Jun 03, 2020 at 01:18:31PM -0700, Jim Mattson wrote:
> > Given the alternatives, I'm willing to compromise my principles wrt
> > emulation_required. :-) I'll send out v4 soon.
>
> What do you dislike about the alternative approach?

Mainly, I wanted to stash this in a common location so that I could print
it out in our local version of dump_vmcs(). Ideally, we'd like to be able
to identify the bad part(s) just from the kernel logs.

That, and I wouldn't have been as comfortable with the refactoring without
a lot more testing.
On Thu, Jun 04, 2020 at 12:00:33PM -0700, Jim Mattson wrote:
> Mainly, I wanted to stash this in a common location so that I could
> print it out in our local version of dump_vmcs(). Ideally, we'd like to
> be able to identify the bad part(s) just from the kernel logs.

But this would also move dump_vmcs() to before preemption is enabled,
i.e. your version could read the CPU directly.

And actually, if we're talking about ferreting out hardware issues, you
really do want this happening before preemption is enabled so that the
VMCS dump comes from the failing CPU. If the vCPU is migrated, the VMCS
will be dumped after a VMCLEAR->VMPTRLD, i.e. will be written to memory
and pulled back into the VMCS cache on a different CPU, and will also
have been written to by the new CPU to update host state. Odds are that
wouldn't affect the dump in a meaningful way, but never say never.

Tangentially related, what about adding an option to do VMCLEAR at the
end of dump_vmcs(), followed by a dump of raw memory? It'd be useless for
debugging software issues, but might be potentially useful/interesting
for triaging hardware problems.

> That, and I wouldn't have been as comfortable with the refactoring
> without a lot more testing.
On Thu, Jun 4, 2020 at 12:26 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> But this would also move dump_vmcs() to before preemption is enabled,
> i.e. your version could read the CPU directly.

If it backports easily. The bigger the change, the less likely that is.

> And actually, if we're talking about ferreting out hardware issues, you
> really do want this happening before preemption is enabled so that the
> VMCS dump comes from the failing CPU. If the vCPU is migrated, the VMCS
> will be dumped after a VMCLEAR->VMPTRLD, i.e. will be written to memory
> and pulled back into the VMCS cache on a different CPU, and will also
> have been written to by the new CPU to update host state. Odds are that
> wouldn't affect the dump in a meaningful way, but never say never.

True.

> Tangentially related, what about adding an option to do VMCLEAR at the
> end of dump_vmcs(), followed by a dump of raw memory? It'd be useless
> for debugging software issues, but might be potentially
> useful/interesting for triaging hardware problems.

Our dump_vmcs() dumps all vmreadable fields, which should be pretty close
to what we can get from a raw memory dump. We do have additional
instrumentation to aid in determining the layout of the VMCS in memory,
but it is too stupid to figure out how access rights are stored. Maybe it
could be beefed up a little, and we could at least verify that VMCLEAR
dumps the same thing to physical memory that we get from the individual
VMREADs.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 170cc76a581f..42856970d3b8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6730,6 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.cr2 != read_cr2())
 		write_cr2(vcpu->arch.cr2);
 
+	vmx->last_cpu = vcpu->cpu;
 	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
 				   vmx->loaded_vmcs->launched);
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 672c28f17e49..8a1e833cf4fb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -302,6 +302,9 @@ struct vcpu_vmx {
 	u64 ept_pointer;
 
 	struct pt_desc pt_desc;
+
+	/* which host CPU was used for running this vcpu */
+	unsigned int last_cpu;
 };
 
 enum ept_pointers_status {