Message ID | 20200423022550.15113-10-sean.j.christopherson@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: Event fixes and cleanup | expand |
On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > Check for an unblocked SMI in vmx_check_nested_events() so that pending > SMIs are correctly prioritized over IRQs and NMIs when the latter events > will trigger VM-Exit. This also fixes an issue where an SMI that was > marked pending while processing a nested VM-Enter wouldn't trigger an > immediate exit, i.e. would be incorrectly delayed until L2 happened to > take a VM-Exit. > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support") > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > --- > arch/x86/kvm/vmx/nested.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > index 1fdaca5fd93d..8c16b190816b 100644 > --- a/arch/x86/kvm/vmx/nested.c > +++ b/arch/x86/kvm/vmx/nested.c > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > return 0; > } > > + if (vcpu->arch.smi_pending && !is_smm(vcpu)) { > + if (block_nested_events) > + return -EBUSY; > + goto no_vmexit; > + } > + From the SDM, volume 3: • System-management interrupts (SMIs), INIT signals, and higher priority events take priority over MTF VM exits. I think this block needs to be moved up.
On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote: > On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson > <sean.j.christopherson@intel.com> wrote: > > > > Check for an unblocked SMI in vmx_check_nested_events() so that pending > > SMIs are correctly prioritized over IRQs and NMIs when the latter events > > will trigger VM-Exit. This also fixes an issue where an SMI that was > > marked pending while processing a nested VM-Enter wouldn't trigger an > > immediate exit, i.e. would be incorrectly delayed until L2 happened to > > take a VM-Exit. > > > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support") > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > --- > > arch/x86/kvm/vmx/nested.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > index 1fdaca5fd93d..8c16b190816b 100644 > > --- a/arch/x86/kvm/vmx/nested.c > > +++ b/arch/x86/kvm/vmx/nested.c > > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > return 0; > > } > > > > + if (vcpu->arch.smi_pending && !is_smm(vcpu)) { > > + if (block_nested_events) > > + return -EBUSY; > > + goto no_vmexit; > > + } > > + > > From the SDM, volume 3: > > • System-management interrupts (SMIs), INIT signals, and higher > priority events take priority over MTF VM exits. > > I think this block needs to be moved up. Hrm. It definitely needs to be moved above the preemption timer, though I can't find any public documentation about the preemption timer's priority. Preemption timer is lower priority than MTF, ergo it's not in the same class as SMI. Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM. I think it makes sense to take the easy road and keep SMI after the traps, with a comment to say it's technically wrong but not worth fixing.
On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote: > > On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson > > <sean.j.christopherson@intel.com> wrote: > > > > > > Check for an unblocked SMI in vmx_check_nested_events() so that pending > > > SMIs are correctly prioritized over IRQs and NMIs when the latter events > > > will trigger VM-Exit. This also fixes an issue where an SMI that was > > > marked pending while processing a nested VM-Enter wouldn't trigger an > > > immediate exit, i.e. would be incorrectly delayed until L2 happened to > > > take a VM-Exit. > > > > > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support") > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > > --- > > > arch/x86/kvm/vmx/nested.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > > index 1fdaca5fd93d..8c16b190816b 100644 > > > --- a/arch/x86/kvm/vmx/nested.c > > > +++ b/arch/x86/kvm/vmx/nested.c > > > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > > return 0; > > > } > > > > > > + if (vcpu->arch.smi_pending && !is_smm(vcpu)) { > > > + if (block_nested_events) > > > + return -EBUSY; > > > + goto no_vmexit; > > > + } > > > + > > > > From the SDM, volume 3: > > > > • System-management interrupts (SMIs), INIT signals, and higher > > priority events take priority over MTF VM exits. > > > > I think this block needs to be moved up. > > Hrm. It definitely needs to be moved above the preemption timer, though I > can't find any public documentation about the preemption timer's priority. > Preemption timer is lower priority than MTF, ergo it's not in the same > class as SMI. > > Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF > and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM. I > think it makes sense to take the easy road and keep SMI after the traps, > with a comment to say it's technically wrong but not worth fixing. Pending debug exceptions should just go in the pending debug exceptions field. End of story and end of complications. I don't understand why kvm is so averse to using this field the way it was intended. As for the MTF, section 34.14.1 of the SDM, volume 3, clearly states: The pseudocode above makes reference to the saving of VMX-critical state. This state consists of the following: (1) SS.DPL (the current privilege level); (2) RFLAGS.VM; (3) the state of blocking by STI and by MOV SS (see Table 24-3 in Section 24.4.2); (4) the state of virtual-NMI blocking (only if the processor is in VMX non-root oper- ation and the “virtual NMIs” VM-execution control is 1); and (5) an indication of whether an MTF VM exit is pending (see Section 25.5.2). These data may be saved internal to the processor or in the VMCS region of the current VMCS. Processors that do not support SMI recognition while there is blocking by STI or by MOV SS need not save the state of such blocking. I haven't really looked at kvm's implementation of SMM (because Google doesn't support it), but it seems that the "MTF VM exit is pending" bit should be trivial to deal with. I assume we save the other VMX-critical state somewhere!
On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote: > > On Wed, Apr 22, 2020 at 7:26 PM Sean Christopherson > > <sean.j.christopherson@intel.com> wrote: > > > > > > Check for an unblocked SMI in vmx_check_nested_events() so that pending > > > SMIs are correctly prioritized over IRQs and NMIs when the latter events > > > will trigger VM-Exit. This also fixes an issue where an SMI that was > > > marked pending while processing a nested VM-Enter wouldn't trigger an > > > immediate exit, i.e. would be incorrectly delayed until L2 happened to > > > take a VM-Exit. > > > > > > Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support") > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > > --- > > > arch/x86/kvm/vmx/nested.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > > > index 1fdaca5fd93d..8c16b190816b 100644 > > > --- a/arch/x86/kvm/vmx/nested.c > > > +++ b/arch/x86/kvm/vmx/nested.c > > > @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) > > > return 0; > > > } > > > > > > + if (vcpu->arch.smi_pending && !is_smm(vcpu)) { > > > + if (block_nested_events) > > > + return -EBUSY; > > > + goto no_vmexit; > > > + } > > > + > > > > From the SDM, volume 3: > > > > • System-management interrupts (SMIs), INIT signals, and higher > > priority events take priority over MTF VM exits. > > > > I think this block needs to be moved up. > > Hrm. It definitely needs to be moved above the preemption timer, though I > can't find any public documentation about the preemption timer's priority. Section 25.2 of the SDM, volume 3: Debug-trap exceptions and higher priority events take priority over VM exits caused by the VMX-preemption timer. VM exits caused by the VMX-preemption timer take priority over VM exits caused by the “NMI-window exiting” VM-execution control and lower priority events.
On Tue, Apr 28, 2020 at 04:16:16PM -0700, Jim Mattson wrote: > On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson > <sean.j.christopherson@intel.com> wrote: > > > > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote: > > > From the SDM, volume 3: > > > > > > • System-management interrupts (SMIs), INIT signals, and higher > > > priority events take priority over MTF VM exits. > > > > > > I think this block needs to be moved up. > > > > Hrm. It definitely needs to be moved above the preemption timer, though I > > can't find any public documentation about the preemption timer's priority. > > Preemption timer is lower priority than MTF, ergo it's not in the same > > class as SMI. > > > > Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF > > and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM. I > > think it makes sense to take the easy road and keep SMI after the traps, > > with a comment to say it's technically wrong but not worth fixing. > > Pending debug exceptions should just go in the pending debug > exceptions field. End of story and end of complications. I don't > understand why kvm is so averse to using this field the way it was > intended. Ah, it took my brain a bit to catch on. I assume you're suggesting calling nested_vmx_updated_pending_dbg() so that the pending #DB naturally gets propagated to/from vmcs12 on SMI/RSM? I think that should work. > As for the MTF, section 34.14.1 of the SDM, volume 3, clearly states: > > The pseudocode above makes reference to the saving of VMX-critical > state. This state consists of the following: > (1) SS.DPL (the current privilege level); (2) RFLAGS.VM; (3) the state > of blocking by STI and by MOV SS (see > Table 24-3 in Section 24.4.2); (4) the state of virtual-NMI blocking > (only if the processor is in VMX non-root oper- > ation and the “virtual NMIs” VM-execution control is 1); and (5) an > indication of whether an MTF VM exit is pending > (see Section 25.5.2). These data may be saved internal to the > processor or in the VMCS region of the current > VMCS. Processors that do not support SMI recognition while there is > blocking by STI or by MOV SS need not save > the state of such blocking. > > I haven't really looked at kvm's implementation of SMM (because Google > doesn't support it), but it seems that the "MTF VM exit is pending" > bit should be trivial to deal with. I assume we save the other > VMX-critical state somewhere! True, I spaced on the extistence of vmx_pre_{enter,leave}_smm(). I'll send a patch, the delta to what's in kvm/queue should actually be quite small.
On Wed, Apr 29, 2020 at 07:50:40AM -0700, Sean Christopherson wrote: > On Tue, Apr 28, 2020 at 04:16:16PM -0700, Jim Mattson wrote: > > On Tue, Apr 28, 2020 at 3:59 PM Sean Christopherson > > <sean.j.christopherson@intel.com> wrote: > > > > > > On Tue, Apr 28, 2020 at 03:04:02PM -0700, Jim Mattson wrote: > > > > From the SDM, volume 3: > > > > > > > > • System-management interrupts (SMIs), INIT signals, and higher > > > > priority events take priority over MTF VM exits. > > > > > > > > I think this block needs to be moved up. > > > > > > Hrm. It definitely needs to be moved above the preemption timer, though I > > > can't find any public documentation about the preemption timer's priority. > > > Preemption timer is lower priority than MTF, ergo it's not in the same > > > class as SMI. > > > > > > Regarding SMI vs. MTF and #DB trap, to actually prioritize SMIs above MTF > > > and #DBs, we'd need to save/restore MTF and pending #DBs via SMRAM. I > > > think it makes sense to take the easy road and keep SMI after the traps, > > > with a comment to say it's technically wrong but not worth fixing. > > > > Pending debug exceptions should just go in the pending debug > > exceptions field. End of story and end of complications. I don't > > understand why kvm is so averse to using this field the way it was > > intended. > > Ah, it took my brain a bit to catch on. I assume you're suggesting calling > nested_vmx_updated_pending_dbg() so that the pending #DB naturally gets > propagated to/from vmcs12 on SMI/RSM? I think that should work. This works for L2, but not L1 :-( And L2 can't be fixed without first fixing L1 because inject_pending_event() also incorrectly prioritizes #DB over SMI. For L1, utilizing SMRAM to save/restore the pending #DB is likely the easiest solution as it avoids having to add new state for migration. I have everything coded up but it'll probably take a few weeks to test and get it sent out, need to focus on other stuff for a while.
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 1fdaca5fd93d..8c16b190816b 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -3750,6 +3750,12 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu) return 0; } + if (vcpu->arch.smi_pending && !is_smm(vcpu)) { + if (block_nested_events) + return -EBUSY; + goto no_vmexit; + } + if (vcpu->arch.nmi_pending && !vmx_nmi_blocked(vcpu)) { if (block_nested_events) return -EBUSY;
Check for an unblocked SMI in vmx_check_nested_events() so that pending SMIs are correctly prioritized over IRQs and NMIs when the latter events will trigger VM-Exit. This also fixes an issue where an SMI that was marked pending while processing a nested VM-Enter wouldn't trigger an immediate exit, i.e. would be incorrectly delayed until L2 happened to take a VM-Exit. Fixes: 64d6067057d96 ("KVM: x86: stubs for SMM support") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/kvm/vmx/nested.c | 6 ++++++ 1 file changed, 6 insertions(+)