
[PATCHv2,4/7] KVM: SVM: Report NMI not allowed when Guest busy handling VNMI

Message ID 20220709134230.2397-5-santosh.shukla@amd.com (mailing list archive)
State New, archived
Series: Virtual NMI feature

Commit Message

Santosh Shukla July 9, 2022, 1:42 p.m. UTC
In the VNMI case, report that NMI injection is not allowed when the processor
has set V_NMI_MASK to 1, which means the guest is busy handling a VNMI.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v2:
- Moved the vNMI check after is_guest_mode() in svm_nmi_blocked().
- Removed the is_vnmi_mask_set() check from svm_enable_nmi_window(),
  as it was a redundant check.

 arch/x86/kvm/svm/svm.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Sean Christopherson July 20, 2022, 9:54 p.m. UTC | #1
On Sat, Jul 09, 2022, Santosh Shukla wrote:
> In the VNMI case, Report NMI is not allowed when the processor set the
> V_NMI_MASK to 1 which means the Guest is busy handling VNMI.
> 
> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
> ---
> v2:
> - Moved vnmi check after is_guest_mode() in func _nmi_blocked().
> - Removed is_vnmi_mask_set check from _enable_nmi_window().
> as it was a redundent check.
> 
>  arch/x86/kvm/svm/svm.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3574e804d757..44c1f2317b45 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3480,6 +3480,9 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
>  	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
>  		return false;
>  
> +	if (is_vnmi_enabled(svm) && is_vnmi_mask_set(svm))
> +		return true;
> +
>  	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
>  	      (vcpu->arch.hflags & HF_NMI_MASK);
>  
> @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  
> +	if (is_vnmi_enabled(svm))
> +		return;

Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
if there isn't, this is broken for KVM.

On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
the first NMI will be delivered and the second will be pended, i.e. software will
see both NMIs.  And if that doesn't hold true, the window for a true collision is
really, really tiny.

But in KVM, because a vCPU may not be run a long duration, that window becomes
very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
NMIs become unmasked _after_ the first NMI is injected.

> +
>  	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
>  		return; /* IRET will cause a vm exit */
>  
> -- 
> 2.25.1
>
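[Editor's note] Sean's point about KVM allowing two NMIs to be _pending_ can be sketched as a toy state machine. This is plain C for illustration only, not kernel code; the struct and function names are made up, and it only models the accounting described above (a cap of two pending NMIs, injection masking NMIs until the handler's IRET retires):

```c
#include <assert.h>

/* Toy model of KVM's NMI pending accounting: up to two NMIs may be
 * pending while NMIs are masked; further arrivals are dropped. */
struct vcpu_model {
	int nmi_pending;   /* 0..2, mirrors KVM's cap of two pending NMIs */
	int nmi_masked;    /* guest currently in an NMI handler */
};

/* A new NMI arrives for the vCPU. */
static void nmi_arrives(struct vcpu_model *v)
{
	if (v->nmi_pending < 2)
		v->nmi_pending++;   /* a third "simultaneous" NMI is dropped */
}

/* NMIs become unmasked (e.g. the guest's IRET retires): inject one
 * pending NMI, which masks NMIs again.  Returns 1 if one was delivered. */
static int nmi_unmasked(struct vcpu_model *v)
{
	v->nmi_masked = 0;
	if (v->nmi_pending) {
		v->nmi_pending--;
		v->nmi_masked = 1;
		return 1;
	}
	return 0;
}
```

The second pending NMI is exactly why KVM needs an exit when NMIs become unmasked: the model still holds one NMI after the first delivery.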
Maxim Levitsky July 21, 2022, 12:05 p.m. UTC | #2
On Wed, 2022-07-20 at 21:54 +0000, Sean Christopherson wrote:
> On Sat, Jul 09, 2022, Santosh Shukla wrote:
> > In the VNMI case, Report NMI is not allowed when the processor set the
> > V_NMI_MASK to 1 which means the Guest is busy handling VNMI.
> > 
> > Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
> > ---
> > v2:
> > - Moved vnmi check after is_guest_mode() in func _nmi_blocked().
> > - Removed is_vnmi_mask_set check from _enable_nmi_window().
> > as it was a redundent check.
> > 
> >  arch/x86/kvm/svm/svm.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 3574e804d757..44c1f2317b45 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -3480,6 +3480,9 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
> >  	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
> >  		return false;
> >  
> > +	if (is_vnmi_enabled(svm) && is_vnmi_mask_set(svm))
> > +		return true;
> > +
> >  	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
> >  	      (vcpu->arch.hflags & HF_NMI_MASK);
> >  
> > @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
> >  {
> >  	struct vcpu_svm *svm = to_svm(vcpu);
> >  
> > +	if (is_vnmi_enabled(svm))
> > +		return;
> 
> Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
> if there isn't, this is broken for KVM.
> 
> On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
> the first NMI will be delivered and the second will be pended, i.e. software will
> see both NMIs.  And if that doesn't hold true, the window for a true collision is
> really, really tiny.
> 
> But in KVM, because a vCPU may not be run a long duration, that window becomes
> very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
> NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
> NMIs become unmasked _after_ the first NMI is injected.


This is how I see this:

- When an NMI arrives and no NMI is either injected (V_NMI_PENDING) or in service (V_NMI_MASK),
  then all that is needed to inject the NMI is to set the V_NMI_PENDING bit and do a VM entry.

- If V_NMI_PENDING is set but not V_NMI_MASK, and another NMI arrives, we can make
  svm_nmi_allowed() return -EBUSY, which will cause an immediate VM exit.

  Then, assuming the vNMI takes priority over the fake interrupt we raise, it will be injected,
  and upon the immediate VM exit we can queue another NMI by setting V_NMI_PENDING again;
  later, when the guest is done with the first NMI, it will take the second.

  Of course if we get a nested exception, then it will be fun....

  (The patches don't do this (cause an immediate VM exit),
  but I think we should make svm_nmi_allowed() check for the
  V_NMI_PENDING && !V_NMI_MASK case and return -EBUSY.)


- If both V_NMI_PENDING and V_NMI_MASK are set, then I guess we lose an NMI.
 (It means that the guest is handling an NMI, there is a pending NMI, and now
 another NMI arrived)

 Sean, this is the problem you mention, right?

Best regards,
	Maxim Levitsky

> 
> > +
> >  	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
> >  		return; /* IRET will cause a vm exit */
> >  
> > -- 
> > 2.25.1
> >
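[Editor's note] Maxim's three cases above could be sketched roughly as follows. This is illustrative C, not the actual patch: `vnmi_nmi_allowed` and the return-value macros are hypothetical stand-ins (the real `svm_nmi_allowed()` returns `-EBUSY` through KVM's callback contract to request an immediate exit):

```c
#include <assert.h>

#define NMI_ALLOWED        1    /* free slot: set V_NMI_PENDING and enter */
#define NMI_BLOCKED        0    /* NMI cannot be taken now */
#define NMI_IMMEDIATE_EXIT (-1) /* stand-in for -EBUSY: request immediate VM exit */

/* Decision table for the three cases discussed above, keyed on the
 * hypothetical V_NMI_PENDING / V_NMI_MASK state. */
static int vnmi_nmi_allowed(int vnmi_pending, int vnmi_masked)
{
	if (vnmi_masked)
		return NMI_BLOCKED;        /* guest in NMI handler; a second
					    * pending NMI here is dropped */
	if (vnmi_pending)
		return NMI_IMMEDIATE_EXIT; /* V_NMI_PENDING && !V_NMI_MASK */
	return NMI_ALLOWED;
}
```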
Sean Christopherson July 21, 2022, 2:59 p.m. UTC | #3
On Thu, Jul 21, 2022, Maxim Levitsky wrote:
> On Wed, 2022-07-20 at 21:54 +0000, Sean Christopherson wrote:
> > On Sat, Jul 09, 2022, Santosh Shukla wrote:
> > > @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
> > >  {
> > >  	struct vcpu_svm *svm = to_svm(vcpu);
> > >  
> > > +	if (is_vnmi_enabled(svm))
> > > +		return;
> > 
> > Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
> > if there isn't, this is broken for KVM.
> > 
> > On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
> > the first NMI will be delivered and the second will be pended, i.e. software will
> > see both NMIs.  And if that doesn't hold true, the window for a true collision is
> > really, really tiny.
> > 
> > But in KVM, because a vCPU may not be run a long duration, that window becomes
> > very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
> > NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
> > NMIs become unmasked _after_ the first NMI is injected.
> 
> 
> This is how I see this:
> 
> - When a NMI arrives and neither NMI is injected (V_NMI_PENDING) nor in service (V_NMI_MASK)
>   then all it is needed to inject the NMI will be to set the V_NMI_PENDING bit and do VM entry.
> 
> - If V_NMI_PENDING is set but not V_NMI_MASK, and another NMI arrives we can make the
>   svm_nmi_allowed return -EBUSY which will cause immediate VM exit,
> 
>   and if hopefully vNMI takes priority over the fake interrupt we raise, it will be injected,

Nit (for other readers following along), it's not a fake interrupt, I would describe
it as spurious or ignored.  It's very much a real IRQ, which matters because it
factors into event priority.

>   and upon immediate VM exit we can inject another NMI by setting the V_NMI_PENDING again,
>   and later when the guest is done with first NMI, it will take the second.

Yeaaaah.  This depends heavily on the vNMI being prioritized over the IRQ.

>   Of course if we get a nested exception, then it will be fun....
> 
>   (the patches don't do it (causing immediate VM exit), 
>   but I think we should make the svm_nmi_allowed, check for the case for 
>   V_NMI_PENDING && !V_NMI_MASK and make it return -EBUSY).

Yep, though I think there's a wrinkle (see below).

> - If both V_NMI_PENDING and V_NMI_MASK are set, then I guess we lose an NMI.
>  (It means that the guest is handling an NMI, there is a pending NMI, and now
>  another NMI arrived)
> 
>  Sean, this is the problem you mention, right?

Yep.  Dropping an NMI in the last case is ok, AFAIK no CPU will pend multiple NMIs
while another is in-flight.  But triggering an immediate exit in svm_nmi_allowed()
will hang the vCPU as the second pending NMI will never go away since the vCPU
won't make forward progress to unmask NMIs.  This can also happen if there are
two pending NMIs and GIF=0, i.e. any time there are multiple pending NMIs and NMIs
are blocked.

One other question: what happens if software atomically sets V_NMI_PENDING while
the VMCB is in use?  I assume bad things?  I.e. I assume KVM can't "post" NMIs :-)
Maxim Levitsky July 21, 2022, 3:31 p.m. UTC | #4
On Thu, 2022-07-21 at 14:59 +0000, Sean Christopherson wrote:
> On Thu, Jul 21, 2022, Maxim Levitsky wrote:
> > On Wed, 2022-07-20 at 21:54 +0000, Sean Christopherson wrote:
> > > On Sat, Jul 09, 2022, Santosh Shukla wrote:
> > > > @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	struct vcpu_svm *svm = to_svm(vcpu);
> > > >  
> > > > +	if (is_vnmi_enabled(svm))
> > > > +		return;
> > > 
> > > Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
> > > if there isn't, this is broken for KVM.
> > > 
> > > On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
> > > the first NMI will be delivered and the second will be pended, i.e. software will
> > > see both NMIs.  And if that doesn't hold true, the window for a true collision is
> > > really, really tiny.
> > > 
> > > But in KVM, because a vCPU may not be run a long duration, that window becomes
> > > very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
> > > NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
> > > NMIs become unmasked _after_ the first NMI is injected.
> > 
> > This is how I see this:
> > 
> > - When a NMI arrives and neither NMI is injected (V_NMI_PENDING) nor in service (V_NMI_MASK)
> >   then all it is needed to inject the NMI will be to set the V_NMI_PENDING bit and do VM entry.
> > 
> > - If V_NMI_PENDING is set but not V_NMI_MASK, and another NMI arrives we can make the
> >   svm_nmi_allowed return -EBUSY which will cause immediate VM exit,
> > 
> >   and if hopefully vNMI takes priority over the fake interrupt we raise, it will be injected,
> 
> Nit (for other readers following along), it's not a fake interrupt,I would describe
> it as spurious or ignored.  It's very much a real IRQ, which matters because it
> factors into event priority.

Yep, 100% agree.


> 
> >   and upon immediate VM exit we can inject another NMI by setting the V_NMI_PENDING again,
> >   and later when the guest is done with first NMI, it will take the second.
> 
> Yeaaaah.  This depends heavily on the vNMI being prioritized over the IRQ.
> 
> >   Of course if we get a nested exception, then it will be fun....
> > 
> >   (the patches don't do it (causing immediate VM exit), 
> >   but I think we should make the svm_nmi_allowed, check for the case for 
> >   V_NMI_PENDING && !V_NMI_MASK and make it return -EBUSY).
> 
> Yep, though I think there's a wrinkle (see below).
> 
> > - If both V_NMI_PENDING and V_NMI_MASK are set, then I guess we lose an NMI.
> >  (It means that the guest is handling an NMI, there is a pending NMI, and now
> >  another NMI arrived)
> > 
> >  Sean, this is the problem you mention, right?
> 
> Yep.  Dropping an NMI in the last case is ok, AFAIK no CPU will pend multiple NMIs
> while another is in-flight.  But triggering an immediate exit in svm_nmi_allowed()
> will hang the vCPU as the second pending NMI will never go away since the vCPU

The idea is to trigger the immediate exit only when a NMI was just injected (V_NMI_PENDING=1)
but not masked (that is currently in service, that is V_NMI_MASK=0).

In case both bits are set, the NMI is dropped, that is no immediate exit is requested.

In this case, next VM entry should have no reason to not inject the NMI and then VM exit
on the interrupt we raised, so there should not be a problem with forward progress.

There is an issue still, the NMI could also be masked if we are in SMM (I suggested
setting the V_NMI_MASK manually in this case), thus in this case we won't have more
than one pending NMI, but I guess this is not that big a problem.

We can btw also in this case "open" the NMI window by waiting for RSM intercept.
(that is just not inject the NMI, and on RSM inject it, I think that KVM already does this)

I think it should overall work, but no doubt I expect issues and corner cases.


> won't make forward progress to unmask NMIs.  This can also happen if there are
> two pending NMIs and GIF=0, i.e. any time there are multiple pending NMIs and NMIs
> are blocked.

GIF=0 can be dealt with though, if GIF is 0 when 2nd pending NMI arrives, we can
delay its injection to the moment the STGI is executed and intercept STGI.

I think we already do something like that as well.

> 
> One other question: what happens if software atomically sets V_NMI_PENDING while
> the VMCB is in use?  I assume bad things?  I.e. I assume KVM can't "post" NMIs :-)

I can't answer that but I bet that this bit is only checked on VM entry.


Another question about the spec (sorry if I already asked this):

if vGIF is used, and the guest does CLGI, does the CPU set V_NMI_MASK itself,
or does the CPU consider both V_GIF and V_NMI_MASK in the int_ctl
as a condition for delivering the NMI? I think the latter should be true,
and thus V_NMI_MASK is more like V_NMI_IN_SERVICE.

Best regards,
	Maxim Levitsky

>
Sean Christopherson July 21, 2022, 4:08 p.m. UTC | #5
On Thu, Jul 21, 2022, Maxim Levitsky wrote:
> On Thu, 2022-07-21 at 14:59 +0000, Sean Christopherson wrote:
> > Yep.  Dropping an NMI in the last case is ok, AFAIK no CPU will pend multiple NMIs
> > while another is in-flight.  But triggering an immediate exit in svm_nmi_allowed()
> > will hang the vCPU as the second pending NMI will never go away since the vCPU
> 
> The idea is to trigger the immediate exit only when a NMI was just injected (V_NMI_PENDING=1)
> but not masked (that is currently in service, that is V_NMI_MASK=0).

I assume you mean "and an NMI is currently NOT in service"?

Anyways, we're on the same page, trigger an exit if and only if there's an NMI pending
and the vCPU isn't already handling a vNMI.  We may need to explicitly drop one of
the pending NMIs in that case though, otherwise the NMI that _KVM_ holds pending could
get "injected" well after NMIs are unmasked, which could surprise the guest.  E.g.
guest IRETs from the second (of three) NMIs, KVM doesn't "inject" that third NMI
until the next VM-Exit, which could be a long time in the future.

> In case both bits are set, the NMI is dropped, that is no immediate exit is requested.
> 
> In this case, next VM entry should have no reason to not inject the NMI and then VM exit
> on the interrupt we raised, so there should not be a problem with forward progress.
> 
> There is an issue still, the NMI could also be masked if we are in SMM (I suggested
> setting the V_NMI_MASK manually in this case), thus in this case we won't have more
> that one pending NMI, but I guess this is not that big problem.
> 
> We can btw also in this case "open" the NMI window by waiting for RSM intercept.
> (that is just not inject the NMI, and on RSM inject it, I think that KVM already does this)
> 
> I think it should overal work, but no doubt I do expect issues and corner cases,
> 
> 
> > won't make forward progress to unmask NMIs.  This can also happen if there are
> > two pending NMIs and GIF=0, i.e. any time there are multiple pending NMIs and NMIs
> > are blocked.
> 
> GIF=0 can be dealt with though, if GIF is 0 when 2nd pending NMI arrives, we can
> delay its injection to the moment the STGI is executed and intercept STGI.
> 
> We I think already do something like that as well.

Yep, you're right, svm_enable_nmi_window() sets INTERCEPT_STGI if VGIF is enabled
and GIF=0 (and STGI exits unconditional if VGIF=0?).

So we have a poor man's NMI-window exiting.
Maxim Levitsky July 21, 2022, 4:17 p.m. UTC | #6
On Thu, 2022-07-21 at 16:08 +0000, Sean Christopherson wrote:
> On Thu, Jul 21, 2022, Maxim Levitsky wrote:
> > On Thu, 2022-07-21 at 14:59 +0000, Sean Christopherson wrote:
> > > Yep.  Dropping an NMI in the last case is ok, AFAIK no CPU will pend multiple NMIs
> > > while another is in-flight.  But triggering an immediate exit in svm_nmi_allowed()
> > > will hang the vCPU as the second pending NMI will never go away since the vCPU
> > 
> > The idea is to trigger the immediate exit only when a NMI was just injected (V_NMI_PENDING=1)
> > but not masked (that is currently in service, that is V_NMI_MASK=0).
> 
> I assume you mean "and an NMI is currently NOT in service"?

Yes
> 
> Anyways, we're on the same page, trigger an exit if and only if there's an NMI pending
> and the vCPU isn't already handling a vNMI.  We may need to explicitly drop one of
> the pending NMIs in that case though, otherwise the NMI that _KVM_ holds pending could
> get "injected" well after NMIs are unmasked, which could suprise the guest.  E.g.
> guest IRETs from the second (of three) NMIs, KVM doesn't "inject" that third NMI
> until the next VM-Exit, which could be a long time in the future.
> 
> > In case both bits are set, the NMI is dropped, that is no immediate exit is requested.
> > 
> > In this case, next VM entry should have no reason to not inject the NMI and then VM exit
> > on the interrupt we raised, so there should not be a problem with forward progress.
> > 
> > There is an issue still, the NMI could also be masked if we are in SMM (I suggested
> > setting the V_NMI_MASK manually in this case), thus in this case we won't have more
> > that one pending NMI, but I guess this is not that big problem.
> > 
> > We can btw also in this case "open" the NMI window by waiting for RSM intercept.
> > (that is just not inject the NMI, and on RSM inject it, I think that KVM already does this)
> > 
> > I think it should overal work, but no doubt I do expect issues and corner cases,
> > 
> > 
> > > won't make forward progress to unmask NMIs.  This can also happen if there are
> > > two pending NMIs and GIF=0, i.e. any time there are multiple pending NMIs and NMIs
> > > are blocked.
> > 
> > GIF=0 can be dealt with though, if GIF is 0 when 2nd pending NMI arrives, we can
> > delay its injection to the moment the STGI is executed and intercept STGI.
> > 
> > We I think already do something like that as well.
> 
> Yep, you're right, svm_enable_nmi_window() sets INTERCEPT_STGI if VGIF is enabled
> and GIF=0 (and STGI exits unconditional if VGIF=0?

It's not unconditional, but KVM has to set the intercept, otherwise the guest
would control the host's GIF.

Best regards,
	Maxim Levitsky


> ).
> 
> So we have a poor man's NMI-window exiting.

Yep, we also intercept IRET for the same purpose, and RSM interception
is also a place where NMIs are evaluated.

We only single step over the IRET, because NMIs are unmasked _after_ the IRET
retires.

Best regards,
	Maxim Levitsky
>
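[Editor's note] The "poor man's NMI-window" logic Sean and Maxim converge on — wait for STGI when GIF=0, otherwise single-step over IRET because NMIs unmask only after it retires — could be summarized as below. This is an illustrative sketch; the enum names are not kernel identifiers:

```c
#include <assert.h>

/* Hypothetical summary of how an NMI window is "opened" on SVM
 * without true NMI-window exiting, per the discussion above. */
enum nmi_window_action {
	WAIT_FOR_STGI,  /* GIF=0: intercept STGI, evaluate NMIs when GIF goes 1 */
	WAIT_FOR_IRET,  /* NMIs masked: single-step over IRET; NMIs unmask
			 * after the IRET retires */
	INJECT_NOW,     /* nothing blocks the NMI */
};

static enum nmi_window_action open_nmi_window(int gif, int nmi_masked)
{
	if (!gif)
		return WAIT_FOR_STGI;
	if (nmi_masked)
		return WAIT_FOR_IRET;
	return INJECT_NOW;
}
```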
Sean Christopherson July 21, 2022, 4:25 p.m. UTC | #7
On Thu, Jul 21, 2022, Maxim Levitsky wrote:
> On Thu, 2022-07-21 at 16:08 +0000, Sean Christopherson wrote:
> > So we have a poor man's NMI-window exiting.
> 
> Yep, we also intercept IRET for the same purpose, and RSM interception
> is also a place the NMI are evaluated.
> 
> We only single step over the IRET, because NMIs are unmasked _after_ the IRET
> retires.

Heh, check out this blurb from Intel's SDM:

  An execution of the IRET instruction unblocks NMIs even if the instruction
  causes a fault. For example, if the IRET instruction executes with EFLAGS.VM = 1
  and IOPL of less than 3, a general-protection exception is generated (see
  Section 20.2.7, “Sensitive Instructions”). In such a case, NMIs are unmasked
  before the exception handler is invoked.

Not that I want to try and handle that in KVM if AMD follows suit, I simply find
it amusing how messy this all is.  A true NMI-window exit would have been nice...
Santosh Shukla July 29, 2022, 5:51 a.m. UTC | #8
Hello Sean,

On 7/21/2022 3:24 AM, Sean Christopherson wrote:
> On Sat, Jul 09, 2022, Santosh Shukla wrote:
>> In the VNMI case, Report NMI is not allowed when the processor set the
>> V_NMI_MASK to 1 which means the Guest is busy handling VNMI.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>> ---
>> v2:
>> - Moved vnmi check after is_guest_mode() in func _nmi_blocked().
>> - Removed is_vnmi_mask_set check from _enable_nmi_window().
>> as it was a redundent check.
>>
>>  arch/x86/kvm/svm/svm.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index 3574e804d757..44c1f2317b45 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3480,6 +3480,9 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
>>  	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
>>  		return false;
>>  
>> +	if (is_vnmi_enabled(svm) && is_vnmi_mask_set(svm))
>> +		return true;
>> +
>>  	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
>>  	      (vcpu->arch.hflags & HF_NMI_MASK);
>>  
>> @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vcpu_svm *svm = to_svm(vcpu);
>>  
>> +	if (is_vnmi_enabled(svm))
>> +		return;
> 
> Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
> if there isn't, this is broken for KVM.
> 

Yes, there is.

NMI_INTERCEPT will trigger a VMEXIT when a second NMI arrives while the guest is busy
handling the first NMI. In that scenario, the guest will exit with V_NMI_MASK set to 1,
and KVM can inject the pending (second) NMI (V_NMI=1). The guest will resume handling
the first NMI, then HW will clear V_NMI_MASK and later take the pending V_NMI inside the guest.

I'll handle above case in v3.

Thanks,
Santosh

> On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
> the first NMI will be delivered and the second will be pended, i.e. software will
> see both NMIs.  And if that doesn't hold true, the window for a true collision is
> really, really tiny.
> 
> But in KVM, because a vCPU may not be run a long duration, that window becomes
> very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
> NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
> NMIs become unmasked _after_ the first NMI is injected.
> 
>> +
>>  	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
>>  		return; /* IRET will cause a vm exit */
>>  
>> -- 
>> 2.25.1
>>
Sean Christopherson July 29, 2022, 2:41 p.m. UTC | #9
On Fri, Jul 29, 2022, Shukla, Santosh wrote:
> Hello Sean,
> 
> On 7/21/2022 3:24 AM, Sean Christopherson wrote:
> > On Sat, Jul 09, 2022, Santosh Shukla wrote:

...

> >> @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
> >>  {
> >>  	struct vcpu_svm *svm = to_svm(vcpu);
> >>  
> >> +	if (is_vnmi_enabled(svm))
> >> +		return;
> > 
> > Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
> > if there isn't, this is broken for KVM.
> > 
> 
> Yes. there is.
> 
> NMI_INTERCEPT will trigger VMEXIT when second NMI arrives while guest is busy handling first NMI.

But NMI_INTERCEPT only applies to "real" NMIs.  The scenario laid out below is where
KVM wants to inject a virtual NMI without an associated hardware/real NMI, e.g. if
multiple vCPUs send NMI IPIs to the target.

> And in that scenario, Guest will exit with V_NMI_MASK set to 1, KVM can inject pending(Second)
> NMI(V_NMI=1). Guest will resume handling the first NMI, then HW will
> clear the V_NMI_MASK and later HW will take the pending V_NMI in side the guest. 
> 
> I'll handle above case in v3.
> 
> Thanks,
> Santosh
> 
> > On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
> > the first NMI will be delivered and the second will be pended, i.e. software will
> > see both NMIs.  And if that doesn't hold true, the window for a true collision is
> > really, really tiny.
> > 
> > But in KVM, because a vCPU may not be run a long duration, that window becomes
> > very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
> > NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
> > NMIs become unmasked _after_ the first NMI is injected.
> > 
> >> +
> >>  	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
> >>  		return; /* IRET will cause a vm exit */
> >>  
> >> -- 
> >> 2.25.1
> >>
Santosh Shukla Aug. 4, 2022, 9:51 a.m. UTC | #10
On 7/29/2022 8:11 PM, Sean Christopherson wrote:
> On Fri, Jul 29, 2022, Shukla, Santosh wrote:
>> Hello Sean,
>>
>> On 7/21/2022 3:24 AM, Sean Christopherson wrote:
>>> On Sat, Jul 09, 2022, Santosh Shukla wrote:
> 
> ...
> 
>>>> @@ -3609,6 +3612,9 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
>>>>  {
>>>>  	struct vcpu_svm *svm = to_svm(vcpu);
>>>>  
>>>> +	if (is_vnmi_enabled(svm))
>>>> +		return;
>>>
>>> Ugh, is there really no way to trigger an exit when NMIs become unmasked?  Because
>>> if there isn't, this is broken for KVM.
>>>
>>
>> Yes. there is.
>>
>> NMI_INTERCEPT will trigger VMEXIT when second NMI arrives while guest is busy handling first NMI.
> 
> But NMI_INTERCEPT only applies to "real" NMIs.  The scenario laid out below is where
> KVM wants to inject a virtual NMI without an associated hardware/real NMI, e.g. if
> multiple vCPUs send NMI IPIs to the target.
> 

Yes, HW supports the above case. KVM can inject the pending virtual NMI while the
guest is busy handling the current (virtual) NMI; after the guest finishes handling it,
HW will take the pending virtual NMI on the IRET instruction (w/o a VMEXIT).

Thanks,
Santosh

>> And in that scenario, Guest will exit with V_NMI_MASK set to 1, KVM can inject pending(Second)
>> NMI(V_NMI=1). Guest will resume handling the first NMI, then HW will
>> clear the V_NMI_MASK and later HW will take the pending V_NMI in side the guest. 
>>
>> I'll handle above case in v3.
>>
>> Thanks,
>> Santosh
>>
>>> On bare metal, if two NMIs arrive "simultaneously", so long as NMIs aren't blocked,
>>> the first NMI will be delivered and the second will be pended, i.e. software will
>>> see both NMIs.  And if that doesn't hold true, the window for a true collision is
>>> really, really tiny.
>>>
>>> But in KVM, because a vCPU may not be run a long duration, that window becomes
>>> very large.  To not drop NMIs and more faithfully emulate hardware, KVM allows two
>>> NMIs to be _pending_.  And when that happens, KVM needs to trigger an exit when
>>> NMIs become unmasked _after_ the first NMI is injected.
>>>
>>>> +
>>>>  	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
>>>>  		return; /* IRET will cause a vm exit */
>>>>  
>>>> -- 
>>>> 2.25.1
>>>>
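[Editor's note] The IRET flow Santosh describes — hardware taking a pending V_NMI when the guest's NMI handler returns, without a VMEXIT — can be modeled as a toy sequence. This is illustrative C only, not hardware-accurate; the struct and field names merely mirror the V_NMI_PENDING/V_NMI_MASK bits discussed in the thread:

```c
#include <assert.h>

/* Toy model of the int_ctl vNMI bits as described in this thread. */
struct int_ctl_model {
	int v_nmi_pending;  /* a virtual NMI is queued for delivery */
	int v_nmi_mask;     /* guest is currently in an NMI handler */
};

/* Modeled hardware behavior when the guest's NMI handler IRET retires:
 * the mask clears, and a pending vNMI (if any) is taken immediately,
 * with no VMEXIT. Returns 1 if a pending vNMI was delivered. */
static int guest_iret(struct int_ctl_model *ic)
{
	ic->v_nmi_mask = 0;              /* first NMI handler is done */
	if (ic->v_nmi_pending) {
		ic->v_nmi_pending = 0;
		ic->v_nmi_mask = 1;      /* pending vNMI taken, back in handler */
		return 1;
	}
	return 0;
}
```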

Patch

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3574e804d757..44c1f2317b45 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3480,6 +3480,9 @@  bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
 		return false;
 
+	if (is_vnmi_enabled(svm) && is_vnmi_mask_set(svm))
+		return true;
+
 	ret = (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
 	      (vcpu->arch.hflags & HF_NMI_MASK);
 
@@ -3609,6 +3612,9 @@  static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	if (is_vnmi_enabled(svm))
+		return;
+
 	if ((vcpu->arch.hflags & (HF_NMI_MASK | HF_IRET_MASK)) == HF_NMI_MASK)
 		return; /* IRET will cause a vm exit */