diff mbox series

[PATCHv3,5/8] KVM: SVM: Add VNMI support in inject_nmi

Message ID 20220810061226.1286-6-santosh.shukla@amd.com (mailing list archive)
State New, archived
Series Virtual NMI feature

Commit Message

Shukla, Santosh Aug. 10, 2022, 6:12 a.m. UTC
Inject the NMI by setting V_NMI in the VMCB interrupt control. The
processor will clear V_NMI to acknowledge that processing has started and
will keep V_NMI_MASK set until it is done processing the NMI event.

Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
---
v3: 
- Removed WARN_ON check.

v2:
- Added WARN_ON check for vnmi pending.
- Use `get_vnmi_vmcb` to get the correct vmcb for injecting the vNMI.

 arch/x86/kvm/svm/svm.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Maciej S. Szmigiero Aug. 10, 2022, 9:24 p.m. UTC | #1
On 10.08.2022 08:12, Santosh Shukla wrote:
> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
> will clear V_NMI to acknowledge processing has started and will keep the
> V_NMI_MASK set until the processor is done with processing the NMI event.
> 
> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
> ---
> v3:
> - Removed WARN_ON check.
> 
> v2:
> - Added WARN_ON check for vnmi pending.
> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
> 
>   arch/x86/kvm/svm/svm.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e260e8cb0c81..8c4098b8a63e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>   static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct vmcb *vmcb = NULL;
>   
> +	if (is_vnmi_enabled(svm)) {

I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
since if nmi_l1_to_l2 is true then the NMI to be injected originally
comes from L1's VMCB12 EVENTINJ field.

As you said in the cover letter, this field has different semantics
than vNMI - specifically, it should allow injecting even if L2 is in
NMI blocking state (it's then up to L1 to keep track of NMI injectability
for its L2 guest - so L0 should be transparently injecting it when L1
wants it to).

> +		vmcb = get_vnmi_vmcb(svm);
> +		vmcb->control.int_ctl |= V_NMI_PENDING;
> +		++vcpu->stat.nmi_injections;
> +		return;
> +	}
>   	svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
>   
>   	if (svm->nmi_l1_to_l2)

Thanks,
Maciej
Shukla, Santosh Aug. 24, 2022, 12:13 p.m. UTC | #2
Hi Maciej,

On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
> On 10.08.2022 08:12, Santosh Shukla wrote:
>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>> will clear V_NMI to acknowledge processing has started and will keep the
>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>
>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>> ---
>> v3:
>> - Removed WARN_ON check.
>>
>> v2:
>> - Added WARN_ON check for vnmi pending.
>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>
>>   arch/x86/kvm/svm/svm.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e260e8cb0c81..8c4098b8a63e 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>   static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>   {
>>       struct vcpu_svm *svm = to_svm(vcpu);
>> +    struct vmcb *vmcb = NULL;
>>   +    if (is_vnmi_enabled(svm)) {
> 
> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
> since if nmi_l1_to_l2 is true then the NMI to be injected originally
> comes from L1's VMCB12 EVENTINJ field.
> 

Not sure if I understood the case fully, so let me sketch the scenario here:
if nmi_l1_to_l2 is true then the event is coming from EVTINJ, which could
be one of the following cases -
1) L0 (vnmi enabled) and L1 (vnmi disabled)
2) L0 & L1 both vnmi disabled.

In both cases the vnmi check will fail for L1 and the execution path
will fall back to the default - right?

Thanks,
Santosh

> As you said in the cover letter, this field has different semantics
> than vNMI - specifically, it should allow injecting even if L2 is in
> NMI blocking state (it's then up to L1 to keep track of NMI injectability
> for its L2 guest - so L0 should be transparently injecting it when L1
> wants so).
> 
>> +        vmcb = get_vnmi_vmcb(svm);
>> +        vmcb->control.int_ctl |= V_NMI_PENDING;
>> +        ++vcpu->stat.nmi_injections;
>> +        return;
>> +    }
>>       svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
>>         if (svm->nmi_l1_to_l2)
> 
> Thanks,
> Maciej
Maciej S. Szmigiero Aug. 24, 2022, 12:56 p.m. UTC | #3
On 24.08.2022 14:13, Shukla, Santosh wrote:
> Hi Maciej,
> 
> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>> will clear V_NMI to acknowledge processing has started and will keep the
>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>
>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>> ---
>>> v3:
>>> - Removed WARN_ON check.
>>>
>>> v2:
>>> - Added WARN_ON check for vnmi pending.
>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>
>>>    arch/x86/kvm/svm/svm.c | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>> index e260e8cb0c81..8c4098b8a63e 100644
>>> --- a/arch/x86/kvm/svm/svm.c
>>> +++ b/arch/x86/kvm/svm/svm.c
>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>    static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>    {
>>>        struct vcpu_svm *svm = to_svm(vcpu);
>>> +    struct vmcb *vmcb = NULL;
>>>    +    if (is_vnmi_enabled(svm)) {
>>
>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>> comes from L1's VMCB12 EVENTINJ field.
>>
> 
> Not sure if I understood the case fully.. so trying to sketch scenario here -
> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
> be one of following case -
> 1) L0 (vnmi enabled) and L1 (vnmi disabled)

As far as I can see in this case:
is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.

This field in VMCB02 comes from nested_vmcb02_prepare_control() which
in the !nested_vnmi_enabled() case (L1 is not using vNMI) copies these bits
from VMCB01:
> int_ctl_vmcb01_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);

So in this case (L0 uses vNMI) V_NMI_ENABLE will be set in VMCB01, right?

This bit will then be copied to VMCB02 so re-injection will attempt to use
vNMI instead of EVTINJ.

> 2) L0 & L1 both vnmi disabled.

This case is ok.

> 
> In both cases the vnmi check will fail for L1 and execution path
> will fall back to default - right?
> 
> Thanks,
> Santosh

Thanks,
Maciej
Shukla, Santosh Aug. 25, 2022, 10:56 a.m. UTC | #4
On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
> On 24.08.2022 14:13, Shukla, Santosh wrote:
>> Hi Maciej,
>>
>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>
>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>> ---
>>>> v3:
>>>> - Removed WARN_ON check.
>>>>
>>>> v2:
>>>> - Added WARN_ON check for vnmi pending.
>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>
>>>>    arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>    1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>> --- a/arch/x86/kvm/svm/svm.c
>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>    static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>    {
>>>>        struct vcpu_svm *svm = to_svm(vcpu);
>>>> +    struct vmcb *vmcb = NULL;
>>>>    +    if (is_vnmi_enabled(svm)) {
>>>
>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>> comes from L1's VMCB12 EVENTINJ field.
>>>
>>
>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>> be one of following case -
>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
> 
> As far as I can see in this case:
> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
> 

For the L1 vnmi-disabled case, is_vnmi_enabled()->get_vnmi_vmcb() will return false, so the
execution path will opt for the EVTINJ model for re-injection.

Thanks,
Santosh

> This field in VMCB02 comes from nested_vmcb02_prepare_control() which
> in the !nested_vnmi_enabled() case (L1 is not using vNMI) copies these bits
> from VMCB01:
>> int_ctl_vmcb01_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);
> 
> So in this case (L0 uses vNMI) V_NMI_ENABLE will be set in VMCB01, right?
> 
> This bit will then be copied to VMCB02 so re-injection will attempt to use
> vNMI instead of EVTINJ.
> 
>> 2) L0 & L1 both vnmi disabled.
> 
> This case is ok.
> 
>>
>> In both cases the vnmi check will fail for L1 and execution path
>> will fall back to default - right?
>>
>> Thanks,
>> Santosh
> 
> Thanks,
> Maciej
Maciej S. Szmigiero Aug. 25, 2022, 12:45 p.m. UTC | #5
On 25.08.2022 12:56, Shukla, Santosh wrote:
> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>> Hi Maciej,
>>>
>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>
>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>> ---
>>>>> v3:
>>>>> - Removed WARN_ON check.
>>>>>
>>>>> v2:
>>>>> - Added WARN_ON check for vnmi pending.
>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>
>>>>>     arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>     static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>     {
>>>>>         struct vcpu_svm *svm = to_svm(vcpu);
>>>>> +    struct vmcb *vmcb = NULL;
>>>>>     +    if (is_vnmi_enabled(svm)) {
>>>>
>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>
>>>
>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>> be one of following case -
>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>
>> As far as I can see in this case:
>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>
> 
> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
> execution path will opt EVTINJ model for re-injection.

I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
since this function returns a pointer, not a bool.

I can't see, however, how this will happen:
>static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>{
>	if (!vnmi)
>		return NULL;
         ^ "vnmi" variable controls whether L0 uses vNMI,
	   so this variable is true in our case

>
>	if (is_guest_mode(&svm->vcpu))
>		return svm->nested.vmcb02.ptr;
		^ this should be always non-NULL.

So get_vnmi_vmcb() will return VMCB02 pointer in our case, not NULL...

> 
> Thanks,
> Santosh
> 
>> This field in VMCB02 comes from nested_vmcb02_prepare_control() which
>> in the !nested_vnmi_enabled() case (L1 is not using vNMI) copies these bits
>> from VMCB01:
>>> int_ctl_vmcb01_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);
>>
>> So in this case (L0 uses vNMI) V_NMI_ENABLE will be set in VMCB01, right?
>>
>> This bit will then be copied to VMCB02 

... and due to the above is_vnmi_enabled() will return true, so
re-injection will attempt to use vNMI instead of EVTINJ (wrong).

>>> 2) L0 & L1 both vnmi disabled.
>>
>> This case is ok.
>>
>>>
>>> In both cases the vnmi check will fail for L1 and execution path
>>> will fall back to default - right?
>>>
>>> Thanks,
>>> Santosh
>>

Thanks,
Maciej
Shukla, Santosh Aug. 25, 2022, 2:05 p.m. UTC | #6
On 8/25/2022 6:15 PM, Maciej S. Szmigiero wrote:
> On 25.08.2022 12:56, Shukla, Santosh wrote:
>> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>>> Hi Maciej,
>>>>
>>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>>
>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>>> ---
>>>>>> v3:
>>>>>> - Removed WARN_ON check.
>>>>>>
>>>>>> v2:
>>>>>> - Added WARN_ON check for vnmi pending.
>>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>>
>>>>>>     arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>>     1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>>     static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>>     {
>>>>>>         struct vcpu_svm *svm = to_svm(vcpu);
>>>>>> +    struct vmcb *vmcb = NULL;
>>>>>>     +    if (is_vnmi_enabled(svm)) {
>>>>>
>>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>>
>>>>
>>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>>> be one of following case -
>>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>>
>>> As far as I can see in this case:
>>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>>
>>
>> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
>> execution path will opt EVTINJ model for re-injection.
> 
> I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
> since this function returns a pointer, not a bool.
> 

Yes, I meant is_vnmi_enabled() will return false if vnmi param is unset.

> I can't see however, how this will happen:
>> static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>> {
>>     if (!vnmi)
>>         return NULL;
>         ^ "vnmi" variable controls whether L0 uses vNMI,
>        so this variable is true in our case
> 

No.

In L1 case (vnmi disabled) - vnmi param will be false.
In L0 case (vnmi enabled) - vnmi param will be true.

So in L1 case, is_vnmi_enabled() will return false and
in L0 case will return true.

Thanks,
Santosh
>>
>>     if (is_guest_mode(&svm->vcpu))
>>         return svm->nested.vmcb02.ptr;
>         ^ this should be always non-NULL.
> 
> So get_vnmi_vmcb() will return VMCB02 pointer in our case, not NULL...
> 
>>
>> Thanks,
>> Santosh
>>
>>> This field in VMCB02 comes from nested_vmcb02_prepare_control() which
>>> in the !nested_vnmi_enabled() case (L1 is not using vNMI) copies these bits
>>> from VMCB01:
>>>> int_ctl_vmcb01_bits |= (V_NMI_PENDING | V_NMI_ENABLE | V_NMI_MASK);
>>>
>>> So in this case (L0 uses vNMI) V_NMI_ENABLE will be set in VMCB01, right?
>>>
>>> This bit will then be copied to VMCB02 
> 
> ... and due to the above is_vnmi_enabled() will return true, so
> re-injection will attempt to use vNMI instead of EVTINJ (wrong).
> 
>>>> 2) L0 & L1 both vnmi disabled.
>>>
>>> This case is ok.
>>>
>>>>
>>>> In both cases the vnmi check will fail for L1 and execution path
>>>> will fall back to default - right?
>>>>
>>>> Thanks,
>>>> Santosh
>>>
> 
> Thanks,
> Maciej
Maciej S. Szmigiero Aug. 25, 2022, 2:16 p.m. UTC | #7
On 25.08.2022 16:05, Shukla, Santosh wrote:
> On 8/25/2022 6:15 PM, Maciej S. Szmigiero wrote:
>> On 25.08.2022 12:56, Shukla, Santosh wrote:
>>> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>>>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>>>> Hi Maciej,
>>>>>
>>>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>>>
>>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>>>> ---
>>>>>>> v3:
>>>>>>> - Removed WARN_ON check.
>>>>>>>
>>>>>>> v2:
>>>>>>> - Added WARN_ON check for vnmi pending.
>>>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>>>
>>>>>>>      arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>>>      static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>>>      {
>>>>>>>          struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>> +    struct vmcb *vmcb = NULL;
>>>>>>>      +    if (is_vnmi_enabled(svm)) {
>>>>>>
>>>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>>>
>>>>>
>>>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>>>> be one of following case -
>>>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>>>
>>>> As far as I can see in this case:
>>>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>>>
>>>
>>> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
>>> execution path will opt EVTINJ model for re-injection.
>>
>> I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
>> since this function returns a pointer, not a bool.
>>
> 
> Yes, I meant is_vnmi_enabled() will return false if vnmi param is unset.
> 
>> I can't see however, how this will happen:
>>> static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>>> {
>>>      if (!vnmi)
>>>          return NULL;
>>          ^ "vnmi" variable controls whether L0 uses vNMI,
>>         so this variable is true in our case
>>
> 
> No.
> 
> In L1 case (vnmi disabled) - vnmi param will be false.

Perhaps there was a misunderstanding here - the case here
isn't the code under discussion running as L1, but as L0
where L1 is not using vNMI - L1 here can be an old version of KVM,
or Hyper-V, or any other hypervisor.

In this case L0 is re-injecting an EVENTINJ NMI into L2 on
the behalf of L1.
That's when "nmi_l1_to_l2" is true.

Since the code is physically running on L0 (which makes use of vNMI)
it has the "vnmi" param set.

So is_vnmi_enabled() will return true and vNMI will be used
for the re-injection instead of the required EVENTINJ.

> In L0 case (vnmi enabled) - vnmi param will be true.
> 
> So in L1 case, is_vnmi_enabled() will return false and
> in L0 case will return true.
> 
> Thanks,
> Santosh

Thanks,
Maciej
Shukla, Santosh Aug. 26, 2022, 9:35 a.m. UTC | #8
On 8/25/2022 7:46 PM, Maciej S. Szmigiero wrote:
> On 25.08.2022 16:05, Shukla, Santosh wrote:
>> On 8/25/2022 6:15 PM, Maciej S. Szmigiero wrote:
>>> On 25.08.2022 12:56, Shukla, Santosh wrote:
>>>> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>>>>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>>>>> Hi Maciej,
>>>>>>
>>>>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>>>>
>>>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>>>>> ---
>>>>>>>> v3:
>>>>>>>> - Removed WARN_ON check.
>>>>>>>>
>>>>>>>> v2:
>>>>>>>> - Added WARN_ON check for vnmi pending.
>>>>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>>>>
>>>>>>>>      arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>>>>      static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>>>>      {
>>>>>>>>          struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>>> +    struct vmcb *vmcb = NULL;
>>>>>>>>      +    if (is_vnmi_enabled(svm)) {
>>>>>>>
>>>>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>>>>
>>>>>>
>>>>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>>>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>>>>> be one of following case -
>>>>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>>>>
>>>>> As far as I can see in this case:
>>>>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>>>>
>>>>
>>>> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
>>>> execution path will opt EVTINJ model for re-injection.
>>>
>>> I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
>>> since this function returns a pointer, not a bool.
>>>
>>
>> Yes, I meant is_vnmi_enabled() will return false if vnmi param is unset.
>>
>>> I can't see however, how this will happen:
>>>> static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>>>> {
>>>>      if (!vnmi)
>>>>          return NULL;
>>>          ^ "vnmi" variable controls whether L0 uses vNMI,
>>>         so this variable is true in our case
>>>
>>
>> No.
>>
>> In L1 case (vnmi disabled) - vnmi param will be false.
> 
> Perhaps there was a misunderstanding here - the case here
> isn't the code under discussion running as L1, but as L0
> where L1 not using vNMI - L1 here can be an old version of KVM,
> or Hyper-V, or any other hypervisor.
> 

Ok.

> In this case L0 is re-injecting an EVENTINJ NMI into L2 on
> the behalf of L1.
> That's when "nmi_l1_to_l2" is true.
 
hmm, trying to understand the event re-injection flow -
First, L1 (non-vnmi) injects an event into the L2 guest, which in turn
is intercepted by L0. L0 sees the event injection through EVTINJ, so it
sets the 'nmi_l1_to_l2' var, and then L0 calls svm_inject_nmi()
to re-inject the event into L2 - is that the correct (nmi_l1_to_l2) flow?

Thanks,
Santosh
> Since the code is physically running on L0 (which makes use of vNMI)
> it has the "vnmi" param set.
> 
> So is_vnmi_enabled() will return true and vNMI will be used
> for the re-injection instead of the required EVENTINJ.
> 
>> In L0 case (vnmi enabled) - vnmi param will be true.
>>
>> So in L1 case, is_vnmi_enabled() will return false and
>> in L0 case will return true.
>>
>> Thanks,
>> Santosh
> 
> Thanks,
> Maciej
Maciej S. Szmigiero Aug. 26, 2022, 12:20 p.m. UTC | #9
On 26.08.2022 11:35, Shukla, Santosh wrote:
> On 8/25/2022 7:46 PM, Maciej S. Szmigiero wrote:
>> On 25.08.2022 16:05, Shukla, Santosh wrote:
>>> On 8/25/2022 6:15 PM, Maciej S. Szmigiero wrote:
>>>> On 25.08.2022 12:56, Shukla, Santosh wrote:
>>>>> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>>>>>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>>>>>> Hi Maciej,
>>>>>>>
>>>>>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>>>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>>>>>> ---
>>>>>>>>> v3:
>>>>>>>>> - Removed WARN_ON check.
>>>>>>>>>
>>>>>>>>> v2:
>>>>>>>>> - Added WARN_ON check for vnmi pending.
>>>>>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>>>>>
>>>>>>>>>       arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>>>>>       1 file changed, 7 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>>>>>       static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>>>>>       {
>>>>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>>>> +    struct vmcb *vmcb = NULL;
>>>>>>>>>       +    if (is_vnmi_enabled(svm)) {
>>>>>>>>
>>>>>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>>>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>>>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>>>>>
>>>>>>>
>>>>>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>>>>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>>>>>> be one of following case -
>>>>>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>>>>>
>>>>>> As far as I can see in this case:
>>>>>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>>>>>
>>>>>
>>>>> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
>>>>> execution path will opt EVTINJ model for re-injection.
>>>>
>>>> I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
>>>> since this function returns a pointer, not a bool.
>>>>
>>>
>>> Yes, I meant is_vnmi_enabled() will return false if vnmi param is unset.
>>>
>>>> I can't see however, how this will happen:
>>>>> static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>>>>> {
>>>>>       if (!vnmi)
>>>>>           return NULL;
>>>>           ^ "vnmi" variable controls whether L0 uses vNMI,
>>>>          so this variable is true in our case
>>>>
>>>
>>> No.
>>>
>>> In L1 case (vnmi disabled) - vnmi param will be false.
>>
>> Perhaps there was a misunderstanding here - the case here
>> isn't the code under discussion running as L1, but as L0
>> where L1 not using vNMI - L1 here can be an old version of KVM,
>> or Hyper-V, or any other hypervisor.
>>
> 
> Ok.
> 
>> In this case L0 is re-injecting an EVENTINJ NMI into L2 on
>> the behalf of L1.
>> That's when "nmi_l1_to_l2" is true.
>   
> hmm,. trying to understand the event re-injection flow -
> First L1 (non-vnmi) injecting event to L2 guest, in-turn
> intercepted by L0, 

That's right, the L1's VMRUN of L2 gets intercepted by L0.

> L0 sees event injection through EVTINJ

It sees that L1 wants to inject an NMI into L2 via VMCB12 EVTINJ.

> so sets the 'nmi_l1_to_l2' var 

That's right, L0 needs to keep track of this fact.

> and then L0 calls svm_inject_nmi()

Not yet - at this point svm_inject_nmi() is NOT called
(rather, VMCB12 EVTINJ is directly copied into VMCB02 EVTINJ).

Now L0 does the actual VMRUN of L2.

Let's say that there is an intervening VMExit during delivery of
that NMI to L2, of type which is handled by L0 (perhaps a NPF on
L2 IDT or so).

In this case the NMI will be returned in VMCB02 EXITINTINFO and
needs to be re-injected into L2 on the next VMRUN,
again via EVTINJ.

That's when svm_inject_nmi() will get called to re-inject
that NMI.

> to re-inject event in L2 - is that correct (nmi_l1_to_l2) flow?

Hope the flow is clear now.

> 
> Thanks,.
> Santosh

Thanks,
Maciej
Shukla, Santosh Aug. 26, 2022, 4:26 p.m. UTC | #10
On 8/26/2022 5:50 PM, Maciej S. Szmigiero wrote:
> On 26.08.2022 11:35, Shukla, Santosh wrote:
>> On 8/25/2022 7:46 PM, Maciej S. Szmigiero wrote:
>>> On 25.08.2022 16:05, Shukla, Santosh wrote:
>>>> On 8/25/2022 6:15 PM, Maciej S. Szmigiero wrote:
>>>>> On 25.08.2022 12:56, Shukla, Santosh wrote:
>>>>>> On 8/24/2022 6:26 PM, Maciej S. Szmigiero wrote:
>>>>>>> On 24.08.2022 14:13, Shukla, Santosh wrote:
>>>>>>>> Hi Maciej,
>>>>>>>>
>>>>>>>> On 8/11/2022 2:54 AM, Maciej S. Szmigiero wrote:
>>>>>>>>> On 10.08.2022 08:12, Santosh Shukla wrote:
>>>>>>>>>> Inject the NMI by setting V_NMI in the VMCB interrupt control. processor
>>>>>>>>>> will clear V_NMI to acknowledge processing has started and will keep the
>>>>>>>>>> V_NMI_MASK set until the processor is done with processing the NMI event.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
>>>>>>>>>> ---
>>>>>>>>>> v3:
>>>>>>>>>> - Removed WARN_ON check.
>>>>>>>>>>
>>>>>>>>>> v2:
>>>>>>>>>> - Added WARN_ON check for vnmi pending.
>>>>>>>>>> - use `get_vnmi_vmcb` to get correct vmcb so to inject vnmi.
>>>>>>>>>>
>>>>>>>>>>       arch/x86/kvm/svm/svm.c | 7 +++++++
>>>>>>>>>>       1 file changed, 7 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>>>>>>>> index e260e8cb0c81..8c4098b8a63e 100644
>>>>>>>>>> --- a/arch/x86/kvm/svm/svm.c
>>>>>>>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>>>>>>>> @@ -3479,7 +3479,14 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
>>>>>>>>>>       static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>>>>>>>>>       {
>>>>>>>>>>           struct vcpu_svm *svm = to_svm(vcpu);
>>>>>>>>>> +    struct vmcb *vmcb = NULL;
>>>>>>>>>>       +    if (is_vnmi_enabled(svm)) {
>>>>>>>>>
>>>>>>>>> I guess this should be "is_vnmi_enabled(svm) && !svm->nmi_l1_to_l2"
>>>>>>>>> since if nmi_l1_to_l2 is true then the NMI to be injected originally
>>>>>>>>> comes from L1's VMCB12 EVENTINJ field.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Not sure if I understood the case fully.. so trying to sketch scenario here -
>>>>>>>> if nmi_l1_to_l2 is true then event is coming from EVTINJ. .which could
>>>>>>>> be one of following case -
>>>>>>>> 1) L0 (vnmi enabled) and L1 (vnmi disabled)
>>>>>>>
>>>>>>> As far as I can see in this case:
>>>>>>> is_vnmi_enabled() returns whether VMCB02's int_ctl has V_NMI_ENABLE bit set.
>>>>>>>
>>>>>>
>>>>>> For L1 with vnmi disabled case - is_vnmi_enabled()->get_vnmi_vmcb() will return false so the
>>>>>> execution path will opt EVTINJ model for re-injection.
>>>>>
>>>>> I guess by "get_vnmi_vmcb() will return false" you mean it will return NULL,
>>>>> since this function returns a pointer, not a bool.
>>>>>
>>>>
>>>> Yes, I meant is_vnmi_enabled() will return false if vnmi param is unset.
>>>>
>>>>> I can't see however, how this will happen:
>>>>>> static inline struct vmcb *get_vnmi_vmcb(struct vcpu_svm *svm)
>>>>>> {
>>>>>>       if (!vnmi)
>>>>>>           return NULL;
>>>>>           ^ "vnmi" variable controls whether L0 uses vNMI,
>>>>>          so this variable is true in our case
>>>>>
>>>>
>>>> No.
>>>>
>>>> In L1 case (vnmi disabled) - vnmi param will be false.
>>>
>>> Perhaps there was a misunderstanding here - the case here
>>> isn't the code under discussion running as L1, but as L0
>>> where L1 not using vNMI - L1 here can be an old version of KVM,
>>> or Hyper-V, or any other hypervisor.
>>>
>>
>> Ok.
>>
>>> In this case L0 is re-injecting an EVENTINJ NMI into L2 on
>>> the behalf of L1.
>>> That's when "nmi_l1_to_l2" is true.
>>   hmm,. trying to understand the event re-injection flow -
>> First L1 (non-vnmi) injecting event to L2 guest, in-turn
>> intercepted by L0, 
> 
> That's right, the L1's VMRUN of L2 gets intercepted by L0.
> 
>> L0 sees event injection through EVTINJ
> 
> It sees that L1 wants to inject an NMI into L2 via VMCB12 EVTINJ.
> 
>> so sets the 'nmi_l1_to_l2' var 
> 
> That's right, L0 needs to keep track of this fact.
> 
>> and then L0 calls svm_inject_nmi()
> 
> Not yet - at this point svm_inject_nmi() is NOT called
> (rather than, VMCB12 EVTINJ is directly copied into VMCB02 EVTINJ).
> 
> Now L0 does the actual VMRUN of L2.
> 
> Let's say that there is an intervening VMExit during delivery of
> that NMI to L2, of type which is handled by L0 (perhaps a NPF on
> L2 IDT or so).
> 
> In this case the NMI will be returned in VMCB02 EXITINTINFO and
> needs to be re-injected into L2 on the next VMRUN,
> again via EVTINJ.
> 
> That's when svm_inject_nmi() will get called to re-inject
> that NMI.
> 
>> to re-inject event in L2 - is that correct (nmi_l1_to_l2) flow?
> Hope the flow is clear now.
> 

Yes, thank you for the clarification :).

Santosh.

>>
>> Thanks,.
>> Santosh
> 
> Thanks,
> Maciej
Patch

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e260e8cb0c81..8c4098b8a63e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3479,7 +3479,14 @@  static void pre_svm_run(struct kvm_vcpu *vcpu)
 static void svm_inject_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb *vmcb = NULL;
 
+	if (is_vnmi_enabled(svm)) {
+		vmcb = get_vnmi_vmcb(svm);
+		vmcb->control.int_ctl |= V_NMI_PENDING;
+		++vcpu->stat.nmi_injections;
+		return;
+	}
 	svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
 
 	if (svm->nmi_l1_to_l2)