[kvm-unit-tests,0/8] x86: vmx: Test INIT processing in various CPU VMX states

Message ID 20190919125211.18152-1-liran.alon@oracle.com (mailing list archive)

Liran Alon Sept. 19, 2019, 12:52 p.m. UTC
Hi,

This patch series adds a VMX test to verify the functionality
introduced by KVM commit:
4b9852f4f389 ("KVM: x86: Fix INIT signal handling in various CPU states")

The test verifies the following functionality:
1) An INIT signal received while the CPU is in VMX operation
  is latched until the CPU exits VMX operation.
2) If an INIT signal is pending while the CPU is in
  VMX non-root mode, it results in a VMExit with (reason == 3).
3) Exiting VMX non-root mode on VMExit does not clear
  the pending INIT signal in the LAPIC.
4) When the CPU exits VMX operation, the pending INIT signal in
  the LAPIC is processed.

In order to write such a complex test, the vmx tests framework was
enhanced to support using VMX on non-BSP CPUs. This enhancement is
implemented in patches 1-7. The test itself is implemented in patch 8.
This enhancement to the vmx tests framework is a bit hackish, but
I believe it's OK because this functionality is rarely required by
other VMX tests.

Regards,
-Liran

Comments

Vitaly Kuznetsov Sept. 19, 2019, 2:08 p.m. UTC | #1
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Thanks!
Liran Alon Sept. 24, 2019, 3:34 p.m. UTC | #2
Gentle ping.

Paolo Bonzini Sept. 24, 2019, 3:42 p.m. UTC | #3
On 24/09/19 17:34, Liran Alon wrote:
> Gentle ping.

I'm going to send another pull request to Linus this week and then will
get back to this patch (and also Krish's performance counter series).

Paolo

Liran Alon Sept. 25, 2019, 11:57 p.m. UTC | #4
Paolo, I have noticed that all the patches of this series were merged to origin/master
except the last one, which adds the test itself.

Did you miss the last patch by mistake?

-Liran

Paolo Bonzini Sept. 26, 2019, 8:47 a.m. UTC | #5
On 26/09/19 01:57, Liran Alon wrote:
> Paolo, I have noticed that all the patches of this series were merged to origin/master
> except the last one, which adds the test itself.
> 
> Have you missed the last patch by mistake?

No, I was sidetracked by the HOST_EFER failure; I immediately checked if
(for whatever reason) it was caused by the refactoring part of this
series, and pushed those 7 patches because it wasn't.

I will push patch 8 shortly, since I have the HOST_EFER failure covered
and have now tested the new INIT testcases properly.

Paolo
Nadav Amit Sept. 30, 2019, 11:02 p.m. UTC | #6
Hi Liran,

I ran this test on bare-metal and it fails:

 Test suite: vmx_init_signal_test
 PASS: INIT signal blocked when CPU in VMX operation
 PASS: INIT signal during VMX non-root mode result in exit-reason VMX_INIT (3)
 FAIL: INIT signal processed after exit VMX operation
 SUMMARY: 8 tests, 1 unexpected failures

I don’t have time to debug this issue, but let me know if you want some
print-outs.

Nadav
Liran Alon Oct. 1, 2019, 12:48 a.m. UTC | #7
Thanks Nadav for running this on bare-metal. This is very useful!

It seems that when the CPU exited with exit-reason VMX_INIT (3), the pending LAPIC INIT event
was consumed instead of remaining latched until the CPU exits VMX operation.

In my commit that this unit test verifies, 4b9852f4f389 ("KVM: x86: Fix INIT signal handling in various CPU states"),
I assumed that such an exit reason doesn't consume the pending LAPIC INIT event.
My assumption was based on the phrasing of Intel SDM section 25.2 OTHER CAUSES OF VM EXITS regarding INIT signals:
"Such exits do not modify register state or clear pending events as they would outside of VMX operation."
I thought Intel's logic behind this was that if an INIT signal is sent to a CPU in VMX non-root mode, the CPU would exit
with exit-reason 3, which would allow the hypervisor to decide to exit VMX operation in order to consume the INIT signal.

Nadav, can you try adding a delay in init_signal_test_thread() between calling vmx_off() and setting init_signal_test_thread_continued to true?
It may be that real hardware takes a little while to release the INIT signal from the LAPIC after exiting VMX operation.

Thanks,
-Liran
Nadav Amit Oct. 1, 2019, 1:14 a.m. UTC | #8
> On Sep 30, 2019, at 5:48 PM, Liran Alon <liran.alon@oracle.com> wrote:
> Thanks Nadav for running this on bare-metal. This is very useful!
> 
> It seems that when CPU exited on exit-reason VMX_INIT (3), the LAPIC INIT pending event
> was consumed instead of still being latched until CPU exits VMX operation.
> 
> In my commit which this unit-test verifies 4b9852f4f389 ("KVM: x86: Fix INIT signal handling in various CPU states”),
> I have assumed that such exit-reason don’t consume the LAPIC INIT pending event.
> My assumption was based on the phrasing of Intel SDM section 25.2 OTHER CAUSES OF VM EXITS regarding INIT signals:
> "Such exits do not modify register state or clear pending events as they would outside of VMX operation."
> I thought Intel logic behind this is that if an INIT signal is sent to a CPU in VMX non-root mode, it would exit
> on exit-reason 3 which would allow hypervisor to decide to exit VMX operation in order to consume INIT signal.

I think this sentence can be read differently. It is also reasonable not to
force the host to take an INIT signal the moment it disables VMX.

> Nadav, can you attempt to just add a delay in init_signal_test_thread() between calling vmx_off() & setting init_signal_test_thread_continued to true?
> It may be that real hardware delays a bit when the INIT signal is released from LAPIC after exit VMX operation.

I added “delay(100000)” between them, but got the same result.
Liran Alon Oct. 1, 2019, 1:23 a.m. UTC | #9
> On 1 Oct 2019, at 4:14, Nadav Amit <nadav.amit@gmail.com> wrote:
> 
>> Thanks Nadav for running this on bare-metal. This is very useful!
>> 
>> It seems that when CPU exited on exit-reason VMX_INIT (3), the LAPIC INIT pending event
>> was consumed instead of still being latched until CPU exits VMX operation.
>> 
>> In my commit which this unit-test verifies 4b9852f4f389 ("KVM: x86: Fix INIT signal handling in various CPU states”),
>> I have assumed that such exit-reason don’t consume the LAPIC INIT pending event.
>> My assumption was based on the phrasing of Intel SDM section 25.2 OTHER CAUSES OF VM EXITS regarding INIT signals:
>> "Such exits do not modify register state or clear pending events as they would outside of VMX operation."
>> I thought Intel logic behind this is that if an INIT signal is sent to a CPU in VMX non-root mode, it would exit
>> on exit-reason 3 which would allow hypervisor to decide to exit VMX operation in order to consume INIT signal.
> 
> I think this sentence can be read differently. It also reasonable not to
> bound the host to get an INIT signal the moment it disabled vmx.

If the INIT signal isn't kept pending until exiting VMX operation, a target CPU that was sent an INIT signal
while it was running a guest basically loses the INIT signal forever and just receives an exit reason it cannot do much with.
That's why I thought Intel designed this mechanism the way I described above.

I also recall checking this behaviour against some online discussions:
1) https://software.intel.com/en-us/forums/virtualization-software-development/topic/355484
* "When the 16-bit guest issues an INIT IPI to itself using the APIC, I run into an infinite VMExit situation that my hypervisor cannot seem to recover from."
* "In response to the VMExit with a reason of 3 (which is expected), the hypervisor resets the 16-bit guest's registers, limits, access rights, etc. to simulate starting execution from a known initialization point.  However, it seems that as soon as the hypervisor resumes guest execution, the VMExit occurs again, repeatedly."
2) https://patchwork.kernel.org/patch/2244311/
"I actually find it very useful. On INIT vmexit hypervisor may call vmxoff and do proper reset."

Anyway, Sean, can you help verify inside Intel what the expected behaviour should be?

> 
>> Nadav, can you attempt to just add a delay in init_signal_test_thread() between calling vmx_off() & setting init_signal_test_thread_continued to true?
>> It may be that real hardware delays a bit when the INIT signal is released from LAPIC after exit VMX operation.
> 
> I added “delay(100000)” between them, but got the same result.
> 

Weird. Then I have no idea why this is happening…

Thanks,
-Liran
Nadav Amit Oct. 1, 2019, 1:29 a.m. UTC | #10
> On Sep 30, 2019, at 6:23 PM, Liran Alon <liran.alon@oracle.com> wrote:
> If INIT signal won’t be kept pending until exiting VMX operation, target CPU which was sent with INIT signal
> when it was running guest, basically lost INIT signal forever and just received an exit-reason it cannot do much with.
> That’s why I thought Intel designed this mechanism like I specified above.

Well, the host can always issue an INIT using an IPI.

> 
> I also remembered to verify this behaviour against some discussions made online:
> 1) https://software.intel.com/en-us/forums/virtualization-software-development/topic/355484
> * "When the 16-bit guest issues an INIT IPI to itself using the APIC, I run into an infinite VMExit situation that my hypervisor cannot seem to recover from.”
> * "In response to the VMExit with a reason of 3 (which is expected), the hypervisor resets the 16-bit guest's registers, limits, access rights, etc. to simulate starting execution from a known initialization point.  However, it seems that as soon as the hypervisor resumes guest execution, the VMExit occurs again, repeatedly.”
> 2) https://patchwork.kernel.org/patch/2244311/
> "I actually find it very useful. On INIT vmexit hypervisor may call vmxoff and do proper reset."
> 
> Anyway, Sean, can you assist verifying inside Intel what should be the expected behaviour?

It could still be (yet) another kvm-unit-tests bug that only shows up on
bare-metal. But if Sean can confirm what the expected behavior is, it would
save time.

I do not have an ITP, so debugging on bare-metal is not fun at all...
Sean Christopherson Oct. 1, 2019, 6:40 p.m. UTC | #11
On Mon, Sep 30, 2019 at 06:29:52PM -0700, Nadav Amit wrote:
> > On Sep 30, 2019, at 6:23 PM, Liran Alon <liran.alon@oracle.com> wrote:
> > 
> > If INIT signal won’t be kept pending until exiting VMX operation, target
> > CPU which was sent with INIT signal when it was running guest, basically
> > lost INIT signal forever and just received an exit-reason it cannot do much
> > with.  That’s why I thought Intel designed this mechanism like I specified
> > above.
> 
> Well, the host can always issue an INIT using an IPI.

And conversely, if the INIT persisted then the host would be forced to do
VMXOFF, otherwise it effectively couldn't do VM-Enter on that logical CPU.

> > I also remembered to verify this behaviour against some discussions made online:
> > 1) https://software.intel.com/en-us/forums/virtualization-software-development/topic/355484
> > * "When the 16-bit guest issues an INIT IPI to itself using the APIC, I run into an infinite VMExit situation that my hypervisor cannot seem to recover from.”
> > * "In response to the VMExit with a reason of 3 (which is expected), the hypervisor resets the 16-bit guest's registers, limits, access rights, etc. to simulate starting execution from a known initialization point.  However, it seems that as soon as the hypervisor resumes guest execution, the VMExit occurs again, repeatedly.”
> > 2) https://patchwork.kernel.org/patch/2244311/
> > "I actually find it very useful. On INIT vmexit hypervisor may call vmxoff and do proper reset."
> > 
> > Anyway, Sean, can you assist verifying inside Intel what should be the expected behaviour?
> 
> It might always be (yet) another kvm-unit-tests bug that is only apparent on
> bare-metal. But if Sean can confirm what the expected behavior is, it would
> save time.
> 
> I do not have an ITP, so debugging on bare-metal is not fun at all...

My understanding of the architecture is that the INIT should be consumed
on VM-Exit.  The only scenario where an event is not consumed/acknowledged
is when a vanilla interrupt occurs without VM_EXIT_ACK_INTR_ON_EXIT set,
in which case the VM-Exit is technically considered a "pending" interrupt.
For all other cases (NMI, SMI, INIT, and INTR w/ ACK-ON-EXIT), the VM-Exit
is the end result of delivering the event.
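
A minimal sketch of the rule stated above, written as a predicate. The enum and function names are hypothetical and exist only for this illustration; this is a reading of the paragraph, not SDM text or KVM code.

```c
#include <stdbool.h>

/* Hypothetical event kinds, for illustration only. */
enum vmx_event { EV_INTR, EV_NMI, EV_SMI, EV_INIT };

/*
 * Returns true if the VM-Exit is the end result of delivering the event,
 * i.e. the event is consumed and no longer pending after the exit.
 * Per the paragraph above, only a vanilla interrupt taken without
 * "acknowledge interrupt on exit" is left pending.
 */
static bool event_consumed_by_vmexit(enum vmx_event ev, bool ack_intr_on_exit)
{
	if (ev == EV_INTR)
		return ack_intr_on_exit;
	return true;
}
```

Under this reading, an INIT VM-Exit never leaves the INIT pending, regardless of the interrupt-acknowledge setting.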

INITs are indeed blocked and not dropped in VMX root mode.  But entering
non-root (guest) mode should unblock INITs and cause a VM-Exit, and thus
clear the INIT that was pended while in VMX root mode.  This behavior does
not conflict with the whitepaper[*] referenced by link (2) above, and in
fact the whitepaper explicitly covers guest mode behavior in a footnote:

  When the processor is in VMX guest mode, delivery of INIT causes a
  normal VMEXIT, of course.

The INIT attack described uses "VMX mode" to refer to VMX root mode, and
other than the footnote, doesn't mention VMX guest mode.  My reading of it
is that they're showing a proof of concept based on getting the OS into
VMX root mode but not actually running a guest, e.g. this can be done
in KVM by creating a VM (KVM_CREATE_VM) but not running it (KVM_RUN).

Anyways, I'll double check that the INIT should indeed be consumed as part
of the VM-Exit.

[*] https://invisiblethingslab.com/resources/2011/Software%20Attacks%20on%20Intel%20VT-d.pdf
Sean Christopherson Oct. 1, 2019, 11:21 p.m. UTC | #12
On Tue, Oct 01, 2019 at 11:40:34AM -0700, Sean Christopherson wrote:
> On Mon, Sep 30, 2019 at 06:29:52PM -0700, Nadav Amit wrote:
> > > On Sep 30, 2019, at 6:23 PM, Liran Alon <liran.alon@oracle.com> wrote:

...

> > > I also remembered to verify this behaviour against some discussions made online:
> > > 1) https://software.intel.com/en-us/forums/virtualization-software-development/topic/355484
> > > * "When the 16-bit guest issues an INIT IPI to itself using the APIC, I run into an infinite VMExit situation that my hypervisor cannot seem to recover from.”
> > > * "In response to the VMExit with a reason of 3 (which is expected), the hypervisor resets the 16-bit guest's registers, limits, access rights, etc. to simulate starting execution from a known initialization point.  However, it seems that as soon as the hypervisor resumes guest execution, the VMExit occurs again, repeatedly.”
> > > 2) https://patchwork.kernel.org/patch/2244311/
> > > "I actually find it very useful. On INIT vmexit hypervisor may call vmxoff and do proper reset."
> > > 
> > > Anyway, Sean, can you assist verifying inside Intel what should be the expected behaviour?
> > 
> > It might always be (yet) another kvm-unit-tests bug that is only apparent on
> > bare-metal. But if Sean can confirm what the expected behavior is, it would
> > save time.
> > 
> > I do not have an ITP, so debugging on bare-metal is not fun at all...
> 
> My understanding of the architecture is that the INIT should be consumed
> on VM-Exit.  The only scenario where an event is not consumed/acknowledge
> is when a vanilla interrupt occurs without VM_EXIT_ACK_INTR_ON_EXIT set,
> in which case the VM-Exit is technically considered a "pending" interrupt.
> For all other cases (NMI, SMI, INIT, and INTR w/ ACK-ON-EXIT), the VM-Exit
> is the end result of delivering the event.
> 
> INITs are indeed blocked and not dropped in VMX root mode.  But entering
> non-root (guest) mode should unblock INITs and cause a VM-Exit, and thus
> clear the INIT that was pended while in VMX root mode.  This behavior does
> not conflict with the whitepaper[*] referenced by link (2) above, and in
> fact the whitepaper explicitly covers guest mode behavior in a footnote:
> 
>   When the processor is in VMX guest mode, delivery of INIT causes a
>   normal VMEXIT, of course.
> 
> The INIT attack described uses "VMX mode" to refer to VMX root mode, and
> other than the footnote, doesn't mention VMX guest mode.  My reading of it
> is that they're showing a proof of concept of based on getting the OS into
> VMX root mode but not actually running a guest, e.g. this can be done
> in KVM by creating a VM (KVM_CREATE_VM) but not running it (KVM_RUN).
> 
> Anyways, I'll double check that the INIT should indeed be consumed as part
> of the VM-Exit.

I couldn't help but run a few tests before reaching out to the architecture
folks...

I modified KVM to have the CPU send an INIT IPI to itself in vmx_vcpu_run(),
with a bit of delay to ensure the INIT is pending prior to VM-Enter.  On an
INIT VM-Exit, KVM immediately resumes the guest.  On a Haswell client system,
the INIT does indeed appear to be consumed when it's handled by VM-Exit,
i.e. KVM doesn't get stuck in an infinite loop of INIT VM-Exits.
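
The observed behaviour can be summarized with a toy model of one logical CPU. This is only a sketch: the struct, the function names, and the NO_EXIT value are hypothetical, and the model covers just the two steps exercised by the experiment (latching in root mode, consumption on VM-Exit).

```c
#include <stdbool.h>

#define EXIT_REASON_INIT 3	/* VM-Exit reason for an INIT signal */
#define NO_EXIT (-1)		/* hypothetical: guest keeps running */

/* Toy model of one logical CPU; all names are hypothetical. */
struct cpu_model {
	bool vmx_root;		/* in VMX root operation (after VMXON) */
	bool init_pending;	/* INIT latched while blocked in root mode */
};

/* An INIT arriving in VMX root mode is blocked and latched, not dropped. */
static void send_init_in_root(struct cpu_model *cpu)
{
	if (cpu->vmx_root)
		cpu->init_pending = true;
}

/*
 * VM-Enter unblocks INIT.  A latched INIT causes an immediate VM-Exit with
 * reason 3, and that exit consumes the INIT.
 */
static int vm_enter(struct cpu_model *cpu)
{
	if (cpu->init_pending) {
		cpu->init_pending = false;
		return EXIT_REASON_INIT;
	}
	return NO_EXIT;
}
```

In this model a second vm_enter() after the INIT exit returns NO_EXIT, which corresponds to the absence of an infinite exit loop.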

One possible explanation for the infinite loop observed in (1) above is
that the developer didn't properly reconfigure guest state when emulating
INIT and hit a VM-Fail.  Because vmcs.EXIT_REASON isn't written on VM-Fail,
if the VMM isn't checking for VM-Fail it will think it's getting endless
INIT VM-Exits.  I did exactly this when tweaking KVM to handle INIT (forgot
to mark the VMCS as launched before redoing VM-Enter), so I even inadvertently
confirmed that it's plausible :-)
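
A minimal sketch of the missing check, assuming the VMM captures RFLAGS at the instruction following VMLAUNCH/VMRESUME. The enum and function names are hypothetical; the flag conventions (CF for VMfailInvalid, ZF for VMfailValid) are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_CF (1ull << 0)	/* set on VMfailInvalid */
#define X86_EFLAGS_ZF (1ull << 6)	/* set on VMfailValid */

/* Hypothetical outcome of one VMLAUNCH/VMRESUME attempt. */
enum entry_result { ENTRY_VMEXIT, ENTRY_VMFAIL };

/*
 * Classify the attempt from the RFLAGS value captured right after the
 * instruction.  On VM-Fail the exit-reason field is not written, so a
 * stale value (e.g. 3 from an earlier INIT exit) must not be trusted.
 */
static enum entry_result classify_entry(uint64_t rflags)
{
	if (rflags & (X86_EFLAGS_CF | X86_EFLAGS_ZF))
		return ENTRY_VMFAIL;
	return ENTRY_VMEXIT;
}

/* Only read the exit reason when the entry did not fail. */
static bool exit_reason_is_fresh(uint64_t rflags)
{
	return classify_entry(rflags) == ENTRY_VMEXIT;
}
```

A VMM that skips this check and re-reads the exit reason after a failed entry would keep seeing the previous reason, which matches the endless-exit symptom described above.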
Sean Christopherson Oct. 1, 2019, 11:34 p.m. UTC | #13
On Tue, Oct 01, 2019 at 11:40:34AM -0700, Sean Christopherson wrote:
> Anyways, I'll double check that the INIT should indeed be consumed as part
> of the VM-Exit.

Confirmed that the INIT is cleared prior to delivering VM-Exit.
Nadav Amit Oct. 1, 2019, 11:37 p.m. UTC | #14
> On Oct 1, 2019, at 4:34 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> On Tue, Oct 01, 2019 at 11:40:34AM -0700, Sean Christopherson wrote:
>> Anyways, I'll double check that the INIT should indeed be consumed as part
>> of the VM-Exit.
> 
> Confirmed that the INIT is cleared prior to delivering VM-Exit.

Thanks for checking. I guess Liran will take it from here - I just wanted to
ensure kvm-unit-tests on bare-metal is not broken.
Liran Alon Oct. 2, 2019, 12:10 a.m. UTC | #15
> On 2 Oct 2019, at 2:37, Nadav Amit <nadav.amit@gmail.com> wrote:
> 
>> On Oct 1, 2019, at 4:34 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
>> 
>> On Tue, Oct 01, 2019 at 11:40:34AM -0700, Sean Christopherson wrote:
>>> Anyways, I'll double check that the INIT should indeed be consumed as part
>>> of the VM-Exit.
>> 
>> Confirmed that the INIT is cleared prior to delivering VM-Exit.
> 
> Thanks for checking. I guess Liran will take it from here - I just wanted to
> ensure kvm-unit-tests on bare-metal is not broken.
> 

Yes, thanks everyone. I will submit a patch for both KVM and kvm-unit-tests for this.

-Liran
Liran Alon Oct. 2, 2019, 12:19 a.m. UTC | #16
> On 1 Oct 2019, at 21:40, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> On Mon, Sep 30, 2019 at 06:29:52PM -0700, Nadav Amit wrote:
>>> On Sep 30, 2019, at 6:23 PM, Liran Alon <liran.alon@oracle.com> wrote:
>>> 
>>> If INIT signal won’t be kept pending until exiting VMX operation, target
>>> CPU which was sent with INIT signal when it was running guest, basically
>>> lost INIT signal forever and just received an exit-reason it cannot do much
>>> with.  That’s why I thought Intel designed this mechanism like I specified
>>> above.
>> 
>> Well, the host can always issue an INIT using an IPI.
> 
> And conversely, if the INIT persisted then the host would be forced to do
> VMXOFF, otherwise it effectively couldn't do VM-Enter on that logical CPU.

The way I understood it previously, the hypervisor has 2 different options for responding to an INIT-signal exit reason:
1) Interpret the INIT signal as meant to be delivered to the host (e.g. the KVM use-case), i.e. allow the other CPU which sent the INIT signal to reset this one, by just exiting VMX operation with VMXOFF.
2) Interpret the INIT signal as meant to be delivered to the guest (e.g. a "passthrough" security hypervisor loaded during the boot chain). In this case, the hypervisor would reset the vCPU context and then enter the guest with the wait-for-SIPI activity state. That blocks the pending INIT signal from being delivered and causing an exit from non-root mode. The next physical SIPI delivered to the CPU will then be consumed properly.
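
The two options can be sketched as a small decision function. Everything here is hypothetical naming for illustration; the only architectural value used is the wait-for-SIPI encoding (3) of the VMCS guest activity state. The sketch describes the choice itself and does not depend on whether the INIT stays pending after the exit.

```c
#include <stdbool.h>

#define GUEST_ACTIVITY_ACTIVE    0	/* VMCS guest activity state: active */
#define GUEST_ACTIVITY_WAIT_SIPI 3	/* VMCS guest activity state: wait-for-SIPI */

/* Hypothetical policy: who the INIT is meant for. */
enum init_target { INIT_FOR_HOST, INIT_FOR_GUEST };

/* Hypothetical description of what the hypervisor does next. */
struct init_exit_action {
	bool do_vmxoff;			/* option 1: leave VMX operation */
	bool reset_vcpu;		/* option 2: reset vCPU context */
	unsigned int activity_state;	/* used only when re-entering the guest */
};

static struct init_exit_action handle_init_exit(enum init_target target)
{
	struct init_exit_action act = { false, false, GUEST_ACTIVITY_ACTIVE };

	if (target == INIT_FOR_HOST) {
		act.do_vmxoff = true;
	} else {
		act.reset_vcpu = true;
		act.activity_state = GUEST_ACTIVITY_WAIT_SIPI;
	}
	return act;
}
```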

Anyway, I just wanted to lay out my previous thoughts on the matter.
I think the Intel SDM's phrasing in this regard is very confusing…

-Liran