mbox series

[v3,0/5] Add support for the Idle HLT intercept feature

Message ID 20240528041926.3989-1-manali.shukla@amd.com (mailing list archive)
Headers show
Series Add support for the Idle HLT intercept feature | expand

Message

Manali Shukla May 28, 2024, 4:19 a.m. UTC
The upcoming new Idle HLT Intercept feature allows for the HLT
instruction execution by a vCPU to be intercepted by the hypervisor
only if there are no pending V_INTR and V_NMI events for the vCPU.
When the vCPU is expected to service the pending V_INTR and V_NMI
events, the Idle HLT intercept won’t trigger. The feature allows the
hypervisor to determine if the vCPU is actually idle and reduces
wasteful VMEXITs.

Presence of the Idle HLT Intercept feature is indicated via CPUID
function Fn8000_000A_EDX[30].

Document for the Idle HLT intercept feature is available at [1].

[1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
     Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
     https://bugzilla.kernel.org/attachment.cgi?id=306250

Testing Done:
- Added a selftest to test the Idle HLT intercept functionality.
- Compile and functionality testing for the Idle HLT intercept selftest
  are only done for x86_64.
- Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.

v2 -> v3
- Incorporated Andrew's suggestion to structure vcpu_stat_types in
  a way that each architecture can share the generic types and also
  provide its own.

v1 -> v2
- Done changes in svm_idle_hlt_test based on the review comments from Sean.
- Added an enum based approach to get binary stats in vcpu_get_stat() which
  doesn't use string to get stat data based on the comments from Sean.
- Added self_halt() and cli() helpers based on the comments from Sean.

Manali Shukla (5):
  x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
  KVM: SVM: Add Idle HLT intercept support
  KVM: selftests: Add safe_halt() and cli() helpers to common code
  KVM: selftests: Add an interface to read the data of named vcpu stat
  KVM: selftests: KVM: SVM: Add Idle HLT intercept test

 arch/x86/include/asm/cpufeatures.h            |  1 +
 arch/x86/include/asm/svm.h                    |  1 +
 arch/x86/include/uapi/asm/svm.h               |  2 +
 arch/x86/kvm/svm/svm.c                        | 11 ++-
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../testing/selftests/kvm/include/kvm_util.h  | 44 +++++++++
 .../kvm/include/x86_64/kvm_util_arch.h        | 40 +++++++++
 .../selftests/kvm/include/x86_64/processor.h  | 18 ++++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 32 +++++++
 .../selftests/kvm/x86_64/svm_idle_hlt_test.c  | 89 +++++++++++++++++++
 10 files changed, 236 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c


base-commit: d91a9cc16417b8247213a0144a1f0fd61dc855dd

Comments

Paolo Bonzini May 28, 2024, 10:22 a.m. UTC | #1
On Tue, May 28, 2024 at 6:19 AM Manali Shukla <manali.shukla@amd.com> wrote:
>
> The upcoming new Idle HLT Intercept feature allows for the HLT
> instruction execution by a vCPU to be intercepted by the hypervisor
> only if there are no pending V_INTR and V_NMI events for the vCPU.
> When the vCPU is expected to service the pending V_INTR and V_NMI
> events, the Idle HLT intercept won’t trigger. The feature allows the
> hypervisor to determine if the vCPU is actually idle and reduces
> wasteful VMEXITs.

Does this have an effect on the number of vmexits for KVM, unless AVIC
is enabled? Can you write a testcase for kvm-unit-tests' vmexit.flat
that shows an improvement?

The reason I am wondering is because KVM does not really use V_INTR
injection. The "idle HLT" intercept basically differs from the basic
HLT trigger only in how it handles an STI;HLT sequence, as in that
case the interrupt can be injected directly and the HLT vmexit is
suppressed. But in that circumstance KVM would anyway use a V_INTR
intercept to detect the opening of the interrupt injection window (and
then the interrupt uses event injection rather than V_INTR). Again,
this is only true if AVIC is disabled, but that is the default.

So unless I'm wrong in my analysis above, I'm not sure this series,
albeit small, is really worth it. As things stand, it would be more
interesting to enable this for nested VMs, especially Hyper-V which
does use V_INTR and V_TPL; even better, _emulating_ it on older
processors would reduce the L2->L0->L1->L0->L2 path to a
less-expensive L2->L0->L2 vmexit.


Paolo

> Presence of the Idle HLT Intercept feature is indicated via CPUID
> function Fn8000_000A_EDX[30].
>
> Document for the Idle HLT intercept feature is available at [1].
>
> [1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
>      Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
>      https://bugzilla.kernel.org/attachment.cgi?id=306250
>
> Testing Done:
> - Added a selftest to test the Idle HLT intercept functionality.
> - Compile and functionality testing for the Idle HLT intercept selftest
>   are only done for x86_64.
> - Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
>
> v2 -> v3
> - Incorporated Andrew's suggestion to structure vcpu_stat_types in
>   a way that each architecture can share the generic types and also
>   provide its own.
>
> v1 -> v2
> - Done changes in svm_idle_hlt_test based on the review comments from Sean.
> - Added an enum based approach to get binary stats in vcpu_get_stat() which
>   doesn't use string to get stat data based on the comments from Sean.
> - Added self_halt() and cli() helpers based on the comments from Sean.
>
> Manali Shukla (5):
>   x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
>   KVM: SVM: Add Idle HLT intercept support
>   KVM: selftests: Add safe_halt() and cli() helpers to common code
>   KVM: selftests: Add an interface to read the data of named vcpu stat
>   KVM: selftests: KVM: SVM: Add Idle HLT intercept test
>
>  arch/x86/include/asm/cpufeatures.h            |  1 +
>  arch/x86/include/asm/svm.h                    |  1 +
>  arch/x86/include/uapi/asm/svm.h               |  2 +
>  arch/x86/kvm/svm/svm.c                        | 11 ++-
>  tools/testing/selftests/kvm/Makefile          |  1 +
>  .../testing/selftests/kvm/include/kvm_util.h  | 44 +++++++++
>  .../kvm/include/x86_64/kvm_util_arch.h        | 40 +++++++++
>  .../selftests/kvm/include/x86_64/processor.h  | 18 ++++
>  tools/testing/selftests/kvm/lib/kvm_util.c    | 32 +++++++
>  .../selftests/kvm/x86_64/svm_idle_hlt_test.c  | 89 +++++++++++++++++++
>  10 files changed, 236 insertions(+), 3 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
>
>
> base-commit: d91a9cc16417b8247213a0144a1f0fd61dc855dd
> --
> 2.34.1
>
Sean Christopherson June 4, 2024, 12:47 a.m. UTC | #2
On Tue, May 28, 2024, Paolo Bonzini wrote:
> On Tue, May 28, 2024 at 6:19 AM Manali Shukla <manali.shukla@amd.com> wrote:
> >
> > The upcoming new Idle HLT Intercept feature allows for the HLT
> > instruction execution by a vCPU to be intercepted by the hypervisor
> > only if there are no pending V_INTR and V_NMI events for the vCPU.
> > When the vCPU is expected to service the pending V_INTR and V_NMI
> > events, the Idle HLT intercept won’t trigger. The feature allows the
> > hypervisor to determine if the vCPU is actually idle and reduces
> > wasteful VMEXITs.
> 
> Does this have an effect on the number of vmexits for KVM, unless AVIC
> is enabled? Can you write a testcase for kvm-unit-tests' vmexit.flat
> that shows an improvement?
> 
> The reason I am wondering is because KVM does not really use V_INTR
> injection. The "idle HLT" intercept basically differs from the basic
> HLT trigger only in how it handles an STI;HLT sequence, as in that
> case the interrupt can be injected directly and the HLT vmexit is
> suppressed. But in that circumstance KVM would anyway use a V_INTR
> intercept to detect the opening of the interrupt injection window (and
> then the interrupt uses event injection rather than V_INTR). Again,
> this is only true if AVIC is disabled, but that is the default.
> 
> So unless I'm wrong in my analysis above, I'm not sure this series,
> albeit small, is really worth it.

But aren't we hoping to enable x2AVIC by default sooner than later?

> As things stand, it would be more interesting to enable this for nested VMs,
> especially Hyper-V which does use V_INTR and V_TPL; even better, _emulating_
> it on older processors would reduce the L2->L0->L1->L0->L2 path to a
> less-expensive L2->L0->L2 vmexit.
Manali Shukla June 4, 2024, 12:23 p.m. UTC | #3
Hi Paolo,

Thank you for reviewing my patches.

On 5/28/2024 3:52 PM, Paolo Bonzini wrote:
> On Tue, May 28, 2024 at 6:19 AM Manali Shukla <manali.shukla@amd.com> wrote:
>>
>> The upcoming new Idle HLT Intercept feature allows for the HLT
>> instruction execution by a vCPU to be intercepted by the hypervisor
>> only if there are no pending V_INTR and V_NMI events for the vCPU.
>> When the vCPU is expected to service the pending V_INTR and V_NMI
>> events, the Idle HLT intercept won’t trigger. The feature allows the
>> hypervisor to determine if the vCPU is actually idle and reduces
>> wasteful VMEXITs.
> 
> Does this have an effect on the number of vmexits for KVM, unless AVIC
> is enabled? Can you write a testcase for kvm-unit-tests' vmexit.flat
> that shows an improvement?

I have measured the total numbers of vmexits (using perf kvm stat report), while running the the test case I have
written for the idle halt intercept in vmexit.flat.


Without idle halt
-------------------------------------------------------------------------------------------------
|    Event name      |    Samples        |     Sample%      |      Time (ns)     |     Time%    |
-------------------------------------------------------------------------------------------------
|        msr         |      524213       |       49.00%     |      592573933     |    64.00 %   |
-------------------------------------------------------------------------------------------------
|        hlt         |      262080       |       24.00%     |      154429476     |    16.00%    |
-------------------------------------------------------------------------------------------------
|        vintr       |      262080       |       24.00%     |      163724208     |     16.00%   |
-------------------------------------------------------------------------------------------------

With idle halt

-----------------------------------------------------------------------------------------------
|    Event name      |     Samples       |     Sample%    |      Time (ns)     |     Time%    |
-----------------------------------------------------------------------------------------------  
|       msr          |      524213       |     66.00%     |      502011088     |    75.00 %   |
-----------------------------------------------------------------------------------------------
|      vintr         |      262080       |     33.00%     |      147515295     |    22.00%    |
-----------------------------------------------------------------------------------------------
|       io           |        1879       |      0.00%     |       8784729      |     1.00%    |
-----------------------------------------------------------------------------------------------


Below is data for the average of 10 runs of idle halt test case in vmexit.flat
----------------------------------------------------------------------------------
|  idle halt (on/off)  |   full test run    |   ipi test run   |  eoi test run   |
----------------------------------------------------------------------------------
|          on          |       5048 .4      |     1289.2       |    1140.6       |
----------------------------------------------------------------------------------
|          off         |       4806.1       |     1318.6       |    1165.8       |
----------------------------------------------------------------------------------

The "ipi test run" when the idle halt is enabled, takes less time (~2.28% )to finish as compared to
the  "ipi test run" when the idle halt is disabled. 

The "eoi test run" when the idle halt is enabled, takes less time (~2.20% )to finish as compared to
the  "eoi test run" when the idle halt is disabled. 

The "full test run" when the idle halt is enabled, takes more time (~5.4%) as compared to
the "full test run" when the idle halt is disabled.
(Seems a bit odd, I have not yet begun to investigate this behavior)

Snippet of the Test case:
+static void idle_hlt_test(void)
+{
+       x = 0;
+       cli();
+       apic_self_ipi(IPI_TEST_VECTOR);
+       safe_halt();
+       if (x != 1) printf("%d", x);
+}
+
 
> The reason I am wondering is because KVM does not really use V_INTR
> injection. The "idle HLT" intercept basically differs from the basic
> HLT trigger only in how it handles an STI;HLT sequence, as in that
> case the interrupt can be injected directly and the HLT vmexit is
> suppressed. But in that circumstance KVM would anyway use a V_INTR
> intercept to detect the opening of the interrupt injection window (and
> then the interrupt uses event injection rather than V_INTR). Again,
> this is only true if AVIC is disabled, but that is the default.
> 

I have taken traces to analyze it further.

With idle halt enabled
 220.696238: kvm_apic_ipi: dst b0 vec 176 (Fixed|physical|assert|edge|self)
 220.696238: kvm_apic_accept_irq: apicid 0 vec 176 (Fixed|edge)
 220.696238: kvm_apic: apic_write APIC_ICR = 0xb0000440b0
 220.696238: kvm_msr: msr_write 830 = 0xb0000440b0
 220.696238: kvm_entry: vcpu 0, rip 0x406a89
 220.696239: kvm_exit: vcpu 0 reason vintr rip 0x4004ae info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
 220.696239: kvm_inj_virq: IRQ 0xb0
 220.696240: kvm_entry: vcpu 0, rip 0x4004ae
 220.696240: kvm_exit: vcpu 0 reason msr rip 0x406a74 info1 0x0000000000000001 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
 220.696240: kvm_apic: apic_write APIC_EOI = 0x0
 220.696240: kvm_eoi: apicid 0 vector 176


Without idle halt enabled
 6204.951631: kvm_apic_ipi: dst b0 vec 176 (Fixed|physical|assert|edge|self)
 6204.951631: kvm_apic_accept_irq: apicid 0 vec 176 (Fixed|edge)
 6204.951631: kvm_apic: apic_write APIC_ICR = 0xb0000440b0
 6204.951631: kvm_msr: msr_write 830 = 0xb0000440b0
 6204.951631: kvm_entry: vcpu 0, rip 0x406a89
 6204.951632: kvm_exit: vcpu 0 reason hlt rip 0x4004ad info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
 6204.951632: kvm_entry: vcpu 0, rip 0x4004ae
 6204.951632: kvm_exit: vcpu 0 reason vintr rip 0x4004ae info1 0x0000000000000000 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
 6204.951632: kvm_inj_virq: IRQ 0xb0
 6204.951632: kvm_entry: vcpu 0, rip 0x4004ae
 6204.951633: kvm_exit: vcpu 0 reason msr rip 0x406a74 info1 0x0000000000000001 info2 0x0000000000000000 intr_info 0x00000000 error_code 0x00000000
 6204.951633: kvm_apic: apic_write APIC_EOI = 0x0
 6204.951633: kvm_eoi: apicid 0 vector 176

From the traces, it is seen that hlt exit is avoided while the idle halt intercept feature is enabled.

> So unless I'm wrong in my analysis above, I'm not sure this series,
> albeit small, is really worth it. As things stand, it would be more
> interesting to enable this for nested VMs, especially Hyper-V which
> does use V_INTR and V_TPL; even better, _emulating_ it on older
> processors would reduce the L2->L0->L1->L0->L2 path to a
> less-expensive L2->L0->L2 vmexit.
> 

I am yet to try to test the idle halt intercept feature on nested guest.

- Manali
> 
> Paolo
> 
>> Presence of the Idle HLT Intercept feature is indicated via CPUID
>> function Fn8000_000A_EDX[30].
>>
>> Document for the Idle HLT intercept feature is available at [1].
>>
>> [1]: AMD64 Architecture Programmer's Manual Pub. 24593, April 2024,
>>      Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
>>      https://bugzilla.kernel.org/attachment.cgi?id=306250
>>
>> Testing Done:
>> - Added a selftest to test the Idle HLT intercept functionality.
>> - Compile and functionality testing for the Idle HLT intercept selftest
>>   are only done for x86_64.
>> - Tested SEV and SEV-ES guest for the Idle HLT intercept functionality.
>>
>> v2 -> v3
>> - Incorporated Andrew's suggestion to structure vcpu_stat_types in
>>   a way that each architecture can share the generic types and also
>>   provide its own.
>>
>> v1 -> v2
>> - Done changes in svm_idle_hlt_test based on the review comments from Sean.
>> - Added an enum based approach to get binary stats in vcpu_get_stat() which
>>   doesn't use string to get stat data based on the comments from Sean.
>> - Added self_halt() and cli() helpers based on the comments from Sean.
>>
>> Manali Shukla (5):
>>   x86/cpufeatures: Add CPUID feature bit for Idle HLT intercept
>>   KVM: SVM: Add Idle HLT intercept support
>>   KVM: selftests: Add safe_halt() and cli() helpers to common code
>>   KVM: selftests: Add an interface to read the data of named vcpu stat
>>   KVM: selftests: KVM: SVM: Add Idle HLT intercept test
>>
>>  arch/x86/include/asm/cpufeatures.h            |  1 +
>>  arch/x86/include/asm/svm.h                    |  1 +
>>  arch/x86/include/uapi/asm/svm.h               |  2 +
>>  arch/x86/kvm/svm/svm.c                        | 11 ++-
>>  tools/testing/selftests/kvm/Makefile          |  1 +
>>  .../testing/selftests/kvm/include/kvm_util.h  | 44 +++++++++
>>  .../kvm/include/x86_64/kvm_util_arch.h        | 40 +++++++++
>>  .../selftests/kvm/include/x86_64/processor.h  | 18 ++++
>>  tools/testing/selftests/kvm/lib/kvm_util.c    | 32 +++++++
>>  .../selftests/kvm/x86_64/svm_idle_hlt_test.c  | 89 +++++++++++++++++++
>>  10 files changed, 236 insertions(+), 3 deletions(-)
>>  create mode 100644 tools/testing/selftests/kvm/x86_64/svm_idle_hlt_test.c
>>
>>
>> base-commit: d91a9cc16417b8247213a0144a1f0fd61dc855dd
>> --
>> 2.34.1
>>
>
Manali Shukla June 4, 2024, 1:21 p.m. UTC | #4
On 6/4/2024 6:17 AM, Sean Christopherson wrote:
> On Tue, May 28, 2024, Paolo Bonzini wrote:
>> On Tue, May 28, 2024 at 6:19 AM Manali Shukla <manali.shukla@amd.com> wrote:
>>>
>>> The upcoming new Idle HLT Intercept feature allows for the HLT
>>> instruction execution by a vCPU to be intercepted by the hypervisor
>>> only if there are no pending V_INTR and V_NMI events for the vCPU.
>>> When the vCPU is expected to service the pending V_INTR and V_NMI
>>> events, the Idle HLT intercept won’t trigger. The feature allows the
>>> hypervisor to determine if the vCPU is actually idle and reduces
>>> wasteful VMEXITs.
>>
>> Does this have an effect on the number of vmexits for KVM, unless AVIC
>> is enabled? Can you write a testcase for kvm-unit-tests' vmexit.flat
>> that shows an improvement?
>>
>> The reason I am wondering is because KVM does not really use V_INTR
>> injection. The "idle HLT" intercept basically differs from the basic
>> HLT trigger only in how it handles an STI;HLT sequence, as in that
>> case the interrupt can be injected directly and the HLT vmexit is
>> suppressed. But in that circumstance KVM would anyway use a V_INTR
>> intercept to detect the opening of the interrupt injection window (and
>> then the interrupt uses event injection rather than V_INTR). Again,
>> this is only true if AVIC is disabled, but that is the default.
>>
>> So unless I'm wrong in my analysis above, I'm not sure this series,
>> albeit small, is really worth it.
> 
> But aren't we hoping to enable x2AVIC by default sooner than later?

The idle halt intercept feature not only suppresses HLT exit when a V_INTR
event is pending during the execution of halt instruction, but it also
suppresses HLT exit when a V_NMI event is pending during the execution of
halt instruction. This capability will be advantageous in IBS virtualization
and PMC virtualization functionalities, as both rely on VNMI for delivering
virtualized interrupts from IBS and PMC hardware.

> 
>> As things stand, it would be more interesting to enable this for nested VMs,
>> especially Hyper-V which does use V_INTR and V_TPL; even better, _emulating_
>> it on older processors would reduce the L2->L0->L1->L0->L2 path to a
>> less-expensive L2->L0->L2 vmexit.