
[v5,00/18] kvm: x86: Support AMD SVM AVIC w/ in-kernel irqchip mode

Message ID 1573762520-80328-1-git-send-email-suravee.suthikulpanit@amd.com (mailing list archive)

Message

Suravee Suthikulpanit Nov. 14, 2019, 8:15 p.m. UTC
Commit 67034bb9dd5e ("KVM: SVM: Add irqchip_split() checks before
enabling AVIC") was introduced to fix miscellaneous boot-hang issues
when AVIC is enabled. The hangs occur mainly because AVIC hardware does
not #vmexit on a write to the LAPIC EOI register, so the in-kernel PIC
and IOAPIC keep waiting and do not inject new interrupts (e.g. PIT, RTC).

This limits AVIC to kernel_irqchip=split mode, which is not currently
enabled by default and also requires user space to support the split
irqchip model, which might not be the case.

The goal of this series is to enable AVIC to work in both irqchip modes
by allowing AVIC to be deactivated temporarily at runtime, falling back
to legacy interrupt injection mode (w/ vINTR and interrupt windows)
when needed, and re-enabling it subsequently (a.k.a. Dynamic APICv).
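
To illustrate the idea, here is a minimal user-space sketch of the
bookkeeping, not the actual kernel code. It assumes APICv is considered
active only while no inhibit reason bit is set. The struct, the enum
values, and the function names are hypothetical and chosen for
illustration only.

```c
#include <stdbool.h>

/* Hypothetical inhibit reasons; the names are illustrative only. */
enum inhibit_reason {
	REASON_HYPERV,
	REASON_NESTED,
	REASON_IRQWIN,
	REASON_PIT_REINJ,
};

/* Toy stand-in for per-VM state: one bit per active inhibit reason. */
struct vm_model {
	unsigned long inhibit_reasons;
};

/* Assumption: APICv may run only while no reason inhibits it. */
static bool apicv_active(const struct vm_model *vm)
{
	return vm->inhibit_reasons == 0;
}

/*
 * Clear the reason bit when the caller no longer needs APICv off,
 * set it when the caller needs the legacy injection path.
 */
static void request_apicv_update(struct vm_model *vm, bool activate,
				 enum inhibit_reason bit)
{
	if (activate)
		vm->inhibit_reasons &= ~(1UL << bit);
	else
		vm->inhibit_reasons |= 1UL << bit;
}
```

With this model, two independent users can each inhibit APICv, and it
becomes active again only after both have released their own bit.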

A similar approach is also used to handle Hyper-V SynIC in
commit 5c919412fe61 ("kvm/x86: Hyper-V synthetic interrupt controller"),
where APICv is permanently disabled at runtime (currently broken for
AVIC, and fixed by this series).

This series contains several parts:
  * Part 1: patch 1-2
    Code clean up, refactor, and introduce helper functions

  * Part 2: patch 3
    Introduce APICv inhibit reason bits to keep track of APICv state
    for each VM.

  * Part 3: patch 4-10
    Add support for activating/deactivating APICv at runtime

  * Part 4: patch 11-14
    Add support for various cases where APICv needs to
    be deactivated

  * Part 5: patch 15-17
    Introduce in-kernel IOAPIC workaround for AVIC EOI

  * Part 6: patch 18
    Allow enabling AVIC w/ kernel_irqchip=on

Pre-requisite Patch:
  * commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code")
    (https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/
     ?h=next&id=b9c6ff94e43a0ee053e0c1d983fba1ac4953b762)

This series has been tested against v5.3 as follows:
  * Booting Linux, FreeBSD, and Windows Server 2019 VMs with up to 240 vcpus
    w/ qemu options "kernel-irqchip=on" and "-no-hpet".
  * Passing through an Intel 10GbE NIC and running netperf in the VM.

Changes from V4: (https://lkml.org/lkml/2019/11/1/764)
  * Rename APICV_DEACT_BIT_xxx to APICV_INHIBIT_REASON_xxxx
  * Introduce kvm_x86_ops.check_apicv_inhibit_reasons hook
    to allow vendors to specify which APICv inhibit reason bits
    to support (patch 08/18).
  * Update comment on kvm_request_apicv_update() no-lock requirement.
    (patch 04/18)
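
As a rough illustration of the vendor hook idea, the sketch below
models a per-vendor callback that reports whether a given inhibit
reason bit is supported, with unsupported bits ignored by the update
path. This is a simplified user-space model under that assumption, not
the code in patch 08/18. All struct, enum, and function names here are
hypothetical.

```c
#include <stdbool.h>

/* Hypothetical reason bits, for illustration only. */
enum reason { REASON_A, REASON_B, REASON_C };

/* Toy vendor ops table with a single hook. */
struct vendor_ops {
	bool (*check_inhibit_reason)(unsigned long bit);
};

struct vm_model {
	unsigned long inhibit_reasons;
	const struct vendor_ops *ops;
};

/* Example vendor: supports REASON_A and REASON_C, but not REASON_B. */
static bool vendor_x_check(unsigned long bit)
{
	const unsigned long supported = (1UL << REASON_A) | (1UL << REASON_C);

	return (supported & (1UL << bit)) != 0;
}

static const struct vendor_ops vendor_x_ops = {
	.check_inhibit_reason = vendor_x_check,
};

/* Assumption: requests for reasons the vendor does not support are no-ops. */
static void request_update(struct vm_model *vm, bool activate,
			   unsigned long bit)
{
	if (!vm->ops || !vm->ops->check_inhibit_reason ||
	    !vm->ops->check_inhibit_reason(bit))
		return;

	if (activate)
		vm->inhibit_reasons &= ~(1UL << bit);
	else
		vm->inhibit_reasons |= 1UL << bit;
}
```

Keeping the supported set behind a callback lets common code stay
vendor-neutral while each vendor opts in to the reasons it can handle.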

Suravee Suthikulpanit (18):
  kvm: x86: Modify kvm_x86_ops.get_enable_apicv() to use struct kvm
    parameter
  kvm: lapic: Introduce APICv update helper function
  kvm: x86: Introduce APICv inhibit reason bits
  kvm: x86: Add support for dynamic APICv
  kvm: x86: Add APICv (de)activate request trace points
  kvm: x86: svm: Add support to (de)activate posted interrupts
  svm: Add support for setup/destroy virutal APIC backing page for AVIC
  kvm: x86: Introduce APICv x86 ops for checking APIC inhibit reasons
  kvm: x86: Introduce x86 ops hook for pre-update APICv
  svm: Add support for dynamic APICv
  kvm: x86: hyperv: Use APICv update request interface
  svm: Deactivate AVIC when launching guest with nested SVM support
  svm: Temporary deactivate AVIC during ExtINT handling
  kvm: i8254: Deactivate APICv when using in-kernel PIT re-injection
    mode.
  kvm: lapic: Clean up APIC predefined macros
  kvm: ioapic: Refactor kvm_ioapic_update_eoi()
  kvm: ioapic: Lazy update IOAPIC EOI
  svm: Allow AVIC with in-kernel irqchip mode

 arch/x86/include/asm/kvm_host.h |  19 ++++-
 arch/x86/kvm/hyperv.c           |   5 +-
 arch/x86/kvm/i8254.c            |  12 +++
 arch/x86/kvm/ioapic.c           | 149 +++++++++++++++++++++++-------------
 arch/x86/kvm/lapic.c            |  35 +++++----
 arch/x86/kvm/lapic.h            |   2 +
 arch/x86/kvm/svm.c              | 164 +++++++++++++++++++++++++++++++++++-----
 arch/x86/kvm/trace.h            |  19 +++++
 arch/x86/kvm/vmx/vmx.c          |  12 ++-
 arch/x86/kvm/x86.c              |  71 ++++++++++++++---
 10 files changed, 385 insertions(+), 103 deletions(-)

Comments

Suravee Suthikulpanit Jan. 2, 2020, 10:17 a.m. UTC | #1
Paolo,

Ping. Would you please let me know your feedback when you get a chance to review this series?

Thanks,
Suravee

Suravee Suthikulpanit Jan. 20, 2020, 6:16 a.m. UTC | #2
Ping

Thanks
Suravee

Paolo Bonzini Jan. 22, 2020, 4:08 p.m. UTC | #3
On 20/01/20 07:16, Suravee Suthikulpanit wrote:
> Ping
> 
> Thanks
> Suravee

Queued it, finally.  Sorry for the wait.

Paolo
