Message ID | 20210716064808.14757-7-guang.zeng@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | IPI virtualization support for VM |
On 16/07/21 08:48, Zeng Guang wrote:
>
> +	if (!(_cpu_based_3rd_exec_control & TERTIARY_EXEC_IPI_VIRT))
> +		enable_ipiv = 0;
> +
> }

Please move this to hardware_setup(), using a new function
cpu_has_vmx_ipiv() in vmx/capabilities.h.

> 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
> -		u64 opt3 = 0;
> +		u64 opt3 = enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0;
> 		u64 min3 = 0;

I like the idea of changing opt3, but it's different from how
setup_vmcs_config works for the other execution controls. Let me
think if it makes sense to clean this up, and move the handling of
other module parameters from hardware_setup() to setup_vmcs_config().

> +
> +	if (vmx->ipiv_active)
> +		install_pid(vmx);

This should be if (enable_ipiv) instead, I think.

In fact, in all other places that are using vmx->ipiv_active, you can
actually replace it with enable_ipiv; they are all reached only with
kvm_vcpu_apicv_active(vcpu) == true.

> +	if (!enable_apicv) {
> +		enable_ipiv = 0;
> +		vmcs_config.cpu_based_3rd_exec_ctrl &= ~TERTIARY_EXEC_IPI_VIRT;
> +	}

The assignment to vmcs_config.cpu_based_3rd_exec_ctrl should not be
necessary; kvm_vcpu_apicv_active will always be false in that case and
IPI virtualization would never be enabled.

Paolo
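A minimal user-space sketch of what the first suggestion above could look like: a cpu_has_vmx_ipiv() helper in the style of vmx/capabilities.h, consulted from hardware_setup(). Everything here is an assumption for illustration, not the code that was eventually merged: struct vmcs_config is reduced to one field, hardware_setup_ipiv() is a hypothetical stand-in for the relevant fragment of hardware_setup(), and the bit position is taken from VMX_FEATURE_IPI_VIRT (3*32 + 4) in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4, following VMX_FEATURE_IPI_VIRT (3*32 + 4) in the patch. */
#define TERTIARY_EXEC_IPI_VIRT (UINT64_C(1) << 4)

/* Stand-in for the kernel's vmcs_config; only the tertiary controls are modeled. */
struct vmcs_config {
	uint64_t cpu_based_3rd_exec_ctrl;
};

static struct vmcs_config vmcs_config;
static bool enable_apicv = true;	/* module parameter in the real code */
static bool enable_ipiv = true;		/* module parameter added by the patch */

/* Capability helper: does the probed VMCS configuration offer IPI virtualization? */
static inline bool cpu_has_vmx_ipiv(void)
{
	return (vmcs_config.cpu_based_3rd_exec_ctrl & TERTIARY_EXEC_IPI_VIRT) != 0;
}

/*
 * Hypothetical fragment of hardware_setup(): the module parameter is cleared
 * when either APICv is off or the CPU lacks the control. vmcs_config itself
 * is left untouched, as the review argues it can be.
 */
static void hardware_setup_ipiv(void)
{
	if (!enable_apicv || !cpu_has_vmx_ipiv())
		enable_ipiv = false;
}
```

With this shape, setup_vmcs_config() only probes the hardware and hardware_setup() decides the final value of the module parameter, which is how the other enable_* parameters are handled according to the review.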
On 7/16/2021 5:52 PM, Paolo Bonzini wrote:
> On 16/07/21 08:48, Zeng Guang wrote:
>>
>> +	if (!(_cpu_based_3rd_exec_control & TERTIARY_EXEC_IPI_VIRT))
>> +		enable_ipiv = 0;
>> +
>> }
>
> Please move this to hardware_setup(), using a new function
> cpu_has_vmx_ipiv() in vmx/capabilities.h.
>
OK, we will change it to follow the current framework.

>> 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
>> -		u64 opt3 = 0;
>> +		u64 opt3 = enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0;
>> 		u64 min3 = 0;
>
> I like the idea of changing opt3, but it's different from how
> setup_vmcs_config works for the other execution controls. Let me
> think if it makes sense to clean this up, and move the handling of
> other module parameters from hardware_setup() to setup_vmcs_config().
>
Maybe an exception for the ipiv feature?

>> +
>> +	if (vmx->ipiv_active)
>> +		install_pid(vmx);
>
> This should be if (enable_ipiv) instead, I think.
>
> In fact, in all other places that are using vmx->ipiv_active, you can
> actually replace it with enable_ipiv; they are all reached only with
> kvm_vcpu_apicv_active(vcpu) == true.
>
enable_ipiv as a global variable indicates the hardware capability to
enable IPIv. Each VM may have a different IPIv configuration according to
its kvm_vcpu_apicv_active status. So we use ipiv_active per VM to enclose
IPIv-related operations.

>> +	if (!enable_apicv) {
>> +		enable_ipiv = 0;
>> +		vmcs_config.cpu_based_3rd_exec_ctrl &= ~TERTIARY_EXEC_IPI_VIRT;
>> +	}
>
> The assignment to vmcs_config.cpu_based_3rd_exec_ctrl should not be
> necessary; kvm_vcpu_apicv_active will always be false in that case and
> IPI virtualization would never be enabled.
>
We originally intended to make vmcs_config consistent with the actual
ipiv capability and decouple it from other factors. As you mentioned,
it's not necessary to update vmcs_config.cpu_based_3rd_exec_ctrl in this
case. We will remove it.

Thanks.

> Paolo
>
On 17/07/21 05:55, Zeng Guang wrote:
>>> 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
>>> -		u64 opt3 = 0;
>>> +		u64 opt3 = enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0;
>>> 		u64 min3 = 0;
>>
>> I like the idea of changing opt3, but it's different from how
>> setup_vmcs_config works for the other execution controls. Let me
>> think if it makes sense to clean this up, and move the handling of
>> other module parameters from hardware_setup() to setup_vmcs_config().
>>
> Maybe an exception for the ipiv feature?

Yes, possibly. I'll look into using this idea for other parameters.

>>> +	if (vmx->ipiv_active)
>>> +		install_pid(vmx);
>>
>> This should be if (enable_ipiv) instead, I think.
>>
>> In fact, in all other places that are using vmx->ipiv_active, you can
>> actually replace it with enable_ipiv; they are all reached only with
>> kvm_vcpu_apicv_active(vcpu) == true.
>>
> enable_ipiv as a global variable indicates the hardware capability to
> enable IPIv. Each VM may have a different IPIv configuration according to
> its kvm_vcpu_apicv_active status. So we use ipiv_active per VM to enclose
> IPIv-related operations.

Understood, but in practice all uses of vmx->ipiv_active are guarded by
kvm_vcpu_apicv_active so they are always reached with vmx->ipiv_active
== enable_ipiv.

The one above instead seems wrong and should just use enable_ipiv.

Paolo
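For readers less familiar with the min/opt convention discussed above, here is a hedged user-space sketch of how required (min3) and optional (opt3) control bits are typically reconciled with what the CPU allows. adjust_vmx_controls_64() is not shown in this patch, so adjust_controls_64_sketch() below is a hypothetical stand-in; it assumes the capability value is a plain mask of bits that are allowed to be 1.

```c
#include <errno.h>
#include <stdint.h>

#define TERTIARY_EXEC_IPI_VIRT (UINT64_C(1) << 4)

/*
 * Keep every requested bit (required or optional) that the hardware allows.
 * Fail with -EIO if a required bit is not allowed; a missing optional bit
 * is silently dropped. "allowed1" stands in for the capability MSR value.
 */
static int adjust_controls_64_sketch(uint64_t min, uint64_t opt,
				     uint64_t allowed1, uint64_t *result)
{
	uint64_t ctl = (min | opt) & allowed1;

	if (min & ~ctl)
		return -EIO;

	*result = ctl;
	return 0;
}
```

Under this model, passing `enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0` as opt3 means the bit never reaches the result when the module parameter is off, which is the behaviour the patch relies on and the reason it differs from how the other controls are handled.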
On 7/19/2021 4:32 AM, Paolo Bonzini wrote:
> On 17/07/21 05:55, Zeng Guang wrote:
>>>> +	if (vmx->ipiv_active)
>>>> +		install_pid(vmx);
>>> This should be if (enable_ipiv) instead, I think.
>>>
>>> In fact, in all other places that are using vmx->ipiv_active, you can
>>> actually replace it with enable_ipiv; they are all reached only with
>>> kvm_vcpu_apicv_active(vcpu) == true.
>>>
>> enable_ipiv as a global variable indicates the hardware capability to
>> enable IPIv. Each VM may have a different IPIv configuration according to
>> its kvm_vcpu_apicv_active status. So we use ipiv_active per VM to enclose
>> IPIv-related operations.
> Understood, but in practice all uses of vmx->ipiv_active are guarded by
> kvm_vcpu_apicv_active so they are always reached with vmx->ipiv_active
> == enable_ipiv.
>
> The one above instead seems wrong and should just use enable_ipiv.

enable_ipiv is associated with the "IPI virtualization" setting in the
tertiary exec controls and with enable_apicv, which depends on
cpu_has_vmx_apicv(). kvm_vcpu_apicv_active can still be false even if
enable_ipiv is true, e.g. when the irqchip is not emulated in the kernel.
It's OK to use enable_ipiv here: the VMCS setup for IPIv succeeds, but it
won't take effect because a false kvm_vcpu_apicv_active disables
"IPI virtualization" in this case.

> Paolo
>
On 7/19/2021 4:32 AM, Paolo Bonzini wrote:
> On 17/07/21 05:55, Zeng Guang wrote:
>>>> +	if (vmx->ipiv_active)
>>>> +		install_pid(vmx);
>>> This should be if (enable_ipiv) instead, I think.
>>>
>>> In fact, in all other places that are using vmx->ipiv_active, you can
>>> actually replace it with enable_ipiv; they are all reached only with
>>> kvm_vcpu_apicv_active(vcpu) == true.
>>>
>> enable_ipiv as a global variable indicates the hardware capability to
>> enable IPIv. Each VM may have a different IPIv configuration according to
>> its kvm_vcpu_apicv_active status. So we use ipiv_active per VM to enclose
>> IPIv-related operations.
> Understood, but in practice all uses of vmx->ipiv_active are guarded by
> kvm_vcpu_apicv_active so they are always reached with vmx->ipiv_active
> == enable_ipiv.
>
> The one above instead seems wrong and should just use enable_ipiv.

enable_ipiv is associated with the "IPI virtualization" setting in the
tertiary exec controls and with enable_apicv, which depends on
cpu_has_vmx_apicv(). kvm_vcpu_apicv_active can still be false even if
enable_ipiv is true, e.g. when the irqchip is not emulated in the kernel.
Though it's OK to use enable_ipiv here, the VMCS setup for IPIv succeeds
but won't take effect, because a false kvm_vcpu_apicv_active disables
"IPI virtualization" for the VM at runtime in this case and makes
vmx->ipiv_active false as well. So vmx->ipiv_active reflects the runtime
IPIv status more accurately.

> Paolo
>
On 19/07/21 14:38, Zeng Guang wrote:
>> Understood, but in practice all uses of vmx->ipiv_active are
>> guarded by kvm_vcpu_apicv_active so they are always reached with
>> vmx->ipiv_active == enable_ipiv.
>>
>> The one above instead seems wrong and should just use enable_ipiv.
>
> enable_ipiv is associated with the "IPI virtualization" setting in the
> tertiary exec controls and with enable_apicv, which depends on
> cpu_has_vmx_apicv(). kvm_vcpu_apicv_active can still be false even if
> enable_ipiv is true, e.g. when the irqchip is not emulated in the kernel.

Right, kvm_vcpu_apicv_active *is* set in init_vmcs. But there's an
"if (kvm_vcpu_apicv_active(&vmx->vcpu))" above. You can just stick

	if (enable_ipiv)
		install_pid(vmx);

inside there. As to the other occurrences of vmx->ipiv_active, look here:

> +	if (!kvm_vcpu_apicv_active(vcpu))
> +		return;
> +
> +	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
> +		!irq_remapping_cap(IRQ_POSTING_CAP)) &&
> +		!to_vmx(vcpu)->ipiv_active)
> 		return;
>

This one can be enable_ipiv because APICv must be active.

> +	if (!kvm_vcpu_apicv_active(vcpu))
> +		return 0;
> +
> +	/* Put vCPU into a list and set NV to wakeup vector if it is
> +	 * one of the following cases:
> +	 * 1. any assigned device is in use.
> +	 * 2. IPI virtualization is enabled.
> +	 */
> +	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
> +		!irq_remapping_cap(IRQ_POSTING_CAP)) && !to_vmx(vcpu)->ipiv_active)
> 		return 0;

This one can be !enable_ipiv because APICv must be active.

>
> @@ -3870,6 +3877,8 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu, u8 mode)
> 			vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
> 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
> 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
> +			vmx_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_ICR),
> +						  MSR_TYPE_RW, !to_vmx(vcpu)->ipiv_active);
> 		}
> 	}

This is inside "if (mode & MSR_BITMAP_MODE_X2APIC_APICV)", so APICv must be
active; so it can be enable_ipiv as well.

In conclusion, you do not need vmx->ipiv_active.

Paolo
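The redundancy argument above can be checked with a small truth-table sketch. It models vmx_compute_tertiary_exec_control() from the patch in user space, under the assumption that after setup the IPI virtualization bit is present in vmcs_config exactly when enable_ipiv is true (which is what the patch's setup_vmcs_config() and hardware_setup() changes arrange). The function names compute_ipiv_active() and ipiv_active_is_redundant() are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TERTIARY_EXEC_IPI_VIRT (UINT64_C(1) << 4)

/*
 * Mirrors the per-vCPU computation in the patch: start from the global
 * configuration, drop the IPI virtualization bit when APICv is inactive,
 * and report whether the bit survived.
 */
static bool compute_ipiv_active(bool enable_ipiv, bool apicv_active)
{
	/* Assumption: the config bit is set exactly when enable_ipiv is true. */
	uint64_t exec_control = enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0;

	if (!apicv_active)
		exec_control &= ~TERTIARY_EXEC_IPI_VIRT;

	return (exec_control & TERTIARY_EXEC_IPI_VIRT) != 0;
}

/*
 * Returns true when, on every path already guarded by an active APICv,
 * the per-vCPU flag carries no information beyond the global parameter.
 */
static bool ipiv_active_is_redundant(void)
{
	for (int ipiv = 0; ipiv <= 1; ipiv++) {
		if (compute_ipiv_active(ipiv, true) != (bool)ipiv)
			return false;
	}
	return true;
}
```

The two values differ only when APICv is inactive, and in the call sites quoted above that case has already returned early or is excluded by the MSR bitmap mode.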
On 7/19/2021 9:58 PM, Paolo Bonzini wrote:
> On 19/07/21 14:38, Zeng Guang wrote:
>>> Understood, but in practice all uses of vmx->ipiv_active are
>>> guarded by kvm_vcpu_apicv_active so they are always reached with
>>> vmx->ipiv_active == enable_ipiv.
>>>
>>> The one above instead seems wrong and should just use enable_ipiv.
>> enable_ipiv is associated with the "IPI virtualization" setting in the
>> tertiary exec controls and with enable_apicv, which depends on
>> cpu_has_vmx_apicv(). kvm_vcpu_apicv_active can still be false even if
>> enable_ipiv is true, e.g. when the irqchip is not emulated in the kernel.
> Right, kvm_vcpu_apicv_active *is* set in init_vmcs. But there's an
> "if (kvm_vcpu_apicv_active(&vmx->vcpu))" above. You can just stick
>
> 	if (enable_ipiv)
> 		install_pid(vmx);

OK, got your point now. I will revise to remove vmx->ipiv_active.
Thanks.

> inside there. As to the other occurrences of vmx->ipiv_active, look here:
>
>> +	if (!kvm_vcpu_apicv_active(vcpu))
>> +		return;
>> +
>> +	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
>> +		!irq_remapping_cap(IRQ_POSTING_CAP)) &&
>> +		!to_vmx(vcpu)->ipiv_active)
>> 		return;
>>
> This one can be enable_ipiv because APICv must be active.
>
>> +	if (!kvm_vcpu_apicv_active(vcpu))
>> +		return 0;
>> +
>> +	/* Put vCPU into a list and set NV to wakeup vector if it is
>> +	 * one of the following cases:
>> +	 * 1. any assigned device is in use.
>> +	 * 2. IPI virtualization is enabled.
>> +	 */
>> +	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
>> +		!irq_remapping_cap(IRQ_POSTING_CAP)) && !to_vmx(vcpu)->ipiv_active)
>> 		return 0;
> This one can be !enable_ipiv because APICv must be active.
>
>> @@ -3870,6 +3877,8 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu, u8 mode)
>> 			vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
>> 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
>> 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
>> +			vmx_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_ICR),
>> +						  MSR_TYPE_RW, !to_vmx(vcpu)->ipiv_active);
>> 		}
>> 	}
> This is inside "if (mode & MSR_BITMAP_MODE_X2APIC_APICV)", so APICv must be
> active; so it can be enable_ipiv as well.
>
> In conclusion, you do not need vmx->ipiv_active.
>
> Paolo
>
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 15652047f2db..e97cf7b9ff12 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -76,6 +76,11 @@
 #define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE VMCS_CONTROL_BIT(USR_WAIT_PAUSE)
 #define SECONDARY_EXEC_BUS_LOCK_DETECTION VMCS_CONTROL_BIT(BUS_LOCK_DETECTION)

+/*
+ * Definitions of Tertiary Processor-Based VM-Execution Controls.
+ */
+#define TERTIARY_EXEC_IPI_VIRT VMCS_CONTROL_BIT(IPI_VIRT)
+
 #define PIN_BASED_EXT_INTR_MASK VMCS_CONTROL_BIT(INTR_EXITING)
 #define PIN_BASED_NMI_EXITING VMCS_CONTROL_BIT(NMI_EXITING)
 #define PIN_BASED_VIRTUAL_NMIS VMCS_CONTROL_BIT(VIRTUAL_NMIS)
@@ -159,6 +164,7 @@ static inline int vmx_misc_mseg_revid(u64 vmx_misc)
 enum vmcs_field {
 	VIRTUAL_PROCESSOR_ID = 0x00000000,
 	POSTED_INTR_NV = 0x00000002,
+	LAST_PID_POINTER_INDEX = 0x00000008,
 	GUEST_ES_SELECTOR = 0x00000800,
 	GUEST_CS_SELECTOR = 0x00000802,
 	GUEST_SS_SELECTOR = 0x00000804,
@@ -224,6 +230,8 @@ enum vmcs_field {
 	TSC_MULTIPLIER_HIGH = 0x00002033,
 	TERTIARY_VM_EXEC_CONTROL = 0x00002034,
 	TERTIARY_VM_EXEC_CONTROL_HIGH = 0x00002035,
+	PID_POINTER_TABLE = 0x00002042,
+	PID_POINTER_TABLE_HIGH = 0x00002043,
 	GUEST_PHYSICAL_ADDRESS = 0x00002400,
 	GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
 	VMCS_LINK_POINTER = 0x00002800,
diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h
index 27e76eeca05b..e821e8126097 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -86,4 +86,6 @@
 #define VMX_FEATURE_ENCLV_EXITING ( 2*32+ 28) /* "" VM-Exit on ENCLV (leaf dependent) */
 #define VMX_FEATURE_BUS_LOCK_DETECTION ( 2*32+ 30) /* "" VM-Exit when bus lock caused */

+/* Tertiary Processor-Based VM-Execution Controls, word 3 */
+#define VMX_FEATURE_IPI_VIRT ( 3*32 + 4) /* "" Enable IPI virtualization */
 #endif /* _ASM_X86_VMXFEATURES_H */
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 38d414f64e61..9e9710c3ee51 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -12,6 +12,7 @@ extern bool __read_mostly enable_ept;
 extern bool __read_mostly enable_unrestricted_guest;
 extern bool __read_mostly enable_ept_ad_bits;
 extern bool __read_mostly enable_pml;
+extern bool __read_mostly enable_ipiv;
 extern int __read_mostly pt_mode;

 #define PT_MODE_SYSTEM 0
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..d817331bfb05 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -81,9 +81,12 @@ void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);

-	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
-		!irq_remapping_cap(IRQ_POSTING_CAP) ||
-		!kvm_vcpu_apicv_active(vcpu))
+	if (!kvm_vcpu_apicv_active(vcpu))
+		return;
+
+	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
+		!irq_remapping_cap(IRQ_POSTING_CAP)) &&
+		!to_vmx(vcpu)->ipiv_active)
 		return;

 	/* Set SN when the vCPU is preempted */
@@ -141,9 +144,16 @@ int pi_pre_block(struct kvm_vcpu *vcpu)
 	struct pi_desc old, new;
 	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);

-	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
-		!irq_remapping_cap(IRQ_POSTING_CAP) ||
-		!kvm_vcpu_apicv_active(vcpu))
+	if (!kvm_vcpu_apicv_active(vcpu))
+		return 0;
+
+	/* Put vCPU into a list and set NV to wakeup vector if it is
+	 * one of the following cases:
+	 * 1. any assigned device is in use.
+	 * 2. IPI virtualization is enabled.
+	 */
+	if ((!kvm_arch_has_assigned_device(vcpu->kvm) ||
+		!irq_remapping_cap(IRQ_POSTING_CAP)) && !to_vmx(vcpu)->ipiv_active)
 		return 0;

 	WARN_ON(irqs_disabled());
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 820fc131d258..8a45f45b263c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -104,6 +104,9 @@ module_param(fasteoi, bool, S_IRUGO);

 module_param(enable_apicv, bool, S_IRUGO);

+bool __read_mostly enable_ipiv = 1;
+module_param(enable_ipiv, bool, S_IRUGO);
+
 /*
  * If nested=1, nested virtualization is supported, i.e., guests may use
  * VMX and be a hypervisor for its own guests. If nested=0, guests may not
@@ -225,6 +228,7 @@ static const struct {
 };

 #define L1D_CACHE_ORDER 4
+#define PID_TABLE_ORDER get_order(KVM_MAX_VCPU_ID << 3)
 static void *vmx_l1d_flush_pages;

 static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
@@ -2514,7 +2518,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	}

 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
-		u64 opt3 = 0;
+		u64 opt3 = enable_ipiv ? TERTIARY_EXEC_IPI_VIRT : 0;
 		u64 min3 = 0;

 		if (adjust_vmx_controls_64(min3, opt3,
@@ -2523,6 +2527,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 			return -EIO;
 	}

+	if (!(_cpu_based_3rd_exec_control & TERTIARY_EXEC_IPI_VIRT))
+		enable_ipiv = 0;
+
 	min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
 #ifdef CONFIG_X86_64
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
@@ -3870,6 +3877,8 @@ static void vmx_update_msr_bitmap_x2apic(struct kvm_vcpu *vcpu, u8 mode)
 			vmx_enable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_RW);
 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
 			vmx_disable_intercept_for_msr(vcpu, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
+			vmx_set_intercept_for_msr(vcpu, X2APIC_MSR(APIC_ICR),
+						  MSR_TYPE_RW, !to_vmx(vcpu)->ipiv_active);
 		}
 	}
@@ -4138,14 +4147,21 @@ static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)

 	pin_controls_set(vmx, vmx_pin_based_exec_ctrl(vmx));
 	if (cpu_has_secondary_exec_ctrls()) {
-		if (kvm_vcpu_apicv_active(vcpu))
+		if (kvm_vcpu_apicv_active(vcpu)) {
 			secondary_exec_controls_setbit(vmx,
 				      SECONDARY_EXEC_APIC_REGISTER_VIRT |
 				      SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
-		else
+			if (cpu_has_tertiary_exec_ctrls() && enable_ipiv)
+				tertiary_exec_controls_setbit(vmx,
+						TERTIARY_EXEC_IPI_VIRT);
+		} else {
 			secondary_exec_controls_clearbit(vmx,
 					SECONDARY_EXEC_APIC_REGISTER_VIRT |
 					SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
+			if (cpu_has_tertiary_exec_ctrls())
+				tertiary_exec_controls_clearbit(vmx,
+						TERTIARY_EXEC_IPI_VIRT);
+		}
 	}

 	if (cpu_has_vmx_msr_bitmap())
@@ -4236,8 +4252,14 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control,

 static void vmx_compute_tertiary_exec_control(struct vcpu_vmx *vmx)
 {
+	struct kvm_vcpu *vcpu = &vmx->vcpu;
 	u32 exec_control = vmcs_config.cpu_based_3rd_exec_ctrl;

+	if (!kvm_vcpu_apicv_active(vcpu))
+		exec_control &= ~TERTIARY_EXEC_IPI_VIRT;
+
+	vmx->ipiv_active = (exec_control & TERTIARY_EXEC_IPI_VIRT) ? true : false;
+
 	vmx->tertiary_exec_control = exec_control;
 }

@@ -4332,6 +4354,17 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)

 #define VMX_XSS_EXIT_BITMAP 0

+static void install_pid(struct vcpu_vmx *vmx)
+{
+	struct kvm_vmx *kvm_vmx = to_kvm_vmx(vmx->vcpu.kvm);
+
+	BUG_ON(vmx->vcpu.vcpu_id > kvm_vmx->pid_last_index);
+	/* Bit 0 is the valid bit */
+	kvm_vmx->pid_table[vmx->vcpu.vcpu_id] = __pa(&vmx->pi_desc) | 1;
+	vmcs_write64(PID_POINTER_TABLE, __pa(kvm_vmx->pid_table));
+	vmcs_write16(LAST_PID_POINTER_INDEX, kvm_vmx->pid_last_index);
+}
+
 /*
  * Noting that the initialization of Guest-state Area of VMCS is in
  * vmx_vcpu_reset().
@@ -4430,6 +4463,9 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 		vmx->pt_desc.guest.output_mask = 0x7F;
 		vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
 	}
+
+	if (vmx->ipiv_active)
+		install_pid(vmx);
 }

 static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -6969,6 +7005,22 @@ static int vmx_vm_init(struct kvm *kvm)
 			break;
 		}
 	}
+
+	if (enable_ipiv) {
+		struct page *pages;
+
+		/* Allocate pages for PID table in order of PID_TABLE_ORDER
+		 * depending on KVM_MAX_VCPU_ID. Each PID entry is 8 bytes.
+		 */
+		pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, PID_TABLE_ORDER);
+
+		if (!pages)
+			return -ENOMEM;
+
+		to_kvm_vmx(kvm)->pid_table = (void *)page_address(pages);
+		to_kvm_vmx(kvm)->pid_last_index = KVM_MAX_VCPU_ID;
+	}
+
 	return 0;
 }

@@ -7579,6 +7631,14 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit)
 	return supported & BIT(bit);
 }

+static void vmx_vm_destroy(struct kvm *kvm)
+{
+	struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
+
+	if (kvm_vmx->pid_table)
+		free_pages((unsigned long)kvm_vmx->pid_table, PID_TABLE_ORDER);
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.hardware_unsetup = hardware_unsetup,

@@ -7589,6 +7649,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {

 	.vm_size = sizeof(struct kvm_vmx),
 	.vm_init = vmx_vm_init,
+	.vm_destroy = vmx_vm_destroy,

 	.vcpu_create = vmx_create_vcpu,
 	.vcpu_free = vmx_free_vcpu,
@@ -7828,6 +7889,11 @@ static __init int hardware_setup(void)
 		vmx_x86_ops.sync_pir_to_irr = NULL;
 	}

+	if (!enable_apicv) {
+		enable_ipiv = 0;
+		vmcs_config.cpu_based_3rd_exec_ctrl &= ~TERTIARY_EXEC_IPI_VIRT;
+	}
+
 	if (cpu_has_vmx_tsc_scaling()) {
 		kvm_has_tsc_control = true;
 		kvm_max_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_MAX;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c356ceebe84c..0dee1e4c628c 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -344,6 +344,9 @@ struct vcpu_vmx {
 		DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
 		DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
 	} shadow_msr_intercept;
+
+	/* IPI virtualization status */
+	bool ipiv_active;
 };

 struct kvm_vmx {
@@ -352,6 +355,9 @@ struct kvm_vmx {
 	unsigned int tss_addr;
 	bool ept_identity_pagetable_done;
 	gpa_t ept_identity_map_addr;
+	/* PID table for IPI virtualization */
+	u64 *pid_table;
+	u16 pid_last_index;
 };

 bool nested_vmx_allowed(struct kvm_vcpu *vcpu);
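To make the PID table arithmetic in the patch concrete, here is a small user-space sketch of the sizing done by PID_TABLE_ORDER and of the entry encoding used by install_pid(). The values of PAGE_SHIFT and KVM_MAX_VCPU_ID are assumptions chosen for illustration (4 KiB pages, a maximum vCPU ID of 1023), and get_order_sketch() and pid_table_entry() are hypothetical stand-ins rather than kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define KVM_MAX_VCPU_ID	1023	/* assumption: illustrative value only */
#define PID_ENTRY_VALID	UINT64_C(1)	/* "Bit 0 is the valid bit" in install_pid() */

/*
 * Stand-in for the kernel's get_order(): the smallest order such that
 * (1 << order) pages can hold "size" bytes.
 */
static unsigned int get_order_sketch(size_t size)
{
	size_t pages = (size + ((size_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;

	return order;
}

/*
 * Encode one 8-byte table entry: the physical address of a vCPU's
 * posted-interrupt descriptor with the valid bit set. This relies on the
 * descriptor being aligned so that bit 0 of its address is always clear.
 */
static uint64_t pid_table_entry(uint64_t pi_desc_pa)
{
	return pi_desc_pa | PID_ENTRY_VALID;
}
```

With these assumed values, `KVM_MAX_VCPU_ID << 3` is 8184 bytes, which rounds up to order 1 (two pages). The patch's install_pid() then indexes the table by vcpu_id and bounds the index with pid_last_index.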