Message ID | 20191120143307.59906-1-liran.alon@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page | expand |
On 20/11/19 15:33, Liran Alon wrote: > "Virtualize APIC accesses" VM-execution control was changed > from 0 to 1, OR the value of apic_access_page was changed. > > Examining prepare_vmcs02(), one could note that L0 won't flush > physical TLB only in case: L0 use VPID, L1 use VPID and L0 > can guarantee TLB entries populated while running L1 are tagged > differently than TLB entries populated while running L2. > The last condition can only occur if either L0 use EPT or > L0 use different VPID for L1 and L2 > (i.e. vmx->vpid != vmx->nested.vpid02). > > If L0 use EPT, L0 use different EPTP when running L2 than L1 > (Because guest_mode is part of mmu-role) and therefore SDM section > 28.3.3.4 doesn't apply. Otherwise, L0 use different VPID when > running L2 than L1 and therefore SDM section 28.3.3.3 doesn't > apply. I don't understand this. You could still have a stale EPTP entry from a previous L2 vmenter. If L1 uses neither EPT nor VPID, it expects a TLB flush to occur on every vmentry, but this won't happen if L0 uses EPT. Paolo
> On 20 Nov 2019, at 17:05, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 20/11/19 15:33, Liran Alon wrote: >> "Virtualize APIC accesses" VM-execution control was changed >> from 0 to 1, OR the value of apic_access_page was changed. >> >> Examining prepare_vmcs02(), one could note that L0 won't flush >> physical TLB only in case: L0 use VPID, L1 use VPID and L0 >> can guarantee TLB entries populated while running L1 are tagged >> differently than TLB entries populated while running L2. >> The last condition can only occur if either L0 use EPT or >> L0 use different VPID for L1 and L2 >> (i.e. vmx->vpid != vmx->nested.vpid02). >> >> If L0 use EPT, L0 use different EPTP when running L2 than L1 >> (Because guest_mode is part of mmu-role) and therefore SDM section >> 28.3.3.4 doesn't apply. Otherwise, L0 use different VPID when >> running L2 than L1 and therefore SDM section 28.3.3.3 doesn't >> apply. > > I don't understand this. You could still have a stale EPTP entry from a > previous L2 vmenter. If L1 uses neither EPT nor VPID, it expects a TLB > flush to occur on every vmentry, but this won't happen if L0 uses EPT. I don’t seem to get your concern. In case L1 don’t use VPID, prepare_vmcs02() will request KVM_REQ_TLB_FLUSH. (As it needs to emulate to L1 that on every L1<->L2 switch, the entire physical TLB is flushed) As explained in commit message. > > Paolo
On 20/11/19 16:25, Liran Alon wrote: >>> If L0 use EPT, L0 use different EPTP when running L2 than L1 >>> (Because guest_mode is part of mmu-role) and therefore SDM section >>> 28.3.3.4 doesn't apply. Otherwise, L0 use different VPID when >>> running L2 than L1 and therefore SDM section 28.3.3.3 doesn't >>> apply. >> I don't understand this. You could still have a stale EPTP entry from a >> previous L2 vmenter. If L1 uses neither EPT nor VPID, it expects a TLB >> flush to occur on every vmentry, but this won't happen if L0 uses EPT. > I don’t seem to get your concern. > In case L1 don’t use VPID, prepare_vmcs02() will request KVM_REQ_TLB_FLUSH. > (As it needs to emulate to L1 that on every L1<->L2 switch, the entire physical TLB is flushed) > As explained in commit message. > You're right. I'll rewrite some parts of the commit message, but the patch is correct. Paolo
On 20/11/19 18:49, Paolo Bonzini wrote: > On 20/11/19 16:25, Liran Alon wrote: >>>> If L0 use EPT, L0 use different EPTP when running L2 than L1 >>>> (Because guest_mode is part of mmu-role) and therefore SDM section >>>> 28.3.3.4 doesn't apply. Otherwise, L0 use different VPID when >>>> running L2 than L1 and therefore SDM section 28.3.3.3 doesn't >>>> apply. >>> I don't understand this. You could still have a stale EPTP entry from a >>> previous L2 vmenter. If L1 uses neither EPT nor VPID, it expects a TLB >>> flush to occur on every vmentry, but this won't happen if L0 uses EPT. >> I don’t seem to get your concern. >> In case L1 don’t use VPID, prepare_vmcs02() will request KVM_REQ_TLB_FLUSH. >> (As it needs to emulate to L1 that on every L1<->L2 switch, the entire physical TLB is flushed) >> As explained in commit message. > > You're right. I'll rewrite some parts of the commit message, but the > patch is correct. I got confused because of "Otherwise, L0 use different VPID when running L2 than L1 and therefore SDM section 28.3.3.3 doesn't apply". In fact SDM section 28.3.3.3 never applies to nested virtualization because: * if L1 doesn't use VPID, the architecture requires a TLB flush on vmentry/vmexit anyway * if L1 uses VPID, it already has to do an INVVPID. L0 only needs to take care of combined mappings, which can be done with INVEPT but also by using a different EP4TA for L1 and L2, as is the case after 1313cc2bd8f6. Here is the rewritten commit message: ---- According to Intel SDM section 28.3.3.3/28.3.3.4 Guidelines for Use of the INVVPID/INVEPT Instruction, the hypervisor needs to execute INVVPID/INVEPT X in case CPU executes VMEntry with VPID/EPTP X and either: "Virtualize APIC accesses" VM-execution control was changed from 0 to 1, OR the value of apic_access_page was changed. In the nested case, the burden falls on L1, unless L0 enables EPT in vmcs02 but L1 enables neither EPT nor VPID in vmcs12. For this reason prepare_vmcs02() and load_vmcs12_host_state() have special code to request a TLB flush in case L1 does not use EPT but it uses "virtualize APIC accesses". This special case however is not necessary. On a nested vmentry the physical TLB will already be flushed except if all the following apply: * L0 uses VPID * L1 uses VPID * L0 can guarantee TLB entries populated while running L1 are tagged differently than TLB entries populated while running L2. If the first condition is false, the processor will flush the TLB on vmentry to L2. If the second or third condition are false, prepare_vmcs02() will request KVM_REQ_TLB_FLUSH. However, even if both are true, no extra TLB flush is needed to handle the APIC access page: * if L1 doesn't use VPID, the second condition doesn't hold and the TLB will be flushed anyway. * if L1 uses VPID, it has to flush the TLB itself with INVVPID and section 28.3.3.3 doesn't apply to L0. * even INVEPT is not needed because, if L0 uses EPT, it uses different EPTP when running L2 than L1 (because guest_mode is part of mmu-role). In this case SDM section 28.3.3.4 doesn't apply. Therefore, there is no need to specifically flush TLB in case L1 does not use EPT but it uses "Virtualize APIC accesses". Similarly, examining nested_vmx_vmexit()->load_vmcs12_host_state(), one could note that L0 won't flush TLB only in cases where SDM sections 28.3.3.3 and 28.3.3.4 don't apply. In particular, if L0 uses different VPIDs for L1 and L2 (i.e. vmx->vpid != vmx->nested.vpid02), section 28.3.3.3 doesn't apply. Thus, remove this flush from prepare_vmcs02() and nested_vmx_vmexit(). Side-note: This patch can be viewed as removing parts of commit fb6c81984313 ("kvm: vmx: Flush TLB when the APIC-access address changes”) that is not relevant anymore since commit 1313cc2bd8f6 ("kvm: mmu: Add guest_mode to kvm_mmu_page_role”). i.e. The first commit assumes that if L0 use EPT and L1 doesn’t use EPT, then L0 will use same EPTP for both L0 and L1. Which indeed required L0 to execute INVEPT before entering L2 guest. This assumption is not true anymore since when guest_mode was added to mmu-role. ---- Paolo
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index fdcead2d4dd6..20692e442d13 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -2384,9 +2384,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, if (nested_cpu_has_ept(vmcs12)) nested_ept_init_mmu_context(vcpu); - else if (nested_cpu_has2(vmcs12, - SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) - vmx_flush_tlb(vcpu, true); /* * This sets GUEST_CR0 to vmcs12->guest_cr0, possibly modifying those @@ -4125,10 +4122,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, if (vmx->nested.change_vmcs01_virtual_apic_mode) { vmx->nested.change_vmcs01_virtual_apic_mode = false; vmx_set_virtual_apic_mode(vcpu); - } else if (!nested_cpu_has_ept(vmcs12) && - nested_cpu_has2(vmcs12, - SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) { - vmx_flush_tlb(vcpu, true); } /* Unpin physical memory we referred to in vmcs02 */