Message ID: 1512461786-6465-3-git-send-email-liran.alon@oracle.com (mailing list archive)
State:      New, archived
2017-12-05 10:16+0200, Liran Alon:
> In case posted-interrupt was delivered to CPU while it is in host
> (outside guest), then posted-interrupt delivery will be done by
> calling sync_pir_to_irr() at vmentry after interrupts are disabled.
>
> sync_pir_to_irr() will check if vmx->pi_desc.control ON bit and if
> set, it will sync vmx->pi_desc.pir to IRR and afterwards update RVI to
> ensure virtual-interrupt-delivery will dispatch interrupt to guest.
>
> However, it is possible that L1 will receive a posted-interrupt while
> CPU runs at host and is about to enter L2. In this case, the call to
> sync_pir_to_irr() will indeed update the L1's APIC IRR but
> vcpu_enter_guest() will then just resume into L2 guest without
> re-evaluating if it should exit from L2 to L1 as a result of this
> new pending L1 event.
>
> To address this case, if sync_pir_to_irr() has a new L1 injectable
> interrupt and CPU is running L2, we set KVM_REQ_EVENT.
> This will cause vcpu_enter_guest() to run another iteration of
> evaluating pending KVM requests and will therefore consume
> KVM_REQ_EVENT which will make sure to call check_nested_events() which
> will handle the pending L1 event properly.
>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  arch/x86/kvm/vmx.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f5074ec5701b..47bbb8b691e8 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9031,20 +9031,33 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>  static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	int prev_max_irr;
>  	int max_irr;
>
>  	WARN_ON(!vcpu->arch.apicv_active);
> +
> +	prev_max_irr = kvm_lapic_find_highest_irr(vcpu);
>  	if (pi_test_on(&vmx->pi_desc)) {
>  		pi_clear_on(&vmx->pi_desc);
> +
>  		/*
>  		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
>  		 * But on x86 this is just a compiler barrier anyway.
>  		 */
>  		smp_mb__after_atomic();
>  		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);

I think the optimization (partly livelock protection) is not worth the
overhead of two IRR scans for non-nested guests.  Please make
kvm_apic_update_irr() return both prev_max_irr and max_irr in one pass.

> +
> +		/*
> +		 * If we are running L2 and L1 has a new pending interrupt
> +		 * which can be injected, we should re-evaluate
> +		 * what should be done with this new L1 interrupt.
> +		 */
> +		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
> +			kvm_make_request(KVM_REQ_EVENT, vcpu);

We don't need anything from KVM_REQ_EVENT and only use it to abort the
VM entry, kvm_vcpu_exiting_guest_mode() is better for that.

>  	} else {
> -		max_irr = kvm_lapic_find_highest_irr(vcpu);
> +		max_irr = prev_max_irr;
>  	}
> +
>  	vmx_hwapic_irr_update(vcpu, max_irr);

We also should just inject the interrupt if L2 is run without
nested_exit_on_intr(), maybe reusing the check in vmx_hwapic_irr_update?

Thanks.
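Radim's one-pass suggestion boils down to computing the pre-sync and
post-sync highest pending vector during the same walk of the 256-bit IRR,
instead of scanning it once in kvm_lapic_find_highest_irr() and again in
kvm_apic_update_irr(). A minimal, self-contained sketch of that scan
(illustrative names and plain C; the real __kvm_apic_update_irr() operates
on the APIC register page and clears each PIR word with an atomic xchg()):

#include <stdint.h>

/*
 * IRR and PIR are 256-bit registers stored as 8 32-bit words;
 * vector N is bit (N % 32) of word (N / 32).
 */
static void apic_update_irr_one_pass(uint32_t pir[8], uint32_t irr[8],
				     int *prev_max_irr, int *max_irr)
{
	int i;

	*prev_max_irr = -1;
	*max_irr = -1;

	for (i = 0; i <= 7; i++) {
		uint32_t irr_val = irr[i];

		/* Highest vector that was already pending in this word. */
		if (irr_val)
			*prev_max_irr = 31 - __builtin_clz(irr_val) + i * 32;

		/* Fold the posted bits into the IRR and clear the PIR word. */
		if (pir[i]) {
			irr_val |= pir[i];
			pir[i] = 0;
			irr[i] = irr_val;
		}

		/* Highest vector pending after the merge. */
		if (irr_val)
			*max_irr = 31 - __builtin_clz(irr_val) + i * 32;
	}
}

Because the words are walked from low to high vectors, the last non-zero
word seen determines each maximum, so both values fall out of a single
pass and the caller gets the "max_irr > prev_max_irr" test for free.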
On 06/12/17 20:52, Radim Krčmář wrote:
> 2017-12-05 10:16+0200, Liran Alon:
>> In case posted-interrupt was delivered to CPU while it is in host
>> (outside guest), then posted-interrupt delivery will be done by
>> calling sync_pir_to_irr() at vmentry after interrupts are disabled.
>>
>> sync_pir_to_irr() will check if vmx->pi_desc.control ON bit and if
>> set, it will sync vmx->pi_desc.pir to IRR and afterwards update RVI to
>> ensure virtual-interrupt-delivery will dispatch interrupt to guest.
>>
>> However, it is possible that L1 will receive a posted-interrupt while
>> CPU runs at host and is about to enter L2. In this case, the call to
>> sync_pir_to_irr() will indeed update the L1's APIC IRR but
>> vcpu_enter_guest() will then just resume into L2 guest without
>> re-evaluating if it should exit from L2 to L1 as a result of this
>> new pending L1 event.
>>
>> To address this case, if sync_pir_to_irr() has a new L1 injectable
>> interrupt and CPU is running L2, we set KVM_REQ_EVENT.
>> This will cause vcpu_enter_guest() to run another iteration of
>> evaluating pending KVM requests and will therefore consume
>> KVM_REQ_EVENT which will make sure to call check_nested_events() which
>> will handle the pending L1 event properly.
>>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> ---
>>  arch/x86/kvm/vmx.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index f5074ec5701b..47bbb8b691e8 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9031,20 +9031,33 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>>  static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +	int prev_max_irr;
>>  	int max_irr;
>>
>>  	WARN_ON(!vcpu->arch.apicv_active);
>> +
>> +	prev_max_irr = kvm_lapic_find_highest_irr(vcpu);
>>  	if (pi_test_on(&vmx->pi_desc)) {
>>  		pi_clear_on(&vmx->pi_desc);
>> +
>>  		/*
>>  		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
>>  		 * But on x86 this is just a compiler barrier anyway.
>>  		 */
>>  		smp_mb__after_atomic();
>>  		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
>
> I think the optimization (partly livelock protection) is not worth the
> overhead of two IRR scans for non-nested guests.  Please make
> kvm_apic_update_irr() return both prev_max_irr and max_irr in one pass.

OK. I will modify kvm_apic_update_irr().

>
>> +
>> +		/*
>> +		 * If we are running L2 and L1 has a new pending interrupt
>> +		 * which can be injected, we should re-evaluate
>> +		 * what should be done with this new L1 interrupt.
>> +		 */
>> +		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
>> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> We don't need anything from KVM_REQ_EVENT and only use it to abort the
> VM entry, kvm_vcpu_exiting_guest_mode() is better for that.

Yes, you are right. I will change to kvm_vcpu_exiting_guest_mode().

>
>>  	} else {
>> -		max_irr = kvm_lapic_find_highest_irr(vcpu);
>> +		max_irr = prev_max_irr;
>>  	}
>> +
>>  	vmx_hwapic_irr_update(vcpu, max_irr);
>
> We also should just inject the interrupt if L2 is run without
> nested_exit_on_intr(), maybe reusing the check in vmx_hwapic_irr_update?

See next patch in series :)

>
> Thanks.
>

Regards,
-Liran
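For reference, the helper Radim points to lives in include/linux/kvm_host.h
and is a thin wrapper around a cmpxchg on vcpu->mode, roughly:

static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
{
	/*
	 * Atomically flip vcpu->mode from IN_GUEST_MODE to
	 * EXITING_GUEST_MODE; returns the mode that was observed.
	 */
	return cmpxchg(&vcpu->mode, IN_GUEST_MODE, EXITING_GUEST_MODE);
}

vcpu_enter_guest() sets vcpu->mode = IN_GUEST_MODE before calling
sync_pir_to_irr() and re-checks the mode with interrupts disabled just
before the actual VM entry, so this flip is the cheapest way to cancel
the entry without going through the KVM request machinery.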
On 06/12/2017 19:52, Radim Krčmář wrote:
>> 		smp_mb__after_atomic();
>> 		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
> I think the optimization (partly livelock protection) is not worth the
> overhead of two IRR scans for non-nested guests.  Please make
> kvm_apic_update_irr() return both prev_max_irr and max_irr in one pass.

You could also return max_irr in an int*, and give the function a "bool"
return type for max_irr > prev_max_irr.  That is more efficient because
you can do the check in the "if (pir_val)" conditional of
__kvm_apic_update_irr.

Paolo

>> +
>> +		/*
>> +		 * If we are running L2 and L1 has a new pending interrupt
>> +		 * which can be injected, we should re-evaluate
>> +		 * what should be done with this new L1 interrupt.
>> +		 */
>> +		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
>> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> We don't need anything from KVM_REQ_EVENT and only use it to abort the
> VM entry, kvm_vcpu_exiting_guest_mode() is better for that.
>
>> 	} else {
>> -		max_irr = kvm_lapic_find_highest_irr(vcpu);
>> +		max_irr = prev_max_irr;
>> 	}
>> +
>> 	vmx_hwapic_irr_update(vcpu, max_irr);
> We also should just inject the interrupt if L2 is run without
> nested_exit_on_intr(), maybe reusing the check in vmx_hwapic_irr_update?
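A sketch of how Paolo's refinement could look inside __kvm_apic_update_irr()
(hypothetical; the actual follow-up patch may differ in detail): max_irr is
returned through a pointer, and the "did the sync raise the top vector"
answer is derived only from words that actually had posted bits set:

static bool __kvm_apic_update_irr(u32 *pir, void *regs, int *max_irr)
{
	u32 i, vec;
	u32 pir_val, irr_val;
	int max_updated_irr = -1;

	*max_irr = -1;

	for (i = vec = 0; i <= 7; i++, vec += 32) {
		pir_val = READ_ONCE(pir[i]);
		irr_val = *((u32 *)(regs + APIC_IRR + i * 0x10));
		if (pir_val) {
			/* Only words with posted bits can change the IRR. */
			irr_val |= xchg(&pir[i], 0);
			*((u32 *)(regs + APIC_IRR + i * 0x10)) = irr_val;
			if (irr_val)
				max_updated_irr = __fls(irr_val) + vec;
		}
		if (irr_val)
			*max_irr = __fls(irr_val) + vec;
	}

	/*
	 * True if the highest pending vector sits in a word the sync
	 * touched. Deliberately conservative: a spurious "true" only
	 * costs an aborted VM entry, while a missed "true" would lose
	 * an L2->L1 exit.
	 */
	return max_updated_irr != -1 && max_updated_irr == *max_irr;
}

This keeps the extra bookkeeping entirely inside the "if (pir_val)" branch,
so non-nested guests with an empty PIR pay essentially nothing.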
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f5074ec5701b..47bbb8b691e8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9031,20 +9031,33 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
 static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	int prev_max_irr;
 	int max_irr;

 	WARN_ON(!vcpu->arch.apicv_active);
+
+	prev_max_irr = kvm_lapic_find_highest_irr(vcpu);
 	if (pi_test_on(&vmx->pi_desc)) {
 		pi_clear_on(&vmx->pi_desc);
+
 		/*
 		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
 		 * But on x86 this is just a compiler barrier anyway.
 		 */
 		smp_mb__after_atomic();
 		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
+
+		/*
+		 * If we are running L2 and L1 has a new pending interrupt
+		 * which can be injected, we should re-evaluate
+		 * what should be done with this new L1 interrupt.
+		 */
+		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
+			kvm_make_request(KVM_REQ_EVENT, vcpu);
 	} else {
-		max_irr = kvm_lapic_find_highest_irr(vcpu);
+		max_irr = prev_max_irr;
 	}
+
 	vmx_hwapic_irr_update(vcpu, max_irr);
 	return max_irr;
 }
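Folding all three review comments into the function would give a re-spin
along these lines (a sketch of the expected follow-up, assuming the
single-pass kvm_apic_update_irr() signature discussed above; not the diff
shown here and not necessarily the exact code that eventually landed):

static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);
	int max_irr;
	bool max_irr_updated;

	WARN_ON(!vcpu->arch.apicv_active);
	if (pi_test_on(&vmx->pi_desc)) {
		pi_clear_on(&vmx->pi_desc);
		/*
		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
		 * But on x86 this is just a compiler barrier anyway.
		 */
		smp_mb__after_atomic();
		/* One IRR scan: sync the PIR and learn whether the
		 * highest pending vector rose (Radim/Paolo's suggestion). */
		max_irr_updated =
			kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);

		/*
		 * If we are running L2 and L1 has a new injectable
		 * interrupt: when L1 intercepts external interrupts,
		 * abort the entry so the event forces an L2->L1 exit;
		 * otherwise the interrupt is simply delivered to L2
		 * through RVI by vmx_hwapic_irr_update() below.
		 */
		if (is_guest_mode(vcpu) && max_irr_updated &&
		    nested_exit_on_intr(vcpu))
			kvm_vcpu_exiting_guest_mode(vcpu);
	} else {
		max_irr = kvm_lapic_find_highest_irr(vcpu);
	}
	vmx_hwapic_irr_update(vcpu, max_irr);
	return max_irr;
}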