Message ID | 1512461786-6465-4-git-send-email-liran.alon@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
2017-12-05 10:16+0200, Liran Alon: > Before each vmentry to guest, vcpu_enter_guest() calls sync_pir_to_irr() > which calls vmx_hwapic_irr_update() to update RVI. > Currently, vmx_hwapic_irr_update() contains a tweak in case it is called > when CPU is running L2 and L1 don't intercept external-interrupts. > In that case, code injects interrupt directly into L2 instead of > updating RVI. > > Besides being hacky (wouldn't expect function updating RVI to also > inject interrupt), it also doesn't handle this case correctly. > The code contains several issues: > 1. When code calls kvm_queue_interrupt() it just passes it max_irr which > represents the highest IRR currently pending in L1 LAPIC. > This is problematic as interrupt was injected to guest but it's bit is > still set in LAPIC IRR instead of being cleared from IRR and set in ISR. > 2. Code doesn't check if LAPIC PPR is set to accept an interrupt of > max_irr priority. It just checks if interrupts are enabled in guest with > vmx_interrupt_allowed(). Good catch. > To fix the above issues: > 1. Simplify vmx_hwapic_irr_update() to just update RVI. > Note that this shouldn't happen when CPU is running L2 > (See comment in code). > 2. Since now vmx_hwapic_irr_update() only does logic for L1 > virtual-interrupt-delivery, inject_pending_event() should be the > one responsible for injecting the interrupt directly into L2. > Therefore, change kvm_cpu_has_injectable_intr() to check L1 > LAPIC when CPU is running L2. > > This is enough to fix the issue since commit ("KVM: nVMX: Re-evaluate > L1 pending events when running L2 and L1 got posted-interrupt") > made vcpu_enter_guest() to set KVM_REQ_EVENT when L1 has a pending > injectable interrupt. Right, that answers my question from the first patch. > Fixes: 963fee165660 ("KVM: nVMX: Fix virtual interrupt delivery > injection") > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> > Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c > index 5c24811e8b0b..f171051eecf3 100644 > --- a/arch/x86/kvm/irq.c > +++ b/arch/x86/kvm/irq.c > @@ -79,7 +79,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v) > if (kvm_cpu_has_extint(v)) > return 1; > > - if (kvm_vcpu_apicv_active(v)) > + if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v)) > return 0; > > return kvm_apic_has_interrupt(v) != -1; /* LAPIC */ > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 47bbb8b691e8..bbc023fb9ef1 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -9002,30 +9002,18 @@ static void vmx_set_rvi(int vector) > > static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) > { > - if (!is_guest_mode(vcpu)) { > - vmx_set_rvi(max_irr); > - return; > - } > - > - if (max_irr == -1) > - return; > - > /* > - * In guest mode. If a vmexit is needed, vmx_check_nested_events > - * handles it. > + * When running L2, updating RVI is only relevant when > + * vmcs12 virtual-interrupt-delivery enabled. > + * However, it can be enabled only when L1 also > + * intercepts external-interrupts and in that case > + * we should not update vmcs02 RVI but instead intercept > + * interrupt. Therefore, do nothing when running L2. > */ > - if (nested_exit_on_intr(vcpu)) > + if ((max_irr == -1) || is_guest_mode(vcpu)) > return; The logic should remain if (!is_guest_mode(vcpu)) vmx_set_rvi(max_irr); because we had a bug where RVI was not cleared on reset and I don't think we fixed it elsewhere. 4114c27d450b ("KVM: x86: reset RVI upon system reset") > > - /* > - * Else, fall back to pre-APICv interrupt injection since L2 > - * is run without virtual interrupt delivery. > - */ > - if (!kvm_event_needs_reinjection(vcpu) && > - vmx_interrupt_allowed(vcpu)) { > - kvm_queue_interrupt(vcpu, max_irr, false); > - vmx_inject_irq(vcpu); > - } > + vmx_set_rvi(max_irr); > } > > static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) > @@ -9051,6 +9039,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) > * If we are running L2 and L1 has a new pending interrupt > * which can be injected, we should re-evaluate > * what should be done with this new L1 interrupt. > + * If L1 intercepts external-interrupts, we should > + * exit from L2 to L1. Otherwise, interrupt should be > + * delivered directly to L2. > + * These cases are handled correctly by KVM_REQ_EVENT. > */ > if (is_guest_mode(vcpu) && (max_irr > prev_max_irr)) > kvm_make_request(KVM_REQ_EVENT, vcpu); KVM_REQ_EVENT is not needed when we are intercepting the interrupt, because we just want to run vmx_check_nested_events(), but it is needed when delivering L1's interrupt inside L2. When vmcs12 doesn't intercept external interrupts, I'm thinking we could pass virtual interrupts from vmcs01 and use RVI normally, as well as delivering L1 posted interrupts to L2 without a VM exit.
On 06/12/17 22:20, Radim Krčmář wrote: > 2017-12-05 10:16+0200, Liran Alon: >> Before each vmentry to guest, vcpu_enter_guest() calls sync_pir_to_irr() >> which calls vmx_hwapic_irr_update() to update RVI. >> Currently, vmx_hwapic_irr_update() contains a tweak in case it is called >> when CPU is running L2 and L1 don't intercept external-interrupts. >> In that case, code injects interrupt directly into L2 instead of >> updating RVI. >> >> Besides being hacky (wouldn't expect function updating RVI to also >> inject interrupt), it also doesn't handle this case correctly. >> The code contains several issues: >> 1. When code calls kvm_queue_interrupt() it just passes it max_irr which >> represents the highest IRR currently pending in L1 LAPIC. >> This is problematic as interrupt was injected to guest but it's bit is >> still set in LAPIC IRR instead of being cleared from IRR and set in ISR. >> 2. Code doesn't check if LAPIC PPR is set to accept an interrupt of >> max_irr priority. It just checks if interrupts are enabled in guest with >> vmx_interrupt_allowed(). > > Good catch. > >> To fix the above issues: >> 1. Simplify vmx_hwapic_irr_update() to just update RVI. >> Note that this shouldn't happen when CPU is running L2 >> (See comment in code). >> 2. Since now vmx_hwapic_irr_update() only does logic for L1 >> virtual-interrupt-delivery, inject_pending_event() should be the >> one responsible for injecting the interrupt directly into L2. >> Therefore, change kvm_cpu_has_injectable_intr() to check L1 >> LAPIC when CPU is running L2. >> >> This is enough to fix the issue since commit ("KVM: nVMX: Re-evaluate >> L1 pending events when running L2 and L1 got posted-interrupt") >> made vcpu_enter_guest() to set KVM_REQ_EVENT when L1 has a pending >> injectable interrupt. > > Right, that answers my question from the first patch. > >> Fixes: 963fee165660 ("KVM: nVMX: Fix virtual interrupt delivery >> injection") >> >> Signed-off-by: Liran Alon <liran.alon@oracle.com> >> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> >> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> --- >> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c >> index 5c24811e8b0b..f171051eecf3 100644 >> --- a/arch/x86/kvm/irq.c >> +++ b/arch/x86/kvm/irq.c >> @@ -79,7 +79,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v) >> if (kvm_cpu_has_extint(v)) >> return 1; >> >> - if (kvm_vcpu_apicv_active(v)) >> + if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v)) >> return 0; >> >> return kvm_apic_has_interrupt(v) != -1; /* LAPIC */ >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index 47bbb8b691e8..bbc023fb9ef1 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -9002,30 +9002,18 @@ static void vmx_set_rvi(int vector) >> >> static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) >> { >> - if (!is_guest_mode(vcpu)) { >> - vmx_set_rvi(max_irr); >> - return; >> - } >> - >> - if (max_irr == -1) >> - return; >> - >> /* >> - * In guest mode. If a vmexit is needed, vmx_check_nested_events >> - * handles it. >> + * When running L2, updating RVI is only relevant when >> + * vmcs12 virtual-interrupt-delivery enabled. >> + * However, it can be enabled only when L1 also >> + * intercepts external-interrupts and in that case >> + * we should not update vmcs02 RVI but instead intercept >> + * interrupt. Therefore, do nothing when running L2. >> */ >> - if (nested_exit_on_intr(vcpu)) >> + if ((max_irr == -1) || is_guest_mode(vcpu)) >> return; > > The logic should remain > > if (!is_guest_mode(vcpu)) > vmx_set_rvi(max_irr); > > because we had a bug where RVI was not cleared on reset and I don't > think we fixed it elsewhere. 4114c27d450b ("KVM: x86: reset RVI upon > system reset") I see. You are right. I wasn't aware of that commit. Will change. > >> >> - /* >> - * Else, fall back to pre-APICv interrupt injection since L2 >> - * is run without virtual interrupt delivery. >> - */ >> - if (!kvm_event_needs_reinjection(vcpu) && >> - vmx_interrupt_allowed(vcpu)) { >> - kvm_queue_interrupt(vcpu, max_irr, false); >> - vmx_inject_irq(vcpu); >> - } >> + vmx_set_rvi(max_irr); >> } >> >> static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) >> @@ -9051,6 +9039,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) >> * If we are running L2 and L1 has a new pending interrupt >> * which can be injected, we should re-evaluate >> * what should be done with this new L1 interrupt. >> + * If L1 intercepts external-interrupts, we should >> + * exit from L2 to L1. Otherwise, interrupt should be >> + * delivered directly to L2. >> + * These cases are handled correctly by KVM_REQ_EVENT. >> */ >> if (is_guest_mode(vcpu) && (max_irr > prev_max_irr)) >> kvm_make_request(KVM_REQ_EVENT, vcpu); > > > KVM_REQ_EVENT is not needed when we are intercepting the interrupt, > because we just want to run vmx_check_nested_events(), but it is needed > when delivering L1's interrupt inside L2. OK. If vmcs12 intercept external interrupts, I will just call kvm_vcpu_exiting_guest_mode(). Otherwise, I will use KVM_REQ_EVENT. > > When vmcs12 doesn't intercept external interrupts, I'm thinking we could > pass virtual interrupts from vmcs01 and use RVI normally, as well as > delivering L1 posted interrupts to L2 without a VM exit. > Maybe I am missing something but I am not sure this is possible. As the comment I added to vmx_hwapic_irr_update() specifies, virtual-interrupt-delivery cannot be enabled if exit on external-interrupts is disabled. This can be seen in the following code in nested_vmx_check_apicv_controls(): /* * If virtual interrupt delivery is enabled, * we must exit on external interrupts. */ if (nested_cpu_has_vid(vmcs12) && !nested_exit_on_intr(vcpu)) return -EINVAL; Therefore, in case vmcs12 disables exit on external-interrupts then vmcs12 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY must be cleared. As prepare_vmcs02() currently takes value of SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY as-is from vmcs12 to vmcs02, it means vmcs02 doesn't have virtual-interrupt-delivery in this case. Therefore, in this case, we should not update vmcs02 RVI. Regards, -Liran
2017-12-07 13:19+0200, Liran Alon: > On 06/12/17 22:20, Radim Krčmář wrote: > > 2017-12-05 10:16+0200, Liran Alon: > > > Before each vmentry to guest, vcpu_enter_guest() calls sync_pir_to_irr() > > > which calls vmx_hwapic_irr_update() to update RVI. > > > Currently, vmx_hwapic_irr_update() contains a tweak in case it is called > > > when CPU is running L2 and L1 don't intercept external-interrupts. > > > In that case, code injects interrupt directly into L2 instead of > > > updating RVI. > > > > > > Besides being hacky (wouldn't expect function updating RVI to also > > > inject interrupt), it also doesn't handle this case correctly. > > > The code contains several issues: > > > 1. When code calls kvm_queue_interrupt() it just passes it max_irr which > > > represents the highest IRR currently pending in L1 LAPIC. > > > This is problematic as interrupt was injected to guest but it's bit is > > > still set in LAPIC IRR instead of being cleared from IRR and set in ISR. > > > 2. Code doesn't check if LAPIC PPR is set to accept an interrupt of > > > max_irr priority. It just checks if interrupts are enabled in guest with > > > vmx_interrupt_allowed(). > > > > Good catch. > > > > > To fix the above issues: > > > 1. Simplify vmx_hwapic_irr_update() to just update RVI. > > > Note that this shouldn't happen when CPU is running L2 > > > (See comment in code). > > > 2. Since now vmx_hwapic_irr_update() only does logic for L1 > > > virtual-interrupt-delivery, inject_pending_event() should be the > > > one responsible for injecting the interrupt directly into L2. > > > Therefore, change kvm_cpu_has_injectable_intr() to check L1 > > > LAPIC when CPU is running L2. > > > > > > This is enough to fix the issue since commit ("KVM: nVMX: Re-evaluate > > > L1 pending events when running L2 and L1 got posted-interrupt") > > > made vcpu_enter_guest() to set KVM_REQ_EVENT when L1 has a pending > > > injectable interrupt. > > > > Right, that answers my question from the first patch. > > > > > Fixes: 963fee165660 ("KVM: nVMX: Fix virtual interrupt delivery > > > injection") > > > > > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > > > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > > > Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> > > > Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > --- > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > > @@ -9051,6 +9039,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) > > > * If we are running L2 and L1 has a new pending interrupt > > > * which can be injected, we should re-evaluate > > > * what should be done with this new L1 interrupt. > > > + * If L1 intercepts external-interrupts, we should > > > + * exit from L2 to L1. Otherwise, interrupt should be > > > + * delivered directly to L2. > > > + * These cases are handled correctly by KVM_REQ_EVENT. > > > */ > > > if (is_guest_mode(vcpu) && (max_irr > prev_max_irr)) > > > kvm_make_request(KVM_REQ_EVENT, vcpu); > > > > > > KVM_REQ_EVENT is not needed when we are intercepting the interrupt, > > because we just want to run vmx_check_nested_events(), but it is needed > > when delivering L1's interrupt inside L2. > > OK. If vmcs12 intercept external interrupts, I will just call > kvm_vcpu_exiting_guest_mode(). Otherwise, I will use KVM_REQ_EVENT. > > > > > When vmcs12 doesn't intercept external interrupts, I'm thinking we could > > pass virtual interrupts from vmcs01 and use RVI normally, as well as > > delivering L1 posted interrupts to L2 without a VM exit. > > > > Maybe I am missing something but I am not sure this is possible. > > As the comment I added to vmx_hwapic_irr_update() specifies, > virtual-interrupt-delivery cannot be enabled if exit on external-interrupts > is disabled. This can be seen in the following code in > nested_vmx_check_apicv_controls(): > /* > * If virtual interrupt delivery is enabled, > * we must exit on external interrupts. > */ > if (nested_cpu_has_vid(vmcs12) && > !nested_exit_on_intr(vcpu)) > return -EINVAL; > > Therefore, in case vmcs12 disables exit on > external-interrupts then vmcs12 SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY must be > cleared. As prepare_vmcs02() currently takes value of > SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY as-is from vmcs12 to vmcs02, it means > vmcs02 doesn't have virtual-interrupt-delivery in this case. Therefore, in > this case, we should not update vmcs02 RVI. Right, vmcs12 can't configure it, but we already have PIN_BASED_EXT_INTR_MASK in vmcs02, so we'd also enable VIRTUAL_INTR_DELIVERY in vmcs02 if we detect that vmcs12 disables PIN_BASED_EXT_INTR_MASK. And because vmcs12 can't use it, we're free to configure vmcs02 with data from vmcs01, so interrupt delivery would behave as if there was no nesting. (Dislaimer: I haven't verified that it would actually work.) For now, I think that throwing KVM_REQ_EVENT in this case is perfectly acceptable and we can optimize later if we notice that it is being widely used. (Windows with device guard come to mind.) Thanks.
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index 5c24811e8b0b..f171051eecf3 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -79,7 +79,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v) if (kvm_cpu_has_extint(v)) return 1; - if (kvm_vcpu_apicv_active(v)) + if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v)) return 0; return kvm_apic_has_interrupt(v) != -1; /* LAPIC */ diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 47bbb8b691e8..bbc023fb9ef1 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -9002,30 +9002,18 @@ static void vmx_set_rvi(int vector) static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) { - if (!is_guest_mode(vcpu)) { - vmx_set_rvi(max_irr); - return; - } - - if (max_irr == -1) - return; - /* - * In guest mode. If a vmexit is needed, vmx_check_nested_events - * handles it. + * When running L2, updating RVI is only relevant when + * vmcs12 virtual-interrupt-delivery enabled. + * However, it can be enabled only when L1 also + * intercepts external-interrupts and in that case + * we should not update vmcs02 RVI but instead intercept + * interrupt. Therefore, do nothing when running L2. */ - if (nested_exit_on_intr(vcpu)) + if ((max_irr == -1) || is_guest_mode(vcpu)) return; - /* - * Else, fall back to pre-APICv interrupt injection since L2 - * is run without virtual interrupt delivery. - */ - if (!kvm_event_needs_reinjection(vcpu) && - vmx_interrupt_allowed(vcpu)) { - kvm_queue_interrupt(vcpu, max_irr, false); - vmx_inject_irq(vcpu); - } + vmx_set_rvi(max_irr); } static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) @@ -9051,6 +9039,10 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) * If we are running L2 and L1 has a new pending interrupt * which can be injected, we should re-evaluate * what should be done with this new L1 interrupt. + * If L1 intercepts external-interrupts, we should + * exit from L2 to L1. Otherwise, interrupt should be + * delivered directly to L2. + * These cases are handled correctly by KVM_REQ_EVENT. */ if (is_guest_mode(vcpu) && (max_irr > prev_max_irr)) kvm_make_request(KVM_REQ_EVENT, vcpu);