Message ID | 20191106175602.4515-3-joao.m.martins@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: VMX: Posted Interrupts fixes | expand |
On 11/6/19 5:56 PM, Joao Martins wrote: > When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU > linked list of all vCPUs that are blocked on this pCPU. Afterwards, it > changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler > (wakeup_handler()) is responsible to kick (unblock) any vCPU on that > linked list that now has pending posted interrupts. > > While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which > will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be > scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear > PID.SN but will also *overwrite PID.NDST to this different pCPU*. > Instead of keeping it with original pCPU which vCPU had entered block > phase on. > > This results in an issue because when a posted interrupt is delivered, > the wakeup_handler() will be executed and fail to find blocked vCPU on > its per pCPU linked list of all vCPUs that are blocked on this pCPU. > Which is due to the vCPU being placed on a *different* per pCPU > linked list than the original pCPU that it had entered block phase. > > The regression is introduced by commit c112b5f50232 ("KVM: x86: > Recompute PID.ON when clearing PID.SN"). Therefore, partially revert > it and reintroduce the condition in vmx_vcpu_pi_load() responsible for > avoiding changing PID.NDST when loading a blocked vCPU. > > Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > Signed-off-by: Liran Alon <liran.alon@oracle.com> Sorry, I missed a Tb tag: Tested-by: Nathan Ni <nathan.ni@oracle.com> I can resend it, if preferred. > --- > arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++ > arch/x86/kvm/vmx/vmx.h | 6 ++++++ > 2 files changed, 20 insertions(+) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 18b0bee662a5..75d903455e1c 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) > if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu) > return; > > + /* > + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change > + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the > + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that > + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up > + * correctly. > + */ > + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) { > + pi_clear_sn(pi_desc); > + goto after_clear_sn; > + } > + > /* The full case. */ > do { > old.control = new.control = pi_desc->control; > @@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) > } while (cmpxchg64(&pi_desc->control, old.control, > new.control) != old.control); > > +after_clear_sn: > + > /* > * Clear SN before reading the bitmap. The VT-d firmware > * writes the bitmap and reads SN atomically (5.2.3 in the > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h > index bee16687dc0b..1e32ab54fc2d 100644 > --- a/arch/x86/kvm/vmx/vmx.h > +++ b/arch/x86/kvm/vmx/vmx.h > @@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc) > (unsigned long *)&pi_desc->control); > } > > +static inline void pi_clear_sn(struct pi_desc *pi_desc) > +{ > + clear_bit(POSTED_INTR_SN, > + (unsigned long *)&pi_desc->control); > +} > + > static inline int pi_test_on(struct pi_desc *pi_desc) > { > return test_bit(POSTED_INTR_ON, >
On 06/11/19 18:56, Joao Martins wrote: > When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU > linked list of all vCPUs that are blocked on this pCPU. Afterwards, it > changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler > (wakeup_handler()) is responsible to kick (unblock) any vCPU on that > linked list that now has pending posted interrupts. > > While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which > will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be > scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear > PID.SN but will also *overwrite PID.NDST to this different pCPU*. > Instead of keeping it with original pCPU which vCPU had entered block > phase on. > > This results in an issue because when a posted interrupt is delivered, > the wakeup_handler() will be executed and fail to find blocked vCPU on > its per pCPU linked list of all vCPUs that are blocked on this pCPU. > Which is due to the vCPU being placed on a *different* per pCPU > linked list than the original pCPU that it had entered block phase. > > The regression is introduced by commit c112b5f50232 ("KVM: x86: > Recompute PID.ON when clearing PID.SN"). Therefore, partially revert > it and reintroduce the condition in vmx_vcpu_pi_load() responsible for > avoiding changing PID.NDST when loading a blocked vCPU. > > Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > Signed-off-by: Liran Alon <liran.alon@oracle.com> Something wrong in the SoB line? Otherwise looks good. Paolo > --- > arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++ > arch/x86/kvm/vmx/vmx.h | 6 ++++++ > 2 files changed, 20 insertions(+) > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 18b0bee662a5..75d903455e1c 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) > if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu) > return; > > + /* > + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change > + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the > + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that > + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up > + * correctly. > + */ > + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) { > + pi_clear_sn(pi_desc); > + goto after_clear_sn; > + } > + > /* The full case. */ > do { > old.control = new.control = pi_desc->control; > @@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) > } while (cmpxchg64(&pi_desc->control, old.control, > new.control) != old.control); > > +after_clear_sn: > + > /* > * Clear SN before reading the bitmap. The VT-d firmware > * writes the bitmap and reads SN atomically (5.2.3 in the > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h > index bee16687dc0b..1e32ab54fc2d 100644 > --- a/arch/x86/kvm/vmx/vmx.h > +++ b/arch/x86/kvm/vmx/vmx.h > @@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc) > (unsigned long *)&pi_desc->control); > } > > +static inline void pi_clear_sn(struct pi_desc *pi_desc) > +{ > + clear_bit(POSTED_INTR_SN, > + (unsigned long *)&pi_desc->control); > +} > + > static inline int pi_test_on(struct pi_desc *pi_desc) > { > return test_bit(POSTED_INTR_ON, >
On 11/11/19 2:39 PM, Paolo Bonzini wrote: > On 06/11/19 18:56, Joao Martins wrote: >> When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU >> linked list of all vCPUs that are blocked on this pCPU. Afterwards, it >> changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler >> (wakeup_handler()) is responsible to kick (unblock) any vCPU on that >> linked list that now has pending posted interrupts. >> >> While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which >> will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be >> scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear >> PID.SN but will also *overwrite PID.NDST to this different pCPU*. >> Instead of keeping it with original pCPU which vCPU had entered block >> phase on. >> >> This results in an issue because when a posted interrupt is delivered, >> the wakeup_handler() will be executed and fail to find blocked vCPU on >> its per pCPU linked list of all vCPUs that are blocked on this pCPU. >> Which is due to the vCPU being placed on a *different* per pCPU >> linked list than the original pCPU that it had entered block phase. >> >> The regression is introduced by commit c112b5f50232 ("KVM: x86: >> Recompute PID.ON when clearing PID.SN"). Therefore, partially revert >> it and reintroduce the condition in vmx_vcpu_pi_load() responsible for >> avoiding changing PID.NDST when loading a blocked vCPU. >> >> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >> Signed-off-by: Liran Alon <liran.alon@oracle.com> > > Something wrong in the SoB line? > I can't spot any mistake; at least it looks chained correctly for me. What's the issue you see with the Sob line? The only thing I forgot was a: Tested-by: Nathan Ni <nathan.ni@oracle.com> > Otherwise looks good. If you want I can resubmit the series with the Tb and Jag Rb, unless you think it's OK doing on commit? Otherwise I can resubmit. Joao
On 11/11/19 15:48, Joao Martins wrote: >>> >>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") >>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >> Something wrong in the SoB line? >> > I can't spot any mistake; at least it looks chained correctly for me. What's the > issue you see with the Sob line? Liran's line after yours is confusing. Did he help with the analysis or anything like that? Paolo > The only thing I forgot was a: > > Tested-by: Nathan Ni <nathan.ni@oracle.com> > >> Otherwise looks good. > If you want I can resubmit the series with the Tb and Jag Rb, unless you think > it's OK doing on commit? Otherwise I can resubmit.
On 11/11/19 2:50 PM, Paolo Bonzini wrote: > On 11/11/19 15:48, Joao Martins wrote: >>>> >>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") >>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>> Something wrong in the SoB line? >>> >> I can't spot any mistake; at least it looks chained correctly for me. What's the >> issue you see with the Sob line? > > Liran's line after yours is confusing. Did he help with the analysis or > anything like that? > He was initially reviewing my patches, but then helped improving the problem description in the commit messages so felt correct to give credit. Joao
> On 11 Nov 2019, at 16:56, Joao Martins <joao.m.martins@oracle.com> wrote: > > On 11/11/19 2:50 PM, Paolo Bonzini wrote: >> On 11/11/19 15:48, Joao Martins wrote: >>>>> >>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") >>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>>> Something wrong in the SoB line? >>>> >>> I can't spot any mistake; at least it looks chained correctly for me. What's the >>> issue you see with the Sob line? >> >> Liran's line after yours is confusing. Did he help with the analysis or >> anything like that? >> > He was initially reviewing my patches, but then helped improving the problem > description in the commit messages so felt correct to give credit. > > Joao I think proper action is to just remove me from the SoB line. -Liran
On Mon, Nov 11, 2019 at 04:59:20PM +0200, Liran Alon wrote: > > > > On 11 Nov 2019, at 16:56, Joao Martins <joao.m.martins@oracle.com> wrote: > > > > On 11/11/19 2:50 PM, Paolo Bonzini wrote: > >> On 11/11/19 15:48, Joao Martins wrote: > >>>>> > >>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") > >>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > >>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> > >>>> Something wrong in the SoB line? > >>>> > >>> I can't spot any mistake; at least it looks chained correctly for me. What's the > >>> issue you see with the Sob line? > >> > >> Liran's line after yours is confusing. Did he help with the analysis or > >> anything like that? > >> > > He was initially reviewing my patches, but then helped improving the problem > > description in the commit messages so felt correct to give credit. > > > > Joao > > I think proper action is to just remove me from the SoB line. Use Co-developed-by to attribute multiple authors. Note, the SoB ordering should show the chronological history of the patch when possible, e.g. the person sending the patch should always have their SoB last. Documentation/process/submitting-patches.rst and Documentation/process/5.Posting.rst have more details.
On 11/11/19 3:53 PM, Sean Christopherson wrote: > On Mon, Nov 11, 2019 at 04:59:20PM +0200, Liran Alon wrote: >> >> >>> On 11 Nov 2019, at 16:56, Joao Martins <joao.m.martins@oracle.com> wrote: >>> >>> On 11/11/19 2:50 PM, Paolo Bonzini wrote: >>>> On 11/11/19 15:48, Joao Martins wrote: >>>>>>> >>>>>>> Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN") >>>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >>>>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>>>>> Something wrong in the SoB line? >>>>>> >>>>> I can't spot any mistake; at least it looks chained correctly for me. What's the >>>>> issue you see with the Sob line? >>>> >>>> Liran's line after yours is confusing. Did he help with the analysis or >>>> anything like that? >>>> >>> He was initially reviewing my patches, but then helped improving the problem >>> description in the commit messages so felt correct to give credit. >>> >>> Joao >> >> I think proper action is to just remove me from the SoB line. > > Use Co-developed-by to attribute multiple authors. Note, the SoB ordering > should show the chronological history of the patch when possible, e.g. the > person sending the patch should always have their SoB last. > > Documentation/process/submitting-patches.rst and > Documentation/process/5.Posting.rst have more details. > The Sob chain on the first two patches were broken (regardless of any use of Co-developed-by). Fixed it up on v2, alongside the rest of the comments. Cheers, Joao
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 18b0bee662a5..75d903455e1c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1274,6 +1274,18 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu) return; + /* + * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change + * PI.NDST: pi_post_block is the one expected to change PID.NDST and the + * wakeup handler expects the vCPU to be on the blocked_vcpu_list that + * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up + * correctly. + */ + if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) { + pi_clear_sn(pi_desc); + goto after_clear_sn; + } + /* The full case. */ do { old.control = new.control = pi_desc->control; @@ -1289,6 +1301,8 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) } while (cmpxchg64(&pi_desc->control, old.control, new.control) != old.control); +after_clear_sn: + /* * Clear SN before reading the bitmap. The VT-d firmware * writes the bitmap and reads SN atomically (5.2.3 in the diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index bee16687dc0b..1e32ab54fc2d 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -373,6 +373,12 @@ static inline void pi_clear_on(struct pi_desc *pi_desc) (unsigned long *)&pi_desc->control); } +static inline void pi_clear_sn(struct pi_desc *pi_desc) +{ + clear_bit(POSTED_INTR_SN, + (unsigned long *)&pi_desc->control); +} + static inline int pi_test_on(struct pi_desc *pi_desc) { return test_bit(POSTED_INTR_ON,