kvm: lapic: fix broken vcpu hotplug

Message ID 20200622160830.426022-1-imammedo@redhat.com (mailing list archive)
State New, archived

Commit Message

Igor Mammedov June 22, 2020, 4:08 p.m. UTC
The guest fails to online a hotplugged CPU with the error
  smpboot: do_boot_cpu failed(-1) to wakeup CPU#4

This happens because kvm_apic_set_state(), which used to call
recalculate_apic_map() unconditionally and thereby pulled the hotplugged
CPU into the apic map, now updates the map only on a state change [1].
No state change happens in this case, so the apic map update is skipped.

Note:
During kvm_arch_vcpu_create() the new CPU is not yet visible to
kvm_recalculate_apic_map(), so all related update calls end up
as NOPs; only the follow-up kvm_apic_set_state() used to trigger a map
update that took the hotplugged CPU into account.

Fix the issue by forcing an unconditional update from
kvm_apic_set_state(), as it used to be.

[1]
Fixes: 4abaffce4d25a ("KVM: LAPIC: Recalculate apic map in batch")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
PS:
This is an alternative to the full revert of [1] that I posted earlier:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2205600.html
so feel free to pick whichever is better by now.
---
 arch/x86/kvm/lapic.c | 1 +
 1 file changed, 1 insertion(+)
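
For illustration only, the failure mode described in the commit message can be modelled in user space. This is a simplified sketch, not the kernel code: `struct toy_kvm`, `toy_recalculate_apic_map()`, `toy_apic_set_state()` and `toy_hotplug()` are hypothetical names, and the `force_dirty` parameter is an assumption standing in for the one-line fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS 8

/* Hypothetical stand-in for per-VM state; not the kernel's struct kvm. */
struct toy_kvm {
    bool vcpu_visible[MAX_VCPUS]; /* vCPU can be seen by a map rebuild */
    bool in_map[MAX_VCPUS];       /* vCPU is present in the toy "apic map" */
    bool apic_map_dirty;          /* a rebuild has been requested */
};

/* Rebuild the map only if it is marked dirty; returns true if rebuilt. */
static bool toy_recalculate_apic_map(struct toy_kvm *kvm)
{
    if (!kvm->apic_map_dirty)
        return false; /* nothing requested a rebuild: skip */
    for (size_t i = 0; i < MAX_VCPUS; i++)
        kvm->in_map[i] = kvm->vcpu_visible[i];
    kvm->apic_map_dirty = false;
    return true;
}

/*
 * Toy set-state path. state_changed models "a register write changed
 * map-relevant state"; force_dirty models marking the map dirty
 * unconditionally, as the patch does.
 */
static void toy_apic_set_state(struct toy_kvm *kvm, bool state_changed,
                               bool force_dirty)
{
    if (state_changed || force_dirty)
        kvm->apic_map_dirty = true;
    toy_recalculate_apic_map(kvm);
}

/* Toy hotplug sequence; returns whether vCPU `id` ends up in the map. */
static bool toy_hotplug(struct toy_kvm *kvm, size_t id, bool force_dirty)
{
    assert(id < MAX_VCPUS);

    /* Creation-time rebuild: the new vCPU is not visible yet. */
    kvm->apic_map_dirty = true;
    toy_recalculate_apic_map(kvm);

    /* The vCPU becomes visible only after creation completes. */
    kvm->vcpu_visible[id] = true;

    /* Follow-up set-state call that changes no map-relevant state. */
    toy_apic_set_state(kvm, false, force_dirty);
    return kvm->in_map[id];
}
```

In this model, calling `toy_hotplug()` on a zero-initialized `struct toy_kvm` with `force_dirty` set to false leaves the new vCPU out of the map, while setting it to true pulls the vCPU in.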

Comments

Paolo Bonzini June 22, 2020, 4:47 p.m. UTC | #1
On 22/06/20 18:08, Igor Mammedov wrote:
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 34a7e0533dad..5696831d4005 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2556,6 +2556,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>  	int r;
>  
> +	apic->vcpu->kvm->arch.apic_map_dirty = true;
>  	kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
>  	/* set SPIV separately to get count of SW disabled APICs right */
>  	apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
> 

Queued, but it's better to set apic_map_dirty just before the call to
kvm_recalculate_apic_map, or you can have a variant of the race that you
pointed out.

Paolo

Igor Mammedov June 23, 2020, 11:13 a.m. UTC | #2
On Mon, 22 Jun 2020 18:47:57 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 22/06/20 18:08, Igor Mammedov wrote:
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 34a7e0533dad..5696831d4005 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -2556,6 +2556,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
> >  	struct kvm_lapic *apic = vcpu->arch.apic;
> >  	int r;
> >  
> > +	apic->vcpu->kvm->arch.apic_map_dirty = true;
> >  	kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
> >  	/* set SPIV separately to get count of SW disabled APICs right */
> >  	apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
> >   
> 
> Queued, but it's better to set apic_map_dirty just before the call to
> kvm_recalculate_apic_map, or you can have a variant of the race that you
> pointed out.

Here I was also worried about the failure path, which sits just before
the normal kvm_recalculate_apic_map() call and has its own
kvm_recalculate_apic_map() call,
but I'm not sure whether we should force a map update in that case.

> 
> Paolo
>

Paolo Bonzini June 23, 2020, 11:34 a.m. UTC | #3
On 23/06/20 13:13, Igor Mammedov wrote:
>>> +	apic->vcpu->kvm->arch.apic_map_dirty = true;
>>>  	kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
>>>  	/* set SPIV separately to get count of SW disabled APICs right */
>>>  	apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
>>>   
>> Queued, but it's better to set apic_map_dirty just before the call to
>> kvm_recalculate_apic_map, or you can have a variant of the race that you
>> pointed out.
> Here I was also worried about the failure path, which sits just before
> the normal kvm_recalculate_apic_map() call and has its own
> kvm_recalculate_apic_map() call,
> but I'm not sure whether we should force a map update in that case.
> 
> 

In that case kvm_lapic_set_base and apic_set_spiv will take care of it
(and if kvm_apic_state_fixup writes LDR, it succeeds and you go down
the other path).

Paolo
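
For illustration only, the ordering concern raised in this thread (where the dirty flag is set relative to the final kvm_recalculate_apic_map() call) can be sketched with a single-threaded model of one possible interleaving. It does not reproduce the real concurrency between vCPU threads, and all names here (`struct toy_map_state`, `toy_recalc()`, `toy_set_state()`, `enum toy_dirty_point`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one register value feeds a one-entry "map". */
struct toy_map_state {
    int  reg_value;    /* stand-in for APIC register state */
    int  mapped_value; /* what the most recent rebuild captured */
    bool dirty;        /* a rebuild has been requested */
};

/* Rebuild only when dirty, then clear the flag. */
static void toy_recalc(struct toy_map_state *s)
{
    if (!s->dirty)
        return;
    s->mapped_value = s->reg_value;
    s->dirty = false;
}

enum toy_dirty_point {
    DIRTY_AT_ENTRY,      /* flag set at the top of set-state */
    DIRTY_BEFORE_RECALC  /* flag set right before the final rebuild */
};

/*
 * Toy set-state. other_recalc_in_between simulates another caller
 * rebuilding the map after entry but before the new register state
 * has been copied in. Returns true if the map matches the registers
 * when the function finishes.
 */
static bool toy_set_state(struct toy_map_state *s, int new_value,
                          enum toy_dirty_point when,
                          bool other_recalc_in_between)
{
    if (when == DIRTY_AT_ENTRY)
        s->dirty = true;

    if (other_recalc_in_between)
        toy_recalc(s); /* sees the old register value */

    s->reg_value = new_value; /* new state copied in */

    if (when == DIRTY_BEFORE_RECALC)
        s->dirty = true;

    toy_recalc(s); /* final rebuild; skipped if the flag was cleared */
    return s->mapped_value == s->reg_value;
}
```

In this model, setting the flag at entry leaves a stale map whenever the simulated intervening rebuild consumes the flag early, whereas setting it immediately before the final rebuild does not depend on what happened in between.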

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 34a7e0533dad..5696831d4005 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2556,6 +2556,7 @@  int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	int r;
 
+	apic->vcpu->kvm->arch.apic_map_dirty = true;
 	kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
 	/* set SPIV separately to get count of SW disabled APICs right */
 	apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));