diff mbox

[v3,09/19] pc: delay setting number of boot CPUs to machine_done time

Message ID 1467786055-85835-10-git-send-email-imammedo@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Igor Mammedov July 6, 2016, 6:20 a.m. UTC
currently present CPUs counter in CMOS only contains
smp_cpus (i.e. initial CPUs specified with -smp X) and
doesn't account for CPUs created with -device.
If VM is started with additional CPUs added with
 -device, it will hang in BIOS waiting for condition
   smp_cpus == counted_cpus
forever as counted_cpus will include -device CPUs as well
and be more than smp_cpus.

make present CPUs counter in CMOS to count all CPUs
(initial and coldplugged with -device) by delaying
it to machine done time when it possible to count
CPUs added with -device.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Eduardo Habkost July 12, 2016, 3:29 a.m. UTC | #1
On Wed, Jul 06, 2016 at 08:20:45AM +0200, Igor Mammedov wrote:
> currently present CPUs counter in CMOS only contains
> smp_cpus (i.e. initial CPUs specified with -smp X) and
> doesn't account for CPUs created with -device.
> If VM is started with additional CPUs added with
>  -device, it will hang in BIOS waiting for condition
>    smp_cpus == counted_cpus
> forever as counted_cpus will include -device CPUs as well
> and be more than smp_cpus.
> 
> make present CPUs counter in CMOS to count all CPUs
> (initial and coldplugged with -device) by delaying
> it to machine done time when it possible to count
> CPUs added with -device.

Do you plan to fix the remaining code using smp_cpus? e.g.:

1) x86_cpu_realizefn():
    if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
            x86_cpu_apic_create(cpu, &local_err);
    [...]
  (Maybe we should simply make realizefn fail if we try to create a second CPU
  using -device or device_add without CPUID_APIC)

2) the smp_cpus checks at hw/i386/kvmvapic.c.

> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/i386/pc.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 6691825..3206572 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -471,9 +471,6 @@ void pc_cmos_init(PCMachineState *pcms,
>      rtc_set_memory(s, 0x5c, val >> 8);
>      rtc_set_memory(s, 0x5d, val >> 16);
>  
> -    /* set the number of CPU */
> -    rtc_set_memory(s, 0x5f, smp_cpus - 1);
> -
>      object_property_add_link(OBJECT(pcms), "rtc_state",
>                               TYPE_ISA_DEVICE,
>                               (Object **)&pcms->rtc,
> @@ -1157,10 +1154,19 @@ void pc_cpus_init(PCMachineState *pcms)
>  static
>  void pc_machine_done(Notifier *notifier, void *data)
>  {
> +    int i, boot_cpus = 0;
>      PCMachineState *pcms = container_of(notifier,
>                                          PCMachineState, machine_done);
>      PCIBus *bus = pcms->bus;
>  
> +    for (i = 0; i < pcms->possible_cpus->len; i++) {
> +        if (pcms->possible_cpus->cpus[i].cpu) {
> +            boot_cpus++;
> +        }
> +    }

Any specific reason you chose to check possible_cpus instead of the
arch-independent CPU list from exec.c? I believe other architectures will be
interested in a generic way to count online CPUs instead of using smp_cpus.

> +    /* set the number of CPUs */
> +    rtc_set_memory(pcms->rtc, 0x5f, boot_cpus - 1);
> +
>      if (bus) {
>          int extra_hosts = 0;
>  
> -- 
> 2.7.0
>
Igor Mammedov July 12, 2016, 12:48 p.m. UTC | #2
On Tue, 12 Jul 2016 00:29:08 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, Jul 06, 2016 at 08:20:45AM +0200, Igor Mammedov wrote:
> > currently present CPUs counter in CMOS only contains
> > smp_cpus (i.e. initial CPUs specified with -smp X) and
> > doesn't account for CPUs created with -device.
> > If VM is started with additional CPUs added with
> >  -device, it will hang in BIOS waiting for condition
> >    smp_cpus == counted_cpus
> > forever as counted_cpus will include -device CPUs as well
> > and be more than smp_cpus.
> > 
> > make present CPUs counter in CMOS to count all CPUs
> > (initial and coldplugged with -device) by delaying
> > it to machine done time when it possible to count
> > CPUs added with -device.  
> 
> Do you plan to fix the remaining code using smp_cpus? e.g.:
perhaps after Drew's -smp refactoring or maybe during it,
it looks like good candidate for that series,
I'll work with Drew on that.

> 
> 1) x86_cpu_realizefn():
>     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
>             x86_cpu_apic_create(cpu, &local_err);
>     [...]
>   (Maybe we should simply make realizefn fail if we try to create a second CPU
>   using -device or device_add without CPUID_APIC)
wouldn't that break some setups that doing it but still able
to boot?

> 
> 2) the smp_cpus checks at hw/i386/kvmvapic.c.
> 
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> >  hw/i386/pc.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 6691825..3206572 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -471,9 +471,6 @@ void pc_cmos_init(PCMachineState *pcms,
> >      rtc_set_memory(s, 0x5c, val >> 8);
> >      rtc_set_memory(s, 0x5d, val >> 16);
> >  
> > -    /* set the number of CPU */
> > -    rtc_set_memory(s, 0x5f, smp_cpus - 1);
> > -
> >      object_property_add_link(OBJECT(pcms), "rtc_state",
> >                               TYPE_ISA_DEVICE,
> >                               (Object **)&pcms->rtc,
> > @@ -1157,10 +1154,19 @@ void pc_cpus_init(PCMachineState *pcms)
> >  static
> >  void pc_machine_done(Notifier *notifier, void *data)
> >  {
> > +    int i, boot_cpus = 0;
> >      PCMachineState *pcms = container_of(notifier,
> >                                          PCMachineState, machine_done);
> >      PCIBus *bus = pcms->bus;
> >  
> > +    for (i = 0; i < pcms->possible_cpus->len; i++) {
> > +        if (pcms->possible_cpus->cpus[i].cpu) {
> > +            boot_cpus++;
> > +        }
> > +    }  
> 
> Any specific reason you chose to check possible_cpus instead of the
> arch-independent CPU list from exec.c? I believe other architectures will be
> interested in a generic way to count online CPUs instead of using smp_cpus.
When others would need to do it we can switch to arch-independent CPU list but
for now I'd prefer to keep using possible_cpus throughout pc.c for consistency
reasons (so it would be the single point of bookkeeping CPUs on PC),
and I most likely will do the same for ARM probably reusing some of PC code.

 
> > +    /* set the number of CPUs */
> > +    rtc_set_memory(pcms->rtc, 0x5f, boot_cpus - 1);
> > +
> >      if (bus) {
> >          int extra_hosts = 0;
> >  
> > -- 
> > 2.7.0
> >   
>
Igor Mammedov July 12, 2016, 1:42 p.m. UTC | #3
On Tue, 12 Jul 2016 14:48:43 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

> On Tue, 12 Jul 2016 00:29:08 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
[...] 
> > 1) x86_cpu_realizefn():
> >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> >             x86_cpu_apic_create(cpu, &local_err);
Usage of smp_cpus here looks incorrect to me, it should be max_cpus.
But we can't make it max_cpus without compat glue as it will break
backwards migration (not that upstream cares about it)
Eduardo Habkost July 12, 2016, 5:18 p.m. UTC | #4
On Tue, Jul 12, 2016 at 02:48:43PM +0200, Igor Mammedov wrote:
> On Tue, 12 Jul 2016 00:29:08 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, Jul 06, 2016 at 08:20:45AM +0200, Igor Mammedov wrote:
> > > currently present CPUs counter in CMOS only contains
> > > smp_cpus (i.e. initial CPUs specified with -smp X) and
> > > doesn't account for CPUs created with -device.
> > > If VM is started with additional CPUs added with
> > >  -device, it will hang in BIOS waiting for condition
> > >    smp_cpus == counted_cpus
> > > forever as counted_cpus will include -device CPUs as well
> > > and be more than smp_cpus.
> > > 
> > > make present CPUs counter in CMOS to count all CPUs
> > > (initial and coldplugged with -device) by delaying
> > > it to machine done time when it possible to count
> > > CPUs added with -device.  
> > 
> > Do you plan to fix the remaining code using smp_cpus? e.g.:
> perhaps after Drew's -smp refactoring or maybe during it,
> it looks like good candidate for that series,
> I'll work with Drew on that.
> 
> > 
> > 1) x86_cpu_realizefn():
> >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> >             x86_cpu_apic_create(cpu, &local_err);
> >     [...]
> >   (Maybe we should simply make realizefn fail if we try to create a second CPU
> >   using -device or device_add without CPUID_APIC)
> wouldn't that break some setups that doing it but still able
> to boot?

It would, that's why we need to do it only in the case of -device
or device_add. What about something like:

  if (!(cpu->env.features[FEAT_1_EDX] & CPUID_APIC) && smp_cpus < 2 &&total_cpus_already_created() > 0) {
      error_setg("we can't create a new VCPU without an APIC");
      return;
  }

I believe this logic should be moved to PC code, eventually. Or
at least the process should be controlled by PC code (by setting
a force-apic-creation property in X86CPU, for example).

> 
> > 
> > 2) the smp_cpus checks at hw/i386/kvmvapic.c.
> > 
> > > 
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > ---
> > >  hw/i386/pc.c | 12 +++++++++---
> > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index 6691825..3206572 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -471,9 +471,6 @@ void pc_cmos_init(PCMachineState *pcms,
> > >      rtc_set_memory(s, 0x5c, val >> 8);
> > >      rtc_set_memory(s, 0x5d, val >> 16);
> > >  
> > > -    /* set the number of CPU */
> > > -    rtc_set_memory(s, 0x5f, smp_cpus - 1);
> > > -
> > >      object_property_add_link(OBJECT(pcms), "rtc_state",
> > >                               TYPE_ISA_DEVICE,
> > >                               (Object **)&pcms->rtc,
> > > @@ -1157,10 +1154,19 @@ void pc_cpus_init(PCMachineState *pcms)
> > >  static
> > >  void pc_machine_done(Notifier *notifier, void *data)
> > >  {
> > > +    int i, boot_cpus = 0;
> > >      PCMachineState *pcms = container_of(notifier,
> > >                                          PCMachineState, machine_done);
> > >      PCIBus *bus = pcms->bus;
> > >  
> > > +    for (i = 0; i < pcms->possible_cpus->len; i++) {
> > > +        if (pcms->possible_cpus->cpus[i].cpu) {
> > > +            boot_cpus++;
> > > +        }
> > > +    }  
> > 
> > Any specific reason you chose to check possible_cpus instead of the
> > arch-independent CPU list from exec.c? I believe other architectures will be
> > interested in a generic way to count online CPUs instead of using smp_cpus.
> When others would need to do it we can switch to arch-independent CPU list but
> for now I'd prefer to keep using possible_cpus throughout pc.c for consistency
> reasons (so it would be the single point of bookkeeping CPUs on PC),
> and I most likely will do the same for ARM probably reusing some of PC code.

Common can be done as a follow-up, no problem. In that case,
could you move the calculation to a reusable function in pc.c?
The APIC check above would also need to check how many CPUs
already exist, for example.
Eduardo Habkost July 12, 2016, 5:19 p.m. UTC | #5
On Tue, Jul 12, 2016 at 03:42:58PM +0200, Igor Mammedov wrote:
> On Tue, 12 Jul 2016 14:48:43 +0200
> Igor Mammedov <imammedo@redhat.com> wrote:
> 
> > On Tue, 12 Jul 2016 00:29:08 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> [...] 
> > > 1) x86_cpu_realizefn():
> > >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> > >             x86_cpu_apic_create(cpu, &local_err);
> Usage of smp_cpus here looks incorrect to me, it should be max_cpus.
> But we can't make it max_cpus without compat glue as it will break
> backwards migration (not that upstream cares about it)

We can at least print a warning, by now: "your CPU doesn't have
APIC, CPU hotplug will never work".
Igor Mammedov July 13, 2016, 7:44 a.m. UTC | #6
On Tue, 12 Jul 2016 14:19:25 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Tue, Jul 12, 2016 at 03:42:58PM +0200, Igor Mammedov wrote:
> > On Tue, 12 Jul 2016 14:48:43 +0200
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >   
> > > On Tue, 12 Jul 2016 00:29:08 -0300
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:  
> > [...]   
> > > > 1) x86_cpu_realizefn():
> > > >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> > > >             x86_cpu_apic_create(cpu, &local_err);  
> > Usage of smp_cpus here looks incorrect to me, it should be max_cpus.
> > But we can't make it max_cpus without compat glue as it will break
> > backwards migration (not that upstream cares about it)  
> 
> We can at least print a warning, by now: "your CPU doesn't have
> APIC, CPU hotplug will never work".
sure, lets do it at 2.8 time
Igor Mammedov July 13, 2016, 7:56 a.m. UTC | #7
On Tue, 12 Jul 2016 14:18:22 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Tue, Jul 12, 2016 at 02:48:43PM +0200, Igor Mammedov wrote:
> > On Tue, 12 Jul 2016 00:29:08 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >   
> > > On Wed, Jul 06, 2016 at 08:20:45AM +0200, Igor Mammedov wrote:  
> > > > currently present CPUs counter in CMOS only contains
> > > > smp_cpus (i.e. initial CPUs specified with -smp X) and
> > > > doesn't account for CPUs created with -device.
> > > > If VM is started with additional CPUs added with
> > > >  -device, it will hang in BIOS waiting for condition
> > > >    smp_cpus == counted_cpus
> > > > forever as counted_cpus will include -device CPUs as well
> > > > and be more than smp_cpus.
> > > > 
> > > > make present CPUs counter in CMOS to count all CPUs
> > > > (initial and coldplugged with -device) by delaying
> > > > it to machine done time when it possible to count
> > > > CPUs added with -device.    
> > > 
> > > Do you plan to fix the remaining code using smp_cpus? e.g.:  
> > perhaps after Drew's -smp refactoring or maybe during it,
> > it looks like good candidate for that series,
> > I'll work with Drew on that.
> >   
> > > 
> > > 1) x86_cpu_realizefn():
> > >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> > >             x86_cpu_apic_create(cpu, &local_err);
> > >     [...]
> > >   (Maybe we should simply make realizefn fail if we try to create a second CPU
> > >   using -device or device_add without CPUID_APIC)  
> > wouldn't that break some setups that doing it but still able
> > to boot?  
> 
> It would, that's why we need to do it only in the case of -device
> or device_add. What about something like: 
>   if (!(cpu->env.features[FEAT_1_EDX] & CPUID_APIC) && smp_cpus < 2 &&total_cpus_already_created() > 0) {
>       error_setg("we can't create a new VCPU without an APIC");
>       return;
-device/device_add is orthogonal here, cpu-add is also affected.

I'd woul make it a warning instead of hard error:
"broken configuration, only 1st CPU will work when creating multiple CPUs with APIC disabled,
to fix it remove -apic/apic=off from -cpu option"

>   }
> 
> I believe this logic should be moved to PC code, eventually. Or
> at least the process should be controlled by PC code (by setting
> a force-apic-creation property in X86CPU, for example).
smp_cpus check should definitely be part of PC code.

PS:
You do not consider above as part of this series, do you?

> 
> >   
> > > 
> > > 2) the smp_cpus checks at hw/i386/kvmvapic.c.
> > >   
> > > > 
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > ---
> > > >  hw/i386/pc.c | 12 +++++++++---
> > > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > index 6691825..3206572 100644
> > > > --- a/hw/i386/pc.c
> > > > +++ b/hw/i386/pc.c
> > > > @@ -471,9 +471,6 @@ void pc_cmos_init(PCMachineState *pcms,
> > > >      rtc_set_memory(s, 0x5c, val >> 8);
> > > >      rtc_set_memory(s, 0x5d, val >> 16);
> > > >  
> > > > -    /* set the number of CPU */
> > > > -    rtc_set_memory(s, 0x5f, smp_cpus - 1);
> > > > -
> > > >      object_property_add_link(OBJECT(pcms), "rtc_state",
> > > >                               TYPE_ISA_DEVICE,
> > > >                               (Object **)&pcms->rtc,
> > > > @@ -1157,10 +1154,19 @@ void pc_cpus_init(PCMachineState *pcms)
> > > >  static
> > > >  void pc_machine_done(Notifier *notifier, void *data)
> > > >  {
> > > > +    int i, boot_cpus = 0;
> > > >      PCMachineState *pcms = container_of(notifier,
> > > >                                          PCMachineState, machine_done);
> > > >      PCIBus *bus = pcms->bus;
> > > >  
> > > > +    for (i = 0; i < pcms->possible_cpus->len; i++) {
> > > > +        if (pcms->possible_cpus->cpus[i].cpu) {
> > > > +            boot_cpus++;
> > > > +        }
> > > > +    }    
> > > 
> > > Any specific reason you chose to check possible_cpus instead of the
> > > arch-independent CPU list from exec.c? I believe other architectures will be
> > > interested in a generic way to count online CPUs instead of using smp_cpus.  
> > When others would need to do it we can switch to arch-independent CPU list but
> > for now I'd prefer to keep using possible_cpus throughout pc.c for consistency
> > reasons (so it would be the single point of bookkeeping CPUs on PC),
> > and I most likely will do the same for ARM probably reusing some of PC code.  
> 
> Common can be done as a follow-up, no problem. In that case,
> could you move the calculation to a reusable function in pc.c?
> The APIC check above would also need to check how many CPUs
> already exist, for example.
sure, I'll do it.
Eduardo Habkost July 13, 2016, 1:56 p.m. UTC | #8
On Wed, Jul 13, 2016 at 09:56:25AM +0200, Igor Mammedov wrote:
> On Tue, 12 Jul 2016 14:18:22 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Tue, Jul 12, 2016 at 02:48:43PM +0200, Igor Mammedov wrote:
> > > On Tue, 12 Jul 2016 00:29:08 -0300
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > >   
> > > > On Wed, Jul 06, 2016 at 08:20:45AM +0200, Igor Mammedov wrote:  
> > > > > currently present CPUs counter in CMOS only contains
> > > > > smp_cpus (i.e. initial CPUs specified with -smp X) and
> > > > > doesn't account for CPUs created with -device.
> > > > > If VM is started with additional CPUs added with
> > > > >  -device, it will hang in BIOS waiting for condition
> > > > >    smp_cpus == counted_cpus
> > > > > forever as counted_cpus will include -device CPUs as well
> > > > > and be more than smp_cpus.
> > > > > 
> > > > > make present CPUs counter in CMOS to count all CPUs
> > > > > (initial and coldplugged with -device) by delaying
> > > > > it to machine done time when it possible to count
> > > > > CPUs added with -device.    
> > > > 
> > > > Do you plan to fix the remaining code using smp_cpus? e.g.:  
> > > perhaps after Drew's -smp refactoring or maybe during it,
> > > it looks like good candidate for that series,
> > > I'll work with Drew on that.
> > >   
> > > > 
> > > > 1) x86_cpu_realizefn():
> > > >     if (cpu->env.features[FEAT_1_EDX] & CPUID_APIC || smp_cpus > 1) {
> > > >             x86_cpu_apic_create(cpu, &local_err);
> > > >     [...]
> > > >   (Maybe we should simply make realizefn fail if we try to create a second CPU
> > > >   using -device or device_add without CPUID_APIC)  
> > > wouldn't that break some setups that doing it but still able
> > > to boot?  
> > 
> > It would, that's why we need to do it only in the case of -device
> > or device_add. What about something like: 
> >   if (!(cpu->env.features[FEAT_1_EDX] & CPUID_APIC) && smp_cpus < 2 &&total_cpus_already_created() > 0) {
> >       error_setg("we can't create a new VCPU without an APIC");
> >       return;
> -device/device_add is orthogonal here, cpu-add is also affected.
> 
> I'd woul make it a warning instead of hard error:
> "broken configuration, only 1st CPU will work when creating multiple CPUs with APIC disabled,
> to fix it remove -apic/apic=off from -cpu option"

I forgot about cpu-add. You're right.

> 
> >   }
> > 
> > I believe this logic should be moved to PC code, eventually. Or
> > at least the process should be controlled by PC code (by setting
> > a force-apic-creation property in X86CPU, for example).
> smp_cpus check should definitely be part of PC code.
> 
> PS:
> You do not consider above as part of this series, do you?

No, it's just something to be done later.
diff mbox

Patch

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6691825..3206572 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -471,9 +471,6 @@  void pc_cmos_init(PCMachineState *pcms,
     rtc_set_memory(s, 0x5c, val >> 8);
     rtc_set_memory(s, 0x5d, val >> 16);
 
-    /* set the number of CPU */
-    rtc_set_memory(s, 0x5f, smp_cpus - 1);
-
     object_property_add_link(OBJECT(pcms), "rtc_state",
                              TYPE_ISA_DEVICE,
                              (Object **)&pcms->rtc,
@@ -1157,10 +1154,19 @@  void pc_cpus_init(PCMachineState *pcms)
 static
 void pc_machine_done(Notifier *notifier, void *data)
 {
+    int i, boot_cpus = 0;
     PCMachineState *pcms = container_of(notifier,
                                         PCMachineState, machine_done);
     PCIBus *bus = pcms->bus;
 
+    for (i = 0; i < pcms->possible_cpus->len; i++) {
+        if (pcms->possible_cpus->cpus[i].cpu) {
+            boot_cpus++;
+        }
+    }
+    /* set the number of CPUs */
+    rtc_set_memory(pcms->rtc, 0x5f, boot_cpus - 1);
+
     if (bus) {
         int extra_hosts = 0;