diff mbox

[v7,01/13] machine: Don't allow CPU toplogies with partially filled cores

Message ID 1453960195-15181-2-git-send-email-bharata@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bharata B Rao Jan. 28, 2016, 5:49 a.m. UTC
Prevent guests from booting with CPU topologies that have partially
filled CPU cores or can result in partially filled CPU cores after
CPU hotplug like

-smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
-smp 15,sockets=1,cores=4,threads=4,maxcpus=17.

This is enforced by introducing MachineClass::validate_smp_config()
that gets called from generic SMP parsing code. Machine type versions
that want to enforce this can define this to the generic version
provided.

Only sPAPR and PC machine types starting from version 2.6 enforce this in
this patch.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/core/machine.c   | 23 +++++++++++++++++++++++
 hw/i386/pc.c        |  1 +
 hw/i386/pc_piix.c   |  1 +
 hw/i386/pc_q35.c    |  1 +
 hw/ppc/spapr.c      |  2 ++
 include/hw/boards.h |  4 ++++
 vl.c                |  5 +++++
 7 files changed, 37 insertions(+)

Comments

Eduardo Habkost Jan. 28, 2016, 7:04 p.m. UTC | #1
On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:
> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> This is enforced by introducing MachineClass::validate_smp_config()
> that gets called from generic SMP parsing code. Machine type versions
> that want to enforce this can define this to the generic version
> provided.
> 
> Only sPAPR and PC machine types starting from version 2.6 enforce this in
> this patch.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> ---
[...]
> diff --git a/vl.c b/vl.c
> index f043009..9e4da46 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4126,6 +4126,11 @@ int main(int argc, char **argv, char **envp)
>  
>      smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
>  
> +    if (machine_class->validate_smp_config) {
> +        machine_class->validate_smp_config(smp_cpus, max_cpus, smp_threads,
> +                                           &error_abort);

Invalid SMP config should make QEMU just exit with an error, not
abort(). Please use &error_fatal instead.

The rest looks good.
David Gibson Jan. 29, 2016, 3:52 a.m. UTC | #2
On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:
> Prevent guests from booting with CPU topologies that have partially
> filled CPU cores or can result in partially filled CPU cores after
> CPU hotplug like
> 
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> 
> This is enforced by introducing MachineClass::validate_smp_config()
> that gets called from generic SMP parsing code. Machine type versions
> that want to enforce this can define this to the generic version
> provided.
> 
> Only sPAPR and PC machine types starting from version 2.6 enforce this in
> this patch.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

I've been kind of lost in the back and forth about
threads/cores/sockets.

What, in the end, is the rationale for allowing partially filled
sockets, but not partially filled cores?
Eduardo Habkost Jan. 29, 2016, 2:24 p.m. UTC | #3
On Fri, Jan 29, 2016 at 02:52:30PM +1100, David Gibson wrote:
> On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > This is enforced by introducing MachineClass::validate_smp_config()
> > that gets called from generic SMP parsing code. Machine type versions
> > that want to enforce this can define this to the generic version
> > provided.
> > 
> > Only sPAPR and PC machine types starting from version 2.6 enforce this in
> > this patch.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> 
> I've been kind of lost in the back and forth about
> threads/cores/sockets.
> 
> What, in the end, is the rationale for allowing partially filled
> sockets, but not partially filled cores?

I don't think there's a good reason for that (at least for PC).

It's easier to relax the requirements later if necessary, than
dealing with compatibility issues again when making the code more
strict. So I suggest we make validate_smp_config_generic() also
check if smp_cpus % (smp_threads * smp_cores) == 0.
Igor Mammedov Jan. 29, 2016, 3:10 p.m. UTC | #4
On Fri, 29 Jan 2016 12:24:18 -0200
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Fri, Jan 29, 2016 at 02:52:30PM +1100, David Gibson wrote:
> > On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:  
> > > Prevent guests from booting with CPU topologies that have partially
> > > filled CPU cores or can result in partially filled CPU cores after
> > > CPU hotplug like
> > > 
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > > 
> > > This is enforced by introducing MachineClass::validate_smp_config()
> > > that gets called from generic SMP parsing code. Machine type versions
> > > that want to enforce this can define this to the generic version
> > > provided.
> > > 
> > > Only sPAPR and PC machine types starting from version 2.6 enforce this in
> > > this patch.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>  
> > 
> > I've been kind of lost in the back and forth about
> > threads/cores/sockets.
> > 
> > What, in the end, is the rationale for allowing partially filled
> > sockets, but not partially filled cores?  
> 
> I don't think there's a good reason for that (at least for PC).
> 
> It's easier to relax the requirements later if necessary, than
> dealing with compatibility issues again when making the code more
> strict. So I suggest we make validate_smp_config_generic() also
> check if smp_cpus % (smp_threads * smp_cores) == 0.

that would break exiting setups.

Also in case of cpu hotplug this patch will break migration
as target QEMU might refuse starting with hotplugged CPU thread.

Perhaps this check should be enforced per target/machine if
arch requires it.
Eduardo Habkost Jan. 29, 2016, 3:36 p.m. UTC | #5
On Fri, Jan 29, 2016 at 04:10:47PM +0100, Igor Mammedov wrote:
> On Fri, 29 Jan 2016 12:24:18 -0200
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Fri, Jan 29, 2016 at 02:52:30PM +1100, David Gibson wrote:
> > > On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:  
> > > > Prevent guests from booting with CPU topologies that have partially
> > > > filled CPU cores or can result in partially filled CPU cores after
> > > > CPU hotplug like
> > > > 
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > > > 
> > > > This is enforced by introducing MachineClass::validate_smp_config()
> > > > that gets called from generic SMP parsing code. Machine type versions
> > > > that want to enforce this can define this to the generic version
> > > > provided.
> > > > 
> > > > Only sPAPR and PC machine types starting from version 2.6 enforce this in
> > > > this patch.
> > > > 
> > > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>  
> > > 
> > > I've been kind of lost in the back and forth about
> > > threads/cores/sockets.
> > > 
> > > What, in the end, is the rationale for allowing partially filled
> > > sockets, but not partially filled cores?  
> > 
> > I don't think there's a good reason for that (at least for PC).
> > 
> > It's easier to relax the requirements later if necessary, than
> > dealing with compatibility issues again when making the code more
> > strict. So I suggest we make validate_smp_config_generic() also
> > check if smp_cpus % (smp_threads * smp_cores) == 0.
> 
> that would break exiting setups.

Not if we do that only on newer machine classes.
validate_smp_config_generic() will be used only on *-2.6 and
newer.


> 
> Also in case of cpu hotplug this patch will break migration
> as target QEMU might refuse starting with hotplugged CPU thread.

This won't change older machine-types.

But I think you are right: it can break migration on pc-2.6, too.
But: isn't migration already broken when creating other sets of
CPUs that can't represented using -smp?

How exactly would you migrate a machine today, if you run:

  $ qemu-system-x86_64 -smp 16,sockets=2,cores=2,threads=2,maxcpus=32
  (QMP) cpu-add id=31


> 
> Perhaps this check should be enforced per target/machine if
> arch requires it.

It is. Please see the patch. It introduces a validate_smp_config
method.

But we need your input to clarify if
validate_smp_config_generic() is safe for pc-2.6 too.
diff mbox

Patch

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c46ddc7..4505995 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -336,6 +336,29 @@  static void machine_init_notify(Notifier *notifier, void *data)
     foreach_dynamic_sysbus_device(error_on_sysbus_device, NULL);
 }
 
+/*
+ * Machine types that want to prevent starting of guests with
+ * partially filled CPU cores can use this routine as their
+ * MachineClass:validate_smp_config().
+ */
+void validate_smp_config_generic(int smp_cpus, int max_cpus,
+                                 int smp_threads, Error **errp)
+{
+    if (smp_cpus % smp_threads) {
+        error_setg(errp, "cpu topology: "
+                   "smp_cpus (%u) should be multiple of threads (%u) ",
+                   smp_cpus, smp_threads);
+        return;
+    }
+
+    if (max_cpus % smp_threads) {
+        error_setg(errp, "cpu topology: "
+                   "max_cpus (%u) should be multiple of threads (%u) ",
+                   max_cpus, smp_threads);
+        return;
+    }
+}
+
 static void machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 78cf8fa..a54e0a0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1971,6 +1971,7 @@  static void pc_machine_class_init(ObjectClass *oc, void *data)
     mc->hot_add_cpu = pc_hot_add_cpu;
     mc->max_cpus = 255;
     mc->reset = pc_machine_reset;
+    mc->validate_smp_config = validate_smp_config_generic;
     hc->plug = pc_machine_device_plug_cb;
     hc->unplug_request = pc_machine_device_unplug_request_cb;
     hc->unplug = pc_machine_device_unplug_cb;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index bc74557..98b8b69 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -435,6 +435,7 @@  static void pc_i440fx_2_5_machine_options(MachineClass *m)
     pc_i440fx_2_6_machine_options(m);
     m->alias = NULL;
     m->is_default = 0;
+    m->validate_smp_config = NULL;
     pcmc->save_tsc_khz = false;
     SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 6128b02..c5f4935 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -362,6 +362,7 @@  static void pc_q35_2_5_machine_options(MachineClass *m)
     PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_q35_2_6_machine_options(m);
     m->alias = NULL;
+    m->validate_smp_config = NULL;
     pcmc->save_tsc_khz = false;
     SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
 }
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a9c9a95..6ac9f06 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2317,6 +2317,7 @@  static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->has_dynamic_sysbus = true;
     mc->pci_allow_0_address = true;
     mc->get_hotplug_handler = spapr_get_hotpug_handler;
+    mc->validate_smp_config = validate_smp_config_generic;
     hc->plug = spapr_machine_device_plug;
     hc->unplug = spapr_machine_device_unplug;
     mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
@@ -2402,6 +2403,7 @@  static void spapr_machine_2_5_class_options(MachineClass *mc)
     spapr_machine_2_6_class_options(mc);
     smc->use_ohci_by_default = true;
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_5);
+    mc->validate_smp_config = NULL;
 }
 
 DEFINE_SPAPR_MACHINE(2_5, "2.5", false);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0f30959..435c339 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -40,6 +40,8 @@  int machine_kvm_shadow_mem(MachineState *machine);
 int machine_phandle_start(MachineState *machine);
 bool machine_dump_guest_core(MachineState *machine);
 bool machine_mem_merge(MachineState *machine);
+void validate_smp_config_generic(int smp_cpus, int max_cpus,
+                                 int smp_threads, Error **errp);
 
 /**
  * MachineClass:
@@ -99,6 +101,8 @@  struct MachineClass {
     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
     unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
+    void (*validate_smp_config)(int smp_cpus, int max_cpus, int smp_threads,
+                                Error **errp);
 };
 
 /**
diff --git a/vl.c b/vl.c
index f043009..9e4da46 100644
--- a/vl.c
+++ b/vl.c
@@ -4126,6 +4126,11 @@  int main(int argc, char **argv, char **envp)
 
     smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
 
+    if (machine_class->validate_smp_config) {
+        machine_class->validate_smp_config(smp_cpus, max_cpus, smp_threads,
+                                           &error_abort);
+    }
+
     machine_class->max_cpus = machine_class->max_cpus ?: 1; /* Default to UP */
     if (max_cpus > machine_class->max_cpus) {
         error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "