| Message ID | 20190805071302.6260-1-tao3.xu@intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | numa: Introduce MachineClass::auto_enable_numa for implicit NUMA node |
On Mon, 5 Aug 2019 15:13:02 +0800
Tao Xu <tao3.xu@intel.com> wrote:

> Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> is expected to be created implicitly.
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Tao Xu <tao3.xu@intel.com>
> ---
>
> This patch has a dependency on
> https://patchwork.kernel.org/cover/11063235/
> ---
>  hw/core/numa.c      | 9 +++++++--
>  hw/ppc/spapr.c      | 9 +--------
>  include/hw/boards.h | 1 +
>  3 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index 75db35ac19..756d243d3f 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -580,9 +580,14 @@ void numa_complete_configuration(MachineState *ms)
>       * guest tries to use it with that drivers.
>       *
>       * Enable NUMA implicitly by adding a new NUMA node automatically.
> +     *
> +     * Or if MachineClass::auto_enable_numa is true and no NUMA nodes,
> +     * assume there is just one node with whole RAM.
>       */
> -    if (ms->ram_slots > 0 && ms->numa_state->num_nodes == 0 &&
> -        mc->auto_enable_numa_with_memhp) {
> +    if (ms->numa_state->num_nodes == 0 &&
> +        ((ms->ram_slots > 0 &&
> +          mc->auto_enable_numa_with_memhp) ||
> +         mc->auto_enable_numa)) {
>          NumaNodeOptions node = { };
>          parse_numa_node(ms, &node, &error_abort);
>      }
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f607ca567b..e50343f326 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -400,14 +400,6 @@ static int spapr_populate_memory(SpaprMachineState *spapr, void *fdt)
>      hwaddr mem_start, node_size;
>      int i, nb_nodes = machine->numa_state->num_nodes;
>      NodeInfo *nodes = machine->numa_state->nodes;
> -    NodeInfo ramnode;
> -
> -    /* No NUMA nodes, assume there is just one node with whole RAM */
> -    if (!nb_nodes) {
> -        nb_nodes = 1;
> -        ramnode.node_mem = machine->ram_size;
> -        nodes = &ramnode;
> -    }
>
>      for (i = 0, mem_start = 0; i < nb_nodes; ++i) {
>          if (!nodes[i].node_mem) {
> @@ -4369,6 +4361,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>       */
>      mc->numa_mem_align_shift = 28;
>      mc->numa_mem_supported = true;
> +    mc->auto_enable_numa = true;

this will always create a numa node (that will affect not only RAM but
also all other components that depend on numa state (like CPUs)),
whereas spapr_populate_memory() was only faking a numa node in the DT
for RAM. It makes non-numa configuration impossible.
Seeing David's ACK on the patch it might be fine, but I believe
the commit message should capture that and explain why the change in
behavior is fine.

>
>      smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>      smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 2eb9a0b4e0..4a350b87d2 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -220,6 +220,7 @@ struct MachineClass {
>      bool smbus_no_migration_support;
>      bool nvdimm_supported;
>      bool numa_mem_supported;
> +    bool auto_enable_numa;
>
>      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
>                                             DeviceState *dev);
On Tue, Aug 06, 2019 at 02:50:55PM +0200, Igor Mammedov wrote:
> On Mon, 5 Aug 2019 15:13:02 +0800
> Tao Xu <tao3.xu@intel.com> wrote:
>
> > Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> > is expected to be created implicitly.
> >
> > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> > Suggested-by: Igor Mammedov <imammedo@redhat.com>
> > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > Signed-off-by: Tao Xu <tao3.xu@intel.com>
[...]
> > +    mc->auto_enable_numa = true;
>
> this will always create a numa node (that will affect not only RAM but
> also all other components that depend on numa state (like CPUs)),
> whereas spapr_populate_memory() was only faking a numa node in the DT
> for RAM. It makes non-numa configuration impossible.
> Seeing David's ACK on the patch it might be fine, but I believe
> the commit message should capture that and explain why the change in
> behavior is fine.

After a quick look, all spapr code seems to have the same
behavior when nb_numa_nodes==0 and nb_numa_nodes==1, but I'd like
to be sure.

David and/or Tao Xu: do you confirm there's no ABI change at all
on spapr after implicitly creating a NUMA node?

>
> >      smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
> >      smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
> > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > index 2eb9a0b4e0..4a350b87d2 100644
> > --- a/include/hw/boards.h
> > +++ b/include/hw/boards.h
> > @@ -220,6 +220,7 @@ struct MachineClass {
> >      bool smbus_no_migration_support;
> >      bool nvdimm_supported;
> >      bool numa_mem_supported;
> > +    bool auto_enable_numa;
> >
> >      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> >                                             DeviceState *dev);
>
On Wed, Aug 07, 2019 at 02:52:56PM -0300, Eduardo Habkost wrote:
> On Tue, Aug 06, 2019 at 02:50:55PM +0200, Igor Mammedov wrote:
> > On Mon, 5 Aug 2019 15:13:02 +0800
> > Tao Xu <tao3.xu@intel.com> wrote:
> >
> > > Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> > > is expected to be created implicitly.
> > >
> > > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> > > Suggested-by: Igor Mammedov <imammedo@redhat.com>
> > > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > > Signed-off-by: Tao Xu <tao3.xu@intel.com>
> [...]
> > > +    mc->auto_enable_numa = true;
> >
> > this will always create a numa node (that will affect not only RAM but
> > also all other components that depend on numa state (like CPUs)),
> > whereas spapr_populate_memory() was only faking a numa node in the DT
> > for RAM. It makes non-numa configuration impossible.
> > Seeing David's ACK on the patch it might be fine, but I believe
> > the commit message should capture that and explain why the change in
> > behavior is fine.
>
> After a quick look, all spapr code seems to have the same
> behavior when nb_numa_nodes==0 and nb_numa_nodes==1, but I'd like
> to be sure.

That's certainly the intention. If there are cases where it doesn't
behave that way, it's a bug - although possibly one we have to
maintain for machine compatibility.

> David and/or Tao Xu: do you confirm there's no ABI change at all
> on spapr after implicitly creating a NUMA node?

I don't believe there is, no.

>
> >
> > >      smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
> > >      smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
> > > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > > index 2eb9a0b4e0..4a350b87d2 100644
> > > --- a/include/hw/boards.h
> > > +++ b/include/hw/boards.h
> > > @@ -220,6 +220,7 @@ struct MachineClass {
> > >      bool smbus_no_migration_support;
> > >      bool nvdimm_supported;
> > >      bool numa_mem_supported;
> > > +    bool auto_enable_numa;
> > >
> > >      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> > >                                             DeviceState *dev);
> >
On Thu, Aug 08, 2019 at 04:35:00PM +1000, David Gibson wrote:
> On Wed, Aug 07, 2019 at 02:52:56PM -0300, Eduardo Habkost wrote:
> > On Tue, Aug 06, 2019 at 02:50:55PM +0200, Igor Mammedov wrote:
> > > On Mon, 5 Aug 2019 15:13:02 +0800
> > > Tao Xu <tao3.xu@intel.com> wrote:
> > >
> > > > Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> > > > is expected to be created implicitly.
> > > >
> > > > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> > > > Suggested-by: Igor Mammedov <imammedo@redhat.com>
> > > > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > Signed-off-by: Tao Xu <tao3.xu@intel.com>
> > [...]
> > > > +    mc->auto_enable_numa = true;
> > >
> > > this will always create a numa node (that will affect not only RAM but
> > > also all other components that depend on numa state (like CPUs)),
> > > whereas spapr_populate_memory() was only faking a numa node in the DT
> > > for RAM. It makes non-numa configuration impossible.
> > > Seeing David's ACK on the patch it might be fine, but I believe
> > > the commit message should capture that and explain why the change in
> > > behavior is fine.
> >
> > After a quick look, all spapr code seems to have the same
> > behavior when nb_numa_nodes==0 and nb_numa_nodes==1, but I'd like
> > to be sure.
>
> That's certainly the intention. If there are cases where it doesn't
> behave that way, it's a bug - although possibly one we have to
> maintain for machine compatibility.
>
> > David and/or Tao Xu: do you confirm there's no ABI change at all
> > on spapr after implicitly creating a NUMA node?
>
> I don't believe there is, no.

Oh, FWIW, the PAPR interface which is what defines the guest
environment has no notion of "non NUMA" except in the sense of a
system with exactly one NUMA node.
On 8/8/2019 1:52 AM, Eduardo Habkost wrote:
> On Tue, Aug 06, 2019 at 02:50:55PM +0200, Igor Mammedov wrote:
>> On Mon, 5 Aug 2019 15:13:02 +0800
>> Tao Xu <tao3.xu@intel.com> wrote:
>>
>>> Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
>>> is expected to be created implicitly.
>>>
>>> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>>> Suggested-by: Igor Mammedov <imammedo@redhat.com>
>>> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
>>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
> [...]
>>> +    mc->auto_enable_numa = true;
>>
>> this will always create a numa node (that will affect not only RAM but
>> also all other components that depend on numa state (like CPUs)),
>> whereas spapr_populate_memory() was only faking a numa node in the DT
>> for RAM. It makes non-numa configuration impossible.
>> Seeing David's ACK on the patch it might be fine, but I believe
>> the commit message should capture that and explain why the change in
>> behavior is fine.
>
> After a quick look, all spapr code seems to have the same
> behavior when nb_numa_nodes==0 and nb_numa_nodes==1, but I'd like
> to be sure.
>
> David and/or Tao Xu: do you confirm there's no ABI change at all
> on spapr after implicitly creating a NUMA node?
>

Even without this patch and the HMAT patches, if there is no numa
configuration the global nb_numa_nodes always exists and defaults to 0,
so nb_nodes will be automatically set to 1. So from my point of view,
this patch will not change the ABI. I would also like to hear David's
opinion.

>>
>>>      smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
>>>      smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
>>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>>> index 2eb9a0b4e0..4a350b87d2 100644
>>> --- a/include/hw/boards.h
>>> +++ b/include/hw/boards.h
>>> @@ -220,6 +220,7 @@ struct MachineClass {
>>>      bool smbus_no_migration_support;
>>>      bool nvdimm_supported;
>>>      bool numa_mem_supported;
>>> +    bool auto_enable_numa;
>>>
>>>      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
>>>                                             DeviceState *dev);
>>
>
On Thu, 8 Aug 2019 16:35:00 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Aug 07, 2019 at 02:52:56PM -0300, Eduardo Habkost wrote:
> > On Tue, Aug 06, 2019 at 02:50:55PM +0200, Igor Mammedov wrote:
> > > On Mon, 5 Aug 2019 15:13:02 +0800
> > > Tao Xu <tao3.xu@intel.com> wrote:
> > >
> > > > Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> > > > is expected to be created implicitly.
> > > >
> > > > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> > > > Suggested-by: Igor Mammedov <imammedo@redhat.com>
> > > > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > Signed-off-by: Tao Xu <tao3.xu@intel.com>
> > [...]
> > > > +    mc->auto_enable_numa = true;
> > >
> > > this will always create a numa node (that will affect not only RAM but
> > > also all other components that depend on numa state (like CPUs)),
> > > whereas spapr_populate_memory() was only faking a numa node in the DT
> > > for RAM. It makes non-numa configuration impossible.
> > > Seeing David's ACK on the patch it might be fine, but I believe
> > > the commit message should capture that and explain why the change in
> > > behavior is fine.
> >
> > After a quick look, all spapr code seems to have the same
> > behavior when nb_numa_nodes==0 and nb_numa_nodes==1, but I'd like
> > to be sure.
>
> That's certainly the intention. If there are cases where it doesn't
> behave that way, it's a bug - although possibly one we have to
> maintain for machine compatibility.

considering DT is firmware we typically do not add any compat code
for the latter.

> > David and/or Tao Xu: do you confirm there's no ABI change at all
> > on spapr after implicitly creating a NUMA node?
>
> I don't believe there is, no.

Also seeing your next reply, it seems there is no non-NUMA use case
in the spec, so it would be a bug to begin with, hence:

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> >
> > >
> > > >      smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
> > > >      smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
> > > > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > > > index 2eb9a0b4e0..4a350b87d2 100644
> > > > --- a/include/hw/boards.h
> > > > +++ b/include/hw/boards.h
> > > > @@ -220,6 +220,7 @@ struct MachineClass {
> > > >      bool smbus_no_migration_support;
> > > >      bool nvdimm_supported;
> > > >      bool numa_mem_supported;
> > > > +    bool auto_enable_numa;
> > > >
> > > >      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> > > >                                             DeviceState *dev);
> > >
On Mon, Aug 05, 2019 at 03:13:02PM +0800, Tao Xu wrote:
> Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> is expected to be created implicitly.
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Tao Xu <tao3.xu@intel.com>

This introduces spurious warnings when running qemu-system-ppc64. See:
https://lore.kernel.org/qemu-devel/CAFEAcA-AvFS2cbDH-t5SxgY9hA=LGL81_8dn-vh193vtV9W1Lg@mail.gmail.com/

To reproduce it, just run 'qemu-system-ppc64 -machine pseries'
without any -numa arguments.

I have removed this patch from machine-next so it won't block the
existing pull request.
On 9/4/2019 1:52 AM, Eduardo Habkost wrote:
> On Mon, Aug 05, 2019 at 03:13:02PM +0800, Tao Xu wrote:
>> Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
>> is expected to be created implicitly.
>>
>> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>> Suggested-by: Igor Mammedov <imammedo@redhat.com>
>> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>
> This introduces spurious warnings when running qemu-system-ppc64.
> See: https://lore.kernel.org/qemu-devel/CAFEAcA-AvFS2cbDH-t5SxgY9hA=LGL81_8dn-vh193vtV9W1Lg@mail.gmail.com/
>
> To reproduce it, just run 'qemu-system-ppc64 -machine pseries'
> without any -numa arguments.
>
> I have removed this patch from machine-next so it won't block the
> existing pull request.
>

I got it. If the default splitting of RAM between nodes is deprecated,
this patch can't reuse the splitting code. I agree with dropping this
patch.
On Wed, Sep 04, 2019 at 02:22:39PM +0800, Tao Xu wrote:
> On 9/4/2019 1:52 AM, Eduardo Habkost wrote:
> > On Mon, Aug 05, 2019 at 03:13:02PM +0800, Tao Xu wrote:
> > > Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
> > > is expected to be created implicitly.
> > >
> > > Acked-by: David Gibson <david@gibson.dropbear.id.au>
> > > Suggested-by: Igor Mammedov <imammedo@redhat.com>
> > > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
> > > Signed-off-by: Tao Xu <tao3.xu@intel.com>
> >
> > This introduces spurious warnings when running qemu-system-ppc64.
> > See: https://lore.kernel.org/qemu-devel/CAFEAcA-AvFS2cbDH-t5SxgY9hA=LGL81_8dn-vh193vtV9W1Lg@mail.gmail.com/
> >
> > To reproduce it, just run 'qemu-system-ppc64 -machine pseries'
> > without any -numa arguments.
> >
> > I have removed this patch from machine-next so it won't block the
> > existing pull request.
> >
> I got it. If the default splitting of RAM between nodes is deprecated,
> this patch can't reuse the splitting code. I agree with dropping this
> patch.

Probably all we need to fix this issue is to replace

    NumaNodeOptions node = { };

with

    NumaNodeOptions node = { .size = ram_size };

in the auto_enable_numa block.

Do you plan to send v2?
On 9/5/2019 4:43 AM, Eduardo Habkost wrote:
> On Wed, Sep 04, 2019 at 02:22:39PM +0800, Tao Xu wrote:
>> On 9/4/2019 1:52 AM, Eduardo Habkost wrote:
>>> On Mon, Aug 05, 2019 at 03:13:02PM +0800, Tao Xu wrote:
>>>> Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
>>>> is expected to be created implicitly.
>>>>
>>>> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>>>> Suggested-by: Igor Mammedov <imammedo@redhat.com>
>>>> Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
>>>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>>>
>>> This introduces spurious warnings when running qemu-system-ppc64.
>>> See: https://lore.kernel.org/qemu-devel/CAFEAcA-AvFS2cbDH-t5SxgY9hA=LGL81_8dn-vh193vtV9W1Lg@mail.gmail.com/
>>>
>>> To reproduce it, just run 'qemu-system-ppc64 -machine pseries'
>>> without any -numa arguments.
>>>
>>> I have removed this patch from machine-next so it won't block the
>>> existing pull request.
>>>
>> I got it. If the default splitting of RAM between nodes is deprecated,
>> this patch can't reuse the splitting code. I agree with dropping this
>> patch.
>
> Probably all we need to fix this issue is to replace
>     NumaNodeOptions node = { };
> with
>     NumaNodeOptions node = { .size = ram_size };
> in the auto_enable_numa block.
>
> Do you plan to send v2?
>

OK, thank you for your suggestion. I will fix it and send v2.
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 75db35ac19..756d243d3f 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -580,9 +580,14 @@ void numa_complete_configuration(MachineState *ms)
      * guest tries to use it with that drivers.
      *
      * Enable NUMA implicitly by adding a new NUMA node automatically.
+     *
+     * Or if MachineClass::auto_enable_numa is true and no NUMA nodes,
+     * assume there is just one node with whole RAM.
      */
-    if (ms->ram_slots > 0 && ms->numa_state->num_nodes == 0 &&
-        mc->auto_enable_numa_with_memhp) {
+    if (ms->numa_state->num_nodes == 0 &&
+        ((ms->ram_slots > 0 &&
+          mc->auto_enable_numa_with_memhp) ||
+         mc->auto_enable_numa)) {
         NumaNodeOptions node = { };
         parse_numa_node(ms, &node, &error_abort);
     }
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f607ca567b..e50343f326 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -400,14 +400,6 @@ static int spapr_populate_memory(SpaprMachineState *spapr, void *fdt)
     hwaddr mem_start, node_size;
     int i, nb_nodes = machine->numa_state->num_nodes;
     NodeInfo *nodes = machine->numa_state->nodes;
-    NodeInfo ramnode;
-
-    /* No NUMA nodes, assume there is just one node with whole RAM */
-    if (!nb_nodes) {
-        nb_nodes = 1;
-        ramnode.node_mem = machine->ram_size;
-        nodes = &ramnode;
-    }

     for (i = 0, mem_start = 0; i < nb_nodes; ++i) {
         if (!nodes[i].node_mem) {
@@ -4369,6 +4361,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
      */
     mc->numa_mem_align_shift = 28;
     mc->numa_mem_supported = true;
+    mc->auto_enable_numa = true;

     smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
     smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 2eb9a0b4e0..4a350b87d2 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -220,6 +220,7 @@ struct MachineClass {
     bool smbus_no_migration_support;
     bool nvdimm_supported;
     bool numa_mem_supported;
+    bool auto_enable_numa;

     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);