Message ID | 20231007201740.30335-2-ankita@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | acpi: report numa nodes for device memory using GI | expand |
On Sun, 8 Oct 2023 01:47:38 +0530 <ankita@nvidia.com> wrote: > From: Ankit Agrawal <ankita@nvidia.com> > > The CPU cache coherent device memory can be added as NUMA nodes > distinct from the system memory nodes. These nodes are associated > with the device and Qemu needs a way to maintain this link. Hi Ankit, I'm not sure I'm convinced of the approach to creating nodes for memory usage (or whether that works in Linux on all NUMA ACPI archs), but I am keen to see Generic Initiator support in QEMU. I'd also like to see it done in a way that naturally extends to Generic Ports which are very similar (but don't hang memory off them! :) Dave Jiang posted a PoC a while back for generic ports. https://lore.kernel.org/qemu-devel/168185633821.899932.322047053764766056.stgit@djiang5-mobl3/ My concern with this approach is that it is using a side effect of a Linux implementation detail that the infra structure to bring up coherent memory is all present even for a GI only node (if it is which I can't recall) I'm also fairly sure we never tidied up the detail of going from the GI to the device in Linux (because it's harder than a _PXM entry for the device). It requires stashing a better description than the BDF before potentially doing reenumeration so that we can rebuild the association after that is done. > > Introduce a new acpi-generic-initiator object to allow host admin > provide the device and the corresponding NUMA node. Qemu maintain > this association and use this object to build the requisite GI > Affinity Structure. > > The admin provides the id of the device and the NUMA node id such > as in the following example. > -device vfio-pci-nohotplug,host=<bdf>,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \ > -object acpi-generic-initiator,id=gi0,device=dev0,node=2 \ This seems more different to existing numa configuration than necessary. The corner you have here of multiple GIs per PCI device is I hope the corner case so I'd like to see something that is really simple for single device, single node. Note that, whilst we don't do CXL 1.1 etc in QEMU yet, all CXL accelerators before CXL 2.0 are pretty much expected to present a GI SRAT entry + SRAT memory entry if appropriate. For CXL 2.0 and later everything can be left to be discovered by the OS but those Generic Ports I mentioned are an important part of that. Why not something like -numa node,gi=dev0 \ -numa node,gi=dev0 \ etc or maybe even the slightly weird -numa node,gi=dev0,gi=dev0,gi=dev0... ? > > Signed-off-by: Ankit Agrawal <ankita@nvidia.com> > --- > hw/acpi/acpi-generic-initiator.c | 74 ++++++++++++++++++++++++ > hw/acpi/meson.build | 1 + > include/hw/acpi/acpi-generic-initiator.h | 30 ++++++++++ > qapi/qom.json | 20 ++++++- > 4 files changed, 123 insertions(+), 2 deletions(-) > create mode 100644 hw/acpi/acpi-generic-initiator.c > create mode 100644 include/hw/acpi/acpi-generic-initiator.h > > diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c > new file mode 100644 > index 0000000000..6406736090 > --- /dev/null > +++ b/hw/acpi/acpi-generic-initiator.c > @@ -0,0 +1,74 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved > + */ > + > +#include "qemu/osdep.h" > +#include "hw/qdev-properties.h" > +#include "qapi/error.h" > +#include "qapi/visitor.h" > +#include "qom/object_interfaces.h" > +#include "qom/object.h" > +#include "hw/qdev-core.h" > +#include "hw/vfio/vfio-common.h" > +#include "hw/vfio/pci.h" > +#include "hw/pci/pci_device.h" > +#include "sysemu/numa.h" > +#include "hw/acpi/acpi-generic-initiator.h" > + > +OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator, > + ACPI_GENERIC_INITIATOR, OBJECT, > + { TYPE_USER_CREATABLE }, > + { NULL }) > + > +OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR) > + > +static void acpi_generic_initiator_init(Object *obj) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + gi->device = NULL; > + gi->node = MAX_NODES; > + gi->node_count = 1; > +} > + > +static void acpi_generic_initiator_finalize(Object *obj) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + > + g_free(gi->device); > +} > + > +static void acpi_generic_initiator_set_device(Object *obj, const char *value, > + Error **errp) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + > + gi->device = g_strdup(value); > +} > + > +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v, > + const char *name, void *opaque, > + Error **errp) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + uint32_t value; > + > + if (!visit_type_uint32(v, name, &value, errp)) { > + return; > + } > + > + if (value >= MAX_NODES) { > + return; > + } > + > + gi->node = value; > +} > + > +static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data) > +{ > + object_class_property_add_str(oc, ACPI_GENERIC_INITIATOR_DEVICE_PROP, NULL, > + acpi_generic_initiator_set_device); > + object_class_property_add(oc, ACPI_GENERIC_INITIATOR_NODE_PROP, "uint32", > + NULL, acpi_generic_initiator_set_node, NULL, > + NULL); > +} > diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build > index fc1b952379..22252836ed 100644 > --- a/hw/acpi/meson.build > +++ b/hw/acpi/meson.build > @@ -5,6 +5,7 @@ acpi_ss.add(files( > 'bios-linker-loader.c', > 'core.c', > 'utils.c', > + 'acpi-generic-initiator.c', > )) > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c', 'cpu_hotplug.c')) > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_false: files('acpi-cpu-hotplug-stub.c')) > diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h > new file mode 100644 > index 0000000000..e67e6e23b1 > --- /dev/null > +++ b/include/hw/acpi/acpi-generic-initiator.h > @@ -0,0 +1,30 @@ > +#ifndef ACPI_GENERIC_INITIATOR_H > +#define ACPI_GENERIC_INITIATOR_H > + > +#include "hw/mem/pc-dimm.h" > +#include "hw/acpi/bios-linker-loader.h" > +#include "qemu/uuid.h" > +#include "hw/acpi/aml-build.h" > +#include "qom/object.h" > +#include "qom/object_interfaces.h" > + > +#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator" > + > +#define ACPI_GENERIC_INITIATOR_DEVICE_PROP "device" > +#define ACPI_GENERIC_INITIATOR_NODE_PROP "node" > + > +typedef struct AcpiGenericInitiator { > + /* private */ > + Object parent; > + > + /* public */ > + char *device; > + uint32_t node; > + uint32_t node_count; > +} AcpiGenericInitiator; > + > +typedef struct AcpiGenericInitiatorClass { > + ObjectClass parent_class; > +} AcpiGenericInitiatorClass; > + > +#endif > diff --git a/qapi/qom.json b/qapi/qom.json > index fa3e88c8e6..86c87a161c 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -779,6 +779,20 @@ > { 'struct': 'VfioUserServerProperties', > 'data': { 'socket': 'SocketAddress', 'device': 'str' } } > > +## > +# @AcpiGenericInitiatorProperties: > +# > +# Properties for acpi-generic-initiator objects. > +# > +# @device: the ID of the device to be associated with the node > +# > +# @node: the ID of the numa node > +# > +# Since: 8.0 > +## > +{ 'struct': 'AcpiGenericInitiatorProperties', > + 'data': { 'device': 'str', 'node': 'uint32' } } > + > ## > # @RngProperties: > # > @@ -947,7 +961,8 @@ > 'tls-creds-x509', > 'tls-cipher-suites', > { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, > - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } > + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, > + 'acpi-generic-initiator' > ] } > > ## > @@ -1014,7 +1029,8 @@ > 'tls-creds-x509': 'TlsCredsX509Properties', > 'tls-cipher-suites': 'TlsCredsProperties', > 'x-remote-object': 'RemoteObjectProperties', > - 'x-vfio-user-server': 'VfioUserServerProperties' > + 'x-vfio-user-server': 'VfioUserServerProperties', > + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' > } } > > ##
On Sun, 8 Oct 2023 01:47:38 +0530 <ankita@nvidia.com> wrote: > From: Ankit Agrawal <ankita@nvidia.com> > > The CPU cache coherent device memory can be added as NUMA nodes > distinct from the system memory nodes. These nodes are associated > with the device and Qemu needs a way to maintain this link. > > Introduce a new acpi-generic-initiator object to allow host admin > provide the device and the corresponding NUMA node. Qemu maintain > this association and use this object to build the requisite GI > Affinity Structure. > > The admin provides the id of the device and the NUMA node id such > as in the following example. > -device vfio-pci-nohotplug,host=<bdf>,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \ > -object acpi-generic-initiator,id=gi0,device=dev0,node=2 \ > > Signed-off-by: Ankit Agrawal <ankita@nvidia.com> > --- > hw/acpi/acpi-generic-initiator.c | 74 ++++++++++++++++++++++++ > hw/acpi/meson.build | 1 + > include/hw/acpi/acpi-generic-initiator.h | 30 ++++++++++ > qapi/qom.json | 20 ++++++- > 4 files changed, 123 insertions(+), 2 deletions(-) > create mode 100644 hw/acpi/acpi-generic-initiator.c > create mode 100644 include/hw/acpi/acpi-generic-initiator.h > > diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c > new file mode 100644 > index 0000000000..6406736090 > --- /dev/null > +++ b/hw/acpi/acpi-generic-initiator.c > @@ -0,0 +1,74 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved > + */ > + > +#include "qemu/osdep.h" > +#include "hw/qdev-properties.h" > +#include "qapi/error.h" > +#include "qapi/visitor.h" > +#include "qom/object_interfaces.h" > +#include "qom/object.h" > +#include "hw/qdev-core.h" > +#include "hw/vfio/vfio-common.h" > +#include "hw/vfio/pci.h" > +#include "hw/pci/pci_device.h" > +#include "sysemu/numa.h" > +#include "hw/acpi/acpi-generic-initiator.h" > + > +OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator, > + ACPI_GENERIC_INITIATOR, OBJECT, > + { TYPE_USER_CREATABLE }, > + { NULL }) > + > +OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR) > + > +static void acpi_generic_initiator_init(Object *obj) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + gi->device = NULL; > + gi->node = MAX_NODES; > + gi->node_count = 1; > +} > + > +static void acpi_generic_initiator_finalize(Object *obj) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + > + g_free(gi->device); > +} > + > +static void acpi_generic_initiator_set_device(Object *obj, const char *value, > + Error **errp) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + > + gi->device = g_strdup(value); > +} > + > +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v, > + const char *name, void *opaque, > + Error **errp) > +{ > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > + uint32_t value; > + > + if (!visit_type_uint32(v, name, &value, errp)) { > + return; > + } > + > + if (value >= MAX_NODES) { error_setg() here? Thanks, Alex > + return; > + } > + > + gi->node = value; > +} > + > +static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data) > +{ > + object_class_property_add_str(oc, ACPI_GENERIC_INITIATOR_DEVICE_PROP, NULL, > + acpi_generic_initiator_set_device); > + object_class_property_add(oc, ACPI_GENERIC_INITIATOR_NODE_PROP, "uint32", > + NULL, acpi_generic_initiator_set_node, NULL, > + NULL); > +} > diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build > index fc1b952379..22252836ed 100644 > --- a/hw/acpi/meson.build > +++ b/hw/acpi/meson.build > @@ -5,6 +5,7 @@ acpi_ss.add(files( > 'bios-linker-loader.c', > 'core.c', > 'utils.c', > + 'acpi-generic-initiator.c', > )) > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c', 'cpu_hotplug.c')) > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_false: files('acpi-cpu-hotplug-stub.c')) > diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h > new file mode 100644 > index 0000000000..e67e6e23b1 > --- /dev/null > +++ b/include/hw/acpi/acpi-generic-initiator.h > @@ -0,0 +1,30 @@ > +#ifndef ACPI_GENERIC_INITIATOR_H > +#define ACPI_GENERIC_INITIATOR_H > + > +#include "hw/mem/pc-dimm.h" > +#include "hw/acpi/bios-linker-loader.h" > +#include "qemu/uuid.h" > +#include "hw/acpi/aml-build.h" > +#include "qom/object.h" > +#include "qom/object_interfaces.h" > + > +#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator" > + > +#define ACPI_GENERIC_INITIATOR_DEVICE_PROP "device" > +#define ACPI_GENERIC_INITIATOR_NODE_PROP "node" > + > +typedef struct AcpiGenericInitiator { > + /* private */ > + Object parent; > + > + /* public */ > + char *device; > + uint32_t node; > + uint32_t node_count; > +} AcpiGenericInitiator; > + > +typedef struct AcpiGenericInitiatorClass { > + ObjectClass parent_class; > +} AcpiGenericInitiatorClass; > + > +#endif > diff --git a/qapi/qom.json b/qapi/qom.json > index fa3e88c8e6..86c87a161c 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -779,6 +779,20 @@ > { 'struct': 'VfioUserServerProperties', > 'data': { 'socket': 'SocketAddress', 'device': 'str' } } > > +## > +# @AcpiGenericInitiatorProperties: > +# > +# Properties for acpi-generic-initiator objects. > +# > +# @device: the ID of the device to be associated with the node > +# > +# @node: the ID of the numa node > +# > +# Since: 8.0 > +## > +{ 'struct': 'AcpiGenericInitiatorProperties', > + 'data': { 'device': 'str', 'node': 'uint32' } } > + > ## > # @RngProperties: > # > @@ -947,7 +961,8 @@ > 'tls-creds-x509', > 'tls-cipher-suites', > { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, > - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } > + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, > + 'acpi-generic-initiator' > ] } > > ## > @@ -1014,7 +1029,8 @@ > 'tls-creds-x509': 'TlsCredsX509Properties', > 'tls-cipher-suites': 'TlsCredsProperties', > 'x-remote-object': 'RemoteObjectProperties', > - 'x-vfio-user-server': 'VfioUserServerProperties' > + 'x-vfio-user-server': 'VfioUserServerProperties', > + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' > } } > > ##
Hi Jonathan, > -----Original Message----- > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > Sent: Monday, October 9, 2023 7:27 AM > To: Ankit Agrawal <ankita@nvidia.com> > Cc: Jason Gunthorpe <jgg@nvidia.com>; alex.williamson@redhat.com; > clg@redhat.com; shannon.zhaosl@gmail.com; peter.maydell@linaro.org; > ani@anisinha.ca; berrange@redhat.com; eduardo@habkost.net; > imammedo@redhat.com; mst@redhat.com; eblake@redhat.com; > armbru@redhat.com; david@redhat.com; gshan@redhat.com; Aniket > Agashe <aniketa@nvidia.com>; Neo Jia <cjia@nvidia.com>; Kirti Wankhede > <kwankhede@nvidia.com>; Tarun Gupta (SW-GPU) <targupta@nvidia.com>; > Vikram Sethi <vsethi@nvidia.com>; Andy Currid <acurrid@nvidia.com>; > Dheeraj Nigam <dnigam@nvidia.com>; Uday Dhoke <udhoke@nvidia.com>; > qemu-arm@nongnu.org; qemu-devel@nongnu.org; Dave Jiang > <dave.jiang@intel.com> > Subject: Re: [PATCH v2 1/3] qom: new object to associate device to numa > node > > > On Sun, 8 Oct 2023 01:47:38 +0530 > <ankita@nvidia.com> wrote: > > > From: Ankit Agrawal <ankita@nvidia.com> > > > > The CPU cache coherent device memory can be added as NUMA nodes > > distinct from the system memory nodes. These nodes are associated with > > the device and Qemu needs a way to maintain this link. > > Hi Ankit, > > I'm not sure I'm convinced of the approach to creating nodes for memory > usage (or whether that works in Linux on all NUMA ACPI archs), but I am > keen to see Generic Initiator support in QEMU. I'd also like to see it done in a > way that naturally extends to Generic Ports which are very similar (but don't > hang memory off them! :) Dave Jiang posted a PoC a while back for generic > ports. > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore. > kernel.org%2Fqemu- > devel%2F168185633821.899932.322047053764766056.stgit%40djiang5- > mobl3%2F&data=05%7C01%7Cvsethi%40nvidia.com%7C846b19f87bc5424d > c33608dbc8c3015d%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7 > C638324512146712954%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C% > 7C%7C&sdata=v318MXognoITHyv7AFqZAfvUi2hLy2ZUVnLvyQ2IAfY%3D&res > erved=0 > > My concern with this approach is that it is using a side effect of a Linux > implementation detail that the infra structure to bring up coherent memory > is all present even for a GI only node (if it is which I can't recall) I'm also fairly > sure we never tidied up the detail of going from the GI to the device in Linux > (because it's harder than a _PXM entry for the device). It requires stashing a > better description than the BDF before potentially doing reenumeration so > that we can rebuild the association after that is done. > I'm not sure I understood the concern. Are you suggesting that the ACPI specification somehow prohibits memory associated with a GI node in the same PXM? i.e whether the GI is memory-less or with memory isn't mandated by the spec IIRC. Certainly seems perfectly normal for an accelerator with memory to have a GI with memory and that memory be able to be associated with the same PXM. So what about this patch is using a Linux implementation detail? Even if Linux wasn't currently supporting that use case, it is something that would have been reasonable to add IMO. What am I missing? > > > > Introduce a new acpi-generic-initiator object to allow host admin > > provide the device and the corresponding NUMA node. Qemu maintain > this > > association and use this object to build the requisite GI Affinity > > Structure. > > > > The admin provides the id of the device and the NUMA node id such as > > in the following example. > > -device > > vfio-pci-nohotplug,host=<bdf>,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \ > > -object acpi-generic-initiator,id=gi0,device=dev0,node=2 \ > > This seems more different to existing numa configuration than necessary. > The corner you have here of multiple GIs per PCI device is I hope the corner > case so I'd like to see something that is really simple for single device, single > node. Note that, whilst we don't do CXL 1.1 etc in QEMU yet, all CXL > accelerators before CXL 2.0 are pretty much expected to present a GI SRAT > entry + SRAT memory entry if appropriate. > For CXL 2.0 and later everything can be left to be discovered by the OS but > those Generic Ports I mentioned are an important part of that. > > Why not something like > -numa node,gi=dev0 \ > -numa node,gi=dev0 \ > etc or maybe even the slightly weird > -numa node,gi=dev0,gi=dev0,gi=dev0... > > ? > > > > > Signed-off-by: Ankit Agrawal <ankita@nvidia.com> > > --- > > hw/acpi/acpi-generic-initiator.c | 74 ++++++++++++++++++++++++ > > hw/acpi/meson.build | 1 + > > include/hw/acpi/acpi-generic-initiator.h | 30 ++++++++++ > > qapi/qom.json | 20 ++++++- > > 4 files changed, 123 insertions(+), 2 deletions(-) create mode > > 100644 hw/acpi/acpi-generic-initiator.c create mode 100644 > > include/hw/acpi/acpi-generic-initiator.h > > > > diff --git a/hw/acpi/acpi-generic-initiator.c > > b/hw/acpi/acpi-generic-initiator.c > > new file mode 100644 > > index 0000000000..6406736090 > > --- /dev/null > > +++ b/hw/acpi/acpi-generic-initiator.c > > @@ -0,0 +1,74 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* > > + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights > > +reserved */ > > + > > +#include "qemu/osdep.h" > > +#include "hw/qdev-properties.h" > > +#include "qapi/error.h" > > +#include "qapi/visitor.h" > > +#include "qom/object_interfaces.h" > > +#include "qom/object.h" > > +#include "hw/qdev-core.h" > > +#include "hw/vfio/vfio-common.h" > > +#include "hw/vfio/pci.h" > > +#include "hw/pci/pci_device.h" > > +#include "sysemu/numa.h" > > +#include "hw/acpi/acpi-generic-initiator.h" > > + > > +OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, > acpi_generic_initiator, > > + ACPI_GENERIC_INITIATOR, OBJECT, > > + { TYPE_USER_CREATABLE }, > > + { NULL }) > > + > > +OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, > > +ACPI_GENERIC_INITIATOR) > > + > > +static void acpi_generic_initiator_init(Object *obj) { > > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > > + gi->device = NULL; > > + gi->node = MAX_NODES; > > + gi->node_count = 1; > > +} > > + > > +static void acpi_generic_initiator_finalize(Object *obj) { > > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > > + > > + g_free(gi->device); > > +} > > + > > +static void acpi_generic_initiator_set_device(Object *obj, const char > *value, > > + Error **errp) { > > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > > + > > + gi->device = g_strdup(value); > > +} > > + > > +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v, > > + const char *name, void *opaque, > > + Error **errp) { > > + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); > > + uint32_t value; > > + > > + if (!visit_type_uint32(v, name, &value, errp)) { > > + return; > > + } > > + > > + if (value >= MAX_NODES) { > > + return; > > + } > > + > > + gi->node = value; > > +} > > + > > +static void acpi_generic_initiator_class_init(ObjectClass *oc, void > > +*data) { > > + object_class_property_add_str(oc, > ACPI_GENERIC_INITIATOR_DEVICE_PROP, NULL, > > + acpi_generic_initiator_set_device); > > + object_class_property_add(oc, > ACPI_GENERIC_INITIATOR_NODE_PROP, "uint32", > > + NULL, acpi_generic_initiator_set_node, NULL, > > + NULL); > > +} > > diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build index > > fc1b952379..22252836ed 100644 > > --- a/hw/acpi/meson.build > > +++ b/hw/acpi/meson.build > > @@ -5,6 +5,7 @@ acpi_ss.add(files( > > 'bios-linker-loader.c', > > 'core.c', > > 'utils.c', > > + 'acpi-generic-initiator.c', > > )) > > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c', > > 'cpu_hotplug.c')) > > acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_false: > > files('acpi-cpu-hotplug-stub.c')) diff --git > > a/include/hw/acpi/acpi-generic-initiator.h > > b/include/hw/acpi/acpi-generic-initiator.h > > new file mode 100644 > > index 0000000000..e67e6e23b1 > > --- /dev/null > > +++ b/include/hw/acpi/acpi-generic-initiator.h > > @@ -0,0 +1,30 @@ > > +#ifndef ACPI_GENERIC_INITIATOR_H > > +#define ACPI_GENERIC_INITIATOR_H > > + > > +#include "hw/mem/pc-dimm.h" > > +#include "hw/acpi/bios-linker-loader.h" > > +#include "qemu/uuid.h" > > +#include "hw/acpi/aml-build.h" > > +#include "qom/object.h" > > +#include "qom/object_interfaces.h" > > + > > +#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator" > > + > > +#define ACPI_GENERIC_INITIATOR_DEVICE_PROP "device" > > +#define ACPI_GENERIC_INITIATOR_NODE_PROP "node" > > + > > +typedef struct AcpiGenericInitiator { > > + /* private */ > > + Object parent; > > + > > + /* public */ > > + char *device; > > + uint32_t node; > > + uint32_t node_count; > > +} AcpiGenericInitiator; > > + > > +typedef struct AcpiGenericInitiatorClass { > > + ObjectClass parent_class; > > +} AcpiGenericInitiatorClass; > > + > > +#endif > > diff --git a/qapi/qom.json b/qapi/qom.json index > > fa3e88c8e6..86c87a161c 100644 > > --- a/qapi/qom.json > > +++ b/qapi/qom.json > > @@ -779,6 +779,20 @@ > > { 'struct': 'VfioUserServerProperties', > > 'data': { 'socket': 'SocketAddress', 'device': 'str' } } > > > > +## > > +# @AcpiGenericInitiatorProperties: > > +# > > +# Properties for acpi-generic-initiator objects. > > +# > > +# @device: the ID of the device to be associated with the node # # > > +@node: the ID of the numa node # # Since: 8.0 ## { 'struct': > > +'AcpiGenericInitiatorProperties', > > + 'data': { 'device': 'str', 'node': 'uint32' } } > > + > > ## > > # @RngProperties: > > # > > @@ -947,7 +961,8 @@ > > 'tls-creds-x509', > > 'tls-cipher-suites', > > { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, > > - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } > > + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, > > + 'acpi-generic-initiator' > > ] } > > > > ## > > @@ -1014,7 +1029,8 @@ > > 'tls-creds-x509': 'TlsCredsX509Properties', > > 'tls-cipher-suites': 'TlsCredsProperties', > > 'x-remote-object': 'RemoteObjectProperties', > > - 'x-vfio-user-server': 'VfioUserServerProperties' > > + 'x-vfio-user-server': 'VfioUserServerProperties', > > + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' > > } } > > > > ##
On Wed, 11 Oct 2023 17:37:11 +0000 Vikram Sethi <vsethi@nvidia.com> wrote: > Hi Jonathan, > > > -----Original Message----- > > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com> > > Sent: Monday, October 9, 2023 7:27 AM > > To: Ankit Agrawal <ankita@nvidia.com> > > Cc: Jason Gunthorpe <jgg@nvidia.com>; alex.williamson@redhat.com; > > clg@redhat.com; shannon.zhaosl@gmail.com; peter.maydell@linaro.org; > > ani@anisinha.ca; berrange@redhat.com; eduardo@habkost.net; > > imammedo@redhat.com; mst@redhat.com; eblake@redhat.com; > > armbru@redhat.com; david@redhat.com; gshan@redhat.com; Aniket > > Agashe <aniketa@nvidia.com>; Neo Jia <cjia@nvidia.com>; Kirti Wankhede > > <kwankhede@nvidia.com>; Tarun Gupta (SW-GPU) <targupta@nvidia.com>; > > Vikram Sethi <vsethi@nvidia.com>; Andy Currid <acurrid@nvidia.com>; > > Dheeraj Nigam <dnigam@nvidia.com>; Uday Dhoke <udhoke@nvidia.com>; > > qemu-arm@nongnu.org; qemu-devel@nongnu.org; Dave Jiang > > <dave.jiang@intel.com> > > Subject: Re: [PATCH v2 1/3] qom: new object to associate device to numa > > node > > > > > > On Sun, 8 Oct 2023 01:47:38 +0530 > > <ankita@nvidia.com> wrote: > > > > > From: Ankit Agrawal <ankita@nvidia.com> > > > > > > The CPU cache coherent device memory can be added as NUMA nodes > > > distinct from the system memory nodes. These nodes are associated with > > > the device and Qemu needs a way to maintain this link. > > > > Hi Ankit, > > > > I'm not sure I'm convinced of the approach to creating nodes for memory > > usage (or whether that works in Linux on all NUMA ACPI archs), but I am > > keen to see Generic Initiator support in QEMU. I'd also like to see it done in a > > way that naturally extends to Generic Ports which are very similar (but don't > > hang memory off them! :) Dave Jiang posted a PoC a while back for generic > > ports. > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore. > > kernel.org%2Fqemu- > > devel%2F168185633821.899932.322047053764766056.stgit%40djiang5- > > mobl3%2F&data=05%7C01%7Cvsethi%40nvidia.com%7C846b19f87bc5424d > > c33608dbc8c3015d%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7 > > C638324512146712954%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C% > > 7C%7C&sdata=v318MXognoITHyv7AFqZAfvUi2hLy2ZUVnLvyQ2IAfY%3D&res > > erved=0 > > > > My concern with this approach is that it is using a side effect of a Linux > > implementation detail that the infra structure to bring up coherent memory > > is all present even for a GI only node (if it is which I can't recall) I'm also fairly > > sure we never tidied up the detail of going from the GI to the device in Linux > > (because it's harder than a _PXM entry for the device). It requires stashing a > > better description than the BDF before potentially doing reenumeration so > > that we can rebuild the association after that is done. > > > > I'm not sure I understood the concern. Are you suggesting that the ACPI specification > somehow prohibits memory associated with a GI node in the same PXM? i.e whether the GI is memory-less > or with memory isn't mandated by the spec IIRC. Certainly seems perfectly normal > for an accelerator with memory to have a GI with memory and that memory be able to be associated with the same PXM. Indeed reasonable that a GI would have associated memory, but if it's "normal memory" (i.e. coherent and not device private memory accessed by PCI bar etc) then expectation would be that memory is in SRAT as a memory entry. Which brings us back to the original question of whether 0 sized memory nodes are fine. > So what about this patch is using a Linux implementation detail? Even if Linux wasn't currently supporting > that use case, it is something that would have been reasonable to add IMO. What am I missing? Linux is careful to only bring up the infrastructure for specific types of roximity node. It works its way through SRAT and sets appropriate bitmap bits to say which combination of PXM node types a given node is. (CPU, Memory, GI etc) After that walk is done it then brings up various infrastructure. What I can't remember (you'll need to experiment) is if there is anything not brought up for a non Memory node that you would need. Might be fine, but that doesn't mean it will remain fine. Maybe we just need to make sure the documentation / comments in Linux cover this use case. You are on your own for what other OSes decided is valid here as the specifcation does not mention this AFAIK. If it does then add a reference. There is a non trivial (potential) cost to enabling facilities on NUMA nodes that will never make use of them - a bunch of longer searches etc when looking for memory. For GIs we enable pretty much everything a CPU node uses. That was controversial though only well after support was already in - the controversy being that it added costs to paths that didn't care about GIs. Basically it boils down to using unexpected corners of specifications may prove fragile. For one thing I'm doubtful if the NUMA description the kernel exposes (coming from a subset of HMAT) won't deal with this case. Not tried it though so you may be lucky. Jonathan
<ankita@nvidia.com> writes: > From: Ankit Agrawal <ankita@nvidia.com> > > The CPU cache coherent device memory can be added as NUMA nodes > distinct from the system memory nodes. These nodes are associated > with the device and Qemu needs a way to maintain this link. > > Introduce a new acpi-generic-initiator object to allow host admin > provide the device and the corresponding NUMA node. Qemu maintain > this association and use this object to build the requisite GI > Affinity Structure. > > The admin provides the id of the device and the NUMA node id such > as in the following example. > -device vfio-pci-nohotplug,host=<bdf>,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \ > -object acpi-generic-initiator,id=gi0,device=dev0,node=2 \ > > Signed-off-by: Ankit Agrawal <ankita@nvidia.com> [...] > diff --git a/qapi/qom.json b/qapi/qom.json > index fa3e88c8e6..86c87a161c 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -779,6 +779,20 @@ > { 'struct': 'VfioUserServerProperties', > 'data': { 'socket': 'SocketAddress', 'device': 'str' } } > > +## > +# @AcpiGenericInitiatorProperties: > +# > +# Properties for acpi-generic-initiator objects. > +# > +# @device: the ID of the device to be associated with the node > +# > +# @node: the ID of the numa node > +# > +# Since: 8.0 Since 8.2 > +## > +{ 'struct': 'AcpiGenericInitiatorProperties', > + 'data': { 'device': 'str', 'node': 'uint32' } } > + > ## > # @RngProperties: > # > @@ -947,7 +961,8 @@ > 'tls-creds-x509', > 'tls-cipher-suites', > { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, > - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } > + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, > + 'acpi-generic-initiator' Please keep the object types sorted alphabetically. > ] } > > ## > @@ -1014,7 +1029,8 @@ > 'tls-creds-x509': 'TlsCredsX509Properties', > 'tls-cipher-suites': 'TlsCredsProperties', > 'x-remote-object': 'RemoteObjectProperties', > - 'x-vfio-user-server': 'VfioUserServerProperties' > + 'x-vfio-user-server': 'VfioUserServerProperties', > + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' Likewise. > } } > > ##
>> +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v, >> + const char *name, void *opaque, >> + Error **errp) >> +{ >> + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); >> + uint32_t value; >> + >> + if (!visit_type_uint32(v, name, &value, errp)) { >> + return; >> + } >> + >> + if (value >= MAX_NODES) { > > error_setg() here? Thanks, Thanks for pointing out, will make the change in the next version. >> +## >> +# @AcpiGenericInitiatorProperties: >> +# >> +# Properties for acpi-generic-initiator objects. >> +# >> +# @device: the ID of the device to be associated with the node >> +# >> +# @node: the ID of the numa node >> +# >> +# Since: 8.0 > > Since 8.2 Thanks, will make the change. >> 'tls-creds-x509', >> 'tls-cipher-suites', >> { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, >> - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } >> + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, >> + 'acpi-generic-initiator' > > Please keep the object types sorted alphabetically. Ack. >> @@ -1014,7 +1029,8 @@ >> 'tls-creds-x509': 'TlsCredsX509Properties', >> 'tls-cipher-suites': 'TlsCredsProperties', >> 'x-remote-object': 'RemoteObjectProperties', >> - 'x-vfio-user-server': 'VfioUserServerProperties' >> + 'x-vfio-user-server': 'VfioUserServerProperties', >> + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' > > Likewise. Ack, will make the change.
diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c new file mode 100644 index 0000000000..6406736090 --- /dev/null +++ b/hw/acpi/acpi-generic-initiator.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#include "qemu/osdep.h" +#include "hw/qdev-properties.h" +#include "qapi/error.h" +#include "qapi/visitor.h" +#include "qom/object_interfaces.h" +#include "qom/object.h" +#include "hw/qdev-core.h" +#include "hw/vfio/vfio-common.h" +#include "hw/vfio/pci.h" +#include "hw/pci/pci_device.h" +#include "sysemu/numa.h" +#include "hw/acpi/acpi-generic-initiator.h" + +OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator, + ACPI_GENERIC_INITIATOR, OBJECT, + { TYPE_USER_CREATABLE }, + { NULL }) + +OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR) + +static void acpi_generic_initiator_init(Object *obj) +{ + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); + gi->device = NULL; + gi->node = MAX_NODES; + gi->node_count = 1; +} + +static void acpi_generic_initiator_finalize(Object *obj) +{ + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); + + g_free(gi->device); +} + +static void acpi_generic_initiator_set_device(Object *obj, const char *value, + Error **errp) +{ + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); + + gi->device = g_strdup(value); +} + +static void acpi_generic_initiator_set_node(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj); + uint32_t value; + + if (!visit_type_uint32(v, name, &value, errp)) { + return; + } + + if (value >= MAX_NODES) { + return; + } + + gi->node = value; +} + +static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data) +{ + object_class_property_add_str(oc, ACPI_GENERIC_INITIATOR_DEVICE_PROP, NULL, + acpi_generic_initiator_set_device); + object_class_property_add(oc, ACPI_GENERIC_INITIATOR_NODE_PROP, "uint32", + NULL, acpi_generic_initiator_set_node, NULL, + NULL); +} diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build index fc1b952379..22252836ed 100644 --- a/hw/acpi/meson.build +++ b/hw/acpi/meson.build @@ -5,6 +5,7 @@ acpi_ss.add(files( 'bios-linker-loader.c', 'core.c', 'utils.c', + 'acpi-generic-initiator.c', )) acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c', 'cpu_hotplug.c')) acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_false: files('acpi-cpu-hotplug-stub.c')) diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h new file mode 100644 index 0000000000..e67e6e23b1 --- /dev/null +++ b/include/hw/acpi/acpi-generic-initiator.h @@ -0,0 +1,30 @@ +#ifndef ACPI_GENERIC_INITIATOR_H +#define ACPI_GENERIC_INITIATOR_H + +#include "hw/mem/pc-dimm.h" +#include "hw/acpi/bios-linker-loader.h" +#include "qemu/uuid.h" +#include "hw/acpi/aml-build.h" +#include "qom/object.h" +#include "qom/object_interfaces.h" + +#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator" + +#define ACPI_GENERIC_INITIATOR_DEVICE_PROP "device" +#define ACPI_GENERIC_INITIATOR_NODE_PROP "node" + +typedef struct AcpiGenericInitiator { + /* private */ + Object parent; + + /* public */ + char *device; + uint32_t node; + uint32_t node_count; +} AcpiGenericInitiator; + +typedef struct AcpiGenericInitiatorClass { + ObjectClass parent_class; +} AcpiGenericInitiatorClass; + +#endif diff --git a/qapi/qom.json b/qapi/qom.json index fa3e88c8e6..86c87a161c 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -779,6 +779,20 @@ { 'struct': 'VfioUserServerProperties', 'data': { 'socket': 'SocketAddress', 'device': 'str' } } +## +# @AcpiGenericInitiatorProperties: +# +# Properties for acpi-generic-initiator objects. +# +# @device: the ID of the device to be associated with the node +# +# @node: the ID of the numa node +# +# Since: 8.0 +## +{ 'struct': 'AcpiGenericInitiatorProperties', + 'data': { 'device': 'str', 'node': 'uint32' } } + ## # @RngProperties: # @@ -947,7 +961,8 @@ 'tls-creds-x509', 'tls-cipher-suites', { 'name': 'x-remote-object', 'features': [ 'unstable' ] }, - { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] } + { 'name': 'x-vfio-user-server', 'features': [ 'unstable' ] }, + 'acpi-generic-initiator' ] } ## @@ -1014,7 +1029,8 @@ 'tls-creds-x509': 'TlsCredsX509Properties', 'tls-cipher-suites': 'TlsCredsProperties', 'x-remote-object': 'RemoteObjectProperties', - 'x-vfio-user-server': 'VfioUserServerProperties' + 'x-vfio-user-server': 'VfioUserServerProperties', + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties' } } ##