Message ID | 20231203060245.31593-2-ankita@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | acpi: report numa nodes for device memory using GI |
<ankita@nvidia.com> writes:

> From: Ankit Agrawal <ankita@nvidia.com>
>
> NVIDIA GPU's support MIG (Mult-Instance GPUs) feature [1], which allows
> partitioning of the GPU device resources (including device memory) into
> several (upto 8) isolated instances. Each of the partitioned memory needs
> a dedicated NUMA node to operate. The partitions are not fixed and they
> can be created/deleted at runtime.
>
> Unfortunately Linux OS does not provide a means to dynamically create/destroy
> NUMA nodes and such feature implementation is not expected to be trivial. The
> nodes that OS discovers at the boot time while parsing SRAT remains fixed. So
> we utilize the Generic Initiator Affinity structures that allows association
> between nodes and devices. Multiple GI structures per BDF is possible,
> allowing creation of multiple nodes by exposing unique PXM in each of these
> structures.
>
> Introduce a new acpi-generic-initiator object to allow host admin provide the
> device and the corresponding NUMA nodes. Qemu maintain this association and
> use this object to build the requisite GI Affinity Structure.

Pardon my ignorance...  What makes this object an "initiator", and why
is it "generic"?

> An admin can provide the range of nodes through a uint16 array host-nodes
> and link it to a device by providing its id. Currently, only PCI device is
> supported. The following sample creates 8 nodes and link them to the PCI
> device dev0:
>
> -numa node,nodeid=2 \
> -numa node,nodeid=3 \
> -numa node,nodeid=4 \
> -numa node,nodeid=5 \
> -numa node,nodeid=6 \
> -numa node,nodeid=7 \
> -numa node,nodeid=8 \
> -numa node,nodeid=9 \
> -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \

Does this link *all* NUMA nodes to dev0?

Would an example involving two devices be more instructive?
> [1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu
>
> Signed-off-by: Ankit Agrawal <ankita@nvidia.com>

[...]

> diff --git a/qapi/qom.json b/qapi/qom.json
> index c53ef978ff..efcc4b8dfd 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -794,6 +794,21 @@
>  { 'struct': 'VfioUserServerProperties',
>    'data': { 'socket': 'SocketAddress', 'device': 'str' } }
>
> +##
> +# @AcpiGenericInitiatorProperties:
> +#
> +# Properties for acpi-generic-initiator objects.
> +#
> +# @pci-dev: PCI device ID to be associated with the node
> +#
> +# @host-nodes: numa node list

This feels a bit terse.  The commit message makes me guess this
specifies the NUMA nodes to be linked to @pci-dev.  Correct?

> +#
> +# Since: 9.0
> +##
> +{ 'struct': 'AcpiGenericInitiatorProperties',
> +  'data': { 'pci-dev': 'str',
> +            'host-nodes': ['uint16'] } }
> +
>  ##
>  # @RngProperties:
>  #
> @@ -911,6 +926,7 @@
>  ##
>  { 'enum': 'ObjectType',
>    'data': [
> +    'acpi-generic-initiator',
>      'authz-list',
>      'authz-listfile',
>      'authz-pam',
> @@ -981,6 +997,7 @@
>        'id': 'str' },
>    'discriminator': 'qom-type',
>    'data': {
> +      'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
>        'authz-list': 'AuthZListProperties',
>        'authz-listfile': 'AuthZListFileProperties',
>        'authz-pam': 'AuthZPAMProperties',
Thanks Markus for the review.

>> Introduce a new acpi-generic-initiator object to allow host admin provide the
>> device and the corresponding NUMA nodes. Qemu maintain this association and
>> use this object to build the requisite GI Affinity Structure.
>
> Pardon my ignorance...  What makes this object an "initiator", and why
> is it "generic"?

ACPI 6.3 (https://uefi.org/sites/default/files/resources/ACPI_6_3_final_Jan30.pdf)
introduced a new structure in the System Resource Affinity Table (SRAT), the
Generic Initiator Affinity Structure. It describes devices such as
heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines (termed Generic Initiators) that are
present on the system, and it is used to associate a proximity domain with
those devices. See section 5.2.16.6 of the specification linked above.

This patch implements that structure (Table 5-78) for QEMU ACPI.

>> An admin can provide the range of nodes through a uint16 array host-nodes
>> and link it to a device by providing its id. Currently, only PCI device is
>> supported. The following sample creates 8 nodes and link them to the PCI
>> device dev0:
>>
>> -numa node,nodeid=2 \
>> -numa node,nodeid=3 \
>> -numa node,nodeid=4 \
>> -numa node,nodeid=5 \
>> -numa node,nodeid=6 \
>> -numa node,nodeid=7 \
>> -numa node,nodeid=8 \
>> -numa node,nodeid=9 \
>> -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
>> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \
>
> Does this link *all* NUMA nodes to dev0?
>
> Would an example involving two devices be more instructive?

Sure, updated in v6.
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index c53ef978ff..efcc4b8dfd 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -794,6 +794,21 @@
>>  { 'struct': 'VfioUserServerProperties',
>>    'data': { 'socket': 'SocketAddress', 'device': 'str' } }
>>
>> +##
>> +# @AcpiGenericInitiatorProperties:
>> +#
>> +# Properties for acpi-generic-initiator objects.
>> +#
>> +# @pci-dev: PCI device ID to be associated with the node
>> +#
>> +# @host-nodes: numa node list
>
> This feels a bit terse.  The commit message makes me guess this
> specifies the NUMA nodes to be linked to @pci-dev.  Correct?

Right, it could be made clearer. Done in v6.
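For readers who do not have the specification open, the 32-byte layout of the Generic Initiator Affinity Structure with a PCI device handle can be sketched roughly as below. This is only an illustration written against ACPI 6.3 Table 5-78, not the code from this series: the helper names (`put_le16`, `put_le32`, `srat_gi_build_pci`) and the `SRAT_GI_*` macros are hypothetical, and the field offsets should be checked against the specification before relying on them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values as read from ACPI 6.3, section 5.2.16.6 (names are hypothetical). */
#define SRAT_GI_TYPE          5   /* SRAT structure type: Generic Initiator */
#define SRAT_GI_LENGTH        32  /* total structure size in bytes */
#define SRAT_GI_HANDLE_PCI    1   /* device handle type: PCI (0 would be ACPI) */
#define SRAT_GI_FLAG_ENABLED  1u  /* flags bit 0 */

/* ACPI tables are little endian; store multi-byte fields byte by byte. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    put_le16(p, (uint16_t)(v & 0xffff));
    put_le16(p + 2, (uint16_t)(v >> 16));
}

/*
 * Fill one Generic Initiator Affinity Structure that ties the PCI function
 * segment:bus:dev.fn to proximity domain pxm. One such structure would be
 * emitted per NUMA node linked to the device.
 */
static void srat_gi_build_pci(uint8_t out[SRAT_GI_LENGTH], uint32_t pxm,
                              uint16_t segment, uint8_t bus,
                              uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);

    memset(out, 0, SRAT_GI_LENGTH);       /* zeroes all reserved fields */
    out[0] = SRAT_GI_TYPE;                /* offset 0: type */
    out[1] = SRAT_GI_LENGTH;              /* offset 1: length */
                                          /* offset 2: reserved */
    out[3] = SRAT_GI_HANDLE_PCI;          /* offset 3: device handle type */
    put_le32(out + 4, pxm);               /* offset 4: proximity domain */
    put_le16(out + 8, segment);           /* offset 8: PCI segment */
    out[10] = bus;                        /* offset 10: PCI bus number */
    out[11] = (uint8_t)((dev << 3) | fn); /* offset 11: device 7:3, function 2:0 */
                                          /* offsets 12..23: reserved */
    put_le32(out + 24, SRAT_GI_FLAG_ENABLED); /* offset 24: flags */
                                          /* offsets 28..31: reserved */
}
```

With the sample command line quoted above, the guest would presumably see eight such structures, one per node 2..9, all carrying the same PCI handle for dev0 and differing only in the proximity domain field.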
diff --git a/hw/acpi/acpi-generic-initiator.c b/hw/acpi/acpi-generic-initiator.c
new file mode 100644
index 0000000000..e05e28e962
--- /dev/null
+++ b/hw/acpi/acpi-generic-initiator.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi-generic-initiator.h"
+#include "hw/pci/pci_device.h"
+#include "qapi/error.h"
+#include "qapi/qapi-builtin-visit.h"
+#include "qapi/visitor.h"
+#include "qemu/error-report.h"
+
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, acpi_generic_initiator,
+                   ACPI_GENERIC_INITIATOR, OBJECT,
+                   { TYPE_USER_CREATABLE },
+                   { NULL })
+
+OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR)
+
+static void acpi_generic_initiator_init(Object *obj)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+    bitmap_zero(gi->host_nodes, MAX_NODES);
+    gi->pci_dev = NULL;
+}
+
+static void acpi_generic_initiator_finalize(Object *obj)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+    g_free(gi->pci_dev);
+}
+
+static void acpi_generic_initiator_set_pci_device(Object *obj, const char *val,
+                                                  Error **errp)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+
+    gi->pci_dev = g_strdup(val);
+}
+
+static void
+acpi_generic_initiator_set_host_nodes(Object *obj, Visitor *v, const char *name,
+                                      void *opaque, Error **errp)
+{
+    AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+    uint16List *l = NULL, *host_nodes = NULL;
+
+    visit_type_uint16List(v, name, &host_nodes, errp);
+
+    for (l = host_nodes; l; l = l->next) {
+        if (l->value >= MAX_NODES) {
+            error_setg(errp, "Invalid host-nodes value: %d", l->value);
+            break;
+        } else {
+            bitmap_set(gi->host_nodes, l->value, 1);
+        }
+    }
+
+    qapi_free_uint16List(host_nodes);
+}
+
+static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
+{
+    object_class_property_add_str(oc, "pci-dev", NULL,
+        acpi_generic_initiator_set_pci_device);
+    object_class_property_add(oc, "host-nodes", "int", NULL,
+        acpi_generic_initiator_set_host_nodes, NULL, NULL);
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index fc1b952379..2268589519 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -1,5 +1,6 @@
 acpi_ss = ss.source_set()
 acpi_ss.add(files(
+  'acpi-generic-initiator.c',
   'acpi_interface.c',
   'aml-build.c',
   'bios-linker-loader.c',
diff --git a/include/hw/acpi/acpi-generic-initiator.h b/include/hw/acpi/acpi-generic-initiator.h
new file mode 100644
index 0000000000..41b0c8cda2
--- /dev/null
+++ b/include/hw/acpi/acpi-generic-initiator.h
@@ -0,0 +1,27 @@
+#ifndef ACPI_GENERIC_INITIATOR_H
+#define ACPI_GENERIC_INITIATOR_H
+
+#include "hw/mem/pc-dimm.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/aml-build.h"
+#include "sysemu/numa.h"
+#include "qemu/uuid.h"
+#include "qom/object.h"
+#include "qom/object_interfaces.h"
+
+#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator"
+
+typedef struct AcpiGenericInitiator {
+    /* private */
+    Object parent;
+
+    /* public */
+    char *pci_dev;
+    DECLARE_BITMAP(host_nodes, MAX_NODES);
+} AcpiGenericInitiator;
+
+typedef struct AcpiGenericInitiatorClass {
+    ObjectClass parent_class;
+} AcpiGenericInitiatorClass;
+
+#endif
diff --git a/qapi/qom.json b/qapi/qom.json
index c53ef978ff..efcc4b8dfd 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -794,6 +794,21 @@
 { 'struct': 'VfioUserServerProperties',
   'data': { 'socket': 'SocketAddress', 'device': 'str' } }
 
+##
+# @AcpiGenericInitiatorProperties:
+#
+# Properties for acpi-generic-initiator objects.
+#
+# @pci-dev: PCI device ID to be associated with the node
+#
+# @host-nodes: numa node list
+#
+# Since: 9.0
+##
+{ 'struct': 'AcpiGenericInitiatorProperties',
+  'data': { 'pci-dev': 'str',
+            'host-nodes': ['uint16'] } }
+
 ##
 # @RngProperties:
 #
@@ -911,6 +926,7 @@
 ##
 { 'enum': 'ObjectType',
   'data': [
+    'acpi-generic-initiator',
     'authz-list',
     'authz-listfile',
     'authz-pam',
@@ -981,6 +997,7 @@
       'id': 'str' },
   'discriminator': 'qom-type',
   'data': {
+      'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
       'authz-list': 'AuthZListProperties',
       'authz-listfile': 'AuthZListFileProperties',
       'authz-pam': 'AuthZPAMProperties',
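As a side note on the host-nodes setter in the patch above, the list-to-bitmap step can be illustrated outside QEMU with a small standalone sketch. It is not the code from this series and does not use QEMU's bitmap or visitor helpers: `NodeSet`, `node_set_assign`, `node_set_test`, `node_set_count` and `SKETCH_MAX_NODES` are hypothetical names, and the limit of 128 is an assumed stand-in for `MAX_NODES`. The one design choice it makes differently is to validate the whole list before committing, so that an out-of-range value leaves the previously stored set untouched.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_NODES 128 /* assumed limit, stand-in for MAX_NODES */
#define BITS_PER_WORD    (sizeof(unsigned long) * CHAR_BIT)
#define NODE_WORDS       ((SKETCH_MAX_NODES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical fixed-size set of NUMA node ids. */
typedef struct {
    unsigned long bits[NODE_WORDS];
} NodeSet;

/*
 * Replace the contents of set with the given node ids.
 * Returns false and leaves set unchanged if any id is out of range.
 */
static bool node_set_assign(NodeSet *set, const uint16_t *nodes, size_t count)
{
    NodeSet tmp;

    assert(set != NULL);
    memset(&tmp, 0, sizeof(tmp));
    for (size_t i = 0; i < count; i++) {
        if (nodes[i] >= SKETCH_MAX_NODES) {
            return false; /* reject the whole list, nothing committed */
        }
        tmp.bits[nodes[i] / BITS_PER_WORD] |= 1UL << (nodes[i] % BITS_PER_WORD);
    }
    *set = tmp;
    return true;
}

/* True if node is a member of set. */
static bool node_set_test(const NodeSet *set, unsigned node)
{
    if (node >= SKETCH_MAX_NODES) {
        return false;
    }
    return (set->bits[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL;
}

/* Number of nodes in set, i.e. how many GI structures would be emitted. */
static size_t node_set_count(const NodeSet *set)
{
    size_t n = 0;

    for (unsigned node = 0; node < SKETCH_MAX_NODES; node++) {
        n += node_set_test(set, node) ? 1 : 0;
    }
    return n;
}
```

For the host-nodes=2-9 sample in the commit message, assigning the ids 2 through 9 would yield a set of eight members, matching the eight nodes the example is meant to link to dev0.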