[v3,qemu,00/11] acpi: NUMA nodes for CXL HB as GP + complex NUMA test

Message ID 20240620160324.109058-1-Jonathan.Cameron@huawei.com

Message

Jonathan Cameron June 20, 2024, 4:03 p.m. UTC
v3: Thanks to Richard for help debugging the BE issue and to Igor for
    finding a bunch of other things to improve via the context in
    the fix patch.
    
- Fix the big endian host/little endian guest issue in the HID being
  written to the Generic Port Affinity Structure ACPI Device Handle.
- Fix a bug in the ordering of bus vs devfn in the BDF field, which is
  reversed in the ACPI table with respect to QEMU's internal handling.
  Note the fix is minimal and is refactored later in the series.
- Move the original GI code to hw/acpi/aml-build.c and hw/acpi/pci.c,
  as there is no need for a separate file and this keeps the SRAT entry
  building all in one place.
- Use properties for the PCI bus number and the ACPI UID to avoid
  using PCI internal implementation details in hw/acpi.
- Drop the GenericNode base object. With the new approach to the AML
  building much less code is unified, so the base object did not bring
  sufficient advantages to be worthwhile after the other refactors.
  A little more duplication occurs in v3 but the code is easier to read.
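
To illustrate the BDF ordering and endianness points above, here is a
minimal sketch of filling a PCI Device Handle byte by byte. It is not
the code from this series. The helper names (put_le16,
fill_pci_device_handle) are hypothetical, and the layout (2-byte
segment, then bus, then devfn, then 12 reserved bytes) is my reading of
the ACPI Device Handle definition, so check it against the spec.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PCI_HANDLE_LEN 16

/* Store a 16-bit value least-significant byte first, whatever the host is. */
static void put_le16(uint8_t *dst, uint16_t val)
{
    dst[0] = (uint8_t)(val & 0xff);
    dst[1] = (uint8_t)(val >> 8);
}

/*
 * Hypothetical helper: fill a 16-byte PCI Device Handle.
 * Assumed layout: segment (2 bytes), bus (1 byte), devfn (1 byte),
 * then 12 reserved bytes left as zero.
 */
static void fill_pci_device_handle(uint8_t out[PCI_HANDLE_LEN],
                                   uint16_t segment, uint8_t bus,
                                   uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);

    memset(out, 0, PCI_HANDLE_LEN);
    put_le16(out, segment);
    out[2] = bus;                          /* bus comes first ... */
    out[3] = (uint8_t)((dev << 3) | fn);   /* ... then devfn */
}
```

Writing each field one byte at a time avoids any dependence on the
host byte order, which is the class of problem the BE fix addresses.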

ACPI 6.5 introduced Generic Port Affinity Structures to close a system
description gap that was a problem for CXL memory systems.
It defines a new SRAT Affinity Structure (and hence allows creation of an
ACPI Proximity Node, which can only be defined via an SRAT structure)
for the boundary between a discoverable fabric and non-discoverable
system interconnects etc.

The HMAT data on latency and bandwidth is combined with discoverable
information from the CXL bus (link speeds, lane counts) and CXL devices
(switch port to port characteristics and USP to memory, via CDAT tables
read from the device).  QEMU has supported the rest of the elements
of this chain for a while, but now the kernel has caught up and we need
the missing element of Generic Ports (this code has been used extensively
in testing and debugging that kernel support, with some resulting fixes
currently under review).
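
As a rough illustration of how such per-segment data can be combined,
the sketch below folds a path into one end-to-end figure. It assumes
the usual convention that latencies add along the path and bandwidth is
limited by the slowest segment. The struct, the function name and the
example numbers are all hypothetical and not taken from QEMU or the
kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-segment figures (e.g. one from HMAT, others from CDAT). */
struct perf_segment {
    uint32_t latency_ns;      /* access latency of this segment */
    uint32_t bandwidth_mbps;  /* bandwidth of this segment in MB/s */
};

/*
 * Combine the segments of one path: latency is summed, bandwidth is the
 * minimum over all segments. For n == 0 the result is zero latency and
 * UINT32_MAX bandwidth (no constraint).
 */
static struct perf_segment combine_path(const struct perf_segment *segs,
                                        size_t n)
{
    struct perf_segment total = { 0, UINT32_MAX };
    size_t i;

    assert(segs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        total.latency_ns += segs[i].latency_ns;
        if (segs[i].bandwidth_mbps < total.bandwidth_mbps) {
            total.bandwidth_mbps = segs[i].bandwidth_mbps;
        }
    }
    return total;
}
```

With made-up segments of 100, 20, 30 and 150 ns and bandwidths of
50000, 32000, 40000 and 25000 MB/s, this gives 300 ns and 25000 MB/s
for the whole path.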

Generic Port Affinity Structures are very similar to the recently
added Generic Initiator Affinity Structures (GI), so this series
factors out and reuses much of that infrastructure.
There are subtle differences (beyond the obvious structure ID change).

- The ACPI spec example (and Linux kernel support) associates a Generic
  Port not with the CXL root port, but rather with the CXL Host Bridge.
  As a result, an ACPI handle is used (rather than the PCI SBDF option
  used for GIs). In QEMU the easiest way to get to this is to target
  the root bridge PCI bus, and conveniently the root bridge bus number
  is used for the UID, allowing us to construct an appropriate entry.
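
A minimal sketch of building such an ACPI Device Handle follows. It is
not the code from this series. The function name is hypothetical, and
the layout (8-byte HID, 4-byte UID, 4 reserved bytes) is my reading of
the ACPI spec. "ACPI0016" is the HID of a CXL Host Bridge. The bus
number in the usage below is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ACPI_HANDLE_LEN 16
#define ACPI_HID_LEN    8

/*
 * Hypothetical helper: fill a 16-byte ACPI Device Handle.
 * Assumed layout: HID (8 ASCII bytes), UID (4 bytes, little endian),
 * then 4 reserved bytes left as zero.
 */
static void fill_acpi_device_handle(uint8_t out[ACPI_HANDLE_LEN],
                                    const char *hid, uint32_t uid)
{
    int i;

    assert(strlen(hid) == ACPI_HID_LEN);

    memset(out, 0, ACPI_HANDLE_LEN);
    /* Copy the HID as bytes: no integer cast, so no host-endian swap. */
    memcpy(out, hid, ACPI_HID_LEN);
    /* UID, least-significant byte first. */
    for (i = 0; i < 4; i++) {
        out[ACPI_HID_LEN + i] = (uint8_t)(uid >> (8 * i));
    }
}
```

For a host bridge whose root bus number is 0x0c, calling
fill_acpi_device_handle(handle, "ACPI0016", 0x0c) yields the ASCII HID
followed by 0c 00 00 00 and four zero bytes.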

A key addition of this series is a complex NUMA topology example that
stretches the QEMU emulation code for GI, GP and nodes with just
CPUs, just memory, just hot-pluggable memory, or a mixture of memory
and CPUs.

A similar test showed up a few NUMA-related bugs, with fixes applied for
9.0 (note that one of these needs Linux booted to identify that it
rejects the HMAT table, and this test is a regression test for the
table generation only).

https://lore.kernel.org/qemu-devel/2eb6672cfdaea7dacd8e9bb0523887f13b9f85ce.1710282274.git.mst@redhat.com/
https://lore.kernel.org/qemu-devel/74e2845c5f95b0c139c79233ddb65bb17f2dd679.1710282274.git.mst@redhat.com/


Jonathan Cameron (11):
  hw/acpi: Fix ordering of BDF in Generic Initiator PCI Device Handle.
  hw/acpi/GI: Fix trivial parameter alignment issue.
  hw/acpi: Move AML building code for Generic Initiators to aml_build.c
  hw/acpi: Rename build_all_acpi_generic_initiators() to
    build_acpi_generic_initiator()
  hw/pci: Add a bus property to pci_props and use for acpi/gi
  acpi/pci: Move Generic Initiator object handling into acpi/pci.*
  hw/pci-bridge: Add acpi_uid property to CXL PXB
  hw/acpi: Generic Port Affinity Structure support
  bios-tables-test: Allow for new acpihmat-generic-x test data.
  bios-tables-test: Add complex SRAT / HMAT test for GI GP
  bios-tables-test: Add data for complex numa test (GI, GP etc)

 qapi/qom.json                               |  34 +++
 include/hw/acpi/acpi_generic_initiator.h    |  30 +--
 include/hw/acpi/aml-build.h                 |   8 +
 include/hw/acpi/pci.h                       |   7 +
 include/hw/pci/pci_bridge.h                 |   1 +
 hw/acpi/acpi_generic_initiator.c            | 132 +++++++++---
 hw/acpi/aml-build.c                         |  84 ++++++++
 hw/acpi/pci.c                               | 226 ++++++++++++++++++++
 hw/arm/virt-acpi-build.c                    |   3 +-
 hw/i386/acpi-build.c                        |   3 +-
 hw/pci-bridge/pci_expander_bridge.c         |  18 +-
 hw/pci/pci.c                                |  14 ++
 tests/qtest/bios-tables-test.c              |  96 +++++++++
 hw/acpi/meson.build                         |   1 -
 tests/data/acpi/q35/APIC.acpihmat-generic-x | Bin 0 -> 136 bytes
 tests/data/acpi/q35/CEDT.acpihmat-generic-x | Bin 0 -> 68 bytes
 tests/data/acpi/q35/DSDT.acpihmat-generic-x | Bin 0 -> 10849 bytes
 tests/data/acpi/q35/HMAT.acpihmat-generic-x | Bin 0 -> 360 bytes
 tests/data/acpi/q35/SRAT.acpihmat-generic-x | Bin 0 -> 520 bytes
 19 files changed, 597 insertions(+), 60 deletions(-)
 create mode 100644 tests/data/acpi/q35/APIC.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/CEDT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/DSDT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/HMAT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/SRAT.acpihmat-generic-x

Comments

Huang, Ying June 21, 2024, 3:25 a.m. UTC | #1
Hi, Jonathan,

While developing the Linux kernel patchset "[PATCH v3 0/3] cxl/region:
Support to calculate memory tier abstract distance" [1], I used this
patchset to test my kernel patchset and it works great!  Thanks!

[1] https://lore.kernel.org/linux-cxl/20240618084639.1419629-1-ying.huang@intel.com/

Feel free to add my

Tested-by: "Huang, Ying" <ying.huang@intel.com>

in future versions.


--
Best Regards,
Huang, Ying

Jonathan Cameron June 21, 2024, 4:20 p.m. UTC | #2
On Thu, 20 Jun 2024 17:03:08 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> v3: Thanks to Richard for help debugging BE issue and to Igor for
>     finding a bunch of other thing to improve via the context in
>     the fix patch.

I forgot to mention that this time I ran the bios tables test on
an emulated x86_64 machine on top of an emulated s390 (with the timeouts
massively increased as it took about 2 hours).

Hopefully no more surprises!