mbox series

[v1,00/29] cxl: Add address translation support and enable AMD Zen5 platforms

Message ID 20250107141015.3367194-1-rrichter@amd.com
Headers show
Series cxl: Add address translation support and enable AMD Zen5 platforms | expand

Message

Robert Richter Jan. 7, 2025, 2:09 p.m. UTC
This patch set adds support of address translation and enables this
for AMD Zen5 platforms. This is a new appoach in response to an
earlier attempt to implement CXL address translation [1] and the
comments on it, esp. Dan's [2]. Dan suggested to solve this by walking
the port hierarchy from the host port to the host bridge. When
crossing memory domains from one port to the other, HPA translations
are applied using a callback function to handle platform specifics.

The CXL driver currently does not implement address translation which
assumes the host physical addresses (HPA) and system physical
addresses (SPA) are equal.

Systems with different HPA and SPA addresses need address translation.
If this is the case, the hardware addresses esp. used in the HDM
decoder configurations are different to the system's or parent port
address ranges. E.g. AMD Zen5 systems may be configured to use
'Normalized addresses'. Then, CXL endpoints have their own physical
address base which is not the same as the SPA used by the CXL host
bridge. Thus, addresses need to be translated from the endpoint's to
its CXL host bridge's address range.

To enable address translation, the endpoint's HPA range must be
translated to each of the parent port's address ranges up to the root
decoder. This is implemented by traversing the decoder and port
hierarchy from the endpoint up to the root port and applying platform
specific translation functions to determine the next HPA range of the
parent port where needed:

  if (cxl_port->to_hpa)
    hpa = cxl_port->to_hpa(cxl_decoder, hpa)

A callback is introduced to translate an HPA range from a port to its
parent.

The root port's HPA range is equivalent to the system's SPA range and
can then be used to find an endpoint's root port and region.  

Also, translated HPA ranges must be used to calculate the endpoint
position in the region.

Once the region was found, the decoders of all ports between the
endpoint and the root port need to be found based on the translated
HPA. Configuration checks and interleaving setup must be modified as
necessary to support address translation.

Note that only auto-discovery of decoders is supported. Thus, decoders
are locked and cannot be configured manually.

Finally, Zen5 address translation is enabled using ACPI PRMT.

Purpose of patches:

 * Patches #1-#4: Minor cleanups and updates separated from the actual
   implementation

 * Patches #5-#12, #14, #17, #18: Code rework and refactoring.

 * Patches #13, #15, #16, #19-#24: Functional changes for address
   translation (common code).

 * Patch #25, #26: AMD Zen5 address translation.

 * Patch #27-#29: Changes to improve debug messages for better debugging.

[1] https://lore.kernel.org/linux-cxl/20240701174754.967954-1-rrichter@amd.com/
[2] https://lore.kernel.org/linux-cxl/669086821f136_5fffa29473@dwillia2-xfh.jf.intel.com.notmuch/


Robert Richter (29):
  cxl: Remove else after return
  cxl/pci: Moving code in cxl_hdm_decode_init()
  cxl/pci: cxl_hdm_decode_init: Move comment
  cxl/pci: Add comments to cxl_hdm_decode_init()
  cxl/region: Move find_cxl_root() to cxl_add_to_region()
  cxl/region: Factor out code to find the root decoder
  cxl/region: Factor out code to find a root decoder's region
  cxl/region: Split region registration into an initialization and
    adding part
  cxl/region: Use iterator to find the root port in
    cxl_find_root_decoder()
  cxl/region: Add function to find a port's switch decoder by range
  cxl/region: Unfold cxl_find_root_decoder() into
    cxl_endpoint_initialize()
  cxl: Modify address translation callback for generic use
  cxl: Introduce callback to translate an HPA range from a port to its
    parent
  cxl: Introduce parent_port_of() helper
  cxl/region: Use an endpoint's SPA range to find a region
  cxl/region: Use translated HPA ranges to calculate the endpoint
    position
  cxl/region: Rename function to cxl_find_decoder_early()
  cxl/region: Avoid duplicate call of cxl_find_decoder_early()
  cxl/region: Use endpoint's HPA range to find the port's decoder
  cxl/region: Use translated HPA ranges to find the port's decoder
  cxl/region: Lock decoders that need address translation
  cxl/region: Use translated HPA ranges to create a region
  cxl/region: Use root decoders interleaving parameters to create a
    region
  cxl/region: Use endpoint's SPA range to check a region
  cxl/amd: Enable Zen5 address translation using ACPI PRMT
  MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD)
  cxl/region: Show message on registration failure
  cxl/region: Show message on broken target list
  cxl: Show message when a decoder was added to a port

 MAINTAINERS               |   7 +
 drivers/cxl/Kconfig       |   4 +
 drivers/cxl/acpi.c        |  14 +-
 drivers/cxl/core/Makefile |   1 +
 drivers/cxl/core/amd.c    | 227 +++++++++++++++++++++
 drivers/cxl/core/cdat.c   |   2 +-
 drivers/cxl/core/core.h   |   6 +
 drivers/cxl/core/hdm.c    |   3 +-
 drivers/cxl/core/pci.c    |  44 +++--
 drivers/cxl/core/port.c   |  22 ++-
 drivers/cxl/core/region.c | 407 ++++++++++++++++++++++++++++----------
 drivers/cxl/cxl.h         |  16 +-
 drivers/cxl/port.c        |  22 +--
 13 files changed, 623 insertions(+), 152 deletions(-)
 create mode 100644 drivers/cxl/core/amd.c


base-commit: 2f84d072bdcb7d6ec66cc4d0de9f37a3dc394cd2

Comments

Alison Schofield Jan. 13, 2025, 6:41 p.m. UTC | #1
On Tue, Jan 07, 2025 at 03:09:46PM +0100, Robert Richter wrote:
> This patch set adds support of address translation and enables this
> for AMD Zen5 platforms. This is a new appoach in response to an
> earlier attempt to implement CXL address translation [1] and the
> comments on it, esp. Dan's [2]. Dan suggested to solve this by walking
> the port hierarchy from the host port to the host bridge. When
> crossing memory domains from one port to the other, HPA translations
> are applied using a callback function to handle platform specifics.
> 
> The CXL driver currently does not implement address translation which
> assumes the host physical addresses (HPA) and system physical
> addresses (SPA) are equal.

Hi Robert,

I tried this out and have some sporadic review. (It's a big set so please
pardon that I'm not giving a comprehensive review on v1.)

I commented directly on a couple of patches, which were things that I touched
to get user created regions working. So that leads to my main feedback - let's
keep user created regions alive.

This set made me wonder if there may be an advantage in further delineating the
auto vs user region creation path. This set is a special auto method and we
have hetero-interleave coming which will introduce more auto only special-ness. Just a thought that was triggered by seeing the refactoring of the interleave
calculation path.

Suggest ditching the term 'address translation' and specifically stating what
is being translated. And that may lead to some clean up in code/comments that
I added during the DPA->HPA->SPA translation when I thought it was the only
translation in town.

DPA to HPA: exists to report HPAs in trace events when a device reports a DPA
        media error.
HPA to SPA: exists to report the above HPAs correctly when XOR interleave
        arithmetic is used.
HPA to SPA: introduced here

The latter 2 HPA to SPA are not equivalent. The root decoder callback added
for XOR doesn't apply on the front-end like is introduced here. The HB
decoder (hardware) does XOR the translation. The callback (as it exists today)
only comes into play when DPA addresses are directly reported in events. So
there may be some renaming needs there.

That's all for now. I'll review, try out more, in the v2.
--Alison

> 
> Systems with different HPA and SPA addresses need address translation.
> If this is the case, the hardware addresses esp. used in the HDM
> decoder configurations are different to the system's or parent port
> address ranges. E.g. AMD Zen5 systems may be configured to use
> 'Normalized addresses'. Then, CXL endpoints have their own physical
> address base which is not the same as the SPA used by the CXL host
> bridge. Thus, addresses need to be translated from the endpoint's to
> its CXL host bridge's address range.
> 
> To enable address translation, the endpoint's HPA range must be
> translated to each of the parent port's address ranges up to the root
> decoder. This is implemented by traversing the decoder and port
> hierarchy from the endpoint up to the root port and applying platform
> specific translation functions to determine the next HPA range of the
> parent port where needed:
> 
>   if (cxl_port->to_hpa)
>     hpa = cxl_port->to_hpa(cxl_decoder, hpa)
> 
> A callback is introduced to translate an HPA range from a port to its
> parent.
> 
> The root port's HPA range is equivalent to the system's SPA range and
> can then be used to find an endpoint's root port and region.  
> 
> Also, translated HPA ranges must be used to calculate the endpoint
> position in the region.
> 
> Once the region was found, the decoders of all ports between the
> endpoint and the root port need to be found based on the translated
> HPA. Configuration checks and interleaving setup must be modified as
> necessary to support address translation.
> 
> Note that only auto-discovery of decoders is supported. Thus, decoders
> are locked and cannot be configured manually.
> 
> Finally, Zen5 address translation is enabled using ACPI PRMT.
> 
> Purpose of patches:
> 
>  * Patches #1-#4: Minor cleanups and updates separated from the actual
>    implementation
> 
>  * Patches #5-#12, #14, #17, #18: Code rework and refactoring.
> 
>  * Patches #13, #15, #16, #19-#24: Functional changes for address
>    translation (common code).
> 
>  * Patch #25, #26: AMD Zen5 address translation.
> 
>  * Patch #27-#29: Changes to improve debug messages for better debugging.
> 
> [1] https://lore.kernel.org/linux-cxl/20240701174754.967954-1-rrichter@amd.com/
> [2] https://lore.kernel.org/linux-cxl/669086821f136_5fffa29473@dwillia2-xfh.jf.intel.com.notmuch/
> 
> 
> Robert Richter (29):
>   cxl: Remove else after return
>   cxl/pci: Moving code in cxl_hdm_decode_init()
>   cxl/pci: cxl_hdm_decode_init: Move comment
>   cxl/pci: Add comments to cxl_hdm_decode_init()
>   cxl/region: Move find_cxl_root() to cxl_add_to_region()
>   cxl/region: Factor out code to find the root decoder
>   cxl/region: Factor out code to find a root decoder's region
>   cxl/region: Split region registration into an initialization and
>     adding part
>   cxl/region: Use iterator to find the root port in
>     cxl_find_root_decoder()
>   cxl/region: Add function to find a port's switch decoder by range
>   cxl/region: Unfold cxl_find_root_decoder() into
>     cxl_endpoint_initialize()
>   cxl: Modify address translation callback for generic use
>   cxl: Introduce callback to translate an HPA range from a port to its
>     parent
>   cxl: Introduce parent_port_of() helper
>   cxl/region: Use an endpoint's SPA range to find a region
>   cxl/region: Use translated HPA ranges to calculate the endpoint
>     position
>   cxl/region: Rename function to cxl_find_decoder_early()
>   cxl/region: Avoid duplicate call of cxl_find_decoder_early()
>   cxl/region: Use endpoint's HPA range to find the port's decoder
>   cxl/region: Use translated HPA ranges to find the port's decoder
>   cxl/region: Lock decoders that need address translation
>   cxl/region: Use translated HPA ranges to create a region
>   cxl/region: Use root decoders interleaving parameters to create a
>     region
>   cxl/region: Use endpoint's SPA range to check a region
>   cxl/amd: Enable Zen5 address translation using ACPI PRMT
>   MAINTAINERS: CXL: Add entry for AMD platform support (CXL_AMD)
>   cxl/region: Show message on registration failure
>   cxl/region: Show message on broken target list
>   cxl: Show message when a decoder was added to a port
> 
>  MAINTAINERS               |   7 +
>  drivers/cxl/Kconfig       |   4 +
>  drivers/cxl/acpi.c        |  14 +-
>  drivers/cxl/core/Makefile |   1 +
>  drivers/cxl/core/amd.c    | 227 +++++++++++++++++++++
>  drivers/cxl/core/cdat.c   |   2 +-
>  drivers/cxl/core/core.h   |   6 +
>  drivers/cxl/core/hdm.c    |   3 +-
>  drivers/cxl/core/pci.c    |  44 +++--
>  drivers/cxl/core/port.c   |  22 ++-
>  drivers/cxl/core/region.c | 407 ++++++++++++++++++++++++++++----------
>  drivers/cxl/cxl.h         |  16 +-
>  drivers/cxl/port.c        |  22 +--
>  13 files changed, 623 insertions(+), 152 deletions(-)
>  create mode 100644 drivers/cxl/core/amd.c
> 
> 
> base-commit: 2f84d072bdcb7d6ec66cc4d0de9f37a3dc394cd2
> -- 
> 2.39.5
>