mbox series

[0/4,v3] cxl/core: Enable Region creation on x86 with Low Mem Hole

Message ID 20250314113708.759808-1-fabio.m.de.francesco@linux.intel.com
Headers show
Series cxl/core: Enable Region creation on x86 with Low Mem Hole | expand

Message

Fabio M. De Francesco March 14, 2025, 11:36 a.m. UTC
The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
Physical Address (HPA) windows that are associated with each CXL Host
Bridge. Each window represents a contiguous HPA that may be interleaved
with one or more targets (CXL v3.1 - 9.18.1.3).

The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
memory to which systems cannot send transactions. On those systems, BIOS
publishes CFMWS which communicate the active System Physical Address (SPA)
ranges that map to a subset of the Host Physical Address (HPA) ranges. The
SPA range trims out the hole, and capacity in the endpoint is lost with no
SPA to map to CXL HPA in that hole.

In the early stages of CXL Regions construction and attach on platforms
with Low Memory Holes, the driver fails and returns an error because it
expects that the CXL Endpoint Decoder range is a subset of the Root
Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.

Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
Decoders or already made CXL Regions and Decoders to allow the
construction of new CXL Regions and the attachment of Endpoint Decoders,
even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
range end to match Root Decoder's.

- Patch 1/4 changes the calling conventions of three match_*_by_range()
  helpers in preparation of 3/4.
- Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
  the HPA range end for CXL Regions construction.
- Patch 3/4 enables CXL Regions construction and Endpoint Decoders
  attachment by matching Root Decoders or Regions with Endpoint
  Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
  constraints while Endpoints decoders' attachment.
- Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.

Many thanks to Alison, Dan, and Ira for their help and for their reviews
of my RFC on Intel's internal ML.

Commenting on v1, Alison wrote a couple of observations on what users
will see. I suggest anyone interested to see how this series affect
users to take a look at her observations.[0] Thank you!

Changes for v3:

  Re-base the series on cxl/next.

  1/4 - 2/4:
	Constify local variables.
  3/4:
	Call arch_match_region() from region_res_match_cxl_range().
  4/4:
	arch_match_region() - Check that region end is under start + 4G;
	arch_match_spa() - Check that SPA range start is cfmws_range_start.
	
  base-commit: acc2913692413df9d1

v2 - https://lore.kernel.org/linux-cxl/20250114203432.31861-1-fabio.m.de.francesco@linux.intel.com/
v1 - https://lore.kernel.org/all/20241122155226.2068287-1-fabio.m.de.francesco@linux.intel.com/

[0] - https://lore.kernel.org/all/Z0Tzif55CcHuujJ-@aschofie-mobl2.lan/
[1] - https://lore.kernel.org/oe-kbuild-all/202411252126.T7xKY88q-lkp@intel.com/ 

Fabio M. De Francesco (4):
  cxl/core: Change match_*_by_range() calling convention
  cxl/core: Add helpers to detect Low memory Holes on x86
  cxl/core: Enable Region creation on x86 with Low Memory Hole
  cxl/test: Simulate an x86 Low Memory Hole for tests

 drivers/cxl/Kconfig                  |  5 ++
 drivers/cxl/core/Makefile            |  1 +
 drivers/cxl/core/lmh.c               | 83 ++++++++++++++++++++++++++++
 drivers/cxl/core/lmh.h               | 31 +++++++++++
 drivers/cxl/core/region.c            | 82 +++++++++++++++++++++------
 tools/testing/cxl/Kbuild             |  1 +
 tools/testing/cxl/cxl_core_exports.c |  2 +
 tools/testing/cxl/test/cxl.c         | 10 ++++
 8 files changed, 197 insertions(+), 18 deletions(-)
 create mode 100644 drivers/cxl/core/lmh.c
 create mode 100644 drivers/cxl/core/lmh.h

Comments

Alison Schofield March 20, 2025, 1:46 a.m. UTC | #1
On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> 
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.

I think the dpa_res field of the endpoint decoder needs adjusting.
After the region is setup, the cxled->dpa_res has the unadjusted value
and that leads to region warning and address translation failure because
the driver 'thinks' that DPA is within a region, but when it tries
to translate to an HPA in that region, it fails.

Here's where I looked at it: using the cxl-test LMH auto-region (nice!)
each endpoint decoder is programmed to contribute 512MB to the 1024MB region.
The LMH adjustment shrunk the region to 768MB, so each endpoint is only
contributing 384MB to the region. 

DPA->HPA address translations of DPA addresses in the 384->512 gap cause
a problem. The driver will needlessly warn that they are in a region for
any poison inject or clear, and will fail address translations for any
poison, general media or dram event.

I think this should fail in region.c: __cxl_dpa_to_region()
        if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
                return 0;

For that to fail, LMH code needs to adjust cxled->dpa_res too. 


To test is using clear_poison you can:
# echo 536866816 > /sys/kernel/debug/cxl/mem1/clear_poison
  (536866816 = 512MB - 4096)

[ ] cxl_core:__cxl_dpa_to_region:2860: cxl decoder18.0: dpa:0x1ffff000 mapped in region:region0
[ ] cxl_core:cxl_dpa_to_hpa:2963: cxl_region region0: Addr trans fail: hpa 0x3ff04fffe000 not in region


snip
>
Alison Schofield March 20, 2025, 6:10 p.m. UTC | #2
On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> 
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.

Unless dev_dbg() is on, LMH handling won't be apparent. How about adding
a dev_info() that informs the user that a region was adjusted to account
for a LMH. I'm foreseeing looking at logs and divining a LMH, when a
simple log message would be a nice shortcut.

CXL subsystem is quiet with only 8 dev_info()'s, so if this breaks some
vow of silence, nevermind :)


-snip

>
Robert Richter March 21, 2025, 10:34 a.m. UTC | #3
Fabio,

On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> Physical Address (HPA) windows that are associated with each CXL Host
> Bridge. Each window represents a contiguous HPA that may be interleaved
> with one or more targets (CXL v3.1 - 9.18.1.3).
> 
> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> memory to which systems cannot send transactions. On those systems, BIOS
> publishes CFMWS which communicate the active System Physical Address (SPA)
> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> SPA range trims out the hole, and capacity in the endpoint is lost with no
> SPA to map to CXL HPA in that hole.
> 
> In the early stages of CXL Regions construction and attach on platforms
> with Low Memory Holes, the driver fails and returns an error because it
> expects that the CXL Endpoint Decoder range is a subset of the Root
> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> 
> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> Decoders or already made CXL Regions and Decoders to allow the
> construction of new CXL Regions and the attachment of Endpoint Decoders,
> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> range end to match Root Decoder's.
> 
> - Patch 1/4 changes the calling conventions of three match_*_by_range()
>   helpers in preparation of 3/4.
> - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
>   the HPA range end for CXL Regions construction.
> - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
>   attachment by matching Root Decoders or Regions with Endpoint
>   Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
>   constraints while Endpoints decoders' attachment.
> - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
> 
> Many thanks to Alison, Dan, and Ira for their help and for their reviews
> of my RFC on Intel's internal ML.
> 
> Commenting on v1, Alison wrote a couple of observations on what users
> will see. I suggest anyone interested to see how this series affect
> users to take a look at her observations.[0] Thank you!
> 
> Changes for v3:
> 
>   Re-base the series on cxl/next.
> 
>   1/4 - 2/4:
> 	Constify local variables.
>   3/4:
> 	Call arch_match_region() from region_res_match_cxl_range().
>   4/4:
> 	arch_match_region() - Check that region end is under start + 4G;
> 	arch_match_spa() - Check that SPA range start is cfmws_range_start.

I have sent comments for version 1 and suggested a simpler approach
for this to implement. My comments haven't been addressed yet, but we
need better isolation to reduce interference with other platforms and
archs. Please take a look again.

Many thanks,

-Robert
Fabio M. De Francesco March 25, 2025, 4:13 p.m. UTC | #4
On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
> Fabio,
> 
> On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> > 
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
> > 
> > - Patch 1/4 changes the calling conventions of three match_*_by_range()
> >   helpers in preparation of 3/4.
> > - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
> >   the HPA range end for CXL Regions construction.
> > - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
> >   attachment by matching Root Decoders or Regions with Endpoint
> >   Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
> >   constraints while Endpoints decoders' attachment.
> > - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
> > 
> > Many thanks to Alison, Dan, and Ira for their help and for their reviews
> > of my RFC on Intel's internal ML.
> > 
> > Commenting on v1, Alison wrote a couple of observations on what users
> > will see. I suggest anyone interested to see how this series affect
> > users to take a look at her observations.[0] Thank you!
> > 
> > Changes for v3:
> > 
> >   Re-base the series on cxl/next.
> > 
> >   1/4 - 2/4:
> > 	Constify local variables.
> >   3/4:
> > 	Call arch_match_region() from region_res_match_cxl_range().
> >   4/4:
> > 	arch_match_region() - Check that region end is under start + 4G;
> > 	arch_match_spa() - Check that SPA range start is cfmws_range_start.
> 
> I have sent comments for version 1 and suggested a simpler approach
> for this to implement.
>
I replied to your comments for version 1. Please find my message at 
https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.
>
> My comments haven't been addressed yet,
>
I think it's not possible to detect an LMH while still in cxl_port_add().
Therefore, I think that this is the best way to go. It works to prevent
the driver to fail to create Regions and attach Endpoint Decoders on x86
with Low Memory Holes, provided that the lower SPA range starts at 0x0.

On platforms with other kind of holes, the driver will continue to fail
as it currently does. 
>
> but we
> need better isolation to reduce interference with other platforms and
> archs. Please take a look again.
>
Interference? Do you mean that this series would make the driver fail on 
other platforms? 

Of course I don't want anything like that. I'm not clear about it...
Would you please describe how would this series interfere and what
would happen on other platforms?

Thanks,

Fabio
>
> Many thanks,
> 
> -Robert
>
Fabio M. De Francesco March 26, 2025, 4:23 p.m. UTC | #5
On Thursday, March 20, 2025 2:46:14 AM Central European Standard Time Alison Schofield wrote:
> On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> > 
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
> 
> I think the dpa_res field of the endpoint decoder needs adjusting.
> After the region is setup, the cxled->dpa_res has the unadjusted value
> and that leads to region warning and address translation failure because
> the driver 'thinks' that DPA is within a region, but when it tries
> to translate to an HPA in that region, it fails.
> 
> Here's where I looked at it: using the cxl-test LMH auto-region (nice!)
> each endpoint decoder is programmed to contribute 512MB to the 1024MB region.
> The LMH adjustment shrunk the region to 768MB, so each endpoint is only
> contributing 384MB to the region. 
> 
> DPA->HPA address translations of DPA addresses in the 384->512 gap cause
> a problem. The driver will needlessly warn that they are in a region for
> any poison inject or clear, and will fail address translations for any
> poison, general media or dram event.
> 
> I think this should fail in region.c: __cxl_dpa_to_region()
>         if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
>                 return 0;
> 
> For that to fail, LMH code needs to adjust cxled->dpa_res too. 
> 
> 
> To test is using clear_poison you can:
> # echo 536866816 > /sys/kernel/debug/cxl/mem1/clear_poison
>   (536866816 = 512MB - 4096)
> 
> [ ] cxl_core:__cxl_dpa_to_region:2860: cxl decoder18.0: dpa:0x1ffff000 mapped in region:region0
> [ ] cxl_core:cxl_dpa_to_hpa:2963: cxl_region region0: Addr trans fail: hpa 0x3ff04fffe000 not in region
> 
> 
> snip
> > 
> 
Alison,

I'll adjust cxled->dpa_res too.
Thank you for noticing this issue.

Fabio
Fabio M. De Francesco March 26, 2025, 4:24 p.m. UTC | #6
On Thursday, March 20, 2025 7:10:13 PM Central European Standard Time Alison Schofield wrote:
> On Fri, Mar 14, 2025 at 12:36:29PM +0100, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> > 
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. On those systems, BIOS
> > publishes CFMWS which communicate the active System Physical Address (SPA)
> > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > SPA to map to CXL HPA in that hole.
> > 
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, the driver fails and returns an error because it
> > expects that the CXL Endpoint Decoder range is a subset of the Root
> > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> > 
> > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > Decoders or already made CXL Regions and Decoders to allow the
> > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > range end to match Root Decoder's.
> 
> Unless dev_dbg() is on, LMH handling won't be apparent. How about adding
> a dev_info() that informs the user that a region was adjusted to account
> for a LMH. I'm foreseeing looking at logs and divining a LMH, when a
> simple log message would be a nice shortcut.
> 
> CXL subsystem is quiet with only 8 dev_info()'s, so if this breaks some
> vow of silence, nevermind :)
> 
Yes, sure. It sounds good.

Thanks,

Fabio
Robert Richter March 28, 2025, 9:02 a.m. UTC | #7
On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
> On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
> > Fabio,
> > 
> > On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
> > > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > > Physical Address (HPA) windows that are associated with each CXL Host
> > > Bridge. Each window represents a contiguous HPA that may be interleaved
> > > with one or more targets (CXL v3.1 - 9.18.1.3).
> > > 
> > > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > > memory to which systems cannot send transactions. On those systems, BIOS
> > > publishes CFMWS which communicate the active System Physical Address (SPA)
> > > ranges that map to a subset of the Host Physical Address (HPA) ranges. The
> > > SPA range trims out the hole, and capacity in the endpoint is lost with no
> > > SPA to map to CXL HPA in that hole.
> > > 
> > > In the early stages of CXL Regions construction and attach on platforms
> > > with Low Memory Holes, the driver fails and returns an error because it
> > > expects that the CXL Endpoint Decoder range is a subset of the Root
> > > Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
> > > 
> > > Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
> > > Decoders or already made CXL Regions and Decoders to allow the
> > > construction of new CXL Regions and the attachment of Endpoint Decoders,
> > > even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
> > > range end to match Root Decoder's.
> > > 
> > > - Patch 1/4 changes the calling conventions of three match_*_by_range()
> > >   helpers in preparation of 3/4.
> > > - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
> > >   the HPA range end for CXL Regions construction.
> > > - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
> > >   attachment by matching Root Decoders or Regions with Endpoint
> > >   Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
> > >   constraints while Endpoints decoders' attachment.
> > > - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
> > > 
> > > Many thanks to Alison, Dan, and Ira for their help and for their reviews
> > > of my RFC on Intel's internal ML.
> > > 
> > > Commenting on v1, Alison wrote a couple of observations on what users
> > > will see. I suggest anyone interested to see how this series affect
> > > users to take a look at her observations.[0] Thank you!
> > > 
> > > Changes for v3:
> > > 
> > >   Re-base the series on cxl/next.
> > > 
> > >   1/4 - 2/4:
> > > 	Constify local variables.
> > >   3/4:
> > > 	Call arch_match_region() from region_res_match_cxl_range().
> > >   4/4:
> > > 	arch_match_region() - Check that region end is under start + 4G;
> > > 	arch_match_spa() - Check that SPA range start is cfmws_range_start.
> > 
> > I have sent comments for version 1 and suggested a simpler approach
> > for this to implement.
> >
> I replied to your comments for version 1. Please find my message at 
> https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.

The outcome was "I'll think about it.".

> >
> > My comments haven't been addressed yet,
> >
> I think it's not possible to detect an LMH while still in cxl_port_add().

Why is that not possible? I have outlined a solution before: Implement
a function to run platform specific port setup. Run a platform check.
Enable a platform dependend callback. Setup the port using platform
specifics.

> Therefore, I think that this is the best way to go. It works to prevent
> the driver to fail to create Regions and attach Endpoint Decoders on x86
> with Low Memory Holes, provided that the lower SPA range starts at 0x0.

An alternative works as well, that is not an argument.

> 
> On platforms with other kind of holes, the driver will continue to fail
> as it currently does. 

And those platform will then add more specific code and more
complexity in the main path other need to run? See, the code needs to
be encapsulated.

> >
> > but we
> > need better isolation to reduce interference with other platforms and
> > archs. Please take a look again.
> >
> Interference? Do you mean that this series would make the driver fail on 
> other platforms? 

No, other platforms must deal with that specific code and constrains.

> 
> Of course I don't want anything like that. I'm not clear about it...
> Would you please describe how would this series interfere and what
> would happen on other platforms?

Other platforms should not care about platform-specifics of others. So
again, use a platform check and only enable that code there necessary.
And this requires a well defined interface to common code.

Thanks,

-Robert

> 
> Thanks,
> 
> Fabio
> >
> > Many thanks,
> > 
> > -Robert
> > 
> 
> 
> 
>
Dave Jiang March 28, 2025, 9:10 p.m. UTC | #8
On 3/28/25 2:02 AM, Robert Richter wrote:
> On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
>> On Friday, March 21, 2025 11:34:35 AM Central European Standard Time Robert Richter wrote:
>>> Fabio,
>>>
>>> On 14.03.25 12:36:29, Fabio M. De Francesco wrote:
>>>> The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
>>>> Physical Address (HPA) windows that are associated with each CXL Host
>>>> Bridge. Each window represents a contiguous HPA that may be interleaved
>>>> with one or more targets (CXL v3.1 - 9.18.1.3).
>>>>
>>>> The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
>>>> memory to which systems cannot send transactions. On those systems, BIOS
>>>> publishes CFMWS which communicate the active System Physical Address (SPA)
>>>> ranges that map to a subset of the Host Physical Address (HPA) ranges. The
>>>> SPA range trims out the hole, and capacity in the endpoint is lost with no
>>>> SPA to map to CXL HPA in that hole.
>>>>
>>>> In the early stages of CXL Regions construction and attach on platforms
>>>> with Low Memory Holes, the driver fails and returns an error because it
>>>> expects that the CXL Endpoint Decoder range is a subset of the Root
>>>> Decoder's (SPA >= HPA). On x86 with LMH's, it happens that SPA < HPA.
>>>>
>>>> Therefore, detect x86 Low Memory Holes, match CXL Root and Endpoint
>>>> Decoders or already made CXL Regions and Decoders to allow the
>>>> construction of new CXL Regions and the attachment of Endpoint Decoders,
>>>> even if SPA < HPA. If needed because of LMH's, adjust the Endpoint Decoder
>>>> range end to match Root Decoder's.
>>>>
>>>> - Patch 1/4 changes the calling conventions of three match_*_by_range()
>>>>   helpers in preparation of 3/4.
>>>> - Patch 2/4 Introduces helpers to detect LMH's and also one to adjust
>>>>   the HPA range end for CXL Regions construction.
>>>> - Patch 3/4 enables CXL Regions construction and Endpoint Decoders
>>>>   attachment by matching Root Decoders or Regions with Endpoint
>>>>   Decoders, adjusting Endpoint Decoders HPA range end, and relaxing
>>>>   constraints while Endpoints decoders' attachment.
>>>> - Patch 4/4 simulates a LMH for the CXL tests on patched CXL driver.
>>>>
>>>> Many thanks to Alison, Dan, and Ira for their help and for their reviews
>>>> of my RFC on Intel's internal ML.
>>>>
>>>> Commenting on v1, Alison wrote a couple of observations on what users
>>>> will see. I suggest anyone interested to see how this series affect
>>>> users to take a look at her observations.[0] Thank you!
>>>>
>>>> Changes for v3:
>>>>
>>>>   Re-base the series on cxl/next.
>>>>
>>>>   1/4 - 2/4:
>>>> 	Constify local variables.
>>>>   3/4:
>>>> 	Call arch_match_region() from region_res_match_cxl_range().
>>>>   4/4:
>>>> 	arch_match_region() - Check that region end is under start + 4G;
>>>> 	arch_match_spa() - Check that SPA range start is cfmws_range_start.
>>>
>>> I have sent comments for version 1 and suggested a simpler approach
>>> for this to implement.
>>>
>> I replied to your comments for version 1. Please find my message at 
>> https://lore.kernel.org/linux-cxl/9930375.eNJFYEL58v@fdefranc-mobl3/.
> 
> The outcome was "I'll think about it.".
> 
>>>
>>> My comments haven't been addressed yet,
>>>
>> I think it's not possible to detect an LMH while still in cxl_port_add().
> 
> Why is that not possible? I have outlined a solution before: Implement
> a function to run platform specific port setup. Run a platform check.
> Enable a platform dependend callback. Setup the port using platform
> specifics.
> 
>> Therefore, I think that this is the best way to go. It works to prevent
>> the driver to fail to create Regions and attach Endpoint Decoders on x86
>> with Low Memory Holes, provided that the lower SPA range starts at 0x0.
> 
> An alternative works as well, that is not an argument.
> 
>>
>> On platforms with other kind of holes, the driver will continue to fail
>> as it currently does. 
> 
> And those platform will then add more specific code and more
> complexity in the main path other need to run? See, the code needs to
> be encapsulated.
> 
>>>
>>> but we
>>> need better isolation to reduce interference with other platforms and
>>> archs. Please take a look again.
>>>
>> Interference? Do you mean that this series would make the driver fail on 
>> other platforms? 
> 
> No, other platforms must deal with that specific code and constrains.
> 
>>
>> Of course I don't want anything like that. I'm not clear about it...
>> Would you please describe how would this series interfere and what
>> would happen on other platforms?
> 
> Other platforms should not care about platform-specifics of others. So
> again, use a platform check and only enable that code there necessary.
> And this requires a well defined interface to common code.

Hi Robert,
Can you please share more on the background information and/or your specific concerns on the possible memory holes in the other platforms that need to be considered and not covered by Fabio's code? Let's all get on the same page of what specifics we need to consider to make this work. Preferably I want to avoid arch and platform specific code in CXL if possible. Of course that may not always be possible. Would like see if we can avoid a bunch of #ifdef X86/ARM/INTEL/AMD and do it more cleanly. But fully understand the situation first would be helpful to determine that. Thank you!

DJ

> 
> Thanks,
> 
> -Robert
> 
>>
>> Thanks,
>>
>> Fabio
>>>
>>> Many thanks,
>>>
>>> -Robert
>>>
>>
>>
>>
>>