
[2/4,v3] cxl/core: Add helpers to detect Low memory Holes on x86

Message ID 20250314113708.759808-3-fabio.m.de.francesco@linux.intel.com
State New
Series cxl/core: Enable Region creation on x86 with Low Mem Hole

Commit Message

Fabio M. De Francesco March 14, 2025, 11:36 a.m. UTC
On x86 with a Low Memory Hole, the BIOS may publish CFMWS that describe
SPA ranges which are subsets of the corresponding CXL Endpoint Decoder
HPA ranges. This is because a CFMWS never intersects an LMH, while EP
Decoder HPA ranges are always guaranteed to align to the NIW * 256M rule.

In order to construct Regions and attach Decoders, the driver needs to
match Root Decoders and Regions with Endpoint Decoders, but it currently
fails and the entire process returns errors because it doesn't expect
SPA range lengths smaller than those of the corresponding HPA ranges.

Introduce functions that indirectly detect x86 LMHs by comparing SPA
ranges with the corresponding HPA ranges. They will be used during
Region creation and Endpoint attachment to prevent driver failures in a
few steps of that process.

The helpers return true when an HPA/SPA misalignment is detected under
specific conditions: both the SPA and HPA ranges must start at
LMH_CFMWS_RANGE_START (which on x86 with an LMH is 0x0), the SPA range
size must be less than the HPA's, the SPA range must end below 4G, and
the HPA size must be aligned to the NIW * 256M rule.

Also introduce a function to adjust the range end of Regions to be
created on x86 with LMHs.
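
The detection conditions above can be sketched in plain userspace C.
This is illustrative only: `struct lmh_range`, `lmh_mismatch()`, and the
example values are hypothetical stand-ins for the kernel types and do
not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_256M (256ULL << 20)
#define SZ_4G   (4ULL << 30)

/* Simplified stand-in for the kernel's struct range (inclusive end). */
struct lmh_range { uint64_t start, end; };

static uint64_t lmh_range_len(const struct lmh_range *r)
{
        return r->end - r->start + 1;
}

/*
 * Sketch of the detection rule described above: both ranges start at 0,
 * the SPA (root decoder) range ends below 4G and is shorter than the
 * HPA (endpoint decoder) range, and the HPA size follows the
 * NIW * 256M rule.
 */
static bool lmh_mismatch(const struct lmh_range *spa,
                         const struct lmh_range *hpa, int niw)
{
        return spa->start == 0 && hpa->start == 0 &&
               spa->end < SZ_4G && spa->end < hpa->end &&
               lmh_range_len(hpa) % ((uint64_t)niw * SZ_256M) == 0;
}
```

For example, a 2-way interleaved endpoint with a 4G HPA range whose
CFMWS window was trimmed to end at 3G would satisfy all four conditions,
while a window that exactly matches the endpoint range would not.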

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
 2 files changed, 85 insertions(+)
 create mode 100644 drivers/cxl/core/lmh.c
 create mode 100644 drivers/cxl/core/lmh.h

Comments

Ira Weiny March 18, 2025, 3:15 p.m. UTC | #1
Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
> 
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
> 
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
> 
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
> 
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]
Robert Richter March 21, 2025, 10:21 a.m. UTC | #2
On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
> 
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
> 
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
> 
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes

"... that for Intel with LMHs is 0x0", not true for AMD.

> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
> 
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++

The split of the code between patches #2 and #3 does not make sense. The
"interface" you introduce here is out of its context, which is patch #3.
And in patch #3 the functions are actually used, so the reader needs to
switch back to this one. Other than that, the code is not enabled here
at all; it is not even built.

>  2 files changed, 85 insertions(+)
>  create mode 100644 drivers/cxl/core/lmh.c
>  create mode 100644 drivers/cxl/core/lmh.h
> 
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..2e32f867eb94
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "lmh.h"
> +
> +/* Start of CFMWS range that end before x86 Low Memory Holes */
> +#define LMH_CFMWS_RANGE_START 0x0ULL

This looks arbitrary. An endpoint decoder's zero base address could
have other causes too, e.g. the need for address translation. So you
need a different check for the memory hole.

In my previous review comment I requested a platform check to enable
this code.

> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder. If there are LMH's,
> + * the root decoder range end is always less than SZ_4G.
> + */
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> +		    const struct cxl_endpoint_decoder *cxled)
> +{
> +	const struct range *r1, *r2;
> +	int niw;
> +
> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> +	r2 = &cxled->cxld.hpa_range;
> +	niw = cxled->cxld.interleave_ways;
> +
> +	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> +	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&

How about removing LMH_CFMWS_RANGE_START entirely? It is zero anyway,
and dropping it would make this check much easier.

Can this be modified to reuse this code for "holes" other than those
below 4G?

> +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> +		return true;
> +
> +	return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(const struct cxl_region_params *p,
> +		       const struct cxl_decoder *cxld)
> +{
> +	const struct range *r = &cxld->hpa_range;
> +	const struct resource *res = p->res;
> +	int niw = cxld->interleave_ways;
> +
> +	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> +	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&

Same here.

> +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> +		return true;
> +
> +	return false;
> +}

Right now the default check is:

  (p->res->start == r->start && p->res->end == r->end)

Can't we just calculate a matching spa range for the decoder and
region and then check if both match? Both match functions would be
obsolete then.
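
A rough sketch of that suggestion follows (hypothetical code, not from
the patch; the `hole_start` parameter and the clipping rule are
assumptions about how a platform's window might be trimmed):

```c
#include <stdint.h>

/* Simplified stand-in for the kernel's struct range (inclusive end). */
struct spa_range { uint64_t start, end; };

/*
 * Hypothetical helper: derive the SPA window a root decoder would
 * publish for a given endpoint HPA range, on a platform whose CFMWS is
 * clipped so it never intersects a low memory hole. With such a helper,
 * both match functions could reduce to an exact comparison of the
 * derived SPA range against the root decoder's (or region's) range.
 */
static struct spa_range hpa_to_expected_spa(struct spa_range hpa,
                                            uint64_t hole_start)
{
        struct spa_range spa = hpa;

        /* Clip the window so it ends just below the hole. */
        if (spa.start < hole_start && spa.end >= hole_start)
                spa.end = hole_start - 1;
        return spa;
}
```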

> +
> +void arch_adjust_region_resource(struct resource *res,
> +				 struct cxl_root_decoder *cxlrd)
> +{
> +	res->end = cxlrd->res->end;
> +}

This should be squashed with arch_match_spa(): same style and
interface as for cxl_extended_linear_cache_resize(). Please generalize
the interface of cxl_extended_linear_cache_resize() first.

> diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> new file mode 100644
> index 000000000000..16746ceac1ed
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "cxl.h"
> +
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> +		    const struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(const struct cxl_region_params *p,
> +		       const struct cxl_decoder *cxld);
> +void arch_adjust_region_resource(struct resource *res,
> +				 struct cxl_root_decoder *cxlrd);
> +#else
> +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +			   struct cxl_endpoint_decoder *cxled)
> +{
> +	return false;
> +}
> +
> +static bool arch_match_region(struct cxl_region_params *p,
> +			      struct cxl_decoder *cxld)
> +{
> +	return false;
> +}
> +
> +static void arch_adjust_region_resource(struct resource *res,
> +					struct cxl_root_decoder *cxlrd)
> +{
> +}
> +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */

I don't think we will need all these helpers if there are platform
specific callbacks, as suggested in a comment on v1.

-Robert

> -- 
> 2.48.1
>
Fabio M. De Francesco March 26, 2025, 4:47 p.m. UTC | #3
On Friday, March 21, 2025 11:21:44 AM Central European Standard Time Robert Richter wrote:
> On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> > In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
> > 
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than corresponding HPA's.
> > 
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> > 
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> 
> "... that for Intel with LMHs is 0x0", not true for AMD.
> 
> > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > aligned to the NIW * 256M rule.
> > 
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> > 
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> >  drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> 
> The split of the code in patch #2 and #3 does not make sense. The
> "interface" you introduce here is out of context which is patch #3.
> And in patch #3 the functions are actually used, so you need to switch
> back to this one. Other than that, the code is not enabled here at
> all, it is even not built.
> 
I prefer to split the introduction of the helpers and their use into
separate patches.
>
> >  2 files changed, 85 insertions(+)
> >  create mode 100644 drivers/cxl/core/lmh.c
> >  create mode 100644 drivers/cxl/core/lmh.h
> > 
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..2e32f867eb94
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "lmh.h"
> > +
> > +/* Start of CFMWS range that end before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
> 
> This looks arbitrary. An endpoint decoder's zero base address could
> have other reasons too, e.g. the need of address translation. So you
> need a different check for the mem hole.
> 
> In my previous review comment I have requested a platform check for
> this code to enable.
> 
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > +		    const struct cxl_endpoint_decoder *cxled)
> > +{
> > +	const struct range *r1, *r2;
> > +	int niw;
> > +
> > +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > +	r2 = &cxled->cxld.hpa_range;
> > +	niw = cxled->cxld.interleave_ways;
> > +
> > +	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > +	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> 
> How about removing LMH_CFMWS_RANGE_START at all? It is zero anyway and
> would make this check much easier.
>
I think that in the next version I'll check that the range start is a multiple 
of Number of Interleave Ways * 256M (it would include 0x0).
>  
> Can this being modified to reuse this codes for "holes" other than
> below 4G?
> 
This series enables CXL Region creation and Endpoint attach for x86 Low
Memory Holes. I prefer not to enable regions on platforms with memory holes
beyond 4G until I'm sure it's needed. arch_match_spa() and arch_match_region()
can easily be modified in the future if they need to be.
>
> > +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(const struct cxl_region_params *p,
> > +		       const struct cxl_decoder *cxld)
> > +{
> > +	const struct range *r = &cxld->hpa_range;
> > +	const struct resource *res = p->res;
> > +	int niw = cxld->interleave_ways;
> > +
> > +	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > +	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> 
> Same here.
> 
> > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > +		return true;
> > +
> > +	return false;
> > +}
> 
> Right now the default check is:
> 
>   (p->res->start == r->start && p->res->end == r->end)
> 
> Can't we just calculate a matching spa range for the decoder and
> region and then check if both match? Both match functions would be
> obsolete then.
> 
I prefer to keep the design untouched. Anyway, I'll check for other aligned 
res->start, as said above.
>
> > +
> > +void arch_adjust_region_resource(struct resource *res,
> > +				 struct cxl_root_decoder *cxlrd)
> > +{
> > +	res->end = cxlrd->res->end;
> > +}
> 
> This should be squashed with arch_match_spa(): same style and
> interface as for cxl_extended_linear_cache_resize(). Please generalize
> the interface of cxl_extended_linear_cache_resize() first.
> 
Do you mean that arch_match_spa() should adjust res->end? I don't think
it should.
>
> > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > new file mode 100644
> > index 000000000000..16746ceac1ed
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > +		    const struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(const struct cxl_region_params *p,
> > +		       const struct cxl_decoder *cxld);
> > +void arch_adjust_region_resource(struct resource *res,
> > +				 struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > +			   struct cxl_endpoint_decoder *cxled)
> > +{
> > +	return false;
> > +}
> > +
> > +static bool arch_match_region(struct cxl_region_params *p,
> > +			      struct cxl_decoder *cxld)
> > +{
> > +	return false;
> > +}
> > +
> > +static void arch_adjust_region_resource(struct resource *res,
> > +					struct cxl_root_decoder *cxlrd)
> > +{
> > +}
> > +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> 
> I don't think we will need all this helpers if there are platform
> specific callbacks as suggested in a comment to v1.
> 
> -Robert
> 
As said in a reply to 0/4, I'm not yet aware of issues that would
interfere with non-Intel x86. Anyway, I'll think a bit more about using
platform-specific callbacks, even if currently I don't think they are
necessary.
> 
Thanks,

Fabio
Robert Richter March 28, 2025, 10:26 a.m. UTC | #4
On 26.03.25 17:47:26, Fabio M. De Francesco wrote:
> On Friday, March 21, 2025 11:21:44 AM Central European Standard Time Robert Richter wrote:
> > On 14.03.25 12:36:31, Fabio M. De Francesco wrote:
> > > In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> > > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > > ranges are always guaranteed to align to the NIW * 256M rule.
> > > 
> > > In order to construct Regions and attach Decoders, the driver needs to
> > > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > > the entire process returns errors because it doesn't expect to deal with
> > > SPA range lengths smaller than corresponding HPA's.
> > > 
> > > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > > with corresponding HPA's. They will be used in the process of Regions
> > > creation and Endpoint attachments to prevent driver failures in a few
> > > steps of the above-mentioned process.
> > > 
> > > The helpers return true when HPA/SPA misalignments are detected under
> > > specific conditions: both the SPA and HPA ranges must start at
> > > LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> > 
> > "... that for Intel with LMHs is 0x0", not true for AMD.
> > 
> > > be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> > > aligned to the NIW * 256M rule.
> > > 
> > > Also introduce a function to adjust the range end of the Regions to be
> > > created on x86 with LMH's.
> > > 
> > > Cc: Alison Schofield <alison.schofield@intel.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > > ---
> > >  drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> > 
> > The split of the code in patch #2 and #3 does not make sense. The
> > "interface" you introduce here is out of context which is patch #3.
> > And in patch #3 the functions are actually used, so you need to switch
> > back to this one. Other than that, the code is not enabled here at
> > all, it is even not built.
> > 
> I prefer to split the introduction of helpers and the use of them in
> separate patches.

You introduce dead code here. If you want to split it, then add an
empty stub function that is actually called in the first patch and add
the actual implementation in the second.

> >
> > >  2 files changed, 85 insertions(+)
> > >  create mode 100644 drivers/cxl/core/lmh.c
> > >  create mode 100644 drivers/cxl/core/lmh.h
> > > 
> > > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > > new file mode 100644
> > > index 000000000000..2e32f867eb94
> > > --- /dev/null
> > > +++ b/drivers/cxl/core/lmh.c
> > > @@ -0,0 +1,56 @@
> > > +// SPDX-License-Identifier: GPL-2.0-only
> > > +
> > > +#include <linux/range.h>
> > > +#include "lmh.h"
> > > +
> > > +/* Start of CFMWS range that end before x86 Low Memory Holes */
> > > +#define LMH_CFMWS_RANGE_START 0x0ULL
> > 
> > This looks arbitrary. An endpoint decoder's zero base address could
> > have other reasons too, e.g. the need of address translation. So you
> > need a different check for the mem hole.
> > 
> > In my previous review comment I have requested a platform check for
> > this code to enable.
> > 
> > > +
> > > +/*
> > > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > > + *
> > > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > > + * the given endpoint decoder HPA range size is always expected aligned and
> > > + * also larger than that of the matching root decoder. If there are LMH's,
> > > + * the root decoder range end is always less than SZ_4G.
> > > + */
> > > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > > +		    const struct cxl_endpoint_decoder *cxled)
> > > +{
> > > +	const struct range *r1, *r2;
> > > +	int niw;
> > > +
> > > +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > > +	r2 = &cxled->cxld.hpa_range;
> > > +	niw = cxled->cxld.interleave_ways;
> > > +
> > > +	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > > +	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > 
> > How about removing LMH_CFMWS_RANGE_START at all? It is zero anyway and
> > would make this check much easier.
> >
> I think that in the next version I'll check that the range start is a multiple 
> of Number of Interleave Ways * 256M (it would include 0x0).

The code suggests that r2->start is zero, but shouldn't the endpoint's
base be non-zero? And r2->end being greater than SZ_4G is the case for
all ranges above 4G. I'm confused: how do you determine that the
endpoint fits into the LMH range?

> >  
> > Can this being modified to reuse this codes for "holes" other than
> > below 4G?
> > 
> This series enables CXL Regions creation and Endpoints attach for x86 LOW 
> Memory Holes. I prefer not to enable regions on platforms with memory holes 
> beyond 4G until I'm sure it's needed. arch_match_spa() and arch_match_region()
> can be easily modified if in future they need to be.
> >
> > > +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))

Isn't that length check of the endpoint always true, since this is a
CXL range?

> > > +		return true;

The whole check looks odd, please clarify.

> > > +
> > > +	return false;
> > > +}
> > > +
> > > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > > +bool arch_match_region(const struct cxl_region_params *p,
> > > +		       const struct cxl_decoder *cxld)
> > > +{
> > > +	const struct range *r = &cxld->hpa_range;
> > > +	const struct resource *res = p->res;
> > > +	int niw = cxld->interleave_ways;
> > > +
> > > +	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > > +	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > 
> > Same here.
> > 
> > > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > > +		return true;
> > > +
> > > +	return false;
> > > +}
> > 
> > Right now the default check is:
> > 
> >   (p->res->start == r->start && p->res->end == r->end)
> > 
> > Can't we just calculate a matching spa range for the decoder and
> > region and then check if both match? Both match functions would be
> > obsolete then.
> > 
> I prefer to keep the design untouched. Anyway, I'll check for other aligned 
> res->start, as said above.
> >
> > > +
> > > +void arch_adjust_region_resource(struct resource *res,
> > > +				 struct cxl_root_decoder *cxlrd)
> > > +{
> > > +	res->end = cxlrd->res->end;
> > > +}
> > 
> > This should be squashed with arch_match_spa(): same style and
> > interface as for cxl_extended_linear_cache_resize(). Please generalize
> > the interface of cxl_extended_linear_cache_resize() first.
> > 
> Do you mean that arch_match_spa() should adjust res->end? I don't think
> it should.

Yes, it should follow the style of already existing code with unified
interfaces.

> >
> > > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > > new file mode 100644
> > > index 000000000000..16746ceac1ed
> > > --- /dev/null
> > > +++ b/drivers/cxl/core/lmh.h
> > > @@ -0,0 +1,29 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > +
> > > +#include "cxl.h"
> > > +
> > > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > > +		    const struct cxl_endpoint_decoder *cxled);
> > > +bool arch_match_region(const struct cxl_region_params *p,
> > > +		       const struct cxl_decoder *cxld);
> > > +void arch_adjust_region_resource(struct resource *res,
> > > +				 struct cxl_root_decoder *cxlrd);
> > > +#else
> > > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > > +			   struct cxl_endpoint_decoder *cxled)
> > > +{
> > > +	return false;
> > > +}
> > > +
> > > +static bool arch_match_region(struct cxl_region_params *p,
> > > +			      struct cxl_decoder *cxld)
> > > +{
> > > +	return false;
> > > +}
> > > +
> > > +static void arch_adjust_region_resource(struct resource *res,
> > > +					struct cxl_root_decoder *cxlrd)
> > > +{
> > > +}
> > > +#endif /* CXL_ARCH_LOW_MEMORY_HOLE */
> > 
> > I don't think we will need all this helpers if there are platform
> > specific callbacks as suggested in a comment to v1.
> > 
> > -Robert
> > 
> As said in a reply to 0/4, I'm not yet aware of issues that would interfere
> with non Intel x86. Anyway, I'll think a bit more about using platform
> specific callbacks, even if currently I don't think they are necessary. 

Please do, thanks.

-Robert

> > 
> Thanks,
> 
> Fabio
> 
> 
>
Dan Williams March 28, 2025, 11:40 p.m. UTC | #5
Fabio M. De Francesco wrote:
> In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> ranges are always guaranteed to align to the NIW * 256M rule.
> 
> In order to construct Regions and attach Decoders, the driver needs to
> match Root Decoders and Regions with Endpoint Decoders, but it fails and
> the entire process returns errors because it doesn't expect to deal with
> SPA range lengths smaller than corresponding HPA's.
> 
> Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> with corresponding HPA's. They will be used in the process of Regions
> creation and Endpoint attachments to prevent driver failures in a few
> steps of the above-mentioned process.
> 
> The helpers return true when HPA/SPA misalignments are detected under
> specific conditions: both the SPA and HPA ranges must start at
> LMH_CFMWS_RANGE_START (that in x86 with LMH's is 0x0), SPA range sizes
> be less than HPA's, SPA's range's size be less than 4G, HPA's size be
> aligned to the NIW * 256M rule.
> 
> Also introduce a function to adjust the range end of the Regions to be
> created on x86 with LMH's.
> 
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>  create mode 100644 drivers/cxl/core/lmh.c
>  create mode 100644 drivers/cxl/core/lmh.h
> 
> diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> new file mode 100644
> index 000000000000..2e32f867eb94
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.c
> @@ -0,0 +1,56 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/range.h>
> +#include "lmh.h"
> +
> +/* Start of CFMWS range that end before x86 Low Memory Holes */
> +#define LMH_CFMWS_RANGE_START 0x0ULL
> +
> +/*
> + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> + *
> + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> + * the given endpoint decoder HPA range size is always expected aligned and
> + * also larger than that of the matching root decoder. If there are LMH's,
> + * the root decoder range end is always less than SZ_4G.
> + */
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> +		    const struct cxl_endpoint_decoder *cxled)
> +{
> +	const struct range *r1, *r2;
> +	int niw;
> +
> +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> +	r2 = &cxled->cxld.hpa_range;
> +	niw = cxled->cxld.interleave_ways;
> +
> +	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> +	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> +		return true;
> +
> +	return false;
> +}
> +
> +/* Similar to arch_match_spa(), it matches regions and decoders */
> +bool arch_match_region(const struct cxl_region_params *p,
> +		       const struct cxl_decoder *cxld)
> +{
> +	const struct range *r = &cxld->hpa_range;
> +	const struct resource *res = p->res;
> +	int niw = cxld->interleave_ways;
> +
> +	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> +	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> +		return true;
> +
> +	return false;
> +}
> +
> +void arch_adjust_region_resource(struct resource *res,
> +				 struct cxl_root_decoder *cxlrd)
> +{
> +	res->end = cxlrd->res->end;
> +}
> diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> new file mode 100644
> index 000000000000..16746ceac1ed
> --- /dev/null
> +++ b/drivers/cxl/core/lmh.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#include "cxl.h"
> +
> +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> +		    const struct cxl_endpoint_decoder *cxled);
> +bool arch_match_region(const struct cxl_region_params *p,
> +		       const struct cxl_decoder *cxld);
> +void arch_adjust_region_resource(struct resource *res,
> +				 struct cxl_root_decoder *cxlrd);
> +#else
> +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> +			   struct cxl_endpoint_decoder *cxled)
> +{
> +	return false;

I would have expected the default match routines to do the default
matching, not return false.

This can document the common expectation on architectures that do not
need to account for decoders not aligning to window boundaries due to
holes.
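
A minimal userspace sketch of what such a default stub might do
(illustrative only; `struct plain_range` and `default_match()` are
stand-ins for the kernel types and names, and this is one reading of
the suggestion rather than code from the series):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct range (inclusive end). */
struct plain_range { uint64_t start, end; };

/*
 * Possible !CONFIG_CXL_ARCH_LOW_MEMORY_HOLE behavior as suggested:
 * perform the ordinary exact-range comparison instead of returning
 * false, documenting that on architectures without holes decoders are
 * expected to align exactly to window boundaries.
 */
static bool default_match(const struct plain_range *spa,
                          const struct plain_range *hpa)
{
        return spa->start == hpa->start && spa->end == hpa->end;
}
```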
Fabio M. De Francesco March 29, 2025, 10:05 a.m. UTC | #6
On Saturday, March 29, 2025 12:40:34 AM Central European Standard Time Dan Williams wrote:
> Fabio M. De Francesco wrote:
> > In x86 with Low memory Hole, the BIOS may publishes CFMWS that describe
> > SPA ranges which are subsets of the corresponding CXL Endpoint Decoders
> > HPA's because the CFMWS never intersects LMH's while EP Decoders HPA's
> > ranges are always guaranteed to align to the NIW * 256M rule.
> > 
> > In order to construct Regions and attach Decoders, the driver needs to
> > match Root Decoders and Regions with Endpoint Decoders, but it fails and
> > the entire process returns errors because it doesn't expect to deal with
> > SPA range lengths smaller than the corresponding HPA range lengths.
> > 
> > Introduce functions that indirectly detect x86 LMH's by comparing SPA's
> > with corresponding HPA's. They will be used in the process of Regions
> > creation and Endpoint attachments to prevent driver failures in a few
> > steps of the above-mentioned process.
> > 
> > The helpers return true when HPA/SPA misalignments are detected under
> > specific conditions: both the SPA and HPA ranges must start at
> > LMH_CFMWS_RANGE_START (which on x86 with LMHs is 0x0), the SPA range size
> > must be less than the HPA's and less than 4G, and the HPA size must be
> > aligned to the NIW * 256M rule.
> > 
> > Also introduce a function to adjust the range end of the Regions to be
> > created on x86 with LMH's.
> > 
> > Cc: Alison Schofield <alison.schofield@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> > ---
> >  drivers/cxl/core/lmh.c | 56 ++++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/core/lmh.h | 29 ++++++++++++++++++++++
> >  2 files changed, 85 insertions(+)
> >  create mode 100644 drivers/cxl/core/lmh.c
> >  create mode 100644 drivers/cxl/core/lmh.h
> > 
> > diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
> > new file mode 100644
> > index 000000000000..2e32f867eb94
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.c
> > @@ -0,0 +1,56 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> > +#include <linux/range.h>
> > +#include "lmh.h"
> > +
> > +/* Start of CFMWS ranges that end before x86 Low Memory Holes */
> > +#define LMH_CFMWS_RANGE_START 0x0ULL
> > +
> > +/*
> > + * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
> > + *
> > + * On x86, CFMWS ranges never intersect memory holes while endpoint decoders
> > + * HPA range sizes are always guaranteed aligned to NIW * 256MB; therefore,
> > + * the given endpoint decoder HPA range size is always expected aligned and
> > + * also larger than that of the matching root decoder. If there are LMH's,
> > + * the root decoder range end is always less than SZ_4G.
> > + */
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > +		    const struct cxl_endpoint_decoder *cxled)
> > +{
> > +	const struct range *r1, *r2;
> > +	int niw;
> > +
> > +	r1 = &cxlrd->cxlsd.cxld.hpa_range;
> > +	r2 = &cxled->cxld.hpa_range;
> > +	niw = cxled->cxld.interleave_ways;
> > +
> > +	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
> > +	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
> > +	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +/* Similar to arch_match_spa(), it matches regions and decoders */
> > +bool arch_match_region(const struct cxl_region_params *p,
> > +		       const struct cxl_decoder *cxld)
> > +{
> > +	const struct range *r = &cxld->hpa_range;
> > +	const struct resource *res = p->res;
> > +	int niw = cxld->interleave_ways;
> > +
> > +	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
> > +	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
> > +	    IS_ALIGNED(range_len(r), niw * SZ_256M))
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> > +void arch_adjust_region_resource(struct resource *res,
> > +				 struct cxl_root_decoder *cxlrd)
> > +{
> > +	res->end = cxlrd->res->end;
> > +}
> > diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
> > new file mode 100644
> > index 000000000000..16746ceac1ed
> > --- /dev/null
> > +++ b/drivers/cxl/core/lmh.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include "cxl.h"
> > +
> > +#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
> > +bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
> > +		    const struct cxl_endpoint_decoder *cxled);
> > +bool arch_match_region(const struct cxl_region_params *p,
> > +		       const struct cxl_decoder *cxld);
> > +void arch_adjust_region_resource(struct resource *res,
> > +				 struct cxl_root_decoder *cxlrd);
> > +#else
> > +static bool arch_match_spa(struct cxl_root_decoder *cxlrd,
> > +			   struct cxl_endpoint_decoder *cxled)
> > +{
> > +	return false;
> 
> I would have expected the default match routines to do the default
> matching, not return false.
> 
> This can document the common expectation on architectures that do not
> need to account for decoders not aligning to window boundaries due to
> holes.
> 
Hi Dan,

A typical example of arch_match_spa() use is in match_root_decoder_by_range();
on platforms that don't enable support for the low memory hole, the stub
returns false, so the default behavior is to fail the match.

This is how arch_match_spa() is used to detect a hole and allow the matching:

static int match_root_decoder_by_range(struct device *dev,
                                       const void *data)
{
        const struct cxl_endpoint_decoder *cxled = data;
        struct cxl_root_decoder *cxlrd;
        const struct range *r1, *r2;

        if (!is_root_decoder(dev))
                return 0;

        cxlrd = to_cxl_root_decoder(dev);
        r1 = &cxlrd->cxlsd.cxld.hpa_range;
        r2 = &cxled->cxld.hpa_range;

        if (range_contains(r1, r2))
                return true;
        if (arch_match_spa(cxlrd, cxled))
                return true;

        return false;
}

Currently the default behavior is for match_root_decoder_by_range() to
not match root and endpoint decoders. I left that default unchanged for
all platforms and architectures that don't enable LMH support. 

Thanks,

Fabio
diff mbox series

Patch

diff --git a/drivers/cxl/core/lmh.c b/drivers/cxl/core/lmh.c
new file mode 100644
index 000000000000..2e32f867eb94
--- /dev/null
+++ b/drivers/cxl/core/lmh.c
@@ -0,0 +1,56 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/range.h>
+#include "lmh.h"
+
+/* Start of CFMWS ranges that end before x86 Low Memory Holes */
+#define LMH_CFMWS_RANGE_START 0x0ULL
+
+/*
+ * Match CXL Root and Endpoint Decoders by comparing SPA and HPA ranges.
+ *
+ * On x86, CFMWS ranges never intersect memory holes, while endpoint decoder
+ * HPA range sizes are always guaranteed to be aligned to NIW * 256MB;
+ * therefore, the given endpoint decoder's HPA range size is always expected
+ * to be aligned and also larger than that of the matching root decoder. If
+ * there are LMHs, the root decoder range end is always less than SZ_4G.
+ */
+bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+		    const struct cxl_endpoint_decoder *cxled)
+{
+	const struct range *r1, *r2;
+	int niw;
+
+	r1 = &cxlrd->cxlsd.cxld.hpa_range;
+	r2 = &cxled->cxld.hpa_range;
+	niw = cxled->cxld.interleave_ways;
+
+	if (r1->start == LMH_CFMWS_RANGE_START && r1->start == r2->start &&
+	    r1->end < (LMH_CFMWS_RANGE_START + SZ_4G) && r1->end < r2->end &&
+	    IS_ALIGNED(range_len(r2), niw * SZ_256M))
+		return true;
+
+	return false;
+}
+
+/* Similar to arch_match_spa(), it matches regions and decoders */
+bool arch_match_region(const struct cxl_region_params *p,
+		       const struct cxl_decoder *cxld)
+{
+	const struct range *r = &cxld->hpa_range;
+	const struct resource *res = p->res;
+	int niw = cxld->interleave_ways;
+
+	if (res->start == LMH_CFMWS_RANGE_START && res->start == r->start &&
+	    res->end < (LMH_CFMWS_RANGE_START + SZ_4G) && res->end < r->end &&
+	    IS_ALIGNED(range_len(r), niw * SZ_256M))
+		return true;
+
+	return false;
+}
+
+void arch_adjust_region_resource(struct resource *res,
+				 struct cxl_root_decoder *cxlrd)
+{
+	res->end = cxlrd->res->end;
+}
diff --git a/drivers/cxl/core/lmh.h b/drivers/cxl/core/lmh.h
new file mode 100644
index 000000000000..16746ceac1ed
--- /dev/null
+++ b/drivers/cxl/core/lmh.h
@@ -0,0 +1,29 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include "cxl.h"
+
+#ifdef CONFIG_CXL_ARCH_LOW_MEMORY_HOLE
+bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+		    const struct cxl_endpoint_decoder *cxled);
+bool arch_match_region(const struct cxl_region_params *p,
+		       const struct cxl_decoder *cxld);
+void arch_adjust_region_resource(struct resource *res,
+				 struct cxl_root_decoder *cxlrd);
+#else
+static inline bool arch_match_spa(const struct cxl_root_decoder *cxlrd,
+				  const struct cxl_endpoint_decoder *cxled)
+{
+	return false;
+}
+
+static inline bool arch_match_region(const struct cxl_region_params *p,
+				     const struct cxl_decoder *cxld)
+{
+	return false;
+}
+
+static inline void arch_adjust_region_resource(struct resource *res,
+					       struct cxl_root_decoder *cxlrd)
+{
+}
+#endif /* CONFIG_CXL_ARCH_LOW_MEMORY_HOLE */