[1/4] cxl: Remove the CXL_DECODER_MIXED mistake

Message ID 173709423269.753996.17229236572128350685.stgit@dwillia2-xfh.jf.intel.com
State New
Series cxl: DPA partition metadata is a mess...

Commit Message

Dan Williams Jan. 17, 2025, 6:10 a.m. UTC
CXL_DECODER_MIXED is a safety mechanism introduced for the case where
platform firmware has programmed an endpoint decoder that straddles a
DPA partition boundary. While the kernel is careful to only allocate DPA
capacity within a single partition there is no guarantee that platform
firmware, or anything that touched the device before the current kernel,
gets that right.

However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
designation because of the way it tracks partition boundaries. A
request_resource() that spans ->ram_res and ->pmem_res fails with the
following signature:

    __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation

CXL_DECODER_MIXED is dead defensive programming after the driver has
already given up on the device. It has never offered any protection in
practice, just delete it.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/core/hdm.c    |    8 ++++----
 drivers/cxl/core/region.c |   12 ------------
 drivers/cxl/cxl.h         |    4 +---
 3 files changed, 5 insertions(+), 19 deletions(-)

Comments

Jonathan Cameron Jan. 17, 2025, 10:03 a.m. UTC | #1
On Thu, 16 Jan 2025 22:10:32 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> CXL_DECODER_MIXED is a safety mechanism introduced for the case where
> platform firmware has programmed an endpoint decoder that straddles a
> DPA partition boundary. While the kernel is careful to only allocate DPA
> capacity within a single partition there is no guarantee that platform
> firmware, or anything that touched the device before the current kernel,
> gets that right.
> 
> However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
> designation because of the way it tracks partition boundaries. A
> request_resource() that spans ->ram_res and ->pmem_res fails with the
> following signature:
> 
>     __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation
> 
> CXL_DECODER_MIXED is dead defensive programming after the driver has
> already given up on the device. It has never offered any protection in
> practice, just delete it.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/cxl/core/hdm.c    |    8 ++++----
>  drivers/cxl/core/region.c |   12 ------------
>  drivers/cxl/cxl.h         |    4 +---
>  3 files changed, 5 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 28edd5822486..be8556119d94 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -329,12 +329,12 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>  
>  	if (resource_contains(&cxlds->pmem_res, res))
>  		cxled->mode = CXL_DECODER_PMEM;
> -	else if (resource_contains(&cxlds->ram_res, res))
> +	if (resource_contains(&cxlds->ram_res, res))

Logic of removing the else?  I assume there is 0 chance that both conditions
match, but doesn't this mean if the res is not in ram_res we always hit the next
else and print the warning?

>  		cxled->mode = CXL_DECODER_RAM;
>  	else {
> -		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
> -			 port->id, cxled->cxld.id, cxled->dpa_res);
> -		cxled->mode = CXL_DECODER_MIXED;
> +		dev_warn(dev, "decoder%d.%d: %pr does not map any partition\n",
> +			 port->id, cxled->cxld.id, res);
> +		cxled->mode = CXL_DECODER_NONE;
>  	}
>  
>  	port->hdm_end++;

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f6015f24ad38..0fb8d70fa3e5 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -379,7 +379,6 @@ enum cxl_decoder_mode {
>  	CXL_DECODER_NONE,
>  	CXL_DECODER_RAM,
>  	CXL_DECODER_PMEM,
> -	CXL_DECODER_MIXED,
>  	CXL_DECODER_DEAD,
>  };
>  
> @@ -389,10 +388,9 @@ static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
>  		[CXL_DECODER_NONE] = "none",
>  		[CXL_DECODER_RAM] = "ram",
>  		[CXL_DECODER_PMEM] = "pmem",
> -		[CXL_DECODER_MIXED] = "mixed",
>  	};
>  
> -	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
> +	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_PMEM)
Maybe just < DEAD is simpler?
>  		return names[mode];
>  	return "mixed";
>  }
> 
>
Alejandro Lucero Palau Jan. 17, 2025, 10:24 a.m. UTC | #2
On 1/17/25 06:10, Dan Williams wrote:
> CXL_DECODER_MIXED is a safety mechanism introduced for the case where
> platform firmware has programmed an endpoint decoder that straddles a
> DPA partition boundary. While the kernel is careful to only allocate DPA
> capacity within a single partition there is no guarantee that platform
> firmware, or anything that touched the device before the current kernel,
> gets that right.
>
> However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
> designation because of the way it tracks partition boundaries. A
> request_resource() that spans ->ram_res and ->pmem_res fails with the
> following signature:
>
>      __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation
>
> CXL_DECODER_MIXED is dead defensive programming after the driver has
> already given up on the device. It has never offered any protection in
> practice, just delete it.


I wonder whether the reason for adding CXL_DECODER_MIXED still makes it
worth fixing __cxl_dpa_reserve() instead of just not supporting this
case.

Assuming it does not:

Reviewed-by: Alejandro Lucero <alucerop@amd.com>


> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/cxl/core/hdm.c    |    8 ++++----
>   drivers/cxl/core/region.c |   12 ------------
>   drivers/cxl/cxl.h         |    4 +---
>   3 files changed, 5 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index 28edd5822486..be8556119d94 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -329,12 +329,12 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
>   
>   	if (resource_contains(&cxlds->pmem_res, res))
>   		cxled->mode = CXL_DECODER_PMEM;
> -	else if (resource_contains(&cxlds->ram_res, res))
> +	if (resource_contains(&cxlds->ram_res, res))
>   		cxled->mode = CXL_DECODER_RAM;
>   	else {
> -		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
> -			 port->id, cxled->cxld.id, cxled->dpa_res);
> -		cxled->mode = CXL_DECODER_MIXED;
> +		dev_warn(dev, "decoder%d.%d: %pr does not map any partition\n",
> +			 port->id, cxled->cxld.id, res);
> +		cxled->mode = CXL_DECODER_NONE;
>   	}
>   
>   	port->hdm_end++;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index d77899650798..e4885acac853 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -2725,18 +2725,6 @@ static int poison_by_decoder(struct device *dev, void *arg)
>   	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
>   		return rc;
>   
> -	/*
> -	 * Regions are only created with single mode decoders: pmem or ram.
> -	 * Linux does not support mixed mode decoders. This means that
> -	 * reading poison per endpoint decoder adheres to the requirement
> -	 * that poison reads of pmem and ram must be separated.
> -	 * CXL 3.0 Spec 8.2.9.8.4.1
> -	 */
> -	if (cxled->mode == CXL_DECODER_MIXED) {
> -		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
> -		return rc;
> -	}
> -
>   	cxlmd = cxled_to_memdev(cxled);
>   	if (cxled->skip) {
>   		offset = cxled->dpa_res->start - cxled->skip;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f6015f24ad38..0fb8d70fa3e5 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -379,7 +379,6 @@ enum cxl_decoder_mode {
>   	CXL_DECODER_NONE,
>   	CXL_DECODER_RAM,
>   	CXL_DECODER_PMEM,
> -	CXL_DECODER_MIXED,
>   	CXL_DECODER_DEAD,
>   };
>   
> @@ -389,10 +388,9 @@ static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
>   		[CXL_DECODER_NONE] = "none",
>   		[CXL_DECODER_RAM] = "ram",
>   		[CXL_DECODER_PMEM] = "pmem",
> -		[CXL_DECODER_MIXED] = "mixed",
>   	};
>   
> -	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
> +	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_PMEM)
>   		return names[mode];
>   	return "mixed";
>   }
>
>
Dan Williams Jan. 17, 2025, 5:47 p.m. UTC | #3
Jonathan Cameron wrote:
> On Thu, 16 Jan 2025 22:10:32 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > CXL_DECODER_MIXED is a safety mechanism introduced for the case where
> > platform firmware has programmed an endpoint decoder that straddles a
> > DPA partition boundary. While the kernel is careful to only allocate DPA
> > capacity within a single partition there is no guarantee that platform
> > firmware, or anything that touched the device before the current kernel,
> > gets that right.
> > 
> > However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
> > designation because of the way it tracks partition boundaries. A
> > request_resource() that spans ->ram_res and ->pmem_res fails with the
> > following signature:
> > 
> >     __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation
> > 
> > CXL_DECODER_MIXED is dead defensive programming after the driver has
> > already given up on the device. It has never offered any protection in
> > practice, just delete it.
> > 
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> >  drivers/cxl/core/hdm.c    |    8 ++++----
> >  drivers/cxl/core/region.c |   12 ------------
> >  drivers/cxl/cxl.h         |    4 +---
> >  3 files changed, 5 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index 28edd5822486..be8556119d94 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -329,12 +329,12 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
> >  
> >  	if (resource_contains(&cxlds->pmem_res, res))
> >  		cxled->mode = CXL_DECODER_PMEM;
> > -	else if (resource_contains(&cxlds->ram_res, res))
> > +	if (resource_contains(&cxlds->ram_res, res))
> 
> Logic of removing the else?  I assume there is 0 chance that both conditions
> match, but doesn't this mean if the res is not in ram_res we always hit the next
> else and print the warning?

...bug that I fixed later in the series and did not fold all the way
back to where it came from when splitting the series.

Good catch.
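
For readers following along, the fall-through can be reproduced outside the
kernel. The sketch below is a minimal userspace model, not the driver code:
`struct dpa_range`, `contains()`, `classify_as_posted()` and
`classify_with_else()` are hypothetical stand-ins for `struct resource`,
resource_contains() and the branch in __cxl_dpa_reserve(), and the enum is a
simplified copy of the decoder modes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct resource (inclusive range). */
struct dpa_range {
	unsigned long long start;
	unsigned long long end;
};

/* Simplified copy of the decoder modes that remain after this patch. */
enum mode { MODE_NONE, MODE_RAM, MODE_PMEM };

/* Models resource_contains(): true when inner lies entirely within outer. */
static bool contains(const struct dpa_range *outer,
		     const struct dpa_range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* Branch structure as posted in this patch: the "else" before the ram check is gone. */
static enum mode classify_as_posted(const struct dpa_range *ram,
				    const struct dpa_range *pmem,
				    const struct dpa_range *res)
{
	enum mode mode = MODE_NONE;

	assert(res->start <= res->end);

	if (contains(pmem, res))
		mode = MODE_PMEM;
	if (contains(ram, res))
		mode = MODE_RAM;
	else
		mode = MODE_NONE;	/* also reached for a pmem-only range */

	return mode;
}

/* Same checks with the "else if" kept, so a pmem match is not overwritten. */
static enum mode classify_with_else(const struct dpa_range *ram,
				    const struct dpa_range *pmem,
				    const struct dpa_range *res)
{
	assert(res->start <= res->end);

	if (contains(pmem, res))
		return MODE_PMEM;
	else if (contains(ram, res))
		return MODE_RAM;

	return MODE_NONE;
}
```

With a range that sits entirely inside the pmem partition, the as-posted
variant ends up at MODE_NONE (the warning branch), while the "else if" variant
keeps MODE_PMEM. A range inside the ram partition classifies the same way in
both.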

> 
> >  		cxled->mode = CXL_DECODER_RAM;
> >  	else {
> > -		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
> > -			 port->id, cxled->cxld.id, cxled->dpa_res);
> > -		cxled->mode = CXL_DECODER_MIXED;
> > +		dev_warn(dev, "decoder%d.%d: %pr does not map any partition\n",
> > +			 port->id, cxled->cxld.id, res);
> > +		cxled->mode = CXL_DECODER_NONE;
> >  	}
> >  
> >  	port->hdm_end++;
> 
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index f6015f24ad38..0fb8d70fa3e5 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -379,7 +379,6 @@ enum cxl_decoder_mode {
> >  	CXL_DECODER_NONE,
> >  	CXL_DECODER_RAM,
> >  	CXL_DECODER_PMEM,
> > -	CXL_DECODER_MIXED,
> >  	CXL_DECODER_DEAD,
> >  };
> >  
> > @@ -389,10 +388,9 @@ static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
> >  		[CXL_DECODER_NONE] = "none",
> >  		[CXL_DECODER_RAM] = "ram",
> >  		[CXL_DECODER_PMEM] = "pmem",
> > -		[CXL_DECODER_MIXED] = "mixed",
> >  	};
> >  
> > -	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
> > +	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_PMEM)
> Maybe just < DEAD is simpler?

I like that.
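
A sketch of how the helper might look with that suggestion applied, assuming
the enum keeps CXL_DECODER_DEAD as its last entry. This is only an
illustration of the bound check, not the version that was eventually merged.
The static_assert() is an addition for the sketch, and the "mixed" fallback
string is left exactly as it appears in the posted patch.

```c
#include <assert.h>

enum cxl_decoder_mode {
	CXL_DECODER_NONE,
	CXL_DECODER_RAM,
	CXL_DECODER_PMEM,
	CXL_DECODER_DEAD,
};

static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
{
	static const char * const names[] = {
		[CXL_DECODER_NONE] = "none",
		[CXL_DECODER_RAM] = "ram",
		[CXL_DECODER_PMEM] = "pmem",
	};

	/* Sketch-only check: the table has one name per mode below DEAD. */
	static_assert(sizeof(names) / sizeof(names[0]) == CXL_DECODER_DEAD,
		      "names[] must cover every mode below CXL_DECODER_DEAD");

	/* Upper bound expressed against DEAD instead of the last named mode. */
	if (mode >= CXL_DECODER_NONE && mode < CXL_DECODER_DEAD)
		return names[mode];
	return "mixed";	/* fallback string unchanged from the posted patch */
}
```

Bounding against CXL_DECODER_DEAD means the comparison does not need to be
touched again if another named mode is ever added ahead of DEAD; only the
names[] table would change.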
Dan Williams Jan. 17, 2025, 5:54 p.m. UTC | #4
Alejandro Lucero Palau wrote:
> 
> On 1/17/25 06:10, Dan Williams wrote:
> > CXL_DECODER_MIXED is a safety mechanism introduced for the case where
> > platform firmware has programmed an endpoint decoder that straddles a
> > DPA partition boundary. While the kernel is careful to only allocate DPA
> > capacity within a single partition there is no guarantee that platform
> > firmware, or anything that touched the device before the current kernel,
> > gets that right.
> >
> > However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
> > designation because of the way it tracks partition boundaries. A
> > request_resource() that spans ->ram_res and ->pmem_res fails with the
> > following signature:
> >
> >      __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation
> >
> > CXL_DECODER_MIXED is dead defensive programming after the driver has
> > already given up on the device. It has never offered any protection in
> > practice, just delete it.
> 
> 
> I wonder whether the reason for adding CXL_DECODER_MIXED still makes it
> worth fixing __cxl_dpa_reserve() instead of just not supporting this
> case.

See where that "failed to reserve allocation" message is printed. That
leads to the driver giving up on the device before the bad decoder
setting can confuse other code paths.
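
To illustrate why a straddling decoder never reaches the mode classification,
here is a rough userspace model of the reservation step. It assumes the
partitions behave like non-overlapping child ranges and that a request must
nest entirely inside one of them to succeed. `struct span`,
`span_contains()`, `span_overlaps()` and `reserve_model()` are hypothetical
names for this sketch and are not the kernel's resource API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive address range, standing in for a DPA resource. */
struct span {
	unsigned long long start;
	unsigned long long end;
};

static bool span_contains(const struct span *outer, const struct span *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

static bool span_overlaps(const struct span *a, const struct span *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Model of the reservation: a request that touches a partition must be
 * fully contained by it. A request that straddles a partition boundary
 * overlaps a partition without nesting in it, so it is rejected.
 * Returns 0 on success, -EBUSY on a conflict.
 */
static int reserve_model(const struct span *parts, size_t nparts,
			 const struct span *req)
{
	size_t i;

	assert(req->start <= req->end);

	for (i = 0; i < nparts; i++) {
		if (!span_overlaps(&parts[i], req))
			continue;
		return span_contains(&parts[i], req) ? 0 : -EBUSY;
	}

	/* Touches no partition: nothing to conflict with in this model. */
	return 0;
}
```

In this model a request that starts in the ram span and ends in the pmem span
returns -EBUSY, which corresponds to the "failed to reserve allocation" path,
so code that runs after a successful reservation only ever sees ranges that
fit one partition or none.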
Ira Weiny Jan. 17, 2025, 6:45 p.m. UTC | #5
Dan Williams wrote:
> CXL_DECODER_MIXED is a safety mechanism introduced for the case where
> platform firmware has programmed an endpoint decoder that straddles a
> DPA partition boundary. While the kernel is careful to only allocate DPA
> capacity within a single partition there is no guarantee that platform
> firmware, or anything that touched the device before the current kernel,
> gets that right.
> 
> However, __cxl_dpa_reserve() will never get to the CXL_DECODER_MIXED
> designation because of the way it tracks partition boundaries. A
> request_resource() that spans ->ram_res and ->pmem_res fails with the
> following signature:
> 
>     __cxl_dpa_reserve: cxl_port endpoint15: decoder15.0: failed to reserve allocation
> 
> CXL_DECODER_MIXED is dead defensive programming after the driver has
> already given up on the device. It has never offered any protection in
> practice, just delete it.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

Patch

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 28edd5822486..be8556119d94 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -329,12 +329,12 @@  static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled,
 
 	if (resource_contains(&cxlds->pmem_res, res))
 		cxled->mode = CXL_DECODER_PMEM;
-	else if (resource_contains(&cxlds->ram_res, res))
+	if (resource_contains(&cxlds->ram_res, res))
 		cxled->mode = CXL_DECODER_RAM;
 	else {
-		dev_warn(dev, "decoder%d.%d: %pr mixed mode not supported\n",
-			 port->id, cxled->cxld.id, cxled->dpa_res);
-		cxled->mode = CXL_DECODER_MIXED;
+		dev_warn(dev, "decoder%d.%d: %pr does not map any partition\n",
+			 port->id, cxled->cxld.id, res);
+		cxled->mode = CXL_DECODER_NONE;
 	}
 
 	port->hdm_end++;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index d77899650798..e4885acac853 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2725,18 +2725,6 @@  static int poison_by_decoder(struct device *dev, void *arg)
 	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
 		return rc;
 
-	/*
-	 * Regions are only created with single mode decoders: pmem or ram.
-	 * Linux does not support mixed mode decoders. This means that
-	 * reading poison per endpoint decoder adheres to the requirement
-	 * that poison reads of pmem and ram must be separated.
-	 * CXL 3.0 Spec 8.2.9.8.4.1
-	 */
-	if (cxled->mode == CXL_DECODER_MIXED) {
-		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
-		return rc;
-	}
-
 	cxlmd = cxled_to_memdev(cxled);
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f6015f24ad38..0fb8d70fa3e5 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -379,7 +379,6 @@  enum cxl_decoder_mode {
 	CXL_DECODER_NONE,
 	CXL_DECODER_RAM,
 	CXL_DECODER_PMEM,
-	CXL_DECODER_MIXED,
 	CXL_DECODER_DEAD,
 };
 
@@ -389,10 +388,9 @@  static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
 		[CXL_DECODER_NONE] = "none",
 		[CXL_DECODER_RAM] = "ram",
 		[CXL_DECODER_PMEM] = "pmem",
-		[CXL_DECODER_MIXED] = "mixed",
 	};
 
-	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED)
+	if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_PMEM)
 		return names[mode];
 	return "mixed";
 }