
[v2,14/14] cxl/port: Enable HDM Capability after validating DVSEC Ranges

Message ID 165283418817.1033989.11273676872054815459.stgit@dwillia2-xfh
State New, archived

Commit Message

Dan Williams May 18, 2022, 12:38 a.m. UTC
CXL memory expanders that support the CXL 2.0 memory device class code
include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
on a "mem_enable" bit being set in configuration space before either
mechanism activates. When the HDM Decoder Capability is enabled the CXL
DVSEC Range settings are ignored.

Previously, the cxl_mem driver was relying on platform-firmware to set
"mem_enable". That is an invalid assumption as there is no requirement
that platform-firmware sets the bit before the driver sees a device,
especially in hot-plug scenarios. Additionally, ACPI-platforms that
support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
Table). That table outlines the platform permissible address ranges for
CXL operation. So, there is a need for the driver to set "mem_enable",
and there is information available to determine the validity of the CXL
DVSEC Ranges. While DVSEC Ranges are expected to be at least
256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
Range 1 Base Low) allows for the possibility of devices smaller than
256M. So the range [0, 256M) is considered active even if Memory_size
is 0.

Arrange for the driver to optionally enable the HDM Decoder Capability
if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
configuration was invalid. Be careful to only disable memory decode if
the kernel was the one to enable it. In other words, if CXL is backing
all of kernel memory at boot the device needs to maintain "mem_enable"
and "HDM Decoder enable" all the way up to handoff back to platform
firmware (e.g. ACPI S5 state entry may require CXL memory to stay
active).

Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
[dan: fix early termination of range-allowed loop]
Cc: Ariel Sibley <ariel.sibley@microchip.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v1:
- Fix range-allowed loop termination (Smatch / Dan)
- Clean up changelog wording around why [0, 256M) is considered always
  active (Ariel)

 drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 151 insertions(+), 12 deletions(-)

Comments

Ariel.Sibley@microchip.com May 18, 2022, 2:07 a.m. UTC | #1
> Previously, the cxl_mem driver was relying on platform-firmware to set
> "mem_enable". That is an invalid assumption as there is no requirement
> that platform-firmware sets the bit before the driver sees a device,
> especially in hot-plug scenarios. Additionally, ACPI-platforms that
> support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> Table). That table outlines the platform permissible address ranges for
> CXL operation. So, there is a need for the driver to set "mem_enable",
> and there is information available to determine the validity of the CXL
> DVSEC Ranges. While DVSEC Ranges are expected to be at least
> 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> Range 1 Base Low) allows for the possibility of devices smaller than
> 256M. So the range [0, 256M) is considered active even if Memory_size
> is 0.

Regarding "So the range [0, 256M) is considered active even if
Memory_size is 0."

Since Memory_Base is included in address A, this portion of the equation
from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
to be directed to local HDM memory, Memory_Size[63:28] must be > 0:

(A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]

This means if a device advertises Memory_Size = 0, no host access will
result in access to the HDM memory.
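
As a standalone illustration (hypothetical example values, not the driver's
code), that comparison, together with the implied lower bound, can be
modeled as follows; with Memory_Size = 0 no address passes it, while a
device advertising Memory_Size = 256 MiB claims a full 256 MiB-aligned
window regardless of how much capacity it actually implements:

/*
 * Minimal sketch of the CXL 2.0 Section 8.1.3.8.4 range comparison.
 * Only bits [63:28] of base/size/address participate (256 MiB units).
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool hdm_claims(uint64_t a, uint64_t memory_base, uint64_t memory_size)
{
	return (a >> 28) >= (memory_base >> 28) &&
	       (a >> 28) <  (memory_base >> 28) + (memory_size >> 28);
}

int main(void)
{
	uint64_t base = 0x1000000000ULL;	/* hypothetical Memory_Base */
	uint64_t addr = base + 0x100000;	/* 1 MiB into the range */

	printf("Memory_Size = 0:       %d\n", hdm_claims(addr, base, 0));
	printf("Memory_Size = 256 MiB: %d\n", hdm_claims(addr, base, 256ULL << 20));
	return 0;
}

Compiled and run, this prints 0 then 1, matching the observation that with
Memory_Size = 0 no host access reaches the HDM memory.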

I would also note this text from CXL 2.0 Section 8.1.3.8:
"A CXL.mem capable device is permitted to report zero memory size."

For a device with a non-zero capacity less than 256M to satisfy the
equation, it would need to advertise a Memory_Size of at least 256M.

Regards,
Ariel
Dan Williams May 18, 2022, 2:44 a.m. UTC | #2
On Tue, May 17, 2022 at 7:08 PM <Ariel.Sibley@microchip.com> wrote:
>
> > Previously, the cxl_mem driver was relying on platform-firmware to set
> > "mem_enable". That is an invalid assumption as there is no requirement
> > that platform-firmware sets the bit before the driver sees a device,
> > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > Table). That table outlines the platform permissible address ranges for
> > CXL operation. So, there is a need for the driver to set "mem_enable",
> > and there is information available to determine the validity of the CXL
> > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > Range 1 Base Low) allows for the possibility of devices smaller than
> > 256M. So the range [0, 256M) is considered active even if Memory_size
> > is 0.
>
> Regarding "So the range [0, 256M) is considered active even if
> Memory_size is 0."
>
> Since Memory_Base is included in address A, this portion of the equation
> from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
> to be directed to local HDM memory, Memory_Size[63:28] must be > 0:
>
> (A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]
>
> This means if a device advertises Memory_Size = 0, no host access will
> result in access to the HDM memory.
>
> I would also note this text from CXL 2.0 Section 8.1.3.8:
> "A CXL.mem capable device is permitted to report zero memory size."
>
> For a device with a non-zero capacity less than 256M to satisfy the
> equation, it would need to advertise a Memory_Size of at least 256M.

I think we need an errata to delete the "(e.g. a device with less than
256 MB of memory)" mention. I otherwise do not see how such a device
can exist if Memory_size must be >= 256M.
Jonathan Cameron May 18, 2022, 3:33 p.m. UTC | #3
On Tue, 17 May 2022 19:44:28 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, May 17, 2022 at 7:08 PM <Ariel.Sibley@microchip.com> wrote:
> >  
> > > Previously, the cxl_mem driver was relying on platform-firmware to set
> > > "mem_enable". That is an invalid assumption as there is no requirement
> > > that platform-firmware sets the bit before the driver sees a device,
> > > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > > Table). That table outlines the platform permissible address ranges for
> > > CXL operation. So, there is a need for the driver to set "mem_enable",
> > > and there is information available to determine the validity of the CXL
> > > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > > Range 1 Base Low) allows for the possibility of devices smaller than
> > > 256M. So the range [0, 256M) is considered active even if Memory_size
> > > is 0.  
> >
> > Regarding "So the range [0, 256M) is considered active even if
> > Memory_size is 0."
> >
> > Since Memory_Base is included in address A, this portion of the equation
> > from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
> > to be directed to local HDM memory, Memory_Size[63:28] must be > 0:
> >
> > (A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]
> >
> > This means if a device advertises Memory_Size = 0, no host access will
> > result in access to the HDM memory.
> >
> > I would also note this text from CXL 2.0 Section 8.1.3.8:
> > "A CXL.mem capable device is permitted to report zero memory size."
> >
> > For a device with a non-zero capacity less than 256M to satisfy the
> > equation, it would need to advertise a Memory_Size of at least 256M.  
> 
> I think we need an errata to delete the "(e.g. a device with less than
> 256 MB of memory)" mention. I otherwise do not see how such a device
> can exist if Memory_size must be >= 256M.

My reading of that is it is permissible to implement a device that
has, say, 16MiB of actual memory, report it as 256MiB, and follow this
behavior for the 16-256 MiB range.  It also covers a 300MiB device
where the size of the HDM decoder is set to 512MiB etc.

As such I don't think it's wrong, but rather just not relevant to us
here (0 is a valid setting for Memory_Size).
Would need impdef means to establish the actual size of the
memory to do anything useful with that corner case.

Jonathan
Jonathan Cameron May 18, 2022, 5:17 p.m. UTC | #4
On Tue, 17 May 2022 17:38:10 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> CXL memory expanders that support the CXL 2.0 memory device class code
> include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
> Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
> on a "mem_enable" bit being set in configuration space before either
> mechanism activates. When the HDM Decoder Capability is enabled the CXL
> DVSEC Range settings are ignored.
> 
> Previously, the cxl_mem driver was relying on platform-firmware to set
> "mem_enable". That is an invalid assumption as there is no requirement
> that platform-firmware sets the bit before the driver sees a device,
> especially in hot-plug scenarios. Additionally, ACPI-platforms that
> support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> Table). That table outlines the platform permissible address ranges for
> CXL operation. So, there is a need for the driver to set "mem_enable",
> and there is information available to determine the validity of the CXL
> DVSEC Ranges. While DVSEC Ranges are expected to be at least
> 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> Range 1 Base Low) allows for the possibility of devices smaller than
> 256M. So the range [0, 256M) is considered active even if Memory_size
> is 0.
> 
> Arrange for the driver to optionally enable the HDM Decoder Capability
> if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
> configuration was invalid. Be careful to only disable memory decode if
> the kernel was the one to enable it. In other words, if CXL is backing
> all of kernel memory at boot the device needs to maintain "mem_enable"
> and "HDM Decoder enable" all the way up to handoff back to platform
> firmware (e.g. ACPI S5 state entry may require CXL memory to stay
> active).
> 
> Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
> Cc: Dan Carpenter <dan.carpenter@oracle.com>
> [dan: fix early termination of range-allowed loop]
> Cc: Ariel Sibley <ariel.sibley@microchip.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v1:
> - Fix range-allowed loop termination (Smatch / Dan)
That had me confused before I saw v2 :)

I'm not keen on the "%sallowed" trick used to form "disallowed" in the
debug message...

Other than ongoing discussion around the range being always allowed
(or not) this looks good to me.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> - Clean up changelog wording around why [0, 256M) is considered always
>   active (Ariel)
> 
>  drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 151 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a697c48fc830..528430da0e77 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -175,30 +175,164 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
>  	return -ETIMEDOUT;
>  }
>  
> +static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
> +{
> +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	int d = cxlds->cxl_dvsec;
> +	u16 ctrl;
> +	int rc;
> +
> +	rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
> +	if (rc < 0)
> +		return rc;
> +
> +	if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
> +		return 1;
> +	ctrl &= ~CXL_DVSEC_MEM_ENABLE;
> +	ctrl |= val;
> +
> +	rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
> +	if (rc < 0)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static void clear_mem_enable(void *cxlds)
> +{
> +	cxl_set_mem_enable(cxlds, 0);
> +}
> +
> +static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
> +{
> +	int rc;
> +
> +	rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
> +	if (rc < 0)
> +		return rc;
> +	if (rc > 0)
> +		return 0;
> +	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
> +}
> +
> +static bool range_contains(struct range *r1, struct range *r2)
> +{
> +	return r1->start <= r2->start && r1->end >= r2->end;
> +}
> +
> +/* require dvsec ranges to be covered by a locked platform window */
> +static int dvsec_range_allowed(struct device *dev, void *arg)
> +{
> +	struct range *dev_range = arg;
> +	struct cxl_decoder *cxld;
> +	struct range root_range;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxld = to_cxl_decoder(dev);
> +
> +	if (!(cxld->flags & CXL_DECODER_F_LOCK))
> +		return 0;
> +	if (!(cxld->flags & CXL_DECODER_F_RAM))
> +		return 0;
> +
> +	root_range = (struct range) {
> +		.start = cxld->platform_res.start,
> +		.end = cxld->platform_res.end,
> +	};
> +
> +	return range_contains(&root_range, dev_range);
> +}
> +
> +static void disable_hdm(void *_cxlhdm)
> +{
> +	u32 global_ctrl;
> +	struct cxl_hdm *cxlhdm = _cxlhdm;
> +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> +
> +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +	writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
> +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +}
> +
> +static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
> +{
> +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> +	u32 global_ctrl;
> +
> +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
> +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +
> +	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
> +}
> +
>  static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
>  				  struct cxl_hdm *cxlhdm,
>  				  struct cxl_endpoint_dvsec_info *info)
>  {
>  	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> -	bool global_enable;
> +	struct cxl_port *port = cxlhdm->port;
> +	struct device *dev = cxlds->dev;
> +	struct cxl_port *root;
> +	int i, rc, allowed;
>  	u32 global_ctrl;
>  
>  	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> -	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
>  
> -	if (!global_enable && info->mem_enabled)
> +	/*
> +	 * If the HDM Decoder Capability is already enabled then assume
> +	 * that some other agent like platform firmware set it up.
> +	 */
> +	if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
> +		rc = devm_cxl_enable_mem(&port->dev, cxlds);
> +		if (rc)
> +			return false;
> +		return true;
> +	}
> +
> +	root = to_cxl_port(port->dev.parent);
> +	while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> +		root = to_cxl_port(root->dev.parent);
> +	if (!is_cxl_root(root)) {
> +		dev_err(dev, "Failed to acquire root port for HDM enable\n");
>  		return false;
> +	}
> +
> +	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
> +		struct device *cxld_dev;
> +
> +		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> +					     dvsec_range_allowed);
> +		dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
> +			cxld_dev ? "" : "dis");

Ouch.  Not worth doing that to save a few chars. Makes the message
harder to grep for.

> +		if (!cxld_dev)
> +			continue;
> +		put_device(cxld_dev);
> +		allowed++;
> +	}
> +	put_device(&root->dev);
> +
> +	if (!allowed) {
> +		cxl_set_mem_enable(cxlds, 0);
> +		info->mem_enabled = 0;
> +	}
Dan Williams May 18, 2022, 6 p.m. UTC | #5
On Wed, May 18, 2022 at 10:55 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Tue, 17 May 2022 17:38:10 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > CXL memory expanders that support the CXL 2.0 memory device class code
> > include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
> > Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
> > on a "mem_enable" bit being set in configuration space before either
> > mechanism activates. When the HDM Decoder Capability is enabled the CXL
> > DVSEC Range settings are ignored.
> >
> > Previously, the cxl_mem driver was relying on platform-firmware to set
> > "mem_enable". That is an invalid assumption as there is no requirement
> > that platform-firmware sets the bit before the driver sees a device,
> > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > Table). That table outlines the platform permissible address ranges for
> > CXL operation. So, there is a need for the driver to set "mem_enable",
> > and there is information available to determine the validity of the CXL
> > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > Range 1 Base Low) allows for the possibility of devices smaller than
> > 256M. So the range [0, 256M) is considered active even if Memory_size
> > is 0.
> >
> > Arrange for the driver to optionally enable the HDM Decoder Capability
> > if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
> > configuration was invalid. Be careful to only disable memory decode if
> > the kernel was the one to enable it. In other words, if CXL is backing
> > all of kernel memory at boot the device needs to maintain "mem_enable"
> > and "HDM Decoder enable" all the way up to handoff back to platform
> > firmware (e.g. ACPI S5 state entry may require CXL memory to stay
> > active).
> >
> > Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
> > Cc: Dan Carpenter <dan.carpenter@oracle.com>
> > [dan: fix early termination of range-allowed loop]
> > Cc: Ariel Sibley <ariel.sibley@microchip.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> > Changes since v1:
> > - Fix range-allowed loop termination (Smatch / Dan)
> That had me confused before I saw v2 :)
>
> I'm not keen on the trick to do disallowed in the debug message...
>
> Other than ongoing discussion around the range being always allowed
> (or not) this looks good to me.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> > - Clean up changelog wording around why [0, 256M) is considered always
> >   active (Ariel)
> >
> >  drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 151 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index a697c48fc830..528430da0e77 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -175,30 +175,164 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
> >       return -ETIMEDOUT;
> >  }
> >
> > +static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
> > +{
> > +     struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > +     int d = cxlds->cxl_dvsec;
> > +     u16 ctrl;
> > +     int rc;
> > +
> > +     rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
> > +     if (rc < 0)
> > +             return rc;
> > +
> > +     if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
> > +             return 1;
> > +     ctrl &= ~CXL_DVSEC_MEM_ENABLE;
> > +     ctrl |= val;
> > +
> > +     rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
> > +     if (rc < 0)
> > +             return rc;
> > +
> > +     return 0;
> > +}
> > +
> > +static void clear_mem_enable(void *cxlds)
> > +{
> > +     cxl_set_mem_enable(cxlds, 0);
> > +}
> > +
> > +static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
> > +{
> > +     int rc;
> > +
> > +     rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
> > +     if (rc < 0)
> > +             return rc;
> > +     if (rc > 0)
> > +             return 0;
> > +     return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
> > +}
> > +
> > +static bool range_contains(struct range *r1, struct range *r2)
> > +{
> > +     return r1->start <= r2->start && r1->end >= r2->end;
> > +}
> > +
> > +/* require dvsec ranges to be covered by a locked platform window */
> > +static int dvsec_range_allowed(struct device *dev, void *arg)
> > +{
> > +     struct range *dev_range = arg;
> > +     struct cxl_decoder *cxld;
> > +     struct range root_range;
> > +
> > +     if (!is_root_decoder(dev))
> > +             return 0;
> > +
> > +     cxld = to_cxl_decoder(dev);
> > +
> > +     if (!(cxld->flags & CXL_DECODER_F_LOCK))
> > +             return 0;
> > +     if (!(cxld->flags & CXL_DECODER_F_RAM))
> > +             return 0;
> > +
> > +     root_range = (struct range) {
> > +             .start = cxld->platform_res.start,
> > +             .end = cxld->platform_res.end,
> > +     };
> > +
> > +     return range_contains(&root_range, dev_range);
> > +}
> > +
> > +static void disable_hdm(void *_cxlhdm)
> > +{
> > +     u32 global_ctrl;
> > +     struct cxl_hdm *cxlhdm = _cxlhdm;
> > +     void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > +
> > +     global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +     writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
> > +            hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +}
> > +
> > +static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
> > +{
> > +     void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > +     u32 global_ctrl;
> > +
> > +     global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +     writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
> > +            hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +
> > +     return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
> > +}
> > +
> >  static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
> >                                 struct cxl_hdm *cxlhdm,
> >                                 struct cxl_endpoint_dvsec_info *info)
> >  {
> >       void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > -     bool global_enable;
> > +     struct cxl_port *port = cxlhdm->port;
> > +     struct device *dev = cxlds->dev;
> > +     struct cxl_port *root;
> > +     int i, rc, allowed;
> >       u32 global_ctrl;
> >
> >       global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > -     global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> >
> > -     if (!global_enable && info->mem_enabled)
> > +     /*
> > +      * If the HDM Decoder Capability is already enabled then assume
> > +      * that some other agent like platform firmware set it up.
> > +      */
> > +     if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
> > +             rc = devm_cxl_enable_mem(&port->dev, cxlds);
> > +             if (rc)
> > +                     return false;
> > +             return true;
> > +     }
> > +
> > +     root = to_cxl_port(port->dev.parent);
> > +     while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> > +             root = to_cxl_port(root->dev.parent);
> > +     if (!is_cxl_root(root)) {
> > +             dev_err(dev, "Failed to acquire root port for HDM enable\n");
> >               return false;
> > +     }
> > +
> > +     for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
> > +             struct device *cxld_dev;
> > +
> > +             cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> > +                                          dvsec_range_allowed);
> > +             dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
> > +                     cxld_dev ? "" : "dis");
>
> Ouch.  Not worth doing that to save a few chars. Makes the message
> harder to grep for.

Ok, will drop, along with the "always enabled" change.
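
For reference, a grep-friendly form of that message might spell out both
alternatives, sketched here as an assumption rather than the final code:

	dev_dbg(dev, "DVSEC Range%d %s by platform\n", i,
		cxld_dev ? "allowed" : "not allowed");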

Patch

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a697c48fc830..528430da0e77 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -175,30 +175,164 @@  static int wait_for_valid(struct cxl_dev_state *cxlds)
 	return -ETIMEDOUT;
 }
 
+static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	int d = cxlds->cxl_dvsec;
+	u16 ctrl;
+	int rc;
+
+	rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
+	if (rc < 0)
+		return rc;
+
+	if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
+		return 1;
+	ctrl &= ~CXL_DVSEC_MEM_ENABLE;
+	ctrl |= val;
+
+	rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
+
+static void clear_mem_enable(void *cxlds)
+{
+	cxl_set_mem_enable(cxlds, 0);
+}
+
+static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
+{
+	int rc;
+
+	rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
+	if (rc < 0)
+		return rc;
+	if (rc > 0)
+		return 0;
+	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
+}
+
+static bool range_contains(struct range *r1, struct range *r2)
+{
+	return r1->start <= r2->start && r1->end >= r2->end;
+}
+
+/* require dvsec ranges to be covered by a locked platform window */
+static int dvsec_range_allowed(struct device *dev, void *arg)
+{
+	struct range *dev_range = arg;
+	struct cxl_decoder *cxld;
+	struct range root_range;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxld = to_cxl_decoder(dev);
+
+	if (!(cxld->flags & CXL_DECODER_F_LOCK))
+		return 0;
+	if (!(cxld->flags & CXL_DECODER_F_RAM))
+		return 0;
+
+	root_range = (struct range) {
+		.start = cxld->platform_res.start,
+		.end = cxld->platform_res.end,
+	};
+
+	return range_contains(&root_range, dev_range);
+}
+
+static void disable_hdm(void *_cxlhdm)
+{
+	u32 global_ctrl;
+	struct cxl_hdm *cxlhdm = _cxlhdm;
+	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+
+	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+	writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
+	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+}
+
+static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
+{
+	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+	u32 global_ctrl;
+
+	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
+	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+
+	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
+}
+
 static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
 				  struct cxl_hdm *cxlhdm,
 				  struct cxl_endpoint_dvsec_info *info)
 {
 	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
-	bool global_enable;
+	struct cxl_port *port = cxlhdm->port;
+	struct device *dev = cxlds->dev;
+	struct cxl_port *root;
+	int i, rc, allowed;
 	u32 global_ctrl;
 
 	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
-	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
 
-	if (!global_enable && info->mem_enabled)
+	/*
+	 * If the HDM Decoder Capability is already enabled then assume
+	 * that some other agent like platform firmware set it up.
+	 */
+	if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
+		rc = devm_cxl_enable_mem(&port->dev, cxlds);
+		if (rc)
+			return false;
+		return true;
+	}
+
+	root = to_cxl_port(port->dev.parent);
+	while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
+		root = to_cxl_port(root->dev.parent);
+	if (!is_cxl_root(root)) {
+		dev_err(dev, "Failed to acquire root port for HDM enable\n");
 		return false;
+	}
+
+	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
+		struct device *cxld_dev;
+
+		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
+					     dvsec_range_allowed);
+		dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
+			cxld_dev ? "" : "dis");
+		if (!cxld_dev)
+			continue;
+		put_device(cxld_dev);
+		allowed++;
+	}
+	put_device(&root->dev);
+
+	if (!allowed) {
+		cxl_set_mem_enable(cxlds, 0);
+		info->mem_enabled = 0;
+	}
 
 	/*
-	 * Permanently (for this boot at least) opt the device into HDM
-	 * operation. Individual HDM decoders still need to be enabled after
-	 * this point.
+	 * At least one DVSEC range is enabled and allowed, skip HDM
+	 * Decoder Capability Enable
 	 */
-	if (!global_enable) {
-		dev_dbg(cxlds->dev, "Enabling HDM decode\n");
-		writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
-		       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
-	}
+	if (info->mem_enabled)
+		return false;
+
+	rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+	if (rc)
+		return false;
+
+	rc = devm_cxl_enable_mem(&port->dev, cxlds);
+	if (rc)
+		return false;
 
 	return true;
 }
@@ -253,9 +387,14 @@  int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm)
 		return rc;
 	}
 
+	/*
+	 * The current DVSEC values are moot if the memory capability is
+	 * disabled, and they will remain moot after the HDM Decoder
+	 * capability is enabled.
+	 */
 	info.mem_enabled = FIELD_GET(CXL_DVSEC_MEM_ENABLE, ctrl);
 	if (!info.mem_enabled)
-		return 0;
+		return __cxl_hdm_decode_init(cxlds, cxlhdm, &info);
 
 	for (i = 0; i < hdm_count; i++) {
 		u64 base, size;