Message ID | 165283418817.1033989.11273676872054815459.stgit@dwillia2-xfh |
---|---|
State | New, archived |
> Previously, the cxl_mem driver was relying on platform-firmware to set
> "mem_enable". That is an invalid assumption as there is no requirement
> that platform-firmware sets the bit before the driver sees a device,
> especially in hot-plug scenarios. Additionally, ACPI-platforms that
> support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> Table). That table outlines the platform permissible address ranges for
> CXL operation. So, there is a need for the driver to set "mem_enable",
> and there is information available to determine the validity of the CXL
> DVSEC Ranges. While DVSEC Ranges are expected to be at least
> 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> Range 1 Base Low) allows for the possibility of devices smaller than
> 256M. So the range [0, 256M) is considered active even if Memory_size
> is 0.

Regarding "So the range [0, 256M) is considered active even if
Memory_size is 0."

Since Memory_Base is included in address A, this portion of the equation
from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
to be directed to local HDM memory, Memory_Size[63:28] must be > 0:

    (A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]

This means if a device advertises Memory_Size = 0, no host access will
result in access to the HDM memory.

I would also note this text from CXL 2.0 Section 8.1.3.8:
"A CXL.mem capable device is permitted to report zero memory size."

For a device with a non-zero capacity less than 256M to satisfy the
equation, it would need to advertise a Memory_Size of at least 256M.

Regards,
Ariel
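To make the arithmetic in the reply above concrete, here is a minimal
user-space sketch of the range check in C. It assumes the full decode
condition also carries the lower bound Memory_Base[63:28] <= (A >> 28),
which the excerpt quoted in the thread does not show. The function
dvsec_range_hit() and its parameter names are hypothetical and are not
taken from the driver; overflow of base + size is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* DVSEC range registers only carry bits [63:28], i.e. 256M granularity. */
#define DVSEC_RANGE_SHIFT 28

/*
 * Hypothetical helper: returns true when host address 'addr' would be
 * directed to the range described by memory_base/memory_size.
 * Assumption: the lower bound check is part of the spec equation.
 */
static bool dvsec_range_hit(uint64_t addr, uint64_t memory_base,
			    uint64_t memory_size)
{
	uint64_t a = addr >> DVSEC_RANGE_SHIFT;
	uint64_t base = memory_base >> DVSEC_RANGE_SHIFT;
	uint64_t size = memory_size >> DVSEC_RANGE_SHIFT;

	return a >= base && a < base + size;
}
```

With memory_size = 0 the function returns false for every address, and
any size below 256M also truncates to zero in bits [63:28], so such a
value decodes nothing either.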
On Tue, May 17, 2022 at 7:08 PM <Ariel.Sibley@microchip.com> wrote:
>
> > Previously, the cxl_mem driver was relying on platform-firmware to set
> > "mem_enable". That is an invalid assumption as there is no requirement
> > that platform-firmware sets the bit before the driver sees a device,
> > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > Table). That table outlines the platform permissible address ranges for
> > CXL operation. So, there is a need for the driver to set "mem_enable",
> > and there is information available to determine the validity of the CXL
> > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > Range 1 Base Low) allows for the possibility of devices smaller than
> > 256M. So the range [0, 256M) is considered active even if Memory_size
> > is 0.
>
> Regarding "So the range [0, 256M) is considered active even if
> Memory_size is 0."
>
> Since Memory_Base is included in address A, this portion of the equation
> from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
> to be directed to local HDM memory, Memory_Size[63:28] must be > 0:
>
>     (A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]
>
> This means if a device advertises Memory_Size = 0, no host access will
> result in access to the HDM memory.
>
> I would also note this text from CXL 2.0 Section 8.1.3.8:
> "A CXL.mem capable device is permitted to report zero memory size."
>
> For a device with a non-zero capacity less than 256M to satisfy the
> equation, it would need to advertise a Memory_Size of at least 256M.

I think we need an errata to delete the "(e.g. a device with less than
256 MB of memory)" mention. I otherwise do not see how such a device can
exist if Memory_size must be >= 256M.
On Tue, 17 May 2022 19:44:28 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, May 17, 2022 at 7:08 PM <Ariel.Sibley@microchip.com> wrote:
> >
> > > Previously, the cxl_mem driver was relying on platform-firmware to set
> > > "mem_enable". That is an invalid assumption as there is no requirement
> > > that platform-firmware sets the bit before the driver sees a device,
> > > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > > Table). That table outlines the platform permissible address ranges for
> > > CXL operation. So, there is a need for the driver to set "mem_enable",
> > > and there is information available to determine the validity of the CXL
> > > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > > Range 1 Base Low) allows for the possibility of devices smaller than
> > > 256M. So the range [0, 256M) is considered active even if Memory_size
> > > is 0.
> >
> > Regarding "So the range [0, 256M) is considered active even if
> > Memory_size is 0."
> >
> > Since Memory_Base is included in address A, this portion of the equation
> > from CXL 2.0 Section 8.1.3.8.4 mandates that for host access to address A
> > to be directed to local HDM memory, Memory_Size[63:28] must be > 0:
> >
> >     (A >> 28) < Memory_Base[63:28] + Memory_Size[63:28]
> >
> > This means if a device advertises Memory_Size = 0, no host access will
> > result in access to the HDM memory.
> >
> > I would also note this text from CXL 2.0 Section 8.1.3.8:
> > "A CXL.mem capable device is permitted to report zero memory size."
> >
> > For a device with a non-zero capacity less than 256M to satisfy the
> > equation, it would need to advertise a Memory_Size of at least 256M.
>
> I think we need an errata to delete the "(e.g. a device with less than
> 256 MB of memory)" mention. I otherwise do not see how such a device
> can exist if Memory_size must be >= 256M.

My reading of that is it is permissible to implement a device that has
say 16MiB of actual memory, report it as 256MiB and follow this behavior
for the 16-256 MiB range. It also covers a 300MiB device where the size
of the HDM decoder is set to 512MiB etc.

As such I don't think it's wrong, but rather just not relevant to us
here (0 is a valid setting for Memory_Size). Would need impdef means to
establish the actual size of the memory to do anything useful with that
corner case.

Jonathan
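Both examples in the reply above (16MiB reported as 256MiB, 300MiB
covered by 512MiB) are consistent with rounding the real capacity up to
the 256MiB granule. A small sketch in C, assuming the device picks the
smallest such multiple; nothing in the thread says it must, and
dvsec_reported_size() is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

/* 256MiB: the granularity implied by Memory_Size[63:28]. */
#define DVSEC_RANGE_GRANULE (1ULL << 28)

/*
 * Hypothetical helper: smallest multiple of 256MiB that covers
 * 'actual_bytes'. A capacity of zero stays zero.
 */
static uint64_t dvsec_reported_size(uint64_t actual_bytes)
{
	return (actual_bytes + DVSEC_RANGE_GRANULE - 1) &
	       ~(DVSEC_RANGE_GRANULE - 1);
}
```

The gap between the actual capacity and the reported size is exactly the
region whose behavior the spec text under discussion describes; the
reported size alone does not reveal where that gap starts.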
On Tue, 17 May 2022 17:38:10 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> CXL memory expanders that support the CXL 2.0 memory device class code
> include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
> Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
> on a "mem_enable" bit being set in configuration space before either
> mechanism activates. When the HDM Decoder Capability is enabled the CXL
> DVSEC Range settings are ignored.
>
> Previously, the cxl_mem driver was relying on platform-firmware to set
> "mem_enable". That is an invalid assumption as there is no requirement
> that platform-firmware sets the bit before the driver sees a device,
> especially in hot-plug scenarios. Additionally, ACPI-platforms that
> support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> Table). That table outlines the platform permissible address ranges for
> CXL operation. So, there is a need for the driver to set "mem_enable",
> and there is information available to determine the validity of the CXL
> DVSEC Ranges. While DVSEC Ranges are expected to be at least
> 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> Range 1 Base Low) allows for the possibility of devices smaller than
> 256M. So the range [0, 256M) is considered active even if Memory_size
> is 0.
>
> Arrange for the driver to optionally enable the HDM Decoder Capability
> if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
> configuration was invalid. Be careful to only disable memory decode if
> the kernel was the one to enable it. In other words, if CXL is backing
> all of kernel memory at boot the device needs to maintain "mem_enable"
> and "HDM Decoder enable" all the way up to handoff back to platform
> firmware (e.g. ACPI S5 state entry may require CXL memory to stay
> active).
>
> Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
> Cc: Dan Carpenter <dan.carpenter@oracle.com>
> [dan: fix early termination of range-allowed loop]
> Cc: Ariel Sibley <ariel.sibley@microchip.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v1:
> - Fix range-allowed loop termination (Smatch / Dan)
That had me confused before I saw v2 :)

I'm not keen on the trick to do disallowed in the debug message...

Other than ongoing discussion around the range being always allowed
(or not) this looks good to me.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> - Clean up changelog wording around why [0, 256M) is considered always
>   active (Ariel)
>
>  drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 151 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a697c48fc830..528430da0e77 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -175,30 +175,164 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
>  	return -ETIMEDOUT;
>  }
>
> +static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
> +{
> +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> +	int d = cxlds->cxl_dvsec;
> +	u16 ctrl;
> +	int rc;
> +
> +	rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
> +	if (rc < 0)
> +		return rc;
> +
> +	if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
> +		return 1;
> +	ctrl &= ~CXL_DVSEC_MEM_ENABLE;
> +	ctrl |= val;
> +
> +	rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
> +	if (rc < 0)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static void clear_mem_enable(void *cxlds)
> +{
> +	cxl_set_mem_enable(cxlds, 0);
> +}
> +
> +static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
> +{
> +	int rc;
> +
> +	rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
> +	if (rc < 0)
> +		return rc;
> +	if (rc > 0)
> +		return 0;
> +	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
> +}
> +
> +static bool range_contains(struct range *r1, struct range *r2)
> +{
> +	return r1->start <= r2->start && r1->end >= r2->end;
> +}
> +
> +/* require dvsec ranges to be covered by a locked platform window */
> +static int dvsec_range_allowed(struct device *dev, void *arg)
> +{
> +	struct range *dev_range = arg;
> +	struct cxl_decoder *cxld;
> +	struct range root_range;
> +
> +	if (!is_root_decoder(dev))
> +		return 0;
> +
> +	cxld = to_cxl_decoder(dev);
> +
> +	if (!(cxld->flags & CXL_DECODER_F_LOCK))
> +		return 0;
> +	if (!(cxld->flags & CXL_DECODER_F_RAM))
> +		return 0;
> +
> +	root_range = (struct range) {
> +		.start = cxld->platform_res.start,
> +		.end = cxld->platform_res.end,
> +	};
> +
> +	return range_contains(&root_range, dev_range);
> +}
> +
> +static void disable_hdm(void *_cxlhdm)
> +{
> +	u32 global_ctrl;
> +	struct cxl_hdm *cxlhdm = _cxlhdm;
> +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> +
> +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +	writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
> +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +}
> +
> +static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
> +{
> +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> +	u32 global_ctrl;
> +
> +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
> +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> +
> +	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
> +}
> +
>  static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
>  				  struct cxl_hdm *cxlhdm,
>  				  struct cxl_endpoint_dvsec_info *info)
>  {
>  	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> -	bool global_enable;
> +	struct cxl_port *port = cxlhdm->port;
> +	struct device *dev = cxlds->dev;
> +	struct cxl_port *root;
> +	int i, rc, allowed;
>  	u32 global_ctrl;
>
>  	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> -	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
>
> -	if (!global_enable && info->mem_enabled)
> +	/*
> +	 * If the HDM Decoder Capability is already enabled then assume
> +	 * that some other agent like platform firmware set it up.
> +	 */
> +	if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
> +		rc = devm_cxl_enable_mem(&port->dev, cxlds);
> +		if (rc)
> +			return false;
> +		return true;
> +	}
> +
> +	root = to_cxl_port(port->dev.parent);
> +	while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> +		root = to_cxl_port(root->dev.parent);
> +	if (!is_cxl_root(root)) {
> +		dev_err(dev, "Failed to acquire root port for HDM enable\n");
>  		return false;
> +	}
> +
> +	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
> +		struct device *cxld_dev;
> +
> +		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> +					     dvsec_range_allowed);
> +		dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
> +			cxld_dev ? "" : "dis");

Ouch. Not worth doing that to save a few chars. Makes the message
harder to grep for.

> +		if (!cxld_dev)
> +			continue;
> +		put_device(cxld_dev);
> +		allowed++;
> +	}
> +	put_device(&root->dev);
> +
> +	if (!allowed) {
> +		cxl_set_mem_enable(cxlds, 0);
> +		info->mem_enabled = 0;
> +	}
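One way to address the grep concern raised in the review is to spell out
both messages in full instead of splicing a "dis" prefix into a shared
format string. The sketch below is plain user-space C using snprintf()
rather than dev_dbg(), and format_range_status() is a hypothetical
helper; it illustrates the idea only and is not what the thread settled
on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: each outcome has its own literal format string,
 * so searching the source for "disallowed by platform" finds it.
 */
static int format_range_status(char *buf, size_t len, int range, bool allowed)
{
	if (allowed)
		return snprintf(buf, len,
				"DVSEC Range%d allowed by platform\n", range);
	return snprintf(buf, len,
			"DVSEC Range%d disallowed by platform\n", range);
}
```

The cost is one duplicated string; the benefit is that either message
can be located from a log line with a literal search.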
On Wed, May 18, 2022 at 10:55 AM Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> On Tue, 17 May 2022 17:38:10 -0700
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > CXL memory expanders that support the CXL 2.0 memory device class code
> > include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
> > Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
> > on a "mem_enable" bit being set in configuration space before either
> > mechanism activates. When the HDM Decoder Capability is enabled the CXL
> > DVSEC Range settings are ignored.
> >
> > Previously, the cxl_mem driver was relying on platform-firmware to set
> > "mem_enable". That is an invalid assumption as there is no requirement
> > that platform-firmware sets the bit before the driver sees a device,
> > especially in hot-plug scenarios. Additionally, ACPI-platforms that
> > support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
> > Table). That table outlines the platform permissible address ranges for
> > CXL operation. So, there is a need for the driver to set "mem_enable",
> > and there is information available to determine the validity of the CXL
> > DVSEC Ranges. While DVSEC Ranges are expected to be at least
> > 256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
> > Range 1 Base Low) allows for the possibility of devices smaller than
> > 256M. So the range [0, 256M) is considered active even if Memory_size
> > is 0.
> >
> > Arrange for the driver to optionally enable the HDM Decoder Capability
> > if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
> > configuration was invalid. Be careful to only disable memory decode if
> > the kernel was the one to enable it. In other words, if CXL is backing
> > all of kernel memory at boot the device needs to maintain "mem_enable"
> > and "HDM Decoder enable" all the way up to handoff back to platform
> > firmware (e.g. ACPI S5 state entry may require CXL memory to stay
> > active).
> >
> > Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
> > Cc: Dan Carpenter <dan.carpenter@oracle.com>
> > [dan: fix early termination of range-allowed loop]
> > Cc: Ariel Sibley <ariel.sibley@microchip.com>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> > Changes since v1:
> > - Fix range-allowed loop termination (Smatch / Dan)
> That had me confused before I saw v2 :)
>
> I'm not keen on the trick to do disallowed in the debug message...
>
> Other than ongoing discussion around the range being always allowed
> (or not) this looks good to me.
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
> > - Clean up changelog wording around why [0, 256M) is considered always
> >   active (Ariel)
> >
> >  drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
> >  1 file changed, 151 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index a697c48fc830..528430da0e77 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -175,30 +175,164 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
> >  	return -ETIMEDOUT;
> >  }
> >
> > +static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
> > +	int d = cxlds->cxl_dvsec;
> > +	u16 ctrl;
> > +	int rc;
> > +
> > +	rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
> > +		return 1;
> > +	ctrl &= ~CXL_DVSEC_MEM_ENABLE;
> > +	ctrl |= val;
> > +
> > +	rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	return 0;
> > +}
> > +
> > +static void clear_mem_enable(void *cxlds)
> > +{
> > +	cxl_set_mem_enable(cxlds, 0);
> > +}
> > +
> > +static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
> > +{
> > +	int rc;
> > +
> > +	rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
> > +	if (rc < 0)
> > +		return rc;
> > +	if (rc > 0)
> > +		return 0;
> > +	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
> > +}
> > +
> > +static bool range_contains(struct range *r1, struct range *r2)
> > +{
> > +	return r1->start <= r2->start && r1->end >= r2->end;
> > +}
> > +
> > +/* require dvsec ranges to be covered by a locked platform window */
> > +static int dvsec_range_allowed(struct device *dev, void *arg)
> > +{
> > +	struct range *dev_range = arg;
> > +	struct cxl_decoder *cxld;
> > +	struct range root_range;
> > +
> > +	if (!is_root_decoder(dev))
> > +		return 0;
> > +
> > +	cxld = to_cxl_decoder(dev);
> > +
> > +	if (!(cxld->flags & CXL_DECODER_F_LOCK))
> > +		return 0;
> > +	if (!(cxld->flags & CXL_DECODER_F_RAM))
> > +		return 0;
> > +
> > +	root_range = (struct range) {
> > +		.start = cxld->platform_res.start,
> > +		.end = cxld->platform_res.end,
> > +	};
> > +
> > +	return range_contains(&root_range, dev_range);
> > +}
> > +
> > +static void disable_hdm(void *_cxlhdm)
> > +{
> > +	u32 global_ctrl;
> > +	struct cxl_hdm *cxlhdm = _cxlhdm;
> > +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > +
> > +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +	writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
> > +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +}
> > +
> > +static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
> > +{
> > +	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > +	u32 global_ctrl;
> > +
> > +	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
> > +	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > +
> > +	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
> > +}
> > +
> >  static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
> >  				  struct cxl_hdm *cxlhdm,
> >  				  struct cxl_endpoint_dvsec_info *info)
> >  {
> >  	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
> > -	bool global_enable;
> > +	struct cxl_port *port = cxlhdm->port;
> > +	struct device *dev = cxlds->dev;
> > +	struct cxl_port *root;
> > +	int i, rc, allowed;
> >  	u32 global_ctrl;
> >
> >  	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
> > -	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
> >
> > -	if (!global_enable && info->mem_enabled)
> > +	/*
> > +	 * If the HDM Decoder Capability is already enabled then assume
> > +	 * that some other agent like platform firmware set it up.
> > +	 */
> > +	if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
> > +		rc = devm_cxl_enable_mem(&port->dev, cxlds);
> > +		if (rc)
> > +			return false;
> > +		return true;
> > +	}
> > +
> > +	root = to_cxl_port(port->dev.parent);
> > +	while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
> > +		root = to_cxl_port(root->dev.parent);
> > +	if (!is_cxl_root(root)) {
> > +		dev_err(dev, "Failed to acquire root port for HDM enable\n");
> >  		return false;
> > +	}
> > +
> > +	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
> > +		struct device *cxld_dev;
> > +
> > +		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
> > +					     dvsec_range_allowed);
> > +		dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
> > +			cxld_dev ? "" : "dis");
>
> Ouch. Not worth doing that to save a few chars. Makes the message
> harder to grep for.

Ok, will drop, along with the "always enabled" change.
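The platform-window check that the range-allowed loop performs can be
modeled outside the kernel. The following is a minimal sketch in C,
assuming inclusive range ends as in the patch's range_contains(); the
struct window type, its locked/ram flags (standing in for
CXL_DECODER_F_LOCK and CXL_DECODER_F_RAM), and count_allowed() are
hypothetical stand-ins for the root decoder walk done with
device_find_child().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, mirroring the start/end pair in the patch. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical model of a root decoder (platform window). */
struct window {
	struct range res;
	bool locked;	/* stands in for CXL_DECODER_F_LOCK */
	bool ram;	/* stands in for CXL_DECODER_F_RAM */
};

static bool range_contains(const struct range *r1, const struct range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/* A DVSEC range is allowed if some locked RAM window fully covers it. */
static bool range_allowed(const struct window *w, size_t nr_windows,
			  const struct range *dev_range)
{
	for (size_t i = 0; i < nr_windows; i++) {
		if (!w[i].locked || !w[i].ram)
			continue;
		if (range_contains(&w[i].res, dev_range))
			return true;
	}
	return false;
}

/* Count allowed ranges; every range is visited, none ends the loop early. */
static int count_allowed(const struct window *w, size_t nr_windows,
			 const struct range *ranges, size_t nr_ranges)
{
	int allowed = 0;

	for (size_t i = 0; i < nr_ranges; i++)
		if (range_allowed(w, nr_windows, &ranges[i]))
			allowed++;
	return allowed;
}
```

A range that straddles the edge of a window, or that sits in an unlocked
window, does not count; in the patch a count of zero leads to
"mem_enable" being cleared and the HDM Decoder Capability being enabled
instead.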
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a697c48fc830..528430da0e77 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -175,30 +175,164 @@ static int wait_for_valid(struct cxl_dev_state *cxlds)
 	return -ETIMEDOUT;
 }
 
+static int cxl_set_mem_enable(struct cxl_dev_state *cxlds, u16 val)
+{
+	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
+	int d = cxlds->cxl_dvsec;
+	u16 ctrl;
+	int rc;
+
+	rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
+	if (rc < 0)
+		return rc;
+
+	if ((ctrl & CXL_DVSEC_MEM_ENABLE) == val)
+		return 1;
+	ctrl &= ~CXL_DVSEC_MEM_ENABLE;
+	ctrl |= val;
+
+	rc = pci_write_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, ctrl);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
+
+static void clear_mem_enable(void *cxlds)
+{
+	cxl_set_mem_enable(cxlds, 0);
+}
+
+static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds)
+{
+	int rc;
+
+	rc = cxl_set_mem_enable(cxlds, CXL_DVSEC_MEM_ENABLE);
+	if (rc < 0)
+		return rc;
+	if (rc > 0)
+		return 0;
+	return devm_add_action_or_reset(host, clear_mem_enable, cxlds);
+}
+
+static bool range_contains(struct range *r1, struct range *r2)
+{
+	return r1->start <= r2->start && r1->end >= r2->end;
+}
+
+/* require dvsec ranges to be covered by a locked platform window */
+static int dvsec_range_allowed(struct device *dev, void *arg)
+{
+	struct range *dev_range = arg;
+	struct cxl_decoder *cxld;
+	struct range root_range;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	cxld = to_cxl_decoder(dev);
+
+	if (!(cxld->flags & CXL_DECODER_F_LOCK))
+		return 0;
+	if (!(cxld->flags & CXL_DECODER_F_RAM))
+		return 0;
+
+	root_range = (struct range) {
+		.start = cxld->platform_res.start,
+		.end = cxld->platform_res.end,
+	};
+
+	return range_contains(&root_range, dev_range);
+}
+
+static void disable_hdm(void *_cxlhdm)
+{
+	u32 global_ctrl;
+	struct cxl_hdm *cxlhdm = _cxlhdm;
+	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+
+	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+	writel(global_ctrl & ~CXL_HDM_DECODER_ENABLE,
+	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+}
+
+static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
+{
+	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+	u32 global_ctrl;
+
+	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+	writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
+	       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+
+	return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
+}
+
 static bool __cxl_hdm_decode_init(struct cxl_dev_state *cxlds,
 				  struct cxl_hdm *cxlhdm,
 				  struct cxl_endpoint_dvsec_info *info)
 {
 	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
-	bool global_enable;
+	struct cxl_port *port = cxlhdm->port;
+	struct device *dev = cxlds->dev;
+	struct cxl_port *root;
+	int i, rc, allowed;
 	u32 global_ctrl;
 
 	global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
-	global_enable = global_ctrl & CXL_HDM_DECODER_ENABLE;
 
-	if (!global_enable && info->mem_enabled)
+	/*
+	 * If the HDM Decoder Capability is already enabled then assume
+	 * that some other agent like platform firmware set it up.
+	 */
+	if (global_ctrl & CXL_HDM_DECODER_ENABLE) {
+		rc = devm_cxl_enable_mem(&port->dev, cxlds);
+		if (rc)
+			return false;
+		return true;
+	}
+
+	root = to_cxl_port(port->dev.parent);
+	while (!is_cxl_root(root) && is_cxl_port(root->dev.parent))
+		root = to_cxl_port(root->dev.parent);
+	if (!is_cxl_root(root)) {
+		dev_err(dev, "Failed to acquire root port for HDM enable\n");
 		return false;
+	}
+
+	for (i = 0, allowed = 0; info->mem_enabled && i < info->ranges; i++) {
+		struct device *cxld_dev;
+
+		cxld_dev = device_find_child(&root->dev, &info->dvsec_range[i],
+					     dvsec_range_allowed);
+		dev_dbg(dev, "DVSEC Range%d %sallowed by platform\n", i,
+			cxld_dev ? "" : "dis");
+		if (!cxld_dev)
+			continue;
+		put_device(cxld_dev);
+		allowed++;
+	}
+	put_device(&root->dev);
+
+	if (!allowed) {
+		cxl_set_mem_enable(cxlds, 0);
+		info->mem_enabled = 0;
+	}
 
 	/*
-	 * Permanently (for this boot at least) opt the device into HDM
-	 * operation. Individual HDM decoders still need to be enabled after
-	 * this point.
+	 * At least one DVSEC range is enabled and allowed, skip HDM
+	 * Decoder Capability Enable
 	 */
-	if (!global_enable) {
-		dev_dbg(cxlds->dev, "Enabling HDM decode\n");
-		writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
-		       hdm + CXL_HDM_DECODER_CTRL_OFFSET);
-	}
+	if (info->mem_enabled)
+		return false;
+
+	rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+	if (rc)
+		return false;
+
+	rc = devm_cxl_enable_mem(&port->dev, cxlds);
+	if (rc)
+		return false;
 
 	return true;
 }
@@ -253,9 +387,14 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm)
 		return rc;
 	}
 
+	/*
+	 * The current DVSEC values are moot if the memory capability is
+	 * disabled, and they will remain moot after the HDM Decoder
+	 * capability is enabled.
+	 */
 	info.mem_enabled = FIELD_GET(CXL_DVSEC_MEM_ENABLE, ctrl);
 	if (!info.mem_enabled)
-		return 0;
+		return __cxl_hdm_decode_init(cxlds, cxlhdm, &info);
 
 	for (i = 0; i < hdm_count; i++) {
 		u64 base, size;
CXL memory expanders that support the CXL 2.0 memory device class code
include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
on a "mem_enable" bit being set in configuration space before either
mechanism activates. When the HDM Decoder Capability is enabled the CXL
DVSEC Range settings are ignored.

Previously, the cxl_mem driver was relying on platform-firmware to set
"mem_enable". That is an invalid assumption as there is no requirement
that platform-firmware sets the bit before the driver sees a device,
especially in hot-plug scenarios. Additionally, ACPI-platforms that
support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
Table). That table outlines the platform permissible address ranges for
CXL operation. So, there is a need for the driver to set "mem_enable",
and there is information available to determine the validity of the CXL
DVSEC Ranges. While DVSEC Ranges are expected to be at least
256M in size, the specification (CXL 2.0 Section 8.1.3.8.4 DVSEC CXL
Range 1 Base Low) allows for the possibility of devices smaller than
256M. So the range [0, 256M) is considered active even if Memory_size
is 0.

Arrange for the driver to optionally enable the HDM Decoder Capability
if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
configuration was invalid. Be careful to only disable memory decode if
the kernel was the one to enable it. In other words, if CXL is backing
all of kernel memory at boot the device needs to maintain "mem_enable"
and "HDM Decoder enable" all the way up to handoff back to platform
firmware (e.g. ACPI S5 state entry may require CXL memory to stay
active).

Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
[dan: fix early termination of range-allowed loop]
Cc: Ariel Sibley <ariel.sibley@microchip.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v1:
- Fix range-allowed loop termination (Smatch / Dan)
- Clean up changelog wording around why [0, 256M) is considered always
  active (Ariel)

 drivers/cxl/core/pci.c |  163 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 151 insertions(+), 12 deletions(-)
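The "only disable memory decode if the kernel was the one to enable it"
rule from the changelog can be illustrated with a small user-space model
in C. It mirrors the return convention of cxl_set_mem_enable() in the
patch (1 when the bit already has the requested value, 0 when it was
changed) but leaves out config-space access errors. The model_* names,
the struct, and the bit position are hypothetical; the undo_registered
flag stands in for the devm action the patch registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only; the driver uses CXL_DVSEC_MEM_ENABLE. */
#define MODEL_MEM_ENABLE 0x4u

struct model_dev {
	uint16_t ctrl;		/* stands in for the DVSEC control word */
	bool undo_registered;	/* stands in for the devm teardown action */
};

/* Returns 1 if the bit already had the requested value, 0 if changed. */
static int model_set_mem_enable(struct model_dev *dev, uint16_t val)
{
	if ((dev->ctrl & MODEL_MEM_ENABLE) == val)
		return 1;
	dev->ctrl = (uint16_t)((dev->ctrl & ~MODEL_MEM_ENABLE) | val);
	return 0;
}

/* Register the undo step only when this call actually set the bit. */
static int model_enable_mem(struct model_dev *dev)
{
	int rc = model_set_mem_enable(dev, MODEL_MEM_ENABLE);

	if (rc > 0)
		return 0;	/* someone else (e.g. firmware) enabled it */
	dev->undo_registered = true;
	return 0;
}

/* Teardown: clears the bit only if the enable above was ours. */
static void model_teardown(struct model_dev *dev)
{
	if (dev->undo_registered)
		model_set_mem_enable(dev, 0);
}
```

If the bit was already set when the model first looked, teardown leaves
it alone, which is the behavior the changelog asks for when CXL memory
must stay active up to the handoff back to platform firmware.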