Message ID | 20240215194048.141411-3-Benjamin.Cheatham@amd.com |
---|---|
State | New, archived |
Headers | show |
Series | Implement initial CXL Timeout & Isolation support | expand |
Follow drivers/pci/ subject line convention. Suggest PCI/CXL: Add CXL Timeout & Isolation service driver On Thu, Feb 15, 2024 at 01:40:44PM -0600, Ben Cheatham wrote: > Add a CXL Timeout & Isolation (CXL 3.0 12.3) service driver to the > PCIe port bus driver for CXL root ports. The service will support > enabling/programming CXL.mem transaction timeout, error isolation, > and interrupt handling. > ... > /* CXL 3.0 8.2.4.23 CXL Timeout and Isolation Capability Structure */ > #define CXL_TIMEOUT_CAPABILITY_OFFSET 0x0 > +#define CXL_TIMEOUT_CAP_MEM_TIMEOUT_SUPP BIT(4) > +#define CXL_TIMEOUT_CONTROL_OFFSET 0x8 > +#define CXL_TIMEOUT_CONTROL_MEM_TIMEOUT_ENABLE BIT(4) > #define CXL_TIMEOUT_CAPABILITY_LENGTH 0x10 Weird lack of alignment makes these #defines hard to read, but kudos for following the local style ;) > /* RAS Registers CXL 2.0 8.2.5.9 CXL RAS Capability Structure */ Update to current CXL spec, please. In r3.0, it looks like this is sec 8.2.4.16. Updating everything to r3.1 references would be even better. > +config PCIE_CXL_TIMEOUT > + bool "PCI Express CXL.mem Timeout & Isolation Interrupt support" > + depends on PCIEPORTBUS > + depends on CXL_BUS=PCIEPORTBUS && CXL_PORT > + help > + Enables the CXL.mem Timeout & Isolation PCIE port service driver. This > + driver, in combination with the CXL driver core, is responsible for > + handling CXL capable PCIE root ports that undergo CXL.mem error isolation > + due to either a CXL.mem transaction timeout or uncorrectable device error. When running menuconfig, it seems like both PCIEAER_CXL and PCIE_CXL_TIMEOUT would fit more logically in the drivers/cxl/Kconfig menu along with the rest of the CXL options. Rewrap to fit in 78 columns. s/PCIE/PCIe/ (also below) > + * Implements CXL Timeout & Isolation (CXL 3.0 12.3.2) interrupt support as a > + * PCIE port service driver. The driver is set up such that near all of the > + * work for setting up and handling interrupts are in this file, while the > + * CXL core enables the interrupts during port enumeration. Seems like sec 12.3.2 is slightly too specific here. Maybe 12.3? > +struct pcie_cxlt_data { > + struct cxl_timeout *cxlt; > + struct cxl_dport *dport; "dport" is not used here. Would be better to add it in the patch that needs it. > +static int cxl_map_timeout_regs(struct pci_dev *port, > + struct cxl_register_map *map, > + struct cxl_component_regs *regs) > +{ > + int rc = 0; Unnecessary initialization. > + rc = cxl_find_regblock(port, CXL_REGLOC_RBI_COMPONENT, map); > ... > +int cxl_find_timeout_cap(struct pci_dev *dev, u32 *cap) > +{ > + struct cxl_component_regs regs; > + struct cxl_register_map map; > + int rc = 0; Unnecessary initialization. > + rc = cxl_map_timeout_regs(dev, &map, ®s); > + if (rc) > + return rc; > + > + *cap = readl(regs.timeout + CXL_TIMEOUT_CAPABILITY_OFFSET); > + cxl_unmap_timeout_regs(dev, &map, ®s); > + > + return rc; Since we already know the value: return 0; > +} Move cxl_find_timeout_cap() to the patch that needs it. It's unused in this one. > +static int cxl_timeout_probe(struct pcie_device *dev) > +{ > + struct pci_dev *port = dev->port; > + struct pcie_cxlt_data *pdata; > + struct cxl_timeout *cxlt; > + int rc = 0; Unnecessary initialization. > + /* Limit to CXL root ports */ > + if (!pci_find_dvsec_capability(port, PCI_DVSEC_VENDOR_ID_CXL, > + CXL_DVSEC_PORT_EXTENSIONS)) > + return -ENODEV; > + > + pdata = cxlt_create_pdata(dev); > + if (IS_ERR_OR_NULL(pdata)) > + return PTR_ERR(pdata); > + > + set_service_data(dev, pdata); > + cxlt = pdata->cxlt; > + > + rc = cxl_enable_timeout(dev, cxlt); > + if (rc) > + pci_dbg(dev->port, "Failed to enable CXL.mem timeout: %d\n", > + rc); "(%pe)" (similar in subsequent patches) > + return rc; > +} > + > @@ -829,6 +829,7 @@ static void __init pcie_init_services(void) > pcie_pme_init(); > pcie_dpc_init(); > pcie_hp_init(); > + pcie_cxlt_init(); "cxlt" seems slightly too generic outside cxl_timeout.c, since there might be other CXL things that start with "t" someday. > +#define PCIE_PORT_SERVICE_CXLT_SHIFT 5 /* CXL Timeout & Isolation */ > +#define PCIE_PORT_SERVICE_CXLT (1 << PCIE_PORT_SERVICE_CXLT_SHIFT) I hate having to add this to the portdrv, but I see why you need to (so we can deal with the weirdness of PCIe feature interrupts). Bjorn
Hi Bjorn, thanks for the comments! Responses inline. On 2/15/24 3:13 PM, Bjorn Helgaas wrote: > Follow drivers/pci/ subject line convention. Suggest > > PCI/CXL: Add CXL Timeout & Isolation service driver > > On Thu, Feb 15, 2024 at 01:40:44PM -0600, Ben Cheatham wrote: >> Add a CXL Timeout & Isolation (CXL 3.0 12.3) service driver to the >> PCIe port bus driver for CXL root ports. The service will support >> enabling/programming CXL.mem transaction timeout, error isolation, >> and interrupt handling. >> ... > >> /* CXL 3.0 8.2.4.23 CXL Timeout and Isolation Capability Structure */ >> #define CXL_TIMEOUT_CAPABILITY_OFFSET 0x0 >> +#define CXL_TIMEOUT_CAP_MEM_TIMEOUT_SUPP BIT(4) >> +#define CXL_TIMEOUT_CONTROL_OFFSET 0x8 >> +#define CXL_TIMEOUT_CONTROL_MEM_TIMEOUT_ENABLE BIT(4) >> #define CXL_TIMEOUT_CAPABILITY_LENGTH 0x10 > > Weird lack of alignment makes these #defines hard to read, but kudos > for following the local style ;) > >> /* RAS Registers CXL 2.0 8.2.5.9 CXL RAS Capability Structure */ > > Update to current CXL spec, please. In r3.0, it looks like this is > sec 8.2.4.16. Updating everything to r3.1 references would be even > better. > Yeah a lot of the spec references in cxl.h are out of date, I'll update them where I touch them. And I didn't realize 3.1 was out yet, I'll update the references to that revision as well. >> +config PCIE_CXL_TIMEOUT >> + bool "PCI Express CXL.mem Timeout & Isolation Interrupt support" >> + depends on PCIEPORTBUS >> + depends on CXL_BUS=PCIEPORTBUS && CXL_PORT >> + help >> + Enables the CXL.mem Timeout & Isolation PCIE port service driver. This >> + driver, in combination with the CXL driver core, is responsible for >> + handling CXL capable PCIE root ports that undergo CXL.mem error isolation >> + due to either a CXL.mem transaction timeout or uncorrectable device error. > > When running menuconfig, it seems like both PCIEAER_CXL and > PCIE_CXL_TIMEOUT would fit more logically in the drivers/cxl/Kconfig > menu along with the rest of the CXL options. > Will do. > Rewrap to fit in 78 columns. > > s/PCIE/PCIe/ (also below) > Will do both above. >> + * Implements CXL Timeout & Isolation (CXL 3.0 12.3.2) interrupt support as a >> + * PCIE port service driver. The driver is set up such that near all of the >> + * work for setting up and handling interrupts are in this file, while the >> + * CXL core enables the interrupts during port enumeration. > > Seems like sec 12.3.2 is slightly too specific here. Maybe 12.3? > 12.3.2 is the CXL.mem portion of the section, which is the only section implemented in the patch set. That being said, I don't mind changing it to 12.3 instead. >> +struct pcie_cxlt_data { >> + struct cxl_timeout *cxlt; >> + struct cxl_dport *dport; > > "dport" is not used here. Would be better to add it in the patch that > needs it. > Will do. >> +static int cxl_map_timeout_regs(struct pci_dev *port, >> + struct cxl_register_map *map, >> + struct cxl_component_regs *regs) >> +{ >> + int rc = 0; > > Unnecessary initialization. > >> + rc = cxl_find_regblock(port, CXL_REGLOC_RBI_COMPONENT, map); >> ... > >> +int cxl_find_timeout_cap(struct pci_dev *dev, u32 *cap) >> +{ >> + struct cxl_component_regs regs; >> + struct cxl_register_map map; >> + int rc = 0; > > Unnecessary initialization. > >> + rc = cxl_map_timeout_regs(dev, &map, ®s); >> + if (rc) >> + return rc; >> + >> + *cap = readl(regs.timeout + CXL_TIMEOUT_CAPABILITY_OFFSET); >> + cxl_unmap_timeout_regs(dev, &map, ®s); >> + >> + return rc; > > Since we already know the value: > > return 0; > >> +} > > Move cxl_find_timeout_cap() to the patch that needs it. It's unused > in this one. > Will do to all 4 above. >> +static int cxl_timeout_probe(struct pcie_device *dev) >> +{ >> + struct pci_dev *port = dev->port; >> + struct pcie_cxlt_data *pdata; >> + struct cxl_timeout *cxlt; >> + int rc = 0; > > Unnecessary initialization. > >> + /* Limit to CXL root ports */ >> + if (!pci_find_dvsec_capability(port, PCI_DVSEC_VENDOR_ID_CXL, >> + CXL_DVSEC_PORT_EXTENSIONS)) >> + return -ENODEV; >> + >> + pdata = cxlt_create_pdata(dev); >> + if (IS_ERR_OR_NULL(pdata)) >> + return PTR_ERR(pdata); >> + >> + set_service_data(dev, pdata); >> + cxlt = pdata->cxlt; >> + >> + rc = cxl_enable_timeout(dev, cxlt); >> + if (rc) >> + pci_dbg(dev->port, "Failed to enable CXL.mem timeout: %d\n", >> + rc); > > "(%pe)" (similar in subsequent patches) > I wasn't aware of that, I'll change it. >> + return rc; >> +} >> + > >> @@ -829,6 +829,7 @@ static void __init pcie_init_services(void) >> pcie_pme_init(); >> pcie_dpc_init(); >> pcie_hp_init(); >> + pcie_cxlt_init(); > > "cxlt" seems slightly too generic outside cxl_timeout.c, since there > might be other CXL things that start with "t" someday. > >> +#define PCIE_PORT_SERVICE_CXLT_SHIFT 5 /* CXL Timeout & Isolation */ >> +#define PCIE_PORT_SERVICE_CXLT (1 << PCIE_PORT_SERVICE_CXLT_SHIFT) > > I hate having to add this to the portdrv, but I see why you need to > (so we can deal with the weirdness of PCIe feature interrupts). > > Bjorn
On Thu, Feb 15, 2024 at 04:21:42PM -0600, Ben Cheatham wrote: > On 2/15/24 3:13 PM, Bjorn Helgaas wrote: > >> + rc = cxl_enable_timeout(dev, cxlt); > >> + if (rc) > >> + pci_dbg(dev->port, "Failed to enable CXL.mem timeout: %d\n", > >> + rc); > > > > "(%pe)" (similar in subsequent patches) > > > > I wasn't aware of that, I'll change it. I wasn't either, but I'm getting patches to change them, and it does seem a little bit easier than looking up the errnos when debugging, so ...
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 87f3178d6642..0c65f4ec7aae 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -129,7 +129,11 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw) /* CXL 3.0 8.2.4.23 CXL Timeout and Isolation Capability Structure */ #define CXL_TIMEOUT_CAPABILITY_OFFSET 0x0 +#define CXL_TIMEOUT_CAP_MEM_TIMEOUT_SUPP BIT(4) +#define CXL_TIMEOUT_CONTROL_OFFSET 0x8 +#define CXL_TIMEOUT_CONTROL_MEM_TIMEOUT_ENABLE BIT(4) #define CXL_TIMEOUT_CAPABILITY_LENGTH 0x10 + /* RAS Registers CXL 2.0 8.2.5.9 CXL RAS Capability Structure */ #define CXL_RAS_UNCORRECTABLE_STATUS_OFFSET 0x0 #define CXL_RAS_UNCORRECTABLE_STATUS_MASK (GENMASK(16, 14) | GENMASK(11, 0)) diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig index 8999fcebde6a..27820af4502e 100644 --- a/drivers/pci/pcie/Kconfig +++ b/drivers/pci/pcie/Kconfig @@ -58,6 +58,16 @@ config PCIEAER_CXL If unsure, say Y. +config PCIE_CXL_TIMEOUT + bool "PCI Express CXL.mem Timeout & Isolation Interrupt support" + depends on PCIEPORTBUS + depends on CXL_BUS=PCIEPORTBUS && CXL_PORT + help + Enables the CXL.mem Timeout & Isolation PCIE port service driver. This + driver, in combination with the CXL driver core, is responsible for + handling CXL capable PCIE root ports that undergo CXL.mem error isolation + due to either a CXL.mem transaction timeout or uncorrectable device error. + # # PCI Express ECRC # diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index 8de4ed5f98f1..433ef08efc6f 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -13,3 +13,4 @@ obj-$(CONFIG_PCIE_PME) += pme.o obj-$(CONFIG_PCIE_DPC) += dpc.o obj-$(CONFIG_PCIE_PTM) += ptm.o obj-$(CONFIG_PCIE_EDR) += edr.o +obj-$(CONFIG_PCIE_CXL_TIMEOUT) += cxl_timeout.o diff --git a/drivers/pci/pcie/cxl_timeout.c b/drivers/pci/pcie/cxl_timeout.c new file mode 100644 index 000000000000..84f2df0e0397 --- /dev/null +++ b/drivers/pci/pcie/cxl_timeout.c @@ -0,0 +1,197 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Implements CXL Timeout & Isolation (CXL 3.0 12.3.2) interrupt support as a + * PCIE port service driver. The driver is set up such that near all of the + * work for setting up and handling interrupts are in this file, while the + * CXL core enables the interrupts during port enumeration. + * + * Copyright (C) 2024, Advanced Micro Devices, Inc. + * All Rights Reserved. + * + * Author: Ben Cheatham <Benjamin.Cheatham@amd.com> + */ + +#define pr_fmt(fmt) "cxl_timeout: " fmt +#define dev_fmt pr_fmt + +#include <linux/pci.h> +#include <linux/acpi.h> + +#include "../../cxl/cxlpci.h" +#include "portdrv.h" + +struct cxl_timeout { + struct pcie_device *dev; + void __iomem *regs; + u32 cap; +}; + +struct pcie_cxlt_data { + struct cxl_timeout *cxlt; + struct cxl_dport *dport; +}; + +static int cxl_map_timeout_regs(struct pci_dev *port, + struct cxl_register_map *map, + struct cxl_component_regs *regs) +{ + int rc = 0; + + rc = cxl_find_regblock(port, CXL_REGLOC_RBI_COMPONENT, map); + if (rc) + return rc; + + rc = cxl_setup_regs(map); + if (rc) + return rc; + + rc = cxl_map_component_regs(map, regs, + BIT(CXL_CM_CAP_CAP_ID_TIMEOUT)); + return rc; +} + +static void cxl_unmap_timeout_regs(struct pci_dev *port, + struct cxl_register_map *map, + struct cxl_component_regs *regs) +{ + struct cxl_reg_map *timeout_map = &map->component_map.timeout; + + devm_iounmap(map->host, regs->timeout); + devm_release_mem_region(map->host, map->resource + timeout_map->offset, + timeout_map->size); +} + +static struct cxl_timeout *cxl_create_cxlt(struct pcie_device *dev) +{ + struct cxl_component_regs *regs; + struct cxl_register_map *map; + struct cxl_timeout *cxlt; + int rc; + + regs = devm_kmalloc(&dev->device, sizeof(*regs), GFP_KERNEL); + if (!regs) + return ERR_PTR(-ENOMEM); + + map = devm_kmalloc(&dev->device, sizeof(*map), GFP_KERNEL); + if (!map) { + devm_kfree(&dev->device, regs); + return ERR_PTR(-ENOMEM); + } + + rc = cxl_map_timeout_regs(dev->port, map, regs); + if (rc) + goto err; + + cxlt = devm_kmalloc(&dev->device, sizeof(*cxlt), GFP_KERNEL); + if (!cxlt) + goto err; + + cxlt->regs = regs->timeout; + cxlt->dev = dev; + cxlt->cap = readl(cxlt->regs + CXL_TIMEOUT_CAPABILITY_OFFSET); + + return cxlt; + +err: + cxl_unmap_timeout_regs(dev->port, map, regs); + return ERR_PTR(rc); +} + +int cxl_find_timeout_cap(struct pci_dev *dev, u32 *cap) +{ + struct cxl_component_regs regs; + struct cxl_register_map map; + int rc = 0; + + rc = cxl_map_timeout_regs(dev, &map, ®s); + if (rc) + return rc; + + *cap = readl(regs.timeout + CXL_TIMEOUT_CAPABILITY_OFFSET); + cxl_unmap_timeout_regs(dev, &map, ®s); + + return rc; +} + +static struct pcie_cxlt_data *cxlt_create_pdata(struct pcie_device *dev) +{ + struct pcie_cxlt_data *data; + + data = devm_kzalloc(&dev->device, sizeof(*data), GFP_KERNEL); + if (IS_ERR_OR_NULL(data)) + return ERR_PTR(-ENOMEM); + + data->cxlt = cxl_create_cxlt(dev); + if (IS_ERR_OR_NULL(data->cxlt)) + return ERR_PTR(PTR_ERR(data->cxlt)); + + data->dport = NULL; + + return data; +} + +static void cxl_disable_timeout(void *data) +{ + struct cxl_timeout *cxlt = data; + u32 cntrl = readl(cxlt->regs + CXL_TIMEOUT_CONTROL_OFFSET); + + cntrl &= ~CXL_TIMEOUT_CONTROL_MEM_TIMEOUT_ENABLE; + writel(cntrl, cxlt->regs + CXL_TIMEOUT_CONTROL_OFFSET); +} + +static int cxl_enable_timeout(struct pcie_device *dev, struct cxl_timeout *cxlt) +{ + u32 cntrl; + + if (!cxlt || !FIELD_GET(CXL_TIMEOUT_CAP_MEM_TIMEOUT_SUPP, cxlt->cap)) + return -ENXIO; + + cntrl = readl(cxlt->regs + CXL_TIMEOUT_CONTROL_OFFSET); + cntrl |= CXL_TIMEOUT_CONTROL_MEM_TIMEOUT_ENABLE; + writel(cntrl, cxlt->regs + CXL_TIMEOUT_CONTROL_OFFSET); + + return devm_add_action_or_reset(&dev->device, cxl_disable_timeout, + cxlt); +} + +static int cxl_timeout_probe(struct pcie_device *dev) +{ + struct pci_dev *port = dev->port; + struct pcie_cxlt_data *pdata; + struct cxl_timeout *cxlt; + int rc = 0; + + /* Limit to CXL root ports */ + if (!pci_find_dvsec_capability(port, PCI_DVSEC_VENDOR_ID_CXL, + CXL_DVSEC_PORT_EXTENSIONS)) + return -ENODEV; + + pdata = cxlt_create_pdata(dev); + if (IS_ERR_OR_NULL(pdata)) + return PTR_ERR(pdata); + + set_service_data(dev, pdata); + cxlt = pdata->cxlt; + + rc = cxl_enable_timeout(dev, cxlt); + if (rc) + pci_dbg(dev->port, "Failed to enable CXL.mem timeout: %d\n", + rc); + + return rc; +} + +static struct pcie_port_service_driver cxltdriver = { + .name = "cxl_timeout", + .port_type = PCI_EXP_TYPE_ROOT_PORT, + .service = PCIE_PORT_SERVICE_CXLT, + + .probe = cxl_timeout_probe, +}; + +int __init pcie_cxlt_init(void) +{ + return pcie_port_service_register(&cxltdriver); +} + +MODULE_IMPORT_NS(CXL); diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c index 14a4b89a3b83..7aa0a6f2da4e 100644 --- a/drivers/pci/pcie/portdrv.c +++ b/drivers/pci/pcie/portdrv.c @@ -829,6 +829,7 @@ static void __init pcie_init_services(void) pcie_pme_init(); pcie_dpc_init(); pcie_hp_init(); + pcie_cxlt_init(); } static int __init pcie_portdrv_init(void) diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index 1f3803bde7ee..5395a0e36956 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -22,8 +22,10 @@ #define PCIE_PORT_SERVICE_DPC (1 << PCIE_PORT_SERVICE_DPC_SHIFT) #define PCIE_PORT_SERVICE_BWNOTIF_SHIFT 4 /* Bandwidth notification */ #define PCIE_PORT_SERVICE_BWNOTIF (1 << PCIE_PORT_SERVICE_BWNOTIF_SHIFT) +#define PCIE_PORT_SERVICE_CXLT_SHIFT 5 /* CXL Timeout & Isolation */ +#define PCIE_PORT_SERVICE_CXLT (1 << PCIE_PORT_SERVICE_CXLT_SHIFT) -#define PCIE_PORT_DEVICE_MAXSERVICES 5 +#define PCIE_PORT_DEVICE_MAXSERVICES 6 extern bool pcie_ports_dpc_native; @@ -51,6 +53,12 @@ int pcie_dpc_init(void); static inline int pcie_dpc_init(void) { return 0; } #endif +#ifdef CONFIG_PCIE_CXL_TIMEOUT +int pcie_cxlt_init(void); +#else +static inline int pcie_cxlt_init(void) { return 0; } +#endif + /* Port Type */ #define PCIE_ANY_PORT (~0)
Add a CXL Timeout & Isolation (CXL 3.0 12.3) service driver to the PCIe port bus driver for CXL root ports. The service will support enabling/programming CXL.mem transaction timeout, error isolation, and interrupt handling. Add code to find and map CXL Timeout & Isolation capability register (CXL 3.0 8.2.4.23.1) from service driver. Then use capability register mapping to enable CXL.mem transaction timeout with the default value. Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> --- drivers/cxl/cxl.h | 4 + drivers/pci/pcie/Kconfig | 10 ++ drivers/pci/pcie/Makefile | 1 + drivers/pci/pcie/cxl_timeout.c | 197 +++++++++++++++++++++++++++++++++ drivers/pci/pcie/portdrv.c | 1 + drivers/pci/pcie/portdrv.h | 10 +- 6 files changed, 222 insertions(+), 1 deletion(-) create mode 100644 drivers/pci/pcie/cxl_timeout.c