diff mbox series

[v10,12/15] cxl/pci: Disable root port interrupts in RCH mode

Message ID 20230831170248.185078-5-terry.bowman@amd.com
State New, archived
Headers show
Series None | expand

Commit Message

Terry Bowman Aug. 31, 2023, 5:02 p.m. UTC
The RCH root port contains root command AER registers that should not be
enabled.[1] Disable these to prevent root port interrupts.

[1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/core.h |  6 ++++++
 drivers/cxl/core/pci.c  | 29 +++++++++++++++++++++++++++++
 drivers/cxl/core/port.c |  3 +++
 3 files changed, 38 insertions(+)

Comments

Dan Williams Sept. 15, 2023, 6:43 p.m. UTC | #1
Terry Bowman wrote:
> The RCH root port contains root command AER registers that should not be
> enabled.[1] Disable these to prevent root port interrupts.
> 
> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
[..]
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 2a22a7ed4704..d195af72ed65 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
>  
>  	cxl_dport_map_regs(dport);
>  
> +	if (dport->rch)
> +		cxl_disable_rch_root_ints(dport);
> +

Similar to the comment about cxl_dport_map_regs() not being appropriate
in an enumeration routine, this also needs to move out of _add_dport. It
occurs to me that it should also be undone on driver detach just like
other device "enables".
Terry Bowman Sept. 19, 2023, 3:08 p.m. UTC | #2
Hi Dan,

I added comments below.

On 9/15/23 13:43, Dan Williams wrote:
> Terry Bowman wrote:
>> The RCH root port contains root command AER registers that should not be
>> enabled.[1] Disable these to prevent root port interrupts.
>>
>> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
>>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Robert Richter <rrichter@amd.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> [..]
>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 2a22a7ed4704..d195af72ed65 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
>>  
>>  	cxl_dport_map_regs(dport);
>>  
>> +	if (dport->rch)
>> +		cxl_disable_rch_root_ints(dport);
>> +
> 
> Similar to the comment about cxl_dport_map_regs() not being appropriate
> in an enumeration routine, this also needs to move out of _add_dport. It
> occurs to me that it should also be undone on driver detach just like
> other device "enables".

Ok. I will move out of enumeration. 

Per the 'undo' request: This is a RCH downstream port (dport) with PCIe root port 
capability. PCI spec states root port error reporting is disabled by default at 
powerup. And SW does *not* enable the root port errors because the RCH dport is *not* 
bound to a root port driver (missing BDF, etc). This mask is added to follow the 
CXL spec precisely and if the rest of the system behaves as expected should not 
be necessary. 

I don't believe masking should be 'undone' in driver detach or elsewhere. Adding 
the 'undo' masking would potentially introduce RCH dport root port interrupt 
reporting which is incorrect for the RCH/RCD mode. Only CXL components (device, 
uport, switch) may reside under the RCH dport and never want RCH dport reporting 
root port errors. RCEC reports the root complex errors in RCH/RCD mode.

Regards,
Terry
Dan Williams Sept. 26, 2023, 9:47 p.m. UTC | #3
Terry Bowman wrote:
> Hi Dan,
> 
> I added comments below.
> 
> On 9/15/23 13:43, Dan Williams wrote:
> > Terry Bowman wrote:
> >> The RCH root port contains root command AER registers that should not be
> >> enabled.[1] Disable these to prevent root port interrupts.
> >>
> >> [1] CXL 3.0 - 12.2.1.1 RCH Downstream Port-detected Errors
> >>
> >> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> >> Signed-off-by: Robert Richter <rrichter@amd.com>
> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > [..]
> >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> >> index 2a22a7ed4704..d195af72ed65 100644
> >> --- a/drivers/cxl/core/port.c
> >> +++ b/drivers/cxl/core/port.c
> >> @@ -1042,6 +1042,9 @@ __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
> >>  
> >>  	cxl_dport_map_regs(dport);
> >>  
> >> +	if (dport->rch)
> >> +		cxl_disable_rch_root_ints(dport);
> >> +
> > 
> > Similar to the comment about cxl_dport_map_regs() not being appropriate
> > in an enumeration routine, this also needs to move out of _add_dport. It
> > occurs to me that it should also be undone on driver detach just like
> > other device "enables".
> 
> Ok. I will move out of enumeration. 
> 
> Per the 'undo' request: This is a RCH downstream port (dport) with PCIe root port 
> capability. PCI spec states root port error reporting is disabled by default at 
> powerup. And SW does *not* enable the root port errors because the RCH dport is *not* 
> bound to a root port driver (missing BDF, etc). This mask is added to follow the 
> CXL spec precisely and if the rest of the system behaves as expected should not 
> be necessary. 

Ah, got it perhaps add a comment to sanity check that the hardware is in
the per-spec state. Are you certain that even in firmware-first error
handling it is safe for the driver to unconditionally disable these
interrupts?

> I don't believe masking should be 'undone' in driver detach or elsewhere. Adding 
> the 'undo' masking would potentially introduce RCH dport root port interrupt 
> reporting which is incorrect for the RCH/RCD mode. Only CXL components (device, 
> uport, switch) may reside under the RCH dport and never want RCH dport reporting 
> root port errors. RCEC reports the root complex errors in RCH/RCD mode.

Ok, that also seems to suggest that even in the firmware-first case the
driver should make sure they are off per-spec.
diff mbox series

Patch

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index f470ef5c0a6a..6b037030b936 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -87,4 +87,10 @@  enum cxl_poison_trace_type {
 	CXL_POISON_TRACE_CLEAR,
 };
 
+#ifdef CONFIG_PCIEAER_CXL
+void cxl_disable_rch_root_ints(struct cxl_dport *dport);
+#else
+static inline void cxl_disable_rch_root_ints(struct cxl_dport *dport) { };
+#endif
+
 #endif /* __CXL_CORE_H__ */
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 1c40270968b6..e306d3c9638b 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -819,6 +819,35 @@  static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds)
 		cxl_handle_rdport_ras(cxlds, dport);
 }
 
+void cxl_disable_rch_root_ints(struct cxl_dport *dport)
+{
+	void __iomem *aer_base = dport->regs.dport_aer;
+	struct pci_host_bridge *bridge;
+	u32 aer_cmd_mask, aer_cmd;
+
+	if (!aer_base)
+		return;
+
+	bridge = to_pci_host_bridge(dport->dport_dev);
+
+	/*
+	 * Disable RCH root port command interrupts.
+	 * CXL 3.0 12.2.1.1 - RCH Downstream Port-detected Errors
+	 *
+	 * This sequence may not be necessary. CXL spec states disabling
+	 * the root cmd register's interrupts is required. But, PCI spec
+	 * shows these are disabled by default on reset.
+	 */
+	if (bridge->native_cxl_error) {
+		aer_cmd_mask = (PCI_ERR_ROOT_CMD_COR_EN |
+				PCI_ERR_ROOT_CMD_NONFATAL_EN |
+				PCI_ERR_ROOT_CMD_FATAL_EN);
+		aer_cmd = readl(aer_base + PCI_ERR_ROOT_COMMAND);
+		aer_cmd &= ~aer_cmd_mask;
+		writel(aer_cmd, aer_base + PCI_ERR_ROOT_COMMAND);
+	}
+}
+
 #else
 static void cxl_handle_rdport_errors(struct cxl_dev_state *cxlds) { }
 #endif
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 2a22a7ed4704..d195af72ed65 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1042,6 +1042,9 @@  __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
 
 	cxl_dport_map_regs(dport);
 
+	if (dport->rch)
+		cxl_disable_rch_root_ints(dport);
+
 	cond_cxl_root_lock(port);
 	rc = add_dport(port, dport);
 	cond_cxl_root_unlock(port);