Message ID | 1744180833-68472-1-git-send-email-shawn.lin@rock-chips.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Krzysztof Wilczyński |
Headers | show |
Series | PCI: dw-rockchip: Remove PCIE_L0S_ENTRY check from rockchip_pcie_link_up() | expand |
On Wed, Apr 09, 2025 at 02:40:33PM +0800, Shawn Lin wrote: > Two mistakes here: > 1. 0x11 is L0 not L0S, so the naming is wrong from the very beginning. > 2. It's totally broken if enabling ASPM as rockchip_pcie_link_up() treat other > states, for instance, L0S or L1 as link down which is obvioult wrong. > > Remove the check. > > Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> > --- > > drivers/pci/controller/dwc/pcie-dw-rockchip.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c > index c624b7e..21dc99c 100644 > --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c > +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c > @@ -44,7 +44,6 @@ > #define PCIE_LINKUP (PCIE_SMLH_LINKUP | PCIE_RDLH_LINKUP) > #define PCIE_RDLH_LINK_UP_CHGED BIT(1) > #define PCIE_LINK_REQ_RST_NOT_INT BIT(2) > -#define PCIE_L0S_ENTRY 0x11 > #define PCIE_CLIENT_GENERAL_CONTROL 0x0 > #define PCIE_CLIENT_INTR_STATUS_LEGACY 0x8 > #define PCIE_CLIENT_INTR_MASK_LEGACY 0x1c > @@ -177,8 +176,7 @@ static int rockchip_pcie_link_up(struct dw_pcie *pci) > struct rockchip_pcie *rockchip = to_rockchip_pcie(pci); > u32 val = rockchip_pcie_get_ltssm(rockchip); > > - if ((val & PCIE_LINKUP) == PCIE_LINKUP && > - (val & PCIE_LTSSM_STATUS_MASK) == PCIE_L0S_ENTRY) > + if ((val & PCIE_LINKUP) == PCIE_LINKUP) > return 1; > > return 0; > -- > 2.7.4 > You should probably also add: Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver") Considering that dw_pcie_link_up() looks like this: https://github.com/torvalds/linux/blob/v6.15-rc1/drivers/pci/controller/dwc/pcie-designware.c#L714-L725 Why not simply remove the rockchip_pcie_link_up() callback completely? Is there any advantage of using a rockchip specific way to read link up, rather than just reading link up via the DWC PCIE_PORT_DEBUG1 register? Kind regards, Niklas
在 2025/04/09 星期三 16:30, Niklas Cassel 写道: > On Wed, Apr 09, 2025 at 02:40:33PM +0800, Shawn Lin wrote: >> Two mistakes here: >> 1. 0x11 is L0 not L0S, so the naming is wrong from the very beginning. >> 2. It's totally broken if enabling ASPM as rockchip_pcie_link_up() treat other >> states, for instance, L0S or L1 as link down which is obvioult wrong. >> >> Remove the check. >> >> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> >> --- >> >> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) >> >> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c >> index c624b7e..21dc99c 100644 >> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c >> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c >> @@ -44,7 +44,6 @@ >> #define PCIE_LINKUP (PCIE_SMLH_LINKUP | PCIE_RDLH_LINKUP) >> #define PCIE_RDLH_LINK_UP_CHGED BIT(1) >> #define PCIE_LINK_REQ_RST_NOT_INT BIT(2) >> -#define PCIE_L0S_ENTRY 0x11 >> #define PCIE_CLIENT_GENERAL_CONTROL 0x0 >> #define PCIE_CLIENT_INTR_STATUS_LEGACY 0x8 >> #define PCIE_CLIENT_INTR_MASK_LEGACY 0x1c >> @@ -177,8 +176,7 @@ static int rockchip_pcie_link_up(struct dw_pcie *pci) >> struct rockchip_pcie *rockchip = to_rockchip_pcie(pci); >> u32 val = rockchip_pcie_get_ltssm(rockchip); >> >> - if ((val & PCIE_LINKUP) == PCIE_LINKUP && >> - (val & PCIE_LTSSM_STATUS_MASK) == PCIE_L0S_ENTRY) >> + if ((val & PCIE_LINKUP) == PCIE_LINKUP) >> return 1; >> >> return 0; >> -- >> 2.7.4 >> > > You should probably also add: > Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver") > Will add. > > Considering that dw_pcie_link_up() looks like this: > https://github.com/torvalds/linux/blob/v6.15-rc1/drivers/pci/controller/dwc/pcie-designware.c#L714-L725 > > Why not simply remove the rockchip_pcie_link_up() callback completely? > > Is there any advantage of using a rockchip specific way to read link up, > rather than just reading link up via the DWC PCIE_PORT_DEBUG1 register? This is a very good question which we tried but made real products suffer from it for a long time, and finally we found the reason and discarded it. Quoted from DWC databook, section 8.2.3 AXI Bridge Initialization, Clocking and Reset: "In RC Mode, your AXI application must not generate any MEM or I/O requests, until the host software has enabled the Memory Space Enable (MSE), and IO Space Enable (ISE) bits respectively. Your RC application should not generate CFG requests until it has confirmed that the link is up by sampling the smlh_link_up and rdlh_link_up outputs." Quoted from DWC databook, section 5.50 SII: Debug Signals "[36]: smlh_link_up: LTSSM reports PHY link up or LTSSM is in Loopback.Active for Loopback Master" which refers to PCIE_PORT_DEBUG1_LINK_UP per code. The timing in dwc core is drving smlh_link_up->L0->rdlh_link_up->FC init(a fixed delay) from IC simulation when linking up. The dw_pcie_link_up() wasn't reliably work as expected by massive test. The problem is clear from our ASIC view, that cxpl_debug_info from DWC core is missing rdlh_link_up. cxpl_debug_info[32:63] is indentical to PCIE_PORT_DEBUG1, So reading PCIE_PORT_DEBUG1 and check smlh_link_up isn't enough. The problem was introduced by commit 1 and fixed by commit 2 but not to the end. And finally commit 3 rename the register but not fix anything. It was broken from the first time. Any dwc controllers should not be use the buggy default method to check link up state from our view. So this's the whole story for it, which may help you understand the indeed problem and why we reinvent rockchip_pcie_link_up() here. [1]. commit dac29e6c5460 ("PCI: designware: Add default link up check if sub-driver doesn't override") [2]. commit 01c076732e82 ("PCI: designware: Check LTSSM training bit before deciding link is up") [3]. commit 60ef4b072ba0 ("PCI: dwc: imx6: Share PHY debug register definitions") > > > Kind regards, > Niklas >
Hello Shawn, On Wed, Apr 09, 2025 at 05:09:38PM +0800, Shawn Lin wrote: > 在 2025/04/09 星期三 16:30, Niklas Cassel 写道: > > On Wed, Apr 09, 2025 at 02:40:33PM +0800, Shawn Lin wrote: > > > > Is there any advantage of using a rockchip specific way to read link up, > > rather than just reading link up via the DWC PCIE_PORT_DEBUG1 register? > > This is a very good question which we tried but made real products > suffer from it for a long time, and finally we found the reason and > discarded it. > > Quoted from DWC databook, section 8.2.3 AXI Bridge Initialization, > Clocking and Reset: > > "In RC Mode, your AXI application must not generate any MEM or I/O > requests, until the host software has enabled the Memory Space Enable > (MSE), and IO Space Enable (ISE) bits respectively. Your RC application > should not generate CFG requests until it has confirmed that the link is > up by sampling the smlh_link_up and rdlh_link_up outputs." > > Quoted from DWC databook, section 5.50 SII: Debug Signals > "[36]: smlh_link_up: LTSSM reports PHY link up or LTSSM is in > Loopback.Active for Loopback Master" which refers to > PCIE_PORT_DEBUG1_LINK_UP per code. > > The timing in dwc core is drving smlh_link_up->L0->rdlh_link_up->FC > init(a fixed delay) from IC simulation when linking up. > > The dw_pcie_link_up() wasn't reliably work as expected by massive test. > The problem is clear from our ASIC view, that cxpl_debug_info from DWC > core is missing rdlh_link_up. cxpl_debug_info[32:63] is indentical to > PCIE_PORT_DEBUG1, So reading PCIE_PORT_DEBUG1 and check > smlh_link_up isn't enough. > > The problem was introduced by commit 1 and fixed by commit 2 but not to > the end. And finally commit 3 rename the register but not fix anything. > > It was broken from the first time. Any dwc controllers should not be use > the buggy default method to check link up state from our view. > So this's the whole story for it, which may help you understand the > indeed problem and why we reinvent rockchip_pcie_link_up() here. > > [1]. commit dac29e6c5460 ("PCI: designware: Add default link up check if > sub-driver doesn't override") > > [2]. commit 01c076732e82 ("PCI: designware: Check LTSSM training bit > before deciding link is up") > > [3]. commit 60ef4b072ba0 ("PCI: dwc: imx6: Share PHY debug register > definitions") Thank you for the detailed answer. It seems like we should really add a warning and a comment in dw_pcie_link_up(), so that others don't get hit by this hard to debug issue! (Especially since dw_pcie_link_up() was added by someone with a @synopsys.com email). Kind regards, Niklas
在 2025/04/09 星期三 17:19, Niklas Cassel 写道: > Hello Shawn, > > On Wed, Apr 09, 2025 at 05:09:38PM +0800, Shawn Lin wrote: >> 在 2025/04/09 星期三 16:30, Niklas Cassel 写道: >>> On Wed, Apr 09, 2025 at 02:40:33PM +0800, Shawn Lin wrote: >>> >>> Is there any advantage of using a rockchip specific way to read link up, >>> rather than just reading link up via the DWC PCIE_PORT_DEBUG1 register? >> >> This is a very good question which we tried but made real products >> suffer from it for a long time, and finally we found the reason and >> discarded it. >> >> Quoted from DWC databook, section 8.2.3 AXI Bridge Initialization, >> Clocking and Reset: >> >> "In RC Mode, your AXI application must not generate any MEM or I/O >> requests, until the host software has enabled the Memory Space Enable >> (MSE), and IO Space Enable (ISE) bits respectively. Your RC application >> should not generate CFG requests until it has confirmed that the link is >> up by sampling the smlh_link_up and rdlh_link_up outputs." >> >> Quoted from DWC databook, section 5.50 SII: Debug Signals >> "[36]: smlh_link_up: LTSSM reports PHY link up or LTSSM is in >> Loopback.Active for Loopback Master" which refers to >> PCIE_PORT_DEBUG1_LINK_UP per code. >> >> The timing in dwc core is drving smlh_link_up->L0->rdlh_link_up->FC >> init(a fixed delay) from IC simulation when linking up. >> >> The dw_pcie_link_up() wasn't reliably work as expected by massive test. >> The problem is clear from our ASIC view, that cxpl_debug_info from DWC >> core is missing rdlh_link_up. cxpl_debug_info[32:63] is indentical to >> PCIE_PORT_DEBUG1, So reading PCIE_PORT_DEBUG1 and check >> smlh_link_up isn't enough. >> >> The problem was introduced by commit 1 and fixed by commit 2 but not to >> the end. And finally commit 3 rename the register but not fix anything. >> >> It was broken from the first time. Any dwc controllers should not be use >> the buggy default method to check link up state from our view. >> So this's the whole story for it, which may help you understand the >> indeed problem and why we reinvent rockchip_pcie_link_up() here. >> >> [1]. commit dac29e6c5460 ("PCI: designware: Add default link up check if >> sub-driver doesn't override") >> >> [2]. commit 01c076732e82 ("PCI: designware: Check LTSSM training bit >> before deciding link is up") >> >> [3]. commit 60ef4b072ba0 ("PCI: dwc: imx6: Share PHY debug register >> definitions") > > Thank you for the detailed answer. > > It seems like we should really add a warning and a comment in > dw_pcie_link_up(), so that others don't get hit by this hard to debug issue! > > (Especially since dw_pcie_link_up() was added by someone with a @synopsys.com > email). > Yep, will do it with another patch. > > Kind regards, > Niklas >
On Wed, Apr 09, 2025 at 11:19:03AM +0200, Niklas Cassel wrote: > Hello Shawn, > > On Wed, Apr 09, 2025 at 05:09:38PM +0800, Shawn Lin wrote: > > 在 2025/04/09 星期三 16:30, Niklas Cassel 写道: > > > On Wed, Apr 09, 2025 at 02:40:33PM +0800, Shawn Lin wrote: > > > > > > Is there any advantage of using a rockchip specific way to read link up, > > > rather than just reading link up via the DWC PCIE_PORT_DEBUG1 register? > > > > This is a very good question which we tried but made real products > > suffer from it for a long time, and finally we found the reason and > > discarded it. > > > > Quoted from DWC databook, section 8.2.3 AXI Bridge Initialization, > > Clocking and Reset: > > > > "In RC Mode, your AXI application must not generate any MEM or I/O > > requests, until the host software has enabled the Memory Space Enable > > (MSE), and IO Space Enable (ISE) bits respectively. Your RC application > > should not generate CFG requests until it has confirmed that the link is > > up by sampling the smlh_link_up and rdlh_link_up outputs." > > > > Quoted from DWC databook, section 5.50 SII: Debug Signals > > "[36]: smlh_link_up: LTSSM reports PHY link up or LTSSM is in > > Loopback.Active for Loopback Master" which refers to > > PCIE_PORT_DEBUG1_LINK_UP per code. > > > > The timing in dwc core is drving smlh_link_up->L0->rdlh_link_up->FC > > init(a fixed delay) from IC simulation when linking up. > > > > The dw_pcie_link_up() wasn't reliably work as expected by massive test. > > The problem is clear from our ASIC view, that cxpl_debug_info from DWC > > core is missing rdlh_link_up. cxpl_debug_info[32:63] is indentical to > > PCIE_PORT_DEBUG1, So reading PCIE_PORT_DEBUG1 and check > > smlh_link_up isn't enough. > > > > The problem was introduced by commit 1 and fixed by commit 2 but not to > > the end. And finally commit 3 rename the register but not fix anything. > > > > It was broken from the first time. Any dwc controllers should not be use > > the buggy default method to check link up state from our view. > > So this's the whole story for it, which may help you understand the > > indeed problem and why we reinvent rockchip_pcie_link_up() here. > > > > [1]. commit dac29e6c5460 ("PCI: designware: Add default link up check if > > sub-driver doesn't override") > > > > [2]. commit 01c076732e82 ("PCI: designware: Check LTSSM training bit > > before deciding link is up") > > > > [3]. commit 60ef4b072ba0 ("PCI: dwc: imx6: Share PHY debug register > > definitions") > > Thank you for the detailed answer. > > It seems like we should really add a warning and a comment in > dw_pcie_link_up(), so that others don't get hit by this hard to debug issue! > Right. But I'm also wondering if we should use the 'Data Link Layer Link Active' bit in PCI Express Capability for checking link up. Qcom driver has been using it from the start and there are no reported issues. We could add this as the first fallback if the link_up callback is not provided. - Mani
Hello Mani, On Sun, Apr 13, 2025 at 07:54:28PM +0530, Manivannan Sadhasivam wrote: > On Wed, Apr 09, 2025 at 11:19:03AM +0200, Niklas Cassel wrote: > > > > It seems like we should really add a warning and a comment in > > dw_pcie_link_up(), so that others don't get hit by this hard to debug issue! > > > > Right. But I'm also wondering if we should use the 'Data Link Layer Link Active' > bit in PCI Express Capability for checking link up. Qcom driver has been using > it from the start and there are no reported issues. We could add this as the > first fallback if the link_up callback is not provided. Sounds like a good idea, but from looking at: 7.5.3.6 Link Capabilities Register (Offset 0Ch) " Data Link Layer Link Active Reporting Capable - For a Downstream Port, this bit must be hardwired to 1b if the component supports the optional capability of reporting the DL_Active state of the Data Link Control and Management State Machine. For a hot-plug capable Downstream Port (as indicated by the Hot-Plug Capable bit of the Slot Capabilities Register) or a Downstream Port that supports Link speeds greater than 5.0 GT/s, this bit must be hardwired to 1b. For Upstream Ports and components that do not support this optional capability, this bit must be hardwired to 0b. " It sounds like the the 'Data Link Layer Link Active' bit is optional, or at least optional for Gen1 and Gen2. Kind regards, Niklas
On Mon, Apr 14, 2025 at 12:06:09PM +0200, Niklas Cassel wrote: > Hello Mani, > > On Sun, Apr 13, 2025 at 07:54:28PM +0530, Manivannan Sadhasivam wrote: > > On Wed, Apr 09, 2025 at 11:19:03AM +0200, Niklas Cassel wrote: > > > > > > It seems like we should really add a warning and a comment in > > > dw_pcie_link_up(), so that others don't get hit by this hard to debug issue! > > > > > > > Right. But I'm also wondering if we should use the 'Data Link Layer Link Active' > > bit in PCI Express Capability for checking link up. Qcom driver has been using > > it from the start and there are no reported issues. We could add this as the > > first fallback if the link_up callback is not provided. > > Sounds like a good idea, but from looking at: > 7.5.3.6 Link Capabilities Register (Offset 0Ch) > > " > Data Link Layer Link Active Reporting Capable - For a Downstream Port, > this bit must be hardwired to 1b if the component supports the optional > capability of reporting the DL_Active state of the Data Link Control and > Management State Machine. For a hot-plug capable Downstream Port (as > indicated by the Hot-Plug Capable bit of the Slot Capabilities Register) > or a Downstream Port that supports Link speeds greater than 5.0 GT/s, > this bit must be hardwired to 1b. > > For Upstream Ports and components that do not support this optional > capability, this bit must be hardwired to 0b. > " > > It sounds like the the 'Data Link Layer Link Active' bit is optional, > or at least optional for Gen1 and Gen2. Yeah, we should avoid relying on it. Thanks for digging through the spec :) Really appreciated! - Mani
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c index c624b7e..21dc99c 100644 --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c @@ -44,7 +44,6 @@ #define PCIE_LINKUP (PCIE_SMLH_LINKUP | PCIE_RDLH_LINKUP) #define PCIE_RDLH_LINK_UP_CHGED BIT(1) #define PCIE_LINK_REQ_RST_NOT_INT BIT(2) -#define PCIE_L0S_ENTRY 0x11 #define PCIE_CLIENT_GENERAL_CONTROL 0x0 #define PCIE_CLIENT_INTR_STATUS_LEGACY 0x8 #define PCIE_CLIENT_INTR_MASK_LEGACY 0x1c @@ -177,8 +176,7 @@ static int rockchip_pcie_link_up(struct dw_pcie *pci) struct rockchip_pcie *rockchip = to_rockchip_pcie(pci); u32 val = rockchip_pcie_get_ltssm(rockchip); - if ((val & PCIE_LINKUP) == PCIE_LINKUP && - (val & PCIE_LTSSM_STATUS_MASK) == PCIE_L0S_ENTRY) + if ((val & PCIE_LINKUP) == PCIE_LINKUP) return 1; return 0;
Two mistakes here: 1. 0x11 is L0 not L0S, so the naming is wrong from the very beginning. 2. It's totally broken if enabling ASPM as rockchip_pcie_link_up() treat other states, for instance, L0S or L1 as link down which is obvioult wrong. Remove the check. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> --- drivers/pci/controller/dwc/pcie-dw-rockchip.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)