Message ID | 20240819090428.17349-1-eichest@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | PCI: imx6: reset link after suspend/resume | expand |
On Mon, Aug 19, 2024 at 11:03:16AM +0200, Stefan Eichenberger wrote: > On the i.MX6Quad (not QuadPlus), the PCIe link does not work after a > suspend/resume cycle. Worse, the PCIe memory mapped I/O isn't accessible > at all, so the system freezes when a PCIe driver tries to access its I/O > space. The only way to get resume working again is to reset the PCIe > link, similar to what is done on devices that support suspend/resume. > Through trial and error, we found that something about the PCIe > reference clock does not work as expected after a resume. We could not > figure out if it is disabled (even though the registers still say it is > enabled), or if it is somehow unstable or has some hiccups. With the > workaround introduced in this patch series, we were able to fully resume > a Compex WLE900VX (ath10k) miniPCIe Wifi module and an Intel AX200 M.2 > Wifi module. If there is a better way or other ideas on how to fix this > problem, please let us know. We are aware that resetting the link should > not be necessary, but we could not find a better solution. More > interestingly, even the SoCs that support suspend/resume according to > the i.MX erratas seem to reset the link on resume in > imx6_pcie_host_init, so we hope this might be a valid workaround. > > Stefan Eichenberger (3): > PCI: imx6: Add a function to deassert the reset gpio > PCI: imx6: move the wait for clock stabilization to enable ref clk > PCI: imx6: reset link on resume Thanks you for your patch, but it may have conflict with https://lore.kernel.org/linux-pci/Zr4XG6r+HnbIlu8S@lizhi-Precision-Tower-5810/T/#t > > drivers/pci/controller/dwc/pci-imx6.c | 69 +++++++++++++++++++++++---- > 1 file changed, 59 insertions(+), 10 deletions(-) > > -- > 2.43.0 >
On Mon, Aug 19, 2024 at 10:30:38AM -0400, Frank Li wrote: > On Mon, Aug 19, 2024 at 11:03:16AM +0200, Stefan Eichenberger wrote: > > On the i.MX6Quad (not QuadPlus), the PCIe link does not work after a > > suspend/resume cycle. Worse, the PCIe memory mapped I/O isn't accessible > > at all, so the system freezes when a PCIe driver tries to access its I/O > > space. The only way to get resume working again is to reset the PCIe > > link, similar to what is done on devices that support suspend/resume. > > Through trial and error, we found that something about the PCIe > > reference clock does not work as expected after a resume. We could not > > figure out if it is disabled (even though the registers still say it is > > enabled), or if it is somehow unstable or has some hiccups. With the > > workaround introduced in this patch series, we were able to fully resume > > a Compex WLE900VX (ath10k) miniPCIe Wifi module and an Intel AX200 M.2 > > Wifi module. If there is a better way or other ideas on how to fix this > > problem, please let us know. We are aware that resetting the link should > > not be necessary, but we could not find a better solution. More > > interestingly, even the SoCs that support suspend/resume according to > > the i.MX erratas seem to reset the link on resume in > > imx6_pcie_host_init, so we hope this might be a valid workaround. > > > > Stefan Eichenberger (3): > > PCI: imx6: Add a function to deassert the reset gpio > > PCI: imx6: move the wait for clock stabilization to enable ref clk > > PCI: imx6: reset link on resume > > Thanks you for your patch, but it may have conflict with > https://lore.kernel.org/linux-pci/Zr4XG6r+HnbIlu8S@lizhi-Precision-Tower-5810/T/#t > Thanks a lot for the hint. I will have a look at the series and see if I can adapt my changes including your suggestions. Regards, Stefan
Hi Frank, On Tue, Aug 20, 2024 at 08:52:41AM +0200, Stefan Eichenberger wrote: > On Mon, Aug 19, 2024 at 10:30:38AM -0400, Frank Li wrote: > > On Mon, Aug 19, 2024 at 11:03:16AM +0200, Stefan Eichenberger wrote: > > > On the i.MX6Quad (not QuadPlus), the PCIe link does not work after a > > > suspend/resume cycle. Worse, the PCIe memory mapped I/O isn't accessible > > > at all, so the system freezes when a PCIe driver tries to access its I/O > > > space. The only way to get resume working again is to reset the PCIe > > > link, similar to what is done on devices that support suspend/resume. > > > Through trial and error, we found that something about the PCIe > > > reference clock does not work as expected after a resume. We could not > > > figure out if it is disabled (even though the registers still say it is > > > enabled), or if it is somehow unstable or has some hiccups. With the > > > workaround introduced in this patch series, we were able to fully resume > > > a Compex WLE900VX (ath10k) miniPCIe Wifi module and an Intel AX200 M.2 > > > Wifi module. If there is a better way or other ideas on how to fix this > > > problem, please let us know. We are aware that resetting the link should > > > not be necessary, but we could not find a better solution. More > > > interestingly, even the SoCs that support suspend/resume according to > > > the i.MX erratas seem to reset the link on resume in > > > imx6_pcie_host_init, so we hope this might be a valid workaround. > > > > > > Stefan Eichenberger (3): > > > PCI: imx6: Add a function to deassert the reset gpio > > > PCI: imx6: move the wait for clock stabilization to enable ref clk > > > PCI: imx6: reset link on resume > > > > Thanks you for your patch, but it may have conflict with > > https://lore.kernel.org/linux-pci/Zr4XG6r+HnbIlu8S@lizhi-Precision-Tower-5810/T/#t > > > > Thanks a lot for the hint. I will have a look at the series and see if I > can adapt my changes including your suggestions. I did some more tests with your series applied. Everything works as expected on an i.MX8M Plus. However, the i.MX6Quad PCIe link still does not work after a resume. I could trace the issues back to the following functions, besides that we could use the same suspend/resume function as for the other i.MX SoCs. In suspend the follwoing function is causing issues: imx_pcie_stop_link(imx_pcie->pci); In resume the following functions are causing issues: imx_pcie->drvdata->init_phy(imx_pcie); // in imx_pcie_host_init imx_pcie_start_link(imx_pcie->pci); I think the second one makes sense because I could not stop the link I should not start it again. But why is also the init_phy function failing? Are they required to setup the link? The messages I get when the system resumes are: [ 50.176212] Enabling non-boot CPUs ... [ 50.194446] CPU1 is up [ 50.198087] CPU2 is up [ 50.201746] CPU3 is up [ 50.563710] imx6q-pcie 1ffc000.pcie: Read DBI address failed After the last message the system hangs. It seems this happens because the PCIe I/O mem is not accessible anymore. Do you have an idea what could cause imx_pcie_stop_link to break the link on resume? Without calling them the link is working fine after resuming and the drivers can access the PCIe I/O mem. Thanks, Stefan
On Wed, Aug 21, 2024 at 05:41:00PM +0200, Stefan Eichenberger wrote: > Hi Frank, > > On Tue, Aug 20, 2024 at 08:52:41AM +0200, Stefan Eichenberger wrote: > > On Mon, Aug 19, 2024 at 10:30:38AM -0400, Frank Li wrote: > > > On Mon, Aug 19, 2024 at 11:03:16AM +0200, Stefan Eichenberger wrote: > > > > On the i.MX6Quad (not QuadPlus), the PCIe link does not work after a > > > > suspend/resume cycle. Worse, the PCIe memory mapped I/O isn't accessible > > > > at all, so the system freezes when a PCIe driver tries to access its I/O > > > > space. The only way to get resume working again is to reset the PCIe > > > > link, similar to what is done on devices that support suspend/resume. > > > > Through trial and error, we found that something about the PCIe > > > > reference clock does not work as expected after a resume. We could not > > > > figure out if it is disabled (even though the registers still say it is > > > > enabled), or if it is somehow unstable or has some hiccups. With the > > > > workaround introduced in this patch series, we were able to fully resume > > > > a Compex WLE900VX (ath10k) miniPCIe Wifi module and an Intel AX200 M.2 > > > > Wifi module. If there is a better way or other ideas on how to fix this > > > > problem, please let us know. We are aware that resetting the link should > > > > not be necessary, but we could not find a better solution. More > > > > interestingly, even the SoCs that support suspend/resume according to > > > > the i.MX erratas seem to reset the link on resume in > > > > imx6_pcie_host_init, so we hope this might be a valid workaround. > > > > > > > > Stefan Eichenberger (3): > > > > PCI: imx6: Add a function to deassert the reset gpio > > > > PCI: imx6: move the wait for clock stabilization to enable ref clk > > > > PCI: imx6: reset link on resume > > > > > > Thanks you for your patch, but it may have conflict with > > > https://lore.kernel.org/linux-pci/Zr4XG6r+HnbIlu8S@lizhi-Precision-Tower-5810/T/#t > > > > > > > Thanks a lot for the hint. I will have a look at the series and see if I > > can adapt my changes including your suggestions. > > I did some more tests with your series applied. Everything works as > expected on an i.MX8M Plus. However, the i.MX6Quad PCIe link still does > not work after a resume. I could trace the issues back to the following > functions, besides that we could use the same suspend/resume function as > for the other i.MX SoCs. Upstream 6q pci don't support suspend/resume. [IMX6Q] = { .variant = IMX6Q, .flags = IMX_PCIE_FLAG_IMX_PHY | IMX_PCIE_FLAG_IMX_SPEED_CHANGE, If you add some code, can you post your patch(mark as RFC) then let me to check. Frank > > In suspend the follwoing function is causing issues: > imx_pcie_stop_link(imx_pcie->pci); > > In resume the following functions are causing issues: > imx_pcie->drvdata->init_phy(imx_pcie); // in imx_pcie_host_init > imx_pcie_start_link(imx_pcie->pci); > > I think the second one makes sense because I could not stop the link I > should not start it again. But why is also the init_phy function > failing? Are they required to setup the link? > > The messages I get when the system resumes are: > [ 50.176212] Enabling non-boot CPUs ... > [ 50.194446] CPU1 is up > [ 50.198087] CPU2 is up > [ 50.201746] CPU3 is up > [ 50.563710] imx6q-pcie 1ffc000.pcie: Read DBI address failed > > After the last message the system hangs. It seems this happens because > the PCIe I/O mem is not accessible anymore. > > Do you have an idea what could cause imx_pcie_stop_link to break the > link on resume? Without calling them the link is working fine after > resuming and the drivers can access the PCIe I/O mem. > > Thanks, > Stefan
On Wed, Aug 21, 2024 at 12:51:12PM -0400, Frank Li wrote: > On Wed, Aug 21, 2024 at 05:41:00PM +0200, Stefan Eichenberger wrote: > > Hi Frank, > > > > On Tue, Aug 20, 2024 at 08:52:41AM +0200, Stefan Eichenberger wrote: > > > On Mon, Aug 19, 2024 at 10:30:38AM -0400, Frank Li wrote: > > > > On Mon, Aug 19, 2024 at 11:03:16AM +0200, Stefan Eichenberger wrote: > > > > > On the i.MX6Quad (not QuadPlus), the PCIe link does not work after a > > > > > suspend/resume cycle. Worse, the PCIe memory mapped I/O isn't accessible > > > > > at all, so the system freezes when a PCIe driver tries to access its I/O > > > > > space. The only way to get resume working again is to reset the PCIe > > > > > link, similar to what is done on devices that support suspend/resume. > > > > > Through trial and error, we found that something about the PCIe > > > > > reference clock does not work as expected after a resume. We could not > > > > > figure out if it is disabled (even though the registers still say it is > > > > > enabled), or if it is somehow unstable or has some hiccups. With the > > > > > workaround introduced in this patch series, we were able to fully resume > > > > > a Compex WLE900VX (ath10k) miniPCIe Wifi module and an Intel AX200 M.2 > > > > > Wifi module. If there is a better way or other ideas on how to fix this > > > > > problem, please let us know. We are aware that resetting the link should > > > > > not be necessary, but we could not find a better solution. More > > > > > interestingly, even the SoCs that support suspend/resume according to > > > > > the i.MX erratas seem to reset the link on resume in > > > > > imx6_pcie_host_init, so we hope this might be a valid workaround. > > > > > > > > > > Stefan Eichenberger (3): > > > > > PCI: imx6: Add a function to deassert the reset gpio > > > > > PCI: imx6: move the wait for clock stabilization to enable ref clk > > > > > PCI: imx6: reset link on resume > > > > > > > > Thanks you for your patch, but it may have conflict with > > > > https://lore.kernel.org/linux-pci/Zr4XG6r+HnbIlu8S@lizhi-Precision-Tower-5810/T/#t > > > > > > > > > > Thanks a lot for the hint. I will have a look at the series and see if I > > > can adapt my changes including your suggestions. > > > > I did some more tests with your series applied. Everything works as > > expected on an i.MX8M Plus. However, the i.MX6Quad PCIe link still does > > not work after a resume. I could trace the issues back to the following > > functions, besides that we could use the same suspend/resume function as > > for the other i.MX SoCs. > > Upstream 6q pci don't support suspend/resume. > > [IMX6Q] = { > .variant = IMX6Q, > .flags = IMX_PCIE_FLAG_IMX_PHY | > IMX_PCIE_FLAG_IMX_SPEED_CHANGE, > > If you add some code, can you post your patch(mark as RFC) then let me to > check. Sure, I will wait until your series is merged and then try to come up with an RFC for suspend/resume on i.MX6Quad. Maybe we could use the same mechanism as for the downstream kernel where they just toggle the IMX6Q_GPR1_PCIE_TEST_PD bit? https://github.com/nxp-imx/linux-imx/commit/4e92355e1f79d225ea842511fcfd42b343b32995