Message ID | 20190604232443.3417-1-bjorn.andersson@linaro.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 885bd765963b42c380db442db7f1c0f2a26076fa |
Headers | show |
Series | phy: qcom-qmp: Correct READY_STATUS poll break condition | expand |
On Tue, Jun 4, 2019 at 4:24 PM Bjorn Andersson <bjorn.andersson@linaro.org> wrote: > > After issuing a PHY_START request to the QMP, the hardware documentation > states that the software should wait for the PCS_READY_STATUS to become > 1. > > With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset > controller") an additional 1ms delay was introduced between the start > request and the check of the status bit. This greatly increases the > chances for the hardware to actually becoming ready before the status > bit is read. > > The result can be seen in that UFS PHY enabling is now reported as a > failure in 10% of the boots on SDM845, which is a clear regression from > the previous rare/occasional failure. > > This patch fixes the "break condition" of the poll to check for the > correct state of the status bit. > > Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready > register, which means that the code checks a bit that's always 0. So the > patch also fixes these, in order to not regress these targets. > > Cc: stable@vger.kernel.org > Cc: Evan Green <evgreen@chromium.org> > Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> > Cc: Vivek Gautam <vivek.gautam@codeaurora.org> > Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") > Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Nice find. Reviewed-by: Evan Green <evgreen@chromium.org>
On Tue, Jun 04, 2019 at 04:24:43PM -0700, Bjorn Andersson wrote: > After issuing a PHY_START request to the QMP, the hardware documentation > states that the software should wait for the PCS_READY_STATUS to become > 1. > > With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset > controller") an additional 1ms delay was introduced between the start > request and the check of the status bit. This greatly increases the > chances for the hardware to actually becoming ready before the status > bit is read. > > The result can be seen in that UFS PHY enabling is now reported as a > failure in 10% of the boots on SDM845, which is a clear regression from > the previous rare/occasional failure. > > This patch fixes the "break condition" of the poll to check for the > correct state of the status bit. > > Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready > register, which means that the code checks a bit that's always 0. So the > patch also fixes these, in order to not regress these targets. > > Cc: stable@vger.kernel.org > Cc: Evan Green <evgreen@chromium.org> > Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> > Cc: Vivek Gautam <vivek.gautam@codeaurora.org> > Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") > Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying > this towards v5.2. > > drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c > index cd91b4179b10..43abdfd0deed 100644 > --- a/drivers/phy/qualcomm/phy-qcom-qmp.c > +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c > @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { > > .start_ctrl = PCS_START | PLL_READY_GATE_EN, > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > + .mask_pcs_ready = PHYSTATUS, > .mask_com_pcs_ready = PCS_READY, > > .has_phy_com_ctrl = true, > @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { > > .start_ctrl = SERDES_START | PCS_START, > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > + .mask_pcs_ready = PHYSTATUS, > .mask_com_pcs_ready = PCS_READY, > }; > > @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) > status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; > mask = cfg->mask_pcs_ready; > > - ret = readl_poll_timeout(status, val, !(val & mask), 1, > + ret = readl_poll_timeout(status, val, val & mask, 1, > PHY_INIT_COMPLETE_TIMEOUT); > if (ret) { > dev_err(qmp->dev, "phy initialization timed-out\n"); > -- > 2.18.0 > msm8996_pciephy_cfg and msm8998_pciephy_cfg not having a bit mask defined for PCS ready is really a separate bug, so personally I would have created two patches, one that adds the missing masks, and one patch that fixes the broken break condition. Either way: Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org>
On 05/06/2019 01:24, Bjorn Andersson wrote: > After issuing a PHY_START request to the QMP, the hardware documentation > states that the software should wait for the PCS_READY_STATUS to become > 1. > > With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset > controller") an additional 1ms delay was introduced between the start > request and the check of the status bit. This greatly increases the > chances for the hardware to actually becoming ready before the status > bit is read. > > The result can be seen in that UFS PHY enabling is now reported as a > failure in 10% of the boots on SDM845, which is a clear regression from > the previous rare/occasional failure. > > This patch fixes the "break condition" of the poll to check for the > correct state of the status bit. > > Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready > register, which means that the code checks a bit that's always 0. So the > patch also fixes these, in order to not regress these targets. > > Cc: stable@vger.kernel.org > Cc: Evan Green <evgreen@chromium.org> > Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> > Cc: Vivek Gautam <vivek.gautam@codeaurora.org> > Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") > Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > --- > > @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying > this towards v5.2. > > drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c > index cd91b4179b10..43abdfd0deed 100644 > --- a/drivers/phy/qualcomm/phy-qcom-qmp.c > +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c > @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { > > .start_ctrl = PCS_START | PLL_READY_GATE_EN, > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > + .mask_pcs_ready = PHYSTATUS, > .mask_com_pcs_ready = PCS_READY, > > .has_phy_com_ctrl = true, > @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { > > .start_ctrl = SERDES_START | PCS_START, > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > + .mask_pcs_ready = PHYSTATUS, > .mask_com_pcs_ready = PCS_READY, > }; > > @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) > status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; > mask = cfg->mask_pcs_ready; > > - ret = readl_poll_timeout(status, val, !(val & mask), 1, > + ret = readl_poll_timeout(status, val, val & mask, 1, > PHY_INIT_COMPLETE_TIMEOUT); > if (ret) { > dev_err(qmp->dev, "phy initialization timed-out\n"); Your patch made me realize that: msm8998_pciephy_cfg.has_phy_com_ctrl = false thus msm8998_pciephy_cfg.mask_com_pcs_ready is useless, AFAICT. (I copied msm8996_pciephy_cfg for msm8998_pciephy_cfg) Does msm8996_pciephy_cfg really need both mask_pcs_ready AND mask_com_pcs_ready? I'll test your patch tomorrow. Regards.
On Wed 12 Jun 09:24 PDT 2019, Marc Gonzalez wrote: > On 05/06/2019 01:24, Bjorn Andersson wrote: > > > After issuing a PHY_START request to the QMP, the hardware documentation > > states that the software should wait for the PCS_READY_STATUS to become > > 1. > > > > With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset > > controller") an additional 1ms delay was introduced between the start > > request and the check of the status bit. This greatly increases the > > chances for the hardware to actually becoming ready before the status > > bit is read. > > > > The result can be seen in that UFS PHY enabling is now reported as a > > failure in 10% of the boots on SDM845, which is a clear regression from > > the previous rare/occasional failure. > > > > This patch fixes the "break condition" of the poll to check for the > > correct state of the status bit. > > > > Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready > > register, which means that the code checks a bit that's always 0. So the > > patch also fixes these, in order to not regress these targets. > > > > Cc: stable@vger.kernel.org > > Cc: Evan Green <evgreen@chromium.org> > > Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> > > Cc: Vivek Gautam <vivek.gautam@codeaurora.org> > > Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") > > Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > > --- > > > > @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying > > this towards v5.2. > > > > drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c > > index cd91b4179b10..43abdfd0deed 100644 > > --- a/drivers/phy/qualcomm/phy-qcom-qmp.c > > +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c > > @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { > > > > .start_ctrl = PCS_START | PLL_READY_GATE_EN, > > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > > + .mask_pcs_ready = PHYSTATUS, > > .mask_com_pcs_ready = PCS_READY, > > > > .has_phy_com_ctrl = true, > > @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { > > > > .start_ctrl = SERDES_START | PCS_START, > > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > > + .mask_pcs_ready = PHYSTATUS, > > .mask_com_pcs_ready = PCS_READY, > > }; > > > > @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) > > status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; > > mask = cfg->mask_pcs_ready; > > > > - ret = readl_poll_timeout(status, val, !(val & mask), 1, > > + ret = readl_poll_timeout(status, val, val & mask, 1, > > PHY_INIT_COMPLETE_TIMEOUT); > > if (ret) { > > dev_err(qmp->dev, "phy initialization timed-out\n"); > > Your patch made me realize that: > msm8998_pciephy_cfg.has_phy_com_ctrl = false > thus > msm8998_pciephy_cfg.mask_com_pcs_ready is useless, AFAICT. > While 8998 has a COM block, it does (among other things) not have a ready bit. So afaict has_phy_com_ctrl = false is correct. The addition of mask_pcs_ready is part of resolving the regression in 5.2, so I suggest that we remove mask_com_pcs_ready separately. > (I copied msm8996_pciephy_cfg for msm8998_pciephy_cfg) > > Does msm8996_pciephy_cfg really need both mask_pcs_ready AND > mask_com_pcs_ready? > 8996 has a COM block and it contains both the control bits and the status bits, so that looks correct. > I'll test your patch tomorrow. > I appreciate that. Thanks, Bjorn
On Wed 12 Jun 06:08 PDT 2019, Niklas Cassel wrote: > On Tue, Jun 04, 2019 at 04:24:43PM -0700, Bjorn Andersson wrote: > > After issuing a PHY_START request to the QMP, the hardware documentation > > states that the software should wait for the PCS_READY_STATUS to become > > 1. > > > > With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset > > controller") an additional 1ms delay was introduced between the start > > request and the check of the status bit. This greatly increases the > > chances for the hardware to actually becoming ready before the status > > bit is read. > > > > The result can be seen in that UFS PHY enabling is now reported as a > > failure in 10% of the boots on SDM845, which is a clear regression from > > the previous rare/occasional failure. > > > > This patch fixes the "break condition" of the poll to check for the > > correct state of the status bit. > > > > Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready > > register, which means that the code checks a bit that's always 0. So the > > patch also fixes these, in order to not regress these targets. > > > > Cc: stable@vger.kernel.org > > Cc: Evan Green <evgreen@chromium.org> > > Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> > > Cc: Vivek Gautam <vivek.gautam@codeaurora.org> > > Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") > > Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") > > Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> > > --- > > > > @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying > > this towards v5.2. > > > > drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c > > index cd91b4179b10..43abdfd0deed 100644 > > --- a/drivers/phy/qualcomm/phy-qcom-qmp.c > > +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c > > @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { > > > > .start_ctrl = PCS_START | PLL_READY_GATE_EN, > > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > > + .mask_pcs_ready = PHYSTATUS, > > .mask_com_pcs_ready = PCS_READY, > > > > .has_phy_com_ctrl = true, > > @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { > > > > .start_ctrl = SERDES_START | PCS_START, > > .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, > > + .mask_pcs_ready = PHYSTATUS, > > .mask_com_pcs_ready = PCS_READY, > > }; > > > > @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) > > status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; > > mask = cfg->mask_pcs_ready; > > > > - ret = readl_poll_timeout(status, val, !(val & mask), 1, > > + ret = readl_poll_timeout(status, val, val & mask, 1, > > PHY_INIT_COMPLETE_TIMEOUT); > > if (ret) { > > dev_err(qmp->dev, "phy initialization timed-out\n"); > > -- > > 2.18.0 > > > > msm8996_pciephy_cfg and msm8998_pciephy_cfg not having a bit mask defined > for PCS ready is really a separate bug, so personally I would have created > two patches, one that adds the missing masks, and one patch that fixes the > broken break condition. > We can't add mask_pcs_ready in a separate commit after the poll change, because this would introduce a regression in the history and we can't add the mask_pcs_ready before because when I tested this on db820c I saw occasional initialization failures. I was not able to verify 8998, but I presume that the same dependency exists there. > Either way: > > Reviewed-by: Niklas Cassel <niklas.cassel@linaro.org> Thanks, Bjorn
On 12/06/2019 19:25, Bjorn Andersson wrote: > On Wed 12 Jun 09:24 PDT 2019, Marc Gonzalez wrote: > >> On 05/06/2019 01:24, Bjorn Andersson wrote: >> >>> After issuing a PHY_START request to the QMP, the hardware documentation >>> states that the software should wait for the PCS_READY_STATUS to become 1. >>> >>> With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset >>> controller") an additional 1ms delay was introduced between the start >>> request and the check of the status bit. This greatly increases the >>> chances for the hardware to actually becoming ready before the status >>> bit is read. >>> >>> The result can be seen in that UFS PHY enabling is now reported as a >>> failure in 10% of the boots on SDM845, which is a clear regression from >>> the previous rare/occasional failure. >>> >>> This patch fixes the "break condition" of the poll to check for the >>> correct state of the status bit. >>> >>> Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready >>> register, which means that the code checks a bit that's always 0. So the >>> patch also fixes these, in order to not regress these targets. >>> >>> Cc: stable@vger.kernel.org >>> Cc: Evan Green <evgreen@chromium.org> >>> Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> >>> Cc: Vivek Gautam <vivek.gautam@codeaurora.org> >>> Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") >>> Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") >>> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> >>> --- >>> >>> @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying >>> this towards v5.2. >>> >>> drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c >>> index cd91b4179b10..43abdfd0deed 100644 >>> --- a/drivers/phy/qualcomm/phy-qcom-qmp.c >>> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c >>> @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { >>> >>> .start_ctrl = PCS_START | PLL_READY_GATE_EN, >>> .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, >>> + .mask_pcs_ready = PHYSTATUS, >>> .mask_com_pcs_ready = PCS_READY, >>> >>> .has_phy_com_ctrl = true, >>> @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { >>> >>> .start_ctrl = SERDES_START | PCS_START, >>> .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, >>> + .mask_pcs_ready = PHYSTATUS, >>> .mask_com_pcs_ready = PCS_READY, >>> }; >>> >>> @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) >>> status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; >>> mask = cfg->mask_pcs_ready; >>> >>> - ret = readl_poll_timeout(status, val, !(val & mask), 1, >>> + ret = readl_poll_timeout(status, val, val & mask, 1, >>> PHY_INIT_COMPLETE_TIMEOUT); >>> if (ret) { >>> dev_err(qmp->dev, "phy initialization timed-out\n"); >> >> Your patch made me realize that: >> msm8998_pciephy_cfg.has_phy_com_ctrl = false >> thus >> msm8998_pciephy_cfg.mask_com_pcs_ready is useless, AFAICT. > > While 8998 has a COM block, it does (among other things) not have a > ready bit. So afaict has_phy_com_ctrl = false is correct. Pfff... Working blind without the HPG sucks... > The addition of mask_pcs_ready is part of resolving the regression in > 5.2, so I suggest that we remove mask_com_pcs_ready separately. I agree that it should be done separately. I'll send a patch on top of yours. >> (I copied msm8996_pciephy_cfg for msm8998_pciephy_cfg) >> >> Does msm8996_pciephy_cfg really need both mask_pcs_ready AND >> mask_com_pcs_ready? > > 8996 has a COM block and it contains both the control bits and the > status bits, so that looks correct. Thanks for checking. >> I'll test your patch tomorrow. > > I appreciate that. Here are my observations for a 8998 board: 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) qcom_pcie_probe() fails with a timeout in phy_init. => this is in line with your regression analysis. 2) Your patch also fixes a long-standing bug in UFS init whereby sending lots of information to the console during phy init would lead to an incorrectly diagnosed time-out. Good stuff! Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Regards.
On 13/06/2019 11:10, Marc Gonzalez wrote: > Here are my observations for a 8998 board: > > 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) > qcom_pcie_probe() fails with a timeout in phy_init. > => this is in line with your regression analysis. > > 2) Your patch also fixes a long-standing bug in UFS init whereby sending > lots of information to the console during phy init would lead to an > incorrectly diagnosed time-out. > > Good stuff! > > Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Hello Kishon, Could you take this patch through your tree? It fixes a pair of nasty bugs. I do have a follow-up (trivial) patch on top of this one: https://lore.kernel.org/patchwork/patch/1088044/ What are your thoughts on the usleep_range issue? https://lore.kernel.org/patchwork/patch/1088035/ Regards.
On 13/06/2019 11:10, Marc Gonzalez wrote: > Here are my observations for a 8998 board: > > 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) > qcom_pcie_probe() fails with a timeout in phy_init. > => this is in line with your regression analysis. > > 2) Your patch also fixes a long-standing bug in UFS init whereby sending > lots of information to the console during phy init would lead to an > incorrectly diagnosed time-out. > > Good stuff! > > Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> It looks like this patch fixed UFS, but broke PCIe and USB3 ^_^ qcom-qmp-phy 1c06000.phy: Registered Qcom-QMP phy qcom-qmp-phy c010000.phy: Registered Qcom-QMP phy qcom-qmp-phy 1da7000.phy: Registered Qcom-QMP phy qcom-qmp-phy 1c06000.phy: BEFORE=000000a6 AFTER=000000a6 qcom-qmp-phy 1c06000.phy: phy initialization timed-out phy phy-1c06000.phy.0: phy init failed --> -110 qcom-pcie: probe of 1c00000.pci failed with error -110 qcom-qmp-phy 1da7000.phy: BEFORE=00000040 AFTER=0000000d qcom-qmp-phy c010000.phy: BEFORE=69696969 AFTER=b7b7b7b7 qcom-qmp-phy c010000.phy: phy initialization timed-out phy phy-c010000.phy.1: phy init failed --> -110 dwc3 a800000.dwc3: failed to initialize core: -110 dwc3: probe of a800000.dwc3 failed with error -110 Downstream code for PCIe is: static bool pcie_phy_is_ready(struct msm_pcie_dev_t *dev) { if (dev->phy_ver >= 0x20) { if (readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & BIT(6)) return false; else return true; } if (!(readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1)) return false; else return true; } AFAICT: PCIe and USB3 QMP PHYs are ready when PHYSTATUS=BIT(6) goes to 0. But UFS is ready when PCS_READY=BIT(0) goes to 1. Can someone verify that USB3 is broken on 845 with 885bd765963b? Regards.
On 19/07/2019 17:50, Marc Gonzalez wrote: > On 13/06/2019 11:10, Marc Gonzalez wrote: > >> Here are my observations for a 8998 board: >> >> 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) >> qcom_pcie_probe() fails with a timeout in phy_init. >> => this is in line with your regression analysis. >> >> 2) Your patch also fixes a long-standing bug in UFS init whereby sending >> lots of information to the console during phy init would lead to an >> incorrectly diagnosed time-out. >> >> Good stuff! >> >> Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> >> Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > > It looks like this patch fixed UFS, but broke PCIe and USB3 ^_^ > > qcom-qmp-phy 1c06000.phy: Registered Qcom-QMP phy > qcom-qmp-phy c010000.phy: Registered Qcom-QMP phy > qcom-qmp-phy 1da7000.phy: Registered Qcom-QMP phy > > qcom-qmp-phy 1c06000.phy: BEFORE=000000a6 AFTER=000000a6 > qcom-qmp-phy 1c06000.phy: phy initialization timed-out > phy phy-1c06000.phy.0: phy init failed --> -110 > qcom-pcie: probe of 1c00000.pci failed with error -110 > > qcom-qmp-phy 1da7000.phy: BEFORE=00000040 AFTER=0000000d > > qcom-qmp-phy c010000.phy: BEFORE=69696969 AFTER=b7b7b7b7 > qcom-qmp-phy c010000.phy: phy initialization timed-out > phy phy-c010000.phy.1: phy init failed --> -110 > dwc3 a800000.dwc3: failed to initialize core: -110 > dwc3: probe of a800000.dwc3 failed with error -110 > > > Downstream code for PCIe is: > > static bool pcie_phy_is_ready(struct msm_pcie_dev_t *dev) > { > if (dev->phy_ver >= 0x20) { > if (readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & BIT(6)) > return false; > else > return true; > } > > if (!(readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1)) > return false; > else > return true; > } > > AFAICT: > PCIe and USB3 QMP PHYs are ready when PHYSTATUS=BIT(6) goes to 0. > But UFS is ready when PCS_READY=BIT(0) goes to 1. > > > Can someone verify that USB3 is broken on 845 with 885bd765963b? Suggested fix: diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c index 34ff6434da8f..11c1b02f0206 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c @@ -1447,6 +1447,11 @@ static int qcom_qmp_phy_com_exit(struct qcom_qmp *qmp) return 0; } +static bool phy_is_ready(unsigned int val, unsigned int mask) +{ + return mask == PCS_READY ? val & mask : !(val & mask); +} + static int qcom_qmp_phy_enable(struct phy *phy) { struct qmp_phy *qphy = phy_get_drvdata(phy); @@ -1548,7 +1553,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; mask = cfg->mask_pcs_ready; - ret = readl_poll_timeout(status, val, val & mask, 10, + ret = readl_poll_timeout(status, val, phy_is_ready(val, mask), 10, PHY_INIT_COMPLETE_TIMEOUT); if (ret) { dev_err(qmp->dev, "phy initialization timed-out\n");
On 23/07/2019 12:31, Marc Gonzalez wrote: > On 19/07/2019 17:50, Marc Gonzalez wrote: > >> On 13/06/2019 11:10, Marc Gonzalez wrote: >> >>> Here are my observations for a 8998 board: >>> >>> 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) >>> qcom_pcie_probe() fails with a timeout in phy_init. >>> => this is in line with your regression analysis. >>> >>> 2) Your patch also fixes a long-standing bug in UFS init whereby sending >>> lots of information to the console during phy init would lead to an >>> incorrectly diagnosed time-out. >>> >>> Good stuff! >>> >>> Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> >>> Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> >> >> It looks like this patch fixed UFS, but broke PCIe and USB3 ^_^ >> >> qcom-qmp-phy 1c06000.phy: Registered Qcom-QMP phy >> qcom-qmp-phy c010000.phy: Registered Qcom-QMP phy >> qcom-qmp-phy 1da7000.phy: Registered Qcom-QMP phy >> >> qcom-qmp-phy 1c06000.phy: BEFORE=000000a6 AFTER=000000a6 >> qcom-qmp-phy 1c06000.phy: phy initialization timed-out >> phy phy-1c06000.phy.0: phy init failed --> -110 >> qcom-pcie: probe of 1c00000.pci failed with error -110 >> >> qcom-qmp-phy 1da7000.phy: BEFORE=00000040 AFTER=0000000d >> >> qcom-qmp-phy c010000.phy: BEFORE=69696969 AFTER=b7b7b7b7 >> qcom-qmp-phy c010000.phy: phy initialization timed-out >> phy phy-c010000.phy.1: phy init failed --> -110 >> dwc3 a800000.dwc3: failed to initialize core: -110 >> dwc3: probe of a800000.dwc3 failed with error -110 >> >> >> Downstream code for PCIe is: >> >> static bool pcie_phy_is_ready(struct msm_pcie_dev_t *dev) >> { >> if (dev->phy_ver >= 0x20) { >> if (readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & BIT(6)) >> return false; >> else >> return true; >> } >> >> if (!(readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1)) >> return false; >> else >> return true; >> } >> >> AFAICT: >> PCIe and USB3 QMP PHYs are ready when PHYSTATUS=BIT(6) goes to 0. >> But UFS is ready when PCS_READY=BIT(0) goes to 1. >> >> >> Can someone verify that USB3 is broken on 845 with 885bd765963b? > > Suggested fix: > > diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c > index 34ff6434da8f..11c1b02f0206 100644 > --- a/drivers/phy/qualcomm/phy-qcom-qmp.c > +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c > @@ -1447,6 +1447,11 @@ static int qcom_qmp_phy_com_exit(struct qcom_qmp *qmp) > return 0; > } > > +static bool phy_is_ready(unsigned int val, unsigned int mask) > +{ > + return mask == PCS_READY ? val & mask : !(val & mask); > +} > + > static int qcom_qmp_phy_enable(struct phy *phy) > { > struct qmp_phy *qphy = phy_get_drvdata(phy); > @@ -1548,7 +1553,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) > status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; > mask = cfg->mask_pcs_ready; > > - ret = readl_poll_timeout(status, val, val & mask, 10, > + ret = readl_poll_timeout(status, val, phy_is_ready(val, mask), 10, > PHY_INIT_COMPLETE_TIMEOUT); > if (ret) { > dev_err(qmp->dev, "phy initialization timed-out\n"); > It seems there is a problem. This patch made its way to stable, even though it appears to break (at least) msm8998, and possibly sdm845. Could someone with access to one or both platforms confirm that the patch broke something, and that the proposed fix actually fixes the issue? Regards.
On Fri 19 Jul 08:50 PDT 2019, Marc Gonzalez wrote: > On 13/06/2019 11:10, Marc Gonzalez wrote: > > > Here are my observations for a 8998 board: > > > > 1) If I apply only the readl_poll_timeout() fix (not the mask_pcs_ready fixup) > > qcom_pcie_probe() fails with a timeout in phy_init. > > => this is in line with your regression analysis. > > > > 2) Your patch also fixes a long-standing bug in UFS init whereby sending > > lots of information to the console during phy init would lead to an > > incorrectly diagnosed time-out. > > > > Good stuff! > > > > Reviewed-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > > Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> > > It looks like this patch fixed UFS, but broke PCIe and USB3 ^_^ > > qcom-qmp-phy 1c06000.phy: Registered Qcom-QMP phy > qcom-qmp-phy c010000.phy: Registered Qcom-QMP phy > qcom-qmp-phy 1da7000.phy: Registered Qcom-QMP phy > > qcom-qmp-phy 1c06000.phy: BEFORE=000000a6 AFTER=000000a6 > qcom-qmp-phy 1c06000.phy: phy initialization timed-out > phy phy-1c06000.phy.0: phy init failed --> -110 > qcom-pcie: probe of 1c00000.pci failed with error -110 > > qcom-qmp-phy 1da7000.phy: BEFORE=00000040 AFTER=0000000d > > qcom-qmp-phy c010000.phy: BEFORE=69696969 AFTER=b7b7b7b7 > qcom-qmp-phy c010000.phy: phy initialization timed-out > phy phy-c010000.phy.1: phy init failed --> -110 > dwc3 a800000.dwc3: failed to initialize core: -110 > dwc3: probe of a800000.dwc3 failed with error -110 > > > Downstream code for PCIe is: > > static bool pcie_phy_is_ready(struct msm_pcie_dev_t *dev) > { > if (dev->phy_ver >= 0x20) { > if (readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & BIT(6)) > return false; > else > return true; > } > > if (!(readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1)) > return false; > else > return true; > } > > AFAICT: > PCIe and USB3 QMP PHYs are ready when PHYSTATUS=BIT(6) goes to 0. > But UFS is ready when PCS_READY=BIT(0) goes to 1. > > > Can someone verify that USB3 is broken on 845 with 885bd765963b? > Both USB3 and PCIe still works for me on 845, but you are correct; PHYSTATUS is ready low, while PCS_READY is ready high. I posted a patch that attempts to clean up the differences, please give it a spin (and review). Regards, Bjorn
diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c index cd91b4179b10..43abdfd0deed 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c @@ -1074,6 +1074,7 @@ static const struct qmp_phy_cfg msm8996_pciephy_cfg = { .start_ctrl = PCS_START | PLL_READY_GATE_EN, .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .mask_pcs_ready = PHYSTATUS, .mask_com_pcs_ready = PCS_READY, .has_phy_com_ctrl = true, @@ -1253,6 +1254,7 @@ static const struct qmp_phy_cfg msm8998_pciephy_cfg = { .start_ctrl = SERDES_START | PCS_START, .pwrdn_ctrl = SW_PWRDN | REFCLK_DRV_DSBL, + .mask_pcs_ready = PHYSTATUS, .mask_com_pcs_ready = PCS_READY, }; @@ -1547,7 +1549,7 @@ static int qcom_qmp_phy_enable(struct phy *phy) status = pcs + cfg->regs[QPHY_PCS_READY_STATUS]; mask = cfg->mask_pcs_ready; - ret = readl_poll_timeout(status, val, !(val & mask), 1, + ret = readl_poll_timeout(status, val, val & mask, 1, PHY_INIT_COMPLETE_TIMEOUT); if (ret) { dev_err(qmp->dev, "phy initialization timed-out\n");
After issuing a PHY_START request to the QMP, the hardware documentation states that the software should wait for the PCS_READY_STATUS to become 1. With the introduction of c9b589791fc1 ("phy: qcom: Utilize UFS reset controller") an additional 1ms delay was introduced between the start request and the check of the status bit. This greatly increases the chances for the hardware to actually becoming ready before the status bit is read. The result can be seen in that UFS PHY enabling is now reported as a failure in 10% of the boots on SDM845, which is a clear regression from the previous rare/occasional failure. This patch fixes the "break condition" of the poll to check for the correct state of the status bit. Unfortunately PCIe on 8996 and 8998 does not specify the mask_pcs_ready register, which means that the code checks a bit that's always 0. So the patch also fixes these, in order to not regress these targets. Cc: stable@vger.kernel.org Cc: Evan Green <evgreen@chromium.org> Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> Cc: Vivek Gautam <vivek.gautam@codeaurora.org> Fixes: 73d7ec899bd8 ("phy: qcom-qmp: Add msm8998 PCIe QMP PHY support") Fixes: e78f3d15e115 ("phy: qcom-qmp: new qmp phy driver for qcom-chipsets") Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> --- @Kishon, this is a regression spotted in v5.2-rc1, so please consider applying this towards v5.2. drivers/phy/qualcomm/phy-qcom-qmp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)