diff mbox series

[v2,net-next,2/2] net: stmmac: qcom-ethqos: add a DMA-reset quirk for sa8775p-ride

Message ID 20240627113948.25358-3-brgl@bgdev.pl (mailing list archive)
State New
Series net: stmmac: qcom-ethqos: enable 2.5G ethernet on sa8775p-ride

Commit Message

Bartosz Golaszewski June 27, 2024, 11:39 a.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

On sa8775p-ride the RX clocks from the AQR115C PHY are not available at
the time of the DMA reset so we need to loop TX clocks to RX and then
disable loopback after link-up. Use the existing callbacks to do it just
for this board.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 .../stmicro/stmmac/dwmac-qcom-ethqos.c        | 22 +++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Andrew Halaney June 27, 2024, 5:07 p.m. UTC | #1
On Thu, Jun 27, 2024 at 01:39:47PM GMT, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> On sa8775p-ride the RX clocks from the AQR115C PHY are not available at
> the time of the DMA reset so we need to loop TX clocks to RX and then
> disable loopback after link-up. Use the existing callbacks to do it just
> for this board.
> 
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Sorry, not being very helpful but trying to understand these changes
and the general cleanup of stmmac... so I'll point out that I'm still
confused by this based on Russell's last response:
https://lore.kernel.org/netdev/ZnQLED%2FC3Opeim5q@shell.armlinux.org.uk/

Quote:

    If you're using true Cisco SGMII, there are _no_ clocks transferred
    between the PHY and PCS/MAC. There are two balanced pairs of data
    lines and that is all - one for transmit and one for receive. So this
    explanation doesn't make sense to me.


<snip>

> +}
> +
>  static void ethqos_set_func_clk_en(struct qcom_ethqos *ethqos)
>  {
> +	qcom_ethqos_set_sgmii_loopback(ethqos, true);
>  	rgmii_updatel(ethqos, RGMII_CONFIG_FUNC_CLK_EN,
>  		      RGMII_CONFIG_FUNC_CLK_EN, RGMII_IO_MACRO_CONFIG);
>  }
<snip>
> @@ -682,6 +702,7 @@ static void ethqos_fix_mac_speed(void *priv, unsigned int speed, unsigned int mo
>  {
>  	struct qcom_ethqos *ethqos = priv;
>  
> +	qcom_ethqos_set_sgmii_loopback(ethqos, false);

I'm trying to map out when the loopback is currently enabled/disabled
due to Russell's prior concerns.

Quote:

    So you enable loopback at open time, and disable it when the link comes
    up. This breaks inband signalling (should stmmac ever use that) because
    enabling loopback prevents the PHY sending the SGMII result to the PCS
    to indicate that the link has come up... thus phylink won't call
    mac_link_up().

    So no, I really hate this proposed change.

    What I think would be better is if there were hooks at the appropriate
    places to handle the lack of clock over _just_ the period that it needs
    to be handled, rather than hacking the driver as this proposal does,
    abusing platform callbacks because there's nothing better.

looks like currently you'd:
    qcom_ethqos_probe()
	ethqos_clks_config(ethqos, true)
	    ethqos_set_func_clk_en(ethqos)
		qcom_ethqos_set_sgmii_loopback(ethqos, true) // loopback enabled
	ethqos_set_func_clk_en(ethqos)
	    qcom_ethqos_set_sgmii_loopback(ethqos, true) // no change in loopback
    devm_stmmac_pltfr_probe()
	stmmac_pltfr_probe()
	    stmmac_drv_probe()
		pm_runtime_put()
    // Eventually runtime PM will then do below
    stmmac_runtime_suspend()
	stmmac_bus_clks_config(priv, false)
	    ethqos_clks_config(ethqos, false) // pointless branch but proving to myself
	                                      // that pm_runtime isn't getting in the way here
    __stmmac_open()
	stmmac_runtime_resume()
	    ethqos_clks_config(ethqos, true)
		ethqos_set_func_clk_en(ethqos)
		    qcom_ethqos_set_sgmii_loopback(ethqos, true) // no change in loopback
    stmmac_mac_link_up()
	ethqos_fix_mac_speed()
	    qcom_ethqos_set_sgmii_loopback(ethqos, false); // loopback disabled

Good chance I foobared tracing that... but!
That seems to still go against Russell's comment, i.e. it's on at probe
and remains on until a link is up. This doesn't add any more stmmac-wide
platform callbacks at least, but I'm still concerned based on his prior
comments.
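
To sanity-check the trace above, here is a minimal userspace sketch of the
loopback bit's state across those calls. It is not driver code: the model_*
names and the struct are hypothetical stand-ins, and the masked update
assumes rgmii_updatel() clears the mask bits and then ORs in the value. Only
the BIT(3) position and the needs_sgmii_loopback guard are taken from the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors SGMII_PHY_CNTRL1_SGMII_TX_TO_RX_LOOPBACK_EN, BIT(3) in the patch. */
#define MODEL_LOOPBACK_EN (1u << 3)

/* Hypothetical stand-in for struct qcom_ethqos; the register is a plain variable. */
struct model_ethqos {
	bool needs_sgmii_loopback;
	uint32_t sgmii_phy_cntrl1;
};

/* Assumed read-modify-write semantics: clear the mask bits, then OR in val. */
static void model_updatel(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | val;
}

/* Same shape as qcom_ethqos_set_sgmii_loopback(): a no-op unless the quirk is set. */
static void model_set_sgmii_loopback(struct model_ethqos *e, bool enable)
{
	if (!e->needs_sgmii_loopback)
		return;

	model_updatel(&e->sgmii_phy_cntrl1, MODEL_LOOPBACK_EN,
		      enable ? MODEL_LOOPBACK_EN : 0);
}

/* Probe and runtime-resume path: loopback is (re)enabled here. */
static void model_set_func_clk_en(struct model_ethqos *e)
{
	model_set_sgmii_loopback(e, true);
}

/* Link-up path: the only place loopback is disabled. */
static void model_fix_mac_speed(struct model_ethqos *e)
{
	model_set_sgmii_loopback(e, false);
}

static bool model_loopback_on(const struct model_ethqos *e)
{
	return (e->sgmii_phy_cntrl1 & MODEL_LOOPBACK_EN) != 0;
}
```

With needs_sgmii_loopback set, calling model_set_func_clk_en() twice and then
model_fix_mac_speed() leaves the bit set after both enables and clear only
after the final call, which matches the "on from probe until link-up"
reading of the trace.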

It's not clear to me though if the "2500basex" mentioned here supports
any in-band signalling from a Qualcomm SoC POV (not just the Aquantia
phy it's attached to, but in general). So maybe in that case it's not a
concern?

Although, this isn't tied to _just_ 2500basex here. If I boot the
sa8775p-ride (r2 version, with a marvell 88ea1512 phy attached via
sgmii, not indicating 2500basex) wouldn't all this get exercised? Right
now the devicetree doesn't indicate inband signalling, but I tried that
over here with Russell's clean up a week or two ago and things at least
came up ok (which made me think all the INTEGRATED_PCS stuff wasn't needed,
and I'm not totally positive my test proved inband signalling worked,
but I thought it did...):

    https://lore.kernel.org/netdev/zzevmhmwxrhs5yfv5srvcjxrue2d7wu7vjqmmoyd5mp6kgur54@jvmuv7bxxhqt/

based on Russell's comments, I feel if I was to use his series over
there, add 'managed = "in-band-status"' to the dts, and then apply this
series, the link would not come up anymore.

Total side note, but I'm wondering if the sa8775p-ride dts should
specify 'managed = "in-band-status"'.

Thanks,
Andrew
Bartosz Golaszewski June 27, 2024, 6:35 p.m. UTC | #2
On Thu, Jun 27, 2024 at 7:07 PM Andrew Halaney <ahalaney@redhat.com> wrote:
>
> On Thu, Jun 27, 2024 at 01:39:47PM GMT, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > On sa8775p-ride the RX clocks from the AQR115C PHY are not available at
> > the time of the DMA reset so we need to loop TX clocks to RX and then
> > disable loopback after link-up. Use the existing callbacks to do it just
> > for this board.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Sorry, not being very helpful but trying to understand these changes
> and the general cleanup of stmmac... so I'll point out that I'm still
> confused by this based on Russell's last response:
> https://lore.kernel.org/netdev/ZnQLED%2FC3Opeim5q@shell.armlinux.org.uk/
>

I realized Russell's email didn't pop up in get_maintainer.pl for
stmmac. Adding him now.

> Quote:
>
>     If you're using true Cisco SGMII, there are _no_ clocks transferred
>     between the PHY and PCS/MAC. There are two balanced pairs of data
>     lines and that is all - one for transmit and one for receive. So this
>     explanation doesn't make sense to me.
>
>
> <snip>
>
> > +}
> > +
> >  static void ethqos_set_func_clk_en(struct qcom_ethqos *ethqos)
> >  {
> > +     qcom_ethqos_set_sgmii_loopback(ethqos, true);
> >       rgmii_updatel(ethqos, RGMII_CONFIG_FUNC_CLK_EN,
> >                     RGMII_CONFIG_FUNC_CLK_EN, RGMII_IO_MACRO_CONFIG);
> >  }
> <snip>
> > @@ -682,6 +702,7 @@ static void ethqos_fix_mac_speed(void *priv, unsigned int speed, unsigned int mo
> >  {
> >       struct qcom_ethqos *ethqos = priv;
> >
> > +     qcom_ethqos_set_sgmii_loopback(ethqos, false);
>
> I'm trying to map out when the loopback is currently enabled/disabled
> due to Russell's prior concerns.
>
> Quote:
>
>     So you enable loopback at open time, and disable it when the link comes
>     up. This breaks inband signalling (should stmmac ever use that) because
>     enabling loopback prevents the PHY sending the SGMII result to the PCS
>     to indicate that the link has come up... thus phylink won't call
>     mac_link_up().
>
>     So no, I really hate this proposed change.
>
>     What I think would be better is if there were hooks at the appropriate
>     places to handle the lack of clock over _just_ the period that it needs
>     to be handled, rather than hacking the driver as this proposal does,
>     abusing platform callbacks because there's nothing better.
>
> looks like currently you'd:
>     qcom_ethqos_probe()
>         ethqos_clks_config(ethqos, true)
>             ethqos_set_func_clk_en(ethqos)
>                 qcom_ethqos_set_sgmii_loopback(ethqos, true) // loopback enabled
>         ethqos_set_func_clk_en(ethqos)
>             qcom_ethqos_set_sgmii_loopback(ethqos, true) // no change in loopback
>     devm_stmmac_pltfr_probe()
>         stmmac_pltfr_probe()
>             stmmac_drv_probe()
>                 pm_runtime_put()
>     // Eventually runtime PM will then do below
>     stmmac_runtime_suspend()
>         stmmac_bus_clks_config(priv, false)
>             ethqos_clks_config(ethqos, false) // pointless branch but proving to myself
>                                               // that pm_runtime isn't getting in the way here
>     __stmmac_open()
>         stmmac_runtime_resume()
>             ethqos_clks_config(ethqos, true)
>                 ethqos_set_func_clk_en(ethqos)
>                     qcom_ethqos_set_sgmii_loopback(ethqos, true) // no change in loopback
>     stmmac_mac_link_up()
>         ethqos_fix_mac_speed()
>             qcom_ethqos_set_sgmii_loopback(ethqos, false); // loopback disabled
>
> Good chance I foobared tracing that... but!
> That seems to still go against Russell's comment, i.e. it's on at probe
> and remains on until a link is up. This doesn't add any more stmmac-wide
> platform callbacks at least, but I'm still concerned based on his prior
> comments.
>
> It's not clear to me though if the "2500basex" mentioned here supports
> any in-band signalling from a Qualcomm SoC POV (not just the Aquantia
> phy it's attached to, but in general). So maybe in that case it's not a
> concern?
>
> Although, this isn't tied to _just_ 2500basex here. If I boot the
> sa8775p-ride (r2 version, with a marvell 88ea1512 phy attached via
> sgmii, not indicating 2500basex) wouldn't all this get exercised? Right
> now the devicetree doesn't indicate inband signalling, but I tried that
> over here with Russell's clean up a week or two ago and things at least
> came up ok (which made me think all the INTEGRATED_PCS stuff wasn't needed,
> and I'm not totally positive my test proved inband signalling worked,
> but I thought it did...):
>

Am I getting this right? You added 'managed = "in-band-status"' to
Rev2 DTS and it still worked?

>     https://lore.kernel.org/netdev/zzevmhmwxrhs5yfv5srvcjxrue2d7wu7vjqmmoyd5mp6kgur54@jvmuv7bxxhqt/
>
> based on Russell's comments, I feel if I was to use his series over
> there, add 'managed = "in-band-status"' to the dts, and then apply this
> series, the link would not come up anymore.
>

Because I can confirm that it doesn't on Rev 3. :(

So to explain myself: I tried to follow Andrew Lunn's suggestion about
unifying this and the existing ethqos_set_func_clk_en() bits as they
seem to address a similar issue.

I'm working with limited information here as well regarding this issue
so I figured this could work but you're right - if we ever need
in-band signalling, then it won't work. It's late here so let me get
back to it tomorrow.

> Total side note, but I'm wondering if the sa8775p-ride dts should
> specify 'managed = "in-band-status"'.
>

I'll check this at the source.

Bart

> Thanks,
> Andrew
>
Andrew Lunn June 27, 2024, 7:24 p.m. UTC | #3
> It's not clear to me though if the "2500basex" mentioned here supports
> any in-band signalling from a Qualcomm SoC POV (not just the Aquantia
> phy it's attached to, but in general). So maybe in that case it's not a
> concern?

True 2500BaseX does have inband signalling, for the results of pause
negotiation.

However, in this case, this is not true 2500BaseX, but a hacked SGMII
overclocked to 2.5GHz. There is no inband signalling, because SGMII
signalling makes no sense when overclocked. So out of band signalling
will be used.

My understanding is that both ends of this link are not using true
2500BaseX, and this Qualcomm SoC is incapable of true 2500BaseX. So we
don't need to worry about it in the Qualcomm glue code.

However, what these patches should not block is some other vendor's SoC
with true 2500BaseX from working correctly.

     Andrew
Andrew Lunn June 27, 2024, 7:37 p.m. UTC | #4
On Thu, Jun 27, 2024 at 12:07:22PM -0500, Andrew Halaney wrote:
> On Thu, Jun 27, 2024 at 01:39:47PM GMT, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > 
> > On sa8775p-ride the RX clocks from the AQR115C PHY are not available at
> > the time of the DMA reset so we need to loop TX clocks to RX and then
> > disable loopback after link-up. Use the existing callbacks to do it just
> > for this board.
> > 
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Sorry, not being very helpful but trying to understand these changes
> and the general cleanup of stmmac... so I'll point out that I'm still
> confused by this based on Russell's last response:
> https://lore.kernel.org/netdev/ZnQLED%2FC3Opeim5q@shell.armlinux.org.uk/
> 
> Quote:
> 
>     If you're using true Cisco SGMII, there are _no_ clocks transferred
>     between the PHY and PCS/MAC. There are two balanced pairs of data
>     lines and that is all - one for transmit and one for receive. So this
>     explanation doesn't make sense to me.
> 

Agreed. We need a deeper understanding of the clocking to find an
acceptable solution to this problem.

Is the MAC extracting a clock from the SERDES lines?

Is the PHY not driving the SERDES lines when there is no link?

For RGMII PHYs, they often do have a clock output at 25 or 50MHz which
the MAC uses. And some PHY drivers need asking to not turn this clock
off.  Maybe we need the same here, by asking the PHY to keep the
SERDES lines running when there is no link?

https://elixir.bootlin.com/linux/v6.10-rc5/source/include/linux/phy.h#L781

I also wonder why this is not an issue with plain SGMII, rather than
overclocked SGMII? Maybe there is already a workaround for SGMII and
it just needs to be extended to this not quite 2500BaseX mode.

      Andrew
Andrew Halaney June 27, 2024, 9:48 p.m. UTC | #5
On Thu, Jun 27, 2024 at 08:35:16PM GMT, Bartosz Golaszewski wrote:
> On Thu, Jun 27, 2024 at 7:07 PM Andrew Halaney <ahalaney@redhat.com> wrote:
> >
> > Although, this isn't tied to _just_ 2500basex here. If I boot the
> > sa8775p-ride (r2 version, with a marvell 88ea1512 phy attached via
> > sgmii, not indicating 2500basex) wouldn't all this get exercised? Right
> > now the devicetree doesn't indicate inband signalling, but I tried that
> > over here with Russell's clean up a week or two ago and things at least
> > came up ok (which made me think all the INTEGRATED_PCS stuff wasn't needed,
> > and I'm not totally positive my test proved inband signalling worked,
> > but I thought it did...):
> >
> 
> Am I getting this right? You added `managed = "in-band-status"' to
> Rev2 DTS and it still worked?

> 
> >     https://lore.kernel.org/netdev/zzevmhmwxrhs5yfv5srvcjxrue2d7wu7vjqmmoyd5mp6kgur54@jvmuv7bxxhqt/
> >
> > based on Russell's comments, I feel if I was to use his series over
> > there, add 'managed = "in-band-status"' to the dts, and then apply this
> > series, the link would not come up anymore.
> >

This works on rev2/rev1 (no way to tell which one it actually is, shouldn't matter),
here's a branch I just whipped up to replicate the setup I had when
making the comments in above link:

    https://gitlab.com/ahalaney/kernel-automotive-9/-/commits/russell-cleanups-and-inband

The last commit has some dmesg/ethtool output etc to show things
working. I reverted recent changes to stmmac just to apply cleanly.

I tried the patches Serge added on top of that series, but that was causing
the link to cycle up/down, so I dropped those and went back to just
Russell's patches to recreate the setup I had when leaving the comment.
I need to try with Serge's stuff again when I find a moment and see if I
can work out why the link starts going up/down with those + some
compiler fixups and removing INTEGRATED_PCS flags. For what it's worth,
here's the branch, logs are in the last commit:

    https://gitlab.com/ahalaney/kernel-automotive-9/-/commits/russell-plus-serge-plus-inband-link-cycles


Without Russell's patches the link doesn't come up after switching to
'managed = "in-band-status"' otherwise I'd look into switching the dts to
inband signalling now instead of after those cleanups land.

> 
> Because I can confirm that it doesn't on Rev 3. :(
> 
> So to explain myself: I tried to follow Andrew Lunn's suggestion about
> unifying this and the existing ethqos_set_func_clk_en() bits as they
> seem to address a similar issue.
> 
> I'm working with limited information here as well regarding this issue
> so I figured this could work but you're right - if we ever need
> in-band signalling, then it won't work. It's late here so let me get
> back to it tomorrow.

No worries, I understand how this goes, stmmac is tricky and getting
information/documentation and understanding it can be tough. I appreciate you trying
to get this squared away.

> 
> > Total side note, but I'm wondering if the sa8775p-ride dts should
> > specify 'managed = "in-band-status"'.
> >
> 
> I'll check this at the source.
> 

Thanks!

Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
index 91fe57a3e59e..f4d72d75e8de 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
@@ -21,6 +21,7 @@ 
 #define RGMII_IO_MACRO_CONFIG2		0x1C
 #define RGMII_IO_MACRO_DEBUG1		0x20
 #define EMAC_SYSTEM_LOW_POWER_DEBUG	0x28
+#define EMAC_WRAPPER_SGMII_PHY_CNTRL1	0xf4
 
 /* RGMII_IO_MACRO_CONFIG fields */
 #define RGMII_CONFIG_FUNC_CLK_EN		BIT(30)
@@ -79,6 +80,9 @@ 
 #define ETHQOS_MAC_CTRL_SPEED_MODE		BIT(14)
 #define ETHQOS_MAC_CTRL_PORT_SEL		BIT(15)
 
+/* EMAC_WRAPPER_SGMII_PHY_CNTRL1 bits */
+#define SGMII_PHY_CNTRL1_SGMII_TX_TO_RX_LOOPBACK_EN	BIT(3)
+
 #define SGMII_10M_RX_CLK_DVDR			0x31
 
 struct ethqos_emac_por {
@@ -95,6 +99,7 @@  struct ethqos_emac_driver_data {
 	bool has_integrated_pcs;
 	u32 dma_addr_width;
 	struct dwmac4_addrs dwmac4_addrs;
+	bool needs_sgmii_loopback;
 };
 
 struct qcom_ethqos {
@@ -114,6 +119,7 @@  struct qcom_ethqos {
 	unsigned int num_por;
 	bool rgmii_config_loopback_en;
 	bool has_emac_ge_3;
+	bool needs_sgmii_loopback;
 };
 
 static int rgmii_readl(struct qcom_ethqos *ethqos, unsigned int offset)
@@ -191,8 +197,21 @@  ethqos_update_link_clk(struct qcom_ethqos *ethqos, unsigned int speed)
 	clk_set_rate(ethqos->link_clk, ethqos->link_clk_rate);
 }
 
+static void
+qcom_ethqos_set_sgmii_loopback(struct qcom_ethqos *ethqos, bool enable)
+{
+	if (!ethqos->needs_sgmii_loopback)
+		return;
+
+	rgmii_updatel(ethqos,
+		      SGMII_PHY_CNTRL1_SGMII_TX_TO_RX_LOOPBACK_EN,
+		      enable ? SGMII_PHY_CNTRL1_SGMII_TX_TO_RX_LOOPBACK_EN : 0,
+		      EMAC_WRAPPER_SGMII_PHY_CNTRL1);
+}
+
 static void ethqos_set_func_clk_en(struct qcom_ethqos *ethqos)
 {
+	qcom_ethqos_set_sgmii_loopback(ethqos, true);
 	rgmii_updatel(ethqos, RGMII_CONFIG_FUNC_CLK_EN,
 		      RGMII_CONFIG_FUNC_CLK_EN, RGMII_IO_MACRO_CONFIG);
 }
@@ -277,6 +296,7 @@  static const struct ethqos_emac_driver_data emac_v4_0_0_data = {
 	.has_emac_ge_3 = true,
 	.link_clk_name = "phyaux",
 	.has_integrated_pcs = true,
+	.needs_sgmii_loopback = true,
 	.dma_addr_width = 36,
 	.dwmac4_addrs = {
 		.dma_chan = 0x00008100,
@@ -682,6 +702,7 @@  static void ethqos_fix_mac_speed(void *priv, unsigned int speed, unsigned int mo
 {
 	struct qcom_ethqos *ethqos = priv;
 
+	qcom_ethqos_set_sgmii_loopback(ethqos, false);
 	ethqos->speed = speed;
 	ethqos_update_link_clk(ethqos, speed);
 	ethqos_configure(ethqos);
@@ -820,6 +841,7 @@  static int qcom_ethqos_probe(struct platform_device *pdev)
 	ethqos->num_por = data->num_por;
 	ethqos->rgmii_config_loopback_en = data->rgmii_config_loopback_en;
 	ethqos->has_emac_ge_3 = data->has_emac_ge_3;
+	ethqos->needs_sgmii_loopback = data->needs_sgmii_loopback;
 
 	ethqos->link_clk = devm_clk_get(dev, data->link_clk_name ?: "rgmii");
 	if (IS_ERR(ethqos->link_clk))