mbox series

[net,v2,0/3] net: phy: mscc: support VSC8501

Message ID 20230523090405.10655-1-david.epping@missinglinkelectronics.com (mailing list archive)
Headers show
Series net: phy: mscc: support VSC8501 | expand

Message

David Epping May 23, 2023, 9:04 a.m. UTC
Hello,

this updated series of patches adds support for the VSC8501 Ethernet
PHY and fixes support for the VSC8502 PHY in cases where no other
software (like U-Boot) has initialized the PHY after power up.

The first patch simply adds the VSC8502 to the MODULE_DEVICE_TABLE,
where I guess it was unintentionally missing. I have no hardware to
test my change.

The second patch adds the VSC8501 PHY with exactly the same driver
implementation as the existing VSC8502.

The third patch fixes the initialization for VSC8501 and VSC8502.
I have tested this patch with VSC8501 on hardware in RGMII mode only.
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8501-03_Datasheet_60001741A.PDF
https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8502-03_Datasheet_60001742B.pdf
Table 4-42 "RGMII CONTROL, ADDRESS 20E2 (0X14)" Bit 11 for each of
them.
By default the RX_CLK is disabled for these PHYs. In cases where no
other software, like U-Boot, enabled the clock, this results in no
received packets being handed to the MAC.
The patch enables this clock output.
According to Microchip support (case number 01268776) this applies
to all modes (RGMII, GMII, and MII).

Other PHYs sharing the same register map and code, like
VSC8530/31/40/41 have the clock enabled and the relevant bit 11 is
reserved and read-only for them. As per previous discussion the
patch still clears the bit on these PHYs, too, possibly more easily
supporting other future PHYs implementing this functionality.

For the VSC8572 family of PHYs, having a different register map,
no such changes are applied.

Thanks for your feedback,
David

--

Changes in v2:
- adjust cover letter (U-Boot, PHY families)
- add reviewed-by tags to patch 1/3 and 2/3
- patch 3/3: combine vsc85xx_rgmii_set_skews() and
  vsc85xx_rgmii_enable_rx_clk() into vsc85xx_update_rgmii_cntl()
  for fewer MDIO accesses
- patch 3/3: treat all VSC8502 family PHYs the same (regardless of
  bit 11 reserved status)

Additional notes for review:
- If you want to, feel free to add something like
  Co developed by Vladimir Oltean <olteanv@gmail.com>.
  I did not do that, because the Kernel documentation requires a
  signed off by to go with it.
  Significant parts of the new patch are from your emails.
- I left the mutex_lock(&phydev->lock) in the
  vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
  is required to repeatedly access phydev->interface and
  phy_interface_is_rgmii(phydev) in a consistent way.
- For cases of not RGMII mode and not VSC8502 family there is no
  MDIO access. Same as with the current mainline code.

--

David Epping (3):
  net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
  net: phy: mscc: add support for VSC8501
  net: phy: mscc: enable VSC8501/2 RGMII RX clock

 drivers/net/phy/mscc/mscc.h      |  2 +
 drivers/net/phy/mscc/mscc_main.c | 80 +++++++++++++++++++++-----------
 2 files changed, 56 insertions(+), 26 deletions(-)


base-commit: 18c40a1cc1d990c51381ef48cd93fdb31d5cd903

Comments

Russell King (Oracle) May 23, 2023, 9:12 a.m. UTC | #1
On Tue, May 23, 2023 at 11:04:02AM +0200, David Epping wrote:
> Hello,
> 
> this updated series of patches adds support for the VSC8501 Ethernet
> PHY and fixes support for the VSC8502 PHY in cases where no other
> software (like U-Boot) has initialized the PHY after power up.
> 
> The first patch simply adds the VSC8502 to the MODULE_DEVICE_TABLE,
> where I guess it was unintentionally missing. I have no hardware to
> test my change.
> 
> The second patch adds the VSC8501 PHY with exactly the same driver
> implementation as the existing VSC8502.
> 
> The third patch fixes the initialization for VSC8501 and VSC8502.
> I have tested this patch with VSC8501 on hardware in RGMII mode only.
> https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8501-03_Datasheet_60001741A.PDF
> https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8502-03_Datasheet_60001742B.pdf
> Table 4-42 "RGMII CONTROL, ADDRESS 20E2 (0X14)" Bit 11 for each of
> them.
> By default the RX_CLK is disabled for these PHYs. In cases where no
> other software, like U-Boot, enabled the clock, this results in no
> received packets being handed to the MAC.
> The patch enables this clock output.
> According to Microchip support (case number 01268776) this applies
> to all modes (RGMII, GMII, and MII).
> 
> Other PHYs sharing the same register map and code, like
> VSC8530/31/40/41 have the clock enabled and the relevant bit 11 is
> reserved and read-only for them. As per previous discussion the
> patch still clears the bit on these PHYs, too, possibly more easily
> supporting other future PHYs implementing this functionality.
> 
> For the VSC8572 family of PHYs, having a different register map,
> no such changes are applied.
> 
> Thanks for your feedback,
> David
> 
> --
> 
> Changes in v2:
> - adjust cover letter (U-Boot, PHY families)
> - add reviewed-by tags to patch 1/3 and 2/3
> - patch 3/3: combine vsc85xx_rgmii_set_skews() and
>   vsc85xx_rgmii_enable_rx_clk() into vsc85xx_update_rgmii_cntl()
>   for fewer MDIO accesses
> - patch 3/3: treat all VSC8502 family PHYs the same (regardless of
>   bit 11 reserved status)
> 
> Additional notes for review:
> - If you want to, feel free to add something like
>   Co developed by Vladimir Oltean <olteanv@gmail.com>.
>   I did not do that, because the Kernel documentation requires a
>   signed off by to go with it.
>   Significant parts of the new patch are from your emails.
> - I left the mutex_lock(&phydev->lock) in the
>   vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
>   is required to repeatedly access phydev->interface and
>   phy_interface_is_rgmii(phydev) in a consistent way.

Nothing should change phydev->interface except:
1. the PHY driver in its ->read_status method when phylib has been
   started (via phy_start()).
2. phylib when the PHY is initially being attached.

The config_init methods are called during initial attachment and also
when the phy is being resumed, for neither of which phylib will be in
the "started" mode so (1) doesn't apply, and (2) doesn't apply because
phy_attach_direct() will have set ->interface prior to calling the
config_init method.

As far as a phy driver should be concerned, phydev->interface is
stable while it's being called.
Andrew Lunn May 23, 2023, 1:16 p.m. UTC | #2
> - I left the mutex_lock(&phydev->lock) in the
>   vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
>   is required to repeatedly access phydev->interface and
>   phy_interface_is_rgmii(phydev) in a consistent way.

Just adding to Russell comment.

As a general rule of thumb, if your driver is doing something which no
other driver is doing, you have to consider if it is correct. A PHY
driver taking phydev->lock is very unusual. So at minimum you should
be able to explain why it is needed. And when it comes to locking,
locking is hard, so you really should understand it.

Now the mscc is an odd device, because it has multiple PHYs in the
package, and a number of registers are shared between these PHYs. So
it does have different locking requirements to most PHYs. However, i
don't think that is involved here. Those oddities are hidden behind
phy_base_write() and phy_base_read().

	Andrew
David Epping May 23, 2023, 1:32 p.m. UTC | #3
On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote:
> > - I left the mutex_lock(&phydev->lock) in the
> >   vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
> >   is required to repeatedly access phydev->interface and
> >   phy_interface_is_rgmii(phydev) in a consistent way.
> 
> Just adding to Russell comment.
> 
> As a general rule of thumb, if your driver is doing something which no
> other driver is doing, you have to consider if it is correct. A PHY
> driver taking phydev->lock is very unusual. So at minimum you should
> be able to explain why it is needed. And when it comes to locking,
> locking is hard, so you really should understand it.
> 
> Now the mscc is an odd device, because it has multiple PHYs in the
> package, and a number of registers are shared between these PHYs. So
> it does have different locking requirements to most PHYs. However, i
> don't think that is involved here. Those oddities are hidden behind
> phy_base_write() and phy_base_read().
> 
> 	Andrew

Russell, Andrew,

as you stated, locking is hard, and I am not in detail familiar with
the mscc driver and the supported PHYs behavior. Also, I only have
VSC8501, the single PHY chip, and none of the multi PHY chips to test.
And testing these corner cases and race conditions is hard anyways.
Thus my current patch is not touching the locking code at all, and
assumes the current mainline code is correct in that regard.
Because I don't understand all implications, I'm hesitant to change
the existing locking scheme.
Maybe this can be a separate patch? In the current patch set I'm not
making the situation worse (unlike the first one where I added locks
which Russell pointed out).
If you insist and all agree the locks should be removed with this
patch set, I'll update it of course.
Russell King (Oracle) May 23, 2023, 2:32 p.m. UTC | #4
On Tue, May 23, 2023 at 03:32:36PM +0200, David Epping wrote:
> On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote:
> > > - I left the mutex_lock(&phydev->lock) in the
> > >   vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
> > >   is required to repeatedly access phydev->interface and
> > >   phy_interface_is_rgmii(phydev) in a consistent way.
> > 
> > Just adding to Russell comment.
> > 
> > As a general rule of thumb, if your driver is doing something which no
> > other driver is doing, you have to consider if it is correct. A PHY
> > driver taking phydev->lock is very unusual. So at minimum you should
> > be able to explain why it is needed. And when it comes to locking,
> > locking is hard, so you really should understand it.
> > 
> > Now the mscc is an odd device, because it has multiple PHYs in the
> > package, and a number of registers are shared between these PHYs. So
> > it does have different locking requirements to most PHYs. However, i
> > don't think that is involved here. Those oddities are hidden behind
> > phy_base_write() and phy_base_read().
> > 
> > 	Andrew
> 
> Russell, Andrew,
> 
> as you stated, locking is hard, and I am not in detail familiar with
> the mscc driver and the supported PHYs behavior. Also, I only have
> VSC8501, the single PHY chip, and none of the multi PHY chips to test.
> And testing these corner cases and race conditions is hard anyways.
> Thus my current patch is not touching the locking code at all, and
> assumes the current mainline code is correct in that regard.
> Because I don't understand all implications, I'm hesitant to change
> the existing locking scheme.
> Maybe this can be a separate patch? In the current patch set I'm not
> making the situation worse (unlike the first one where I added locks
> which Russell pointed out).
> If you insist and all agree the locks should be removed with this
> patch set, I'll update it of course.

Reading through this driver, IMHO it's clear that the original author
didn't have much idea about locking.

Your assumption that taking phydev->lock in vsc85xx_rgmii_set_skews()
protects phydev->interface is provably false, because:

static int vsc8584_config_init(struct phy_device *phydev)
{
...
        if (phy_interface_is_rgmii(phydev)) {
                ret = vsc85xx_rgmii_set_skews(phydev, VSC8572_RGMII_CNTL,
                                              VSC8572_RGMII_RX_DELAY_MASK,
                                              VSC8572_RGMII_TX_DELAY_MASK);

This accesses phydev->interface without holding phydev->lock,
before entering vsc85xx_rgmii_set_skews().

The second place that vsc85xx_rgmii_set_skews() is called from is
vsc85xx_default_config() which also accesses phydev->interface,
again without taking the phydev->lock.

So both paths into vsc85xx_rgmii_set_skews() have already read
phydev->interface without taking the lock. If this was what the
lock in vsc85xx_rgmii_set_skews() was protecting, then surely it
would need to protect those reads as well! It doesn't.

Also, with knowledge of phylib, I can say that this lock is
completely unnecessary when accessing phydev->interface in any
PHY driver .config_init method, which is the only place that
vsc85xx_rgmii_set_skews() is called from.


Having read the rest of the driver, it would appear that phydev->lock
is being abused to protect register accesses. This is evidenced by
the following, where I also set out why it's wrong:

vsc85xx_led_cntl_set()... which should be using phy_modify(), not
phy_read()..modify..phy_write(), which is open to races e.g. from
userspace MDIO access. phydev->lock doesn't solve anything there.

vsc85xx_edge_rate_cntl_set()... which correctly uses
phy_modify_paged() which itself will correctly prevent racy accesses
by taking the MDIO bus lock. It makes no accesses to anything else,
so phydev->lock here is entirely unnecessary.

vsc85xx_mac_if_set()... which is another case of racy access in the
same way as vsc85xx_led_cntl_set().

vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set()... both of
which IMHO show a complete misunderstanding for locking. At least
both of these functions are safe from other threads accessing the
bus because they correctly use phy_select_page()...phy_restore_page()
(which uses the MDIO bus lock to guarantee no other access will
happen.) BTW, I'm the author of phy_select_page()...phy_restore_page()
which were added to ensure that PHY drivers can _safely_ access
paged registers without fear of anything else disrupting accesses
to those paged registers, not even from userspace.

Essentially, taking phydev->lock offers absolutely zero protection
against another thread making accesses to the PHYs registers. The
*only* lock which prevents concurrent access to registers on devices
on a MDIO bus is the MDIO bus lock.

The only locking decision that I can see in this driver that is
correct is in phy_base_(read|write) which correctly demand that the
MDIO bus lock is held.


Oh my, things get even more "fun"...

vsc8574_config_pre_init() requires the MDIO bus lock to be held when
its called. This function uses request_firmware(), which can call out
 to userspace and then *block* waiting for userspace to respond. This
will block *all* access to any device on that MDIO bus until the
firmware request has been satisfied.

Sorry, but the locking in this driver is nothing but a mess.

I'm sorry that you're the one who's modifying the driver when we've
spotted this, but please can you add a patch which first removes
phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on
top please?

At least that starts to reduce the amount of broken locking in this
driver.
David Epping May 23, 2023, 2:57 p.m. UTC | #5
On Tue, May 23, 2023 at 03:32:06PM +0100, Russell King (Oracle) wrote:
> I'm sorry that you're the one who's modifying the driver when we've
> spotted this, but please can you add a patch which first removes
> phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on
> top please?

Sure, no problem, will do that. Thanks for your detailed post.