Message ID | 20230523090405.10655-1-david.epping@missinglinkelectronics.com (mailing list archive) |
---|---|
Headers | show |
Series | net: phy: mscc: support VSC8501 | expand |
On Tue, May 23, 2023 at 11:04:02AM +0200, David Epping wrote: > Hello, > > this updated series of patches adds support for the VSC8501 Ethernet > PHY and fixes support for the VSC8502 PHY in cases where no other > software (like U-Boot) has initialized the PHY after power up. > > The first patch simply adds the VSC8502 to the MODULE_DEVICE_TABLE, > where I guess it was unintentionally missing. I have no hardware to > test my change. > > The second patch adds the VSC8501 PHY with exactly the same driver > implementation as the existing VSC8502. > > The third patch fixes the initialization for VSC8501 and VSC8502. > I have tested this patch with VSC8501 on hardware in RGMII mode only. > https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8501-03_Datasheet_60001741A.PDF > https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/DataSheets/VSC8502-03_Datasheet_60001742B.pdf > Table 4-42 "RGMII CONTROL, ADDRESS 20E2 (0X14)" Bit 11 for each of > them. > By default the RX_CLK is disabled for these PHYs. In cases where no > other software, like U-Boot, enabled the clock, this results in no > received packets being handed to the MAC. > The patch enables this clock output. > According to Microchip support (case number 01268776) this applies > to all modes (RGMII, GMII, and MII). > > Other PHYs sharing the same register map and code, like > VSC8530/31/40/41 have the clock enabled and the relevant bit 11 is > reserved and read-only for them. As per previous discussion the > patch still clears the bit on these PHYs, too, possibly more easily > supporting other future PHYs implementing this functionality. > > For the VSC8572 family of PHYs, having a different register map, > no such changes are applied. > > Thanks for your feedback, > David > > -- > > Changes in v2: > - adjust cover letter (U-Boot, PHY families) > - add reviewed-by tags to patch 1/3 and 2/3 > - patch 3/3: combine vsc85xx_rgmii_set_skews() and > vsc85xx_rgmii_enable_rx_clk() into vsc85xx_update_rgmii_cntl() > for fewer MDIO accesses > - patch 3/3: treat all VSC8502 family PHYs the same (regardless of > bit 11 reserved status) > > Additional notes for review: > - If you want to, feel free to add something like > Co developed by Vladimir Oltean <olteanv@gmail.com>. > I did not do that, because the Kernel documentation requires a > signed off by to go with it. > Significant parts of the new patch are from your emails. > - I left the mutex_lock(&phydev->lock) in the > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it > is required to repeatedly access phydev->interface and > phy_interface_is_rgmii(phydev) in a consistent way. Nothing should change phydev->interface except: 1. the PHY driver in its ->read_status method when phylib has been started (via phy_start()). 2. phylib when the PHY is initially being attached. The config_init methods are called during initial attachment and also when the phy is being resumed, for neither of which phylib will be in the "started" mode so (1) doesn't apply, and (2) doesn't apply because phy_attach_direct() will have set ->interface prior to calling the config_init method. As far as a phy driver should be concerned, phydev->interface is stable while it's being called.
> - I left the mutex_lock(&phydev->lock) in the > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it > is required to repeatedly access phydev->interface and > phy_interface_is_rgmii(phydev) in a consistent way. Just adding to Russell comment. As a general rule of thumb, if your driver is doing something which no other driver is doing, you have to consider if it is correct. A PHY driver taking phydev->lock is very unusual. So at minimum you should be able to explain why it is needed. And when it comes to locking, locking is hard, so you really should understand it. Now the mscc is an odd device, because it has multiple PHYs in the package, and a number of registers are shared between these PHYs. So it does have different locking requirements to most PHYs. However, i don't think that is involved here. Those oddities are hidden behind phy_base_write() and phy_base_read(). Andrew
On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote: > > - I left the mutex_lock(&phydev->lock) in the > > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it > > is required to repeatedly access phydev->interface and > > phy_interface_is_rgmii(phydev) in a consistent way. > > Just adding to Russell comment. > > As a general rule of thumb, if your driver is doing something which no > other driver is doing, you have to consider if it is correct. A PHY > driver taking phydev->lock is very unusual. So at minimum you should > be able to explain why it is needed. And when it comes to locking, > locking is hard, so you really should understand it. > > Now the mscc is an odd device, because it has multiple PHYs in the > package, and a number of registers are shared between these PHYs. So > it does have different locking requirements to most PHYs. However, i > don't think that is involved here. Those oddities are hidden behind > phy_base_write() and phy_base_read(). > > Andrew Russell, Andrew, as you stated, locking is hard, and I am not in detail familiar with the mscc driver and the supported PHYs behavior. Also, I only have VSC8501, the single PHY chip, and none of the multi PHY chips to test. And testing these corner cases and race conditions is hard anyways. Thus my current patch is not touching the locking code at all, and assumes the current mainline code is correct in that regard. Because I don't understand all implications, I'm hesitant to change the existing locking scheme. Maybe this can be a separate patch? In the current patch set I'm not making the situation worse (unlike the first one where I added locks which Russell pointed out). If you insist and all agree the locks should be removed with this patch set, I'll update it of course.
On Tue, May 23, 2023 at 03:32:36PM +0200, David Epping wrote: > On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote: > > > - I left the mutex_lock(&phydev->lock) in the > > > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it > > > is required to repeatedly access phydev->interface and > > > phy_interface_is_rgmii(phydev) in a consistent way. > > > > Just adding to Russell comment. > > > > As a general rule of thumb, if your driver is doing something which no > > other driver is doing, you have to consider if it is correct. A PHY > > driver taking phydev->lock is very unusual. So at minimum you should > > be able to explain why it is needed. And when it comes to locking, > > locking is hard, so you really should understand it. > > > > Now the mscc is an odd device, because it has multiple PHYs in the > > package, and a number of registers are shared between these PHYs. So > > it does have different locking requirements to most PHYs. However, i > > don't think that is involved here. Those oddities are hidden behind > > phy_base_write() and phy_base_read(). > > > > Andrew > > Russell, Andrew, > > as you stated, locking is hard, and I am not in detail familiar with > the mscc driver and the supported PHYs behavior. Also, I only have > VSC8501, the single PHY chip, and none of the multi PHY chips to test. > And testing these corner cases and race conditions is hard anyways. > Thus my current patch is not touching the locking code at all, and > assumes the current mainline code is correct in that regard. > Because I don't understand all implications, I'm hesitant to change > the existing locking scheme. > Maybe this can be a separate patch? In the current patch set I'm not > making the situation worse (unlike the first one where I added locks > which Russell pointed out). > If you insist and all agree the locks should be removed with this > patch set, I'll update it of course. Reading through this driver, IMHO it's clear that the original author didn't have much idea about locking. Your assumption that taking phydev->lock in vsc85xx_rgmii_set_skews() protects phydev->interface is provably false, because: static int vsc8584_config_init(struct phy_device *phydev) { ... if (phy_interface_is_rgmii(phydev)) { ret = vsc85xx_rgmii_set_skews(phydev, VSC8572_RGMII_CNTL, VSC8572_RGMII_RX_DELAY_MASK, VSC8572_RGMII_TX_DELAY_MASK); This accesses phydev->interface without holding phydev->lock, before entering vsc85xx_rgmii_set_skews(). The second place that vsc85xx_rgmii_set_skews() is called from is vsc85xx_default_config() which also accesses phydev->interface, again without taking the phydev->lock. So both paths into vsc85xx_rgmii_set_skews() have already read phydev->interface without taking the lock. If this was what the lock in vsc85xx_rgmii_set_skews() was protecting, then surely it would need to protect those reads as well! It doesn't. Also, with knowledge of phylib, I can say that this lock is completely unnecessary when accessing phydev->interface in any PHY driver .config_init method, which is the only place that vsc85xx_rgmii_set_skews() is called from. Having read the rest of the driver, it would appear that phydev->lock is being abused to protect register accesses. This is evidenced by the following, where I also set out why it's wrong: vsc85xx_led_cntl_set()... which should be using phy_modify(), not phy_read()..modify..phy_write(), which is open to races e.g. from userspace MDIO access. phydev->lock doesn't solve anything there. vsc85xx_edge_rate_cntl_set()... which correctly uses phy_modify_paged() which itself will correctly prevent racy accesses by taking the MDIO bus lock. It makes no accesses to anything else, so phydev->lock here is entirely unnecessary. vsc85xx_mac_if_set()... which is another case of racy access in the same way as vsc85xx_led_cntl_set(). vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set()... both of which IMHO show a complete misunderstanding for locking. At least both of these functions are safe from other threads accessing the bus because they correctly use phy_select_page()...phy_restore_page() (which uses the MDIO bus lock to guarantee no other access will happen.) BTW, I'm the author of phy_select_page()...phy_restore_page() which were added to ensure that PHY drivers can _safely_ access paged registers without fear of anything else disrupting accesses to those paged registers, not even from userspace. Essentially, taking phydev->lock offers absolutely zero protection against another thread making accesses to the PHYs registers. The *only* lock which prevents concurrent access to registers on devices on a MDIO bus is the MDIO bus lock. The only locking decision that I can see in this driver that is correct is in phy_base_(read|write) which correctly demand that the MDIO bus lock is held. Oh my, things get even more "fun"... vsc8574_config_pre_init() requires the MDIO bus lock to be held when its called. This function uses request_firmware(), which can call out to userspace and then *block* waiting for userspace to respond. This will block *all* access to any device on that MDIO bus until the firmware request has been satisfied. Sorry, but the locking in this driver is nothing but a mess. I'm sorry that you're the one who's modifying the driver when we've spotted this, but please can you add a patch which first removes phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on top please? At least that starts to reduce the amount of broken locking in this driver.
On Tue, May 23, 2023 at 03:32:06PM +0100, Russell King (Oracle) wrote: > I'm sorry that you're the one who's modifying the driver when we've > spotted this, but please can you add a patch which first removes > phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on > top please? Sure, no problem, will do that. Thanks for your detailed post.