Message ID | 20250225112043.419189-2-maxime.chevallier@bootlin.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: sfp: Add single-byte SMBus SFP access | expand |
On Tue, Feb 25, 2025 at 12:20:39PM +0100, Maxime Chevallier wrote: > The SFP module's eeprom and internals are accessible through an i2c bus. > However, all the i2c transfers that are performed are SMBus-style > transfers for read and write operations. > > It is possible that the SFP might be connected to an SMBus-only > controller, such as the one found in some PHY devices in the VSC85xx > family. > > Introduce a set of sfp read/write ops that are going to be used if the > i2c bus is only capable of doing smbus byte accesses. > > As Single-byte SMBus transaction go against SFF-8472 and breaks the > atomicity for diagnostics data access, hwmon is disabled in the case > of SMBus access. > > Moreover, as this may cause other instabilities, print a warning at > probe time to indicate that the setup may be unreliable because of the > hardware design. > > Tested-by: Sean Anderson <sean.anderson@linux.dev> > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> > --- > > V2: - Added Sean's tested-by > - Added a warning indicating that operations won't be reliable, from > Russell and Andrew's reviews > - Also added a flag saying we're under a single-byte-access bus, to > both print the warning and disable hwmon. > > drivers/net/phy/sfp.c | 79 +++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 73 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c > index 9369f5297769..6e9d3d95eb95 100644 > --- a/drivers/net/phy/sfp.c > +++ b/drivers/net/phy/sfp.c > @@ -282,6 +282,7 @@ struct sfp { > unsigned int rs_state_mask; > > bool have_a2; > + bool single_byte_access; > > const struct sfp_quirk *quirk; > > @@ -690,14 +691,70 @@ static int sfp_i2c_write(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, > return ret == ARRAY_SIZE(msgs) ? len : 0; > } > > -static int sfp_i2c_configure(struct sfp *sfp, struct i2c_adapter *i2c) > +static int sfp_smbus_read(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, Maybe call this sfp_smbus_byte_read(), leaving space for sfp_smbus_word_read() in the future. > + size_t len) > { > - if (!i2c_check_functionality(i2c, I2C_FUNC_I2C)) > - return -EINVAL; > + u8 bus_addr = a2 ? 0x51 : 0x50; > + union i2c_smbus_data smbus_data; > + u8 *data = buf; > + int ret; > + > + while (len) { > + ret = i2c_smbus_xfer(sfp->i2c, bus_addr, 0, > + I2C_SMBUS_READ, dev_addr, > + I2C_SMBUS_BYTE_DATA, &smbus_data); > + if (ret < 0) > + return ret; Isn't this the wrong order? You should do the upper byte first, then the lower? Andrew --- pw-bot: cr
Hi Andrew, > > -static int sfp_i2c_configure(struct sfp *sfp, struct i2c_adapter *i2c) > > +static int sfp_smbus_read(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, > > Maybe call this sfp_smbus_byte_read(), leaving space for > sfp_smbus_word_read() in the future. Good idea, I'll do that :) > > + size_t len) > > { > > - if (!i2c_check_functionality(i2c, I2C_FUNC_I2C)) > > - return -EINVAL; > > + u8 bus_addr = a2 ? 0x51 : 0x50; > > + union i2c_smbus_data smbus_data; > > + u8 *data = buf; > > + int ret; > > + > > + while (len) { > > + ret = i2c_smbus_xfer(sfp->i2c, bus_addr, 0, > > + I2C_SMBUS_READ, dev_addr, > > + I2C_SMBUS_BYTE_DATA, &smbus_data); > > + if (ret < 0) > > + return ret; > > Isn't this the wrong order? You should do the upper byte first, then > the lower? You might be correct. As I have been running that code out-of-tree for a while, I was thinking that surely I'd have noticed if this was wrong, however there are only a few cases where we actually write to SFP : - sfp_modify_u8(...) => one-byte write - in sfp_cotsworks_fixup_check(...) there are 2 writes : one 1-byte write and a 3-bytes write. As I don't have any cotsworks SFP, then it looks like having the writes mis-ordered would have stayed un-noticed on my side as I only stressed the 1 byte write path... So, good catch :) Let me triple-check and see if I can find any conceivable way of testing that... Thanks, Maxime
> You might be correct. As I have been running that code out-of-tree for > a while, I was thinking that surely I'd have noticed if this was > wrong, however there are only a few cases where we actually write to > SFP : > > - sfp_modify_u8(...) => one-byte write > - in sfp_cotsworks_fixup_check(...) there are 2 writes : one 1-byte > write and a 3-bytes write. > > As I don't have any cotsworks SFP, then it looks like having the writes > mis-ordered would have stayed un-noticed on my side as I only > stressed the 1 byte write path... > > So, good catch :) Let me triple-check and see if I can find any > conceivable way of testing that... Read might be more important than write. This is particularly important for the second page containing the diagnostics, and dumped by ethtool -m. It could be the sensor values latch when you read the higher byte, so you can read the lower byte without worrying about it changing. This is why we don't want HWMON, if you can only do byte access. You might be able to test this with the temperature sensor. The value is in 1/256 degrees. So if you can get is going from 21 255/256C to 22 0/256C and see if you ever read 21 0/256 or 22 255/256C. Andrew
On Tue, Feb 25, 2025 at 12:20:39PM +0100, Maxime Chevallier wrote: > The SFP module's eeprom and internals are accessible through an i2c bus. > However, all the i2c transfers that are performed are SMBus-style > transfers for read and write operations. Note that there are SFPs that fail if you access them by byte - the 3FE46541AA locks the bus if you byte access the emulated EEPROM at 0x50, address 0x51. This is documented in sfp_sm_mod_probe(). So there's a very real reason for adding the warning - this module will not work!
On Tue, Feb 25, 2025 at 02:56:17PM +0100, Maxime Chevallier wrote: > > > + while (len) { > > > + ret = i2c_smbus_xfer(sfp->i2c, bus_addr, 0, > > > + I2C_SMBUS_READ, dev_addr, > > > + I2C_SMBUS_BYTE_DATA, &smbus_data); > > > + if (ret < 0) > > > + return ret; > > > > Isn't this the wrong order? You should do the upper byte first, then > > the lower? > > You might be correct. As I have been running that code out-of-tree for > a while, I was thinking that surely I'd have noticed if this was > wrong, however there are only a few cases where we actually write to > SFP : > > - sfp_modify_u8(...) => one-byte write > - in sfp_cotsworks_fixup_check(...) there are 2 writes : one 1-byte > write and a 3-bytes write. > > As I don't have any cotsworks SFP, then it looks like having the writes > mis-ordered would have stayed un-noticed on my side as I only > stressed the 1 byte write path... This Cotsworks module is not a SFP. It's a solder-on SFF module.
On Tue, 25 Feb 2025 18:04:29 +0000 "Russell King (Oracle)" <linux@armlinux.org.uk> wrote: > On Tue, Feb 25, 2025 at 12:20:39PM +0100, Maxime Chevallier wrote: > > The SFP module's eeprom and internals are accessible through an i2c bus. > > However, all the i2c transfers that are performed are SMBus-style > > transfers for read and write operations. > > Note that there are SFPs that fail if you access them by byte - the > 3FE46541AA locks the bus if you byte access the emulated EEPROM at > 0x50, address 0x51. This is documented in sfp_sm_mod_probe(). > > So there's a very real reason for adding the warning - this module > will not work! That's indeed pretty serious and since we can't even read the eeprom, there's not even a chance to have, say, a list of modules that will break with 1-byte io. So indeed the warning is needed... Thanks Maxime
On Tue, Feb 25, 2025 at 03:58:31PM +0100, Andrew Lunn wrote: > > You might be correct. As I have been running that code out-of-tree for > > a while, I was thinking that surely I'd have noticed if this was > > wrong, however there are only a few cases where we actually write to > > SFP : > > > > - sfp_modify_u8(...) => one-byte write > > - in sfp_cotsworks_fixup_check(...) there are 2 writes : one 1-byte > > write and a 3-bytes write. > > > > As I don't have any cotsworks SFP, then it looks like having the writes > > mis-ordered would have stayed un-noticed on my side as I only > > stressed the 1 byte write path... > > > > So, good catch :) Let me triple-check and see if I can find any > > conceivable way of testing that... > > Read might be more important than write. This is particularly > important for the second page containing the diagnostics, and dumped > by ethtool -m. It could be the sensor values latch when you read the > higher byte, so you can read the lower byte without worrying about it > changing. This is why we don't want HWMON, if you can only do byte > access. You might be able to test this with the temperature > sensor. The value is in 1/256 degrees. So if you can get is going from > 21 255/256C to 22 0/256C and see if you ever read 21 0/256 or 22 > 255/256C. <frustrated> Why don't we read SFF-8472 instead of testing module specific behaviour? Section 9.1 (Diagnostics overview) paragraphs 4 and 5 cover this. No, it's not latched when you read the high byte. Paragraph 4 states that multi-byte fields must be read using "a single two-byte read sequence across the 2-wire interface". Paragraph 5 states that "the transceiver shall not update a multi-byte field within the structure during the transfer of that multi-byte field to the host, such that partially updated data would be transferred to the host." In other words, while reading the the individual bytes of a multi-byte field, the value will remain stable _while the bus transaction which is required to be a multi-byte read is in progress_. So, when the STOP condition is signalled on the bus, the transceiver is then free to change the values. So accessing the high byte and low byte seperately does not guarantee to be coherent. It *might* work with some modules. It may not work with others. It might crash or lock the I2C bus with other modules. (I already know that at least one GPON module locks the bus with byte reads of 0xA0 EEPROM offset 0x51.) We've had this before. We have a byte-mode fallback in the SFP code, and we've had to be *very* careful when enabling this only for modules that need it.
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c index 9369f5297769..6e9d3d95eb95 100644 --- a/drivers/net/phy/sfp.c +++ b/drivers/net/phy/sfp.c @@ -282,6 +282,7 @@ struct sfp { unsigned int rs_state_mask; bool have_a2; + bool single_byte_access; const struct sfp_quirk *quirk; @@ -690,14 +691,70 @@ static int sfp_i2c_write(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, return ret == ARRAY_SIZE(msgs) ? len : 0; } -static int sfp_i2c_configure(struct sfp *sfp, struct i2c_adapter *i2c) +static int sfp_smbus_read(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, + size_t len) { - if (!i2c_check_functionality(i2c, I2C_FUNC_I2C)) - return -EINVAL; + u8 bus_addr = a2 ? 0x51 : 0x50; + union i2c_smbus_data smbus_data; + u8 *data = buf; + int ret; + + while (len) { + ret = i2c_smbus_xfer(sfp->i2c, bus_addr, 0, + I2C_SMBUS_READ, dev_addr, + I2C_SMBUS_BYTE_DATA, &smbus_data); + if (ret < 0) + return ret; + + *data = smbus_data.byte; + + len--; + data++; + dev_addr++; + } + + return data - (u8 *)buf; +} + +static int sfp_smbus_write(struct sfp *sfp, bool a2, u8 dev_addr, void *buf, + size_t len) +{ + u8 bus_addr = a2 ? 0x51 : 0x50; + union i2c_smbus_data smbus_data; + u8 *data = buf; + int ret; + while (len) { + smbus_data.byte = *data; + ret = i2c_smbus_xfer(sfp->i2c, bus_addr, 0, + I2C_SMBUS_WRITE, dev_addr, + I2C_SMBUS_BYTE_DATA, &smbus_data); + if (ret) + return ret; + + len--; + data++; + dev_addr++; + } + + return 0; +} + +static int sfp_i2c_configure(struct sfp *sfp, struct i2c_adapter *i2c) +{ sfp->i2c = i2c; - sfp->read = sfp_i2c_read; - sfp->write = sfp_i2c_write; + + if (i2c_check_functionality(i2c, I2C_FUNC_I2C)) { + sfp->read = sfp_i2c_read; + sfp->write = sfp_i2c_write; + } else if (i2c_check_functionality(i2c, I2C_FUNC_SMBUS_BYTE_DATA)) { + sfp->read = sfp_smbus_read; + sfp->write = sfp_smbus_write; + sfp->single_byte_access = true; + } else { + sfp->i2c = NULL; + return -EINVAL; + } return 0; } @@ -1628,7 +1685,8 @@ static void sfp_hwmon_probe(struct work_struct *work) static int sfp_hwmon_insert(struct sfp *sfp) { - if (sfp->have_a2 && sfp->id.ext.diagmon & SFP_DIAGMON_DDM) { + if (sfp->have_a2 && sfp->id.ext.diagmon & SFP_DIAGMON_DDM && + !sfp->single_byte_access) { mod_delayed_work(system_wq, &sfp->hwmon_probe, 1); sfp->hwmon_tries = R_PROBE_RETRY_SLOW; } @@ -3114,6 +3172,15 @@ static int sfp_probe(struct platform_device *pdev) if (!sfp->sfp_bus) return -ENOMEM; + if (sfp->single_byte_access) + dev_warn(sfp->dev, + "Please note:\n" + "This SFP cage is accessed via an SMBus only capable of single byte\n" + "transactions. Some features are disabled, other may be unreliable or\n" + "sporadically fail. Use with caution. There is nothing that the kernel\n" + "or community can do to fix it, the kernel will try best efforts. Please\n" + "verify any problems on hardware that supports multi-byte I2C transactions.\n"); + sfp_debugfs_init(sfp); return 0;