Message ID | a7904c43-01c7-4f9c-a1f9-e0a7ce2db532@ans.pl (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [mlx4] Mellanox ConnectX2 (MHQH29C aka 26428) and module diagnostic support (ethtool -m) issues | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
I see Tariq is OOO so I'm adding Gal who might be able to help with CX2/mlx4 issues. On Sat, Aug 31, 2024 at 11:28:03PM -0700, Krzysztof Olędzki wrote: > Hi, > > I noticed that module diagnostic on Mellanox ConnectX2 NIC (MHQH29Caka 26428 aka 15b3:673c, FW version 2.10.0720) behaves in somehow strange ways. > > 1. For SFP modules the driver is able to read the first page but not the 2nd one: > > [ 318.082923] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address > [ 318.082936] mlx4_en: eth1: mlx4_get_module_info i(0) offset(256) bytes_to_read(128) - FAILED (0xfffff8e4) I assume you are using a relatively recent ethtool with netlink support. It should only try to read from I2C address 0x51 if the module indicated support for diagnostics via bit 6 in byte 92. A few things worth checking: 1. mlx4 does not implement the modern get_module_eeprom_by_page() ethtool operation so what it gets invoked is the fallback path in eeprom_fallback(). Can you try to rule out problems in this path by compiling ethtool without netlink support (i.e., ./configure --disable-netlink) and retesting? I don't think it will make a difference, but worth trying. 2. Can you test this transceiver with a different NIC? 3. I'm wondering if this transceiver requires an "address change sequence" before accessing I2C address 0x51 (see SFF-8472 Section 8.9 Addressing Modes). The generic SFP driver doesn't support it (see sfp_module_parse_sff8472()) and other drivers probably don't support it as well. Can you look at an hexdump of page 0 and see if this bit is set? If so, maybe the correct thing to do would be to teach the SFF-8472 parser to look at both bit 2 and bit 6 before trying to access this I2C address. > > However, as the driver intentionally tries mask the problem [1], ethtool reports "Optical diagnostics support" being available and shows completely wrong information [2]. > > Removing the workaround allows ethtool to recognize the problem and handle everything correctly [3]: > ---- cut here ---- > --- a/drivers/net/ethernet/mellanox/mlx4/port.c 2024-07-27 02:34:11.000000000 -0700 > +++ b/drivers/net/ethernet/mellanox/mlx4/port.c 2024-08-31 21:57:11.211612505 -0700 > @@ -2197,14 +2197,7 @@ > 0xFF60, port, i2c_addr, offset, size, > ret, cable_info_mad_err_str(ret)); > > - if (i2c_addr == I2C_ADDR_HIGH && > - MAD_STATUS_2_CABLE_ERR(ret) == CABLE_INF_I2C_ADDR) > - /* Some SFP cables do not support i2c slave > - * address 0x51 (high page), abort silently. > - */ > - ret = 0; > - else > - ret = -ret; > + ret = -ret; > goto out; > } > cable_info = (struct mlx4_cable_info *)outm > ---- cut here ---- > > However, we end up with a strange "netlink error: Unknown error 1820" error because mlx4_get_module_info returns -0x71c (0x71c is 1820 in decimal). > > This can be fixed with returning -EIO instead of ret, either in mlx4_get_module_info() or perhaps better mlx4_en_get_module_eeprom() from en_ethtool.c: > ---- cut here ---- > --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-07-27 02:34:11.000000000 -0700 > +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-08-31 21:52:50.370553218 -0700 > @@ -2110,7 +2110,7 @@ > en_err(priv, > "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", > i, offset, ee->len - i, ret); > - return ret; > + return -EIO; > } > > i += ret; > ---- cut here ---- > > BTW: it is also possible to augment the error reporting in ethtool/sfpid.c: > ---- cut here ---- > - if (ret) > + if (ret) { > + fprintf(stderr, "Failed to read Page A2h.\n"); > goto out; > + } > ---- cut here ---- > With all the above changes, we now get: > > ---- cut here ---- > Identifier : 0x03 (SFP) > Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) > (...) > Date code : <REDACTED> > netlink error: Input/output error > Failed to read Page A2h. > ---- cut here ---- > > So, the first question is if above set of fixes makes sense, give that ethtool handles this correctly? If so, I'm happy to send the fixes. I believe it makes sense for the driver to return an error rather than mask the problem and return the wrong information (zeroes). > > The second question is if not being able to read Page A2h and "invalid I2C slave address" is a due to a bug in the driver or a HW (firmware?) limitation and if something can be done to address this? Let's see if it's related to the "address change sequence" I mentioned above. Maybe that's why the error masking was put in mlx4 in the first place. > > 2. For a QSFP module (which works in CX3/CX3Pro), handling "ethtool -m" seems to be completely broken. Given it works with CX3, then the problem is most likely with CX2 HW/FW. Gal, can you or someone from the team look into it? > > With QSFP module in port #2 (eth2), for the first attempt (ethtool -m eth2): > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(0) size(48): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > mlx4_en: eth2: mlx4_get_module_info i(0) offset(0) bytes_to_read(128) - FAILED (0xfffffbe4) > > However, if I first try run "ethtool -m eth1" with a SFP module installed in port #1, and then immediately "ethtool -m eth2", I end up getting the information for the SFP module: > # ethtool -m eth2 > Identifier : 0x03 (SFP) > Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) > (...) > > I this case, I even get the same "invalid I2C slave address" error: > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address > > If I immediately run "ethtool -m eth1" I get: > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(224) size(32): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) > mlx4_en: eth1: mlx4_get_module_info i(96) offset(224) bytes_to_read(32) - FAILED (0xfffff9e4) > > Alternatively, if I remove SFP module from port #1 and run "ethtool -m eth2", I get: > [ 1071.945737] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(2) i2c_addr(50) offset(0) size(1): Response Mad Status(31c) - cable is not connected > > At this point, running "ethtool -m eth1" produces one of: > > *) > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(2) i2c_addr(50) offset(0) size(1): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > > *) > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(128) size(48): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > mlx4_en: eth2: mlx4_get_module_info i(0) offset(128) bytes_to_read(128) - FAILED (0xfffffbe4) > > *) > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(2) i2c_addr(50) offset(0) size(1): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > > *) > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(0) size(48): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > mlx4_en: eth2: mlx4_get_module_info i(0) offset(0) bytes_to_read(128) - FAILED (0xfffffbe4) > > *) > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(2) i2c_addr(50) offset(0) size(1): Response Mad Status(41c) - the connected cable has no EPROM (passive copper cable) > > I wonder if in this situation we are communicating with a wrong device or returning some stale data from kernel memory or the firmware? > > Thanks, > Krzysztof > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/mellanox/mlx4/port.c#n2200 > > > [2] > Identifier : 0x03 (SFP) > Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) > (...) > Optical diagnostics support : Yes > Laser bias current : 0.000 mA > Laser output power : 0.0000 mW / -inf dBm > Receiver signal average optical power : 0.0000 mW / -inf dBm > Module temperature : 0.00 degrees C / 32.00 degrees F > Module voltage : 0.0000 V > Alarm/warning flags implemented : Yes > Laser bias current high alarm : Off > Laser bias current low alarm : Off > Laser bias current high warning : Off > Laser bias current low warning : Off > Laser output power high alarm : Off > Laser output power low alarm : Off > Laser output power high warning : Off > Laser output power low warning : Off > Module temperature high alarm : Off > Module temperature low alarm : Off > Module temperature high warning : Off > Module temperature low warning : Off > Module voltage high alarm : Off > Module voltage low alarm : Off > Module voltage high warning : Off > Module voltage low warning : Off > Laser rx power high alarm : Off > Laser rx power low alarm : Off > Laser rx power high warning : Off > Laser rx power low warning : Off > Laser bias current high alarm threshold : 0.000 mA > Laser bias current low alarm threshold : 0.000 mA > Laser bias current high warning threshold : 0.000 mA > Laser bias current low warning threshold : 0.000 mA > Laser output power high alarm threshold : 0.0000 mW / -inf dBm > Laser output power low alarm threshold : 0.0000 mW / -inf dBm > Laser output power high warning threshold : 0.0000 mW / -inf dBm > Laser output power low warning threshold : 0.0000 mW / -inf dBm > Module temperature high alarm threshold : 0.00 degrees C / 32.00 degrees F > Module temperature low alarm threshold : 0.00 degrees C / 32.00 degrees F > Module temperature high warning threshold : 0.00 degrees C / 32.00 degrees F > Module temperature low warning threshold : 0.00 degrees C / 32.00 degrees F > Module voltage high alarm threshold : 0.0000 V > Module voltage low alarm threshold : 0.0000 V > Module voltage high warning threshold : 0.0000 V > Module voltage low warning threshold : 0.0000 V > Laser rx power high alarm threshold : 0.0000 mW / -inf dBm > Laser rx power low alarm threshold : 0.0000 mW / -inf dBm > Laser rx power high warning threshold : 0.0000 mW / -inf dBm > Laser rx power low warning threshold : 0.0000 mW / -inf dBm > > [3] > # ethtool -m eth1 > Identifier : 0x03 (SFP) > Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) > Connector : 0x07 (LC) > Transceiver codes : 0x10 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 > Transceiver type : 10G Ethernet: 10G Base-SR > Encoding : 0x06 (64B/66B) > BR, Nominal : 10300MBd > Rate identifier : 0x00 (unspecified) > Length (SMF,km) : 0km > Length (SMF) : 0m > Length (50um) : 80m > Length (62.5um) : 30m > Length (Copper) : 0m > Length (OM3) : 300m > Laser wavelength : 850nm > Vendor name : IBM-Avago > Vendor OUI : <REDACTED> > Vendor PN : <REDACTED> > Vendor rev : G2.3 > Option values : 0x00 0x1a > Option : RX_LOS implemented > Option : TX_FAULT implemented > Option : TX_DISABLE implemented > BR margin, max : 0% > BR margin, min : 0% > Vendor SN : <REDACTED> > Date code : <REDACTED> > netlink error: Unknown error 1820 >
On 04/09/2024 16:00, Ido Schimmel wrote: >> 2. For a QSFP module (which works in CX3/CX3Pro), handling "ethtool -m" seems to be completely broken. > > Given it works with CX3, then the problem is most likely with CX2 HW/FW. > Gal, can you or someone from the team look into it? ConnectX-2 is End-of-Life since 2015 and End-of-Service since 2017..
On 04.09.2024 at 06:00, Ido Schimmel wrote: > I see Tariq is OOO so I'm adding Gal who might be able to help with > CX2/mlx4 issues. > > On Sat, Aug 31, 2024 at 11:28:03PM -0700, Krzysztof Olędzki wrote: >> Hi, >> >> I noticed that module diagnostic on Mellanox ConnectX2 NIC (MHQH29C aka 26428 aka 15b3:673c, FW version 2.10.0720) behaves in somehow strange ways. >> >> 1. For SFP modules the driver is able to read the first page but not the 2nd one: >> >> [ 318.082923] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address >> [ 318.082936] mlx4_en: eth1: mlx4_get_module_info i(0) offset(256) bytes_to_read(128) - FAILED (0xfffff8e4) > > I assume you are using a relatively recent ethtool with netlink support. Yes, sorry for not stating this explicitly: # ethtool --version ethtool version 6.10 > It should only try to read from I2C address 0x51 if the module indicated > support for diagnostics via bit 6 in byte 92. > > A few things worth checking: > > 1. mlx4 does not implement the modern get_module_eeprom_by_page() ethtool > operation so what it gets invoked is the fallback path in > eeprom_fallback(). Can you try to rule out problems in this path by > compiling ethtool without netlink support (i.e., ./configure > --disable-netlink) and retesting? I don't think it will make a > difference, but worth trying. Right... I should have thought about this. Interestingly, this makes things even worse: # ethtool -m eth2 Cannot get Module EEPROM data: Unknown error 1564 mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(240) size(16): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) mlx4_en: eth2: mlx4_get_module_info i(240) offset(240) bytes_to_read(272) - FAILED (0xfffff9e4) 1564 is mlx4_get_module_info() incorrectly returning -0x61c coming from "Response Mad Status(61c)"... Also.. I think this is 3rd time I'm recompiling ethtool without netlink support. Would it make sense to add add --disable-netlink? RFC quality patch: Subject: [PATCH ethtool] Add runtime support for disabling netlink Provide --disable-netlink option for disabling netlink during runtime, without the need to recompile the binary. Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> --- ethtool.8.in | 6 ++++++ ethtool.c | 6 ++++++ internal.h | 1 + netlink/netlink.c | 5 +++++ 4 files changed, 18 insertions(+) diff --git a/ethtool.8.in b/ethtool.8.in index 11bb0f9..0b54983 100644 --- a/ethtool.8.in +++ b/ethtool.8.in @@ -137,6 +137,9 @@ ethtool \- query or control network driver and hardware settings .BN --debug .I args .HP +.B ethtool [--disable-netlink] +.I args +.HP .B ethtool [--json] .I args .HP @@ -579,6 +582,9 @@ lB l. 0x10 Structure of netlink messages .TE .TP +.BI \-\-disable-netlink +Do not use netlink and fall back to the ioctl interface if possible. +.TP .BI \-\-json Output results in JavaScript Object Notation (JSON). Only a subset of options support this. Those which do not will continue to output diff --git a/ethtool.c b/ethtool.c index 7f47407..dc28069 100644 --- a/ethtool.c +++ b/ethtool.c @@ -6537,6 +6537,12 @@ int main(int argc, char **argp) argc -= 2; continue; } + if (*argp && !strcmp(*argp, "--disable-netlink")) { + ctx.nl_disable = true; + argp += 1; + argc -= 1; + continue; + } if (*argp && !strcmp(*argp, "--json")) { ctx.json = true; argp += 1; diff --git a/internal.h b/internal.h index 4b994f5..84c64be 100644 --- a/internal.h +++ b/internal.h @@ -221,6 +221,7 @@ struct cmd_context { char **argp; /* arguments to the sub-command */ unsigned long debug; /* debugging mask */ bool json; /* Output JSON, if supported */ + bool nl_disable; /* Disable netlink even if available */ bool show_stats; /* include command-specific stats */ #ifdef ETHTOOL_ENABLE_NETLINK struct nl_context *nlctx; /* netlink context (opaque) */ diff --git a/netlink/netlink.c b/netlink/netlink.c index ef0d825..3cf1710 100644 --- a/netlink/netlink.c +++ b/netlink/netlink.c @@ -470,6 +470,11 @@ void netlink_run_handler(struct cmd_context *ctx, nl_chk_t nlchk, const char *reason; int ret; + if (ctx->nl_disable) { + reason = "netlink disabled"; + goto no_support; + } + if (nlchk && !nlchk(ctx)) { reason = "ioctl-only request"; goto no_support;
On 04.09.2024 at 21:08, Krzysztof Olędzki wrote: > On 04.09.2024 at 06:00, Ido Schimmel wrote: >> I see Tariq is OOO so I'm adding Gal who might be able to help with >> CX2/mlx4 issues. >> >> On Sat, Aug 31, 2024 at 11:28:03PM -0700, Krzysztof Olędzki wrote: >>> Hi, >>> >>> I noticed that module diagnostic on Mellanox ConnectX2 NIC (MHQH29C aka 26428 aka 15b3:673c, FW version 2.10.0720) behaves in somehow strange ways. >>> >>> 1. For SFP modules the driver is able to read the first page but not the 2nd one: >>> >>> [ 318.082923] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address >>> [ 318.082936] mlx4_en: eth1: mlx4_get_module_info i(0) offset(256) bytes_to_read(128) - FAILED (0xfffff8e4) >> I assume you are using a relatively recent ethtool with netlink support. > Yes, sorry for not stating this explicitly: > > # ethtool --version > ethtool version 6.10 > >> It should only try to read from I2C address 0x51 if the module indicated >> support for diagnostics via bit 6 in byte 92. >> >> A few things worth checking: >> >> 1. mlx4 does not implement the modern get_module_eeprom_by_page() ethtool >> operation so what it gets invoked is the fallback path in >> eeprom_fallback(). Can you try to rule out problems in this path by >> compiling ethtool without netlink support (i.e., ./configure >> --disable-netlink) and retesting? I don't think it will make a >> difference, but worth trying. > Right... I should have thought about this. > > Interestingly, this makes things even worse: > > # ethtool -m eth2 > Cannot get Module EEPROM data: Unknown error 1564 > > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(240) size(16): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) > mlx4_en: eth2: mlx4_get_module_info i(240) offset(240) bytes_to_read(272) - FAILED (0xfffff9e4) > > 1564 is mlx4_get_module_info() incorrectly returning -0x61c coming from "Response Mad Status(61c)"... This BTW looks like another problem: # ethtool -m eth1 hex on offset 254 length 1 Offset Values ------ ------ 0x00fe: 00 # ethtool -m eth1 hex on offset 255 length 1 Cannot get Module EEPROM data: Unknown error 1564 mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(255) size(1): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) mlx4_en: eth1: mlx4_get_module_info i(0) offset(255) bytes_to_read(1) - FAILED (0xfffff9e4) With the netlink interface, ethtool seems to be only asking for for the first 128 bytes, which works: sending genetlink packet (76 bytes): msg length 76 ethool ETHTOOL_MSG_MODULE_EEPROM_GET ETHTOOL_MSG_MODULE_EEPROM_GET ETHTOOL_A_MODULE_EEPROM_HEADER ETHTOOL_A_HEADER_DEV_NAME = "eth1" ETHTOOL_A_MODULE_EEPROM_LENGTH = 128 ETHTOOL_A_MODULE_EEPROM_OFFSET = 0 ETHTOOL_A_MODULE_EEPROM_PAGE = 0 ETHTOOL_A_MODULE_EEPROM_BANK = 0 ETHTOOL_A_MODULE_EEPROM_I2C_ADDRESS = 81 For the ioctl one, it looks like we want all 512 bytes but reading fails when trying to read 16 bytes @240 because of the 255 octet issue. As the driver only masks the "invalid I2C slave address" error (CABLE_INF_I2C_ADDR) but not "invalid device_address or size" (CABLE_INF_INV_ADDR) it is able to successfully read 255 octets from A0h, and also to fake 256 from A2h: # ethtool -m eth1 hex on offset 256 Offset Values ------ ------ 0x0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 (again with "i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address") To compare, this is what we get from CX3 pro: Offset Values ------ ------ 0x0100: 55 00 fb 00 50 00 00 00 8c a0 75 30 87 28 7a 44 0x0110: 14 82 04 e2 14 82 04 e2 4e 20 04 ec 1e dc 0c 62 0x0120: 4e 20 01 3b 1e dc 01 3b 00 00 00 00 00 00 00 00 0x0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0140: 00 00 00 00 3f 80 00 00 00 00 00 00 01 00 00 00 0x0150: 01 00 00 00 01 00 00 00 01 00 00 00 00 00 00 3f 0x0160: 24 4e 7f 55 0e f6 16 c1 00 01 00 00 00 00 32 00 0x0170: 00 40 00 00 00 40 00 00 00 00 1c 00 00 00 00 00 0x0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Also no issue here: 0x00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > Also.. I think this is 3rd time I'm recompiling ethtool without netlink support. > Would it make sense to add add --disable-netlink? > > RFC quality patch: > > Subject: [PATCH ethtool] Add runtime support for disabling netlink > > Provide --disable-netlink option for disabling netlink during runtime, > without the need to recompile the binary. > > Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> > --- > ethtool.8.in | 6 ++++++ > ethtool.c | 6 ++++++ > internal.h | 1 + > netlink/netlink.c | 5 +++++ > 4 files changed, 18 insertions(+) > > diff --git a/ethtool.8.in b/ethtool.8.in > index 11bb0f9..0b54983 100644 > --- a/ethtool.8.in > +++ b/ethtool.8.in > @@ -137,6 +137,9 @@ ethtool \- query or control network driver and hardware settings > .BN --debug > .I args > .HP > +.B ethtool [--disable-netlink] > +.I args > +.HP > .B ethtool [--json] > .I args > .HP > @@ -579,6 +582,9 @@ lB l. > 0x10 Structure of netlink messages > .TE > .TP > +.BI \-\-disable-netlink > +Do not use netlink and fall back to the ioctl interface if possible. > +.TP > .BI \-\-json > Output results in JavaScript Object Notation (JSON). Only a subset of > options support this. Those which do not will continue to output > diff --git a/ethtool.c b/ethtool.c > index 7f47407..dc28069 100644 > --- a/ethtool.c > +++ b/ethtool.c > @@ -6537,6 +6537,12 @@ int main(int argc, char **argp) > argc -= 2; > continue; > } > + if (*argp && !strcmp(*argp, "--disable-netlink")) { > + ctx.nl_disable = true; > + argp += 1; > + argc -= 1; > + continue; > + } > if (*argp && !strcmp(*argp, "--json")) { > ctx.json = true; > argp += 1; > diff --git a/internal.h b/internal.h > index 4b994f5..84c64be 100644 > --- a/internal.h > +++ b/internal.h > @@ -221,6 +221,7 @@ struct cmd_context { > char **argp; /* arguments to the sub-command */ > unsigned long debug; /* debugging mask */ > bool json; /* Output JSON, if supported */ > + bool nl_disable; /* Disable netlink even if available */ > bool show_stats; /* include command-specific stats */ > #ifdef ETHTOOL_ENABLE_NETLINK > struct nl_context *nlctx; /* netlink context (opaque) */ > diff --git a/netlink/netlink.c b/netlink/netlink.c > index ef0d825..3cf1710 100644 > --- a/netlink/netlink.c > +++ b/netlink/netlink.c > @@ -470,6 +470,11 @@ void netlink_run_handler(struct cmd_context *ctx, nl_chk_t nlchk, > const char *reason; > int ret; > > + if (ctx->nl_disable) { > + reason = "netlink disabled"; > + goto no_support; > + } > + > if (nlchk && !nlchk(ctx)) { > reason = "ioctl-only request"; > goto no_support; >> 2. Can you test this transceiver with a different NIC? > Ah, yes. Sorry once again - I had done that already, and of course it works, > which is why I came here and explicitly blamed CX2. > > Here is the output from a CX3 Pro NIC: > > # ethtool -m eth2 > Identifier : 0x03 (SFP) > (...) > Optical diagnostics support : Yes > Laser bias current : 7.574 mA > Laser output power : 0.5815 mW / -2.35 dBm > Receiver signal average optical power : 0.0001 mW / -40.00 dBm > Module temperature : 32.13 degrees C / 89.83 degrees F > Module voltage : 3.2714 V > Alarm/warning flags implemented : Yes > Laser bias current high alarm : Off > Laser bias current low alarm : Off > Laser bias current high warning : Off > Laser bias current low warning : Off > Laser output power high alarm : Off > Laser output power low alarm : Off > Laser output power high warning : Off > Laser output power low warning : Off > Module temperature high alarm : Off > Module temperature low alarm : Off > Module temperature high warning : Off > Module temperature low warning : Off > Module voltage high alarm : Off > Module voltage low alarm : Off > Module voltage high warning : Off > Module voltage low warning : Off > Laser rx power high alarm : Off > Laser rx power low alarm : On > Laser rx power high warning : Off > Laser rx power low warning : On > Laser bias current high alarm threshold : 10.500 mA > Laser bias current low alarm threshold : 2.500 mA > Laser bias current high warning threshold : 10.500 mA > Laser bias current low warning threshold : 2.500 mA > Laser output power high alarm threshold : 2.0000 mW / 3.01 dBm > Laser output power low alarm threshold : 0.1260 mW / -9.00 dBm > Laser output power high warning threshold : 0.7900 mW / -1.02 dBm > Laser output power low warning threshold : 0.3170 mW / -4.99 dBm > Module temperature high alarm threshold : 85.00 degrees C / 185.00 degrees F > Module temperature low alarm threshold : -5.00 degrees C / 23.00 degrees F > Module temperature high warning threshold : 80.00 degrees C / 176.00 degrees F > Module temperature low warning threshold : 0.00 degrees C / 32.00 degrees F > Module voltage high alarm threshold : 3.6000 V > Module voltage low alarm threshold : 3.0000 V > Module voltage high warning threshold : 3.4600 V > Module voltage low warning threshold : 3.1300 V > Laser rx power high alarm threshold : 2.0000 mW / 3.01 dBm > Laser rx power low alarm threshold : 0.0315 mW / -15.02 dBm > Laser rx power high warning threshold : 0.7900 mW / -1.02 dBm > Laser rx power low warning threshold : 0.0315 mW / -15.02 dBm > > >> 3. I'm wondering if this transceiver requires an "address change >> sequence" before accessing I2C address 0x51 (see SFF-8472 Section 8.9 >> Addressing Modes). The generic SFP driver doesn't support it (see >> sfp_module_parse_sff8472()) and other drivers probably don't support it >> as well. Can you look at an hexdump of page 0 and see if this bit is >> set? If so, maybe the correct thing to do would be to teach the SFF-8472 >> parser to look at both bit 2 and bit 6 before trying to access this I2C >> address. > That would be byte 92, correct? > > # ethtool -m eth2 raw on offset 92 length 1|hexdump -C > 00000000 68 |h| > 00000001 > > 0x68 = 01101000b: > > - 6 Digital diagnostic monitoring implemented (described in this document). > - 5 Internally calibrated > - 3 Received power measurement type: 0 = OMA, 1 = average power > >>> However, as the driver intentionally tries mask the problem [1], ethtool reports "Optical diagnostics support" being available and shows completely wrong information [2]. >>> >>> Removing the workaround allows ethtool to recognize the problem and handle everything correctly [3]: >>> ---- cut here ---- >>> --- a/drivers/net/ethernet/mellanox/mlx4/port.c 2024-07-27 02:34:11.000000000 -0700 >>> +++ b/drivers/net/ethernet/mellanox/mlx4/port.c 2024-08-31 21:57:11.211612505 -0700 >>> @@ -2197,14 +2197,7 @@ >>> 0xFF60, port, i2c_addr, offset, size, >>> ret, cable_info_mad_err_str(ret)); >>> >>> - if (i2c_addr == I2C_ADDR_HIGH && >>> - MAD_STATUS_2_CABLE_ERR(ret) == CABLE_INF_I2C_ADDR) >>> - /* Some SFP cables do not support i2c slave >>> - * address 0x51 (high page), abort silently. >>> - */ >>> - ret = 0; >>> - else >>> - ret = -ret; >>> + ret = -ret; >>> goto out; >>> } >>> cable_info = (struct mlx4_cable_info *)outm >>> ---- cut here ---- >>> >>> However, we end up with a strange "netlink error: Unknown error 1820" error because mlx4_get_module_info returns -0x71c (0x71c is 1820 in decimal). >>> >>> This can be fixed with returning -EIO instead of ret, either in mlx4_get_module_info() or perhaps better mlx4_en_get_module_eeprom() from en_ethtool.c: >>> ---- cut here ---- >>> --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-07-27 02:34:11.000000000 -0700 >>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-08-31 21:52:50.370553218 -0700 >>> @@ -2110,7 +2110,7 @@ >>> en_err(priv, >>> "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", >>> i, offset, ee->len - i, ret); >>> - return ret; >>> + return -EIO; >>> } >>> >>> i += ret; >>> ---- cut here ---- >>> >>> BTW: it is also possible to augment the error reporting in ethtool/sfpid.c: >>> ---- cut here ---- >>> - if (ret) >>> + if (ret) { >>> + fprintf(stderr, "Failed to read Page A2h.\n"); >>> goto out; >>> + } >>> ---- cut here ---- >>> With all the above changes, we now get: >>> >>> ---- cut here ---- >>> Identifier : 0x03 (SFP) >>> Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) >>> (...) >>> Date code : <REDACTED> >>> netlink error: Input/output error >>> Failed to read Page A2h. >>> ---- cut here ---- >>> >>> So, the first question is if above set of fixes makes sense, give that ethtool handles this correctly? If so, I'm happy to send the fixes. >> I believe it makes sense for the driver to return an error rather than >> mask the problem and return the wrong information (zeroes). > Alright, will work on the patches. Thank you BTW for your very patient and > encouraging review and support the last time. Greatly appreciated. > >>> The second question is if not being able to read Page A2h and "invalid I2C slave address" is a due to a bug in the driver or a HW (firmware?) limitation and if something can be done to address this? >> Let's see if it's related to the "address change sequence" I mentioned >> above. Maybe that's why the error masking was put in mlx4 in the first >> place. > So it seems it is not, but please double check after me. > >>> 2. For a QSFP module (which works in CX3/CX3Pro), handling "ethtool -m" seems to be completely broken. >> Given it works with CX3, then the problem is most likely with CX2 HW/FW. >> Gal, can you or someone from the team look into it? > On 04.09.2024 at 08:09, Gal Pressman wrote: >>> ConnectX-2 is End-of-Life since 2015 and End-of-Service since 2017.. >>> > Yes, I am very familiar with these terms and aware of the EoL / EoS situation. > > However, the HW still works (and actually works very well), it is still > supported even by the most recent Linux kernels, and for a non-prod use > cases like mine (retro workstation) it may be hard to make an argument that > it should not be used. Especially that it supports everything I need, > including PXE booting with iPXE, and BTW - the fact that Mellanox even > provided sources allowing to build / tweak the mrom is truly unique and > amazing. > > That said, I totally get that *if* this is a FW issues, getting a new > version is rather unlikely, even it used to be a "best-in-class" > premium NIC... 15 years ago?
On 04.09.2024 at 21:47, Krzysztof Olędzki wrote: > On 04.09.2024 at 21:08, Krzysztof Olędzki wrote: >> On 04.09.2024 at 06:00, Ido Schimmel wrote: >>> I see Tariq is OOO so I'm adding Gal who might be able to help with >>> CX2/mlx4 issues. >>> >>> On Sat, Aug 31, 2024 at 11:28:03PM -0700, Krzysztof Olędzki wrote: >>>> Hi, >>>> >>>> I noticed that module diagnostic on Mellanox ConnectX2 NIC (MHQH29C aka 26428 aka 15b3:673c, FW version 2.10.0720) behaves in somehow strange ways. >>>> >>>> 1. For SFP modules the driver is able to read the first page but not the 2nd one: >>>> >>>> [ 318.082923] mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address >>>> [ 318.082936] mlx4_en: eth1: mlx4_get_module_info i(0) offset(256) bytes_to_read(128) - FAILED (0xfffff8e4) >>> I assume you are using a relatively recent ethtool with netlink support. >> Yes, sorry for not stating this explicitly: >> >> # ethtool --version >> ethtool version 6.10 >> >>> It should only try to read from I2C address 0x51 if the module indicated >>> support for diagnostics via bit 6 in byte 92. >>> >>> A few things worth checking: >>> >>> 1. mlx4 does not implement the modern get_module_eeprom_by_page() ethtool >>> operation so what it gets invoked is the fallback path in >>> eeprom_fallback(). Can you try to rule out problems in this path by >>> compiling ethtool without netlink support (i.e., ./configure >>> --disable-netlink) and retesting? I don't think it will make a >>> difference, but worth trying. >> Right... I should have thought about this. >> >> Interestingly, this makes things even worse: >> >> # ethtool -m eth2 >> Cannot get Module EEPROM data: Unknown error 1564 >> >> mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(2) i2c_addr(50) offset(240) size(16): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) >> mlx4_en: eth2: mlx4_get_module_info i(240) offset(240) bytes_to_read(272) - FAILED (0xfffff9e4) >> >> 1564 is mlx4_get_module_info() incorrectly returning -0x61c coming from "Response Mad Status(61c)"... And 240 comes from: #define MODULE_INFO_MAX_READ 48 mlx4_get_module_info() has: if (size > MODULE_INFO_MAX_READ) size = MODULE_INFO_MAX_READ; So, the reads are: 0.. 47 (48) 48.. 95 (48) 96..143 (48) 144..191 (48) 192..239 (48) 240..255 (16) <- this one fails > > This BTW looks like another problem: > > # ethtool -m eth1 hex on offset 254 length 1 > Offset Values > ------ ------ > 0x00fe: 00 > > # ethtool -m eth1 hex on offset 255 length 1 > Cannot get Module EEPROM data: Unknown error 1564 > > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(255) size(1): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) > mlx4_en: eth1: mlx4_get_module_info i(0) offset(255) bytes_to_read(1) - FAILED (0xfffff9e4) > > With the netlink interface, ethtool seems to be only asking for for the first 128 bytes, which works: > > sending genetlink packet (76 bytes): > msg length 76 ethool ETHTOOL_MSG_MODULE_EEPROM_GET > ETHTOOL_MSG_MODULE_EEPROM_GET > ETHTOOL_A_MODULE_EEPROM_HEADER > ETHTOOL_A_HEADER_DEV_NAME = "eth1" > ETHTOOL_A_MODULE_EEPROM_LENGTH = 128 > ETHTOOL_A_MODULE_EEPROM_OFFSET = 0 > ETHTOOL_A_MODULE_EEPROM_PAGE = 0 > ETHTOOL_A_MODULE_EEPROM_BANK = 0 > ETHTOOL_A_MODULE_EEPROM_I2C_ADDRESS = 81 > > For the ioctl one, it looks like we want all 512 bytes but reading fails when trying to read > 16 bytes @240 because of the 255 octet issue. As the driver only masks the "invalid I2C slave address" > error (CABLE_INF_I2C_ADDR) but not "invalid device_address or size" (CABLE_INF_INV_ADDR) it is > able to successfully read 255 octets from A0h, and also to fake 256 from A2h: > > # ethtool -m eth1 hex on offset 256 > Offset Values > ------ ------ > 0x0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > (again with "i2c_addr(51) offset(0) size(48): Response Mad Status(71c) - invalid I2C slave address") > > To compare, this is what we get from CX3 pro: > > Offset Values > ------ ------ > 0x0100: 55 00 fb 00 50 00 00 00 8c a0 75 30 87 28 7a 44 > 0x0110: 14 82 04 e2 14 82 04 e2 4e 20 04 ec 1e dc 0c 62 > 0x0120: 4e 20 01 3b 1e dc 01 3b 00 00 00 00 00 00 00 00 > 0x0130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0140: 00 00 00 00 3f 80 00 00 00 00 00 00 01 00 00 00 > 0x0150: 01 00 00 00 01 00 00 00 01 00 00 00 00 00 00 3f > 0x0160: 24 4e 7f 55 0e f6 16 c1 00 01 00 00 00 00 32 00 > 0x0170: 00 40 00 00 00 40 00 00 00 00 1c 00 00 00 00 00 > 0x0180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x01f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > Also no issue here: > 0x00f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >> Also.. I think this is 3rd time I'm recompiling ethtool without netlink support. >> Would it make sense to add add --disable-netlink? >> >> RFC quality patch: >> >> Subject: [PATCH ethtool] Add runtime support for disabling netlink >> >> Provide --disable-netlink option for disabling netlink during runtime, >> without the need to recompile the binary. >> >> Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> >> --- >> ethtool.8.in | 6 ++++++ >> ethtool.c | 6 ++++++ >> internal.h | 1 + >> netlink/netlink.c | 5 +++++ >> 4 files changed, 18 insertions(+) >> >> diff --git a/ethtool.8.in b/ethtool.8.in >> index 11bb0f9..0b54983 100644 >> --- a/ethtool.8.in >> +++ b/ethtool.8.in >> @@ -137,6 +137,9 @@ ethtool \- query or control network driver and hardware settings >> .BN --debug >> .I args >> .HP >> +.B ethtool [--disable-netlink] >> +.I args >> +.HP >> .B ethtool [--json] >> .I args >> .HP >> @@ -579,6 +582,9 @@ lB l. >> 0x10 Structure of netlink messages >> .TE >> .TP >> +.BI \-\-disable-netlink >> +Do not use netlink and fall back to the ioctl interface if possible. >> +.TP >> .BI \-\-json >> Output results in JavaScript Object Notation (JSON). Only a subset of >> options support this. Those which do not will continue to output >> diff --git a/ethtool.c b/ethtool.c >> index 7f47407..dc28069 100644 >> --- a/ethtool.c >> +++ b/ethtool.c >> @@ -6537,6 +6537,12 @@ int main(int argc, char **argp) >> argc -= 2; >> continue; >> } >> + if (*argp && !strcmp(*argp, "--disable-netlink")) { >> + ctx.nl_disable = true; >> + argp += 1; >> + argc -= 1; >> + continue; >> + } >> if (*argp && !strcmp(*argp, "--json")) { >> ctx.json = true; >> argp += 1; >> diff --git a/internal.h b/internal.h >> index 4b994f5..84c64be 100644 >> --- a/internal.h >> +++ b/internal.h >> @@ -221,6 +221,7 @@ struct cmd_context { >> char **argp; /* arguments to the sub-command */ >> unsigned long debug; /* debugging mask */ >> bool json; /* Output JSON, if supported */ >> + bool nl_disable; /* Disable netlink even if available */ >> bool show_stats; /* include command-specific stats */ >> #ifdef ETHTOOL_ENABLE_NETLINK >> struct nl_context *nlctx; /* netlink context (opaque) */ >> diff --git a/netlink/netlink.c b/netlink/netlink.c >> index ef0d825..3cf1710 100644 >> --- a/netlink/netlink.c >> +++ b/netlink/netlink.c >> @@ -470,6 +470,11 @@ void netlink_run_handler(struct cmd_context *ctx, nl_chk_t nlchk, >> const char *reason; >> int ret; >> >> + if (ctx->nl_disable) { >> + reason = "netlink disabled"; >> + goto no_support; >> + } >> + >> if (nlchk && !nlchk(ctx)) { >> reason = "ioctl-only request"; >> goto no_support; > >>> 2. Can you test this transceiver with a different NIC? >> Ah, yes. Sorry once again - I had done that already, and of course it works, >> which is why I came here and explicitly blamed CX2. >> >> Here is the output from a CX3 Pro NIC: >> >> # ethtool -m eth2 >> Identifier : 0x03 (SFP) >> (...) >> Optical diagnostics support : Yes >> Laser bias current : 7.574 mA >> Laser output power : 0.5815 mW / -2.35 dBm >> Receiver signal average optical power : 0.0001 mW / -40.00 dBm >> Module temperature : 32.13 degrees C / 89.83 degrees F >> Module voltage : 3.2714 V >> Alarm/warning flags implemented : Yes >> Laser bias current high alarm : Off >> Laser bias current low alarm : Off >> Laser bias current high warning : Off >> Laser bias current low warning : Off >> Laser output power high alarm : Off >> Laser output power low alarm : Off >> Laser output power high warning : Off >> Laser output power low warning : Off >> Module temperature high alarm : Off >> Module temperature low alarm : Off >> Module temperature high warning : Off >> Module temperature low warning : Off >> Module voltage high alarm : Off >> Module voltage low alarm : Off >> Module voltage high warning : Off >> Module voltage low warning : Off >> Laser rx power high alarm : Off >> Laser rx power low alarm : On >> Laser rx power high warning : Off >> Laser rx power low warning : On >> Laser bias current high alarm threshold : 10.500 mA >> Laser bias current low alarm threshold : 2.500 mA >> Laser bias current high warning threshold : 10.500 mA >> Laser bias current low warning threshold : 2.500 mA >> Laser output power high alarm threshold : 2.0000 mW / 3.01 dBm >> Laser output power low alarm threshold : 0.1260 mW / -9.00 dBm >> Laser output power high warning threshold : 0.7900 mW / -1.02 dBm >> Laser output power low warning threshold : 0.3170 mW / -4.99 dBm >> Module temperature high alarm threshold : 85.00 degrees C / 185.00 degrees F >> Module temperature low alarm threshold : -5.00 degrees C / 23.00 degrees F >> Module temperature high warning threshold : 80.00 degrees C / 176.00 degrees F >> Module temperature low warning threshold : 0.00 degrees C / 32.00 degrees F >> Module voltage high alarm threshold : 3.6000 V >> Module voltage low alarm threshold : 3.0000 V >> Module voltage high warning threshold : 3.4600 V >> Module voltage low warning threshold : 3.1300 V >> Laser rx power high alarm threshold : 2.0000 mW / 3.01 dBm >> Laser rx power low alarm threshold : 0.0315 mW / -15.02 dBm >> Laser rx power high warning threshold : 0.7900 mW / -1.02 dBm >> Laser rx power low warning threshold : 0.0315 mW / -15.02 dBm >> >> >>> 3. I'm wondering if this transceiver requires an "address change >>> sequence" before accessing I2C address 0x51 (see SFF-8472 Section 8.9 >>> Addressing Modes). The generic SFP driver doesn't support it (see >>> sfp_module_parse_sff8472()) and other drivers probably don't support it >>> as well. Can you look at an hexdump of page 0 and see if this bit is >>> set? If so, maybe the correct thing to do would be to teach the SFF-8472 >>> parser to look at both bit 2 and bit 6 before trying to access this I2C >>> address. >> That would be byte 92, correct? >> >> # ethtool -m eth2 raw on offset 92 length 1|hexdump -C >> 00000000 68 |h| >> 00000001 >> >> 0x68 = 01101000b: >> >> - 6 Digital diagnostic monitoring implemented (described in this document). >> - 5 Internally calibrated >> - 3 Received power measurement type: 0 = OMA, 1 = average power >> >>>> However, as the driver intentionally tries mask the problem [1], ethtool reports "Optical diagnostics support" being available and shows completely wrong information [2]. >>>> >>>> Removing the workaround allows ethtool to recognize the problem and handle everything correctly [3]: >>>> ---- cut here ---- >>>> --- a/drivers/net/ethernet/mellanox/mlx4/port.c 2024-07-27 02:34:11.000000000 -0700 >>>> +++ b/drivers/net/ethernet/mellanox/mlx4/port.c 2024-08-31 21:57:11.211612505 -0700 >>>> @@ -2197,14 +2197,7 @@ >>>> 0xFF60, port, i2c_addr, offset, size, >>>> ret, cable_info_mad_err_str(ret)); >>>> >>>> - if (i2c_addr == I2C_ADDR_HIGH && >>>> - MAD_STATUS_2_CABLE_ERR(ret) == CABLE_INF_I2C_ADDR) >>>> - /* Some SFP cables do not support i2c slave >>>> - * address 0x51 (high page), abort silently. >>>> - */ >>>> - ret = 0; >>>> - else >>>> - ret = -ret; >>>> + ret = -ret; >>>> goto out; >>>> } >>>> cable_info = (struct mlx4_cable_info *)outm >>>> ---- cut here ---- >>>> >>>> However, we end up with a strange "netlink error: Unknown error 1820" error because mlx4_get_module_info returns -0x71c (0x71c is 1820 in decimal). >>>> >>>> This can be fixed with returning -EIO instead of ret, either in mlx4_get_module_info() or perhaps better mlx4_en_get_module_eeprom() from en_ethtool.c: >>>> ---- cut here ---- >>>> --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-07-27 02:34:11.000000000 -0700 >>>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-08-31 21:52:50.370553218 -0700 >>>> @@ -2110,7 +2110,7 @@ >>>> en_err(priv, >>>> "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", >>>> i, offset, ee->len - i, ret); >>>> - return ret; >>>> + return -EIO; >>>> } >>>> >>>> i += ret; >>>> ---- cut here ---- >>>> >>>> BTW: it is also possible to augment the error reporting in ethtool/sfpid.c: >>>> ---- cut here ---- >>>> - if (ret) >>>> + if (ret) { >>>> + fprintf(stderr, "Failed to read Page A2h.\n"); >>>> goto out; >>>> + } >>>> ---- cut here ---- >>>> With all the above changes, we now get: >>>> >>>> ---- cut here ---- >>>> Identifier : 0x03 (SFP) >>>> Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) >>>> (...) >>>> Date code : <REDACTED> >>>> netlink error: Input/output error >>>> Failed to read Page A2h. >>>> ---- cut here ---- >>>> >>>> So, the first question is if above set of fixes makes sense, give that ethtool handles this correctly? If so, I'm happy to send the fixes. >>> I believe it makes sense for the driver to return an error rather than >>> mask the problem and return the wrong information (zeroes). >> Alright, will work on the patches. Thank you BTW for your very patient and >> encouraging review and support the last time. Greatly appreciated. >> >>>> The second question is if not being able to read Page A2h and "invalid I2C slave address" is a due to a bug in the driver or a HW (firmware?) limitation and if something can be done to address this? >>> Let's see if it's related to the "address change sequence" I mentioned >>> above. Maybe that's why the error masking was put in mlx4 in the first >>> place. >> So it seems it is not, but please double check after me. >> >>>> 2. For a QSFP module (which works in CX3/CX3Pro), handling "ethtool -m" seems to be completely broken. >>> Given it works with CX3, then the problem is most likely with CX2 HW/FW. >>> Gal, can you or someone from the team look into it? >> On 04.09.2024 at 08:09, Gal Pressman wrote: >>>> ConnectX-2 is End-of-Life since 2015 and End-of-Service since 2017.. >>>> >> Yes, I am very familiar with these terms and aware of the EoL / EoS situation. >> >> However, the HW still works (and actually works very well), it is still >> supported even by the most recent Linux kernels, and for a non-prod use >> cases like mine (retro workstation) it may be hard to make an argument that >> it should not be used. Especially that it supports everything I need, >> including PXE booting with iPXE, and BTW - the fact that Mellanox even >> provided sources allowing to build / tweak the mrom is truly unique and >> amazing. >> >> That said, I totally get that *if* this is a FW issues, getting a new >> version is rather unlikely, even it used to be a "best-in-class" >> premium NIC... 15 years ago?
> Also.. I think this is 3rd time I'm recompiling ethtool without netlink support. > Would it make sense to add add --disable-netlink? Looking at the code, it should be trivial to add. I expect such a patch would be accepted. Andrew
On Wed, Sep 04, 2024 at 09:47:04PM -0700, Krzysztof Olędzki wrote: > This BTW looks like another problem: > > # ethtool -m eth1 hex on offset 254 length 1 > Offset Values > ------ ------ > 0x00fe: 00 > > # ethtool -m eth1 hex on offset 255 length 1 > Cannot get Module EEPROM data: Unknown error 1564 > > mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(255) size(1): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) > mlx4_en: eth1: mlx4_get_module_info i(0) offset(255) bytes_to_read(1) - FAILED (0xfffff9e4) > > With the netlink interface, ethtool seems to be only asking for for the first 128 bytes, which works: Yes. The upper 128 bytes are reserved so sff8079_show_all_nl() doesn't bother querying them. Explains why you don't see this error with netlink. Regarding the runtime "--disable-netlink" patch, I personally don't mind and Andrew seems in favor, so please post a proper patch and lets see what Michal says. Regarding the patch that unmasks the I2C address error, I would target it at net-next as it doesn't really fix a bug (ethtool already displays what it can). Thinking about it, I believe it would be more worthwhile to implement the much simpler get_module_eeprom_by_page() ethtool operation in mlx4 (I can help with the review). It would've helped avoiding the current issue (kernel will return an error) and the previous bug [1] you encountered with the legacy operations. Regarding the fact that these modules work properly with CX3, but not with CX2 (which uses the same driver), it really seems like a HW/FW problem and unfortunately I can't help with that. [1] https://lore.kernel.org/all/b17c5336-6dc3-41f2-afa6-f9e79231f224@ans.pl/
Good morning Ido, On 05.09.2024 at 09:23, Ido Schimmel wrote: > On Wed, Sep 04, 2024 at 09:47:04PM -0700, Krzysztof Olędzki wrote: >> This BTW looks like another problem: >> >> # ethtool -m eth1 hex on offset 254 length 1 >> Offset Values >> ------ ------ >> 0x00fe: 00 >> >> # ethtool -m eth1 hex on offset 255 length 1 >> Cannot get Module EEPROM data: Unknown error 1564 >> >> mlx4_core 0000:01:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(255) size(1): Response Mad Status(61c) - invalid device_address or size (that is, size equals 0 or address+size is greater than 256) >> mlx4_en: eth1: mlx4_get_module_info i(0) offset(255) bytes_to_read(1) - FAILED (0xfffff9e4) >> >> With the netlink interface, ethtool seems to be only asking for for the first 128 bytes, which works: > > Yes. The upper 128 bytes are reserved so sff8079_show_all_nl() doesn't > bother querying them. Explains why you don't see this error with > netlink. > > Regarding the runtime "--disable-netlink" patch, I personally don't mind > and Andrew seems in favor, so please post a proper patch and lets see > what Michal says. Thanks. Will send it, let's see how it goes. > Regarding the patch that unmasks the I2C address error, I would target > it at net-next as it doesn't really fix a bug (ethtool already displays > what it can). Well, I would argue that it does fix a bug to some extent, because otherwise "ethtool -m" shows wrong DOM data, instead of just not showing it at all. Also, judging from the date of the commit [1], I would say it is likely that the workaround was actually added to address the limitation of the firmware that was there at that time. That said, definitely a net-next target, same for the other three patches that I just sent. > Thinking about it, I believe it would be more worthwhile > to implement the much simpler get_module_eeprom_by_page() ethtool > operation in mlx4 (I can help with the review). It would've helped > avoiding the current issue (kernel will return an error) and the > previous bug [1] you encountered with the legacy operations. Sure, I can also try to work on that one. Would mlx5/core/en_ethtool.c be a good example of how this should be implemented? > Regarding the fact that these modules work properly with CX3, but not > with CX2 (which uses the same driver), it really seems like a HW/FW > problem and unfortunately I can't help with that. Yes, understood. Initially I was hoping for this to be either a bug in the driver, or something that could be quirked there, but after spending the last week playing with both the kernel and the firmware, I am convinced this it is a firmware issue (at least for the "offset(255)" and the "A2h" problems) and both could be only fixed there. At this point, I cannot count how many times I used mlxburn/flint and the "reset" button after crashing either the NIC FW or the kernel, but I think I now understand where the problem is and even have some ideas how to fix... For the "offset(255)" issue, I am confident this is an off-by-one error, (address+size is compared to 255 instead of 256) that most likely never got fixed for the CX2 firmware, at least not in the ones that are available in public. :( I was able to reproduce almost exactly the same behavior on CX3 flashing a very, very ancient version. For the "A2h" issue, it looks like a FW limitation as the old (CX2) version of firmware only allows to access 0x50 i2c address, instead of 0x50 (for A0h) *and* 0x51 for A02h. I was also able to reproduce it with the ancient CX3 FW. The QSFP error is a big mystery that most likely will never be solved (at least not by me), as I don't have the knowledge nor the documentation to make any progress there and after all, CX2 is technically a retro-hardware at this point: "ConnectX-2 is End-of-Life since 2015 and End-of-Service since 2017" says it all and is 100% spot on. Still, fun to use! ;) [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=32a173c7f9e9ec2b87142f67e1478cd20084a45b Thanks, Krzysztof
On Wed, Sep 11, 2024 at 11:46:11PM -0700, Krzysztof Olędzki wrote: > Sure, I can also try to work on that one. Would mlx5/core/en_ethtool.c > be a good example of how this should be implemented? Yes. You can also look at mlxsw_env_get_module_eeprom_by_page() which is using the same firmware interface as mlx5 (but not mlx4). Basically, this operation is very simple as far as the driver is concerned. You get a "3D address" (bank + page + offset) from user space and ask the device to fetch the required info. In the case of mlx4 bank is irrelevant so you can always return an error for "bank != 0". For the rest of the fields in 'struct ethtool_module_eeprom' the mapping to 'struct mlx4_cable_info' is as follows (AFAIU): struct ethtool_module_eeprom::offset -> struct mlx4_cable_info::dev_mem_address struct ethtool_module_eeprom::length -> struct mlx4_cable_info::size struct ethtool_module_eeprom::page -> struct mlx4_cable_info::page_num struct ethtool_module_eeprom::i2c_address -> struct mlx4_cable_info::i2c_addr Note that mlx4 firmware can only read up to 48 bytes (MODULE_INFO_MAX_READ), so you need to implement a loop like the one in mlx5e_get_module_eeprom_by_page(). Also note that extack is available so try to use it to communicate failures instead of printing to the kernel log.
--- a/drivers/net/ethernet/mellanox/mlx4/port.c 2024-07-27 02:34:11.000000000 -0700 +++ b/drivers/net/ethernet/mellanox/mlx4/port.c 2024-08-31 21:57:11.211612505 -0700 @@ -2197,14 +2197,7 @@ 0xFF60, port, i2c_addr, offset, size, ret, cable_info_mad_err_str(ret)); - if (i2c_addr == I2C_ADDR_HIGH && - MAD_STATUS_2_CABLE_ERR(ret) == CABLE_INF_I2C_ADDR) - /* Some SFP cables do not support i2c slave - * address 0x51 (high page), abort silently. - */ - ret = 0; - else - ret = -ret; + ret = -ret; goto out; } cable_info = (struct mlx4_cable_info *)outm ---- cut here ---- However, we end up with a strange "netlink error: Unknown error 1820" error because mlx4_get_module_info returns -0x71c (0x71c is 1820 in decimal). This can be fixed with returning -EIO instead of ret, either in mlx4_get_module_info() or perhaps better mlx4_en_get_module_eeprom() from en_ethtool.c: ---- cut here ---- --- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-07-27 02:34:11.000000000 -0700 +++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c 2024-08-31 21:52:50.370553218 -0700 @@ -2110,7 +2110,7 @@ en_err(priv, "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", i, offset, ee->len - i, ret); - return ret; + return -EIO; } i += ret;