Message ID | f3e07026-8219-4b36-b230-7f7ddd71c7ab@gmail.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | r8169: extend hwmon support | expand |
On Mon, Jan 06, 2025 at 07:05:13PM +0100, Heiner Kallweit wrote: > Add support for reading the over-temp threshold. If the chip temperature > exceeds this value, the chip will reduce the speed to 1Gbps (by disabling > 2.5G/5G advertisement and triggering a renegotiation). I'm assuming here that the over-temp threshold always exists when the temp_in sensors exists? If so: Reviewed-by: Andrew Lunn <andrew@lunn.ch> Does it reduce the speed in the same way as downshift? Can the user tell it has happened, other than networking is slower? Andrew
On 06.01.2025 22:15, Andrew Lunn wrote: > On Mon, Jan 06, 2025 at 07:05:13PM +0100, Heiner Kallweit wrote: >> Add support for reading the over-temp threshold. If the chip temperature >> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling >> 2.5G/5G advertisement and triggering a renegotiation). > > I'm assuming here that the over-temp threshold always exists when the > temp_in sensors exists? If so: > Right > Reviewed-by: Andrew Lunn <andrew@lunn.ch> > > Does it reduce the speed in the same way as downshift? Can the user > tell it has happened, other than networking is slower? > It internally disables 2.5G/5G advertisement and triggers an autoneg. So you get the usual message on console/dmesg indicating the new speed. This internal action sets a register bit, and it can also trigger an interrupt. So it should be possible to check in the link_change_notify() callback whether an over-temp event occurred. The silent change of the advertisement may also result in the phylib-cached advertisement being out-of-sync. So we would have to re-sync it. But I didn't fully test this yet. This patch only allows to read the over-temp threshold set as power-on default or by the boot loader. It doesn't change the existing behavior. > Andrew Heiner
> > Does it reduce the speed in the same way as downshift? Can the user > > tell it has happened, other than networking is slower? > > > It internally disables 2.5G/5G advertisement and triggers an autoneg. > So you get the usual message on console/dmesg indicating the new speed. > This internal action sets a register bit, and it can also trigger an interrupt. > So it should be possible to check in the link_change_notify() callback > whether an over-temp event occurred. The silent change of the advertisement > may also result in the phylib-cached advertisement being out-of-sync. > So we would have to re-sync it. But I didn't fully test this yet. > > This patch only allows to read the over-temp threshold set as power-on default > or by the boot loader. It doesn't change the existing behavior. Thanks for the details. So it does seem to be different to downshift, where generally advertised link modes in the registers is not changed, and the speed indicated in BMSR generally does not indicate the downshifted speed, you need vendor registers to get the actual speed. But downshift happens at link up, not latter when the device starts to overheat. Does it restore advertisement to 2.5G/5G when it cools down? There is an interesting policy decision here. Do you want 1s downtime very so often as it speeds up and slows down, or should it keep at the slower speed until the user kicks it back up to the higher speed? Andrew
On Mon, 6 Jan 2025 19:05:13 +0100 Heiner Kallweit wrote: > Add support for reading the over-temp threshold. If the chip temperature > exceeds this value, the chip will reduce the speed to 1Gbps (by disabling > 2.5G/5G advertisement and triggering a renegotiation). If there is a v2 -- please make sure hwmon folks are CCed. Looks like get_maintainers doesn't flag it, but since we're charting a new territory for networking a broader audience may be quite useful.
On 07.01.2025 00:16, Andrew Lunn wrote: >>> Does it reduce the speed in the same way as downshift? Can the user >>> tell it has happened, other than networking is slower? >>> >> It internally disables 2.5G/5G advertisement and triggers an autoneg. >> So you get the usual message on console/dmesg indicating the new speed. >> This internal action sets a register bit, and it can also trigger an interrupt. >> So it should be possible to check in the link_change_notify() callback >> whether an over-temp event occurred. The silent change of the advertisement >> may also result in the phylib-cached advertisement being out-of-sync. >> So we would have to re-sync it. But I didn't fully test this yet. >> >> This patch only allows to read the over-temp threshold set as power-on default >> or by the boot loader. It doesn't change the existing behavior. > > Thanks for the details. So it does seem to be different to downshift, > where generally advertised link modes in the registers is not changed, > and the speed indicated in BMSR generally does not indicate the > downshifted speed, you need vendor registers to get the actual > speed. But downshift happens at link up, not latter when the device > starts to overheat. > > Does it restore advertisement to 2.5G/5G when it cools down? There is > an interesting policy decision here. Do you want 1s downtime very so > often as it speeds up and slows down, or should it keep at the slower > speed until the user kicks it back up to the higher speed? > I will do more testing and provide an update. > Andrew > Heiner
On 07.01.2025 00:30, Jakub Kicinski wrote: > On Mon, 6 Jan 2025 19:05:13 +0100 Heiner Kallweit wrote: >> Add support for reading the over-temp threshold. If the chip temperature >> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling >> 2.5G/5G advertisement and triggering a renegotiation). > > If there is a v2 -- please make sure hwmon folks are CCed. > Looks like get_maintainers doesn't flag it, but since we're charting > a new territory for networking a broader audience may be quite useful. OK. No v2 is planned, but there may be follow-up patches.
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 38f915956..fd6ff77e8 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -5357,6 +5357,11 @@ static int r8169_hwmon_read(struct device *dev, enum hwmon_sensor_types type, raw = phy_read_paged(tp->phydev, 0xbd8, 0x12) & 0x3ff; *val = r8169_hwmon_get_temp(raw); break; + case hwmon_temp_max: + /* Chip reduces speed to 1G if threshold is exceeded */ + raw = phy_read_paged(tp->phydev, 0xb54, 0x16) >> 6; + *val = r8169_hwmon_get_temp(raw); + break; default: return -EINVAL; } @@ -5370,7 +5375,7 @@ static const struct hwmon_ops r8169_hwmon_ops = { }; static const struct hwmon_channel_info * const r8169_hwmon_info[] = { - HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT), + HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT | HWMON_T_MAX), NULL };
Add support for reading the over-temp threshold. If the chip temperature exceeds this value, the chip will reduce the speed to 1Gbps (by disabling 2.5G/5G advertisement and triggering a renegotiation). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> --- drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)