diff mbox series

[net-next,2/2] r8169: add support for reading over-temp threshold

Message ID f3e07026-8219-4b36-b230-7f7ddd71c7ab@gmail.com (mailing list archive)
State New
Delegated to: Netdev Maintainers
Headers show
Series r8169: extend hwmon support | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1 this patch: 1
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 2 this patch: 2
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-01-07--18-00 (tests: 883)

Commit Message

Heiner Kallweit Jan. 6, 2025, 6:05 p.m. UTC
Add support for reading the over-temp threshold. If the chip temperature
exceeds this value, the chip will reduce the speed to 1Gbps (by disabling
2.5G/5G advertisement and triggering a renegotiation).

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Andrew Lunn Jan. 6, 2025, 9:15 p.m. UTC | #1
On Mon, Jan 06, 2025 at 07:05:13PM +0100, Heiner Kallweit wrote:
> Add support for reading the over-temp threshold. If the chip temperature
> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling
> 2.5G/5G advertisement and triggering a renegotiation).

I'm assuming here that the over-temp threshold always exists when the
temp_in sensors exists? If so:

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Does it reduce the speed in the same way as downshift? Can the user
tell it has happened, other than networking is slower?

    Andrew
Heiner Kallweit Jan. 6, 2025, 10:31 p.m. UTC | #2
On 06.01.2025 22:15, Andrew Lunn wrote:
> On Mon, Jan 06, 2025 at 07:05:13PM +0100, Heiner Kallweit wrote:
>> Add support for reading the over-temp threshold. If the chip temperature
>> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling
>> 2.5G/5G advertisement and triggering a renegotiation).
> 
> I'm assuming here that the over-temp threshold always exists when the
> temp_in sensors exists? If so:
> 
Right

> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
> 
> Does it reduce the speed in the same way as downshift? Can the user
> tell it has happened, other than networking is slower?
> 
It internally disables 2.5G/5G advertisement and triggers an autoneg.
So you get the usual message on console/dmesg indicating the new speed.
This internal action sets a register bit, and it can also trigger an interrupt.
So it should be possible to check in the link_change_notify() callback
whether an over-temp event occurred. The silent change of the advertisement
may also result in the phylib-cached advertisement being out-of-sync.
So we would have to re-sync it. But I didn't fully test this yet.

This patch only allows to read the over-temp threshold set as power-on default
or by the boot loader. It doesn't change the existing behavior.

>     Andrew

Heiner
Andrew Lunn Jan. 6, 2025, 11:16 p.m. UTC | #3
> > Does it reduce the speed in the same way as downshift? Can the user
> > tell it has happened, other than networking is slower?
> > 
> It internally disables 2.5G/5G advertisement and triggers an autoneg.
> So you get the usual message on console/dmesg indicating the new speed.
> This internal action sets a register bit, and it can also trigger an interrupt.
> So it should be possible to check in the link_change_notify() callback
> whether an over-temp event occurred. The silent change of the advertisement
> may also result in the phylib-cached advertisement being out-of-sync.
> So we would have to re-sync it. But I didn't fully test this yet.
> 
> This patch only allows to read the over-temp threshold set as power-on default
> or by the boot loader. It doesn't change the existing behavior.

Thanks for the details. So it does seem to be different to downshift,
where generally advertised link modes in the registers is not changed,
and the speed indicated in BMSR generally does not indicate the
downshifted speed, you need vendor registers to get the actual
speed. But downshift happens at link up, not latter when the device
starts to overheat.

Does it restore advertisement to 2.5G/5G when it cools down? There is
an interesting policy decision here. Do you want 1s downtime very so
often as it speeds up and slows down, or should it keep at the slower
speed until the user kicks it back up to the higher speed?

	Andrew
Jakub Kicinski Jan. 6, 2025, 11:30 p.m. UTC | #4
On Mon, 6 Jan 2025 19:05:13 +0100 Heiner Kallweit wrote:
> Add support for reading the over-temp threshold. If the chip temperature
> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling
> 2.5G/5G advertisement and triggering a renegotiation).

If there is a v2 -- please make sure hwmon folks are CCed.
Looks like get_maintainers doesn't flag it, but since we're charting 
a new territory for networking a broader audience may be quite useful.
Heiner Kallweit Jan. 7, 2025, 7:06 a.m. UTC | #5
On 07.01.2025 00:16, Andrew Lunn wrote:
>>> Does it reduce the speed in the same way as downshift? Can the user
>>> tell it has happened, other than networking is slower?
>>>
>> It internally disables 2.5G/5G advertisement and triggers an autoneg.
>> So you get the usual message on console/dmesg indicating the new speed.
>> This internal action sets a register bit, and it can also trigger an interrupt.
>> So it should be possible to check in the link_change_notify() callback
>> whether an over-temp event occurred. The silent change of the advertisement
>> may also result in the phylib-cached advertisement being out-of-sync.
>> So we would have to re-sync it. But I didn't fully test this yet.
>>
>> This patch only allows to read the over-temp threshold set as power-on default
>> or by the boot loader. It doesn't change the existing behavior.
> 
> Thanks for the details. So it does seem to be different to downshift,
> where generally advertised link modes in the registers is not changed,
> and the speed indicated in BMSR generally does not indicate the
> downshifted speed, you need vendor registers to get the actual
> speed. But downshift happens at link up, not latter when the device
> starts to overheat.
> 
> Does it restore advertisement to 2.5G/5G when it cools down? There is
> an interesting policy decision here. Do you want 1s downtime very so
> often as it speeds up and slows down, or should it keep at the slower
> speed until the user kicks it back up to the higher speed?
> 
I will do more testing and provide an update.

> 	Andrew
> 
Heiner
Heiner Kallweit Jan. 7, 2025, 7:07 a.m. UTC | #6
On 07.01.2025 00:30, Jakub Kicinski wrote:
> On Mon, 6 Jan 2025 19:05:13 +0100 Heiner Kallweit wrote:
>> Add support for reading the over-temp threshold. If the chip temperature
>> exceeds this value, the chip will reduce the speed to 1Gbps (by disabling
>> 2.5G/5G advertisement and triggering a renegotiation).
> 
> If there is a v2 -- please make sure hwmon folks are CCed.
> Looks like get_maintainers doesn't flag it, but since we're charting 
> a new territory for networking a broader audience may be quite useful.

OK. No v2 is planned, but there may be follow-up patches.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 38f915956..fd6ff77e8 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -5357,6 +5357,11 @@  static int r8169_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
 		raw = phy_read_paged(tp->phydev, 0xbd8, 0x12) & 0x3ff;
 		*val = r8169_hwmon_get_temp(raw);
 		break;
+	case hwmon_temp_max:
+		/* Chip reduces speed to 1G if threshold is exceeded */
+		raw = phy_read_paged(tp->phydev, 0xb54, 0x16) >> 6;
+		*val = r8169_hwmon_get_temp(raw);
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -5370,7 +5375,7 @@  static const struct hwmon_ops r8169_hwmon_ops = {
 };
 
 static const struct hwmon_channel_info * const r8169_hwmon_info[] = {
-	HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT),
+	HWMON_CHANNEL_INFO(temp, HWMON_T_INPUT | HWMON_T_MAX),
 	NULL
 };