Message ID | 20200118172615.26329-1-linux@roeck-us.net (mailing list archive) |
---|---|
Series | hwmon: k10temp driver improvements |
On Sat, 18 Jan 2020 at 17:26, Guenter Roeck <linux@roeck-us.net> wrote:
>
> This patch series implements various improvements for the k10temp driver.
>
> Patch 1/5 introduces the use of bit operations.
>
> Patch 2/5 converts the driver to use the devm_hwmon_device_register_with_info
> API. This not only simplifies the code and reduces its size, it also
> makes the code easier to maintain and enhance.
>
> Patch 3/5 adds support for reporting Core Complex Die (CCD) temperatures
> on Ryzen 3 (Zen2) CPUs.
>
> Patch 4/5 adds support for reporting core and SoC current and voltage
> information on Ryzen CPUs.
>
> Patch 5/5 removes the maximum temperature from Tdie for Ryzen CPUs.
> It is inaccurate, misleading, and it just doesn't make sense to report
> wrong information.
>
> With all patches in place, output on Ryzen 3900X CPUs looks as follows
> (with the system under load).
>
> k10temp-pci-00c3
> Adapter: PCI adapter
> Vcore:        +1.36 V
> Vsoc:         +1.18 V
> Tdie:         +86.8°C
> Tctl:         +86.8°C
> Tccd1:        +80.0°C
> Tccd2:        +81.8°C
> Icore:       +44.14 A
> Isoc:        +13.83 A
>
> The voltage and current information is limited to Ryzen CPUs. Voltage
> and current reporting on Threadripper and EPYC CPUs is different, and the
> reported information is either incomplete or wrong. Exclude it for the time
> being; it can always be added if/when more information becomes available.
>
> Tested with the following Ryzen CPUs:
>     1300X   A user with this CPU in the system reported somewhat unexpected
>             values for Vcore; it isn't entirely if at all clear why that is
>             the case. Overall this does not warrant holding up the series.

As the owner of that machine, very much agreed.

>     1600
>     1800X
>     2200G
>     2400G
>     3800X
>     3900X
>     3950X
>

I also had sensible results for v1 on 2500U and 3400G

> v2: Added tested-by: tags as received.
>     Don't display voltage and current information for Threadripper and EPYC.
>     Stop displaying the fixed (and wrong) maximum temperature of 70 degrees C
>     for Tdie on model 17h/18h CPUs.
For v2 on my 2500U, system idle and then under load -

--- k10temp-idle 2020-01-19 00:16:18.812002121 +0000
+++ k10temp-load 2020-01-19 00:22:05.595470877 +0000
@@ -1,15 +1,15 @@
 k10temp-pci-00c3
 Adapter: PCI adapter
-Vcore:        +0.98 V
+Vcore:        +1.15 V
 Vsoc:         +0.93 V
-Tdie:         +38.2°C
-Tctl:         +38.2°C
-Icore:       +10.39 A
-Isoc:         +6.49 A
+Tdie:         +76.2°C
+Tctl:         +76.2°C
+Icore:       +51.96 A
+Isoc:         +7.58 A

 amdgpu-pci-0300
 Adapter: PCI adapter
 vddgfx:           N/A
 vddnb:            N/A
-edge:         +38.0°C  (crit = +80.0°C, hyst =  +0.0°C)
+edge:         +76.0°C  (crit = +80.0°C, hyst =  +0.0°C)

I'll only test v2 on the 3400G if you think the results would add something.

ĸen
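An idle/load comparison like the one above can also be computed instead of diffed by eye. The following is only a minimal sketch and not part of the patch series: it assumes `sensors`-style text in which chips are separated by blank lines, and the names `parse_sensors` and `deltas` are made up for illustration. The sample values are the 2500U k10temp readings from this message.

```python
import re

# Matches reading lines such as "Vcore: +0.98 V" or "Tdie: +38.2°C".
# Lines without a numeric value ("Adapter: PCI adapter", "vddgfx: N/A") do not match.
READING = re.compile(
    r"^(?P<label>[^:]+):\s*(?P<value>[+-]?\d+(?:\.\d+)?)\s*(?P<unit>°C|V|A|RPM)"
)


def parse_sensors(text):
    """Return {chip: {label: (value, unit)}} from sensors-style text."""
    chips = {}
    chip = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            chip = None          # a blank line ends the current chip block
            continue
        if chip is None:
            chip = line          # first line of a block is the chip name
            chips[chip] = {}
            continue
        m = READING.match(line)
        if m:
            chips[chip][m["label"].strip()] = (float(m["value"]), m["unit"])
    return chips


def deltas(idle, load):
    """Return {(chip, label): (load - idle, unit)} for readings present in both."""
    out = {}
    for chip, readings in idle.items():
        for label, (v0, unit) in readings.items():
            if label in load.get(chip, {}):
                out[(chip, label)] = (round(load[chip][label][0] - v0, 2), unit)
    return out


# 2500U readings reported in this message.
IDLE = """k10temp-pci-00c3
Adapter: PCI adapter
Vcore: +0.98 V
Vsoc: +0.93 V
Tdie: +38.2°C
Tctl: +38.2°C
Icore: +10.39 A
Isoc: +6.49 A
"""

LOAD = """k10temp-pci-00c3
Adapter: PCI adapter
Vcore: +1.15 V
Vsoc: +0.93 V
Tdie: +76.2°C
Tctl: +76.2°C
Icore: +51.96 A
Isoc: +7.58 A
"""

if __name__ == "__main__":
    for (chip, label), (diff, unit) in deltas(parse_sensors(IDLE), parse_sensors(LOAD)).items():
        print(f"{chip} {label}: {diff:+.2f} {unit}")
```

The parser deliberately ignores anything it does not recognise, so limit annotations such as "(crit = ...)" on continuation lines are skipped rather than misread.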
On 1/18/20 4:33 PM, Ken Moffat wrote:
> On Sat, 18 Jan 2020 at 17:26, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> This patch series implements various improvements for the k10temp driver.
...
>> Tested with the following Ryzen CPUs:
...
>>     3950X
>>
>
> I also had sensible results for v1 on 2500U and 3400G
>

Sorry, I somehow missed that.

...
> I'll only test v2 on the 3400G if you think the results would add something.
>

Thanks a lot for the additional testing! I don't think we need another
test on 3400G; after all, the actual measurement code didn't change.

Everyone: I'll be happy to add Tested-by: tags with your name and e-mail
address to the series, but you'll have to send it to me. I appreciate
all your testing and would like to acknowledge it, but I can not add
Tested-by: tags (or any other tags, for that matter) on my own.

Thanks,
Guenter
On Sun, 19 Jan 2020 at 00:49, Guenter Roeck <linux@roeck-us.net> wrote:
...
> Everyone: I'll be happy to add Tested-by: tags with your name and e-mail
> address to the series, but you'll have to send it to me. I appreciate
> all your testing and would like to acknowledge it, but I can not add
> Tested-by: tags (or any other tags, for that matter) on my own.
>
> Thanks,
> Guenter

For the little it is worth:
Tested-by Ken Moffat <zarniwhoop73@googlemail.com>
On 19/1/20 8:48 am, Guenter Roeck wrote:

> Everyone: I'll be happy to add Tested-by: tags with your name and e-mail
> address to the series, but you'll have to send it to me. I appreciate
> all your testing and would like to acknowledge it, but I can not add
> Tested-by: tags (or any other tags, for that matter) on my own.
>
> Thanks,
> Guenter
>

Tested-by: Brad Campbell <lists2009@fnarfbargle.com>
In article <20200118172615.26329-1-linux@roeck-us.net> (earth.lists.linux-kernel) you wrote:
> This patch series implements various improvements for the k10temp driver.
...
> The voltage and current information is limited to Ryzen CPUs. Voltage
> and current reporting on Threadripper and EPYC CPUs is different, and the
> reported information is either incomplete or wrong. Exclude it for the time
> being; it can always be added if/when more information becomes available.

> Tested with the following Ryzen CPUs:

Tested-By: Jonathan McDowell <noodles@earth.li>

Tested on a Ryzen 7 2700 (patched on top of 5.4.13):

| k10temp-pci-00c3
| Adapter: PCI adapter
| Vcore:        +0.80 V
| Vsoc:         +0.81 V
| Tdie:         +37.0°C
| Tctl:         +37.0°C
| Icore:        +8.31 A
| Isoc:         +6.86 A

Like the 1300X case I see a discrepancy compared to what the nct6779
driver says Vcore is:

| nct6779-isa-0290
| Adapter: ISA adapter
| Vcore:                  +0.33 V  (min =  +0.00 V, max =  +1.74 V)
| in1:                    +0.32 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| AVCC:                   +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| +3.3V:                  +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in4:                    +1.88 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in5:                    +0.82 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in6:                    +0.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| 3VSB:                   +3.42 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| Vbat:                   +3.25 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in9:                    +0.00 V  (min =  +0.00 V, max =  +0.00 V)
| in10:                   +0.22 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in11:                   +1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in12:                   +1.70 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in13:                   +1.04 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| in14:                   +1.79 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
| fan1:                     0 RPM  (min =    0 RPM)
| fan2:                  1708 RPM  (min =    0 RPM)
| fan3:                     0 RPM  (min =    0 RPM)
| fan4:                     0 RPM  (min =    0 RPM)
| fan5:                     0 RPM  (min =    0 RPM)
| SYSTIN:                 +33.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM
|                                  sensor = thermistor
| CPUTIN:                 -62.5°C  (high = +80.0°C, hyst = +75.0°C)
|                                  sensor = thermistor
| AUXTIN0:                +79.0°C  sensor = thermistor
| AUXTIN1:                +96.0°C  sensor = thermistor
| AUXTIN2:                +23.0°C  sensor = thermistor
| AUXTIN3:                -22.0°C  sensor = thermistor
| SMBUSMASTER 0:          +39.0°C
| PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
| PCH_CHIP_TEMP:           +0.0°C
| PCH_CPU_TEMP:            +0.0°C
| intrusion0:            ALARM
| intrusion1:            ALARM
| beep_enable:           disabled

I suspect the nct6779 is not reporting correctly (or needs some
configuration) here, as I see that's what Ken is using with his 1300X as
well.

(ASRock B450M Pro4 motherboard, fwiw.)

J.
On Sat, 18 Jan 2020, Guenter Roeck wrote:

> This patch series implements various improvements for the k10temp driver.
...
> v2: Added tested-by: tags as received.
>     Don't display voltage and current information for Threadripper and EPYC.
>     Stop displaying the fixed (and wrong) maximum temperature of 70 degrees C
>     for Tdie on model 17h/18h CPUs.
>
Just tested this on a 2400G.
Here idle values:

k10temp-pci-00c3
Adapter: PCI adapter
Vcore:        +0.77 V
Vsoc:         +1.11 V
Tdie:         +45.0°C
Tctl:         +45.0°C
Icore:       +10.39 A
Isoc:         +2.89 A

nvme-pci-0100
Adapter: PCI adapter
Composite:    +43.9°C  (low  = -273.1°C, high = +80.8°C)
                       (crit = +80.8°C)
Sensor 1:     +43.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +48.9°C  (low  = -273.1°C, high = +65261.8°C)

nct6793-isa-0290
Adapter: ISA adapter
in0:                    +0.35 V  (min =  +0.00 V, max =  +1.74 V)
in1:                    +1.85 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                    +3.41 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                    +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                    +0.26 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                    +0.14 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                    +0.66 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                    +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                    +3.26 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                    +1.83 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                   +0.19 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                   +0.14 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                   +1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                   +1.72 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                   +0.21 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                   323 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
SYSTIN:                +112.0°C  (high =  +0.0°C, hyst =  +0.0°C)  sensor = thermistor
CPUTIN:                 +60.0°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                +46.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
AUXTIN1:               +106.0°C  sensor = thermistor
AUXTIN2:               +105.0°C  sensor = thermistor
AUXTIN3:               +102.0°C  sensor = thermistor
SMBUSMASTER 0:          +45.0°C
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
intrusion0:            OK
intrusion1:            ALARM
beep_enable:           disabled

amdgpu-pci-0300
Adapter: PCI adapter
vddgfx:           N/A
vddnb:            N/A
edge:         +45.0°C  (crit = +80.0°C, hyst =  +0.0°C)

And here with some high load:

k10temp-pci-00c3
Adapter: PCI adapter
Vcore:        +1.32 V
Vsoc:         +1.11 V
Tdie:         +77.1°C
Tctl:         +77.1°C
Icore:       +85.22 A
Isoc:         +3.61 A

nvme-pci-0100
Adapter: PCI adapter
Composite:    +42.9°C  (low  = -273.1°C, high = +80.8°C)
                       (crit = +80.8°C)
Sensor 1:     +42.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +45.9°C  (low  = -273.1°C, high = +65261.8°C)

nct6793-isa-0290
Adapter: ISA adapter
in0:                    +0.68 V  (min =  +0.00 V, max =  +1.74 V)
in1:                    +1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                    +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                    +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                    +0.26 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                    +0.14 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                    +0.66 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                    +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                    +3.26 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                    +1.83 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                   +0.19 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                   +0.14 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                   +1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                   +1.72 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                   +0.20 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                  1931 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
SYSTIN:                +113.0°C  (high =  +0.0°C, hyst =  +0.0°C)  sensor = thermistor
CPUTIN:                 +64.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                +45.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
AUXTIN1:               +107.0°C  sensor = thermistor
AUXTIN2:               +105.0°C  sensor = thermistor
AUXTIN3:               +102.0°C  sensor = thermistor
SMBUSMASTER 0:          +77.0°C
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
intrusion0:            OK
intrusion1:            ALARM
beep_enable:           disabled

amdgpu-pci-0300
Adapter: PCI adapter
vddgfx:           N/A
vddnb:            N/A
edge:         +77.0°C  (crit = +80.0°C, hyst =  +0.0°C)

Have also tried this on a EPYC 7302. Before the patch:

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +28.1°C  (high = +70.0°C)
Tctl:         +28.1°C

and after:

k10temp-pci-00c3
Adapter: PCI adapter
Tdie:         +28.2°C
Tctl:         +28.2°C

No extra values shown, but I think this is expected.

Tested-by Holger Kiehl <holger.kiehl@dwd.de>

Holger
On 1/19/20 2:18 AM, Jonathan McDowell wrote:
>
> In article <20200118172615.26329-1-linux@roeck-us.net> (earth.lists.linux-kernel) you wrote:
>> This patch series implements various improvements for the k10temp driver.
> ...
>> The voltage and current information is limited to Ryzen CPUs. Voltage
>> and current reporting on Threadripper and EPYC CPUs is different, and the
>> reported information is either incomplete or wrong. Exclude it for the time
>> being; it can always be added if/when more information becomes available.
>
>> Tested with the following Ryzen CPUs:
>
> Tested-By: Jonathan McDowell <noodles@earth.li>
>

Thanks!

> Tested on a Ryzen 7 2700 (patched on top of 5.4.13):
>
> | k10temp-pci-00c3
> | Adapter: PCI adapter
> | Vcore:        +0.80 V
> | Vsoc:         +0.81 V
> | Tdie:         +37.0°C
> | Tctl:         +37.0°C
> | Icore:        +8.31 A
> | Isoc:         +6.86 A
>
> Like the 1300X case I see a discrepancy compared to what the nct6779
> driver says Vcore is:
>
> | nct6779-isa-0290
> | Adapter: ISA adapter
> | Vcore:                  +0.33 V  (min =  +0.00 V, max =  +1.74 V)

I see that on all of my boards as well (3900X, different boards and board vendors),
with voltages reported by the Super-IO chip sometimes as low as 0.18V (!).
Yet, there is a clear correlation of that voltage with CPU load.
I suspect the measurement by the Super-IO chip is a different voltage.

I don't think there is anything we can do about that without access to more
information.

> | in1:                    +0.32 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | AVCC:                   +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | +3.3V:                  +3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in4:                    +1.88 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in5:                    +0.82 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in6:                    +0.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | 3VSB:                   +3.42 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | Vbat:                   +3.25 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in9:                    +0.00 V  (min =  +0.00 V, max =  +0.00 V)
> | in10:                   +0.22 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in11:                   +1.06 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in12:                   +1.70 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in13:                   +1.04 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | in14:                   +1.79 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
> | fan1:                     0 RPM  (min =    0 RPM)
> | fan2:                  1708 RPM  (min =    0 RPM)
> | fan3:                     0 RPM  (min =    0 RPM)
> | fan4:                     0 RPM  (min =    0 RPM)
> | fan5:                     0 RPM  (min =    0 RPM)
> | SYSTIN:                 +33.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM
> |                                  sensor = thermistor
> | CPUTIN:                 -62.5°C  (high = +80.0°C, hyst = +75.0°C)
> |                                  sensor = thermistor
> | AUXTIN0:                +79.0°C  sensor = thermistor
> | AUXTIN1:                +96.0°C  sensor = thermistor
> | AUXTIN2:                +23.0°C  sensor = thermistor
> | AUXTIN3:                -22.0°C  sensor = thermistor
> | SMBUSMASTER 0:          +39.0°C
> | PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
> | PCH_CHIP_TEMP:           +0.0°C
> | PCH_CPU_TEMP:            +0.0°C
> | intrusion0:            ALARM
> | intrusion1:            ALARM
> | beep_enable:           disabled
>
> I suspect the nct6779 is not reporting correctly (or needs some
> configuration) here, as I see that's what Ken is using with his 1300X as
> well.
>

Initially I thought the voltage reported by the Super-IO chip would help us
understand what is going on, but that is not really the case.

The problem with Ken's board is that idle current and voltage are very high.
The idle voltage claims to be higher than the voltage under load, which
doesn't really make sense. This is only reflected in the voltage and current
reported by the CPU, but not by the voltage reported by the Super-IO chip.

Thanks,
Guenter
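The plausibility argument in this message (a core voltage that reads higher at idle than under load does not make sense) can be written down as a small check. This is a sketch only: `implausible_rails` is a hypothetical helper, the first sample uses the 2400G readings reported earlier in the thread, and the second sample is invented to show a failing case. It is not the 1300X data, which is not quoted in this thread.

```python
def implausible_rails(idle, load, rails=("Vcore", "Icore")):
    """Return the rails whose idle reading is higher than the reading under load."""
    return [r for r in rails if r in idle and r in load and idle[r] > load[r]]


# 2400G k10temp readings reported in this thread (idle, then high load).
idle_2400g = {"Vcore": 0.77, "Icore": 10.39}
load_2400g = {"Vcore": 1.32, "Icore": 85.22}

# Invented numbers for illustration only, NOT measured data.
idle_invented = {"Vcore": 1.40, "Icore": 30.0}
load_invented = {"Vcore": 1.20, "Icore": 45.0}

if __name__ == "__main__":
    print(implausible_rails(idle_2400g, load_2400g))        # prints []
    print(implausible_rails(idle_invented, load_invented))  # prints ['Vcore']
```

A check like this only flags readings for a closer look; it says nothing about whether the CPU or the Super-IO chip is the one reporting the wrong value.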
On 1/19/20 5:38 AM, Holger Kiehl wrote:
> On Sat, 18 Jan 2020, Guenter Roeck wrote:
>
>> This patch series implements various improvements for the k10temp driver.
...
> Have also tried this on a EPYC 7302. Before the patch:
>
> k10temp-pci-00c3
> Adapter: PCI adapter
> Tdie:         +28.1°C  (high = +70.0°C)
> Tctl:         +28.1°C
>
> and after:
>
> k10temp-pci-00c3
> Adapter: PCI adapter
> Tdie:         +28.2°C
> Tctl:         +28.2°C
>
> No extra values shown, but I think this is expected.
>

Unfortunately yes, but it helps to confirm that the detection works.

> Tested-by Holger Kiehl <holger.kiehl@dwd.de>
>

Thanks again!

Guenter
On Sun, Jan 19, 2020 at 07:46:11AM -0800, Guenter Roeck wrote:
> On 1/19/20 2:18 AM, Jonathan McDowell wrote:
> >
> > In article <20200118172615.26329-1-linux@roeck-us.net> (earth.lists.linux-kernel) you wrote:
> > > This patch series implements various improvements for the k10temp driver.
> > ...
> > > The voltage and current information is limited to Ryzen CPUs. Voltage
> > > and current reporting on Threadripper and EPYC CPUs is different, and the
> > > reported information is either incomplete or wrong. Exclude it for the time
> > > being; it can always be added if/when more information becomes available.
> >
> > > Tested with the following Ryzen CPUs:
> >
> > Tested-By: Jonathan McDowell <noodles@earth.li>
> >
> Thanks!
>
> > Tested on a Ryzen 7 2700 (patched on top of 5.4.13):
> >
> > | k10temp-pci-00c3
> > | Adapter: PCI adapter
> > | Vcore:        +0.80 V
> > | Vsoc:         +0.81 V
> > | Tdie:         +37.0°C
> > | Tctl:         +37.0°C
> > | Icore:        +8.31 A
> > | Isoc:         +6.86 A
> >
> > Like the 1300X case I see a discrepancy compared to what the nct6779
> > driver says Vcore is:
> >
> > | nct6779-isa-0290
> > | Adapter: ISA adapter
> > | Vcore:                  +0.33 V  (min =  +0.00 V, max =  +1.74 V)
>
> I see that on all of my boards as well (3900X, different boards and board vendors),
> with voltages reported by the Super-IO chip sometimes as low as 0.18V (!).
> Yet, there is a clear correlation of that voltage with CPU load.
> I suspect the measurement by the Super-IO chip is a different voltage.
>
> I don't think there is anything we can do about that without access to more
> information.

...

> The problem with Ken's board is that idle current and voltage are very high.
> The idle voltage claims to be higher than the voltage under load, which
> doesn't really make sense. This is only reflected in the voltage and current
> reported by the CPU, but not by the voltage reported by the Super-IO chip.

I see clear correlation between load/Vcore/Icore/Tdie from your patched
k10temp driver which leads me to believe these numbers are valid for the
2700. Vsoc is fairly consistent and Isoc doesn't vary much either
(6.3-8.1A range over the past 8 hours).

J.
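One way to put a number on an observed correlation like this is to log readings over time and compute a correlation coefficient. The sketch below is an illustration, not how the figures in this thread were obtained. It assumes the standard hwmon sysfs layout, where `in*_input`, `curr*_input` and `temp*_input` hold millivolts, milliamperes and millidegrees Celsius, and it assumes the driver exposes matching `*_label` files. The names `read_labeled` and `pearson` are made up, and the hwmon directory has to be located on the system in question (for example the one whose `name` file reads `k10temp`).

```python
import math
from pathlib import Path


def read_labeled(hwmon_dir):
    """Map each voltage/current/temperature label to its reading in V, A or °C.

    Only in*, curr* and temp* channels are read, because those are reported
    in milli-units; fan speeds, for example, are not.
    """
    readings = {}
    for prefix in ("in", "curr", "temp"):
        for label_file in Path(hwmon_dir).glob(f"{prefix}[0-9]*_label"):
            input_file = label_file.with_name(
                label_file.name.replace("_label", "_input")
            )
            if input_file.exists():
                label = label_file.read_text().strip()
                readings[label] = int(input_file.read_text()) / 1000.0
    return readings


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sx * sy)
```

Calling `read_labeled` periodically, collecting the Vcore, Icore and Tdie values into lists, and passing pairs of lists to `pearson` gives a value between -1 and 1; a reading that barely moves, such as the Vsoc described above, will show a weak or undefined correlation.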