Message ID | 5665899.DvuYhMxLoT@kreacher (mailing list archive) |
---|---|
Series | thermal: intel: int340x: Use generic trip points table |
Hi Srinivas,

On Wed, Jan 25, 2023 at 3:55 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> Hi All,
>
> This series replaces the following patch:
>
> https://patchwork.kernel.org/project/linux-pm/patch/2147918.irdbgypaU6@kreacher/
>
> but it has been almost completely rewritten, so I've dropped all tags from it.
>
> The most significant difference is that firmware-induced trip point updates are
> now handled in a less controversial manner (no renumbering, just temperature
> updates if applicable).
>
> Please refer to the individual patch changelogs for details.
>
> The series is on top of this patch:
>
> https://patchwork.kernel.org/project/linux-pm/patch/2688799.mvXUDI8C0e@kreacher/
>
> which applies on top of the linux-next branch in linux-pm.git from today.

There are two additional branches in linux-pm.git:

thermal-intel-fixes
thermal-intel-testing

The former is just fixes to go on top of 6.2-rc5 and the latter - this series
on top of those and the current thermal-intel branch I have locally with the
Intel thermal drivers changes for 6.3.

I would appreciate giving each of them a go in your test setup.

Cheers!

Hi Rafael,

On Wed, 2023-01-25 at 16:20 +0100, Rafael J. Wysocki wrote:
> Hi Srinivas,
>
> On Wed, Jan 25, 2023 at 3:55 PM Rafael J. Wysocki <rjw@rjwysocki.net>
> wrote:
> >
> > Hi All,
> >
> > This series replaces the following patch:
> >
> > https://patchwork.kernel.org/project/linux-pm/patch/2147918.irdbgypaU6@kreacher/
> >
> > but it has been almost completely rewritten, so I've dropped all
> > tags from it.
> >
> > [...]
> > The series is on top of this patch:
> >
> > https://patchwork.kernel.org/project/linux-pm/patch/2688799.mvXUDI8C0e@kreacher/
> >
> > which applies on top of the linux-next branch in linux-pm.git from
> > today.
>
> There are two additional branches in linux-pm.git:
>
> thermal-intel-fixes

On two systems test, no issues are observed.

> thermal-intel-testing

branch: thermal-intel-test

No issues, but number of trips are not same as invalid trips are not
registered.
Not sure if this is correct. At boot up they may be invalid, but
firmware may update later (Not aware of such scenario).

For example, the hot is not registered.

Current:

thermal_zone9/trip_point_0_type:critical
thermal_zone9/trip_point_0_temp:125050
thermal_zone9/trip_point_0_hyst:0

thermal_zone9/trip_point_1_type:hot
thermal_zone9/trip_point_1_temp:-273250
thermal_zone9/trip_point_1_hyst:0

thermal_zone9/trip_point_2_type:passive
thermal_zone9/trip_point_2_temp:103050
thermal_zone9/trip_point_2_hyst:0

thermal_zone9/trip_point_3_type:active
thermal_zone9/trip_point_3_temp:103050
thermal_zone9/trip_point_3_hyst:0

thermal_zone9/trip_point_4_type:active
thermal_zone9/trip_point_4_temp:101050
thermal_zone9/trip_point_4_hyst:0

thermal_zone9/trip_point_5_type:active
thermal_zone9/trip_point_5_temp:100050
thermal_zone9/trip_point_5_hyst:0

thermal_zone9/trip_point_6_type:active
thermal_zone9/trip_point_6_temp:98550
thermal_zone9/trip_point_6_hyst:0

thermal_zone9/trip_point_7_type:active
thermal_zone9/trip_point_7_temp:97050
thermal_zone9/trip_point_7_hyst:0

with 6.3-rc1 changes

thermal_zone9/trip_point_0_type:critical
thermal_zone9/trip_point_0_temp:125050
thermal_zone9/trip_point_0_hyst:0

thermal_zone9/trip_point_1_type:passive
thermal_zone9/trip_point_1_temp:103050
thermal_zone9/trip_point_1_hyst:0

thermal_zone9/trip_point_2_type:active
thermal_zone9/trip_point_2_temp:103050
thermal_zone9/trip_point_2_hyst:0

thermal_zone9/trip_point_3_type:active
thermal_zone9/trip_point_3_temp:101050
thermal_zone9/trip_point_3_hyst:0

thermal_zone9/trip_point_4_type:active
thermal_zone9/trip_point_4_temp:100050
thermal_zone9/trip_point_4_hyst:0

thermal_zone9/trip_point_5_type:active
thermal_zone9/trip_point_5_temp:98550
thermal_zone9/trip_point_5_hyst:0

thermal_zone9/trip_point_6_hyst:0
thermal_zone9/trip_point_6_temp:97050
thermal_zone9/trip_point_6_type:active

Thanks,
Srinivas

>
> The former is just fixes to go on top of 6.2-rc5 and the latter -
> this
> series on top of those and the current thermal-intel branch I have
> locally with the Intel thermal drivers changes for 6.3.
>
> I would appreciate giving each of them a go in your test setup.
>
> Cheers!

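Note: the index shift between the two listings above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions, not driver code: `struct fw_trip`, `registered_index()` and the `INVALID_TEMP` marker are hypothetical names introduced for illustration, and the "hot" trip is simply marked invalid rather than carrying the -273250 value from the listing. The two valid temperatures are copied from the thermal_zone9 output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker for "no valid temperature"; not a kernel constant. */
#define INVALID_TEMP (-1)

struct fw_trip {
	const char *type;	/* "critical", "hot", "passive", ... */
	int temp_mc;		/* millidegrees Celsius, or INVALID_TEMP */
};

/* First three trips of thermal_zone9, in firmware order. */
static const struct fw_trip zone9[] = {
	{ "critical", 125050 },
	{ "hot", INVALID_TEMP },
	{ "passive", 103050 },
};

/*
 * Return the index under which the first trip of the given type would be
 * registered, or -1 if it would not be registered at all.  When skip_invalid
 * is nonzero, trips without a valid temperature are dropped, so every trip
 * after them moves down by one index.
 */
static int registered_index(const struct fw_trip *src, size_t n,
			    const char *type, int skip_invalid)
{
	int index = 0;

	assert(src != NULL && type != NULL);

	for (size_t i = 0; i < n; i++) {
		if (skip_invalid && src[i].temp_mc == INVALID_TEMP)
			continue;
		if (strcmp(src[i].type, type) == 0)
			return index;
		index++;
	}
	return -1;
}
```

With `skip_invalid` set to 0 the passive trip lands at index 2, as in the "Current" listing; with it set to 1 the hot trip disappears and the passive trip moves to index 1, as in the second listing.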
On Thursday, January 26, 2023 1:02:59 AM CET srinivas pandruvada wrote:
> Hi Rafael,
>
> On Wed, 2023-01-25 at 16:20 +0100, Rafael J. Wysocki wrote:
> > Hi Srinivas,
> >
> > On Wed, Jan 25, 2023 at 3:55 PM Rafael J. Wysocki <rjw@rjwysocki.net>
> > wrote:
> > >
> > > Hi All,
> > >
> > > This series replaces the following patch:
> > >
> > > https://patchwork.kernel.org/project/linux-pm/patch/2147918.irdbgypaU6@kreacher/
> > >
> > > but it has been almost completely rewritten, so I've dropped all
> > > tags from it.
> > >
> > > [...]
> > > The series is on top of this patch:
> > >
> > > https://patchwork.kernel.org/project/linux-pm/patch/2688799.mvXUDI8C0e@kreacher/
> > >
> > > which applies on top of the linux-next branch in linux-pm.git from
> > > today.
> >
> > There are two additional branches in linux-pm.git:
> >
> > thermal-intel-fixes
> On two systems test, no issues are observed.

Great! I'll move this to linux-next then.

> > thermal-intel-testing
> branch: thermal-intel-test
>
> No issues, but number of trips are not same as invalid trips are not
> registered.
> Not sure if this is correct.

It may not be. At least it is a change in behavior that is not expected to
happen after these changes.

> At boot up they may be invalid, but
> firmware may update later (Not aware of such scenario).
>
> For example, the hot is not registered.
>
> Current:
>
> thermal_zone9/trip_point_0_type:critical
> thermal_zone9/trip_point_0_temp:125050
> thermal_zone9/trip_point_0_hyst:0
>
> thermal_zone9/trip_point_1_type:hot
> thermal_zone9/trip_point_1_temp:-273250
> thermal_zone9/trip_point_1_hyst:0

So this means that _HOT is evaluated successfully (or the trip point index
would be negative), but it probably returned an invalid temperature (likely 0)
that has been turned into an error by the temperature range check in the new
ACPI helper introduced by the change.

OK, thanks for testing!

I've added the appended patch to the thermal-intel-test branch. Can you please
check if it makes that difference in behavior go away?

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: [PATCH] thermal: ACPI: Initialize trips if temperature is out of range

In some cases it is still useful to register a trip point if the
temperature returned by the corresponding ACPI thermal object (for
example, _HOT) is invalid to start with, because the same ACPI
thermal object may start to return a valid temperature after a
system configuration change (for example, from an AC power source
to battery and vice versa).

For this reason, if the ACPI thermal object evaluated by
thermal_acpi_trip_init() successfully returns a temperature value that
is out of the range of values taken into account, initialize the trip
point using THERMAL_TEMP_INVALID as the temperature value instead of
returning an error to allow the user of the trip point to decide what
to do with it.

Also update pch_wpt_add_acpi_psv_trip() to reject trip points with
invalid temperature values.

Fixes: 7a0e39748861 ("thermal: ACPI: Add ACPI trip point routines")
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/intel/intel_pch_thermal.c |    2 +-
 drivers/thermal/thermal_acpi.c            |    7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/thermal/thermal_acpi.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_acpi.c
+++ linux-pm/drivers/thermal/thermal_acpi.c
@@ -64,13 +64,14 @@ static int thermal_acpi_trip_init(struct
 		return -ENODATA;
 	}
 
-	if (temp < TEMP_MIN_DECIK || temp >= TEMP_MAX_DECIK) {
+	if (temp >= TEMP_MIN_DECIK && temp <= TEMP_MAX_DECIK) {
+		trip->temperature = deci_kelvin_to_millicelsius(temp);
+	} else {
 		acpi_handle_debug(adev->handle, "%s result %llu out of range\n",
 				  obj_name, temp);
-		return -ENODATA;
+		trip->temperature = THERMAL_TEMP_INVALID;
 	}
 
-	trip->temperature = deci_kelvin_to_millicelsius(temp);
 	trip->hysteresis = 0;
 	trip->type = type;
 
Index: linux-pm/drivers/thermal/intel/intel_pch_thermal.c
===================================================================
--- linux-pm.orig/drivers/thermal/intel/intel_pch_thermal.c
+++ linux-pm/drivers/thermal/intel/intel_pch_thermal.c
@@ -107,7 +107,7 @@ static void pch_wpt_add_acpi_psv_trip(st
 		return;
 
 	ret = thermal_acpi_trip_passive(adev, &ptd->trips[*nr_trips]);
-	if (ret)
+	if (ret || ptd->trips[*nr_trips].temperature <= 0)
 		return;
 
 	++(*nr_trips);

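Note: the behavior described in the changelog above can be sketched outside the kernel as follows. This is a simplified userspace model, not the actual thermal_acpi.c code. The numeric values given here for TEMP_MIN_DECIK, TEMP_MAX_DECIK and THERMAL_TEMP_INVALID are assumptions made for illustration, and `struct trip`, `decik_to_millicelsius()`, `trip_init_from_firmware()`, `trip_temperature_for()` and `psv_trip_accepted()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; check the kernel sources for the real ones. */
#define TEMP_MIN_DECIK		2180ULL
#define TEMP_MAX_DECIK		4480ULL
#define THERMAL_TEMP_INVALID	(-274000)

struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;
};

/* Tenths of a kelvin to millidegrees Celsius (0 degrees C is 273.15 K). */
static int decik_to_millicelsius(unsigned long long decik)
{
	return (int)(decik * 100) - 273150;
}

/*
 * Model of the patched range check: an out-of-range firmware value no longer
 * makes initialization fail; the trip is kept with an invalid temperature.
 */
static void trip_init_from_firmware(struct trip *trip, unsigned long long decik)
{
	assert(trip != NULL);

	if (decik >= TEMP_MIN_DECIK && decik <= TEMP_MAX_DECIK)
		trip->temperature = decik_to_millicelsius(decik);
	else
		trip->temperature = THERMAL_TEMP_INVALID;

	trip->hysteresis = 0;
}

/* Convenience wrapper: temperature a trip ends up with for a firmware value. */
static int trip_temperature_for(unsigned long long decik)
{
	struct trip trip;

	trip_init_from_firmware(&trip, decik);
	return trip.temperature;
}

/* Model of the caller-side check: reject errors and non-positive temperatures. */
static bool psv_trip_accepted(int ret, int temp_mc)
{
	return ret == 0 && temp_mc > 0;
}
```

Under these assumed limits, `trip_temperature_for(3982)` yields 125050 (the critical temperature shown earlier in the thread, although the raw firmware value is not given there), while `trip_temperature_for(0)` yields the invalid marker instead of an error. A caller that cannot use such a trip, as in the intel_pch_thermal.c hunk, has to filter it out itself. Note also that the patched condition accepts a value equal to TEMP_MAX_DECIK, which the previous condition rejected.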
Hi Rafael,

On Thu, 2023-01-26 at 14:13 +0100, Rafael J. Wysocki wrote:
> On Thursday, January 26, 2023 1:02:59 AM CET srinivas pandruvada
> wrote:
> > Hi Rafael,
> >
>
> [...]
> I've added the appended patch to the thermal-intel-test branch. Can
> you please
> check if it makes that difference in behavior go away?

I synced the tree again and your patch in thermal-intel-test fixes the
issue.

Thanks,
Srinivas

>
> ---
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH] thermal: ACPI: Initialize trips if temperature is
> out of range
>
> In some cases it is still useful to register a trip point if the
> temperature returned by the corresponding ACPI thermal object (for
> example, _HOT) is invalid to start with, because the same ACPI
> thermal object may start to return a valid temperature after a
> system configuration change (for example, from an AC power source
> to battery and vice versa).
>
> For this reason, if the ACPI thermal object evaluated by
> thermal_acpi_trip_init() successfully returns a temperature value
> that
> is out of the range of values taken into account, initialize the trip
> point using THERMAL_TEMP_INVALID as the temperature value instead of
> returning an error to allow the user of the trip point to decide what
> to do with it.
>
> Also update pch_wpt_add_acpi_psv_trip() to reject trip points with
> invalid temperature values.
>
> Fixes: 7a0e39748861 ("thermal: ACPI: Add ACPI trip point routines")
> Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/thermal/intel/intel_pch_thermal.c |    2 +-
>  drivers/thermal/thermal_acpi.c            |    7 ++++---
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> Index: linux-pm/drivers/thermal/thermal_acpi.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_acpi.c
> +++ linux-pm/drivers/thermal/thermal_acpi.c
> @@ -64,13 +64,14 @@ static int thermal_acpi_trip_init(struct
>  		return -ENODATA;
>  	}
>
> -	if (temp < TEMP_MIN_DECIK || temp >= TEMP_MAX_DECIK) {
> +	if (temp >= TEMP_MIN_DECIK && temp <= TEMP_MAX_DECIK) {
> +		trip->temperature = deci_kelvin_to_millicelsius(temp);
> +	} else {
>  		acpi_handle_debug(adev->handle, "%s result %llu out of range\n",
>  				  obj_name, temp);
> -		return -ENODATA;
> +		trip->temperature = THERMAL_TEMP_INVALID;
>  	}
>
> -	trip->temperature = deci_kelvin_to_millicelsius(temp);
>  	trip->hysteresis = 0;
>  	trip->type = type;
>
> Index: linux-pm/drivers/thermal/intel/intel_pch_thermal.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/intel/intel_pch_thermal.c
> +++ linux-pm/drivers/thermal/intel/intel_pch_thermal.c
> @@ -107,7 +107,7 @@ static void pch_wpt_add_acpi_psv_trip(st
>  		return;
>
>  	ret = thermal_acpi_trip_passive(adev, &ptd->trips[*nr_trips]);
> -	if (ret)
> +	if (ret || ptd->trips[*nr_trips].temperature <= 0)
>  		return;
>
>  	++(*nr_trips);
>

Hi Srinivas,

On Thu, Jan 26, 2023 at 6:17 PM srinivas pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
>
> Hi Rafael,
>
> On Thu, 2023-01-26 at 14:13 +0100, Rafael J. Wysocki wrote:
> > On Thursday, January 26, 2023 1:02:59 AM CET srinivas pandruvada
> > wrote:
> > > Hi Rafael,
> > >
> >
> > [...]
> > I've added the appended patch to the thermal-intel-test branch. Can
> > you please
> > check if it makes that difference in behavior go away?
> I synced the tree again and your patch in thermal-intel-test fixes the
> issue.

Thanks a lot for testing and the confirmation!

In the meantime, I've merged the thermal-intel-test into the bleeding-edge
branch and if 0-day reports success with building it, I'll move the patches to
linux-next.

Cheers!