Message ID | 2181352.irdbgypaU6@kreacher (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | thermal/debugfs: Fix handling of cdev states and mitigation episodes in progress | expand |
Hi Rafael, On 4/23/24 19:00, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > If cdev_dt_seq_show() runs before the first state transition of a cooling > device, it will not print any state residency information for it, even > though it might be reasonably expected to print residency information for > the initial state of the cooling device. > > For this reason, rearrange the code to get the initial state of a cooling > device at the registration time and pass it to thermal_debug_cdev_add(), > so that the latter can create a duration record for that state which will > allow cdev_dt_seq_show() to print its residency information. > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > Reported-by: Lukasz Luba <lukasz.luba@arm.com> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/thermal/thermal_core.c | 9 +++++++-- > drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- > drivers/thermal/thermal_debugfs.h | 4 ++-- > 3 files changed, 19 insertions(+), 6 deletions(-) > I'm trying to test it on my board and do the review, but faced issue. For some reason I couldn't apply that patch on the latest bleeding-edge branch. Could you help me in this please? Is there something missing in the patch set (like one more fist patch)? Regards, Lukasz
Hi Lukasz, On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > Hi Rafael, > > On 4/23/24 19:00, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > If cdev_dt_seq_show() runs before the first state transition of a cooling > > device, it will not print any state residency information for it, even > > though it might be reasonably expected to print residency information for > > the initial state of the cooling device. > > > > For this reason, rearrange the code to get the initial state of a cooling > > device at the registration time and pass it to thermal_debug_cdev_add(), > > so that the latter can create a duration record for that state which will > > allow cdev_dt_seq_show() to print its residency information. > > > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > > Reported-by: Lukasz Luba <lukasz.luba@arm.com> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > drivers/thermal/thermal_core.c | 9 +++++++-- > > drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- > > drivers/thermal/thermal_debugfs.h | 4 ++-- > > 3 files changed, 19 insertions(+), 6 deletions(-) > > > > I'm trying to test it on my board and do the review, but faced issue. > For some reason I couldn't apply that patch on the latest bleeding-edge > branch. > Could you help me in this please? Is there something missing in the > patch set (like one more fist patch)? I messed up the ordering of patches (patch [2/3] should be the last one in the series) and on top of that, you'll need a small rebase on that patch. Sorry about this, I'll send a v2. Thank you!
On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > Hi Lukasz, > > On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > > > Hi Rafael, > > > > On 4/23/24 19:00, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > If cdev_dt_seq_show() runs before the first state transition of a cooling > > > device, it will not print any state residency information for it, even > > > though it might be reasonably expected to print residency information for > > > the initial state of the cooling device. > > > > > > For this reason, rearrange the code to get the initial state of a cooling > > > device at the registration time and pass it to thermal_debug_cdev_add(), > > > so that the latter can create a duration record for that state which will > > > allow cdev_dt_seq_show() to print its residency information. > > > > > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > > > Reported-by: Lukasz Luba <lukasz.luba@arm.com> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > --- > > > drivers/thermal/thermal_core.c | 9 +++++++-- > > > drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- > > > drivers/thermal/thermal_debugfs.h | 4 ++-- > > > 3 files changed, 19 insertions(+), 6 deletions(-) > > > > > > > I'm trying to test it on my board and do the review, but faced issue. > > For some reason I couldn't apply that patch on the latest bleeding-edge > > branch. > > Could you help me in this please? Is there something missing in the > > patch set (like one more fist patch)? > > I messed up the ordering of patches (patch [2/3] should be the last > one in the series) and on top of that, you'll need a small rebase on > that patch. > > Sorry about this, I'll send a v2. Actually, the ordering was OK, but the rebase of the second patch is still needed. I'll send a v2.
On 4/25/24 14:00, Rafael J. Wysocki wrote: > On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote: >> >> Hi Lukasz, >> >> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: >>> >>> Hi Rafael, >>> >>> On 4/23/24 19:00, Rafael J. Wysocki wrote: >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>>> >>>> If cdev_dt_seq_show() runs before the first state transition of a cooling >>>> device, it will not print any state residency information for it, even >>>> though it might be reasonably expected to print residency information for >>>> the initial state of the cooling device. >>>> >>>> For this reason, rearrange the code to get the initial state of a cooling >>>> device at the registration time and pass it to thermal_debug_cdev_add(), >>>> so that the latter can create a duration record for that state which will >>>> allow cdev_dt_seq_show() to print its residency information. >>>> >>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") >>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>>> --- >>>> drivers/thermal/thermal_core.c | 9 +++++++-- >>>> drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- >>>> drivers/thermal/thermal_debugfs.h | 4 ++-- >>>> 3 files changed, 19 insertions(+), 6 deletions(-) >>>> >>> >>> I'm trying to test it on my board and do the review, but faced issue. >>> For some reason I couldn't apply that patch on the latest bleeding-edge >>> branch. >>> Could you help me in this please? Is there something missing in the >>> patch set (like one more fist patch)? >> >> I messed up the ordering of patches (patch [2/3] should be the last >> one in the series) and on top of that, you'll need a small rebase on >> that patch. >> >> Sorry about this, I'll send a v2. > > Actually, the ordering was OK, but the rebase of the second patch is > still needed. I'll send a v2. Thanks, I've seen it. That v2 applies smoothly and runs on the board. I'll test it and review.
On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > > > On 4/25/24 14:00, Rafael J. Wysocki wrote: > > On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > >> > >> Hi Lukasz, > >> > >> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > >>> > >>> Hi Rafael, > >>> > >>> On 4/23/24 19:00, Rafael J. Wysocki wrote: > >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>>> > >>>> If cdev_dt_seq_show() runs before the first state transition of a cooling > >>>> device, it will not print any state residency information for it, even > >>>> though it might be reasonably expected to print residency information for > >>>> the initial state of the cooling device. > >>>> > >>>> For this reason, rearrange the code to get the initial state of a cooling > >>>> device at the registration time and pass it to thermal_debug_cdev_add(), > >>>> so that the latter can create a duration record for that state which will > >>>> allow cdev_dt_seq_show() to print its residency information. > >>>> > >>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > >>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com> > >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>>> --- > >>>> drivers/thermal/thermal_core.c | 9 +++++++-- > >>>> drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- > >>>> drivers/thermal/thermal_debugfs.h | 4 ++-- > >>>> 3 files changed, 19 insertions(+), 6 deletions(-) > >>>> > >>> > >>> I'm trying to test it on my board and do the review, but faced issue. > >>> For some reason I couldn't apply that patch on the latest bleeding-edge > >>> branch. > >>> Could you help me in this please? Is there something missing in the > >>> patch set (like one more fist patch)? > >> > >> I messed up the ordering of patches (patch [2/3] should be the last > >> one in the series) and on top of that, you'll need a small rebase on > >> that patch. > >> > >> Sorry about this, I'll send a v2. > > > > Actually, the ordering was OK, but the rebase of the second patch is > > still needed. I'll send a v2. > > Thanks, I've seen it. That v2 applies smoothly and runs on the board. > I'll test it and review. Thank you! Please also see https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/ which is actually more urgent because the fixes there address more serious issues (I would even consider them as 6.9-rc material). Thanks!
On 4/25/24 19:53, Rafael J. Wysocki wrote: > On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote: >> >> >> >> On 4/25/24 14:00, Rafael J. Wysocki wrote: >>> On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote: >>>> >>>> Hi Lukasz, >>>> >>>> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: >>>>> >>>>> Hi Rafael, >>>>> >>>>> On 4/23/24 19:00, Rafael J. Wysocki wrote: >>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>>>>> >>>>>> If cdev_dt_seq_show() runs before the first state transition of a cooling >>>>>> device, it will not print any state residency information for it, even >>>>>> though it might be reasonably expected to print residency information for >>>>>> the initial state of the cooling device. >>>>>> >>>>>> For this reason, rearrange the code to get the initial state of a cooling >>>>>> device at the registration time and pass it to thermal_debug_cdev_add(), >>>>>> so that the latter can create a duration record for that state which will >>>>>> allow cdev_dt_seq_show() to print its residency information. >>>>>> >>>>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") >>>>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com> >>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >>>>>> --- >>>>>> drivers/thermal/thermal_core.c | 9 +++++++-- >>>>>> drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- >>>>>> drivers/thermal/thermal_debugfs.h | 4 ++-- >>>>>> 3 files changed, 19 insertions(+), 6 deletions(-) >>>>>> >>>>> >>>>> I'm trying to test it on my board and do the review, but faced issue. >>>>> For some reason I couldn't apply that patch on the latest bleeding-edge >>>>> branch. >>>>> Could you help me in this please? Is there something missing in the >>>>> patch set (like one more fist patch)? >>>> >>>> I messed up the ordering of patches (patch [2/3] should be the last >>>> one in the series) and on top of that, you'll need a small rebase on >>>> that patch. >>>> >>>> Sorry about this, I'll send a v2. >>> >>> Actually, the ordering was OK, but the rebase of the second patch is >>> still needed. I'll send a v2. >> >> Thanks, I've seen it. That v2 applies smoothly and runs on the board. >> I'll test it and review. > > Thank you! > > Please also see > > https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/ > > which is actually more urgent because the fixes there address more > serious issues (I would even consider them as 6.9-rc material). Yes, I've read that patches' headers and glanced through the code. I'll do the review & testing on them as well today.
On Thu, Apr 25, 2024 at 8:55 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > > > On 4/25/24 19:53, Rafael J. Wysocki wrote: > > On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > >> > >> > >> > >> On 4/25/24 14:00, Rafael J. Wysocki wrote: > >>> On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > >>>> > >>>> Hi Lukasz, > >>>> > >>>> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > >>>>> > >>>>> Hi Rafael, > >>>>> > >>>>> On 4/23/24 19:00, Rafael J. Wysocki wrote: > >>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>>>>> > >>>>>> If cdev_dt_seq_show() runs before the first state transition of a cooling > >>>>>> device, it will not print any state residency information for it, even > >>>>>> though it might be reasonably expected to print residency information for > >>>>>> the initial state of the cooling device. > >>>>>> > >>>>>> For this reason, rearrange the code to get the initial state of a cooling > >>>>>> device at the registration time and pass it to thermal_debug_cdev_add(), > >>>>>> so that the latter can create a duration record for that state which will > >>>>>> allow cdev_dt_seq_show() to print its residency information. > >>>>>> > >>>>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > >>>>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com> > >>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >>>>>> --- > >>>>>> drivers/thermal/thermal_core.c | 9 +++++++-- > >>>>>> drivers/thermal/thermal_debugfs.c | 12 ++++++++++-- > >>>>>> drivers/thermal/thermal_debugfs.h | 4 ++-- > >>>>>> 3 files changed, 19 insertions(+), 6 deletions(-) > >>>>>> > >>>>> > >>>>> I'm trying to test it on my board and do the review, but faced issue. > >>>>> For some reason I couldn't apply that patch on the latest bleeding-edge > >>>>> branch. > >>>>> Could you help me in this please? Is there something missing in the > >>>>> patch set (like one more fist patch)? > >>>> > >>>> I messed up the ordering of patches (patch [2/3] should be the last > >>>> one in the series) and on top of that, you'll need a small rebase on > >>>> that patch. > >>>> > >>>> Sorry about this, I'll send a v2. > >>> > >>> Actually, the ordering was OK, but the rebase of the second patch is > >>> still needed. I'll send a v2. > >> > >> Thanks, I've seen it. That v2 applies smoothly and runs on the board. > >> I'll test it and review. > > > > Thank you! > > > > Please also see > > > > https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/ > > > > which is actually more urgent because the fixes there address more > > serious issues (I would even consider them as 6.9-rc material). > > Yes, I've read that patches' headers and glanced through the code. > I'll do the review & testing on them as well today. Thanks, much appreciated!
Index: linux-pm/drivers/thermal/thermal_core.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_core.c +++ linux-pm/drivers/thermal/thermal_core.c @@ -928,6 +928,7 @@ __thermal_cooling_device_register(struct { struct thermal_cooling_device *cdev; struct thermal_zone_device *pos = NULL; + unsigned long current_state; int id, ret; if (!ops || !ops->get_max_state || !ops->get_cur_state || @@ -965,6 +966,10 @@ __thermal_cooling_device_register(struct if (ret) goto out_cdev_type; + ret = cdev->ops->get_cur_state(cdev, ¤t_state); + if (ret) + goto out_cdev_type; + thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); @@ -978,6 +983,8 @@ __thermal_cooling_device_register(struct return ERR_PTR(ret); } + thermal_debug_cdev_add(cdev, current_state); + /* Add 'this' new cdev to the global cdev list */ mutex_lock(&thermal_list_lock); @@ -993,8 +1000,6 @@ __thermal_cooling_device_register(struct mutex_unlock(&thermal_list_lock); - thermal_debug_cdev_add(cdev); - return cdev; out_cooling_dev: Index: linux-pm/drivers/thermal/thermal_debugfs.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_debugfs.c +++ linux-pm/drivers/thermal/thermal_debugfs.c @@ -466,8 +466,9 @@ void thermal_debug_cdev_state_update(con * Allocates a cooling device object for debug, initializes the * statistics and create the entries in sysfs. * @cdev: a pointer to a cooling device + * @state: current state of the cooling device */ -void thermal_debug_cdev_add(struct thermal_cooling_device *cdev) +void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state) { struct thermal_debugfs *thermal_dbg; struct cdev_debugfs *cdev_dbg; @@ -484,9 +485,16 @@ void thermal_debug_cdev_add(struct therm INIT_LIST_HEAD(&cdev_dbg->durations[i]); } - cdev_dbg->current_state = 0; + cdev_dbg->current_state = state; cdev_dbg->timestamp = ktime_get(); + /* + * Create a record for the initial cooling device state, so its + * duration will be printed by cdev_dt_seq_show() as expected if it + * runs before the first state transition. + */ + thermal_debugfs_cdev_record_get(thermal_dbg, cdev_dbg->durations, state); + debugfs_create_file("trans_table", 0400, thermal_dbg->d_top, thermal_dbg, &tt_fops); Index: linux-pm/drivers/thermal/thermal_debugfs.h =================================================================== --- linux-pm.orig/drivers/thermal/thermal_debugfs.h +++ linux-pm/drivers/thermal/thermal_debugfs.h @@ -2,7 +2,7 @@ #ifdef CONFIG_THERMAL_DEBUGFS void thermal_debug_init(void); -void thermal_debug_cdev_add(struct thermal_cooling_device *cdev); +void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state); void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev); void thermal_debug_cdev_state_update(const struct thermal_cooling_device *cdev, int state); void thermal_debug_tz_add(struct thermal_zone_device *tz); @@ -14,7 +14,7 @@ void thermal_debug_tz_trip_down(struct t void thermal_debug_update_stats(struct thermal_zone_device *tz); #else static inline void thermal_debug_init(void) {} -static inline void thermal_debug_cdev_add(struct thermal_cooling_device *cdev) {} +static inline void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state) {} static inline void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) {} static inline void thermal_debug_cdev_state_update(const struct thermal_cooling_device *cdev, int state) {}