Message ID | 4918025.31r3eYUQgx@kreacher (mailing list archive) |
---|---|
Headers | show |
Series | thermal/debugfs: Fix and clean up trip point statistics updates | expand |
Hi Rafael, On 4/17/24 14:07, Rafael J. Wysocki wrote: > Hi Everyone, > > The first patch in this series addresses the problem of updating trip > point statistics prematurely for trip points that have just been > crossed on the way down (please see the patch changelog for details). > > The way it does that renders the following cleanup patch inapplicable: > > https://lore.kernel.org/linux-pm/2321994.ElGaqSPkdT@kreacher/ > > The remaining two patches in the series are cleanups on top of the > first one. > > This series is based on an older patch series posted last week: > > https://lore.kernel.org/linux-pm/13515747.uLZWGnKmhe@kreacher/ > > but it can be trivially rebased on top of the current linux-next. > > Thanks! > > > I've checked this patch patch set on top of your bleeding-edge which has thermal re-work as well. The patch set looks good and works properly. Although, I have found some issue in this debug info files and I'm not sure if this is expected or not. If not I can address this and send some small fix for it. When I read the cooling device residency statistics, I don't get updates for the first time the state is used. It can only be counted when that state was known and finished it's usage. IMO it is not the right behavior, isn't it? Experiment: My trip points are 70degC and 75degC and I'm setting emulated temperature to cross them and get cooling states 1 then 0. As you can see the statistics counter only starts showing value after after trip crossing down. ------------------------------------8<----------------------------------- root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency root@arm:~# root@arm:~# root@arm:~# echo 71000 > /sys/class/thermal/thermal_zone0/emul_temp root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency root@arm:~# echo 76000 > /sys/class/thermal/thermal_zone0/emul_temp root@arm:~# root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 518197 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 518197 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 518197 root@arm:~# echo 71000 > /sys/class/thermal/thermal_zone0/emul_temp root@arm:~# root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 520066 1 17567 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 522653 1 17567 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 526151 1 17567 root@arm:~# echo 66000 > /sys/class/thermal/thermal_zone0/emul_temp root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 537366 1 17567 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 544936 1 17567 root@arm:~# cat /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms State Residency 0 556694 1 17567 root@arm:~# ------------------------------->8---------------------------------------- Please let me know what do you think about that behavior. Regards, Lukasz
Hi Lukasz, On Mon, Apr 22, 2024 at 1:37 PM Lukasz Luba <lukasz.luba@arm.com> wrote: > > Hi Rafael, > > On 4/17/24 14:07, Rafael J. Wysocki wrote: > > Hi Everyone, > > > > The first patch in this series addresses the problem of updating trip > > point statistics prematurely for trip points that have just been > > crossed on the way down (please see the patch changelog for details). > > > > The way it does that renders the following cleanup patch inapplicable: > > > > https://lore.kernel.org/linux-pm/2321994.ElGaqSPkdT@kreacher/ > > > > The remaining two patches in the series are cleanups on top of the > > first one. > > > > This series is based on an older patch series posted last week: > > > > https://lore.kernel.org/linux-pm/13515747.uLZWGnKmhe@kreacher/ > > > > but it can be trivially rebased on top of the current linux-next. > > > > Thanks! > > > > > > > > I've checked this patch patch set on top of your bleeding-edge > which has thermal re-work as well. The patch set looks good > and works properly. > > Although, I have found some issue in this debug info files and > I'm not sure if this is expected or not. If not I can address this > and send some small fix for it. > > When I read the cooling device residency statistics, I don't > get updates for the first time the state is used. It can only > be counted when that state was known and finished it's usage. > > IMO it is not the right behavior, isn't it? I agree, it is not. > Experiment: > My trip points are 70degC and 75degC and I'm setting emulated > temperature to cross them and get cooling states 1 then 0. > As you can see the statistics counter only starts showing value after > after trip crossing down. > ------------------------------------8<----------------------------------- > > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > root@arm:~# > root@arm:~# > root@arm:~# echo 71000 > /sys/class/thermal/thermal_zone0/emul_temp > > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > root@arm:~# echo 76000 > /sys/class/thermal/thermal_zone0/emul_temp > > root@arm:~# > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 518197 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 518197 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 518197 > root@arm:~# echo 71000 > /sys/class/thermal/thermal_zone0/emul_temp > > root@arm:~# > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 520066 > 1 17567 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 522653 > 1 17567 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 526151 > 1 17567 > root@arm:~# echo 66000 > /sys/class/thermal/thermal_zone0/emul_temp > > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 537366 > 1 17567 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 544936 > 1 17567 > root@arm:~# cat > /sys/kernel/debug/thermal/cooling_devices/0/time_in_state_ms > State Residency > 0 556694 > 1 17567 > root@arm:~# > > ------------------------------->8---------------------------------------- > > Please let me know what do you think about that behavior. If you have thought about fixing it already, please do so. Thanks!
On 22/04/2024 13:37, Lukasz Luba wrote: > Hi Rafael, > > On 4/17/24 14:07, Rafael J. Wysocki wrote: >> Hi Everyone, >> >> The first patch in this series addresses the problem of updating trip >> point statistics prematurely for trip points that have just been >> crossed on the way down (please see the patch changelog for details). >> >> The way it does that renders the following cleanup patch inapplicable: >> >> https://lore.kernel.org/linux-pm/2321994.ElGaqSPkdT@kreacher/ >> >> The remaining two patches in the series are cleanups on top of the >> first one. >> >> This series is based on an older patch series posted last week: >> >> https://lore.kernel.org/linux-pm/13515747.uLZWGnKmhe@kreacher/ >> >> but it can be trivially rebased on top of the current linux-next. >> >> Thanks! >> >> >> > > I've checked this patch patch set on top of your bleeding-edge > which has thermal re-work as well. The patch set looks good > and works properly. > > Although, I have found some issue in this debug info files and > I'm not sure if this is expected or not. If not I can address this > and send some small fix for it. > > When I read the cooling device residency statistics, I don't > get updates for the first time the state is used. It can only > be counted when that state was known and finished it's usage. > > IMO it is not the right behavior, isn't it? Do you mean the right behavior is a regression or we should expect at least the residency to be showed even if the mitigation state is not closed ?
On Mon, Apr 22, 2024 at 5:34 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 22/04/2024 13:37, Lukasz Luba wrote: > > Hi Rafael, > > > > On 4/17/24 14:07, Rafael J. Wysocki wrote: > >> Hi Everyone, > >> > >> The first patch in this series addresses the problem of updating trip > >> point statistics prematurely for trip points that have just been > >> crossed on the way down (please see the patch changelog for details). > >> > >> The way it does that renders the following cleanup patch inapplicable: > >> > >> https://lore.kernel.org/linux-pm/2321994.ElGaqSPkdT@kreacher/ > >> > >> The remaining two patches in the series are cleanups on top of the > >> first one. > >> > >> This series is based on an older patch series posted last week: > >> > >> https://lore.kernel.org/linux-pm/13515747.uLZWGnKmhe@kreacher/ > >> > >> but it can be trivially rebased on top of the current linux-next. > >> > >> Thanks! > >> > >> > >> > > > > I've checked this patch patch set on top of your bleeding-edge > > which has thermal re-work as well. The patch set looks good > > and works properly. > > > > Although, I have found some issue in this debug info files and > > I'm not sure if this is expected or not. If not I can address this > > and send some small fix for it. > > > > When I read the cooling device residency statistics, I don't > > get updates for the first time the state is used. It can only > > be counted when that state was known and finished it's usage. > > > > IMO it is not the right behavior, isn't it? > > Do you mean the right behavior is a regression It has not changed AFAICS. > or we should expect at least the residency to be showed even if the > mitigation state is not closed ? Well, in fact the device has already been in that state for some time and the mitigation can continue for a while.
On 22/04/2024 17:48, Rafael J. Wysocki wrote: > On Mon, Apr 22, 2024 at 5:34 PM Daniel Lezcano [ ... ] >> or we should expect at least the residency to be showed even if the >> mitigation state is not closed ? > > Well, in fact the device has already been in that state for some time > and the mitigation can continue for a while. Yes, but when to update the residency time ? When we cross a trip point point ? or When we read the information ? The former is what we are currently doing AFAIR and the latter must add the delta between the last update and the current time for the current state, right ?
On Mon, Apr 22, 2024 at 6:12 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote: > > On 22/04/2024 17:48, Rafael J. Wysocki wrote: > > On Mon, Apr 22, 2024 at 5:34 PM Daniel Lezcano > > [ ... ] > > >> or we should expect at least the residency to be showed even if the > >> mitigation state is not closed ? > > > > Well, in fact the device has already been in that state for some time > > and the mitigation can continue for a while. > > Yes, but when to update the residency time ? > > When we cross a trip point point ? > > or > > When we read the information ? > > The former is what we are currently doing AFAIR Not really. Records are added by thermal_debug_cdev_state_update() which only runs when __thermal_cdev_update() is called, ie. from governors. Moreover, it assumes the initial state to be 0 and checks if the new state is equal to the current one before doing anything else, so it will not make a record for the 0 state until the first transition. > and the latter must add the delta between the last update and the current time for the current > state, right ? Yes, and it is doing this already AFAICS. The problem is that it only creates a record for the old state, so if the new one is seen for the first time, there will be no record for it until it changes to some other state. The appended patch (modulo GMail-induced white space breakage) should help with this, but the initial state handling needs to be addressed separately. --- drivers/thermal/thermal_debugfs.c | 8 ++++++++ 1 file changed, 8 insertions(+) Index: linux-pm/drivers/thermal/thermal_debugfs.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_debugfs.c +++ linux-pm/drivers/thermal/thermal_debugfs.c @@ -433,6 +433,14 @@ void thermal_debug_cdev_state_update(con } cdev_dbg->current_state = new_state; + + /* + * Create a record for the new state if it is not there, so its + * duration will be printed by cdev_dt_seq_show() as expected if it + * runs before the next state transition. + */ + thermal_debugfs_cdev_record_get(thermal_dbg, cdev_dbg->durations, new_state); + transition = (old_state << 16) | new_state; /*
On 4/23/24 13:26, Rafael J. Wysocki wrote: > On Mon, Apr 22, 2024 at 6:12 PM Daniel Lezcano > <daniel.lezcano@linaro.org> wrote: >> >> On 22/04/2024 17:48, Rafael J. Wysocki wrote: >>> On Mon, Apr 22, 2024 at 5:34 PM Daniel Lezcano >> >> [ ... ] >> >>>> or we should expect at least the residency to be showed even if the >>>> mitigation state is not closed ? >>> >>> Well, in fact the device has already been in that state for some time >>> and the mitigation can continue for a while. >> >> Yes, but when to update the residency time ? >> >> When we cross a trip point point ? >> >> or >> >> When we read the information ? >> >> The former is what we are currently doing AFAIR > > Not really. > > Records are added by thermal_debug_cdev_state_update() which only runs > when __thermal_cdev_update() is called, ie. from governors. > > Moreover, it assumes the initial state to be 0 and checks if the new > state is equal to the current one before doing anything else, so it > will not make a record for the 0 state until the first transition. Correct, AFAIKS. > >> and the latter must add the delta between the last update and the current time for the current >> state, right ? > > Yes, and it is doing this already AFAICS. > > The problem is that it only creates a record for the old state, so if > the new one is seen for the first time, there will be no record for it > until it changes to some other state. Exactly, it's not totally wrong what we have now, just some missing part that needs to be added in the code, while we are counting/updating these stats. > > The appended patch (modulo GMail-induced white space breakage) should > help with this, but the initial state handling needs to be addressed > separately. > > --- > drivers/thermal/thermal_debugfs.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > Index: linux-pm/drivers/thermal/thermal_debugfs.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c > +++ linux-pm/drivers/thermal/thermal_debugfs.c > @@ -433,6 +433,14 @@ void thermal_debug_cdev_state_update(con > } > > cdev_dbg->current_state = new_state; > + > + /* > + * Create a record for the new state if it is not there, so its > + * duration will be printed by cdev_dt_seq_show() as expected if it > + * runs before the next state transition. > + */ > + thermal_debugfs_cdev_record_get(thermal_dbg, cdev_dbg->durations, > new_state); > + > transition = (old_state << 16) | new_state; > > /* Yes, something like this should do the trick. We will get the record entry in the list, so we at least enter the list loop in the cdev_dt_seq_show().