[v2,0/4] thermal: core/ACPI: Fix processor cooling device regression

Message ID 2692681.mvXUDI8C0e@kreacher (mailing list archive)

Message

Rafael J. Wysocki March 13, 2023, 2:24 p.m. UTC
Hi All,

The first revision of this patch series was posted as

https://lore.kernel.org/linux-pm/2148907.irdbgypaU6@kreacher/

As reported by Rui in this thread:

Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/

some recent changes in the thermal core cause the CPU cooling devices
registered by the ACPI processor driver to become unusable in some cases
and somewhat crippled in general.

The problem is that the ACPI processor driver changes its ->get_max_state()
callback return value depending on whether or not cpufreq is available and
there is a cpufreq policy for a given CPU.  However, the thermal core has
always assumed that the return value of that callback will not change, which
in fact is relied on by the cooling device statistics code.  In particular,
when the value returned by ->get_max_state() grows, the memory buffer allocated
for storing the statistics will be too small and corruption may ensue as a
result.

For this reason, the issue needs to be addressed in the ACPI processor driver
and not in the thermal core, but the core needs to help somewhat too.  Namely,
it needs to provide a helper allowing an interested driver to update the
max_state value for an already registered cooling device in certain situations,
which will also cause the statistics to be rebuilt.
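
To make the failure mode and the proposed remedy concrete, here is a minimal userspace sketch. It is not the kernel code: struct cdev_model and the cdev_model_*() functions are hypothetical names invented for illustration, and the only assumption taken from the text above is that the statistics buffer is sized from max_state at registration time and has to be rebuilt when max_state changes.

```c
#include <stdlib.h>

/* Hypothetical userspace model of a cooling device with per-state statistics. */
struct cdev_model {
	unsigned long max_state;      /* value cached when the stats were allocated */
	unsigned long *time_in_state; /* max_state + 1 counters */
};

/* Allocate statistics sized for the max_state known at registration time. */
static int cdev_model_register(struct cdev_model *cdev, unsigned long max_state)
{
	cdev->time_in_state = calloc(max_state + 1, sizeof(*cdev->time_in_state));
	if (!cdev->time_in_state)
		return -1;
	cdev->max_state = max_state;
	return 0;
}

/* Account time for a state; reject states the buffer was not sized for. */
static int cdev_model_account(struct cdev_model *cdev, unsigned long state,
			      unsigned long ticks)
{
	if (state > cdev->max_state)
		return -1; /* without this check the write would overflow the buffer */
	cdev->time_in_state[state] += ticks;
	return 0;
}

/* Model of an update helper: adopt the new max_state and rebuild the stats. */
static int cdev_model_update(struct cdev_model *cdev, unsigned long new_max_state)
{
	unsigned long *stats = calloc(new_max_state + 1, sizeof(*stats));

	if (!stats)
		return -1;
	free(cdev->time_in_state); /* old counters are discarded, not migrated */
	cdev->time_in_state = stats;
	cdev->max_state = new_max_state;
	return 0;
}
```

In this model, accounting for state 7 on a device registered with max_state 3 is rejected until cdev_model_update() has been called with the larger value, after which all counters start again from zero.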

This series implements the above; for details, please refer to the individual
patch changelogs.

Thanks!

Comments

Zhang, Rui March 13, 2023, 4:47 p.m. UTC | #1
Hi, Rafael,

The only concern to me is that, in thermal_cooling_device_update(), we
should handle the case where the cooling device is currently used by
one or more thermal zones, say, something like

list_for_each_entry(pos, &cdev->thermal_instances, cdev_node) {
	/* e.g. what to do if tz1 set it to state 1 previously */
}

I do not have a clear idea of what we should do here.

But given that I have confirmed that this patch series fixes the
original problem, and that ACPI passive cooling is unlikely to be
triggered before the CPUFREQ_CREATE_POLICY notification, we can probably
address that problem later.

Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

Rafael J. Wysocki March 13, 2023, 6:02 p.m. UTC | #2
On Mon, Mar 13, 2023 at 5:47 PM Zhang, Rui <rui.zhang@intel.com> wrote:
>
> Hi, Rafael,
>
> The only concern to me is that, in thermal_cooling_device_update(), we
> should handle the cases that the cooling device is current used by
> one/more thermal zone. say, something like
>
> list_for_each_entry(pos, &cdev->thermal_instances, cdev_node) {
>         /* e.g. what to do if tz1 set it to state 1 previously */
> }
> I have not got a clear idea what we should do here.

For each instance, I'd say: set upper to max_state if upper is above it,
and then set target to upper if target is above it.
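
A minimal sketch of that clamping rule, written as a standalone userspace model rather than kernel code. struct instance_model, instance_model_clamp() and MODEL_NO_TARGET are hypothetical names; treating a "no target" sentinel as something to leave untouched is an assumption made here so that an unset target is not mistaken for a very large one.

```c
#include <limits.h>

/* Hypothetical sentinel meaning "no target has been requested yet". */
#define MODEL_NO_TARGET ULONG_MAX

/* Hypothetical, simplified stand-in for a thermal instance. */
struct instance_model {
	unsigned long upper;  /* highest state this instance may request */
	unsigned long target; /* state currently requested, or MODEL_NO_TARGET */
};

/* Clamp one instance after the cooling device's max_state has changed. */
static void instance_model_clamp(struct instance_model *inst,
				 unsigned long max_state)
{
	/* upper must never exceed what the device supports. */
	if (inst->upper > max_state)
		inst->upper = max_state;

	/* Assumption: an unset target stays unset instead of being clamped. */
	if (inst->target == MODEL_NO_TARGET)
		return;

	/* target must never exceed upper. */
	if (inst->target > inst->upper)
		inst->target = inst->upper;
}
```

In the loop from the original question, this would be applied once per entry in the cooling device's list of instances. Note that it only handles max_state shrinking below upper; it does not raise upper when max_state grows.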

I guess otherwise there may be some confusion in principle and I have
missed that piece, so thanks for pointing it out!

> But given that I have confirmed that this patch series fixes the
> original problem, and the ACPI passive cooling is unlikely to be
> triggered before CPUFREQ_CREATE_POLICY notification, probably we can
> address that problem later.
>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Reviewed-by: Zhang Rui <rui.zhang@intel.com>

Thank you!
Zhang, Rui March 14, 2023, 2:02 a.m. UTC | #3
On Mon, 2023-03-13 at 19:02 +0100, Rafael J. Wysocki wrote:
> On Mon, Mar 13, 2023 at 5:47 PM Zhang, Rui <rui.zhang@intel.com>
> wrote:
> > Hi, Rafael,
> > 
> > The only concern to me is that, in thermal_cooling_device_update(),
> > we
> > should handle the cases that the cooling device is current used by
> > one/more thermal zone. say, something like
> > 
> > list_for_each_entry(pos, &cdev->thermal_instances, cdev_node) {
> >         /* e.g. what to do if tz1 set it to state 1 previously */
> > }
> > I have not got a clear idea what we should do here.
> 
> For each instance, set upper to max_state if above it and set target
> to upper if above it I'd say.
> 

Say, before update,
max_state: 3
target: 1
upper: 3 (because upper == THERMAL_NO_LIMIT during binding)

then, after update
max_state: 7
target: ?
upper: ?

Maybe we should unbind and rebind, and then set target
to THERMAL_NO_TARGET? It is really the governor that should set the
target.
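
To spell out the numbers above, here is a small userspace sketch of the unbind-and-rebind idea. It is only a model: struct binding_model, binding_model_bind(), binding_model_rebind() and the MODEL_* sentinels are hypothetical, and keeping the originally requested upper limit around so that "no limit" can be resolved again is an assumption of this sketch, not something the thread has settled on.

```c
#include <limits.h>

/* Hypothetical stand-ins for the "no limit" and "no target" sentinels. */
#define MODEL_NO_LIMIT  ULONG_MAX
#define MODEL_NO_TARGET ULONG_MAX

/* Hypothetical model of one thermal zone's binding to a cooling device. */
struct binding_model {
	unsigned long requested_upper; /* what the binder asked for */
	unsigned long upper;           /* resolved against max_state */
	unsigned long target;          /* left for the governor to set */
};

/* Bind: resolve "no limit" to the device's current max_state. */
static void binding_model_bind(struct binding_model *b,
			       unsigned long requested_upper,
			       unsigned long max_state)
{
	b->requested_upper = requested_upper;
	b->upper = (requested_upper == MODEL_NO_LIMIT) ? max_state
						       : requested_upper;
	b->target = MODEL_NO_TARGET;
}

/* Rebind after a max_state change: resolve upper again, drop the old target. */
static void binding_model_rebind(struct binding_model *b,
				 unsigned long new_max_state)
{
	binding_model_bind(b, b->requested_upper, new_max_state);
}
```

With max_state 3, binding with "no limit" resolves upper to 3. If the target is then set to 1 and max_state changes to 7, rebinding resolves upper to 7 and resets the target to the "no target" sentinel, leaving the next decision to the governor.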

> I guess otherwise there may be some confusion in principle and I have
> missed that piece, so thanks for pointing it out!
> 
> > But given that I have confirmed that this patch series fixes the
> > original problem, and the ACPI passive cooling is unlikely to be
> > triggered before CPUFREQ_CREATE_POLICY notification, probably we
> > can
> > address that problem later.
> > 
> > Tested-by: Zhang Rui <rui.zhang@intel.com>
> > Reviewed-by: Zhang Rui <rui.zhang@intel.com>
> 
> 
I recall that patchwork used to catch these tags here and apply them
to every patch in the series, so the tags were appended automatically
when applying the patches. But it apparently does not work now.

Let me reply to the patches one by one.

thanks,
rui