mbox series

[v2,00/17] thermal: Rework binding cooling devices to trip points

Message ID 114901234.nniJfEyVGO@rjwysocki.net (mailing list archive)
Headers show
Series thermal: Rework binding cooling devices to trip points | expand

Message

Rafael J. Wysocki Aug. 12, 2024, 1:50 p.m. UTC
Hi Everyone,

This is an update of

https://lore.kernel.org/linux-pm/3134863.CbtlEUcBR6@rjwysocki.net/#r

the cover letter of which was sent separately by mistake:

https://lore.kernel.org/linux-pm/CAJZ5v0jo5vh2uD5t4GqBnN0qukMBG_ty33PB=NiEqigqxzBcsw@mail.gmail.com/

It addresses several (arguably minor) issues that have been reported by
robots or found by inspection in the v1 and takes review feedback into
account.

The first 10 patches in the series are not expected to be controversial,
even though patch [05/17] requires some extra testing and review (if it
turns out to be problematic, it can be deferred without too much hassle).

The other 7 patches are driver changes and code simplifications on top of
them which may require some more time to process.  For this reason, I'm
considering handling the first 10 patches somewhat faster, with the possible
exception of patch [05/17].

Below is the original cover letter mishandled previously.

The code for binding cooling devices to trip points (and unbinding them from
trip point) is one of the murkiest pieces of the thermal subsystem.  It is
convoluted, bloated with unnecessary code doing questionable things, and it
works backwards.

The idea is to bind cooling devices to trip points in accordance with some
information known to the thermal zone owner (thermal driver).  This information
is not known to the thermal core when the thermal zone is registered, so the
driver needs to be involved, but instead of just asking the driver whether
or not the given cooling device should be bound to a given trip point, the
thermal core expects the driver to carry out all of the binding process
including calling functions specifically provided by the core for this
purpose which is cumbersome and counter-intuitive.

Because the driver has no information regarding the representation of the trip
points at the core level, it is forced to walk them (and it has to avoid some
locking traps while doing this), or it needs to make questionable assumptions
regarding the ordering of the trips in the core.  There are drivers doing both
these things.

But there's more.  The size of the binding/unbinding code can be reduced by
simply moving some parts of it around.  Some checks in it are overkill or
redundant.  White space is used inconsistently in it.  Its locking can be
made more straightforward.

Moreover, overhead can be reduced, especially in governors, if the lists of
thermal instances representing the bindings between cooling devices and trip
points are moved from thermal zone objects to trip descriptors.

The first 7 patches in the series deal with the minor issues listed above in
preparation for a more substantial change which is the introduction of a new
thermal operation, called .should_bind(), that will allow the core to do
exactly what it needs: as the driver whether or not the given cooling device
should be bound to a given trip, in patch [08/17].

Patch [09/17] makes the ACPI thermal driver use .should_bind() instead of
the .bind() and .unbind() operations which is a substantial simplification.

Patch [10/17] unexports two core functions previously used by the ACPI driver
that can be static now.

Patches [11-14/17] modify the remaining drivers implementing .bind() and
.undind() to use .should_bind() instead of them which results in significant
simplifications of the code.

The remaining 3 patches carry out cleanups that can be done after all of the
previous changes, resulting if further code size reductions.

Thanks!

Comments

Rafael J. Wysocki Aug. 13, 2024, 4:55 p.m. UTC | #1
Hi Everyone,

On Mon, Aug 12, 2024 at 4:01 PM Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>
> This is an update of
>
> https://lore.kernel.org/linux-pm/3134863.CbtlEUcBR6@rjwysocki.net/#r
>
> the cover letter of which was sent separately by mistake:
>
> https://lore.kernel.org/linux-pm/CAJZ5v0jo5vh2uD5t4GqBnN0qukMBG_ty33PB=NiEqigqxzBcsw@mail.gmail.com/
>
> It addresses several (arguably minor) issues that have been reported by
> robots or found by inspection in the v1 and takes review feedback into
> account.
>
> The first 10 patches in the series are not expected to be controversial,
> even though patch [05/17] requires some extra testing and review (if it
> turns out to be problematic, it can be deferred without too much hassle).
>
> The other 7 patches are driver changes and code simplifications on top of
> them which may require some more time to process.  For this reason, I'm
> considering handling the first 10 patches somewhat faster, with the possible
> exception of patch [05/17].

Since patch [04/17] in this series turned out to be broken, I decided
to drop it and rearrange the rest.  This mostly affects patch [05/17]
and patch [17/17] which is the only one that really depends on the
former.

Of course, patch [05/17] also clashes with the Bang-bang governor
changes that have just been posted:

https://lore.kernel.org/linux-pm/1903691.tdWV9SEqCh@rjwysocki.net/

For this reason, please skip patches [04/17] (broken), [05/17] (needs
to be rebased), and [17/17] (likewise) for now and focus on the rest
of the series which still is applicable (a couple of patches need to
be rebased slightly).

I'll send a rearranged v3 next week (I'll be mostly offline for the
rest of this week), most likely without the patches mentioned above
(I'll get back to them later).

Thanks!