Message ID | 20180829224014.70036-1-mka@chromium.org (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
Series | [RFC] PM/OPP: Add disable count | expand |
Any comments on this? On Wed, Aug 29, 2018 at 03:40:14PM -0700, Matthias Kaehlcke wrote: > The devfreq sub-system uses the OPP API to determine which frequencies are > available. A devfreq cooling device (drivers/thermal/devfreq_cooling.c) > may disable certain OPPs depending on temperature input and the thermal > policy. > > The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) > can also limit the frequency range of devfreq devices. The current > version does this by introducing a function similar to cpufreq's > cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi > wants the throttler to use the OPP interface for limiting the frequency > range. > A devfreq cooling device and the throttler manipulate OPPs independently > from each other, one might enable an OPP, the other disable it. This patch > adds a 'disable count' to the OPP, to allow to track how often the OPP > was disabled without a subsequent opp_enable(). The OPP is considered > disabled when the disable count is > 0 , even directly after a > opp_enable() call. > > This patch is an initial draft, intended to get feedback from OPP > maintainers on the general idea. > > Suggested-by: Chanwoo Choi <cw00.choi@samsung.com> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org> > --- > > Even though I sent this patch I have reservations about it, some of > general nature. and others in the context of devfreq/throttler/cooling > device: > > - possible race conditions, devfreq may end up using a disabled OPP > (cooling device and throttler run asynchronously to devfreq). > This is not really a new issue though, I think this can already > happen today with devfreq and cooling device. It doesn't cause > problems since currently disabling an OPP doesn't results in disabling > regulators, clocks or other components > > - all OPPs could end up disabled since multiple drivers manipulate OPPs > independently without any arbitration > > - a crashing/buggy driver can leave OPPs disabled forever > > - a typical caller would expect a OPP to be enabled after opp_enable(), > this is not always the case > > - added complexity for callers of opp_enable/disable() to keep track > which OPPs they enabled/disabled previously > --- > drivers/opp/core.c | 9 +++++++++ > drivers/opp/opp.h | 3 +++ > 2 files changed, 12 insertions(+) > > diff --git a/drivers/opp/core.c b/drivers/opp/core.c > index 31ff03dbeb83..a6d23e8560d2 100644 > --- a/drivers/opp/core.c > +++ b/drivers/opp/core.c > @@ -1072,6 +1072,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, > > if (!_opp_supported_by_regulators(new_opp, opp_table)) { > new_opp->available = false; > + new_opp->disable_count = 1; > dev_warn(dev, "%s: OPP not supported by regulators (%lu)\n", > __func__, new_opp->rate); > } > @@ -1592,10 +1593,18 @@ static int _opp_set_availability(struct device *dev, unsigned long freq, > goto unlock; > } > > + if (availability_req) > + opp->disable_count--; > + else > + opp->disable_count++; > + > /* Is update really needed? */ > if (opp->available == availability_req) > goto unlock; > > + if (availability_req && opp->disable_count > 0) > + goto unlock; > + > opp->available = availability_req; > > dev_pm_opp_get(opp); > diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h > index 7c540fd063b2..a72acb16c364 100644 > --- a/drivers/opp/opp.h > +++ b/drivers/opp/opp.h > @@ -55,6 +55,8 @@ extern struct list_head opp_tables; > * order. > * @kref: for reference count of the OPP. > * @available: true/false - marks if this OPP as available or not > + * @disable_count: tracks the number of times the OPP has been disabled > + * (without re-enabling it). > * @dynamic: not-created from static DT entries. > * @turbo: true if turbo (boost) OPP > * @suspend: true if suspend OPP > @@ -74,6 +76,7 @@ struct dev_pm_opp { > struct kref kref; > > bool available; > + unsigned int disable_count; > bool dynamic; > bool turbo; > bool suspend;
Hi Matthias, On 29-08-18, 15:40, Matthias Kaehlcke wrote: > The devfreq sub-system uses the OPP API to determine which frequencies are > available. A devfreq cooling device (drivers/thermal/devfreq_cooling.c) > may disable certain OPPs depending on temperature input and the thermal > policy. I looked at it and the way it is doing it currently has certain assumptions. Like it can enable an already enabled OPP without any side effects, which may not be true after this patch. > The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) > can also limit the frequency range of devfreq devices. The current > version does this by introducing a function similar to cpufreq's > cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi I don't like this interface and have plans on updating it with something where a user asks for some constraints and then releases them at a later point of time. That will make cpufreq and other frameworks manage things in a better way. > wants the throttler to use the OPP interface for limiting the frequency > range. I am not sure if that would be the right approach as well. Maybe something similar to what I wrote earlier. For example all throttlers can provide min/max frequencies they support at a certain point of time and the devfreq core accumulates all such requests and finds the right target frequency. You wouldn't require to disable OPPs then to make it work. > A devfreq cooling device and the throttler manipulate OPPs independently > from each other, one might enable an OPP, the other disable it. This patch > adds a 'disable count' to the OPP, to allow to track how often the OPP > was disabled without a subsequent opp_enable(). The OPP is considered > disabled when the disable count is > 0 , even directly after a > opp_enable() call. > > This patch is an initial draft, intended to get feedback from OPP > maintainers on the general idea. > > Suggested-by: Chanwoo Choi <cw00.choi@samsung.com> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org> > --- > > Even though I sent this patch I have reservations about it, some of > general nature. and others in the context of devfreq/throttler/cooling > device: > > - possible race conditions, devfreq may end up using a disabled OPP > (cooling device and throttler run asynchronously to devfreq). > This is not really a new issue though, I think this can already > happen today with devfreq and cooling device. It doesn't cause > problems since currently disabling an OPP doesn't results in disabling > regulators, clocks or other components > > - all OPPs could end up disabled since multiple drivers manipulate OPPs > independently without any arbitration > > - a crashing/buggy driver can leave OPPs disabled forever > > - a typical caller would expect a OPP to be enabled after opp_enable(), > this is not always the case > > - added complexity for callers of opp_enable/disable() to keep track > which OPPs they enabled/disabled previously Yeah, its unnecessary complexity on a framework like OPP which should normally be a helper library to provide data. > --- > drivers/opp/core.c | 9 +++++++++ > drivers/opp/opp.h | 3 +++ > 2 files changed, 12 insertions(+) > > diff --git a/drivers/opp/core.c b/drivers/opp/core.c > index 31ff03dbeb83..a6d23e8560d2 100644 > --- a/drivers/opp/core.c > +++ b/drivers/opp/core.c > @@ -1072,6 +1072,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, > > if (!_opp_supported_by_regulators(new_opp, opp_table)) { > new_opp->available = false; > + new_opp->disable_count = 1; > dev_warn(dev, "%s: OPP not supported by regulators (%lu)\n", > __func__, new_opp->rate); > } > @@ -1592,10 +1593,18 @@ static int _opp_set_availability(struct device *dev, unsigned long freq, > goto unlock; > } > > + if (availability_req) > + opp->disable_count--; what if disable_count becomes < 0 ? Is that a bug ? > + else > + opp->disable_count++; > + > /* Is update really needed? */ > if (opp->available == availability_req) > goto unlock; > > + if (availability_req && opp->disable_count > 0) > + goto unlock; > + > opp->available = availability_req; > > dev_pm_opp_get(opp); > diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h > index 7c540fd063b2..a72acb16c364 100644 > --- a/drivers/opp/opp.h > +++ b/drivers/opp/opp.h > @@ -55,6 +55,8 @@ extern struct list_head opp_tables; > * order. > * @kref: for reference count of the OPP. > * @available: true/false - marks if this OPP as available or not > + * @disable_count: tracks the number of times the OPP has been disabled > + * (without re-enabling it). > * @dynamic: not-created from static DT entries. > * @turbo: true if turbo (boost) OPP > * @suspend: true if suspend OPP > @@ -74,6 +76,7 @@ struct dev_pm_opp { > struct kref kref; > > bool available; > + unsigned int disable_count; > bool dynamic; > bool turbo; > bool suspend; > -- > 2.19.0.rc0.228.g281dcd1b4d0-goog
Hi Viresh, thanks for your feedback! Some comments inline On Mon, Oct 01, 2018 at 02:56:44PM +0530, Viresh Kumar wrote: > Hi Matthias, > > On 29-08-18, 15:40, Matthias Kaehlcke wrote: > > The devfreq sub-system uses the OPP API to determine which frequencies are > > available. A devfreq cooling device (drivers/thermal/devfreq_cooling.c) > > may disable certain OPPs depending on temperature input and the thermal > > policy. > > I looked at it and the way it is doing it currently has certain assumptions. > Like it can enable an already enabled OPP without any side effects, which may > not be true after this patch. Yes, devfreq_cooling.c would need to be adapted to keep track of which OPPs it previously en/disabled, which is one of the reasons why I'm reluctant about this change. > > The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) > > can also limit the frequency range of devfreq devices. The current > > version does this by introducing a function similar to cpufreq's > > cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi > > I don't like this interface and have plans on updating it with something where a > user asks for some constraints and then releases them at a later point of time. > That will make cpufreq and other frameworks manage things in a better way. At first sight this doesn't sound much different, only that cpufreq (or a meta-framework) retains information about the constraints. Also the constraints could contain additional information (e.g. it's a thermal or user defined constraint), which might facilitate the resolution of possible conflicts. Do you have a guesstimate on when you might find time for this? > > wants the throttler to use the OPP interface for limiting the frequency > > range. > > I am not sure if that would be the right approach as well. Maybe something > similar to what I wrote earlier. For example all throttlers can provide min/max > frequencies they support at a certain point of time and the devfreq core > accumulates all such requests and finds the right target frequency. You wouldn't > require to disable OPPs then to make it work. Do you think the same 'constraint framework' could be used for cpufreq and devfreq? The problem appears to be very similar for both sub-systems, having a single solution seems desirable unless there are major caveats. From what you described I came up with the following draft for a possible API for a common framework: enum FreqConstraintType { // constraint from the thermal framework, typically max freq FREQ_CONSTRAINT_THERMAL, // userspace contraint, min or max freq FREQ_CONSTRAINT_USER, // constraint to reduce power consumption, typically max freq FREQ_CONSTRAINT_POWER, // constraint to ensure a certain performance, typically min freq FREQ_CONSTRAINT_PERFORMANCE, }; /* * a single contraint * * since most constraints either impose a max (e.g. thermal) or a min * (e.g. performance) it would also be an option to use a single * frequency here instead of a pair and derive whether it is min/max from * the type (and have USER_MIN and USER_MAX) */ struct freq_constraint { unsigned long min_freq; unsigned long max_freq; enum FreqConstraintType type; }; /* set of constraints for a device/CPU/... */ struct freq_constraint_set { struct idr *idr; }; /* compound constraint after conflict resolution */ struct compound_freq_constraint { unsigned long min; unsigned long max; }; // returns an IDR int freq_constraints_add(struct freq_constraint_set *fcs, enum FreqConstraintType type, unsigned long min_freq, unsigned long max_freq); int freq_constraints_update(struct freq_constraint_set *fcs, int constraint_id, unsigned long min_freq, unsigned long max_freq); int freq_constraints_remove(struct freq_constraint_set *fcs, int constraint_id) // determines the compound constrained, resolves conflicts if necessary int freq_constraints_get_compound_constraint(struct freq_constraint_set *fcs, struct compound_freq_constraint *constraint); For conflict resolution the order of preference could be: max freq: thermal > user > power min freq: user > performance Or in prose: If userspace sets a max frequency through sysfs that is higher than the one requested by the thermal framework, the constraint of the thermal framework is used. A 'power' constraint is loosely related with thermal constraints, but its purpose is less to avoid possible damage than to reduce power consumption (the exact reasons may vary, this would be one of our real world throttler use cases). If userspace decides to overwrite a 'power' constraint let them have their way. A 'performance' constraint could be set by the driver that owns a devfreq device, to request a minimum frequency to perform a certain task. If userspace thinks they are smarter allow it and let them deal with the possible consequences. Not sure if 'performance' constraints are really needed. Does this in general seem reasonable and is in line with what you envision? > > A devfreq cooling device and the throttler manipulate OPPs independently > > from each other, one might enable an OPP, the other disable it. This patch > > adds a 'disable count' to the OPP, to allow to track how often the OPP > > was disabled without a subsequent opp_enable(). The OPP is considered > > disabled when the disable count is > 0 , even directly after a > > opp_enable() call. > > > > This patch is an initial draft, intended to get feedback from OPP > > maintainers on the general idea. > > > > Suggested-by: Chanwoo Choi <cw00.choi@samsung.com> > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org> > > --- > > > > Even though I sent this patch I have reservations about it, some of > > general nature. and others in the context of devfreq/throttler/cooling > > device: > > > > - possible race conditions, devfreq may end up using a disabled OPP > > (cooling device and throttler run asynchronously to devfreq). > > This is not really a new issue though, I think this can already > > happen today with devfreq and cooling device. It doesn't cause > > problems since currently disabling an OPP doesn't results in disabling > > regulators, clocks or other components > > > > - all OPPs could end up disabled since multiple drivers manipulate OPPs > > independently without any arbitration > > > > - a crashing/buggy driver can leave OPPs disabled forever > > > > - a typical caller would expect a OPP to be enabled after opp_enable(), > > this is not always the case > > > > - added complexity for callers of opp_enable/disable() to keep track > > which OPPs they enabled/disabled previously > > Yeah, its unnecessary complexity on a framework like OPP which should normally > be a helper library to provide data. > > > --- > > drivers/opp/core.c | 9 +++++++++ > > drivers/opp/opp.h | 3 +++ > > 2 files changed, 12 insertions(+) > > > > diff --git a/drivers/opp/core.c b/drivers/opp/core.c > > index 31ff03dbeb83..a6d23e8560d2 100644 > > --- a/drivers/opp/core.c > > +++ b/drivers/opp/core.c > > @@ -1072,6 +1072,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, > > > > if (!_opp_supported_by_regulators(new_opp, opp_table)) { > > new_opp->available = false; > > + new_opp->disable_count = 1; > > dev_warn(dev, "%s: OPP not supported by regulators (%lu)\n", > > __func__, new_opp->rate); > > } > > @@ -1592,10 +1593,18 @@ static int _opp_set_availability(struct device *dev, unsigned long freq, > > goto unlock; > > } > > > > + if (availability_req) > > + opp->disable_count--; > > what if disable_count becomes < 0 ? Is that a bug ? Yes, that would be a bug Cheers Matthias
On 02-10-18, 15:06, Matthias Kaehlcke wrote: > On Mon, Oct 01, 2018 at 02:56:44PM +0530, Viresh Kumar wrote: > > On 29-08-18, 15:40, Matthias Kaehlcke wrote: > > > The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) > > > can also limit the frequency range of devfreq devices. The current > > > version does this by introducing a function similar to cpufreq's > > > cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi > > > > I don't like this interface and have plans on updating it with something where a > > user asks for some constraints and then releases them at a later point of time. > > That will make cpufreq and other frameworks manage things in a better way. > > At first sight this doesn't sound much different, only that cpufreq > (or a meta-framework) retains information about the constraints. Also > the constraints could contain additional information (e.g. it's a > thermal or user defined constraint), which might facilitate the > resolution of possible conflicts. > > Do you have a guesstimate on when you might find time for this? I have lot of pending things to do right now and will easily say I wouldn't touch it until mid of November :) > > > wants the throttler to use the OPP interface for limiting the frequency > > > range. > > > > I am not sure if that would be the right approach as well. Maybe something > > similar to what I wrote earlier. For example all throttlers can provide min/max > > frequencies they support at a certain point of time and the devfreq core > > accumulates all such requests and finds the right target frequency. You wouldn't > > require to disable OPPs then to make it work. > > Do you think the same 'constraint framework' could be used for cpufreq > and devfreq? The problem appears to be very similar for both > sub-systems, having a single solution seems desirable unless there are > major caveats. I wasn't thinking of writing a separate layer for this but to put that as part of cpufreq core itself. But now that we have more than one user for the same, I don't see why shouldn't we write something based on your stuff. > >From what you described I came up with the following draft for a > possible API for a common framework: > > enum FreqConstraintType { > // constraint from the thermal framework, typically max freq We may actually need a min here. There are discussions around cases where the hardware works in very cold places and they may need to run the CPUs at a higher frequencies to make it hot enough to work properly :) > FREQ_CONSTRAINT_THERMAL, > // userspace contraint, min or max freq > FREQ_CONSTRAINT_USER, > // constraint to reduce power consumption, typically max freq > FREQ_CONSTRAINT_POWER, > // constraint to ensure a certain performance, typically min freq > FREQ_CONSTRAINT_PERFORMANCE, > }; > > /* > * a single contraint > * > * since most constraints either impose a max (e.g. thermal) or a min > * (e.g. performance) it would also be an option to use a single > * frequency here instead of a pair and derive whether it is min/max from > * the type (and have USER_MIN and USER_MAX) > */ > struct freq_constraint { > unsigned long min_freq; > unsigned long max_freq; > enum FreqConstraintType type; > }; > > /* set of constraints for a device/CPU/... */ > struct freq_constraint_set { > struct idr *idr; > }; > > /* compound constraint after conflict resolution */ > struct compound_freq_constraint { > unsigned long min; > unsigned long max; > }; > > // returns an IDR > int freq_constraints_add(struct freq_constraint_set *fcs, > enum FreqConstraintType type, > unsigned long min_freq, unsigned long max_freq); > > int freq_constraints_update(struct freq_constraint_set *fcs, int constraint_id, > unsigned long min_freq, unsigned long max_freq); > > int freq_constraints_remove(struct freq_constraint_set *fcs, int constraint_id) > > // determines the compound constrained, resolves conflicts if necessary > int freq_constraints_get_compound_constraint(struct freq_constraint_set *fcs, > struct compound_freq_constraint *constraint); > > > For conflict resolution the order of preference could be: > > max freq: thermal > user > power > min freq: user > performance > > Or in prose: > > If userspace sets a max frequency through sysfs that is higher than > the one requested by the thermal framework, the constraint of the > thermal framework is used. > > A 'power' constraint is loosely related with thermal constraints, but > its purpose is less to avoid possible damage than to reduce power > consumption (the exact reasons may vary, this would be one of our > real world throttler use cases). If userspace decides to overwrite > a 'power' constraint let them have their way. > > A 'performance' constraint could be set by the driver that owns a > devfreq device, to request a minimum frequency to perform a certain > task. If userspace thinks they are smarter allow it and let them deal > with the possible consequences. Not sure if 'performance' constraints > are really needed. > > Does this in general seem reasonable and is in line with what you > envision? It does. If you are willing to do this stuff, maybe just send patches for this and I can help reviewing it.
On Wed, Oct 10, 2018 at 12:51:51PM +0530, Viresh Kumar wrote: > On 02-10-18, 15:06, Matthias Kaehlcke wrote: > > On Mon, Oct 01, 2018 at 02:56:44PM +0530, Viresh Kumar wrote: > > > On 29-08-18, 15:40, Matthias Kaehlcke wrote: > > > > The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) > > > > can also limit the frequency range of devfreq devices. The current > > > > version does this by introducing a function similar to cpufreq's > > > > cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi > > > > > > I don't like this interface and have plans on updating it with something where a > > > user asks for some constraints and then releases them at a later point of time. > > > That will make cpufreq and other frameworks manage things in a better way. > > > > At first sight this doesn't sound much different, only that cpufreq > > (or a meta-framework) retains information about the constraints. Also > > the constraints could contain additional information (e.g. it's a > > thermal or user defined constraint), which might facilitate the > > resolution of possible conflicts. > > > > Do you have a guesstimate on when you might find time for this? > > I have lot of pending things to do right now and will easily say I > wouldn't touch it until mid of November :) > > > > > wants the throttler to use the OPP interface for limiting the frequency > > > > range. > > > > > > I am not sure if that would be the right approach as well. Maybe something > > > similar to what I wrote earlier. For example all throttlers can provide min/max > > > frequencies they support at a certain point of time and the devfreq core > > > accumulates all such requests and finds the right target frequency. You wouldn't > > > require to disable OPPs then to make it work. > > > > Do you think the same 'constraint framework' could be used for cpufreq > > and devfreq? The problem appears to be very similar for both > > sub-systems, having a single solution seems desirable unless there are > > major caveats. > > I wasn't thinking of writing a separate layer for this but to put that > as part of cpufreq core itself. But now that we have more than one > user for the same, I don't see why shouldn't we write something based > on your stuff. > > > >From what you described I came up with the following draft for a > > possible API for a common framework: > > > > enum FreqConstraintType { > > // constraint from the thermal framework, typically max freq > > We may actually need a min here. There are discussions around cases > where the hardware works in very cold places and they may need to run > the CPUs at a higher frequencies to make it hot enough to work > properly :) Interesting, thanks for the heads up! > > FREQ_CONSTRAINT_THERMAL, > > // userspace contraint, min or max freq > > FREQ_CONSTRAINT_USER, > > // constraint to reduce power consumption, typically max freq > > FREQ_CONSTRAINT_POWER, > > // constraint to ensure a certain performance, typically min freq > > FREQ_CONSTRAINT_PERFORMANCE, > > }; > > > > /* > > * a single contraint > > * > > * since most constraints either impose a max (e.g. thermal) or a min > > * (e.g. performance) it would also be an option to use a single > > * frequency here instead of a pair and derive whether it is min/max from > > * the type (and have USER_MIN and USER_MAX) > > */ > > struct freq_constraint { > > unsigned long min_freq; > > unsigned long max_freq; > > enum FreqConstraintType type; > > }; > > > > /* set of constraints for a device/CPU/... */ > > struct freq_constraint_set { > > struct idr *idr; > > }; > > > > /* compound constraint after conflict resolution */ > > struct compound_freq_constraint { > > unsigned long min; > > unsigned long max; > > }; > > > > // returns an IDR > > int freq_constraints_add(struct freq_constraint_set *fcs, > > enum FreqConstraintType type, > > unsigned long min_freq, unsigned long max_freq); > > > > int freq_constraints_update(struct freq_constraint_set *fcs, int constraint_id, > > unsigned long min_freq, unsigned long max_freq); > > > > int freq_constraints_remove(struct freq_constraint_set *fcs, int constraint_id) > > > > // determines the compound constrained, resolves conflicts if necessary > > int freq_constraints_get_compound_constraint(struct freq_constraint_set *fcs, > > struct compound_freq_constraint *constraint); > > > > > > For conflict resolution the order of preference could be: > > > > max freq: thermal > user > power > > min freq: user > performance > > > > Or in prose: > > > > If userspace sets a max frequency through sysfs that is higher than > > the one requested by the thermal framework, the constraint of the > > thermal framework is used. > > > > A 'power' constraint is loosely related with thermal constraints, but > > its purpose is less to avoid possible damage than to reduce power > > consumption (the exact reasons may vary, this would be one of our > > real world throttler use cases). If userspace decides to overwrite > > a 'power' constraint let them have their way. > > > > A 'performance' constraint could be set by the driver that owns a > > devfreq device, to request a minimum frequency to perform a certain > > task. If userspace thinks they are smarter allow it and let them deal > > with the possible consequences. Not sure if 'performance' constraints > > are really needed. > > > > Does this in general seem reasonable and is in line with what you > > envision? > > It does. Thanks for your feedback! > If you are willing to do this stuff, maybe just send patches for this > and I can help reviewing it. Ok, I'll try to find some time for it, might take a bit though. Cheers Matthias
diff --git a/drivers/opp/core.c b/drivers/opp/core.c index 31ff03dbeb83..a6d23e8560d2 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -1072,6 +1072,7 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, if (!_opp_supported_by_regulators(new_opp, opp_table)) { new_opp->available = false; + new_opp->disable_count = 1; dev_warn(dev, "%s: OPP not supported by regulators (%lu)\n", __func__, new_opp->rate); } @@ -1592,10 +1593,18 @@ static int _opp_set_availability(struct device *dev, unsigned long freq, goto unlock; } + if (availability_req) + opp->disable_count--; + else + opp->disable_count++; + /* Is update really needed? */ if (opp->available == availability_req) goto unlock; + if (availability_req && opp->disable_count > 0) + goto unlock; + opp->available = availability_req; dev_pm_opp_get(opp); diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h index 7c540fd063b2..a72acb16c364 100644 --- a/drivers/opp/opp.h +++ b/drivers/opp/opp.h @@ -55,6 +55,8 @@ extern struct list_head opp_tables; * order. * @kref: for reference count of the OPP. * @available: true/false - marks if this OPP as available or not + * @disable_count: tracks the number of times the OPP has been disabled + * (without re-enabling it). * @dynamic: not-created from static DT entries. * @turbo: true if turbo (boost) OPP * @suspend: true if suspend OPP @@ -74,6 +76,7 @@ struct dev_pm_opp { struct kref kref; bool available; + unsigned int disable_count; bool dynamic; bool turbo; bool suspend;
The devfreq sub-system uses the OPP API to determine which frequencies are available. A devfreq cooling device (drivers/thermal/devfreq_cooling.c) may disable certain OPPs depending on temperature input and the thermal policy. The WIP throttler driver (https://lore.kernel.org/patchwork/project/lkml/list/?series=357937) can also limit the frequency range of devfreq devices. The current version does this by introducing a function similar to cpufreq's cpufreq_verify_within_limits(), however devfreq maintainer Chanwoo Choi wants the throttler to use the OPP interface for limiting the frequency range. A devfreq cooling device and the throttler manipulate OPPs independently from each other, one might enable an OPP, the other disable it. This patch adds a 'disable count' to the OPP, to allow to track how often the OPP was disabled without a subsequent opp_enable(). The OPP is considered disabled when the disable count is > 0 , even directly after a opp_enable() call. This patch is an initial draft, intended to get feedback from OPP maintainers on the general idea. Suggested-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> --- Even though I sent this patch I have reservations about it, some of general nature. and others in the context of devfreq/throttler/cooling device: - possible race conditions, devfreq may end up using a disabled OPP (cooling device and throttler run asynchronously to devfreq). This is not really a new issue though, I think this can already happen today with devfreq and cooling device. It doesn't cause problems since currently disabling an OPP doesn't results in disabling regulators, clocks or other components - all OPPs could end up disabled since multiple drivers manipulate OPPs independently without any arbitration - a crashing/buggy driver can leave OPPs disabled forever - a typical caller would expect a OPP to be enabled after opp_enable(), this is not always the case - added complexity for callers of opp_enable/disable() to keep track which OPPs they enabled/disabled previously --- drivers/opp/core.c | 9 +++++++++ drivers/opp/opp.h | 3 +++ 2 files changed, 12 insertions(+)