diff mbox series

[v2,11/48] opp: Add dev_pm_opp_find_level_ceil()

Message ID 20201217180638.22748-12-digetx@gmail.com (mailing list archive)
State New, archived
Headers show
Series Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs | expand

Commit Message

Dmitry Osipenko Dec. 17, 2020, 6:06 p.m. UTC
Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
levels don't start from 0 in OPP table and zero usually means a minimal
level.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/opp/core.c     | 49 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_opp.h |  8 +++++++
 2 files changed, 57 insertions(+)

Comments

Viresh Kumar Dec. 22, 2020, 6:42 a.m. UTC | #1
On 17-12-20, 21:06, Dmitry Osipenko wrote:
> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
> levels don't start from 0 in OPP table and zero usually means a minimal
> level.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>

Why doesn't the exact version work for you here ?
Dmitry Osipenko Dec. 22, 2020, 7:15 p.m. UTC | #2
22.12.2020 09:42, Viresh Kumar пишет:
> On 17-12-20, 21:06, Dmitry Osipenko wrote:
>> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
>> levels don't start from 0 in OPP table and zero usually means a minimal
>> level.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> 
> Why doesn't the exact version work for you here ?
> 

The exact version won't find OPP for level=0 if levels don't start with
0, where 0 means that minimal level is desired.
Viresh Kumar Dec. 23, 2020, 4:19 a.m. UTC | #3
On 22-12-20, 22:15, Dmitry Osipenko wrote:
> 22.12.2020 09:42, Viresh Kumar пишет:
> > On 17-12-20, 21:06, Dmitry Osipenko wrote:
> >> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
> >> levels don't start from 0 in OPP table and zero usually means a minimal
> >> level.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> > 
> > Why doesn't the exact version work for you here ?
> > 
> 
> The exact version won't find OPP for level=0 if levels don't start with
> 0, where 0 means that minimal level is desired.

Right, but why do you need to send 0 for your platform ?
Dmitry Osipenko Dec. 23, 2020, 8:37 p.m. UTC | #4
23.12.2020 07:19, Viresh Kumar пишет:
> On 22-12-20, 22:15, Dmitry Osipenko wrote:
>> 22.12.2020 09:42, Viresh Kumar пишет:
>>> On 17-12-20, 21:06, Dmitry Osipenko wrote:
>>>> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
>>>> levels don't start from 0 in OPP table and zero usually means a minimal
>>>> level.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>
>>> Why doesn't the exact version work for you here ?
>>>
>>
>> The exact version won't find OPP for level=0 if levels don't start with
>> 0, where 0 means that minimal level is desired.
> 
> Right, but why do you need to send 0 for your platform ?
> 

To put power domain into the lowest performance state when device is idling.

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/opp/core.c#L897

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/opp/core.c#L785

Also please see patch 32, tegra_clock_runtime_suspend().
Viresh Kumar Dec. 24, 2020, 6:43 a.m. UTC | #5
On 23-12-20, 23:37, Dmitry Osipenko wrote:
> 23.12.2020 07:19, Viresh Kumar пишет:
> > On 22-12-20, 22:15, Dmitry Osipenko wrote:
> >> 22.12.2020 09:42, Viresh Kumar пишет:
> >>> On 17-12-20, 21:06, Dmitry Osipenko wrote:
> >>>> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
> >>>> levels don't start from 0 in OPP table and zero usually means a minimal
> >>>> level.
> >>>>
> >>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >>>
> >>> Why doesn't the exact version work for you here ?
> >>>
> >>
> >> The exact version won't find OPP for level=0 if levels don't start with
> >> 0, where 0 means that minimal level is desired.
> > 
> > Right, but why do you need to send 0 for your platform ?
> > 
> 
> To put power domain into the lowest performance state when device is idling.

I see. So you really want to set it to the lowest state or just take the vote
out ? Which may end up powering off the domain in the worst case ?
Dmitry Osipenko Dec. 24, 2020, 1 p.m. UTC | #6
24.12.2020 09:43, Viresh Kumar пишет:
> On 23-12-20, 23:37, Dmitry Osipenko wrote:
>> 23.12.2020 07:19, Viresh Kumar пишет:
>>> On 22-12-20, 22:15, Dmitry Osipenko wrote:
>>>> 22.12.2020 09:42, Viresh Kumar пишет:
>>>>> On 17-12-20, 21:06, Dmitry Osipenko wrote:
>>>>>> Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
>>>>>> levels don't start from 0 in OPP table and zero usually means a minimal
>>>>>> level.
>>>>>>
>>>>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>>>>
>>>>> Why doesn't the exact version work for you here ?
>>>>>
>>>>
>>>> The exact version won't find OPP for level=0 if levels don't start with
>>>> 0, where 0 means that minimal level is desired.
>>>
>>> Right, but why do you need to send 0 for your platform ?
>>>
>>
>> To put power domain into the lowest performance state when device is idling.
> 
> I see. So you really want to set it to the lowest state or just take the vote
> out ? Which may end up powering off the domain in the worst case ?
> 

In a device driver I want to set PD to the lowest performance state by
removing the performance vote when dev_pm_opp_set_rate(dev, 0) is
invoked by the driver.

The OPP core already does this, but if OPP levels don't start from 0 in
a device-tree for PD, then it currently doesn't work since there is a
need to get a rounded-up performance state because
dev_pm_opp_set_voltage() takes OPP entry for the argument (patches 9 and
28).

The PD powering off and performance-changes are separate from each other
in the GENPD core. The GENPD core automatically turns off domain when
all devices within the domain are suspended by system-suspend or RPM.

The performance state of a power domain is controlled solely by a device
driver. GENPD core only aggregates the performance requests, it doesn't
change the performance state of a domain by itself when device is
suspended or resumed, IIUC this is intentional. And I want to put domain
into lowest performance state when device is suspended.
Viresh Kumar Dec. 28, 2020, 6:22 a.m. UTC | #7
On 24-12-20, 16:00, Dmitry Osipenko wrote:
> In a device driver I want to set PD to the lowest performance state by
> removing the performance vote when dev_pm_opp_set_rate(dev, 0) is
> invoked by the driver.
> 
> The OPP core already does this, but if OPP levels don't start from 0 in
> a device-tree for PD, then it currently doesn't work since there is a
> need to get a rounded-up performance state because
> dev_pm_opp_set_voltage() takes OPP entry for the argument (patches 9 and
> 28).
> 
> The PD powering off and performance-changes are separate from each other
> in the GENPD core. The GENPD core automatically turns off domain when
> all devices within the domain are suspended by system-suspend or RPM.
> 
> The performance state of a power domain is controlled solely by a device
> driver. GENPD core only aggregates the performance requests, it doesn't
> change the performance state of a domain by itself when device is
> suspended or resumed, IIUC this is intentional. And I want to put domain
> into lowest performance state when device is suspended.

Right, so if you really want to just drop the performance vote, then with a
value of 0 for the performance state the call will reach to your genpd's
callback ->set_performance_state(). Just as dev_pm_opp_set_rate() accepts the
frequency to be 0, I would expect dev_pm_opp_set_rate() to accept opp argument
as NULL and in that case set voltage to 0 and do regulator_disable() as well.
Won't that work better than going for the lowest voltage ?
Dmitry Osipenko Dec. 28, 2020, 2:03 p.m. UTC | #8
28.12.2020 09:22, Viresh Kumar пишет:
> On 24-12-20, 16:00, Dmitry Osipenko wrote:
>> In a device driver I want to set PD to the lowest performance state by
>> removing the performance vote when dev_pm_opp_set_rate(dev, 0) is
>> invoked by the driver.
>>
>> The OPP core already does this, but if OPP levels don't start from 0 in
>> a device-tree for PD, then it currently doesn't work since there is a
>> need to get a rounded-up performance state because
>> dev_pm_opp_set_voltage() takes OPP entry for the argument (patches 9 and
>> 28).
>>
>> The PD powering off and performance-changes are separate from each other
>> in the GENPD core. The GENPD core automatically turns off domain when
>> all devices within the domain are suspended by system-suspend or RPM.
>>
>> The performance state of a power domain is controlled solely by a device
>> driver. GENPD core only aggregates the performance requests, it doesn't
>> change the performance state of a domain by itself when device is
>> suspended or resumed, IIUC this is intentional. And I want to put domain
>> into lowest performance state when device is suspended.
> 
> Right, so if you really want to just drop the performance vote, then with a
> value of 0 for the performance state the call will reach to your genpd's
> callback ->set_performance_state(). Just as dev_pm_opp_set_rate() accepts the
> frequency to be 0, I would expect dev_pm_opp_set_rate() to accept opp argument
> as NULL and in that case set voltage to 0 and do regulator_disable() as well.
> Won't that work better than going for the lowest voltage ?
> 

We can make dev_pm_opp_set_voltage() to accept OPP=NULL in order to
disable the regulator, like it's done for dev_pm_opp_set_rate(dev, 0).
Although, I don't need this kind of behaviour for the Tegra PD driver,
and thus, would prefer to leave this for somebody else to implement in
the future, once it will be really needed.

Still we need the dev_pm_opp_find_level_ceil() because level=0 means
that we want to set PD to the lowest (minimal) performance state, i.e.
it doesn't necessarily mean that we want to set the voltage to 0 and
disable the PD entirely. GENPD has a separate controls for on/off.
Viresh Kumar Dec. 30, 2020, 4:46 a.m. UTC | #9
On 28-12-20, 17:03, Dmitry Osipenko wrote:
> 28.12.2020 09:22, Viresh Kumar пишет:
> > On 24-12-20, 16:00, Dmitry Osipenko wrote:
> >> In a device driver I want to set PD to the lowest performance state by
> >> removing the performance vote when dev_pm_opp_set_rate(dev, 0) is
> >> invoked by the driver.
> >>
> >> The OPP core already does this, but if OPP levels don't start from 0 in
> >> a device-tree for PD, then it currently doesn't work since there is a
> >> need to get a rounded-up performance state because
> >> dev_pm_opp_set_voltage() takes OPP entry for the argument (patches 9 and
> >> 28).
> >>
> >> The PD powering off and performance-changes are separate from each other
> >> in the GENPD core. The GENPD core automatically turns off domain when
> >> all devices within the domain are suspended by system-suspend or RPM.
> >>
> >> The performance state of a power domain is controlled solely by a device
> >> driver. GENPD core only aggregates the performance requests, it doesn't
> >> change the performance state of a domain by itself when device is
> >> suspended or resumed, IIUC this is intentional. And I want to put domain
> >> into lowest performance state when device is suspended.
> > 
> > Right, so if you really want to just drop the performance vote, then with a
> > value of 0 for the performance state the call will reach to your genpd's
> > callback ->set_performance_state(). Just as dev_pm_opp_set_rate() accepts the
> > frequency to be 0, I would expect dev_pm_opp_set_rate() to accept opp argument
> > as NULL and in that case set voltage to 0 and do regulator_disable() as well.
> > Won't that work better than going for the lowest voltage ?
> > 
> 
> We can make dev_pm_opp_set_voltage() to accept OPP=NULL in order to
> disable the regulator, like it's done for dev_pm_opp_set_rate(dev, 0).
> Although, I don't need this kind of behaviour for the Tegra PD driver,
> and thus, would prefer to leave this for somebody else to implement in
> the future, once it will be really needed.
> 
> Still we need the dev_pm_opp_find_level_ceil() because level=0 means
> that we want to set PD to the lowest (minimal) performance state, i.e.
> it doesn't necessarily mean that we want to set the voltage to 0 and
> disable the PD entirely. GENPD has a separate controls for on/off.

Ok.
Dmitry Osipenko Dec. 30, 2020, 2:02 p.m. UTC | #10
30.12.2020 07:46, Viresh Kumar пишет:
> On 28-12-20, 17:03, Dmitry Osipenko wrote:
>> 28.12.2020 09:22, Viresh Kumar пишет:
>>> On 24-12-20, 16:00, Dmitry Osipenko wrote:
>>>> In a device driver I want to set PD to the lowest performance state by
>>>> removing the performance vote when dev_pm_opp_set_rate(dev, 0) is
>>>> invoked by the driver.
>>>>
>>>> The OPP core already does this, but if OPP levels don't start from 0 in
>>>> a device-tree for PD, then it currently doesn't work since there is a
>>>> need to get a rounded-up performance state because
>>>> dev_pm_opp_set_voltage() takes OPP entry for the argument (patches 9 and
>>>> 28).
>>>>
>>>> The PD powering off and performance-changes are separate from each other
>>>> in the GENPD core. The GENPD core automatically turns off domain when
>>>> all devices within the domain are suspended by system-suspend or RPM.
>>>>
>>>> The performance state of a power domain is controlled solely by a device
>>>> driver. GENPD core only aggregates the performance requests, it doesn't
>>>> change the performance state of a domain by itself when device is
>>>> suspended or resumed, IIUC this is intentional. And I want to put domain
>>>> into lowest performance state when device is suspended.
>>>
>>> Right, so if you really want to just drop the performance vote, then with a
>>> value of 0 for the performance state the call will reach to your genpd's
>>> callback ->set_performance_state(). Just as dev_pm_opp_set_rate() accepts the
>>> frequency to be 0, I would expect dev_pm_opp_set_rate() to accept opp argument
>>> as NULL and in that case set voltage to 0 and do regulator_disable() as well.
>>> Won't that work better than going for the lowest voltage ?
>>>
>>
>> We can make dev_pm_opp_set_voltage() to accept OPP=NULL in order to
>> disable the regulator, like it's done for dev_pm_opp_set_rate(dev, 0).
>> Although, I don't need this kind of behaviour for the Tegra PD driver,
>> and thus, would prefer to leave this for somebody else to implement in
>> the future, once it will be really needed.
>>
>> Still we need the dev_pm_opp_find_level_ceil() because level=0 means
>> that we want to set PD to the lowest (minimal) performance state, i.e.
>> it doesn't necessarily mean that we want to set the voltage to 0 and
>> disable the PD entirely. GENPD has a separate controls for on/off.
> 
> Ok.
> 

I'll separate the OPP patches from this series and will prepare v3,
thank you for the review!
diff mbox series

Patch

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index eab37b3a27bb..0783a4ac819a 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -449,6 +449,55 @@  struct dev_pm_opp *dev_pm_opp_find_level_exact(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_find_level_exact);
 
+/**
+ * dev_pm_opp_find_level_ceil() - search for an rounded up level
+ * @dev:		device for which we do this operation
+ * @level:		level to search for
+ *
+ * Return: Searches for rounded up match in the opp table and returns pointer
+ * to the  matching opp if found, else returns ERR_PTR in case of error and
+ * should be handled using IS_ERR. Error return values can be:
+ * EINVAL:	for bad pointer
+ * ERANGE:	no match found for search
+ * ENODEV:	if device not found in list of registered devices
+ *
+ * The callers are required to call dev_pm_opp_put() for the returned OPP after
+ * use.
+ */
+struct dev_pm_opp *dev_pm_opp_find_level_ceil(struct device *dev,
+					      unsigned int *level)
+{
+	struct opp_table *opp_table;
+	struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-ERANGE);
+
+	opp_table = _find_opp_table(dev);
+	if (IS_ERR(opp_table)) {
+		int r = PTR_ERR(opp_table);
+
+		dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r);
+		return ERR_PTR(r);
+	}
+
+	mutex_lock(&opp_table->lock);
+
+	list_for_each_entry(temp_opp, &opp_table->opp_list, node) {
+		if (temp_opp->available && temp_opp->level >= *level) {
+			opp = temp_opp;
+			*level = opp->level;
+
+			/* Increment the reference count of OPP */
+			dev_pm_opp_get(opp);
+			break;
+		}
+	}
+
+	mutex_unlock(&opp_table->lock);
+	dev_pm_opp_put_opp_table(opp_table);
+
+	return opp;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_find_level_ceil);
+
 static noinline struct dev_pm_opp *_find_freq_ceil(struct opp_table *opp_table,
 						   unsigned long *freq)
 {
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index f311a8b2ca04..a17d92d923cc 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -111,6 +111,8 @@  struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
 					      bool available);
 struct dev_pm_opp *dev_pm_opp_find_level_exact(struct device *dev,
 					       unsigned int level);
+struct dev_pm_opp *dev_pm_opp_find_level_ceil(struct device *dev,
+					      unsigned int *level);
 
 struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					      unsigned long *freq);
@@ -228,6 +230,12 @@  static inline struct dev_pm_opp *dev_pm_opp_find_level_exact(struct device *dev,
 	return ERR_PTR(-ENOTSUPP);
 }
 
+static inline struct dev_pm_opp *dev_pm_opp_find_level_ceil(struct device *dev,
+					unsigned int *level)
+{
+	return ERR_PTR(-ENOTSUPP);
+}
+
 static inline struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
 					unsigned long *freq)
 {