diff mbox

[04/17] PM / OPP: Introduce dev_pm_opp_get_max_volt_latency()

Message ID 20160112051058.GJ1084@ubuntu (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Viresh Kumar Jan. 12, 2016, 5:10 a.m. UTC
On 11-01-16, 17:25, Stephen Boyd wrote:
> Doesn't rcu_read_lock() disable preemption sometimes? I don't
> think we can call regulator ops with preemption disabled because
> regulators use mutex locking. It looks like
> regulator_list_voltage() may take a mutex anyway.

-------------------------8<-------------------------

Subject: [PATCH] PM / OPP: Introduce dev_pm_opp_get_max_volt_latency()

In few use cases (like: cpufreq), it is desired to get the maximum
voltage latency for changing OPPs. Add support for that.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 56 ++++++++++++++++++++++++++++++++++++++++++-
 include/linux/pm_opp.h        |  6 +++++
 2 files changed, 61 insertions(+), 1 deletion(-)

Comments

Stephen Boyd Jan. 12, 2016, 7:45 p.m. UTC | #1
On 01/12, Viresh Kumar wrote:
> diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
> index 052fc6b78dc3..62976d0bd61c 100644
> --- a/drivers/base/power/opp/core.c
> +++ b/drivers/base/power/opp/core.c
> @@ -231,8 +231,62 @@ unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_clock_latency);
>  
>  /**
> + * dev_pm_opp_get_max_volt_latency() - Get max voltage latency in nanoseconds
> + * @dev: device for which we do this operation
> + *
> + * Return: This function returns the max voltage latency in nanoseconds.
> + *
> + * Locking: This function takes rcu_read_lock().

False.

> + */
> +unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
> +{
> +	struct device_opp *dev_opp;
> +	struct dev_pm_opp *opp;
> +	struct regulator *reg;
> +	unsigned long latency_ns = 0;
> +	unsigned long min_uV = ~0, max_uV = 0;
> +	int ret;
> +
> +	/*
> +	 * Hold our list modification lock here as regulator_set_voltage_time()
> +	 * can possibly take another mutex, which isn't allowed within rcu
> +	 * locks.
> +	 */
> +	mutex_lock(&dev_opp_list_lock);

So now we take the list modification mutex. Why can't we
rcu_read_lock(), find the min and max voltage, and then release
the read lock and ask regulator framework for the voltage time?
From what I can tell we're just trying to make sure that the list
is stable when iterating through it to find the min/max voltage.

> +
> +	dev_opp = _find_device_opp(dev);
> +	if (IS_ERR(dev_opp))
> +		goto unlock;
> +
> +	reg = dev_opp->regulator;
> +	/* Regulator may not be available for device */
> +	if (IS_ERR(reg))
> +		goto unlock;
> +
> +	list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
> +		if (!opp->available)
> +			continue;
> +
> +		if (opp->u_volt_min < min_uV)
> +			min_uV = opp->u_volt_min;
> +		if (opp->u_volt_max > max_uV)
> +			max_uV = opp->u_volt_max;
> +	}
> +
> +	ret = regulator_set_voltage_time(reg, min_uV, max_uV);
> +	if (ret > 0)
> +		latency_ns = ret * 1000;
> +
> +unlock:
> +	mutex_unlock(&dev_opp_list_lock);
> +
> +	return latency_ns;
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_volt_latency);
Viresh Kumar Jan. 13, 2016, 5:34 a.m. UTC | #2
On 12-01-16, 11:45, Stephen Boyd wrote:
> > + * Locking: This function takes rcu_read_lock().
> 
> False.

Yeah, I realized it and fixed it right after sending it:

+ * Locking: This function internally uses mutex locks to keep the integrity of
+ * the internal data structures. Callers should ensure that this function is
+ * *NOT* called under RCU protection or in contexts where mutex cannot be
+ * locked.

> > +	/*
> > +	 * Hold our list modification lock here as regulator_set_voltage_time()
> > +	 * can possibly take another mutex, which isn't allowed within rcu
> > +	 * locks.
> > +	 */
> > +	mutex_lock(&dev_opp_list_lock);
> 
> So now we take the list modification mutex. Why can't we
> rcu_read_lock(), find the min and max voltage, and then release
> the read lock and ask regulator framework for the voltage time?

That was possible before this series came in..

> From what I can tell we're just trying to make sure that the list
> is stable when iterating through it to find the min/max voltage.

Hmm, we have pointer to regulator within the dev-opp struct. If we
drop the lock, the dev-opp struct can get freed and regulator will be
put just before that. So, we may be using an already freed resource.

How do you suggest we fix it ?
Viresh Kumar Jan. 18, 2016, 7:23 a.m. UTC | #3
On 13-01-16, 11:04, Viresh Kumar wrote:
> On 12-01-16, 11:45, Stephen Boyd wrote:
> > > + * Locking: This function takes rcu_read_lock().
> > 
> > False.
> 
> Yeah, I realized it and fixed it right after sending it:
> 
> + * Locking: This function internally uses mutex locks to keep the integrity of
> + * the internal data structures. Callers should ensure that this function is
> + * *NOT* called under RCU protection or in contexts where mutex cannot be
> + * locked.
> 
> > > +	/*
> > > +	 * Hold our list modification lock here as regulator_set_voltage_time()
> > > +	 * can possibly take another mutex, which isn't allowed within rcu
> > > +	 * locks.
> > > +	 */
> > > +	mutex_lock(&dev_opp_list_lock);
> > 
> > So now we take the list modification mutex. Why can't we
> > rcu_read_lock(), find the min and max voltage, and then release
> > the read lock and ask regulator framework for the voltage time?
> 
> That was possible before this series came in..
> 
> > From what I can tell we're just trying to make sure that the list
> > is stable when iterating through it to find the min/max voltage.
> 
> Hmm, we have pointer to regulator within the dev-opp struct. If we
> drop the lock, the dev-opp struct can get freed and regulator will be
> put just before that. So, we may be using an already freed resource.
> 
> How do you suggest we fix it ?

Ping..
Stephen Boyd Jan. 21, 2016, 12:20 a.m. UTC | #4
On 01/13, Viresh Kumar wrote:
> On 12-01-16, 11:45, Stephen Boyd wrote:
> > > +	/*
> > > +	 * Hold our list modification lock here as regulator_set_voltage_time()
> > > +	 * can possibly take another mutex, which isn't allowed within rcu
> > > +	 * locks.
> > > +	 */
> > > +	mutex_lock(&dev_opp_list_lock);
> > 
> > So now we take the list modification mutex. Why can't we
> > rcu_read_lock(), find the min and max voltage, and then release
> > the read lock and ask regulator framework for the voltage time?
> 
> That was possible before this series came in..
> 
> > From what I can tell we're just trying to make sure that the list
> > is stable when iterating through it to find the min/max voltage.
> 
> Hmm, we have pointer to regulator within the dev-opp struct. If we
> drop the lock, the dev-opp struct can get freed and regulator will be
> put just before that. So, we may be using an already freed resource.
> 
> How do you suggest we fix it ?
> 

Ok, first off, I don't understand why the regulator and clock
pointers are in the struct device_opp instead of the struct
device_list_opp. I thought we wanted to make it possible for two
devices to share the same OPP table (device_opp), but have
physically different clocks and regulators (non opp-shared case)?
If we put the clock and regulator handles in the device_opp then
the opp table is limited to one device or a set of devices that
all share the same clock and regulators.

BTW, these structure names confuse me all the time. I have to
rename dev_pm_opp to opp_entry, device_opp to opp_table, and
device_list_opp to opp_device_list in my head for anything to
make sense.

Gripes aside, the clock and regulator pointers should never be
'put' and go away until the device driver that's using the
dev_pm_opp_set_rate() API has called dev_pm_opp_remove(). So any
concerns about that happening during an OPP switch aren't the
concern of the OPP framework, but the concern of the consumer
drivers having the proper locking and tear down sequences.
Viresh Kumar Jan. 21, 2016, 2:32 a.m. UTC | #5
On 20-01-16, 16:20, Stephen Boyd wrote:
> Ok, first off, I don't understand why the regulator and clock
> pointers are in the struct device_opp instead of the struct
> device_list_opp. I thought we wanted to make it possible for two
> devices to share the same OPP table (device_opp), but have
> physically different clocks and regulators (non opp-shared case)?
> If we put the clock and regulator handles in the device_opp then
> the opp table is limited to one device or a set of devices that
> all share the same clock and regulators.

Not exactly. We wanted devices (with separate rails) to share the OPP
tables ONLY IN DT. So that the same thing isn't required to be copied
multiple times in DT, for example in case of Krait.

But the code isn't supposed to shared device_opp structure at all.
There is one device_opp structure for a groups of devices that share
their OPPs and their clock/voltage rails. opp_device_list is
representing individual devices that share the same device_opp thing.

And this works pretty well and I don't think we should touch it now.

> BTW, these structure names confuse me all the time. I have to
> rename dev_pm_opp to opp_entry, device_opp to opp_table, and
> device_list_opp to opp_device_list in my head for anything to
> make sense.

Sure, we can (should) rename them after this series goes in.

> Gripes aside, the clock and regulator pointers should never be
> 'put' and go away until the device driver that's using the
> dev_pm_opp_set_rate() API has called dev_pm_opp_remove().

Right.

> So any
> concerns about that happening during an OPP switch aren't the
> concern of the OPP framework, but the concern of the consumer
> drivers having the proper locking and tear down sequences.

I am fine with that view as well.. It simplifies lots of things for me
:)
Stephen Boyd Jan. 21, 2016, 2:43 a.m. UTC | #6
On 01/20/2016 06:32 PM, Viresh Kumar wrote:
> On 20-01-16, 16:20, Stephen Boyd wrote:
>> Ok, first off, I don't understand why the regulator and clock
>> pointers are in the struct device_opp instead of the struct
>> device_list_opp. I thought we wanted to make it possible for two
>> devices to share the same OPP table (device_opp), but have
>> physically different clocks and regulators (non opp-shared case)?
>> If we put the clock and regulator handles in the device_opp then
>> the opp table is limited to one device or a set of devices that
>> all share the same clock and regulators.
> Not exactly. We wanted devices (with separate rails) to share the OPP
> tables ONLY IN DT. So that the same thing isn't required to be copied
> multiple times in DT, for example in case of Krait.
>
> But the code isn't supposed to shared device_opp structure at all.
> There is one device_opp structure for a groups of devices that share
> their OPPs and their clock/voltage rails. opp_device_list is
> representing individual devices that share the same device_opp thing.
>
> And this works pretty well and I don't think we should touch it now.
>
>

Ok. Thanks for clarifying. I see the code matches that description.
diff mbox

Patch

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 052fc6b78dc3..62976d0bd61c 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -231,8 +231,62 @@  unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_clock_latency);
 
 /**
+ * dev_pm_opp_get_max_volt_latency() - Get max voltage latency in nanoseconds
+ * @dev: device for which we do this operation
+ *
+ * Return: This function returns the max voltage latency in nanoseconds.
+ *
+ * Locking: This function takes rcu_read_lock().
+ */
+unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
+{
+	struct device_opp *dev_opp;
+	struct dev_pm_opp *opp;
+	struct regulator *reg;
+	unsigned long latency_ns = 0;
+	unsigned long min_uV = ~0, max_uV = 0;
+	int ret;
+
+	/*
+	 * Hold our list modification lock here as regulator_set_voltage_time()
+	 * can possibly take another mutex, which isn't allowed within rcu
+	 * locks.
+	 */
+	mutex_lock(&dev_opp_list_lock);
+
+	dev_opp = _find_device_opp(dev);
+	if (IS_ERR(dev_opp))
+		goto unlock;
+
+	reg = dev_opp->regulator;
+	/* Regulator may not be available for device */
+	if (IS_ERR(reg))
+		goto unlock;
+
+	list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
+		if (!opp->available)
+			continue;
+
+		if (opp->u_volt_min < min_uV)
+			min_uV = opp->u_volt_min;
+		if (opp->u_volt_max > max_uV)
+			max_uV = opp->u_volt_max;
+	}
+
+	ret = regulator_set_voltage_time(reg, min_uV, max_uV);
+	if (ret > 0)
+		latency_ns = ret * 1000;
+
+unlock:
+	mutex_unlock(&dev_opp_list_lock);
+
+	return latency_ns;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_get_max_volt_latency);
+
+/**
  * dev_pm_opp_get_suspend_opp() - Get suspend opp
- * @dev:	device for which we do this operation
+ * @dev: device for which we do this operation
  *
  * Return: This function returns pointer to the suspend opp if it is
  * defined and available, otherwise it returns NULL.
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index c70a18ac9c8a..5daa43058ac1 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -34,6 +34,7 @@  bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp);
 
 int dev_pm_opp_get_opp_count(struct device *dev);
 unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev);
+unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev);
 struct dev_pm_opp *dev_pm_opp_get_suspend_opp(struct device *dev);
 
 struct dev_pm_opp *dev_pm_opp_find_freq_exact(struct device *dev,
@@ -88,6 +89,11 @@  static inline unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev)
 	return 0;
 }
 
+static inline unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev)
+{
+	return 0;
+}
+
 static inline struct dev_pm_opp *dev_pm_opp_get_suspend_opp(struct device *dev)
 {
 	return NULL;