[v7,6/7] OPP: Update the bandwidth on OPP frequency changes

Message ID 20200424155404.10746-7-georgi.djakov@linaro.org (mailing list archive)
State Not Applicable, archived
Series: Introduce OPP bandwidth bindings

Commit Message

Georgi Djakov April 24, 2020, 3:54 p.m. UTC
If the OPP bandwidth values are populated, we also want to switch the
interconnect bandwidth, in addition to the frequency and voltage.

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
---
v7:
* Addressed review comments from Viresh.

v2: https://lore.kernel.org/r/20190423132823.7915-5-georgi.djakov@linaro.org

 drivers/opp/core.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Matthias Kaehlcke April 24, 2020, 7:36 p.m. UTC | #1
On Fri, Apr 24, 2020 at 06:54:03PM +0300, Georgi Djakov wrote:
> If the OPP bandwidth values are populated, we want to switch also the
> interconnect bandwidth in addition to frequency and voltage.
> 
> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
> ---
> v7:
> * Addressed review comments from Viresh.
> 
> v2: https://lore.kernel.org/r/20190423132823.7915-5-georgi.djakov@linaro.org
> 
>  drivers/opp/core.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 8e86811eb7b2..66a8ea10f3de 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -808,7 +808,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>  	unsigned long freq, old_freq, temp_freq;
>  	struct dev_pm_opp *old_opp, *opp;
>  	struct clk *clk;
> -	int ret;
> +	int ret, i;
>  
>  	opp_table = _find_opp_table(dev);
>  	if (IS_ERR(opp_table)) {
> @@ -895,6 +895,17 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>  			dev_err(dev, "Failed to set required opps: %d\n", ret);
>  	}
>  
> +	if (!ret && opp_table->paths) {
> +		for (i = 0; i < opp_table->path_count; i++) {
> +			ret = icc_set_bw(opp_table->paths[i],
> +					 opp->bandwidth[i].avg,
> +					 opp->bandwidth[i].peak);
> +			if (ret)
> +				dev_err(dev, "Failed to set bandwidth[%d]: %d\n",
> +					i, ret);
> +		}
> +	}
> +
>  put_opp:
>  	dev_pm_opp_put(opp);
>  put_old_opp:

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Saravana Kannan April 24, 2020, 9:18 p.m. UTC | #2
On Fri, Apr 24, 2020 at 8:54 AM Georgi Djakov <georgi.djakov@linaro.org> wrote:
>
> If the OPP bandwidth values are populated, we want to switch also the
> interconnect bandwidth in addition to frequency and voltage.
>
> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
> ---
> v7:
> * Addressed review comments from Viresh.
>
> v2: https://lore.kernel.org/r/20190423132823.7915-5-georgi.djakov@linaro.org
>
>  drivers/opp/core.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 8e86811eb7b2..66a8ea10f3de 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -808,7 +808,7 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>         unsigned long freq, old_freq, temp_freq;
>         struct dev_pm_opp *old_opp, *opp;
>         struct clk *clk;
> -       int ret;
> +       int ret, i;
>
>         opp_table = _find_opp_table(dev);
>         if (IS_ERR(opp_table)) {
> @@ -895,6 +895,17 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>                         dev_err(dev, "Failed to set required opps: %d\n", ret);
>         }
>
> +       if (!ret && opp_table->paths) {
> +               for (i = 0; i < opp_table->path_count; i++) {
> +                       ret = icc_set_bw(opp_table->paths[i],
> +                                        opp->bandwidth[i].avg,
> +                                        opp->bandwidth[i].peak);
> +                       if (ret)
> +                               dev_err(dev, "Failed to set bandwidth[%d]: %d\n",
> +                                       i, ret);
> +               }
> +       }
> +

Hey Georgi,

Thanks for getting this series going again and converging on the DT
bindings! Will be nice to see this land finally.

I skimmed through all the patches in the series and they mostly look
good (if you address some of Matthias's comments).

My only comment is -- can we drop this patch please? I'd like to use
devfreq governors for voting on bandwidth and this will effectively
override whatever bandwidth decisions are made by the devfreq
governor.

If you really want to keep this, then maybe don't "get" the icc path
by default in patch 4/7 and then let the device driver set the icc
path if it wants the opp framework to manage the bandwidth too?

-Saravana
Viresh Kumar April 30, 2020, 6:09 a.m. UTC | #3
On 24-04-20, 14:18, Saravana Kannan wrote:
> My only comment is -- can we drop this patch please? I'd like to use
> devfreq governors for voting on bandwidth and this will effectively
> override whatever bandwidth decisions are made by the devfreq
> governor.

And why would that be better? FWIW, that will have the same problem
that cpufreq governors have had for ages, i.e. they were not proactive
and were always too late.

The bw should get updated right along with the frequency; why shouldn't it?
Saravana Kannan April 30, 2020, 7:35 a.m. UTC | #4
On Wed, Apr 29, 2020 at 11:09 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 24-04-20, 14:18, Saravana Kannan wrote:
> > My only comment is -- can we drop this patch please? I'd like to use
> > devfreq governors for voting on bandwidth and this will effectively
> > override whatever bandwidth decisions are made by the devfreq
> > governor.
>
> And why would that be better ? FWIW, that will have the same problem
> which cpufreq governors had since ages, i.e. they were not proactive
> and were always too late.
>
> The bw should get updated right with frequency, why shouldn't it ?

I didn't say the bw would be voted based on just CPUfreq. It can also
be based on CPU busy time and other stats. Having said that, this is
not just about CPUfreq. Having the bw be force-changed every time a
device's OPP is changed is very inflexible. Please don't do it.

-Saravana
Viresh Kumar April 30, 2020, 7:53 a.m. UTC | #5
On 30-04-20, 00:35, Saravana Kannan wrote:
> On Wed, Apr 29, 2020 at 11:09 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 24-04-20, 14:18, Saravana Kannan wrote:
> > > My only comment is -- can we drop this patch please? I'd like to use
> > > devfreq governors for voting on bandwidth and this will effectively
> > > override whatever bandwidth decisions are made by the devfreq
> > > governor.
> >
> > And why would that be better ? FWIW, that will have the same problem
> > which cpufreq governors had since ages, i.e. they were not proactive
> > and were always too late.
> >
> > The bw should get updated right with frequency, why shouldn't it ?
> 
> I didn't say the bw would be voted based on just CPUfreq. It can also
> be based on CPU busy time and other stats. Having said that, this is
> not just about CPUfreq. Having the bw be force changed every time a
> device has it's OPP is changed is very inflexible. Please don't do it.

So, the vote based on the requirements of the cpufreq driver should come
directly from the cpufreq side itself, but nothing stops the other
layers from aggregating the requests and then acting on them. This is how
it is done for other frameworks like clk, regulator, genpd, etc.

You guys need to figure out who aggregates the requests from all users
or input providers for a certain path. This was pushed into the genpd
core in case of performance state for example.
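
To make the aggregation idea concrete, here is a minimal userspace sketch of combining several requests for one path. It assumes the common sum-of-averages, max-of-peaks rule; `struct bw_request` and `aggregate_requests()` are hypothetical names for illustration only and are not part of the OPP or interconnect code.

```c
#include <stddef.h>
#include <stdint.h>

/* One consumer's bandwidth request on a path (units are arbitrary here). */
struct bw_request {
	uint32_t avg;
	uint32_t peak;
};

/*
 * Combine all requests for one path: sum the averages and keep the
 * largest peak. Illustration only, not the kernel implementation.
 */
static struct bw_request aggregate_requests(const struct bw_request *reqs,
					    size_t count)
{
	struct bw_request total = { 0, 0 };
	size_t i;

	for (i = 0; i < count; i++) {
		total.avg += reqs[i].avg;
		if (reqs[i].peak > total.peak)
			total.peak = reqs[i].peak;
	}
	return total;
}
```

With this rule a request of zero from one input provider leaves the aggregate unchanged, so whichever layer aggregates can accept votes from both the OPP side and a governor.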
Saravana Kannan April 30, 2020, 4:32 p.m. UTC | #6
On Thu, Apr 30, 2020 at 12:54 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 30-04-20, 00:35, Saravana Kannan wrote:
> > On Wed, Apr 29, 2020 at 11:09 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 24-04-20, 14:18, Saravana Kannan wrote:
> > > > My only comment is -- can we drop this patch please? I'd like to use
> > > > devfreq governors for voting on bandwidth and this will effectively
> > > > override whatever bandwidth decisions are made by the devfreq
> > > > governor.
> > >
> > > And why would that be better ? FWIW, that will have the same problem
> > > which cpufreq governors had since ages, i.e. they were not proactive
> > > and were always too late.
> > >
> > > The bw should get updated right with frequency, why shouldn't it ?
> >
> > I didn't say the bw would be voted based on just CPUfreq. It can also
> > be based on CPU busy time and other stats. Having said that, this is
> > not just about CPUfreq. Having the bw be force changed every time a
> > device has it's OPP is changed is very inflexible. Please don't do it.
>
> So, the vote based on the requirements of cpufreq driver should come
> directly from the cpufreq side itself, but no one stops the others
> layers to aggregate the requests and then act on them. This is how it
> is done for other frameworks like clk, regulator, genpd, etc.

You are missing the point. This is not about aggregation. This is
about OPP voting for bandwidth on a path when the vote can/should be
0.

I'll give another example. Say one of the interconnect paths needs to
be voted only when a particular use case is running. Say, the GPU
needs to vote for bandwidth to L3 only when it's running in cache
coherent mode. But it always needs to vote for bandwidth to DDR. With
the way it's written now, OPP is going to force vote a non-zero
bandwidth to L3 even when it can be zero. Wasting power for no good
reason.

Just let the drivers/device get the bandwidth values from OPP without
forcing them to vote for the bandwidth when they don't need to. Just
because they decide to use OPP to set their clock doesn't mean they
should lose the ability to control their bandwidth in a more
intelligent fashion.

-Saravana
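
The GPU example above could look roughly like the following userspace sketch, in which the driver reads the per-path values from its OPP entry but decides for itself what to vote. All names here (`struct gpu_opp`, `gpu_pick_vote()`, the path enum) are hypothetical and only model the idea; no kernel API is implied.

```c
#include <stdbool.h>
#include <stdint.h>

struct bw_vote {
	uint32_t avg;
	uint32_t peak;
};

enum gpu_path { GPU_PATH_DDR, GPU_PATH_L3, GPU_PATH_COUNT };

/* Hypothetical per-OPP entry: one bandwidth value pair per path. */
struct gpu_opp {
	unsigned long freq_hz;
	struct bw_vote bw[GPU_PATH_COUNT];
};

/*
 * Pick the vote the driver would send for a path. DDR always gets the
 * table value; L3 gets it only in cache coherent mode, otherwise zero.
 */
static struct bw_vote gpu_pick_vote(const struct gpu_opp *opp,
				    enum gpu_path path, bool cache_coherent)
{
	struct bw_vote zero = { 0, 0 };

	if (path == GPU_PATH_L3 && !cache_coherent)
		return zero;
	return opp->bw[path];
}
```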
Viresh Kumar May 4, 2020, 5 a.m. UTC | #7
On 30-04-20, 09:32, Saravana Kannan wrote:
> You are missing the point. This is not about aggregation. This is
> about OPP voting for bandwidth on a path when the vote can/should be
> 0.
> 
> I'll give another example. Say one of the interconnect paths needs to
> be voted only when a particular use case is running. Say, the GPU
> needs to vote for bandwidth to L3 only when it's running in cache
> coherent mode. But it always needs to vote for bandwidth to DDR. With
> the way it's written now, OPP is going to force vote a non-zero
> bandwidth to L3 even when it can be zero. Wasting power for no good
> reason.
> 
> Just let the drivers/device get the bandwidth values from OPP without
> forcing them to vote for the bandwidth when they don't need to. Just
> because they decide to use OPP to set their clock doesn't mean they
> should lose to ability to control their bandwidth in a more
> intelligent fashion.

They shouldn't use opp_set_rate() in such a scenario. Why should they?

opp_set_rate() was introduced to take care of only the simple cases
and the complex ones are left for the drivers to handle. For example,
they take care of programming multiple regulators (in case of TI), as
OPP core can't know the order in which regulators need to be
programmed. But for the simple cases, opp core can program everything
the way it is presented in DT.
Sibi Sankar May 4, 2020, 8:54 p.m. UTC | #8
On 2020-04-24 21:24, Georgi Djakov wrote:
> If the OPP bandwidth values are populated, we want to switch also the
> interconnect bandwidth in addition to frequency and voltage.
> 

https://patchwork.kernel.org/patch/11527571/

Scaling from set_rate or using the patch linked above (^^) to set bw
levels, I'm fine with both.

Reviewed-by: Sibi Sankar <sibis@codeaurora.org>

> Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
> ---
> v7:
> * Addressed review comments from Viresh.
> 
> v2: 
> https://lore.kernel.org/r/20190423132823.7915-5-georgi.djakov@linaro.org
> 
>  drivers/opp/core.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 8e86811eb7b2..66a8ea10f3de 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -808,7 +808,7 @@ int dev_pm_opp_set_rate(struct device *dev,
...
Saravana Kannan May 4, 2020, 9:01 p.m. UTC | #9
On Sun, May 3, 2020 at 10:00 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 30-04-20, 09:32, Saravana Kannan wrote:
> > You are missing the point. This is not about aggregation. This is
> > about OPP voting for bandwidth on a path when the vote can/should be
> > 0.
> >
> > I'll give another example. Say one of the interconnect paths needs to
> > be voted only when a particular use case is running. Say, the GPU
> > needs to vote for bandwidth to L3 only when it's running in cache
> > coherent mode. But it always needs to vote for bandwidth to DDR. With
> > the way it's written now, OPP is going to force vote a non-zero
> > bandwidth to L3 even when it can be zero. Wasting power for no good
> > reason.
> >
> > Just let the drivers/device get the bandwidth values from OPP without
> > forcing them to vote for the bandwidth when they don't need to. Just
> > because they decide to use OPP to set their clock doesn't mean they
> > should lose to ability to control their bandwidth in a more
> > intelligent fashion.
>
> They shouldn't use opp_set_rate() in such a scenario. Why should they?
>
> opp_set_rate() was introduced to take care of only the simple cases
> and the complex ones are left for the drivers to handle. For example,
> they take care of programming multiple regulators (in case of TI), as
> OPP core can't know the order in which regulators need to be
> programmed. But for the simple cases, opp core can program everything
> the way it is presented in DT.

Fair enough. But don't "voltage corner" based devices NEED to use OPP
framework to set their frequencies?

Because, if voltage corners are only handled through OPP framework,
then any device that uses voltage corners doesn't get to pick and
choose when to vote for what path. Also, maybe a one liner helper
function to enable BW voting using OPP framework by default might be
another option. Something like:
dev_pm_opp_enable_bw_voting(struct device *dev)?

If devices with voltage corners can still do their own
frequency/voltage corner control without having to use OPP framework,
then I agree with your point above.

-Saravana
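
A rough userspace model of the opt-in helper idea follows. It assumes a per-table flag that is off by default; `struct mock_opp_table`, `mock_enable_bw_voting()` and `mock_paths_to_vote()` are hypothetical stand-ins, and nothing here reflects an existing OPP interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the OPP table; not the kernel structure. */
struct mock_opp_table {
	size_t path_count;
	bool bw_voting_enabled;	/* hypothetical opt-in flag, off by default */
};

/* Hypothetical helper modeled on the dev_pm_opp_enable_bw_voting() idea. */
static void mock_enable_bw_voting(struct mock_opp_table *table)
{
	table->bw_voting_enabled = true;
}

/* Number of paths the set-rate code would vote on for this table. */
static size_t mock_paths_to_vote(const struct mock_opp_table *table)
{
	return table->bw_voting_enabled ? table->path_count : 0;
}
```

Under this model a driver that never calls the helper keeps full control of its bandwidth votes, while a simple driver opts in with one call.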
Viresh Kumar May 5, 2020, 3:38 a.m. UTC | #10
On 04-05-20, 14:01, Saravana Kannan wrote:
> Fair enough. But don't "voltage corner" based devices NEED to use OPP
> framework to set their frequencies?

No. Anyone can call dev_pm_genpd_set_performance_state().
Patch

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 8e86811eb7b2..66a8ea10f3de 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -808,7 +808,7 @@  int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 	unsigned long freq, old_freq, temp_freq;
 	struct dev_pm_opp *old_opp, *opp;
 	struct clk *clk;
-	int ret;
+	int ret, i;
 
 	opp_table = _find_opp_table(dev);
 	if (IS_ERR(opp_table)) {
@@ -895,6 +895,17 @@  int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 			dev_err(dev, "Failed to set required opps: %d\n", ret);
 	}
 
+	if (!ret && opp_table->paths) {
+		for (i = 0; i < opp_table->path_count; i++) {
+			ret = icc_set_bw(opp_table->paths[i],
+					 opp->bandwidth[i].avg,
+					 opp->bandwidth[i].peak);
+			if (ret)
+				dev_err(dev, "Failed to set bandwidth[%d]: %d\n",
+					i, ret);
+		}
+	}
+
 put_opp:
 	dev_pm_opp_put(opp);
 put_old_opp:
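
For readers who want to trace the control flow of the added hunk outside the kernel, the sketch below reproduces the loop in userspace. `mock_set_bw()` is a hypothetical stand-in for icc_set_bw() that fails on one chosen path, and `struct mock_bw` replaces the OPP bandwidth entries; only the loop shape is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

struct mock_bw {
	uint32_t avg;
	uint32_t peak;
};

/* Stand-in for icc_set_bw(): returns -1 for the path equal to fail_index. */
static int mock_set_bw(size_t path, struct mock_bw bw, int fail_index)
{
	(void)bw;
	return ((int)path == fail_index) ? -1 : 0;
}

/*
 * Same control flow as the added hunk: visit every path, overwrite ret on
 * each call, and keep going after a failure.
 */
static int mock_update_bw(const struct mock_bw *bw, size_t path_count,
			  int fail_index, size_t *calls)
{
	int ret = 0;
	size_t i;

	*calls = 0;
	for (i = 0; i < path_count; i++) {
		ret = mock_set_bw(i, bw[i], fail_index);
		(*calls)++;
	}
	return ret;
}
```

In this model every path is visited even after a failure, and the value returned is that of the last call, so an error on an earlier path is logged in the real code but not necessarily reported to the caller.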