[RFC,4/9] opp: core: Don't warn if required OPP device does not exist

Message ID: 20211011165707.138157-5-marcan@marcan.st
State: New, archived
Series: Apple SoC CPU P-state switching

Commit Message

Hector Martin Oct. 11, 2021, 4:57 p.m. UTC
When required-opps is used in CPU OPP tables, there is no parent power
domain to drive it. Squelch this error to allow a clock driver to
handle this directly instead.

Signed-off-by: Hector Martin <marcan@marcan.st>
---
 drivers/opp/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
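
For context, the full function after this patch would read roughly as
follows; the lines above the hunk context are reconstructed from
drivers/opp/core.c of this vintage, so treat this as a sketch rather
than the verbatim result:

	static int _set_required_opp(struct device *dev, struct device *pd_dev,
				     struct dev_pm_opp *opp, int i)
	{
		unsigned int pstate = likely(opp) ? opp->required_opps[i]->pstate : 0;
		int ret;

		/* Nothing to do if no virtual genpd device exists for this index. */
		if (!pd_dev)
			return 0;

		ret = dev_pm_genpd_set_performance_state(pd_dev, pstate);
		if (ret && ret != -ENODEV) {
			dev_err(dev, "Failed to set performance rate of %s: %d (%d)\n",
				dev_name(pd_dev), pstate, ret);
			return ret;
		}

		/* -ENODEV (no genpd attached) is tolerated so that another
		 * driver, e.g. a clock driver, can act on the required OPP. */
		return 0;
	}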

Comments

Viresh Kumar Oct. 12, 2021, 3:21 a.m. UTC | #1
On 12-10-21, 01:57, Hector Martin wrote:
> When required-opps is used in CPU OPP tables, there is no parent power
> domain to drive it. Squelch this error to allow a clock driver to
> handle this directly instead.
> 
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
>  drivers/opp/core.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 04b4691a8aac..89e616721f70 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -873,12 +873,13 @@ static int _set_required_opp(struct device *dev, struct device *pd_dev,
>  		return 0;
>  
>  	ret = dev_pm_genpd_set_performance_state(pd_dev, pstate);
> -	if (ret) {
> +	if (ret && ret != -ENODEV) {
>  		dev_err(dev, "Failed to set performance rate of %s: %d (%d)\n",
>  			dev_name(pd_dev), pstate, ret);
> +		return ret;
>  	}
>  
> -	return ret;
> +	return 0;
>  }
>  
>  /* This is only called for PM domain for now */

I am not sure why you need this, since _set_required_opps() has this check:

	if (unlikely(!required_opp_tables[0]->is_genpd)) {
		dev_err(dev, "required-opps don't belong to a genpd\n");
		return -ENOENT;
	}
Hector Martin Oct. 12, 2021, 5:34 a.m. UTC | #2
On 12/10/2021 12.21, Viresh Kumar wrote:
> I am not sure why you need this, since _set_required_opps() has this check:
> 
> 	if (unlikely(!required_opp_tables[0]->is_genpd)) {
> 		dev_err(dev, "required-opps don't belong to a genpd\n");
> 		return -ENOENT;
> 	}
> 

The table *is* assigned to a genpd (the memory controller), it's just 
that that genpd isn't actually a parent of the CPU device. Without the 
patch you end up with:

[    3.040060] cpu cpu4: Failed to set performance rate of cpu4: 0 (-19)
[    3.042881] cpu cpu4: Failed to set required opps: -19
[    3.045508] cpufreq: __target_index: Failed to change cpu frequency: -19
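
(For reference: -19 is -ENODEV, which dev_pm_genpd_set_performance_state()
returns when the target device is not attached to any genpd; a sketch
from my reading of drivers/base/power/domain.c of this vintage:)

	int dev_pm_genpd_set_performance_state(struct device *dev,
					       unsigned int state)
	{
		struct generic_pm_domain *genpd;

		genpd = dev_to_genpd_safe(dev);
		if (!genpd)
			return -ENODEV;	/* the -19 in the log above */

		/* ... otherwise the request is applied under the genpd lock ... */
	}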
Viresh Kumar Oct. 12, 2021, 5:51 a.m. UTC | #3
On 12-10-21, 14:34, Hector Martin wrote:
> The table *is* assigned to a genpd (the memory controller), it's just that
> that genpd isn't actually a parent of the CPU device. Without the patch you
> end up with:
> 
> [    3.040060] cpu cpu4: Failed to set performance rate of cpu4: 0 (-19)
> [    3.042881] cpu cpu4: Failed to set required opps: -19
> [    3.045508] cpufreq: __target_index: Failed to change cpu frequency: -19

Hmm, Saravana and Sibi were working on a similar problem earlier and decided to
solve this using devfreq instead. Don't remember the exact series which got
merged for this, Sibi ?

If this part fails, how do you actually set the performance state of the memory
controller's genpd ?
Hector Martin Oct. 12, 2021, 5:57 a.m. UTC | #4
On 12/10/2021 14.51, Viresh Kumar wrote:
> On 12-10-21, 14:34, Hector Martin wrote:
>> The table *is* assigned to a genpd (the memory controller), it's just that
>> that genpd isn't actually a parent of the CPU device. Without the patch you
>> end up with:
>>
>> [    3.040060] cpu cpu4: Failed to set performance rate of cpu4: 0 (-19)
>> [    3.042881] cpu cpu4: Failed to set required opps: -19
>> [    3.045508] cpufreq: __target_index: Failed to change cpu frequency: -19
> 
> Hmm, Saravana and Sibi were working on a similar problem earlier and decided to
> solve this using devfreq instead. Don't remember the exact series which got
> merged for this, Sibi ?
> 
> If this part fails, how do you actually set the performance state of the memory
> controller's genpd ?

The clock controller has the genpd as an actual power-domain parent, so 
it does it instead. From patch #7:

> +	if (cluster->has_pd)
> +		dev_pm_genpd_set_performance_state(cluster->dev,
> +						   dev_pm_opp_get_required_pstate(opp, 0));
> +

This is arguably not entirely representative of how the hardware works, 
since technically the cluster switching couldn't care less what the 
memory controller is doing; it's a soft dependency, states that should 
be switched together but are not interdependent (in fact, the clock code 
does this unconditionally after the CPU p-state change, regardless of 
whether we're shifting up or down; this is, FWIW, the same order macOS 
uses, and it clearly doesn't matter which way you do it).
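
In code terms, the ordering described here is roughly the following (a
sketch built around the patch #7 fragment above; apple_cluster_set_pstate()
is a hypothetical stand-in for the actual p-state write):

	/* Switch the CPU cluster first... */
	ret = apple_cluster_set_pstate(cluster, pstate);
	if (ret)
		return ret;

	/* ...then unconditionally nudge the memory controller's genpd,
	 * whether we shifted up or down; the states are not interdependent. */
	if (cluster->has_pd)
		dev_pm_genpd_set_performance_state(cluster->dev,
						   dev_pm_opp_get_required_pstate(opp, 0));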
Viresh Kumar Oct. 12, 2021, 9:26 a.m. UTC | #5
On 12-10-21, 14:57, Hector Martin wrote:
> On 12/10/2021 14.51, Viresh Kumar wrote:
> > On 12-10-21, 14:34, Hector Martin wrote:
> > > The table *is* assigned to a genpd (the memory controller), it's just that
> > > that genpd isn't actually a parent of the CPU device. Without the patch you
> > > end up with:
> > > 
> > > [    3.040060] cpu cpu4: Failed to set performance rate of cpu4: 0 (-19)
> > > [    3.042881] cpu cpu4: Failed to set required opps: -19
> > > [    3.045508] cpufreq: __target_index: Failed to change cpu frequency: -19
> > 
> > Hmm, Saravana and Sibi were working on a similar problem earlier and decided to
> > solve this using devfreq instead. Don't remember the exact series which got
> > merged for this, Sibi ?
> > 
> > If this part fails, how do you actually set the performance state of the memory
> > controller's genpd ?
> 
> The clock controller has the genpd as an actual power-domain parent, so it
> does it instead. From patch #7:
> 
> > +	if (cluster->has_pd)
> > +		dev_pm_genpd_set_performance_state(cluster->dev,
> > +						   dev_pm_opp_get_required_pstate(opp, 0));
> > +
> 
> This is arguably not entirely representative of how the hardware works,
> since technically the cluster switching couldn't care less what the memory
> controller is doing; it's a soft dependency, states that should be switched
> together but are not interdependent (in fact, the clock code does this
> unconditionally after the CPU p-state change, regardless of whether we're
> shifting up or down; this is, FWIW, the same order macOS uses, and it
> clearly doesn't matter which way you do it).

Yeah, I understand what you are doing. But the current patch is
incorrect in the sense that it can cause a bug on other platforms. To
make this work, you should rather set this genpd as a parent of the
CPU devices (which you are doing anyway, since you are updating them
with the CPU's DVFS). With that, the clk driver won't be required to
do the magic behind the scenes.
Hector Martin Oct. 12, 2021, 9:31 a.m. UTC | #6
On 12 Oct 2021 18:26:03 JST, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>On 12-10-21, 14:57, Hector Martin wrote:
>> 
>> This is arguably not entirely representative of how the hardware works,
>> since technically the cluster switching couldn't care less what the memory
>> controller is doing; it's a soft dependency, states that should be switched
>> together but are not interdependent (in fact, the clock code does this
>> unconditionally after the CPU p-state change, regardless of whether we're
>> shifting up or down; this is, FWIW, the same order macOS uses, and it
>> clearly doesn't matter which way you do it).
>
>Yeah, I understand what you are doing. But the current patch is
>incorrect in the sense that it can cause a bug on other platforms. To
>make this work, you should rather set this genpd as a parent of the
>CPU devices (which you are doing anyway, since you are updating them
>with the CPU's DVFS). With that, the clk driver won't be required to
>do the magic behind the scenes.
>

That doesn't work, though, because the CPUs aren't normal devices with runtime-pm. That was the first thing I tried :).

If you think this *should* be made to work instead then I can try that.
Viresh Kumar Oct. 12, 2021, 9:32 a.m. UTC | #7
On 12-10-21, 18:31, Hector Martin "marcan" wrote:
> That doesn't work, though, because the CPUs aren't normal devices
> with runtime-pm. That was the first thing I tried :).

What's the exact problem with runtime PM here ?

> If you think this *should* be made to work instead then I can try that.
Hector Martin Oct. 14, 2021, 6:52 a.m. UTC | #8
On 12/10/2021 18.32, Viresh Kumar wrote:
> On 12-10-21, 18:31, Hector Martin "marcan" wrote:
>> That doesn't work, though, because the CPUs aren't normal devices
>> with runtime-pm. That was the first thing I tried :).
> 
> What's the exact problem with runtime PM here ?

The CPU devices aren't attached to their genpd, so the required OPP
transition fails with the same error.

However, this was easier to fix than I expected. With this patch to
cpufreq-dt, it all works properly, and I can drop the parent genpd
from the clock node and related handling. Thoughts?

commit c4f88743374c1f4678ee7f17fb6cae30ded9ed59
Author: Hector Martin <marcan@marcan.st>
Date:   Thu Oct 14 15:47:45 2021 +0900

     cpufreq: dt: Attach CPU devices to power domains
     
     This allows the required-opps mechanism to work for CPU OPP tables,
     triggering specific OPP levels in a parent power domain.
     
     Signed-off-by: Hector Martin <marcan@marcan.st>

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 8fcaba541539..5b22846b557d 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -16,6 +16,7 @@
  #include <linux/list.h>
  #include <linux/module.h>
  #include <linux/of.h>
+#include <linux/pm_domain.h>
  #include <linux/pm_opp.h>
  #include <linux/platform_device.h>
  #include <linux/regulator/consumer.h>
@@ -264,6 +265,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
  		goto out;
  	}
  
+	/*
+	 * Attach the CPU device to its genpd domain (if any), to allow OPP
+	 * dependencies to be satisfied.
+	 */
+	ret = genpd_dev_pm_attach(cpu_dev);
+	if (ret <= 0) {
+		dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
+		goto out;
+	}
+
  	/*
  	 * The OPP table must be initialized, statically or dynamically, by this
  	 * point.
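
One note on the ret <= 0 check above: genpd_dev_pm_attach() has a
three-way return convention per its kerneldoc, so a call site that must
keep working on non-genpd platforms would need to distinguish the cases,
roughly like this (a sketch; the dev_dbg message is illustrative):

	ret = genpd_dev_pm_attach(cpu_dev);
	if (ret < 0)		/* a real error, e.g. -EPROBE_DEFER */
		goto out;
	if (ret == 0)		/* no PM domain (or more than one) in DT */
		dev_dbg(cpu_dev, "no single power domain to attach\n");
	/* ret == 1: successfully attached */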
Viresh Kumar Oct. 14, 2021, 6:56 a.m. UTC | #9
On 14-10-21, 15:52, Hector Martin wrote:
> The CPU devices aren't attached to their genpd, so the required OPP
> transition fails with the same error.
> 
> However, this was easier to fix than I expected. With this patch to
> cpufreq-dt, it all works properly, and I can drop the parent genpd
> from the clock node and related handling. Thoughts?
> 
> commit c4f88743374c1f4678ee7f17fb6cae30ded9ed59
> Author: Hector Martin <marcan@marcan.st>
> Date:   Thu Oct 14 15:47:45 2021 +0900
> 
>     cpufreq: dt: Attach CPU devices to power domains
>     This allows the required-opps mechanism to work for CPU OPP tables,
>     triggering specific OPP levels in a parent power domain.
>     Signed-off-by: Hector Martin <marcan@marcan.st>
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index 8fcaba541539..5b22846b557d 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -16,6 +16,7 @@
>  #include <linux/list.h>
>  #include <linux/module.h>
>  #include <linux/of.h>
> +#include <linux/pm_domain.h>
>  #include <linux/pm_opp.h>
>  #include <linux/platform_device.h>
>  #include <linux/regulator/consumer.h>
> @@ -264,6 +265,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
>  		goto out;
>  	}
> +	/*
> +	 * Attach the CPU device to its genpd domain (if any), to allow OPP
> +	 * dependencies to be satisfied.
> +	 */
> +	ret = genpd_dev_pm_attach(cpu_dev);
> +	if (ret <= 0) {
> +		dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
> +		goto out;
> +	}
> +

Other platforms do this from some other place, I think.

Ulf, where should this code be moved ? cpu-clk driver ?
Hector Martin Oct. 14, 2021, 7:03 a.m. UTC | #10
On 14/10/2021 15.56, Viresh Kumar wrote:
>> +	/*
>> +	 * Attach the CPU device to its genpd domain (if any), to allow OPP
>> +	 * dependencies to be satisfied.
>> +	 */
>> +	ret = genpd_dev_pm_attach(cpu_dev);
>> +	if (ret <= 0) {
>> +		dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
>> +		goto out;
>> +	}
>> +
> 
> Other platforms do this from some other place, I think.
> 
> Ulf, where should this code be moved ? cpu-clk driver ?
> 

One driver I see doing this is drivers/clk/qcom/apcs-sdx55.c (via 
dev_pm_domain_attach), though it only does it for CPU#0; we need to do 
it for all CPUs.
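
Extending that pattern to every CPU might look roughly like this (a
sketch with minimal error handling):

	int cpu, ret;

	for_each_possible_cpu(cpu) {
		struct device *cpu_dev = get_cpu_device(cpu);

		if (!cpu_dev)
			return -ENODEV;

		/* Attach the CPU device to the PM domain in its DT node. */
		ret = dev_pm_domain_attach(cpu_dev, false);
		if (ret)
			return ret;
	}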
Viresh Kumar Oct. 14, 2021, 7:22 a.m. UTC | #11
On 14-10-21, 16:03, Hector Martin wrote:
> On 14/10/2021 15.56, Viresh Kumar wrote:
> > > +	/*
> > > +	 * Attach the CPU device to its genpd domain (if any), to allow OPP
> > > +	 * dependencies to be satisfied.
> > > +	 */
> > > +	ret = genpd_dev_pm_attach(cpu_dev);
> > > +	if (ret <= 0) {
> > > +		dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
> > > +		goto out;
> > > +	}
> > > +
> > 
> > Other platforms do this from some other place, I think.
> > 
> > Ulf, where should this code be moved ? cpu-clk driver ?
> > 
> 
> One driver I see doing this is drivers/clk/qcom/apcs-sdx55.c (via
> dev_pm_domain_attach).

That may be a good place since you are already adding it and it is related to
CPU clk.

> Though it only does it for CPU#0; we need to do it
> for all CPUs.

Sure.
Hector Martin Oct. 14, 2021, 7:23 a.m. UTC | #12
On 14/10/2021 16.03, Hector Martin wrote:
> On 14/10/2021 15.56, Viresh Kumar wrote:
>>> +	/*
>>> +	 * Attach the CPU device to its genpd domain (if any), to allow OPP
>>> +	 * dependencies to be satisfied.
>>> +	 */
>>> +	ret = genpd_dev_pm_attach(cpu_dev);
>>> +	if (ret <= 0) {
>>> +		dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
>>> +		goto out;
>>> +	}
>>> +
>>
>> Other platforms do this from some other place, I think.
>>
>> Ulf, where should this code be moved ? cpu-clk driver ?
>>
> 
> One driver I see doing this is drivers/clk/qcom/apcs-sdx55.c (via
> dev_pm_domain_attach), though it only does it for CPU#0; we need to do
> it for all CPUs.

Looking into this further, I'm not sure I like the idea of doing this in 
the clocks driver. There might be locking issues, since it gets 
instantiated twice and yet doesn't itself know which subset of CPUs it 
applies to.

There's another driver that does this: 
drivers/cpuidle/cpuidle-psci-domain.c. That one specifically looks for a 
power domain called "psci". Perhaps it would make sense to make this 
generic in cpufreq-dt as per my prior patch, but explicitly request a 
"cpufreq" domain? That way only devicetrees that opt in to having this 
handled by cpufreq by naming it that way would get this behavior.
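
A minimal sketch of that opt-in lookup, using the existing by-name
attach API ("cpufreq" being the proposed, not yet established, domain
name):

	struct device *pd_dev;

	pd_dev = dev_pm_domain_attach_by_name(cpu_dev, "cpufreq");
	if (IS_ERR(pd_dev))
		return PTR_ERR(pd_dev);
	/* NULL just means the DT lists no "cpufreq" domain: skip quietly. */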
Ulf Hansson Oct. 14, 2021, 9:55 a.m. UTC | #13
On Tue, 12 Oct 2021 at 07:57, Hector Martin <marcan@marcan.st> wrote:
>
> On 12/10/2021 14.51, Viresh Kumar wrote:
> > On 12-10-21, 14:34, Hector Martin wrote:
> >> The table *is* assigned to a genpd (the memory controller), it's just that
> >> that genpd isn't actually a parent of the CPU device. Without the patch you
> >> end up with:
> >>
> >> [    3.040060] cpu cpu4: Failed to set performance rate of cpu4: 0 (-19)
> >> [    3.042881] cpu cpu4: Failed to set required opps: -19
> >> [    3.045508] cpufreq: __target_index: Failed to change cpu frequency: -19
> >
> > Hmm, Saravana and Sibi were working on a similar problem earlier and decided to
> > solve this using devfreq instead. Don't remember the exact series which got
> > merged for this, Sibi ?
> >
> > If this part fails, how do you actually set the performance state of the memory
> > controller's genpd ?
>
> The clock controller has the genpd as an actual power-domain parent, so
> it does it instead. From patch #7:
>
> > +     if (cluster->has_pd)
> > +             dev_pm_genpd_set_performance_state(cluster->dev,
> > +                                                dev_pm_opp_get_required_pstate(opp, 0));
> > +
>
> This is arguably not entirely representative of how the hardware works,
> since technically the cluster switching couldn't care less what the
> memory controller is doing; it's a soft dependency, states that should
> be switched together but are not interdependent (in fact, the clock code
> does this unconditionally after the CPU p-state change, regardless of
> whether we're shifting up or down; this is, FWIW, the same order macOS
> uses, and it clearly doesn't matter which way you do it).

Yes, this sounds like you should move away from modeling the memory
part as a parent genpd for the CPUs' genpd.

As Viresh pointed out, a devfreq driver seems like a better way to do
this. As a matter of fact, there are already devfreq drivers that do
this, unless I am mistaken.

It looks like devfreq providers are listening to opp/cpufreq
notifiers, so as to get an indication of when it could make sense to
change a performance state.

In some cases the devfreq provider is also modeled as an interconnect
provider, allowing consumers to specify memory bandwidth constraints,
which may trigger a new performance state to be set for the memory
controller.

In the tegra case, the memory controller is modelled as an
interconnect provider and the devfreq node is modelled as an
interconnect-consumer of the memory controller. Perhaps this can work
for apple SoCs too?

That said, perhaps as an option to move forward, we can try to get the
cpufreq pieces solved first. Then as a step on top, add the
performance scaling for the memory controller?

>
> --
> Hector Martin (marcan@marcan.st)
> Public Key: https://mrcn.st/pub

Kind regards
Uffe
Ulf Hansson Oct. 14, 2021, 11:08 a.m. UTC | #14
On Thu, 14 Oct 2021 at 09:23, Hector Martin <marcan@marcan.st> wrote:
>
> On 14/10/2021 16.03, Hector Martin wrote:
> > On 14/10/2021 15.56, Viresh Kumar wrote:
> >>> +   /*
> >>> +    * Attach the CPU device to its genpd domain (if any), to allow OPP
> >>> +    * dependencies to be satisfied.
> >>> +    */
> >>> +   ret = genpd_dev_pm_attach(cpu_dev);
> >>> +   if (ret <= 0) {
> >>> +           dev_err(cpu_dev, "Failed to attach CPU device to genpd\n");
> >>> +           goto out;
> >>> +   }
> >>> +
> >>
> >> Other platforms do this from some other place, I think.
> >>
> >> Ulf, where should this code be moved ? cpu-clk driver ?
> >>
> >
> > One driver I see doing this is drivers/clk/qcom/apcs-sdx55.c (via
> > dev_pm_domain_attach), though it only does it for CPU#0; we need to do
> > it for all CPUs.
>
> Looking into this further, I'm not sure I like the idea of doing this in
> the clocks driver. There might be locking issues, since it gets
> instantiated twice and yet doesn't itself know which subset of CPUs it
> applies to.

I agree. I suggest you look into using a genpd provider and hook up
all CPU's devices to it. I think that is what Viresh also suggested
earlier - and this makes most sense to me.

As a reference you may have a look at some Qcom platforms that already use this:

arch/arm64/boot/dts/qcom/qcs404.dtsi

drivers/cpufreq/qcom-cpufreq-nvmem.c:
To hook up CPU devices to their PM domains (genpds) - it calls
dev_pm_opp_attach_genpd(), which is a kind of wrapper for
dev_pm_domain_attach_by_name().

drivers/soc/qcom/cpr.c
Registers the genpd provider that is capable of dealing with
performance states/OPPs for CPUs.
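
For reference, the attach call in qcom-cpufreq-nvmem.c boils down to
something like this (a sketch; passing NULL for virt_devs lets the OPP
core manage the virtual genpd devices itself):

	static const char *genpd_names[] = { "cpr", NULL };
	struct opp_table *opp_table;

	/* Hook the CPU device up to its "cpr" domain, so required-opps in
	 * the CPU OPP table can reach the genpd provider. */
	opp_table = dev_pm_opp_attach_genpd(cpu_dev, genpd_names, NULL);
	if (IS_ERR(opp_table))
		return PTR_ERR(opp_table);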

>
> There's another driver that does this:
> drivers/cpuidle/cpuidle-psci-domain.c. That one specifically looks for a
> power domain called "psci". Perhaps it would make sense to make this
> generic in cpufreq-dt as per my prior patch, but explicitly request a
> "cpufreq" domain? That way only devicetrees that opt in to having this
> handled by cpufreq by naming it that way would get this behavior.

That sounds like an idea that is worth exploring. In this way, the
only thing that needs to be implemented for new cases would be the
genpd provider driver.

BTW, as you will figure out by looking at the above references, for
the qcom case we are using "cpr" as the domain name for cpufreq. Of
course, that doesn't mean we can't use "cpufreq" (or whatever name
makes sense) going forward for new cases.

>
> --
> Hector Martin (marcan@marcan.st)
> Public Key: https://mrcn.st/pub

Kind regards
Uffe
Hector Martin Oct. 14, 2021, 11:43 a.m. UTC | #15
On 14/10/2021 18.55, Ulf Hansson wrote:
> Yes, this sounds like you should move away from modeling the memory
> part as a parent genpd for the CPUs' genpd.
> 
> As Viresh pointed out, a devfreq driver seems like a better way to do
> this. As a matter of fact, there are already devfreq drivers that do
> this, unless I am mistaken.
> 
> It looks like devfreq providers are listening to opp/cpufreq
> notifiers, so as to get an indication of when it could make sense to
> change a performance state.
> 
> In some cases the devfreq provider is also modeled as an interconnect
> provider, allowing consumers to specify memory bandwidth constraints,
> which may trigger a new performance state to be set for the memory
> controller.
> 
> In the tegra case, the memory controller is modelled as an
> interconnect provider and the devfreq node is modelled as an
> interconnect-consumer of the memory controller. Perhaps this can work
> for apple SoCs too?

I was poking around and noticed the OPP core can already integrate with 
interconnect requirements, so perhaps the memory controller can be an 
interconnect provider, and the CPU nodes can directly reference it as a 
consumer? This seems like a more accurate model of what the hardware 
does, and I think I saw some devices doing this already.

(only problem is I have no idea of the actual bandwidth numbers involved 
here... I'll have to run some benchmarks to make sure this isn't just 
completely dummy data)
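
On the consumer side that would boil down to something like this (a
sketch of the interconnect API; the bandwidth values are exactly the
unknowns mentioned above):

	struct icc_path *path;
	int ret;

	/* Resolves the "interconnects" phandle in the CPU's DT node. */
	path = of_icc_get(cpu_dev, NULL);
	if (IS_ERR(path))
		return PTR_ERR(path);

	/* Request average/peak bandwidth (kBps) for the new P-state. */
	ret = icc_set_bw(path, avg_kbps, peak_kbps);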

> 
> That said, perhaps as an option to move forward, we can try to get the
> cpufreq pieces solved first. Then as a step on top, add the
> performance scaling for the memory controller?

Sure; that's a pretty much independent part of this patchset, though I'm 
thinking I might as well try some things out for v2 anyway; if it looks 
like it'll take longer we can split it out and do just the cpufreq side.
Ulf Hansson Oct. 14, 2021, 12:55 p.m. UTC | #16
On Thu, 14 Oct 2021 at 13:43, Hector Martin <marcan@marcan.st> wrote:
>
> On 14/10/2021 18.55, Ulf Hansson wrote:
> > Yes, this sounds like you should move away from modeling the memory
> > part as a parent genpd for the CPUs' genpd.
> >
> > As Viresh pointed out, a devfreq driver seems like a better way to do
> > this. As a matter of fact, there are already devfreq drivers that do
> > this, unless I am mistaken.
> >
> > It looks like devfreq providers are listening to opp/cpufreq
> > notifiers, so as to get an indication of when it could make sense to
> > change a performance state.
> >
> > In some cases the devfreq provider is also modeled as an interconnect
> > provider, allowing consumers to specify memory bandwidth constraints,
> > which may trigger a new performance state to be set for the memory
> > controller.
> >
> > In the tegra case, the memory controller is modelled as an
> > interconnect provider and the devfreq node is modelled as an
> > interconnect-consumer of the memory controller. Perhaps this can work
> > for apple SoCs too?
>
> I was poking around and noticed the OPP core can already integrate with
> interconnect requirements, so perhaps the memory controller can be an
> interconnect provider, and the CPU nodes can directly reference it as a
> consumer? This seems like a more accurate model of what the hardware
> does, and I think I saw some devices doing this already.

Yeah, that could work too. And, yes, I agree, it may be a better
description of the HW.

>
> (only problem is I have no idea of the actual bandwidth numbers involved
> here... I'll have to run some benchmarks to make sure this isn't just
> completely dummy data)
>
> >
> > That said, perhaps as an option to move forward, we can try to get the
> > cpufreq pieces solved first. Then as a step on top, add the
> > performance scaling for the memory controller?
>
> Sure; that's a pretty much independent part of this patchset, though I'm
> thinking I might as well try some things out for v2 anyway; if it looks
> like it'll take longer we can split it out and do just the cpufreq side.

In any case, I'll do my best to help with review.

Kind regards
Uffe
Hector Martin Oct. 14, 2021, 5:02 p.m. UTC | #17
On 14/10/2021 21.55, Ulf Hansson wrote:
> On Thu, 14 Oct 2021 at 13:43, Hector Martin <marcan@marcan.st> wrote:
>> I was poking around and noticed the OPP core can already integrate with
>> interconnect requirements, so perhaps the memory controller can be an
>> interconnect provider, and the CPU nodes can directly reference it as a
>> consumer? This seems like a more accurate model of what the hardware
>> does, and I think I saw some devices doing this already.
> 
> Yeah, that could work too. And, yes, I agree, it may be a better
> description of the HW.
> 
>>
>> (only problem is I have no idea of the actual bandwidth numbers involved
>> here... I'll have to run some benchmarks to make sure this isn't just
>> completely dummy data)
>>

So... I tried getting bandwidth numbers and failed. It seems these 
registers don't actually affect peak performance in any measurable way. 
I'm also getting almost the same GeekBench scores on macOS with and 
without this mechanism enabled, although there is one subtest that seems 
to show a measurable difference.

My current guess is this is something more subtle (latencies? idle 
timers and such?) than a performance state. If that is the case, do you 
have any ideas as to the best way to model it in Linux? Should we even 
bother if it has only a minimal performance gain for typical workloads?

I'll try to do some latency tests, see if I can make sense of what it's 
actually doing.
Ulf Hansson Oct. 15, 2021, 11:26 a.m. UTC | #18
On Thu, 14 Oct 2021 at 19:02, Hector Martin <marcan@marcan.st> wrote:
>
> On 14/10/2021 21.55, Ulf Hansson wrote:
> > On Thu, 14 Oct 2021 at 13:43, Hector Martin <marcan@marcan.st> wrote:
> >> I was poking around and noticed the OPP core can already integrate with
> >> interconnect requirements, so perhaps the memory controller can be an
> >> interconnect provider, and the CPU nodes can directly reference it as a
> >> consumer? This seems like a more accurate model of what the hardware
> >> does, and I think I saw some devices doing this already.
> >
> > Yeah, that could work too. And, yes, I agree, it may be a better
> > description of the HW.
> >
> >>
> >> (only problem is I have no idea of the actual bandwidth numbers involved
> >> here... I'll have to run some benchmarks to make sure this isn't just
> >> completely dummy data)
> >>
>
> So... I tried getting bandwidth numbers and failed. It seems these
> registers don't actually affect peak performance in any measurable way.
> I'm also getting almost the same GeekBench scores on macOS with and
> without this mechanism enabled, although there is one subtest that seems
> to show a measurable difference.
>
> My current guess is this is something more subtle (latencies? idle
> timers and such?) than a performance state. If that is the case, do you
> have any ideas as to the best way to model it in Linux? Should we even
> bother if it has only a minimal performance gain for typical workloads?

For latency constraints, we have dev_pm_qos. This will make the genpd
governor prevent deeper idle states for the device and its
corresponding PM domain (genpd). But that doesn't sound like a good
fit here.
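
For completeness, such a resume-latency constraint would be expressed
roughly like this (the 100 us bound is purely illustrative):

	struct dev_pm_qos_request req;
	int ret;

	/* Keep the device's resume latency under 100 us; genpd's governor
	 * then avoids idle states whose exit latency would exceed it. */
	ret = dev_pm_qos_add_request(dev, &req, DEV_PM_QOS_RESUME_LATENCY,
				     100);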

If you are right, it rather sounds like there is some kind of
quiescence mode of the memory controller that can be prevented. But I
have no clue, of course. :-)

>
> I'll try to do some latency tests, see if I can make sense of what it's
> actually doing.
>

Kind regards
Uffe
Patch

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 04b4691a8aac..89e616721f70 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -873,12 +873,13 @@ static int _set_required_opp(struct device *dev, struct device *pd_dev,
 		return 0;
 
 	ret = dev_pm_genpd_set_performance_state(pd_dev, pstate);
-	if (ret) {
+	if (ret && ret != -ENODEV) {
 		dev_err(dev, "Failed to set performance rate of %s: %d (%d)\n",
 			dev_name(pd_dev), pstate, ret);
+		return ret;
 	}
 
-	return ret;
+	return 0;
 }
 
 /* This is only called for PM domain for now */