[v9,4/8] PM: domains: Add get_performance_state() callback

Message ID 20210827013415.24027-5-digetx@gmail.com (mailing list archive)
State Not Applicable, archived
Series NVIDIA Tegra power management patches for 5.16

Commit Message

Dmitry Osipenko Aug. 27, 2021, 1:34 a.m. UTC
Add get_performance_state() callback that retrieves and initializes
performance state of a device attached to a power domain. This removes
inconsistency of the performance state with hardware state.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
 include/linux/pm_domain.h   |  2 ++
 2 files changed, 31 insertions(+), 3 deletions(-)

Comments

Ulf Hansson Aug. 27, 2021, 2:23 p.m. UTC | #1
On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> Add get_performance_state() callback that retrieves and initializes
> performance state of a device attached to a power domain. This removes
> inconsistency of the performance state with hardware state.

Can you please try to elaborate a bit more on the use case. Users need
to know when it makes sense to implement the callback - and so far we
tend to document this through detailed commit messages.

Moreover, please state that implementing the callback is optional.

>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
>  include/linux/pm_domain.h   |  2 ++
>  2 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 3a13a942d012..8b828dcdf7f8 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
>                 goto err;
>         } else if (pstate > 0) {
>                 ret = dev_pm_genpd_set_performance_state(dev, pstate);
> -               if (ret)
> +               if (ret) {
> +                       dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> +                               pd->name, ret);

Moving the dev_err() here means that we no longer print an error if
of_get_required_opp_performance_state() fails a few lines above. Is
that intentional?

>                         goto err;
> +               }
>                 dev_gpd_data(dev)->default_pstate = pstate;
>         }
> +
> +       if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
> +               bool dev_suspended = false;
> +
> +               ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
> +               if (ret < 0) {
> +                       dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
> +                               pd->name, ret);
> +                       goto err;
> +               }
> +
> +               pstate = ret;
> +
> +               if (dev_suspended) {

The dev_suspended thing looks weird.

Perhaps it was needed back when dev_pm_genpd_set_performance_state()
didn't yet check pm_runtime_disabled()?

> +                       dev_gpd_data(dev)->rpm_pstate = pstate;
> +               } else if (pstate > 0) {
> +                       ret = dev_pm_genpd_set_performance_state(dev, pstate);
> +                       if (ret) {
> +                               dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> +                                       pd->name, ret);
> +                               goto err;
> +                       }
> +               }
> +       }

Overall, what we seem to be doing here, is to retrieve a value for an
initial/default performance state for a device and then we want to set
it to make sure the vote becomes aggregated and finally set for the
genpd.

With your suggested change, there are now two ways to get the
initial/default state. One is through the existing
of_get_required_opp_performance_state() and the other is by using a
new genpd callback.

That said, perhaps we would get a bit cleaner code by moving the "get
initial/default performance state" thingy, into a separate function
and then call it from here. If this function returns a valid
performance state, then we should continue to set the state, by
calling dev_pm_genpd_set_performance_state() and update
dev_gpd_data(dev)->default_pstate accordingly.

Would that work, do you think?
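
To make the suggestion concrete, here is a minimal userspace sketch of
the refactoring, not kernel code. All mock_* names are hypothetical
stand-ins: dt_required_pstate models what
of_get_required_opp_performance_state() would return, and the write to
voted_pstate stands in for dev_pm_genpd_set_performance_state(). The
real error handling in __genpd_dev_pm_attach() is more nuanced than
shown here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a device; not the real struct device. */
struct mock_dev {
	int dt_required_pstate;		/* models the DT lookup: <0 error, 0 none */
	unsigned int default_pstate;	/* mirrors dev_gpd_data(dev)->default_pstate */
	unsigned int voted_pstate;	/* state the device currently votes for */
};

/* Hypothetical stand-in for struct generic_pm_domain. */
struct mock_genpd {
	/* Optional callback, modeled on the one proposed in this patch. */
	int (*dev_get_performance_state)(struct mock_genpd *pd,
					 struct mock_dev *dev);
};

/* Example callback: pretends the hardware reports state 3. */
static int mock_cb_state3(struct mock_genpd *pd, struct mock_dev *dev)
{
	(void)pd;
	(void)dev;
	return 3;
}

/*
 * Single place that decides the initial/default state: the DT value
 * wins, otherwise the optional callback is consulted.
 * Returns <0 on error, 0 if there is no state to set.
 */
static int mock_get_default_pstate(struct mock_genpd *pd, struct mock_dev *dev)
{
	if (dev->dt_required_pstate != 0)
		return dev->dt_required_pstate;
	if (pd->dev_get_performance_state)
		return pd->dev_get_performance_state(pd, dev);
	return 0;
}

/* Attach path: one lookup, one "set", one default_pstate update. */
static int mock_attach(struct mock_genpd *pd, struct mock_dev *dev)
{
	int pstate = mock_get_default_pstate(pd, dev);

	if (pstate < 0)
		return pstate;
	if (pstate > 0) {
		dev->voted_pstate = (unsigned int)pstate;
		dev->default_pstate = (unsigned int)pstate;
	}
	return 1;
}
```

With this shape the attach path has a single error/set sequence no
matter where the initial value came from, which is the cleanup being
asked for.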

> +
>         return 1;
>
>  err:
> -       dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> -               pd->name, ret);
>         genpd_remove_device(pd, dev);
>         return ret;
>  }
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 67017c9390c8..4f78b31791ae 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -133,6 +133,8 @@ struct generic_pm_domain {
>                                                  struct dev_pm_opp *opp);
>         int (*set_performance_state)(struct generic_pm_domain *genpd,
>                                      unsigned int state);
> +       int (*get_performance_state)(struct generic_pm_domain *genpd,
> +                                    struct device *dev, bool *dev_suspended);

Compared to the ->set_performance_state() callback, which sets a
performance state for the PM domain (genpd) - this new callback is
about retrieving the *initial/default* performance state for a
*device* that gets attached to a genpd.

That said, may I suggest renaming the callback to
"dev_get_performance_state", or something along those lines.

>         struct gpd_dev_ops dev_ops;
>         s64 max_off_time_ns;    /* Maximum allowed "suspended" time. */
>         ktime_t next_wakeup;    /* Maintained by the domain governor */
> --
> 2.32.0
>

Kind regards
Uffe
Dmitry Osipenko Aug. 27, 2021, 3:50 p.m. UTC | #2
27.08.2021 17:23, Ulf Hansson wrote:
> On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> Add get_performance_state() callback that retrieves and initializes
>> performance state of a device attached to a power domain. This removes
>> inconsistency of the performance state with hardware state.
> 
> Can you please try to elaborate a bit more on the use case. Users need
> to know when it makes sense to implement the callback - and so far we
> tend to document this through detailed commit messages.
> 
> Moreover, please state that implementing the callback is optional.

Noted

>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
>>  include/linux/pm_domain.h   |  2 ++
>>  2 files changed, 31 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 3a13a942d012..8b828dcdf7f8 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
>>                 goto err;
>>         } else if (pstate > 0) {
>>                 ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> -               if (ret)
>> +               if (ret) {
>> +                       dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> +                               pd->name, ret);
> 
> Moving the dev_err() here means that we no longer print an error if
> of_get_required_opp_performance_state() fails a few lines above. Is
> that intentional?

Not intentional, I'll add another message.

>>                         goto err;
>> +               }
>>                 dev_gpd_data(dev)->default_pstate = pstate;
>>         }
>> +
>> +       if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
>> +               bool dev_suspended = false;
>> +
>> +               ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
>> +               if (ret < 0) {
>> +                       dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
>> +                               pd->name, ret);
>> +                       goto err;
>> +               }
>> +
>> +               pstate = ret;
>> +
>> +               if (dev_suspended) {
> 
> The dev_suspended thing looks weird.
> 
> Perhaps it was needed back when dev_pm_genpd_set_performance_state()
> didn't yet check pm_runtime_disabled()?

There are two possible variants here:

1. Device is suspended
2. Device is active

If the device is suspended, then it will be activated on RPM-resume and the
h/w state will require a specific performance state when resumed. Hence only
the rpm_pstate should be set; otherwise the SoC may start to consume
extra power if the device is never resumed by a consumer driver and the
performance state is bumped without a real need.

If device is known to be active, then the performance state should be
updated immediately, otherwise we have inconsistent state with hardware.

For Tegra dev_suspended=true because in general it should be safe to
assume that hardware is suspended since it's either stopped by the PD
driver on initial power_on or it's assumed to be disabled by a consumer
driver during probe. Technically it's possible to check the clock and reset
state of an attached device from get_performance_state() to find the
real state of the device, but so far that hasn't been necessary.

I'll add comment to the code.
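
The two variants can be sketched as a small userspace model, not kernel
code. The struct and function names are hypothetical; only the field
name rpm_pstate mirrors the patch. The resume helper is a loose
simplification of what genpd does on runtime resume, and clearing
rpm_pstate there is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of the per-device genpd data touched by the patch. */
struct mock_gpd_data {
	unsigned int performance_state;	/* vote currently applied */
	unsigned int rpm_pstate;	/* vote to apply on runtime resume */
};

/* Apply an initial state the way the patch's attach path branches. */
static void mock_init_pstate(struct mock_gpd_data *gd, unsigned int pstate,
			     bool dev_suspended)
{
	if (dev_suspended)
		gd->rpm_pstate = pstate;	/* variant 1: defer until resume */
	else if (pstate > 0)
		gd->performance_state = pstate;	/* variant 2: vote right away */
}

/* Runtime resume applies the deferred vote (simplified). */
static void mock_runtime_resume(struct mock_gpd_data *gd)
{
	if (gd->rpm_pstate) {
		gd->performance_state = gd->rpm_pstate;
		gd->rpm_pstate = 0;
	}
}
```

In the suspended case nothing is voted until mock_runtime_resume() runs,
so a device that is never resumed never raises the domain's state.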

>> +                       dev_gpd_data(dev)->rpm_pstate = pstate;
>> +               } else if (pstate > 0) {
>> +                       ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> +                       if (ret) {
>> +                               dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> +                                       pd->name, ret);
>> +                               goto err;
>> +                       }
>> +               }
>> +       }
> 
> Overall, what we seem to be doing here, is to retrieve a value for an
> initial/default performance state for a device and then we want to set
> it to make sure the vote becomes aggregated and finally set for the
> genpd.
> 
> With your suggested change, there are now two ways to get the
> initial/default state. One is through the existing
> of_get_required_opp_performance_state() and the other is by using a
> new genpd callback.
> 
> That said, perhaps we would get a bit cleaner code by moving the "get
> initial/default performance state" thingy, into a separate function
> and then call it from here. If this function returns a valid
> performance state, then we should continue to set the state, by
> calling dev_pm_genpd_set_performance_state() and update
> dev_gpd_data(dev)->default_pstate accordingly.
> 
> Would that work, do you think?

To be honest, I'm now confused by
of_get_required_opp_performance_state(). It assumes that device is
active all the time while attached and that device is stopped on detach.

If hardware is always-on, then it should be wrong to drop the
performance state on detach.

If hardware isn't always-on, then it might be suspended during
attachment, and thus, only the rpm_pstate should be set. It's also not
guaranteed that consumer driver will suspend device on unbind, leaving
it active on detach, thus it should be wrong to drop performance state
on detach.

Hence I think the default_pstate is a bit out of touch. If this
attach/detach behaviour is specific to QCOM driver/hardware, then maybe
of_get_required_opp_performance_state() should be moved out to a
get_performance_state() of the QCOM PD driver?

I added Rajendra Nayak to explain.

For now we're bailing out if default_pstate is set because it conflicts
with get_performance_state().

But we can factor out the code into a separate function anyway to make
it a tad cleaner.

>> +
>>         return 1;
>>
>>  err:
>> -       dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> -               pd->name, ret);
>>         genpd_remove_device(pd, dev);
>>         return ret;
>>  }
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index 67017c9390c8..4f78b31791ae 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -133,6 +133,8 @@ struct generic_pm_domain {
>>                                                  struct dev_pm_opp *opp);
>>         int (*set_performance_state)(struct generic_pm_domain *genpd,
>>                                      unsigned int state);
>> +       int (*get_performance_state)(struct generic_pm_domain *genpd,
>> +                                    struct device *dev, bool *dev_suspended);
> 
> Compared to the ->set_performance_state() callback, which sets a
> performance state for the PM domain (genpd) - this new callback is
> about retrieving the *initial/default* performance state for a
> *device* that gets attached to a genpd.
> 
> That said, may I suggest renaming the callback to
> "dev_get_performance_state", or something along those lines.

Noted

>>         struct gpd_dev_ops dev_ops;
>>         s64 max_off_time_ns;    /* Maximum allowed "suspended" time. */
>>         ktime_t next_wakeup;    /* Maintained by the domain governor */
>> --
>> 2.32.0
>>
> 
> Kind regards
> Uffe
>
Ulf Hansson Aug. 30, 2021, 9:19 a.m. UTC | #3
+ Dmitry Baryshkov, Bjorn Andersson

On Fri, 27 Aug 2021 at 17:50, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 27.08.2021 17:23, Ulf Hansson wrote:
> > On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
> >>
> >> Add get_performance_state() callback that retrieves and initializes
> >> performance state of a device attached to a power domain. This removes
> >> inconsistency of the performance state with hardware state.
> >
> > Can you please try to elaborate a bit more on the use case. Users need
> > to know when it makes sense to implement the callback - and so far we
> > tend to document this through detailed commit messages.
> >
> > Moreover, please state that implementing the callback is optional.
>
> Noted
>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
> >>  include/linux/pm_domain.h   |  2 ++
> >>  2 files changed, 31 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> >> index 3a13a942d012..8b828dcdf7f8 100644
> >> --- a/drivers/base/power/domain.c
> >> +++ b/drivers/base/power/domain.c
> >> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
> >>                 goto err;
> >>         } else if (pstate > 0) {
> >>                 ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> -               if (ret)
> >> +               if (ret) {
> >> +                       dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> +                               pd->name, ret);
> >
> > Moving the dev_err() here means that we no longer print an error if
> > of_get_required_opp_performance_state() fails a few lines above. Is
> > that intentional?
>
> Not intentional, I'll add another message.
>
> >>                         goto err;
> >> +               }
> >>                 dev_gpd_data(dev)->default_pstate = pstate;
> >>         }
> >> +
> >> +       if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
> >> +               bool dev_suspended = false;
> >> +
> >> +               ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
> >> +               if (ret < 0) {
> >> +                       dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
> >> +                               pd->name, ret);
> >> +                       goto err;
> >> +               }
> >> +
> >> +               pstate = ret;
> >> +
> >> +               if (dev_suspended) {
> >
> > The dev_suspended thing looks weird.
> >
> > Perhaps it was needed back when dev_pm_genpd_set_performance_state()
> > didn't yet check pm_runtime_disabled()?
>
> There are two possible variants here:
>
> 1. Device is suspended
> 2. Device is active
>
> If the device is suspended, then it will be activated on RPM-resume and the
> h/w state will require a specific performance state when resumed. Hence only
> the rpm_pstate should be set; otherwise the SoC may start to consume
> extra power if the device is never resumed by a consumer driver and the
> performance state is bumped without a real need.
>
> If device is known to be active, then the performance state should be
> updated immediately, otherwise we have inconsistent state with hardware.
>
> For Tegra dev_suspended=true because in general it should be safe to
> assume that hardware is suspended since it's either stopped by the PD
> driver on initial power_on or it's assumed to be disabled by a consumer
> driver during probe. Technically it's possible to check the clock and reset
> state of an attached device from get_performance_state() to find the
> real state of the device, but so far that hasn't been necessary.

I follow your reasoning above, but I fail to understand your point, sorry.

Your recent patch ("PM: domains: Improve runtime PM performance state
handling") made dev_pm_genpd_set_performance_state() call
pm_runtime_suspended() to check whether it should assign
dev_gpd_data(dev)->rpm_pstate, which postpones the vote until the
device gets runtime resumed, or call genpd_set_performance_state() to
vote for a new performance state immediately.

That updated behaviour of dev_pm_genpd_set_performance_state() should be
sufficient, I think.

In other words, please drop the "dev_suspended" parameter from the
->get_performance_state() callback, as it doesn't make sense to me.
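
As a rough illustration of why the extra parameter becomes redundant,
here is a userspace sketch, not kernel code. The mock_* names are
hypothetical: runtime_suspended models what pm_runtime_suspended() would
report, and mock_set_performance_state() models the updated
dev_pm_genpd_set_performance_state() as described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model. */
struct mock_dev {
	bool runtime_suspended;		/* models pm_runtime_suspended(dev) */
	unsigned int performance_state;	/* vote currently applied */
	unsigned int rpm_pstate;	/* vote postponed until runtime resume */
};

/* Callback shape without the dev_suspended out-parameter. */
typedef int (*mock_get_pstate_fn)(struct mock_dev *dev);

/* The setter itself decides whether to postpone or apply the vote. */
static int mock_set_performance_state(struct mock_dev *dev, unsigned int state)
{
	if (dev->runtime_suspended)
		dev->rpm_pstate = state;
	else
		dev->performance_state = state;
	return 0;
}

/* Example callback: pretends the hardware needs state 2. */
static int mock_hw_pstate(struct mock_dev *dev)
{
	(void)dev;
	return 2;
}

/* Attach path: no suspended/active branching is needed here. */
static int mock_attach(struct mock_dev *dev, mock_get_pstate_fn get_pstate)
{
	int pstate = get_pstate ? get_pstate(dev) : 0;

	if (pstate < 0)
		return pstate;
	if (pstate > 0)
		return mock_set_performance_state(dev, (unsigned int)pstate);
	return 0;
}
```

The caller only retrieves a value and passes it on; whether the vote is
postponed depends solely on the runtime PM status seen by the setter.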

>
> I'll add comment to the code.
>
> >> +                       dev_gpd_data(dev)->rpm_pstate = pstate;
> >> +               } else if (pstate > 0) {
> >> +                       ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> +                       if (ret) {
> >> +                               dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> +                                       pd->name, ret);
> >> +                               goto err;
> >> +                       }
> >> +               }
> >> +       }
> >
> > Overall, what we seem to be doing here, is to retrieve a value for an
> > initial/default performance state for a device and then we want to set
> > it to make sure the vote becomes aggregated and finally set for the
> > genpd.
> >
> > With your suggested change, there are now two ways to get the
> > initial/default state. One is through the existing
> > of_get_required_opp_performance_state() and the other is by using a
> > new genpd callback.
> >
> > That said, perhaps we would get a bit cleaner code by moving the "get
> > initial/default performance state" thingy, into a separate function
> > and then call it from here. If this function returns a valid
> > performance state, then we should continue to set the state, by
> > calling dev_pm_genpd_set_performance_state() and update
> > dev_gpd_data(dev)->default_pstate accordingly.
> >
> > Would that work, do you think?
>
> To be honest, I'm now confused by
> of_get_required_opp_performance_state(). It assumes that device is
> active all the time while attached and that device is stopped on detach.
>
> If hardware is always-on, then it should be wrong to drop the
> performance state on detach.
>
> If hardware isn't always-on, then it might be suspended during
> attachment, and thus, only the rpm_pstate should be set. It's also not
> guaranteed that consumer driver will suspend device on unbind, leaving
> it active on detach, thus it should be wrong to drop performance state
> on detach.

I assume the new behaviour in dev_pm_genpd_set_performance_state()
should address most of your concerns above, no?

When it comes to detaching, the best we can do is to drop the
performance state vote for the device, no matter if it's an always-on
HW or not. Simply because after a detach, genpd loses track of the
device, which means it can't account for performance state votes for
it anyway.
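
A small userspace sketch of the accounting argument, not kernel code.
It assumes the domain's state is the maximum of the votes of its
attached devices; the mock_* names and the fixed-size table are
hypothetical simplifications.

```c
#include <stddef.h>

#define MOCK_MAX_DEVS 8

/* Hypothetical domain model: one vote slot per attached device. */
struct mock_genpd {
	unsigned int votes[MOCK_MAX_DEVS];
	int attached[MOCK_MAX_DEVS];
	unsigned int performance_state;	/* aggregated state of the domain */
};

/* Re-evaluate the domain state as the maximum vote of attached devices. */
static void mock_reeval(struct mock_genpd *pd)
{
	unsigned int max = 0;

	for (size_t i = 0; i < MOCK_MAX_DEVS; i++)
		if (pd->attached[i] && pd->votes[i] > max)
			max = pd->votes[i];
	pd->performance_state = max;
}

static void mock_attach(struct mock_genpd *pd, size_t idx, unsigned int vote)
{
	pd->attached[idx] = 1;
	pd->votes[idx] = vote;
	mock_reeval(pd);
}

/* Detach drops the vote: the domain no longer tracks this device. */
static void mock_detach(struct mock_genpd *pd, size_t idx)
{
	pd->attached[idx] = 0;
	pd->votes[idx] = 0;
	mock_reeval(pd);
}
```

Once a device is detached there is no slot left to hold its vote, so the
aggregate can only be computed from the devices that remain.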

>
> Hence I think the default_pstate is a bit out of touch. If this
> attach/detach behaviour is specific to QCOM driver/hardware, then maybe
> of_get_required_opp_performance_state() should be moved out to a
> get_performance_state() of the QCOM PD driver?

That may work, but I hope it's unnecessary.

Overall, the important part is the updated path in
dev_pm_genpd_set_performance_state() where we now call
pm_runtime_suspended(). I am pretty sure this should work fine for
Qcom platforms/drivers too, but let's see if Rajendra, Dmitry or Bjorn
have some concerns about this.

>
> I added Rajendra Nayak to explain.
>
> For now we're bailing out if default_pstate is set because it conflicts
> with get_performance_state().
>
> But we can factor out the code into a separate function anyway to make
> it a tad cleaner.

Yes, please.

[...]

Kind regards
Uffe

Patch

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 3a13a942d012..8b828dcdf7f8 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2700,15 +2700,41 @@  static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
 		goto err;
 	} else if (pstate > 0) {
 		ret = dev_pm_genpd_set_performance_state(dev, pstate);
-		if (ret)
+		if (ret) {
+			dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
+				pd->name, ret);
 			goto err;
+		}
 		dev_gpd_data(dev)->default_pstate = pstate;
 	}
+
+	if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
+		bool dev_suspended = false;
+
+		ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
+		if (ret < 0) {
+			dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
+				pd->name, ret);
+			goto err;
+		}
+
+		pstate = ret;
+
+		if (dev_suspended) {
+			dev_gpd_data(dev)->rpm_pstate = pstate;
+		} else if (pstate > 0) {
+			ret = dev_pm_genpd_set_performance_state(dev, pstate);
+			if (ret) {
+				dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
+					pd->name, ret);
+				goto err;
+			}
+		}
+	}
+
 	return 1;
 
 err:
-	dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
-		pd->name, ret);
 	genpd_remove_device(pd, dev);
 	return ret;
 }
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 67017c9390c8..4f78b31791ae 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -133,6 +133,8 @@  struct generic_pm_domain {
 						 struct dev_pm_opp *opp);
 	int (*set_performance_state)(struct generic_pm_domain *genpd,
 				     unsigned int state);
+	int (*get_performance_state)(struct generic_pm_domain *genpd,
+				     struct device *dev, bool *dev_suspended);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	ktime_t next_wakeup;	/* Maintained by the domain governor */