Message ID | 20210827013415.24027-5-digetx@gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Series | NVIDIA Tegra power management patches for 5.16 |
On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> Add get_performance_state() callback that retrieves and initializes
> performance state of a device attached to a power domain. This removes
> inconsistency of the performance state with hardware state.

Can you please try to elaborate a bit more on the use case? Users need
to know when it makes sense to implement the callback - and so far we
tend to document this through detailed commit messages.

Moreover, please state that implementing the callback is optional.

>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
>  include/linux/pm_domain.h   |  2 ++
>  2 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 3a13a942d012..8b828dcdf7f8 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
>  		goto err;
>  	} else if (pstate > 0) {
>  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
> -		if (ret)
> +		if (ret) {
> +			dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> +				pd->name, ret);

Moving the dev_err() here means we won't print an error if
of_get_required_opp_performance_state() fails a few lines above. Is
that intentional?

>  			goto err;
> +		}
>  		dev_gpd_data(dev)->default_pstate = pstate;
>  	}
> +
> +	if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
> +		bool dev_suspended = false;
> +
> +		ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
> +		if (ret < 0) {
> +			dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
> +				pd->name, ret);
> +			goto err;
> +		}
> +
> +		pstate = ret;
> +
> +		if (dev_suspended) {

The dev_suspended thing looks weird.

Perhaps it was needed back when dev_pm_genpd_set_performance_state()
didn't check pm_runtime_disabled()?

> +			dev_gpd_data(dev)->rpm_pstate = pstate;
> +		} else if (pstate > 0) {
> +			ret = dev_pm_genpd_set_performance_state(dev, pstate);
> +			if (ret) {
> +				dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> +					pd->name, ret);
> +				goto err;
> +			}
> +		}
> +	}

Overall, what we seem to be doing here is retrieving a value for an
initial/default performance state for a device, and then setting it to
make sure the vote becomes aggregated and finally set for the genpd.

With your suggested change, there are now two ways to get the
initial/default state. One is through the existing
of_get_required_opp_performance_state() and the other is by using a
new genpd callback.

That said, perhaps we would get somewhat cleaner code by moving the "get
initial/default performance state" logic into a separate function and
then calling it from here. If this function returns a valid
performance state, then we should continue to set the state by
calling dev_pm_genpd_set_performance_state() and update
dev_gpd_data(dev)->default_pstate accordingly.

Would that work, do you think?

> +
>  	return 1;
>
>  err:
> -	dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> -		pd->name, ret);
>  	genpd_remove_device(pd, dev);
>  	return ret;
>  }
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 67017c9390c8..4f78b31791ae 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -133,6 +133,8 @@ struct generic_pm_domain {
>  				 struct dev_pm_opp *opp);
>  	int (*set_performance_state)(struct generic_pm_domain *genpd,
>  				     unsigned int state);
> +	int (*get_performance_state)(struct generic_pm_domain *genpd,
> +				     struct device *dev, bool *dev_suspended);

Compared to the ->set_performance_state() callback, which sets a
performance state for the PM domain (genpd), this new callback is
about retrieving the *initial/default* performance state for a
*device* that gets attached to a genpd.

That said, may I suggest renaming the callback to
"dev_get_performance_state", or something along those lines.

>  	struct gpd_dev_ops dev_ops;
>  	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
>  	ktime_t next_wakeup;	/* Maintained by the domain governor */
> --
> 2.32.0
>

Kind regards
Uffe
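
The refactoring suggested in the review above (one helper that obtains the initial/default state, followed by one shared "set and record" step) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the eventual kernel code: the `model_*` types and functions are hypothetical stand-ins, `dt_pstate` merely imitates the result of of_get_required_opp_performance_state() (with 0 meaning "nothing specified"), and the callback uses the `dev_get_performance_state` name proposed in the review.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel structures; every name here is hypothetical. */
struct model_dev {
	int dt_pstate;               /* models the required-opp lookup: <0 error, 0 none */
	unsigned int default_pstate; /* models dev_gpd_data(dev)->default_pstate */
	unsigned int voted_pstate;   /* last state handed to the "set" path */
};

struct model_pd {
	/* Optional provider callback, named as suggested in the review. */
	int (*dev_get_performance_state)(struct model_pd *pd, struct model_dev *dev);
};

/* Single place that decides the initial state: DT first, then the callback. */
static int model_get_initial_pstate(struct model_pd *pd, struct model_dev *dev)
{
	if (dev->dt_pstate != 0)
		return dev->dt_pstate;
	if (pd->dev_get_performance_state != NULL)
		return pd->dev_get_performance_state(pd, dev);
	return 0;
}

/* Stand-in for dev_pm_genpd_set_performance_state(); it only records the vote. */
static int model_set_pstate(struct model_dev *dev, unsigned int pstate)
{
	dev->voted_pstate = pstate;
	return 0;
}

/* Attach path: one "get", then one shared "set and remember" step. */
static int model_attach(struct model_pd *pd, struct model_dev *dev)
{
	int pstate = model_get_initial_pstate(pd, dev);
	int ret;

	if (pstate < 0)
		return pstate;
	if (pstate > 0) {
		ret = model_set_pstate(dev, (unsigned int)pstate);
		if (ret)
			return ret;
		dev->default_pstate = (unsigned int)pstate;
	}
	return 1;
}

/* Example provider callback that always reports state 3. */
static int model_fixed_cb(struct model_pd *pd, struct model_dev *dev)
{
	(void)pd;
	(void)dev;
	return 3;
}
```

Calling `model_attach()` with a domain whose callback pointer is NULL and a device with `dt_pstate == 0` simply returns 1 without voting, which mirrors the request that implementing the callback stay optional.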
27.08.2021 17:23, Ulf Hansson wrote:
> On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> Add get_performance_state() callback that retrieves and initializes
>> performance state of a device attached to a power domain. This removes
>> inconsistency of the performance state with hardware state.
>
> Can you please try to elaborate a bit more on the use case? Users need
> to know when it makes sense to implement the callback - and so far we
> tend to document this through detailed commit messages.
>
> Moreover, please state that implementing the callback is optional.

Noted

>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
>>  include/linux/pm_domain.h   |  2 ++
>>  2 files changed, 31 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 3a13a942d012..8b828dcdf7f8 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
>>  		goto err;
>>  	} else if (pstate > 0) {
>>  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> -		if (ret)
>> +		if (ret) {
>> +			dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> +				pd->name, ret);
>
> Moving the dev_err() here means we won't print an error if
> of_get_required_opp_performance_state() fails a few lines above. Is
> that intentional?

Not intentional, I'll add another message.

>>  			goto err;
>> +		}
>>  		dev_gpd_data(dev)->default_pstate = pstate;
>>  	}
>> +
>> +	if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
>> +		bool dev_suspended = false;
>> +
>> +		ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
>> +		if (ret < 0) {
>> +			dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
>> +				pd->name, ret);
>> +			goto err;
>> +		}
>> +
>> +		pstate = ret;
>> +
>> +		if (dev_suspended) {
>
> The dev_suspended thing looks weird.
>
> Perhaps it was needed back when dev_pm_genpd_set_performance_state()
> didn't check pm_runtime_disabled()?

There are two possible variants here:

1. Device is suspended
2. Device is active

If the device is suspended, then it will be activated on RPM-resume, and
the h/w state will require a specific performance state when resumed.
Hence only the rpm_pstate should be set; otherwise the SoC may start to
consume extra power if the device isn't resumed by a consumer driver and
the performance state is bumped without a real need.

If the device is known to be active, then the performance state should be
updated immediately; otherwise we have a state inconsistent with the
hardware.

For Tegra, dev_suspended=true because in general it should be safe to
assume that the hardware is suspended, since it's either stopped by the PD
driver on initial power_on or it's assumed to be disabled by a consumer
driver during probe. Technically it's possible to check the clock and reset
state of an attached device from get_performance_state() to find the
real state of the device, but so far that isn't necessary.

I'll add a comment to the code.

>> +			dev_gpd_data(dev)->rpm_pstate = pstate;
>> +		} else if (pstate > 0) {
>> +			ret = dev_pm_genpd_set_performance_state(dev, pstate);
>> +			if (ret) {
>> +				dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> +					pd->name, ret);
>> +				goto err;
>> +			}
>> +		}
>> +	}
>
> Overall, what we seem to be doing here is retrieving a value for an
> initial/default performance state for a device, and then setting it to
> make sure the vote becomes aggregated and finally set for the genpd.
>
> With your suggested change, there are now two ways to get the
> initial/default state. One is through the existing
> of_get_required_opp_performance_state() and the other is by using a
> new genpd callback.
>
> That said, perhaps we would get somewhat cleaner code by moving the "get
> initial/default performance state" logic into a separate function and
> then calling it from here. If this function returns a valid
> performance state, then we should continue to set the state by
> calling dev_pm_genpd_set_performance_state() and update
> dev_gpd_data(dev)->default_pstate accordingly.
>
> Would that work, do you think?

To be honest, I'm now confused by
of_get_required_opp_performance_state(). It assumes that the device is
active all the time while attached and that the device is stopped on detach.

If the hardware is always-on, then it should be wrong to drop the
performance state on detach.

If the hardware isn't always-on, then it might be suspended during
attachment, and thus only the rpm_pstate should be set. It's also not
guaranteed that the consumer driver will suspend the device on unbind; it
may leave the device active on detach, so it should be wrong to drop the
performance state on detach.

Hence I think the default_pstate is a bit out of touch. If this
attach/detach behaviour is specific to the QCOM driver/hardware, then maybe
of_get_required_opp_performance_state() should be moved out to a
get_performance_state() of the QCOM PD driver?

I added Rajendra Nayak to explain.

For now we're bailing out if default_pstate is set because it conflicts
with get_performance_state().

But we can factor out the code into a separate function anyway to make
it a tad cleaner.

>> +
>>  	return 1;
>>
>>  err:
>> -	dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
>> -		pd->name, ret);
>>  	genpd_remove_device(pd, dev);
>>  	return ret;
>>  }
>> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
>> index 67017c9390c8..4f78b31791ae 100644
>> --- a/include/linux/pm_domain.h
>> +++ b/include/linux/pm_domain.h
>> @@ -133,6 +133,8 @@ struct generic_pm_domain {
>>  				 struct dev_pm_opp *opp);
>>  	int (*set_performance_state)(struct generic_pm_domain *genpd,
>>  				     unsigned int state);
>> +	int (*get_performance_state)(struct generic_pm_domain *genpd,
>> +				     struct device *dev, bool *dev_suspended);
>
> Compared to the ->set_performance_state() callback, which sets a
> performance state for the PM domain (genpd), this new callback is
> about retrieving the *initial/default* performance state for a
> *device* that gets attached to a genpd.
>
> That said, may I suggest renaming the callback to
> "dev_get_performance_state", or something along those lines.

Noted

>>  	struct gpd_dev_ops dev_ops;
>>  	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
>>  	ktime_t next_wakeup;	/* Maintained by the domain governor */
>> --
>> 2.32.0
>>
>
> Kind regards
> Uffe
>
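
The two variants described in the reply above (suspended device: only remember the state for the next runtime resume; active device: apply it right away) can be illustrated with a small user-space model of the proposed attach logic. This is a sketch under assumptions, not the kernel implementation: the `sketch_*` names are hypothetical, `rpm_pstate` imitates dev_gpd_data(dev)->rpm_pstate, and `active_pstate` stands in for a vote that has actually been applied to the domain.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-device bookkeeping. */
struct sketch_dev {
	unsigned int rpm_pstate;    /* state to restore on runtime resume */
	unsigned int active_pstate; /* state currently applied to the domain */
};

/* Models the proposed callback: returns a state (or <0) and reports suspension. */
typedef int (*sketch_get_pstate_fn)(struct sketch_dev *dev, bool *dev_suspended);

/* Models the new block in the attach path from the patch under review. */
static int sketch_init_pstate(struct sketch_dev *dev, sketch_get_pstate_fn get)
{
	bool dev_suspended = false;
	int pstate = get(dev, &dev_suspended);

	if (pstate < 0)
		return pstate;

	if (dev_suspended)
		dev->rpm_pstate = (unsigned int)pstate;    /* variant 1: defer the vote */
	else if (pstate > 0)
		dev->active_pstate = (unsigned int)pstate; /* variant 2: vote immediately */

	return 0;
}

/* Provider that assumes the hardware is suspended at attach time. */
static int sketch_suspended_get(struct sketch_dev *dev, bool *dev_suspended)
{
	(void)dev;
	*dev_suspended = true;
	return 2;
}

/* Provider that knows the hardware is already running. */
static int sketch_active_get(struct sketch_dev *dev, bool *dev_suspended)
{
	(void)dev;
	*dev_suspended = false;
	return 2;
}
```

With `sketch_suspended_get`, nothing is applied until a later resume, which is the power-saving behaviour argued for above; with `sketch_active_get`, the vote takes effect during attach so the bookkeeping matches the running hardware.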
+ Dmitry Baryshkov, Bjorn Andersson

On Fri, 27 Aug 2021 at 17:50, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 27.08.2021 17:23, Ulf Hansson wrote:
> > On Fri, 27 Aug 2021 at 03:37, Dmitry Osipenko <digetx@gmail.com> wrote:
> >>
> >> Add get_performance_state() callback that retrieves and initializes
> >> performance state of a device attached to a power domain. This removes
> >> inconsistency of the performance state with hardware state.
> >
> > Can you please try to elaborate a bit more on the use case? Users need
> > to know when it makes sense to implement the callback - and so far we
> > tend to document this through detailed commit messages.
> >
> > Moreover, please state that implementing the callback is optional.
>
> Noted
>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
> >>  include/linux/pm_domain.h   |  2 ++
> >>  2 files changed, 31 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> >> index 3a13a942d012..8b828dcdf7f8 100644
> >> --- a/drivers/base/power/domain.c
> >> +++ b/drivers/base/power/domain.c
> >> @@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
> >>  		goto err;
> >>  	} else if (pstate > 0) {
> >>  		ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> -		if (ret)
> >> +		if (ret) {
> >> +			dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> +				pd->name, ret);
> >
> > Moving the dev_err() here means we won't print an error if
> > of_get_required_opp_performance_state() fails a few lines above. Is
> > that intentional?
>
> Not intentional, I'll add another message.
>
> >>  			goto err;
> >> +		}
> >>  		dev_gpd_data(dev)->default_pstate = pstate;
> >>  	}
> >> +
> >> +	if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
> >> +		bool dev_suspended = false;
> >> +
> >> +		ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
> >> +		if (ret < 0) {
> >> +			dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
> >> +				pd->name, ret);
> >> +			goto err;
> >> +		}
> >> +
> >> +		pstate = ret;
> >> +
> >> +		if (dev_suspended) {
> >
> > The dev_suspended thing looks weird.
> >
> > Perhaps it was needed back when dev_pm_genpd_set_performance_state()
> > didn't check pm_runtime_disabled()?
>
> There are two possible variants here:
>
> 1. Device is suspended
> 2. Device is active
>
> If the device is suspended, then it will be activated on RPM-resume, and
> the h/w state will require a specific performance state when resumed.
> Hence only the rpm_pstate should be set; otherwise the SoC may start to
> consume extra power if the device isn't resumed by a consumer driver and
> the performance state is bumped without a real need.
>
> If the device is known to be active, then the performance state should be
> updated immediately; otherwise we have a state inconsistent with the
> hardware.
>
> For Tegra, dev_suspended=true because in general it should be safe to
> assume that the hardware is suspended, since it's either stopped by the PD
> driver on initial power_on or it's assumed to be disabled by a consumer
> driver during probe. Technically it's possible to check the clock and reset
> state of an attached device from get_performance_state() to find the
> real state of the device, but so far that isn't necessary.

I follow your reasoning above, but I fail to understand your point, sorry.

Your recent patch ("PM: domains: Improve runtime PM performance state
handling") made dev_pm_genpd_set_performance_state() call
pm_runtime_suspended() to decide between two paths: either it assigns
dev_gpd_data(dev)->rpm_pstate, which postpones the vote until the
device gets runtime resumed, or it calls genpd_set_performance_state()
to immediately vote for a new performance state.

That updated behaviour of dev_pm_genpd_set_performance_state() should
be sufficient, I think.

In other words, please drop the "dev_suspended" parameter from the
->get_performance_state() callback, as it doesn't make sense to me.

>
> I'll add a comment to the code.
>
> >> +			dev_gpd_data(dev)->rpm_pstate = pstate;
> >> +		} else if (pstate > 0) {
> >> +			ret = dev_pm_genpd_set_performance_state(dev, pstate);
> >> +			if (ret) {
> >> +				dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
> >> +					pd->name, ret);
> >> +				goto err;
> >> +			}
> >> +		}
> >> +	}
> >
> > Overall, what we seem to be doing here is retrieving a value for an
> > initial/default performance state for a device, and then setting it to
> > make sure the vote becomes aggregated and finally set for the genpd.
> >
> > With your suggested change, there are now two ways to get the
> > initial/default state. One is through the existing
> > of_get_required_opp_performance_state() and the other is by using a
> > new genpd callback.
> >
> > That said, perhaps we would get somewhat cleaner code by moving the "get
> > initial/default performance state" logic into a separate function and
> > then calling it from here. If this function returns a valid
> > performance state, then we should continue to set the state by
> > calling dev_pm_genpd_set_performance_state() and update
> > dev_gpd_data(dev)->default_pstate accordingly.
> >
> > Would that work, do you think?
>
> To be honest, I'm now confused by
> of_get_required_opp_performance_state(). It assumes that the device is
> active all the time while attached and that the device is stopped on detach.
>
> If the hardware is always-on, then it should be wrong to drop the
> performance state on detach.
>
> If the hardware isn't always-on, then it might be suspended during
> attachment, and thus only the rpm_pstate should be set. It's also not
> guaranteed that the consumer driver will suspend the device on unbind; it
> may leave the device active on detach, so it should be wrong to drop the
> performance state on detach.

I assume the new behaviour in dev_pm_genpd_set_performance_state()
should address most of your concerns above, no?

When it comes to detaching, the best we can do is to drop the
performance state vote for the device, no matter whether it's always-on
HW or not. Simply because after a detach, genpd loses track of the
device, which means it can't account for performance state votes for
it anyway.

>
> Hence I think the default_pstate is a bit out of touch. If this
> attach/detach behaviour is specific to the QCOM driver/hardware, then maybe
> of_get_required_opp_performance_state() should be moved out to a
> get_performance_state() of the QCOM PD driver?

That may work, but I hope it's unnecessary.

Overall, the important part is the updated path in
dev_pm_genpd_set_performance_state() where we now call
pm_runtime_suspended(). I am pretty sure this should work fine for
Qcom platforms/drivers too, but let's see if Rajendra, Dmitry or Bjorn
have some concerns about this.

>
> I added Rajendra Nayak to explain.
>
> For now we're bailing out if default_pstate is set because it conflicts
> with get_performance_state().
>
> But we can factor out the code into a separate function anyway to make
> it a tad cleaner.

Yes, please.

[...]

Kind regards
Uffe
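
The behaviour described in the reply above (the "set" path itself checks whether the device is runtime suspended, postpones the vote if so, and the vote is dropped on detach) can be modelled in a few lines of user-space C. This is only an illustrative sketch based on the description in this thread, not the kernel source: the `toy_*` names are hypothetical, `runtime_suspended` stands in for the result of pm_runtime_suspended(), and `applied_pstate` stands in for a vote that the domain currently accounts for.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-device state. */
struct toy_dev {
	bool runtime_suspended;      /* models pm_runtime_suspended(dev) */
	unsigned int rpm_pstate;     /* vote postponed until runtime resume */
	unsigned int applied_pstate; /* vote currently counted by the domain */
};

/* Models the "set" path: defer while suspended, otherwise vote at once. */
static void toy_set_performance_state(struct toy_dev *dev, unsigned int state)
{
	if (dev->runtime_suspended) {
		dev->rpm_pstate = state;
	} else {
		dev->applied_pstate = state;
		dev->rpm_pstate = 0;
	}
}

/* Models runtime resume: a postponed vote becomes the applied one. */
static void toy_runtime_resume(struct toy_dev *dev)
{
	dev->runtime_suspended = false;
	if (dev->rpm_pstate) {
		dev->applied_pstate = dev->rpm_pstate;
		dev->rpm_pstate = 0;
	}
}

/* Models detach: the domain loses track of the device, so its vote is dropped. */
static void toy_detach(struct toy_dev *dev)
{
	dev->applied_pstate = 0;
	dev->rpm_pstate = 0;
}
```

In this model the caller never needs to say whether the device is suspended, which is the argument made above for dropping the "dev_suspended" parameter from the callback.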
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 3a13a942d012..8b828dcdf7f8 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2700,15 +2700,41 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
 		goto err;
 	} else if (pstate > 0) {
 		ret = dev_pm_genpd_set_performance_state(dev, pstate);
-		if (ret)
+		if (ret) {
+			dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
+				pd->name, ret);
 			goto err;
+		}
 		dev_gpd_data(dev)->default_pstate = pstate;
 	}
+
+	if (pd->get_performance_state && !dev_gpd_data(dev)->default_pstate) {
+		bool dev_suspended = false;
+
+		ret = pd->get_performance_state(pd, base_dev, &dev_suspended);
+		if (ret < 0) {
+			dev_err(dev, "failed to get performance state for power-domain %s: %d\n",
+				pd->name, ret);
+			goto err;
+		}
+
+		pstate = ret;
+
+		if (dev_suspended) {
+			dev_gpd_data(dev)->rpm_pstate = pstate;
+		} else if (pstate > 0) {
+			ret = dev_pm_genpd_set_performance_state(dev, pstate);
+			if (ret) {
+				dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
+					pd->name, ret);
+				goto err;
+			}
+		}
+	}
+
 	return 1;
 
 err:
-	dev_err(dev, "failed to set required performance state for power-domain %s: %d\n",
-		pd->name, ret);
 	genpd_remove_device(pd, dev);
 	return ret;
 }
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 67017c9390c8..4f78b31791ae 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -133,6 +133,8 @@ struct generic_pm_domain {
 				 struct dev_pm_opp *opp);
 	int (*set_performance_state)(struct generic_pm_domain *genpd,
 				     unsigned int state);
+	int (*get_performance_state)(struct generic_pm_domain *genpd,
+				     struct device *dev, bool *dev_suspended);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	ktime_t next_wakeup;	/* Maintained by the domain governor */
Add get_performance_state() callback that retrieves and initializes
performance state of a device attached to a power domain. This removes
inconsistency of the performance state with hardware state.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/base/power/domain.c | 32 +++++++++++++++++++++++++++++---
 include/linux/pm_domain.h   |  2 ++
 2 files changed, 31 insertions(+), 3 deletions(-)