Message ID | 1516744675-21233-1-git-send-email-byan@nvidia.com (mailing list archive) |
---|---|
State | Mainlined |
Delegated to: | Rafael Wysocki |
On Tuesday, January 23, 2018 10:57:55 PM CET Bo Yan wrote:
> cpufreq_resume can be called even without preceding cpufreq_suspend.
> This can happen in following scenario:
>
>     suspend_devices_and_enter
>        --> dpm_suspend_start
>               --> dpm_prepare
>                       --> device_prepare : this function errors out
>               --> dpm_suspend: this is skipped due to dpm_prepare failure
>                                this means cpufreq_suspend is skipped over
>        --> goto Recover_platform, due to previous error
>               --> goto Resume_devices
>                       --> dpm_resume_end
>                               --> dpm_resume
>                                       --> cpufreq_resume
>
> In case schedutil is used as frequency governor, cpufreq_resume will
> eventually call sugov_start, which does following:
>
>     memset(sg_cpu, 0, sizeof(*sg_cpu));
>     ....
>
> This effectively erases function pointer for frequency update, causing
> crash later on. The function pointer would have been set correctly if
> subsequent cpufreq_add_update_util_hook runs successfully, but that
> function returns earlier because cpufreq_suspend was not called:
>
>     if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
>         return;
>
> Ideally, suspend should succeed, then things will be fine. But even
> in case of suspend failure, system should not crash.
>
> The fix is to check cpufreq_suspended first, if it's false, that means
> cpufreq_suspend was not called in the first place, so do not resume
> cpufreq.
>
> Signed-off-by: Bo Yan <byan@nvidia.com>
> ---
>  drivers/cpufreq/cpufreq.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 41d148af7748..95b1c4afe14e 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1680,6 +1680,10 @@ void cpufreq_resume(void)
>  	if (!cpufreq_driver)
>  		return;
>
> +	if (unlikely(!cpufreq_suspended)) {
> +		pr_warn("%s: resume after failing suspend\n", __func__);
> +		return;
> +	}
>  	cpufreq_suspended = false;
>
>  	if (!has_target() && !cpufreq_driver->resume)
>

Good catch, but rather than doing this it would be better to avoid
calling cpufreq_resume() at all if cpufreq_suspend() has not been called.

Thanks,
Rafael

On Tuesday, January 23, 2018 10:57:55 PM CET Bo Yan wrote:
> cpufreq_resume can be called even without preceding cpufreq_suspend.
> This can happen in following scenario:
>
>     suspend_devices_and_enter
>        --> dpm_suspend_start
>               --> dpm_prepare
>                       --> device_prepare : this function errors out
>               --> dpm_suspend: this is skipped due to dpm_prepare failure
>                                this means cpufreq_suspend is skipped over
>        --> goto Recover_platform, due to previous error
>               --> goto Resume_devices
>                       --> dpm_resume_end
>                               --> dpm_resume
>                                       --> cpufreq_resume
>
> In case schedutil is used as frequency governor, cpufreq_resume will
> eventually call sugov_start, which does following:
>
>     memset(sg_cpu, 0, sizeof(*sg_cpu));
>     ....
>
> This effectively erases function pointer for frequency update, causing
> crash later on. The function pointer would have been set correctly if
> subsequent cpufreq_add_update_util_hook runs successfully, but that
> function returns earlier because cpufreq_suspend was not called:
>
>     if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
>         return;
>
> Ideally, suspend should succeed, then things will be fine. But even
> in case of suspend failure, system should not crash.
>
> The fix is to check cpufreq_suspended first, if it's false, that means
> cpufreq_suspend was not called in the first place, so do not resume
> cpufreq.
>
> Signed-off-by: Bo Yan <byan@nvidia.com>
> ---
>  drivers/cpufreq/cpufreq.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 41d148af7748..95b1c4afe14e 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1680,6 +1680,10 @@ void cpufreq_resume(void)
>  	if (!cpufreq_driver)
>  		return;
>
> +	if (unlikely(!cpufreq_suspended)) {
> +		pr_warn("%s: resume after failing suspend\n", __func__);
> +		return;
> +	}
>  	cpufreq_suspended = false;
>
>  	if (!has_target() && !cpufreq_driver->resume)

I've just edited this patch somewhat (mostly by dropping the pr_warn())
and queued it up.

Thanks,
Rafael

On 05-02-18, 10:19, Rafael J. Wysocki wrote:
> On Tuesday, January 23, 2018 10:57:55 PM CET Bo Yan wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 41d148af7748..95b1c4afe14e 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -1680,6 +1680,10 @@ void cpufreq_resume(void)
> >  	if (!cpufreq_driver)
> >  		return;
> >
> > +	if (unlikely(!cpufreq_suspended)) {
> > +		pr_warn("%s: resume after failing suspend\n", __func__);
> > +		return;
> > +	}
> >  	cpufreq_suspended = false;
> >
> >  	if (!has_target() && !cpufreq_driver->resume)
>
> I've just edited this patch somewhat (mostly by dropping the pr_warn())
> and queued it up.

You can add my Ack as well.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 41d148af7748..95b1c4afe14e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1680,6 +1680,10 @@ void cpufreq_resume(void)
 	if (!cpufreq_driver)
 		return;
 
+	if (unlikely(!cpufreq_suspended)) {
+		pr_warn("%s: resume after failing suspend\n", __func__);
+		return;
+	}
 	cpufreq_suspended = false;
 
 	if (!has_target() && !cpufreq_driver->resume)

cpufreq_resume can be called even without preceding cpufreq_suspend.
This can happen in following scenario:

    suspend_devices_and_enter
       --> dpm_suspend_start
              --> dpm_prepare
                      --> device_prepare : this function errors out
              --> dpm_suspend: this is skipped due to dpm_prepare failure
                               this means cpufreq_suspend is skipped over
       --> goto Recover_platform, due to previous error
              --> goto Resume_devices
                      --> dpm_resume_end
                              --> dpm_resume
                                      --> cpufreq_resume

In case schedutil is used as frequency governor, cpufreq_resume will
eventually call sugov_start, which does following:

    memset(sg_cpu, 0, sizeof(*sg_cpu));
    ....

This effectively erases function pointer for frequency update, causing
crash later on. The function pointer would have been set correctly if
subsequent cpufreq_add_update_util_hook runs successfully, but that
function returns earlier because cpufreq_suspend was not called:

    if (WARN_ON(per_cpu(cpufreq_update_util_data, cpu)))
        return;

Ideally, suspend should succeed, then things will be fine. But even
in case of suspend failure, system should not crash.

The fix is to check cpufreq_suspended first, if it's false, that means
cpufreq_suspend was not called in the first place, so do not resume
cpufreq.

Signed-off-by: Bo Yan <byan@nvidia.com>
---
 drivers/cpufreq/cpufreq.c | 4 ++++
 1 file changed, 4 insertions(+)