diff mbox

[Update,1/1] Cpufreq: Make governor data on nonboot cpus across system suspend/resume

Message ID 1384503334-18809-1-git-send-email-tianyu.lan@intel.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

lan,Tianyu Nov. 15, 2013, 8:15 a.m. UTC
Currently, governor of nonboot cpus will be put to EXIT when system suspend.
Since all these cpus will be unplugged and the governor usage_count decreases
to zero. The governor data and its sysfs interfaces will be freed or released.
This makes user config of these governors loss during suspend and resume.

This doesn't happen on the governor covering boot cpu because it isn't
unplugged during system suspend.

To fix this issue, skipping governor exit during system suspend and check
policy governor data to determine whether the governor is really needed
to be initialized when do init. If not, return EALREADY to indicate the
governor has been initialized and should do nothing. __cpufreq_governor()
convert EALREADY to 0 as return value for INIT event since governor is
still under INIT state and can do START operation.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
---
Fix some typos

 drivers/cpufreq/cpufreq.c          |  5 ++++-
 drivers/cpufreq/cpufreq_governor.c | 13 ++++++++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

Comments

Viresh Kumar Nov. 15, 2013, 10:22 a.m. UTC | #1
On 15 November 2013 13:45, Lan Tianyu <tianyu.lan@intel.com> wrote:
> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
> Since all these cpus will be unplugged and the governor usage_count decreases
> to zero. The governor data and its sysfs interfaces will be freed or released.
> This makes user config of these governors loss during suspend and resume.
>
> This doesn't happen on the governor covering boot cpu because it isn't
> unplugged during system suspend.
>
> To fix this issue, skipping governor exit during system suspend and check
> policy governor data to determine whether the governor is really needed
> to be initialized when do init. If not, return EALREADY to indicate the
> governor has been initialized and should do nothing. __cpufreq_governor()
> convert EALREADY to 0 as return value for INIT event since governor is
> still under INIT state and can do START operation.
>
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---

Hi Lan..

Apologies!!

I already had a solution for this as this was reported by few Broadcom people
as well. But I haven't send it to mainline yet as it was untested. It
looked similar
to what you had..

And so I would have taken your patch (as you have sent it first to the list and
there is no real advantage of my patch over yours, they were almost same) :)

But then I went chasing another bug posted by Nishant:

https://lkml.org/lkml/2013/10/24/369

And the final solution I have to write solved all the problems you and he
reported.

Please have a look at that patch (you are cc'd) and give it a try to see
if it fixes your problem..

Btw, One question about your setup:
- you must have a multi cluster/socket SoC as you have atleast one more
policy structure than used for group containing boot cpu..
- Are you using separate governor for both groups?
- Or are you using CPUFREQ_HAVE_GOVERNOR_PER_POLICY stuff
to use same governor with separate tunables for both groups?

Just wanted to know if somebody else is also using
CPUFREQ_HAVE_GOVERNOR_PER_POLICY :)
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar Nov. 15, 2013, 10:54 a.m. UTC | #2
Adding Rainer as well

On 15 November 2013 13:45, Lan Tianyu <tianyu.lan@intel.com> wrote:
> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
> Since all these cpus will be unplugged and the governor usage_count decreases
> to zero. The governor data and its sysfs interfaces will be freed or released.
> This makes user config of these governors loss during suspend and resume.
>
> This doesn't happen on the governor covering boot cpu because it isn't
> unplugged during system suspend.
>
> To fix this issue, skipping governor exit during system suspend and check
> policy governor data to determine whether the governor is really needed
> to be initialized when do init. If not, return EALREADY to indicate the
> governor has been initialized and should do nothing. __cpufreq_governor()
> convert EALREADY to 0 as return value for INIT event since governor is
> still under INIT state and can do START operation.

Though the patch I have sent fixes a problem similar to this but I don't think
patch of any of us will solve the issue Rainer is facing..

I checked his system configuration and its like this:
- Four CPUs, all having separate clock domains (atleast from kernel
perspective) and so separate policy structure.
- All are using ondemand governor
- not using CPUFREQ_HAVE_GOVERNOR_PER_POLICY feature
- So there is a single set of tunables for ondemand governor that is applicable
across all CPUs..

The way INIT/EXIT are designed in cpufreq_governor.c should take care
of this scenario.

memory for tunables must not be freed unless all the CPUs are removed.
Which can't happen, as we only offline non-boot CPUs and so I believe
that memory isn't getting freed and so your solution wouldn't address his
problem..

Sorry if I said something stupid enough :)

--
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Nov. 16, 2013, 12:38 a.m. UTC | #3
On Friday, November 15, 2013 04:15:34 PM Lan Tianyu wrote:
> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
> Since all these cpus will be unplugged and the governor usage_count decreases
> to zero. The governor data and its sysfs interfaces will be freed or released.
> This makes user config of these governors loss during suspend and resume.

First off, do we have a pointer to a bug report related to that?

Second, what does need to be done to reproduce this problem?

> This doesn't happen on the governor covering boot cpu because it isn't
> unplugged during system suspend.
> 
> To fix this issue, skipping governor exit during system suspend and check
> policy governor data to determine whether the governor is really needed
> to be initialized when do init. If not, return EALREADY to indicate the
> governor has been initialized and should do nothing. __cpufreq_governor()
> convert EALREADY to 0 as return value for INIT event since governor is
> still under INIT state and can do START operation.
> 
> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> ---
> Fix some typos
> 
>  drivers/cpufreq/cpufreq.c          |  5 ++++-
>  drivers/cpufreq/cpufreq_governor.c | 13 ++++++++++++-
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 02d534d..38f2e4a 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1239,7 +1239,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>  
>  	/* If cpu is last user of policy, free policy */
>  	if (cpus == 1) {
> -		if (has_target()) {
> +		if (has_target() && !frozen) {
>  			ret = __cpufreq_governor(policy,
>  					CPUFREQ_GOV_POLICY_EXIT);
>  			if (ret) {
> @@ -1822,6 +1822,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>  			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
>  		module_put(policy->governor->owner);
>  
> +	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
> +		ret = 0;
> +
>  	return ret;
>  }
>  
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 0806c31..ddb93af 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -204,9 +204,20 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>  
>  	switch (event) {
>  	case CPUFREQ_GOV_POLICY_INIT:
> +		/*
> +		 * In order to keep governor data across suspend/resume,
> +		 * Governor doesn't exit when suspend and will be
> +		 * reinitialized when resume. Here check policy governor
> +		 * data to determine whether the governor has been exited.
> +		 * If not, return EALREADY.
> +		 */
>  		if (have_governor_per_policy()) {
> -			WARN_ON(dbs_data);
> +			if (dbs_data)
> +				return -EALREADY;
>  		} else if (dbs_data) {
> +			if (policy->governor_data == dbs_data)
> +				return -EALREADY;
> +
>  			dbs_data->usage_count++;
>  			policy->governor_data = dbs_data;
>  			return 0;
>
lan,Tianyu Nov. 16, 2013, 3:59 a.m. UTC | #4
On 11/16/2013 08:38 AM, Rafael J. Wysocki wrote:
> On Friday, November 15, 2013 04:15:34 PM Lan Tianyu wrote:
>> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
>> Since all these cpus will be unplugged and the governor usage_count decreases
>> to zero. The governor data and its sysfs interfaces will be freed or released.
>> This makes user config of these governors loss during suspend and resume.
>
> First off, do we have a pointer to a bug report related to that?
>

No, I found this bug when I tried to resolve other similar bug.
https://bugzilla.kernel.org/show_bug.cgi?id=63081. I still have no idea 
about bug 63081 and asked reporter to try this patch.

> Second, what does need to be done to reproduce this problem?
>

Defaultly, all cpus use ondemand governor after bootup. Change one 
non-boot cpu's governor to conservative, modify conservative config via 
sysfs interface and then do system suspend. After resume, the config
of conservative is reset. On my machine, all cpus have owen policy.


>> This doesn't happen on the governor covering boot cpu because it isn't
>> unplugged during system suspend.
>>
>> To fix this issue, skipping governor exit during system suspend and check
>> policy governor data to determine whether the governor is really needed
>> to be initialized when do init. If not, return EALREADY to indicate the
>> governor has been initialized and should do nothing. __cpufreq_governor()
>> convert EALREADY to 0 as return value for INIT event since governor is
>> still under INIT state and can do START operation.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>> Fix some typos
>>
>>   drivers/cpufreq/cpufreq.c          |  5 ++++-
>>   drivers/cpufreq/cpufreq_governor.c | 13 ++++++++++++-
>>   2 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 02d534d..38f2e4a 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1239,7 +1239,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>
>>   	/* If cpu is last user of policy, free policy */
>>   	if (cpus == 1) {
>> -		if (has_target()) {
>> +		if (has_target() && !frozen) {
>>   			ret = __cpufreq_governor(policy,
>>   					CPUFREQ_GOV_POLICY_EXIT);
>>   			if (ret) {
>> @@ -1822,6 +1822,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>>   			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
>>   		module_put(policy->governor->owner);
>>
>> +	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
>> +		ret = 0;
>> +
>>   	return ret;
>>   }
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 0806c31..ddb93af 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -204,9 +204,20 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>
>>   	switch (event) {
>>   	case CPUFREQ_GOV_POLICY_INIT:
>> +		/*
>> +		 * In order to keep governor data across suspend/resume,
>> +		 * Governor doesn't exit when suspend and will be
>> +		 * reinitialized when resume. Here check policy governor
>> +		 * data to determine whether the governor has been exited.
>> +		 * If not, return EALREADY.
>> +		 */
>>   		if (have_governor_per_policy()) {
>> -			WARN_ON(dbs_data);
>> +			if (dbs_data)
>> +				return -EALREADY;
>>   		} else if (dbs_data) {
>> +			if (policy->governor_data == dbs_data)
>> +				return -EALREADY;
>> +
>>   			dbs_data->usage_count++;
>>   			policy->governor_data = dbs_data;
>>   			return 0;
>>
lan,Tianyu Nov. 16, 2013, 4:33 a.m. UTC | #5
On 11/15/2013 06:22 PM, Viresh Kumar wrote:
> On 15 November 2013 13:45, Lan Tianyu <tianyu.lan@intel.com> wrote:
>> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
>> Since all these cpus will be unplugged and the governor usage_count decreases
>> to zero. The governor data and its sysfs interfaces will be freed or released.
>> This makes user config of these governors loss during suspend and resume.
>>
>> This doesn't happen on the governor covering boot cpu because it isn't
>> unplugged during system suspend.
>>
>> To fix this issue, skipping governor exit during system suspend and check
>> policy governor data to determine whether the governor is really needed
>> to be initialized when do init. If not, return EALREADY to indicate the
>> governor has been initialized and should do nothing. __cpufreq_governor()
>> convert EALREADY to 0 as return value for INIT event since governor is
>> still under INIT state and can do START operation.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> ---
>
> Hi Lan..
>

Hi Viresh:

> Apologies!!
>
> I already had a solution for this as this was reported by few Broadcom people
> as well. But I haven't send it to mainline yet as it was untested. It
> looked similar
> to what you had..
>
> And so I would have taken your patch (as you have sent it first to the list and
> there is no real advantage of my patch over yours, they were almost same) :)
>
> But then I went chasing another bug posted by Nishant:
>
> https://lkml.org/lkml/2013/10/24/369
>
> And the final solution I have to write solved all the problems you and he
> reported.
>
> Please have a look at that patch (you are cc'd) and give it a try to see
> if it fixes your problem..

Never mind. I think it should work and will try it.

>
> Btw, One question about your setup:
> - you must have a multi cluster/socket SoC as you have atleast one more
> policy structure than used for group containing boot cpu..

Actually, I test on a laptop and find this issue when reading code to 
fix other bug. :)

All cpus have their own policys.

> - Are you using separate governor for both groups?

Just to produce the bug, I set one non-boot cpu to conservative gov. All 
other remain default "ondemand".

> - Or are you using CPUFREQ_HAVE_GOVERNOR_PER_POLICY stuff
> to use same governor with separate tunables for both groups?
>

No, I am not using this.

> Just wanted to know if somebody else is also using
> CPUFREQ_HAVE_GOVERNOR_PER_POLICY :)
>
Viresh Kumar Nov. 16, 2013, 5:24 a.m. UTC | #6
On Friday 15 November 2013 04:24 PM, Viresh Kumar wrote:
> Though the patch I have sent fixes a problem similar to this but I don't think
> patch of any of us will solve the issue Rainer is facing..
> 
> I checked his system configuration and its like this:
> - Four CPUs, all having separate clock domains (atleast from kernel
> perspective) and so separate policy structure.
> - All are using ondemand governor
> - not using CPUFREQ_HAVE_GOVERNOR_PER_POLICY feature
> - So there is a single set of tunables for ondemand governor that is applicable
> across all CPUs..
> 
> The way INIT/EXIT are designed in cpufreq_governor.c should take care
> of this scenario.
> 
> memory for tunables must not be freed unless all the CPUs are removed.
> Which can't happen, as we only offline non-boot CPUs and so I believe
> that memory isn't getting freed and so your solution wouldn't address his
> problem..
> 
> Sorry if I said something stupid enough :)

I haven't :)

From your another mail it is clear that you have used separate governors and so
you have faced the real problem :)

Hope my patch fixes it for you.

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Nov. 16, 2013, 2:41 p.m. UTC | #7
On Saturday, November 16, 2013 11:59:59 AM Lan Tianyu wrote:
> On 11/16/2013 08:38 AM, Rafael J. Wysocki wrote:
> > On Friday, November 15, 2013 04:15:34 PM Lan Tianyu wrote:
> >> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
> >> Since all these cpus will be unplugged and the governor usage_count decreases
> >> to zero. The governor data and its sysfs interfaces will be freed or released.
> >> This makes user config of these governors loss during suspend and resume.
> >
> > First off, do we have a pointer to a bug report related to that?
> >
> 
> No, I found this bug when I tried to resolve other similar bug.
> https://bugzilla.kernel.org/show_bug.cgi?id=63081. I still have no idea 
> about bug 63081 and asked reporter to try this patch.
> 
> > Second, what does need to be done to reproduce this problem?
> >
> 
> Defaultly, all cpus use ondemand governor after bootup. Change one 
> non-boot cpu's governor to conservative,

Well, why would anyone want to do that?  Just out of curiosity ...

> modify conservative config via sysfs interface and then do system suspend.
> After resume, the config of conservative is reset. On my machine, all cpus
> have owen policy.

So this is acpi-cpufreq, right?

The patch looks basically OK to me, but ->

> >> This doesn't happen on the governor covering boot cpu because it isn't
> >> unplugged during system suspend.
> >>
> >> To fix this issue, skipping governor exit during system suspend and check
> >> policy governor data to determine whether the governor is really needed
> >> to be initialized when do init. If not, return EALREADY to indicate the
> >> governor has been initialized and should do nothing. __cpufreq_governor()
> >> convert EALREADY to 0 as return value for INIT event since governor is
> >> still under INIT state and can do START operation.
> >>
> >> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
> >> ---
> >> Fix some typos
> >>
> >>   drivers/cpufreq/cpufreq.c          |  5 ++++-
> >>   drivers/cpufreq/cpufreq_governor.c | 13 ++++++++++++-
> >>   2 files changed, 16 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> >> index 02d534d..38f2e4a 100644
> >> --- a/drivers/cpufreq/cpufreq.c
> >> +++ b/drivers/cpufreq/cpufreq.c
> >> @@ -1239,7 +1239,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
> >>
> >>   	/* If cpu is last user of policy, free policy */
> >>   	if (cpus == 1) {
> >> -		if (has_target()) {
> >> +		if (has_target() && !frozen) {
> >>   			ret = __cpufreq_governor(policy,
> >>   					CPUFREQ_GOV_POLICY_EXIT);
> >>   			if (ret) {
> >> @@ -1822,6 +1822,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> >>   			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
> >>   		module_put(policy->governor->owner);
> >>
> >> +	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
> >> +		ret = 0;
> >> +

-> I'd prefer this check to be combined with the one done to determine whether
or not we need to do the module_put().  Something like

	if (event == CPUFREQ_GOV_POLICY_EXIT && ret) {
		module_put(policy->governor->owner);
		if (ret == -EALREADY)
			return 0;
	} else if (event == CPUFREQ_GOV_POLICY_EXIT && !ret) {
		module_put(policy->governor->owner);
	}

Thanks!

> >>   	return ret;
> >>   }
> >>
> >> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >> index 0806c31..ddb93af 100644
> >> --- a/drivers/cpufreq/cpufreq_governor.c
> >> +++ b/drivers/cpufreq/cpufreq_governor.c
> >> @@ -204,9 +204,20 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
> >>
> >>   	switch (event) {
> >>   	case CPUFREQ_GOV_POLICY_INIT:
> >> +		/*
> >> +		 * In order to keep governor data across suspend/resume,
> >> +		 * Governor doesn't exit when suspend and will be
> >> +		 * reinitialized when resume. Here check policy governor
> >> +		 * data to determine whether the governor has been exited.
> >> +		 * If not, return EALREADY.
> >> +		 */
> >>   		if (have_governor_per_policy()) {
> >> -			WARN_ON(dbs_data);
> >> +			if (dbs_data)
> >> +				return -EALREADY;
> >>   		} else if (dbs_data) {
> >> +			if (policy->governor_data == dbs_data)
> >> +				return -EALREADY;
> >> +
> >>   			dbs_data->usage_count++;
> >>   			policy->governor_data = dbs_data;
> >>   			return 0;
> >>
> 
> 
>
Viresh Kumar Nov. 16, 2013, 2:57 p.m. UTC | #8
On 16 November 2013 20:11, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Saturday, November 16, 2013 11:59:59 AM Lan Tianyu wrote:

>> Defaultly, all cpus use ondemand governor after bootup. Change one
>> non-boot cpu's governor to conservative,
>
> Well, why would anyone want to do that?  Just out of curiosity ...

People may want to use different group/cluster/socket of CPUs differently,
with different kind of policies. Maybe performance governor for boot cpu
and ondemand for others.

This bug would also be there for big LITTLE where we want to have
separate set of tunables for big and LITTLE clusters for the same type
of governor.

> So this is acpi-cpufreq, right?

Probably yes, I saw something similar somewhere.. But this is driver
independent..

> The patch looks basically OK to me, but ->

We wouldn't need this patch if my other patch (where I am disabling
governors in suspend/resume goes in, in any form)..
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Nov. 16, 2013, 3:09 p.m. UTC | #9
On Saturday, November 16, 2013 03:41:10 PM Rafael J. Wysocki wrote:

[...]

> > >> @@ -1822,6 +1822,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
> > >>   			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
> > >>   		module_put(policy->governor->owner);
> > >>
> > >> +	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
> > >> +		ret = 0;
> > >> +
> 
> -> I'd prefer this check to be combined with the one done to determine whether
> or not we need to do the module_put().  Something like
> 
> 	if (event == CPUFREQ_GOV_POLICY_EXIT && ret) {

Obviously, that should be:

	if (event == CPUFREQ_GOV_POLICY_INIT && ret) {

> 		module_put(policy->governor->owner);
> 		if (ret == -EALREADY)
> 			return 0;
> 	} else if (event == CPUFREQ_GOV_POLICY_EXIT && !ret) {
> 		module_put(policy->governor->owner);
> 	}

Sorry for the confusion.

Thanks!
Rafael J. Wysocki Nov. 16, 2013, 3:10 p.m. UTC | #10
On Saturday, November 16, 2013 08:27:07 PM Viresh Kumar wrote:
> On 16 November 2013 20:11, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Saturday, November 16, 2013 11:59:59 AM Lan Tianyu wrote:
> 
> >> Defaultly, all cpus use ondemand governor after bootup. Change one
> >> non-boot cpu's governor to conservative,
> >
> > Well, why would anyone want to do that?  Just out of curiosity ...
> 
> People may want to use different group/cluster/socket of CPUs differently,
> with different kind of policies. Maybe performance governor for boot cpu
> and ondemand for others.
> 
> This bug would also be there for big LITTLE where we want to have
> separate set of tunables for big and LITTLE clusters for the same type
> of governor.
> 
> > So this is acpi-cpufreq, right?
> 
> Probably yes, I saw something similar somewhere.. But this is driver
> independent..
> 
> > The patch looks basically OK to me, but ->
> 
> We wouldn't need this patch if my other patch (where I am disabling
> governors in suspend/resume goes in, in any form)..

Yes, I know that, but I don't think this is the right approach.

Thanks!
lan,Tianyu Nov. 16, 2013, 3:23 p.m. UTC | #11
On 11/16/2013 10:41 PM, Rafael J. Wysocki wrote:
> On Saturday, November 16, 2013 11:59:59 AM Lan Tianyu wrote:
>> On 11/16/2013 08:38 AM, Rafael J. Wysocki wrote:
>>> On Friday, November 15, 2013 04:15:34 PM Lan Tianyu wrote:
>>>> Currently, governor of nonboot cpus will be put to EXIT when system suspend.
>>>> Since all these cpus will be unplugged and the governor usage_count decreases
>>>> to zero. The governor data and its sysfs interfaces will be freed or released.
>>>> This makes user config of these governors loss during suspend and resume.
>>>
>>> First off, do we have a pointer to a bug report related to that?
>>>
>>
>> No, I found this bug when I tried to resolve other similar bug.
>> https://bugzilla.kernel.org/show_bug.cgi?id=63081. I still have no idea
>> about bug 63081 and asked reporter to try this patch.
>>
>>> Second, what does need to be done to reproduce this problem?
>>>
>>
>> Defaultly, all cpus use ondemand governor after bootup. Change one
>> non-boot cpu's governor to conservative,
>
> Well, why would anyone want to do that?  Just out of curiosity ...

Just use this way to produce the issue. But on the laptop, I think
fewer people will do the same thing. Just like Viresh said, this also
will affect the systems of governor per policy.

>
>> modify conservative config via sysfs interface and then do system suspend.
>> After resume, the config of conservative is reset. On my machine, all cpus
>> have owen policy.
>
> So this is acpi-cpufreq, right?
>

Yes, it's acpi-cpufreq driver.

> The patch looks basically OK to me, but ->
>
>>>> This doesn't happen on the governor covering boot cpu because it isn't
>>>> unplugged during system suspend.
>>>>
>>>> To fix this issue, skipping governor exit during system suspend and check
>>>> policy governor data to determine whether the governor is really needed
>>>> to be initialized when do init. If not, return EALREADY to indicate the
>>>> governor has been initialized and should do nothing. __cpufreq_governor()
>>>> convert EALREADY to 0 as return value for INIT event since governor is
>>>> still under INIT state and can do START operation.
>>>>
>>>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>>>> ---
>>>> Fix some typos
>>>>
>>>>    drivers/cpufreq/cpufreq.c          |  5 ++++-
>>>>    drivers/cpufreq/cpufreq_governor.c | 13 ++++++++++++-
>>>>    2 files changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>> index 02d534d..38f2e4a 100644
>>>> --- a/drivers/cpufreq/cpufreq.c
>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>> @@ -1239,7 +1239,7 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
>>>>
>>>>    	/* If cpu is last user of policy, free policy */
>>>>    	if (cpus == 1) {
>>>> -		if (has_target()) {
>>>> +		if (has_target() && !frozen) {
>>>>    			ret = __cpufreq_governor(policy,
>>>>    					CPUFREQ_GOV_POLICY_EXIT);
>>>>    			if (ret) {
>>>> @@ -1822,6 +1822,9 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
>>>>    			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
>>>>    		module_put(policy->governor->owner);
>>>>
>>>> +	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
>>>> +		ret = 0;
>>>> +
>
> -> I'd prefer this check to be combined with the one done to determine whether
> or not we need to do the module_put().  Something like
>
> 	if (event == CPUFREQ_GOV_POLICY_EXIT && ret) {
> 		module_put(policy->governor->owner);
> 		if (ret == -EALREADY)
> 			return 0;
> 	} else if (event == CPUFREQ_GOV_POLICY_EXIT && !ret) {
> 		module_put(policy->governor->owner);
> 	}
>

Ok. I will update soon.

> Thanks!
>
>>>>    	return ret;
>>>>    }
>>>>
>>>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>>>> index 0806c31..ddb93af 100644
>>>> --- a/drivers/cpufreq/cpufreq_governor.c
>>>> +++ b/drivers/cpufreq/cpufreq_governor.c
>>>> @@ -204,9 +204,20 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
>>>>
>>>>    	switch (event) {
>>>>    	case CPUFREQ_GOV_POLICY_INIT:
>>>> +		/*
>>>> +		 * In order to keep governor data across suspend/resume,
>>>> +		 * Governor doesn't exit when suspend and will be
>>>> +		 * reinitialized when resume. Here check policy governor
>>>> +		 * data to determine whether the governor has been exited.
>>>> +		 * If not, return EALREADY.
>>>> +		 */
>>>>    		if (have_governor_per_policy()) {
>>>> -			WARN_ON(dbs_data);
>>>> +			if (dbs_data)
>>>> +				return -EALREADY;
>>>>    		} else if (dbs_data) {
>>>> +			if (policy->governor_data == dbs_data)
>>>> +				return -EALREADY;
>>>> +
>>>>    			dbs_data->usage_count++;
>>>>    			policy->governor_data = dbs_data;
>>>>    			return 0;
>>>>
>>
>>
>>
diff mbox

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 02d534d..38f2e4a 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1239,7 +1239,7 @@  static int __cpufreq_remove_dev_finish(struct device *dev,
 
 	/* If cpu is last user of policy, free policy */
 	if (cpus == 1) {
-		if (has_target()) {
+		if (has_target() && !frozen) {
 			ret = __cpufreq_governor(policy,
 					CPUFREQ_GOV_POLICY_EXIT);
 			if (ret) {
@@ -1822,6 +1822,9 @@  static int __cpufreq_governor(struct cpufreq_policy *policy,
 			((event == CPUFREQ_GOV_POLICY_EXIT) && !ret))
 		module_put(policy->governor->owner);
 
+	if ((event == CPUFREQ_GOV_POLICY_INIT) && ret == -EALREADY)
+		ret = 0;
+
 	return ret;
 }
 
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 0806c31..ddb93af 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -204,9 +204,20 @@  int cpufreq_governor_dbs(struct cpufreq_policy *policy,
 
 	switch (event) {
 	case CPUFREQ_GOV_POLICY_INIT:
+		/*
+		 * In order to keep governor data across suspend/resume,
+		 * Governor doesn't exit when suspend and will be
+		 * reinitialized when resume. Here check policy governor
+		 * data to determine whether the governor has been exited.
+		 * If not, return EALREADY.
+		 */
 		if (have_governor_per_policy()) {
-			WARN_ON(dbs_data);
+			if (dbs_data)
+				return -EALREADY;
 		} else if (dbs_data) {
+			if (policy->governor_data == dbs_data)
+				return -EALREADY;
+
 			dbs_data->usage_count++;
 			policy->governor_data = dbs_data;
 			return 0;