NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2

Message ID 5199E54D.7030407@linux.vnet.ibm.com (mailing list archive)
State Not Applicable, archived

Commit Message

Michael Wang May 20, 2013, 8:56 a.m. UTC
On 05/20/2013 03:25 PM, Michael Wang wrote:
> 
> Yeah, that's right, I guess the issue is, although the policy->cpus is
> correct at a given time, after get cpu from it, it's possible to be
> changed, unless we disabled preempt or irq, or hotplug before we use it...
> 
> Like such issue cases:
> 				get x from policy->cpus
> 	DOWN notifier
> 	change policy->cpus
> 	do offline x
> 				send ipi to x
> 
> Will that happen?

Maybe we could do some testing to confirm it?

The patch below is supposed to make the WARN disappear; if it works, then BINGO :)

Regards,
Michael Wang

> 
> Regards,
> Michael Wang
> 
> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Viresh Kumar May 20, 2013, 9:09 a.m. UTC | #1
On 20 May 2013 14:26, Michael Wang <wangyun@linux.vnet.ibm.com> wrote:
> On 05/20/2013 03:25 PM, Michael Wang wrote:
>> Yeah, that's right, I guess the issue is, although the policy->cpus is
>> correct at a given time, after get cpu from it, it's possible to be
>> changed, unless we disabled preempt or irq, or hotplug before we use it...
>>
>> Like such issue cases:
>>                               get x from policy->cpus
>>       DOWN notifier
>>       change policy->cpus
>>       do offline x
>>                               send ipi to x
>>
>> Will that happen?

Sorry I am not sure. :(

I can see a mutex being used in cpufreq_governor.c, which should take care
of race conditions...

> May be we could do some test to confirm it?
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c
> b/drivers/cpufreq/cpufreq_governor.c
> index 443442d..449be88 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -26,6 +26,7 @@
>  #include <linux/tick.h>
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
> +#include <linux/cpu.h>
>
>  #include "cpufreq_governor.h"
>
> @@ -180,8 +181,10 @@ void gov_queue_work(struct dbs_data *dbs_data,
> struct cpufreq_policy *policy,
>         if (!all_cpus) {
>                 __gov_queue_work(smp_processor_id(), dbs_data, delay);
>         } else {
> +               get_online_cpus();
>                 for_each_cpu(i, policy->cpus)
>                         __gov_queue_work(i, dbs_data, delay);
> +               put_online_cpus();
>         }
>  }
>  EXPORT_SYMBOL_GPL(gov_queue_work);
>
> This is supposed to make WARN disappear, if it works, then BINGO :)

Let people test it and then we can talk :)
Michael Wang May 20, 2013, 9:24 a.m. UTC | #2
On 05/20/2013 05:09 PM, Viresh Kumar wrote:
> On 20 May 2013 14:26, Michael Wang <wangyun@linux.vnet.ibm.com> wrote:
>> On 05/20/2013 03:25 PM, Michael Wang wrote:
>>> Yeah, that's right, I guess the issue is, although the policy->cpus is
>>> correct at a given time, after get cpu from it, it's possible to be
>>> changed, unless we disabled preempt or irq, or hotplug before we use it...
>>>
>>> Like such issue cases:
>>>                               get x from policy->cpus
>>>       DOWN notifier
>>>       change policy->cpus
>>>       do offline x
>>>                               send ipi to x
>>>
>>> Will that happen?
> 
> Sorry I am not sure. :(
> 
> I can see mutex being used in cpufreq_governor.c which should take care
> of race conditions...
> 
>> May be we could do some test to confirm it?
>>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c
>> b/drivers/cpufreq/cpufreq_governor.c
>> index 443442d..449be88 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -26,6 +26,7 @@
>>  #include <linux/tick.h>
>>  #include <linux/types.h>
>>  #include <linux/workqueue.h>
>> +#include <linux/cpu.h>
>>
>>  #include "cpufreq_governor.h"
>>
>> @@ -180,8 +181,10 @@ void gov_queue_work(struct dbs_data *dbs_data,
>> struct cpufreq_policy *policy,
>>         if (!all_cpus) {
>>                 __gov_queue_work(smp_processor_id(), dbs_data, delay);
>>         } else {
>> +               get_online_cpus();
>>                 for_each_cpu(i, policy->cpus)
>>                         __gov_queue_work(i, dbs_data, delay);
>> +               put_online_cpus();
>>         }
>>  }
>>  EXPORT_SYMBOL_GPL(gov_queue_work);
>>
>> This is supposed to make WARN disappear, if it works, then BINGO :)
> 
> Let people test it and then we can talk :)

Agree :)

Borislav, would you like to give it a try?

If this fix causes other trouble, you could also try get_cpu() or
disabling IRQs.

Regards,
Michael Wang

> 


Patch

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 443442d..449be88 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -26,6 +26,7 @@ 
 #include <linux/tick.h>
 #include <linux/types.h>
 #include <linux/workqueue.h>
+#include <linux/cpu.h>

 #include "cpufreq_governor.h"

@@ -180,8 +181,10 @@ void gov_queue_work(struct dbs_data *dbs_data, struct cpufreq_policy *policy,
        if (!all_cpus) {
                __gov_queue_work(smp_processor_id(), dbs_data, delay);
        } else {
+               get_online_cpus();
                for_each_cpu(i, policy->cpus)
                        __gov_queue_work(i, dbs_data, delay);
+               put_online_cpus();
        }
 }
 EXPORT_SYMBOL_GPL(gov_queue_work);
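
The interleaving described in the thread (get x from policy->cpus, the DOWN notifier changes policy->cpus, x goes offline, then an IPI is sent to x) can be replayed in a small user-space model. This is only a sketch of the suspected race, not kernel code, and it does not confirm that the race really happens. Every name in it (struct model, model_cpu_down(), model_replay() and so on) is hypothetical. It also assumes that a hotplug attempt made while a reader holds the lock is simply refused, whereas the real get_online_cpus() makes the hotplug writer wait.

```c
/* User-space model of the suspected race. NOT kernel code; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct model {
	unsigned int online_mask;  /* stands in for cpu_online_mask */
	unsigned int policy_cpus;  /* stands in for policy->cpus */
	int hotplug_readers;       /* stands in for the get_online_cpus() refcount */
	int warnings;              /* number of IPIs sent to an offline CPU */
};

static void model_init(struct model *m)
{
	m->online_mask = (1u << NR_CPUS) - 1;  /* all CPUs online */
	m->policy_cpus = m->online_mask;       /* one policy covering all CPUs */
	m->hotplug_readers = 0;
	m->warnings = 0;
}

static void model_get_online_cpus(struct model *m)
{
	m->hotplug_readers++;
}

static void model_put_online_cpus(struct model *m)
{
	assert(m->hotplug_readers > 0);
	m->hotplug_readers--;
}

/* Hotplug path. Assumption: refused (returns false) while a reader holds the lock. */
static bool model_cpu_down(struct model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (m->hotplug_readers > 0)
		return false;
	m->policy_cpus &= ~(1u << cpu);  /* DOWN notifier changes policy->cpus */
	m->online_mask &= ~(1u << cpu);  /* CPU goes offline */
	return true;
}

/* Models the check that fires the WARN: the target CPU is offline. */
static void model_send_ipi(struct model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!(m->online_mask & (1u << cpu)))
		m->warnings++;
}

/* Replays the interleaving from the thread; returns the number of warnings. */
static int model_replay(bool hold_hotplug_lock)
{
	struct model m;
	int x = 1;

	model_init(&m);
	if (hold_hotplug_lock)
		model_get_online_cpus(&m);

	assert(m.policy_cpus & (1u << x));  /* "get x from policy->cpus" */
	model_cpu_down(&m, x);              /* hotplug races with the reader */
	model_send_ipi(&m, x);              /* "send ipi to x" */

	if (hold_hotplug_lock)
		model_put_online_cpus(&m);
	return m.warnings;
}
```

In this model, model_replay(false) sends the IPI to a CPU that has already gone offline, while model_replay(true) keeps x online until the IPI has been sent. That is the effect the get_online_cpus()/put_online_cpus() pair in the patch is hoping for.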