diff mbox

"cpufreq: fix serialization issues with freq change notifiers" breaks cpufreq too

Message ID Pine.LNX.4.64.1309101740120.16010@axis700.grange
State New, archived
Headers show

Commit Message

Guennadi Liakhovetski Sept. 10, 2013, 4:22 p.m. UTC
On Tue, 10 Sep 2013, Viresh Kumar wrote:

> On 10 September 2013 20:42, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> > 4. reverted that commit, resolving a trivial conflict. Added a debug
> > output in __cpufreq_driver_target() of
> >
> >
> >         if (cpufreq_disabled())
> >                 return -ENODEV;
> > +       pr_info("%s() %d\n", __func__, policy->transition_ongoing);
> >         if (policy->transition_ongoing)
> >                 return -EBUSY;
> 
> Are you sure this diff is on linux-next and not after the revert that
> you mentioned later in the mail? There is some locking introduced
> by my patch which I can't see in you diff..

Of course, isn't that what I've written above? reverted a commit and added 
debug - in that order.

> > Built and booted, got
> >
> > cpufreq: __cpufreq_driver_target(): 1
> >
> > printed out 4 times from the beginning.
> >
> > 5. tried
> >
> > echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> >
> > the above output appeared 2 more times - no frequency change resulted.
> >
> > 6. reverted commit dceff5ce18801dddc220d6238628619c93bc3cb6, built booted
> > - cpufreq works again.
> >
> >> I am afraid you need to give us some more information on how it broke
> >> your stuff.. :)
> >
> > Hope the above is enough.
> 
> A bit more would be helpful.. Can you add these debug prints to all the places
> where transition_ongoing is getting modified? with %s, __func__ to distinguish
> them better? That will make it a bit easy for me...

Sure, I can... So, with the performance governor I get

[    1.290000] cpufreq-cpu0 cpufreq-cpu0: Looking up cpu0-supply from device tree
[    1.290000] cpufreq: trying to register driver generic_cpu0
[    1.290000] cpufreq: adding CPU 0
[    1.290000] cpufreq: Adding link for CPU: 1
[    1.290000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
[    1.290000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
[    1.290000] cpufreq: governor switch
[    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 4
[    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 1
[    1.290000] cpufreq_performance: setting to 1196000 kHz because of event 1
[    1.290000] cpufreq: __cpufreq_driver_target().1665 1

This is my debug - .transition_ongoing is incremented ^^^^^^^^

[    1.300000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
[    1.300000] cpufreq: governor: change or update limits
[    1.300000] cpufreq: __cpufreq_governor for CPU 0, event 3
[    1.300000] cpufreq_performance: setting to 1196000 kHz because of event 3
[    1.300000] cpufreq: initialization complete
[    1.300000] cpufreq: adding CPU 1
[    1.300000] cpufreq: driver generic_cpu0 up and running

That's it. It never gets decremented again.

> Also, what's the configuration of your system? How many CPUs?

2 CPU cores.

> And are all of them sharing clock? (I believe yes, as you are using cpufreq-cpu0)..

Correct. Debug diff is below.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

Comments

Viresh Kumar Sept. 10, 2013, 4:34 p.m. UTC | #1
On 10 September 2013 21:52, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> Of course, isn't that what I've written above? reverted a commit and added
> debug - in that order.

Ok, I misread it then :(

> Sure, I can... So, with the performance governor I get
>
> [    1.290000] cpufreq-cpu0 cpufreq-cpu0: Looking up cpu0-supply from device tree
> [    1.290000] cpufreq: trying to register driver generic_cpu0
> [    1.290000] cpufreq: adding CPU 0
> [    1.290000] cpufreq: Adding link for CPU: 1
> [    1.290000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
> [    1.290000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
> [    1.290000] cpufreq: governor switch
> [    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 4
> [    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 1
> [    1.290000] cpufreq_performance: setting to 1196000 kHz because of event 1
> [    1.290000] cpufreq: __cpufreq_driver_target().1665 1
>
> This is my debug - .transition_ongoing is incremented ^^^^^^^^
>
> [    1.300000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz

Quite straight forward actually.. Please try attached patch and see if it fixes
your problem.. Which it should if I am not wrong.. I will send it
separately then..

Thanks for your time..
Guennadi Liakhovetski Sept. 10, 2013, 5:07 p.m. UTC | #2
On Tue, 10 Sep 2013, Viresh Kumar wrote:

> On 10 September 2013 21:52, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> > Of course, isn't that what I've written above? reverted a commit and added
> > debug - in that order.
> 
> Ok, I misread it then :(
> 
> > Sure, I can... So, with the performance governor I get
> >
> > [    1.290000] cpufreq-cpu0 cpufreq-cpu0: Looking up cpu0-supply from device tree
> > [    1.290000] cpufreq: trying to register driver generic_cpu0
> > [    1.290000] cpufreq: adding CPU 0
> > [    1.290000] cpufreq: Adding link for CPU: 1
> > [    1.290000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
> > [    1.290000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
> > [    1.290000] cpufreq: governor switch
> > [    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 4
> > [    1.290000] cpufreq: __cpufreq_governor for CPU 0, event 1
> > [    1.290000] cpufreq_performance: setting to 1196000 kHz because of event 1
> > [    1.290000] cpufreq: __cpufreq_driver_target().1665 1
> >
> > This is my debug - .transition_ongoing is incremented ^^^^^^^^
> >
> > [    1.300000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
> 
> Quite straight forward actually..

Apparently, not quite.

> Please try attached patch and see if it fixes
> your problem.. Which it should if I am not wrong.. I will send it
> separately then..

It helps only once. The first switching works, the second one doesn't. 
Below debug

[   12.010000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
[   12.010000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
[   12.010000] cpufreq: governor switch
[   12.010000] cpufreq: __cpufreq_governor for CPU 0, event 2
[   12.010000] cpufreq: __cpufreq_governor for CPU 0, event 5
[   12.010000] ondemand governor failed, too long transition latency of HW, fallback to performance governor
[   12.020000] cpufreq: __cpufreq_governor for CPU 0, event 4
[   12.020000] cpufreq: __cpufreq_governor for CPU 0, event 1
[   12.020000] cpufreq_performance: setting to 1196000 kHz because of event 1
[   12.020000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
[   12.020000] cpufreq: governor: change or update limits
[   12.020000] cpufreq: __cpufreq_governor for CPU 0, event 3
[   12.020000] cpufreq_performance: setting to 1196000 kHz because of event 3
[   12.020000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
[   12.030000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
[   12.030000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
[   12.030000] cpufreq: governor switch
[   12.030000] cpufreq: __cpufreq_governor for CPU 0, event 2
[   12.030000] cpufreq: __cpufreq_governor for CPU 0, event 5
[   12.030000] ondemand governor failed, too long transition latency of HW, fallback to performance governor
[   12.040000] cpufreq: __cpufreq_governor for CPU 0, event 4
[   12.040000] cpufreq: __cpufreq_governor for CPU 0, event 1
[   12.040000] cpufreq_performance: setting to 1196000 kHz because of event 1
[   12.040000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
[   12.040000] cpufreq: governor: change or update limits
[   12.040000] cpufreq: __cpufreq_governor for CPU 0, event 3
[   12.040000] cpufreq_performance: setting to 1196000 kHz because of event 3
[   12.040000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz

echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

[   66.490000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
[   66.490000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
[   66.490000] cpufreq: governor switch
[   66.490000] cpufreq: __cpufreq_governor for CPU 0, event 2
[   66.490000] cpufreq: __cpufreq_governor for CPU 0, event 5
[   66.490000] cpufreq: __cpufreq_governor for CPU 0, event 4
[   66.490000] cpufreq: __cpufreq_governor for CPU 0, event 1
[   66.490000] cpufreq: target for CPU 0: 398667 kHz, relation 0, requested 398667 kHz
[   66.490000] cpufreq: __cpufreq_driver_target().1677 1
[   66.500000] cpufreq: notification 0 of frequency transition to 398666 kHz
[   66.500000] cpufreq: __cpufreq_notify_transition().297 2
[   66.500000] cpufreq: notification 0 of frequency transition to 398666 kHz
[   66.500000] cpufreq: __cpufreq_notify_transition().297 3
[   66.510000] cpufreq: notification 1 of frequency transition to 398666 kHz
[   66.510000] cpufreq: __cpufreq_notify_transition().327 2
[   66.520000] cpufreq: FREQ: 398666 - CPU: 0
[   66.520000] cpufreq: notification 1 of frequency transition to 398666 kHz
[   66.520000] cpufreq: __cpufreq_notify_transition().327 1
[   66.520000] cpufreq: FREQ: 398666 - CPU: 1
[   66.520000] cpufreq: cpufreq_notify_transition().366 0
[   66.530000] cpufreq: governor: change or update limits
[   66.530000] cpufreq: __cpufreq_governor for CPU 0, event 3
[   66.530000] cpufreq: target for CPU 0: 398667 kHz, relation 0, requested 398667 kHz
[   66.530000] cpufreq: __cpufreq_driver_target().1677 1

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

[   72.470000] cpufreq: setting new policy for CPU 0: 398667 - 1196000 kHz
[   72.470000] cpufreq: new min and max freqs are 398667 - 1196000 kHz
[   72.470000] cpufreq: governor switch
[   72.470000] cpufreq: __cpufreq_governor for CPU 0, event 2
[   72.470000] cpufreq: __cpufreq_governor for CPU 0, event 5
[   72.470000] cpufreq: __cpufreq_governor for CPU 0, event 4
[   72.470000] cpufreq: __cpufreq_governor for CPU 0, event 1
[   72.470000] cpufreq_performance: setting to 1196000 kHz because of event 1
[   72.470000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz
[   72.470000] cpufreq: governor: change or update limits
[   72.470000] cpufreq: __cpufreq_governor for CPU 0, event 3
[   72.470000] cpufreq_performance: setting to 1196000 kHz because of event 3
[   72.470000] cpufreq: target for CPU 0: 1196000 kHz, relation 1, requested 1196000 kHz

> Thanks for your time..

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
Viresh Kumar Sept. 11, 2013, 8:06 a.m. UTC | #3
On 10 September 2013 22:37, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> On Tue, 10 Sep 2013, Viresh Kumar wrote:
>> Quite straight forward actually..
>
> Apparently, not quite.

I overlooked the situation where we return early from ->target() routines.. :(

Please try attached patches, I will repost them later (once I am able to
convince Rafael that these are really important :) )

--
viresh
Guennadi Liakhovetski Sept. 11, 2013, 8:15 a.m. UTC | #4
On Wed, 11 Sep 2013, Viresh Kumar wrote:

> On 10 September 2013 22:37, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> > On Tue, 10 Sep 2013, Viresh Kumar wrote:
> >> Quite straight forward actually..
> >
> > Apparently, not quite.
> 
> I overlooked the situation where we return early from ->target() routines.. :(
> 
> Please try attached patches, I will repost them later (once I am able to
> convince Rafael that these are really important :) )

I'd rather wait until Rafael is convinced, then we'll see.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
Viresh Kumar Sept. 11, 2013, 8:39 a.m. UTC | #5
On 11 September 2013 13:45, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> I'd rather wait until Rafael is convinced, then we'll see.

Okay.. I have just sent a mail to Rafael about that, see if you
are convinced with what I wrote :)

--
viresh
Rafael Wysocki Sept. 11, 2013, 1:28 p.m. UTC | #6
On Wednesday, September 11, 2013 10:15:53 AM Guennadi Liakhovetski wrote:
> On Wed, 11 Sep 2013, Viresh Kumar wrote:
> 
> > On 10 September 2013 22:37, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> > > On Tue, 10 Sep 2013, Viresh Kumar wrote:
> > >> Quite straight forward actually..
> > >
> > > Apparently, not quite.
> > 
> > I overlooked the situation where we return early from ->target() routines.. :(
> > 
> > Please try attached patches, I will repost them later (once I am able to
> > convince Rafael that these are really important :) )
> 
> I'd rather wait until Rafael is convinced, then we'll see.

I'm going to revert the commit that was the source of these issues, but that
doesn't mean that everything is now rosy.  We need to fix the concurrency
problems that Viresh was trying to fix in that commit, so it would be nice if
you could verify whether or not he is on the right track.

Thanks,
Rafael
Guennadi Liakhovetski Sept. 12, 2013, 7:47 a.m. UTC | #7
Hi Viresh

On Wed, 11 Sep 2013, Viresh Kumar wrote:

> On 10 September 2013 22:37, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> > On Tue, 10 Sep 2013, Viresh Kumar wrote:
> >> Quite straight forward actually..
> >
> > Apparently, not quite.
> 
> I overlooked the situation where we return early from ->target() routines.. :(
> 
> Please try attached patches, I will repost them later (once I am able to
> convince Rafael that these are really important :) )

Yes, they seem to fix my issues. You probably aren't going to submit them 
in this form, instead, merge them with the original serialisation fix 
patch, right? So, you don't need my tested-by here. But feel free to cc me 
when you submit a fixed version.

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/
Viresh Kumar Sept. 12, 2013, 7:51 a.m. UTC | #8
On 12 September 2013 13:17, Guennadi Liakhovetski <g.liakhovetski@gmx.de> wrote:
> Yes, they seem to fix my issues.

Great!!

> You probably aren't going to submit them
> in this form, instead, merge them with the original serialisation fix
> patch, right? So, you don't need my tested-by here. But feel free to cc me
> when you submit a fixed version.

I will add your tested-by as you have eventually test all three patch that I
would merge :)

Thanks for testing this..
diff mbox

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index ecc55d1..374e030 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -15,6 +15,8 @@ 
  * published by the Free Software Foundation.
  */
 
+#define DEBUG
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/cpu.h>
@@ -292,6 +294,7 @@  static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 
 		policy->transition_ongoing++;
 		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		pr_info("%s().%d %d\n", __func__, __LINE__, policy->transition_ongoing);
 
 		/* detect if the driver reported a value as "old frequency"
 		 * which is not equal to what the cpufreq core thinks is
@@ -321,6 +324,7 @@  static void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 
 		policy->transition_ongoing--;
 		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		pr_info("%s().%d %d\n", __func__, __LINE__, policy->transition_ongoing);
 
 		adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
 		pr_debug("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
@@ -359,6 +363,7 @@  void cpufreq_notify_transition(struct cpufreq_policy *policy,
 		write_lock_irqsave(&cpufreq_driver_lock, flags);
 		policy->transition_ongoing--;
 		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		pr_info("%s().%d %d\n", __func__, __LINE__, policy->transition_ongoing);
 	}
 }
 EXPORT_SYMBOL_GPL(cpufreq_notify_transition);
@@ -1356,6 +1361,7 @@  static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 	}
 	policy->transition_ongoing++;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	pr_info("%s().%d %d\n", __func__, __LINE__, policy->transition_ongoing);
 
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
@@ -1656,6 +1662,7 @@  int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	}
 	policy->transition_ongoing++;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	pr_info("%s().%d %d\n", __func__, __LINE__, policy->transition_ongoing);
 
 	/* Make sure that target_freq is within supported range */
 	if (target_freq > policy->max)
diff --git a/drivers/cpufreq/cpufreq_performance.c b/drivers/cpufreq/cpufreq_performance.c
index cf117de..5575b08 100644
--- a/drivers/cpufreq/cpufreq_performance.c
+++ b/drivers/cpufreq/cpufreq_performance.c
@@ -10,6 +10,8 @@ 
  *
  */
 
+#define DEBUG
+
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/cpufreq.h>