diff mbox

[v3,1/3] ARM: add cpufreq transiton notifier to adjust loops_per_jiffy for smp

Message ID 20120907025847.GI26709@S2101-09.ap.freescale.net (mailing list archive)
State New, archived
Headers show

Commit Message

Shawn Guo Sept. 7, 2012, 2:58 a.m. UTC
On Thu, Sep 06, 2012 at 04:35:25PM -0600, Stephen Warren wrote:
> I believe this patch is causing issues initializing PCI-Express on Tegra.
> 
> In next-20120906, I cold-booted 10 times. 3 times, PCIe initialized OK,
> and 7 times, the driver timed out in arch/arm/mach-tegra/pcie.c function
> tegra_pcie_check_link(). With this patch reverted, another 10 cold boot
> attempts all succeeded just fine. Similarly, the regression appeared in
> next-20120905, and I isolated it to arch/arm/kernel/, and this is the
> only patch in that directory between next-20120904 and next-20120905.
> 
> Do you have any idea what the problem might be?
> 
> Looking at the timestamps in dmesg in the failing case, the driver is
> waiting the expected (per pcie.c driver code) 1.2 seconds before giving
> up on the port, although I suppose if the kernel's idea of real-time is
> off, then the dmesg log timestamps might be off too.

Just for identifying the problem, can you test the following change to
see if it fixes the failure.

Regards,
Shawn

Comments

Stephen Warren Sept. 7, 2012, 4:55 p.m. UTC | #1
On 09/06/2012 08:58 PM, Shawn Guo wrote:
> On Thu, Sep 06, 2012 at 04:35:25PM -0600, Stephen Warren wrote:
>> I believe this patch is causing issues initializing PCI-Express on Tegra.
>>
>> In next-20120906, I cold-booted 10 times. 3 times, PCIe initialized OK,
>> and 7 times, the driver timed out in arch/arm/mach-tegra/pcie.c function
>> tegra_pcie_check_link(). With this patch reverted, another 10 cold boot
>> attempts all succeeded just fine. Similarly, the regression appeared in
>> next-20120905, and I isolated it to arch/arm/kernel/, and this is the
>> only patch in that directory between next-20120904 and next-20120905.
>>
>> Do you have any idea what the problem might be?
>>
>> Looking at the timestamps in dmesg in the failing case, the driver is
>> waiting the expected (per pcie.c driver code) 1.2 seconds before giving
>> up on the port, although I suppose if the kernel's idea of real-time is
>> off, then the dmesg log timestamps might be off too.
> 
> Just for identifying the problem, can you test the following change to
> see if it fixes the failure.

Yes, that does solve the problem (well, with s/late_init/late_initcall/).

As you imply, that change wouldn't help if cpu-tegra.c was built as a
module. So, is this something you can work around in your patch?

Thanks for the quick response.
Shawn Guo Sept. 10, 2012, 1:17 a.m. UTC | #2
On Fri, Sep 07, 2012 at 10:55:00AM -0600, Stephen Warren wrote:
> Yes, that does solve the problem (well, with s/late_init/late_initcall/).
> 
> As you imply, that change wouldn't help if cpu-tegra.c was built as a
> module.

I doubt that.  Have you confirmed it with testing?  When you install
module cpu-tegra, the pcie initialization has been done, right?

> So, is this something you can work around in your patch?
> 
Before we head to a solution, we need to identify the cause.

One thing I note is there is no loops_per_jiffy updating in cpu-tegra
driver.  But if I understand correctly, the driver is running on
MP system.  It should update loops_per_jiffy when cpu frequency
changes, right?  I'm worrying about that the timeout in tegra pcie
driver just happens to work with a wrong loops_per_jiffy, but becomes
broken when loops_per_jiffy gets correct.
Stephen Warren Sept. 10, 2012, 5:17 p.m. UTC | #3
On 09/09/2012 07:17 PM, Shawn Guo wrote:
> On Fri, Sep 07, 2012 at 10:55:00AM -0600, Stephen Warren wrote:
>> Yes, that does solve the problem (well, with s/late_init/late_initcall/).
>>
>> As you imply, that change wouldn't help if cpu-tegra.c was built as a
>> module.
> 
> I doubt that.  Have you confirmed it with testing?  When you install
> module cpu-tegra, the pcie initialization has been done, right?

Never mind, the code can't be built as a module anyway.

Aside from that, I misinterpreted your test patch, and thought that it
was moving the Tegra cpufreq driver initialization earlier, but it's
actually moving it later, and in fact by chance after PCIe
initialization, which explains why it solves the issue.

I think the root of the problem is that cpufreq is lowering the CPU
frequency, yet the patch which converted Tegra to the common clock
framework removed the code that actually changes the CPU clock rate. So,
cpufreq thinks the CPU is running at e.g. 216MHz, but it's actually
still running at 1.0GHz, and hence re-calculating the delay loops breaks
things, since delays aren't as long as the system thinks they are. The
variability is due to whether lowering the CPU frequency just happens to
occur before or after the PCIe controller is initialized.

Prashant, are you able to fix the clock driver deficiency within the
next 2-3 days or so (so I can include the fix in the pull requests I
send for 3.7, which need to be sent before the end of the week)? Or, do
we need to disable cpufreq for Tegra because of this?
Prashant Gaikwad Sept. 11, 2012, 11:59 a.m. UTC | #4
On Monday 10 September 2012 10:47 PM, Stephen Warren wrote:
> On 09/09/2012 07:17 PM, Shawn Guo wrote:
>> On Fri, Sep 07, 2012 at 10:55:00AM -0600, Stephen Warren wrote:
>>> Yes, that does solve the problem (well, with s/late_init/late_initcall/).
>>>
>>> As you imply, that change wouldn't help if cpu-tegra.c was built as a
>>> module.
>> I doubt that.  Have you confirmed it with testing?  When you install
>> module cpu-tegra, the pcie initialization has been done, right?
> Never mind, the code can't be built as a module anyway.
>
> Aside from that, I misinterpreted your test patch, and thought that it
> was moving the Tegra cpufreq driver initialization earlier, but it's
> actually moving it later, and in fact by chance after PCIe
> initialization, which explains why it solves the issue.
>
> I think the root of the problem is that cpufreq is lowering the CPU
> frequency, yet the patch which converted Tegra to the common clock
> framework removed the code that actually changes the CPU clock rate. So,
> cpufreq thinks the CPU is running at e.g. 216MHz, but it's actually
> still running at 1.0GHz, and hence re-calculating the delay loops breaks
> things, since delays aren't as long as the system thinks they are. The
> variability is due to whether lowering the CPU frequency just happens to
> occur before or after the PCIe controller is initialized.
>
> Prashant, are you able to fix the clock driver deficiency within the
> next 2-3 days or so (so I can include the fix in the pull requests I
> send for 3.7, which need to be sent before the end of the week)? Or, do
> we need to disable cpufreq for Tegra because of this?

Your fix looks good to me except the concern I have mentioned in reply to that patch.
Stephen Warren Sept. 11, 2012, 3:16 p.m. UTC | #5
On 09/11/2012 05:59 AM, Prashant Gaikwad wrote:
> On Monday 10 September 2012 10:47 PM, Stephen Warren wrote:
>> On 09/09/2012 07:17 PM, Shawn Guo wrote:
>>> On Fri, Sep 07, 2012 at 10:55:00AM -0600, Stephen Warren wrote:
>>>> Yes, that does solve the problem (well, with
>>>> s/late_init/late_initcall/).
>>>>
>>>> As you imply, that change wouldn't help if cpu-tegra.c was built as a
>>>> module.
>>> I doubt that.  Have you confirmed it with testing?  When you install
>>> module cpu-tegra, the pcie initialization has been done, right?
>> Never mind, the code can't be built as a module anyway.
>>
>> Aside from that, I misinterpreted your test patch, and thought that it
>> was moving the Tegra cpufreq driver initialization earlier, but it's
>> actually moving it later, and in fact by chance after PCIe
>> initialization, which explains why it solves the issue.
>>
>> I think the root of the problem is that cpufreq is lowering the CPU
>> frequency, yet the patch which converted Tegra to the common clock
>> framework removed the code that actually changes the CPU clock rate. So,
>> cpufreq thinks the CPU is running at e.g. 216MHz, but it's actually
>> still running at 1.0GHz, and hence re-calculating the delay loops breaks
>> things, since delays aren't as long as the system thinks they are. The
>> variability is due to whether lowering the CPU frequency just happens to
>> occur before or after the PCIe controller is initialized.
>>
>> Prashant, are you able to fix the clock driver deficiency within the
>> next 2-3 days or so (so I can include the fix in the pull requests I
>> send for 3.7, which need to be sent before the end of the week)? Or, do
>> we need to disable cpufreq for Tegra because of this?
> 
> Your fix looks good to me except the concern I have mentioned in reply
> to that patch.

Well, the cpufreq driver will need explicit knowledge of
Tega20-vs-Tegra30 anyway, due to e.g. differing CPU clock names,
probable different latencies, etc.
diff mbox

Patch

diff --git a/arch/arm/mach-tegra/cpu-tegra.c b/arch/arm/mach-tegra/cpu-tegra.c
index ceb52db..a3bbdc9 100644
--- a/arch/arm/mach-tegra/cpu-tegra.c
+++ b/arch/arm/mach-tegra/cpu-tegra.c
@@ -247,5 +247,5 @@  static void __exit tegra_cpufreq_exit(void)
 MODULE_AUTHOR("Colin Cross <ccross@android.com>");
 MODULE_DESCRIPTION("cpufreq driver for Nvidia Tegra2");
 MODULE_LICENSE("GPL");
-module_init(tegra_cpufreq_init);
+late_init(tegra_cpufreq_init);
 module_exit(tegra_cpufreq_exit);