[PATCHv2] omap2+: pm: cpufreq: Fix loops_per_jiffy calculation

Message ID 20110624151201.GO9449@n2100.arm.linux.org.uk (mailing list archive)
State New, archived

Commit Message

Russell King - ARM Linux June 24, 2011, 3:12 p.m. UTC
Right, thanks for the file.  Here's the patch.



Notice how we adjust _both_ the per-cpu and the global loops_per_jiffy,
and that we adjust them with reference to the initial values.

If you adjust lpj with reference to the last value, then you _will_ build
up a progressively bigger and bigger error in the value over time.
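
To illustrate the point about accumulated error, here is a minimal
user-space sketch (not part of the patch). It assumes the scaling helper
is a truncating 64-bit multiply-then-divide in the style of
cpufreq_scale(old, div, mult). The names scale_lpj(), round_trip_chained()
and round_trip_from_ref() are hypothetical and exist only for this
illustration.

```c
#include <stdint.h>

/*
 * Truncating integer rescale: old * mult / div.
 * Assumption: this mirrors the rounding behaviour of the kernel helper.
 */
unsigned long scale_lpj(unsigned long old, unsigned int div, unsigned int mult)
{
	return (unsigned long)(((uint64_t)old * mult) / div);
}

/*
 * Chained scaling: every transition rescales the previous result, so a
 * truncation made on the way up is baked into the value on the way down.
 */
unsigned long round_trip_chained(unsigned long lpj, unsigned int ref_freq,
				 unsigned int other_freq)
{
	lpj = scale_lpj(lpj, ref_freq, other_freq);
	return scale_lpj(lpj, other_freq, ref_freq);
}

/*
 * Reference scaling: every transition rescales the value recorded at the
 * first transition, so no rounding loss carries over between transitions.
 */
unsigned long round_trip_from_ref(unsigned long ref, unsigned int ref_freq,
				  unsigned int other_freq)
{
	/* Value used while running at other_freq; discarded afterwards. */
	(void)scale_lpj(ref, ref_freq, other_freq);
	/* Back at ref_freq: computed from the reference, not from the above. */
	return scale_lpj(ref, ref_freq, ref_freq);
}
```

With lpj = 1000003 at 300000 kHz, one trip to 800000 kHz and back gives
1000002 when chained but 1000003 when computed from the reference. A
chained round trip can never come back above the starting value, so any
loss it introduces is permanent.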

Comments

Sanjeev Premi June 24, 2011, 3:34 p.m. UTC | #1
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk] 
> Sent: Friday, June 24, 2011 8:42 PM
> To: Premi, Sanjeev
> Cc: linux-omap@vger.kernel.org; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCHv2] omap2+: pm: cpufreq: Fix loops_per_jiffy calculation
> 
> Right, thanks for the file.  Here's the patch.
> 
> --- omap2plus-cpufreq.c~	2011-06-24 15:50:32.000000000 +0100
> +++ omap2plus-cpufreq.c	2011-06-24 16:00:08.000000000 +0100
> @@ -44,6 +44,16 @@
>  static char *mpu_clk_name;
>  static struct device *mpu_dev;
>  
> +#ifdef CONFIG_SMP
> +struct lpj_info {
> +	unsigned long	ref;
> +	unsigned int	freq;
> +};
> +
> +static DEFINE_PER_CPU(struct lpj_info, lpj_ref);
> +static struct lpj_info global_lpj_ref;
> +#endif
> +
>  static int omap_verify_speed(struct cpufreq_policy *policy)
>  {
>  	if (!freq_table)
> @@ -109,14 +119,25 @@
>  	freqs.new = omap_getspeed(policy->cpu);
>  
>  #ifdef CONFIG_SMP
> -	/* Adjust jiffies before transition */
> +	/* Adjust per-cpu loops_per_jiffy before transition */
>  	for_each_cpu(i, policy->cpus) {
> -		unsigned long lpj = per_cpu(cpu_data, i).loops_per_jiffy;
> -
> -		per_cpu(cpu_data, i).loops_per_jiffy = cpufreq_scale(lpj,
> -							freqs.old,
> -							freqs.new);
> +		struct lpj_info *lpj = &per_cpu(lpj_ref, i);
> +		if (!lpj->freq) {
> +			lpj->ref = per_cpu(cpu_data, i).loops_per_jiffy;
> +			lpj->freq = freqs.old;
> +		}
> +
> +		per_cpu(cpu_data, i).loops_per_jiffy =
> +			cpufreq_scale(lpj->ref, lpj->freq, freqs.new);
> +	}
> +
> +	/* And don't forget to adjust the global one */
> +	if (!global_lpj_ref.freq) {
> +		global_lpj_ref.ref = loops_per_jiffy;
> +		global_lpj_ref.freq = freqs.old;
>  	}
> +	loops_per_jiffy = cpufreq_scale(global_lpj_ref.ref, global_lpj_ref.freq,
> +					freqs.new);
>  #endif
>  
>  	/* Notify transitions */
> 
> 
> Notice how we adjust _both_ the per-cpu loops_per_jiffy, and that we
> adjust them with reference to the initial values.
> 
> If you adjust lpj with reference to the last, then you _will_ build up
> a progressively bigger and bigger error in the value over time.

Russell,

I definitely didn't see so many things through your comments. But
that may just be a reflection of my naivety with SMP!

I am currently testing another patch for beagle - will apply and
test on OMAP3EVM (just to be sure).

Can I include it in my next patch rev?

~sanjeev
Sanjeev Premi June 24, 2011, 5:50 p.m. UTC | #2

Russell,

I was able to test BogoMIPS calculations via /proc/cpuinfo both
with and without CONFIG_SMP selected.
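
For context, the BogoMIPS figure in /proc/cpuinfo is derived directly
from loops_per_jiffy, which is why it is a convenient check for this
patch. Below is a minimal sketch of that arithmetic, assuming the ARM
formatting of this kernel era: whole part lpj / (500000 / HZ), two-digit
fraction (lpj / (5000 / HZ)) % 100. The helper names and the HZ value
passed in are illustrative assumptions, not kernel API.

```c
/* Whole part of the BogoMIPS value for a given loops_per_jiffy and HZ. */
unsigned long bogomips_whole(unsigned long lpj, unsigned long hz)
{
	return lpj / (500000UL / hz);
}

/* Two-digit fractional part of the BogoMIPS value. */
unsigned long bogomips_frac(unsigned long lpj, unsigned long hz)
{
	return (lpj / (5000UL / hz)) % 100;
}
```

For example, lpj = 2490368 with HZ = 100 yields 498 and 7, which would be
shown as "498.07". A wrongly scaled loops_per_jiffy therefore shows up
directly as a wrong BogoMIPS value after a frequency change.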

For the most part things work fine - but I do notice occasional Oopses
and segmentation faults while doing "cat /proc/cpuinfo".

With CONFIG_SMP enabled, the system doesn't recover from the Oops;
without SMP, I noticed segmentation faults/BUGs but the system
does recover.

They could be unrelated - but I didn't see any of these earlier
today. I will continue debugging on Monday.

Here are details:

[1] This log corresponds to CONFIG_SMP enabled.

[2] This corresponds to running "cat /proc/cpuinfo" in a tight
    loop. (CONFIG_SMP disabled).
    
[3] Saw only once today - but had seen it a few days ago. None
    of my local changes were included. (CONFIG_SMP disabled).
    [http://marc.info/?l=linux-omap&m=130884641524123&w=2]

~sanjeev

=== [1]

[root@OMAP3EVM cpufreq]# cat /proc/cpuinfo
[   73.832366] Internal error: Oops - undefined instruction: 0 [#1] SMP
[   73.839019] Modules linked in:
[   73.842193] CPU: 0    Not tainted  (3.0.0-rc3-14002-g40b6752-dirty #21)
[   73.849121] PC is at __do_fault+0x1c0/0x450
[   73.853485] LR is at __do_fault+0x2b0/0x450
[   73.857879] pc : [<c010fa18>]    lr : [<c010fb08>]    psr: 00000113
[   73.857879] sp : c7907d48  ip : 00000000  fp : c5d518c0
[   73.869873] r10: 00000200  r9 : 40214000  r8 : 00000000
[   73.875335] r7 : c2692f98  r6 : c0ad7600  r5 : 87fb018f  r4 : 00000000
[   73.882141] r3 : 87fb0a3e  r2 : 00000800  r1 : 87fb01cf  r0 : c5d518c0
[   73.888977] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   73.896423] Control: 10c5387d  Table: 8795c019  DAC: 00000015
[   73.902435] Process cat (pid: 449, stack limit = 0xc79062f8)
[   73.908355] Stack: (0xc7907d48 to 0xc7908000)
[   73.912902] 7d40:                   00000008 c795d008 00000000 00000000 c5d4d050 c5d518f4
[   73.921447] 7d60: 00000000 0000000c 40214000 c0ad7600 00000002 0000000c c795d008 00000000
[   73.929992] 7d80: 40214000 c5d518c0 40214000 c2692f98 be91579c c0110760 0000000c 00000000
[   73.938537] 7da0: 00000000 c00bb088 00000002 c5d518c0 00000000 c03111a0 00000000 c795c000
[   73.947113] 7dc0: c795d008 00000201 40214000 c5d518c0 00000000 c2692f98 be91579c c0111184
[   73.955657] 7de0: c795d008 00000000 c2692f98 c7907fb0 00000017 402140ea c5d518c0 402140ea
[   73.964202] 7e00: c5d51914 c031138c c78c6180 00000000 00000002 0000002f 00000017 c78c6180
[   73.972747] 7e20: c08da030 c00b9184 c08da030 c00b9184 c5d518c4 c00b7860 c78c6180 c78c6180
[   73.981292] 7e40: c04787dc 00000000 c04af0e0 0000013e 00000000 00000000 00000002 c00b9184
[   73.989868] 7e60: 00000000 00000000 c78c6180 00000001 c78c6180 00000000 a2217a4d 3caba335
[   73.998413] 7e80: 89831c46 a201fb19 4c802a1d c78c65e0 0000002f 00000000 0895e1ee 8cafea22
[   74.006958] 7ea0: 2f5807fc 70396d1c 752baa3e 00000001 c78c6180 00000007 00000006 000080d0
[   74.015502] 7ec0: c2692b78 00000000 00000000 00000000 00000000 c5d51928 00000002 60000093
[   74.024047] 7ee0: 00000000 c00bb088 00000002 c044d374 0001b900 00000017 c7907fb0 402140ea
[   74.032592] 7f00: 402140ea 4032d068 be91579c c003e480 60000013 00001000 c5d51914 00000003
[   74.041168] 7f20: be915a14 c030e788 00000002 00000000 c021e1b4 00000000 c048cb1c c7906000
[   74.049713] 7f40: 4008e000 00000000 c78c6180 c030ecf4 00000000 00000000 00001000 c00b7ba0
[   74.058258] 7f60: c5d51918 60000013 00000000 c030ecf4 c78c6180 c030f1d8 000004b0 4021051c
[   74.066802] 7f80: 4021c998 c00b7ba0 ffffffff 0001b900 000004b0 c030f1d8 ffffffff 0001b900
[   74.075347] 7fa0: 000004b0 4021051c 4021c998 c030f50c 402140ea 4005c950 be91587c 4005cb08
[   74.083892] 7fc0: 00000015 0001b900 000004b0 4021051c 4021c998 402140ea 4032d068 be91579c
[   74.092468] 7fe0: 00000000 be9156d0 400403f8 4003e6f4 20000010 ffffffff ffffffff ffffffff
[   74.101013] [<c010fa18>] (__do_fault+0x1c0/0x450) from [<c0110760>] (handle_pte_fault+0x74/0x674)
[   74.110290] [<c0110760>] (handle_pte_fault+0x74/0x674) from [<c0111184>] (handle_mm_fault+0x94/0xc4)
[   74.119842] [<c0111184>] (handle_mm_fault+0x94/0xc4) from [<c031138c>] (do_page_fault+0x2b8/0x358)
[   74.129211] [<c031138c>] (do_page_fault+0x2b8/0x358) from [<c003e480>] (do_DataAbort+0x38/0x98)
[   74.138336] [<c003e480>] (do_DataAbort+0x38/0x98) from [<c030f50c>] (ret_from_exception+0x0/0x10)
[   74.147613] Exception stack(0xc7907fb0 to 0xc7907ff8)
[   74.152862] 7fa0:                                     402140ea 4005c950 be91587c 4005cb08
[   74.161437] 7fc0: 00000015 0001b900 000004b0 4021051c 4021c998 402140ea 4032d068 be91579c
[   74.169982] 7fe0: 00000000 be9156d0 400403f8 4003e6f4 20000010 ffffffff
[   74.176879] Code: e1a01005 e3a02000 ebfd1694 e59d0014 (eb07fcba)
[   74.183380] ---[ end trace 9fe8ca36c9812c43 ]---
[   86.665679] BUG: spinlock lockup on CPU#0, cat/449, c5d518f4
[   86.671661] [<c00509c4>] (unwind_backtrace+0x0/0xf8) from [<c0227320>] (do_raw_spin_lock+0xec/0x178)
[   86.681213] [<c0227320>] (do_raw_spin_lock+0xec/0x178) from [<c0112578>] (unmap_vmas+0x188/0x620)
[   86.690521] [<c0112578>] (unmap_vmas+0x188/0x620) from [<c0114fd4>] (exit_mmap+0x10c/0x23c)
[   86.699279] [<c0114fd4>] (exit_mmap+0x10c/0x23c) from [<c007f4bc>] (mmput+0x54/0x118)
[   86.707489] [<c007f4bc>] (mmput+0x54/0x118) from [<c00838a0>] (exit_mm+0x13c/0x194)
[   86.715545] [<c00838a0>] (exit_mm+0x13c/0x194) from [<c0085864>] (do_exit+0x660/0x714)
[   86.723846] [<c0085864>] (do_exit+0x660/0x714) from [<c004dafc>] (die+0x2d8/0x2f4)
[   86.731781] [<c004dafc>] (die+0x2d8/0x2f4) from [<c003e204>] (do_undefinstr+0x14c/0x150)
[   86.740264] [<c003e204>] (do_undefinstr+0x14c/0x150) from [<c030f0e4>] (__und_svc+0x44/0x60)
[   86.749114] Exception stack(0xc7907cc0 to 0xc7907d08)
[   86.754394] 7cc0: c5d518c0 87fb01cf 00000800 87fb0a3e 00000000 87fb018f c0ad7600 c2692f98
[   86.762969] 7ce0: 00000000 40214000 00000200 c5d518c0 00000000 c7907d48 c010fb08 c010fa18
[   86.771514] 7d00: 00000113 ffffffff
[   86.775177] [<c030f0e4>] (__und_svc+0x44/0x60) from [<c010fa18>] (__do_fault+0x1c0/0x450)
[   86.783752] [<c010fa18>] (__do_fault+0x1c0/0x450) from [<c0110760>] (handle_pte_fault+0x74/0x674)
[   86.793060] [<c0110760>] (handle_pte_fault+0x74/0x674) from [<c0111184>] (handle_mm_fault+0x94/0xc4)
[   86.802612] [<c0111184>] (handle_mm_fault+0x94/0xc4) from [<c031138c>] (do_page_fault+0x2b8/0x358)
[   86.812011] [<c031138c>] (do_page_fault+0x2b8/0x358) from [<c003e480>] (do_DataAbort+0x38/0x98)
[   86.821136] [<c003e480>] (do_DataAbort+0x38/0x98) from [<c030f50c>] (ret_from_exception+0x0/0x10)
[   86.830444] Exception stack(0xc7907fb0 to 0xc7907ff8)
[   86.835723] 7fa0:                                     402140ea 4005c950 be91587c 4005cb08
[   86.844299] 7fc0: 00000015 0001b900 000004b0 4021051c 4021c998 402140ea 4032d068 be91579c
[   86.852874] 7fe0: 00000000 be9156d0 400403f8 4003e6f4 20000010 ffffffff


=== [2]

[root@OMAP3EVM cpufreq]# cat /proc/cpuinfo
[   99.083923] Unable to handle kernel paging request at virtual address 85d44831
[   99.091583] pgd = c5d30000
[   99.094390] [85d44831] *pgd=00000000
[   99.098175] Internal error: Oops: 5 [#1]
[   99.102264] Modules linked in:
[   99.105468] CPU: 0    Not tainted  (3.0.0-rc3-14002-g40b6752-dirty #22)
[   99.112365] PC is at find_get_pages+0xd8/0x170
[   99.117004] LR is at 0x2
[   99.119659] pc : [<c00d3cc4>]    lr : [<00000002>]    psr: 60000113
[   99.119659] sp : c7945d08  ip : 00000001  fp : c7945d58
[   99.131652] r10: c759b89c  r9 : c5d3e860  r8 : c7945d58
[   99.137115] r7 : 00000013  r6 : 00000000  r5 : c5d37900  r4 : c5d31000
[   99.143951] r3 : c759b758  r2 : 85d44831  r1 : 00000001  r0 : c759b89c
[   99.150756] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   99.158203] Control: 10c5387d  Table: 85d30019  DAC: 00000015
[   99.164215] Process sh (pid: 446, stack limit = 0xc79442f0)
[   99.170043] Stack: (0xc7945d08 to 0xc7946000)
[   99.174591] 5d00:                   c78cc280 c78cc6c0 00000001 00000001 c759b758 00000001
[   99.183135] 5d20: c5d472b4 c5d31000 00000000 00000000 c5d3e860 c7945d58 400d3000 c5d3e860
[   99.191680] 5d40: c5d47280 c00edf20 00000129 00000000 00000000 c5d472e8 00000000 00000013
[   99.200256] 5d60: 400d3000 00000000 00000002 00000013 c5d31000 00000000 400d3000 c5d47280
[   99.208801] 5d80: 400d3000 c5d3e860 00000000 c00eedf8 00000013 00000000 00000000 c00a1114
[   99.217346] 5da0: 00000002 c5d47280 00000000 c02e4760 00000000 c5d30000 c5d31000 00000200
[   99.225891] 5dc0: 400d3000 c5d47280 00000000 c5d3e860 00000000 c00ef89c c5d31000 00000000
[   99.234436] 5de0: c5d3e860 c7945fb0 00000007 400d3c6c c5d47280 400584b0 c5d472d4 c02e4948
[   99.242980] 5e00: 00000000 c02e47d0 c0413fc0 c043f468 80000007 c78cc280 0000081f c78cc280
[   99.251525] 5e20: 00000002 60000093 00000000 c00a1114 00000000 c042c204 00000000 00000001
[   99.260070] 5e40: c045d510 00000044 00000006 c5e3c000 00000000 00000000 c78cc280 00000001
[   99.268646] 5e60: c78cc280 c02e258c 00000000 c781e8c0 c5e3c000 c009dd48 c5e3c0ac 60000013
[   99.277191] 5e80: 00000000 c78cc6c0 00000044 00000000 beacd7ec c0229940 00000001 00000000
[   99.285736] 5ea0: c0229894 0000000b 20000113 00000000 00000002 00000000 c78cc280 00000000
[   99.294281] 5ec0: 00000000 00000000 00000000 c5e32694 00000002 60000093 00000000 c00a1114
[   99.302825] 5ee0: 00000002 c03fff0c 000d72cc 00000007 c03fff7c c7945fb0 400584b0 400d3c6c
[   99.311370] 5f00: 00000000 c0034208 000d7008 c00342a4 c78cc280 c02e2554 c7945f84 c78cc280
[   99.319915] 5f20: c5e321ac c009dd48 c5e32684 00000003 c7945f84 c02e2554 0000000a c007b0a8
[   99.328491] 5f40: c7944000 00000000 000d7008 00000000 c7945f84 00000003 00000000 c7945f84
[   99.337036] 5f60: 000000ae c003f8a8 00000000 c007b26c c78cc280 c02e2c7c 00000000 000d727c
[   99.345581] 5f80: 400584b0 c009dd48 ffffffff 000d72cc 00000000 ffffffff 000d72cc 00000000
[   99.354125] 5fa0: 000d727c 400584b0 000d727c c02e2c8c 000d72cc 000d726c 000d727c 000d6dfc
[   99.362670] 5fc0: 000d726c 000d72cc 00000000 000d727c 400584b0 400584b0 000d727c 00000000
[   99.371215] 5fe0: 000d450c beacd804 000d4104 400d3c6c 60000010 ffffffff 00000000 00000000
[   99.379791] [<c00d3cc4>] (find_get_pages+0xd8/0x170) from [<c5d3e860>] (0xc5d3e860)
[   99.387786] Code: e1a0c001 eafffff2 e5942000 e5903000 (e5923000)
[   99.394195] ---[ end trace a04771993a740cbe ]---
Segmentation fault
[root@OMAP3EVM cpufreq]#

=== [3]

[root@OMAP3EVM cpufreq]# cat /proc/cpuinfo
[   90.313537] BUG: looking up invalid subclass: 3348054496
[   90.319122] turning off the locking correctness validator.
[   90.324859] [<c00509c4>] (unwind_backtrace+0x0/0xf8) from [<c00ba6b4>] (__lock_acquire+0x1984/0x1d90)
[   90.334503] [<c00ba6b4>] (__lock_acquire+0x1984/0x1d90) from [<c00bb088>] (lock_acquire+0xd4/0xf8)
[   90.343902] [<c00bb088>] (lock_acquire+0xd4/0xf8) from [<c030e780>] (_raw_spin_lock_irqsave+0x44/0x58)
[   90.353637] [<c030e780>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c021e44c>] (__down_read_trylock+0x14/0x54)
[   90.364013] [<c021e44c>] (__down_read_trylock+0x14/0x54) from [<c00a69a4>] (down_read_trylock+0x14/0x54)
[   90.373931] [<c00a69a4>] (down_read_trylock+0x14/0x54) from [<c03111a0>] (do_page_fault+0xcc/0x358)
[   90.383392] [<c03111a0>] (do_page_fault+0xcc/0x358) from [<c003e480>] (do_DataAbort+0x38/0x98)
[   90.392395] [<c003e480>] (do_DataAbort+0x38/0x98) from [<c030f50c>] (ret_from_exception+0x0/0x10)
[   90.401672] Exception stack(0xc796ffb0 to 0xc796fff8)
Russell King - ARM Linux June 24, 2011, 6:51 p.m. UTC | #3
On Fri, Jun 24, 2011 at 11:20:44PM +0530, Premi, Sanjeev wrote:
> I was able to test BogoMIPS calculations via /proc/cpuinfo both
> with and without CONFIG_SMP selected.
> 
> For the most part things work fine - but I do notice occasional Oopses
> and segmentation faults while doing "cat /proc/cpuinfo".
> 
> With CONFIG_SMP enabled, the system doesn't recover from the Oops;
> without SMP, I noticed segmentation faults/BUGs but the system
> does recover.
> 
> They could be unrelated - but I didn't see any of these earlier
> today. I will continue debugging on Monday.

I don't think these are related to the patch - I think there's something
up with your hardware.

Let's take the first.

> [root@OMAP3EVM cpufreq]# cat /proc/cpuinfo
> [   73.832366] Internal error: Oops - undefined instruction: 0 [#1] SMP

OK, an undefined instruction.  So...

> [   73.839019] Modules linked in:
> [   73.842193] CPU: 0    Not tainted  (3.0.0-rc3-14002-g40b6752-dirty #21)
> [   73.849121] PC is at __do_fault+0x1c0/0x450
> [   73.853485] LR is at __do_fault+0x2b0/0x450
> [   73.857879] pc : [<c010fa18>]    lr : [<c010fb08>]    psr: 00000113
> [   73.857879] sp : c7907d48  ip : 00000000  fp : c5d518c0
> [   73.869873] r10: 00000200  r9 : 40214000  r8 : 00000000
> [   73.875335] r7 : c2692f98  r6 : c0ad7600  r5 : 87fb018f  r4 : 00000000
> [   73.882141] r3 : 87fb0a3e  r2 : 00000800  r1 : 87fb01cf  r0 : c5d518c0
> [   73.888977] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> [   73.896423] Control: 10c5387d  Table: 8795c019  DAC: 00000015
> [   73.902435] Process cat (pid: 449, stack limit = 0xc79062f8)

... let's look at the code line:

> [   74.176879] Code: e1a01005 e3a02000 ebfd1694 e59d0014 (eb07fcba)

and disassemble it:

   0:   e1a01005        mov     r1, r5
   4:   e3a02000        mov     r2, #0  ; 0x0
   8:   ebfd1694        bl      0xfff45a60
   c:   e59d0014        ldr     r0, [sp, #20]
  10:   eb07fcba        bl      0x1ff300

There is no way that 0xeb07fcba should ever cause an undefined ARM
instruction on a properly functioning system.

It points at a hardware problem - are you using a socketed SoC?  Is
it properly socketed?  Is the socket dirty?  And all other questions
related to hardware integrity...
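
The disassembly above can be cross-checked by hand. The sketch below
decodes the ARM (A32) B/BL encoding: condition in bits 31:28, 0b101 in
bits 27:25, link flag in bit 24, and a signed 24-bit word offset taken
relative to the instruction address plus 8. The function names are
hypothetical and chosen for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if insn is a conditional-space B or BL (bits 27:25 == 0b101). */
bool arm_is_branch(uint32_t insn)
{
	return ((insn >> 25) & 0x7) == 0x5 && (insn >> 28) != 0xf;
}

/* True if the branch also writes the link register (BL rather than B). */
bool arm_branch_links(uint32_t insn)
{
	return (insn >> 24) & 0x1;
}

/* Branch target for an instruction located at addr. */
uint32_t arm_branch_target(uint32_t insn, uint32_t addr)
{
	uint32_t off = (insn & 0x00ffffffu) << 2;	/* imm24 * 4 */

	if (off & 0x02000000u)				/* sign-extend 26 bits */
		off |= 0xfc000000u;

	return addr + 8 + off;				/* PC reads as addr + 8 */
}
```

Feeding in 0xeb07fcba at offset 0x10 gives a BL to 0x1ff300, and
0xebfd1694 at offset 0x8 gives 0xfff45a60, matching the listing. The
faulting word is therefore a well-formed, always-executed BL.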
Kevin Hilman June 24, 2011, 8:14 p.m. UTC | #4
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Jun 24, 2011 at 11:20:44PM +0530, Premi, Sanjeev wrote:
>> I was able to test BogoMIPS calculations via /proc/cpuinfo both
>> with and without CONFIG_SMP selected.
>> 
>> For the most part things work fine - but I do notice occasional Oopses
>> and segmentation faults while doing "cat /proc/cpuinfo".
>> 
>> With CONFIG_SMP enabled, the system doesn't recover from the Oops;
>> without SMP, I noticed segmentation faults/BUGs but the system
>> does recover.
>> 
>> They could be unrelated - but I didn't see any of these earlier
>> today. I will continue debugging on Monday.
>
> I don't think these are related to the patch - I think there's something
> up with your hardware.
>
> Let's take the first.
>
>> [root@OMAP3EVM cpufreq]# cat /proc/cpuinfo
>> [   73.832366] Internal error: Oops - undefined instruction: 0 [#1] SMP
>
> OK, an undefined instruction.  So...
>
>> [   73.839019] Modules linked in:
>> [   73.842193] CPU: 0    Not tainted  (3.0.0-rc3-14002-g40b6752-dirty #21)
>> [   73.849121] PC is at __do_fault+0x1c0/0x450
>> [   73.853485] LR is at __do_fault+0x2b0/0x450
>> [   73.857879] pc : [<c010fa18>]    lr : [<c010fb08>]    psr: 00000113
>> [   73.857879] sp : c7907d48  ip : 00000000  fp : c5d518c0
>> [   73.869873] r10: 00000200  r9 : 40214000  r8 : 00000000
>> [   73.875335] r7 : c2692f98  r6 : c0ad7600  r5 : 87fb018f  r4 : 00000000
>> [   73.882141] r3 : 87fb0a3e  r2 : 00000800  r1 : 87fb01cf  r0 : c5d518c0
>> [   73.888977] Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
>> [   73.896423] Control: 10c5387d  Table: 8795c019  DAC: 00000015
>> [   73.902435] Process cat (pid: 449, stack limit = 0xc79062f8)
>
> ... let's look at the code line:
>
>> [   74.176879] Code: e1a01005 e3a02000 ebfd1694 e59d0014 (eb07fcba)
>
> and disassemble it:
>
>    0:   e1a01005        mov     r1, r5
>    4:   e3a02000        mov     r2, #0  ; 0x0
>    8:   ebfd1694        bl      0xfff45a60
>    c:   e59d0014        ldr     r0, [sp, #20]
>   10:   eb07fcba        bl      0x1ff300
>
> There is no way that 0xeb07fcba should ever cause an undefined ARM
> instruction on a properly functioning system.
>
> It points at a hardware problem - are you using a socketed SoC?  Is
> it properly socketed?  Is the socket dirty?  And all other questions
> related to hardware integrity...

And in particular, since we're talking CPUfreq, are you running at a
frequency that the SoC and especially the memory support?

Kevin
Sanjeev Premi June 25, 2011, 4:20 p.m. UTC | #5
> And in particular, since we're talking CPUfreq, are you running at a
> frequency that the SoC and especially the memory support?

Yes. The frequencies are in the 300 - 800MHz range. The same board is also
quite stable at 1GHz (tested ARM only) - with sources hosted at:
http://arago-project.org/git/projects/?p=linux-omap3.git;a=summary

For testing, I was changing frequencies in a tight 'forever' loop. But, as
I mentioned earlier, the issues could be unrelated, and the loop could be
exposing something else.

~sanjeev


Patch

--- omap2plus-cpufreq.c~	2011-06-24 15:50:32.000000000 +0100
+++ omap2plus-cpufreq.c	2011-06-24 16:00:08.000000000 +0100
@@ -44,6 +44,16 @@ 
 static char *mpu_clk_name;
 static struct device *mpu_dev;
 
+#ifdef CONFIG_SMP
+struct lpj_info {
+	unsigned long	ref;
+	unsigned int	freq;
+};
+
+static DEFINE_PER_CPU(struct lpj_info, lpj_ref);
+static struct lpj_info global_lpj_ref;
+#endif
+
 static int omap_verify_speed(struct cpufreq_policy *policy)
 {
 	if (!freq_table)
@@ -109,14 +119,25 @@ 
 	freqs.new = omap_getspeed(policy->cpu);
 
 #ifdef CONFIG_SMP
-	/* Adjust jiffies before transition */
+	/* Adjust per-cpu loops_per_jiffy before transition */
 	for_each_cpu(i, policy->cpus) {
-		unsigned long lpj = per_cpu(cpu_data, i).loops_per_jiffy;
-
-		per_cpu(cpu_data, i).loops_per_jiffy = cpufreq_scale(lpj,
-							freqs.old,
-							freqs.new);
+		struct lpj_info *lpj = &per_cpu(lpj_ref, i);
+		if (!lpj->freq) {
+			lpj->ref = per_cpu(cpu_data, i).loops_per_jiffy;
+			lpj->freq = freqs.old;
+		}
+
+		per_cpu(cpu_data, i).loops_per_jiffy =
+			cpufreq_scale(lpj->ref, lpj->freq, freqs.new);
+	}
+
+	/* And don't forget to adjust the global one */
+	if (!global_lpj_ref.freq) {
+		global_lpj_ref.ref = loops_per_jiffy;
+		global_lpj_ref.freq = freqs.old;
 	}
+	loops_per_jiffy = cpufreq_scale(global_lpj_ref.ref, global_lpj_ref.freq,
+					freqs.new);
 #endif
 
 	/* Notify transitions */