OMAP baseline test results for v3.16-rc4

Message ID 20140730053940.GX29045@atomide.com (mailing list archive)
State New, archived

Commit Message

Tony Lindgren July 30, 2014, 5:39 a.m. UTC
* Paul Walmsley <paul@pwsan.com> [140729 12:39]:
> On Tue, 29 Jul 2014, Tony Lindgren wrote:
> 
> > Hmm maybe different u-boot version then? I'm using
> > 2014.04-00001-g5f09f5b.
> >  
> > > Are you using NFS root on 37xxevm or MMC root?
> > 
> > Using nfsroot and omap2plus_defconfig. My dmesg attached
> > in case it provides some clues. I don't have console=ttyO
> > here, but I've verified that it works with that too.
> 
> Walked through the PM test script by hand, and the proximal cause of the 
> problem became obvious...
> 
> Turns out a five-second delay for a three-second autosuspend_delay_ms is 
> no longer sufficient time for kernels to enter idle.  A ten-second sleep 
> seems to be long enough.

Oh OK.
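
To illustrate the timing point quoted above: one option is to derive the
test's sleep from autosuspend_delay_ms instead of hard-coding it. This is
only a sketch. The helper name idle_wait_seconds and its margin parameter
are made up here and are not taken from the actual PM test script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the real PM test script).
# Prints how many seconds to sleep before checking that the chip idled.
#   $1: autosuspend delay in milliseconds (as written to autosuspend_delay_ms)
#   $2: extra margin in seconds to allow the kernel to actually enter idle
idle_wait_seconds() {
    delay_ms=$1
    margin=$2
    # Round the delay up to whole seconds, then add the margin.
    echo $(( (delay_ms + 999) / 1000 + margin ))
}

# Intended use on the target (not run here; the sysfs path is an assumption
# and depends on the board and UART):
#   echo 3000 > "$UART_PM_DIR/autosuspend_delay_ms"
#   sleep "$(idle_wait_seconds 3000 7)"
```

With a 3000 ms delay and a 7 second margin this gives the ten-second sleep
that was reported to be long enough.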
 
> Not sure what exactly is causing that weirdness yet, or when that started 
> happening.  Am suspecting it could be some of the RCU changes over the 
> past couple of years.  We don't have RCU_FAST_NO_HZ enabled in 
> omap2plus_defconfig; we should probably switch that on.

OK
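
For reference, switching the option on in the defconfig amounts to adding
one line to it. A minimal sketch, assuming it is run from the top of a
kernel tree. The helper name defconfig_enable is hypothetical, and whether
the option takes effect still depends on its Kconfig dependencies.

```shell
#!/bin/sh
# Hypothetical helper: append CONFIG_<option>=y to a defconfig file unless
# that exact line is already present.
#   $1: path to the defconfig file
#   $2: option name without the CONFIG_ prefix
# Limitation: it does not remove an existing "# CONFIG_<option> is not set"
# line; that would have to be cleaned up by hand.
defconfig_enable() {
    file=$1
    opt=CONFIG_$2
    if grep -q "^${opt}=y\$" "$file"; then
        return 0
    fi
    echo "${opt}=y" >> "$file"
}

# Intended use (not run here):
#   defconfig_enable arch/arm/configs/omap2plus_defconfig RCU_FAST_NO_HZ
```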
 
> Now 37xxevm and the 3730beaglexm are entering idle as they should be.  
> Test report below; logs etc. have been uploaded.  Thanks for the debug 
> discussion,

Great, good to hear you found what caused it :)
 
> PM: chip retention via dynamic idle:
>     FAIL ( 5/ 7): 2430sdp, 3530es3beagle, 4430es2panda, 4460pandaes,
> 		  4460varsomom
>     Pass ( 2/ 7): 3730beaglexm, 37xxevm
> 
> PM: chip off except CORE via suspend:
>     Pass ( 1/ 1): 3730beaglexm
> 
> PM: chip off except CORE via dynamic idle:
>     Pass ( 1/ 1): 3730beaglexm
> 
> PM: chip off via suspend:
>     FAIL ( 4/ 5): 3530es3beagle, 4430es2panda, 4460pandaes,
> 		  4460varsomom
>     Pass ( 1/ 5): 37xxevm
> 
> PM: chip off via dynamic idle:
>     FAIL ( 4/ 5): 3530es3beagle, 4430es2panda, 4460pandaes,
> 		  4460varsomom
>     Pass ( 1/ 5): 37xxevm

The following patch should fix the tests above for 3530es3beagle.
Care to test and ack as I don't have one?

Regards,

Tony

8< -----------------------------------
From: Tony Lindgren <tony@atomide.com>
Date: Tue, 29 Jul 2014 22:36:59 -0700
Subject: [PATCH] ARM: dts: Enable UART wake-up events for beagleboard

For device tree based booting, we need to use wake-up
interrupts like we already do for some omaps. This fixes
a PM regression on beagleboard compared to legacy booting.

Signed-off-by: Tony Lindgren <tony@atomide.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paul Walmsley July 30, 2014, 7:53 a.m. UTC | #1
On Tue, 29 Jul 2014, Tony Lindgren wrote:

> The following patch should fix the tests above for 3530es3beagle.
> Care to test and ack as I don't have one?

3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
other boards fail too - haven't tested any others).  

http://www.pwsan.com/omap/testlogs/next_20140729/20140730010841/pm/3530es3beagle/3530es3beagle_log.txt

Adding the patch you sent doesn't change that, but now some extra warning 
messages appear ("PRM: I/O chain clock line assertion timed out"):

http://www.pwsan.com/omap/testlogs/next_20140729_beagle_pm/20140730004856/pm/3530es3beagle/3530es3beagle_log.txt


- Paul
Tony Lindgren July 31, 2014, 1:11 p.m. UTC | #2
* Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> On Tue, 29 Jul 2014, Tony Lindgren wrote:
> 
> > The following patch should fix the tests above for 3530es3beagle.
> > Care to test and ack as I don't have one?
> 
> 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> other boards fail too - haven't tested any others).  

I just checked that today's linux next works for off-idle and
wake-up events for at least 37xx evm.
 
> http://www.pwsan.com/omap/testlogs/next_20140729/20140730010841/pm/3530es3beagle/3530es3beagle_log.txt
> 
> Adding the patch you sent doesn't change that, but now some extra warning 
> messages appear ("PRM: I/O chain clock line assertion timed out"):
> 
> http://www.pwsan.com/omap/testlogs/next_20140729_beagle_pm/20140730004856/pm/3530es3beagle/3530es3beagle_log.txt

Weird, it should just work. I don't know why the "PRM: I/O chain
clock line assertion timed out" warning happens for you; I have not
seen that. So far I've tested it on n900, beagle xm and 37xx-evm. None
of those is a 3530, but it should behave the same way as on n900.

Regards,

Tony
Tero Kristo July 31, 2014, 1:12 p.m. UTC | #3
On 07/30/2014 08:39 AM, Tony Lindgren wrote:
> 8< -----------------------------------
> From: Tony Lindgren<tony@atomide.com>
> Date: Tue, 29 Jul 2014 22:36:59 -0700
> Subject: [PATCH] ARM: dts: Enable UART wake-up events for beagleboard
>
> For device tree based booting, we need to use wake-up
> interrupts like we already do for some omaps. This fixes
> a PM regression on beagleboard compared to legacy booting.
>
> Signed-off-by: Tony Lindgren<tony@atomide.com>
>
> --- a/arch/arm/boot/dts/omap3-beagle.dts
> +++ b/arch/arm/boot/dts/omap3-beagle.dts
> @@ -292,6 +292,7 @@
>   &uart3 {
>   	pinctrl-names = "default";
>   	pinctrl-0 = <&uart3_pins>;
> +	interrupts-extended = <&intc 74 &omap3_pmx_core OMAP3_UART3_RX>;
>   };
>
>   &gpio1 {
> --

The above patch works for me with ret/off-idle on beagle rev C4 on top 
of 3.16-rc5. Without it, the board just seems to hang with ret, and with 
off, it just doesn't respond to anything on uart but seems alive otherwise.

Tested-by: Tero Kristo <t-kristo@ti.com>
Paul Walmsley July 31, 2014, 7:27 p.m. UTC | #4
On Thu, 31 Jul 2014, Tony Lindgren wrote:

> * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > 
> > > The following patch should fix the tests above for 3530es3beagle.
> > > Care to test and ack as I don't have one?
> > 
> > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > other boards fail too - haven't tested any others).  
> 
> I just checked that today's linux next works for off-idle and
> wake-up events for at least 37xx evm.

I ran the full set of tests across all boards.  The only board that passed 
the dynamic idle testing on next-20140729 was the 3730beaglexm.

http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt

37xxevm hangs on the first suspend entry:

http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt

If I find some extra time, I'll set up a bisection run.


- Paul
Tony Lindgren Aug. 1, 2014, 7:10 a.m. UTC | #5
* Paul Walmsley <paul@pwsan.com> [140731 12:29]:
> On Thu, 31 Jul 2014, Tony Lindgren wrote:
> 
> > * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > > 
> > > > The following patch should fix the tests above for 3530es3beagle.
> > > > Care to test and ack as I don't have one?
> > > 
> > > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > > other boards fail too - haven't tested any others).  
> > 
> > I just checked that today's linux next works for off-idle and
> > wake-up events for at least 37xx evm.
> 
> I ran the full set of tests across all boards.  The only board that passed 
> the dynamic idle testing on next-20140729 was the 3730beaglexm.
> 
> http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt
> 
> 37xxevm hangs on the first suspend entry:
> 
> http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt
> 
> If I find some extra time, I'll set up a bisection run.

OK, that sounds like a driver suspend regression that needs
to be tracked down. I'm also seeing it on my 37xx evm with
linux next.

Regards,

Tony
Tony Lindgren Aug. 1, 2014, 7:52 a.m. UTC | #6
* Tony Lindgren <tony@atomide.com> [140801 00:14]:
> * Paul Walmsley <paul@pwsan.com> [140731 12:29]:
> > On Thu, 31 Jul 2014, Tony Lindgren wrote:
> > 
> > > * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > > > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > > > 
> > > > > The following patch should fix the tests above for 3530es3beagle.
> > > > > Care to test and ack as I don't have one?
> > > > 
> > > > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > > > other boards fail too - haven't tested any others).  
> > > 
> > > I just checked that today's linux next works for off-idle and
> > > wake-up events for at least 37xx evm.
> > 
> > I ran the full set of tests across all boards.  The only board that passed 
> > the dynamic idle testing on next-20140729 was the 3730beaglexm.
> > 
> > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt
> > 
> > 37xxevm hangs on the first suspend entry:
> > 
> > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt
> > 
> > If I find some extra time, I'll set up a bisection run.
> 
> OK that sounds like some driver suspend regression that needs
> to be tracked down. I'm seeing it on my 37xx evm also with
> linux next too.

With no_console_suspend I'm seeing the following oops in smsc911x.
This does not happen with v3.16-rc, and there have been no changes
to smsc911x.c, so it sounds like a good candidate for an automated
bisect and boot test.
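
A rough sketch of what such an automated run could look like with
"git bisect run". The helper name bisect_verdict is made up for
illustration. The default "good" marker string is an assumption about what
the console prints on a successful resume, and the build/boot/capture
script mentioned in the comments is hypothetical.

```shell
#!/bin/sh
# Hypothetical verdict helper: map a captured serial console log to the
# exit codes that "git bisect run" expects.
#   0   -> good (resume marker seen)
#   1   -> bad  (oops seen, or suspend entered but never resumed)
#   125 -> skip (the suspend test was never reached, e.g. boot failure)
#   $1: path to the captured console log
#   $2: optional string that marks a successful resume (assumed default)
bisect_verdict() {
    log=$1
    good=${2:-"Restarting tasks ... done."}
    if grep -F -q 'Unable to handle kernel NULL pointer dereference' "$log"; then
        return 1
    fi
    if grep -F -q "$good" "$log"; then
        return 0
    fi
    if grep -F -q 'PM: Entering mem sleep' "$log"; then
        # Suspend started but nothing came back: treat a hang as bad.
        return 1
    fi
    return 125
}

# Intended use (not run here; assumes both tags exist in the same tree):
#   git bisect start next-20140731 v3.16-rc7
#   git bisect run ./build-boot-and-check.sh
# where the hypothetical build-boot-and-check.sh builds the kernel, boots
# the board, runs "echo mem > /sys/power/state" with no_console_suspend,
# saves the console output and ends with: bisect_verdict "$logfile"
```

Checking for the oops before the resume marker matters because, as in the
log below, "PM: Entering mem sleep" is printed before the crash.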

Regards,

Tony

[   47.804443] PM: Syncing filesystems ... done.
[   47.807342] PM: Preparing system for mem sleep
[   47.817871] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   47.821258] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done.
[   47.825073] PM: Entering mem sleep
[   47.838897] random: nonblocking pool is initialized
[   47.839630] Unable to handle kernel NULL pointer dereference at virtual address 00000698
[   47.
*pte=00000000, *ppte=00000000
[   47.840423] Internal error: Oops: 17 [#1] SMP ARM
[   47.840545] Modules linked in:
[   47.840728] CPU: 0 PID: 882 Comm: bash Not tainted 3.16.0-rc7-next-20140731-10331-ge9fab93 #303
[   47.840942] task: ce4dd140 ti: ce508000 task.ti: ce508000
[   47.841125] PC is at __lock_acquire+0x128/0xb78
[   47.841247] LR is at lock_acquire+0xa4/0x10c
[   47.841369] pc : [<c0088524>]    lr : [<c00894d8>]    psr: 20010093
[   47.841369] sp : ce509c68  ip : ce508000  fp : ce509cc4
[   47.841583] r10: 00000000  r9 : 00000000  r8 : ce4dd140
[   47.841766] r7 : 00000698  r6 : c097d068  r5 : c1165764  r4 : 00000001
[   47.841888] r3 : c09676f0  r2 : 00000001  r1 : 00000000  r0 : 00000698
[   47.842071] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   47.842254] Control: 10c5387d  Table: 8e534019  DAC: 00000015
[   47.842376] Process bash (pid: 882, stack limit = 0xce508248)
[   47.842559] Stack: (0xce509c68 to 0xce50a000)
[   47.842681] 9c60:                   ce4dd700 00000000 00000000 00000000 00011e89 f8316096
[   47.842895] 9c80: 
0000000
[   47.843017] 9ca0: 00000698 00000000 00000000 00000000 ce508000 c0436c94 ce509d14 ce509cc8
[   47.843231] 9cc0: c00894d8 c0088408 00000001 00000080 00000000 c0436c94 00000000 00000006
[   47.843444] 9ce0: ce4dd140 60010093 ce593cac 00000688 60010013 c0436c94 00000002 00000002
[   47.843658] 9d00: c07a83e8 c0436c6c ce509d44 ce509d18 c05db420 c0089440 00000001 00000000
[   47.843811] 9d20: c0436c94 60010013 c1165764 00000000 00000688 ce593c78 ce509d64 ce509d48
[   47.844024] 9d40: c0436c94 c05db3dc c09b43b0 00000000 00000000 ce593c78 ce509da4 ce509d68
[   47.844238] 9d60: c03bbc88 c0436c78 ce509d8c ce509d78 00000000 00000000 00000000 ce593c78
[   47.844451] 9d80: 00000000 c11aa6d8 000000
da0: c03bce3c c03bbc40 ce593c78 009886cc c11aa6d8 ce593c78 c09886cc c11aa6d8
[   47.844787] 9dc0: 00000000 00000002 ce509e14 ce5
05dbf74 0000000b 00000000 00000002 00000000 00000003 c09b6550 c1164e64
[   47.845153] 9e00: 0000000c c077086c ce509e2c ce509e18 c03bea04 c03be2dc c1164e78 c1164e58
[   47.845367] 9e20: ce509e7c ce509e30 c008f0c8 c03be9a8 c09b6550 c1164e64 0000000c c077086c
[   47.845550] 9e40: ce509e6c ce509e50 c05d158c c0092104 c07707d8 00000003 00000000 c0770848
[   47.845764] 9e60: c09b6550 c1164e64 0000000c c077086c ce509eb4 ce509e80 c008f920 c008f03c
[   47.845916] 9e80: ce4dd140 c077086c ce509eb4 00000003 c076b6d4 c1164e68 00000003 ce377140
[   47.846130] 9ea0: ce6a2e4c 00000004 ce509edc ce509eb8 c008e00c c008f534 ce6a2e40 00000004
[   47.846343] 9ec0: ce377140 ce509f78 00000004 ce6a2e40 ce509eec ce509ee0 c032699c c008dfa0
[   47.846557] 9ee0: ce509f0c ce509ef0 c01b109c c032698c c01b1040 00000000 00000000 ce377140
[   47.846679] 9f00: ce509f44 ce509f10 c01b0328 c01b104c 00000000 00000000 ce509f4c ce4cae00
[   47.846893] 9f20: 00000004 000bcc08 ce509f78 00000004 ce508000 000bcc08 ce509f74 ce509f48
[   47.847106] 9f40: c014406c c01b0250 c01604b4 c0160424 00000000 00000000 ce4cae00 ce4cae00
[   47.847259] 9f60: 00000004 000bcc08 ce509fa4 ce509f78 c01444b4 c0143fc4 00000000 00000000
[   47.847473] 9f80: 00000004 000bcc08 b6e94b50 00000004 c000f1c4 00000000 00000000 ce509fa8
[   47.847686] 9fa0: c000ef40 c0144474 00000004 000bcc08 0000000
c08 b6e94b50 00000004 00000004 00000000 000b5174 00000000
[   47.848052] 9fe0: 00000000 be94b92c b6e01671 b6e3a7c6 40010030 00000001 fffe7fff ffffffff
[   47.848266] [<c0088524>] (__lock_acquire) from [<c00894d8>] (lock_acquire+0xa4/0x10c)
[   47.848510] [<c00894d8>] (lock_acquire) from [<c05db420>] (_raw_spin_lock_irqsave+0x50/0x64)
[   47.848754] [<c05db420>] (_raw_spin_lock_irqsave) from [<c0436c94>] (smsc911x_suspend+0x28/0x5c)
[   47.848907] [<c0436c94>] (smsc911x_suspend) from [<c03bbc88>] (dpm_run_callback+0x54/0x108)
[   47.849121] [<c03bbc88>] (dpm_run_callback) from [<c03bce3c>] (__device_suspend+0x13c/0x388)
[   47.849334] [<c03bce3c>] (__device_suspend) from [<c03be340>] (dpm_suspend+0x70/0x2d0)
[   47.849548] [<c03be340>] (dpm_suspend) from [<c03bea04>] (dpm_suspend_start+0x68/0x70)
[   47.849670] [<c03bea04>] (dpm_suspend_start) from [<c008f0c8>] (suspend_devices_and_enter+0x98/0x4f8)
[   47.849914] [<c008f0c8>] (suspend_devices_and_enter) from [<c008f920>] (pm_suspend+0x3f8/0x470)
[   47.850128] [<c008f920>] (pm_suspend) from [<c008e00c>] (state_store+0x78/0xc8)
[   47.850250] [<c008e00c>] (state_store) from [<c032699c>] (kobj_attr_store+0x1c/0x28)
[   47.850463] [<c032699c>] (kobj_attr_store) from [<c01b109c>] (sysfs_kf_write+0x5c/0x60)
[   47.850677] [<c01b109c>] (sysfs_kf_write) from [<c01b0328>] (kernfs_fop_write+0xe4/0x198)
[   47.850891] [<c01b0328>] (kernfs_fop_wr
014406c>] (vfs_write) from [<c01444b4>] (SyS_write+0x4c/0x98)
[   47.851257] [<c01444b4>] (SyS_write) from [<c000ef40>] (ret_fast_syscall+0x0/0x48)
[   47.851470] Code: 1a00000e e3a00000 e24bd028 e89daff0 (e5971000) 
[   47.851593] ---[ end trace c6b1f330200fefe5 ]---
Paul Walmsley Aug. 7, 2014, 10:21 p.m. UTC | #7
On Fri, 1 Aug 2014, Tony Lindgren wrote:

> * Paul Walmsley <paul@pwsan.com> [140731 12:29]:
> > On Thu, 31 Jul 2014, Tony Lindgren wrote:
> > 
> > > * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > > > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > > > 
> > > > > The following patch should fix the tests above for 3530es3beagle.
> > > > > Care to test and ack as I don't have one?
> > > > 
> > > > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > > > other boards fail too - haven't tested any others).  
> > > 
> > > I just checked that today's linux next works for off-idle and
> > > wake-up events for at least 37xx evm.
> > 
> > I ran the full set of tests across all boards.  The only board that passed 
> > the dynamic idle testing on next-20140729 was the 3730beaglexm.
> > 
> > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt
> > 
> > 37xxevm hangs on the first suspend entry:
> > 
> > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt
> > 
> > If I find some extra time, I'll set up a bisection run.
> 
> OK that sounds like some driver suspend regression that needs
> to be tracked down. I'm seeing it on my 37xx evm also with
> linux next too.

It's commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75:

Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date:   Wed Jul 23 16:47:31 2014 -0300

    net: phy: Set the driver when registering an MDIO bus device
    
    mdiobus_register() registers a device which is already bound to a driver.
    Hence, the driver pointer should be set properly in order to track down
    the driver associated to the MDIO bus.
    
    This will be used to allow ethernet driver to pin down a MDIO bus driver,
    preventing it from being unloaded while the PHY device is running.
    
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


What's bad is that this went in late during v3.16-rc fixes.  So now v3.16 
itself is broken, and there's no way to fix it.

As far as I can tell, this patch doesn't fix a regression.  So no way it 
should have gone in during late -rc kernels.


- Paul
Felipe Balbi Aug. 8, 2014, 2:14 a.m. UTC | #8
Hi,

On Thu, Aug 07, 2014 at 10:21:23PM +0000, Paul Walmsley wrote:
> On Fri, 1 Aug 2014, Tony Lindgren wrote:
> 
> > * Paul Walmsley <paul@pwsan.com> [140731 12:29]:
> > > On Thu, 31 Jul 2014, Tony Lindgren wrote:
> > > 
> > > > * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > > > > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > > > > 
> > > > > > The following patch should fix the tests above for 3530es3beagle.
> > > > > > Care to test and ack as I don't have one?
> > > > > 
> > > > > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > > > > other boards fail too - haven't tested any others).  
> > > > 
> > > > I just checked that today's linux next works for off-idle and
> > > > wake-up events for at least 37xx evm.
> > > 
> > > I ran the full set of tests across all boards.  The only board that passed 
> > > the dynamic idle testing on next-20140729 was the 3730beaglexm.
> > > 
> > > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt
> > > 
> > > 37xxevm hangs on the first suspend entry:
> > > 
> > > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt
> > > 
> > > If I find some extra time, I'll set up a bisection run.
> > 
> > OK that sounds like some driver suspend regression that needs
> > to be tracked down. I'm seeing it on my 37xx evm also with
> > linux next too.
> 
> It's commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75:
> 
> Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> Date:   Wed Jul 23 16:47:31 2014 -0300
> 
>     net: phy: Set the driver when registering an MDIO bus device
>     
>     mdiobus_register() registers a device which is already bound to a driver.
>     Hence, the driver pointer should be set properly in order to track down
>     the driver associated to the MDIO bus.
>     
>     This will be used to allow ethernet driver to pin down a MDIO bus driver,
>     preventing it from being unloaded while the PHY device is running.
>     
>     Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>     Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>     Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> 
> What's bad is that this went in late during v3.16-rc fixes.  So now v3.16 
> itself is broken, and there's no way to fix it.

we have stable releases for that.
Fabio Estevam Aug. 8, 2014, 2:29 a.m. UTC | #9
On Thu, Aug 7, 2014 at 7:21 PM, Paul Walmsley <paul@pwsan.com> wrote:
>
> It's commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75:
>
> Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> Date:   Wed Jul 23 16:47:31 2014 -0300
>
>     net: phy: Set the driver when registering an MDIO bus device
>
>     mdiobus_register() registers a device which is already bound to a driver.
>     Hence, the driver pointer should be set properly in order to track down
>     the driver associated to the MDIO bus.
>
>     This will be used to allow ethernet driver to pin down a MDIO bus driver,
>     preventing it from being unloaded while the PHY device is running.
>
>     Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>     Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>     Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
>
> What's bad is that this went in late during v3.16-rc fixes.  So now v3.16
> itself is broken, and there's no way to fix it.

I have sent a patch reverting this commit and it is in mainline now.
It will reach 3.16.1:

commit ce7991e8198b80eb6b4441b6f6114bea4a665d66
Author: Fabio Estevam <fabio.estevam@freescale.com>
Date:   Tue Aug 5 08:13:42 2014 -0300

    Revert "net: phy: Set the driver when registering an MDIO bus device"

    Commit a71e3c37960ce5f9 ("net: phy: Set the driver when registering an
    MDIO bus device") caused the following regression on the fec driver:

    root@imx6qsabresd:~# echo mem > /sys/power/state
    PM: Syncing filesystems ... done.
    Freezing user space processes ... (elapsed 0.003 seconds) done.
    Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
    Unable to handle kernel NULL pointer dereference at virtual address 0000002c
    pgd = bcd14000
    [0000002c] *pgd=4d9e0831, *pte=00000000, *ppte=00000000
    Internal error: Oops: 17 [#1] SMP ARM
    Modules linked in:
    CPU: 0 PID: 617 Comm: sh Not tainted 3.16.0 #17
    task: bc0c4e00 ti: bceb6000 task.ti: bceb6000
    PC is at fec_suspend+0x10/0x70
    LR is at dpm_run_callback.isra.7+0x34/0x6c
    pc : [<803f8a98>]    lr : [<80361f44>]    psr: 600f0013
    sp : bceb7d70  ip : bceb7d88  fp : bceb7d84
    r10: 8091523c  r9 : 00000000  r8 : bd88f478
    r7 : 803f8a88  r6 : 81165988  r5 : 00000000  r4 : 00000000
    r3 : 00000000  r2 : 00000000  r1 : bd88f478  r0 : bd88f478
    Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
    Control: 10c5387d  Table: 4cd1404a  DAC: 00000015
    Process sh (pid: 617, stack limit = 0xbceb6240)
    Stack: (0xbceb7d70 to 0xbceb8000)
    ....

    The problem with the original commit is explained by Russell King:

    "It has the effect (as can be seen from the oops) of attaching the MDIO bus
    device (itself is a bus-less device) to the platform driver, which means
    that if the platform driver supports power management, it will be called
    to power manage the MDIO bus device.

    Moreover, drivers do not expect to be called for power management
    operations for devices which they haven't probed, and certainly not for
    devices which aren't part of the same bus that the driver is registered
    against."

    This reverts commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75.

    Cc: <stable@vger.kernel.org> #3.16
    Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Walmsley Aug. 8, 2014, 6:11 a.m. UTC | #10
On Thu, 7 Aug 2014, Felipe Balbi wrote:

> we have stable releases for that.

Stable releases aren't a fix for process failures.


- Paul
Paul Walmsley Aug. 8, 2014, 6:14 a.m. UTC | #11
On Thu, 7 Aug 2014, Fabio Estevam wrote:

> On Thu, Aug 7, 2014 at 7:21 PM, Paul Walmsley <paul@pwsan.com> wrote:
> >
> > It's commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75:
> >
> > Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> > Date:   Wed Jul 23 16:47:31 2014 -0300
> >
> >     net: phy: Set the driver when registering an MDIO bus device
> >
> >     mdiobus_register() registers a device which is already bound to a driver.
> >     Hence, the driver pointer should be set properly in order to track down
> >     the driver associated to the MDIO bus.
> >
> >     This will be used to allow ethernet driver to pin down a MDIO bus driver,
> >     preventing it from being unloaded while the PHY device is running.
> >
> >     Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
> >     Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> >     Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> >     Signed-off-by: David S. Miller <davem@davemloft.net>
> >
> >
> > What's bad is that this went in late during v3.16-rc fixes.  So now v3.16
> > itself is broken, and there's no way to fix it.
> 
> I have sent a patch reverting this commit and it is in mainline now.

That's great; thanks for letting us know.

>     The problem with the original commit is explained by Russell King:
> 
>     "It has the effect (as can be seen from the oops) of attaching the MDIO bus
>     device (itself is a bus-less device) to the platform driver, which means
>     that if the platform driver supports power management, it will be called
>     to power manage the MDIO bus device.
> 
>     Moreover, drivers do not expect to be called for power management
>     operations for devices which they haven't probed, and certainly not for
>     devices which aren't part of the same bus that the driver is registered
>     against."

Makes sense to me.


- Paul
Felipe Balbi Aug. 8, 2014, 2:34 p.m. UTC | #12
On Fri, Aug 08, 2014 at 06:11:54AM +0000, Paul Walmsley wrote:
> On Thu, 7 Aug 2014, Felipe Balbi wrote:
> 
> > we have stable releases for that.
> 
> Stable releases aren't a fix for process failures.

of course not, but your claim that there's no way to fix v3.16 is
nonsense. And mistakes occur as well.
Paul Walmsley Aug. 8, 2014, 11:39 p.m. UTC | #13
On Fri, 8 Aug 2014, Felipe Balbi wrote:

> On Fri, Aug 08, 2014 at 06:11:54AM +0000, Paul Walmsley wrote:
> > On Thu, 7 Aug 2014, Felipe Balbi wrote:
> > 
> > > we have stable releases for that.
> > 
> > Stable releases aren't a fix for process failures.
> 
> of course not, but your claim that there's no way to fix v3.16 is
> nonsense. And mistakes occur as well.

There is no way to fix v3.16. 

The point of the -rc cycle is to stabilize the final, non-rc releases.


- Paul
Felipe Balbi Aug. 9, 2014, 2:49 a.m. UTC | #14
Hi,

On Fri, Aug 08, 2014 at 11:39:08PM +0000, Paul Walmsley wrote:
> On Fri, 8 Aug 2014, Felipe Balbi wrote:
> 
> > On Fri, Aug 08, 2014 at 06:11:54AM +0000, Paul Walmsley wrote:
> > > On Thu, 7 Aug 2014, Felipe Balbi wrote:
> > > 
> > > > we have stable releases for that.
> > > 
> > > Stable releases aren't a fix for process failures.
> > 
> > of course not, but your claim that there's no way to fix v3.16 is
> > nonsense. And mistakes occur as well.
> 
> There is no way to fix v3.16. 

There is no way to fix *any* commit.

We can go back a few major releases and we will hit a point when ARM
multiplatform was enabled and the tree doesn't even build. There is no
way to fix that either, but I'm sure the patches fixing the build breaks
have been back ported to stable.

> The point of the -rc cycle is to stabilize the final, non-rc releases.

Well, stable releases are *not* -rc releases and they exist in order to
fix bugs that were found after the final release has been tagged;
otherwise what would be the point of even having -stable?
Ezequiel Garcia Aug. 9, 2014, 12:41 p.m. UTC | #15
On 07 Aug 10:21 PM, Paul Walmsley wrote:
> On Fri, 1 Aug 2014, Tony Lindgren wrote:
> 
> > * Paul Walmsley <paul@pwsan.com> [140731 12:29]:
> > > On Thu, 31 Jul 2014, Tony Lindgren wrote:
> > > 
> > > > * Paul Walmsley <paul@pwsan.com> [140730 00:55]:
> > > > > On Tue, 29 Jul 2014, Tony Lindgren wrote:
> > > > > 
> > > > > > The following patch should fix the tests above for 3530es3beagle.
> > > > > > Care to test and ack as I don't have one?
> > > > > 
> > > > > 3530es3beagle retention dynamic idle tests hang on next-20140729.  (Maybe 
> > > > > other boards fail too - haven't tested any others).  
> > > > 
> > > > I just checked that today's linux next works for off-idle and
> > > > wake-up events for at least 37xx evm.
> > > 
> > > I ran the full set of tests across all boards.  The only board that passed 
> > > the dynamic idle testing on next-20140729 was the 3730beaglexm.
> > > 
> > > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/README.txt
> > > 
> > > 37xxevm hangs on the first suspend entry:
> > > 
> > > http://www.pwsan.com/omap/testlogs/next_20140729/20140730124836/pm/37xxevm/37xxevm_log.txt
> > > 
> > > If I find some extra time, I'll set up a bisection run.
> > 
> > OK that sounds like some driver suspend regression that needs
> > to be tracked down. I'm seeing it on my 37xx evm also with
> > linux next too.
> 
> It's commit a71e3c37960ce5f9c6a519bc1215e3ba9fa83e75:
> 
> Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
> Date:   Wed Jul 23 16:47:31 2014 -0300
> 
>     net: phy: Set the driver when registering an MDIO bus device
>     
>     mdiobus_register() registers a device which is already bound to a driver.
>     Hence, the driver pointer should be set properly in order to track down
>     the driver associated to the MDIO bus.
>     
>     This will be used to allow ethernet driver to pin down a MDIO bus driver,
>     preventing it from being unloaded while the PHY device is running.
>     
>     Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
>     Tested-by: Florian Fainelli <f.fainelli@gmail.com>
>     Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> 
> What's bad is that this went in late during v3.16-rc fixes.  So now v3.16 
> itself is broken, and there's no way to fix it.
> 
> As far as I can tell, this patch doesn't fix a regression.  So no way it 
> should have gone in during late -rc kernels.
> 

Indeed, the commit shouldn't have landed as a v3.16-rc fix. FWIW, it was
originally intended for v3.17, but I wasn't clear enough about this when it
was submitted.

Patch

--- a/arch/arm/boot/dts/omap3-beagle.dts
+++ b/arch/arm/boot/dts/omap3-beagle.dts
@@ -292,6 +292,7 @@ 
 &uart3 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&uart3_pins>;
+	interrupts-extended = <&intc 74 &omap3_pmx_core OMAP3_UART3_RX>;
 };
 
 &gpio1 {