diff mbox

[v2,1/2] ARM: kirkwood: Ensure that kirkwood_ge0[01]_init() finds its clock

Message ID 1359283223-23082-2-git-send-email-gmbnomis@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Simon Baatz Jan. 27, 2013, 10:40 a.m. UTC
Commit 1611f87 (ARM: Kirkwood: switch to DT clock providers) broke the
functions to initialize the ethernet interfaces (kirkwood_ge00_init() and
kirkwood_ge01_init()).  In the DT case, the functions could not enable the
correct clocks.

Fix this by looking up the clocks through the device name.

Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
---
 arch/arm/mach-kirkwood/common.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Sebastian Hesselbarth Jan. 27, 2013, 10:52 a.m. UTC | #1
On 01/27/2013 11:40 AM, Simon Baatz wrote:
> Commit 1611f87 (ARM: Kirkwood: switch to DT clock providers) broke the
> functions to initialize the ethernet interfaces (kirkwood_ge00_init() and
> kirkwood_ge01_init()).  In the DT case, the functions could not enable the
> correct clocks.
>
> Fix this by looking up the clocks through the device name.
>
> Signed-off-by: Simon Baatz<gmbnomis@gmail.com>
> ---
>   arch/arm/mach-kirkwood/common.c |   13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
> index bac21a5..2c97847 100644
> --- a/arch/arm/mach-kirkwood/common.c
> +++ b/arch/arm/mach-kirkwood/common.c
> ...
>   	tclk = clk_register_fixed_rate(NULL, "tclk", NULL,
> @@ -288,12 +287,15 @@ void __init kirkwood_ehci_init(void)
>    ****************************************************************************/
>   void __init kirkwood_ge00_init(struct mv643xx_eth_platform_data *eth_data)
>   {
> +	struct clk *ge0;
>   	orion_ge00_init(eth_data,
>   			GE00_PHYS_BASE, IRQ_KIRKWOOD_GE00_SUM,
>   			IRQ_KIRKWOOD_GE00_ERR, 1600);
>   	/* The interface forgets the MAC address assigned by u-boot if
>   	the clock is turned off, so claim the clk now. */
> -	clk_prepare_enable(ge0);
> +	ge0 = clk_get_sys(MV643XX_ETH_NAME ".0", NULL);
> +	if (!IS_ERR(ge0))
> +		clk_prepare_enable(ge0);
>   }

Simon,

Jason posted a patch set that makes mv643xx DT compatible. IMHO this
patch is obsolete when we have DT support for mv643xx.
kirkwood_ge00/01_init will not be called with DT support at all.

I agree that loosing the MAC address _is_ an issue but there must
be another way to retain it during gated ge clocks than not gate the
clocks at all.

I can think of some ways to retain it but don't know what is the most
common with linux:
- make u-boot pass it through cmdline and let mv643xx get it from there
- have kirkwood's common code grab it before clk gates kick in

I will do some tests on dove if it also suffers from loosing it's MAC
on gated clock.

Sebastian
Simon Baatz Jan. 27, 2013, 11:08 a.m. UTC | #2
Hi Sebastian,

On Sun, Jan 27, 2013 at 11:52:41AM +0100, Sebastian Hesselbarth wrote:
> Jason posted a patch set that makes mv643xx DT compatible. IMHO this
> patch is obsolete when we have DT support for mv643xx.
> kirkwood_ge00/01_init will not be called with DT support at all.

I know. This is to fix a regression in 3.8 (which currently does not
boot with my "modularized" config).

As said in my other mail to Jason, we will need something more clever
in 3.9.

- Simon
Sebastian Hesselbarth Jan. 27, 2013, 11:18 a.m. UTC | #3
On 01/27/2013 12:08 PM, Simon Baatz wrote:
> On Sun, Jan 27, 2013 at 11:52:41AM +0100, Sebastian Hesselbarth wrote:
>> Jason posted a patch set that makes mv643xx DT compatible. IMHO this
>> patch is obsolete when we have DT support for mv643xx.
>> kirkwood_ge00/01_init will not be called with DT support at all.
>
> I know. This is to fix a regression in 3.8 (which currently does not
> boot with my "modularized" config).

Well, patching kirkwood_ge00/01_init() will also affect non-DT code
so there is a great potential to break something else.

If it is a DT-only issue please move the "always enable ge clocks"
fix to kirkwood_clk_legacy_init() as it only called by DT enabled
boards.

If there is no other issue than keeping the clocks running because
of not loosing the MAC address, I prefer not to fix it at all.

Looks like modular gbe on kirkwood is just unsupported on 3.8.

Sebastian
Sebastian Hesselbarth Jan. 27, 2013, 2:19 p.m. UTC | #4
On 01/27/2013 12:08 PM, Simon Baatz wrote:
> Hi Sebastian,
>
> On Sun, Jan 27, 2013 at 11:52:41AM +0100, Sebastian Hesselbarth wrote:
>> Jason posted a patch set that makes mv643xx DT compatible. IMHO this
>> patch is obsolete when we have DT support for mv643xx.
>> kirkwood_ge00/01_init will not be called with DT support at all.
>
> I know. This is to fix a regression in 3.8 (which currently does not
> boot with my "modularized" config).

Simon,

I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
post a follow-up patch to Jason's cleanup patches that will also
grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
does work just fine here on Dove.

Sebastian
Jason Cooper Jan. 27, 2013, 2:46 p.m. UTC | #5
On Sun, Jan 27, 2013 at 03:19:13PM +0100, Sebastian Hesselbarth wrote:
> On 01/27/2013 12:08 PM, Simon Baatz wrote:
> >Hi Sebastian,
> >
> >On Sun, Jan 27, 2013 at 11:52:41AM +0100, Sebastian Hesselbarth wrote:
> >>Jason posted a patch set that makes mv643xx DT compatible. IMHO this
> >>patch is obsolete when we have DT support for mv643xx.
> >>kirkwood_ge00/01_init will not be called with DT support at all.
> >
> >I know. This is to fix a regression in 3.8 (which currently does not
> >boot with my "modularized" config).
> 
> Simon,
> 
> I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
> post a follow-up patch to Jason's cleanup patches that will also
> grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
> does work just fine here on Dove.

I believe Simon's issue is that the mv643xx_eth driver is not loaded at
boot, it's clocks get gated, then when he loads the driver, there is no
mac address.  Is that correct Simon?  I don't think unloading the driver
after boot will trigger this regression.

thx,

Jason.
Sebastian Hesselbarth Jan. 27, 2013, 2:53 p.m. UTC | #6
On 01/27/2013 03:46 PM, Jason Cooper wrote:
>> I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
>> post a follow-up patch to Jason's cleanup patches that will also
>> grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
>> does work just fine here on Dove.
>
> I believe Simon's issue is that the mv643xx_eth driver is not loaded at
> boot, it's clocks get gated, then when he loads the driver, there is no
> mac address.  Is that correct Simon?  I don't think unloading the driver
> after boot will trigger this regression.

Loading and unloading the driver module hangs because of the missing
clk_prepeare_enable in shared driver part. This should be fixed by the
patch I sent you.

Dove and Kirkwood have the same gbe internally and I can boot into Dove
without mv643xx_eth, load, unload, reload the module and it always finds
its MAC address.

I just want Simon to confirm that Kirkwood's gbe is really loosing the
contents of its MAC address registers during gated clocks, which is from
a HW point of view very unlikely.

Sebastian
Jason Cooper Jan. 27, 2013, 3:24 p.m. UTC | #7
On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> On 01/27/2013 03:46 PM, Jason Cooper wrote:
> >>I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
> >>post a follow-up patch to Jason's cleanup patches that will also
> >>grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
> >>does work just fine here on Dove.
> >
> >I believe Simon's issue is that the mv643xx_eth driver is not loaded at
> >boot, it's clocks get gated, then when he loads the driver, there is no
> >mac address.  Is that correct Simon?  I don't think unloading the driver
> >after boot will trigger this regression.
> 
> Loading and unloading the driver module hangs because of the missing
> clk_prepeare_enable in shared driver part. This should be fixed by the
> patch I sent you.
> 
> Dove and Kirkwood have the same gbe internally and I can boot into Dove
> without mv643xx_eth, load, unload, reload the module and it always finds
> its MAC address.
> 
> I just want Simon to confirm that Kirkwood's gbe is really loosing the
> contents of its MAC address registers during gated clocks, which is from
> a HW point of view very unlikely.

Ok, I just wanted to make sure we understood his problem correctly, and
if possible, reproduce it.

Simon, can you give us some steps to reproduce this on our side so we
can see exactly what's happening?

thx,

Jason.
Jason Gunthorpe Jan. 28, 2013, 6:28 p.m. UTC | #8
On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:

> I just want Simon to confirm that Kirkwood's gbe is really loosing the
> contents of its MAC address registers during gated clocks, which is from
> a HW point of view very unlikely.

FWIW, there are two block power management controls on the kirkwood
chips, one is documented to a clock gate, and one is documented to a
be a power down. I wouldn't expect the clock gate to have a problem
with the MAC address, but the powerdown I would expect to wipe it..

Other varients might have only the powerdown..

That said, I'm not sure what Linux uses on Kirkwood, it ideally should
be using both, I think.

Jason
Simon Baatz Jan. 28, 2013, 10:31 p.m. UTC | #9
Hi Jason,

On Sun, Jan 27, 2013 at 10:24:31AM -0500, Jason Cooper wrote:
> On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> > On 01/27/2013 03:46 PM, Jason Cooper wrote:
> > >>I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
> > >>post a follow-up patch to Jason's cleanup patches that will also
> > >>grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
> > >>does work just fine here on Dove.
> > >
> > >I believe Simon's issue is that the mv643xx_eth driver is not loaded at
> > >boot, it's clocks get gated, then when he loads the driver, there is no
> > >mac address.  Is that correct Simon?  I don't think unloading the driver
> > >after boot will trigger this regression.
> > 
> > Loading and unloading the driver module hangs because of the missing
> > clk_prepeare_enable in shared driver part. This should be fixed by the
> > patch I sent you.
> > 
> > Dove and Kirkwood have the same gbe internally and I can boot into Dove
> > without mv643xx_eth, load, unload, reload the module and it always finds
> > its MAC address.
> > 
> > I just want Simon to confirm that Kirkwood's gbe is really loosing the
> > contents of its MAC address registers during gated clocks, which is from
> > a HW point of view very unlikely.
> 
> Ok, I just wanted to make sure we understood his problem correctly, and
> if possible, reproduce it.
> 
> Simon, can you give us some steps to reproduce this on our side so we
> can see exactly what's happening?

Nothing special here. My config originally based on a Debian config
for a Kirkwood kernel.  Thus, amongst other drivers, mv643xx_eth is
build as a module and is loaded from an initrd.
Because all relevant drivers are loaded as modules, I need my
runit patch to boot at all.

Here are my findings with various patches: ("non-DT" means booting
the IBNAS6210 with machine ID 1680 â??Marvell DB-88F6281-BP
Development Board�, which works reasonably well)

3.8-rc5 + runit patch:

DT: Hangs at boot (when loading mv643xx_eth)
non-DT: Boots ok

3.8-rc5 + runit patch + ge00/ge01 patch:

DT: Boots ok
non-DT: Boots ok

3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
legacy clock names):

DT: Boots, but no MAC
non-DT: Boots ok

3.8-rc5 + runit + using Sebastians patch + clks not ticking at module
load:

DT: Boots, but no MAC
non-DT: Boots, but no MAC


hth,
  Simon
Jason Cooper Jan. 29, 2013, 12:48 a.m. UTC | #10
On Mon, Jan 28, 2013 at 11:31:48PM +0100, Simon Baatz wrote:
> Hi Jason,
> 
> On Sun, Jan 27, 2013 at 10:24:31AM -0500, Jason Cooper wrote:
> > On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> > > On 01/27/2013 03:46 PM, Jason Cooper wrote:
> > > >>I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
> > > >>post a follow-up patch to Jason's cleanup patches that will also
> > > >>grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
> > > >>does work just fine here on Dove.
> > > >
> > > >I believe Simon's issue is that the mv643xx_eth driver is not loaded at
> > > >boot, it's clocks get gated, then when he loads the driver, there is no
> > > >mac address.  Is that correct Simon?  I don't think unloading the driver
> > > >after boot will trigger this regression.
> > > 
> > > Loading and unloading the driver module hangs because of the missing
> > > clk_prepeare_enable in shared driver part. This should be fixed by the
> > > patch I sent you.
> > > 
> > > Dove and Kirkwood have the same gbe internally and I can boot into Dove
> > > without mv643xx_eth, load, unload, reload the module and it always finds
> > > its MAC address.
> > > 
> > > I just want Simon to confirm that Kirkwood's gbe is really loosing the
> > > contents of its MAC address registers during gated clocks, which is from
> > > a HW point of view very unlikely.
> > 
> > Ok, I just wanted to make sure we understood his problem correctly, and
> > if possible, reproduce it.
> > 
> > Simon, can you give us some steps to reproduce this on our side so we
> > can see exactly what's happening?
> 
> Nothing special here. My config originally based on a Debian config
> for a Kirkwood kernel.  Thus, amongst other drivers, mv643xx_eth is
> build as a module and is loaded from an initrd.
> Because all relevant drivers are loaded as modules, I need my
> runit patch to boot at all.

Let's back up.  If you config early printk and COMMON_CLK_DEBUG and a
vanilla v3.8-rc5 with your current config, what happens?

Could you also please try kirkwood_defconfig and tell us if it boots?

thx,

Jason.

> 
> Here are my findings with various patches: ("non-DT" means booting
> the IBNAS6210 with machine ID 1680 â??Marvell DB-88F6281-BP
> Development Board�, which works reasonably well)
> 
> 3.8-rc5 + runit patch:
> 
> DT: Hangs at boot (when loading mv643xx_eth)
> non-DT: Boots ok
> 
> 3.8-rc5 + runit patch + ge00/ge01 patch:
> 
> DT: Boots ok
> non-DT: Boots ok
> 
> 3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
> legacy clock names):
> 
> DT: Boots, but no MAC
> non-DT: Boots ok
> 
> 3.8-rc5 + runit + using Sebastians patch + clks not ticking at module
> load:
> 
> DT: Boots, but no MAC
> non-DT: Boots, but no MAC
> 
> 
> hth,
>   Simon
> 
>
Andrew Lunn Jan. 29, 2013, 6:56 a.m. UTC | #11
On Mon, Jan 28, 2013 at 11:28:18AM -0700, Jason Gunthorpe wrote:
> On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> 
> > I just want Simon to confirm that Kirkwood's gbe is really loosing the
> > contents of its MAC address registers during gated clocks, which is from
> > a HW point of view very unlikely.
> 
> FWIW, there are two block power management controls on the kirkwood
> chips, one is documented to a clock gate, and one is documented to a
> be a power down.

Hi Jason

Are you referring to 

Memory Power Management Control Register
Offset: 0x20118

Bit Field Type / Init Val Description
0 GE0_Mem_PD RW GbE0 Memory Power Down:
                 0x0 0 = Functional
                     1 = PowerDown
1 PEX0_Mem_PD RW PCI Express0 Memory Power Down;
                 0x0 0 = Functional
                     1 = PowerDown
2 USB0_Mem_PD RW USB0 Memory Power Down:
                 0x0 0 = Functional
                     1 = PowerDown

etc.

> I wouldn't expect the clock gate to have a problem
> with the MAC address, but the powerdown I would expect to wipe it..

As far as i know, we don't touch it. However, as a debug exercise, it
would be good to print its value at boot, at ethernet driver load
time, after unused clocks are disabled, when the ethernet clock is
enabled by the driver, etc.

	Andrew
Jason Gunthorpe Jan. 29, 2013, 5:29 p.m. UTC | #12
On Tue, Jan 29, 2013 at 07:56:46AM +0100, Andrew Lunn wrote:
> On Mon, Jan 28, 2013 at 11:28:18AM -0700, Jason Gunthorpe wrote:
> > On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> > 
> > > I just want Simon to confirm that Kirkwood's gbe is really loosing the
> > > contents of its MAC address registers during gated clocks, which is from
> > > a HW point of view very unlikely.
> > 
> > FWIW, there are two block power management controls on the kirkwood
> > chips, one is documented to a clock gate, and one is documented to a
> > be a power down.

> Are you referring to 
> 
> Memory Power Management Control Register
> Offset: 0x20118

Yes, and the other is 'Clock Gating Control Register' at 2011c.

I suppose Linux should really used both for maximum power savings,
probably in a clock off, then power off kind of pattern??

Jason
Simon Baatz Jan. 29, 2013, 7:42 p.m. UTC | #13
On Mon, Jan 28, 2013 at 07:48:24PM -0500, Jason Cooper wrote:
> On Mon, Jan 28, 2013 at 11:31:48PM +0100, Simon Baatz wrote:
> > Hi Jason,
> > 
> > On Sun, Jan 27, 2013 at 10:24:31AM -0500, Jason Cooper wrote:
> > > On Sun, Jan 27, 2013 at 03:53:53PM +0100, Sebastian Hesselbarth wrote:
> > > > On 01/27/2013 03:46 PM, Jason Cooper wrote:
> > > > >>I _cannot_ confirm that gbe is loosing its MAC address on Dove. I will
> > > > >>post a follow-up patch to Jason's cleanup patches that will also
> > > > >>grab a clock for smi. With that patch insmod'ing/rmmod'ing mv643xx_eth
> > > > >>does work just fine here on Dove.
> > > > >
> > > > >I believe Simon's issue is that the mv643xx_eth driver is not loaded at
> > > > >boot, it's clocks get gated, then when he loads the driver, there is no
> > > > >mac address.  Is that correct Simon?  I don't think unloading the driver
> > > > >after boot will trigger this regression.
> > > > 
> > > > Loading and unloading the driver module hangs because of the missing
> > > > clk_prepeare_enable in shared driver part. This should be fixed by the
> > > > patch I sent you.
> > > > 
> > > > Dove and Kirkwood have the same gbe internally and I can boot into Dove
> > > > without mv643xx_eth, load, unload, reload the module and it always finds
> > > > its MAC address.
> > > > 
> > > > I just want Simon to confirm that Kirkwood's gbe is really loosing the
> > > > contents of its MAC address registers during gated clocks, which is from
> > > > a HW point of view very unlikely.
> > > 
> > > Ok, I just wanted to make sure we understood his problem correctly, and
> > > if possible, reproduce it.
> > > 
> > > Simon, can you give us some steps to reproduce this on our side so we
> > > can see exactly what's happening?
> > 
> > Nothing special here. My config originally based on a Debian config
> > for a Kirkwood kernel.  Thus, amongst other drivers, mv643xx_eth is
> > build as a module and is loaded from an initrd.
> > Because all relevant drivers are loaded as modules, I need my
> > runit patch to boot at all.
> 
> Let's back up.  If you config early printk and COMMON_CLK_DEBUG and a
> vanilla v3.8-rc5 with your current config, what happens?


I thought it's clear what happens. "runit" is requested in the dts by
serial, spi, watchdog, nand, i2c.  Each of these are either not built
or built as a module in my config, except serial.

of_serial.c does the following in of_platform_serial_setup():

	if (of_property_read_u32(np, "clock-frequency", &clk)) {

		/* Get clk rate through clk driver if present */
		info->clk = clk_get(&ofdev->dev, NULL);
		if (IS_ERR(info->clk)) {
			dev_warn(&ofdev->dev,
				"clk or clock-frequency not defined\n");
			return PTR_ERR(info->clk);
		}

		clk_prepare_enable(info->clk);
		clk = clk_get_rate(info->clk);
	}

and since "clock-frequency" is still given in the standard DTS for
the IB-NAS 6210, it does not enable the clock.

Thus, no driver uses the clock, it gets disabled and the system locks
up.

The system boots when removing "clock-frequency" from the DTS. 

Which lead to two questions:
- Is it safe to remove "clock-frequency" now that we have the clocks
  in place?
- The behavior of of_serial.c looks strange to me, is this really the
desired behavior?


For the "runit" clock, this means that any time you hit a
configuration that does not keep the clock enabled, the system will
lockup.  (We have gone through this more than once now. Especially,
this hits you hard when you try to support a new board and disable as
much as possible in the config/dts for a start...).

Thus, we should keep it enabled even if it is unused, which is
exactly the purpose of CLK_IGNORE_UNUSED if I understood this
correctly.

 
> Could you also please try kirkwood_defconfig and tell us if it boots?

It is very easy to get my system to boot with a stock kernel. Just
build one of the drivers I use directly into the kernel and it boots
(up to the point where the gbe driver is loaded if built as a module,
of course).

> > Here are my findings with various patches: ("non-DT" means booting
> > the IBNAS6210 with machine ID 1680 â??Marvell DB-88F6281-BP
> > Development Board�, which works reasonably well)
> > 
> > 3.8-rc5 + runit patch:
> > 
> > DT: Hangs at boot (when loading mv643xx_eth)
> > non-DT: Boots ok

In the DT case ge0 and ge1 are NULL, and thus, kirkwood_ge0[01]_init()
tries to enable NULL clocks, but not the gbe clocks. -> The clocks
get disabled.

As described in Sebastian's patch (and also found by Andrew some time
ago), this leads to a hanging system when loading the gbe driver.


> > 3.8-rc5 + runit patch + ge00/ge01 patch:
> > 
> > DT: Boots ok
> > non-DT: Boots ok

Since the clocks are found now and kept enabled, DT behaves the same
as non-DT.


> > 
> > 3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
> > legacy clock names):
> > 
> > DT: Boots, but no MAC
> > non-DT: Boots ok

This is the behavior originally found by Andrew. The clock patch
eliminates the lockup, but the hardware forgets its MAC address.

> > 
> > 3.8-rc5 + runit + using Sebastians patch + clks not ticking at module
> > load:
> > 
> > DT: Boots, but no MAC
> > non-DT: Boots, but no MAC

I think non-DT and DT gated clocks behave differently, since the
non-DT ones also enable/disable the PHY.  I explicitly removed the
clk_prepare_enable() from kirkwood_ge0[01]_init() in order to see if
this makes a difference.

(Which reminds me, that we wanted to implement the PHY enabling in
the driver.)

Apparently, it does not make a difference.


- Simon
Jason Cooper Jan. 29, 2013, 8:32 p.m. UTC | #14
On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
> On 01/29/2013 08:42 PM, Simon Baatz wrote:
> >On Mon, Jan 28, 2013 at 07:48:24PM -0500, Jason Cooper wrote:
> >>On Mon, Jan 28, 2013 at 11:31:48PM +0100, Simon Baatz wrote:
> >>>Nothing special here. My config originally based on a Debian config
> >>>for a Kirkwood kernel.  Thus, amongst other drivers, mv643xx_eth is
> >>>build as a module and is loaded from an initrd.
> >>>Because all relevant drivers are loaded as modules, I need my
> >>>runit patch to boot at all.
> >>
> >>Let's back up.  If you config early printk and COMMON_CLK_DEBUG and a
> >>vanilla v3.8-rc5 with your current config, what happens?
> >
> >I thought it's clear what happens. "runit" is requested in the dts by
> >serial, spi, watchdog, nand, i2c.  Each of these are either not built
> >or built as a module in my config, except serial.
> 
> Simon,
> 
> it is clear what happens but just to blindly disable clock gating will
> not help here. What you are suffering are several issues not only one.

Agreed.

> >of_serial.c does the following in of_platform_serial_setup():
> >
> >	if (of_property_read_u32(np, "clock-frequency",&clk)) {
> >
> >		/* Get clk rate through clk driver if present */
> >		info->clk = clk_get(&ofdev->dev, NULL);
> >		if (IS_ERR(info->clk)) {
> >			dev_warn(&ofdev->dev,
> >				"clk or clock-frequency not defined\n");
> >			return PTR_ERR(info->clk);
> >		}
> >
> >		clk_prepare_enable(info->clk);
> >		clk = clk_get_rate(info->clk);
> >	}
> >
> >and since "clock-frequency" is still given in the standard DTS for
> >the IB-NAS 6210, it does not enable the clock.
> >Thus, no driver uses the clock, it gets disabled and the system locks
> >up.
> >The system boots when removing "clock-frequency" from the DTS.
> >
> >Which lead to two questions:
> >- Is it safe to remove "clock-frequency" now that we have the clocks
> >   in place?
> 
> Safe, as long as you replace it with the corresponding clocks=<&..> property.

For serial, this is in kirkwood.dtsi, so we are fine there.

> >- The behavior of of_serial.c looks strange to me, is this really the
> >desired behavior?
> 
> I guess it is, rely on clock-frequency because a lot of boards still use it.
> There was no clocks property in of_serial when we started with kirkwood and
> dove. I think we can switch now to clocks property which will also remove
> all hard coded occurrences of tclk frequency.
> 
> >For the "runit" clock, this means that any time you hit a
> >configuration that does not keep the clock enabled, the system will
> >lockup.  (We have gone through this more than once now. Especially,
> >this hits you hard when you try to support a new board and disable as
> >much as possible in the config/dts for a start...).
> 
> Well, if there is no device/driver using runit it should be safe to disable
> it. If there is a driver using it, we need a clocks property for it. And
> as you already stated, serial is using it. And you want serial in every
> minimal kernel, don't you?
> 
> >Thus, we should keep it enabled even if it is unused, which is
> >exactly the purpose of CLK_IGNORE_UNUSED if I understood this
> >correctly.
> 
> True, but only if it is that important that it should _never_ be gated.
> As long as it is 'only' used by peripheral devices, it should be safe
> to gate it.
> 
> >>Could you also please try kirkwood_defconfig and tell us if it boots?
> >
> >It is very easy to get my system to boot with a stock kernel. Just
> >build one of the drivers I use directly into the kernel and it boots
> >(up to the point where the gbe driver is loaded if built as a module,
> >of course).
> >
> >>>Here are my findings with various patches: ("non-DT" means booting
> >>>the IBNAS6210 with machine ID 1680 “Marvell DB-88F6281-BP
> >>>Development Board�, which works reasonably well)
> >>>
> >>>3.8-rc5 + runit patch:
> >>>
> >>>DT: Hangs at boot (when loading mv643xx_eth)
> >>>non-DT: Boots ok
> >
> >In the DT case ge0 and ge1 are NULL, and thus, kirkwood_ge0[01]_init()
> >tries to enable NULL clocks, but not the gbe clocks. ->  The clocks
> >get disabled.
> >
> >As described in Sebastian's patch (and also found by Andrew some time
> >ago), this leads to a hanging system when loading the gbe driver.
> 
> If I got all your configs right, this DT case is with serial but no
> clocks property? All other runit "users" are _not_ built into the
> kernel? Maybe it is just a coincidence that it fails when loading
> eth but I guess it hangs because of runit getting gated while serial
> accessing it. When you replace serial's clock-frequency with clocks
> property to "runit" it shouldn't hang.
> 
> (Issue 1: gated runit clock hangs kernel due to serial access)

This can be fixed with a patch removing clock-frequency from all
kirkwood boards (dove, orion, etc?) I'll gin this up and add it to the
havoc ;-)

> >>>3.8-rc5 + runit patch + ge00/ge01 patch:
> >>>
> >>>DT: Boots ok
> >>>non-DT: Boots ok
> >
> >Since the clocks are found now and kept enabled, DT behaves the same
> >as non-DT.
> >
> >>>
> >>>3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
> >>>legacy clock names):
> >>>
> >>>DT: Boots, but no MAC
> >>>non-DT: Boots ok
> >
> >This is the behavior originally found by Andrew. The clock patch
> >eliminates the lockup, but the hardware forgets its MAC address.
> >
> >>>
> >>>3.8-rc5 + runit + using Sebastians patch + clks not ticking at module
> >>>load:
> >>>
> >>>DT: Boots, but no MAC
> >>>non-DT: Boots, but no MAC
> 
> Ok, here my patch for smi clock enables gbe clock before accessing gbe
> registers.
> 
> (Issue 2: gated gbe clock hangs kernel due to smi access)

Here, I'm going to wait for Florian's mvmdio changes to settle out and
then readdress this.  My current understanding is that there will only
be one DT node for this.  iiuc, each board dts will have to say which
gate clocks to consume to prevent the mvmdio node in the dtsi from
consuming both unconditionally.

> >I think non-DT and DT gated clocks behave differently, since the
> >non-DT ones also enable/disable the PHY.  I explicitly removed the
> >clk_prepare_enable() from kirkwood_ge0[01]_init() in order to see if
> >this makes a difference.
> 
> True, DT gated clocks don't disable PHYs (and never will).
> 
> >(Which reminds me, that we wanted to implement the PHY enabling in
> >the driver.)
> >
> >Apparently, it does not make a difference.
> 
> Leaves Issue 3, gbe forgets about its MAC address when gated or powered
> down. That should be done with local-mac-address passed by DT enabled
> u-boot or any other (dirty) ATAG hack ;)

A patch to mv643xx_eth to pull this from DT should solve this.

thx,

Jason.
Sebastian Hesselbarth Jan. 29, 2013, 8:48 p.m. UTC | #15
On 01/29/2013 09:32 PM, Jason Cooper wrote:
> On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
>> (Issue 1: gated runit clock hangs kernel due to serial access)
>
> This can be fixed with a patch removing clock-frequency from all
> kirkwood boards (dove, orion, etc?) I'll gin this up and add it to the
> havoc ;-)

For kirkwood.dtsi it is already there as you stated, but yes, it should
be removed from the board files including kirkwood.dtsi and also the
comments in kirkwood.dtsi. I'll prepare a patch for dove.dtsi.

>> (Issue 2: gated gbe clock hangs kernel due to smi access)
>
> Here, I'm going to wait for Florian's mvmdio changes to settle out and
> then readdress this.  My current understanding is that there will only
> be one DT node for this.  iiuc, each board dts will have to say which
> gate clocks to consume to prevent the mvmdio node in the dtsi from
> consuming both unconditionally.

Ok, feel free to consume the patch ;)

>> Leaves Issue 3, gbe forgets about its MAC address when gated or powered
>> down. That should be done with local-mac-address passed by DT enabled
>> u-boot or any other (dirty) ATAG hack ;)
>
> A patch to mv643xx_eth to pull this from DT should solve this.

Great.

Sebastian
Jason Cooper Jan. 29, 2013, 9:23 p.m. UTC | #16
On Tue, Jan 29, 2013 at 09:48:00PM +0100, Sebastian Hesselbarth wrote:
> On 01/29/2013 09:32 PM, Jason Cooper wrote:
> >On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
> >>Leaves Issue 3, gbe forgets about its MAC address when gated or powered
> >>down. That should be done with local-mac-address passed by DT enabled
> >>u-boot or any other (dirty) ATAG hack ;)
> >
> >A patch to mv643xx_eth to pull this from DT should solve this.
> 
> Great.

Did I just volunteer myself for that too? :-)

thx,

Jason.
Simon Baatz Jan. 30, 2013, 12:03 a.m. UTC | #17
On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
> On 01/29/2013 08:42 PM, Simon Baatz wrote:
> >On Mon, Jan 28, 2013 at 07:48:24PM -0500, Jason Cooper wrote:
> >>On Mon, Jan 28, 2013 at 11:31:48PM +0100, Simon Baatz wrote:
> >>>Nothing special here. My config originally based on a Debian config
> >>>for a Kirkwood kernel.  Thus, amongst other drivers, mv643xx_eth is
> >>>build as a module and is loaded from an initrd.
> >>>Because all relevant drivers are loaded as modules, I need my
> >>>runit patch to boot at all.
> >>
> >>Let's back up.  If you config early printk and COMMON_CLK_DEBUG and a
> >>vanilla v3.8-rc5 with your current config, what happens?
> >
> >I thought it's clear what happens. "runit" is requested in the dts by
> >serial, spi, watchdog, nand, i2c.  Each of these are either not built
> >or built as a module in my config, except serial.
> 
> Simon,
> 
> it is clear what happens but just to blindly disable clock gating will
> not help here. What you are suffering are several issues not only one.


Sure it is more than one issue. That's why I ran different scenarios.

> 
> >of_serial.c does the following in of_platform_serial_setup():
> >
> >	if (of_property_read_u32(np, "clock-frequency",&clk)) {
> >
> >		/* Get clk rate through clk driver if present */
> >		info->clk = clk_get(&ofdev->dev, NULL);
> >		if (IS_ERR(info->clk)) {
> >			dev_warn(&ofdev->dev,
> >				"clk or clock-frequency not defined\n");
> >			return PTR_ERR(info->clk);
> >		}
> >
> >		clk_prepare_enable(info->clk);
> >		clk = clk_get_rate(info->clk);
> >	}
> >
> >and since "clock-frequency" is still given in the standard DTS for
> >the IB-NAS 6210, it does not enable the clock.
> >Thus, no driver uses the clock, it gets disabled and the system locks
> >up.
> >The system boots when removing "clock-frequency" from the DTS.
> >
> >Which lead to two questions:
> >- Is it safe to remove "clock-frequency" now that we have the clocks
> >   in place?
> 
> Safe, as long as you replace it with the corresponding clocks=<&..> property.
> 
> >- The behavior of of_serial.c looks strange to me, is this really the
> >desired behavior?
> 
> I guess it is, rely on clock-frequency because a lot of boards still use it.
> There was no clocks property in of_serial when we started with kirkwood and
> dove. I think we can switch now to clocks property which will also remove
> all hard coded occurrences of tclk frequency.
> 
> >For the "runit" clock, this means that any time you hit a
> >configuration that does not keep the clock enabled, the system will
> >lockup.  (We have gone through this more than once now. Especially,
> >this hits you hard when you try to support a new board and disable as
> >much as possible in the config/dts for a start...).
> 
> Well, if there is no device/driver using runit it should be safe to disable
> it. If there is a driver using it, we need a clocks property for it. And
> as you already stated, serial is using it. And you want serial in every
> minimal kernel, don't you?

From f479db4:

 Marvell engineers tell us:

 It seems that many units use the RUNIT clock.
 SPI, UART, NAND, TWSI, ...
 So it's not possible to clock gate it.

 Currently the SPI, NAND and TWSI driver will clk_prepaure_enable()
 this clk, but since we have no idea what ... is, and turning this clk
 off results in a hard lock, unconditionally enable runit.


Thus, if we know better than that, that's fine, but please don't tell
me that this is "blindly" enabling clocks.

> 
> >Thus, we should keep it enabled even if it is unused, which is
> >exactly the purpose of CLK_IGNORE_UNUSED if I understood this
> >correctly.
> 
> True, but only if it is that important that it should _never_ be gated.
> As long as it is 'only' used by peripheral devices, it should be safe
> to gate it.
> 
> >>Could you also please try kirkwood_defconfig and tell us if it boots?
> >
> >It is very easy to get my system to boot with a stock kernel. Just
> >build one of the drivers I use directly into the kernel and it boots
> >(up to the point where the gbe driver is loaded if built as a module,
> >of course).
> >
> >>>Here are my findings with various patches: ("non-DT" means booting
> >>>the IBNAS6210 with machine ID 1680 â??Marvell DB-88F6281-BP
> >>>Development Boardâ??, which works reasonably well)
> >>>
> >>>3.8-rc5 + runit patch:
> >>>
> >>>DT: Hangs at boot (when loading mv643xx_eth)
> >>>non-DT: Boots ok
> >
> >In the DT case ge0 and ge1 are NULL, and thus, kirkwood_ge0[01]_init()
> >tries to enable NULL clocks, but not the gbe clocks. ->  The clocks
> >get disabled.
> >
> >As described in Sebastian's patch (and also found by Andrew some time
> >ago), this leads to a hanging system when loading the gbe driver.
> 
> If I got all your configs right, this DT case is with serial but no
> clocks property? All other runit "users" are _not_ built into the
> kernel? Maybe it is just a coincidence that it fails when loading
> eth but I guess it hangs because of runit getting gated while serial
> accessing it. When you replace serial's clock-frequency with clocks
> property to "runit" it shouldn't hang.

No, this is the SMI hang already. As stated its "3.8-rc5 + runit
patch".  So runit is running.

And sure, removing "clock-frequency" gets the kernel booting until
the gbe driver loads, as I said above in the same mail.

> (Issue 1: gated runit clock hangs kernel due to serial access)

Yes, and issue 1a: gated runit clock hangs kernel due to "...".

I can't get a stock kernel to boot when additionally disabling the
serial driver altogether (and compiling the Ethernet driver into the
kernel to avoid its lockup).

When the runit clock is enabled all the time, it works. I have no
idea where it gets stuck, since I have no console.  It looks like the
SATA module does something (telling from the LEDs), but booting does
not continue.

So, we have a statement from Marvell not to gate it, we know that it
is needed for "...", can we please not gate it?

> 
> >>>3.8-rc5 + runit patch + ge00/ge01 patch:
> >>>
> >>>DT: Boots ok
> >>>non-DT: Boots ok
> >
> >Since the clocks are found now and kept enabled, DT behaves the same
> >as non-DT.
> >
> >>>
> >>>3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
> >>>legacy clock names):
> >>>
> >>>DT: Boots, but no MAC
> >>>non-DT: Boots ok
> >
> >This is the behavior originally found by Andrew. The clock patch
> >eliminates the lockup, but the hardware forgets its MAC address.
> >
> >>>
> >>>3.8-rc5 + runit + using Sebastians patch + clks not ticking at module
> >>>load:
> >>>
> >>>DT: Boots, but no MAC
> >>>non-DT: Boots, but no MAC
> 
> Ok, here my patch for smi clock enables gbe clock before accessing gbe
> registers.
> 
> (Issue 2: gated gbe clock hangs kernel due to smi access)
> 
> >I think non-DT and DT gated clocks behave differently, since the
> >non-DT ones also enable/disable the PHY.  I explicitly removed the
> >clk_prepare_enable() from kirkwood_ge0[01]_init() in order to see if
> >this makes a difference.
> 
> True, DT gated clocks don't disable PHYs (and never will).
> 
> >(Which reminds me, that we wanted to implement the PHY enabling in
> >the driver.)
> >
> >Apparently, it does not make a difference.
> 
> Leaves Issue 3, gbe forgets about its MAC address when gated or powered
> down. That should be done with local-mac-address passed by DT enabled
> u-boot or any other (dirty) ATAG hack ;)

Fine with all of that. But: I am talking about 3.8 all the time. We
have three options for issues 2 and 3 from my point of view:

1. We do proper fixes in 3.8 for issues 2 and 3.

2. We fix this regression by not gating the clock in both the DT and
the non-DT case, preferably by using the correct clocks in
kirkwood_ge0[01]_init().  Additionally, Jason promises that all gets
well with the DT aware driver in 3.9.  ;-)

3. Do nothing. Simply accept that we broke modular Ethernet for DT
because some code relies on two pointers being set.  When we
introduced a new code path for DT, we forgot about these pointers. 
Bad luck, the solution was not nice anyway and we will do proper
fixes in 3.9.


As should be clear by now, I think we should go for option 2.



- Simon
Sebastian Hesselbarth Jan. 30, 2013, 12:51 a.m. UTC | #18
On 01/30/2013 01:03 AM, Simon Baatz wrote:
> On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
>> Well, if there is no device/driver using runit it should be safe to disable
>> it. If there is a driver using it, we need a clocks property for it. And
>> as you already stated, serial is using it. And you want serial in every
>> minimal kernel, don't you?
>
>  From f479db4:
>
>   Marvell engineers tell us:
>
>   It seems that many units use the RUNIT clock.
>   SPI, UART, NAND, TWSI, ...
>   So it's not possible to clock gate it.
>
>   Currently the SPI, NAND and TWSI driver will clk_prepaure_enable()
>   this clk, but since we have no idea what ... is, and turning this clk
>   off results in a hard lock, unconditionally enable runit.
>
>
> Thus, if we know better than that, that's fine, but please don't tell
> me that this is "blindly" enabling clocks.

Hmm, should have seen that comment ealier. Based on your/Andrew's statement
above using CLK_IGNORE_UNUSED on runit maybe is the best.

So I guess we take patch 2/2 and extend DT clock gating for .flags later?

> So, we have a statement from Marvell not to gate it, we know that it
> is needed for "...", can we please not gate it?

Ack.

>>>>> 3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
>>>>> legacy clock names):
>>>>>
>>>>> DT: Boots, but no MAC
>>>>> non-DT: Boots ok
>>>
>>> This is the behavior originally found by Andrew. The clock patch
>>> eliminates the lockup, but the hardware forgets its MAC address.

For the smi clock issue: DT is fixed by the smi clock patch, right?
non-DT should be fixed in kirkwood_clk_init() with an orion_clkdev_add()
alias for MV643XX_ETH_SHARED_NAME ".0" and ".1" respectively.

For the MAC address issue: non-DT doesn't need to be fixed, right?
DT clock gates should be fixed in kirkwood_legacy_clk_init which will
also save the clk_get_sys() call. We can easily remove it when mv643xx_eth
catches the MAC address from either local-mac-address or ATAG.

> Fine with all of that. But: I am talking about 3.8 all the time. We
> have three options for issues 2 and 3 from my point of view:
>
> 1. We do proper fixes in 3.8 for issues 2 and 3.
>
> 2. We fix this regression by not gating the clock in both the DT and
> the non-DT case, preferably by using the correct clocks in
> kirkwood_ge0[01]_init().  Additionally, Jason promises that all gets
> well with the DT aware driver in 3.9.  ;-)
>
> 3. Do nothing. Simply accept that we broke modular Ethernet for DT
> because some code relies on two pointers being set.  When we
> introduced a new code path for DT, we forgot about these pointers.
> Bad luck, the solution was not nice anyway and we will do proper
> fixes in 3.9.
>
> As should be clear by now, I think we should go for option 2.

Agreed, do you think all issues you are suffering will be solved with:

- [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
   (no lockup for minimal kernel configs)

- [PATCH] NET: mv643xx: get smi clock from device tree
   (no lockup for modular DT ethernet)

- Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
   (no lockup for modular non-DT ethernet)

- Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
   kirkwood_legacy_clk_init()
   (retain MAC address for modular DT ethernet)

I'll prepare the latter patches and post them.

Sebastian
Jason Cooper Jan. 30, 2013, 4:26 a.m. UTC | #19
On Wed, Jan 30, 2013 at 01:51:18AM +0100, Sebastian Hesselbarth wrote:
> On 01/30/2013 01:03 AM, Simon Baatz wrote:
> >On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
> >>Well, if there is no device/driver using runit it should be safe to disable
> >>it. If there is a driver using it, we need a clocks property for it. And
> >>as you already stated, serial is using it. And you want serial in every
> >>minimal kernel, don't you?
> >
> > From f479db4:
> >
> >  Marvell engineers tell us:
> >
> >  It seems that many units use the RUNIT clock.
> >  SPI, UART, NAND, TWSI, ...
> >  So it's not possible to clock gate it.
> >
> >  Currently the SPI, NAND and TWSI driver will clk_prepaure_enable()
> >  this clk, but since we have no idea what ... is, and turning this clk
> >  off results in a hard lock, unconditionally enable runit.
> >
> >
> >Thus, if we know better than that, that's fine, but please don't tell
> >me that this is "blindly" enabling clocks.

'blindly' is the correct term.  I'm sorry, I'm not trying to be
difficult, but I don't like applying a 'fix' because a 'Marvell
engineer' was too f'ing lazy to figure out wtf was going on with the
runit clock.

So what happens if runit is gated?  Here's what I did to find out:

######
$ git checkout -b runit v3.8-rc5
$ make kirkwood_defconfig
$ ./scripts/config -e COMMON_CLK_DEBUG -e NETCONSOLE \
	-d SERIAL_OF_PLATFORM -d ORION_WATCHDOG \
	-d SPI_ORION -d MTD_NAND_ORION -d I2C_MV64XXX

$ make uImage dtbs

# edit kernel cmdline in u-boot:

Marvell>> setenv bootargs 'console=null rootwait root=/dev/sda2 netconsole=@192.168.1.253/eth0,@192.168.1.196/ff:ff:ff:ff:ff:ff'
Marvell>> boot

$ socat udp-recv:6666 -
<small pause here>
EXT3-fs (sda2): warning: maximal mount count reached, running e2fsck is recommended
EXT3-fs (sda2): using internal journal
<smaller pause here>>
Unable to handle kernel NULL pointer dereference at virtual address 00000000
#######

It boots all the way up to mounting the USB attached uSD card rootfs.

A few notes about this test setup:
 - the rootfs still has a getty on ttyS0 (could be causing above NULL?)
 - I haven't setup networking on it yet (I was just debugging
   mv643xx_eth gated clock issue).

All this tells me the kernel *does* boot with runit gated since all
drivers that would grab it are disabled.  Tomorrow I'll setup dhcp and
sshd and try to mount debugfs to see if runit is indeed disabled.
Unless someone gets to it before me ;-)

I'll get to the below tomorrow.

thx,

Jason.

> Hmm, should have seen that comment ealier. Based on your/Andrew's statement
> above using CLK_IGNORE_UNUSED on runit maybe is the best.
> 
> So I guess we take patch 2/2 and extend DT clock gating for .flags later?
> 
> >So, we have a statement from Marvell not to gate it, we know that it
> >is needed for "...", can we please not gate it?
> 
> Ack.


> 
> >>>>>3.8-rc5 + runit + Sebastians get smi clock patch (modified to use
> >>>>>legacy clock names):
> >>>>>
> >>>>>DT: Boots, but no MAC
> >>>>>non-DT: Boots ok
> >>>
> >>>This is the behavior originally found by Andrew. The clock patch
> >>>eliminates the lockup, but the hardware forgets its MAC address.
> 
> For the smi clock issue: DT is fixed by the smi clock patch, right?
> non-DT should be fixed in kirkwood_clk_init() with an orion_clkdev_add()
> alias for MV643XX_ETH_SHARED_NAME ".0" and ".1" respectively.
> 
> For the MAC address issue: non-DT doesn't need to be fixed, right?
> DT clock gates should be fixed in kirkwood_legacy_clk_init which will
> also save the clk_get_sys() call. We can easily remove it when mv643xx_eth
> catches the MAC address from either local-mac-address or ATAG.
> 
> >Fine with all of that. But: I am talking about 3.8 all the time. We
> >have three options for issues 2 and 3 from my point of view:
> >
> >1. We do proper fixes in 3.8 for issues 2 and 3.
> >
> >2. We fix this regression by not gating the clock in both the DT and
> >the non-DT case, preferably by using the correct clocks in
> >kirkwood_ge0[01]_init().  Additionally, Jason promises that all gets
> >well with the DT aware driver in 3.9.  ;-)
> >
> >3. Do nothing. Simply accept that we broke modular Ethernet for DT
> >because some code relies on two pointers being set.  When we
> >introduced a new code path for DT, we forgot about these pointers.
> >Bad luck, the solution was not nice anyway and we will do proper
> >fixes in 3.9.
> >
> >As should be clear by now, I think we should go for option 2.
> 
> Agreed, do you think all issues you are suffering will be solved with:
> 
> - [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
>   (no lockup for minimal kernel configs)

fix for v3.8-rc

> - [PATCH] NET: mv643xx: get smi clock from device tree
>   (no lockup for modular DT ethernet)
> 
> - Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
>   (no lockup for modular non-DT ethernet)
> 
> - Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
>   kirkwood_legacy_clk_init()
>   (retain MAC address for modular DT ethernet)
> 
> I'll prepare the latter patches and post them.
> 
> Sebastian
Simon Baatz Jan. 30, 2013, 8:30 a.m. UTC | #20
On Wed, Jan 30, 2013 at 01:51:18AM +0100, Sebastian Hesselbarth wrote:
> above using CLK_IGNORE_UNUSED on runit maybe is the best.
> >Fine with all of that. But: I am talking about 3.8 all the time. We
> >have three options for issues 2 and 3 from my point of view:
> >
> >1. We do proper fixes in 3.8 for issues 2 and 3.
> >
> >2. We fix this regression by not gating the clock in both the DT and
> >the non-DT case, preferably by using the correct clocks in
> >kirkwood_ge0[01]_init().  Additionally, Jason promises that all gets
> >well with the DT aware driver in 3.9.  ;-)
> >
> >3. Do nothing. Simply accept that we broke modular Ethernet for DT
> >because some code relies on two pointers being set.  When we
> >introduced a new code path for DT, we forgot about these pointers.
> >Bad luck, the solution was not nice anyway and we will do proper
> >fixes in 3.9.
> >
> >As should be clear by now, I think we should go for option 2.
> 
> Agreed, do you think all issues you are suffering will be solved with:
> 
> - [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
>   (no lockup for minimal kernel configs)
> 
> - [PATCH] NET: mv643xx: get smi clock from device tree
>   (no lockup for modular DT ethernet)
> 
> - Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
>   (no lockup for modular non-DT ethernet)

I think your patch to get the smi clock is intended for device tree.
Thus, the driver won't use these aliases, right?
 
> - Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
>   kirkwood_legacy_clk_init()
>   (retain MAC address for modular DT ethernet)

I like mine better, since it only enables the clocks of the
interfaces that are initialized in the init code.  I tested it with
non-DT as well.  But either is fine with me.
 
> I'll prepare the latter patches and post them.

Thanks!


- Simon
Sebastian Hesselbarth Jan. 30, 2013, 10:16 a.m. UTC | #21
On 01/30/2013 09:30 AM, Simon Baatz wrote:
> On Wed, Jan 30, 2013 at 01:51:18AM +0100, Sebastian Hesselbarth wrote:
>> - [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
>>    (no lockup for minimal kernel configs)
>>
>> - [PATCH] NET: mv643xx: get smi clock from device tree
>>    (no lockup for modular DT ethernet)
>>
>> - Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
>>    (no lockup for modular non-DT ethernet)
>
> I think your patch to get the smi clock is intended for device tree.
> Thus, the driver won't use these aliases, right?

Actually, both patches above will not fix modular ethernet for 3.8-rc as
shared driver is probed before core driver and not requesting any clk at
all. The "NET: mv643xx: get smi clock from device tree" patch is based
on Jason's attempt to separate shared driver.

If we need to fix modular ethernet now, we also need to add a clk_get
to shared ethernet.

But yes, DT doesn't need any clock aliases.

>> - Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
>>    kirkwood_legacy_clk_init()
>>    (retain MAC address for modular DT ethernet)
>
> I like mine better, since it only enables the clocks of the
> interfaces that are initialized in the init code.  I tested it with
> non-DT as well.  But either is fine with me.

I know the difference, but here it is not only about fixing an issue
but have it cleanly removed later on. But I don't have a strong opinion
on that and maybe Andrew or Jason should coordinate what must be fixed
now and how we do it.

Sebastian
Simon Baatz Jan. 30, 2013, 2:53 p.m. UTC | #22
Hi Sebastian,

On Wed, Jan 30, 2013 at 11:16:32AM +0100, Sebastian Hesselbarth wrote:
> On 01/30/2013 09:30 AM, Simon Baatz wrote:
> >On Wed, Jan 30, 2013 at 01:51:18AM +0100, Sebastian Hesselbarth wrote:
> >>- [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
> >>   (no lockup for minimal kernel configs)
> >>
> >>- [PATCH] NET: mv643xx: get smi clock from device tree
> >>   (no lockup for modular DT ethernet)
> >>
> >>- Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
> >>   (no lockup for modular non-DT ethernet)
> >
> >I think your patch to get the smi clock is intended for device tree.
> >Thus, the driver won't use these aliases, right?
> 
> Actually, both patches above will not fix modular ethernet for 3.8-rc as
> shared driver is probed before core driver and not requesting any clk at
> all. The "NET: mv643xx: get smi clock from device tree" patch is based
> on Jason's attempt to separate shared driver.
> 
> If we need to fix modular ethernet now, we also need to add a clk_get
> to shared ethernet.
> 
> But yes, DT doesn't need any clock aliases.
> 
> >>- Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
> >>   kirkwood_legacy_clk_init()
> >>   (retain MAC address for modular DT ethernet)
> >
> >I like mine better, since it only enables the clocks of the
> >interfaces that are initialized in the init code.  I tested it with
> >non-DT as well.  But either is fine with me.
> 
> I know the difference, but here it is not only about fixing an issue
> but have it cleanly removed later on. But I don't have a strong opinion
> on that and maybe Andrew or Jason should coordinate what must be fixed
> now and how we do it.

Nitpicking here: For a DT aware driver on a DT board, the calls to
kirkwood_ge0[01]_init() need to be removed anyway (this is already
part of Jason's patches) and the clocks will not be enabled.  Thus,
there is not need to cleanup anything for DT when keeping the
clk_prepare_enable in those functions (beyond what we need to do
anyhow).

But now I will shut up. I fully agree that letting Jason/Andrew
decide what to do is the best way forward.

- Simon
Jason Cooper Jan. 30, 2013, 10:43 p.m. UTC | #23
> On 01/29/2013 09:32 PM, Jason Cooper wrote:
> >On Tue, Jan 29, 2013 at 09:08:46PM +0100, Sebastian Hesselbarth wrote:
> >>Leaves Issue 3, gbe forgets about its MAC address when gated or powered
> >>down. That should be done with local-mac-address passed by DT enabled
> >>u-boot or any other (dirty) ATAG hack ;)
> >
> >A patch to mv643xx_eth to pull this from DT should solve this.

Somewhere, Jason Gunthorpe shared his patch to do this.  I'll poke
around for it and try to get it merged in.

thx,

Jason.
Jason Cooper Jan. 30, 2013, 11:01 p.m. UTC | #24
On Wed, Jan 30, 2013 at 11:16:32AM +0100, Sebastian Hesselbarth wrote:
> On 01/30/2013 09:30 AM, Simon Baatz wrote:
> >On Wed, Jan 30, 2013 at 01:51:18AM +0100, Sebastian Hesselbarth wrote:
> >>- [PATCH v2 2/2] clk: mvebu: Do not gate runit clock on Kirkwood
> >>   (no lockup for minimal kernel configs)
> >>
> >>- [PATCH] NET: mv643xx: get smi clock from device tree
> >>   (no lockup for modular DT ethernet)
> >>
> >>- Some patch that adds MV643XX_ETH_SHARED_NAME ".0" and ".1" clk aliases
> >>   (no lockup for modular non-DT ethernet)
> >
> >I think your patch to get the smi clock is intended for device tree.
> >Thus, the driver won't use these aliases, right?
> 
> Actually, both patches above will not fix modular ethernet for 3.8-rc as
> shared driver is probed before core driver and not requesting any clk at
> all. The "NET: mv643xx: get smi clock from device tree" patch is based
> on Jason's attempt to separate shared driver.
> 
> If we need to fix modular ethernet now, we also need to add a clk_get
> to shared ethernet.
> 
> But yes, DT doesn't need any clock aliases.
> 
> >>- Some patch that adds clk_prepare_enable to ge0/ge1 clocks to
> >>   kirkwood_legacy_clk_init()
> >>   (retain MAC address for modular DT ethernet)
> >
> >I like mine better, since it only enables the clocks of the
> >interfaces that are initialized in the init code.  I tested it with
> >non-DT as well.  But either is fine with me.
> 
> I know the difference, but here it is not only about fixing an issue
> but have it cleanly removed later on. But I don't have a strong opinion
> on that and maybe Andrew or Jason should coordinate what must be fixed
> now and how we do it.

I agree that Simon's is nicer (per device disabling).  However,
Sebastian is correct, his is easier to remove later on once we have
proper DT bindings in mv643xx_eth.

As it stands, there are three patches to fix this issue:

ARM: kirkwood: of_serial: fix clock gating by removing clock-frequency
ARM: kirkwood: provide ge clock aliases for shared smi
ARM: kirkwood: fix to retain gbe MAC addresses for DT kernels

wrt to runit gating, the only case we are not covering is if of_serial
is a module (and so is everything else using the runit clk).  That's
*really* rare.  If someone embarks down that path, they get the
responsibility of not writing to all the deactivated registers. ;-)

wrt to ge losing mac addresses, both DT and non-DT booting are covered by
Sebastian's patches, for non-DT aware mv643xx_eth.

Once we add proper DT support to mv643xx_eth, the local-mac-address will
need to be defined in the dtb.  We should probably poke the other Jason
to see if there is a cooresponding u-boot patch to his local-mac-address
patch.

thx,

Jason.
Jason Gunthorpe Jan. 30, 2013, 11:15 p.m. UTC | #25
On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:

> Once we add proper DT support to mv643xx_eth, the local-mac-address will
> need to be defined in the dtb.  We should probably poke the other Jason
> to see if there is a cooresponding u-boot patch to his local-mac-address
> patch.

Sorry I don't have a u-boot fix, we don't use u-boot here - but
local-mac-address is common on other ARM drivers so I'd hope there is
a u-boot facility for stuffing it already?

Jason
Sebastian Hesselbarth Jan. 30, 2013, 11:22 p.m. UTC | #26
On 01/31/2013 12:01 AM, Jason Cooper wrote:
> As it stands, there are three patches to fix this issue:
>
> ARM: kirkwood: of_serial: fix clock gating by removing clock-frequency
> ARM: kirkwood: provide ge clock aliases for shared smi
> ARM: kirkwood: fix to retain gbe MAC addresses for DT kernels

Actually, for the second patch I got distracted by the smi split patch
set floating around. But that is not in current kernel and smi will not
request any clock at all.

If Simon can hit another round of testing without second patch included
and agrees, I suggest to keep it for next release.

> wrt to ge losing mac addresses, both DT and non-DT booting are covered by
> Sebastian's patches, for non-DT aware mv643xx_eth.

non-DT already ungates ge0/1 clocks on registration and cannot loose its
mac address, not my fix.

Sebastian
Jason Cooper Jan. 31, 2013, 12:32 a.m. UTC | #27
On Thu, Jan 31, 2013 at 12:22:24AM +0100, Sebastian Hesselbarth wrote:
> On 01/31/2013 12:01 AM, Jason Cooper wrote:
> >As it stands, there are three patches to fix this issue:
> >
> >ARM: kirkwood: of_serial: fix clock gating by removing clock-frequency
> >ARM: kirkwood: provide ge clock aliases for shared smi
> >ARM: kirkwood: fix to retain gbe MAC addresses for DT kernels
> 
> Actually, for the second patch I got distracted by the smi split patch
> set floating around. But that is not in current kernel and smi will not
> request any clock at all.
> 
> If Simon can hit another round of testing without second patch included
> and agrees, I suggest to keep it for next release.

Ok, I'll wait for Simon on this.  fixes can go anytime, so no rush.
Better to get it right the first (second?) time out.

> >wrt to ge losing mac addresses, both DT and non-DT booting are covered by
> >Sebastian's patches, for non-DT aware mv643xx_eth.
> 
> non-DT already ungates ge0/1 clocks on registration and cannot loose its
> mac address, not my fix.

Yes, bad wording on my part.  I meant that everything was magically
fixed by Sebastian and that his thesis should be approved without
challenge.  ;-)

thx,

Jason.
Jason Cooper Jan. 31, 2013, 12:40 a.m. UTC | #28
On Wed, Jan 30, 2013 at 04:15:52PM -0700, Jason Gunthorpe wrote:
> On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:
> 
> > Once we add proper DT support to mv643xx_eth, the local-mac-address will
> > need to be defined in the dtb.  We should probably poke the other Jason
> > to see if there is a cooresponding u-boot patch to his local-mac-address
> > patch.
> 
> Sorry I don't have a u-boot fix, we don't use u-boot here - but
> local-mac-address is common on other ARM drivers so I'd hope there is
> a u-boot facility for stuffing it already?

a quick search of the u-boot source reveals common/fdt_support.c:477:

                do_fixup_by_path(fdt, path, "local-mac-address",
                                &mac_addr, 6, 1);

which is indeed called from bootm if u-boot is compiled with OF_LIBFDT.

I'll test this when I work on the mv643xx_eth DT binding series.

thx,

Jason.
Simon Baatz Jan. 31, 2013, 10:26 p.m. UTC | #29
Hi Jason,

On Wed, Jan 30, 2013 at 07:32:22PM -0500, Jason Cooper wrote:
> On Thu, Jan 31, 2013 at 12:22:24AM +0100, Sebastian Hesselbarth wrote:
> > On 01/31/2013 12:01 AM, Jason Cooper wrote:
> > >As it stands, there are three patches to fix this issue:
> > >
> > >ARM: kirkwood: of_serial: fix clock gating by removing clock-frequency
> > >ARM: kirkwood: provide ge clock aliases for shared smi
> > >ARM: kirkwood: fix to retain gbe MAC addresses for DT kernels
> > 
> > Actually, for the second patch I got distracted by the smi split patch
> > set floating around. But that is not in current kernel and smi will not
> > request any clock at all.
> > 
> > If Simon can hit another round of testing without second patch included
> > and agrees, I suggest to keep it for next release.
> 
> Ok, I'll wait for Simon on this.  fixes can go anytime, so no rush.
> Better to get it right the first (second?) time out.

Ok, will do. For the first patch you already have my Tested-by. 
Sebastian is right on the second one, it does not add value for 3.8
yet.  I will test the third patch as soon as I find time.
 
- Simon
Simon Baatz Jan. 31, 2013, 10:44 p.m. UTC | #30
Hi Jason, Sebastian,

On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:
> 
> wrt to runit gating, the only case we are not covering is if of_serial
> is a module (and so is everything else using the runit clk).  That's
> *really* rare.  If someone embarks down that path, they get the
> responsibility of not writing to all the deactivated registers. ;-)

With the serial driver now enabling runit it is really rare, but
where is your enthusiasm to get to the bottom of it?  At least we
have indications that there really is something in "..." (my box
stops somewhere when no driver enables runit)

Sebastian, are you still interested in the .flags stuff from the
runit patch or do you see no need now since "ddr" is the only
exception anyway?


- Simon
Sebastian Hesselbarth Jan. 31, 2013, 10:49 p.m. UTC | #31
On 01/31/2013 11:44 PM, Simon Baatz wrote:
> Hi Jason, Sebastian,
>
> On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:
>>
>> wrt to runit gating, the only case we are not covering is if of_serial
>> is a module (and so is everything else using the runit clk).  That's
>> *really* rare.  If someone embarks down that path, they get the
>> responsibility of not writing to all the deactivated registers. ;-)
>
> With the serial driver now enabling runit it is really rare, but
> where is your enthusiasm to get to the bottom of it?  At least we
> have indications that there really is something in "..." (my box
> stops somewhere when no driver enables runit)

I think watchdog could be an issue, Jason had it disabled and
he was able to run it, right? I don't have a strong opinion on that,
but I'd disable every clock you can - OTOH runit will be enabled
anyway if you choose to have serial.

> Sebastian, are you still interested in the .flags stuff from the
> runit patch or do you see no need now since "ddr" is the only
> exception anyway?

I don't expect to use it on Dove but it should be good to have for
Kirkwood and maybe Armada XP/370 too. You prepare a patch?

Sebastian
Jason Cooper Feb. 1, 2013, 12:01 a.m. UTC | #32
On Thu, Jan 31, 2013 at 11:44:57PM +0100, Simon Baatz wrote:
> Hi Jason, Sebastian,
> 
> On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:
> > 
> > wrt to runit gating, the only case we are not covering is if of_serial
> > is a module (and so is everything else using the runit clk).  That's
> > *really* rare.  If someone embarks down that path, they get the
> > responsibility of not writing to all the deactivated registers. ;-)
> 
> With the serial driver now enabling runit it is really rare, but
> where is your enthusiasm to get to the bottom of it?

No one responded to my email.  I figured I got a little too intense and
decided to let it go.  Didn't want to be the ranting a-hole (too late). :-)

If you're interested, I still have a few ideas.  One was to wire two USB
serial adapters end to end to create a different console
(console=/dev/ttyUSB0,115200, getty, etc).  Since they would be going
over usb, that's a different clock, so it should work and provide us
with a safety net.

I'll see if I can dig up a few spare cables and try it out over the next
few days.  Priority is to get the pull requests for v3.9 in, though.

thx,

Jason.
Jason Cooper Feb. 1, 2013, 12:11 a.m. UTC | #33
On Thu, Jan 31, 2013 at 11:49:38PM +0100, Sebastian Hesselbarth wrote:
> On 01/31/2013 11:44 PM, Simon Baatz wrote:
> >Hi Jason, Sebastian,
> >
> >On Wed, Jan 30, 2013 at 06:01:00PM -0500, Jason Cooper wrote:
> >>
> >>wrt to runit gating, the only case we are not covering is if of_serial
> >>is a module (and so is everything else using the runit clk).  That's
> >>*really* rare.  If someone embarks down that path, they get the
> >>responsibility of not writing to all the deactivated registers. ;-)
> >
> >With the serial driver now enabling runit it is really rare, but
> >where is your enthusiasm to get to the bottom of it?  At least we
> >have indications that there really is something in "..." (my box
> >stops somewhere when no driver enables runit)
> 
> I think watchdog could be an issue, Jason had it disabled and
> he was able to run it, right? I don't have a strong opinion on that,
> but I'd disable every clock you can - OTOH runit will be enabled
> anyway if you choose to have serial.

I got the list of modules be searching kirkwood.dtsi for '&gate_clk 7'.
the watchdog made the list.

Once I can get a proper test environment setup, my first goal is to show
runit gated from a command prompt.  The second goal is to use it in a
*really* locked down / low power firewall gateway.  There is no timeline
for goal number two. ;-)

> >Sebastian, are you still interested in the .flags stuff from the
> >runit patch or do you see no need now since "ddr" is the only
> >exception anyway?
> 
> I don't expect to use it on Dove but it should be good to have for
> Kirkwood and maybe Armada XP/370 too. You prepare a patch?

I'm thinking we probably don't need it atm.

thx,

Jason.
Jason Gunthorpe Feb. 1, 2013, 12:19 a.m. UTC | #34
On Thu, Jan 31, 2013 at 07:01:09PM -0500, Jason Cooper wrote:

> If you're interested, I still have a few ideas.  One was to wire two USB
> serial adapters end to end to create a different console
> (console=/dev/ttyUSB0,115200, getty, etc).  Since they would be going
> over usb, that's a different clock, so it should work and provide us
> with a safety net.

I can't recall, can you still use JTAG once the CPU has hung on a mbus
access?

If so memory dumping the console ring, or cpu registers would get the
answer pretty directly..

My guesses would be the RTC and/or GPIO blocks (the GPIO blinker needs
a clock), based on table 94.

Jason
Andrew Lunn Feb. 1, 2013, 6:14 a.m. UTC | #35
> My guesses would be the RTC and/or GPIO blocks (the GPIO blinker needs
> a clock), based on table 94.

I've used AT91 parts where you can do GPO with the clock disabled, but
GPI needed a ticking clock. So, yes, GPIO is a good candidate for
ruint clock as well.

Looking through the data sheets, and comparing against the gated
clocks, we have the following without their own clock:

RTC, I2C (a.k.a. TWI), UART, NAND, SPI, Watchdog, eFuse,

     Andrew
Jason Gunthorpe Feb. 1, 2013, 6:46 a.m. UTC | #36
On Fri, Feb 01, 2013 at 07:14:50AM +0100, Andrew Lunn wrote:
> > My guesses would be the RTC and/or GPIO blocks (the GPIO blinker needs
> > a clock), based on table 94.
> 
> I've used AT91 parts where you can do GPO with the clock disabled, but
> GPI needed a ticking clock. So, yes, GPIO is a good candidate for
> ruint clock as well.
> 
> Looking through the data sheets, and comparing against the gated
> clocks, we have the following without their own clock:
> 
> RTC, I2C (a.k.a. TWI), UART, NAND, SPI, Watchdog, eFuse,

Hmm..

If watchdog is on the runit clock then the bridge registers and thus
the timer are on the runit clock, so the whole point would be moot.

Any easy test would be to boot a system with a minimal DT, basically
serial only, and have the kernel disable the runit clock, read a
register from one of those blocks, enable the clock and print OK. The
ones that lock up need the runit clock for sure. 7 boots should answer
the question :)

My guess is that all the peripherals behind mbus unit 0x1 (see table
94, and table 96) are controlled by that clock gate. The other gates
seem to be organized by mbus unit, and there is something very tidy
about that from a hardware perspective :)

Jason
Simon Baatz Feb. 2, 2013, 11:04 p.m. UTC | #37
Hi Jason,

On Thu, Jan 31, 2013 at 05:19:32PM -0700, Jason Gunthorpe wrote:
> On Thu, Jan 31, 2013 at 07:01:09PM -0500, Jason Cooper wrote:
> 
> > If you're interested, I still have a few ideas.  One was to wire two USB
> > serial adapters end to end to create a different console
> > (console=/dev/ttyUSB0,115200, getty, etc).  Since they would be going
> > over usb, that's a different clock, so it should work and provide us
> > with a safety net.
> 
> I can't recall, can you still use JTAG once the CPU has hung on a mbus
> access?
> 
> If so memory dumping the console ring, or cpu registers would get the
> answer pretty directly..
> 
> My guesses would be the RTC and/or GPIO blocks (the GPIO blinker needs
> a clock), based on table 94.

These guesses seem to be dead on:

Moved the RTC module and GPIO modules (keys, leds) out of the way, to
see whether they cause the boot with disabled runit to lock up.

System boots now and SSH login is possible!

# mount -t debugfs debugfs /sys/kernel/debug 
# cat /sys/kernel/debug/clk/tclk/runit/clk_enable_count 
0
# insmod ./gpio_keys.ko 

System locks up.

and, after a reboot:

# insmod ./rtc-mv.ko 

System locks up.

Bingo!


- Simon
Jason Cooper Feb. 3, 2013, 4:45 p.m. UTC | #38
On Sun, Feb 03, 2013 at 12:04:18AM +0100, Simon Baatz wrote:
> Hi Jason,
> 
> On Thu, Jan 31, 2013 at 05:19:32PM -0700, Jason Gunthorpe wrote:
> > On Thu, Jan 31, 2013 at 07:01:09PM -0500, Jason Cooper wrote:
> > 
> > > If you're interested, I still have a few ideas.  One was to wire two USB
> > > serial adapters end to end to create a different console
> > > (console=/dev/ttyUSB0,115200, getty, etc).  Since they would be going
> > > over usb, that's a different clock, so it should work and provide us
> > > with a safety net.
> > 
> > I can't recall, can you still use JTAG once the CPU has hung on a mbus
> > access?
> > 
> > If so memory dumping the console ring, or cpu registers would get the
> > answer pretty directly..
> > 
> > My guesses would be the RTC and/or GPIO blocks (the GPIO blinker needs
> > a clock), based on table 94.
> 
> These guesses seem to be dead on:
> 
> Moved the RTC module and GPIO modules (keys, leds) out of the way, to
> see whether they cause the boot with disabled runit to lock up.
> 
> System boots now and SSH login is possible!
> 
> # mount -t debugfs debugfs /sys/kernel/debug 
> # cat /sys/kernel/debug/clk/tclk/runit/clk_enable_count 
> 0
> # insmod ./gpio_keys.ko 
> 
> System locks up.
> 
> and, after a reboot:
> 
> # insmod ./rtc-mv.ko 
> 
> System locks up.
> 
> Bingo!

Awesome!  Thanks for running the test Simon!  I see Andrew sent two
patches hopefully fixing these lockups.  I'll wait for your tested-by
and then apply all four to mvebu/fixes:

ARM: kirkwood: of_serial: fix clock gating by removing clock-frequency
ARM: kirkwood: fix to retain gbe MAC addresses for DT kernels
gpio: mvebu: Add clk support to prevent lockup
rtc: rtc-mv: Add support for clk to avoid lockups

I'll push Jason's local-mac-address patch to v3.9.

That should cover everything.

thx,

Jason.
diff mbox

Patch

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index bac21a5..2c97847 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -218,11 +218,10 @@  static struct clk __init *kirkwood_register_gate_fn(const char *name,
 				    bit_idx, 0, &gating_lock, fn_en, fn_dis);
 }
 
-static struct clk *ge0, *ge1;
 
 void __init kirkwood_clk_init(void)
 {
-	struct clk *runit, *sata0, *sata1, *usb0, *sdio;
+	struct clk *runit, *ge0, *ge1, *sata0, *sata1, *usb0, *sdio;
 	struct clk *crypto, *xor0, *xor1, *pex0, *pex1, *audio;
 
 	tclk = clk_register_fixed_rate(NULL, "tclk", NULL,
@@ -288,12 +287,15 @@  void __init kirkwood_ehci_init(void)
  ****************************************************************************/
 void __init kirkwood_ge00_init(struct mv643xx_eth_platform_data *eth_data)
 {
+	struct clk *ge0;
 	orion_ge00_init(eth_data,
 			GE00_PHYS_BASE, IRQ_KIRKWOOD_GE00_SUM,
 			IRQ_KIRKWOOD_GE00_ERR, 1600);
 	/* The interface forgets the MAC address assigned by u-boot if
 	the clock is turned off, so claim the clk now. */
-	clk_prepare_enable(ge0);
+	ge0 = clk_get_sys(MV643XX_ETH_NAME ".0", NULL);
+	if (!IS_ERR(ge0))
+		clk_prepare_enable(ge0);
 }
 
 
@@ -302,10 +304,13 @@  void __init kirkwood_ge00_init(struct mv643xx_eth_platform_data *eth_data)
  ****************************************************************************/
 void __init kirkwood_ge01_init(struct mv643xx_eth_platform_data *eth_data)
 {
+	struct clk *ge1;
 	orion_ge01_init(eth_data,
 			GE01_PHYS_BASE, IRQ_KIRKWOOD_GE01_SUM,
 			IRQ_KIRKWOOD_GE01_ERR, 1600);
-	clk_prepare_enable(ge1);
+	ge1 = clk_get_sys(MV643XX_ETH_NAME ".1", NULL);
+	if (!IS_ERR(ge1))
+		clk_prepare_enable(ge1);
 }