diff mbox

[RFT/PATCH] serial: omap: prevent resume if device is not suspended.

Message ID 50784458.9080806@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Poddar, Sourav Oct. 12, 2012, 4:24 p.m. UTC
Hi Paul,

There are
On Thursday 11 October 2012 11:58 PM, Paul Walmsley wrote:
> Hi Sourav, Felipe,
>
> any progress on fixing the N800 problem?  Would be good to keep it booting
> since we use it as our primary 2420 test platform.
>
>
> - Paul
The patch sent inlined below might help us to get rid of the serial init 
issue.
Unfortunately, I dont have a N800 board with me to test it and will require
your help to do so.
-----------
From: Sourav Poddar <sourav.poddar@ti.com>
Date: Wed, 1 Aug 2012 15:44:12 +0530
Subject: [RFT/PATCH] serial: omap: Fix N800 serial init issue.


This patch might solve the N800 serial init issue.

This patch will also give pointers if there is any mux settings issue 
with N800 OR
a mismatch between the initial harware state, runtime PM state and omap 
hwmod state.

I don't have a N800 schematics to check about the mux settings getting used.

The observation on beagle board XM with this patch on different boards 
looks flaky,
so your feedback on beagle board will also be very helpful.

Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sourav Poddar <sourav.poddar@ti.com>
---
  drivers/tty/serial/omap-serial.c |    1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

         pm_runtime_set_autosuspend_delay(&pdev->dev,

Comments

Kevin Hilman Oct. 12, 2012, 4:35 p.m. UTC | #1
Sourav <sourav.poddar@ti.com> writes:

> Hi Paul,
>
> There are
> On Thursday 11 October 2012 11:58 PM, Paul Walmsley wrote:
>> Hi Sourav, Felipe,
>>
>> any progress on fixing the N800 problem?  Would be good to keep it booting
>> since we use it as our primary 2420 test platform.
>>
>>
>> - Paul
> The patch sent inlined below might help us to get rid of the serial
> init issue.
> Unfortunately, I dont have a N800 board with me to test it and will require
> your help to do so.
> -----------
> From: Sourav Poddar <sourav.poddar@ti.com>
> Date: Wed, 1 Aug 2012 15:44:12 +0530
> Subject: [RFT/PATCH] serial: omap: Fix N800 serial init issue.
>
>
> This patch might solve the N800 serial init issue.
>
> This patch will also give pointers if there is any mux settings issue
> with N800 OR
> a mismatch between the initial harware state, runtime PM state and
> omap hwmod state.

> I don't have a N800 schematics to check about the mux settings getting used.
>
> The observation on beagle board XM with this patch on different boards
> looks flaky,
> so your feedback on beagle board will also be very helpful.
>
> Cc: Felipe Balbi <balbi@ti.com>
> Signed-off-by: Sourav Poddar <sourav.poddar@ti.com>
> ---
>  drivers/tty/serial/omap-serial.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/tty/serial/omap-serial.c
> b/drivers/tty/serial/omap-serial.c
> index 6ede6fd..3fbc7f7 100644
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
> platform_device *pdev)
>         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
>
>         platform_set_drvdata(pdev, up);
> +       pm_runtime_set_active(&pdev->dev);

NAK.

This will obviously break platforms where the UARTs are not active
before driver loads.

Kevin

>         pm_runtime_enable(&pdev->dev);
>         pm_runtime_use_autosuspend(&pdev->dev);
>         pm_runtime_set_autosuspend_delay(&pdev->dev,
Russell King - ARM Linux Oct. 12, 2012, 4:42 p.m. UTC | #2
On Fri, Oct 12, 2012 at 09:35:54AM -0700, Kevin Hilman wrote:
> Sourav <sourav.poddar@ti.com> writes:
> > diff --git a/drivers/tty/serial/omap-serial.c
> > b/drivers/tty/serial/omap-serial.c
> > index 6ede6fd..3fbc7f7 100644
> > --- a/drivers/tty/serial/omap-serial.c
> > +++ b/drivers/tty/serial/omap-serial.c
> > @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
> > platform_device *pdev)
> >         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
> >
> >         platform_set_drvdata(pdev, up);
> > +       pm_runtime_set_active(&pdev->dev);
> 
> NAK.
> 
> This will obviously break platforms where the UARTs are not active
> before driver loads.

I thought I had proposed a solution for this issue, which was this
sequence:

        omap_device_enable(dev);                                                
        pm_runtime_set_active(dev);                                             
        pm_runtime_enable(dev);                                                 

Yes, I can understand people not liking the omap_device_enable()
there, but I also notice that the email suggesting that never got a
reply either - not even a "I tried this and it doesn't work" or "it
does work".

As such, it seems this issue isn't making any progress as we had
already established that merely doing a "pm_runtime_set_active()"
before "pm_runtime_enable()" was going to break other platforms.
Poddar, Sourav Oct. 12, 2012, 5:29 p.m. UTC | #3
Hi Russell,
Kevin Hilman Oct. 12, 2012, 5:59 p.m. UTC | #4
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 09:35:54AM -0700, Kevin Hilman wrote:
>> Sourav <sourav.poddar@ti.com> writes:
>> > diff --git a/drivers/tty/serial/omap-serial.c
>> > b/drivers/tty/serial/omap-serial.c
>> > index 6ede6fd..3fbc7f7 100644
>> > --- a/drivers/tty/serial/omap-serial.c
>> > +++ b/drivers/tty/serial/omap-serial.c
>> > @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
>> > platform_device *pdev)
>> >         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
>> >
>> >         platform_set_drvdata(pdev, up);
>> > +       pm_runtime_set_active(&pdev->dev);
>> 
>> NAK.
>> 
>> This will obviously break platforms where the UARTs are not active
>> before driver loads.
>
> I thought I had proposed a solution for this issue, which was this
> sequence:
>
>         omap_device_enable(dev);                                                
>         pm_runtime_set_active(dev);                                             
>         pm_runtime_enable(dev);                                                 
>
> Yes, I can understand people not liking the omap_device_enable()
> there, but I also notice that the email suggesting that never got a
> reply either - not even a "I tried this and it doesn't work" or "it
> does work".

Yes, that solution would work (though I didn't actually try it.)

However, we can't use omap_device_enable() in the driver.  We're trying
to clean all the drivers of OMAP-specific APIs.  That being said,
something similar could be done in the device/board init code to ensure
the UART HW is in the state that the driver is expecting it, but again,
that would just mask the real problem which is that a (re)init of the
console UART on 2420/n800 is causing output to disappear.

As I detailed in an earlier response, I still think it's the fact that
the pinmux is not setup correctly for the console UART pins in the board
file, so when the UART is initialized, its mux settings are changed from
the bootloader defaults, causing output to disappear.

> As such, it seems this issue isn't making any progress as we had
> already established that merely doing a "pm_runtime_set_active()"
> before "pm_runtime_enable()" was going to break other platforms.

Agreed.

Kevin
Russell King - ARM Linux Oct. 12, 2012, 6:49 p.m. UTC | #5
On Fri, Oct 12, 2012 at 05:29:55PM +0000, Poddar, Sourav wrote:
> Hi Russell,
> ________________________________________
> From: Russell King - ARM Linux [linux@arm.linux.org.uk]
> Sent: Friday, October 12, 2012 10:12 PM
> To: Kevin Hilman
> Cc: Poddar, Sourav; Paul Walmsley; Balbi, Felipe; gregkh@linuxfoundation.org; tony@atomide.com; linux-kernel@vger.kernel.org; Shilimkar, Santosh; linux-serial@vger.kernel.org; linux-omap@vger.kernel.org; linux-arm-kernel@lists.infradead.org; alan@linux.intel.com
> Subject: Re: [RFT/PATCH] serial: omap: prevent resume if device is not  suspended.
> 
> On Fri, Oct 12, 2012 at 09:35:54AM -0700, Kevin Hilman wrote:
> > Sourav <sourav.poddar@ti.com> writes:
> > > diff --git a/drivers/tty/serial/omap-serial.c
> > > b/drivers/tty/serial/omap-serial.c
> > > index 6ede6fd..3fbc7f7 100644
> > > --- a/drivers/tty/serial/omap-serial.c
> > > +++ b/drivers/tty/serial/omap-serial.c
> > > @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
> > > platform_device *pdev)
> > >         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
> > >
> > >         platform_set_drvdata(pdev, up);
> > > +       pm_runtime_set_active(&pdev->dev);
> >
> > NAK.
> >
> > This will obviously break platforms where the UARTs are not active
> > before driver loads.
> 
> I thought I had proposed a solution for this issue, which was this
> sequence:
> 
>         omap_device_enable(dev);
>         pm_runtime_set_active(dev);
>         pm_runtime_enable(dev);
> 
> Yes, I can understand people not liking the omap_device_enable()
> there, but I also notice that the email suggesting that never got a
> reply either - not even a "I tried this and it doesn't work" or "it
> does work".
> 
> Sorry for the late reply on this. I tried this sequence and it worked perfectly fine on
> panda and beagle. 
> 
> As such, it seems this issue isn't making any progress as we had
> already established that merely doing a "pm_runtime_set_active()"
> before "pm_runtime_enable()" was going to break other platforms.
> 
>  I was  trying to analyse your explanations on this and since
> omap_device_enable is not generally recommended,  I was trying to see
> if anything else can be done to get around this.

Right, so what you need is explanation about why I believe the above
sequence to be necessary.

What is happening is that we're starting from a device in unknown state.
We don't know whether it is enabled or disabled.  We don't know the
state of the clocks or the power domain.

PM runtime state is initialized at device creation in the "device is
suspended" state.  If we merely enable PM runtime from that state, we
are telling the PM runtime subsystem that the device is _indeed_
suspended (disabled) at boot time.

So, that causes the first pm_runtime_get() call to resume the device.
Due to the OMAP runtime PM hooks at the bus layer, this causes
_od_runtime_resume() to be called.

_od_runtime_resume() does two things.  It calls omap_device_enable()
to ensure that the device is woken up (such as, ensuring that the
power domain is on, and turning on the clocks etc.)  It then goes on
to call the device PM layers to call the driver specific runtime PM
resume hook.

So, in summary, what this sequence does is:

        pm_runtime_enable(&pdev->dev);
        pm_runtime_use_autosuspend(&pdev->dev);
        pm_runtime_set_autosuspend_delay(&pdev->dev,
                        omap_up_info->autosuspend_timeout);

        pm_runtime_irq_safe(&pdev->dev);
        pm_runtime_get_sync(&pdev->dev);

is, at the last call, it calls:

		_od_runtime_resume()
			omap_device_enable()
			serial_omap_runtime_resume()

Your original patch at the head of this thread says that the driver
specific runtime resume call causes a problem for N800 - because the
device is already enabled and setup.

Okay, so the initial device state does not match the runtime PM state.

So, what happens if we _do_ make it match your state - as required by
the runtime PM documentation - by adding a call before the sequence:

	pm_runtime_set_active(&pdev->dev);
        pm_runtime_enable(&pdev->dev);
        pm_runtime_use_autosuspend(&pdev->dev);
        pm_runtime_set_autosuspend_delay(&pdev->dev,
                        omap_up_info->autosuspend_timeout);

        pm_runtime_irq_safe(&pdev->dev);
        pm_runtime_get_sync(&pdev->dev);

Right, now runtime PM knows that the device is enabled and alive prior
to that pm_runtime_get_sync() call, and it will _not_ call the runtime
resume hooks.

However, this breaks beaglebone, because the device is disabled when
this driver probes.  So, we have exactly the opposite problem here -
the device is disabled, but runtime PM thinks it is enabled.

The _two_ problems are precisely the same problem: the runtime PM state
does not accurately reflect the actual hardware state - again, as required
by the runtime PM documentation.  The only sane solution to this is to
ensure that the hardware _is_ in a known state prior to enabling runtime
PM.

How do we do that?  Well, the clue is in the bus layer runtime resume
handler - that's what is missing from the beaglebone situation.

Calling this before calling pm_runtime_set_active() gets the hardware
into a known state (enabled), and we then tell the runtime PM code
that the harware _is_ enabled.  Now, runtime PM can be sure what the
initial state is, and everything works.

What's the longer term answer?

Well, I _bet_ all OMAP drivers are doing something along the lines of:

	pm_runtime_enable(dev);
	pm_runtime_get(dev);

in their probe functions.

PCI already solved this problem - partly because it has far too many drivers
to convert.  I took that solution over to the AMBA bus layer, because I
didn't want to have a flag day for all the drivers to convert over in one
massive patch.  What is this solution?

You get the bus layer to handle the initial setup of runtime PM state like
this - in OMAP's case:

	omap_device_enable(pdev);
	pm_runtime_get_noresume(&pdev->dev);
	pm_runtime_set_active(&pdev->dev);
	pm_runtime_enable(&pdev->dev);

You do this prior to calling the device probe function.  If the device
probe fails, then you can undo those actions.  You also need to undo them
when the device is unbound from the driver:

	pm_runtime_disable(&pdev->dev);
	pm_runtime_set_suspended(&pdev->dev);
	pm_runtime_put_noidle(&pdev->dev);

(It is probably dangerous to call omap_device_disable() here for certain
devices...)

This gets rid of all that driver specific runtime PM initialization, with
questionable starting state.  It also means that your devices all get
runtime PM support in so far as if they're bound to a driver, they will
be runtime PM resumed, and when unbound, they will be runtime PM suspended.

However, it means that the driver has to do something to make runtime PM
work.  In the probe function, just before it returns success, it has to
'put' that pm_runtime_get_noresume() reference to allow the device to
enter runtime PM states.  And more importantly, on a remove call, it
_must_ balance that 'put' in the probe with an appropriate 'get' - no
exceptions to that.

And that is how we get to a sane state over runtime PM here, which will
work in every situation on every device, without throwing calls to
omap_device_enable() into every OMAP device driver.

This also has another advantage - by doing that, the OMAP specific
omap_device_enable() call ends up being in bus layer code, not driver
layer, which is the right place for it to be - after all, it's the
bus layer which is already handling that stuff in its runtime PM support
code.
Russell King - ARM Linux Oct. 12, 2012, 6:54 p.m. UTC | #6
On Fri, Oct 12, 2012 at 10:59:22AM -0700, Kevin Hilman wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > On Fri, Oct 12, 2012 at 09:35:54AM -0700, Kevin Hilman wrote:
> >> Sourav <sourav.poddar@ti.com> writes:
> >> > diff --git a/drivers/tty/serial/omap-serial.c
> >> > b/drivers/tty/serial/omap-serial.c
> >> > index 6ede6fd..3fbc7f7 100644
> >> > --- a/drivers/tty/serial/omap-serial.c
> >> > +++ b/drivers/tty/serial/omap-serial.c
> >> > @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
> >> > platform_device *pdev)
> >> >         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
> >> >
> >> >         platform_set_drvdata(pdev, up);
> >> > +       pm_runtime_set_active(&pdev->dev);
> >> 
> >> NAK.
> >> 
> >> This will obviously break platforms where the UARTs are not active
> >> before driver loads.
> >
> > I thought I had proposed a solution for this issue, which was this
> > sequence:
> >
> >         omap_device_enable(dev);                                                
> >         pm_runtime_set_active(dev);                                             
> >         pm_runtime_enable(dev);                                                 
> >
> > Yes, I can understand people not liking the omap_device_enable()
> > there, but I also notice that the email suggesting that never got a
> > reply either - not even a "I tried this and it doesn't work" or "it
> > does work".
> 
> Yes, that solution would work (though I didn't actually try it.)
> 
> However, we can't use omap_device_enable() in the driver.  We're trying
> to clean all the drivers of OMAP-specific APIs.  That being said,
> something similar could be done in the device/board init code to ensure
> the UART HW is in the state that the driver is expecting it, but again,
> that would just mask the real problem which is that a (re)init of the
> console UART on 2420/n800 is causing output to disappear.

See my more expansive suggestion just posted.

Whatever way, this discrepancy between runtime PM state and actual device
state is what is biting you, and it is that which needs fixing.  It's
fairly easy to fix given the right design, one which several other bus
types are already using.

Given the route that OMAP went down when adopting runtime PM, it's going
to be a big change across many drivers, because there's no way to gradually
transition them, but that's unfortunately one of the results of ignoring
requirements of the layers being used.  Sooner or later the oversights
come back to haunt.  Just make sure it's not the ghost of Jaws.
Kevin Hilman Oct. 12, 2012, 8:32 p.m. UTC | #7
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Fri, Oct 12, 2012 at 10:59:22AM -0700, Kevin Hilman wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> 
>> > On Fri, Oct 12, 2012 at 09:35:54AM -0700, Kevin Hilman wrote:
>> >> Sourav <sourav.poddar@ti.com> writes:
>> >> > diff --git a/drivers/tty/serial/omap-serial.c
>> >> > b/drivers/tty/serial/omap-serial.c
>> >> > index 6ede6fd..3fbc7f7 100644
>> >> > --- a/drivers/tty/serial/omap-serial.c
>> >> > +++ b/drivers/tty/serial/omap-serial.c
>> >> > @@ -1414,6 +1414,7 @@ static int __devinit serial_omap_probe(struct
>> >> > platform_device *pdev)
>> >> >         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);
>> >> >
>> >> >         platform_set_drvdata(pdev, up);
>> >> > +       pm_runtime_set_active(&pdev->dev);
>> >> 
>> >> NAK.
>> >> 
>> >> This will obviously break platforms where the UARTs are not active
>> >> before driver loads.
>> >
>> > I thought I had proposed a solution for this issue, which was this
>> > sequence:
>> >
>> >         omap_device_enable(dev);                                                
>> >         pm_runtime_set_active(dev);                                             
>> >         pm_runtime_enable(dev);                                                 
>> >
>> > Yes, I can understand people not liking the omap_device_enable()
>> > there, but I also notice that the email suggesting that never got a
>> > reply either - not even a "I tried this and it doesn't work" or "it
>> > does work".
>> 
>> Yes, that solution would work (though I didn't actually try it.)
>> 
>> However, we can't use omap_device_enable() in the driver.  We're trying
>> to clean all the drivers of OMAP-specific APIs.  That being said,
>> something similar could be done in the device/board init code to ensure
>> the UART HW is in the state that the driver is expecting it, but again,
>> that would just mask the real problem which is that a (re)init of the
>> console UART on 2420/n800 is causing output to disappear.
>
> See my more expansive suggestion just posted.
>
> Whatever way, this discrepancy between runtime PM state and actual device
> state is what is biting you, and it is that which needs fixing.  

I'm not conviced (yet) that a mismatch is the root cause.  Yes, that's
what the author of $SUBJECT patch assumed and stated, but I'm not
pursuaded.  

If it's an improperly configured mux issue, then the UART will break
whenever the device is actually omap_device_enable'd, whether in the
driver or in the bus layer.

Kevin
Tony Lindgren Oct. 12, 2012, 9:51 p.m. UTC | #8
* Kevin Hilman <khilman@deeprootsystems.com> [121012 13:34]:
>
> I'm not conviced (yet) that a mismatch is the root cause.  Yes, that's
> what the author of $SUBJECT patch assumed and stated, but I'm not
> pursuaded.  
> 
> If it's an improperly configured mux issue, then the UART will break
> whenever the device is actually omap_device_enable'd, whether in the
> driver or in the bus layer.

I tried booting n800 here with CONFIG_OMAP_MUX disabled, and no
change. Serial console output stops right when the console initializes.

Regards,

Tony
diff mbox

Patch

diff --git a/drivers/tty/serial/omap-serial.c 
b/drivers/tty/serial/omap-serial.c
index 6ede6fd..3fbc7f7 100644
--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -1414,6 +1414,7 @@  static int __devinit serial_omap_probe(struct 
platform_device *pdev)
         INIT_WORK(&up->qos_work, serial_omap_uart_qos_work);

         platform_set_drvdata(pdev, up);
+       pm_runtime_set_active(&pdev->dev);
         pm_runtime_enable(&pdev->dev);
         pm_runtime_use_autosuspend(&pdev->dev);