diff mbox

Boot hang regression 3.10.0-rc4 -> 3.10.0

Message ID 20130710160704.GH18966@arwen.pp.htv.fi (mailing list archive)
State New, archived
Headers show

Commit Message

Felipe Balbi July 10, 2013, 4:07 p.m. UTC
Hi,

On Wed, Jul 10, 2013 at 07:26:34AM -0700, Tony Lindgren wrote:
> * Kevin Hilman <khilman@linaro.org> [130710 01:29]:
> > Felipe Balbi <balbi@ti.com> writes:
> > >> 
> > >> Right, but calling serial_omap_restore_context() even when the context
> > >> is not lost, should not ideally cause an issue.
> > >
> > > it does in one condition. If context hasn't been saved before. And that
> > > can happen in the case of wrong pm runtime status for that device.
> > >
> > > Imagine the device is marked as suspended even though it's fully enabled
> > > (it hasn't been suspended by hwmod due to NO_IDLE flag). In that case
> > > your context structure is all zeroes (context has never been saved
> > > before) then when you call pm_runtime_get_sync() on probe() your
> > > ->runtime_resume() will get called, which will restore context,
> > > essentially undoing anything which was configured by u-boot.
> > >
> > > Am I missing something ?
> > 
> > You're right, the _set_active() is crucial in the case when we prevent
> > the console UART from idling during boot (though that shouldn't be
> > happening in mainline unless the fix for "Issue 1" is done.)
> 
> Felipe is right, looks like all we need is to check if context is
> initialized or not. So no need for mach-omap2/serial.c or hwmod tinkering.
> 
> After that having DEBUG_LL and cmdline with earlyprintk console=ttyO..
> works for me. We could also check for some combination of the context
> save registers being NULL if somebody has a good idea which ones
> should never be 0.
> 
> Regards,
> 
> Tony
> 
> 
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -161,6 +161,7 @@ struct uart_omap_port {
>  	u32			calc_latency;
>  	struct work_struct	qos_work;
>  	struct pinctrl		*pins;
> +	bool			initialized;
>  	bool			is_suspending;
>  };
>  
> @@ -1523,6 +1524,8 @@ static int serial_omap_probe(struct platform_device *pdev)
>  
>  	pm_runtime_mark_last_busy(up->dev);
>  	pm_runtime_put_autosuspend(up->dev);
> +	up->initialized = true;
> +
>  	return 0;
>  
>  err_add_port:
> @@ -1584,6 +1587,9 @@ static void serial_omap_mdr1_errataset(struct uart_omap_port *up, u8 mdr1)
>  #ifdef CONFIG_PM_RUNTIME
>  static void serial_omap_restore_context(struct uart_omap_port *up)
>  {
> +	if (!up->initialized)
> +		return;
> +
>  	if (up->errata & UART_ERRATA_i202_MDR1_ACCESS)
>  		serial_omap_mdr1_errataset(up, UART_OMAP_MDR1_DISABLE);
>  	else

how about something like below ? It makes omap_device/hwmod and
pm_runtime agree on the initial state of the device and will prevent
->runtime_resume() from being called on first pm_runtime_get*() done
during probe.

This is similar to what PCI bus does (if you look at pci_pm_init()).

commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
Author: Felipe Balbi <balbi@ti.com>
Date:   Wed Jul 10 18:50:16 2013 +0300

    arm: omap2plus: unidle devices which are about to probe
    
    in order to make HWMOD and pm_runtime agree on the
    initial state of the device, we will unidle the device
    and call pm_runtime_set_active() to tell pm_runtime
    that the device is really active.
    
    By the time driver's probe() is reached, a call to
    pm_runtime_get_sync() will not cause driver's
    ->runtime_resume() method to be called at first, only
    after a successful ->runtime_suspend().
    
    Signed-off-by: Felipe Balbi <balbi@ti.com>

Comments

Felipe Balbi July 10, 2013, 4:11 p.m. UTC | #1
Hi,

On Wed, Jul 10, 2013 at 07:07:04PM +0300, Felipe Balbi wrote:
> how about something like below ? It makes omap_device/hwmod and
> pm_runtime agree on the initial state of the device and will prevent
> ->runtime_resume() from being called on first pm_runtime_get*() done
> during probe.
> 
> This is similar to what PCI bus does (if you look at pci_pm_init()).
> 
> commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
> Author: Felipe Balbi <balbi@ti.com>
> Date:   Wed Jul 10 18:50:16 2013 +0300
> 
>     arm: omap2plus: unidle devices which are about to probe
>     
>     in order to make HWMOD and pm_runtime agree on the
>     initial state of the device, we will unidle the device
>     and call pm_runtime_set_active() to tell pm_runtime
>     that the device is really active.
>     
>     By the time driver's probe() is reached, a call to
>     pm_runtime_get_sync() will not cause driver's
>     ->runtime_resume() method to be called at first, only
>     after a successful ->runtime_suspend().
>     
>     Signed-off-by: Felipe Balbi <balbi@ti.com>

btw, this is RFC, haven't tested at all.
Tony Lindgren July 11, 2013, 6:32 a.m. UTC | #2
* Felipe Balbi <balbi@ti.com> [130710 09:18]:
> 
> On Wed, Jul 10, 2013 at 07:07:04PM +0300, Felipe Balbi wrote:
> > how about something like below ? It makes omap_device/hwmod and
> > pm_runtime agree on the initial state of the device and will prevent
> > ->runtime_resume() from being called on first pm_runtime_get*() done
> > during probe.
> > 
> > This is similar to what PCI bus does (if you look at pci_pm_init()).
> > 
> > commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
> > Author: Felipe Balbi <balbi@ti.com>
> > Date:   Wed Jul 10 18:50:16 2013 +0300
> > 
> >     arm: omap2plus: unidle devices which are about to probe
> >     
> >     in order to make HWMOD and pm_runtime agree on the
> >     initial state of the device, we will unidle the device
> >     and call pm_runtime_set_active() to tell pm_runtime
> >     that the device is really active.
> >     
> >     By the time driver's probe() is reached, a call to
> >     pm_runtime_get_sync() will not cause driver's
> >     ->runtime_resume() method to be called at first, only
> >     after a successful ->runtime_suspend().
> >     
> >     Signed-off-by: Felipe Balbi <balbi@ti.com>
> 
> btw, this is RFC, haven't tested at all.

Yes it does not compile, then removing the extra ; at the end
of the functions, it oopses with a NULL pointer exception.

Regards,

Tony
Grygorii Strashko July 11, 2013, 9:59 a.m. UTC | #3
Hi,

On 07/11/2013 09:32 AM, Tony Lindgren wrote:
> * Felipe Balbi <balbi@ti.com> [130710 09:18]:
>>
>> On Wed, Jul 10, 2013 at 07:07:04PM +0300, Felipe Balbi wrote:
>>> how about something like below ? It makes omap_device/hwmod and
>>> pm_runtime agree on the initial state of the device and will prevent
>>> ->runtime_resume() from being called on first pm_runtime_get*() done
>>> during probe.
>>>
>>> This is similar to what PCI bus does (if you look at pci_pm_init()).
>>>
>>> commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
>>> Author: Felipe Balbi <balbi@ti.com>
>>> Date:   Wed Jul 10 18:50:16 2013 +0300
>>>
>>>      arm: omap2plus: unidle devices which are about to probe
>>>
>>>      in order to make HWMOD and pm_runtime agree on the
>>>      initial state of the device, we will unidle the device
>>>      and call pm_runtime_set_active() to tell pm_runtime
>>>      that the device is really active.
Don't think that it's good idea (
I've checked some driver's and think this patch will enable some devices 
unpredictably:
- hwspinlock
- mailbox
- iommu
- ipu
All above devices need to be enabled on demand only (no 
pm_runtime_get*() calls in probe). More over, some of them have very 
specific enabling sequence - like ipu).

May be Summan can say more on that.

>>>
>>>      By the time driver's probe() is reached, a call to
>>>      pm_runtime_get_sync() will not cause driver's
>>>      ->runtime_resume() method to be called at first, only
>>>      after a successful ->runtime_suspend().
>>>
>>>      Signed-off-by: Felipe Balbi <balbi@ti.com>
>>
>> btw, this is RFC, haven't tested at all.
>
> Yes it does not compile, then removing the extra ; at the end
> of the functions, it oopses with a NULL pointer exception.
>
> Regards,
>
> Tony
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Suman Anna July 12, 2013, 12:40 a.m. UTC | #4
On 07/11/2013 04:59 AM, Grygorii Strashko wrote:
> Hi,
> 
> On 07/11/2013 09:32 AM, Tony Lindgren wrote:
>> * Felipe Balbi <balbi@ti.com> [130710 09:18]:
>>>
>>> On Wed, Jul 10, 2013 at 07:07:04PM +0300, Felipe Balbi wrote:
>>>> how about something like below ? It makes omap_device/hwmod and
>>>> pm_runtime agree on the initial state of the device and will prevent
>>>> ->runtime_resume() from being called on first pm_runtime_get*() done
>>>> during probe.
>>>>
>>>> This is similar to what PCI bus does (if you look at pci_pm_init()).
>>>>
>>>> commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
>>>> Author: Felipe Balbi <balbi@ti.com>
>>>> Date:   Wed Jul 10 18:50:16 2013 +0300
>>>>
>>>>      arm: omap2plus: unidle devices which are about to probe
>>>>
>>>>      in order to make HWMOD and pm_runtime agree on the
>>>>      initial state of the device, we will unidle the device
>>>>      and call pm_runtime_set_active() to tell pm_runtime
>>>>      that the device is really active.
> Don't think that it's good idea (
> I've checked some driver's and think this patch will enable some devices
> unpredictably:
> - hwspinlock
> - mailbox
> - iommu
> - ipu
> All above devices need to be enabled on demand only (no
> pm_runtime_get*() calls in probe). More over, some of them have very
> specific enabling sequence - like ipu).
> 
> May be Summan can say more on that.

Indeed, this is a problem for any of the slave processor devices.
mailbox and iommu would be slaves to the remoteproc and the drivers have
a specific sequence of bringing up a processor. The current
hwmod/omap_device code is such that these devices will be left in reset
and the driver code use the omap_device_(de)assert_hardreset API
together with omap_device_enable code to bring up the devices. The
remoteproc driver also needs to assert the resets (there are other
problems associated with using omap_device_idle for remoteproc and
iommu) for bringing up the devices after a suspend sequence. hwspinlock
and mailbox may get away since they are in CORE domain, but definitely
an issue for iommu and remoteproc. I would think that this would also
affect other compute devices like IVAHD, ISS, SGX.

regards
Suman

> 
>>>>
>>>>      By the time driver's probe() is reached, a call to
>>>>      pm_runtime_get_sync() will not cause driver's
>>>>      ->runtime_resume() method to be called at first, only
>>>>      after a successful ->runtime_suspend().
>>>>
>>>>      Signed-off-by: Felipe Balbi <balbi@ti.com>
>>>
>>> btw, this is RFC, haven't tested at all.
>>
>> Yes it does not compile, then removing the extra ; at the end
>> of the functions, it oopses with a NULL pointer exception.
>>
>> Regards,
>>
>> Tony
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
Rajendra Nayak July 15, 2013, 6:44 a.m. UTC | #5
On Friday 12 July 2013 06:10 AM, Suman Anna wrote:
> On 07/11/2013 04:59 AM, Grygorii Strashko wrote:
>> Hi,
>>
>> On 07/11/2013 09:32 AM, Tony Lindgren wrote:
>>> * Felipe Balbi <balbi@ti.com> [130710 09:18]:
>>>>
>>>> On Wed, Jul 10, 2013 at 07:07:04PM +0300, Felipe Balbi wrote:
>>>>> how about something like below ? It makes omap_device/hwmod and
>>>>> pm_runtime agree on the initial state of the device and will prevent
>>>>> ->runtime_resume() from being called on first pm_runtime_get*() done
>>>>> during probe.
>>>>>
>>>>> This is similar to what PCI bus does (if you look at pci_pm_init()).
>>>>>
>>>>> commit 59108a500b4ab4b1a5102648a3360276dbf7df6f
>>>>> Author: Felipe Balbi <balbi@ti.com>
>>>>> Date:   Wed Jul 10 18:50:16 2013 +0300
>>>>>
>>>>>      arm: omap2plus: unidle devices which are about to probe
>>>>>
>>>>>      in order to make HWMOD and pm_runtime agree on the
>>>>>      initial state of the device, we will unidle the device
>>>>>      and call pm_runtime_set_active() to tell pm_runtime
>>>>>      that the device is really active.
>> Don't think that it's good idea (
>> I've checked some driver's and think this patch will enable some devices
>> unpredictably:
>> - hwspinlock
>> - mailbox
>> - iommu
>> - ipu
>> All above devices need to be enabled on demand only (no
>> pm_runtime_get*() calls in probe). More over, some of them have very
>> specific enabling sequence - like ipu).
>>
>> May be Summan can say more on that.
> 
> Indeed, this is a problem for any of the slave processor devices.
> mailbox and iommu would be slaves to the remoteproc and the drivers have
> a specific sequence of bringing up a processor. The current
> hwmod/omap_device code is such that these devices will be left in reset
> and the driver code use the omap_device_(de)assert_hardreset API
> together with omap_device_enable code to bring up the devices. The
> remoteproc driver also needs to assert the resets (there are other
> problems associated with using omap_device_idle for remoteproc and
> iommu) for bringing up the devices after a suspend sequence. hwspinlock
> and mailbox may get away since they are in CORE domain, but definitely
> an issue for iommu and remoteproc. I would think that this would also
> affect other compute devices like IVAHD, ISS, SGX.

Today, for these IPs I guess hwmod waits for the resets to be de-asserted, right?

        /*
         * If an IP block contains HW reset lines and all of them are
         * asserted, we let integration code associated with that
         * block handle the enable.  We've received very little
         * information on what those driver authors need, and until
         * detailed information is provided and the driver code is
         * posted to the public lists, this is probably the best we
         * can do.
         */
        if (_are_all_hardreset_lines_asserted(oh))
                return 0;

What if this information is send back to omap_device() through a return value
so omap_device() knows about this too, so it avoids marking the omap device as
enabled? Wouldn't that fix the issue?


> 
> regards
> Suman
> 
>>
>>>>>
>>>>>      By the time driver's probe() is reached, a call to
>>>>>      pm_runtime_get_sync() will not cause driver's
>>>>>      ->runtime_resume() method to be called at first, only
>>>>>      after a successful ->runtime_suspend().
>>>>>
>>>>>      Signed-off-by: Felipe Balbi <balbi@ti.com>
>>>>
>>>> btw, this is RFC, haven't tested at all.
>>>
>>> Yes it does not compile, then removing the extra ; at the end
>>> of the functions, it oopses with a NULL pointer exception.
>>>
>>> Regards,
>>>
>>> Tony
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>
diff mbox

Patch

diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
index e6d2307..1cedf3a 100644
--- a/arch/arm/mach-omap2/omap_device.c
+++ b/arch/arm/mach-omap2/omap_device.c
@@ -181,6 +181,26 @@  odbfd_exit:
 	return ret;
 }
 
+static void omap_device_pm_init(struct platform_device *pdev);
+{
+	/*
+	 * if we reach this function, it's because the device is about to be
+	 * probed. In such cases, we will enable the device, and call
+	 * pm_runtime_set_active() so that the device driver and runtime PM
+	 * framework agree on initial state of the device.
+	 */
+	omap_device_enable(pdev);
+	pm_runtime_set_active(&pdev->dev);
+	device_enable_async_suspend(&pdev->dev);
+}
+
+static void omap_device_pm_exit(struct platform_device *pdev);
+{
+	device_disable_async_suspend(&pdev->dev);
+	pm_runtime_set_suspended(&pdev->dev);
+	omap_device_idle(pdev);
+}
+
 static int _omap_device_notifier_call(struct notifier_block *nb,
 				      unsigned long event, void *dev)
 {
@@ -192,6 +212,12 @@  static int _omap_device_notifier_call(struct notifier_block *nb,
 		if (pdev->archdata.od)
 			omap_device_delete(pdev->archdata.od);
 		break;
+	case BUS_NOTIFY_BIND_DRIVER:
+		omap_device_pm_init(pdev);
+		break;
+	case BUS_NOTIFY_UNBOUND_DRIVER:
+		omap_device_pm_exit(pdev);
+		break;
 	case BUS_NOTIFY_ADD_DEVICE:
 		if (pdev->dev.of_node)
 			omap_device_build_from_dt(pdev);