diff mbox

[RFC,2/4] ARM: OMAP: PM: Get rid of Powerdomain book-keeping from cpuidle

Message ID 1342764284-8143-3-git-send-email-rnayak@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Rajendra Nayak July 20, 2012, 6:04 a.m. UTC
pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
operations done within cpuidle to do Powerdomain level book-keeping to know
what state transitions for different Powerdomains have been triggered.
This is also useful to do a restore-on-demand in some cases when we know
the context for the given Powerdomain was lost etc.

Now that we have definitive entry/exit points (thanks to the Powerdomain
level usecounting) for Powerdomain transitions, these book-keeping functions
can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
clkdm_disable() functions.

Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
and get rid of the original ones which iterate over all powerdomains.

Signed-off-by: Rajendra Nayak <rnayak@ti.com>
---
 arch/arm/mach-omap2/omap-mpuss-lowpower.c |    4 ++--
 arch/arm/mach-omap2/pm34xx.c              |    4 ++--
 arch/arm/mach-omap2/powerdomain.c         |   28 ++++++++--------------------
 arch/arm/mach-omap2/powerdomain.h         |    4 ++--
 4 files changed, 14 insertions(+), 26 deletions(-)

Comments

Santosh Shilimkar July 20, 2012, 7:25 a.m. UTC | #1
On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak <rnayak@ti.com> wrote:
> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
> operations done within cpuidle to do Powerdomain level book-keeping to know
> what state transitions for different Powerdomains have been triggered.
> This is also useful to do a restore-on-demand in some cases when we know
> the context for the given Powerdomain was lost etc.
>
> Now that we have definitive entry/exit points (thanks to the Powerdomain
> level usecounting) for Powerdomain transitions, these book-keeping functions
> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
> clkdm_disable() functions.
>
> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
> and get rid of the original ones which iterate over all powerdomains.
>
> Signed-off-by: Rajendra Nayak <rnayak@ti.com>
> ---
>  arch/arm/mach-omap2/omap-mpuss-lowpower.c |    4 ++--
>  arch/arm/mach-omap2/pm34xx.c              |    4 ++--
>  arch/arm/mach-omap2/powerdomain.c         |   28 ++++++++--------------------
>  arch/arm/mach-omap2/powerdomain.h         |    4 ++--
>  4 files changed, 14 insertions(+), 26 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/omap-mpuss-lowpower.c b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> index 13670aa..ea19439 100644
> --- a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> +++ b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> @@ -255,7 +255,7 @@ int omap4_enter_lowpower(unsigned int cpu, unsigned int power_state)
>                 return -ENXIO;
>         }
>
> -       pwrdm_pre_transition();
> +       pwrdm_cpu_idle();
>
Glad to see this is getting optimized.
I haven't seen how "pwrdm_cpu_[idle/wakeup]()" is
implemented but will those work on SMP system ?
I mean OMAP4, any CPU can make this call ?

Regards
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 20, 2012, 8:08 a.m. UTC | #2
On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
> On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>  wrote:
>> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
>> operations done within cpuidle to do Powerdomain level book-keeping to know
>> what state transitions for different Powerdomains have been triggered.
>> This is also useful to do a restore-on-demand in some cases when we know
>> the context for the given Powerdomain was lost etc.
>>
>> Now that we have definitive entry/exit points (thanks to the Powerdomain
>> level usecounting) for Powerdomain transitions, these book-keeping functions
>> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
>> clkdm_disable() functions.
>>
>> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
>> and get rid of the original ones which iterate over all powerdomains.
>>
>> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
>> ---
>>   arch/arm/mach-omap2/omap-mpuss-lowpower.c |    4 ++--
>>   arch/arm/mach-omap2/pm34xx.c              |    4 ++--
>>   arch/arm/mach-omap2/powerdomain.c         |   28 ++++++++--------------------
>>   arch/arm/mach-omap2/powerdomain.h         |    4 ++--
>>   4 files changed, 14 insertions(+), 26 deletions(-)
>>
>> diff --git a/arch/arm/mach-omap2/omap-mpuss-lowpower.c b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
>> index 13670aa..ea19439 100644
>> --- a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
>> +++ b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
>> @@ -255,7 +255,7 @@ int omap4_enter_lowpower(unsigned int cpu, unsigned int power_state)
>>                  return -ENXIO;
>>          }
>>
>> -       pwrdm_pre_transition();
>> +       pwrdm_cpu_idle();
>>
> Glad to see this is getting optimized.
> I haven't seen how "pwrdm_cpu_[idle/wakeup]()" is
> implemented but will those work on SMP system ?
> I mean OMAP4, any CPU can make this call ?

Thats a good question. I think Tero did this so he can kick in
voltage transitions at the right time in idle/suspend.
Given that these deal with incrementing/decrementing the MPU and CORE
pwrdm usecounts alone, maybe on OMAP4 (SMP systems) this needs to also
increment/decrement the specific CPU usecounts on the CPUs these calls
are made.

>
> Regards
> Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tero Kristo July 20, 2012, 8:51 a.m. UTC | #3
On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
> On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
> > On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>  wrote:
> >> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
> >> operations done within cpuidle to do Powerdomain level book-keeping to know
> >> what state transitions for different Powerdomains have been triggered.
> >> This is also useful to do a restore-on-demand in some cases when we know
> >> the context for the given Powerdomain was lost etc.
> >>
> >> Now that we have definitive entry/exit points (thanks to the Powerdomain
> >> level usecounting) for Powerdomain transitions, these book-keeping functions
> >> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
> >> clkdm_disable() functions.
> >>
> >> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
> >> and get rid of the original ones which iterate over all powerdomains.
> >>
> >> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
> >> ---
> >>   arch/arm/mach-omap2/omap-mpuss-lowpower.c |    4 ++--
> >>   arch/arm/mach-omap2/pm34xx.c              |    4 ++--
> >>   arch/arm/mach-omap2/powerdomain.c         |   28 ++++++++--------------------
> >>   arch/arm/mach-omap2/powerdomain.h         |    4 ++--
> >>   4 files changed, 14 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/arch/arm/mach-omap2/omap-mpuss-lowpower.c b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> >> index 13670aa..ea19439 100644
> >> --- a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> >> +++ b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> >> @@ -255,7 +255,7 @@ int omap4_enter_lowpower(unsigned int cpu, unsigned int power_state)
> >>                  return -ENXIO;
> >>          }
> >>
> >> -       pwrdm_pre_transition();
> >> +       pwrdm_cpu_idle();
> >>
> > Glad to see this is getting optimized.
> > I haven't seen how "pwrdm_cpu_[idle/wakeup]()" is
> > implemented but will those work on SMP system ?
> > I mean OMAP4, any CPU can make this call ?
> 
> Thats a good question. I think Tero did this so he can kick in
> voltage transitions at the right time in idle/suspend.
> Given that these deal with incrementing/decrementing the MPU and CORE
> pwrdm usecounts alone, maybe on OMAP4 (SMP systems) this needs to also
> increment/decrement the specific CPU usecounts on the CPUs these calls
> are made.

Yeah, you should keep the usecounts valid by each cpu separately calling
these functions. My current set only sets these usecounts based on cpu0
activity, as cpu1 is statically controlled through cpu online / offline.
Once per-cpu cpuidle is in, these should be changed so that each
individual cpu increases the usecounts when they are brought up,
decrease/increase during idle, and decrease when they are brought down.
The usecount should always reflect the number of CPUs active on MPU
domain.

-Tero


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Santosh Shilimkar July 20, 2012, 11:54 a.m. UTC | #4
On Fri, Jul 20, 2012 at 2:21 PM, Tero Kristo <t-kristo@ti.com> wrote:
>
> On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
> > On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
> > > On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>
> > > wrote:
> > >> pwrdm_pre_transition()/pwrdm_post_transition() have always been high
> > >> latency
> > >> operations done within cpuidle to do Powerdomain level book-keeping
> > >> to know
> > >> what state transitions for different Powerdomains have been
> > >> triggered.
> > >> This is also useful to do a restore-on-demand in some cases when we
> > >> know
> > >> the context for the given Powerdomain was lost etc.
> > >>
> > >> Now that we have definitive entry/exit points (thanks to the
> > >> Powerdomain
> > >> level usecounting) for Powerdomain transitions, these book-keeping
> > >> functions
> > >> can very well be moved from within CPUidle into
> > >> pwrdm_clkdm_enable()/pwrdm_
> > >> clkdm_disable() functions.
> > >>
> > >> Also rename _pwrdm_pre/post_transition_cb() to
> > >> pwrdm_pre/post_transition()
> > >> and get rid of the original ones which iterate over all powerdomains.
> > >>
> > >> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
> > >> ---
> > >>   arch/arm/mach-omap2/omap-mpuss-lowpower.c |    4 ++--
> > >>   arch/arm/mach-omap2/pm34xx.c              |    4 ++--
> > >>   arch/arm/mach-omap2/powerdomain.c         |   28
> > >> ++++++++--------------------
> > >>   arch/arm/mach-omap2/powerdomain.h         |    4 ++--
> > >>   4 files changed, 14 insertions(+), 26 deletions(-)
> > >>
> > >> diff --git a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> > >> b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> > >> index 13670aa..ea19439 100644
> > >> --- a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> > >> +++ b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
> > >> @@ -255,7 +255,7 @@ int omap4_enter_lowpower(unsigned int cpu,
> > >> unsigned int power_state)
> > >>                  return -ENXIO;
> > >>          }
> > >>
> > >> -       pwrdm_pre_transition();
> > >> +       pwrdm_cpu_idle();
> > >>
> > > Glad to see this is getting optimized.
> > > I haven't seen how "pwrdm_cpu_[idle/wakeup]()" is
> > > implemented but will those work on SMP system ?
> > > I mean OMAP4, any CPU can make this call ?
> >
> > Thats a good question. I think Tero did this so he can kick in
> > voltage transitions at the right time in idle/suspend.
> > Given that these deal with incrementing/decrementing the MPU and CORE
> > pwrdm usecounts alone, maybe on OMAP4 (SMP systems) this needs to also
> > increment/decrement the specific CPU usecounts on the CPUs these calls
> > are made.
>
> Yeah, you should keep the usecounts valid by each cpu separately calling
> these functions. My current set only sets these usecounts based on cpu0
> activity, as cpu1 is statically controlled through cpu online / offline.
> Once per-cpu cpuidle is in, these should be changed so that each
> individual cpu increases the usecounts when they are brought up,
> decrease/increase during idle, and decrease when they are brought down.
> The usecount should always reflect the number of CPUs active on MPU
> domain.
>
Sounds good to me !!

Regards
santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kevin Hilman July 25, 2012, 10:43 p.m. UTC | #5
Tero Kristo <t-kristo@ti.com> writes:

> On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
>> On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
>> > On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>  wrote:
>> >> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
>> >> operations done within cpuidle to do Powerdomain level book-keeping to know
>> >> what state transitions for different Powerdomains have been triggered.
>> >> This is also useful to do a restore-on-demand in some cases when we know
>> >> the context for the given Powerdomain was lost etc.
>> >>
>> >> Now that we have definitive entry/exit points (thanks to the Powerdomain
>> >> level usecounting) for Powerdomain transitions, these book-keeping functions
>> >> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
>> >> clkdm_disable() functions.
>> >>
>> >> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
>> >> and get rid of the original ones which iterate over all powerdomains.
>> >>
>> >> Signed-off-by: Rajendra Nayak<rnayak@ti.com>

This is excellent!   Thanks for working on this.

However, it needs a rebase against mainline though because I merged a
set of optimizations[1] to this code already that only calls pre/post
per-pwrdm.

[...]

>> > Glad to see this is getting optimized.
>> > I haven't seen how "pwrdm_cpu_[idle/wakeup]()" is
>> > implemented but will those work on SMP system ?
>> > I mean OMAP4, any CPU can make this call ?
>> 
>> Thats a good question. I think Tero did this so he can kick in
>> voltage transitions at the right time in idle/suspend.
>> Given that these deal with incrementing/decrementing the MPU and CORE
>> pwrdm usecounts alone, maybe on OMAP4 (SMP systems) this needs to also
>> increment/decrement the specific CPU usecounts on the CPUs these calls
>> are made.
>
> Yeah, you should keep the usecounts valid by each cpu separately calling
> these functions. My current set only sets these usecounts based on cpu0
> activity, as cpu1 is statically controlled through cpu online / offline.
> Once per-cpu cpuidle is in, these should be changed so that each
> individual cpu increases the usecounts when they are brought up,
> decrease/increase during idle, and decrease when they are brought down.
> The usecount should always reflect the number of CPUs active on MPU
> domain.

Coupled CPUidle is merging for v3.6 (hopefully OMAP support too), so
this should be addressed sooner rather than later.

Kevin

[1] Specifically, see:
commit 58f0829b7186150318c79515f0e0850c5e7a9c89
Author: Kevin Hilman <khilman@ti.com>
Date:   Fri May 11 15:47:17 2012 -0700

    ARM: OMAP3: PM: call pre/post transition per powerdomain
    
    We only need to call the pre/post transtion methods when we know the
    power state is changing.  First, split up the pre/post transition
    calls to be per-powerdomain, and then make them conditional on whether
    the power domain is actually changing states.
    
    Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
    Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
    Tested-by: Grazvydas Ignotas <notasas@gmail.com>
    Signed-off-by: Kevin Hilman <khilman@ti.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 26, 2012, 11:43 a.m. UTC | #6
On Thursday 26 July 2012 04:13 AM, Kevin Hilman wrote:
> Tero Kristo<t-kristo@ti.com>  writes:
>
>> >  On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
>>> >>  On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
>>>> >>  >  On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>   wrote:
>>>>> >>  >>  pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
>>>>> >>  >>  operations done within cpuidle to do Powerdomain level book-keeping to know
>>>>> >>  >>  what state transitions for different Powerdomains have been triggered.
>>>>> >>  >>  This is also useful to do a restore-on-demand in some cases when we know
>>>>> >>  >>  the context for the given Powerdomain was lost etc.
>>>>> >>  >>
>>>>> >>  >>  Now that we have definitive entry/exit points (thanks to the Powerdomain
>>>>> >>  >>  level usecounting) for Powerdomain transitions, these book-keeping functions
>>>>> >>  >>  can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
>>>>> >>  >>  clkdm_disable() functions.
>>>>> >>  >>
>>>>> >>  >>  Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
>>>>> >>  >>  and get rid of the original ones which iterate over all powerdomains.
>>>>> >>  >>
>>>>> >>  >>  Signed-off-by: Rajendra Nayak<rnayak@ti.com>
> This is excellent!   Thanks for working on this.
>
> However, it needs a rebase against mainline though because I merged a
> set of optimizations[1] to this code already that only calls pre/post
> per-pwrdm.

Sure Kevin, I'll repost this series once 3.6-rc1 is out.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 26, 2012, 12:42 p.m. UTC | #7
On Thursday 26 July 2012 04:13 AM, Kevin Hilman wrote:
> Tero Kristo<t-kristo@ti.com>  writes:
>
>> On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
>>> On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
>>>> On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>   wrote:
>>>>> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
>>>>> operations done within cpuidle to do Powerdomain level book-keeping to know
>>>>> what state transitions for different Powerdomains have been triggered.
>>>>> This is also useful to do a restore-on-demand in some cases when we know
>>>>> the context for the given Powerdomain was lost etc.
>>>>>
>>>>> Now that we have definitive entry/exit points (thanks to the Powerdomain
>>>>> level usecounting) for Powerdomain transitions, these book-keeping functions
>>>>> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
>>>>> clkdm_disable() functions.
>>>>>
>>>>> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
>>>>> and get rid of the original ones which iterate over all powerdomains.
>>>>>
>>>>> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
>
> This is excellent!   Thanks for working on this.
>
> However, it needs a rebase against mainline though because I merged a
> set of optimizations[1] to this code already that only calls pre/post
> per-pwrdm.
>

Hi Kevin,

I thought some more on this patch, and I think this way of collecting
stats and knowing what state transitions the powerdomains been through
will not work on OMAP3, mainly because of the autodeps. Might work on
OMAP4 and beyond which do not need any autodeps.

Here why I think so,
Lets assume a Powerdomain X with a last module Y active, once Y disables
the last clock (lets assume no hardware controlled clocks for
simplicity), we clear the last power state register for X. However
due to autodeps X does not transition to a target state immediately.
It only does so when the MPU (and IVA) go down, and because
of the wakeup dependency (autodeps set a sleep and a wakeup dep with
both MPU and IVA) is also woken up every time MPU or IVA are up.
So its quite possible that X transitions in and out of sleep multiple
times before Y decides to re-enable its clocks, which is when we end up
looking for the last power state entered.
Lets say X entered OFF 3 times in between Y disabling and re-enabling
its clocks. Though we end up updating the counter only once (instead of
3) we still know and can tell Y that it lost context.
The problem however arises if for some reason X entered OFF
twice and happened to stay ON the third time the dependencies were met.
When Y re-enables its clocks, we end up telling it that it *did not*
lose context because we see the previous power state was ON.

I think as long as we have autodeps, the only way on OMAP3 to accurately
do this is to do it for all dependent domains in CPUIdle :(

regards,
Rajendra

PS: In theory maybe there is still a possibility to miss some domain 
transitions even with the current way of doing it in CPUidle. Its quite
possible that while MPU is asleep, IVA goes in and out of sleep multiple
times causing all dependent domains to also wakeup and sleep :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kevin Hilman July 26, 2012, 5:44 p.m. UTC | #8
Rajendra Nayak <rnayak@ti.com> writes:

> On Thursday 26 July 2012 04:13 AM, Kevin Hilman wrote:
>> Tero Kristo<t-kristo@ti.com>  writes:
>>
>>> On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
>>>> On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
>>>>> On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>   wrote:
>>>>>> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
>>>>>> operations done within cpuidle to do Powerdomain level book-keeping to know
>>>>>> what state transitions for different Powerdomains have been triggered.
>>>>>> This is also useful to do a restore-on-demand in some cases when we know
>>>>>> the context for the given Powerdomain was lost etc.
>>>>>>
>>>>>> Now that we have definitive entry/exit points (thanks to the Powerdomain
>>>>>> level usecounting) for Powerdomain transitions, these book-keeping functions
>>>>>> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
>>>>>> clkdm_disable() functions.
>>>>>>
>>>>>> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
>>>>>> and get rid of the original ones which iterate over all powerdomains.
>>>>>>
>>>>>> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
>>
>> This is excellent!   Thanks for working on this.
>>
>> However, it needs a rebase against mainline though because I merged a
>> set of optimizations[1] to this code already that only calls pre/post
>> per-pwrdm.
>>
>
> Hi Kevin,
>
> I thought some more on this patch, and I think this way of collecting
> stats and knowing what state transitions the powerdomains been through
> will not work on OMAP3, mainly because of the autodeps. Might work on
> OMAP4 and beyond which do not need any autodeps.
>
> Here why I think so,
> Lets assume a Powerdomain X with a last module Y active, once Y disables
> the last clock (lets assume no hardware controlled clocks for
> simplicity), we clear the last power state register for X. However
> due to autodeps X does not transition to a target state immediately.
> It only does so when the MPU (and IVA) go down, and because
> of the wakeup dependency (autodeps set a sleep and a wakeup dep with
> both MPU and IVA) is also woken up every time MPU or IVA are up.
> So its quite possible that X transitions in and out of sleep multiple
> times before Y decides to re-enable its clocks, which is when we end up
> looking for the last power state entered.
> Lets say X entered OFF 3 times in between Y disabling and re-enabling
> its clocks. Though we end up updating the counter only once (instead of
> 3) we still know and can tell Y that it lost context.
> The problem however arises if for some reason X entered OFF
> twice and happened to stay ON the third time the dependencies were met.
> When Y re-enables its clocks, we end up telling it that it *did not*
> lose context because we see the previous power state was ON.

Yeah, this is definitely a problem.

As long as we have autodeps, everything is centralized around CPU
transitions anyways, so it makes sense to keep the accounting
centralized too.

> I think as long as we have autodeps, the only way on OMAP3 to accurately
> do this is to do it for all dependent domains in CPUIdle :(

Or, to get rid of autodeps. ;)

Kevin
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tero Kristo July 26, 2012, 6:27 p.m. UTC | #9
On Thu, 2012-07-26 at 10:44 -0700, Kevin Hilman wrote:
> Rajendra Nayak <rnayak@ti.com> writes:
> 
> > On Thursday 26 July 2012 04:13 AM, Kevin Hilman wrote:
> >> Tero Kristo<t-kristo@ti.com>  writes:
> >>
> >>> On Fri, 2012-07-20 at 13:38 +0530, Rajendra Nayak wrote:
> >>>> On Friday 20 July 2012 12:55 PM, Shilimkar, Santosh wrote:
> >>>>> On Fri, Jul 20, 2012 at 11:34 AM, Rajendra Nayak<rnayak@ti.com>   wrote:
> >>>>>> pwrdm_pre_transition()/pwrdm_post_transition() have always been high latency
> >>>>>> operations done within cpuidle to do Powerdomain level book-keeping to know
> >>>>>> what state transitions for different Powerdomains have been triggered.
> >>>>>> This is also useful to do a restore-on-demand in some cases when we know
> >>>>>> the context for the given Powerdomain was lost etc.
> >>>>>>
> >>>>>> Now that we have definitive entry/exit points (thanks to the Powerdomain
> >>>>>> level usecounting) for Powerdomain transitions, these book-keeping functions
> >>>>>> can very well be moved from within CPUidle into pwrdm_clkdm_enable()/pwrdm_
> >>>>>> clkdm_disable() functions.
> >>>>>>
> >>>>>> Also rename _pwrdm_pre/post_transition_cb() to pwrdm_pre/post_transition()
> >>>>>> and get rid of the original ones which iterate over all powerdomains.
> >>>>>>
> >>>>>> Signed-off-by: Rajendra Nayak<rnayak@ti.com>
> >>
> >> This is excellent!   Thanks for working on this.
> >>
> >> However, it needs a rebase against mainline though because I merged a
> >> set of optimizations[1] to this code already that only calls pre/post
> >> per-pwrdm.
> >>
> >
> > Hi Kevin,
> >
> > I thought some more on this patch, and I think this way of collecting
> > stats and knowing what state transitions the powerdomains been through
> > will not work on OMAP3, mainly because of the autodeps. Might work on
> > OMAP4 and beyond which do not need any autodeps.
> >
> > Here why I think so,
> > Lets assume a Powerdomain X with a last module Y active, once Y disables
> > the last clock (lets assume no hardware controlled clocks for
> > simplicity), we clear the last power state register for X. However
> > due to autodeps X does not transition to a target state immediately.
> > It only does so when the MPU (and IVA) go down, and because
> > of the wakeup dependency (autodeps set a sleep and a wakeup dep with
> > both MPU and IVA) is also woken up every time MPU or IVA are up.
> > So its quite possible that X transitions in and out of sleep multiple
> > times before Y decides to re-enable its clocks, which is when we end up
> > looking for the last power state entered.
> > Lets say X entered OFF 3 times in between Y disabling and re-enabling
> > its clocks. Though we end up updating the counter only once (instead of
> > 3) we still know and can tell Y that it lost context.
> > The problem however arises if for some reason X entered OFF
> > twice and happened to stay ON the third time the dependencies were met.
> > When Y re-enables its clocks, we end up telling it that it *did not*
> > lose context because we see the previous power state was ON.
> 
> Yeah, this is definitely a problem.
> 
> As long as we have autodeps, everything is centralized around CPU
> transitions anyways, so it makes sense to keep the accounting
> centralized too.
> 
> > I think as long as we have autodeps, the only way on OMAP3 to accurately
> > do this is to do it for all dependent domains in CPUIdle :(
> 
> Or, to get rid of autodeps. ;)

Whats the reason for having them anyway? Some of the wakedeps make sense
(per domain due to hw bugs) but sleepdeps...?

-Tero

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley July 26, 2012, 8:50 p.m. UTC | #10
On Thu, 26 Jul 2012, Tero Kristo wrote:

> On Thu, 2012-07-26 at 10:44 -0700, Kevin Hilman wrote:
> 
> > Or, to get rid of autodeps. ;)
> 
> Whats the reason for having them anyway?

No one has yet posted tested patches to convert OMAP2/3 to use the IP 
block-based clockdomain enable sequence.  A few folks posted some patches 
for this in 2010 but the patches didn't boot.

It should be a little easier to try this now since we have more data 
defined now than we did in the past.

> Some of the wakedeps make sense (per domain due to hw bugs) but 
> sleepdeps...?

That sounds like a different issue.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 27, 2012, 6:46 a.m. UTC | #11
Hi Tero,

On Thursday 26 July 2012 11:57 PM, Tero Kristo wrote:
>> Yeah, this is definitely a problem.
>> >
>> >  As long as we have autodeps, everything is centralized around CPU
>> >  transitions anyways, so it makes sense to keep the accounting
>> >  centralized too.
>> >
>>> >  >  I think as long as we have autodeps, the only way on OMAP3 to accurately
>>> >  >  do this is to do it for all dependent domains in CPUIdle:(
>> >
>> >  Or, to get rid of autodeps.;)
> Whats the reason for having them anyway? Some of the wakedeps make sense
> (per domain due to hw bugs) but sleepdeps...?
>

FWIK, the main problem is with modules with clocks under hardware
control. Once a slave module isn't functional and there are no
outstanding OCP accesses pending, the module when sent an IDLEREQ
by PRCM responds with an IDLEACK causing the clkdm to transition
to INACTIVE. A driver which is active can still in the meantime
cause a register access to the module, but the register access
does not act as an event which makes PRCM de-assert IDLEREQ causing
the module to unidle. This causes problems. So as long as there is
a possibility of some code doing a register access on the module
we need to keep it from idling. IIRC the issues we saw on OMAP3
were mostly around PER domain and I think with GPIO in particular.

The problem might be limited to modules with _only_ hardware controlled
clocks (iclks) like GPIO. For others which have an iclk and fclk, we can
always keep the autodeps while fclks are requested and get rid of them
when fclks are disabled. This is exactly what clkdm_clk_enable/disable
functions do, but given in the current mainline even iclks account
for usecounting, a clkdms usecount would never hit 0 causing 
clkdm_clk_disable never to be called. (On clkdms which have atleast one
iclk under hardware control).
hwmod keeps all iclks always enabled and under hardware control unless
the OCP interface has a 'OCPIF_SWSUP_IDLE' flag set.

But now with your series, which does not account iclks for usecounting,
I would expect this to be fixed. I was expecting for modules with both
fclk and iclk, as long as the driver has done a pm_runtime_get_sync()
or equivalent the autodeps would be set, and once the driver does a
pm_runtime_put_sync() the autodeps would be removed. This would also
be the same time we clear the Powerdomains previous power state register
and the Powerdomain should ideally immediately transition on not go in
and out with every MPU transition.
So this kind of rules out the problems that my theory above was
suggesting we would have with the $subject patch.
We still have to deal with the iclk only modules like GPIO though.

However a quick test with just your latest usecounting series (without
any of my RFC patches) seems to make me think I am still missing
something.

If you see the counts below for usbhost and dss, they both seem to
go in and out of RET with every MPU transition. Which means the
dependencies are still set.

# cat /debug/pm_debug/count
usbhost_pwrdm 
(ON),OFF:0,RET:138,INA:0,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
sgx_pwrdm (OFF),OFF:1,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
core_pwrdm 
(ON),OFF:0,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0
per_pwrdm (ON),OFF:0,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
dss_pwrdm (ON),OFF:0,RET:138,INA:0,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
cam_pwrdm (RET),OFF:0,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
neon_pwrdm (ON),OFF:0,RET:132,INA:6,ON:139,RET-LOGIC-OFF:0
mpu_pwrdm (ON),OFF:0,RET:132,INA:6,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
iva2_pwrdm 
(RET),OFF:0,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0,RET-MEMBANK3-OFF:0,RET-MEMBANK4-OFF:0

However if I look at the dss registers, I don;t see any fclks are
enabled.

# ./readmem 0x48004E00
0x00000000 <- All FCLK disabled.

# ./readmem 0x48004E10
0x00000001 <- ICLK enabled

# ./readmem 0x48004E44
0x00000006 <- dependencies are set with MPU and IVA

# ./readmem 0x48004E48
0x00000003 <- clkdm is under HWSUP.

Any idea why this could be happening?

regards,
Rajendra




--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rajendra Nayak July 27, 2012, 7:43 a.m. UTC | #12
Hi Tero,

On Friday 27 July 2012 12:16 PM, Rajendra Nayak wrote:
> However a quick test with just your latest usecounting series (without
> any of my RFC patches) seems to make me think I am still missing
> something.
>
> If you see the counts below for usbhost and dss, they both seem to
> go in and out of RET with every MPU transition. Which means the
> dependencies are still set.
>
> # cat /debug/pm_debug/count
> usbhost_pwrdm
> (ON),OFF:0,RET:138,INA:0,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> sgx_pwrdm (OFF),OFF:1,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> core_pwrdm
> (ON),OFF:0,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0
>
> per_pwrdm (ON),OFF:0,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> dss_pwrdm
> (ON),OFF:0,RET:138,INA:0,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> cam_pwrdm (RET),OFF:0,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> neon_pwrdm (ON),OFF:0,RET:132,INA:6,ON:139,RET-LOGIC-OFF:0
> mpu_pwrdm
> (ON),OFF:0,RET:132,INA:6,ON:139,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0
> iva2_pwrdm
> (RET),OFF:0,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0,RET-MEMBANK3-OFF:0,RET-MEMBANK4-OFF:0
>
>
> However if I look at the dss registers, I don;t see any fclks are
> enabled.
>
> # ./readmem 0x48004E00
> 0x00000000 <- All FCLK disabled.
>
> # ./readmem 0x48004E10
> 0x00000001 <- ICLK enabled
>
> # ./readmem 0x48004E44
> 0x00000006 <- dependencies are set with MPU and IVA
>
> # ./readmem 0x48004E48
> 0x00000003 <- clkdm is under HWSUP.
>
> Any idea why this could be happening?

I think I know what the problem is now.

omap3_pm_init calls clkdm_allow_idle for all clkdms, while autoidle on
all iclks is still disabled. This causes autodeps to be set as the iclks
are accounted for in the usecount.
(hwmod would have done a clk_enable() on the iclks and left them enabled
as OCPIF_SWSUP_IDLE isn't set)

static void omap3_clkdm_allow_idle(struct clockdomain *clkdm)
{
         if (atomic_read(&clkdm->usecount) > 0)
                 _clkdm_add_autodeps(clkdm);

         omap3xxx_cm_clkdm_enable_hwsup(clkdm->pwrdm.ptr->prcm_offs,
                                 clkdm->clktrctrl_mask);
}

A little later after omap3_pm_init, we enable autoidle for all iclks.
omap2_clkt_iclk_allow_idle decrements the usecount, but leaves the
autodeps still set.

This seems to be causing the dss and usb to also transition along with
MPU.

We will need some way to also clear and set autodeps in
omap2_clkt_iclk_allow_idle/deny_idle.

regards,
Rajendra




--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/arm/mach-omap2/omap-mpuss-lowpower.c b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
index 13670aa..ea19439 100644
--- a/arch/arm/mach-omap2/omap-mpuss-lowpower.c
+++ b/arch/arm/mach-omap2/omap-mpuss-lowpower.c
@@ -255,7 +255,7 @@  int omap4_enter_lowpower(unsigned int cpu, unsigned int power_state)
 		return -ENXIO;
 	}
 
-	pwrdm_pre_transition();
+	pwrdm_cpu_idle();
 
 	/*
 	 * Check MPUSS next state and save interrupt controller if needed.
@@ -287,7 +287,7 @@  int omap4_enter_lowpower(unsigned int cpu, unsigned int power_state)
 	wakeup_cpu = smp_processor_id();
 	set_cpu_next_pwrst(wakeup_cpu, PWRDM_POWER_ON);
 
-	pwrdm_post_transition();
+	pwrdm_cpu_wakeup();
 
 	return 0;
 }
diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 8d96b1f..9bdf53c 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -290,7 +290,7 @@  void omap_sram_idle(void)
 			omap3_enable_io_chain();
 	}
 
-	pwrdm_pre_transition();
+	pwrdm_cpu_idle();
 
 	/* PER */
 	if (per_next_state < PWRDM_POWER_ON) {
@@ -355,7 +355,7 @@  void omap_sram_idle(void)
 	}
 	omap3_intc_resume_idle();
 
-	pwrdm_post_transition();
+	pwrdm_cpu_wakeup();
 
 	/* PER */
 	if (per_next_state < PWRDM_POWER_ON)
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 39a45a9..f435819 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -187,14 +187,14 @@  static int _pwrdm_state_switch(struct powerdomain *pwrdm, int flag)
 	return 0;
 }
 
-static int _pwrdm_pre_transition_cb(struct powerdomain *pwrdm, void *unused)
+int pwrdm_pre_transition(struct powerdomain *pwrdm)
 {
 	pwrdm_clear_all_prev_pwrst(pwrdm);
 	_pwrdm_state_switch(pwrdm, PWRDM_STATE_NOW);
 	return 0;
 }
 
-static int _pwrdm_post_transition_cb(struct powerdomain *pwrdm, void *unused)
+int pwrdm_post_transition(struct powerdomain *pwrdm)
 {
 	_pwrdm_state_switch(pwrdm, PWRDM_STATE_PREV);
 	return 0;
@@ -995,8 +995,10 @@  void pwrdm_clkdm_enable(struct powerdomain *pwrdm)
 	if (!pwrdm)
 		return;
 
-	if (atomic_inc_return(&pwrdm->usecount) == 1)
+	if (atomic_inc_return(&pwrdm->usecount) == 1) {
+		pwrdm_post_transition(pwrdm);
 		voltdm_pwrdm_enable(pwrdm->voltdm.ptr);
+	}
 }
 
 /**
@@ -1015,28 +1017,14 @@  void pwrdm_clkdm_disable(struct powerdomain *pwrdm)
 
 	val = atomic_dec_return(&pwrdm->usecount);
 
-	if (!val)
+	if (!val) {
 		voltdm_pwrdm_disable(pwrdm->voltdm.ptr);
+		pwrdm_pre_transition(pwrdm);
+	}
 
 	BUG_ON(val < 0);
 }
 
-int pwrdm_pre_transition(void)
-{
-	pwrdm_for_each(_pwrdm_pre_transition_cb, NULL);
-	/* Decrease mpu / core usecounts to indicate we are entering idle */
-	pwrdm_cpu_idle();
-	return 0;
-}
-
-int pwrdm_post_transition(void)
-{
-	/* Increase mpu / core usecounts to indicate we are leaving idle */
-	pwrdm_cpu_wakeup();
-	pwrdm_for_each(_pwrdm_post_transition_cb, NULL);
-	return 0;
-}
-
 /**
  * pwrdm_get_context_loss_count - get powerdomain's context loss count
  * @pwrdm: struct powerdomain * to wait for
diff --git a/arch/arm/mach-omap2/powerdomain.h b/arch/arm/mach-omap2/powerdomain.h
index ecf7d3d..52a99cf 100644
--- a/arch/arm/mach-omap2/powerdomain.h
+++ b/arch/arm/mach-omap2/powerdomain.h
@@ -214,8 +214,8 @@  bool pwrdm_has_hdwr_sar(struct powerdomain *pwrdm);
 int pwrdm_wait_transition(struct powerdomain *pwrdm);
 
 int pwrdm_state_switch(struct powerdomain *pwrdm);
-int pwrdm_pre_transition(void);
-int pwrdm_post_transition(void);
+int pwrdm_pre_transition(struct powerdomain *pwrdm);
+int pwrdm_post_transition(struct powerdomain *pwrdm);
 
 void pwrdm_clkdm_enable(struct powerdomain *pwrdm);
 void pwrdm_clkdm_disable(struct powerdomain *pwrdm);