diff mbox series

[v2,4/4] cpuidle: psci: Allow WFI to be the only state for the hierarchical topology

Message ID 20200303203559.23995-5-ulf.hansson@linaro.org (mailing list archive)
State Changes Requested, archived
Headers show
Series cpuidle: psci: Some fixes when using the hierarchical layout | expand

Commit Message

Ulf Hansson March 3, 2020, 8:35 p.m. UTC
It's possible that only the WFI state is supported for the CPU, while also
a shared idle state exists for a group of CPUs.

When the hierarchical topology is used, the shared idle state may not be
compatible with arm,idle-state, rather with "domain-idle-state", which
makes dt_init_idle_driver() to return zero. This leads to that the
cpuidle-psci driver bails out during initialization, avoiding to register a
cpuidle driver and instead relies on the default architectural back-end
(called via cpu_do_idle()). In other words, the shared idle state becomes
unused.

Let's fix this behaviour, by allowing the dt_init_idle_driver() to return 0
and then continue with the initialization. If it turns out that the
hierarchical topology is used and we have some additional states to manage,
then continue with the cpuidle driver registration, otherwise bail out as
before.

Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
	- Convert the error code returned from psci_cpu_suspend_enter() into an
	expected error code by cpuidle core.

---
 drivers/cpuidle/cpuidle-psci.c | 48 +++++++++++++++++++++-------------
 1 file changed, 30 insertions(+), 18 deletions(-)

Comments

Sudeep Holla March 4, 2020, 12:23 p.m. UTC | #1
The $subject is bit confusing. IIUC, if there are no idle states to
manage including hierarchical domain states you will not register the driver
right ? If so, you are not allowing WFI to be the only state, hence my
concern with $subject.

On Tue, Mar 03, 2020 at 09:35:59PM +0100, Ulf Hansson wrote:
> It's possible that only the WFI state is supported for the CPU, while also
> a shared idle state exists for a group of CPUs.
>
> When the hierarchical topology is used, the shared idle state may not be
> compatible with arm,idle-state, rather with "domain-idle-state", which
> makes dt_init_idle_driver() to return zero. This leads to that the
> cpuidle-psci driver bails out during initialization, avoiding to register a
> cpuidle driver and instead relies on the default architectural back-end
> (called via cpu_do_idle()). In other words, the shared idle state becomes
> unused.
>
> Let's fix this behaviour, by allowing the dt_init_idle_driver() to return 0
> and then continue with the initialization. If it turns out that the
> hierarchical topology is used and we have some additional states to manage,
> then continue with the cpuidle driver registration, otherwise bail out as
> before.
>
> Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
> Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd")
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>
> Changes in v2:
> 	- Convert the error code returned from psci_cpu_suspend_enter() into an
> 	expected error code by cpuidle core.
>
> ---
>  drivers/cpuidle/cpuidle-psci.c | 48 +++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index bae9140a65a5..ae0fabec2742 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -56,16 +56,19 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
>  	u32 *states = data->psci_states;
>  	struct device *pd_dev = data->dev;
>  	u32 state;
> -	int ret;
> +	int ret = 0;
>
>  	/* Do runtime PM to manage a hierarchical CPU toplogy. */
>  	pm_runtime_put_sync_suspend(pd_dev);
>
>  	state = psci_get_domain_state();
> -	if (!state)
> +	if (!state && states)
>  		state = states[idx];
>
> -	ret = psci_enter_state(idx, state);
> +	if (state)
> +		ret = psci_cpu_suspend_enter(state) ? -1 : idx;
> +	else
> +		cpu_do_idle();

May be, I haven't followed this completely yet, but I don't want to be
in the position to replicated default arch idle hook. Just use the one
that exist by simply not registering the driver.

--
Regards,
Sudeep
Ulf Hansson March 5, 2020, 2:17 p.m. UTC | #2
On Wed, 4 Mar 2020 at 13:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> The $subject is bit confusing. IIUC, if there are no idle states to
> manage including hierarchical domain states you will not register the driver
> right ? If so, you are not allowing WFI to be the only state, hence my
> concern with $subject.

I agree that's not so clear, but it wasn't easy to fit everything I
wanted to say in one line. :-)

Is this below better and okay for you?

"cpuidle: psci: Update condition when avoiding driver registration".

>
> On Tue, Mar 03, 2020 at 09:35:59PM +0100, Ulf Hansson wrote:
> > It's possible that only the WFI state is supported for the CPU, while also
> > a shared idle state exists for a group of CPUs.
> >
> > When the hierarchical topology is used, the shared idle state may not be
> > compatible with arm,idle-state, rather with "domain-idle-state", which
> > makes dt_init_idle_driver() to return zero. This leads to that the
> > cpuidle-psci driver bails out during initialization, avoiding to register a
> > cpuidle driver and instead relies on the default architectural back-end
> > (called via cpu_do_idle()). In other words, the shared idle state becomes
> > unused.
> >
> > Let's fix this behaviour, by allowing the dt_init_idle_driver() to return 0
> > and then continue with the initialization. If it turns out that the
> > hierarchical topology is used and we have some additional states to manage,
> > then continue with the cpuidle driver registration, otherwise bail out as
> > before.
> >
> > Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
> > Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd")
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v2:
> >       - Convert the error code returned from psci_cpu_suspend_enter() into an
> >       expected error code by cpuidle core.
> >
> > ---
> >  drivers/cpuidle/cpuidle-psci.c | 48 +++++++++++++++++++++-------------
> >  1 file changed, 30 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > index bae9140a65a5..ae0fabec2742 100644
> > --- a/drivers/cpuidle/cpuidle-psci.c
> > +++ b/drivers/cpuidle/cpuidle-psci.c
> > @@ -56,16 +56,19 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> >       u32 *states = data->psci_states;
> >       struct device *pd_dev = data->dev;
> >       u32 state;
> > -     int ret;
> > +     int ret = 0;
> >
> >       /* Do runtime PM to manage a hierarchical CPU toplogy. */
> >       pm_runtime_put_sync_suspend(pd_dev);
> >
> >       state = psci_get_domain_state();
> > -     if (!state)
> > +     if (!state && states)
> >               state = states[idx];
> >
> > -     ret = psci_enter_state(idx, state);
> > +     if (state)
> > +             ret = psci_cpu_suspend_enter(state) ? -1 : idx;
> > +     else
> > +             cpu_do_idle();
>
> May be, I haven't followed this completely yet, but I don't want to be
> in the position to replicated default arch idle hook. Just use the one
> that exist by simply not registering the driver.

That doesn't work for the configuration I am solving.

Assume this scenario: We have WFI and a domain/cluster idle state.
From the cpuidle governor point of view, it always selects the WFI
state, which means idx is zero.

Then, after we have called pm_runtime_put_sync_suspend() a few lines
above, we may potentially have a "domain state" to use, instead of the
WFI state.

In this case, if we would have called psci_enter_state(), that would
lead us to calling cpu_do_idle() from the __CPU_PM_CPU_IDLE_ENTER()
macro, becuase idx is zero. In other words, the domain state would
become unused.

Hope this clarifies what goes on here?

Kind regards
Uffe
Sudeep Holla March 5, 2020, 4:23 p.m. UTC | #3
On Thu, Mar 05, 2020 at 03:17:42PM +0100, Ulf Hansson wrote:
> On Wed, 4 Mar 2020 at 13:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > The $subject is bit confusing. IIUC, if there are no idle states to
> > manage including hierarchical domain states you will not register the driver
> > right ? If so, you are not allowing WFI to be the only state, hence my
> > concern with $subject.
>
> I agree that's not so clear, but it wasn't easy to fit everything I
> wanted to say in one line. :-)
>

No worries, just wanted to clarified. Looking at the patch, lot of things
got clarified but thought we can always improve.

> Is this below better and okay for you?
>
> "cpuidle: psci: Update condition when avoiding driver registration".
>

Definitely better than $subject :)

> >
> > On Tue, Mar 03, 2020 at 09:35:59PM +0100, Ulf Hansson wrote:
> > > It's possible that only the WFI state is supported for the CPU, while also
> > > a shared idle state exists for a group of CPUs.
> > >
> > > When the hierarchical topology is used, the shared idle state may not be
> > > compatible with arm,idle-state, rather with "domain-idle-state", which
> > > makes dt_init_idle_driver() to return zero. This leads to that the
> > > cpuidle-psci driver bails out during initialization, avoiding to register a
> > > cpuidle driver and instead relies on the default architectural back-end
> > > (called via cpu_do_idle()). In other words, the shared idle state becomes
> > > unused.
> > >
> > > Let's fix this behaviour, by allowing the dt_init_idle_driver() to return 0
> > > and then continue with the initialization. If it turns out that the
> > > hierarchical topology is used and we have some additional states to manage,
> > > then continue with the cpuidle driver registration, otherwise bail out as
> > > before.
> > >
> > > Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
> > > Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd")
> > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > ---
> > >
> > > Changes in v2:
> > >       - Convert the error code returned from psci_cpu_suspend_enter() into an
> > >       expected error code by cpuidle core.
> > >
> > > ---
> > >  drivers/cpuidle/cpuidle-psci.c | 48 +++++++++++++++++++++-------------
> > >  1 file changed, 30 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > index bae9140a65a5..ae0fabec2742 100644
> > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > @@ -56,16 +56,19 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> > >       u32 *states = data->psci_states;
> > >       struct device *pd_dev = data->dev;
> > >       u32 state;
> > > -     int ret;
> > > +     int ret = 0;
> > >
> > >       /* Do runtime PM to manage a hierarchical CPU toplogy. */
> > >       pm_runtime_put_sync_suspend(pd_dev);
> > >
> > >       state = psci_get_domain_state();
> > > -     if (!state)
> > > +     if (!state && states)
> > >               state = states[idx];
> > >
> > > -     ret = psci_enter_state(idx, state);
> > > +     if (state)
> > > +             ret = psci_cpu_suspend_enter(state) ? -1 : idx;
> > > +     else
> > > +             cpu_do_idle();
> >
> > May be, I haven't followed this completely yet, but I don't want to be
> > in the position to replicated default arch idle hook. Just use the one
> > that exist by simply not registering the driver.
>
> That doesn't work for the configuration I am solving.
>
> Assume this scenario: We have WFI and a domain/cluster idle state.
> From the cpuidle governor point of view, it always selects the WFI
> state, which means idx is zero.
>

OK. The only state that cluster can enter when CPUs are in WFI are
cluster WFI and most hardware can handle it automatically. I don't see
the need to do any extra work for that.

> Then, after we have called pm_runtime_put_sync_suspend() a few lines
> above, we may potentially have a "domain state" to use, instead of the
> WFI state.
>

Are they any platforms with this potential "domain state" to use with
CPU WFI. I want to understand this better.

> In this case, if we would have called psci_enter_state(), that would
> lead us to calling cpu_do_idle() from the __CPU_PM_CPU_IDLE_ENTER()
> macro, becuase idx is zero. In other words, the domain state would
> become unused.
>

For a domain state to become unused with WFI, it needs to be available
and I am not 100% sure of that.

> Hope this clarifies what goes on here?
>

Yes.

--
Regards,
Sudeep
Ulf Hansson March 6, 2020, 9:28 a.m. UTC | #4
On Thu, 5 Mar 2020 at 17:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Mar 05, 2020 at 03:17:42PM +0100, Ulf Hansson wrote:
> > On Wed, 4 Mar 2020 at 13:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > The $subject is bit confusing. IIUC, if there are no idle states to
> > > manage including hierarchical domain states you will not register the driver
> > > right ? If so, you are not allowing WFI to be the only state, hence my
> > > concern with $subject.
> >
> > I agree that's not so clear, but it wasn't easy to fit everything I
> > wanted to say in one line. :-)
> >
>
> No worries, just wanted to clarified. Looking at the patch, lot of things
> got clarified but thought we can always improve.
>
> > Is this below better and okay for you?
> >
> > "cpuidle: psci: Update condition when avoiding driver registration".
> >
>
> Definitely better than $subject :)

Great, then I switch to that.

>
> > >
> > > On Tue, Mar 03, 2020 at 09:35:59PM +0100, Ulf Hansson wrote:
> > > > It's possible that only the WFI state is supported for the CPU, while also
> > > > a shared idle state exists for a group of CPUs.
> > > >
> > > > When the hierarchical topology is used, the shared idle state may not be
> > > > compatible with arm,idle-state, rather with "domain-idle-state", which
> > > > makes dt_init_idle_driver() to return zero. This leads to that the
> > > > cpuidle-psci driver bails out during initialization, avoiding to register a
> > > > cpuidle driver and instead relies on the default architectural back-end
> > > > (called via cpu_do_idle()). In other words, the shared idle state becomes
> > > > unused.
> > > >
> > > > Let's fix this behaviour, by allowing the dt_init_idle_driver() to return 0
> > > > and then continue with the initialization. If it turns out that the
> > > > hierarchical topology is used and we have some additional states to manage,
> > > > then continue with the cpuidle driver registration, otherwise bail out as
> > > > before.
> > > >
> > > > Reported-by: Benjamin Gaignard <benjamin.gaignard@st.com>
> > > > Fixes: a65a397f2451 ("cpuidle: psci: Add support for PM domains by using genpd")
> > > > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > > > ---
> > > >
> > > > Changes in v2:
> > > >       - Convert the error code returned from psci_cpu_suspend_enter() into an
> > > >       expected error code by cpuidle core.
> > > >
> > > > ---
> > > >  drivers/cpuidle/cpuidle-psci.c | 48 +++++++++++++++++++++-------------
> > > >  1 file changed, 30 insertions(+), 18 deletions(-)
> > > >
> > > > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> > > > index bae9140a65a5..ae0fabec2742 100644
> > > > --- a/drivers/cpuidle/cpuidle-psci.c
> > > > +++ b/drivers/cpuidle/cpuidle-psci.c
> > > > @@ -56,16 +56,19 @@ static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
> > > >       u32 *states = data->psci_states;
> > > >       struct device *pd_dev = data->dev;
> > > >       u32 state;
> > > > -     int ret;
> > > > +     int ret = 0;
> > > >
> > > >       /* Do runtime PM to manage a hierarchical CPU toplogy. */
> > > >       pm_runtime_put_sync_suspend(pd_dev);
> > > >
> > > >       state = psci_get_domain_state();
> > > > -     if (!state)
> > > > +     if (!state && states)
> > > >               state = states[idx];
> > > >
> > > > -     ret = psci_enter_state(idx, state);
> > > > +     if (state)
> > > > +             ret = psci_cpu_suspend_enter(state) ? -1 : idx;
> > > > +     else
> > > > +             cpu_do_idle();
> > >
> > > May be, I haven't followed this completely yet, but I don't want to be
> > > in the position to replicated default arch idle hook. Just use the one
> > > that exist by simply not registering the driver.
> >
> > That doesn't work for the configuration I am solving.
> >
> > Assume this scenario: We have WFI and a domain/cluster idle state.
> > From the cpuidle governor point of view, it always selects the WFI
> > state, which means idx is zero.
> >
>
> OK. The only state that cluster can enter when CPUs are in WFI are
> cluster WFI and most hardware can handle it automatically. I don't see
> the need to do any extra work for that.

This isn't about cluster WFI, but about deeper cluster states, such as
a cluster-clock-gated-state and a cluster-power-off-state. It's an ST
platform, which Benjamin is working on.

>
> > Then, after we have called pm_runtime_put_sync_suspend() a few lines
> > above, we may potentially have a "domain state" to use, instead of the
> > WFI state.
> >
>
> Are they any platforms with this potential "domain state" to use with
> CPU WFI. I want to understand this better.
>
> > In this case, if we would have called psci_enter_state(), that would
> > lead us to calling cpu_do_idle() from the __CPU_PM_CPU_IDLE_ENTER()
> > macro, becuase idx is zero. In other words, the domain state would
> > become unused.
> >
>
> For a domain state to become unused with WFI, it needs to be available
> and I am not 100% sure of that.

With these changes from the series, we can fully conform to the
hierarchical DT bindings for PSCI.

I am not sure I understand your concern, is there a cost involved by
applying this?

Kind regards
Uffe
Sudeep Holla March 6, 2020, 10:04 a.m. UTC | #5
On Fri, Mar 06, 2020 at 10:28:10AM +0100, Ulf Hansson wrote:
> On Thu, 5 Mar 2020 at 17:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >

[...]

> > OK. The only state that cluster can enter when CPUs are in WFI are
> > cluster WFI and most hardware can handle it automatically. I don't see
> > the need to do any extra work for that.
>
> This isn't about cluster WFI, but about deeper cluster states, such as
> a cluster-clock-gated-state and a cluster-power-off-state. It's an ST
> platform, which Benjamin is working on.
>

Then definitely something is completely wrong. You can't enter deeper
cluster states(clock-gated and power-off to be specific) with CPU in
just WFI state. So, if the attempt here is to enter those states, I
disagree with the change.

Benjamin, please share the complete hierarchical topology for your platform.

> >
> > > Then, after we have called pm_runtime_put_sync_suspend() a few lines
> > > above, we may potentially have a "domain state" to use, instead of the
> > > WFI state.
> > >
> >
> > Are they any platforms with this potential "domain state" to use with
> > CPU WFI. I want to understand this better.
> >
> > > In this case, if we would have called psci_enter_state(), that would
> > > lead us to calling cpu_do_idle() from the __CPU_PM_CPU_IDLE_ENTER()
> > > macro, becuase idx is zero. In other words, the domain state would
> > > become unused.
> > >
> >
> > For a domain state to become unused with WFI, it needs to be available
> > and I am not 100% sure of that.
>
> With these changes from the series, we can fully conform to the
> hierarchical DT bindings for PSCI.
>

Theoretically may be, but may not confirm to the hardware states.

> I am not sure I understand your concern, is there a cost involved by
> applying this?
>

Yes as mentioned above.

--
Regards,
Sudeep
Benjamin Gaignard March 6, 2020, 10:47 a.m. UTC | #6
Le ven. 6 mars 2020 à 11:04, Sudeep Holla <sudeep.holla@arm.com> a écrit :
>
> On Fri, Mar 06, 2020 at 10:28:10AM +0100, Ulf Hansson wrote:
> > On Thu, 5 Mar 2020 at 17:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
>
> [...]
>
> > > OK. The only state that cluster can enter when CPUs are in WFI are
> > > cluster WFI and most hardware can handle it automatically. I don't see
> > > the need to do any extra work for that.
> >
> > This isn't about cluster WFI, but about deeper cluster states, such as
> > a cluster-clock-gated-state and a cluster-power-off-state. It's an ST
> > platform, which Benjamin is working on.
> >
>
> Then definitely something is completely wrong. You can't enter deeper
> cluster states(clock-gated and power-off to be specific) with CPU in
> just WFI state. So, if the attempt here is to enter those states, I
> disagree with the change.
>
> Benjamin, please share the complete hierarchical topology for your platform.

The platform is stm32mp157 SoC which embedded two Cortex A7 in one cluster.
I would like to be able to put the system in a state where clocks of CPUs and
hardware blocks are gated. In this state local timer are off.
The platform should be allowed to go in this state when the devices
within the power
domain are pm_runtime_suspend and the CPUs in WFI.
In DT I have one system power domain where the hardware blocks (i2,
uart; spi, etc..)
are attached + a power per CPU.

Benjamin

>
> > >
> > > > Then, after we have called pm_runtime_put_sync_suspend() a few lines
> > > > above, we may potentially have a "domain state" to use, instead of the
> > > > WFI state.
> > > >
> > >
> > > Are they any platforms with this potential "domain state" to use with
> > > CPU WFI. I want to understand this better.
> > >
> > > > In this case, if we would have called psci_enter_state(), that would
> > > > lead us to calling cpu_do_idle() from the __CPU_PM_CPU_IDLE_ENTER()
> > > > macro, becuase idx is zero. In other words, the domain state would
> > > > become unused.
> > > >
> > >
> > > For a domain state to become unused with WFI, it needs to be available
> > > and I am not 100% sure of that.
> >
> > With these changes from the series, we can fully conform to the
> > hierarchical DT bindings for PSCI.
> >
>
> Theoretically may be, but may not confirm to the hardware states.
>
> > I am not sure I understand your concern, is there a cost involved by
> > applying this?
> >
>
> Yes as mentioned above.
>
> --
> Regards,
> Sudeep
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Sudeep Holla March 6, 2020, 12:06 p.m. UTC | #7
On Fri, Mar 06, 2020 at 11:47:40AM +0100, Benjamin Gaignard wrote:
> Le ven. 6 mars 2020 à 11:04, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> >
> > On Fri, Mar 06, 2020 at 10:28:10AM +0100, Ulf Hansson wrote:
> > > On Thu, 5 Mar 2020 at 17:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > >
> >
> > [...]
> >
> > > > OK. The only state that cluster can enter when CPUs are in WFI are
> > > > cluster WFI and most hardware can handle it automatically. I don't see
> > > > the need to do any extra work for that.
> > >
> > > This isn't about cluster WFI, but about deeper cluster states, such as
> > > a cluster-clock-gated-state and a cluster-power-off-state. It's an ST
> > > platform, which Benjamin is working on.
> > >
> >
> > Then definitely something is completely wrong. You can't enter deeper
> > cluster states(clock-gated and power-off to be specific) with CPU in
> > just WFI state. So, if the attempt here is to enter those states, I
> > disagree with the change.
> >
> > Benjamin, please share the complete hierarchical topology for your platform.
>
> The platform is stm32mp157 SoC which embedded two Cortex A7 in one cluster.

Hang on a minute, is this the same platform where you wanted high
resolution timer and were hacking moving dirty tricks around[1]. Now I think
you have landed here.

> I would like to be able to put the system in a state where clocks of CPUs and
> hardware blocks are gated. In this state local timer are off.

Sure, please create a deeper CPU state than WFI and enter so that the CPU
state is saved and restored correctly. What is the problem doing that ?

> The platform should be allowed to go in this state when the devices
> within the power domain are pm_runtime_suspend and the CPUs in WFI.

Nope, we don't save and restore state when we enter/exit WFI. And hence
we can't allow deeper idle states in the hierarchy. No more discussion
on that.

> In DT I have one system power domain where the hardware blocks (i2,
> uart; spi, etc..) are attached + a power per CPU.

You really need a CPU idle state here.

--
Regards,
Sudeep

[1] https://lore.kernel.org/linux-arm-kernel/a42dd20677cddd8d09ea91a369a4e10b@www.loen.fr/
Benjamin Gaignard March 6, 2020, 12:32 p.m. UTC | #8
Le ven. 6 mars 2020 à 13:06, Sudeep Holla <sudeep.holla@arm.com> a écrit :
>
> On Fri, Mar 06, 2020 at 11:47:40AM +0100, Benjamin Gaignard wrote:
> > Le ven. 6 mars 2020 à 11:04, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> > >
> > > On Fri, Mar 06, 2020 at 10:28:10AM +0100, Ulf Hansson wrote:
> > > > On Thu, 5 Mar 2020 at 17:23, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > > >
> > >
> > > [...]
> > >
> > > > > OK. The only state that cluster can enter when CPUs are in WFI are
> > > > > cluster WFI and most hardware can handle it automatically. I don't see
> > > > > the need to do any extra work for that.
> > > >
> > > > This isn't about cluster WFI, but about deeper cluster states, such as
> > > > a cluster-clock-gated-state and a cluster-power-off-state. It's an ST
> > > > platform, which Benjamin is working on.
> > > >
> > >
> > > Then definitely something is completely wrong. You can't enter deeper
> > > cluster states(clock-gated and power-off to be specific) with CPU in
> > > just WFI state. So, if the attempt here is to enter those states, I
> > > disagree with the change.
> > >
> > > Benjamin, please share the complete hierarchical topology for your platform.
> >
> > The platform is stm32mp157 SoC which embedded two Cortex A7 in one cluster.
>
> Hang on a minute, is this the same platform where you wanted high
> resolution timer and were hacking moving dirty tricks around[1]. Now I think
> you have landed here.

yes it has been fixed in this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm/kernel/time.c?h=v5.6-rc4&id=022eb8ae8b5ee8c5c813923c69b5ebb1e9612c4c

>
> > I would like to be able to put the system in a state where clocks of CPUs and
> > hardware blocks are gated. In this state local timer are off.
>
> Sure, please create a deeper CPU state than WFI and enter so that the CPU
> state is saved and restored correctly. What is the problem doing that ?

This state stop the clocks for all the hardware blocks and not only the CPUs
so we can't go on it while devices aren't suspended.
I may have missed something but I don't believe that I could add this kind of
conditions in a cpu idle state, right ?
In this state I need to be able to enable the wake up sources because
it is the only
for hardware block used as broadcast timer to wake up the system.

>
> > The platform should be allowed to go in this state when the devices
> > within the power domain are pm_runtime_suspend and the CPUs in WFI.
>
> Nope, we don't save and restore state when we enter/exit WFI. And hence
> we can't allow deeper idle states in the hierarchy. No more discussion
> on that.

>
> > In DT I have one system power domain where the hardware blocks (i2,
> > uart; spi, etc..) are attached + a power per CPU.
>
> You really need a CPU idle state here.
>
> --
> Regards,
> Sudeep
>
> [1] https://lore.kernel.org/linux-arm-kernel/a42dd20677cddd8d09ea91a369a4e10b@www.loen.fr/
Sudeep Holla March 6, 2020, 2:23 p.m. UTC | #9
On Fri, Mar 06, 2020 at 01:32:59PM +0100, Benjamin Gaignard wrote:
> Le ven. 6 mars 2020 à 13:06, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> >

[...]

> > Sure, please create a deeper CPU state than WFI and enter so that the CPU
> > state is saved and restored correctly. What is the problem doing that ?
>
> This state stop the clocks for all the hardware blocks and not only the CPUs
> so we can't go on it while devices aren't suspended.
> I may have missed something but I don't believe that I could add this kind of
> conditions in a cpu idle state, right ?
> In this state I need to be able to enable the wake up sources because
> it is the only
> for hardware block used as broadcast timer to wake up the system.
>

We have discussed this in past in the thread I mentioned and may be
others too. It sounds like a broken hardware, sorry if I am wrong.
But this $subject patch is a hack to solve that and I am NACK-ing this
now. Please fix it adding another CPU level idle state, we are not
supporting without that and there is absolutely no need to.

--
Regards,
Sudeep
Benjamin Gaignard March 6, 2020, 2:44 p.m. UTC | #10
Le ven. 6 mars 2020 à 15:23, Sudeep Holla <sudeep.holla@arm.com> a écrit :
>
> On Fri, Mar 06, 2020 at 01:32:59PM +0100, Benjamin Gaignard wrote:
> > Le ven. 6 mars 2020 à 13:06, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> > >
>
> [...]
>
> > > Sure, please create a deeper CPU state than WFI and enter so that the CPU
> > > state is saved and restored correctly. What is the problem doing that ?
> >
> > This state stop the clocks for all the hardware blocks and not only the CPUs
> > so we can't go on it while devices aren't suspended.
> > I may have missed something but I don't believe that I could add this kind of
> > conditions in a cpu idle state, right ?
> > In this state I need to be able to enable the wake up sources because
> > it is the only
> > for hardware block used as broadcast timer to wake up the system.
> >
>
> We have discussed this in past in the thread I mentioned and may be
> others too. It sounds like a broken hardware, sorry if I am wrong.
> But this $subject patch is a hack to solve that and I am NACK-ing this
> now. Please fix it adding another CPU level idle state, we are not
> supporting without that and there is absolutely no need to.

A CPU idle state only take care of CPU activities, right ? but before going in
the targeting state I need to be sure that the other hardware blocks
are suspended.
Is it possible to describe that in an idle state ?
What sound broken ? is it because we need to setup the wake up sources ?

>
> --
> Regards,
> Sudeep
Sudeep Holla March 6, 2020, 2:50 p.m. UTC | #11
On Fri, Mar 06, 2020 at 03:44:33PM +0100, Benjamin Gaignard wrote:
> Le ven. 6 mars 2020 à 15:23, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> >
> > On Fri, Mar 06, 2020 at 01:32:59PM +0100, Benjamin Gaignard wrote:
> > > Le ven. 6 mars 2020 à 13:06, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> > > >
> >
> > [...]
> >
> > > > Sure, please create a deeper CPU state than WFI and enter so that the CPU
> > > > state is saved and restored correctly. What is the problem doing that ?
> > >
> > > This state stop the clocks for all the hardware blocks and not only the CPUs
> > > so we can't go on it while devices aren't suspended.
> > > I may have missed something but I don't believe that I could add this kind of
> > > conditions in a cpu idle state, right ?
> > > In this state I need to be able to enable the wake up sources because
> > > it is the only
> > > for hardware block used as broadcast timer to wake up the system.
> > >
> >
> > We have discussed this in past in the thread I mentioned and may be
> > others too. It sounds like a broken hardware, sorry if I am wrong.
> > But this $subject patch is a hack to solve that and I am NACK-ing this
> > now. Please fix it adding another CPU level idle state, we are not
> > supporting without that and there is absolutely no need to.
>
> A CPU idle state only take care of CPU activities, right ? but before going in
> the targeting state I need to be sure that the other hardware blocks
> are suspended.
> Is it possible to describe that in an idle state ?
> What sound broken ? is it because we need to setup the wake up sources ?
>

You said: " In DT I have one system power domain where the hardware blocks
(i2c,uart; spi, etc..) are attached + a power per CPU". Now since the CPU
stays in WFI always in this platform, it means it is always ON and you
can't vote to power down the magic "system power domain".

--
Regards,
Sudeep
Benjamin Gaignard March 6, 2020, 3:35 p.m. UTC | #12
Le ven. 6 mars 2020 à 15:50, Sudeep Holla <sudeep.holla@arm.com> a écrit :
>
> On Fri, Mar 06, 2020 at 03:44:33PM +0100, Benjamin Gaignard wrote:
> > Le ven. 6 mars 2020 à 15:23, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> > >
> > > On Fri, Mar 06, 2020 at 01:32:59PM +0100, Benjamin Gaignard wrote:
> > > > Le ven. 6 mars 2020 à 13:06, Sudeep Holla <sudeep.holla@arm.com> a écrit :
> > > > >
> > >
> > > [...]
> > >
> > > > > Sure, please create a deeper CPU state than WFI and enter so that the CPU
> > > > > state is saved and restored correctly. What is the problem doing that ?
> > > >
> > > > This state stop the clocks for all the hardware blocks and not only the CPUs
> > > > so we can't go on it while devices aren't suspended.
> > > > I may have missed something but I don't believe that I could add this kind of
> > > > conditions in a cpu idle state, right ?
> > > > In this state I need to be able to enable the wake up sources because
> > > > it is the only
> > > > for hardware block used as broadcast timer to wake up the system.
> > > >
> > >
> > > We have discussed this in past in the thread I mentioned and may be
> > > others too. It sounds like a broken hardware, sorry if I am wrong.
> > > But this $subject patch is a hack to solve that and I am NACK-ing this
> > > now. Please fix it adding another CPU level idle state, we are not
> > > supporting without that and there is absolutely no need to.
> >
> > A CPU idle state only take care of CPU activities, right ? but before going in
> > the targeting state I need to be sure that the other hardware blocks
> > are suspended.
> > Is it possible to describe that in an idle state ?
> > What sound broken ? is it because we need to setup the wake up sources ?
> >
>
> You said: " In DT I have one system power domain where the hardware blocks
> (i2c,uart; spi, etc..) are attached + a power per CPU". Now since the CPU
> stays in WFI always in this platform, it means it is always ON and you
> can't vote to power down the magic "system power domain".

CPU power domains are subdomains of the system power domain so they can vote
for the targeting power domain.

>
> --
> Regards,
> Sudeep
Sudeep Holla March 6, 2020, 3:55 p.m. UTC | #13
On Fri, Mar 06, 2020 at 04:35:32PM +0100, Benjamin Gaignard wrote:

[...]

>
> CPU power domains are subdomains of the system power domain

Yes, that is platform specific.

> so they can vote for the targeting power domain.
>

Not when they are in WFI, it can't be powered down. I am going to say one
last time, add a CPU level state to workaround whatever you are trying to
and please stop hacking the psci domain like in the $subject patch.

If it was not any clear before, NACK.

--
Regards,
Sudeep
diff mbox series

Patch

diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index bae9140a65a5..ae0fabec2742 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -56,16 +56,19 @@  static int psci_enter_domain_idle_state(struct cpuidle_device *dev,
 	u32 *states = data->psci_states;
 	struct device *pd_dev = data->dev;
 	u32 state;
-	int ret;
+	int ret = 0;
 
 	/* Do runtime PM to manage a hierarchical CPU toplogy. */
 	pm_runtime_put_sync_suspend(pd_dev);
 
 	state = psci_get_domain_state();
-	if (!state)
+	if (!state && states)
 		state = states[idx];
 
-	ret = psci_enter_state(idx, state);
+	if (state)
+		ret = psci_cpu_suspend_enter(state) ? -1 : idx;
+	else
+		cpu_do_idle();
 
 	pm_runtime_get_sync(pd_dev);
 
@@ -180,7 +183,7 @@  static int __init psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
 	drv->states[state_count - 1].enter = psci_enter_domain_idle_state;
 	psci_cpuidle_use_cpuhp = true;
 
-	return 0;
+	return 1;
 }
 
 static int __init psci_dt_cpu_init_idle(struct cpuidle_driver *drv,
@@ -192,6 +195,13 @@  static int __init psci_dt_cpu_init_idle(struct cpuidle_driver *drv,
 	struct device_node *state_node;
 	struct psci_cpuidle_data *data = per_cpu_ptr(&psci_cpuidle_data, cpu);
 
+	/*
+	 * Special case when WFI is the only state, as we may still need to
+	 * initialize data, if the hierarchical topology is used.
+	 */
+	if (!state_count)
+		return psci_dt_cpu_init_topology(drv, data, 1, cpu);
+
 	state_count++; /* Add WFI state too */
 	psci_states = kcalloc(state_count, sizeof(*psci_states), GFP_KERNEL);
 	if (!psci_states)
@@ -223,7 +233,7 @@  static int __init psci_dt_cpu_init_idle(struct cpuidle_driver *drv,
 
 	/* Idle states parsed correctly, store them in the per-cpu struct. */
 	data->psci_states = psci_states;
-	return 0;
+	return state_count;
 
 free_mem:
 	kfree(psci_states);
@@ -282,33 +292,35 @@  static int __init psci_idle_init_cpu(int cpu)
 		return -ENOMEM;
 
 	drv->cpumask = (struct cpumask *)cpumask_of(cpu);
+	drv->state_count = 1;
 
 	/*
-	 * Initialize idle states data, starting at index 1, since
-	 * by default idle state 0 is the quiescent state reached
-	 * by the cpu by executing the wfi instruction.
-	 *
-	 * If no DT idle states are detected (ret == 0) let the driver
-	 * initialization fail accordingly since there is no reason to
-	 * initialize the idle driver if only wfi is supported, the
-	 * default archictectural back-end already executes wfi
-	 * on idle entry.
+	 * Initialize idle states data, starting at index 1, since by default
+	 * idle state 0 is the quiescent state reached by the cpu by executing
+	 * the wfi instruction. If no DT idle states are detected (ret == 0),
+	 * we may still use the hierarchical topology.
 	 */
 	ret = dt_init_idle_driver(drv, psci_idle_state_match, 1);
-	if (ret <= 0) {
-		ret = ret ? : -ENODEV;
+	if (ret < 0)
 		goto out_kfree_drv;
-	}
 
 	/*
 	 * Initialize PSCI idle states.
 	 */
 	ret = psci_cpu_init_idle(drv, cpu, ret);
-	if (ret) {
+	if (ret < 0) {
 		pr_err("CPU %d failed to PSCI idle\n", cpu);
 		goto out_kfree_drv;
 	}
 
+	/* If there are no idle states to manage, but the wfi state and we also
+	 * don't use the hierarchical topology, let the driver initialization
+	 * fail. Instead, let's rely on the default architectural back-end to
+	 * execute wfi on idle entry.
+	 */
+	if (!ret)
+		goto out_kfree_drv;
+
 	ret = cpuidle_register(drv, NULL);
 	if (ret)
 		goto out_kfree_drv;