[v2,1/2] cpuidle: psci: Move enabling OSI mode after power domains creation

Message ID 20230330084250.32600-2-quic_mkshah@quicinc.com (mailing list archive)
State Superseded, archived
Series Use PSCI OS initiated mode for sc7280

Commit Message

Maulik Shah (mkshah) March 30, 2023, 8:42 a.m. UTC
A switch from OSI to PC mode is only possible if all CPUs other than the
calling one are OFF, either through a call to CPU_OFF or not yet booted.

Currently OSI mode is enabled before the power domains are created. When
the CPUidle states do not use the hierarchical CPU topology, the bail-out
path tries to switch back to PC mode. Firmware denies that request because
other CPUs are online at this point, which leaves an inconsistent state:
firmware is in OSI mode while Linux is in PC mode.

This change moves enabling OSI mode to after the power domains are
created, which makes sure that the hierarchical CPU topology is in use
before firmware is switched to OSI mode.

Fixes: 70c179b49870 ("cpuidle: psci: Allow PM domain to be initialized even if no OSI mode")
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
---
 drivers/cpuidle/cpuidle-psci-domain.c | 29 +++++++--------------------
 1 file changed, 7 insertions(+), 22 deletions(-)
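
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct mock_fw`, `mock_set_osi_mode()`, `probe_old_order()` and `probe_new_order()` are hypothetical names invented for illustration, not kernel or firmware APIs, and the firmware rule is reduced to "leaving OSI mode is denied while any other CPU is on".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the firmware side; not a real PSCI interface. */
struct mock_fw {
	bool osi_mode;     /* mode the firmware is actually in */
	int other_cpus_on; /* CPUs other than the caller that are not OFF */
};

/* Returns 0 on success, -1 when the firmware denies the switch. */
int mock_set_osi_mode(struct mock_fw *fw, bool enable)
{
	assert(fw);
	/* Assumed rule: OSI -> PC is only allowed when all other CPUs are OFF. */
	if (!enable && fw->osi_mode && fw->other_cpus_on > 0)
		return -1;
	fw->osi_mode = enable;
	return 0;
}

/*
 * Old ordering: enable OSI first, then try to revert when no power
 * domains were found. Returns the mode Linux believes it is using
 * (true = OSI, false = PC).
 */
bool probe_old_order(struct mock_fw *fw, int pd_count)
{
	bool use_osi = (mock_set_osi_mode(fw, true) == 0);

	if (!pd_count) {
		/* The revert result is ignored, as in the removed no_pd path. */
		if (use_osi)
			mock_set_osi_mode(fw, false);
		return false;
	}
	return use_osi;
}

/* New ordering: only touch the firmware mode once power domains exist. */
bool probe_new_order(struct mock_fw *fw, int pd_count)
{
	if (!pd_count)
		return false; /* firmware mode is never changed */
	return mock_set_osi_mode(fw, true) == 0;
}
```

With `pd_count` of 0 and other CPUs online, the old ordering in this model ends with the firmware in OSI mode while the caller assumes PC mode; the new ordering leaves both sides in PC mode.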

Comments

Sudeep Holla March 30, 2023, 9:34 a.m. UTC | #1
On Thu, Mar 30, 2023 at 02:12:49PM +0530, Maulik Shah wrote:
> A switch from OSI to PC mode is only possible if all CPUs other than the
> calling one are OFF, either through a call to CPU_OFF or not yet booted.
>

As per the spec, all cores are in one of the following states:
 - Running
 - OFF, either through a call to CPU_OFF or not yet booted
 - Suspended, through a call to CPU_DEFAULT_SUSPEND

Better to provide full information.

> Currently OSI mode is enabled before power domains are created. In cases
> where CPUidle states are not using hierarchical CPU topology the bail out
> path tries to switch back to PC mode which gets denied by firmware since
> other CPUs are online at this point and creates inconsistent state as
> firmware is in OSI mode and Linux in PC mode.
>

OK, what is the issue if the other cores are online? As long as they are
running, it is allowed in the spec, so your statement is incorrect.

Is CPUidle enabled before setting the OSI mode? That is the only thing I
can see causing an issue, as we don't use CPU_DEFAULT_SUSPEND. If idle is
not yet enabled, it shouldn't be a problem.
Ulf Hansson March 30, 2023, 12:06 p.m. UTC | #2
On Thu, 30 Mar 2023 at 11:34, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Mar 30, 2023 at 02:12:49PM +0530, Maulik Shah wrote:
> > A switch from OSI to PC mode is only possible if all CPUs other than the
> > calling one are OFF, either through a call to CPU_OFF or not yet booted.
> >
>
> As per the spec, all cores are in one of the following states:
>  - Running
>  - OFF, either through a call to CPU_OFF or not yet booted
>  - Suspended, through a call to CPU_DEFAULT_SUSPEND
>
> Better to provide full information.
>
> > Currently OSI mode is enabled before power domains are created. In cases
> > where CPUidle states are not using hierarchical CPU topology the bail out
> > path tries to switch back to PC mode which gets denied by firmware since
> > other CPUs are online at this point and creates inconsistent state as
> > firmware is in OSI mode and Linux in PC mode.
> >
>
> OK what is the issue if the other cores are online ? As long as they are
> running, it is allowed in the spec, so your statement is incorrect.
>
> Is CPUidle enabled before setting the OSI mode. I see only that can cause
> issue as we don't use CPU_DEFAULT_SUSPEND. If idle is not yet enabled, it
> shouldn't be a problem.

Sudeep, you may very well be correct here. Nevertheless, it looks like
the current public TF-A implementation doesn't work exactly like this,
as it reports an error in Maulik's case. We should fix it too, I
think.

Although, to me it doesn't really matter as I think $subject patch
makes sense anyway. It's always nice to simplify code when it's
possible.

Note also that, before commit 70c179b49870 ("cpuidle: psci: Allow PM
domain to be initialized even if no OSI mode"), it made sense to call
psci_pd_try_set_osi_mode() before creating the genpds, but beyond that
it doesn't really matter anymore.

Kind regards
Uffe
Ulf Hansson March 30, 2023, 12:19 p.m. UTC | #3
On Thu, 30 Mar 2023 at 10:43, Maulik Shah <quic_mkshah@quicinc.com> wrote:
>
> A switch from OSI to PC mode is only possible if all CPUs other than the
> calling one are OFF, either through a call to CPU_OFF or not yet booted.
>
> Currently OSI mode is enabled before power domains are created. In cases
> where CPUidle states are not using hierarchical CPU topology the bail out
> path tries to switch back to PC mode which gets denied by firmware since
> other CPUs are online at this point and creates inconsistent state as
> firmware is in OSI mode and Linux in PC mode.
>
> This change moves enabling OSI mode after power domains are created,
> this would makes sure that hierarchical CPU topology is used before
> switching firmware to OSI mode.
>
> Fixes: 70c179b49870 ("cpuidle: psci: Allow PM domain to be initialized even if no OSI mode")
> Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
> ---
>  drivers/cpuidle/cpuidle-psci-domain.c | 29 +++++++--------------------
>  1 file changed, 7 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
> index 11316c3b14ca..d81f6ae35002 100644
> --- a/drivers/cpuidle/cpuidle-psci-domain.c
> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
> @@ -120,20 +120,6 @@ static void psci_pd_remove(void)
>         }
>  }
>
> -static bool psci_pd_try_set_osi_mode(void)
> -{
> -       int ret;
> -
> -       if (!psci_has_osi_support())
> -               return false;
> -
> -       ret = psci_set_osi_mode(true);
> -       if (ret)
> -               return false;
> -
> -       return true;
> -}
> -
>  static void psci_cpuidle_domain_sync_state(struct device *dev)
>  {
>         /*
> @@ -152,15 +138,12 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
>  {
>         struct device_node *np = pdev->dev.of_node;
>         struct device_node *node;
> -       bool use_osi;
> +       bool use_osi = psci_has_osi_support();
>         int ret = 0, pd_count = 0;
>
>         if (!np)
>                 return -ENODEV;
>
> -       /* If OSI mode is supported, let's try to enable it. */
> -       use_osi = psci_pd_try_set_osi_mode();
> -
>         /*
>          * Parse child nodes for the "#power-domain-cells" property and
>          * initialize a genpd/genpd-of-provider pair when it's found.
> @@ -178,13 +161,18 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
>
>         /* Bail out if not using the hierarchical CPU topology. */
>         if (!pd_count)
> -               goto no_pd;
> +               goto remove_pd;
>
>         /* Link genpd masters/subdomains to model the CPU topology. */
>         ret = dt_idle_pd_init_topology(np);
>         if (ret)
>                 goto remove_pd;
>
> +       /* let's try to enable OSI. */
> +       ret = psci_set_osi_mode(use_osi);
> +       if (ret)
> +               goto remove_pd;

If this fails, subdomains that were added in
dt_idle_pd_init_topology() should be removed too. In other words, we
need to add a new helper function in drivers/cpuidle/dt_idle_genpd.c
that we can call before calling psci_pd_remove() for this error path.
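
The unwinding order requested here could look roughly like the sketch below. It is a user-space model under stated assumptions, not the actual driver: `struct pd_state` and every function in it (including `unlink_topology()`, which stands in for the new helper that does not exist yet) are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for what the probe has set up so far. */
struct pd_state {
	int domains;      /* number of power domains created */
	bool topology;    /* subdomain links are in place */
	bool osi_enabled; /* OSI mode was accepted by firmware */
};

void create_domains(struct pd_state *s, int n) { s->domains = n; }
int link_topology(struct pd_state *s) { s->topology = true; return 0; }

/* Stand-in for the missing helper: undo the subdomain links. */
void unlink_topology(struct pd_state *s) { s->topology = false; }

void remove_domains(struct pd_state *s)
{
	/* Domains must not be torn down while they are still linked. */
	assert(!s->topology);
	s->domains = 0;
}

/* Probe sketch: tear down in the reverse order of setup on failure. */
int probe_sketch(struct pd_state *s, int n, bool fw_accepts_osi)
{
	int ret;

	create_domains(s, n);

	ret = link_topology(s);
	if (ret)
		goto remove_pd;

	ret = fw_accepts_osi ? 0 : -1; /* models the OSI mode request */
	if (ret)
		goto remove_topology;

	s->osi_enabled = true;
	return 0;

remove_topology:
	unlink_topology(s);
remove_pd:
	remove_domains(s);
	return ret;
}
```

The separate `remove_topology` label keeps the earlier failure path (topology linking failed) unchanged while giving the OSI failure path the extra cleanup step.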

> +
>         pr_info("Initialized CPU PM domain topology using %s mode\n",
>                 use_osi ? "OSI" : "PC");
>         return 0;
> @@ -194,9 +182,6 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
>  remove_pd:
>         psci_pd_remove();
>         pr_err("failed to create CPU PM domains ret=%d\n", ret);
> -no_pd:
> -       if (use_osi)
> -               psci_set_osi_mode(false);
>         return ret;
>  }
>

Other than the above, this looks good to me.

Kind regards
Uffe
Sudeep Holla March 30, 2023, 1:13 p.m. UTC | #4
On Thu, Mar 30, 2023 at 02:06:19PM +0200, Ulf Hansson wrote:
> On Thu, 30 Mar 2023 at 11:34, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Thu, Mar 30, 2023 at 02:12:49PM +0530, Maulik Shah wrote:
> > > A switch from OSI to PC mode is only possible if all CPUs other than the
> > > calling one are OFF, either through a call to CPU_OFF or not yet booted.
> > >
> >
> > As per the spec, all cores are in one of the following states:
> >  - Running
> >  - OFF, either through a call to CPU_OFF or not yet booted
> >  - Suspended, through a call to CPU_DEFAULT_SUSPEND
> >
> > Better to provide full information.
> >
> > > Currently OSI mode is enabled before power domains are created. In cases
> > > where CPUidle states are not using hierarchical CPU topology the bail out
> > > path tries to switch back to PC mode which gets denied by firmware since
> > > other CPUs are online at this point and creates inconsistent state as
> > > firmware is in OSI mode and Linux in PC mode.
> > >
> >
> > OK what is the issue if the other cores are online ? As long as they are
> > running, it is allowed in the spec, so your statement is incorrect.
> >
> > Is CPUidle enabled before setting the OSI mode. I see only that can cause
> > issue as we don't use CPU_DEFAULT_SUSPEND. If idle is not yet enabled, it
> > shouldn't be a problem.
> 
> Sudeep, you may very well be correct here. Nevertheless, it looks like
> the current public TF-A implementation doesn't work exactly like this,
> as it reports an error in Maulik's case. We should fix it too, I
> think.
> 
> Although, to me it doesn't really matter as I think $subject patch
> makes sense anyway. It's always nice to simplify code when it's
> possible.
>

Agreed, I don't have any objection to the change. The wording of the message
worried me, and I wanted to check if there are any other issues because of
this. As such it doesn't look like there are, but the commit message needs
to be updated as it gives a different impression/understanding.
Wing Li March 31, 2023, 1:46 a.m. UTC | #5
Adding some clarifications.

On Thu, Mar 30, 2023 at 6:13 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Thu, Mar 30, 2023 at 02:06:19PM +0200, Ulf Hansson wrote:
> > On Thu, 30 Mar 2023 at 11:34, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > >
> > > On Thu, Mar 30, 2023 at 02:12:49PM +0530, Maulik Shah wrote:
> > > > A switch from OSI to PC mode is only possible if all CPUs other than the
> > > > calling one are OFF, either through a call to CPU_OFF or not yet booted.
> > > >
> > >
> > > As per the spec, all cores are in one of the following states:
> > >  - Running
> > >  - OFF, either through a call to CPU_OFF or not yet booted
> > >  - Suspended, through a call to CPU_DEFAULT_SUSPEND
> > >
> > > Better to provide full information.

The spec quoted above only applies when switching from
platform-coordinated mode to OS-initiated mode.

For switching from OS-initiated to platform-coordinated, which is the
case Maulik is referring to, section 5.20.2 of the spec specifies:
"A switch from OS-initiated mode to platform-coordinated mode is only
possible if all cores other than the calling one are OFF, either
through a call to CPU_OFF or not yet booted."
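
To make the asymmetry between the two directions concrete, the two preconditions as quoted in this thread can be written as predicates. This is a simplified sketch: the `core_state` enum and both function names are assumptions made for illustration, and `CORE_SUSPENDED` (a core idled through a regular CPU_SUSPEND call) is a modelling choice that the thread does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-core states; names are illustrative only. */
enum core_state {
	CORE_RUNNING,
	CORE_OFF,               /* through CPU_OFF, or not yet booted */
	CORE_DEFAULT_SUSPENDED, /* through CPU_DEFAULT_SUSPEND */
	CORE_SUSPENDED,         /* assumed: idled through CPU_SUSPEND */
};

/*
 * PC -> OSI: every core must be running, OFF, or suspended through
 * CPU_DEFAULT_SUSPEND.
 */
bool can_switch_pc_to_osi(const enum core_state *cores, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cores[i] == CORE_SUSPENDED)
			return false;
	return true;
}

/* OSI -> PC: every core other than the calling one must be OFF. */
bool can_switch_osi_to_pc(const enum core_state *cores, size_t n,
			  size_t caller)
{
	assert(caller < n);
	for (size_t i = 0; i < n; i++)
		if (i != caller && cores[i] != CORE_OFF)
			return false;
	return true;
}
```

With secondary cores already running, the first predicate holds but the second does not, which matches the situation in the bail-out path.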

> > >
> > > > Currently OSI mode is enabled before power domains are created. In cases
> > > > where CPUidle states are not using hierarchical CPU topology the bail out
> > > > path tries to switch back to PC mode which gets denied by firmware since
> > > > other CPUs are online at this point and creates inconsistent state as
> > > > firmware is in OSI mode and Linux in PC mode.
> > > >
> > >
> > > OK what is the issue if the other cores are online ? As long as they are
> > > running, it is allowed in the spec, so your statement is incorrect.

The issue here is that the kernel prematurely enabled OSI mode based
on the condition that OSI mode is supported by the firmware, and is
unable to switch back to PC mode in the bail out path if hierarchical
CPU topology isn't used because the other CPUs at this point are now
online.

> > >
> > > Is CPUidle enabled before setting the OSI mode. I see only that can cause
> > > issue as we don't use CPU_DEFAULT_SUSPEND. If idle is not yet enabled, it
> > > shouldn't be a problem.
> >
> > Sudeep, you may very well be correct here. Nevertheless, it looks like
> > the current public TF-A implementation doesn't work exactly like this,
> > as it reports an error in Maulik's case. We should fix it too, I
> > think.
> >
> > Although, to me it doesn't really matter as I think $subject patch
> > makes sense anyway. It's always nice to simplify code when it's
> > possible.
> >
>
> Agreed, I don't have any objection to the change. The wording the message
> worried me and wanted to check if there are any other issues because of this.
> As such it doesn't look like there are but the commit message needs to be
> updated as it gives a different impression/understanding.

I think the commit message is accurate and we can keep it as is.

Best regards,
Wing

>
> --
> Regards,
> Sudeep
Sudeep Holla March 31, 2023, 2:27 p.m. UTC | #6
On Thu, Mar 30, 2023 at 06:40:03PM -0700, Wing Li wrote:
> Adding some clarifications.
> 
> On Thu, Mar 30, 2023 at 6:13 AM Sudeep Holla <sudeep.holla@arm.com> wrote:
> 
> > On Thu, Mar 30, 2023 at 02:06:19PM +0200, Ulf Hansson wrote:
> > > On Thu, 30 Mar 2023 at 11:34, Sudeep Holla <sudeep.holla@arm.com> wrote:
> > > >
> > > > On Thu, Mar 30, 2023 at 02:12:49PM +0530, Maulik Shah wrote:
> > > > > > A switch from OSI to PC mode is only possible if all CPUs other than the
> > > > > > calling one are OFF, either through a call to CPU_OFF or not yet booted.
> > > > >
> > > >
> > > > As per the spec, all cores are in one of the following states:
> > > >  - Running
> > > >  - OFF, either through a call to CPU_OFF or not yet booted
> > > >  - Suspended, through a call to CPU_DEFAULT_SUSPEND
> > > >
> > > > Better to provide full information.
> >
> 
> The spec quoted above only applies when switching from platform-coordinated
> mode to OS-initiated mode.
> 
> For switching from OS-initiated to platform-coordinated, which is the case
> Maulik is referring to, section 5.20.2 of the spec specifies:
> "A switch from OS-initiated mode to platform-coordinated mode is only
> possible if all cores other than the calling one are OFF, either through a
> call to CPU_OFF or not yet booted."
> 
>

My bad, I read/imagined it as PC->OSI mode a couple of times even though it
is pretty explicit. Sorry for that. And thanks a lot for pointing that out.

Patch

diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index 11316c3b14ca..d81f6ae35002 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -120,20 +120,6 @@  static void psci_pd_remove(void)
 	}
 }
 
-static bool psci_pd_try_set_osi_mode(void)
-{
-	int ret;
-
-	if (!psci_has_osi_support())
-		return false;
-
-	ret = psci_set_osi_mode(true);
-	if (ret)
-		return false;
-
-	return true;
-}
-
 static void psci_cpuidle_domain_sync_state(struct device *dev)
 {
 	/*
@@ -152,15 +138,12 @@  static int psci_cpuidle_domain_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
 	struct device_node *node;
-	bool use_osi;
+	bool use_osi = psci_has_osi_support();
 	int ret = 0, pd_count = 0;
 
 	if (!np)
 		return -ENODEV;
 
-	/* If OSI mode is supported, let's try to enable it. */
-	use_osi = psci_pd_try_set_osi_mode();
-
 	/*
 	 * Parse child nodes for the "#power-domain-cells" property and
 	 * initialize a genpd/genpd-of-provider pair when it's found.
@@ -178,13 +161,18 @@  static int psci_cpuidle_domain_probe(struct platform_device *pdev)
 
 	/* Bail out if not using the hierarchical CPU topology. */
 	if (!pd_count)
-		goto no_pd;
+		goto remove_pd;
 
 	/* Link genpd masters/subdomains to model the CPU topology. */
 	ret = dt_idle_pd_init_topology(np);
 	if (ret)
 		goto remove_pd;
 
+	/* let's try to enable OSI. */
+	ret = psci_set_osi_mode(use_osi);
+	if (ret)
+		goto remove_pd;
+
 	pr_info("Initialized CPU PM domain topology using %s mode\n",
 		use_osi ? "OSI" : "PC");
 	return 0;
@@ -194,9 +182,6 @@  static int psci_cpuidle_domain_probe(struct platform_device *pdev)
 remove_pd:
 	psci_pd_remove();
 	pr_err("failed to create CPU PM domains ret=%d\n", ret);
-no_pd:
-	if (use_osi)
-		psci_set_osi_mode(false);
 	return ret;
 }