[15/18] drivers: firmware: psci: Support CPU hotplug for the hierarchical model
diff mbox series

Message ID 20190513192300.653-16-ulf.hansson@linaro.org
State Not Applicable, archived
Headers show
Series
  • ARM/ARM64: Support hierarchical CPU arrangement for PSCI
Related show

Commit Message

Ulf Hansson May 13, 2019, 7:22 p.m. UTC
When the hierarchical CPU topology is used and when a CPU has been put
offline (hotplug), that same CPU prevents its PM domain and thus also
potential master PM domains, from being powered off. This is because genpd
observes the CPU's attached device as being active from a runtime PM point
of view.

To deal with this, let's decrease the runtime PM usage count by calling
pm_runtime_put_sync_suspend() of the attached struct device when putting
the CPU offline. Consequentially, we must then increase the runtime PM
usage count, while putting the CPU online again.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes:
	- Use get_logical_index() to find the CPU number.
	- Verify that a corresponding struct device* has been attached to the
	  PM domain before doing runtime PM refrence counting.
	- Clear the domain state when the CPU goes offline, to start fresh.
	- Move code to internal helper functions and move them inside
	  "ifdef CONFIG_CPU_IDLE.

---
 drivers/firmware/psci/psci.c | 47 +++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

Comments

Sudeep Holla June 7, 2019, 3:31 p.m. UTC | #1
On Mon, May 13, 2019 at 09:22:57PM +0200, Ulf Hansson wrote:
> When the hierarchical CPU topology is used and when a CPU has been put
> offline (hotplug), that same CPU prevents its PM domain and thus also
> potential master PM domains, from being powered off. This is because genpd
> observes the CPU's attached device as being active from a runtime PM point
> of view.
> 
> To deal with this, let's decrease the runtime PM usage count by calling
> pm_runtime_put_sync_suspend() of the attached struct device when putting
> the CPU offline. Consequentially, we must then increase the runtime PM
> usage count, while putting the CPU online again.
> 

Why is this firmware/driver specific ? Why can't this be dealt in core
pm-domain ? I am concerned that if any other architectures or firmware
method decides to use this feature, this need to be duplicated there.

The way I see this is pure reference counting and is hardware/firmware/
driver agnostic and can be made generic.

--
Regards,
Sudeep
Ulf Hansson June 10, 2019, 10:21 a.m. UTC | #2
On Fri, 7 Jun 2019 at 17:31, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Mon, May 13, 2019 at 09:22:57PM +0200, Ulf Hansson wrote:
> > When the hierarchical CPU topology is used and when a CPU has been put
> > offline (hotplug), that same CPU prevents its PM domain and thus also
> > potential master PM domains, from being powered off. This is because genpd
> > observes the CPU's attached device as being active from a runtime PM point
> > of view.
> >
> > To deal with this, let's decrease the runtime PM usage count by calling
> > pm_runtime_put_sync_suspend() of the attached struct device when putting
> > the CPU offline. Consequentially, we must then increase the runtime PM
> > usage count, while putting the CPU online again.
> >
>
> Why is this firmware/driver specific ? Why can't this be dealt in core
> pm-domain ? I am concerned that if any other architectures or firmware
> method decides to use this feature, this need to be duplicated there.

What is the core pm-domain? Do you refer to the generic PM domain (genpd), no?

In such case, this is not the job of genpd, but rather the opposite
(to *monitor* the reference count).

>
> The way I see this is pure reference counting and is hardware/firmware/
> driver agnostic and can be made generic.

As stated in the another reply, I would rather start with having more
things driver specific rather than generic. Later on we can always
consider to move/split things, when there are more users.

In this particular case, the runtime PM reference counting is done on
the struct device*, that genpd returned via
dev_pm_domain_attach_by_name(). And because
dev_pm_domain_attach_by_name() is called from PSCI code, I decided to
keep this struct device* internal to PSCI.

Kind regards
Uffe
Sudeep Holla June 10, 2019, 11:02 a.m. UTC | #3
On Mon, Jun 10, 2019 at 12:21:47PM +0200, Ulf Hansson wrote:
> On Fri, 7 Jun 2019 at 17:31, Sudeep Holla <sudeep.holla@arm.com> wrote:
> >
> > On Mon, May 13, 2019 at 09:22:57PM +0200, Ulf Hansson wrote:
> > > When the hierarchical CPU topology is used and when a CPU has been put
> > > offline (hotplug), that same CPU prevents its PM domain and thus also
> > > potential master PM domains, from being powered off. This is because genpd
> > > observes the CPU's attached device as being active from a runtime PM point
> > > of view.
> > >
> > > To deal with this, let's decrease the runtime PM usage count by calling
> > > pm_runtime_put_sync_suspend() of the attached struct device when putting
> > > the CPU offline. Consequentially, we must then increase the runtime PM
> > > usage count, while putting the CPU online again.
> > >
> >
> > Why is this firmware/driver specific ? Why can't this be dealt in core
> > pm-domain ? I am concerned that if any other architectures or firmware
> > method decides to use this feature, this need to be duplicated there.
>
> What is the core pm-domain? Do you refer to the generic PM domain (genpd), no?
>

Sorry for my bad choice of names. I just wrote names as I understand
rather than looking for exact match. But yes, I meant generic place
where such ref-counting is done currently for other things.

> In such case, this is not the job of genpd, but rather the opposite
> (to *monitor* the reference count).
>

OK, I need to understand that then.

> >
> > The way I see this is pure reference counting and is hardware/firmware/
> > driver agnostic and can be made generic.
>
> As stated in the another reply, I would rather start with having more
> things driver specific rather than generic. Later on we can always
> consider to move/split things, when there are more users.
>
> In this particular case, the runtime PM reference counting is done on
> the struct device*, that genpd returned via
> dev_pm_domain_attach_by_name(). And because
> dev_pm_domain_attach_by_name() is called from PSCI code, I decided to
> keep this struct device* internal to PSCI.

Sure, I understand your intent. I have just mentioned my thoughts/comments
as I reviewed.

--
Regards,
Sudeep

Patch
diff mbox series

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index 2c4157d3a616..5ad93c3694b5 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -15,6 +15,7 @@ 
 
 #include <linux/acpi.h>
 #include <linux/arm-smccc.h>
+#include <linux/cpu.h>
 #include <linux/cpuidle.h>
 #include <linux/errno.h>
 #include <linux/linkage.h>
@@ -93,6 +94,9 @@  static u32 psci_function_id[PSCI_FN_MAX];
 static u32 psci_cpu_suspend_feature;
 static bool psci_system_reset2_supported;
 
+static void psci_cpuidle_cpu_off(void);
+static void psci_cpuidle_cpu_on(unsigned long cpuid);
+
 static inline bool psci_has_ext_power_state(void)
 {
 	return psci_cpu_suspend_feature &
@@ -188,6 +192,8 @@  static int psci_cpu_off(u32 state)
 	int err;
 	u32 fn;
 
+	psci_cpuidle_cpu_off();
+
 	fn = psci_function_id[PSCI_FN_CPU_OFF];
 	err = invoke_psci_fn(fn, state, 0, 0);
 	return psci_to_linux_errno(err);
@@ -200,7 +206,13 @@  static int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
 
 	fn = psci_function_id[PSCI_FN_CPU_ON];
 	err = invoke_psci_fn(fn, cpuid, entry_point, 0);
-	return psci_to_linux_errno(err);
+	err = psci_to_linux_errno(err);
+	if (err)
+		return err;
+
+	psci_cpuidle_cpu_on(cpuid);
+
+	return 0;
 }
 
 static int psci_migrate(unsigned long cpuid)
@@ -540,8 +552,41 @@  static int __init _psci_dt_topology_init(struct device_node *np)
 
 	return ret;
 }
+
+static void psci_cpuidle_cpu_off(void)
+{
+	struct device *dev = __this_cpu_read(psci_cpuidle_data.dev);
+
+	/*
+	 * Drop the runtime PM usage count if the CPU has been attached to a
+	 * CPU PM domain. This is needed to, for example, not prevent other
+	 * master domains in the hierarchy to remain powered on.
+	 */
+	if (dev)
+		pm_runtime_put_sync_suspend(dev);
+}
+
+static void psci_cpuidle_cpu_on(unsigned long cpuid)
+{
+	struct device *dev;
+	int cpu;
+
+	if (!psci_dt_topology)
+		return;
+
+	cpu = get_logical_index(cpuid);
+	if (cpu < 0)
+		return;
+
+	dev = per_cpu(psci_cpuidle_data.dev, cpu);
+	if (dev)
+		pm_runtime_get_sync(dev);
+}
+
 #else
 static inline int _psci_dt_topology_init(struct device_node *np) { return 0; }
+static void psci_cpuidle_cpu_off(void) {}
+static void psci_cpuidle_cpu_on(unsigned long cpuid) {}
 #endif
 
 static int psci_system_suspend(unsigned long unused)