diff mbox series

[v4,2/2] firmware: arm_scmi: Add SCMI System Power Control driver

Message ID 20210105130945.8192-2-cristian.marussi@arm.com (mailing list archive)
State New, archived
Headers show
Series [v4,1/2] firmware: arm_scmi: Add System Power utility macro | expand

Commit Message

Cristian Marussi Jan. 5, 2021, 1:09 p.m. UTC
Add an SCMI System Power control driver to handle platform's requests
carried by SYSTEM_POWER_STATE_NOTIFIER notifications: such platform
requested system-wide power state transitions are handled accordingly,
gracefully or forcefully, depending on the notifications' message flags.

Graceful requests are by default relayed to userspace using the same
Kernel API used to handle ACPI Shutdown bus events: alternatively, a few
available module parameters can be used to tunnel instead such requests to
userspace via signals addressed to CAD pid.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
---
This patch, building on top of the recently introduced SCMI System Power
Protocol support, adds a new SCMI driver which, registering for such SCMI
System Power notifications, acts accordingly to satisfy such SCMI plaform
system-wide transition requests.

Given that this driver is called to react on SCMI System Power events, it
is potentially needed very early on during boot, so it has not been made
configurable as a loadable module: if you decide to use it, makes no sense
not to have it builtin, since it has to be immediately ready once SCMI
subsystem is activated.

It is currently based on Linux v5.11-rc2.

Thanks,

Cristian

v3 --> v4
- rebased v5.11-rc2
- removed unneeded ugly usage of atomics and barriers
- simplfied SysPower shutdown state machine
- split out macro definition to different patch

v2 --> v3
- rebased
- some minor cleanup in codestyle and commit message

v1 --> v2
- split out of SCMI System Power Protocol series now merged
---
 drivers/firmware/Kconfig                      |  12 +
 drivers/firmware/arm_scmi/Makefile            |   1 +
 .../firmware/arm_scmi/scmi_power_control.c    | 359 ++++++++++++++++++
 3 files changed, 372 insertions(+)
 create mode 100644 drivers/firmware/arm_scmi/scmi_power_control.c

Comments

Greg Kroah-Hartman Jan. 7, 2021, 4:04 p.m. UTC | #1
On Tue, Jan 05, 2021 at 01:09:45PM +0000, Cristian Marussi wrote:
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 3f14dffb9669..2b39453e9dd1 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -40,6 +40,18 @@ config ARM_SCMI_POWER_DOMAIN
>  	  will be called scmi_pm_domain. Note this may needed early in boot
>  	  before rootfs may be available.
>  
> +config ARM_SCMI_POWER_CONTROL
> +	bool "SCMI system power control driver"
> +	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
> +	default n

n is always the default, no need to list it here.

> +	help
> +	  This enables System Power control logic which binds system shutdown or
> +	  reboot actions to SCMI System Power notifications generated by SCP
> +	  firmware.
> +
> +	  Graceful requests' methods and timeout and can be configured using
> +	  a few available module parameters.
> +
>  config ARM_SCPI_PROTOCOL
>  	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
>  	depends on ARM || ARM64 || COMPILE_TEST
> diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
> index 6a2ef63306d6..ddddb4636ebd 100644
> --- a/drivers/firmware/arm_scmi/Makefile
> +++ b/drivers/firmware/arm_scmi/Makefile
> @@ -9,3 +9,4 @@ scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
>  		    $(scmi-transport-y)
>  obj-$(CONFIG_ARM_SCMI_PROTOCOL) += scmi-module.o
>  obj-$(CONFIG_ARM_SCMI_POWER_DOMAIN) += scmi_pm_domain.o
> +obj-$(CONFIG_ARM_SCMI_POWER_CONTROL) += scmi_power_control.o
> diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
> new file mode 100644
> index 000000000000..216c2f031111
> --- /dev/null
> +++ b/drivers/firmware/arm_scmi/scmi_power_control.c
> @@ -0,0 +1,359 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * SCMI Generic SystemPower Control driver.
> + *
> + * Copyright (C) 2020-2021 ARM Ltd.
> + */
> +/**
> + * DOC: Theory of operation

What does "DOC:" provide?  Did you tie this into the kernel doc build
system?  If not, then what is this tag for?

> + *
> + * In order to handle platform originated SCMI SystemPower requests (like
> + * shutdowns or cold/warm resets) we register an SCMI Notification notifier
> + * block to react when such SCMI SystemPower events are emitted.
> + *
> + * Once such a notification is received we act accordingly to perform the
> + * required system transition depending on the kind of request.
> + *
> + * By default graceful requests are routed to userspace through the same
> + * API methods (orderly_poweroff/reboot()) used by ACPI when handling ACPI
> + * Shutdown bus events: alternatively, by properly setting a couple of module
> + * parameters (scmi_graceful_*_signum), it is possible to choose to use signals
> + * to CAD pid as a mean to communicate such graceful requests to userspace.
> + *
> + * Forceful requests, instead, will simply cause an immediate emergency_sync()
> + * and subsequent Kernel-only shutdown/reboot.
> + *
> + * Additionally, setting scmi_graceful_request_tmo_ms to a non-zero value, it
> + * is possible to convert a timed-out graceful request to a forceful one.
> + *
> + * The assumption here is that even graceful requests can be upper-bound by a
> + * maximum final timeout strictly enforced by the platform itself which can
> + * ultimately cut the power off at will anytime: in order to avoid such extreme
> + * scenario, when scmi_graceful_request_tmo_ms is non-zero, we track progress of
> + * graceful requests through the means of a reboot notifier converting timed-out
> + * graceful requests to forceful ones: in such a way at least we perform a
> + * clean sync and shutdown/restart before the power is cut.
> + *
> + * Given that such platform hard-timeout, if present, is anyway highly platform
> + * and event specific and not exposed at run-time by the SCMI SytemPower
> + * protocol, the parameter scmi_graceful_request_tmo_ms defaults to zero.
> + */
> +
> +#define pr_fmt(fmt) "SCMI SystemPower - " fmt

drivers should not need this, and even if they do, that's a HUGE prefix,
be more terse please.  Always use dev_*() calls instead of pr_() as you
have access to a device at all times, right?  If not, then this isn't a
driver :)

> +enum scmi_syspower_state {
> +	SCMI_SYSPOWER_IDLE,
> +	SCMI_SYSPOWER_IN_PROGRESS,
> +	SCMI_SYSPOWER_REBOOTING
> +};
> +
> +static enum scmi_syspower_state scmi_state;
> +/* Protect access to scmi_state */
> +static DEFINE_MUTEX(scmi_state_mtx);

Why does this only handle "one" state and device?  Shouldn't this all be
per-device?

> +
> +static enum scmi_system_events scmi_required_transition = SCMI_SYSTEM_MAX;
> +
> +static unsigned int scmi_graceful_poweroff_signum;
> +static unsigned int scmi_graceful_reboot_signum;
> +static unsigned int scmi_graceful_request_tmo_ms;

why "unsigned int"?

> +
> +static struct timer_list scmi_request_timer;
> +static struct work_struct scmi_forceful_work;
> +
> +/**
> + * scmi_reboot_notifier  - A reboot notifier to catch an ongoing successful
> + * system transition
> + * @nb: Reference to the related notifier block
> + * @reason: The reason for the ongoing reboot
> + * @__unused: The cmd being executed on a restart request (unused)
> + *
> + * When an ongoing system transition is detected, compatible with the one
> + * requested by SCMI, cancel the timer work.
> + * This is registered only when a valid SCMI SystemPower transition has been
> + * received and scmi_graceful_request_tmo_ms was non-zero.
> + *
> + * Return: NOTIFY_OK in any case
> + */
> +static int scmi_reboot_notifier(struct notifier_block *nb,
> +				unsigned long reason, void *__unused)
> +{
> +	mutex_lock(&scmi_state_mtx);
> +	switch (reason) {
> +	case SYS_HALT:
> +	case SYS_POWER_OFF:
> +		if (scmi_required_transition == SCMI_SYSTEM_SHUTDOWN)
> +			scmi_state = SCMI_SYSPOWER_REBOOTING;
> +		break;
> +	case SYS_RESTART:
> +		if (scmi_required_transition == SCMI_SYSTEM_COLDRESET ||
> +		    scmi_required_transition == SCMI_SYSTEM_WARMRESET)
> +			scmi_state = SCMI_SYSPOWER_REBOOTING;
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	if (scmi_state == SCMI_SYSPOWER_REBOOTING) {
> +		pr_debug("Reboot in progress...cancel timer.\n");

use dev_dbg() and pass it your device pointer please.

Same for all usages of pr_*() in the driver.

> +static int scmi_syspower_probe(struct scmi_device *sdev)
> +{
> +	int ret;
> +	struct scmi_handle *handle = sdev->handle;
> +	struct notifier_block *userspace_nb;
> +
> +	if (!handle)
> +		return -ENODEV;
> +
> +	userspace_nb = devm_kzalloc(&sdev->dev, sizeof(*userspace_nb),
> +				    GFP_KERNEL);
> +	if (!userspace_nb)
> +		return -ENOMEM;
> +
> +	userspace_nb->notifier_call = &scmi_userspace_notifier;
> +	ret = handle->notify_ops->register_event_notifier(handle,
> +						SCMI_PROTOCOL_SYSTEM,
> +					SCMI_EVENT_SYSTEM_POWER_STATE_NOTIFIER,
> +					    NULL, userspace_nb);

Crazy indentation, checkpatch was ok with this?

> +module_param(scmi_graceful_request_tmo_ms, uint, 0644);
> +MODULE_PARM_DESC(scmi_graceful_request_tmo_ms,
> +		 "Maximum time(ms) allowed to userspace for complying to the request. Unlimited if zero.");
> +
> +module_param(scmi_graceful_poweroff_signum, uint, 0644);
> +MODULE_PARM_DESC(scmi_graceful_poweroff_signum,
> +		 "Signal to request graceful poweroff to CAD process. Ignored if zero.");
> +
> +module_param(scmi_graceful_reboot_signum, uint, 0644);
> +MODULE_PARM_DESC(scmi_graceful_reboot_signum,
> +		 "Signal to request graceful reboot to CAD process. Ignored if zero.");

This is not the 1990's, please do not add module parameters.  Make this
"just work", or worst case, sysfs attributes.

thanks,

greg k-h
Cristian Marussi Jan. 21, 2021, 10:22 a.m. UTC | #2
Hi Greg,

sorry for the delayed answer but this spawned a good deal of internal
discussions.

On Thu, Jan 07, 2021 at 05:04:11PM +0100, Greg KH wrote:
> On Tue, Jan 05, 2021 at 01:09:45PM +0000, Cristian Marussi wrote:
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index 3f14dffb9669..2b39453e9dd1 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -40,6 +40,18 @@ config ARM_SCMI_POWER_DOMAIN
> >  	  will be called scmi_pm_domain. Note this may needed early in boot
> >  	  before rootfs may be available.
> >  
> > +config ARM_SCMI_POWER_CONTROL
> > +	bool "SCMI system power control driver"
> > +	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
> > +	default n
> 
> n is always the default, no need to list it here.
> 
Ok

> > +	help
> > +	  This enables System Power control logic which binds system shutdown or
> > +	  reboot actions to SCMI System Power notifications generated by SCP
> > +	  firmware.
> > +
> > +	  Graceful requests' methods and timeout and can be configured using
> > +	  a few available module parameters.
> > +
> >  config ARM_SCPI_PROTOCOL
> >  	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
> >  	depends on ARM || ARM64 || COMPILE_TEST
> > diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
> > index 6a2ef63306d6..ddddb4636ebd 100644
> > --- a/drivers/firmware/arm_scmi/Makefile
> > +++ b/drivers/firmware/arm_scmi/Makefile
> > @@ -9,3 +9,4 @@ scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
> >  		    $(scmi-transport-y)
> >  obj-$(CONFIG_ARM_SCMI_PROTOCOL) += scmi-module.o
> >  obj-$(CONFIG_ARM_SCMI_POWER_DOMAIN) += scmi_pm_domain.o
> > +obj-$(CONFIG_ARM_SCMI_POWER_CONTROL) += scmi_power_control.o
> > diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
> > new file mode 100644
> > index 000000000000..216c2f031111
> > --- /dev/null
> > +++ b/drivers/firmware/arm_scmi/scmi_power_control.c
> > @@ -0,0 +1,359 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * SCMI Generic SystemPower Control driver.
> > + *
> > + * Copyright (C) 2020-2021 ARM Ltd.
> > + */
> > +/**
> > + * DOC: Theory of operation
> 
> What does "DOC:" provide?  Did you tie this into the kernel doc build
> system?  If not, then what is this tag for?
>
Right, it is not indeed tied into the kernel-doc build, I'll remove the
tag.

> > + *
> > + * In order to handle platform originated SCMI SystemPower requests (like
> > + * shutdowns or cold/warm resets) we register an SCMI Notification notifier
> > + * block to react when such SCMI SystemPower events are emitted.
> > + *
> > + * Once such a notification is received we act accordingly to perform the
> > + * required system transition depending on the kind of request.
> > + *
> > + * By default graceful requests are routed to userspace through the same
> > + * API methods (orderly_poweroff/reboot()) used by ACPI when handling ACPI
> > + * Shutdown bus events: alternatively, by properly setting a couple of module
> > + * parameters (scmi_graceful_*_signum), it is possible to choose to use signals
> > + * to CAD pid as a mean to communicate such graceful requests to userspace.
> > + *
> > + * Forceful requests, instead, will simply cause an immediate emergency_sync()
> > + * and subsequent Kernel-only shutdown/reboot.
> > + *
> > + * Additionally, setting scmi_graceful_request_tmo_ms to a non-zero value, it
> > + * is possible to convert a timed-out graceful request to a forceful one.
> > + *
> > + * The assumption here is that even graceful requests can be upper-bound by a
> > + * maximum final timeout strictly enforced by the platform itself which can
> > + * ultimately cut the power off at will anytime: in order to avoid such extreme
> > + * scenario, when scmi_graceful_request_tmo_ms is non-zero, we track progress of
> > + * graceful requests through the means of a reboot notifier converting timed-out
> > + * graceful requests to forceful ones: in such a way at least we perform a
> > + * clean sync and shutdown/restart before the power is cut.
> > + *
> > + * Given that such platform hard-timeout, if present, is anyway highly platform
> > + * and event specific and not exposed at run-time by the SCMI SytemPower
> > + * protocol, the parameter scmi_graceful_request_tmo_ms defaults to zero.
> > + */
> > +
> > +#define pr_fmt(fmt) "SCMI SystemPower - " fmt
> 
> drivers should not need this, and even if they do, that's a HUGE prefix,
> be more terse please.  Always use dev_*() calls instead of pr_() as you
> have access to a device at all times, right?  If not, then this isn't a
> driver :)
> 

Indeed I'm plenty of devices :D, I moved to pr_ when I started using
common shared status (as you spotted below), I'll move back to dev_

> > +enum scmi_syspower_state {
> > +	SCMI_SYSPOWER_IDLE,
> > +	SCMI_SYSPOWER_IN_PROGRESS,
> > +	SCMI_SYSPOWER_REBOOTING
> > +};
> > +
> > +static enum scmi_syspower_state scmi_state;
> > +/* Protect access to scmi_state */
> > +static DEFINE_MUTEX(scmi_state_mtx);
> 
> Why does this only handle "one" state and device?  Shouldn't this all be
> per-device?
> 

This was made on purpose because you can have multiple chiplets/node on
the system each one running it own SCMI platform fw described in the DT
by distinct nodes and exposed via distinct SCMI buses with per-SCMI-protocol
devices attached to each of them, so that every SCMI driver (like this one)
can be indeed instantiated against multiple devices each one talking on
distinct underlying transport channels: that's fine in general for other
SCMI protocols BUT for this particular SystemPower protocol the OSPM is
really interested to receive shutdown/reboot requests (notifications),
so as soon as I received the first valid shudtwon/reboot request is game
over for all: as a consequence I keep a shared state and once a valid
reboot/shutdown is initiated any other following request from the same or
other SCMI platforms is ignored. (that's also the reason I removed all the
dynamic per-device data too and moved from dev_ to pr_...)

NOW, even if all the above still holds true, after internal discussions
turns out that we do not really want to promote a system design in which
SystemPower requests can be emitted by multiple SCMI platforms because
that would complicate a lot the fw design (each platform would have to
sync with each other and shudown on their side too), but instead we want
to encourage a SCMI system fw design in which only a master SCMI platform
is in charge of communicating SystemPower notifications to the OSPM.

For this reason, in the next V5 I'll take care, in a separate patch in
this series, to enforce the creation of a single unique device dedicated
to SystemPower across all the possible SCMI platforms (buses), or WARN
otherwise when more SCMI platform are attempting to register SystemPower
support, so that this driver will be effectively instantiated only once.

> > +
> > +static enum scmi_system_events scmi_required_transition = SCMI_SYSTEM_MAX;
> > +
> > +static unsigned int scmi_graceful_poweroff_signum;
> > +static unsigned int scmi_graceful_reboot_signum;
> > +static unsigned int scmi_graceful_request_tmo_ms;
> 
> why "unsigned int"?
> 

They represents timeouts and signals (positive values), but I'll remove
them (see down below)

> > +
> > +static struct timer_list scmi_request_timer;
> > +static struct work_struct scmi_forceful_work;
> > +
> > +/**
> > + * scmi_reboot_notifier  - A reboot notifier to catch an ongoing successful
> > + * system transition
> > + * @nb: Reference to the related notifier block
> > + * @reason: The reason for the ongoing reboot
> > + * @__unused: The cmd being executed on a restart request (unused)
> > + *
> > + * When an ongoing system transition is detected, compatible with the one
> > + * requested by SCMI, cancel the timer work.
> > + * This is registered only when a valid SCMI SystemPower transition has been
> > + * received and scmi_graceful_request_tmo_ms was non-zero.
> > + *
> > + * Return: NOTIFY_OK in any case
> > + */
> > +static int scmi_reboot_notifier(struct notifier_block *nb,
> > +				unsigned long reason, void *__unused)
> > +{
> > +	mutex_lock(&scmi_state_mtx);
> > +	switch (reason) {
> > +	case SYS_HALT:
> > +	case SYS_POWER_OFF:
> > +		if (scmi_required_transition == SCMI_SYSTEM_SHUTDOWN)
> > +			scmi_state = SCMI_SYSPOWER_REBOOTING;
> > +		break;
> > +	case SYS_RESTART:
> > +		if (scmi_required_transition == SCMI_SYSTEM_COLDRESET ||
> > +		    scmi_required_transition == SCMI_SYSTEM_WARMRESET)
> > +			scmi_state = SCMI_SYSPOWER_REBOOTING;
> > +		break;
> > +	default:
> > +		break;
> > +	}
> > +
> > +	if (scmi_state == SCMI_SYSPOWER_REBOOTING) {
> > +		pr_debug("Reboot in progress...cancel timer.\n");
> 
> use dev_dbg() and pass it your device pointer please.
> 
> Same for all usages of pr_*() in the driver.
> 

Ok

> > +static int scmi_syspower_probe(struct scmi_device *sdev)
> > +{
> > +	int ret;
> > +	struct scmi_handle *handle = sdev->handle;
> > +	struct notifier_block *userspace_nb;
> > +
> > +	if (!handle)
> > +		return -ENODEV;
> > +
> > +	userspace_nb = devm_kzalloc(&sdev->dev, sizeof(*userspace_nb),
> > +				    GFP_KERNEL);
> > +	if (!userspace_nb)
> > +		return -ENOMEM;
> > +
> > +	userspace_nb->notifier_call = &scmi_userspace_notifier;
> > +	ret = handle->notify_ops->register_event_notifier(handle,
> > +						SCMI_PROTOCOL_SYSTEM,
> > +					SCMI_EVENT_SYSTEM_POWER_STATE_NOTIFIER,
> > +					    NULL, userspace_nb);
> 
> Crazy indentation, checkpatch was ok with this?
> 

Yes, I check with --strict, but I agree it's awful I'll re-formmat

> > +module_param(scmi_graceful_request_tmo_ms, uint, 0644);
> > +MODULE_PARM_DESC(scmi_graceful_request_tmo_ms,
> > +		 "Maximum time(ms) allowed to userspace for complying to the request. Unlimited if zero.");
> > +
> > +module_param(scmi_graceful_poweroff_signum, uint, 0644);
> > +MODULE_PARM_DESC(scmi_graceful_poweroff_signum,
> > +		 "Signal to request graceful poweroff to CAD process. Ignored if zero.");
> > +
> > +module_param(scmi_graceful_reboot_signum, uint, 0644);
> > +MODULE_PARM_DESC(scmi_graceful_reboot_signum,
> > +		 "Signal to request graceful reboot to CAD process. Ignored if zero.");
> 
> This is not the 1990's, please do not add module parameters.  Make this
> "just work", or worst case, sysfs attributes.
> 

These optional parameters were meant to support: 

- configurable timeouts for the grace shutdown requests
  Here the assumption is that even a graceful shutdown request from the SCMI-fw
  will be somehow upper-bound by the SCMI fw itself to a maximum timeout and
  then a brutal powercut will be anyway served if userspace is lagging behing,
  so in order to minimize possible corruption in this scenario I track the
  reboot process with a reboot_notifier and issue an emergency_sync +
  kernel shutdown/reboot when userspace is late and the timeout is hit.

  Anyway, since this timeout is highly dependent on the specific SCMI fw
  implementation and the specific event/cause for the request, it seemed to me
  not sane that you had to guess such timeout in a module param indeed, so I
  asked achitects to add such information to the SCMI SystenPower notification
  payload itself so that can be generated and handled dynamically.

  I'll drop this module param and stick to a very high fixed max timeout for now
  and once the next SCMIvNN spec is released with this extension I'll add such
  dynamic handling to this driver.
                                                                                                                                                                                                                                                                                                                                                                                                                                         
- an alternative way to pass graceful reboot/shutdwon requests to userspace
  To support systems where the init process allows (like systemd) or expects to
  receive some sort of signal to initiate reboot/shutdown instead of poweroff/
  reboot_cmd as invoked by the kernel usermodehelper via orderly_reboot/poweroff
  (like this driver and ACPI does by default)

  Moving this to sysfs attributes would mean:
                                                                                                                                                                                                                                                                                                                                                                                                                                         
  - moving signals params to some higher-than-device-driver attrs groups
  (class or bus) given that they are common to any instance of this driver

  - exposing a new sysfs ABI for which I really do not have a real user at
  the moment (but just hypotethical ones like systemd) and for an optionally
  built driver like this, thing that I'm not sure is allowed at all.

  So I'd drop this params and the alternative signal based communication
  as a whole for the moment.

This was supposed to be a concise summary of the why's, not sure to have
been so concise :D

Thanks

Cristian

> thanks,
> 
> greg k-h
diff mbox series

Patch

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 3f14dffb9669..2b39453e9dd1 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -40,6 +40,18 @@  config ARM_SCMI_POWER_DOMAIN
 	  will be called scmi_pm_domain. Note this may needed early in boot
 	  before rootfs may be available.
 
+config ARM_SCMI_POWER_CONTROL
+	bool "SCMI system power control driver"
+	depends on ARM_SCMI_PROTOCOL || (COMPILE_TEST && OF)
+	default n
+	help
+	  This enables System Power control logic which binds system shutdown or
+	  reboot actions to SCMI System Power notifications generated by SCP
+	  firmware.
+
+	  Graceful requests' methods and timeout and can be configured using
+	  a few available module parameters.
+
 config ARM_SCPI_PROTOCOL
 	tristate "ARM System Control and Power Interface (SCPI) Message Protocol"
 	depends on ARM || ARM64 || COMPILE_TEST
diff --git a/drivers/firmware/arm_scmi/Makefile b/drivers/firmware/arm_scmi/Makefile
index 6a2ef63306d6..ddddb4636ebd 100644
--- a/drivers/firmware/arm_scmi/Makefile
+++ b/drivers/firmware/arm_scmi/Makefile
@@ -9,3 +9,4 @@  scmi-module-objs := $(scmi-bus-y) $(scmi-driver-y) $(scmi-protocols-y) \
 		    $(scmi-transport-y)
 obj-$(CONFIG_ARM_SCMI_PROTOCOL) += scmi-module.o
 obj-$(CONFIG_ARM_SCMI_POWER_DOMAIN) += scmi_pm_domain.o
+obj-$(CONFIG_ARM_SCMI_POWER_CONTROL) += scmi_power_control.o
diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
new file mode 100644
index 000000000000..216c2f031111
--- /dev/null
+++ b/drivers/firmware/arm_scmi/scmi_power_control.c
@@ -0,0 +1,359 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * SCMI Generic SystemPower Control driver.
+ *
+ * Copyright (C) 2020-2021 ARM Ltd.
+ */
+/**
+ * DOC: Theory of operation
+ *
+ * In order to handle platform originated SCMI SystemPower requests (like
+ * shutdowns or cold/warm resets) we register an SCMI Notification notifier
+ * block to react when such SCMI SystemPower events are emitted.
+ *
+ * Once such a notification is received we act accordingly to perform the
+ * required system transition depending on the kind of request.
+ *
+ * By default graceful requests are routed to userspace through the same
+ * API methods (orderly_poweroff/reboot()) used by ACPI when handling ACPI
+ * Shutdown bus events: alternatively, by properly setting a couple of module
+ * parameters (scmi_graceful_*_signum), it is possible to choose to use signals
+ * to CAD pid as a mean to communicate such graceful requests to userspace.
+ *
+ * Forceful requests, instead, will simply cause an immediate emergency_sync()
+ * and subsequent Kernel-only shutdown/reboot.
+ *
+ * Additionally, setting scmi_graceful_request_tmo_ms to a non-zero value, it
+ * is possible to convert a timed-out graceful request to a forceful one.
+ *
+ * The assumption here is that even graceful requests can be upper-bound by a
+ * maximum final timeout strictly enforced by the platform itself which can
+ * ultimately cut the power off at will anytime: in order to avoid such extreme
+ * scenario, when scmi_graceful_request_tmo_ms is non-zero, we track progress of
+ * graceful requests through the means of a reboot notifier converting timed-out
+ * graceful requests to forceful ones: in such a way at least we perform a
+ * clean sync and shutdown/restart before the power is cut.
+ *
+ * Given that such platform hard-timeout, if present, is anyway highly platform
+ * and event specific and not exposed at run-time by the SCMI SytemPower
+ * protocol, the parameter scmi_graceful_request_tmo_ms defaults to zero.
+ */
+
+#define pr_fmt(fmt) "SCMI SystemPower - " fmt
+
+#include <linux/bitfield.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/reboot.h>
+#include <linux/sched/signal.h>
+#include <linux/scmi_protocol.h>
+#include <linux/slab.h>
+#include <linux/timer.h>
+#include <linux/types.h>
+#include <linux/workqueue.h>
+
+enum scmi_syspower_state {
+	SCMI_SYSPOWER_IDLE,
+	SCMI_SYSPOWER_IN_PROGRESS,
+	SCMI_SYSPOWER_REBOOTING
+};
+
+static enum scmi_syspower_state scmi_state;
+/* Protect access to scmi_state */
+static DEFINE_MUTEX(scmi_state_mtx);
+
+static enum scmi_system_events scmi_required_transition = SCMI_SYSTEM_MAX;
+
+static unsigned int scmi_graceful_poweroff_signum;
+static unsigned int scmi_graceful_reboot_signum;
+static unsigned int scmi_graceful_request_tmo_ms;
+
+static struct timer_list scmi_request_timer;
+static struct work_struct scmi_forceful_work;
+
+/**
+ * scmi_reboot_notifier  - A reboot notifier to catch an ongoing successful
+ * system transition
+ * @nb: Reference to the related notifier block
+ * @reason: The reason for the ongoing reboot
+ * @__unused: The cmd being executed on a restart request (unused)
+ *
+ * When an ongoing system transition is detected, compatible with the one
+ * requested by SCMI, cancel the timer work.
+ * This is registered only when a valid SCMI SystemPower transition has been
+ * received and scmi_graceful_request_tmo_ms was non-zero.
+ *
+ * Return: NOTIFY_OK in any case
+ */
+static int scmi_reboot_notifier(struct notifier_block *nb,
+				unsigned long reason, void *__unused)
+{
+	mutex_lock(&scmi_state_mtx);
+	switch (reason) {
+	case SYS_HALT:
+	case SYS_POWER_OFF:
+		if (scmi_required_transition == SCMI_SYSTEM_SHUTDOWN)
+			scmi_state = SCMI_SYSPOWER_REBOOTING;
+		break;
+	case SYS_RESTART:
+		if (scmi_required_transition == SCMI_SYSTEM_COLDRESET ||
+		    scmi_required_transition == SCMI_SYSTEM_WARMRESET)
+			scmi_state = SCMI_SYSPOWER_REBOOTING;
+		break;
+	default:
+		break;
+	}
+
+	if (scmi_state == SCMI_SYSPOWER_REBOOTING) {
+		pr_debug("Reboot in progress...cancel timer.\n");
+		del_timer_sync(&scmi_request_timer);
+	}
+	mutex_unlock(&scmi_state_mtx);
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block scmi_reboot_nb = {
+	.notifier_call = &scmi_reboot_notifier,
+};
+
+static void scmi_request_forceful_transition(void);
+
+static void scmi_forceful_work_func(struct work_struct *work)
+{
+	if (system_state > SYSTEM_RUNNING)
+		return;
+
+	/* avoid deadlock unregistering reboot notifier */
+	unregister_reboot_notifier(&scmi_reboot_nb);
+
+	mutex_lock(&scmi_state_mtx);
+	if (scmi_state == SCMI_SYSPOWER_IN_PROGRESS)
+		scmi_request_forceful_transition();
+	mutex_unlock(&scmi_state_mtx);
+}
+
+/**
+ * scmi_request_timeout  - On timeout trigger a forceful transition
+ * @t: The timer reference
+ *
+ * Actual work is deferred out of the timer IRQ context since shutdown/reboot
+ * code do not play well when !in_task().
+ */
+static void scmi_request_timeout(struct timer_list *t)
+{
+	pr_debug("Graceful request timed out...forcing !\n");
+
+	schedule_work(&scmi_forceful_work);
+}
+
+/**
+ * scmi_request_graceful_transition  - Request graceful SystemPower transition
+ *
+ * Initiates the required SystemPower transition, requesting userspace
+ * co-operation: by default it uses the same orderly_ methods used by ACPI
+ * Shutdown event processing, while instead it uses a signal to CAD pid if
+ * any of the scmi_graceful_poweroff_ module params were non-zero.
+ *
+ * If scmi_graceful_request_tmo_ms is non zero, take care also to register a
+ * reboot notifier and a timer callback in order to detect if userspace actions
+ * are taking too long and in such a case fall back to a forceful transition.
+ */
+static void scmi_request_graceful_transition(void)
+{
+	if (scmi_graceful_request_tmo_ms) {
+		int ret;
+
+		ret = register_reboot_notifier(&scmi_reboot_nb);
+		if (!ret) {
+			INIT_WORK(&scmi_forceful_work, scmi_forceful_work_func);
+			scmi_request_timer.expires = jiffies +
+				msecs_to_jiffies(scmi_graceful_request_tmo_ms);
+			timer_setup(&scmi_request_timer,
+				    scmi_request_timeout, 0);
+			add_timer(&scmi_request_timer);
+		} else {
+			/* Carry on best effort even without a reboot notifier */
+			pr_warn("Cannot register reboot notifier !\n");
+		}
+	}
+
+	pr_debug("Serving graceful request: %d (timeout_ms:%d)\n",
+		 scmi_required_transition, scmi_graceful_request_tmo_ms);
+
+	switch (scmi_required_transition) {
+	case SCMI_SYSTEM_SHUTDOWN:
+		/*
+		 * When received early at boot-time the 'orderly' branch will
+		 * fail due to the lack of userspace itself, but the force=true
+		 * argument will anyway enable us to trigger a successful forced
+		 * shutdown.
+		 */
+		if (!scmi_graceful_poweroff_signum)
+			orderly_poweroff(true);
+		else
+			kill_cad_pid(scmi_graceful_poweroff_signum, 1);
+		break;
+	case SCMI_SYSTEM_COLDRESET:
+	case SCMI_SYSTEM_WARMRESET:
+		if (!scmi_graceful_reboot_signum)
+			orderly_reboot();
+		else
+			kill_cad_pid(scmi_graceful_reboot_signum, 1);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * scmi_request_forceful_transition  - Request forceful SystemPower transition
+ *
+ * Initiates the required SystemPower transition without involving userspace:
+ * just trigger the action at the kernel level after issuing an emergency sync.
+ */
+static void scmi_request_forceful_transition(void)
+{
+	pr_debug("Serving forceful request:%d\n", scmi_required_transition);
+
+	emergency_sync();
+	switch (scmi_required_transition) {
+	case SCMI_SYSTEM_SHUTDOWN:
+		kernel_power_off();
+		break;
+	case SCMI_SYSTEM_COLDRESET:
+	case SCMI_SYSTEM_WARMRESET:
+		kernel_restart(NULL);
+		break;
+	default:
+		break;
+	}
+}
+
+/**
+ * scmi_userspace_notifier  - Notifier callback to act on SystemPower
+ * Notifications
+ * @nb: Reference to the related notifier block
+ * @event: The SystemPower notification event id
+ * @data: The SystemPower event report
+ *
+ * This callback is in charge of decoding the received SystemPower report
+ * and act accordingly triggering a graceful or forceful system transition.
+ *
+ * Note that once a valid SCMI SystemPower event starts being served, any
+ * other following SystemPower notification received from the same or any
+ * other SCMI instance (handle) will be ignored.
+ *
+ * Return: NOTIFY_OK once a valid SystemPower event has been successfully
+ * processed.
+ */
+static int scmi_userspace_notifier(struct notifier_block *nb,
+				   unsigned long event, void *data)
+{
+	struct scmi_system_power_state_notifier_report *er = data;
+
+	if (er->system_state >= SCMI_SYSTEM_POWERUP) {
+		pr_err("Ignoring invalid request: %d\n", er->system_state);
+		return NOTIFY_DONE;
+	}
+
+	/*
+	 * Bail out if system is already shutting down or an SCMI SystemPower
+	 * requested is already being served.
+	 */
+	if (system_state > SYSTEM_RUNNING)
+		return NOTIFY_DONE;
+	mutex_lock(&scmi_state_mtx);
+	if (scmi_state != SCMI_SYSPOWER_IDLE) {
+		pr_debug("Transition already in progress...ignore.\n");
+		mutex_unlock(&scmi_state_mtx);
+		return NOTIFY_DONE;
+	}
+	scmi_state = SCMI_SYSPOWER_IN_PROGRESS;
+	mutex_unlock(&scmi_state_mtx);
+
+	scmi_required_transition = er->system_state;
+
+	/* Leaving a trace in logs of who triggered the shutdown/reboot. */
+	pr_info("Serving shutdown/reboot request: %d\n",
+		scmi_required_transition);
+
+	if (SCMI_SYSPOWER_IS_REQUEST_GRACEFUL(er->flags))
+		scmi_request_graceful_transition();
+	else
+		scmi_request_forceful_transition();
+
+	return NOTIFY_OK;
+}
+
+static int scmi_syspower_probe(struct scmi_device *sdev)
+{
+	int ret;
+	struct scmi_handle *handle = sdev->handle;
+	struct notifier_block *userspace_nb;
+
+	if (!handle)
+		return -ENODEV;
+
+	userspace_nb = devm_kzalloc(&sdev->dev, sizeof(*userspace_nb),
+				    GFP_KERNEL);
+	if (!userspace_nb)
+		return -ENOMEM;
+
+	userspace_nb->notifier_call = &scmi_userspace_notifier;
+	ret = handle->notify_ops->register_event_notifier(handle,
+						SCMI_PROTOCOL_SYSTEM,
+					SCMI_EVENT_SYSTEM_POWER_STATE_NOTIFIER,
+					    NULL, userspace_nb);
+	if (!ret)
+		dev_set_drvdata(&sdev->dev, userspace_nb);
+
+	return ret;
+}
+
+static void scmi_syspower_remove(struct scmi_device *sdev)
+{
+	struct notifier_block *userspace_nb;
+
+	userspace_nb = dev_get_drvdata(&sdev->dev);
+	if (userspace_nb) {
+		const struct scmi_handle *handle = sdev->handle;
+
+		handle->notify_ops->unregister_event_notifier(handle,
+							   SCMI_PROTOCOL_SYSTEM,
+					 SCMI_EVENT_SYSTEM_POWER_STATE_NOTIFIER,
+						      NULL, userspace_nb);
+	}
+}
+
+static const struct scmi_device_id scmi_id_table[] = {
+	{ SCMI_PROTOCOL_SYSTEM, "syspower" },
+	{ },
+};
+MODULE_DEVICE_TABLE(scmi, scmi_id_table);
+
+static struct scmi_driver scmi_system_power_driver = {
+	.name = "scmi-system-power",
+	.probe = scmi_syspower_probe,
+	.remove = scmi_syspower_remove,
+	.id_table = scmi_id_table,
+};
+module_scmi_driver(scmi_system_power_driver);
+
+module_param(scmi_graceful_request_tmo_ms, uint, 0644);
+MODULE_PARM_DESC(scmi_graceful_request_tmo_ms,
+		 "Maximum time(ms) allowed to userspace for complying to the request. Unlimited if zero.");
+
+module_param(scmi_graceful_poweroff_signum, uint, 0644);
+MODULE_PARM_DESC(scmi_graceful_poweroff_signum,
+		 "Signal to request graceful poweroff to CAD process. Ignored if zero.");
+
+module_param(scmi_graceful_reboot_signum, uint, 0644);
+MODULE_PARM_DESC(scmi_graceful_reboot_signum,
+		 "Signal to request graceful reboot to CAD process. Ignored if zero.");
+
+MODULE_AUTHOR("Cristian Marussi <cristian.marussi@arm.com>");
+MODULE_DESCRIPTION("ARM SCMI SystemPower Control driver");
+MODULE_LICENSE("GPL v2");