diff mbox series

[RFC,1/2] PM: runtime: Add pm_runtime_try_put_sync() and pm_runtime_try_get_sync()

Message ID 20220922084322.RFC.1.Iaa18b24fef0c8e88f0b82502f7fa0a70565b64d2@changeid (mailing list archive)
State RFC, archived
Headers show
Series clk: Avoid potential deadlock when disabling unused clocks | expand

Commit Message

Doug Anderson Sept. 22, 2022, 3:43 p.m. UTC
In some cases, a caller may wish to synchronously get or put the PM
Runtime state of a device but the caller may also be holding a
resource that the runtime suspend or runtime resume of the device
needs. Obviously this can lead to deadlock.

A case in point is the clock framework. While running
clk_disable_unused() the clock framework holds the global clock
"prepare" lock. The clock framework then goes through and does PM
Runtime actions. It should be no surprise to anyone that some devices
need to prepare or unprepare clocks are part of their PM Runtime
actions. Things generally work OK because of the "recursive" nature of
the global clock "prepare" lock, but if we get unlucky and the PM
Runtime action is happening in another task then we can end up
deadlocking.

Let's add a "try" version of the synchronous PM Runtime routines.
This version will return -EINPROGRESS rather than waiting. To
implement this we'll add a new flag: RPM_TRY.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/base/power/runtime.c |  7 +++++--
 include/linux/pm_runtime.h   | 28 ++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 2 deletions(-)

Comments

Rafael J. Wysocki Sept. 22, 2022, 5:37 p.m. UTC | #1
On Thu, Sep 22, 2022 at 5:44 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> In some cases, a caller may wish to synchronously get or put the PM
> Runtime state of a device but the caller may also be holding a
> resource that the runtime suspend or runtime resume of the device
> needs. Obviously this can lead to deadlock.
>
> A case in point is the clock framework. While running
> clk_disable_unused() the clock framework holds the global clock
> "prepare" lock. The clock framework then goes through and does PM
> Runtime actions. It should be no surprise to anyone that some devices
> need to prepare or unprepare clocks are part of their PM Runtime
> actions. Things generally work OK because of the "recursive" nature of
> the global clock "prepare" lock, but if we get unlucky and the PM
> Runtime action is happening in another task then we can end up
> deadlocking.
>
> Let's add a "try" version of the synchronous PM Runtime routines.
> This version will return -EINPROGRESS rather than waiting. To
> implement this we'll add a new flag: RPM_TRY.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
>  drivers/base/power/runtime.c |  7 +++++--
>  include/linux/pm_runtime.h   | 28 ++++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 997be3ac20a7..67cc6a620b12 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -609,7 +609,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
>         if (dev->power.runtime_status == RPM_SUSPENDING) {
>                 DEFINE_WAIT(wait);
>
> -               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
> +               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT | RPM_TRY)) {
>                         retval = -EINPROGRESS;
>                         goto out;
>                 }
> @@ -791,7 +791,10 @@ static int rpm_resume(struct device *dev, int rpmflags)
>             || dev->power.runtime_status == RPM_SUSPENDING) {
>                 DEFINE_WAIT(wait);
>
> -               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
> +               if (rpmflags & RPM_TRY) {
> +                       retval = -EINPROGRESS;

Returning -EINPROGRESS from here may be misleading, because the device
may not be resuming.

Besides, I'm not sure why a new flag is needed.  What about using
RPM_NOWAIT instead?



> +                       goto out;
> +               } else if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
>                         if (dev->power.runtime_status == RPM_SUSPENDING)
>                                 dev->power.deferred_resume = true;
>                         else
> diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> index 0a41b2dcccad..c68baa63f0e7 100644
> --- a/include/linux/pm_runtime.h
> +++ b/include/linux/pm_runtime.h
> @@ -21,6 +21,8 @@
>  #define RPM_GET_PUT            0x04    /* Increment/decrement the
>                                             usage_count */
>  #define RPM_AUTO               0x08    /* Use autosuspend_delay */
> +#define RPM_TRY                        0x10    /* Try to be synchronous but fail
> +                                           with an error if we can't. */
>
>  /*
>   * Use this for defining a set of PM operations to be used in all situations
> @@ -425,6 +427,19 @@ static inline int pm_runtime_get_sync(struct device *dev)
>         return __pm_runtime_resume(dev, RPM_GET_PUT);
>  }
>
> +/**
> + * pm_runtime_try_get_sync - Like pm_runtime_get_sync() but err if blocking
> + * @dev: Target device.
> + *
> + * This function works just like pm_runtime_get_sync() except that if the
> + * device in question is currently in the process of suspending or resuming
> + * that it will return with -EINPROGRESS instead of blocking.
> + */
> +static inline int pm_runtime_try_get_sync(struct device *dev)
> +{
> +       return __pm_runtime_resume(dev, RPM_GET_PUT | RPM_TRY);
> +}
> +
>  /**
>   * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
>   * @dev: Target device.
> @@ -489,6 +504,19 @@ static inline int pm_runtime_put_sync(struct device *dev)
>         return __pm_runtime_idle(dev, RPM_GET_PUT);
>  }
>
> +/**
> + * pm_runtime_try_put_sync - Like pm_runtime_put_sync() but err if blocking
> + * @dev: Target device.
> + *
> + * This function works just like pm_runtime_put_sync() except that if the
> + * device in question is currently in the process of suspending that it will
> + * return with -EINPROGRESS instead of blocking.
> + */
> +static inline int pm_runtime_try_put_sync(struct device *dev)
> +{
> +       return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_TRY);
> +}
> +
>  /**
>   * pm_runtime_put_sync_suspend - Drop device usage counter and suspend if 0.
>   * @dev: Target device.
> --
> 2.37.3.968.ga6b4b080e4-goog
>
Doug Anderson Sept. 22, 2022, 7:46 p.m. UTC | #2
Hi,

On Thu, Sep 22, 2022 at 10:38 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Sep 22, 2022 at 5:44 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > In some cases, a caller may wish to synchronously get or put the PM
> > Runtime state of a device but the caller may also be holding a
> > resource that the runtime suspend or runtime resume of the device
> > needs. Obviously this can lead to deadlock.
> >
> > A case in point is the clock framework. While running
> > clk_disable_unused() the clock framework holds the global clock
> > "prepare" lock. The clock framework then goes through and does PM
> > Runtime actions. It should be no surprise to anyone that some devices
> > need to prepare or unprepare clocks are part of their PM Runtime
> > actions. Things generally work OK because of the "recursive" nature of
> > the global clock "prepare" lock, but if we get unlucky and the PM
> > Runtime action is happening in another task then we can end up
> > deadlocking.
> >
> > Let's add a "try" version of the synchronous PM Runtime routines.
> > This version will return -EINPROGRESS rather than waiting. To
> > implement this we'll add a new flag: RPM_TRY.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> >  drivers/base/power/runtime.c |  7 +++++--
> >  include/linux/pm_runtime.h   | 28 ++++++++++++++++++++++++++++
> >  2 files changed, 33 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > index 997be3ac20a7..67cc6a620b12 100644
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -609,7 +609,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
> >         if (dev->power.runtime_status == RPM_SUSPENDING) {
> >                 DEFINE_WAIT(wait);
> >
> > -               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
> > +               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT | RPM_TRY)) {
> >                         retval = -EINPROGRESS;
> >                         goto out;
> >                 }
> > @@ -791,7 +791,10 @@ static int rpm_resume(struct device *dev, int rpmflags)
> >             || dev->power.runtime_status == RPM_SUSPENDING) {
> >                 DEFINE_WAIT(wait);
> >
> > -               if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
> > +               if (rpmflags & RPM_TRY) {
> > +                       retval = -EINPROGRESS;
>
> Returning -EINPROGRESS from here may be misleading, because the device
> may not be resuming.
>
> Besides, I'm not sure why a new flag is needed.  What about using
> RPM_NOWAIT instead?

Yeah, we can use RPM_NOWAIT if we land your patch [1]. I'll spin with
that if folks agree that the overall approach taken in this series
makes sense.

[1] https://lore.kernel.org/r/12079576.O9o76ZdvQC@kreacher

-Doug
diff mbox series

Patch

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 997be3ac20a7..67cc6a620b12 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -609,7 +609,7 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 	if (dev->power.runtime_status == RPM_SUSPENDING) {
 		DEFINE_WAIT(wait);
 
-		if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
+		if (rpmflags & (RPM_ASYNC | RPM_NOWAIT | RPM_TRY)) {
 			retval = -EINPROGRESS;
 			goto out;
 		}
@@ -791,7 +791,10 @@  static int rpm_resume(struct device *dev, int rpmflags)
 	    || dev->power.runtime_status == RPM_SUSPENDING) {
 		DEFINE_WAIT(wait);
 
-		if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
+		if (rpmflags & RPM_TRY) {
+			retval = -EINPROGRESS;
+			goto out;
+		} else if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
 			if (dev->power.runtime_status == RPM_SUSPENDING)
 				dev->power.deferred_resume = true;
 			else
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index 0a41b2dcccad..c68baa63f0e7 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -21,6 +21,8 @@ 
 #define RPM_GET_PUT		0x04	/* Increment/decrement the
 					    usage_count */
 #define RPM_AUTO		0x08	/* Use autosuspend_delay */
+#define RPM_TRY			0x10	/* Try to be synchronous but fail
+					    with an error if we can't. */
 
 /*
  * Use this for defining a set of PM operations to be used in all situations
@@ -425,6 +427,19 @@  static inline int pm_runtime_get_sync(struct device *dev)
 	return __pm_runtime_resume(dev, RPM_GET_PUT);
 }
 
+/**
+ * pm_runtime_try_get_sync - Like pm_runtime_get_sync() but err if blocking
+ * @dev: Target device.
+ *
+ * This function works just like pm_runtime_get_sync() except that if the
+ * device in question is currently in the process of suspending or resuming
+ * that it will return with -EINPROGRESS instead of blocking.
+ */
+static inline int pm_runtime_try_get_sync(struct device *dev)
+{
+	return __pm_runtime_resume(dev, RPM_GET_PUT | RPM_TRY);
+}
+
 /**
  * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
  * @dev: Target device.
@@ -489,6 +504,19 @@  static inline int pm_runtime_put_sync(struct device *dev)
 	return __pm_runtime_idle(dev, RPM_GET_PUT);
 }
 
+/**
+ * pm_runtime_try_put_sync - Like pm_runtime_put_sync() but err if blocking
+ * @dev: Target device.
+ *
+ * This function works just like pm_runtime_put_sync() except that if the
+ * device in question is currently in the process of suspending that it will
+ * return with -EINPROGRESS instead of blocking.
+ */
+static inline int pm_runtime_try_put_sync(struct device *dev)
+{
+	return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_TRY);
+}
+
 /**
  * pm_runtime_put_sync_suspend - Drop device usage counter and suspend if 0.
  * @dev: Target device.