diff mbox series

[v3,1/2] PM / runtime: inform runtime PM of a device's next wakeup

Message ID 20201015193807.17423-2-ilina@codeaurora.org (mailing list archive)
State Superseded
Headers show
Series Better domain idle from device wakeup patterns | expand

Commit Message

Lina Iyer Oct. 15, 2020, 7:38 p.m. UTC
Some devices may have a predictable interrupt pattern while executing
usecases. An example would be the VSYNC interrupt associated with
display devices. A 60 Hz display could cause a interrupt every 16 ms. If
the device were in a PM domain, the domain would need to be powered up
for device to resume and handle the interrupt.

Entering a domain idle state saves power, only if the residency of the
idle state is met. Without knowing the idle duration of the domain, the
governor would just choose the deepest idle state that matches the QoS
requirements. The domain might be powered off just as the device is
expecting to wake up. If devices could inform runtime PM of their next
event, the parent PM domain's idle duration can be determined.

So let's add the pm_runtime_set_next_wake() API for the device to notify
runtime PM of the impending wakeup and document it's usage.

Signed-off-by: Lina Iyer <ilina@codeaurora.org>
---
Changes in v2:
	- Update documentation
	- Remove runtime PM enabled check
	- Update commit text
---
 Documentation/power/runtime_pm.rst | 17 +++++++++++++++++
 drivers/base/power/runtime.c       | 24 ++++++++++++++++++++++++
 include/linux/pm.h                 |  2 ++
 include/linux/pm_runtime.h         |  1 +
 4 files changed, 44 insertions(+)

Comments

Rafael J. Wysocki Oct. 16, 2020, 4:55 p.m. UTC | #1
On Thu, Oct 15, 2020 at 9:38 PM Lina Iyer <ilina@codeaurora.org> wrote:
>
> Some devices may have a predictable interrupt pattern while executing
> usecases. An example would be the VSYNC interrupt associated with
> display devices. A 60 Hz display could cause a interrupt every 16 ms. If
> the device were in a PM domain, the domain would need to be powered up
> for device to resume and handle the interrupt.
>
> Entering a domain idle state saves power, only if the residency of the
> idle state is met. Without knowing the idle duration of the domain, the
> governor would just choose the deepest idle state that matches the QoS
> requirements. The domain might be powered off just as the device is
> expecting to wake up. If devices could inform runtime PM of their next
> event, the parent PM domain's idle duration can be determined.
>
> So let's add the pm_runtime_set_next_wake() API for the device to notify
> runtime PM of the impending wakeup and document it's usage.
>
> Signed-off-by: Lina Iyer <ilina@codeaurora.org>
> ---
> Changes in v2:
>         - Update documentation
>         - Remove runtime PM enabled check
>         - Update commit text
> ---
>  Documentation/power/runtime_pm.rst | 17 +++++++++++++++++
>  drivers/base/power/runtime.c       | 24 ++++++++++++++++++++++++
>  include/linux/pm.h                 |  2 ++
>  include/linux/pm_runtime.h         |  1 +
>  4 files changed, 44 insertions(+)
>
> diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
> index 0553008b6279..f6aaef15a511 100644
> --- a/Documentation/power/runtime_pm.rst
> +++ b/Documentation/power/runtime_pm.rst
> @@ -515,6 +515,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
>        power.use_autosuspend isn't set, otherwise returns the expiration time
>        in jiffies
>
> +  `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
> +    - inform runtime PM of the next event on the device. Devices that are
> +      sensitive to their domain idle enter/exit latencies may provide this
> +      information for use by the PM domain governor. The domain governor would
> +      use this information to calculate it's sleep length.
> +
>  It is safe to execute the following helper functions from interrupt context:
>
>  - pm_request_idle()
> @@ -545,6 +551,7 @@ functions may also be used in interrupt context:
>  - pm_runtime_put_sync()
>  - pm_runtime_put_sync_suspend()
>  - pm_runtime_put_sync_autosuspend()
> +- pm_runtime_set_next_event()
>
>  5. Runtime PM Initialization, Device Probing and Removal
>  ========================================================
> @@ -639,6 +646,16 @@ suspend routine).  It may be necessary to resume the device and suspend it again
>  in order to do so.  The same is true if the driver uses different power levels
>  or other settings for runtime suspend and system sleep.
>
> +When a device enters idle at runtime, it may trigger the runtime PM up the
> +hierarchy and if device has a predictable interrupt pattern, we can even do a
> +better job at determining the parent's idle state. For example, a display
> +device gets a VSYNC interrupt every 16 ms when running at 60 Hz. When it's PM
> +domain is powering down and happens to be at the boundary of the VSYNC
> +interrupt, it may not be efficient to power off the domain. Knowing the next
> +wake up (when available) for devices in the domain we can determine the idle
> +duration of the domain. By comparing idle duration with the residencies of the
> +domain idle states, we can be efficient in both power and performance.
> +
>  During system resume, the simplest approach is to bring all devices back to full
>  power, even if they had been suspended before the system suspend began.  There
>  are several reasons for this, including:
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 8143210a5c54..5d2ebacfd35e 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -122,6 +122,27 @@ u64 pm_runtime_suspended_time(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
>
> +/**
> + * pm_runtime_set_next_wakeup_event - Notify PM framework of an impending event.
> + * @dev: Device to handle
> + * @next: impending interrupt/wakeup for the device
> + */
> +int pm_runtime_set_next_event(struct device *dev, ktime_t next)
> +{
> +       unsigned long flags;
> +       int ret = -EINVAL;
> +
> +       spin_lock_irqsave(&dev->power.lock, flags);
> +       if (ktime_before(ktime_get(), next)) {
> +               dev->power.next_event = next;
> +               ret = 0;
> +       }
> +       spin_unlock_irqrestore(&dev->power.lock, flags);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
> +
>  /**
>   * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
>   * @dev: Device to handle.
> @@ -1415,6 +1436,9 @@ void pm_runtime_enable(struct device *dev)
>              "Enabling runtime PM for inactive device (%s) with active children\n",
>              dev_name(dev));
>
> +       /* Reset the next wakeup for the device */
> +       dev->power.next_event = KTIME_MAX;
> +
>         spin_unlock_irqrestore(&dev->power.lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(pm_runtime_enable);
> diff --git a/include/linux/pm.h b/include/linux/pm.h
> index a30a4b54df52..9051658674a4 100644
> --- a/include/linux/pm.h
> +++ b/include/linux/pm.h
> @@ -8,6 +8,7 @@
>  #ifndef _LINUX_PM_H
>  #define _LINUX_PM_H
>
> +#include <linux/ktime.h>
>  #include <linux/list.h>
>  #include <linux/workqueue.h>
>  #include <linux/spinlock.h>
> @@ -616,6 +617,7 @@ struct dev_pm_info {
>         u64                     active_time;
>         u64                     suspended_time;
>         u64                     accounting_timestamp;
> +       ktime_t                 next_event;

While there are some cosmetic changes to be made, this particular bit
is fundamentally questionable IMV, because next_event (which BTW would
better be called next_wakeup IMO) is not used by PM-runtime.

The only user of it will be genpd AFAICS, so I don't quite see a
reason to inflict this extra memory cost on everybody, even if they
don't care about genpd and may not even compile it in.

>  #endif
>         struct pm_subsys_data   *subsys_data;  /* Owned by the subsystem. */
>         void (*set_latency_tolerance)(struct device *, s32);
> diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> index 6245caa18034..af6d35178335 100644
> --- a/include/linux/pm_runtime.h
> +++ b/include/linux/pm_runtime.h
> @@ -59,6 +59,7 @@ extern void pm_runtime_get_suppliers(struct device *dev);
>  extern void pm_runtime_put_suppliers(struct device *dev);
>  extern void pm_runtime_new_link(struct device *dev);
>  extern void pm_runtime_drop_link(struct device *dev);
> +extern int pm_runtime_set_next_event(struct device *dev, ktime_t next);
>
>  /**
>   * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter.
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>
Ulf Hansson Oct. 19, 2020, 10 a.m. UTC | #2
On Fri, 16 Oct 2020 at 18:55, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Oct 15, 2020 at 9:38 PM Lina Iyer <ilina@codeaurora.org> wrote:
> >
> > Some devices may have a predictable interrupt pattern while executing
> > usecases. An example would be the VSYNC interrupt associated with
> > display devices. A 60 Hz display could cause a interrupt every 16 ms. If
> > the device were in a PM domain, the domain would need to be powered up
> > for device to resume and handle the interrupt.
> >
> > Entering a domain idle state saves power, only if the residency of the
> > idle state is met. Without knowing the idle duration of the domain, the
> > governor would just choose the deepest idle state that matches the QoS
> > requirements. The domain might be powered off just as the device is
> > expecting to wake up. If devices could inform runtime PM of their next
> > event, the parent PM domain's idle duration can be determined.
> >
> > So let's add the pm_runtime_set_next_wake() API for the device to notify
> > runtime PM of the impending wakeup and document it's usage.
> >
> > Signed-off-by: Lina Iyer <ilina@codeaurora.org>
> > ---
> > Changes in v2:
> >         - Update documentation
> >         - Remove runtime PM enabled check
> >         - Update commit text
> > ---
> >  Documentation/power/runtime_pm.rst | 17 +++++++++++++++++
> >  drivers/base/power/runtime.c       | 24 ++++++++++++++++++++++++
> >  include/linux/pm.h                 |  2 ++
> >  include/linux/pm_runtime.h         |  1 +
> >  4 files changed, 44 insertions(+)
> >
> > diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
> > index 0553008b6279..f6aaef15a511 100644
> > --- a/Documentation/power/runtime_pm.rst
> > +++ b/Documentation/power/runtime_pm.rst
> > @@ -515,6 +515,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
> >        power.use_autosuspend isn't set, otherwise returns the expiration time
> >        in jiffies
> >
> > +  `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
> > +    - inform runtime PM of the next event on the device. Devices that are
> > +      sensitive to their domain idle enter/exit latencies may provide this
> > +      information for use by the PM domain governor. The domain governor would
> > +      use this information to calculate it's sleep length.
> > +
> >  It is safe to execute the following helper functions from interrupt context:
> >
> >  - pm_request_idle()
> > @@ -545,6 +551,7 @@ functions may also be used in interrupt context:
> >  - pm_runtime_put_sync()
> >  - pm_runtime_put_sync_suspend()
> >  - pm_runtime_put_sync_autosuspend()
> > +- pm_runtime_set_next_event()
> >
> >  5. Runtime PM Initialization, Device Probing and Removal
> >  ========================================================
> > @@ -639,6 +646,16 @@ suspend routine).  It may be necessary to resume the device and suspend it again
> >  in order to do so.  The same is true if the driver uses different power levels
> >  or other settings for runtime suspend and system sleep.
> >
> > +When a device enters idle at runtime, it may trigger the runtime PM up the
> > +hierarchy and if device has a predictable interrupt pattern, we can even do a
> > +better job at determining the parent's idle state. For example, a display
> > +device gets a VSYNC interrupt every 16 ms when running at 60 Hz. When it's PM
> > +domain is powering down and happens to be at the boundary of the VSYNC
> > +interrupt, it may not be efficient to power off the domain. Knowing the next
> > +wake up (when available) for devices in the domain we can determine the idle
> > +duration of the domain. By comparing idle duration with the residencies of the
> > +domain idle states, we can be efficient in both power and performance.
> > +
> >  During system resume, the simplest approach is to bring all devices back to full
> >  power, even if they had been suspended before the system suspend began.  There
> >  are several reasons for this, including:
> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > index 8143210a5c54..5d2ebacfd35e 100644
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -122,6 +122,27 @@ u64 pm_runtime_suspended_time(struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
> >
> > +/**
> > + * pm_runtime_set_next_wakeup_event - Notify PM framework of an impending event.
> > + * @dev: Device to handle
> > + * @next: impending interrupt/wakeup for the device
> > + */
> > +int pm_runtime_set_next_event(struct device *dev, ktime_t next)
> > +{
> > +       unsigned long flags;
> > +       int ret = -EINVAL;
> > +
> > +       spin_lock_irqsave(&dev->power.lock, flags);
> > +       if (ktime_before(ktime_get(), next)) {
> > +               dev->power.next_event = next;
> > +               ret = 0;
> > +       }
> > +       spin_unlock_irqrestore(&dev->power.lock, flags);
> > +
> > +       return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
> > +
> >  /**
> >   * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
> >   * @dev: Device to handle.
> > @@ -1415,6 +1436,9 @@ void pm_runtime_enable(struct device *dev)
> >              "Enabling runtime PM for inactive device (%s) with active children\n",
> >              dev_name(dev));
> >
> > +       /* Reset the next wakeup for the device */
> > +       dev->power.next_event = KTIME_MAX;
> > +
> >         spin_unlock_irqrestore(&dev->power.lock, flags);
> >  }
> >  EXPORT_SYMBOL_GPL(pm_runtime_enable);
> > diff --git a/include/linux/pm.h b/include/linux/pm.h
> > index a30a4b54df52..9051658674a4 100644
> > --- a/include/linux/pm.h
> > +++ b/include/linux/pm.h
> > @@ -8,6 +8,7 @@
> >  #ifndef _LINUX_PM_H
> >  #define _LINUX_PM_H
> >
> > +#include <linux/ktime.h>
> >  #include <linux/list.h>
> >  #include <linux/workqueue.h>
> >  #include <linux/spinlock.h>
> > @@ -616,6 +617,7 @@ struct dev_pm_info {
> >         u64                     active_time;
> >         u64                     suspended_time;
> >         u64                     accounting_timestamp;
> > +       ktime_t                 next_event;
>
> While there are some cosmetic changes to be made, this particular bit
> is fundamentally questionable IMV, because next_event (which BTW would
> better be called next_wakeup IMO) is not used by PM-runtime.
>
> The only user of it will be genpd AFAICS, so I don't quite see a
> reason to inflict this extra memory cost on everybody, even if they
> don't care about genpd and may not even compile it in.

That's a good point!

May I suggest that the new data is put into the "struct
generic_pm_domain_data" instead, which means it will be allocated when
a device is attached to a genpd.

Moreover, we should probably rename the API (and move the
implementation of it accordingly) from pm_runtime_set_next_event() to
dev_pm_genpd_set_next_wakeup(). Unless we believe the interface could
be useful for other PM domain types (ACPI ?), then we could consider
adding a ->set_next_wakeup() callback to the struct dev_pm_domain and
implement the interface through a common
dev_pm_domain_set_next_wakeup() API.

>
> >  #endif
> >         struct pm_subsys_data   *subsys_data;  /* Owned by the subsystem. */
> >         void (*set_latency_tolerance)(struct device *, s32);
> > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> > index 6245caa18034..af6d35178335 100644
> > --- a/include/linux/pm_runtime.h
> > +++ b/include/linux/pm_runtime.h
> > @@ -59,6 +59,7 @@ extern void pm_runtime_get_suppliers(struct device *dev);
> >  extern void pm_runtime_put_suppliers(struct device *dev);
> >  extern void pm_runtime_new_link(struct device *dev);
> >  extern void pm_runtime_drop_link(struct device *dev);
> > +extern int pm_runtime_set_next_event(struct device *dev, ktime_t next);
> >
> >  /**
> >   * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter.
> > --
> > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> > a Linux Foundation Collaborative Project
> >

Kind regards
Uffe
Rafael J. Wysocki Oct. 19, 2020, 10:21 a.m. UTC | #3
On Mon, Oct 19, 2020 at 12:01 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 16 Oct 2020 at 18:55, Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Oct 15, 2020 at 9:38 PM Lina Iyer <ilina@codeaurora.org> wrote:
> > >
> > > Some devices may have a predictable interrupt pattern while executing
> > > usecases. An example would be the VSYNC interrupt associated with
> > > display devices. A 60 Hz display could cause a interrupt every 16 ms. If
> > > the device were in a PM domain, the domain would need to be powered up
> > > for device to resume and handle the interrupt.
> > >
> > > Entering a domain idle state saves power, only if the residency of the
> > > idle state is met. Without knowing the idle duration of the domain, the
> > > governor would just choose the deepest idle state that matches the QoS
> > > requirements. The domain might be powered off just as the device is
> > > expecting to wake up. If devices could inform runtime PM of their next
> > > event, the parent PM domain's idle duration can be determined.
> > >
> > > So let's add the pm_runtime_set_next_wake() API for the device to notify
> > > runtime PM of the impending wakeup and document it's usage.
> > >
> > > Signed-off-by: Lina Iyer <ilina@codeaurora.org>
> > > ---
> > > Changes in v2:
> > >         - Update documentation
> > >         - Remove runtime PM enabled check
> > >         - Update commit text
> > > ---
> > >  Documentation/power/runtime_pm.rst | 17 +++++++++++++++++
> > >  drivers/base/power/runtime.c       | 24 ++++++++++++++++++++++++
> > >  include/linux/pm.h                 |  2 ++
> > >  include/linux/pm_runtime.h         |  1 +
> > >  4 files changed, 44 insertions(+)
> > >
> > > diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
> > > index 0553008b6279..f6aaef15a511 100644
> > > --- a/Documentation/power/runtime_pm.rst
> > > +++ b/Documentation/power/runtime_pm.rst
> > > @@ -515,6 +515,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
> > >        power.use_autosuspend isn't set, otherwise returns the expiration time
> > >        in jiffies
> > >
> > > +  `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
> > > +    - inform runtime PM of the next event on the device. Devices that are
> > > +      sensitive to their domain idle enter/exit latencies may provide this
> > > +      information for use by the PM domain governor. The domain governor would
> > > +      use this information to calculate it's sleep length.
> > > +
> > >  It is safe to execute the following helper functions from interrupt context:
> > >
> > >  - pm_request_idle()
> > > @@ -545,6 +551,7 @@ functions may also be used in interrupt context:
> > >  - pm_runtime_put_sync()
> > >  - pm_runtime_put_sync_suspend()
> > >  - pm_runtime_put_sync_autosuspend()
> > > +- pm_runtime_set_next_event()
> > >
> > >  5. Runtime PM Initialization, Device Probing and Removal
> > >  ========================================================
> > > @@ -639,6 +646,16 @@ suspend routine).  It may be necessary to resume the device and suspend it again
> > >  in order to do so.  The same is true if the driver uses different power levels
> > >  or other settings for runtime suspend and system sleep.
> > >
> > > +When a device enters idle at runtime, it may trigger the runtime PM up the
> > > +hierarchy and if device has a predictable interrupt pattern, we can even do a
> > > +better job at determining the parent's idle state. For example, a display
> > > +device gets a VSYNC interrupt every 16 ms when running at 60 Hz. When it's PM
> > > +domain is powering down and happens to be at the boundary of the VSYNC
> > > +interrupt, it may not be efficient to power off the domain. Knowing the next
> > > +wake up (when available) for devices in the domain we can determine the idle
> > > +duration of the domain. By comparing idle duration with the residencies of the
> > > +domain idle states, we can be efficient in both power and performance.
> > > +
> > >  During system resume, the simplest approach is to bring all devices back to full
> > >  power, even if they had been suspended before the system suspend began.  There
> > >  are several reasons for this, including:
> > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> > > index 8143210a5c54..5d2ebacfd35e 100644
> > > --- a/drivers/base/power/runtime.c
> > > +++ b/drivers/base/power/runtime.c
> > > @@ -122,6 +122,27 @@ u64 pm_runtime_suspended_time(struct device *dev)
> > >  }
> > >  EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
> > >
> > > +/**
> > > + * pm_runtime_set_next_wakeup_event - Notify PM framework of an impending event.
> > > + * @dev: Device to handle
> > > + * @next: impending interrupt/wakeup for the device
> > > + */
> > > +int pm_runtime_set_next_event(struct device *dev, ktime_t next)
> > > +{
> > > +       unsigned long flags;
> > > +       int ret = -EINVAL;
> > > +
> > > +       spin_lock_irqsave(&dev->power.lock, flags);
> > > +       if (ktime_before(ktime_get(), next)) {
> > > +               dev->power.next_event = next;
> > > +               ret = 0;
> > > +       }
> > > +       spin_unlock_irqrestore(&dev->power.lock, flags);
> > > +
> > > +       return ret;
> > > +}
> > > +EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
> > > +
> > >  /**
> > >   * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
> > >   * @dev: Device to handle.
> > > @@ -1415,6 +1436,9 @@ void pm_runtime_enable(struct device *dev)
> > >              "Enabling runtime PM for inactive device (%s) with active children\n",
> > >              dev_name(dev));
> > >
> > > +       /* Reset the next wakeup for the device */
> > > +       dev->power.next_event = KTIME_MAX;
> > > +
> > >         spin_unlock_irqrestore(&dev->power.lock, flags);
> > >  }
> > >  EXPORT_SYMBOL_GPL(pm_runtime_enable);
> > > diff --git a/include/linux/pm.h b/include/linux/pm.h
> > > index a30a4b54df52..9051658674a4 100644
> > > --- a/include/linux/pm.h
> > > +++ b/include/linux/pm.h
> > > @@ -8,6 +8,7 @@
> > >  #ifndef _LINUX_PM_H
> > >  #define _LINUX_PM_H
> > >
> > > +#include <linux/ktime.h>
> > >  #include <linux/list.h>
> > >  #include <linux/workqueue.h>
> > >  #include <linux/spinlock.h>
> > > @@ -616,6 +617,7 @@ struct dev_pm_info {
> > >         u64                     active_time;
> > >         u64                     suspended_time;
> > >         u64                     accounting_timestamp;
> > > +       ktime_t                 next_event;
> >
> > While there are some cosmetic changes to be made, this particular bit
> > is fundamentally questionable IMV, because next_event (which BTW would
> > better be called next_wakeup IMO) is not used by PM-runtime.
> >
> > The only user of it will be genpd AFAICS, so I don't quite see a
> > reason to inflict this extra memory cost on everybody, even if they
> > don't care about genpd and may not even compile it in.
>
> That's a good point!
>
> May I suggest that the new data is put into the "struct
> generic_pm_domain_data" instead, which means it will be allocated when
> a device is attached to a genpd.

Yes, something like that.

> Moreover, we should probably rename the API (and move the
> implementation of it accordingly) from pm_runtime_set_next_event() to
> dev_pm_genpd_set_next_wakeup().

Right.

> Unless we believe the interface could
> be useful for other PM domain types (ACPI ?), then we could consider
> adding a ->set_next_wakeup() callback to the struct dev_pm_domain and
> implement the interface through a common
> dev_pm_domain_set_next_wakeup() API.

Maybe.

That would depend on who the other user would be and I wouldn't worry
about that upfront.

Cheers!
Lina Iyer Oct. 19, 2020, 5:02 p.m. UTC | #4
On Mon, Oct 19 2020 at 04:21 -0600, Rafael J. Wysocki wrote:
>On Mon, Oct 19, 2020 at 12:01 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>
>> On Fri, 16 Oct 2020 at 18:55, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> >
>> > On Thu, Oct 15, 2020 at 9:38 PM Lina Iyer <ilina@codeaurora.org> wrote:
>> > >
>> > > Some devices may have a predictable interrupt pattern while executing
>> > > usecases. An example would be the VSYNC interrupt associated with
>> > > display devices. A 60 Hz display could cause a interrupt every 16 ms. If
>> > > the device were in a PM domain, the domain would need to be powered up
>> > > for device to resume and handle the interrupt.
>> > >
>> > > Entering a domain idle state saves power, only if the residency of the
>> > > idle state is met. Without knowing the idle duration of the domain, the
>> > > governor would just choose the deepest idle state that matches the QoS
>> > > requirements. The domain might be powered off just as the device is
>> > > expecting to wake up. If devices could inform runtime PM of their next
>> > > event, the parent PM domain's idle duration can be determined.
>> > >
>> > > So let's add the pm_runtime_set_next_wake() API for the device to notify
>> > > runtime PM of the impending wakeup and document it's usage.
>> > >
>> > > Signed-off-by: Lina Iyer <ilina@codeaurora.org>
>> > > ---
>> > > Changes in v2:
>> > >         - Update documentation
>> > >         - Remove runtime PM enabled check
>> > >         - Update commit text
>> > > ---
>> > >  Documentation/power/runtime_pm.rst | 17 +++++++++++++++++
>> > >  drivers/base/power/runtime.c       | 24 ++++++++++++++++++++++++
>> > >  include/linux/pm.h                 |  2 ++
>> > >  include/linux/pm_runtime.h         |  1 +
>> > >  4 files changed, 44 insertions(+)
>> > >
>> > > diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
>> > > index 0553008b6279..f6aaef15a511 100644
>> > > --- a/Documentation/power/runtime_pm.rst
>> > > +++ b/Documentation/power/runtime_pm.rst
>> > > @@ -515,6 +515,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
>> > >        power.use_autosuspend isn't set, otherwise returns the expiration time
>> > >        in jiffies
>> > >
>> > > +  `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
>> > > +    - inform runtime PM of the next event on the device. Devices that are
>> > > +      sensitive to their domain idle enter/exit latencies may provide this
>> > > +      information for use by the PM domain governor. The domain governor would
>> > > +      use this information to calculate it's sleep length.
>> > > +
>> > >  It is safe to execute the following helper functions from interrupt context:
>> > >
>> > >  - pm_request_idle()
>> > > @@ -545,6 +551,7 @@ functions may also be used in interrupt context:
>> > >  - pm_runtime_put_sync()
>> > >  - pm_runtime_put_sync_suspend()
>> > >  - pm_runtime_put_sync_autosuspend()
>> > > +- pm_runtime_set_next_event()
>> > >
>> > >  5. Runtime PM Initialization, Device Probing and Removal
>> > >  ========================================================
>> > > @@ -639,6 +646,16 @@ suspend routine).  It may be necessary to resume the device and suspend it again
>> > >  in order to do so.  The same is true if the driver uses different power levels
>> > >  or other settings for runtime suspend and system sleep.
>> > >
>> > > +When a device enters idle at runtime, it may trigger the runtime PM up the
>> > > +hierarchy and if device has a predictable interrupt pattern, we can even do a
>> > > +better job at determining the parent's idle state. For example, a display
>> > > +device gets a VSYNC interrupt every 16 ms when running at 60 Hz. When it's PM
>> > > +domain is powering down and happens to be at the boundary of the VSYNC
>> > > +interrupt, it may not be efficient to power off the domain. Knowing the next
>> > > +wake up (when available) for devices in the domain we can determine the idle
>> > > +duration of the domain. By comparing idle duration with the residencies of the
>> > > +domain idle states, we can be efficient in both power and performance.
>> > > +
>> > >  During system resume, the simplest approach is to bring all devices back to full
>> > >  power, even if they had been suspended before the system suspend began.  There
>> > >  are several reasons for this, including:
>> > > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
>> > > index 8143210a5c54..5d2ebacfd35e 100644
>> > > --- a/drivers/base/power/runtime.c
>> > > +++ b/drivers/base/power/runtime.c
>> > > @@ -122,6 +122,27 @@ u64 pm_runtime_suspended_time(struct device *dev)
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
>> > >
>> > > +/**
>> > > + * pm_runtime_set_next_wakeup_event - Notify PM framework of an impending event.
>> > > + * @dev: Device to handle
>> > > + * @next: impending interrupt/wakeup for the device
>> > > + */
>> > > +int pm_runtime_set_next_event(struct device *dev, ktime_t next)
>> > > +{
>> > > +       unsigned long flags;
>> > > +       int ret = -EINVAL;
>> > > +
>> > > +       spin_lock_irqsave(&dev->power.lock, flags);
>> > > +       if (ktime_before(ktime_get(), next)) {
>> > > +               dev->power.next_event = next;
>> > > +               ret = 0;
>> > > +       }
>> > > +       spin_unlock_irqrestore(&dev->power.lock, flags);
>> > > +
>> > > +       return ret;
>> > > +}
>> > > +EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
>> > > +
>> > >  /**
>> > >   * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
>> > >   * @dev: Device to handle.
>> > > @@ -1415,6 +1436,9 @@ void pm_runtime_enable(struct device *dev)
>> > >              "Enabling runtime PM for inactive device (%s) with active children\n",
>> > >              dev_name(dev));
>> > >
>> > > +       /* Reset the next wakeup for the device */
>> > > +       dev->power.next_event = KTIME_MAX;
>> > > +
>> > >         spin_unlock_irqrestore(&dev->power.lock, flags);
>> > >  }
>> > >  EXPORT_SYMBOL_GPL(pm_runtime_enable);
>> > > diff --git a/include/linux/pm.h b/include/linux/pm.h
>> > > index a30a4b54df52..9051658674a4 100644
>> > > --- a/include/linux/pm.h
>> > > +++ b/include/linux/pm.h
>> > > @@ -8,6 +8,7 @@
>> > >  #ifndef _LINUX_PM_H
>> > >  #define _LINUX_PM_H
>> > >
>> > > +#include <linux/ktime.h>
>> > >  #include <linux/list.h>
>> > >  #include <linux/workqueue.h>
>> > >  #include <linux/spinlock.h>
>> > > @@ -616,6 +617,7 @@ struct dev_pm_info {
>> > >         u64                     active_time;
>> > >         u64                     suspended_time;
>> > >         u64                     accounting_timestamp;
>> > > +       ktime_t                 next_event;
>> >
>> > While there are some cosmetic changes to be made, this particular bit
>> > is fundamentally questionable IMV, because next_event (which BTW would
>> > better be called next_wakeup IMO) is not used by PM-runtime.
>> >
>> > The only user of it will be genpd AFAICS, so I don't quite see a
>> > reason to inflict this extra memory cost on everybody, even if they
>> > don't care about genpd and may not even compile it in.
>>
>> That's a good point!
>>
>> May I suggest that the new data is put into the "struct
>> generic_pm_domain_data" instead, which means it will be allocated when
>> a device is attached to a genpd.
>
>Yes, something like that.
>
>> Moreover, we should probably rename the API (and move the
>> implementation of it accordingly) from pm_runtime_set_next_event() to
>> dev_pm_genpd_set_next_wakeup().
>
>Right.
>
Thanks, both of you for the suggestions. I will send an update soon.

--Lina

>> Unless we believe the interface could
>> be useful for other PM domain types (ACPI ?), then we could consider
>> adding a ->set_next_wakeup() callback to the struct dev_pm_domain and
>> implement the interface through a common
>> dev_pm_domain_set_next_wakeup() API.
>
>Maybe.
>
>That would depend on who the other user would be and I wouldn't worry
>about that upfront.
>
>Cheers!
diff mbox series

Patch

diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
index 0553008b6279..f6aaef15a511 100644
--- a/Documentation/power/runtime_pm.rst
+++ b/Documentation/power/runtime_pm.rst
@@ -515,6 +515,12 @@  drivers/base/power/runtime.c and include/linux/pm_runtime.h:
       power.use_autosuspend isn't set, otherwise returns the expiration time
       in jiffies
 
+  `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
+    - inform runtime PM of the next event on the device. Devices that are
+      sensitive to their domain idle enter/exit latencies may provide this
+      information for use by the PM domain governor. The domain governor would
+      use this information to calculate it's sleep length.
+
 It is safe to execute the following helper functions from interrupt context:
 
 - pm_request_idle()
@@ -545,6 +551,7 @@  functions may also be used in interrupt context:
 - pm_runtime_put_sync()
 - pm_runtime_put_sync_suspend()
 - pm_runtime_put_sync_autosuspend()
+- pm_runtime_set_next_event()
 
 5. Runtime PM Initialization, Device Probing and Removal
 ========================================================
@@ -639,6 +646,16 @@  suspend routine).  It may be necessary to resume the device and suspend it again
 in order to do so.  The same is true if the driver uses different power levels
 or other settings for runtime suspend and system sleep.
 
+When a device enters idle at runtime, it may trigger the runtime PM up the
+hierarchy and if device has a predictable interrupt pattern, we can even do a
+better job at determining the parent's idle state. For example, a display
+device gets a VSYNC interrupt every 16 ms when running at 60 Hz. When it's PM
+domain is powering down and happens to be at the boundary of the VSYNC
+interrupt, it may not be efficient to power off the domain. Knowing the next
+wake up (when available) for devices in the domain we can determine the idle
+duration of the domain. By comparing idle duration with the residencies of the
+domain idle states, we can be efficient in both power and performance.
+
 During system resume, the simplest approach is to bring all devices back to full
 power, even if they had been suspended before the system suspend began.  There
 are several reasons for this, including:
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 8143210a5c54..5d2ebacfd35e 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -122,6 +122,27 @@  u64 pm_runtime_suspended_time(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
 
+/**
+ * pm_runtime_set_next_wakeup_event - Notify PM framework of an impending event.
+ * @dev: Device to handle
+ * @next: impending interrupt/wakeup for the device
+ */
+int pm_runtime_set_next_event(struct device *dev, ktime_t next)
+{
+	unsigned long flags;
+	int ret = -EINVAL;
+
+	spin_lock_irqsave(&dev->power.lock, flags);
+	if (ktime_before(ktime_get(), next)) {
+		dev->power.next_event = next;
+		ret = 0;
+	}
+	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
+
 /**
  * pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
  * @dev: Device to handle.
@@ -1415,6 +1436,9 @@  void pm_runtime_enable(struct device *dev)
 	     "Enabling runtime PM for inactive device (%s) with active children\n",
 	     dev_name(dev));
 
+	/* Reset the next wakeup for the device */
+	dev->power.next_event = KTIME_MAX;
+
 	spin_unlock_irqrestore(&dev->power.lock, flags);
 }
 EXPORT_SYMBOL_GPL(pm_runtime_enable);
diff --git a/include/linux/pm.h b/include/linux/pm.h
index a30a4b54df52..9051658674a4 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -8,6 +8,7 @@ 
 #ifndef _LINUX_PM_H
 #define _LINUX_PM_H
 
+#include <linux/ktime.h>
 #include <linux/list.h>
 #include <linux/workqueue.h>
 #include <linux/spinlock.h>
@@ -616,6 +617,7 @@  struct dev_pm_info {
 	u64			active_time;
 	u64			suspended_time;
 	u64			accounting_timestamp;
+	ktime_t			next_event;
 #endif
 	struct pm_subsys_data	*subsys_data;  /* Owned by the subsystem. */
 	void (*set_latency_tolerance)(struct device *, s32);
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index 6245caa18034..af6d35178335 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -59,6 +59,7 @@  extern void pm_runtime_get_suppliers(struct device *dev);
 extern void pm_runtime_put_suppliers(struct device *dev);
 extern void pm_runtime_new_link(struct device *dev);
 extern void pm_runtime_drop_link(struct device *dev);
+extern int pm_runtime_set_next_event(struct device *dev, ktime_t next);
 
 /**
  * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter.