Message ID | 4874693.GXAFRqVoOG@kreacher (mailing list archive)
---|---
State | Mainlined, archived
Series | PM: sleep: Fix possible device suspend-resume deadlocks
On Wed, Dec 27, 2023 at 09:38:23PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> In preparation for subsequent changes, introduce a specialized variant
> of async_schedule_dev() that will not invoke the argument function
> synchronously when it cannot be scheduled for asynchronous execution.
>
> The new function, async_schedule_dev_nocall(), will be used for fixing
> possible deadlocks in the system-wide power management core code.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/base/power/main.c |   12 ++++++++----
>  include/linux/async.h     |    2 ++
>  kernel/async.c            |   29 +++++++++++++++++++++++++++++
>  3 files changed, 39 insertions(+), 4 deletions(-)
>
> Index: linux-pm/kernel/async.c
> ===================================================================
> --- linux-pm.orig/kernel/async.c
> +++ linux-pm/kernel/async.c
> @@ -244,6 +244,35 @@ async_cookie_t async_schedule_node(async
>  EXPORT_SYMBOL_GPL(async_schedule_node);
>
>  /**
> + * async_schedule_dev_nocall - A simplified variant of async_schedule_dev()
> + * @func: function to execute asynchronously
> + * @dev: device argument to be passed to function
> + *
> + * @dev is used as both the argument for the function and to provide NUMA
> + * context for where to run the function.
> + *
> + * If the asynchronous execution of @func is scheduled successfully, return
> + * true. Otherwise, do nothing and return false, unlike async_schedule_dev()
> + * that will run the function synchronously then.
> + */
> +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> +{
> +	struct async_entry *entry;
> +
> +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);

Is GFP_KERNEL intended here? I think it's not safe since it will
be called from device_resume_noirq().

Regards
Stanislaw
On Fri, Dec 29, 2023 at 02:37:36PM +0100, Rafael J. Wysocki wrote:
> > > +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> > > +{
> > > +	struct async_entry *entry;
> > > +
> > > +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
> >
> > Is GFP_KERNEL intended here?
>
> Yes, it is.
>
> PM will be the only user of this, at least for now, and it all runs in
> process context.
>
> > I think it's not safe since it will
> > be called from device_resume_noirq().
>
> device_resume_noirq() runs in process context too.
>
> The name is somewhat confusing (sorry about that) and it means that
> hardirq handlers (for the majority of IRQs) don't run in that resume
> phase, but interrupts are enabled locally on all CPUs (this is
> required for wakeup handling, among other things).

Then my concern would be: what if disk devices are among the devices
with disabled IRQs? It seems there are, and because a GFP_KERNEL
allocation can start reclaiming memory by doing disk I/O (writing
dirty pages, for example), reclaim cannot finish while the disk
driver's interrupts are disabled.

I do not see how such a possibly infinite wait for disk I/O is
prevented here; did I miss something?

Regards
Stanislaw
On Fri, Dec 29, 2023 at 8:02 AM Stanislaw Gruszka
<stanislaw.gruszka@linux.intel.com> wrote:
>
> On Wed, Dec 27, 2023 at 09:38:23PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > In preparation for subsequent changes, introduce a specialized variant
> > of async_schedule_dev() that will not invoke the argument function
> > synchronously when it cannot be scheduled for asynchronous execution.
> >
> > The new function, async_schedule_dev_nocall(), will be used for fixing
> > possible deadlocks in the system-wide power management core code.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/base/power/main.c |   12 ++++++++----
> >  include/linux/async.h     |    2 ++
> >  kernel/async.c            |   29 +++++++++++++++++++++++++++++
> >  3 files changed, 39 insertions(+), 4 deletions(-)
> >
> > Index: linux-pm/kernel/async.c
> > ===================================================================
> > --- linux-pm.orig/kernel/async.c
> > +++ linux-pm/kernel/async.c
> > @@ -244,6 +244,35 @@ async_cookie_t async_schedule_node(async
> >  EXPORT_SYMBOL_GPL(async_schedule_node);
> >
> >  /**
> > + * async_schedule_dev_nocall - A simplified variant of async_schedule_dev()
> > + * @func: function to execute asynchronously
> > + * @dev: device argument to be passed to function
> > + *
> > + * @dev is used as both the argument for the function and to provide NUMA
> > + * context for where to run the function.
> > + *
> > + * If the asynchronous execution of @func is scheduled successfully, return
> > + * true. Otherwise, do nothing and return false, unlike async_schedule_dev()
> > + * that will run the function synchronously then.
> > + */
> > +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> > +{
> > +	struct async_entry *entry;
> > +
> > +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
>
> Is GFP_KERNEL intended here?

Yes, it is.

PM will be the only user of this, at least for now, and it all runs in
process context.

> I think it's not safe since it will
> be called from device_resume_noirq().

device_resume_noirq() runs in process context too.

The name is somewhat confusing (sorry about that) and it means that
hardirq handlers (for the majority of IRQs) don't run in that resume
phase, but interrupts are enabled locally on all CPUs (this is
required for wakeup handling, among other things).
On Fri, Dec 29, 2023 at 3:54 PM Stanislaw Gruszka
<stanislaw.gruszka@linux.intel.com> wrote:
>
> On Fri, Dec 29, 2023 at 02:37:36PM +0100, Rafael J. Wysocki wrote:
> > > > +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> > > > +{
> > > > +	struct async_entry *entry;
> > > > +
> > > > +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
> > >
> > > Is GFP_KERNEL intended here?
> >
> > Yes, it is.
> >
> > PM will be the only user of this, at least for now, and it all runs in
> > process context.
> >
> > > I think it's not safe since it will
> > > be called from device_resume_noirq().
> >
> > device_resume_noirq() runs in process context too.
> >
> > The name is somewhat confusing (sorry about that) and it means that
> > hardirq handlers (for the majority of IRQs) don't run in that resume
> > phase, but interrupts are enabled locally on all CPUs (this is
> > required for wakeup handling, among other things).
>
> Then my concern would be: what if disk devices are among the devices
> with disabled IRQs? It seems there are, and because a GFP_KERNEL
> allocation can start reclaiming memory by doing disk I/O (writing
> dirty pages, for example), reclaim cannot finish while the disk
> driver's interrupts are disabled.
>
> I do not see how such a possibly infinite wait for disk I/O is
> prevented here; did I miss something?

Well, it is not a concern, because the suspend code already prevents
the mm subsystem from trying too hard to find free memory. See the
pm_restrict_gfp_mask() call in enter_state().

Otherwise, it would have been a problem for any GFP_KERNEL allocations
made during system-wide suspend-resume, not just in the _noirq phases.
On Fri, Dec 29, 2023 at 05:36:01PM +0100, Rafael J. Wysocki wrote:
> On Fri, Dec 29, 2023 at 3:54 PM Stanislaw Gruszka
> <stanislaw.gruszka@linux.intel.com> wrote:
> >
> > On Fri, Dec 29, 2023 at 02:37:36PM +0100, Rafael J. Wysocki wrote:
> > > > > +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> > > > > +{
> > > > > +	struct async_entry *entry;
> > > > > +
> > > > > +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
> > > >
> > > > Is GFP_KERNEL intended here?
> > >
> > > Yes, it is.
> > >
> > > PM will be the only user of this, at least for now, and it all runs in
> > > process context.
> > >
> > > > I think it's not safe since it will
> > > > be called from device_resume_noirq().
> > >
> > > device_resume_noirq() runs in process context too.
> > >
> > > The name is somewhat confusing (sorry about that) and it means that
> > > hardirq handlers (for the majority of IRQs) don't run in that resume
> > > phase, but interrupts are enabled locally on all CPUs (this is
> > > required for wakeup handling, among other things).
> >
> > Then my concern would be: what if disk devices are among the devices
> > with disabled IRQs? It seems there are, and because a GFP_KERNEL
> > allocation can start reclaiming memory by doing disk I/O (writing
> > dirty pages, for example), reclaim cannot finish while the disk
> > driver's interrupts are disabled.
> >
> > I do not see how such a possibly infinite wait for disk I/O is
> > prevented here; did I miss something?
>
> Well, it is not a concern, because the suspend code already prevents
> the mm subsystem from trying too hard to find free memory. See the
> pm_restrict_gfp_mask() call in enter_state().

That I had missed :-) Thanks for the explanations.

Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
for the series.
On Tue, Jan 2, 2024 at 8:10 AM Stanislaw Gruszka
<stanislaw.gruszka@linux.intel.com> wrote:
>
> On Fri, Dec 29, 2023 at 05:36:01PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Dec 29, 2023 at 3:54 PM Stanislaw Gruszka
> > <stanislaw.gruszka@linux.intel.com> wrote:
> > >
> > > On Fri, Dec 29, 2023 at 02:37:36PM +0100, Rafael J. Wysocki wrote:
> > > > > > +bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
> > > > > > +{
> > > > > > +	struct async_entry *entry;
> > > > > > +
> > > > > > +	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
> > > > >
> > > > > Is GFP_KERNEL intended here?
> > > >
> > > > Yes, it is.
> > > >
> > > > PM will be the only user of this, at least for now, and it all runs in
> > > > process context.
> > > >
> > > > > I think it's not safe since it will
> > > > > be called from device_resume_noirq().
> > > >
> > > > device_resume_noirq() runs in process context too.
> > > >
> > > > The name is somewhat confusing (sorry about that) and it means that
> > > > hardirq handlers (for the majority of IRQs) don't run in that resume
> > > > phase, but interrupts are enabled locally on all CPUs (this is
> > > > required for wakeup handling, among other things).
> > >
> > > Then my concern would be: what if disk devices are among the devices
> > > with disabled IRQs? It seems there are, and because a GFP_KERNEL
> > > allocation can start reclaiming memory by doing disk I/O (writing
> > > dirty pages, for example), reclaim cannot finish while the disk
> > > driver's interrupts are disabled.
> > >
> > > I do not see how such a possibly infinite wait for disk I/O is
> > > prevented here; did I miss something?
> >
> > Well, it is not a concern, because the suspend code already prevents
> > the mm subsystem from trying too hard to find free memory. See the
> > pm_restrict_gfp_mask() call in enter_state().
>
> That I had missed :-) Thanks for the explanations.
>
> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
> for the series.

Thank you!
Index: linux-pm/kernel/async.c
===================================================================
--- linux-pm.orig/kernel/async.c
+++ linux-pm/kernel/async.c
@@ -244,6 +244,35 @@ async_cookie_t async_schedule_node(async
 EXPORT_SYMBOL_GPL(async_schedule_node);
 
 /**
+ * async_schedule_dev_nocall - A simplified variant of async_schedule_dev()
+ * @func: function to execute asynchronously
+ * @dev: device argument to be passed to function
+ *
+ * @dev is used as both the argument for the function and to provide NUMA
+ * context for where to run the function.
+ *
+ * If the asynchronous execution of @func is scheduled successfully, return
+ * true. Otherwise, do nothing and return false, unlike async_schedule_dev()
+ * that will run the function synchronously then.
+ */
+bool async_schedule_dev_nocall(async_func_t func, struct device *dev)
+{
+	struct async_entry *entry;
+
+	entry = kzalloc(sizeof(struct async_entry), GFP_KERNEL);
+
+	/* Give up if there is no memory or too much work. */
+	if (!entry || atomic_read(&entry_count) > MAX_WORK) {
+		kfree(entry);
+		return false;
+	}
+
+	__async_schedule_node_domain(func, dev, dev_to_node(dev),
+				     &async_dfl_domain, entry);
+	return true;
+}
+
+/**
  * async_synchronize_full - synchronize all asynchronous function calls
  *
  * This function waits until all asynchronous function calls have been done.
Index: linux-pm/include/linux/async.h
===================================================================
--- linux-pm.orig/include/linux/async.h
+++ linux-pm/include/linux/async.h
@@ -90,6 +90,8 @@ async_schedule_dev(async_func_t func, st
 	return async_schedule_node(func, dev, dev_to_node(dev));
 }
 
+bool async_schedule_dev_nocall(async_func_t func, struct device *dev);
+
 /**
  * async_schedule_dev_domain - A device specific version of async_schedule_domain
  * @func: function to execute asynchronously