Message ID | 200908262221.09224.rjw@sisk.pl (mailing list archive) |
---|---|
State | Superseded, archived |
On Wed, 26 Aug 2009, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> Theoretically, the total time of system sleep transitions (suspend
> to RAM, hibernation) can be reduced by running suspend and resume
> callbacks of device drivers in parallel with each other. However,
> there are dependencies between devices such that, for example, we may
> not be allowed to put one device into a low power state before
> another one has been suspended (e.g. we cannot suspend a bridge
> before suspending all devices behind it). In particular, we're not
> allowed to suspend the parent of a device before suspending the
> device itself. Analogously, we're not allowed to resume a device
> before resuming its parent.
>
> In this version of the patch the async threads started to execute
> the resume callbacks of specific devices don't exit immediately having
> done that, but search dpm_list for devices whose PM dependencies have
> already been satisfied and execute their callbacks without waiting.

Given this design, why bother to invoke device_resume() for the async
devices? Why not just start up a bunch of async threads, each of which
calls async_resume() repeatedly until everything is finished? (And
rearrange async_resume() to scan the list first and do the actual
resume second.)

The same goes for the noirq versions.

Alan Stern
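For readers following the design being discussed, here is a minimal sketch of the approach the patch takes, as described in the quoted changelog: each async thread handles the device it was scheduled for and then walks dpm_list once, picking up any async device whose dependencies are already satisfied. This is a simplified illustration only (locking, reference counting and error handling are omitted); the helper names mirror those used in the patch at the end of this thread.

```c
/*
 * Simplified sketch only -- not the actual patch.  Locking, device
 * reference counting and error handling are left out.
 */
static void async_resume_sketch(void *data, async_cookie_t cookie)
{
	struct device *dev = data;

	/* Block until all "masters" (e.g. the parent) have been resumed. */
	device_pm_wait_for_masters(dev);

	/* Resume this device; this sets power.op_complete and wakes waiters. */
	__device_resume(dev, pm_transition);

	/* Walk the rest of dpm_list once, handling devices "out of order". */
	list_for_each_entry_continue(dev, &dpm_list, power.entry) {
		if (!dev->power.async_suspend)
			continue;		/* left to the main thread */
		if (device_pm_check_masters(dev))
			continue;		/* a dependency isn't finished yet */
		if (pm_op_started(dev))
			continue;		/* claimed by another thread */
		__device_resume(dev, pm_transition);
	}
}
```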
On Friday 28 August 2009, Alan Stern wrote:
> On Wed, 26 Aug 2009, Rafael J. Wysocki wrote:
>
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >
> > Theoretically, the total time of system sleep transitions (suspend
> > to RAM, hibernation) can be reduced by running suspend and resume
> > callbacks of device drivers in parallel with each other. However,
> > there are dependencies between devices such that, for example, we may
> > not be allowed to put one device into a low power state before
> > another one has been suspended (e.g. we cannot suspend a bridge
> > before suspending all devices behind it). In particular, we're not
> > allowed to suspend the parent of a device before suspending the
> > device itself. Analogously, we're not allowed to resume a device
> > before resuming its parent.
> >
> > In this version of the patch the async threads started to execute
> > the resume callbacks of specific devices don't exit immediately having
> > done that, but search dpm_list for devices whose PM dependencies have
> > already been satisfied and execute their callbacks without waiting.
>
> Given this design, why bother to invoke device_resume() for the async
> devices? Why not just start up a bunch of async threads, each of which
> calls async_resume() repeatedly until everything is finished? (And
> rearrange async_resume() to scan the list first and do the actual
> resume second.)
>
> The same goes for the noirq versions.

I thought about that, but there are a few things to figure out:
- how many threads to start
- when to start them
- stop condition

I had a few ideas, but then I thought it would be simpler to start an async
thread when we know there's some async work to do (ie. there's an async
device to handle) and make each async thread browse the list just once (the
rationale is that we've just handled a device, so there's a chance there are
some devices with satisfied dependencies down the list).

Thanks,
Rafael
On Fri, 28 Aug 2009, Rafael J. Wysocki wrote:

> > Given this design, why bother to invoke device_resume() for the async
> > devices? Why not just start up a bunch of async threads, each of which
> > calls async_resume() repeatedly until everything is finished? (And
> > rearrange async_resume() to scan the list first and do the actual
> > resume second.)
> >
> > The same goes for the noirq versions.
>
> I thought about that, but there are a few things to figure out:
> - how many threads to start

That's a tough question. Right now you start roughly as many threads
as there are async devices. That seems like overkill.

I would expect that a reasonably small number of threads would suffice
to achieve most of the possible time savings. Something on the order
of 10 should work well. If the majority of the time is spent
handling N devices then N+1 threads would be enough. Judging from some
of the comments posted earlier, even 4 threads would give a big
advantage.

> - when to start them

You might as well start them at the beginning of dpm_resume and
dpm_resume_noirq. That way they can overlap with the synchronous
operations.

> - stop condition

When an error occurs or when op_started has been set for every
remaining async device.

> I had a few ideas, but then I thought it would be simpler to start an async
> thread when we know there's some async work to do (ie. there's an async
> device to handle) and make each async thread browse the list just once (the
> rationale is that we've just handled a device, so there's a chance there are
> some devices with satisfied dependencies down the list).

It comes down to this: Should there be many threads, each of which
browses the list only once, or should there be a few threads, each of
which browses the list many times?

Alan Stern
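To make the alternative concrete, the following is a rough sketch (not an implementation) of the "few threads, many passes" model being proposed here: a small fixed pool of workers, each of which rescans dpm_list until every remaining async device has been started or an error occurs. The worker function and the all_async_ops_started() helper are hypothetical names, used only for illustration.

```c
/*
 * Rough sketch of the "few threads, many passes" model -- illustrative
 * only, assuming it lives in drivers/base/power/main.c next to the
 * helpers from the patch.  all_async_ops_started() is a hypothetical
 * helper that would report whether op_started is set for every
 * remaining async device.
 */
static int async_resume_worker(void *unused)
{
	for (;;) {
		struct device *dev, *next = NULL;

		mutex_lock(&dpm_list_mtx);
		list_for_each_entry(dev, &dpm_list, power.entry) {
			if (!dev->power.async_suspend)
				continue;
			if (device_pm_check_masters(dev))
				continue;	/* dependencies not finished yet */
			if (pm_op_started(dev))
				continue;	/* claimed by another worker */
			next = dev;
			break;
		}
		mutex_unlock(&dpm_list_mtx);

		if (next) {
			__device_resume(next, pm_transition);
			continue;
		}

		/* Stop condition suggested above. */
		if (all_async_ops_started())
			break;

		/* A real version would sleep on a waitqueue here instead
		 * of rescanning immediately. */
		cond_resched();
	}
	return 0;
}
```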
On Friday 28 August 2009, Alan Stern wrote:
> On Fri, 28 Aug 2009, Rafael J. Wysocki wrote:
>
> > > Given this design, why bother to invoke device_resume() for the async
> > > devices? Why not just start up a bunch of async threads, each of which
> > > calls async_resume() repeatedly until everything is finished? (And
> > > rearrange async_resume() to scan the list first and do the actual
> > > resume second.)
> > >
> > > The same goes for the noirq versions.
> >
> > I thought about that, but there are a few things to figure out:
> > - how many threads to start
>
> That's a tough question. Right now you start roughly as many threads
> as there are async devices. That seems like overkill.

In fact they are substantially fewer than that, for the following reasons.

First, the async framework will not start more than MAX_THREADS threads,
which is 256 at the moment. This number is less than the number of async
devices to handle on an average system.

Second, no new async threads are started while the main thread is handling
the sync devices, so the existing threads have a chance to do their job. If
there's a "cluster" of sync devices in dpm_list, the number of async threads
running is likely to drop rapidly while those devices are being handled.
(BTW, if there were no sync devices, the whole thing would be much simpler,
but I don't think it's realistic to assume we'll be able to get rid of them
any time soon).

Finally, but not least importantly, async threads are not started for the
async devices that were previously handled "out of order" by the already
running async threads (or by async threads that have already finished). My
testing shows that there are quite a few of them on the average. For example,
on the HP nx6325 typically there are as many as 580 async devices handled
"out of order" during a _single_ suspend-resume cycle (including the "early"
and "late" phases), while only a few (below 10) devices are waited for by at
least one async thread.

I can try to monitor the number of async threads started if you're interested.

> I would expect that a reasonably small number of threads would suffice
> to achieve most of the possible time savings. Something on the order
> of 10 should work well. If the majority of the time is spent
> handling N devices then N+1 threads would be enough. Judging from some
> of the comments posted earlier, even 4 threads would give a big
> advantage.

That unfortunately is not the case with the set of async devices including
PCI, ACPI and serio devices only. The average time savings are between 5%
and 14%, depending on the system and the phase of the cycle (the relative
savings are typically greater for suspend). Still, that amounts to 0.5 s in
some cases.

> > - when to start them
>
> You might as well start them at the beginning of dpm_resume and
> dpm_resume_noirq. That way they can overlap with the synchronous
> operations.

In that case they would have to wait in the beginning, so I'd need a
mechanism to wake them up.

Alternatively, there could be a limit to the number of async threads started
within the current design, but I'd prefer to leave that to the async framework
(namely, if MAX_THREADS makes sense for boot, it's also likely to make sense
for PM).

> > - stop condition
>
> When an error occurs or when op_started has been set for every
> remaining async device.

Yeah, that's the easy one. :-)

> > I had a few ideas, but then I thought it would be simpler to start an async
> > thread when we know there's some async work to do (ie. there's an async
> > device to handle) and make each async thread browse the list just once (the
> > rationale is that we've just handled a device, so there's a chance there are
> > some devices with satisfied dependencies down the list).
>
> It comes down to this: Should there be many threads, each of which
> browses the list only once, or should there be a few threads, each of
> which browses the list many times?

Well, quite obviously I prefer the many threads version. :-)

Thanks,
Rafael
On Sat, 29 Aug 2009, Rafael J. Wysocki wrote:

> On Friday 28 August 2009, Alan Stern wrote:
> > On Fri, 28 Aug 2009, Rafael J. Wysocki wrote:
> >
> > > > Given this design, why bother to invoke device_resume() for the async
> > > > devices? Why not just start up a bunch of async threads, each of which
> > > > calls async_resume() repeatedly until everything is finished? (And
> > > > rearrange async_resume() to scan the list first and do the actual
> > > > resume second.)
> > > >
> > > > The same goes for the noirq versions.
> > >
> > > I thought about that, but there are a few things to figure out:
> > > - how many threads to start
> >
> > That's a tough question. Right now you start roughly as many threads
> > as there are async devices. That seems like overkill.
>
> In fact they are substantially fewer than that, for the following reasons.
>
> First, the async framework will not start more than MAX_THREADS threads,
> which is 256 at the moment. This number is less than the number of async
> devices to handle on an average system.

Okay, but MAX_THREADS isn't under your control. Remember also that
each thread takes up some memory, and during hibernation we are in a
memory-constrained situation.

> Second, no new async threads are started while the main thread is handling
> the sync devices, so the existing threads have a chance to do their job. If
> there's a "cluster" of sync devices in dpm_list, the number of async threads
> running is likely to drop rapidly while those devices are being handled.
> (BTW, if there were no sync devices, the whole thing would be much simpler,
> but I don't think it's realistic to assume we'll be able to get rid of them
> any time soon).

Perhaps not, but it would be interesting to see what happens if every
device is async. Maybe you can try it and get a meaningful result.

> Finally, but not least importantly, async threads are not started for the
> async devices that were previously handled "out of order" by the already
> running async threads (or by async threads that have already finished). My
> testing shows that there are quite a few of them on the average. For example,
> on the HP nx6325 typically there are as many as 580 async devices handled
> "out of order" during a _single_ suspend-resume cycle (including the "early"
> and "late" phases), while only a few (below 10) devices are waited for by at
> least one async thread.

That is a difficult sort of thing to know in advance. It ought to be
highly influenced by the percentage of async devices; that's another
reason for wanting to know what happens when every device is async.

> > I would expect that a reasonably small number of threads would suffice
> > to achieve most of the possible time savings. Something on the order
> > of 10 should work well. If the majority of the time is spent
> > handling N devices then N+1 threads would be enough. Judging from some
> > of the comments posted earlier, even 4 threads would give a big
> > advantage.
>
> That unfortunately is not the case with the set of async devices including
> PCI, ACPI and serio devices only. The average time savings are between 5%
> and 14%, depending on the system and the phase of the cycle (the relative
> savings are typically greater for suspend). Still, that amounts to 0.5 s in
> some cases.

Without context it's hard to be sure, but I don't think your numbers
contradict what I said. If you get between 5% and 14% time savings
with 14 threads, then you might get between 4% and 10% savings with
only 4 threads.

I must agree, 14 threads isn't a lot. But at the moment that number is
random, not under your control.

> > > - when to start them
> >
> > You might as well start them at the beginning of dpm_resume and
> > dpm_resume_noirq. That way they can overlap with the synchronous
> > operations.
>
> In that case they would have to wait in the beginning, so I'd need a
> mechanism to wake them up.

You already have two such mechanisms: dpm_list_mtx and the embedded
wait_queue_heads. Although in the scheme I'm proposing, no async
threads would ever have to wait on a per-device waitqueue. A
system-wide waitqueue might work out better (for use when a thread
reaches the end of the list and then waits before starting over at the
beginning).

> Alternatively, there could be a limit to the number of async threads started
> within the current design, but I'd prefer to leave that to the async framework
> (namely, if MAX_THREADS makes sense for boot, it's also likely to make sense
> for PM).

Strictly speaking, a new thread should be started only when needed.
That is, only when all the existing threads are busy running a
callback. It shouldn't be too hard to keep track of when that happens.

> > It comes down to this: Should there be many threads, each of which
> > browses the list only once, or should there be a few threads, each of
> > which browses the list many times?
>
> Well, quite obviously I prefer the many threads version. :-)

Okay, clearly it's a matter of taste. To me the many-threads version
seems less elegant and less well controlled.

Alan Stern
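The system-wide waitqueue mentioned above could look roughly like the following fragment; the waitqueue name and the more_async_work_available() predicate are hypothetical, shown only to illustrate where a worker would sleep and what would wake it.

```c
/* Illustrative fragment only; more_async_work_available() is a
 * hypothetical predicate checking for a runnable async device. */
static DECLARE_WAIT_QUEUE_HEAD(dpm_async_wq);

static void dpm_async_worker_wait(void)
{
	/* A worker that found nothing runnable on its pass sleeps here... */
	wait_event(dpm_async_wq, more_async_work_available());
}

static void dpm_async_op_complete(struct device *dev)
{
	dev->power.op_complete = true;
	/* ...and is woken whenever some device finishes, since its
	 * dependents may now be runnable. */
	wake_up_all(&dpm_async_wq);
}
```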
On Saturday 29 August 2009, Alan Stern wrote:
> On Sat, 29 Aug 2009, Rafael J. Wysocki wrote:
>
> > On Friday 28 August 2009, Alan Stern wrote:
> > > On Fri, 28 Aug 2009, Rafael J. Wysocki wrote:
> > >
> > > > > Given this design, why bother to invoke device_resume() for the async
> > > > > devices? Why not just start up a bunch of async threads, each of which
> > > > > calls async_resume() repeatedly until everything is finished? (And
> > > > > rearrange async_resume() to scan the list first and do the actual
> > > > > resume second.)
> > > > >
> > > > > The same goes for the noirq versions.
> > > >
> > > > I thought about that, but there are a few things to figure out:
> > > > - how many threads to start
> > >
> > > That's a tough question. Right now you start roughly as many threads
> > > as there are async devices. That seems like overkill.
> >
> > In fact they are substantially fewer than that, for the following reasons.
> >
> > First, the async framework will not start more than MAX_THREADS threads,
> > which is 256 at the moment. This number is less than the number of async
> > devices to handle on an average system.
>
> Okay, but MAX_THREADS isn't under your control. Remember also that
> each thread takes up some memory, and during hibernation we are in a
> memory-constrained situation.

We keep some extra free memory for things like this. It's not likely to be
exhausted by the async threads alone.

> > Second, no new async threads are started while the main thread is handling
> > the sync devices, so the existing threads have a chance to do their job. If
> > there's a "cluster" of sync devices in dpm_list, the number of async threads
> > running is likely to drop rapidly while those devices are being handled.
> > (BTW, if there were no sync devices, the whole thing would be much simpler,
> > but I don't think it's realistic to assume we'll be able to get rid of them
> > any time soon).
>
> Perhaps not, but it would be interesting to see what happens if every
> device is async. Maybe you can try it and get a meaningful result.

I could if it didn't crash. Perhaps I can with init=/bin/bash.

> > Finally, but not least importantly, async threads are not started for the
> > async devices that were previously handled "out of order" by the already
> > running async threads (or by async threads that have already finished). My
> > testing shows that there are quite a few of them on the average. For example,
> > on the HP nx6325 typically there are as many as 580 async devices handled
> > "out of order" during a _single_ suspend-resume cycle (including the "early"
> > and "late" phases), while only a few (below 10) devices are waited for by at
> > least one async thread.
>
> That is a difficult sort of thing to know in advance. It ought to be
> highly influenced by the percentage of async devices; that's another
> reason for wanting to know what happens when every device is async.

There are a few factors beyond our direct control influencing this, like for
example how much time it takes to handle each individual device (that may
vary from 0.5 s to microseconds AFAICS). In addition to this not only the
percentage, but also the distribution of sync devices in dpm_list has an
effect. For example, the situations where every second device is sync and
where the first half of the devices are async and the other half is sync
would lead to different kinds of behavior.

> > > I would expect that a reasonably small number of threads would suffice
> > > to achieve most of the possible time savings. Something on the order
> > > of 10 should work well. If the majority of the time is spent
> > > handling N devices then N+1 threads would be enough. Judging from some
> > > of the comments posted earlier, even 4 threads would give a big
> > > advantage.
> >
> > That unfortunately is not the case with the set of async devices including
> > PCI, ACPI and serio devices only. The average time savings are between 5%
> > and 14%, depending on the system and the phase of the cycle (the relative
> > savings are typically greater for suspend). Still, that amounts to 0.5 s in
> > some cases.
>
> Without context it's hard to be sure, but I don't think your numbers
> contradict what I said. If you get between 5% and 14% time savings
> with 14 threads, then you might get between 4% and 10% savings with
> only 4 threads.

I only wanted to say that the advantage is not really that "big". :-)

> I must agree, 14 threads isn't a lot. But at the moment that number is
> random, not under your control.

It's not directly controlled, but there are some interactions between the
async threads, the main threads and the async framework that don't allow this
number to grow too much.

IMO it sometimes is better to allow things to work themselves out, as long as
they don't explode, than to try to keep everything under strict control. YMMV.

> > > > - when to start them
> > >
> > > You might as well start them at the beginning of dpm_resume and
> > > dpm_resume_noirq. That way they can overlap with the synchronous
> > > operations.
> >
> > In that case they would have to wait in the beginning, so I'd need a
> > mechanism to wake them up.
>
> You already have two such mechanisms: dpm_list_mtx and the embedded
> wait_queue_heads. Although in the scheme I'm proposing, no async
> threads would ever have to wait on a per-device waitqueue. A
> system-wide waitqueue might work out better (for use when a thread
> reaches the end of the list and then waits before starting over at the
> beginning).

However, sync devices may depend on the async ones and they need the
per-device wait queues for this purpose.

> > Alternatively, there could be a limit to the number of async threads started
> > within the current design, but I'd prefer to leave that to the async framework
> > (namely, if MAX_THREADS makes sense for boot, it's also likely to make sense
> > for PM).
>
> Strictly speaking, a new thread should be started only when needed.
> That is, only when all the existing threads are busy running a
> callback. It shouldn't be too hard to keep track of when that happens.

The async framework does that for us. :-)

Thanks,
Rafael
On Saturday 29 August 2009, Rafael J. Wysocki wrote:
> On Saturday 29 August 2009, Alan Stern wrote:
...
> > Strictly speaking, a new thread should be started only when needed.
> > That is, only when all the existing threads are busy running a
> > callback. It shouldn't be too hard to keep track of when that happens.
>
> The async framework does that for us. :-)

So, when I said "async threads", I should have rather said "async functions",
because one async thread can execute multiple functions scheduled with
async_schedule().

Taking that into consideration, it all works like this. When there's a new
async device to handle and there are no async threads available, a new thread
is started and it runs the async function that handles the device and
"opportunistically" searches dpm_list (once) for other devices that are ready
for handling. After this function has returned, the thread goes to sleep and
it only exits after being idle for 1 s. In turn, when there's a new async
device to handle and there are async threads available, one of them gets the
async function to run.

I don't really think we can do much better than this in any other approach.

Thanks,
Rafael
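The scheduling behaviour described above maps onto the device_resume() path in the patch at the end of this thread; a condensed excerpt of that path (tracing and error details trimmed) is:

```c
/*
 * Condensed excerpt of device_resume() from the patch below:
 * async-capable devices are handed to the async framework, which reuses
 * an idle worker thread if one exists and spawns a new one otherwise.
 */
static int device_resume(struct device *dev)
{
	if (pm_op_started(dev))
		return 0;	/* already handled "out of order" by an async thread */

	if (dev->power.async_suspend && !pm_trace_enabled) {
		get_device(dev);
		async_schedule(async_resume, dev);	/* may run in a reused worker */
		return 0;
	}

	/* Sync devices are handled in dpm_list order by the main thread. */
	device_pm_wait_for_masters(dev);
	return __device_resume(dev, pm_transition);
}
```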
On Sat, 29 Aug 2009, Rafael J. Wysocki wrote:

> I only wanted to say that the advantage is not really that "big". :-)
>
> > I must agree, 14 threads isn't a lot. But at the moment that number is
> > random, not under your control.
>
> It's not directly controlled, but there are some interactions between the
> async threads, the main threads and the async framework that don't allow this
> number to grow too much.
>
> IMO it sometimes is better to allow things to work themselves out, as long as
> they don't explode, than to try to keep everything under strict control. YMMV.

For testing purposes it would be nice to have a one-line summary for
each device containing a thread ID, start timestamp, end timestamp, and
elapsed time. With that information you could evaluate the amount of
parallelism and determine where the bottlenecks are. It would give a
much more detailed picture of the entire process than the total time of
your recent patch 9.

Alan Stern
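One way the suggested per-device summary could be collected is sketched below; the placement inside __device_resume() and the message format are illustrative only and are not part of the posted patches.

```c
/*
 * Illustrative only: emit the one-line per-device summary suggested
 * above.  Assumes <linux/ktime.h> and <linux/sched.h>; the surrounding
 * function is the patch's __device_resume(), with the actual callback
 * invocation elided.
 */
static int __device_resume(struct device *dev, pm_message_t state)
{
	ktime_t start = ktime_get();
	ktime_t end;
	int error;

	error = 0;	/* ...run the resume callbacks here as before... */

	end = ktime_get();
	dev_info(dev, "PM: pid %d start %lld ns end %lld ns elapsed %lld ns\n",
		 task_pid_nr(current),
		 (long long)ktime_to_ns(start),
		 (long long)ktime_to_ns(end),
		 (long long)ktime_to_ns(ktime_sub(end, start)));

	return error;
}
```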
On Sat, 29 Aug 2009, Rafael J. Wysocki wrote:

> On Saturday 29 August 2009, Rafael J. Wysocki wrote:
> > On Saturday 29 August 2009, Alan Stern wrote:
> ...
> > > Strictly speaking, a new thread should be started only when needed.
> > > That is, only when all the existing threads are busy running a
> > > callback. It shouldn't be too hard to keep track of when that happens.
> >
> > The async framework does that for us. :-)
>
> So, when I said "async threads", I should have rather said "async functions",
> because one async thread can execute multiple functions scheduled with
> async_schedule().
>
> Taking that into consideration, it all works like this. When there's a new
> async device to handle and there are no async threads available, a new thread
> is started and it runs the async function that handles the device and
> "opportunistically" searches dpm_list (once) for other devices that are ready
> for handling. After this function has returned, the thread goes to sleep and
> it only exits after being idle for 1 s. In turn, when there's a new async
> device to handle and there are async threads available, one of them gets the
> async function to run.
>
> I don't really think we can do much better than this in any other approach.

Since it obviously works and is relatively simple, I guess we ought to
adopt it.

Alan Stern
Hi!

> > > > > > The same goes for the noirq versions.
> > > > >
> > > > > I thought about that, but there are a few things to figure out:
> > > > > - how many threads to start
> > > > That's a tough question. Right now you start roughly as many threads
> > > > as there are async devices. That seems like overkill.
> > > In fact they are substantially fewer than that, for the following reasons.
> > >
> > > First, the async framework will not start more than MAX_THREADS threads,
> > > which is 256 at the moment. This number is less than the number of async
> > > devices to handle on an average system.
> >
> > Okay, but MAX_THREADS isn't under your control. Remember also that
> > each thread takes up some memory, and during hibernation we are in a
> > memory-constrained situation.
>
> We keep some extra free memory for things like this. It's not likely to be
> exhausted by the async threads alone.

What extra memory? You are creating quite a lot of threads. For 256 of
them, it would take circa 2 MB...

You recently removed code from s2disk that freed 4 MB of extra memory, so
you are now essentially relying on min_free_kbytes -- and even that is not
reliable; the user might s2disk under memory pressure.

min_free_kbytes is 1 MB on my system.

Pavel
On Sunday 30 August 2009, Pavel Machek wrote:
> Hi!
>
> > > > > > > The same goes for the noirq versions.
> > > > > >
> > > > > > I thought about that, but there are a few things to figure out:
> > > > > > - how many threads to start
> > > > > That's a tough question. Right now you start roughly as many threads
> > > > > as there are async devices. That seems like overkill.
> > > > In fact they are substantially fewer than that, for the following reasons.
> > > >
> > > > First, the async framework will not start more than MAX_THREADS threads,
> > > > which is 256 at the moment. This number is less than the number of async
> > > > devices to handle on an average system.
> > >
> > > Okay, but MAX_THREADS isn't under your control. Remember also that
> > > each thread takes up some memory, and during hibernation we are in a
> > > memory-constrained situation.
> >
> > We keep some extra free memory for things like this. It's not likely to be
> > exhausted by the async threads alone.
>
> What extra memory? You are creating quite a lot of threads. For 256 of
> them, it would take circa 2 MB...

We never start that many threads and even if there's not enough memory to
start a new thread, the async framework will handle that for us.

> You recently removed code from s2disk that freed 4 MB of extra memory,

That was removed from s2ram. For STD we still have PAGES_FOR_IO and
SPARE_PAGES, nothing's changed there.

Thanks,
Rafael
On Sunday 30 August 2009, Alan Stern wrote:
> On Sat, 29 Aug 2009, Rafael J. Wysocki wrote:
>
> > I only wanted to say that the advantage is not really that "big". :-)
> >
> > > I must agree, 14 threads isn't a lot. But at the moment that number is
> > > random, not under your control.
> >
> > It's not directly controlled, but there are some interactions between the
> > async threads, the main threads and the async framework that don't allow this
> > number to grow too much.
> >
> > IMO it sometimes is better to allow things to work themselves out, as long as
> > they don't explode, than to try to keep everything under strict control. YMMV.
>
> For testing purposes it would be nice to have a one-line summary for
> each device containing a thread ID, start timestamp, end timestamp, and
> elapsed time. With that information you could evaluate the amount of
> parallelism and determine where the bottlenecks are. It would give a
> much more detailed picture of the entire process than the total time of
> your recent patch 9.

Of course it would. I think I'll implement it.

The purpose of patch 9 is basically to allow one to see how much time is
spent on the handling of devices overall and to compare that with the time
spent on the other operations during suspend-resume.

Thanks,
Rafael
Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/completion.h> /* * Callbacks for platform drivers to implement. @@ -409,20 +410,23 @@ enum rpm_request { struct dev_pm_info { spinlock_t lock; + wait_queue_head_t wait_queue; struct list_head master_links; struct list_head slave_links; pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned int async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + unsigned int op_started:1; + unsigned int op_complete:1; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; unsigned long timer_expires; struct work_struct work; - wait_queue_head_t wait_queue; atomic_t usage_count; atomic_t child_count; unsigned int disable_depth:3; Index: linux-2.6/include/linux/device.h =================================================================== --- linux-2.6.orig/include/linux/device.h +++ linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -26,6 +26,8 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> +#include <linux/completion.h> #include "../base.h" #include "power.h" @@ -43,6 +45,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once the preparation of devices for a PM transition has started, reset @@ -145,6 +148,131 @@ void device_pm_move_last(struct device * } /** + * dpm_reset - Clear power.op_started and power.op_complete for given device. + * @dev: Device to handle. + */ +static void dpm_reset(struct device *dev) +{ + dev->power.op_started = false; + dev->power.op_complete = false; +} + +/** + * dpm_reset_all - Call dpm_reset() for all devices. + */ +static void dpm_reset_all(void) +{ + struct device *dev; + + list_for_each_entry(dev, &dpm_list, power.entry) + dpm_reset(dev); +} + +/** + * dpm_synchronize_noirq - Wait for "late" or "early" PM callbacks to complete. + * + * Wait for the "late" or "early" suspend/resume callbacks of all devices to + * complete and clear power.op_started and power.op_complete for all devices. + */ +static void dpm_synchronize_noirq(void) +{ + async_synchronize_full(); + dpm_reset_all(); +} + +/** + * dpm_synchronize_noirq - Wait for PM callbacks to complete. + * + * Wait for the "regular" suspend/resume callbacks of all devices to complete + * and clear power.op_started and power.op_complete for all devices. + */ +static void dpm_synchronize(void) +{ + async_synchronize_full(); + mutex_lock(&dpm_list_mtx); + dpm_reset_all(); + mutex_unlock(&dpm_list_mtx); +} + +/** + * device_pm_wait - Wait for a PM operation to complete. + * @sub: "Slave" device. + * @dev: Device to wait for. 
+ * + * Wait for a PM operation carried out for @dev to complete, unless both @sub + * and @dev have to be handled synchronously (in such a case they are going to + * be handled in the right order anyway thanks to the pm_list ordering). + */ +static void device_pm_wait(struct device *sub, struct device *dev) +{ + if (!dev) + return; + + if (!(sub->power.async_suspend || dev->power.async_suspend)) + return; + + if (!dev->power.op_complete) { + dev_dbg(sub, "PM: Waiting for %s %s\n", dev_driver_string(dev), + dev_name(dev)); + wait_event(dev->power.wait_queue, !!dev->power.op_complete); + } +} + +/** + * device_pm_wait_fn - Wrapper for device_pm_wait(). + * @dev: Device to wait for. + * @data: Pointer to the "slave" device object. + */ +static int device_pm_wait_fn(struct device *dev, void *data) +{ + device_pm_wait((struct device *)data, dev); + return 0; +} + +/** + * device_pm_wait_for_masters - Wait for all masters of given device. + * @slave: Device to wait for the masters of. + */ +static void device_pm_wait_for_masters(struct device *slave) +{ + if (!pm_trace_enabled) + device_for_each_master(slave, slave, device_pm_wait_fn); +} + +/** + * device_pm_check - Check the power.op_complete flag of given device. + * @dev: Device to check. + */ +static bool device_pm_check(struct device *dev) +{ + int ret = 0; + + if (dev) + ret = !dev->power.op_complete; + + return ret; +} + +/** + * device_pm_check_fn - Wrapper for device_pm_check(). + * @dev: Device to check. + * @data: Ignored. + */ +static int device_pm_check_fn(struct device *dev, void *data) +{ + return device_pm_check(dev); +} + +/** + * device_pm_check_masters - Check power.op_complete for masters of a device. + * @slave: Device to check the masters of. + */ +static int device_pm_check_masters(struct device *slave) +{ + return device_for_each_master(slave, NULL, device_pm_check_fn); +} + +/** * pm_op - Execute the PM operation appropriate for given PM event. * @dev: Device to handle. * @ops: PM operations to choose from. @@ -269,6 +397,24 @@ static int pm_noirq_op(struct device *de return error; } +/** + * pm_op_started - Mark the beginning of a PM operation for given device. + * @dev: Device to handle. + */ +static bool pm_op_started(struct device *dev) +{ + bool ret = false; + + spin_lock_irq(&dev->power.lock); + if (dev->power.op_started) + ret = true; + else + dev->power.op_started = true; + spin_unlock_irq(&dev->power.lock); + + return ret; +} + static char *pm_verb(int event) { switch (event) { @@ -310,33 +456,102 @@ static void pm_dev_err(struct device *de /*------------------------- Resume routines -------------------------*/ /** - * device_resume_noirq - Execute an "early resume" callback for given device. + * __device_resume_noirq - Execute an "early resume" callback for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. * * The driver of @dev will not receive interrupts while this function is being * executed. */ -static int device_resume_noirq(struct device *dev, pm_message_t state) +static int __device_resume_noirq(struct device *dev, pm_message_t state) { int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); - if (!dev->bus) - goto End; - - if (dev->bus->pm) { + if (dev->bus && dev->bus->pm) { pm_dev_dbg(dev, state, "EARLY "); error = pm_noirq_op(dev, dev->bus->pm, state); } - End: + + dev->power.op_complete = true; + wake_up_all(&dev->power.wait_queue); + TRACE_RESUME(error); return error; } /** + * async_device_resume_noirq - Wrapper of __device_resume_noirq(). 
+ * @dev: Device to resume. + */ +static void async_device_resume_noirq(struct device *dev) +{ + int error; + + pm_dev_dbg(dev, pm_transition, "async EARLY "); + error = __device_resume_noirq(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async EARLY", error); +} + +/** + * async_resume_noirq - Execute "early" resume callbacks asynchronously. + * @data: Pointer to the first device to resume. + * @cookie: Ignored. + * + * The execution of this function is scheduled with async_schedule(), so it runs + * in its own kernel thread. It first calls the "early" resume callback for the + * device passed to it as @data. Next, it walks dpm_list looking for devices + * that can be resumed without waiting for their "masters". If such a device is + * found, its "early" resume callback is run. + */ +static void async_resume_noirq(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + + device_pm_wait_for_masters(dev); + async_device_resume_noirq(dev); + + list_for_each_entry_continue(dev, &dpm_list, power.entry) { + if (!dev->power.async_suspend || dev->power.status <= DPM_OFF) + continue; + + if (device_pm_check_masters(dev)) + continue; + + if (pm_op_started(dev)) + continue; + + pm_dev_dbg(dev, pm_transition, "out of order EARLY "); + async_device_resume_noirq(dev); + } +} + +/** + * device_resume_noirq - Execute or schedule "early" resume callback. + * @dev: Device to resume. + * + * If @dev can be resumed asynchronously, schedule the execution of + * async_resume_noirq() for it. Otherwise, execute its "early" resume callback + * directly. + */ +static int device_resume_noirq(struct device *dev) +{ + if (pm_op_started(dev)) + return 0; + + if (dev->power.async_suspend && !pm_trace_enabled) { + async_schedule(async_resume_noirq, dev); + return 0; + } + + device_pm_wait_for_masters(dev); + return __device_resume_noirq(dev, pm_transition); +} + +/** * dpm_resume_noirq - Execute "early resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. * @@ -349,26 +564,28 @@ void dpm_resume_noirq(pm_message_t state mutex_lock(&dpm_list_mtx); transition_started = false; + pm_transition = state; list_for_each_entry(dev, &dpm_list, power.entry) if (dev->power.status > DPM_OFF) { int error; dev->power.status = DPM_OFF; - error = device_resume_noirq(dev, state); + error = device_resume_noirq(dev); if (error) - pm_dev_err(dev, state, " early", error); + pm_dev_err(dev, state, " EARLY", error); } + dpm_synchronize_noirq(); mutex_unlock(&dpm_list_mtx); resume_device_irqs(); } EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. */ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state) { int error = 0; @@ -409,12 +626,92 @@ static int device_resume(struct device * } End: up(&dev->sem); + dev->power.op_complete = true; + wake_up_all(&dev->power.wait_queue); TRACE_RESUME(error); return error; } /** + * async_device_resume - Wrapper of __device_resume(). + * @dev: Device to resume. 
+ */ +static void async_device_resume(struct device *dev) +{ + int error; + + pm_dev_dbg(dev, pm_transition, "async "); + error = __device_resume(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async", error); +} + +/** + * async_resume - Execute resume callbacks asynchronously. + * @data: Pointer to the first device to resume. + * @cookie: Ignored. + * + * The execution of this function is scheduled with async_schedule(), so it runs + * in its own kernel thread. It first calls the resume callbacks for the device + * passed to it as @data. Next, it walks dpm_list looking for devices that can + * be resumed without waiting for their "masters". If such a device is found, + * its resume callbacks are run. + */ +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + + device_pm_wait_for_masters(dev); + + repeat: + async_device_resume(dev); + put_device(dev); + + mutex_lock(&dpm_list_mtx); + if (dev->power.status < DPM_OFF) + dev = to_device(dpm_list.next); + list_for_each_entry_continue(dev, &dpm_list, power.entry) { + if (!dev->power.async_suspend || dev->power.status < DPM_OFF) + continue; + + if (device_pm_check_masters(dev)) + continue; + + if (pm_op_started(dev)) + continue; + + get_device(dev); + mutex_unlock(&dpm_list_mtx); + pm_dev_dbg(dev, pm_transition, "out of order "); + goto repeat; + } + mutex_unlock(&dpm_list_mtx); +} + +/** + * device_resume - Execute or schedule resume callbacks for given device. + * @dev: Device to resume. + * + * If @dev can be resumed asynchronously, schedule the execution of + * async_resume() for it. Otherwise, execute its resume callbacks directly. + */ +static int device_resume(struct device *dev) +{ + if (pm_op_started(dev)) + return 0; + + if (dev->power.async_suspend && !pm_trace_enabled) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + device_pm_wait_for_masters(dev); + return __device_resume(dev, pm_transition); +} + +/** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. 
* @@ -427,6 +724,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -437,7 +735,7 @@ static void dpm_resume(pm_message_t stat dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -452,6 +750,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + dpm_synchronize(); } /** @@ -775,8 +1074,10 @@ static int dpm_prepare(pm_message_t stat break; } dev->power.status = DPM_SUSPENDING; - if (!list_empty(&dev->power.entry)) + if (!list_empty(&dev->power.entry)) { list_move_tail(&dev->power.entry, &list); + dpm_reset(dev); + } put_device(dev); } list_splice(&list, &dpm_list); Index: linux-2.6/drivers/base/power/common.c =================================================================== --- linux-2.6.orig/drivers/base/power/common.c +++ linux-2.6/drivers/base/power/common.c @@ -19,10 +19,11 @@ */ void device_pm_init(struct device *dev) { - dev->power.status = DPM_ON; spin_lock_init(&dev->power.lock); + init_waitqueue_head(&dev->power.wait_queue); INIT_LIST_HEAD(&dev->power.master_links); INIT_LIST_HEAD(&dev->power.slave_links); + dev->power.status = DPM_ON; pm_runtime_init(dev); } @@ -117,6 +118,8 @@ int pm_link_add(struct device *slave, st return 0; err_link: + master->power.async_suspend = false; + slave->power.async_suspend = false; error = -ENOMEM; put_device(slave);