diff mbox

[2/8] introduce the device async action mechanism

Message ID 1247643516.26272.77.camel@rzhang-dt (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Zhang, Rui July 15, 2009, 7:38 a.m. UTC
Introduce the device async action mechanism.

In order to speed up Linux suspend/resume/shutdown process,
we introduce the device async action mechanism that allow devices
to suspend/resume/shutdown asynchronously.

The basic idea is that,
if the suspend/resume/shutdown process of a device set,
including a root device and its child devices, are independent of
other devices, we create an async domain for this device set,
and make them suspend/resume/shutdown asynchronously.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/base/Makefile     |    3 
 drivers/base/async_dev.c  |  180 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/async_dev.h |   40 ++++++++++
 include/linux/device.h    |    2 
 4 files changed, 224 insertions(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Pavel Machek July 15, 2009, 12:31 a.m. UTC | #1
On Wed 2009-07-15 15:38:36, Zhang Rui wrote:
> Introduce the device async action mechanism.
> 
> In order to speed up Linux suspend/resume/shutdown process,
> we introduce the device async action mechanism that allow devices
> to suspend/resume/shutdown asynchronously.
> 
> The basic idea is that,
> if the suspend/resume/shutdown process of a device set,
> including a root device and its child devices, are independent of
> other devices, we create an async domain for this device set,
> and make them suspend/resume/shutdown asynchronously.
> 
> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> ---
>  drivers/base/Makefile     |    3 
>  drivers/base/async_dev.c  |  180 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/async_dev.h |   40 ++++++++++
>  include/linux/device.h    |    2 
>  4 files changed, 224 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6/include/linux/async_dev.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6/include/linux/async_dev.h
> @@ -0,0 +1,40 @@
> +/*
> + * async_dev.h: function calls for device async actions
> + *
> + * (C) Copyright 2009 Intel Corporation
> + * Author: Zhang Rui <rui.zhang@intel.com>
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; version 2
> + * of the License.
> + */
> +
> +#ifndef _ASYNC_DEV_H_
> +#define _ASYNC_DEV_H_
> +
> +#include <linux/types.h>
> +#include <linux/list.h>
> +#include <linux/async.h>
> +#include <linux/device.h>
> +#include <linux/pm.h>
> +
> +struct dev_async_struct {
> +	struct device *dev;
> +	int type;
> +	/* Synchronization Domain for device async actions */
> +	struct list_head domain;
> +	struct list_head node;
> +	async_cookie_t cookie;
> +};
> +
> +#define DEV_ASYNC_ACTIONS_ALL	0
> +
> +extern int dev_async_schedule(struct device *, void *,
> +			void *, int);
> +extern void dev_async_synchronization(void);

'_synchronize' to be consistent with schedule?
Arjan van de Ven July 15, 2009, 1 p.m. UTC | #2
On Wed, 15 Jul 2009 15:38:36 +0800
Zhang Rui <rui.zhang@intel.com> wrote:

> Introduce the device async action mechanism.
> 
> In order to speed up Linux suspend/resume/shutdown process,
> we introduce the device async action mechanism that allow devices
> to suspend/resume/shutdown asynchronously.
> 
> The basic idea is that,
> if the suspend/resume/shutdown process of a device set,
> including a root device and its child devices, are independent of
> other devices, we create an async domain for this device set,
> and make them suspend/resume/shutdown asynchronously.

Hi,

I have some concerns about having an async domain per device(group)
rather than having one async domain for all of this, I would
strongly suggest going to one global domain, so that it becomes
more practical to just wait for all async actions like this...
which you need to do during suspend for sure.
Zhang, Rui July 16, 2009, 2:05 a.m. UTC | #3
On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> On Wed, 15 Jul 2009 15:38:36 +0800
> Zhang Rui <rui.zhang@intel.com> wrote:
> 
> > Introduce the device async action mechanism.
> > 
> > In order to speed up Linux suspend/resume/shutdown process,
> > we introduce the device async action mechanism that allow devices
> > to suspend/resume/shutdown asynchronously.
> > 
> > The basic idea is that,
> > if the suspend/resume/shutdown process of a device set,
> > including a root device and its child devices, are independent of
> > other devices, we create an async domain for this device set,
> > and make them suspend/resume/shutdown asynchronously.
> 
> Hi,
> 
> I have some concerns about having an async domain per device(group)
> rather than having one async domain for all of this, 

we create an async domain ONLY if we are sure that the device group is
independent with the other devices.

and IMO, using multiple async domains brings real device async actions.
For example, in S3 resume case, there are two device groups:
device group1: device1, device2, device3
device group2: device4, device5, device6

If they share the global domain, we may get:
device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)

this is not real asynchronous resume because
device3 needs to call async_synchronize_cookie(5) before resume itself.
which means that device4 and device5 must be resumed before device3.

But if multiple async domain is used, we get:
device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)

device group1 and group2 can be resumed asynchronously.


Another example, in my previous test,
1. sata controller. takes 0.4s to resume.
2. usb, including uchi and ehci controller takes 1.4s to resume
3. ACPI battery takes 0.3s to resume.
3. all the other devices take 0.2s to resume.

sata, usb and ACPI battery are independent device groups.
If we use multiple async domains, we can reduce the total device resume
time from 2.3s to a little more than 1.4s because there are a lot of
sleep in usb resume process.
But if we share the global async domain, the total resume time can only
be reduced to about 2.1s because sata, usb and ACPI battery are actually
resumed synchronously.


> I would
> strongly suggest going to one global domain, so that it becomes
> more practical to just wait for all async actions like this...
> which you need to do during suspend for sure.
> 
We can use multiple async domains, even in the suspend case, on the
premise of the device group being independent of ANY other devices.
For example, this is what I get after patch 8/8 is applied.
BTW: actions 1 stands for suspend here.
[  498.891337] device event8 actions 1 started
[  498.891339] device event8 actions 1 done
[  498.891344] device mouse3 actions 1 started
[  498.891345] device mouse3 actions 1 done
[  498.891349] device input17 actions 1 started
[  498.891350] device input17 actions 1 done
[  498.891354] device event7 actions 1 started
[  498.891355] device event7 actions 1 done
[  498.891359] device mouse2 actions 1 started
[  498.891360] device mouse2 actions 1 done
[  498.891364] device input16 actions 1 started
[  498.891365] device input16 actions 1 done
[  498.992052] device serio4 actions 1 started
[  498.992054] device serio4 actions 1 done
[  498.992058] device serio3 actions 1 started
[  498.992060] device serio3 actions 1 done
[  498.992064] device serio2 actions 1 started
[  499.411877] device serio2 actions 1 done
[  499.454733] device serio1 actions 1 started
[  499.454734] device serio1 actions 1 done
[  499.454739] device event1 actions 1 started
[  499.454740] device event1 actions 1 done
[  499.454744] device input1 actions 1 started
[  499.454746] device input1 actions 1 done
[  499.454750] device serio0 actions 1 started
[  499.456052] device i8042 actions 1 started
[  499.556054] device serio0 actions 1 done
[  499.556748] device i8042 actions 1 done

thanks,
rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki July 17, 2009, 1:11 a.m. UTC | #4
On Thursday 16 July 2009, Zhang Rui wrote:
> On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> > On Wed, 15 Jul 2009 15:38:36 +0800
> > Zhang Rui <rui.zhang@intel.com> wrote:
> > 
> > > Introduce the device async action mechanism.
> > > 
> > > In order to speed up Linux suspend/resume/shutdown process,
> > > we introduce the device async action mechanism that allow devices
> > > to suspend/resume/shutdown asynchronously.
> > > 
> > > The basic idea is that,
> > > if the suspend/resume/shutdown process of a device set,
> > > including a root device and its child devices, are independent of
> > > other devices, we create an async domain for this device set,
> > > and make them suspend/resume/shutdown asynchronously.
> > 
> > Hi,
> > 
> > I have some concerns about having an async domain per device(group)
> > rather than having one async domain for all of this, 
> 
> we create an async domain ONLY if we are sure that the device group is
> independent with the other devices.
> 
> and IMO, using multiple async domains brings real device async actions.
> For example, in S3 resume case, there are two device groups:
> device group1: device1, device2, device3
> device group2: device4, device5, device6
> 
> If they share the global domain, we may get:
> device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
> device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)
> 
> this is not real asynchronous resume because
> device3 needs to call async_synchronize_cookie(5) before resume itself.
> which means that device4 and device5 must be resumed before device3.
> 
> But if multiple async domain is used, we get:
> device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
> device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)
> 
> device group1 and group2 can be resumed asynchronously.
> 
> 
> Another example, in my previous test,
> 1. sata controller. takes 0.4s to resume.
> 2. usb, including uchi and ehci controller takes 1.4s to resume
> 3. ACPI battery takes 0.3s to resume.
> 3. all the other devices take 0.2s to resume.
> 
> sata, usb and ACPI battery are independent device groups.
> If we use multiple async domains, we can reduce the total device resume
> time from 2.3s to a little more than 1.4s because there are a lot of
> sleep in usb resume process.
> But if we share the global async domain, the total resume time can only
> be reduced to about 2.1s because sata, usb and ACPI battery are actually
> resumed synchronously.

Well, first, I'm not really sure that using the async _boot_ infrastructure for
that is a good choice.  During suspend-resume we know dependencies between
devices beforehand, at least in theory, so we can use them.

In particular, we have to make sure that parent devices will not be suspended
until all of their children have been suspended and children devices will not
be resumed before the parents.  The current code handles this quite
efficiently, so I wonder what you're going to do not to break it.

Second, you seem to think that it only makes sense to execute ->suspend()
and ->resume() asynchronously (or in parallel), while for example in the case
of PCI ->suspend_noirq() and ->resume_noirq() also contain code that
potentially can take quite some time to execute.

Finally, I don't really understand what the code in the $subject patch is
supposed to do.  In particular, what's the purpose of dev_action()?
It only seems to check if func is not NULL right now.  Also, you define
DEV_ASYNC_ACTIONS_ALL as 0, so the condition
if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always
satisfied.  There are more things like that in this patch, not to mention
excessive return statements and passing function pointers as (void *).

Can we please discuss this thoroughly before any new patches are sent?

Best,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhang, Rui July 17, 2009, 2:44 a.m. UTC | #5
On Fri, 2009-07-17 at 09:11 +0800, Rafael J. Wysocki wrote:
> On Thursday 16 July 2009, Zhang Rui wrote:
> > On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> > > On Wed, 15 Jul 2009 15:38:36 +0800
> > > Zhang Rui <rui.zhang@intel.com> wrote:
> > > 
> > > > Introduce the device async action mechanism.
> > > > 
> > > > In order to speed up Linux suspend/resume/shutdown process,
> > > > we introduce the device async action mechanism that allow devices
> > > > to suspend/resume/shutdown asynchronously.
> > > > 
> > > > The basic idea is that,
> > > > if the suspend/resume/shutdown process of a device set,
> > > > including a root device and its child devices, are independent of
> > > > other devices, we create an async domain for this device set,
> > > > and make them suspend/resume/shutdown asynchronously.
> > > 
> > > Hi,
> > > 
> > > I have some concerns about having an async domain per device(group)
> > > rather than having one async domain for all of this, 
> > 
> > we create an async domain ONLY if we are sure that the device group is
> > independent with the other devices.
> > 
> > and IMO, using multiple async domains brings real device async actions.
> > For example, in S3 resume case, there are two device groups:
> > device group1: device1, device2, device3
> > device group2: device4, device5, device6
> > 
> > If they share the global domain, we may get:
> > device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
> > device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)
> > 
> > this is not real asynchronous resume because
> > device3 needs to call async_synchronize_cookie(5) before resume itself.
> > which means that device4 and device5 must be resumed before device3.
> > 
> > But if multiple async domain is used, we get:
> > device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
> > device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)
> > 
> > device group1 and group2 can be resumed asynchronously.
> > 
> > 
> > Another example, in my previous test,
> > 1. sata controller. takes 0.4s to resume.
> > 2. usb, including uchi and ehci controller takes 1.4s to resume
> > 3. ACPI battery takes 0.3s to resume.
> > 3. all the other devices take 0.2s to resume.
> > 
> > sata, usb and ACPI battery are independent device groups.
> > If we use multiple async domains, we can reduce the total device resume
> > time from 2.3s to a little more than 1.4s because there are a lot of
> > sleep in usb resume process.
> > But if we share the global async domain, the total resume time can only
> > be reduced to about 2.1s because sata, usb and ACPI battery are actually
> > resumed synchronously.
> 
> Well, first, I'm not really sure that using the async _boot_ infrastructure for
> that is a good choice.

IMO, kernel/async.c provides good interfaces that can be used not only
in boot phrase.
it's generic enough for suspend/resume.

>  During suspend-resume we know dependencies between
> devices beforehand, at least in theory, so we can use them.
> 
that's why I use multiple async domains. :)
One domain for a device group.

> In particular, we have to make sure that parent devices will not be suspended
> until all of their children have been suspended and children devices will not
> be resumed before the parents.

that's not enough.
For examples,
ACPI battery and EC are independent devices, but EC must be resumed
before battery because battery driver may access some EC address space
during resume time.
Of course the problem I describe above doesn't exist because ACPI
battery driver doesn't have .resume method right now.
But this is the case that works well in the current code but can not be
converted to async device resume.

> The current code handles this quite
> efficiently, so I wonder what you're going to do not to break it.
> 
sorry I don't quite understand.

> Second, you seem to think that it only makes sense to execute ->suspend()
> and ->resume() asynchronously (or in parallel), while for example in the case
> of PCI ->suspend_noirq() and ->resume_noirq() also contain code that
> potentially can take quite some time to execute.
> 
IMO, this patch set just provides a framework that can be used for
suspend/resume/shutdown optimization, and it doesn't solve all the
problem at one time.

IMO, the next step we should do is:
1. analyze the suspend/resume/shutdown process and find out the hotspot,
   i.e. which drivers suspend/resume slowly
2. If it's software problem that we can fix in the driver, fix it.
   like commit 24920c8a6358bf5532f1336b990b1c0fe2b599ee
3. If the hardware is slow, try to do it asynchronously.
   like I did in patch 8/8.

This framework makes it quite easy to register an async device group.

And even the suspend_noirq and resume_noirq can be covered easily with
this framework.
For example,
1. introduce two new device async actions.
   DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
   just like what I did in patch 4/8, 5/8 and 6/8.
2. find out which device is slow in ->suspend_noirq and ->resume_noirq
3. see if we can find an async device group that including this device.
4. if yes, register this new async device group,
   with the type DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ

> Finally, I don't really understand what the code in the $subject patch is
> supposed to do.  In particular, what's the purpose of dev_action()?
> It only seems to check if func is not NULL right now.  Also, you define
> DEV_ASYNC_ACTIONS_ALL as 0, so the condition
> if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always
> satisfied.

please refer to patch 4/8 and 5/8 and 6/8

Patch 2/8 is just a framework. No device async action is support yet.
And in Patch 4, 5, 6, we introduced three different types of device
async actions, DEV_ASYNC_SUSPEND, DEV_ASYNC_RESUME and
DEV_ASYNC_SHUTDOWN.
I tried to split a big patch into several small patches.
And it suggests how to adding a new device async type, like
DEV_ASYNC_SUSPEND_NOIRQ, DEV_ASYNC_RESUME_NO_IRQ, etc. :)

> Can we please discuss this thoroughly before any new patches are sent?
> 
do I still need to resend the patch?
If yes, I'll resend patch 2/8, 3/8, 4/8, 5/8, 6/8 as a new big one. :)

thanks,
rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arjan van de Ven July 17, 2009, 4:31 a.m. UTC | #6
On Fri, 17 Jul 2009 03:11:59 +0200
"Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> 
> Well, first, I'm not really sure that using the async _boot_
> infrastructure for that is a good choice. 

the async function call mechanism is for more than boot.
Pavel Machek July 19, 2009, 10:16 p.m. UTC | #7
Hi!

> >  During suspend-resume we know dependencies between
> > devices beforehand, at least in theory, so we can use them.
> > 
> that's why I use multiple async domains. :)
> One domain for a device group.
> 
> > In particular, we have to make sure that parent devices will not be suspended
> > until all of their children have been suspended and children devices will not
> > be resumed before the parents.
> 
> that's not enough.
> ???For examples,
> ACPI battery and EC are independent devices, but EC must be resumed
> before battery because battery driver may access some EC address space
> during resume time.

Yes, but those dependencies should be pulled from driver tree, not
adding separate dependencies infrastructure.

								Pavel
Zhang, Rui July 20, 2009, 3:09 a.m. UTC | #8
On Mon, 2009-07-20 at 06:16 +0800, Pavel Machek wrote:
> Hi!
> 
> > 
> > that's not enough.
> > ???For examples,
> > ACPI battery and EC are independent devices, but EC must be resumed
> > before battery because battery driver may access some EC address space
> > during resume time.
> 
> Yes, but those dependencies should be pulled from driver tree, not
> adding separate dependencies infrastructure.
> 
Sorry, the problem I described above is wrong.
We just disable the EC interrupt mode in S3, but it still works.
Battery can access EC address space before EC resumed, in low speed.


> 
> > > In particular, we have to make sure that parent devices will not be suspended
> > > until all of their children have been suspended and children devices will not
> > > be resumed before the parents.
> > 
this is right.

thanks,
rui

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki July 20, 2009, 4:24 p.m. UTC | #9
On Friday 17 July 2009, Zhang Rui wrote:
> On Fri, 2009-07-17 at 09:11 +0800, Rafael J. Wysocki wrote:
> > On Thursday 16 July 2009, Zhang Rui wrote:
> > > On Wed, 2009-07-15 at 21:00 +0800, Arjan van de Ven wrote:
> > > > On Wed, 15 Jul 2009 15:38:36 +0800
> > > > Zhang Rui <rui.zhang@intel.com> wrote:
> > > > 
> > > > > Introduce the device async action mechanism.
> > > > > 
> > > > > In order to speed up Linux suspend/resume/shutdown process,
> > > > > we introduce the device async action mechanism that allow devices
> > > > > to suspend/resume/shutdown asynchronously.
> > > > > 
> > > > > The basic idea is that,
> > > > > if the suspend/resume/shutdown process of a device set,
> > > > > including a root device and its child devices, are independent of
> > > > > other devices, we create an async domain for this device set,
> > > > > and make them suspend/resume/shutdown asynchronously.
> > > > 
> > > > Hi,
> > > > 
> > > > I have some concerns about having an async domain per device(group)
> > > > rather than having one async domain for all of this, 
> > > 
> > > we create an async domain ONLY if we are sure that the device group is
> > > independent with the other devices.
> > > 
> > > and IMO, using multiple async domains brings real device async actions.
> > > For example, in S3 resume case, there are two device groups:
> > > device group1: device1, device2, device3
> > > device group2: device4, device5, device6
> > > 
> > > If they share the global domain, we may get:
> > > device group1: device1(cookie 1), device2(cookie 4), device3(cookie 5)
> > > device group2: device4(cookie 2), device5(cookie 3), device6(cookie 6)
> > > 
> > > this is not real asynchronous resume because
> > > device3 needs to call async_synchronize_cookie(5) before resume itself.
> > > which means that device4 and device5 must be resumed before device3.
> > > 
> > > But if multiple async domain is used, we get:
> > > device group1: device1(cookie 1), device2(cookie 2), device3(cookie 3)
> > > device group2: device4(cookie 1), device5(cookie 2), device6(cookie 3)
> > > 
> > > device group1 and group2 can be resumed asynchronously.
> > > 
> > > 
> > > Another example, in my previous test,
> > > 1. sata controller. takes 0.4s to resume.
> > > 2. usb, including uchi and ehci controller takes 1.4s to resume
> > > 3. ACPI battery takes 0.3s to resume.
> > > 3. all the other devices take 0.2s to resume.
> > > 
> > > sata, usb and ACPI battery are independent device groups.
> > > If we use multiple async domains, we can reduce the total device resume
> > > time from 2.3s to a little more than 1.4s because there are a lot of
> > > sleep in usb resume process.
> > > But if we share the global async domain, the total resume time can only
> > > be reduced to about 2.1s because sata, usb and ACPI battery are actually
> > > resumed synchronously.
> > 
> > Well, first, I'm not really sure that using the async _boot_ infrastructure for
> > that is a good choice.
> 
> IMO, kernel/async.c provides good interfaces that can be used not only
> in boot phrase.
> it's generic enough for suspend/resume.

Well, if that were not your opinion, you certainly wouldn't post patches using
it. :-)

> >  During suspend-resume we know dependencies between
> > devices beforehand, at least in theory, so we can use them.
> > 
> that's why I use multiple async domains. :)
> One domain for a device group.

And how exactly the device groups are defined?

> > In particular, we have to make sure that parent devices will not be suspended
> > until all of their children have been suspended and children devices will not
> > be resumed before the parents.
> 
> that's not enough.
> For examples,
> ACPI battery and EC are independent devices, but EC must be resumed
> before battery because battery driver may access some EC address space
> during resume time.
> Of course the problem I describe above doesn't exist because ACPI
> battery driver doesn't have .resume method right now.
> But this is the case that works well in the current code but can not be
> converted to async device resume.
> 
> > The current code handles this quite
> > efficiently, so I wonder what you're going to do not to break it.
> > 
> sorry I don't quite understand.
> 
> > Second, you seem to think that it only makes sense to execute ->suspend()
> > and ->resume() asynchronously (or in parallel), while for example in the case
> > of PCI ->suspend_noirq() and ->resume_noirq() also contain code that
> > potentially can take quite some time to execute.
> > 
> IMO, this patch set just provides a framework that can be used for
> suspend/resume/shutdown optimization, and it doesn't solve all the
> problem at one time.
> 
> IMO, the next step we should do is:
> 1. analyze the suspend/resume/shutdown process and find out the hotspot,
>    i.e. which drivers suspend/resume slowly
> 2. If it's software problem that we can fix in the driver, fix it.
>    like commit 24920c8a6358bf5532f1336b990b1c0fe2b599ee
> 3. If the hardware is slow, try to do it asynchronously.
>    like I did in patch 8/8.
> 
> This framework makes it quite easy to register an async device group.
> 
> And even the suspend_noirq and resume_noirq can be covered easily with
> this framework.
> For example,
> 1. introduce two new device async actions.
>    DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
>    just like what I did in patch 4/8, 5/8 and 6/8.
> 2. find out which device is slow in ->suspend_noirq and ->resume_noirq
> 3. see if we can find an async device group that including this device.
> 4. if yes, register this new async device group,
>    with the type DEV_ASYNC_SUSPEND_NOIRQ/DEV_ASYNC_RESUME_NOIRQ
> 
> > Finally, I don't really understand what the code in the $subject patch is
> > supposed to do.  In particular, what's the purpose of dev_action()?
> > It only seems to check if func is not NULL right now.  Also, you define
> > DEV_ASYNC_ACTIONS_ALL as 0, so the condition
> > if (!(DEV_ASYNC_ACTIONS_ALL & type)) in dev_async_register() is always
> > satisfied.
> 
> please refer to patch 4/8 and 5/8 and 6/8
> 
> Patch 2/8 is just a framework. No device async action is support yet.

OK

So please remove the redundant return statements from there and please don't
use (void *) for passing function pointers.

> And in Patch 4, 5, 6, we introduced three different types of device
> async actions, DEV_ASYNC_SUSPEND, DEV_ASYNC_RESUME and
> DEV_ASYNC_SHUTDOWN.
> I tried to split a big patch into several small patches.
> And it suggests how to adding a new device async type, like
> DEV_ASYNC_SUSPEND_NOIRQ, DEV_ASYNC_RESUME_NO_IRQ, etc. :)
> 
> > Can we please discuss this thoroughly before any new patches are sent?
> > 
> do I still need to resend the patch?

Yes, please.

> If yes, I'll resend patch 2/8, 3/8, 4/8, 5/8, 6/8 as a new big one. :)

Yes, please, I think that would help to understand the design.

Best,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-2.6/include/linux/async_dev.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/async_dev.h
@@ -0,0 +1,40 @@ 
+/*
+ * async_dev.h: function calls for device async actions
+ *
+ * (C) Copyright 2009 Intel Corporation
+ * Author: Zhang Rui <rui.zhang@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef _ASYNC_DEV_H_
+#define _ASYNC_DEV_H_
+
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/async.h>
+#include <linux/device.h>
+#include <linux/pm.h>
+
+struct dev_async_struct {
+	struct device *dev;
+	int type;
+	/* Synchronization Domain for device async actions */
+	struct list_head domain;
+	struct list_head node;
+	async_cookie_t cookie;
+};
+
+#define DEV_ASYNC_ACTIONS_ALL	0
+
+extern int dev_async_schedule(struct device *, void *,
+			void *, int);
+extern void dev_async_synchronization(void);
+
+extern int dev_async_register(struct device *, int);
+extern void dev_async_unregister(struct device *);
+
+#endif /* _ASYNC_DEV_H_ */
Index: linux-2.6/drivers/base/async_dev.c
===================================================================
--- /dev/null
+++ linux-2.6/drivers/base/async_dev.c
@@ -0,0 +1,180 @@ 
+/*
+ * async_dev.c: Device asynchronous functions
+ *
+ * (C) Copyright 2009 Intel Corporation
+ * Author: Zhang Rui <rui.zhang@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#include <linux/device.h>
+#include <linux/async.h>
+#include <linux/async_dev.h>
+
+static LIST_HEAD(dev_async_list);
+static int dev_async_enabled;
+
+struct dev_async_context {
+	struct device *dev;
+	void *data;
+	void *func;
+	int type;
+};
+
+static int dev_action(struct device *dev, void *func,
+			void *data, int type)
+{
+	if (!func)
+		return -EINVAL;
+
+	return 0;
+}
+
+static void dev_async_action(void *data, async_cookie_t cookie)
+{
+	int error;
+	struct dev_async_context *context = data;
+
+	context->dev->dev_async->cookie = cookie;
+	async_synchronize_cookie_domain(cookie,
+					 &context->dev->dev_async->domain);
+
+	error =	dev_action(context->dev, context->func, context->data,
+							context->type);
+	if (error)
+		printk(KERN_ERR "PM: Device %s async action failed: error %d\n",
+			dev_name(context->dev), error);
+
+	kfree(context);
+}
+
+/**
+ * dev_async_schedule - async execution of device actions.
+ * @dev: Device.
+ * @func: device callback function.
+ * @data: data.
+ * @type: the type of device async actions.
+ */
+int dev_async_schedule(struct device *dev, void *func,
+			void *data, int type)
+{
+	struct dev_async_context *context;
+
+	if (!dev_async_enabled || !dev->dev_async)
+		return dev_action(dev, func, data, type);
+
+	/* the current dev async action is not supported */
+	if (!(dev->dev_async->type & type))
+		return dev_action(dev, func, data, type);
+
+	if (!func)
+		return -EINVAL;
+
+	if (type > DEV_ASYNC_ACTIONS_ALL)
+		return -EINVAL;
+
+	context = kzalloc(sizeof(struct dev_async_context), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	context->dev = dev;
+	context->data = data;
+	context->func = func;
+	context->type = type;
+	async_schedule_domain(dev_async_action, context,
+			       &dev->dev_async->domain);
+	return 0;
+}
+
+/**
+ * device_async_synchronization - sync point for all the async actions
+ * @dev: Device.
+ *
+ * wait until all the async actions are done.
+ */
+void dev_async_synchronization(void)
+{
+	struct dev_async_struct *pos;
+
+	list_for_each_entry(pos, &dev_async_list, node)
+	    async_synchronize_full_domain(&pos->domain);
+
+	return;
+}
+
+/**
+ * device_async_register - register a device that supports async actions
+ * @dev: Device.
+ * @type: the kind of dev async actions that supported
+ *
+ * Register a device that supports a certain kind of dev async actions.
+ * Create a synchrolization Domain for this device and share with all its
+ * child devices.
+ */
+int dev_async_register(struct device *dev, int type)
+{
+	if (!dev_async_enabled)
+		return 0;
+
+	if (!dev)
+		return -EINVAL;
+
+	if (dev->dev_async) {
+		if (dev->dev_async->dev == dev) {
+			printk(KERN_ERR "device already registered\n");
+			return -EEXIST;
+		}
+	}
+
+	if (!(DEV_ASYNC_ACTIONS_ALL & type))
+	/* check for unsupported async actions */
+		return -EINVAL;
+
+	dev->dev_async = kzalloc(sizeof(struct dev_async_struct), GFP_KERNEL);
+	if (!dev->dev_async)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&dev->dev_async->domain);
+	dev->dev_async->dev = dev;
+	dev->dev_async->type = type;
+	list_add_tail(&dev->dev_async->node, &dev_async_list);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dev_async_register);
+
+/**
+ * device_async_unregister - unregister a device that supports async actions
+ * @dev: Device.
+ *
+ * Unregister a device that supports async actions.
+ * And delete async action Domain at the same time.
+ */
+void dev_async_unregister(struct device *dev)
+{
+	if (!dev_async_enabled)
+		return ;
+
+	if (!dev->dev_async)
+		return;
+
+	if (dev->dev_async->dev != dev)
+		return;
+
+	list_del(&dev->dev_async->node);
+	kfree(dev->dev_async);
+	dev->dev_async = NULL;
+	return;
+}
+EXPORT_SYMBOL_GPL(dev_async_unregister);
+
+/* To enable the device async actions, boot with "dev_async_action" */
+static int __init enable_dev_async(char *arg)
+{
+	dev_async_enabled = 1;
+	return 0;
+}
+
+early_param("dev_async_action", enable_dev_async);
Index: linux-2.6/include/linux/device.h
===================================================================
--- linux-2.6.orig/include/linux/device.h
+++ linux-2.6/include/linux/device.h
@@ -22,6 +22,7 @@ 
 #include <linux/module.h>
 #include <linux/pm.h>
 #include <linux/semaphore.h>
+#include <linux/async_dev.h>
 #include <asm/atomic.h>
 #include <asm/device.h>
 
@@ -416,6 +417,7 @@  struct device {
 	struct attribute_group	**groups;	/* optional groups */
 
 	void	(*release)(struct device *dev);
+	struct dev_async_struct	*dev_async;	/* device async actions */
 };
 
 /* Get the wakeup routines, which depend on struct device */
Index: linux-2.6/drivers/base/Makefile
===================================================================
--- linux-2.6.orig/drivers/base/Makefile
+++ linux-2.6/drivers/base/Makefile
@@ -3,7 +3,8 @@ 
 obj-y			:= core.o sys.o bus.o dd.o \
 			   driver.o class.o platform.o \
 			   cpu.o firmware.o init.o map.o devres.o \
-			   attribute_container.o transport_class.o
+			   attribute_container.o transport_class.o \
+			   async_dev.o
 obj-y			+= power/
 obj-$(CONFIG_HAS_DMA)	+= dma-mapping.o
 obj-$(CONFIG_ISA)	+= isa.o