
[RFC] mmc: fix dead lock issue when system entering S3

Message ID 20110517075352.GA3992@intel.com (mailing list archive)
State New, archived

Commit Message

Chuanxiao.Dong May 17, 2011, 7:53 a.m. UTC
Hi all,
   I have hit a deadlock: two threads try to claim the host, and one of them
   has to wait for the other to finish. I hit this with an SD card while
   testing system entry into S3. Has anybody else seen the same issue?

Environment:
1. CONFIG_MMC_UNSAFE_RESUME is not enabled
2. the SD card is mounted

When the system tries to enter S3, mmc_pm_notify is called first to remove the
SD card. The calling sequence in the SD remove thread is:
	mmc_claim_host
	bus_ops->remove
	....
	....
	mmc_cleanup_queue
	kthread_stop(mq->thread)
	...
	...

Meanwhile, mmc_cleanup_queue wakes up mq->thread. The calling sequence in
mq->thread is:
	mmc_queue_thread
	mq->issue_fn (mmc_blk_issue_rq)
	mmc_claim_host (dead lock)
	....
	....

mmc_claim_host is called again here, this time from mq->thread rather than the
SD remove thread. The host is already claimed by the SD remove thread, which is
in turn waiting in kthread_stop for mq->thread to finish, so neither thread can
make progress.

Moving the mmc_claim_host call in mmc_pm_notify to after bus_ops->remove
resolves this deadlock. mmc_suspend_host() claims the host in the same way.
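
For illustration only, the two orderings can be modelled in userspace. This is
a rough sketch, not kernel code: a pthread mutex stands in for the host claim,
queue_thread and remove_card are hypothetical names, and a 200 ms timeout
replaces the real unbounded wait so that the model reports the deadlock instead
of hanging. It only models the lock ordering described above and says nothing
about what else the remove path has to handle.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-in for the host claim; the kernel does not use a pthread mutex. */
static pthread_mutex_t host_claim = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for mq->thread: before it can exit it has to claim
 * the host once, as in the mmc_blk_issue_rq step of the sequence above.
 * The timeout is an assumption of this model, used to detect the deadlock.
 */
static void *queue_thread(void *arg)
{
	bool *claimed = arg;
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_nsec += 200L * 1000 * 1000;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	*claimed = pthread_mutex_timedlock(&host_claim, &deadline) == 0;
	if (*claimed)
		pthread_mutex_unlock(&host_claim);
	return NULL;
}

/*
 * Models the remove path. Returns true if the queue thread managed to claim
 * the host, false if it would have blocked forever (the deadlock).
 */
static bool remove_card(bool claim_before_remove)
{
	pthread_t worker;
	bool worker_claimed = false;
	int rc;

	if (claim_before_remove)
		pthread_mutex_lock(&host_claim);	/* claim first (current order) */

	/* bus_ops->remove ... mmc_cleanup_queue ... kthread_stop(mq->thread) */
	rc = pthread_create(&worker, NULL, queue_thread, &worker_claimed);
	assert(rc == 0);
	pthread_join(worker, NULL);

	if (!claim_before_remove)
		pthread_mutex_lock(&host_claim);	/* claim after remove (proposed) */

	/* mmc_detach_bus would run here, followed by the release */
	pthread_mutex_unlock(&host_claim);
	return worker_claimed;
}
```

In this model remove_card(true) follows the current ordering and returns false
(the queue thread never gets the host), while remove_card(false) follows the
proposed ordering and returns true. Build with -pthread.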

Suggestions are welcome.

Signed-off-by: He Bo <bo.he@intel.com>
Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
---
 drivers/mmc/core/core.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

Comments

Chuanxiao.Dong May 17, 2011, 10:03 a.m. UTC | #1
I just found that Andrei Warkentin has also submitted an RFC patch, which includes further fixes for media holding a mounted root file system. Please ignore this thread.

> -----Original Message-----
> From: linux-mmc-owner@vger.kernel.org
> [mailto:linux-mmc-owner@vger.kernel.org] On Behalf Of Chuanxiao Dong
> Sent: Tuesday, May 17, 2011 3:54 PM
> To: linux-mmc@vger.kernel.org
> Cc: He, Bo
> Subject: [RFC]mmc: fix dead lock issue when system entering S3
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrei Warkentin May 18, 2011, 7:58 a.m. UTC | #2
Hey,

On Tue, May 17, 2011 at 5:03 AM, Dong, Chuanxiao
<chuanxiao.dong@intel.com> wrote:
> Just found Andrei Warkentin also submitted a RFC patch, which included more fix for mounted root file system media. Please ignore this thread.

I submitted an RFC patch as an idea up for discussion, but the issue
certainly never went away :-). Your patch will not quite work. If you
have a mounted filesystem, it will not be unmounted, so mmc0 (for
example) will never be released, and when a new (or the same) card
re-enumerates, it will show up as mmc1...

I believe my patch basically created "md orphans" which would be
reconnected on resume, which is a bit of a really nasty hack... I'm not
sure if there is a better way. I could have sworn I saw a patch a week
ago that attempted to resolve the suspend/resume with mounted fs issue
by deferring removal until resume, but I don't see it now, and I'm not
sure if it dealt well with the attempt to unmount the fs in the mmc
block cleanup path...

I guess I can resubmit...

A

Patch

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 68091dd..1e27588 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1850,11 +1850,10 @@  int mmc_pm_notify(struct notifier_block *notify_block,
 		if (!host->bus_ops || host->bus_ops->suspend)
 			break;
 
-		mmc_claim_host(host);
-
 		if (host->bus_ops->remove)
 			host->bus_ops->remove(host);
 
+		mmc_claim_host(host);
 		mmc_detach_bus(host);
 		mmc_release_host(host);
 		host->pm_flags = 0;