mmc: sdhci: disable irq in sdhci host suspend rather than free this irq

Message ID 1453974146-20951-1-git-send-email-haibo.chen@nxp.com (mailing list archive)
State New, archived

Commit Message

Bough Chen Jan. 28, 2016, 9:42 a.m. UTC
Currently the sdhci driver frees the irq in host suspend and calls
request_threaded_irq() in host resume. But during host resume,
Ctrl+C can impact sdhci host resume, see the error log:

CPU1 is up
PM: noirq resume of devices complete after 0.637 msecs
imx-sdma 30bd0000.sdma: loaded firmware 4.1
PM: early resume of devices complete after 0.774 msecs
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b40000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b50000.usdhc failed to resume: error -4
dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
PM: Device 30b60000.usdhc failed to resume: error -4
fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: Timeout waiting for hardware interrupt.
mmc0: error -110 during resume (card was removed?)
mmc2: Timeout waiting for hardware interrupt.
mmc2: Timeout waiting for hardware interrupt.
mmc2: error -110 during resume (card was removed?)
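
For readers decoding the log above: the numbers are negated errno values. On Linux for the common architectures (ARM and x86 included), EINTR is 4 and ETIMEDOUT is 110, so "error -4" is -EINTR and "error -110" is -ETIMEDOUT. A minimal userspace sketch of that mapping follows; `kernel_errno_name` is a hypothetical helper written for illustration, not a kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): map a kernel-style return
 * value, which is zero or a negated errno, to a readable name.
 */
static const char *kernel_errno_name(int ret)
{
	assert(ret <= 0);	/* kernel-style returns are 0 or negative */

	switch (-ret) {
	case 0:
		return "OK";
	case EINTR:
		return "EINTR";		/* interrupted, e.g. by a fatal signal */
	case ETIMEDOUT:
		return "ETIMEDOUT";	/* the operation timed out */
	default:
		return strerror(-ret);	/* fall back to the libc description */
	}
}
```

Calling `kernel_errno_name(-EINTR)` yields "EINTR", which matches the failed platform_pm_resume calls; the later mmc timeouts correspond to `kernel_errno_name(-ETIMEDOUT)`.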

In request_threaded_irq() -> __setup_irq() -> kthread_create()
-> kthread_create_on_node(), the comment shows that being SIGKILLed
will abort the kthread creation and return -EINTR.

This patch replaces the free_irq()/request_threaded_irq() pair with
disable_irq()/enable_irq(), which will prevent IRQs from being
propagated to the sdhci driver.

Fixes: 781e989cf593 ("mmc: sdhci: convert to new SDIO IRQ handling")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
---
 drivers/mmc/host/sdhci.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

Comments

Russell King - ARM Linux Jan. 28, 2016, 10:20 a.m. UTC | #1
On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
> Currently sdhci driver free irq in host suspend, and call
> request_threaded_irq() in host resume. But during host resume,
> Ctrl+C can impact sdhci host resume, see the error log:

Ctrl+C should have no effect on this - that seems to imply that there's
some other bug elsewhere.

> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index d622435..4b1646b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
>  		host->ier = 0;
>  		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
>  		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
> -		free_irq(host->irq, host);
> +		disable_irq(host->irq);

This is really not acceptable I'm afraid.  While it's common on ARM for
each interrupt to be uniquely allocated to a peripheral, not all SDHCI
platforms have that luxury.

SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
interrupts shared between (sometimes many) different PCI devices.

For example, on my laptop:

 18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0

the SDHCI interrupt is shared with two other peripherals - one USB
controller and a NAND device.

Disabling the interrupt will adversely impact other peripherals and
cause regressions where the interrupt is shared.

So, I'm afraid I'm going to have to NAK this patch.

Ulf Hansson Jan. 28, 2016, 3:47 p.m. UTC | #2
+tglx, Jon

On 28 January 2016 at 11:20, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
>> Currently sdhci driver free irq in host suspend, and call
>> request_threaded_irq() in host resume. But during host resume,
>> Ctrl+C can impact sdhci host resume, see the error log:
>
> Ctrl+C should have no effect on this - that seems to imply that there's
> some other bug elsewhere.
>
>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
>> index d622435..4b1646b 100644
>> --- a/drivers/mmc/host/sdhci.c
>> +++ b/drivers/mmc/host/sdhci.c
>> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
>>               host->ier = 0;
>>               sdhci_writel(host, 0, SDHCI_INT_ENABLE);
>>               sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
>> -             free_irq(host->irq, host);
>> +             disable_irq(host->irq);
>
> This is really not acceptable I'm afraid.  While it's common on ARM for
> each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> platforms have that luxury.
>
> SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> interrupts shared between (sometimes many) different PCI devices.
>
> For example, on my laptop:
>
>  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
>
> the SDHCI interrupt is shared with two other peripherals - one USB
> controller and a NAND device.
>
> Disabling the interrupt will adversely impact other peripherals and
> cause regressions where the interrupt is shared.

I thought disable|enable_irq() was being reference counted, so it
shouldn't impact the other peripherals for shared IRQs. I might have
understood this wrong though!?

Although, if that's the case, it also means that the IRQ can still
reach sdhci's irq handler, as it hasn't actually been disabled.

Therefore, the only way we can currently make sure we don't get the
IRQ is to free it and later re-request it. Now, apparently that has
issues when using threaded IRQ handlers.

I have recently discussed a related change to the genirq framework,
where we concluded that a new API is needed to deal with PM related
enable/disable IRQ cases.
http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

Perhaps that's actually what we need to cover this case.

>
> So, I'm afraid I'm going to have to NAK this patch.

I agree. We need another solution!

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thomas Gleixner Jan. 28, 2016, 4:21 p.m. UTC | #3
On Thu, 28 Jan 2016, Ulf Hansson wrote:
> On 28 January 2016 at 11:20, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> >> -             free_irq(host->irq, host);
> >> +             disable_irq(host->irq);
> >
> > This is really not acceptable I'm afraid.  While it's common on ARM for
> > each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> > platforms have that luxury.
> >
> > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> > interrupts shared between (sometimes many) different PCI devices.
> >
> > For example, on my laptop:
> >
> >  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
> >
> > the SDHCI interrupt is shared with two other peripherals - one USB
> > controller and a NAND device.
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
> 
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

It's reference counted. But it disables the irq line and not a particular
interrupt handler.
 
> Although, as if that's the case it also means that the IRQ can still
> reach sdhci's irq handler as it hasn't actually been disabled.

No. The result is that the other devices on the same irq line won't get any
interrupt anymore.

> Therefore, the only way we currently can make sure to don't get the
> IRQ is to free and later re-request it. Now, apparently that has
> issues when using threaded IRQ handlers.

What's the issue?
 
Thanks,

	tglx

Thomas Gleixner Jan. 28, 2016, 4:27 p.m. UTC | #4
On Thu, 28 Jan 2016, Thomas Gleixner wrote:
> On Thu, 28 Jan 2016, Ulf Hansson wrote:
> > Therefore, the only way we currently can make sure to don't get the
> > IRQ is to free and later re-request it. Now, apparently that has
> > issues when using threaded IRQ handlers.
> 
> What's the issue?

Ah, you mean that one:

> Currently sdhci driver free irq in host suspend, and call
> request_threaded_irq() in host resume. But during host resume,
> Ctrl+C can impact sdhci host resume, see the error log:

> CPU1 is up
> PM: noirq resume of devices complete after 0.637 msecs
> imx-sdma 30bd0000.sdma: loaded firmware 4.1
> PM: early resume of devices complete after 0.774 msecs
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b40000.usdhc failed to resume: error -4
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b50000.usdhc failed to resume: error -4
> dpm_run_callback(): platform_pm_resume+0x0/0x44 returns -4
> PM: Device 30b60000.usdhc failed to resume: error -4
> fec 30be0000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: Timeout waiting for hardware interrupt.
> mmc0: error -110 during resume (card was removed?)
> mmc2: Timeout waiting for hardware interrupt.
> mmc2: Timeout waiting for hardware interrupt.
> mmc2: error -110 during resume (card was removed?)

> In request_threaded_irq-> __setup_irq-> kthread_create
> ->kthread_create_on_node, the comment shows that SIGKILLed will
> impact the kthread create, and return -EINTR.

And how should that thread be SIGKILLed? Hitting Ctrl+C on the console does
not affect any kernel internal thread. Hitting Ctrl+C affects solely the
process which is running on that console.

And if it would, then that would be a completely different, serious bug which
needs to be fixed.

How was it verified that the thread was not created and that the creation
failed due to a SIGKILL?

Thanks,

	tglx

Russell King - ARM Linux Jan. 28, 2016, 4:38 p.m. UTC | #5
On Thu, Jan 28, 2016 at 04:47:23PM +0100, Ulf Hansson wrote:
> +tglx, Jon
> 
> On 28 January 2016 at 11:20, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Thu, Jan 28, 2016 at 05:42:26PM +0800, Haibo Chen wrote:
> >> Currently sdhci driver free irq in host suspend, and call
> >> request_threaded_irq() in host resume. But during host resume,
> >> Ctrl+C can impact sdhci host resume, see the error log:
> >
> > Ctrl+C should have no effect on this - that seems to imply that there's
> > some other bug elsewhere.
> >
> >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> >> index d622435..4b1646b 100644
> >> --- a/drivers/mmc/host/sdhci.c
> >> +++ b/drivers/mmc/host/sdhci.c
> >> @@ -2686,7 +2686,7 @@ int sdhci_suspend_host(struct sdhci_host *host)
> >>               host->ier = 0;
> >>               sdhci_writel(host, 0, SDHCI_INT_ENABLE);
> >>               sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
> >> -             free_irq(host->irq, host);
> >> +             disable_irq(host->irq);
> >
> > This is really not acceptable I'm afraid.  While it's common on ARM for
> > each interrupt to be uniquely allocated to a peripheral, not all SDHCI
> > platforms have that luxury.
> >
> > SDHCI is also used on PCI, and on x86 platforms, it's common to have PCI
> > interrupts shared between (sometimes many) different PCI devices.
> >
> > For example, on my laptop:
> >
> >  18:    1089806     286185   IO-APIC-fasteoi   uhci_hcd:usb8, r852, mmc0
> >
> > the SDHCI interrupt is shared with two other peripherals - one USB
> > controller and a NAND device.
> >
> > Disabling the interrupt will adversely impact other peripherals and
> > cause regressions where the interrupt is shared.
> 
> I thought disable|enable_irq() was being reference counted, so it
> shouldn't impact the other peripherals for shared IRQs. I might have
> understood this wrong though!?

They are.  When anything disables an IRQ, the IRQ is disabled.  Only
once the N disable_irq()s have been balanced with N enable_irq()s will
the interrupt be re-enabled.  disable_irq() doesn't work on a per-device
level, but on a per-interrupt line level.

So, if sdhci calls disable_irq() in its suspend path, it disables
the IRQ for _everything_ that's sharing that interrupt.  If (eg) USB or
r852 needs an interrupt to complete its own suspend, it won't see that
interrupt because SDHCI disabled it.
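
The behaviour described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, and the code is not the genirq implementation. It only mimics the two properties under discussion, a nested disable depth and a disable that acts on the whole line.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SHARERS 4

typedef void (*toy_handler_t)(void *dev);

/* Toy model of one interrupt line shared by several devices. */
struct toy_irq_line {
	unsigned int depth;	/* nested disable count; 0 means enabled */
	size_t nr;		/* number of registered handlers */
	toy_handler_t handler[TOY_MAX_SHARERS];
	void *dev[TOY_MAX_SHARERS];
};

static void toy_request_irq(struct toy_irq_line *line, toy_handler_t h,
			    void *dev)
{
	assert(line->nr < TOY_MAX_SHARERS);
	line->handler[line->nr] = h;
	line->dev[line->nr] = dev;
	line->nr++;
}

/* Disabling acts on the line, not on one particular handler. */
static void toy_disable_irq(struct toy_irq_line *line)
{
	line->depth++;
}

static void toy_enable_irq(struct toy_irq_line *line)
{
	assert(line->depth > 0);	/* enables must balance disables */
	line->depth--;
}

/* Deliver one interrupt; returns how many handlers were called. */
static size_t toy_raise_irq(struct toy_irq_line *line)
{
	size_t i;

	if (line->depth > 0)
		return 0;	/* line disabled: no sharer sees anything */
	for (i = 0; i < line->nr; i++)
		line->handler[i](line->dev[i]);
	return line->nr;
}

/* Example handler: counts how often a device was serviced. */
static void toy_count_hit(void *dev)
{
	(*(unsigned int *)dev)++;
}
```

With three counters registered through `toy_request_irq()`, a single `toy_disable_irq()` makes `toy_raise_irq()` return 0 for all of them, and two disables need two enables before delivery resumes.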

It might appear to work as-is even so, if SDHCI happens to be (on the test
setup) suspended after (eg) both the USB and r852 drivers.

It is probably much better if SDHCI writes to the device on suspend to
disable interrupts, synchronise with the IRQ, and then set a flag to
indicate that the interrupt handler should immediately return IRQ_NONE
in case any of the other peripherals sharing the IRQ line trigger an
interrupt.

> I have recently discussed a related change on the genirq framework,
> which in principle turned out that we concluded on needing a new API
> to deal with PM related enable/disable IRQ cases.
> http://www.gossamer-threads.com/lists/linux/kernel/2350504?do=post_view_threaded#2350504

I haven't read your link, but I don't think we really need yet more
APIs to deal with this, except possibly one thing - a way to tell
genirq that a specific IRQ handler should not be called because its
device is suspended.

IOW, moving:

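static irqreturn_t foo_device_irq(int irq, void *dev_id)
{
	struct foo_device_priv *priv = dev_id;

	/* Device is suspended, so another sharer raised this interrupt. */
	if (priv->suspended)
		return IRQ_NONE;

	/* ... rest of IRQ handling ... */
	return IRQ_HANDLED;
}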

into genirq code, so that we don't end up with that pattern repeated
many times in drivers.

It may be that's exactly what's being proposed in the link, but as I
say, I've not read it yet.
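
As a rough illustration of the idea in the message above (doing the "device is suspended" check once in the core rather than in every driver), here is a userspace sketch. All `toy_*` names are hypothetical; this is not how genirq is implemented, and it is not the API proposed in the linked thread. It only shows a shared-line dispatch loop that skips handlers whose device is marked suspended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_irqreturn { TOY_IRQ_NONE = 0, TOY_IRQ_HANDLED = 1 };

typedef enum toy_irqreturn (*toy_irq_handler_t)(void *dev_id);

/* One registered handler on a shared line (hypothetical structure). */
struct toy_action {
	toy_irq_handler_t handler;
	void *dev_id;
	bool suspended;		/* set by the core when the device suspends */
	struct toy_action *next;
};

/*
 * Walk every handler on the line and skip suspended devices here, so
 * that drivers need no "if (priv->suspended)" check of their own.
 */
static enum toy_irqreturn toy_handle_shared_irq(struct toy_action *head)
{
	enum toy_irqreturn ret = TOY_IRQ_NONE;
	struct toy_action *act;

	for (act = head; act != NULL; act = act->next) {
		assert(act->handler != NULL);
		if (act->suspended)
			continue;	/* other sharers are still serviced */
		if (act->handler(act->dev_id) == TOY_IRQ_HANDLED)
			ret = TOY_IRQ_HANDLED;
	}
	return ret;
}

/* Example handler: records that the device was serviced. */
static enum toy_irqreturn toy_mark_serviced(void *dev_id)
{
	*(bool *)dev_id = true;
	return TOY_IRQ_HANDLED;
}
```

Unlike a line-level disable, the line stays enabled in this model: a suspended mmc entry is skipped while a USB entry on the same line still runs its handler.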

Patch

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index d622435..4b1646b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2686,7 +2686,7 @@  int sdhci_suspend_host(struct sdhci_host *host)
 		host->ier = 0;
 		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
 		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
-		free_irq(host->irq, host);
+		disable_irq(host->irq);
 	} else {
 		sdhci_enable_irq_wakeups(host);
 		enable_irq_wake(host->irq);
@@ -2698,8 +2698,6 @@  EXPORT_SYMBOL_GPL(sdhci_suspend_host);
 
 int sdhci_resume_host(struct sdhci_host *host)
 {
-	int ret = 0;
-
 	if (host->flags & (SDHCI_USE_SDMA | SDHCI_USE_ADMA)) {
 		if (host->ops->enable_dma)
 			host->ops->enable_dma(host);
@@ -2718,11 +2716,7 @@  int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
-		ret = request_threaded_irq(host->irq, sdhci_irq,
-					   sdhci_thread_irq, IRQF_SHARED,
-					   mmc_hostname(host->mmc), host);
-		if (ret)
-			return ret;
+		enable_irq(host->irq);
 	} else {
 		sdhci_disable_irq_wakeups(host);
 		disable_irq_wake(host->irq);
@@ -2730,7 +2724,7 @@  int sdhci_resume_host(struct sdhci_host *host)
 
 	sdhci_enable_card_detection(host);
 
-	return ret;
+	return 0;
 }
 
 EXPORT_SYMBOL_GPL(sdhci_resume_host);