diff mbox

[v2,RESEND] sdhci: Ignore unexpected CARD_INT interrupts

Message ID 20170116142342.21283-1-krisman@collabora.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Gabriel Krisman Bertazi Jan. 16, 2017, 2:23 p.m. UTC
One of our kernelCI boxes hanged at boot because a faulty eSDHC device
was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
reads, which are not allowed for SD devices.  This adds a sanity check
to the interruption path, preventing that illegal command from getting
sent if the CARD_INT interruption should be disabled.

This quirk allows that particular machine to resume boot despite the
faulty hardware, instead of getting hung dealing with thousands of
mishandled interrupts.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
CC: Ulf Hansson <ulf.hansson@linaro.org>
CC: linux-mmc@vger.kernel.org
---
 drivers/mmc/host/sdhci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Adrian Hunter Jan. 19, 2017, 10:28 a.m. UTC | #1
On 16/01/17 16:23, Gabriel Krisman Bertazi wrote:
> One of our kernelCI boxes hanged at boot because a faulty eSDHC device
> was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
> reads, which are not allowed for SD devices.  This adds a sanity check
> to the interruption path, preventing that illegal command from getting
> sent if the CARD_INT interruption should be disabled.
> 
> This quirk allows that particular machine to resume boot despite the
> faulty hardware, instead of getting hung dealing with thousands of
> mishandled interrupts.
> 
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
> CC: Ulf Hansson <ulf.hansson@linaro.org>
> CC: linux-mmc@vger.kernel.org

Acked-by: Adrian Hunter <adrian.hunter@intel.com>

> ---
>  drivers/mmc/host/sdhci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index 51cd4f0e973f..a2efa25c7f3b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2733,7 +2733,8 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
>  		if (intmask & SDHCI_INT_RETUNE)
>  			mmc_retune_needed(host->mmc);
>  
> -		if (intmask & SDHCI_INT_CARD_INT) {
> +		if ((intmask & SDHCI_INT_CARD_INT) &&
> +		    (host->ier & SDHCI_INT_CARD_INT)) {
>  			sdhci_enable_sdio_irq_nolock(host, false);
>  			host->thread_isr |= SDHCI_INT_CARD_INT;
>  			result = IRQ_WAKE_THREAD;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Jan. 19, 2017, 8:37 p.m. UTC | #2
On 16 January 2017 at 15:23, Gabriel Krisman Bertazi
<krisman@collabora.co.uk> wrote:
> One of our kernelCI boxes hanged at boot because a faulty eSDHC device
> was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
> reads, which are not allowed for SD devices.  This adds a sanity check
> to the interruption path, preventing that illegal command from getting
> sent if the CARD_INT interruption should be disabled.
>
> This quirk allows that particular machine to resume boot despite the
> faulty hardware, instead of getting hung dealing with thousands of
> mishandled interrupts.
>
> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
> CC: Ulf Hansson <ulf.hansson@linaro.org>
> CC: linux-mmc@vger.kernel.org

Thanks, applied for next with update commit msg header and by removing
the CCs from the changelog.

Kind regards
Uffe

> ---
>  drivers/mmc/host/sdhci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index 51cd4f0e973f..a2efa25c7f3b 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2733,7 +2733,8 @@ static irqreturn_t sdhci_irq(int irq, void *dev_id)
>                 if (intmask & SDHCI_INT_RETUNE)
>                         mmc_retune_needed(host->mmc);
>
> -               if (intmask & SDHCI_INT_CARD_INT) {
> +               if ((intmask & SDHCI_INT_CARD_INT) &&
> +                   (host->ier & SDHCI_INT_CARD_INT)) {
>                         sdhci_enable_sdio_irq_nolock(host, false);
>                         host->thread_isr |= SDHCI_INT_CARD_INT;
>                         result = IRQ_WAKE_THREAD;
> --
> 2.11.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gabriel Krisman Bertazi Jan. 27, 2017, 7:20 p.m. UTC | #3
Ulf Hansson <ulf.hansson@linaro.org> writes:

> On 16 January 2017 at 15:23, Gabriel Krisman Bertazi
> <krisman@collabora.co.uk> wrote:
>> One of our kernelCI boxes hanged at boot because a faulty eSDHC device
>> was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
>> reads, which are not allowed for SD devices.  This adds a sanity check
>> to the interruption path, preventing that illegal command from getting
>> sent if the CARD_INT interruption should be disabled.
>>
>> This quirk allows that particular machine to resume boot despite the
>> faulty hardware, instead of getting hung dealing with thousands of
>> mishandled interrupts.
>>
>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
>> CC: Ulf Hansson <ulf.hansson@linaro.org>
>> CC: linux-mmc@vger.kernel.org
>
> Thanks, applied for next with update commit msg header and by removing
> the CCs from the changelog.

Hi Ulf,

Thanks for applying.  Although, I saw it got queued to the next merge
window, but I we believe it to be -rc material, since it fixes a hang
in our kernelCI boxes[1], and is preventing us from testing other features
in this box.  Can you consider submitting it to Linus for the next -rc?
it will be much appreciated!

[1]  <https://kernelci.org/boot/id/5881c81859b5140b59f6c3ae/>

Thanks.
Ulf Hansson Jan. 30, 2017, 8:19 a.m. UTC | #4
On 27 January 2017 at 20:20, Gabriel Krisman Bertazi
<krisman@collabora.co.uk> wrote:
> Ulf Hansson <ulf.hansson@linaro.org> writes:
>
>> On 16 January 2017 at 15:23, Gabriel Krisman Bertazi
>> <krisman@collabora.co.uk> wrote:
>>> One of our kernelCI boxes hanged at boot because a faulty eSDHC device
>>> was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
>>> reads, which are not allowed for SD devices.  This adds a sanity check
>>> to the interruption path, preventing that illegal command from getting
>>> sent if the CARD_INT interruption should be disabled.
>>>
>>> This quirk allows that particular machine to resume boot despite the
>>> faulty hardware, instead of getting hung dealing with thousands of
>>> mishandled interrupts.
>>>
>>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
>>> CC: Ulf Hansson <ulf.hansson@linaro.org>
>>> CC: linux-mmc@vger.kernel.org
>>
>> Thanks, applied for next with update commit msg header and by removing
>> the CCs from the changelog.
>
> Hi Ulf,
>
> Thanks for applying.  Although, I saw it got queued to the next merge
> window, but I we believe it to be -rc material, since it fixes a hang
> in our kernelCI boxes[1], and is preventing us from testing other features
> in this box.  Can you consider submitting it to Linus for the next -rc?
> it will be much appreciated!

I can do that, but perhaps this should then be tagged for stable as well?

Or was this a regression in 4.10?

Kind regards
Uffe

>
> [1]  <https://kernelci.org/boot/id/5881c81859b5140b59f6c3ae/>
>
> Thanks.
>
> --
> Gabriel Krisman Bertazi
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gabriel Krisman Bertazi Jan. 30, 2017, 12:34 p.m. UTC | #5
Ulf Hansson <ulf.hansson@linaro.org> writes:

> On 27 January 2017 at 20:20, Gabriel Krisman Bertazi
> <krisman@collabora.co.uk> wrote:
>> Ulf Hansson <ulf.hansson@linaro.org> writes:
>>
>>> On 16 January 2017 at 15:23, Gabriel Krisman Bertazi
>>> <krisman@collabora.co.uk> wrote:
>>>> One of our kernelCI boxes hanged at boot because a faulty eSDHC
>>>> device
>>>> was triggering spurious CARD_INT interrupts for SD cards, causing
>>>> CMD52
>>>> reads, which are not allowed for SD devices.  This adds a sanity
>>>> check
>>>> to the interruption path, preventing that illegal command from
>>>> getting
>>>> sent if the CARD_INT interruption should be disabled.
>>>>
>>>> This quirk allows that particular machine to resume boot despite the
>>>> faulty hardware, instead of getting hung dealing with thousands of
>>>> mishandled interrupts.
>>>>
>>>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
>>>> CC: Ulf Hansson <ulf.hansson@linaro.org>
>>>> CC: linux-mmc@vger.kernel.org
>>>
>>> Thanks, applied for next with update commit msg header and by
>>> removing
>>> the CCs from the changelog.
>>
>> Hi Ulf,
>>
>> Thanks for applying.  Although, I saw it got queued to the next merge
>> window, but I we believe it to be -rc material, since it fixes a hang
>> in our kernelCI boxes[1], and is preventing us from testing other
>> features
>> in this box.  Can you consider submitting it to Linus for the next
>> -rc?
>> it will be much appreciated!
>
> I can do that, but perhaps this should then be tagged for stable as
> well?
>

Thanks Ulf.  Yes, it should go to stable releases as well.  I tested it
on 4.4 and 4.9 already.
Adrian Hunter Jan. 30, 2017, 12:40 p.m. UTC | #6
On 30/01/17 14:34, Gabriel Krisman Bertazi wrote:
> Ulf Hansson <ulf.hansson@linaro.org> writes:
> 
>> On 27 January 2017 at 20:20, Gabriel Krisman Bertazi
>> <krisman@collabora.co.uk> wrote:
>>> Ulf Hansson <ulf.hansson@linaro.org> writes:
>>>
>>>> On 16 January 2017 at 15:23, Gabriel Krisman Bertazi
>>>> <krisman@collabora.co.uk> wrote:
>>>>> One of our kernelCI boxes hanged at boot because a faulty eSDHC
>>>>> device
>>>>> was triggering spurious CARD_INT interrupts for SD cards, causing
>>>>> CMD52
>>>>> reads, which are not allowed for SD devices.  This adds a sanity
>>>>> check
>>>>> to the interruption path, preventing that illegal command from
>>>>> getting
>>>>> sent if the CARD_INT interruption should be disabled.
>>>>>
>>>>> This quirk allows that particular machine to resume boot despite the
>>>>> faulty hardware, instead of getting hung dealing with thousands of
>>>>> mishandled interrupts.
>>>>>
>>>>> Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
>>>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
>>>>> CC: Ulf Hansson <ulf.hansson@linaro.org>
>>>>> CC: linux-mmc@vger.kernel.org
>>>>
>>>> Thanks, applied for next with update commit msg header and by
>>>> removing
>>>> the CCs from the changelog.
>>>
>>> Hi Ulf,
>>>
>>> Thanks for applying.  Although, I saw it got queued to the next merge
>>> window, but I we believe it to be -rc material, since it fixes a hang
>>> in our kernelCI boxes[1], and is preventing us from testing other
>>> features
>>> in this box.  Can you consider submitting it to Linus for the next
>>> -rc?
>>> it will be much appreciated!
>>
>> I can do that, but perhaps this should then be tagged for stable as
>> well?
>>
> 
> Thanks Ulf.  Yes, it should go to stable releases as well.  I tested it
> on 4.4 and 4.9 already.

Did you test normal SDIO operation was unaffected?  Please do that before
sending it to stable.

--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gabriel Krisman Bertazi Jan. 30, 2017, 1:40 p.m. UTC | #7
Adrian Hunter <adrian.hunter@intel.com> writes:
> Did you test normal SDIO operation was unaffected?  Please do that before
> sending it to stable.

I don't have any SDIO card in hand to confirm it right now.  would you
or Ulf be able to give it a try?  I'll keep looking for one.
Ulf Hansson Jan. 31, 2017, 10:35 a.m. UTC | #8
On 30 January 2017 at 14:40, Gabriel Krisman Bertazi
<krisman@collabora.co.uk> wrote:
> Adrian Hunter <adrian.hunter@intel.com> writes:
>> Did you test normal SDIO operation was unaffected?  Please do that before
>> sending it to stable.
>
> I don't have any SDIO card in hand to confirm it right now.  would you
> or Ulf be able to give it a try?  I'll keep looking for one.

Sorry, I don't have any SDIO cards at hand that is using sdhci.

Still, as this has been tested for a while already in next, I decided
to move this to my fixes branch. I also added a stable-tag #4.4+.

If we see some issues, please report asap then I will drop the change.
I plan to send it to Linus on Friday this week.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 51cd4f0e973f..a2efa25c7f3b 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2733,7 +2733,8 @@  static irqreturn_t sdhci_irq(int irq, void *dev_id)
 		if (intmask & SDHCI_INT_RETUNE)
 			mmc_retune_needed(host->mmc);
 
-		if (intmask & SDHCI_INT_CARD_INT) {
+		if ((intmask & SDHCI_INT_CARD_INT) &&
+		    (host->ier & SDHCI_INT_CARD_INT)) {
 			sdhci_enable_sdio_irq_nolock(host, false);
 			host->thread_isr |= SDHCI_INT_CARD_INT;
 			result = IRQ_WAKE_THREAD;