diff mbox

[09/14] mmc: jz4740: Fix race condition in IRQ mask update

Message ID 20180309151219.18723-10-ezequiel@vanguardiasur.com.ar (mailing list archive)
State New, archived
Headers show

Commit Message

Ezequiel Garcia March 9, 2018, 3:12 p.m. UTC
From: Alex Smith <alex.smith@imgtec.com>

A spinlock is held while updating the internal copy of the IRQ mask,
but not while writing it to the actual IMASK register. After the lock
is released, an IRQ can occur before the IMASK register is written.
If handling this IRQ causes the mask to be changed, when the handler
returns back to the middle of the first mask update, a stale value
will be written to the mask register.

If this causes an IRQ to become unmasked that cannot have its status
cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ,
then we can end up stuck with the same IRQ repeatedly being fired but
not handled. Normally the MMC IRQ handler attempts to clear any
unexpected IRQs by writing IREG, but for those that cannot be cleared
in this way then the IRQ will just repeatedly fire.

This was resulting in lockups after a while of using Wi-Fi on the
CI20 (GitHub issue #19).

Resolve by holding the spinlock until after the IMASK register has
been updated.

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
---
 drivers/mmc/host/jz4740_mmc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

James Hogan March 9, 2018, 10:49 p.m. UTC | #1
On Fri, Mar 09, 2018 at 12:12:14PM -0300, Ezequiel Garcia wrote:
> From: Alex Smith <alex.smith@imgtec.com>
> 
> A spinlock is held while updating the internal copy of the IRQ mask,
> but not while writing it to the actual IMASK register. After the lock
> is released, an IRQ can occur before the IMASK register is written.
> If handling this IRQ causes the mask to be changed, when the handler
> returns back to the middle of the first mask update, a stale value
> will be written to the mask register.
> 
> If this causes an IRQ to become unmasked that cannot have its status
> cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ,
> then we can end up stuck with the same IRQ repeatedly being fired but
> not handled. Normally the MMC IRQ handler attempts to clear any
> unexpected IRQs by writing IREG, but for those that cannot be cleared
> in this way then the IRQ will just repeatedly fire.
> 
> This was resulting in lockups after a while of using Wi-Fi on the
> CI20 (GitHub issue #19).
> 
> Resolve by holding the spinlock until after the IMASK register has
> been updated.
> 

Maybe have a Link tag instead of referencing "github issue #19", i.e.:
Link: https://github.com/MIPS/CI20_linux/issues/19

Since this fixes an older commit, it'd be worth adding:

Fixes: 61bfbdb85687 ("MMC: Add support for the controller on JZ4740 SoCs.")

> Signed-off-by: Alex Smith <alex.smith@imgtec.com>

... and presumably is worthy of backporting (the driver was introduced
in 2.6.36), i.e.:
Cc: stable@vger.kernel.org

Cheers
James
Ezequiel Garcia March 10, 2018, 10:46 p.m. UTC | #2
On 9 March 2018 at 19:49, James Hogan <jhogan@kernel.org> wrote:
> On Fri, Mar 09, 2018 at 12:12:14PM -0300, Ezequiel Garcia wrote:
>> From: Alex Smith <alex.smith@imgtec.com>
>>
>> A spinlock is held while updating the internal copy of the IRQ mask,
>> but not while writing it to the actual IMASK register. After the lock
>> is released, an IRQ can occur before the IMASK register is written.
>> If handling this IRQ causes the mask to be changed, when the handler
>> returns back to the middle of the first mask update, a stale value
>> will be written to the mask register.
>>
>> If this causes an IRQ to become unmasked that cannot have its status
>> cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ,
>> then we can end up stuck with the same IRQ repeatedly being fired but
>> not handled. Normally the MMC IRQ handler attempts to clear any
>> unexpected IRQs by writing IREG, but for those that cannot be cleared
>> in this way then the IRQ will just repeatedly fire.
>>
>> This was resulting in lockups after a while of using Wi-Fi on the
>> CI20 (GitHub issue #19).
>>
>> Resolve by holding the spinlock until after the IMASK register has
>> been updated.
>>
>
> Maybe have a Link tag instead of referencing "github issue #19", i.e.:
> Link: https://github.com/MIPS/CI20_linux/issues/19
>
> Since this fixes an older commit, it'd be worth adding:
>
> Fixes: 61bfbdb85687 ("MMC: Add support for the controller on JZ4740 SoCs.")
>
>> Signed-off-by: Alex Smith <alex.smith@imgtec.com>
>
> ... and presumably is worthy of backporting (the driver was introduced
> in 2.6.36), i.e.:
> Cc: stable@vger.kernel.org
>

Oh, yes, good catch. I've overlooked this commit.

Thanks,
diff mbox

Patch

diff --git a/drivers/mmc/host/jz4740_mmc.c b/drivers/mmc/host/jz4740_mmc.c
index bb1b9114ef53..55587d798d47 100644
--- a/drivers/mmc/host/jz4740_mmc.c
+++ b/drivers/mmc/host/jz4740_mmc.c
@@ -410,13 +410,15 @@  static void jz4740_mmc_set_irq_enabled(struct jz4740_mmc_host *host,
 	unsigned long flags;
 
 	spin_lock_irqsave(&host->lock, flags);
+
 	if (enabled)
 		host->irq_mask &= ~irq;
 	else
 		host->irq_mask |= irq;
-	spin_unlock_irqrestore(&host->lock, flags);
 
 	host->write_irq_mask(host, host->irq_mask);
+
+	spin_unlock_irqrestore(&host->lock, flags);
 }
 
 static void jz4740_mmc_clock_enable(struct jz4740_mmc_host *host,