[v2] i2c: omap: fix IRQ storms

Message ID 20250228140420.379498-1-andreas@kemnade.info (mailing list archive)
State New

Commit Message

Andreas Kemnade Feb. 28, 2025, 2:04 p.m. UTC
On the GTA04A5 writing a reset command to the gyroscope causes IRQ
storms because NACK IRQs are enabled and therefore triggered but not
acked.

Sending a reset command to the gyroscope by
i2cset 1 0x69 0x14 0xb6
with an additional debug print in the ISR (not the thread) itself
causes

[ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
[ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
[ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
[ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
[ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
[ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
[ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
repeating till infinity
[...]
(0x2 = NACK, 0x100 = Bus free, which is not enabled)
Apparently no other IRQ bit gets set, so this stalls.

Do not ignore enabled interrupts and make sure they are acked.
If the NACK IRQ is not needed, it should simply not be enabled, but
according to the above log, caring about it is necessary unless
the Bus free IRQ is enabled and handled. The assumption that it
will always come with an ARDY IRQ, which was the idea behind
ignoring it, proves wrong.
It is true for simple reads from an unused address.

To still avoid the i2cdetect trouble which is the reason for
commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"),
avoid doing much about NACK in omap_i2c_xfer_data() which is used
by both IRQ mode and polling mode, so also the false detection fix
is extended to polling usage and IRQ storms are avoided.

By changing this, the hardirq handler is not needed anymore to filter
stuff.

The mentioned gyro reset now just causes a -ETIMEDOUT instead of
hanging the system.

Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings")
CC: <stable@kernel.org>
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
---
This needs at least to be tested on systems where false acks were
detected.

 drivers/i2c/busses/i2c-omap.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

Comments

Nishanth Menon March 11, 2025, 12:39 p.m. UTC | #1
On 15:04-20250228, Andreas Kemnade wrote:
> On the GTA04A5 writing a reset command to the gyroscope causes IRQ
> storms because NACK IRQs are enabled and therefore triggered but not
> acked.
> 
> Sending a reset command to the gyroscope by
> i2cset 1 0x69 0x14 0xb6
> with an additional debug print in the ISR (not the thread) itself
> causes
> 
> [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
> [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
> [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
> [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
> [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> repeating till infinity
> [...]
> (0x2 = NACK, 0x100 = Bus free, which is not enabled)
> Apparently no other IRQ bit gets set, so this stalls.
> 
> Do not ignore enabled interrupts and make sure they are acked.
> If the NACK IRQ is not needed, it should simply not be enabled, but
> according to the above log, caring about it is necessary unless
> the Bus free IRQ is enabled and handled. The assumption that it
> will always come with an ARDY IRQ, which was the idea behind
> ignoring it, proves wrong.
> It is true for simple reads from an unused address.
> 
> To still avoid the i2cdetect trouble which is the reason for
> commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"),
> avoid doing much about NACK in omap_i2c_xfer_data() which is used
> by both IRQ mode and polling mode, so also the false detection fix
> is extended to polling usage and IRQ storms are avoided.
> 
> By changing this, the hardirq handler is not needed anymore to filter
> stuff.
> 
> The mentioned gyro reset now just causes a -ETIMEDOUT instead of
> hanging the system.
> 
> Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").
> CC: <stable@kernel.org>
> Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
> ---
> This needs at least to be tested on systems where false acks were
> detected.

At least on BeaglePlay, I have not been able to reproduce the original
bug which was the trigger for commit c770657bd261

I also ran basic boot tests on other K3 platforms and none seem to show
regressions at the very least.

Tested-by: Nishanth Menon <nm@ti.com>

> 
>  drivers/i2c/busses/i2c-omap.c | 26 +++++++-------------------
>  1 file changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> index 92faf03d64cf..f18c3e74b076 100644
> --- a/drivers/i2c/busses/i2c-omap.c
> +++ b/drivers/i2c/busses/i2c-omap.c
> @@ -1048,23 +1048,6 @@ static int omap_i2c_transmit_data(struct omap_i2c_dev *omap, u8 num_bytes,
>  	return 0;
>  }
>  
> -static irqreturn_t
> -omap_i2c_isr(int irq, void *dev_id)
> -{
> -	struct omap_i2c_dev *omap = dev_id;
> -	irqreturn_t ret = IRQ_HANDLED;
> -	u16 mask;
> -	u16 stat;
> -
> -	stat = omap_i2c_read_reg(omap, OMAP_I2C_STAT_REG);
> -	mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG) & ~OMAP_I2C_STAT_NACK;
> -
> -	if (stat & mask)
> -		ret = IRQ_WAKE_THREAD;
> -
> -	return ret;
> -}
> -
>  static int omap_i2c_xfer_data(struct omap_i2c_dev *omap)
>  {
>  	u16 bits;
> @@ -1095,8 +1078,13 @@ static int omap_i2c_xfer_data(struct omap_i2c_dev *omap)
>  		}
>  
>  		if (stat & OMAP_I2C_STAT_NACK) {
> -			err |= OMAP_I2C_STAT_NACK;
> +			omap->cmd_err |= OMAP_I2C_STAT_NACK;
>  			omap_i2c_ack_stat(omap, OMAP_I2C_STAT_NACK);
> +
> +			if (!(stat & ~OMAP_I2C_STAT_NACK)) {
> +				err = -EAGAIN;
> +				break;
> +			}
>  		}
>  
>  		if (stat & OMAP_I2C_STAT_AL) {
> @@ -1472,7 +1460,7 @@ omap_i2c_probe(struct platform_device *pdev)
>  				IRQF_NO_SUSPEND, pdev->name, omap);
>  	else
>  		r = devm_request_threaded_irq(&pdev->dev, omap->irq,
> -				omap_i2c_isr, omap_i2c_isr_thread,
> +				NULL, omap_i2c_isr_thread,
>  				IRQF_NO_SUSPEND | IRQF_ONESHOT,
>  				pdev->name, omap);
>  
> -- 
> 2.39.5
> 
>
Wolfram Sang March 11, 2025, 6:29 p.m. UTC | #2
> This needs at least to be tested on systems where false acks were
> detected.

Which systems do you mean? You did test this on the GTA04A5, didn't you?
Andreas Kemnade March 11, 2025, 7:14 p.m. UTC | #3
Am Tue, 11 Mar 2025 19:29:04 +0100
schrieb Wolfram Sang <wsa+renesas@sang-engineering.com>:

> > This needs at least to be tested on systems where false acks were
> > detected.  
> 
> Which systems do you mean? You did test this on the GTA04A5, didn't you?
> 
Exactly the tests which Nishanth did. So I would say with his Tested-By,
this patch is good to go. I tested on the GTA04A5, but neither there nor
on any other system I have did I observe the false acks; what I do see
there is the IRQ storm. I want a solution which avoids the IRQ storm and
also does not reintroduce the false acks.

Regards,
Andreas
Wolfram Sang March 11, 2025, 9:12 p.m. UTC | #4
> Exactly the tests which Nishanth did. So I would say with his Tested-By,
> this patch is good to go.

This is what I wanted to know. Thanks for the heads up!
Andi Shyti March 11, 2025, 10:25 p.m. UTC | #5
Hi,

On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote:
> On 15:04-20250228, Andreas Kemnade wrote:
> > On the GTA04A5 writing a reset command to the gyroscope causes IRQ
> > storms because NACK IRQs are enabled and therefore triggered but not
> > acked.
> > 
> > Sending a reset command to the gyroscope by
> > i2cset 1 0x69 0x14 0xb6
> > with an additional debug print in the ISR (not the thread) itself
> > causes
> > 
> > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
> > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
> > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
> > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
> > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > repeating till infinity
> > [...]
> > (0x2 = NACK, 0x100 = Bus free, which is not enabled)
> > Apparently no other IRQ bit gets set, so this stalls.
> > 
> > Do not ignore enabled interrupts and make sure they are acked.
> > If the NACK IRQ is not needed, it should simply not be enabled, but
> > according to the above log, caring about it is necessary unless
> > the Bus free IRQ is enabled and handled. The assumption that it
> > will always come with an ARDY IRQ, which was the idea behind
> > ignoring it, proves wrong.
> > It is true for simple reads from an unused address.
> > 
> > To still avoid the i2cdetect trouble which is the reason for
> > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"),
> > avoid doing much about NACK in omap_i2c_xfer_data() which is used
> > by both IRQ mode and polling mode, so also the false detection fix
> > is extended to polling usage and IRQ storms are avoided.
> > 
> > By changing this, the hardirq handler is not needed anymore to filter
> > stuff.
> > 
> > The mentioned gyro reset now just causes a -ETIMEDOUT instead of
> > hanging the system.
> > 
> > Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").
> > CC: <stable@kernel.org>
> > Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
> > ---
> > This needs at least to be tested on systems where false acks were
> > detected.
> 
> At least on BeaglePlay, I have not been able to reproduce the original
> bug which was the trigger for commit c770657bd261
> 
> I also ran basic boot tests on other K3 platforms and none seem to show
> regressions at the very least.
> 
> Tested-by: Nishanth Menon <nm@ti.com>

Thanks for testing it! I asked some OMAP folks to check this
patch, but no one took action. With Nishanth's test, I can now
sleep soundly. :-)

Merged to i2c/i2c-host-fixes.

Thanks,
Andi
Aniket Limaye March 12, 2025, 9:25 a.m. UTC | #6
Hi,

On 12/03/25 03:55, Andi Shyti wrote:
> Hi,
> 
> On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote:
>> On 15:04-20250228, Andreas Kemnade wrote:
>>> On the GTA04A5 writing a reset command to the gyroscope causes IRQ
>>> storms because NACK IRQs are enabled and therefore triggered but not
>>> acked.
>>>
>>> Sending a reset command to the gyroscope by
>>> i2cset 1 0x69 0x14 0xb6
>>> with an additional debug print in the ISR (not the thread) itself
>>> causes
>>>
>>> [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
>>> [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
>>> [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
>>> [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
>>> [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
>>> [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
>>> [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
>>> repeating till infinity
>>> [...]
>>> (0x2 = NACK, 0x100 = Bus free, which is not enabled)
>>> Apparently no other IRQ bit gets set, so this stalls.
>>>
>>> Do not ignore enabled interrupts and make sure they are acked.
>>> If the NACK IRQ is not needed, it should simply not be enabled, but
>>> according to the above log, caring about it is necessary unless
>>> the Bus free IRQ is enabled and handled. The assumption that it
>>> will always come with an ARDY IRQ, which was the idea behind
>>> ignoring it, proves wrong.
>>> It is true for simple reads from an unused address.
>>>
>>> To still avoid the i2cdetect trouble which is the reason for
>>> commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"),
>>> avoid doing much about NACK in omap_i2c_xfer_data() which is used
>>> by both IRQ mode and polling mode, so also the false detection fix
>>> is extended to polling usage and IRQ storms are avoided.
>>>
>>> By changing this, the hardirq handler is not needed anymore to filter
>>> stuff.
>>>
>>> The mentioned gyro reset now just causes a -ETIMEDOUT instead of
>>> hanging the system.
>>>
>>> Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").
>>> CC: <stable@kernel.org>
>>> Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
>>> ---
>>> This needs at least to be tested on systems where false acks were
>>> detected.
>>
>> At least on BeaglePlay, I have not been able to reproduce the original
>> bug which was the trigger for commit c770657bd261
>>
>> I also ran basic boot tests on other K3 platforms and none seem to show
>> regressions at the very least.
>>
>> Tested-by: Nishanth Menon <nm@ti.com>
> 
> Thanks for testing it! I asked some OMAP folks to check this
> patch, but no one took action. With Nishanth's test, I can now
> sleep soundly. :-)
> 
> Merged to i2c/i2c-host-fixes.
> 
> Thanks,
> Andi
> 

I see that the patch got merged so don't know if this is useful at all 
at this point, but yeah looks good to me. Apologies for the slow 
response. Nishanth, Thanks for testing it too!

Reviewed-by: Aniket Limaye <a-limaye@ti.com>

Thanks,
Aniket
Andi Shyti March 12, 2025, 11:26 a.m. UTC | #7
Hi Aniket,

On Wed, Mar 12, 2025 at 02:55:38PM +0530, Aniket Limaye wrote:
> On 12/03/25 03:55, Andi Shyti wrote:
> > On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote:
> > > On 15:04-20250228, Andreas Kemnade wrote:
> > > > On the GTA04A5 writing a reset command to the gyroscope causes IRQ
> > > > storms because NACK IRQs are enabled and therefore triggered but not
> > > > acked.
> > > > 
> > > > Sending a reset command to the gyroscope by
> > > > i2cset 1 0x69 0x14 0xb6
> > > > with an additional debug print in the ISR (not the thread) itself
> > > > causes
> > > > 
> > > > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
> > > > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
> > > > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
> > > > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
> > > > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > repeating till infinity
> > > > [...]
> > > > (0x2 = NACK, 0x100 = Bus free, which is not enabled)
> > > > Apparently no other IRQ bit gets set, so this stalls.
> > > > 
> > > > Do not ignore enabled interrupts and make sure they are acked.
> > > > If the NACK IRQ is not needed, it should simply not be enabled, but
> > > > according to the above log, caring about it is necessary unless
> > > > the Bus free IRQ is enabled and handled. The assumption that it
> > > > will always come with an ARDY IRQ, which was the idea behind
> > > > ignoring it, proves wrong.
> > > > It is true for simple reads from an unused address.
> > > > 
> > > > To still avoid the i2cdetect trouble which is the reason for
> > > > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"),
> > > > avoid doing much about NACK in omap_i2c_xfer_data() which is used
> > > > by both IRQ mode and polling mode, so also the false detection fix
> > > > is extended to polling usage and IRQ storms are avoided.
> > > > 
> > > > By changing this, the hardirq handler is not needed anymore to filter
> > > > stuff.
> > > > 
> > > > The mentioned gyro reset now just causes a -ETIMEDOUT instead of
> > > > hanging the system.
> > > > 
> > > > Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").
> > > > CC: <stable@kernel.org>
> > > > Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
> > > > ---
> > > > This needs at least to be tested on systems where false acks were
> > > > detected.
> > > 
> > > At least on BeaglePlay, I have not been able to reproduce the original
> > > bug which was the trigger for commit c770657bd261
> > > 
> > > I also ran basic boot tests on other K3 platforms and none seem to show
> > > regressions at the very least.
> > > 
> > > Tested-by: Nishanth Menon <nm@ti.com>
> > 
> > Thanks for testing it! I asked some OMAP folks to check this
> > patch, but no one took action. With Nishanth's test, I can now
> > sleep soundly. :-)
> > 
> > Merged to i2c/i2c-host-fixes.
> > 
> > Thanks,
> > Andi
> > 
> 
> I see that the patch got merged so don't know if this is useful at all at
> this point, but yeah looks good to me. Apologies for the slow response.
> Nishanth, Thanks for testing it too!
> 
> Reviewed-by: Aniket Limaye <a-limaye@ti.com>

thanks for your review, I added it.

Andi

Patch

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 92faf03d64cf..f18c3e74b076 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -1048,23 +1048,6 @@  static int omap_i2c_transmit_data(struct omap_i2c_dev *omap, u8 num_bytes,
 	return 0;
 }
 
-static irqreturn_t
-omap_i2c_isr(int irq, void *dev_id)
-{
-	struct omap_i2c_dev *omap = dev_id;
-	irqreturn_t ret = IRQ_HANDLED;
-	u16 mask;
-	u16 stat;
-
-	stat = omap_i2c_read_reg(omap, OMAP_I2C_STAT_REG);
-	mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG) & ~OMAP_I2C_STAT_NACK;
-
-	if (stat & mask)
-		ret = IRQ_WAKE_THREAD;
-
-	return ret;
-}
-
 static int omap_i2c_xfer_data(struct omap_i2c_dev *omap)
 {
 	u16 bits;
@@ -1095,8 +1078,13 @@  static int omap_i2c_xfer_data(struct omap_i2c_dev *omap)
 		}
 
 		if (stat & OMAP_I2C_STAT_NACK) {
-			err |= OMAP_I2C_STAT_NACK;
+			omap->cmd_err |= OMAP_I2C_STAT_NACK;
 			omap_i2c_ack_stat(omap, OMAP_I2C_STAT_NACK);
+
+			if (!(stat & ~OMAP_I2C_STAT_NACK)) {
+				err = -EAGAIN;
+				break;
+			}
 		}
 
 		if (stat & OMAP_I2C_STAT_AL) {
@@ -1472,7 +1460,7 @@  omap_i2c_probe(struct platform_device *pdev)
 				IRQF_NO_SUSPEND, pdev->name, omap);
 	else
 		r = devm_request_threaded_irq(&pdev->dev, omap->irq,
-				omap_i2c_isr, omap_i2c_isr_thread,
+				NULL, omap_i2c_isr_thread,
 				IRQF_NO_SUSPEND | IRQF_ONESHOT,
 				pdev->name, omap);