Message ID | 20250228140420.379498-1-andreas@kemnade.info (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] i2c: omap: fix IRQ storms | expand |
On 15:04-20250228, Andreas Kemnade wrote: > On the GTA04A5 writing a reset command to the gyroscope causes IRQ > storms because NACK IRQs are enabled and therefore triggered but not > acked. > > Sending a reset command to the gyroscope by > i2cset 1 0x69 0x14 0xb6 > with an additional debug print in the ISR (not the thread) itself > causes > > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00 > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1 > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110) > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010) > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > repeating till infinity > [...] > (0x2 = NACK, 0x100 = Bus free, which is not enabled) > Apparently no other IRQ bit gets set, so this stalls. > > Do not ignore enabled interrupts and make sure they are acked. > If the NACK IRQ is not needed, it should simply not enabled, but > according to the above log, caring about it is necessary unless > the Bus free IRQ is enabled and handled. The assumption that is > will always come with a ARDY IRQ, which was the idea behind > ignoring it, proves wrong. > It is true for simple reads from an unused address. > > To still avoid the i2cdetect trouble which is the reason for > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"), > avoid doing much about NACK in omap_i2c_xfer_data() which is used > by both IRQ mode and polling mode, so also the false detection fix > is extended to polling usage and IRQ storms are avoided. > > By changing this, the hardirq handler is not needed anymore to filter > stuff. > > The mentioned gyro reset now just causes a -ETIMEDOUT instead of > hanging the system. > > Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"). > CC: <stable@kernel.org> > Signed-off-by: Andreas Kemnade <andreas@kemnade.info> > --- > This needs at least to be tested on systems where false acks were > detected. At least on BeaglePlay, I have not been able to reproduce the original bug which was the trigger for commit c770657bd261 I also ran basic boot tests on other K3 platforms and none seem to show regressions at the very least. Tested-by: Nishanth Menon <nm@ti.com> > > drivers/i2c/busses/i2c-omap.c | 26 +++++++------------------- > 1 file changed, 7 insertions(+), 19 deletions(-) > > diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c > index 92faf03d64cf..f18c3e74b076 100644 > --- a/drivers/i2c/busses/i2c-omap.c > +++ b/drivers/i2c/busses/i2c-omap.c > @@ -1048,23 +1048,6 @@ static int omap_i2c_transmit_data(struct omap_i2c_dev *omap, u8 num_bytes, > return 0; > } > > -static irqreturn_t > -omap_i2c_isr(int irq, void *dev_id) > -{ > - struct omap_i2c_dev *omap = dev_id; > - irqreturn_t ret = IRQ_HANDLED; > - u16 mask; > - u16 stat; > - > - stat = omap_i2c_read_reg(omap, OMAP_I2C_STAT_REG); > - mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG) & ~OMAP_I2C_STAT_NACK; > - > - if (stat & mask) > - ret = IRQ_WAKE_THREAD; > - > - return ret; > -} > - > static int omap_i2c_xfer_data(struct omap_i2c_dev *omap) > { > u16 bits; > @@ -1095,8 +1078,13 @@ static int omap_i2c_xfer_data(struct omap_i2c_dev *omap) > } > > if (stat & OMAP_I2C_STAT_NACK) { > - err |= OMAP_I2C_STAT_NACK; > + omap->cmd_err |= OMAP_I2C_STAT_NACK; > omap_i2c_ack_stat(omap, OMAP_I2C_STAT_NACK); > + > + if (!(stat & ~OMAP_I2C_STAT_NACK)) { > + err = -EAGAIN; > + break; > + } > } > > if (stat & OMAP_I2C_STAT_AL) { > @@ -1472,7 +1460,7 @@ omap_i2c_probe(struct platform_device *pdev) > IRQF_NO_SUSPEND, pdev->name, omap); > else > r = devm_request_threaded_irq(&pdev->dev, omap->irq, > - omap_i2c_isr, omap_i2c_isr_thread, > + NULL, omap_i2c_isr_thread, > IRQF_NO_SUSPEND | IRQF_ONESHOT, > pdev->name, omap); > > -- > 2.39.5 > >
> This needs at least to be tested on systems where false acks were > detected. Which do you mean? You did test this on GTA04A5, or?
Am Tue, 11 Mar 2025 19:29:04 +0100 schrieb Wolfram Sang <wsa+renesas@sang-engineering.com>: > > This needs at least to be tested on systems where false acks were > > detected. > > Which do you mean? You did test this on GTA04A5, or? > Exactly the tests which Nishanth did. So I would say with his Tested-By, this patch is good to go. I test on GTA04A5 but there and on any other system I have I did not observe the false acks but there I have the IRQ storm. And I want a solution which avoids the IRQ storm and also does not reintroduce the false acks. Regards, Andreas
> Exactly the tests which Nishanth did. So I would say with his Tested-By, > this patch is good to go. This is what I wanted to know. Thanks for the heads up!
Hi, On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote: > On 15:04-20250228, Andreas Kemnade wrote: > > On the GTA04A5 writing a reset command to the gyroscope causes IRQ > > storms because NACK IRQs are enabled and therefore triggered but not > > acked. > > > > Sending a reset command to the gyroscope by > > i2cset 1 0x69 0x14 0xb6 > > with an additional debug print in the ISR (not the thread) itself > > causes > > > > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00 > > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1 > > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110) > > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010) > > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > repeating till infinity > > [...] > > (0x2 = NACK, 0x100 = Bus free, which is not enabled) > > Apparently no other IRQ bit gets set, so this stalls. > > > > Do not ignore enabled interrupts and make sure they are acked. > > If the NACK IRQ is not needed, it should simply not enabled, but > > according to the above log, caring about it is necessary unless > > the Bus free IRQ is enabled and handled. The assumption that is > > will always come with a ARDY IRQ, which was the idea behind > > ignoring it, proves wrong. > > It is true for simple reads from an unused address. > > > > To still avoid the i2cdetect trouble which is the reason for > > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"), > > avoid doing much about NACK in omap_i2c_xfer_data() which is used > > by both IRQ mode and polling mode, so also the false detection fix > > is extended to polling usage and IRQ storms are avoided. > > > > By changing this, the hardirq handler is not needed anymore to filter > > stuff. > > > > The mentioned gyro reset now just causes a -ETIMEDOUT instead of > > hanging the system. > > > > Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"). > > CC: <stable@kernel.org> > > Signed-off-by: Andreas Kemnade <andreas@kemnade.info> > > --- > > This needs at least to be tested on systems where false acks were > > detected. > > At least on BeaglePlay, I have not been able to reproduce the original > bug which was the trigger for commit c770657bd261 > > I also ran basic boot tests on other K3 platforms and none seem to show > regressions at the very least. > > Tested-by: Nishanth Menon <nm@ti.com> Thanks for testing it! I asked some OMAP folks to check this patch, but no one took action. With Nishanth's test, I can now sleep soundly. :-) Merged to i2c/i2c-host-fixes. Thanks, Andi
Hi, On 12/03/25 03:55, Andi Shyti wrote: > Hi, > > On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote: >> On 15:04-20250228, Andreas Kemnade wrote: >>> On the GTA04A5 writing a reset command to the gyroscope causes IRQ >>> storms because NACK IRQs are enabled and therefore triggered but not >>> acked. >>> >>> Sending a reset command to the gyroscope by >>> i2cset 1 0x69 0x14 0xb6 >>> with an additional debug print in the ISR (not the thread) itself >>> causes >>> >>> [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00 >>> [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1 >>> [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110) >>> [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010) >>> [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) >>> [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) >>> [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) >>> repeating till infinity >>> [...] >>> (0x2 = NACK, 0x100 = Bus free, which is not enabled) >>> Apparently no other IRQ bit gets set, so this stalls. >>> >>> Do not ignore enabled interrupts and make sure they are acked. >>> If the NACK IRQ is not needed, it should simply not enabled, but >>> according to the above log, caring about it is necessary unless >>> the Bus free IRQ is enabled and handled. The assumption that is >>> will always come with a ARDY IRQ, which was the idea behind >>> ignoring it, proves wrong. >>> It is true for simple reads from an unused address. >>> >>> To still avoid the i2cdetect trouble which is the reason for >>> commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"), >>> avoid doing much about NACK in omap_i2c_xfer_data() which is used >>> by both IRQ mode and polling mode, so also the false detection fix >>> is extended to polling usage and IRQ storms are avoided. >>> >>> By changing this, the hardirq handler is not needed anymore to filter >>> stuff. >>> >>> The mentioned gyro reset now just causes a -ETIMEDOUT instead of >>> hanging the system. >>> >>> Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"). >>> CC: <stable@kernel.org> >>> Signed-off-by: Andreas Kemnade <andreas@kemnade.info> >>> --- >>> This needs at least to be tested on systems where false acks were >>> detected. >> >> At least on BeaglePlay, I have not been able to reproduce the original >> bug which was the trigger for commit c770657bd261 >> >> I also ran basic boot tests on other K3 platforms and none seem to show >> regressions at the very least. >> >> Tested-by: Nishanth Menon <nm@ti.com> > > Thanks for testing it! I asked some OMAP folks to check this > patch, but no one took action. With Nishanth's test, I can now > sleep soundly. :-) > > Merged to i2c/i2c-host-fixes. > > Thanks, > Andi > I see that the patch got merged so don't know if this is useful at all at this point, but yeah looks good to me. Apologies for the slow response. Nishanth, Thanks for testing it too! Reviewed-by: Aniket Limaye <a-limaye@ti.com> Thanks, Aniket
Hi Aniket, On Wed, Mar 12, 2025 at 02:55:38PM +0530, Aniket Limaye wrote: > On 12/03/25 03:55, Andi Shyti wrote: > > On Tue, Mar 11, 2025 at 07:39:47AM -0500, Nishanth Menon wrote: > > > On 15:04-20250228, Andreas Kemnade wrote: > > > > On the GTA04A5 writing a reset command to the gyroscope causes IRQ > > > > storms because NACK IRQs are enabled and therefore triggered but not > > > > acked. > > > > > > > > Sending a reset command to the gyroscope by > > > > i2cset 1 0x69 0x14 0xb6 > > > > with an additional debug print in the ISR (not the thread) itself > > > > causes > > > > > > > > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00 > > > > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1 > > > > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110) > > > > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010) > > > > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > > > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > > > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) > > > > repeating till infinity > > > > [...] > > > > (0x2 = NACK, 0x100 = Bus free, which is not enabled) > > > > Apparently no other IRQ bit gets set, so this stalls. > > > > > > > > Do not ignore enabled interrupts and make sure they are acked. > > > > If the NACK IRQ is not needed, it should simply not enabled, but > > > > according to the above log, caring about it is necessary unless > > > > the Bus free IRQ is enabled and handled. The assumption that is > > > > will always come with a ARDY IRQ, which was the idea behind > > > > ignoring it, proves wrong. > > > > It is true for simple reads from an unused address. > > > > > > > > To still avoid the i2cdetect trouble which is the reason for > > > > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"), > > > > avoid doing much about NACK in omap_i2c_xfer_data() which is used > > > > by both IRQ mode and polling mode, so also the false detection fix > > > > is extended to polling usage and IRQ storms are avoided. > > > > > > > > By changing this, the hardirq handler is not needed anymore to filter > > > > stuff. > > > > > > > > The mentioned gyro reset now just causes a -ETIMEDOUT instead of > > > > hanging the system. > > > > > > > > Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"). > > > > CC: <stable@kernel.org> > > > > Signed-off-by: Andreas Kemnade <andreas@kemnade.info> > > > > --- > > > > This needs at least to be tested on systems where false acks were > > > > detected. > > > > > > At least on BeaglePlay, I have not been able to reproduce the original > > > bug which was the trigger for commit c770657bd261 > > > > > > I also ran basic boot tests on other K3 platforms and none seem to show > > > regressions at the very least. > > > > > > Tested-by: Nishanth Menon <nm@ti.com> > > > > Thanks for testing it! I asked some OMAP folks to check this > > patch, but no one took action. With Nishanth's test, I can now > > sleep soundly. :-) > > > > Merged to i2c/i2c-host-fixes. > > > > Thanks, > > Andi > > > > I see that the patch got merged so don't know if this is useful at all at > this point, but yeah looks good to me. Apologies for the slow response. > Nishanth, Thanks for testing it too! > > Reviewed-by: Aniket Limaye <a-limaye@ti.com> thanks for your review, I added it. Andi
diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c index 92faf03d64cf..f18c3e74b076 100644 --- a/drivers/i2c/busses/i2c-omap.c +++ b/drivers/i2c/busses/i2c-omap.c @@ -1048,23 +1048,6 @@ static int omap_i2c_transmit_data(struct omap_i2c_dev *omap, u8 num_bytes, return 0; } -static irqreturn_t -omap_i2c_isr(int irq, void *dev_id) -{ - struct omap_i2c_dev *omap = dev_id; - irqreturn_t ret = IRQ_HANDLED; - u16 mask; - u16 stat; - - stat = omap_i2c_read_reg(omap, OMAP_I2C_STAT_REG); - mask = omap_i2c_read_reg(omap, OMAP_I2C_IE_REG) & ~OMAP_I2C_STAT_NACK; - - if (stat & mask) - ret = IRQ_WAKE_THREAD; - - return ret; -} - static int omap_i2c_xfer_data(struct omap_i2c_dev *omap) { u16 bits; @@ -1095,8 +1078,13 @@ static int omap_i2c_xfer_data(struct omap_i2c_dev *omap) } if (stat & OMAP_I2C_STAT_NACK) { - err |= OMAP_I2C_STAT_NACK; + omap->cmd_err |= OMAP_I2C_STAT_NACK; omap_i2c_ack_stat(omap, OMAP_I2C_STAT_NACK); + + if (!(stat & ~OMAP_I2C_STAT_NACK)) { + err = -EAGAIN; + break; + } } if (stat & OMAP_I2C_STAT_AL) { @@ -1472,7 +1460,7 @@ omap_i2c_probe(struct platform_device *pdev) IRQF_NO_SUSPEND, pdev->name, omap); else r = devm_request_threaded_irq(&pdev->dev, omap->irq, - omap_i2c_isr, omap_i2c_isr_thread, + NULL, omap_i2c_isr_thread, IRQF_NO_SUSPEND | IRQF_ONESHOT, pdev->name, omap);
On the GTA04A5 writing a reset command to the gyroscope causes IRQ storms because NACK IRQs are enabled and therefore triggered but not acked. Sending a reset command to the gyroscope by i2cset 1 0x69 0x14 0xb6 with an additional debug print in the ISR (not the thread) itself causes [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00 [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1 [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110) [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010) [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102) repeating till infinity [...] (0x2 = NACK, 0x100 = Bus free, which is not enabled) Apparently no other IRQ bit gets set, so this stalls. Do not ignore enabled interrupts and make sure they are acked. If the NACK IRQ is not needed, it should simply not enabled, but according to the above log, caring about it is necessary unless the Bus free IRQ is enabled and handled. The assumption that is will always come with a ARDY IRQ, which was the idea behind ignoring it, proves wrong. It is true for simple reads from an unused address. To still avoid the i2cdetect trouble which is the reason for commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"), avoid doing much about NACK in omap_i2c_xfer_data() which is used by both IRQ mode and polling mode, so also the false detection fix is extended to polling usage and IRQ storms are avoided. By changing this, the hardirq handler is not needed anymore to filter stuff. The mentioned gyro reset now just causes a -ETIMEDOUT instead of hanging the system. Fixes: c770657bd261 ("i2c: omap: Fix standard mode false ACK readings"). CC: <stable@kernel.org> Signed-off-by: Andreas Kemnade <andreas@kemnade.info> --- This needs at least to be tested on systems where false acks were detected. drivers/i2c/busses/i2c-omap.c | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-)