diff mbox

[2/2] i2c: i2c-omap: Call request_irq with IRQF_DISABLED

Message ID 7d7e7dd1a4c64c732a21bdfcf2bd42556be708c3.1236345858.git.Ext-Ari.Kauppi@nokia.com (mailing list archive)
State Awaiting Upstream, archived
Headers show

Commit Message

Ari Kauppi March 6, 2009, 1:34 p.m. UTC
I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
(and thus LOCKDEP) are disabled.

Applying Richard Woodruff's 'I2C bug fixes for L-O and L-Z' seems to help
but IRQF_DISABLED is needed for proper behaviour.

Signed-off-by: Ari Kauppi <Ext-Ari.Kauppi@nokia.com>
Acked-by: Felipe Balbi <felipe.balbi@nokia.com>
---
 drivers/i2c/busses/i2c-omap.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Woodruff, Richard March 6, 2009, 2:54 p.m. UTC | #1
> I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> (and thus LOCKDEP) are disabled.
>
> Applying Richard Woodruff's 'I2C bug fixes for L-O and L-Z' seems to help
> but IRQF_DISABLED is needed for proper behaviour.

I only submitted on l-o list for information and comment.

They need to be submitted on i2c list and to Ben to get added.

Thanks for the additional positive feed back and fix.

Regards,
Richard W.
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
ben@fluff.org.uk March 10, 2009, 12:52 a.m. UTC | #2
On Fri, Mar 06, 2009 at 03:34:54PM +0200, Ari Kauppi wrote:
> I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> (and thus LOCKDEP) are disabled.
> 
> Applying Richard Woodruff's 'I2C bug fixes for L-O and L-Z' seems to help
> but IRQF_DISABLED is needed for proper behaviour.
> 
> Signed-off-by: Ari Kauppi <Ext-Ari.Kauppi@nokia.com>
> Acked-by: Felipe Balbi <felipe.balbi@nokia.com>
> ---
>  drivers/i2c/busses/i2c-omap.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> index 0c3ed41..18af43f 100644
> --- a/drivers/i2c/busses/i2c-omap.c
> +++ b/drivers/i2c/busses/i2c-omap.c
> @@ -847,7 +847,7 @@ omap_i2c_probe(struct platform_device *pdev)
>  	omap_i2c_init(dev);
>  
>  	isr = (dev->rev < OMAP_I2C_REV_2) ? omap_i2c_rev1_isr : omap_i2c_isr;
> -	r = request_irq(dev->irq, isr, 0, pdev->name, dev);
> +	r = request_irq(dev->irq, isr, IRQF_DISABLED, pdev->name, dev);

Is disabling all IRQs on the system the right thing to do?
  
>  	if (r) {
>  		dev_err(dev->dev, "failure requesting irq %i\n", dev->irq);
> -- 
> 1.5.6.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi March 11, 2009, 7:16 p.m. UTC | #3
On Tue, Mar 10, 2009 at 12:52:22AM +0000, Ben Dooks wrote:
> On Fri, Mar 06, 2009 at 03:34:54PM +0200, Ari Kauppi wrote:
> > I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> > (and thus LOCKDEP) are disabled.
> > 
> > Applying Richard Woodruff's 'I2C bug fixes for L-O and L-Z' seems to help
> > but IRQF_DISABLED is needed for proper behaviour.
> > 
> > Signed-off-by: Ari Kauppi <Ext-Ari.Kauppi@nokia.com>
> > Acked-by: Felipe Balbi <felipe.balbi@nokia.com>
> > ---
> >  drivers/i2c/busses/i2c-omap.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> > index 0c3ed41..18af43f 100644
> > --- a/drivers/i2c/busses/i2c-omap.c
> > +++ b/drivers/i2c/busses/i2c-omap.c
> > @@ -847,7 +847,7 @@ omap_i2c_probe(struct platform_device *pdev)
> >  	omap_i2c_init(dev);
> >  
> >  	isr = (dev->rev < OMAP_I2C_REV_2) ? omap_i2c_rev1_isr : omap_i2c_isr;
> > -	r = request_irq(dev->irq, isr, 0, pdev->name, dev);
> > +	r = request_irq(dev->irq, isr, IRQF_DISABLED, pdev->name, dev);
> 
> Is disabling all IRQs on the system the right thing to do?

if not we could fall into stack overflow, right ?
Paul Walmsley March 11, 2009, 11:55 p.m. UTC | #4
Hello Ari et al.,

On Tue, 10 Mar 2009, Ben Dooks wrote:

> On Fri, Mar 06, 2009 at 03:34:54PM +0200, Ari Kauppi wrote:
> > I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> > (and thus LOCKDEP) are disabled.
> > 
> > Applying Richard Woodruff's 'I2C bug fixes for L-O and L-Z' seems to help
> > but IRQF_DISABLED is needed for proper behaviour.
> > 
> > Signed-off-by: Ari Kauppi <Ext-Ari.Kauppi@nokia.com>
> > Acked-by: Felipe Balbi <felipe.balbi@nokia.com>
> > ---
> >  drivers/i2c/busses/i2c-omap.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> > index 0c3ed41..18af43f 100644
> > --- a/drivers/i2c/busses/i2c-omap.c
> > +++ b/drivers/i2c/busses/i2c-omap.c
> > @@ -847,7 +847,7 @@ omap_i2c_probe(struct platform_device *pdev)
> >  	omap_i2c_init(dev);
> >  
> >  	isr = (dev->rev < OMAP_I2C_REV_2) ? omap_i2c_rev1_isr : omap_i2c_isr;
> > -	r = request_irq(dev->irq, isr, 0, pdev->name, dev);
> > +	r = request_irq(dev->irq, isr, IRQF_DISABLED, pdev->name, dev);
> 
> Is disabling all IRQs on the system the right thing to do?

Ben's right, there shouldn't be any need for this.  This patch could cause 
some unnecessary interrupt service latency.

Ari, are you seeing "Spurious irq XX: XXXXXXXX, please flush posted write 
for irq" messages?  If so, the correct fix for this is to read from the 
device interrupt status register immediately after writing to it.  This 
forces the ARM to wait until the write to the device is complete.  Ari, 
could you make this change to i2c-omap.c:omap_i2c_isr() instead, and test 
whether this fixes the problem?

+ u32 tmp;

...

  omap_i2c_write_reg(dev, OMAP_I2C_STAT_REG, stat);
+ tmp = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG); /* OCP barrier */



- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi March 11, 2009, 11:59 p.m. UTC | #5
On Wed, Mar 11, 2009 at 05:55:50PM -0600, Paul Walmsley wrote:
> Ben's right, there shouldn't be any need for this.  This patch could cause
> some unnecessary interrupt service latency.

That's not what Thomas Gleixner thinks. How about the possibility of
stack overflow ?

According to Thomas (and Ingo, I'd say) all drivers should call
request_irq() with IRQF_DISABLED and that's gonna be true as soon as the
threaded irq handler support gets merged, if I'm not wrong.
Paul Walmsley March 12, 2009, 12:04 a.m. UTC | #6
Hi Ari,

One thing that I missed -

On Wed, 11 Mar 2009, Paul Walmsley wrote:

> > On Fri, Mar 06, 2009 at 03:34:54PM +0200, Ari Kauppi wrote:
> > > I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> > > (and thus LOCKDEP) are disabled.
> 
> Ari, are you seeing "Spurious irq XX: XXXXXXXX, please flush posted write 
> for irq" messages?  If so, the correct fix for this is to read from the 
> device interrupt status register immediately after writing to it.  This 
> forces the ARM to wait until the write to the device is complete.  Ari, 
> could you make this change to i2c-omap.c:omap_i2c_isr() instead, and test 
> whether this fixes the problem?
> 
> + u32 tmp;
> 
> ...
> 
>   omap_i2c_write_reg(dev, OMAP_I2C_STAT_REG, stat);
> + tmp = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG); /* OCP barrier */

You'll also want to make a similar change in omap_i2c_ack_stat(), to add a 
read immediately after that write.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley March 12, 2009, 12:07 a.m. UTC | #7
Hi Felipe,

On Thu, 12 Mar 2009, Felipe Balbi wrote:

> On Wed, Mar 11, 2009 at 05:55:50PM -0600, Paul Walmsley wrote:
> > Ben's right, there shouldn't be any need for this.  This patch could cause
> > some unnecessary interrupt service latency.
> 
> That's not what Thomas Gleixner thinks. How about the possibility of
> stack overflow ?

That sounds like a separate issue from the spurious IRQ problem that the 
patch was intended to fix.   

I'm not familiar with the discussion on the stack overflow issue.  Could 
you send a link?

> According to Thomas (and Ingo, I'd say) all drivers should call
> request_irq() with IRQF_DISABLED and that's gonna be true as soon as the
> threaded irq handler support gets merged, if I'm not wrong.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Contreras March 12, 2009, 12:11 a.m. UTC | #8
On Thu, Mar 12, 2009 at 1:59 AM, Felipe Balbi <me@felipebalbi.com> wrote:
> On Wed, Mar 11, 2009 at 05:55:50PM -0600, Paul Walmsley wrote:
>> Ben's right, there shouldn't be any need for this.  This patch could cause
>> some unnecessary interrupt service latency.
>
> That's not what Thomas Gleixner thinks. How about the possibility of
> stack overflow ?
>
> According to Thomas (and Ingo, I'd say) all drivers should call
> request_irq() with IRQF_DISABLED and that's gonna be true as soon as the
> threaded irq handler support gets merged, if I'm not wrong.

That's my understanding too, but I think it has always been true:
http://marc.info/?l=linux-kernel&m=123607685804562&w=2
Kevin Hilman March 12, 2009, 12:20 a.m. UTC | #9
Paul Walmsley <paul@pwsan.com> writes:

> Hi Felipe,
>
> On Thu, 12 Mar 2009, Felipe Balbi wrote:
>
>> On Wed, Mar 11, 2009 at 05:55:50PM -0600, Paul Walmsley wrote:
>> > Ben's right, there shouldn't be any need for this.  This patch could cause
>> > some unnecessary interrupt service latency.
>> 
>> That's not what Thomas Gleixner thinks. How about the possibility of
>> stack overflow ?
>
> That sounds like a separate issue from the spurious IRQ problem that the 
> patch was intended to fix.   

I agree.  The IRQF_DISABLED happens to fix this issue, but it may be
masking the real spurious issue as Paul suggested.

> I'm not familiar with the discussion on the stack overflow issue.  Could 
> you send a link?
>

Here's one:

  http://marc.info/?l=linux-kernel&m=123607359500596&w=2

Kevin

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi March 12, 2009, 12:23 a.m. UTC | #10
On Wed, Mar 11, 2009 at 06:07:00PM -0600, Paul Walmsley wrote:
> Hi Felipe,
> 
> On Thu, 12 Mar 2009, Felipe Balbi wrote:
> 
> > On Wed, Mar 11, 2009 at 05:55:50PM -0600, Paul Walmsley wrote:
> > > Ben's right, there shouldn't be any need for this.  This patch could cause
> > > some unnecessary interrupt service latency.
> > 
> > That's not what Thomas Gleixner thinks. How about the possibility of
> > stack overflow ?
> 
> That sounds like a separate issue from the spurious IRQ problem that the 
> patch was intended to fix.   
> 
> I'm not familiar with the discussion on the stack overflow issue.  Could 
> you send a link?

This is the link where Ingo discusses why !IRQF_DISABLED could cause
stack overflow:

http://lkml.org/lkml/2009/3/3/71

You probably wanna take a look at the whole thread to figure out the
discussion, but basically Ingo and Peter (Zijlstra) believe
!IRQF_DISABLED is a bug and drivers needing that probably should be
using threaded irqs (which is not yet merged) or the hw is broken, and
for those an IRQF_ENABLED flag will be created and a TAINT will also be
placed.

Once that gets merged, all drivers will be forced (at some point) to
IRQF_DISABLED and those which don't want that will be moved gradually to
threaded irq.

This patch is also interesting:

http://lkml.org/lkml/2009/3/2/33
Felipe Balbi March 12, 2009, 12:25 a.m. UTC | #11
On Thu, Mar 12, 2009 at 02:11:17AM +0200, Felipe Contreras wrote:
> That's my understanding too, but I think it has always been true:
> http://marc.info/?l=linux-kernel&m=123607685804562&w=2

sure, not merged yet though. And still the threaded irq support can't
handle twl4030 where the irq demuxing logic has to run in a thread due
to the need of communicating via i2c.
David Brownell March 12, 2009, 1:28 a.m. UTC | #12
On Wednesday 11 March 2009, Felipe Balbi wrote:
> According to Thomas (and Ingo, I'd say) all drivers should call
> request_irq() with IRQF_DISABLED and that's gonna be true as soon as the
> threaded irq handler support gets merged, if I'm not wrong.

Well, they're wrong.

Folk like Dave Miller, Andrew Morton, Benjamin Herrenschmdit,
and Alan Cox have come out on the saner side of that.


--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley March 12, 2009, 3:30 a.m. UTC | #13
Hi Felipe,

On Thu, 12 Mar 2009, Felipe Balbi wrote:

> This is the link where Ingo discusses why !IRQF_DISABLED could cause
> stack overflow:
> 
> http://lkml.org/lkml/2009/3/3/71
> 
> You probably wanna take a look at the whole thread to figure out the
> discussion, but basically Ingo and Peter (Zijlstra) believe
> !IRQF_DISABLED is a bug and drivers needing that probably should be
> using threaded irqs (which is not yet merged) or the hw is broken, and
> for those an IRQF_ENABLED flag will be created and a TAINT will also be
> placed.
> 
> Once that gets merged, all drivers will be forced (at some point) to
> IRQF_DISABLED and those which don't want that will be moved gradually to
> threaded irq.
> 
> This patch is also interesting:
> 
> http://lkml.org/lkml/2009/3/2/33

Thanks for the links.  Those issues are indeed distinct from the spurious 
IRQ problem that Ari is reporting.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ari Kauppi March 12, 2009, 6:26 a.m. UTC | #14
On Thu, 2009-03-12 at 01:04 +0100, ext Paul Walmsley wrote:
Hi Paul,

> On Wed, 11 Mar 2009, Paul Walmsley wrote:
> 
> > > On Fri, Mar 06, 2009 at 03:34:54PM +0200, Ari Kauppi wrote:
> > > > I have observed some Spurious IRQ's for I2C1 when all kernel hacking options
> > > > (and thus LOCKDEP) are disabled.
> > 
> > Ari, are you seeing "Spurious irq XX: XXXXXXXX, please flush posted write 
> > for irq" messages?  If so, the correct fix for this is to read from the 
> > device interrupt status register immediately after writing to it.  This 
> > forces the ARM to wait until the write to the device is complete.  Ari, 
> > could you make this change to i2c-omap.c:omap_i2c_isr() instead, and test 
> > whether this fixes the problem?
> > 
> > + u32 tmp;
> > 
> > ...
> > 
> >   omap_i2c_write_reg(dev, OMAP_I2C_STAT_REG, stat);
> > + tmp = omap_i2c_read_reg(dev, OMAP_I2C_STAT_REG); /* OCP barrier */
> 
> You'll also want to make a similar change in omap_i2c_ack_stat(), to add a 
> read immediately after that write.

I was seeing some Spurious irq's for the I2C1 (IRQ 56).

I'm aware of flushing posted write and the very first thing I tried was
to use read-after-write for OMAP_I2C_STAT_REG (in all applicable
places). However, it didn't make any difference.

Applying Richard Woodruff's patch (mentioned earlier in thread) that
disables dev->b_hw hack for 3430 (STT/STP bits written together) and
double clears ARDY fixed the spurious IRQ issues for I2C1.

However, with the STT/STP+ARDY patch I was seeing Spurious interrupts
all over the place and the IRQF_DISABLED in i2c-omap seemed to tame them
quite well. I do agree that my approach might not be the proper one in
long term.

Best regards,

--
Ari

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley March 12, 2009, 6:46 a.m. UTC | #15
Hello Ari,

On Thu, 12 Mar 2009, Kauppi Ari (EXT-Ixonos/Oulu) wrote:

> I was seeing some Spurious irq's for the I2C1 (IRQ 56).
> 
> I'm aware of flushing posted write and the very first thing I tried was
> to use read-after-write for OMAP_I2C_STAT_REG (in all applicable
> places). However, it didn't make any difference.

Strange.

> Applying Richard Woodruff's patch (mentioned earlier in thread) that
> disables dev->b_hw hack for 3430 (STT/STP bits written together) and
> double clears ARDY fixed the spurious IRQ issues for I2C1.
> 
> However, with the STT/STP+ARDY patch I was seeing Spurious interrupts
> all over the place and the IRQF_DISABLED in i2c-omap seemed to tame them
> quite well. I do agree that my approach might not be the proper one in
> long term.

Could you clarify this?  Do Richard's patches fix the spurious IRQs, or 
cause spurious interrupts to appear?


regards,

- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ari Kauppi March 12, 2009, 7:54 a.m. UTC | #16
On Thu, 2009-03-12 at 07:46 +0100, ext Paul Walmsley wrote:
> Hello Ari,
> 
> On Thu, 12 Mar 2009, Kauppi Ari (EXT-Ixonos/Oulu) wrote:
> 
> > I was seeing some Spurious irq's for the I2C1 (IRQ 56).
> > 
> > I'm aware of flushing posted write and the very first thing I tried was
> > to use read-after-write for OMAP_I2C_STAT_REG (in all applicable
> > places). However, it didn't make any difference.
> 
> Strange.

I agree.

> > Applying Richard Woodruff's patch (mentioned earlier in thread) that
> > disables dev->b_hw hack for 3430 (STT/STP bits written together) and
> > double clears ARDY fixed the spurious IRQ issues for I2C1.
> > 
> > However, with the STT/STP+ARDY patch I was seeing Spurious interrupts
> > all over the place and the IRQF_DISABLED in i2c-omap seemed to tame them
> > quite well. I do agree that my approach might not be the proper one in
> > long term.
> 
> Could you clarify this?  Do Richard's patches fix the spurious IRQs, or 
> cause spurious interrupts to appear?

My process was:

1) Take 2.6.28-based kernel on custom OMAP3430ES3.0 hardware with
CONFIG_DEBUG_LOCKDEP enabled. There are no spurious interrupts and
everything works.

2) Disable CONFIG_DEBUG_LOCKDEP and all other kernel debugging options.
Spurious interrupts start to appear on several IRQs, especially with IRQ
56 (I2C1).

3) Apply Richard's patch. All spurious interrupts for IRQ 56 are gone
but frequency of others increase.

4) Set IRQF_DISABLED in i2c-omap and the frequency of other spurious
interrupts decreases considerably. However, I'm starting to realize that
the real problem is probably elsewhere.

My test setup is pretty systematic, it does not have any user
interaction in it. I have a relay controlling the power to the device
and have taken logs of about 12000 boots with different kernel options
and patches applied.

Best regards,

--
Ari

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley March 12, 2009, 9:58 a.m. UTC | #17
On Thu, 12 Mar 2009, Kauppi Ari (EXT-Ixonos/Oulu) wrote:

> My process was:
> 
> 1) Take 2.6.28-based kernel on custom OMAP3430ES3.0 hardware with
> CONFIG_DEBUG_LOCKDEP enabled. There are no spurious interrupts and
> everything works.
> 
> 2) Disable CONFIG_DEBUG_LOCKDEP and all other kernel debugging options.
> Spurious interrupts start to appear on several IRQs, especially with IRQ
> 56 (I2C1).
> 
> 3) Apply Richard's patch. All spurious interrupts for IRQ 56 are gone
> but frequency of others increase.
> 
> 4) Set IRQF_DISABLED in i2c-omap and the frequency of other spurious
> interrupts decreases considerably. However, I'm starting to realize that
> the real problem is probably elsewhere.
> 
> My test setup is pretty systematic, it does not have any user
> interaction in it. I have a relay controlling the power to the device
> and have taken logs of about 12000 boots with different kernel options
> and patches applied.

Thanks for the details.  Can you extract the list of spurious IRQ warnings 
that you're getting, and post them?  I suspect that, like I2C, many of the 
driver ISRs are not reading back the device interrupt status registers 
after they clear them.


- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ari Kauppi March 12, 2009, 11:33 a.m. UTC | #18
On Thu, 2009-03-12 at 10:58 +0100, ext Paul Walmsley wrote:
> On Thu, 12 Mar 2009, Kauppi Ari (EXT-Ixonos/Oulu) wrote:
> 
> > My process was:
> > 
> > 1) Take 2.6.28-based kernel on custom OMAP3430ES3.0 hardware with
> > CONFIG_DEBUG_LOCKDEP enabled. There are no spurious interrupts and
> > everything works.
> > 
> > 2) Disable CONFIG_DEBUG_LOCKDEP and all other kernel debugging options.
> > Spurious interrupts start to appear on several IRQs, especially with IRQ
> > 56 (I2C1).
> > 
> > 3) Apply Richard's patch. All spurious interrupts for IRQ 56 are gone
> > but frequency of others increase.
> > 
> > 4) Set IRQF_DISABLED in i2c-omap and the frequency of other spurious
> > interrupts decreases considerably. However, I'm starting to realize that
> > the real problem is probably elsewhere.
> > 
> > My test setup is pretty systematic, it does not have any user
> > interaction in it. I have a relay controlling the power to the device
> > and have taken logs of about 12000 boots with different kernel options
> > and patches applied.
> 
> Thanks for the details.  Can you extract the list of spurious IRQ warnings 
> that you're getting, and post them?  I suspect that, like I2C, many of the 
> driver ISRs are not reading back the device interrupt status registers 
> after they clear them.

Sure,

Here is "grep -hoa 'write for irq.*' *.log | sort | uniq -c" from the
most problematic case (step 2 in above process). It should be noted that
the messages are always in pairs, ie. there are two consecutive messages
in the logs with only microseconds between them. I have divided the
counts reported by uniq -c by two in the list below.

Boot count: 6628
      1 write for irq 12
      1 write for irq 25
      1 write for irq 33
     10 write for irq 37
  29532 write for irq 56
     12 write for irq 67
      1 write for irq 71
    281 write for irq 73
    114 write for irq 83
    407 write for irq 86

I have also heard that with other use cases irq 17 and 21 should be in
the list, too. The single ones from 12,25,33,71 are probably just
one-offs and should not be taken too seriously, 37/67 are corner cases
but 73/83/86 are definitely valid measurements.

Best regards,

--
Ari

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Woodruff, Richard March 12, 2009, 3:04 p.m. UTC | #19
> From: linux-omap-owner@vger.kernel.org [mailto:linux-omap-
> owner@vger.kernel.org] On Behalf Of Kauppi Ari (EXT-Ixonos/Oulu)
> Sent: Thursday, March 12, 2009 6:34 AM

> > > 3) Apply Richard's patch. All spurious interrupts for IRQ 56 are gone
> > > but frequency of others increase.

The bugs fixed in my patch are function violation bugs. The TRM clearly requires the changes I made. In of themselves they don't have anything to do with spurious interrupts. Though if the FSM is messed up, there is no telling what will happen for interrupts.

In our reference code and code other customers are using. Applying fixes resulted in I2C flakiness going away.

> Boot count: 6628
>       1 write for irq 12
>       1 write for irq 25
>       1 write for irq 33
>      10 write for irq 37
>   29532 write for irq 56
>      12 write for irq 67
>       1 write for irq 71
>     281 write for irq 73
>     114 write for irq 83
>     407 write for irq 86
>
> I have also heard that with other use cases irq 17 and 21 should be in
> the list, too. The single ones from 12,25,33,71 are probably just
> one-offs and should not be taken too seriously, 37/67 are corner cases
> but 73/83/86 are definitely valid measurements.

If you apply the bus posting patch I sent a while back they will likely all go away. There are a number of subtle effects posting and resulting bursting will have on register ranges.

http://www.mail-archive.com/linux-omap@vger.kernel.org/msg08363.html

It is quite noble that people want to experience and hunt down subtle system wide errors for a potential small performance gain. I'm tired of the weeks if not months of cumulative time wasted here. It would be more fun to play with a broken alpha version of gcc and look for subtle errors. I might eventually get some benefit.

Regards,
Richard W.

--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley March 13, 2009, 12:04 a.m. UTC | #20
(cc's trimmed to drop Ben, i2c list)

Hello Ari,

On Thu, 12 Mar 2009, Kauppi Ari (EXT-Ixonos/Oulu) wrote:
> On Thu, 2009-03-12 at 10:58 +0100, ext Paul Walmsley wrote:
> > Thanks for the details.  Can you extract the list of spurious IRQ warnings 
> > that you're getting, and post them?  I suspect that, like I2C, many of the 
> > driver ISRs are not reading back the device interrupt status registers 
> > after they clear them.
> 
> Sure,
> 
> Here is "grep -hoa 'write for irq.*' *.log | sort | uniq -c" from the
> most problematic case (step 2 in above process). It should be noted that
> the messages are always in pairs, ie. there are two consecutive messages
> in the logs with only microseconds between them. I have divided the
> counts reported by uniq -c by two in the list below.
> 
> Boot count: 6628
>       1 write for irq 12
>       1 write for irq 25
>       1 write for irq 33
>      10 write for irq 37
>   29532 write for irq 56
>      12 write for irq 67
>       1 write for irq 71
>     281 write for irq 73
>     114 write for irq 83
>     407 write for irq 86
> 
> I have also heard that with other use cases irq 17 and 21 should be in
> the list, too. The single ones from 12,25,33,71 are probably just
> one-offs and should not be taken too seriously, 37/67 are corner cases
> but 73/83/86 are definitely valid measurements.

Would you be willing to test some patches that might resolve most of these 
spurious IRQs?  They are here:

   http://www.pwsan.com/omap/spurious-irq-fix.tar.gz

They at least boot okay on a 3430SDP, but since I don't have a test case 
to reproduce the spurious IRQs, I haven't done any further testing.


Thanks for your help with this -

- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 0c3ed41..18af43f 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -847,7 +847,7 @@  omap_i2c_probe(struct platform_device *pdev)
 	omap_i2c_init(dev);
 
 	isr = (dev->rev < OMAP_I2C_REV_2) ? omap_i2c_rev1_isr : omap_i2c_isr;
-	r = request_irq(dev->irq, isr, 0, pdev->name, dev);
+	r = request_irq(dev->irq, isr, IRQF_DISABLED, pdev->name, dev);
 
 	if (r) {
 		dev_err(dev->dev, "failure requesting irq %i\n", dev->irq);