diff mbox

ARM: OMAP2+: Use handle_fasteoi_irq for INTC interrupt handling

Message ID 1393236556-12361-1-git-send-email-stefan.sorensen@spectralink.com (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Sørensen Feb. 24, 2014, 10:09 a.m. UTC
Currently INTC interrupts are handled using handle_level_irq which
will acknowledge the interrupt before running the handler. If a second
interrupt is then asserted and this interrupt is disabled while
running the first handler, the INTC will be brought into an
inconsistent state. In this state the INTC will interrupt the CPU but
the interrupt pending registers will show no pending interrupts
causing the CPU to take no action. This will repeat until another
higher priority interrupt triggers, hogging the CPU.

handle_fasteoi_irq does not acknowledge the interrupt before after
running the handler, so using this for handling INTC interrupts avoid
the above problem. This also brings the handling more in line with the
sequence described in the TRM.

Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
---
 arch/arm/mach-omap2/irq.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

Comments

Tony Lindgren Feb. 28, 2014, 5:11 p.m. UTC | #1
* Stefan Sørensen <stefan.sorensen@spectralink.com> [140224 02:12]:
> Currently INTC interrupts are handled using handle_level_irq which
> will acknowledge the interrupt before running the handler. If a second
> interrupt is then asserted and this interrupt is disabled while
> running the first handler, the INTC will be brought into an
> inconsistent state.

Hmm care to explain a bit more here if the second interrupt you're
talking about is the same interrupt or any interrupt in the same
interrupt bank? Is this limited to GPIO interrupts?

The reason I'm asking is because of the issues we've seen earlier
with not flushing posted writes from the interrupt handlers. In
those case the interrupt on the device gets acked too late without
the read back call.

> In this state the INTC will interrupt the CPU but
> the interrupt pending registers will show no pending interrupts
> causing the CPU to take no action. This will repeat until another
> higher priority interrupt triggers, hogging the CPU.
> 
> handle_fasteoi_irq does not acknowledge the interrupt before after
> running the handler, so using this for handling INTC interrupts avoid
> the above problem. This also brings the handling more in line with the
> sequence described in the TRM.
> 
> Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
> ---
>  arch/arm/mach-omap2/irq.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/irq.c b/arch/arm/mach-omap2/irq.c
> index e022a86..9d09914 100644
> --- a/arch/arm/mach-omap2/irq.c
> +++ b/arch/arm/mach-omap2/irq.c
> @@ -97,12 +97,6 @@ static void omap_ack_irq(struct irq_data *d)
>  	intc_bank_write_reg(0x1, &irq_banks[0], INTC_CONTROL);
>  }
>  
> -static void omap_mask_ack_irq(struct irq_data *d)
> -{
> -	irq_gc_mask_disable_reg(d);
> -	omap_ack_irq(d);
> -}
> -
>  static void __init omap_irq_bank_init_one(struct omap_irq_bank *bank)
>  {
>  	unsigned long tmp;
> @@ -145,9 +139,9 @@ omap_alloc_gc(void __iomem *base, unsigned int irq_start, unsigned int num)
>  	struct irq_chip_type *ct;
>  
>  	gc = irq_alloc_generic_chip("INTC", 1, irq_start, base,
> -					handle_level_irq);
> +					handle_fasteoi_irq);
>  	ct = gc->chip_types;
> -	ct->chip.irq_ack = omap_mask_ack_irq;
> +	ct->chip.irq_eoi = omap_ack_irq;
>  	ct->chip.irq_mask = irq_gc_mask_disable_reg;
>  	ct->chip.irq_unmask = irq_gc_unmask_enable_reg;
>  	ct->chip.flags |= IRQCHIP_SKIP_SET_WAKE;
> -- 
> 1.8.5.3
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Sørensen March 1, 2014, 9:59 a.m. UTC | #2
On Fri, 2014-02-28 at 09:11 -0800, Tony Lindgren wrote:
> * Stefan Sørensen <stefan.sorensen@spectralink.com> [140224 02:12]:

> > Currently INTC interrupts are handled using handle_level_irq which

> > will acknowledge the interrupt before running the handler. If a second

> > interrupt is then asserted and this interrupt is disabled while

> > running the first handler, the INTC will be brought into an

> > inconsistent state.

> 

> Hmm care to explain a bit more here if the second interrupt you're

> talking about is the same interrupt or any interrupt in the same

> interrupt bank? Is this limited to GPIO interrupts?


I am seeing it with the cpsw driver on a custom board and on the
beaglebone. When a tx irq is handled the cpsw irq handler disables both
the tx and the rx irqs, and if the rx irq was also asserted (i.e. duplex
traffic), this bug will trigger. Reproducing it is very simple, just hit
a beaglebone with a flood ping and watch a function trace.

Applying this patch I see a significant performance boost on duplex
traffic. An iperf full duplex test gives a 50-100% increase in receive
bandwidth - it now fully saturates a 100Mb interface, so the increase
might be even larger with a gigabit interface.

> The reason I'm asking is because of the issues we've seen earlier

> with not flushing posted writes from the interrupt handlers. In

> those case the interrupt on the device gets acked too late without

> the read back call.


I am not very familiar with the low level details of the irq handling,  
but am335x TRM states that a data synchronization barrier should be used
after the ACK is posted to the INTC, and I don't see that anywhere in
the code. Is this related?

Stefan
Tony Lindgren March 2, 2014, 5:37 p.m. UTC | #3
* Sørensen, Stefan <Stefan.Sorensen@spectralink.com> [140301 02:02]:
> On Fri, 2014-02-28 at 09:11 -0800, Tony Lindgren wrote:
> > * Stefan Sørensen <stefan.sorensen@spectralink.com> [140224 02:12]:
> > > Currently INTC interrupts are handled using handle_level_irq which
> > > will acknowledge the interrupt before running the handler. If a second
> > > interrupt is then asserted and this interrupt is disabled while
> > > running the first handler, the INTC will be brought into an
> > > inconsistent state.
> > 
> > Hmm care to explain a bit more here if the second interrupt you're
> > talking about is the same interrupt or any interrupt in the same
> > interrupt bank? Is this limited to GPIO interrupts?
> 
> I am seeing it with the cpsw driver on a custom board and on the
> beaglebone. When a tx irq is handled the cpsw irq handler disables both
> the tx and the rx irqs, and if the rx irq was also asserted (i.e. duplex
> traffic), this bug will trigger. Reproducing it is very simple, just hit
> a beaglebone with a flood ping and watch a function trace.

OK so it's for the same interrupt. And that sounds like a good test :)
 
> Applying this patch I see a significant performance boost on duplex
> traffic. An iperf full duplex test gives a 50-100% increase in receive
> bandwidth - it now fully saturates a 100Mb interface, so the increase
> might be even larger with a gigabit interface.
> 
> > The reason I'm asking is because of the issues we've seen earlier
> > with not flushing posted writes from the interrupt handlers. In
> > those case the interrupt on the device gets acked too late without
> > the read back call.
> 
> I am not very familiar with the low level details of the irq handling,  
> but am335x TRM states that a data synchronization barrier should be used
> after the ACK is posted to the INTC, and I don't see that anywhere in
> the code. Is this related?

Well sort of, except DSB won't do it as it won't guarantee the write
gets to the device on the bus. So a readback from the device is needed
as only the order of reads and writes is guaranteed.

A good sanity check would be to find the related interrupt handler(s)
in the cpsw driver that do the write to the cpsw registers to ack
interrupts.

Then check if there's a readback in the cpsw interrupt handler(s) of
some harmless cpsw register after acking the interrupt(s) and before
doing return IRQ_HANDLED.

If this fixes things without your patch, then we know it's a driver
issue and there's no need to debug it further :) The missing flush of
posted write usually shows up as a spurious interrupts with no status
in the device, but depending on the driver code handling of spurious
interrupts it may also have other side effects.

I'm not too familiar with the cpsw driver so I can't do a test patch
without digging into it further sorry. For similar examples, you
may want to grep for "flush posted write" and "OCP barrier" in the
kernel code.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Sørensen March 3, 2014, 11:05 a.m. UTC | #4
On Sun, 2014-03-02 at 09:37 -0800, Tony Lindgren wrote:
> * Sørensen, Stefan <Stefan.Sorensen@spectralink.com> [140301 02:02]:

> > On Fri, 2014-02-28 at 09:11 -0800, Tony Lindgren wrote:

> > > * Stefan Sørensen <stefan.sorensen@spectralink.com> [140224 02:12]:

> > > > Currently INTC interrupts are handled using handle_level_irq which

> > > > will acknowledge the interrupt before running the handler. If a second

> > > > interrupt is then asserted and this interrupt is disabled while

> > > > running the first handler, the INTC will be brought into an

> > > > inconsistent state.

> > > 

> > > Hmm care to explain a bit more here if the second interrupt you're

> > > talking about is the same interrupt or any interrupt in the same

> > > interrupt bank? Is this limited to GPIO interrupts?

> > 

> > I am seeing it with the cpsw driver on a custom board and on the

> > beaglebone. When a tx irq is handled the cpsw irq handler disables both

> > the tx and the rx irqs, and if the rx irq was also asserted (i.e. duplex

> > traffic), this bug will trigger. Reproducing it is very simple, just hit

> > a beaglebone with a flood ping and watch a function trace.

> 

> OK so it's for the same interrupt. And that sounds like a good test :)


No, the tx and rx are separate interrupts, but the cpsw driver has a
common handler.
 
> > > The reason I'm asking is because of the issues we've seen earlier

> > > with not flushing posted writes from the interrupt handlers. In

> > > those case the interrupt on the device gets acked too late without

> > > the read back call.

> > 

> > I am not very familiar with the low level details of the irq handling,  

> > but am335x TRM states that a data synchronization barrier should be used

> > after the ACK is posted to the INTC, and I don't see that anywhere in

> > the code. Is this related?

> 

> Well sort of, except DSB won't do it as it won't guarantee the write

> gets to the device on the bus. So a readback from the device is needed

> as only the order of reads and writes is guaranteed.


I think that we are talking about two different scenarios, what I am
seeing is that an interrupt is disabled while active on the INTC.

	1. CPSW device asserts TX IRQ
	2. CPSW device asserts RX IRQ
	3. INTC interrupts CPU, TX IRQ marked as active
	4. omap_intc_handle_irq ACKs TX IRQ on the INTC
	5. INTC marks RX IRQ as active
	6. omap_intc_handle_irq calls cpsw_interrupt
	7. cpsw_interrupt disables RX+TX IRQ in CPSW device
	8. cpsw_interrupt disables RX+TX IRQ in INTC (the IRQs are masked)
	9.. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
	10. INTC interrupts CPU, RX IRQ marked as pending
	11. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
	12. Go to step 10

The problem arises in step 8 where an active IRQ is masked. This will
not make it inactive in the INTC but it will be cleared from the
pending IRQ registers - this is the register that omap_intc_handle_irq
uses to decide which IRQ is active.

> A good sanity check would be to find the related interrupt handler(s)

> in the cpsw driver that do the write to the cpsw registers to ack

> interrupts.

> 

> Then check if there's a readback in the cpsw interrupt handler(s) of

> some harmless cpsw register after acking the interrupt(s) and before

> doing return IRQ_HANDLED.

> 

> If this fixes things without your patch, then we know it's a driver

> issue and there's no need to debug it further :) The missing flush of

> posted write usually shows up as a spurious interrupts with no status

> in the device, but depending on the driver code handling of spurious

> interrupts it may also have other side effects.

> 

> I'm not too familiar with the cpsw driver so I can't do a test patch

> without digging into it further sorry. For similar examples, you

> may want to grep for "flush posted write" and "OCP barrier" in the

> kernel code.


I tried this with an assortment of different CPSW registers - no change.

Stefan
Tony Lindgren March 3, 2014, 5:49 p.m. UTC | #5
* Sørensen, Stefan <Stefan.Sorensen@spectralink.com> [140303 03:08]:
> On Sun, 2014-03-02 at 09:37 -0800, Tony Lindgren wrote:
> > * Sørensen, Stefan <Stefan.Sorensen@spectralink.com> [140301 02:02]:
> > > On Fri, 2014-02-28 at 09:11 -0800, Tony Lindgren wrote:
> > > > * Stefan Sørensen <stefan.sorensen@spectralink.com> [140224 02:12]:
> > > > > Currently INTC interrupts are handled using handle_level_irq which
> > > > > will acknowledge the interrupt before running the handler. If a second
> > > > > interrupt is then asserted and this interrupt is disabled while
> > > > > running the first handler, the INTC will be brought into an
> > > > > inconsistent state.
> > > > 
> > > > Hmm care to explain a bit more here if the second interrupt you're
> > > > talking about is the same interrupt or any interrupt in the same
> > > > interrupt bank? Is this limited to GPIO interrupts?
> > > 
> > > I am seeing it with the cpsw driver on a custom board and on the
> > > beaglebone. When a tx irq is handled the cpsw irq handler disables both
> > > the tx and the rx irqs, and if the rx irq was also asserted (i.e. duplex
> > > traffic), this bug will trigger. Reproducing it is very simple, just hit
> > > a beaglebone with a flood ping and watch a function trace.
> > 
> > OK so it's for the same interrupt. And that sounds like a good test :)
> 
> No, the tx and rx are separate interrupts, but the cpsw driver has a
> common handler.

Oh OK sounds like that's where the problem is :)
  
> > > > The reason I'm asking is because of the issues we've seen earlier
> > > > with not flushing posted writes from the interrupt handlers. In
> > > > those case the interrupt on the device gets acked too late without
> > > > the read back call.
> > > 
> > > I am not very familiar with the low level details of the irq handling,  
> > > but am335x TRM states that a data synchronization barrier should be used
> > > after the ACK is posted to the INTC, and I don't see that anywhere in
> > > the code. Is this related?
> > 
> > Well sort of, except DSB won't do it as it won't guarantee the write
> > gets to the device on the bus. So a readback from the device is needed
> > as only the order of reads and writes is guaranteed.
> 
> I think that we are talking about two different scenarios, what I am
> seeing is that an interrupt is disabled while active on the INTC.

Yes sounds like these are different issues.
 
> 	1. CPSW device asserts TX IRQ
> 	2. CPSW device asserts RX IRQ
> 	3. INTC interrupts CPU, TX IRQ marked as active
> 	4. omap_intc_handle_irq ACKs TX IRQ on the INTC
> 	5. INTC marks RX IRQ as active
> 	6. omap_intc_handle_irq calls cpsw_interrupt
> 	7. cpsw_interrupt disables RX+TX IRQ in CPSW device
> 	8. cpsw_interrupt disables RX+TX IRQ in INTC (the IRQs are masked)
> 	9.. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
> 	10. INTC interrupts CPU, RX IRQ marked as pending
> 	11. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
> 	12. Go to step 10
> 
> The problem arises in step 8 where an active IRQ is masked. This will
> not make it inactive in the INTC but it will be cleared from the
> pending IRQ registers - this is the register that omap_intc_handle_irq
> uses to decide which IRQ is active.

OK thanks for the info. Looking at your trace above it seems like the
two separate TX and RX INTC interrupts are not really treated as separate
interrupts in the cpsw driver. It seems the real fix is to fix up the
cpsw interrupt handler so it's not racy. It seems that at least step 7
and 8 above should be handled separately for TX and RX based on the INTC
interrupts.

From the INTC point of view we should just have standard level IRQ
handling for each interrupt, right? INTC does not need to know about
driver specific coordination between two separate interrupts.
 
> > A good sanity check would be to find the related interrupt handler(s)
> > in the cpsw driver that do the write to the cpsw registers to ack
> > interrupts.
> > 
> > Then check if there's a readback in the cpsw interrupt handler(s) of
> > some harmless cpsw register after acking the interrupt(s) and before
> > doing return IRQ_HANDLED.
> > 
> > If this fixes things without your patch, then we know it's a driver
> > issue and there's no need to debug it further :) The missing flush of
> > posted write usually shows up as a spurious interrupts with no status
> > in the device, but depending on the driver code handling of spurious
> > interrupts it may also have other side effects.
> > 
> > I'm not too familiar with the cpsw driver so I can't do a test patch
> > without digging into it further sorry. For similar examples, you
> > may want to grep for "flush posted write" and "OCP barrier" in the
> > kernel code.
> 
> I tried this with an assortment of different CPSW registers - no change.

OK thanks for trying. Based on your description above it seems that
the problem you found is not related to flushing posted writes even
though that issue might be there also once race issue is fixed.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Sørensen March 6, 2014, 2:32 p.m. UTC | #6
On Mon, 2014-03-03 at 09:49 -0800, Tony Lindgren wrote:
> > 	1. CPSW device asserts TX IRQ

> > 	2. CPSW device asserts RX IRQ

> > 	3. INTC interrupts CPU, TX IRQ marked as active

> > 	4. omap_intc_handle_irq ACKs TX IRQ on the INTC

> > 	5. INTC marks RX IRQ as active

> > 	6. omap_intc_handle_irq calls cpsw_interrupt

> > 	7. cpsw_interrupt disables RX+TX IRQ in CPSW device

> > 	8. cpsw_interrupt disables RX+TX IRQ in INTC (the IRQs are masked)

> > 	9.. omap_intc_handle_irq sees no unmasked IRQs are pending and returns

> > 	10. INTC interrupts CPU, RX IRQ marked as pending

> > 	11. omap_intc_handle_irq sees no unmasked IRQs are pending and returns

> > 	12. Go to step 10

> > 

> > The problem arises in step 8 where an active IRQ is masked. This will

> > not make it inactive in the INTC but it will be cleared from the

> > pending IRQ registers - this is the register that omap_intc_handle_irq

> > uses to decide which IRQ is active.

> 

> OK thanks for the info. Looking at your trace above it seems like the

> two separate TX and RX INTC interrupts are not really treated as separate

> interrupts in the cpsw driver. It seems the real fix is to fix up the

> cpsw interrupt handler so it's not racy. It seems that at least step 7

> and 8 above should be handled separately for TX and RX based on the INTC

> interrupts.


Looking in drivers/net/ethernet, it looks like a quite common pattern
is to handle both RXx and TX interrupts with the same handler, which
will disable both interrupts and then hand off to napi. The cpsw driver
works the same way, so shouldn't this work on omap?

> From the INTC point of view we should just have standard level IRQ

> handling for each interrupt, right? INTC does not need to know about

> driver specific coordination between two separate interrupts.


The problem is that the INTC is not doing 100% level IRQ handling
combined with the way we handle the INTC during interrupts. When a IRQ
has become active on the INTC it will stay active until we acknowledge
it, even if it is masked or de-asserted.

My patch does not fix the problem when the active interrupt was
de-asserted, so I am looking into other ways of dealing with it.

Stefan
Tony Lindgren March 6, 2014, 6:15 p.m. UTC | #7
* Sørensen, Stefan <Stefan.Sorensen@spectralink.com> [140306 06:36]:
> On Mon, 2014-03-03 at 09:49 -0800, Tony Lindgren wrote:
> > > 	1. CPSW device asserts TX IRQ
> > > 	2. CPSW device asserts RX IRQ
> > > 	3. INTC interrupts CPU, TX IRQ marked as active
> > > 	4. omap_intc_handle_irq ACKs TX IRQ on the INTC
> > > 	5. INTC marks RX IRQ as active
> > > 	6. omap_intc_handle_irq calls cpsw_interrupt
> > > 	7. cpsw_interrupt disables RX+TX IRQ in CPSW device
> > > 	8. cpsw_interrupt disables RX+TX IRQ in INTC (the IRQs are masked)
> > > 	9.. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
> > > 	10. INTC interrupts CPU, RX IRQ marked as pending
> > > 	11. omap_intc_handle_irq sees no unmasked IRQs are pending and returns
> > > 	12. Go to step 10
> > > 
> > > The problem arises in step 8 where an active IRQ is masked. This will
> > > not make it inactive in the INTC but it will be cleared from the
> > > pending IRQ registers - this is the register that omap_intc_handle_irq
> > > uses to decide which IRQ is active.
> > 
> > OK thanks for the info. Looking at your trace above it seems like the
> > two separate TX and RX INTC interrupts are not really treated as separate
> > interrupts in the cpsw driver. It seems the real fix is to fix up the
> > cpsw interrupt handler so it's not racy. It seems that at least step 7
> > and 8 above should be handled separately for TX and RX based on the INTC
> > interrupts.
> 
> Looking in drivers/net/ethernet, it looks like a quite common pattern
> is to handle both RXx and TX interrupts with the same handler, which
> will disable both interrupts and then hand off to napi. The cpsw driver
> works the same way, so shouldn't this work on omap?

Hmm yes at least disable_irq should work no matter when it's called.
 
> > From the INTC point of view we should just have standard level IRQ
> > handling for each interrupt, right? INTC does not need to know about
> > driver specific coordination between two separate interrupts.
> 
> The problem is that the INTC is not doing 100% level IRQ handling
> combined with the way we handle the INTC during interrupts. When a IRQ
> has become active on the INTC it will stay active until we acknowledge
> it, even if it is masked or de-asserted.

Yes it's starting to sound that INTC has an issuerecovering from the
spurious interrupts.
 
> My patch does not fix the problem when the active interrupt was
> de-asserted, so I am looking into other ways of dealing with it.

OK.

Regards,

Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/arm/mach-omap2/irq.c b/arch/arm/mach-omap2/irq.c
index e022a86..9d09914 100644
--- a/arch/arm/mach-omap2/irq.c
+++ b/arch/arm/mach-omap2/irq.c
@@ -97,12 +97,6 @@  static void omap_ack_irq(struct irq_data *d)
 	intc_bank_write_reg(0x1, &irq_banks[0], INTC_CONTROL);
 }
 
-static void omap_mask_ack_irq(struct irq_data *d)
-{
-	irq_gc_mask_disable_reg(d);
-	omap_ack_irq(d);
-}
-
 static void __init omap_irq_bank_init_one(struct omap_irq_bank *bank)
 {
 	unsigned long tmp;
@@ -145,9 +139,9 @@  omap_alloc_gc(void __iomem *base, unsigned int irq_start, unsigned int num)
 	struct irq_chip_type *ct;
 
 	gc = irq_alloc_generic_chip("INTC", 1, irq_start, base,
-					handle_level_irq);
+					handle_fasteoi_irq);
 	ct = gc->chip_types;
-	ct->chip.irq_ack = omap_mask_ack_irq;
+	ct->chip.irq_eoi = omap_ack_irq;
 	ct->chip.irq_mask = irq_gc_mask_disable_reg;
 	ct->chip.irq_unmask = irq_gc_unmask_enable_reg;
 	ct->chip.flags |= IRQCHIP_SKIP_SET_WAKE;