diff mbox series

[3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT

Message ID 20230918122411.237635-4-biju.das.jz@bp.renesas.com (mailing list archive)
State Superseded
Delegated to: Geert Uytterhoeven
Series Fix IRQ storm with GPIO interrupts

Commit Message

Biju Das Sept. 18, 2023, 12:24 p.m. UTC
With edge trigger detection, enabling the TINT source causes a phantom
interrupt that leads to an irq storm. So clear the phantom interrupt
in rzg2l_irqc_irq_enable().

This issue is observed when the irq handler disables the interrupt using
disable_irq_nosync() and schedules a work queue, and the work queue then
re-enables the interrupt with enable_irq().

Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
---
 drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Marc Zyngier Sept. 19, 2023, 2:37 p.m. UTC | #1
On Mon, 18 Sep 2023 13:24:11 +0100,
Biju Das <biju.das.jz@bp.renesas.com> wrote:
> 
> In case of edge trigger detection, enabling the TINT source causes a
> phantum interrupt that leads to irq storm. So clear the phantum interrupt
> in rzg2l_irqc_irq_enable().
> 
> This issue is observed when the irq handler disables the interrupts using
> disable_irq_nosync() and scheduling a work queue and in the work queue,
> re-enabling the interrupt with enable_irq().
> 
> Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver")
> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> ---
>  drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
> index 33a22bafedcd..78a9e90512a6 100644
> --- a/drivers/irqchip/irq-renesas-rzg2l.c
> +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data *d)
>  		reg = readl_relaxed(priv->base + TSSR(tssr_index));
>  		reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
>  		writel_relaxed(reg, priv->base + TSSR(tssr_index));
> +		/*
> +		 * In case of edge trigger detection, enabling the TINT source
> +		 * cause a phantum interrupt that leads to irq storm. So clear
> +		 * the phantum interrupt.
> +		 */
> +		rzg2l_tint_eoi(d);

This looks incredibly unsafe. disable_irq()+enable_irq() with an
interrupt being made pending in the middle, and you've lost that
interrupt.

What prevents this scenario?

	M.
Biju Das Sept. 19, 2023, 3:24 p.m. UTC | #2
Hi Marc Zyngier,

Thanks for the feedback.

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
> 
> On Mon, 18 Sep 2023 13:24:11 +0100,
> Biju Das <biju.das.jz@bp.renesas.com> wrote:
> >
> > In case of edge trigger detection, enabling the TINT source causes a
> > phantum interrupt that leads to irq storm. So clear the phantum
> > interrupt in rzg2l_irqc_irq_enable().
> >
> > This issue is observed when the irq handler disables the interrupts
> > using
> > disable_irq_nosync() and scheduling a work queue and in the work
> > queue, re-enabling the interrupt with enable_irq().
> >
> > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller
> > driver")
> > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> > ---
> >  drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > b/drivers/irqchip/irq-renesas-rzg2l.c
> > index 33a22bafedcd..78a9e90512a6 100644
> > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data
> *d)
> >  		reg = readl_relaxed(priv->base + TSSR(tssr_index));
> >  		reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> >  		writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > +		/*
> > +		 * In case of edge trigger detection, enabling the TINT source
> > +		 * cause a phantum interrupt that leads to irq storm. So clear
> > +		 * the phantum interrupt.
> > +		 */
> > +		rzg2l_tint_eoi(d);
> 
> This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt
> being made pending in the middle, and you've lost that interrupt.

In this driver that will never happen, as it clears the TINT source
during disable(), so there won't be any TINT source for interrupt
detection after disable().

Cheers,
Biju

> What prevents this scenario?
Marc Zyngier Sept. 19, 2023, 4:19 p.m. UTC | #3
On Tue, 19 Sep 2023 16:24:53 +0100,
Biju Das <biju.das.jz@bp.renesas.com> wrote:
> 
> Hi Marc Zyngier,
> 
> Thanks for the feedback.
> 
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> > trigger detection for TINT
> > 
> > On Mon, 18 Sep 2023 13:24:11 +0100,
> > Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > >
> > > In case of edge trigger detection, enabling the TINT source causes a
> > > phantum interrupt that leads to irq storm. So clear the phantum
> > > interrupt in rzg2l_irqc_irq_enable().
> > >
> > > This issue is observed when the irq handler disables the interrupts
> > > using
> > > disable_irq_nosync() and scheduling a work queue and in the work
> > > queue, re-enabling the interrupt with enable_irq().
> > >
> > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller
> > > driver")
> > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> > > ---
> > >  drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > > b/drivers/irqchip/irq-renesas-rzg2l.c
> > > index 33a22bafedcd..78a9e90512a6 100644
> > > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data
> > *d)
> > >  		reg = readl_relaxed(priv->base + TSSR(tssr_index));
> > >  		reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> > >  		writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > > +		/*
> > > +		 * In case of edge trigger detection, enabling the TINT source
> > > +		 * cause a phantum interrupt that leads to irq storm. So clear
> > > +		 * the phantum interrupt.
> > > +		 */
> > > +		rzg2l_tint_eoi(d);
> > 
> > This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt
> > being made pending in the middle, and you've lost that interrupt.
> 
> In this driver that will never happen as it clears the TINT source
> during disable(), so there won't be any TINT source for interrupt
> detection after disable().

So you mean that you *already* lose interrupts across a disable
followed by an enable? I'm slightly puzzled...

	M.
Biju Das Sept. 19, 2023, 4:32 p.m. UTC | #4
Hi Marc Zyngier,

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
> 
> On Tue, 19 Sep 2023 16:24:53 +0100,
> Biju Das <biju.das.jz@bp.renesas.com> wrote:
> >
> > Hi Marc Zyngier,
> >
> > Thanks for the feedback.
> >
> > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > > edge trigger detection for TINT
> > >
> > > On Mon, 18 Sep 2023 13:24:11 +0100,
> > > Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > > >
> > > > In case of edge trigger detection, enabling the TINT source causes
> > > > a phantum interrupt that leads to irq storm. So clear the phantum
> > > > interrupt in rzg2l_irqc_irq_enable().
> > > >
> > > > This issue is observed when the irq handler disables the
> > > > interrupts using
> > > > disable_irq_nosync() and scheduling a work queue and in the work
> > > > queue, re-enabling the interrupt with enable_irq().
> > > >
> > > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt
> > > > Controller
> > > > driver")
> > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
> > > > ---
> > > >  drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c
> > > > b/drivers/irqchip/irq-renesas-rzg2l.c
> > > > index 33a22bafedcd..78a9e90512a6 100644
> > > > --- a/drivers/irqchip/irq-renesas-rzg2l.c
> > > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c
> > > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct
> > > > irq_data
> > > *d)
> > > >  		reg = readl_relaxed(priv->base + TSSR(tssr_index));
> > > >  		reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
> > > >  		writel_relaxed(reg, priv->base + TSSR(tssr_index));
> > > > +		/*
> > > > +		 * In case of edge trigger detection, enabling the TINT
> source
> > > > +		 * cause a phantum interrupt that leads to irq storm. So
> clear
> > > > +		 * the phantum interrupt.
> > > > +		 */
> > > > +		rzg2l_tint_eoi(d);
> > >
> > > This looks incredibly unsafe. disable_irq()+enable_irq() with an
> > > interrupt being made pending in the middle, and you've lost that
> interrupt.
> >
> > In this driver that will never happen as it clears the TINT source
> > during disable(), so there won't be any TINT source for interrupt
> > detection after disable().
> 
> So you mean that you *already* lose interrupts across a disable followed by
> an enable? I'm slightly puzzled...

No interrupt is lost at all.

Currently this patch addresses 2 issues.

Issue 1: Extra interrupt when we select the TINT source on enable_irq()

We get an extra interrupt when a client driver calls enable_irq() during
probe()/resume(). In this case, the irq handler in the client driver just
clears the interrupt status bit.

Issue 2: IRQ storm when we select the TINT source on enable_irq()

Here as well, we get an extra interrupt when a client driver calls
enable_irq() during probe(). This interrupt is then generated endlessly
when the client driver calls disable_irq() in its irq handler and
enable_irq() in the work queue.

Currently we are not losing interrupts, but we are getting an additional
(phantom) interrupt, which is causing the issue.

Cheers,
Biju
Marc Zyngier Sept. 19, 2023, 4:49 p.m. UTC | #5
On Tue, 19 Sep 2023 17:32:05 +0100,
Biju Das <biju.das.jz@bp.renesas.com> wrote:

[...]

> > So you mean that you *already* lose interrupts across a disable followed by
> > an enable? I'm slightly puzzled...
> 
> There is no interrupt lost at all. 
> 
> Currently this patch addresses 2 issues.
> 
> Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> 
> Getting an extra interrupt, when client drivers calls enable_irq()
> during probe()/resume(). In this case, the irq handler on the Client
> driver just clear the interrupt status bit.
>
> Issue 2: IRQ storm when we select TINT source on enable_irq()
> 
> Here as well, we are getting an extra interrupt, when client drivers
> calls enable_irq() during probe() and this Interrupts getting
> generated infinitely, when the client driver calls disable_irq() in
> irq handler and in in work queue calling enable_irq().

How do you know this is a spurious interrupt? For all you can tell,
you are just consuming an edge. I absolutely don't buy this
workaround, because you have no context that allows you to
discriminate between a real spurious interrupt and a normal interrupt
that lands while the interrupt line was masked.
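
The concern above can be illustrated with a small user-space model. This is only a sketch: `struct edge_line` and its helpers are hypothetical names, not driver code, and the model assumes a latch that keeps capturing edges while the line is disabled, which the thread does not establish for this hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical edge-triggered line: edges are latched in 'pending'. */
struct edge_line {
	bool enabled;
	bool pending;
	unsigned int delivered; /* interrupts the handler actually saw */
};

/* Deliver a latched edge to the handler if the line is enabled. */
static void line_deliver(struct edge_line *l)
{
	if (l->enabled && l->pending) {
		l->pending = false;
		l->delivered++;
	}
}

/* A device signals an edge; it is latched even when the line is disabled. */
static void line_edge(struct edge_line *l)
{
	l->pending = true;
	line_deliver(l);
}

static void line_disable(struct edge_line *l)
{
	l->enabled = false;
}

/* clear_on_enable models an unconditional EOI in the enable path. */
static void line_enable(struct edge_line *l, bool clear_on_enable)
{
	if (clear_on_enable)
		l->pending = false;
	l->enabled = true;
	line_deliver(l);
}

/* One edge arrives between disable and enable; count what is delivered. */
static unsigned int edges_seen(bool clear_on_enable)
{
	struct edge_line l = { .enabled = true };

	line_disable(&l);
	line_edge(&l);
	line_enable(&l, clear_on_enable);
	return l.delivered;
}

/* Usage: the same edge is delivered or dropped depending on the enable path. */
static void edge_line_demo(void)
{
	assert(edges_seen(false) == 1);
	assert(edges_seen(true) == 0);
}
```

In this model the enable path cannot tell a latched real edge from a phantom one, so clearing unconditionally drops both.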

> Currently we are not loosing interrupts, but we are getting additional
> Interrupt(phantom) which is causing the issue.

If you get an interrupt at probe time in the endpoint driver, that's
probably because the device is not in a quiescent state when the
interrupt is requested. And it is probably this that needs addressing.

	M.
Biju Das Sept. 19, 2023, 5:06 p.m. UTC | #6
Hi Marc Zyngier,

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
> 
> On Tue, 19 Sep 2023 17:32:05 +0100,
> Biju Das <biju.das.jz@bp.renesas.com> wrote:
> 
> [...]
> 
> > > So you mean that you *already* lose interrupts across a disable
> > > followed by an enable? I'm slightly puzzled...
> >
> > There is no interrupt lost at all.
> >
> > Currently this patch addresses 2 issues.
> >
> > Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> >
> > Getting an extra interrupt, when client drivers calls enable_irq()
> > during probe()/resume(). In this case, the irq handler on the Client
> > driver just clear the interrupt status bit.
> >
> > Issue 2: IRQ storm when we select TINT source on enable_irq()
> >
> > Here as well, we are getting an extra interrupt, when client drivers
> > calls enable_irq() during probe() and this Interrupts getting
> > generated infinitely, when the client driver calls disable_irq() in
> > irq handler and in in work queue calling enable_irq().
> 
> How do you know this is a spurious interrupt? 

We have a PMOD on the RZ/G2L SMARC EVK. I connected it to a GPIO pin,
with the other end tied to ground. During boot, I get an interrupt when
the IRQ is set up in probe(), even though there is no high-to-low
transition. From this I conclude that it is a spurious interrupt.

> For all you can tell, you are
> just consuming an edge. I absolutely don't buy this workaround, because you
> have no context that allows you to discriminate between a real spurious
> interrupt and a normal interrupt that lands while the interrupt line was
> masked.
> 
> > Currently we are not loosing interrupts, but we are getting additional
> > Interrupt(phantom) which is causing the issue.
> 
> If you get an interrupt at probe time in the endpoint driver, that's
> probably because the device is not in a quiescent state when the interrupt
> is requested. And it is probably this that needs addressing.

Any pointer for addressing this issue? 

Thanks for your help.

Cheers,
Biju
Marc Zyngier Sept. 21, 2023, 7:55 a.m. UTC | #7
On Tue, 19 Sep 2023 18:06:54 +0100,
Biju Das <biju.das.jz@bp.renesas.com> wrote:
> 
> Hi Marc Zyngier,
> 
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> > trigger detection for TINT
> > 
> > On Tue, 19 Sep 2023 17:32:05 +0100,
> > Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > 
> > [...]
> > 
> > > > So you mean that you *already* lose interrupts across a disable
> > > > followed by an enable? I'm slightly puzzled...
> > >
> > > There is no interrupt lost at all.
> > >
> > > Currently this patch addresses 2 issues.
> > >
> > > Scenario 1: Extra interrupt when we select TINT source on enable_irq()
> > >
> > > Getting an extra interrupt, when client drivers calls enable_irq()
> > > during probe()/resume(). In this case, the irq handler on the Client
> > > driver just clear the interrupt status bit.
> > >
> > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > >
> > > Here as well, we are getting an extra interrupt, when client drivers
> > > calls enable_irq() during probe() and this Interrupts getting
> > > generated infinitely, when the client driver calls disable_irq() in
> > > irq handler and in in work queue calling enable_irq().
> > 
> > How do you know this is a spurious interrupt? 
> 
> We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin
> and other end to ground. During the boot, I get an interrupt
> even though there is no high to low transition, when the IRQ is setup
> in the probe(). From this it is a spurious interrupt.

That doesn't really handle my question. At the point of enabling the
interrupt and consuming the edge (which is what this patch does), how
do you know you can readily discard this signal? This is a genuine
question.

Spurious interrupts at boot are common. The HW resets in a funky,
unspecified state, and it's SW's job to initialise it before letting
other agents in the system use interrupts.

> 
> > For all you can tell, you are
> > just consuming an edge. I absolutely don't buy this workaround, because you
> > have no context that allows you to discriminate between a real spurious
> > interrupt and a normal interrupt that lands while the interrupt line was
> > masked.
> > 
> > > Currently we are not loosing interrupts, but we are getting additional
> > > Interrupt(phantom) which is causing the issue.
> > 
> > If you get an interrupt at probe time in the endpoint driver, that's
> > probably because the device is not in a quiescent state when the interrupt
> > is requested. And it is probably this that needs addressing.
> 
> Any pointer for addressing this issue? 

Nothing but the most basic stuff: you should make sure that the
interrupt isn't enabled before you can actually handle it, and triage
it as spurious.

	M.
Biju Das Sept. 22, 2023, 2:34 p.m. UTC | #8
Hi Marc Zyngier,

Thanks for the feedback.

> Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
> 
> On Tue, 19 Sep 2023 18:06:54 +0100,
> Biju Das <biju.das.jz@bp.renesas.com> wrote:
> >
> > Hi Marc Zyngier,
> >
> > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > > edge trigger detection for TINT
> > >
> > > On Tue, 19 Sep 2023 17:32:05 +0100,
> > > Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > >
> > > [...]
> > >
> > > > > So you mean that you *already* lose interrupts across a disable
> > > > > followed by an enable? I'm slightly puzzled...
> > > >
> > > > There is no interrupt lost at all.
> > > >
> > > > Currently this patch addresses 2 issues.
> > > >
> > > > Scenario 1: Extra interrupt when we select TINT source on
> > > > enable_irq()
> > > >
> > > > Getting an extra interrupt, when client drivers calls enable_irq()
> > > > during probe()/resume(). In this case, the irq handler on the
> > > > Client driver just clear the interrupt status bit.
> > > >
> > > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > > >
> > > > Here as well, we are getting an extra interrupt, when client
> > > > drivers calls enable_irq() during probe() and this Interrupts
> > > > getting generated infinitely, when the client driver calls
> > > > disable_irq() in irq handler and in in work queue calling
> enable_irq().
> > >
> > > How do you know this is a spurious interrupt?
> >
> > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and
> > other end to ground. During the boot, I get an interrupt even though
> > there is no high to low transition, when the IRQ is setup in the
> > probe(). From this it is a spurious interrupt.
> 
> That doesn't really handle my question. At the point of enabling the
> interrupt and consuming the edge (which is what this patch does), how do
> you know you can readily discard this signal? This is a genuine question.
> 
> Spurious interrupts at boot are common. The HW resets in a funky,
> unspecified state, and it's SW's job to initialise it before letting other
> agents in the system use interrupts.

I got your point about losing interrupts.

Now I can detect spurious interrupts for edge trigger.

The pin controller has a read-only register for monitoring the input
values of GPIO input pins. Comparing that register's value before and
after rzg2l_irq_enable(), together with the TINT Status Control Register
(TSCR) in the IRQ controller, detects the spurious interrupt.

E.g.:
1) Check the PIN_43_0 value (e.g. low) in the pinctrl driver
2) Enable the IRQ using rzg2l_irq_enable()/irq_chip_enable_parent() in
   the pinctrl driver
3) Check the PIN_43_0 value (e.g. low) in the pinctrl driver
4) Check the TINT Status Control Register (TSCR) in the IRQ controller driver

If the values in 1 and 3 are the same and the status in 4 is set, then
there is a spurious interrupt.
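
The check described in steps 1 to 4 can be sketched as a pure function. This is a minimal user-space illustration, not driver code: `struct tint_probe` and `tint_looks_spurious()` are hypothetical names, and the three fields stand in for the register reads, which are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the observations from steps 1, 3 and 4. */
struct tint_probe {
	bool pin_before;   /* pin level read before enabling (step 1) */
	bool pin_after;    /* pin level read after enabling (step 3) */
	bool tscr_pending; /* TSCR status bit for this TINT (step 4) */
};

/* True when the status bit is set although the sampled level did not change. */
static bool tint_looks_spurious(const struct tint_probe *p)
{
	return p->pin_before == p->pin_after && p->tscr_pending;
}

/* Usage: the grounded-pin case versus a real high-to-low transition. */
static void tint_probe_demo(void)
{
	struct tint_probe grounded = { false, false, true };
	struct tint_probe real_edge = { true, false, true };

	assert(tint_looks_spurious(&grounded));
	assert(!tint_looks_spurious(&real_edge));
}
```

Two level samples cannot see a short pulse that returns to the original level between the reads, so this stays a heuristic.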

> 
> >
> > > For all you can tell, you are
> > > just consuming an edge. I absolutely don't buy this workaround,
> > > because you have no context that allows you to discriminate between
> > > a real spurious interrupt and a normal interrupt that lands while
> > > the interrupt line was masked.
> > >
> > > > Currently we are not loosing interrupts, but we are getting
> > > > additional
> > > > Interrupt(phantom) which is causing the issue.
> > >
> > > If you get an interrupt at probe time in the endpoint driver, that's
> > > probably because the device is not in a quiescent state when the
> > > interrupt is requested. And it is probably this that needs addressing.
> >
> > Any pointer for addressing this issue?
> 
> Nothing but the most basic stuff: you should make sure that the interrupt
> isn't enabled before you can actually handle it, and triage it as spurious.

For the GPIO interrupt case I have,

RTC driver (endpoint) --> Pin controller driver --> IRQ controller driver --> GIC controller.

1) I have configured the pin as a GPIO interrupt in the pin controller driver
2) Set the IRQ detection in the IRQ controller to edge trigger
3) The moment I set the IRQ source in the IRQ controller,
   I get an interrupt, even though there is no voltage transition.

Here the system is set up properly, but there is a spurious interrupt.
Currently I don't know how to handle it.

Any pointers for handling this issue?

Note:
 Currently the pin controller driver does not configure the GPIO as a GPIO
 input in the Port Mode Register for GPIO interrupts; instead it uses the
 reset value, which is "Hi-Z". I will send a patch to fix it.

Cheers,
Biju
Biju Das Oct. 6, 2023, 10:46 a.m. UTC | #9
Hi Marc,

> Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge
> trigger detection for TINT
> 
> Hi Marc Zyngier,
> 
> Thanks for the feedback.
> 
> > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with
> > edge trigger detection for TINT
> >
> > On Tue, 19 Sep 2023 18:06:54 +0100,
> > Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > >
> > > Hi Marc Zyngier,
> > >
> > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm
> > > > with edge trigger detection for TINT
> > > >
> > > > On Tue, 19 Sep 2023 17:32:05 +0100, Biju Das
> > > > <biju.das.jz@bp.renesas.com> wrote:
> > > >
> > > > [...]
> > > >
> > > > > > So you mean that you *already* lose interrupts across a
> > > > > > disable followed by an enable? I'm slightly puzzled...
> > > > >
> > > > > There is no interrupt lost at all.
> > > > >
> > > > > Currently this patch addresses 2 issues.
> > > > >
> > > > > Scenario 1: Extra interrupt when we select TINT source on
> > > > > enable_irq()
> > > > >
> > > > > Getting an extra interrupt, when client drivers calls
> > > > > enable_irq() during probe()/resume(). In this case, the irq
> > > > > handler on the Client driver just clear the interrupt status bit.
> > > > >
> > > > > Issue 2: IRQ storm when we select TINT source on enable_irq()
> > > > >
> > > > > Here as well, we are getting an extra interrupt, when client
> > > > > drivers calls enable_irq() during probe() and this Interrupts
> > > > > getting generated infinitely, when the client driver calls
> > > > > disable_irq() in irq handler and in in work queue calling
> > enable_irq().
> > > >
> > > > How do you know this is a spurious interrupt?
> > >
> > > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and
> > > other end to ground. During the boot, I get an interrupt even though
> > > there is no high to low transition, when the IRQ is setup in the
> > > probe(). From this it is a spurious interrupt.
> >
> > That doesn't really handle my question. At the point of enabling the
> > interrupt and consuming the edge (which is what this patch does), how
> > do you know you can readily discard this signal? This is a genuine
> question.
> >
> > Spurious interrupts at boot are common. The HW resets in a funky,
> > unspecified state, and it's SW's job to initialise it before letting
> > other agents in the system use interrupts.
> 
> I got your point related to loosing interrupts.
> 
> Now I can detect spurious interrupts for edge trigger.
> 
> Pin controller driver has a read-only register to monitor input values of
> GPIO input pins, use that register values before/after rzg2l_irq_enable()
> with TINT Status Control Register (TSCR) in IRQ controller to detect the
> spurious interrupt.
> 
> Eg:
> 1) Check PIN_43_0 value (ex: low)in pinctrl driver
> 2) Enable the IRQ using rzg2l_irq_enable()/ irq_chip_enable_parent()in
> pinctrl driver
> 3) Check PIN_43_0 value (ex: low) in pinctrl driver
> 4) Check the TINT Status Control Register(TSCR) in IRQ controller driver
> 
>      If the values in 1 and 3 are same and the status in 4 is set, then
> there is a spurious interrupt.
> 
> >
> > >
> > > > For all you can tell, you are
> > > > just consuming an edge. I absolutely don't buy this workaround,
> > > > because you have no context that allows you to discriminate
> > > > between a real spurious interrupt and a normal interrupt that
> > > > lands while the interrupt line was masked.
> > > >
> > > > > Currently we are not loosing interrupts, but we are getting
> > > > > additional
> > > > > Interrupt(phantom) which is causing the issue.
> > > >
> > > > If you get an interrupt at probe time in the endpoint driver,
> > > > that's probably because the device is not in a quiescent state
> > > > when the interrupt is requested. And it is probably this that needs
> addressing.
> > >
> > > Any pointer for addressing this issue?
> >
> > Nothing but the most basic stuff: you should make sure that the
> > interrupt isn't enabled before you can actually handle it, and triage it
> as spurious.
> 
> For the GPIO interrupt case I have,
> 
> RTC driver(endpoint)--> Pin controller driver -->IRQ controller driver--
> >GIC controller.
> 
> 1) I have configured the pin as GPIO interrupts in pin controller driver
> 2) Set the IRQ detection in IRQ controller for edge trigger
> 3) The moment I set the IRQ source in IRQ controller
>    I get an interrupt, even though there is no voltage transition.
> 
> Here the system is setup properly, but there is a spurious interrupt.
> Currently don't know how to handle it?
> 
> Any pointers for handling this issue?
> 
> Note:
>  Currently the pin controller driver is not configuring GPIO as GPIO input
> in Port Mode Register for the GPIO interrupts instead it is using reset
> value which is "Hi-Z". I will send a patch to fix it.

An update: I have found a way to fix the spurious interrupt issue.

The spurious interrupt is generated if we write the TINT source selection
and the TINT source enable to the TSSRx register simultaneously.

If we write the register in the correct order, there is no issue,
i.e., first set the TINT source selection and after that enable it.

It looks like a HW race condition. I am checking this issue with the HW team.
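
The ordering described above can be sketched in user space. This is an illustration under stated assumptions, not the actual fix: `struct fake_tssr`, `fake_writel()` and `tint_enable_ordered()` are hypothetical, and the layout (four 8-bit fields per TSSR word, with TIEN as bit 7 of each field) is assumed from the `(TIEN | tint) << TSSEL_SHIFT(...)` expression in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: four 8-bit fields per TSSR word, bit 7 of each is TIEN. */
#define TIEN            0x80u
#define TSSEL_SHIFT(n)  (8u * (n))

/* Hypothetical stand-in for the memory-mapped TSSR register. */
struct fake_tssr {
	uint32_t value;      /* current register contents */
	uint32_t history[4]; /* values written, in order */
	unsigned int writes; /* number of writes so far */
};

/* Record each write so the ordering can be inspected afterwards. */
static void fake_writel(struct fake_tssr *r, uint32_t v)
{
	assert(r->writes < 4);
	r->history[r->writes++] = v;
	r->value = v;
}

/* Select the source with one write, then enable it with a second write. */
static void tint_enable_ordered(struct fake_tssr *r, uint32_t tint,
				unsigned int slot)
{
	uint32_t reg = r->value;

	assert(slot < 4 && tint < TIEN);

	reg |= tint << TSSEL_SHIFT(slot);
	fake_writel(r, reg);

	reg |= TIEN << TSSEL_SHIFT(slot);
	fake_writel(r, reg);
}
```

The single write in the current code would store `(TIEN | tint)` in one step; the sketch only splits that into two writes so the selection is visible to the hardware before the enable bit.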

Cheers,
Biju

Patch

diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c
index 33a22bafedcd..78a9e90512a6 100644
--- a/drivers/irqchip/irq-renesas-rzg2l.c
+++ b/drivers/irqchip/irq-renesas-rzg2l.c
@@ -144,6 +144,12 @@  static void rzg2l_irqc_irq_enable(struct irq_data *d)
 		reg = readl_relaxed(priv->base + TSSR(tssr_index));
 		reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset);
 		writel_relaxed(reg, priv->base + TSSR(tssr_index));
+		/*
+		 * In case of edge trigger detection, enabling the TINT source
+		 * causes a phantom interrupt that leads to an irq storm. So
+		 * clear the phantom interrupt.
+		 */
+		rzg2l_tint_eoi(d);
 		raw_spin_unlock(&priv->lock);
 		irq_chip_unmask_parent(d);
 	}