diff mbox series

[1/3] can: rcar_canfd: Fix IRQ storm on global fifo receive

Message ID 20221022081503.1051257-2-biju.das.jz@bp.renesas.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series R-Car CANFD fixes | expand

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers fail 1 blamed authors not CCed: ramesh.shanmugasundaram@bp.renesas.com; 1 maintainers not CCed: ramesh.shanmugasundaram@bp.renesas.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 22 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Biju Das Oct. 22, 2022, 8:15 a.m. UTC
We are seeing IRQ storm on global receive IRQ line under heavy CAN
bus load conditions with both CAN channels are enabled.

Conditions:
  The global receive IRQ line is shared between can0 and can1, either
  of the channels can trigger interrupt while the other channel irq
  line is disabled(rfie).
  When global receive IRQ interrupt occurs, we mask the interrupt in
  irqhandler. Clearing and unmasking of the interrupt is happening in
  rx_poll(). There is a race condition where rx_poll unmask the
  interrupt, but the next irq handler does not mask the irq due to
  NAPIF_STATE_MISSED flag(for eg: can0 rx fifo interrupt enable is
  disabled and can1 is triggering rx interrupt, the delay in rx_poll()
  processing results in setting NAPIF_STATE_MISSED flag) leading to IRQ
  storm.

This patch fixes the issue by checking irq is masked or not in
irq handler and it masks the interrupt if it is unmasked.

Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
 drivers/net/can/rcar/rcar_canfd.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Marc Kleine-Budde Oct. 24, 2022, 3:37 p.m. UTC | #1
On 22.10.2022 09:15:01, Biju Das wrote:
> We are seeing IRQ storm on global receive IRQ line under heavy CAN
> bus load conditions with both CAN channels are enabled.
> 
> Conditions:
>   The global receive IRQ line is shared between can0 and can1, either
>   of the channels can trigger interrupt while the other channel irq
>   line is disabled(rfie).
>   When global receive IRQ interrupt occurs, we mask the interrupt in
>   irqhandler. Clearing and unmasking of the interrupt is happening in
>   rx_poll(). There is a race condition where rx_poll unmask the
>   interrupt, but the next irq handler does not mask the irq due to
>   NAPIF_STATE_MISSED flag.

Why does this happen? Is it a problem that you call
rcar_canfd_handle_global_receive() for a channel that has the IRQs
actually disabled in hardware?

>   (for eg: can0 rx fifo interrupt enable is
>   disabled and can1 is triggering rx interrupt, the delay in rx_poll()
>   processing results in setting NAPIF_STATE_MISSED flag) leading to IRQ
>   storm.
> 
> This patch fixes the issue by checking irq is masked or not in
> irq handler and it masks the interrupt if it is unmasked.

Marc
Marc Kleine-Budde Oct. 24, 2022, 3:53 p.m. UTC | #2
On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> On 22.10.2022 09:15:01, Biju Das wrote:
> > We are seeing IRQ storm on global receive IRQ line under heavy CAN
> > bus load conditions with both CAN channels are enabled.
> > 
> > Conditions:
> >   The global receive IRQ line is shared between can0 and can1, either
> >   of the channels can trigger interrupt while the other channel irq
> >   line is disabled(rfie).
> >   When global receive IRQ interrupt occurs, we mask the interrupt in
> >   irqhandler. Clearing and unmasking of the interrupt is happening in
> >   rx_poll(). There is a race condition where rx_poll unmask the
> >   interrupt, but the next irq handler does not mask the irq due to
> >   NAPIF_STATE_MISSED flag.
> 
> Why does this happen? Is it a problem that you call
> rcar_canfd_handle_global_receive() for a channel that has the IRQs
> actually disabled in hardware?

Can you check if the IRQ is active _and_ enabled before handling the IRQ
on a particular channel?

A more clearer approach would be to get rid of the global interrupt
handlers at all. If the hardware only given 1 IRQ line for more than 1
channel, the driver would register an IRQ handler for each channel (with
the shared attribute). The IRQ handler must check, if the IRQ is pending
and enabled. If not return IRQ_NONE, otherwise handle and return IRQ_HANDLED.

Marc
Biju Das Oct. 24, 2022, 4:42 p.m. UTC | #3
Hi Marc,

> Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
> 
> On 22.10.2022 09:15:01, Biju Das wrote:
> > We are seeing IRQ storm on global receive IRQ line under heavy CAN
> bus
> > load conditions with both CAN channels are enabled.
> >
> > Conditions:
> >   The global receive IRQ line is shared between can0 and can1,
> either
> >   of the channels can trigger interrupt while the other channel irq
> >   line is disabled(rfie).
> >   When global receive IRQ interrupt occurs, we mask the interrupt in
> >   irqhandler. Clearing and unmasking of the interrupt is happening
> in
> >   rx_poll(). There is a race condition where rx_poll unmask the
> >   interrupt, but the next irq handler does not mask the irq due to
> >   NAPIF_STATE_MISSED flag.
> 
> Why does this happen?

It is due to race between rx_poll() and interrupt triggered by other
Channel.

> Is it a problem that you call
> rcar_canfd_handle_global_receive() for a channel that has the IRQs
> actually disabled in hardware?

Yes, Due to other channel triggering interrupt and executing
the same call for channel0 again. 

Scenario:
 Channel0 IRQ line is disabled because of RXFiFo ch0 status in IRQ
and it schedule NAPI call. Before executing rx_poll, you get another
interrupt due to channel1 IRQ. Since RXFifo status is still set,
it will call napi_sched_prep() and state become missed.

Assume rx_poll() called it clear and unmask the IRQ line. This
time we get an IRQ from Channel0, since the state is missed state,
the line will be unmasked and we get IRQ storm

Finally, It will be like this you have an interrupt, which is not cleared
Leading to IRQ storm.

Cheers,
Biju
Biju Das Oct. 24, 2022, 4:55 p.m. UTC | #4
Hi Marc,
> Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
> 
> On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> > On 22.10.2022 09:15:01, Biju Das wrote:
> > > We are seeing IRQ storm on global receive IRQ line under heavy CAN
> > > bus load conditions with both CAN channels are enabled.
> > >
> > > Conditions:
> > >   The global receive IRQ line is shared between can0 and can1,
> either
> > >   of the channels can trigger interrupt while the other channel
> irq
> > >   line is disabled(rfie).
> > >   When global receive IRQ interrupt occurs, we mask the interrupt
> in
> > >   irqhandler. Clearing and unmasking of the interrupt is happening
> in
> > >   rx_poll(). There is a race condition where rx_poll unmask the
> > >   interrupt, but the next irq handler does not mask the irq due to
> > >   NAPIF_STATE_MISSED flag.
> >
> > Why does this happen? Is it a problem that you call
> > rcar_canfd_handle_global_receive() for a channel that has the IRQs
> > actually disabled in hardware?
> 
> Can you check if the IRQ is active _and_ enabled before handling the
> IRQ on a particular channel?

You mean IRQ handler or rx_poll()??

IRQ handler check the status and disable(mask) the IRQ line.
rx_poll() clears the status and enable(unmask) the IRQ line.

Status flag is set by HW while line is in disabled/enabled state.

Channel0 and channel1 has 2 IRQ lines within the IP which is ored together
to provide global receive interrupt(shared line).


> 
> A more clearer approach would be to get rid of the global interrupt
> handlers at all. If the hardware only given 1 IRQ line for more than 1
> channel, the driver would register an IRQ handler for each channel
> (with the shared attribute). The IRQ handler must check, if the IRQ is
> pending and enabled. If not return IRQ_NONE, otherwise handle and
> return IRQ_HANDLED.

That involves restructuring the IRQ handler altogether.

RZ/G2L has shared line for rx fifos {ch0 and ch1} -> 2 IRQ routine with shared attributes
R-Car SoCs has shared line for rx fifos {ch0 and ch1} and error interrupts->3 IRQ routines with shared attributes.
R-CarV3U SoCs has shared line for rx fifos {ch0 to ch8} and error interrupts->9 IRQ routines with shared attributes.

Yes, I can send follow up patches for migrating to shared interrupt handlers as enhancement.
Please let me know.

Cheers,
Biju
Marc Kleine-Budde Oct. 24, 2022, 6:07 p.m. UTC | #5
On 24.10.2022 16:55:56, Biju Das wrote:
> Hi Marc,
> > Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> > receive
> > 
> > On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> > > On 22.10.2022 09:15:01, Biju Das wrote:
> > > > We are seeing IRQ storm on global receive IRQ line under heavy CAN
> > > > bus load conditions with both CAN channels are enabled.
> > > >
> > > > Conditions:
> > > >   The global receive IRQ line is shared between can0 and can1,
> > either
> > > >   of the channels can trigger interrupt while the other channel
> > irq
> > > >   line is disabled(rfie).
> > > >   When global receive IRQ interrupt occurs, we mask the interrupt
> > in
> > > >   irqhandler. Clearing and unmasking of the interrupt is happening
> > in
> > > >   rx_poll(). There is a race condition where rx_poll unmask the
> > > >   interrupt, but the next irq handler does not mask the irq due to
> > > >   NAPIF_STATE_MISSED flag.
> > >
> > > Why does this happen? Is it a problem that you call
> > > rcar_canfd_handle_global_receive() for a channel that has the IRQs
> > > actually disabled in hardware?
> > 
> > Can you check if the IRQ is active _and_ enabled before handling the
> > IRQ on a particular channel?
> 
> You mean IRQ handler or rx_poll()??

I mean the IRQ handler.

Consider the IRQ for channel0 is disabled but active and the IRQ for
channel1 is enabled and active. The
rcar_canfd_global_receive_fifo_interrupt() will iterate over both
channels, and rcar_canfd_handle_global_receive() will serve the channel0
IRQ, even if the IRQ is _not_ enabled. So I suggested to only handle a
channel's RX IRQ if that IRQ is actually enabled.

Assuming "cc & RCANFD_RFCC_RFI" checks if IRQ is enabled:

index 567620d215f8..ea828c1bd3a1 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1157,11 +1157,13 @@ static void rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
 {
        struct rcar_canfd_channel *priv = gpriv->ch[ch];
        u32 ridx = ch + RCANFD_RFFIFO_IDX;
-       u32 sts;
+       u32 sts, cc;
 
        /* Handle Rx interrupts */
        sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv, ridx));
-       if (likely(sts & RCANFD_RFSTS_RFIF)) {
+       cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
+       if (likely(sts & RCANFD_RFSTS_RFIF &&
+                  cc & RCANFD_RFCC_RFIE)) {
                if (napi_schedule_prep(&priv->napi)) {
                        /* Disable Rx FIFO interrupts */
                        rcar_canfd_clear_bit(priv->base,

Please check if that fixes your issue.

> IRQ handler check the status and disable(mask) the IRQ line.
> rx_poll() clears the status and enable(unmask) the IRQ line.
> 
> Status flag is set by HW while line is in disabled/enabled state.
> 
> Channel0 and channel1 has 2 IRQ lines within the IP which is ored together
> to provide global receive interrupt(shared line).

> > A more clearer approach would be to get rid of the global interrupt
> > handlers at all. If the hardware only given 1 IRQ line for more than 1
> > channel, the driver would register an IRQ handler for each channel
> > (with the shared attribute). The IRQ handler must check, if the IRQ is
                     ^^^^^^^^^
That should be "flag".

> > pending and enabled. If not return IRQ_NONE, otherwise handle and
> > return IRQ_HANDLED.
> 
> That involves restructuring the IRQ handler altogether.

ACK

> RZ/G2L has shared line for rx fifos {ch0 and ch1} -> 2 IRQ routine
> with shared attributes.

It's the same IRQ handler (or IRQ routine), but called 1x for each
channel, so 2x in total. The SHARED is actually a IRQ flag in the 4th
argument in the devm_request_irq() function.

| devm_request_irq(..., ..., ..., IRQF_SHARED, ..., ...);

> R-Car SoCs has shared line for rx fifos {ch0 and ch1} and error
> interrupts->3 IRQ routines with shared attributes.

> R-CarV3U SoCs has shared line for rx fifos {ch0 to ch8} and error
> interrupts->9 IRQ routines with shared attributes.

I think you got the point, I just wanted to point out the usual way they
are called.

> Yes, I can send follow up patches for migrating to shared interrupt
> handlers as enhancement. Please let me know.

Please check if my patch snippet from above works. To fix the IRQ storm
problem I'd like to have a simple and short solution that can go into
stable before restructuring the IRQ handlers.

regards,
Marc
Biju Das Oct. 24, 2022, 6:31 p.m. UTC | #6
Hi Marc,

> Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
> 
> On 24.10.2022 16:55:56, Biju Das wrote:
> > Hi Marc,
> > > Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global
> > > fifo receive
> > >
> > > On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> > > > On 22.10.2022 09:15:01, Biju Das wrote:
> > > > > We are seeing IRQ storm on global receive IRQ line under heavy
> > > > > CAN bus load conditions with both CAN channels are enabled.
> > > > >
> > > > > Conditions:
> > > > >   The global receive IRQ line is shared between can0 and can1,
> > > either
> > > > >   of the channels can trigger interrupt while the other
> channel
> > > irq
> > > > >   line is disabled(rfie).
> > > > >   When global receive IRQ interrupt occurs, we mask the
> > > > > interrupt
> > > in
> > > > >   irqhandler. Clearing and unmasking of the interrupt is
> > > > > happening
> > > in
> > > > >   rx_poll(). There is a race condition where rx_poll unmask
> the
> > > > >   interrupt, but the next irq handler does not mask the irq
> due to
> > > > >   NAPIF_STATE_MISSED flag.
> > > >
> > > > Why does this happen? Is it a problem that you call
> > > > rcar_canfd_handle_global_receive() for a channel that has the
> IRQs
> > > > actually disabled in hardware?
> > >
> > > Can you check if the IRQ is active _and_ enabled before handling
> the
> > > IRQ on a particular channel?
> >
> > You mean IRQ handler or rx_poll()??
> 
> I mean the IRQ handler.
> 
> Consider the IRQ for channel0 is disabled but active and the IRQ for
> channel1 is enabled and active. The
> rcar_canfd_global_receive_fifo_interrupt() will iterate over both
> channels, and rcar_canfd_handle_global_receive() will serve the
> channel0 IRQ, even if the IRQ is _not_ enabled. So I suggested to only
> handle a channel's RX IRQ if that IRQ is actually enabled.
> 
> Assuming "cc & RCANFD_RFCC_RFI" checks if IRQ is enabled:


> 
> index 567620d215f8..ea828c1bd3a1 100644
> --- a/drivers/net/can/rcar/rcar_canfd.c
> +++ b/drivers/net/can/rcar/rcar_canfd.c
> @@ -1157,11 +1157,13 @@ static void
> rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
> {
>         struct rcar_canfd_channel *priv = gpriv->ch[ch];
>         u32 ridx = ch + RCANFD_RFFIFO_IDX;
> -       u32 sts;
> +       u32 sts, cc;
> 
>         /* Handle Rx interrupts */
>         sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv, ridx));
> -       if (likely(sts & RCANFD_RFSTS_RFIF)) {
> +       cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
> +       if (likely(sts & RCANFD_RFSTS_RFIF &&
> +                  cc & RCANFD_RFCC_RFIE)) {
>                 if (napi_schedule_prep(&priv->napi)) {
>                         /* Disable Rx FIFO interrupts */
>                         rcar_canfd_clear_bit(priv->base,
> 
> Please check if that fixes your issue.

Looks like your solution also will work.

Tomorrow will check and provide you feedback.

> 
> > IRQ handler check the status and disable(mask) the IRQ line.
> > rx_poll() clears the status and enable(unmask) the IRQ line.
> >
> > Status flag is set by HW while line is in disabled/enabled state.
> >
> > Channel0 and channel1 has 2 IRQ lines within the IP which is ored
> > together to provide global receive interrupt(shared line).
> 
> > > A more clearer approach would be to get rid of the global
> interrupt
> > > handlers at all. If the hardware only given 1 IRQ line for more
> than
> > > 1 channel, the driver would register an IRQ handler for each
> channel
> > > (with the shared attribute). The IRQ handler must check, if the
> IRQ
> > > is
>                      ^^^^^^^^^
> That should be "flag".
OK.

> 
> > > pending and enabled. If not return IRQ_NONE, otherwise handle and
> > > return IRQ_HANDLED.
> >
> > That involves restructuring the IRQ handler altogether.
> 
> ACK
> 
> > RZ/G2L has shared line for rx fifos {ch0 and ch1} -> 2 IRQ routine
> > with shared attributes.
> 
> It's the same IRQ handler (or IRQ routine), but called 1x for each
> channel, so 2x in total. The SHARED is actually a IRQ flag in the 4th
> argument in the devm_request_irq() function.
> 
> | devm_request_irq(..., ..., ..., IRQF_SHARED, ..., ...);
> 
> > R-Car SoCs has shared line for rx fifos {ch0 and ch1} and error
> > interrupts->3 IRQ routines with shared attributes.
> 
> > R-CarV3U SoCs has shared line for rx fifos {ch0 to ch8} and error
> > interrupts->9 IRQ routines with shared attributes.
> 
> I think you got the point, I just wanted to point out the usual way
> they are called.
> 
> > Yes, I can send follow up patches for migrating to shared interrupt
> > handlers as enhancement. Please let me know.
> 
> Please check if my patch snippet from above works. To fix the IRQ
> storm problem I'd like to have a simple and short solution that can go
> into stable before restructuring the IRQ handlers.

OK, Tomorrow will provide you the feedback.

Cheers,
Biju
Biju Das Oct. 25, 2022, 3:50 p.m. UTC | #7
Hi Marc,

> Subject: RE: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global fifo
> receive
> 
> Hi Marc,
> 
> > Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on global
> fifo
> > receive
> >
> > On 24.10.2022 16:55:56, Biju Das wrote:
> > > Hi Marc,
> > > > Subject: Re: [PATCH 1/3] can: rcar_canfd: Fix IRQ storm on
> global
> > > > fifo receive
> > > >
> > > > On 24.10.2022 17:37:35, Marc Kleine-Budde wrote:
> > > > > On 22.10.2022 09:15:01, Biju Das wrote:
> > > > > > We are seeing IRQ storm on global receive IRQ line under
> heavy
> > > > > > CAN bus load conditions with both CAN channels are enabled.
> > > > > >
> > > > > > Conditions:
> > > > > >   The global receive IRQ line is shared between can0 and
> can1,
> > > > either
> > > > > >   of the channels can trigger interrupt while the other
> > channel
> > > > irq
> > > > > >   line is disabled(rfie).
> > > > > >   When global receive IRQ interrupt occurs, we mask the
> > > > > > interrupt
> > > > in
> > > > > >   irqhandler. Clearing and unmasking of the interrupt is
> > > > > > happening
> > > > in
> > > > > >   rx_poll(). There is a race condition where rx_poll unmask
> > the
> > > > > >   interrupt, but the next irq handler does not mask the irq
> > due to
> > > > > >   NAPIF_STATE_MISSED flag.
> > > > >
> > > > > Why does this happen? Is it a problem that you call
> > > > > rcar_canfd_handle_global_receive() for a channel that has the
> > IRQs
> > > > > actually disabled in hardware?
> > > >
> > > > Can you check if the IRQ is active _and_ enabled before handling
> > the
> > > > IRQ on a particular channel?
> > >
> > > You mean IRQ handler or rx_poll()??
> >
> > I mean the IRQ handler.
> >
> > Consider the IRQ for channel0 is disabled but active and the IRQ for
> > channel1 is enabled and active. The
> > rcar_canfd_global_receive_fifo_interrupt() will iterate over both
> > channels, and rcar_canfd_handle_global_receive() will serve the
> > channel0 IRQ, even if the IRQ is _not_ enabled. So I suggested to
> only
> > handle a channel's RX IRQ if that IRQ is actually enabled.
> >
> > Assuming "cc & RCANFD_RFCC_RFI" checks if IRQ is enabled:
> 
> 
> >
> > index 567620d215f8..ea828c1bd3a1 100644
> > --- a/drivers/net/can/rcar/rcar_canfd.c
> > +++ b/drivers/net/can/rcar/rcar_canfd.c
> > @@ -1157,11 +1157,13 @@ static void
> > rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
> {
> >         struct rcar_canfd_channel *priv = gpriv->ch[ch];
> >         u32 ridx = ch + RCANFD_RFFIFO_IDX;
> > -       u32 sts;
> > +       u32 sts, cc;
> >
> >         /* Handle Rx interrupts */
> >         sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv,
> ridx));
> > -       if (likely(sts & RCANFD_RFSTS_RFIF)) {
> > +       cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
> > +       if (likely(sts & RCANFD_RFSTS_RFIF &&
> > +                  cc & RCANFD_RFCC_RFIE)) {
> >                 if (napi_schedule_prep(&priv->napi)) {
> >                         /* Disable Rx FIFO interrupts */
> >                         rcar_canfd_clear_bit(priv->base,
> >
> > Please check if that fixes your issue.

Yes, it fixes the issue.

> 
> >
> > > IRQ handler check the status and disable(mask) the IRQ line.
> > > rx_poll() clears the status and enable(unmask) the IRQ line.
> > >
> > > Status flag is set by HW while line is in disabled/enabled state.
> > >
> > > Channel0 and channel1 has 2 IRQ lines within the IP which is ored
> > > together to provide global receive interrupt(shared line).
> >
> > > > A more clearer approach would be to get rid of the global
> > interrupt
> > > > handlers at all. If the hardware only given 1 IRQ line for more
> > than
> > > > 1 channel, the driver would register an IRQ handler for each
> > channel
> > > > (with the shared attribute). The IRQ handler must check, if the
> > IRQ
> > > > is
> >                      ^^^^^^^^^
> > That should be "flag".
> OK.
> 
> >
> > > > pending and enabled. If not return IRQ_NONE, otherwise handle
> and
> > > > return IRQ_HANDLED.
> > >
> > > That involves restructuring the IRQ handler altogether.
> >
> > ACK
> >
> > > RZ/G2L has shared line for rx fifos {ch0 and ch1} -> 2 IRQ routine
> > > with shared attributes.
> >
> > It's the same IRQ handler (or IRQ routine), but called 1x for each
> > channel, so 2x in total. The SHARED is actually a IRQ flag in the
> 4th
> > argument in the devm_request_irq() function.
> >
> > | devm_request_irq(..., ..., ..., IRQF_SHARED, ..., ...);
> >
> > > R-Car SoCs has shared line for rx fifos {ch0 and ch1} and error
> > > interrupts->3 IRQ routines with shared attributes.
> >
> > > R-CarV3U SoCs has shared line for rx fifos {ch0 to ch8} and error
> > > interrupts->9 IRQ routines with shared attributes.
> >
> > I think you got the point, I just wanted to point out the usual way
> > they are called.
> >
> > > Yes, I can send follow up patches for migrating to shared
> interrupt
> > > handlers as enhancement. Please let me know.
> >
> > Please check if my patch snippet from above works. To fix the IRQ
> > storm problem I'd like to have a simple and short solution that can
> go
> > into stable before restructuring the IRQ handlers.
> 
> OK, Tomorrow will provide you the feedback.

I will send V2 with these changes.

Cheers,
Biju
Marc Kleine-Budde Oct. 25, 2022, 5:44 p.m. UTC | #8
On 25.10.2022 15:50:18, Biju Das wrote:
[...]
> > > index 567620d215f8..ea828c1bd3a1 100644
> > > --- a/drivers/net/can/rcar/rcar_canfd.c
> > > +++ b/drivers/net/can/rcar/rcar_canfd.c
> > > @@ -1157,11 +1157,13 @@ static void
> > > rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
> > {
> > >         struct rcar_canfd_channel *priv = gpriv->ch[ch];
> > >         u32 ridx = ch + RCANFD_RFFIFO_IDX;
> > > -       u32 sts;
> > > +       u32 sts, cc;
> > >
> > >         /* Handle Rx interrupts */
> > >         sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv,
> > ridx));
> > > -       if (likely(sts & RCANFD_RFSTS_RFIF)) {
> > > +       cc = rcar_canfd_read(priv->base, RCANFD_RFCC(gpriv, ridx));
> > > +       if (likely(sts & RCANFD_RFSTS_RFIF &&
> > > +                  cc & RCANFD_RFCC_RFIE)) {
> > >                 if (napi_schedule_prep(&priv->napi)) {
> > >                         /* Disable Rx FIFO interrupts */
> > >                         rcar_canfd_clear_bit(priv->base,
> > >
> > > Please check if that fixes your issue.
> 
> Yes, it fixes the issue.

\o/

> I will send V2 with these changes.

Thanks,
Marc
diff mbox series

Patch

diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 567620d215f8..1a3ae0b232c9 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1157,7 +1157,7 @@  static void rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
 {
 	struct rcar_canfd_channel *priv = gpriv->ch[ch];
 	u32 ridx = ch + RCANFD_RFFIFO_IDX;
-	u32 sts;
+	u32 sts, rfie;
 
 	/* Handle Rx interrupts */
 	sts = rcar_canfd_read(priv->base, RCANFD_RFSTS(gpriv, ridx));
@@ -1168,6 +1168,14 @@  static void rcar_canfd_handle_global_receive(struct rcar_canfd_global *gpriv, u3
 					     RCANFD_RFCC(gpriv, ridx),
 					     RCANFD_RFCC_RFIE);
 			__napi_schedule(&priv->napi);
+		} else {
+			rfie = rcar_canfd_read(priv->base,
+					       RCANFD_RFCC(gpriv, ridx));
+			if (rfie & RCANFD_RFCC_RFIE)
+				/* Disable Rx FIFO interrupts */
+				rcar_canfd_clear_bit(priv->base,
+						     RCANFD_RFCC(gpriv, ridx),
+						     RCANFD_RFCC_RFIE);
 		}
 	}
 }