diff mbox series

[v1] usb: dwc3: gadget: Prevent irq storm when TH re-executes

Message ID 20250124075911.811594-1-badhri@google.com (mailing list archive)
State New
Headers show
Series [v1] usb: dwc3: gadget: Prevent irq storm when TH re-executes | expand

Commit Message

Badhri Jagan Sridharan Jan. 24, 2025, 7:59 a.m. UTC
While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
in event cache") makes sure that top half(TH) does not end up overwriting
the cached events before processing them when the TH gets invoked more
than one time, returning IRQ_HANDLED results in occasional irq storm
where the TH hogs the CPU. The irq storm can be prevented if
IRQ_WAKE_THREAD is returned.

ftrace event stub during dwc3 irq storm:
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
    irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
    ....

Cc: stable@kernel.org
Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
---
 drivers/usb/dwc3/gadget.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 70cd0576aa39c55aabd227851cba0c601e811fb6

Comments

Thinh Nguyen Jan. 28, 2025, 2:44 a.m. UTC | #1
On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> in event cache") makes sure that top half(TH) does not end up overwriting
> the cached events before processing them when the TH gets invoked more
> than one time, returning IRQ_HANDLED results in occasional irq storm
> where the TH hogs the CPU. The irq storm can be prevented if
> IRQ_WAKE_THREAD is returned.
> 
> ftrace event stub during dwc3 irq storm:
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
>     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
>     ....
> 
> Cc: stable@kernel.org
> Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")

I don't think this should be Cc to stable, at least not the way it is
right now.

> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> ---
>  drivers/usb/dwc3/gadget.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index d27af65eb08a..376ab75adc4e 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>  	 * losing events.
>  	 */
>  	if (evt->flags & DWC3_EVENT_PENDING)
> -		return IRQ_HANDLED;
> +		return IRQ_WAKE_THREAD;

This looks like a workaround to the issue we have. Have you tried to
enable imod instead? It's the feature to help avoid these kind of issue.

Thanks,
Thinh

>  
>  	count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>  	count &= DWC3_GEVNTCOUNT_MASK;
> 
> base-commit: 70cd0576aa39c55aabd227851cba0c601e811fb6
> -- 
> 2.48.1.262.g85cc9f2d1e-goog
>
Badhri Jagan Sridharan Jan. 29, 2025, 8:30 a.m. UTC | #2
On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > in event cache") makes sure that top half(TH) does not end up overwriting
> > the cached events before processing them when the TH gets invoked more
> > than one time, returning IRQ_HANDLED results in occasional irq storm
> > where the TH hogs the CPU. The irq storm can be prevented if
> > IRQ_WAKE_THREAD is returned.
> >
> > ftrace event stub during dwc3 irq storm:
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> >     ....
> >
> > Cc: stable@kernel.org
> > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
>
> I don't think this should be Cc to stable, at least not the way it is
> right now.
>
> > Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > ---
> >  drivers/usb/dwc3/gadget.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index d27af65eb08a..376ab75adc4e 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> >        * losing events.
> >        */
> >       if (evt->flags & DWC3_EVENT_PENDING)
> > -             return IRQ_HANDLED;
> > +             return IRQ_WAKE_THREAD;
>
> This looks like a workaround to the issue we have. Have you tried to
> enable imod instead? It's the feature to help avoid these kind of issue.

Hi Thinh,

Thanks for the feedback ! Yes, we have been experimenting with the
interrupt moderation interval as well.
Have follow up questions though, please bear with me !

1. Given that when DWC3_EVENT_PENDING  is still set,
dwc3_check_event_buf() is still waiting for the previous cached events
to be processed by the dwc3_thread_interrupt(), what's the reasoning
behind returning IRQ_HANDLED here ? Shouldn't we be returning
IRQ_WAKE_THREAD anyways ?

2. When interrupt moderation is enabled, does DEVICE_IMODC start to
decrement as soon as the interrupt is masked (where I expect that the
interrupt line gets de-asserted by the controller) in
dwc3_check_event_buf()  ?

/* Mask interrupt */
dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
   DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));

Regards,
Badhri




>
> Thanks,
> Thinh
>
> >
> >       count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> >       count &= DWC3_GEVNTCOUNT_MASK;
> >
> > base-commit: 70cd0576aa39c55aabd227851cba0c601e811fb6
> > --
> > 2.48.1.262.g85cc9f2d1e-goog
> >
Thinh Nguyen Jan. 30, 2025, 1:42 a.m. UTC | #3
On Wed, Jan 29, 2025, Badhri Jagan Sridharan wrote:
> On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> >
> > On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > > in event cache") makes sure that top half(TH) does not end up overwriting
> > > the cached events before processing them when the TH gets invoked more
> > > than one time, returning IRQ_HANDLED results in occasional irq storm
> > > where the TH hogs the CPU. The irq storm can be prevented if
> > > IRQ_WAKE_THREAD is returned.
> > >
> > > ftrace event stub during dwc3 irq storm:
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> > >     ....
> > >
> > > Cc: stable@kernel.org
> > > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
> >
> > I don't think this should be Cc to stable, at least not the way it is
> > right now.
> >
> > > Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > > ---
> > >  drivers/usb/dwc3/gadget.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > index d27af65eb08a..376ab75adc4e 100644
> > > --- a/drivers/usb/dwc3/gadget.c
> > > +++ b/drivers/usb/dwc3/gadget.c
> > > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > >        * losing events.
> > >        */
> > >       if (evt->flags & DWC3_EVENT_PENDING)
> > > -             return IRQ_HANDLED;
> > > +             return IRQ_WAKE_THREAD;
> >
> > This looks like a workaround to the issue we have. Have you tried to
> > enable imod instead? It's the feature to help avoid these kind of issue.
> 
> Hi Thinh,
> 
> Thanks for the feedback ! Yes, we have been experimenting with the
> interrupt moderation interval as well.
> Have follow up questions though, please bear with me !
> 
> 1. Given that when DWC3_EVENT_PENDING  is still set,
> dwc3_check_event_buf() is still waiting for the previous cached events
> to be processed by the dwc3_thread_interrupt(), what's the reasoning
> behind returning IRQ_HANDLED here ? Shouldn't we be returning
> IRQ_WAKE_THREAD anyways ?

Currently dwc3 is implemented such that it will not process new events
until the BH is done with its work. The DWC3_EVENT_PENDING flag
indicates when the events are processed. With this expectation, we
should not schedule the BH as the events are still being handled.

In your case, there's a small window where the TH may be scheduled but
the DWC3_EVENT_PENDING flag is not cleared yet. Returning
IRQ_WAKE_THREAD may workaround your issue because the BH may already be
running when DWC3_EVENT_PENDING is set. I'm not sure what other side
effect this may have since we're breaking this expectation.

We may enhance the dwc3 handling of event flow in the future to improve
this. But at the moment, we should not return IRQ_WAKE_THREAD here.

> 
> 2. When interrupt moderation is enabled, does DEVICE_IMODC start to
> decrement as soon as the interrupt is masked (where I expect that the
> interrupt line gets de-asserted by the controller) in
> dwc3_check_event_buf()  ?
> 
> /* Mask interrupt */
> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
>    DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));
> 

The DEVICE_IMODC is loaded with DEVICE_IMODI and starts to decrement as
soon as the interrupt is de-asserted from the asserted state, which
includes when the interrupt is masked. You brought up a good question
here. The IMOD count may already be 0 when we exit the BH. Can you try
this experiment to update the count and let me know if it helps:

Note: not tested.

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 0fe92c0fb520..62aaac31ca68 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -5739,7 +5739,8 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
 
        if (dwc->imod_interval) {
                dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), DWC3_GEVNTCOUNT_EHB);
-               dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
+               dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0),
+                           (dwc->imod_interval << 16) | dwc->imod_interval);
        }
 
        /* Keep the clearing of DWC3_EVENT_PENDING at the end */


Thanks,
Thinh
Badhri Jagan Sridharan Jan. 31, 2025, 10:39 a.m. UTC | #4
On Wed, Jan 29, 2025 at 5:42 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> On Wed, Jan 29, 2025, Badhri Jagan Sridharan wrote:
> > On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > >
> > > On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > > > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > > > in event cache") makes sure that top half(TH) does not end up overwriting
> > > > the cached events before processing them when the TH gets invoked more
> > > > than one time, returning IRQ_HANDLED results in occasional irq storm
> > > > where the TH hogs the CPU. The irq storm can be prevented if
> > > > IRQ_WAKE_THREAD is returned.
> > > >
> > > > ftrace event stub during dwc3 irq storm:
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> > > >     ....
> > > >
> > > > Cc: stable@kernel.org
> > > > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
> > >
> > > I don't think this should be Cc to stable, at least not the way it is
> > > right now.
> > >
> > > > Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > > > ---
> > > >  drivers/usb/dwc3/gadget.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > index d27af65eb08a..376ab75adc4e 100644
> > > > --- a/drivers/usb/dwc3/gadget.c
> > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > > >        * losing events.
> > > >        */
> > > >       if (evt->flags & DWC3_EVENT_PENDING)
> > > > -             return IRQ_HANDLED;
> > > > +             return IRQ_WAKE_THREAD;
> > >
> > > This looks like a workaround to the issue we have. Have you tried to
> > > enable imod instead? It's the feature to help avoid these kind of issue.
> >
> > Hi Thinh,
> >
> > Thanks for the feedback ! Yes, we have been experimenting with the
> > interrupt moderation interval as well.
> > Have follow up questions though, please bear with me !
> >
> > 1. Given that when DWC3_EVENT_PENDING  is still set,
> > dwc3_check_event_buf() is still waiting for the previous cached events
> > to be processed by the dwc3_thread_interrupt(), what's the reasoning
> > behind returning IRQ_HANDLED here ? Shouldn't we be returning
> > IRQ_WAKE_THREAD anyways ?
>
> Currently dwc3 is implemented such that it will not process new events
> until the BH is done with its work. The DWC3_EVENT_PENDING flag
> indicates when the events are processed. With this expectation, we
> should not schedule the BH as the events are still being handled.


Hi Thinh,

Thanks for sharing your thoughts !
Given that the intention of the design is to keep top half
(dwc3_check_event_buf()) and bottom half (dwc3_process_event_buf())
mutually exclusive, Is there a reason why the threaded interrupt
should not be marked with IRQF_ONESHOT ? Marking it IRQF_ONESHOT makes
the threaded irq framework to ensure mutual exclusivity for us. I
was validating this and this seems to be pretty effective. Curious to
know your thoughts !

>
>
> In your case, there's a small window where the TH may be scheduled but
> the DWC3_EVENT_PENDING flag is not cleared yet. Returning
> IRQ_WAKE_THREAD may workaround your issue because the BH may already be
> running when DWC3_EVENT_PENDING is set. I'm not sure what other side
> effect this may have since we're breaking this expectation.
>
> We may enhance the dwc3 handling of event flow in the future to improve
> this. But at the moment, we should not return IRQ_WAKE_THREAD here.
>
> >
> > 2. When interrupt moderation is enabled, does DEVICE_IMODC start to
> > decrement as soon as the interrupt is masked (where I expect that the
> > interrupt line gets de-asserted by the controller) in
> > dwc3_check_event_buf()  ?
> >
> > /* Mask interrupt */
> > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> >    DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));
> >
>
> The DEVICE_IMODC is loaded with DEVICE_IMODI and starts to decrement as
> soon as the interrupt is de-asserted from the asserted state, which
> includes when the interrupt is masked. You brought up a good question
> here. The IMOD count may already be 0 when we exit the BH. Can you try
> this experiment to update the count and let me know if it helps:


Gave this a try, unfortunately this does not seem to help ! I see what
you are trying to do though. You are trying to explicitly re-arm the
timer. I was checking the register description as well and it does not seem to
guarantee that directly writing to DEVICE_IMODC restarts the counter
again.

Thanks,
Badhri


>
>
> Note: not tested.
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 0fe92c0fb520..62aaac31ca68 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -5739,7 +5739,8 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
>
>         if (dwc->imod_interval) {
>                 dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), DWC3_GEVNTCOUNT_EHB);
> -               dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
> +               dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0),
> +                           (dwc->imod_interval << 16) | dwc->imod_interval);
>         }
>
>         /* Keep the clearing of DWC3_EVENT_PENDING at the end */
>
>
> Thanks,
> Thinh
Thinh Nguyen Feb. 1, 2025, 12:09 a.m. UTC | #5
On Fri, Jan 31, 2025, Badhri Jagan Sridharan wrote:
> On Wed, Jan 29, 2025 at 5:42 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> >
> > On Wed, Jan 29, 2025, Badhri Jagan Sridharan wrote:
> > > On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > > >
> > > > On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > > > > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > > > > in event cache") makes sure that top half(TH) does not end up overwriting
> > > > > the cached events before processing them when the TH gets invoked more
> > > > > than one time, returning IRQ_HANDLED results in occasional irq storm
> > > > > where the TH hogs the CPU. The irq storm can be prevented if
> > > > > IRQ_WAKE_THREAD is returned.
> > > > >
> > > > > ftrace event stub during dwc3 irq storm:
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> > > > >     ....
> > > > >
> > > > > Cc: stable@kernel.org
> > > > > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
> > > >
> > > > I don't think this should be Cc to stable, at least not the way it is
> > > > right now.
> > > >
> > > > > Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > > > > ---
> > > > >  drivers/usb/dwc3/gadget.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > index d27af65eb08a..376ab75adc4e 100644
> > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > > > >        * losing events.
> > > > >        */
> > > > >       if (evt->flags & DWC3_EVENT_PENDING)
> > > > > -             return IRQ_HANDLED;
> > > > > +             return IRQ_WAKE_THREAD;
> > > >
> > > > This looks like a workaround to the issue we have. Have you tried to
> > > > enable imod instead? It's the feature to help avoid these kind of issue.
> > >
> > > Hi Thinh,
> > >
> > > Thanks for the feedback ! Yes, we have been experimenting with the
> > > interrupt moderation interval as well.
> > > Have follow up questions though, please bear with me !
> > >
> > > 1. Given that when DWC3_EVENT_PENDING  is still set,
> > > dwc3_check_event_buf() is still waiting for the previous cached events
> > > to be processed by the dwc3_thread_interrupt(), what's the reasoning
> > > behind returning IRQ_HANDLED here ? Shouldn't we be returning
> > > IRQ_WAKE_THREAD anyways ?
> >
> > Currently dwc3 is implemented such that it will not process new events
> > until the BH is done with its work. The DWC3_EVENT_PENDING flag
> > indicates when the events are processed. With this expectation, we
> > should not schedule the BH as the events are still being handled.
> 
> 
> Hi Thinh,
> 
> Thanks for sharing your thoughts !
> Given that the intention of the design is to keep top half
> (dwc3_check_event_buf()) and bottom half (dwc3_process_event_buf())
> mutually exclusive, Is there a reason why the threaded interrupt
> should not be marked with IRQF_ONESHOT ? Marking it IRQF_ONESHOT makes
> the threaded irq framework to ensure mutual exclusivity for us. I
> was validating this and this seems to be pretty effective. Curious to
> know your thoughts !


We shouldn't do that. There are designs with phy driver or connector
driver or other devices that share the same interrupt line.

> 
> >
> >
> > In your case, there's a small window where the TH may be scheduled but
> > the DWC3_EVENT_PENDING flag is not cleared yet. Returning
> > IRQ_WAKE_THREAD may workaround your issue because the BH may already be
> > running when DWC3_EVENT_PENDING is set. I'm not sure what other side
> > effect this may have since we're breaking this expectation.
> >
> > We may enhance the dwc3 handling of event flow in the future to improve
> > this. But at the moment, we should not return IRQ_WAKE_THREAD here.
> >
> > >
> > > 2. When interrupt moderation is enabled, does DEVICE_IMODC start to
> > > decrement as soon as the interrupt is masked (where I expect that the
> > > interrupt line gets de-asserted by the controller) in
> > > dwc3_check_event_buf()  ?
> > >
> > > /* Mask interrupt */
> > > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> > >    DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));
> > >
> >
> > The DEVICE_IMODC is loaded with DEVICE_IMODI and starts to decrement as
> > soon as the interrupt is de-asserted from the asserted state, which
> > includes when the interrupt is masked. You brought up a good question
> > here. The IMOD count may already be 0 when we exit the BH. Can you try
> > this experiment to update the count and let me know if it helps:
> 
> 
> Gave this a try, unfortunately this does not seem to help ! I see what
> you are trying to do though. You are trying to explicitly re-arm the
> timer. I was checking the register description as well and it does not seem to
> guarantee that directly writing to DEVICE_IMODC restarts the counter
> again.
> 

Hmm... Can you try this instead:

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 0fe92c0fb520..c1b5a3742ab4 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -5737,14 +5737,20 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
        dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
                    DWC3_GEVNTSIZ_SIZE(evt->length));
 
+       /*
+        * Keep the clearing of DWC3_EVENT_PENDING after the interrupt unmask
+        * but before the clearing of DWC3_GEVNTCOUNT_EHB.
+        */
+       evt->flags &= ~DWC3_EVENT_PENDING;
+
+       /* Ensure the flag is updated before clearing DWC3_GEVNTCOUNT_EHB */
+       wmb();
+
        if (dwc->imod_interval) {
                dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), DWC3_GEVNTCOUNT_EHB);
                dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
        }
 
-       /* Keep the clearing of DWC3_EVENT_PENDING at the end */
-       evt->flags &= ~DWC3_EVENT_PENDING;
-
        return ret;
 }


This should solve the issue for controllers with IMOD enabled.

Thanks,
Thinh
Badhri Jagan Sridharan Feb. 2, 2025, 3:59 a.m. UTC | #6
On Fri, Jan 31, 2025 at 4:09 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> On Fri, Jan 31, 2025, Badhri Jagan Sridharan wrote:
> > On Wed, Jan 29, 2025 at 5:42 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > >
> > > On Wed, Jan 29, 2025, Badhri Jagan Sridharan wrote:
> > > > On Mon, Jan 27, 2025 at 6:44 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > > > >
> > > > > On Fri, Jan 24, 2025, Badhri Jagan Sridharan wrote:
> > > > > > While commit d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events
> > > > > > in event cache") makes sure that top half(TH) does not end up overwriting
> > > > > > the cached events before processing them when the TH gets invoked more
> > > > > > than one time, returning IRQ_HANDLED results in occasional irq storm
> > > > > > where the TH hogs the CPU. The irq storm can be prevented if
> > > > > > IRQ_WAKE_THREAD is returned.
> > > > > >
> > > > > > ftrace event stub during dwc3 irq storm:
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000866: irq_handler_exit: irq=14 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000872: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000874: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000881: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000883: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000889: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000892: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000898: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000901: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000907: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000909: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000915: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000918: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000924: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000927: irq_handler_exit: irq=504 ret=handled
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000933: irq_handler_entry: irq=504 name=dwc3
> > > > > >     irq/504_dwc3-1111  ( 1111) [000] .... 70.000935: irq_handler_exit: irq=504 ret=handled
> > > > > >     ....
> > > > > >
> > > > > > Cc: stable@kernel.org
> > > > > > Fixes: d325a1de49d6 ("usb: dwc3: gadget: Prevent losing events in event cache")
> > > > >
> > > > > I don't think this should be Cc to stable, at least not the way it is
> > > > > right now.
> > > > >
> > > > > > Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > > > > > ---
> > > > > >  drivers/usb/dwc3/gadget.c | 2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > > index d27af65eb08a..376ab75adc4e 100644
> > > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > > @@ -4519,7 +4519,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > > > > >        * losing events.
> > > > > >        */
> > > > > >       if (evt->flags & DWC3_EVENT_PENDING)
> > > > > > -             return IRQ_HANDLED;
> > > > > > +             return IRQ_WAKE_THREAD;
> > > > >
> > > > > This looks like a workaround to the issue we have. Have you tried to
> > > > > enable imod instead? It's the feature to help avoid these kind of issue.
> > > >
> > > > Hi Thinh,
> > > >
> > > > Thanks for the feedback ! Yes, we have been experimenting with the
> > > > interrupt moderation interval as well.
> > > > Have follow up questions though, please bear with me !
> > > >
> > > > 1. Given that when DWC3_EVENT_PENDING  is still set,
> > > > dwc3_check_event_buf() is still waiting for the previous cached events
> > > > to be processed by the dwc3_thread_interrupt(), what's the reasoning
> > > > behind returning IRQ_HANDLED here ? Shouldn't we be returning
> > > > IRQ_WAKE_THREAD anyways ?
> > >
> > > Currently dwc3 is implemented such that it will not process new events
> > > until the BH is done with its work. The DWC3_EVENT_PENDING flag
> > > indicates when the events are processed. With this expectation, we
> > > should not schedule the BH as the events are still being handled.
> >
> >
> > Hi Thinh,
> >
> > Thanks for sharing your thoughts !
> > Given that the intention of the design is to keep top half
> > (dwc3_check_event_buf()) and bottom half (dwc3_process_event_buf())
> > mutually exclusive, Is there a reason why the threaded interrupt
> > should not be marked with IRQF_ONESHOT ? Marking it IRQF_ONESHOT makes
> > the threaded irq framework to ensure mutual exclusivity for us. I
> > was validating this and this seems to be pretty effective. Curious to
> > know your thoughts !
>
>
> We shouldn't do that. There are designs with phy driver or connector
> driver or other devices that share the same interrupt line.
>
> >
> > >
> > >
> > > In your case, there's a small window where the TH may be scheduled but
> > > the DWC3_EVENT_PENDING flag is not cleared yet. Returning
> > > IRQ_WAKE_THREAD may workaround your issue because the BH may already be
> > > running when DWC3_EVENT_PENDING is set. I'm not sure what other side
> > > effect this may have since we're breaking this expectation.
> > >
> > > We may enhance the dwc3 handling of event flow in the future to improve
> > > this. But at the moment, we should not return IRQ_WAKE_THREAD here.
> > >
> > > >
> > > > 2. When interrupt moderation is enabled, does DEVICE_IMODC start to
> > > > decrement as soon as the interrupt is masked (where I expect that the
> > > > interrupt line gets de-asserted by the controller) in
> > > > dwc3_check_event_buf()  ?
> > > >
> > > > /* Mask interrupt */
> > > > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> > > >    DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length));
> > > >
> > >
> > > The DEVICE_IMODC is loaded with DEVICE_IMODI and starts to decrement as
> > > soon as the interrupt is de-asserted from the asserted state, which
> > > includes when the interrupt is masked. You brought up a good question
> > > here. The IMOD count may already be 0 when we exit the BH. Can you try
> > > this experiment to update the count and let me know if it helps:
> >
> >
> > Gave this a try, unfortunately this does not seem to help ! I see what
> > you are trying to do though. You are trying to explicitly re-arm the
> > timer. I was checking the register description as well and it does not seem to
> > guarantee that directly writing to DEVICE_IMODC restarts the counter
> > again.
> >
>
> Hmm... Can you try this instead:

Thanks, this seems to be working !
I also sent out the following as there isn't a way to enable interrupt
moderation through the device tree node:
https://lore.kernel.org/all/20250202035100.31235-1-badhri@google.com/
https://lore.kernel.org/all/20250202035100.31235-2-badhri@google.com/

>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 0fe92c0fb520..c1b5a3742ab4 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -5737,14 +5737,20 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
>         dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
>                     DWC3_GEVNTSIZ_SIZE(evt->length));
>
> +       /*
> +        * Keep the clearing of DWC3_EVENT_PENDING after the interrupt unmask
> +        * but before the clearing of DWC3_GEVNTCOUNT_EHB.
> +        */
> +       evt->flags &= ~DWC3_EVENT_PENDING;
> +
> +       /* Ensure the flag is updated before clearing DWC3_GEVNTCOUNT_EHB */
> +       wmb();

I still have one more question though :)
Wondering why not move this code about the DWC3_GEVNTSIZ write where
the interrupt is actually unmasked that way this would also work for
systems which dont have interrupt moderation enabled right ?

Thanks,
Badhri

> +
>         if (dwc->imod_interval) {
>                 dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), DWC3_GEVNTCOUNT_EHB);
>                 dwc3_writel(dwc->regs, DWC3_DEV_IMOD(0), dwc->imod_interval);
>         }
>
> -       /* Keep the clearing of DWC3_EVENT_PENDING at the end */
> -       evt->flags &= ~DWC3_EVENT_PENDING;
> -
>         return ret;
>  }
>
>
> This should solve the issue for controllers with IMOD enabled.
>
> Thanks,
> Thinh
Thinh Nguyen Feb. 4, 2025, 12:22 a.m. UTC | #7
On Sat, Feb 01, 2025, Badhri Jagan Sridharan wrote:
> On Fri, Jan 31, 2025 at 4:09 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> >
> > Hmm... Can you try this instead:
> 
> Thanks, this seems to be working !
> I also sent out the following as there isn't a way to enable interrupt
> moderation through the device tree node:
> https://urldefense.com/v3/__https://lore.kernel.org/all/20250202035100.31235-1-badhri@google.com/__;!!A4F2R9G_pg!e6BA5h7M1-HZjrH2-bLVF0YbmzStu9ASv1lkZudrnyX2RFDlDUnrlFCgMdbwXcbTrMsVUQvgnpRmYxmcV-w$ 
> https://urldefense.com/v3/__https://lore.kernel.org/all/20250202035100.31235-2-badhri@google.com/__;!!A4F2R9G_pg!e6BA5h7M1-HZjrH2-bLVF0YbmzStu9ASv1lkZudrnyX2RFDlDUnrlFCgMdbwXcbTrMsVUQvgnpRm9hdUtjE$ 
> 
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 0fe92c0fb520..c1b5a3742ab4 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -5737,14 +5737,20 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
> >         dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> >                     DWC3_GEVNTSIZ_SIZE(evt->length));
> >
> > +       /*
> > +        * Keep the clearing of DWC3_EVENT_PENDING after the interrupt unmask
> > +        * but before the clearing of DWC3_GEVNTCOUNT_EHB.
> > +        */
> > +       evt->flags &= ~DWC3_EVENT_PENDING;
> > +
> > +       /* Ensure the flag is updated before clearing DWC3_GEVNTCOUNT_EHB */
> > +       wmb();
> 
> I still have one more question though :)
> Wondering why not move this code about the DWC3_GEVNTSIZ write where
> the interrupt is actually unmasked that way this would also work for
> systems which dont have interrupt moderation enabled right ?
> 

It's mainly for PCI devices. PCI devices sends assert/de-assert messages
to represent the level interrupt for PCI legacy interrupt. The de-assert
interrupt message may not be received right after masking the interrupt,
such as due to system latency. This may trigger a scheduling of the
TH again. If there's new event received during the TH, this will cause a
race and dwc3 driver may overwrite the cached events (a window between
clearing the flag and unmasking of interrupt).

We may have a potential issue for platform with devices with shared
interrupt too (though I haven't seen report of this issue before).

BR,
Thinh
Badhri Jagan Sridharan Feb. 5, 2025, 8:32 a.m. UTC | #8
On Mon, Feb 3, 2025 at 4:22 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> On Sat, Feb 01, 2025, Badhri Jagan Sridharan wrote:
> > On Fri, Jan 31, 2025 at 4:09 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> > >
> > > Hmm... Can you try this instead:
> >
> > Thanks, this seems to be working !
> > I also sent out the following as there isn't a way to enable interrupt
> > moderation through the device tree node:
> > https://urldefense.com/v3/__https://lore.kernel.org/all/20250202035100.31235-1-badhri@google.com/__;!!A4F2R9G_pg!e6BA5h7M1-HZjrH2-bLVF0YbmzStu9ASv1lkZudrnyX2RFDlDUnrlFCgMdbwXcbTrMsVUQvgnpRmYxmcV-w$
> > https://urldefense.com/v3/__https://lore.kernel.org/all/20250202035100.31235-2-badhri@google.com/__;!!A4F2R9G_pg!e6BA5h7M1-HZjrH2-bLVF0YbmzStu9ASv1lkZudrnyX2RFDlDUnrlFCgMdbwXcbTrMsVUQvgnpRm9hdUtjE$
> >
> > >
> > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > index 0fe92c0fb520..c1b5a3742ab4 100644
> > > --- a/drivers/usb/dwc3/gadget.c
> > > +++ b/drivers/usb/dwc3/gadget.c
> > > @@ -5737,14 +5737,20 @@ static irqreturn_t dwc3_process_event_buf(struct dwc3_event_buffer *evt)
> > >         dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0),
> > >                     DWC3_GEVNTSIZ_SIZE(evt->length));
> > >
> > > +       /*
> > > +        * Keep the clearing of DWC3_EVENT_PENDING after the interrupt unmask
> > > +        * but before the clearing of DWC3_GEVNTCOUNT_EHB.
> > > +        */
> > > +       evt->flags &= ~DWC3_EVENT_PENDING;
> > > +
> > > +       /* Ensure the flag is updated before clearing DWC3_GEVNTCOUNT_EHB */
> > > +       wmb();
> >
> > I still have one more question though :)
> > Wondering why not move this code about the DWC3_GEVNTSIZ write where
> > the interrupt is actually unmasked that way this would also work for
> > systems which dont have interrupt moderation enabled right ?
> >
>
> It's mainly for PCI devices. PCI devices sends assert/de-assert messages
> to represent the level interrupt for PCI legacy interrupt. The de-assert
> interrupt message may not be received right after masking the interrupt,
> such as due to system latency. This may trigger a scheduling of the
> TH again. If there's new event received during the TH, this will cause a
> race and dwc3 driver may overwrite the cached events (a window between
> clearing the flag and unmasking of interrupt).
>
> We may have a potential issue for platform with devices with shared
> interrupt too (though I haven't seen report of this issue before).

Thanks for the detailed explanation Thinh ! Appreciate that !
I am running more tests before sending out the V2 version of the patch.
Should be done in another day or two.

Thanks,
Badhri

>s
> BR,
> Thinh
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index d27af65eb08a..376ab75adc4e 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4519,7 +4519,7 @@  static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
 	 * losing events.
 	 */
 	if (evt->flags & DWC3_EVENT_PENDING)
-		return IRQ_HANDLED;
+		return IRQ_WAKE_THREAD;
 
 	count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
 	count &= DWC3_GEVNTCOUNT_MASK;