usb: dwc3: gadget: don't dequeue requests on already disabled endpoints

Message ID 20200327084302.606-1-m.grzeschik@pengutronix.de (mailing list archive)
State New, archived
Series usb: dwc3: gadget: don't dequeue requests on already disabled endpoints

Commit Message

Michael Grzeschik March 27, 2020, 8:43 a.m. UTC
dwc3_gadget_ep_disable gets called before the last request gets
dequeued.

In __dwc3_gadget_ep_disable all started, pending and cancelled
lists for this endpoint will call dwc3_gadget_giveback in
dwc3_remove_requests.

After that no list containing the afterwards dequed request,
therefor it is not necessary to run the dequeue routine.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
---
@Lars-Peter Clausen:

This patch addresses the case that not queued requests get dequeued.
The only case that this happens seems on disabling the gadget.

 drivers/usb/dwc3/gadget.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Sergei Shtylyov March 27, 2020, 8:53 a.m. UTC | #1
Hello!

On 27.03.2020 11:43, Michael Grzeschik wrote:

> dwc3_gadget_ep_disable gets called before the last request gets
> dequeued.
> 
> In __dwc3_gadget_ep_disable all started, pending and cancelled
> lists for this endpoint will call dwc3_gadget_giveback in
> dwc3_remove_requests.
> 
> After that no list containing the afterwards dequed request,

    Dequeued.

> therefor it is not necessary to run the dequeue routine.

    Therefore?

> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
[...]

MBR, Sergei
Lars-Peter Clausen March 27, 2020, 9:14 a.m. UTC | #2
On 3/27/20 9:43 AM, Michael Grzeschik wrote:
> dwc3_gadget_ep_disable gets called before the last request gets
> dequeued.
>
> In __dwc3_gadget_ep_disable all started, pending and cancelled
> lists for this endpoint will call dwc3_gadget_giveback in
> dwc3_remove_requests.
>
> After that no list containing the afterwards dequed request,
> therefor it is not necessary to run the dequeue routine.
>
> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> ---
> @Lars-Peter Clausen:
>
> This patch addresses the case that not queued requests get dequeued.
> The only case that this happens seems on disabling the gadget.


I don't believe it does. Calling usb_ep_dequeue() is not limited to be 
called after the endpoint has been disabled. It is part of the API and 
can be called at any time. E.g. with function_fs you can abort a queued 
transfer from userspace at any time.

- Lars

>
>   drivers/usb/dwc3/gadget.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>   
>   	trace_dwc3_ep_dequeue(req);
>   
> +	if (!(dep->flags & DWC3_EP_ENABLED))
> +		return 0;
> +
>   	spin_lock_irqsave(&dwc->lock, flags);
>   
>   	list_for_each_entry(r, &dep->pending_list, list) {
Michael Grzeschik March 27, 2020, 10:43 a.m. UTC | #3
On Fri, Mar 27, 2020 at 10:14:10AM +0100, Lars-Peter Clausen wrote:
> On 3/27/20 9:43 AM, Michael Grzeschik wrote:
> > dwc3_gadget_ep_disable gets called before the last request gets
> > dequeued.
> > 
> > In __dwc3_gadget_ep_disable all started, pending and cancelled
> > lists for this endpoint will call dwc3_gadget_giveback in
> > dwc3_remove_requests.
> > 
> > After that no list containing the afterwards dequed request,
> > therefor it is not necessary to run the dequeue routine.
> > 
> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> > ---
> > @Lars-Peter Clausen:
> > 
> > This patch addresses the case that not queued requests get dequeued.
> > The only case that this happens seems on disabling the gadget.
> 
> 
> I don't believe it does. Calling usb_ep_dequeue() is not limited to be
> called after the endpoint has been disabled. It is part of the API and can
> be called at any time. E.g. with function_fs you can abort a queued transfer
> from userspace at any time.

OK, can you give me a userspace example that simply triggers the
issue? I tried to work out the function stack you mentioned, but it
would be much easier if it could be reproduced.

The patch, on the other hand, can stand on its own, as it
fixes another code path that is much more common.

Regards,
Michael

> > 
> >   drivers/usb/dwc3/gadget.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> >   	trace_dwc3_ep_dequeue(req);
> > +	if (!(dep->flags & DWC3_EP_ENABLED))
> > +		return 0;
> > +
> >   	spin_lock_irqsave(&dwc->lock, flags);
> >   	list_for_each_entry(r, &dep->pending_list, list) {
> 
> 
>
Lars-Peter Clausen March 27, 2020, 10:55 a.m. UTC | #4
On 3/27/20 11:43 AM, Michael Grzeschik wrote:
> On Fri, Mar 27, 2020 at 10:14:10AM +0100, Lars-Peter Clausen wrote:
>> On 3/27/20 9:43 AM, Michael Grzeschik wrote:
>>> dwc3_gadget_ep_disable gets called before the last request gets
>>> dequeued.
>>>
>>> In __dwc3_gadget_ep_disable all started, pending and cancelled
>>> lists for this endpoint will call dwc3_gadget_giveback in
>>> dwc3_remove_requests.
>>>
>>> After that no list containing the afterwards dequed request,
>>> therefor it is not necessary to run the dequeue routine.
>>>
>>> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
>>> ---
>>> @Lars-Peter Clausen:
>>>
>>> This patch addresses the case that not queued requests get dequeued.
>>> The only case that this happens seems on disabling the gadget.
>>
>> I don't believe it does. Calling usb_ep_dequeue() is not limited to be
>> called after the endpoint has been disabled. It is part of the API and can
>> be called at any time. E.g. with function_fs you can abort a queued transfer
>> from userspace at any time.
> OK, can you give me a userspace example that simply triggers the
> issue? I tried to work out the function stack you mentioned, but it
> would be much easier if it could be reproduced.
>
> The patch, on the other hand, can stand on its own, as it
> fixes another code path that is much more common.

Hi,

I don't have a standalone example. But the issue occurs if you submit a 
request when using function_fs through the AIO API and then call 
io_cancel() to abort it. At the same time there must be traffic on the 
USB bus so that the URB has a chance to complete.

This is a race condition between the IRQ handler running on one CPU and 
the usb_ep_dequeue() running on another CPU. As such it might take a 
while of stress testing before it is triggered. The more CPUs your
system has, the higher the chance of triggering the issue.

You can find the code which triggers the issue below.

Submission of the request:

https://github.com/analogdevicesinc/libiio/blob/master/iiod/ops.c#L139-L156

Canceling it:

https://github.com/analogdevicesinc/libiio/blob/master/iiod/ops.c#L193
Andy Shevchenko March 27, 2020, 11:15 a.m. UTC | #5
On Fri, Mar 27, 2020 at 10:54 AM Sergei Shtylyov
<sergei.shtylyov@cogentembedded.com> wrote:
> On 27.03.2020 11:43, Michael Grzeschik wrote:

...

> > therefor it is not necessary to run the dequeue routine.
>
>     Therefore?

The original is good on its own.
Felipe Balbi March 28, 2020, 8:27 a.m. UTC | #6
Hi,

Michael Grzeschik <m.grzeschik@pengutronix.de> writes:
> dwc3_gadget_ep_disable gets called before the last request gets
> dequeued.
>
> In __dwc3_gadget_ep_disable all started, pending and cancelled
> lists for this endpoint will call dwc3_gadget_giveback in
> dwc3_remove_requests.
>
> After that no list containing the afterwards dequed request,
> therefor it is not necessary to run the dequeue routine.
>
> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> ---
> @Lars-Peter Clausen:
>
> This patch addresses the case that not queued requests get dequeued.
> The only case that this happens seems on disabling the gadget.
>
>  drivers/usb/dwc3/gadget.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>  
>  	trace_dwc3_ep_dequeue(req);
>  
> +	if (!(dep->flags & DWC3_EP_ENABLED))
> +		return 0;

which driver is trying to call dequeue after the endpoint is disabled?
Got some tracepoints of the problem happening?
Michael Grzeschik March 29, 2020, 7:02 p.m. UTC | #7
On Sat, Mar 28, 2020 at 10:27:49AM +0200, Felipe Balbi wrote:
> 
> Hi,
> 
> Michael Grzeschik <m.grzeschik@pengutronix.de> writes:
> > dwc3_gadget_ep_disable gets called before the last request gets
> > dequeued.
> >
> > In __dwc3_gadget_ep_disable all started, pending and cancelled
> > lists for this endpoint will call dwc3_gadget_giveback in
> > dwc3_remove_requests.
> >
> > After that no list containing the afterwards dequed request,
> > therefor it is not necessary to run the dequeue routine.
> >
> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> > ---
> > @Lars-Peter Clausen:
> >
> > This patch addresses the case that not queued requests get dequeued.
> > The only case that this happens seems on disabling the gadget.
> >
> >  drivers/usb/dwc3/gadget.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> >  
> >  	trace_dwc3_ep_dequeue(req);
> >  
> > +	if (!(dep->flags & DWC3_EP_ENABLED))
> > +		return 0;
> 
> which driver is trying to call dequeue after the endpoint is disabled?
> Got some tracepoints of the problem happening?

I see the case when using uvc-gadget.

Look into uvc_v4l2_release in uvc_v4l2.c:

uvc_function_disconnect
   composite_disconnect
      reset_config
         uvc_function_disable->usb_ep_disable

uvcg_video_enable
   usb_ep_dequeue
      dwc3_gadget_ep_dequeue
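
The ordering above can be reproduced in a small userspace model. This is
only a sketch, not kernel code: the model_* names and the fixed-size
array are hypothetical stand-ins for the endpoint's request lists, and
the check_enabled flag stands in for the early return proposed in the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_MAX_REQS 8

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct model_req {
	int status;	/* completion status reported at giveback */
	bool done;	/* true once the request has been given back */
};

struct model_ep {
	bool enabled;
	struct model_req *pending[MODEL_MAX_REQS];
	size_t count;
};

/* Hand a request back to its owner with the given status. */
void model_giveback(struct model_req *req, int status)
{
	req->status = status;
	req->done = true;
}

int model_ep_queue(struct model_ep *ep, struct model_req *req)
{
	assert(ep && req);
	if (!ep->enabled)
		return -ESHUTDOWN;
	if (ep->count == MODEL_MAX_REQS)
		return -ENOSPC;
	req->done = false;
	ep->pending[ep->count++] = req;
	return 0;
}

/* Disabling gives back every request that is still on the list. */
void model_ep_disable(struct model_ep *ep)
{
	for (size_t i = 0; i < ep->count; i++)
		model_giveback(ep->pending[i], -ESHUTDOWN);
	ep->count = 0;
	ep->enabled = false;
}

/*
 * check_enabled = true models the proposed early return,
 * check_enabled = false models the behaviour without it.
 */
int model_ep_dequeue(struct model_ep *ep, struct model_req *req,
		     bool check_enabled)
{
	if (check_enabled && !ep->enabled)
		return 0;

	for (size_t i = 0; i < ep->count; i++) {
		if (ep->pending[i] != req)
			continue;
		/* Close the gap left by the removed entry. */
		for (size_t j = i + 1; j < ep->count; j++)
			ep->pending[j - 1] = ep->pending[j];
		ep->count--;
		model_giveback(req, -ECONNRESET);
		return 0;
	}

	/* The request is on no list: this is the error path in question. */
	fprintf(stderr, "request %p was not queued\n", (void *)req);
	return -EINVAL;
}
```

In this model, queueing a request, calling model_ep_disable() and then
model_ep_dequeue() reaches the error path only when check_enabled is
false. A request that has already completed on an endpoint that is
still enabled reaches the error path either way, which is the case Lars
describes.
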

Regards,
Michael
Felipe Balbi March 30, 2020, 7:18 a.m. UTC | #8
Hi,

Michael Grzeschik <mgr@pengutronix.de> writes:
>> > dwc3_gadget_ep_disable gets called before the last request gets
>> > dequeued.
>> >
>> > In __dwc3_gadget_ep_disable all started, pending and cancelled
>> > lists for this endpoint will call dwc3_gadget_giveback in
>> > dwc3_remove_requests.
>> >
>> > After that no list containing the afterwards dequed request,
>> > therefor it is not necessary to run the dequeue routine.
>> >
>> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
>> > ---
>> > @Lars-Peter Clausen:
>> >
>> > This patch addresses the case that not queued requests get dequeued.
>> > The only case that this happens seems on disabling the gadget.
>> >
>> >  drivers/usb/dwc3/gadget.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> >
>> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> > index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
>> > --- a/drivers/usb/dwc3/gadget.c
>> > +++ b/drivers/usb/dwc3/gadget.c
>> > @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>> >  
>> >  	trace_dwc3_ep_dequeue(req);
>> >  
>> > +	if (!(dep->flags & DWC3_EP_ENABLED))
>> > +		return 0;
>> 
>> which driver is trying to call dequeue after the endpoint is disabled?
>> Got some tracepoints of the problem happening?
>
> I see the case when using uvc-gadget.
>
> Look into uvc_v4l2_release in uvc_v4l2.c:
>
> uvc_function_disconnect
>    composite_disconnect
>       reset_config
>          uvc_function_disable->usb_ep_disable
>
> uvcg_video_enable
>    usb_ep_dequeue
>       dwc3_gadget_ep_dequeue

Oh, I see what you mean. We get a disconnect, which disables the
endpoints, which forces all requests to be dequeued. Now I remember why
this exists: we give back the requests from disconnect because not all
gadget drivers will call usb_ep_dequeue() if simply told about the
disconnect. Then UDC drivers have to be a little more careful and make
sure that all requests are given back.

In any case, why is it a problem to call usb_ep_dequeue()? Is it only
because of that dev_err()? We could just remove that message,
really. Eventually, I want to move more of this logic into UDC core so
udc drivers can be simplified. For that work, though, first we would
have to add a "generic" struct usb_ep_hw implementation and manage list
of requests as part of UDC core as well.
Michael Grzeschik March 30, 2020, 8:25 a.m. UTC | #9
On Mon, Mar 30, 2020 at 10:18:57AM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Michael Grzeschik <mgr@pengutronix.de> writes:
> >> > dwc3_gadget_ep_disable gets called before the last request gets
> >> > dequeued.
> >> >
> >> > In __dwc3_gadget_ep_disable all started, pending and cancelled
> >> > lists for this endpoint will call dwc3_gadget_giveback in
> >> > dwc3_remove_requests.
> >> >
> >> > After that no list containing the afterwards dequed request,
> >> > therefor it is not necessary to run the dequeue routine.
> >> >
> >> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> >> > ---
> >> > @Lars-Peter Clausen:
> >> >
> >> > This patch addresses the case that not queued requests get dequeued.
> >> > The only case that this happens seems on disabling the gadget.
> >> >
> >> >  drivers/usb/dwc3/gadget.c | 3 +++
> >> >  1 file changed, 3 insertions(+)
> >> >
> >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> > index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
> >> > --- a/drivers/usb/dwc3/gadget.c
> >> > +++ b/drivers/usb/dwc3/gadget.c
> >> > @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> >> >  
> >> >  	trace_dwc3_ep_dequeue(req);
> >> >  
> >> > +	if (!(dep->flags & DWC3_EP_ENABLED))
> >> > +		return 0;
> >> 
> >> which driver is trying to call dequeue after the endpoint is disabled?
> >> Got some tracepoints of the problem happening?
> >
> > I see the case when using uvc-gadget.
> >
> > Look into uvc_v4l2_release in uvc_v4l2.c:
> >
> > uvc_function_disconnect
> >    composite_disconnect
> >       reset_config
> >          uvc_function_disable->usb_ep_disable
> >
> > uvcg_video_enable
> >    usb_ep_dequeue
> >       dwc3_gadget_ep_dequeue
> 
> Oh, I see what you mean. We get a disconnect, which disables the
> endpoints, which forces all requests to be dequeued. Now I remember why
> this exists: we give back the requests from disconnect because not all
> gadget drivers will call usb_ep_dequeue() if simply told about the
> disconnect. Then UDC drivers have to be a little more careful and make
> sure that all requests are given back.
> 
> In any case, why is it a problem to call usb_ep_dequeue()? Is it only
> because of that dev_err()? We could just remove that message,
> really.

In my case, it is not a problem removing the dev_err. The ep_dequeue
will only be called once for each request at the stream end. I don't
know about the case Lars has mentioned.

If we have to search all lists for the request every n times while in
traffic, only to find out that it was not enqueued, I think it would be
worth it to keep the dev_err and let these cases trigger so we have an
option to find and avoid/fix these.

> Eventually, I want to move more of this logic into UDC core so
> udc drivers can be simplified. For that work, though, first we would
> have to add a "generic" struct usb_ep_hw implementation and manage list
> of requests as part of UDC core as well.

I don't know about the cases you plan to abstract but it sounds
like a good idea to get some gadget logic out of the drivers.

Thanks,
Michael
Felipe Balbi March 30, 2020, 10:06 a.m. UTC | #10
Hi,

Michael Grzeschik <mgr@pengutronix.de> writes:
>> >> > dwc3_gadget_ep_disable gets called before the last request gets
>> >> > dequeued.
>> >> >
>> >> > In __dwc3_gadget_ep_disable all started, pending and cancelled
>> >> > lists for this endpoint will call dwc3_gadget_giveback in
>> >> > dwc3_remove_requests.
>> >> >
>> >> > After that no list containing the afterwards dequed request,
>> >> > therefor it is not necessary to run the dequeue routine.
>> >> >
>> >> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
>> >> > ---
>> >> > @Lars-Peter Clausen:
>> >> >
>> >> > This patch addresses the case that not queued requests get dequeued.
>> >> > The only case that this happens seems on disabling the gadget.
>> >> >
>> >> >  drivers/usb/dwc3/gadget.c | 3 +++
>> >> >  1 file changed, 3 insertions(+)
>> >> >
>> >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> >> > index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
>> >> > --- a/drivers/usb/dwc3/gadget.c
>> >> > +++ b/drivers/usb/dwc3/gadget.c
>> >> > @@ -1609,6 +1609,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
>> >> >  
>> >> >  	trace_dwc3_ep_dequeue(req);
>> >> >  
>> >> > +	if (!(dep->flags & DWC3_EP_ENABLED))
>> >> > +		return 0;
>> >> 
>> >> which driver is trying to call dequeue after the endpoint is disabled?
>> >> Got some tracepoints of the problem happening?
>> >
>> > I see the case when using uvc-gadget.
>> >
>> > Look into uvc_v4l2_release in uvc_v4l2.c:
>> >
>> > uvc_function_disconnect
>> >    composite_disconnect
>> >       reset_config
>> >          uvc_function_disable->usb_ep_disable
>> >
>> > uvcg_video_enable
>> >    usb_ep_dequeue
>> >       dwc3_gadget_ep_dequeue
>> 
>> Oh, I see what you mean. We get a disconnect, which disables the
>> endpoints, which forces all requests to be dequeued. Now I remember why
>> this exists: we give back the requests from disconnect because not all
>> gadget drivers will call usb_ep_dequeue() if simply told about the
>> disconnect. Then UDC drivers have to be a little more careful and make
>> sure that all requests are given back.
>> 
>> In any case, why is it a problem to call usb_ep_dequeue()? Is it only
>> because of that dev_err()? We could just remove that message,
>> really.
>
> In my case, it is not a problem removing the dev_err. The ep_dequeue
> will only be called once for each request at the stream end. I don't
> know about the case Lars has mentioned.

Okay

> If we have to search all lists for the request every n times while in
> traffic, only to find out that it was not enqueued, I think it would be
> worth it to keep the dev_err and let these cases trigger so we have an
> option to find and avoid/fix these.

Yeah, I agree. That's why the dev_err() was placed there to start
with. In fact, I found a few gadget drivers which were trying to reuse a
request allocated for EPxIN by queueing it to EPxOUT, clearly a
violation of request lifetime rules.

As for the search on three separate lists, I never considered this to be
a problem since it happens so infrequently. One thing we could do to
maybe make it faster is to convert those list_for_each_entry() to
list_for_each_entry_reverse(). I'm betting that there's a higher
likelihood that the oldest request will be dequeued first; then again,
this needs to be profiled.
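
To make the trade-off concrete, here is a minimal userspace sketch (not
kernel code) that counts how many nodes a forward and a reverse walk
visit before finding a given entry. The model_* names are hypothetical,
and the sketch assumes that requests are appended at the tail of the
list, so the oldest entry sits next to the head.

```c
#include <stddef.h>

/* Minimal circular doubly linked list; a stand-in for struct list_head. */
struct model_node {
	struct model_node *prev, *next;
	int id;
};

/* An empty list is a sentinel head that points at itself. */
void model_list_init(struct model_node *head)
{
	head->prev = head->next = head;
}

/* Append at the tail, so the oldest entry stays next to the head. */
void model_list_add_tail(struct model_node *head, struct model_node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Nodes visited walking head -> tail until id is found, or -1 if absent. */
int model_steps_forward(const struct model_node *head, int id)
{
	int steps = 0;

	for (const struct model_node *n = head->next; n != head; n = n->next) {
		steps++;
		if (n->id == id)
			return steps;
	}
	return -1;
}

/* Nodes visited walking tail -> head until id is found, or -1 if absent. */
int model_steps_reverse(const struct model_node *head, int id)
{
	int steps = 0;

	for (const struct model_node *n = head->prev; n != head; n = n->prev) {
		steps++;
		if (n->id == id)
			return steps;
	}
	return -1;
}
```

Under the tail-insertion assumption, the forward walk reaches the oldest
entry in one step and the reverse walk reaches the newest entry in one
step. Which direction is cheaper on average therefore depends on which
requests actually get dequeued, and that is what profiling would have to
show.
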

>> Eventually, I want to move more of this logic into UDC core so
>> udc drivers can be simplified. For that work, though, first we would
>> have to add a "generic" struct usb_ep_hw implementation and manage list
>> of requests as part of UDC core as well.
>
> I don't know about the cases you plan to abstract but it sounds
> like a good idea to get some gadget logic out of the drivers.

Yeah, this will take a lot of time, though. Hopefully it'll happen :-)
Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 9a6f741d1db0dc..5d4fa8d6c93e49 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1609,6 +1609,9 @@  static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 
 	trace_dwc3_ep_dequeue(req);
 
+	if (!(dep->flags & DWC3_EP_ENABLED))
+		return 0;
+
 	spin_lock_irqsave(&dwc->lock, flags);
 
 	list_for_each_entry(r, &dep->pending_list, list) {