usb: dwc3: gadget: giveback request if start transfer fail

Message ID 20140501063608.GA30575@intel.com (mailing list archive)
State New, archived

Commit Message

Zhuang Jin Can May 1, 2014, 6:36 a.m. UTC
At least we should give back the current request to the
gadget. Otherwise, the gadget will be stuck without knowing
anything.

It was observed that the failure can happen if the request is
queued when the run/stop bit of the controller is not set.

Signed-off-by: Zhuang Jin Can <jin.can.zhuang@intel.com>
---
 drivers/usb/dwc3/gadget.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Felipe Balbi April 30, 2014, 7:58 p.m. UTC | #1
Hi,

On Thu, May 01, 2014 at 02:36:08AM -0400, Zhuang Jin Can wrote:
> At least we should give back the current request to the
> gadget. Otherwise, the gadget will be stuck without knowing
> anything.
> 
> It was observed that the failure can happen if the request is
> queued when the run/stop bit of the controller is not set.

why is your gadget queueing any requests before calling ->udc_start() ?

A better question, what modification have you done to udc-core.c which
broke this ? udc-core *always* calls ->udc_start() by the time you load
a gadget driver so this case will *never* happen. Whatever modification
you did, broke this assumption and I will *not* accept this patch
because the bug is elsewhere and *not* in mainline kernel.

> Signed-off-by: Zhuang Jin Can <jin.can.zhuang@intel.com>
> ---
>  drivers/usb/dwc3/gadget.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 70715ee..8d0c3c7 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1000,9 +1000,7 @@ static int __dwc3_gadget_kick_transfer(struct dwc3_ep *dep, u16 cmd_param,
>  		 * here and stop, unmap, free and del each of the linked
>  		 * requests instead of what we do now.
>  		 */
> -		usb_gadget_unmap_request(&dwc->gadget, &req->request,
> -				req->direction);
> -		list_del(&req->list);
> +		dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
>  		return ret;
>  	}
>  
> -- 
> 1.7.9.5
>
Felipe Balbi May 1, 2014, 3:13 p.m. UTC | #2
Hi,

On Thu, May 01, 2014 at 04:44:52PM -0400, Zhuang Jin Can wrote:
> On Wed, Apr 30, 2014 at 02:58:29PM -0500, Felipe Balbi wrote:
> > On Thu, May 01, 2014 at 02:36:08AM -0400, Zhuang Jin Can wrote:
> > > At least we should give back the current request to the
> > > gadget. Otherwise, the gadget will be stuck without knowing
> > > anything.
> > > 
> > > It was observed that the failure can happen if the request is
> > > queued when the run/stop bit of the controller is not set.
> > 
> > why is your gadget queueing any requests before calling ->udc_start() ?
> > 
> > A better question, what modification have you done to udc-core.c which
> > broke this ? udc-core *always* calls ->udc_start() by the time you load
> > a gadget driver so this case will *never* happen. Whatever modification
> > you did, broke this assumption and I will *not* accept this patch
> > because the bug is elsewhere and *not* in mainline kernel.
> > 
> It's found in Android using kernel 3.10.20. Android has its own
> usb_composite_driver usb/gadget/android.c (not in mainline), and it

so you found something on an old kernel using an out-of-tree gadget
driver.

> allows userspace to disconnect the pullup (i.e., clear the run/stop bit in dwc3)
> and remove the gadget functions like adb, mtp and then add new functions
> like rndis, acm. The problem is when you disconnect the pullup, a gadget
> may be in the middle of queuing a request, which results in the "start
> transfer cmd failure". I think this is a common issue for other

Android gadget needs to learn how to cope with that.

> usb_composite_drivers too. Normally, if one of the gadgets deactivates its
> own function, the pullup will be disconnected, but other gadgets won't get
> notified until their requests fail. So this patch makes dwc3 more robust
> in dealing with these situations.

Right, but Android gadget can run on top of several other UDCs and you
want to have a single one of them cope with android's bug ?

You'd be better off getting google to accept a bugfix to the android
gadget, since that's where the problem lies.
Zhuang Jin Can May 1, 2014, 8:44 p.m. UTC | #3
Hi Balbi,

On Wed, Apr 30, 2014 at 02:58:29PM -0500, Felipe Balbi wrote:
> On Thu, May 01, 2014 at 02:36:08AM -0400, Zhuang Jin Can wrote:
> > At least we should give back the current request to the
> > gadget. Otherwise, the gadget will be stuck without knowing
> > anything.
> > 
> > It was observed that the failure can happen if the request is
> > queued when the run/stop bit of the controller is not set.
> 
> why is your gadget queueing any requests before calling ->udc_start() ?
> 
> A better question, what modification have you done to udc-core.c which
> broke this ? udc-core *always* calls ->udc_start() by the time you load
> a gadget driver so this case will *never* happen. Whatever modification
> you did, broke this assumption and I will *not* accept this patch
> because the bug is elsewhere and *not* in mainline kernel.
> 
It's found in Android using kernel 3.10.20. Android has its own
usb_composite_driver usb/gadget/android.c (not in mainline), and it
allows userspace to disconnect the pullup (i.e., clear the run/stop bit in dwc3)
and remove the gadget functions like adb, mtp and then add new functions
like rndis, acm. The problem is when you disconnect the pullup, a gadget
may be in the middle of queuing a request, which results in the "start
transfer cmd failure". I think this is a common issue for other
usb_composite_drivers too. Normally, if one of the gadgets deactivates its
own function, the pullup will be disconnected, but other gadgets won't get
notified until their requests fail. So this patch makes dwc3 more robust
in dealing with these situations.

> > Signed-off-by: Zhuang Jin Can <jin.can.zhuang@intel.com>
> > ---
> >  drivers/usb/dwc3/gadget.c |    4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 70715ee..8d0c3c7 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -1000,9 +1000,7 @@ static int __dwc3_gadget_kick_transfer(struct dwc3_ep *dep, u16 cmd_param,
> >  		 * here and stop, unmap, free and del each of the linked
> >  		 * requests instead of what we do now.
> >  		 */
> > -		usb_gadget_unmap_request(&dwc->gadget, &req->request,
> > -				req->direction);
> > -		list_del(&req->list);
> > +		dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
> >  		return ret;
> >  	}
> >  
> > -- 
> > 1.7.9.5
> > 

Jincan
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi May 2, 2014, 4:10 p.m. UTC | #4
Hi,

On Sat, May 03, 2014 at 12:05:41AM -0400, Zhuang Jin Can wrote:
> On Thu, May 01, 2014 at 10:13:28AM -0500, Felipe Balbi wrote:
> > On Thu, May 01, 2014 at 04:44:52PM -0400, Zhuang Jin Can wrote:
> > > On Wed, Apr 30, 2014 at 02:58:29PM -0500, Felipe Balbi wrote:
> > > > On Thu, May 01, 2014 at 02:36:08AM -0400, Zhuang Jin Can wrote:
> > > > > At least we should give back the current request to the
> > > > > gadget. Otherwise, the gadget will be stuck without knowing
> > > > > anything.
> > > > > 
> > > > > It was observed that the failure can happen if the request is
> > > > > queued when the run/stop bit of the controller is not set.
> > > > 
> > > > why is your gadget queueing any requests before calling ->udc_start() ?
> > > > 
> > > > A better question, what modification have you done to udc-core.c which
> > > > broke this ? udc-core *always* calls ->udc_start() by the time you load
> > > > a gadget driver so this case will *never* happen. Whatever modification
> > > > you did, broke this assumption and I will *not* accept this patch
> > > > because the bug is elsewhere and *not* in mainline kernel.
> > > > 
> > > It's found in Android using kernel 3.10.20. Android has its own
> > > usb_composite_driver usb/gadget/android.c (not in mainline), and it
> > 
> > so you found something on an old kernel using an out-of-tree gadget
> > driver.
> > 
> > > allows userspace to disconnect the pullup (i.e., clear the run/stop bit in dwc3)
> > > and remove the gadget functions like adb, mtp and then add new functions
> > > like rndis, acm. The problem is when you disconnect the pullup, a gadget
> > > may be in the middle of queuing a request, which results in the "start
> > > transfer cmd failure". I think this is a common issue for other
> > 
> > Android gadget needs to learn how to cope with that.
> > 
> Agree.
> 
> > > usb_composite_drivers too. Normally, if one of the gadgets deactivates its
> > > own function, the pullup will be disconnected, but other gadgets won't get
> > > notified until their requests fail. So this patch makes dwc3 more robust
> > > in dealing with these situations.
> > 
> > Right, but Android gadget can run on top of several other UDCs and you
> > want to have a single one of them cope with android's bug ?
> > 
> > You'd be better off getting google to accept a bugfix to the android
> > gadget, since that's where the problem lies.
> > 
> I agree. I'll try to push the fix to google.

alright, thanks

> It's really hard to fix the race condition (for me), as any gadget or
> /sys/class/udc/soft_connect can just disconnect the pullup at any time they
> want. The only thing I can do is give the request back to the
> gadget if the condition happens.

even in that case, gadget driver's ->disconnect() will be called and
that should be enough to tell the gadget driver 'hey, don't queue
anything right now because you're not talking to any host'.
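
As a rough illustration of the point above, here is a minimal userspace sketch of a function driver that records its connection state in the disconnect callback and refuses to queue while disconnected. All names (`model_function`, `model_set_alt`, `model_disconnect`, `model_queue`) are hypothetical stand-ins and are not taken from the Android gadget or from any mainline driver; a real driver would also need locking around the check, which is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical function-driver state; not taken from any real driver. */
struct model_function {
	bool connected;	/* cleared from the disconnect callback */
	int queued;	/* number of requests handed to the UDC */
};

/* Stand-in for the point where the host has configured the function. */
static void model_set_alt(struct model_function *f)
{
	f->connected = true;
}

/* Stand-in for the gadget driver's ->disconnect() callback. */
static void model_disconnect(struct model_function *f)
{
	f->connected = false;
}

/* Queue only while connected; otherwise fail fast with -ESHUTDOWN. */
static int model_queue(struct model_function *f)
{
	assert(f != NULL);
	if (!f->connected)
		return -ESHUTDOWN;
	f->queued++;
	return 0;
}
```

With this shape, a request that arrives after the disconnect is rejected in the gadget driver itself, so it never reaches the controller's start-transfer path.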

cheers
Zhuang Jin Can May 3, 2014, 4:05 a.m. UTC | #5
Hi,

On Thu, May 01, 2014 at 10:13:28AM -0500, Felipe Balbi wrote:
> On Thu, May 01, 2014 at 04:44:52PM -0400, Zhuang Jin Can wrote:
> > On Wed, Apr 30, 2014 at 02:58:29PM -0500, Felipe Balbi wrote:
> > > On Thu, May 01, 2014 at 02:36:08AM -0400, Zhuang Jin Can wrote:
> > > > At least we should give back the current request to the
> > > > gadget. Otherwise, the gadget will be stuck without knowing
> > > > anything.
> > > > 
> > > > It was observed that the failure can happen if the request is
> > > > queued when the run/stop bit of the controller is not set.
> > > 
> > > why is your gadget queueing any requests before calling ->udc_start() ?
> > > 
> > > A better question, what modification have you done to udc-core.c which
> > > broke this ? udc-core *always* calls ->udc_start() by the time you load
> > > a gadget driver so this case will *never* happen. Whatever modification
> > > you did, broke this assumption and I will *not* accept this patch
> > > because the bug is elsewhere and *not* in mainline kernel.
> > > 
> > It's found in Android using kernel 3.10.20. Android has its own
> > usb_composite_driver usb/gadget/android.c (not in mainline), and it
> 
> so you found something on an old kernel using an out-of-tree gadget
> driver.
> 
> > allows userspace to disconnect the pullup (i.e., clear the run/stop bit in dwc3)
> > and remove the gadget functions like adb, mtp and then add new functions
> > like rndis, acm. The problem is when you disconnect the pullup, a gadget
> > may be in the middle of queuing a request, which results in the "start
> > transfer cmd failure". I think this is a common issue for other
> 
> Android gadget needs to learn how to cope with that.
> 
Agree.

> > usb_composite_drivers too. Normally, if one of the gadgets deactivates its
> > own function, the pullup will be disconnected, but other gadgets won't get
> > notified until their requests fail. So this patch makes dwc3 more robust
> > in dealing with these situations.
> 
> Right, but Android gadget can run on top of several other UDCs and you
> want to have a single one of them cope with android's bug ?
> 
> You'd be better off getting google to accept a bugfix to the android
> gadget, since that's where the problem lies.
> 
I agree. I'll try to push the fix to google.
It's really hard to fix the race condition (for me), as any gadget or
/sys/class/udc/soft_connect can just disconnect the pullup at any time they
want. The only thing I can do is give the request back to the
gadget if the condition happens.
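
For what the gadget side might do with a request that is given back this way, here is a minimal sketch of a completion handler that resubmits on success and stops on -ESHUTDOWN (or any other error). The names `model_ep_state` and `model_complete` are hypothetical, and the policy shown is an assumption for illustration, not the behaviour of the Android gadget.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-endpoint bookkeeping in a function driver. */
struct model_ep_state {
	int requeued;	/* how many times the request was resubmitted */
	bool stopped;	/* set once the endpoint is treated as dead */
};

/*
 * Completion handler sketch: status 0 means keep streaming; -ESHUTDOWN,
 * -ECONNRESET or any other error means stop resubmitting.
 * Returns true if the request was requeued.
 */
static bool model_complete(struct model_ep_state *st, int status)
{
	assert(st != NULL);
	switch (status) {
	case 0:
		st->requeued++;
		return true;
	case -ESHUTDOWN:
	case -ECONNRESET:
	default:
		st->stopped = true;
		return false;
	}
}
```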

Jincan

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 70715ee..8d0c3c7 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1000,9 +1000,7 @@  static int __dwc3_gadget_kick_transfer(struct dwc3_ep *dep, u16 cmd_param,
 		 * here and stop, unmap, free and del each of the linked
 		 * requests instead of what we do now.
 		 */
-		usb_gadget_unmap_request(&dwc->gadget, &req->request,
-				req->direction);
-		list_del(&req->list);
+		dwc3_gadget_giveback(dep, req, -ESHUTDOWN);
 		return ret;
 	}
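
The intent of the hunk above can be modelled in plain userspace C. This is only a sketch under simplifying assumptions: `model_request`, `model_giveback`, `model_kick_transfer` and `model_on_complete` are hypothetical stand-ins, and the real dwc3_gadget_giveback() also handles details such as DMA unmapping and locking that are left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a queued USB request. */
struct model_request {
	int status;	/* result reported to the gadget driver */
	int queued;	/* 1 while the controller owns the request */
	void (*complete)(struct model_request *req);
};

static int model_completions;	/* counts completion callbacks */

/* Example completion callback a gadget driver might register. */
static void model_on_complete(struct model_request *req)
{
	(void)req;
	model_completions++;
}

/*
 * Models the idea of a giveback: take the request off the controller's
 * queue, record the status, and notify the gadget driver.
 */
static void model_giveback(struct model_request *req, int status)
{
	assert(req != NULL);
	req->queued = 0;
	req->status = status;
	if (req->complete)
		req->complete(req);
}

/*
 * Models the patched error path: if the start-transfer command fails,
 * the request is given back with -ESHUTDOWN instead of being silently
 * unlinked, and the command's error code is returned to the caller.
 */
static int model_kick_transfer(struct model_request *req, int cmd_ret)
{
	req->queued = 1;
	if (cmd_ret) {
		model_giveback(req, -ESHUTDOWN);
		return cmd_ret;
	}
	return 0;
}
```

In this model the gadget's callback runs exactly once on a failed start, which is the notification the commit message says is missing without the patch.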