diff mbox

[RFC,v1,1/6] USB: HCD: support giveback of URB in tasklet context

Message ID 1371567833-9077-2-git-send-email-ming.lei@canonical.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ming Lei June 18, 2013, 3:03 p.m. UTC
This patch implements the mechanism of giveback of URB in
tasklet context, so that hardware interrupt handling time for
usb host controller can be saved much, and HCD interrupt handling
can be simplified.

Motivations:

1), on some arch(such as ARM), DMA mapping/unmapping is a bit
time-consuming, for example: when accessing usb mass storage
via EHCI on pandaboard, the common length of transfer buffer is 120KB,
the time consumed on DMA unmapping may reach hundreds of microseconds;
even on A15 based box, the time is still about scores of microseconds

2), on some arch, reading DMA coherent memoery is very time-consuming,
the most common example is usb video class driver[1]

3), driver's complete() callback may do much things which is driver
specific, so the time is consumed unnecessarily in hardware irq context.

4), running driver's complete() callback in hardware irq context causes
that host controller driver has to release its lock in interrupt handler,
so reacquiring the lock after return may busy wait a while and increase
interrupt handling time. More seriously, releasing the HCD lock makes
HCD becoming quite complicated to deal with introduced races.

So the patch proposes to run giveback of URB in tasklet context, then
time consumed in HCD irq handling doesn't depend on drivers' complete and
DMA mapping/unmapping any more, also we can simplify HCD since the HCD
lock isn't needed to be released during irq handling.

The patch should be reasonable and doable:

1), for drivers, they don't care if the complete() is called in hard irq
context or softirq context

2), the biggest change is the situation in which usb_submit_urb() is called
in complete() callback, so the introduced tasklet schedule delay might be a
con, but it shouldn't be a big deal:

	- control/bulk asynchronous transfer isn't sensitive to schedule
	  delay

	- the patch schedules giveback of periodic URBs using
	  tasklet_hi_schedule, so the introduced delay should be very
	  small

	- for ISOC transfer, generally, drivers submit several URBs
	  concurrently to avoid interrupt delay, so it is OK with the
	  little schedule delay.

	- for interrupt transfer, generally, drivers only submit one URB
	  at the same time, but interrupt transfer is often used in event
	  report, polling, ... situations, and a little delay should be OK.

Considered that HCDs may optimize on submitting URB in complete(), the
patch may cause the optimization not working, so introduces one flag to mark
if the HCD supports to run giveback URB in tasklet context. When all HCDs
are ready, the flag can be removed.

[1], http://marc.info/?t=136438111600010&r=1&w=2

Cc: Oliver Neukum <oliver@neukum.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/usb/core/hcd.c  |  175 ++++++++++++++++++++++++++++++++++++++++-------
 include/linux/usb/hcd.h |   17 +++++
 2 files changed, 169 insertions(+), 23 deletions(-)

Comments

Alan Stern June 18, 2013, 4:05 p.m. UTC | #1
On Tue, 18 Jun 2013, Ming Lei wrote:

> This patch implements the mechanism of giveback of URB in
> tasklet context, so that hardware interrupt handling time for
> usb host controller can be saved much, and HCD interrupt handling
> can be simplified.

Okay.  I'd make a few relatively minor changes.

> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
> index 014dc99..a272968 100644
> --- a/drivers/usb/core/hcd.c
> +++ b/drivers/usb/core/hcd.c
> @@ -699,9 +699,11 @@ error:
>  	 * Avoiding calls to local_irq_disable/enable makes the code
>  	 * RT-friendly.
>  	 */
> -	spin_unlock(&hcd_root_hub_lock);
> +	if (!hcd_giveback_urb_in_bh(hcd))
> +		spin_unlock(&hcd_root_hub_lock);
>  	usb_hcd_giveback_urb(hcd, urb, status);
> -	spin_lock(&hcd_root_hub_lock);
> +	if (!hcd_giveback_urb_in_bh(hcd))
> +		spin_lock(&hcd_root_hub_lock);
>  
>  	spin_unlock_irq(&hcd_root_hub_lock);
>  	return 0;
> @@ -742,9 +744,11 @@ void usb_hcd_poll_rh_status(struct usb_hcd *hcd)
>  			memcpy(urb->transfer_buffer, buffer, length);
>  
>  			usb_hcd_unlink_urb_from_ep(hcd, urb);
> -			spin_unlock(&hcd_root_hub_lock);
> +			if (!hcd_giveback_urb_in_bh(hcd))
> +				spin_unlock(&hcd_root_hub_lock);
>  			usb_hcd_giveback_urb(hcd, urb, 0);
> -			spin_lock(&hcd_root_hub_lock);
> +			if (!hcd_giveback_urb_in_bh(hcd))
> +				spin_lock(&hcd_root_hub_lock);
>  		} else {
>  			length = 0;
>  			set_bit(HCD_FLAG_POLL_PENDING, &hcd->flags);
> @@ -835,9 +839,11 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
>  			hcd->status_urb = NULL;
>  			usb_hcd_unlink_urb_from_ep(hcd, urb);
>  
> -			spin_unlock(&hcd_root_hub_lock);
> +			if (!hcd_giveback_urb_in_bh(hcd))
> +				spin_unlock(&hcd_root_hub_lock);
>  			usb_hcd_giveback_urb(hcd, urb, status);
> -			spin_lock(&hcd_root_hub_lock);
> +			if (!hcd_giveback_urb_in_bh(hcd))
> +				spin_lock(&hcd_root_hub_lock);
>  		}
>  	}
>   done:

None of these tests are necessary.  Root hubs are different from normal
devices; their URBs are handled mostly by usbcore.  The only part done
by the HCD is always synchronous.  And we know that root-hub URBs
always have a very short completion handler.  So we may as well
continue to handle root-hub URBs the way we always have.

> @@ -1648,6 +1654,60 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
>  
>  /*-------------------------------------------------------------------------*/
>  
> +static void __usb_hcd_giveback_urb(struct urb *urb)
> +{
> +	struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
> +	int status = urb->status;
> +
> +	urb->hcpriv = NULL;
> +	if (unlikely(urb->unlinked))
> +		status = urb->unlinked;

The way you're storing the status value isn't good.  We decided to make 
the status a separate argument because of drivers that abuse the 
urb->status field (they poll it instead of waiting for the completion 
handler to be called).  Hopefully there aren't any examples of drivers 
still doing this, but...

This means you shouldn't store anything in urb->status until just
before calling the completion handler.  What you can do instead is
always store the status value in urb->unlinked.

> +	else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
> +			urb->actual_length < urb->transfer_buffer_length &&
> +			!status))
> +		status = -EREMOTEIO;
> +
> +	unmap_urb_for_dma(hcd, urb);
> +	usbmon_urb_complete(&hcd->self, urb, status);
> +	usb_unanchor_urb(urb);
> +
> +	/* pass ownership to the completion handler */
> +	urb->status = status;
> +	urb->complete (urb);

You are supposed to disable local interrupts around the call to the 
completion handler.

> +	atomic_dec (&urb->use_count);
> +	if (unlikely(atomic_read(&urb->reject)))
> +		wake_up (&usb_kill_urb_queue);
> +	usb_put_urb (urb);
> +}
> +
> +static void usb_giveback_urb_bh(unsigned long param)
> +{
> +	struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
> +	unsigned long flags;
> +	struct list_head local_list;
> +
> +	spin_lock_irqsave(&bh->lock, flags);
> +	bh->running = 1;
> +restart:

I like to have statement labels indented by a single space character, 
so that tools like diff don't think the label marks the start of a new 
function.

> @@ -1667,25 +1727,33 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
>   */
>  void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
>  {
> -	urb->hcpriv = NULL;
> -	if (unlikely(urb->unlinked))
> -		status = urb->unlinked;
> -	else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
> -			urb->actual_length < urb->transfer_buffer_length &&
> -			!status))
> -		status = -EREMOTEIO;
> +	struct giveback_urb_bh *bh = hcd->async_bh;
> +	bool async = 1;
> +	bool sched = 1;

I know this is a common way of doing things, but to me it makes more 
sense in this case to set these values later on, in an "if ... else 
..." statement.

>  
> -	unmap_urb_for_dma(hcd, urb);
> -	usbmon_urb_complete(&hcd->self, urb, status);
> -	usb_unanchor_urb(urb);
> -
> -	/* pass ownership to the completion handler */
>  	urb->status = status;

This should now be:

	if (urb->unlinked == 0)
		urb->unlinked = status;

> -	urb->complete (urb);
> -	atomic_dec (&urb->use_count);
> -	if (unlikely(atomic_read(&urb->reject)))
> -		wake_up (&usb_kill_urb_queue);
> -	usb_put_urb (urb);
> +	if (!hcd_giveback_urb_in_bh(hcd)) {
> +		__usb_hcd_giveback_urb(urb);
> +		return;
> +	}
> +
> +	if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
> +		bh = hcd->periodic_bh;
> +		async = 0;

I don't like these names, for several reasons.  The difference between 
the two tasklets isn't that one is meant specifically for periodic URBs 
and the other for async URBs; the difference is that one runs with 
higher priority than the other.  The names should reflect this.  For 
example, you could call them hcd->low_prio_bh and hcd->high_prio_bh.  

> +	}
> +
> +	spin_lock(&bh->lock);
> +	list_add_tail(&urb->urb_list, &bh->head);
> +	if (bh->running)
> +		sched = 0;
> +	spin_unlock(&bh->lock);
> +
> +	if (sched) {
> +		if (async)
> +			tasklet_schedule(&bh->bh);
> +		else
> +			tasklet_hi_schedule(&bh->bh);
> +	}


Also, you could combine async and sched into a single variable.
Using an enumerated type for sched, this would become:

	if (!sched)
		;	/* tasklet doesn't need to be scheduled */
	else if (sched == HIGH_PRIO)
		tasklet_hi_schedule(&bh->bh);
	else
		tasklet_schedule(&bh->bh);

Or you could turn this into a "switch" statement.

> +static int init_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> +	if (!hcd_giveback_urb_in_bh(hcd))
> +		return 0;
> +
> +	hcd->periodic_bh = kzalloc(sizeof(*hcd->periodic_bh), GFP_KERNEL);
> +	if (!hcd->periodic_bh)
> +		return -ENOMEM;
> +
> +	hcd->async_bh = kzalloc(sizeof(*hcd->async_bh), GFP_KERNEL);
> +	if (!hcd->async_bh) {
> +		kfree(hcd->periodic_bh);
> +		return -ENOMEM;
> +	}

I think these things can be stored directly in the usb_hcd struct
instead of allocated separately.

> +
> +	__init_giveback_urb_bh(hcd->periodic_bh);
> +	__init_giveback_urb_bh(hcd->async_bh);
> +
> +	return 0;
> +}
> +
> +static void __exit_giveback_urb_bh(struct giveback_urb_bh *bh)
> +{
> +	tasklet_kill(&bh->bh);
> +}
> +
> +static void exit_giveback_urb_bh(struct usb_hcd *hcd)
> +{
> +	if (!hcd_giveback_urb_in_bh(hcd))
> +		return;
> +
> +	__exit_giveback_urb_bh(hcd->periodic_bh);
> +	__exit_giveback_urb_bh(hcd->async_bh);

You might as well put the tasklet_kill() call directly inline instead 
of going through a separate function.

> +
> +	kfree(hcd->periodic_bh);
> +	kfree(hcd->async_bh);
> +}
> +
>  /**
>   * usb_create_shared_hcd - create and initialize an HCD structure
>   * @driver: HC driver that will use this hcd
> @@ -2573,6 +2687,16 @@ int usb_add_hcd(struct usb_hcd *hcd,
>  			&& device_can_wakeup(&hcd->self.root_hub->dev))
>  		dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
>  
> +	if (usb_hcd_is_primary_hcd(hcd)) {
> +		retval = init_giveback_urb_bh(hcd);
> +		if (retval)
> +			goto err_init_giveback_bh;
> +	} else {
> +		/* share tasklet handling with primary hcd */
> +		hcd->async_bh = hcd->primary_hcd->async_bh;
> +		hcd->periodic_bh = hcd->primary_hcd->periodic_bh;
> +	}

Is there any reason why a secondary HCD can't have its own tasklets?

Alan Stern
Ming Lei June 19, 2013, 2:59 a.m. UTC | #2
On Wed, Jun 19, 2013 at 12:05 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 18 Jun 2013, Ming Lei wrote:
>
>> This patch implements the mechanism of giveback of URB in
>> tasklet context, so that hardware interrupt handling time for
>> usb host controller can be saved much, and HCD interrupt handling
>> can be simplified.
>
> Okay.  I'd make a few relatively minor changes.
>
>> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
>> index 014dc99..a272968 100644
>> --- a/drivers/usb/core/hcd.c
>> +++ b/drivers/usb/core/hcd.c
>> @@ -699,9 +699,11 @@ error:
>>        * Avoiding calls to local_irq_disable/enable makes the code
>>        * RT-friendly.
>>        */
>> -     spin_unlock(&hcd_root_hub_lock);
>> +     if (!hcd_giveback_urb_in_bh(hcd))
>> +             spin_unlock(&hcd_root_hub_lock);
>>       usb_hcd_giveback_urb(hcd, urb, status);
>> -     spin_lock(&hcd_root_hub_lock);
>> +     if (!hcd_giveback_urb_in_bh(hcd))
>> +             spin_lock(&hcd_root_hub_lock);
>>
>>       spin_unlock_irq(&hcd_root_hub_lock);
>>       return 0;
>> @@ -742,9 +744,11 @@ void usb_hcd_poll_rh_status(struct usb_hcd *hcd)
>>                       memcpy(urb->transfer_buffer, buffer, length);
>>
>>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
>> -                     spin_unlock(&hcd_root_hub_lock);
>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> +                             spin_unlock(&hcd_root_hub_lock);
>>                       usb_hcd_giveback_urb(hcd, urb, 0);
>> -                     spin_lock(&hcd_root_hub_lock);
>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> +                             spin_lock(&hcd_root_hub_lock);
>>               } else {
>>                       length = 0;
>>                       set_bit(HCD_FLAG_POLL_PENDING, &hcd->flags);
>> @@ -835,9 +839,11 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
>>                       hcd->status_urb = NULL;
>>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
>>
>> -                     spin_unlock(&hcd_root_hub_lock);
>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> +                             spin_unlock(&hcd_root_hub_lock);
>>                       usb_hcd_giveback_urb(hcd, urb, status);
>> -                     spin_lock(&hcd_root_hub_lock);
>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> +                             spin_lock(&hcd_root_hub_lock);
>>               }
>>       }
>>   done:
>
> None of these tests are necessary.  Root hubs are different from normal
> devices; their URBs are handled mostly by usbcore.  The only part done
> by the HCD is always synchronous.  And we know that root-hub URBs

Looks not always synchronous, control transfer is synchronous, and
interrupt transfer is still asynchronous. No drivers(hub, usbfs) depend
on that,  and IMO, treating root hub same as hub will simplify HCD core,
and finally we can remove all the above lock releasing & acquiring if
all HCDs set HCD_BH.

Also there is very less roothub transfers and always letting tasklet
handle URB giveback of roothub won't have performance problem, so
how about keeping the above tests?

> always have a very short completion handler.  So we may as well

root hub's completion handler is same with hub's, and usb_submit_urb()
is still called to submit new schedule.

> continue to handle root-hub URBs the way we always have.



>
>> @@ -1648,6 +1654,60 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
>>
>>  /*-------------------------------------------------------------------------*/
>>
>> +static void __usb_hcd_giveback_urb(struct urb *urb)
>> +{
>> +     struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
>> +     int status = urb->status;
>> +
>> +     urb->hcpriv = NULL;
>> +     if (unlikely(urb->unlinked))
>> +             status = urb->unlinked;
>
> The way you're storing the status value isn't good.  We decided to make
> the status a separate argument because of drivers that abuse the
> urb->status field (they poll it instead of waiting for the completion
> handler to be called).  Hopefully there aren't any examples of drivers
> still doing this, but...
>
> This means you shouldn't store anything in urb->status until just
> before calling the completion handler.  What you can do instead is
> always store the status value in urb->unlinked.

Good point, will do it.

>
>> +     else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
>> +                     urb->actual_length < urb->transfer_buffer_length &&
>> +                     !status))
>> +             status = -EREMOTEIO;
>> +
>> +     unmap_urb_for_dma(hcd, urb);
>> +     usbmon_urb_complete(&hcd->self, urb, status);
>> +     usb_unanchor_urb(urb);
>> +
>> +     /* pass ownership to the completion handler */
>> +     urb->status = status;
>> +     urb->complete (urb);
>
> You are supposed to disable local interrupts around the call to the
> completion handler.

I did it in the 2nd patch to highlight the change.

>
>> +     atomic_dec (&urb->use_count);
>> +     if (unlikely(atomic_read(&urb->reject)))
>> +             wake_up (&usb_kill_urb_queue);
>> +     usb_put_urb (urb);
>> +}
>> +
>> +static void usb_giveback_urb_bh(unsigned long param)
>> +{
>> +     struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
>> +     unsigned long flags;
>> +     struct list_head local_list;
>> +
>> +     spin_lock_irqsave(&bh->lock, flags);
>> +     bh->running = 1;
>> +restart:
>
> I like to have statement labels indented by a single space character,
> so that tools like diff don't think the label marks the start of a new
> function.

OK.

>
>> @@ -1667,25 +1727,33 @@ int usb_hcd_unlink_urb (struct urb *urb, int status)
>>   */
>>  void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
>>  {
>> -     urb->hcpriv = NULL;
>> -     if (unlikely(urb->unlinked))
>> -             status = urb->unlinked;
>> -     else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
>> -                     urb->actual_length < urb->transfer_buffer_length &&
>> -                     !status))
>> -             status = -EREMOTEIO;
>> +     struct giveback_urb_bh *bh = hcd->async_bh;
>> +     bool async = 1;
>> +     bool sched = 1;
>
> I know this is a common way of doing things, but to me it makes more
> sense in this case to set these values later on, in an "if ... else
> ..." statement.

OK, it is fine for me too.

>
>>
>> -     unmap_urb_for_dma(hcd, urb);
>> -     usbmon_urb_complete(&hcd->self, urb, status);
>> -     usb_unanchor_urb(urb);
>> -
>> -     /* pass ownership to the completion handler */
>>       urb->status = status;
>
> This should now be:
>
>         if (urb->unlinked == 0)
>                 urb->unlinked = status;
>
>> -     urb->complete (urb);
>> -     atomic_dec (&urb->use_count);
>> -     if (unlikely(atomic_read(&urb->reject)))
>> -             wake_up (&usb_kill_urb_queue);
>> -     usb_put_urb (urb);
>> +     if (!hcd_giveback_urb_in_bh(hcd)) {
>> +             __usb_hcd_giveback_urb(urb);
>> +             return;
>> +     }
>> +
>> +     if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
>> +             bh = hcd->periodic_bh;
>> +             async = 0;
>
> I don't like these names, for several reasons.  The difference between
> the two tasklets isn't that one is meant specifically for periodic URBs
> and the other for async URBs; the difference is that one runs with
> higher priority than the other.  The names should reflect this.  For
> example, you could call them hcd->low_prio_bh and hcd->high_prio_bh.

Yes, that is also what I am wanting to do, but forget to change the name,
thanks for pointing it out.

>
>> +     }
>> +
>> +     spin_lock(&bh->lock);
>> +     list_add_tail(&urb->urb_list, &bh->head);
>> +     if (bh->running)
>> +             sched = 0;
>> +     spin_unlock(&bh->lock);
>> +
>> +     if (sched) {
>> +             if (async)
>> +                     tasklet_schedule(&bh->bh);
>> +             else
>> +                     tasklet_hi_schedule(&bh->bh);
>> +     }
>
>
> Also, you could combine async and sched into a single variable.
> Using an enumerated type for sched, this would become:
>
>         if (!sched)
>                 ;       /* tasklet doesn't need to be scheduled */
>         else if (sched == HIGH_PRIO)
>                 tasklet_hi_schedule(&bh->bh);
>         else
>                 tasklet_schedule(&bh->bh);

Looks fine.

>
> Or you could turn this into a "switch" statement.
>
>> +static int init_giveback_urb_bh(struct usb_hcd *hcd)
>> +{
>> +     if (!hcd_giveback_urb_in_bh(hcd))
>> +             return 0;
>> +
>> +     hcd->periodic_bh = kzalloc(sizeof(*hcd->periodic_bh), GFP_KERNEL);
>> +     if (!hcd->periodic_bh)
>> +             return -ENOMEM;
>> +
>> +     hcd->async_bh = kzalloc(sizeof(*hcd->async_bh), GFP_KERNEL);
>> +     if (!hcd->async_bh) {
>> +             kfree(hcd->periodic_bh);
>> +             return -ENOMEM;
>> +     }
>
> I think these things can be stored directly in the usb_hcd struct
> instead of allocated separately.

Yes, the allocation is just inherited from last percpu allocation, :-)

>
>> +
>> +     __init_giveback_urb_bh(hcd->periodic_bh);
>> +     __init_giveback_urb_bh(hcd->async_bh);
>> +
>> +     return 0;
>> +}
>> +
>> +static void __exit_giveback_urb_bh(struct giveback_urb_bh *bh)
>> +{
>> +     tasklet_kill(&bh->bh);
>> +}
>> +
>> +static void exit_giveback_urb_bh(struct usb_hcd *hcd)
>> +{
>> +     if (!hcd_giveback_urb_in_bh(hcd))
>> +             return;
>> +
>> +     __exit_giveback_urb_bh(hcd->periodic_bh);
>> +     __exit_giveback_urb_bh(hcd->async_bh);
>
> You might as well put the tasklet_kill() call directly inline instead
> of going through a separate function.

OK.

>
>> +
>> +     kfree(hcd->periodic_bh);
>> +     kfree(hcd->async_bh);
>> +}
>> +
>>  /**
>>   * usb_create_shared_hcd - create and initialize an HCD structure
>>   * @driver: HC driver that will use this hcd
>> @@ -2573,6 +2687,16 @@ int usb_add_hcd(struct usb_hcd *hcd,
>>                       && device_can_wakeup(&hcd->self.root_hub->dev))
>>               dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
>>
>> +     if (usb_hcd_is_primary_hcd(hcd)) {
>> +             retval = init_giveback_urb_bh(hcd);
>> +             if (retval)
>> +                     goto err_init_giveback_bh;
>> +     } else {
>> +             /* share tasklet handling with primary hcd */
>> +             hcd->async_bh = hcd->primary_hcd->async_bh;
>> +             hcd->periodic_bh = hcd->primary_hcd->periodic_bh;
>> +     }
>
> Is there any reason why a secondary HCD can't have its own tasklets?

I didn't do that because both primary and secondary HCDs share one
hard interrupt handler, so basically there is no obvious advantage to
do that.

Thanks,
--
Ming Lei
Ming Lei June 19, 2013, 11:50 a.m. UTC | #3
On Wed, Jun 19, 2013 at 10:59 AM, Ming Lei <ming.lei@canonical.com> wrote:
> On Wed, Jun 19, 2013 at 12:05 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> On Tue, 18 Jun 2013, Ming Lei wrote:
>>
>>> This patch implements the mechanism of giveback of URB in
>>> tasklet context, so that hardware interrupt handling time for
>>> usb host controller can be saved much, and HCD interrupt handling
>>> can be simplified.
>>
>> Okay.  I'd make a few relatively minor changes.
>>
>>> diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
>>> index 014dc99..a272968 100644
>>> --- a/drivers/usb/core/hcd.c
>>> +++ b/drivers/usb/core/hcd.c
>>> @@ -699,9 +699,11 @@ error:
>>>        * Avoiding calls to local_irq_disable/enable makes the code
>>>        * RT-friendly.
>>>        */
>>> -     spin_unlock(&hcd_root_hub_lock);
>>> +     if (!hcd_giveback_urb_in_bh(hcd))
>>> +             spin_unlock(&hcd_root_hub_lock);
>>>       usb_hcd_giveback_urb(hcd, urb, status);
>>> -     spin_lock(&hcd_root_hub_lock);
>>> +     if (!hcd_giveback_urb_in_bh(hcd))
>>> +             spin_lock(&hcd_root_hub_lock);
>>>
>>>       spin_unlock_irq(&hcd_root_hub_lock);
>>>       return 0;
>>> @@ -742,9 +744,11 @@ void usb_hcd_poll_rh_status(struct usb_hcd *hcd)
>>>                       memcpy(urb->transfer_buffer, buffer, length);
>>>
>>>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
>>> -                     spin_unlock(&hcd_root_hub_lock);
>>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>>> +                             spin_unlock(&hcd_root_hub_lock);
>>>                       usb_hcd_giveback_urb(hcd, urb, 0);
>>> -                     spin_lock(&hcd_root_hub_lock);
>>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>>> +                             spin_lock(&hcd_root_hub_lock);
>>>               } else {
>>>                       length = 0;
>>>                       set_bit(HCD_FLAG_POLL_PENDING, &hcd->flags);
>>> @@ -835,9 +839,11 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
>>>                       hcd->status_urb = NULL;
>>>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
>>>
>>> -                     spin_unlock(&hcd_root_hub_lock);
>>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>>> +                             spin_unlock(&hcd_root_hub_lock);
>>>                       usb_hcd_giveback_urb(hcd, urb, status);
>>> -                     spin_lock(&hcd_root_hub_lock);
>>> +                     if (!hcd_giveback_urb_in_bh(hcd))
>>> +                             spin_lock(&hcd_root_hub_lock);
>>>               }
>>>       }
>>>   done:
>>
>> None of these tests are necessary.  Root hubs are different from normal
>> devices; their URBs are handled mostly by usbcore.  The only part done
>> by the HCD is always synchronous.  And we know that root-hub URBs
>
> Looks not always synchronous, control transfer is synchronous, and
> interrupt transfer is still asynchronous. No drivers(hub, usbfs) depend
> on that,  and IMO, treating root hub same as hub will simplify HCD core,
> and finally we can remove all the above lock releasing & acquiring if
> all HCDs set HCD_BH.

Actually, we can remove the above releasing and acquiring lock
unconditionally now.

>
> Also there is very less roothub transfers and always letting tasklet
> handle URB giveback of roothub won't have performance problem, so
> how about keeping the above tests?

thanks,
--
Ming Lei
Alan Stern June 19, 2013, 3:37 p.m. UTC | #4
On Wed, 19 Jun 2013, Ming Lei wrote:

> >> @@ -835,9 +839,11 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
> >>                       hcd->status_urb = NULL;
> >>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
> >>
> >> -                     spin_unlock(&hcd_root_hub_lock);
> >> +                     if (!hcd_giveback_urb_in_bh(hcd))
> >> +                             spin_unlock(&hcd_root_hub_lock);
> >>                       usb_hcd_giveback_urb(hcd, urb, status);
> >> -                     spin_lock(&hcd_root_hub_lock);
> >> +                     if (!hcd_giveback_urb_in_bh(hcd))
> >> +                             spin_lock(&hcd_root_hub_lock);
> >>               }
> >>       }
> >>   done:
> >
> > None of these tests are necessary.  Root hubs are different from normal
> > devices; their URBs are handled mostly by usbcore.  The only part done
> > by the HCD is always synchronous.  And we know that root-hub URBs
> 
> Looks not always synchronous, control transfer is synchronous, and
> interrupt transfer is still asynchronous. No drivers(hub, usbfs) depend
> on that,  and IMO, treating root hub same as hub will simplify HCD core,
> and finally we can remove all the above lock releasing & acquiring if
> all HCDs set HCD_BH.
> 
> Also there is very less roothub transfers and always letting tasklet
> handle URB giveback of roothub won't have performance problem, so
> how about keeping the above tests?
  
If you want to use the tasklets for root-hub URBs, okay.  There's no
reason to check the HCD_BH flag, though, because HCDs have only minimal
involvement in root-hub URBs.  In particular, HCD's don't call
usb_hcd_giveback_urb() for them.
  
So you can use the tasklets for _all_ root-hub URBs.  Then the tests   
above aren't necessary, and neither are the spinlock operations.

> >> @@ -2573,6 +2687,16 @@ int usb_add_hcd(struct usb_hcd *hcd,
> >>                       && device_can_wakeup(&hcd->self.root_hub->dev))
> >>               dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
> >>
> >> +     if (usb_hcd_is_primary_hcd(hcd)) {
> >> +             retval = init_giveback_urb_bh(hcd);
> >> +             if (retval)
> >> +                     goto err_init_giveback_bh;
> >> +     } else {
> >> +             /* share tasklet handling with primary hcd */
> >> +             hcd->async_bh = hcd->primary_hcd->async_bh;
> >> +             hcd->periodic_bh = hcd->primary_hcd->periodic_bh;
> >> +     }
> >
> > Is there any reason why a secondary HCD can't have its own tasklets?
> 
> I didn't do that because both primary and secondary HCDs share one
> hard interrupt handler, so basically there is no obvious advantage to
> do that.

If the bh structures are embedded directly in the hcd structure, it 
won't be possible for a secondary hcd to share its tasklets with the 
primary hcd.  Not sharing seems simpler, and there's no obvious 
disadvantage either.

Alan Stern
Ming Lei June 20, 2013, 1:50 a.m. UTC | #5
On Wed, Jun 19, 2013 at 11:37 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Wed, 19 Jun 2013, Ming Lei wrote:
>
>> >> @@ -835,9 +839,11 @@ static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
>> >>                       hcd->status_urb = NULL;
>> >>                       usb_hcd_unlink_urb_from_ep(hcd, urb);
>> >>
>> >> -                     spin_unlock(&hcd_root_hub_lock);
>> >> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> >> +                             spin_unlock(&hcd_root_hub_lock);
>> >>                       usb_hcd_giveback_urb(hcd, urb, status);
>> >> -                     spin_lock(&hcd_root_hub_lock);
>> >> +                     if (!hcd_giveback_urb_in_bh(hcd))
>> >> +                             spin_lock(&hcd_root_hub_lock);
>> >>               }
>> >>       }
>> >>   done:
>> >
>> > None of these tests are necessary.  Root hubs are different from normal
>> > devices; their URBs are handled mostly by usbcore.  The only part done
>> > by the HCD is always synchronous.  And we know that root-hub URBs
>>
>> Looks not always synchronous, control transfer is synchronous, and
>> interrupt transfer is still asynchronous. No drivers(hub, usbfs) depend
>> on that,  and IMO, treating root hub same as hub will simplify HCD core,
>> and finally we can remove all the above lock releasing & acquiring if
>> all HCDs set HCD_BH.
>>
>> Also there is very less roothub transfers and always letting tasklet
>> handle URB giveback of roothub won't have performance problem, so
>> how about keeping the above tests?
>
> If you want to use the tasklets for root-hub URBs, okay.  There's no
> reason to check the HCD_BH flag, though, because HCDs have only minimal
> involvement in root-hub URBs.  In particular, HCD's don't call
> usb_hcd_giveback_urb() for them.

Looks both root hub's control and interrupt transfer call usb_hcd_giveback_urb()
to complete URBs, don't they?

> So you can use the tasklets for _all_ root-hub URBs.  Then the tests
> above aren't necessary, and neither are the spinlock operations.

Yes, that is what I am going to do.

>
>> >> @@ -2573,6 +2687,16 @@ int usb_add_hcd(struct usb_hcd *hcd,
>> >>                       && device_can_wakeup(&hcd->self.root_hub->dev))
>> >>               dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
>> >>
>> >> +     if (usb_hcd_is_primary_hcd(hcd)) {
>> >> +             retval = init_giveback_urb_bh(hcd);
>> >> +             if (retval)
>> >> +                     goto err_init_giveback_bh;
>> >> +     } else {
>> >> +             /* share tasklet handling with primary hcd */
>> >> +             hcd->async_bh = hcd->primary_hcd->async_bh;
>> >> +             hcd->periodic_bh = hcd->primary_hcd->periodic_bh;
>> >> +     }
>> >
>> > Is there any reason why a secondary HCD can't have its own tasklets?
>>
>> I didn't do that because both primary and secondary HCDs share one
>> hard interrupt handler, so basically there is no obvious advantage to
>> do that.
>
> If the bh structures are embedded directly in the hcd structure, it
> won't be possible for a secondary hcd to share its tasklets with the
> primary hcd.  Not sharing seems simpler, and there's no obvious
> disadvantage either.

OK, I will let secondary HCD have its own tasklet in v2.

Thanks,
--
Ming Lei
Alan Stern June 20, 2013, 2:59 p.m. UTC | #6
On Thu, 20 Jun 2013, Ming Lei wrote:

> >> Looks not always synchronous, control transfer is synchronous, and
> >> interrupt transfer is still asynchronous. No drivers(hub, usbfs) depend
> >> on that,  and IMO, treating root hub same as hub will simplify HCD core,
> >> and finally we can remove all the above lock releasing & acquiring if
> >> all HCDs set HCD_BH.
> >>
> >> Also there is very less roothub transfers and always letting tasklet
> >> handle URB giveback of roothub won't have performance problem, so
> >> how about keeping the above tests?
> >
> > If you want to use the tasklets for root-hub URBs, okay.  There's no
> > reason to check the HCD_BH flag, though, because HCDs have only minimal
> > involvement in root-hub URBs.  In particular, HCD's don't call
> > usb_hcd_giveback_urb() for them.
> 
> Looks both root hub's control and interrupt transfer call usb_hcd_giveback_urb()
> to complete URBs, don't they?

They do.  But those calls come from within usbcore, not from within the 
HCD.  Which means it doesn't matter whether the HCD_BH flag is set.

When working with URBs sent to a root hub, it many ways it's as though 
usbcore acts as its own HCD.  It takes care of all the locking and 
giving back the URBs.

> > So you can use the tasklets for _all_ root-hub URBs.  Then the tests
> > above aren't necessary, and neither are the spinlock operations.
> 
> Yes, that is what I am going to do.

Okay.

By the way, did you consider the race that Oliver pointed out?  When an
HCD is removed, all the outstanding URBs for all devices on its bus get
cancelled.  The core waits until the urb_list for each endpoint is
empty.

In the past this was good enough.  But now it looks like we will also
need to wait until the tasklet lists are empty and the tasklets aren't
running before they get killed.  Your __exit_giveback_urb_bh() routine 
doesn't seem to do that.

(Probably it's sufficient to wait until the tasklet lists are empty.  I 
assume tasklet_kill() won't stop a tasklet that's currently running.)

Alan Stern
Ming Lei June 20, 2013, 3:13 p.m. UTC | #7
On Thu, Jun 20, 2013 at 10:59 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> By the way, did you consider the race that Oliver pointed out?  When an
> HCD is removed, all the outstanding URBs for all devices on its bus get
> cancelled.  The core waits until the urb_list for each endpoint is
> empty.

This should be enough since during remove path usbcore will wait for all
URBs' completion which is only triggered by tasklet, so once all URBs are
finished, the tasklet list has been empty already.

>
> In the past this was good enough.  But now it looks like we will also
> need to wait until the tasklet lists are empty and the tasklets aren't
> running before they get killed.  Your __exit_giveback_urb_bh() routine
> doesn't seem to do that.
>
> (Probably it's sufficient to wait until the tasklet lists are empty.  I
> assume tasklet_kill() won't stop a tasklet that's currently running.)

From the implementation of tasklet_kill(), it will wait for the completion
of the tasklet.

Actually the tasklet_kill() should not be necessary, I think.


Thanks,
--
Ming Lei
Alan Stern June 20, 2013, 4:52 p.m. UTC | #8
On Thu, 20 Jun 2013, Ming Lei wrote:

> On Thu, Jun 20, 2013 at 10:59 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > By the way, did you consider the race that Oliver pointed out?  When an
> > HCD is removed, all the outstanding URBs for all devices on its bus get
> > cancelled.  The core waits until the urb_list for each endpoint is
> > empty.
> 
> This should be enough since during remove path usbcore will wait for all
> URBs' completion which is only triggered by tasklet, so once all URBs are
> finished, the tasklet list has been empty already.

No, usbcore only waits until the urb_list for each endpoint is empty.  
It doesn't keep track of the individual URB completions.  Look at
usb_hcd_flush_endpoint() and you'll see.

> > (Probably it's sufficient to wait until the tasklet lists are empty.  I
> > assume tasklet_kill() won't stop a tasklet that's currently running.)
> 
> From the implementation of tasklet_kill(), it will wait for the completion
> of the tasklet.
> 
> Actually the tasklet_kill() should not be necessary, I think.

You're probably right.  But we should wait until the tasklet's list is 
empty; we don't want to deallocate an hcd structure until all the URBs 
are taken off.

Alan Stern
Ming Lei June 21, 2013, 1:12 a.m. UTC | #9
On Fri, Jun 21, 2013 at 12:52 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Thu, 20 Jun 2013, Ming Lei wrote:
>
>> On Thu, Jun 20, 2013 at 10:59 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> >
>> > By the way, did you consider the race that Oliver pointed out?  When an
>> > HCD is removed, all the outstanding URBs for all devices on its bus get
>> > cancelled.  The core waits until the urb_list for each endpoint is
>> > empty.
>>
>> This should be enough since during remove path usbcore will wait for all
>> URBs' completion which is only triggered by tasklet, so once all URBs are
>> finished, the tasklet list has been empty already.
>
> No, usbcore only waits until the urb_list for each endpoint is empty.
> It doesn't keep track of the individual URB completions.  Look at
> usb_hcd_flush_endpoint() and you'll see.

But, if URBs still aren't completed after usb_disconnect(), it should be
bug of usbcore or driver, shouldn't it?

>
>> > (Probably it's sufficient to wait until the tasklet lists are empty.  I
>> > assume tasklet_kill() won't stop a tasklet that's currently running.)
>>
>> From the implementation of tasklet_kill(), it will wait for the completion
>> of the tasklet.
>>
>> Actually the tasklet_kill() should not be necessary, I think.
>
> You're probably right.  But we should wait until the tasklet's list is
> empty; we don't want to deallocate an hcd structure until all the URBs
> are taken off.

I will keep the tasklet_kill() for safe reason.

Thanks,
--
Ming Lei
Ming Lei June 21, 2013, 4:46 a.m. UTC | #10
On Fri, Jun 21, 2013 at 9:12 AM, Ming Lei <ming.lei@canonical.com> wrote:
> On Fri, Jun 21, 2013 at 12:52 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> On Thu, 20 Jun 2013, Ming Lei wrote:
>>
>>> On Thu, Jun 20, 2013 at 10:59 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>>> >
>>> > By the way, did you consider the race that Oliver pointed out?  When an
>>> > HCD is removed, all the outstanding URBs for all devices on its bus get
>>> > cancelled.  The core waits until the urb_list for each endpoint is
>>> > empty.
>>>
>>> This should be enough since during remove path usbcore will wait for all
>>> URBs' completion which is only triggered by tasklet, so once all URBs are
>>> finished, the tasklet list has been empty already.
>>
>> No, usbcore only waits until the urb_list for each endpoint is empty.
>> It doesn't keep track of the individual URB completions.  Look at
>> usb_hcd_flush_endpoint() and you'll see.
>
> But, if URBs still aren't completed after usb_disconnect(), it should be
> bug of usbcore or driver, shouldn't it?

Thought about the problem further, there should be one problem:

- we can't let URBs survive from deleting the interface, otherwise
the module in which the completion handler lives might have been
removed

- so tasklet_kill() should be added into usb_hcd_synchronize_unlinks()
to make sure the above point, although it is a bit tricky since tasklet_kill()
flushes global list, but it does work.


Thanks,
--
Ming Lei
Ming Lei June 21, 2013, 5:13 a.m. UTC | #11
On Fri, Jun 21, 2013 at 12:46 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Fri, Jun 21, 2013 at 9:12 AM, Ming Lei <ming.lei@canonical.com> wrote:
>> On Fri, Jun 21, 2013 at 12:52 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
>>> On Thu, 20 Jun 2013, Ming Lei wrote:
>>>
>>>> On Thu, Jun 20, 2013 at 10:59 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>>>> >
>>>> > By the way, did you consider the race that Oliver pointed out?  When an
>>>> > HCD is removed, all the outstanding URBs for all devices on its bus get
>>>> > cancelled.  The core waits until the urb_list for each endpoint is
>>>> > empty.
>>>>
>>>> This should be enough since during remove path usbcore will wait for all
>>>> URBs' completion which is only triggered by tasklet, so once all URBs are
>>>> finished, the tasklet list has been empty already.
>>>
>>> No, usbcore only waits until the urb_list for each endpoint is empty.
>>> It doesn't keep track of the individual URB completions.  Look at
>>> usb_hcd_flush_endpoint() and you'll see.
>>
>> But, if URBs still aren't completed after usb_disconnect(), it should be
>> bug of usbcore or driver, shouldn't it?
>
> Thought about the problem further, there should be one problem:
>
> - we can't let URBs survive from deleting the interface, otherwise
> the module in which the completion handler lives might have been
> removed
>
> - so tasklet_kill() should be added into usb_hcd_synchronize_unlinks()
> to make sure the above point, although it is a bit tricky since tasklet_kill()
> flushes global list, but it does work.

Maybe usb_suspend_both() need to call usb_hcd_synchronize_unlinks(dev),
both the two cases(disconnect, suspend) suppose drivers don't kill their
URBs in remove() and suspend().

Even we can implement one function which only flushes URBs for one device
to optimize the cases.

Thanks,
--
Ming Lei
Oliver Neukum June 21, 2013, 8:33 a.m. UTC | #12
On Friday 21 June 2013 09:12:38 Ming Lei wrote:
> >> This should be enough since during remove path usbcore will wait for all
> >> URBs' completion which is only triggered by tasklet, so once all URBs are
> >> finished, the tasklet list has been empty already.
> >
> > No, usbcore only waits until the urb_list for each endpoint is empty.
> > It doesn't keep track of the individual URB completions.  Look at
> > usb_hcd_flush_endpoint() and you'll see.
> 
> But, if URBs still aren't completed after usb_disconnect(), it should be
> bug of usbcore or driver, shouldn't it?

A driver cannot know what caused the call to disconnect(). That includes
driver unbind. Doing IO to a device that another driver may be bound to
is certainly a bug.
Nevertheless it usually works and it is desirable to keep that behavior.

	Regards
		Oliver
Ming Lei June 21, 2013, 9:13 a.m. UTC | #13
On Fri, Jun 21, 2013 at 4:33 PM, Oliver Neukum <oneukum@suse.de> wrote:
> On Friday 21 June 2013 09:12:38 Ming Lei wrote:
>> >> This should be enough since during remove path usbcore will wait for all
>> >> URBs' completion which is only triggered by tasklet, so once all URBs are
>> >> finished, the tasklet list has been empty already.
>> >
>> > No, usbcore only waits until the urb_list for each endpoint is empty.
>> > It doesn't keep track of the individual URB completions.  Look at
>> > usb_hcd_flush_endpoint() and you'll see.
>>
>> But, if URBs still aren't completed after usb_disconnect(), it should be
>> bug of usbcore or driver, shouldn't it?
>
> A driver cannot know what caused the call to disconnect(). That includes
> driver unbind. Doing IO to a device that another driver may be bound to
> is certainly a bug.
> Nevertheless it usually works and it is desirable to keep that behavior.

I mean once driver is unbound, URB's complete() in that driver shouldn't be
called.  The situation might happen when driver->remove() doesn't
kill the URBs with the patch applied.

Thanks,
--
Ming Lei
Oliver Neukum June 21, 2013, 9:20 a.m. UTC | #14
On Friday 21 June 2013 17:13:48 Ming Lei wrote:
> On Fri, Jun 21, 2013 at 4:33 PM, Oliver Neukum <oneukum@suse.de> wrote:
> > On Friday 21 June 2013 09:12:38 Ming Lei wrote:
> >> >> This should be enough since during remove path usbcore will wait for all
> >> >> URBs' completion which is only triggered by tasklet, so once all URBs are
> >> >> finished, the tasklet list has been empty already.
> >> >
> >> > No, usbcore only waits until the urb_list for each endpoint is empty.
> >> > It doesn't keep track of the individual URB completions.  Look at
> >> > usb_hcd_flush_endpoint() and you'll see.
> >>
> >> But, if URBs still aren't completed after usb_disconnect(), it should be
> >> bug of usbcore or driver, shouldn't it?
> >
> > A driver cannot know what caused the call to disconnect(). That includes
> > driver unbind. Doing IO to a device that another driver may be bound to
> > is certainly a bug.
> > Nevertheless it usually works and it is desirable to keep that behavior.
> 
> I mean once driver is unbound, URB's complete() in that driver shouldn't be

This is highly problematic. It is bound to cause resource leaks.

> called.  The situation might happen when driver->remove() doesn't
> kill the URBs with the patch applied.

Well, there is no good way to handle this. But we have a simple rule.
If usb_submit_urb() succeeds, complete() will be called.
Breaking this rule is a bad idea.

The best way is really to make sure that no URB survive.

	Regards
		Oliver
Ming Lei June 21, 2013, 9:43 a.m. UTC | #15
On Fri, Jun 21, 2013 at 5:20 PM, Oliver Neukum <oneukum@suse.de> wrote:
>
> This is highly problematic. It is bound to cause resource leaks.
>
>> called.  The situation might happen when driver->remove() doesn't
>> kill the URBs with the patch applied.
>
> Well, there is no good way to handle this. But we have a simple rule.
> If usb_submit_urb() succeeds, complete() will be called.
> Breaking this rule is a bad idea.

The rule should be enhanced by calling complete() before
completing driver unbind.

>
> The best way is really to make sure that no URB survive.

Drivers may let usbcore to do that by not setting driver->soft_unbind.

I will fix the problem in v2, one solution I thought of is to flush
the endpoint's URBs which have been added to tasklet list
at the end of usb_hcd_flush_endpoint(). Any comments?

Thanks,
--
Ming Lei
Ming Lei June 21, 2013, 10:09 a.m. UTC | #16
On Fri, Jun 21, 2013 at 5:43 PM, Ming Lei <ming.lei@canonical.com> wrote:
> On Fri, Jun 21, 2013 at 5:20 PM, Oliver Neukum <oneukum@suse.de> wrote:
>
> Drivers may let usbcore to do that by not setting driver->soft_unbind.
>
> I will fix the problem in v2, one solution I thought of is to flush
> the endpoint's URBs which have been added to tasklet list
> at the end of usb_hcd_flush_endpoint(). Any comments?

Or we can wait for completion of all URBs in the endpoint at the end
of usb_hcd_flush_endpoint(), I think this approach should be easier.

Thanks,
--
Ming Lei
Alan Stern June 21, 2013, 2:48 p.m. UTC | #17
On Fri, 21 Jun 2013, Ming Lei wrote:

> On Fri, Jun 21, 2013 at 5:20 PM, Oliver Neukum <oneukum@suse.de> wrote:
> >
> > This is highly problematic. It is bound to cause resource leaks.
> >
> >> called.  The situation might happen when driver->remove() doesn't
> >> kill the URBs with the patch applied.
> >
> > Well, there is no good way to handle this. But we have a simple rule.
> > If usb_submit_urb() succeeds, complete() will be called.
> > Breaking this rule is a bad idea.
> 
> The rule should be enhanced by calling complete() before
> completing driver unbind.
> 
> >
> > The best way is really to make sure that no URB survive.
> 
> Drivers may let usbcore to do that by not setting driver->soft_unbind.
> 
> I will fix the problem in v2, one solution I thought of is to flush
> the endpoint's URBs which have been added to tasklet list
> at the end of usb_hcd_flush_endpoint(). Any comments?

Come to think of it, this shouldn't be a problem.  Drivers _must_
insure that all their URBs have completed before their disconnect
routine returns; otherwise the completion handler could get called
after the driver has been unloaded from memory.

Currently we have no way to enforce this in usbcore, although with the
tasklets we could enforce it.  But I don't think it is necessary.  It
would be sufficient to log a warning if the tasklet's URB list isn't
empty when exit_giveback_urb_bh() runs.  It won't happen unless there's
a buggy driver.

Besides, even if you try to flush all the URBs on the tasklet's list at 
the end of usb_hcd_flush_endpoint(), you'll still miss the URBs which 
have been moved to the local_list in usb_giveback_urb_bh().  You'd have 
to wait until the tasklet wasn't running, and it would most likely be a 
waste of time.

Alan Stern
Ming Lei June 21, 2013, 4:12 p.m. UTC | #18
On Fri, Jun 21, 2013 at 10:48 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> Come to think of it, this shouldn't be a problem.  Drivers _must_
> insure that all their URBs have completed before their disconnect
> routine returns; otherwise the completion handler could get called
> after the driver has been unloaded from memory.
>
> Currently we have no way to enforce this in usbcore, although with the
> tasklets we could enforce it.  But I don't think it is necessary.  It

One way of enforcing this I thought of is to wait for completions of all
unlinking URBs at the end of usb_hcd_flush_endpoint().

But as you said, it may not be necessary.

> would be sufficient to log a warning if the tasklet's URB list isn't
> empty when exit_giveback_urb_bh() runs.  It won't happen unless there's
> a buggy driver.
>
> Besides, even if you try to flush all the URBs on the tasklet's list at
> the end of usb_hcd_flush_endpoint(), you'll still miss the URBs which
> have been moved to the local_list in usb_giveback_urb_bh().  You'd have
> to wait until the tasklet wasn't running, and it would most likely be a
> waste of time.


Thanks,
--
Ming Lei
diff mbox

Patch

diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 014dc99..a272968 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -699,9 +699,11 @@  error:
 	 * Avoiding calls to local_irq_disable/enable makes the code
 	 * RT-friendly.
 	 */
-	spin_unlock(&hcd_root_hub_lock);
+	if (!hcd_giveback_urb_in_bh(hcd))
+		spin_unlock(&hcd_root_hub_lock);
 	usb_hcd_giveback_urb(hcd, urb, status);
-	spin_lock(&hcd_root_hub_lock);
+	if (!hcd_giveback_urb_in_bh(hcd))
+		spin_lock(&hcd_root_hub_lock);
 
 	spin_unlock_irq(&hcd_root_hub_lock);
 	return 0;
@@ -742,9 +744,11 @@  void usb_hcd_poll_rh_status(struct usb_hcd *hcd)
 			memcpy(urb->transfer_buffer, buffer, length);
 
 			usb_hcd_unlink_urb_from_ep(hcd, urb);
-			spin_unlock(&hcd_root_hub_lock);
+			if (!hcd_giveback_urb_in_bh(hcd))
+				spin_unlock(&hcd_root_hub_lock);
 			usb_hcd_giveback_urb(hcd, urb, 0);
-			spin_lock(&hcd_root_hub_lock);
+			if (!hcd_giveback_urb_in_bh(hcd))
+				spin_lock(&hcd_root_hub_lock);
 		} else {
 			length = 0;
 			set_bit(HCD_FLAG_POLL_PENDING, &hcd->flags);
@@ -835,9 +839,11 @@  static int usb_rh_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 			hcd->status_urb = NULL;
 			usb_hcd_unlink_urb_from_ep(hcd, urb);
 
-			spin_unlock(&hcd_root_hub_lock);
+			if (!hcd_giveback_urb_in_bh(hcd))
+				spin_unlock(&hcd_root_hub_lock);
 			usb_hcd_giveback_urb(hcd, urb, status);
-			spin_lock(&hcd_root_hub_lock);
+			if (!hcd_giveback_urb_in_bh(hcd))
+				spin_lock(&hcd_root_hub_lock);
 		}
 	}
  done:
@@ -1648,6 +1654,60 @@  int usb_hcd_unlink_urb (struct urb *urb, int status)
 
 /*-------------------------------------------------------------------------*/
 
+static void __usb_hcd_giveback_urb(struct urb *urb)
+{
+	struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
+	int status = urb->status;
+
+	urb->hcpriv = NULL;
+	if (unlikely(urb->unlinked))
+		status = urb->unlinked;
+	else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
+			urb->actual_length < urb->transfer_buffer_length &&
+			!status))
+		status = -EREMOTEIO;
+
+	unmap_urb_for_dma(hcd, urb);
+	usbmon_urb_complete(&hcd->self, urb, status);
+	usb_unanchor_urb(urb);
+
+	/* pass ownership to the completion handler */
+	urb->status = status;
+	urb->complete (urb);
+	atomic_dec (&urb->use_count);
+	if (unlikely(atomic_read(&urb->reject)))
+		wake_up (&usb_kill_urb_queue);
+	usb_put_urb (urb);
+}
+
+static void usb_giveback_urb_bh(unsigned long param)
+{
+	struct giveback_urb_bh *bh = (struct giveback_urb_bh *)param;
+	unsigned long flags;
+	struct list_head local_list;
+
+	spin_lock_irqsave(&bh->lock, flags);
+	bh->running = 1;
+restart:
+	list_replace_init(&bh->head, &local_list);
+	spin_unlock_irqrestore(&bh->lock, flags);
+
+	while (!list_empty(&local_list)) {
+		struct urb *urb;
+
+		urb = list_entry(local_list.next, struct urb, urb_list);
+		list_del_init(&urb->urb_list);
+		__usb_hcd_giveback_urb(urb);
+	}
+
+	/* check if there are new URBs to giveback */
+	spin_lock_irqsave(&bh->lock, flags);
+	if (!list_empty(&bh->head))
+		goto restart;
+	bh->running = 0;
+	spin_unlock_irqrestore(&bh->lock, flags);
+}
+
 /**
  * usb_hcd_giveback_urb - return URB from HCD to device driver
  * @hcd: host controller returning the URB
@@ -1667,25 +1727,33 @@  int usb_hcd_unlink_urb (struct urb *urb, int status)
  */
 void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
 {
-	urb->hcpriv = NULL;
-	if (unlikely(urb->unlinked))
-		status = urb->unlinked;
-	else if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
-			urb->actual_length < urb->transfer_buffer_length &&
-			!status))
-		status = -EREMOTEIO;
+	struct giveback_urb_bh *bh = hcd->async_bh;
+	bool async = 1;
+	bool sched = 1;
 
-	unmap_urb_for_dma(hcd, urb);
-	usbmon_urb_complete(&hcd->self, urb, status);
-	usb_unanchor_urb(urb);
-
-	/* pass ownership to the completion handler */
 	urb->status = status;
-	urb->complete (urb);
-	atomic_dec (&urb->use_count);
-	if (unlikely(atomic_read(&urb->reject)))
-		wake_up (&usb_kill_urb_queue);
-	usb_put_urb (urb);
+	if (!hcd_giveback_urb_in_bh(hcd)) {
+		__usb_hcd_giveback_urb(urb);
+		return;
+	}
+
+	if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
+		bh = hcd->periodic_bh;
+		async = 0;
+	}
+
+	spin_lock(&bh->lock);
+	list_add_tail(&urb->urb_list, &bh->head);
+	if (bh->running)
+		sched = 0;
+	spin_unlock(&bh->lock);
+
+	if (sched) {
+		if (async)
+			tasklet_schedule(&bh->bh);
+		else
+			tasklet_hi_schedule(&bh->bh);
+	}
 }
 EXPORT_SYMBOL_GPL(usb_hcd_giveback_urb);
 
@@ -2307,6 +2375,52 @@  EXPORT_SYMBOL_GPL (usb_hc_died);
 
 /*-------------------------------------------------------------------------*/
 
+static void __init_giveback_urb_bh(struct giveback_urb_bh *bh)
+{
+
+	spin_lock_init(&bh->lock);
+	INIT_LIST_HEAD(&bh->head);
+	tasklet_init(&bh->bh, usb_giveback_urb_bh, (unsigned long)bh);
+}
+
+static int init_giveback_urb_bh(struct usb_hcd *hcd)
+{
+	if (!hcd_giveback_urb_in_bh(hcd))
+		return 0;
+
+	hcd->periodic_bh = kzalloc(sizeof(*hcd->periodic_bh), GFP_KERNEL);
+	if (!hcd->periodic_bh)
+		return -ENOMEM;
+
+	hcd->async_bh = kzalloc(sizeof(*hcd->async_bh), GFP_KERNEL);
+	if (!hcd->async_bh) {
+		kfree(hcd->periodic_bh);
+		return -ENOMEM;
+	}
+
+	__init_giveback_urb_bh(hcd->periodic_bh);
+	__init_giveback_urb_bh(hcd->async_bh);
+
+	return 0;
+}
+
+static void __exit_giveback_urb_bh(struct giveback_urb_bh *bh)
+{
+	tasklet_kill(&bh->bh);
+}
+
+static void exit_giveback_urb_bh(struct usb_hcd *hcd)
+{
+	if (!hcd_giveback_urb_in_bh(hcd))
+		return;
+
+	__exit_giveback_urb_bh(hcd->periodic_bh);
+	__exit_giveback_urb_bh(hcd->async_bh);
+
+	kfree(hcd->periodic_bh);
+	kfree(hcd->async_bh);
+}
+
 /**
  * usb_create_shared_hcd - create and initialize an HCD structure
  * @driver: HC driver that will use this hcd
@@ -2573,6 +2687,16 @@  int usb_add_hcd(struct usb_hcd *hcd,
 			&& device_can_wakeup(&hcd->self.root_hub->dev))
 		dev_dbg(hcd->self.controller, "supports USB remote wakeup\n");
 
+	if (usb_hcd_is_primary_hcd(hcd)) {
+		retval = init_giveback_urb_bh(hcd);
+		if (retval)
+			goto err_init_giveback_bh;
+	} else {
+		/* share tasklet handling with primary hcd */
+		hcd->async_bh = hcd->primary_hcd->async_bh;
+		hcd->periodic_bh = hcd->primary_hcd->periodic_bh;
+	}
+
 	/* enable irqs just before we start the controller,
 	 * if the BIOS provides legacy PCI irqs.
 	 */
@@ -2636,6 +2760,8 @@  err_hcd_driver_start:
 	if (usb_hcd_is_primary_hcd(hcd) && hcd->irq > 0)
 		free_irq(irqnum, hcd);
 err_request_irq:
+	exit_giveback_urb_bh(hcd);
+err_init_giveback_bh:
 err_hcd_driver_setup:
 err_set_rh_speed:
 	usb_put_dev(hcd->self.root_hub);
@@ -2681,6 +2807,9 @@  void usb_remove_hcd(struct usb_hcd *hcd)
 	usb_disconnect(&rhdev);		/* Sets rhdev to NULL */
 	mutex_unlock(&usb_bus_list_lock);
 
+	if (usb_hcd_is_primary_hcd(hcd))
+		exit_giveback_urb_bh(hcd);
+
 	/* Prevent any more root-hub status calls from the timer.
 	 * The HCD might still restart the timer (if a port status change
 	 * interrupt occurs), but usb_hcd_poll_rh_status() won't invoke
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 1e88377..9fd3409 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -22,6 +22,7 @@ 
 #ifdef __KERNEL__
 
 #include <linux/rwsem.h>
+#include <linux/interrupt.h>
 
 #define MAX_TOPO_LEVEL		6
 
@@ -67,6 +68,13 @@ 
 
 /*-------------------------------------------------------------------------*/
 
+struct giveback_urb_bh {
+	bool running;
+	spinlock_t lock;
+	struct list_head  head;
+	struct tasklet_struct bh;
+};
+
 struct usb_hcd {
 
 	/*
@@ -139,6 +147,9 @@  struct usb_hcd {
 	resource_size_t		rsrc_len;	/* memory/io resource length */
 	unsigned		power_budget;	/* in mA, 0 = no limit */
 
+	struct giveback_urb_bh  *periodic_bh;
+	struct giveback_urb_bh  *async_bh;
+
 	/* bandwidth_mutex should be taken before adding or removing
 	 * any new bus bandwidth constraints:
 	 *   1. Before adding a configuration for a new device.
@@ -221,6 +232,7 @@  struct hc_driver {
 #define	HCD_USB25	0x0030		/* Wireless USB 1.0 (USB 2.5)*/
 #define	HCD_USB3	0x0040		/* USB 3.0 */
 #define	HCD_MASK	0x0070
+#define	HCD_BH		0x0100		/* URB complete in BH context */
 
 	/* called to init HCD and root hub */
 	int	(*reset) (struct usb_hcd *hcd);
@@ -361,6 +373,11 @@  struct hc_driver {
 	int	(*find_raw_port_number)(struct usb_hcd *, int);
 };
 
+static inline int hcd_giveback_urb_in_bh(struct usb_hcd *hcd)
+{
+	return hcd->driver->flags & HCD_BH;
+}
+
 extern int usb_hcd_link_urb_to_ep(struct usb_hcd *hcd, struct urb *urb);
 extern int usb_hcd_check_unlink_urb(struct usb_hcd *hcd, struct urb *urb,
 		int status);