Message ID | 1446677585-28582-3-git-send-email-dianders@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On Wed, Nov 4, 2015 at 2:53 PM, Douglas Anderson <dianders@chromium.org> wrote: > In commit 94dfd7edfd5c ("USB: HCD: support giveback of URB in tasklet > context") support was added to give back the URB in tasklet context. > Let's take advantage of this in dwc2. > > This speeds up the dwc2 interrupt handler considerably. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > drivers/usb/dwc2/hcd.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c > index e79baf73c234..9e7988950c7a 100644 > --- a/drivers/usb/dwc2/hcd.c > +++ b/drivers/usb/dwc2/hcd.c > @@ -2273,9 +2273,7 @@ void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct dwc2_qtd *qtd, > kfree(qtd->urb); > qtd->urb = NULL; > > - spin_unlock(&hsotg->lock); > usb_hcd_giveback_urb(dwc2_hsotg_to_hcd(hsotg), urb, status); > - spin_lock(&hsotg->lock); > } > > /* > @@ -2888,7 +2886,7 @@ static struct hc_driver dwc2_hc_driver = { > .hcd_priv_size = sizeof(struct wrapper_priv_data), > > .irq = _dwc2_hcd_irq, > - .flags = HCD_MEMORY | HCD_USB2, > + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, > > .start = _dwc2_hcd_start, > .stop = _dwc2_hcd_stop, In the ChromeOS gerrit <https://chromium-review.googlesource.com/#/c/310583/> Julius Werner points out that for EHCI it was good to take the optimization from commit 9118f9eb4f1e ("USB: EHCI: improve interrupt qh unlink") before this one. I'm still trying to learn USB / dwc2 so it's unclear to me whether we also need a similar change before landing. I'll see if I can do some investigation about this and also some benchmarking before and after. Certainly profiling the interrupt handler itself showed a huge improvement, but I'd hate to see a regression elsewhere. If anyone else knows better than I, please speak up! :) -Doug
On Wed, 4 Nov 2015, Doug Anderson wrote: > In the ChromeOS gerrit > <https://chromium-review.googlesource.com/#/c/310583/> Julius Werner > points out that for EHCI it was good to take the optimization from > commit 9118f9eb4f1e ("USB: EHCI: improve interrupt qh unlink") before > this one. I'm still trying to learn USB / dwc2 so it's unclear to me > whether we also need a similar change before landing. > > I'll see if I can do some investigation about this and also some > benchmarking before and after. Certainly profiling the interrupt > handler itself showed a huge improvement, but I'd hate to see a > regression elsewhere. > > If anyone else knows better than I, please speak up! :) This is a matter of both efficiency and correctness. Giving back URBs in a tasklet is not a simple change. Have you read the kerneldoc for usb_submit_urb() in drivers/usb/core/urb.c? The portion about "Reserved Bandwidth Transfers" is highly relevant. I don't know how dwc2 goes about reserving bandwidth for periodic transfers, but if it relies on the endpoint queue being non-empty to maintain a reservation then it will be affected by this change. Alan Stern
Alan, On Thu, Nov 5, 2015 at 7:19 AM, Alan Stern <stern@rowland.harvard.edu> wrote: > On Wed, 4 Nov 2015, Doug Anderson wrote: > >> In the ChromeOS gerrit >> <https://chromium-review.googlesource.com/#/c/310583/> Julius Werner >> points out that for EHCI it was good to take the optimization from >> commit 9118f9eb4f1e ("USB: EHCI: improve interrupt qh unlink") before >> this one. I'm still trying to learn USB / dwc2 so it's unclear to me >> whether we also need a similar change before landing. >> >> I'll see if I can do some investigation about this and also some >> benchmarking before and after. Certainly profiling the interrupt >> handler itself showed a huge improvement, but I'd hate to see a >> regression elsewhere. >> >> If anyone else knows better than I, please speak up! :) > > This is a matter of both efficiency and correctness. Giving back URBs > in a tasklet is not a simple change. > > Have you read the kerneldoc for usb_submit_urb() in > drivers/usb/core/urb.c? The portion about "Reserved Bandwidth > Transfers" is highly relevant. I don't know how dwc2 goes about > reserving bandwidth for periodic transfers, but if it relies on the > endpoint queue being non-empty to maintain a reservation then it will > be affected by this change. It does look as if you are right and the reservation will end up being released. It looks to me like dwc2_deschedule_periodic() is in charge of releasing the reservation. I'll work on trying to actually confirm this. I guess I need to find a USB test setup where there are enough devices that I exceed the available time so I can see the brokenness of my old solution... I hadn't realized that this was a correctness problem and not just an optimization problem, so thank you very much for the info! :) I ran with a bunch of USB devices and it worked fine (and performance improved!) so I figured I was good to go... Now I've read the kerneldoc you pointed at and it was very helpful. As I understand it, it's considered OK if I copy what EHCI did and release the reservation if nothing has been scheduled for 5 ms. Quoting a friend of mine: I'm now all done adding the delayed reservation release code. Now I just need to compile it and test it. :-P My plan is add some printouts to my current implementation to see cases where the deferred "unreserve" actually saved us and (to me) that will help indicate that it's working properly. Presumably I won't see this case hit (or not much) without HCD_BH and I will see this case with HCD_BH. Please consider this patch "on hold" until my next spin. -Doug
On Thu, 5 Nov 2015, Doug Anderson wrote: > Alan, > > On Thu, Nov 5, 2015 at 7:19 AM, Alan Stern <stern@rowland.harvard.edu> wrote: > > On Wed, 4 Nov 2015, Doug Anderson wrote: > > > >> In the ChromeOS gerrit > >> <https://chromium-review.googlesource.com/#/c/310583/> Julius Werner > >> points out that for EHCI it was good to take the optimization from > >> commit 9118f9eb4f1e ("USB: EHCI: improve interrupt qh unlink") before > >> this one. I'm still trying to learn USB / dwc2 so it's unclear to me > >> whether we also need a similar change before landing. > >> > >> I'll see if I can do some investigation about this and also some > >> benchmarking before and after. Certainly profiling the interrupt > >> handler itself showed a huge improvement, but I'd hate to see a > >> regression elsewhere. > >> > >> If anyone else knows better than I, please speak up! :) > > > > This is a matter of both efficiency and correctness. Giving back URBs > > in a tasklet is not a simple change. > > > > Have you read the kerneldoc for usb_submit_urb() in > > drivers/usb/core/urb.c? The portion about "Reserved Bandwidth > > Transfers" is highly relevant. I don't know how dwc2 goes about > > reserving bandwidth for periodic transfers, but if it relies on the > > endpoint queue being non-empty to maintain a reservation then it will > > be affected by this change. > > It does look as if you are right and the reservation will end up being > released. It looks to me like dwc2_deschedule_periodic() is in charge > of releasing the reservation. I'll work on trying to actually confirm > this. I guess I need to find a USB test setup where there are enough > devices that I exceed the available time so I can see the brokenness > of my old solution... You may not need that. Try a single USB keyboard and see what happens when the interrupt URB is given back. > I hadn't realized that this was a correctness problem and not just an > optimization problem, so thank you very much for the info! :) I ran > with a bunch of USB devices and it worked fine (and performance > improved!) so I figured I was good to go... Now I've read the > kerneldoc you pointed at and it was very helpful. As I understand it, > it's considered OK if I copy what EHCI did and release the reservation > if nothing has been scheduled for 5 ms. You might also look into the issues surrounding isochronous URBs. In particular, when an URB is submitted, which frames or microframes should it be scheduled in? The problem is that when the submission occurs, some of the slots following the previous URB may already have expired. The explanations for EXDEV and EFBIG in Documentation/usb/error-codes.txt are relevant here, although terse. The host controller drivers that I maintain work like this: If the endpoint's queue is empty and none of the endpoint's URBs are still being given back by the tasklet, pretend that the URB_ISO_ASAP flag is set. Note that this involves testing hcd_periodic_completion_in_progress() -- that's where switching over to tasklets makes a difference. If the URB_ISO_ASAP flag is set, the URB is scheduled for the first unallocated slot that hasn't already expired. If the flag isn't set, try to schedule the URB for the first unallocated slot. If that means all the isoc packets in the URB would be assigned to expired slots, return -EXDEV. If some but not all of the packets would be assigned to expired slots, skip them -- only schedule the packets whose slots haven't expired yet. Alan Stern
Alan, On Fri, Nov 6, 2015 at 7:40 AM, Alan Stern <stern@rowland.harvard.edu> wrote: > On Thu, 5 Nov 2015, Doug Anderson wrote: > >> Alan, >> >> On Thu, Nov 5, 2015 at 7:19 AM, Alan Stern <stern@rowland.harvard.edu> wrote: >> > On Wed, 4 Nov 2015, Doug Anderson wrote: >> > >> >> In the ChromeOS gerrit >> >> <https://chromium-review.googlesource.com/#/c/310583/> Julius Werner >> >> points out that for EHCI it was good to take the optimization from >> >> commit 9118f9eb4f1e ("USB: EHCI: improve interrupt qh unlink") before >> >> this one. I'm still trying to learn USB / dwc2 so it's unclear to me >> >> whether we also need a similar change before landing. >> >> >> >> I'll see if I can do some investigation about this and also some >> >> benchmarking before and after. Certainly profiling the interrupt >> >> handler itself showed a huge improvement, but I'd hate to see a >> >> regression elsewhere. >> >> >> >> If anyone else knows better than I, please speak up! :) >> > >> > This is a matter of both efficiency and correctness. Giving back URBs >> > in a tasklet is not a simple change. >> > >> > Have you read the kerneldoc for usb_submit_urb() in >> > drivers/usb/core/urb.c? The portion about "Reserved Bandwidth >> > Transfers" is highly relevant. I don't know how dwc2 goes about >> > reserving bandwidth for periodic transfers, but if it relies on the >> > endpoint queue being non-empty to maintain a reservation then it will >> > be affected by this change. >> >> It does look as if you are right and the reservation will end up being >> released. It looks to me like dwc2_deschedule_periodic() is in charge >> of releasing the reservation. I'll work on trying to actually confirm >> this. I guess I need to find a USB test setup where there are enough >> devices that I exceed the available time so I can see the brokenness >> of my old solution... > > You may not need that. Try a single USB keyboard and see what happens > when the interrupt URB is given back. I haven't been able to reproduce any new errors with my patch, but I have with trace_printk I have proven that it does cause reservations to be lost and then re-gained, which isn't good. Note that dwc2 is currently having scheduling problems anyway (even before my patch). lyz@rock-chips.com posted an RFC patch to fix problems he was seeing, but I have a setup that has problems too and his patch doesn't fix it. :( ...so possibly if everything was working then I could see my patch break things, but as it is now it's hard to say. In any case, I've finished testing the patch that adds a 5us delay before releasing the reservation. I'll post that so we can be on the same page. >> I hadn't realized that this was a correctness problem and not just an >> optimization problem, so thank you very much for the info! :) I ran >> with a bunch of USB devices and it worked fine (and performance >> improved!) so I figured I was good to go... Now I've read the >> kerneldoc you pointed at and it was very helpful. As I understand it, >> it's considered OK if I copy what EHCI did and release the reservation >> if nothing has been scheduled for 5 ms. > > You might also look into the issues surrounding isochronous URBs. In > particular, when an URB is submitted, which frames or microframes > should it be scheduled in? The problem is that when the submission > occurs, some of the slots following the previous URB may already have > expired. The explanations for EXDEV and EFBIG in > Documentation/usb/error-codes.txt are relevant here, although terse. > > The host controller drivers that I maintain work like this: > > If the endpoint's queue is empty and none of the endpoint's > URBs are still being given back by the tasklet, pretend that > the URB_ISO_ASAP flag is set. Note that this involves > testing hcd_periodic_completion_in_progress() -- that's > where switching over to tasklets makes a difference. > > If the URB_ISO_ASAP flag is set, the URB is scheduled for > the first unallocated slot that hasn't already expired. > > If the flag isn't set, try to schedule the URB for the first > unallocated slot. If that means all the isoc packets in the > URB would be assigned to expired slots, return -EXDEV. If > some but not all of the packets would be assigned to expired > slots, skip them -- only schedule the packets whose slots > haven't expired yet. You're talking to someone who has only been looking at the details of USB for about 2 days now. :-P ...I'm trying to grok all of that, but I'm not sure I got it all... I will say that "URB_ISO_ASAP" is not referenced anywhere in dwc2. Presumably that's a bug (or missing feature). Would it be terrible if I just ignored that for now and said that if you try to add something to the queue and we're currently in the process of waiting for our timer to expire then you'll just get back -ENOSPC? dwc2 won't deal perfectly well if you've very close to exhausting the available bandwidth, but that could be a task for another day. -Doug
On Fri, 6 Nov 2015, Doug Anderson wrote: > You're talking to someone who has only been looking at the details of > USB for about 2 days now. :-P ...I'm trying to grok all of that, but > I'm not sure I got it all... > > I will say that "URB_ISO_ASAP" is not referenced anywhere in dwc2. > Presumably that's a bug (or missing feature). Would it be terrible if > I just ignored that for now and said that if you try to add something > to the queue and we're currently in the process of waiting for our > timer to expire then you'll just get back -ENOSPC? dwc2 won't deal > perfectly well if you've very close to exhausting the available > bandwidth, but that could be a task for another day. Sure. It's not necessary to support everything perfectly, right from the beginning. Alan Stern
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c index e79baf73c234..9e7988950c7a 100644 --- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -2273,9 +2273,7 @@ void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct dwc2_qtd *qtd, kfree(qtd->urb); qtd->urb = NULL; - spin_unlock(&hsotg->lock); usb_hcd_giveback_urb(dwc2_hsotg_to_hcd(hsotg), urb, status); - spin_lock(&hsotg->lock); } /* @@ -2888,7 +2886,7 @@ static struct hc_driver dwc2_hc_driver = { .hcd_priv_size = sizeof(struct wrapper_priv_data), .irq = _dwc2_hcd_irq, - .flags = HCD_MEMORY | HCD_USB2, + .flags = HCD_MEMORY | HCD_USB2 | HCD_BH, .start = _dwc2_hcd_start, .stop = _dwc2_hcd_stop,
In commit 94dfd7edfd5c ("USB: HCD: support giveback of URB in tasklet context") support was added to give back the URB in tasklet context. Let's take advantage of this in dwc2. This speeds up the dwc2 interrupt handler considerably. Signed-off-by: Douglas Anderson <dianders@chromium.org> --- drivers/usb/dwc2/hcd.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)