Message ID | 1447721484-22548-7-git-send-email-dianders@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On Mon, Nov 16, 2015 at 4:51 PM, Douglas Anderson <dianders@chromium.org> wrote: > Until we have logic to determine which devices share the same TT let's > add logic to assume that all devices on a given dwc2 controller are on > one single_tt hub. This is better than the previous code that assumed > that all devices were on one multi_tt hub, since single_tt hubs > appear (in my experience) to be much more common and any schedule that > would work on a single_tt hub will also work on a multi_tt hub. This > will prevent more than 8 total low/full speed devices to be on the bus > at one time, but that's a reasonable restriction until we've made things > smarter. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > Changes in v3: > - Assuming single_tt is new for v3; not terribly well tested (yet). > > Changes in v2: None > > drivers/usb/dwc2/core.h | 1 + > drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- > 2 files changed, 40 insertions(+), 1 deletion(-) Just as a FYI I managed to make this a little better with <http://crosreview.com/312981>, but not posting that yet because Julius has pointed out some things offline that I could be doing better (actually schedule the low speed bus properly). I'll hopefully post something more soon... -Doug
Hi, Douglas Anderson <dianders@chromium.org> writes: > Until we have logic to determine which devices share the same TT let's > add logic to assume that all devices on a given dwc2 controller are on > one single_tt hub. This is better than the previous code that assumed > that all devices were on one multi_tt hub, since single_tt hubs > appear (in my experience) to be much more common and any schedule that > would work on a single_tt hub will also work on a multi_tt hub. This > will prevent more than 8 total low/full speed devices to be on the bus > at one time, but that's a reasonable restriction until we've made things > smarter. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > Changes in v3: > - Assuming single_tt is new for v3; not terribly well tested (yet). > > Changes in v2: None > > drivers/usb/dwc2/core.h | 1 + > drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- > 2 files changed, 40 insertions(+), 1 deletion(-) > > diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h > index 567ee2c9e69f..09aa2b5ae29e 100644 > --- a/drivers/usb/dwc2/core.h > +++ b/drivers/usb/dwc2/core.h > @@ -782,6 +782,7 @@ struct dwc2_hsotg { > u16 periodic_usecs; > unsigned long periodic_bitmap[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, > BITS_PER_LONG)]; > + bool has_split[8]; why don't you use a u8 instead then just set each bit for each "has_split" you need to take care of. This array is kinda ugly. > @@ -386,6 +409,13 @@ static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh) > bitmap_set(hsotg->periodic_bitmap, start, qh->usecs); > qh->start_usecs = start; > > + if (qh->do_split) { > + for (i = start / EARLY_FRAME_USEC; > + i < DIV_ROUND_UP(start + utime - 1, EARLY_FRAME_USEC); > + i++) > + hsotg->has_split[i] = true; hsotg->has_split |= BIT(i); > @@ -546,6 +577,13 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg, > } > > bitmap_clear(hsotg->periodic_bitmap, start, utime); > + > + if (qh->do_split) { > + for (i = start / EARLY_FRAME_USEC; > + i < DIV_ROUND_UP(start + utime - 1, EARLY_FRAME_USEC); > + i++) > + hsotg->has_split[i] = false; hsotg->has_split &= ~BIT(i);
Felipe, On Thu, Nov 19, 2015 at 7:34 AM, Felipe Balbi <balbi@ti.com> wrote: > > Hi, > > Douglas Anderson <dianders@chromium.org> writes: >> Until we have logic to determine which devices share the same TT let's >> add logic to assume that all devices on a given dwc2 controller are on >> one single_tt hub. This is better than the previous code that assumed >> that all devices were on one multi_tt hub, since single_tt hubs >> appear (in my experience) to be much more common and any schedule that >> would work on a single_tt hub will also work on a multi_tt hub. This >> will prevent more than 8 total low/full speed devices to be on the bus >> at one time, but that's a reasonable restriction until we've made things >> smarter. >> >> Signed-off-by: Douglas Anderson <dianders@chromium.org> >> --- >> Changes in v3: >> - Assuming single_tt is new for v3; not terribly well tested (yet). >> >> Changes in v2: None >> >> drivers/usb/dwc2/core.h | 1 + >> drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- >> 2 files changed, 40 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h >> index 567ee2c9e69f..09aa2b5ae29e 100644 >> --- a/drivers/usb/dwc2/core.h >> +++ b/drivers/usb/dwc2/core.h >> @@ -782,6 +782,7 @@ struct dwc2_hsotg { >> u16 periodic_usecs; >> unsigned long periodic_bitmap[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, >> BITS_PER_LONG)]; >> + bool has_split[8]; > > why don't you use a u8 instead then just set each bit for each > "has_split" you need to take care of. This array is kinda ugly. Let's actually drop this patch completely. Julius and I sat down and he talked me through things, and with my current understanding the current microframe scheduler in dwc2 is broken enough that small little band-aids like this will do little more than just push the problems around. I'm a good portion of the way through a better microframe scheduler. I have no doubt that it won't be perfect, but hopefully it will at least be based in reality... My latest thinking on the patches in this series: 1. usb: dwc2: rockchip: Make the max_transfer_size automatic I'll probably separate this out into its own patch so I can stop sending it as part of this series. ...or if someone wanted to land it then I won't bother. 2. usb: dwc2: host: Get aligned DMA in a more supported way Still can land any time and has good benefits. I believe that I can't separate this because it will cause conflicts with scheduler patches. 3. usb: dwc2: host: Add scheduler tracing Would be nice to land. 4. usb: dwc2: host: Rewrite the microframe scheduler 5. usb: dwc2: host: Keep track of and use our scheduled microframe 6. usb: dwc2: host: Assume all devices are on one single_tt hub Please drop patches 4-6 right now. 7. usb: dwc2: host: Add a delay before releasing periodic bandwidth 8. usb: dwc2: host: Giveback URB in tasklet context I'll probably move these back up in the series (like in v2) and put microframe rewrite atop them. With my current understanding the scheduling is so broken today that the concerns Alan brought up can wait until we have a proper scheduler to be addressed. In the meantime getting the huge boost in interrupt speed will help with dwc2's correctness (and performance) because it means we're much less likely to miss SOF interrupts. If anyone has any review time, giving a review to v2 of these patches would be helpful. Otherwise I'll double check that v2 still looks good with my current understanding and eventually post them again. -Doug
Hi, Doug Anderson <dianders@chromium.org> writes: >> Douglas Anderson <dianders@chromium.org> writes: >>> Until we have logic to determine which devices share the same TT let's >>> add logic to assume that all devices on a given dwc2 controller are on >>> one single_tt hub. This is better than the previous code that assumed >>> that all devices were on one multi_tt hub, since single_tt hubs >>> appear (in my experience) to be much more common and any schedule that >>> would work on a single_tt hub will also work on a multi_tt hub. This >>> will prevent more than 8 total low/full speed devices to be on the bus >>> at one time, but that's a reasonable restriction until we've made things >>> smarter. >>> >>> Signed-off-by: Douglas Anderson <dianders@chromium.org> >>> --- >>> Changes in v3: >>> - Assuming single_tt is new for v3; not terribly well tested (yet). >>> >>> Changes in v2: None >>> >>> drivers/usb/dwc2/core.h | 1 + >>> drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- >>> 2 files changed, 40 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h >>> index 567ee2c9e69f..09aa2b5ae29e 100644 >>> --- a/drivers/usb/dwc2/core.h >>> +++ b/drivers/usb/dwc2/core.h >>> @@ -782,6 +782,7 @@ struct dwc2_hsotg { >>> u16 periodic_usecs; >>> unsigned long periodic_bitmap[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, >>> BITS_PER_LONG)]; >>> + bool has_split[8]; >> >> why don't you use a u8 instead then just set each bit for each >> "has_split" you need to take care of. This array is kinda ugly. > > Let's actually drop this patch completely. Julius and I sat down and > he talked me through things, and with my current understanding the > current microframe scheduler in dwc2 is broken enough that small > little band-aids like this will do little more than just push the > problems around. > > I'm a good portion of the way through a better microframe scheduler. > I have no doubt that it won't be perfect, but hopefully it will at > least be based in reality... > > My latest thinking on the patches in this series: > > 1. usb: dwc2: rockchip: Make the max_transfer_size automatic > > I'll probably separate this out into its own patch so I can stop > sending it as part of this series. ...or if someone wanted to land it > then I won't bother. > > > 2. usb: dwc2: host: Get aligned DMA in a more supported way > > Still can land any time and has good benefits. I believe that I can't > separate this because it will cause conflicts with scheduler patches. > > > 3. usb: dwc2: host: Add scheduler tracing > > Would be nice to land. > > > 4. usb: dwc2: host: Rewrite the microframe scheduler > 5. usb: dwc2: host: Keep track of and use our scheduled microframe > 6. usb: dwc2: host: Assume all devices are on one single_tt hub > > Please drop patches 4-6 right now. > > > 7. usb: dwc2: host: Add a delay before releasing periodic bandwidth > 8. usb: dwc2: host: Giveback URB in tasklet context if you can, it's best to send a new series with the changes. This makes mine and John's life a lot easier :-) > I'll probably move these back up in the series (like in v2) and put > microframe rewrite atop them. With my current understanding the > scheduling is so broken today that the concerns Alan brought up can > wait until we have a proper scheduler to be addressed. In the > meantime getting the huge boost in interrupt speed will help with > dwc2's correctness (and performance) because it means we're much less > likely to miss SOF interrupts. > > If anyone has any review time, giving a review to v2 of these patches > would be helpful. Otherwise I'll double check that v2 still looks > good with my current understanding and eventually post them again. That would have to be someone experienced with that IP. I don't even have docs :-)
On 11/19/2015 8:27 AM, Doug Anderson wrote: > Felipe, > > On Thu, Nov 19, 2015 at 7:34 AM, Felipe Balbi <balbi@ti.com> wrote: >> >> Hi, >> >> Douglas Anderson <dianders@chromium.org> writes: >>> Until we have logic to determine which devices share the same TT let's >>> add logic to assume that all devices on a given dwc2 controller are on >>> one single_tt hub. This is better than the previous code that assumed >>> that all devices were on one multi_tt hub, since single_tt hubs >>> appear (in my experience) to be much more common and any schedule that >>> would work on a single_tt hub will also work on a multi_tt hub. This >>> will prevent more than 8 total low/full speed devices to be on the bus >>> at one time, but that's a reasonable restriction until we've made things >>> smarter. >>> >>> Signed-off-by: Douglas Anderson <dianders@chromium.org> >>> --- >>> Changes in v3: >>> - Assuming single_tt is new for v3; not terribly well tested (yet). >>> >>> Changes in v2: None >>> >>> drivers/usb/dwc2/core.h | 1 + >>> drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- >>> 2 files changed, 40 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h >>> index 567ee2c9e69f..09aa2b5ae29e 100644 >>> --- a/drivers/usb/dwc2/core.h >>> +++ b/drivers/usb/dwc2/core.h >>> @@ -782,6 +782,7 @@ struct dwc2_hsotg { >>> u16 periodic_usecs; >>> unsigned long periodic_bitmap[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, >>> BITS_PER_LONG)]; >>> + bool has_split[8]; >> >> why don't you use a u8 instead then just set each bit for each >> "has_split" you need to take care of. This array is kinda ugly. > > Let's actually drop this patch completely. Julius and I sat down and > he talked me through things, and with my current understanding the > current microframe scheduler in dwc2 is broken enough that small > little band-aids like this will do little more than just push the > problems around. > > I'm a good portion of the way through a better microframe scheduler. > I have no doubt that it won't be perfect, but hopefully it will at > least be based in reality... > > My latest thinking on the patches in this series: > > 1. usb: dwc2: rockchip: Make the max_transfer_size automatic > > I'll probably separate this out into its own patch so I can stop > sending it as part of this series. ...or if someone wanted to land it > then I won't bother. > > > 2. usb: dwc2: host: Get aligned DMA in a more supported way > > Still can land any time and has good benefits. I believe that I can't > separate this because it will cause conflicts with scheduler patches. > > > 3. usb: dwc2: host: Add scheduler tracing > > Would be nice to land. > > > 4. usb: dwc2: host: Rewrite the microframe scheduler > 5. usb: dwc2: host: Keep track of and use our scheduled microframe > 6. usb: dwc2: host: Assume all devices are on one single_tt hub > > Please drop patches 4-6 right now. > > > 7. usb: dwc2: host: Add a delay before releasing periodic bandwidth > 8. usb: dwc2: host: Giveback URB in tasklet context > > I'll probably move these back up in the series (like in v2) and put > microframe rewrite atop them. With my current understanding the > scheduling is so broken today that the concerns Alan brought up can > wait until we have a proper scheduler to be addressed. In the > meantime getting the huge boost in interrupt speed will help with > dwc2's correctness (and performance) because it means we're much less > likely to miss SOF interrupts. > > If anyone has any review time, giving a review to v2 of these patches > would be helpful. Otherwise I'll double check that v2 still looks > good with my current understanding and eventually post them again. > > -Doug > Hi Doug, Patches 1-3: Acked-by: John Youn <johnyoun@synopsys.com> Patch 2: Tested-by: John Youn <johnyoun@synopsys.com> Tested on core version 3.20 using internal TE for un-aligned buffers. I haven't had time to look into the scheduling patches yet. But I agree with you that there are fundamental problems. I'll await your rewrite. Regards, John
John, On Thu, Nov 19, 2015 at 8:33 PM, John Youn <John.Youn@synopsys.com> wrote: > Patches 1-3: > Acked-by: John Youn <johnyoun@synopsys.com> > > Patch 2: > Tested-by: John Youn <johnyoun@synopsys.com> > > Tested on core version 3.20 using internal TE for un-aligned > buffers. > > I haven't had time to look into the scheduling patches yet. But I > agree with you that there are fundamental problems. I'll await > your rewrite. Thanks! I'm going to attempt to at least get things reposted with your tags before Thanksgiving. I'm hopeful that I'll actually be able to post a RFC patch for the microframe rewrite by then, too! :) I have something that seems to be sane at <https://chromium-review.googlesource.com/#/c/313808/>, but it obviously still needs a lot of cleanup. I also need to pick all of the things that have landed on Felipe's tree recently and rebase atop those. -Doug
Hi, On Mon, Nov 23, 2015 at 4:28 PM, Doug Anderson <dianders@chromium.org> wrote: > John, > > On Thu, Nov 19, 2015 at 8:33 PM, John Youn <John.Youn@synopsys.com> wrote: >> Patches 1-3: >> Acked-by: John Youn <johnyoun@synopsys.com> >> >> Patch 2: >> Tested-by: John Youn <johnyoun@synopsys.com> >> >> Tested on core version 3.20 using internal TE for un-aligned >> buffers. >> >> I haven't had time to look into the scheduling patches yet. But I >> agree with you that there are fundamental problems. I'll await >> your rewrite. > > Thanks! I'm going to attempt to at least get things reposted with > your tags before Thanksgiving. I'm hopeful that I'll actually be able > to post a RFC patch for the microframe rewrite by then, too! :) > > I have something that seems to be sane at > <https://chromium-review.googlesource.com/#/c/313808/>, but it > obviously still needs a lot of cleanup. I also need to pick all of > the things that have landed on Felipe's tree recently and rebase atop > those. I didn't succeed at getting something postable. :( I could easily repost some of the earlier patches (like fixing aligned memory, etc) if there's a desire for those, but I figured I'd wait until I had something postable for the microframe scheduler. I've definitely made progress in the last few days and things are a bit cleaner at <https://chromium-review.googlesource.com/#/c/313808/3>. It's also working reasonably well, but I'd still like to understand some of the behavior I'm seeing before I actually post it to the mailing list. I'm going to be away from my computer for about a week, so I'll plan to pick this up again when I'm back... -Doug
diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h index 567ee2c9e69f..09aa2b5ae29e 100644 --- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -782,6 +782,7 @@ struct dwc2_hsotg { u16 periodic_usecs; unsigned long periodic_bitmap[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, BITS_PER_LONG)]; + bool has_split[8]; u16 frame_number; u16 periodic_qh_count; bool bus_suspended; diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c index 4c1d9cf482d0..c5a2edb04bec 100644 --- a/drivers/usb/dwc2/hcd_queue.c +++ b/drivers/usb/dwc2/hcd_queue.c @@ -372,10 +372,33 @@ static int dwc2_find_single_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh) */ static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh) { + unsigned long tmp_bmp[DIV_ROUND_UP(TOTAL_PERIODIC_USEC, BITS_PER_LONG)]; + unsigned long *bmp; unsigned short utime = qh->usecs; unsigned long start; + int i; + + if (qh->do_split) { + /* + * Eventually we'll only want to exclude out microframes used by + * other people on the same TT as us, and then only if we're on + * a single_tt hub. ...but until we have that logic, just + * schedule everyone together. + */ + bmp = tmp_bmp; + memcpy(bmp, hsotg->periodic_bitmap, sizeof(tmp_bmp)); + start = 0; + + for (i = 0; i < ARRAY_SIZE(max_uframe_usecs); i++) { + if (hsotg->has_split[i]) + bitmap_set(bmp, start, max_uframe_usecs[i]); + start += max_uframe_usecs[i]; + } + } else { + bmp = hsotg->periodic_bitmap; + } - start = bitmap_find_next_zero_area(hsotg->periodic_bitmap, + start = bitmap_find_next_zero_area(bmp, TOTAL_PERIODIC_USEC, 0, utime, 0); if (start >= TOTAL_PERIODIC_USEC) { dwc2_sch_dbg(hsotg, "%s: failed to assign %d us\n", @@ -386,6 +409,13 @@ static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh) bitmap_set(hsotg->periodic_bitmap, start, qh->usecs); qh->start_usecs = start; + if (qh->do_split) { + for (i = start / EARLY_FRAME_USEC; + i < DIV_ROUND_UP(start + utime - 1, EARLY_FRAME_USEC); + i++) + hsotg->has_split[i] = true; + } + dwc2_sch_dbg(hsotg, "%s: assigned %d us @ %d us\n", __func__, qh->usecs, qh->start_usecs); @@ -533,6 +563,7 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg, { int start = qh->start_usecs; int utime = qh->usecs; + int i; list_del_init(&qh->qh_list_entry); @@ -546,6 +577,13 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg, } bitmap_clear(hsotg->periodic_bitmap, start, utime); + + if (qh->do_split) { + for (i = start / EARLY_FRAME_USEC; + i < DIV_ROUND_UP(start + utime - 1, EARLY_FRAME_USEC); + i++) + hsotg->has_split[i] = false; + } } /**
Until we have logic to determine which devices share the same TT let's add logic to assume that all devices on a given dwc2 controller are on one single_tt hub. This is better than the previous code that assumed that all devices were on one multi_tt hub, since single_tt hubs appear (in my experience) to be much more common and any schedule that would work on a single_tt hub will also work on a multi_tt hub. This will prevent more than 8 total low/full speed devices to be on the bus at one time, but that's a reasonable restriction until we've made things smarter. Signed-off-by: Douglas Anderson <dianders@chromium.org> --- Changes in v3: - Assuming single_tt is new for v3; not terribly well tested (yet). Changes in v2: None drivers/usb/dwc2/core.h | 1 + drivers/usb/dwc2/hcd_queue.c | 40 +++++++++++++++++++++++++++++++++++++++- 2 files changed, 40 insertions(+), 1 deletion(-)