Message ID | 20240827194625.61be5733@foxbook (mailing list archive) |
---|---|
Headers | show |
Series | Various xhci fixes and improvements | expand |
On 8/27/24 7:46 PM, Michal Pecio wrote: > Hi, > > Here's a handful of bug fixes, cleanups and improvements for the xHCI > driver. [...] > All patches were applied and tested on v6.11-rc5, all compiled cleanly > and worked without regressions with some HID, storage, audio, video. > Functional changes received additional functional testing described in > their respective changelogs. Hi! I have a 6.11-rc5 tree here (tar.gz from kernel.org) and trying to apply this series I get: $ for n in ~/Downloads/michal\ patches/20240827-\[PATCH\ *; do echo "$n"; patch --dry-run -p 1 < "$n"; done /home/fps/Downloads/michal patches/20240827-[PATCH 1_9] usb_ xhci_ Fix double newline in a deb-148577.txt checking file drivers/usb/host/xhci-ring.c /home/fps/Downloads/michal patches/20240827-[PATCH 2_9] usb_ xhci_ Fix handling errors mid TD -148571.txt checking file drivers/usb/host/xhci-ring.c /home/fps/Downloads/michal patches/20240827-[PATCH 3_9] usb_ xhci_ Clean up the TD skipping lo-148572.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 FAILED at 2829. 1 out of 1 hunk FAILED /home/fps/Downloads/michal patches/20240827-[PATCH 4_9] usb_ xhci_ Expedite processing missed -148579.txt checking file drivers/usb/host/xhci-ring.c /home/fps/Downloads/michal patches/20240827-[PATCH 5_9] usb_ xhci_ Simplify error_mid_td marki-148573.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 FAILED at 2410. 1 out of 1 hunk FAILED /home/fps/Downloads/michal patches/20240827-[PATCH 6_9] usb_ xhci_ Sanity check _spurious succ-148574.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 succeeded at 2778 (offset -7 lines). Hunk #2 FAILED at 2879. 1 out of 2 hunks FAILED /home/fps/Downloads/michal patches/20240827-[PATCH 7_9] usb_ xhci_ Don't lie about trb_comp_co-148575.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 succeeded at 2610 (offset -3 lines). Hunk #2 succeeded at 2643 (offset -3 lines). Hunk #3 FAILED at 2898. 1 out of 3 hunks FAILED /home/fps/Downloads/michal patches/20240827-[PATCH 8_9] usb_ xhci_ Print completion code in th-148576.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 FAILED at 2789. 1 out of 1 hunk FAILED /home/fps/Downloads/michal patches/20240827-[PATCH 9_9] usb_ xhci_ Clean up and rename bad str-148580.txt checking file drivers/usb/host/xhci-ring.c Hunk #1 succeeded at 2563 (offset -3 lines). Hunk #2 succeeded at 2582 (offset -3 lines). Hunk #3 succeeded at 2635 (offset -4 lines). Are you sure these are against 6.11-rc5? Or did I screw up on my side? Kind regards, FPS
On 27.8.2024 20.46, Michal Pecio wrote: > Hi, > > Here's a handful of bug fixes, cleanups and improvements for the xHCI > driver. > > The first is a trivial fix, the second also fixes a bug albeit a less > trivial one. These two maybe deserve to go into usb-linus, the latter > perhaps to stable as well, but the patch doesn't apply as-is. > > The rest are new functionality or code cleanups with low anticipated > user impact. > > All patches were applied and tested on v6.11-rc5, all compiled cleanly > and worked without regressions with some HID, storage, audio, video. > Functional changes received additional functional testing described in > their respective changelogs. > > > Notably missing is a solution to the "Underrun after Missed Service" > problem. To recap, Underrun/Overrun typically has a NULL TRB pointer > and ep_ring->td_list contains some missed TDs and possibly a few that > have been added after the underrun occured but before we got the IRQ. > No way to tell them apart, at least by the spec, as far as I see. > > On USB 3.1+ hosts we can get out of it with "expedited skipping" - it > ensures that the ring is never left with stale TDs in the first place. > > On USB 3.0 hosts the underrun handler *will* sometimes face skipping > and it needs to make a decision. > > Currently, it skips everything. This may cause DMA-after-free - not > great on IN endpoints - and "TRB DMA ptr not part of current TD" or > "WARN Event TRB with no TDs queued" later. > > The obvious alterantive is to never skip on empty ring events, but it > can deadlock a driver which waits for its URBs to be given back when > all of them were missed and we can't get a valid dequeue from the HC. > > > I wonder if it would make sense to always skip the first queued TD > when we get an MSE with NULL pointer? I think it's legal, and likely > sufficient to avoid the stupid deadlock. > > Just a last minute idea. I will think about it, but now I'd like to > flush this patch queue before it grows to infinity ;) > Thanks Michal for this series. I need to go through the patches individually and pick fixes. We are in the middle of reworking transfer event handling with Niklas as well. I'd be grateful if you could take a quick look at our solution and give your opinion on it as you have expertise in this area. Main idea is simple. If a transfer event points to a TRB between dequeue and enqueue we give back all queued TDs passed up to this TRB. And if the event points to a valid TD then handle that. (normal case) Code is much simpler, we get rid of skip flag, and could probably merge error_mid_td and urb_length_set flags into one flag. The code isn't complete or tested yet. It doesn't handle under/overruns, but that could probably be fixed by just turning "return 0" for those cases into "break" Code on top of 6.11-rc4 can be found in my feature_transfer_event_rework branch: git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git feature_transfer_event_rework https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=feature_transfer_event_rework Thanks Mathias
> Are you sure these are against 6.11-rc5? Or did I screw up on my side?
Absolutely for 6.11-rc5 (but -rc4 works too and -rc6 likely will).
Please try again without --dry-run, some of them will only apply on top
of earlier ones because they touch the same lines. If that doesn't work
then I may have screwed up mailing them.
BTW, #3 is logically dependent on #2, but the others could be reordered
at will IIRC.
As for your audio case, #4 is probably the only potentially interesting
one, but it didn't help on my sluggish ASMedia host. It should, however,
at least eliminate the "WARN Event TRB with no TDs queued" on USB 3.1
and newer hosts.
But the root cause of those audio issues is likely elsewhere.
Regards,
Michal
On August 28, 2024 3:24:58 PM GMT+02:00, "Michał Pecio" <michal.pecio@gmail.com> wrote: >> Are you sure these are against 6.11-rc5? Or did I screw up on my side? >Absolutely for 6.11-rc5 (but -rc4 works too and -rc6 likely will). > >Please try again without --dry-run, some of them will only apply on top >of earlier ones because they touch the same lines. If that doesn't work >then I may have screwed up mailing them. Oh, of course! My bad. Sorry for the noise! > >BTW, #3 is logically dependent on #2, but the others could be reordered >at will IIRC. > >As for your audio case, #4 is probably the only potentially interesting >one, but it didn't help on my sluggish ASMedia host. It should, however, >at least eliminate the "WARN Event TRB with no TDs queued" on USB 3.1 >and newer hosts. > >But the root cause of those audio issues is likely elsewhere. Yeah, just giving it a shot :) Kind regards, FPS
> We are in the middle of reworking transfer event handling with Niklas > as well. > > I'd be grateful if you could take a quick look at our solution and > give your opinion on it as you have expertise in this area. Beware that I may only have enough of that to be dangerous ;) But I also have a few of those PCIe cards with various chips and all three public revisions of the spec, so I'm trying to stay in spec and verify things on hardware. BTW, I only really touched isochronous transfers so far, but it looks like some things are not entirely right in others too. The handling of short packets for instance - it looks like TRB_ISP is set on various transfers, but process_xxx_td() functions don't care if the event is mid-TD and give back the URB immediately. I though about fixing this (I have a naive hope that some xHC crashes that I see are "undefined behavior" caused by driver bugs, even though in reality the hardware is probably simply FUBAR) but I'd need to read more spec chapters to get there. > Main idea is simple. > If a transfer event points to a TRB between dequeue and enqueue we > give back all queued TDs passed up to this TRB. > And if the event points to a valid TD then handle that. (normal case) > > Code is much simpler, we get rid of skip flag, and could probably > merge error_mid_td and urb_length_set flags into one flag. Yes, lots of things could be better if the code walked the ring from the last known dequeue to the current one and analyzed what's there. We could know if the event is for something that we have already freed (a bug?), or for any known pending TD and which one exactly, or for a non-transfer TD (like Forced Stop Event). Many of those thing are currently detected with heuristics which only work about right if everything goes as planned (no bugs in HW or SW). And first find the TD, then worry how to handle it. I like that Niklas managed to shorten this giant loop already, and my patches #2 and #3 shorten it further and move TD selection logic up (#2) and all else down (#3). This way we can work on selection and handling separately, with less worrying about how they influence each other. > The code isn't complete or tested yet. It doesn't handle > under/overruns, but that could probably be fixed by just turning > "return 0" for those cases into "break" Problem is that not all events give valid dequeue pointers. These two, for example, are not guaranteed to. Then we have no idea where on the ring the HC currently is. The EP doesn't stall and it may have already been resumed by a simple doorbell ring. Regards, Michal
Hi Mathias, > Code on top of 6.11-rc4 can be found in my > feature_transfer_event_rework branch: > > git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git Thanks for pointing me to this. It does look like a very promising approach. This is the principle of "matching before handling" taken to the extreme, as nothing is changed during the first pass through the TD list. It seems to work well. I feel like the counting of distances from ring->dequeue is a little, I don't know, unexpected maybe, but it does indeed produce surprisingly compact and simple code thanks to those segment numbers. That being said, I've always been a strong case of NIH syndrome, so I implemented a similar search for passed TDs which relies solely on walking the ring in order from dequeue to the event DMA pointer. So I guess we will have two implementations to compare and choose from. I will try to develop this approach into a complete implementation that actually works and can be tested. I want to maximally preserve original behavior, except for obvious bugs (I found an obscure one already) to simplify validation and review. So, for example, no removing the skip flag for now. I'm not sure if it's a good idea at all, as it frankly is sweeping bugs under the rug. I have one HC which sometimes silently misses TDs for no known reason and it misses so many of them that auto-skipping wouldn't help anyway. OTOH, when I get the TRB errors, at least I know the HC is screwed. Regards, Michal