Message ID | 20250306144954.3507700-16-mathias.nyman@linux.intel.com (mailing list archive) |
---|---|
State | Superseded |
Commit | b331a3d8097fad4e541d212684192f21fedbd6e5 |
Series | xhci features for usb-next |
> Unplugging a USB3.0 webcam from Etron hosts while streaming results
> in errors like this:
>
> [ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 2.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650
> [ 2.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 2.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670
>
> Etron xHC generates two transfer events for the TRB if an error is
> detected while processing the last TRB of an isoc TD.
>
> The first event can be any sort of error (like USB Transaction or
> Babble Detected, etc), and the final event is Success.
>
> The xHCI driver will handle the TD after the first event and remove
> it from its internal list, and then print an "Transfer event TRB DMA
> ptr not part of current TD" error message after the final event.
>
> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
> transaction error mid TD.") is designed to address isoc transaction
> errors, but unfortunately it doesn't account for this scenario.
>
> This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a
> success event follows a 'short transfer' event, but the TD the event
> points to is already given back.
>
> Expand the spurious success 'short transfer' event handling to cover
> the spurious success after error on Etron hosts.
>
> Kuangyi Chiang reported this issue and submitted a different solution
> based on using error_mid_td. This commit message is mostly taken
> from that patch.
>
> Reported-by: Kuangyi Chiang <ki.chiang65@gmail.com>
> Closes: https://lore.kernel.org/linux-usb/20241028025337.6372-6-ki.chiang65@gmail.com/
> Tested-by: Kuangyi Chiang <ki.chiang65@gmail.com>
> Tested-by: Michal Pecio <michal.pecio@gmail.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>

Such a simple HW quirk would be an obvious candidate for stable if a
Short Packet refactor weren't bundled with it.

And it is subtly broken. I could swear that I have mailed you about it;
maybe you missed it or I didn't explain myself clearly enough.

> ---
>  drivers/usb/host/xhci-ring.c | 38 ++++++++++++++++++++++++------------
>  drivers/usb/host/xhci.h      |  2 +-
>  2 files changed, 27 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 2df94ed3152c..0f8acbb9cd21 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -2611,6 +2611,22 @@ static int handle_transferless_tx_event(struct xhci_hcd *xhci, struct xhci_virt_
>  	return 0;
>  }
>
> +static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci,
> +					   struct xhci_ring *ring)
> +{
> +	switch (ring->old_trb_comp_code) {
> +	case COMP_SHORT_PACKET:
> +		return xhci->quirks & XHCI_SPURIOUS_SUCCESS;

XHCI_SPURIOUS_SUCCESS applies to practically all HCs, so this code
will typically boil down to:

	return (ring->old_trb_comp_code == COMP_SHORT_PACKET);

> +	case COMP_USB_TRANSACTION_ERROR:
> +	case COMP_BABBLE_DETECTED_ERROR:
> +	case COMP_ISOCH_BUFFER_OVERRUN:
> +		return xhci->quirks & XHCI_ETRON_HOST &&
> +			ring->type == TYPE_ISOC;
> +	default:
> +		return false;
> +	}
> +}
> +
>  /*
>   * If this function returns an error condition, it means it got a Transfer
>   * event with a corrupted Slot ID, Endpoint ID, or TRB DMA address.
> @@ -2665,8 +2681,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>  	case COMP_SUCCESS:
>  		if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
>  			trb_comp_code = COMP_SHORT_PACKET;
> -			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n",
> -				 slot_id, ep_index, ep_ring->last_td_was_short);
> +			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td comp code %d\n",
> +				 slot_id, ep_index, ep_ring->old_trb_comp_code);
>  		}
>  		break;
>  	case COMP_SHORT_PACKET:
> @@ -2817,7 +2833,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>  		if (trb_comp_code != COMP_STOPPED &&
>  		    trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
>  		    !ring_xrun_event &&
> -		    !ep_ring->last_td_was_short) {
> +		    !xhci_spurious_success_tx_event(xhci, ep_ring)) {
>  			xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
>  				  slot_id, ep_index);
>  		}
> @@ -2882,11 +2898,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>
>  			/*
>  			 * Some hosts give a spurious success event after a short
> -			 * transfer. Ignore it.
> +			 * transfer or error on last TRB. Ignore it.
>  			 */
> -			if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) &&
> -			    ep_ring->last_td_was_short) {
> -				ep_ring->last_td_was_short = false;

'last_td_was_short' means "expect one more event", and it is being
cleared here after receiving said event, or at least suspecting so.

> +			if (xhci_spurious_success_tx_event(xhci, ep_ring)) {
> +				xhci_dbg(xhci, "Spurious event dma %pad, comp_code %u after %u\n",
> +					 &ep_trb_dma, trb_comp_code, ep_ring->old_trb_comp_code);
> +				ep_ring->old_trb_comp_code = trb_comp_code;

Proper equivalent here would be to reset old_trb_comp_code to some
"impossible" value (0, -1) so that xhci_spurious_success_tx_event()
ceases returning true. Otherwise, this branch will trigger again on
the next event if it's for a wrong transfer (dangerous HW or SW bug).

Specifically and explicitly, two problems are created:

1. The "one more event" we expect will always be COMP_SHORT_PACKET,
   so this code will keep silently ignoring invalid events until some
   event is handled without error or is other than Short Packet.

2. There are endpoints (e.g. async/adaptive audio, usb-serial IN, IIRC
   some UAS too) where all or most transfers complete with Short Packet
   as a matter of routine. This code will silently ignore errors until
   an event is handled without error, so it will ignore all errors.

IOW, "TRB DMA ptr not part of current TD" can never show up as far as
I can tell.

>  				return 0;
>  			}
>
> @@ -2909,15 +2926,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>  		 */
>  	} while (ep->skip);
>
> +	ep_ring->old_trb_comp_code = trb_comp_code;
> +
>  	/* Get out if a TD was queued at enqueue after the xrun occurred */
>  	if (ring_xrun_event)
>  		return 0;
>
> -	if (trb_comp_code == COMP_SHORT_PACKET)
> -		ep_ring->last_td_was_short = true;
> -	else
> -		ep_ring->last_td_was_short = false;
> -
>  	ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)];
>  	trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb, ep_trb_dma);
>
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index d9d7cd1906f3..6c00062a9acc 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1375,7 +1375,7 @@ struct xhci_ring {
>  	unsigned int		num_trbs_free; /* used only by xhci DbC */
>  	unsigned int		bounce_buf_len;
>  	enum xhci_ring_type	type;
> -	bool			last_td_was_short;
> +	u32			old_trb_comp_code;
>  	struct radix_tree_root	*trb_address_map;
>  };
>
>
> --
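
The two problems raised in the review can be illustrated with a toy model. This is only a sketch and not kernel code: `toy_ring`, `toy_handle_event()`, `toy_count_reported()` and the `reset_on_ignore` switch are hypothetical names, XHCI_SPURIOUS_SUCCESS is assumed to be set, and only the Short Packet case is modelled. It counts how many stray Short Packet events (events that match no queued TD) are reported after one valid short transfer, once with the patch's `old_trb_comp_code = trb_comp_code` and once with a reset to an "impossible" value as the review suggests.

```c
#include <stdbool.h>

/* Completion code values as defined by the xHCI specification. */
enum {
	COMP_INVALID = 0,
	COMP_SUCCESS = 1,
	COMP_SHORT_PACKET = 13,
};

enum toy_outcome { EV_HANDLED, EV_IGNORED, EV_REPORTED };

/* Hypothetical stand-in for the one field of struct xhci_ring that matters here. */
struct toy_ring {
	unsigned int old_trb_comp_code;
};

/* Assumes XHCI_SPURIOUS_SUCCESS is set, so only the previous code decides. */
static bool toy_spurious(const struct toy_ring *ring)
{
	return ring->old_trb_comp_code == COMP_SHORT_PACKET;
}

/*
 * matches_td:      the event points into a TD that is still queued.
 * reset_on_ignore: true  = reset to COMP_INVALID after ignoring (review's idea),
 *                  false = store the ignored event's code (as in the patch).
 */
static enum toy_outcome toy_handle_event(struct toy_ring *ring, unsigned int comp,
					 bool matches_td, bool reset_on_ignore)
{
	if (!matches_td) {
		if (toy_spurious(ring)) {
			ring->old_trb_comp_code = reset_on_ignore ? COMP_INVALID : comp;
			return EV_IGNORED;
		}
		/* Stands in for "TRB DMA ptr not part of current TD". */
		return EV_REPORTED;
	}
	ring->old_trb_comp_code = comp;
	return EV_HANDLED;
}

/* One valid short transfer, then n_stray Short Packet events matching no TD. */
static int toy_count_reported(bool reset_on_ignore, int n_stray)
{
	struct toy_ring ring = { .old_trb_comp_code = COMP_INVALID };
	int reported = 0;

	toy_handle_event(&ring, COMP_SHORT_PACKET, true, reset_on_ignore);
	for (int i = 0; i < n_stray; i++) {
		if (toy_handle_event(&ring, COMP_SHORT_PACKET, false,
				     reset_on_ignore) == EV_REPORTED)
			reported++;
	}
	return reported;
}
```

Under these assumptions, `toy_count_reported(false, 3)` reports none of the three stray events, while `toy_count_reported(true, 3)` ignores only the first one and reports the remaining two.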
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2df94ed3152c..0f8acbb9cd21 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2611,6 +2611,22 @@ static int handle_transferless_tx_event(struct xhci_hcd *xhci, struct xhci_virt_
 	return 0;
 }
 
+static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci,
+					   struct xhci_ring *ring)
+{
+	switch (ring->old_trb_comp_code) {
+	case COMP_SHORT_PACKET:
+		return xhci->quirks & XHCI_SPURIOUS_SUCCESS;
+	case COMP_USB_TRANSACTION_ERROR:
+	case COMP_BABBLE_DETECTED_ERROR:
+	case COMP_ISOCH_BUFFER_OVERRUN:
+		return xhci->quirks & XHCI_ETRON_HOST &&
+			ring->type == TYPE_ISOC;
+	default:
+		return false;
+	}
+}
+
 /*
  * If this function returns an error condition, it means it got a Transfer
  * event with a corrupted Slot ID, Endpoint ID, or TRB DMA address.
@@ -2665,8 +2681,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 	case COMP_SUCCESS:
 		if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
 			trb_comp_code = COMP_SHORT_PACKET;
-			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n",
-				 slot_id, ep_index, ep_ring->last_td_was_short);
+			xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td comp code %d\n",
+				 slot_id, ep_index, ep_ring->old_trb_comp_code);
 		}
 		break;
 	case COMP_SHORT_PACKET:
@@ -2817,7 +2833,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 		if (trb_comp_code != COMP_STOPPED &&
 		    trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
 		    !ring_xrun_event &&
-		    !ep_ring->last_td_was_short) {
+		    !xhci_spurious_success_tx_event(xhci, ep_ring)) {
 			xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
 				  slot_id, ep_index);
 		}
@@ -2882,11 +2898,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 
 			/*
 			 * Some hosts give a spurious success event after a short
-			 * transfer. Ignore it.
+			 * transfer or error on last TRB. Ignore it.
 			 */
-			if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) &&
-			    ep_ring->last_td_was_short) {
-				ep_ring->last_td_was_short = false;
+			if (xhci_spurious_success_tx_event(xhci, ep_ring)) {
+				xhci_dbg(xhci, "Spurious event dma %pad, comp_code %u after %u\n",
+					 &ep_trb_dma, trb_comp_code, ep_ring->old_trb_comp_code);
+				ep_ring->old_trb_comp_code = trb_comp_code;
 				return 0;
 			}
 
@@ -2909,15 +2926,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 		 */
 	} while (ep->skip);
 
+	ep_ring->old_trb_comp_code = trb_comp_code;
+
 	/* Get out if a TD was queued at enqueue after the xrun occurred */
 	if (ring_xrun_event)
 		return 0;
 
-	if (trb_comp_code == COMP_SHORT_PACKET)
-		ep_ring->last_td_was_short = true;
-	else
-		ep_ring->last_td_was_short = false;
-
 	ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)];
 	trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb, ep_trb_dma);
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index d9d7cd1906f3..6c00062a9acc 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1375,7 +1375,7 @@ struct xhci_ring {
 	unsigned int		num_trbs_free; /* used only by xhci DbC */
 	unsigned int		bounce_buf_len;
 	enum xhci_ring_type	type;
-	bool			last_td_was_short;
+	u32			old_trb_comp_code;
 	struct radix_tree_root	*trb_address_map;
 };