diff mbox series

Isochronous error handling bug on VIA VL805

Message ID 20240118120027.5bc498b5@foxbook (mailing list archive)
State New, archived
Headers show
Series Isochronous error handling bug on VIA VL805 | expand

Commit Message

Michał Pecio Jan. 18, 2024, 11 a.m. UTC
I concurrently executed plan B for dealing with my NEC issues, which is
to simply order a host controller with other chip (VIA), and now I have
two controllers and three problems.

One is that it sometimes "dies" and requires a reboot, probably nothing
Linux can do about it.

The other is that it reports successful completion of the final TRB and
the driver overwrites frame->status set by prior errors. This seems to
only affect isochronous again, though I'm not 100% sure.


This change on top of yours takes care of it for transaction errors.



Interesting questions arise:

1. Should this be a separate patch?

2. Are there any errors that may need error_mid_td treatment on NEC?
Maybe BABBLE or others, unfortunately it isn't easy for me to produce
them and see what happens.

3. Are there any errors that may need "not success" treatment on VIA?
Any chance that these error sets aren't equal and I can't get away with
reusing your error_mid_td flag here?

Comments

Michał Pecio Jan. 18, 2024, 11:10 a.m. UTC | #1
> 2. Are there any errors that may need error_mid_td treatment on NEC?
> 3. Are there any errors that may need "not success" treatment on VIA?

Sorry, I meant to write "any other errors", of course.
For VIA, the likely answer is "all errors" or "almost all".


And I forgot to add that while NEC seems to violate the spec, VIA
follows 4.9.1 exactly and other hosts are likely to do the same.
Mathias Nyman Jan. 18, 2024, 1:54 p.m. UTC | #2
On 18.1.2024 13.00, Michał Pecio wrote:
> I concurrently executed plan B for dealing with my NEC issues, which is
> to simply order a host controller with other chip (VIA), and now I have
> two controllers and three problems.
> 
> One is that it sometimes "dies" and requires a reboot, probably nothing
> Linux can do about it.
> 
> The other is that it reports successful completion of the final TRB and
> the driver overwrites frame->status set by prior errors. This seems to
> only affect isochronous again, though I'm not 100% sure.
> 
> 
> This change on top of yours takes care of it for transaction errors.
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index d2fe1f073e38..fce67493dfdf 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -2375,6 +2375,12 @@ static int process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
>   	/* handle completion code */
>   	switch (trb_comp_code) {
>   	case COMP_SUCCESS:
> +		/* Check for earlier errors; see xHCI 4.9.1 */
> +		if (td->error_mid_td) {
> +			xhci_info(xhci, "Got SUCCESS after mid TD error\n");
> +			/* don't overwrite previously assigned status */
> +			break;
> +		}
>   		if (remaining) {
>   			frame->status = short_framestatus;
>   			if (xhci->quirks & XHCI_TRUST_TX_LENGTH)
> 
> 
> Interesting questions arise:
> 
> 1. Should this be a separate patch?

Added this to the original patch.

> 
> 2. Are there any errors that may need error_mid_td treatment on NEC?
> Maybe BABBLE or others, unfortunately it isn't easy for me to produce
> them and see what happens.

Need to go through these, Isoc transfers should have as many error options as other types.

> 
> 3. Are there any errors that may need "not success" treatment on VIA?
> Any chance that these error sets aren't equal and I can't get away with
> reusing your error_mid_td flag here?
> 

Need to look at this as well after getting the first patch done.

I did play with the "error on last TD case",  patch for that is in a temp branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git fix_missing_td_event
https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/commit/?h=fix_missing_td_event&id=681415c02f95f6615a4d497df4b202a4f1fb0cc1

But it might just add more code without any real world benefit so I
think I won't send it upstream.

Thanks
Mathias
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d2fe1f073e38..fce67493dfdf 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2375,6 +2375,12 @@  static int process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
 	/* handle completion code */
 	switch (trb_comp_code) {
 	case COMP_SUCCESS:
+		/* Check for earlier errors; see xHCI 4.9.1 */
+		if (td->error_mid_td) {
+			xhci_info(xhci, "Got SUCCESS after mid TD error\n");
+			/* don't overwrite previously assigned status */
+			break;
+		}
 		if (remaining) {
 			frame->status = short_framestatus;
 			if (xhci->quirks & XHCI_TRUST_TX_LENGTH)