diff mbox series

usb: musb: return -ESHUTDOWN in urb when three-strikes error happened

Message ID 20200513213620.21541-1-b-liu@ti.com (mailing list archive)
State Superseded
Headers show
Series usb: musb: return -ESHUTDOWN in urb when three-strikes error happened | expand

Commit Message

Bin Liu May 13, 2020, 9:36 p.m. UTC
When a USB device attached to a hub got disconnected, MUSB controller
generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.

Currently the MUSB host driver returns -EPROTO in current URB, then the
USB device driver could immediately resubmit the URB which causes MUSB
generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
storm then the hub never got a chance to report the USB device detach.

To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
immediately resubmit the URB.

Signed-off-by: Bin Liu <b-liu@ti.com>
---
 drivers/usb/musb/musb_host.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Alan Stern May 14, 2020, 1:32 a.m. UTC | #1
On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> When a USB device attached to a hub got disconnected, MUSB controller
> generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> 
> Currently the MUSB host driver returns -EPROTO in current URB, then the
> USB device driver could immediately resubmit the URB which causes MUSB
> generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> storm then the hub never got a chance to report the USB device detach.
> 
> To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> immediately resubmit the URB.
> 
> Signed-off-by: Bin Liu <b-liu@ti.com>

Strictly speaking, this is not the right thing to do.  It goes against 
the API described in error-codes.rst.  A better approach would be to fix 
the drivers that immediately resubmit an URB after getting a -EPROTO 
error.  After all, that is the wrong thing to do no matter what sort of 
host controller the device is attached to.

Alan Stern
Bin Liu May 14, 2020, 2:28 p.m. UTC | #2
On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > When a USB device attached to a hub got disconnected, MUSB controller
> > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > 
> > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > USB device driver could immediately resubmit the URB which causes MUSB
> > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > storm then the hub never got a chance to report the USB device detach.
> > 
> > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > immediately resubmit the URB.
> > 
> > Signed-off-by: Bin Liu <b-liu@ti.com>
> 
> Strictly speaking, this is not the right thing to do.  It goes against 
> the API described in error-codes.rst.  A better approach would be to fix 

error-codes.rst says:

-ESHUTDOWN              The device or host controller has been
			disabled due to some problem that could not
			be worked around, such as a physical
			disconnect.

So -ESHUTDOWN is applicable in this case - the device is disconnected
behind a hub.

> the drivers that immediately resubmit an URB after getting a -EPROTO 

This has been discussed before [1]. And John indicated there are many
device drivers do not delay when resubmitting after getting -EPROTO [2],
and it is quite bit of work to fix it in all device drivers [3]. So
better to solve the issue in HCD.

> error.  After all, that is the wrong thing to do no matter what sort of 
> host controller the device is attached to.

[1] https://www.spinics.net/lists/linux-usb/msg175292.html
[2] https://www.spinics.net/lists/linux-usb/msg176327.html
[3] https://www.spinics.net/lists/linux-usb/msg176375.html

-Bin.
Alan Stern May 14, 2020, 2:40 p.m. UTC | #3
On Thu, May 14, 2020 at 09:28:03AM -0500, Bin Liu wrote:
> On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> > On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > > When a USB device attached to a hub got disconnected, MUSB controller
> > > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > > 
> > > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > > USB device driver could immediately resubmit the URB which causes MUSB
> > > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > > storm then the hub never got a chance to report the USB device detach.
> > > 
> > > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > > immediately resubmit the URB.
> > > 
> > > Signed-off-by: Bin Liu <b-liu@ti.com>
> > 
> > Strictly speaking, this is not the right thing to do.  It goes against 
> > the API described in error-codes.rst.  A better approach would be to fix 
> 
> error-codes.rst says:
> 
> -ESHUTDOWN              The device or host controller has been
> 			disabled due to some problem that could not
> 			be worked around, such as a physical
> 			disconnect.
> 
> So -ESHUTDOWN is applicable in this case - the device is disconnected
> behind a hub.

Yes, but you don't _know_ that the device was disconnected.  All you 
know is that there was a 3-strikes error.  Other problems can cause such 
errors (noise, for example).

Alan Stern
Bin Liu May 14, 2020, 3:02 p.m. UTC | #4
On Thu, May 14, 2020 at 10:40:53AM -0400, Alan Stern wrote:
> On Thu, May 14, 2020 at 09:28:03AM -0500, Bin Liu wrote:
> > On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> > > On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > > > When a USB device attached to a hub got disconnected, MUSB controller
> > > > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > > > 
> > > > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > > > USB device driver could immediately resubmit the URB which causes MUSB
> > > > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > > > storm then the hub never got a chance to report the USB device detach.
> > > > 
> > > > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > > > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > > > immediately resubmit the URB.
> > > > 
> > > > Signed-off-by: Bin Liu <b-liu@ti.com>
> > > 
> > > Strictly speaking, this is not the right thing to do.  It goes against 
> > > the API described in error-codes.rst.  A better approach would be to fix 
> > 
> > error-codes.rst says:
> > 
> > -ESHUTDOWN              The device or host controller has been
> > 			disabled due to some problem that could not
> > 			be worked around, such as a physical
> > 			disconnect.
> > 
> > So -ESHUTDOWN is applicable in this case - the device is disconnected
> > behind a hub.
> 
> Yes, but you don't _know_ that the device was disconnected.  All you 
> know is that there was a 3-strikes error.  Other problems can cause such 
> errors (noise, for example).

Yes, I know this. But we don't have a solution then. I cannot add
resubmit delay in those ~500 device drivers.

-Bin.
Bin Liu May 14, 2020, 3:37 p.m. UTC | #5
On Thu, May 14, 2020 at 10:02:59AM -0500, Bin Liu wrote:
> On Thu, May 14, 2020 at 10:40:53AM -0400, Alan Stern wrote:
> > On Thu, May 14, 2020 at 09:28:03AM -0500, Bin Liu wrote:
> > > On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> > > > On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > > > > When a USB device attached to a hub got disconnected, MUSB controller
> > > > > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > > > > 
> > > > > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > > > > USB device driver could immediately resubmit the URB which causes MUSB
> > > > > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > > > > storm then the hub never got a chance to report the USB device detach.
> > > > > 
> > > > > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > > > > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > > > > immediately resubmit the URB.
> > > > > 
> > > > > Signed-off-by: Bin Liu <b-liu@ti.com>
> > > > 
> > > > Strictly speaking, this is not the right thing to do.  It goes against 
> > > > the API described in error-codes.rst.  A better approach would be to fix 
> > > 
> > > error-codes.rst says:
> > > 
> > > -ESHUTDOWN              The device or host controller has been
> > > 			disabled due to some problem that could not
> > > 			be worked around, such as a physical
> > > 			disconnect.
> > > 
> > > So -ESHUTDOWN is applicable in this case - the device is disconnected
> > > behind a hub.
> > 
> > Yes, but you don't _know_ that the device was disconnected.  All you 
> > know is that there was a 3-strikes error.  Other problems can cause such 
> > errors (noise, for example).
> 
> Yes, I know this. But we don't have a solution then. I cannot add
> resubmit delay in those ~500 device drivers.

By the way I don't think noise could last long enough to cause 3-strikes
error. A shortest USB packet is about 3-bytes long, a noise should be
just a glitch, it won't last at least 3-bytes long to supress the bus
and 3 times on the exact timing when the host expecting a response
packet. I cannot think of any other reason which can cause the 3-strikes
error other than the device is off the bus.

-Bin.
Alan Stern May 14, 2020, 4:26 p.m. UTC | #6
On Thu, May 14, 2020 at 10:37:31AM -0500, Bin Liu wrote:
> On Thu, May 14, 2020 at 10:02:59AM -0500, Bin Liu wrote:
> > On Thu, May 14, 2020 at 10:40:53AM -0400, Alan Stern wrote:
> > > On Thu, May 14, 2020 at 09:28:03AM -0500, Bin Liu wrote:
> > > > On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> > > > > On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > > > > > When a USB device attached to a hub got disconnected, MUSB controller
> > > > > > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > > > > > 
> > > > > > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > > > > > USB device driver could immediately resubmit the URB which causes MUSB
> > > > > > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > > > > > storm then the hub never got a chance to report the USB device detach.
> > > > > > 
> > > > > > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > > > > > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > > > > > immediately resubmit the URB.
> > > > > > 
> > > > > > Signed-off-by: Bin Liu <b-liu@ti.com>
> > > > > 
> > > > > Strictly speaking, this is not the right thing to do.  It goes against 
> > > > > the API described in error-codes.rst.  A better approach would be to fix 
> > > > 
> > > > error-codes.rst says:
> > > > 
> > > > -ESHUTDOWN              The device or host controller has been
> > > > 			disabled due to some problem that could not
> > > > 			be worked around, such as a physical
> > > > 			disconnect.
> > > > 
> > > > So -ESHUTDOWN is applicable in this case - the device is disconnected
> > > > behind a hub.
> > > 
> > > Yes, but you don't _know_ that the device was disconnected.  All you 
> > > know is that there was a 3-strikes error.  Other problems can cause such 
> > > errors (noise, for example).
> > 
> > Yes, I know this. But we don't have a solution then. I cannot add
> > resubmit delay in those ~500 device drivers.
> 
> By the way I don't think noise could last long enough to cause 3-strikes
> error. A shortest USB packet is about 3-bytes long, a noise should be
> just a glitch, it won't last at least 3-bytes long to supress the bus
> and 3 times on the exact timing when the host expecting a response
> packet. I cannot think of any other reason which can cause the 3-strikes
> error other than the device is off the bus.

Heh.  I heard from somebody (many years ago) about a setup where one of 
his USB devices stopped working whenever he turned on the fluorescent 
lights.

Yes, I agree that noise is pretty uncommon, and the vast majority of 
3-strikes errors are caused by disconnection or device firmware bugs.  
That's why I didn't NAK this patch.

Still, it's worth pointing out that this change abuses the API (perhaps 
mentioning it in a comment).  And it still would be preferable to fix 
the drivers in question, impractical though that may be.  (I have a hard 
time believing there are really 500 of them getting this wrong...)

Alan Stern
Bin Liu May 14, 2020, 5 p.m. UTC | #7
On Thu, May 14, 2020 at 12:26:04PM -0400, Alan Stern wrote:
> On Thu, May 14, 2020 at 10:37:31AM -0500, Bin Liu wrote:
> > On Thu, May 14, 2020 at 10:02:59AM -0500, Bin Liu wrote:
> > > On Thu, May 14, 2020 at 10:40:53AM -0400, Alan Stern wrote:
> > > > On Thu, May 14, 2020 at 09:28:03AM -0500, Bin Liu wrote:
> > > > > On Wed, May 13, 2020 at 09:32:05PM -0400, Alan Stern wrote:
> > > > > > On Wed, May 13, 2020 at 04:36:20PM -0500, Bin Liu wrote:
> > > > > > > When a USB device attached to a hub got disconnected, MUSB controller
> > > > > > > generates RXCSR_RX_ERROR interrupt for the 3-strikes-out error.
> > > > > > > 
> > > > > > > Currently the MUSB host driver returns -EPROTO in current URB, then the
> > > > > > > USB device driver could immediately resubmit the URB which causes MUSB
> > > > > > > generate RXCSR_RX_ERROR interrupt again. This circle causes interrupt
> > > > > > > storm then the hub never got a chance to report the USB device detach.
> > > > > > > 
> > > > > > > To fix the interrupt storm, change the URB return code to -ESHUTDOWN for
> > > > > > > MUSB_RXCSR_H_ERROR interrupt, so that the USB device driver will not
> > > > > > > immediately resubmit the URB.
> > > > > > > 
> > > > > > > Signed-off-by: Bin Liu <b-liu@ti.com>
> > > > > > 
> > > > > > Strictly speaking, this is not the right thing to do.  It goes against 
> > > > > > the API described in error-codes.rst.  A better approach would be to fix 
> > > > > 
> > > > > error-codes.rst says:
> > > > > 
> > > > > -ESHUTDOWN              The device or host controller has been
> > > > > 			disabled due to some problem that could not
> > > > > 			be worked around, such as a physical
> > > > > 			disconnect.
> > > > > 
> > > > > So -ESHUTDOWN is applicable in this case - the device is disconnected
> > > > > behind a hub.
> > > > 
> > > > Yes, but you don't _know_ that the device was disconnected.  All you 
> > > > know is that there was a 3-strikes error.  Other problems can cause such 
> > > > errors (noise, for example).
> > > 
> > > Yes, I know this. But we don't have a solution then. I cannot add
> > > resubmit delay in those ~500 device drivers.
> > 
> > By the way I don't think noise could last long enough to cause 3-strikes
> > error. A shortest USB packet is about 3-bytes long, a noise should be
> > just a glitch, it won't last at least 3-bytes long to supress the bus
> > and 3 times on the exact timing when the host expecting a response
> > packet. I cannot think of any other reason which can cause the 3-strikes
> > error other than the device is off the bus.
> 
> Heh.  I heard from somebody (many years ago) about a setup where one of 
> his USB devices stopped working whenever he turned on the fluorescent 
> lights.

This could happen, noise comming in from the power supply, cheap power
supply and poor board design.

> Yes, I agree that noise is pretty uncommon, and the vast majority of 
> 3-strikes errors are caused by disconnection or device firmware bugs.  

I wasn't mean noise is uncommon, it is just it won't cause 3-strikes
error because the glitch is very short. I have seen many cases that
noise causing issues on MUSB due to poor board design especially on
those USB modem or WIFI projects.

> That's why I didn't NAK this patch.

Thanks.

> Still, it's worth pointing out that this change abuses the API (perhaps 
> mentioning it in a comment).  And it still would be preferable to fix 

Okay, I will add those notes in comment in v2.

> the drivers in question, impractical though that may be.  (I have a hard 
> time believing there are really 500 of them getting this wrong...)

I am not sure about it either, but yeah it is not practical to fix the
issue in device drivers. So far I have seen 3 reports of this issue:

1. with FTDI usb-serial adapter, the issue is in the usb serial generic
driver;

2. with a modem, the issue is in usb wwan driver as I fixed in the patch
I sent yesterday;

2. another modem, before I was able to locate the offending device
driver, the guy who reported the issue disappeared, and not responding
me.

-Bin.
Alan Stern May 14, 2020, 6:55 p.m. UTC | #8
On Thu, May 14, 2020 at 12:00:23PM -0500, Bin Liu wrote:

> > Still, it's worth pointing out that this change abuses the API (perhaps 
> > mentioning it in a comment).  And it still would be preferable to fix 
> 
> Okay, I will add those notes in comment in v2.
> 
> > the drivers in question, impractical though that may be.  (I have a hard 
> > time believing there are really 500 of them getting this wrong...)
> 
> I am not sure about it either, but yeah it is not practical to fix the
> issue in device drivers. So far I have seen 3 reports of this issue:
> 
> 1. with FTDI usb-serial adapter, the issue is in the usb serial generic
> driver;

It shouldn't be hard to fix this.  We don't even need a delayed retry; 
the driver can simply treat -EPROTO as an unrecoverable error, just like 
-ESHUTDOWN.

Alan Stern

> 2. with a modem, the issue is in usb wwan driver as I fixed in the patch
> I sent yesterday;
> 
> 2. another modem, before I was able to locate the offending device
> driver, the guy who reported the issue disappeared, and not responding
> me.
> 
> -Bin.
Bin Liu May 19, 2020, 5:12 p.m. UTC | #9
On Thu, May 14, 2020 at 02:55:49PM -0400, Alan Stern wrote:
> On Thu, May 14, 2020 at 12:00:23PM -0500, Bin Liu wrote:
> 
> > > Still, it's worth pointing out that this change abuses the API (perhaps 
> > > mentioning it in a comment).  And it still would be preferable to fix 
> > 
> > Okay, I will add those notes in comment in v2.
> > 
> > > the drivers in question, impractical though that may be.  (I have a hard 
> > > time believing there are really 500 of them getting this wrong...)
> > 
> > I am not sure about it either, but yeah it is not practical to fix the
> > issue in device drivers. So far I have seen 3 reports of this issue:
> > 
> > 1. with FTDI usb-serial adapter, the issue is in the usb serial generic
> > driver;
> 
> It shouldn't be hard to fix this.  We don't even need a delayed retry; 
> the driver can simply treat -EPROTO as an unrecoverable error, just like 
> -ESHUTDOWN.

This is just an example showing that it is not practical to fix the
problem in device drivers.

BTY, I just briefly searched the drivers/ folder, there are only three
drivers handle -EPROTO specifically in rx callback and bail out. So
seems most drivers do not care about -EPROTO.

-Bin.
Alan Stern May 19, 2020, 8:01 p.m. UTC | #10
On Tue, May 19, 2020 at 12:12:55PM -0500, Bin Liu wrote:
> On Thu, May 14, 2020 at 02:55:49PM -0400, Alan Stern wrote:
> > On Thu, May 14, 2020 at 12:00:23PM -0500, Bin Liu wrote:
> > 
> > > > Still, it's worth pointing out that this change abuses the API (perhaps 
> > > > mentioning it in a comment).  And it still would be preferable to fix 
> > > 
> > > Okay, I will add those notes in comment in v2.
> > > 
> > > > the drivers in question, impractical though that may be.  (I have a hard 
> > > > time believing there are really 500 of them getting this wrong...)
> > > 
> > > I am not sure about it either, but yeah it is not practical to fix the
> > > issue in device drivers. So far I have seen 3 reports of this issue:
> > > 
> > > 1. with FTDI usb-serial adapter, the issue is in the usb serial generic
> > > driver;
> > 
> > It shouldn't be hard to fix this.  We don't even need a delayed retry; 
> > the driver can simply treat -EPROTO as an unrecoverable error, just like 
> > -ESHUTDOWN.
> 
> This is just an example showing that it is not practical to fix the
> problem in device drivers.
> 
> BTY, I just briefly searched the drivers/ folder, there are only three
> drivers handle -EPROTO specifically in rx callback and bail out. So
> seems most drivers do not care about -EPROTO.

Then do you think we should change the API?

Right now it's a little too complicated.  There's no real reason for 
EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
suggesting that we should fold them all into ESHUTDOWN.

Alan Stern
Bin Liu May 20, 2020, 2:31 p.m. UTC | #11
On Tue, May 19, 2020 at 04:01:58PM -0400, Alan Stern wrote:
> On Tue, May 19, 2020 at 12:12:55PM -0500, Bin Liu wrote:
> > On Thu, May 14, 2020 at 02:55:49PM -0400, Alan Stern wrote:
> > > On Thu, May 14, 2020 at 12:00:23PM -0500, Bin Liu wrote:
> > > 
> > > > > Still, it's worth pointing out that this change abuses the API (perhaps 
> > > > > mentioning it in a comment).  And it still would be preferable to fix 
> > > > 
> > > > Okay, I will add those notes in comment in v2.
> > > > 
> > > > > the drivers in question, impractical though that may be.  (I have a hard 
> > > > > time believing there are really 500 of them getting this wrong...)
> > > > 
> > > > I am not sure about it either, but yeah it is not practical to fix the
> > > > issue in device drivers. So far I have seen 3 reports of this issue:
> > > > 
> > > > 1. with FTDI usb-serial adapter, the issue is in the usb serial generic
> > > > driver;
> > > 
> > > It shouldn't be hard to fix this.  We don't even need a delayed retry; 
> > > the driver can simply treat -EPROTO as an unrecoverable error, just like 
> > > -ESHUTDOWN.
> > 
> > This is just an example showing that it is not practical to fix the
> > problem in device drivers.
> > 
> > BTY, I just briefly searched the drivers/ folder, there are only three
> > drivers handle -EPROTO specifically in rx callback and bail out. So
> > seems most drivers do not care about -EPROTO.
> 
> Then do you think we should change the API?
> 
> Right now it's a little too complicated.  There's no real reason for 
> EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
> suggesting that we should fold them all into ESHUTDOWN.

I haven't studied all the USB error code enough to suggest this. I am
only changing to ESHUTDOWN in musb driver because I know this 3-strikes
error should only be caused by USB device disabled, which is the
definition of ESHUTDOWN in error-code.rst.

-Bin.
Alan Stern May 20, 2020, 4:40 p.m. UTC | #12
> > Then do you think we should change the API?
> > 
> > Right now it's a little too complicated.  There's no real reason for 
> > EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
> > suggesting that we should fold them all into ESHUTDOWN.
> 
> I haven't studied all the USB error code enough to suggest this. I am
> only changing to ESHUTDOWN in musb driver because I know this 3-strikes
> error should only be caused by USB device disabled, which is the
> definition of ESHUTDOWN in error-code.rst.

You should spend a little time studying the error-codes document.  
EPROTO, EILSEQ, and ETIME describe different kinds of errors that all 
fall into the 3-strikes category.

Alan Stern
Bin Liu May 20, 2020, 6:05 p.m. UTC | #13
On Wed, May 20, 2020 at 12:40:55PM -0400, Alan Stern wrote:
> > > Then do you think we should change the API?
> > > 
> > > Right now it's a little too complicated.  There's no real reason for 
> > > EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
> > > suggesting that we should fold them all into ESHUTDOWN.
> > 
> > I haven't studied all the USB error code enough to suggest this. I am
> > only changing to ESHUTDOWN in musb driver because I know this 3-strikes
> > error should only be caused by USB device disabled, which is the
> > definition of ESHUTDOWN in error-code.rst.
> 
> You should spend a little time studying the error-codes document.  
> EPROTO, EILSEQ, and ETIME describe different kinds of errors that all 
> fall into the 3-strikes category.

But not all device drivers, except the 3 I found, treat EPROTO as fatal,
and I cannot fix it in all the device drivers. The only thing I can do
at this moment is changing the MUSB HCD driver to return ESHUTDOWN which
still follows the document, kind of.

-Bin.
Alan Stern May 20, 2020, 6:25 p.m. UTC | #14
On Wed, May 20, 2020 at 01:05:50PM -0500, Bin Liu wrote:
> On Wed, May 20, 2020 at 12:40:55PM -0400, Alan Stern wrote:
> > > > Then do you think we should change the API?
> > > > 
> > > > Right now it's a little too complicated.  There's no real reason for 
> > > > EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
> > > > suggesting that we should fold them all into ESHUTDOWN.
> > > 
> > > I haven't studied all the USB error code enough to suggest this. I am
> > > only changing to ESHUTDOWN in musb driver because I know this 3-strikes
> > > error should only be caused by USB device disabled, which is the
> > > definition of ESHUTDOWN in error-code.rst.
> > 
> > You should spend a little time studying the error-codes document.  
> > EPROTO, EILSEQ, and ETIME describe different kinds of errors that all 
> > fall into the 3-strikes category.
> 
> But not all device drivers, except the 3 I found, treat EPROTO as fatal,
> and I cannot fix it in all the device drivers. The only thing I can do
> at this moment is changing the MUSB HCD driver to return ESHUTDOWN which
> still follows the document, kind of.

Think long-term.  Yes, for now you're focused on changing the MUSB HCD.  
But what should happen in the future?

At some point we all need to stop reacting and start making some 
deliberate plans.  This issue is bigger than one patch or one driver.

Alan Stern
Bin Liu May 20, 2020, 6:59 p.m. UTC | #15
On Wed, May 20, 2020 at 02:25:36PM -0400, Alan Stern wrote:
> On Wed, May 20, 2020 at 01:05:50PM -0500, Bin Liu wrote:
> > On Wed, May 20, 2020 at 12:40:55PM -0400, Alan Stern wrote:
> > > > > Then do you think we should change the API?
> > > > > 
> > > > > Right now it's a little too complicated.  There's no real reason for 
> > > > > EPROTO, EILSEQ, and ETIME to be different codes.  And now you seem to be 
> > > > > suggesting that we should fold them all into ESHUTDOWN.
> > > > 
> > > > I haven't studied all the USB error code enough to suggest this. I am
> > > > only changing to ESHUTDOWN in musb driver because I know this 3-strikes
> > > > error should only be caused by USB device disabled, which is the
> > > > definition of ESHUTDOWN in error-code.rst.
> > > 
> > > You should spend a little time studying the error-codes document.  
> > > EPROTO, EILSEQ, and ETIME describe different kinds of errors that all 
> > > fall into the 3-strikes category.
> > 
> > But not all device drivers, except the 3 I found, treat EPROTO as fatal,
> > and I cannot fix it in all the device drivers. The only thing I can do
> > at this moment is changing the MUSB HCD driver to return ESHUTDOWN which
> > still follows the document, kind of.
> 
> Think long-term.  Yes, for now you're focused on changing the MUSB HCD.  
> But what should happen in the future?
> 
> At some point we all need to stop reacting and start making some 
> deliberate plans.  This issue is bigger than one patch or one driver.

Yes, the issue big, not all device drivers treat the urb error status
properly. As the case #3 I mentioned earlier, even MUSB HCD returns
ESHUTDOWN, the interrupt storm still happens. Unfortunately before I
located the offending device driver, the remote setup is gone...

I agree with you, we probably should simply the number of the error code
based on the real usecases and the handlings in device drivers.

-Bin.
diff mbox series

Patch

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 8736f4251a22..e41d910dcad1 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -1774,9 +1774,9 @@  void musb_host_rx(struct musb *musb, u8 epnum)
 		status = -EPIPE;
 
 	} else if (rx_csr & MUSB_RXCSR_H_ERROR) {
-		musb_dbg(musb, "end %d RX proto error", epnum);
+		dev_err(musb->controller, "ep%d RX three-strikes error", epnum);
 
-		status = -EPROTO;
+		status = -ESHUTDOWN;
 		musb_writeb(epio, MUSB_RXINTERVAL, 0);
 
 		rx_csr &= ~MUSB_RXCSR_H_ERROR;