Message ID | 20200730104226.3537-1-yazzep@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] USB: hub.c: Add the retry count module parameter for usbcore | expand |
On Thu, Jul 30, 2020 at 07:42:26PM +0900, Yasushi Asano wrote: > From: Yasushi Asano <yasano@jp.adit-jv.com> > > Dear Alan > Dear Greg > > I would like to have a consultation with you. The customer's product > board needs to get a USB certification and compliance, but it can not > pass the test using the current USB hub driver. According to the USB > compliance plan[1], the“6.7.22 Device No Response” test requires the > detection of device enumeration failure within 30 seconds. The > kernel(USB hub driver) must notify userspace of the enumeration failure > within 30 seconds. Odd, we have passed the USB certification tests in the past, what changed recently to cause this? > In the current hub driver, the place to detect device enumeration > failure is [2]. I have modified the hub driver to issue a uevent here, > but the procedure of device enumeration (new_scheme+old_scheme) takes > too long to execute, it couldn't reach [2] within 30 seconds after > starting the test. According to the result of PETtool [3], the state of > "Device No response" is that usb_control_msg(USB_REQ_GET_DESCRIPTOR) [4] > in hub_port_initn times out. That means r == -ETIMEDOUT. because of the > default setting of initial_descriptor_timeout is 5000 msec[5], > usb_control_msg() took 5000msec until -ETIMEDOUT. > > I will try to describe the flow of device enumeration processing > below[6]. If my understanding is correct, the enumeration failure [2] > will be reached after performing all the loops of [7][8][9]+[7][10]. In > the new scheme, 12 times check will be performed ([7]/2*[8]*[9] => 2*2*3 > => 12). In the old scheme , 4 times check will be performed ([7]/2*[10] > => 2*2 => 4). In total, it checks 16 times, and it took 5000msec to > ETIMEDOUT every time. Therefore, It takes about 80 seconds(16*5000msec) > to reach [2]. This does not pass the "Device No response" test. > > If I set the module parameter old_schene_first=Y and use_both_schemes=N, > it can be reached [2] within 30 seconds. However, the new_scheme will no > longer be performed. I think we can't choose this, as > previously-detected devices may become undetectable. new_scheme is > taking too long and I think 12 times checks are redundant. So, I > confirmed the USB specification. > > This is the only description of the read of the device descriptor[12]. > I couldn't find the description related retry count or timing here and > anywhere in this specification. And I couldn't find the description > related timing in the “No silent failures" requirements in other > documents[13] also. > > Because I'm not sure where this number of loop count is decided, I'm not > sure how should it be modified the driver to pass the USB compliance > test. Is it possible for me to receive a proposal for a solution? As my > thought, the number of loops may be redundant, so I think if the user > can change it arbitrarily, the problem will be solved. Currently, the > timeout value can be adjusted with a module parameter, but the loop > count cannot. Is there any problem if it is changed like this patch? > The original handling of the driver is unchanged by this modification. I > think the user will be able to adjust the time to enumeration failure > freely. Is this patch acceptable? I would appreciate it much if I could > receive the feedback from you. > > ---------------------------------------------------------------------------------------------------------------------------------------- > [1] https://www.usb.org/sites/default/files/otgeh_compliance_plan_1_2.pdf > 6.7.22 A-UUT “Device No Response” for connection timeout > 6.7.22.2 Test Procedure for A-UUT which does not support sessions > - 5. If operator clicks OK before 30s elapses since VBUS went on, then UUT passes test. > - 6. If 30s elapses first, then UUT fails test. > ---------------------------------------------------------------------------------------------------------------------------------------- > [2] hub_port_connect() > > if (hub->hdev->parent || > !hcd->driver->port_handed_over || > !(hcd->driver->port_handed_over)(hcd, port1)) { > if (status != -ENOTCONN && status != -ENODEV) > dev_err(&port_dev->dev, > "unable to enumerate USB device\n"); > } > ---------------------------------------------------------------------------------------------------------------------------------------- > [3] http://www.mqp.com/usbcomp.htm > ---------------------------------------------------------------------------------------------------------------------------------------- > [4] hub_port_init() > > r = usb_control_msg(udev, usb_rcvaddr0pipe(), > USB_REQ_GET_DESCRIPTOR, USB_DIR_IN, > USB_DT_DEVICE << 8, 0, > buf, GET_DESCRIPTOR_BUFSIZE, > initial_descriptor_timeout); > ---------------------------------------------------------------------------------------------------------------------------------------- > [5] > static int initial_descriptor_timeout = USB_CTRL_GET_TIMEOUT; > include/linux/usb.h:#define USB_CTRL_GET_TIMEOUT 5000 > ---------------------------------------------------------------------------------------------------------------------------------------- > [6] The flow of device enumeration processing > drivers/usb/core/hub.c > > hub_port_connect(){ > for (x = 0; x < SET_CONFIG_TRIES; x++) { ...[7] SET_CONFIG_TRIES is 4 as default setting > hub_port_init(){ > if( x < 2 ) { > ------ new scheme ------ > for ( y= 0; y < 2; ++y ) { ...[8] 2==GET_DESCRIPTOR_TRIES > for ( z = 0; z < 3; ++z ) { ...[9] > usb_control_msg() ...[3] ETIMEDOUT(-110) is detected. Timieout value=5000msec. > } > } > } > else { > ------ old scheme ------ > for ( y= 0; y < 2; ++y ) { ...[10] 2==SET_ADDRESS_TRIES > hub_set_address() ...[11] ETIMEDOUT(-110) is detected. Timieout value=5000msec. > } > } > } > } > unable to enumerate USB device. ...[2] > } > ---------------------------------------------------------------------------------------------------------------------------------------- > [12] https://www.usb.org/document-library/usb-20-specification > Universal Serial Bus Specification (usb_20.pdf) > 9.1.2 Bus Enumeration > 6. Before the USB device receives a unique address, its Default Control Pipe is still accessible via the default address. > The host reads the device descriptor to determine what actual maximum data payload size this USB device's default pipe can use. > ---------------------------------------------------------------------------------------------------------------------------------------- > [13] https://www.usb.org/document-library/usb-20-specification > On-The-Go and Embedded Host Supplement to the USB Revision 2.0 Specification (USB_OTG_and_EH_2-0-version 1_1a.pdf) > 3.5 No Silent Failures > ---------------------------------------------------------------------------------------------------------------------------------------- > > Signed-off-by: Yasushi Asano <yasano@jp.adit-jv.com> > --- > drivers/usb/core/hub.c | 17 +++++++++++++++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c > index 052d5ac..01c2b2d 100644 > --- a/drivers/usb/core/hub.c > +++ b/drivers/usb/core/hub.c > @@ -99,6 +99,18 @@ MODULE_PARM_DESC(use_both_schemes, > "try the other device initialization scheme if the " > "first one fails"); > > +static int get_descriptor_tries = 2; > +module_param(get_descriptor_tries, int, S_IRUGO | S_IWUSR); > +MODULE_PARM_DESC(get_descriptor_tries, > + "The number of a 64-byte GET_DESCRIPTOR request tries " > + "(default 2)"); > + > +static int get_descriptor_operations = 3; > +module_param(get_descriptor_operations, int, S_IRUGO | S_IWUSR); > +MODULE_PARM_DESC(get_descriptor_operations, > + "The number of a 64-byte GET_DESCRIPTOR operation " > + "(default 3)"); > + > /* Mutual exclusion for EHCI CF initialization. This interferes with > * port reset on some companion controllers. > */ > @@ -2707,7 +2719,8 @@ static unsigned hub_is_wusb(struct usb_hub *hub) > > #define PORT_RESET_TRIES 5 > #define SET_ADDRESS_TRIES 2 > -#define GET_DESCRIPTOR_TRIES 2 > +#define GET_DESCRIPTOR_TRIES get_descriptor_tries > +#define GET_DESCRIPTOR_OPERATIONS get_descriptor_operations No need for these defines now that you have a variable being used, right? > #define SET_CONFIG_TRIES (2 * (use_both_schemes + 1)) > #define USE_NEW_SCHEME(i, scheme) ((i) / 2 == (int)(scheme)) > > @@ -4684,7 +4697,7 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1, > * 255 is for WUSB devices, we actually need to use > * 512 (WUSB1.0[4.8.1]). > */ > - for (operations = 0; operations < 3; ++operations) { > + for (operations = 0; operations < GET_DESCRIPTOR_OPERATIONS; ++operations) { module parameters are not ok, they work for all devices/hubs in the system, and no one knows how to change them or not. Any other way we can "just always do it correctly" here? thanks, greg k-h
On Thu, Jul 30, 2020 at 07:42:26PM +0900, Yasushi Asano wrote: > From: Yasushi Asano <yasano@jp.adit-jv.com> > > Dear Alan > Dear Greg > > I would like to have a consultation with you. The customer's product > board needs to get a USB certification and compliance, but it can not > pass the test using the current USB hub driver. According to the USB > compliance plan[1], the“6.7.22 Device No Response” test requires the > detection of device enumeration failure within 30 seconds. The > kernel(USB hub driver) must notify userspace of the enumeration failure > within 30 seconds. > > In the current hub driver, the place to detect device enumeration > failure is [2]. I have modified the hub driver to issue a uevent here, > but the procedure of device enumeration (new_scheme+old_scheme) takes > too long to execute, it couldn't reach [2] within 30 seconds after > starting the test. According to the result of PETtool [3], the state of > "Device No response" is that usb_control_msg(USB_REQ_GET_DESCRIPTOR) [4] > in hub_port_initn times out. That means r == -ETIMEDOUT. because of the > default setting of initial_descriptor_timeout is 5000 msec[5], > usb_control_msg() took 5000msec until -ETIMEDOUT. > > I will try to describe the flow of device enumeration processing > below[6]. If my understanding is correct, the enumeration failure [2] > will be reached after performing all the loops of [7][8][9]+[7][10]. In > the new scheme, 12 times check will be performed ([7]/2*[8]*[9] => 2*2*3 > => 12). In the old scheme , 4 times check will be performed ([7]/2*[10] > => 2*2 => 4). In total, it checks 16 times, and it took 5000msec to > ETIMEDOUT every time. Therefore, It takes about 80 seconds(16*5000msec) > to reach [2]. This does not pass the "Device No response" test. I agree with Greg, we don't want to add module parameters. The problem is that we make up to 16 attempts, and each attempt can take 5 seconds. We need to decrease the number of attempts to 6; then the total time will be 30 seconds. We also need to keep both the old and new schemes. So let's change the code to do 3 tries with each scheme. For example, 3 * new scheme, then 3 * old scheme, or else 3 * old scheme, then 3 * new_scheme, depending on the old_scheme_first parameter. Change the loops in [7], [8], [9], and [10] so that each iteration does the next item on this list instead of starting back at the beginning. (Or maybe it would work better with 2 * scheme, then 2 * old scheme, then 1 * new scheme, then 1 * old scheme with old and new swapped if old_scheme_first is set. I don't know.) Anyway, can you write a patch to do this? Alan Stern
Dear Alan Dear Greg Thank you for your feedback. I really appreciate your concrete proposal. > So let's change the code to do 3 tries with each scheme. I understood. I will try to modify it so that the number of attempts will decrease. It is 6 attempts in total both old and new schemes, but msleep is executed at various places in hub_port_connect and hub_port_init. apart from a timeout. For example, msleep(100) is executed every time in the loop of GET_DESCRIPTOR_TRIES[8] of new scheme. and In the old scheme, msleep(200) is executed in the loop of SET_ADDRESS_TRIES[10]. From my measurement, it does not subside within 30 seconds, but it is around 32 seconds. From these things, I would like you to reconsider the number of attempts. Is it OK to set the new scheme to 3 times and the old scheme to 2 times(no change as it is)? In other words [plan 1] 3 * new scheme, then 2 * old scheme, or else 2 * old scheme, then 3 * new_scheme, depending on the old_scheme_first parameter. Also, although it is a "better plan", the original processing is in the following. 6 * new scheme, then 6 * new scheme, then 2 * old scheme, then 2 * old scheme if it will be modified from above to below, It seems that the structure of the loop has to be greatly revised. I think. 2 * new scheme, then 2 * old scheme, then 1 * new scheme, then 1 * old scheme The fix is likely to be large, so Can I proceed with a patch in plan 1? I will post the patch after confirming the behavior of the patch with the customer board with the PET tool. please give me a little time. Best Regards Yasushi Asano On Thu, Jul 30, 2020 at 07:42:26PM +0900, Yasushi Asano wrote: > From: Yasushi Asano <yasano@jp.adit-jv.com> > > Dear Alan > Dear Greg > > I would like to have a consultation with you. The customer's product > board needs to get a USB certification and compliance, but it can not > pass the test using the current USB hub driver. According to the USB > compliance plan[1], the“6.7.22 Device No Response” test requires the > detection of device enumeration failure within 30 seconds. The > kernel(USB hub driver) must notify userspace of the enumeration failure > within 30 seconds. > > In the current hub driver, the place to detect device enumeration > failure is [2]. I have modified the hub driver to issue a uevent here, > but the procedure of device enumeration (new_scheme+old_scheme) takes > too long to execute, it couldn't reach [2] within 30 seconds after > starting the test. According to the result of PETtool [3], the state of > "Device No response" is that usb_control_msg(USB_REQ_GET_DESCRIPTOR) [4] > in hub_port_initn times out. That means r == -ETIMEDOUT. because of the > default setting of initial_descriptor_timeout is 5000 msec[5], > usb_control_msg() took 5000msec until -ETIMEDOUT. > > I will try to describe the flow of device enumeration processing > below[6]. If my understanding is correct, the enumeration failure [2] > will be reached after performing all the loops of [7][8][9]+[7][10]. In > the new scheme, 12 times check will be performed ([7]/2*[8]*[9] => 2*2*3 > => 12). In the old scheme , 4 times check will be performed ([7]/2*[10] > => 2*2 => 4). In total, it checks 16 times, and it took 5000msec to > ETIMEDOUT every time. Therefore, It takes about 80 seconds(16*5000msec) > to reach [2]. This does not pass the "Device No response" test. I agree with Greg, we don't want to add module parameters. The problem is that we make up to 16 attempts, and each attempt can take 5 seconds. We need to decrease the number of attempts to 6; then the total time will be 30 seconds. We also need to keep both the old and new schemes. So let's change the code to do 3 tries with each scheme. For example, 3 * new scheme, then 3 * old scheme, or else 3 * old scheme, then 3 * new_scheme, depending on the old_scheme_first parameter. Change the loops in [7], [8], [9], and [10] so that each iteration does the next item on this list instead of starting back at the beginning. (Or maybe it would work better with 2 * scheme, then 2 * old scheme, then 1 * new scheme, then 1 * old scheme with old and new swapped if old_scheme_first is set. I don't know.) Anyway, can you write a patch to do this? Alan Stern
On Thu, Aug 06, 2020 at 05:43:54AM +0000, Asano, Yasushi (ADITJ/SWG) wrote: > Dear Alan > Dear Greg > > Thank you for your feedback. > I really appreciate your concrete proposal. > > > So let's change the code to do 3 tries with each scheme. > I understood. I will try to modify it so that the number of > attempts will decrease. It is 6 attempts in total both old and > new schemes, but msleep is executed at various places in > hub_port_connect and hub_port_init. apart from a timeout. > > For example, msleep(100) is executed every time in the loop of > GET_DESCRIPTOR_TRIES[8] of new scheme. and In the old scheme, > msleep(200) is executed in the loop of SET_ADDRESS_TRIES[10]. > From my measurement, it does not subside within 30 seconds, > but it is around 32 seconds. > > From these things, I would like you to reconsider the number of attempts. > Is it OK to set the new scheme to 3 times and the old scheme to > 2 times(no change as it is)? In other words > > [plan 1] > 3 * new scheme, then 2 * old scheme, or else > 2 * old scheme, then 3 * new_scheme, > depending on the old_scheme_first parameter. Yes, that's all right. Although you might want to make the second case be: 3 * old scheme, then 2 * new scheme. > Also, although it is a "better plan", the original processing is in the following. > > 6 * new scheme, then 6 * new scheme, > then 2 * old scheme, then 2 * old scheme > > if it will be modified from above to below, It seems that the structure > of the loop has to be greatly revised. I think. > > 2 * new scheme, then 2 * old scheme, > then 1 * new scheme, then 1 * old scheme If you want to use only five attempts, you'll have to get rid of the last one. > The fix is likely to be large, so Can I proceed with a patch in plan 1? Okay. Alan Stern > I will post the patch after confirming the behavior of the patch with > the customer board with the PET tool. please give me a little time. > > Best Regards > Yasushi Asano
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 052d5ac..01c2b2d 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -99,6 +99,18 @@ MODULE_PARM_DESC(use_both_schemes, "try the other device initialization scheme if the " "first one fails"); +static int get_descriptor_tries = 2; +module_param(get_descriptor_tries, int, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(get_descriptor_tries, + "The number of a 64-byte GET_DESCRIPTOR request tries " + "(default 2)"); + +static int get_descriptor_operations = 3; +module_param(get_descriptor_operations, int, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(get_descriptor_operations, + "The number of a 64-byte GET_DESCRIPTOR operation " + "(default 3)"); + /* Mutual exclusion for EHCI CF initialization. This interferes with * port reset on some companion controllers. */ @@ -2707,7 +2719,8 @@ static unsigned hub_is_wusb(struct usb_hub *hub) #define PORT_RESET_TRIES 5 #define SET_ADDRESS_TRIES 2 -#define GET_DESCRIPTOR_TRIES 2 +#define GET_DESCRIPTOR_TRIES get_descriptor_tries +#define GET_DESCRIPTOR_OPERATIONS get_descriptor_operations #define SET_CONFIG_TRIES (2 * (use_both_schemes + 1)) #define USE_NEW_SCHEME(i, scheme) ((i) / 2 == (int)(scheme)) @@ -4684,7 +4697,7 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1, * 255 is for WUSB devices, we actually need to use * 512 (WUSB1.0[4.8.1]). */ - for (operations = 0; operations < 3; ++operations) { + for (operations = 0; operations < GET_DESCRIPTOR_OPERATIONS; ++operations) { buf->bMaxPacketSize0 = 0; r = usb_control_msg(udev, usb_rcvaddr0pipe(), USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,