Message ID | 20231011164525.97616-1-hgajjar@de.adit-jv.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] usb: core: hub: Add quirks for reducing device address timeout | expand |
On Wed, Oct 11, 2023 at 06:45:25PM +0200, Hardik Gajjar wrote: > Currently, the timeout for the set address command is fixed at > 5 seconds in the xhci driver. This means the host waits up to 5 > seconds to receive a response for the set_address command from > the device. > > In the automotive context, most smartphone enumerations, including > screen projection, should ideally complete within 3 seconds. "should" according to whom? That goes against the USB specification, so why not take it up with them? > Achieving this is impossible in scenarios where the set_address is > not successful and waits for a timeout. Agreed, broken hardware is a pain, but if your device is allowed to take longer, it can, and will, so you have to support that. > The shortened address device timeout quirks provide the flexibility > to align with a 3-second time limit in the event of errors. > By swiftly triggering a failure response and swiftly initiating > retry procedures, these quirks ensure efficient and rapid recovery, > particularly in automotive contexts where rapid smartphone enumeration > and screen projection are vital. Screen projection is a requirement that you should not be relying on USB for as USB has a different set of required timeouts, right? This sounds like a bad hardware design, if not an impossible one. > The quirk will set the timeout to 500 ms from 5 seconds. > > To use the quirk, please write "vendor_id:product_id:p" to > /sys/bus/usb/drivers/hub/module/parameter/quirks > > For example, > echo "0x2c48:0x0132:p" > /sys/bus/usb/drivers/hub/module/parameter/quirks" > > Signed-off-by: Hardik Gajjar <hgajjar@de.adit-jv.com> > --- > changes since version 1: > - implement quirk instead of new API in xhci driver > > changes since version 2: > - Add documentation for the new quirk. > - Define the timeout unit in milliseconds in variable names and function arguments. > - Change the xHCI command timeout from HZ (jiffies) to milliseconds. > - Add APTIV usb hub vendor and product ID in device quirk list > - Adding some other comments for clarity > > Changes since version 3: > - Add some comments for clarity. > - Minor indentation and sequence change. > --- > .../admin-guide/kernel-parameters.txt | 3 +++ > drivers/usb/core/hub.c | 21 ++++++++++++++++-- > drivers/usb/core/quirks.c | 6 +++++ > drivers/usb/host/xhci-mem.c | 2 ++ > drivers/usb/host/xhci-ring.c | 11 +++++----- > drivers/usb/host/xhci.c | 22 ++++++++++++++----- > drivers/usb/host/xhci.h | 9 ++++++-- > include/linux/usb/hcd.h | 5 +++-- > include/linux/usb/quirks.h | 3 +++ > 9 files changed, 65 insertions(+), 17 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 0a1731a0f0ef..3c03f23bd5d5 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -6817,6 +6817,9 @@ > pause after every control message); > o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra > delay after resetting its port); > + p = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT (Timeout > + of set_address command reducing from > + 5000 ms to 500 ms) > Example: quirks=0781:5580:bk,0a5c:5834:gij > > usbhid.mousepoll= > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c > index 3c54b218301c..83d1af0a3953 100644 > --- a/drivers/usb/core/hub.c > +++ b/drivers/usb/core/hub.c > @@ -54,6 +54,18 @@ > #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */ > #define USB_PING_RESPONSE_TIME 400 /* ns */ > > +/* > + * address device command timeout 5000 ms is recommended in > + * USB 2.0 spec, section 9.2.6.1 The 2.0 spec is superseeded by the USB 3.1 specification (or is it 3.2 now?) Please use the latest specification as 2.0 is very very old by now. > + */ > +#define USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS 5000 /* ms */ > + > +/* > + * address device command timeout will be 500 ms when > + * USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT enable. > + */ > +#define USB_SHORT_ADDR_DEVICE_TIMEOUT_MS 500 /* ms */ > + > /* Protect struct usb_device->state and ->children members > * Note: Both are also protected by ->dev.sem, except that ->state can > * change to USB_STATE_NOTATTACHED even when the semaphore isn't held. */ > @@ -4626,7 +4638,12 @@ EXPORT_SYMBOL_GPL(usb_ep0_reinit); > static int hub_set_address(struct usb_device *udev, int devnum) > { > int retval; > + unsigned int timeout_ms = USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS; > struct usb_hcd *hcd = bus_to_hcd(udev->bus); > + struct usb_hub *hub = usb_hub_to_struct_hub(udev->parent); > + > + if (hub->hdev->quirks & USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT) > + timeout_ms = USB_SHORT_ADDR_DEVICE_TIMEOUT_MS; > > /* > * The host controller will choose the device address, > @@ -4639,11 +4656,11 @@ static int hub_set_address(struct usb_device *udev, int devnum) > if (udev->state != USB_STATE_DEFAULT) > return -EINVAL; > if (hcd->driver->address_device) > - retval = hcd->driver->address_device(hcd, udev); > + retval = hcd->driver->address_device(hcd, udev, timeout_ms); > else > retval = usb_control_msg(udev, usb_sndaddr0pipe(), > USB_REQ_SET_ADDRESS, 0, devnum, 0, > - NULL, 0, USB_CTRL_SET_TIMEOUT); > + NULL, 0, timeout_ms); > if (retval == 0) { > update_devnum(udev, devnum); > /* Device now using proper address. */ > diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c > index 15e9bd180a1d..863e7fe24157 100644 > --- a/drivers/usb/core/quirks.c > +++ b/drivers/usb/core/quirks.c > @@ -138,6 +138,9 @@ static int quirks_param_set(const char *value, const struct kernel_param *kp) > case 'o': > flags |= USB_QUIRK_HUB_SLOW_RESET; > break; > + case 'p': > + flags |= USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT; > + break; > /* Ignore unrecognized flag characters */ > } > } > @@ -527,6 +530,9 @@ static const struct usb_device_id usb_quirk_list[] = { > > { USB_DEVICE(0x2386, 0x350e), .driver_info = USB_QUIRK_NO_LPM }, > > + /* APTIV AUTOMOTIVE HUB */ > + { USB_DEVICE(0x2c48, 0x0132), .driver_info = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT }, So the real issue that you have here is a broken built-in USB hub that does not error out quick enough, right? Why not fix the firmware in that hub as you know it's broken? Why is it the operating system's job to work around non-compliant devices? Ok, that last question was redundant, of course it's our job to work around broken devices, but this feels different. You are trying to say "hey, I know this device is broken, so error out quick so we can just ignore it", right? If so, why not just never allow that device to enumerate at all? You don't have to accept it as a valid device to the system (just don't authorize it), and then no device will ever connect to it so what is the delay issue? > + > /* DJI CineSSD */ > { USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM }, > > diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c > index 8714ab5bf04d..4a286136d1a8 100644 > --- a/drivers/usb/host/xhci-mem.c > +++ b/drivers/usb/host/xhci-mem.c > @@ -1729,6 +1729,8 @@ struct xhci_command *xhci_alloc_command(struct xhci_hcd *xhci, > } > > command->status = 0; > + /* set default timeout to 5000 ms */ > + command->timeout_ms = XHCI_CMD_DEFAULT_TIMEOUT_MS; > INIT_LIST_HEAD(&command->cmd_list); > return command; > } > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 1dde53f6eb31..8f36c2914938 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -366,9 +366,10 @@ void xhci_ring_cmd_db(struct xhci_hcd *xhci) > readl(&xhci->dba->doorbell[0]); > } > > -static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci, unsigned long delay) > +static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci) > { > - return mod_delayed_work(system_wq, &xhci->cmd_timer, delay); > + return mod_delayed_work(system_wq, &xhci->cmd_timer, > + msecs_to_jiffies(xhci->current_cmd->timeout_ms)); > } > > static struct xhci_command *xhci_next_queued_cmd(struct xhci_hcd *xhci) > @@ -412,7 +413,7 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, > if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) && > !(xhci->xhc_state & XHCI_STATE_DYING)) { > xhci->current_cmd = cur_cmd; > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > + xhci_mod_cmd_timer(xhci); > xhci_ring_cmd_db(xhci); > } > } > @@ -1786,7 +1787,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci, > if (!list_is_singular(&xhci->cmd_list)) { > xhci->current_cmd = list_first_entry(&cmd->cmd_list, > struct xhci_command, cmd_list); > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > + xhci_mod_cmd_timer(xhci); > } else if (xhci->current_cmd == cmd) { > xhci->current_cmd = NULL; > } > @@ -4301,7 +4302,7 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd, > /* if there are no other commands queued we start the timeout timer */ > if (list_empty(&xhci->cmd_list)) { > xhci->current_cmd = cmd; > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > + xhci_mod_cmd_timer(xhci); > } > > list_add_tail(&cmd->cmd_list, &xhci->cmd_list); > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c > index e1b1b64a0723..85ea4e17d2a0 100644 > --- a/drivers/usb/host/xhci.c > +++ b/drivers/usb/host/xhci.c > @@ -3998,11 +3998,18 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev) > } > > /* > - * Issue an Address Device command and optionally send a corresponding > - * SetAddress request to the device. > + * This function issues an Address Device command to assign a unique USB bus > + * address. Optionally, it sends a SetAddress request. > + * > + * @param hcd USB host controller data structure. > + * @param udev USB device structure representing the connected device. > + * @param setup Enum specifying setup mode: address only or with context. > + * @param timeout_ms Max wait time (ms) for the command operation to complete. "param" is not how kernel doc formatting works at all, sorry. If you are going to document this that way, please use the correct style otherwise our tools will choke. > + * > + * @return Integer status code: 0 on success, negative on error. > */ > static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > - enum xhci_setup_dev setup) > + enum xhci_setup_dev setup, unsigned int timeout_ms) > { > const char *act = setup == SETUP_CONTEXT_ONLY ? "context" : "address"; > unsigned long flags; > @@ -4059,6 +4066,7 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > } > > command->in_ctx = virt_dev->in_ctx; > + command->timeout_ms = timeout_ms; > > slot_ctx = xhci_get_slot_ctx(xhci, virt_dev->in_ctx); > ctrl_ctx = xhci_get_input_control_ctx(virt_dev->in_ctx); > @@ -4185,14 +4193,16 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > return ret; > } > > -static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev) > +static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev, > + unsigned int timeout_ms) > { > - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS); > + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS, timeout_ms); > } > > static int xhci_enable_device(struct usb_hcd *hcd, struct usb_device *udev) > { > - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY); > + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY, > + XHCI_CMD_DEFAULT_TIMEOUT_MS); > } > > /* > diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h > index 7e282b4522c0..ec5c663246e5 100644 > --- a/drivers/usb/host/xhci.h > +++ b/drivers/usb/host/xhci.h > @@ -818,6 +818,8 @@ struct xhci_command { > struct completion *completion; > union xhci_trb *command_trb; > struct list_head cmd_list; > + /* xHCI command response timeout in milliseconds */ > + unsigned int timeout_ms; > }; > > /* drop context bitmasks */ > @@ -1576,8 +1578,11 @@ struct xhci_td { > unsigned int num_trbs; > }; > > -/* xHCI command default timeout value */ > -#define XHCI_CMD_DEFAULT_TIMEOUT (5 * HZ) > +/* > + * xHCI command default timeout value in milliseconds. > + * USB 2.0 spec, section 9.2.6.1 xHCI came about in the 3.0 specification, it was not around in the 2.0 one, right? > + */ > +#define XHCI_CMD_DEFAULT_TIMEOUT_MS 5000 > > /* command descriptor */ > struct xhci_cd { > diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h > index 61d4f0b793dc..d0e19ac3ba6c 100644 > --- a/include/linux/usb/hcd.h > +++ b/include/linux/usb/hcd.h > @@ -372,8 +372,9 @@ struct hc_driver { > * or bandwidth constraints. > */ > void (*reset_bandwidth)(struct usb_hcd *, struct usb_device *); > - /* Returns the hardware-chosen device address */ > - int (*address_device)(struct usb_hcd *, struct usb_device *udev); > + /* Set the hardware-chosen device address */ > + int (*address_device)(struct usb_hcd *, struct usb_device *udev, > + unsigned int timeout_ms); Did this function callback just change operation? If not, why change the comment? Or has the comment always been wrong? > /* prepares the hardware to send commands to the device */ > int (*enable_device)(struct usb_hcd *, struct usb_device *udev); > /* Notifies the HCD after a hub descriptor is fetched. > diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h > index eeb7c2157c72..0cb464e3eaf4 100644 > --- a/include/linux/usb/quirks.h > +++ b/include/linux/usb/quirks.h > @@ -72,4 +72,7 @@ > /* device has endpoints that should be ignored */ > #define USB_QUIRK_ENDPOINT_IGNORE BIT(15) > > +/* short device address timeout */ > +#define USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT BIT(16) As you really just want to fail this device, why not just make a "This is a broken device, never talk to it" type of quirk instead? thanks, greg k-h
On Mon, Oct 16, 2023 at 07:58:36PM +0200, Greg KH wrote: > On Wed, Oct 11, 2023 at 06:45:25PM +0200, Hardik Gajjar wrote: > > Currently, the timeout for the set address command is fixed at > > 5 seconds in the xhci driver. This means the host waits up to 5 > > seconds to receive a response for the set_address command from > > the device. > > > > In the automotive context, most smartphone enumerations, including > > screen projection, should ideally complete within 3 seconds. > > "should" according to whom? That goes against the USB specification, so > why not take it up with them? We're implementing this change to meet Google's Android Auto certification requirements, even though these requirements aren't publicly available. The issue arises when we connect an Android phone to a Linux host using a special automotive USB hub. The host's xHCI doesn't receive a response from the hub after sending a 'set_address' request, resulting in a timeout. This problem is specific to one phone model. It's worth noting that we can't argue that this requirement violates USB standards because the 3-second time limit applies only when connecting an Android mobile phone. I assume that Android phones also undergo Google certification, which likely includes a condition for successful enumeration within a specified time frame. However, I can't confirm this. More logs and detailed in patch V1: https://lore.kernel.org/linux-usb/20230818092353.124658-1-hgajjar@de.adit-jv.com/T/#m452ec9dad94e8181fdb050cd29483dd89437f7c1 > > > Achieving this is impossible in scenarios where the set_address is > > not successful and waits for a timeout. > > Agreed, broken hardware is a pain, but if your device is allowed to take > longer, it can, and will, so you have to support that. > The problem is not caused by the device taking an extended amount of time to process the 'set_address' request. Instead, the issue lies in the absence of any activity on the upstream bus until a timeout occurs. This situation arises when the host has already transmitted the 'set_address' command to the hub, assuming that the device operates at full speed. However, the device connected to the hub undergoes a state change from full speed to high-speed during this process. > > The shortened address device timeout quirks provide the flexibility > > to align with a 3-second time limit in the event of errors. > > By swiftly triggering a failure response and swiftly initiating > > retry procedures, these quirks ensure efficient and rapid recovery, > > particularly in automotive contexts where rapid smartphone enumeration > > and screen projection are vital. > > Screen projection is a requirement that you should not be relying on USB > for as USB has a different set of required timeouts, right? This sounds > like a bad hardware design, if not an impossible one. > Screen projection for us means displaying the connected phone on the screen and launching Carplay and Android Auto for the user. This works perfectly in nearly all cases, except in scenarios like this one where a combination of a special hub and a specific phone model is causing the issue > > The quirk will set the timeout to 500 ms from 5 seconds. > > > > To use the quirk, please write "vendor_id:product_id:p" to > > /sys/bus/usb/drivers/hub/module/parameter/quirks > > > > For example, > > echo "0x2c48:0x0132:p" > /sys/bus/usb/drivers/hub/module/parameter/quirks" > > > > Signed-off-by: Hardik Gajjar <hgajjar@de.adit-jv.com> > > --- > > changes since version 1: > > - implement quirk instead of new API in xhci driver > > > > changes since version 2: > > - Add documentation for the new quirk. > > - Define the timeout unit in milliseconds in variable names and function arguments. > > - Change the xHCI command timeout from HZ (jiffies) to milliseconds. > > - Add APTIV usb hub vendor and product ID in device quirk list > > - Adding some other comments for clarity > > > > Changes since version 3: > > - Add some comments for clarity. > > - Minor indentation and sequence change. > > --- > > .../admin-guide/kernel-parameters.txt | 3 +++ > > drivers/usb/core/hub.c | 21 ++++++++++++++++-- > > drivers/usb/core/quirks.c | 6 +++++ > > drivers/usb/host/xhci-mem.c | 2 ++ > > drivers/usb/host/xhci-ring.c | 11 +++++----- > > drivers/usb/host/xhci.c | 22 ++++++++++++++----- > > drivers/usb/host/xhci.h | 9 ++++++-- > > include/linux/usb/hcd.h | 5 +++-- > > include/linux/usb/quirks.h | 3 +++ > > 9 files changed, 65 insertions(+), 17 deletions(-) > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > index 0a1731a0f0ef..3c03f23bd5d5 100644 > > --- a/Documentation/admin-guide/kernel-parameters.txt > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > @@ -6817,6 +6817,9 @@ > > pause after every control message); > > o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra > > delay after resetting its port); > > + p = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT (Timeout > > + of set_address command reducing from > > + 5000 ms to 500 ms) > > Example: quirks=0781:5580:bk,0a5c:5834:gij > > > > usbhid.mousepoll= > > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c > > index 3c54b218301c..83d1af0a3953 100644 > > --- a/drivers/usb/core/hub.c > > +++ b/drivers/usb/core/hub.c > > @@ -54,6 +54,18 @@ > > #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */ > > #define USB_PING_RESPONSE_TIME 400 /* ns */ > > > > +/* > > + * address device command timeout 5000 ms is recommended in > > + * USB 2.0 spec, section 9.2.6.1 > > The 2.0 spec is superseeded by the USB 3.1 specification (or is it 3.2 > now?) Please use the latest specification as 2.0 is very very old by > now. will update to USB3 specs. > > > + */ > > +#define USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS 5000 /* ms */ > > + > > +/* > > + * address device command timeout will be 500 ms when > > + * USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT enable. > > + */ > > +#define USB_SHORT_ADDR_DEVICE_TIMEOUT_MS 500 /* ms */ > > + > > /* Protect struct usb_device->state and ->children members > > * Note: Both are also protected by ->dev.sem, except that ->state can > > * change to USB_STATE_NOTATTACHED even when the semaphore isn't held. */ > > @@ -4626,7 +4638,12 @@ EXPORT_SYMBOL_GPL(usb_ep0_reinit); > > static int hub_set_address(struct usb_device *udev, int devnum) > > { > > int retval; > > + unsigned int timeout_ms = USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS; > > struct usb_hcd *hcd = bus_to_hcd(udev->bus); > > + struct usb_hub *hub = usb_hub_to_struct_hub(udev->parent); > > + > > + if (hub->hdev->quirks & USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT) > > + timeout_ms = USB_SHORT_ADDR_DEVICE_TIMEOUT_MS; > > > > /* > > * The host controller will choose the device address, > > @@ -4639,11 +4656,11 @@ static int hub_set_address(struct usb_device *udev, int devnum) > > if (udev->state != USB_STATE_DEFAULT) > > return -EINVAL; > > if (hcd->driver->address_device) > > - retval = hcd->driver->address_device(hcd, udev); > > + retval = hcd->driver->address_device(hcd, udev, timeout_ms); > > else > > retval = usb_control_msg(udev, usb_sndaddr0pipe(), > > USB_REQ_SET_ADDRESS, 0, devnum, 0, > > - NULL, 0, USB_CTRL_SET_TIMEOUT); > > + NULL, 0, timeout_ms); > > if (retval == 0) { > > update_devnum(udev, devnum); > > /* Device now using proper address. */ > > diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c > > index 15e9bd180a1d..863e7fe24157 100644 > > --- a/drivers/usb/core/quirks.c > > +++ b/drivers/usb/core/quirks.c > > @@ -138,6 +138,9 @@ static int quirks_param_set(const char *value, const struct kernel_param *kp) > > case 'o': > > flags |= USB_QUIRK_HUB_SLOW_RESET; > > break; > > + case 'p': > > + flags |= USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT; > > + break; > > /* Ignore unrecognized flag characters */ > > } > > } > > @@ -527,6 +530,9 @@ static const struct usb_device_id usb_quirk_list[] = { > > > > { USB_DEVICE(0x2386, 0x350e), .driver_info = USB_QUIRK_NO_LPM }, > > > > + /* APTIV AUTOMOTIVE HUB */ > > + { USB_DEVICE(0x2c48, 0x0132), .driver_info = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT }, > > So the real issue that you have here is a broken built-in USB hub that > does not error out quick enough, right? Why not fix the firmware in > that hub as you know it's broken? Why is it the operating system's job > to work around non-compliant devices? > > Ok, that last question was redundant, of course it's our job to work > around broken devices, but this feels different. You are trying to say > "hey, I know this device is broken, so error out quick so we can just > ignore it", right? If so, why not just never allow that device to > enumerate at all? You don't have to accept it as a valid device to the > system (just don't authorize it), and then no device will ever connect > to it so what is the delay issue? > We require the device (HUB) as it is a vital component of our infotainment system. The issue arises when the host has already issued the 'set_address' command to the hub, assuming the device is operating at full speed. However, in such cases, when the device connected to the hub changes its state from full speed to high-speed, the 'set_address' request becomes blocked, waiting for the full 5-second timeout. This patch reduces the timeout from 5 seconds to 500 milliseconds, allowing for quicker failure and re-enumeration of the device as high-speed > > > + > > /* DJI CineSSD */ > > { USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM }, > > > > diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c > > index 8714ab5bf04d..4a286136d1a8 100644 > > --- a/drivers/usb/host/xhci-mem.c > > +++ b/drivers/usb/host/xhci-mem.c > > @@ -1729,6 +1729,8 @@ struct xhci_command *xhci_alloc_command(struct xhci_hcd *xhci, > > } > > > > command->status = 0; > > + /* set default timeout to 5000 ms */ > > + command->timeout_ms = XHCI_CMD_DEFAULT_TIMEOUT_MS; > > INIT_LIST_HEAD(&command->cmd_list); > > return command; > > } > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > > index 1dde53f6eb31..8f36c2914938 100644 > > --- a/drivers/usb/host/xhci-ring.c > > +++ b/drivers/usb/host/xhci-ring.c > > @@ -366,9 +366,10 @@ void xhci_ring_cmd_db(struct xhci_hcd *xhci) > > readl(&xhci->dba->doorbell[0]); > > } > > > > -static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci, unsigned long delay) > > +static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci) > > { > > - return mod_delayed_work(system_wq, &xhci->cmd_timer, delay); > > + return mod_delayed_work(system_wq, &xhci->cmd_timer, > > + msecs_to_jiffies(xhci->current_cmd->timeout_ms)); > > } > > > > static struct xhci_command *xhci_next_queued_cmd(struct xhci_hcd *xhci) > > @@ -412,7 +413,7 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, > > if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) && > > !(xhci->xhc_state & XHCI_STATE_DYING)) { > > xhci->current_cmd = cur_cmd; > > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > > + xhci_mod_cmd_timer(xhci); > > xhci_ring_cmd_db(xhci); > > } > > } > > @@ -1786,7 +1787,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci, > > if (!list_is_singular(&xhci->cmd_list)) { > > xhci->current_cmd = list_first_entry(&cmd->cmd_list, > > struct xhci_command, cmd_list); > > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > > + xhci_mod_cmd_timer(xhci); > > } else if (xhci->current_cmd == cmd) { > > xhci->current_cmd = NULL; > > } > > @@ -4301,7 +4302,7 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd, > > /* if there are no other commands queued we start the timeout timer */ > > if (list_empty(&xhci->cmd_list)) { > > xhci->current_cmd = cmd; > > - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); > > + xhci_mod_cmd_timer(xhci); > > } > > > > list_add_tail(&cmd->cmd_list, &xhci->cmd_list); > > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c > > index e1b1b64a0723..85ea4e17d2a0 100644 > > --- a/drivers/usb/host/xhci.c > > +++ b/drivers/usb/host/xhci.c > > @@ -3998,11 +3998,18 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev) > > } > > > > /* > > - * Issue an Address Device command and optionally send a corresponding > > - * SetAddress request to the device. > > + * This function issues an Address Device command to assign a unique USB bus > > + * address. Optionally, it sends a SetAddress request. > > + * > > + * @param hcd USB host controller data structure. > > + * @param udev USB device structure representing the connected device. > > + * @param setup Enum specifying setup mode: address only or with context. > > + * @param timeout_ms Max wait time (ms) for the command operation to complete. > > "param" is not how kernel doc formatting works at all, sorry. If you > are going to document this that way, please use the correct style > otherwise our tools will choke. > will check this. > > + * > > + * @return Integer status code: 0 on success, negative on error. > > */ > > static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > > - enum xhci_setup_dev setup) > > + enum xhci_setup_dev setup, unsigned int timeout_ms) > > { > > const char *act = setup == SETUP_CONTEXT_ONLY ? "context" : "address"; > > unsigned long flags; > > @@ -4059,6 +4066,7 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > > } > > > > command->in_ctx = virt_dev->in_ctx; > > + command->timeout_ms = timeout_ms; > > > > slot_ctx = xhci_get_slot_ctx(xhci, virt_dev->in_ctx); > > ctrl_ctx = xhci_get_input_control_ctx(virt_dev->in_ctx); > > @@ -4185,14 +4193,16 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, > > return ret; > > } > > > > -static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev) > > +static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev, > > + unsigned int timeout_ms) > > { > > - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS); > > + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS, timeout_ms); > > } > > > > static int xhci_enable_device(struct usb_hcd *hcd, struct usb_device *udev) > > { > > - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY); > > + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY, > > + XHCI_CMD_DEFAULT_TIMEOUT_MS); > > } > > > > /* > > diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h > > index 7e282b4522c0..ec5c663246e5 100644 > > --- a/drivers/usb/host/xhci.h > > +++ b/drivers/usb/host/xhci.h > > @@ -818,6 +818,8 @@ struct xhci_command { > > struct completion *completion; > > union xhci_trb *command_trb; > > struct list_head cmd_list; > > + /* xHCI command response timeout in milliseconds */ > > + unsigned int timeout_ms; > > }; > > > > /* drop context bitmasks */ > > @@ -1576,8 +1578,11 @@ struct xhci_td { > > unsigned int num_trbs; > > }; > > > > -/* xHCI command default timeout value */ > > -#define XHCI_CMD_DEFAULT_TIMEOUT (5 * HZ) > > +/* > > + * xHCI command default timeout value in milliseconds. > > + * USB 2.0 spec, section 9.2.6.1 > > xHCI came about in the 3.0 specification, it was not around in the 2.0 > one, right? will update to usb3 specs > > > + */ > > +#define XHCI_CMD_DEFAULT_TIMEOUT_MS 5000 > > > > /* command descriptor */ > > struct xhci_cd { > > diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h > > index 61d4f0b793dc..d0e19ac3ba6c 100644 > > --- a/include/linux/usb/hcd.h > > +++ b/include/linux/usb/hcd.h > > @@ -372,8 +372,9 @@ struct hc_driver { > > * or bandwidth constraints. > > */ > > void (*reset_bandwidth)(struct usb_hcd *, struct usb_device *); > > - /* Returns the hardware-chosen device address */ > > - int (*address_device)(struct usb_hcd *, struct usb_device *udev); > > + /* Set the hardware-chosen device address */ > > + int (*address_device)(struct usb_hcd *, struct usb_device *udev, > > + unsigned int timeout_ms); > > Did this function callback just change operation? If not, why change > the comment? Or has the comment always been wrong? It was written like "Returns the Hardware device addres" that sounds wrong. It only return success or failure. > > > /* prepares the hardware to send commands to the device */ > > int (*enable_device)(struct usb_hcd *, struct usb_device *udev); > > /* Notifies the HCD after a hub descriptor is fetched. > > diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h > > index eeb7c2157c72..0cb464e3eaf4 100644 > > --- a/include/linux/usb/quirks.h > > +++ b/include/linux/usb/quirks.h > > @@ -72,4 +72,7 @@ > > /* device has endpoints that should be ignored */ > > #define USB_QUIRK_ENDPOINT_IGNORE BIT(15) > > > > +/* short device address timeout */ > > +#define USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT BIT(16) > > As you really just want to fail this device, why not just make a "This > is a broken device, never talk to it" type of quirk instead? > > thanks, > > greg k-h We need the device(hub) because It is working as expected except this one scenario The quirk is for fail fast(in-case of error) and retry next attempt quickly(perhapse as high speed device enumeration). As mentioned earlier this is happning when end device connected to Hub change its state while the set_address request already received to hub from host. Thanks, Hardik
On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote: > On Mon, Oct 16, 2023 at 07:58:36PM +0200, Greg KH wrote: > > On Wed, Oct 11, 2023 at 06:45:25PM +0200, Hardik Gajjar wrote: > > > Currently, the timeout for the set address command is fixed at > > > 5 seconds in the xhci driver. This means the host waits up to 5 > > > seconds to receive a response for the set_address command from > > > the device. > > > > > > In the automotive context, most smartphone enumerations, including > > > screen projection, should ideally complete within 3 seconds. > > > > "should" according to whom? That goes against the USB specification, so > > why not take it up with them? > We're implementing this change to meet Google's Android Auto certification > requirements, even though these requirements aren't publicly available. Then perhaps Google needs to talk about this somewhere if they are forcing us to take such an out-of-spec requirement? > The issue arises when we connect an Android phone to a Linux host using a special > automotive USB hub. The host's xHCI doesn't receive a response from the hub after > sending a 'set_address' request, resulting in a timeout. This problem is specific > to one phone model. > > It's worth noting that we can't argue that this requirement violates USB standards > because the 3-second time limit applies only when connecting an Android mobile phone. > I assume that Android phones also undergo Google certification, which likely includes > a condition for successful enumeration within a specified time frame. However, I can't confirm this. Android's CTS/VTS testing is pretty public, but huge, so you might want to dig into that and see if they really test for this or not. > More logs and detailed in patch V1: > https://lore.kernel.org/linux-usb/20230818092353.124658-1-hgajjar@de.adit-jv.com/T/#m452ec9dad94e8181fdb050cd29483dd89437f7c1 > > > > > Achieving this is impossible in scenarios where the set_address is > > > not successful and waits for a timeout. > > > > Agreed, broken hardware is a pain, but if your device is allowed to take > > longer, it can, and will, so you have to support that. > > > The problem is not caused by the device taking an extended amount of time to > process the 'set_address' request. Instead, the issue lies in the absence of > any activity on the upstream bus until a timeout occurs. So, a broken device. Why are you then adding the hub to the quirk list and not the broken device? We are used to adding broken devices to qurik lists all the time, this shouldn't be new. > This situation arises when the host has already transmitted the 'set_address' command to the hub, > assuming that the device operates at full speed. However, the device connected > to the hub undergoes a state change from full speed to high-speed during this process. During which process? While the set-address happens? That feels like a hub bug then. > > > The shortened address device timeout quirks provide the flexibility > > > to align with a 3-second time limit in the event of errors. > > > By swiftly triggering a failure response and swiftly initiating > > > retry procedures, these quirks ensure efficient and rapid recovery, > > > particularly in automotive contexts where rapid smartphone enumeration > > > and screen projection are vital. > > > > Screen projection is a requirement that you should not be relying on USB > > for as USB has a different set of required timeouts, right? This sounds > > like a bad hardware design, if not an impossible one. > > > > Screen projection for us means displaying the connected phone on the screen and > launching Carplay and Android Auto for the user. This works perfectly in nearly all > cases, except in scenarios like this one where a combination of a special hub and > a specific phone model is causing the issue So which is broken, the hub or phone? > > So the real issue that you have here is a broken built-in USB hub that > > does not error out quick enough, right? Why not fix the firmware in > > that hub as you know it's broken? Why is it the operating system's job > > to work around non-compliant devices? > > > > Ok, that last question was redundant, of course it's our job to work > > around broken devices, but this feels different. You are trying to say > > "hey, I know this device is broken, so error out quick so we can just > > ignore it", right? If so, why not just never allow that device to > > enumerate at all? You don't have to accept it as a valid device to the > > system (just don't authorize it), and then no device will ever connect > > to it so what is the delay issue? > > > > We require the device (HUB) as it is a vital component of our infotainment system. Great, then you can fix the firmware in it! > The issue arises when the host has already issued the 'set_address' command to the hub, > assuming the device is operating at full speed. However, in such cases, when the device > connected to the hub changes its state from full speed to high-speed, the 'set_address' > request becomes blocked, waiting for the full 5-second timeout. This patch reduces the > timeout from 5 seconds to 500 milliseconds, allowing for quicker failure and re-enumeration > of the device as high-speed Changing speed is under hub control, not device control, right? Are you sure the firmware is correct in that hub? Has the hub passed all of the USB-IF testing requirements? thanks, greg k-h
On Tue, Oct 17, 2023 at 06:53:44PM +0200, Greg KH wrote: > On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote: > > More logs and detailed in patch V1: > > https://lore.kernel.org/linux-usb/20230818092353.124658-1-hgajjar@de.adit-jv.com/T/#m452ec9dad94e8181fdb050cd29483dd89437f7c1 > > > > > > > Achieving this is impossible in scenarios where the set_address is > > > > not successful and waits for a timeout. > > > > > > Agreed, broken hardware is a pain, but if your device is allowed to take > > > longer, it can, and will, so you have to support that. > > > > > The problem is not caused by the device taking an extended amount of time to > > process the 'set_address' request. Instead, the issue lies in the absence of > > any activity on the upstream bus until a timeout occurs. > > So, a broken device. Why are you then adding the hub to the quirk list > and not the broken device? We are used to adding broken devices to > qurik lists all the time, this shouldn't be new. Adding a quirk for the device isn't feasible, because the problem occurs before the device has been initialized and enumerated. The kernel doesn't know anything about the device at this point; only that it has just connected. > > This situation arises when the host has already transmitted the 'set_address' command to the hub, > > assuming that the device operates at full speed. However, the device connected > > to the hub undergoes a state change from full speed to high-speed during this process. > > During which process? While the set-address happens? That feels like a > hub bug then. > > > > > The shortened address device timeout quirks provide the flexibility > > > > to align with a 3-second time limit in the event of errors. > > > > By swiftly triggering a failure response and swiftly initiating > > > > retry procedures, these quirks ensure efficient and rapid recovery, > > > > particularly in automotive contexts where rapid smartphone enumeration > > > > and screen projection are vital. > > > > > > Screen projection is a requirement that you should not be relying on USB > > > for as USB has a different set of required timeouts, right? This sounds > > > like a bad hardware design, if not an impossible one. > > > > > > > Screen projection for us means displaying the connected phone on the screen and > > launching Carplay and Android Auto for the user. This works perfectly in nearly all > > cases, except in scenarios like this one where a combination of a special hub and > > a specific phone model is causing the issue > > So which is broken, the hub or phone? It sounds like both of them are broken to some extent, although we can't tell for sure without seeing what's actually happening on the USB bus (i.e., bus analyzer output): The phone seems to take too long to activate its high-speed terminations and deactivate the full-speed terminations. The hub doesn't seem to realize that the phone has disconnected its full-speed connection and switched to high-speed. But without real data, these are just best guesses. > > The issue arises when the host has already issued the 'set_address' command to the hub, > > assuming the device is operating at full speed. However, in such cases, when the device > > connected to the hub changes its state from full speed to high-speed, the 'set_address' > > request becomes blocked, waiting for the full 5-second timeout. This patch reduces the > > timeout from 5 seconds to 500 milliseconds, allowing for quicker failure and re-enumeration > > of the device as high-speed > > Changing speed is under hub control, not device control, right? Are you Changing speed is under device control. But of course, the hub has to detect the change and react to it properly. Alan Stern > sure the firmware is correct in that hub? Has the hub passed all of the > USB-IF testing requirements? > > thanks, > > greg k-h
On Tue, Oct 17, 2023 at 02:59:54PM -0400, Alan Stern wrote: > On Tue, Oct 17, 2023 at 06:53:44PM +0200, Greg KH wrote: > > On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote: > > > More logs and detailed in patch V1: > > > https://lore.kernel.org/linux-usb/20230818092353.124658-1-hgajjar@de.adit-jv.com/T/#m452ec9dad94e8181fdb050cd29483dd89437f7c1 > > > > > > > > > Achieving this is impossible in scenarios where the set_address is > > > > > not successful and waits for a timeout. > > > > > > > > Agreed, broken hardware is a pain, but if your device is allowed to take > > > > longer, it can, and will, so you have to support that. > > > > > > > The problem is not caused by the device taking an extended amount of time to > > > process the 'set_address' request. Instead, the issue lies in the absence of > > > any activity on the upstream bus until a timeout occurs. > > > > So, a broken device. Why are you then adding the hub to the quirk list > > and not the broken device? We are used to adding broken devices to > > qurik lists all the time, this shouldn't be new. > > Adding a quirk for the device isn't feasible, because the problem occurs > before the device has been initialized and enumerated. The kernel > doesn't know anything about the device at this point; only that it has > just connected. Ah, ick, you are right, but we do know the "broken hub" id, so that makes a bit more sense. Should this be a hub-only type quirk? > > > This situation arises when the host has already transmitted the 'set_address' command to the hub, > > > assuming that the device operates at full speed. However, the device connected > > > to the hub undergoes a state change from full speed to high-speed during this process. > > > > During which process? While the set-address happens? That feels like a > > hub bug then. > > > > > > > The shortened address device timeout quirks provide the flexibility > > > > > to align with a 3-second time limit in the event of errors. > > > > > By swiftly triggering a failure response and swiftly initiating > > > > > retry procedures, these quirks ensure efficient and rapid recovery, > > > > > particularly in automotive contexts where rapid smartphone enumeration > > > > > and screen projection are vital. > > > > > > > > Screen projection is a requirement that you should not be relying on USB > > > > for as USB has a different set of required timeouts, right? This sounds > > > > like a bad hardware design, if not an impossible one. > > > > > > > > > > Screen projection for us means displaying the connected phone on the screen and > > > launching Carplay and Android Auto for the user. This works perfectly in nearly all > > > cases, except in scenarios like this one where a combination of a special hub and > > > a specific phone model is causing the issue > > > > So which is broken, the hub or phone? > > It sounds like both of them are broken to some extent, although we can't > tell for sure without seeing what's actually happening on the USB bus > (i.e., bus analyzer output): > > The phone seems to take too long to activate its high-speed > terminations and deactivate the full-speed terminations. > > The hub doesn't seem to realize that the phone has disconnected > its full-speed connection and switched to high-speed. > > But without real data, these are just best guesses. Agreed, Hardik, can you look at some bus traces to figure out where the root problem here is? thanks, greg k-h
On Sat, Oct 21, 2023 at 12:15:35PM +0200, Greg KH wrote: > On Tue, Oct 17, 2023 at 02:59:54PM -0400, Alan Stern wrote: > > On Tue, Oct 17, 2023 at 06:53:44PM +0200, Greg KH wrote: > > > On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote: > > > > More logs and detailed in patch V1: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dusb_20230818092353.124658-2D1-2Dhgajjar-40de.adit-2Djv.com_T_-23m452ec9dad94e8181fdb050cd29483dd89437f7c1&d=DwICAg&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=SAhjP5GOmrADp1v_EE5jWoSuMlYCIt9gKduw-DCBPLs&m=P0HXZTx6ta7v5M4y2Y7WZkPrY-dpKkxBq8tAzuX8cI9aj9tE2NuVvJjLl3Uvojpw&s=N_HwnQeZb_gHMmgz53uTGDUZVi28EXb1l9Pg6PdbvVI&e= > > > > > > > > > > > Achieving this is impossible in scenarios where the set_address is > > > > > > not successful and waits for a timeout. > > > > > > > > > > Agreed, broken hardware is a pain, but if your device is allowed to take > > > > > longer, it can, and will, so you have to support that. > > > > > > > > > The problem is not caused by the device taking an extended amount of time to > > > > process the 'set_address' request. Instead, the issue lies in the absence of > > > > any activity on the upstream bus until a timeout occurs. > > > > > > So, a broken device. Why are you then adding the hub to the quirk list > > > and not the broken device? We are used to adding broken devices to > > > qurik lists all the time, this shouldn't be new. > > > > Adding a quirk for the device isn't feasible, because the problem occurs > > before the device has been initialized and enumerated. The kernel > > doesn't know anything about the device at this point; only that it has > > just connected. > > Ah, ick, you are right, but we do know the "broken hub" id, so that > makes a bit more sense. Should this be a hub-only type quirk? In addition to the earlier comment, it appears that the issue is most likely related to the hub. While we have identified one specific phone that triggers this problem, we cannot determine how many other devices might encounter a similar issue, where they enumerate as full speed initially and then switch to high speed. To address this, we are proposing to use a 500 ms timeout for all devices connected via the hub. This change aims to prevent potential timeout-related problems with various devices > > > > > This situation arises when the host has already transmitted the 'set_address' command to the hub, > > > > assuming that the device operates at full speed. However, the device connected > > > > to the hub undergoes a state change from full speed to high-speed during this process. > > > > > > During which process? While the set-address happens? That feels like a > > > hub bug then. > > > > > > > > > The shortened address device timeout quirks provide the flexibility > > > > > > to align with a 3-second time limit in the event of errors. > > > > > > By swiftly triggering a failure response and swiftly initiating > > > > > > retry procedures, these quirks ensure efficient and rapid recovery, > > > > > > particularly in automotive contexts where rapid smartphone enumeration > > > > > > and screen projection are vital. > > > > > > > > > > Screen projection is a requirement that you should not be relying on USB > > > > > for as USB has a different set of required timeouts, right? This sounds > > > > > like a bad hardware design, if not an impossible one. > > > > > > > > > > > > > Screen projection for us means displaying the connected phone on the screen and > > > > launching Carplay and Android Auto for the user. This works perfectly in nearly all > > > > cases, except in scenarios like this one where a combination of a special hub and > > > > a specific phone model is causing the issue > > > > > > So which is broken, the hub or phone? > > > > It sounds like both of them are broken to some extent, although we can't > > tell for sure without seeing what's actually happening on the USB bus > > (i.e., bus analyzer output): > > > > The phone seems to take too long to activate its high-speed > > terminations and deactivate the full-speed terminations. > > > > The hub doesn't seem to realize that the phone has disconnected > > its full-speed connection and switched to high-speed.0 > > > > But without real data, these are just best guesses. > > Agreed, Hardik, can you look at some bus traces to figure out where the > root problem here is? It does appear that the issue is related to the hub, and the ideal solution would involve modifying the hub's firmware. However, implementing such a firmware fix is currently not a straightforward task. As a result, we have implemented this quirk-based solution to mitigate the issue to some extent Following is the LeCroy analyzer logs: 1. logs between Hub and phone with broken hub. In packet 58, there is a Full-speed J (suspend) event that lasted for 5.347 seconds. It's suspected that the hub was suspended due to incorrect chirp parsing. This anomaly in chirp parsing may be a contributing factor to the issue we're facing. Packet(0) _______| Dir SE0?(?????)-BAD Idle(0.000 ns) Time Stamp(44 . 964 254 882) _______|_______________________________________________________________________Ch0 10 Packets(1-10) Dir Chirp J(27.233 sec) Chirp J Chirp K _______| Time( 27.905 sec) Time Stamp(0 . 000 001 050) _______|_______________________________________________________________________Ch0 Packet(11) Dir(?) Full Speed J (Suspend)(151.614 ms) Idle(50.000 ns) _______| Time Stamp(27 . 905 210 900) _______|_______________________________________________________________________Ch0 Packet(12) Dir Chirp J( 10.923 ms) Idle( 11.866 us) _______| Time Stamp(28 . 056 825 150) _______|_______________________________________________________________________Ch0 Packet(13) Dir(?) Full Speed J( 2.167 us) Idle( 11.401 us) _______| Time Stamp(28 . 067 759 916) _______|_______________________________________________________________________Ch0 Packet(14) Dir(?) Full Speed J( 2.167 us) Idle( 11.399 us) _______| Time Stamp(28 . 067 773 484) _______|_______________________________________________________________________Ch0 Packet(15) Dir(?) Full Speed J(122.082 us) Idle( 2.834 us) _______| Time Stamp(28 . 067 787 050) _______|_______________________________________________________________________Ch0 Packet(16) Dir(?) Full Speed J(997.100 us) Idle( 2.850 us) _______| Time Stamp(28 . 067 911 966) _______|_______________________________________________________________________Ch0 Packet(17) Dir(?) Full Speed J(997.100 us) Idle( 2.850 us) _______| Time Stamp(28 . 068 911 916) _______|_______________________________________________________________________Ch0 Packet(18) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(28 . 069 911 866) _______|_______________________________________________________________________Ch0 Packet(19) Dir(?) Full Speed J(997.100 us) Idle( 2.850 us) _______| Time Stamp(28 . 070 911 850) _______|_______________________________________________________________________Ch0 Packet(20) Dir(?) Full Speed J(997.132 us) Idle( 2.850 us) _______| Time Stamp(28 . 071 911 800) _______|_______________________________________________________________________Ch0 Packet(21) Dir(?) Full Speed J(997.068 us) Idle( 2.850 us) _______| Time Stamp(28 . 072 911 782) _______|_______________________________________________________________________Ch0 Packet(22) Dir(?) Full Speed J(997.200 us) Idle( 2.832 us) _______| Time Stamp(28 . 073 911 700) _______|_______________________________________________________________________Ch0 Packet(23) Dir(?) Full Speed J(997.118 us) Idle( 2.832 us) _______| Time Stamp(28 . 074 911 732) _______|_______________________________________________________________________Ch0 Packet(24) Dir(?) Full Speed J(997.134 us) Idle( 2.834 us) _______| Time Stamp(28 . 075 911 682) _______|_______________________________________________________________________Ch0 Packet(25) Dir(?) Full Speed J(997.132 us) Idle( 2.834 us) _______| Time Stamp(28 . 076 911 650) _______|_______________________________________________________________________Ch0 Packet(26) Dir(?) Full Speed J(997.134 us) Idle( 2.832 us) _______| Time Stamp(28 . 077 911 616) _______|_______________________________________________________________________Ch0 Packet(27) Dir(?) Full Speed J(997.134 us) Idle( 2.834 us) _______| Time Stamp(28 . 078 911 582) _______|_______________________________________________________________________Ch0 Packet(28) Dir(?) Full Speed J(997.116 us) Idle( 2.834 us) _______| Time Stamp(28 . 079 911 550) _______|_______________________________________________________________________Ch0 Packet(29) Dir(?) Full Speed J(997.150 us) Idle( 2.832 us) _______| Time Stamp(28 . 080 911 500) _______|_______________________________________________________________________Ch0 Packet(30) Dir(?) Full Speed J(997.118 us) Idle( 2.832 us) _______| Time Stamp(28 . 081 911 482) _______|_______________________________________________________________________Ch0 Packet(31) Dir(?) Full Speed J(997.150 us) Idle( 2.834 us) _______| Time Stamp(28 . 082 911 432) _______|_______________________________________________________________________Ch0 Packet(32) Dir(?) Full Speed J(997.116 us) Idle( 2.918 us) _______| Time Stamp(28 . 083 911 416) _______|_______________________________________________________________________Ch0 Packet(33) Dir(?) Full Speed J(997.032 us) Idle( 2.918 us) _______| Time Stamp(28 . 084 911 450) _______|_______________________________________________________________________Ch0 Packet(34) Dir(?) Full Speed J(997.066 us) Idle( 2.834 us) _______| Time Stamp(28 . 085 911 400) _______|_______________________________________________________________________Ch0 Packet(35) Dir(?) Full Speed J(997.082 us) Idle( 2.850 us) _______| Time Stamp(28 . 086 911 300) _______|_______________________________________________________________________Ch0 Packet(36) Dir(?) Full Speed J(997.118 us) Idle( 2.850 us) _______| Time Stamp(28 . 087 911 232) _______|_______________________________________________________________________Ch0 Packet(37) Dir(?) Full Speed J(997.182 us) Idle( 2.850 us) _______| Time Stamp(28 . 088 911 200) _______|_______________________________________________________________________Ch0 Packet(38) Dir(?) Full Speed J(997.118 us) Idle( 2.850 us) _______| Time Stamp(28 . 089 911 232) _______|_______________________________________________________________________Ch0 Packet(39) Dir(?) Full Speed J(997.100 us) Idle( 2.832 us) _______| Time Stamp(28 . 090 911 200) _______|_______________________________________________________________________Ch0 Packet(40) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(28 . 091 911 132) _______|_______________________________________________________________________Ch0 Packet(41) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(28 . 092 911 116) _______|_______________________________________________________________________Ch0 Packet(42) Dir(?) Full Speed J(997.118 us) Idle( 2.850 us) _______| Time Stamp(28 . 093 911 082) _______|_______________________________________________________________________Ch0 Packet(43) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(28 . 094 911 050) _______|_______________________________________________________________________Ch0 Packet(44) Dir(?) Full Speed J(997.100 us) Idle( 2.834 us) _______| Time Stamp(28 . 095 911 016) _______|_______________________________________________________________________Ch0 Packet(45) Dir(?) Full Speed J(997.150 us) Idle( 2.850 us) _______| Time Stamp(28 . 096 910 950) _______|_______________________________________________________________________Ch0 Packet(46) Dir(?) Full Speed J(997.100 us) Idle( 2.832 us) _______| Time Stamp(28 . 097 910 950) _______|_______________________________________________________________________Ch0 Packet(47) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(28 . 098 910 882) _______|_______________________________________________________________________Ch0 Packet(48) Dir(?) Full Speed J(997.116 us) Idle( 2.834 us) _______| Time Stamp(28 . 099 910 866) _______|_______________________________________________________________________Ch0 Packet(49) Dir(?) Full Speed J(997.116 us) Idle( 2.834 us) _______| Time Stamp(28 . 100 910 816) _______|_______________________________________________________________________Ch0 Packet(50) Dir(?) Full Speed J(997.100 us) Idle( 2.850 us) _______| Time Stamp(28 . 101 910 766) _______|_______________________________________________________________________Ch0 Packet(51) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(28 . 102 910 716) _______|_______________________________________________________________________Ch0 Packet(52) Dir(?) Full Speed J(997.184 us) Idle( 2.850 us) _______| Time Stamp(28 . 103 910 682) _______|_______________________________________________________________________Ch0 Packet(53) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(28 . 104 910 716) _______|_______________________________________________________________________Ch0 Packet(54) Dir(?) Full Speed J(997.100 us) Idle( 2.850 us) _______| Time Stamp(28 . 105 910 682) _______|_______________________________________________________________________Ch0 Packet(55) Dir(?) Full Speed J(997.118 us) Idle( 2.850 us) _______| Time Stamp(28 . 106 910 632) _______|_______________________________________________________________________Ch0 Packet(56) Dir(?) Full Speed J(399.650 us) Idle(222.582 us) _______| Time Stamp(28 . 107 910 600) _______|_______________________________________________________________________Ch0 Packet(57) Dir Chirp J( 23.955 ms) Idle(115.169 ms) _______| Time Stamp(28 . 108 532 832) _______|_______________________________________________________________________Ch0 Packet(58) Dir(?) Full Speed J (Suspend)( 5.347 sec) Idle( 5.366 us) _______| Time Stamp(28 . 247 657 600) _______|_______________________________________________________________________Ch0 173 Packets(59-231) Dir(?) Chirp K( 2.000 ms) Chirp J Chirp K _______| Time( 10.918 ms) Time Stamp(33 . 594 605 882) _______|_______________________________________________________________________Ch0 Packet(232) Dir H(S) SPLIT(0x1E) SC(1) Hub Addr(1) Port(3) S(0) U(0) _______| ET(Ctrl) CRC5(0x01) Pkt Len(9) Duration(150.000 ns) Idle(200.000 ns) _______| Time Stamp(33 . 605 524 366) _______|_______________________________________________________________________Ch0 2. Following is the logs between Host and Hub: In transfer 12, the Full-Speed (FS) SET Address request commenced at 8.995 seconds, following an SSPLIT request. Unfortunately, it eventually failed in Transaction 2866864 at 14.784 seconds with a Timeout error, which is nearly a 5-second duration. It appears that the timeout occurred due to a suspension in the upstream logs. Transfer(12) F(H) Control(SET) ADDR(0) ENDP(0) bRequest(SET_ADDRESS) _______| wValue(New address 4) wIndex(0x0000) wLength(0) STALL(0x08) _______| Time Stamp(8 . 955 781 832) _______|_______________________________________________________________________ Split Trans(0) F(H) SETUP(0xB4) ADDR(0) ENDP(0) T(0) D(H->D) Tp(S) R(D) _______| bRequest(0x05) wValue(0x0004) wIndex(0x0000) wLength(0) STALL(0x78) _______| Time Stamp(8 . 955 781 832) _______|_______________________________________________________________________ Transaction(307) H(S) SSplit(Ctrl) Hub Addr(1) Port(2) Speed(Full) SETUP(0xB4) _______| ADDR(0) ENDP(0) T(0) D(H->D) Tp(S) R(D) bRequest(0x05) wValue(0x0004) _______| wIndex(0x0000) wLength(0) ACK(0x4B) Time Stamp(8 . 955 781 832) _______|_______________________________________________________________________Ch0 Packet(72291) Dir H(S) SPLIT(0x1E) SC(0) Hub Addr(1) Port(2) S(0) E(0) _______| ET(Ctrl) CRC5(0x05) Pkt Len(9) Duration(150.000 ns) Idle(200.000 ns) _______| Time Stamp(8 . 955 781 832) _______|_______________________________________________________________________Ch0 Packet(72292) Dir H(S) SETUP(0xB4) ADDR(0) ENDP(0) CRC5(0x08) Pkt Len(8) _______| Duration(133.330 ns) Idle(200.660 ns) Time Stamp(8 . 955 782 182) _______|_______________________________________________________________________Ch0 Packet(72293) Dir H(S) DATA0(0xC3) Data(8 bytes) CRC16(0xD70E) _______| Pkt Len(16) Duration(266.660 ns) Idle(299.330 ns) _______| Time Stamp(8 . 955 782 516) _______|_______________________________________________________________________Ch0 Packet(72294) Dir H(S) ACK(0x4B) Pkt Len(6) Duration(100.000 ns) _______| Time(984.000 ns) Time Stamp(8 . 955 783 082) _______|_______________________________________________________________________ Transaction(308) H(S) CSplit(Ctrl) Hub Addr(1) Port(2) Speed(Full) SETUP(0xB4) _______| ADDR(0) ENDP(0) STALL(0x78) Time Stamp(8 . 955 784 066) _______|_______________________________________________________________________Ch0 Packet(72295) Dir H(S) SPLIT(0x1E) SC(1) Hub Addr(1) Port(2) S(0) U(0) _______| ET(Ctrl) CRC5(0x1E) Pkt Len(9) Duration(150.000 ns) Idle(200.000 ns) _______| Time Stamp(8 . 955 784 066) _______|_______________________________________________________________________Ch0 Packet(72296) Dir H(S) SETUP(0xB4) ADDR(0) ENDP(0) CRC5(0x08) Pkt Len(8) _______| Duration(133.330 ns) Idle(300.660 ns) Time Stamp(8 . 955 784 416) _______|_______________________________________________________________________Ch0 Packet(72297) Dir H(S) STALL(0x78) Pkt Len(6) Duration(100.000 ns) _______| Time( 5.829 sec) Time Stamp(8 . 955 784 850) _______|_______________________________________________________________________ Transaction(2866864) H(S) SETUP(0xB4) ADDR(0) ENDP(0) Cplt(NO) T(0) D(H->D) _______| Tp(S) R(D) bRequest(0x05) wValue(0x0000) wIndex(0x0000) wLength(0) _______| !! Propagated Error !!(Turnaround/Timeout Error) _______| Time Stamp(14 . 784 876 232) 3. Following is the logs between Hub and phone using normal hub As you can see, the suspend timeout is only ~300 ms (packet 65) compare to 5 sec in broken hub logs in point 1. _______| Time Stamp(5 . 247 278 566) _______|_______________________________________________________________________Ch0 Packet(54) Dir(?) Full Speed J(997.082 us) Idle( 2.850 us) _______| Time Stamp(5 . 248 278 550) _______|_______________________________________________________________________Ch0 Packet(55) Dir(?) Full Speed J(997.118 us) Idle( 2.832 us) _______| Time Stamp(5 . 249 278 482) _______|_______________________________________________________________________Ch0 Packet(56) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(5 . 250 278 432) _______|_______________________________________________________________________Ch0 Packet(57) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(5 . 251 278 416) _______|_______________________________________________________________________Ch0 Packet(58) Dir(?) Full Speed J(997.132 us) Idle( 2.834 us) _______| Time Stamp(5 . 252 278 400) _______|_______________________________________________________________________Ch0 Packet(59) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(5 . 253 278 366) _______|_______________________________________________________________________Ch0 Packet(60) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(5 . 254 278 332) _______|_______________________________________________________________________Ch0 Packet(61) Dir(?) Full Speed J(997.116 us) Idle( 2.834 us) _______| Time Stamp(5 . 255 278 316) _______|_______________________________________________________________________Ch0 Packet(62) Dir(?) Full Speed J(997.134 us) Idle( 2.850 us) _______| Time Stamp(5 . 256 278 266) _______|_______________________________________________________________________Ch0 Packet(63) Dir(?) Full Speed J(997.116 us) Idle( 2.850 us) _______| Time Stamp(5 . 257 278 250) _______|_______________________________________________________________________Ch0 Packet(64) Dir(?) Full Speed J(466.516 us) Idle(130.706 ms) _______| Time Stamp(5 . 258 278 216) _______|_______________________________________________________________________Ch0 Packet(65) Dir(?) Full Speed J (Suspend)(311.252 ms) Idle( 5.384 us) _______| Time Stamp(5 . 389 451 066) _______|_______________________________________________________________________Ch0 256 Packets(66-321) Dir(?) Chirp K( 2.000 ms) Chirp J Chirp K _______| Time( 15.979 ms) Time Stamp(5 . 700 708 316) _______|_______________________________________________________________________Ch0 Packet(330) Dir H(S) SETUP(0xB4) ADDR(1) ENDP(0) CRC5(0x17) Pkt Len(8) _______| Duration(133.330 ns) Idle(198.660 ns) Time Stamp(5 . 716 687 650) _______|_______________________________________________________________________Ch0 I uploaded the detailed logs on gdrive https://drive.google.com/file/d/1sbitUOIQTZ4XwbrcB0gAUucGq4ReSroW/view?usp=share_link > > thanks, > > greg k-h
On Mon, Oct 23, 2023 at 06:13:48PM +0200, Hardik Gajjar wrote: > On Sat, Oct 21, 2023 at 12:15:35PM +0200, Greg KH wrote: > > On Tue, Oct 17, 2023 at 02:59:54PM -0400, Alan Stern wrote: > > > On Tue, Oct 17, 2023 at 06:53:44PM +0200, Greg KH wrote: > > > > On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote: > > > > > More logs and detailed in patch V1: > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dusb_20230818092353.124658-2D1-2Dhgajjar-40de.adit-2Djv.com_T_-23m452ec9dad94e8181fdb050cd29483dd89437f7c1&d=DwICAg&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=SAhjP5GOmrADp1v_EE5jWoSuMlYCIt9gKduw-DCBPLs&m=P0HXZTx6ta7v5M4y2Y7WZkPrY-dpKkxBq8tAzuX8cI9aj9tE2NuVvJjLl3Uvojpw&s=N_HwnQeZb_gHMmgz53uTGDUZVi28EXb1l9Pg6PdbvVI&e= > > > > > > > > > > > > > Achieving this is impossible in scenarios where the set_address is > > > > > > > not successful and waits for a timeout. > > > > > > > > > > > > Agreed, broken hardware is a pain, but if your device is allowed to take > > > > > > longer, it can, and will, so you have to support that. > > > > > > > > > > > The problem is not caused by the device taking an extended amount of time to > > > > > process the 'set_address' request. Instead, the issue lies in the absence of > > > > > any activity on the upstream bus until a timeout occurs. > > > > > > > > So, a broken device. Why are you then adding the hub to the quirk list > > > > and not the broken device? We are used to adding broken devices to > > > > qurik lists all the time, this shouldn't be new. > > > > > > Adding a quirk for the device isn't feasible, because the problem occurs > > > before the device has been initialized and enumerated. The kernel > > > doesn't know anything about the device at this point; only that it has > > > just connected. > > > > Ah, ick, you are right, but we do know the "broken hub" id, so that > > makes a bit more sense. Should this be a hub-only type quirk? > > In addition to the earlier comment, it appears that the issue is most likely > related to the hub. While we have identified one specific phone that triggers > this problem, we cannot determine how many other devices might encounter a > similar issue, where they enumerate as full speed initially and then switch > to high speed. To address this, we are proposing to use a 500 ms timeout for > all devices connected via the hub. This change aims to prevent potential > timeout-related problems with various devices So it sounds like the best approach is to make this a hub-specific quirk. > It does appear that the issue is related to the hub, and the ideal solution would involve > modifying the hub's firmware. However, implementing such a firmware fix is currently not > a straightforward task. As a result, we have implemented this quirk-based solution to > mitigate the issue to some extent > > Following is the LeCroy analyzer logs: > > 1. logs between Hub and phone with broken hub. > > In packet 58, there is a Full-speed J (suspend) event that lasted for 5.347 seconds. > It's suspected that the hub was suspended due to incorrect chirp parsing. > This anomaly in chirp parsing may be a contributing factor to the issue we're facing. Yes, that's probably true. It's another indication that the hub is somehow at fault. Alan Stern
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 0a1731a0f0ef..3c03f23bd5d5 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -6817,6 +6817,9 @@ pause after every control message); o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra delay after resetting its port); + p = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT (Timeout + of set_address command reducing from + 5000 ms to 500 ms) Example: quirks=0781:5580:bk,0a5c:5834:gij usbhid.mousepoll= diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index 3c54b218301c..83d1af0a3953 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -54,6 +54,18 @@ #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */ #define USB_PING_RESPONSE_TIME 400 /* ns */ +/* + * address device command timeout 5000 ms is recommended in + * USB 2.0 spec, section 9.2.6.1 + */ +#define USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS 5000 /* ms */ + +/* + * address device command timeout will be 500 ms when + * USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT enable. + */ +#define USB_SHORT_ADDR_DEVICE_TIMEOUT_MS 500 /* ms */ + /* Protect struct usb_device->state and ->children members * Note: Both are also protected by ->dev.sem, except that ->state can * change to USB_STATE_NOTATTACHED even when the semaphore isn't held. */ @@ -4626,7 +4638,12 @@ EXPORT_SYMBOL_GPL(usb_ep0_reinit); static int hub_set_address(struct usb_device *udev, int devnum) { int retval; + unsigned int timeout_ms = USB_DEFAULT_ADDR_DEVICE_TIMEOUT_MS; struct usb_hcd *hcd = bus_to_hcd(udev->bus); + struct usb_hub *hub = usb_hub_to_struct_hub(udev->parent); + + if (hub->hdev->quirks & USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT) + timeout_ms = USB_SHORT_ADDR_DEVICE_TIMEOUT_MS; /* * The host controller will choose the device address, @@ -4639,11 +4656,11 @@ static int hub_set_address(struct usb_device *udev, int devnum) if (udev->state != USB_STATE_DEFAULT) return -EINVAL; if (hcd->driver->address_device) - retval = hcd->driver->address_device(hcd, udev); + retval = hcd->driver->address_device(hcd, udev, timeout_ms); else retval = usb_control_msg(udev, usb_sndaddr0pipe(), USB_REQ_SET_ADDRESS, 0, devnum, 0, - NULL, 0, USB_CTRL_SET_TIMEOUT); + NULL, 0, timeout_ms); if (retval == 0) { update_devnum(udev, devnum); /* Device now using proper address. */ diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 15e9bd180a1d..863e7fe24157 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -138,6 +138,9 @@ static int quirks_param_set(const char *value, const struct kernel_param *kp) case 'o': flags |= USB_QUIRK_HUB_SLOW_RESET; break; + case 'p': + flags |= USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT; + break; /* Ignore unrecognized flag characters */ } } @@ -527,6 +530,9 @@ static const struct usb_device_id usb_quirk_list[] = { { USB_DEVICE(0x2386, 0x350e), .driver_info = USB_QUIRK_NO_LPM }, + /* APTIV AUTOMOTIVE HUB */ + { USB_DEVICE(0x2c48, 0x0132), .driver_info = USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT }, + /* DJI CineSSD */ { USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM }, diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index 8714ab5bf04d..4a286136d1a8 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -1729,6 +1729,8 @@ struct xhci_command *xhci_alloc_command(struct xhci_hcd *xhci, } command->status = 0; + /* set default timeout to 5000 ms */ + command->timeout_ms = XHCI_CMD_DEFAULT_TIMEOUT_MS; INIT_LIST_HEAD(&command->cmd_list); return command; } diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 1dde53f6eb31..8f36c2914938 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -366,9 +366,10 @@ void xhci_ring_cmd_db(struct xhci_hcd *xhci) readl(&xhci->dba->doorbell[0]); } -static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci, unsigned long delay) +static bool xhci_mod_cmd_timer(struct xhci_hcd *xhci) { - return mod_delayed_work(system_wq, &xhci->cmd_timer, delay); + return mod_delayed_work(system_wq, &xhci->cmd_timer, + msecs_to_jiffies(xhci->current_cmd->timeout_ms)); } static struct xhci_command *xhci_next_queued_cmd(struct xhci_hcd *xhci) @@ -412,7 +413,7 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci, if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) && !(xhci->xhc_state & XHCI_STATE_DYING)) { xhci->current_cmd = cur_cmd; - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); + xhci_mod_cmd_timer(xhci); xhci_ring_cmd_db(xhci); } } @@ -1786,7 +1787,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci, if (!list_is_singular(&xhci->cmd_list)) { xhci->current_cmd = list_first_entry(&cmd->cmd_list, struct xhci_command, cmd_list); - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); + xhci_mod_cmd_timer(xhci); } else if (xhci->current_cmd == cmd) { xhci->current_cmd = NULL; } @@ -4301,7 +4302,7 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd, /* if there are no other commands queued we start the timeout timer */ if (list_empty(&xhci->cmd_list)) { xhci->current_cmd = cmd; - xhci_mod_cmd_timer(xhci, XHCI_CMD_DEFAULT_TIMEOUT); + xhci_mod_cmd_timer(xhci); } list_add_tail(&cmd->cmd_list, &xhci->cmd_list); diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index e1b1b64a0723..85ea4e17d2a0 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -3998,11 +3998,18 @@ int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev) } /* - * Issue an Address Device command and optionally send a corresponding - * SetAddress request to the device. + * This function issues an Address Device command to assign a unique USB bus + * address. Optionally, it sends a SetAddress request. + * + * @param hcd USB host controller data structure. + * @param udev USB device structure representing the connected device. + * @param setup Enum specifying setup mode: address only or with context. + * @param timeout_ms Max wait time (ms) for the command operation to complete. + * + * @return Integer status code: 0 on success, negative on error. */ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, - enum xhci_setup_dev setup) + enum xhci_setup_dev setup, unsigned int timeout_ms) { const char *act = setup == SETUP_CONTEXT_ONLY ? "context" : "address"; unsigned long flags; @@ -4059,6 +4066,7 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, } command->in_ctx = virt_dev->in_ctx; + command->timeout_ms = timeout_ms; slot_ctx = xhci_get_slot_ctx(xhci, virt_dev->in_ctx); ctrl_ctx = xhci_get_input_control_ctx(virt_dev->in_ctx); @@ -4185,14 +4193,16 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev, return ret; } -static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev) +static int xhci_address_device(struct usb_hcd *hcd, struct usb_device *udev, + unsigned int timeout_ms) { - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS); + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ADDRESS, timeout_ms); } static int xhci_enable_device(struct usb_hcd *hcd, struct usb_device *udev) { - return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY); + return xhci_setup_device(hcd, udev, SETUP_CONTEXT_ONLY, + XHCI_CMD_DEFAULT_TIMEOUT_MS); } /* diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 7e282b4522c0..ec5c663246e5 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -818,6 +818,8 @@ struct xhci_command { struct completion *completion; union xhci_trb *command_trb; struct list_head cmd_list; + /* xHCI command response timeout in milliseconds */ + unsigned int timeout_ms; }; /* drop context bitmasks */ @@ -1576,8 +1578,11 @@ struct xhci_td { unsigned int num_trbs; }; -/* xHCI command default timeout value */ -#define XHCI_CMD_DEFAULT_TIMEOUT (5 * HZ) +/* + * xHCI command default timeout value in milliseconds. + * USB 2.0 spec, section 9.2.6.1 + */ +#define XHCI_CMD_DEFAULT_TIMEOUT_MS 5000 /* command descriptor */ struct xhci_cd { diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h index 61d4f0b793dc..d0e19ac3ba6c 100644 --- a/include/linux/usb/hcd.h +++ b/include/linux/usb/hcd.h @@ -372,8 +372,9 @@ struct hc_driver { * or bandwidth constraints. */ void (*reset_bandwidth)(struct usb_hcd *, struct usb_device *); - /* Returns the hardware-chosen device address */ - int (*address_device)(struct usb_hcd *, struct usb_device *udev); + /* Set the hardware-chosen device address */ + int (*address_device)(struct usb_hcd *, struct usb_device *udev, + unsigned int timeout_ms); /* prepares the hardware to send commands to the device */ int (*enable_device)(struct usb_hcd *, struct usb_device *udev); /* Notifies the HCD after a hub descriptor is fetched. diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h index eeb7c2157c72..0cb464e3eaf4 100644 --- a/include/linux/usb/quirks.h +++ b/include/linux/usb/quirks.h @@ -72,4 +72,7 @@ /* device has endpoints that should be ignored */ #define USB_QUIRK_ENDPOINT_IGNORE BIT(15) +/* short device address timeout */ +#define USB_QUIRK_SHORT_DEVICE_ADDR_TIMEOUT BIT(16) + #endif /* __LINUX_USB_QUIRKS_H */
Currently, the timeout for the set address command is fixed at 5 seconds in the xhci driver. This means the host waits up to 5 seconds to receive a response for the set_address command from the device. In the automotive context, most smartphone enumerations, including screen projection, should ideally complete within 3 seconds. Achieving this is impossible in scenarios where the set_address is not successful and waits for a timeout. The shortened address device timeout quirks provide the flexibility to align with a 3-second time limit in the event of errors. By swiftly triggering a failure response and swiftly initiating retry procedures, these quirks ensure efficient and rapid recovery, particularly in automotive contexts where rapid smartphone enumeration and screen projection are vital. The quirk will set the timeout to 500 ms from 5 seconds. To use the quirk, please write "vendor_id:product_id:p" to /sys/bus/usb/drivers/hub/module/parameter/quirks For example, echo "0x2c48:0x0132:p" > /sys/bus/usb/drivers/hub/module/parameter/quirks" Signed-off-by: Hardik Gajjar <hgajjar@de.adit-jv.com> --- changes since version 1: - implement quirk instead of new API in xhci driver changes since version 2: - Add documentation for the new quirk. - Define the timeout unit in milliseconds in variable names and function arguments. - Change the xHCI command timeout from HZ (jiffies) to milliseconds. - Add APTIV usb hub vendor and product ID in device quirk list - Adding some other comments for clarity Changes since version 3: - Add some comments for clarity. - Minor indentation and sequence change. --- .../admin-guide/kernel-parameters.txt | 3 +++ drivers/usb/core/hub.c | 21 ++++++++++++++++-- drivers/usb/core/quirks.c | 6 +++++ drivers/usb/host/xhci-mem.c | 2 ++ drivers/usb/host/xhci-ring.c | 11 +++++----- drivers/usb/host/xhci.c | 22 ++++++++++++++----- drivers/usb/host/xhci.h | 9 ++++++-- include/linux/usb/hcd.h | 5 +++-- include/linux/usb/quirks.h | 3 +++ 9 files changed, 65 insertions(+), 17 deletions(-)