
[v2] xhci: Prevent deadlock when xhci adapter breaks during init

Message ID 1569247119-32708-1-git-send-email-William.Kuzeja@stratus.com (mailing list archive)
State Mainlined
Commit 8de66b0e6a56ff10dd00d2b0f2ae52e300178587

Commit Message

Bill Kuzeja Sept. 23, 2019, 1:58 p.m. UTC
The system can hit a deadlock if an xhci adapter breaks while initializing.
The deadlock is between two threads: thread 1 is tearing down the
adapter and is stuck in usb_unlocked_disable_lpm() waiting to lock
hcd->bandwidth_mutex. Thread 2 is holding this mutex (while still trying
to add a usb device), but is stuck in xhci_endpoint_reset() waiting for a
stop or config command to complete. A reboot is required to recover.

It turns out that the return codes of xhci_queue_stop_endpoint() and
xhci_queue_configure_endpoint() are not checked for errors in
xhci_endpoint_reset(). If the timing is right and the adapter dies just
before either of these commands is queued, we hang indefinitely waiting
for a completion on a command that was never issued.
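
The failure mode described above can be sketched in user space. This is
only an illustrative model, not the driver code: every fake_* name is
hypothetical, the "hardware" completes a command as soon as the doorbell
is rung, and -ESHUTDOWN is assumed as the error a dying host returns from
the queue step. The point it shows is that the caller must check the
queue result and bail out before it ever waits on the completion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the host controller state. */
struct fake_host {
	bool dying;	/* set once the adapter has failed */
	int queued;	/* number of commands placed on the ring */
};

/* Hypothetical stand-in for a command that carries a completion. */
struct fake_cmd {
	bool completed;
};

/* Models a queue function: refuses to queue once the host is dying. */
static int fake_queue_cmd(struct fake_host *host, struct fake_cmd *cmd)
{
	if (host->dying)
		return -ESHUTDOWN;	/* assumed error code for this sketch */
	cmd->completed = false;
	host->queued++;
	return 0;
}

/* Models ringing the doorbell: the fake hardware completes at once. */
static void fake_ring_doorbell(struct fake_cmd *cmd)
{
	cmd->completed = true;
}

/*
 * Returns 0 if the command was queued and completed, or a negative
 * errno if queuing failed. A command that was never queued is never
 * waited on, which is the property the patch restores.
 */
int fake_endpoint_reset(struct fake_host *host)
{
	struct fake_cmd cmd = { .completed = false };
	int err;

	err = fake_queue_cmd(host, &cmd);
	if (err < 0) {
		fprintf(stderr, "%s: failed to queue command, %d\n",
			__func__, err);
		return err;	/* bail out instead of waiting forever */
	}

	fake_ring_doorbell(&cmd);
	/*
	 * Real code would sleep until the completion fires; in this
	 * single-threaded model the command has already completed.
	 */
	assert(cmd.completed);
	return 0;
}
```

Calling fake_endpoint_reset() on a host with dying set returns the error
straight away and leaves queued at 0. Without the check, the function
would go on to wait for a completion that nothing will ever signal.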

This wasn't a problem before the following fix because we didn't send
commands in xhci_endpoint_reset:

commit f5249461b504 ("xhci: Clear the host side toggle manually when
    endpoint is soft reset")

With this patch applied, a duration test which breaks adapters during
initialization (and which deadlocks with the standard kernel) runs
without issue.

Fixes: f5249461b504 ("xhci: Clear the host side toggle manually when endpoint is soft reset")

Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Torez Smith <torez@redhat.com>

Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Signed-off-by: Torez Smith <torez@redhat.com>
---
 drivers/usb/host/xhci.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

Comments

Mathias Nyman Sept. 23, 2019, 2:15 p.m. UTC | #1
On 23.9.2019 16.58, Bill Kuzeja wrote:
> The system can hit a deadlock if an xhci adapter breaks while initializing.
> The deadlock is between two threads: thread 1 is tearing down the
> adapter and is stuck in usb_unlocked_disable_lpm waiting to lock the
> hcd->bandwidth_mutex. Thread 2 is holding this mutex (while still trying
> to add a usb device), but is stuck in xhci_endpoint_reset waiting for a
> stop or config command to complete. A reboot is required to resolve.
> 
> It turns out when calling xhci_queue_stop_endpoint and
> xhci_queue_configure_endpoint in xhci_endpoint_reset, the return code is
> not checked for errors. If the timing is right and the adapter dies just
> before either of these commands get issued, we hang indefinitely waiting
> for a completion on a command that didn't get issued.
> 
> This wasn't a problem before the following fix because we didn't send
> commands in xhci_endpoint_reset:
> 
> commit f5249461b504 ("xhci: Clear the host side toggle manually when
>      endpoint is soft reset")
> 
> With the patch I am submitting, a duration test which breaks adapters
> during initialization (and which deadlocks with the standard kernel) runs
> without issue.
> 
> Fixes: f5249461b504 ("xhci: Clear the host side toggle manually when
>      endpoint is soft reset")
> 
> Cc: Mathias Nyman <mathias.nyman@intel.com>
> Cc: Torez Smith <torez@redhat.com>
> 
> Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
> Signed-off-by: Torez Smith <torez@redhat.com>
> ---

Thanks, adding to queue

-Mathias

Patch

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5008659..ed44ec2 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3083,6 +3083,7 @@  static void xhci_endpoint_reset(struct usb_hcd *hcd,
 	unsigned int ep_index;
 	unsigned long flags;
 	u32 ep_flag;
+	int err;
 
 	xhci = hcd_to_xhci(hcd);
 	if (!host_ep->hcpriv)
@@ -3142,7 +3143,17 @@  static void xhci_endpoint_reset(struct usb_hcd *hcd,
 		xhci_free_command(xhci, cfg_cmd);
 		goto cleanup;
 	}
-	xhci_queue_stop_endpoint(xhci, stop_cmd, udev->slot_id, ep_index, 0);
+
+	err = xhci_queue_stop_endpoint(xhci, stop_cmd, udev->slot_id,
+					ep_index, 0);
+	if (err < 0) {
+		spin_unlock_irqrestore(&xhci->lock, flags);
+		xhci_free_command(xhci, cfg_cmd);
+		xhci_dbg(xhci, "%s: Failed to queue stop ep command, %d ",
+				__func__, err);
+		goto cleanup;
+	}
+
 	xhci_ring_cmd_db(xhci);
 	spin_unlock_irqrestore(&xhci->lock, flags);
 
@@ -3156,8 +3167,16 @@  static void xhci_endpoint_reset(struct usb_hcd *hcd,
 					   ctrl_ctx, ep_flag, ep_flag);
 	xhci_endpoint_copy(xhci, cfg_cmd->in_ctx, vdev->out_ctx, ep_index);
 
-	xhci_queue_configure_endpoint(xhci, cfg_cmd, cfg_cmd->in_ctx->dma,
+	err = xhci_queue_configure_endpoint(xhci, cfg_cmd, cfg_cmd->in_ctx->dma,
 				      udev->slot_id, false);
+	if (err < 0) {
+		spin_unlock_irqrestore(&xhci->lock, flags);
+		xhci_free_command(xhci, cfg_cmd);
+		xhci_dbg(xhci, "%s: Failed to queue config ep command, %d ",
+				__func__, err);
+		goto cleanup;
+	}
+
 	xhci_ring_cmd_db(xhci);
 	spin_unlock_irqrestore(&xhci->lock, flags);