
[1/2] usb: xhci: fix uninitialized completion when USB3 port got wrong status

Message ID 1540141725-13047-1-git-send-email-aaron.ma@canonical.com (mailing list archive)
State New, archived

Commit Message

Aaron Ma Oct. 21, 2018, 5:08 p.m. UTC
A Realtek USB3.0 Card Reader [0bda:0328] reports a wrong port status on the
Cannon Lake PCH USB3.1 xHCI [8086:a36d] after resume from S3;
after clearing the port reset it works fine.

Since this device is registered on the USB3 roothub at boot,
when the port status reports a non-SuperSpeed speed, xhci_get_port_status()
will use an uninitialized completion in bus_state[0].
The kernel will hang because of a NULL pointer.

Restrict the USB2 resume status check to the USB2 roothub to fix the hang.
It does no harm to also initialize the USB3 bus_state[0] completions in case
they are ever used.

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
---
 drivers/usb/host/xhci-hub.c  | 2 +-
 drivers/usb/host/xhci-mem.c  | 1 +
 drivers/usb/host/xhci-ring.c | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

Comments

Mathias Nyman Oct. 22, 2018, 1:12 p.m. UTC | #1
On 21.10.2018 20:08, Aaron Ma wrote:
> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
> after clear port reset it works fine.
> 
> Since this device is registered on USB3 roothub at boot,
> when port status reports not superspeed, xhci_get_port_status will call
> an uninitialized completion in bus_state[0].
> Kernel will hang because of NULL pointer.
> 
> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
> No harm to initialize USB3 bus_state[0] in case it is called.
> 
> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
> ---
>   drivers/usb/host/xhci-hub.c  | 2 +-
>   drivers/usb/host/xhci-mem.c  | 1 +
>   drivers/usb/host/xhci-ring.c | 2 +-
>   3 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 7e2a531ba321..d30ca6ceffc9 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>   			status |= USB_PORT_STAT_SUSPEND;
>   	}
>   	if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
> -		!DEV_SUPERSPEED_ANY(raw_port_status)) {
> +		!DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>   		if ((raw_port_status & PORT_RESET) ||
>   				!(raw_port_status & PORT_PE))
>   			return 0xffffffff;

Nice catch.

Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)"
It's easier to understand.
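
To illustrate the suggested guard, here is a minimal userspace sketch, not the driver code. The bit definitions are assumed to mirror drivers/usb/host/xhci.h and include/linux/usb/hcd.h, and both helper functions are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror drivers/usb/host/xhci.h; values shown for illustration. */
#define PORT_PLS_MASK   (0xfu << 5)    /* port link state, bits 8:5 */
#define XDEV_RESUME     (0xfu << 5)
#define DEV_SPEED_MASK  (0xfu << 10)   /* port speed, bits 13:10 */
#define XDEV_SS         (0x4u << 10)
#define DEV_SUPERSPEED_ANY(p)  (((p) & DEV_SPEED_MASK) >= XDEV_SS)

/* Assumed to mirror include/linux/usb/hcd.h. */
#define HCD_USB2  0x0020
#define HCD_USB3  0x0040

/* Hypothetical helper: the original condition, keyed on PORTSC only. */
static bool usb2_resume_check_old(uint32_t portsc)
{
	return (portsc & PORT_PLS_MASK) == XDEV_RESUME &&
	       !DEV_SUPERSPEED_ANY(portsc);
}

/* Hypothetical helper: the same condition restricted to the USB2 roothub. */
static bool usb2_resume_check_guarded(uint32_t portsc, int hcd_speed)
{
	return usb2_resume_check_old(portsc) && hcd_speed < HCD_USB3;
}
```

With a PORTSC value whose link state is Resume but whose speed field still reads 0, the old check is true regardless of which roothub the port belongs to, while the guarded check is true only when the hcd speed is below HCD_USB3.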

Turns out this isn't an issue with your Realtek device, it just happens to trigger
a driver issue.

The original !DEV_SUPERSPEED_ANY() check was not suitable here.
It checks the port-speed field of portsc register (bits 13:10), which are only valid for USB3
ports if all link training is done and port reached its "enabled" state.
Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.
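
As a rough sketch of the point above, the speed field can be decoded in plain C. The macro values are assumptions modelled on drivers/usb/host/xhci.h, and looks_like_usb2() is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror drivers/usb/host/xhci.h; values shown for illustration. */
#define DEV_SPEED_MASK  (0xfu << 10)   /* port-speed field, bits 13:10 */
#define XDEV_HS         (0x3u << 10)   /* high speed (USB2) */
#define XDEV_SS         (0x4u << 10)   /* SuperSpeed */
#define XDEV_SSP        (0x5u << 10)   /* SuperSpeedPlus */
#define DEV_SUPERSPEED_ANY(p)  (((p) & DEV_SPEED_MASK) >= XDEV_SS)

/*
 * Hypothetical helper: true when the speed field alone would classify
 * the port as "not SuperSpeed".
 */
static bool looks_like_usb2(uint32_t portsc)
{
	return !DEV_SUPERSPEED_ANY(portsc);
}
```

A speed field of 0, as read from a USB3 port that has not finished link training, is classified the same way as a genuine high-speed value, which is why the speed field alone cannot identify the roothub.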

Just to make sure: your device stays a USB 3 device and is never
enumerated as USB2, right?

I'm in the middle of refactoring get_port_status(); it should solve this
as well, but we need your solution for stable releases.

Any chance you could check if the refactored code works with the Realtek device?
I just created a "get_port_status_refactor" branch for it:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git get_port_status_refactor

> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> index b1f27aa38b10..dd2ad50c5289 100644
> --- a/drivers/usb/host/xhci-mem.c
> +++ b/drivers/usb/host/xhci-mem.c
> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
>   		xhci->bus_state[0].resume_done[i] = 0;
>   		xhci->bus_state[1].resume_done[i] = 0;
>   		/* Only the USB 2.0 completions will ever be used. */
> +		init_completion(&xhci->bus_state[0].rexit_done[i]);
>   		init_completion(&xhci->bus_state[1].rexit_done[i]);
>   	}

I don't think we should unnecessarily init the completion for USB3 ports.

>   
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index f0a99aa0ac58..894d4625b8b9 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd *xhci,
>   	 * RExit to a disconnect state).  If so, let the the driver know it's
>   	 * out of the RExit state.
>   	 */
> -	if (!DEV_SUPERSPEED_ANY(portsc) &&
> +	if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
>   			test_and_clear_bit(hcd_portnum,
>   				&bus_state->rexit_ports)) {
>   		complete(&bus_state->rexit_done[hcd_portnum]);
> 

Same here, prefer hcd->speed < HCD_USB3

Thanks
Mathias
Aaron Ma Oct. 22, 2018, 1:23 p.m. UTC | #2
On 10/22/18 9:12 PM, Mathias Nyman wrote:
> On 21.10.2018 20:08, Aaron Ma wrote:
>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>> after clear port reset it works fine.
>>
>> Since this device is registered on USB3 roothub at boot,
>> when port status reports not superspeed, xhci_get_port_status will call
>> an uninitialized completion in bus_state[0].
>> Kernel will hang because of NULL pointer.
>>
>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>> No harm to initialize USB3 bus_state[0] in case it is called.
>>
>> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
>> ---
>>   drivers/usb/host/xhci-hub.c  | 2 +-
>>   drivers/usb/host/xhci-mem.c  | 1 +
>>   drivers/usb/host/xhci-ring.c | 2 +-
>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>> index 7e2a531ba321..d30ca6ceffc9 100644
>> --- a/drivers/usb/host/xhci-hub.c
>> +++ b/drivers/usb/host/xhci-hub.c
>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>               status |= USB_PORT_STAT_SUSPEND;
>>       }
>>       if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>           if ((raw_port_status & PORT_RESET) ||
>>                   !(raw_port_status & PORT_PE))
>>               return 0xffffffff;
> 
> Nice catch.
> 
> Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)"
> It's easier to understand.
> 
> Turns out this isn't an issue with your Realtek device, it just happens
> to trigger
> a driver issue.
> 
> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
> It checks the port-speed field of portsc register (bits 13:10), which
> are only valid for USB3
> ports if all link training is done and port reached its "enabled" state.
> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.
> 
> Just to make sure, Does your device stay as a USB 3 device, it's never
> enumerated as USB2, right?
> 
> I'm in the middle of refactoring the get_port_status(), it should solve
> this
> as well, but we need your solution stable releases.
> 
> Any chance you to check if the refactored code works with the Realtek
> device?
> I just created a "get_port_status_refactor" branch for it:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> get_port_status_refactor

Let me try your branch, please wait a moment.

> 
>> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
>> index b1f27aa38b10..dd2ad50c5289 100644
>> --- a/drivers/usb/host/xhci-mem.c
>> +++ b/drivers/usb/host/xhci-mem.c
>> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t
>> flags)
>>           xhci->bus_state[0].resume_done[i] = 0;
>>           xhci->bus_state[1].resume_done[i] = 0;
>>           /* Only the USB 2.0 completions will ever be used. */
>> +        init_completion(&xhci->bus_state[0].rexit_done[i]);
>>           init_completion(&xhci->bus_state[1].rexit_done[i]);
>>       }
> 
> I don't think we should init the completion unnecessary for USB3 ports.
> 
>>   diff --git a/drivers/usb/host/xhci-ring.c
>> b/drivers/usb/host/xhci-ring.c
>> index f0a99aa0ac58..894d4625b8b9 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd
>> *xhci,
>>        * RExit to a disconnect state).  If so, let the the driver know
>> it's
>>        * out of the RExit state.
>>        */
>> -    if (!DEV_SUPERSPEED_ANY(portsc) &&
>> +    if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
>>               test_and_clear_bit(hcd_portnum,
>>                   &bus_state->rexit_ports)) {
>>           complete(&bus_state->rexit_done[hcd_portnum]);
>>
> 
> Same here, prefer hcd->speed < HCD_USB3

Yes, I thought about this: bus_state[1] and bus_state[0] are defined for
USB 2 and USB 3 respectively, so I used "1 == hcd_index(hcd)".
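
For reference, a small sketch of why the two spellings of the guard should agree. It assumes hcd_index() returns 0 for a USB3 hcd and 1 otherwise, as the bus_state[] usage in this patch implies. struct fake_hcd, sketch_hcd_index() and guards_agree() are hypothetical stand-ins, and the HCD_* values are assumed to mirror include/linux/usb/hcd.h.

```c
#include <stdbool.h>

/* Assumed to mirror include/linux/usb/hcd.h; values shown for illustration. */
#define HCD_USB2   0x0020
#define HCD_USB3   0x0040
#define HCD_USB31  0x0050

/* Hypothetical stand-in for the one field of struct usb_hcd used here. */
struct fake_hcd {
	int speed;
};

/* Assumed behaviour of hcd_index(): 0 selects USB3 state, 1 selects USB2. */
static unsigned int sketch_hcd_index(const struct fake_hcd *hcd)
{
	return hcd->speed >= HCD_USB3 ? 0 : 1;
}

/* True when "1 == hcd_index(hcd)" and "hcd->speed < HCD_USB3" match. */
static bool guards_agree(const struct fake_hcd *hcd)
{
	return (1 == sketch_hcd_index(hcd)) == (hcd->speed < HCD_USB3);
}
```

Under that assumption both conditions select the same roothub, so the choice between them is about readability rather than behaviour.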

Anyway, I will send a V2 following your suggestion.

Thanks,
Aaron

> 
> Thanks
> Mathias
Aaron Ma Oct. 22, 2018, 5:53 p.m. UTC | #3
On 10/22/18 9:12 PM, Mathias Nyman wrote:
> On 21.10.2018 20:08, Aaron Ma wrote:
>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>> after clear port reset it works fine.
>>
>> Since this device is registered on USB3 roothub at boot,
>> when port status reports not superspeed, xhci_get_port_status will call
>> an uninitialized completion in bus_state[0].
>> Kernel will hang because of NULL pointer.
>>
>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>> No harm to initialize USB3 bus_state[0] in case it is called.
>>
>> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
>> ---
>>   drivers/usb/host/xhci-hub.c  | 2 +-
>>   drivers/usb/host/xhci-mem.c  | 1 +
>>   drivers/usb/host/xhci-ring.c | 2 +-
>>   3 files changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>> index 7e2a531ba321..d30ca6ceffc9 100644
>> --- a/drivers/usb/host/xhci-hub.c
>> +++ b/drivers/usb/host/xhci-hub.c
>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>               status |= USB_PORT_STAT_SUSPEND;
>>       }
>>       if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>           if ((raw_port_status & PORT_RESET) ||
>>                   !(raw_port_status & PORT_PE))
>>               return 0xffffffff;
> 
> Nice catch.
> 
> Maybe use "hcd->speed < HCD_USB3" instead of "1 == hcd_index(hcd)"
> It's easier to understand.
> 
> Turns out this isn't an issue with your Realtek device, it just happens
> to trigger
> a driver issue.
> 
> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
> It checks the port-speed field of portsc register (bits 13:10), which
> are only valid for USB3
> ports if all link training is done and port reached its "enabled" state.
> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.

PORT_ENABLE should already be set to one.
A card reader with the same device ID has no issue on Sunrise Point.
Maybe it is related to the Cannon Lake PCH USB controller?

> 
> Just to make sure, Does your device stay as a USB 3 device, it's never
> enumerated as USB2, right?
> 

Right, always USB3.

> I'm in the middle of refactoring the get_port_status(), it should solve
> this
> as well, but we need your solution stable releases.
> 

V2 sent out. Cc-ed stable.

> Any chance you to check if the refactored code works with the Realtek
> device?
> I just created a "get_port_status_refactor" branch for it:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
> get_port_status_refactor

The hang issue is not reproduced on this kernel branch.

Thanks,
Aaron

> 
>> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
>> index b1f27aa38b10..dd2ad50c5289 100644
>> --- a/drivers/usb/host/xhci-mem.c
>> +++ b/drivers/usb/host/xhci-mem.c
>> @@ -2539,6 +2539,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t
>> flags)
>>           xhci->bus_state[0].resume_done[i] = 0;
>>           xhci->bus_state[1].resume_done[i] = 0;
>>           /* Only the USB 2.0 completions will ever be used. */
>> +        init_completion(&xhci->bus_state[0].rexit_done[i]);
>>           init_completion(&xhci->bus_state[1].rexit_done[i]);
>>       }
> 
> I don't think we should init the completion unnecessary for USB3 ports.
> 
>>   diff --git a/drivers/usb/host/xhci-ring.c
>> b/drivers/usb/host/xhci-ring.c
>> index f0a99aa0ac58..894d4625b8b9 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -1634,7 +1634,7 @@ static void handle_port_status(struct xhci_hcd
>> *xhci,
>>        * RExit to a disconnect state).  If so, let the the driver know
>> it's
>>        * out of the RExit state.
>>        */
>> -    if (!DEV_SUPERSPEED_ANY(portsc) &&
>> +    if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
>>               test_and_clear_bit(hcd_portnum,
>>                   &bus_state->rexit_ports)) {
>>           complete(&bus_state->rexit_done[hcd_portnum]);
>>
> 
> Same here, prefer hcd->speed < HCD_USB3
> 
> Thanks
> Mathias
Mathias Nyman Oct. 23, 2018, 10:39 a.m. UTC | #4
On 22.10.2018 20:53, Aaron Ma wrote:
> 
> 
> On 10/22/18 9:12 PM, Mathias Nyman wrote:
>> On 21.10.2018 20:08, Aaron Ma wrote:
>>> Realtek USB3.0 Card Reader [0bda:0328] reports wrong port status on
>>> Cannon lake PCH USB3.1 xHCI [8086:a36d] after resume from S3,
>>> after clear port reset it works fine.
>>>
>>> Since this device is registered on USB3 roothub at boot,
>>> when port status reports not superspeed, xhci_get_port_status will call
>>> an uninitialized completion in bus_state[0].
>>> Kernel will hang because of NULL pointer.
>>>
>>> Restrict the USB2 resume status check in USB2 roothub to fix hang issue.
>>> No harm to initialize USB3 bus_state[0] in case it is called.
>>>
>>> Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
>>> ---
>>>    drivers/usb/host/xhci-hub.c  | 2 +-
>>>    drivers/usb/host/xhci-mem.c  | 1 +
>>>    drivers/usb/host/xhci-ring.c | 2 +-
>>>    3 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
>>> index 7e2a531ba321..d30ca6ceffc9 100644
>>> --- a/drivers/usb/host/xhci-hub.c
>>> +++ b/drivers/usb/host/xhci-hub.c
>>> @@ -876,7 +876,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
>>>                status |= USB_PORT_STAT_SUSPEND;
>>>        }
>>>        if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
>>> -        !DEV_SUPERSPEED_ANY(raw_port_status)) {
>>> +        !DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
>>>            if ((raw_port_status & PORT_RESET) ||
>>>                    !(raw_port_status & PORT_PE))
>>>                return 0xffffffff;
>>
>> The original !DEV_SUPERSPEED_ANY() check was not suitable here.
>> It checks the port-speed field of portsc register (bits 13:10), which
>> are only valid for USB3
>> ports if all link training is done and port reached its "enabled" state.
>> Otherwise it will return 0, and USB3 ports may be mistaken for USB2 ports.
> 
> PORT_ENABLE should be already set to one.
> The same device ID card reader doesn't have issue on Sunrise Point.
> Maybe it is related to Cannon lake PCH USB controller?
> 

Ok, thanks for the info

> 
> V2 sent out. Cc-ed stable.
> 
>> Any chance you to check if the refactored code works with the Realtek
>> device?
>> I just created a "get_port_status_refactor" branch for it:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git
>> get_port_status_refactor
> 
> The hang issue is not reproduced on this kernel branch.
> 

Great, thanks for testing it

-Mathias

Patch

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 7e2a531ba321..d30ca6ceffc9 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -876,7 +876,7 @@  static u32 xhci_get_port_status(struct usb_hcd *hcd,
 			status |= USB_PORT_STAT_SUSPEND;
 	}
 	if ((raw_port_status & PORT_PLS_MASK) == XDEV_RESUME &&
-		!DEV_SUPERSPEED_ANY(raw_port_status)) {
+		!DEV_SUPERSPEED_ANY(raw_port_status) && 1 == hcd_index(hcd)) {
 		if ((raw_port_status & PORT_RESET) ||
 				!(raw_port_status & PORT_PE))
 			return 0xffffffff;
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index b1f27aa38b10..dd2ad50c5289 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2539,6 +2539,7 @@  int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 		xhci->bus_state[0].resume_done[i] = 0;
 		xhci->bus_state[1].resume_done[i] = 0;
 		/* Only the USB 2.0 completions will ever be used. */
+		init_completion(&xhci->bus_state[0].rexit_done[i]);
 		init_completion(&xhci->bus_state[1].rexit_done[i]);
 	}
 
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f0a99aa0ac58..894d4625b8b9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1634,7 +1634,7 @@  static void handle_port_status(struct xhci_hcd *xhci,
 	 * RExit to a disconnect state).  If so, let the the driver know it's
 	 * out of the RExit state.
 	 */
-	if (!DEV_SUPERSPEED_ANY(portsc) &&
+	if (!DEV_SUPERSPEED_ANY(portsc) && 1 == hcd_index(hcd) &&
 			test_and_clear_bit(hcd_portnum,
 				&bus_state->rexit_ports)) {
 		complete(&bus_state->rexit_done[hcd_portnum]);