diff mbox series

[2/2] xhci: Poll for U0 after disabling USB2 LPM

Message ID 20200520101811.2623-2-kai.heng.feng@canonical.com (mailing list archive)
State Mainlined
Commit b3d71abd135e6919ca0b6cab463738472653ddfb
Headers show
Series [1/2] xhci: Return if xHCI doesn't support LPM | expand

Commit Message

Kai-Heng Feng May 20, 2020, 10:18 a.m. UTC
USB2 devices with LPM enabled may interrupt the system suspend:
[  932.510475] usb 1-7: usb suspend, wakeup 0
[  932.510549] hub 1-0:1.0: hub_suspend
[  932.510581] usb usb1: bus suspend, wakeup 0
[  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
[  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
..
[  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
..
[  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
[  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
[  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16

During system suspend, USB core will let HC suspends the device if it
doesn't have remote wakeup enabled and doesn't have any children.
However, from the log above we can see that the usb 1-7 doesn't get bus
suspended due to not in U0. After a while the port finished U2 -> U0
transition, interrupts the suspend process.

The observation is that after disabling LPM, port doesn't transit to U0
immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
the affected device is advertised as 400us, which is still not enough
based on my testing result.

So let's use the maximum permitted latency, 10000, to poll for U0
status to solve the issue.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/usb/host/xhci.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Kai-Heng Feng June 8, 2020, 3:58 a.m. UTC | #1
> On May 20, 2020, at 18:18, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
> 
> USB2 devices with LPM enabled may interrupt the system suspend:
> [  932.510475] usb 1-7: usb suspend, wakeup 0
> [  932.510549] hub 1-0:1.0: hub_suspend
> [  932.510581] usb usb1: bus suspend, wakeup 0
> [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
> [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
> ..
> [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
> ..
> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
> [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
> 
> During system suspend, USB core will let HC suspends the device if it
> doesn't have remote wakeup enabled and doesn't have any children.
> However, from the log above we can see that the usb 1-7 doesn't get bus
> suspended due to not in U0. After a while the port finished U2 -> U0
> transition, interrupts the suspend process.
> 
> The observation is that after disabling LPM, port doesn't transit to U0
> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
> the affected device is advertised as 400us, which is still not enough
> based on my testing result.
> 
> So let's use the maximum permitted latency, 10000, to poll for U0
> status to solve the issue.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

A gentle ping...

> ---
> drivers/usb/host/xhci.c | 3 +++
> 1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 14181a7ea375..1058f604975b 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -4474,6 +4474,9 @@ static int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd,
> 			mutex_lock(hcd->bandwidth_mutex);
> 			xhci_change_max_exit_latency(xhci, udev, 0);
> 			mutex_unlock(hcd->bandwidth_mutex);
> +			readl_poll_timeout(ports[port_num]->addr, pm_val,
> +					   (pm_val & PORT_PLS_MASK) == XDEV_U0,
> +					   100, 10000);
> 			return 0;
> 		}
> 	}
> -- 
> 2.17.1
>
Greg Kroah-Hartman June 8, 2020, 7:05 a.m. UTC | #2
On Mon, Jun 08, 2020 at 11:58:40AM +0800, Kai-Heng Feng wrote:
> 
> 
> > On May 20, 2020, at 18:18, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
> > 
> > USB2 devices with LPM enabled may interrupt the system suspend:
> > [  932.510475] usb 1-7: usb suspend, wakeup 0
> > [  932.510549] hub 1-0:1.0: hub_suspend
> > [  932.510581] usb usb1: bus suspend, wakeup 0
> > [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
> > [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
> > ..
> > [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
> > ..
> > [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
> > [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
> > [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
> > 
> > During system suspend, USB core will let HC suspends the device if it
> > doesn't have remote wakeup enabled and doesn't have any children.
> > However, from the log above we can see that the usb 1-7 doesn't get bus
> > suspended due to not in U0. After a while the port finished U2 -> U0
> > transition, interrupts the suspend process.
> > 
> > The observation is that after disabling LPM, port doesn't transit to U0
> > immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
> > maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
> > the affected device is advertised as 400us, which is still not enough
> > based on my testing result.
> > 
> > So let's use the maximum permitted latency, 10000, to poll for U0
> > status to solve the issue.
> > 
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> 
> A gentle ping...

It is the middle of the merge window, we can't do anything with any new
patch until after -rc1 is out.

You know this...

greg k-h
Mathias Nyman June 8, 2020, 11:21 a.m. UTC | #3
On 20.5.2020 13.18, Kai-Heng Feng wrote:
> USB2 devices with LPM enabled may interrupt the system suspend:
> [  932.510475] usb 1-7: usb suspend, wakeup 0
> [  932.510549] hub 1-0:1.0: hub_suspend
> [  932.510581] usb usb1: bus suspend, wakeup 0
> [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
> [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
> ..
> [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03

400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.

> ..
> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
> [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
> 
> During system suspend, USB core will let HC suspends the device if it
> doesn't have remote wakeup enabled and doesn't have any children.
> However, from the log above we can see that the usb 1-7 doesn't get bus
> suspended due to not in U0. After a while the port finished U2 -> U0
> transition, interrupts the suspend process.

In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 transitions.
Only case we should see a port change event is U2Entry -> U0 due to STALL or
error/timeout. (see xhci 4.23.5.1.1.1)

> 
> The observation is that after disabling LPM, port doesn't transit to U0
> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
> the affected device is advertised as 400us, which is still not enough
> based on my testing result.
> 
> So let's use the maximum permitted latency, 10000, to poll for U0
> status to solve the issue.

I don't recall all details, but it could be that disabling LPM before suspend
is unnecessary. 
At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 4.15.1)
so that is one change that could be done to xhci_bus_suspend()

Also just noticed that we are not really checking L1S field in PORTPMSC register, 
this should tell if there was an issue with USB2 lpm state transitions, and
perhaps we should disable lpm for that device. 

Does the L1S field show anything unuaual in your case?
That could explain the port event with the PLC bit set.

I think we can avoid a readl_poll_timeout() solution in this case.

-Mathias
Kai-Heng Feng June 9, 2020, 10:15 a.m. UTC | #4
> On Jun 8, 2020, at 19:21, Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
> 
> On 20.5.2020 13.18, Kai-Heng Feng wrote:
>> USB2 devices with LPM enabled may interrupt the system suspend:
>> [  932.510475] usb 1-7: usb suspend, wakeup 0
>> [  932.510549] hub 1-0:1.0: hub_suspend
>> [  932.510581] usb usb1: bus suspend, wakeup 0
>> [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
>> [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
>> ..
>> [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
> 
> 400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.
> 
>> ..
>> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
>> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
>> [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
>> 
>> During system suspend, USB core will let HC suspends the device if it
>> doesn't have remote wakeup enabled and doesn't have any children.
>> However, from the log above we can see that the usb 1-7 doesn't get bus
>> suspended due to not in U0. After a while the port finished U2 -> U0
>> transition, interrupts the suspend process.
> 
> In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 transitions.
> Only case we should see a port change event is U2Entry -> U0 due to STALL or
> error/timeout. (see xhci 4.23.5.1.1.1)
> 
>> 
>> The observation is that after disabling LPM, port doesn't transit to U0
>> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
>> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
>> the affected device is advertised as 400us, which is still not enough
>> based on my testing result.
>> 
>> So let's use the maximum permitted latency, 10000, to poll for U0
>> status to solve the issue.
> 
> I don't recall all details, but it could be that disabling LPM before suspend
> is unnecessary. 
> At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 4.15.1)
> so that is one change that could be done to xhci_bus_suspend()

Yes, put the device to U3 directly does the trick.

> 
> Also just noticed that we are not really checking L1S field in PORTPMSC register, 
> this should tell if there was an issue with USB2 lpm state transitions, and
> perhaps we should disable lpm for that device. 
> 
> Does the L1S field show anything unuaual in your case?
> That could explain the port event with the PLC bit set.

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2c255d0620b0..a2099d42e490 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1592,7 +1592,7 @@ static void handle_port_status(struct xhci_hcd *xhci,
 {
        struct usb_hcd *hcd;
        u32 port_id;
-       u32 portsc, cmd_reg;
+       u32 portsc, portpmsc, cmd_reg;
        int max_ports;
        int slot_id;
        unsigned int hcd_portnum;
@@ -1634,9 +1634,10 @@ static void handle_port_status(struct xhci_hcd *xhci,
        bus_state = &port->rhub->bus_state;
        hcd_portnum = port->hcd_portnum;
        portsc = readl(port->addr);
+       portpmsc = readl(port->addr + PORTPMSC);
 
-       xhci_dbg(xhci, "Port change event, %d-%d, id %d, portsc: 0x%x\n",
-                hcd->self.busnum, hcd_portnum + 1, port_id, portsc);
+       xhci_dbg(xhci, "Port change event, %d-%d, id %d, portsc: 0x%x, portpmsc 0x%0x\n",
+                hcd->self.busnum, hcd_portnum + 1, port_id, portsc, portpmsc);
 
        trace_xhci_handle_port_status(hcd_portnum, portsc);

[  685.460054] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03, portpmsc 0x1
[  685.460062] xhci_hcd 0000:00:14.0: resume root hub
[  685.460079] xhci_hcd 0000:00:14.0: handle_port_status: starting port polling.
[  685.460094] xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling.
[  685.521685] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
[  685.521695] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
[  685.521699] PM: Device 0000:00:14.0 failed to suspend async: error -16

So after disabling LPM, it takes a long time to complete L1 transition, before transitioning to L0.

Kai-Heng

> 
> I think we can avoid a readl_poll_timeout() solution in this case.
> 
> -Mathias
Kai-Heng Feng June 19, 2020, 2:19 p.m. UTC | #5
Hi Mathias,

> On Jun 9, 2020, at 18:15, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
> 
> 
> 
>> On Jun 8, 2020, at 19:21, Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
>> 
>> On 20.5.2020 13.18, Kai-Heng Feng wrote:
>>> USB2 devices with LPM enabled may interrupt the system suspend:
>>> [  932.510475] usb 1-7: usb suspend, wakeup 0
>>> [  932.510549] hub 1-0:1.0: hub_suspend
>>> [  932.510581] usb usb1: bus suspend, wakeup 0
>>> [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
>>> [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
>>> ..
>>> [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
>> 
>> 400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.
>> 
>>> ..
>>> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
>>> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
>>> [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
>>> 
>>> During system suspend, USB core will let HC suspends the device if it
>>> doesn't have remote wakeup enabled and doesn't have any children.
>>> However, from the log above we can see that the usb 1-7 doesn't get bus
>>> suspended due to not in U0. After a while the port finished U2 -> U0
>>> transition, interrupts the suspend process.
>> 
>> In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 transitions.
>> Only case we should see a port change event is U2Entry -> U0 due to STALL or
>> error/timeout. (see xhci 4.23.5.1.1.1)
>> 
>>> 
>>> The observation is that after disabling LPM, port doesn't transit to U0
>>> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
>>> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
>>> the affected device is advertised as 400us, which is still not enough
>>> based on my testing result.
>>> 
>>> So let's use the maximum permitted latency, 10000, to poll for U0
>>> status to solve the issue.
>> 
>> I don't recall all details, but it could be that disabling LPM before suspend
>> is unnecessary. 
>> At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 4.15.1)
>> so that is one change that could be done to xhci_bus_suspend()
> 
> Yes, put the device to U3 directly does the trick.

As discussed, will you pick this series over the v2?
Or is there anything I should improve for this one?

Kai-Heng

> 
>> 
>> Also just noticed that we are not really checking L1S field in PORTPMSC register, 
>> this should tell if there was an issue with USB2 lpm state transitions, and
>> perhaps we should disable lpm for that device. 
>> 
>> Does the L1S field show anything unuaual in your case?
>> That could explain the port event with the PLC bit set.
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 2c255d0620b0..a2099d42e490 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -1592,7 +1592,7 @@ static void handle_port_status(struct xhci_hcd *xhci,
> {
>        struct usb_hcd *hcd;
>        u32 port_id;
> -       u32 portsc, cmd_reg;
> +       u32 portsc, portpmsc, cmd_reg;
>        int max_ports;
>        int slot_id;
>        unsigned int hcd_portnum;
> @@ -1634,9 +1634,10 @@ static void handle_port_status(struct xhci_hcd *xhci,
>        bus_state = &port->rhub->bus_state;
>        hcd_portnum = port->hcd_portnum;
>        portsc = readl(port->addr);
> +       portpmsc = readl(port->addr + PORTPMSC);
> 
> -       xhci_dbg(xhci, "Port change event, %d-%d, id %d, portsc: 0x%x\n",
> -                hcd->self.busnum, hcd_portnum + 1, port_id, portsc);
> +       xhci_dbg(xhci, "Port change event, %d-%d, id %d, portsc: 0x%x, portpmsc 0x%0x\n",
> +                hcd->self.busnum, hcd_portnum + 1, port_id, portsc, portpmsc);
> 
>        trace_xhci_handle_port_status(hcd_portnum, portsc);
> 
> [  685.460054] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03, portpmsc 0x1
> [  685.460062] xhci_hcd 0000:00:14.0: resume root hub
> [  685.460079] xhci_hcd 0000:00:14.0: handle_port_status: starting port polling.
> [  685.460094] xhci_hcd 0000:00:14.0: xhci_hub_status_data: stopping port polling.
> [  685.521685] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
> [  685.521695] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
> [  685.521699] PM: Device 0000:00:14.0 failed to suspend async: error -16
> 
> So after disabling LPM, it takes a long time to complete L1 transition, before transitioning to L0.
> 
> Kai-Heng
> 
>> 
>> I think we can avoid a readl_poll_timeout() solution in this case.
>> 
>> -Mathias
Mathias Nyman June 23, 2020, 8:38 a.m. UTC | #6
On 19.6.2020 17.19, Kai-Heng Feng wrote:
> Hi Mathias,
> 
>> On Jun 9, 2020, at 18:15, Kai-Heng Feng <kai.heng.feng@canonical.com> wrote:
>>
>>
>>
>>> On Jun 8, 2020, at 19:21, Mathias Nyman <mathias.nyman@linux.intel.com> wrote:
>>>
>>> On 20.5.2020 13.18, Kai-Heng Feng wrote:
>>>> USB2 devices with LPM enabled may interrupt the system suspend:
>>>> [  932.510475] usb 1-7: usb suspend, wakeup 0
>>>> [  932.510549] hub 1-0:1.0: hub_suspend
>>>> [  932.510581] usb usb1: bus suspend, wakeup 0
>>>> [  932.510590] xhci_hcd 0000:00:14.0: port 9 not suspended
>>>> [  932.510593] xhci_hcd 0000:00:14.0: port 8 not suspended
>>>> ..
>>>> [  932.520323] xhci_hcd 0000:00:14.0: Port change event, 1-7, id 7, portsc: 0x400e03
>>>
>>> 400e03 = Connected, Enabled, U0 with port ink state change flag (PLC) set.
>>>
>>>> ..
>>>> [  932.591405] PM: pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
>>>> [  932.591414] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -16
>>>> [  932.591418] PM: Device 0000:00:14.0 failed to suspend async: error -16
>>>>
>>>> During system suspend, USB core will let HC suspends the device if it
>>>> doesn't have remote wakeup enabled and doesn't have any children.
>>>> However, from the log above we can see that the usb 1-7 doesn't get bus
>>>> suspended due to not in U0. After a while the port finished U2 -> U0
>>>> transition, interrupts the suspend process.
>>>
>>> In USB2 HW link PM the PLC flag should not be set in U2Exit -> U0 transitions.
>>> Only case we should see a port change event is U2Entry -> U0 due to STALL or
>>> error/timeout. (see xhci 4.23.5.1.1.1)
>>>
>>>>
>>>> The observation is that after disabling LPM, port doesn't transit to U0
>>>> immediately and can linger in U2. xHCI spec 4.23.5.2 states that the
>>>> maximum exit latency for USB2 LPM should be BESL + 10us. The BESL for
>>>> the affected device is advertised as 400us, which is still not enough
>>>> based on my testing result.
>>>>
>>>> So let's use the maximum permitted latency, 10000, to poll for U0
>>>> status to solve the issue.
>>>
>>> I don't recall all details, but it could be that disabling LPM before suspend
>>> is unnecessary. 
>>> At least xhci should be able to set a port to U3 from U1 and U2 (see xhci 4.15.1)
>>> so that is one change that could be done to xhci_bus_suspend()
>>
>> Yes, put the device to U3 directly does the trick.
> 
> As discussed, will you pick this series over the v2?
> Or is there anything I should improve for this one?
> 
> Kai-Heng

Added this to queueI

Thanks
-Mathias
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 14181a7ea375..1058f604975b 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -4474,6 +4474,9 @@  static int xhci_set_usb2_hardware_lpm(struct usb_hcd *hcd,
 			mutex_lock(hcd->bandwidth_mutex);
 			xhci_change_max_exit_latency(xhci, udev, 0);
 			mutex_unlock(hcd->bandwidth_mutex);
+			readl_poll_timeout(ports[port_num]->addr, pm_val,
+					   (pm_val & PORT_PLS_MASK) == XDEV_U0,
+					   100, 10000);
 			return 0;
 		}
 	}