Message ID | 5171D7FD.7070609@fold.natur.cuni.cz (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Hi,

would somebody please comment on the xHCI socket that appears dead after suspend? It is not completely dead: as you can see in step 7, the PME status changes when a mouse is plugged into the socket. True, the mouse appears dead because it gets no power, but I believe that is because xhci_hcd is fooled.

Although I did the testing with pcie_aspm=off, I have just tried pcie_aspm=native as well, with the same results. Either way, the 'suspend to death' is reversible once I force a wakeup of the 0b:00 device (the TI host) with echo on > ...0b:00.0/power/control.

Thanks,
Martin

Martin Mokrejs wrote:
> Hi Sarah,
> does anyone have any comments on this thread? I just retried with the 3.8.8
> kernel and it is still the same issue. I can set the upstream 1c.4 port to 'auto',
> detach the mouse, and 1c.4 does not suspend (due to a recent patch, I think
> around 3.8.5).
> If I also set its downstream 0b:00 to 'auto' and plug in the mouse, the mouse works;
> after I unplug the mouse, 0b:00 goes 'suspended' and the XHCI socket dies.
>
> Here is a comparison of the 'active' state and of the 'suspended to death' state
> (note pcie_aspm=off on my kernel command line):
> --- lspci_vvv_initial.txt 2013-04-20 00:16:11.000000000 +0200
> +++ lspci_vvv_initial__mouse_attached__detached__attached__1c.4_to_auto__detached__0b:00_to_auto.txt 2013-04-20 00:18:38.000000000 +0200
> @@ -484,15 +484,14 @@
>
> 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
> Subsystem: Dell Device 04b3
> - Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> + Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> - Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 16
> Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
>
> If I put 0b:00/control back to 'on', I rescue the XHCI socket.
>
> So, should the TI host be blacklisted so that it is never put into the suspend
> state? I wrote already that I don't think it is necessary, but it looks like nobody
> looked into the lspci files. So, here is my interpretation:
>
> See another test scenario:
>
> 1. When I boot up without any devices attached to the TI host (no laptop-mode-tools), the TI host at 0b:00 is active.
>
> 2. If I enable powersaving by setting the control file of 1c.4 (just to be sure) and of 0b:00 to 'auto',
> 0b:00 becomes suspended after a while. But it is not dead: if I connect a mouse to the XHCI socket,
> it works. But look what such a 'softly suspended' state looks like:
>
> # diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt
> --- lspci_vvv_initial.txt 2013-04-20 01:06:51.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt 2013-04-20 01:08:46.000000000 +0200
> @@ -484,15 +484,14 @@
>
> 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
> Subsystem: Dell Device 04b3
> - Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> + Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> - Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 16
> Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
> #
>
> 3. Now, look what happens if I plug in a mouse (works, as I said) and unplug it, which triggers a deadly suspend,
> although reversible:
>
> # diff -u -w lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto.txt 2013-04-20 01:08:46.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt 2013-04-20 01:10:06.000000000 +0200
> @@ -271,7 +271,7 @@
> Changed: MRL- PresDet- LinkState+
> RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> RootCap: CRSVisible-
> - RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> + RootSta: PME ReqID 0b00, PMEStatus- PMEPending-
> DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
> LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>
> 4. Interestingly, if I connect a mouse to the socket to show it is "dead", there is a tiny change in lspci:
>
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached.txt 2013-04-20 01:10:06.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt 2013-04-20 01:10:28.000000000 +0200
> @@ -491,7 +491,7 @@
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
>
> 5. I said the port 'suspended to death' can be rescued by echo 'on' > .../*0b:00*/control (the mouse was
> plugged in during the echo command, so we see not only the PME changes but also the D3 to D0 change, because the
> mouse is attached):
>
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt 2013-04-20 01:10:28.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues.txt 2013-04-20 01:12:25.000000000 +0200
> @@ -484,14 +484,15 @@
>
> 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
> Subsystem: Dell Device 04b3
> - Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> + Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> + Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 16
> Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
> + Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
>
> 6. When I unplug the mouse, of course the port does not die, because the control file is set to 'on'.
> I already demonstrated that, but once again, setting 0b:00 to 'auto':
>
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached.txt 2013-04-20 01:13:36.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt 2013-04-20 01:14:41.000000000 +0200
> @@ -484,15 +484,14 @@
>
> 0b:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
> Subsystem: Dell Device 04b3
> - Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> + Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> - Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 16
> Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
> @@ -521,7 +520,7 @@
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> - CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> + CESta: RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
>
> 7. Now, a question to the reader: If I attach the mouse, will it work or not?
>
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto.txt 2013-04-20 01:14:41.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
> @@ -491,7 +491,7 @@
> Region 2: Memory at f7d10000 (64-bit, non-prefetchable) [size=8K]
> Capabilities: [40] Power Management version 3
> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=100mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> - Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME-
> + Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME+
> Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [70] Express (v2) Endpoint, MSI 00
> #
>
> No, it did not work. The situation in step 7 is the same as in step 4. The diff below is likely benign:
>
> # diff -u -w lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt lspci_vvv_initial__1c.4_and_0b\:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b\:00_to_on_rescues__detached__0b\:00_to_auto__attached_dead.txt
> --- lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead.txt 2013-04-20 01:10:28.000000000 +0200
> +++ lspci_vvv_initial__1c.4_and_0b:00_to_auto__mouse_attached_and_works__detached__reattached_but_dead__0b:00_to_on_rescues__detached__0b:00_to_auto__attached_dead.txt 2013-04-20 01:17:59.000000000 +0200
> @@ -520,7 +520,7 @@
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> - CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> + CESta: RxErr- BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr+
> CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> Capabilities: [150 v1] Device Serial Number 08-00-28-00-00-20-00-00
> #
>
> Collected data are at http://195.113.57.32/~mmokrejs/tmp/20130420.tar.bz2 (90kB)
>
> Thanks,
> Martin
>
> Martin Mokrejs wrote:
>>
>> Huang Ying wrote:
>>> Hi, Martin,
>>>
>>> On Wed, 2013-04-03 at 14:16 +0200, Martin Mokrejs wrote:
>>>> Meanwhile, the raw data: http://195.113.57.32/~mmokrejs/tmp/20130402.tar.bz2
>>>> (size 468641 bytes)
>>>
>>> Thanks a lot! Your information is very complete and clear :)
>>>
>>>> They were collected by:
>>>>
>>>> # cat ~/bin/collect_runtime_status.sh
>>>> #!/bin/sh
>>>> grep . /sys/bus/pci/devices/*/power/runtime_status > runtime_status_"$1".txt
>>>> grep . /sys/bus/pci/devices/*/power/control > control_"$1".txt
>>>> cat /proc/interrupts > interrupts_"$1".txt
>>>> cat /proc/iomem > iomem_"$1".txt
>>>> lspci -vvv > lspci_vvv_"$1".txt
>>>> dmesg > dmesg_"$1".txt
>>>> #
>>>>
>>>> Just do 'ls -latr' to see the ordering of the files as they were created.
>>>> The longer the filename, the later in the test process. The names should be
>>>> relatively self-explanatory. In any case, from the log files you should see
>>>> what really happened and can therefore figure out what the (maybe weird)
>>>> long filename really meant.
>>>>
>>>> Sometimes I manually recorded lsusb or dmesg_final.txt, mostly after I did some
>>>> extra tests but did not want to record every step with the above 6 files.
>>>>
>>>> In one or two places I added some of my own notes to the COMMENTS file.
>>>>
>>>> I will try to guide you below to where you can study each of the bugs. Mostly,
>>>> for each bug you need to look into just one subdirectory; the others just
>>>> repeat the same bug under a different kernel version or with another patch.
>>>> However, for the xHCI dead port issue Sarah will need to compare two
>>>> directories with diff: one with the TI-based controller tests, the other with the
>>>> NEC-based tests.
>>>> Especially there, I would do something like:
>>>>
>>>> cd *TI-based; for f in dmesg*; do cut -c 15- $f > /tmp/TI/$f; done
>>>> cd ../*NEC-based; for f in dmesg*; do cut -c 15- $f > /tmp/NEC/$f; done
>>>>
>>>> Then it should be easier to poke through files captured at the same test step,
>>>> like:
>>>>
>>>> diff -u -w /tmp/TI/dmesg_initial__mouse_attached__unplugged__reattached_but_port_dead.txt \
>>>> /tmp/NEC/dmesg_initial__mouse_attached__detached__reattached.txt
>>>>
>>>> Other than that, just diff pairs of files with each other, like:
>>>>
>>>> diff -u -w lspci_vvv_initial.txt lspci_vvv_initial__mouse_attached.txt
>>>>
>>>> Sorry that I sometimes used only a single underscore instead of double underscores
>>>> to separate the test steps from each other in the filename.
>>>>
>>>> Martin Mokrejs wrote:
>>>>> [ +linux-pci and Yinghai, as they already suffered those many emails on individual
>>>>> threads, so one overview email hopefully won't hurt] ;-)
>>>>>
>>>>> Martin Mokrejs wrote:
>>>>>>
>>>>>> Bjorn Helgaas wrote:
>>>>>>> On Tue, Apr 2, 2013 at 9:02 AM, Martin Mokrejs
>>>>>>> <mmokrejs@fold.natur.cuni.cz> wrote:
>>>>>>>> Hi Ying,
>>>>>>>>
>>>>>>>> huang ying wrote:
>>>>>>>
>>>>>>>>> And please give me the full dmesg for boot and incremental dmesg for
>>>>>>>>> operations.
>>>>>>>>
>>>>>>>> The incremental bits are here; the full dmesg I will send only directly to your email, due to its size.
>>>>>>>
>>>>>>> Is there a bugzilla for this issue? Please attach the complete dmesg
>>>>>>> there or somewhere similar so we can all benefit.
>>>>>>
>>>>>> I changed my mind. I am attaching the dmesg here but omitting the linux-acpi
>>>>>> list. After I hear a proposal from Rafel/Bjorn I will open separate bugs.
>>>>>> I thought that the threads I started so far were enough, but yes, dmesg
>>>>>> files don't pass through list filters, so I should move that to bugzilla.
>>>>>>
>>>>>> so far my view of the bugs was:
>>>>>>
>>>>>> 1) acpiphp hotplug broken due to upstream pcieport 1c.7 PME# enabled
>>>>>> (eSATA-based card)
>>>>>
>>>>> Fixed by Ying Huang port_dbg.patch applied over 3.8.5 (fixes acpiphp hotplug
>>>>> of eSATA and Firewire cards, NOT the hotplug of a NEC-based USB3 card -> hence
>>>>> the bug 4) below). Now I can continue using laptop-mode-tools.
>>>>
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_eSATA_testing
>>>> 20130402/3.8.3-vanilla__with_laptop-mode-tools (with some comments in
>>>> COMMENTS file)
>>>
>>> Thanks for your testing!
>>>
>>>>>> 2) xHCI dead due to its suspend - 3.8 series and above
>>>>>
>>>>> Not fixed by port_dbg.patch applied over 3.8.5. Interestingly, a NEC-based
>>>>> XHCI card *in an express card slot* does not suffer this suspend issue,
>>>>> although it is being put into suspend if a device is unplugged.
>>>>
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_TI-based
>>>> 20130402/3.8.5-ying_port-dbg__with_laptop-mode-tools_xHCI_test_NEC-based
>>>>
>>>> Same thing but without the port_dbg.patch:
>>>> 20130402/3.9-rc5__with_2368081__with-latop-mode-tools_xhci_testing/
>>>
>>> It appears that the TI xHCI dead port issue is present even if the PCIe
>>> port never goes suspended. So I think this bug is not related to
>>> PCIe port runtime PM but related to USB xHCI.
>>>
>>> Do you agree Sarah?
>>
>> Although I confirmed with the 20130405.tar.bz2 dataset what Sarah repeated from our
>> past findings in the email which should be in your inbox just now, one thing is
>> puzzling:
>> When I have powersaving enabled upon bootup with NO USB devices attached to the TI
>> controller, effectively while reaching multiuser mode the 0b:00.0 is in a suspend
>> state. But, somehow, the very first mouse plugin works. Only the reject causes
>> more 'aggressive' suspend.
>> As it seems the upstream 1c.4 is not interfering here (in the test Sarah wanted me to do,
>> all control files are 'on' except the end device 0b:00.0), something
>> *else* must be causing the dead port *in conjunction* with the 'suspended' runtime state.
>> Please double check what I wrote initially about the 20130402.tar.bz2 dataset.
>> Notably, I would compare lspci outputs from a cold boot state with no devices
>> attached and suspended 0b:00.0 (the 20130402.tar.bz2 dataset) with the dead port
>> status in lspci (find any in 20130402.tar.bz2 or now in 20130405.tar.bz2).
>>
>> Martin
>>
>>>
>>> [snip]
>>>
>>> Best Regards,
>>> Huang Ying
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
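
For readers who want to follow the lspci comparisons in steps 2 to 7 without diffing whole files, the Power Management status of one device can be pulled out of each capture. The following is only a sketch and not part of the original thread: the function name pm_status is hypothetical, and it assumes captures written by 'lspci -vvv' (as in collect_runtime_status.sh), where device header lines start in column 0 and the Power Management status line reads 'Status: D<n> ...'.

```shell
#!/bin/sh
# Sketch only: pm_status is a hypothetical helper, not something from the thread.
# Usage: pm_status <bus:dev.fn> <lspci -vvv capture file>
# Prints "<D-state> <PME-Enable flag> <PME flag>" for the given device.
pm_status() {
    awk -v dev="$1" '
        # Device header lines start in column 0 with the bus address.
        /^[0-9a-f]/ { in_dev = ($1 == dev) }
        # The PCI "Status: Cap+ ..." line is skipped because its second
        # field is not a D-state; only the PM capability line matches.
        in_dev && $1 == "Status:" && $2 ~ /^D[0-3]/ { print $2, $4, $NF }
    ' "$2"
}
```

Running pm_status 0b:00.0 on each capture in turn would show the transitions reported in the thread: D0 with PME-Enable- while active, D3 with PME-Enable+ once suspended, and PME+ only after the mouse is reattached to the dead port.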
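
The workaround described in the thread (writing 'on' to the power/control file of the 0b:00 device, and 'auto' to allow runtime suspend again) can be wrapped in two small helpers. This is a sketch under stated assumptions, not the author's tooling: the names pm_show, pm_set and PCI_SYSFS are hypothetical, the sysfs layout is the one used by collect_runtime_status.sh, and the full device directory name (for example with a 0000: domain prefix) is an assumption that should be checked on the machine in question.

```shell
#!/bin/sh
# Sketch only: helper names and the PCI_SYSFS variable are illustrative.
# PCI_SYSFS can be overridden, e.g. to point at a test directory.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci/devices}"

# Print "<control> <runtime_status>" for one PCI device directory name.
pm_show() {
    dev="$1"
    printf '%s %s\n' \
        "$(cat "$PCI_SYSFS/$dev/power/control")" \
        "$(cat "$PCI_SYSFS/$dev/power/runtime_status")"
}

# Write 'on' (keep the device awake) or 'auto' (allow runtime suspend)
# to the device's power/control file; reject anything else.
pm_set() {
    dev="$1"
    mode="$2"
    case "$mode" in
        on|auto) printf '%s\n' "$mode" > "$PCI_SYSFS/$dev/power/control" ;;
        *) echo "usage: pm_set <device> on|auto" >&2; return 1 ;;
    esac
}
```

On a real system, writing to power/control needs root. Something like pm_set 0000:0b:00.0 on followed by pm_show 0000:0b:00.0 would correspond to the rescue in step 5, assuming that directory name matches the TI host.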