Message ID | 20140121211741.GA12657@xanatos (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Sarah, Sarah Sharp <sarah.a.sharp@linux.intel.com> writes: > On Sat, Jan 18, 2014 at 10:49:17PM +0100, Arnaud Ebalard wrote: >> Hi, >> >> I have added Thomas in the recipients, because I guess he may be of some >> help debugging the issue further. Thomas, the beginning of the thread is >> here: http://thread.gmane.org/gmane.linux.usb.general/101531 > > ... > >> I started suspecting the introduction of MSI support in Marvell PCIe >> host controller driver (FL1009 is on the PCIe bus) and compiled a >> a 3.13.0-rc8 w/ CONFIG_PCI_MSI disabled (it was enabled in all my >> previous tests): I did not manage to reproduce the issue with this >> kernel. As a side note, commits 5b4deb6526bd, 31f614edb726 and >> 627dfcc249e2 are >> >> ATM, I do not know if the problem is related to a bug in introduced MSI >> support or some weird incompatibility of that functionality with the >> FL1009 which would require some quirk in XHCI stack. > > We've actually had issues in the past with Fresco Logic hosts not > supporting MSI properly, even though the PCI devices claim to have MSI > support. So turning off CONFIG_PCI_MSI may actually mean the Fresco > Logic host is to blame, rather than the Marvell patches. I assume MSI > wouldn't have been turned on for the Fresco Logic host unless the parent > PCI host controller supported it. > > Let's see if the Fresco Logic host is really the root cause. Please > apply the this patch to 3.13.0-rc8 and recompile with CONFIG_PCI_MSI > enabled: > > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c > index 6c03584ac15f..74748444c040 100644 > --- a/drivers/usb/host/xhci-pci.c > +++ b/drivers/usb/host/xhci-pci.c > @@ -30,6 +30,7 @@ > /* Device for a quirk */ > #define PCI_VENDOR_ID_FRESCO_LOGIC 0x1b73 > #define PCI_DEVICE_ID_FRESCO_LOGIC_PDK 0x1000 > +#define PCI_DEVICE_ID_FRESCO_LOGIC_FL1009 0x1009 > #define PCI_DEVICE_ID_FRESCO_LOGIC_FL1400 0x1400 > > #define PCI_VENDOR_ID_ETRON 0x1b6f > @@ -63,6 +64,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) > > /* Look for vendor-specific quirks */ > if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC && > + pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_FL1009) > + xhci->quirks |= XHCI_BROKEN_MSI; > + if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC && > (pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK || > pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_FL1400)) { > if (pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK && With the patch applied on top of 3.13.0 kernel recompiled w/ CONFIG_PCI_MSI enabled, I cannot reproduce the bug. I guess you can add my: Reported-and-tested-By: Arnaud Ebalard <arno@natisbad.org> Since you'll have to push the patch to -stable team at least for 3.13, I wonder if it would not make sense to extend that at least to 3.12. and possibly 3.10 (3.2 is still widely used but I wonder if it makes sense to go that far). Cheers, a+
On Wed, Jan 22, 2014 at 11:23:23PM +0100, Arnaud Ebalard wrote: > With the patch applied on top of 3.13.0 kernel recompiled w/ > CONFIG_PCI_MSI enabled, I cannot reproduce the bug. I guess > you can add my: > > Reported-and-tested-By: Arnaud Ebalard <arno@natisbad.org> > > Since you'll have to push the patch to -stable team at least for 3.13, > I wonder if it would not make sense to extend that at least to 3.12. > and possibly 3.10 (3.2 is still widely used but I wonder if it makes > sense to go that far). Can you pinpoint the commit that introduced the regression? thx, Jason.
Hi Jason, Jason Cooper <jason@lakedaemon.net> writes: > On Wed, Jan 22, 2014 at 11:23:23PM +0100, Arnaud Ebalard wrote: >> With the patch applied on top of 3.13.0 kernel recompiled w/ >> CONFIG_PCI_MSI enabled, I cannot reproduce the bug. I guess >> you can add my: >> >> Reported-and-tested-By: Arnaud Ebalard <arno@natisbad.org> >> >> Since you'll have to push the patch to -stable team at least for 3.13, >> I wonder if it would not make sense to extend that at least to 3.12. >> and possibly 3.10 (3.2 is still widely used but I wonder if it makes >> sense to go that far). > > Can you pinpoint the commit that introduced the regression? f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." Technically, this is not per se the commit which introduced the regression but the one that *partially* fixed it by introducing the XHCI quirk to skip MSI enabling for Fresco Logic chips. The thing is it should have included the FL1009 in the targets. Sarah, can you confirm this? Jason, the logic is summarized here, AFAICT: commit 455f58925247e8a1a1941e159f3636ad6ee4c90b Author: Oliver Neukum <oneukum@suse.de> Date: Mon Sep 30 15:50:54 2013 +0200 xhci: quirk for extra long delay for S4 It has been reported that this chipset really cannot sleep without this extraordinary delay. This patch should be backported, in order to ensure this host functions under stable kernels. The last quirk for Fresco Logic hosts (commit bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 "xhci: Extend Fresco Logic MSI quirk.") was backported to stable kernels as old as 2.6.36. Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: stable@vger.kernel.org commit bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 Author: Sarah Sharp <sarah.a.sharp@linux.intel.com> Date: Wed Oct 17 13:44:06 2012 -0700 xhci: Extend Fresco Logic MSI quirk. Ali reports that plugging a device into the Fresco Logic xHCI host with PCI device ID 1400 produces an IRQ error: do_IRQ: 3.176 No irq handler for vector (irq -1) Other early Fresco Logic host revisions don't support MSI, even though their PCI config space claims they do. Extend the quirk to disabling MSI to this chipset revision. Also enable the short transfer quirk, since it's likely this revision also has that quirk, and it should be harmless to enable. <SNIP> This patch should be backported to stable kernels as old as 2.6.36, that contain the commit f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reported-by: A Sh <smr.ash1991@gmail.com> Tested-by: A Sh <smr.ash1991@gmail.com> Cc: stable@vger.kernel.org commit f5182b4155b9d686c5540a6822486400e34ddd98 Author: Sarah Sharp <sarah.a.sharp@linux.intel.com> Date: Thu Jun 2 11:33:02 2011 -0700 xhci: Disable MSI for some Fresco Logic hosts. Some Fresco Logic hosts, including those found in the AUAU N533V laptop, advertise MSI, but fail to actually generate MSI interrupts. Add a new xHCI quirk to skip MSI enabling for the Fresco Logic host controllers. Fresco Logic confirms that all chips with PCI vendor ID 0x1b73 and device ID 0x1000, regardless of PCI revision ID, do not support MSI. This should be backported to stable kernels as far back as 2.6.36, which was the first kernel to support MSI on xHCI hosts. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reported-by: Sergey Galanov <sergey.e.galanov@gmail.com> Cc: stable@kernel.org
On Wed, Jan 22, 2014 at 11:43:16PM +0100, Arnaud Ebalard wrote: > Hi Jason, > > Jason Cooper <jason@lakedaemon.net> writes: > > > On Wed, Jan 22, 2014 at 11:23:23PM +0100, Arnaud Ebalard wrote: > >> With the patch applied on top of 3.13.0 kernel recompiled w/ > >> CONFIG_PCI_MSI enabled, I cannot reproduce the bug. I guess > >> you can add my: > >> > >> Reported-and-tested-By: Arnaud Ebalard <arno@natisbad.org> > >> > >> Since you'll have to push the patch to -stable team at least for 3.13, > >> I wonder if it would not make sense to extend that at least to 3.12. > >> and possibly 3.10 (3.2 is still widely used but I wonder if it makes > >> sense to go that far). > > > > Can you pinpoint the commit that introduced the regression? > > f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." > > Technically, this is not per se the commit which introduced the > regression but the one that *partially* fixed it by introducing the XHCI > quirk to skip MSI enabling for Fresco Logic chips. The thing is it > should have included the FL1009 in the targets. Sarah, can you confirm > this? I don't know if it should have included FL1009, it was just a guess, based on the fact that the 0x1000 and 0x1400 devices did need MSI disabled. I can attempt to ask the Fresco Logic folks I know, but I'm not sure if/when I'll get a response back. That still doesn't necessarily rule out MSI issues in the Marvell PCI host controller code. Can you attach another PCI device with MSI support under the host and see if it works? Sarah Sharp > Jason, the logic is summarized here, AFAICT: > > commit 455f58925247e8a1a1941e159f3636ad6ee4c90b > Author: Oliver Neukum <oneukum@suse.de> > Date: Mon Sep 30 15:50:54 2013 +0200 > > xhci: quirk for extra long delay for S4 > > It has been reported that this chipset really cannot > sleep without this extraordinary delay. > > This patch should be backported, in order to ensure this host functions > under stable kernels. The last quirk for Fresco Logic hosts (commit > bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 "xhci: Extend Fresco Logic MSI > quirk.") was backported to stable kernels as old as 2.6.36. > > Signed-off-by: Oliver Neukum <oneukum@suse.de> > Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> > Cc: stable@vger.kernel.org > > > commit bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 > Author: Sarah Sharp <sarah.a.sharp@linux.intel.com> > Date: Wed Oct 17 13:44:06 2012 -0700 > > xhci: Extend Fresco Logic MSI quirk. > > Ali reports that plugging a device into the Fresco Logic xHCI host with > PCI device ID 1400 produces an IRQ error: > > do_IRQ: 3.176 No irq handler for vector (irq -1) > > Other early Fresco Logic host revisions don't support MSI, even though > their PCI config space claims they do. Extend the quirk to disabling > MSI to this chipset revision. Also enable the short transfer quirk, > since it's likely this revision also has that quirk, and it should be > harmless to enable. > > <SNIP> > > This patch should be backported to stable kernels as old as 2.6.36, that > contain the commit f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: > Disable MSI for some Fresco Logic hosts." > > Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> > Reported-by: A Sh <smr.ash1991@gmail.com> > Tested-by: A Sh <smr.ash1991@gmail.com> > Cc: stable@vger.kernel.org > > > commit f5182b4155b9d686c5540a6822486400e34ddd98 > Author: Sarah Sharp <sarah.a.sharp@linux.intel.com> > Date: Thu Jun 2 11:33:02 2011 -0700 > > xhci: Disable MSI for some Fresco Logic hosts. > > Some Fresco Logic hosts, including those found in the AUAU N533V laptop, > advertise MSI, but fail to actually generate MSI interrupts. Add a new > xHCI quirk to skip MSI enabling for the Fresco Logic host controllers. > Fresco Logic confirms that all chips with PCI vendor ID 0x1b73 and device > ID 0x1000, regardless of PCI revision ID, do not support MSI. > > This should be backported to stable kernels as far back as 2.6.36, which > was the first kernel to support MSI on xHCI hosts. > > Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> > Reported-by: Sergey Galanov <sergey.e.galanov@gmail.com> > Cc: stable@kernel.org
Hi Sarah, Sarah Sharp <sarah.a.sharp@linux.intel.com> writes: >> > Can you pinpoint the commit that introduced the regression? >> >> f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." >> >> Technically, this is not per se the commit which introduced the >> regression but the one that *partially* fixed it by introducing the XHCI >> quirk to skip MSI enabling for Fresco Logic chips. The thing is it >> should have included the FL1009 in the targets. Sarah, can you confirm >> this? > > I don't know if it should have included FL1009, it was just a guess, > based on the fact that the 0x1000 and 0x1400 devices did need MSI > disabled. I can attempt to ask the Fresco Logic folks I know, but I'm > not sure if/when I'll get a response back. > > That still doesn't necessarily rule out MSI issues in the Marvell PCI > host controller code. Can you attach another PCI device with MSI > support under the host and see if it works? The various Armada-based devices I have are NAS which do not have PCIe slots to plug additional devices (everything is soldered). I don't know which device Thomas used for its tests. Just in case, I also added Willy in CC: who have various boards and may also have done more test with additional PCIe devices and CONFIG_PCI_MSI enabled on 3.13 kernel. Cheers, a+
On Thu, Jan 23, 2014 at 09:24:41AM +0100, Arnaud Ebalard wrote: > Hi Sarah, > > Sarah Sharp <sarah.a.sharp@linux.intel.com> writes: > > >> > Can you pinpoint the commit that introduced the regression? > >> > >> f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." > >> > >> Technically, this is not per se the commit which introduced the > >> regression but the one that *partially* fixed it by introducing the XHCI > >> quirk to skip MSI enabling for Fresco Logic chips. The thing is it > >> should have included the FL1009 in the targets. Sarah, can you confirm > >> this? > > > > I don't know if it should have included FL1009, it was just a guess, > > based on the fact that the 0x1000 and 0x1400 devices did need MSI > > disabled. I can attempt to ask the Fresco Logic folks I know, but I'm > > not sure if/when I'll get a response back. > > > > That still doesn't necessarily rule out MSI issues in the Marvell PCI > > host controller code. Can you attach another PCI device with MSI > > support under the host and see if it works? > > The various Armada-based devices I have are NAS which do not have PCIe > slots to plug additional devices (everything is soldered). I don't know > which device Thomas used for its tests. Just in case, I also added Willy > in CC: who have various boards and may also have done more test with > additional PCIe devices and CONFIG_PCI_MSI enabled on 3.13 kernel. I've been running an intel i350 dual-port NIC (igb driver) supporting MSI on the mirabox, and it used to work in 3.10+many of the patches coming from the Free-electrons team. Some recent changes to the PCI code introduced a regression preventing this driver from correctly registering an MSI interrupt, and I did not have enough time to investigate it deep enough to fix it. That said, I know how to hack it to work again, so if it can be of any use, I can run a test on the mirabox (armada370) and on the XPGP board (armadaXP). Willy
Dear Arnaud Ebalard, On Thu, 23 Jan 2014 09:24:41 +0100, Arnaud Ebalard wrote: > The various Armada-based devices I have are NAS which do not have PCIe > slots to plug additional devices (everything is soldered). I don't know > which device Thomas used for its tests. Just in case, I also added Willy > in CC: who have various boards and may also have done more test with > additional PCIe devices and CONFIG_PCI_MSI enabled on 3.13 kernel. The device I've used to test MSI is a e1000e PCIe Intel network card. It uses one MSI interrupt, so admittedly, the MSI testing is quite limited for now. Thomas
Hi Thomas and Sarah, Thomas Petazzoni <thomas.petazzoni@free-electrons.com> writes: > On Thu, 23 Jan 2014 09:24:41 +0100, Arnaud Ebalard wrote: > >> The various Armada-based devices I have are NAS which do not have PCIe >> slots to plug additional devices (everything is soldered). I don't know >> which device Thomas used for its tests. Just in case, I also added Willy >> in CC: who have various boards and may also have done more test with >> additional PCIe devices and CONFIG_PCI_MSI enabled on 3.13 kernel. > > The device I've used to test MSI is a e1000e PCIe Intel network card. > It uses one MSI interrupt, so admittedly, the MSI testing is quite > limited for now. I had a second thought this WE about a previous question from Sarah: my platforms do not have a PCIe extension slots to test other devices but the RN102 does have an additional device connected on the PCIe bus: a Marvell 88SE9170 SATA Controller. I have put below the output of lspci -vvv (on a 3.13.0-rc8 kernel w/ CONFIG_PCI_MSI enabled) in case you can spot something obviously wrong in it: 00:01.0 PCI bridge: Marvell Technology Group Ltd. Device 7846 (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 00010000-00010fff Memory behind bridge: e0000000-e00fffff Prefetchable memory behind bridge: 00000000-000fffff Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR- BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- 00:02.0 PCI bridge: Marvell Technology Group Ltd. Device 7846 (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Bus: primary=00, secondary=02, subordinate=02, sec-latency=0 I/O behind bridge: 0000f000-00000fff Memory behind bridge: e0100000-e01fffff Prefetchable memory behind bridge: 00000000-000fffff Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR- BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- 01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9170 (rev 12) (prog-if 01 [AHCI 1.0]) Subsystem: Marvell Technology Group Ltd. Device 9170 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 105 Region 0: I/O ports at 10010 [size=8] Region 1: I/O ports at 10020 [size=4] Region 2: I/O ports at 10018 [size=8] Region 3: I/O ports at 10024 [size=4] Region 4: I/O ports at 10000 [size=16] Region 5: Memory at e0010000 (32-bit, non-prefetchable) [size=512] Expansion ROM at e0000000 [disabled] [size=64K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: d0020a04 Data: 0f10 Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- Kernel driver in use: ahci 02:00.0 USB controller: Fresco Logic FL1009 USB 3.0 Host Controller (rev 02) (prog-if 30 [XHCI]) Subsystem: Fresco Logic Device 0000 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 104 Region 0: Memory at e0100000 (64-bit, non-prefetchable) [size=64K] Region 2: Memory at e0110000 (64-bit, non-prefetchable) [size=4K] Region 4: Memory at e0111000 (64-bit, non-prefetchable) [size=4K] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [70] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend- LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 unlimited ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [b0] MSI-X: Enable+ Count=8 Masked- Vector table: BAR=2 offset=00000000 PBA: BAR=4 offset=00000000 Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Kernel driver in use: xhci_hcd
Hi Willy, Willy Tarreau <w@1wt.eu> writes: >> > I don't know if it should have included FL1009, it was just a guess, >> > based on the fact that the 0x1000 and 0x1400 devices did need MSI >> > disabled. I can attempt to ask the Fresco Logic folks I know, but I'm >> > not sure if/when I'll get a response back. >> > >> > That still doesn't necessarily rule out MSI issues in the Marvell PCI >> > host controller code. Can you attach another PCI device with MSI >> > support under the host and see if it works? >> >> The various Armada-based devices I have are NAS which do not have PCIe >> slots to plug additional devices (everything is soldered). I don't know >> which device Thomas used for its tests. Just in case, I also added Willy >> in CC: who have various boards and may also have done more test with >> additional PCIe devices and CONFIG_PCI_MSI enabled on 3.13 kernel. > > I've been running an intel i350 dual-port NIC (igb driver) supporting > MSI on the mirabox, and it used to work in 3.10+many of the patches > coming from the Free-electrons team. Some recent changes to the PCI > code introduced a regression preventing this driver from correctly > registering an MSI interrupt, and I did not have enough time to > investigate it deep enough to fix it. That said, I know how to hack > it to work again, so if it can be of any use, I can run a test on > the mirabox (armada370) and on the XPGP board (armadaXP). Thanks for the proposal, Willy. I guess Thomas can tell better than me what kind of tests would help ruling out a problem in MSI support and put the blame on FL chip ;-) Thomas, if you need me to test something on some of my platforms, do not hesitate. Cheers, a+
Hi Sarah, Sarah Sharp <sarah.a.sharp@linux.intel.com> writes: > On Wed, Jan 22, 2014 at 11:43:16PM +0100, Arnaud Ebalard wrote: >> Hi Jason, >> >> Jason Cooper <jason@lakedaemon.net> writes: >> >> > On Wed, Jan 22, 2014 at 11:23:23PM +0100, Arnaud Ebalard wrote: >> >> With the patch applied on top of 3.13.0 kernel recompiled w/ >> >> CONFIG_PCI_MSI enabled, I cannot reproduce the bug. I guess >> >> you can add my: >> >> >> >> Reported-and-tested-By: Arnaud Ebalard <arno@natisbad.org> >> >> >> >> Since you'll have to push the patch to -stable team at least for 3.13, >> >> I wonder if it would not make sense to extend that at least to 3.12. >> >> and possibly 3.10 (3.2 is still widely used but I wonder if it makes >> >> sense to go that far). >> > >> > Can you pinpoint the commit that introduced the regression? >> >> f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." >> >> Technically, this is not per se the commit which introduced the >> regression but the one that *partially* fixed it by introducing the XHCI >> quirk to skip MSI enabling for Fresco Logic chips. The thing is it >> should have included the FL1009 in the targets. Sarah, can you confirm >> this? > > I don't know if it should have included FL1009, it was just a guess, > based on the fact that the 0x1000 and 0x1400 devices did need MSI > disabled. I can attempt to ask the Fresco Logic folks I know, but I'm > not sure if/when I'll get a response back. > > That still doesn't necessarily rule out MSI issues in the Marvell PCI > host controller code. Can you attach another PCI device with MSI > support under the host and see if it works? Unless you have some objections or some positive feedback from Fresco Logic people, can you queue your quirks for FL1009 for 3.14-rc* and -stable? Note that I am just asking, i.e. if you want to wait a bit more, I am not that in a hurry. Cheers, a+
On Mon, Feb 10, 2014 at 07:57:42PM +0100, Arnaud Ebalard wrote: > Sarah Sharp <sarah.a.sharp@linux.intel.com> writes: > > > On Wed, Jan 22, 2014 at 11:43:16PM +0100, Arnaud Ebalard wrote: > >> f5182b4155b9d686c5540a6822486400e34ddd98 "xhci: Disable MSI for some Fresco Logic hosts." > >> > >> Technically, this is not per se the commit which introduced the > >> regression but the one that *partially* fixed it by introducing the XHCI > >> quirk to skip MSI enabling for Fresco Logic chips. The thing is it > >> should have included the FL1009 in the targets. Sarah, can you confirm > >> this? > > > > I don't know if it should have included FL1009, it was just a guess, > > based on the fact that the 0x1000 and 0x1400 devices did need MSI > > disabled. I can attempt to ask the Fresco Logic folks I know, but I'm > > not sure if/when I'll get a response back. > > > > That still doesn't necessarily rule out MSI issues in the Marvell PCI > > host controller code. Can you attach another PCI device with MSI > > support under the host and see if it works? > > Unless you have some objections or some positive feedback from Fresco > Logic people, can you queue your quirks for FL1009 for 3.14-rc* and > -stable? Note that I am just asking, i.e. if you want to wait a bit > more, I am not that in a hurry. Sorry for not getting back to you sooner. The Fresco Logic folks said that the FL1000 and FL1400 hosts are actually the same chipset, and it doesn't support MSI. However, they say the FL1009 *should* support MSI. So that doesn't rule out issues with the Marvell PCI MSI code. I suspect that's actually the root cause, since I haven't gotten any bug reports that the FL1009 doesn't work with MSI enabled on other systems. Sarah Sharp
Sarah, Arnaud, On Thu, 13 Feb 2014 16:09:10 -0800, Sarah Sharp wrote: > > Unless you have some objections or some positive feedback from Fresco > > Logic people, can you queue your quirks for FL1009 for 3.14-rc* and > > -stable? Note that I am just asking, i.e. if you want to wait a bit > > more, I am not that in a hurry. > > Sorry for not getting back to you sooner. The Fresco Logic folks said > that the FL1000 and FL1400 hosts are actually the same chipset, and it > doesn't support MSI. However, they say the FL1009 *should* support MSI. > > So that doesn't rule out issues with the Marvell PCI MSI code. I > suspect that's actually the root cause, since I haven't gotten any bug > reports that the FL1009 doesn't work with MSI enabled on other systems. Ok, I'll try to have a look into this, by re-reading the entire thread, and trying to propose some patches that add debugging details in the Marvell PCI MSI code to try to understand what's going on. Thanks! Thomas
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c index 6c03584ac15f..74748444c040 100644 --- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -30,6 +30,7 @@ /* Device for a quirk */ #define PCI_VENDOR_ID_FRESCO_LOGIC 0x1b73 #define PCI_DEVICE_ID_FRESCO_LOGIC_PDK 0x1000 +#define PCI_DEVICE_ID_FRESCO_LOGIC_FL1009 0x1009 #define PCI_DEVICE_ID_FRESCO_LOGIC_FL1400 0x1400 #define PCI_VENDOR_ID_ETRON 0x1b6f @@ -63,6 +64,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci) /* Look for vendor-specific quirks */ if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC && + pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_FL1009) + xhci->quirks |= XHCI_BROKEN_MSI; + if (pdev->vendor == PCI_VENDOR_ID_FRESCO_LOGIC && (pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK || pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_FL1400)) { if (pdev->device == PCI_DEVICE_ID_FRESCO_LOGIC_PDK &&