bisected regression, v3.5 -> next-20120724: PCI PM causes USB hotplug failure

Message ID 1343618131.3874.37.camel@yhuang-dev (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Huang, Ying July 30, 2012, 3:15 a.m. UTC
On Fri, 2012-07-27 at 11:11 +0200, Bjørn Mork wrote:
> Huang Ying <ying.huang@intel.com> writes:
> 
> > Do you have time to try the following patch?
> >
> > Best Regards,
> > Huang Ying
> >
> > ---
> >  drivers/pci/pci-driver.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -280,8 +280,12 @@ static long local_pci_probe(void *_ddi)
> >  {
> >  	struct drv_dev_and_id *ddi = _ddi;
> >  	struct device *dev = &ddi->dev->dev;
> > +	struct device *parent = dev->parent;
> >  	int rc;
> >  
> > +	/* The parent bridge must be in active state when probing */
> > +	if (parent)
> > +		pm_runtime_get_sync(parent);
> >  	/* Unbound PCI devices are always set to disabled and suspended.
> >  	 * During probe, the device is set to enabled and active and the
> >  	 * usage count is incremented.  If the driver supports runtime PM,
> > @@ -298,6 +302,8 @@ static long local_pci_probe(void *_ddi)
> >  		pm_runtime_set_suspended(dev);
> >  		pm_runtime_put_noidle(dev);
> >  	}
> > +	if (parent)
> > +		pm_runtime_put(parent);
> >  	return rc;
> >  }
> >  
> 
> 
> Yup, that worked in the quick test I just did.
> 
>  lspci reading the device config will still not wake the bridge, but I
> assume that is intentional?  But loading the device driver now wakes
> both the bridge and the device, so that works.

Do you have time to test the following patch to fix the lspci issue?

Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg

This patch fixes the following bug:

http://marc.info/?l=linux-pci&m=134338059022620&w=2

In that report, lspci does not work properly when a device and its
parent bridge (such as a PCIe port) are both suspended, because the
device's configuration space registers are not accessible while the
parent bridge is suspended.

To solve the issue, the bridge/PCIe port connected to the device is
put into the active state before the configuration space registers
are read or written.

To avoid resuming and suspending the PCIe port for every configuration
register access, a small delay is added before the PCIe port is
suspended.

Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/pci/pci-sysfs.c        |   14 ++++++++++++++
 drivers/pci/pcie/portdrv_pci.c |    9 +++++++++
 2 files changed, 23 insertions(+)
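
The effect of the delayed suspend described above can be sketched with a
small user-space model. This is not kernel code: the struct, the function
names and the abstract "ticks" are all hypothetical, and the model only
imitates the get/put usage counting around each config access. It counts
how often the port has to be resumed during a burst of config reads, with
and without a suspend delay.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a PCIe port; not a kernel structure. */
struct model_port {
	int usage;       /* outstanding "get" references */
	bool active;     /* true while the port is resumed */
	long suspend_at; /* tick at which a delayed suspend fires, -1 if none */
	long delay;      /* suspend delay in ticks, 0 means suspend at once */
	int resumes;     /* number of suspended -> active transitions */
};

static void model_init(struct model_port *p, long delay)
{
	p->usage = 0;
	p->active = false;
	p->suspend_at = -1;
	p->delay = delay;
	p->resumes = 0;
}

/* Fire a pending delayed suspend if it is due and the port is unused. */
static void model_tick(struct model_port *p, long now)
{
	if (p->suspend_at >= 0 && now >= p->suspend_at && p->usage == 0) {
		p->active = false;
		p->suspend_at = -1;
	}
}

/* Take a reference, cancel any pending suspend, resume if needed. */
static void model_get_sync(struct model_port *p, long now)
{
	model_tick(p, now);
	p->usage++;
	p->suspend_at = -1;
	if (!p->active) {
		p->active = true;
		p->resumes++;
	}
}

/* Drop a reference; on the last one, suspend now or schedule it. */
static void model_put(struct model_port *p, long now)
{
	assert(p->usage > 0);
	if (--p->usage == 0) {
		if (p->delay == 0)
			p->active = false;
		else
			p->suspend_at = now + p->delay;
	}
}

/*
 * Issue `reads` config accesses, one every `gap` ticks, each wrapped in
 * get/put on the port.  Returns how many times the port was resumed.
 */
static int model_config_reads(long delay, int reads, long gap)
{
	struct model_port port;
	long now = 0;

	model_init(&port, delay);
	for (int i = 0; i < reads; i++, now += gap) {
		model_get_sync(&port, now);
		/* the config register access would happen here */
		model_put(&port, now);
	}
	return port.resumes;
}
```

In this model, a burst of back-to-back reads resumes the port on every
access when the delay is zero, but only once when the delay is longer than
the gap between accesses. Reads spaced further apart than the delay still
resume the port each time.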



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bjørn Mork July 30, 2012, 8:08 a.m. UTC | #1
Huang Ying <ying.huang@intel.com> writes:

> Do you have time to test the following patch to fix the lspci issue?
>
> Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg


Sure.  But keep this going and I will file a request for modular build
of the PCI subsystem :-)

The patch works fine for me:


nemi:/home/bjorn# lspci -t
-[0000:00]-+-00.0
           +-02.0
           +-02.1
           +-03.0
           +-19.0
           +-1a.0
           +-1a.1
           +-1a.2
           +-1a.7
           +-1b.0
           +-1c.0-[02]--
           +-1c.1-[03]----00.0
           +-1d.0
           +-1d.1
           +-1d.2
           +-1d.7
           +-1e.0-[15]--
           +-1f.0
           +-1f.2
           \-1f.3

nemi:/home/bjorn# lspci -vvnns 1c.1; lspci -vvnns 3:0
00:1c.1 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 [8086:2942] (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: f0500000-f05fffff
        Prefetchable memory behind bridge: 00000000c0400000-00000000c05fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
                        Slot #1, PowerLimit 6.500W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0300c  Data: 4143
        Capabilities: [90] Subsystem: Lenovo Device [17aa:20f3]
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed+ WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [180 v1] Root Complex Link
                Desc:   PortNumber=02 ComponentID=02 EltType=Config
                Link0:  Desc:   TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
                        Addr:   00000000fed1c000
        Kernel driver in use: pcieport

03:00.0 Network controller [0280]: Intel Corporation Ultimate N WiFi Link 5300 [8086:4236]
        Subsystem: Intel Corporation Device [8086:1011]
        Physical Slot: 1
        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 17
        Region 0: Memory at f0500000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 00000000fee0300c  Data: 4183
        Capabilities: [e0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <32us
                        ClockPM+ Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [140 v1] Device Serial Number 00-16-ea-ff-ff-b3-07-88





Bjørn
huang ying July 30, 2012, 1:31 p.m. UTC | #2
On Mon, Jul 30, 2012 at 4:08 PM, Bjørn Mork <bjorn@mork.no> wrote:
> Huang Ying <ying.huang@intel.com> writes:
>
>> Do you have time to test the following patch to fix the lspci issue?
>>
>> Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg
>
>
> Sure.  But keep this going and I will file a request for modular build
> of the PCI subsystem :-)

Why?  All not so old PC hardware has PCI/PCIe devices.

Best Regards,
Huang Ying
Alan Stern July 30, 2012, 2:19 p.m. UTC | #3
On Mon, 30 Jul 2012, Huang Ying wrote:

> > Yup, that worked in the quick test I just did.
> > 
> >  lspci reading the device config will still not wake the bridge, but I
> > assume that is intentional?  But loading the device driver now wakes
> > both the bridge and the device, so that works.
> 
> Do you have time to test the following patch to fix the lspci issue?
> 
> Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg
> 
> This patch fixes the following bug:
> 
> http://marc.info/?l=linux-pci&m=134338059022620&w=2
> 
> Where lspci does not work properly if a device and the corresponding
> parent bridge (such as PCIe port) is suspended.  This is because the
> device configuration space registers will be not accessible if the
> corresponding parent bridge is suspended.
> 
> To solve the issue, the bridge/PCIe port connected to the device is
> put into active state before read/write configuration space registers.

What happens when you run lspci and the device is in D3cold?  Then even 
if the parent bridge is active, lspci will still fail.

It seems that in this case you need to resume the device itself, not 
just its parent.

Alan Stern
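
The point raised here can be illustrated with a small user-space model.
It is not kernel code and every name in it is hypothetical. It assumes,
as the runtime PM core normally arranges, that resuming a device resumes
its parent first, and that config space is reachable only when the device
and every bridge above it are active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a device; not a kernel structure. */
struct model_dev {
	struct model_dev *parent;
	bool active; /* false models a runtime-suspended (e.g. D3cold) device */
	int usage;   /* outstanding references */
};

/* Resume a device, resuming its ancestors first (parent before child). */
static void model_resume(struct model_dev *dev)
{
	if (dev->parent)
		model_resume(dev->parent);
	dev->usage++;
	dev->active = true;
}

/* Drop the reference taken by model_resume(), child before parent. */
static void model_release(struct model_dev *dev)
{
	assert(dev->usage > 0);
	if (--dev->usage == 0)
		dev->active = false;
	if (dev->parent)
		model_release(dev->parent);
}

/* Config space is reachable only if the device and all ancestors are active. */
static bool model_config_accessible(const struct model_dev *dev)
{
	for (; dev; dev = dev->parent)
		if (!dev->active)
			return false;
	return true;
}
```

With a suspended port and a suspended device below it, resuming only the
port leaves the device's config space unreachable in this model, while
resuming the device makes both active for the duration of the access.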

Bjørn Mork July 30, 2012, 4:57 p.m. UTC | #4
huang ying <huang.ying.caritas@gmail.com> writes:

> On Mon, Jul 30, 2012 at 4:08 PM, Bjørn Mork <bjorn@mork.no> wrote:
>> Huang Ying <ying.huang@intel.com> writes:
>>
>>> Do you have time to test the following patch to fix the lspci issue?
>>>
>>> Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg
>>
>>
>> Sure.  But keep this going and I will file a request for modular build
>> of the PCI subsystem :-)
>
> Why?  All not so old PC hardware has PCI/PCIe devices.

So that I didn't have to reboot all the time to test your new patches...

It was a (bad) joke.  I don't really think my laptop would work all that
well without PCI.


Bjørn
Huang, Ying July 31, 2012, 12:22 a.m. UTC | #5
On Mon, 2012-07-30 at 18:57 +0200, Bjørn Mork wrote:
> huang ying <huang.ying.caritas@gmail.com> writes:
> 
> > On Mon, Jul 30, 2012 at 4:08 PM, Bjørn Mork <bjorn@mork.no> wrote:
> >> Huang Ying <ying.huang@intel.com> writes:
> >>
> >>> Do you have time to test the following patch to fix the lspci issue?
> >>>
> >>> Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg
> >>
> >>
> >> Sure.  But keep this going and I will file a request for modular build
> >> of the PCI subsystem :-)
> >
> > Why?  All not so old PC hardware has PCI/PCIe devices.
> 
> So that I didn't have to reboot all the time to test your new patches...
> 
> It was a (bad) joke.  I don't really think my laptop would work all that
> well without PCI.

Haha, understood now.

Best Regards,
Huang Ying

Huang, Ying July 31, 2012, 12:24 a.m. UTC | #6
On Mon, 2012-07-30 at 10:19 -0400, Alan Stern wrote:
> On Mon, 30 Jul 2012, Huang Ying wrote:
> 
> > > Yup, that worked in the quick test I just did.
> > > 
> > >  lspci reading the device config will still not wake the bridge, but I
> > > assume that is intentional?  But loading the device driver now wakes
> > > both the bridge and the device, so that works.
> > 
> > Do you have time to test the following patch to fix the lspci issue?
> > 
> > Subject: [BUGFIX] PCI/PM: Keep parent bridge active when read/write config reg
> > 
> > This patch fixes the following bug:
> > 
> > http://marc.info/?l=linux-pci&m=134338059022620&w=2
> > 
> > Where lspci does not work properly if a device and the corresponding
> > parent bridge (such as PCIe port) is suspended.  This is because the
> > device configuration space registers will be not accessible if the
> > corresponding parent bridge is suspended.
> > 
> > To solve the issue, the bridge/PCIe port connected to the device is
> > put into active state before read/write configuration space registers.
> 
> What happens when you run lspci and the device is in D3cold?  Then even 
> if the parent bridge is active, lspci will still fail.
> 
> It seems that in this case you need to resume the device itself, not 
> just its parent.

Yes.  Will do that.

Best Regards,
Huang Ying


Patch

--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -464,6 +464,7 @@  pci_read_config(struct file *filp, struc
 		char *buf, loff_t off, size_t count)
 {
 	struct pci_dev *dev = to_pci_dev(container_of(kobj,struct device,kobj));
+	struct device *parent = dev->dev.parent;
 	unsigned int size = 64;
 	loff_t init_off = off;
 	u8 *data = (u8*) buf;
@@ -484,6 +485,9 @@  pci_read_config(struct file *filp, struc
 		size = count;
 	}
 
+	if (parent)
+		pm_runtime_get_sync(parent);
+
 	if ((off & 1) && size) {
 		u8 val;
 		pci_user_read_config_byte(dev, off, &val);
@@ -529,6 +533,9 @@  pci_read_config(struct file *filp, struc
 		--size;
 	}
 
+	if (parent)
+		pm_runtime_put(parent);
+
 	return count;
 }
 
@@ -538,6 +545,7 @@  pci_write_config(struct file* filp, stru
 		 char *buf, loff_t off, size_t count)
 {
 	struct pci_dev *dev = to_pci_dev(container_of(kobj,struct device,kobj));
+	struct device *parent = dev->dev.parent;
 	unsigned int size = count;
 	loff_t init_off = off;
 	u8 *data = (u8*) buf;
@@ -549,6 +557,9 @@  pci_write_config(struct file* filp, stru
 		count = size;
 	}
 	
+	if (parent)
+		pm_runtime_get_sync(parent);
+
 	if ((off & 1) && size) {
 		pci_user_write_config_byte(dev, off, data[off - init_off]);
 		off++;
@@ -587,6 +598,9 @@  pci_write_config(struct file* filp, stru
 		--size;
 	}
 
+	if (parent)
+		pm_runtime_put(parent);
+
 	return count;
 }
 
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -140,9 +140,17 @@  static int pcie_port_runtime_resume(stru
 {
 	return 0;
 }
+
+static int pcie_port_runtime_idle(struct device *dev)
+{
+	/* Delay for a short while to prevent too frequent suspend/resume */
+	pm_schedule_suspend(dev, 10);
+	return -EBUSY;
+}
 #else
 #define pcie_port_runtime_suspend	NULL
 #define pcie_port_runtime_resume	NULL
+#define pcie_port_runtime_idle		NULL
 #endif
 
 static const struct dev_pm_ops pcie_portdrv_pm_ops = {
@@ -155,6 +163,7 @@  static const struct dev_pm_ops pcie_port
 	.resume_noirq	= pcie_port_resume_noirq,
 	.runtime_suspend = pcie_port_runtime_suspend,
 	.runtime_resume = pcie_port_runtime_resume,
+	.runtime_idle	= pcie_port_runtime_idle,
 };
 
 #define PCIE_PORTDRV_PM_OPS	(&pcie_portdrv_pm_ops)