pciehp is broken from 4.10-rc1

Message ID 20170208084624.GA13086@linux-siqj.site (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Yinghai Lu Feb. 8, 2017, 8:46 a.m. UTC
On Tue, Feb 07, 2017 at 10:08:13AM -0800, Yinghai Lu wrote:
> On Mon, Feb 6, 2017 at 10:08 PM, Lukas Wunner <lukas@wunner.de> wrote:
> > Please retry with a stock v4.10-rc7 kernel and report back if the issue
> > persists.
> 
> sca05-0a81fd7f:~ # echo 1 > /sys/bus/pci/slots/7/power
> [  470.356464] pciehp 0000:73:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> [  470.366662] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  470.375339] pciehp 0000:73:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> [  470.384199] pciehp 0000:73:00.0:pcie004: __pciehp_link_set: lnk_ctrl = 40
> [  470.391789] pciehp 0000:73:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> [  470.391966] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  470.428791] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  472.428147] pciehp 0000:73:00.0:pcie004: Data Link Layer Link Active not set in 1000 msec
> [  473.944158] pci 0000:74:00.0 id reading try 50 times with interval 20 ms to get ffffffff
> [  473.953209] pciehp 0000:73:00.0:pcie004: pciehp_check_link_status: lnk_status = 5001
> [  473.961861] pciehp 0000:73:00.0:pcie004: link training error: status 0x5001
> [  473.969642] pciehp 0000:73:00.0:pcie004: Failed to check link status
> [  473.970721] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  473.970818] pciehp 0000:73:00.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
> [  475.000272] pciehp 0000:73:00.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
> [  475.000880] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  475.017879] pciehp 0000:73:00.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
> [  475.018386] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> -bash: echo: write error: Operation not permitted

The following change makes it work:

---
 drivers/pci/hotplug/pciehp_ctrl.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


With that change applied, I get:

sca05-0a81fd7f:~ # echo 1 > /sys/bus/pci/slots/7/power 
[  300.949937] pci_hotplug: power_write_file: power = 1
[  300.955502] pciehp 0000:73:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
[  300.982557] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  300.991171] pciehp 0000:73:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
[  301.000033] pciehp 0000:73:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[  301.009274] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
[  301.662172] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  301.670827] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0108 from Slot Status
[  301.679376] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up
[  301.685463] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up event ignored; already powering on
[  301.685508] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
[  302.005967] pciehp 0000:73:00.0:pcie004: pciehp_check_link_status: lnk_status = f083
[  302.014859] pci 0000:74:00.0: [15b3:1003] type 00 class 0x0c0600
...

That suggests the following statement from the changelog of commit 68db9bc8,
that powering on the slot does not require the port to be in D0,
-----
  Even turning on slot power doesn't seem
  to require the port to be in D0, at least the PCIe spec doesn't say so
  and I confirmed that by testing with a Thunderbolt controller.
-----
may not hold on this silicon.

Thanks

Yinghai

Comments

Bjorn Helgaas Feb. 18, 2017, 11:46 p.m. UTC | #1
On Wed, Feb 08, 2017 at 12:46:26AM -0800, Yinghai Lu wrote:
> after that change will get:
> 
> sca05-0a81fd7f:~ # echo 1 > /sys/bus/pci/slots/7/power 
> [  300.949937] pci_hotplug: power_write_file: power = 1
> [  300.955502] pciehp 0000:73:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
> [  300.982557] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status

Can anybody explain this Command Completed interrupt?  I think we should be
in this path:

  pciehp_sysfs_enable_slot
    pciehp_enable_slot
      pciehp_get_adapter_status
      pciehp_get_power_status
        "SLOTCTRL a8 value read 17f1" (DLLSCE PWR_OFF PWR_IND_OFF CCIE HPIE ATTN_IND_OFF ABPE)
      board_added
        pciehp_power_on_slot
          pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_ON, PCI_EXP_SLTCTL_PCC)
          "SLOTCTRL a8 write cmd 0" (PWR_ON)
          pciehp_link_enable
            __pciehp_link_set

I don't see a write to SLTCTL between pciehp_get_power_status() and
pciehp_power_on_slot(), so I don't know why we see a Command Completed
interrupt.

> [  300.991171] pciehp 0000:73:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
> [  301.000033] pciehp 0000:73:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
> [  301.009274] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
> [  301.662172] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
> [  301.670827] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0108 from Slot Status
> [  301.679376] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up
> [  301.685463] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up event ignored; already powering on
> [  301.685508] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
> [  302.005967] pciehp 0000:73:00.0:pcie004: pciehp_check_link_status: lnk_status = f083
> [  302.014859] pci 0000:74:00.0: [15b3:1003] type 00 class 0x0c0600

Yinghai Lu Feb. 19, 2017, 1:54 a.m. UTC | #2
On Sat, Feb 18, 2017 at 3:46 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Wed, Feb 08, 2017 at 12:46:26AM -0800, Yinghai Lu wrote:
>> after that change will get:
>>
>> sca05-0a81fd7f:~ # echo 1 > /sys/bus/pci/slots/7/power
>> [  300.949937] pci_hotplug: power_write_file: power = 1
>> [  300.955502] pciehp 0000:73:00.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 17f1
>> [  300.982557] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
>
> Can anybody explain this Command Completed interrupt?  I think we should be
> in this path:
>
>   pciehp_sysfs_enable_slot
>     pciehp_enable_slot
>       pciehp_get_adapter_status
>       pciehp_get_power_status
>         "SLOTCTRL a8 value read 17f1" (DLLSCE PWR_OFF PWR_IND_OFF CCIE HPIE ATTN_IND_OFF ABPE)
>       board_added
>         pciehp_power_on_slot
>           pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_ON, PCI_EXP_SLTCTL_PCC)
>           "SLOTCTRL a8 write cmd 0" (PWR_ON)
>           pciehp_link_enable
>             __pciehp_link_set
>
> I don't see a write to SLTCTL between pciehp_get_power_status() and
> pciehp_power_on_slot(), so I don't know why we see a Command Completed
> interrupt.
>
>> [  300.991171] pciehp 0000:73:00.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0
>> [  301.000033] pciehp 0000:73:00.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
>> [  301.009274] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0010 from Slot Status
>> [  301.662172] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
>> [  301.670827] pciehp 0000:73:00.0:pcie004: pending interrupts 0x0108 from Slot Status
>> [  301.679376] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up
>> [  301.685463] pciehp 0000:73:00.0:pcie004: Slot(7): Link Up event ignored; already powering on
>> [  301.685508] pciehp 0000:73:00.0:pcie004: pciehp_check_link_active: lnk_status = f083
>> [  302.005967] pciehp 0000:73:00.0:pcie004: pciehp_check_link_status: lnk_status = f083
>> [  302.014859] pci 0000:74:00.0: [15b3:1003] type 00 class 0x0c0600

That interrupt should belong to
  pciehp_power_on_slot: SLOTCTRL a8 write cmd 0

We print the debug message after the command write:
        pcie_write_cmd(ctrl, PCI_EXP_SLTCTL_PWR_ON, PCI_EXP_SLTCTL_PCC);
        ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
                 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL,
                 PCI_EXP_SLTCTL_PWR_ON);

Should we change the order of those printouts?

Patch

Index: linux-2.6/drivers/pci/hotplug/pciehp_ctrl.c
===================================================================
--- linux-2.6.orig/drivers/pci/hotplug/pciehp_ctrl.c
+++ linux-2.6/drivers/pci/hotplug/pciehp_ctrl.c
@@ -89,17 +89,17 @@  static int board_added(struct slot *p_sl
 	struct controller *ctrl = p_slot->ctrl;
 	struct pci_bus *parent = ctrl->pcie->port->subordinate;
 
+	pm_runtime_get_sync(&ctrl->pcie->port->dev);
 	if (POWER_CTRL(ctrl)) {
 		/* Power on slot */
 		retval = pciehp_power_on_slot(p_slot);
 		if (retval)
-			return retval;
+			goto err_exit;
 	}
 
 	pciehp_green_led_blink(p_slot);
 
 	/* Check link training status */
-	pm_runtime_get_sync(&ctrl->pcie->port->dev);
 	retval = pciehp_check_link_status(ctrl);
 	if (retval) {
 		ctrl_err(ctrl, "Failed to check link status\n");