Message ID | 14ffd1ceeb76200b34e0abfebab134545e504bdc.1486913733.git.lukas@wunner.de (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote: > Commit 68db9bc81436 ("PCI: pciehp: Add runtime PM support for PCIe > hotplug ports") extended runtime PM for PCIe ports to hotplug ports. > However Yinghai Lu reported the following breakage caused by the commit: > > * After disabling a slot on one of his servers, the slot signals PME > which causes pciehp to immediately re-enable the card, thus defeating > manual slot control via sysfs. I don't think we've clearly established this. It surprises me that a Downstream Port would signal PME after turning off power to the slot. Wouldn't that make PME useless? I'd like to look at this in a little more detail, either by turning off PM and pciehp and poking things manually with setpci, or possibly by adding some debug output in the PME ISR and the pciehp ISR and "write command" paths. > * On another server which has a Power Controller for PCIe hotplug ports > (PCIe r3.1, sec 6.7.1.8), the slot cannot be re-enabled after manually > disabling it via sysfs since commit 68db9bc81436 acquired a runtime > ref only *after* enabling power on the slot, yet this particular > SkyLake server requires that the port is in D0 when enabling power. So this server requires that the Downstream Port must be in D0 before we turn on power to the slot below the Downstream Port? That seems like it would be a requirement for *all* systems. If the bridge is not in D0, we can't generate any PCI transactions on the secondary bus (PCI PM r1.1., Table 19), so we can't enumerate anything in the slot. Bjorn
On Tue, Feb 14, 2017 at 04:59:26PM -0600, Bjorn Helgaas wrote: > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote: > > Commit 68db9bc81436 ("PCI: pciehp: Add runtime PM support for PCIe > > hotplug ports") extended runtime PM for PCIe ports to hotplug ports. > > However Yinghai Lu reported the following breakage caused by the commit: > > > > * After disabling a slot on one of his servers, the slot signals PME > > which causes pciehp to immediately re-enable the card, thus defeating > > manual slot control via sysfs. > > I don't think we've clearly established this. It surprises me that a > Downstream Port would signal PME after turning off power to the slot. > Wouldn't that make PME useless? > > I'd like to look at this in a little more detail, either by turning > off PM and pciehp and poking things manually with setpci, or possibly > by adding some debug output in the PME ISR and the pciehp ISR and > "write command" paths. I'm in the To: header of your e-mail but I am not in a position to debug this as I don't have the hardware available. It would have to be done by Yinghai Lu or Intel. This may already be clear to everyone involved, so apologies for stating the obvious. Also, I hope it's clear that these Intel servers are not affected by v6 of this patch as it only enables runtime PM on *Thunderbolt* hotplug ports. > > * On another server which has a Power Controller for PCIe hotplug ports > > (PCIe r3.1, sec 6.7.1.8), the slot cannot be re-enabled after manually > > disabling it via sysfs since commit 68db9bc81436 acquired a runtime > > ref only *after* enabling power on the slot, yet this particular > > SkyLake server requires that the port is in D0 when enabling power. > > So this server requires that the Downstream Port must be in D0 before > we turn on power to the slot below the Downstream Port? Apparently. > That seems like it would be a requirement for *all* systems. Well, it's not mandated by the spec but indeed seems like a good idea. Please note that the present patch does exactly that. > If the bridge is > not in D0, we can't generate any PCI transactions on the secondary bus > (PCI PM r1.1., Table 19), so we can't enumerate anything in the slot. Turning on slot power has nothing to do with accessing the secondary bus. v5 of this patch resumed the hotplug port to D0 when accessing the secondary bus. On top of that, v6 resumes it to D0 when turning on slot power. Thanks, Lukas
On Sat, Feb 18, 2017 at 10:25:16AM +0100, Lukas Wunner wrote: > On Tue, Feb 14, 2017 at 04:59:26PM -0600, Bjorn Helgaas wrote: > > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote: > > > Commit 68db9bc81436 ("PCI: pciehp: Add runtime PM support for PCIe > > > hotplug ports") extended runtime PM for PCIe ports to hotplug ports. > > > However Yinghai Lu reported the following breakage caused by the commit: > > > > > > * After disabling a slot on one of his servers, the slot signals PME > > > which causes pciehp to immediately re-enable the card, thus defeating > > > manual slot control via sysfs. > > > > I don't think we've clearly established this. It surprises me that a > > Downstream Port would signal PME after turning off power to the slot. > > Wouldn't that make PME useless? > > > > I'd like to look at this in a little more detail, either by turning > > off PM and pciehp and poking things manually with setpci, or possibly > > by adding some debug output in the PME ISR and the pciehp ISR and > > "write command" paths. > > I'm in the To: header of your e-mail but I am not in a position to > debug this as I don't have the hardware available. It would have > to be done by Yinghai Lu or Intel. This may already be clear to > everyone involved, so apologies for stating the obvious. It is clear; I was just hoping you would make it as easy as possible for the folks with the hardware by proposing a debugging patch or a series of specific setpci commands to run. > Also, I hope it's clear that these Intel servers are not affected > by v6 of this patch as it only enables runtime PM on *Thunderbolt* > hotplug ports. Right. I just don't understand what's happening, and I don't want this to become folk wisdom before we have more details. > > > * On another server which has a Power Controller for PCIe hotplug ports > > > (PCIe r3.1, sec 6.7.1.8), the slot cannot be re-enabled after manually > > > disabling it via sysfs since commit 68db9bc81436 acquired a runtime > > > ref only *after* enabling power on the slot, yet this particular > > > SkyLake server requires that the port is in D0 when enabling power. > > > > So this server requires that the Downstream Port must be in D0 before > > we turn on power to the slot below the Downstream Port? > > Apparently. > > > That seems like it would be a requirement for *all* systems. > > Well, it's not mandated by the spec but indeed seems like a good idea. > Please note that the present patch does exactly that. A bridge in D3hot still accepts config cycles, so we should be able to use Slot Control to turn slot power for its secondary bus on or off. PCI PM r1.1, Table 16, says the only action we can expect from the bridge is PME, so we can't rely on interrupts for Command Completion, Link State Changed, etc. Unless we wanted to manage the hotplug controller by polling it, I think we have to put the bridge in D0 before sending any commands to it. This seems like something that could be split into a separate patch that wouldn't have anything to do with Thunderbolt. Would that make sense to you? Bjorn
On Sat, Feb 18, 2017 at 06:16:06PM -0600, Bjorn Helgaas wrote: > On Sat, Feb 18, 2017 at 10:25:16AM +0100, Lukas Wunner wrote: > > On Tue, Feb 14, 2017 at 04:59:26PM -0600, Bjorn Helgaas wrote: > > > On Sun, Feb 12, 2017 at 05:07:45PM +0100, Lukas Wunner wrote: > > > > Commit 68db9bc81436 ("PCI: pciehp: Add runtime PM support for PCIe > > > > hotplug ports") extended runtime PM for PCIe ports to hotplug ports. > > > > However Yinghai Lu reported the following breakage caused by the commit: > > > > > > > > * After disabling a slot on one of his servers, the slot signals PME > > > > which causes pciehp to immediately re-enable the card, thus defeating > > > > manual slot control via sysfs. > > > > > > I don't think we've clearly established this. It surprises me that a > > > Downstream Port would signal PME after turning off power to the slot. > > > Wouldn't that make PME useless? > > > > > > I'd like to look at this in a little more detail, either by turning > > > off PM and pciehp and poking things manually with setpci, or possibly > > > by adding some debug output in the PME ISR and the pciehp ISR and > > > "write command" paths. > > > > I'm in the To: header of your e-mail but I am not in a position to > > debug this as I don't have the hardware available. It would have > > to be done by Yinghai Lu or Intel. This may already be clear to > > everyone involved, so apologies for stating the obvious. > > It is clear; I was just hoping you would make it as easy as possible > for the folks with the hardware by proposing a debugging patch or a > series of specific setpci commands to run. > > > Also, I hope it's clear that these Intel servers are not affected > > by v6 of this patch as it only enables runtime PM on *Thunderbolt* > > hotplug ports. > > Right. I just don't understand what's happening, and I don't want > this to become folk wisdom before we have more details. If I understand Yinghai Lu's analysis correctly, the port is in D3hot, we write to config space to enable power on the slot and request a Command Complete interrupt, the port signals PME to request being transferred to D0 so that it can deliver the Command Complete interrupt. So this patch just papers over the issue by disabling PME. :-) The issue doesn't manifest itself on Thunderbolt controllers because (a) they don't support power control and (b) they never send Command Complete interrupts (even though they claim to support it). Since patch [3/8] of v6 of my series enables runtime PM only on *Thunderbolt* hotplug ports, the series should be fine as is, however this isn't really beautiful so if you decide to apply patch [3/8] I'll come up with a fix, or if you'd rather postpone it I'll rework the patch to keep the port runtime resumed whenever and for as long as we're waiting for command completion. I'll not be able to start working on this until next week though. If you do decide to postpone patch [3/8], maybe we can at least come up with an agreeable version of patch [1/8] to go into your 4.11 pull. Thanks, Lukas
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c index ec0b4c11ccd9..d071aa63dac9 100644 --- a/drivers/pci/hotplug/pciehp_ctrl.c +++ b/drivers/pci/hotplug/pciehp_ctrl.c @@ -31,6 +31,7 @@ #include <linux/kernel.h> #include <linux/types.h> #include <linux/slab.h> +#include <linux/pm_runtime.h> #include <linux/pci.h> #include "../pci.h" #include "pciehp.h" @@ -390,6 +391,7 @@ int pciehp_enable_slot(struct slot *p_slot) { u8 getstatus = 0; struct controller *ctrl = p_slot->ctrl; + int retval; pciehp_get_adapter_status(p_slot, &getstatus); if (!getstatus) { @@ -414,7 +416,10 @@ int pciehp_enable_slot(struct slot *p_slot) } } - return board_added(p_slot); + pm_runtime_get_sync(&ctrl->pcie->port->dev); + retval = board_added(p_slot); + pm_runtime_put(&ctrl->pcie->port->dev); + return retval; } /* @@ -424,6 +429,7 @@ int pciehp_disable_slot(struct slot *p_slot) { u8 getstatus = 0; struct controller *ctrl = p_slot->ctrl; + int retval; if (!p_slot->ctrl) return 1; @@ -437,7 +443,10 @@ int pciehp_disable_slot(struct slot *p_slot) } } - return remove_board(p_slot); + pm_runtime_get_sync(&ctrl->pcie->port->dev); + retval = remove_board(p_slot); + pm_runtime_put(&ctrl->pcie->port->dev); + return retval; } int pciehp_sysfs_enable_slot(struct slot *p_slot) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 441083a0d5b0..a1896df274d4 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2241,13 +2241,10 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false; /* - * Hotplug interrupts cannot be delivered if the link is down, - * so parents of a hotplug port must stay awake. In addition, - * hotplug ports handled by firmware in System Management Mode + * Hotplug ports handled by firmware in System Management Mode * may not be put into D3 by the OS (Thunderbolt on non-Macs). - * For simplicity, disallow in general for now. */ - if (bridge->is_hotplug_bridge) + if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge)) return false; if (pci_bridge_d3_force) @@ -2258,6 +2255,13 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return true; /* + * Non-Thunderbolt hotplug ports need further testing before + * enabling D3 on them. + */ + if (bridge->is_hotplug_bridge) + return false; + + /* * It should be safe to put PCIe ports from 2015 or newer * to D3. */
Commit 68db9bc81436 ("PCI: pciehp: Add runtime PM support for PCIe hotplug ports") extended runtime PM for PCIe ports to hotplug ports. However Yinghai Lu reported the following breakage caused by the commit: * After disabling a slot on one of his servers, the slot signals PME which causes pciehp to immediately re-enable the card, thus defeating manual slot control via sysfs. * On another server which has a Power Controller for PCIe hotplug ports (PCIe r3.1, sec 6.7.1.8), the slot cannot be re-enabled after manually disabling it via sysfs since commit 68db9bc81436 acquired a runtime ref only *after* enabling power on the slot, yet this particular SkyLake server requires that the port is in D0 when enabling power. Hence the commit was reverted with d98e0929071e. It is herewith reinstated in modified form: * The Power Controller issue is addressed by no longer acquiring the runtime ref in board_added() and remove_board(), but in their callers. (There's just a single one each.) * An additional condition is added to pci_bridge_d3_possible() such that runtime PM is only enabled on *Thunderbolt* hotplug ports to constrain it to a few well understood chips. In all other cases the feature can be enabled by booting with pcie_port_pm=force. If confidence is high that it works well for everyone, it can be made the default by removing this condition: /* * Non-Thunderbolt hotplug ports need further testing before * enabling D3 on them. */ if (bridge->is_hotplug_bridge) return false; Following is an updated recapitulation of the extra considerations required for hotplug ports as laid out in 68db9bc81436: * The configuration space of the port remains accessible in D3hot, so all the functions to read or modify the Slot Status and Slot Control registers need not be modified. * However D0 is required to access devices on the secondary bus. This happens in pciehp_check_link_status() and pciehp_configure_device() (both called from board_added()) and in pciehp_unconfigure_device() (called from remove_board()), so acquire a runtime PM ref for their invocation. * The hotplug port stays active as long as it has active children. If all hotplugged devices below the port runtime suspend, the port is allowed to runtime suspend as well. Plug and unplug detection continues to work in D3hot. * Hotplug interrupts are delivered in-band, which requires parents of the hotplug port to stay in D0. For hotplug-capable root ports this is a non-issue. For cascaded hotplug ports, side-band signaling is required. E.g. with Thunderbolt, a plug event at the end of the chain is signaled through the CIO switching fabric to the NHI regardless of PCIe ports on the chain being in D3. It is then the NHI's job to runtime resume the PCIe port on which the plug event occurred. * Runtime PM may only be allowed if the hotplug port is handled natively by the OS. On ACPI systems, the port may alternatively be handled by the firmware and things break if the OS puts the port into D3 behind the firmware's back: E.g. Thunderbolt hotplug ports on non-Macs are handled by Intel's firmware in System Management Mode and the firmware is known to access devices on the port's secondary bus without checking first if the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=193951 Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Erik Veijola <erik.veijola@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> --- drivers/pci/hotplug/pciehp_ctrl.c | 13 +++++++++++-- drivers/pci/pci.c | 14 +++++++++----- 2 files changed, 20 insertions(+), 7 deletions(-)