Message ID | 20191025190047.38130-1-stuart.w.hayes@gmail.com (mailing list archive) |
---|---|
Series | PCI: pciehp: Do not turn off slot if presence comes up after link |
On Fri, Oct 25, 2019 at 03:00:44PM -0400, Stuart Hayes wrote:
> Alexandru Gagniuc (2):
>   PCI: pciehp: Add support for disabling in-band presence
>   PCI: pciehp: Wait for PDS if in-band presence is disabled
>
> Stuart Hayes (1):
>   PCI: pciehp: Add dmi table for in-band presence disabled

For the whole series,

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

On Fri, Oct 25, 2019 at 03:00:44PM -0400, Stuart Hayes wrote:
> In older PCIe specs, PDS (presence detect) would come up when the
> "in-band" presence detect pin connected, and would be up before DLLLA
> (link active).
>
> In PCIe 4.0 (as an ECN) and in PCIe 5.0, there is a new bit to show if
> in-band presence detection can be disabled for the slot, and another bit
> that disables it--and a recommendation that it should be disabled if it
> can be. In addition, certain OEMs disable in-band presence detection
> without implementing these bits.
>
> This means it is possible to get a "card present" interrupt after the
> link is up and the driver is loaded. This causes an erroneous removal
> of the device driver, followed by an immediate re-probing.
>
> This patch set defines these new bits, uses them to disable in-band
> presence detection if it can be, waits for PDS to go up if in-band
> presence detection is disabled, and adds a DMI table that will let us
> know if we should assume in-band presence is disabled on a system.

FWIW, this series is
Reviewed-by: Lukas Wunner <lukas@wunner.de>

Looking at the patches again today, I only spotted a minor cosmetic
issue:

In patch [1/3] I would have preferred readout of the PCI_EXP_SLTCAP2
register (hunk #3) to be inserted a little further up in pcie_init(),
perhaps before reading the PCI_EXP_LNKCAP register. It just looks
a little out of place at the end of the function. I would have
grouped it together with the other quirks and feature checks further
up in the function and I probably would have amended the ctrl_info()
to print the status of the inband_presence_disabled flag.

In patch [3/3] the DMI check would then likewise have to be moved up
in the function.

Maybe Bjorn can make this change when applying, and if not, it's not
a big deal.

Thanks,

Lukas

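The "wait for PDS to go up" step described in the cover letter can be
sketched as a bounded polling loop. This is a user-space model, not the
driver code: struct fake_slot and fake_read_slot_status() are
hypothetical stand-ins for the config-space read, the timeout is a
caller-supplied value chosen for illustration, and the sleep between
polls is only indicated by a comment. The 0x0040 value is assumed to
match PCI_EXP_SLTSTA_PDS in pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Presence Detect State bit in Slot Status (assumed PCI_EXP_SLTSTA_PDS). */
#define SLTSTA_PDS 0x0040

/* Hypothetical slot model: PDS reads as set after a number of polls. */
struct fake_slot {
	int polls_until_present;	/* polls that still read "not present" */
	int polls;			/* polls performed so far */
};

/* Hypothetical stand-in for reading the Slot Status register. */
static uint16_t fake_read_slot_status(struct fake_slot *s)
{
	s->polls++;
	return s->polls > s->polls_until_present ? SLTSTA_PDS : 0;
}

/*
 * Poll in 10 ms steps until PDS is set or the budget runs out.
 * The do..while shape guarantees at least one read even if
 * timeout_ms is 0 or negative.
 */
static bool wait_for_presence(struct fake_slot *s, int timeout_ms)
{
	do {
		if (fake_read_slot_status(s) & SLTSTA_PDS)
			return true;
		/* a real driver would sleep for 10 ms here */
		timeout_ms -= 10;
	} while (timeout_ms > 0);

	return false;	/* caller would log a timeout message */
}
```

A do..while loop is used so that the status is read at least once
before the timeout is consulted, which matches the v2 changelog entry
in this thread.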
[+cc Libor (thanks for the ping!)]

On Fri, Oct 25, 2019 at 03:00:44PM -0400, Stuart Hayes wrote:
> In older PCIe specs, PDS (presence detect) would come up when the
> "in-band" presence detect pin connected, and would be up before DLLLA
> (link active).
>
> In PCIe 4.0 (as an ECN) and in PCIe 5.0, there is a new bit to show if
> in-band presence detection can be disabled for the slot, and another bit
> that disables it--and a recommendation that it should be disabled if it
> can be. In addition, certain OEMs disable in-band presence detection
> without implementing these bits.
>
> This means it is possible to get a "card present" interrupt after the
> link is up and the driver is loaded. This causes an erroneous removal
> of the device driver, followed by an immediate re-probing.
>
> This patch set defines these new bits, uses them to disable in-band
> presence detection if it can be, waits for PDS to go up if in-band
> presence detection is disabled, and adds a DMI table that will let us
> know if we should assume in-band presence is disabled on a system.
>
> The first two patches in this set come from a patch set that was
> submitted but not accepted many months ago by Alexandru Gagniuc [1].
> The first is unmodified, the second has the commit message and timeout
> modified.
>
> [1] https://patchwork.kernel.org/cover/10909167/
> [v3,0/4] PCI: pciehp: Do not turn off slot if presence comes up after link
>
> v2:
> - modify loop in pcie_wait_for_presence to do..while
>
> v3:
> - remove unused variable declaration
> - modify text of warning message
>
> v4:
> - remove "!!" boolean conversion in an "if" condition for readability
> - add explanation comment in dmi table
>
> Alexandru Gagniuc (2):
>   PCI: pciehp: Add support for disabling in-band presence
>   PCI: pciehp: Wait for PDS if in-band presence is disabled
>
> Stuart Hayes (1):
>   PCI: pciehp: Add dmi table for in-band presence disabled
>
>  drivers/pci/hotplug/pciehp.h     |  1 +
>  drivers/pci/hotplug/pciehp_hpc.c | 50 +++++++++++++++++++++++++++++++-
>  include/uapi/linux/pci_regs.h    |  2 ++
>  3 files changed, 52 insertions(+), 1 deletion(-)

I added the spec reference to the 1/3 commit log, tried to make the
tweaks Lukas suggested (interdiff below), used ctrl_info() instead of
pci_info() (I would actually like to change the whole driver to use
pci_info(), but better to be consistent for now), and applied to
pci/hotplug for v5.7.

Somebody should also update lspci to:

  - Do something with DevCap AttnBtn, AttnInd, PwrInd to indicate that
    they were only defined for PCIe r1.0 and have been explicitly
    undefined since then. If there's a way to identify those 1.0
    devices and only decode those fields for 1.0, that would be nice.

  - Add SltCap2 and SltCtrl2 decoding.

Speak up if you plan to do this so we don't duplicate effort.

Bjorn

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index ae0108b92084..469873b44a8e 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -284,7 +284,7 @@ static void pcie_wait_for_presence(struct pci_dev *pdev)
 		timeout -= 10;
 	} while (timeout > 0);

-	pci_info(pdev, "Timeout waiting for Presence Detect state to be set\n");
+	ctrl_info(ctrl, "Timeout waiting for Presence Detect\n");
 }

 int pciehp_check_link_status(struct controller *ctrl)
@@ -921,6 +921,16 @@ struct controller *pcie_init(struct pcie_device *dev)
 	ctrl->state = list_empty(&subordinate->devices) ? OFF_STATE : ON_STATE;
 	up_read(&pci_bus_sem);

+	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP2, &slot_cap2);
+	if (slot_cap2 & PCI_EXP_SLTCAP2_IBPD) {
+		pcie_write_cmd_nowait(ctrl, PCI_EXP_SLTCTL_IBPD_DISABLE,
+				      PCI_EXP_SLTCTL_IBPD_DISABLE);
+		ctrl->inband_presence_disabled = 1;
+	}
+
+	if (dmi_first_match(inband_presence_disabled_dmi_table))
+		ctrl->inband_presence_disabled = 1;
+
 	/* Check if Data Link Layer Link Active Reporting is implemented */
 	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &link_cap);

@@ -930,7 +940,7 @@ struct controller *pcie_init(struct pcie_device *dev)
 		PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_CC |
 		PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC);

-	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c LLActRep%c%s\n",
+	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c IbPresDis%c LLActRep%c%s\n",
 		(slot_cap & PCI_EXP_SLTCAP_PSN) >> 19,
 		FLAG(slot_cap, PCI_EXP_SLTCAP_ABP),
 		FLAG(slot_cap, PCI_EXP_SLTCAP_PCP),
@@ -941,19 +951,10 @@ struct controller *pcie_init(struct pcie_device *dev)
 		FLAG(slot_cap, PCI_EXP_SLTCAP_HPS),
 		FLAG(slot_cap, PCI_EXP_SLTCAP_EIP),
 		FLAG(slot_cap, PCI_EXP_SLTCAP_NCCS),
+		ctrl->inband_presence_disabled,
 		FLAG(link_cap, PCI_EXP_LNKCAP_DLLLARC),
 		pdev->broken_cmd_compl ? " (with Cmd Compl erratum)" : "");

-	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP2, &slot_cap2);
-	if (slot_cap2 & PCI_EXP_SLTCAP2_IBPD) {
-		pcie_write_cmd_nowait(ctrl, PCI_EXP_SLTCTL_IBPD_DISABLE,
-				      PCI_EXP_SLTCTL_IBPD_DISABLE);
-		ctrl->inband_presence_disabled = 1;
-	}
-
-	if (dmi_first_match(inband_presence_disabled_dmi_table))
-		ctrl->inband_presence_disabled = 1;
-
 	/*
 	 * If empty slot's power status is on, turn power off. The IRQ isn't
 	 * requested yet, so avoid triggering a notification with this command.
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index b464d2f76513..f9701410d3b5 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -681,7 +681,7 @@
 #define PCI_EXP_LNKSTA2 50 /* Link Status 2 */
 #define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 52 /* v2 endpoints with link end here */
 #define PCI_EXP_SLTCAP2 52 /* Slot Capabilities 2 */
-#define PCI_EXP_SLTCAP2_IBPD 0x0001 /* In-band PD Disable Supported */
+#define PCI_EXP_SLTCAP2_IBPD 0x00000001 /* In-band PD Disable Supported */
 #define PCI_EXP_SLTCTL2 56 /* Slot Control 2 */
 #define PCI_EXP_SLTSTA2 58 /* Slot Status 2 */

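The way the interdiff derives inband_presence_disabled can be modelled
in a few lines. This is a sketch only: struct slot_model, its fields,
and init_inband_presence() are hypothetical names, the boolean
dmi_quirk stands in for the dmi_first_match() lookup, and
ibpd_disable_written stands in for the Slot Control write.

```c
#include <stdbool.h>
#include <stdint.h>

/* In-band PD Disable Supported, Slot Capabilities 2 bit 0 (from the diff). */
#define SLTCAP2_IBPD 0x00000001u

/* Hypothetical model of the state touched in pcie_init(). */
struct slot_model {
	uint32_t slot_cap2;		/* value read from Slot Capabilities 2 */
	bool dmi_quirk;			/* platform matched the DMI table */
	bool ibpd_disable_written;	/* models the Slot Control write */
	bool inband_presence_disabled;	/* flag the rest of the driver tests */
};

static void init_inband_presence(struct slot_model *s)
{
	/* Capability advertised: disable in-band presence and record it. */
	if (s->slot_cap2 & SLTCAP2_IBPD) {
		s->ibpd_disable_written = true;
		s->inband_presence_disabled = true;
	}

	/* Quirk path: assume it is disabled even without the capability bit. */
	if (s->dmi_quirk)
		s->inband_presence_disabled = true;
}
```

In this model the flag can end up set through the DMI path while the
capability bit is clear, so the flag and the capability bit are two
separate pieces of information.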
On Mon, Feb 10, 2020 at 06:08:16PM -0600, Bjorn Helgaas wrote:
> used ctrl_info() instead of pci_info() (I would actually like to change
> the whole driver to use pci_info(), but better to be consistent for now)

Most of the ctrl_info() calls prepend "Slot(%s): " to the message.
However that prefix can only be used once pci_hp_initialize() has
been called.

It would probably make sense to change ctrl_info() to always
include the prefix and change those invocations of ctrl_info()
which happen when the slot is not yet or no longer registered,
to pci_info().

> @@ -930,7 +940,7 @@ struct controller *pcie_init(struct pcie_device *dev)
>  		PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_CC |
>  		PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC);
>
> -	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c LLActRep%c%s\n",
> +	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c IbPresDis%c LLActRep%c%s\n",
>  		(slot_cap & PCI_EXP_SLTCAP_PSN) >> 19,
>  		FLAG(slot_cap, PCI_EXP_SLTCAP_ABP),
>  		FLAG(slot_cap, PCI_EXP_SLTCAP_PCP),
> @@ -941,19 +951,10 @@ struct controller *pcie_init(struct pcie_device *dev)
>  		FLAG(slot_cap, PCI_EXP_SLTCAP_HPS),
>  		FLAG(slot_cap, PCI_EXP_SLTCAP_EIP),
>  		FLAG(slot_cap, PCI_EXP_SLTCAP_NCCS),
> +		ctrl->inband_presence_disabled,
>  		FLAG(link_cap, PCI_EXP_LNKCAP_DLLLARC),
>  		pdev->broken_cmd_compl ? " (with Cmd Compl erratum)" : "");

I've just reviewed the resulting commits on pci/hotplug once more and
think there's a small issue here: If ctrl->inband_presence_disabled is 0,
the string will contain ASCII character 0 (end of string) and if it's 1
it will contain ASCII character 1 (start of header). A possible solution
would be FLAG(ctrl->inband_presence_disabled, 1).

(The real solution would probably be to have a printk format for this
kind of thing.)

Thanks,

Lukas

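The %c problem described above can be reproduced in user space. The
sketch below assumes FLAG() is defined as in pciehp_hpc.c (a '+' or '-'
depending on whether the masked bit is set); format_raw() and
format_flag() are hypothetical helpers that print only the tail of the
real message, with LLActRep fixed to '+' for brevity.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed to mirror the driver's helper: '+' if any bit of y is set in x. */
#define FLAG(x, y) (((x) & (y)) ? '+' : '-')

/* Passes the raw 0/1 value to %c, as the first version of the tweak did. */
static int format_raw(char *buf, size_t len, unsigned int disabled)
{
	return snprintf(buf, len, "IbPresDis%c LLActRep%c",
			(int)disabled, '+');
}

/* Converts the value to '+' or '-' first, as suggested in the review. */
static int format_flag(char *buf, size_t len, unsigned int disabled)
{
	return snprintf(buf, len, "IbPresDis%c LLActRep%c",
			FLAG(disabled, 1), '+');
}
```

With disabled == 0, format_raw() still reports 20 characters written,
but the byte after "IbPresDis" is a NUL, so anything treating the
buffer as a C string sees it end there. format_flag() produces
"IbPresDis- LLActRep+" or "IbPresDis+ LLActRep+" instead.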
On Tue, Feb 11, 2020 at 05:49:40AM +0100, Lukas Wunner wrote:
> On Mon, Feb 10, 2020 at 06:08:16PM -0600, Bjorn Helgaas wrote:
> > used ctrl_info() instead of pci_info() (I would actually like to change
> > the whole driver to use pci_info(), but better to be consistent for now)
>
> Most of the ctrl_info() calls prepend "Slot(%s): " to the message.
> However that prefix can only be used once pci_hp_initialize() has
> been called.
>
> It would probably make sense to change ctrl_info() to always
> include the prefix and change those invocations of ctrl_info()
> which happen when the slot is not yet or no longer registered,
> to pci_info().

Ouch, my tweaks were definitely half-baked.

I really like your idea of hoisting the "Slot(%s)" text up into
ctrl_*(). I might rename ctrl_*() to slot_*() or similar to connect
it more with the slot registration.

I'm a little confused about why pci_hp_initialize()/
__pci_hp_initialize()/pci_hp_register()/__pci_hp_register() is such a
rat's nest with hotplug drivers using a mix of them. I wonder if
that could be rationalized and maybe done earlier so all hotplug-
related messages could use the same ctrl_*() logging.

But this is all outside the scope of this patch. I'll look at the
pcie_wait_for_presence() situation and revert to pci_info() if it can
be called when the slot is not registered.

> > @@ -930,7 +940,7 @@ struct controller *pcie_init(struct pcie_device *dev)
> >  		PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_CC |
> >  		PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC);
> >
> > -	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c LLActRep%c%s\n",
> > +	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c IbPresDis%c LLActRep%c%s\n",
> >  		(slot_cap & PCI_EXP_SLTCAP_PSN) >> 19,
> >  		FLAG(slot_cap, PCI_EXP_SLTCAP_ABP),
> >  		FLAG(slot_cap, PCI_EXP_SLTCAP_PCP),
> > @@ -941,19 +951,10 @@ struct controller *pcie_init(struct pcie_device *dev)
> >  		FLAG(slot_cap, PCI_EXP_SLTCAP_HPS),
> >  		FLAG(slot_cap, PCI_EXP_SLTCAP_EIP),
> >  		FLAG(slot_cap, PCI_EXP_SLTCAP_NCCS),
> > +		ctrl->inband_presence_disabled,
> >  		FLAG(link_cap, PCI_EXP_LNKCAP_DLLLARC),
> >  		pdev->broken_cmd_compl ? " (with Cmd Compl erratum)" : "");
>
> I've just reviewed the resulting commits on pci/hotplug once more and
> think there's a small issue here: If ctrl->inband_presence_disabled is 0,
> the string will contain ASCII character 0 (end of string) and if it's 1
> it will contain ASCII character 1 (start of header). A possible solution
> would be FLAG(ctrl->inband_presence_disabled, 1).

Definitely broken, sorry about that. Feels like sort of a
double-negative situation, too. Obviously the hardware bit has to be
"1 means disabled" to be compatible with previous spec versions, but
the code is usually easier to read if we test for something being
*enabled*.

I'll try to figure out something.

Bjorn

On Tue, Feb 11, 2020 at 08:14:44AM -0600, Bjorn Helgaas wrote:
> I'm a little confused about why pci_hp_initialize()/
> __pci_hp_initialize()/pci_hp_register()/__pci_hp_register() is such a
> rat's nest with hotplug drivers using a mix of them.

This is modeled after device registration, which can be done either
in two steps (device_initialize() + device_add()) or in 1 step
(device_register()).

So it's either pci_hp_initialize() + pci_hp_add() or pci_hp_register().

The rationale is provided in the commit message of 51bbf9bee34f
("PCI: hotplug: Demidlayer registration with the core").

> Feels like sort of a
> double-negative situation, too. Obviously the hardware bit has to be
> "1 means disabled" to be compatible with previous spec versions, but
> the code is usually easier to read if we test for something being
> *enabled*.

It's a similar situation with the "DisINTx" bit in the Command
register, which, if disabled, is shown as "DisINTx-" in lspci even
though the more intuitive notion is that INTx is *enabled*. I think
you did the right thing by showing it as "IbPresDis-" because it's
consistent with how it's done elsewhere for similar bits.

Thanks,

Lukas

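The two-step versus one-step registration shape mentioned above can be
illustrated with a toy object. Every name here (struct demo_slot,
demo_slot_initialize(), demo_slot_add(), demo_slot_register()) is
hypothetical and is not a kernel API; the sketch only shows the general
pattern in which the one-step call is the two steps back to back, under
the assumption that the first step is what makes the slot name usable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot object; not the kernel's struct hotplug_slot. */
struct demo_slot {
	const char *name;	/* usable for a "Slot(%s)" style prefix once set */
	bool visible;		/* true once the slot has been published */
};

/* Step 1: set up the object so it can be named and referenced. */
static void demo_slot_initialize(struct demo_slot *s, const char *name)
{
	s->name = name;
	s->visible = false;
}

/* Step 2: publish the already-initialized object. */
static void demo_slot_add(struct demo_slot *s)
{
	s->visible = true;
}

/* One-step variant: both steps with nothing in between. */
static void demo_slot_register(struct demo_slot *s, const char *name)
{
	demo_slot_initialize(s, name);
	demo_slot_add(s);
}
```

A caller that needs to do its own setup while the slot is named but not
yet published would use the two-step form; a caller with no such need
can use the one-step form.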
On Tue, Feb 11, 2020 at 03:32:02PM +0100, Lukas Wunner wrote:
> On Tue, Feb 11, 2020 at 08:14:44AM -0600, Bjorn Helgaas wrote:
> > I'm a little confused about why pci_hp_initialize()/
> > __pci_hp_initialize()/pci_hp_register()/__pci_hp_register() is such a
> > rat's nest with hotplug drivers using a mix of them.
>
> This is modeled after device registration, which can be done either
> in two steps (device_initialize() + device_add()) or in 1 step
> (device_register()).
>
> So it's either pci_hp_initialize() + pci_hp_add() or pci_hp_register().
>
> The rationale is provided in the commit message of 51bbf9bee34f
> ("PCI: hotplug: Demidlayer registration with the core").

Thanks for the pointer. I wrote that down in case I ever try to
figure that out in the future.

Obviously I haven't looked at this in any detail, but it seems like
the sort of thing that all the hotplug drivers should do the same way
regardless of their internal structure, and the slot concept seems
pretty integral to the bridge leading to it.

Maybe this is somehow a consequence of the hotplug drivers being
separated from the enumeration path. Or maybe the slot part could be
split out from the hotplug drivers and done during enumeration. Just
blue sky thinking, I don't pretend to have done any actual research
here.

Bjorn

On Tue, Feb 11, 2020 at 03:32:02PM +0100, Lukas Wunner wrote:
> On Tue, Feb 11, 2020 at 08:14:44AM -0600, Bjorn Helgaas wrote:
> > Feels like sort of a
> > double-negative situation, too. Obviously the hardware bit has to be
> > "1 means disabled" to be compatible with previous spec versions, but
> > the code is usually easier to read if we test for something being
> > *enabled*.
>
> It's a similar situation with the "DisINTx" bit in the Command
> register, which, if disabled, is shown as "DisINTx-" in lspci even
> though the more intuitive notion is that INTx is *enabled*. I think
> you did the right thing by showing it as "IbPresDis-" because it's
> consistent with how it's done elsewhere for similar bits.

Everything else we decode is *capability* bits and IBPD is another
one. So by the principle of least surprise, I propose this:

+	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c IbPresDis%c LLActRep%c%s\n",
+		FLAG(slot_cap2, PCI_EXP_SLTCAP2_IBPD),

That works out to be the same as printing

  inband_presence_disabled ? '+' : '-'

because we always set inband_presence_disabled when
PCI_EXP_SLTCAP2_IBPD is supported.