Message ID | 20230621185152.105320-3-Smita.KoralahalliChannabasappa@amd.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI: pciehp: Add support for native AER and DPC handling on async remove | expand |
On Wed, Jun 21, 2023 at 06:51:52PM +0000, Smita Koralahalli wrote: > --- a/drivers/pci/hotplug/pciehp_pci.c > +++ b/drivers/pci/hotplug/pciehp_pci.c > @@ -102,6 +102,10 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence) > > pci_lock_rescan_remove(); > > + pcie_capability_clear_word(ctrl->pcie->port, PCI_EXP_DEVCTL2, > + (PCI_EXP_DEVCTL2_ARI | PCI_EXP_DEVCTL2_ATOMIC_REQ | > + PCI_EXP_DEVCTL2_TAG_REQ_EN)); > + Hm, this will clear the bits while the device may still be present. Note that the subsequent pci_stop_and_remove_bus_device() will unbind the driver and may thus cause communication with the device. Can clearing those bits in the hotplug port hamper communication with the device? I'd recommend avoiding that issue altogether by clearing the bits at the end of the function after the call to pci_unlock_rescan_remove(), so that negotiated state of the hotplug port gets cleared after all subordinate devices are de-enumerated. The commit message doesn't point out that PCI_EXP_DEVCTL2_ARI is already being taken care of on enumeration of future subordinate devices in pci_configure_ari() and is only cleared here for good measure. If you intend to configure 10 bit tags and atomic ops on enumeration in future patches, I'd recommend omitting PCI_EXP_DEVCTL2_ARI here and clearing each of the other two bits in the future patches which configure them on enumeration. You don't need braces around the "or" operation for the bits. Thanks, Lukas > /* > * Stopping an SR-IOV PF device removes all the associated VFs, > * which will update the bus->devices list and confuse the > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h > index dc2000e0fe3a..6fbc47f23d52 100644 > --- a/include/uapi/linux/pci_regs.h > +++ b/include/uapi/linux/pci_regs.h > @@ -668,6 +668,7 @@ > #define PCI_EXP_DEVCTL2_IDO_REQ_EN 0x0100 /* Allow IDO for requests */ > #define PCI_EXP_DEVCTL2_IDO_CMP_EN 0x0200 /* Allow IDO for completions */ > #define PCI_EXP_DEVCTL2_LTR_EN 0x0400 /* Enable LTR mechanism */ > +#define PCI_EXP_DEVCTL2_TAG_REQ_EN 0x1000 /* Allow 10 Tags for Requester */ > #define PCI_EXP_DEVCTL2_OBFF_MSGA_EN 0x2000 /* Enable OBFF Message type A */ > #define PCI_EXP_DEVCTL2_OBFF_MSGB_EN 0x4000 /* Enable OBFF Message type B */ > #define PCI_EXP_DEVCTL2_OBFF_WAKE_EN 0x6000 /* OBFF using WAKE# signaling */ > -- > 2.17.1
On Thu, Jun 22, 2023 at 08:31:05AM +0200, Lukas Wunner wrote: > On Wed, Jun 21, 2023 at 06:51:52PM +0000, Smita Koralahalli wrote: > The commit message doesn't point out that PCI_EXP_DEVCTL2_ARI is > already being taken care of on enumeration of future subordinate > devices in pci_configure_ari() and is only cleared here for good > measure. If you intend to configure 10 bit tags and atomic ops > on enumeration in future patches, I'd recommend omitting > PCI_EXP_DEVCTL2_ARI here and clearing each of the other two bits ^^^^^^^^ Sorry, I meant to say "dropping". I.e. once you enable or disable the feature on enumeration and there's no longer a need to unconditionally disable it on de-enumeration, drop clearing the bit here. Thanks, Lukas
On 6/21/2023 11:31 PM, Lukas Wunner wrote: > On Wed, Jun 21, 2023 at 06:51:52PM +0000, Smita Koralahalli wrote: >> --- a/drivers/pci/hotplug/pciehp_pci.c >> +++ b/drivers/pci/hotplug/pciehp_pci.c >> @@ -102,6 +102,10 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence) >> >> pci_lock_rescan_remove(); >> >> + pcie_capability_clear_word(ctrl->pcie->port, PCI_EXP_DEVCTL2, >> + (PCI_EXP_DEVCTL2_ARI | PCI_EXP_DEVCTL2_ATOMIC_REQ | >> + PCI_EXP_DEVCTL2_TAG_REQ_EN)); >> + > > Hm, this will clear the bits while the device may still be present. > Note that the subsequent pci_stop_and_remove_bus_device() will unbind > the driver and may thus cause communication with the device. > Can clearing those bits in the hotplug port hamper communication with > the device? > > I'd recommend avoiding that issue altogether by clearing the bits at > the end of the function after the call to pci_unlock_rescan_remove(), > so that negotiated state of the hotplug port gets cleared after all > subordinate devices are de-enumerated. This is a good point. Thanks! > > The commit message doesn't point out that PCI_EXP_DEVCTL2_ARI is > already being taken care of on enumeration of future subordinate > devices in pci_configure_ari() and is only cleared here for good > measure. If you intend to configure 10 bit tags and atomic ops > on enumeration in future patches, I'd recommend omitting > PCI_EXP_DEVCTL2_ARI here and clearing each of the other two bits > in the future patches which configure them on enumeration. Would it be fair to just reuse pci_enable_atomic_ops_to_root() for Atomic_Ops configuration? > > You don't need braces around the "or" operation for the bits. Sure! Thanks, Smita > > Thanks, > > Lukas > >> /* >> * Stopping an SR-IOV PF device removes all the associated VFs, >> * which will update the bus->devices list and confuse the >> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h >> index dc2000e0fe3a..6fbc47f23d52 100644 >> --- a/include/uapi/linux/pci_regs.h >> +++ b/include/uapi/linux/pci_regs.h >> @@ -668,6 +668,7 @@ >> #define PCI_EXP_DEVCTL2_IDO_REQ_EN 0x0100 /* Allow IDO for requests */ >> #define PCI_EXP_DEVCTL2_IDO_CMP_EN 0x0200 /* Allow IDO for completions */ >> #define PCI_EXP_DEVCTL2_LTR_EN 0x0400 /* Enable LTR mechanism */ >> +#define PCI_EXP_DEVCTL2_TAG_REQ_EN 0x1000 /* Allow 10 Tags for Requester */ >> #define PCI_EXP_DEVCTL2_OBFF_MSGA_EN 0x2000 /* Enable OBFF Message type A */ >> #define PCI_EXP_DEVCTL2_OBFF_MSGB_EN 0x4000 /* Enable OBFF Message type B */ >> #define PCI_EXP_DEVCTL2_OBFF_WAKE_EN 0x6000 /* OBFF using WAKE# signaling */ >> -- >> 2.17.1
[cc += Jay, Felix] On Thu, Jun 22, 2023 at 02:02:12PM -0700, Smita Koralahalli wrote: > Would it be fair to just reuse pci_enable_atomic_ops_to_root() for > Atomic_Ops configuration? Hm, that's a good question. I'm not an expert on that corner of the PCI core. But indeed what you could try is amend that function to not only *set* PCI_EXP_DEVCTL2_ATOMIC_REQ if it's supported, but to also *clear* it if it's not supported. And you'd have to call pci_enable_atomic_ops_to_root() on enumeration, e.g. from pci_init_capabilities(). That should obviate the need to call pci_enable_atomic_ops_to_root() from drivers, so you could probably remove the call from all the drivers which currently call it (amdgpu, infiniband, mellanox), in one separate patch per driver. An then you could drop the EXPORT clause for pci_enable_atomic_ops_to_root() and make it private to the PCI core. So that would be 5 patches (enablement/disablement on enumeration, amendmend of the 3 drivers, making the call private). I'm not sure if anyone will cry foul if you do that but if you want to give it a try, go for it. :) I don't now why commit 430a23689dea, which introduced pci_enable_atomic_ops_to_root(), chose to add it as a library function which is only called from specific drivers, instead of universally enabling the feature for all devices. Adding the commit authors to cc so they can chime in. Thanks, Lukas
On 2023-06-22 23:42, Lukas Wunner wrote: > [cc += Jay, Felix] > > On Thu, Jun 22, 2023 at 02:02:12PM -0700, Smita Koralahalli wrote: >> Would it be fair to just reuse pci_enable_atomic_ops_to_root() for >> Atomic_Ops configuration? > Hm, that's a good question. I'm not an expert on that corner of > the PCI core. > > But indeed what you could try is amend that function to not only > *set* PCI_EXP_DEVCTL2_ATOMIC_REQ if it's supported, but to also > *clear* it if it's not supported. > > And you'd have to call pci_enable_atomic_ops_to_root() on enumeration, > e.g. from pci_init_capabilities(). > > That should obviate the need to call pci_enable_atomic_ops_to_root() > from drivers, so you could probably remove the call from all the > drivers which currently call it (amdgpu, infiniband, mellanox), > in one separate patch per driver. > > An then you could drop the EXPORT clause for pci_enable_atomic_ops_to_root() > and make it private to the PCI core. Then our driver would need an alternative way to determine whether atomic capabilities are enabled for a device. We currently use the return value from pci_enable_atomic_ops_to_root to determine this. Regards, Felix > > So that would be 5 patches (enablement/disablement on enumeration, > amendmend of the 3 drivers, making the call private). > > I'm not sure if anyone will cry foul if you do that but if you want > to give it a try, go for it. :) > > I don't now why commit 430a23689dea, which introduced > pci_enable_atomic_ops_to_root(), chose to add it as a library function > which is only called from specific drivers, instead of universally > enabling the feature for all devices. Adding the commit authors to cc > so they can chime in. > > Thanks, > > Lukas
On Fri, Jun 23, 2023 at 05:59:55AM +0200, Felix Kuehling wrote: > On 2023-06-22 23:42, Lukas Wunner wrote: > > On Thu, Jun 22, 2023 at 02:02:12PM -0700, Smita Koralahalli wrote: > > > Would it be fair to just reuse pci_enable_atomic_ops_to_root() for > > > Atomic_Ops configuration? > > > > Hm, that's a good question. I'm not an expert on that corner of > > the PCI core. > > > > But indeed what you could try is amend that function to not only > > *set* PCI_EXP_DEVCTL2_ATOMIC_REQ if it's supported, but to also > > *clear* it if it's not supported. > > > > And you'd have to call pci_enable_atomic_ops_to_root() on enumeration, > > e.g. from pci_init_capabilities(). > > > > That should obviate the need to call pci_enable_atomic_ops_to_root() > > from drivers, so you could probably remove the call from all the > > drivers which currently call it (amdgpu, infiniband, mellanox), > > in one separate patch per driver. > > > > An then you could drop the EXPORT clause for pci_enable_atomic_ops_to_root() > > and make it private to the PCI core. > > Then our driver would need an alternative way to determine whether atomic > capabilities are enabled for a device. We currently use the return value > from pci_enable_atomic_ops_to_root to determine this. Just read PCI_EXP_DEVCTL2 and check whether PCI_EXP_DEVCTL2_ATOMIC_REQ is set. (I.e. has been set by the PCI core on device enumeration.) Problem solved, I guess? Thanks, Lukas
On 6/22/2023 16:42, Lukas Wunner wrote: > I don't now why commit 430a23689dea, which introduced > pci_enable_atomic_ops_to_root(), chose to add it as a library function > which is only called from specific drivers, instead of universally > enabling the feature for all devices. Adding the commit authors to cc > so they can chime in. IIRC during the initial design discussion on linux-pci this approach was suggested to avoid triggering potential bugs in devices without AtomicOps support. See quote below. I've no objections to changing it. On 2016-05-06 10:48, Bjorn Helgaas wrote: > Once enabled in Device Control 2, a device's use of AtomicOps is > competely device-specific. In many cases, the device probably doesn't > support AtomicOps, so enabling them would be a no-op. But there could > be devices where AtomicOps are nominally supported but untested or > broken. Even if we didn't change their drivers, those devices could > start using AtomicOps, so I'm not comfortable with the PCI core > enabling AtomicOp requests indiscriminately.
On 6/22/2023 2:42 PM, Lukas Wunner wrote: > [cc += Jay, Felix] > > On Thu, Jun 22, 2023 at 02:02:12PM -0700, Smita Koralahalli wrote: >> Would it be fair to just reuse pci_enable_atomic_ops_to_root() for >> Atomic_Ops configuration? > > Hm, that's a good question. I'm not an expert on that corner of > the PCI core. > > But indeed what you could try is amend that function to not only > *set* PCI_EXP_DEVCTL2_ATOMIC_REQ if it's supported, but to also > *clear* it if it's not supported. > > And you'd have to call pci_enable_atomic_ops_to_root() on enumeration, > e.g. from pci_init_capabilities(). > > That should obviate the need to call pci_enable_atomic_ops_to_root() > from drivers, so you could probably remove the call from all the > drivers which currently call it (amdgpu, infiniband, mellanox), > in one separate patch per driver. > > An then you could drop the EXPORT clause for pci_enable_atomic_ops_to_root() > and make it private to the PCI core. > > So that would be 5 patches (enablement/disablement on enumeration, > amendmend of the 3 drivers, making the call private). > > I'm not sure if anyone will cry foul if you do that but if you want > to give it a try, go for it. :) Okay, I see there are no objections except for Bjorn/Jay's comments on "But there could be devices where AtomicOps are nominally supported but untested or broken.." Would this be an issue? If not, I will start working on those 5 patches. Thanks, Smita > > I don't now why commit 430a23689dea, which introduced > pci_enable_atomic_ops_to_root(), chose to add it as a library function > which is only called from specific drivers, instead of universally > enabling the feature for all devices. Adding the commit authors to cc > so they can chime in. > > Thanks, > > Lukas
On Tue, Jun 27, 2023 at 10:38:54AM -0700, Smita Koralahalli wrote: > Okay, I see there are no objections except for Bjorn/Jay's comments on > > "But there could be devices where AtomicOps are nominally supported but > untested or broken.." > > Would this be an issue? I think you said that BIOS enables AtomicOps on certain AMD machines? Or did that observation only apply to 10 Bit tags? If BIOS does enable AtomicOps, it would be interesting to know which logic BIOS follows, i.e. how does it determine whether to set AtomicOp Requester Enable on Endpoints? It would also be interesting to know how far that BIOS has proliferated, i.e. how much experience with various Endpoint devices exists in the real world. If it turns out that BIOS has enabled the feature for years on a wide range of Endpoints without any issues, I think that would render concerns mute that enabling it in the kernel might cause regressions. I don't know why the spec says that "discovery of AtomicOp Requester capabilities is outside the scope of this specification". I imagine it would be possible to set AtomicOp Requester Enable, then read it to determine whether the bit is now indeed 1 or hard-wired to 0. In the latter case, AtomicOp Requester capabilities can be assumed to be absent. So that would be a way to make do without any other specific discovery of AtomicOp Requester capabilities. Thanks, Lukas
On 6/28/2023 6:25 AM, Lukas Wunner wrote: > On Tue, Jun 27, 2023 at 10:38:54AM -0700, Smita Koralahalli wrote: >> Okay, I see there are no objections except for Bjorn/Jay's comments on >> >> "But there could be devices where AtomicOps are nominally supported but >> untested or broken.." >> >> Would this be an issue? > > I think you said that BIOS enables AtomicOps on certain AMD machines? > Or did that observation only apply to 10 Bit tags? Yes, that observation right now applies only to 10 bit tags. > > If BIOS does enable AtomicOps, it would be interesting to know which > logic BIOS follows, i.e. how does it determine whether to set > AtomicOp Requester Enable on Endpoints? I agree this is a very good thing to know. I have followed up with the BIOS team to get some pointers on this. I will get back to you soon. > > It would also be interesting to know how far that BIOS has proliferated, > i.e. how much experience with various Endpoint devices exists in the > real world. If it turns out that BIOS has enabled the feature for > years on a wide range of Endpoints without any issues, I think > that would render concerns mute that enabling it in the kernel > might cause regressions. > > I don't know why the spec says that "discovery of AtomicOp Requester > capabilities is outside the scope of this specification". I imagine > it would be possible to set AtomicOp Requester Enable, then read it > to determine whether the bit is now indeed 1 or hard-wired to 0. > In the latter case, AtomicOp Requester capabilities can be assumed > to be absent. So that would be a way to make do without any other > specific discovery of AtomicOp Requester capabilities. > > Thanks, > > Lukas
On 6/28/2023 6:25 AM, Lukas Wunner wrote: > On Tue, Jun 27, 2023 at 10:38:54AM -0700, Smita Koralahalli wrote: >> Okay, I see there are no objections except for Bjorn/Jay's comments on >> >> "But there could be devices where AtomicOps are nominally supported but >> untested or broken.." >> >> Would this be an issue? > > I think you said that BIOS enables AtomicOps on certain AMD machines? > Or did that observation only apply to 10 Bit tags? > > If BIOS does enable AtomicOps, it would be interesting to know which > logic BIOS follows, i.e. how does it determine whether to set > AtomicOp Requester Enable on Endpoints? > > It would also be interesting to know how far that BIOS has proliferated, > i.e. how much experience with various Endpoint devices exists in the > real world. If it turns out that BIOS has enabled the feature for > years on a wide range of Endpoints without any issues, I think > that would render concerns mute that enabling it in the kernel > might cause regressions. > > I don't know why the spec says that "discovery of AtomicOp Requester > capabilities is outside the scope of this specification". I imagine > it would be possible to set AtomicOp Requester Enable, then read it > to determine whether the bit is now indeed 1 or hard-wired to 0. > In the latter case, AtomicOp Requester capabilities can be assumed > to be absent. So that would be a way to make do without any other > specific discovery of AtomicOp Requester capabilities. Sorry for getting back to this very late. The approach taken by our Platform FW is not as robust as how it is handled by SW. And also they haven't come across issues for AtomicOps similar to 10-bit tags. Hence, I worked on the approach of reading back the "AtomicOp Requester Enable" to determine whether it is indeed 1 or hard-wired to 0. I came across two issues. This approach did not eliminate calling pci_enable_atomic_ops_to_root() on all drivers as some of them enabled AtomicOp by only checking for 32-bit and 64-bit Completion capabilities. And, on internal review I received comments that this approach wouldn't help in determining the presence/absence of AtomicOps. PCIe r6.0 sec 7.5.3.16 [1], states "AtomicOps Requester Enable is permitted to be RW even if no AtomicOp Requester capabilities are supported by the Endpoint or Root Port", thereby substantiating devices that hardwires this bit to '1' is also valid.. I have sent v4 considering all these constraints. Let me know what you think. Thanks, Smita > > Thanks, > > Lukas
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c index ad12515a4a12..e27fd2bc4ceb 100644 --- a/drivers/pci/hotplug/pciehp_pci.c +++ b/drivers/pci/hotplug/pciehp_pci.c @@ -102,6 +102,10 @@ void pciehp_unconfigure_device(struct controller *ctrl, bool presence) pci_lock_rescan_remove(); + pcie_capability_clear_word(ctrl->pcie->port, PCI_EXP_DEVCTL2, + (PCI_EXP_DEVCTL2_ARI | PCI_EXP_DEVCTL2_ATOMIC_REQ | + PCI_EXP_DEVCTL2_TAG_REQ_EN)); + /* * Stopping an SR-IOV PF device removes all the associated VFs, * which will update the bus->devices list and confuse the diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index dc2000e0fe3a..6fbc47f23d52 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -668,6 +668,7 @@ #define PCI_EXP_DEVCTL2_IDO_REQ_EN 0x0100 /* Allow IDO for requests */ #define PCI_EXP_DEVCTL2_IDO_CMP_EN 0x0200 /* Allow IDO for completions */ #define PCI_EXP_DEVCTL2_LTR_EN 0x0400 /* Enable LTR mechanism */ +#define PCI_EXP_DEVCTL2_TAG_REQ_EN 0x1000 /* Allow 10 Tags for Requester */ #define PCI_EXP_DEVCTL2_OBFF_MSGA_EN 0x2000 /* Enable OBFF Message type A */ #define PCI_EXP_DEVCTL2_OBFF_MSGB_EN 0x4000 /* Enable OBFF Message type B */ #define PCI_EXP_DEVCTL2_OBFF_WAKE_EN 0x6000 /* OBFF using WAKE# signaling */
Clear the optional capabilities ARI Forwarding Enable, AtomicOp Requester Enable and 10-Bit Tag Requester Enable in DEVCTL2 unconditionally on a hot-plug event. These are the bits which are negotiated between endpoint and root port and should be re-negotiated and set up for optimal operation on a hot add. According to implementation notes in 2.2.6.2, of PCIe Base Specification [1], "For platforms where the RC supports 10-Bit Tag Completer capability, it is highly recommended for platform firmware or operating software that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable bit automatically in Endpoints with 10-Bit Tag Requester capability. This enables the important class of 10-Bit Tag capable adapters that send Memory Read Requests only to host memory". Hence, it must be noted that Platform FW may enable these bits if endpoint supports them at boot time for performance reasons even if Linux hasn't defined them. Issues are being observed where a device became inaccessible and the port was not able to be recovered without a system reset when a device with 10-bit tags was removed and replaced with a device that didn't support 10-bit tags. Section 2.2.6.2, in PCIe Base Specification also implies that: * If a Requester sends a 10-Bit Tag Request to a Completer that lacks 10-Bit Completer capability, the returned Completion(s) will have Tags with Tag[9:8] equal to 00b. Since the Requester is forbidden to generate these Tag values for 10-Bit Tags, such Completions will be handled as Unexpected Completions, which by default are Advisory Non-Fatal Errors. The Requester must follow standard PCI Express error handling requirements. * In configurations where a Requester with 10-Bit Tag Requester capability needs to target multiple Completers, one needs to ensure that the Requester sends 10-Bit Tag Requests only to Completers that have 10-Bit Tag Completer capability. Additionally, Section 6.13 and 6.15 of the PCIe Base Spec points out that following a hot-plug event, clear the ARI Forwarding Enable bit and AtomicOp Requester Enable as its not determined whether the next device inserted will support these capabilities. AtomicOp capabilities are not supported on PCI Express to PCI/PCI-X Bridges and any newly added component may not be an ARI device. [1] PCI Express Base Specification Revision 6.0, Dec 16 2021. https://members.pcisig.com/wg/PCI-SIG/document/16609 Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> --- v2: Clear all optional capabilities in Device Control 2 register instead of individually clearing ARI Forwarding Enable, AtomicOp Requestor Enable and 10-bit Tag Requestor Enable. v3: Restore clearing only ARI, Atomic Op and 10 bit tags as these are the optional capabilities. Provide all necessary information in commit description. Clear register bits of the hotplug port. --- drivers/pci/hotplug/pciehp_pci.c | 4 ++++ include/uapi/linux/pci_regs.h | 1 + 2 files changed, 5 insertions(+)