Message ID | 1512792555-26042-30-git-send-email-Felix.Kuehling@amd.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
[+cc Ram, Michal, Ariel, Doug, Jason] The [29/37] in the subject makes it look like this is part of a larger series, but I can't find the rest of it on linux-pci or linux-kernel. I don't want to merge a new interface unless there's an in-tree user of it. I assume the rest of the series includes a user. On Fri, Dec 08, 2017 at 11:09:07PM -0500, Felix Kuehling wrote: > From: Jay Cornwall <Jay.Cornwall@amd.com> > > The PCIe 3.0 AtomicOp (6.15) feature allows atomic transctions to be > requested by, routed through and completed by PCIe components. Routing and > completion do not require software support. Component support for each is > detectable via the DEVCAP2 register. > > AtomicOp requests are permitted only if a component's > DEVCTL2.ATOMICOP_REQUESTER_ENABLE field is set. This capability cannot be > detected but is a no-op if set on a component with no support. These > requests can only be serviced if the upstream components support AtomicOp > completion and/or routing to a component which does. > > A concrete example is the AMD Fiji-class GPU, which is specified to > support AtomicOp requests, routed through a PLX 8747 switch (advertising > AtomicOp routing) to a Haswell host bridge (advertising AtomicOp > completion support). When AtomicOp requests are disabled the GPU logs > attempts to initiate requests to an MMIO register for debugging. > > Add pci_enable_atomic_ops_to_root for per-device control over AtomicOp > requests. Upstream bridges are checked for AtomicOp routing capability and > the call fails if any lack this capability. The root port is checked for > AtomicOp completion capabilities and the call fails if it does not support > any. Routes to other PCIe components are not checked for AtomicOp routing > and completion capabilities. > > v2: Check for AtomicOp route to root port with AtomicOp completion > v3: Style fixes > v4: Endpoint to root port only, check upstream egress blocking > v5: Rebase, use existing PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK define > > CC: linux-pci@vger.kernel.org > Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> > --- > drivers/pci/pci.c | 81 +++++++++++++++++++++++++++++++++++++++++++ > include/linux/pci.h | 1 + > include/uapi/linux/pci_regs.h | 2 ++ > 3 files changed, 84 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 6078dfc..89a8bb0 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -2966,6 +2966,87 @@ bool pci_acs_path_enabled(struct pci_dev *start, > } > > /** > + * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port > + * @dev: the PCI device > + * > + * Return 0 if the device is capable of generating AtomicOp requests, I don't believe this part. You return 0 if the upstream path can route AtomicOps and the Root Port can complete them. But there's nothing here that's conditional on capabilities of *dev*. You could read back PCI_EXP_DEVCTL2 to see if PCI_EXP_DEVCTL2_ATOMIC_REQ was writable, but even then, you can't really tell what the device is capable of. > + * all upstream bridges support AtomicOp routing, egress blocking is disabled > + * on all upstream ports, and the root port supports 32-bit, 64-bit and/or > + * 128-bit AtomicOp completion, or negative otherwise. > + */ > +int pci_enable_atomic_ops_to_root(struct pci_dev *dev) > +{ > + struct pci_bus *bus = dev->bus; > + > + if (!pci_is_pcie(dev)) > + return -EINVAL; > + > + switch (pci_pcie_type(dev)) { > + /* > + * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted > + * to implement AtomicOp requester capabilities. > + */ > + case PCI_EXP_TYPE_ENDPOINT: > + case PCI_EXP_TYPE_LEG_END: > + case PCI_EXP_TYPE_RC_END: > + break; > + default: > + return -EINVAL; > + } > + > + while (bus->parent) { > + struct pci_dev *bridge = bus->self; > + u32 cap; > + > + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap); > + > + switch (pci_pcie_type(bridge)) { > + /* > + * Upstream, downstream and root ports may implement AtomicOp > + * routing capabilities. AtomicOp routing via a root port is > + * not considered. > + */ > + case PCI_EXP_TYPE_UPSTREAM: > + case PCI_EXP_TYPE_DOWNSTREAM: > + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE)) > + return -EINVAL; > + break; > + > + /* > + * Root ports are permitted to implement AtomicOp completion > + * capabilities. > + */ > + case PCI_EXP_TYPE_ROOT_PORT: > + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | > + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | > + PCI_EXP_DEVCAP2_ATOMIC_COMP128))) > + return -EINVAL; > + break; > + } IIUC, you want to enable an endpoint, e.g., an AMD Fiji-class GPU, to initiate AtomicOps that target system memory. This interface (pci_enable_atomic_ops_to_root()) doesn't specify what size operations the driver wants to do. If the GPU requests a 128-bit op and the Root Port doesn't support it, I think we'll see an Unsupported Request error. Do you need to extend this interface so the driver can specify what sizes it wants? The existing code in qedr_pci_set_atomic() is very similar. We should make this new interface work for both places, then actually use it in qedr_pci_set_atomic(). > + > + /* > + * Upstream ports may block AtomicOps on egress. > + */ > + if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) { pci_pcie_type() is not a reliable method for determining the function of a switch port. There are systems where the upstream port is labeled as a downstream port, e.g., http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c8fc9339409d > + u32 ctl2; > + > + pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, > + &ctl2); > + if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK) > + return -EINVAL; > + } > + > + bus = bus->parent; > + } > + > + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, > + PCI_EXP_DEVCTL2_ATOMIC_REQ); > + > + return 0; > +} > +EXPORT_SYMBOL(pci_enable_atomic_ops_to_root); > + > +/** > * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge > * @dev: the PCI device > * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD) > diff --git a/include/linux/pci.h b/include/linux/pci.h > index f4f8ee5..2a39f63 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -2062,6 +2062,7 @@ void pci_request_acs(void); > bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); > bool pci_acs_path_enabled(struct pci_dev *start, > struct pci_dev *end, u16 acs_flags); > +int pci_enable_atomic_ops_to_root(struct pci_dev *dev); > > #define PCI_VPD_LRDT 0x80 /* Large Resource Data Type */ > #define PCI_VPD_LRDT_ID(x) ((x) | PCI_VPD_LRDT) > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h > index f8d5804..45f251a 100644 > --- a/include/uapi/linux/pci_regs.h > +++ b/include/uapi/linux/pci_regs.h > @@ -623,7 +623,9 @@ > #define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */ > #define PCI_EXP_DEVCAP2_ARI 0x00000020 /* Alternative Routing-ID */ > #define PCI_EXP_DEVCAP2_ATOMIC_ROUTE 0x00000040 /* Atomic Op routing */ > +#define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x00000080 /* 32b AtomicOp completion */ > #define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* Atomic 64-bit compare */ > +#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion*/ The comments should be similar. I think yours are better than the original, so please change the original to /* 64b AtomicOp completion */ so they all match. > #define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */ > #define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */ > #define PCI_EXP_DEVCAP2_OBFF_MSG 0x00040000 /* New message signaling */ > -- > 2.7.4 >
On Tue, Dec 12, 2017 at 05:27:07PM -0600, Bjorn Helgaas wrote: > [+cc Ram, Michal, Ariel, Doug, Jason] > > The [29/37] in the subject makes it look like this is part of a larger > series, but I can't find the rest of it on linux-pci or linux-kernel. Didn't find the cover letter, but the AMD patchworks captured the series.. https://patchwork.freedesktop.org/project/amd-xorg-ddx/patches/ > I don't want to merge a new interface unless there's an in-tree user > of it. I assume the rest of the series includes a user. Looks like it. I would also guess we will see users in drivers/infiniband emerge as CPU coherent atomics are also a topic our hardware drivers will be interested in. But I am not aware of any pending patches. Jason
On Wed, Dec 13, 2017 at 1:42 AM, Jason Gunthorpe <jgg@ziepe.ca> wrote: > On Tue, Dec 12, 2017 at 05:27:07PM -0600, Bjorn Helgaas wrote: >> [+cc Ram, Michal, Ariel, Doug, Jason] >> >> The [29/37] in the subject makes it look like this is part of a larger >> series, but I can't find the rest of it on linux-pci or linux-kernel. > > Didn't find the cover letter, but the AMD patchworks captured the series.. > > https://patchwork.freedesktop.org/project/amd-xorg-ddx/patches/ Hi, This patchset is mainly for the amdkfd driver, which is used for running HSA Framework on AMD's APUs and in the near future, dGPUs. This driver has been in the kernel since 3.19. PCIe atomics were not required for APUs because there GPU part is integrated with the CPU and they have atomic accesses between them. For enabling HSA on dGPUs (such as Fiji, Vega, Polaris) which connect through PCIe, we need to have PCIe atomics support. The patchset starts to upstream the dGPU support and one of the pre-requisites is the patch in discussion. > >> I don't want to merge a new interface unless there's an in-tree user >> of it. I assume the rest of the series includes a user. > > Looks like it. So, yes, there is a user in the kernel and there is an entire open-source userspace framework around it, called ROCm (https://github.com/RadeonOpenCompute/ROCm) Oded > > I would also guess we will see users in drivers/infiniband emerge as > CPU coherent atomics are also a topic our hardware drivers will be > interested in. But I am not aware of any pending patches. > > Jason
On 2017-12-12 06:27 PM, Bjorn Helgaas wrote: > [+cc Ram, Michal, Ariel, Doug, Jason] > > The [29/37] in the subject makes it look like this is part of a larger > series, but I can't find the rest of it on linux-pci or linux-kernel. > > I don't want to merge a new interface unless there's an in-tree user > of it. I assume the rest of the series includes a user. > > On Fri, Dec 08, 2017 at 11:09:07PM -0500, Felix Kuehling wrote: [snip] >> + * all upstream bridges support AtomicOp routing, egress blocking is disabled >> + * on all upstream ports, and the root port supports 32-bit, 64-bit and/or >> + * 128-bit AtomicOp completion, or negative otherwise. >> + */ >> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev) >> +{ >> + struct pci_bus *bus = dev->bus; >> + >> + if (!pci_is_pcie(dev)) >> + return -EINVAL; >> + >> + switch (pci_pcie_type(dev)) { >> + /* >> + * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted >> + * to implement AtomicOp requester capabilities. >> + */ >> + case PCI_EXP_TYPE_ENDPOINT: >> + case PCI_EXP_TYPE_LEG_END: >> + case PCI_EXP_TYPE_RC_END: >> + break; >> + default: >> + return -EINVAL; >> + } >> + >> + while (bus->parent) { >> + struct pci_dev *bridge = bus->self; >> + u32 cap; >> + >> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap); >> + >> + switch (pci_pcie_type(bridge)) { >> + /* >> + * Upstream, downstream and root ports may implement AtomicOp >> + * routing capabilities. AtomicOp routing via a root port is >> + * not considered. >> + */ >> + case PCI_EXP_TYPE_UPSTREAM: >> + case PCI_EXP_TYPE_DOWNSTREAM: >> + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE)) >> + return -EINVAL; >> + break; >> + >> + /* >> + * Root ports are permitted to implement AtomicOp completion >> + * capabilities. >> + */ >> + case PCI_EXP_TYPE_ROOT_PORT: >> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | >> + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | >> + PCI_EXP_DEVCAP2_ATOMIC_COMP128))) >> + return -EINVAL; >> + break; >> + } > IIUC, you want to enable an endpoint, e.g., an AMD Fiji-class GPU, to > initiate AtomicOps that target system memory. This interface > (pci_enable_atomic_ops_to_root()) doesn't specify what size operations > the driver wants to do. If the GPU requests a 128-bit op and the Root > Port doesn't support it, I think we'll see an Unsupported Request > error. > > Do you need to extend this interface so the driver can specify what > sizes it wants? > > The existing code in qedr_pci_set_atomic() is very similar. We should > make this new interface work for both places, then actually use it in > qedr_pci_set_atomic(). Hi Bjorn, Doug, Ram, I just discussed this with Jay, and he noticed that qedr_pci_set_atomic seems to use a different criteria to find the completer for atomic requests. Jay's function expects the root port to have a parent, which was the case on the systems he tested. But Ram's function looks for a bridge without a parent and checks completion capabilities on that. Jay believes that to be a root complex, not a root port. According to the spec, "Root ports are permitted to implement AtomicOp completion capabilities." It talks about a root port, not a root complex. Can you help us understand, which interpretation is correct? And how to correctly identify the root port for checking completion capabilities? Are there valid topologies where a root port does not have a parent? Regards, Felix
On Tue, Jan 02, 2018 at 06:41:17PM -0500, Felix Kuehling wrote: > On 2017-12-12 06:27 PM, Bjorn Helgaas wrote: > > [+cc Ram, Michal, Ariel, Doug, Jason] > > > > The [29/37] in the subject makes it look like this is part of a larger > > series, but I can't find the rest of it on linux-pci or linux-kernel. > > > > I don't want to merge a new interface unless there's an in-tree user > > of it. I assume the rest of the series includes a user. > > > > On Fri, Dec 08, 2017 at 11:09:07PM -0500, Felix Kuehling wrote: > [snip] > >> + * all upstream bridges support AtomicOp routing, egress blocking is disabled > >> + * on all upstream ports, and the root port supports 32-bit, 64-bit and/or > >> + * 128-bit AtomicOp completion, or negative otherwise. > >> + */ > >> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev) > >> +{ > >> + struct pci_bus *bus = dev->bus; > >> + > >> + if (!pci_is_pcie(dev)) > >> + return -EINVAL; > >> + > >> + switch (pci_pcie_type(dev)) { > >> + /* > >> + * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted > >> + * to implement AtomicOp requester capabilities. > >> + */ > >> + case PCI_EXP_TYPE_ENDPOINT: > >> + case PCI_EXP_TYPE_LEG_END: > >> + case PCI_EXP_TYPE_RC_END: > >> + break; > >> + default: > >> + return -EINVAL; > >> + } > >> + > >> + while (bus->parent) { > >> + struct pci_dev *bridge = bus->self; > >> + u32 cap; > >> + > >> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap); > >> + > >> + switch (pci_pcie_type(bridge)) { > >> + /* > >> + * Upstream, downstream and root ports may implement AtomicOp > >> + * routing capabilities. AtomicOp routing via a root port is > >> + * not considered. > >> + */ > >> + case PCI_EXP_TYPE_UPSTREAM: > >> + case PCI_EXP_TYPE_DOWNSTREAM: > >> + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE)) > >> + return -EINVAL; > >> + break; > >> + > >> + /* > >> + * Root ports are permitted to implement AtomicOp completion > >> + * capabilities. > >> + */ > >> + case PCI_EXP_TYPE_ROOT_PORT: > >> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | > >> + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | > >> + PCI_EXP_DEVCAP2_ATOMIC_COMP128))) > >> + return -EINVAL; > >> + break; > >> + } > > IIUC, you want to enable an endpoint, e.g., an AMD Fiji-class GPU, to > > initiate AtomicOps that target system memory. This interface > > (pci_enable_atomic_ops_to_root()) doesn't specify what size operations > > the driver wants to do. If the GPU requests a 128-bit op and the Root > > Port doesn't support it, I think we'll see an Unsupported Request > > error. > > > > Do you need to extend this interface so the driver can specify what > > sizes it wants? > > > > The existing code in qedr_pci_set_atomic() is very similar. We should > > make this new interface work for both places, then actually use it in > > qedr_pci_set_atomic(). > > Hi Bjorn, Doug, Ram, > > I just discussed this with Jay, and he noticed that qedr_pci_set_atomic > seems to use a different criteria to find the completer for atomic > requests. Jay's function expects the root port to have a parent, which > was the case on the systems he tested. But Ram's function looks for a > bridge without a parent and checks completion capabilities on that. Jay > believes that to be a root complex, not a root port. By "Ram's function", I guess you mean qedr_pci_set_atomic()? That starts with a PCIe device ("pdev"; it assumes but does not check that this is a PCIe device), and traverses through all the bridges leading to it. Usually this will be: endpoint -> root port endpoint -> switch downstream port -> switch upstream port -> root port Or there may be additional switches in the middle. The code is actually not quite correct because it is legal to have this: endpoint -> PCI-to-PCIe bridge -> conventional PCI bridge -> ... and qedr_pci_set_atomic() will traverse up through the conventional part of the hierarchy, where there is no PCI_EXP_DEVCAP2. In general, a Root Port is the root of a PCIe hierarchy and there is no parent device. E.g., on my laptop: 00:1c.0 Intel Root Port (bridge to [bus 02]) 00:1c.2 Intel Root Port (bridge to [bus 04]) What sort of parent do you expect? As I mentioned, it's legal to have a PCI/PCI-X to PCIe bridge inside a conventional PCI hierarchy, but that's a little unusual. > According to the spec, "Root ports are permitted to implement AtomicOp > completion capabilities." It talks about a root port, not a root complex. > > Can you help us understand, which interpretation is correct? And how to > correctly identify the root port for checking completion capabilities? If you start with a PCIe device and traverse upstream, you should eventually reach a Root Port or a PCI/PCI-X to PCIe bridge. > Are there valid topologies where a root port does not have a parent? I don't understand this because Root Ports normally do not have parents. PCIe devices other than Root Ports normally have a Root Port (or PCI/PCI-X to PCIe bridge) at the root of the PCIe hierarchy, but there are definitely exceptions. For example, there some systems where the Root Port is not visible to Linux, e.g., http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ee8bdfb6568d On systems like that, I don't think you can safely use AtomicOps. Bjorn
Hi Bjorn, I figured it out. The difference between the functions is whether they use struct pci_bus * or struct pci_dev * as cursor. I found that the functions are in fact equivalent. The last loop iteration in pci_enable_atomic_ops_to_root is equivalent to the code after the loop in qedr_pci_set_atomic. Both handle the root port. I think my confusion was based on an incorrect assumption that bridge->bus->self is the same as bridge. But brigde->bus is in fact the parent bus. I sent out an updated patch that addresses your comments to the previous version. This should be general enough that it can replace qedr_pci_set_atomic. Regards, Felix On 2018-01-04 06:40 PM, Bjorn Helgaas wrote: > On Tue, Jan 02, 2018 at 06:41:17PM -0500, Felix Kuehling wrote: >> On 2017-12-12 06:27 PM, Bjorn Helgaas wrote: >>> [+cc Ram, Michal, Ariel, Doug, Jason] >>> >>> The [29/37] in the subject makes it look like this is part of a larger >>> series, but I can't find the rest of it on linux-pci or linux-kernel. >>> >>> I don't want to merge a new interface unless there's an in-tree user >>> of it. I assume the rest of the series includes a user. >>> >>> On Fri, Dec 08, 2017 at 11:09:07PM -0500, Felix Kuehling wrote: >> [snip] >>>> + * all upstream bridges support AtomicOp routing, egress blocking is disabled >>>> + * on all upstream ports, and the root port supports 32-bit, 64-bit and/or >>>> + * 128-bit AtomicOp completion, or negative otherwise. >>>> + */ >>>> +int pci_enable_atomic_ops_to_root(struct pci_dev *dev) >>>> +{ >>>> + struct pci_bus *bus = dev->bus; >>>> + >>>> + if (!pci_is_pcie(dev)) >>>> + return -EINVAL; >>>> + >>>> + switch (pci_pcie_type(dev)) { >>>> + /* >>>> + * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted >>>> + * to implement AtomicOp requester capabilities. >>>> + */ >>>> + case PCI_EXP_TYPE_ENDPOINT: >>>> + case PCI_EXP_TYPE_LEG_END: >>>> + case PCI_EXP_TYPE_RC_END: >>>> + break; >>>> + default: >>>> + return -EINVAL; >>>> + } >>>> + >>>> + while (bus->parent) { >>>> + struct pci_dev *bridge = bus->self; >>>> + u32 cap; >>>> + >>>> + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap); >>>> + >>>> + switch (pci_pcie_type(bridge)) { >>>> + /* >>>> + * Upstream, downstream and root ports may implement AtomicOp >>>> + * routing capabilities. AtomicOp routing via a root port is >>>> + * not considered. >>>> + */ >>>> + case PCI_EXP_TYPE_UPSTREAM: >>>> + case PCI_EXP_TYPE_DOWNSTREAM: >>>> + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE)) >>>> + return -EINVAL; >>>> + break; >>>> + >>>> + /* >>>> + * Root ports are permitted to implement AtomicOp completion >>>> + * capabilities. >>>> + */ >>>> + case PCI_EXP_TYPE_ROOT_PORT: >>>> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | >>>> + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | >>>> + PCI_EXP_DEVCAP2_ATOMIC_COMP128))) >>>> + return -EINVAL; >>>> + break; >>>> + } >>> IIUC, you want to enable an endpoint, e.g., an AMD Fiji-class GPU, to >>> initiate AtomicOps that target system memory. This interface >>> (pci_enable_atomic_ops_to_root()) doesn't specify what size operations >>> the driver wants to do. If the GPU requests a 128-bit op and the Root >>> Port doesn't support it, I think we'll see an Unsupported Request >>> error. >>> >>> Do you need to extend this interface so the driver can specify what >>> sizes it wants? >>> >>> The existing code in qedr_pci_set_atomic() is very similar. We should >>> make this new interface work for both places, then actually use it in >>> qedr_pci_set_atomic(). >> Hi Bjorn, Doug, Ram, >> >> I just discussed this with Jay, and he noticed that qedr_pci_set_atomic >> seems to use a different criteria to find the completer for atomic >> requests. Jay's function expects the root port to have a parent, which >> was the case on the systems he tested. But Ram's function looks for a >> bridge without a parent and checks completion capabilities on that. Jay >> believes that to be a root complex, not a root port. > By "Ram's function", I guess you mean qedr_pci_set_atomic()? > > That starts with a PCIe device ("pdev"; it assumes but does not check > that this is a PCIe device), and traverses through all the bridges > leading to it. Usually this will be: > > endpoint -> root port > endpoint -> switch downstream port -> switch upstream port -> root port > > Or there may be additional switches in the middle. The code is > actually not quite correct because it is legal to have this: > > endpoint -> PCI-to-PCIe bridge -> conventional PCI bridge -> ... > > and qedr_pci_set_atomic() will traverse up through the conventional > part of the hierarchy, where there is no PCI_EXP_DEVCAP2. > > In general, a Root Port is the root of a PCIe hierarchy and there is > no parent device. E.g., on my laptop: > > 00:1c.0 Intel Root Port (bridge to [bus 02]) > 00:1c.2 Intel Root Port (bridge to [bus 04]) > > What sort of parent do you expect? As I mentioned, it's legal to have > a PCI/PCI-X to PCIe bridge inside a conventional PCI hierarchy, but > that's a little unusual. > >> According to the spec, "Root ports are permitted to implement AtomicOp >> completion capabilities." It talks about a root port, not a root complex. >> >> Can you help us understand, which interpretation is correct? And how to >> correctly identify the root port for checking completion capabilities? > If you start with a PCIe device and traverse upstream, you should > eventually reach a Root Port or a PCI/PCI-X to PCIe bridge. > >> Are there valid topologies where a root port does not have a parent? > I don't understand this because Root Ports normally do not have > parents. > > PCIe devices other than Root Ports normally have a Root Port (or > PCI/PCI-X to PCIe bridge) at the root of the PCIe hierarchy, but there > are definitely exceptions. > > For example, there some systems where the Root Port is not visible to > Linux, e.g., > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ee8bdfb6568d > On systems like that, I don't think you can safely use AtomicOps. > > Bjorn
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 6078dfc..89a8bb0 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -2966,6 +2966,87 @@ bool pci_acs_path_enabled(struct pci_dev *start, } /** + * pci_enable_atomic_ops_to_root - enable AtomicOp requests to root port + * @dev: the PCI device + * + * Return 0 if the device is capable of generating AtomicOp requests, + * all upstream bridges support AtomicOp routing, egress blocking is disabled + * on all upstream ports, and the root port supports 32-bit, 64-bit and/or + * 128-bit AtomicOp completion, or negative otherwise. + */ +int pci_enable_atomic_ops_to_root(struct pci_dev *dev) +{ + struct pci_bus *bus = dev->bus; + + if (!pci_is_pcie(dev)) + return -EINVAL; + + switch (pci_pcie_type(dev)) { + /* + * PCIe 3.0, 6.15 specifies that endpoints and root ports are permitted + * to implement AtomicOp requester capabilities. + */ + case PCI_EXP_TYPE_ENDPOINT: + case PCI_EXP_TYPE_LEG_END: + case PCI_EXP_TYPE_RC_END: + break; + default: + return -EINVAL; + } + + while (bus->parent) { + struct pci_dev *bridge = bus->self; + u32 cap; + + pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap); + + switch (pci_pcie_type(bridge)) { + /* + * Upstream, downstream and root ports may implement AtomicOp + * routing capabilities. AtomicOp routing via a root port is + * not considered. + */ + case PCI_EXP_TYPE_UPSTREAM: + case PCI_EXP_TYPE_DOWNSTREAM: + if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE)) + return -EINVAL; + break; + + /* + * Root ports are permitted to implement AtomicOp completion + * capabilities. + */ + case PCI_EXP_TYPE_ROOT_PORT: + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 | + PCI_EXP_DEVCAP2_ATOMIC_COMP64 | + PCI_EXP_DEVCAP2_ATOMIC_COMP128))) + return -EINVAL; + break; + } + + /* + * Upstream ports may block AtomicOps on egress. + */ + if (pci_pcie_type(bridge) == PCI_EXP_TYPE_UPSTREAM) { + u32 ctl2; + + pcie_capability_read_dword(bridge, PCI_EXP_DEVCTL2, + &ctl2); + if (ctl2 & PCI_EXP_DEVCTL2_ATOMIC_EGRESS_BLOCK) + return -EINVAL; + } + + bus = bus->parent; + } + + pcie_capability_set_word(dev, PCI_EXP_DEVCTL2, + PCI_EXP_DEVCTL2_ATOMIC_REQ); + + return 0; +} +EXPORT_SYMBOL(pci_enable_atomic_ops_to_root); + +/** * pci_swizzle_interrupt_pin - swizzle INTx for device behind bridge * @dev: the PCI device * @pin: the INTx pin (1=INTA, 2=INTB, 3=INTC, 4=INTD) diff --git a/include/linux/pci.h b/include/linux/pci.h index f4f8ee5..2a39f63 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2062,6 +2062,7 @@ void pci_request_acs(void); bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start, struct pci_dev *end, u16 acs_flags); +int pci_enable_atomic_ops_to_root(struct pci_dev *dev); #define PCI_VPD_LRDT 0x80 /* Large Resource Data Type */ #define PCI_VPD_LRDT_ID(x) ((x) | PCI_VPD_LRDT) diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index f8d5804..45f251a 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -623,7 +623,9 @@ #define PCI_EXP_DEVCAP2 36 /* Device Capabilities 2 */ #define PCI_EXP_DEVCAP2_ARI 0x00000020 /* Alternative Routing-ID */ #define PCI_EXP_DEVCAP2_ATOMIC_ROUTE 0x00000040 /* Atomic Op routing */ +#define PCI_EXP_DEVCAP2_ATOMIC_COMP32 0x00000080 /* 32b AtomicOp completion */ #define PCI_EXP_DEVCAP2_ATOMIC_COMP64 0x00000100 /* Atomic 64-bit compare */ +#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200 /* 128b AtomicOp completion*/ #define PCI_EXP_DEVCAP2_LTR 0x00000800 /* Latency tolerance reporting */ #define PCI_EXP_DEVCAP2_OBFF_MASK 0x000c0000 /* OBFF support mechanism */ #define PCI_EXP_DEVCAP2_OBFF_MSG 0x00040000 /* New message signaling */