Message ID | 20180802113635.7097-1-felipe.balbi@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [RFC] PCI: add support for Immediate Readiness | expand |
> + unsigned int immediate:1; /* Supports Immediate Readiness */
That name is a bit too generic. immediate_ready (or imm_ready if you
need it short) would be a lot more descriptive.
Hi, Christoph Hellwig <hch@infradead.org> writes: >> + unsigned int immediate:1; /* Supports Immediate Readiness */ > > That name is a bit too generic. immediate_ready (or imm_ready if you > need it short) would be a lot more descriptive. That can be easily patched. I'll until tomorrow for more comments then send a v2
On 8/2/2018 7:36 AM, Felipe Balbi wrote: > PCIe GEN4 defines a new bit on Status Register which tells us that, if > Set, a function is immediately ready after a Reset. This means that > all delays after a Conventional or Function Reset can be skipped. Can you give a reference to the section of the specification? or a pointer to the ECN?
Sinan Kaya <okaya@kernel.org> writes: > On 8/2/2018 7:36 AM, Felipe Balbi wrote: >> PCIe GEN4 defines a new bit on Status Register which tells us that, if >> Set, a function is immediately ready after a Reset. This means that >> all delays after a Conventional or Function Reset can be skipped. > > Can you give a reference to the section of the specification? or > a pointer to the ECN? Section 7.5.1.1.4 of PCIe GEN4 spec. Table 7-4: Immediate Readiness – This optional bit, when Set, indicates the Function is guaranteed to be ready to successfully complete valid configuration accesses at any time following any reset that the host is capable of issuing Configuration Requests to this Function. When this bit is Set, for accesses to this Function, software is exempt from all requirements to delay configuration accesses following any type of reset, including but not limited to the timing requirements defined in Section 6.6. How this guarantee is established is beyond the scope of this document. It is permitted that system software/firmware provide mechanisms that supersede the indication provided by this bit, however such software/firmware mechanisms are outside the scope of this specification.
On 8/3/2018 2:26 AM, Felipe Balbi wrote: > Sinan Kaya <okaya@kernel.org> writes: > >> On 8/2/2018 7:36 AM, Felipe Balbi wrote: >>> PCIe GEN4 defines a new bit on Status Register which tells us that, if >>> Set, a function is immediately ready after a Reset. This means that >>> all delays after a Conventional or Function Reset can be skipped. >> >> Can you give a reference to the section of the specification? or >> a pointer to the ECN? > > Section 7.5.1.1.4 of PCIe GEN4 spec. Table 7-4: > > Immediate Readiness – This optional bit, when Set, indicates the > Function is guaranteed to be ready to successfully complete valid > configuration accesses at any time following any reset that the host is > capable of issuing Configuration Requests to this Function. > > When this bit is Set, for accesses to this Function, software is exempt > from all requirements to delay configuration accesses following any type > of reset, including but not limited to the timing requirements defined > in Section 6.6. How this guarantee is established is beyond the scope > of this document. > > It is permitted that system software/firmware provide mechanisms that > supersede the indication provided by this bit, however such > software/firmware mechanisms are outside the scope of this > specification. > Thanks for the spec reference. I think the patch is touching the wrong places. pci_dev_wait() is there to wait for CRS response to finish following reset. Typical sequences are: 1. Do some kind of reset in another routine 2. Wait reset specific wait time (1sec for secondary bus reset as an example and 100ms for d3-d0 transition) 3. call pci_dev_wait() after reset to see if device can accept config transactions. Since this applies to all resets, I think you also need to get rid of waits following different reset types in step #2 and return immediately. I suggest you review callers of pci_dev_wait() and tap in there. Another thing is that this is a common functionality. Initializing the flag in pm_init() would not be the best place.
On Fri, Aug 03, 2018 at 01:25:58PM -0400, Sinan Kaya wrote: > On 8/3/2018 2:26 AM, Felipe Balbi wrote: > > Sinan Kaya <okaya@kernel.org> writes: > > > > > On 8/2/2018 7:36 AM, Felipe Balbi wrote: > > > > PCIe GEN4 defines a new bit on Status Register which tells us that, if > > > > Set, a function is immediately ready after a Reset. This means that > > > > all delays after a Conventional or Function Reset can be skipped. > > > > > > Can you give a reference to the section of the specification? or > > > a pointer to the ECN? > > > > Section 7.5.1.1.4 of PCIe GEN4 spec. Table 7-4: > > > > Immediate Readiness – This optional bit, when Set, indicates the > > Function is guaranteed to be ready to successfully complete valid > > configuration accesses at any time following any reset that the host is > > capable of issuing Configuration Requests to this Function. > > > > When this bit is Set, for accesses to this Function, software is exempt > > from all requirements to delay configuration accesses following any type > > of reset, including but not limited to the timing requirements defined > > in Section 6.6. How this guarantee is established is beyond the scope > > of this document. > > > > It is permitted that system software/firmware provide mechanisms that > > supersede the indication provided by this bit, however such > > software/firmware mechanisms are outside the scope of this > > specification. > > > > Thanks for the spec reference. Yes. Please include the reference in the changelog of v2. > I think the patch is touching the wrong places. pci_dev_wait() is there > to wait for CRS response to finish following reset. > > Typical sequences are: > 1. Do some kind of reset in another routine > 2. Wait reset specific wait time (1sec for secondary bus reset as an > example and 100ms for d3-d0 transition) > 3. call pci_dev_wait() after reset to see if device can accept config > transactions. > > Since this applies to all resets, I think you also need to get rid of > waits following different reset types in step #2 and return immediately. > I suggest you review callers of pci_dev_wait() and tap in there. I agree; I think we should be able to skip the delays in pcie_flr(), pci_af_flr(), etc. > Another thing is that this is a common functionality. Initializing > the flag in pm_init() would not be the best place. The immediate readiness thing is not directly related to PM, but pci_pm_init() does deal with similar delays, and I don't have a better suggestion off the top of my head. Bjorn
Hi, Bjorn Helgaas <helgaas@kernel.org> writes: > On Fri, Aug 03, 2018 at 01:25:58PM -0400, Sinan Kaya wrote: >> On 8/3/2018 2:26 AM, Felipe Balbi wrote: >> > Sinan Kaya <okaya@kernel.org> writes: >> > >> > > On 8/2/2018 7:36 AM, Felipe Balbi wrote: >> > > > PCIe GEN4 defines a new bit on Status Register which tells us that, if >> > > > Set, a function is immediately ready after a Reset. This means that >> > > > all delays after a Conventional or Function Reset can be skipped. >> > > >> > > Can you give a reference to the section of the specification? or >> > > a pointer to the ECN? >> > >> > Section 7.5.1.1.4 of PCIe GEN4 spec. Table 7-4: >> > >> > Immediate Readiness – This optional bit, when Set, indicates the >> > Function is guaranteed to be ready to successfully complete valid >> > configuration accesses at any time following any reset that the host is >> > capable of issuing Configuration Requests to this Function. >> > >> > When this bit is Set, for accesses to this Function, software is exempt >> > from all requirements to delay configuration accesses following any type >> > of reset, including but not limited to the timing requirements defined >> > in Section 6.6. How this guarantee is established is beyond the scope >> > of this document. >> > >> > It is permitted that system software/firmware provide mechanisms that >> > supersede the indication provided by this bit, however such >> > software/firmware mechanisms are outside the scope of this >> > specification. >> > >> >> Thanks for the spec reference. > > Yes. Please include the reference in the changelog of v2. will do >> I think the patch is touching the wrong places. pci_dev_wait() is there >> to wait for CRS response to finish following reset. >> >> Typical sequences are: >> 1. Do some kind of reset in another routine >> 2. Wait reset specific wait time (1sec for secondary bus reset as an >> example and 100ms for d3-d0 transition) >> 3. call pci_dev_wait() after reset to see if device can accept config >> transactions. >> >> Since this applies to all resets, I think you also need to get rid of >> waits following different reset types in step #2 and return immediately. >> I suggest you review callers of pci_dev_wait() and tap in there. > > I agree; I think we should be able to skip the delays in pcie_flr(), > pci_af_flr(), etc. but that's why I put the code in pci_dev_wait(). Both pcie_flr() and pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) call in pci_af_flr() can also be removed if (dev->imm_ready)?
Hi, Felipe Balbi <felipe.balbi@linux.intel.com> writes: >> Since this applies to all resets, I think you also need to get rid of >>> waits following different reset types in step #2 and return immediately. >>> I suggest you review callers of pci_dev_wait() and tap in there. >> >> I agree; I think we should be able to skip the delays in pcie_flr(), >> pci_af_flr(), etc. > > but that's why I put the code in pci_dev_wait(). Both pcie_flr() and > pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) > call in pci_af_flr() can also be removed if (dev->imm_ready)? What about moving that msleep() to pci_dev_wait() itself? diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 29ff9619b5fa..7d4da2196618 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -999,7 +999,7 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) * because have already delayed for the bridge. */ if (dev->runtime_d3cold) { - if (dev->d3cold_delay) + if (dev->d3cold_delay && !dev->imm_ready) msleep(dev->d3cold_delay); /* * When powering on a bridge from D3cold, the @@ -2644,6 +2644,7 @@ EXPORT_SYMBOL_GPL(pci_d3cold_disable); void pci_pm_init(struct pci_dev *dev) { int pm; + u16 status; u16 pmc; pm_runtime_forbid(&dev->dev); @@ -2706,6 +2707,9 @@ void pci_pm_init(struct pci_dev *dev) /* Disable the PME# generation functionality */ pci_pme_active(dev, false); } + + pci_read_config_word(dev, PCI_STATUS, &status); + dev->imm_ready = status & PCI_STATUS_IMMEDIATE; } static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) @@ -4306,6 +4310,17 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) int delay = 1; u32 id; + if (dev->imm_ready) + return 0; + + /* + * Per Advanced Capabilities for Conventional PCI ECN, 13 April 2006, + * updated 27 July 2006; a device must complete an FLR within + * 100ms, but may silently discard requests while the FLR is in + * progress. Wait 100ms before trying to access the device. + */ + msleep(100); + /* * After reset, the device should not silently discard config * requests, but it may still indicate that it needs more time by @@ -4376,13 +4391,6 @@ int pcie_flr(struct pci_dev *dev) pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_BCR_FLR); - /* - * Per PCIe r4.0, sec 6.6.2, a device must complete an FLR within - * 100ms, but may silently discard requests while the FLR is in - * progress. Wait 100ms before trying to access the device. - */ - msleep(100); - return pci_dev_wait(dev, "FLR", PCIE_RESET_READY_POLL_MS); } EXPORT_SYMBOL_GPL(pcie_flr); @@ -4416,15 +4424,6 @@ static int pci_af_flr(struct pci_dev *dev, int probe) pci_err(dev, "timed out waiting for pending transaction; performing AF function level reset anyway\n"); pci_write_config_byte(dev, pos + PCI_AF_CTRL, PCI_AF_CTRL_FLR); - - /* - * Per Advanced Capabilities for Conventional PCI ECN, 13 April 2006, - * updated 27 July 2006; a device must complete an FLR within - * 100ms, but may silently discard requests while the FLR is in - * progress. Wait 100ms before trying to access the device. - */ - msleep(100); - return pci_dev_wait(dev, "AF_FLR", PCIE_RESET_READY_POLL_MS); } diff --git a/include/linux/pci.h b/include/linux/pci.h index e72ca8dd6241..7eed464e844a 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -325,6 +325,7 @@ struct pci_dev { pci_power_t current_state; /* Current operating state. In ACPI, this is D0-D3, D0 being fully functional, and D3 being off. */ + unsigned int imm_ready:1; /* Supports Immediate Readiness */ u8 pm_cap; /* PM capability offset */ unsigned int pme_support:5; /* Bitmask of states from which PME# can be generated */ diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index ee556ccc93f4..b5cf51a06cae 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -52,6 +52,7 @@ #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */ #define PCI_STATUS 0x06 /* 16 bits */ +#define PCI_STATUS_IMMEDIATE 0x01 /* Immediate Readiness */ #define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */ #define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */ #define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
Hi, Felipe Balbi <felipe.balbi@linux.intel.com> writes: > Felipe Balbi <felipe.balbi@linux.intel.com> writes: >>> Since this applies to all resets, I think you also need to get rid of >>>> waits following different reset types in step #2 and return immediately. >>>> I suggest you review callers of pci_dev_wait() and tap in there. >>> >>> I agree; I think we should be able to skip the delays in pcie_flr(), >>> pci_af_flr(), etc. >> >> but that's why I put the code in pci_dev_wait(). Both pcie_flr() and >> pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) >> call in pci_af_flr() can also be removed if (dev->imm_ready)? > > What about moving that msleep() to pci_dev_wait() itself? this will incur and extra 100ms delay on D3->D0 transitions and bus reset. Is that acceptable or would you prefer to add another check on pcie_flr() and pci_af_flr()? - balbi
On Wed, Sep 05, 2018 at 08:18:48AM +0300, Felipe Balbi wrote: > > Hi, > > Bjorn Helgaas <helgaas@kernel.org> writes: > > On Fri, Aug 03, 2018 at 01:25:58PM -0400, Sinan Kaya wrote: > >> On 8/3/2018 2:26 AM, Felipe Balbi wrote: > >> > Sinan Kaya <okaya@kernel.org> writes: > >> > > >> > > On 8/2/2018 7:36 AM, Felipe Balbi wrote: > >> > > > PCIe GEN4 defines a new bit on Status Register which tells us that, if > >> > > > Set, a function is immediately ready after a Reset. This means that > >> > > > all delays after a Conventional or Function Reset can be skipped. > >> > > > >> > > Can you give a reference to the section of the specification? or > >> > > a pointer to the ECN? > >> > > >> > Section 7.5.1.1.4 of PCIe GEN4 spec. Table 7-4: > >> > > >> > Immediate Readiness – This optional bit, when Set, indicates the > >> > Function is guaranteed to be ready to successfully complete valid > >> > configuration accesses at any time following any reset that the host is > >> > capable of issuing Configuration Requests to this Function. > >> > > >> > When this bit is Set, for accesses to this Function, software is exempt > >> > from all requirements to delay configuration accesses following any type > >> > of reset, including but not limited to the timing requirements defined > >> > in Section 6.6. How this guarantee is established is beyond the scope > >> > of this document. > >> > > >> > It is permitted that system software/firmware provide mechanisms that > >> > supersede the indication provided by this bit, however such > >> > software/firmware mechanisms are outside the scope of this > >> > specification. > >> > > >> > >> Thanks for the spec reference. > > > > Yes. Please include the reference in the changelog of v2. > > will do > > >> I think the patch is touching the wrong places. pci_dev_wait() is there > >> to wait for CRS response to finish following reset. > >> > >> Typical sequences are: > >> 1. Do some kind of reset in another routine > >> 2. Wait reset specific wait time (1sec for secondary bus reset as an > >> example and 100ms for d3-d0 transition) > >> 3. call pci_dev_wait() after reset to see if device can accept config > >> transactions. > >> > >> Since this applies to all resets, I think you also need to get rid of > >> waits following different reset types in step #2 and return immediately. > >> I suggest you review callers of pci_dev_wait() and tap in there. > > > > I agree; I think we should be able to skip the delays in pcie_flr(), > > pci_af_flr(), etc. > > but that's why I put the code in pci_dev_wait(). Both pcie_flr() and > pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) > call in pci_af_flr() can also be removed if (dev->imm_ready)? Yes. The spec says (PCIe r4.0, sec 7.5.1.1.4): When this bit is Set, for accesses to this Function, software is exempt from all requirements to delay configuration accesses following any type of reset, including but not limited to the timing requirements defined in Section 6.6. Sec 6.6 contains various references to waiting 100ms after different types of resets, and I think those correspond to the msleep(100) calls. So I think we can skip those delays if the device supports Immediate Readiness. pci_dev_wait() is connected with CRS. There's some reason that function doesn't actually looks for CRS responses, which I can't remember right now. But you can look it up in the changelogs. So I don't think we should put the Immediate Readiness check in pci_dev_wait(). Bjorn
On 9/5/2018 9:49 AM, Bjorn Helgaas wrote: > pci_dev_wait() is connected with CRS. There's some reason that > function doesn't actually looks for CRS responses, which I can't > remember right now. But you can look it up in the changelogs. > We wanted to be able to support virtual functions (that can't read vendor id) as well as systems without CRS visibility support. That was the reason why we are polling a well known register such as command status against ~0 rather than the vendor id register. > So I don't think we should put the Immediate Readiness check in > pci_dev_wait(). I agree, it should be outside of pci_dev_wait()
Hi, Bjorn Helgaas <helgaas@kernel.org> writes: >> >> I think the patch is touching the wrong places. pci_dev_wait() is there >> >> to wait for CRS response to finish following reset. >> >> >> >> Typical sequences are: >> >> 1. Do some kind of reset in another routine >> >> 2. Wait reset specific wait time (1sec for secondary bus reset as an >> >> example and 100ms for d3-d0 transition) >> >> 3. call pci_dev_wait() after reset to see if device can accept config >> >> transactions. >> >> >> >> Since this applies to all resets, I think you also need to get rid of >> >> waits following different reset types in step #2 and return immediately. >> >> I suggest you review callers of pci_dev_wait() and tap in there. >> > >> > I agree; I think we should be able to skip the delays in pcie_flr(), >> > pci_af_flr(), etc. >> >> but that's why I put the code in pci_dev_wait(). Both pcie_flr() and >> pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) >> call in pci_af_flr() can also be removed if (dev->imm_ready)? > > Yes. The spec says (PCIe r4.0, sec 7.5.1.1.4): > > When this bit is Set, for accesses to this Function, software is > exempt from all requirements to delay configuration accesses > following any type of reset, including but not limited to the timing > requirements defined in Section 6.6. > > Sec 6.6 contains various references to waiting 100ms after different > types of resets, and I think those correspond to the msleep(100) > calls. So I think we can skip those delays if the device supports > Immediate Readiness. > > pci_dev_wait() is connected with CRS. There's some reason that > function doesn't actually looks for CRS responses, which I can't > remember right now. But you can look it up in the changelogs. > > So I don't think we should put the Immediate Readiness check in > pci_dev_wait(). IOW, you want something like below? If that's the case, I can send another version shortly. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 29ff9619b5fa..61be196db92b 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -999,7 +999,7 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) * because have already delayed for the bridge. */ if (dev->runtime_d3cold) { - if (dev->d3cold_delay) + if (dev->d3cold_delay && !dev->imm_ready) msleep(dev->d3cold_delay); /* * When powering on a bridge from D3cold, the @@ -2644,6 +2644,7 @@ EXPORT_SYMBOL_GPL(pci_d3cold_disable); void pci_pm_init(struct pci_dev *dev) { int pm; + u16 status; u16 pmc; pm_runtime_forbid(&dev->dev); @@ -2706,6 +2707,9 @@ void pci_pm_init(struct pci_dev *dev) /* Disable the PME# generation functionality */ pci_pme_active(dev, false); } + + pci_read_config_word(dev, PCI_STATUS, &status); + dev->imm_ready = status & PCI_STATUS_IMMEDIATE; } static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) @@ -4376,6 +4380,9 @@ int pcie_flr(struct pci_dev *dev) pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_BCR_FLR); + if (dev->imm_ready) + return 0; + /* * Per PCIe r4.0, sec 6.6.2, a device must complete an FLR within * 100ms, but may silently discard requests while the FLR is in @@ -4417,6 +4424,9 @@ static int pci_af_flr(struct pci_dev *dev, int probe) pci_write_config_byte(dev, pos + PCI_AF_CTRL, PCI_AF_CTRL_FLR); + if (dev->imm_ready) + return 0; + /* * Per Advanced Capabilities for Conventional PCI ECN, 13 April 2006, * updated 27 July 2006; a device must complete an FLR within diff --git a/include/linux/pci.h b/include/linux/pci.h index e72ca8dd6241..7eed464e844a 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -325,6 +325,7 @@ struct pci_dev { pci_power_t current_state; /* Current operating state. In ACPI, this is D0-D3, D0 being fully functional, and D3 being off. */ + unsigned int imm_ready:1; /* Supports Immediate Readiness */ u8 pm_cap; /* PM capability offset */ unsigned int pme_support:5; /* Bitmask of states from which PME# can be generated */ diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index ee556ccc93f4..b5cf51a06cae 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -52,6 +52,7 @@ #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */ #define PCI_STATUS 0x06 /* 16 bits */ +#define PCI_STATUS_IMMEDIATE 0x01 /* Immediate Readiness */ #define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */ #define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */ #define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
On Thu, Sep 06, 2018 at 09:02:56AM +0300, Felipe Balbi wrote: > Bjorn Helgaas <helgaas@kernel.org> writes: > >> >> I think the patch is touching the wrong places. pci_dev_wait() is there > >> >> to wait for CRS response to finish following reset. > >> >> > >> >> Typical sequences are: > >> >> 1. Do some kind of reset in another routine > >> >> 2. Wait reset specific wait time (1sec for secondary bus reset as an > >> >> example and 100ms for d3-d0 transition) > >> >> 3. call pci_dev_wait() after reset to see if device can accept config > >> >> transactions. > >> >> > >> >> Since this applies to all resets, I think you also need to get rid of > >> >> waits following different reset types in step #2 and return immediately. > >> >> I suggest you review callers of pci_dev_wait() and tap in there. > >> > > >> > I agree; I think we should be able to skip the delays in pcie_flr(), > >> > pci_af_flr(), etc. > >> > >> but that's why I put the code in pci_dev_wait(). Both pcie_flr() and > >> pci_af_flr() call pci_dev_wait(); or are you saying that the msleep(100) > >> call in pci_af_flr() can also be removed if (dev->imm_ready)? > > > > Yes. The spec says (PCIe r4.0, sec 7.5.1.1.4): > > > > When this bit is Set, for accesses to this Function, software is > > exempt from all requirements to delay configuration accesses > > following any type of reset, including but not limited to the timing > > requirements defined in Section 6.6. > > > > Sec 6.6 contains various references to waiting 100ms after different > > types of resets, and I think those correspond to the msleep(100) > > calls. So I think we can skip those delays if the device supports > > Immediate Readiness. > > > > pci_dev_wait() is connected with CRS. There's some reason that > > function doesn't actually looks for CRS responses, which I can't > > remember right now. But you can look it up in the changelogs. > > > > So I don't think we should put the Immediate Readiness check in > > pci_dev_wait(). > > IOW, you want something like below? If that's the case, I can send > another version shortly. > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 29ff9619b5fa..61be196db92b 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -999,7 +999,7 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) > * because have already delayed for the bridge. > */ > if (dev->runtime_d3cold) { > - if (dev->d3cold_delay) > + if (dev->d3cold_delay && !dev->imm_ready) > msleep(dev->d3cold_delay); > /* > * When powering on a bridge from D3cold, the > @@ -2644,6 +2644,7 @@ EXPORT_SYMBOL_GPL(pci_d3cold_disable); > void pci_pm_init(struct pci_dev *dev) > { > int pm; > + u16 status; > u16 pmc; > > pm_runtime_forbid(&dev->dev); > @@ -2706,6 +2707,9 @@ void pci_pm_init(struct pci_dev *dev) > /* Disable the PME# generation functionality */ > pci_pme_active(dev, false); > } > + > + pci_read_config_word(dev, PCI_STATUS, &status); > + dev->imm_ready = status & PCI_STATUS_IMMEDIATE; This needs something like if (status & PCI_STATUS_IMMEDIATE) dev->imm_ready = 1; It's true that PCI_STATUS_IMMEDIATE is the least-significant bit, but the code here shouldn't depend on that. But otherwise this looks good. > } > > static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) > @@ -4376,6 +4380,9 @@ int pcie_flr(struct pci_dev *dev) > > pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_BCR_FLR); > > + if (dev->imm_ready) > + return 0; > + > /* > * Per PCIe r4.0, sec 6.6.2, a device must complete an FLR within > * 100ms, but may silently discard requests while the FLR is in > @@ -4417,6 +4424,9 @@ static int pci_af_flr(struct pci_dev *dev, int probe) > > pci_write_config_byte(dev, pos + PCI_AF_CTRL, PCI_AF_CTRL_FLR); > > + if (dev->imm_ready) > + return 0; > + > /* > * Per Advanced Capabilities for Conventional PCI ECN, 13 April 2006, > * updated 27 July 2006; a device must complete an FLR within > diff --git a/include/linux/pci.h b/include/linux/pci.h > index e72ca8dd6241..7eed464e844a 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -325,6 +325,7 @@ struct pci_dev { > pci_power_t current_state; /* Current operating state. In ACPI, > this is D0-D3, D0 being fully > functional, and D3 being off. */ > + unsigned int imm_ready:1; /* Supports Immediate Readiness */ > u8 pm_cap; /* PM capability offset */ > unsigned int pme_support:5; /* Bitmask of states from which PME# > can be generated */ > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h > index ee556ccc93f4..b5cf51a06cae 100644 > --- a/include/uapi/linux/pci_regs.h > +++ b/include/uapi/linux/pci_regs.h > @@ -52,6 +52,7 @@ > #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */ > > #define PCI_STATUS 0x06 /* 16 bits */ > +#define PCI_STATUS_IMMEDIATE 0x01 /* Immediate Readiness */ > #define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */ > #define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */ > #define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */ > > > -- > balbi
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 316496e99da9..cf5239edd46a 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -835,7 +835,7 @@ static void __pci_start_power_transition(struct pci_dev *dev, pci_power_t state) * because have already delayed for the bridge. */ if (dev->runtime_d3cold) { - if (dev->d3cold_delay) + if (dev->d3cold_delay && !dev->immediate) msleep(dev->d3cold_delay); /* * When powering on a bridge from D3cold, the @@ -2442,6 +2442,7 @@ EXPORT_SYMBOL_GPL(pci_d3cold_disable); void pci_pm_init(struct pci_dev *dev) { int pm; + u16 status; u16 pmc; pm_runtime_forbid(&dev->dev); @@ -2504,6 +2505,9 @@ void pci_pm_init(struct pci_dev *dev) /* Disable the PME# generation functionality */ pci_pme_active(dev, false); } + + pci_read_config_word(dev, PCI_STATUS, &status); + dev->immediate = status & PCI_STATUS_IMMEDIATE; } static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop) @@ -4034,6 +4038,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout) int delay = 1; u32 id; + if (dev->immediate) + return 0; + /* * After reset, the device should not silently discard config * requests, but it may still indicate that it needs more time by diff --git a/include/linux/pci.h b/include/linux/pci.h index abd5d5e17aee..d8f0a382b026 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -321,6 +321,7 @@ struct pci_dev { pci_power_t current_state; /* Current operating state. In ACPI, this is D0-D3, D0 being fully functional, and D3 being off. */ + unsigned int immediate:1; /* Supports Immediate Readiness */ u8 pm_cap; /* PM capability offset */ unsigned int pme_support:5; /* Bitmask of states from which PME# can be generated */ diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h index 4da87e2ef8a8..796d12910791 100644 --- a/include/uapi/linux/pci_regs.h +++ b/include/uapi/linux/pci_regs.h @@ -52,6 +52,7 @@ #define PCI_COMMAND_INTX_DISABLE 0x400 /* INTx Emulation Disable */ #define PCI_STATUS 0x06 /* 16 bits */ +#define PCI_STATUS_IMMEDIATE 0x01 /* Immediate Readiness */ #define PCI_STATUS_INTERRUPT 0x08 /* Interrupt status */ #define PCI_STATUS_CAP_LIST 0x10 /* Support Capability List */ #define PCI_STATUS_66MHZ 0x20 /* Support 66 MHz PCI 2.1 bus */
PCIe GEN4 defines a new bit on Status Register which tells us that, if Set, a function is immediately ready after a Reset. This means that all delays after a Conventional or Function Reset can be skipped. This patch reads such bit and caches its value in a flag inside struct pci_dev to be checked later if we should delay or can skip delays after a reset. Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> --- drivers/pci/pci.c | 9 ++++++++- include/linux/pci.h | 1 + include/uapi/linux/pci_regs.h | 1 + 3 files changed, 10 insertions(+), 1 deletion(-)