Message ID | 1708697021-16877-2-git-send-email-quic_msarkar@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Series | arm64: qcom: sa8775p: add cache coherency support for SA8775P |
On Fri, Feb 23, 2024 at 07:33:38PM +0530, Mrinmay Sarkar wrote:
> Due to some hardware changes, SA8775P has set the NO_SNOOP attribute
> in its TLP for all the PCIe controllers. NO_SNOOP attribute when set,
> the requester is indicating that there no cache coherency issues exit
> for the addressed memory on the host i.e., memory is not cached. But
> in reality, requester cannot assume this unless there is a complete
> control/visibility over the addressed memory on the host.

s/that there no/that no/
s/exit/exist/

Forgive my ignorance here. It sounds like the cache coherency issue
would refer to system memory, so the relevant No Snoop attribute would
be in DMA transactions, i.e., Memory Reads or Writes initiated by PCIe
Endpoints. But it looks like this patch would affect TLPs initiated
by the Root Complex, not those from Endpoints, so I'm confused about
how this works.

If this were in the qcom-ep driver, it would make sense that setting
No Snoop in the TLPs initiated by the Endpoint could be a problem, but
that doesn't seem to be what this patch is concerned with.

> And worst case, if the memory is cached on the host, it may lead to
> memory corruption issues. It should be noted that the caching of memory
> on the host is not solely dependent on the NO_SNOOP attribute in TLP.
>
> So to avoid the corruption, this patch overrides the NO_SNOOP attribute
> by setting the PCIE_PARF_NO_SNOOP_OVERIDE register. This patch is not
> needed for other upstream supported platforms since they do not set
> NO_SNOOP attribute by default.
>
> 8775 has IP version 1.34.0 so intruduce a new cfg(cfg_1_34_0) for this
> platform. Assign enable_cache_snoop flag into struct qcom_pcie_cfg and
> set it true in cfg_1_34_0 and enable cache snooping if this particular
> flag is true.

s/intruduce/introduce/

Bjorn
On 2/24/2024 4:24 AM, Bjorn Helgaas wrote:
> [...]
> Forgive my ignorance here. It sounds like the cache coherency issue
> would refer to system memory, so the relevant No Snoop attribute would
> be in DMA transactions, i.e., Memory Reads or Writes initiated by PCIe
> Endpoints. But it looks like this patch would affect TLPs initiated
> by the Root Complex, not those from Endpoints, so I'm confused about
> how this works.
>
> If this were in the qcom-ep driver, it would make sense that setting
> No Snoop in the TLPs initiated by the Endpoint could be a problem, but
> that doesn't seem to be what this patch is concerned with.

I think in multiprocessor system cache coherency issue might occur.
and RC as well needs to snoop cache to avoid coherency as it is not
enable by default.

and we are enabling this feature for qcom-ep driver as well.
it is in patch2.

Thanks
Mrinmay
On Wed, Feb 28, 2024 at 06:34:11PM +0530, Mrinmay Sarkar wrote:
> On 2/24/2024 4:24 AM, Bjorn Helgaas wrote:
> > [...]
> > If this were in the qcom-ep driver, it would make sense that setting
> > No Snoop in the TLPs initiated by the Endpoint could be a problem, but
> > that doesn't seem to be what this patch is concerned with.
>
> I think in multiprocessor system cache coherency issue might occur.
> and RC as well needs to snoop cache to avoid coherency as it is not
> enable by default.

My mental picture isn't detailed enough, so I'm still confused. We're
talking about TLPs initiated by the RC. Normally these would be
because a driver did a CPU load or store to a PCIe device MMIO space,
not to system memory.

But I guess you're suggesting the RC can initiate a TLP with a system
memory address? And this TLP would be routed not to a Root Port or to
downstream devices, but it would instead be kind of a loopback and be
routed back up through the RC and maybe IOMMU, to system memory?

I would have expected accesses like this to be routed directly to
system memory without ever reaching the PCIe RC.

> and we are enabling this feature for qcom-ep driver as well.
> it is in patch2.
On Wed, Feb 28, 2024 at 09:02:11AM -0600, Bjorn Helgaas wrote:
> On Wed, Feb 28, 2024 at 06:34:11PM +0530, Mrinmay Sarkar wrote:
> > [...]
> > I think in multiprocessor system cache coherency issue might occur.
> > and RC as well needs to snoop cache to avoid coherency as it is not
> > enable by default.
>
> My mental picture isn't detailed enough, so I'm still confused. We're
> talking about TLPs initiated by the RC. Normally these would be
> because a driver did a CPU load or store to a PCIe device MMIO space,
> not to system memory.
>

Endpoint can expose its system memory as a BAR to the host. In that case, the
cache coherency issue would apply for TLPs originating from RC as well.

- Mani

> But I guess you're suggesting the RC can initiate a TLP with a system
> memory address? And this TLP would be routed not to a Root Port or to
> downstream devices, but it would instead be kind of a loopback and be
> routed back up through the RC and maybe IOMMU, to system memory?
>
> I would have expected accesses like this to be routed directly to
> system memory without ever reaching the PCIe RC.
On Wed, Feb 28, 2024 at 10:44:12PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Feb 28, 2024 at 09:02:11AM -0600, Bjorn Helgaas wrote:
> > [...]
> > My mental picture isn't detailed enough, so I'm still confused. We're
> > talking about TLPs initiated by the RC. Normally these would be
> > because a driver did a CPU load or store to a PCIe device MMIO space,
> > not to system memory.
>
> Endpoint can expose its system memory as a BAR to the host. In that case, the
> cache coherency issue would apply for TLPs originating from RC as well.

What PCIe transactions are involved here? So far I know about:

  RC initiates Memory Read Request (or Write) with NO_SNOOP==0
  ...
  EP responds with Completion with Data (for Read)

But I guess you're saying the EP would initiate other transactions in
the middle related to snooping? I don't know what those are.
On Wed, Feb 28, 2024 at 11:39:07AM -0600, Bjorn Helgaas wrote:
> On Wed, Feb 28, 2024 at 10:44:12PM +0530, Manivannan Sadhasivam wrote:
> > [...]
> > Endpoint can expose its system memory as a BAR to the host. In that case, the
> > cache coherency issue would apply for TLPs originating from RC as well.
>
> What PCIe transactions are involved here? So far I know about:
>
>   RC initiates Memory Read Request (or Write) with NO_SNOOP==0
>   ...
>   EP responds with Completion with Data (for Read)
>

The memory on the endpoint may be cached (due to linear map and such). So
if the RC is initiating the MWd TLP with NO_SNOOP=1, then there would be
coherency issues because there is no guarantee that the memory is not
cached on the endpoint. So, not snooping the caches and directly writing to
the DDR would cause coherency issues on the endpoint as well.

- Mani

> But I guess you're saying the EP would initiate other transactions in
> the middle related to snooping? I don't know what those are.
On Thu, Feb 29, 2024 at 12:15:02AM +0530, Manivannan Sadhasivam wrote:
> On Wed, Feb 28, 2024 at 11:39:07AM -0600, Bjorn Helgaas wrote:
> > [...]
> > What PCIe transactions are involved here? So far I know about:
> >
> >   RC initiates Memory Read Request (or Write) with NO_SNOOP==0
> >   ...
> >   EP responds with Completion with Data (for Read)
>
> The memory on the endpoint may be cached (due to linear map and
> such). So if the RC is initiating the MWd TLP with NO_SNOOP=1, then
> there would be coherency issues because there is no guarantee that
> the memory is not cached on the endpoint. So, not snooping the
> caches and directly writing to the DDR would cause coherency issues
> on the endpoint as well.

I don't know what linear map is, but I'll take your word for it that
endpoints are allowed to cache things internally. So I guess in the
ideal world there might be a way for a driver to specify no-snoop for
accesses to its device if it knows there is no caching.

The commit log for this patch refers to caching on the *host*, though,
and IIUC you're saying this patch clears NO_SNOOP on TLPs from the RC
because of potential coherency issues on the *endpoint*.
On Wed, Feb 28, 2024 at 01:34:41PM -0600, Bjorn Helgaas wrote:
> On Thu, Feb 29, 2024 at 12:15:02AM +0530, Manivannan Sadhasivam wrote:
> > [...]
> > The memory on the endpoint may be cached (due to linear map and
> > such). So if the RC is initiating the MWd TLP with NO_SNOOP=1, then
> > there would be coherency issues because there is no guarantee that
> > the memory is not cached on the endpoint. So, not snooping the
> > caches and directly writing to the DDR would cause coherency issues
> > on the endpoint as well.
>
> I don't know what linear map is, but I'll take your word for it that
> endpoints are allowed to cache things internally. So I guess in the
> ideal world there might be a way for a driver to specify no-snoop for
> accesses to its device if it knows there is no caching.
>

I referred to Linux kernel's mapping of the DDR space as "linear map". But
the endpoint may not run only Linux, but any RTOS or even bare metal. So it
is certainly possible the BAR memory could be cached.

> The commit log for this patch refers to caching on the *host*, though,
> and IIUC you're saying this patch clears NO_SNOOP on TLPs from the RC
> because of potential coherency issues on the *endpoint*.
>

Yeah, the commit message was wrong. I shared the wording during the review
of previous version and it got duplicated for both RC and EP patches :(

This should be fixed.

- Mani
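The coherency hazard discussed in this sub-thread can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the endpoint is reduced to one word of DDR behind a one-entry CPU cache, and all names (`toy_endpoint`, `inbound_write`, `observe_after_write`) are hypothetical and not part of the patch or of any kernel API. It shows a no-snoop write from the RC updating DDR while the endpoint CPU keeps reading a stale cached copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy endpoint (hypothetical): one word of DDR behind a one-entry cache. */
struct toy_endpoint {
	uint32_t ddr;
	uint32_t cache_data;
	bool cache_valid;
};

/* Endpoint CPU read: served from the cache if valid, else filled from DDR. */
static uint32_t ep_cpu_read(struct toy_endpoint *ep)
{
	assert(ep);
	if (!ep->cache_valid) {
		ep->cache_data = ep->ddr;
		ep->cache_valid = true;
	}
	return ep->cache_data;
}

/* Inbound Memory Write from the RC into BAR-backed endpoint memory. */
static void inbound_write(struct toy_endpoint *ep, uint32_t val, bool no_snoop)
{
	assert(ep);
	ep->ddr = val;
	if (!no_snoop)
		ep->cache_valid = false;	/* snooped: drop the stale line */
}

/* What the endpoint CPU observes after the host overwrites old_val. */
static uint32_t observe_after_write(uint32_t old_val, uint32_t new_val,
				    bool no_snoop)
{
	struct toy_endpoint ep = { .ddr = old_val, .cache_valid = false };

	(void)ep_cpu_read(&ep);			/* CPU caches old_val */
	inbound_write(&ep, new_val, no_snoop);
	return ep_cpu_read(&ep);
}
```

In this model, `observe_after_write(old, new, true)` returns the old value, while the snooped variant returns the new one. Real hardware is far more involved; the sketch only captures why the write must be snooped when the target memory may be cached.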
On Fri, Feb 23, 2024 at 07:33:38PM +0530, Mrinmay Sarkar wrote:

Subject should be:

"PCI: qcom: Override NO_SNOOP attribute for SA8775P"

> Due to some hardware changes, SA8775P has set the NO_SNOOP attribute
> in its TLP for all the PCIe controllers. NO_SNOOP attribute when set,
> the requester is indicating that there no cache coherency issues exit
> for the addressed memory on the host i.e., memory is not cached. But

s/host/endpoint

> in reality, requester cannot assume this unless there is a complete
> control/visibility over the addressed memory on the host.
>

s/host/endpoint

> And worst case, if the memory is cached on the host, it may lead to

s/host/endpoint

> memory corruption issues. It should be noted that the caching of memory
> on the host is not solely dependent on the NO_SNOOP attribute in TLP.
>

s/host/endpoint

> So to avoid the corruption, this patch overrides the NO_SNOOP attribute
> by setting the PCIE_PARF_NO_SNOOP_OVERIDE register. This patch is not
> needed for other upstream supported platforms since they do not set
> NO_SNOOP attribute by default.
>
> 8775 has IP version 1.34.0 so intruduce a new cfg(cfg_1_34_0) for this
> platform. Assign enable_cache_snoop flag into struct qcom_pcie_cfg and
> set it true in cfg_1_34_0 and enable cache snooping if this particular
> flag is true.
>
> Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
> ---
>  drivers/pci/controller/dwc/pcie-qcom.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 2ce2a3bd932b..872be7f7d7b3 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -51,6 +51,7 @@
>  #define PARF_SID_OFFSET			0x234
>  #define PARF_BDF_TRANSLATE_CFG		0x24c
>  #define PARF_SLV_ADDR_SPACE_SIZE	0x358
> +#define PARF_NO_SNOOP_OVERIDE		0x3d4
>  #define PARF_DEVICE_TYPE		0x1000
>  #define PARF_BDF_TO_SID_TABLE_N	0x2000
>
> @@ -117,6 +118,10 @@
>  /* PARF_LTSSM register fields */
>  #define LTSSM_EN			BIT(8)
>
> +/* PARF_NO_SNOOP_OVERIDE register fields */
> +#define WR_NO_SNOOP_OVERIDE_EN		BIT(1)
> +#define RD_NO_SNOOP_OVERIDE_EN		BIT(3)
> +
>  /* PARF_DEVICE_TYPE register fields */
>  #define DEVICE_TYPE_RC			0x4
>
> @@ -229,6 +234,7 @@ struct qcom_pcie_ops {
>

Please add Kdoc comments for this struct. And describe the
"override_no_snoop" member as below:

"Override NO_SNOOP attribute in TLP to enable cache snooping"

>  struct qcom_pcie_cfg {
>  	const struct qcom_pcie_ops *ops;
> +	bool enable_cache_snoop;

Rename this to "override_no_snoop"

>  };
>
>  struct qcom_pcie {
> @@ -961,6 +967,13 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
>
>  static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
>  {
> +	const struct qcom_pcie_cfg *pcie_cfg = pcie->cfg;
> +
> +	/* Enable cache snooping for SA8775P */

Remove this comment in favor of Kdoc mentioned above.

- Mani
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 2ce2a3bd932b..872be7f7d7b3 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -51,6 +51,7 @@
 #define PARF_SID_OFFSET			0x234
 #define PARF_BDF_TRANSLATE_CFG		0x24c
 #define PARF_SLV_ADDR_SPACE_SIZE	0x358
+#define PARF_NO_SNOOP_OVERIDE		0x3d4
 #define PARF_DEVICE_TYPE		0x1000
 #define PARF_BDF_TO_SID_TABLE_N		0x2000
 
@@ -117,6 +118,10 @@
 /* PARF_LTSSM register fields */
 #define LTSSM_EN			BIT(8)
 
+/* PARF_NO_SNOOP_OVERIDE register fields */
+#define WR_NO_SNOOP_OVERIDE_EN		BIT(1)
+#define RD_NO_SNOOP_OVERIDE_EN		BIT(3)
+
 /* PARF_DEVICE_TYPE register fields */
 #define DEVICE_TYPE_RC			0x4
 
@@ -229,6 +234,7 @@ struct qcom_pcie_ops {
 
 struct qcom_pcie_cfg {
 	const struct qcom_pcie_ops *ops;
+	bool enable_cache_snoop;
 };
 
 struct qcom_pcie {
@@ -961,6 +967,13 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
 
 static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
 {
+	const struct qcom_pcie_cfg *pcie_cfg = pcie->cfg;
+
+	/* Enable cache snooping for SA8775P */
+	if (pcie_cfg->enable_cache_snoop)
+		writel(WR_NO_SNOOP_OVERIDE_EN | RD_NO_SNOOP_OVERIDE_EN,
+		       pcie->parf + PARF_NO_SNOOP_OVERIDE);
+
 	qcom_pcie_clear_hpc(pcie->pci);
 
 	return 0;
@@ -1334,6 +1347,11 @@ static const struct qcom_pcie_cfg cfg_1_9_0 = {
 	.ops = &ops_1_9_0,
 };
 
+static const struct qcom_pcie_cfg cfg_1_34_0 = {
+	.ops = &ops_1_9_0,
+	.enable_cache_snoop = true,
+};
+
 static const struct qcom_pcie_cfg cfg_2_1_0 = {
 	.ops = &ops_2_1_0,
 };
@@ -1630,7 +1648,7 @@ static const struct of_device_id qcom_pcie_match[] = {
 	{ .compatible = "qcom,pcie-msm8996", .data = &cfg_2_3_2 },
 	{ .compatible = "qcom,pcie-qcs404", .data = &cfg_2_4_0 },
 	{ .compatible = "qcom,pcie-sa8540p", .data = &cfg_1_9_0 },
-	{ .compatible = "qcom,pcie-sa8775p", .data = &cfg_1_9_0},
+	{ .compatible = "qcom,pcie-sa8775p", .data = &cfg_1_34_0},
 	{ .compatible = "qcom,pcie-sc7280", .data = &cfg_1_9_0 },
 	{ .compatible = "qcom,pcie-sc8180x", .data = &cfg_1_9_0 },
 	{ .compatible = "qcom,pcie-sc8280xp", .data = &cfg_1_9_0 },
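As a rough illustration of what the post-init hunk does, and of how the review feedback (kernel-doc plus the `override_no_snoop` name) might look once applied, here is a user-space sketch. It is not the driver code: `mock_pcie_cfg`, `mock_writel`, `mock_post_init`, `no_snoop_reg_after_init`, and the `MOCK_PARF_WORDS` window size are hypothetical stand-ins, and the PARF block is modelled as a plain array instead of MMIO. Only the offset and bit definitions are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)				(1u << (n))

/* Offset and register fields copied from the patch. */
#define PARF_NO_SNOOP_OVERIDE		0x3d4
#define WR_NO_SNOOP_OVERIDE_EN		BIT(1)
#define RD_NO_SNOOP_OVERIDE_EN		BIT(3)

/* Hypothetical size of the fake PARF window, in 32-bit words. */
#define MOCK_PARF_WORDS			0x400

/**
 * struct mock_pcie_cfg - Per-SoC configuration (user-space stand-in)
 * @override_no_snoop: Override NO_SNOOP attribute in TLP to enable cache
 *		       snooping
 */
struct mock_pcie_cfg {
	bool override_no_snoop;
};

/* Stand-in for writel(): store val at a byte offset in the fake window. */
static void mock_writel(uint32_t val, uint32_t *parf, uint32_t offset)
{
	assert(offset % 4 == 0 && offset / 4 < MOCK_PARF_WORDS);
	parf[offset / 4] = val;
}

/* Mirrors the conditional register write added to the post-init hook. */
static void mock_post_init(const struct mock_pcie_cfg *cfg, uint32_t *parf)
{
	if (cfg->override_no_snoop)
		mock_writel(WR_NO_SNOOP_OVERIDE_EN | RD_NO_SNOOP_OVERIDE_EN,
			    parf, PARF_NO_SNOOP_OVERIDE);
}

/* Value left in PARF_NO_SNOOP_OVERIDE after init for a given flag. */
static uint32_t no_snoop_reg_after_init(bool override_no_snoop)
{
	static uint32_t parf[MOCK_PARF_WORDS];
	struct mock_pcie_cfg cfg = { .override_no_snoop = override_no_snoop };

	parf[PARF_NO_SNOOP_OVERIDE / 4] = 0;
	mock_post_init(&cfg, parf);
	return parf[PARF_NO_SNOOP_OVERIDE / 4];
}
```

With the flag set, the register ends up holding `BIT(1) | BIT(3)`, i.e. 0xa; with the flag clear, the register is left untouched.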
Due to some hardware changes, SA8775P has set the NO_SNOOP attribute
in its TLP for all the PCIe controllers. NO_SNOOP attribute when set,
the requester is indicating that there no cache coherency issues exit
for the addressed memory on the host i.e., memory is not cached. But
in reality, requester cannot assume this unless there is a complete
control/visibility over the addressed memory on the host.

And worst case, if the memory is cached on the host, it may lead to
memory corruption issues. It should be noted that the caching of memory
on the host is not solely dependent on the NO_SNOOP attribute in TLP.

So to avoid the corruption, this patch overrides the NO_SNOOP attribute
by setting the PCIE_PARF_NO_SNOOP_OVERIDE register. This patch is not
needed for other upstream supported platforms since they do not set
NO_SNOOP attribute by default.

8775 has IP version 1.34.0 so intruduce a new cfg(cfg_1_34_0) for this
platform. Assign enable_cache_snoop flag into struct qcom_pcie_cfg and
set it true in cfg_1_34_0 and enable cache snooping if this particular
flag is true.

Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
---
 drivers/pci/controller/dwc/pcie-qcom.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
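The per-platform selection described in the last paragraph of the commit message can be sketched in user space as a lookup from compatible string to configuration. This is an assumption-laden illustration, not the kernel's `of_device_id` matching: `mock_cfg`, `mock_match`, `mock_match_table`, and `mock_lookup` are hypothetical names, the cfg variables only reuse the names from the patch, and only three of the compatible strings from the patch are listed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct qcom_pcie_cfg (ops pointer omitted). */
struct mock_cfg {
	const char *ip_version;
	bool enable_cache_snoop;
};

static const struct mock_cfg cfg_1_9_0 = {
	.ip_version = "1.9.0",
};

/* SA8775P gets its own cfg so that only it enables the override. */
static const struct mock_cfg cfg_1_34_0 = {
	.ip_version = "1.34.0",
	.enable_cache_snoop = true,
};

/* Hypothetical stand-in for an of_device_id entry. */
struct mock_match {
	const char *compatible;
	const struct mock_cfg *data;
};

static const struct mock_match mock_match_table[] = {
	{ "qcom,pcie-sa8540p", &cfg_1_9_0 },
	{ "qcom,pcie-sa8775p", &cfg_1_34_0 },
	{ "qcom,pcie-sc7280", &cfg_1_9_0 },
	{ NULL, NULL },	/* sentinel */
};

/* Return the cfg for a compatible string, or NULL if it is not listed. */
static const struct mock_cfg *mock_lookup(const char *compatible)
{
	assert(compatible);
	for (const struct mock_match *m = mock_match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

Looking up "qcom,pcie-sa8775p" yields the cfg with the flag set, while the other listed platforms keep sharing `cfg_1_9_0` with the flag clear, which matches the commit message's claim that other platforms do not need the override.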