Message ID | 20221214235305.31744-24-Sergey.Semin@baikalelectronics.ru (mailing list archive)
State      | Superseded
Series     | dmaengine: dw-edma: Add RP/EP local DMA controllers support
Hi Robin

On Thu, Dec 15, 2022 at 02:53:03AM +0300, Serge Semin wrote:
> DW PCIe Root Ports and End-points can be equipped with the DW eDMA engine.
> In that case it is critical to have the platform device pre-initialized
> with a valid DMA-mask so the drivers using the eDMA-engine would be able
> to allocate the DMA-able buffers. The MSI-capable data requires to be
> allocated from the lowest 4GB region. Since that procedure implies the
> DMA-mask change we need to restore the mask set by the low-level drivers
> after the MSI-data allocation is done.

I dropped the patch

  [PATCH v6 22/24] dmaengine: dw-edma: Bypass dma-ranges mapping for the local setup
  Link: https://lore.kernel.org/linux-pci/20221107210438.1515-23-Sergey.Semin@baikalelectronics.ru/

from the series. It has turned out to be useless since your commit
f1ad5338a4d5 ("of: Fix "dma-ranges" handling for bus controllers").
Instead I added the patches

  [PATCH v7 23/25] PCI: dwc: Restore DMA-mask after MSI-data allocation
  [PATCH v7 24/25] PCI: bt1: Set 64-bit DMA-mask

to this patchset in order to properly handle the DMA-mask. Could you
please give your opinion on these patches?

-Serge(y)

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
>
> ---
>
> Changelog v7:
> - This is a new patch added on v7 stage of the series. (@Robin)
>
> ---
>  drivers/pci/controller/dwc/pcie-designware-host.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 5762bd306261..1a3dae1f6aa2 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -326,7 +326,7 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> 	struct device *dev = pci->dev;
> 	struct platform_device *pdev = to_platform_device(dev);
> -	u64 *msi_vaddr;
> +	u64 *msi_vaddr, dma_mask;
> 	int ret;
> 	u32 ctrl, num_ctrls;
>
> @@ -366,6 +366,13 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> 					    dw_chained_msi_isr, pp);
> 	}
>
> +	/*
> +	 * Save and then restore the DMA-mask pre-set by the low-level drivers
> +	 * after allocating the MSI-capable region. The mask might be useful for
> +	 * the controllers with the embedded eDMA engine.
> +	 */
> +	dma_mask = dma_get_mask(dev);
> +
> 	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
> 	if (ret)
> 		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
>
> @@ -378,6 +385,10 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
> 		return -ENOMEM;
> 	}
>
> +	ret = dma_set_mask_and_coherent(dev, dma_mask);
> +	if (ret)
> +		dev_warn(dev, "Failed to re-store DMA-mask\n");
> +
> 	return 0;
> }
>
> --
> 2.38.1
On Thu, Dec 15, 2022 at 02:53:03AM +0300, Serge Semin wrote:
> DW PCIe Root Ports and End-points can be equipped with the DW eDMA engine.
> In that case it is critical to have the platform device pre-initialized
> with a valid DMA-mask so the drivers using the eDMA-engine would be able
> to allocate the DMA-able buffers. The MSI-capable data requires to be
> allocated from the lowest 4GB region. Since that procedure implies the
> DMA-mask change we need to restore the mask set by the low-level drivers
> after the MSI-data allocation is done.

You can't change the DMA mask when there are existing allocations.
Hi Christoph

On Wed, Dec 14, 2022 at 11:13:41PM -0800, Christoph Hellwig wrote:
> On Thu, Dec 15, 2022 at 02:53:03AM +0300, Serge Semin wrote:
> > DW PCIe Root Ports and End-points can be equipped with the DW eDMA engine.
> > In that case it is critical to have the platform device pre-initialized
> > with a valid DMA-mask so the drivers using the eDMA-engine would be able
> > to allocate the DMA-able buffers. The MSI-capable data requires to be
> > allocated from the lowest 4GB region. Since that procedure implies the
> > DMA-mask change we need to restore the mask set by the low-level drivers
> > after the MSI-data allocation is done.
>
> You can't change the DMA mask when there are existing allocations.

Em, what do you guys suggest for the DW PCIe devices with the embedded
DMA-engine then? To live forever with the SWIOTLBs? I can't drop the
DMA-mask update due to the commit 423511ec23e2 ("PCI: dwc: Drop
dependency on ZONE_DMA32") and I can't change the mask after it's
updated. Note it's updated for a memory allocation to which actually
no DMA will be performed, see
https://lore.kernel.org/linux-pci/20220825185026.3816331-2-willmcvicker@google.com/.
My patches imply adding the real DMA operations support.

We've discussed this a lot with Robin in various threads and I thought
a workable solution had been found. I was going to update the mask in
another place, but basically it would still mean first setting the
32-bit mask here and then changing it to the 64-bit one in the
framework of the DW eDMA driver.

So to speak, I don't see a proper way out of the situation. Nothing I
suggested was accepted and now we'll have to live with the SWIOTLBs
used for the memory above 4GB. So please suggest a workable solution
then. We need the following things:
1. Somehow preserve a single DWORD of the PCIe bus memory for the
   iMSI-RX engine. (That's what is currently done in the
   dw_pcie_msi_host_init() method by allocating the coherent memory.)
2. Set the actual DMA-mask to the DW PCIe platform device so the
   DMA-engine clients would be able to allocate actually DMA-able memory.

@Robin, please join the discussion.

-Serge(y)
On 2022-12-15 09:27, Serge Semin wrote:
> Hi Christoph
>
> On Wed, Dec 14, 2022 at 11:13:41PM -0800, Christoph Hellwig wrote:
>> You can't change the DMA mask when there are existing allocations.
>
> Em, what do you guys suggest for the DW PCIe devices with the embedded
> DMA-engine then? To live forever with the SWIOTLBs?
> [...]
> We need the following things:
> 1. Somehow preserve a single DWORD of the PCIe bus memory for the
>    iMSI-RX engine. (That's what is currently done in the
>    dw_pcie_msi_host_init() method by allocating the coherent memory.)
> 2. Set the actual DMA-mask to the DW PCIe platform device so the
>    DMA-engine clients would be able to allocate actually DMA-able memory.
>
> @Robin, please join the discussion.

Basically just don't touch the coherent mask. The eDMA drivers can
still set the streaming mask to something larger, and that's the one
that's going to matter for most dmaengine clients anyway. Even if
someone does call dma_alloc_coherent() for their eDMA channel, it's
not going to make much practical difference if that has to come from a
DMA zone, unless the system is under severe memory pressure anyway.

Thanks,
Robin.
On Thu, Dec 15, 2022 at 10:26:08AM +0000, Robin Murphy wrote:
> On 2022-12-15 09:27, Serge Semin wrote:
> > [...]
> > We need the following things:
> > 1. Somehow preserve a single DWORD of the PCIe bus memory for the
> >    iMSI-RX engine. (That's what is currently done in the
> >    dw_pcie_msi_host_init() method by allocating the coherent memory.)
> > 2. Set the actual DMA-mask to the DW PCIe platform device so the
> >    DMA-engine clients would be able to allocate actually DMA-able memory.
> >
> > @Robin, please join the discussion.
>
> Basically just don't touch the coherent mask. The eDMA drivers can still set
> the streaming mask to something larger, and that's the one that's going to
> matter for most dmaengine clients anyway. Even if someone does call
> dma_alloc_coherent() for their eDMA channel, it's not going to make much
> practical difference if that has to come from a DMA zone, unless the system
> is under severe memory pressure anyway.

Got it. Thanks for the clarification. I'll resubmit the series with
only the streaming DMA-mask restoration.

-Serge(y)

> Thanks,
> Robin.
On Fri, Dec 16, 2022 at 02:52:18AM +0300, Serge Semin wrote:
> Got it. Thanks for the clarification. I'll resubmit the series with
> only the streaming DMA-mask restoration.

Note that even for that we need to make sure there are no outstanding
mappings when you change the mask.
On Thu, Dec 15, 2022 at 11:39:41PM -0800, Christoph Hellwig wrote:
> On Fri, Dec 16, 2022 at 02:52:18AM +0300, Serge Semin wrote:
> > Got it. Thanks for the clarification. I'll resubmit the series with
> > only the streaming DMA-mask restoration.
>
> Note that even for that we need to make sure there are no outstanding
> mappings when you change the mask.

What about, instead of the save/restore pattern, I just replace the
dma_set_mask_and_coherent() call with a dma_set_coherent_mask() call?
It seems cleaner. Like this:

< --- a/drivers/pci/controller/dwc/pcie-designware-host.c
< +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
< @@ -366,10 +366,10 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
< 					    dw_chained_msi_isr, pp);
< 	}
<
< -	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
< +	ret = dma_set_coherent_mask(dev, DMA_BIT_MASK(32));
< 	if (ret)
< 		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");
<
< 	msi_vaddr = dmam_alloc_coherent(dev, sizeof(u64), &pp->msi_data,
< 					GFP_KERNEL);

Thus the platform-specific streaming DMA-mask would be preserved.
Since it's PCIe, having a streaming DMA-mask less than 32 bits wide is
very improbable. Moreover, the DW PCIe AXI interface can be synthesized
with only one of two address bus widths: 32 or 64.

-Serge(y)
On Fri, Dec 16, 2022 at 12:34:23PM +0300, Serge Semin wrote:
> What about, instead of the save/restore pattern, I just replace the
> dma_set_mask_and_coherent() call with a dma_set_coherent_mask() call?
> It seems cleaner. Like this:
>
> Thus the platform-specific streaming DMA-mask would be preserved.
> Since it's PCIe, having a streaming DMA-mask less than 32 bits wide is
> very improbable. Moreover, the DW PCIe AXI interface can be synthesized
> with only one of two address bus widths: 32 or 64.

Where platform-specific means the dwc subdriver? Yes, that seems to
work. Alternatively have a flag that says which streaming mask to set.
On Fri, Dec 16, 2022 at 01:49:38AM -0800, Christoph Hellwig wrote:
> On Fri, Dec 16, 2022 at 12:34:23PM +0300, Serge Semin wrote:
> > What about, instead of the save/restore pattern, I just replace the
> > dma_set_mask_and_coherent() call with a dma_set_coherent_mask() call?
> > [...]
>
> Where platform-specific means the dwc subdriver?

Right. I meant the streaming DMA-mask set by the low-level DWC PCIe
drivers (like pcie-qcom(-ep)?.c, pcie-bt1.c, etc). It's very important
to have the real DMA-mask (at least the streaming one) set for the
eDMA-capable controllers so the DMA-engine clients would work with the
best performance.

> Yes, that seems to work.

Ok. I'll just use the direct dma_set_coherent_mask() call here then.

> Alternatively have a flag that says which streaming mask to set.

I'd prefer to have more flexibility here, relying on the low-level
drivers to set the mask(s) instead of adding a new flag, just in case
there are vendor-specific IP-core/platform changes in the address bus
width.

-Serge(y)
On 2022-12-16 10:18, Serge Semin wrote:
> [...]
> I'd prefer to have more flexibility here, relying on the low-level
> drivers to set the mask(s) instead of adding a new flag, just in case
> there are vendor-specific IP-core/platform changes in the address bus
> width.

Presumably the low-level glue drivers could pass a bus size or mask
value in struct dw_pcie_rp/dw_pcie, so the actual dma_set_mask() call
itself could be centralised? I guess there's also an argument that only
glue drivers which care about eDMA need to care about setting a mask at
all, so I don't have a strong preference either way. If you'd rather
stick with that approach then it might be worth a brief comment at each
site to clarify why the other mask is being set from an entirely
different place, just in case anyone comes along and tries to "fix" it.

Cheers,
Robin.
On Fri, Dec 16, 2022 at 02:01:20PM +0000, Robin Murphy wrote:
> [...]
> Presumably the low-level glue drivers could pass a bus size or mask
> value in struct dw_pcie_rp/dw_pcie, so the actual dma_set_mask() call
> itself could be centralised? I guess there's also an argument that only
> glue drivers which care about eDMA need to care about setting a mask at
> all, so I don't have a strong preference either way.

There is another peculiarity here due to which the centralized approach
turns out to be less suitable. The dw_pcie_msi_host_init() method,
which currently updates the mask, isn't always executed. It's run only
if pci_msi_enabled() returns true and there is no platform-specific
dw_pcie_rp.msi_host_init callback specified. Thus if we implemented the
centralized DMA-mask setup, it would have to live in an
always-executed place. That code would have also implied the 32-bit
coherent DMA-mask if the generic iMSI-RX config is required. Thus the
iMSI-RX-specific setups (the data allocation and the mask settings)
would be split up into different places, which would be less
maintainable.

Moreover, as you correctly noted, the real DMA-mask setup is only
needed for the eDMA-capable controllers. Meanwhile in most cases there
is no eDMA embedded in the PCIe controller, yet due to the centralized
approach we would need to set some mask anyway. Since I don't know the
real address bus width of all the already available DW PCIe platforms,
we'd have to fall back to some default mask value, which might
incorrectly describe the actual device capability (and possibly cause
side-effects/regressions).

To sum up, weighing up the pros and cons, the centralized approach
seems to me more complex, less maintainable and less flexible. In my
opinion, relying on the glue drivers to set the mask when it needs to
be set (that is, when there is an embedded eDMA) is the best and
simplest choice.

> If you'd rather stick with that approach then it might be worth a
> brief comment at each site to clarify why the other mask is being set
> from an entirely different place, just in case anyone comes along and
> tries to "fix" it.

Exactly what I was going to do. I'll add a brief comment explaining why
the coherent DMA-mask is updated in the dw_pcie_msi_host_init() method.

-Serge(y)

> Cheers,
> Robin.
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 5762bd306261..1a3dae1f6aa2 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -326,7 +326,7 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	struct device *dev = pci->dev;
 	struct platform_device *pdev = to_platform_device(dev);
-	u64 *msi_vaddr;
+	u64 *msi_vaddr, dma_mask;
 	int ret;
 	u32 ctrl, num_ctrls;

@@ -366,6 +366,13 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
 					    dw_chained_msi_isr, pp);
 	}

+	/*
+	 * Save and then restore the DMA-mask pre-set by the low-level drivers
+	 * after allocating the MSI-capable region. The mask might be useful for
+	 * the controllers with the embedded eDMA engine.
+	 */
+	dma_mask = dma_get_mask(dev);
+
 	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
 	if (ret)
 		dev_warn(dev, "Failed to set DMA mask to 32-bit. Devices with only 32-bit MSI support may not work properly\n");

@@ -378,6 +385,10 @@ static int dw_pcie_msi_host_init(struct dw_pcie_rp *pp)
 		return -ENOMEM;
 	}

+	ret = dma_set_mask_and_coherent(dev, dma_mask);
+	if (ret)
+		dev_warn(dev, "Failed to re-store DMA-mask\n");
+
 	return 0;
 }
DW PCIe Root Ports and End-points can be equipped with the DW eDMA engine.
In that case it is critical to have the platform device pre-initialized
with a valid DMA-mask so the drivers using the eDMA-engine would be able
to allocate the DMA-able buffers. The MSI-capable data requires to be
allocated from the lowest 4GB region. Since that procedure implies the
DMA-mask change we need to restore the mask set by the low-level drivers
after the MSI-data allocation is done.

Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>

---

Changelog v7:
- This is a new patch added on v7 stage of the series. (@Robin)

---
 drivers/pci/controller/dwc/pcie-designware-host.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)