Message ID | 20220822185332.26149-23-Sergey.Semin@baikalelectronics.ru (mailing list archive) |
---|---|
State | Not Applicable |
Series | dmaengine: dw-edma: Add RP/EP local DMA controllers support |
On 2022-08-22 19:53, Serge Semin wrote:
> DW eDMA doesn't perform any translation of the traffic generated on the
> CPU/Application side. It just generates read/write AXI-bus requests with
> the specified addresses. But in case if the dma-ranges DT-property is
> specified for a platform device node, Linux will use it to map the CPU
> memory regions into the DMAable bus ranges. This isn't what we want for
> the eDMA embedded into the locally accessed DW PCIe Root Port and
> End-point. In order to work that around let's set the chan_dma_dev flag
> for each DW eDMA channel thus forcing the client drivers to getting a
> custom dma-ranges-less parental device for the mappings.
>
> Note it will only work for the client drivers using the
> dmaengine_get_dma_device() method to get the parental DMA device.

No, this is nonsense. If the DMA engine is on the host side of the bridge
then it should not have anything to do with the PCI device at all, it should
be associated with the platform device, and thus any range mapping on the
bridge itself would be irrelevant anyway.

> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> Acked-By: Vinod Koul <vkoul@kernel.org>
>
> ---
>
> Changelog v2:
> - Fix the comment a bit to being clearer. (@Manivannan)
>
> Changelog v3:
> - Conditionally set dchan->dev->device.dma_coherent field since it can
>   be missing on some platforms. (@Manivannan)
> - Remove Manivannan' rb and tb tags since the patch content has been
>   changed.
> ---
>  drivers/dma/dw-edma/dw-edma-core.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
> index 6a8282eaebaf..4f56149dc8d8 100644
> --- a/drivers/dma/dw-edma/dw-edma-core.c
> +++ b/drivers/dma/dw-edma/dw-edma-core.c
> @@ -716,6 +716,26 @@ static int dw_edma_alloc_chan_resources(struct dma_chan *dchan)
>  	if (chan->status != EDMA_ST_IDLE)
>  		return -EBUSY;
>
> +	/* Bypass the dma-ranges based memory regions mapping for the eDMA
> +	 * controlled from the CPU/Application side since in that case
> +	 * the local memory address is left untranslated.
> +	 */
> +	if (chan->dw->chip->flags & DW_EDMA_CHIP_LOCAL) {
> +		dchan->dev->chan_dma_dev = true;
> +
> +#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
> +    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
> +    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
> +		dchan->dev->device.dma_coherent = chan->dw->chip->dev->dma_coherent;
> +#endif
> +
> +		dma_coerce_mask_and_coherent(&dchan->dev->device,
> +					     dma_get_mask(chan->dw->chip->dev));
> +		dchan->dev->device.dma_parms = chan->dw->chip->dev->dma_parms;
> +	} else {
> +		dchan->dev->chan_dma_dev = false;
> +	}

NAK. Don't try to poke into DMA API internals and copy random partial pieces
between devices, it doesn't work properly (I can guess that your system
doesn't have an IOMMU...) and having to deal with ugly mess like this in
drivers just makes it harder for us to maintain the DMA API itself.

Fair enough if you have good reason to create logical child devices to
represent individual DMA channels, but the correct way to handle that is to
keep the real parent device pointer around and use that for DMA API calls.

Robin.

> +
>  	pm_runtime_get(chan->dw->chip->dev);
>
>  	return 0;
On Wed, Aug 31, 2022 at 10:17:30AM +0100, Robin Murphy wrote:
> On 2022-08-22 19:53, Serge Semin wrote:
> > DW eDMA doesn't perform any translation of the traffic generated on the
> > CPU/Application side. It just generates read/write AXI-bus requests with
> > the specified addresses. But in case if the dma-ranges DT-property is
> > specified for a platform device node, Linux will use it to map the CPU
> > memory regions into the DMAable bus ranges. This isn't what we want for
> > the eDMA embedded into the locally accessed DW PCIe Root Port and
> > End-point. In order to work that around let's set the chan_dma_dev flag
> > for each DW eDMA channel thus forcing the client drivers to getting a
> > custom dma-ranges-less parental device for the mappings.
> >
> > Note it will only work for the client drivers using the
> > dmaengine_get_dma_device() method to get the parental DMA device.
>
> No, this is nonsense. If the DMA engine is on the host side of the bridge
> then it should not have anything to do with the PCI device at all, it should
> be associated with the platform device,

Well. The DMA-engine is embedded into the PCIe Root Port bus, is associated
with the platform device it's embedded to, and it doesn't have
anything to do with any particular PCI device.

> and thus any range mapping on the bridge itself would be irrelevant anyway.

Really? I find it otherwise. Please see the way the "dma-ranges"
property is parsed and works during the device-specific memory ranges
mapping when it's applicable for the PCIe Root Ports.

>
> > Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> > Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > Acked-By: Vinod Koul <vkoul@kernel.org>
> >
> > ---
> >
> > Changelog v2:
> > - Fix the comment a bit to being clearer. (@Manivannan)
> >
> > Changelog v3:
> > - Conditionally set dchan->dev->device.dma_coherent field since it can
> >   be missing on some platforms. (@Manivannan)
> > - Remove Manivannan' rb and tb tags since the patch content has been
> >   changed.
> > ---
> >  drivers/dma/dw-edma/dw-edma-core.c | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
> > index 6a8282eaebaf..4f56149dc8d8 100644
> > --- a/drivers/dma/dw-edma/dw-edma-core.c
> > +++ b/drivers/dma/dw-edma/dw-edma-core.c
> > @@ -716,6 +716,26 @@ static int dw_edma_alloc_chan_resources(struct dma_chan *dchan)
> >  	if (chan->status != EDMA_ST_IDLE)
> >  		return -EBUSY;
> >
> > +	/* Bypass the dma-ranges based memory regions mapping for the eDMA
> > +	 * controlled from the CPU/Application side since in that case
> > +	 * the local memory address is left untranslated.
> > +	 */
> > +	if (chan->dw->chip->flags & DW_EDMA_CHIP_LOCAL) {
> > +		dchan->dev->chan_dma_dev = true;
> > +
> > +#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
> > +    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
> > +    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
> > +		dchan->dev->device.dma_coherent = chan->dw->chip->dev->dma_coherent;
> > +#endif
> > +
> > +		dma_coerce_mask_and_coherent(&dchan->dev->device,
> > +					     dma_get_mask(chan->dw->chip->dev));
> > +		dchan->dev->device.dma_parms = chan->dw->chip->dev->dma_parms;
> > +	} else {
> > +		dchan->dev->chan_dma_dev = false;
> > +	}
>
> NAK. Don't try to poke into DMA API internals and copy random partial pieces
> between devices, it doesn't work properly (I can guess that your system
> doesn't have an IOMMU...) and having to deal with ugly mess like this in
> drivers just makes it harder for us to maintain the DMA API itself.

Hold on with that angry tone. First of all, I don't really see you
fixing the drivers/dma/ti/k3-udma.c driver then. Second, read the patch
log more carefully. Judging by your comments you don't fully understand
the problem.

>
> Fair enough if you have good reason to create logical child devices to
> represent individual DMA channels, but the correct way to handle that is to
> keep the real parent device pointer around and use that for DMA API calls.

That's what is in my patches. The problem is that the "dma-ranges"
property specified for the parental PCIe Root Port device isn't
applicable to the DMA-engine embedded into it. The "dma-ranges" is
supposed to be used for the PCIe-bus peripheral devices since their
MWr/MRd TLPs are translated by means of the Inbound iATU engine. The IO
accesses generated by the PCIe controller itself aren't affected by
iATU. So any mapping performed for the PCIe Root Port controller
platform device mustn't take these DMA-ranges into account. That's why I
need to enable the "chan_dma_dev" DMA-engine capability and just copy
the main DMA-parts of the parental device except the "dma_range_map"
data.

If you have any better suggestion in mind please share, but what you've
said so far definitely won't give us any explicit solution.

-Sergey

>
> Robin.
>
> > +
> >  	pm_runtime_get(chan->dw->chip->dev);
> >
> >  	return 0;
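The point argued above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct range_entry`, `cpu_to_bus()` and `demo_ranges()` are hypothetical names, loosely modelled on how a per-device range map turns a CPU physical address into a bus address, and all addresses are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one dma-ranges window (hypothetical type). */
struct range_entry {
	uint64_t cpu_start;	/* CPU physical base of the window */
	uint64_t dma_start;	/* bus (DMA) base the window maps to */
	uint64_t size;		/* window length in bytes */
};

/*
 * Translate a CPU physical address into a bus address.
 * A NULL map models a device without dma-ranges: identity mapping.
 * Returns UINT64_MAX if the address lies outside every window.
 */
uint64_t cpu_to_bus(const struct range_entry *map, size_t n, uint64_t cpu)
{
	if (!map)
		return cpu;

	for (size_t i = 0; i < n; i++) {
		if (cpu >= map[i].cpu_start &&
		    cpu - map[i].cpu_start < map[i].size)
			return cpu - map[i].cpu_start + map[i].dma_start;
	}

	return UINT64_MAX;
}

/* Worked example with made-up addresses. */
void demo_ranges(void)
{
	const struct range_entry pci_inbound[] = {
		{ .cpu_start = 0x80000000ULL, .dma_start = 0x0ULL,
		  .size = 0x40000000ULL },
	};
	uint64_t buf = 0x80001000ULL;

	/* Peripheral behind the bridge: the address goes through the window. */
	assert(cpu_to_bus(pci_inbound, 1, buf) == 0x1000ULL);

	/* Engine inside the controller: no window, address is left as is. */
	assert(cpu_to_bus(NULL, 0, buf) == buf);
}
```

With the window applied, the buffer at 0x80001000 is handed out as bus address 0x1000, which suits a peripheral whose requests pass through the inbound iATU. The eDMA issues AXI requests with the address exactly as given, so it needs the identity case.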
On 2022-09-12 02:24, Serge Semin wrote:
> On Wed, Aug 31, 2022 at 10:17:30AM +0100, Robin Murphy wrote:
>> On 2022-08-22 19:53, Serge Semin wrote:
>>> DW eDMA doesn't perform any translation of the traffic generated on the
>>> CPU/Application side. It just generates read/write AXI-bus requests with
>>> the specified addresses. But in case if the dma-ranges DT-property is
>>> specified for a platform device node, Linux will use it to map the CPU
>>> memory regions into the DMAable bus ranges. This isn't what we want for
>>> the eDMA embedded into the locally accessed DW PCIe Root Port and
>>> End-point. In order to work that around let's set the chan_dma_dev flag
>>> for each DW eDMA channel thus forcing the client drivers to getting a
>>> custom dma-ranges-less parental device for the mappings.
>>>
>>> Note it will only work for the client drivers using the
>>> dmaengine_get_dma_device() method to get the parental DMA device.
>>
>> No, this is nonsense. If the DMA engine is on the host side of the bridge
>> then it should not have anything to do with the PCI device at all, it should
>> be associated with the platform device,
>
> Well. The DMA-engine is embedded into the PCIe Root Port bus, is associated
> with the platform device it's embedded to, and it doesn't have
> anything to do with any particular PCI device.
>
>> and thus any range mapping on the bridge itself would be irrelevant anyway.
>
> Really? I find it otherwise. Please see the way the "dma-ranges"
> property is parsed and works during the device-specific memory ranges
> mapping when it's applicable for the PCIe Root Ports.

Sigh, that's a bug. Now I see where the confusion is coming from.

Annoyingly it's basically the exact thing I called out in 951d48855d86 when
making dma-ranges work for non-OF PCI devices in the first place, but
apparently neither I nor anyone else thought of this particular edge case at
the time. Sorry about that. I'll have a look at how best to fix it.

Everything else still stands, though. If you can't use the original platform
device for DMA API calls, at least configure the child device properly by
calling of_dma_configure() with the parent's DT node in the expected manner
(and manually remove its dma_range_map if you need an immediate workaround).

Thanks,
Robin.
On Mon, Sep 26, 2022 at 03:08:01PM +0100, Robin Murphy wrote:
> On 2022-09-12 02:24, Serge Semin wrote:
> > On Wed, Aug 31, 2022 at 10:17:30AM +0100, Robin Murphy wrote:
> > > On 2022-08-22 19:53, Serge Semin wrote:
> > > > DW eDMA doesn't perform any translation of the traffic generated on the
> > > > CPU/Application side. It just generates read/write AXI-bus requests with
> > > > the specified addresses. But in case if the dma-ranges DT-property is
> > > > specified for a platform device node, Linux will use it to map the CPU
> > > > memory regions into the DMAable bus ranges. This isn't what we want for
> > > > the eDMA embedded into the locally accessed DW PCIe Root Port and
> > > > End-point. In order to work that around let's set the chan_dma_dev flag
> > > > for each DW eDMA channel thus forcing the client drivers to getting a
> > > > custom dma-ranges-less parental device for the mappings.
> > > >
> > > > Note it will only work for the client drivers using the
> > > > dmaengine_get_dma_device() method to get the parental DMA device.
> > >
> > > No, this is nonsense. If the DMA engine is on the host side of the bridge
> > > then it should not have anything to do with the PCI device at all, it should
> > > be associated with the platform device,
> >
> > Well. The DMA-engine is embedded into the PCIe Root Port bus, is associated
> > with the platform device it's embedded to, and it doesn't have
> > anything to do with any particular PCI device.
> >
> > > and thus any range mapping on the bridge itself would be irrelevant anyway.
> >
> > Really? I find it otherwise. Please see the way the "dma-ranges"
> > property is parsed and works during the device-specific memory ranges
> > mapping when it's applicable for the PCIe Root Ports.
>
> Sigh, that's a bug. Now I see where the confusion is coming from.

Finally we are on the same page.) I didn't think it was a bug though.
Some details of the problem I described in another thread earlier today:
Link: https://lore.kernel.org/linux-pci/20220926205333.qlhb5ojmx4sktzt5@mobilestation/
(See my note regarding the "dma-ranges" usage, which I accidentally
addressed to William instead of you.)

>
> Annoyingly it's basically the exact thing I called out in 951d48855d86 when
> making dma-ranges work for non-OF PCI devices in the first place, but
> apparently neither I nor anyone else thought of this particular edge case at
> the time. Sorry about that. I'll have a look at how best to fix it.

You are right. The PCI-specific dma-ranges semantic hasn't been well
thought through in the first place. The child devices should have had a
dedicated method to set their own way of the memory ranges mapping.

Just a thought. As a possible solution for the dma-ranges property
being dedicated for the child devices we could introduce a new "space
code" of the dma-ranges property with a flag which would indicate the
actual bridge/host-controller memory range. If the dma-ranges property
doesn't have an entry with such code the mapping could be considered as
direct (in accordance with the parental dma-ranges properties).
IOMMU-part is applicable for all PCIe-related hierarchy - bridge itself
and peripheral devices.

>
> Everything else still stands, though. If you can't use the original platform
> device for DMA API calls, at least configure the child device properly by
> calling of_dma_configure() with the parent's DT node in the expected manner
> (and manually remove its dma_range_map if you need an immediate workaround).

Do you mean something like this?

< struct dma_chan *dchan = ...;
< struct dw_edma_chan *chan = ...;
< struct device *parent = chan->dw->chip->dev;
<
< if (dev_of_node(parent)) {
< 	struct device_node *node = dev_of_node(parent);
<
< 	ret = of_dma_configure(&chan->dev->device, node, true);
< } else if (has_acpi_companion(parent)) {
< 	struct acpi_device *adev = to_acpi_device_node(parent->fwnode);
<
< 	ret = acpi_dma_configure(&chan->dev->device, acpi_get_dma_attr(adev));
< } else {
< 	ret = -EINVAL;
< }
<
< if (ret)
< 	return ret;
<
< /* Drop the detected dma-ranges mapping since it isn't applicable for
<  * the PCIe RP/EP bridge itself but to the peripheral devices only.
<  */
< dchan->dev->device.dma_range_map = NULL;
< dchan->dev->chan_dma_dev = true;
<
< return 0;

What about the DMA-mask? Will it be ok if I copy it from the parental
device? Like this:

< dma_coerce_mask_and_coherent(&dchan->dev->device, dma_get_mask(parent));

Judging by the of_dma_configure_id() method implementation the mask
upper bound is calculated based on the dma-ranges entries. Since the
DT-property isn't applicable to the PCIe host platform device itself,
its upper bound will most likely be invalid for the bridge too.

Regards,
-Sergey

>
> Thanks,
> Robin.
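The concern about the mask can be made concrete with a rough user-space sketch. It assumes, as the message above suggests for of_dma_configure_id(), that the bound is the smallest all-ones mask covering the highest bus address of a dma-ranges window. The helper names (`ilog2_u64()`, `mask_covering()`, `clamp_mask()`) are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit; v must be non-zero. */
unsigned int ilog2_u64(uint64_t v)
{
	unsigned int bit = 0;

	while (v >>= 1)
		bit++;

	return bit;
}

/* Smallest all-ones mask that covers the bus address 'end'. */
uint64_t mask_covering(uint64_t end)
{
	unsigned int bits = ilog2_u64(end) + 1;

	/* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
	return bits >= 64 ? UINT64_MAX : (1ULL << bits) - 1;
}

/*
 * Narrow a device mask by the bound derived from one window.
 * size must be non-zero so that the end address is well defined.
 */
uint64_t clamp_mask(uint64_t dev_mask, uint64_t dma_start, uint64_t size)
{
	return dev_mask & mask_covering(dma_start + size - 1);
}
```

Under these assumptions, a 1 GiB window at bus address 0 narrows a 32-bit device mask to 30 bits: `clamp_mask(0xffffffff, 0, 0x40000000)` gives `0x3fffffff`. That limit describes the peripherals' inbound window, not what the eDMA itself can address.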
diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index 6a8282eaebaf..4f56149dc8d8 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -716,6 +716,26 @@ static int dw_edma_alloc_chan_resources(struct dma_chan *dchan)
 	if (chan->status != EDMA_ST_IDLE)
 		return -EBUSY;
 
+	/* Bypass the dma-ranges based memory regions mapping for the eDMA
+	 * controlled from the CPU/Application side since in that case
+	 * the local memory address is left untranslated.
+	 */
+	if (chan->dw->chip->flags & DW_EDMA_CHIP_LOCAL) {
+		dchan->dev->chan_dma_dev = true;
+
+#if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \
+    defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
+		dchan->dev->device.dma_coherent = chan->dw->chip->dev->dma_coherent;
+#endif
+
+		dma_coerce_mask_and_coherent(&dchan->dev->device,
+					     dma_get_mask(chan->dw->chip->dev));
+		dchan->dev->device.dma_parms = chan->dw->chip->dev->dma_parms;
+	} else {
+		dchan->dev->chan_dma_dev = false;
+	}
+
 	pm_runtime_get(chan->dw->chip->dev);
 
 	return 0;