[0/4] Expand Xilinx CDMA functions

Message ID 20210423011913.13122-1-adrian.martinezlarumbe@imgtec.com

Message

Adrian Larumbe April 23, 2021, 1:19 a.m. UTC
Recently at Imgtec we had to provide GLES API buffers with DMA transfer
capabilities to device memory. We had access to a Xilinx CDMA IP module,
but even though the hardware supports scatter-gather operations,
the driver did not. This patch series' goal is to extend the driver
to support SG transfers on CDMA devices.

It also fixes a couple of issues I found in the driver: lack of support
for HW descriptors allocated in an extended address space (above 32 bits)
and an unusual race condition when closing a DMA channel.

Adrian Larumbe (4):
  dmaengine: xilinx_dma: Add extended address support in CDMA
  dmaengine: xilinx_dma: Add channel configuration setting callback
  dmaengine: xilinx_dma: Add CDMA SG transfer support
  dmaengine: xilinx_dma: Add device synchronisation callback

 drivers/dma/xilinx/xilinx_dma.c | 186 ++++++++++++++++++++++++++++++--
 1 file changed, 177 insertions(+), 9 deletions(-)

Comments

Lars-Peter Clausen April 23, 2021, 9:17 a.m. UTC | #1
On 4/23/21 3:19 AM, Adrian Larumbe wrote:
> Recently at Imgtec we had to provide GLES API buffers with DMA transfer
> capabilities to device memory. We had access to a Xilinx CDMA IP module,
> but even though the hardware supports scatter-gather operations,
> the driver did not. This patch series' goal is to extend the driver
> to support SG transfers on CDMA devices.

Hi,

Thanks for the patch series. From an implementation point of view this 
looks good. But I'm a bit concerned about dmaengine API compliance.

The device_config() and device_prep_slave_sg() APIs are meant for 
memory-to-device or device-to-memory transfers. This means the device 
address is fixed and usually maps to a FIFO in the device.
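
For reference, compliant usage looks roughly like this. This is only a
minimal sketch; queue_to_fifo() is a made-up name, and chan, sgl, nents
and fifo_phys are placeholders for whatever the client driver has at
hand:

#include <linux/dmaengine.h>

/* Sketch: program one compliant MEM_TO_DEV transfer. The device-side
 * address is fixed (a FIFO register); only the memory side is
 * described by a scatterlist. */
static struct dma_async_tx_descriptor *
queue_to_fifo(struct dma_chan *chan, struct scatterlist *sgl,
              unsigned int nents, dma_addr_t fifo_phys)
{
        struct dma_slave_config cfg = {
                .dst_addr       = fifo_phys,
                .dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
        };

        if (dmaengine_slave_config(chan, &cfg))
                return NULL;

        return dmaengine_prep_slave_sg(chan, sgl, nents,
                                       DMA_MEM_TO_DEV,
                                       DMA_PREP_INTERRUPT);
}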

What you are implementing is memory-to-memory, which means addresses are 
incremented on both sides of the transfer. And this works if you know what 
you are doing, e.g. you know that you are using a specific dmaengine 
driver that implements the dmaengine API in a way that does not comply 
with the specification. But it breaks for generic dmaengine API clients 
that expect compliant behavior. And it is the reason why we have an API 
in the first place, to get consistent behavior. If you have to know 
about the driver specific semantics you might as well just bypass the 
framework and use driver specific functions.

And the CDMA seems to support the dmaengine API's intended behavior 
through what it calls the keyhole feature, where the address is not 
incremented. If somebody wanted to use that feature, they wouldn't be 
able to add support for it because it would break your usage of the 
prep_slave_sg() API.

It seems to me what we are missing from the DMAengine API is the 
equivalent of device_prep_dma_memcpy() that is able to take SG lists. 
There is already a memset_sg; it should be possible to add something 
similar for memcpy.
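
Modeled on the existing device_prep_dma_memset_sg() callback, a
hypothetical signature (to be clear, this does not exist today) could
look like:

struct dma_async_tx_descriptor *(*device_prep_dma_memcpy_sg)(
               struct dma_chan *chan,
               struct scatterlist *dst_sg, unsigned int dst_nents,
               struct scatterlist *src_sg, unsigned int src_nents,
               unsigned long flags);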

  - Lars
Adrian Larumbe April 23, 2021, 11:38 a.m. UTC | #2
On 23.04.2021 11:17, Lars-Peter Clausen wrote:

> The device_config() and device_prep_slave_sg() APIs are meant for
> memory-to-device or device-to-memory transfers. This means the device address
> is fixed and usually maps to a FIFO in the device.
> 
> What you are implementing is memory-to-memory, which means addresses are
> incremented on both sides of the transfer. And this works if you know what you are
> doing, e.g. you know that you are using a specific dmaengine driver that
> implements the dmaengine API in a way that does not comply with the
> specification. But it breaks for generic dmaengine API clients that expect
> compliant behavior. And it is the reason why we have an API in the first
> place, to get consistent behavior. If you have to know about the driver
> specific semantics you might as well just bypass the framework and use driver
> specific functions.

You're absolutely right about this. I think that when picking the DMA Engine
API callback to implement this new functionality in, I was misled by the fact
that, in our particular configuration, transfers were being done between main
system memory and DDR memory on a PCI card. However, from the point of view of
the semantics of the API, these are still memory-to-memory transfers, just like
you said.

> It seems to me what we are missing from the DMAengine API is the equivalent of
> device_prep_dma_memcpy() that is able to take SG lists. There is already a
> memset_sg; it should be possible to add something similar for memcpy.

I wasn't aware of the existence of this callback. Grepping the sources reveals
a single consumer at the moment; I guess this also might've led me to wrongly
conclude that device_prep_slave_sg was the right choice. Perhaps I can convert
most of the code currently in xilinx_cdma_prep_slave_sg to fit the semantics of
this DMAengine API function instead. Any thoughts on this?

Cheers,
Adrian
Vinod Koul April 23, 2021, 1:24 p.m. UTC | #3
On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> 
> It seems to me what we are missing from the DMAengine API is the equivalent
> of device_prep_dma_memcpy() that is able to take SG lists. There is already
> a memset_sg; it should be possible to add something similar for memcpy.

You mean something like dmaengine_prep_dma_sg() which was removed?

static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_sg(
               struct dma_chan *chan,
               struct scatterlist *dst_sg, unsigned int dst_nents,
               struct scatterlist *src_sg, unsigned int src_nents,
               unsigned long flags)

The problem with this API is that it would work only when src_sg and
dst_sg are of a similar nature; if not, then how should one go about
copying... should we fill without a care for dst_sg being different than
src_sg, as long as the total data to be copied has enough space in dst...

We can always add this back if we have in-kernel user but the semantics
of the API needs to be thought thru

Thanks
Lars-Peter Clausen April 23, 2021, 1:51 p.m. UTC | #4
On 4/23/21 3:24 PM, Vinod Koul wrote:
> On 23-04-21, 11:17, Lars-Peter Clausen wrote:
>> It seems to me what we are missing from the DMAengine API is the equivalent
>> of device_prep_dma_memcpy() that is able to take SG lists. There is already
>> a memset_sg; it should be possible to add something similar for memcpy.
> You mean something like dmaengine_prep_dma_sg() which was removed?
>
Ah, that's why I could have sworn we already had this!

> static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_sg(
>                 struct dma_chan *chan,
>                 struct scatterlist *dst_sg, unsigned int dst_nents,
>                 struct scatterlist *src_sg, unsigned int src_nents,
>                 unsigned long flags)
>
> The problem with this API is that it would work only when src_sg and
> dst_sg are of a similar nature; if not, then how should one go about
> copying... should we fill without a care for dst_sg being different than
> src_sg, as long as the total data to be copied has enough space in dst...
At least for the CDMA the only requirement is that both buffers have the 
same total size.
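
In other words, the two lists can be split on the fly into lockstep
chunks. A rough sketch of that, assuming both lists are already
DMA-mapped and with emit_hw_desc() standing in for building one
hardware descriptor:

	struct scatterlist *s = src_sg, *d = dst_sg;
	size_t s_off = 0, d_off = 0;

	while (s && d) {
		/* Copy as much as both current entries allow */
		size_t len = min_t(size_t, sg_dma_len(s) - s_off,
				   sg_dma_len(d) - d_off);

		emit_hw_desc(sg_dma_address(s) + s_off,
			     sg_dma_address(d) + d_off, len);

		s_off += len;
		d_off += len;
		if (s_off == sg_dma_len(s)) {
			s = sg_next(s);
			s_off = 0;
		}
		if (d_off == sg_dma_len(d)) {
			d = sg_next(d);
			d_off = 0;
		}
	}

Since both buffers have the same total size, both lists run out at the
same time.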
Adrian Larumbe April 23, 2021, 2:36 p.m. UTC | #5
On 23.04.2021 15:51, Lars-Peter Clausen wrote:

> > On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> > > It seems to me what we are missing from the DMAengine API is the equivalent
> > > of device_prep_dma_memcpy() that is able to take SG lists. There is already
> > > a memset_sg; it should be possible to add something similar for memcpy.
> > You mean something like dmaengine_prep_dma_sg() which was removed?
> > 
> Ah, that's why I could have sworn we already had this!

Yes, after that API function was removed, the Xilinx DMA driver effectively
ceased to support memory-to-memory SG transfers. My assumption was that if it
was never reimplemented through the callback you mentioned before, it's because
there isn't much interest in this device.

However, we do need this functionality in our system. At the moment, and for
our particular use of it, we can always guarantee that either the source or the
destination will be one contiguous chunk of memory, so a single scatterlist
pointer array is enough to fully program an operation. I think this would fit
well into a memcpy counterpart of memset_sg. What do you think?
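
For the contiguous side, that collapses to a single-entry list, e.g.
(a sketch; buf and len are placeholders for a linearly-mapped buffer):

	struct scatterlist sg;

	sg_init_one(&sg, buf, len);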

Also, at the moment we're using it for transfers between main memory and
CPU-mapped PCI memory, which cannot be allocated with kmalloc. Wouldn't this
fall into the use case for device_prep_slave_sg?

Adrian
Vinod Koul April 23, 2021, 5:20 p.m. UTC | #6
On 23-04-21, 15:51, Lars-Peter Clausen wrote:
> On 4/23/21 3:24 PM, Vinod Koul wrote:
> > On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> > > It seems to me what we are missing from the DMAengine API is the equivalent
> > > of device_prep_dma_memcpy() that is able to take SG lists. There is already
> > > a memset_sg; it should be possible to add something similar for memcpy.
> > You mean something like dmaengine_prep_dma_sg() which was removed?
> > 
> Ah, that's why I could have sworn we already had this!

Even at that time we had the premise that we could bring the API back if
we had users. I think many have asked for it, but I haven't seen a patch
with a user yet :)

> > static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_sg(
> >                 struct dma_chan *chan,
> >                 struct scatterlist *dst_sg, unsigned int dst_nents,
> >                 struct scatterlist *src_sg, unsigned int src_nents,
> >                 unsigned long flags)
> > 
> > The problem with this API is that it would work only when src_sg and
> > dst_sg are of a similar nature; if not, then how should one go about
> > copying... should we fill without a care for dst_sg being different than
> > src_sg, as long as the total data to be copied has enough space in dst...
> At least for the CDMA the only requirement is that both buffers have the
> same total size.

I will merge it with a user, but the semantics need to be absolutely clear on
what is allowed and what is not. Do I hear a volunteer?
radhey pandey June 1, 2021, 10:29 a.m. UTC | #7
On Fri, Apr 23, 2021 at 10:51 PM Vinod Koul <vkoul@kernel.org> wrote:
>
> On 23-04-21, 15:51, Lars-Peter Clausen wrote:
> > On 4/23/21 3:24 PM, Vinod Koul wrote:
> > > On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> > > > It seems to me what we are missing from the DMAengine API is the equivalent
> > > > of device_prep_dma_memcpy() that is able to take SG lists. There is already
> > > > a memset_sg; it should be possible to add something similar for memcpy.
> > > You mean something like dmaengine_prep_dma_sg() which was removed?
> > >
> > Ah, that's why I could have sworn we already had this!
>
> Even at that time we had the premise that we could bring the API back if
> we had users. I think many have asked for it, but I haven't seen a patch
> with a user yet :)
Right. Back then we had also discussed bringing back the dma_sg API,
but the idea was dropped as we didn't have a Xilinx (or any other)
consumer client driver for it in the mainline kernel.

I think it's the same state now.
>
> > > static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_sg(
> > >                 struct dma_chan *chan,
> > >                 struct scatterlist *dst_sg, unsigned int dst_nents,
> > >                 struct scatterlist *src_sg, unsigned int src_nents,
> > >                 unsigned long flags)
> > >
> > > The problem with this API is that it would work only when src_sg and
> > > dst_sg are of a similar nature; if not, then how should one go about
> > > copying... should we fill without a care for dst_sg being different than
> > > src_sg, as long as the total data to be copied has enough space in dst...
> > At least for the CDMA the only requirement is that both buffers have the
> > same total size.
>
> I will merge it with a user, but the semantics need to be absolutely clear on
> what is allowed and what is not. Do I hear a volunteer?
> --
> ~Vinod
Adrian Larumbe July 2, 2021, 2:23 p.m. UTC | #8
On 01.06.2021 15:59, radhey pandey wrote:
> On Fri, Apr 23, 2021 at 10:51 PM Vinod Koul <vkoul@kernel.org> wrote:
> >
> > On 23-04-21, 15:51, Lars-Peter Clausen wrote:
> > > On 4/23/21 3:24 PM, Vinod Koul wrote:
> > > > On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> > > > > It seems to me what we are missing from the DMAengine API is the equivalent
> > > > > of device_prep_dma_memcpy() that is able to take SG lists. There is already
> > > > > a memset_sg; it should be possible to add something similar for memcpy.
> > > > You mean something like dmaengine_prep_dma_sg() which was removed?
> > > >
> > > Ah, that's why I could have sworn we already had this!
> >
> > Even at that time we had the premise that we could bring the API back if
> > we had users. I think many have asked for it, but I haven't seen a patch
> > with a user yet :)
> Right. Back then we had also discussed bringing back the dma_sg API,
> but the idea was dropped as we didn't have a Xilinx (or any other)
> consumer client driver for it in the mainline kernel.
> 
> I think it's the same state now.

Would it be alright if I brought back the old DMA_SG interface that was removed
in commit c678fa66341c? It seems that what I've effectively done is implement
the semantics of that API call under the guise of dma_prep_slave. However, I
still need mem2mem SG transfer support on CDMA, which has long been gone from
the driver, even though the HW does offer it.

If people are fine with it, I can restore that interface with CDMA as the sole
consumer.

>
> > > static inline struct dma_async_tx_descriptor *dmaengine_prep_dma_sg(
> > >                 struct dma_chan *chan,
> > >                 struct scatterlist *dst_sg, unsigned int dst_nents,
> > >                 struct scatterlist *src_sg, unsigned int src_nents,
> > >                 unsigned long flags)
> > >
> > > The problem with this API is that it would work only when src_sg and
> > > dst_sg are of a similar nature; if not, then how should one go about
> > > copying... should we fill without a care for dst_sg being different than
> > > src_sg, as long as the total data to be copied has enough space in dst...
> > At least for the CDMA the only requirement is that both buffers have the
> > same total size.
>
> I will merge it with a user, but the semantics need to be absolutely clear on
> what is allowed and what is not. Do I hear a volunteer?
> --
> ~Vinod
Vinod Koul July 5, 2021, 3:53 a.m. UTC | #9
On 02-07-21, 15:23, Adrian Larumbe wrote:
> On 01.06.2021 15:59, radhey pandey wrote:
> > On Fri, Apr 23, 2021 at 10:51 PM Vinod Koul <vkoul@kernel.org> wrote:
> > >
> > > On 23-04-21, 15:51, Lars-Peter Clausen wrote:
> > > > On 4/23/21 3:24 PM, Vinod Koul wrote:
> > > > > On 23-04-21, 11:17, Lars-Peter Clausen wrote:
> > > > > > It seems to me what we are missing from the DMAengine API is the equivalent
> > > > > > of device_prep_dma_memcpy() that is able to take SG lists. There is already
> > > > > > a memset_sg; it should be possible to add something similar for memcpy.
> > > > > You mean something like dmaengine_prep_dma_sg() which was removed?
> > > > >
> > > > Ah, that's why I could have sworn we already had this!
> > >
> > > Even at that time we had the premise that we could bring the API back if
> > > we had users. I think many have asked for it, but I haven't seen a patch
> > > with a user yet :)
> > Right. Back then we had also discussed bringing back the dma_sg API,
> > but the idea was dropped as we didn't have a Xilinx (or any other)
> > consumer client driver for it in the mainline kernel.
> > 
> > I think it's the same state now.
> 
> Would it be alright if I brought back the old DMA_SG interface that was removed
> in commit c678fa66341c? It seems that what I've effectively done is implement
> the semantics of that API call under the guise of dma_prep_slave. However, I
> still need mem2mem SG transfer support on CDMA, which has long been gone from
> the driver, even though the HW does offer it.
> 
> If people are fine with it, I can restore that interface with CDMA as the sole
> consumer.

I guess I should start repeating this regularly now! Yes, it is okay to
bring back dma sg support, provided:
1. We have a user.
2. The semantics of how that API works are well defined, especially in the
case where src_sg and dst_sg are different.

Also, I would not accept patches which implement this in the guise of a
different API.

Thanks